Hello, I have 2D matrix data saved in an ILArray<double>. This matrix represents the weights of a neural network from one neuron, and I want to see how the weights look with ILNumerics. Any idea how I can do this? I find many examples for 3D plotting but nothing for plotting a 2D image data representation.
Image data are currently best (simplest) visualized by utilizing ILSurface. Since this is a 3D plot, you may not get the optimal performance for large image data. Fortunately, ILNumerics' scene graph makes it easy to improve this with your own implementation.
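For a quick first look, a minimal ILSurface sketch (not from the original answer; `weights` is assumed to be the ILArray<double> holding your matrix, `ilPanel1` the panel on the form) might be:
private void ilPanel1_Load(object sender, EventArgs e) {
    using (ILScope.Enter()) {
        // convert the ILArray<double> weight matrix to float and show it as a surface
        ILArray<float> Z = ILMath.tosingle(weights);
        ilPanel1.Scene.Add(new ILPlotCube {
            new ILSurface(Z)
        });
    }
}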
The simplest approach would take an ILPoints shape, arrange the needed number of points in a grid and let every point visualize the value of the corresponding element of the input matrix - say, by color (or size).
private void ilPanel1_Load(object sender, EventArgs e) {
    using (ILScope.Enter()) {
        // some 'input matrix'
        ILArray<float> Z = ILSpecialData.sincf(40, 50);
        // do some reordering: prepare vertices
        ILArray<float> Y = 1, X = ILMath.meshgrid(
            ILMath.vec<float>(1, Z.S[1]),
            ILMath.vec<float>(1, Z.S[0]),
            Y);
        // reallocate the vertex positions matrix
        ILArray<float> pos = ILMath.zeros<float>(3, X.S.NumberOfElements);
        // fill in values
        pos["0;:"] = X[":"];
        pos["1;:"] = Y[":"];
        pos["2;:"] = Z[":"];
        // colormap used to map the values to colors
        ILColormap cmap = new ILColormap(Colormaps.Hot);
        // setup the scene
        ilPanel1.Scene.Add(new ILPlotCube {
            new ILPoints() {
                Positions = pos,
                Colors = cmap.Map(Z).T,
                Color = null
            }
        });
    }
}
Obviously, the resulting points do not scale with the form, so the 'image' shows larger gaps between the points when the form size is increased. For a better implementation you may adapt the approach to utilize ILTriangles instead of ILPoints, in order to assemble adjacent rectangles.
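A rough sketch of that idea follows (my own, not from the original answer): two triangles per matrix element form a closed rectangle, colored by the element's value. It assumes ILTriangles consumes its Positions buffer as consecutive vertex triples and accepts per-vertex Colors, analogous to ILPoints above - check the ILNumerics API documentation before relying on it.
private void ilPanel1_Load(object sender, EventArgs e) {
    using (ILScope.Enter()) {
        ILArray<float> Z = ILSpecialData.sincf(40, 50);
        int rows = Z.S[0], cols = Z.S[1];
        ILArray<float> pos = ILMath.zeros<float>(3, rows * cols * 6);
        ILArray<float> vals = ILMath.zeros<float>(1, rows * cols * 6);
        int v = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                float z = Z.GetValue(r, c);
                // two triangles spanning the unit cell [c, c+1] x [r, r+1]
                float[,] quad = { { c, r }, { c + 1, r }, { c + 1, r + 1 }, { c, r + 1 } };
                int[] idx = { 0, 1, 2, 0, 2, 3 };
                foreach (int k in idx) {
                    pos.SetValue(quad[k, 0], 0, v);
                    pos.SetValue(quad[k, 1], 1, v);
                    vals.SetValue(z, 0, v);
                    v++;
                }
            }
        }
        ILColormap cmap = new ILColormap(Colormaps.Hot);
        ilPanel1.Scene.Add(new ILPlotCube {
            new ILTriangles() {
                Positions = pos,
                Colors = cmap.Map(vals).T,
                Color = null
            }
        });
    }
}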
Related
What do I want to achieve?
I'm working on an evolutionary algorithm finding min/max of non-linear functions. I have fully functional WPF application, but there's one feature missing: 3D plots.
What is the problem?
To accomplish this I've started with the free trial of ILNumerics, which provides 3D data visualisation. It works completely fine with the examples from the documentation, but something prevents me from properly plotting my own 3D graphs.
Visualising problem:
So, here is how it behaves at the moment
These are graphs of the non-linear function: x1^4 + x2^4 - 0.62*x1^2 - 0.62*x2^2
Left side: Contour achieved with OxyPlot
Right side: 3D graph achieved with ilNumerics
As you can see, the OxyPlot contour is completely fine, while the 3D graph which I'm trying to plot with exactly the same data is not right at all.
How is the current (not working) solution done?
I'm trying to visualise a 3D surface using points in space. ILNumerics has a class called Surface, an object of which I have to create in order to plot my graph. It has the following constructor:
public Surface(InArray<float> ZXYPositions, InArray<float> C = null, Tuple<float, float> colorsDataRange = null, Colormap colormap = null, object tag = null);
where, as you can see, ZXYPositions is what I actually have a problem with. Before instantiating the Surface object I'm filling an array like this:
int m = 0;
for (int i = 0; i < p; ++i)
{
    for (int j = 0; j < p; ++j)
    {
        sigma[m, 0] = (float)data[i, j];
        sigma[m, 1] = (float)xy[0][i];
        sigma[m, 2] = (float)xy[1][j];
        m++;
    }
}
where sigma[m, 0] = Z; sigma[m, 1] = X; sigma[m, 2] = Y;
And here's the problem. I cannot find any logical error in this approach.
Here is code responsible for creating object which I'm passing to ilNumerics plot panel:
var scene = new PlotCube(twoDMode: false) {
    // add a surface
    new Surface(sigma) {
        // make thin transparent wireframes
        Wireframe = { Color = Color.FromArgb(50, Color.LightGray) },
        // choose a different colormap
        Colormap = Colormaps.Jet,
    }
};
Additionally, I want to note that the sigma array is constructed properly; I've printed out its values and they are definitely correct.
Plot only data points.
Finally, I need to add that when I don't create the Surface object and plot only the data points, it looks much more reasonable:
But sadly it's not what I'm looking for. I want to create a surface with this data.
Good News!
I found the answer. Oddly, almost everything was fine; I misunderstood just one thing. When passing the ZXYPositions argument to Surface, it can actually expect only Z data from me in order to plot the graph correctly.
What did I change to make it work?
The first two for loops are now replaced by:
sigma = data;
As you can see, there are no longer any loops, because sigma now contains only the "solution" coordinates (which are the Z values), so I just need to assign the data array to sigma.
Second part, where I'm creating Surface now looks like this:
var B = ILMath.tosingle(sigma);
var scene = new PlotCube(twoDMode: false) {
    // add a surface
    new Surface(B) {
        // make thin transparent wireframes
        Wireframe = { Color = Color.FromArgb(50, Color.LightGray) },
        // choose a different colormap
        Colormap = Colormaps.Jet,
    }
};
scene.Axes.XAxis.Max = (float)arguments[0].Maximum;
scene.Axes.XAxis.Min = (float)arguments[0].Minimum;
scene.Axes.YAxis.Max = (float)arguments[1].Maximum;
scene.Axes.YAxis.Min = (float)arguments[1].Minimum;
scene.First<PlotCube>().Rotation = Matrix4.Rotation(new Vector3(1f, 0.23f, 1), 0.7f);
Basically, the other thing which changed is scaling the X and Y axes to the proper value ranges.
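As an alternative to rescaling the axes afterwards, the ZXYPositions argument can also carry the real X/Y coordinates. The following is only a rough sketch of my own: it assumes Surface accepts an [rows, cols, 3] array with the Z values in the first page and the X and Y grid coordinates in the next two - verify this against the ILNumerics documentation for your version.
ILArray<float> Z = ILMath.tosingle(sigma);                 // p x p matrix of function values
float xmin = (float)arguments[0].Minimum, xmax = (float)arguments[0].Maximum;
float ymin = (float)arguments[1].Minimum, ymax = (float)arguments[1].Maximum;
// real coordinate vectors spanning the argument ranges
ILArray<float> xv = xmin + ILMath.vec<float>(0, Z.S[1] - 1) * ((xmax - xmin) / (Z.S[1] - 1));
ILArray<float> yv = ymin + ILMath.vec<float>(0, Z.S[0] - 1) * ((ymax - ymin) / (Z.S[0] - 1));
ILArray<float> Y = 1;
ILArray<float> X = ILMath.meshgrid(xv, yv, Y);
ILArray<float> ZXY = ILMath.zeros<float>(Z.S[0], Z.S[1], 3);
ZXY[":;:;0"] = Z;
ZXY[":;:;1"] = X;
ZXY[":;:;2"] = Y;
var scene = new PlotCube(twoDMode: false) {
    new Surface(ZXY) { Colormap = Colormaps.Jet }
};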
Final results
Here you have final results:
How would I go about generating the 2D coordinates for an area of an image? For example, if one of the countries on this map was singled out and was the only one visible, but on a canvas of the same size, how would I go about getting the 2D coordinates for it?
I then want to create hover/click areas based on these coordinates using C#, but I'm unable to find a tool which can detect, for example, a shape within a blank canvas and output its outline coordinates.
I mainly believe this to be a phrasing/terminology issue on my part, as I feel this whole process is already a "thing", and well documented.
There are many ways to achieve your task; here are a few:
Look at Generating Polygons from Image (Filled Shapes), which is almost a duplicate of yours but has a slightly different starting point.
In a nutshell:
Extract all non-white pixels which neighbor a white pixel.
Just loop through the whole image (except the outer border pixels); if the processed pixel is not white, then look at its 4/8 neighbors. If any of them is a different color, then add the processed pixel's color and coordinates to a list (a small C# sketch of this step follows after the notes below).
Sort the point list by color.
This will separate the countries.
Apply closed loop / connectivity analysis.
This is the vectorisation/polygonisation process. Just join not-yet-used neighboring pixels from the list to form lines ...
There is also an A* alternative for this that might be easier to implement:
Extract all non-white pixels which neighbor a white pixel.
Just loop through the whole image (except the outer border pixels); if the processed pixel is not white, then look at its 4/8 neighbors. If none of them is a different color, then overwrite the current pixel with some unused color (black).
Recolor all white pixels and the cleared pixels to a single color (black).
From now on, this color will mean "wall".
Apply A* path finding.
Find the first non-wall pixel and apply A*-like growth filling. When you are done filling, trace back, remembering the order of points in a list as a polygon. Optionally join straight-line pixels into single lines ...
Another option is to adapt this: Finding holes in 2d point sets
[notes]
If your image is filtered (anti-aliasing, scaling, etc.) then you need to do the color comparisons with some margin for error, and maybe even convert to HSV (depending on the level of color distortion).
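To make the border-pixel step concrete, here is a naive C# sketch of my own (System.Drawing; GetPixel is slow, use LockBits as shown in the answer below for real images) that collects the border pixels and groups them by color, which takes care of the "sort by color" step as well:
static Dictionary<Color, List<Point>> ExtractBorderPixels(Bitmap bmp)
{
    var byColor = new Dictionary<Color, List<Point>>();
    for (int y = 1; y < bmp.Height - 1; y++)
    {
        for (int x = 1; x < bmp.Width - 1; x++)
        {
            Color c = bmp.GetPixel(x, y);
            if (IsWhite(c))
                continue;
            // border pixel: at least one of the four direct neighbours is white (background)
            if (IsWhite(bmp.GetPixel(x - 1, y)) || IsWhite(bmp.GetPixel(x + 1, y)) ||
                IsWhite(bmp.GetPixel(x, y - 1)) || IsWhite(bmp.GetPixel(x, y + 1)))
            {
                List<Point> pts;
                if (!byColor.TryGetValue(c, out pts))
                    byColor[c] = pts = new List<Point>();
                pts.Add(new Point(x, y));
            }
        }
    }
    return byColor;
}

static bool IsWhite(Color c)
{
    // tolerance for anti-aliased edges, as the notes above suggest
    return c.R > 240 && c.G > 240 && c.B > 240;
}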
You can use OpenCV's findContours() function. See the documentation here: http://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/find_contours/find_contours.html.
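Since the rest of your stack is C#, the Emgu CV wrapper exposes the same function. A rough sketch of my own (not from the answer; check the overload and enum names against your Emgu CV version), assuming gray is an 8-bit single-channel Mat of the map with a white background:
using (Mat binary = new Mat())
using (VectorOfVectorOfPoint contours = new VectorOfVectorOfPoint())
{
    // invert-threshold so the country becomes white on a black background
    CvInvoke.Threshold(gray, binary, 250, 255, ThresholdType.BinaryInv);
    // RetrType.External returns only the outer outline of each shape
    CvInvoke.FindContours(binary, contours, null, RetrType.External, ChainApproxMethod.ChainApproxSimple);
    for (int i = 0; i < contours.Size; i++)
    {
        System.Drawing.Point[] outline = contours[i].ToArray();  // the 2D outline coordinates
        // ... build your hover/click area from 'outline'
    }
}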
I think you're going at this the wrong way. Outlines of continents are madness; they are often made up of several parts with lots of small islands. And, you don't need the coordinates of the continents on the image; looking up if your current coordinates are in a list would take far too long. Instead, you should do the opposite: make an index table of the whole image, on which is indicated for each pixel which continent it belongs to.
And that's much, much easier.
Since you obviously have to assign a colour to each continent to identify them, you can go over all of the image's pixels, match each pixel's colour to the closest match in the colours of your continents, and fill each byte in the array with the corresponding found continent index. This way, you get a byte array that directly references your continents array. Effectively, this means you create an indexed 8-bit image, just as a plain bytes array. (There are methods to actually combine this with the colours array and get an image you can use, mind you. It's not too hard.)
For the actual colour matching, the best practice is to use LockBits on the source image to get direct access to the underlying bytes array. In the code below, the call to GetImageData gets me the bytes and the data stride. Then you can iterate over the bytes per line, and build a colour from each block of data that represents one pixel. If you don't want to bother too much with supporting different pixel sizes (like 24bpp), a quick trick is to just paint the source image on a new 32bpp image of the same dimensions (the call to PaintOn32bpp), so you can always simply iterate per four bytes and take the byte values in the order 3,2,1,0 for ARGB. I ignored transparency here because it just complicates the concept of what is and isn't a colour.
private void InitContinents(Bitmap map, Int32 nearPixelLimit)
{
    // Build hues map from colour palette. Since detection is done
    // by hue value, any grey or white values on the image will be ignored.
    // This does mean the process only works with actual colours.
    // In this function it is assumed that index 0 in the palette is the white background.
    Double[] hueMap = new Double[this.continentsPal.Length];
    for (Int32 i = 0; i < this.continentsPal.Length; i++)
    {
        Color col = this.continentsPal[i];
        if (col.GetSaturation() < .25)
            hueMap[i] = -2;
        else
            hueMap[i] = col.GetHue();
    }
    Int32 w = map.Width;
    Int32 h = map.Height;
    Bitmap newMap = ImageUtils.PaintOn32bpp(map, continentsPal[0]);
    // BUILD REDUCED COLOR MAP
    Byte[] guideMap = new Byte[w * h];
    Int32 stride;
    Byte[] imageData = ImageUtils.GetImageData(newMap, out stride);
    for (Int32 y = 0; y < h; y++)
    {
        Int32 sourceOffs = y * stride;
        Int32 targetOffs = y * w;
        for (Int32 x = 0; x < w; x++)
        {
            Color c = Color.FromArgb(255, imageData[sourceOffs + 2], imageData[sourceOffs + 1], imageData[sourceOffs + 0]);
            Double hue;
            // Detecting on hue. Values with < 25% saturation are ignored.
            if (c.GetSaturation() < .25)
                hue = -2;
            else
                hue = c.GetHue();
            // Get the closest match
            Double smallestHueDiff = Int32.MaxValue;
            Int32 smallestHueIndex = -1;
            for (Int32 i = 0; i < hueMap.Length; i++)
            {
                Double hueDiff = Math.Abs(hueMap[i] - hue);
                if (hueDiff < smallestHueDiff)
                {
                    smallestHueDiff = hueDiff;
                    smallestHueIndex = i;
                }
            }
            guideMap[targetOffs] = (Byte)(smallestHueIndex < 0 ? 0 : smallestHueIndex);
            // Increase read pointer with 4 bytes for next pixel
            sourceOffs += 4;
            // Increase write pointer with 1 byte for next index
            targetOffs++;
        }
    }
    // Remove random edge pixels, and save in global var.
    this.continentGuide = RefineMap(guideMap, w, h, nearPixelLimit);
    // Build image from the guide map.
    this.overlay = ImageUtils.BuildImage(this.continentGuide, w, h, w, PixelFormat.Format8bppIndexed, this.continentsPal, null);
}
The GetImageData function:
/// <summary>
/// Gets the raw bytes from an image.
/// </summary>
/// <param name="sourceImage">The image to get the bytes from.</param>
/// <param name="stride">Stride of the retrieved image data.</param>
/// <returns>The raw bytes of the image</returns>
public static Byte[] GetImageData(Bitmap sourceImage, out Int32 stride)
{
BitmapData sourceData = sourceImage.LockBits(new Rectangle(0, 0, sourceImage.Width, sourceImage.Height), ImageLockMode.ReadOnly, sourceImage.PixelFormat);
stride = sourceData.Stride;
Byte[] data = new Byte[stride * sourceImage.Height];
Marshal.Copy(sourceData.Scan0, data, 0, data.Length);
sourceImage.UnlockBits(sourceData);
return data;
}
Now, back to the process; once you have that reference table, all you need are the coordinates of your mouse and you can check the reference map at index (Y*Width + X) to see what area you're in. To do that, you can add a MouseMove listener on an ImageBox, like this:
private void picImage_MouseMove(object sender, MouseEventArgs e)
{
    Int32 x = e.X - picImage.Padding.Left;
    Int32 y = e.Y - picImage.Padding.Top;
    Int32 coord = y * this.picWidth + x;
    if (x < 0 || x >= this.picWidth || y < 0 || y >= this.picHeight || coord >= this.continentGuide.Length)
        return;
    Int32 continent = this.continentGuide[coord];
    if (continent == previousContinent)
        return;
    previousContinent = continent;
    if (continent >= this.continents.Length)
        return;
    this.lblContinent.Text = this.continents[continent];
    this.picImage.Image = GetHighlightPic(continent);
}
Note that a simple generated map produced by nearest colour matching may have errors; when I did automatic mapping of this world map's colours, the border between blue and red, and some small islands in Central America, ended up identifying as Antarctica's purple colour, and some other rogue pixels appeared around the edges of different continents too.
This can be avoided by clearing (I used 0 as default "none") all indices not bordered by the same index at the top, bottom, left and right. This removes some smaller islands, and creates a slight gap between any neighbouring continents, but for mouse coordinates detection it'll still very nicely match the areas. This is the RefineMap call in my InitContinents function. The argument it gets determines how many identical neighbouring values an index needs to allow it to survive the pruning.
A similar technique of checking neighbouring pixels can be used to get outlines, by making a map of pixels not surrounded on all sides by the same value.
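RefineMap itself is not shown in this answer; a minimal sketch of the pruning it describes could look like this (my own reconstruction: an index survives only if at least nearPixelLimit of its four direct neighbours carry the same index, otherwise it is reset to 0, the "none" value):
private static Byte[] RefineMap(Byte[] guideMap, Int32 width, Int32 height, Int32 nearPixelLimit)
{
    Byte[] refined = new Byte[guideMap.Length];
    for (Int32 y = 0; y < height; y++)
    {
        for (Int32 x = 0; x < width; x++)
        {
            Int32 offs = y * width + x;
            Byte index = guideMap[offs];
            if (index == 0)
                continue;
            Int32 sameNeighbours = 0;
            if (x > 0 && guideMap[offs - 1] == index) sameNeighbours++;
            if (x < width - 1 && guideMap[offs + 1] == index) sameNeighbours++;
            if (y > 0 && guideMap[offs - width] == index) sameNeighbours++;
            if (y < height - 1 && guideMap[offs + width] == index) sameNeighbours++;
            refined[offs] = sameNeighbours >= nearPixelLimit ? index : (Byte)0;
        }
    }
    return refined;
}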
I'm using the official Kinect SDK 2.0 and Emgu CV in order to recognize the colors of a Rubik's Cube.
At first I use Canny Edge Extraction on the Infrared Camera, since it handles different lighting conditions better than the RGB Camera and is much better for detecting contours.
Then I use this code to convert the coordinates of the infrared sensor to the ones of the RGB camera.
As you can see in the picture, they are still off from what I am looking for. Since I already use the official KinectSensor.CoordinateMapper.MapDepthFrameToColorSpace, I don't know how else I can improve the situation.
using (var colorFrame = reference.ColorFrameReference.AcquireFrame())
using (var irFrame = reference.InfraredFrameReference.AcquireFrame())
{
    if (colorFrame == null || irFrame == null)
        return;
    // initialize depth frame data
    FrameDescription depthDesc = irFrame.FrameDescription;
    if (_depthData == null)
    {
        uint depthSize = depthDesc.LengthInPixels;
        _depthData = new ushort[depthSize];
        _colorSpacePoints = new ColorSpacePoint[depthSize];
        // fill Array with max value so all pixels can be mapped
        for (int i = 0; i < _depthData.Length; i++)
        {
            _depthData[i] = UInt16.MaxValue;
        }
        // didn't work so well with the actual depth-data
        //depthFrame.CopyFrameDataToArray(_depthData);
        _sensor.CoordinateMapper.MapDepthFrameToColorSpace(_depthData, _colorSpacePoints);
    }
}
This is a helper function I created in order to convert point arrays in infrared space to color space:
public static System.Drawing.Point[] DepthPointsToColorSpace(System.Drawing.Point[] depthPoints, ColorSpacePoint[] colorSpace)
{
    for (int i = 0; i < depthPoints.Length; i++)
    {
        // 512 is the width of the depth/infrared image
        int index = 512 * depthPoints[i].Y + depthPoints[i].X;
        depthPoints[i].X = (int)Math.Floor(colorSpace[index].X + 0.5);
        depthPoints[i].Y = (int)Math.Floor(colorSpace[index].Y + 0.5);
    }
    return depthPoints;
}
We can solve this problem by transforming infrared image coordinates to color image coordinates with a quadrilateral-to-quadrilateral mapping.
Take a quadrilateral Q(x1,y1,x2,y2,x3,y3,x4,y4) in the infrared image and its corresponding quadrilateral Q'(x1',y1',x2',y2',x3',y3',x4',y4') in the color image.
We can write the above mapping in the form of an equation as follows:
Q' = Q*A
where A is a 3 x 3 matrix with coefficients a11, a12, a13, a21, ..., a33.
The formulas to obtain the coefficients, using the example correspondences below, are listed as follows:
x1=173; y1=98; x2=387; y2=93; x3=395; y3=262; x4=172; y4=264;
x1p=787; y1p=235; x2p=1407; y2p=215; x3p=1435; y3p=705; x4p=795; y4p=715;
tx=(x1p-x2p+x3p-x4p)*(y4p-y3p)-(y1p-y2p+y3p-y4p)*(x4p-x3p);
ty=(x2p-x3p)*(y4p-y3p)-(x4p-x3p)*(y2p-y3p);
a31=tx/ty;
tx=(y1p-y2p+y3p-y4p)*(x2p-x3p)-(x1p-x2p+x3p-x4p)*(y2p-y3p);
ty=(x2p-x3p)*(y4p-y3p)-(x4p-x3p)*(y2p-y3p);
a32=tx/ty;
a11=x2p-x1p+a31*x2p;
a12=x4p-x1p+a32*x4p;
a13=x1p;
a21=y2p-y1p+a31*y2p;
a22=y4p-y1p+a32*y4p;
a23=y1p;
a33=1.0;
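To actually use the coefficients, a point is mapped with the usual homogeneous division. Here is a small helper of my own addition (the normalization of the input point is an assumption on my part; the coefficients above are computed from the primed corners, so the input (u, v) must be the point's normalized position inside Q, with corner 1 at (0,0), corner 2 at (1,0), corner 3 at (1,1) and corner 4 at (0,1) - for a nearly rectangular Q that is roughly u = (x - x1)/(x2 - x1), v = (y - y1)/(y4 - y1)):
static System.Drawing.PointF MapToColor(double u, double v,
    double a11, double a12, double a13,
    double a21, double a22, double a23,
    double a31, double a32, double a33)
{
    // homogeneous division: [x, y, w] with w as the perspective term
    double w = a31 * u + a32 * v + a33;
    double x = (a11 * u + a12 * v + a13) / w;
    double y = (a21 * u + a22 * v + a23) / w;
    return new System.Drawing.PointF((float)x, (float)y);
}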
That's because the camera that retrieves the depth data and the one that retrieves the color data are not the same camera.
So you should apply a correction factor to displace the depth data.
It's a factor that is almost constant, but it is related to the distance.
I've got no code for you, but it's something you can calculate yourself.
I have decided to have a go at making a dungeon crawler game with the XNA framework. I am a computer science student and am quite familiar with C# and the .NET framework. I have some questions about different parts of the development of my engine.
Loading Maps
I have a Tile class that stores the Vector2 position, Texture2D and dimensions of the tile. I have another class called TileMap that has a list of tiles indexed by position. I am reading from a text file in the number format above, matching each number to an index in the tile list and creating a new tile with the correct texture and position, storing it into another list of tiles.
public List<Tile> tiles = new List<Tile>(); // List of tiles that I have added to the game
public List<TileRow> testTiles = new List<TileRow>(); // TileRow contains a list of tiles along the x axis along with their Vector2 positions.
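For reference, the Tile and TileRow classes are not shown in the question; a minimal sketch consistent with the members the snippets use (titleTexture, position, width, height, tileList) might be:
public class Tile
{
    public Texture2D titleTexture;
    public Vector2 position;
    public int width;
    public int height;

    public Tile(Texture2D texture, Vector2 position)
    {
        this.titleTexture = texture;
        this.position = position;
        this.width = texture.Width;
        this.height = texture.Height;
    }
}

public class TileRow
{
    public List<Tile> tileList = new List<Tile>();
}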
Reading and storing the map tiles.
using (StreamReader stream = new StreamReader("TextFile1.txt"))
{
    while (stream.EndOfStream != true)
    {
        line = stream.ReadLine().Trim(' ');
        lineArray = line.Split(' ');
        TileRow tileRow = new TileRow();
        for (int x = 0; x < lineArray.Length; x++)
        {
            tileXCo = x * tiles[int.Parse(lineArray[x])].width;
            tileYCo = yCo * tiles[int.Parse(lineArray[x])].height;
            tileRow.tileList.Add(new Tile(tiles[int.Parse(lineArray[x])].titleTexture, new Vector2(tileXCo, tileYCo)));
        }
        testTiles.Add(tileRow);
        yCo++;
    }
}
For drawing the map.
public void Draw(SpriteBatch spriteBatch, GameTime gameTime)
{
    foreach (TileRow tes in testTiles)
    {
        foreach (Tile t in tes.tileList)
        {
            spriteBatch.Draw(t.titleTexture, t.position, Color.White);
        }
    }
}
Questions:
Is this the correct way I should be doing it, or should I just be storing a list referencing my tiles list?
How would I deal with Multi Layered Maps?
Collision Detection
At the moment I have a method that loops through every tile stored in my testTiles list, checks whether its dimensions intersect with the player's dimensions, and then returns a list of all the tiles that do. I have a class derived from my Tile class called CollisionTile that triggers a collision when the player and that rectangle intersect. (public class CollisionTile : Tile)
public List<Tile> playerArrayPosition(TileMap tileMap)
{
    List<Tile> list = new List<Tile>();
    foreach (TileRow test in tileMap.testTiles)
    {
        foreach (Tile t in test.tileList)
        {
            Rectangle rectangle = new Rectangle((int)tempPosition.X, (int)tempPosition.Y, (int)playerImage.Width / 4, (int)playerImage.Height / 4);
            Rectangle rectangle2 = new Rectangle((int)t.position.X, (int)t.position.Y, t.width, t.height);
            if (rectangle.Intersects(rectangle2))
            {
                list.Add(t);
            }
        }
    }
    return list;
}
Yeah, I am pretty sure this is not the right way to check for tile collision. Any help would be great.
Sorry for the long post, any help would be much appreciated.
You are right. This is a very inefficient way to draw and check for collision on your tiles. What you should be looking into is a Quadtree data structure.
A quadtree will store your tiles in a manner that will allow you to query your world using a Rectangle, and your quadtree will return all tiles that are contained inside of that Rectangle.
List<Tiles> tiles = Quadtree.GetObjects(rectangle);
This allows you to select only the tiles that need to be processed. For example, when drawing your tiles, you could specify a Rectangle the size of your viewport, and only those tiles would be drawn (culling).
Another example, is you can query the world with your player's Rectangle and only check for collisions on the tiles that are returned for that portion of your world.
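As a rough usage sketch (the QuadTree type and its methods here are hypothetical and depend on whichever implementation you pick):
// build the tree once from the world bounds and all tiles
QuadTree<Tile> world = new QuadTree<Tile>(new Rectangle(0, 0, mapPixelWidth, mapPixelHeight));
foreach (Tile t in allTiles)
    world.Insert(t);

// drawing: only the tiles inside the camera's viewport (culling)
List<Tile> visible = world.GetObjects(cameraViewportRectangle);

// collision: only test the tiles around the player
List<Tile> nearPlayer = world.GetObjects(playerRectangle);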
For loading your tiles, you may want to consider loading into a two dimensional array, instead of a List. This would allow you to fetch a tile based on its position, instead of cross referencing it between two lists.
Tile[,] tiles = new Tile[mapWidth, mapHeight]; // dimensions of the map, in tiles
Tile tile = tiles[x, y];
Also, in this case, an array data structure would be a lot more efficient than using a List.
For uniform sets of tiles with standard widths and heights, it is quite easy to calculate which tiles are visible on the screen, and to determine which tile(s) your character is overlapping with. Even though I wrote the QuadTree in Jon's answer, I think it's overkill for this. Generally, the formula is:
tileX = someXCoordinate / tileWidth;  // integer division gives the tile index
tileY = someYCoordinate / tileHeight;
Then you can just look that up in a 2D array tiles[tileX, tileY]. For drawing, this can be used to figure out which tile is in the upper left corner of the screen, then either do the same again for the bottom right (+1), or add tiles to the upper left to fill the screen. Then your loop will look more like:
leftmostTile = screenX / tileWidth; // screenX is the left edge of the screen in world coords
topmostTile = screenY / tileHeight;
rightmostTile = (screenX + screenWidth) / tileWidth;
bottommostTile = (screenY + screenHeight) / tileHeight;
for (int tileX = leftmostTile; tileX <= rightmostTile; tileX++)
{
    for (int tileY = topmostTile; tileY <= bottommostTile; tileY++)
    {
        Tile t = tiles[tileX, tileY];
        // ... more stuff
    }
}
The same simple formula can be used to quickly figure out which tile(s) are under rectangular areas.
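For example, a sketch of that lookup for a rectangular area (my own, reusing the assumed names from above):
// find every tile overlapped by a rectangle such as the player's bounding box
int firstTileX = playerRect.Left / tileWidth;
int firstTileY = playerRect.Top / tileHeight;
int lastTileX = (playerRect.Right - 1) / tileWidth;
int lastTileY = (playerRect.Bottom - 1) / tileHeight;

for (int tileX = firstTileX; tileX <= lastTileX; tileX++)
{
    for (int tileY = firstTileY; tileY <= lastTileY; tileY++)
    {
        Tile t = tiles[tileX, tileY];
        // ... test collision against just these tiles
    }
}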
If, however, your tiles are non-uniform, or you have an isometric view, or you want the additional functionality that a QuadTree provides, I would consider Jon's answer and make use of a QuadTree. I would try to keep tiles out of the QuadTree if you can, though.
I am trying to extract out 3D distance in mm between two known points in a 2D image. I am using square AR markers in order to get the camera coordinates relative to the markers in the scene. The points are the corners of these markers.
An example is shown below:
The code is written in C# and I am using XNA. I am using AForge.NET for the coplanar POSIT.
The steps I take in order to work out the distance:
1. Mark corners on screen. Corners are represented in 2D vector form, Image centre is (0,0). Up is positive in the Y direction, right is positive in the X direction.
2. Use AForge.net Co-Planar POSIT algorithm to get pose of each marker:
float focalLength = 640; //Needed for POSIT
float halfCornerSize = 50; //Represents 1/2 an edge i.e. 50mm
AVector3[] modelPoints = new AVector3[]
{
    new AVector3( -halfCornerSize, 0, halfCornerSize ),
    new AVector3( halfCornerSize, 0, halfCornerSize ),
    new AVector3( halfCornerSize, 0, -halfCornerSize ),
    new AVector3( -halfCornerSize, 0, -halfCornerSize ),
};
CoplanarPosit coPosit = new CoplanarPosit(modelPoints, focalLength);
coPosit.EstimatePose(cornersToEstimate, out marker1Rot, out marker1Trans);
3. Convert to XNA rotation/translation matrix (AForge uses OpenGL matrix form):
float yaw, pitch, roll;
marker1Rot.ExtractYawPitchRoll(out yaw, out pitch, out roll);
Matrix xnaRot = Matrix.CreateFromYawPitchRoll(-yaw, -pitch, roll);
Matrix xnaTranslation = Matrix.CreateTranslation(marker1Trans.X, marker1Trans.Y, -marker1Trans.Z);
Matrix transform = xnaRot * xnaTranslation;
4. Find 3D coordinates of the corners:
//Model corner points
cornerModel = new Vector3[]
{
    new Vector3(halfCornerSize, 0, -halfCornerSize),
    new Vector3(-halfCornerSize, 0, -halfCornerSize),
    new Vector3(halfCornerSize, 0, halfCornerSize),
    new Vector3(-halfCornerSize, 0, halfCornerSize)
};
for (int i = 0; i < 4; i++)
{
    Matrix markerTransform = Matrix.CreateTranslation(cornerModel[i].X, cornerModel[i].Y, cornerModel[i].Z);
    cornerPositions3d1[i] = (markerTransform * transform).Translation;
    //DEBUG: project corner onto screen - represented by brown dots
    Vector3 t3 = viewPort.Project(markerTransform.Translation, projectionMatrix, viewMatrix, transform);
    cornersProjected1[i].X = t3.X; cornersProjected1[i].Y = t3.Y;
}
5. Look at the 3D distance between two corners on a marker, this represents 100mm. Find the scaling factor needed to convert this 3D distance to 100mm. (I actually get the average scaling factor):
for (int i = 0; i < 4; i++)
{
    //Distance scale;
    distanceScale1 += (halfCornerSize * 2) / Vector3.Distance(cornerPositions3d1[i], cornerPositions3d1[(i + 1) % 4]);
}
distanceScale1 /= 4;
6. Finally I find the 3D distance between related corners and multiply by the scaling factor to get distance in mm:
for (int i = 0; i < 4; i++)
{
    distance[i] = Vector3.Distance(cornerPositions3d1[i], cornerPositions3d2[i]) * scalingFactor;
}
The distances acquired are never truly correct. I used the cutting board as it allowed me easy calculation of what the distances should be. The above image calculated a distance of 147mm (expected 150mm) for corner 1 (red to purple). The image below shows 188mm (expected 200mm).
What is also worrying is the fact that when measuring the distance between marker corners sharing an edge on the same marker, the 3D distances obtained are never the same. Another thing I noticed is that the brown dots never seem to exactly match up with the colored dots. The colored dots are the coordinates used as input to the CoPlanar posit. The brown dots are the calculated positions from the center of the marker calculated via POSIT.
Does anyone have any idea what might be wrong here? I am pulling out my hair trying to figure it out. The code should be quite simple, I don't think I have made any obvious mistakes with the code. I am not great at maths so please point out where my basic maths might be wrong as well...
You are using way too many black boxes in your question. What is the focal length in the second step? Why go through yaw/pitch/roll in step 3? How do you calibrate? I recommend starting over from scratch without using libraries that you do not understand.
Step 1: Create a camera model. Understand the errors, build a projection. If needed, apply a 2D filter for lens distortion. This might be hard.
Step 2: Find your markers in 2D, after removing lens distortion. Make sure you know the error and that you get the center. Maybe over multiple frames.
Step 3: Un-project to 3D (a small sketch follows after step 5). After 1 and 2 this should be easy.
Step 4: ???
Step 5: Profit! (Measure distance in 3d and know your error)
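For step 3, a minimal un-projection sketch under a simple pinhole model (my own illustration, not part of the original answer; f is the focal length in pixels, (cx, cy) the principal point, z the known depth of the point along the optical axis):
static Vector3 Unproject(float u, float v, float z, float f, float cx, float cy)
{
    // pinhole back-projection of pixel (u, v) at depth z
    float x = (u - cx) * z / f;
    float y = (v - cy) * z / f;
    return new Vector3(x, y, z);
}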
I think you need a 3D photo (two photos taken a known distance apart) so you can get the distance from the parallax between the images.