Finding the Negative Space in an Image or Cartesian Plane - C#

Please note: this is essentially a math question. However, I have also tagged C#, as this is the language I am working in.
Summary
I'm looking for an algorithm (or the name of one) that can find the negative space (or background) in an image. The closest I have found is Dijkstra's algorithm, which seems related, yet my problem is actually a subset of it: walk through a Cartesian plane, traversing every coordinate that isn't filled (black, in my case), to build a mask. Example below.
Example of Dijkstra's Algorithm
The background
I need to tidy up tens of thousands of images that have artefacts in them. By cleaning up, I mean these things specifically:
Using Edge Detection to find the edges of the objects in the images
Masking the negative space so I can convert the image backgrounds to plain white
Cropping the images to their optimal size.
Currently I'm using Canny Edge Detection to find the most important part of the image. I can crop the image fairly well (shown below) and can also find all the images that have the problem. However, I am having trouble locating the best algorithm (or the name of one) to find the negative space.
Example of the original image
As you can see, the image looks pretty clean; however, it's not.
Example of the accentuated problem
The image has lots of artefacts in the background, and they need to be removed.
Example of Canny Edge Detection
This does a wonderful job of cleaning up the image
The Problem
Dijkstra's algorithm's premise is that it explores all possible paths; it essentially solves the single-source shortest-path problem.
The problem is that the algorithm does much more than I need with regard to the weighting and the distance measures, and it stops once it has found the shortest path (whereas I need it to cover the whole image).
The pseudocode
function Dijkstra(Graph, source):

    create vertex set Q

    for each vertex v in Graph:           // Initialization
        dist[v] ← INFINITY                // Unknown distance from source to v
        prev[v] ← UNDEFINED               // Previous node in optimal path from source
        add v to Q                        // All nodes initially in Q (unvisited nodes)

    dist[source] ← 0                      // Distance from source to source

    while Q is not empty:
        u ← vertex in Q with min dist[u]  // Node with the least distance
                                          // will be selected first
        remove u from Q

        for each neighbor v of u:         // where v is still in Q
            alt ← dist[u] + length(u, v)
            if alt < dist[v]:             // A shorter path to v has been found
                dist[v] ← alt
                prev[v] ← u

    return dist[], prev[]
Can anyone suggest an algorithm, or a modification to Dijkstra's pseudocode, to achieve this?

The answer to the question was simply the flood-fill algorithm. Flood fill is essentially the traversal at the heart of Dijkstra's algorithm with the weights and shortest-path bookkeeping stripped out: it just visits every connected, unfilled pixel.
However, to solve the entire problem of cleaning subtle artefacts from images, the total solution was as follows:
Use Canny Edge Detection with appropriate thresholds to get the outline of objects in the image
Use a Gaussian Blur to blur the Canny results enough so the flood fill won't bleed
Use a flood fill to create the mask and apply it back to the original image (a sketch of this step follows)
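A minimal, unoptimised sketch of that masking step, assuming the blurred Canny result is a Bitmap whose edge pixels are bright; the method name, the corner seed and the 128 threshold are illustrative:
public static Bitmap FloodFillMask(Bitmap edges, Color fill, byte edgeThreshold = 128)
{
    // Stack-based flood fill seeded from the top-left corner, which is
    // assumed to be background; the fill stops at bright (edge) pixels.
    var mask = new Bitmap(edges.Width, edges.Height);
    var visited = new bool[edges.Width, edges.Height];
    var stack = new Stack<Point>();
    stack.Push(new Point(0, 0));
    while (stack.Count > 0)
    {
        var p = stack.Pop();
        if (p.X < 0 || p.Y < 0 || p.X >= edges.Width || p.Y >= edges.Height)
            continue;
        if (visited[p.X, p.Y])
            continue;
        visited[p.X, p.Y] = true;
        if (edges.GetPixel(p.X, p.Y).R >= edgeThreshold)
            continue; // hit an edge, stop so the fill can't bleed inside the object
        mask.SetPixel(p.X, p.Y, fill);
        stack.Push(new Point(p.X - 1, p.Y));
        stack.Push(new Point(p.X + 1, p.Y));
        stack.Push(new Point(p.X, p.Y - 1));
        stack.Push(new Point(p.X, p.Y + 1));
    }
    return mask;
}
GetPixel/SetPixel keep the sketch short but are slow; the traps below explain what to use instead.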
Some traps for young players:
PixelFormats: you need to make sure everything is talking the same format
Don't edit the bitmap pixel by pixel; use scanlines or locked pixels instead (a sketch follows this list)
Parallelise algorithms where possible; in this case the flood fill and the blur were good candidates
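To illustrate the locked-pixels point, here is a sketch of reading pixels through LockBits instead of GetPixel; it assumes the data can be treated as 32bpp ARGB, and the method itself is just an example:
public static int CountWhitePixels(Bitmap bmp)
{
    var data = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
        ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
    try
    {
        // For 32bpp the stride is width * 4, so one copy grabs every pixel
        var pixels = new int[bmp.Width * bmp.Height];
        Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
        int count = 0;
        foreach (int argb in pixels)
            if ((argb & 0x00FFFFFF) == 0x00FFFFFF) // ignore the alpha byte
                count++;
        return count;
    }
    finally
    {
        bmp.UnlockBits(data);
    }
}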
Update
An even faster method was simply to use a parallel FloodFill with a Color Threshold value.
Color Threshold
public static bool IsSimilarColor(this Color source, Color target, int threshold)
{
int r = source.R - target.R, g = source.G - target.G, b = source.B - target.B;
return (r * r + g * g + b * b) <= threshold * threshold;
}
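Comparing squared values saves a square root per pixel: with a threshold of 10, for example, Color.White.IsSimilarColor(Color.FromArgb(250, 250, 250), 10) is true, because 5² + 5² + 5² = 75 ≤ 10² = 100.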
Parallel FloodFill
public static Bitmap ToWhiteCorrection(this Bitmap source, Color sourceColor, Color targetColor, Color maskColor, int threshold, Size tableSize, int cpu = 0)
{
using (var dbMask = new DirectBitmap(source))
{
using (var dbDest = new DirectBitmap(source))
{
var options = new ParallelOptions
{
MaxDegreeOfParallelism = cpu <= 0 ? Environment.ProcessorCount : cpu
};
// Divide the image up
var rects = dbMask.Bounds.GetSubRects(tableSize);
Parallel.ForEach(rects, options, rect => ProcessWhiteCorrection(dbMask, dbDest, rect, sourceColor, targetColor, maskColor, threshold));
return dbDest.CloneBitmap();
}
}
}
Worker
private static void ProcessWhiteCorrection(this DirectBitmap dbMask, DirectBitmap dbDest, Rectangle rect, Color sourceColor, Color targetColor, Color maskColor, int threshold)
{
var pixels = new Stack<Point>();
// this basically looks at a 5 by 5 rectangle in all 4 corners of the current rect
// and looks to see if we are all the source color
// basically it just picks good places to start the fill
AddStartLocations(dbMask, rect, pixels, sourceColor, threshold);
while (pixels.Count > 0)
{
var point = pixels.Pop();
if (!rect.Contains(point))
{
continue;
}
if (!dbMask[point].IsSimilarColor(sourceColor, threshold))
{
continue;
}
dbMask[point] = maskColor;
dbDest[point] = targetColor;
pixels.Push(new Point(point.X - 1, point.Y));
pixels.Push(new Point(point.X + 1, point.Y));
pixels.Push(new Point(point.X, point.Y - 1));
pixels.Push(new Point(point.X, point.Y + 1));
}
}
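AddStartLocations isn't shown in the original. Reconstructing it from the comments above (check a 5 by 5 patch in each corner of the sub-rectangle and seed the fill there only if the whole patch matches the source colour), it might look something like this; the real implementation may differ:
private static void AddStartLocations(DirectBitmap dbMask, Rectangle rect, Stack<Point> pixels, Color sourceColor, int threshold)
{
    const int patch = 5;
    var corners = new[]
    {
        new Point(rect.Left, rect.Top),
        new Point(rect.Right - patch, rect.Top),
        new Point(rect.Left, rect.Bottom - patch),
        new Point(rect.Right - patch, rect.Bottom - patch)
    };
    foreach (var corner in corners)
    {
        // Only seed from a corner whose whole patch is background-like
        bool allMatch = true;
        for (int x = corner.X; x < corner.X + patch && allMatch; x++)
            for (int y = corner.Y; y < corner.Y + patch && allMatch; y++)
                if (!rect.Contains(x, y) || !dbMask[new Point(x, y)].IsSimilarColor(sourceColor, threshold))
                    allMatch = false;
        if (allMatch)
            pixels.Push(corner);
    }
}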
DirectBitmap
public class DirectBitmap : IDisposable
{
public DirectBitmap(int width, int height, PixelFormat pixelFormat = PixelFormat.Format32bppPArgb)
{
Width = width;
Height = height;
Bounds = new Rectangle(0, 0, Width, Height);
Bits = new int[width * height];
BitsHandle = GCHandle.Alloc(Bits, GCHandleType.Pinned);
Stride = width * 4; // the int[] backing store assumes a 32bpp format
Bitmap = new Bitmap(width, height, Stride, pixelFormat, BitsHandle.AddrOfPinnedObject());
using (var g = Graphics.FromImage(Bitmap))
{
g.Clear(Color.White);
}
}
public DirectBitmap(Bitmap source)
{
Width = source.Width;
Height = source.Height;
Bounds = new Rectangle(0, 0, Width, Height);
Bits = new int[source.Width * source.Height];
BitsHandle = GCHandle.Alloc(Bits, GCHandleType.Pinned);
Stride = (int)GetStride(PixelFormat, Width);
Bitmap = new Bitmap(source.Width, source.Height, Stride, PixelFormat.Format32bppPArgb, BitsHandle.AddrOfPinnedObject());
using (var g = Graphics.FromImage(Bitmap))
{
g.DrawImage(source, new Rectangle(0, 0, source.Width, source.Height));
}
}
...
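The rest of the class is elided above. Purely as a sketch, reconstructed from how DirectBitmap is used elsewhere in this answer (the Point indexer, CloneBitmap, Bounds and the GetSubRects extension), the missing pieces might look like this; the original implementation may well differ:
// Inside DirectBitmap:
public Bitmap Bitmap { get; private set; }
public int[] Bits { get; private set; }
public int Width { get; private set; }
public int Height { get; private set; }
public int Stride { get; private set; }
public Rectangle Bounds { get; private set; }
protected GCHandle BitsHandle { get; private set; }
public PixelFormat PixelFormat { get; } = PixelFormat.Format32bppPArgb;
public Color this[Point p]
{
    get { return Color.FromArgb(Bits[p.Y * Width + p.X]); }
    set { Bits[p.Y * Width + p.X] = value.ToArgb(); }
}
public Bitmap CloneBitmap()
{
    return new Bitmap(Bitmap); // a detached copy that outlives the pinned buffer
}
private static long GetStride(PixelFormat format, int width)
{
    return width * (Image.GetPixelFormatSize(format) / 8);
}
public void Dispose()
{
    Bitmap.Dispose();
    BitsHandle.Free();
}

// Assumed extension used by ToWhiteCorrection: split the bounds into a
// tableSize.Width by tableSize.Height grid of sub-rectangles (hypothetical)
public static class RectangleExtensions
{
    public static IEnumerable<Rectangle> GetSubRects(this Rectangle bounds, Size tableSize)
    {
        int w = (bounds.Width + tableSize.Width - 1) / tableSize.Width;
        int h = (bounds.Height + tableSize.Height - 1) / tableSize.Height;
        for (int y = 0; y < tableSize.Height; y++)
            for (int x = 0; x < tableSize.Width; x++)
                yield return Rectangle.Intersect(bounds, new Rectangle(bounds.X + x * w, bounds.Y + y * h, w, h));
    }
}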

Related

Cut faraway objects based on depth map

I would like to do GrabCut using a depth map that cuts away far objects, to be used in a mixed reality application. I would like to show just the front of what I see, with the background replaced by a virtual reality scene.
The problem right now: I tried to adapt some code, and what I get is the front, which is cut out, but in black; it is actually the mask.
I don't know where the problem lies.
The input is a depth map from a ZED camera.
Here is a picture of the behaviour:
My trial:
private void convertToGrayScaleValues(Mat mask)
{
int width = mask.rows();  // note: rows() is the image height and cols() the width;
int height = mask.cols(); // the names are swapped, but the loop below still visits every byte exactly once
byte[] buffer = new byte[width * height];
mask.get(0, 0, buffer);
for (int x = 0; x < width; x++)
{
for (int y = 0; y < height; y++)
{
int value = buffer[y * width + x];
if (value == Imgproc.GC_BGD)
{
buffer[y * width + x] = 0; // for sure background
}
else if (value == Imgproc.GC_PR_BGD)
{
buffer[y * width + x] = 85; // probably background
}
else if (value == Imgproc.GC_PR_FGD)
{
buffer[y * width + x] = (byte)170; // probably foreground
}
else
{
buffer[y * width + x] = (byte)255; // for sure foreground
}
}
}
mask.put(0, 0, buffer);
}
For Each depth frame from Camera:
Mat erodeElement = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(4, 4));
Mat dilateElement = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(7, 7));
depth.copyTo(maskFar);
Core.normalize(maskFar, maskFar, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
Imgproc.cvtColor(maskFar, maskFar, Imgproc.COLOR_BGR2GRAY);
Imgproc.threshold(maskFar, maskFar, 180, 255, Imgproc.THRESH_BINARY);
Imgproc.dilate(maskFar, maskFar, erodeElement);
Imgproc.erode(maskFar, maskFar, dilateElement);
Mat bgModel = new Mat();
Mat fgModel = new Mat();
Imgproc.grabCut(image, maskFar, new OpenCVForUnity.CoreModule.Rect(), bgModel, fgModel, 1, Imgproc.GC_INIT_WITH_MASK);
convertToGrayScaleValues(maskFar); // back to grayscale values
Imgproc.threshold(maskFar, maskFar, 180, 255, Imgproc.THRESH_TOZERO);
Mat foreground = new Mat(image.size(), CvType.CV_8UC4, new Scalar(0, 0, 0));
image.copyTo(foreground, maskFar);
Utils.fastMatToTexture2D(foreground, texture);
In this case, a graph cut on the depth image might not be the right method to solve your whole issue.
If you insist that the processing should be done on the depth image, to find everything that is not on the table and filter out the table part, you could first apply a disparity-based approach for finding the objects that are not on the ground. Reference: https://github.com/windowsub0406/StereoVision
Then, based on the V-disparity output image, find the locally connected components that are grouped together. You may follow this link, how to do this disparity map in OpenCV, which asks in a similar way how to find the objects that are not on the ground.
If you are OK with RGB-based approaches, then using a deep-learning-based method to recognise the monitor should be the correct approach. It can directly detect the monitor's bounding box. By applying this bounding box to the depth image, you may get what you want. For deep-learning-based approaches there are many available packages, such as the YOLO series. You may find one that is suitable for you. Reference: https://medium.com/@dvshah13/project-image-recognition-1d316d04cb4c

Ellipse Texture not doing what is expected

I am generating dynamic textures in MonoGame for simple shapes. Yes, I know the disadvantages of this system, but I am just experimenting with building my own physics engine. I am trying to generate the texture for an ellipse as described here.
I have a function PaintDescriptor that takes an x and y pixel coordinate and gives back what color it should be. Red is just for debugging; normally it would be Color.Transparent.
public override Color PaintDescriptor(int x, int y)
{
float c = (float)Width / 2;
float d = (float)Height / 2;
return pow((x - c) / c, 2) + pow((y - d) / d, 2) <= 1 ? BackgroundColor : Color.Red;
}
Now this works if Width == Height, i.e. a circle. However, if they are not equal, it generates a texture with some ellipse-like shapes, but also with banding/striping.
I have tried checking whether my width and height were switched, and I've tried several other things. One thing to note: in the normal coordinate system on Desmos I have (y + d) / d, but since the screen's y axis is flipped, I have to flip the y offset in the code: (y - d) / d. The rest of the relevant code for texture generation and drawing is here:
public Texture2D GenerateTexture(GraphicsDevice device, Func<int, int, Color> paint)
{
Texture2D texture = new Texture2D(device, Width, Height);
Color[] data = new Color[Width * Height];
for (int pixel = 0; pixel < data.Count(); pixel++)
data[pixel] = paint(pixel / Width, pixel % Height);
texture.SetData(data);
return texture;
}
public void Draw(float scale = 1, float layerdepth = 0, SpriteEffects se = SpriteEffects.None)
{
if (SBRef == null)
throw new Exception("No reference to spritebatch object");
SBRef.Draw(Texture, new Vector2(X, Y), null, null, null, 0, new Vector2(scale, scale), Color.White, se, layerdepth);
}
public float pow(float num, float power) //this is a redirect of math.pow to make code shorter and more readable
{
return (float)Math.Pow(num, power);
}
Why doesn't this match Desmos? Why does it not make an ellipse?
EDIT: I forgot to mention that one possible solution I have come across is to always draw a circle and then scale it to the desired width and height. This is not acceptable to me, partly because of possible blurriness or other artifacts in drawing, but mainly because I want to understand whatever I'm not currently getting with this solution.
After sleeping and coming back with a fresh mindset for about the 10th time, I found the answer. In the function GenerateTexture:
data[pixel] = paint(pixel / Width, pixel % Height);
should be
data[pixel] = paint(pixel % Width, pixel / Width);
The data array is laid out row by row (index = y * Width + x), so the x coordinate is pixel % Width and the y coordinate is pixel / Width.

Reading a wedge area of Circular bitmap in c#

I am working on a program that takes a Bitmap and converts it into circular form. The code is as follows:
public static Image CropToCircle(Image srcImage, Color backGround)
{
Image dstImage = new Bitmap(srcImage.Width, srcImage.Height, srcImage.PixelFormat);
Graphics g = Graphics.FromImage(dstImage);
using (Brush br = new SolidBrush(backGround)) {
g.FillRectangle(br, 0, 0, dstImage.Width, dstImage.Height);
}
GraphicsPath path = new GraphicsPath();
path.AddEllipse(0, 0, dstImage.Width, dstImage.Height);
g.SetClip(path);
g.DrawImage(srcImage, 0, 0);
return dstImage;
}
It returns the image in circular shape; however, I need to read an image wedge in terms of degrees. That is, the circle has 360 degrees, and I am trying to write a function that will accept a degree (e.g. 10) and return the pixels of the image that fall in the 10th degree, such that the entire image is readable over degrees 1 to 360.
Since my hint was actually rather misleading, let me make up for it by giving you working code:
// collect a list of colors from a bitmap along one angle, from a center c out to radius r
List<Color> getColorsByAngle(Bitmap bmp, Point c, int r, float angle)
{
List<Color> colors = new List<Color>();
for (int i = 0; i < r; i++)
{
int x = (int)(Math.Sin(angle / 180f * Math.PI) * i);
int y = (int)(Math.Cos(angle / 180f * Math.PI) * i);
colors.Add(bmp.GetPixel(c.X + x, c.Y + y));
}
return colors;
}
Here it is at work:
(The gif is rather quantized for size..)
Note that:
Pixels close to the center will be read multiple times, and the center itself every time.
To collect all outer pixels you need to read as many angles as the circumference of the circle has pixels, i.e. 2 * PI * radius. So for a circle with a radius of 300 pixels you need to step the angle by 360° / (600 * 3.14), or about 0.2°.
Also note that the coordinate systems in GDI and in geometry are not the same, neither in the direction of the axes nor in the angles. Adapting this is left to you.
The original version of the question didn't mention a 'wedge area'. To read an area, or the whole image, simply loop over an angle range in suitable steps, as sketched below!
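A sketch of reading a whole wedge this way, stepping the angle as described above; the method name and signature are illustrative:
List<Color> getWedgeColors(Bitmap bmp, Point c, int r, float startAngle, float sweep)
{
    var colors = new List<Color>();
    // one step per pixel on the outer ring: 360° / (2 * PI * r), ~0.2° for r = 300
    float step = 360f / (float)(2 * Math.PI * r);
    for (float a = startAngle; a < startAngle + sweep; a += step)
        colors.AddRange(getColorsByAngle(bmp, c, r, a));
    return colors;
}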

How to draw thousands of tiles without killing FPS

I've looked everywhere for a workaround to this issue (I may just be blind to the solutions lying around). My game currently renders the tilemap on the screen and will not render tiles that are not actually within the screen bounds. However, each tile is 16x16 pixels; that means 8100 tiles to draw if every pixel on the screen contains a tile at 1920x1080 resolution.
Drawing that many tiles every cycle really kills my FPS. If I run 800x600 resolution my FPS goes to ~20, and at 1920x1080 it runs at around 3-5 FPS. This really drives me nuts.
I've tried threading and using async tasks, but those just flicker the screen. Probably just me coding it incorrectly.
Here's the drawing code that I currently use.
// Get top-left tile-X
Vector topLeft = new Vector(Screen.Camera.X / 16 - 1,
Screen.Camera.Y / 16 - 1);
Vector bottomRight = new Vector(topLeft.X + (Screen.Width / 16) + 2,
topLeft.Y + (Screen.Height / 16) + 2);
// Iterate sections
foreach (WorldSection section in Sections)
{
// Continue if out of bounds
if (section.X + ((Screen.Width / 16) + 2) < (int)topLeft.X ||
section.X >= bottomRight.X)
continue;
// Draw all tiles within the screen range
for (int x = topLeft.X; x < bottomRight.X; x++)
for (int y = topLeft.Y; y < bottomRight.Y; y++)
if (section.Blocks[x - section.X, y] != '0')
DrawBlock(section.Blocks[x - section.X, y],
x + section.X, y);
}
There are between 8 and 12 sections. Each tile is represented by a char object in the two-dimensional array.
Draw block method:
public void DrawBlock(char block, int x, int y)
{
// Get the source rectangle
Rectangle source = new Rectangle(Index(block) % Columns * FrameWidth,
Index(block) / Columns * FrameHeight, FrameWidth, FrameHeight);
// Get position
Vector2 position = new Vector2(x, y);
// Draw the block
Game.spriteBatch.Draw(Frameset, position * new Vector2(FrameWidth, FrameHeight) - Screen.Camera, source, Color.White);
}
The Index() method just returns the frame index of the tile corresponding to the char.
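For reference, Index() could be as simple as a lookup built at load time (the dictionary name is hypothetical):
private int Index(char block)
{
    // frameIndices: a Dictionary<char, int> mapping tile characters to frame numbers
    return frameIndices[block];
}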
I'm wondering how I could make it possible to actually allow this much to be drawn at once without killing the framerate in this manner. Is the code I provided clearly not very optimized, or is it something specific I should be doing to make it possible to draw this many individual tiles without reducing performance?
Not sure if this is the best way to deal with the problem, but I've started to use RenderTarget2D to pre-render chunks of the world into textures. I have to load chunks within a given area around the actual screen bounds at a time, because loading all chunks at once will make it run out of memory.
When you get close to the bounds of the current pre-rendered area, it will re-process chunks based on your new position in the world. The processing takes roughly 100 milliseconds, so when loading new areas the player will feel a slight slowdown for this duration. I don't really like that, but at least the FPS is 60 now.
Here's my chunk processor:
public bool ProcessChunk(int x, int y)
{
// Create render target
using (RenderTarget2D target = new RenderTarget2D(Game.CurrentDevice, 16 * 48, 16 * 48,
false, SurfaceFormat.Color, DepthFormat.Depth24))
{
// Set render target
Game.CurrentDevice.SetRenderTarget(target);
// Clear back buffer
Game.CurrentDevice.Clear(Color.Black * 0f);
// Begin drawing
Game.spriteBatch.Begin(SpriteSortMode.Texture, BlendState.AlphaBlend);
// Get block coordinates
int bx = x * 48,
by = y * 48;
// Draw blocks
int count = 0;
foreach (WorldSection section in Sections)
{
// Continue if section is out of chunk bounds
if (section.X >= bx + 48) continue;
// Draw all tiles within the screen range
for (int ax = 0; ax < 48; ax++)
for (int ay = 0; ay < 48; ay++)
{
// Get the block character
char b = section.Blocks[ax + bx - section.X, ay + by];
// Draw the block unless it's an empty block
if (b != '0')
{
Processor.Blocks[b.ToString()].DrawBlock(new Vector2(ax, ay), true);
count++;
}
}
}
// End drawing
Game.spriteBatch.End();
// Clear target
target.GraphicsDevice.SetRenderTarget(null);
// Set texture
if (count > 0)
{
// Create texture
Chunks[x, y] = new Texture2D(Game.CurrentDevice, target.Width, target.Height, true, target.Format);
// Set data
Color[] data = new Color[target.Width * target.Height];
target.GetData<Color>(data);
Chunks[x, y].SetData<Color>(data);
// Return true
return true;
}
}
// Return false
return false;
}
If there are any suggestions on how this approach can be improved, I won't be sad to hear them!
Thanks for the help given here!

How to determine the background color of document when there are 3 options, using c# or imagemagick

I am currently developing an application that has to process scanned forms. One of the tasks of my application is to determine which kind of form was scanned. There are 3 possible types of forms, with a unique background color identifying each kind. The 3 possible colors are red/pink, green and blue. The problem I am having is that my attempts fail to distinguish between the green and blue forms.
Here are links to the green and blue sample files:
http://dl.dropbox.com/u/686228/Image0037.JPG
http://dl.dropbox.com/u/686228/Image0038.JPG
I am using a C# .NET application and ImageMagick for some of the tasks I need to perform.
Currently I am getting a color-reduced histogram of my scanned form and trying to determine which colors are in the form. But my app can't reliably distinguish the green and blue ones.
Any advice, or perhaps a smarter approach, would be gladly appreciated.
Thanks,
Erik
I found this rather interesting and dug into it a little deeper.
The code to get the average color of a bitmap, found at How to calculate the average rgb color values of a bitmap, had problems like some invalid casts and swapped red/blue channels. Here is a fixed version:
private System.Drawing.Color CalculateAverageColor(Bitmap bm)
{
int width = bm.Width;
int height = bm.Height;
int red = 0;
int green = 0;
int blue = 0;
int minDiversion = 15; // drop pixels that do not differ by at least minDiversion between color values (white, gray or black)
int dropped = 0; // keep track of dropped pixels
long[] totals = new long[] { 0, 0, 0 };
int bppModifier = bm.PixelFormat == System.Drawing.Imaging.PixelFormat.Format24bppRgb ? 3 : 4; // cutting corners, will fail on anything else but 32 and 24 bit images
BitmapData srcData = bm.LockBits(new System.Drawing.Rectangle(0, 0, bm.Width, bm.Height), ImageLockMode.ReadOnly, bm.PixelFormat);
int stride = srcData.Stride;
IntPtr Scan0 = srcData.Scan0;
unsafe
{
byte* p = (byte*)(void*)Scan0;
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
int idx = (y * stride) + x * bppModifier;
red = p[idx + 2];
green = p[idx + 1];
blue = p[idx];
if (Math.Abs(red - green) > minDiversion || Math.Abs(red - blue) > minDiversion || Math.Abs(green - blue) > minDiversion)
{
totals[2] += red;
totals[1] += green;
totals[0] += blue;
}
else
{
dropped++;
}
}
}
}
int count = width * height - dropped;
int avgR = (int)(totals[2] / count);
int avgG = (int)(totals[1] / count);
int avgB = (int)(totals[0] / count);
return System.Drawing.Color.FromArgb(avgR, avgG, avgB);
}
Running this function on your input images, however, returned an indistinguishable grayish color for both of them, as already anticipated by Will A in the comments, which is why I'm dropping from the calculation any pixels whose R, G and B values do not differ by at least 15.
The interesting thing is that the supposedly blue prescription scan averages equal values for G and B (R: 214, G: 237, B: 237), whereas the green prescription scan shows a big difference (18) between G and B (R: 202, G: 232, B: 214). That might be what you should be looking into. For example:
if (color.G - color.B > 15) { form.Type = FormTypes.GreenForm; }
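Tying it together, here is a sketch of classifying all three form types from the average colour; the red/pink test and the threshold of 15 are assumptions to tune against real scans:
public enum FormTypes { RedForm, GreenForm, BlueForm }

public FormTypes ClassifyForm(Bitmap scan)
{
    var avg = CalculateAverageColor(scan);
    if (avg.R >= avg.G) return FormTypes.RedForm;       // red/pink scans should average R highest
    if (avg.G - avg.B > 15) return FormTypes.GreenForm; // green: G well above B
    return FormTypes.BlueForm;                          // blue: G and B nearly equal
}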
Workflow outline (a sketch in code follows the list):
Convert Image into HSL/HSV colorspace
Build histogram of the H channel
The histogram should show a clear peak (for your samples at least) in the blue/green region
If that's not distinctive enough, you can weight the histogram votes by something to reduce the effect of the white areas (e.g. in the HSV colorspace, weight by S).
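A sketch of that outline using System.Drawing's built-in HSL accessors (untested; the bin count and sampling stride are arbitrary choices):
static float DominantHue(Bitmap bmp, int bins = 36)
{
    var votes = new float[bins];
    for (int y = 0; y < bmp.Height; y += 4)        // sparse sampling is enough here
        for (int x = 0; x < bmp.Width; x += 4)
        {
            Color c = bmp.GetPixel(x, y);
            int bin = (int)(c.GetHue() / 360f * bins) % bins;
            votes[bin] += c.GetSaturation();       // weight by saturation, so white paper barely votes
        }
    int best = 0;
    for (int i = 1; i < bins; i++)
        if (votes[i] > votes[best]) best = i;
    return (best + 0.5f) * 360f / bins;            // centre of the winning bin
}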
I haven't tried this out, but how about resizing the image to 1x1 pixel (which should "average" out all the pixels) and then checking the hue of that pixel to see whether it is closest to red, blue or green?
EDIT
I don't have ImageMagick installed, so I hacked this together with GetThumbnailImage:
private static bool ThumbnailCallback()
{
return false;
}
static void Main(string[] args)
{
var blueImage = Image.FromFile("blue.jpg").GetThumbnailImage(1, 1, new Image.GetThumbnailImageAbort(ThumbnailCallback), IntPtr.Zero);
var blueBitmap = new Bitmap(blueImage);
var blueHue = blueBitmap.GetPixel(0, 0).GetHue();
var greenImage = Image.FromFile("green.jpg").GetThumbnailImage(1, 1, new Image.GetThumbnailImageAbort(ThumbnailCallback), IntPtr.Zero);
var greenBitmap = new Bitmap(greenImage);
var greenHue = greenBitmap.GetPixel(0, 0).GetHue();
}
Using your images I got a blueHue value of 169.0909 and a greenHue of 140; for reference, pure green is 120, cyan is 180 and pure blue is 240, so the blue form sits noticeably closer to cyan than the green one does.
Red forms should be somewhere near 0 or 360.
I know you've already found an answer - just thought I'd give you an alternative.
