I am trying to detect the bounding boxes of sentences in an image. I am using Emgu CV (OpenCV) in C# with the HoughLinesP method to extract lines, but I am obviously doing something wrong. I have looked at many examples, and estimating the skew level with HoughLines is pretty much what I am trying to do.
Using that sample image I do some pre-processing (thresholding, Canny, etc.) and end up with http://snag.gy/sWCuO.jpg, but when I then run HoughLines and draw the lines on the original image, I get http://snag.gy/ESKmR.jpg.
Here is an extract of my code:
using (MemStorage stor = new MemStorage())
{
    Image<Hsv, byte> imgHSV = new Image<Hsv, byte>(bitmap);
    Image<Gray, Byte> gray = imgHSV.Convert<Gray, Byte>().PyrDown().PyrUp();
    CvInvoke.cvCanny(gray, EdgeMap, 100, 400, 3);
    IntPtr lines = CvInvoke.cvHoughLines2(EdgeMap, stor,
        Emgu.CV.CvEnum.HOUGH_TYPE.CV_HOUGH_PROBABILISTIC, 1, Math.PI / 360, 10,
        gray.Width / 4, 20);
    Seq<LineSegment2D> segments = new Seq<LineSegment2D>(lines, stor);
    ar = segments.ToArray();
}
Graphics g = Graphics.FromImage(OriginalImage);
foreach (LineSegment2D line in ar)
{
    g.DrawLine(new Pen(Color.Blue),
        new Point(line.P1.X, line.P1.Y),
        new Point(line.P2.X, line.P2.Y));
}
g.Save();
Any help would be appreciated.
You can try two approaches:
1- Utilize the frequency domain. Example here
2- After pre-processing, extract the contours, collect all the points (or at least collect all points that are not black); find the minimum bounding rectangle with its angle. Example here
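A rough sketch of the second approach, assuming the coordinates of the non-black pixels have already been collected into an array. Instead of a library call it simply tries candidate angles in small steps and keeps the one whose axis-aligned bounding box is smallest. The names `SkewEstimator` and `EstimateSkewDegrees`, the step size, and the -45 to 45 degree search range are illustrative assumptions, not part of Emgu CV.

```csharp
using System;

static class SkewEstimator
{
    // Brute-force search (not OpenCV's minimum-area-rectangle algorithm):
    // returns the angle in degrees, within [-45, 45), by which the point set
    // appears rotated, measured in the coordinate system of the points.
    // Each point is a double[] { x, y }.
    public static double EstimateSkewDegrees(double[][] points, double stepDegrees)
    {
        double bestAngle = 0.0;
        double bestArea = double.MaxValue;

        for (int k = 0; -45.0 + k * stepDegrees < 45.0; k++)
        {
            double deg = -45.0 + k * stepDegrees;
            double rad = deg * Math.PI / 180.0;
            double cos = Math.Cos(rad), sin = Math.Sin(rad);

            double minX = double.MaxValue, maxX = double.MinValue;
            double minY = double.MaxValue, maxY = double.MinValue;

            foreach (double[] p in points)
            {
                // Rotate by -deg, i.e. undo a candidate skew of +deg.
                double x = p[0] * cos + p[1] * sin;
                double y = -p[0] * sin + p[1] * cos;
                if (x < minX) minX = x;
                if (x > maxX) maxX = x;
                if (y < minY) minY = y;
                if (y > maxY) maxY = y;
            }

            // The tightest axis-aligned box marks the best candidate angle.
            double area = (maxX - minX) * (maxY - minY);
            if (area < bestArea)
            {
                bestArea = area;
                bestAngle = deg;
            }
        }
        return bestAngle;
    }
}
```

The result is only as fine as the step size, and a near-square point cloud gives an ambiguous answer, so this is best suited to blocks of text that are clearly wider than they are tall.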
Related
I'm making a labeling tool.
Goal: the user draws a polygon on the picture, and the tool exports the part of the image inside the polygon.
example
extract
This is what I drew in my program.
But I don't know how to extract this region. I want to know how to extract this area.
I have saved the vertices of the polygon above in an object, but I don't know how to extract data from the image using these vertices.
========================================
So I found this.
https://www.codeproject.com/Articles/703519/Cropping-Particular-Region-In-Image-Using-Csharp
but it does not work:
Can't convert Bitmap to IplImage
It doesn't work for the same reason.
Following the post would require OpenCvSharp 4.x, but the program I am fixing targets .NET Framework 3.5, which OpenCvSharp 4.x does not support.
What should I do?
============================
I made a function referring to the answer, but it doesn't work...
I want to know why.
void CropImage(Bitmap bitmap, Point[] points)
{
    Rectangle rect = PaddingImage(points, bitmap);
    TextureBrush textureBrush = new TextureBrush(bitmap);
    Bitmap bmp1 = new Bitmap(rect.Width, rect.Height);
    using (Graphics g = Graphics.FromImage(bmp1))
    {
        g.FillPolygon(textureBrush, points);
    }
    string ima_path = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
    bmp1.Save(ima_path + "\\Image.png", ImageFormat.Png);
}
extract Image
original
If you use a small polygon, there is no output at all.
You will notice that the two images are slightly different.
It seems to me that the extracted part is centered on a different point than the part I drew. I don't know whether my guess is correct.
You would create a new bitmap, at least as large as the bounding box of your polygon. Create a graphics object from this new bitmap. You can then draw the polygon to this bitmap, using the original image as a texture brush. Note that you might need to apply a transform matrix to translate from the full-image coordinates to the cropped-image coordinates.
Note that it looks like you have radiological images. These are typically 16-bit images, so they need to be converted to 8-bit mono or 24-bit RGB before they can be used. This should already be done in the drawing code if you have access to the source; otherwise you can do it yourself.
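The steps above could be sketched roughly as follows with plain System.Drawing, assuming the input is already an 8-bit or 24/32-bit bitmap. The class and method names (`PolygonCrop`, `Crop`) are made up for illustration. The key line is the translate transform: without it the polygon is drawn at its full-image coordinates, which mostly fall outside a bitmap that is only as large as the bounding box.

```csharp
using System;
using System.Drawing;
using System.Linq;

static class PolygonCrop
{
    // Returns a new bitmap the size of the polygon's bounding box that contains
    // only the source pixels inside the polygon (everything else stays transparent).
    public static Bitmap Crop(Bitmap source, Point[] polygon)
    {
        int minX = polygon.Min(p => p.X), maxX = polygon.Max(p => p.X);
        int minY = polygon.Min(p => p.Y), maxY = polygon.Max(p => p.Y);
        Rectangle bounds = Rectangle.FromLTRB(minX, minY, maxX, maxY);
        if (bounds.Width <= 0 || bounds.Height <= 0)
            throw new ArgumentException("Polygon has an empty bounding box.");

        Bitmap result = new Bitmap(bounds.Width, bounds.Height);
        using (TextureBrush brush = new TextureBrush(source))
        using (Graphics g = Graphics.FromImage(result))
        {
            // Shift world coordinates so the bounding box starts at (0, 0).
            // The world transform moves the texture together with the polygon,
            // so the filled pixels stay aligned with the source image.
            g.TranslateTransform(-bounds.X, -bounds.Y);
            g.FillPolygon(brush, polygon);
        }
        return result;
    }
}
```

The caller owns the returned bitmap and should dispose it after saving.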
This works for me:
private Bitmap CropImage(Bitmap bitmap, List<Point> points)
{
    int pminx = 9999, pminy = 9999, pmaxx = 0, pmaxy = 0;
    System.Drawing.Point[] pcol = new System.Drawing.Point[points.Count];
    int i = 0;
    foreach (Point pc in points)
    {
        if (pc.X > pmaxx) pmaxx = (int)pc.X;
        if (pc.Y > pmaxy) pmaxy = (int)pc.Y;
        if (pc.X < pminx) pminx = (int)pc.X;
        if (pc.Y < pminy) pminy = (int)pc.Y;
        pcol[i] = new System.Drawing.Point((int)pc.X, (int)pc.Y);
        i++;
    }
    TextureBrush textureBrush = new TextureBrush(bitmap);
    Bitmap bmpWrk = new Bitmap(bitmap.Width, bitmap.Height);
    using (Graphics g = Graphics.FromImage(bmpWrk))
    {
        g.FillPolygon(textureBrush, pcol);
    }
    System.Drawing.Rectangle CropRect = new System.Drawing.Rectangle(pminx, pminy, pmaxx - pminx, pmaxy - pminy);
    return bmpWrk.Clone(CropRect, bmpWrk.PixelFormat);
}
I am not able to get good text-detection accuracy with Tesseract. Please check the code and images below.
Mat imgInput = CvInvoke.Imread(@"D:\workspace\raw2\IMG_20200625_194541.jpg", ImreadModes.AnyColor);
int kernel_size = 11;
//Dilation
Mat imgDilatedEdges = new Mat();
CvInvoke.Dilate(
    imgInput,
    imgDilatedEdges,
    CvInvoke.GetStructuringElement(
        ElementShape.Rectangle,
        new Size(kernel_size, kernel_size),
        new Point(1, 1)),
    new Point(1, 1),
    1,
    BorderType.Default,
    new MCvScalar(0));
//Blur
Mat imgBlur = new Mat();
CvInvoke.MedianBlur(imgDilatedEdges, imgBlur, kernel_size);
//Abs diff
Mat imgAbsDiff = new Mat();
CvInvoke.AbsDiff(imgInput, imgBlur, imgAbsDiff);
Mat imgNorm = imgAbsDiff;
//Normalize
CvInvoke.Normalize(imgAbsDiff, imgNorm, 0, 255, NormType.MinMax, DepthType.Default);
Mat imgThreshhold = new Mat();
//getting threshhold value
double thresholdval = CvInvoke.Threshold(imgAbsDiff, imgThreshhold, 230, 0, ThresholdType.Trunc);
//Normalize
CvInvoke.Normalize(imgThreshhold, imgThreshhold, 0, 255, NormType.MinMax, DepthType.Default);
imgThreshhold.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
//contrast correction
Mat lab = new Mat();
CvInvoke.CvtColor(imgThreshhold, lab, ColorConversion.Bgr2Lab);
VectorOfMat colorChannelB = new VectorOfMat();
CvInvoke.Split(lab, colorChannelB);
CvInvoke.CLAHE(colorChannelB[0], 3.0, new Size(12, 12), colorChannelB[0]);
Mat clahe = new Mat();
//merge
CvInvoke.Merge(colorChannelB, clahe);
Image<Bgr, byte> output = new Image<Bgr, byte>(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
Bitmap bmp = output.ToBitmap();
//setting image to 300 dpi since tesseract likes that
bmp.SetResolution(300, 300);
bmp.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
I am not getting the expected accuracy. Please check how the image is converted.
source image
converted image
I have posted a few images above for reference. For the first image I get garbage data; for the last two I get partial data.
Converting the image to grayscale and playing with the threshold gives better output.
If the threshold is the key part, how can I obtain a dynamic threshold value for each new image? This will run as a service, so the user will simply pass in an image and get the result. My app should be intelligent enough to process and understand the image on its own.
Do I have to adjust the contrast and threshold more accurately? If so, how? Or is the image itself faulty, i.e. is noise causing the problem?
Please let me know what I am doing wrong in the algorithm, or anything else that will help me understand the issue. If anyone knows, what are the ideal image preprocessing steps for OCR?
I am using C#, Emgu CV and Tesseract.
Any suggestions will be highly appreciated.
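On the question of a per-image threshold: one widely used technique is Otsu's method, which picks the gray level that best separates the histogram into two classes. The sketch below is offered only as an illustration of that idea, not as a fix for the pipeline above; it works on a plain 256-bin histogram, and the names `Otsu` and `Threshold` are assumptions made for this example. OpenCV's threshold function also offers an Otsu mode, which would normally be the practical choice.

```csharp
using System;

static class Otsu
{
    // histogram: 256 pixel counts for an 8-bit grayscale image.
    // Returns the threshold t that maximises the between-class variance,
    // where pixels with value <= t form one class and the rest the other.
    public static int Threshold(int[] histogram)
    {
        long total = 0;
        double sumAll = 0.0;
        for (int i = 0; i < 256; i++)
        {
            total += histogram[i];
            sumAll += (double)i * histogram[i];
        }

        long weightLow = 0;      // number of pixels with value <= t
        double sumLow = 0.0;     // sum of their gray values
        double bestVariance = -1.0;
        int best = 0;

        for (int t = 0; t < 256; t++)
        {
            weightLow += histogram[t];
            sumLow += (double)t * histogram[t];
            if (weightLow == 0) continue;      // no pixels in the low class yet
            long weightHigh = total - weightLow;
            if (weightHigh == 0) break;        // every pixel is in the low class

            double meanLow = sumLow / weightLow;
            double meanHigh = (sumAll - sumLow) / weightHigh;
            double diff = meanLow - meanHigh;
            double variance = (double)weightLow * weightHigh * diff * diff;
            if (variance > bestVariance)
            {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }
}
```

A single global threshold tends to struggle with uneven lighting, which may be one reason a fixed value works for some photos and not for others.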
I'm trying to detect the corners in an image using Emgu CV. To do that I used the Harris corner detection method in Emgu CV, but the output is blurred, and with that method I cannot get the number of corners in the image. While searching I found OpenCvSharp code that detects corners and gives the output I want.
I tried to convert that OpenCvSharp code to Emgu CV and got stuck when converting the 'Cv.GoodFeaturesToTrack()' method. The Emgu CV version requires 11 parameters; what should I pass for the last 4? Can someone help me?
The OpenCvSharp code is as follows:
IplImage src;
IplImage gray;
IplImage eigImg;
public void Grascale()
{
    gray = Cv.CreateImage(src.Size, BitDepth.U8, 1);
    Cv.CvtColor(src, gray, ColorConversion.RgbToGray);
    Cv.SaveImage("grayimg.jpg", src);
}
public void DetectCorners()
{
    Grascale();
    int cornerCount = 15000000;
    using (src)
    using (gray)
    using (IplImage eigImg = new IplImage(gray.GetSize(), BitDepth.F32, 1))
    using (IplImage tempImg = new IplImage(gray.GetSize(), BitDepth.F32, 1))
    {
        CvPoint2D32f[] corners;
        Cv.GoodFeaturesToTrack(gray, eigImg, tempImg, out corners, ref cornerCount, 0.1, 15);
        Cv.FindCornerSubPix(gray, corners, cornerCount, new CvSize(3, 3), new CvSize(-1, -1), new CvTermCriteria(20, 0.03));
        for (int i = 0; i < cornerCount; i++)
            Cv.Circle(src, corners[i], 3, new CvColor(0, 0, 255), 2);
        Cv.SaveImage("result_img.jpg", src);
    }
}
You had better use the findContours method. Then apply approxPolyDP to get the "corners" (depending on the shape), then cornerSubPix to refine the results, and finally apply some logic to group similar segments.
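To make the approxPolyDP step less of a black box: OpenCV documents it as an implementation of the Douglas-Peucker algorithm, which drops contour points that lie within a tolerance of the line joining their neighbours, so that mostly corner points remain. The sketch below is a plain C# version for an open polyline, with hypothetical names (`PolylineSimplifier`, `Simplify`); it is not the Emgu CV call and it ignores the special handling a closed contour would need.

```csharp
using System;
using System.Collections.Generic;

static class PolylineSimplifier
{
    // Douglas-Peucker simplification. Each point is a double[] { x, y }.
    // Points closer than epsilon to the simplified line are dropped.
    public static List<double[]> Simplify(IList<double[]> pts, double epsilon)
    {
        if (pts.Count < 3) return new List<double[]>(pts);

        bool[] keep = new bool[pts.Count];
        keep[0] = true;
        keep[pts.Count - 1] = true;
        Mark(pts, 0, pts.Count - 1, epsilon, keep);

        List<double[]> result = new List<double[]>();
        for (int i = 0; i < pts.Count; i++)
            if (keep[i]) result.Add(pts[i]);
        return result;
    }

    // Keeps the point farthest from the segment first..last if it is farther
    // than eps, then recurses into the two halves on either side of it.
    static void Mark(IList<double[]> pts, int first, int last, double eps, bool[] keep)
    {
        double maxDist = 0.0;
        int index = -1;
        for (int i = first + 1; i < last; i++)
        {
            double d = Distance(pts[i], pts[first], pts[last]);
            if (d > maxDist) { maxDist = d; index = i; }
        }
        if (index >= 0 && maxDist > eps)
        {
            keep[index] = true;
            Mark(pts, first, index, eps, keep);
            Mark(pts, index, last, eps, keep);
        }
    }

    // Perpendicular distance from p to the line through a and b
    // (or the distance to a when a and b coincide).
    static double Distance(double[] p, double[] a, double[] b)
    {
        double dx = b[0] - a[0], dy = b[1] - a[1];
        double len = Math.Sqrt(dx * dx + dy * dy);
        if (len == 0.0)
            return Math.Sqrt((p[0] - a[0]) * (p[0] - a[0]) + (p[1] - a[1]) * (p[1] - a[1]));
        return Math.Abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / len;
    }
}
```

The number of points that survive is a direct corner count for polygon-like shapes, which is the figure the question was after; the tolerance has to be tuned to the noise level of the contour.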
Today I started using Tesseract to parse portions of my screen for numbers. I have had a decent amount of success with larger text (which results in a higher resolution image). Now I am trying to use Tesseract in a more practical sense and the image quality is too low. I have tried increasing the resolution and redrawing with anti-aliasing, but I am not sure if I am even doing these things right. Do you have any suggestions as to how I might be able to get Tesseract to recognize the "12" in my tiny image?
Image:
static public void test()
{
    string readIn;
    TesseractEngine engine = new TesseractEngine(@".\tessdata", "eng", EngineMode.Default);
    engine.SetVariable("tessedit_char_whitelist", "0123456789"); //only read as numbers
    Rectangle rect = new Rectangle(181, 107, 25, 25);
    Bitmap bmp = new Bitmap(rect.Width, rect.Height, PixelFormat.Format32bppArgb);
    Graphics g = Graphics.FromImage(bmp);
    g.CopyFromScreen(rect.Left, rect.Top, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy);
    g.InterpolationMode = InterpolationMode.High;
    g.CompositingQuality = CompositingQuality.HighQuality;
    g.SmoothingMode = SmoothingMode.AntiAlias;
    g.DrawImage(bmp, rect.Width, rect.Height); //Do some anti-aliasing hopefully?
    bmp.SetResolution(300, 300); //Try increasing resolution??
    bmp.Save(@".\tmp.jpg");
    readIn = engine.Process(PixConverter.ToPix(bmp)).GetText();
    Console.WriteLine("This is what was read: " + readIn); //Empty
}
I suggest using image processing methods to improve the accuracy of tesseract-ocr. I use the OpenCV libraries in C++ for this.
So let's take your image and rescale it by +500%:
You can see the image is getting a bit pixelated. In this case you want to smooth the edges with a Gaussian filter. I used a Gaussian filter with a kernel size of 3x3:
The last thing you need to do is segment the digits using a threshold:
Running tesseract on the segmented image with the digit whitelist results in "12".
Hope this helped. :)
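For readers who want to see the three steps spelled out, here is a minimal sketch on a plain grayscale array rather than OpenCV, so nothing below is the answerer's actual code. The names (`TinyDigitPrep`, `Upscale`, `Gaussian3x3`, `Threshold`), the nearest-neighbour scaling, and the threshold value passed by the caller are all assumptions made for illustration.

```csharp
using System;

static class TinyDigitPrep
{
    // Step 1: enlarge the image by an integer factor (nearest neighbour).
    public static byte[,] Upscale(byte[,] src, int factor)
    {
        int h = src.GetLength(0), w = src.GetLength(1);
        byte[,] dst = new byte[h * factor, w * factor];
        for (int y = 0; y < h * factor; y++)
            for (int x = 0; x < w * factor; x++)
                dst[y, x] = src[y / factor, x / factor];
        return dst;
    }

    // Step 2: smooth the blocky edges with a 3x3 Gaussian kernel
    // [1 2 1; 2 4 2; 1 2 1] / 16. Border pixels reuse the nearest edge value.
    public static byte[,] Gaussian3x3(byte[,] src)
    {
        int h = src.GetLength(0), w = src.GetLength(1);
        int[] k = { 1, 2, 1 };
        byte[,] dst = new byte[h, w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                    {
                        int yy = Math.Min(Math.Max(y + dy, 0), h - 1);
                        int xx = Math.Min(Math.Max(x + dx, 0), w - 1);
                        sum += k[dy + 1] * k[dx + 1] * src[yy, xx];
                    }
                dst[y, x] = (byte)((sum + 8) / 16); // rounded division
            }
        return dst;
    }

    // Step 3: segment by thresholding. Values above t become white, the rest black.
    public static byte[,] Threshold(byte[,] src, byte t)
    {
        int h = src.GetLength(0), w = src.GetLength(1);
        byte[,] dst = new byte[h, w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                dst[y, x] = (byte)(src[y, x] > t ? 255 : 0);
        return dst;
    }
}
```

Chained together this reads `Threshold(Gaussian3x3(Upscale(pixels, 5)), 127)`; a smoother interpolation during upscaling would likely give cleaner digit edges than nearest neighbour.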
I'm fairly new to image processing and that being said, I was hoping someone could tell me if I am on the right track and if not, point me in the right direction and/or provide some code samples.
The requirements I am working on:
Detect the number of cookies on a baking sheet.
The cookies can be any color.
The cookies may be covered in chocolate (white or black), in which case there will be a mess of chocolate around each cookie, so a simple contrast check probably won't work.
The cookies will not overlap but they may touch one another.
I am trying to use the Emgu CV library with HoughCircles, but I am getting mixed results. Here is my WinForms C# code, which loads an image of cookies on a baking sheet and runs the detection on it (I am not confident in my values).
Am I on the right track? Any ideas? Code samples?
Below are some test images:
http://imgur.com/a/dJmU6, followed by my code
private int GetHoughCircles(Image image)
{
    Bitmap bitmap = new Bitmap(image);
    Image<Bgr, Byte> img = new Image<Bgr, byte>(bitmap).Resize(466, 345, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
    //Get and sharpen gray image (don't remember where I found this code; prob here on SO)
    Image<Gray, Byte> graySoft = img.Convert<Gray, Byte>().PyrDown().PyrUp();
    Image<Gray, Byte> gray = graySoft.SmoothGaussian(3);
    gray = gray.AddWeighted(graySoft, 1.5, -0.5, 0);
    Image<Gray, Byte> bin = gray.ThresholdBinary(new Gray(149), new Gray(255));
    Gray cannyThreshold = new Gray(150);
    Gray cannyThresholdLinking = new Gray(120);
    Gray circleAccumulatorThreshold = new Gray(50);
    Image<Gray, Byte> cannyEdges = bin.Canny(cannyThreshold.Intensity, cannyThresholdLinking.Intensity);
    //Image<Gray, Byte> cannyEdges = bin.Canny(cannyThreshold, cannyThresholdLinking);
    //Circles
    CircleF[] circles = cannyEdges.HoughCircles(
        cannyThreshold,
        circleAccumulatorThreshold,
        3.0, //Resolution of the accumulator used to detect centers of the circles
        50.0, //min distance
        20, //min radius
        30 //max radius
        )[0]; //Get the circles from the first channel
    //draw circles (on original image)
    foreach (CircleF circle in circles)
    {
        img.Draw(circle, new Bgr(Color.Brown), 2);
    }
    pictureBox1.Image = new Bitmap(img.ToBitmap());
    return circles.Count();
}