I am not able to get good text-detection accuracy with Tesseract. Please check the code and images below.
Mat imgInput = CvInvoke.Imread(@"D:\workspace\raw2\IMG_20200625_194541.jpg", ImreadModes.AnyColor);
int kernel_size = 11;
//Dilation
Mat imgDilatedEdges = new Mat();
CvInvoke.Dilate(
imgInput,
imgDilatedEdges,
CvInvoke.GetStructuringElement(
ElementShape.Rectangle,
new Size(kernel_size, kernel_size),
new Point(1, 1)),
new Point(1, 1),
1,
BorderType.Default,
new MCvScalar(0));
//Blur
Mat imgBlur = new Mat();
CvInvoke.MedianBlur(imgDilatedEdges, imgBlur, kernel_size);
//Abs diff
Mat imgAbsDiff = new Mat();
CvInvoke.AbsDiff(imgInput, imgBlur, imgAbsDiff);
Mat imgNorm = imgAbsDiff;
//Normalize
CvInvoke.Normalize(imgAbsDiff, imgNorm, 0, 255, NormType.MinMax, DepthType.Default);
Mat imgThreshhold = new Mat();
//getting threshhold value
double thresholdval = CvInvoke.Threshold(imgAbsDiff, imgThreshhold, 230, 0, ThresholdType.Trunc);
//Normalize
CvInvoke.Normalize(imgThreshhold, imgThreshhold, 0, 255, NormType.MinMax, DepthType.Default);
imgThreshhold.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
//contrast correction
Mat lab = new Mat();
CvInvoke.CvtColor(imgThreshhold, lab, ColorConversion.Bgr2Lab);
VectorOfMat colorChannelB = new VectorOfMat();
CvInvoke.Split(lab, colorChannelB);
CvInvoke.CLAHE(colorChannelB[0], 3.0, new Size(12, 12), colorChannelB[0]);
Mat clahe = new Mat();
//merge
CvInvoke.Merge(colorChannelB, clahe);
Image<Bgr, byte> output = new Image<Bgr, byte>(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
Bitmap bmp = output.ToBitmap();
//setting image to 300 dpi since tesseract likes that
bmp.SetResolution(300, 300);
bmp.Save(@"D:\workspace\ocr_images\IMG_20200625_194541.jpg");
I am not getting the expected accuracy. Please check how the image is converted.
source image
converted image
I have posted a few images above for reference. For the first image I get garbage data; for the last two images I get partial data.
Converting the image to grayscale and experimenting with the threshold gives better output.
If the threshold is the key part, how can I compute a threshold value dynamically for each new image? This is going to run as a service, so the user will simply pass in an image and get the result. My app should be intelligent enough to process and understand the image.
Do I have to adjust contrast and threshold more accurately? If so, how do I do that? Or is the image itself faulty, i.e. is noise causing the problem?
Please let me know what I am doing wrong in the algorithm, or anything else that will help me understand the issue. If anyone knows, what are the ideal image preprocessing steps for OCR?
I am using C#, Emgu CV, and Tesseract.
Any suggestions will be highly appreciated.
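On the question of choosing a threshold per image: one common option is Otsu's method, which derives the threshold from the image's own histogram (OpenCV exposes it as a flag on its threshold function). Below is a minimal sketch in plain Python, assuming an 8-bit grayscale image given as a flat list of pixel values. The name `otsu_threshold` is hypothetical and not a library call.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold t that maximises between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    # Histogram of the 256 possible intensities.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels in the background class so far
    sum_bg = 0  # sum of intensities in the background class so far
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # everything is background; nothing left to split
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Usage: binarise with the computed threshold.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
```

A single global threshold like this assumes roughly even lighting; for photos with shadows or gradients, a local (adaptive) threshold may work better.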
Related
I'm trying to take a smaller image mat and copy it into larger mat so I can resize it while keeping the aspect ratio of the image. So, basically this:
So far, this is the code I've written:
private Mat MakeMatFrame(Texture2D image)
{
// Texture must be of right input size
Mat img_mat = new Mat(image.height, image.width, CvType.CV_8UC4, new Scalar(0, 0, 0, 255));
texture2DToMat(image, img_mat);
return img_mat;
}
private void letterBoxImage(Texture2D image)
{
// Get input image as a mat
Mat source = MakeMatFrame(image);
// Create the mat that the source will be put in
int col = source.cols();
int row = source.rows();
int _max = Math.Max(col, row);
Mat resized = Mat.zeros(_max, _max, CvType.CV_8UC4);
// Fill resized
Mat roi = new Mat(resized, new Rect(0, 0, col, row));
source.copyTo(roi);
Texture2D tex2d = new Texture2D(resized.cols(), resized.rows());
matToTexture2D(resized, tex2d);
rawImage.texture = tex2d;
}
Everything I've looked at tells me this is the right approach to take (get a region of interest, fill it in). But instead of getting that third image with the children above the gray region, I just have a gray region.
In other words, the image isn't copying over properly. I've tried using a submat as well, but it failed miserably.
I've been looking for C# code on how to do this sort of thing with OpenCV for Unity, but I can only find C++ code, which tells me to do exactly this.
Is there some sort of "apply changes" function I'm unaware of for Mats? Am I selecting the region of interest incorrectly? Or is it something else?
Sorry for my English, but your code has a bug.
Mat roi = new Mat(resized, new Rect(0, 0, col, row));
The image is copied to roi, but this Mat is not related to the resized Mat, so you have to do it like this:
Rect roi=new Rect(0,0,width,height);
source.copyTo(resized.submat(roi));
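For illustration only, the letterboxing idea itself (a square canvas of side max(cols, rows) with the source copied into its top-left corner) can be sketched without any OpenCV dependency. This is a plain-Python sketch on nested lists. `letterbox` and `fill` are hypothetical names, and it does not reproduce the OpenCV for Unity API.

```python
def letterbox(source, fill=0):
    """Copy a 2-D image (a list of rows) into the top-left of a square canvas."""
    rows = len(source)
    cols = len(source[0]) if rows else 0
    side = max(rows, cols)
    # Square canvas pre-filled with the padding value.
    canvas = [[fill] * side for _ in range(side)]
    # Write each source row into the matching canvas row (the "region of interest").
    for y in range(rows):
        canvas[y][:cols] = source[y]
    return canvas


# Usage: a 2x3 image becomes 3x3 with one padded row.
padded = letterbox([[1, 2, 3], [4, 5, 6]])
```

The key point is that the writes go into the canvas's own storage; if the destination were a separate copy, the canvas would stay blank.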
This may be a stupid question, but how can I apply a threshold so that the depth distance of the camera can be changed? Right now I am using Cv2.Threshold to do that, but with the Otsu method the whole picture changes to one color instead of several different colors.
The code used:
var colorizedDepth = colorizer.Process<VideoFrame>(depthFrame).DisposeWith(frames);
Mat testcd = new Mat(colorizedDepth.Height, colorizedDepth.Width, MatType.CV_8UC3, colorizedDepth.Data);
Mat testgd = new Mat();
Cv2.CvtColor(testcd, testgd, ColorConversionCodes.RGBA2GRAY);
Mat testbd = new Mat();
Cv2.Threshold(testgd, testbd, 0, 255, ThresholdTypes.Otsu | ThresholdTypes.Binary);
Cv2.ImShow("camera", testgd);
Cv2.WaitKey(0);
The code to get the colorized depth comes from the librealsense C# wrapper:
https://github.com/IntelRealSense/librealsense/tree/master/wrappers/csharp
Does anyone know what I am doing wrong for the threshold so that the depth distances get changed?
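One point that may help frame the problem: Otsu picks a single threshold automatically from the histogram, so it cannot be steered toward a chosen distance. Selecting by distance is usually done with a fixed range test on the raw depth values rather than on the colorized image. Below is a minimal sketch in plain Python, assuming depth values in metres stored as a list of rows, with 0 meaning "no reading". `depth_mask`, `near` and `far` are hypothetical names.

```python
def depth_mask(depth, near, far):
    """Return a 0/255 mask: 255 where near <= depth <= far, otherwise 0."""
    return [[255 if near <= d <= far else 0 for d in row] for row in depth]


# Usage: keep only pixels between 0.5 m and 1.5 m from the camera.
depth = [
    [0.0, 0.5, 1.2],
    [2.5, 0.9, 3.0],
]
mask = depth_mask(depth, 0.5, 1.5)
```

Changing `near` and `far` then changes which distances survive, which is the adjustable behaviour the question asks about.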
For a school project, I am trying to calibrate a Dahua IP Camera, which has Fisheye distortion.
We need to calibrate the camera to undistort the image, because we need a flat image to do image processing.
So far, we have managed to do the calibration in Python with OpenCV, but the rest of the script is written in C#, so we would like to convert the code to C# using the EmguCV wrapper (OpenCV for .NET)
Correct me if I'm wrong, but so far I have done these steps:
I took 50 pictures of a Chessboard Grid, which is used to find the Corners
I let OpenCV calculate the correct matrices, one called CameraMatrix and the other the Distortion Coefficients.
With the 2 calculated matrices, we then Undistort an image, and the result is a flat image with no distortion.
In Python, this code works. I get 2 matrices that work to undistort.
I tried to copy these matrices in C#, without doing the rest of the Calibration. Since it's the same camera setup, the Distortion should be the same.
However, when I try to hardcode the matrices, the result is not what I want.
Could it be an issue related to EmguCV (The wrapper of OpenCV for .NET), or is it something related to my code?
Python Code
img = cv2.imread(filename)
K = np.array(...) # removed for brevity
D = np.array(...) # removed for brevity
DIM = (width, height) # image resolution
map1, map2 = cv2.fisheye.initUndistortRectifyMap(K, D, np.eye(3), K, DIM, cv2.CV_16SC2)
undistorted_img = cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
C# code
Matrix<double> cameraMatrix = new Matrix<double>(3, 3);
Matrix<double> distortionCoeffs = new Matrix<double>(4, 1);
Mat outputMap1 = new Mat();
Mat outputMap2 = new Mat();
Mat r = new Mat(); // Can be empty, according to the documentation
Mat p = new Mat(); // Can be empty, according to the documentation
Fisheye.InitUndistorRectifyMap(cameraMatrix, distortionCoeffs, r, p, image.Size, DepthType.Cv32F, outputMap1, outputMap2);
CvInvoke.Remap(image, undistorted, outputMap1, outputMap2, Inter.Linear, BorderType.Constant);
I'm fairly new to image processing and that being said, I was hoping someone could tell me if I am on the right track and if not, point me in the right direction and/or provide some code samples.
The requirements I am working on:
Detect the number of cookies on a baking sheet.
The cookies can be any color.
The cookies may be covered in chocolate (white or black), in which case they will have a mess of chocolate around each cookie, meaning a simple contrast check probably won't work.
The cookies will not overlap but they may touch one another.
I am trying to use the Emgu CV library with HoughCircles, but I am getting mixed results. Here is my code, using WinForms and C#, in which I load an image of cookies on a baking sheet and run the detection on it (I am not confident in my values).
Am I on the right track? Any ideas? Code samples?
Below are some test images:
http://imgur.com/a/dJmU6, followed by my code
private int GetHoughCircles(Image image)
{
Bitmap bitmap = new Bitmap(image);
Image<Bgr, Byte> img = new Image<Bgr, byte>(bitmap).Resize(466, 345, Emgu.CV.CvEnum.INTER.CV_INTER_CUBIC);
//Get and sharpen gray image (don't remember where I found this code; prob here on SO)
Image<Gray, Byte> graySoft = img.Convert<Gray, Byte>().PyrDown().PyrUp();
Image<Gray, Byte> gray = graySoft.SmoothGaussian(3);
gray = gray.AddWeighted(graySoft, 1.5, -0.5, 0);
Image<Gray, Byte> bin = gray.ThresholdBinary(new Gray(149), new Gray(255));
Gray cannyThreshold = new Gray(150);
Gray cannyThresholdLinking = new Gray(120);
Gray circleAccumulatorThreshold = new Gray(50);
Image<Gray, Byte> cannyEdges = bin.Canny(cannyThreshold.Intensity, cannyThresholdLinking.Intensity);
//Image<Gray, Byte> cannyEdges = bin.Canny(cannyThreshold, cannyThresholdLinking);
//Circles
CircleF[] circles = cannyEdges.HoughCircles(
cannyThreshold,
circleAccumulatorThreshold,
3.0, //Resolution of the accumulator used to detect centers of the circles
50.0, //min distance
20, //min radius
30 //max radius
)[0]; //Get the circles from the first channel
//draw circles (on original image)
foreach (CircleF circle in circles)
{
img.Draw(circle, new Bgr(Color.Brown), 2);
}
pictureBox1.Image = new Bitmap(img.ToBitmap());
return circles.Count();
}
I am trying to detect the bounding box of sentences in an image. I am using Emgu CV in C# with the HoughLinesP method to extract lines, but I am obviously doing something wrong. I have looked at many examples, and estimating the skew level with HoughLines is pretty much what I am trying to do.
Using that sample image I do some pre-processing (thresholding, Canny, etc.) and end up with http://snag.gy/sWCuO.jpg, but then when I run HoughLines and draw the lines on the original image, I get http://snag.gy/ESKmR.jpg .
Here is an extract of my code:
using (MemStorage stor = new MemStorage())
{
Image<Hsv, byte> imgHSV = new Image<Hsv, byte>(bitmap);
Image<Gray, Byte> gray = imgHSV.Convert<Gray, Byte>().PyrDown().PyrUp();
CvInvoke.cvCanny(gray, EdgeMap, 100, 400, 3);
IntPtr lines = CvInvoke.cvHoughLines2(EdgeMap, stor,
Emgu.CV.CvEnum.HOUGH_TYPE.CV_HOUGH_PROBABILISTIC, 1, Math.PI / 360, 10,
gray.Width / 4, 20);
Seq<LineSegment2D> segments = new Seq<LineSegment2D>(lines, stor);
ar = segments.ToArray();
}
Graphics g = Graphics.FromImage(OriginalImage);
foreach (LineSegment2D line in ar)
{
g.DrawLine(new Pen(Color.Blue),
new Point(line.P1.X, line.P1.Y),
new Point(line.P2.X, line.P2.Y));
}
g.Save();
Any help would be appreciated.
You can try two approaches:
1- Utilize the frequency domain. Example here
2- After pre-processing, extract the contours, collect all the points (or at least collect all points that are not black); find the minimum bounding rectangle with its angle. Example here
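The second approach can be sketched as follows. This is a minimal plain-Python sketch, assuming the foreground points have already been collected as (x, y) tuples. It finds the smallest enclosing rectangle by testing each convex-hull edge direction; `convex_hull` and `min_area_rect` are hypothetical helper names written for this illustration (OpenCV provides its own minAreaRect function for the same purpose).

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (area, width, height, angle_degrees) of the smallest enclosing rectangle.

    The angle is that of the hull edge the rectangle is aligned with,
    measured from the x-axis and folded into [0, 90).
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    n = len(hull)
    for i in range(n):
        x0, y0 = hull[i]
        x1, y1 = hull[(i + 1) % n]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Project the hull onto the edge direction and onto its normal.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            best = (w * h, w, h, math.degrees(theta) % 90.0)
    return best
```

The returned angle can serve as a skew estimate for deskewing the text block before further processing. It is reliable only when the collected points are dominated by the text itself, since stray noise points enlarge the hull.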