I'm trying to scan several pictures together (personal 3x4 cm photos) and then split them into separate images. The first step (scanning) is done, but I have some problems with the second step (edge detection and splitting):
1. When the pictures are scanned, some of them come out rotated by a few degrees, which prevents me from getting straight edges.
2. How do I remove large noise artifacts? (When the pictures are scanned, a sheet of paper is placed behind them; sometimes that paper produces extra edges in the scanned image. How can I tell that such an edge is not one of the edges I'm looking for?)
Here is a sample image:
The sample images within the scan are all rectangular, and they are all roughly the same size. There are a variety of techniques for finding rectangles in an image (even at completely arbitrary rotation), but I'll start with the more fundamental techniques.
1. A Hough line fit can be used to find lines in an image, even when the background is noisy. From the Hough line fits you can find intersection points and perhaps compare those intersection points to points found with corner detection (see 3 below).
2. Edge points on lines have gradients perpendicular to those lines. When searching for edge points, you can favor edge points that are roughly a distance L or a distance W from other edge points with gradients in the parallel direction, where L and W are the known length and width of your images.
3. Corner detectors can help identify the corners of your small rectangular images. You know the length and width of the pictures, which should help you accept/reject corners.
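As a rough illustration of point 1, here is a minimal sketch using OpenCV; the filename and all thresholds are placeholders you would tune, and the Canny/Hough parameters in particular depend heavily on your scans.

```python
# Hedged sketch: find candidate photo edges with Canny + a probabilistic Hough
# line fit. "scan.png" and every threshold below are placeholder values.
import cv2
import numpy as np

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)                      # binary edge map

# Each detected segment is returned as (x1, y1, x2, y2).
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                        threshold=80, minLineLength=100, maxLineGap=10)

if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
        length = np.hypot(x2 - x1, y2 - y1)
        # Keep segments whose length is close to the known photo width or
        # height (in pixels); their intersections are candidate photo corners.
        print(f"segment: angle={angle:.1f} deg, length={length:.0f}px")
```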
If you want to get fancy (which I don't recommend), then a simple normalized cross-correlation technique could detect all instances of a "template" subimage within a larger image. The technique is a bit crude, but it works okay if there isn't much rotation. Since the subimages have well-defined borders of known shape and (presumably) consistent size, it'd be easier just to find the edges rather than try to match the image content.
Once you've identified the location and orientation of each rectangular subimage, a simple rotational transform + interpolation can generate a "right side up" version of each image. With scanners you won't have problems with perspective distortion, but if at some point in the future you take pictures of pictures at an angle, a perspective (projective) transform can map the distorted, trapezoidal images to rectangular images.
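For the "right side up" step, something along these lines would work with OpenCV, assuming you have already estimated each photo's center, tilt angle, and size; the function and variable names are just illustrative.

```python
# Sketch: rotate the scan so one detected photo becomes axis-aligned, then crop it.
import cv2

def extract_upright(scan, cx, cy, angle_deg, w, h):
    # Rotate the whole scan about the photo's center by its measured tilt...
    M = cv2.getRotationMatrix2D((cx, cy), angle_deg, 1.0)
    rotated = cv2.warpAffine(scan, M, (scan.shape[1], scan.shape[0]),
                             flags=cv2.INTER_LINEAR)
    # ...then crop the now axis-aligned w x h rectangle around that center.
    x0, y0 = int(cx - w / 2), int(cy - h / 2)
    return rotated[y0:y0 + h, x0:x0 + w]
```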
Hough transform
http://en.wikipedia.org/wiki/Hough_transform
Corner detection
http://en.wikipedia.org/wiki/Corner_detection
For simple edge detection that should work sufficiently well for your application, see the section "Other first-order methods" in the Edge Detection article on Wikipedia. The technique is easy to understand and simple to implement.
http://en.wikipedia.org/wiki/Edge_detection
Good luck, and once again Happy New Year!
Related
I've got a problem. I am taking pictures of a common solar module with a camera flash. I need to detect the frame of the module so I can cut out the module and undistort it (I only need the cell area, i.e. the dark area inside the frame).
Sample image, direct flash: problems with a big reflection (I think I can reduce it with a good diffuser).
Sample image, flash from an angle.
Does anybody have a recommendation for a robust method to detect the frame? It needs to work with various image angles and lighting conditions.
processed sample image 2
The last picture shows my processing so far: I blurred the image, converted it to grayscale, and inverted it. After that I thresholded the image and tried to detect contours (I got some problems with the shadow at the bottom of the image).
Thanks for your time.
Chris
As mentioned in:
Rectangle recognition with perspective projection
The Hough transform should work well for rectangle detection if (and only if) you can assume that the sides of the rectangle are the most prominent lines in your image. Then you can simply detect the 4 biggest peaks in Hough space and you have your rectangle.
This works for example with a photo of a white sheet of paper in front of a dark background.
Ideally you would preprocess the image with blurring, thresholding, and morphological operators to remove any small-scale structures before the Hough transform.
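A rough sketch of that pipeline with OpenCV might look like the following; the filename, kernel sizes, and thresholds are placeholders, not values tuned for your images.

```python
# Hedged sketch: blur -> threshold -> morphology -> Canny -> Hough, then take
# the 4 strongest peaks as the rectangle sides.
import cv2
import numpy as np

img = cv2.imread("module.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.GaussianBlur(img, (9, 9), 0)
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = np.ones((15, 15), np.uint8)
bw = cv2.morphologyEx(bw, cv2.MORPH_OPEN, kernel)    # remove small-scale structures

edges = cv2.Canny(bw, 50, 150)
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=150)
if lines is not None:
    # OpenCV returns (rho, theta) pairs sorted by accumulator votes, so the
    # first four should be the frame sides; intersect them to get the corners.
    four_strongest = lines[:4, 0]
    print(four_strongest)
```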
If there are multiple smaller rectangles or other sorts of prominent lines in your images, contour detection might be the better choice.
Some general advantages of the Hough transform, off the top of my head:
The Hough transform can still work if part of the rectangle is occluded or out of the frame.
The Hough transform should be faster than contour detection (though I haven't measured it).
The Hough transform will ignore anything that is not a straight line, so you may have greater success with cluttered images (as long as the rectangle sides are the most prominent lines).
For a research project, I have to find the ellipses in a fossil image.
For each fossil image, I also have a CSV file containing the contour of the fossil in Cartesian coordinates.
I need help determining the starting and ending points of each ellipse present in the fossil, so that I can apply an ellipse-fitting algorithm to them.
I started by looking at the variations in the slope of the contour.
It somewhat worked, until I tried it on fossils with very low curvature variation. As you can see in the image below (click the link), the pink points are where the variation of the contour slope is highest. However, it doesn't work for the bottom ellipse.
So I need a new approach. Do you have any hints or ideas about where I should look?
I'm feeding in a Bitmap image to my C# program to be able to perform OCR to identify the characters in the image. I can do this fairly well if the image is not rotated. One of the program requirements, however, is that the program automatically determines if the image has been rotated, and that it automatically corrects these rotations.
I've tried implementing a simple method where lines are traced across the image and points which contact a character are recorded, and then performing a simple linear regression on the line points. This works to an extent, although it has not proven very accurate due to curvature of characters, etc.
I was wondering if there was a better method to solve this problem? Many thanks in advance! :)
I use the gmseDeskew algorithm to deskew images in my program. It works very well.
It's an interesting problem, to be sure. I'd look for certain letters whose rotation is easier to determine. For example, in a capital A, R, or K, both of the lower parts should sit on roughly the same horizontal plane. Another option is to take letters that cannot be identified, rotate them in various ways, and re-attempt to identify them. If a letter that could not be identified in the raw scan CAN be identified after you rotate it, that's a pretty big clue. Once you have identified the "correcting" rotation that turns a non-recognizable character into a recognizable one, apply the same rotation value to the others.
If it recognizes lines of text, then try blurring the image so that each line of text becomes a mostly solid blob, and find the direction of those lines (either by analyzing the Fourier transform or by ridge detection).
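To illustrate the Fourier variant, here is a minimal, hedged sketch with NumPy; it assumes dark text on a light background and will need tuning, and how robust it is depends a lot on the document.

```python
# Sketch: text lines are roughly periodic, so the FFT magnitude concentrates
# along the direction perpendicular to the lines; average that direction.
import numpy as np

def dominant_text_angle(gray):
    # gray: 2-D float array; remove the mean so the DC peak doesn't dominate.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray - gray.mean())))
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    ys, xs = np.nonzero(spectrum > 0.2 * spectrum.max())   # strong frequencies only
    weights = spectrum[ys, xs]
    angles = np.arctan2(ys - cy, xs - cx)
    # Average doubled angles so opposite spectral directions reinforce each other.
    mean_doubled = np.angle(np.sum(weights * np.exp(2j * angles)))
    # The text-line direction is perpendicular to the dominant spectral direction.
    return np.degrees(mean_doubled / 2) + 90.0
```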
If the text is formatted like a printed document (column(s) and lines of text) then you can take advantage of this.
An approach that I've often seen used for document text is to do projection profiles:
Scan a document at a specific orientation and sum up the number of "black" pixels on each scan line (creating a 1D array of counts, each index representing a Y coordinate, the profile).
Calculate the variance of the counts (profile).
Repeat for multiple angles (this can be done in a binary-search fashion to reduce processing).
The angle that results in the greatest variance is the correct angle (due to the text lines creating large peaks from the printed text, and low valleys due to the absence of text between the lines)
Then after finding this angle you can adjust your image accordingly and do your awesome OCR.
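A minimal sketch of that search, using NumPy and SciPy and a plain brute-force sweep instead of the binary-search refinement mentioned above; the angle range and step size are placeholders.

```python
# Sketch: rotate a binary text image (text pixels = 1), build the horizontal
# projection profile, and pick the angle that maximizes the profile variance.
import numpy as np
from scipy.ndimage import rotate

def profile_variance(binary, angle_deg):
    rotated = rotate(binary, angle_deg, reshape=False, order=0)
    profile = rotated.sum(axis=1)          # "black" pixel count per scan line
    return profile.var()

def find_skew(binary, lo=-10.0, hi=10.0, step=0.5):
    angles = np.arange(lo, hi + step, step)
    scores = [profile_variance(binary, a) for a in angles]
    return angles[int(np.argmax(scores))]  # angle with the greatest variance
```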
It might be easier to find the vertical-ish lines that are adjacent to the text (i.e., the left margin). For each scanline, record the first black pixel. Put all of those in a linear regression, and you should get a near vertical line. Measure its angle from true vertical and you should be able to unrotate the text. You could imagine doing the same thing for the top, bottom, and right sides, too, and taking an average.
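A small sketch of that left-margin idea with NumPy; it assumes a binary image with text pixels set to 1 and a reasonably clean left margin.

```python
# Sketch: regress the first black pixel of each scanline and convert the
# fitted slope into an angle from true vertical.
import numpy as np

def margin_angle(binary):
    xs, ys = [], []
    for y, row in enumerate(binary):
        hits = np.flatnonzero(row)
        if hits.size:                       # first black pixel on this scanline
            ys.append(y)
            xs.append(hits[0])
    # Fit x as a function of y; for a near-vertical margin this is well-behaved.
    slope, _ = np.polyfit(ys, xs, 1)
    return np.degrees(np.arctan(slope))     # deviation from true vertical
```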
We faced a similar problem before, and we searched for an easy and quick solution, and we ended up using a commercial toolkit (leadtools). You can use it to do auto processing to the image before OCR it. You can check this help topic to know how to use this toolkit to process and scan images.
I have a big polygon (Pa). Inside the polygon there are a lot of small "holes", as shown:
Here are a few condition for the holes:
The holes cannot overlap one another
The holes cannot go outside the outer polygon
However, the holes can touch the outer polygon edge
How do I obtain the remaining polygon (or list of polygons) in an efficient manner? The easiest (brute-force) way is to take Pa and gradually compute the remaining polygon by subtracting out the holes one at a time. Although this idea is feasible, I suspect that there is a more efficient algorithm.
Edit: I'm not asking how to perform a polygon clipping (or subtraction) algorithm! In fact, that's what I would do in the brute-force approach. I'm asking whether, besides the polygon clipping method (take the main polygon and then gradually clip the holes out), there is another, more efficient way.
This is very hard to do in a general manner. You can find source code for a solution here:
General Polygon Clipper (GPC)
Well, if you use the right representation for your polygon you would not need to do anything. Just append the list of edges of the holes to the list of edges of Pa.
The only consideration is that if a hole vertex or edge touches an edge of Pa, you will have to perform some simplification there.
A different problem is rendering that polygon into a bitmap!
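As an illustration of such a representation (not necessarily what this answer had in mind), libraries like Shapely store a polygon as an outer shell plus a list of interior rings, so the holes never have to be clipped out; the coordinates below are made up.

```python
# Sketch: a shell-plus-holes polygon in Shapely; no subtraction is performed.
from shapely.geometry import Polygon

outer = [(0, 0), (10, 0), (10, 10), (0, 10)]             # Pa
holes = [[(2, 2), (4, 2), (4, 4), (2, 4)],                # two example holes
         [(6, 6), (8, 6), (8, 8), (6, 8)]]

remaining = Polygon(outer, holes)     # holes are kept as interior rings
print(remaining.area)                 # 100 - 4 - 4 = 92.0
```

Note that a hole touching the outer boundary makes such a polygon invalid in most libraries, which is exactly the simplification case mentioned above.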
You can do it like this:
Draw the main polygon with one color in a bitmap.
Draw the holes with another color in the same bitmap.
Then extract the polygon by running the marching squares algorithm with the main polygon's color as the threshold.
The output will contain all the points that belong to that polygon.
You can sort the points if you want a continuous closed polygon.
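A rough sketch of that raster approach with NumPy and scikit-image; the canvas size and coordinates are arbitrary.

```python
# Sketch: rasterize Pa and a hole into a label image, then trace the boundaries
# with marching squares (skimage.measure.find_contours).
import numpy as np
from skimage.draw import polygon as fill_polygon
from skimage.measure import find_contours

canvas = np.zeros((200, 200), dtype=np.uint8)

# Draw the main polygon with one "color"...
rr, cc = fill_polygon([10, 10, 190, 190], [10, 190, 190, 10], canvas.shape)
canvas[rr, cc] = 1

# ...and a hole with another (here simply back to 0).
rr, cc = fill_polygon([50, 50, 100, 100], [50, 100, 100, 50], canvas.shape)
canvas[rr, cc] = 0

# Marching squares at a level between the two values returns ordered, closed
# point lists for the outer boundary and for each hole.
contours = find_contours(canvas, level=0.5)
print(len(contours), "closed contours found")
```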
I agree with salva, but my post is going to address the drawing part. Basically, you can combine all the edges of the main polygon and the hole polygons and thereby get a single complex polygon.
The algorithm itself is not very complicated, and it is nicely explained in the Polygon Fill Teaching Tool.
I am hoping to obtain some help with 2D object detection. I'll give a brief overview of the context in which this will be implemented.
There will be an image taken of the ceiling. The ceiling will have markers placed on it so the orientation of the camera can be determined. The pictures will always be taken facing straight up. My goal is to detect one of these markers in the image and determine its rotation. So rotation and, to a lesser extent, scaling will be the two primary factors used in the image detection. I will be writing the software in either C# or MATLAB (not quite sure yet).
For example, the marker might be an arrow like this:
An image taken of the ceiling would contain markers. The software needs to detect a single marker and determine that it has been rotated by 170 degrees.
I have no prior experience with image analysis. I know image processing is a fairly broad topic and was hoping to get some advice on which direction I should take and which techniques would be best for my application. Thanks!
I'm not directly in this field but I would tell you to start by looking into edge detection specifically. If you have a background in math/engineering the materials are pretty easy to understand:
This seemed to spark some ideas:
http://www.cfar.umd.edu/~fer/cmsc426/lectures/edge1.ppt
I'd recommend MATLAB or if you're intent on using C#, Emgu CV is pretty good.
Hough transforms are a great idea. Once you detect the edges in your image using, say, a Canny edge detector, you get an edge image (a binary image with only 1 or 0 as values).
Then the Hough straight-line transform (essentially) spins a line about each white pixel in the edge image (the angular resolution is up to you), using a parametrized form for the line. It counts the number of white (valued 1) pixels lying along each spun line and stores these counts in a large accumulator indexed by the line's parameters.
Hough space plot example: http://upload.wikimedia.org/wikipedia/en/a/af/Hough_space_plot_example.png
In the example above, the parametric form for a line is:
rho = x*cos(theta) + y*sin(theta)
where rho is the distance from the origin to the line and theta is the angle of the line's normal.
So, as you can see, if you look at the bins at a particular orientation you can find out how many lines are oriented at that angle. Of course, you'll have to do some extra work to figure out which lines are oriented at that angle, since you have 5 other lines per arrow, but that shouldn't be too hard.
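For instance, with OpenCV you could read the theta of each Hough peak directly and histogram the orientations; the file name and thresholds below are placeholders.

```python
# Sketch: detect lines with the standard Hough transform and look at the
# distribution of their angles.
import cv2
import numpy as np

edges = cv2.Canny(cv2.imread("ceiling.png", cv2.IMREAD_GRAYSCALE), 50, 150)
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=60)

if lines is not None:
    thetas = np.degrees(lines[:, 0, 1])           # one angle per detected line
    hist, bin_edges = np.histogram(thetas, bins=180, range=(0, 180))
    print("most common line orientation:", bin_edges[np.argmax(hist)], "degrees")
```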
As always in computer vision, your first problem is image illumination and acquisition. Before going further, establish how your markers will be printed on the ceiling, what their shape will be, what light you will be using to see them, and what camera setup you will choose to look at the markers.
Given a good material, a good light, and a good camera, you may have no problem at all processing the image. For example, you can print a full arrow in a retro-reflective material, with a longer tail than in your example, and use a colored light with a corresponding filter on the camera. Now all you have in your image is arrows. There are plenty of other ways of acquiring the image that will help you there.
Once you have plain arrows, a simple blob analysis (which consists of computing statistical moments of the objects in the image) will give you a lot of information: each arrow should have nearly equal values for the 7 Hu moments, which lets you filter objects efficiently, and the orientation computed from the central moments will give you the angle of the arrow. Since blob analysis is purely statistical, it is extremely fast.
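A hedged sketch of that blob analysis with OpenCV (4.x call signatures assumed; the file name and threshold are placeholders, and the 180-degree ambiguity of the principal axis would have to be resolved using the arrow head):

```python
# Sketch: per-blob Hu moments for filtering, orientation from the second-order
# central moments.
import cv2
import numpy as np

bw = cv2.imread("arrows.png", cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(bw, 128, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for c in contours:
    m = cv2.moments(c)
    if m["m00"] == 0:
        continue
    hu = cv2.HuMoments(m).flatten()   # 7 values; compare against a reference arrow
    # Principal-axis orientation from central moments (ambiguous by 180 degrees).
    theta = 0.5 * np.arctan2(2 * m["mu11"], m["mu20"] - m["mu02"])
    print(hu, np.degrees(theta))
```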
Several systems have been developed to detect markers and their orientation robustly:
reacTIVision (open source) uses these types of tags to find position and orientation:
ARToolKit (open source) uses a different type of tags to extract all 6 degrees of freedom:
Marker example: http://www.schanes.net/docs/robot/marker.png
If your primary goal is not to learn, but to make the application work, I would suggest you use one of these. It is not a trivial task for a beginner to robustly detect the position and orientation of a random marker in an image.
On the other hand, if you are mainly interested in learning, I would also direct you to ARToolKit and its publications (and their references), which explain how to robustly implement marker detection.
You will need to explore edge detection, so look into Hough filters. After that you will need to look into pattern classifiers and feature extraction.
This paper has an algorithm that appears to work without edge detection.
This book excerpt is more oriented toward the kind of symbol detection you intend, once you have done the edge detection.
A rigorous way to determine the orientation of an image acquired under projective geometry (most cameras) is to use vanishing points and vanishing lines. Good news for you: your marker can be used to find this information! More good news: your image can be rectified, so the image columns (the y-axis) will correspond to the up-down direction. You will find more about this in chapter 8 of Hartley and Zisserman's book, Multiple View Geometry in Computer Vision.
Also remember that you will probably need to deal with radial distortion, the distortion caused by the camera lens. The other guys are right about the arrow detection problem: you have to use edge detection and, after that, a Hough transform or template matching. Refer to Gonzalez and Woods' book Digital Image Processing for details.