I was wondering if anyone would give me pointers to image rec packages that would help me recognize "text" (not OCR, just something that looks like text) and a black box frame. So, suppose:
text
+----------+
|          |
|  text1   |
|          |
|          |
+----------+
text
How do I recognize that "text" boxes are text, and that, say, text1 is inside the box?
Apologies for the vague question... I wouldn't know where to start. This is not homework, btw.
[This is of interest to us.] I am assuming your input is effectively a bitmap - a rectangular matrix of pixels. The first question is whether it is aligned with the axes - if it's been scanned, it probably isn't. You may need deskewing algorithms (rather dated, but a useful start: http://www.eecs.berkeley.edu/~fateman/kathey/node11.html).
The classic line-detection method is the Hough transform (http://en.wikipedia.org/wiki/Hough_transform), though our current collaborators do better than this for simple boxes by projecting pixels onto different viewpoints - similar to tomography. Rotate the image and count the density/histogram of points on the projection lines. For simple boxes that gives a clear signal.
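The projection idea is easy to prototype: for an axis-aligned box you only need two projections, onto the rows and onto the columns, and the box edges show up as peaks in the counts of dark pixels. A minimal Python sketch, assuming the bitmap is already binarized as a list of lists (1 = dark); the thresholds are illustrative:

```python
def projection_profiles(bitmap):
    """Count dark pixels per row and per column of a binary bitmap."""
    rows = [sum(r) for r in bitmap]
    cols = [sum(c) for c in zip(*bitmap)]
    return rows, cols

def find_box_lines(profile, threshold):
    """Indices whose dark-pixel count meets the threshold: candidate box edges."""
    return [i for i, v in enumerate(profile) if v >= threshold]

# A 5x7 bitmap containing a hollow box spanning rows 0-3 and columns 0-5.
bmp = [
    [1, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
rows, cols = projection_profiles(bmp)
print(find_box_lines(rows, 4))  # horizontal edges: [0, 3]
print(find_box_lines(cols, 3))  # vertical edges: [0, 5]
```

For a rotated image you would repeat this for several rotation angles and keep the angle whose profiles are the most sharply peaked, which is the tomography-like idea described above.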
For the text I suspect you either have to have a set of likely fonts or to use machine learning. In the latter case you have to devise features and then select a series of images that are classified by humans as text and not-text. Your algorithm (and there are many: neural nets, maximum entropy, etc.) is then trained against these.
The quality of the pixel map makes a great deal of difference. Documents from 20 years ago are much harder than bitmaps of documents created through drawing programs and dumped as PDF (of course, if you can interpret the text in the PDF, that helps a good deal).
You can apply any border-detection algorithm to detect the box, and since the colour of the text is different from the colour of the background, you can even use a linear search to find the black pixels of the 'text'. I may be wrong, sorry about that.
A very simple algorithm would be to scan left-to-right and top-to-bottom, looking for the three black pixels that make up the upper-left corner of a box (and then continuing to scan for the three pixels that would make up the matching lower-right corner). Once you've identified each box in the image this way, you could scan the inner portion and assume that any non-white pixels mean there is some text in the box. Of course, this would not differentiate between text and images inside the box, but that would be a much more difficult problem anyway.
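A sketch of that scan in Python, assuming a binarized bitmap (1 = black) with single-pixel box borders; the corner test below (a black pixel with black right and lower neighbours, and nothing black above or to its left) is one way to encode the three-pixel L-shape, and the matching lower-right test would be symmetric:

```python
def find_top_left_corners(bitmap):
    """Scan top-to-bottom, left-to-right for the three-pixel L-shape of an
    upper-left box corner: a black pixel whose right and lower neighbours
    are black, with no black pixel directly above or to its left."""
    h, w = len(bitmap), len(bitmap[0])
    corners = []
    for y in range(h - 1):
        for x in range(w - 1):
            if (bitmap[y][x] and bitmap[y][x + 1] and bitmap[y + 1][x]
                    and (y == 0 or not bitmap[y - 1][x])
                    and (x == 0 or not bitmap[y][x - 1])):
                corners.append((x, y))
    return corners

# A small hollow box with its upper-left corner at (0, 0).
box = [
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(find_top_left_corners(box))  # [(0, 0)]
```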
I need to add text to an existing PDF (at the top or bottom of the page) in C#.
I need to make sure that I don't overwrite any visible text or image.
Is there any way I could check whether an area of the PDF contains text, an image, a control, etc.? I understand it will not be 100% accurate.
You're going to need a full PDF consumer at the very least, because the only way to find out where the marks are on the page is to parse (and possibly render) the PDF.
There are complications which you haven't covered (possibly they have not occurred to you). What do you consider to be the area of the PDF file? The MediaBox? The CropBox, TrimBox, ArtBox, BleedBox? What if the PDF file contains, for example, a rectangular fill with white which covers the page? What about a /Separation space called /White? Is that white (it generally gets rendered that way on the output) or not? And yes, this is a widely used ink in the T-shirt printing industry, amongst others.
To me the simplest solution would seem to be to use a tool which will give you the BoundingBox of the marks on the page. I know the Ghostscript bbox device can do this; I imagine there are other tools which can do so too. But note (for Ghostscript at least) that if there are any marks in white (whatever the colour space), these are considered as marking the page and will be counted into the bbox.
The same tool ought to be able to give the size of the various Boxes in the PDF file (you'd need the pdf_info.ps program for Ghostscript to get this, currently). You can then quickly calculate which areas are unmarked.
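To illustrate the parsing side (my own sketch, not part of the answer above): the bbox device writes DSC comments to standard error, e.g. from a run like `gs -dBATCH -dNOPAUSE -sDEVICE=bbox input.pdf`, and a small Python helper can pull the page boxes out. The sample output below uses made-up numbers:

```python
import re

def parse_bounding_boxes(gs_output):
    """Extract one (llx, lly, urx, ury) tuple per page from the
    %%BoundingBox lines emitted by Ghostscript's bbox device."""
    pattern = re.compile(
        r"^%%BoundingBox:\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)",
        re.MULTILINE)
    return [tuple(map(int, m.groups())) for m in pattern.finditer(gs_output)]

# Example bbox-device output for a two-page file (hypothetical numbers).
sample = """%%BoundingBox: 36 42 576 750
%%HiResBoundingBox: 36.179999 42.000000 575.819997 749.999977
%%BoundingBox: 0 0 612 792
%%HiResBoundingBox: 0.000000 0.000000 611.999981 791.999976
"""
print(parse_bounding_boxes(sample))  # [(36, 42, 576, 750), (0, 0, 612, 792)]
```

Comparing each tuple against the page's MediaBox (or whichever Box you decide matters) then gives you the unmarked margins.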
But 'unmarked' isn't the same thing as 'white'. If you don't want to count areas which are painted in 'white', then the problem becomes greater. You really need to render the content and then look at each image sample in the output to see if it's white or not, recording the maxima and minima of the x and y co-ordinates to determine the 'non-white' area of the page.
This is because there are complications like transfer functions, transparency blending, colour management, and image masking, any or all of which might cause an area which is marked with a non-white colour to be rendered white (a transparency SMask, for example) or an area marked with white to be rendered non-white (e.g. a transfer function).
Your question is unclear because you haven't defined whether any of these issues are important to you, and how you want to treat them.
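If you do go the render-and-scan route described above, the final scan itself is straightforward; here is a minimal Python sketch over an already-rendered grid of RGB samples (the rendering, with all the caveats above, is the hard part):

```python
def nonwhite_bbox(pixels, white=(255, 255, 255)):
    """Scan rendered RGB samples, recording the min/max x and y of every
    non-white sample: the 'non-white' area of the page."""
    xs, ys = [], []
    for y, row in enumerate(pixels):
        for x, px in enumerate(row):
            if px != white:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # entirely white page
    return (min(xs), min(ys), max(xs), max(ys))

W, K = (255, 255, 255), (0, 0, 0)  # white and black samples
page = [
    [W, W, W, W],
    [W, K, K, W],
    [W, W, K, W],
    [W, W, W, W],
]
print(nonwhite_bbox(page))  # (1, 1, 2, 2)
```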
I'm trying to develop an object-detection algorithm. I plan to compare two images taken with different focus distances: one correctly focused on the object and one correctly focused on the background.
From reading about autofocus algorithms, I think it can be done with a contrast-detection passive autofocus algorithm, which works on the light intensity at the sensor.
But I'm not sure that the light intensity value in the image file is the same as at the sensor (it's not a RAW image file; it's a JPEG). Is the light intensity value in a JPEG image the same as at the sensor? Can I use it to detect focus correctness with contrast detection? Is there a better way to detect which areas of the image are in focus?
I have tried to process the images a bit and I saw some progress. This is what I did using OpenCV:
converted the images to grey using cvtColor(I, Mgrey, CV_RGB2GRAY);
downsampled/decimated them a bit, since they are huge (several MB);
took the sum of absolute horizontal and vertical gradients using Sobel (http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#cv.Sobel).
The result is below. The foreground when in focus does look brighter than background and vice versa.
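The three steps can be sketched without OpenCV as well; here is a plain-Python stand-in for the Sobel-based measure on a small grey-level grid (the two sample patches are made up; the point is only that an in-focus, high-contrast region yields a larger sum):

```python
def focus_measure(grey):
    """Sum of absolute horizontal and vertical first differences — a simple
    stand-in for the Sobel-based measure (sharper patch => larger value)."""
    h, w = len(grey), len(grey[0])
    total = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(grey[y][x + 1] - grey[y][x])
            if y + 1 < h:
                total += abs(grey[y + 1][x] - grey[y][x])
    return total

sharp = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]          # high-contrast patch
blurred = [[96, 128, 96], [128, 128, 128], [96, 128, 96]]  # low-contrast patch
print(focus_measure(sharp) > focus_measure(blurred))  # True
```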
You can probably try to match and subtract these images using the translation from matchTemplate() on the original grey images, and then assemble the pieces, using the convex hull of the results as an initialization mask for grab cut and plugging in the colour images. In case you aren't familiar with grab cut, check out my answer to this question.
But maybe a simpler method will work here as well. You can try applying a strong blur to your gradient images instead of precise matching and see what the difference gives you in that case. The images below demonstrate the idea when I turned the difference into binary masks.
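A rough sketch of that simpler route in plain Python (the blur radius and threshold are arbitrary illustrative values, and the two tiny gradient grids are made up):

```python
def box_blur(img, radius):
    """Crude box blur: average over a (2*radius+1)^2 window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def focus_mask(grad_a, grad_b, radius=1, thresh=0):
    """Binary mask: 1 where the blurred gradient of image A exceeds image B's."""
    a, b = box_blur(grad_a, radius), box_blur(grad_b, radius)
    return [[int(av - bv > thresh) for av, bv in zip(ra, rb)]
            for ra, rb in zip(a, b)]

grad_a = [[9, 9, 0, 0], [9, 9, 0, 0]]  # foreground sharp on the left
grad_b = [[0, 0, 9, 9], [0, 0, 9, 9]]  # background sharp on the right
print(focus_mask(grad_a, grad_b))  # [[1, 1, 0, 0], [1, 1, 0, 0]]
```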
It would be helpful to see your images. If I understood you correctly, you are trying to separate background from foreground using a focus (or blur) cue. Contrast in the image depends on focus, but it also depends on the contrast of the target, so if the target is clouds you will never get sharp edges or high contrast. Finally, a JPEG that uses little compression should not affect the critical properties of your algorithm.
I would try to get a number of images at all possible focus lengths in a row and then build a graph of the contrast as a function of focal length (or even better focusing distance). The peak in this graph will give you the distance to the object regardless of object's own contrast. Note, however, that the accuracy of such visual cues goes down sharply with viewing distance.
This is what I expect you to obtain when measuring the sum of absolute gradient in a small window:
The next step for you will be to combine the areas that are in focus with the areas of solid colour - that is, areas with no particular peak in the graph but which nonetheless belong to the same object. Sometimes getting the convex hull of the focused areas can help to pinpoint the rough boundary of the object.
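A sketch of turning the focus sweep into per-window decisions, assuming the windowed contrast values have already been measured for each image in the stack; the `min_peak_ratio` cutoff is a made-up heuristic for flagging the flat profiles that solid-colour regions produce:

```python
def best_focus(distances, contrasts, min_peak_ratio=1.5):
    """Return the focusing distance with the highest contrast, or None when
    the profile is too flat (peak not clearly above the mean) — which is
    what happens over textureless, solid-colour regions."""
    peak = max(contrasts)
    mean = sum(contrasts) / len(contrasts)
    if mean == 0 or peak / mean < min_peak_ratio:
        return None  # no reliable focus cue in this window
    return distances[contrasts.index(peak)]

dists = [0.5, 1.0, 2.0, 4.0, 8.0]  # focusing distances, e.g. in metres
print(best_focus(dists, [120, 310, 980, 260, 90]))   # 2.0 (clear peak)
print(best_focus(dists, [100, 104, 99, 101, 103]))   # None (flat profile)
```

Windows that return None would then be merged with their in-focus neighbours, e.g. via the convex hull mentioned above.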
I have a TreeView that contains icons. The icons are ugly. And by ugly I mean low quality. And by low quality I mean something I would expect to see from a DOS program.
I would hope there would be a way to improve the image quality of the icons, but after looking around on Microsoft's development site, I have yet to find a solution.
To be honest, at this point I don't even know what to look for. 'Image quality' is too broad a phrase to search for (I've gotten very random results from Google searches).
I am using an ImageList to store these icons in the TreeView. There really isn't much code to show that would be of use here - at least I don't think so.
Sorry for the boring question.
Is this what you're looking for?
http://msdn.microsoft.com/en-us/library/system.windows.forms.imagelist.colordepth.aspx
The default value is set to 8-bit, so changing this property of your list will probably help. Essentially, just add the following line to your code:
imagelist.ColorDepth = ColorDepth.Depth32Bit;
It can be improved. What you need to do is open it in Photoshop or GIMP and resample it at three times its size. Then resample it at 600 dpi and resize it down to 8 x 10.
Notice that I said to resize, not downsize. You click on the picture, and the corners have handles for you to click on and size it down to the appropriate size. This way, the pixels are closer together.
Then you apply a very soft blur, and I mean really soft. The secret is when it is printed. We print at 600 to 1200 dpi, which is what enhances the picture, and that is what a printer has to keep in mind. When you print on glossy paper at 600 dpi, the picture comes out really nice and sharp.
I'm writing a CSS sprite sheet generator. I know there are several decent ones out there, but this is more a personal interest project than anything. I've got a decent algorithm now, but I'm ending up with a lot of leftover whitespace due to the way I'm calculating the position for the next image to pack. So my thought was this:
Given images A and B in C#, how could I find the top-left-most transparent area in image A that would accommodate the area of image B? In other words, assuming image B is a 10x10 image, how could I find the first 10x10 transparent area in image A (assuming there is one)?
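For illustration, here is a brute-force version of that search as a Python sketch over a 2-D alpha grid (0 = fully transparent); a real C# implementation over big sprite sheets would want a summed-area table to avoid rescanning, but the scan order below already guarantees the topmost, then leftmost, fit:

```python
def find_transparent_area(alpha, bw, bh):
    """Return (x, y) of the first position whose bw x bh block is fully
    transparent, scanning rows top-to-bottom then left-to-right;
    None if no such position exists."""
    h, w = len(alpha), len(alpha[0])
    for y in range(h - bh + 1):
        for x in range(w - bw + 1):
            if all(alpha[y + dy][x + dx] == 0
                   for dy in range(bh) for dx in range(bw)):
                return (x, y)
    return None

# 4x6 alpha grid: an opaque (255) block occupies the top-left corner.
alpha = [
    [255, 255, 0, 0, 0, 0],
    [255, 255, 0, 0, 0, 0],
    [0,   0,   0, 0, 0, 0],
    [0,   0,   0, 0, 0, 0],
]
print(find_transparent_area(alpha, 2, 2))  # (2, 0)
```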
I have an image like below.
What I want is a monochrome image such that white parts are kept white, the rest is black. However, the tricky part is that I also want to reduce the white parts to be one pixel in thickness.
It's the second part that I'm stuck with.
My first thought was to do a simple threshold, then use a sort of "Game of Life"-style iterative process in which a white pixel is removed if it has neighbours on one side but not the other (i.e. it's an edge); however, I have a feeling this would reduce the ends of lines to nothing over time, so I'd end up with a blank image.
What algorithm can I use to get the image I want, given the original image?
(My language of choice is C#, but anything is fine)
Original Image:
After detecting the morphological extended maxima of a given height:
and then thinning gives:
You can also manipulate the height parameter, or prune the thinned image.
Code in Mathematica:
img = ColorConvert[Import["http://i.stack.imgur.com/zPtl6.png"], "Grayscale"];
max = MaxDetect[img, .55]
Thinning[max]
EDIT: I followed my own advice, and a height of .4 gives segments which are more precisely localized:
I suggest that you look into binary morphological transformations such as erosion and dilation. Graphics libraries such as OpenCV (http://opencv.willowgarage.com/wiki/) and the statistical/matrix tool GNU Octave (http://octave.sourceforge.net/image/function/bwmorph.html) support these operations.
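To make that concrete, here is a minimal plain-Python sketch of 3x3 erosion and dilation on a binary image. Thinning (as in Octave's bwmorph) is built from repeated conditional erosions that stop before a stroke disappears entirely, which is what avoids the blank-image failure mode described in the question:

```python
def erode(img):
    """3x3 binary erosion: a pixel survives only if its whole 3x3
    neighbourhood (clipped at the image border) is set."""
    h, w = len(img), len(img[0])
    return [[int(all(img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]

def dilate(img):
    """3x3 binary dilation: a pixel is set if anything in its neighbourhood is."""
    h, w = len(img), len(img[0])
    return [[int(any(img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]

blob = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(erode(blob)[2])          # only the centre pixel survives: [0, 0, 1, 0, 0]
print(dilate(erode(blob))[2])  # opening grows it back: [0, 1, 1, 1, 0]
```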