I want to programmatically place text on an image in an area where there is least "going on". It has been some time since I took Computer-Vision, could someone point me in the right direction. Either with respect to C# or Matlab?
I suggest dividing the image into distinct regions, each the size of the space you need for the text overlay. Calculate some measure of visual "energy", such as standard deviation, and choose the region with the lowest value. You could also slide a window around, looking for an arbitrary space of low energy, but this would be computationally much more expensive.
If you have the image processing toolbox for Matlab, you can run an entropy filter (ENTROPYFILT) on the image, matching the filter size to the size of your text. Then, all you need to do is find the filter-result with the smallest value, and you have the center of where you want to put the text.
Related
I am trying to implement a sliding window technique to identify my hypothesis. I am using a 64 x 128 window however I also want to find objects smaller than my window or objects larger than my window. I heard of an apporoach of resizing the image after a full traversal with my kernel window and placing my sliding window at the coordinates obtained from my current possition and the scalling factor, I have no idea how that technique was called...
Does anyone know how to solve this issue?
Thank you for your time !
I assume that you are referring to pyramidal approach which is frequently used in optical flow.
The patch is first detected at upper pyramid scale, than the obtained coordinates are rescaled and the patch is searched again in lower pyramid scale.
Haar object detection does not need image rescaling as the features can be easily rescaled using integral image.
I'm trying to develop object detection algorithm. I plan to compare 2 image with different focus length. One image that correct focus on the object and one image that correct focus on background.
By reading about autofocus algorithm. I think it can done with contrast detection passive autofocus algorithm. It work on light intensity on the sensor.
But I don't sure that light intensity value from the image file has the same value as from the sensor. (it not a RAW image file. a jpeg image.) Is the light intensity value in jpeg image were the same as on the sensor? Can I use it to detect focus correctness with contrast detection? Is there a better way to detect which area of image were correct focus on the image?
I have tried to process the images a bit and I saw some progress. THis is what I did using opencv:
converted images to gray using cvtColor(I, Mgrey, CV_RGB2GRAY);
downsampled/decimated them a bit since they are huge (several Mb)
Took the sum of absolute horizontal and vertical gradients using http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#cv.Sobel.
The result is below. The foreground when in focus does look brighter than background and vice versa.
You can probably try to match and subtract these images using translation from matchTemplate() on the original gray images; and then assemble pieces using the convex hull of the results as initialization mask for grab cut and plugging in color images. In case you aren’t familiar with the grab cut, chceck out my answer to this question.
But may be a simpler method will work here as well. You can try to apply a strong blur to your gradient images instead of precise matching and see what the difference give you in this case. The images below demonstrate the idea when I turned the difference in the binary masks.
It will be helpful to see your images. It I understood you correctly you try to separate background from foreground using focus (or blur) cue. Contrast in the image depends on focus but it also depend on the contrast of the target. So if the target is clouds you will never get sharp edges or high contrast. Finally jpeg image that use little compression should not affect the critical properties of your algorithm.
I would try to get a number of images at all possible focus lengths in a row and then build a graph of the contrast as a function of focal length (or even better focusing distance). The peak in this graph will give you the distance to the object regardless of object's own contrast. Note, however, that the accuracy of such visual cues goes down sharply with viewing distance.
This is what I expect you to obtain when measuring the sum of absolute gradient in a small window:
The next step for you will be to combine areas that are in focus with the areas that are solid color that is has no particular peak in the graph but none the less belong to the same object. Sometimes getting a convex hull of the focused areas can help to pinpoint the raw boundary of the object.
I'm working on an experimental project in which the challenge is to identify and extract an image of the icon or control that the user is has clicked on/touched. The method I'm trying is as follows (I need some help with step 3):
1) Take a screen shot when the user clicks/touches the screen:
2) Apply edge detection:
3) Extract the possible icon images around the Point associated with the user's cursor (Don't know how to do this)
There are easier cases in which the mouse-over event will highlight the icon/control, which allows me to identify the control with a simple screen shot comparison (before and after mouse-over). The above method is specifically for cases in which the icon is not highlighted. I'm new to emgu, so if anyone has any pointers on how to better achieve this, I'm all ears.
Cheers! Matt
Instead of doing edge detection. Consider taking the following steps:
Only grab pixels which are within a certain radius of the point of the user's cursor. Create a new image with just these pixels.
Use thresholding to classify into foreground and background.
Calculate the centroid, (use mean x coordinate and mean y coordinate). Calculate deviation from the mean. Discard foreground pixels which are beyond a certain deviation from the mean. Eg: discard pixels that are more than 1.6 deviations from the mean.
(You may need to experiment with this step ).
Use a convex hull to find the area of the image with the icon in it.
I have an image like below.
What I want is a monochrome image such that white parts are kept white, the rest is black. However, the tricky part is that I also want to reduce the white parts to be one pixel in thickness.
It's the second part that I'm stuck with.
My first thought was to do a simple threshold, then use a sort of "Game of Life" type iterative process where a white pixel was removed if it had neighbours on one side but not the other (i.e. it's an edge) however I have a feeling this would reduce ends of lines to nothing over time so I'd end up with a blank image.
What algorithm can I use to get the image I want, given the original image?
(My language of choice is C#, but anything is fine)
Original Image:
After detecting the morphological extended maxima of a given height:
and then thinning gives:
You can also manipulate the height parameter, or prune the thinned image.
Code in Mathematica:
img = ColorConvert[Import["http://i.stack.imgur.com/zPtl6.png"], "Grayscale"];
max = MaxDetect[img, .55]
Thinning[max]
EDIT I followed my own advice and a height of .4 gives segments which are more precisely localized:
I suggest that you look into Binary Morphological transformations such as Erosion and Dilation. Graphics libraries such as OpenCV() http://opencv.willowgarage.com/wiki/ and that statistical/matrix tool Gnu Octave http://octave.sourceforge.net/image/function/bwmorph.html support these operations.
I'm diving into something without sufficient background, but I feel like there may be simple solutions that don't require me to have in depth knowledge of the topic.
What I am trying to do is have an image co-ordinate system. Basically the user will supply an image, like a house plan. They can then click on points in the image and create markers (like google maps). The next time they retrieve the map, all the markers they added before are there and they can add new ones.
I need to identify the points these markers are located on so I can store that information. I also need to be able to create a layer on the image that contains the markers and renders them in the exact locations they were placed.
I imagine the easiest way to do this is to use pixel co-ordinates...the rub here is that the image won't be a fixed size since there is a web application and an IPad application, so the co-ordinate system needs to work as long as the image is in the same size ratio.
The server size is .NET and as mentioned there is an IPad app, so the solution needs to be viable given that tech stack.
Any ideas?
Instead of using pixel coordinates in absolute terms, you can use the 0 to 1 range. The top left corner is (0,0), bottom right is (1,1) and the center of the image is (0.5,0.5). This way not matter what image size (or zoom level) you have, the markers will always be in the same place.
My suggestion is don't try to figure out the correlation between the actual image and the coordinates. The only thing I would do is use the resolution of the image, aka 800x600 and use that for your grid. Then overlay your markers using that grid on the image. The points you'd remember would just be X and Y values and maybe a tag name/id.