I'm looking for the best library to find identical areas in two different images. All images are JPEG-compressed and contain a lot of noise, and I'm having a hard time finding one. The problem is that if you zoom into a JPEG, it looks like a Monet: the noise contains a color palette that has no direct link to the original image. So instead of searching for an identical array in the image, I need to find the 'most similar array'.
These images come from random screenshots of Google Maps-like websites, and they cannot be in any format other than JPEG.
I tried a lot of manual approaches.
One of my methods is:
Shrinking my two images to smaller images
Converting them to 4bpp images, or even fewer colors
Taking a small part of image 1
Searching image 2 for the byte[] array version of that cropped part of image 1
Not searching for an identical match, but for similar matches.
This algorithm works, but I'm doing everything in a one-dimensional array, and it is very slow.
Are there existing libraries that would do this algorithm directly?
My algorithm is:
// src is the bigger image in which I search
// offset is where in my small image I start to search
// len is the length of the searched array
// size is the size of the bigger image in which I'm searching
private Point simpleSearch(byte[] src, int offset, int len, byte[] search, Size size)
{
    byte[] ddd = new byte[len];
    Array.Copy(search, offset, ddd, 0, len);
    int lowest = 100000000;
    int locmatch = 0;
    for (int i = 0; i <= src.Length - len; i++)
    {
        // Sum of absolute differences between the window at i and the searched array
        int thed = 0;
        for (int a = 0; a < len; a++)
        {
            int diff = Math.Abs(src[i + a] - ddd[a]);
            thed += diff;
        }
        thed = thed / len;
        if (thed < lowest)
        {
            lowest = thed;
            locmatch = i; // the best window starts at i (i - len pointed before the match)
        }
    }
    // Convert the 1-D index back to 2-D coordinates (assumes one byte per pixel)
    int yy = (locmatch / size.Width);
    int xx = locmatch - (yy * size.Width);
    Point p = new Point(xx, yy);
    return p;
}
Yes, correlation or a spectrum signature are ways to tell how similar two image regions are. But I think what you really want here is an algorithm that efficiently searches for the overlapping region.
The correspondence problem is a well-defined problem in computer vision: figuring out which parts of one image correspond to which parts of another image. There are RANSAC-based algorithms for it.
There's also a quad-tree algorithm that brings the complexity down to logarithmic order.
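As a rough illustration, the question's "most similar array" search can be done in two dimensions with a sum of absolute differences (SAD) over a sliding window. This is only a brute-force sketch, not the RANSAC or quad-tree approach mentioned above; the names PatchSearch and FindBestMatch are hypothetical, and the images are assumed to be already decoded to 8-bit grayscale arrays indexed as [row, column].

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class PatchSearch
{
    // Returns the top-left corner (x, y) in image where patch has the lowest
    // sum of absolute differences, or (-1, -1) if patch does not fit.
    public static (int X, int Y) FindBestMatch(byte[,] image, byte[,] patch)
    {
        int imageHeight = image.GetLength(0), imageWidth = image.GetLength(1);
        int patchHeight = patch.GetLength(0), patchWidth = patch.GetLength(1);
        long best = long.MaxValue;
        int bestX = -1, bestY = -1;

        for (int y = 0; y <= imageHeight - patchHeight; y++)
        {
            for (int x = 0; x <= imageWidth - patchWidth; x++)
            {
                long sad = 0;
                // Stop scanning this window as soon as it is already worse than the best one.
                for (int py = 0; py < patchHeight && sad < best; py++)
                {
                    for (int px = 0; px < patchWidth; px++)
                    {
                        sad += Math.Abs(image[y + py, x + px] - patch[py, px]);
                    }
                }
                if (sad < best)
                {
                    best = sad;
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return (bestX, bestY);
    }
}
```

Working on a 2-D array avoids the row wrap-around that a flat byte[] window suffers from, and the early exit skips most of the work for poor candidates. Running it first on downscaled copies and refining around the best hit is one way to speed it up further.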
I am trying to take a grayscale bitmap, extract a single line from it, and then graph the gray values. I got something to work, but I'm not really happy with it. It just seems slow and tedious. I am sure someone has a better idea.
WriteableBitmap someImg; //camera image
int imgWidth = someImg.PixelWidth;
int imgHeight = someImg.PixelHeight;
Int32Rect rectLine = new Int32Rect(0, imgHeight / 2, imgWidth, 1); //horizontal line half way down the image as a rectangle with height 1
//calculate stride and buffer size
int imgStride = (imgWidth * someImg.Format.BitsPerPixel + 7) / 8; // not sure I understand this part
byte[] buffer = new byte[imgStride * rectLine.Height];
//copy pixels to buffer
someImg.CopyPixels(rectLine, buffer, imgStride, 0);
const int xGraphHeight = 256;
WriteableBitmap xgraph = new WriteableBitmap(imgWidth, xGraphHeight, someImg.DpiX, someImg.DpiY, PixelFormats.Gray8, null);
//loop through pixels
for (int i = 0; i < imgWidth; i++)
{
Int32Rect dot = new Int32Rect(i, buffer[i], 1, 1); //1x1 rectangle
byte[] WhiteDotByte = { 255 }; //white
xgraph.WritePixels(dot, WhiteDotByte, imgStride, 0);//write pixel
}
You can see the image and the plot below the green line. I guess I am having some WPF issues that make it look funny but that's a problem for another post.
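On the stride line marked "not sure I understand this part": the stride is the number of bytes one row of pixels occupies, and adding 7 before the integer division by 8 rounds a bit count up to whole bytes. A small sketch of just that arithmetic follows; StrideMath and BytesPerRow are hypothetical names, and it assumes rows are tightly packed with no extra alignment padding.

```csharp
// Hypothetical helper illustrating the rounding in the stride formula.
public static class StrideMath
{
    // Bytes needed for one row: bits per row, rounded up to a whole number of bytes.
    public static int BytesPerRow(int widthInPixels, int bitsPerPixel)
    {
        return (widthInPixels * bitsPerPixel + 7) / 8;
    }
}
```

For an 8-bit grayscale image the result is simply the width; the rounding only matters for formats with fewer than 8 bits per pixel, such as 1-bit black and white.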
I assume the goal is to create a plot of the pixel value intensities of the selected line.
The first approach to consider is to use an actual plotting library. I have used OxyPlot; it works fine but is lacking in some aspects. Unless you have specific performance requirements, this will likely be the most flexible approach to take.
If you actually want to render to an image, you might be better off using unsafe code to access the pixel values directly. For example:
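xgraph.Lock();
try {
    for (int y = 0; y < xGraphHeight; y++) { // xgraph is xGraphHeight rows tall
        var rowPtr = (byte*)(xgraph.BackBuffer + y * xgraph.BackBufferStride); // requires an unsafe context
        for (int x = 0; x < imgWidth; x++) {
            rowPtr[x] = (byte)(y < buffer[x] ? 0 : 255); // white from row buffer[x] downwards
        }
    }
    xgraph.AddDirtyRect(new Int32Rect(0, 0, imgWidth, xGraphHeight)); // tell WPF which area changed
}
finally {
    xgraph.Unlock();
}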
This should be faster than writing 1x1 rectangles. It also writes whole columns instead of single pixels, which should help make the graph more visible. You might also consider allowing an arbitrary image height and scaling the comparison value accordingly.
If you want to plot the pixel values along an arbitrary line, and not just a horizontal one, you can take equidistant samples along the line and use bilinear interpolation to sample the image.
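The arbitrary-line idea could look roughly like the sketch below. It assumes the pixels have already been copied into a row-major byte[] with one byte per pixel and no row padding; LineProfile, Sample and Profile are hypothetical names, not WPF APIs.

```csharp
using System;

// Hypothetical helper: samples an 8-bit grayscale buffer along a line.
public static class LineProfile
{
    // Bilinear interpolation at a fractional position (x, y), clamped to the image.
    public static double Sample(byte[] pixels, int width, int height, double x, double y)
    {
        int x0 = Math.Max(0, Math.Min((int)Math.Floor(x), width - 1));
        int y0 = Math.Max(0, Math.Min((int)Math.Floor(y), height - 1));
        int x1 = Math.Min(x0 + 1, width - 1);
        int y1 = Math.Min(y0 + 1, height - 1);
        double fx = Math.Max(0.0, Math.Min(x - x0, 1.0));
        double fy = Math.Max(0.0, Math.Min(y - y0, 1.0));

        double top = pixels[y0 * width + x0] * (1 - fx) + pixels[y0 * width + x1] * fx;
        double bottom = pixels[y1 * width + x0] * (1 - fx) + pixels[y1 * width + x1] * fx;
        return top * (1 - fy) + bottom * fy;
    }

    // Takes count equidistant samples from (xA, yA) to (xB, yB), both ends included.
    public static double[] Profile(byte[] pixels, int width, int height,
                                   double xA, double yA, double xB, double yB, int count)
    {
        var result = new double[count];
        for (int i = 0; i < count; i++)
        {
            double t = count == 1 ? 0.0 : (double)i / (count - 1);
            result[i] = Sample(pixels, width, height, xA + t * (xB - xA), yA + t * (yB - yA));
        }
        return result;
    }
}
```

The resulting double[] can be handed to a plotting library or drawn column by column as in the loop above.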
So I want to grab a partial image from a byte array of colors. The image is a unity logo that is 64x64 pixels. I want to grab a third of the image (Unity Logo). How would I traverse the byte array to get this image?
Unity Byte Array
assuming each byte is a single pixel (which is only true for 8-bit depth images), the bytes 0-63 are the first row, 64-127 are the second row, etc etc.
meaning that to find out the position of a pixel in the one-dimensional array, based on its two-dimensional coordinates in the image itself, you do
int oneDimPos = (y*64) + x;
if each pixel were 3 bytes (24-bit color depth), the conversion from 2dimensional to 1dimensional coordinates would be:
int oneDimPos = (y * 64 * 3) + (x * 3);
(so the most generic equation is:
int oneDimPos = (y * imageWidth * colorDepth) + (x * colorDepth); // colorDepth in bytes per pixel
and you need to keep this in mind and adjust the code accordingly. or even better, use this most generic version, and actually read the image width and its color depth from the asset you're using as source.
BEWARE: if the image is anything other than 8 bits per pixel, this equation will, naturally, only give you the first, starting byte belonging to that pixel, and you still need to take care to actually also read the other ones that belong to that pixel
i'm gonna finish the answer assuming 8bit color depth, for simplicity, as well as so that you can't just copypaste the answer, but also have to understand it and re-shape it according to your specific needs ;)
)
meaning you can now do classic two nested loops for x and y:
List<byte> result = new List<byte>(); //i'm going to use list so i can just .Add each byte instead of having to calculate and allocate the final size in advance, and having to mess around with recalculating the index from the source array into the destination one, because i'm lazy
for(int y = 0; y < 64; y++){ //we go all the way to the bottom, row by row, so the result stays in row order
    for(int x = 0; x < 22; x++){ //no way for you to grab precise third since that boundary is in the middle of a pixel for an image 64pixels wide
        result.Add(sourceAsset.bytes[(y*64) + x]);
    }
}
//now just convert the list to actual byte array
byte[] resultBytes = result.ToArray();
The original issue that I was having was not exactly the same as the question. I wanted to simplify it by having a byte array that everyone could take a look at. The byte array from Unity's website wasn't exactly what I was getting.
So I have three 1080p portrait screens (1080 x 1920 pixels each) with RGBA channels. I grabbed a screenshot of them and got a byte array of size 24,883,200.
Note, 3 * width(1080) * height(1920) * channels(4) = 24,883,200.
byte[] colors = new byte[24883200]; // Screenshot of 3x1080p screen.
byte[] leftThird = new byte[colors.Length / 3];
Array.Copy(colors, 0, leftThird, 0, colors.Length / 3); // Grab the first third of array
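This is an issue because the colors array is stored row by row: each row runs left to right across all three screens, and the rows follow one another from top to bottom. The first third of the array is therefore the top third of the whole screenshot, not the left screen. Instead, you should copy the first 1080 x 4 bytes of every row.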
int width = 1080 * 4; // bytes per row of one screen: 1080 pixels x 4 channels (RGBA)
int fullWidth = width * 3; // bytes per row of the full screenshot (three screens)
int height = 1920;
int offset = 0; // horizontal start of the wanted screen in pixels (0 = left screen)
byte[] leftScreen = new byte[colors.Length / 3];
for(int i = 0; i < height; i++)
{
    Array.Copy(colors, (i * fullWidth) + (offset * 4), leftScreen, i * width, width);
}
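The same row-by-row copy can be generalized to any rectangle and any pixel size. The following is a sketch under the assumption that the buffer is row-major and tightly packed (no padding at the end of each row); PixelBuffer and Crop are hypothetical names.

```csharp
using System;

// Hypothetical helper: copies a rectangle out of a row-major pixel buffer.
public static class PixelBuffer
{
    public static byte[] Crop(byte[] source, int imageWidth, int bytesPerPixel,
                              int left, int top, int cropWidth, int cropHeight)
    {
        int sourceStride = imageWidth * bytesPerPixel; // bytes per row of the full image
        int cropStride = cropWidth * bytesPerPixel;    // bytes per row of the cropped image
        var result = new byte[cropStride * cropHeight];

        for (int row = 0; row < cropHeight; row++)
        {
            // Start of the wanted run of pixels inside the current source row.
            int sourceIndex = (top + row) * sourceStride + left * bytesPerPixel;
            Array.Copy(source, sourceIndex, result, row * cropStride, cropStride);
        }
        return result;
    }
}
```

With the screenshot above, the left screen would be Crop(colors, 3240, 4, 0, 0, 1080, 1920), and the middle screen the same call with left set to 1080.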
I have acquired Digital Elevation Maps (height maps of Earth) of some area. My aim is to create realistic terrains.
Terrain generation is no problem. I have practiced that using Visual C# and the XNA framework.
The problem is that those height map files are in GeoTIFF format, which I don't know how to read. Nor do I have any previous experience with reading image files, so I could not experiment using the few tips available on the internet about reading GeoTIFF files. So far I have been unsuccessful.
The GeoTIFF files I have are 3601 x 3601 files.
Each file has two versions: a decimal-valued file and a num-valued file.
Each file has data for every second of longitude and latitude of
geo-coordinates along with the height map, i.e. lon, lat, height from sea level
How do I read these files? :)
The files I have are from ASTER G-DEM Version 2 (LINK TO OFFICIAL DESCRIPTION). According to them, GeoTIFF is pretty standard, which seems to be true because some GeoTIFF visualizers I downloaded are showing me the correct data.
I am going to be using C#. I would appreciate it if we talk in relation to this language.
EDIT
Okay, I got libtiff, and this is what I have done:
// requires: using BitMiracle.LibTiff.Classic;
using (Tiff tiff = Tiff.Open(@"Test\N41E071_dem.tif", "r"))
{
    int width = tiff.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
    int height = tiff.GetField(TiffTag.IMAGELENGTH)[0].ToInt();
    double dpiX = tiff.GetField(TiffTag.XRESOLUTION)[0].ToDouble();
    double dpiY = tiff.GetField(TiffTag.YRESOLUTION)[0].ToDouble();
    byte[] scanline = new byte[tiff.ScanlineSize()];
    ushort[] scanline16Bit = new ushort[tiff.ScanlineSize() / 2];
    for (int i = 0; i < height; i++)
    {
        tiff.ReadScanline(scanline, i); //Loading ith Line
        MultiplyScanLineAs16BitSamples(scanline, scanline16Bit, 16,i);
    }
}
private static void MultiplyScanLineAs16BitSamples(byte[] scanline, ushort[] temp, ushort factor,int row)
{
    if (scanline.Length % 2 != 0)
    {
        // each two bytes define one sample so there should be even number of bytes
        throw new ArgumentException();
    }
    Buffer.BlockCopy(scanline, 0, temp, 0, scanline.Length);
    for (int i = 0; i < temp.Length; i++)
    {
        temp[i] *= factor;
        MessageBox.Show("Row:"+row.ToString()+"Column:"+(i/2).ToString()+"Value:"+temp[i].ToString());
    }
}
Where I am displaying the message box, I am displaying the corresponding values. Am I doing it right? I am asking because this is my first experience with images and the 8/16-bit problem. I think that, unlike the official LibTiff tutorials, I should be using short instead of ushort, because the images I am using are "GeoTIFF, signed 16 bits".
There are some SDKs out there usable from C# to read GeoTIFF files:
http://www.bluemarblegeo.com/global-mapper/developer/developer.php#details (commercial)
http://bitmiracle.com/libtiff/ (free)
http://trac.osgeo.org/gdal/wiki/GdalOgrInCsharp (free?)
UPDATE:
The spec for GeoTIFF can be found here - to me it seems that GeoTIFFs can contain different "subtypes" of information which in turn need to be interpreted appropriately...
Here's a guy that did it without GDAL: http://build-failed.blogspot.com.au/2014/12/processing-geotiff-files-in-net-without.html
GDAL is available in NuGet, though.
If the GeoTIFF contains tiles, you need a different approach. This is how to read a GeoTiff that contains 32bit floats with height data:
int buffersize = 1000000;
using (Tiff tiff = Tiff.Open(geotifffile, "r"))
{
int nooftiles = tiff.GetField(TiffTag.TILEBYTECOUNTS).Length;
int width = tiff.GetField(TiffTag.TILEWIDTH)[0].ToInt();
int height = tiff.GetField(TiffTag.TILELENGTH)[0].ToInt();
byte[] buffer = new byte[buffersize];
for (int i = 0; i < nooftiles; i++)
{
int size = tiff.ReadEncodedTile(i, buffer, 0, buffersize);
float[,] data = new float[width, height];
Buffer.BlockCopy(buffer, 0, data, 0, size); // Convert byte array to x,y array of floats (height data)
// Do whatever you want with the height data (calculate hillshade images etc.)
}
}
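For the signed 16-bit case raised in the question, a scanline that has already been read into a byte[] can be reinterpreted as short values instead of ushort. This is a minimal sketch, assuming the bytes are in the machine's native byte order and that each sample is exactly two bytes; ElevationSamples and ToSigned16 are hypothetical names, not part of LibTiff.Net.

```csharp
using System;

// Hypothetical helper: reinterprets raw scanline bytes as signed 16-bit samples.
public static class ElevationSamples
{
    public static short[] ToSigned16(byte[] scanline)
    {
        if (scanline.Length % 2 != 0)
        {
            // Two bytes define one sample, so the byte count must be even.
            throw new ArgumentException("Scanline length must be even.", nameof(scanline));
        }
        var samples = new short[scanline.Length / 2];
        // Copies the raw bytes; no per-sample conversion or scaling is applied.
        Buffer.BlockCopy(scanline, 0, samples, 0, scanline.Length);
        return samples;
    }
}
```

Using short keeps heights below sea level negative, whereas the same bytes read as ushort would show up as very large positive values.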
I'm working on a strange project. I have access to a laser cutter that I am using to make stencils (from metal). I can use coordinates to program the machine to cut a certain image, but what I was wondering was: how can I write a program that would take a scanned image that was black and white, and give me the coordinates of the black areas? I don't mind if it gives every pixel even though I need only the outer lines, I can do that part.
I've searched for this for a while, but the question has so many words with lots of results such as colors and pixels, that I find tons of information that isn't relevant. I would like to use C++ or C#, but I can use any language including scripting.
I used GetPixel in C#:
public List<String> GetBlackDots()
{
    Color pixelColor;
    var list = new List<String>();
    for (int y = 0; y < bitmapImage.Height; y++)
    {
        for (int x = 0; x < bitmapImage.Width; x++)
        {
            pixelColor = bitmapImage.GetPixel(x, y);
            // Only pure black pixels are recorded
            if (pixelColor.R == 0 && pixelColor.G == 0 && pixelColor.B == 0)
                list.Add(String.Format("x:{0} y:{1}", x, y));
        }
    }
    return list;
}
If we assume that the scanned image is perfectly white and perfectly black with no in-between colors, then we can just take the image as an array of rgb values and simply scan for 0 values. If the value is 0, it must be black right? However, the image probably won't be perfectly black, so you'll want some wiggle room.
What you do then would look something like this:
for(int i = 0; i < img.width; i++){
for(int j = 0; j < img.height; j++){
// 20 is an arbitrary value and subject to your opinion and need.
if(img[i][j].color <= 20)
//store i and j, those are your pixel location
}
}
Now if you use C#, it'll be easy to import most image formats, stick em in an array, and get your results. But if you want faster results, you'd be better off with C++.
This shortcut relies completely on the image values being very extreme. If large areas of your images are really grey, then the accuracy of this approach is terrible.
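The thresholded scan above could be written in C# roughly as follows. It is a sketch that assumes the scan has already been converted to a row-major array with one grayscale byte per pixel; DarkPixelScan and FindDarkPixels are hypothetical names, and the threshold is as arbitrary as the 20 used above.

```csharp
using System.Collections.Generic;

// Hypothetical helper: lists the coordinates of all sufficiently dark pixels.
public static class DarkPixelScan
{
    public static List<(int X, int Y)> FindDarkPixels(byte[] gray, int width, int height, byte threshold)
    {
        var found = new List<(int X, int Y)>();
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Anything at or below the threshold counts as black.
                if (gray[y * width + x] <= threshold)
                {
                    found.Add((x, y));
                }
            }
        }
        return found;
    }
}
```

Reading from a plain array like this avoids a per-pixel method call, which is usually the slow part when scanning a large image.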
While there are many solutions in many languages, I'll outline a simple solution that I would probably use myself. There is a great imaging library for Python called PIL (Python Imaging Library - http://www.pythonware.com/products/pil/), which could accomplish what you need very easily.
Here's an example of something that might help you get started.
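from PIL import Image

image = Image.open("image.png").convert("RGB")  # make sure every pixel is an (R, G, B) tuple
datas = image.getdata()
for item in datas:
    # "or" so that a pixel counts as soon as any channel is below full white
    if item[0] < 255 or item[1] < 255 or item[2] < 255:
        # THIS PIXEL IS NOT WHITE
        pass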
Of course, that will count any pixel that is not completely white. You might want to add some tolerance so that pixels which are not EXACTLY white also get picked up as being white. You'll also have to keep track of which pixel you are currently looking at.
In my project, I have to digitize an ECG image taken with a normal camera (JPEG). For example, I have the following camera-captured image:
I'm using C# to implement this.
Then I convert this image to a greyscale image and apply a threshold to separate the wave from the grid.
Finally, I remove unnecessary things from the image, and the final output looks like this:
Now I want to fetch the values that are marked on the image below, using the pixel count between those segments. What is the best way to do that?
The main things I want to get are the height of the QR wave and the length between two Q waves (in pixels).
How can I extend the code below to get those values and store them in arrays?
public void black(Bitmap bmp)
{
Color[,] results = new Color[bmp.Width, bmp.Height];
for (int i = 0; i < bmp.Height; i++)
{
for (int j = 0; j < bmp.Width; j++)
{
Color col = bmp.GetPixel(j, i);
if (col.R == 0)
{
results[j, i] = bmp.GetPixel(j, i);
}
}
}
}
For a theoretical (i.e. no source code) overview of the problem, read Section III of Syeda-Mahmood, Beymer, and Wang, "Shape-based Matching of ECG Recordings".
Basically, your black & white image is an array of datapoints: the x axis is simply the width of the image in pixels, and the y axis is obtained by averaging the y-position of the black pixels at each x-position (not needed if the black line is only one pixel high).
To make the data more manageable, you can down-sample by selecting every nth x-position from the image. You probably want to stick with a standard ECG sampling rate to ensure that you do not miss important data; modern ECG hardware often samples at 1000Hz, while the data in MIT's QRS database on Physionet is at 250Hz or 360Hz. Using one of these rates would mean reading 1000, 250, or 360 pixels for every second of data (25mm) in the scanned image.
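The column-averaging step described above might be sketched like this. It assumes the thresholded image is available as a bool[,] indexed [row, column], with true marking a black trace pixel; the 25 mm per second figure is the one quoted above. EcgTrace, ExtractTrace and PixelsPerSecond are hypothetical names.

```csharp
using System;

// Hypothetical helper: turns a thresholded ECG image into one y value per column.
public static class EcgTrace
{
    // Averages the row index of the black pixels in every column.
    // Columns without any black pixel are reported as NaN.
    public static double[] ExtractTrace(bool[,] isBlack)
    {
        int height = isBlack.GetLength(0), width = isBlack.GetLength(1);
        var trace = new double[width];
        for (int x = 0; x < width; x++)
        {
            long sum = 0;
            int count = 0;
            for (int y = 0; y < height; y++)
            {
                if (isBlack[y, x]) { sum += y; count++; }
            }
            trace[x] = count > 0 ? (double)sum / count : double.NaN;
        }
        return trace;
    }

    // Number of image columns covering one second of recording,
    // for a scan resolution in dots per inch (25.4 mm per inch).
    public static double PixelsPerSecond(double dotsPerInch, double mmPerSecond = 25.0)
    {
        return dotsPerInch / 25.4 * mmPerSecond;
    }
}
```

Because image rows count downwards, peaks of the wave appear as the smallest values in the trace. Distances such as the one between two Q waves then become differences between column indices, which PixelsPerSecond can convert to time.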