I have two images (A and B) and would like to know the rectangle where they intersect. The images represent scrolled screen content, and I only want the part of each image that does not overlap the other. A is one capture; B is the same screen area after the content has been scrolled by some percentage, so it shows additional content at the bottom. The scroll will not always cover a full screen of content (at the end of the content, for instance), so B may repeat part of A.
So basically, if B is only a partially scrolled view of the content, I want to reduce B so it does not contain any image data that's present in A. That way, when I merge A and B (and any other scrolls prior to A), they do not contain duplicate data.
This is my example image representing A and B stitched together; as you can see the content is duplicated.
There is some ambiguity here since there could be multiple intersections. By that I mean there could be a 1 pixel high intersection as well as a 100 pixel high intersection. One way to resolve that ambiguity is to take the greatest intersection possible between the two images. Here's a brute force method. I'll assume the two screenshots have the same dimensions, w x h.
Compare each of `h` rows of A to B.
Perform the same comparison on `h-1` rows with A's y coordinates offset by +1 (down 1).
Perform the same comparison on `h-2` rows with A's y coordinates offset by +2 (down 2).
...
Whenever the comparison succeeds, return that offset. That is the Y coordinate of A that should be stitched to the top of B.
The main issue here is performance. The worst case time is pretty bad (width * height * height) but that's if every pixel compares identical until the last one, for every row. By returning early as soon as one pixel doesn't match the performance might be practically good enough.
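The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the pixels of each screenshot have already been copied into row-major int arrays of the same size, and the names ScrollOverlap, FindOverlapOffset and RowsMatch are made up for this example.

```csharp
using System;

static class ScrollOverlap
{
    // Returns the smallest y offset in A at which the remaining rows of A equal
    // the top rows of B, which corresponds to the largest possible overlap.
    // Returns height when no overlap is found.
    // Assumption: a and b are row-major arrays holding width * height pixels each.
    public static int FindOverlapOffset(int[] a, int[] b, int width, int height)
    {
        for (int yOffset = 0; yOffset < height; yOffset++)
        {
            if (RowsMatch(a, b, width, height, yOffset))
                return yOffset;
        }
        return height; // nothing overlaps, so all of B is new content
    }

    // Compares rows yOffset..height-1 of A with rows 0..height-yOffset-1 of B.
    static bool RowsMatch(int[] a, int[] b, int width, int height, int yOffset)
    {
        int count = (height - yOffset) * width;
        int aStart = yOffset * width;
        for (int i = 0; i < count; i++)
        {
            if (a[aStart + i] != b[i])
                return false; // early exit on the first mismatching pixel
        }
        return true;
    }
}
```

If the returned offset is k, the first height - k rows of B duplicate the end of A, so only the rows of B from index height - k onward need to be appended.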
Comparing with memcmp
Here's an easy and fast way to compare your images starting at a given Y offset.
    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern int memcmp(IntPtr b1, IntPtr b2, UIntPtr count); // count is a size_t

    // Returns true if the rows of A from yOffset downward match the top rows of B.
    // Assumes rows are not padded, i.e. stride == width * bytesPerPixel.
    static bool compareIntersection(int width, int height, int yOffset, IntPtr a, IntPtr b)
    {
        int bytesPerPixel = 4; // this needs to be set to reflect your image format
        int bytesToRead = (height - yOffset) * width * bytesPerPixel;
        IntPtr aStart = IntPtr.Add(a, yOffset * width * bytesPerPixel);
        return 0 == memcmp(aStart, b, new UIntPtr((uint)bytesToRead));
    }
The arguments to the function are the width of the image, height of the image, the y offset in A, and pointers into the beginning of each bitmap's memory. You can get those pointers by using Bitmap.LockBits and then BitmapData.Scan0.
The code is untested so it could have problems.
I've picked up a project that creates a noise map using setPixel(). The majority of the runtime of this application is spent within the setPixel() function so I was looking to increase the speed at which the function executes.
I have done some research into this and found that this:
int index = x + (y * Width);
int col = color.ToArgb();
if (this.Bits == null)
{
this.Bits = new Int32[Width * Height];
}
Bits[index] = col;
has been recommended as a quicker approach. However, this is generating a completely black image.
I don't understand 100% how image manipulation and memory pointers work to be able to understand the code completely and refactor it to something better.
Here's the original code that my predecessor implemented:
unsafe
{
var scan0 = (byte*)Iptr;
int bitmapStride = Stride;
int bitmapPixelFormatSize = Depth / 8;
index = (bitmapStride * y) + (x * bitmapPixelFormatSize);
if (bitmapPixelFormatSize == 4)
{
scan0[index + 3] = color.A;
scan0[index + 2] = color.R;
scan0[index + 1] = color.G;
scan0[index] = color.B;
}
else if (bitmapPixelFormatSize == 1)
{
scan0[index] = color.R;
}
else
{
scan0[index + 2] = color.R;
scan0[index + 1] = color.G;
scan0[index] = color.B;
}
}
Iptr is just an IntPtr
Stride is an int; the only place I can find it being set is Stride = (PixelCount/Height) * (Depth / 8)
x is width
y is height
Could I get an explanation of what is happening in the original block of code, and possibly some help in understanding how to convert it to something that executes more quickly? Currently it takes around 500,000 ms to finish, because the function is called from a nested for loop over width * height.
Note: The following information was originally created by Bob Powell. The original link is no longer functional, so I have copied this information from the Internet Archive at https://web.archive.org/web/20120330012542/http://bobpowell.net/lockingbits.htm. It's a bit long, but I think it's worth preserving.
I'm not sure if this will serve as a direct answer to your question, but perhaps it will help you in finding a solution.
Using the LockBits method to access image data
Many image processing tasks and even file type conversions, say from 32 bit-per-pixel to 8 bit-per-pixel can be speeded up by accessing the pixel data array directly, rather than relying on GetPixel and SetPixel or other methods.
You will be aware that .NET is a managed code system which most often uses managed data, so it's not often that we need to gain access to bytes stored in memory any more. However, image manipulation is one of the few times when managed data access is just too slow, and so we need to delve once again into the knotty problems of finding the data and manipulating it.
Before I start on the subject in hand, I'll just remind you that the methods used to access any unmanaged data will be different depending on the language in which your program is written. C# developers have the opportunity, via the unsafe keyword and use of pointers, to access data in memory directly. Visual basic programmers should access such data through the Marshal class methods which may also show a small performance loss.
Lock up your bits
The Bitmap class provides the LockBits and corresponding UnlockBits methods which enable you to fix a portion of the bitmap pixel data array in memory, access it directly and finally replace the bits in the bitmap with the modified data. LockBits returns a BitmapData class that describes the layout and position of the data in the locked array.
The BitmapData class contains the following important properties:
Scan0: The address in memory of the fixed data array.
Stride: The width, in bytes, of a single row of pixel data. This width is a multiple, or possibly sub-multiple, of the pixel dimensions of the image and may be padded out to include a few more bytes. I'll explain why shortly.
PixelFormat: The actual pixel format of the data. This is important for finding the right bytes.
Width: The width of the locked image.
Height: The height of the locked image.
The relationship of Scan0 and Stride to the array in memory is shown in figure 1.
The Stride property, as shown in figure 1, holds the width of one row in bytes. The size of a row, however, may not be an exact multiple of the pixel size because, for efficiency, the system ensures that the data is packed into rows that begin on a four byte boundary and are padded out to a multiple of four bytes. This means, for example, that a 24 bit per pixel image 17 pixels wide would have a stride of 52. The used data in each row would take up 3*17 = 51 bytes, and the padding of 1 byte would expand each row to 52 bytes, or 13*4 bytes. A 4BppIndexed image 17 pixels wide would have a stride of 12. Nine of the bytes, or more properly eight and a half, would contain data, and the row would be padded out with a further 3 bytes to a 4 byte boundary.
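As a quick check of the arithmetic above, the padding rule can be written as a small helper. This is only a sketch of the four byte rounding described here; StrideMath and ComputeStride are names invented for the example, and in real code the value reported by BitmapData.Stride should be used rather than a computed one.

```csharp
static class StrideMath
{
    // Bytes per row when each row is padded out to a multiple of four bytes.
    // Assumption: rows follow the four byte alignment rule described in the text.
    public static int ComputeStride(int widthInPixels, int bitsPerPixel)
    {
        int bitsPerRow = widthInPixels * bitsPerPixel;
        // Round up to the next multiple of 32 bits, then convert to bytes.
        return ((bitsPerRow + 31) / 32) * 4;
    }
}
```

For the two cases in the text, ComputeStride(17, 24) gives 52 and ComputeStride(17, 4) gives 12.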
The data carrying portion of the row, as has been suggested above, is laid out according to the pixel format. A 24 bit per pixel image containing RGB data would have a new pixel every 3 bytes, a 32 bit per pixel RGBA every four bytes. Pixel formats that contain more than one pixel per byte, such as the 4 bit per pixel indexed and 1 bit per pixel indexed, have to be processed carefully so that the pixel required is not confused with its neighbour pixels in the same byte.
Finding the right byte.
Because the stride is the width of a row, to index any given row or Y coordinate you can multiply the stride by the Y coordinate to get the beginning of a particular row. Finding the correct pixel within the row is possibly more difficult and depends on knowing the layout of the pixel formats. The following examples show how to access a particular pixel for a given pixel format.
Format32BppArgb: Given X and Y coordinates, the address of the first element in the pixel is Scan0+(y*Stride)+(x*4). This points to the blue byte. The following three bytes contain the green, red and alpha bytes.
Format24BppRgb: Given X and Y coordinates, the address of the first element in the pixel is Scan0+(y*Stride)+(x*3). This points to the blue byte, which is followed by the green and the red.
Format8BppIndexed: Given the X and Y coordinates, the address of the byte is Scan0+(y*Stride)+x. This byte is the index into the image palette.
Format4BppIndexed: Given X and Y coordinates, the byte containing the pixel data is calculated as Scan0+(y*Stride)+(x/2). The corresponding byte contains two pixels; the upper nibble is the leftmost and the lower nibble is the rightmost of the two pixels. The four bits of the upper and lower nibble are used to select the colour from the 16 colour palette.
Format1BppIndexed: Given the X and Y coordinates, the byte containing the pixel is calculated by Scan0+(y*Stride)+(x/8). The byte contains 8 bits; each bit is one pixel, with the leftmost pixel in bit 7 and the rightmost pixel in bit 0. The bits select from the two entry colour palette.
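The formulas above can be collected into one helper for experimenting. This is a sketch only: PixelAddress, ByteOffset and BitMask1Bpp are hypothetical names, the offsets are relative to Scan0, and the stride is assumed to come from BitmapData.Stride.

```csharp
using System;

static class PixelAddress
{
    // Byte offset, relative to Scan0, of the byte that holds pixel (x, y).
    public static int ByteOffset(int x, int y, int stride, int bitsPerPixel)
    {
        switch (bitsPerPixel)
        {
            case 32: return y * stride + x * 4; // blue, green, red, alpha
            case 24: return y * stride + x * 3; // blue, green, red
            case 8:  return y * stride + x;     // palette index
            case 4:  return y * stride + x / 2; // two pixels per byte
            case 1:  return y * stride + x / 8; // eight pixels per byte
            default: throw new ArgumentException("Unsupported pixel size", nameof(bitsPerPixel));
        }
    }

    // Mask selecting pixel x inside its byte for a 1 bit per pixel row.
    // The leftmost pixel of the byte is bit 7 (0x80).
    public static byte BitMask1Bpp(int x)
    {
        return (byte)(0x80 >> (x & 7));
    }
}
```

For example, in a 24 bit per pixel image with a stride of 52, pixel (5, 2) starts at byte 2*52 + 5*3 = 119.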
Iterating through the pixels
For pixel formats with one or more bytes per pixel, the formula is simple and can be accomplished by looping through all Y and X values in order. The code in the following listings sets the blue component of a 32 bit per pixel image to 255. In both cases bm is a bitmap previously created.
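    // Requires compiling with unsafe code enabled.
    // The lock mode must allow writing, because the loop modifies the pixel data.
    BitmapData bmd = bm.LockBits(new Rectangle(0, 0, 10, 10), System.Drawing.Imaging.ImageLockMode.ReadWrite, bm.PixelFormat);
    int PixelSize = 4;
    unsafe
    {
        for (int y = 0; y < bmd.Height; y++)
        {
            byte* row = (byte*)bmd.Scan0 + (y * bmd.Stride);
            for (int x = 0; x < bmd.Width; x++)
            {
                row[x * PixelSize] = 255; // blue is the first byte of each pixel
            }
        }
    }
    bm.UnlockBits(bmd); // release the lock when done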
In VB this operation would be treated a little differently because VB has no knowledge of pointers and requires the use of the Marshal class to access unmanaged data.
    Dim x As Integer
    Dim y As Integer
    Dim PixelSize As Integer = 4
    ' The lock mode must allow writing, because the loop modifies the pixel data.
    Dim bmd As BitmapData = bm.LockBits(New Rectangle(0, 0, 10, 10), System.Drawing.Imaging.ImageLockMode.ReadWrite, bm.PixelFormat)
    For y = 0 To bmd.Height - 1
        For x = 0 To bmd.Width - 1
            Marshal.WriteByte(bmd.Scan0, (bmd.Stride * y) + (PixelSize * x), 255)
        Next
    Next
    bm.UnlockBits(bmd)
Sub-byte pixels.
The Format4BppIndexed and Format1BppIndexed pixel formats mentioned earlier both have more than one pixel stored in a byte. In such cases, it's up to you to ensure that changing the data for one pixel does not affect the other pixel or pixels held in that byte.
The method for indexing a 1 bit per pixel image relies on using the bitwise logical operations And and Or to reset or set specific bits in the byte. After using the formula shown above for 1 bit per pixel images, the lower 3 bits of the X coordinate are used to select the bit required. The listings below show this process in C# and VB. In both examples bmd is a bitmap data extracted from a 1 bit per pixel image.
C# code uses pointers and requires compiling with unsafe code
byte* p=(byte*)bmd.Scan0.ToPointer();
int index=y*bmd.Stride+(x>>3);
byte mask=(byte)(0x80>>(x&0x7));
if(pixel)
p[index]|=mask;
else
p[index]&=(byte)(mask^0xff);
VB code uses the Marshal class
Dim mask As Byte = 128 >> (x And 7)
Dim offset As Integer = (y * bmd.Stride) + (x >> 3)
Dim currentPixel As Byte = Marshal.ReadByte(bmd.Scan0, offset)
If pixel = True Then
Marshal.WriteByte(bmd.Scan0, offset, currentPixel Or mask)
Else
Marshal.WriteByte(bmd.Scan0, offset, CByte(currentPixel And (mask Xor 255)))
End If
Note, it's quite valid to use the Marshal class from C# code. I used pointers because they offer the best performance.
Accessing individual pixels in a 4 bit per pixel image is handled in a similar manner. The upper and lower nibble of the byte must be dealt with separately, and changing the contents of the odd X pixels should not affect the even X pixels. The code below shows how to perform this in C# and VB.
C#
int offset = (y * bmd.Stride) + (x >> 1);
byte currentByte = ((byte *)bmd.Scan0)[offset];
if((x&1) == 1)
{
currentByte &= 0xF0;
currentByte |= (byte)(colorIndex & 0x0F);
}
else
{
currentByte &= 0x0F;
currentByte |= (byte)(colorIndex << 4);
}
((byte *)bmd.Scan0)[offset]=currentByte;
VB
Dim offset As Integer = (y * bmd.Stride) + (x >> 1)
Dim currentByte As Byte = Marshal.ReadByte(bmd.Scan0, offset)
If (x And 1) = 1 Then
currentByte = currentByte And &HF0
currentByte = currentByte Or (colorIndex And &HF)
Else
currentByte = currentByte And &HF
currentByte = currentByte Or (colorIndex << 4)
End If
Marshal.WriteByte(bmd.Scan0, offset, currentByte)
Using LockBits and UnlockBits
The LockBits method takes a rectangle which may be the same size as or smaller than the image being processed, a PixelFormat which is usually the same as that of the image being processed, and an ImageLockMode value that specifies whether the data is read-only, write-only, read-write or a user allocated buffer. This last option cannot be used from C# or VB because the method overload for LockBits that specifies a user buffer is not included in the GDI+ managed wrapper.
It is very important that when all operations are complete the BitmapData is put back into the bitmap with the UnlockBits method. The snippet of code below illustrates this.
Dim bmd As BitmapData = bm.LockBits(New Rectangle(0, 0, 10, 10), ImageLockMode.ReadWrite, bm.PixelFormat)
' do operations here
bm.UnlockBits(bmd)
Summary
That just about covers the aspects of accessing the most popular and most difficult pixel formats directly. Using these techniques instead of the GetPixel and SetPixel methods provided by Bitmap will show a marked performance boost to your image processing and image format conversion routines.
I've read that if an image is greyscale, then it shouldn't have B and R components in RGB, and no U and V in the YUV color space.
Does it mean they should be equal to 0?
I'm using this code to get YUV values:
var px = (this.canvas.Image as Bitmap).GetPixel(i, j);
var y = 0.299 * px.R + 0.587 * px.G + 0.114 * px.B;
var u = -0.14713 * px.R - 0.28886 * px.G + 0.436 * px.B;
var v = 0.615 * px.R - 0.51499 * px.G - 0.10001 * px.B;
And I'm getting non-zero u and v values, though they are close to 0. Should I convert them to int anyway? Was the image really greyscale?
Your conversion from RGB to YUV uses multiplication with floating point values. Both floating point multiplication and the formulas you use are approximate, so it is not strange that the resulting values are not exactly 0 even if it was a greyscale image. It is probably safe to set u and v to 0 if you wish.
Assuming that your input image is given in the RGB (red-green-blue) color space, all three components should have the same value on every pixel. If this is not the case for your image (you may or may not want to neglect small differences, though), then it is not really gray-scale. RGB has three truly colored components.
On the other hand, the YUV color space has one luma (or brightness) channel, which is the Y-component, and 2 chrominance channels (the U- and V-components).
The Y is built from the RGB-components by a weighted average; see your code calculating the y. However, the green component gets more weight in order to resemble the sensitivity of the human eye. The three coefficients sum to 1.0 and are all positive, so they compute a linear combination similar to an average.
But the 3 coefficients for computing the U-component (most sensitive to bluish colors) sum to 0.0, up to the rounding of the coefficients (the same holds for the V-component, which is most sensitive to reddish colors). Therefore, assuming you put in a truly gray-scale RGB image, the resulting U and V would be 0.0.
If that is not the case (treat very small values as simply noisy variants of 0), then again, your input RGB image was not truly a gray-scale image.
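To see this numerically, here is a small sketch that applies the same coefficients as the question to a grey pixel. The names YuvCheck, ToYuv and IsGrey are made up for this example; the point is only that U and V come out very close to, but not exactly, zero, and that checking R == G == B directly avoids floating point altogether.

```csharp
using System;

static class YuvCheck
{
    // Same coefficients as in the question.
    public static (double Y, double U, double V) ToYuv(int r, int g, int b)
    {
        double y = 0.299 * r + 0.587 * g + 0.114 * b;
        double u = -0.14713 * r - 0.28886 * g + 0.436 * b;
        double v = 0.615 * r - 0.51499 * g - 0.10001 * b;
        return (y, u, v);
    }

    // A pixel is grey when R, G and B are equal; no floating point is needed for that test.
    public static bool IsGrey(int r, int g, int b)
    {
        return r == g && g == b;
    }
}
```

With r = g = b = 128, Y is 128 to within rounding, and U and V are tiny because their coefficients sum to (almost) zero. Comparing U and V against a small tolerance, or simply testing IsGrey on the original RGB values, is more robust than comparing them to exactly 0.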
I have calculated that the current Mandelbrot iterates 208,200 times. But if I use a break to control the iterations, the output looks like it came from a printer that has run out of ink half way through, so I am obviously not doing it correctly. Does anyone know how iteration controls should be implemented?
int iterations = 0;
for (x = 0; x < x1; x ++)
{
for (y = 0; y < y1; y++)
{
// PAINT CONTROLS HERE
if (iterations > 200000)
{
break;
}
iterations++;
}
}
You need to change the values of y1 and x1 to control the "depth" of your Mandelbrot set.
By breaking at a certain number of iterations, you've gone "deep" for a while (because x1 and y1 are large) and then just stop part way through.
It's not clear what you're asking. But taking the two most obvious interpretations of "iterations":
1) You mean to reduce the maximum iterations per-pixel. I wouldn't say this affects the "smoothness" of the resulting image, but "smooth" is not a well-defined technical term in the first place, so maybe this is what you mean. It's certainly more consistent with how the Mandelbrot set is visualized.
If this is the meaning you intend, then in your per-pixel loop (which you did not include in your code example), you need to reset the iteration count to 0 for each pixel, and then stop iterating if and when you hit the maximum you've chosen. Pixels where you hit the maximum before the iterated value for the pixel escapes are treated as being in the set.
Typically this maximum would be at least 100 or so, which is enough to give you the basic shape of the set. For fine detail at high zoom factors, this can be in the tens or hundreds of thousands of iterations.
2) You mean to reduce the number of pixels you've actually computed. To me, this affects the "smoothness" of the image, because the resulting image is essentially lower-resolution.
If this is what you mean, then you need to either change the pixel width and height of the computed image (i.e. make x1 and y1 smaller), or change the X and Y step sizes in your loop and then fill in the image with larger rectangles of the correct color.
Without a better code example, it's impossible to offer more specific advice.
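For what it's worth, the per-pixel cap described under interpretation 1 usually looks something like the sketch below. Since the question does not show the actual per-pixel code, everything here is an assumption: the names Mandelbrot, EscapeCount and Render are invented, the escape test uses the usual |z| > 2 bound, and the viewport in Render is an arbitrary choice.

```csharp
using System;

static class Mandelbrot
{
    // Number of iterations before the point (cx, cy) escapes, capped at maxIterations.
    // A result equal to maxIterations means the point is treated as inside the set.
    public static int EscapeCount(double cx, double cy, int maxIterations)
    {
        double zx = 0.0, zy = 0.0;
        int iterations = 0; // reset for every pixel
        while (iterations < maxIterations && zx * zx + zy * zy <= 4.0)
        {
            double next = zx * zx - zy * zy + cx;
            zy = 2.0 * zx * zy + cy;
            zx = next;
            iterations++;
        }
        return iterations;
    }

    // Computes one escape count per pixel; every pixel is always visited.
    public static int[,] Render(int width, int height, int maxIterations)
    {
        var counts = new int[width, height];
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                double cx = -2.5 + 3.5 * x / width;    // assumed real range: -2.5 .. 1.0
                double cy = -1.25 + 2.5 * y / height;  // assumed imaginary range: -1.25 .. 1.25
                counts[x, y] = EscapeCount(cx, cy, maxIterations);
            }
        }
        return counts;
    }
}
```

The important difference from the code in the question is that the cap applies inside each pixel's own loop, so lowering maxIterations makes every pixel cheaper instead of leaving part of the image unpainted.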
The Perlin.GetValue(x, y, z) method of libnoise (using the C# port) returns 0's if the input values are integers. How could I go about mapping a 2D array of tiles--as the indices of an array are integers--to the values of the noise? This makes little sense to me as even 3D terrain eventually rests upon whole integer values. For those who are generating such 3D landscapes, are these positions in the terrain always restrained to being a value of 0?
EDIT: I should mention that I am using a chunk system, so I loop through each chunk's [32, 32] array of tiles to get the Perlin noise values. I was hoping that by adding the offset of the chunk in world space to the [x, y] value of the tile in the array I could have continuous terrain. Regardless, I tried something like this and still got zero from the noise function:
double temp = generator.GetValue((x + offsetX) / ChunkSize, (y + offsetY) / ChunkSize, 0);
EDIT 2: I output the values of the noise functions to 32x32 textures and put them next to each other. Noise is being produced but it isn't continuous despite adjusting the input of the x and y values for the offset of the chunk.
EDIT 3: Problem solved. My offset values were set to pixel coordinates instead of tile/chunk coordinates. I was multiplying the index of the chunk by the chunk size times the tile size instead of just the chunk size (which is in tiles).
The input values to Perlin.GetValue(x, y, z) are doubles and are not technically limited to the range 0.0-1.0, but I would recommend that you take all your array indices and divide them by the length of the array in that dimension so that they all fall in the range 0.0-1.0; you should then get nice noise values. If you get too much noise, you can adjust that by scaling the indices: for example, scaling to the range 0.0-2.0 instead might produce slightly smoother noise.
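A minimal sketch of that coordinate mapping for a chunked world is shown below. It assumes chunk positions are counted in chunks and tile positions in tiles, as described in EDIT 3 of the question; NoiseCoordinates and ToNoiseCoordinate are hypothetical names, and the call to the noise generator itself is left out.

```csharp
using System;

static class NoiseCoordinates
{
    // Converts a tile index inside a chunk into a fractional noise coordinate.
    // Assumption: chunkIndex is counted in chunks and tileIndex in tiles.
    public static double ToNoiseCoordinate(int chunkIndex, int tileIndex, int chunkSize)
    {
        int worldTile = chunkIndex * chunkSize + tileIndex;
        return worldTile / (double)chunkSize; // the cast avoids integer division
    }
}
```

With a chunk size of 32, tile 16 of chunk 0 maps to 0.5, and the last tile of chunk 0 and the first tile of chunk 1 map to 0.96875 and 1.0, so neighbouring chunks continue smoothly. The results would then be passed as the x and y arguments of GetValue. Note that without the cast to double, dividing two ints truncates to a whole number, which would bring back the all-zero output described in the question.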
This is my C# implementation of a stack-based flood fill algorithm (which I based on wikipedia's definition). Earlier while coding, I only wanted to see it work. And it did. Then, I wanted to know the number of pixels that was actually filled. So in my code, I changed the return type to int and returned the "ctr" variable. But then ctr turned out to be approximately twice the actual number of filled pixels (I made a separate function with the sole purpose of counting those pixels -- just to know for certain).
Can anyone enlighten me as to how and why the variable "ctr" is incremented twice as often as it should be?
*Pixel class only serves as a container for the x, y, and color values of the pixels from the bitmap.
    public Bitmap floodfill(Bitmap image, int x, int y, Color newColor)
    {
        Bitmap result = new Bitmap(image.Width, image.Height);
        Stack<Pixel> pixels = new Stack<Pixel>();
        Color oldColor = image.GetPixel(x, y);
        int ctr = 0;
        pixels.Push(new Pixel(x, y, oldColor));
        while (pixels.Count > 0)
        {
            Pixel popped = pixels.Pop();
            if (popped.color == oldColor)
            {
                ctr++;
                result.SetPixel(popped.x, popped.y, newColor);
                pixels.Push(new Pixel(popped.x - 1, popped.y, image.GetPixel(x - 1, y)));
                pixels.Push(new Pixel(popped.x + 1, popped.y, image.GetPixel(x + 1, y)));
                pixels.Push(new Pixel(popped.x, popped.y - 1, image.GetPixel(x, y - 1)));
                pixels.Push(new Pixel(popped.x, popped.y + 1, image.GetPixel(x, y + 1)));
            }
        }
        return result;
    }
You do check the color of the pixel here:
if (popped.color == oldColor)
But popped.color may be (and apparently is in 50% of the cases) outdated. Because you do not check for duplicates when you insert pixels into your stack, you will have duplicates.
Upon popping these duplicates, the color attribute would have been saved a long time ago.
Maybe it gets clearer with a drawing:
As an example I took a bitmap with 9 pixels. On the first pane you have the numbering of the pixels, and on the right side you have your stack.
You start with pixel no 5. and push pixels 2, 4, 6 and 8 on the stack. Then you take pixel 2 off and push 1 and 3. In the next step you pop 1 and push 2 and 4 (again!). Then you may take 2 and realise it has already gotten the new color when it was pushed. (A bit late, but better late than never)
However, pixel no. 4 is there twice and has remembered the old color twice. So you take pixel no. 4 and color it.
Some steps later you have the image filled and still some items on your stack. Because the old color value is still stored inside these items, they get counted again.
While I may have a wrong ordering in the stack, the point stays valid.
Solution to your problem:
Quick and dirty (because it is still inefficient)
if (image.GetPixel(popped.x, popped.y) == oldColor)
It counts a pixel only if its current color still needs changing, not the remembered color.
Recommended: Check your Pixels whether they need coloring before pushing them onto the stack.
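Here is a rough sketch of the idea of re-reading the current color when a pixel is popped. To keep it self-contained it fills an int array in place instead of a Bitmap, so FloodFillSketch, Fill and the grid layout are all assumptions made for illustration and not your original types.

```csharp
using System;
using System.Collections.Generic;

static class FloodFillSketch
{
    // Fills the region connected to (startX, startY) and returns how many cells changed.
    // grid[x, y] holds a colour value; the array stands in for the bitmap and is modified in place.
    public static int Fill(int[,] grid, int startX, int startY, int newColor)
    {
        int width = grid.GetLength(0), height = grid.GetLength(1);
        int oldColor = grid[startX, startY];
        if (oldColor == newColor) return 0; // nothing to do, and avoids an endless loop

        int filled = 0;
        var pending = new Stack<(int X, int Y)>();
        pending.Push((startX, startY));
        while (pending.Count > 0)
        {
            var (x, y) = pending.Pop();
            if (x < 0 || y < 0 || x >= width || y >= height) continue; // outside the image
            if (grid[x, y] != oldColor) continue; // re-read the current colour, not a stored copy

            grid[x, y] = newColor;
            filled++; // each cell is counted once, when it actually changes
            pending.Push((x - 1, y));
            pending.Push((x + 1, y));
            pending.Push((x, y - 1));
            pending.Push((x, y + 1));
        }
        return filled;
    }
}
```

Duplicates can still end up on the stack, but they are skipped when popped because the cell no longer has the old color, so the counter matches the number of filled cells.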
If all Pixel does is hold the color passed to its constructor, it won't update the color after the pixel is filled, therefore can increment ctr more than once per pixel.
If you change Pixel to take a reference to the image in its constructor, you could re-read the color (i.e. make color a get property that reads the current colour), or you could track the coordinates already filled and not push those a second time.
[Edit]
In case it wasn't obvious from the accepted answer, GetPixel returns a Color - a value type. Think of it as an int that encodes the RGB value of the pixel at that time.
If you want to perform a fill fast, look up a Graphics.FloodFill example.
If your goal is learning, I'd recommend copying your image data to an array for processing and back again; most classic image algorithms are not much fun using GetPixel().