I've been playing with Huffman compression on images to reduce size while keeping them lossless, but I've also read that you can use predictive coding to compress image data further by reducing entropy.
From what I understand, in the lossless JPEG standard, each pixel is predicted as the average of the 4 adjacent pixels already encountered in raster order (three above and one to the left), e.g. trying to predict the value of a pixel a based on the preceding pixels, x, to the left of and above a:
x x x
x a
Then calculate and encode the residual (difference between predicted and actual value).
But what I don't get is: if the sum of the 4 neighbor pixels isn't a multiple of 4, you'd get a fraction, right? Should that fraction be ignored? If so, would the proper encoding of an 8-bit image (stored in a byte[]) be something like:
public static void Encode(byte[] buffer, int width, int height)
{
    var tempBuff = new byte[buffer.Length];
    for (int i = 0; i < buffer.Length; i++)
    {
        tempBuff[i] = buffer[i];
    }
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int offsetUp = ((i - 1) * width) + (j - 1);
            int offset = (i * width) + (j - 1);
            int a = tempBuff[offsetUp];
            int b = tempBuff[offsetUp + 1];
            int c = tempBuff[offsetUp + 2];
            int d = tempBuff[offset];
            int pixel = tempBuff[offset + 1];
            var ave = (a + b + c + d) / 4;
            var val = (byte)(ave - pixel);
            buffer[offset + 1] = val;
        }
    }
}
public static void Decode(byte[] buffer, int width, int height)
{
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int offsetUp = ((i - 1) * width) + (j - 1);
            int offset = (i * width) + (j - 1);
            int a = buffer[offsetUp];
            int b = buffer[offsetUp + 1];
            int c = buffer[offsetUp + 2];
            int d = buffer[offset];
            int pixel = buffer[offset + 1];
            var ave = (a + b + c + d) / 4;
            var val = (byte)(ave - pixel);
            buffer[offset + 1] = val;
        }
    }
}
I don't see how this really reduces entropy. How will this help compress my images further while still being lossless?
Thanks for any enlightenment.
EDIT:
After playing with predictive coding on my images, I noticed that the histogram shows a lot of ±1 values across the various pixels. This reduces entropy quite a bit in some cases. Here is a screenshot:
Yes, just truncate. It doesn't matter, because you store the difference. It reduces entropy because you only store small values; a lot of them will be -1, 0 or 1. There are a couple of off-by-one bugs in your snippet, btw.
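For illustration, here is an untested sketch of the same scheme (keeping the 4-neighbor average) with the edge handling tightened up so the last column is processed too; the clamped up-right neighbor at the edge is my own choice, not something from the lossless JPEG standard. Integer division simply truncates: e.g. neighbors 100, 102, 98 and 101 sum to 401, and 401 / 4 gives 100. Because the decoder truncates the same way and byte arithmetic wraps mod 256, the round trip stays lossless.
public static void Encode(byte[] buffer, int width, int height)
{
    // First row and first column are stored raw; they have no causal neighbors.
    var tempBuff = (byte[])buffer.Clone();
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width; j++) // include the last column
        {
            int up = (i - 1) * width + j;
            int cur = i * width + j;
            int a = tempBuff[up - 1];                      // up-left
            int b = tempBuff[up];                          // up
            int c = tempBuff[j < width - 1 ? up + 1 : up]; // up-right, clamped at the edge
            int d = tempBuff[cur - 1];                     // left
            int ave = (a + b + c + d) / 4;                 // integer division truncates
            buffer[cur] = (byte)(ave - tempBuff[cur]);     // store the residual, mod 256
        }
    }
}
public static void Decode(byte[] buffer, int width, int height)
{
    // Works in place: neighbors are already decoded when a pixel is reached.
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width; j++)
        {
            int up = (i - 1) * width + j;
            int cur = i * width + j;
            int a = buffer[up - 1];
            int b = buffer[up];
            int c = buffer[j < width - 1 ? up + 1 : up];
            int d = buffer[cur - 1];
            int ave = (a + b + c + d) / 4;
            buffer[cur] = (byte)(ave - buffer[cur]);       // pixel = ave - residual, mod 256
        }
    }
}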
Related
I'm trying to convert C++ code written against OpenCV 2.x to Emgu.CV in C#.
I have a function in C++:
cv::Mat computeMatXGradient(const cv::Mat &mat) {
    cv::Mat out(mat.rows, mat.cols, CV_64F);
    for (int y = 0; y < mat.rows; ++y) {
        const uchar* Mr = mat.ptr<uchar>(y);
        double* Or = out.ptr<double>(y);
        Or[0] = Mr[1] - Mr[0];
        for (int x = 1; x < mat.cols - 1; ++x) {
            Or[x] = (Mr[x + 1] - Mr[x - 1]) / 2.0;
        }
        Or[mat.cols - 1] = Mr[mat.cols - 1] - Mr[mat.cols - 2];
    }
    return out;
}
How do I do the same thing in C# with EmguCV efficiently?
So far, I have this C# code:
(I can't test it because a lot of code is missing)
Mat computeMatXGradient(Mat inMat)
{
    Mat outMat = new Mat(inMat.Rows, inMat.Cols, DepthType.Cv64F, inMat.NumberOfChannels);
    for (int y = 0; y < inMat.Rows; ++y)
    {
        // unsafe is required if I'm using pointers
        unsafe
        {
            byte* Mr = (byte*)inMat.DataPointer;
            double* Or = (double*)outMat.DataPointer;
            Or[0] = Mr[1] - Mr[0];
            for (int x = 1; x < inMat.Cols - 1; ++x)
            {
                Or[x] = (Mr[x + 1] - Mr[x - 1]) / 2.0;
            }
            Or[inMat.Cols - 1] = Mr[inMat.Cols - 1] - Mr[inMat.Cols - 2];
        }
    }
    return outMat;
}
Questions:
Is my C# code correct?
Is there a better/more efficient way?
You can try converting inMat to an array inM, then calculating the values you need in another array Or, and finally converting the latter array to the output Mat outMat.
Note: I assumed NumberOfChannels to be 1, as I think this will always be the case.
Mat computeMatXGradient(Mat inMat)
{
    int x, y;
    byte[] inM, Mr;
    double[] Or;
    inM = new byte[(int)inMat.Total];
    inMat.CopyTo(inM);
    Mr = new byte[inMat.Cols];
    Or = new double[inMat.Rows * inMat.Cols];
    for (y = 0; y < inMat.Rows; y++)
    {
        Array.Copy(inM, y * inMat.Cols, Mr, 0, inMat.Cols);
        Or[y * inMat.Cols] = Mr[1] - Mr[0];
        for (x = 1; x < inMat.Cols - 1; x++)
            Or[y * inMat.Cols + x] = (Mr[x + 1] - Mr[x - 1]) / 2.0;
        Or[y * inMat.Cols + inMat.Cols - 1] = Mr[inMat.Cols - 1] - Mr[inMat.Cols - 2];
    }
    Mat outMat = new Mat(inMat.Rows, inMat.Cols, DepthType.Cv64F, 1);
    // Marshal.Copy requires using System.Runtime.InteropServices;
    Marshal.Copy(Or, 0, outMat.DataPointer, inMat.Rows * inMat.Cols);
    return outMat;
}
Here is safe code that does the equivalent (not tested). I checked the OpenCV library to get the structure of Mat. The real structure in OpenCV has a pointer to the data; instead, I made an array of bytes, so if you were loading/writing an OpenCV struct you would need an interface to convert. The output matrix is eight times the size of the input, since the input is an array of bytes and the output is an array of doubles.
The code would be better if there were an input byte[][] and an output double[][].
public class Mat
{
    public int flags;
    //! the array dimensionality, >= 2
    public int dims;
    //! the number of rows and columns, or (-1, -1) when the array has more than 2 dimensions
    public int rows;
    public int cols;
    //! the data (OpenCV proper stores a pointer here instead)
    public byte[] data;
    //! pointer to the reference counter;
    // when the array points to user-allocated data, the pointer is NULL
    public Mat[] refcount;
    // other members

    public Mat computeMatXGradient(Mat inMat)
    {
        Mat outMat = new Mat();
        outMat.rows = inMat.rows;
        outMat.cols = inMat.cols;
        outMat.flags = inMat.flags; // includes depth type and number of channels
        outMat.dims = inMat.dims;
        outMat.data = new byte[inMat.rows * inMat.cols * sizeof(double)];
        int outIndex = 0;
        byte[] d;
        for (int y = 0; y < inMat.rows; ++y)
        {
            int inRowIndex = y * inMat.cols;
            d = BitConverter.GetBytes((double)(inMat.data[inRowIndex + 1] - inMat.data[inRowIndex]));
            Array.Copy(d, 0, outMat.data, outIndex, sizeof(double));
            outIndex += sizeof(double);
            for (int x = 1; x < inMat.cols - 1; ++x)
            {
                d = BitConverter.GetBytes((inMat.data[inRowIndex + x + 1] - inMat.data[inRowIndex + x - 1]) / 2.0);
                Array.Copy(d, 0, outMat.data, outIndex, sizeof(double));
                outIndex += sizeof(double);
            }
            d = BitConverter.GetBytes((double)(inMat.data[inRowIndex + inMat.cols - 1] - inMat.data[inRowIndex + inMat.cols - 2]));
            Array.Copy(d, 0, outMat.data, outIndex, sizeof(double));
            outIndex += sizeof(double); // advance past the row's last element
        }
        return outMat;
    }
}
I have a byte[] holding an RGBA array, and the following method that flips the image vertically:
private byte[] FlipPixelsVertically(byte[] frameData, int height, int width)
{
    byte[] data = new byte[frameData.Length];
    int k = 0;
    for (int j = height - 1; j >= 0 && k < height; j--)
    {
        for (int i = 0; i < width * 4; i++)
        {
            data[k * width * 4 + i] = frameData[j * width * 4 + i];
        }
        k++;
    }
    return data;
}
I am creating a new byte[] because I do not want to alter the contents of frameData, since the original info will be used elsewhere. So for now, I just have a nested for loop that copies each byte to the proper place in data.
As height and width increase, this will become an expensive operation. How can I optimize this so that the copy/swap is faster?
Using Buffer.BlockCopy:
private byte[] FlipPixelsVertically(byte[] frameData, int height, int width)
{
    byte[] data = new byte[frameData.Length];
    for (int k = 0; k < height; k++)
    {
        int j = height - k - 1;
        // Copy a whole row (width * 4 bytes of RGBA) in one call.
        Buffer.BlockCopy(
            frameData, k * width * 4,
            data, j * width * 4,
            width * 4);
    }
    return data;
}
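One caveat (an assumption in both snippets above): rows are taken to be tightly packed at width * 4 bytes. If your frames carry row padding, the same idea works with the actual stride passed in; a minimal sketch:
private static byte[] FlipPixelsVerticallyWithStride(byte[] frameData, int height, int stride)
{
    // Same row-by-row reversal, parameterized on the real row stride so
    // padded buffers (stride > width * 4) are handled correctly too.
    byte[] data = new byte[frameData.Length];
    for (int row = 0; row < height; row++)
    {
        Buffer.BlockCopy(
            frameData, row * stride,
            data, (height - row - 1) * stride,
            stride);
    }
    return data;
}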
Trying to write an efficient algorithm to scale down YUV 4:2:2 by a factor of 2, one which doesn't require a conversion to RGB (which is CPU intensive).
I've seen plenty of code on Stack Overflow for YUV to RGB conversion, but only an example of scaling for YUV 4:2:0 here, which I have based my code on. However, this produces an image which is effectively 3 columns of the same image with corrupt colours, so something is wrong with the algorithm when applied to 4:2:2.
Can anybody see what is wrong with this code?
public static byte[] HalveYuv(byte[] data, int imageWidth, int imageHeight)
{
    byte[] yuv = new byte[imageWidth / 2 * imageHeight / 2 * 3 / 2];
    int i = 0;
    for (int y = 0; y < imageHeight; y += 2)
    {
        for (int x = 0; x < imageWidth; x += 2)
        {
            yuv[i] = data[y * imageWidth + x];
            i++;
        }
    }
    for (int y = 0; y < imageHeight / 2; y += 2)
    {
        for (int x = 0; x < imageWidth; x += 4)
        {
            yuv[i] = data[(imageWidth * imageHeight) + (y * imageWidth) + x];
            i++;
            yuv[i] = data[(imageWidth * imageHeight) + (y * imageWidth) + (x + 1)];
            i++;
        }
    }
    return yuv;
}
A fast way to generate a low-quality thumbnail would be to discard half of the data in each dimension.
We break the image into a 4x2 grid of pixels, where each pair of pixels in the grid is represented by 4 bytes. In the down-scaled image, we take the colour values of the first 2 pixels in the grid by copying the first 4 bytes, whilst discarding the other 12 bytes worth of data.
This scaling can be generalized to any power of 2 (1/2, 1/4, 1/8, ...). This method is quick because it doesn't use any interpolation; it gives a lower-quality image which appears blocky, however, so for better results consider some sampling approach.
public static byte[] FastResize(
    byte[] data,
    int imageWidth,
    int imageHeight,
    int scaleDownExponent)
{
    var scaleDownFactor = (uint)Math.Pow(2, scaleDownExponent);
    var outputImageWidth = imageWidth / scaleDownFactor;
    var outputImageHeight = imageHeight / scaleDownFactor;
    // 2 bytes per pixel.
    byte[] yuv = new byte[outputImageWidth * outputImageHeight * 2];
    var pos = 0;
    // Process every scaleDownFactor-th line.
    for (uint pixelY = 0; pixelY < imageHeight; pixelY += scaleDownFactor)
    {
        // Work in blocks of 2 pixels; keep the first block, discard the rest.
        for (uint pixelX = 0; pixelX < imageWidth; pixelX += 2 * scaleDownFactor)
        {
            // Position of the pixel bytes.
            var start = ((pixelY * imageWidth) + pixelX) * 2;
            yuv[pos] = data[start];
            yuv[pos + 1] = data[start + 1];
            yuv[pos + 2] = data[start + 2];
            yuv[pos + 3] = data[start + 3];
            pos += 4;
        }
    }
    return yuv;
}
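As a quick usage sketch (the frame dimensions are made up for illustration, assuming a tightly packed 4:2:2 buffer at 2 bytes per pixel):
// Halve each dimension of a 640x480 4:2:2 frame (scaleDownExponent 1 = factor 2).
byte[] frame = new byte[640 * 480 * 2];
byte[] thumb = FastResize(frame, 640, 480, 1);
// thumb now holds a 320x240 image, still 2 bytes per pixel.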
I assume that the original data is in the following order (as it seems to be from your example code): first there are the luminance (Y) values of the pixels of the image (size = imageWidth*imageHeight bytes). After that come the chrominance components U and V, such that the values for a single pixel are given one after the other. This means the total size of the original image is 3*size.
Now, 4:2:2 subsampling means that every other value of the horizontal chrominance components is discarded. This reduces the data to size + 0.5*size + 0.5*size = 2*size, i.e., luminance is kept completely and both chrominance components are halved. Therefore, the result image should be allocated as:
byte[] yuv = new byte[2*imageWidth*imageHeight];
As the first part of the image is copied in full, the first loop becomes:
int i = 0;
for (int y = 0; y < imageHeight; y++)
{
    for (int x = 0; x < imageWidth; x++)
    {
        yuv[i] = data[y * imageWidth + x];
        i++;
    }
}
Because this just copies the beginning of data, it can be simplified to:
int size = imageHeight * imageWidth;
int i = 0;
for (; i < size; i++)
{
    yuv[i] = data[i];
}
Now, to copy the rest, we need to skip every other horizontal coordinate:
for (int y = 0; y < imageHeight; y++)
{
    for (int x = 0; x < imageWidth; x += 2) // +2: skip every other horizontal component
    {
        yuv[i] = data[size + y * 2 * imageWidth + 2 * x];
        i++;
        yuv[i] = data[size + y * 2 * imageWidth + 2 * x + 1];
        i++;
    }
}
The factor of two in the data-array index is needed because there are 2 bytes for each pixel (both chrominance components), so each "row" has 2*imageWidth bytes of data.
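Putting the pieces together as one method (an untested sketch; the name To422 and the use of Buffer.BlockCopy for the luminance copy are mine, but the indexing is exactly as derived above):
public static byte[] To422(byte[] data, int imageWidth, int imageHeight)
{
    int size = imageWidth * imageHeight;
    byte[] yuv = new byte[2 * size];
    // Luminance is kept completely; it is simply the start of data.
    Buffer.BlockCopy(data, 0, yuv, 0, size);
    int i = size;
    for (int y = 0; y < imageHeight; y++)
    {
        for (int x = 0; x < imageWidth; x += 2) // skip every other horizontal pair
        {
            yuv[i++] = data[size + y * 2 * imageWidth + 2 * x];     // U
            yuv[i++] = data[size + y * 2 * imageWidth + 2 * x + 1]; // V
        }
    }
    return yuv;
}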
I have the following code:
if (source != null)
{
    int count = 0;
    int stride = (source.PixelWidth * source.Format.BitsPerPixel + 7) / 8;
    byte[] pixels = new byte[source.PixelHeight * stride];
    source.CopyPixels(pixels, stride, 0);
    for (int y = 0; y < source.PixelHeight; y = y + 2)
    {
        for (int x = 0; x < source.PixelWidth; x = x + 2)
        {
            int index = y * stride + 4 * x;
            count = index;
            byte red = pixels[index];
            byte green = pixels[index + 1];
            byte blue = pixels[index + 2];
            byte alpha = pixels[index + 3];
        }
    }
    MessageBox.Show("Array Length, pixels: " + pixels.Count() + "," + count);
}
However, I am having an issue where certain bitmap images, when stepped through, throw a System.IndexOutOfRangeException as the index passes the pixel[] array count. Does anyone know how to solve this efficiently, without oversizing the array?
I want to display the progress as I go along, hence the need for an accurate array :)
Thanks in advance.
You should be able to step through your code and figure out when index is larger than your pixels buffer size.
Add some debugging output to your code and step through it. Something like:
if (source != null)
{
    int count = 0;
    int stride = (source.PixelWidth * source.Format.BitsPerPixel + 7) / 8;
    byte[] pixels = new byte[source.PixelHeight * stride];
    source.CopyPixels(pixels, stride, 0);
    for (int y = 0; y < source.PixelHeight; y = y + 2)
    {
        for (int x = 0; x < source.PixelWidth; x = x + 2)
        {
            int index = y * stride + 4 * x;
            count = index;
            int bufsize = source.PixelHeight * stride;
            System.Diagnostics.Debug.WriteLine($"bufsize={bufsize}, index={index}, x={x}, y={y}");
            System.Diagnostics.Debug.Assert((index + 3) <= bufsize);
            byte red = pixels[index];
            byte green = pixels[index + 1];
            byte blue = pixels[index + 2];
            byte alpha = pixels[index + 3];
        }
    }
    MessageBox.Show("Array Length, pixels: " + pixels.Count() + "," + count);
}
A big part about writing code is learning how to debug, use the debugger, and to verify the correctness of your algorithms. Good luck with your project.
This code will crash on any image where Format.BitsPerPixel is less than 32, e.g. 24-bit RGB with no alpha. You also shouldn't assume that the stride is what you think it is; you should use the value returned from LockBits.
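A minimal sketch of the loop with the pixel size derived from the format instead of hardcoded to 4 (untested; the channel names are indicative only, since byte order depends on the actual PixelFormat):
int bytesPerPixel = (source.Format.BitsPerPixel + 7) / 8;
for (int y = 0; y < source.PixelHeight; y += 2)
{
    for (int x = 0; x < source.PixelWidth; x += 2)
    {
        int index = y * stride + bytesPerPixel * x;
        // e.g. Bgra32 stores blue first; 24bpp formats have no alpha byte.
        byte c0 = pixels[index];
        byte c1 = bytesPerPixel > 1 ? pixels[index + 1] : (byte)0;
        byte c2 = bytesPerPixel > 2 ? pixels[index + 2] : (byte)0;
        byte alpha = bytesPerPixel > 3 ? pixels[index + 3] : (byte)255;
    }
}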
I am writing an image processing program with the express purpose of altering large images; the one I'm working with is 8165 by 4915 pixels. I was told to implement GPU processing, so after some research I decided to go with OpenCL and started implementing the OpenCL C# wrapper OpenCLTemplate.
My code takes in a bitmap and uses LockBits to lock its memory location. I then copy each byte into an array, run the array through the OpenCL kernel, which inverts each byte, and write the inverted bytes back into the memory location of the image. I split this process into ten chunks so that I can increment a progress bar.
My code works perfectly with smaller images, but when I try to run it with my big image I keep getting a MemObjectAllocationFailure when executing the kernel. I don't know why it's doing this, and I would appreciate any help in figuring out why, or how to fix it.
using OpenCLTemplate;

public static void Invert(Bitmap image, ToolStripProgressBar progressBar)
{
    string openCLInvert = @"
        __kernel void Filter(__global uchar * Img0,
                             __global float * ImgF)
        {
            // Gets information about work-item
            int x = get_global_id(0);
            int y = get_global_id(1);
            // Gets information about work size
            int width = get_global_size(0);
            int height = get_global_size(1);
            int ind = 4 * (x + width * y);
            // Inverts image colors
            ImgF[ind] = 255.0f - (float)Img0[ind];
            ImgF[1 + ind] = 255.0f - (float)Img0[1 + ind];
            ImgF[2 + ind] = 255.0f - (float)Img0[2 + ind];
            // Leave alpha component equal
            ImgF[ind + 3] = (float)Img0[ind + 3];
        }";

    // Lock the image in memory and get image lock data
    var imageData = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    CLCalc.InitCL();
    for (int i = 0; i < 10; i++)
    {
        unsafe
        {
            int adjustedHeight = (((i + 1) * imageData.Height) / 10) - ((i * imageData.Height) / 10);
            int count = 0;
            byte[] Data = new byte[(4 * imageData.Stride * adjustedHeight)];
            var startPointer = (byte*)imageData.Scan0;
            for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
            {
                for (int x = 0; x < imageData.Width; x++)
                {
                    byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));
                    Data[count] = *Byte;
                    Data[count + 1] = *(Byte + 1);
                    Data[count + 2] = *(Byte + 2);
                    Data[count + 3] = *(Byte + 3);
                    count += 4;
                }
            }
            CLCalc.Program.Compile(openCLInvert);
            CLCalc.Program.Kernel kernel = new CLCalc.Program.Kernel("Filter");
            CLCalc.Program.Variable CLData = new CLCalc.Program.Variable(Data);
            float[] imgProcessed = new float[Data.Length];
            CLCalc.Program.Variable CLFiltered = new CLCalc.Program.Variable(imgProcessed);
            CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { CLData, CLFiltered };
            kernel.Execute(args, new int[] { imageData.Width, adjustedHeight });
            CLCalc.Program.Sync();
            CLFiltered.ReadFromDeviceTo(imgProcessed);
            count = 0;
            for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
            {
                for (int x = 0; x < imageData.Width; x++)
                {
                    byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));
                    *Byte = (byte)imgProcessed[count];
                    *(Byte + 1) = (byte)imgProcessed[count + 1];
                    *(Byte + 2) = (byte)imgProcessed[count + 2];
                    *(Byte + 3) = (byte)imgProcessed[count + 3];
                    count += 4;
                }
            }
        }
        progressBar.Owner.Invoke((Action)progressBar.PerformStep);
    }
    // Unlock image
    image.UnlockBits(imageData);
}
You may have reached a memory allocation limit of your OpenCL driver/device. Check the values returned by clGetDeviceInfo: there is a limit on the size of a single memory object. The OpenCL driver may allow the total size of all allocated memory objects to exceed the memory size of your device, copying them to/from host memory when needed.
To process large images, you may have to split them into smaller pieces and process them separately.
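As a rough sketch of sizing those pieces (RowsPerChunk is a hypothetical helper; maxAllocBytes stands in for whatever clGetDeviceInfo reports as the per-allocation limit, CL_DEVICE_MAX_MEM_ALLOC_SIZE, on your device):
// Choose a band height so no single buffer exceeds the device's
// per-allocation limit. The float output buffer is 4x the byte input,
// so budget against the larger of the two.
static int RowsPerChunk(long maxAllocBytes, int strideBytes, int imageHeight)
{
    long floatBytesPerRow = (long)strideBytes * sizeof(float);
    int rows = (int)Math.Min(imageHeight, maxAllocBytes / floatBytesPerRow);
    return Math.Max(1, rows);
}
You would then loop over the image in bands of RowsPerChunk rows, uploading, filtering and reading back one band at a time, much like the existing ten-chunk loop but sized by memory rather than a fixed count.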