Creating a code to decompress byte-oriented RLE image

I'm trying to write code to decompress a byte-oriented RLE image from a PostScript file. I've already tried solutions found around the web and also tried to build my own, but none of them produced the result I need.
After decompressing the RLE image, I should have a RAW image that I can open in Photoshop (by specifying the width, height and number of channels). However, when I try to open the extracted image it doesn't work; only a black output is shown.
My inputs are an ASCII-encoded file (the binary data encoded as a hexadecimal string) and a binary file; both are byte-oriented RLE compressed (in the hex file's case, it's just a matter of converting it to bytes before attempting the RLE decompression).
https://drive.google.com/drive/u/0/folders/1Q476HB9SvOG_RDwK6J7PPycbw94zjPYU
I've posted samples here.
WorkingSample.raw -> Image Sample i got using another software, and its dimensions as well.
MySample.raw -> Image sample i built using my code, and its dimensions as well.
OriginalFile.ppf -> File containing the original image data and everything else.
ExtractedBinary.bin -> Only a binary portion from OriginalFile.ppf - makes it easier to read and work with the data.
This code was provided by the user nyerguds, he's part of the SO Community.
Original Source: http://www.shikadi.net/moddingwiki/RLE_Compression#Types_of_RLE
It's the one I tried to use, but the results weren't correct. To be honest, I had difficulty understanding his code (he told me to change a few things to get it working for my case, but I was unable to).
And here's what I tried to do following the PostScript Red Book:
Book: https://www.adobe.com/content/dam/acom/en/devnet/actionscript/articles/PLRM.pdf
The part:
"The RunLengthEncode filter encodes data in a simple-byte oriented format based on run length.
The compressed data format is a sequence of runs, where each run consists of a length byte followed by 1 to 128 bytes of data. If the length byte is in the range 0 to 127, the following length + 1 bytes (1 to 128 bytes) are to be copied literally upon decompression. If length is in the range of 129 to 255, the following single byte is to be replicated 257 - length times (2 to 128 times) upon decompression."
Page 142, RunLengthEncode Filter.
List<byte> final = new List<byte>();
var split01 = ArraySplit(bytefile, 2);
foreach (var binPart in split01)
{
try
{
if (binPart.ElementAt(0) <= 127)
{
int currLen = binPart[0] + 1;
for (int i = 0; i <= binPart[0]; i++)
{
final.Add(binPart[1]);
//Console.WriteLine(binPart[1]);
}
}
else if (binPart[0] >= 128)
{
int currLen = 257 - binPart[0];
for (int i = 0; i < currLen; i++)
{
final.Add(binPart[1]);
// Console.WriteLine(binPart[1]);
}
}
}
catch(Exception)
{
break;
}
}
File.WriteAllBytes(@"C:\test\again.raw", final.ToArray());
private static IEnumerable<byte[]> ArraySplit(byte[] bArray, int intBufforLengt)
{
int bArrayLenght = bArray.Length;
byte[] bReturn = null;
int i = 0;
for (; bArrayLenght > (i + 1) * intBufforLengt; i++)
{
bReturn = new byte[intBufforLengt];
Array.Copy(bArray, i * intBufforLengt, bReturn, 0, intBufforLengt);
yield return bReturn;
}
int intBufforLeft = bArrayLenght - i * intBufforLengt;
if (intBufforLeft > 0)
{
bReturn = new byte[intBufforLeft];
Array.Copy(bArray, i * intBufforLengt, bReturn, 0, intBufforLeft);
yield return bReturn;
}
}
private static byte[] StringToByteArray(String hex)
{
int iValue = 0;
int NumberChars = hex.Length;
if (NumberChars % 2 != 0)
{
string m = string.Empty;
}
byte[] bytes = new byte[NumberChars / 2];
try
{
for (int i = 0; i < NumberChars; i += 2)
{
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
iValue = i;
}
}
catch (Exception e)
{
var value = iValue;
Console.WriteLine(e.Message);
}
return bytes;
}
The desired output would be a grayscale TIFF. However, I can deal with PNGs as well.
I've already managed to extract uncompressed data from this kind of file; with Emgu (an OpenCV wrapper) I was able to create a viewable image and run my logic on it.
My actual results from the RLE-compressed data are only invalid RAW files that can't be viewed even in Photoshop or IrfanView.
Any input is appreciated. Thanks.
EDIT1: stuck on this part
for(int i=0; i < bytefile.Length; i+=2)
{
try
{
var lengthByte = bytefile[i];
if (lengthByte <= 127)
{
int currLen = lengthByte + 1;
for (int j = 0; j < currLen; j++)
{
final.Add(bytefile[i]);
i++;
}
}
if (bytefile[i] >= 128)
{
int currLen = 257 - bytefile[i];
for (int k = 0; k < currLen; k++)
{
final.Add(bytefile[i + 1]);
}
}
}
catch(Exception)
{
break;
}
}
This is the logic I'm following. Previously it raised an exception, but I figured it out (I had forgotten to add the ending byte; it makes no difference to the final result).

Try this basic outline:
int i = 0;
while (i < bytefile.Length)
{
    var lengthByte = bytefile[i++];
    if (lengthByte <= 127)
    {
        // Literal run: copy the next lengthByte + 1 bytes unchanged.
        int currLen = lengthByte + 1;
        for (int j = 0; j < currLen; j++)
            final.Add(bytefile[i++]);
    }
    else
    {
        // Repeat run: replicate the next byte 257 - lengthByte times.
        int currLen = 257 - lengthByte;
        byte byteToCopy = bytefile[i++];
        for (int j = 0; j < currLen; j++)
            final.Add(byteToCopy);
    }
}
This is how I understand what's specified above, anyway.
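The same run-by-run logic can be wrapped into a self-contained method with bounds checks. This is only a sketch: the class and method names are made up for illustration, and it assumes the input is already raw bytes (a hex-encoded stream has to be converted to bytes first). It also stops at a length byte of 128, which the PostScript Language Reference defines as the end-of-data marker for this filter.

```csharp
using System;
using System.IO;

// Hypothetical helper, not taken from the question or from any library.
public static class RunLengthDecoder
{
    // Decodes PostScript RunLengthEncode data held entirely in memory.
    public static byte[] Decode(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            int i = 0;
            while (i < input.Length)
            {
                byte length = input[i++];
                if (length == 128)
                    break; // 128 marks end of data (EOD)

                if (length < 128)
                {
                    // Literal run: the next length + 1 bytes are copied as-is.
                    int count = length + 1;
                    if (i + count > input.Length)
                        throw new InvalidDataException("Literal run is truncated.");
                    output.Write(input, i, count);
                    i += count;
                }
                else
                {
                    // Repeat run: the next single byte is written 257 - length times.
                    if (i >= input.Length)
                        throw new InvalidDataException("Repeat run is missing its data byte.");
                    byte value = input[i++];
                    int count = 257 - length;
                    for (int k = 0; k < count; k++)
                        output.WriteByte(value);
                }
            }
            return output.ToArray();
        }
    }
}
```

A quick sanity check is to compare the decoded length with width * height * channels; if they differ, the input probably still contains bytes that are not part of the compressed stream.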

Although not explicitly stated, I believe you are attempting to extract a RunLength Encoded image from a Postscript file and save that out as a grayscale TIFF.
As a starting point for something like this, have you tried simply saving an uncompressed image from a PostScript file as a grayscale TIFF, to make sure the application logic that builds the TIFF image data works as you expect? I'd suggest that as a good first step before moving on to decompressing RLE data and turning that into a TIFF.
The reason I think that's important is because your problem may have nothing to do with how you're decompressing the RLE data but rather how you're creating your output TIFF from presumably correctly decoded data.

Related

How do I put more than 2GB file into byte array in C#?

I have an algorithm that finds and replaces some hex values of a bin file. But I can't find out how to bypass that 2 GB limit for files that I load into the byte array.
It works just fine when the file size is smaller than 2 GB, but when it is larger, I get an exception
System.IO.IOException: The file is too long. This operation is currently limited to supporting files less than 2 gigabytes in size
So, I store my values in:
byte[] find = Array.Empty<byte>(); // assume there is some hex
byte[] replace = Array.Empty<byte>(); // assume there is some hex
byte[] bytes = File.ReadAllBytes("example.bin");
Here find and replace are the hex values to be found and replaced, and bytes is the byte array holding my file. To find and replace those values I use the following algorithm:
foreach (int index in PatternAt(bytes, find))
{
for (int i = index, replaceIndex = 0; i < bytes.Length && replaceIndex < replace.Length; i++, replaceIndex++)
{
bytes[i] = replace[replaceIndex];
}
File.WriteAllBytes("example.bin", bytes);
Console.WriteLine("Pattern found at offset {0} and replaced.", index);
continue;
}
And the code for PatternAt looks like this:
private static IEnumerable<int> PatternAt(byte[] source, byte[] pattern)
{
for (int i = 0; i < source.Length; i++)
{
if (source.Skip(i).Take(pattern.Length).SequenceEqual(pattern))
{
yield return i;
}
}
}
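One way to avoid loading the whole file is to scan it through a stream in fixed-size chunks, keeping the last few bytes of each chunk so that a pattern spanning a chunk boundary is still found. The sketch below assumes that find and replace have the same length, so matches can be overwritten in place without changing the file size. The class name, method names and chunkSize parameter are illustrative only.

```csharp
using System;
using System.IO;

// Hypothetical helper; assumes find.Length == replace.Length.
public static class LargeFilePatcher
{
    // Overwrites each occurrence of find with replace and returns the number of replacements.
    public static int ReplaceInStream(Stream stream, byte[] find, byte[] replace, int chunkSize = 1 << 20)
    {
        if (find.Length == 0 || find.Length != replace.Length)
            throw new ArgumentException("find and replace must be non-empty and equally long.");
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        int overlap = find.Length - 1;           // longest tail that could start a match
        var buffer = new byte[chunkSize + overlap];
        long bufferStart = 0;                    // file offset of buffer[0]
        int carried = 0;                         // unscanned bytes kept from the previous chunk
        int replaced = 0;

        while (true)
        {
            stream.Position = bufferStart + carried;
            int read = ReadFully(stream, buffer, carried, chunkSize);
            if (read == 0) break;
            int valid = carried + read;

            int i = 0;
            while (i + find.Length <= valid)
            {
                if (Matches(buffer, i, find))
                {
                    stream.Position = bufferStart + i;
                    stream.Write(replace, 0, replace.Length);
                    replaced++;
                    i += find.Length;            // continue after the replaced bytes
                }
                else
                {
                    i++;
                }
            }

            // Keep the unscanned tail so a match across the boundary is not missed.
            carried = valid - i;
            Array.Copy(buffer, i, buffer, 0, carried);
            bufferStart += i;
        }
        return replaced;
    }

    private static int ReadFully(Stream s, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, offset + total, count - total);
            if (n == 0) break;                   // end of stream
            total += n;
        }
        return total;
    }

    private static bool Matches(byte[] buffer, int start, byte[] pattern)
    {
        for (int k = 0; k < pattern.Length; k++)
            if (buffer[start + k] != pattern[k]) return false;
        return true;
    }
}
```

For a file on disk this would be called with something like new FileStream("example.bin", FileMode.Open, FileAccess.ReadWrite). Stream positions are long values, so the 2 GB array limit does not apply.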

C# - Padding image bytes with white bytes to fill 512 x 512

I'm using the Digital Persona SDK to scan fingerprints in WSQ format. As a requirement I need a 512 x 512 image, but the SDK only exports a 357 x 392 image.
The SDK provides a method to compress the captured image from the device into WSQ format and return a byte array that I can write to disk.
- I've tried to allocate a buffer of 262144 bytes for the 512 x 512 image.
- Fill the new buffer with white pixel data (each byte set to 255).
-Copy the original image buffer into the new image buffer. The original image doesn’t need to be centered but it's important to make sure to copy without corrupting the image data.
To summarize I've tried to copy the old image into the upper right corner of the new image.
DPUruNet.Compression.Start();
DPUruNet.Compression.SetWsqBitrate(95, 0);
Fid capturedImage = captureResult.Data;
//Fill the new buffer with white pixel data each byte to value 255.
byte[] bytesWSQ512 = new byte[262144];
for (int i = 0; i < bytesWSQ512.Length; i++)
{
bytesWSQ512[i] = 255;
}
//Compress capturedImage and get bytes (357 x 392)
byte[] bytesWSQ = DPUruNet.Compression.CompressRaw(capturedImage.Views[0].Width, capturedImage.Views[0].Height, 500, 8, capturedImage.Views[0].RawImage, CompressionAlgorithm.COMPRESSION_WSQ_NIST);
//Copy the original image buffer into the new image buffer
for (int i = 0; i < capturedImage.Views[0].Height; i++)
{
for (int j = 0; j < capturedImage.Views[0].Width; j++)
{
bytesWSQ512[i * bytesWSQ512.Length + j ] = bytesWSQ[i * capturedImage.Views[0].Width + j];
}
}
//Write bytes to disk
File.WriteAllBytes(@"C:\Users\Admin\Desktop\bytesWSQ512.wsq", bytesWSQ512);
DPUruNet.Compression.Finish();
When running that snippet I get an IndexOutOfRangeException. I don't know whether the loop or the index calculation for the new array is right.
Here is a representation of what I'm trying to do.

If someone is trying to achieve something like this or padding a raw image, I hope this will help.
DPUruNet.Compression.
DPUruNet.Compression.SetWsqBitrate(75, 0);
Fid ISOFid = captureResult.Data;
byte[] paddedImage = PadImage8BPP(captureResult.Data.Views[0].RawImage, captureResult.Data.Views[0].Width, captureResult.Data.Views[0].Height, 512, 512, 255);
byte[] bytesWSQ512 = Compression.CompressRaw(512, 512, 500, 8, paddedImage, CompressionAlgorithm.COMPRESSION_WSQ_NIST);
And the method to resize (pad) the image is:
public byte[] PadImage8BPP(byte[] original, int original_width, int original_height, int desired_width, int desired_height, byte pad_color)
{
byte[] canvas_8bpp = new byte[desired_width * desired_height];
for (int i = 0; i < canvas_8bpp.Length; i++)
canvas_8bpp[i] = pad_color; //Fill background. Note this type of fill will fail histogram checks.
int clamp_y_begin = 0;
int clamp_y_end = original_height;
int clamp_x_begin = 0;
int clamp_x_end = original_width;
int pad_y = 0;
int pad_x = 0;
if (original_height > desired_height)
{
int crop_distance = (int)Math.Ceiling((original_height - desired_height) / 2.0);
clamp_y_begin = crop_distance;
clamp_y_end = original_height - crop_distance;
}
else
{
pad_y = (desired_height - original_height) / 2;
}
if (original_width > desired_width)
{
int crop_distance = (int)Math.Ceiling((original_width - desired_width) / 2.0);
clamp_x_begin = crop_distance;
clamp_x_end = original_width - crop_distance;
}
else
{
pad_x = (desired_width - original_width) / 2;
}
//We traverse the captured image (either whole image or subset)
for (int y = clamp_y_begin; y < clamp_y_end; y++)
{
for (int x = clamp_x_begin; x < clamp_x_end; x++)
{
byte image_pixel = original[y * original_width + x];
canvas_8bpp[(pad_y + y - clamp_y_begin) * desired_width + pad_x + x - clamp_x_begin] = image_pixel;
}
}
return canvas_8bpp;
}
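The IndexOutOfRangeException in the question comes from the destination index: each row has to advance by the canvas width (512), not by the length of the whole buffer. The copy also has to run on the raw pixels before compression, as the method above does, not on the compressed WSQ bytes. A minimal sketch of the simpler top-left placement, with illustrative names and assuming 8 bits per pixel:

```csharp
using System;

// Hypothetical helper for 8-bit grayscale buffers stored row by row.
public static class RawImagePadding
{
    // Copies the source image into the top-left corner of a larger, filled canvas.
    public static byte[] PadTopLeft(byte[] source, int srcWidth, int srcHeight,
                                    int dstWidth, int dstHeight, byte fill)
    {
        if (srcWidth > dstWidth || srcHeight > dstHeight)
            throw new ArgumentException("Source image must fit inside the canvas.");
        if (source.Length < srcWidth * srcHeight)
            throw new ArgumentException("Source buffer is smaller than width * height.");

        var canvas = new byte[dstWidth * dstHeight];
        for (int i = 0; i < canvas.Length; i++)
            canvas[i] = fill;

        // Row r starts at r * srcWidth in the source and r * dstWidth in the canvas.
        for (int row = 0; row < srcHeight; row++)
            Buffer.BlockCopy(source, row * srcWidth, canvas, row * dstWidth, srcWidth);

        return canvas;
    }
}
```

Copying whole rows avoids the per-pixel index arithmetic where the original mistake occurred.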

How is PNG CRC calculated exactly?

For the past 4 hours I've been studying the CRC algorithm. I'm pretty sure I got the hang of it already.
I'm trying to write a png encoder, and I don't wish to use external libraries for the CRC calculation, nor for the png encoding itself.
My program has been able to get the same CRC's as the examples on tutorials. Like on Wikipedia:
Using the same polynomial and message as in the example, I was able to produce the same result in both of the cases. I was able to do this for several other examples as well.
However, I can't seem to properly calculate the CRC of PNG files. I tested this by creating a blank, one-pixel .png file in Paint and using its CRC as a comparison. I copied the data (and chunk name) from the IDAT chunk of the PNG (which the CRC is calculated from), and calculated its CRC using the polynomial provided in the PNG specification.
The polynomial provided in the png specification is the following:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Which should translate to:
1 00000100 11000001 00011101 10110111
Using that polynomial, I tried to get the CRC of the following data:
01001001 01000100 01000001 01010100
00011000 01010111 01100011 11101000
11101100 11101100 00000100 00000000
00000011 00111010 00000001 10011100
This is what I get:
01011111 11000101 01100001 01101000 (MSB First)
10111011 00010011 00101010 11001100 (LSB First)
This is what is the actual CRC:
11111010 00010110 10110110 11110111
I'm not exactly sure how to fix this, but my guess would be I'm doing this part from the specification wrong:
In PNG, the 32-bit CRC is initialized to all 1's, and then the data from each byte is processed from the least significant bit (1) to the most significant bit (128). After all the data bytes are processed, the CRC is inverted (its ones complement is taken). This value is transmitted (stored in the datastream) MSB first. For the purpose of separating into bytes and ordering, the least significant bit of the 32-bit CRC is defined to be the coefficient of the x^31 term.
I'm not completely sure I can understand all of that.
Also, here is the code I use to get the CRC:
public BitArray GetCRC(BitArray data)
{
// Prepare the divident; Append the proper amount of zeros to the end
BitArray divident = new BitArray(data.Length + polynom.Length - 1);
for (int i = 0; i < divident.Length; i++)
{
if (i < data.Length)
{
divident[i] = data[i];
}
else
{
divident[i] = false;
}
}
// Calculate CRC
for (int i = 0; i < divident.Length - polynom.Length + 1; i++)
{
if (divident[i] && polynom[0])
{
for (int j = 0; j < polynom.Length; j++)
{
if ((divident[i + j] && polynom[j]) || (!divident[i + j] && !polynom[j]))
{
divident[i + j] = false;
}
else
{
divident[i + j] = true;
}
}
}
}
// Strip the CRC off the divident
BitArray crc = new BitArray(polynom.Length - 1);
for (int i = data.Length, j = 0; i < divident.Length; i++, j++)
{
crc[j] = divident[i];
}
return crc;
}
So, how do I fix this to match the PNG specification?

You can find a complete implementation of the CRC calculation (and PNG encoding in general) in this public domain code:
static uint[] crcTable;

// Stores a running CRC (initialized with the CRC of "IDAT" string). When
// you write this to the PNG, write as a big-endian value
static uint idatCrc = Crc32(new byte[] { (byte)'I', (byte)'D', (byte)'A', (byte)'T' }, 0, 4, 0);

// Call this function with the compressed image bytes,
// passing in idatCrc as the last parameter
private static uint Crc32(byte[] stream, int offset, int length, uint crc)
{
    uint c;
    if (crcTable == null)
    {
        crcTable = new uint[256];
        for (uint n = 0; n <= 255; n++)
        {
            c = n;
            for (var k = 0; k <= 7; k++)
            {
                if ((c & 1) == 1)
                    c = 0xEDB88320 ^ ((c >> 1) & 0x7FFFFFFF);
                else
                    c = ((c >> 1) & 0x7FFFFFFF);
            }
            crcTable[n] = c;
        }
    }
    c = crc ^ 0xffffffff;
    var endOffset = offset + length;
    for (var i = offset; i < endOffset; i++)
    {
        c = crcTable[(c ^ stream[i]) & 255] ^ ((c >> 8) & 0xFFFFFF);
    }
    return c ^ 0xffffffff;
}
1 https://web.archive.org/web/20150825201508/http://upokecenter.dreamhosters.com/articles/png-image-encoder-in-c/
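The steps named in the specification paragraph quoted in the question (register initialized to all 1s, bits consumed least significant first, final inversion) can also be written without a table. The sketch below is a bit-at-a-time variant for illustration, with a made-up class name. It uses 0xEDB88320, which is the PNG polynomial with its bits reversed, so that the register can be shifted right.

```csharp
using System;

// Hypothetical bitwise variant of the table-driven CRC shown above.
public static class PngCrc
{
    // Computes the CRC-32 used by PNG over the given bytes.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;                  // initialized to all 1s
        foreach (byte b in data)
        {
            crc ^= b;                            // feed the byte in, low bit first
            for (int bit = 0; bit < 8; bit++)
            {
                // Shift right; apply the reflected polynomial when a 1 bit falls out.
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
        }
        return ~crc;                             // final ones' complement
    }
}
```

For a PNG chunk the input is the four chunk-type bytes followed by the chunk data (the length field is not included), and the result is stored most significant byte first. As a known reference value, the nine ASCII bytes "123456789" give 0xCBF43926.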

Inserting bytes in the middle of binary file

I want to add some string in the middle of image metadata block. Under some specific marker. I have to do it on bytes level since .NET has no support for custom metadata fields.
The block is built like 1C 02 XX YY YY ZZ ZZ ZZ ... where XX is the ID of the field I need to append and YY YY is the size of it, ZZ = data.
I imagine it should be more or less possible to read all the image data up to this marker (1C 02 XX) then increase the size bytes (YY YY), add data at the end of ZZ and then add the rest of the original file? Is this correct?
How should I go on with it? It needs to work as fast as possible with 4-5 MB JPEG files.

In general there is no way to speed up this operation. You have to read at least the portion that needs to be moved and write it again in the updated file. Creating a new file and copying the content to it may be faster if you can parallelize the read and write operations.
Note: In your particular case it may not be possible to just insert content in the middle of the file, as most file formats are not designed with such modifications in mind. Often there are offsets to portions of the file that become invalid when you shift part of the file. Specifying which file format you are trying to work with may help other people suggest better approaches.
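The "create a new file and copy" approach can be sketched with plain streams: copy everything before the insertion point, write the new bytes, then copy the remainder. The names below are illustrative, and the sketch does not update any length or offset fields that the file format may store elsewhere.

```csharp
using System;
using System.IO;

// Hypothetical helper; does not patch format-specific size fields.
public static class StreamInsert
{
    // Writes source to destination with insert placed at the given offset.
    public static void InsertBytes(Stream source, Stream destination, long offset, byte[] insert)
    {
        var buffer = new byte[81920];
        long remaining = offset;

        // Copy the first offset bytes unchanged.
        while (remaining > 0)
        {
            int n = source.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
            if (n == 0)
                throw new EndOfStreamException("Offset lies beyond the end of the source.");
            destination.Write(buffer, 0, n);
            remaining -= n;
        }

        destination.Write(insert, 0, insert.Length);  // the inserted data
        source.CopyTo(destination);                   // everything after the offset
    }
}
```

With two FileStream instances this never holds more than one buffer of the file in memory, which is comfortable for 4-5 MB JPEG files.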

Solved the problem with this code:
List<byte> dataNew = new List<byte>();
byte[] data = File.ReadAllBytes(jpegFilePath);
int j = 0;
for (int i = 1; i < data.Length; i++)
{
if (data[i - 1] == (byte)0x1C) // 1C IPTC
{
if (data[i] == (byte)0x02) // 02 IPTC
{
if (data[i + 1] == (byte)fileByte) // IPTC field_number, i.e. 0x78 = IPTC_120
{
j = i;
break;
}
}
}
}
for (int i = 0; i < j + 2; i++) // add data from file before this field
dataNew.Add(data[i]);
int countOld = (data[j + 2] & 255) << 8 | (data[j + 3] & 255); // curr field length
int countNew = valueToAdd.Length; // new string length
int newfullSize = countOld + countNew; // sum
byte[] newSize = BitConverter.GetBytes((Int16)newfullSize); // Int16 on 2 bytes (to use 2 bytes as size)
Array.Reverse(newSize); // changes order 10 00 to 00 10
for (int i = 0; i < newSize.Length; i++) // add changed size
dataNew.Add(newSize[i]);
for (int i = j + 4; i < j + 4 + countOld; i++) // add old field value
dataNew.Add(data[i]);
byte[] newString = ASCIIEncoding.ASCII.GetBytes(valueToAdd);
for (int i = 0; i < newString.Length; i++) // append with new field value
dataNew.Add(newString[i]);
for (int i = j + 4 + countOld; i < data.Length; i++) // add rest of the file (everything after the old field value)
dataNew.Add(data[i]);
byte[] finalArray = dataNew.ToArray();
File.WriteAllBytes(Path.Combine(Path.GetDirectoryName(jpegFilePath), "newfile.jpg"), finalArray);

Here is an easy and quite fast solution. It moves all bytes after given offset to their new position according to given extraBytes, so you can insert your data.
public void ExpandFile(FileStream stream, long offset, int extraBytes)
{
    // http://stackoverflow.com/questions/3033771/file-io-with-streams-best-memory-buffer-size
    const int SIZE = 4096;
    var buffer = new byte[SIZE];
    var length = stream.Length;
    // Expand file
    stream.SetLength(length + extraBytes);
    var pos = length;
    int to_read;
    // Move blocks from the end of the file backwards so nothing is overwritten before it is read
    while (pos > offset)
    {
        to_read = pos - SIZE >= offset ? SIZE : (int)(pos - offset);
        pos -= to_read;
        stream.Position = pos;
        stream.Read(buffer, 0, to_read);
        stream.Position = pos + extraBytes;
        stream.Write(buffer, 0, to_read);
    }
}
It needs to be checked, though...

C++ file i/o error?

Why is everything being read as 0?
int width = 5;
int height = 5;
int someTile = 1;
char buff[128];
ifstream file("test.txt", ios::in|ios::binary);
if(file.is_open())
{
cout << "open";
}
file.read(buff, sizeof(int));
width = atoi(buff);
file.read(buff, sizeof(int));
height = atoi(buff);
for (int x = 0; x < width; x++) {
for (int y = 0; y < height; y++) {
file.read(buff, sizeof(int));
someTile = atoi(buff);
cout << someTile;
}
}
My file format code is in C# and written like this:
FileStream stream = new FileStream("test.txt", FileMode.Create);
BinaryWriter writer = new BinaryWriter(stream);
// write a line of text to the file
writer.Write(15);
writer.Write(5);
for (int i = 0; i < 15; i++)
{
for (int j = 0; j < 5; j++)
{
writer.Write(1);
}
}
// close the stream
writer.Close();

Without knowing the contents of test.txt it's difficult to say exactly, but it looks like you're repeatedly reading 4 bytes (size of an int on most platforms) into a character buffer / string, and then trying to turn that into a number. Unless your file is constructed entirely of four byte blocks that are null-terminated, I wouldn't expect this to work.
Update: Ok, looking at your file format you're not writing strings, you're writing ints. Therefore I'd expect you to be able to read your numbers straight back in, with no need for atoi.
For example:
int value;
file.read((char*)&value, sizeof(int));
value should now contain the number from the file. To convert your whole example you're looking for something like this:
int width = 5;
int height = 5;
int someTile = 1;
ifstream file("test.txt", ios::in|ios::binary);
if(file.is_open())
{
    cout << "open";
    // Read each int as raw bytes, matching what BinaryWriter wrote
    file.read(reinterpret_cast<char*>(&width), sizeof(int));
    file.read(reinterpret_cast<char*>(&height), sizeof(int));
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            file.read(reinterpret_cast<char*>(&someTile), sizeof(int));
            cout << someTile;
        }
    }
}

atoi converts a NUL-terminated string to an integer. You are reading four bytes from the file (it's in binary mode), which may not be correct.
For example, a valid string for atoi would be "1234" (NUL terminated). However, the byte representation of this is 0x31 0x32 0x33 0x34 (note: not NUL terminated, given that you only read 4 bytes, so atoi could be doing anything). What is the format of this file? If it really is a byte representation, the number 1234 would look like 0x00 0x00 0x04 0xD2 (depending on endianness), and the way to read this int correctly would be to shift it in byte by byte.
So, big question - what is the format?
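On the format question: the C# side uses BinaryWriter.Write(int), which stores each value as four raw bytes in little-endian order, not as text. The sketch below (helper names are made up) shows the bytes produced for the 15 and 5 written in the question. Because those bytes contain no digit characters, atoi would be expected to return 0 for them.

```csharp
using System;
using System.IO;

// Hypothetical helper showing the byte layout BinaryWriter uses for ints.
public static class BinaryIntLayout
{
    // Serializes the values the same way the question's writer does.
    public static byte[] WriteInts(params int[] values)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            foreach (int value in values)
                writer.Write(value);             // 4 bytes each, little-endian
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the values back as 32-bit integers.
    public static int[] ReadInts(byte[] bytes)
    {
        var result = new int[bytes.Length / 4];
        using (var reader = new BinaryReader(new MemoryStream(bytes)))
        {
            for (int i = 0; i < result.Length; i++)
                result[i] = reader.ReadInt32();
        }
        return result;
    }
}
```

WriteInts(15, 5) yields the eight bytes 0F 00 00 00 05 00 00 00, so a reader on a little-endian platform with a 4-byte int can load each value directly into an int.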
