How is PNG CRC calculated exactly? - c#

For the past 4 hours I've been studying the CRC algorithm. I'm pretty sure I got the hang of it already.
I'm trying to write a png encoder, and I don't wish to use external libraries for the CRC calculation, nor for the png encoding itself.
My program has been able to produce the same CRCs as the examples in tutorials, like the one on Wikipedia:
Using the same polynomial and message as in the example, I was able to produce the same result in both cases. I was able to do this for several other examples as well.
However, I can't seem to properly calculate the CRC of PNG files. I tested this by creating a blank, one-pixel .png file in Paint and using its CRC as a comparison. I copied the data (and chunk name) from the IDAT chunk of the PNG (which the CRC is calculated from) and calculated its CRC using the polynomial provided in the PNG specification.
The polynomial provided in the png specification is the following:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Which should translate to:
1 00000100 11000001 00011101 10110111
Using that polynomial, I tried to get the CRC of the following data:
01001001 01000100 01000001 01010100
00011000 01010111 01100011 11101000
11101100 11101100 00000100 00000000
00000011 00111010 00000001 10011100
This is what I get:
01011111 11000101 01100001 01101000 (MSB First)
10111011 00010011 00101010 11001100 (LSB First)
This is what is the actual CRC:
11111010 00010110 10110110 11110111
I'm not exactly sure how to fix this, but my guess would be I'm doing this part from the specification wrong:
In PNG, the 32-bit CRC is initialized to all 1's, and then the data from each byte is processed from the least significant bit (1) to the most significant bit (128). After all the data bytes are processed, the CRC is inverted (its one's complement is taken). This value is transmitted (stored in the datastream) MSB first. For the purpose of separating into bytes and ordering, the least significant bit of the 32-bit CRC is defined to be the coefficient of the x^31 term.
I'm not completely sure I can understand all of that.
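In other words, the spec is describing the "reflected" CRC-32 also used by zlib: the register starts at all 1's (0xFFFFFFFF), each data byte is fed in starting from its least significant bit, and the result is inverted at the end. A minimal bit-by-bit sketch of that procedure (my own illustration, not the asker's code; note the CRC must be computed over the chunk type bytes plus the chunk data):

// Bit-by-bit, reflected CRC-32 as described in the PNG specification.
// 0xEDB88320 is the PNG polynomial 0x04C11DB7 with its bit order reversed,
// which is what lets us shift right and consume each byte LSB-first.
static uint Crc32BitByBit(byte[] data)
{
    uint crc = 0xFFFFFFFF;                  // initialize to all 1's
    foreach (byte b in data)
    {
        crc ^= b;                           // XOR the data byte into the low 8 bits
        for (int k = 0; k < 8; k++)
        {
            if ((crc & 1) != 0)
                crc = (crc >> 1) ^ 0xEDB88320;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFF;                // final inversion (one's complement)
}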
Also, here is the code I use to get the CRC:
public BitArray GetCRC(BitArray data)
{
    // Prepare the divident; Append the proper amount of zeros to the end
    BitArray divident = new BitArray(data.Length + polynom.Length - 1);
    for (int i = 0; i < divident.Length; i++)
    {
        if (i < data.Length)
        {
            divident[i] = data[i];
        }
        else
        {
            divident[i] = false;
        }
    }

    // Calculate CRC
    for (int i = 0; i < divident.Length - polynom.Length + 1; i++)
    {
        if (divident[i] && polynom[0])
        {
            for (int j = 0; j < polynom.Length; j++)
            {
                if ((divident[i + j] && polynom[j]) || (!divident[i + j] && !polynom[j]))
                {
                    divident[i + j] = false;
                }
                else
                {
                    divident[i + j] = true;
                }
            }
        }
    }

    // Strip the CRC off the divident
    BitArray crc = new BitArray(polynom.Length - 1);
    for (int i = data.Length, j = 0; i < divident.Length; i++, j++)
    {
        crc[j] = divident[i];
    }
    return crc;
}
So, how do I fix this to match the PNG specification?

You can find a complete implementation of the CRC calculation (and PNG encoding in general) in this public domain code [1]:
static uint[] crcTable;

// Stores a running CRC (initialized with the CRC of the "IDAT" string). When
// you write this to the PNG, write it as a big-endian value
static uint idatCrc = Crc32(new byte[] { (byte)'I', (byte)'D', (byte)'A', (byte)'T' }, 0, 4, 0);

// Call this function with the compressed image bytes,
// passing in idatCrc as the last parameter
private static uint Crc32(byte[] stream, int offset, int length, uint crc)
{
    uint c;
    if (crcTable == null)
    {
        crcTable = new uint[256];
        for (uint n = 0; n <= 255; n++)
        {
            c = n;
            for (var k = 0; k <= 7; k++)
            {
                if ((c & 1) == 1)
                    c = 0xEDB88320 ^ ((c >> 1) & 0x7FFFFFFF);
                else
                    c = (c >> 1) & 0x7FFFFFFF;
            }
            crcTable[n] = c;
        }
    }
    c = crc ^ 0xffffffff;
    var endOffset = offset + length;
    for (var i = offset; i < endOffset; i++)
    {
        c = crcTable[(c ^ stream[i]) & 255] ^ ((c >> 8) & 0xFFFFFF);
    }
    return c ^ 0xffffffff;
}
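To tie this back to the question: the CRC stored after each chunk covers the four chunk-type bytes plus the chunk data (but not the length field), and it is written big-endian. A hypothetical helper using the function above might look like this (WriteIdatChunk and ToBigEndian are my own illustrative names, not part of the original code):

// Sketch: write one IDAT chunk, including its CRC, to a PNG stream.
static void WriteIdatChunk(BinaryWriter output, byte[] chunkData)
{
    byte[] type = { (byte)'I', (byte)'D', (byte)'A', (byte)'T' };

    output.Write(ToBigEndian((uint)chunkData.Length));   // length (big-endian)
    output.Write(type);                                  // chunk type
    output.Write(chunkData);                             // compressed image data

    // The CRC covers the chunk type and the chunk data, not the length.
    uint crc = Crc32(chunkData, 0, chunkData.Length, Crc32(type, 0, 4, 0));
    output.Write(ToBigEndian(crc));                      // CRC (big-endian)
}

static byte[] ToBigEndian(uint value)
{
    byte[] bytes = BitConverter.GetBytes(value);
    if (BitConverter.IsLittleEndian) Array.Reverse(bytes);
    return bytes;
}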
[1] https://web.archive.org/web/20150825201508/http://upokecenter.dreamhosters.com/articles/png-image-encoder-in-c/

Related

Creating a code to decompress byte-oriented RLE image

I'm trying to write code to decompress a byte-oriented RLE image from a PostScript file. I've already tried solutions found around the web and also tried to build my own, but none of them produced the result I need.
After decompressing the RLE image, I should have a RAW image I can open in Photoshop (supplying the width, height and number of channels). However, when I try to open the extracted image it doesn't work; only a black output is shown.
My inputs are an ASCII-encoded file (a hexadecimal string) and a binary file, both compressed with byte-oriented RLE (in the hex file's case, it's just a matter of converting it to bytes before attempting the RLE decompression).
https://drive.google.com/drive/u/0/folders/1Q476HB9SvOG_RDwK6J7PPycbw94zjPYU
I've posted samples here.
WorkingSample.raw -> image sample I got using other software, plus its dimensions.
MySample.raw -> image sample I built using my code, plus its dimensions.
OriginalFile.ppf -> File containing the original image data and everything else.
ExtractedBinary.bin -> Only a binary portion from OriginalFile.ppf - makes it easier to read and work with the data.
This code was provided by the user nyerguds, who is part of the SO community.
Original Source: http://www.shikadi.net/moddingwiki/RLE_Compression#Types_of_RLE
It's the one I tried to use, but the results weren't correct. To be honest, I had difficulty understanding his code (he told me to change a few things to get it working for my case, but I was unable to).
And here's what I tried to do following the PostScript Red Book:
Book: https://www.adobe.com/content/dam/acom/en/devnet/actionscript/articles/PLRM.pdf
The part:
"The RunLengthEncode filter encodes data in a simple-byte oriented format based on run length.
The compressed data format is a sequence of runs, where each run consists of a length byte followed by 1 to 128 bytes of data. If the length byte is in the range 0 to 127, the following length + 1 bytes (1 to 128 bytes) are to be copied literally upon decompression. If length is in the range of 129 to 255, the following single byte is to be replicated 257 - length times (2 to 128 times) upon decompression."
Page 142, RunLengthEncode Filter.
List<byte> final = new List<byte>();
var split01 = ArraySplit(bytefile, 2);

foreach (var binPart in split01)
{
    try
    {
        if (binPart.ElementAt(0) <= 127)
        {
            int currLen = binPart[0] + 1;
            for (int i = 0; i <= binPart[0]; i++)
            {
                final.Add(binPart[1]);
                //Console.WriteLine(binPart[1]);
            }
        }
        else if (binPart[0] >= 128)
        {
            int currLen = 257 - binPart[0];
            for (int i = 0; i < currLen; i++)
            {
                final.Add(binPart[1]);
                //Console.WriteLine(binPart[1]);
            }
        }
    }
    catch (Exception)
    {
        break;
    }
}

File.WriteAllBytes(@"C:\test\again.raw", final.ToArray());
private static IEnumerable<byte[]> ArraySplit(byte[] bArray, int intBufforLengt)
{
    int bArrayLenght = bArray.Length;
    byte[] bReturn = null;

    int i = 0;
    for (; bArrayLenght > (i + 1) * intBufforLengt; i++)
    {
        bReturn = new byte[intBufforLengt];
        Array.Copy(bArray, i * intBufforLengt, bReturn, 0, intBufforLengt);
        yield return bReturn;
    }

    int intBufforLeft = bArrayLenght - i * intBufforLengt;
    if (intBufforLeft > 0)
    {
        bReturn = new byte[intBufforLeft];
        Array.Copy(bArray, i * intBufforLengt, bReturn, 0, intBufforLeft);
        yield return bReturn;
    }
}

private static byte[] StringToByteArray(String hex)
{
    int iValue = 0;
    int NumberChars = hex.Length;
    if (NumberChars % 2 != 0)
    {
        string m = string.Empty;
    }

    byte[] bytes = new byte[NumberChars / 2];
    try
    {
        for (int i = 0; i < NumberChars; i += 2)
        {
            bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
            iValue = i;
        }
    }
    catch (Exception e)
    {
        var value = iValue;
        Console.WriteLine(e.Message);
    }
    return bytes;
}
The desired output would be a grayscale TIFF. However, I can deal with PNGs as well.
I've managed to extract uncompressed data from this kind of file already; with Emgu (an OpenCV wrapper) I was able to create a viewable image and run my logic on it.
My actual results from the RLE-compressed data are only invalid RAW files that can't be viewed even in Photoshop or IrfanView.
Any input is appreciated. Thanks.
EDIT1: stuck on this part
for (int i = 0; i < bytefile.Length; i += 2)
{
    try
    {
        var lengthByte = bytefile[i];
        if (lengthByte <= 127)
        {
            int currLen = lengthByte + 1;
            for (int j = 0; j < currLen; j++)
            {
                final.Add(bytefile[i]);
                i++;
            }
        }
        if (bytefile[i] >= 128)
        {
            int currLen = 257 - bytefile[i];
            for (int k = 0; k < currLen; k++)
            {
                final.Add(bytefile[i + 1]);
            }
        }
    }
    catch (Exception)
    {
        break;
    }
}
This is the logic I'm following. Before, it was raising an exception, but I figured that out (it was because I forgot to add the ending byte; it makes no difference in the final result).
Try this basic outline:
int i = 0;
while (i < bytefile.Length)
{
    var lengthByte = bytefile[i++];
    if (lengthByte <= 127)
    {
        int currLen = lengthByte + 1;
        for (int j = 0; j < currLen; j++)
            final.Add(bytefile[i++]);
    }
    else
    {
        int currLen = 257 - lengthByte;
        byte byteToCopy = bytefile[i++];
        for (int j = 0; j < currLen; j++)
            final.Add(byteToCopy);
    }
}
This is how I understand what's specified above, anyway.
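Wrapped up as a complete helper, the outline might look like this (a sketch; it additionally treats a length byte of 128 as the end-of-data marker, which the PLRM defines for the RunLengthEncode filter but which isn't shown in the outline above):

// Sketch: byte-oriented RLE decoding (PostScript RunLengthDecode).
static byte[] RunLengthDecode(byte[] bytefile)
{
    var final = new List<byte>();
    int i = 0;
    while (i < bytefile.Length)
    {
        byte lengthByte = bytefile[i++];
        if (lengthByte == 128)
            break;                              // EOD marker
        if (lengthByte <= 127)
        {
            // Copy the next lengthByte + 1 bytes literally.
            for (int j = 0; j < lengthByte + 1; j++)
                final.Add(bytefile[i++]);
        }
        else
        {
            // Replicate the next byte 257 - lengthByte times.
            byte byteToCopy = bytefile[i++];
            for (int j = 0; j < 257 - lengthByte; j++)
                final.Add(byteToCopy);
        }
    }
    return final.ToArray();
}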
Although not explicitly stated, I believe you are attempting to extract a run-length encoded image from a PostScript file and save it out as a grayscale TIFF.
As a starting point for something like this, have you tried simply saving out an uncompressed image from a PostScript file as a grayscale TIFF, to ensure the application logic responsible for building up the TIFF image data works as you expect? That would be a good first step before moving on to decompressing RLE data and turning it into a TIFF.
The reason I think that's important is that your problem may have nothing to do with how you're decompressing the RLE data, but rather with how you're creating your output TIFF from presumably correctly decoded data.

Encoding an array of bytes similar to Base64, but with arbitrary radix

Does the procedure have a name, where you take a stream of 8-bit bytes and slice them into n-bit snippets stored in 8-bit containers?
The idea is very similar to Base64 encoding, where you split the stream of 1's and 0's into 6-bit chunks (instead of 8), meaning each chunk can have a decimal value of 0 - 63, each of which is assigned a unique human-readable character. In my case, I'm not looking to assign each chunk a specific character.
For example, the input 8-bit bytes:
11100101 01101100 01010011 00001100 11000000 10111101
become the 6-bit snippets:
111001 010110 110001 010011 000011 001100 000010 111101
which are subsequently stored as:
00111001 00010110 00110001 00010011 00000011 00001100 00000010 00111101
or, optionally, with an offset of 1 bit:
01110010 00101100 01100010 00100110 00000110 00011000 00000100 01111010
or an offset of 2 bits:
11100100 01011000 11000100 01001100 00001100 00110000 00001000 11110100
I was looking to write an algorithm in C# to encode a byte array to an arbitrary length with arbitrary offset, and another algorithm to convert it back again.
After quite a lot of headache, I thought I had successfully written the forward algorithm to encode an array of bytes. It worked for all my test cases, but when I started writing the reverse algorithm I realised the whole problem was a lot more complicated than I thought it would be; in fact, my forward algorithm didn't work where n < 4.
I wanted to write the algorithms with bitwise operators, which is the more proper and elegant solution. The other way would have been to dump the byte array as a long string of 1's and 0's to slice, but that would have been much, much slower.
Here is my forward algorithm that works for cases where n >= 4:
public static byte[] EncodeForward(byte[] input, int n, int offset = 0)
{
    byte[] output = new byte[(int)Math.Ceiling(input.Length * 8.0 / n)];
    output[0] = (byte)(input[0] >> (8 - n));
    int p = 1;
    int r = 8 - n;
    for (int i = 1; i < input.Length; i++)
    {
        output[p++] = (byte)((byte)((byte)(input[i - 1] << (8 - r)) | (byte)(input[i] >> r)) >> (8 - n - offset));
        if ((r += (8 - n)) == n)
        {
            output[p++] = (byte)(input[i] & (byte)(0xFF >> (8 - n)));
            r = 0;
        }
    }
    return output;
}
I originally conceived it for just the case of n = 7, so each output byte would be composed of parts of at most 7 input bytes. However, in the case where n < 4, each output byte would be composed of up to, I think, ceil(8/n) input bytes, so the process is a little more complex than above.
I was hoping to write the forward and reverse algorithms myself, but, honestly, after all this time debugging and testing what I've written and now finding this approach will never work for n < 4, I'm just looking for something that works. These two algorithms are just a very small piece of the project I'm working on.
Does this encoding/decoding procedure have a name, and is there either a built-in way to do it in C# or is there a library that will do it?
You are almost there. You just need an intermediate 16-bit buffer and an unprocessed-bits counter. Disclaimer: I don't know C#. The (pseudo)code below is written with C in mind; you may need some tweaking.
For encoding,
uint16_t mask = 0xffff << (16 - width);
uint16_t buffer = (input[0] << 8) | input[1];
i += 2;
int remaining = 16;
while (i < input.Length) {
    while (remaining >= width) {
        output[p++] = (buffer & mask) >> (16 - width);
        buffer <<= width;
        remaining -= width;
    }
    // Refill the buffer. Since it is 16 bits wide there is room
    // for an _entire_ input byte.
    buffer |= input[i++] << (8 - remaining);
    remaining += 8;
}
emit_remaining_bits(buffer, remaining);
For decoding:
uint16_t buffer = 0;
int remaining = 16;   // free space in the buffer, in bits
while (i < input.Length) {
    while (remaining > 8) {
        buffer |= input[i++] << (remaining - width);
        remaining -= width;
    }
    output[p++] = (buffer >> 8) & 0x00ff;
    buffer <<= 8;
    remaining += 8;
}
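For completeness, here is the same buffering idea expressed in C# (my own sketch; it ignores the offset parameter from the question and assumes 2 <= n <= 8). For the example input above with n = 6 it produces the eight container bytes listed in the question (with zero offset).

// Sketch: split a byte stream into n-bit chunks, each stored right-aligned in its own byte.
static byte[] EncodeForward(byte[] input, int n)
{
    var output = new List<byte>();
    int buffer = 0;     // pending bits, right-aligned
    int bitCount = 0;   // number of valid bits currently in the buffer
    foreach (byte b in input)
    {
        buffer = (buffer << 8) | b;    // append the next 8 input bits
        bitCount += 8;
        while (bitCount >= n)
        {
            output.Add((byte)((buffer >> (bitCount - n)) & ((1 << n) - 1)));
            bitCount -= n;
            buffer &= (1 << bitCount) - 1;   // drop the bits just emitted
        }
    }
    if (bitCount > 0)
        output.Add((byte)(buffer << (n - bitCount)));   // pad the last chunk with zero bits
    return output.ToArray();
}

// Reverse: reassemble 8-bit bytes from n-bit chunks (one chunk per input byte).
static byte[] DecodeForward(byte[] chunks, int n)
{
    var output = new List<byte>();
    int buffer = 0, bitCount = 0;
    foreach (byte c in chunks)
    {
        buffer = (buffer << n) | (c & ((1 << n) - 1));
        bitCount += n;
        if (bitCount >= 8)
        {
            output.Add((byte)((buffer >> (bitCount - 8)) & 0xFF));
            bitCount -= 8;
            buffer &= (1 << bitCount) - 1;
        }
    }
    return output.ToArray();
}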

Reading 24-bit samples from a .WAV file

I understand how to read 8-bit, 16-bit & 32-bit samples (PCM & floating-point) from a .wav file, since (conveniently) the .Net Framework has an in-built integral type for those exact sizes. But, I don't know how to read (and store) 24-bit (3 byte) samples.
How can I read 24-bit audio? Is there maybe some way I can alter my current method (below) for reading 32-bit audio to solve my problem?
private List<float> Read32BitSamples(FileStream stream, int sampleStartIndex, int sampleEndIndex)
{
    var samples = new List<float>();
    var bytes = ReadChannelBytes(stream, Channels.Left, sampleStartIndex, sampleEndIndex); // Reads bytes of a single channel.

    if (audioFormat == WavFormat.PCM) // audioFormat determines whether to process sample bytes as PCM or floating point.
    {
        for (var i = 0; i < bytes.Length / 4; i++)
        {
            samples.Add(BitConverter.ToInt32(bytes, i * 4) / 2147483648f);
        }
    }
    else
    {
        for (var i = 0; i < bytes.Length / 4; i++)
        {
            samples.Add(BitConverter.ToSingle(bytes, i * 4));
        }
    }

    return samples;
}
Reading (and storing) 24-bit samples is very simple. Now, as you've rightly said, a 3-byte integral type does not exist within the framework, which means you're left with two choices: either create your own type, or pad your 24-bit samples by inserting an empty byte (0) at the start of each sample's byte array, thereby making them 32-bit samples (so you can then use an int to store/manipulate them).
I will explain and demonstrate how to do the latter (which is, in my opinion, also the simpler approach).
First we must look at how a 24-bit sample would be stored within an int,
               MSB      2ndMSB   2ndLSB   LSB
24-bit sample: 11001101 01101001 01011100 00000000
32-bit sample: 11001101 01101001 01011100 00101001
MSB = Most Significant Byte, LSB = Least Significant Byte.
As you can see, the LSB of the 24-bit sample is 0, so all you have to do is declare a byte[] with 4 elements, then read the 3 bytes of the sample into the array (starting at element 1) so that your array looks like below (effectively bit shifting by 8 places to the left),
myArray[0]: 00000000
myArray[1]: 01011100
myArray[2]: 01101001
myArray[3]: 11001101
Once you have your byte array filled, you can pass it to BitConverter.ToInt32(myArray, 0); you would then need to shift the sample 8 places to the right to get it in its proper 24-bit integral representation (from -8388608 to 8388607), then divide by 8388608 to have it as a floating-point value.
So, putting that all together you should end up with something like this,
Note: I wrote the following code with the intention of being easy to follow, so it is not the most performant method; for a faster solution, see the second snippet further down.
private List<float> Read24BitSamples(FileStream stream, int startIndex, int endIndex)
{
    var samples = new List<float>();
    var bytes = ReadChannelBytes(stream, Channels.Left, startIndex, endIndex);
    var temp = new List<byte>();
    var paddedBytes = new byte[bytes.Length / 3 * 4];

    // Left align our samples within 32 bits (effectively bit shifting 8 places to the left).
    for (var i = 0; i < bytes.Length; i += 3)
    {
        temp.Add(0);            // LSB
        temp.Add(bytes[i]);     // 2nd LSB
        temp.Add(bytes[i + 1]); // 2nd MSB
        temp.Add(bytes[i + 2]); // MSB
    }

    // BitConverter requires the collection to be an array.
    paddedBytes = temp.ToArray();
    temp = null;
    bytes = null;

    for (var i = 0; i < paddedBytes.Length / 4; i++)
    {
        // Skip the bit shift and just divide: since our sample has been shifted 8 places
        // to the left, we divide by 2147483648 instead of 8388608.
        samples.Add(BitConverter.ToInt32(paddedBytes, i * 4) / 2147483648f);
    }

    return samples;
}
For a faster1 implementation you can do the following instead,
private List<float> Read24BitSamples(FileStream stream, int startIndex, int endIndex)
{
    var bytes = ReadChannelBytes(stream, Channels.Left, startIndex, endIndex);
    var samples = new float[bytes.Length / 3];

    for (var i = 0; i < bytes.Length; i += 3)
    {
        samples[i / 3] = (bytes[i] << 8 | bytes[i + 1] << 16 | bytes[i + 2] << 24) / 2147483648f;
    }

    return samples.ToList();
}
1 After benchmarking the above code against the previous method, this solution is approximately 450% to 550% faster.
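As a quick sanity check of the shift-and-divide trick (my own example, not from the original answer), a made-up full-scale negative sample shows why placing the most significant byte at bits 24-31 handles the sign correctly:

// Hypothetical full-scale negative 24-bit PCM sample, stored little-endian: 00 00 80.
byte[] sample = { 0x00, 0x00, 0x80 };

// Shifting the most significant byte into bits 24-31 sign-extends the value for free,
// because the result of the shifts/ORs is a signed int.
int asInt32 = sample[0] << 8 | sample[1] << 16 | sample[2] << 24; // -2147483648
float asFloat = asInt32 / 2147483648f;                            // -1.0f

Console.WriteLine(asFloat); // prints -1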

How can I calculate Longitudinal Redundancy Check (LRC)?

I've tried the example from wikipedia: http://en.wikipedia.org/wiki/Longitudinal_redundancy_check
This is the code for lrc (C#):
/// <summary>
/// Longitudinal Redundancy Check (LRC) calculator for a byte array.
/// ex) DATA (hex 6 bytes): 02 30 30 31 23 03
///     LRC (hex 1 byte ): EC
/// </summary>
public static byte calculateLRC(byte[] bytes)
{
    byte LRC = 0x00;
    for (int i = 0; i < bytes.Length; i++)
    {
        LRC = (LRC + bytes[i]) & 0xFF;
    }
    return ((LRC ^ 0xFF) + 1) & 0xFF;
}
It says the result should be "EC", but I get "71". What am I doing wrong?
Thanks.
Here's a cleaned-up version that doesn't do all those useless operations (instead of discarding the high bits every time, they're discarded all at once at the end), and it gives the result you observed. This is the version that uses addition, but that has a negation at the end - might as well subtract and skip the negation. That's a valid transformation even in the case of overflow.
public static byte calculateLRC(byte[] bytes)
{
    int LRC = 0;
    for (int i = 0; i < bytes.Length; i++)
    {
        LRC -= bytes[i];
    }
    return (byte)LRC;
}
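For what it's worth, running this over the sample bytes from the question's doc comment reproduces the value you observed (my own calculation, shown here as a quick check):

// The bytes from the example comment: 02 30 30 31 23 03.
byte[] data = { 0x02, 0x30, 0x30, 0x31, 0x23, 0x03 };

// The sum is 0xB9; its two's complement is 0x47, which is 71 in decimal -
// the value the question reports, not the 0xEC that Wikipedia claimed.
Console.WriteLine(calculateLRC(data).ToString("X2")); // 47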
Here's the alternative LRC (a simple xor of bytes)
public static byte calculateLRC(byte[] bytes)
{
    byte LRC = 0;
    for (int i = 0; i < bytes.Length; i++)
    {
        LRC ^= bytes[i];
    }
    return LRC;
}
And Wikipedia is simply wrong in this case, both in the code (doesn't compile) and in the expected result.
Guess this one looks cooler ;)
public static byte calculateLRC(byte[] bytes)
{
    return bytes.Aggregate<byte, byte>(0, (x, y) => (byte)(x ^ y));
}
If someone wants to get the LRC char from a string:
public static char CalculateLRC(string toEncode)
{
    byte[] bytes = Encoding.ASCII.GetBytes(toEncode);
    byte LRC = 0;
    for (int i = 0; i < bytes.Length; i++)
    {
        LRC ^= bytes[i];
    }
    return Convert.ToChar(LRC);
}
The corrected Wikipedia version is as follows:
private byte calculateLRC(byte[] b)
{
    byte lrc = 0x00;
    for (int i = 0; i < b.Length; i++)
    {
        lrc = (byte)((lrc + b[i]) & 0xFF);
    }
    lrc = (byte)(((lrc ^ 0xff) + 2) & 0xFF);
    return lrc;
}
I created this for Arduino to understand the algorithm (of course it's not written in the most efficient way)
String calculateModbusAsciiLRC(String input)
{
    // Refer to this document: http://www.simplymodbus.ca/ASCII.htm
    if ((input.length() % 2) != 0) { return "ERROR COMMAND SHOULD HAVE EVEN NUMBER OF CHARACTERS"; }
    // Make sure to omit the colon in the input string and that the input String has an even number of characters
    byte byteArray[input.length() + 1];
    input.getBytes(byteArray, sizeof(byteArray));

    byte LRC = 0;
    for (int i = 0; i < sizeof(byteArray) / 2; i++)
    {
        // Getting the sum of all registers
        uint x = 0;
        if (47 < byteArray[i * 2] && byteArray[i * 2] < 58) { x = byteArray[i * 2] - 48; }
        else { x = byteArray[i * 2] - 55; }

        uint y = 0;
        if (47 < byteArray[i * 2 + 1] && byteArray[i * 2 + 1] < 58) { y = byteArray[i * 2 + 1] - 48; }
        else { y = byteArray[i * 2 + 1] - 55; }

        LRC += x * 16 + y;
    }

    LRC = ~LRC + 1; // Getting the two's complement
    String checkSum = String(LRC, HEX);
    checkSum.toUpperCase(); // Converting to upper case, e.g. bc to BC - optional, some devices are case insensitive
    return checkSum;
}
I realize that this question is pretty old, but I had trouble figuring out how to do this. It's working now, so I figured I should paste the code. In my case, the checksum needs to be returned as an ASCII string.
public function getLrc($string)
{
    $LRC = 0;

    // Get hex checksum.
    foreach (str_split($string, 1) as $char) {
        $LRC ^= ord($char);
    }
    $hex = dechex($LRC);

    // Convert hex to string.
    $str = '';
    for ($i = 0; $i < strlen($hex); $i += 2) {
        $str .= chr(hexdec(substr($hex, $i, 2)));
    }

    return $str;
}

C# - Converting a Sequence of Numbers into Bytes

I am trying to send a UDP packet of bytes corresponding to the numbers 1-1000 in sequence. How do I convert each number (1,2,3,4,...,998,999,1000) into the minimum number of bytes required and put them in a sequence that I can send as a UDP packet?
I've tried the following with no success. Any help would be greatly appreciated!
List<byte> byteList = new List<byte>();

for (int i = 1; i <= 255; i++)
{
    byte[] nByte = BitConverter.GetBytes((byte)i);
    foreach (byte b in nByte)
    {
        byteList.Add(b);
    }
}

for (int g = 256; g <= 1000; g++)
{
    UInt16 st = Convert.ToUInt16(g);
    byte[] xByte = BitConverter.GetBytes(st);
    foreach (byte c in xByte)
    {
        byteList.Add(c);
    }
}

byte[] sendMsg = byteList.ToArray();
Thank you.
You need to use :
BitConverter.GetBytes(INTEGER);
Think about how you are going to be able to tell the difference between:
260, 1 -> 0x1, 0x4, 0x1
1, 4, 1 -> 0x1, 0x4, 0x1
If you use one byte for numbers up to 255 and two bytes for the numbers 256-1000, you won't be able to work out at the other end which number corresponds to what.
If you just need to encode them as described without worrying about how they are decoded, it smacks to me of a contrived homework assignment or test, and I'm not inclined to solve it for you.
I think you are looking for something along the lines of a 7-bit encoded integer:
protected void Write7BitEncodedInt(int value)
{
    uint num = (uint)value;
    while (num >= 0x80)
    {
        this.Write((byte)(num | 0x80));
        num = num >> 7;
    }
    this.Write((byte)num);
}
(taken from System.IO.BinaryWriter.Write(String)).
The reverse is found in the System.IO.BinaryReader class and looks something like this:
protected internal int Read7BitEncodedInt()
{
    byte num3;
    int num = 0;
    int num2 = 0;
    do
    {
        if (num2 == 0x23)
        {
            throw new FormatException(Environment.GetResourceString("Format_Bad7BitInt32"));
        }
        num3 = this.ReadByte();
        num |= (num3 & 0x7f) << num2;
        num2 += 7;
    }
    while ((num3 & 0x80) != 0);
    return num;
}
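On the receiving side you can apply the same logic in reverse over the whole packet (a sketch with an illustrative standalone method, since the framework's reader above is protected):

// Sketch: decode a stream of 7-bit-encoded integers from a received byte array.
static List<int> ReadAll7BitEncodedInts(byte[] data)
{
    var values = new List<int>();
    int pos = 0;
    while (pos < data.Length)
    {
        int value = 0, shift = 0;
        byte b;
        do
        {
            b = data[pos++];
            value |= (b & 0x7F) << shift;   // the low 7 bits carry the payload
            shift += 7;
        }
        while ((b & 0x80) != 0);            // a set high bit means "more bytes follow"
        values.Add(value);
    }
    return values;
}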
I do hope this is not homework, even though is really smells like it.
EDIT:
Ok, so to put it all together for you:
using System;
using System.IO;

namespace EncodedNumbers
{
    class Program
    {
        protected static void Write7BitEncodedInt(BinaryWriter bin, int value)
        {
            uint num = (uint)value;
            while (num >= 0x80)
            {
                bin.Write((byte)(num | 0x80));
                num = num >> 7;
            }
            bin.Write((byte)num);
        }

        static void Main(string[] args)
        {
            MemoryStream ms = new MemoryStream();
            BinaryWriter bin = new BinaryWriter(ms);

            for (int i = 1; i < 1000; i++)
            {
                Write7BitEncodedInt(bin, i);
            }

            byte[] data = ms.ToArray();
            int size = data.Length;

            Console.WriteLine("Total # of Bytes = " + size);
            Console.ReadLine();
        }
    }
}
The total size I get is 1871 bytes for numbers 1-1000.
Btw, could you simply state whether or not this is homework? Obviously, we will still help either way. But we would much rather you try a little harder so you can actually learn for yourself.
EDIT #2:
If you want to just pack them in ignoring the ability to decode them back, you can do something like this:
protected static void WriteMinimumInt(BinaryWriter bin, int value)
{
    byte[] bytes = BitConverter.GetBytes(value);
    int skip = bytes.Length - 1;
    while (bytes[skip] == 0)
    {
        skip--;
    }
    for (int i = 0; i <= skip; i++)
    {
        bin.Write(bytes[i]);
    }
}
This ignores any bytes that are zero (from MSB to LSB), so for 1-255 it will use one byte.
As stated elsewhere, this will not allow you to decode the data back, since the stream is now ambiguous. As a side note, this approach crams it down to 1743 bytes (as opposed to 1871 using 7-bit encoding).
A byte can only hold 256 distinct values, so you cannot store the numbers above 255 in one byte. The easiest way would be to use a short, which is 16 bits. If you really need to conserve space, you can use 10-bit numbers and pack them into a byte array (10 bits = 2^10 = 1024 possible values).
Naively (also, untested):
List<byte> bytes = new List<byte>();

for (int i = 1; i <= 1000; i++)
{
    byte[] nByte = BitConverter.GetBytes(i);
    foreach (byte b in nByte)
        bytes.Add(b);
}

byte[] byteStream = bytes.ToArray();
This will give you a stream of bytes where each group of 4 bytes is a number in [1, 1000].
You might be tempted to do some work so that i < 256 takes a single byte, i < 65536 takes two bytes, etc. However, if you do this you can't read the values back out of the stream. Instead, you'd add length encoding or sentinel bits or something of the like.
I'd say, don't. Just compress the stream, either using a built-in class, or gin up a Huffman encoding implementation using an agreed-upon set of frequencies.
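If you go the built-in route, something along these lines would do it (a sketch using System.IO.Compression; GZipStream and DeflateStream are the framework's built-in options):

using System.IO;
using System.IO.Compression;

// Sketch: compress the naive 4-bytes-per-number stream with the built-in GZipStream.
static byte[] Compress(byte[] byteStream)
{
    using (var compressed = new MemoryStream())
    {
        using (var gzip = new GZipStream(compressed, CompressionMode.Compress))
        {
            gzip.Write(byteStream, 0, byteStream.Length);
        }
        // The GZipStream must be disposed before the compressed bytes are read back.
        return compressed.ToArray();
    }
}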
