Can anyone define the Windows PE Checksum Algorithm? - c#

I would like to implement this in C#
I have looked here:
http://www.codeproject.com/KB/cpp/PEChecksum.aspx
And am aware of the ImageHlp.dll MapFileAndCheckSum function.
However, for various reasons, I would like to implement this myself.
The best I have found is here:
http://forum.sysinternals.com/optional-header-checksum-calculation_topic24214.html
But, I don't understand the explanation. Can anyone clarify how the checksum is calculated?
Thanks!
Update
From the code example, I do not understand what the following means or how to translate it into C#:
sum -= sum < low 16 bits of CheckSum in file // 16-bit borrow
sum -= low 16 bits of CheckSum in file
sum -= sum < high 16 bits of CheckSum in file
sum -= high 16 bits of CheckSum in file
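One reading of the pseudocode above: in C, the comparison sum < x evaluates to 1 or 0, so each pair of lines subtracts a 16-bit word together with a borrow. A minimal C# sketch of that reading, assuming sum is a 16-bit running sum that already includes the stored CheckSum field (the class and method names are hypothetical, not from the forum post):

```csharp
static class BorrowSubtract
{
    // Subtracts one 16-bit word from a 16-bit sum with a borrow.
    // Mirrors "sum -= sum < word; sum -= word;" from the pseudocode.
    public static ushort Subtract(ushort sum, ushort word)
    {
        int borrow = sum < word ? 1 : 0;   // C's (sum < word) is 1 or 0
        return unchecked((ushort)(sum - borrow - word));
    }

    // Removes the stored CheckSum value from the running sum, low word first.
    public static ushort RemoveStoredCheckSum(ushort sum, uint checkSumInFile)
    {
        sum = Subtract(sum, (ushort)(checkSumInFile & 0xFFFF));
        sum = Subtract(sum, (ushort)(checkSumInFile >> 16));
        return sum;
    }
}
```

With this approach the whole file is summed without skipping anything, and the stored value is taken back out afterwards.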
Update #2
Thanks. I also came across some Python code that does something similar here:
def generate_checksum(self):
# This will make sure that the data representing the PE image
# is updated with any changes that might have been made by
# assigning values to header fields as those are not automatically
# updated upon assignment.
#
self.__data__ = self.write()
# Get the offset to the CheckSum field in the OptionalHeader
#
checksum_offset = self.OPTIONAL_HEADER.__file_offset__ + 0x40 # 64
checksum = 0
# Verify the data is dword-aligned. Add padding if needed
#
remainder = len(self.__data__) % 4
data = self.__data__ + ( '\0' * ((4-remainder) * ( remainder != 0 )) )
for i in range( len( data ) / 4 ):
# Skip the checksum field
#
if i == checksum_offset / 4:
continue
dword = struct.unpack('I', data[ i*4 : i*4+4 ])[0]
checksum = (checksum & 0xffffffff) + dword + (checksum>>32)
if checksum > 2**32:
checksum = (checksum & 0xffffffff) + (checksum >> 32)
checksum = (checksum & 0xffff) + (checksum >> 16)
checksum = (checksum) + (checksum >> 16)
checksum = checksum & 0xffff
# The length is the one of the original data, not the padded one
#
return checksum + len(self.__data__)
However, it's still not working for me - here is my conversion of this code:
using System;
using System.IO;
namespace CheckSumTest
{
class Program
{
static void Main(string[] args)
{
var data = File.ReadAllBytes(@"c:\Windows\notepad.exe");
var PEStart = BitConverter.ToInt32(data, 0x3c);
var PECoffStart = PEStart + 4;
var PEOptionalStart = PECoffStart + 20;
var PECheckSum = PEOptionalStart + 64;
var checkSumInFile = BitConverter.ToInt32(data, PECheckSum);
Console.WriteLine(string.Format("{0:x}", checkSumInFile));
long checksum = 0;
var remainder = data.Length % 4;
if (remainder > 0)
{
Array.Resize(ref data, data.Length + (4 - remainder));
}
var top = Math.Pow(2, 32);
for (int i = 0; i < data.Length / 4; i++)
{
if (i == PECheckSum / 4)
{
continue;
}
var dword = BitConverter.ToInt32(data, i * 4);
checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
if (checksum > top)
{
checksum = (checksum & 0xffffffff) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = (checksum) + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += (uint)data.Length;
Console.WriteLine(string.Format("{0:x}", checksum));
Console.ReadKey();
}
}
}
Can anyone tell me where I'm being stupid?

Ok, finally got it working ok... my problem was that I was using ints not uints!!!
So, this code works (assuming data is 4-byte aligned, otherwise you'll have to pad it out a little) - and PECheckSum is the position of the CheckSum value within the PE (which is clearly not used when calculating the checksum!!!!)
static uint CalcCheckSum(byte[] data, int PECheckSum)
{
long checksum = 0;
var top = Math.Pow(2, 32);
for (var i = 0; i < data.Length / 4; i++)
{
if (i == PECheckSum / 4)
{
continue;
}
var dword = BitConverter.ToUInt32(data, i * 4);
checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
if (checksum > top)
{
checksum = (checksum & 0xffffffff) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = (checksum) + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += (uint)data.Length;
return (uint)checksum;
}
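To call a routine like the one above, the caller still needs the CheckSum offset and a 4-byte-aligned buffer. A minimal sketch of both steps, reusing the header arithmetic from the question (0x3C, then 4 + 20 + 64); the class and method names are hypothetical, and BitConverter is assumed to run on a little-endian machine:

```csharp
using System;

static class PeLayout
{
    // Returns the file offset of OptionalHeader.CheckSum.
    // Offset 0x3C holds the position of the PE signature; the signature is
    // 4 bytes, the COFF header is 20 bytes, and CheckSum sits 64 bytes into
    // the optional header, as in the question's code.
    public static int FindCheckSumOffset(byte[] data)
    {
        int peStart = BitConverter.ToInt32(data, 0x3C);
        return peStart + 4 + 20 + 64;
    }

    // Returns a copy of the data padded with zero bytes to a multiple of 4.
    public static byte[] PadToDword(byte[] data)
    {
        int remainder = data.Length % 4;
        if (remainder == 0) return (byte[])data.Clone();
        var padded = new byte[data.Length + (4 - remainder)];
        Array.Copy(data, padded, data.Length);
        return padded;
    }
}
```

Note that the routine above adds data.Length at the end, so with a padded buffer it would add the padded length; the Python code in the question adds the original, unpadded length instead.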

The code in the forum post is not strictly the same as what was noted during the actual disassembly of the Windows PE code. The CodeProject article you reference gives the "fold 32-bit value into 16 bits" as:
mov edx,eax ; EDX = EAX
shr edx,10h ; EDX = EDX >> 16 EDX is high order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order
add eax,edx ; EAX = EAX + EDX High Order Folded into Low Order
mov edx,eax ; EDX = EAX
shr edx,10h ; EDX = EDX >> 16 EDX is high order
add eax,edx ; EAX = EAX + EDX High Order Folded into Low Order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order 16 bits
Which you could translate into C# as:
// given: uint sum = ...;
uint high = sum >> 16; // take high order from sum
sum &= 0xFFFF; // clear out high order from sum
sum += high; // fold high order into low order
high = sum >> 16; // take the new high order of sum
sum += high; // fold the new high order into sum
sum &= 0xFFFF; // mask to 16 bits

The Java code below from emmanuel may not work. In my case it hangs and never completes. I believe this is due to the heavy use of I/O in the code, in particular the data.read() calls. One solution is to have the RandomAccessFile read the file, fully or incrementally, into one or more byte arrays.
I attempted this, but the calculation was still too slow because of the conditional that skips the checksum field bytes. I would imagine that the OP's C# solution has a similar problem.
The code below removes that conditional as well.
public static long computeChecksum(RandomAccessFile data, int checksumOffset)
throws IOException {
...
byte[] barray = new byte[(int) length];
data.readFully(barray);
long i = 0;
long ch1, ch2, ch3, ch4, dword;
while (i < checksumOffset) {
ch1 = ((int) barray[(int) i++]) & 0xff;
...
checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
i += 4;
while (i < length) {
ch1 = ((int) barray[(int) i++]) & 0xff;
...
checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += length;
return checksum;
}
I still think that code is too verbose and clunky, however, so I swapped the RandomAccessFile for a channel and rewrote the culprit bytes to zeros to eliminate the conditional. This code could probably still do with a cache-style buffered read.
public static long computeChecksum2(FileChannel ch, int checksumOffset)
throws IOException {
ch.position(0);
long sum = 0;
long top = (long) Math.pow(2, 32);
long length = ch.size();
ByteBuffer buffer = ByteBuffer.wrap(new byte[(int) length]);
buffer.order(ByteOrder.LITTLE_ENDIAN);
ch.read(buffer);
buffer.putInt(checksumOffset, 0x0000);
buffer.position(0);
while (buffer.hasRemaining()) {
sum += buffer.getInt() & 0xffffffffL;
if (sum > top) {
sum = (sum & 0xffffffffL) + (sum >> 32);
}
}
sum = (sum & 0xffff) + (sum >> 16);
sum = sum + (sum >> 16);
sum = sum & 0xffff;
sum += length;
return sum;
}
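The same zeroing trick can be sketched in C#. This is only an illustration under stated assumptions: the buffer length is a multiple of 4, checkSumOffset is the file offset of the CheckSum field, BitConverter runs on a little-endian machine, and the class name is hypothetical. Adding a zero dword does not change the sum, so no skip test is needed inside the loop.

```csharp
using System;

static class PeChecksumZeroed
{
    public static uint Compute(byte[] data, int checkSumOffset)
    {
        // Work on a copy so the caller's buffer keeps its stored checksum.
        var copy = (byte[])data.Clone();
        Array.Clear(copy, checkSumOffset, 4);

        ulong sum = 0;
        for (int i = 0; i + 4 <= copy.Length; i += 4)
        {
            sum += BitConverter.ToUInt32(copy, i);
            sum = (sum & 0xFFFFFFFF) + (sum >> 32); // fold the carry back in
        }

        // Fold 32 bits into 16, then add the file length.
        sum = (sum & 0xFFFF) + (sum >> 16);
        sum += sum >> 16;
        sum &= 0xFFFF;
        return (uint)(sum + (ulong)data.Length);
    }
}
```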

No one really answered the original question of "Can anyone define the Windows PE Checksum Algorithm?" so I'm going to define it as simply as possible. A lot of the examples given so far are optimizing for unsigned 32-bit integers (aka DWORDs), but if you just want to understand the algorithm itself at its most fundamental, it is simply this:
Using an unsigned 16-bit integer (aka a WORD) to store the checksum, add up all of the WORDs of the data except for the 4 bytes of the PE optional header checksum. If the file is not WORD-aligned, then the last byte is a 0x00.
Convert the checksum from a WORD to a DWORD and add the size of the file.
The PE checksum algorithm above is effectively the same as the original MS-DOS checksum algorithm. The only differences are the location that is skipped and the final step: instead of XORing with 0xFFFF at the end, the size of the file is added.
From my WinPEFile class for PHP, the above algorithm looks like:
$x = 0;
$y = strlen($data);
$val = 0;
while ($x < $y)
{
// Skip the checksum field location.
if ($x === $this->pe_opt_header["checksum_pos"]) $x += 4;
else
{
$val += self::GetUInt16($data, $x, $y);
// In PHP, integers are either signed 32-bit or 64-bit integers.
if ($val > 0xFFFF) $val = ($val & 0xFFFF) + 1;
}
}
// Add the file size.
$val += $y;
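Since the question asks for C#, here is a rough C# sketch of the same WORD-based description. It assumes checkSumOffset is an even file offset, and the class name is hypothetical. Like the PHP code, it folds any carry back into the low 16 bits after each addition.

```csharp
static class PeChecksum16
{
    public static uint Compute(byte[] data, int checkSumOffset)
    {
        uint sum = 0;
        for (int i = 0; i < data.Length; i += 2)
        {
            // Skip the two WORDs that make up the CheckSum field.
            if (i == checkSumOffset || i == checkSumOffset + 2) continue;

            // Little-endian WORD; a missing final byte counts as 0x00.
            uint word = data[i];
            if (i + 1 < data.Length) word |= (uint)data[i + 1] << 8;

            sum += word;
            sum = (sum & 0xFFFF) + (sum >> 16); // fold the carry back in
        }
        // Widen to a DWORD and add the file size.
        return sum + (uint)data.Length;
    }
}
```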

I was trying to solve the same issue in Java. Here is Mark's solution translated into Java, using a RandomAccessFile instead of a byte array as input:
static long computeChecksum(RandomAccessFile data, long checksumOffset) throws IOException {
long checksum = 0;
long top = (long) Math.pow(2, 32);
long length = data.length();
for (long i = 0; i < length / 4; i++) {
if (i == checksumOffset / 4) {
data.skipBytes(4);
continue;
}
long ch1 = data.read();
long ch2 = data.read();
long ch3 = data.read();
long ch4 = data.read();
long dword = ch1 + (ch2 << 8) + (ch3 << 16) + (ch4 << 24);
checksum = (checksum & 0xffffffffL) + dword + (checksum >> 32);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += length;
return checksum;
}

private unsafe static int GetSetPEChecksum(byte[] Array) {
var Value = 0;
var Count = Array.Length;
if(Count >= 64)
fixed (byte* array = Array) {
var Index = 0;
var Coff = *(int*)(array + 60);
if(Coff >= 64 && Count >= Coff + 92) {
*(int*)(array + Coff + 88) = 0;
var Bound = Count >> 1;
if((Count & 1) != 0) Value = array[Count & ~1];
var Short = (ushort*)array;
while(Index < Bound) {
Value += Short[Index++];
Value = (Value & 0xffff) + (Value >> 16);
Value = (Value + (Value >> 16)) & 0xffff;
}
*(int*)(array + Coff + 88) = Value += Count;
}
}
return Value;
}
If you need a short unsafe version, the code above does not need double or long integers, and it does not need the array to be aligned inside the algorithm.

The Java example is not entirely correct. The following Java implementation matches the result of Microsoft's original implementation, Imagehlp.MapFileAndCheckSumA.
It's important that each input byte is masked with inputByte & 0xff and that the resulting long is masked again with currentWord & 0xffffffffL when it is used in the addition (note the L suffix):
long checksum = 0;
final long max = 4294967296L; // 2^32
// verify the data is DWORD-aligned and add padding if needed
final int remainder = data.length % 4;
final byte[] paddedData = Arrays.copyOf(data, data.length
+ (remainder > 0 ? 4 - remainder : 0));
for (int i = 0; i <= paddedData.length - 4; i += 4)
{
// skip the checksum field
if (i == this.offsetToOriginalCheckSum)
continue;
// take DWORD into account for computation
final long currentWord = (paddedData[i] & 0xff)
+ ((paddedData[i + 1] & 0xff) << 8)
+ ((paddedData[i + 2] & 0xff) << 16)
+ ((paddedData[i + 3] & 0xff) << 24);
checksum = (checksum & 0xffffffffL) + (currentWord & 0xffffffffL);
if (checksum > max)
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += data.length; // must be original data length
In this case, Java is a bit inconvenient.
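A related pitfall shows up in C# when a dword is read as a signed int and then widened to long, which is the int versus uint problem mentioned in the C# answer. A small sketch of the difference (the class and method names are hypothetical):

```csharp
using System;

static class SignExtensionDemo
{
    // Reads four bytes as a signed int; widening to long keeps the sign.
    public static long AsSigned(byte[] data, int offset)
    {
        return BitConverter.ToInt32(data, offset);
    }

    // Reads the same four bytes as an unsigned int; widening stays positive.
    public static long AsUnsigned(byte[] data, int offset)
    {
        return BitConverter.ToUInt32(data, offset);
    }
}
```

For the bytes FF FF FF FF the signed read contributes -1 to a long sum, while the unsigned read contributes 4294967295.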

The CheckSum field is 32 bits long and is calculated as follows
1. Add all dwords (32 bit pieces) of the entire file to a sum
Add all dwords of the entire file not including the CheckSum field itself, including all headers and all of the contents, to a dword. If the dword overflows, add the overflowed bit back to the first bit (2^0) of the dword.
If the file is not entirely divisible into dwords (4-byte pieces), see step 2.
The best way I know to implement this is the GNU C compiler's integer overflow built-in function __builtin_uadd_overflow.
In the original ChkSum function documented by Jeffrey Walton the sum
was calculated by performing an add (%esi),%eax where
esi contains the base address of the file and eax is 0 and adding the rest of the file like this
adc 0x4(%esi),%eax
adc 0x8(%esi),%eax
adc 0xc(%esi),%eax
adc 0x10(%esi),%eax
...
adc $0x0,%eax
The first add adds the first dword ignoring any carry flag. The next dwords
are added by the adc instruction which does the same thing as add but
adds any carry flag that was set before executing the instruction in addition
to the summand. The last adc $0x0,%eax adds only the last carry flag if it
was set and cannot be discarded.
Please keep in mind that the dword of CheckSum field itself should not be added.
2. Add the remainder to the sum if there is one
If the file is not entirely divisible into dwords, add the remainder as a
zero-padded dword. For example: say your file is 15 bytes long and looks like this
0E 1F BA 0E | 00 B4 09 CD | 21 B8 01 4C | CD 21 54
You need to add the remainder as 0x005421CD to the sum. My system is a
little-endian system. I do not know whether the checksum would change because of
the byte order on big-endian systems, or whether you would just simulate this
behaviour.
I do this by rounding up the buffer_size to the next bytecount divisible by 4
without remainder or put differently: the next whole dword count represented
in bytes. Then I allocate with calloc because it initializes the memory block
with all zeros.
if(buffer_size%4)
{buffer_size+=4-(buffer_size%4);
...
calloc(buffer_size,1)
3. Add the lower word (16 bit piece) and the higher word of the sum together.
sum=(sum&0xffff)+(sum>>16);
4. Add the new higher word once again
sum+=(sum>>16);
5. Only keep the lower word
sum&=0xffff;
6. Add the number of bytes in the file to the sum
return(sum+size);
This is how I wrote it. It is not C#, but C. off_t size is the number of bytes in the file. uint32_t *base is a pointer to the file loaded into memory. The block of memory should be padded with zeros at the end to the next bytecount divisible by 4.
uint32_t pe_header_checksum(uint32_t *base,off_t size)
{uint32_t sum=0;
off_t i;
for(i=0;i<(size/4);i++)
{if(i==0x36)
{continue;}
sum+=__builtin_uadd_overflow(base[i],sum,&sum);}
if(size%4)
{sum+=base[i];}
sum=(sum&0xffff)+(sum>>16);
sum+=(sum>>16);
sum&=0xffff;
return(sum+size);}
If you want you can see the code in action and read a little bit more here.
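For readers working in C#, steps 1 and 2 can be sketched as follows. The add/adc chain is emulated by widening to 64 bits and adding the carry back into bit 0. The names are hypothetical, and skipping the CheckSum field is deliberately left out to keep the sketch short.

```csharp
static class EndAroundCarry
{
    // 32-bit add where a carry out of bit 31 is added back into bit 0.
    public static uint Add(uint sum, uint value)
    {
        ulong wide = (ulong)sum + value;
        return (uint)wide + (uint)(wide >> 32);
    }

    // Sums the buffer as little-endian dwords; a short tail is zero-padded.
    public static uint SumDwords(byte[] data)
    {
        uint sum = 0;
        for (int i = 0; i < data.Length; i += 4)
        {
            uint dword = 0;
            for (int b = 0; b < 4 && i + b < data.Length; b++)
                dword |= (uint)data[i + b] << (8 * b);
            sum = Add(sum, dword);
        }
        return sum;
    }
}
```

Applied to just the three trailing bytes CD 21 54 from the example, SumDwords returns 0x005421CD, the zero-padded remainder described in step 2.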

Related

CRC-16/Modbus Implementation in C# malfunction

I'm currently setting up the communication between a stepper motor controller and a computer, coding an application in C#. It is the first time I have used this language, and I'm an industrial engineer rather than a computer scientist, so I'm sure there are ways of optimizing the function that I don't know about; any recommendations on that would also be much appreciated. I've been using the controller's RS-485 interface to communicate with it, and I've implemented an algorithm that generates the required CRC (Cyclic Redundancy Check) bytes.
And that is where my problem begins. I can't find the reason why my function doesn't generate the correct CRC value. I have checked against some online CRC calculators, and I've also used the example that appears in the Modbus Guide (which also explains how the code is implemented).
Here is the code I've written to calculate the CRC:
class Program
{
static void Main(string[] args)
{
// 0x05, 0x06, 0x17, 0x70, 0x00, 0x01
byte[] prueba = new byte[] { 0x02, 0x07 };
byte[] result = Aux.CRC(prueba);
Console.WriteLine(result[0] + " " + result[1]);
}
}
class Aux{
public static byte[] CRC(byte[] to_evaluate)
{
byte[] CRC_Byte = new byte[2] { 0, 0 };
UInt16 CRC_Register = 0xFFFF; //16 bits 1111.1111.1111.1111
UInt16 CRC_pol = 0xa001; //16 bits 1010.0000.0000.0001
foreach (UInt16 byte_val in to_evaluate)
{
CRC_Register ^= byte_val;
Console.WriteLine("XOR inicial : {0:X}", CRC_Register);
for (byte i = 0; i < 8; i++)
{
CRC_Register >>= 1;
Console.WriteLine("Desplazamiento " + (i + 1) + ": {0:X}", CRC_Register);
if ((CRC_Register & 1) != 0)
{
CRC_Register ^= CRC_pol;
Console.WriteLine("XOR: {0:X}", CRC_Register);
}
}
}
Console.WriteLine("{0:X}",CRC_Register);
byte low_byte_CRC = (byte)((CRC_Register << 8) >> 8);
byte high_byte_CRC = (byte)(CRC_Register >> 8);
CRC_Byte[0] = low_byte_CRC;
CRC_Byte[1] = high_byte_CRC;
return CRC_Byte;
}
}
The expected result using the test array attached and the polynomial 0xa001 is 0x1241 for CRC_Register, and {0x41,0x12} for the CRC_Byte.
I had to implement a CRC check for PPP once in C# and it was absolutely no fun!
I found in this link the code that should correctly generate the CRC. It follows the CRC Generation procedure from section 6.2.2 on page 39 of the document you shared the link to.
// Compute the MODBUS RTU CRC
UInt16 ModRTU_CRC(byte[] buf, int len)
{
UInt16 crc = 0xFFFF;
for (int pos = 0; pos < len; pos++)
{
crc ^= (UInt16)buf[pos]; // XOR byte into least sig. byte of crc
for (int i = 8; i != 0; i--) // Loop over each bit
{
if ((crc & 0x0001) != 0) // If the LSB is set
{
crc >>= 1; // Shift right and XOR 0xA001
crc ^= 0xA001;
}
else // Else LSB is not set
{
crc >>= 1; // Just shift right
}
}
}
// Note, this number has low and high bytes swapped, so use it accordingly (or swap bytes)
return crc;
}
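The difference from the question's code appears to be the order of operations: the question shifts first and then tests the new lowest bit, while the routine above tests the lowest bit before it is shifted out. A compact sketch of that ordering, together with the low-byte-first split the question expects (the class and method names are hypothetical):

```csharp
static class ModbusCrc
{
    public static ushort Compute(byte[] data)
    {
        ushort crc = 0xFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int i = 0; i < 8; i++)
            {
                bool lsbSet = (crc & 1) != 0; // test the bit before shifting
                crc >>= 1;
                if (lsbSet) crc ^= 0xA001;
            }
        }
        return crc;
    }

    // Low byte first, matching the question's CRC_Byte array.
    public static byte[] ToBytes(ushort crc)
    {
        return new byte[] { (byte)(crc & 0xFF), (byte)(crc >> 8) };
    }
}
```

For the test array { 0x02, 0x07 } this yields 0x1241, which splits into { 0x41, 0x12 }, the values the question expects.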

CRC 4 implementation for C#

Solved by this code -> https://gist.github.com/Sbreitzke/b26107798eee74e39ff85800abf71fb1
I searched the web for a CRC 4 implementation in C# because I have to calculate a checksum by
Changing the numbers of the barcode into Hex representation, then to bytes and then to bits and then calculate a CRC4 checksum on the bit stream.
I already found this question from 8 years ago without an answer
CRC-4 implementation in C#.
I tried changing the CRC 8 and 16 implementations to CRC 4 but they don't quite get the result I require.
0130E0928270FFFFFFF should evaluate to 7.
I found two C implementation but was unable to convert them to C#. For example this one:
short[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
/**
* crc4 - calculate the 4-bit crc of a value.
* @crc: starting crc4
* @x: value to checksum
* @bits: number of bits in @x to checksum
*
* Returns the crc4 value of @x, using polynomial 0b10111.
*
* The @x value is treated as left-aligned, and bits above @bits are ignored
* in the crc calculations.
*/
short crc4(uint8_t c, uint64_t x, int bits)
{
int i;
/* mask off anything above the top bit */
x &= (1ull << bits) -1;
/* Align to 4-bits */
bits = (bits + 3) & ~0x3;
/* Calculate crc4 over four-bit nibbles, starting at the MSbit */
for (i = bits - 4; i >= 0; i -= 4)
c = crc4_tab[c ^ ((x >> i) & 0xf)];
return c;
}
My current generation code (unit test) looks like this:
[TestMethod]
public void x()
{
var ordnungskennzeichen = 01;
var kundennummer = 51251496;
var einlieferungsbel = 9999;
var sendungsnr = 16777215;
var hex_ordnungskennzeichen = ordnungskennzeichen.ToString("x2");
var hex_kundennummer = kundennummer.ToString("x2");
var hex_einlieferungsbel = einlieferungsbel.ToString("x2");
var hex_sendungsnr = sendungsnr.ToString("x2");
var complete = hex_ordnungskennzeichen + hex_kundennummer + hex_einlieferungsbel + hex_sendungsnr;
var bytes = Encoding.ASCII.GetBytes(complete);
//var computeChecksum = crc4(???);
// Console.WriteLine(computeChecksum);
}
short[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
/**
* crc4 - calculate the 4-bit crc of a value.
* @crc: starting crc4
* @x: value to checksum
* @bits: number of bits in @x to checksum
*
* Returns the crc4 value of @x, using polynomial 0b10111.
*
* The @x value is treated as left-aligned, and bits above @bits are ignored
* in the crc calculations.
*/
short crc4(byte c, ulong x, int bits)
{
int i;
/* mask off anything above the top bit */
x &= ((ulong)1 << bits) -1;
/* Align to 4-bits */
bits = (bits + 3) & ~0x3;
/* Calculate crc4 over four-bit nibbles, starting at the MSbit */
for (i = bits - 4; i >= 0; i -= 4)
c = (byte) crc4_tab[c ^ ((x >> i) & 0xf)];
return c;
}
Converting it to C# is not very hard. c is the initial or previous nibble (4-bit number), x is the 64-bit number you want to calculate the crc4 of, and bits is the number of bits in that 64-bit number to actually use (the rest are ignored). Since you have an array of bytes, you don't need a 64-bit number as x; you can just use a byte. The first two lines are then irrelevant for you, because all they do is throw away irrelevant bits from the 64-bit number and ensure bits is divisible by 4. So after removing the irrelevant lines, your implementation becomes:
static readonly byte[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
static byte crc4(byte c, byte x) {
var low4Bits = x & 0x0F;
var high4Bits = x >> 4;
c = crc4_tab[c ^ high4Bits];
c = crc4_tab[c ^ low4Bits];
return c;
}
static byte crc4(byte[] array) {
byte start = 0;
foreach (var item in array) {
start = crc4(start, item);
}
return start;
}
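Because the table works on one nibble at a time, a hex string can also be fed through it digit by digit, which sidesteps two things in the question: Encoding.ASCII.GetBytes returns character codes rather than the hex values, and the sample input has an odd number of hex digits. A sketch under those assumptions, with hypothetical names; it uses the table's polynomial 0b10111 and is not claimed to reproduce the Deutsche Post check digit:

```csharp
using System;

static class Crc4Hex
{
    static readonly byte[] Table = {
        0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
        0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
    };

    // Feeds one hex digit (4 bits) at a time through the lookup table.
    public static byte Compute(string hex)
    {
        byte crc = 0;
        foreach (char ch in hex)
        {
            int nibble = Convert.ToInt32(ch.ToString(), 16);
            crc = Table[crc ^ nibble];
        }
        return crc;
    }
}
```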
After further testing and communication with the Deutsche Post AG we made a correct implementation (for the purpose of Deutsche Post at least):
https://gist.github.com/Sbreitzke/b26107798eee74e39ff85800abf71fb1
For the purpose of Deutsche Post as well, I'd like to contribute a less complex algorithm, which may also be easier to translate into other languages:
private string crc4(string sText) {
int iCRC;
int iPoly;
int iByte;
int iBit;
byte[] bText;
sText = sText.Replace(" ", "");
iPoly = 0x13 << 3;
iCRC = 0;
bText = Encoding.Default.GetBytes(sText);
for (iByte=0; iByte < bText.Length; iByte++){
iCRC = iCRC ^ bText[iByte];
for (iBit = 0; iBit < 8; iBit++){
if ((iCRC & 0x80) != 0){
iCRC = iCRC ^ iPoly;
}
iCRC = iCRC << 1;
}
}
iCRC = iCRC >> 4;
return String.Format("{0:X}", iCRC);
}
Fed with e.g. "A0 0101 002B 00 000C D10", the above code will calculate "F" as the correct check digit (tested with numerous other input values as well).

Encoding an array of bytes similar to Base64, but with arbitrary radix

Does the procedure have a name, where you take a stream of 8-bit bytes and slice them into n-bit snippets stored in 8-bit containers?
The idea is very similar to Base64 encoding, where you split the stream of 1's and 0's into 6-bit chunks (instead of 8), meaning each chunk can have a decimal value of 0 - 63, each of which is assigned a unique human-readable character. In my case, I'm not looking to assign each chunk a specific character.
For example, the input 8-bit bytes:
11100101 01101100 01010011 00001100 11000000 10111101
become the 6-bit snippets:
111001 010110 110001 010011 000011 001100 000010 111101
which are subsequently stored as:
00111001 00010110 00110001 00010011 00000011 00001100 00000010 00111101
or, optionally, with an offset of 1 bit:
01110010 00101100 01100010 00100110 00000110 00011000 00000100 01111010
or an offset of 2 bits:
11100100 01011000 11000100 01001100 00001100 00110000 00001000 11110100
I was looking to write an algorithm in C# to encode a byte array to an arbitrary length with arbitrary offset, and another algorithm to convert it back again.
After quite a lot of headache, I thought I had successfully written the forward algorithm to encode an array of bytes. It worked for all my test cases, but when started writing the reverse algorithm I realised the whole problem was a lot more complicated than I thought it would be, and, in fact, my forward algorithm didn't work where n < 4.
I wanted to write the algorithms with bitwise operators, which is the more proper and elegant solution. The other way would have been to dump the byte array as a long string of 1's and 0's to slice, but that would have been much, much slower.
Here is my forward algorithm that works for cases where n >= 4:
public static byte[] EncodeForward(byte[] input, int n, int offset = 0)
{
byte[] output = new byte[(int)Math.Ceiling(input.Length * 8.0 / n)];
output[0] = (byte)(input[0] >> (8 - n));
int p = 1;
int r = 8 - n;
for (int i = 1; i < input.Length; i++)
{
output[p++] = (byte)((byte)((byte)(input[i - 1] << (8 - r)) | (byte)(input[i] >> r)) >> (8 - n - offset));
if ((r += (8 - n)) == n)
{
output[p++] = (byte)(input[i] & (byte)(0xFF >> (8 - n)));
r = 0;
}
}
return output;
}
I originally conceived it for just the case of n = 7, so each output byte would be composed by parts of at most 7 input bytes. However in the case where n < 4, each output byte would be composed by up to, I think, ceil(8/n) input bytes, so the process is a little more complex than above.
I was hoping to write the forward and reverse algorithms myself, but, honestly, after all this time debugging and testing what I've written and now finding this approach will never work for n < 4, I'm just looking for something that works. These two algorithms are just a very small piece of the project I'm working on.
Does this encoding/decoding procedure have a name, and is there either a built-in way to do it in C# or is there a library that will do it?
You are almost there. You just need an intermediate 16-bit buffer and an unprocessed bits counter. Disclaimer: I don't know C#. The (pseudo) code below is written with C in mind; you may need some tweaking.
For encoding,
uint16_t mask = 0xffff << (16 - width);
uint16_t buffer = (input[0] << 8) | input[1];
i += 2;
int remaining = 16;
while (i < input.Length) {
while (remaining >= width) {
output[p++] = (buffer & mask) >> (16 - width);
buffer <<= width;
remaining -= width;
}
// Refill the buffer. Since it is 16-bit wide there is a room
// for an _entire_ input byte.
buffer |= input[i++] << (8 - remaining);
remaining += 8;
}
emit_remaining_bits(buffer, remaining);
For decoding:
uint16_t buffer = 0;
int remaining = 16;
while (i < input.Length) {
while (remaining > 8) {
buffer |= input[i++] << (remaining - width);
remaining -= width;
}
output[p++] = (buffer >> 8) & 0x00ff;
buffer <<= 8;
remaining += 8;
}
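The buffering idea can be sketched in C# along these lines. This is one possible arrangement, not a translation of the pseudocode: it keeps unprocessed bits right-aligned in a uint, supports 1 to 8 bits per chunk, and takes the decoded byte count as a parameter because trailing padding bits cannot be told apart from data. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class BitRepack
{
    // Splits bytes into n-bit chunks, most significant bit first. Each chunk
    // is stored in its own byte, shifted left by `offset` bits.
    public static byte[] Encode(byte[] input, int n, int offset = 0)
    {
        if (n < 1 || n > 8 || offset < 0 || offset > 8 - n)
            throw new ArgumentOutOfRangeException();
        var output = new List<byte>();
        uint buffer = 0;              // unprocessed bits, right-aligned
        int bits = 0;                 // number of valid bits in buffer
        uint mask = (1u << n) - 1;
        foreach (byte b in input)
        {
            buffer = (buffer << 8) | b;
            bits += 8;
            while (bits >= n)
            {
                bits -= n;
                output.Add((byte)(((buffer >> bits) & mask) << offset));
            }
        }
        if (bits > 0)                 // last partial chunk, zero-padded on the right
            output.Add((byte)(((buffer << (n - bits)) & mask) << offset));
        return output.ToArray();
    }

    // Reverses Encode; byteCount is the length of the original input.
    public static byte[] Decode(byte[] chunks, int n, int byteCount, int offset = 0)
    {
        var output = new byte[byteCount];
        uint buffer = 0;
        int bits = 0, p = 0;
        uint mask = (1u << n) - 1;
        foreach (byte c in chunks)
        {
            buffer = (buffer << n) | (((uint)c >> offset) & mask);
            bits += n;
            if (bits >= 8 && p < byteCount)
            {
                bits -= 8;
                output[p++] = (byte)(buffer >> bits);
            }
        }
        return output;
    }
}
```

With n = 6, the six input bytes from the question encode to 39 16 31 13 03 0C 02 3D (hex), matching the 6-bit example in the question.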

using bitarray to grab bits and build new value

If I take a uint value = 2921803 (0x2C954B), which is really a 4-byte package (4B 95 2C 00),
and I want to get the 16 least significant bits of the byte version of it using BitArray, how would I go about it?
This is how I am trying to do it:
byte[] bytes = BitConverter.GetBytes(value); //4B 95 2C 00 - bytes are moved around
BitArray bitArray = new BitArray(bytes); //entry [0] shows value for 1101 0010 (bits are reversed)
At this point, I am all turned around. I did try this:
byte[] bytes = BitConverter.GetBytes(value);
Array.Reverse(bytes);
BitArray bitArray = new BitArray(bytes);
Which gave me all the bits but completely reversed, reading from [31] to [0].
Ultimately, I'm expecting/hoping to get 19349 (4B 95) as my answer.
This is how I was hoping to implement the function:
private uint GetValue(uint value, int bitsToGrab, int bitsToMoveOver)
{
byte[] bytes = BitConverter.GetBytes(value);
BitArray bitArray = new BitArray(bytes);
uint outputMask = (uint)(1 << (bitsToGrab - 1));
//now that i have all the bits, i can offset, and grab the ones i want
for (int i = bitsToMoveOver; i < bitsToGrab; i++)
{
if ((Convert.ToByte(bitArray[i]) & 1) > 0)
{
outputVal |= outputMask;
}
outputMask >>= 1;
}
}
The 16 least significant bits of 0x2C954B are 0x954B. You can get that as follows:
int value = 0x2C954B;
int result = value & 0xFFFF;
// result == 0x954B
If you want 0x4B95 then you can get that as follows:
int result = ((value & 0xFF) << 8) | ((value >> 8) & 0xFF);
// result == 0x4B95
Try this:
uint value = 0x002C954Bu;
int reversed = Reverse((int)value);
// reversed == 0x4B952C00;
int result = Extract(reversed, 16, 16);
// result == 0x4B95
with
int Extract(int value, int offset, int length)
{
return (value >> offset) & ((1 << length) - 1);
}
int Reverse(int value)
{
return ((value >> 24) & 0xFF) | ((value >> 8) & 0xFF00) |
((value & 0xFF00) << 8) | ((value & 0xFF) << 24);
}
A uint is 32 bits.
Basically, you should set the 16 most significant bits to zero, so use the bitwise AND operator:
uint newValue = 0x0000FFFF & uintValue;
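The GetValue function from the question can be sketched with shifts and masks alone, without a BitArray. This is an illustration with hypothetical names; the byte swap is included because the question expects 4B 95 rather than 95 4B.

```csharp
static class BitField
{
    // Returns `count` bits of value, starting at bit `shift`
    // (bit 0 is the least significant bit).
    public static uint Extract(uint value, int shift, int count)
    {
        uint mask = count >= 32 ? uint.MaxValue : (1u << count) - 1;
        return (value >> shift) & mask;
    }

    // Swaps the two bytes of a 16-bit value.
    public static ushort SwapBytes(ushort value)
    {
        return (ushort)((value << 8) | (value >> 8));
    }
}
```

Extract(0x2C954B, 0, 16) gives 0x954B, and swapping its bytes gives 0x4B95, which is 19349.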

Convert CRC-CCITT Kermit 16 DELPHI code to C#

I am working on a function that will give me a Kermit CRC value from a HEX string. I have a piece of code in DELPHI. I am a .NET developer and need the code in C#.
function CRC_16(cadena : string):word;
var
valuehex : word;
i: integer;
CRC : word;
Begin
CRC := 0;
for i := 1 to length(cadena) do
begin
valuehex := ((ord(cadena[i]) XOR CRC) AND $0F) * $1081;
CRC := CRC SHR 4;
CRC := CRC XOR valuehex;
valuehex := (((ord(cadena[i]) SHR 4) XOR LO(CRC)) AND $0F);
CRC := CRC SHR 4;
CRC := CRC XOR (valuehex * $1081);
end;
CRC_16 := (LO(CRC) SHL 8) OR HI(CRC);
end;
I got the code from this webpage: Kermit CRC in DELPHI
I guess that Delphi function is correct. If anyone could convert the code to C#, that would be great. I tried to convert it to C#, but got lost in the WORD data type and the LO function of Delphi. Thank you all.
From MSDN forums:
static long ComputeCRC(byte[] val)
{
long crc;
long q;
byte c;
crc = 0;
for (int i = 0; i < val.Length; i++)
{
c = val[i];
q = (crc ^ c) & 0x0f;
crc = (crc >> 4) ^ (q * 0x1081);
q = (crc ^ (c >> 4)) & 0xf;
crc = (crc >> 4) ^ (q * 0x1081);
}
return (byte)crc << 8 | (byte)(crc >> 8);
}
Use Encoding.ASCII.GetBytes(string) to convert a string to a byte[].
A word is a 16-bit unsigned integer (which can store the values 0..65535).
Lo returns the low-order byte of an integer. So if the integer is 0x7B41AF, for example, lo will return 0xAF.
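With those two definitions, the Delphi function can also be ported more literally using ushort for word. A sketch with hypothetical names; like the Delphi code, it returns the CRC with its low and high bytes swapped:

```csharp
static class KermitCrc
{
    public static ushort Compute(byte[] data)
    {
        ushort crc = 0;
        foreach (byte c in data)
        {
            // Low nibble of the input byte.
            int q = (crc ^ c) & 0x0F;
            crc = (ushort)((crc >> 4) ^ (q * 0x1081));
            // High nibble of the input byte.
            q = (crc ^ (c >> 4)) & 0x0F;
            crc = (ushort)((crc >> 4) ^ (q * 0x1081));
        }
        byte lo = (byte)(crc & 0xFF);   // Delphi's LO(CRC)
        byte hi = (byte)(crc >> 8);     // Delphi's HI(CRC)
        return (ushort)((lo << 8) | hi);
    }
}
```

For the ASCII bytes of "123456789", the unswapped register ends at 0x2189, the usual Kermit check value, so this function returns 0x8921.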
