Solved by this code -> https://gist.github.com/Sbreitzke/b26107798eee74e39ff85800abf71fb1
I searched the web for a CRC-4 implementation in C# because I have to calculate a checksum by converting the numbers of the barcode into their hex representation, then to bytes, then to bits, and then calculating a CRC-4 checksum over the bit stream.
I already found this question from 8 years ago, which has no answer: "CRC-4 implementation in C#".
I tried changing the CRC 8 and 16 implementations to CRC 4 but they don't quite get the result I require.
0130E0928270FFFFFFF should evaluate to 7.
I found two C implementations but was unable to convert them to C#. For example, this one:
short[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
/**
* crc4 - calculate the 4-bit crc of a value.
* @crc: starting crc4
* @x: value to checksum
* @bits: number of bits in @x to checksum
*
* Returns the crc4 value of @x, using polynomial 0b10111.
*
* The @x value is treated as left-aligned, and bits above @bits are ignored
* in the crc calculations.
*/
short crc4(uint8_t c, uint64_t x, int bits)
{
int i;
/* mask off anything above the top bit */
x &= (1ull << bits) -1;
/* Align to 4-bits */
bits = (bits + 3) & ~0x3;
/* Calculate crc4 over four-bit nibbles, starting at the MSbit */
for (i = bits - 4; i >= 0; i -= 4)
c = crc4_tab[c ^ ((x >> i) & 0xf)];
return c;
}
My current generation code (unit test) looks like this:
[TestMethod]
public void x()
{
var ordnungskennzeichen = 01;
var kundennummer = 51251496;
var einlieferungsbel = 9999;
var sendungsnr = 16777215;
var hex_ordnungskennzeichen = ordnungskennzeichen.ToString("x2");
var hex_kundennummer = kundennummer.ToString("x2");
var hex_einlieferungsbel = einlieferungsbel.ToString("x2");
var hex_sendungsnr = sendungsnr.ToString("x2");
var complete = hex_ordnungskennzeichen + hex_kundennummer + hex_einlieferungsbel + hex_sendungsnr;
var bytes = Encoding.ASCII.GetBytes(complete);
//var computeChecksum = crc4(???);
// Console.WriteLine(computeChecksum);
}
short[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
/**
* crc4 - calculate the 4-bit crc of a value.
* @crc: starting crc4
* @x: value to checksum
* @bits: number of bits in @x to checksum
*
* Returns the crc4 value of @x, using polynomial 0b10111.
*
* The @x value is treated as left-aligned, and bits above @bits are ignored
* in the crc calculations.
*/
short crc4(byte c, ulong x, int bits)
{
int i;
/* mask off anything above the top bit */
x &= ((ulong)1 << bits) -1;
/* Align to 4-bits */
bits = (bits + 3) & ~0x3;
/* Calculate crc4 over four-bit nibbles, starting at the MSbit */
for (i = bits - 4; i >= 0; i -= 4)
c = (byte) crc4_tab[c ^ ((x >> i) & 0xf)];
return c;
}
Converting it to C# is not very hard. c is the initial or previous nibble (a 4-bit number), x is the 64-bit number you want to calculate the CRC-4 of, and bits is the number of bits in that 64-bit number to actually use (the rest are ignored). Since you have an array of bytes, you don't need a 64-bit number as x; you can just use a byte. The first two lines are then irrelevant for you, because all they do is throw away the unused bits of the 64-bit number and ensure that bits is divisible by 4. After removing the irrelevant lines, your implementation becomes:
static readonly byte[] crc4_tab = {
0x0, 0x7, 0xe, 0x9, 0xb, 0xc, 0x5, 0x2,
0x1, 0x6, 0xf, 0x8, 0xa, 0xd, 0x4, 0x3,
};
static byte crc4(byte c, byte x) {
var low4Bits = x & 0x0F;
var high4Bits = x >> 4;
c = crc4_tab[c ^ high4Bits];
c = crc4_tab[c ^ low4Bits];
return c;
}
static byte crc4(byte[] array) {
byte start = 0;
foreach (var item in array) {
start = crc4(start, item);
}
return start;
}
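For reference, the lookup table above can be reproduced from the polynomial named in the C comment (0b10111). The following is a minimal sketch, not part of the original answer; the names Crc4TableSketch, Crc4OfNibble and BuildTable are made up for illustration. It runs a plain bit-by-bit CRC over each of the 16 possible nibbles, most significant bit first:

```csharp
using System;

static class Crc4TableSketch
{
    // Polynomial 0b10111 (x^4 + x^2 + x + 1), as stated in the C comment.
    const int Poly = 0x17;

    // Bitwise CRC-4 of a single nibble, MSB first: shift left four times
    // and XOR in the polynomial whenever bit 4 becomes set.
    public static byte Crc4OfNibble(int nibble)
    {
        int reg = nibble & 0xF;
        for (int bit = 0; bit < 4; bit++)
        {
            reg <<= 1;
            if ((reg & 0x10) != 0)
                reg ^= Poly;
        }
        return (byte)reg;
    }

    // Builds the 16-entry table: one CRC per possible nibble value.
    public static byte[] BuildTable()
    {
        var table = new byte[16];
        for (int i = 0; i < 16; i++)
            table[i] = Crc4OfNibble(i);
        return table;
    }
}
```

Under these assumptions, Crc4TableSketch.BuildTable() returns the same 16 values as crc4_tab, which is a cheap way to check a hand-copied table for typos.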
After further testing and communication with the Deutsche Post AG we made a correct implementation (for the purpose of Deutsche Post at least):
https://gist.github.com/Sbreitzke/b26107798eee74e39ff85800abf71fb1
For the purpose of Deutsche Post as well, I'd like to contribute a rather less complex algorithm which may be more easily translated into other languages:
private string crc4(string sText) {
int iCRC;
int iPoly;
int iByte;
int iBit;
byte[] bText;
sText = sText.Replace(" ", "");
iPoly = 0x13 << 3;
iCRC = 0;
bText = Encoding.Default.GetBytes(sText);
for (iByte=0; iByte < bText.Length; iByte++){
iCRC = iCRC ^ bText[iByte];
for (iBit = 0; iBit < 8; iBit++){
if ((iCRC & 0x80) != 0){
iCRC = iCRC ^ iPoly;
}
iCRC = iCRC << 1;
}
}
iCRC = iCRC >> 4;
return String.Format("{0:X}", iCRC);
}
Fed with, e.g., "A0 0101 002B 00 000C D10", the above code calculates "F" as the correct check digit (it was tested with numerous other input values as well).
I need to convert a byte to 4 bits so it can be used as a color.
byte input;
byte r = //the first and second bits of input ;
byte g = //the third and forth bits of input ;
byte b = //the fifth and sixth bits of input ;
Color32 output = new Color32(r,g,b);
I tried working with bitwise operators, but I am not very good at them.
You can use bitwise operators.
byte input = ...;
r = input & 0x3; // bits 0x1 + 0x2
g =( input & 0xc) >> 2; // bits 0x4 + 0x8
b = (input & 0x30) >> 4; //bits 0x10 + 0x20
The bitwise operator & performs a bitwise AND on the input; >> shifts a number to the right by the given number of bits.
Or, if by "first and second" bits you mean the highest two bits, you can get them as follows:
r = input >> 6;
g = (input >> 4) & 0x3;
b = (input >> 2) & 0x3;
You probably want 11 binary to map to 255 and 00 to map to 0 to get a maximum spread in color values.
You can get that spread by multiplying the 2-bit color value by 85: 00b stays 0, 01b becomes 85, 10b becomes 170 and 11b becomes 255.
So the code would look something like this
byte input = 0xfc;
var r = ((input & 0xc0) >> 6) * 85;
var g = ((input & 0x30) >> 4) * 85;
var b = ((input & 0x0c) >> 2) * 85;
Console.WriteLine($"{r} {g} {b}");
If I inputted 0xffffffff then the output must be:
A: 255
R: 255
G: 255
B: 255
I can't find any tutorials for converting this. Thanks!
You can use the Color structure (from the .NET System.Drawing assembly) to parse this:
using System;
using System.Drawing;
void Main()
{
var c = Color.FromArgb(unchecked((int)0xaa336539));
Console.WriteLine("Alpha: {0}", c.A);
Console.WriteLine("Red: {0}", c.R);
Console.WriteLine("Green: {0}", c.G);
Console.WriteLine("Blue: {0}", c.B);
}
which produces the following output:
Alpha: 170
Red: 51
Green: 101
Blue: 57
Use shifting and masking (although some prefer using a / 256 for the shift and a % 256 for the mask):
unsigned long x = 0xaa336539;
// Note the LSB to MSB order
//mask
unsigned char b = x & 0xff;
//shift
x >>= 8;
//mask
unsigned char g = x & 0xff;
//shift
x >>= 8;
//mask
unsigned char r = x & 0xff;
//shift
x >>= 8;
//mask
unsigned char a = x & 0xff;
// Technically, just saving it into an 8 bit wide container is the same as the masking, although some compilers might warn you
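Since the question is about C#, here is a rough C# equivalent of the shift-and-mask steps above. It is only a sketch: ArgbSketch and Unpack are hypothetical names, and the value is assumed to be laid out as 0xAARRGGBB.

```csharp
using System;

static class ArgbSketch
{
    // Splits a 32-bit 0xAARRGGBB value into its four 8-bit channels.
    // Each channel is shifted down to the low byte and then masked.
    public static (byte A, byte R, byte G, byte B) Unpack(uint argb)
    {
        byte a = (byte)((argb >> 24) & 0xFF);
        byte r = (byte)((argb >> 16) & 0xFF);
        byte g = (byte)((argb >> 8) & 0xFF);
        byte b = (byte)(argb & 0xFF);
        return (a, r, g, b);
    }
}
```

For example, ArgbSketch.Unpack(0xaa336539) gives A = 170, R = 51, G = 101, B = 57, and 0xffffffff gives 255 for all four channels.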
// Original input
var input = "0xaa336539";
// Gets aa336539
var inputRemovePrefix = input.Substring(2);
// Converts to a long
var numberConversion = long.Parse(inputRemovePrefix, System.Globalization.NumberStyles.HexNumber);
// Converts to an 8-character hex string so the Substring calls below always work
var convertedInput = numberConversion.ToString("X8");
var aVal = int.Parse(convertedInput.Substring(0,2), System.Globalization.NumberStyles.HexNumber);
var rVal = int.Parse(convertedInput.Substring(2,2), System.Globalization.NumberStyles.HexNumber);
var gVal = int.Parse(convertedInput.Substring(4,2), System.Globalization.NumberStyles.HexNumber);
var bVal = int.Parse(convertedInput.Substring(6,2), System.Globalization.NumberStyles.HexNumber);
// Prints result
Console.WriteLine($"A: {aVal} R: {rVal} G: {gVal} B: {bVal}");
In C#, how would I go about setting 2 bytes where the first 10 bits represent one decimal value and the next 6 represent a different decimal value?
So if the first value was '8' (first 10 bits) and the second '2' (remaining 6 bits), I need to end up with '0000001000 000010' inside a byte array.
Thanks!
UInt16 val1 = 8;
UInt16 val2 = 2;
UInt16 combined = (UInt16)((val1 << 6) | val2);
If you need it in a byte array, you can pass the result to the BitConverter.GetBytes method.
byte[] array = BitConverter.GetBytes(combined);
int val1 = 8;
int val2 = 2;
// First byte contains all but the 2 least significant bits from the first value.
byte byte1 = (byte)(val1 >> 2);
// Second byte contains the 2 least significant bits from the first value,
// shifted 6 bits left to become the 2 most significant bits of the byte,
// followed by the (at most 6) bits of the second value.
byte byte2 = (byte)((val1 & 3) << 6 | val2);
byte[] bytes = new byte[] { byte1, byte2 };
// Just for verification.
string s =
Convert.ToString(byte1, 2).PadLeft(8, '0') + " " +
Convert.ToString(byte2, 2).PadLeft(8, '0');
Not accounting for any kind of overflow:
private static byte[] amend(int a, int b)
{
// Combine the datum into a 16 bits integer
var c = (ushort) ((a << 6) | (b));
// Fragment the Int to bytes
var ret = new byte[2];
ret[0] = (byte) (c >> 8);
ret[1] = (byte) (c);
return ret;
}
ushort value = (8 << 6) | 2;
byte[] bytes = BitConverter.GetBytes(value);
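One thing to keep in mind with the BitConverter-based answers: BitConverter.GetBytes returns the bytes in the machine's byte order (see BitConverter.IsLittleEndian), so the byte that carries the top of the 10-bit value is not guaranteed to come first. The sketch below builds the two bytes explicitly, high byte first, and also shows the reverse direction. PackSketch, Pack and Unpack are illustrative names, and the range checks are an assumption about what "overflow" handling you might want.

```csharp
using System;

static class PackSketch
{
    // Packs a 10-bit value and a 6-bit value into two bytes,
    // with the 10-bit value in the most significant bits.
    public static byte[] Pack(int tenBit, int sixBit)
    {
        if (tenBit < 0 || tenBit > 0x3FF)
            throw new ArgumentOutOfRangeException(nameof(tenBit));
        if (sixBit < 0 || sixBit > 0x3F)
            throw new ArgumentOutOfRangeException(nameof(sixBit));

        int combined = (tenBit << 6) | sixBit;   // 16 bits in total
        return new[] { (byte)(combined >> 8), (byte)(combined & 0xFF) };
    }

    // Reverses Pack: rebuilds the 16-bit value, then splits it again.
    public static (int TenBit, int SixBit) Unpack(byte[] bytes)
    {
        int combined = (bytes[0] << 8) | bytes[1];
        return (combined >> 6, combined & 0x3F);
    }
}
```

With the values from the question, PackSketch.Pack(8, 2) yields the bytes 0x02 0x02, i.e. 0000001000 000010 when read as 10 + 6 bits.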
In order to utilize a byte to its fullest potential, I'm attempting to store two unique values into a byte: one in the first four bits and another in the second four bits. However, I've found that, while this practice allows for optimized memory allocation, it makes changing the individual values stored in the byte difficult.
In my code, I want to change the first set of four bits in a byte while maintaining the value of the second four bits in the same byte. While bitwise operations allow me to easily retrieve and manipulate the first four bit values, I'm finding it difficult to concatenate this new value with the second set of four bits in a byte. The question is, how can I erase the first four bits from a byte (or, more accurately, set them all to zero) and add the new set of 4 bits to replace the four bits that were just erased, thus preserving the last 4 bits in a byte while changing the first four?
Here's an example:
// Changes the first four bits in a byte to the parameter value
public void changeFirstFourBits(byte newFirstFour)
{
// If 'newFirstFour' is 0101 in binary, make 'value' 01011111 in binary, changing
// the first four bits but leaving the second four alone.
}
private byte value = 255; // binary: 11111111
Use bitwise AND (&) to clear out the old bits, shift the new bits to the correct position and bitwise OR (|) them together:
value = (byte)((value & 0xF) | (newFirstFour << 4));
Here's what happens:
value : abcdefgh
newFirstFour : 0000xyzw
0xF : 00001111
value & 0xF : 0000efgh
newFirstFour << 4 : xyzw0000
(value & 0xF) | (newFirstFour << 4) : xyzwefgh
When I have to do bit-twiddling like this, I make a readonly struct to do it for me. A four-bit integer is called a nybble, of course:
struct TwoNybbles
{
private readonly byte b;
public byte High { get { return (byte)(b >> 4); } }
public byte Low { get { return (byte)(b & 0x0F); } }
public TwoNybbles(byte high, byte low)
{
this.b = (byte)((high << 4) | (low & 0x0F));
}
}
And then add implicit conversions between TwoNybbles and byte. Now you can just treat any byte as having a High and a Low nybble without putting all that ugly bit twiddling in your mainline code.
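The implicit conversions are not spelled out in the answer, so here is one way they might look. This is a sketch under my own assumptions: the WithHigh helper is hypothetical, and the struct is repeated in full so the example compiles on its own.

```csharp
using System;

readonly struct TwoNybbles
{
    private readonly byte b;

    public byte High => (byte)(b >> 4);
    public byte Low => (byte)(b & 0x0F);

    public TwoNybbles(byte high, byte low)
    {
        // Mask both inputs so a value above 15 cannot spill into the other nybble.
        b = (byte)(((high & 0x0F) << 4) | (low & 0x0F));
    }

    // byte -> TwoNybbles: split the byte into its two halves.
    public static implicit operator TwoNybbles(byte value)
        => new TwoNybbles((byte)(value >> 4), (byte)(value & 0x0F));

    // TwoNybbles -> byte: hand back the packed byte.
    public static implicit operator byte(TwoNybbles n) => n.b;

    // Hypothetical helper: returns a copy with the high nybble replaced.
    public TwoNybbles WithHigh(byte high) => new TwoNybbles(high, Low);
}
```

With that in place, the operation from the question reads as `TwoNybbles n = (byte)255; byte result = n.WithHigh(5);`, which leaves result at 0x5F (binary 01011111).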
You first mask out the high four bits using value & 0x0F. Then you shift the new bits into the high four bits using newHighFour << 4, and finally you combine them using a bitwise OR.
public void changeHighFourBits(byte newHighFour)
{
value=(byte)( (value & 0x0F) | (newHighFour << 4));
}
public void changeLowFourBits(byte newLowFour)
{
value=(byte)( (value & 0xF0) | newLowFour);
}
I'm not really sure what your method there is supposed to do, but here are some methods for you:
void setHigh(ref byte b, byte val) {
b = (byte)((b & 0xf) | (val << 4));
}
byte high(byte b) {
return (byte)((b & 0xf0) >> 4);
}
void setLow(ref byte b, byte val) {
b = (byte)((b & 0xf0) | val);
}
byte low(byte b) {
return (byte)(b & 0xf);
}
Should be self-explanatory.
public int SplatBit(int Reg, int Val, int ValLen, int Pos)
{
int mask = ((1 << ValLen) - 1) << Pos;
int newv = Val << Pos;
int res = (Reg & ~mask) | newv;
return res;
}
Example:
Reg = 135
Val = 9 (ValLen = 4, because 9 = 1001)
Pos = 2
135 = 10000111
9 = 1001
9 << Pos = 100100
Result = 10100111
A quick look would indicate that a bitwise AND can be achieved using the & operator. So to clear the first four bits you should be able to do:
byte value1=255; //11111111
byte value2=15; //00001111
return value1&value2;
Assuming newVal contains the value you want to store in origVal.
Do this for the 4 least significant bits:
byte origVal = ???;
byte newVal = ???;
origVal = (byte)((origVal & 0xF0) + newVal);
and this for the 4 most significant bits:
byte origVal = ???;
byte newVal = ???;
origVal = (byte)((origVal & 0xF) + (newVal << 4));
I know you asked specifically about clearing out the first four bits, which has been answered several times, but I wanted to point out that if you have two values <= decimal 15, you can combine them into 8 bits simply with this:
public int setBits(int upperFour, int lowerFour)
{
return upperFour << 4 | lowerFour;
}
The result will be xxxxyyyy where
xxxx = upperFour
yyyy = lowerFour
And that is what you seem to be trying to do.
Here's some code, but I think the earlier answers will do it for you. This is just some test code to copy and paste into a simple console project (the WriteBits method may be of help):
static void Main(string[] args)
{
int b1 = 255;
WriteBits(b1);
int b2 = b1 >> 4;
WriteBits(b2);
int b3 = b1 & ~0xF ;
WriteBits(b3);
// Store 5 in first nibble
int b4 = 5 << 4;
WriteBits(b4);
// Store 8 in second nibble
int b5 = 8;
WriteBits(b5);
// Store 5 and 8 in first and second nibbles
int b6 = 0;
b6 |= (5 << 4) + 8;
WriteBits(b6);
// Store 2 and 4
int b7 = 0;
b7 = StoreFirstNibble(2, b7);
b7 = StoreSecondNibble(4, b7);
WriteBits(b7);
// Read First Nibble
int first = ReadFirstNibble(b7);
WriteBits(first);
// Read Second Nibble
int second = ReadSecondNibble(b7);
WriteBits(second);
}
static int ReadFirstNibble(int storage)
{
return storage >> 4;
}
static int ReadSecondNibble(int storage)
{
return storage &= 0xF;
}
static int StoreFirstNibble(int val, int storage)
{
return storage |= (val << 4);
}
static int StoreSecondNibble(int val, int storage)
{
return storage |= val;
}
static void WriteBits(int b)
{
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(b),0));
}
}
I would like to implement this in C#
I have looked here:
http://www.codeproject.com/KB/cpp/PEChecksum.aspx
And am aware of the ImageHlp.dll MapFileAndCheckSum function.
However, for various reasons, I would like to implement this myself.
The best I have found is here:
http://forum.sysinternals.com/optional-header-checksum-calculation_topic24214.html
But, I don't understand the explanation. Can anyone clarify how the checksum is calculated?
Thanks!
Update
From the code example, I do not understand what this means or how to translate it into C#:
sum -= sum < low 16 bits of CheckSum in file // 16-bit borrow
sum -= low 16 bits of CheckSum in file
sum -= sum < high 16 bits of CheckSum in file
sum -= high 16 bits of CheckSum in file
Update #2
Thanks, I came across some Python code that does something similar here:
def generate_checksum(self):
    # This will make sure that the data representing the PE image
    # is updated with any changes that might have been made by
    # assigning values to header fields as those are not automatically
    # updated upon assignment.
    #
    self.__data__ = self.write()
    # Get the offset to the CheckSum field in the OptionalHeader
    #
    checksum_offset = self.OPTIONAL_HEADER.__file_offset__ + 0x40 # 64
    checksum = 0
    # Verify the data is dword-aligned. Add padding if needed
    #
    remainder = len(self.__data__) % 4
    data = self.__data__ + ( '\0' * ((4-remainder) * ( remainder != 0 )) )
    for i in range( len( data ) / 4 ):
        # Skip the checksum field
        #
        if i == checksum_offset / 4:
            continue
        dword = struct.unpack('I', data[ i*4 : i*4+4 ])[0]
        checksum = (checksum & 0xffffffff) + dword + (checksum>>32)
        if checksum > 2**32:
            checksum = (checksum & 0xffffffff) + (checksum >> 32)
    checksum = (checksum & 0xffff) + (checksum >> 16)
    checksum = (checksum) + (checksum >> 16)
    checksum = checksum & 0xffff
    # The length is the one of the original data, not the padded one
    #
    return checksum + len(self.__data__)
However, it's still not working for me - here is my conversion of this code:
using System;
using System.IO;
namespace CheckSumTest
{
class Program
{
static void Main(string[] args)
{
var data = File.ReadAllBytes(@"c:\Windows\notepad.exe");
var PEStart = BitConverter.ToInt32(data, 0x3c);
var PECoffStart = PEStart + 4;
var PEOptionalStart = PECoffStart + 20;
var PECheckSum = PEOptionalStart + 64;
var checkSumInFile = BitConverter.ToInt32(data, PECheckSum);
Console.WriteLine(string.Format("{0:x}", checkSumInFile));
long checksum = 0;
var remainder = data.Length % 4;
if (remainder > 0)
{
Array.Resize(ref data, data.Length + (4 - remainder));
}
var top = Math.Pow(2, 32);
for (int i = 0; i < data.Length / 4; i++)
{
if (i == PECheckSum / 4)
{
continue;
}
var dword = BitConverter.ToInt32(data, i * 4);
checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
if (checksum > top)
{
checksum = (checksum & 0xffffffff) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = (checksum) + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += (uint)data.Length;
Console.WriteLine(string.Format("{0:x}", checksum));
Console.ReadKey();
}
}
}
Can anyone tell me where I'm being stupid?
Ok, finally got it working ok... my problem was that I was using ints not uints!!!
So, this code works (assuming data is 4-byte aligned, otherwise you'll have to pad it out a little) - and PECheckSum is the position of the CheckSum value within the PE (which is clearly not used when calculating the checksum!!!!)
static uint CalcCheckSum(byte[] data, int PECheckSum)
{
long checksum = 0;
var top = Math.Pow(2, 32);
for (var i = 0; i < data.Length / 4; i++)
{
if (i == PECheckSum / 4)
{
continue;
}
var dword = BitConverter.ToUInt32(data, i * 4);
checksum = (checksum & 0xffffffff) + dword + (checksum >> 32);
if (checksum > top)
{
checksum = (checksum & 0xffffffff) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = (checksum) + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += (uint)data.Length;
return (uint)checksum;
}
The code in the forum post is not strictly the same as what was noted during the actual disassembly of the Windows PE code. The CodeProject article you reference gives the "fold 32-bit value into 16 bits" as:
mov edx,eax ; EDX = EAX
shr edx,10h ; EDX = EDX >> 16 EDX is high order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order
add eax,edx ; EAX = EAX + EDX High Order Folded into Low Order
mov edx,eax ; EDX = EAX
shr edx,10h ; EDX = EDX >> 16 EDX is high order
add eax,edx ; EAX = EAX + EDX High Order Folded into Low Order
and eax,0FFFFh ; EAX = EAX & 0xFFFF EAX is low order 16 bits
Which you could translate into C# as:
// given: uint sum = ...;
uint high = sum >> 16; // take high order from sum
sum &= 0xFFFF; // clear out high order from sum
sum += high; // fold high order into low order
high = sum >> 16; // take the new high order of sum
sum += high; // fold the new high order into sum
sum &= 0xFFFF; // mask to 16 bits
The Java code below from emmanuel may not work. In my case it hangs and never completes. I believe this is due to the heavy use of I/O in the code, in particular the data.read() calls. As a solution, these can be replaced with an array: the RandomAccessFile reads the file, fully or incrementally, into one or more byte arrays.
I attempted this but the calculation was too slow due to the conditional for the checksum offset to skip the checksum header bytes. I would imagine that the OP's C# solution would have a similar problem.
The below code removes this also.
public static long computeChecksum(RandomAccessFile data, int checksumOffset)
throws IOException {
...
byte[] barray = new byte[(int) length];
data.readFully(barray);
long i = 0;
long ch1, ch2, ch3, ch4, dword;
while (i < checksumOffset) {
ch1 = ((int) barray[(int) i++]) & 0xff;
...
checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
i += 4;
while (i < length) {
ch1 = ((int) barray[(int) i++]) & 0xff;
...
checksum += dword = ch1 | (ch2 << 8) | (ch3 << 16) | (ch4 << 24);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += length;
return checksum;
}
I still think that code was too verbose and clunky, however, so I swapped out the RandomAccessFile for a channel and overwrote the culprit bytes with zeros to eliminate the conditional. This code could probably still do with a cache-style buffered read.
public static long computeChecksum2(FileChannel ch, int checksumOffset)
throws IOException {
ch.position(0);
long sum = 0;
long top = (long) Math.pow(2, 32);
long length = ch.size();
ByteBuffer buffer = ByteBuffer.wrap(new byte[(int) length]);
buffer.order(ByteOrder.LITTLE_ENDIAN);
ch.read(buffer);
buffer.putInt(checksumOffset, 0x0000);
buffer.position(0);
while (buffer.hasRemaining()) {
sum += buffer.getInt() & 0xffffffffL;
if (sum > top) {
sum = (sum & 0xffffffffL) + (sum >> 32);
}
}
sum = (sum & 0xffff) + (sum >> 16);
sum = sum + (sum >> 16);
sum = sum & 0xffff;
sum += length;
return sum;
}
No one really answered the original question of "Can anyone define the Windows PE Checksum Algorithm?" so I'm going to define it as simply as possible. A lot of the examples given so far are optimizing for unsigned 32-bit integers (aka DWORDs), but if you just want to understand the algorithm itself at its most fundamental, it is simply this:
Using an unsigned 16-bit integer (aka a WORD) to store the checksum, add up all of the WORDs of the data except for the 4 bytes of the PE optional header checksum, adding any carry out of the 16 bits back into the sum (as the code below does). If the file is not WORD-aligned, treat the missing last byte as 0x00.
Convert the checksum from a WORD to a DWORD and add the size of the file.
The PE checksum algorithm above is effectively the same as the original MS-DOS checksum algorithm. The only differences are the location to skip and that, instead of XORing with 0xFFFF at the end, the size of the file is added.
From my WinPEFile class for PHP, the above algorithm looks like:
$x = 0;
$y = strlen($data);
$val = 0;
while ($x < $y)
{
// Skip the checksum field location.
if ($x === $this->pe_opt_header["checksum_pos"]) $x += 4;
else
{
$val += self::GetUInt16($data, $x, $y);
// In PHP, integers are either signed 32-bit or 64-bit integers.
if ($val > 0xFFFF) $val = ($val & 0xFFFF) + 1;
}
}
// Add the file size.
$val += $y;
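A C# rendering of the word-by-word description might look like the sketch below. It is not taken from any of the answers here: PeChecksumSketch, Compute and checksumOffset are illustrative names, the WORDs are assumed to be little-endian, the offset of the CheckSum field is assumed to be even, and the carry handling follows the PHP snippet above in spirit (anything above 16 bits is folded back into the sum).

```csharp
using System;

static class PeChecksumSketch
{
    // data:           the whole file
    // checksumOffset: file offset of the 4-byte CheckSum field (assumed even)
    public static uint Compute(byte[] data, int checksumOffset)
    {
        uint sum = 0;
        for (int i = 0; i < data.Length; i += 2)
        {
            if (i == checksumOffset)
            {
                i += 2;      // together with the loop's += 2 this skips 4 bytes
                continue;
            }

            // Read one little-endian WORD; a missing last byte counts as 0x00.
            uint word = data[i];
            if (i + 1 < data.Length)
                word |= (uint)data[i + 1] << 8;

            sum += word;
            sum = (sum & 0xFFFF) + (sum >> 16);   // fold the carry back in
        }

        // Widen to a DWORD and add the file size.
        return sum + (uint)data.Length;
    }
}
```

I have only checked this against small hand-made buffers, not against ImageHlp, so treat it as a reading aid for the algorithm rather than a verified implementation.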
I was trying to solve the same issue in Java. Here is Mark's solution translated into Java, using a RandomAccessFile instead of a byte array as input:
static long computeChecksum(RandomAccessFile data, long checksumOffset) throws IOException {
long checksum = 0;
long top = (long) Math.pow(2, 32);
long length = data.length();
for (long i = 0; i < length / 4; i++) {
if (i == checksumOffset / 4) {
data.skipBytes(4);
continue;
}
long ch1 = data.read();
long ch2 = data.read();
long ch3 = data.read();
long ch4 = data.read();
long dword = ch1 + (ch2 << 8) + (ch3 << 16) + (ch4 << 24);
checksum = (checksum & 0xffffffffL) + dword + (checksum >> 32);
if (checksum > top) {
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += length;
return checksum;
}
private unsafe static int GetSetPEChecksum(byte[] Array) {
var Value = 0;
var Count = Array.Length;
if(Count >= 64)
fixed (byte* array = Array) {
var Index = 0;
var Coff = *(int*)(array + 60);
if(Coff >= 64 && Count >= Coff + 92) {
*(int*)(array + Coff + 88) = 0;
var Bound = Count >> 1;
if((Count & 1) != 0) Value = array[Count & ~1];
var Short = (ushort*)array;
while(Index < Bound) {
Value += Short[Index++];
Value = (Value & 0xffff) + (Value >> 16);
Value = (Value + (Value >> 16)) & 0xffff;
}
*(int*)(array + Coff + 88) = Value += Count;
}
}
return Value;
}
In case you need a short unsafe version: the code above needs neither double nor long integers, and no array alignment inside the algorithm.
The Java example is not entirely correct. The following Java implementation matches the result of Microsoft's original implementation, Imagehlp.MapFileAndCheckSumA.
It is important that the input bytes are masked with inputByte & 0xff and that the resulting long is masked again with currentWord & 0xffffffffL (note the L suffix) when it is used in the addition:
long checksum = 0;
final long max = 4294967296L; // 2^32
// verify the data is DWORD-aligned and add padding if needed
final int remainder = data.length % 4;
final byte[] paddedData = Arrays.copyOf(data, data.length
+ (remainder > 0 ? 4 - remainder : 0));
for (int i = 0; i <= paddedData.length - 4; i += 4)
{
// skip the checksum field
if (i == this.offsetToOriginalCheckSum)
continue;
// take DWORD into account for computation
final long currentWord = (paddedData[i] & 0xff)
+ ((paddedData[i + 1] & 0xff) << 8)
+ ((paddedData[i + 2] & 0xff) << 16)
+ ((paddedData[i + 3] & 0xff) << 24);
checksum = (checksum & 0xffffffffL) + (currentWord & 0xffffffffL);
if (checksum > max)
checksum = (checksum & 0xffffffffL) + (checksum >> 32);
}
checksum = (checksum & 0xffff) + (checksum >> 16);
checksum = checksum + (checksum >> 16);
checksum = checksum & 0xffff;
checksum += data.length; // must be original data length
In this case, Java is a bit inconvenient.
The CheckSum field is 32 bits long and is calculated as follows
1. Add all dwords (32 bit pieces) of the entire file to a sum
Add all dwords of the entire file not including the CheckSum field itself, including all headers and all of the contents, to a dword. If the dword overflows, add the overflowed bit back to the first bit (2^0) of the dword.
If the file is not entirely divisible into dwords (4-byte pieces) see 2.
The best way I know to realize this is by using the GNU C Compiler's integer overflow builtin function __builtin_uadd_overflow.
In the original ChkSum function documented by Jeffrey Walton the sum
was calculated by performing an add (%esi),%eax where
esi contains the base address of the file and eax is 0 and adding the rest of the file like this
adc 0x4(%esi),%eax
adc 0x8(%esi),%eax
adc 0xc(%esi),%eax
adc 0x10(%esi),%eax
...
adc $0x0,%eax
The first add adds the first dword ignoring any carry flag. The next dwords
are added by the adc instruction which does the same thing as add but
adds any carry flag that was set before executing the instruction in addition
to the summand. The last adc $0x0,%eax adds only the last carry flag if it
was set and cannot be discarded.
Please keep in mind that the dword of CheckSum field itself should not be added.
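For readers following along in C#, which has no adc instruction, one way to imitate the carry handling is to detect the wrap-around after each addition and add 1 back. This is a sketch under my own assumptions: CarrySumSketch, AddWithCarry, SumDwords and skipIndex are illustrative names, and the carry is applied immediately rather than on the next instruction as adc does.

```csharp
using System;

static class CarrySumSketch
{
    // Adds two dwords; if the 32-bit addition wrapped around,
    // the lost carry bit is added back to bit 0.
    public static uint AddWithCarry(uint sum, uint dword)
    {
        uint result = unchecked(sum + dword);
        if (result < dword)   // only true when the addition overflowed
            result++;
        return result;
    }

    // Sums all dwords except the one at skipIndex (the CheckSum field).
    public static uint SumDwords(uint[] dwords, int skipIndex)
    {
        uint sum = 0;
        for (int i = 0; i < dwords.Length; i++)
        {
            if (i == skipIndex)
                continue;
            sum = AddWithCarry(sum, dwords[i]);
        }
        return sum;
    }
}
```

For instance, AddWithCarry(0xFFFFFFFF, 2) wraps to 1 and then becomes 2 once the carry is added back.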
2. Add the remainder to the sum if there is one
If the file is not entirely divisible into dwords, add the remainder as a
zero-padded dword. For example: say your file is 15 bytes long and looks like this
0E 1F BA 0E | 00 B4 09 CD | 21 B8 01 4C | CD 21 54
You need to add the remainder as 0x005421CD to the sum. My system is a
little-endian system. I do not know whether the checksum would change because of
the byte order on big-endian systems, or whether you would just simulate this
behaviour.
I do this by rounding up the buffer_size to the next bytecount divisible by 4
without remainder or put differently: the next whole dword count represented
in bytes. Then I allocate with calloc because it initializes the memory block
with all zeros.
if(buffer_size%4)
{buffer_size+=4-(buffer_size%4);}
...
calloc(buffer_size,1)
3. Add the lower word (16 bit piece) and the higher word of the sum together.
sum=(sum&0xffff)+(sum>>16);
4. Add the new higher word once again
sum+=(sum>>16);
5. Only keep the lower word
sum&=0xffff;
6. Add the number of bytes in the file to the sum
return(sum+size);
This is how I wrote it. It is not C#, but C. off_t size is the number of bytes in the file. uint32_t *base is a pointer to the file loaded into memory. The block of memory should be padded with zeros at the end to the next bytecount divisible by 4.
uint32_t pe_header_checksum(uint32_t *base,off_t size)
{uint32_t sum=0;
off_t i;
for(i=0;i<(size/4);i++)
{if(i==0x36)
{continue;}
{uint32_t carry=__builtin_uadd_overflow(base[i],sum,&sum); /* add first, then fold the carry back in */
sum+=carry;}}
if(size%4)
{sum+=base[i];}
sum=(sum&0xffff)+(sum>>16);
sum+=(sum>>16);
sum&=0xffff;
return(sum+size);}
If you want you can see the code in action and read a little bit more here.