I am learning about CRCs and trying to understand the ideas behind them. I can't find CRC-128 or CRC-256 code anywhere. If anyone has C++ or C# code for them, please share it, along with links to any websites that cover them. I am a newbie and can't yet turn the theory and mathematics into code myself, so I am asking for help. Simple, correct code would be much appreciated, and if you provide it, please also include the CRC table generator functions. Thank you.
I agree with you except that the accidental collision rate is higher than 1 in 2^32 or 1 in 2^64 for 32 bit and 64 bit CRCs respectively.
I wrote an app that kept track of things by their CRC values for tracking items. We needed to track potentially millions of items and we started with a CRC32 which in real world practice has a collision rate of around 1 in 2^16 which was an unpleasant surprise. We then re-coded to use a CRC64 which had a real world collision rate of about 1 in 2^23. We tested this after the unpleasant surprise of the 32 bit one we started with and accepted the small error rate of the 64 bit one.
I can't really explain the statistics behind the expected collision rate, but it makes sense that you would experience a collision much sooner than the bit width alone suggests. Just like a hashtable: some hash buckets are empty and others have more than one entry.
Even for a 256 bit CRC the first 2 CRC's could be the same...it would be almost incredible but possible.
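The effect described above is usually called the birthday bound: with an n-bit check value, a collision somewhere among k items becomes likely once k approaches 2^(n/2), long before 2^n. Here is a rough sketch, assuming the CRC values behave like uniformly random n-bit numbers (an idealization) and using the standard approximation p ≈ 1 - exp(-k(k-1) / 2^(n+1)). The function name is made up for illustration.

```javascript
// Approximate probability that at least two of `items` values collide
// when each value is drawn uniformly from 2^bits possibilities.
// Birthday approximation: p ~= 1 - exp(-k(k-1) / (2 * 2^bits)).
function collisionProbability(items, bits) {
  const space = Math.pow(2, bits);
  // -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
  return -Math.expm1(-(items * (items - 1)) / (2 * space));
}

console.log(collisionProbability(Math.pow(2, 16), 32)); // roughly 0.39
console.log(collisionProbability(1e6, 32));             // very close to 1
console.log(collisionProbability(1e6, 64));             // tiny, about 2.7e-8
```

Under that assumption, tracking millions of items with a 32-bit value makes at least one collision practically certain, while a 64-bit value leaves only a very small chance.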
Though CRC-128 and CRC-256 were defined, I don't know of anyone who actually uses them.
Most of the time, developers who think they want a CRC should really be using a cryptographic hash function; these have superseded CRCs for many applications. It would be a rare case indeed where CRC-128 or CRC-256 would be a superior choice to even the broken MD5, much less the SHA-2 family.
Here is a Java class I wrote recently for playing with CRCs. Beware that changing the bit order is implemented only for bitwise computation.
/**
* A CRC algorithm for computing check values.
*/
public class Crc
{
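// Note: with MSB-first bit order and no reflection, these parameter sets
// correspond to the catalogued CRC-16/GENIBUS and CRC-32/BZIP2 variants,
// not to the reflected CRC-32 used by zlib and PNG.
public static final Crc CRC_16_CCITT =
new Crc(16, 0x1021, 0xffff, 0xffff, true);
public static final Crc CRC_32 =
new Crc(32, 0x04c11db7, 0xffffffffL, 0xffffffffL, true);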
private final int _width;
private final long _polynomial;
private final long _mask;
private final long _highBitMask;
private final long _preset;
private final long _postComplementMask;
private final boolean _msbFirstBitOrder;
private final int _shift;
private final long[] _crcs;
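/**
* Constructs a CRC specification with a zero preset, no post-complement
* and MSB-first bit order.
*
* @param width The width of the CRC in bits.
* @param polynomial The generator polynomial, without its top bit.
*/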
public Crc(
int width,
long polynomial)
{
this(width, polynomial, 0, 0, true);
}
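/**
* Constructs a CRC specification.
*
* @param width The width of the CRC in bits.
* @param polynomial The generator polynomial, without its top bit.
* @param preset The initial register value.
* @param postComplementMask The value XORed into the final result.
* @param msbFirstBitOrder True to process message bits MSB first.
*/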
public Crc(
int width,
long polynomial,
long preset,
long postComplementMask,
boolean msbFirstBitOrder)
{
super();
_width = width;
_polynomial = polynomial;
_mask = -1L >>> (64 - width); // all ones in the low `width` bits; also correct for width == 64
_highBitMask = (1L << (width - 1));
_preset = preset;
_postComplementMask = postComplementMask;
_msbFirstBitOrder = msbFirstBitOrder;
_shift = _width - 8;
_crcs = new long[256];
for (int i = 0; i < 256; i++)
{
_crcs[i] = crcForByte(i);
}
}
/**
* Gets the width.
*
* @return The width.
*/
public int getWidth()
{
return _width;
}
/**
* Gets the polynomial.
*
* @return The polynomial.
*/
public long getPolynomial()
{
return _polynomial;
}
/**
* Gets the mask.
*
* @return The mask.
*/
public long getMask()
{
return _mask;
}
/**
* Gets the preset.
*
* @return The preset.
*/
public long getPreset()
{
return _preset;
}
/**
* Gets the post-complement mask.
*
* @return The post-complement mask.
*/
public long getPostComplementMask()
{
return _postComplementMask;
}
/**
* @return True if this CRC uses MSB first bit order.
*/
public boolean isMsbFirstBitOrder()
{
return _msbFirstBitOrder;
}
public long computeBitwise(byte[] message)
{
long result = _preset;
for (int i = 0; i < message.length; i++)
{
for (int j = 0; j < 8; j++)
{
final int bitIndex = _msbFirstBitOrder ? 7 - j : j;
final boolean messageBit = (message[i] & (1 << bitIndex)) != 0;
final boolean crcBit = (result & _highBitMask) != 0;
result <<= 1;
if (messageBit ^ crcBit)
{
result ^= _polynomial;
}
result &= _mask;
}
}
return result ^ _postComplementMask;
}
public long compute(byte[] message)
{
long result = _preset;
for (int i = 0; i < message.length; i++)
{
final int b = (int) (message[i] ^ (result >>> _shift)) & 0xff;
result = ((result << 8) ^ _crcs[b]) & _mask;
}
return result ^ _postComplementMask;
}
private long crcForByte(int b)
{
long result1 = ((long) (b & 0xff)) << _shift; // widen first so shifts of 32 or more bits work
for (int j = 0; j < 8; j++)
{
final boolean crcBit = (result1 & (1L << (_width - 1))) != 0;
result1 <<= 1;
if (crcBit)
{
result1 ^= _polynomial;
}
result1 &= _mask;
}
return result1;
}
public String crcTable()
{
final int digits = (_width + 3) / 4;
final int itemsPerLine = (digits + 4) * 8 < 72 ? 8 : 4;
final String format = "0x%0" + digits + "x, ";
final StringBuilder builder = new StringBuilder();
builder.append("{\n");
for (int i = 0; i < _crcs.length; i += itemsPerLine)
{
builder.append(" ");
for (int j = i; j < i + itemsPerLine; j++)
{
builder.append(String.format(format, _crcs[j]));
}
builder.append("\n");
}
builder.append("}\n");
return builder.toString();
}
}
CRC-128 and CRC-256 only make sense if all three of the following points are true:
1. You are CPU constrained to the point where a crypto hash would significantly slow you down.
2. Accidental collisions must statistically never happen; 1 in 2^64 is still too high.
3. On the other hand, deliberate collisions are not a problem.
A typical case where 2 and 3 can be true together is if an accidental collision would create a data loss that only affects the sender of the data, and not the platform.
For the past 4 hours I've been studying the CRC algorithm. I'm pretty sure I got the hang of it already.
I'm trying to write a png encoder, and I don't wish to use external libraries for the CRC calculation, nor for the png encoding itself.
My program has been able to get the same CRCs as the examples in tutorials, such as the one on Wikipedia.
Using the same polynomial and message as in the example, I was able to produce the same result in both of the cases. I was able to do this for several other examples as well.
However, I can't seem to properly calculate the CRC of png files. I tested this by creating a blank, one-pixel .png file in Paint and using its CRC as a comparison. I copied the data (and chunk name) from the IDAT chunk of the png (which the CRC is calculated from), and calculated its CRC using the polynomial provided in the png specification.
The polynomial provided in the png specification is the following:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Which should translate to:
1 00000100 11000001 00011101 10110111
Using that polynomial, I tried to get the CRC of the following data:
01001001 01000100 01000001 01010100
00011000 01010111 01100011 11101000
11101100 11101100 00000100 00000000
00000011 00111010 00000001 10011100
This is what I get:
01011111 11000101 01100001 01101000 (MSB First)
10111011 00010011 00101010 11001100 (LSB First)
This is what is the actual CRC:
11111010 00010110 10110110 11110111
I'm not exactly sure how to fix this, but my guess would be I'm doing this part from the specification wrong:
In PNG, the 32-bit CRC is initialized to all 1's, and then the data from each byte is processed from the least significant bit (1) to the most significant bit (128). After all the data bytes are processed, the CRC is inverted (its ones complement is taken). This value is transmitted (stored in the datastream) MSB first. For the purpose of separating into bytes and ordering, the least significant bit of the 32-bit CRC is defined to be the coefficient of the x^31 term.
I'm not completely sure I can understand all of that.
Also, here is the code I use to get the CRC:
public BitArray GetCRC(BitArray data)
{
// Prepare the divident; Append the proper amount of zeros to the end
BitArray divident = new BitArray(data.Length + polynom.Length - 1);
for (int i = 0; i < divident.Length; i++)
{
if (i < data.Length)
{
divident[i] = data[i];
}
else
{
divident[i] = false;
}
}
// Calculate CRC
for (int i = 0; i < divident.Length - polynom.Length + 1; i++)
{
if (divident[i] && polynom[0])
{
for (int j = 0; j < polynom.Length; j++)
{
if ((divident[i + j] && polynom[j]) || (!divident[i + j] && !polynom[j]))
{
divident[i + j] = false;
}
else
{
divident[i + j] = true;
}
}
}
}
// Strip the CRC off the divident
BitArray crc = new BitArray(polynom.Length - 1);
for (int i = data.Length, j = 0; i < divident.Length; i++, j++)
{
crc[j] = divident[i];
}
return crc;
}
So, how do I fix this to match the PNG specification?
You can find a complete implementation of the CRC calculation (and PNG encoding in general) in this public domain code [1]:
static uint[] crcTable;
// Stores a running CRC (initialized with the CRC of "IDAT" string). When
// you write this to the PNG, write as a big-endian value
static uint idatCrc = Crc32(new byte[] { (byte)'I', (byte)'D', (byte)'A', (byte)'T' }, 0, 4, 0);
// Call this function with the compressed image bytes,
// passing in idatCrc as the last parameter
private static uint Crc32(byte[] stream, int offset, int length, uint crc)
{
uint c;
if(crcTable==null){
crcTable=new uint[256];
for(uint n=0;n<=255;n++){
c = n;
for(var k=0;k<=7;k++){
if((c & 1) == 1)
c = 0xEDB88320^((c>>1)&0x7FFFFFFF);
else
c = ((c>>1)&0x7FFFFFFF);
}
crcTable[n] = c;
}
}
c = crc^0xffffffff;
var endOffset=offset+length;
for(var i=offset;i<endOffset;i++){
c = crcTable[(c^stream[i]) & 255]^((c>>8)&0xFFFFFF);
}
return c^0xffffffff;
}
[1] https://web.archive.org/web/20150825201508/http://upokecenter.dreamhosters.com/articles/png-image-encoder-in-c/
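For comparison with the long-division code in the question, the three PNG-specific steps (preset to all ones, feed each byte least significant bit first, complement the result) can also be sketched bit by bit. This is an illustrative JavaScript sketch, not code from the linked article, and the function name is an assumption. 0xEDB88320 is the polynomial 0x04C11DB7 with its bits reversed, which is what LSB-first processing calls for.

```javascript
// Bitwise CRC-32 with the PNG parameters:
// preset 0xFFFFFFFF, bits consumed LSB first, final one's complement.
function pngCrc32(bytes) {
  let crc = 0xFFFFFFFF;
  for (const byte of bytes) {
    crc ^= byte; // mix the next byte into the low 8 bits
    for (let bit = 0; bit < 8; bit++) {
      // Shift right one bit; XOR the reflected polynomial if a 1 fell out.
      crc = (crc & 1) ? (crc >>> 1) ^ 0xEDB88320 : crc >>> 1;
    }
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // complement, then force unsigned
}

// The chunk type followed by the chunk data goes in; IEND has no data.
const iend = [0x49, 0x45, 0x4E, 0x44]; // "IEND"
console.log(pngCrc32(iend).toString(16)); // ae426082
```

The four bytes AE 42 60 82 are the CRC that follows the IEND chunk type at the end of every PNG file, which makes for a handy self-check. The value is stored in the file most significant byte first.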
I can't think of a good way to do this, and would appreciate some help, if possible!
I'm afraid I don't have any code to post yet as I haven't got that far.
I need to generate a sequence of values from 3 (or possibly more) parameters in the range 0-999.
The value must always be the same for the given inputs but with a fair distribution between upper and lower boundaries so as to appear random.
For example:
function (1, 1, 1) = 423
function (1, 1, 2) = 716
function (1, 2, 1) = 112
These must be reasonably fast to produce, by which I mean I should be able to generate 100-200 during web page load with no noticeable delay.
The method must be do-able in C# but also in JavaScript, otherwise I'd probably use a CRC32 or MD5 hash algorithm.
If it helps this will be used as part of a procedural generation routine.
I had a go at asking this previously, but I think the poor quality of my explanation let me down.
I apologise if this is worded badly. Please just let me know if so and I'll try to explain further.
Thanks very much for any help.
Here's one:
function sequence(x, y, z) {
return Math.abs(441*x-311*y+293*z) % 1000;
}
It even produces the output from your example!
Using Marsaglia's multiply-with-carry generator from Wikipedia:
public class SimpleMarsagliaRandom
{
private const uint original_w = 1023;
private uint m_w = original_w; /* must not be zero */
private uint m_z = 0; /* must not be zero, initialized by the constructor */
public SimpleMarsagliaRandom()
{
this.init(666);
}
public void init(uint z)
{
this.m_w = original_w;
this.m_z = z;
}
public uint get_random()
{
this.m_z = 36969 * (this.m_z & 65535) + (this.m_z >> 16);
this.m_w = 18000 * (this.m_w & 65535) + (this.m_w >> 16);
return (this.m_z << 16) + this.m_w; /* 32-bit result */
}
public uint get_random(uint min, uint max)
{
// max excluded
uint num = max - min;
return (this.get_random() % num) + min;
}
}
and
simpleMarsagliaRandom = function()
{
var original_w = 1023 >>> 0;
var m_w = 0, m_z = 0;
this.init = function(z)
{
m_w = original_w;
m_z = z >>> 0;
};
this.init(666);
var internalRandom = function()
{
m_z = (36969 * (m_z & 65535) + (m_z >>> 16)) >>> 0;
m_w = (18000 * (m_w & 65535) + (m_w >>> 16)) >>> 0;
return (((m_z << 16) >>> 0) + m_w) >>> 0; /* 32-bit result */
};
this.get_random = function(min, max)
{
if (arguments.length < 2)
{
return internalRandom();
}
var num = ((max >>> 0) - (min >>> 0)) >>> 0;
return ((internalRandom() % num) + min) >>> 0;
}
};
In Javascript all the >>> are to coerce the number to uint
Totally untested
Be aware that the way get_random maps numbers into the range min to max is slightly biased: low numbers come up a little more often than high ones. As an example, take a standard six-faced die. You roll it and get 1-6. Now say the faces are printed 0-5 instead. You roll it and get 0-5. No problem. But you need numbers in the range 0-3, so you compute rolled % 4. Then we have:
rolled => rolled % 4
0 => 0,
1 => 1,
2 => 2,
3 => 3,
4 => 0,
5 => 1.
The results 0 and 1 are twice as common as 2 and 3.
Ideone for C# version: http://ideone.com/VQudcV
JSFiddle for Javascript version: http://jsfiddle.net/dqayk/
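The modulo bias mentioned above is easy to see by counting, and one common remedy is rejection sampling: throw away raw values that fall in the incomplete last block. The sketch below is illustrative only; `moduloCounts`, `unbiased` and the `nextRaw` callback are made-up names, and `nextRaw` is assumed to return a uniform integer from 0 to sourceSize - 1.

```javascript
// Count how often each result appears when a source with `sourceSize`
// equally likely values (0 .. sourceSize-1) is reduced with `% range`.
function moduloCounts(sourceSize, range) {
  const counts = new Array(range).fill(0);
  for (let v = 0; v < sourceSize; v++) {
    counts[v % range]++;
  }
  return counts;
}

// Rejection sampling: discard raw values from the incomplete last block
// so that every result is equally likely.
function unbiased(nextRaw, sourceSize, range) {
  const limit = sourceSize - (sourceSize % range); // largest multiple of range
  let v;
  do {
    v = nextRaw();
  } while (v >= limit);
  return v % range;
}

console.log(moduloCounts(6, 4)); // [ 2, 2, 1, 1 ]
console.log(moduloCounts(6, 3)); // [ 2, 2, 2 ]
```

For a 32-bit generator reduced to a range of 1000 the bias is tiny, so whether the extra loop is worth it depends on how even the distribution needs to be.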
You should be able to use MD5 hashing in both C# and JS.
In C#:
int Hash(params int[] values)
{
System.Security.Cryptography.MD5 hasher = MD5.Create();
string valuesAsString = string.Join(",", values);
var hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(valuesAsString));
var hashAsInt = BitConverter.ToInt32(hash, 0);
return Math.Abs(hashAsInt % 1000);
}
In JS, implement the same method using some MD5 algorithm (e.g. jshash)
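If pulling an MD5 library into the JavaScript side is unwanted, a small non-cryptographic hash can play the same role. The sketch below swaps in 32-bit FNV-1a (not MD5, so its outputs will not match the C# method above) applied to the same comma-joined string. `hashToRange` is a made-up name, and the approach assumes the inputs are integers so that the joined string is plain ASCII.

```javascript
// 32-bit FNV-1a over the ASCII string "x,y,z", reduced to 0..999.
function hashToRange(...values) {
  const text = values.join(",");
  let hash = 0x811C9DC5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit wraparound
  }
  return (hash >>> 0) % 1000;
}

console.log(hashToRange(1, 1, 1));
console.log(hashToRange(1, 1, 2));
console.log(hashToRange(1, 2, 1));
```

The same inputs always give the same output, and a C# port would presumably need only `uint` arithmetic for the multiply, since that wraps around in the same way.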
I am continuing from my previous question. I am making a C# program where the user enters a 7-bit binary number and the computer prints out the number with an even parity bit to the right of the number. I am struggling. I have some code, but the compiler says that BitArray is a namespace but is used as a type. Also, is there a way I could improve the code and make it simpler?
namespace BitArray
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Please enter a 7-bit binary number:");
int a = Convert.ToInt32(Console.ReadLine());
byte[] numberAsByte = new byte[] { (byte)a };
BitArray bits = new BitArray(numberAsByte);
int count = 0;
for (int i = 0; i < 8; i++)
{
if (bits[i])
{
count++;
}
}
if (count % 2 == 1)
{
bits[7] = true;
}
bits.CopyTo(numberAsByte, 0);
a = numberAsByte[0];
Console.WriteLine("The binary number with a parity bit is:");
Console.WriteLine(a);
}
}
}
Might be more fun to duplicate the circuit they use to do this..
bool odd = false;
for(int i=6;i>=0;i--)
odd ^= (number & (1 << i)) > 0;
Then, for even parity, set bit 7 to odd; for odd parity, set it to !odd.
or
bool even = true;
for(int i=6;i>=0;i--)
even ^= (number & (1 << i)) > 0;
The real circuit is dual-function (it outputs both even and odd parity, i.e. 0 and 1 or 1 and 0) and handles more than one bit at a time as well, but this problem is a bit light for TPL....
PS you might want to check that the input is < 128, otherwise things are going to go badly wrong.
ooh didn't notice the homework tag, don't use this unless you can explain it.
Almost the same process, only much faster on a larger number of bits. Using only bitwise operators (SHR and XOR), without loops:
// Returns true when data has an odd number of 1 bits.
public static bool is_parity(int data)
{
    //data ^= data >> 32; // if arg >= 64-bit (notice argument length)
    //data ^= data >> 16; // if arg >= 32-bit
    //data ^= data >> 8; // if arg >= 16-bit
    data ^= data >> 4;
    data ^= data >> 2;
    data ^= data >> 1;
    return (data & 1) != 0;
}
// Leaves data unchanged when its bit count is already odd; otherwise flips bit 7.
// The result therefore has odd parity; invert the test for even parity.
public static byte fix_parity(byte data)
{
    if (is_parity(data)) return data;
    return (byte)(data ^ 128);
}
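The XOR-folding idea above can be sketched for the even-parity case from the question as follows. This is an illustrative JavaScript version with made-up function names; it assumes the parity bit goes into bit 7, as in the other answers.

```javascript
// Returns 1 if `value` has an odd number of 1 bits in its low 8 bits.
function oddBitCount(value) {
  value ^= value >> 4; // fold the high nibble onto the low nibble
  value ^= value >> 2;
  value ^= value >> 1;
  return value & 1;
}

// Puts an even-parity bit in bit 7 of a 7-bit number, so that the
// resulting byte always has an even number of 1 bits.
function withEvenParity(sevenBits) {
  if (sevenBits < 0 || sevenBits > 127) {
    throw new RangeError("expected a 7-bit value");
  }
  return sevenBits | (oddBitCount(sevenBits) << 7);
}

console.log(withEvenParity(0b1011001).toString(2).padStart(8, "0")); // 01011001
console.log(withEvenParity(0b0000111).toString(2).padStart(8, "0")); // 10000111
```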
Using a BitArray does not buy you much here, if anything it makes your code harder to understand. Your problem can be solved with basic bit manipulation with the & and | and << operators.
For example to find out if a certain bit is set in a number you can & the number with the corresponding power of 2. That leads to:
int bitsSet = 0;
for(int i=0;i<7;i++)
if ((number & (1 << i)) > 0)
bitsSet++;
Now the only thing remaining is to determine whether bitsSet is even or odd and then set the remaining bit if necessary.
Given two ARGB colors represented as integers, 8 bit/channel (alpha, red, green, blue), I need to compute a value that represents a sort of distance (also integer) between them.
So the formula for the distance is: Delta=|R1-R2|+|G1-G2|+|B1-B2|, where Rx, Gx and Bx are the channel values of colors 1 and 2. The alpha channel is always ignored.
I need to speed up this calculation because it is done many times on a slow machine. What is the 'geekiest' way to calculate this on a single thread, given the two integers?
My best so far is below, but I guess it can be improved further:
//Used for color conversion from/to int
private const int ChannelMask = 0xFF;
private const int GreenShift = 8;
private const int RedShift = 16;
public int ComputeColorDelta(int color1, int color2)
{
int rDelta = Math.Abs(((color1 >> RedShift) & ChannelMask) - ((color2 >> RedShift) & ChannelMask));
int gDelta = Math.Abs(((color1 >> GreenShift) & ChannelMask) - ((color2 >> GreenShift) & ChannelMask));
int bDelta = Math.Abs((color1 & ChannelMask) - (color2 & ChannelMask));
return rDelta + gDelta + bDelta;
}
Long Answer:
How many is "a lot"
I have a fast machine I guess, but I wrote this little script:
public static void Main() {
var s = Stopwatch.StartNew();
Random r = new Random();
for (int i = 0; i < 100000000; i++) {
int compute = ComputeColorDelta(r.Next(255), r.Next(255));
}
Console.WriteLine(s.ElapsedMilliseconds);
Console.ReadLine();
}
And the output is:
6878
So 7 seconds for 100 million times seems pretty good.
We can definitely speed this up though. I changed your function to look like this:
public static int ComputeColorDelta(int color1, int color2) {
return 1;
}
With that change, the output was: 5546. So, we managed to get a 1 second performance gain over 100 million iterations by returning a constant. ;)
Short answer: this function is not your bottleneck. :)
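I'm trying to let the runtime do the calculation for me.
First of all, I define a struct with explicit field offsets:
using System.Runtime.InteropServices;
// Overlays the four channel bytes on the packed int.
// FieldOffset is measured in bytes, not bits; this layout assumes little-endian.
[StructLayout(LayoutKind.Explicit)]
public struct Color
{
    [FieldOffset(0)] public int Raw;
    [FieldOffset(0)] public byte Blue;
    [FieldOffset(1)] public byte Green;
    [FieldOffset(2)] public byte Red;
    [FieldOffset(3)] public byte Alpha;
}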
the calculation function will be:
public int ComputeColorDeltaOptimized(Color color1, Color color2)
{
int rDelta = Math.Abs(color1.Red - color2.Red);
int gDelta = Math.Abs(color1.Green - color2.Green);
int bDelta = Math.Abs(color1.Blue - color2.Blue);
return rDelta + gDelta + bDelta;
}
And the usage
public void FactMethodName2()
{
var s = Stopwatch.StartNew();
var color1 = new Color(); // Color is a struct, so both can be declared outside the loop to gain some performance
var color2 = new Color();
for (int i = 0; i < 100000000; i++)
{
color1.Raw = i;
color2.Raw = 100000000 - i;
int compute = ComputeColorDeltaOptimized(color1, color2);
}
Console.WriteLine(s.ElapsedMilliseconds); //5393 vs 7472 of original
Console.ReadLine();
}
One idea would be to use the same code you already have, but in a different order: apply the mask, take the difference, then shift.
Another modification that might help is to inline this function: that is, instead of calling it for each pair of colors, just compute the difference directly, inside whatever loop executes this code. I assume it is inside a tight loop, because otherwise its cost would be negligible.
Lastly, since you're probably getting image pixel data, you'd save a lot by going the unsafe route: make your bitmaps like this EditableBitmap, then grab the byte* and read the image data out of it.
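The mask, difference, shift order suggested above can be sketched like this. It is an illustrative JavaScript version, not the asker's C# code, and the function name is an assumption. Each channel is masked in place, so only one shift per channel is needed instead of two.

```javascript
// Sum of absolute per-channel differences of two 0xAARRGGBB colors.
// Each channel is masked in place, subtracted, and only then shifted.
// The alpha channel is ignored.
function colorDelta(c1, c2) {
  const r = Math.abs((c1 & 0xFF0000) - (c2 & 0xFF0000)) >> 16;
  const g = Math.abs((c1 & 0x00FF00) - (c2 & 0x00FF00)) >> 8;
  const b = Math.abs((c1 & 0x0000FF) - (c2 & 0x0000FF));
  return r + g + b;
}

console.log(colorDelta(0xFF102030, 0x00302010)); // 64
```

Whether this is measurably faster than shifting first would need to be timed on the target machine.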
You can do this in order to reduce the AND operations:
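public int ComputeColorDelta(int color1, int color2)
{
    // Caution: masking the difference yields it modulo 256, so this is only
    // correct when color1's channel is >= color2's; Math.Abs has no effect here.
    int rDelta = Math.Abs(((color1 >> RedShift) - (color2 >> RedShift)) & ChannelMask);
    // same for other color channels
    return rDelta + gDelta + bDelta;
}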
not much but something...
I have two byte arrays with the same length. I need to perform XOR operation between each byte and after this calculate sum of bits.
For example:
11110000^01010101 = 10100101 -> so 1+1+1+1 = 4
I need to do the same operation for each element in the byte array.
Use a lookup table. There are only 256 possible values after XORing, so it's not exactly going to take a long time. Unlike izb's solution, though, I wouldn't suggest manually putting all the values in; compute the lookup table once at startup using one of the looping answers.
For example:
public static class ByteArrayHelpers
{
private static readonly int[] LookupTable =
Enumerable.Range(0, 256).Select(CountBits).ToArray();
private static int CountBits(int value)
{
int count = 0;
for (int i=0; i < 8; i++)
{
count += (value >> i) & 1;
}
return count;
}
public static int CountBitsAfterXor(byte[] array)
{
int xor = 0;
foreach (byte b in array)
{
xor ^= b;
}
return LookupTable[xor];
}
}
(You could make it an extension method if you really wanted...)
Note the use of byte[] in the CountBitsAfterXor method - you could make it an IEnumerable<byte> for more generality, but iterating over an array (which is known to be an array at compile-time) will be faster. Probably only microscopically faster, but hey, you asked for the fastest way :)
I would almost certainly actually express it as
public static int CountBitsAfterXor(IEnumerable<byte> data)
in real life, but see which works better for you.
Also note the type of the xor variable as an int. In fact, there's no XOR operator defined for byte values, and if you made xor a byte it would still compile due to the nature of compound assignment operators, but it would be performing a cast on each iteration - at least in the IL. It's quite possible that the JIT would take care of this, but there's no need to even ask it to :)
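Note that the method above XORs all the bytes of a single array together. If the question is read as two arrays XORed element by element, with the set bits of every result summed, the lookup-table idea can be sketched as below. This is an illustrative JavaScript version under that assumption, and the names are made up.

```javascript
// Number of 1 bits for every possible byte value, built once.
const BIT_COUNTS = new Uint8Array(256);
for (let i = 1; i < 256; i++) {
  // A value has the bit count of its upper 7 bits plus its lowest bit.
  BIT_COUNTS[i] = BIT_COUNTS[i >> 1] + (i & 1);
}

// XORs the arrays element by element and sums the set bits of the results.
function countBitsAfterXor(left, right) {
  if (left.length !== right.length) {
    throw new RangeError("arrays must have the same length");
  }
  let total = 0;
  for (let i = 0; i < left.length; i++) {
    total += BIT_COUNTS[(left[i] ^ right[i]) & 0xFF];
  }
  return total;
}

console.log(countBitsAfterXor([0b11110000], [0b01010101])); // 4
```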
Fastest way would probably be a 256-element lookup table...
int[] lut =
{
    /*0x00*/ 0,
    /*0x01*/ 1,
    /*0x02*/ 1,
    /*0x03*/ 2,
    // ...
    /*0xFE*/ 7,
    /*0xFF*/ 8
};
e.g.
11110000^01010101 = 10100101 -> lut[165] == 4
This is more commonly referred to as bit counting. There are literally dozens of different algorithms for doing this. Here is one site which lists a few of the more well known methods. There are even CPU specific instructions for doing this.
Theoretically, Microsoft could add a BitArray.CountSetBits function that gets JITed with the best algorithm for that CPU architecture. I, for one, would welcome such an addition.
As I understood it you want to sum the bits of each XOR between the left and right bytes.
for (int b = 0; b < left.Length; b++) {
int num = left[b] ^ right[b];
int sum = 0;
for (int i = 0; i < 8; i++) {
sum += (num >> i) & 1;
}
// do something with sum maybe?
}
I'm not sure if you mean sum the bytes or the bits.
To sum the bits within a byte, this should work:
int nSum = 0;
for (int i=0; i<=7; i++)
{
nSum += (byte_val>>i) & 1;
}
You would then need the xoring, and array looping around this, of course.
The following should do
int BitXorAndSum(byte[] left, byte[] right) {
int sum = 0;
for ( var i = 0; i < left.Length; i++) {
sum += SumBits((byte)(left[i] ^ right[i]));
}
return sum;
}
int SumBits(byte b) {
var sum = 0;
for (var i = 0; i < 8; i++) {
sum += (0x1) & (b >> i);
}
return sum;
}
This could be rewritten to work on ulong values through an unsafe pointer, but the byte version is easier to understand:
static int BitCount(byte num)
{
// 0x5 = 0101 (bit) 0x55 = 01010101
// 0x3 = 0011 (bit) 0x33 = 00110011
// 0xF = 1111 (bit) 0x0F = 00001111
uint count = num;
count = ((count >> 1) & 0x55) + (count & 0x55);
count = ((count >> 2) & 0x33) + (count & 0x33);
count = ((count >> 4) & 0x0F) + (count & 0x0F);
return (int)count;
}
A general function to count bits could look like:
int Count1(byte[] a)
{
int count = 0;
for (int i = 0; i < a.Length; i++)
{
byte b = a[i];
while (b != 0)
{
count++;
b = (byte)((int)b & (int)(b - 1));
}
}
return count;
}
The fewer 1 bits there are, the faster this works. It simply loops over each byte and clears the lowest 1 bit of that byte until the byte becomes 0. The casts are necessary so that the compiler stops complaining about the type widening and narrowing.
Your problem could then be solved by using this:
int Count1Xor(byte[] a1, byte[] a2)
{
int count = 0;
for (int i = 0; i < Math.Min(a1.Length, a2.Length); i++)
{
byte b = (byte)((int)a1[i] ^ (int)a2[i]);
while (b != 0)
{
count++;
b = (byte)((int)b & (int)(b - 1));
}
}
return count;
}
A lookup table should be the fastest, but if you want to do it without a lookup table, this will work for bytes in just 10 operations.
public static int BitCount(byte value) {
int v = value - ((value >> 1) & 0x55);
v = (v & 0x33) + ((v >> 2) & 0x33);
return (v + (v >> 4)) & 0x0F;
}
This is a byte version of the general bit counting function described on Sean Eron Anderson's Bit Twiddling Hacks page.
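For reference, the 32-bit form of the same technique counts the bits of four bytes in one pass. Here is an illustrative JavaScript sketch with an assumed function name; `Math.imul` is used so that the final multiply wraps at 32 bits.

```javascript
// Counts the 1 bits of a 32-bit integer (four bytes at once).
function bitCount32(v) {
  v = v - ((v >>> 1) & 0x55555555);                // sums of 2-bit groups
  v = (v & 0x33333333) + ((v >>> 2) & 0x33333333); // sums of 4-bit groups
  v = (v + (v >>> 4)) & 0x0F0F0F0F;                // sums of 8-bit groups
  return Math.imul(v, 0x01010101) >>> 24;          // add the four byte sums
}

console.log(bitCount32(0xA5));       // 4
console.log(bitCount32(0xFFFFFFFF)); // 32
```

To apply it to the XOR problem, four bytes from each array would be packed into one integer, XORed, and passed in.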