Use LINQ to XOR Bytes Together - c#

How can I use LINQ to XOR bytes together in an array? I'm taking the output of an MD5 hash and would like to XOR every four bytes together so that I can get a 32-bit int out of it. I could easily do this with a loop, but I thought it was an interesting problem for LINQ.
public static byte[] CompressBytes(byte[] data, int length)
{
byte[] buffer = new byte[length];
for (int i = 0; i < data.Length; i++)
{
for (int j = 0; j < length; j++)
{
if (i * length + j >= data.Length)
break;
buffer[j] ^= data[i * length + j];
}
}
return buffer;
}
Slightly off topic, but is this even a good idea? If I need an int, would I be better off with a different hash function, or should I just take the first 4 bytes of the MD5 because XORing them all wouldn't help any? Comments are welcome.

You can use the Enumerable.Aggregate extension method (not actually LINQ query syntax, but most people refer to the LINQ-related extension methods as LINQ) to perform a custom aggregate. For example, you could compute the total XOR of a list of bytes like this:
var xor = list.Aggregate((acc, val) => (byte)(acc ^ val));
You can create a virtually unreadable chain of extension method calls to do what you're after:
// Range(0, 4): a 16-byte MD5 hash has four 4-byte chunks, and ToInt32 needs four bytes
int integer = BitConverter.ToInt32(Enumerable.Range(0, 4).
Select(i => data.Skip(i * 4).Take(4).
Aggregate((acc, val) => (byte)(acc ^ val))).ToArray(), 0);
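Note that the chain above XORs the four bytes inside each chunk, whereas the loop in the question XORs byte j of every chunk into buffer[j]. A minimal sketch of the second behaviour with LINQ might look like the following; the class and method names (XorFold, Compress, ToInt32) are made up for illustration and are not from the original posts.

```csharp
using System;
using System.Linq;

public static class XorFold
{
    // output[j] is the XOR of every input byte whose index is congruent to j
    // modulo length, which is what the nested loop in the question computes.
    public static byte[] Compress(byte[] data, int length)
    {
        return Enumerable.Range(0, length)
            .Select(j => data
                .Where((b, index) => index % length == j)
                .Aggregate((byte)0, (acc, val) => (byte)(acc ^ val)))
            .ToArray();
    }

    // Folds a hash down to four bytes and reads them as a 32-bit int
    // (in the machine's native byte order, as BitConverter always does).
    public static int ToInt32(byte[] hash)
    {
        return BitConverter.ToInt32(Compress(hash, 4), 0);
    }
}
```

Passing a seed of (byte)0 to Aggregate also keeps the call from throwing when a column happens to be empty, for example when data is shorter than length.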

To address the "off topic" part, I'd suggest just lopping off the first 32 bits of the MD5 hash. Or consider a simpler non-crypto hash such as CRC32.
Like other cryptographic hashes, MD5 is supposed to appear as random as possible, so XOR'ing other bytes won't really make a difference, IMO.
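As a rough sketch of the "lop off the first 32 bits" suggestion, assuming the input is a string hashed as UTF-8 (the names Md5Prefix and First32Bits are invented for this example):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Prefix
{
    // Hashes the input with MD5 and interprets the first four bytes of the
    // digest as an int, using the machine's native byte order.
    public static int First32Bits(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            return BitConverter.ToInt32(hash, 0);
        }
    }
}
```

Because BitConverter follows the machine's byte order, the same digest would give a different int on a big-endian machine; that only matters if the values are stored or exchanged between machines.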

In case you need xor of 2 byte arrays: byteArray1.Select((x, i) => (byte)(x ^ byteArray2[i])).ToArray();

Related

Convert 2 successive Bytes to one int value Increase speed in C#

I need to combine two Bytes into one int value.
I receive a 16-bit image from my camera, where two successive bytes hold the intensity value of one pixel. My goal is to combine these two bytes into one "int" value.
I manage to do this using the following code:
for (int i = 0; i < VectorLength * 2; i = i + 2)
{
NewImageVector[ImagePointer] = ((int)(buffer.Array[i + 1]) << 8) | ((int)(buffer.Array[i]));
ImagePointer++;
}
My image is 1280*960, so VectorLength==1228800 and the incoming buffer size is 2*1228800=2457600 elements...
Is there any way that I can speed this up?
Maybe there is another way so I don't need to use a for-loop.
Thank you
You could use the C# equivalent of a C union. I'm not sure whether it is faster, but it is more elegant:
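// requires: using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
struct byte_array
{
[FieldOffset(0)]
public byte byte1; // low byte on a little-endian machine
[FieldOffset(1)]
public byte byte2; // high byte on a little-endian machine
[FieldOffset(0)]
public ushort int0; // overlaps both bytes; unsigned so values above 32767 stay positive
}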
Use it like this:
byte_array ba = new byte_array();
// insert the two bytes
ba.byte1 = (byte)(buffer.Array[i]);
ba.byte2 = (byte)(buffer.Array[i + 1]);
// read the combined 16-bit value (the field is declared as int0 in the struct)
NewImageVector[ImagePointer] = ba.int0;
You can fill in your two bytes and read the combined value. To find out which way is faster, use the Stopwatch class and time both versions like this:
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
//The code
stopWatch.Stop();
MessageBox.Show(stopWatch.ElapsedTicks.ToString()); //Or milliseconds ,...
Assuming you can (re-)define NewImageVector as a short[], and every two consecutive bytes in the input buffer should be transformed into a short (which is basically what you're doing now, only you cast to an int afterwards), you can use Buffer.BlockCopy to do it for you.
As the documentation states, Buffer.BlockCopy copies bytes from one array to another, so in order to copy the bytes in your buffer you need to do the following:
Buffer.BlockCopy(buffer.Array, 0, NewImageVector, 0, [NumberOfExpectedShorts] * 2);
This tells BlockCopy to copy bytes from buffer.Array, starting at byte offset 0, to NewImageVector, starting at byte offset 0, and to copy [NumberOfExpectedShorts] * 2 bytes (since every short is two bytes long).
No loops, but it does depend on being able to use a short[] array instead of an int[] array (and indeed, on using an array to begin with).
Note that this also requires the bytes in the buffer to be in the machine's native byte order, which is little-endian on typical hardware (i.e. buffer.Array[index] contains the low byte and buffer.Array[index + 1] the high byte).
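A self-contained sketch of the BlockCopy approach could look like this. It assumes the destination may be a ushort[] (so that intensities above 32767 do not become negative), and the names PixelUnpack and ToUInt16Array are hypothetical.

```csharp
using System;

public static class PixelUnpack
{
    // Reinterprets consecutive byte pairs as 16-bit values in the machine's
    // native byte order. BlockCopy takes byte offsets and a byte count.
    public static ushort[] ToUInt16Array(byte[] source)
    {
        if (source.Length % 2 != 0)
            throw new ArgumentException("Expected an even number of bytes.", nameof(source));

        var pixels = new ushort[source.Length / 2];
        Buffer.BlockCopy(source, 0, pixels, 0, source.Length);
        return pixels;
    }
}
```

On a little-endian machine, the bytes 0x34, 0x12 become the value 0x1234, which matches the shift-and-or loop in the question.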
You can achieve a small performance increase by using unsafe pointers to iterate the arrays. The following code assumes that source is the input byte array (buffer.Array in your case). It also assumes that source has an even number of elements. In production code you would obviously have to check these things.
int[] output = new int[source.Length / 2];
fixed (byte* pSource = source)
fixed (int* pDestination = output)
{
byte* sourceIterator = pSource;
int* destIterator = pDestination;
for (int i = 0; i < output.Length; i++)
{
(*destIterator) = ((*sourceIterator) | (*(sourceIterator + 1) << 8));
destIterator++;
sourceIterator += 2;
}
}
return output;

Is this the correct way to truncate a hash in C#?

I have a requirement to hash input strings and produce 14 digit decimal numbers as output.
The math I am using tells me I can have, at maximum, a 46 bit unsigned integer.
I am aware that a 46 bit uint means less collision resistance for any potential hash function. However, the number of hashes I am creating keeps the collision probability in an acceptable range.
I would be most grateful if the community could help me verify that my method for truncating a hash to 46 bits is solid. I have a gut feeling that there are optimizations and/or easier ways to do this. My function is as follows (where bitLength is 46 when this function is called):
public static UInt64 GetTruncatedMd5Hash(string input, int bitLength)
{
var md5Hash = MD5.Create();
byte[] fullHashBytes = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
var fullHashBits = new BitArray(fullHashBytes);
// BitArray stores LSB of each byte in lowest indexes, so reversing...
ReverseBitArray(fullHashBits);
// truncate by copying only number of bits specified by bitLength param
var truncatedHashBits = new BitArray(bitLength);
for (int i = 0; i < bitLength - 1; i++)
{
truncatedHashBits[i] = fullHashBits[i];
}
byte[] truncatedHashBytes = new byte[8];
truncatedHashBits.CopyTo(truncatedHashBytes, 0);
return BitConverter.ToUInt64(truncatedHashBytes, 0);
}
Thanks for taking a look at this question. I appreciate any feedback!
With the help of the comments above, I crafted the following solution:
public static UInt64 GetTruncatedMd5Hash(string input, int bitLength)
{
if (string.IsNullOrWhiteSpace(input)) throw new ArgumentException("input must not be null or whitespace");
if(bitLength > 64) throw new ArgumentException("bitLength must be <= 64");
var md5Hash = MD5.Create();
byte[] fullHashBytes = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
if(bitLength == 64)
return BitConverter.ToUInt64(fullHashBytes, 0);
var bitMask = (1UL << bitLength) - 1UL;
return BitConverter.ToUInt64(fullHashBytes, 0) & bitMask;
}
It's much tighter (and faster) than what I was trying to do before.
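Two details behind this solution may be worth spelling out. First, 2^46 = 70,368,744,177,664, which is below 10^14, so a 46-bit value always fits in 14 decimal digits. Second, the separate branch for bitLength == 64 is needed because C# only uses the low six bits of the shift count for a ulong, so 1UL << 64 evaluates to 1UL rather than 0. A small sketch of the mask on its own, with the hypothetical names BitMasks and LowBits:

```csharp
using System;

public static class BitMasks
{
    // Returns a mask with the lowest bitLength bits set (0 <= bitLength <= 64).
    public static ulong LowBits(int bitLength)
    {
        if (bitLength < 0 || bitLength > 64)
            throw new ArgumentOutOfRangeException(nameof(bitLength));

        // 1UL << 64 wraps around to 1UL, so 64 needs its own case.
        return bitLength == 64 ? ulong.MaxValue : (1UL << bitLength) - 1UL;
    }
}
```

The range check also rejects negative lengths, which the solution above would otherwise turn into an unexpected mask.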

How do I properly loop through and print bits of an Int, Long, Float, or BigInteger?

I'm trying to debug some bit shifting operations and I need to visualize the bits as they exist before and after a Bit-Shifting operation.
I read from this answer that I may need to handle backfill from the shifting, but I'm not sure what that means.
I think that by asking this question (how do I print the bits in a int) I can figure out what the backfill is, and perhaps some other questions I have.
Here is my sample code so far.
static string GetBits(int num)
{
StringBuilder sb = new StringBuilder();
uint bits = (uint)num;
while (bits!=0)
{
bits >>= 1;
isBitSet = // somehow do an | operation on the first bit.
// I'm unsure if it's possible to handle different data types here
// or if unsafe code and a PTR is needed
if (isBitSet)
sb.Append("1");
else
sb.Append("0");
}
}
Convert.ToString(56, 2).PadLeft(8, '0') returns "00111000".
This example is for a byte; it works for an int as well if you pad to 32 characters instead of 8.
To test if the last bit is set you could use:
isBitSet = ((bits & 1) == 1);
But you should do so before shifting right (not after), otherwise you'll miss the first bit:
isBitSet = ((bits & 1) == 1);
bits = bits >> 1;
But a better option would be to use the static methods of the BitConverter class to get the actual bytes used to represent the number in memory into a byte array. The advantage (or disadvantage depending on your needs) of this method is that this reflects the endianness of the machine running the code.
byte[] bytes = BitConverter.GetBytes(num);
int bitPos = 0;
while(bitPos < 8 * bytes.Length)
{
int byteIndex = bitPos / 8;
int offset = bitPos % 8;
bool isSet = (bytes[byteIndex] & (1 << offset)) != 0;
// isSet = [True] if the bit at bitPos is set, false otherwise
bitPos++;
}
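Putting these pieces together, a completed version of the GetBits method from the question might look like the sketch below (BitPrinter is an invented class name). It prints the most significant bit first and always emits 32 characters. On the "backfill" question: for a signed int, >> copies the sign bit into the vacated high bits, while for a uint it fills them with zeros, which is why the value is converted to uint before shifting.

```csharp
using System;
using System.Text;

public static class BitPrinter
{
    // Returns the 32 bits of num, most significant bit first.
    public static string GetBits(int num)
    {
        uint bits = unchecked((uint)num); // reinterpret so that >> fills with zeros
        var sb = new StringBuilder(32);
        for (int i = 31; i >= 0; i--)
        {
            sb.Append(((bits >> i) & 1) == 1 ? '1' : '0');
        }
        return sb.ToString();
    }
}
```

Comparing GetBits(-8 >> 1) with GetBits((int)((uint)-8 >> 1)) in a debugger is a quick way to see the difference between the two kinds of right shift.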

C to C# Bytearray + hex

I'm currently trying to get this C code converted into C#.
Since I'm not really familiar with C, I'd really appreciate your help!
static unsigned char byte_table[2080] = {0};
First off, a byte array gets declared but never filled, which I'm okay with.
BYTE* packet = //bytes come in here from a file
int unknownVal = 0;
int unknown_field0 = *(DWORD *)(packet + 0x08);
do
{
*((BYTE *)packet + i) ^= byte_table[(i + unknownVal) & 0x7FF];
++i;
}
while (i <= packet[0]);
But down here.. I really have no idea how to translate this into C#
BYTE = byte[] right?
DWORD = double?
but how can (packet + 0x08) be translated? How can I add a hex to a bytearray? Oo
I'd be happy about anything that helps! :)
In C, setting any set of memory to {0} will set the entire memory area to zeroes, if I'm not mistaken.
That bottom loop can be rewritten in a simpler, C# friendly fashion.
byte[] packet = arrayofcharsfromfile;
int field = packet[8]+(packet[9]<<8)+(packet[10]<<16)+(packet[11]<<24); //Assuming 32 bit little endian integer
int unknownval = 0;
int i = 0;
do //Why waste the newline? I don't know. Conventions are silly!
{
packet[i] ^= byte_table[(i+unknownval) & 0x7FF];
} while( ++i <= packet[0] );
field is set by taking the four bytes including and following index 8 and generating a 32 bit int from them.
In C, you can cast pointers to other types, as is done in your provided snippet. What they're doing is taking an array of bytes (each one 1/4 the size of a DWORD) and adding 8 to the pointer, which advances it by 8 bytes (since each element is one byte wide), and then treating that pointer as a DWORD pointer. In simpler terms, they're treating the byte array as a DWORD array and then taking index 2, as 8/4=2.
You can simulate this behavior in a safe fashion by stringing the bytes together with bitshifting and addition, as I demonstrated above. It's not as efficient and isn't as pretty, but it accomplishes the same thing, and in a platform agnostic way too. Not all platforms are little endian.
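To answer the type questions directly: BYTE is an unsigned 8-bit value, so BYTE* maps naturally to byte[]; DWORD is an unsigned 32-bit integer, so uint is the closest C# type (not double); and 0x08 is simply the number 8 written in hexadecimal, used here as a byte offset. A sketch of the DWORD read as a reusable helper, with hypothetical names PacketFields and ReadUInt32LittleEndian:

```csharp
using System;

public static class PacketFields
{
    // Reads a 32-bit little-endian unsigned value starting at offset,
    // independent of the byte order of the machine running the code.
    public static uint ReadUInt32LittleEndian(byte[] packet, int offset)
    {
        return unchecked((uint)(packet[offset]
            | (packet[offset + 1] << 8)
            | (packet[offset + 2] << 16)
            | (packet[offset + 3] << 24)));
    }
}
```

BitConverter.ToUInt32(packet, 8) gives the same result on a little-endian machine, since it reads the bytes in the machine's native order.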

How to set each bit in a byte array

How do I set each bit in the following byte array which has 21 bytes or 168 bits to either zero or one?
byte[] logonHours
Thank you very much
Well, to clear every bit to zero you can just use Array.Clear:
Array.Clear(logonHours, 0, logonHours.Length);
Setting each bit is slightly harder:
for (int i = 0; i < logonHours.Length; i++)
{
logonHours[i] = 0xff;
}
If you find yourself filling an array often, you could write an extension method:
public static void FillArray<T>(this T[] array, T value)
{
// TODO: Validation
for (int i = 0; i < array.Length; i++)
{
array[i] = value;
}
}
BitArray.SetAll:
System.Collections.BitArray a = new System.Collections.BitArray(logonHours);
a.SetAll(true);
Note that this copies the data from the byte array. It's not just a wrapper around it.
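Because the BitArray works on a copy, the result has to be written back to the byte array if the original is meant to change. One way to do that, sketched with the invented names LogonHoursBits and SetAllBits:

```csharp
using System.Collections;

public static class LogonHoursBits
{
    // Sets every bit in logonHours to the given value, in place.
    public static void SetAllBits(byte[] logonHours, bool value)
    {
        var bits = new BitArray(logonHours); // copies the bytes
        bits.SetAll(value);
        bits.CopyTo(logonHours, 0);          // writes the bits back into the array
    }
}
```

For a 21-byte array this touches all 168 bits; individual bits can be changed through the BitArray indexer before calling CopyTo.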
This may be more than you need, but ...
Usually when dealing with individual bits in any data type, I define a const for each bit position, then use the binary operators |, &, and ^.
i.e.
const byte bit1 = 1;
const byte bit2 = 2;
const byte bit3 = 4;
const byte bit4 = 8;
.
.
const byte bit8 = 128;
Then you can turn whatever bits you want on and off using the bit operations.
byte byTest = 0;
byTest = (byte)(byTest | bit4); // | yields an int in C#, so cast back to byte
would turn bit 4 on but leave the rest untouched.
You would use & (with an inverted mask, such as ~bit4) to turn bits off and ^ to toggle them, or combine the operators for more complex operations.
Obviously, since you only want to turn all bits on or off, you can just set each byte to 255 or 0. That turns all eight bits on or off at once.
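The operations described above can be sketched as small helpers. Bit positions here are zero-based (bit 0 is the least significant), unlike the bit1 to bit8 constants, and the class and method names are made up for the example.

```csharp
public static class BitOps
{
    public static byte SetBit(byte value, int bit)    => (byte)(value | (1 << bit));
    public static byte ClearBit(byte value, int bit)  => (byte)(value & ~(1 << bit));
    public static byte ToggleBit(byte value, int bit) => (byte)(value ^ (1 << bit));
    public static bool IsBitSet(byte value, int bit)  => (value & (1 << bit)) != 0;

    // Sets or clears one bit of a byte array, e.g. bit 0..167 of a 21-byte array.
    public static void SetBit(byte[] data, int bitIndex, bool on)
    {
        int byteIndex = bitIndex / 8;
        byte mask = (byte)(1 << (bitIndex % 8));
        if (on)
            data[byteIndex] |= mask;
        else
            data[byteIndex] &= unchecked((byte)~mask);
    }
}
```

The casts are needed because C# performs bitwise operators on int, not on byte; the compound assignments (|= and &=) do the narrowing implicitly.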
