How to extract two components from a hex string (reverse XOR) - C#

I need to generate a random 32-character hex string. I have code in place that will generate, for example, E272E28B8961FB155E3FC657831F0690
Now I need to break this down into two 32-character strings such that string 1 XOR string 2 = E272E28B8961FB155E3FC657831F0690
I'm having a hard time wrapping my head around that. I suppose I need a reverse XOR on this string. Any suggestions on how to accomplish this?

Assuming that you want two 32 Hexadecimal Character strings (which is equivalent to 16 bytes) which will XOR to a known 32 Hexadecimal Character string, you can use this method.
Generate random bytes for the first part of the output, then calculate what the second part has to be based on the first part and the expected output. XOR is its own inverse (XOR-ing with a fixed value is an involution), so if part1 XOR part2 = output, then part2 = part1 XOR output.
// Needs: using System; using System.Diagnostics; using System.Linq;
void q50415070()
{
    var random = new Random();
    var output = new byte[16];
    random.NextBytes(output);
    Debug.WriteLine(BitConverter.ToString(output));
    // 91-77-E9-2F-EC-F7-8E-CC-03-AF-37-FD-4F-6F-D2-4D
    var part1 = new byte[16];
    random.NextBytes(part1);
    Debug.WriteLine(BitConverter.ToString(part1));
    // 7A-9B-2B-8B-D7-CE-AA-7E-7E-C3-FE-FF-44-2A-21-3C
    var part2 = part1.Zip(output, (x, y) => (byte)(x ^ y)).ToArray();
    Debug.WriteLine(BitConverter.ToString(part2));
    // EB-EC-C2-A4-3B-39-24-B2-7D-6C-C9-02-0B-45-F3-71
}
In this, output is the result I'm trying to reach, and part1 and part2 are the two components that I want to be able to XOR together to get the expected output.
I've used the LINQ Zip method to combine the two IEnumerable<byte>s element by element, with the XOR operator ^ calculating the result byte by byte. Finally, ToArray() turns the result back into an array.
This technique is used often in cryptography where you want to split an encryption key into two parts for two people to have, each of which is useless by itself.
Edit: Tweaked the function slightly to more closely match your question:
void q50415070()
{
    var output = new byte[16] { 0xE2, 0x72, 0xE2, 0x8B, 0x89, 0x61, 0xFB, 0x15, 0x5E, 0x3F, 0xC6, 0x57, 0x83, 0x1F, 0x06, 0x90 };
    Debug.WriteLine(BitConverter.ToString(output));
    // E2-72-E2-8B-89-61-FB-15-5E-3F-C6-57-83-1F-06-90
    var random = new Random();
    var part1 = new byte[16];
    random.NextBytes(part1);
    Debug.WriteLine(BitConverter.ToString(part1));
    // 59-37-D0-A6-71-CC-6C-17-96-02-70-CE-A7-57-06-25
    var part2 = part1.Zip(output, (x, y) => (byte)(x ^ y)).ToArray();
    Debug.WriteLine(BitConverter.ToString(part2));
    // BB-45-32-2D-F8-AD-97-02-C8-3D-B6-99-24-48-00-B5
}
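Since the question works with hex strings rather than byte arrays, here is a minimal sketch that goes from string to string and checks that the two parts recombine. The class and method names (XorSplit, FromHex, ToHex, Split, Combine) are made up for illustration. It also swaps System.Random for RandomNumberGenerator, on the assumption that the parts are meant as key shares; System.Random is not a cryptographic generator.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class XorSplit
{
    // Parses a hex string such as "E272..." into bytes (two hex digits per byte).
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.", nameof(hex));
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // Formats bytes as an uppercase hex string without separators.
    public static string ToHex(byte[] bytes) =>
        BitConverter.ToString(bytes).Replace("-", "");

    // Returns two hex strings whose byte-wise XOR equals the target.
    public static (string Part1, string Part2) Split(string targetHex)
    {
        byte[] target = FromHex(targetHex);
        byte[] part1 = new byte[target.Length];
        // Assumption: a cryptographic RNG is wanted because the parts act as key shares.
        using (var rng = RandomNumberGenerator.Create())
            rng.GetBytes(part1);
        byte[] part2 = part1.Zip(target, (x, y) => (byte)(x ^ y)).ToArray();
        return (ToHex(part1), ToHex(part2));
    }

    // Recombines two parts by XOR-ing them byte by byte.
    public static string Combine(string hex1, string hex2) =>
        ToHex(FromHex(hex1).Zip(FromHex(hex2), (x, y) => (byte)(x ^ y)).ToArray());

    public static void Main()
    {
        const string target = "E272E28B8961FB155E3FC657831F0690";
        var (part1, part2) = Split(target);
        Console.WriteLine(part1);
        Console.WriteLine(part2);
        Console.WriteLine(Combine(part1, part2) == target); // True
    }
}
```

The two printed parts differ on every run; only their XOR is fixed.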
Hope this helps

Related

Calculating the mode of an array containing hex values?

I am programming in C#.
I've been experimenting with some code that calculates the mode of an array containing integers. i.e given {5,6,2,1,5} the mode is 5.
My question is, can this be done with hex values?
For example, let's say I had the following array:
byte[] hexValues = { 0x66, 0x60, 0xe7, 0xf0, 0x66 };
How could I go about writing a program that tells me that 0x66 is the mode?
I've thought about converting them to decimal values and finding the mode that way, but it seems inefficient.
Thanks
A hexadecimal value is just a representation of a numeric value. E.g. these are all representations of the same decimal value 102:
66 (Hex)
102 (Dec)
01100110 (Bin)
So just create an array of integer values written in hexadecimal format and do your calculations as with any other integer values:
var array = new[] { 0x66, 0x60, 0xe7, 0xf0, 0x66 };
var mode = array.GroupBy(x => x).OrderByDescending(g => g.Count()).First().Key;
Console.WriteLine($"{mode:X}"); // output int as hex
I suppose the mode of any collection is
theCollection
    .GroupBy(x => x)
    .OrderByDescending(x => x.Count())
    .Select(g => g.FirstOrDefault())
    .First()
"Hex numbers" are the same thing as integers. The difference is the way they are represented visually (in code, and when converted to strings). In binary they are exactly the same.
So whatever code you used before on
byte[] list = { 5, 6, 2, 1, 5 };
FindMode(list);
will work identically with
byte[] list = { 0x66, 0x60, 0xe7, 0xf0, 0x66 };
FindMode(list);
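If the efficiency concern from the question matters, one alternative to grouping is a counting array, since a byte has only 256 possible values. This is a sketch, not taken from any of the answers; the names ByteMode and Mode are hypothetical, and resolving ties toward the smallest value is an arbitrary choice.

```csharp
using System;

public static class ByteMode
{
    // Returns the most frequent byte; ties go to the smallest value.
    public static byte Mode(byte[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        // One counter per possible byte value.
        var counts = new int[256];
        foreach (byte b in values)
            counts[b]++;

        // Pick the value with the highest count.
        int best = 0;
        for (int v = 1; v < counts.Length; v++)
            if (counts[v] > counts[best])
                best = v;
        return (byte)best;
    }

    public static void Main()
    {
        byte[] hexValues = { 0x66, 0x60, 0xe7, 0xf0, 0x66 };
        Console.WriteLine($"0x{Mode(hexValues):X2}"); // 0x66
    }
}
```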

How to add multiple bytes and get a byte array?

Given a byte array
byte[] someBytes = { 0xFF, 0xFE, 0xFE, 0xFF, 0x11, 0x00, 0x00, 0x00, 0x00 };
What's the best way to add up all the bytes? Adding all of the bytes by hand as hex numbers yields 0x40B for the example above, so preferably I'd like to end up with something like:
byte[] byteSum = { 0x04, 0x0B };
Actually, all I really need is the 0x0B part (Used for checksum). Checksum is calculated by 0x0B XOR 0x55 (Which yields 0x5E) in this case.
I understand this isn't a normal addition of bytes, but this is how the checksum is calculated.
Manually looping through the byte array and adding them results in an integer sum.
What's the most concise way of doing this?
erm,
byte checksum = 0; // must be initialized before it is read in the loop
foreach (var b in someBytes)
{
    checksum = (byte)((checksum + b) & 0xff);
}
I'm not sure if I understand your question... But this is how I would do it:
byte sum = 0;
foreach (byte b in someBytes)
{
    unchecked
    {
        sum += b;
    }
}
For the array shown in the question this yields 0x0B, the low byte of the full sum 0x40B.
Using LINQ's sum and casting to byte in the end:
unchecked
{
    var checksum = (byte)(someBytes.Sum(b => (long)b) ^ 0x55);
}
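To tie the answers back to the numbers in the question, here is a small sketch that sums into an int, keeps the low byte, and applies the XOR with 0x55. The names Checksum and Compute are made up for illustration.

```csharp
using System;

public static class Checksum
{
    // Sums all bytes, keeps the low 8 bits of the sum, then XORs with 0x55.
    public static byte Compute(byte[] data)
    {
        int sum = 0;
        foreach (byte b in data)
            sum += b;
        return (byte)((sum & 0xFF) ^ 0x55);
    }

    public static void Main()
    {
        byte[] someBytes = { 0xFF, 0xFE, 0xFE, 0xFF, 0x11, 0x00, 0x00, 0x00, 0x00 };
        // Full sum is 0x40B, low byte 0x0B, and 0x0B ^ 0x55 = 0x5E.
        Console.WriteLine($"0x{Compute(someBytes):X2}"); // 0x5E
    }
}
```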

Incorrect value converting hexadecimal numbers to UInt C# [duplicate]

This question already has answers here:
Why does BinaryReader.ReadUInt32() reverse the bit pattern?
(6 answers)
Closed 9 years ago.
I am trying to read a binary file in C#, but I am facing a problem.
I declared the following:
public static readonly UInt32 NUMBER = 0XCAFEBABE;
Then while reading from the very beginning of the file I am asking to read the first 4 bytes (already tried different ways, but this is the simplest):
UInt32 num = in_.ReadUInt32(); // in_ is a BinaryReader
The first 4 bytes of the file are CA, FE, BA and BE (in hex), but when I convert them to a UInt32 I get a different value: NUMBER is 3405691582, while num is 3199925962.
I also tried to do this:
byte[] f2 = {0xCA, 0xFE, 0xBA, 0xBE};
and the result of doing BitConverter.ToUInt32(new byte[]{0xCA, 0xFE, 0xBA, 0xBE},0) is 3199925962.
Can anyone help me?
This is because of the little endianness of your machine. See BitConverter.IsLittleEndian property to check this.
Basically, numbers are stored in reverse byte order compared to how you would write them down. We write the most significant digit on the left, but the (little-endian) PC stores the least significant byte first. Thus, the result you're getting is really 0xBEBAFECA (3199925962 decimal) and not what you expected.
You can convert using bit shifting operations:
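// Cast to uint first: shifting a byte produces an int, which cannot be assigned to uint implicitly.
uint value = ((uint)f2[0] << 24) | ((uint)f2[1] << 16) | ((uint)f2[2] << 8) | f2[3];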
There are many more ways to convert, including IPAddress.NetworkToHostOrder as I4V pointed out, f2.Reverse(), etc.
For your specific code, I believe this would be most practical:
uint num = (uint)IPAddress.NetworkToHostOrder(in_.ReadInt32());
This may result in an arithmetic overflow, however, because the int can be negative before the cast to uint, so it may cause problems with the /checked compiler option or the checked keyword (neither is very common).
If you want to deal with these situations and get even cleaner code, wrap it in an extension method:
public static uint ReadUInt32NetworkOrder(this BinaryReader reader)
{
    unchecked
    {
        return (uint)IPAddress.NetworkToHostOrder(reader.ReadInt32());
    }
}
That's what is called byte order:
var result1 = BitConverter.ToUInt32(new byte[] { 0xCA, 0xFE, 0xBA, 0xBE }, 0);
//3199925962
var result2 = BitConverter.ToUInt32(new byte[] { 0xBE, 0xBA, 0xFE, 0xCA }, 0);
//3405691582
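On newer runtimes there is another option that none of the answers mention: BinaryPrimitives in System.Buffers.Binary (available from .NET Core 2.1, so this assumes you are not on the old .NET Framework without the System.Memory package). The sketch below wraps it in a hypothetical extension method named ReadUInt32BE.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class BigEndianReader
{
    // Reads 4 bytes and interprets them as a big-endian (network order) UInt32,
    // regardless of the endianness of the machine.
    public static uint ReadUInt32BE(this BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4);
        if (bytes.Length < 4)
            throw new EndOfStreamException();
        return BinaryPrimitives.ReadUInt32BigEndian(bytes);
    }

    public static void Main()
    {
        var data = new byte[] { 0xCA, 0xFE, 0xBA, 0xBE };
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            uint value = reader.ReadUInt32BE();
            Console.WriteLine($"0x{value:X8}"); // 0xCAFEBABE
        }
    }
}
```

Because no signed intermediate value is involved, this also sidesteps the checked/unchecked concern mentioned above.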

Integer to byte with given number of bits set

I don't know what to call this, which makes googling harder.
I have an integer, say 3, and want to convert it to 11100000, that is, a byte with that many bits set, starting from the most significant bit.
I guess it could be done with:
byte result = 0;
for (int i = 7; i > 7 - 3; i--)
    result |= (byte)(1 << i); // set bit i (in C#, ^ is XOR, not exponentiation)
but is there anything faster, nicer or, preferably, included in the .NET standard library?
int n = 3; // 0..8
int mask = 0xFF00;
byte result = (byte) (mask >> n);
Because there are only a few possibilities, you could just cache them:
// Each index adds another bit from the left, e.g. resultCache[3] == 11100000.
byte[] resultCache = { 0x00, 0x80, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC, 0xFE, 0xFF };
You'd also get an exception instead of a silent error if you accidentally tried to get the value for n > 8.
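As a quick sanity check, the shift-based answer and the lookup table can be put side by side. This is only a sketch; the names HighBits, FromShift and FromCache are hypothetical, and the explicit range check on the shift version is an addition so that both variants reject n > 8.

```csharp
using System;

public static class HighBits
{
    // Each index adds another bit from the left, e.g. Cache[3] == 11100000.
    private static readonly byte[] Cache = { 0x00, 0x80, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC, 0xFE, 0xFF };

    // Shifts eight set bits down into the low byte and truncates.
    public static byte FromShift(int n)
    {
        if (n < 0 || n > 8)
            throw new ArgumentOutOfRangeException(nameof(n));
        // unchecked: the truncating cast is intended, even if the project is compiled with /checked.
        return unchecked((byte)(0xFF00 >> n));
    }

    // Looks the value up; an invalid n throws IndexOutOfRangeException.
    public static byte FromCache(int n) => Cache[n];

    public static void Main()
    {
        Console.WriteLine(Convert.ToString(FromShift(3), 2).PadLeft(8, '0')); // 11100000
    }
}
```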

How would you get an array of Unicode code points from a .NET String?

I have a list of character range restrictions that I need to check a string against, but the char type in .NET is UTF-16 and therefore some characters become wacky (surrogate) pairs instead. Thus when enumerating all the char's in a string, I don't get the 32-bit Unicode code points and some comparisons with high values fail.
I understand Unicode well enough that I could parse the bytes myself if necessary, but I'm looking for a C#/.NET Framework BCL solution. So ...
How would you convert a string to an array (int[]) of 32-bit Unicode code points?
You are asking about code points. In UTF-16 (C#'s char) there are only two possibilities:
The character is from the Basic Multilingual Plane, and is encoded by a single code unit.
The character is outside the BMP, and is encoded using a surrogate high-low pair of code units.
Therefore, assuming the string is valid, this returns an array of code points for a given string:
public static int[] ToCodePoints(string str)
{
    if (str == null)
        throw new ArgumentNullException("str");
    var codePoints = new List<int>(str.Length);
    for (int i = 0; i < str.Length; i++)
    {
        codePoints.Add(Char.ConvertToUtf32(str, i));
        if (Char.IsHighSurrogate(str[i]))
            i += 1;
    }
    return codePoints.ToArray();
}
An example with a surrogate pair 🌀 and a composed character ñ:
ToCodePoints("\U0001F300 El Ni\u006E\u0303o"); // 🌀 El Niño
// { 0x1f300, 0x20, 0x45, 0x6c, 0x20, 0x4e, 0x69, 0x6e, 0x303, 0x6f } // 🌀 E l N i n ̃◌ o
Here's another example. These two code points represent a 32nd musical note with a staccato accent, both surrogate pairs:
ToCodePoints("\U0001D162\U0001D181"); // 𝅘𝅥𝅰𝆁
// { 0x1d162, 0x1d181 } // 𝅘𝅥𝅰 𝆁◌
When C-normalized, they are decomposed into a notehead, combining stem, combining flag and combining accent-staccato, all surrogate pairs:
ToCodePoints("\U0001D162\U0001D181".Normalize()); // 𝅘𝅥𝅰𝆁
// { 0x1d158, 0x1d165, 0x1d170, 0x1d181 } // 𝅘 𝅥 𝅰 𝆁◌
Note that leppie's solution is not correct. The question is about code points, not text elements. A text element is a combination of code points that together form a single grapheme. For example, in the example above, the ñ in the string is represented by a Latin lowercase n followed by a combining tilde ̃◌. Leppie's solution discards any combining characters that cannot be normalized into a single code point.
This answer is not correct. See @Virtlink's answer for the correct one.
static int[] ExtractScalars(string s)
{
    if (!s.IsNormalized())
    {
        s = s.Normalize();
    }
    List<int> chars = new List<int>((s.Length * 3) / 2);
    var ee = StringInfo.GetTextElementEnumerator(s);
    while (ee.MoveNext())
    {
        string e = ee.GetTextElement();
        chars.Add(char.ConvertToUtf32(e, 0));
    }
    return chars.ToArray();
}
Notes: Normalization is required to deal with composite characters.
Doesn't seem like it should be much more complicated than this:
public static IEnumerable<int> Utf32CodePoints(this IEnumerable<char> s)
{
    bool useBigEndian = !BitConverter.IsLittleEndian;
    Encoding utf32 = new UTF32Encoding(useBigEndian, false, true);
    // Encoding.GetBytes has no IEnumerable<char> overload, so materialize to char[] first (needs System.Linq).
    byte[] octets = utf32.GetBytes(s.ToArray());
    for (int i = 0; i < octets.Length; i += 4)
    {
        int codePoint = BitConverter.ToInt32(octets, i);
        yield return codePoint;
    }
}
I came up with the same approach suggested by Nicholas (and Jeppe), just shorter:
public static IEnumerable<int> GetCodePoints(this string s)
{
    var utf32 = new UTF32Encoding(!BitConverter.IsLittleEndian, false, true);
    var bytes = utf32.GetBytes(s);
    return Enumerable.Range(0, bytes.Length / 4).Select(i => BitConverter.ToInt32(bytes, i * 4));
}
The enumeration was all I needed, but getting an array is trivial:
int[] codePoints = myString.GetCodePoints().ToArray();
This solution produces the same results as the solution by Daniel A.A. Pelsmaeker but is a little bit shorter:
public static int[] ToCodePoints(string s)
{
    byte[] utf32bytes = Encoding.UTF32.GetBytes(s);
    int[] codepoints = new int[utf32bytes.Length / 4];
    Buffer.BlockCopy(utf32bytes, 0, codepoints, 0, utf32bytes.Length);
    return codepoints;
}
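For completeness, newer runtimes (.NET Core 3.0 and later, so not the .NET Framework BCL the question asks about) have a built-in way to walk code points: string.EnumerateRunes() and System.Text.Rune. A minimal sketch, with the class name CodePoints made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CodePoints
{
    // Returns one int per Unicode scalar value; surrogate pairs become a single entry.
    // Note: EnumerateRunes replaces an unpaired surrogate with U+FFFD instead of throwing.
    public static int[] ToCodePoints(string s)
    {
        if (s == null)
            throw new ArgumentNullException(nameof(s));
        var result = new List<int>(s.Length);
        foreach (Rune rune in s.EnumerateRunes())
            result.Add(rune.Value);
        return result.ToArray();
    }

    public static void Main()
    {
        int[] cps = ToCodePoints("\U0001F300 El Ni\u006E\u0303o");
        Console.WriteLine(string.Join(" ", cps.Select(cp => cp.ToString("X"))));
        // 1F300 20 45 6C 20 4E 69 6E 303 6F
    }
}
```

As with the Char.ConvertToUtf32 loop above, combining characters such as U+0303 stay separate entries, because this enumerates code points rather than text elements.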
