Rewrite from C# to Python

I am trying to rewrite part of my code from C# to Python, but I have run into problems with the bitwise operations.
Here is the C# code:
private string _generateConfirmationHashForTime(long time, string tag)
{
time = 1459152870;
byte[] decode = Convert.FromBase64String("TphBbTrbbVGJuXQ15OVZVZeBB9M=");
int n2 = 8;
if (tag != null)
{
if (tag.Length > 32)
{
n2 = 8 + 32;
}
else
{
n2 = 8 + tag.Length;
}
}
byte[] array = new byte[n2];
int n3 = 8;
while (true)
{
int n4 = n3 - 1;
if (n3 <= 0)
{
break;
}
array[n4] = (byte)time;
time >>= 8;
n3 = n4;
}
if (tag != null)
{
Array.Copy(Encoding.UTF8.GetBytes(tag), 0, array, 8, n2 - 8);
}
try
{
HMACSHA1 hmacGenerator = new HMACSHA1();
hmacGenerator.Key = decode;
byte[] hashedData = hmacGenerator.ComputeHash(array);
string encodedData = Convert.ToBase64String(hashedData, Base64FormattingOptions.None);
Console.WriteLine(encodedData);
return encodedData;
}
catch (Exception)
{
return null; //Fix soon: catch-all is BAD!
}
}
I rewrote it to Python:
def _generateConfirmationHashForTime(self, time, tag):
    time = 1459152870
    decode = base64.b64decode("TphBbTrbbVGJuXQ15OVZVZeBB9M=")
    n2 = 8
    if tag is not None:
        if len(tag) > 32:
            n2 = 8 + 32
        else:
            n2 = 8 + len(tag)
    arrayb = [hex(time >> i & 0xff) for i in (56, 48, 40, 32, 24, 16, 8, 0)]
    if tag is not None:
        for ch in range(0, len(tag)):
            arrayb.append(hex(ord(tag[ch])))
    arrayc = 0
    n4 = len(arrayb) - 1
    for i in range(0, len(arrayb)):
        arrayc <<= 8
        arrayc |= int(arrayb[n4], 16)
        n4 -= 1
    array_binary = binascii.a2b_hex("{:016x}".format(arrayc))
    hmacGenerator = hmac.new(decode, array_binary, hashlib.sha1)
    hashedData = hmacGenerator.digest()
    encodedData = base64.b64encode(hashedData)
    print encodedData
The resulting hashes are not equal: the encodedData values do not match :(
Can you point out where the error in the code might be?

OK, now I remember why I don't use Python. Language snarks aside though...
The C# code composes an array of bytes, 8 from the time variable (in big-endian form, MSB first) and up to 32 from the UTF8 encoding of the tag string... but limited by the length of the original string, ignoring multi-byte encoding. Not exactly ideal, but we can handle that.
The bytes from the time variable are simple enough:
arr = struct.pack(">Q", time)
For the tag string convert it to UTF8, then slice the first 32 bytes off and append it to the array:
arr += str(tag).encode("utf-8")[0:min(32, len(str(tag)))]
Up to here we're fine. I compared the base64 encoding of arr against the composed message in C# and they match for my test data, as does the resultant HMAC message digest.
Here's the full code:
import base64
import hashlib
import hmac
import struct

def _generateConfirmationHashForTime(time, tag):
    time = 1459152870
    decode = base64.b64decode("TphBbTrbbVGJuXQ15OVZVZeBB9M=")
    # 8-byte big-endian timestamp, then at most 32 bytes of the UTF-8 tag
    arr = struct.pack(">Q", time)
    arr += str(tag).encode("utf-8")[0:min(32, len(str(tag)))]
    hmacGenerator = hmac.new(decode, arr, hashlib.sha1)
    hashedData = hmacGenerator.digest()
    encodedData = base64.b64encode(hashedData)
    return encodedData
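For comparison, here is a minimal Python 3 sketch of the same message layout that also keeps the C# null check (passing None through str(tag) would otherwise hash the literal text "None"). The helper names build_message and confirmation_hash are made up for this illustration. It also makes the problem in the question's version easier to see: that loop walks arrayb from the last element to the first, so the bytes reach the HMAC in reverse order.

```python
import base64
import hashlib
import hmac
import struct


def build_message(timestamp, tag=None):
    """Return the 8-byte big-endian timestamp followed by the tag bytes.

    Mirrors the C# code: at most min(32, len(tag)) bytes of the UTF-8
    encoding are copied, and a missing tag contributes nothing.
    """
    message = struct.pack(">Q", timestamp)
    if tag is not None:
        message += tag.encode("utf-8")[:min(32, len(tag))]
    return message


def confirmation_hash(secret_b64, timestamp, tag=None):
    """HMAC-SHA1 over the message, returned as a Base64 text string."""
    key = base64.b64decode(secret_b64)
    digest = hmac.new(key, build_message(timestamp, tag), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Same key and timestamp as in the question; "conf" is an example tag.
    print(confirmation_hash("TphBbTrbbVGJuXQ15OVZVZeBB9M=", 1459152870, "conf"))
```

Printing build_message(...).hex() next to the C# array is a quick way to confirm that both sides hash the same bytes before comparing digests.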

Related

UUID V1 conversion from PHP to C# resulted wrong GUID

I am not familiar with PHP. I want to convert PHP code that implements UUID V1 to C#. I have tried many approaches, but they all fail. Which part of the code is wrong?
This C# code produces the wrong GUID fa570fa10-3b235-472b-500-1ebc212c87e0 for a node parameter of 138417599493834080 (the result varies with the current Unix time). When I change the Hex2Dec method as written here, it produces 393031383131343830-3234313135-3138323139-313238-1ebc212c87e0. I have run out of ideas about what is wrong. Please help me solve it.
public static function v1($node)
{
// nano second time (only micro second precision) since start of UTC
$time = microtime(true) * 10000000 + 0x01b21dd213814000;
$time = pack("H*", sprintf('%016x', $time));
$sequence = random_bytes(2);
$sequence[0] = chr(ord($sequence[0]) & 0x3f | 0x80); // variant bits 10x
$time[0] = chr(ord($time[0]) & 0x0f | 0x10); // version bits 0001
if (!empty($node)) {
// non hex string identifier
if (is_string($node) && preg_match('/[^a-f0-9]/is', $node)) {
// base node off md5 hash for sequence
$node = md5($node);
// set multicast bit not IEEE 802 MAC
$node = (hexdec(substr($node, 0, 2)) | 1) . substr($node, 2, 10);
}
if (is_numeric($node))
$node = sprintf('%012x', $node);
$len = strlen($node);
if ($len > 12)
$node = substr($node, 0, 12);
else if ($len < 12)
$node .= str_repeat('0', 12 - $len);
} else {
// base node off random sequence
$node = random_bytes(6);
// set multicast bit not IEEE 802 MAC
$node[0] = chr(ord($node[0]) | 1);
$node = bin2hex($node);
}
return bin2hex($time[4] . $time[5] . $time[6] . $time[7]) // time low
. '-' . bin2hex($time[2] . $time[3]) // time med
. '-' . bin2hex($time[0] . $time[1]) // time hi
. '-' . bin2hex($sequence) // seq
. '-' . $node; // node
}
This is the C# part
public static string MD5(this string input)
{
// Use input string to calculate MD5 hash
using (System.Security.Cryptography.MD5 crypto = System.Security.Cryptography.MD5.Create())
{
byte[] hashBytes = crypto.ComputeHash(Encoding.ASCII.GetBytes(input));
StringBuilder sb = new StringBuilder();
for (int i = 0; i < hashBytes.Length; i++)
sb.Append(hashBytes[i].ToString("x2"));
return sb.ToString();
}
}
public static string GenerateGuidV1(string node)
{
var xtime = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds() * 10000000 + 0x01b21dd213814000;
var time = Pack(xtime.ToString("x"));
var sequence = new byte[2];
sequence[0] = (byte)((char)sequence[0] & 0x3f | 0x80); // variant bits 10x
time[0] = (byte)((char)time[0] & 0x0f | 0x10); // version bits 0001
if (!string.IsNullOrWhiteSpace(node))
{
// non hex string identifier
if (!IsNumeric(node) && Regex.IsMatch(node, "/[^a-f0-9]/is", RegexOptions.IgnoreCase))
//if (preg_match('/[^a-f0-9]/is', $node))
{
// base node off md5 hash for sequence
//$node = md5($node);
node = node.MD5();
// set multicast bit not IEEE 802 MAC
//$node = (hexdec(substr($node, 0, 2)) | 1) . substr($node, 2, 10);
node = (int.Parse(node.Substring(0, 2), NumberStyles.HexNumber) | 1) + node.Substring(2, 10);
}
if (IsNumeric(node))
node = Convert.ToInt64(node).ToString("x"); //sprintf('%012x', $node);
var len = node.Length;
if (len > 12)
node = node.Substring(0, 12); //substr($node, 0, 12);
else if (len < 12)
node += string.Concat(Enumerable.Repeat("0", 12 - len));//str_repeat('0', 12 - $len);
}
else
{
// base node off random sequence
var seqNode = new byte[6];//$node = random_bytes(6);
// set multicast bit not IEEE 802 MAC
seqNode[0] = (byte)((char)node[0] | 1);//$node[0] = chr(ord($node[0]) | 1);
node = Convert.ToInt16(seqNode[0].ToString(), 2).ToString("x");//bin2hex($node);
}
return Bin2Hex(time[4].ToString() + time[5].ToString() + time[6].ToString() + time[7].ToString()) // time low
+ '-'+ Bin2Hex(time[2].ToString() + time[3].ToString()) // time med
+ '-'+ Bin2Hex(time[0].ToString() + time[1].ToString()) // time hi
+ '-'+ Bin2Hex(sequence[0].ToString() + sequence[1].ToString()) // seq
+ '-'+ node; // node
}
private static string Bin2Hex(string value)
{
return Convert.ToInt64(value).ToString("x");
//byte[] bytes = Encoding.GetEncoding(1252).GetBytes(value);
//string hexString = "";
//for (int ii = 0; ii < bytes.Length; ii++)
//{
// hexString += bytes[ii].ToString("x2");
//}
//return hexString;
}
private static byte[] Pack(string hex)
{
hex = hex.Replace("-", "");
byte[] raw = new byte[hex.Length / 2];
for (int i = 0; i < raw.Length; i++)
{
raw[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
}
return raw;
}
private static bool IsNumeric(string value) => value.All(char.IsNumber);
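As a reference point for checking a port, the field layout of the PHP function can be sketched in Python 3. This is only an illustration under stated assumptions: the name uuid_v1_fields is hypothetical, the node is taken as a ready-made 12-character hex string, and the timestamp and sequence are passed in so the output is reproducible. Comparing against it suggests three places where the C# version departs from the PHP: it multiplies milliseconds (not seconds) by 10000000, it formats the time with "x" instead of a zero-padded 16-digit hex string, and its Bin2Hex parses concatenated decimal byte values rather than hex-encoding the bytes.

```python
UUID_EPOCH_OFFSET = 0x01B21DD213814000  # constant taken from the PHP code


def uuid_v1_fields(timestamp_100ns, sequence, node_hex):
    """Lay out the fields the way the PHP v1() function does.

    timestamp_100ns: 100-nanosecond count, offset already added
    sequence:        two raw bytes (PHP uses random_bytes(2))
    node_hex:        12 hex characters, assumed already normalised
    """
    t = bytearray(timestamp_100ns.to_bytes(8, "big"))  # like pack("H*", '%016x')
    t[0] = (t[0] & 0x0F) | 0x10                        # version bits 0001
    seq = bytearray(sequence)
    seq[0] = (seq[0] & 0x3F) | 0x80                    # variant bits 10x
    return "-".join([
        t[4:8].hex(),   # time low
        t[2:4].hex(),   # time mid
        t[0:2].hex(),   # time hi and version
        seq.hex(),      # clock sequence
        node_hex,       # node
    ])


if __name__ == "__main__":
    # Unix epoch expressed as a UUID timestamp, zero sequence, zero node.
    print(uuid_v1_fields(UUID_EPOCH_OFFSET, b"\x00\x00", "0" * 12))
```

Each group has a fixed width (8, 4, 4, 4 and 12 hex digits), which is a cheap sanity check for any port.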

AES GCM porting from python to C#

I am trying to port the AES GCM implementation from the Python OpenTLS project to C# (.NET). Below is the code from OpenTLS:
#######################
### Galois Counter Mode
#######################
class AES_GCM:
    def __init__(self, keys, key_size, hash):
        key_size //= 8
        hash_size = hash.digest_size
        self.client_AES_key = keys[0 : key_size]
        self.server_AES_key = keys[key_size : 2*key_size]
        self.client_IV = keys[2*key_size : 2*key_size+4]
        self.server_IV = keys[2*key_size+4 : 2*key_size+8]
        self.H_client = bytes_to_int(AES.new(self.client_AES_key, AES.MODE_ECB).encrypt('\x00'*16))
        self.H_server = bytes_to_int(AES.new(self.server_AES_key, AES.MODE_ECB).encrypt('\x00'*16))
    def GF_mult(self, x, y):
        product = 0
        for i in range(127, -1, -1):
            product ^= x * ((y >> i) & 1)
            x = (x >> 1) ^ ((x & 1) * 0xE1000000000000000000000000000000)
        return product
    def H_mult(self, H, val):
        product = 0
        for i in range(16):
            product ^= self.GF_mult(H, (val & 0xFF) << (8 * i))
            val >>= 8
        return product
    def GHASH(self, H, A, C):
        C_len = len(C)
        A_padded = bytes_to_int(A + b'\x00' * (16 - len(A) % 16))
        if C_len % 16 != 0:
            C += b'\x00' * (16 - C_len % 16)
        tag = self.H_mult(H, A_padded)
        for i in range(0, len(C) // 16):
            tag ^= bytes_to_int(C[i*16:i*16+16])
            tag = self.H_mult(H, tag)
        tag ^= bytes_to_int(nb_to_n_bytes(8*len(A), 8) + nb_to_n_bytes(8*C_len, 8))
        tag = self.H_mult(H, tag)
        return tag
    def decrypt(self, ciphertext, seq_num, content_type, debug=False):
        iv = self.server_IV + ciphertext[0:8]
        counter = Counter.new(nbits=32, prefix=iv, initial_value=2, allow_wraparound=False)
        cipher = AES.new(self.server_AES_key, AES.MODE_CTR, counter=counter)
        plaintext = cipher.decrypt(ciphertext[8:-16])
        # Computing the tag is actually pretty time consuming
        if debug:
            auth_data = nb_to_n_bytes(seq_num, 8) + nb_to_n_bytes(content_type, 1) + TLS_VERSION + nb_to_n_bytes(len(ciphertext)-8-16, 2)
            auth_tag = self.GHASH(self.H_server, auth_data, ciphertext[8:-16])
            auth_tag ^= bytes_to_int(AES.new(self.server_AES_key, AES.MODE_ECB).encrypt(iv + '\x00'*3 + '\x01'))
            auth_tag = nb_to_bytes(auth_tag)
            print('Auth tag (from server): ' + bytes_to_hex(ciphertext[-16:]))
            print('Auth tag (from client): ' + bytes_to_hex(auth_tag))
        return plaintext
    def encrypt(self, plaintext, seq_num, content_type):
        iv = self.client_IV + os.urandom(8)
        # Encrypts the plaintext
        plaintext_size = len(plaintext)
        counter = Counter.new(nbits=32, prefix=iv, initial_value=2, allow_wraparound=False)
        cipher = AES.new(self.client_AES_key, AES.MODE_CTR, counter=counter)
        ciphertext = cipher.encrypt(plaintext)
        # Compute the Authentication Tag
        auth_data = nb_to_n_bytes(seq_num, 8) + nb_to_n_bytes(content_type, 1) + TLS_VERSION + nb_to_n_bytes(plaintext_size, 2)
        auth_tag = self.GHASH(self.H_client, auth_data, ciphertext)
        auth_tag ^= bytes_to_int(AES.new(self.client_AES_key, AES.MODE_ECB).encrypt(iv + b'\x00'*3 + b'\x01'))
        auth_tag = nb_to_bytes(auth_tag)
        # print('Auth key: ' + bytes_to_hex(nb_to_bytes(self.H)))
        # print('IV: ' + bytes_to_hex(iv))
        # print('Key: ' + bytes_to_hex(self.client_AES_key))
        # print('Plaintext: ' + bytes_to_hex(plaintext))
        # print('Ciphertext: ' + bytes_to_hex(ciphertext))
        # print('Auth tag: ' + bytes_to_hex(auth_tag))
        return iv[4:] + ciphertext + auth_tag
An attempt to translate this to C# code is below (sorry for the amateurish code, I am a newbie):
EDIT:
Created an array which got values from GetBytes, and printed the result:
byte[] incr = BitConverter.GetBytes((int) 2);
cf.printBuf(incr, (String) "Array:");
return;
Noticed that the result was "02 00 00 00". Hence I guess my machine is little endian
Made some changes to the code as rodrigogq mentioned. Below is the latest code. It is still not working:
Verified that GHASH, GF_mult and H_mult are giving same results. Below is the verification code:
Python:
key = "\xab\xcd\xab\xcd"
key = key * 10
h = "\x00\x00"
a = AES_GCM(key, 128, h)
H = 200
A = "\x02" * 95
C = "\x02" * 95
D = a.GHASH(H, A, C)
print(D)
C#:
BigInteger H = new BigInteger(200);
byte[] A = new byte[95];
byte[] C = new byte[95];
for (int i = 0; i < 95; i ++)
{
A[i] = 2;
C[i] = 2;
}
BigInteger a = e.GHASH(H, A, C);
Console.WriteLine(a);
Results:
For both: 129209628709014910494696220101529767594
EDIT: Now the outputs agree between Python and C#, so the porting is essentially done :) However, these outputs still don't agree with Wireshark, so the handshake is still failing. Maybe something is wrong with the procedure or the contents. Below is the working code.
EDIT: Finally managed to get the code working. Below is the code that resulted in a successful handshake
Working Code:
/*
* Receiving seqNum as UInt64 and content_type as byte
*
*/
public byte[] AES_Encrypt_GCM(byte[] client_write_key, byte[] client_write_iv, byte[] plaintext, UInt64 seqNum, byte content_type)
{
int plaintext_size = plaintext.Length;
List<byte> temp = new List<byte>();
byte[] init_bytes = new byte[16];
Array.Clear(init_bytes, 0, 16);
byte[] encrypted = AES_Encrypt_ECB(init_bytes, client_write_key, 128);
Array.Reverse(encrypted);
BigInteger H_client = new BigInteger(encrypted);
if (H_client < 0)
{
temp.Clear();
temp.TrimExcess();
temp.AddRange(H_client.ToByteArray());
temp.Add(0);
H_client = new BigInteger(temp.ToArray());
}
Random rnd = new Random();
byte[] random = new byte[8];
rnd.NextBytes(random);
/*
* incr is little endian, but it needs to be in big endian format
*
*/
byte[] incr = BitConverter.GetBytes((int) 2);
Array.Reverse(incr);
/*
* Counter = First 4 bytes of IV + 8 Random bytes + 4 bytes of sequential value (starting at 2)
*
*/
temp.Clear();
temp.TrimExcess();
temp.AddRange(client_write_iv);
temp.AddRange(random);
byte[] iv = temp.ToArray();
temp.AddRange(incr);
byte[] counter = temp.ToArray();
AES_CTR aesctr = new AES_CTR(counter);
ICryptoTransform ctrenc = aesctr.CreateEncryptor(client_write_key, null);
byte[] ctext = ctrenc.TransformFinalBlock(plaintext, 0, plaintext_size);
byte[] seq_num = BitConverter.GetBytes(seqNum);
/*
* Using UInt16 instead of short
*
*/
byte[] tls_version = BitConverter.GetBytes((UInt16) 771);
Console.WriteLine("Plain Text size = {0}", plaintext_size);
byte[] plaintext_size_array = BitConverter.GetBytes((UInt16) plaintext_size);
/*
* Size was returned as 10 00 instead of 00 10
*
*/
Array.Reverse(plaintext_size_array);
temp.Clear();
temp.TrimExcess();
temp.AddRange(seq_num);
temp.Add(content_type);
temp.AddRange(tls_version);
temp.AddRange(plaintext_size_array);
byte[] auth_data = temp.ToArray();
BigInteger auth_tag = GHASH(H_client, auth_data, ctext);
Console.WriteLine("H = {0}", H_client);
this.printBuf(plaintext, "plaintext = ");
this.printBuf(auth_data, "A = ");
this.printBuf(ctext, "C = ");
this.printBuf(client_write_key, "client_AES_key = ");
this.printBuf(iv.ToArray(), "iv = ");
Console.WriteLine("Auth Tag just after GHASH: {0}", auth_tag);
AesCryptoServiceProvider aes2 = new AesCryptoServiceProvider();
aes2.Key = client_write_key;
aes2.Mode = CipherMode.ECB;
aes2.Padding = PaddingMode.None;
aes2.KeySize = 128;
ICryptoTransform transform1 = aes2.CreateEncryptor();
byte[] cval = {0, 0, 0, 1};
temp.Clear();
temp.TrimExcess();
temp.AddRange(iv);
temp.AddRange(cval);
byte[] encrypted1 = AES_Encrypt_ECB(temp.ToArray(), client_write_key, 128);
Array.Reverse(encrypted1);
BigInteger nenc = new BigInteger(encrypted1);
if (nenc < 0)
{
temp.Clear();
temp.TrimExcess();
temp.AddRange(nenc.ToByteArray());
temp.Add(0);
nenc = new BigInteger(temp.ToArray());
}
this.printBuf(nenc.ToByteArray(), "NENC = ");
Console.WriteLine("NENC: {0}", nenc);
auth_tag ^= nenc;
byte[] auth_tag_array = auth_tag.ToByteArray();
Array.Reverse(auth_tag_array);
this.printBuf(auth_tag_array, "Final Auth Tag Byte Array: ");
Console.WriteLine("Final Auth Tag: {0}", auth_tag);
this.printBuf(random, "Random sent = ");
temp.Clear();
temp.TrimExcess();
temp.AddRange(random);
temp.AddRange(ctext);
temp.AddRange(auth_tag_array);
return temp.ToArray();
}
public void printBuf(byte[] data, String heading)
{
int numBytes = 0;
Console.Write(heading + "\"");
if (data == null)
{
return;
}
foreach (byte element in data)
{
Console.Write("\\x{0}", element.ToString("X2"));
numBytes = numBytes + 1;
if (numBytes == 32)
{
Console.Write("\r\n");
numBytes = 0;
}
}
Console.Write("\"\r\n");
}
public BigInteger GF_mult(BigInteger x, BigInteger y)
{
BigInteger product = new BigInteger(0);
BigInteger e10 = BigInteger.Parse("00E1000000000000000000000000000000", NumberStyles.AllowHexSpecifier);
/*
* Below operation y >> i fails if i is UInt32, so leaving it as int
*
*/
int i = 127;
while (i != -1)
{
product = product ^ (x * ((y >> i) & 1));
x = (x >> 1) ^ ((x & 1) * e10);
i = i - 1;
}
return product;
}
public BigInteger H_mult(BigInteger H, BigInteger val)
{
BigInteger product = new BigInteger(0);
int i = 0;
/*
* Below operation (val & 0xFF) << (8 * i) fails if i is UInt32, so leaving it as int
*
*/
while (i < 16)
{
product = product ^ GF_mult(H, (val & 0xFF) << (8 * i));
val = val >> 8;
i = i + 1;
}
return product;
}
public BigInteger GHASH(BigInteger H, byte[] A, byte[] C)
{
int C_len = C.Length;
List <byte> temp = new List<byte>();
int plen = 16 - (A.Length % 16);
byte[] zeroes = new byte[plen];
Array.Clear(zeroes, 0, zeroes.Length);
temp.AddRange(A);
temp.AddRange(zeroes);
temp.Reverse();
BigInteger A_padded = new BigInteger(temp.ToArray());
temp.Clear();
temp.TrimExcess();
byte[] C1;
if ((C_len % 16) != 0)
{
plen = 16 - (C_len % 16);
byte[] zeroes1 = new byte[plen];
Array.Clear(zeroes1, 0, zeroes1.Length);
temp.AddRange(C);
temp.AddRange(zeroes1);
C1 = temp.ToArray();
}
else
{
C1 = new byte[C.Length];
Array.Copy(C, 0, C1, 0, C.Length);
}
temp.Clear();
temp.TrimExcess();
BigInteger tag = new BigInteger();
tag = H_mult(H, A_padded);
this.printBuf(H.ToByteArray(), "H Byte Array:");
for (int i = 0; i < (int) (C1.Length / 16); i ++)
{
byte[] toTake;
if (i == 0)
{
toTake = C1.Take(16).ToArray();
}
else
{
toTake = C1.Skip(i * 16).Take(16).ToArray();
}
Array.Reverse(toTake);
BigInteger tempNum = new BigInteger(toTake);
tag ^= tempNum;
tag = H_mult(H, tag);
}
byte[] A_arr = BitConverter.GetBytes((long) (8 * A.Length));
/*
* Want length to be "00 00 00 00 00 00 00 xy" format
*
*/
Array.Reverse(A_arr);
byte[] C_arr = BitConverter.GetBytes((long) (8 * C_len));
/*
* Want length to be "00 00 00 00 00 00 00 xy" format
*
*/
Array.Reverse(C_arr);
temp.AddRange(A_arr);
temp.AddRange(C_arr);
temp.Reverse();
BigInteger array_int = new BigInteger(temp.ToArray());
tag = tag ^ array_int;
tag = H_mult(H, tag);
return tag;
}
Using SSL decryption in wireshark (using private key), I found that:
The nonce calculated by the C# code is same as that in wireshark (fixed part is client_write_IV and variable part is 8 bytes random)
The value of AAD (auth_data above) (client_write_key, seqNum + ctype + tls_version + plaintext_size) is matching with wireshark value
Cipher text (ctext above) (the C in GHASH(H, A, C)), is also matching the wireshark calculated value
However, the auth_tag calculation (GHASH(H_client, auth_data, ctext)) is failing. It would be great if someone could point out what could be wrong in the GHASH function. I did a basic comparison of the results of the GF_mult function in Python and C#, and those do not match either.
This is not a final solution, just some advice. I see that you use the function BitConverter.GetBytes a lot, and int instead of Int32 or Int16.
The remarks from the official documentation says:
The order of bytes in the array returned by the GetBytes method
depends on whether the computer architecture is little-endian or
big-endian.
The BigInteger structure, on the other hand, appears to always expect little-endian order:
value
Type: System.Byte[]
An array of byte values in little-endian order.
Prefer Int32 and Int16, and pay attention to the byte order before using the values in these calculations.
Use log4net to log all the operations. It would be useful to add the same logging to the Python program so that you can compare the two and see exactly where the calculations diverge.
Hope this gives you some tips on where to start.
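The byte-order and sign issues described above can be reproduced in a few lines of Python 3. This is a sketch, not part of either code base. It assumes that the OpenTLS helper bytes_to_int reads its input as big-endian, which the Array.Reverse calls in the working C# code suggest. Passing signed=True to int.from_bytes stands in for the C# BigInteger constructor, which treats the top bit of the last byte as a sign bit.

```python
# BitConverter.GetBytes((int) 2) printed "02 00 00 00" on the asker's machine,
# i.e. little-endian; the TLS record fields need big-endian.
little = (2).to_bytes(4, "little")
big = (2).to_bytes(4, "big")

# A 16-byte block whose first byte has the top bit set.
block = bytes([0x80]) + bytes(15)

# Big-endian, unsigned reading (what bytes_to_int is assumed to do).
as_python = int.from_bytes(block, "big")

# Reversing the block gives the little-endian order BigInteger expects,
# but the 0x80 byte now sits last and reads as a sign bit.
reversed_block = block[::-1]
signed_value = int.from_bytes(reversed_block, "little", signed=True)

# Appending a zero byte keeps the value positive, like temp.Add(0) in the C#.
padded_value = int.from_bytes(reversed_block + b"\x00", "little", signed=True)

if __name__ == "__main__":
    print(little.hex(), big.hex())
    print(signed_value < 0, padded_value == as_python)
```

The same check explains the `if (H_client < 0)` and `if (nenc < 0)` branches in the working code: they repair values whose leading byte is 0x80 or higher.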

Convert C++ function to C#

I am trying to port the following C++ function to C#:
QString Engine::FDigest(const QString & input)
{
if(input.size() != 32) return "";
int idx[] = {0xe, 0x3, 0x6, 0x8, 0x2},
mul[] = {2, 2, 5, 4, 3},
add[] = {0x0, 0xd, 0x10, 0xb, 0x5},
a, m, i, t, v;
QString b;
char tmp[2] = { 0, 0 };
for(int j = 0; j <= 4; j++)
{
a = add[j];
m = mul[j];
i = idx[j];
tmp[0] = input[i].toAscii();
t = a + (int)(strtol(tmp, NULL, 16));
v = (int)(strtol(input.mid(t, 2).toLocal8Bit(), NULL, 16));
snprintf(tmp, 2, "%x", (v * m) % 0x10);
b += tmp;
}
return b;
}
Some of this code is easy to port however I'm having problems with this part:
tmp[0] = input[i].toAscii();
t = a + (int)(strtol(tmp, NULL, 16));
v = (int)(strtol(input.mid(t, 2).toLocal8Bit(), NULL, 16));
snprintf(tmp, 2, "%x", (v * m) % 0x10);
I have found that (int)strtol(tmp, NULL, 16) equals int.Parse(tmp, "x") in C# and snprintf is String.Format, however I'm not sure about the rest of it.
How can I port this fragment to C#?
Edit: I have a suspicion that your code actually computes an MD5 digest of the input data.
See below for a snippet based on that assumption.
Translation steps
A few hints that should work well [1]:
Q: tmp[0] = input[i].toAscii();
byte[] ascii = System.Text.Encoding.ASCII.GetBytes(input);
tmp[0] = ascii[i];
Q: t = a + (int)(strtol(tmp, NULL, 16));
t = a + int.Parse(string.Format("{0}{1}", tmp[0], tmp[1]),
System.Globalization.NumberStyles.HexNumber);
Q: v = (int)(strtol(input.mid(t, 2).toLocal8Bit(), NULL, 16));
No clue about the toLocal8bit, would need to read Qt documentation...
Q: snprintf(tmp, 2, "%x", (v * m) % 0x10);
{
string tmptext = ((v*m % 16)).ToString("X2");
tmp[0] = tmptext[0];
tmp[1] = tmptext[1];
}
What if ... it's just MD5?
You could try this directly to see whether it achieves what you need:
using System;
using System.Security.Cryptography;
using System.Text;
public string FDigest(string input)
{
MD5 md5 = System.Security.Cryptography.MD5.Create();
byte[] ascii = System.Text.Encoding.ASCII.GetBytes (input);
byte[] hash = md5.ComputeHash (ascii);
// Convert the byte array to hexadecimal string
StringBuilder sb = new StringBuilder();
for (int i = 0; i < hash.Length; i++)
sb.Append (hash[i].ToString ("X2")); // "x2" for lowercase
return sb.ToString();
}
[1] Explicitly not optimized, intended as quick hints; optimize as necessary.
A few more hints:
tmp is a two-byte buffer and you only ever write to the first byte, leaving a trailing nul. So tmp is always a string of exactly one character, and you're processing a hex number one character at a time. So I think
tmp[0] = input[i].toAscii();
t = a + (int)(strtol(tmp, NULL, 16));
this is roughly int t = a + Convert.ToInt32(input.Substring(i, 1), 16); - take one digit from input and add its hex value to a, which you've looked up from a table. (I'm assuming that the toAscii is simply there to map the QString character, which is already a hex digit, into ASCII for strtol, so if you have a string of hex digits already this is OK.)
Next
v = (int)(strtol(input.mid(t, 2).toLocal8Bit(), NULL, 16));
this means look up two characters from input at offset t, i.e. input.Substring(t, 2), then convert these to a hex integer again: v = Convert.ToInt32(input.Substring(t, 2), 16); Now, as it happens, I think you'll only actually use the second digit here anyway, since the calculation is (v * m) % 0x10, but hey. If again we're working with a QString of hex digits then toLocal8Bit ought to be the same conversion as toAscii - I'm not clear why your code has two different functions here.
Finally convert these values to a single digit in tmp, then append that to b
snprintf(tmp, 2, "%x", (v * m) % 0x10);
b += tmp;
(2 is the length of the buffer, and since we need a trailing nul only 1 is ever written) i.e.
int digit = (v * m) % 0x10;
b += digit.ToString("x");
should do. I'd personally write the mod 16 as a logical and, & 0xf, since it's intended to strip the value down to a single digit.
Note also that in your code i is never set - I guess that's a loop or something you omitted for brevity?
So, in summary
int t = a + Convert.ToInt32(input.Substring(i, 1), 16);
int v = Convert.ToInt32(input.Substring(t, 2), 16);
int nextDigit = (v * m) & 0xf;
b += nextDigit.ToString("x");
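Putting the four summary lines back into the loop from the question gives the whole function. The Python 3 sketch below is one way to check a C# port against known inputs; the name fdigest is chosen for this example, and it assumes the input consists only of hex digits (strtol would silently return 0 for anything else, whereas int(..., 16) raises an error).

```python
def fdigest(text):
    """Port of the C++ FDigest loop, assuming a 32-character hex string."""
    if len(text) != 32:
        return ""
    idx = [0xE, 0x3, 0x6, 0x8, 0x2]
    mul = [2, 2, 5, 4, 3]
    add = [0x0, 0xD, 0x10, 0xB, 0x5]
    out = []
    for a, m, i in zip(add, mul, idx):
        t = a + int(text[i], 16)          # one hex digit picks the offset
        v = int(text[t:t + 2], 16)        # two hex digits read at that offset
        out.append(format((v * m) & 0xF, "x"))  # keep a single hex digit
    return "".join(out)


if __name__ == "__main__":
    print(fdigest("0123456789abcdef0123456789abcdef"))  # prints e2308
```

Feeding the same 32-character string to the C# version and comparing the five output digits is a quick regression test for the port.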

How to convert a byte array (MD5 hash) into a string (36 chars)?

I've got a byte array that was created using a hash function. I would like to convert this array into a string. So far so good: that gives me a hexadecimal string.
Now I would like to use something other than hexadecimal characters: I would like to encode the byte array with these 36 characters: [a-z][0-9].
How would I go about it?
Edit: the reason I would like to do this is that I want a shorter string than the hexadecimal one.
I adapted my arbitrary-length base conversion function from this answer to C#:
static string BaseConvert(string number, int fromBase, int toBase)
{
var digits = "0123456789abcdefghijklmnopqrstuvwxyz";
var length = number.Length;
var result = string.Empty;
var nibbles = number.Select(c => digits.IndexOf(c)).ToList();
int newlen;
do {
var value = 0;
newlen = 0;
for (var i = 0; i < length; ++i) {
value = value * fromBase + nibbles[i];
if (value >= toBase) {
if (newlen == nibbles.Count) {
nibbles.Add(0);
}
nibbles[newlen++] = value / toBase;
value %= toBase;
}
else if (newlen > 0) {
if (newlen == nibbles.Count) {
nibbles.Add(0);
}
nibbles[newlen++] = 0;
}
}
length = newlen;
result = digits[value] + result; //
}
while (newlen != 0);
return result;
}
As it's coming from PHP it might not be too idiomatic C#, there are also no parameter validity checks. However, you can feed it a hex-encoded string and it will work just fine with
var result = BaseConvert(hexEncoded, 16, 36);
It's not exactly what you asked for, but encoding the byte[] into hex is trivial.
See it in action.
Earlier tonight I came across a codereview question revolving around the same algorithm being discussed here. See: https://codereview.stackexchange.com/questions/14084/base-36-encoding-of-a-byte-array/
I provided an improved implementation of one of its earlier answers (both use BigInteger). See: https://codereview.stackexchange.com/a/20014/20654. The solution takes a byte[] and returns a Base36 string. Both the original and mine include simple benchmark information.
For completeness, the following is the method to decode a byte[] from a string. I'll include the encode function from the link above as well. See the text after this code block for some simple benchmark info for decoding.
const int kByteBitCount= 8; // number of bits in a byte
// constants that we use in FromBase36String and ToBase36String
const string kBase36Digits= "0123456789abcdefghijklmnopqrstuvwxyz";
static readonly double kBase36CharsLengthDivisor= Math.Log(kBase36Digits.Length, 2);
static readonly BigInteger kBigInt36= new BigInteger(36);
// assumes the input 'chars' is in big-endian ordering, MSB->LSB
static byte[] FromBase36String(string chars)
{
var bi= new BigInteger();
for (int x= 0; x < chars.Length; x++)
{
int i= kBase36Digits.IndexOf(chars[x]);
if (i < 0) return null; // invalid character
bi *= kBigInt36;
bi += i;
}
return bi.ToByteArray();
}
// characters returned are in big-endian ordering, MSB->LSB
static string ToBase36String(byte[] bytes)
{
// Estimate the result's length so we don't waste time realloc'ing
int result_length= (int)
Math.Ceiling(bytes.Length * kByteBitCount / kBase36CharsLengthDivisor);
// We use a List so we don't have to CopyTo a StringBuilder's characters
// to a char[], only to then Array.Reverse it later
var result= new System.Collections.Generic.List<char>(result_length);
var dividend= new BigInteger(bytes);
// IsZero's computation is less complex than evaluating "dividend > 0"
// which invokes BigInteger.CompareTo(BigInteger)
while (!dividend.IsZero)
{
BigInteger remainder;
dividend= BigInteger.DivRem(dividend, kBigInt36, out remainder);
int digit_index= Math.Abs((int)remainder);
result.Add(kBase36Digits[digit_index]);
}
// orientate the characters in big-endian ordering
result.Reverse();
// ToArray will also trim the excess chars used in length prediction
return new string(result.ToArray());
}
"A test 1234. Made slightly larger!" encodes to Base36 as "165kkoorqxin775ct82ist5ysteekll7kaqlcnnu6mfe7ag7e63b5"
To decode that Base36 string 1,000,000 times takes 12.6558909 seconds on my machine (I used the same build and machine conditions as provided in my answer on codereview)
You mentioned that you were dealing with a byte[] for the MD5 hash, rather than a hexadecimal string representation of it, so I think this solution provides the least overhead for you.
If you want a shorter string and can accept [a-zA-Z0-9] and + and / then look at Convert.ToBase64String
Using BigInteger (needs the System.Numerics reference)
const string chars = "0123456789abcdefghijklmnopqrstuvwxyz";
// The result is padded with chars[0] to make the string length
// (int)Math.Ceiling(bytes.Length * 8 / Math.Log(chars.Length, 2))
// (so that for any value [0...0]-[255...255] of bytes the resulting
// string will have same length)
public static string ToBaseN(byte[] bytes, string chars, bool littleEndian = true, int len = -1)
{
if (bytes.Length == 0 || len == 0)
{
return String.Empty;
}
// BigInteger saves in the last byte the sign. > 7F negative,
// <= 7F positive.
// If we have a "negative" number, we will prepend a 0 byte.
byte[] bytes2;
if (littleEndian)
{
if (bytes[bytes.Length - 1] <= 0x7F)
{
bytes2 = bytes;
}
else
{
// Note that Array.Resize doesn't modify the original array,
// but creates a copy and sets the passed reference to the
// new array
bytes2 = bytes;
Array.Resize(ref bytes2, bytes.Length + 1);
}
}
else
{
bytes2 = new byte[bytes[0] > 0x7F ? bytes.Length + 1 : bytes.Length];
// We copy and reverse the array
for (int i = bytes.Length - 1, j = 0; i >= 0; i--, j++)
{
bytes2[j] = bytes[i];
}
}
BigInteger bi = new BigInteger(bytes2);
// A little optimization. We will do many divisions based on
// chars.Length .
BigInteger length = chars.Length;
// We pre-calc the length of the string. We know the bits of
// "information" of a byte are 8. Using Log2 we calc the bits of
// information of our new base.
if (len == -1)
{
len = (int)Math.Ceiling(bytes.Length * 8 / Math.Log(chars.Length, 2));
}
// We will build our string on a char[]
var chs = new char[len];
int chsIndex = 0;
while (bi > 0)
{
BigInteger remainder;
bi = BigInteger.DivRem(bi, length, out remainder);
chs[littleEndian ? chsIndex : len - chsIndex - 1] = chars[(int)remainder];
chsIndex++;
if (chsIndex < 0)
{
if (bi > 0)
{
throw new OverflowException();
}
}
}
// We append the zeros that we skipped at the beginning
if (littleEndian)
{
while (chsIndex < len)
{
chs[chsIndex] = chars[0];
chsIndex++;
}
}
else
{
while (chsIndex < len)
{
chs[len - chsIndex - 1] = chars[0];
chsIndex++;
}
}
return new string(chs);
}
public static byte[] FromBaseN(string str, string chars, bool littleEndian = true, int len = -1)
{
if (str.Length == 0 || len == 0)
{
return new byte[0];
}
// This should be the maximum length of the byte[] array. It's
// the opposite of the one used in ToBaseN.
// Note that it can be passed as a parameter
if (len == -1)
{
len = (int)Math.Ceiling(str.Length * Math.Log(chars.Length, 2) / 8);
}
BigInteger bi = BigInteger.Zero;
BigInteger length2 = chars.Length;
BigInteger mult = BigInteger.One;
for (int j = 0; j < str.Length; j++)
{
int ix = chars.IndexOf(littleEndian ? str[j] : str[str.Length - j - 1]);
// We didn't find the character
if (ix == -1)
{
throw new ArgumentOutOfRangeException();
}
bi += ix * mult;
mult *= length2;
}
var bytes = bi.ToByteArray();
int len2 = bytes.Length;
// BigInteger adds a 0 byte for positive numbers that have the
// last byte > 0x7F
if (len2 >= 2 && bytes[len2 - 1] == 0)
{
len2--;
}
int len3 = Math.Min(len, len2);
byte[] bytes2;
if (littleEndian)
{
if (len == bytes.Length)
{
bytes2 = bytes;
}
else
{
bytes2 = new byte[len];
Array.Copy(bytes, bytes2, len3);
}
}
else
{
bytes2 = new byte[len];
for (int i = 0; i < len3; i++)
{
bytes2[len - i - 1] = bytes[i];
}
}
for (int i = len3; i < len2; i++)
{
if (bytes[i] != 0)
{
throw new OverflowException();
}
}
return bytes2;
}
Be aware that they are REALLY slow! REALLY REALLY slow! (2 minutes for 100k). To speed them up you would probably need to rewrite the division/mod operation so that they work directly on a buffer, instead of each time recreating the scratch pads as it's done by BigInteger. And it would still be SLOW. The problem is that the time needed to encode the first byte is O(n) where n is the length of the byte array (this because all the array needs to be divided by 36). Unless you want to work with blocks of 5 bytes and lose some bits. Each symbol of Base36 carries around 5.169925001 bits. So 8 of these symbols would carry 41.35940001 bits. Very near 40 bytes.
Note that these methods can work in both little-endian and big-endian mode. The endianness of the input and of the output is the same. Both methods accept a len parameter. You can use it to trim excess zeroes. Note that if you ask for an output that is too small to contain the input, an OverflowException will be thrown.
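The block idea can be sketched roughly as follows. This is only an illustration in Python, not a port of the C# methods: the names encode_blocks and decode_blocks, the alphabet and the big-endian treatment of each block are assumptions made for the example. Since 36^8 is larger than 2^40, 8 symbols are enough for 5 bytes, and because each block is converted on its own the total cost grows linearly with the input.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
BLOCK_BYTES = 5  # 5 bytes = 40 bits, which fits into 8 base-36 symbols


def symbols_for(nbytes):
    """Smallest symbol count whose base-36 range covers nbytes bytes."""
    count = 0
    while 36 ** count < 256 ** nbytes:
        count += 1
    return count


# Symbols needed per block length, and the reverse lookup for decoding.
SYMBOLS = {n: symbols_for(n) for n in range(1, BLOCK_BYTES + 1)}
BYTES = {symbols: n for n, symbols in SYMBOLS.items()}


def encode_blocks(data):
    out = []
    for start in range(0, len(data), BLOCK_BYTES):
        chunk = data[start:start + BLOCK_BYTES]
        value = int.from_bytes(chunk, "big")
        digits = []
        # Fixed number of symbols per block, so leading zeroes survive.
        for _ in range(SYMBOLS[len(chunk)]):
            value, rem = divmod(value, 36)
            digits.append(ALPHABET[rem])
        out.append("".join(reversed(digits)))
    return "".join(out)


def decode_blocks(text):
    out = bytearray()
    block_symbols = SYMBOLS[BLOCK_BYTES]
    for start in range(0, len(text), block_symbols):
        chunk = text[start:start + block_symbols]
        # A KeyError here means the last block has an impossible length.
        out += int(chunk, 36).to_bytes(BYTES[len(chunk)], "big")
    return bytes(out)
```

A shorter final block uses fewer symbols (2, 4, 5 or 7 for 1 to 4 bytes), which is what lets the decoder recover the original length without storing it separately.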
System.Text.Encoding enc = System.Text.Encoding.ASCII;
string myString = enc.GetString(myByteArray);
You can play with what encoding you need:
System.Text.ASCIIEncoding,
System.Text.UnicodeEncoding,
System.Text.UTF7Encoding,
System.Text.UTF8Encoding
To match the requirement [a-z][0-9] you can use this:
Byte[] bytes = new Byte[] { 200, 180, 34 };
string result = String.Join("a", bytes.Select(x => x.ToString()).ToArray());
You will get a string representation of the bytes with a character separator. To convert back, you will need to split on the separator and convert the string[] to byte[] using the same approach with .Select().
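For comparison, the same separator idea including the way back might look like this in Python. It is a minimal sketch; the function names and the SEPARATOR constant are made up for the example.

```python
SEPARATOR = "a"  # any character that cannot appear in a decimal number


def bytes_to_text(data):
    # Each byte becomes its decimal representation, joined by the separator.
    return SEPARATOR.join(str(b) for b in data)


def text_to_bytes(text):
    if not text:
        return b""  # an empty string stands for an empty array
    # bytes() rejects values outside 0..255 with a ValueError.
    return bytes(int(part) for part in text.split(SEPARATOR))


print(bytes_to_text(bytes([200, 180, 34])))
```

The output is up to four characters per byte, so this is simple rather than compact.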
Usually a power of 2 is used as the alphabet size, so that one character maps to a fixed number of bits. An alphabet of 32 characters, for instance, would map each character to 5 bits. The only challenge in that case is how to deserialize variable-length strings.
For 36 characters you could treat the data as a large number, and then:
divide by 36
add the remainder as character to your result
repeat until the division results in 0
Easier said than done perhaps.
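The steps above can be sketched in Python, where integers are arbitrary precision. This is only an illustration under some assumptions: the bytes are read as one big-endian number, the names to_base36 and from_base36 are hypothetical, and the decoder takes the expected length as a parameter because leading zero bytes are not preserved by the number itself.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(data):
    value = int.from_bytes(data, "big")
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value > 0:
        # The remainder is the next (least significant) base-36 digit.
        value, rem = divmod(value, 36)
        digits.append(ALPHABET[rem])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


def from_base36(text, length):
    # Raises OverflowError if the number does not fit into `length` bytes.
    return int(text, 36).to_bytes(length, "big")


print(to_base36(b"\x00\xc8"))
```

Each division touches the whole number, so the cost grows quadratically with the input length.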
You can use modulo.
This example encodes your byte array to a string of [0-9][a-z].
Change it if you want.
public string byteToString(byte[] byteArr)
{
int i;
char[] charArr = new char[byteArr.Length];
for (i = 0; i < byteArr.Length; i++)
{
int byt = byteArr[i] % 36; // 36 = number of available characters
if (byt < 10)
{
charArr[i] = (char)(byt + 48); //if % result is a digit
}
else
{
charArr[i] = (char)(byt + 87); //if % result is a letter
}
}
return new String(charArr);
}
If you don't want to lose data, so that the string can be decoded again, you can use this example:
public string byteToString(byte[] byteArr)
{
int i;
char[] charArr = new char[byteArr.Length*2];
for (i = 0; i < byteArr.Length; i++)
{
charArr[2 * i] = (char)((int)byteArr[i] / 36+48);
int byt = byteArr[i] % 36; // 36 = number of available characters
if (byt < 10)
{
charArr[2*i+1] = (char)(byt + 48); //if % result is a digit
}
else
{
charArr[2*i+1] = (char)(byt + 87); //if % result is a letter
}
}
return new String(charArr);
}
Now you have a string of double length, where the first character of each pair is the quotient of the division by 36 and the second character is the remainder. For example: 200 = 36*5 + 20 => "5k".
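The answer only shows the encoding direction. A rough Python sketch of the same two-characters-per-byte scheme with a matching decoder could look like this; the function names are made up for the example.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def byte_pairs_encode(data):
    out = []
    for b in data:
        quotient, remainder = divmod(b, 36)
        out.append(DIGITS[quotient])   # always 0..7 for a byte
        out.append(DIGITS[remainder])  # 0..35
    return "".join(out)


def byte_pairs_decode(text):
    if len(text) % 2 != 0:
        raise ValueError("expected an even number of characters")
    # bytes() raises ValueError if a pair decodes to more than 255.
    return bytes(
        DIGITS.index(text[i]) * 36 + DIGITS.index(text[i + 1])
        for i in range(0, len(text), 2)
    )


print(byte_pairs_encode(bytes([200])))
```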

Dotnet Hex string to Java

Have a problem, much like this post: How to read a .NET Guid into a Java UUID.
Except, from a remote svc I get a hex str formatted like this: ABCDEFGH-IJKL-MNOP-QRST-123456.
I need to match the byte array that .NET's Guid.ToByteArray() generates, GH-EF-CD-AB-KL-IJ-OP-MN-QR-ST-12-34-56, in Java for hashing purposes.
I'm kinda at a loss as to how to parse this. Do I cut off the QRST-123456 part and perhaps use something like the Commons IO EndianUtils on the other part, then stitch the 2 arrays back together as well? Seems way too complicated.
I can rearrange the string, but I shouldn't have to do any of this. Mr. Google doesn't want to help me either.
BTW, what is the logic in Little Endian land that keeps those last 6 char unchanged?
Yes, for reference, here's what I've done {sorry for 'answer', but had trouble formatting it properly in comment}:
String s = "3C0EA2F3-B3A0-8FB0-23F0-9F36DEAA3F7E";
String[] splitz = s.split("-");
String rebuilt = "";
for (int i = 0; i < 3; i++) {
// Split into 2 char chunks. '..' = nbr of chars in chunks
String[] parts = splitz[i].split("(?<=\\G..)");
for (int k = parts.length -1; k >=0; k--) {
rebuilt += parts[k];
}
}
rebuilt += splitz[3]+splitz[4];
I know, it's hacky, but it'll do for testing.
Make it into a byte[] and skip the first 3 bytes:
package guid;
import java.util.Arrays;
public class GuidConvert {
static byte[] convertUuidToBytes(String guid) {
String hexdigits = guid.replaceAll("-", "");
byte[] bytes = new byte[hexdigits.length()/2];
for (int i = 0; i < bytes.length; i++) {
int x = Integer.parseInt(hexdigits.substring(i*2, (i+1)*2), 16);
bytes[i] = (byte) x;
}
return bytes;
}
static String bytesToHexString(byte[] bytes) {
StringBuilder buf = new StringBuilder();
for (byte b : bytes) {
int i = b >= 0 ? b : (int) b + 256;
buf.append(Integer.toHexString(i / 16));
buf.append(Integer.toHexString(i % 16));
}
return buf.toString();
}
public static void main(String[] args) {
String guid = "3C0EA2F3-B3A0-8FB0-23F0-9F36DEAA3F7E";
byte[] bytes = convertUuidToBytes(guid);
System.err.println("GUID = "+ guid);
System.err.println("bytes = "+ bytesToHexString(bytes));
byte[] tail = Arrays.copyOfRange(bytes, 3, bytes.length);
System.err.println("tail = "+ bytesToHexString(tail));
}
}
The last two groups (8 bytes in total) are not reversed because they are stored as an array of bytes. The first three groups are reversed because they are a four-byte integer followed by two two-byte integers, and each integer is written in little-endian order.
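To make the byte order concrete, here is a small Python sketch (not the Java that was asked for) that rearranges the hex string into the layout Guid.ToByteArray() produces: the 4-byte group and the two 2-byte groups are reversed, and the remaining 8 bytes are copied as they are. The function name dotnet_guid_bytes is hypothetical. The standard library's uuid.UUID.bytes_le uses the same layout and can serve as a cross-check.

```python
import uuid


def dotnet_guid_bytes(guid_text):
    raw = bytes.fromhex(guid_text.replace("-", ""))
    if len(raw) != 16:
        raise ValueError("a GUID needs exactly 32 hex digits")
    return (
        raw[3::-1]     # 4-byte integer, bytes 3,2,1,0
        + raw[5:3:-1]  # 2-byte integer, bytes 5,4
        + raw[7:5:-1]  # 2-byte integer, bytes 7,6
        + raw[8:]      # 8-byte array, unchanged
    )


guid = "3C0EA2F3-B3A0-8FB0-23F0-9F36DEAA3F7E"
print(dotnet_guid_bytes(guid).hex())
# Same layout via the standard library:
print(uuid.UUID(guid).bytes_le.hex())
```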
