Converting Java hash to C# and need help understanding the Java

I'm converting a Java library over to C# as I rewrite a legacy application and I need some assistance. I need to understand what this line in Java is doing:
sb.append(Integer.toHexString((b & 0xFF) | 0x100).substring(1,3))
and if this C# line is equivalent
result += (Convert.ToInt32(b).ToString("x2") + " ").Substring(1,3);
In both cases b is a byte from a SHA-1 hash that the code is looping through.
The Java part I don't understand is ((b & 0xFF) | 0x100). It looks like it's padding it?
Ordinarily I would compare the output from the Java application to what my C# is generating, but I am not in a position to do that right now (and it's frustrating me - trust me).

You don't need to change the original that drastically - the C# equivalent (assuming 'sb' is a StringBuilder) is just:
sb.Append(((b & 0xFF) | 0x100).ToString("x").Substring(1, 2));

b & 0xFF will mask the lowest byte. So whatever b is, you will get something between 0x00 and 0xFF.
The | 0x100 then sets the 9th bit of the resulting integer, no matter what it was before. So you'll have something between 0x0100 and 0x01FF.
From that string, the substring from index 1 to 3 is taken. It gives you the last two digits, which will be something between 00 and FF. The | 0x100 is a neat trick to make Integer.toHexString emit a leading zero for those two digits, which it otherwise wouldn't according to its javadoc.
If I remember correctly, your C# code does not do exactly the same thing. But I hope with this explanation you can build it up yourself :)
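For what it's worth, here is a minimal sketch of how the whole loop could look in C# (the method name ToHexLower and the sample input are my own, purely for illustration). The "x2" format produces the same zero-padded, two-digit lowercase output that the Java | 0x100 trick does:
using System;
using System.Security.Cryptography;
using System.Text;

static string ToHexLower(byte[] hash)
{
    var sb = new StringBuilder(hash.Length * 2);
    foreach (byte b in hash)
    {
        // "x2" always emits two lowercase hex digits, including a leading zero,
        // which is exactly what the Java ((b & 0xFF) | 0x100) trick achieves.
        sb.Append(b.ToString("x2"));
    }
    return sb.ToString();
}

// Illustrative usage:
// byte[] hash = SHA1.Create().ComputeHash(Encoding.UTF8.GetBytes("hello"));
// Console.WriteLine(ToHexLower(hash));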


Convert hashed value to a number for Time Based One Time Password (TOTP)

I've read the GitHub documentation for Otp.NET.
In one section there is this code:
protected internal long CalculateOtp(byte[] data, OtpHashMode mode)
{
    byte[] hmacComputedHash = this.secretKey.ComputeHmac(mode, data);

    // The RFC has a hard coded index 19 in this value.
    // This is the same thing but also accomodates SHA256 and SHA512
    // hmacComputedHash[19] => hmacComputedHash[hmacComputedHash.Length - 1]
    int offset = hmacComputedHash[hmacComputedHash.Length - 1] & 0x0F;

    return (hmacComputedHash[offset] & 0x7f) << 24
        | (hmacComputedHash[offset + 1] & 0xff) << 16
        | (hmacComputedHash[offset + 2] & 0xff) << 8
        | (hmacComputedHash[offset + 3] & 0xff) % 1000000;
}
I think the last part of the above method converts the hashed value to a number, but I don't understand the philosophy and the algorithm behind it.
1) What is the offset?
2) Why are some bytes ANDed with 0x0F or 0xFF?
3) Why does the last line take the remainder of division by 1000000?
Thanks
RFC 4226 specifies how the data is to be calculated from the HMAC value.
First, the bottom four bits of the last byte are used to determine a starting offset into the HMAC value. This was done so that even if an attacker found a weakness in some fixed portion of the HMAC output, it would be hard to leverage that directly into an attack.
Then, four bytes, big-endian, are read from the HMAC output starting at that offset. The top bit is cleared, to prevent any problems with negative numbers being mishandled, since some languages (e.g., Java) don't provide native unsigned numbers. Finally, the lower N digits are taken (which is typically 6, but sometimes 7 or 8). In this case, the implementation is hard-coded to 6, hence the modulo operation.
Note that due to operator precedence, the modulo binds more tightly than the bitwise-ors, so as written it applies only to the last byte (where it has no effect, since that value is at most 0xFF). The implementer has decided that they'd like to be clever and has not helped us out by adding an explicit pair of parentheses, but in the real world, it's nice to help the reader.
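For illustration only (this is not the Otp.NET source), here is a sketch of the RFC 4226 dynamic truncation with explicit parentheses, assuming a 6-digit code:
using System;

static int DynamicTruncate(byte[] hmac, int digits = 6)
{
    // The low 4 bits of the last HMAC byte choose where to start reading.
    int offset = hmac[hmac.Length - 1] & 0x0F;

    // Read 4 bytes big-endian from that offset; clear the top bit to avoid sign issues.
    int binary = ((hmac[offset]     & 0x7F) << 24)
               | ((hmac[offset + 1] & 0xFF) << 16)
               | ((hmac[offset + 2] & 0xFF) << 8)
               |  (hmac[offset + 3] & 0xFF);

    // Keep only the lower N decimal digits (1000000 for the usual 6 digits).
    return binary % (int)Math.Pow(10, digits);
}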

So how does this hexadecimal stuff work?

I'm doing some entry level programming challenges at codefights.com and I came across the following question. The link is to a blog that has the answer, but it includes the question in it as well. If only it had an explanation...
https://codefightssolver.wordpress.com/2016/10/19/swap-adjacent-bits/
My concern is with the line of code (it is the only line of code) below.
return (((n & 0x2AAAAAAA) >> 1) | ((n & 0x15555555) << 1)) ;
Specifically, I'm struggling to find some decent info on how the "0x2AAAAAAA" and "0x15555555" work, so I have a few dumb questions. I know they represent binary values of 10101010... and 01010101... respectively.
1. I've messed around a bit and found that the number of 5s and As loosely corresponds, as far as I can tell, to bit size, but how?
2. Why As? Why 5s?
3. Why the 2 and the 1 before the As and 5s?
4. Anything else I should know about this? Does anyone know a cool blog post or website that explains some of this in more detail?
0x2AAAAAAA is 00101010101010101010101010101010 in 32-bit binary,
0x15555555 is 00010101010101010101010101010101 in 32-bit binary.
Note that the problem specifies Constraints: 0 ≤ n < 2^30. For this reason the highest two bits are always 00.
The two hex numbers have been "built" starting from their binary representation, which has a particular property (described in the next paragraph).
Now... We can say that, given the constraint, x & 0x2AAAAAAA returns the even bits of x (if we count the bits as first, second, third..., the second bit is even), while x & 0x15555555 returns the odd bits of x. Using << 1 and >> 1 moves them by one position, and using | (or) re-merges them, as in the sketch below.
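A minimal sketch putting it together (the method name and the example are mine, not part of the challenge):
static int SwapAdjacentBits(int n)
{
    int even = n & 0x2AAAAAAA;   // the "even" bits in the 1-based counting above (0-based positions 1, 3, 5, ...)
    int odd  = n & 0x15555555;   // the "odd" bits (0-based positions 0, 2, 4, ...)
    return (even >> 1) | (odd << 1);
}

// Example: 13 is 1101 in binary; swapping each adjacent pair gives 1110, i.e. 14.
// Console.WriteLine(SwapAdjacentBits(13));   // prints 14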
0x2AAAAAAA is used to cover 30 bits, which matches the constraint:
Constraints:
0 ≤ n < 2^30.
0x15555555 also covers 30 bits, with its bits the opposite of the other number's.
I would start by entering the binary number (101010101010101010101010101010) into a programmer's calculator and switching to hex to see it in hexadecimal.
You can also write it as 0b101010101010101010101010101010, depending on the language.

tricky way to convert char to lowercase - and how to do the opposite

In this stackoverflow answer there is a piece of code to transform a char to lowercase:
// tricky way to convert to lowercase
sb.Append((char)(c | 32));
What is happening in (char)(c | 32) and how is it possible to do the opposite to transform to uppercase?
This is a cheap ASCII trick that only works for that particular encoding. It is not recommended. But to answer your question, the reverse operation involves masking instead of combining:
sb.Append((char)(c & ~32));
Here, you take the bitwise inverse of 32 and use bitwise-AND. That will force that single bit off and leave others unchanged.
The reason this works is because the ASCII character set is laid out such that the lower 5 bits are the same for upper- and lowercase characters, and only differ by the 6th bit (32, or 00100000b). When you use bitwise-OR, you add the bit in. When you mask with the inverse, you remove the bit.
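A quick sketch of all three directions (my illustration, ASCII letters only; the ^ toggle is an extra that isn't in the original answer):
char lower = (char)('G' | 32);    // 'g': set bit 5 (0x20)
char upper = (char)('g' & ~32);   // 'G': clear bit 5
char flip  = (char)('g' ^ 32);    // 'G': toggle bit 5, works in either direction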

When is a shift operator >> or << useful? [duplicate]

Possible Duplicate:
When to use Shift operators << >> in C# ?
I've been programming a while and I've never used the shift operator. I can see how it could be helpful for calculating hash codes, as in Tuple<T>, but other than that,
When and how is the shift operator useful in C#/.NET?
In general it's not used very often, but it's very useful when dealing with bit-level operations. For example, printing out the bits in a numeric value:
public static string GetBits(int value) {
    var builder = new StringBuilder();
    for (int i = 0; i < 32; i++) {
        var test = 1 << (31 - i);
        var isSet = 0 != (test & value);
        builder.Append(isSet ? '1' : '0');
    }
    return builder.ToString();
}
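For example (output shown as comments, my illustration):
// Console.WriteLine(GetBits(10));   // 00000000000000000000000000001010
// Console.WriteLine(GetBits(-1));   // 11111111111111111111111111111111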
It's useful to write powers of two.
Quick: What's 2^27?
Answer: 1 << 27
Writing 1 << 27 is both easier and more understandable than 134217728.
I use it rather a lot in dealing with hardware. This isn't something you probably do in C# a lot, but the operator was inherited from C/C++ where it is a fairly common need.
Example 1:
I just got a longword from a little-endian machine, but I'm big-endian. How do I convert it? Well, the obvious answer is to call htonl() (you cheater). One of the manual ways to do it is something like the following:
((source & 0x000000ff) << 24 ) |
((source & 0x0000ff00) << 8) |
((source & 0x00ff0000) >> 8) |
((source & 0xff000000) >> 24);
Example 2:
I have a DMA device that only allows longword accesses of up to 512K. So it requires me to put (for reasons only understood by hardware folks) the modulo 4 of the transfer size into the high-order 18 bits of a DMA transfer control register. For the sake of argument, the low-order bits are to be filled with various flags controlling the DMA operation. That would be accomplished like so:
dma_flags | ((length & 0xffffc) << 14);
These might not be the kind of things you do every day. But for those of us that regularly interface to hardware they are.
If you ever need to multiply without using *: How to implement multiplication without using multiplication operator in .NET :)
Or write a Sudoku solver: Sudoku validity check algorithm - how does this code works?
In practice, the only times I've seen it used in my (limited) experience were as an (arguably) confusing way to multiply (see the first link) or in conjunction with setting bit flags (the Sudoku solver above).
In .NET I rarely have to work at the bit level; but if you need to, being able to shift is important.
Bitwise operators are good for saving space, but nowadays, space is hardly an issue.
It's useful when multiplying by powers of 2:
number << power;
is number * 2^power.
And of course division by powers of 2 (for non-negative values; for negative numbers, >> is an arithmetic shift and doesn't round the same way integer division does):
number >> power;
Another place is flags in enums.
When you come across code like
Regex re = new Regex(".", RegexOptions.Multiline | RegexOptions.Singleline);
the ability to combine multiple flags, i.e. RegexOptions.Multiline | RegexOptions.Singleline, works because shifting gives each flag its own unique bit.
Something like:
enum RegexOptions {
    Multiline  = (1 << 0),
    Singleline = (1 << 1)
};
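Combining and testing such values is then plain bitwise-or and bitwise-and (a small sketch using the enum above, not the real System.Text.RegularExpressions values):
RegexOptions options = RegexOptions.Multiline | RegexOptions.Singleline;   // 0b01 | 0b10 == 0b11
bool hasMultiline = (options & RegexOptions.Multiline) != 0;               // true: test a single flag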
Bit shifts are used when manipulating individual bits is desired. You'll see a lot of bit shifts in many encryption algorithms, for example.
In optimization, it can be used in place of multiplication and division by powers of two: a shift left multiplies by two, a shift right divides by two. You probably don't see this done anymore, since this level of optimization is often unnecessary.
Other than that, I can't think of many reasons to use it. I've seen it used before, but rarely in cases where it was really required and usually a more readable method could have been used.
Whenever you need to multiply by 2 ;)
Really the only use I have is for interoperability code and bitfields:
http://www.codeproject.com/KB/cs/masksandflags.aspx

How to set endianness when converting to or from hex strings

To convert an integer to a hex formatted string I am using ToString("X4") like so:
int target = 250;
string hexString = target.ToString("X4");
To get an integer value from a hex formatted string I use the Parse method:
int answer = int.Parse(data, System.Globalization.NumberStyles.HexNumber);
However the machine that I'm exchanging data with puts the bytes in reverse order.
To stay with the sample data: if I want to send the value 250 I need the string "FA00" (not "00FA", which is what hexString contains). Likewise, if I receive "FA00" I need to convert that to 250, not 64000.
How do I set the endianness of these two conversion methods?
Marc's answer seems, by virtue of having been accepted, to have addressed the OP's original issue. However, it's not really clear to me from the question text why. That still seems to require swapping of bytes, not pairs of bytes as Marc's answer does. I'm not aware of any reasonably common scenario where swapping bits 16 at a time makes sense or is useful.
For the stated requirements, IMHO it would make more sense to write this:
int target = 250; // 0x00FA
// swap the bytes of target
target = ((target << 8) | (target >> 8)) & 0xFFFF;
// target now is 0xFA00
string hexString = target.ToString("X4");
Note that the above assumes we're actually dealing with 16-bit values, stored in a 32-bit int variable. It will handle any input in the 16-bit range (note the need to mask off the upper 16 bits, as they get set to non-zero values by the << operator).
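The receive direction works the same way under that 16-bit assumption: parse the big-endian string first, then swap the bytes (the values here just mirror the question's example):
int received = int.Parse("FA00", System.Globalization.NumberStyles.HexNumber); // 0xFA00 == 64000
received = ((received << 8) | (received >> 8)) & 0xFFFF;                       // 0x00FA == 250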
If swapping 32-bit values, one would need something like this:
int target = 250; // 0x00FA
// swap the bytes of target
target = (int)((int)((target << 24) & 0xff000000) |
               ((target << 8) & 0xff0000) |
               ((target >> 8) & 0xff00) |
               ((target >> 24) & 0xff));
// target now is 0xFA000000
string hexString = target.ToString("X8");
Again, masking is required to isolate the bits we are moving to specific positions. Casting the << 24 result back to int before or-ing with the other three bytes is needed because 0xff000000 is a uint (UInt32) literal and causes the & expression to be extended to long (Int64). Otherwise, you'll get compiler warnings with each of the | operators.
In any case, as this comes up most often in networking scenarios, it is worth noting that .NET provides helper methods that can assist with this operation: HostToNetworkOrder() and NetworkToHostOrder(). In this context, "network order" is always big-endian, and "host order" is whatever byte order is used on the computer hosting the current process.
If you know that you are receiving data that's big-endian, and you want to be able to interpret in as correct values in your process, you can call NetworkToHostOrder(). Likewise, if you need to send data in a context where big-endian is expected, you can call HostToNetworkOrder().
These methods work only with the three basic integer types: Int16, Int32, and Int64 (in C#, short, int, and long, respectively). They also return the same type passed to them, so you have to be careful about sign extension. The original example in the question could be solved like this:
int target = 250; // 0x00FA
// swap the bytes of target
target = IPAddress.HostToNetworkOrder((short)target) & 0xFFFF;
// target now is 0xFA00
string hexString = target.ToString("X4");
Once again, masking is required because otherwise the short value returned by the method will be sign-extended to 32 bits. If bit 15 (i.e. 0x8000) is set in the result, then the final int value would otherwise have its highest 16 bits set as well. This could be addressed without masking simply by using more appropriate data types for the variables (e.g. short when the data is expected to be signed 16-bit values).
Finally, I will note that the HostToNetworkOrder() and NetworkToHostOrder() methods, since they are only ever swapping bytes, are equivalent to each other. They both swap bytes, when the machine architecture is little-endian† . And indeed, the .NET implementation is simply for the NetworkToHostOrder() to call HostToNetworkOrder(). There are two methods mainly so that the .NET API matches the original BSD sockets API, which included functions like htons() and ntohs(), and that API in turn included functions for both directions of conversion mainly so that it was clear in code whether one was receiving data from the network or sending data to the network.
† And do nothing when the machine architecture is big-endian…they aren't useful as generalized byte-swapping functions. Rather, the expectation is that the network protocol will always be big-endian, and these functions are used to ensure the data bytes are swapped to match whatever the machine architecture is.
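For completeness, a sketch of the receive side using the same helper (assuming a little-endian host and 16-bit data, as in the example above):
short raw = short.Parse("FA00", System.Globalization.NumberStyles.HexNumber);  // bit pattern 0xFA00 (-1536 as a short)
int answer = IPAddress.NetworkToHostOrder(raw) & 0xFFFF;                       // 250 on a little-endian host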
That isn't a built-in option, so either do string work to swap the characters around, or do some bit-shifting, e.g.
int otherEndian = (int)(((uint)value << 16) | ((uint)value >> 16));
