Byte order: converting Java bytes to C#

While converting a Java application to C# I came across a strange and very annoying piece of code, which is crucial and works in the original version.
byte[] buf = new byte[length];
byte[] buf2 = bout.toByteArray();
System.arraycopy(buf2, 0, buf, 0, buf2.length);
for (int i = (int) offset; i < (int) length; ++i) {
    buf[i] = (byte) 255;
}
The part causing a casting error is the assignment of the byte 255 into buf[i]: while in Java it works fine, since java.lang.Byte spans from 0 to 255, .NET's System.Byte spans from 0 to 254.
Because of this limitation, the C# version of the application ends up with a buffer full of 254s instead of the expected 255s.
Could anyone give me a viable alternative?
Thank you very much for the support.

I think you've misdiagnosed your problem: .NET bytes are 8-bit like everyone else's. A better approach is to try to understand what the Java code is trying to do, then figure out what the cleanest equivalent is in C#.
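For reference, a direct C# translation might look like this. This is a sketch only; it assumes length and offset are defined as in the original, and that bout is a MemoryStream (the closest .NET analogue of Java's ByteArrayOutputStream):
byte[] buf = new byte[length];
byte[] buf2 = bout.ToArray();              // assumes bout is a MemoryStream
Array.Copy(buf2, 0, buf, 0, buf2.Length);
for (int i = (int) offset; i < (int) length; ++i) {
    buf[i] = 0xFF;                         // 255 fits in System.Byte; no cast needed
}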

I think this might be because you're casting the 255 integer literal to a byte, rather than assigning a byte value. I recommend you try using Byte.MaxValue instead; it has a value of 255.
For example:
buf[i] = byte.MaxValue;
Edit: I was wrong; (byte)255 definitely evaluates to 255; I've just confirmed it in VS. Something you're doing elsewhere in your code must be causing the change.

byte.MaxValue equals 255.
The value of this constant is 255 (hexadecimal 0xFF).

Are you absolutely sure about this C# "limitation"? According to MSDN: http://msdn.microsoft.com/en-us/library/5bdb6693(VS.71).aspx
The C# byte is an unsigned 8 bit integer with values that can range between 0 and 255.

From MSDN:
byte:
The byte keyword denotes an integral type that stores values as indicated in the following table.
Type: byte
Range: 0 to 255
Size: Unsigned 8-bit integer
.NET Framework type: System.Byte

Related

Why doesn't my 32-bit integer convert into a float properly?

Background
First of all, I have some hexadecimal data... 0x3AD3FFD6. I have chosen to represent this data as an array of bytes as follows:
byte[] numBytes = { 0x3A, 0xD3, 0xFF, 0xD6 };
I attempt to convert this array of bytes into its single-precision floating point value by executing the following code:
float floatNumber = 0;
floatNumber = BitConverter.ToSingle(numBytes, 0);
I have calculated this online using this IEEE 754 Converter and got the following result:
0.0016174268
I would expect the output of the C# code to produce the same thing, but instead I am getting something like...
-1.406E+14
Question
Can anybody explain what is going on here?
The bytes are in the wrong order. BitConverter uses the endianness of the underlying system (the computer architecture), so always make sure the bytes you supply are in the order it expects.
Quick Answer: You've got the order of the bytes in your numBytes array backwards.
Since you're programming in C# I assume you are running on an Intel processor, and Intel processors are little endian; that is, they store (and expect) the least significant bytes first. In your numBytes array you are putting the most significant byte first.
BitConverter doesn't so much convert byte array data as interpret it as another base data type. Think of physical memory holding a byte array:
b0 | b1 | b2 | b3.
To interpret that byte array as a single-precision float, one must know the endianness of the machine, i.e. whether the LSByte is stored first or last. It may seem natural for the LSByte to come last because many of us read that way, but for little-endian (Intel) processors, that's incorrect.
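To get the big-endian interpretation the asker expected, one option (a sketch, using only BitConverter and Array) is to reverse the array on little-endian machines before converting:
byte[] numBytes = { 0x3A, 0xD3, 0xFF, 0xD6 };      // most significant byte first
if (BitConverter.IsLittleEndian)
    Array.Reverse(numBytes);                       // BitConverter wants machine order
float floatNumber = BitConverter.ToSingle(numBytes, 0);  // 0.0016174268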

Why Do Bytes Carry Over?

I have been playing with some byte arrays recently (dealing with grayscale images). A byte can have values 0-255. I was modifying the bytes, and came across a situation where the value I was assigning to the byte was outside the bounds of the byte. It was doing unexpected things to the images I was playing with.
I wrote a test and learned that the byte carries over. Example:
private static int SetByte(int y)
{
    return y;
}
.....
byte x = (byte) SetByte(-4);
Console.WriteLine(x);
// output is 252
There is a carryover! This happens when we go the other way around as well.
byte x = (byte) SetByte(259);
Console.WriteLine(x);
//output is 3
I would have expected it to set it to 255 in the first situation and 0 in the second. What is the purpose of this carry over? Is it just due to the fact that I'm casting this integer assignment? When is this useful in the real-world?
byte x = (byte) SetByte(259);
Console.WriteLine(x);
//output is 3
The cast of the result of SetByte applies modulo 256 to your integer input, effectively dropping the bits that fall outside the range of a byte.
259 % 256 = 3
Why: the implementers chose to keep only the 8 least significant bits and ignore the rest.
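A minimal sketch of that truncation (the cast of a variable, as opposed to a constant, is unchecked by default, so this compiles without complaint):
int y = 259;               // binary 1 0000 0011
byte x = (byte) y;         // keeps only the low 8 bits: 0000 0011
Console.WriteLine(x);      // prints 3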
When compiling C# you can specify whether the assembly should be compiled in checked or unchecked mode (unchecked is the default). You can also make specific sections of code explicit via the checked or unchecked keywords.
You are currently using unchecked mode, which ignores arithmetic overflow and truncates the value. Checked mode will detect possible overflows and throw an exception when they are encountered.
Try the following:
int y = 259;
byte x = checked((byte)y);
And you will see it throws an OverflowException.
The reason the behaviour in unchecked mode is to truncate rather than clamp is largely performance: every unchecked cast would require conditional logic to clamp the value, when the majority of the time it is unnecessary and can be done manually.
Another reason is that clamping would involve a loss of data which may not be desirable. I don't condone code such as the following, but have seen it (see this answer):
int input = 259;
var firstByte = (byte)input;
var secondByte = (byte)(input >> 8);
int reconstructed = (int)firstByte + (secondByte << 8);
Assert.AreEqual(reconstructed, input);
If firstByte came out as anything other than 3 this would not work at all.
One of the places I most commonly rely upon numeric carry over is when implementing GetHashCode(), see this answer to What is the best algorithm for an overridden System.Object.GetHashCode by Jon Skeet. It would be a nightmare to implement GetHashCode decently if overflowing meant we were constrained to Int32.MaxValue.
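The pattern from that answer, sketched here inside a hypothetical class whose field1 and field2 stand in for your own fields, relies on exactly this wrapping behaviour:
public override int GetHashCode()
{
    unchecked   // let the multiplications wrap instead of throwing
    {
        int hash = 17;
        hash = hash * 23 + field1.GetHashCode();   // field1, field2 are placeholders
        hash = hash * 23 + field2.GetHashCode();
        return hash;
    }
}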
The method SetByte is irrelevant: simply casting 259 to byte will also result in 3, since downcasting integral types is implemented by cutting off the upper bytes.
You can create a custom clamp function:
public static byte Clamp(int n) {
    if (n <= 0) return 0;
    if (n >= 256) return 255;
    return (byte) n;
}
Doing arithmetic modulo 2^n makes it possible for overflow errors in different directions to cancel each other out.
byte under = unchecked((byte) -12);  // = 244
byte over = unchecked((byte) 260);   // = 4
byte total = (byte)(under + over);   // byte + byte yields int, so cast back
Console.WriteLine(total);            // prints 248, as intended
If .NET instead had overflows saturate, then the above program would print the incorrect answer 255.
Bounds checking is not performed for a direct type cast (using (byte)) in order to avoid hurting performance.
FYI, the result of most operations with byte operands, including the bitwise operators, is an int. Use Convert.ToByte() and you will get an OverflowException, which you can handle by assigning 255 to your target.
Or you can create a function to do this check, as mentioned in another answer here.
If performance is key, try adding the attribute [MethodImpl(MethodImplOptions.AggressiveInlining)] to that function.
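A small sketch of that Convert.ToByte approach, clamping to 255 on overflow:
int value = 300;                 // example out-of-range input
byte b;
try {
    b = Convert.ToByte(value);   // throws OverflowException when out of range
} catch (OverflowException) {
    b = byte.MaxValue;           // fall back to 255, as suggested above
}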

Bit shifting and data explanation

OK, I'd like an explanation of how bit shifting works and how to build data from an array of bytes. The language is not important (if an example is needed: I know C, C++, Java and C#, and they all follow the same shifting syntax, no?).
The question is: how do I go from byte[] to something which is a bunch of bytes together (be it 16-bit ints, 32-bit ints, 64-bit ints, n-bit ints), and more importantly, why? I'd like to understand and learn how to make this myself rather than copy from the internet.
I know about endianness; I mainly mess with little-endian stuff, but explaining a general rule for both systems would be nice.
Thank you very very much!!
Federico
For bit shifting I would say: byte1 << n shifts the bits of byte1 left by n places, which is the same as multiplying byte1 by 2^n; likewise, byte1 >> n divides byte1 by 2^n.
Hm... it depends what you're looking for. There's really no way to convert to an n-bit integer type (it's always an array of bytes anyway), but you can shift bytes into other types:
my32BitInt = byte1 << 24 | byte2 << 16 | byte3 << 8 | byte4
Usually the compiler is going to take care of endianness for you, so you can OR bytes together with shifting and not worry too much about that.
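A runnable C# version of that line (the byte values here are arbitrary examples):
byte[] b = { 0x12, 0x34, 0x56, 0x78 };
// Shifts operate on values, not on memory layout, so the result is the same
// regardless of the machine's endianness:
int my32BitInt = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
Console.WriteLine(my32BitInt.ToString("X"));   // prints 12345678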
I've used a bit of a cheat myself in the past to save cycles: when I know the data is in the array in least-to-most-significant order, I can cast the bytes into larger types and walk the array in steps of the larger type's size. For example, this sample walks 4 bytes at a time and reads 4-byte ints in C.
#include <stdio.h>

int main(void)
{
    char bytes[] = {1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1};
    int *temp;

    temp = (int *)&bytes[0];
    printf("%d\n", *temp);
    temp = (int *)&bytes[4];
    printf("%d\n", *temp);
    temp = (int *)&bytes[8];
    printf("%d\n", *temp);
    temp = (int *)&bytes[12];
    printf("%d\n", *temp);
    return 0;
}
Output:
1
256
65536
16777216
This is probably no good for C# and Java
You will want to take into consideration the endianness of the data. From there, it depends on the data type, but here is an example. Let's say you would like to go from an array of 4 bytes to an unsigned 32-bit integer. Let's also assume the bytes are in the same order as they started (so we don't need to worry about endianness).
//In C
#include <stdint.h>

uint32_t offset = 0;       //the offset into the array; set to something appropriate
uint32_t unpacked_int = 0; //target; must start at zero since we accumulate into it
uint8_t* packed_array;     //source

for ( uint32_t i = 0; i < 4; ++i ) {
    unpacked_int += packed_array[ offset + i ] << 8*i;
}
You could equivalently use |= and OR the bytes together. Also, if you attempt this trick on structures, be aware that your compiler probably won't pack values back-to-back, although usually you can set the behavior with a pragma. For example, gcc/g++ packs data aligned to 32 bits on my architecture.
In C#, BitConverter should have you covered. I know Python, PHP, and Perl use a function called pack() for this; perhaps there is an equivalent in Java.

Converting Int32 to 24-bit signed integer

I have a need to convert an Int32 value to a 3-byte (24-bit) integer. Endianness remains the same (little), but I cannot figure out how to move the sign appropriately. The values are already constrained to the proper range; I just can't figure out how to convert 4 bytes to 3. Using C# 4.0. This is for hardware integration, so I have to have 24-bit values and cannot use 32-bit.
If you want to do that conversion, just remove the top byte of the four-byte number. Two's complement representation will take care of the sign correctly. If you want to keep the 24-bit number in an Int32 variable, you can use v & 0xFFFFFF to get just the lower 24 bits. I saw your comment about the byte array: if you have space in the array, write all four bytes of the number and just send the first three; that is specific to little-endian systems, though.
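A sketch of that advice, assuming a little-endian machine (check BitConverter.IsLittleEndian in real code); the sbyte cast on the way back sign-extends from bit 23:
int value = -1234;                              // assumed already within 24-bit range
byte[] four = BitConverter.GetBytes(value);     // little-endian byte order assumed
byte[] three = { four[0], four[1], four[2] };   // drop the top byte

// Reconstruct, sign-extending from the third byte:
int back = three[0] | (three[1] << 8) | ((sbyte) three[2] << 16);
Console.WriteLine(back);                        // prints -1234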
Found this: http://bytes.com/topic/c-sharp/answers/238589-int-byte
int myInt = 800;
byte[] myByteArray = System.BitConverter.GetBytes(myInt);
sounds like you just need to get the last 3 elements of the array.
EDIT:
as Jeremiah pointed out, you'd need to do something like
int myInt = 800;
byte[] myByteArray = System.BitConverter.GetBytes(myInt);
if (BitConverter.IsLittleEndian) {
    // get the first 3 elements
} else {
    // get the last 3 elements
}

How do I convert an int to an array of bytes and then back?

I need to send an integer through a NetworkStream. The problem is that I can only send bytes.
That's why I need to split the integer into four bytes, send those, and at the other end convert them back to an int.
For now I need this only in C#. But for the final project I will need to convert the four bytes to an int in Lua.
[EDIT]
How about in Lua?
BitConverter is the easiest way, but if you want to control the order of the bytes you can do bit shifting yourself.
int foo = int.MaxValue;
byte lolo = (byte)(foo & 0xff);
byte hilo = (byte)((foo >> 8) & 0xff);
byte lohi = (byte)((foo >> 16) & 0xff);
byte hihi = (byte)(foo >> 24);
Also, the implementation of BitConverter uses unsafe code and pointers, but it's short and simple:
public static unsafe byte[] GetBytes(int value)
{
    byte[] buffer = new byte[4];
    fixed (byte* numRef = buffer)
    {
        *((int*) numRef) = value;
    }
    return buffer;
}
Try
BitConverter.GetBytes()
http://msdn.microsoft.com/en-us/library/system.bitconverter.aspx
Just keep in mind that the order of the bytes in the returned array depends on the endianness of your system.
EDIT:
As for the Lua part, I don't know how to convert back. You could always multiply by 16 to get the same effect as a bitwise shift by 4. It's not pretty, and I would imagine there is some library or something that implements it. Again, the order in which to add the bytes depends on the endianness, so you might want to read up on that.
Maybe you can convert back in C#?
For Lua, check out Roberto's struct library. (Roberto is one of the authors of Lua.) It is more general than needed for the specific case in question, but it isn't unlikely that the need to interchange an int is shortly followed by the need to interchange other simple types or larger structures.
Assuming native byte order is acceptable at both ends (which is likely a bad assumption, incidentally) then you can convert a number to a 4-byte integer with:
buffer = struct.pack("l", value)
and back again with:
value = struct.unpack("l", buffer)
In both cases, buffer is a Lua string containing the bytes. If you need to access the individual byte values from Lua, string.byte is your friend.
To specify the byte order of the packed data, change the format from "l" to "<l" for little-endian or ">l" for big-endian.
The struct module is implemented in C, and must be compiled to a DLL or equivalent for your platform before it can be used by Lua. That said, it is included in the Lua for Windows batteries-included installation package that is a popular way to install Lua on Windows systems.
Here are some functions in Lua for converting a 32-bit two's complement number into bytes and converting four bytes into a 32-bit two's complement number. A lot more checking could/should be done to verify that the incoming parameters are valid.
-- convert a 32-bit two's complement integer into four bytes (network order)
function int_to_bytes(n)
  if n > 2147483647 then error(n.." is too large", 2) end
  if n < -2147483648 then error(n.." is too small", 2) end
  -- adjust for 2's complement
  n = (n < 0) and (4294967296 + n) or n
  return (math.modf(n/16777216))%256, (math.modf(n/65536))%256, (math.modf(n/256))%256, n%256
end

-- convert four bytes (network order) to a 32-bit two's complement integer
function bytes_to_int(b1, b2, b3, b4)
  if not b4 then error("need four bytes to convert to int", 2) end
  local n = b1*16777216 + b2*65536 + b3*256 + b4
  n = (n > 2147483647) and (n - 4294967296) or n
  return n
end

print(int_to_bytes(256))             --> 0 0 1 0
print(int_to_bytes(-10))             --> 255 255 255 246
print(bytes_to_int(255,255,255,246)) --> -10
Investigate the BinaryWriter/BinaryReader classes.
Convert an int to a byte array and display : BitConverter ...
www.java2s.com/Tutorial/CSharp/0280__Development/Convertaninttoabytearrayanddisplay.htm
Integer to Byte - Visual Basic .NET answers
http://bytes.com/topic/visual-basic-net/answers/349731-integer-byte
How to: Convert a byte Array to an int (C# Programming Guide)
http://msdn.microsoft.com/en-us/library/bb384066.aspx
As Nubsis says, BitConverter is appropriate but has no guaranteed endianness.
I have an EndianBitConverter class in MiscUtil which allows you to specify the endianness. Of course, if you only want to do this for a single data type (int) you could just write the code by hand.
BinaryWriter is another option, and this does guarantee little endianness. (Again, MiscUtil has an EndianBinaryWriter if you want other options.)
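For illustration, a minimal sketch of the BinaryWriter route (requires using System.IO):
using (var ms = new MemoryStream())
using (var writer = new BinaryWriter(ms))
{
    writer.Write(12345);             // always written little-endian: 0x3039
    byte[] bytes = ms.ToArray();     // { 0x39, 0x30, 0x00, 0x00 }
}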
To convert to a byte[]:
BitConverter.GetBytes(int)
http://msdn.microsoft.com/en-us/library/system.bitconverter.aspx
To convert back to an int:
BitConverter.ToInt32(byteArray, offset)
http://msdn.microsoft.com/en-us/library/system.bitconverter.toint32.aspx
I'm not sure about Lua though.
If you are concerned about endianness use Jon Skeet's EndianBitConverter. I've used it and it works seamlessly.
C# supports its own implementation of htons and ntohs as:
System.Net.IPAddress.HostToNetworkOrder()
System.Net.IPAddress.NetworkToHostOrder()
But they only work on signed Int16, Int32, and Int64, which means you'll probably end up doing a lot of unnecessary casting to make them work, and if you're using the highest-order bit for anything other than signing the integer, you're screwed. Been there, done that. ::tsk:: ::tsk:: Microsoft for not providing better endianness conversion support in .NET.
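A short sketch of those calls in use (requires using System.Net):
int host = 0x12345678;
int network = IPAddress.HostToNetworkOrder(host);      // byte-swapped on little-endian machines
int roundTrip = IPAddress.NetworkToHostOrder(network); // swaps back
Console.WriteLine(roundTrip == host);                  // True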
