byte vs char vs int types - C#

I have a basic question.
In C there is no byte type, so I defined one as unsigned char to hold a buffer for a system file I need to read and handle.
My C program now needs to work with C#, which has a built-in byte type, described in the docs as an unsigned integer.
Would casting back and forth between the two systems cause any issues?
I think that no matter what word is used for the type, the underlying storage remains unchanged, right? So doing something like
// C#
byte[] b = new byte[2];
b[0] = 11;
b[1] = 12;

// C ("byte" is a typedef for unsigned char)
byte b[2];
b[0] = (byte)11;
b[1] = (byte)12;
gives the same result in both languages.

The C standard has defined int8_t (in <stdint.h>) since C99. Use that for an 8-bit signed type, or uint8_t for the unsigned analogue.
Take care when using such a type in a structure, though: there's no guarantee that adjacent members will be contiguous in memory. With arrays you'll be fine.

As long as the value on the C side is an integral type of the same size, there should be no problem casting back and forth.
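For illustration, a minimal P/Invoke sketch of how such a buffer crosses the boundary (the library name mylib and the function fill_buffer are hypothetical):

using System.Runtime.InteropServices;

class Native
{
    // C side: void fill_buffer(unsigned char* buf, int len);
    [DllImport("mylib")]
    static extern void fill_buffer(byte[] buf, int len);

    static void Main()
    {
        byte[] b = new byte[2];
        fill_buffer(b, b.Length); // C# byte and C unsigned char are both single octets
    }
}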

Related

C# - are byte representations of different types different?

I know this question is a bit weird; I'm asking out of pure curiosity, as I couldn't find any relevant info around. Also, please feel free to edit the title; I know it's terrible, but I couldn't come up with anything better.
Let's say I have a variable foo of type object, which holds either a short or a ushort. I need to send it over the network, so I use BitConverter to transform it into a byte[]:
byte[] b = new byte[2];
if (foo is short) {
    BitConverter.GetBytes((short)foo);
} else {
    BitConverter.GetBytes((ushort)foo);
}
Network/socket magic happens, and I want my variable back. I know which type I'm expecting, so I call BitConverter.ToUInt16 or ToInt16 as appropriate.
Now, the question is: does it actually matter how I serialized the variable? I mean, the bits are the same, so it shouldn't make a difference, am I correct? So I could do
BitConverter.GetBytes((short)foo);
and then do
BitConverter.ToUInt16(myByteArray, 0);
Anyone?
To serialize your variable, you should assign the result of BitConverter.GetBytes() to your byte[].
It doesn't matter whether your variable is a short or a ushort, as those are the same size and hold the same values between 0 and 32767. As long as the size is OK, you should have no problems.
So you can keep the code simple; note that unboxing requires the exact boxed type, which is why both casts stay:
byte[] b;
if (foo is short)
    b = BitConverter.GetBytes((short)foo);
else
    b = BitConverter.GetBytes((ushort)foo); // same bytes for the same bit pattern
However, at the decoding site you must know which type you need. For a short, you need:
short foo = BitConverter.ToInt16(b, 0);
but if you need a ushort, then you write:
ushort foo = BitConverter.ToUInt16(b, 0);
When you send multibyte variables over the network, you should also ensure that they are in network byte order, as @itsme86 mentioned in his answer.
If you need to send both shorts and ushorts, then you also need to send type information to the other end so it knows whether the data is signed.
I won't go into that in detail here, as it would complicate the code.
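A minimal sketch of the round trip in question (using -1 so the signedness actually matters):

short original = -1;                             // bit pattern 0xFFFF
byte[] bytes = BitConverter.GetBytes(original);
ushort decoded = BitConverter.ToUInt16(bytes, 0);
Console.WriteLine(decoded);                      // 65535: same bits, different interpretation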
If you're transmitting it over the network, you could run into endianness issues (i.e. multibyte values might be stored in different byte order on different architectures). The standard convention when sending a multibyte value over a network is to transform it to Network Byte Order.
The receiver of the multibyte value would then convert it to Host Byte Order.
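For example, a sketch using the IPAddress helpers from System.Net (they operate on the signed integer types):

using System.Net;

short value = 0x1234;
short wire = IPAddress.HostToNetworkOrder(value);  // host order -> big-endian network order
byte[] bytes = BitConverter.GetBytes(wire);
// ...send, receive...
short received = IPAddress.NetworkToHostOrder(BitConverter.ToInt16(bytes, 0));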

Convert C++ data to float array in C#

There is a C++ function, shown below, which returns data to me as an unsigned char**:
MyCPPFunc(unsigned char** ppData, int &nSize)
I want to convert it to a float array in C#. The problem is that the internal representation of the data returned by C++ could be char/ushort/uint/RGB etc. If I were to use the code below,
var srcArray = new byte[nSize];
Marshal.Copy(pCPPData, srcArray, 0, nSize/4);
outDataArray = Array.ConvertAll<byte, float>(srcArray, Convert.ToSingle);
it would convert each single byte to a float, while the elements in memory could be of a different length (ushort, uchar, RGB, etc.).
How can I do this in the best-performing manner, considering that the C++ library supports many data types and returns data in memory of that type (although it is always presented as uchar**)?
I need something of the sort below, where dataTypeLen would be 1 for char, 2 for short, and so on:
Array.Convert(pCPPData, 0, pFloatArray, dataTypeLen, floatArrLen);
Unsafe code would also be fine.
I don't recall any such intelligent converter in the marshaller. Except for some bold exceptions like String, StringBuilder, or pointers/handles themselves, the marshaller does not convert data types. It takes some bytes, interprets them as the format (data type) you requested, and returns that type. If you ask it to read a SHORT, it will decode a SHORT and return a SHORT. It is not intelligent enough to know how to convert a CHAR into a FLOAT; that's simply not the marshaller's job. Nor is it the job of any standard converter: they can convert/cast simple types to one another (like double to decimal), but they will not understand more complex structures like "RGB" or, worse, some RGB* padded to 32 bits, or a BMP with padding at the end of each row!
Unless you use some intelligent converter that understands the exact input format, you have to do it manually. You first need to know precisely what type the data is (uchar, ushort, RGB, etc.), then receive (i.e. marshal) the array in that precise format (uchar[], ushort[], ...), and only then convert the elements to floats on the C# side. If you try reading the bytes "just like that", you might run into endianness problems. Of course, you might not care, depending on your needs.
So, for example: if you know that pCPPData points to a uchar array, unmarshal it as a uchar[]; if you know it points to a ushort array, unmarshal it as a ushort[]; RGB? Unmarshal it as byte[] or int[], depending on the bit depth of a single color. Then take the resulting array, loop over it, and convert the elements into a new array of floats (or just LINQize it with .Select(x => (float)x).ToArray() and forget about it).
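A sketch of that approach for the 16-bit case (assuming nSize here is the element count; Marshal.Copy has no ushort overload, so the copy goes through a short[]):

using System.Runtime.InteropServices;

short[] raw = new short[nSize];
Marshal.Copy(pCPPData, raw, 0, nSize);  // copy the raw 16-bit elements
float[] floats = new float[nSize];
for (int i = 0; i < nSize; i++)
    floats[i] = (ushort)raw[i];         // reinterpret as unsigned, then widen to float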
However, looking a bit outside the box, you seem to be working with some kind of bitmaps: uchar = 8-bit grayscale, ushort = 16-bit grayscale, RGB, and so on, I guess. So why don't you try using some bitmap processors? I don't recall the details offhand, but there are functions and classes in .NET that handle/wrap raw byte arrays as Image/Bitmap objects.
For example, see https://stackoverflow.com/a/16300450/717732 or https://stackoverflow.com/a/9560495/717732 - in the latter, note the comments: they suggest that you could even set scan0 to an unmanaged pointer, so you might completely escape the need for marshalling/copying anything. You might get a Bitmap object that reads directly from the pointer you got from the C++ library, in the pixel format (PixelFormat) you specify. I have never tried this, so I only say "might".
But, of course, that would give you not a float[] but a Bitmap with pixels. If you really need floats, then you'll have to convert "manually": branch on the format, read the values out as the format specifies, then store them in the format you want.

C# datatypes in AS3

I'm porting some C# decompression code to AS3, and since it's doing some pretty complex stuff, it's using a range of datatypes such as byte and short. The problem is, AS3 doesn't have those datatypes.
For the most part I can use uint to hold these values. However, at some points, I get a line such as:
length[symbol++] = (short)len;
To my understanding, this means that len must be read and assigned to the length array as a short. So I'm wondering, how would I do this in AS3? I'm guessing perhaps to do:
length[symbol++] = len & 0xFF;
But I'm unsure if this would give a proper result.
So basically, my question is this: how do I make sure to keep the correct number of bytes when doing this sort of thing in AS3? Maybe I should use ByteArrays instead?
Depending on the reason the cast is in the C# code, you may or may not need to keep it in AS3. If the cast is purely to match the element type of the length array (i.e. there is no loss of precision), then you don't need it. If len can actually be bigger than 0x7FFF, you'll need to perform some cast.
I think ByteArray may be a reasonable option if you need to consume the result the way a C# StreamReader would, but random access may be harder than necessary.
Note that short is 2 bytes long (a synonym for System.Int16), so to emulate the conversion with bit manipulation you need & 0xFFFF, not & 0xFF. Also be very careful when casting between signed and unsigned types...
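For reference, a small C# sketch of the truncation that the AS3 mask has to emulate:

int len = 0x12345;
short viaCast = (short)len;   // the C# cast keeps only the low 16 bits: 0x2345
int viaMask = len & 0xFFFF;   // same low 16 bits, kept as a non-negative int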

Fixed Size Byte Array

public: array<Byte>^ Foo(array<Byte>^ data)
takes a dynamically sized managed array,
but how can I get a fixed-size managed byte array?
I want to force the C# user to send me an 8-byte array, and to get 8 bytes back, in this style:
public: Byte[8] Foo(Byte[8] data)
EDIT:
Can anyone explain why it's impossible in a safe context?
C# does not allow you to do that. You'll simply have to validate the array's length and throw an exception if it is not 8.
Also, the return type of your function can't be Byte[8]; you'll have to change that to Byte[].
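A minimal sketch of that validation (the method body is illustrative):

public byte[] Foo(byte[] data)
{
    if (data == null || data.Length != 8)
        throw new ArgumentException("Expected exactly 8 bytes.", nameof(data));
    byte[] result = new byte[8];
    // ...fill in the 8 result bytes...
    return result;
}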
If you want to force exactly 8 bytes... consider sending a long or ulong instead. Old-school, but it works. It also has the advantage of not needing an object (a byte[] is an object): it is a pure value type (a primitive, in this case).
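For example, a sketch of the byte[8] <-> ulong round trip with BitConverter:

byte[] eightBytes = { 1, 2, 3, 4, 5, 6, 7, 8 };
ulong packed = BitConverter.ToUInt64(eightBytes, 0); // exactly 8 bytes in
byte[] unpacked = BitConverter.GetBytes(packed);     // exactly 8 bytes out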
You can use a fixed-size buffer inside a struct. You'll need an unsafe context, though:
unsafe struct FixedLengthByteArrayWrapper
{
    public fixed byte byteArray[8];
}
On the C++ side you'll need to use inline_array to represent this type.
As Marc correctly says, fixed-size buffers are no fun to work with. You'll probably find it more convenient to do runtime length checking.

Serialize a C# class to binary to be used by C++. How to handle alignment?

I am currently serializing a C# class into a binary stream using BinaryWriter.
I take each element of the class and write it out with BinaryWriter. This worked fine, as the C++ application reading the binary file supported packed structs, so the file could be loaded directly.
Now I have a request to handle alignment, because a new application has appeared that cannot support packed structs. What's the best way to convert the C# class and export it as binary, keeping both 2-byte and 4-byte alignment in mind?
The user can choose the alignment.
When serializing objects in any language, alignment should not be an issue, as you know ahead of time how much data is being written and read.
For example, take the following struct:
struct data
{
char c;
unsigned int i;
double d;
}
Depending on how the compiler lays out memory, this struct can occupy anywhere from 13 bytes (packed) to 16 or more (with natural alignment), but that's only the in-memory layout. As far as the disk layout is concerned, you are (assuming binary output) always going to be writing:
write 1 byte for c
write 4 bytes for i
write 8 bytes for d
Hence when the other side reads it in, be it either Python or C# or whatever else should do the following:
read 1 byte and convert to internal char representation
read 4 bytes and convert to an internal unsigned int representation (remember that Java has no unsigned int)
read 8 bytes and convert to an internal real or floating point representation
This is pretty much the canonical solution. You should never rely on mass block writes of structures in any language if portability between languages and architectures is an issue.
Also, the above does not take endianness into account, which you need to consider when serializing integers - usually as easy as converting to network byte order when writing and back to host order when reading.
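A sketch of this memberwise approach on the C# side with BinaryWriter (field names are illustrative; note that BinaryWriter always writes little-endian):

using System.IO;

static void Serialize(BinaryWriter w, byte c, uint i, double d)
{
    w.Write(c);              // 1 byte
    // w.Write(new byte[3]); // explicit padding, only if the reader insists on aligned offsets
    w.Write(i);              // 4 bytes
    w.Write(d);              // 8 bytes
}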
You might want to consider not overlaying structs onto memory buffers. Just read the bytes in a deserialization step instead. Something like:
#include <cstdint>
#include <cstdio>

struct A {
    uint8_t  aByte;
    uint32_t aDWord;
    uint16_t aWord;
};

void serialize(FILE* fp, A const& aStruct) {
    fwrite(&aStruct.aByte,  sizeof(aStruct.aByte),  1, fp);
    fwrite(&aStruct.aDWord, sizeof(aStruct.aDWord), 1, fp);
    fwrite(&aStruct.aWord,  sizeof(aStruct.aWord),  1, fp);
}

void deserialize(FILE* fp, A& aStruct) {
    fread(&aStruct.aByte,  sizeof(aStruct.aByte),  1, fp);
    fread(&aStruct.aDWord, sizeof(aStruct.aDWord), 1, fp);
    fread(&aStruct.aWord,  sizeof(aStruct.aWord),  1, fp);
}
instead of:
void serialize(FILE* fp, A const& aStruct) {
    fwrite(&aStruct, sizeof(aStruct), 1, fp);
}
void deserialize(FILE* fp, A& aStruct) {
    fread(&aStruct, sizeof(aStruct), 1, fp);
}
The first example isn't dependent on structure packing rules, whereas the second one is; I would recommend using the one that isn't. Most languages (C# included, via BinaryReader/BinaryWriter) give you some way to read and write raw bytes, so do all of the serialization/deserialization memberwise instead of as a single memory block, and the packing/padding problems go away.
Consider using a C++ library on the writer side. If the reader side is constrained, create structures with the very same memory layout in unmanaged C++ and dump them to the binary file. You'll probably have to use #pragma pack(1) to disable the default padding and insert char[] dummies between your structure elements to reproduce the alignment expected on the reader side. Don't forget about endianness. Then just call the writer library via DllImport or C++/CLI (it's a matter of taste).
