I have a large byte array with mostly 0's but some values that I need to process. If this were C++ or unsafe C#, I would use a 32-bit pointer and only look at the individual bytes if the current 32 bits were not 0. That enables much faster scanning through the all-zero blocks. Unfortunately this must be safe C# :-)
I could use an uint array instead of a byte array and then manipulate the individual bytes but it makes what I'm doing much more messy than I like. I'm looking for something simpler, like the pointer example (I miss pointers sigh)
Thanks!
If the code must be safe, and you don't want to use a larger type and "shift", then you'll have to iterate over each byte.
(edit) If the data is sufficiently sparse, you could use a dictionary to store the non-zero values; then finding the non-zeros is trivial (and enormous but sparse arrays become cheap).
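As a rough sketch of the dictionary idea (the names and array sizes here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

class SparseExample
{
    // Store only the non-zero bytes, keyed by their index in the original array.
    public static Dictionary<int, byte> ToSparse(byte[] data)
    {
        var sparse = new Dictionary<int, byte>();
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] != 0)
                sparse[i] = data[i];
        }
        return sparse;
    }

    static void Main()
    {
        var data = new byte[1000];
        data[42] = 7;
        data[900] = 3;

        // Iterating the dictionary visits only the non-zero entries,
        // regardless of how large the original array is.
        foreach (var pair in ToSparse(data))
            Console.WriteLine("index {0}: {1}", pair.Key, pair.Value);
    }
}
```

Building the dictionary still costs one pass over the array, so this pays off when the data is built sparsely in the first place or scanned many times.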
I'd follow what this guy said:
Using SSE in c# is it possible?
Basically, write a little bit of C/C++, possibly using SSE, to implement the scanning part efficiently, and call it from C#.
You can access the characters
string.ToCharArray()
Or you can access the raw byte[]
System.Text.Encoding.UTF8.GetBytes(stringvalue)
Ultimately, what I think you'd need here is
MemoryStream stream;
stream.Write(...)
then you will be able to directly handle the memory's buffer
There is also UnmanagedMemoryStream but I'm not sure whether it'd use unsafe calls inside
You can use the BitConverter class:
byte[] byteArray = GetByteArray(); // or whatever
for (int i = 0; i <= byteArray.Length - 4; i += 4)
{
    uint x = BitConverter.ToUInt32(byteArray, i);
    // do what you want with x
}
Another option is to create a MemoryStream from the byte array, and then use a BinaryReader to read 32-bit values from it.
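A minimal sketch of that approach, using a small made-up buffer (note that BinaryReader always reads little-endian):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class ReaderExample
{
    // Scan the buffer 4 bytes at a time and collect the non-zero 32-bit words.
    public static List<uint> NonZeroWords(byte[] byteArray)
    {
        var result = new List<uint>();
        using (var stream = new MemoryStream(byteArray))
        using (var reader = new BinaryReader(stream))
        {
            while (stream.Position + 4 <= stream.Length)
            {
                uint x = reader.ReadUInt32(); // little-endian
                if (x != 0)
                    result.Add(x); // only non-zero blocks need byte-level work
            }
        }
        return result;
    }

    static void Main()
    {
        var data = new byte[12];
        data[4] = 1; // the second 32-bit word becomes 1

        foreach (uint word in NonZeroWords(data))
            Console.WriteLine(word); // 1
    }
}
```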
Related
I'm porting some C# decompression code to AS3, and since it's doing some pretty complex stuff, it's using a range of datatypes such as byte and short. The problem is, AS3 doesn't have those datatypes.
For the most part I can use uint to hold these values. However, at some points, I get a line such as:
length[symbol++] = (short)len;
To my understanding, this means that len must be read and assigned to the length array as a short. So I'm wondering, how would I do this in AS3? I'm guessing perhaps to do:
length[symbol++] = len & 0xFF;
But I'm unsure if this would give a proper result.
So basically, my question is this: how do I make sure to keep the correct number of bytes when doing this sort of stuff in AS3? Maybe I should use ByteArrays instead?
Depending on the reason the cast is in the C# code, you may or may not need to keep it in the AS3 code. If the cast is purely to adjust the type to the element type of the length array (i.e. there is no loss of precision) then you don't need it. If len can actually be bigger than 0x7FFF you'll need to perform some cast.
I think ByteArray may be a reasonable option if you need to handle the result in a way similar to C#'s StreamReader; for random access it may be harder than necessary.
Note that short is 2 bytes long (a synonym for System.Int16), so to convert to it using bit manipulation you need to do & 0xFFFF. Also be very careful when casting between signed and unsigned types...
public : array<Byte>^ Foo(array<Byte>^ data)
takes a dynamically sized managed array, but how can I declare a fixed-size managed byte array? I want to force the C# user to send me an 8-byte array, and get 8 bytes back, in this style:
public : Byte[8] Foo(Byte[8] data)
EDIT:
can anyone explain why it's impossible in a safe context?
C# does not allow you to do that. You'll simply have to validate the array's length and maybe throw an exception if the length is not 8.
Also, the type of your function can't be Byte[8]; you'll have to change that to Byte[].
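A sketch of that runtime check (the method body is a placeholder for whatever Foo actually does):

```csharp
using System;

class Protocol
{
    public static byte[] Foo(byte[] data)
    {
        if (data == null)
            throw new ArgumentNullException("data");
        if (data.Length != 8)
            throw new ArgumentException("data must be exactly 8 bytes", "data");

        var result = new byte[8];
        // ... fill in the 8 result bytes here ...
        return result;
    }

    static void Main()
    {
        byte[] ok = Foo(new byte[8]);
        Console.WriteLine(ok.Length); // 8

        try
        {
            Foo(new byte[7]); // wrong length: rejected at runtime
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```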
If you want to force exactly 8 bytes... consider sending a long or ulong instead. Old-school, but it works. It also has the advantage of not needing an object (a byte[] is an object) - it is a pure value-type (a primitive, in this case)
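A sketch of the ulong idea, using BitConverter to move between the two representations (method names here are illustrative):

```csharp
using System;

class UlongPacking
{
    // Pack exactly 8 bytes into a ulong; the type system now enforces the size.
    public static ulong Pack(byte[] eightBytes)
    {
        return BitConverter.ToUInt64(eightBytes, 0);
    }

    // Unpacking always yields exactly 8 bytes.
    public static byte[] Unpack(ulong value)
    {
        return BitConverter.GetBytes(value);
    }

    static void Main()
    {
        var original = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 };
        ulong packed = Pack(original);
        byte[] roundTrip = Unpack(packed);
        Console.WriteLine(roundTrip.Length); // always 8
    }
}
```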
You can use a fixed size buffer inside a struct. You'll need it to be in an unsafe block though.
unsafe struct fixedLengthByteArrayWrapper
{
public fixed byte byteArray[8];
}
On the C++ side you'll need to use inline_array to represent this type.
As Marc correctly says, fixed size buffers are no fun to work with. You'll probably find it more convenient to do runtime length checking.
I'm writing a high-performance data structure. One problem I came across is that there doesn't seem to be any way to copy only a portion of an array to another array (preferably as quickly as possible). I also use generics, so I'm not really sure how I'd use Buffer.BlockCopy, since it demands byte offsets and it appears to be impossible to objectively determine the size of an object. I know Buffer.BlockCopy works at a byte level, but does it also count padding as a byte?
Example:
var tmp = new T[5];
var source = new T[10];
for (int i = 5; i < source.Length; i++)
{
    tmp[i - 5] = source[i];
}
How would I do this in a faster way, like with Array.CopyTo?
You can use Array.Copy().
Array.Copy(source, 5, tmp, 0, tmp.Length);
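For example, copying the second half of a 10-element array in one call (Array.Copy works with any element type, including a generic T, unlike Buffer.BlockCopy, which requires primitive types and byte offsets):

```csharp
using System;

class CopyDemo
{
    static void Main()
    {
        var source = new int[10];
        for (int i = 0; i < source.Length; i++)
            source[i] = i;

        var tmp = new int[5];
        // Copy elements 5..9 of source into tmp[0..4].
        Array.Copy(source, 5, tmp, 0, tmp.Length);

        Console.WriteLine(string.Join(",", tmp)); // 5,6,7,8,9
    }
}
```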
what's the best way to write the binary representation of an int array (Int32[]) to a Stream?
Stream.Write only accepts byte[] as the source and I would like to avoid converting/copying the array to a byte[], but instead stream directly from the original location.
In a more system-oriented language (i.e. C++) I would simply cast the int array to a byte*, but as far as I understand this isn't possible with C# (and moreover, casting byte* to byte[] wouldn't work out either).
Thanks
Martin
PS: Actually, I would also like to stream single int values. Does using BitConverter.GetBytes() create a new byte array? In that case I extend my question to how to efficiently stream single int values...
The simplest option would be to use BinaryWriter wrapping your output stream, and call Write(int) for each of your int values. If that doesn't use the right endianness for you, you could use EndianBinaryWriter from my MiscUtil library.
I don't know of anything built-in to do this more efficiently... I'd hope that the buffering within the stream would take care of it for the most part.
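A minimal sketch of the BinaryWriter approach (writing to a MemoryStream here just to keep the example self-contained; it works the same over a FileStream or NetworkStream):

```csharp
using System;
using System.IO;

class WriteInts
{
    public static byte[] ToBytes(int[] values)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                foreach (int v in values)
                    writer.Write(v); // 4 bytes each, little-endian
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    static void Main()
    {
        byte[] bytes = ToBytes(new[] { 1, 2, 3 });
        Console.WriteLine(bytes.Length); // 12
    }
}
```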
System.Array and System.Int32 both have the SerializableAttribute and so both support default serialization in a retrievable format.
http://msdn.microsoft.com/en-us/library/system.serializableattribute.aspx
There is sample code for Binary output and readback here:
http://msdn.microsoft.com/en-us/library/aa904194(VS.71).aspx
Consider the following method (used in a factory method):
private Packet(byte[] rawBytes, int startIndex)
{
m_packetId = BitConverter.ToUInt32(rawBytes, startIndex);
m_dataLength = BitConverter.ToUInt16(rawBytes, startIndex + 4);
if (this.Type != PacketType.Data)
return;
m_bytes = new byte[m_dataLength];
Array.Copy(rawBytes, startIndex + Packet.HeaderSize, m_bytes, 0, m_dataLength);
}
The last two lines of code strike me as wasteful. Allocating more memory and populating it with values already in memory seems silly.
With unmanaged code, something like this is possible:
m_bytes = (rawBytes + (startIndex + Packet.HeaderSize));
(I didn't run it through a compiler so syntax is probably off, but you can see it's just a matter of pointer manipulation.)
I ran into a similar problem the other day when an API returned a byte[] array that was really a short[] array.
Are these types of array permutations just the cost of using managed code or is there a new school of thinking that I'm just missing?
Thanks in advance.
Have you considered restructuring so that maybe the copy is not necessary?
First option: store a reference to rawBytes in m_bytes and store the offset that needs to be added to all accesses into this byte array.
Second option: make m_bytes a MemoryStream instead, constructed from the buffer, an offset and a length; this constructor also does not copy the byte buffer and just allows access to the specified sub-segment.
Keep in mind though that the price of avoiding the copy operation is that the rawBytes and m_bytes (array or stream) will be aliases, so changes to one will change the other too.
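Both options sketched with a small made-up buffer (a 4-byte header followed by payload):

```csharp
using System;
using System.IO;

class SubSegmentDemo
{
    static void Main()
    {
        byte[] rawBytes = { 0xAA, 0xBB, 0xCC, 0xDD, 10, 20, 30 };
        int headerSize = 4;

        // Option 2: a MemoryStream over a sub-range; no bytes are copied.
        var view = new MemoryStream(rawBytes, headerSize,
                                    rawBytes.Length - headerSize);
        Console.WriteLine(view.ReadByte()); // 10

        // The same idea with ArraySegment<byte>: a no-copy (offset, length) wrapper.
        var segment = new ArraySegment<byte>(rawBytes, headerSize, 3);
        Console.WriteLine(segment.Array[segment.Offset]); // 10

        // Aliasing: a write through the original array is visible in the view.
        rawBytes[4] = 99;
        view.Position = 0;
        Console.WriteLine(view.ReadByte()); // 99
    }
}
```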
Yes, this is more costly in managed code but there is nothing stopping you from making this method unsafe and doing the pointer manipulation you wish to do.