I have a byte array in memory, read from a file. I would like to split the byte array at a certain point (index) without having to just create a new byte array and copy each byte one at a time, increasing the in-memory footprint of the operation. What I would like is something like this:
byte[] largeBytes = [1,2,3,4,5,6,7,8,9];
byte[] smallPortion;
smallPortion = split(largeBytes, 3);
smallPortion would equal 1,2,3,4
largeBytes would equal 5,6,7,8,9
In C# with Linq you can do this:
smallPortion = largeBytes.Take(4).ToArray();
largeBytes = largeBytes.Skip(4).Take(5).ToArray();
;)
FYI: the System.ArraySegment<T> structure is basically the same thing as the ArrayView<T> class from the other answer in this thread. You can use this out-of-the-box structure in the same way, if you'd like.
This is how I would do that:
using System;
using System.Collections;
using System.Collections.Generic;
class ArrayView<T> : IEnumerable<T>
{
private readonly T[] array;
private readonly int offset, count;
public ArrayView(T[] array, int offset, int count)
{
this.array = array;
this.offset = offset;
this.count = count;
}
public int Length
{
get { return count; }
}
public T this[int index]
{
get
{
if (index < 0 || index >= this.count)
throw new IndexOutOfRangeException();
else
return this.array[offset + index];
}
set
{
if (index < 0 || index >= this.count)
throw new IndexOutOfRangeException();
else
this.array[offset + index] = value;
}
}
public IEnumerator<T> GetEnumerator()
{
for (int i = offset; i < offset + count; i++)
yield return array[i];
}
IEnumerator IEnumerable.GetEnumerator()
{
IEnumerator<T> enumerator = this.GetEnumerator();
while (enumerator.MoveNext())
{
yield return enumerator.Current;
}
}
}
class Program
{
static void Main(string[] args)
{
byte[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
ArrayView<byte> p1 = new ArrayView<byte>(arr, 0, 5);
ArrayView<byte> p2 = new ArrayView<byte>(arr, 5, 5);
Console.WriteLine("First array:");
foreach (byte b in p1)
{
Console.Write(b);
}
Console.Write("\n");
Console.WriteLine("Second array:");
foreach (byte b in p2)
{
Console.Write(b);
}
Console.ReadKey();
}
}
Try this one:
// Splits bArray into consecutive chunks of at most bufferLength bytes.
// Note that every chunk is a fresh copy of the source bytes.
private IEnumerable<byte[]> ArraySplit(byte[] bArray, int bufferLength)
{
    int arrayLength = bArray.Length;
    byte[] bReturn = null;
    int i = 0;
    // Yield full-size chunks for as long as more data remains after them.
    for (; arrayLength > (i + 1) * bufferLength; i++)
    {
        bReturn = new byte[bufferLength];
        Array.Copy(bArray, i * bufferLength, bReturn, 0, bufferLength);
        yield return bReturn;
    }
    // Yield the final chunk (between 1 and bufferLength bytes), if any.
    int bytesLeft = arrayLength - i * bufferLength;
    if (bytesLeft > 0)
    {
        bReturn = new byte[bytesLeft];
        Array.Copy(bArray, i * bufferLength, bReturn, 0, bytesLeft);
        yield return bReturn;
    }
}
As Eren said, you can use ArraySegment<T>. Here's an extension method and usage example:
public static class ArrayExtensionMethods
{
public static ArraySegment<T> GetSegment<T>(this T[] arr, int offset, int? count = null)
{
if (count == null) { count = arr.Length - offset; }
return new ArraySegment<T>(arr, offset, count.Value);
}
}
void Main()
{
byte[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
var p1 = arr.GetSegment(0, 5);
var p2 = arr.GetSegment(5);
Console.WriteLine("First array:");
foreach (byte b in p1)
{
Console.Write(b);
}
Console.Write("\n");
Console.WriteLine("Second array:");
foreach (byte b in p2)
{
Console.Write(b);
}
}
I'm not sure what you mean by:
I would like to split the byte array at a certain point (index) without having to just create a new byte array and copy each byte one at a time, increasing the in-memory footprint of the operation.
In most languages, certainly C#, once an array has been allocated, there is no way to change the size of it. It sounds like you're looking for a way to change the length of an array, which you can't. You also want to somehow recycle the memory for the second part of the array, to create a second array, which you also can't do.
In summary: just create a new array.
You can't. What you might want is keep a starting point and number of items; in essence, build iterators. If this is C++, you can just use std::vector<int> and use the built-in ones.
In C#, I'd build a small iterator class that holds start index, count and implements IEnumerable<>.
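As a rough sketch of the "starting point and number of items" idea, the helper below returns two ArraySegment<T> views over the same backing array, so no bytes are copied. The names ArraySplitter and SplitAt are hypothetical (they do not come from this thread), and the tuple return type assumes C# 7 or later.

```csharp
using System;

public static class ArraySplitter
{
    // Hypothetical helper: splits an array into two views that share the
    // original storage. 'count' is the number of items in the first view.
    public static (ArraySegment<T> Head, ArraySegment<T> Tail) SplitAt<T>(T[] array, int count)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (count < 0 || count > array.Length) throw new ArgumentOutOfRangeException(nameof(count));

        // Both segments reference 'array'; only offsets and counts differ.
        return (new ArraySegment<T>(array, 0, count),
                new ArraySegment<T>(array, count, array.Length - count));
    }
}
```

With the question's data, ArraySplitter.SplitAt(largeBytes, 4) would give a Head covering 1,2,3,4 and a Tail covering 5,6,7,8,9. Because both views alias the original array, a write through one view is visible through the array itself.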
I tried different algorithms:
Skip().Take() => the worst, by far
Array.Copy
ArraySegment
new Guid(int, int16, int16 ...)
The last one being the fastest, I'm now using this extension method:
public static Guid ToGuid(this byte[] byteArray, int offset)
{
return new Guid(BitConverter.ToInt32(byteArray, offset), BitConverter.ToInt16(byteArray, offset + 4), BitConverter.ToInt16(byteArray, offset + 6), byteArray[offset + 8], byteArray[offset + 9], byteArray[offset + 10], byteArray[offset + 11], byteArray[offset + 12], byteArray[offset + 13], byteArray[offset + 14], byteArray[offset + 15]);
}
With a byte array with 10000000 guids:
Done (Skip().Take()) in 1,156ms (for only 100000 guids :))
Done (Array.Copy) in 1,219ms
Done (ToGuid extension) in 994ms
Done (ArraySegment) in 2,411ms
I have the following C# code to convert my number array into byte array and then save it as a base64 string and vice-versa, however it doesn't work for long because long is 8-byte and my code only works for 4-byte numbers.
private static int _endianDiff1;
private static int _endianDiff2;
private static int _idx;
private static byte[] _byteBlock;
enum ArrayType { Float, Int32, UInt32, Int64, UInt64 }
public static bool SetIntArray(string key, int[] intArray)
{
return SetValue(key, intArray, ArrayType.Int32, 1, ConvertFromInt);
}
public static bool SetLongArray(string key, long[] longArray)
{
return SetValue(key, longArray, ArrayType.Int64, 1, ConvertFromLong);
}
private static bool SetValue<T>(string key, T array, ArrayType arrayType, int vectorNumber, Action<T, byte[], int> convert) where T : IList
{
var bytes = new byte[(4 * array.Count) * vectorNumber + 1];
bytes[0] = Convert.ToByte(arrayType); // Identifier
Initialize();
for (var i = 0; i < array.Count; i++)
{
convert(array, bytes, i);
}
return SaveBytes(key, bytes);
}
private static void ConvertFromInt(int[] array, byte[] bytes, int i)
{
ConvertInt32ToBytes(array[i], bytes);
}
private static void ConvertFromLong(long[] array, byte[] bytes, int i)
{
ConvertInt64ToBytes(array[i], bytes);
}
public static int[] GetIntArray(string key)
{
var intList = new List<int>();
GetValue(key, intList, ArrayType.Int32, 1, ConvertToInt);
return intList.ToArray();
}
public static long[] GetLongArray(string key)
{
var longList = new List<long>();
GetValue(key, longList, ArrayType.Int64, 1, ConvertToLong);
return longList.ToArray();
}
private static void GetValue<T>(string key, T list, ArrayType arrayType, int vectorNumber, Action<T, byte[]> convert) where T : IList
{
if (!PlayerPrefs.HasKey(key))
return;
var bytes = Convert.FromBase64String(PlayerPrefs.GetString(key));
if ((bytes.Length - 1) % (vectorNumber * 4) != 0)
{
Debug.LogError("Corrupt preference file for " + key);
return;
}
if ((ArrayType)bytes[0] != arrayType)
{
Debug.LogError(key + " is not a " + arrayType + " array");
return;
}
Initialize();
var end = (bytes.Length - 1) / (vectorNumber * 4);
for (var i = 0; i < end; i++)
{
convert(list, bytes);
}
}
private static void ConvertToInt(List<int> list, byte[] bytes)
{
list.Add(ConvertBytesToInt32(bytes));
}
private static void ConvertToLong(List<long> list, byte[] bytes)
{
list.Add(ConvertBytesToInt64(bytes));
}
private static void Initialize()
{
if (BitConverter.IsLittleEndian)
{
_endianDiff1 = 0;
_endianDiff2 = 0;
}
else
{
_endianDiff1 = 3;
_endianDiff2 = 1;
}
if (_byteBlock == null)
{
_byteBlock = new byte[4];
}
_idx = 1;
}
private static bool SaveBytes(string key, byte[] bytes)
{
try
{
PlayerPrefs.SetString(key, Convert.ToBase64String(bytes));
}
catch
{
return false;
}
return true;
}
private static void ConvertInt32ToBytes(int i, byte[] bytes)
{
_byteBlock = BitConverter.GetBytes(i);
ConvertTo4Bytes(bytes);
}
private static void ConvertInt64ToBytes(long i, byte[] bytes)
{
_byteBlock = BitConverter.GetBytes(i);
ConvertTo8Bytes(bytes);
}
private static int ConvertBytesToInt32(byte[] bytes)
{
ConvertFrom4Bytes(bytes);
return BitConverter.ToInt32(_byteBlock, 0);
}
private static long ConvertBytesToInt64(byte[] bytes)
{
ConvertFrom8Bytes(bytes);
return BitConverter.ToInt64(_byteBlock, 0);
}
private static void ConvertTo4Bytes(byte[] bytes)
{
bytes[_idx] = _byteBlock[_endianDiff1];
bytes[_idx + 1] = _byteBlock[1 + _endianDiff2];
bytes[_idx + 2] = _byteBlock[2 - _endianDiff2];
bytes[_idx + 3] = _byteBlock[3 - _endianDiff1];
_idx += 4;
}
private static void ConvertFrom4Bytes(byte[] bytes)
{
_byteBlock[_endianDiff1] = bytes[_idx];
_byteBlock[1 + _endianDiff2] = bytes[_idx + 1];
_byteBlock[2 - _endianDiff2] = bytes[_idx + 2];
_byteBlock[3 - _endianDiff1] = bytes[_idx + 3];
_idx += 4;
}
private static void ConvertTo8Bytes(byte[] bytes)
{
}
private static void ConvertFrom8Bytes(byte[] bytes)
{
}
So far I have it working for int, uint and float because they are all 4-byte and my problem is to change my Initialize function so it works based on the passed type size.
I guess there should also be ConvertTo8Bytes and ConvertFrom8Bytes functions which I don't know how to make because I set the _endianDiff and _byteBlock for 4-byte only. I understand that _byteBlock should have dynamic size instead of 4 but I don't know what to do with endianness in this case.
On a side note, I have solved this problem by splitting each long into 2 ints and storing them as two int arrays, but this allocates useless memory just because I'm not able to make this algorithm work.
Seems like a lot of code if all you are doing is trying to get a Base64 representation of a numeric array. Am I missing the goal?
If all you want to do is get int or long arrays to and from base64 strings, try this:
private static string ConvertArrayToBase64<T>(IList<T> array) where T : struct
{
if (typeof(T).IsPrimitive)
{
int size = System.Runtime.InteropServices.Marshal.SizeOf<T>();
var byteArray = new byte[array.Count * size];
Buffer.BlockCopy(array.ToArray(), 0, byteArray, 0, byteArray.Length);
return Convert.ToBase64String(byteArray);
}
throw new InvalidOperationException("Only primitive types are supported.");
}
private static T[] ConvertBase64ToArray<T>(string base64String) where T : struct
{
if (typeof(T).IsPrimitive)
{
var byteArray = Convert.FromBase64String(base64String);
var array = new T[byteArray.Length / System.Runtime.InteropServices.Marshal.SizeOf<T>()];
Buffer.BlockCopy(byteArray, 0, array, 0, byteArray.Length);
return array;
}
throw new InvalidOperationException("Only primitive types are supported.");
}
There are a couple of things to consider with this code though...
It does make a complete copy of the array, so if you are dealing with a large array or a performance sensitive operation, it may not be the best way.
This should work with any "primitive value type" arrays, which should include all the numeric types like int, long, uint, float, etc.
To demonstrate usage, see this example:
var longArray = new long[] { 11111, 22222, 33333, 44444 };
var intArray = new int[] { 55555, 66666, 77777, 88888};
string base64longs = ConvertArrayToBase64(longArray);
Console.WriteLine(base64longs);
Console.WriteLine(string.Join(", ", ConvertBase64ToArray<long>(base64longs)));
string base64ints = ConvertArrayToBase64(intArray);
Console.WriteLine(base64ints);
Console.WriteLine(string.Join(", ", ConvertBase64ToArray<int>(base64ints)));
What it does:
Verifies that the array only has primitive types.
Determines the size of the elements in the array to calculate the length of the byte array to allocate.
It copies the array to the byte array.
Returns the base64 representation.
The complementary function does the reverse.
Update: Here are the .NET 2.0 compatible versions...
private static string ConvertArrayToBase64<T>(IList<T> array) where T : struct
{
if (typeof(T).IsPrimitive)
{
int size = System.Runtime.InteropServices.Marshal.SizeOf(typeof(T));
var byteArray = new byte[array.Count * size];
Buffer.BlockCopy((Array)array, 0, byteArray, 0, byteArray.Length);
return Convert.ToBase64String(byteArray);
}
throw new InvalidOperationException("Only primitive types are supported.");
}
private static T[] ConvertBase64ToArray<T>(string base64String) where T : struct
{
if (typeof(T).IsPrimitive)
{
var byteArray = Convert.FromBase64String(base64String);
var array = new T[byteArray.Length / System.Runtime.InteropServices.Marshal.SizeOf(typeof(T))];
Buffer.BlockCopy(byteArray, 0, array, 0, byteArray.Length);
return array;
}
throw new InvalidOperationException("Only primitive types are supported.");
}
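Buffer.BlockCopy copies bytes in the machine's native byte order, so the resulting base64 string is only portable between machines with the same endianness. If that matters (the question mentions endianness), one possible approach is to write each long byte by byte in a fixed order. The sketch below always uses little-endian; PortableLongCodec and its method names are hypothetical, not part of the answer above.

```csharp
using System;

public static class PortableLongCodec
{
    // Encodes each long as 8 bytes, least significant byte first,
    // regardless of BitConverter.IsLittleEndian.
    public static string LongsToBase64(long[] values)
    {
        var bytes = new byte[values.Length * sizeof(long)];
        for (int i = 0; i < values.Length; i++)
        {
            ulong v = unchecked((ulong)values[i]);
            for (int j = 0; j < 8; j++)
                bytes[i * 8 + j] = unchecked((byte)(v >> (8 * j)));
        }
        return Convert.ToBase64String(bytes);
    }

    // Reverses LongsToBase64.
    public static long[] Base64ToLongs(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        if (bytes.Length % sizeof(long) != 0)
            throw new FormatException("Byte count is not a multiple of 8.");

        var result = new long[bytes.Length / sizeof(long)];
        for (int i = 0; i < result.Length; i++)
        {
            ulong v = 0;
            for (int j = 0; j < 8; j++)
                v |= (ulong)bytes[i * 8 + j] << (8 * j);
            result[i] = unchecked((long)v);
        }
        return result;
    }
}
```

This is slower than a block copy, so it is probably only worth it when the stored string has to be read on a platform with a different byte order.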
I'm looking for a solution for hashing large file content (files may be over 2 GB on a 32-bit OS). Is there any easy solution for that? Or should I just read the file in parts and load each part into a buffer?
Driis's solution sounds more flexible, but HashAlgorithm.ComputeHash will also accept Streams as parameters.
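A minimal sketch of the stream overload, assuming SHA-256 as the algorithm (the thread does not name one) and a hypothetical FileHasher class. ComputeHash(Stream) reads the stream in pieces, so the file does not have to fit in memory.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHasher
{
    // Hashes the file at 'path' by streaming it through the algorithm.
    public static byte[] HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return sha.ComputeHash(stream);
        }
    }
}
```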
Use TransformBlock and TransformFinalBlock to calculate the hash block by block, so you won't need to read the entire file into memory. (There is a nice example in the first link - and another one in this previous question).
If you choose to use TransformBlock, then you can safely ignore the last parameter and set the outputBuffer to null. TransformBlock will copy from the input to the output array - but why would you want to simply copy bits for no good reason?
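The block-by-block approach might look roughly like the sketch below. It passes null as the output buffer, as described above, and finishes with an empty TransformFinalBlock call. SHA-256, the BlockHasher class, and the bufferSize parameter are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class BlockHasher
{
    // Feeds the stream to the hash algorithm one buffer at a time.
    public static byte[] HashInBlocks(Stream input, int bufferSize)
    {
        using (var sha = SHA256.Create())
        {
            var buffer = new byte[bufferSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Output buffer is null: we only want the hash state updated.
                sha.TransformBlock(buffer, 0, read, null, 0);
            }
            // Finalize with zero additional bytes.
            sha.TransformFinalBlock(buffer, 0, 0);
            return sha.Hash;
        }
    }
}
```

Compared with ComputeHash(Stream), this form leaves room for reporting progress or cancelling between blocks.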
Furthermore, all mscorlib HashAlgorithms work as you might expect, i.e. the block size doesn't seem to affect the hash output; and whether you pass the data in one array and then hash in chunks by changing the inputOffset or you hash by passing smaller, separate arrays doesn't matter. I verified this using the following code:
(this is slightly long, just here so people can verify for themselves that HashAlgorithm implementations are sane).
public static void Main() {
RandomNumberGenerator rnd = RandomNumberGenerator.Create();
byte[] input = new byte[20];
rnd.GetBytes(input);
Console.WriteLine("Input Data: " + BytesToStr(input));
var hashAlgoTypes = Assembly.GetAssembly(typeof(HashAlgorithm)).GetTypes()
.Where(t => typeof(HashAlgorithm).IsAssignableFrom(t) && !t.IsAbstract);
foreach (var hashType in hashAlgoTypes)
new AlgoTester(hashType).AssertOkFor(input.ToArray());
}
public static string BytesToStr(byte[] bytes) {
StringBuilder str = new StringBuilder();
for (int i = 0; i < bytes.Length; i++)
str.AppendFormat("{0:X2}", bytes[i]);
return str.ToString();
}
public class AlgoTester {
readonly byte[] key;
readonly Type type;
public AlgoTester(Type type) {
this.type=type;
if (typeof(KeyedHashAlgorithm).IsAssignableFrom(type))
using(var algo = (KeyedHashAlgorithm)Activator.CreateInstance(type))
key = algo.Key.ToArray();
}
public HashAlgorithm MakeAlgo() {
HashAlgorithm algo = (HashAlgorithm)Activator.CreateInstance(type);
if (key != null)
((KeyedHashAlgorithm)algo).Key = key;
return algo;
}
public byte[] GetHash(byte[] input) {
using(HashAlgorithm sha = MakeAlgo())
return sha.ComputeHash(input);
}
public byte[] GetHashOneBlock(byte[] input) {
using(HashAlgorithm sha = MakeAlgo()) {
sha.TransformFinalBlock(input, 0, input.Length);
return sha.Hash;
}
}
public byte[] GetHashMultiBlock(byte[] input, int size) {
using(HashAlgorithm sha = MakeAlgo()) {
int offset = 0;
while (input.Length - offset >= size)
offset += sha.TransformBlock(input, offset, size, input, offset);
sha.TransformFinalBlock(input, offset, input.Length - offset);
return sha.Hash;
}
}
public byte[] GetHashMultiBlockInChunks(byte[] input, int size) {
using(HashAlgorithm sha = MakeAlgo()) {
int offset = 0;
while (input.Length - offset >= size)
offset += sha.TransformBlock(input.Skip(offset).Take(size).ToArray()
, 0, size, null, -24124512);
sha.TransformFinalBlock(input.Skip(offset).ToArray(), 0
, input.Length - offset);
return sha.Hash;
}
}
public void AssertOkFor(byte[] data) {
var direct = GetHash(data);
var indirect = GetHashOneBlock(data);
var outcomes =
new[] { 1, 2, 3, 5, 10, 11, 19, 20, 21 }.SelectMany(i =>
new[]{
new{ Hash=GetHashMultiBlock(data,i), Name="ByMSDN"+i},
new{ Hash=GetHashMultiBlockInChunks(data,i), Name="InChunks"+i}
}).Concat(new[] { new { Hash = indirect, Name = "OneBlock" } })
.Where(result => !result.Hash.SequenceEqual(direct)).ToArray();
Console.Write("Testing: " + type);
if (outcomes.Any()) {
Console.WriteLine("not OK.");
Console.WriteLine(type.Name + " direct was: " + BytesToStr(direct));
} else Console.WriteLine(" OK.");
foreach (var outcome in outcomes)
Console.WriteLine(type.Name + " differs with: " + outcome.Name + " "
+ BytesToStr(outcome.Hash));
}
}
Conversion from Double[] src to Byte[] dst can be done efficiently in C# with fixed pointers:
fixed( Double* pSrc = src)
{
fixed( Byte* pDst = dst)
{
Byte* ps = (Byte*)pSrc;
for (int i=0; i < dstLength; i++)
{
*(pDst + i) = *(ps +i);
}
}
}
How can I do the same for a List<Double> src? That is, how can I get a fixed pointer to the Double[] array contained in the List<Double>?
Thanks.
I have used these helper methods before:
byte[] GetBytesBlock(double[] values)
{
var result = new byte[values.Length * sizeof(double)];
Buffer.BlockCopy(values, 0, result, 0, result.Length);
return result;
}
double[] GetDoublesBlock(byte[] bytes)
{
var result = new double[bytes.Length / sizeof(double)];
Buffer.BlockCopy(bytes, 0, result, 0, bytes.Length);
return result;
}
An example:
List<double> myList = new List<double>(){ 1.0, 2.0, 3.0};
//to byte[]
var byteResult = GetBytesBlock(myList.ToArray());
//back to List<double>
var doubleResult = GetDoublesBlock(byteResult).ToList();
I'm not sure what you are intending, but I think you want System.Runtime.InteropServices.Marshal.StructureToPtr.
You can always use the ToArray() method on the List<Double> object to get a Double[].
You can use reflection to get the reference to the private T[] _items field, in the List instance.
Warning: In your code snippet, you need to make sure dstLength is the minimum of dst and src lengths in bytes, so that you don't try to copy more bytes than what are available. Probably you do so by creating dst with exactly the needed size to match the src, but your snippet doesn't make it clear.
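A sketch of the reflection route, with the warning above applied: the number of bytes copied is derived from list.Count, because the backing array is usually longer than the list. The field name _items is a private implementation detail of List<T> and could change between runtime versions, so treat this as an assumption rather than a supported API. ListBytes is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ListBytes
{
    public static byte[] ToBytes(List<double> list)
    {
        // Assumption: the backing array is stored in a non-public field "_items".
        FieldInfo field = typeof(List<double>).GetField(
            "_items", BindingFlags.NonPublic | BindingFlags.Instance);
        if (field == null)
            throw new NotSupportedException("List<T> has no _items field on this runtime.");

        var items = (double[])field.GetValue(list);

        // Copy only the bytes that belong to live elements, not spare capacity.
        int byteCount = list.Count * sizeof(double);
        var dst = new byte[byteCount];
        Buffer.BlockCopy(items, 0, dst, 0, byteCount);
        return dst;
    }
}
```

This avoids the intermediate array that ToArray() allocates, at the cost of depending on internals.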
Use the List<T>.ToArray() method and operate on the resulting array.
This might work, but you will have data loss: the content of the array will be 3 and 34.
List<double> list = new List<double>();
list.Add(Math.PI);
list.Add(34.22);
byte[] arr = (from l in list
select (byte)l).ToArray<byte>();
Why don't you just access the list as usual?
List<double> list = new List<double>();
list.Add(Math.PI);
list.Add(34.22);
byte[] res = new byte[list.Count * sizeof(double)];
unsafe
{
fixed (byte* pres = res)
{
for (int i = 0; i < list.Count; i++)
{
*(((double*)pres) + i) = list[i];
}
}
}
I haven't tested it thoroughly and I seldom need unsafe code, but it seems to work fine.
Edit: here is another (imo preferable) solution, without unsafe code:
int offset = 0;
for (int i = 0; i < list.Count; i++)
{
long num = BitConverter.DoubleToInt64Bits(list[i]);
for (int j = 0; j < 8; j++)
{
res[offset++] = (byte)num;
num >>= 8;
}
}
How can I write bits to, or read bits from, a stream (System.IO.Stream) in C#? Thanks.
You could create an extension method on Stream that enumerates the bits, like this:
public static class StreamExtensions
{
public static IEnumerable<bool> ReadBits(this Stream input)
{
if (input == null) throw new ArgumentNullException("input");
if (!input.CanRead) throw new ArgumentException("Cannot read from input", "input");
return ReadBitsCore(input);
}
private static IEnumerable<bool> ReadBitsCore(Stream input)
{
int readByte;
while((readByte = input.ReadByte()) >= 0)
{
for(int i = 7; i >= 0; i--)
yield return ((readByte >> i) & 1) == 1;
}
}
}
Using this extension method is easy:
foreach(bool bit in stream.ReadBits())
{
// do something with the bit
}
Attention: you should not call ReadBits multiple times on the same Stream, otherwise the subsequent calls will forget the current bit position and will just start reading the next byte.
This is not possible with the default stream class. The C# (BCL) Stream class operates at byte granularity at its lowest level. What you can do is write a wrapper class that reads bytes and partitions them into bits.
For example:
class BitStream : IDisposable {
private Stream m_stream;
private byte? m_current;
private int m_index;
public byte ReadNextBit() {
if ( !m_current.HasValue ) {
m_current = ReadNextByte();
m_index = 0;
}
var value = (byte)((m_current.Value >> m_index) & 0x1);
m_index++;
if (m_index == 8) {
m_current = null;
}
return value;
}
private byte ReadNextByte() {
...
}
// Dispose implementation omitted
}
Note: This will read the bits in right to left fashion which may or may not be what you're intending.
If you need to retrieve separate sections of your byte stream a few bits at a time, you need to remember the position of the bit to read next between calls. The following class takes care of caching the current byte and the bit position within it between calls.
// Binary MSB-first bit enumeration.
public class BitStream
{
private Stream wrapped;
private int bitPos = -1;
private int buffer;
public BitStream(Stream stream) => this.wrapped = stream;
public IEnumerable<bool> ReadBits()
{
do
{
while (bitPos >= 0)
{
yield return (buffer & (1 << bitPos--)) > 0;
}
buffer = wrapped.ReadByte();
bitPos = 7;
} while (buffer > -1);
}
}
Call like this:
var bStream = new BitStream(<existing Stream>);
var firstBits = bStream.ReadBits().Take(2);
var nextBits = bStream.ReadBits().Take(3);
...
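The answers so far cover reading only. For the writing half of the question, a matching MSB-first writer could be sketched as follows. BitWriter is a hypothetical class (not from any answer here): it collects bits in an int and emits a byte once eight have accumulated, and Flush pads the last partial byte with zero bits.

```csharp
using System;
using System.IO;

public sealed class BitWriter
{
    private readonly Stream output;
    private int buffer;    // bits collected so far, most significant first
    private int bitCount;  // how many bits are currently held in 'buffer' (0..7)

    public BitWriter(Stream output)
    {
        if (output == null) throw new ArgumentNullException("output");
        if (!output.CanWrite) throw new ArgumentException("Cannot write to output", "output");
        this.output = output;
    }

    // Appends one bit; writes a byte to the stream after every eighth bit.
    public void WriteBit(bool bit)
    {
        buffer = (buffer << 1) | (bit ? 1 : 0);
        if (++bitCount == 8)
        {
            output.WriteByte((byte)buffer);
            buffer = 0;
            bitCount = 0;
        }
    }

    // Pads the pending partial byte with zero bits and writes it out.
    public void Flush()
    {
        while (bitCount != 0)
            WriteBit(false);
    }
}
```

Writing the bits 1, 0, 1 and then calling Flush produces the single byte 10100000, which the MSB-first readers above would decode back to the same leading bits.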
For your purpose, I wrote an easy-to-use, fast and open-source (MIT license) library for this, called "BitStream", which is available on GitHub (https://github.com/martinweihrauch/BitStream).
In this example, you can see how 5 unsigned integers, each of which can be represented with 6 bits (all are below 64), are written with 6 bits each to a stream and then read back. Please note that the library takes and returns long or ulong values for simplicity, so just convert your int, uint, etc. to long/ulong first.
using SharpBitStream;
uint[] testDataUnsigned = { 5, 62, 17, 50, 33 };
var ms = new MemoryStream();
var bs = new BitStream(ms);
Console.WriteLine("Test1: \r\nFirst testing writing and reading small numbers of a max of 6 bits.");
Console.WriteLine("There are 5 unsigned ints, which shall be written into 6 bits each as they are all smaller than 64: 5, 62, 17, 50, 33");
foreach(var bits in testDataUnsigned)
{
bs.WriteUnsigned(6, (ulong)bits);
}
Console.WriteLine("The original data are of the size: " + (testDataUnsigned.Length * sizeof(uint)) + " bytes. The size of the stream is now: " + ms.Length + " bytes\r\nand the bytes in it are: ");
ms.Position = 0;
Console.WriteLine("The resulting bytes in the stream look like this: ");
for (int i = 0; i < ms.Length; i++)
{
uint bits = (uint)ms.ReadByte();
Console.WriteLine("Byte #" + Convert.ToString(i).PadLeft(4, '0') + ": " + Convert.ToString(bits, 2).PadLeft(8, '0'));
}
Console.WriteLine("\r\nNow reading the bits back:");
ms.Position = 0;
bs.SetPosition(0, 0);
foreach (var bits in testDataUnsigned)
{
ulong number = (uint)bs.ReadUnsigned(6);
Console.WriteLine("Number read: " + number);
}
I use an extension method to convert float arrays into byte arrays:
public static unsafe byte[] ToByteArray(this float[] floatArray, int count)
{
int arrayLength = floatArray.Length > count ? count : floatArray.Length;
byte[] byteArray = new byte[4 * arrayLength];
fixed (float* floatPointer = floatArray)
{
fixed (byte* bytePointer = byteArray)
{
float* read = floatPointer;
float* write = (float*)bytePointer;
for (int i = 0; i < arrayLength; i++)
{
*write++ = *read++;
}
}
}
return byteArray;
}
I understand that an array is a pointer to memory associated with information on the type and number of elements. Also, it seems to me that there is no way of doing a conversion from and to a byte array without copying the data as above.
Have I understood this? Would it even be impossible to write IL to create an array from a pointer, type and length without copying data?
EDIT: Thanks for the answers, I learned some fundamentals and got to try out new tricks!
After initially accepting Davy Landman's answer I found out that while his brilliant StructLayout hack does convert byte arrays into float arrays, it does not work the other way around. To demonstrate:
[StructLayout(LayoutKind.Explicit)]
struct UnionArray
{
[FieldOffset(0)]
public Byte[] Bytes;
[FieldOffset(0)]
public float[] Floats;
}
static void Main(string[] args)
{
// From bytes to floats - works
byte[] bytes = { 0, 1, 2, 4, 8, 16, 32, 64 };
UnionArray arry = new UnionArray { Bytes = bytes };
for (int i = 0; i < arry.Bytes.Length / 4; i++)
Console.WriteLine(arry.Floats[i]);
// From floats to bytes - index out of range
float[] floats = { 0.1f, 0.2f, 0.3f };
arry = new UnionArray { Floats = floats };
for (int i = 0; i < arry.Floats.Length * 4; i++)
Console.WriteLine(arry.Bytes[i]);
}
It seems that the CLR sees both arrays as having the same length. If the struct is created from float data, the byte array's length is just too short.
You can use a really ugly hack to temporarily change your array to byte[] using memory manipulation.
This is really fast and efficient as it doesn't require cloning the data and iterating on it.
I tested this hack in both 32 & 64 bit OS, so it should be portable.
The source + sample usage is maintained at https://gist.github.com/1050703 , but for your convenience I'll paste it here as well:
public static unsafe class FastArraySerializer
{
[StructLayout(LayoutKind.Explicit)]
private struct Union
{
[FieldOffset(0)] public byte[] bytes;
[FieldOffset(0)] public float[] floats;
}
[StructLayout(LayoutKind.Sequential, Pack = 1)]
private struct ArrayHeader
{
public UIntPtr type;
public UIntPtr length;
}
private static readonly UIntPtr BYTE_ARRAY_TYPE;
private static readonly UIntPtr FLOAT_ARRAY_TYPE;
static FastArraySerializer()
{
fixed (void* pBytes = new byte[1])
fixed (void* pFloats = new float[1])
{
BYTE_ARRAY_TYPE = getHeader(pBytes)->type;
FLOAT_ARRAY_TYPE = getHeader(pFloats)->type;
}
}
public static void AsByteArray(this float[] floats, Action<byte[]> action)
{
if (floats.handleNullOrEmptyArray(action))
return;
var union = new Union {floats = floats};
union.floats.toByteArray();
try
{
action(union.bytes);
}
finally
{
union.bytes.toFloatArray();
}
}
public static void AsFloatArray(this byte[] bytes, Action<float[]> action)
{
if (bytes.handleNullOrEmptyArray(action))
return;
var union = new Union {bytes = bytes};
union.bytes.toFloatArray();
try
{
action(union.floats);
}
finally
{
union.floats.toByteArray();
}
}
public static bool handleNullOrEmptyArray<TSrc,TDst>(this TSrc[] array, Action<TDst[]> action)
{
if (array == null)
{
action(null);
return true;
}
if (array.Length == 0)
{
action(new TDst[0]);
return true;
}
return false;
}
private static ArrayHeader* getHeader(void* pBytes)
{
return (ArrayHeader*)pBytes - 1;
}
private static void toFloatArray(this byte[] bytes)
{
fixed (void* pArray = bytes)
{
var pHeader = getHeader(pArray);
pHeader->type = FLOAT_ARRAY_TYPE;
pHeader->length = (UIntPtr)(bytes.Length / sizeof(float));
}
}
private static void toByteArray(this float[] floats)
{
fixed(void* pArray = floats)
{
var pHeader = getHeader(pArray);
pHeader->type = BYTE_ARRAY_TYPE;
pHeader->length = (UIntPtr)(floats.Length * sizeof(float));
}
}
}
And the usage is:
var floats = new float[] {0, 1, 0, 1};
floats.AsByteArray(bytes =>
{
foreach (var b in bytes)
{
Console.WriteLine(b);
}
});
Yes, the type information and data is in the same memory block, so that is impossible unless you overwrite the type information in a float array to fool the system that it's byte array. That would be a really ugly hack, and could easily blow up...
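If a view over the same memory is enough and a real byte[] instance is not required, newer runtimes offer a supported alternative. The sketch below assumes Span<T> and System.Runtime.InteropServices.MemoryMarshal are available (.NET Core 2.1 or later, or the System.Memory package), which postdates this discussion. FloatViews is a hypothetical class name.

```csharp
using System;
using System.Runtime.InteropServices;

public static class FloatViews
{
    // Reinterprets the float array's memory as bytes; nothing is copied.
    public static Span<byte> AsBytes(float[] floats)
    {
        return MemoryMarshal.AsBytes(floats.AsSpan());
    }

    // Reinterprets the byte array's memory as floats; nothing is copied.
    // Trailing bytes that do not fill a whole float are left out of the view.
    public static Span<float> AsFloats(byte[] bytes)
    {
        return MemoryMarshal.Cast<byte, float>(bytes.AsSpan());
    }
}
```

The span's Length reflects the reinterpreted element count (four bytes per float), which sidesteps the mismatched length seen with the StructLayout union. The result is a Span<T>, not an array, so it cannot be handed to APIs that only accept byte[].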
Here's how you can convert the floats without unsafe code if you like:
public static byte[] ToByteArray(this float[] floatArray) {
int len = floatArray.Length * 4;
byte[] byteArray = new byte[len];
int pos = 0;
foreach (float f in floatArray) {
byte[] data = BitConverter.GetBytes(f);
Array.Copy(data, 0, byteArray, pos, 4);
pos += 4;
}
return byteArray;
}
This question is the reverse of What is the fastest way to convert a float[] to a byte[]?.
I've answered with a union kind of hack to skip the whole copying of the data. You could easily reverse this (length = length * sizeof(Double)).
I've written something similar for quick conversion between arrays. It's basically an ugly proof-of-concept more than a handsome solution. ;)
public static TDest[] ConvertArray<TSource, TDest>(TSource[] source)
where TSource : struct
where TDest : struct {
if (source == null)
throw new ArgumentNullException("source");
var sourceType = typeof(TSource);
var destType = typeof(TDest);
if (sourceType == typeof(char) || destType == typeof(char))
throw new NotSupportedException(
"Can not convert from/to a char array. Char is special " +
"in a somewhat unknown way (like enums can't be based on " +
"char either), and Marshal.SizeOf returns 1 even when the " +
"values held by a char can be above 255."
);
var sourceByteSize = Buffer.ByteLength(source);
var destTypeSize = Marshal.SizeOf(destType);
if (sourceByteSize % destTypeSize != 0)
throw new Exception(
"The source array is " + sourceByteSize + " bytes, which can " +
"not be transfered to chunks of " + destTypeSize + ", the size " +
"of type " + typeof(TDest).Name + ". Change destination type or " +
"pad the source array with additional values."
);
var destCount = sourceByteSize / destTypeSize;
var destArray = new TDest[destCount];
Buffer.BlockCopy(source, 0, destArray, 0, sourceByteSize);
return destArray;
}
public byte[] ToByteArray(object o)
{
int size = Marshal.SizeOf(o);
byte[] buffer = new byte[size];
IntPtr p = Marshal.AllocHGlobal(size);
try
{
Marshal.StructureToPtr(o, p, false);
Marshal.Copy(p, buffer, 0, size);
}
finally
{
Marshal.FreeHGlobal(p);
}
return buffer;
}
This may help you to convert an object to a byte array.
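For the opposite direction, a sketch along the same lines could copy the bytes into unmanaged memory and let Marshal.PtrToStructure rebuild the value. FromByteArray and the SamplePoint struct are hypothetical names used for illustration, and the sketch assumes a struct with sequential layout and blittable fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample type, used only to illustrate the round trip.
[StructLayout(LayoutKind.Sequential)]
public struct SamplePoint
{
    public int X;
    public int Y;
}

public static class StructBytes
{
    // Rebuilds a struct of type T from its marshalled byte representation.
    public static T FromByteArray<T>(byte[] buffer) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        if (buffer == null || buffer.Length < size)
            throw new ArgumentException("Buffer is too small for " + typeof(T).Name, "buffer");

        IntPtr p = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(buffer, 0, p, size);
            return (T)Marshal.PtrToStructure(p, typeof(T));
        }
        finally
        {
            // Always release the unmanaged block, even if marshalling throws.
            Marshal.FreeHGlobal(p);
        }
    }
}
```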
You should check my answer to a similar question: What is the fastest way to convert a float[] to a byte[]?.
In it you'll find portable code (32/64 bit compatible) to let you view a float array as a byte array or vice versa, without copying the data. It's the fastest way that I know of to do such a thing.
If you're just interested in the code, it's maintained at https://gist.github.com/1050703 .
Well, if you're still interested in that hack, check out this modified code. It works like a charm and costs ~0 time, but it may not work in the future, since it's a hack that gains full access to the whole process address space without trust requirements or unsafe marks.
[StructLayout(LayoutKind.Explicit)]
struct ArrayConvert
{
public static byte[] GetBytes(float[] floats)
{
ArrayConvert ar = new ArrayConvert();
ar.floats = floats;
ar.length.val = floats.Length * 4;
return ar.bytes;
}
public static float[] GetFloats(byte[] bytes)
{
ArrayConvert ar = new ArrayConvert();
ar.bytes = bytes;
ar.length.val = bytes.Length / 4;
return ar.floats;
}
public static byte[] GetTop4BytesFrom(object obj)
{
ArrayConvert ar = new ArrayConvert();
ar.obj = obj;
return new byte[]
{
ar.top4bytes.b0,
ar.top4bytes.b1,
ar.top4bytes.b2,
ar.top4bytes.b3
};
}
public static byte[] GetBytesFrom(object obj, int size)
{
ArrayConvert ar = new ArrayConvert();
ar.obj = obj;
ar.length.val = size;
return ar.bytes;
}
class ArrayLength
{
public int val;
}
class Top4Bytes
{
public byte b0;
public byte b1;
public byte b2;
public byte b3;
}
[FieldOffset(0)]
private Byte[] bytes;
[FieldOffset(0)]
private object obj;
[FieldOffset(0)]
private float[] floats;
[FieldOffset(0)]
private ArrayLength length;
[FieldOffset(0)]
private Top4Bytes top4bytes;
}