I am having some issues converting a large byte[] array into a strongly typed array.
I have an array which has been concatenated into one large byte[] array and stored in a table.
I want to then read this byte[] array but convert it to a strongly typed array.
As I have stored the entire array as a byte[] array, can I not read that byte array and convert it to my strongly typed version? At the moment it's returning null...
Is this possible in one hit?
Thanks in advance, Onam.
<code>
#region Save
public void Save<T>(T[] Array) where T : new()
{
    List<byte[]> _ByteCollection = new List<byte[]>();
    byte[] _Bytes = null;
    int _Length = 0;
    int _Offset = 0;

    foreach (T _Item in Array)
    {
        _ByteCollection.Add(Serialise(_Item));
    }

    foreach (byte[] _Byte in _ByteCollection)
    {
        _Length += _Byte.Length;
    }

    _Bytes = new byte[_Length];

    foreach (byte[] b in _ByteCollection)
    {
        System.Buffer.BlockCopy(b, 0, _Bytes, _Offset, b.Length);
        _Offset += b.Length;
    }

    Customer[] c = BinaryDeserialize<Customer[]>(_Bytes);
}
#endregion

#region BinaryDeserialize
public static T BinaryDeserialize<T>(byte[] RawData)
{
    T _DeserializedContent = default(T);
    BinaryFormatter _Formatter = new BinaryFormatter();
    try
    {
        using (MemoryStream _Stream = new MemoryStream())
        {
            _Stream.Write(RawData, 0, RawData.Length);
            _Stream.Seek(0, SeekOrigin.Begin);
            _DeserializedContent = (T)_Formatter.Deserialize(_Stream);
        }
    }
    catch (Exception ex)
    {
        _DeserializedContent = default(T);
    }
    return _DeserializedContent;
}
#endregion
</code>
I think the problem is that you are serializing each item to a list, then concatenating the bytes. When this is deserialised, it just looks like the data for one customer plus some unexpected data (the other customers) at the end.
I don't know how your serialise method works, but you can probably just change this code:
foreach (T _Item in Array)
{
    _ByteCollection.Add(Serialise(_Item));
}
To:
_ByteCollection.Add(Serialise(Array));
And that should work, then you could probably simplify it a little.
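As a minimal sketch of that simplified version (assuming your Serialise method uses the same BinaryFormatter approach and accepts any serializable object, arrays included):
public void Save<T>(T[] array) where T : new()
{
    // Serialize the whole array in one go, so the bytes round-trip
    // back to a T[] rather than a single T with trailing data.
    byte[] bytes = Serialise(array);

    // ... store 'bytes' in the table as before ...

    T[] roundTripped = BinaryDeserialize<T[]>(bytes);
}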
Most likely the line
_DeserializedContent = (T)_Formatter.Deserialize(_Stream);
throws an exception. In the catch block you simply swallow and ignore that exception.
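As a sketch, surfacing the error instead of hiding it would make the real failure visible (the logging target is just a placeholder):
catch (Exception ex)
{
    // Log the failure instead of silently returning default(T),
    // then rethrow so the caller sees the actual problem.
    Console.Error.WriteLine(ex);
    throw;
}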
protobuf-net cannot serialize the following class because serializing objects of type Stream is not supported:
[ProtoContract]
class StreamObject
{
    [ProtoMember(1)]
    public Stream StreamProperty { get; set; }
}
I know I can work around this by using a serialized property of type byte[] and reading the stream into that property, as in this question. But that requires the entire byte[] to be loaded into memory, which, if the stream is long, can quickly exhaust system resources.
Is there a way to serialize a stream as an array of bytes in protobuf-net without loading the entire sequence of bytes into memory?
The basic difficulty here isn't protobuf-net, it's the V2 protocol buffer format. There are two ways a repeated element (e.g. a byte array or stream) can be encoded:
As a packed repeated element. Here all of the elements of the field are packed into a single key-value pair with wire type 2 (length-delimited). Each element is encoded the same way it would be normally, except without a tag preceding it.
protobuf-net automatically encodes byte arrays in this format, however doing so requires knowing the total number of bytes in advance. For a byte stream, this might require loading the entire stream into memory (e.g. when StreamProperty.CanSeek == false), which violates your requirements.
As a repeated element. Here the encoded message has zero or more key-value pairs with the same tag number.
For a byte stream, using this format would cause massive bloat in the encoded message, as each byte would require an additional integer key.
As you can see, neither default representation meets your needs. Instead, it makes sense to encode a large byte stream as a sequence of "fairly large" chunks, where each chunk is packed, but the overall sequence is not.
The following version of StreamObject does this:
[ProtoContract]
class StreamObject
{
    public StreamObject() : this(new MemoryStream()) { }

    public StreamObject(Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException();
        this.StreamProperty = stream;
    }

    [ProtoIgnore]
    public Stream StreamProperty { get; set; }

    internal static event EventHandler OnDataReadBegin;
    internal static event EventHandler OnDataReadEnd;

    const int ChunkSize = 4096;

    [ProtoMember(1, IsPacked = false, OverwriteList = true)]
    IEnumerable<ByteBuffer> Data
    {
        get
        {
            if (OnDataReadBegin != null)
                OnDataReadBegin(this, new EventArgs());

            while (true)
            {
                byte[] buffer = new byte[ChunkSize];
                int read = StreamProperty.Read(buffer, 0, buffer.Length);
                if (read <= 0)
                {
                    break;
                }
                else if (read == buffer.Length)
                {
                    yield return new ByteBuffer { Data = buffer };
                }
                else
                {
                    Array.Resize(ref buffer, read);
                    yield return new ByteBuffer { Data = buffer };
                    break;
                }
            }

            if (OnDataReadEnd != null)
                OnDataReadEnd(this, new EventArgs());
        }
        set
        {
            if (value == null)
                return;
            foreach (var buffer in value)
                StreamProperty.Write(buffer.Data, 0, buffer.Data.Length);
        }
    }
}

[ProtoContract]
struct ByteBuffer
{
    [ProtoMember(1, IsPacked = true)]
    public byte[] Data { get; set; }
}
Notice the OnDataReadBegin and OnDataReadEnd events? I added them in for debugging purposes, to enable checking that the input stream is actually getting streamed into the output protobuf stream. The following test class does this:
internal class TestClass
{
    public void Test()
    {
        var writeStream = new MemoryStream();
        long beginLength = 0;
        long endLength = 0;
        EventHandler begin = (o, e) => { beginLength = writeStream.Length; Console.WriteLine(string.Format("Begin serialization of Data, writeStream.Length = {0}", writeStream.Length)); };
        EventHandler end = (o, e) => { endLength = writeStream.Length; Console.WriteLine(string.Format("End serialization of Data, writeStream.Length = {0}", writeStream.Length)); };
        StreamObject.OnDataReadBegin += begin;
        StreamObject.OnDataReadEnd += end;
        try
        {
            int length = 1000000;
            var inputStream = new MemoryStream();
            for (int i = 0; i < length; i++)
            {
                inputStream.WriteByte(unchecked((byte)i));
            }
            inputStream.Position = 0;

            var streamObject = new StreamObject(inputStream);
            Serializer.Serialize(writeStream, streamObject);

            var data = writeStream.ToArray();
            StreamObject newStreamObject;
            using (var s = new MemoryStream(data))
            {
                newStreamObject = Serializer.Deserialize<StreamObject>(s);
            }

            if (beginLength >= endLength)
            {
                throw new InvalidOperationException("inputStream was completely buffered before writing to writeStream");
            }

            inputStream.Position = 0;
            newStreamObject.StreamProperty.Position = 0;
            if (!inputStream.AsEnumerable().SequenceEqual(newStreamObject.StreamProperty.AsEnumerable()))
            {
                throw new InvalidOperationException("!inputStream.AsEnumerable().SequenceEqual(newStreamObject.StreamProperty.AsEnumerable())");
            }
            else
            {
                Console.WriteLine("Streams identical.");
            }
        }
        finally
        {
            StreamObject.OnDataReadBegin -= begin;
            StreamObject.OnDataReadEnd -= end;
        }
    }
}

public static class StreamExtensions
{
    public static IEnumerable<byte> AsEnumerable(this Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException();
        int b;
        while ((b = stream.ReadByte()) != -1)
            yield return checked((byte)b);
    }
}
And the output of the above is:
Begin serialization of Data, writeStream.Length = 0
End serialization of Data, writeStream.Length = 1000888
Streams identical.
Which indicates that the input stream is indeed streamed to the output without being fully loaded into memory at once.
Prototype fiddle.
Is there a mechanism available to write out a packed repeated element incrementally with bytes from a stream, knowing the length in advance?
It appears not. Assuming you have a stream for which CanSeek == true, you could encapsulate it in an IList<byte> that enumerates through the bytes in the stream, provides random access to bytes in the stream, and returns the stream length in IList.Count. There is a sample fiddle here showing such an attempt.
Unfortunately, however, ListDecorator.Write() simply enumerates the list and buffers its encoded contents before writing them to the output stream, which causes the input stream to be loaded completely into memory. I think this happens because protobuf-net encodes a List<byte> differently from a byte[], namely as a length-delimited sequence of Base 128 Varints. Since the Varint representation of a byte sometimes requires more than one byte, the length cannot be computed in advance from the list count. See this answer for additional details on the difference in how byte arrays and lists are encoded.
It should be possible to implement encoding of an IList<byte> in the same way as a byte[] -- it just isn't currently available.
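To illustrate why the length cannot be computed in advance, here is a small sketch (my own, not protobuf-net code) of the varint size rule for a single byte value:
// Values 0..127 fit in one varint byte; 128..255 need two.
// A List<byte> of count N therefore encodes to anywhere between
// N and 2N payload bytes, so N alone cannot yield the length prefix.
static int VarintSize(byte value)
{
    return value < 128 ? 1 : 2;
}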
I'm working on a C# Windows service that handles Firebird database requests. My problem occurs at random moments (sometimes after 5 minutes, sometimes after just 4 calls to the database), when I try to deserialize an object in the client application. It only happens at a specific position, though (it stops at the 18th byte of a 54 byte array). The rest of the time the function returns a proper result.
I'm using this function to serialize a single object:
public byte[] ObjectToByteArray(Object obj)
{
    if (obj == null)
        return null;

    MemoryStream fs = new MemoryStream();
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(fs, obj);
    fs.Seek(0, SeekOrigin.Begin);
    byte[] rval = fs.ToArray();
    fs.Close();
    return rval;
}
I am not serializing any custom classes, only strings and numeric types (the Firebird API returns them as objects, though).
I use this to deserialize:
public object ByteArrayToObject(Byte[] Buffer)
{
    BinaryFormatter formatter = new BinaryFormatter();
    MemoryStream stream = new MemoryStream(Buffer);
    stream.Position = 0;
    object rval = formatter.Deserialize(stream); // <-- this line drives me nuts
    stream.Close();
    return rval;
}
And this is the main function in the client application. Sorry for the ugly code:
public List<object[]> ByteToList(byte[] data, int[] pomocnicza)
{
    // pomocnicza holds the size in bytes of each (original) column of the list
    int size_row = 0;
    foreach (int i in pomocnicza)
    {
        size_row += i;
    }

    List<object[]> result = new List<object[]>();
    int iterator = 0;
    for (int i = 0; i < data.Length / size_row; i++)
    {
        object[] zxc = new object[3];
        int l = pomocnicza.Length / 4;
        for (int j = 0; j < l; j++)
        {
            byte[] tmp = new byte[pomocnicza[j * 4]];
            System.Array.Copy(data, iterator, tmp, 0, pomocnicza[j * 4]);
            object ffs = ByteArrayToObject(tmp);
            zxc[j] = ffs;
            iterator += pomocnicza[j * 4];
        }
        result.Add(zxc);
    }
    return result;
}
What baffles me is that it works in most cases, but inevitably throws an error at some point. The fact that it happens at random makes pinpointing it harder. Please help.
#EDIT
This is how I read the input:
public List<object[]> RetrieveSelectData(FbConnection dbConn, string SQLCommand)
{
    using (var command = dbConn.CreateCommand())
    {
        command.CommandText = SQLCommand;
        using (var reader = command.ExecuteReader())
        {
            var rows = new List<object[]>();
            while (reader.Read())
            {
                var columns = new object[reader.FieldCount];
                reader.GetValues(columns);
                rows.Add(columns);
            }
            return rows;
        }
    }
}
and then serialize with this function
public byte[] ListToByte(List<object[]> lista, out int[] rozmiary)
{
    int size = 0;
    rozmiary = new int[lista[0].Length];
    for (int i = 0; i < lista[0].Length; i++)
    {
        byte[] test = this.ObjectToByteArray(lista[0][i]);
        size += test.Length;
        rozmiary[i] = test.Length;
    }
    size *= lista.Count;

    byte[] result = new byte[size];
    int index = 0;
    for (int i = 0; i < lista.Count; i++)
    {
        for (int j = 0; j < lista[i].Length; j++)
        {
            byte[] tmp = this.ObjectToByteArray(lista[i][j]);
            tmp.CopyTo(result, index);
            index += tmp.Length;
        }
    }
    return result;
}
If you are using the deserialization methods above on bytes you pulled from a client's NetworkStream (or any other stream), skip the intermediate byte[]: pass the stream directly to the formatter, like below:
BinaryFormatter formatter = new BinaryFormatter();
NetworkStream clientStream = client.GetStream();
Object src = (Object)formatter.Deserialize(clientStream);
I have found the bug. The code above works fine (but watch out for encoding in some cases!), so feel free to use it.
The problem lay in another part of the program, where I made a typo and sent 4 bytes BUT the client app was told to receive 8, so in most cases it padded the value with zeros, but sometimes it grabbed the rest from the next pack of data.
It was @Marc Gravell and his blog that made me look over and over again until I eventually found the source.
I have an object array containing arrays of different types that are not known at compile time, but turn out to be int[], double[], etc.
I want to save these arrays to disk and I don't really need to process their contents online, so I am looking for a way to cast the object[] to a byte[] that I can then write to disk.
How can I achieve this?
You may use binary serialization and deserialization for Serializable types.
using System.Runtime.Serialization.Formatters.Binary;

BinaryFormatter binary = new BinaryFormatter();
using (FileStream fs = File.Create(file))
{
    binary.Serialize(fs, objectArray);
}
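For completeness, a minimal sketch of reading the array back (assuming the same file variable and formatter as above):
using (FileStream fs = File.OpenRead(file))
{
    // The cast mirrors what was serialized above.
    object[] restored = (object[])binary.Deserialize(fs);
}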
Edit: If all the elements of the array are simple types, then use BitConverter.
object[] arr = { 10.20, 1, 1.2f, 1.4, 10L, 12 };
byte[] result;
using (MemoryStream ms = new MemoryStream())
{
    foreach (dynamic t in arr)
    {
        byte[] bytes = BitConverter.GetBytes(t);
        ms.Write(bytes, 0, bytes.Length);
    }
    result = ms.ToArray(); // the concatenated bytes
}
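Note that reading the values back requires knowing each element's type and size up front. A hedged sketch for the first two elements of the arr above, using the result array from the previous snippet:
// 10.20 is a double (8 bytes), so 1 (an int, 4 bytes) starts at offset 8.
double first = BitConverter.ToDouble(result, 0);
int second = BitConverter.ToInt32(result, 8);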
You could do it the old fashioned way.
static void Main()
{
    object[] arrayToConvert = new object[] { 1.0, 10.0, 3.0, 4.0, 1.0, 12313.2342 };

    if (arrayToConvert.Length > 0)
    {
        byte[] dataAsBytes;

        unsafe
        {
            if (arrayToConvert[0] is int)
            {
                dataAsBytes = new byte[sizeof(int) * arrayToConvert.Length];
                fixed (byte* dataP = &dataAsBytes[0])
                    // CLR arrays are always aligned
                    for (int i = 0; i < arrayToConvert.Length; ++i)
                        *((int*)dataP + i) = (int)arrayToConvert[i];
            }
            else if (arrayToConvert[0] is double)
            {
                dataAsBytes = new byte[sizeof(double) * arrayToConvert.Length];
                fixed (byte* dataP = &dataAsBytes[0])
                {
                    // CLR arrays are always aligned
                    for (int i = 0; i < arrayToConvert.Length; ++i)
                    {
                        double current = (double)arrayToConvert[i];
                        *((long*)dataP + i) = *(long*)&current;
                    }
                }
            }
            else
            {
                throw new ArgumentException("Wrong array type.");
            }
        }

        Console.WriteLine(dataAsBytes);
    }
}
However, I would recommend that you revisit your design. You should probably be using generics, rather than object arrays.
From here (note that ToArray(Type) is an ArrayList method, not a List<T> one):
ArrayList list = ...
byte[] obj = (byte[])list.ToArray(typeof(byte));
or if your list is a complex type:
list.CopyTo(obj);
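If you have a generic List<object> whose elements are all boxed bytes, a LINQ cast is an alternative (my own suggestion, not from the linked answer):
// Requires using System.Linq; throws InvalidCastException if any
// element is not actually a boxed byte.
byte[] obj = list.Cast<byte>().ToArray();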
Is there an elegant way to emulate the StreamReader.ReadToEnd method with BinaryReader? Perhaps to put all the bytes into a byte array?
I do this:
read1.ReadBytes((int)read1.BaseStream.Length);
...but there must be a better way.
Original Answer (Read Update Below!)
Simply do:
byte[] allData = read1.ReadBytes(int.MaxValue);
The documentation says that it will read all bytes until the end of the stream is reached.
Update
Although this seems elegant, and the documentation seems to indicate that this would work, the actual implementation (checked in .NET 2, 3.5, and 4) allocates a full-size byte array for the data, which will probably cause an OutOfMemoryException on a 32-bit system.
Therefore, I would say that actually there isn't an elegant way.
Instead, I would recommend the following variation of @iano's answer. This variant doesn't rely on .NET 4:
Create an extension method for BinaryReader (or Stream, the code is the same for either).
public static byte[] ReadAllBytes(this BinaryReader reader)
{
    const int bufferSize = 4096;
    using (var ms = new MemoryStream())
    {
        byte[] buffer = new byte[bufferSize];
        int count;
        while ((count = reader.Read(buffer, 0, buffer.Length)) != 0)
            ms.Write(buffer, 0, count);
        return ms.ToArray();
    }
}
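Usage then mirrors the one-liner from the question (read1 being your BinaryReader):
byte[] allData = read1.ReadAllBytes();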
There is not an easy way to do this with BinaryReader. If you don't know the count you need to read ahead of time, a better bet is to use MemoryStream:
public byte[] ReadAllBytes(Stream stream)
{
    using (var ms = new MemoryStream())
    {
        stream.CopyTo(ms);
        return ms.ToArray();
    }
}
To avoid the additional copy when calling ToArray(), you could instead return the Position and buffer, via GetBuffer().
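A rough sketch of that variant (my own; note that GetBuffer() exposes the internal buffer, which is usually longer than the actual data, so the valid length has to travel with it):
public ArraySegment<byte> ReadAllBytesNoCopy(Stream stream)
{
    var ms = new MemoryStream();
    stream.CopyTo(ms);
    // GetBuffer() avoids the copy that ToArray() makes;
    // only the first ms.Position bytes are valid data.
    return new ArraySegment<byte>(ms.GetBuffer(), 0, (int)ms.Position);
}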
To copy the contents of one stream to another, I solved it by reading "some" bytes until the end of the file is reached:
private const int READ_BUFFER_SIZE = 1024;

using (BinaryReader reader = new BinaryReader(responseStream))
{
    using (BinaryWriter writer = new BinaryWriter(File.Open(localPath, FileMode.Create)))
    {
        long byteTransfered = 0;
        int byteRead = 0;
        do
        {
            byte[] buffer = reader.ReadBytes(READ_BUFFER_SIZE);
            byteRead = buffer.Length;
            writer.Write(buffer);
            byteTransfered += byteRead;
        } while (byteRead == READ_BUFFER_SIZE);
    }
}
Had the same problem.
First, get the file's size using FileInfo.Length.
Next, create a byte array and set its value to BinaryReader.ReadBytes(FileInfo.Length).
e.g.
var size = new FileInfo(yourImagePath).Length;
byte[] allBytes = yourReader.ReadBytes(System.Convert.ToInt32(size));
Another approach to this problem is to use C# extension methods:
public static class StreamHelpers
{
    public static byte[] ReadAllBytes(this BinaryReader reader)
    {
        // Pre .NET 4.0: read in fixed-size chunks.
        const int bufferSize = 4096;
        using (var ms = new MemoryStream())
        {
            byte[] buffer = new byte[bufferSize];
            int count;
            while ((count = reader.Read(buffer, 0, buffer.Length)) != 0)
                ms.Write(buffer, 0, count);
            return ms.ToArray();
        }

        // .NET 4.0 or newer: replace the body above with Stream.CopyTo.
        // using (var ms = new MemoryStream())
        // {
        //     reader.BaseStream.CopyTo(ms);
        //     return ms.ToArray();
        // }
    }
}
This approach gives you code that is both reusable and readable.
I use this, which utilizes the underlying BaseStream property to give you the length info you need. It keeps things nice and simple.
Below are three extension methods on BinaryReader:
The first reads from wherever the stream's current position is to the end
The second reads the entire stream in one go
The third utilizes the Range type to specify the subset of data you are interested in.
public static class BinaryReaderExtensions {

    public static byte[] ReadBytesToEnd(this BinaryReader binaryReader) {
        var length = binaryReader.BaseStream.Length - binaryReader.BaseStream.Position;
        return binaryReader.ReadBytes((int)length);
    }

    public static byte[] ReadAllBytes(this BinaryReader binaryReader) {
        binaryReader.BaseStream.Position = 0;
        return binaryReader.ReadBytes((int)binaryReader.BaseStream.Length);
    }

    public static byte[] ReadBytes(this BinaryReader binaryReader, Range range) {
        var (offset, length) = range.GetOffsetAndLength((int)binaryReader.BaseStream.Length);
        binaryReader.BaseStream.Position = offset;
        return binaryReader.ReadBytes(length);
    }
}
Using them is then trivial and clear...
// 1 - Reads everything in as a byte array
var rawBytes = myBinaryReader.ReadAllBytes();
// 2 - Reads a string, then reads the remaining data as a byte array
var someString = myBinaryReader.ReadString();
var rawBytes = myBinaryReader.ReadBytesToEnd();
// 3 - Uses a range to read the last 44 bytes
var rawBytes = myBinaryReader.ReadBytes(^44..);
Please show me optimized solutions for these conversions:
1)
public static byte[] ToBytes(List<Int64> list)
{
    byte[] bytes = null;
    // todo
    return bytes;
}
2)
public static List<Int64> ToList(byte[] bytes)
{
    List<Int64> list = null;
    // todo
    return list;
}
It would be very helpful to see versions with minimized copying and/or with unsafe code (if it can be implemented that way). Ideally, no copying of data would be needed at all.
Update:
My question is about casting in the C++ manner:
__int64* ptrInt64 = (__int64*)ptrInt8;
and
__int8* ptrInt8 = (__int8*)ptrInt64
Thank you for your help!!!
Edit: fixed for correct 8-byte conversion; also, it's not terribly efficient when converting back to a byte array.
public static List<Int64> ToList(byte[] bytes)
{
    var list = new List<Int64>();
    for (int i = 0; i < bytes.Length; i += sizeof(Int64))
        list.Add(BitConverter.ToInt64(bytes, i));
    return list;
}

public static byte[] ToBytes(List<Int64> list)
{
    var byteList = list.ConvertAll(new Converter<Int64, byte[]>(Int64Converter));
    List<byte> resultList = new List<byte>();
    byteList.ForEach(x => { resultList.AddRange(x); });
    return resultList.ToArray();
}

public static byte[] Int64Converter(Int64 x)
{
    return BitConverter.GetBytes(x);
}
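As a sketch of a less allocation-heavy ToBytes (my own variant), pre-size one buffer and Buffer.BlockCopy each value into place:
public static byte[] ToBytesFast(List<Int64> list)
{
    byte[] bytes = new byte[list.Count * sizeof(Int64)];
    for (int i = 0; i < list.Count; i++)
    {
        // Copy the 8 bytes of each Int64 directly into the result.
        Buffer.BlockCopy(BitConverter.GetBytes(list[i]), 0, bytes, i * sizeof(Int64), sizeof(Int64));
    }
    return bytes;
}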
Use Mono.DataConvert. This library has converters to/from most primitive types, for big-endian, little-endian, and host-order byte ordering.
CLR arrays know their types and sizes so you can't just cast an array of one type to another. However, it is possible to do unsafe casting of value types. For example, here's the source to BitConverter.GetBytes(long):
public static unsafe byte[] GetBytes(long value)
{
    byte[] buffer = new byte[8];
    fixed (byte* numRef = buffer)
    {
        *((long*)numRef) = value;
    }
    return buffer;
}
You could write this for a list of longs, like this:
public static unsafe byte[] GetBytes(IList<long> value)
{
    byte[] buffer = new byte[8 * value.Count];
    fixed (byte* numRef = buffer)
    {
        for (int i = 0; i < value.Count; i++)
            *((long*)(numRef + i * 8)) = value[i];
    }
    return buffer;
}
And of course it would be easy to go in the opposite direction if this was how you wanted to go.
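For example, a minimal sketch of the reverse direction in the same unsafe style (assuming bytes.Length is a multiple of 8):
public static unsafe long[] ToLongs(byte[] bytes)
{
    long[] result = new long[bytes.Length / 8];
    fixed (byte* numRef = bytes)
    {
        for (int i = 0; i < result.Length; i++)
            result[i] = *((long*)numRef + i); // reinterpret each 8-byte slot as a long
    }
    return result;
}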