Fastest way to serialize and deserialize .NET objects - c#

I'm looking for the fastest way to serialize and deserialize .NET objects. Here is what I have so far:
public class TD
{
public List<CT> CTs { get; set; }
public List<TE> TEs { get; set; }
public string Code { get; set; }
public string Message { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public static string Serialize(List<TD> tData)
{
var serializer = new XmlSerializer(typeof(List<TD>));
TextWriter writer = new StringWriter();
serializer.Serialize(writer, tData);
return writer.ToString();
}
public static List<TD> Deserialize(string tData)
{
var serializer = new XmlSerializer(typeof(List<TD>));
TextReader reader = new StringReader(tData);
return (List<TD>)serializer.Deserialize(reader);
}
}

Here's your model (with invented CT and TE) using protobuf-net (yet retaining the ability to use XmlSerializer, which can be useful - in particular for migration); I humbly submit (with lots of evidence if you need it) that this is the fastest (or certainly one of the fastest) general purpose serializer in .NET.
If you need strings, just base-64 encode the binary.
[XmlType]
public class CT {
[XmlElement(Order = 1)]
public int Foo { get; set; }
}
[XmlType]
public class TE {
[XmlElement(Order = 1)]
public int Bar { get; set; }
}
[XmlType]
public class TD {
[XmlElement(Order=1)]
public List<CT> CTs { get; set; }
[XmlElement(Order=2)]
public List<TE> TEs { get; set; }
[XmlElement(Order = 3)]
public string Code { get; set; }
[XmlElement(Order = 4)]
public string Message { get; set; }
[XmlElement(Order = 5)]
public DateTime StartDate { get; set; }
[XmlElement(Order = 6)]
public DateTime EndDate { get; set; }
public static byte[] Serialize(List<TD> tData) {
using (var ms = new MemoryStream()) {
ProtoBuf.Serializer.Serialize(ms, tData);
return ms.ToArray();
}
}
public static List<TD> Deserialize(byte[] tData) {
using (var ms = new MemoryStream(tData)) {
return ProtoBuf.Serializer.Deserialize<List<TD>>(ms);
}
}
}
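The base-64 suggestion above can be sketched as follows. This is a minimal illustration, not part of the original answer: the class name Base64Transport and its method names are made up for the example, and the byte[] is assumed to come from something like TD.Serialize.

```csharp
using System;

// Hypothetical helper (name invented for this sketch): carries a binary
// payload, e.g. the byte[] returned by TD.Serialize, as a plain string.
public static class Base64Transport
{
    // Encode the raw bytes as base-64 text.
    public static string ToText(byte[] payload) => Convert.ToBase64String(payload);

    // Decode the base-64 text back into the original bytes.
    public static byte[] FromText(string text) => Convert.FromBase64String(text);
}
```

Usage would look like `string s = Base64Transport.ToText(TD.Serialize(list));` and `var list2 = TD.Deserialize(Base64Transport.FromText(s));`. Base-64 text is roughly a third larger than the binary it encodes, so it only makes sense where a string is really required.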

I made a comprehensive comparison between different formats in this post:
https://medium.com/#maximn/serialization-performance-comparison-xml-binary-json-p-ad737545d227

Having an interest in this, I decided to test the suggested methods with the closest "apples to apples" test I could. I wrote a Console app, with the following code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Threading.Tasks;
namespace SerializationTests
{
class Program
{
static void Main(string[] args)
{
var count = 100000;
var rnd = new Random(DateTime.UtcNow.GetHashCode());
Console.WriteLine("Generating {0} arrays of data...", count);
var arrays = new List<int[]>();
for (int i = 0; i < count; i++)
{
var elements = rnd.Next(1, 100);
var array = new int[elements];
for (int j = 0; j < elements; j++)
{
array[j] = rnd.Next();
}
arrays.Add(array);
}
Console.WriteLine("Test data generated.");
var stopWatch = new Stopwatch();
Console.WriteLine("Testing BinarySerializer...");
var binarySerializer = new BinarySerializer();
var binarySerialized = new List<byte[]>();
var binaryDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
foreach (var array in arrays)
{
binarySerialized.Add(binarySerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("BinaryFormatter: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in binarySerialized)
{
binaryDeserialized.Add(binarySerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("BinaryFormatter: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine();
Console.WriteLine("Testing ProtoBuf serializer...");
var protobufSerializer = new ProtoBufSerializer();
var protobufSerialized = new List<byte[]>();
var protobufDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
foreach (var array in arrays)
{
protobufSerialized.Add(protobufSerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("ProtoBuf: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in protobufSerialized)
{
protobufDeserialized.Add(protobufSerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("ProtoBuf: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine();
Console.WriteLine("Testing NetSerializer serializer...");
var netSerializerSerializer = new ProtoBufSerializer();
var netSerializerSerialized = new List<byte[]>();
var netSerializerDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
foreach (var array in arrays)
{
netSerializerSerialized.Add(netSerializerSerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("NetSerializer: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in netSerializerSerialized)
{
netSerializerDeserialized.Add(netSerializerSerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("NetSerializer: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine("Press any key to end.");
Console.ReadKey();
}
public class BinarySerializer
{
private static readonly BinaryFormatter Formatter = new BinaryFormatter();
public byte[] Serialize(object toSerialize)
{
using (var stream = new MemoryStream())
{
Formatter.Serialize(stream, toSerialize);
return stream.ToArray();
}
}
public T Deserialize<T>(byte[] serialized)
{
using (var stream = new MemoryStream(serialized))
{
var result = (T)Formatter.Deserialize(stream);
return result;
}
}
}
public class ProtoBufSerializer
{
public byte[] Serialize(object toSerialize)
{
using (var stream = new MemoryStream())
{
ProtoBuf.Serializer.Serialize(stream, toSerialize);
return stream.ToArray();
}
}
public T Deserialize<T>(byte[] serialized)
{
using (var stream = new MemoryStream(serialized))
{
var result = ProtoBuf.Serializer.Deserialize<T>(stream);
return result;
}
}
}
public class NetSerializer
{
private static readonly NetSerializer Serializer = new NetSerializer();
public byte[] Serialize(object toSerialize)
{
return Serializer.Serialize(toSerialize);
}
public T Deserialize<T>(byte[] serialized)
{
return Serializer.Deserialize<T>(serialized);
}
}
}
}
The results surprised me; they were consistent when run multiple times:
Generating 100000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 336.8392ms.
BinaryFormatter: Deserializing took 208.7527ms.
Testing ProtoBuf serializer...
ProtoBuf: Serializing took 2284.3827ms.
ProtoBuf: Deserializing took 2201.8072ms.
Testing NetSerializer serializer...
NetSerializer: Serializing took 2139.5424ms.
NetSerializer: Deserializing took 2113.7296ms.
Press any key to end.
Collecting these results, I decided to see if ProtoBuf or NetSerializer performed better with larger objects. I changed the collection count to 10,000 objects, but increased the size of the arrays to 1-10,000 instead of 1-100. The results seemed even more definitive:
Generating 10000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 285.8356ms.
BinaryFormatter: Deserializing took 206.0906ms.
Testing ProtoBuf serializer...
ProtoBuf: Serializing took 10693.3848ms.
ProtoBuf: Deserializing took 5988.5993ms.
Testing NetSerializer serializer...
NetSerializer: Serializing took 9017.5785ms.
NetSerializer: Deserializing took 5978.7203ms.
Press any key to end.
My conclusion, therefore, is: there may be cases that ProtoBuf and NetSerializer are well suited to, but in terms of raw performance for at least relatively simple objects, BinaryFormatter is significantly faster, by at least an order of magnitude.
YMMV.

Protobuf is very very fast.
See http://code.google.com/p/protobuf-net/wiki/Performance for in depth information concerning the performance of this system, and an implementation.

Yet another serializer out there that claims to be super fast is NetSerializer.
The data given on their site shows performance of 2x - 4x over protobuf. I have not tried this myself, but if you are evaluating various options, try this one as well.

The binary serializer included with .NET should be faster than the XmlSerializer. Or use another serializer for protobuf, JSON, ...
But some of them require you to add attributes, or some other way to supply metadata. For example, ProtoBuf uses numeric property IDs internally, and the mapping needs to be conserved somehow by a different mechanism. Versioning isn't trivial with any serializer.

I removed the bugs in the above code and got the results below. I am also unsure, given that NetSerializer requires you to register the types you are serializing, what compatibility or performance differences that could make.
Generating 100000 arrays of data...
Test data generated.
Testing BinarySerializer...
BinaryFormatter: Serializing took 508.9773ms.
BinaryFormatter: Deserializing took 371.8499ms.
Testing ProtoBuf serializer...
ProtoBuf: Serializing took 3280.9185ms.
ProtoBuf: Deserializing took 3190.7899ms.
Testing NetSerializer serializer...
NetSerializer: Serializing took 427.1241ms.
NetSerializer: Deserializing took 78.954ms.
Press any key to end.
Modified Code
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Threading.Tasks;
namespace SerializationTests
{
class Program
{
static void Main(string[] args)
{
var count = 100000;
var rnd = new Random((int)DateTime.UtcNow.Ticks & 0xFF);
Console.WriteLine("Generating {0} arrays of data...", count);
var arrays = new List<int[]>();
for (int i = 0; i < count; i++)
{
var elements = rnd.Next(1, 100);
var array = new int[elements];
for (int j = 0; j < elements; j++)
{
array[j] = rnd.Next();
}
arrays.Add(array);
}
Console.WriteLine("Test data generated.");
var stopWatch = new Stopwatch();
Console.WriteLine("Testing BinarySerializer...");
var binarySerializer = new BinarySerializer();
var binarySerialized = new List<byte[]>();
var binaryDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
foreach (var array in arrays)
{
binarySerialized.Add(binarySerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("BinaryFormatter: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in binarySerialized)
{
binaryDeserialized.Add(binarySerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("BinaryFormatter: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine();
Console.WriteLine("Testing ProtoBuf serializer...");
var protobufSerializer = new ProtoBufSerializer();
var protobufSerialized = new List<byte[]>();
var protobufDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
foreach (var array in arrays)
{
protobufSerialized.Add(protobufSerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("ProtoBuf: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in protobufSerialized)
{
protobufDeserialized.Add(protobufSerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("ProtoBuf: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine();
Console.WriteLine("Testing NetSerializer serializer...");
var netSerializerSerialized = new List<byte[]>();
var netSerializerDeserialized = new List<int[]>();
stopWatch.Reset();
stopWatch.Start();
var netSerializerSerializer = new NS();
foreach (var array in arrays)
{
netSerializerSerialized.Add(netSerializerSerializer.Serialize(array));
}
stopWatch.Stop();
Console.WriteLine("NetSerializer: Serializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
stopWatch.Reset();
stopWatch.Start();
foreach (var serialized in netSerializerSerialized)
{
netSerializerDeserialized.Add(netSerializerSerializer.Deserialize<int[]>(serialized));
}
stopWatch.Stop();
Console.WriteLine("NetSerializer: Deserializing took {0}ms.", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine("Press any key to end.");
Console.ReadKey();
}
public class BinarySerializer
{
private static readonly BinaryFormatter Formatter = new BinaryFormatter();
public byte[] Serialize(object toSerialize)
{
using (var stream = new MemoryStream())
{
Formatter.Serialize(stream, toSerialize);
return stream.ToArray();
}
}
public T Deserialize<T>(byte[] serialized)
{
using (var stream = new MemoryStream(serialized))
{
var result = (T)Formatter.Deserialize(stream);
return result;
}
}
}
public class ProtoBufSerializer
{
public byte[] Serialize(object toSerialize)
{
using (var stream = new MemoryStream())
{
ProtoBuf.Serializer.Serialize(stream, toSerialize);
return stream.ToArray();
}
}
public T Deserialize<T>(byte[] serialized)
{
using (var stream = new MemoryStream(serialized))
{
var result = ProtoBuf.Serializer.Deserialize<T>(stream);
return result;
}
}
}
public class NS
{
NetSerializer.Serializer Serializer = new NetSerializer.Serializer(new Type[] { typeof(int), typeof(int[]) });
public byte[] Serialize(object toSerialize)
{
using (var stream = new MemoryStream())
{
Serializer.Serialize(stream, toSerialize);
return stream.ToArray();
}
}
public T Deserialize<T>(byte[] serialized)
{
using (var stream = new MemoryStream(serialized))
{
Serializer.Deserialize(stream, out var result);
return (T)result;
}
}
}
}
}

You can try Salar.Bois serializer which has a decent performance. Its focus is on payload size but it also offers good performance.
There are benchmarks in the Github page if you wish to see and compare the results by yourself.
https://github.com/salarcode/Bois

I took the liberty of feeding your classes into the CGbR generator. Because it is in an early stage it doesn't support DateTime yet, so I simply replaced it with long. The generated serialization code looks like this:
public int Size
{
get
{
var size = 24;
// Add size for collections and strings
size += Cts == null ? 0 : Cts.Count * 4;
size += Tes == null ? 0 : Tes.Count * 4;
size += Code == null ? 0 : Code.Length;
size += Message == null ? 0 : Message.Length;
return size;
}
}
public byte[] ToBytes(byte[] bytes, ref int index)
{
if (index + Size > bytes.Length)
throw new ArgumentOutOfRangeException("index", "Object does not fit in array");
// Convert Cts
// Two bytes length information for each dimension
GeneratorByteConverter.Include((ushort)(Cts == null ? 0 : Cts.Count), bytes, ref index);
if (Cts != null)
{
for(var i = 0; i < Cts.Count; i++)
{
var value = Cts[i];
value.ToBytes(bytes, ref index);
}
}
// Convert Tes
// Two bytes length information for each dimension
GeneratorByteConverter.Include((ushort)(Tes == null ? 0 : Tes.Count), bytes, ref index);
if (Tes != null)
{
for(var i = 0; i < Tes.Count; i++)
{
var value = Tes[i];
value.ToBytes(bytes, ref index);
}
}
// Convert Code
GeneratorByteConverter.Include(Code, bytes, ref index);
// Convert Message
GeneratorByteConverter.Include(Message, bytes, ref index);
// Convert StartDate
GeneratorByteConverter.Include(StartDate.ToBinary(), bytes, ref index);
// Convert EndDate
GeneratorByteConverter.Include(EndDate.ToBinary(), bytes, ref index);
return bytes;
}
public Td FromBytes(byte[] bytes, ref int index)
{
// Read Cts
var ctsLength = GeneratorByteConverter.ToUInt16(bytes, ref index);
var tempCts = new List<Ct>(ctsLength);
for (var i = 0; i < ctsLength; i++)
{
var value = new Ct().FromBytes(bytes, ref index);
tempCts.Add(value);
}
Cts = tempCts;
// Read Tes
var tesLength = GeneratorByteConverter.ToUInt16(bytes, ref index);
var tempTes = new List<Te>(tesLength);
for (var i = 0; i < tesLength; i++)
{
var value = new Te().FromBytes(bytes, ref index);
tempTes.Add(value);
}
Tes = tempTes;
// Read Code
Code = GeneratorByteConverter.GetString(bytes, ref index);
// Read Message
Message = GeneratorByteConverter.GetString(bytes, ref index);
// Read StartDate
StartDate = DateTime.FromBinary(GeneratorByteConverter.ToInt64(bytes, ref index));
// Read EndDate
EndDate = DateTime.FromBinary(GeneratorByteConverter.ToInt64(bytes, ref index));
return this;
}
I created a list of sample objects like this:
var objects = new List<Td>();
for (int i = 0; i < 1000; i++)
{
var obj = new Td
{
Message = "Hello my friend",
Code = "Some code that can be put here",
StartDate = DateTime.Now.AddDays(-7),
EndDate = DateTime.Now.AddDays(2),
Cts = new List<Ct>(),
Tes = new List<Te>()
};
for (int j = 0; j < 10; j++)
{
obj.Cts.Add(new Ct { Foo = i * j });
obj.Tes.Add(new Te { Bar = i + j });
}
objects.Add(obj);
}
Results on my machine in Release build:
var watch = new Stopwatch();
watch.Start();
var bytes = BinarySerializer.SerializeMany(objects);
watch.Stop();
Size: 149000 bytes
Time: 2.059ms 3.13ms
Edit: Starting with CGbR 0.4.3 the binary serializer supports DateTime. Unfortunately the DateTime.ToBinary method is insanely slow. I will replace it with something faster soon.
Edit2: When using UTC DateTime by invoking ToUniversalTime() the performance is restored and clocks in at 1.669ms.

Related

How to conduct proper deep copying?

I'm trying to perform a deep copy of an object in C# so when I do the following:
Route currentBestRoute = Ants[0].Route;
currentBestRoute would not change after altering Ants[0].Route.
I have tried altering the Route class:
using System;
using System.Collections.Generic;
namespace ACO.Models
{
public class Route : ICloneable
{
public List<City> Cities = new List<City>();
public string Name
{
get
{
string name = "";
for(int i = 0; i < Cities.Count; i++)
{
name += Cities[i].Name;
if (i != Cities.Count - 1)
{
name += "->";
}
}
return name;
}
}
public double Distance
{
get
{
double distance = 0.0;
for(int i = 0; i < Cities.Count - 1; i++)
{
distance += Cities[i].measureDistance(Cities[i + 1]);
}
return distance;
}
}
public object Clone()
{
Route route = new Route
{
Cities = Cities
};
return route;
}
}
}
and conduct a deep clone as below:
private static Route GetCurrentBestRoute()
{
Route currentBestRoute = (Route) Ants[0].Route.Clone();
foreach(Ant ant in Ants)
{
if(ant.Route.Distance < currentBestRoute.Distance)
{
currentBestRoute = (Route) ant.Route.Clone();
}
}
return currentBestRoute;
}
But this is not working. currentBestRoute still changes on its own every time the Ants List is updated.
Am I missing something?
public object Clone()
{
Route route = new Route
{
//Cities = Cities
Cities = this.Cities.ToList(),
};
return route;
}
The ICloneable interface doesn't create a deep copy by itself. You can put the [Serializable] attribute on the class
and use this generic code:
public static T DeepClone<T>(T obj)
{
using (var ms = new MemoryStream())
{
var formatter = new BinaryFormatter();
formatter.Serialize(ms, obj);
ms.Position = 0;
return (T) formatter.Deserialize(ms);
}
}
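A serialization-free alternative is to copy the object graph by hand. The sketch below is only an illustration under assumptions: the question does not show the City class, so the City here (with Name, X and Y) is a hypothetical stand-in, and DeepCopy is a name invented for the example. It shows why `Cities = Cities` shares one list, why `Cities.ToList()` copies the list but still shares the City objects, and what copying each element looks like.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's City class; the real one is not shown.
public class City
{
    public string Name { get; set; }
    public double X { get; set; }
    public double Y { get; set; }

    // Copies every field, so the copy shares nothing mutable with the original.
    public City DeepCopy() => new City { Name = Name, X = X, Y = Y };
}

public class Route
{
    public List<City> Cities = new List<City>();

    // A new list AND new City instances: later changes to the original route
    // or to its cities do not show up in the copy.
    public Route DeepCopy() => new Route
    {
        Cities = Cities.Select(c => c.DeepCopy()).ToList()
    };
}
```

If City is immutable in the real code, copying just the list (`Cities.ToList()`) is already enough, because there is nothing in a City that could change underneath the copy.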

Generalize deserializing from byte array

I am parsing a byte array containing values of different types stored in a fixed format. For example, the first 4 bytes could be an int containing the size of an array, let's say an array of doubles, so the next 8 bytes represent a double (the first element of the array), and so on. In theory it could contain values of other types, but let's say we can only have bool, int, uint, short, ushort, long, ulong, float, double and arrays of each of these. Simple approach:
public class FixedFormatParser
{
private byte[] _Contents;
private int _CurrentPos = 0;
public FixedFormatParser(byte[] contents)
{
_Contents = contents;
}
bool ReadBool()
{
bool res = BitConverter.ToBoolean(_Contents, _CurrentPos);
_CurrentPos += sizeof(bool);
return res;
}
int ReadInt()
{
int res = BitConverter.ToInt32(_Contents, _CurrentPos);
_CurrentPos += sizeof(int);
return res;
}
// etc. for uint, short, ushort, long, ulong, float, double
int[] ReadIntArray()
{
int size = ReadInt();
if (size == 0)
return null;
int[] res = new int[size];
for (int i = 0; i < size; i++)
res[i] = ReadInt();
return res;
}
// etc. for bool, uint, short, ushort, long, ulong, float, double
}
I can obviously write 18 methods to cover each case, but it seems like there should be a way to generalize this.
bool val = Read<bool>();
long[] arr = ReadArray<long>(); // or ReadArray(Read<long>);
Obviously I don't mean write 2 wrappers in addition to the 18 methods to allow for this syntax. The syntax is not important, the code duplication is the issue. Another consideration is the performance. Ideally there would not be any (or much) of a performance hit. Thanks.
Update:
Regarding the other questions that are supposedly duplicates: I disagree, as none of them addressed the particular generalization I am after, but one came pretty close.
The first answer in "C# Reading Byte Array" described wrapping BinaryReader. This would cover 9 of the 18 methods, so half of the problem is addressed. I would still need to write all of the various array reads.
public class FixedFormatParser2 : BinaryReader
{
public FixedFormatParser2(byte[] input) : base(new MemoryStream(input))
{
}
public override string ReadString()
{
//
}
public double[] ReadDoubleArray()
{
int size = ReadInt32();
if (size == 0)
return null;
double[] res = new double[size];
for (int i = 0; i < size; i++)
res[i] = ReadDouble();
return res;
}
}
How do I not write a separate ReadXXXArray for each of the types?
Nearest I got to it:
public void WriteCountedArray(dynamic[] input)
{
if (input == null || input.Length == 0)
Write((int)0);
else
{
Write(input.Length);
foreach (dynamic val in input)
Write(val);
}
}
This compiles, but calling it is cumbersome and expensive:
using (FixedFormatWriter writer = new FixedFormatWriter())
{
double[] array = new double[3];
// ... assign values
writer.WriteCountedArray(array.Select(x=>(dynamic)x).ToArray());
I like doing it like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Xml;
using System.Xml.Serialization;
using System.IO;
namespace ConsoleApplication50
{
class Program
{
static void Main(string[] args)
{
new Format();
}
}
public class Format
{
public enum TYPES
{
INT,
INT16,
LONG
}
public static List<Format> format = new List<Format>() {
new Format() { name = "AccountNumber", _type = TYPES.INT ,numberOfBytes = 4},
new Format() { name = "Age", _type = TYPES.INT16 ,numberOfBytes = 2},
new Format() { name = "AccountNumber", _type = TYPES.LONG ,numberOfBytes = 8}
};
public Dictionary<string, object> dict = new Dictionary<string, object>();
public string name { get; set; }
public TYPES _type { get; set; }
public int numberOfBytes { get; set; }
public Format() { }
public Format(byte[] contents)
{
MemoryStream stream = new MemoryStream(contents);
BinaryReader reader = new BinaryReader(stream);
foreach (Format item in format)
{
switch (item._type)
{
case TYPES.INT16 :
dict.Add(item.name, reader.ReadInt16());
break;
case TYPES.INT:
dict.Add(item.name, reader.ReadInt32());
break;
case TYPES.LONG:
dict.Add(item.name, reader.ReadInt64());
break;
}
}
}
}
}
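The Read<T>() / ReadArray<T>() syntax asked for in the question can be sketched with a generic `unmanaged` constraint. This is one possible approach under stated assumptions, not something taken from the answers above: it assumes C# 7.3 or later and a runtime where Span<T>, MemoryMarshal and Unsafe.SizeOf are available in-box (for example .NET Core 3.0+ or .NET 5+). The class name GenericFixedFormatParser is hypothetical. Like BitConverter, it reads values in the machine's byte order, and its performance has not been measured here.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical generic variant of the question's FixedFormatParser.
public class GenericFixedFormatParser
{
    private readonly byte[] _contents;
    private int _currentPos;

    public GenericFixedFormatParser(byte[] contents)
    {
        _contents = contents;
    }

    // Reads one value of any unmanaged type (bool, int, double, ...) at the
    // current position and advances past it.
    public T Read<T>() where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        T value = MemoryMarshal.Read<T>(_contents.AsSpan(_currentPos, size));
        _currentPos += size;
        return value;
    }

    // Reads an int element count followed by that many values of T.
    // Returns null for a count of 0, mirroring the question's ReadIntArray.
    public T[] ReadArray<T>() where T : unmanaged
    {
        int count = Read<int>();
        if (count == 0)
            return null;
        int byteLength = checked(count * Unsafe.SizeOf<T>());
        // Reinterpret the byte range as T values and copy them into a new array.
        T[] result = MemoryMarshal.Cast<byte, T>(_contents.AsSpan(_currentPos, byteLength)).ToArray();
        _currentPos += byteLength;
        return result;
    }
}
```

With this, `bool val = parser.Read<bool>();` and `long[] arr = parser.ReadArray<long>();` cover all 18 cases with two methods. AsSpan throws if the requested range runs past the end of the buffer, so a corrupt length prefix fails fast instead of reading garbage.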

clrobj(<class name>) does not have llvm when passing array of struct to GPU Kernel (ALEA Library)

I am getting the "Fody/Alea.CUDA: clrobj(cGPU) does not have llvm" build error for a code in which I try to pass an array of struct to the NVIDIA Kernel using ALEA library. Here is a simplified version of my code. I removed the output gathering functionality in order to keep the code simple. I just need to be able to send the array of struct to the GPU for the moment.
using Alea.CUDA;
using Alea.CUDA.Utilities;
using Alea.CUDA.IL;
namespace GPUProgramming
{
public class cGPU
{
public int Slice;
public float FloatValue;
}
[AOTCompile(AOTOnly = true)]
public class TestModule : ILGPUModule
{
public TestModule(GPUModuleTarget target) : base(target)
{
}
const int blockSize = 64;
[Kernel]
public void Kernel2(deviceptr<cGPU> Data, int n)
{
var start = blockIdx.x * blockDim.x + threadIdx.x;
int ind = threadIdx.x;
var sharedSlice = __shared__.Array<int>(64);
var sharedFloatValue = __shared__.Array<float>(64);
if (ind < n && start < n)
{
sharedSlice[ind] = Data[start].Slice;
sharedFloatValue[ind] = Data[start].FloatValue;
Intrinsic.__syncthreads();
}
}
public void Test2(deviceptr<cGPU> Data, int n, int NumOfBlocks)
{
var GridDim = new dim3(NumOfBlocks, 1);
var BlockDim = new dim3(64, 1);
try
{
var lp = new LaunchParam(GridDim, BlockDim);
GPULaunch(Kernel2, lp, Data, n);
}
catch (CUDAInterop.CUDAException x)
{
var code = x.Data0;
Console.WriteLine("ErrorCode = {0}", code);
}
}
public void Test2(cGPU[] Data)
{
int NumOfBlocks = Common.divup(Data.Length, blockSize);
using (var d_Slice = GPUWorker.Malloc(Data))
{
try
{
Test_Kernel2(d_Slice.Ptr, Data.Length, NumOfBlocks);
}
catch (CUDAInterop.CUDAException x)
{
var code = x.Data0;
Console.WriteLine("ErrorCode = {0}", x.Data0);
}
}
}
}
}
Your data type is a class, which is a reference type. Try using a struct instead. Reference types don't fit the GPU well, since they require allocating small blocks of memory on the heap.

Read x number of lines of a file at a time C#

I want to read and process 10+ lines at a time for GB files, but haven't found a solution to spit out 10 lines until the end.
My last attempt was :
int n = 10;
foreach (var line in File.ReadLines("path")
.AsParallel().WithDegreeOfParallelism(n))
{
System.Console.WriteLine(line);
Thread.Sleep(1000);
}
I've seen solutions that use buffer sizes but I want to read in the entire row.
The default behaviour is to read all the lines in one shot; if you want to read fewer than that, you need to dig a little deeper into how the lines are read and use a StreamReader, which lets you control the reading process:
using (StreamReader sr = new StreamReader(path))
{
while (sr.Peek() >= 0)
{
Console.WriteLine(sr.ReadLine());
}
}
StreamReader also has a ReadLineAsync method that returns a task.
If you keep these tasks in a ConcurrentBag, you can very easily keep the processing running on 10 lines at a time.
var bag = new ConcurrentBag<Task>();
using (StreamReader sr = new StreamReader(path))
{
while(sr.Peek() >=0)
{
if(bag.Count < 10)
{
Task processing = sr.ReadLineAsync().ContinueWith( (read) => {
string s = read.Result;//EDIT Removed await to reflect Scots comment
//process line
});
bag.Add(processing);
}
else
{
Task.WaitAny(bag.ToArray());
//remove completed tasks from bag
}
}
}
Note that this code is for guidance only, not to be used as is.
If all you want is the last ten lines, you can get them with the solution here:
How to read a text file reversely with iterator in C#
This method would create "pages" of lines from your file.
public static IEnumerable<string[]> ReadFileAsLinesSets(string fileName, int setLen = 10)
{
using (var reader = new StreamReader(fileName))
while (!reader.EndOfStream)
{
var set = new List<string>();
for (var i = 0; i < setLen && !reader.EndOfStream; i++)
{
set.Add(reader.ReadLine());
}
yield return set.ToArray();
}
}
... More fun version...
class Example
{
static void Main(string[] args)
{
"YourFile.txt".ReadAsLines()
.AsPaged(10)
.Select(a=>a.ToArray()) //required or else you will get random data since "WrappedEnumerator" is not thread safe
.AsParallel()
.WithDegreeOfParallelism(10)
.ForAll(a =>
{
//Do your work here.
Console.WriteLine(a.Aggregate(new StringBuilder(),
(sb, v) => sb.AppendFormat("{0:000000} ", v),
sb => sb.ToString()));
});
}
}
public static class ToolsEx
{
public static IEnumerable<IEnumerable<T>> AsPaged<T>(this IEnumerable<T> items,
int pageLength = 10)
{
using (var enumerator = new WrappedEnumerator<T>(items.GetEnumerator()))
while (!enumerator.IsDone)
yield return enumerator.GetNextPage(pageLength);
}
public static IEnumerable<T> GetNextPage<T>(this IEnumerator<T> enumerator,
int pageLength = 10)
{
for (var i = 0; i < pageLength && enumerator.MoveNext(); i++)
yield return enumerator.Current;
}
public static IEnumerable<string> ReadAsLines(this string fileName)
{
using (var reader = new StreamReader(fileName))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
internal class WrappedEnumerator<T> : IEnumerator<T>
{
public WrappedEnumerator(IEnumerator<T> enumerator)
{
this.InnerEnumerator = enumerator;
this.IsDone = false;
}
public IEnumerator<T> InnerEnumerator { get; private set; }
public bool IsDone { get; private set; }
public T Current { get { return this.InnerEnumerator.Current; } }
object System.Collections.IEnumerator.Current { get { return this.Current; } }
public void Dispose()
{
this.InnerEnumerator.Dispose();
this.IsDone = true;
}
public bool MoveNext()
{
var next = this.InnerEnumerator.MoveNext();
this.IsDone = !next;
return next;
}
public void Reset()
{
this.IsDone = false;
this.InnerEnumerator.Reset();
}
}
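On newer runtimes the paging shown above may not need custom code at all. The sketch below assumes .NET 6 or later, where LINQ provides Enumerable.Chunk. The class and method names (BatchReader, ReadInBatches) are invented for the example. File.ReadLines enumerates the file lazily, so only one batch of lines has to be held at a time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: yields the file's lines in arrays of up to batchSize lines.
public static class BatchReader
{
    public static IEnumerable<string[]> ReadInBatches(string path, int batchSize = 10)
    {
        // ReadLines streams the file line by line; Chunk groups consecutive
        // lines into arrays, with a shorter final array if lines run out.
        return File.ReadLines(path).Chunk(batchSize);
    }
}
```

A caller can then write `foreach (string[] batch in BatchReader.ReadInBatches("path")) { ... }` and process each group of up to ten whole lines, with the last batch holding whatever remains.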

How to write protobuf serialized content directly to SharpZipLib stream

Is it possible to write protobuf serialized content directly to a SharpZipLib stream? When I try to do this, it looks like the provided stream is not filled with the data from protobuf. Later I would need to get the deserialized entity back from the provided Zip stream.
My code looks like this:
private byte[] ZipContent(T2 content)
{
const short COMPRESSION_LEVEL = 4; // 0-9
const string ENTRY_NAME = "DefaultEntryName";
byte[] result = null;
if (content == null)
return result;
IStreamSerializerProto<T2> serializer = this.GetSerializer(content.GetType());
using (MemoryStream outputStream = new MemoryStream())
{
using (ZipOutputStream zipOutputStream = new ZipOutputStream(outputStream))
{
zipOutputStream.SetLevel(COMPRESSION_LEVEL);
ZipEntry entry = new ZipEntry(ENTRY_NAME);
entry.DateTime = DateTime.Now;
zipOutputStream.PutNextEntry(entry);
serializer.Serialize(zipOutputStream, content);
}
result = outputStream.ToArray();
}
return result;
}
private class ProtobufStreamSerializer<T3> : IStreamSerializerProto<T3>
{
public ProtobufStreamSerializer()
{
ProtoBuf.Serializer.PrepareSerializer<T3>();
}
public void Serialize(Stream outputStream, T3 content)
{
Serializer.Serialize(outputStream, content);
}
public T3 Deserialize(Stream inputStream)
{
T3 deserializedObj;
using (inputStream)
{
deserializedObj = ProtoBuf.Serializer.Deserialize<T3>(inputStream);
}
return deserializedObj;
}
}
Sample of a class which I'm trying to serialize:
[Serializable]
[ProtoContract]
public class Model
{
[XmlElement("ModelCode")]
[ProtoMember(1)]
public int ModelCode { get; set; }
...
}
This is the problem, I believe (in the original code in the question):
public void Serialize(Stream outputStream, T3 content)
{
using (var stream = new MemoryStream())
{
Serializer.Serialize(stream, content);
}
}
You're completely ignoring outputStream, instead writing the data just to a new MemoryStream which is then ignored.
I suspect you just want:
public void Serialize(Stream outputStream, T3 content)
{
Serializer.Serialize(outputStream, content);
}
I'd also suggest removing the using statement from your Deserialize method: I'd expect the caller to be responsible for disposing of the input stream when they're finished with it. Your method can be simplified to:
public T3 Deserialize(Stream inputStream)
{
return ProtoBuf.Serializer.Deserialize<T3>(inputStream);
}
The code (with the edit pointed out by Jon) looks fine. Here it is working:
static void Main()
{
var obj = new Bar{ X = 123, Y = "abc" };
var wrapper = new Foo<Bar>();
var blob = wrapper.ZipContent(obj);
var clone = wrapper.UnzipContent(blob);
}
where Bar is:
[ProtoContract]
class Bar
{
[ProtoMember(1)]
public int X { get; set; }
[ProtoMember(2)]
public string Y { get; set; }
}
and Foo<T> is your class (I didn't know the name), where I have added:
public T2 UnzipContent(byte[] data)
{
using(var ms = new MemoryStream(data))
using(var zip = new ZipInputStream(ms))
{
var entry = zip.GetNextEntry();
var serializer = this.GetSerializer(typeof(T2));
return serializer.Deserialize(zip);
}
}
Also, note that compression is double-edged. In the example I give above, the underlying size (i.e. if we just write to a MemoryStream) is 7 bytes. ZipOutputStream "compresses" these 7 bytes into 179 bytes. Compression works best on larger objects, usually when there is lots of text content.
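The size effect described above is easy to check for yourself. The sketch below is an illustration only, and it swaps in the framework's own GZipStream from System.IO.Compression instead of SharpZipLib, so the exact numbers will differ from the 179 bytes quoted above. The class name CompressionDemo is made up for the example.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper: gzip-compresses a buffer so that the input and
// output sizes can be compared.
public static class CompressionDemo
{
    public static byte[] GZip(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed data
            // and the gzip trailer into the MemoryStream.
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }
}
```

Comparing `CompressionDemo.GZip(new byte[7]).Length` with `CompressionDemo.GZip(new byte[10000]).Length` shows both edges: the tiny payload grows because of the fixed header and trailer, while the large, repetitive one shrinks considerably.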
