I'm working on a project where I'll need to serialize some data in a Java 6 app and deserialize it in a C# 2.0 app. Is there a strategy, or something already in existence, that would allow me to do this with these two languages? I'm guessing they both support XML serialization, but I really need it to be binary serialized.
Protocol buffers would be a good option here. On the C# side, I would recommend Jon Skeet's dotnet-protobufs for this use-case, since it has the same API on both sides (his C# version is a port of the Google Java version, part of the core distribution). If you want the C# to be more "typical .NET", then protobuf-net may help.
(the wire format is obviously identical between versions; the API may vary)
Small, fast, efficient, portable.
For info, I know that protobuf-net has .NET 2.0 support; I honestly haven't tried this on Jon's version, but I expect it would - there isn't much that you need 3.0/3.5 for in protobuf.
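To illustrate why the wire format can be identical while the APIs differ, here is a minimal sketch of the base-128 varint encoding that Protocol Buffers uses for integers. The class name `VarintSketch` is hypothetical and not part of any protobuf library; in a real project you would use the generated classes on both sides rather than encoding by hand.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of any protobuf library.
final class VarintSketch {

    // Encodes a value as a base-128 varint: 7 payload bits per byte,
    // least significant group first, high bit set on every byte but the last.
    // The int is treated as unsigned (hence the unsigned shift).
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decodes a single varint that starts at the beginning of the array.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        byte[] encoded = encode(300);
        System.out.println(encoded.length + " bytes, decodes to " + decode(encoded));
    }
}
```

The value 300 comes out as the two bytes 0xAC 0x02. Any implementation, in Java or C#, has to produce exactly those bytes, which is what makes the format portable.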
Protocol Buffers (Google Site)
Java Tutorial
Jon Skeet's C# Port
Marc Gravell's C# Port
Upsides: it's fast, and quite a few of the people involved with it are on SO, so you can bug them. ;-)
Let me exploit Marc's project site: Performance is quite acceptable..
The default binary serialization of each language is incompatible so you will not be able to use that.
There are many cross-language serialization technologies that support Java, C#, and other languages:
JSON
Thrift
Protocol Buffers
Of these, JSON is not binary, but it is very efficient for a text-based format. Thrift and Protocol Buffers are binary and have a very compact representation.
You could try Hessian:
http://hessian.caucho.com/index.xtp
It's binary, and supports Java, C++, and several other languages. I've never used it myself, but came across it, thought it was interesting, and bookmarked it...
Google's Protocol Buffers is something that you could look into. You will need to check into the state of the usability of the C# implementation, but in all other respects, I think that it meets your needs.
You can use BSON if you really need the data as binary...
http://bsonspec.org/implementations.html
You can use the WOX cross-platform serialization library (https://github.com/codelion/wox). It is based on native XML serializers for Java and C#.
I don't believe native binary serialization will work, as C# and Java have no idea of each other's native types.
I'll echo most of the other answers here as far as Google protocol buffers is concerned. But I ended up using a program called protostuff at the Java end instead of Google's own Java implementation, and I also added the name of the (outermost) class as a prefix to the protocol buffers data to make the data self-describing for deserialization. Details here: https://stackoverflow.com/a/17923846/253938
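A minimal sketch of the "class name as a prefix" idea follows. The framing is an assumption made for illustration (a 4-byte big-endian length, the UTF-8 class name, then the payload bytes); the linked answer may use a different layout, and `NamedEnvelope` is a hypothetical name. The payload would be whatever bytes the protobuf library produced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical envelope: [int32 name length][UTF-8 name][payload bytes].
final class NamedEnvelope {
    final String typeName;
    final byte[] payload;

    NamedEnvelope(String typeName, byte[] payload) {
        this.typeName = typeName;
        this.payload = payload;
    }

    // Prefixes the serialized payload with the name of its outermost class.
    static byte[] wrap(String typeName, byte[] payload) {
        byte[] name = typeName.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length + payload.length)
                .putInt(name.length) // ByteBuffer is big-endian by default
                .put(name)
                .put(payload)
                .array();
    }

    // Splits a message back into the type name and the raw payload.
    static NamedEnvelope unwrap(byte[] message) {
        ByteBuffer in = ByteBuffer.wrap(message);
        int nameLength = in.getInt();
        if (nameLength < 0 || nameLength > in.remaining()) {
            throw new IllegalArgumentException("invalid name length");
        }
        byte[] name = new byte[nameLength];
        in.get(name);
        byte[] payload = new byte[in.remaining()];
        in.get(payload);
        return new NamedEnvelope(new String(name, StandardCharsets.UTF_8), payload);
    }
}
```

The receiver reads the name first, picks the matching message type, and only then hands the payload to the deserializer. The C# side has to read the length in the same byte order.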
Related
What's the correct/mainstream way to handle byte arrays sent from C# client code to Node.js server environment?
Now I'm using standard C# serialization through BinaryWriter (client-side) and streambuf npm-package (server-side).
But I'm pretty sure that there is a more straightforward/native/mainstream/efficient way to do it.
What should I do?
Thank you for your advice!
The mainstream way to convert an object to a serialized representation would be to use a serialization framework. If you are using Node.js, I would assume JSON would be most appropriate. I'm not familiar with Node.js, but my understanding is that it handles JSON natively. On the C# side, see Newtonsoft Json.NET, a very popular library.
If you need a more compact serialization format, you might take a look at protobuf, since that serializes to a binary format. It looks like there are protobuf packages available for Node.js, but again, I have no familiarity with JavaScript.
Using an existing serialization format tends to make it easier to share objects between different systems, since there are usually libraries available for the most popular formats on the most common platforms. It usually also makes compatibility easier, since many formats allow some flexibility for things like optional fields. So if you find that you need to add some more data, you can often do so in the confidence that it will not break existing users.
Is it possible to serialize a class/object in C# and deserialize it in Java? I want to serialize the class itself, not XML/JSON data. Please clarify.
Thanks
I see 3 options here. I suggest option 1, Protobufs.
Look into Google's ProtoBufs
Or some equivalent. Here's the java version. Here's a C# port.
Protobufs are meant for this sort of language interop. It's binary, small, fast, and language agnostic.
It also has backwards compatibility, so if you change the serialized objects in the future, you can still read old data. This is transparent to you too, as long as your code understands that newer fields could be missing when deserializing old objects. This is a huge advantage!
Implement one language's default serialization in the other
You can try implementing the java serialization logic in C#, or the C# serialization routines in Java. I don't suggest this as it will be more difficult, more verbose, almost certainly slower as you're writing new code, and will net you the same result.
Write your serialization routines by hand
This will certainly be fast, but tedious, more error prone, harder to maintain, less flexible...
Here are some benchmarks for libraries like ProtoBufs. This should aid you in selecting the best one for your use case.
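The backwards-compatibility point under option 1 can be sketched with a toy tagged format. This is not the protobuf wire format, only an illustration of the same principle under simplifying assumptions: every field is written as a one-byte field number, a one-byte length, and the value bytes. All names here (`TaggedRecordSketch`, `FIELD_NAME`, `FIELD_AGE`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy format for illustration: [field number][length][value bytes] repeated.
final class TaggedRecordSketch {
    static final int FIELD_NAME = 1;
    static final int FIELD_AGE = 2; // imagine this was added in a later version

    String name = ""; // default when the field is absent
    int age = -1;     // default when the field is absent

    // "Version 1" writer: only knows about the name field.
    static byte[] writeV1(String name) {
        byte[] n = utf8(name);
        return ByteBuffer.allocate(2 + n.length)
                .put((byte) FIELD_NAME).put((byte) n.length).put(n)
                .array();
    }

    // "Version 2" writer: also emits the age field.
    static byte[] writeV2(String name, int age) {
        byte[] n = utf8(name);
        return ByteBuffer.allocate(2 + n.length + 2 + 4)
                .put((byte) FIELD_NAME).put((byte) n.length).put(n)
                .put((byte) FIELD_AGE).put((byte) 4).putInt(age)
                .array();
    }

    // Reader: fills in known fields, skips unknown ones, keeps defaults
    // for anything that is missing.
    static TaggedRecordSketch read(byte[] data) {
        TaggedRecordSketch record = new TaggedRecordSketch();
        ByteBuffer in = ByteBuffer.wrap(data);
        while (in.hasRemaining()) {
            int field = in.get() & 0xFF;
            byte[] value = new byte[in.get() & 0xFF];
            in.get(value);
            if (field == FIELD_NAME) {
                record.name = new String(value, StandardCharsets.UTF_8);
            } else if (field == FIELD_AGE) {
                record.age = ByteBuffer.wrap(value).getInt();
            }
            // Any other field number is silently ignored.
        }
        return record;
    }

    private static byte[] utf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > 255) {
            throw new IllegalArgumentException("value too long for one-byte length");
        }
        return bytes;
    }
}
```

Data written by `writeV1` still reads correctly after the age field is introduced (the age simply keeps its default), and data carrying a field number the reader has never heard of is skipped rather than rejected.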
We did this a while ago, and it worked after a lot of tinkering. It really depends on byte order: I think Java uses one and C# uses another (big endian vs. little endian), so you will need to implement a deserializer that takes this into account. Hope this helps.
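As a sketch of what such a deserializer can look like, the following Java class reads values as written by C#'s `BinaryWriter`. It assumes the C# side used `BinaryWriter` with its default UTF-8 encoding and called `Write(int)`, `Write(double)` and `Write(string)`: integers and doubles are little-endian, and strings carry a 7-bit-encoded byte-length prefix. The class name `BinaryWriterReaderSketch` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical reader for bytes produced by C#'s BinaryWriter (default encoding).
final class BinaryWriterReaderSketch {
    private final ByteBuffer in;

    BinaryWriterReaderSketch(byte[] data) {
        // Java's ByteBuffer defaults to big-endian; BinaryWriter is little-endian.
        this.in = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    }

    int readInt32() {
        return in.getInt();
    }

    double readDouble() {
        return in.getDouble();
    }

    // BinaryWriter.Write(string) emits the UTF-8 byte count as a
    // 7-bit-encoded integer, followed by the UTF-8 bytes themselves.
    String readString() {
        int length = read7BitEncodedInt();
        byte[] bytes = new byte[length];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    private int read7BitEncodedInt() {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.get() & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("malformed length prefix");
    }
}
```

The fields must be read in the same order the C# code wrote them, since this format carries no field names or tags.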
As others have suggested, your options are going to be external serialization libraries (Google Protobuf, Apache Thrift, etc.), or simply using something built-in that's slower and less efficient bandwidth-wise (JSON, XML, etc.). You could also write your own, but believe me, it's a maintenance nightmare.
Not using native serialization. The built-in defaults are tied to the binary representation of the data types, which are different for the different VMs. The purpose of XML, JSON, and similar technologies is precisely to provide a format that's generic and can be moved between differing systems. For what it's worth, the overhead in serializing to JSON is usually small, and there's a lot of benefit to being able to read the serialized objects manually, so I'd recommend JSON unless you have a very specific reason why you can't.
Consider OMG's standard CORBA IIOP.
While you may not need the full-on "remote object" support of CORBA, IIOP is the underlying binary protocol for "moving language-neutral objects" (such as an object value parameter) across the wire.
For Java: Java EE EJB's are based on IIOP, there is RMI-IIOP; various support libraries. The IDL-to-Java compiler is delivered with the JDK.
For C# IIOP & integration with Java EE, see IIOP.NET
You can also consider BSON, which is used by MongoDB.
If it is OK for your C#/Java programs to communicate with a MongoDB database, you could store your objects there and read them with the appropriate driver.
Regarding BSON itself, see BSON and Data Interchange at the mongoDB blog.
I've got an application that uses both C# and Java and MSMQ. Technically the app is C# and MSMQ based with a need for a small Java component.
I've been using MSMQJava to serialize strings and integers from C# to Java.
Is there any library or technique out there that will allow me to serialize a C# object to a Java object?
I can keep the object very simple. Only string, double and integer values, no methods or references/pointers.
I would use JSON or XML. Both languages can handle those formats.
Maybe create a c# WCF service for the cross-language communication. It can just be a pass through for your c# code. Keep it simple (basicHttp or wsHttp) and you should be able to pass whatever kinds of primitives you like and call any c# methods you'd like from Java.
You do not need any add on libraries in dotNET to serialize/deserialize to JSON or XML if you are targeting at least Framework 3.0 or newer. You can do either or both using DataContracts which are very flexible and give you complete control over both ends of the process if you need. Refer to the following Microsoft article:
http://msdn.microsoft.com/en-us/library/ms733127.aspx
We use XStream on both Java and .NET. It's easy to use and very powerful.
.Net version
Java version
XML. Way to go. JSON is very effective since there are plenty of google JSON libraries at your disposal in Java.
The simplest approach would be plain XML. LINQ to XML is very simple to understand.
I would like my newer C# 2.0 application to talk to my older Java 1.4 application (can't change versions, sorry). What are my options?
I think that using shared memory would give me better performance, but on the other hand, if I use a network protocol then the architecture would be more flexible. So I'm looking to weigh up both options to see which has the biggest pay off.
I've used XML-RPC implementations that are dog slow, but I assume that was just a bad implementation, and not the actual protocol. Would I be better off going with a lower-level protocol? I've used Google's protobuf before in C++ and Python (over plain old sockets) but I'm not so sure that it's available for Java and C# -- is there anything similar available for the languages I'm using?
I'm looking for the best performance that I can possibly get, but, I'm working with objects and inheritance hierarchies that I'd like to serialize (protobuf is a good example of how this can be done). So, sadly, just sending a simple string over sockets isn't really feasible.
Aha, there are actually C# versions of protobuf!
http://code.google.com/p/protosharp/
http://code.google.com/p/protobuf-csharp-port
... and protobuf has support for Java anyway.
You might also consider JSON as the serialization for your objects. It is much lighter than XML, has the same ability to represent object hierarchies, and many libraries are available.
For the communication bus, however, I would recommend the network, since it gives better flexibility.
IMHO, your performance bottleneck is more likely to be serialization/deserialization than the communication bus itself.
I want to use object serialization to communicate over the network between a Mono server and Silverlight clients.
It is pretty important that serialization is space efficient and pretty fast, as the server is going to host multiple real time games.
What technique should I use? The BinaryFormatter adds a lot of overhead to serialized classes (Version, culture, class name, property names, etc.) that is not required within this application.
What can I do to make this more space efficient?
You can use Protocol Buffers. I'm changing all my serialization code from BinaryFormatter with compression to Protocol Buffers and obtaining very good results. It's more efficient in both time and space.
There are two .NET implementations by Jon Skeet and Marc Gravell.
Update: Official .NET implementation can be found here.
I have some benchmarks for the leading .NET serializers available based on the Northwind dataset.
@marcgravell's binary protobuf-net is the fastest implementation benchmarked, about 7x faster than the fastest serializer Microsoft provides in the BCL (the XML DataContractSerializer).
I also maintain some open-source high-performance .NET text serializers as well:
JSV TypeSerializer a compact, clean, JSON+CSV-like format that's 3.1x quicker than the DataContractSerializer
as well as a JsonSerializer that's 2.6x quicker.
As the author, I would invite you to try protobuf-net; it ships with binaries for both Mono 2.0 and Silverlight 2.0, and is fast and efficient. If you have any problems whatsoever, just drop me an e-mail (see my Stack Overflow profile); support is free.
Jon's version (see the earlier accepted answer) is also very good, but IMO the protobuf-net version is more idiomatic for C# - Jon's would be ideal if you were talking C# to Java, so you could have a similar API at both ends.
I had a similar problem, although I'm just using .NET. I wanted to send data over the Internet as quickly and easily as possible. I didn't find anything that would be optimized enough, so I made my own serializer, named NetSerializer.
NetSerializer has its limitations, but they didn't affect my use case. And I haven't done benchmarks for a while, but it was much much faster than anything else I found.
I haven't tried it on Mono or Silverlight. I'd bet it works on Mono, but I'm not sure what the level of support is for DynamicMethods on Silverlight.
You could try using JSON. It's not as bandwidth efficient as Protocol Buffers, but it would be a lot easier to monitor messages with tools like Wireshark, which helps a lot when debugging problems. .NET 3.5 comes with a JSON serializer.
You could pass the data through a DeflateStream or GZipStream to compress it prior to transmission. These classes live in the System.IO.Compression namespace.
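For comparison, here is a minimal sketch of the same compress-before-sending idea using Java's `java.util.zip` classes, the rough counterpart of .NET's `GZipStream`. Both produce the standard gzip format, so data compressed on one side can be decompressed on the other. The class name `GzipSketch` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip round trip over in-memory byte arrays.
final class GzipSketch {

    // Compresses a serialized message before transmission.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Restores the original bytes on the receiving side.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Compression pays off mostly for larger or repetitive messages. For very small packets the gzip header and trailer can outweigh the savings, so it is worth measuring with real traffic.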
I had a very similar problem - saving to a file. But the following can also be used over a network as it was actually designed for remoting.
The solution is to use Simon Hewitt's library - see Optimizing Serialization in .NET - part 2.
Part 1 of the article states (the bold is my emphasis):
"... If you've ever used .NET remoting for large amounts of
data, you will have found that there are problems with
scalability. For small amounts of data, it works well
enough, but larger amounts take a lot of CPU and memory,
generate massive amounts of data for transmission, and
can fail with Out Of Memory exceptions. There is also a big
problem with the time taken to actually perform the
serialization - large amounts of data can make it unfeasible
for use in apps ...."
I got a similar result for my particular application, 40
times faster saving and 20 times faster loading (from
minutes to seconds). The size of the serialised data was
also much reduced. I don't remember exactly, but it
was at least 2-3 times.
It is quite easy to get started. However there is one
gotcha: only use .NET serialisation for the very highest
level datastructure (to get serialisation/deserialisation
started) and then call the serialisation/deserialisation
functions directly for the fields in the highest level
datastructure. Otherwise there will not be any speed-up...
For instance, if a particular data structure (say
Generic.List) is not supported by the library, then .NET
serialisation will be used instead, and this is a no-no. Instead,
serialise the list in client code (or similar). For an example
see near "'This is our own encoding." in the same function
as listed below.
For reference: code from my application - see near "Note: this is the only place where we use the built-in .NET ...".
You can try BOIS which focuses on packed data size and provides the best packing so far. (I haven't seen better optimization yet.)
https://github.com/salarcode/Bois