XML compression compatible with both Java and C#

I'm building a C# front end that communicates with a Java Tomcat server via HTTP.
The WOX package is used to de/serialize the objects on the Java and C# ends.
However, I want to reduce the time spent sending XML strings over HTTP by using some form of XML compression.
My questions are:
Is WOX de/serialization, with XML strings passed back and forth, the best way to communicate between C# and Java?
What free XML compression libraries should I consider to increase the speed?
Many thanks.
Chapax

I'd initially try just applying gzip compression at the HTTP level - partly because it should be possible to apply it transparently to your app. XML generally compresses pretty well. Do you have a specific target in mind, so you'll know when a result is "good enough"? (If not, that might be the first thing to work out - otherwise you won't know when to stop.) Tomcat supports gzip compression as a connector configuration option.
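On the C# client, decompression can be handled by the framework. Here is a minimal sketch (the URL is made up) that asks HttpWebRequest to negotiate and decompress gzip transparently:

    using System;
    using System.IO;
    using System.Net;

    class GzipHttpClientSketch
    {
        static void Main()
        {
            // Hypothetical endpoint; substitute your Tomcat URL.
            var request = (HttpWebRequest)WebRequest.Create("http://localhost:8080/myapp/objects");

            // With this set, the framework sends Accept-Encoding and
            // decompresses the response transparently.
            request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string xml = reader.ReadToEnd(); // decompressed XML, ready for WOX
                Console.WriteLine("Received {0} characters", xml.Length);
            }
        }
    }

On the Tomcat side, setting compression="on" on the HTTP connector enables gzip for responses, so neither end needs application-level compression code.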
As for whether XML is the right way to go - it certainly has advantages and disadvantages. There are plenty of other serialization options, including JSON, Thrift and Protocol Buffers. Each has pros and cons in terms of platform integration, size, readability, versioning etc. You should work out what's important to you and then look at the options in terms of those considerations.

Related

The best way to handle serialized objects from a C# client in a Node.js server environment

What's the correct/mainstream way to handle byte arrays sent from C# client code to a Node.js server environment?
Right now I'm using standard C# serialization through BinaryWriter (client-side) and the streambuf npm package (server-side).
But I'm pretty sure that there is a more straightforward/native/mainstream/efficient way to do it.
What do I have to do?
Thank you for your advice!
The mainstream way to convert an object to a serialized representation would be to use a serialization framework. If you are using Node.js, I would assume JSON would be most appropriate. I'm not familiar with Node.js, but my understanding is that it handles JSON natively. See also Newtonsoft Json for another very popular library.
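As a hedged illustration, here is a minimal Json.NET sketch (the Player type is invented); Node.js can parse the output with JSON.parse():

    using System;
    using Newtonsoft.Json;

    // Hypothetical type used only for illustration.
    public class Player
    {
        public string Name { get; set; }
        public int Score { get; set; }
    }

    class JsonInteropSketch
    {
        static void Main()
        {
            var player = new Player { Name = "Ada", Score = 42 };

            // Serialize on the C# client; send the string (e.g. over a socket or HTTP).
            string json = JsonConvert.SerializeObject(player);
            Console.WriteLine(json); // {"Name":"Ada","Score":42}

            // Round-trip to confirm the format is lossless for this type.
            var copy = JsonConvert.DeserializeObject<Player>(json);
            Console.WriteLine(copy.Name + " / " + copy.Score);
        }
    }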
If you need a more compact serialization format you might take a look at protobuf, since that serializes to a binary format. It looks like there are protobuf packages available for Node.js, but again, I have no familiarity with JavaScript.
Using an existing serialization format tends to make it easier to share objects between different systems, since there are usually libraries available for the most popular formats on the most common platforms. It usually also makes compatibility easier, since many formats allow some flexibility for things like optional fields. So if you find that you need to add some more data, you can often do so confident that it will not break existing users.

Serialization in C# and de-serialization in Java

Is it possible to serialize a class/object in C# and deserialize the same in Java? I want to serialize the class and not any XML/JSON data. Please clarify.
Thanks
I see 3 options here. I suggest option 1, Protobufs.
Look into Google's ProtoBufs
Or some equivalent. Here's the Java version. Here's a C# port.
Protobufs are meant for this sort of language interop. The format is binary, small, fast, and language agnostic.
It also has backwards compatibility, so if you change the serialized objects in the future, you can still read them. This feature is transparent to you too, as long as you write code that understands newer fields may be missing when deserializing old objects. This is a huge advantage!
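To make the versioning point concrete, here is a sketch using the protobuf-net port (other ports expose different APIs; the Person types are invented). An old contract reads data written by a newer one, silently skipping the unknown field:

    using System;
    using System.IO;
    using ProtoBuf; // protobuf-net

    // v1 of the contract, as an old client might still have it.
    [ProtoContract]
    public class PersonV1
    {
        [ProtoMember(1)] public string Name { get; set; }
    }

    // v2 adds a field under a new tag; old readers simply skip it.
    [ProtoContract]
    public class PersonV2
    {
        [ProtoMember(1)] public string Name { get; set; }
        [ProtoMember(2)] public int Age { get; set; }
    }

    class VersioningSketch
    {
        static void Main()
        {
            using (var stream = new MemoryStream())
            {
                Serializer.Serialize(stream, new PersonV2 { Name = "Ada", Age = 36 });
                stream.Position = 0;

                // Old code deserializes newer data: the unknown Age field is ignored.
                PersonV1 old = Serializer.Deserialize<PersonV1>(stream);
                Console.WriteLine(old.Name); // "Ada"
            }
        }
    }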
Implement one language's default serialization in the other
You can try implementing the java serialization logic in C#, or the C# serialization routines in Java. I don't suggest this as it will be more difficult, more verbose, almost certainly slower as you're writing new code, and will net you the same result.
Write your serialization routines by hand
This will certainly be fast, but tedious, more error prone, harder to maintain, less flexible...
Here are some benchmarks for libraries like ProtoBufs. They should aid you in selecting the best one for your use case.
We did this a while ago; it worked after a lot of tinkering. It really depends on byte encoding: Java uses one and C# uses another (big endian vs. little endian), so you will need to implement a deserializer that takes this into account. Hope this helps.
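For example, Java's DataOutputStream writes multi-byte values big-endian while .NET's BinaryReader reads little-endian, so the C# side has to swap byte order. A minimal sketch:

    using System;
    using System.IO;
    using System.Net; // IPAddress.NetworkToHostOrder

    static class JavaStreamReader
    {
        // Java's DataOutputStream writes big-endian ("network order"),
        // while BinaryReader reads little-endian, so swap after reading.
        public static int ReadJavaInt(BinaryReader reader)
        {
            return IPAddress.NetworkToHostOrder(reader.ReadInt32());
        }

        public static short ReadJavaShort(BinaryReader reader)
        {
            return IPAddress.NetworkToHostOrder(reader.ReadInt16());
        }
    }

    class EndianSketch
    {
        static void Main()
        {
            // 0x01020304 as Java's writeInt emits it: 01 02 03 04.
            var javaBytes = new byte[] { 0x01, 0x02, 0x03, 0x04 };
            using (var reader = new BinaryReader(new MemoryStream(javaBytes)))
            {
                Console.WriteLine("{0:X}", JavaStreamReader.ReadJavaInt(reader)); // 1020304
            }
        }
    }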
As others have suggested, your options are going to be external serialization libraries (Google Protobuf, Apache Thrift, etc.), or simply using something built-in that's slower/less efficient bandwidth-wise (JSON, XML, etc.). You could also write your own, but believe me, it's a maintenance nightmare.
Don't use native serialization. The built-in defaults are tied to the binary representation of the data types, which are different for the different VMs. The purpose of XML, JSON, and similar technologies is precisely to provide a format that's generic and can be moved between differing systems. For what it's worth, the overhead in serializing to JSON is usually small, and there's a lot of benefit to being able to read the serialized objects manually, so I'd recommend JSON unless you have a very specific reason why you can't.
Consider OMG's standard CORBA IIOP.
While you may not need the full-on "remote object" support of CORBA, IIOP is the underlying binary protocol for "moving language-neutral objects" (such as an object value parameter) across the wire.
For Java: Java EE EJBs are based on IIOP, there is RMI-IIOP, and there are various support libraries. The IDL-to-Java compiler is delivered with the JDK.
For C# IIOP and integration with Java EE, see IIOP.NET.
You can also consider BSON, which is used by MongoDB.
If it is OK for your C#/Java programs to communicate with a MongoDB database, you could store your objects there and read them back with the appropriate driver.
Regarding BSON itself, see BSON and Data Interchange at the MongoDB blog.

Object Serialization on different platform

Hi guys, I'm creating a very simple socket server that lets its clients save their own object state, based on keys, by sending it over the wire. I'm using a very simple protocol: the serialized object is encoded to a base64 string and sent out as part of my custom XML format. I wanted to know if the serialized form will still be the same if the client app runs on 32-bit or 64-bit Windows using the .NET Framework. Will it also be the same if the client apps are created using C++ but run on different platforms?
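Roughly, the envelope looks like this (a simplified sketch; element and attribute names are made up):

    using System;
    using System.Text;
    using System.Xml.Linq;

    class EnvelopeSketch
    {
        static void Main()
        {
            // Stand-in for the bytes produced by the real serializer.
            byte[] payload = Encoding.UTF8.GetBytes("serialized-object-bytes");

            // Client side: wrap the base64 payload in the custom XML format.
            var message = new XElement("save",
                new XAttribute("key", "player42"),
                Convert.ToBase64String(payload));
            Console.WriteLine(message); // <save key="player42">c2VyaWFsaXplZC...</save>

            // Server side: decode the text content back to raw bytes.
            byte[] roundTripped = Convert.FromBase64String(message.Value);
            Console.WriteLine(roundTripped.Length);
        }
    }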
Serialization between 32 and 64-bit CLRs shouldn't be a problem, but if you want to be able to serialize on non-.NET platforms, you shouldn't use the default binary serialization. Personally I wouldn't use that anyway, as it can be tricky to handle in terms of versioning etc.
There are plenty of other serialization options available:
XML serialization (either the built-in XmlSerializer or hand-rolled; see the sketch below)
Thrift
YAML
JSON
Protocol Buffers (I declare an interest: I've written one of the ports of Protocol Buffers to C#)
All of these are likely to work better across multiple platforms. Some are human readable as well, which can sometimes be useful. (Protocol Buffers aren't human readable, but it's easy to dump a text version of a protocol buffer message.)
Some effectively build their own object model via a separate schema, whereas others will cope (to a greater or lesser degree) with your existing object model.
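As a concrete example of the first option in the list, the built-in XmlSerializer takes only a few lines (the SaveState type is invented):

    using System;
    using System.IO;
    using System.Xml.Serialization;

    // Hypothetical state object; XmlSerializer needs public members
    // and a parameterless constructor.
    public class SaveState
    {
        public string Key { get; set; }
        public int Value { get; set; }
    }

    class XmlSerializationSketch
    {
        static void Main()
        {
            var serializer = new XmlSerializer(typeof(SaveState));
            var state = new SaveState { Key = "player1", Value = 10 };

            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, state);
                // Plain XML: readable by any platform, including C++ clients.
                Console.WriteLine(writer.ToString());
            }
        }
    }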

What is a good communication layer for both Java and C#?

I would like my newer C# 2.0 application to talk to my older Java 1.4 application (can't change versions, sorry). What are my options?
I think that using shared memory would give me better performance, but on the other hand, if I use a network protocol then the architecture would be more flexible. So I'm looking to weigh up both options to see which has the biggest pay off.
I've used XML-RPC implementations that are dog slow, but I assume that was just a bad implementation, and not the actual protocol. Would I be better off going with a lower-level protocol? I've used Google's protobuf before in C++ and Python (over plain old sockets) but I'm not so sure that it's available for Java and C# -- is there anything similar available for the languages I'm using?
I'm looking for the best performance that I can possibly get, but, I'm working with objects and inheritance hierarchies that I'd like to serialize (protobuf is a good example of how this can be done). So, sadly, just sending a simple string over sockets isn't really feasible.
Aha, there are actually C# versions of protobuf!
http://code.google.com/p/protosharp/
http://code.google.com/p/protobuf-csharp-port
... and protobuf has support for Java anyway.
You might also consider JSON as the serialization for your objects: it is much lighter than XML but has the same capability to represent object hierarchies, and many libraries are available.
For the communication bus, however, I would recommend the network, since it gives better flexibility.
IMHO, your performance bottleneck is due to serialization/deserialization more than the communication bus itself.

Fast and compact object serialization in .NET

I want to use object serialization to communicate over the network between a Mono server and Silverlight clients.
It is pretty important that serialization is space efficient and pretty fast, as the server is going to host multiple real time games.
What technique should I use? The BinaryFormatter adds a lot of overhead to serialized classes (Version, culture, class name, property names, etc.) that is not required within this application.
What can I do to make this more space efficient?
You can use Protocol Buffers. I'm changing all my serialization code from BinaryFormatter with compression to Protocol Buffers and obtaining very good results. It's more efficient in both time and space.
There are two .NET implementations by Jon Skeet and Marc Gravell.
Update: Official .NET implementation can be found here.
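A minimal protobuf-net round trip looks like this (the GameState type is invented); note that only field tags and values go on the wire, none of BinaryFormatter's type metadata:

    using System;
    using System.IO;
    using ProtoBuf; // Marc Gravell's protobuf-net

    [ProtoContract]
    public class GameState
    {
        [ProtoMember(1)] public int Tick { get; set; }
        [ProtoMember(2)] public float X { get; set; }
        [ProtoMember(3)] public float Y { get; set; }
    }

    class ProtoRoundTripSketch
    {
        static void Main()
        {
            var state = new GameState { Tick = 120, X = 3.5f, Y = 7.25f };

            using (var stream = new MemoryStream())
            {
                Serializer.Serialize(stream, state);
                // Only field tags and values are written: no class name,
                // version, or culture metadata.
                Console.WriteLine("Encoded size: {0} bytes", stream.Length);

                stream.Position = 0;
                var copy = Serializer.Deserialize<GameState>(stream);
                Console.WriteLine("Tick = {0}", copy.Tick);
            }
        }
    }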
I have some benchmarks for the leading .NET serializers available based on the Northwind dataset.
@marcgravell's binary protobuf-net is the fastest implementation benchmarked: it is about 7x faster than Microsoft's fastest serializer available in the BCL (the XML DataContractSerializer).
I also maintain some open-source high-performance .NET text serializers as well:
JSV TypeSerializer, a compact, clean, JSON+CSV-like format that's 3.1x quicker than the DataContractSerializer,
as well as a JsonSerializer that's 2.6x quicker.
As the author, I would invite you to try protobuf-net; it ships with binaries for both Mono 2.0 and Silverlight 2.0, and is fast and efficient. If you have any problems whatsoever, just drop me an e-mail (see my Stack Overflow profile); support is free.
Jon's version (see the earlier accepted answer) is also very good, but IMO the protobuf-net version is more idiomatic for C# - Jon's would be ideal if you were talking C# to Java, so you could have a similar API at both ends.
I had a similar problem, although I'm just using .NET. I wanted to send data over the Internet as quickly and easily as possible. I didn't find anything that would be optimized enough, so I made my own serializer, named NetSerializer.
NetSerializer has its limitations, but they didn't affect my use case. And I haven't done benchmarks for a while, but it was much much faster than anything else I found.
I haven't tried it on Mono or Silverlight. I'd bet it works on Mono, but I'm not sure what the level of support is for DynamicMethods on Silverlight.
You could try using JSON. It's not as bandwidth efficient as Protocol Buffers, but it would be a lot easier to monitor messages with tools like Wireshark, which helps a lot when debugging problems. .NET 3.5 comes with a JSON serializer.
You could pass the data through a DeflateStream or GZipStream to compress it prior to transmission. These classes live in the System.IO.Compression namespace.
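Putting the two suggestions together, here is a sketch for the full framework (Silverlight's class library may lack some of these types; the Snapshot type is invented):

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Json;

    [DataContract]
    public class Snapshot
    {
        [DataMember] public int Tick { get; set; }
        [DataMember] public string Map { get; set; }
    }

    class JsonGzipSketch
    {
        static void Main()
        {
            var snapshot = new Snapshot { Tick = 300, Map = "arena1" };
            var serializer = new DataContractJsonSerializer(typeof(Snapshot));

            // Sending side: serialize to JSON and gzip the bytes before transmission.
            byte[] packed;
            using (var buffer = new MemoryStream())
            {
                using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
                {
                    serializer.WriteObject(gzip, snapshot);
                }
                packed = buffer.ToArray();
            }

            // Receiving side: unzip, then deserialize.
            using (var gzip = new GZipStream(new MemoryStream(packed), CompressionMode.Decompress))
            {
                var copy = (Snapshot)serializer.ReadObject(gzip);
                Console.WriteLine("{0} @ tick {1}", copy.Map, copy.Tick);
            }
        }
    }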
I had a very similar problem - saving to a file. But the following can also be used over a network as it was actually designed for remoting.
The solution is to use Simon Hewitt's library - see Optimizing Serialization in .NET - part 2.
Part 1 of the article states:
"... If you've ever used .NET remoting for large amounts of data, you will have found that there are problems with scalability. For small amounts of data, it works well enough, but larger amounts take a lot of CPU and memory, generate massive amounts of data for transmission, and can fail with Out Of Memory exceptions. There is also a big problem with the time taken to actually perform the serialization - large amounts of data can make it unfeasible for use in apps ...."
I got a similar result for my particular application: 40 times faster saving and 20 times faster loading (from minutes to seconds). The size of the serialised data was also much reduced; I don't remember exactly, but it was at least 2-3 times smaller.
It is quite easy to get started. However, there is one gotcha: only use .NET serialisation for the very highest level data structure (to get serialisation/deserialisation started), and then call the serialisation/deserialisation functions directly for the fields in the highest level data structure. Otherwise there will not be any speed-up. For instance, if a particular data structure (say Generic.List) is not supported by the library, then .NET serialisation will be used instead, and this is a no-no. Instead, serialise the list in client code (or similar). For an example, see near "This is our own encoding." in the same function as listed below.
For reference: code from my application - see near "Note: this is the only place where we use the built-in .NET ...".
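Since that code isn't reproduced here, the gotcha can be illustrated with a hand-rolled stand-in (plain BinaryWriter, not Hewitt's actual API): encode the list yourself, length prefix first, rather than handing it to general-purpose serialisation.

    using System;
    using System.Collections.Generic;
    using System.IO;

    public class Point
    {
        public int X;
        public int Y;
    }

    static class ListEncodingSketch
    {
        // "Our own encoding": length prefix, then each element's fields,
        // instead of letting general-purpose .NET serialisation walk the list.
        public static void WritePoints(BinaryWriter writer, List<Point> points)
        {
            writer.Write(points.Count);
            foreach (var p in points)
            {
                writer.Write(p.X);
                writer.Write(p.Y);
            }
        }

        public static List<Point> ReadPoints(BinaryReader reader)
        {
            int count = reader.ReadInt32();
            var points = new List<Point>(count);
            for (int i = 0; i < count; i++)
                points.Add(new Point { X = reader.ReadInt32(), Y = reader.ReadInt32() });
            return points;
        }
    }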
You can try BOIS, which focuses on packed data size and provides the best packing so far. (I haven't seen better optimization yet.)
https://github.com/salarcode/Bois
