Hi guys, I'm creating a very simple socket server that lets its clients save their own object state, identified by keys, by sending it over the wire. I'm using a very simple protocol: the serialized object is encoded as a base64 string and sent as part of my custom XML format. I wanted to know whether the serialized form will still be the same if the client app runs on 32-bit and 64-bit Windows using the .NET Framework. Will it also be the same if all client apps are written in C++ but run on different platforms?
Serialization between 32 and 64-bit CLRs shouldn't be a problem, but if you want to be able to serialize on non-.NET platforms, you shouldn't use the default binary serialization. Personally I wouldn't use that anyway, as it can be tricky to handle in terms of versioning etc.
There are plenty of other serialization options available:
XML serialization (either the built-in or hand-rolled)
Thrift
YAML
JSON
Protocol Buffers (I declare an interest: I've written one of the ports of Protocol Buffers to C#)
All of these are likely to work better across multiple platforms. Some are human readable as well, which can sometimes be useful. (Protocol Buffers aren't human readable, but it's easy to dump a text version of a protocol buffer message.)
Some effectively build their own object model via a separate schema, whereas others will cope (to a greater or lesser degree) with your existing object model.
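To make the idea concrete, here is a minimal sketch of the protocol from the question (a serialized object, base64-encoded, inside a custom XML message) using JSON as the platform-neutral serialization instead of .NET binary serialization. The element name `save`, the attribute name `key` and the sample state are assumptions made up for illustration; the question does not show its actual XML format.

```python
import base64
import json
import xml.etree.ElementTree as ET


def build_save_message(key, state):
    """Serialize state to JSON, base64-encode it and wrap it in XML."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    root = ET.Element("save", {"key": key})  # hypothetical element/attribute names
    root.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def read_save_message(xml_text):
    """Reverse of build_save_message: returns (key, state)."""
    root = ET.fromstring(xml_text)
    payload = base64.b64decode(root.text)
    return root.get("key"), json.loads(payload.decode("utf-8"))


message = build_save_message("player-1", {"level": 3, "name": "Ann"})
key, state = read_save_message(message)
```

Because the payload is UTF-8 JSON text rather than a runtime-specific binary layout, any client that has a JSON parser and a base64 decoder should be able to read it, whatever its bitness or language.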
Related
What's the correct/mainstream way to handle byte arrays sent from C# client code to Node.js server environment?
Right now I'm using standard C# serialization through BinaryWriter (client-side) and the streambuf npm package (server-side).
But I'm pretty sure that there is a more straightforward/native/mainstream/efficient way to do it.
What should I do?
Thank you for your advice!
The mainstream way to convert an object to a serialized representation would be to use a serialization framework. If you are using Node.js I would assume JSON would be most appropriate. I'm not familiar with Node.js, but my understanding is that it handles JSON natively. See also Newtonsoft Json.NET, a very popular JSON library for .NET.
If you need a more compact serialization format you might take a look at protobuf, since that serializes to a binary format. It looks like there are protobuf packages available for Node.js, but again, I have no familiarity with JavaScript.
Using an existing serialization format tends to make it easier to share objects between different systems, since there are usually libraries available for the most popular formats on the most common platforms. It usually also makes compatibility easier, since many formats allow some flexibility for things like optional fields. So if you find that you need to add some more data, you can often do so in the confidence that it will not break existing users.
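As a rough illustration of the optional-fields point, here is a small sketch in Python using JSON. The field names (`name`, `email`, `nickname`) are invented for the example. The reader supplies a default for a field that older senders do not know about and ignores fields it does not understand, so old and new messages both parse.

```python
import json


def parse_user(text):
    """Read a user message, tolerating missing and unknown fields."""
    data = json.loads(text)
    return {
        "name": data["name"],            # required in every version
        "email": data.get("email", ""),  # added later, so default when absent
    }


old_message = '{"name": "Ann"}'
new_message = '{"name": "Ann", "email": "ann@example.com", "nickname": "A"}'

old_user = parse_user(old_message)  # the missing "email" falls back to ""
new_user = parse_user(new_message)  # the unknown "nickname" is ignored
```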
I have two applications which communicate via TCP socket. For the time being, these applications are both local but in the future, the server application will run on the cloud (Amazon EC2 Instance).
The Server application is written in C++
The Client application is written in C#
I am sending an object from server to client that has the following properties:
Guid Id
uint8* ImageData
I may wish to add extra properties in the future, but I will try to keep this object as minimal as possible, as latency is important here.
I am currently using JSON to communicate between the programs, but I was wondering about Google Protocol Buffers (GPB). While JSON is nice, easy to work with and human-readable, it does have a large overhead and, from the looks of things, is causing a noticeable delay in the communications.
What I am looking for, is a more efficient method to communicate between Client and Server applications.
How do GPBs compare with JSON? Has anyone had any experience with high-performance use of GPB? Are there any other protocols which may be better suited here?
These references will help you.
https://google.github.io/flatbuffers/md__benchmarks.html
https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html
There is a C++ implementation for converting JSON to/from protobuf on github.
There are many things that we don't know:
How big is uint8* ImageData usually?
How do you serialise binary data to JSON?
What's the available bandwidth?
What are the average and expected data rates?
What I'm trying to say is that you need to worry about JSON overhead only if it matters; otherwise, why change anything? You mentioned latency, but it would be affected only if you had more data to send than your available bandwidth allows.
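To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the image bytes are base64-encoded inside the JSON (the question does not say how they are encoded), and the 300,000-byte image and the JSON key names are made-up values. Base64 alone grows binary data by about one third, so whether it matters depends on the image size and the bandwidth you actually have.

```python
import base64
import json
import uuid


def json_message_size(image_data):
    """Size in bytes of a JSON message carrying the image as base64 (assumed encoding)."""
    message = {
        "id": str(uuid.UUID(int=0)),
        "imageData": base64.b64encode(image_data).decode("ascii"),
    }
    return len(json.dumps(message).encode("utf-8"))


def transfer_seconds(size_bytes, bandwidth_bits_per_second):
    """Time to push size_bytes through a link of the given bandwidth, ignoring round trips."""
    return size_bytes * 8 / bandwidth_bits_per_second


image = bytes(300_000)       # hypothetical image size
raw_size = 16 + len(image)   # 16-byte Guid plus the raw image bytes
json_size = json_message_size(image)
overhead = json_size / raw_size  # about 4/3 because of base64
```

Calling `transfer_seconds(json_size, ...)` and `transfer_seconds(raw_size, ...)` with your measured bandwidth gives a first estimate of how much delay the encoding itself can account for.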
For your extremely simple case I wouldn't even use JSON but would serialise it manually into a binary blob, unless you expect your protocol to evolve significantly in the future.
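A manual binary layout for this message could look something like the sketch below. The layout (16 Guid bytes, then a 4-byte little-endian length, then the image bytes) is an assumption chosen for the example, not something from the question. `bytes_le` is used because, as far as I know, it matches the byte order that .NET's `Guid.ToByteArray()` produces; treat that as something to verify against your own client.

```python
import struct
import uuid

# Assumed layout: 16 raw Guid bytes followed by an unsigned 32-bit
# little-endian payload length. "<" disables padding, so the header is 20 bytes.
HEADER = struct.Struct("<16sI")


def pack_frame(frame_id, image_data):
    """Build header + payload for one message."""
    return HEADER.pack(frame_id.bytes_le, len(image_data)) + image_data


def unpack_frame(blob):
    """Parse a message produced by pack_frame; returns (id, image_data)."""
    raw_id, length = HEADER.unpack_from(blob, 0)
    data = blob[HEADER.size:HEADER.size + length]
    if len(data) != length:
        raise ValueError("truncated frame")
    return uuid.UUID(bytes_le=raw_id), data


frame_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
blob = pack_frame(frame_id, b"\x01\x02\x03")
decoded_id, decoded_data = unpack_frame(blob)
```

The explicit length prefix also tells the receiver how many bytes to read from the TCP stream before it has a complete message.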
Is it possible to serialize a class/object in C# and deserialize the same in Java? I want to serialize the class itself, not XML/JSON data. Please clarify.
Thanks
I see 3 options here. I suggest option 1, Protobufs.
1. Look into Google's ProtoBufs
Or some equivalent. Here's the Java version. Here's a C# port.
Protobufs are meant for this sort of language interop. The format is binary, small, fast, and language agnostic.
It also has backwards compatibility, so if you change the serialized objects in the future, you can still read the old ones. This is transparent to you too, as long as you write code that understands that newer fields could be missing when deserializing old objects. This is a huge advantage!
2. Implement one language's default serialization in the other
You can try implementing the Java serialization logic in C#, or the C# serialization routines in Java. I don't suggest this, as it will be more difficult, more verbose, almost certainly slower as you're writing new code, and will net you the same result.
3. Write your serialization routines by hand
This will certainly be fast, but tedious, more error prone, harder to maintain, less flexible...
Here are some benchmarks for libraries like ProtoBufs. These should aid you in selecting the best one for your use case.
We did this a while ago, and it worked after a lot of tinkering. It really depends on byte encoding: I think Java uses one byte order and C# uses another (little endian vs. big endian), so you will need to implement a deserializer which takes this into account. Hope this helps.
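The byte-order issue is easy to see in a few lines. Java's `DataOutputStream` writes integers big-endian, while .NET's `BinaryWriter` writes them little-endian, so the same 32-bit value arrives as different bytes. The sketch below uses Python's `struct` module only to show the effect; it is not a Java or C# deserializer.

```python
import struct

value = 1

big_endian = struct.pack(">i", value)     # b'\x00\x00\x00\x01', as Java's DataOutputStream writes it
little_endian = struct.pack("<i", value)  # b'\x01\x00\x00\x00', as .NET's BinaryWriter writes it

# Reading big-endian bytes with the wrong byte order gives a wrong number.
misread = struct.unpack("<i", big_endian)[0]   # 16777216 instead of 1

# Reading them with the byte order the writer used recovers the value.
correct = struct.unpack(">i", big_endian)[0]   # 1
```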
As others have suggested, your options are going to be external serialization libraries (Google Protobuf, Apache Thrift, etc.), or simply using something built-in that's slower and less efficient bandwidth-wise (JSON, XML, etc.). You could also write your own, but believe me, it's a maintenance nightmare.
Not using native serialization. The built-in defaults are tied to the binary representation of the data types, which are different for the different VMs. The purpose of XML, JSON, and similar technologies is precisely to provide a format that's generic and can be moved between differing systems. For what it's worth, the overhead in serializing to JSON is usually small, and there's a lot of benefit to being able to read the serialized objects manually, so I'd recommend JSON unless you have a very specific reason why you can't.
Consider OMG's standard CORBA IIOP.
While you may not need the full-on "remote object" support of CORBA, IIOP is the underlying binary protocol for "moving language-neutral objects" (such as an object value parameter) across the wire.
For Java: Java EE EJBs are based on IIOP, and there is RMI-IIOP along with various support libraries. The IDL-to-Java compiler is delivered with the JDK.
For C#: for IIOP and integration with Java EE, see IIOP.NET.
You can also consider BSON, which is used by MongoDB.
If it is OK for your C#/Java programs to communicate with a MongoDB database, you could store your objects there and read them with the appropriate driver.
Regarding BSON itself, see BSON and Data Interchange at the MongoDB blog.
Our desktop application consists of a Mono/.NET 3.5 back end that communicates via USB with a variety of devices and a Silverlight front end that communicates with the back end via sockets. The firmware for the devices is developed in-house with C. To accelerate our development process and reduce bugs, we would like to share code between our firmware and desktop application. What tools and techniques would you suggest for us to be able to do this? Better yet, what have you successfully used in your software to solve a similar problem?
The two main things we'd like to share are the message structures that define our communication protocol and data that is currently defined through C structure/array constants. For the protocol messages, we're currently manually rewriting our message-implementing classes to match the C definitions, using the C code as a guide. For the data we share, we created a managed C++ application that links to the compiled C code then extracts the arrays' contents into an XML file.
Our techniques work, but they are less than optimal. For one, we had a multitude of bugs related to our reinterpretation of C structures as C#, due to the C code changing in parallel and programmer mistakes; we'd like to avoid this class of bugs in future development. For data sharing, I don't have a huge problem with our current solution, but the maintainer of the extraction program says that it's a painful process getting that to work properly.
We're a bit constrained on things we'll be able to change on the firmware for the devices. For one, we have a wide variety of processor architectures and embedded platforms, so the C code must remain portable. For another, the firmware runs real-time software and is constrained on available MIPS and storage space, so we cannot add anything with unpredictable or slow execution time.
Try protocol buffers, a binary, programming-language-agnostic encoding format that Google uses as a data exchange format between their services.
The idea is that you write a .proto file that describes the structure of your data and run the protocol buffers compiler, which generates serialization/deserialization code for your language. This would be more efficient than encoding in XML, would save the time of writing serializers/deserializers by hand, and would eliminate mistakes due to incorrect implementation (since the code is auto-generated from a high-level description).
Google's implementation supports C++, Java and Python, and there are independent implementations for other languages, e.g. for C# there is this one and this one.
There are other technologies like that, e.g. Facebook's Thrift.
I would consider using XSLT transformation for code generation. Have an XML file that defines the protocol structures, and have various XSLTs that generate the C# code and the C code for the various platforms. Then use the generated code in building the applications.
The XSLT transform can be made part of the project build. I used this technique on several projects, as described in Using XSLT to generate Performance Counters code (although that post is not about comm protocols, I actually used the same technique on comm protocols, both on the wire and between modules). Once you get over the difficulty of getting up to speed in writing XSL and XQuery, you'll become very productive and you'll appreciate how easily and quickly you can change the communication protocol.
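The same single-source idea can be sketched without XSLT. The example below swaps in a small Python script for the XSLT step, purely to show the shape of the approach: one XML definition, two generated outputs. The XML schema (`message`, `field`, `name`, `type`), the message itself and the type mappings are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical protocol definition; a real one would live in its own file.
PROTOCOL_XML = """
<message name="StatusReport">
  <field name="deviceId" type="uint16"/>
  <field name="temperature" type="int32"/>
</message>
"""

C_TYPES = {"uint16": "uint16_t", "int32": "int32_t"}
CS_TYPES = {"uint16": "ushort", "int32": "int"}


def load_message(xml_text):
    """Return (message name, [(field name, field type), ...])."""
    root = ET.fromstring(xml_text.strip())
    fields = [(f.get("name"), f.get("type")) for f in root.findall("field")]
    return root.get("name"), fields


def generate_c(name, fields):
    """Emit a C struct declaration for the message."""
    lines = ["typedef struct {"]
    lines += ["    %s %s;" % (C_TYPES[t], n) for n, t in fields]
    lines.append("} %s;" % name)
    return "\n".join(lines)


def generate_csharp(name, fields):
    """Emit a matching C# struct declaration."""
    lines = ["public struct %s" % name, "{"]
    lines += ["    public %s %s;" % (CS_TYPES[t], n) for n, t in fields]
    lines.append("}")
    return "\n".join(lines)


name, fields = load_message(PROTOCOL_XML)
c_code = generate_c(name, fields)
csharp_code = generate_csharp(name, fields)
```

Since both outputs come from the same definition, a field added to the XML shows up in the C and C# code on the next build, which is the class of bug the question wants to avoid. Struct packing and byte order would still have to be handled explicitly in the real generated code.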
I am building a C# front end that communicates with a Java Tomcat server via HTTP.
The WOX package is used to de/serialize the objects on the Java and C# ends.
However, I want to reduce the time spent in sending XML strings over HTTP, by using some XML compression packages.
My questions are:
Is using WOX de/serialization, which results in XML strings being passed back and forth, the best way to communicate between C# and Java?
What XML compression libraries (they have to be free) should I consider to increase the speed?
Many thanks.
Chapax
I'd initially try just applying gzip compression at the HTTP level, partly because that can be applied transparently to your app. XML generally compresses pretty well. Do you have a specific target in mind, so you'll know when a result is "good enough"? (If not, that might be the first thing to work out; otherwise you won't know when to stop.) Tomcat supports gzip compression as a connector configuration option.
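To get a feel for how well XML compresses before touching the server configuration, a quick measurement like the sketch below can help. The sample document is made up and far more repetitive than real WOX output is likely to be, so the ratio it gives is optimistic; run it on your own payloads to get a meaningful number.

```python
import gzip

# Synthetic, highly repetitive XML standing in for a real serialized object.
xml_text = "<orders>" + "".join(
    '<order id="%d"><status>shipped</status></order>' % i for i in range(1000)
) + "</orders>"

raw = xml_text.encode("utf-8")
compressed = gzip.compress(raw)

# Fraction of the original size that actually goes over the wire.
ratio = len(compressed) / len(raw)

# Compression is lossless: the receiver gets the exact same bytes back.
restored = gzip.decompress(compressed)
```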
As for whether XML is the right way to go - it certainly has advantages and disadvantages. There are plenty of other serialization options, including JSON, Thrift and Protocol Buffers. Each has pros and cons in terms of platform integration, size, readability, versioning etc. You should work out what's important to you and then look at the options in terms of those considerations.