I need to serialize an object using the BinaryFormatter with .NET 4.0 and send it across the wire (via SOAP as a byte array) to a web service running under .NET 3.5. And vice versa. I've tested this scenario, and it seems to work fine.
There is one old question on SO regarding this scenario, covering .NET 1.x to 2.0, and it did not leave me with much confidence in the approach.
So it works in my test harness, but I can't test every possible variation of the object, so I need some theoretical underpinnings.
As a rule, can objects serialize/deserialize across different framework versions? Is this an accepted scenario or a hack that worked in my case?
If by "binary" you mean BinaryFormatter, then it is already hugely intolerant between versions, since it is strictly tied to type metadata (unless you work really hard with custom bindings). As such, it is only strictly reliable when both ends are using exactly the same implementations. Even changing a property to/from an automatically implemented property is a breaking change.
This isn't a failing of "binary", but a feature of BinaryFormatter. Other binary serializers don't have this issue. For example, protobuf-net works between OS, between frameworks, etc - since the format a: doesn't care about your specific types, and b: is fixed to a published spec.
If you are currently using BinaryFormatter for this: then IMO yes, you should explicitly test every API, since any type could change an implementation detail. And unfortunately, since BF has a habit of pulling in unexpected data (via events, etc.), even this isn't necessarily enough to validate the real usage.
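To make the "custom bindings" escape hatch mentioned above concrete, here is a minimal sketch (my illustration, not from the answer) of plugging a SerializationBinder into BinaryFormatter so the deserializing side can remap incoming type metadata onto its own types:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Illustrative binder: ignores the sender's assembly version and resolves
// the incoming type name locally instead, so minor version mismatches in
// the type metadata do not cause deserialization to fail.
sealed class VersionTolerantBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        // Note: Type.GetType with a bare name only probes mscorlib and the
        // calling assembly; a real binder may need a smarter lookup.
        return Type.GetType(typeName, throwOnError: true);
    }
}

static class BinderExample
{
    static object Deserialize(byte[] payload)
    {
        var formatter = new BinaryFormatter { Binder = new VersionTolerantBinder() };
        using (var ms = new MemoryStream(payload))
            return formatter.Deserialize(ms);
    }
}
```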
If the serialization format is XML (SOAP) or JSON, it should work without problems. I am unsure how a binary-serialized object would behave.
The biggest issue with serialization arises when one side uses primitives that do not exist on the other. The same problem appears when marshalling certain types to native code, so it is not a problem unique to services (assuming that is your scenario).
As a "rule", you can serialize across framework versions and even to clients written in Java, Delphi and COBOL (provided a version with web service ability - and provided you have exposed the serialized objects appropriately through a service endpoint).
I am trying to think whether there are any primitives in .NET that were not present in 1.x, as those would be problematic, as would any new framework types you might try to serialize. The danger is much smaller with 2.0 (perhaps non-existent).
The more "open" your serialization format is (i.e., standards like JSON or SOAP; in most cases, simply JSON or XML), the less likely you are to have issues, and if you do have issues, you can code around the automagic proxies, etc. As you move towards binary, you can hit incompatibilities, for example between an object serialized in 4.0 with WCF and a Remoting client.
Related
What's the correct/mainstream way to handle byte arrays sent from C# client code to Node.js server environment?
Now I'm using standard C# serialization through BinaryWriter (client-side) and the streambuf npm package (server-side).
But I'm pretty sure that there is a more straightforward/native/mainstream/efficient way to do it.
What should I do?
Thank you for your advice!
The mainstream way to convert an object to a serialized representation is to use a serialization framework. Since you are using Node.js, I would assume JSON would be most appropriate; I'm not deeply familiar with Node.js, but my understanding is that it handles JSON natively. See also Newtonsoft's Json.NET, another very popular library.
If you need a more compact serialization format, you might take a look at protobuf, since it serializes to a binary format. It looks like there are protobuf packages available for Node.js, but again, I have no familiarity with JavaScript.
Using an existing serialization format tends to make it easier to share objects between different systems, since libraries are usually available for the most popular formats on the most common platforms. It usually also makes compatibility easier, since many formats allow some flexibility for things like optional fields; if you find that you need to add some more data, you can often do so in the confidence that it will not break existing users.
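A hedged sketch of the C# client side with Json.NET (the Payload type is illustrative); on the Node.js side, a UTF-8 decode plus JSON.parse recovers the object:

```csharp
using System.Text;
using Newtonsoft.Json;

// Illustrative message type; byte[] members are emitted as base64 strings.
public class Payload
{
    public string Kind { get; set; }
    public byte[] Data { get; set; }
}

static class Wire
{
    public static byte[] ToUtf8Json(Payload p)
    {
        string json = JsonConvert.SerializeObject(p);
        return Encoding.UTF8.GetBytes(json); // send these bytes over the socket
    }
}
```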
As part of moving from WCF to gRPC, I am dealing with NetDataContractSerializer, which is used for serializing objects on the client side and deserializing them on the server side. Both client and server share the same DLL containing the types used in communication.
As part of the client app's update process, the current version of the shared DLL, with new/changed/deleted definitions of the communication objects, is downloaded from the server. The basic communication objects used for the update process itself never change, so serialization/deserialization during the update works.
I would like to rewrite the existing code as little as possible. I found out that I could replace NetDataContractSerializer with Newtonsoft's Json.NET serialization as described here:
How to deserialize JSON to objects of the correct type, without having to define the type before hand? and here https://www.newtonsoft.com/json/help/html/SerializeTypeNameHandling.htm.
But I wonder if:
Is there a better solution in general?
Is there some solution based on what is part of .NET Framework 4.8 that will also work in .NET 5.0, without needing to reference a third-party DLL?
Is there some binary-serialization alternative that would be more message-size friendly and/or faster? It is not mandatory for me that the messages sent be in readable form.
On "3", gRPC is actually very open to you swapping out the serializer; you are not bound to protobuf, but gRPC is usually used with protobuf. In fact, you could actually use NetDataContractSerializer, although for reasons I'll come onto: I wouldn't recommend it.
The "how" for this is hard to explain, because often with gRPC people use protoc to generate all the bindings, which hides all the details (and ties you to protobuf).
You might be interested in protobuf-net.Grpc here, which is an alternative way of binding to gRPC (using the Google or Microsoft transports - it is just the bindings that are different), and which is much more comparable to WCF. In fact, it even allows you to borrow WCF's interface/attribute approach, although it doesn't give you like-for-like feature parity with WCF (it is still fundamentally gRPC!).
As for the how: a getting-started guide is here. The opening line sets the context:
What is it?
Simple gRPC access in .NET Core 3+ and .NET Framework 4.6.1+ - think WCF, but over gRPC
It defaults to protobuf-net, which is an alternative protobuf serializer designed for code-first scenarios, but you can replace the serializer (globally, or for individual types). An example of implementing a custom serializer binding is provided here - note that most of that file is a large comment (the actual serializer code is 8-ish lines at the end). Please read those comments: they're notionally about BinaryFormatter, but every word of them applies equally to NetDataContractSerializer.
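To give a flavour of that code-first style, here is a hedged sketch (the service and type names are illustrative, not from the question); a client would then obtain a typed proxy, e.g. via CreateGrpcService&lt;IOrderService&gt;() from the ProtoBuf.Grpc.Client package:

```csharp
using System.ServiceModel;   // ServiceContract/OperationContract attributes
using System.Threading.Tasks;
using ProtoBuf;

// WCF-style contract, bound over gRPC by protobuf-net.Grpc.
[ServiceContract]
public interface IOrderService
{
    [OperationContract]
    Task<OrderReply> SubmitAsync(OrderRequest request);
}

[ProtoContract]
public class OrderRequest
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Sku { get; set; }
}

[ProtoContract]
public class OrderReply
{
    [ProtoMember(1)] public bool Accepted { get; set; }
}
```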
I realise you said "without need to reference third-party DLL"; in which case, sure: you could spend a few weeks replicating the most immediately obvious things that protobuf-net.Grpc is doing for you, but that doesn't sound like a great use of your time when the NuGet package is simply sitting there ready to use. The relevant APIs are readily available to use with the Google/Microsoft packages, but there is quite a lot of plumbing involved in making everything work together.
Is it possible to serialize a class/object in C# and deserialize the same in Java? I want to serialize the class itself, not XML/JSON data. Please clarify.
Thanks
I see 3 options here. I suggest option 1, Protobufs.
Look into Google's ProtoBufs
Or some equivalent. Here's the Java version. Here's a C# port.
Protobufs are meant for exactly this sort of language interop. It's binary, small, fast, and language-agnostic.
It also has backwards compatibility, so if you change the serialized objects in the future, you can still read the old ones. This is transparent to you too, as long as you write code that understands that newer fields may be missing when deserializing old objects. This is a huge advantage!
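For the C# side of option 1, a minimal protobuf-net sketch (the Message type and its fields are illustrative); the output follows the protobuf wire format, so a Java program with a matching .proto definition (field numbers must agree) can read it:

```csharp
using System.IO;
using ProtoBuf;

[ProtoContract]
public class Message // illustrative; field numbers must match the Java .proto
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Text { get; set; }
}

static class ProtoExample
{
    public static byte[] Serialize(Message m)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, m); // protobuf wire format
            return ms.ToArray();
        }
    }

    public static Message Deserialize(byte[] bytes)
    {
        using (var ms = new MemoryStream(bytes))
            return Serializer.Deserialize<Message>(ms);
    }
}
```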
Implement one language's default serialization in the other
You can try implementing the java serialization logic in C#, or the C# serialization routines in Java. I don't suggest this as it will be more difficult, more verbose, almost certainly slower as you're writing new code, and will net you the same result.
Write your serialization routines by hand
This will certainly be fast, but tedious, more error prone, harder to maintain, less flexible...
Here are some benchmarks for libraries like ProtoBufs. They should aid you in selecting the best one for your use case.
We did this a while ago. It worked after a lot of tinkering, and it really comes down to byte encoding: Java writes big-endian while C# writes little-endian, so you will need to implement a deserializer that takes this into account. Hope this helps.
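A hedged sketch of handling that on the C# side, using the BCL's byte-order helpers so the stream matches what Java's DataInputStream/DataOutputStream expect:

```csharp
using System.IO;
using System.Net;

static class JavaCompatIo
{
    // Java's DataOutputStream writes big-endian; BinaryWriter writes
    // little-endian. HostToNetworkOrder swaps the bytes on little-endian
    // hosts, so the bytes on the wire come out big-endian.
    public static void WriteInt32BigEndian(BinaryWriter writer, int value)
    {
        writer.Write(IPAddress.HostToNetworkOrder(value));
    }

    public static int ReadInt32BigEndian(BinaryReader reader)
    {
        return IPAddress.NetworkToHostOrder(reader.ReadInt32());
    }
}
```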
As others have suggested, your options are going to be external serialization libraries (Google Protobuf, Apache Thrift, etc.), or simply using something built-in that's slower and less efficient bandwidth-wise (JSON, XML, etc.). You could also write your own, but believe me, it's a maintenance nightmare.
Do not use native serialization. The built-in defaults are tied to the binary representation of the data types, which differs between the two VMs. The purpose of XML, JSON, and similar technologies is precisely to provide a format that's generic and can be moved between differing systems. For what it's worth, the overhead in serializing to JSON is usually small, and there's a lot of benefit to being able to read the serialized objects manually, so I'd recommend JSON unless you have a very specific reason why you can't.
Consider OMG's standard CORBA IIOP.
While you may not need the full-on "remote object" support of CORBA, IIOP is the underlying binary protocol for "moving language-neutral objects" (such as an object value parameter) across the wire.
For Java: Java EE EJB's are based on IIOP, there is RMI-IIOP; various support libraries. The IDL-to-Java compiler is delivered with the JDK.
For C# IIOP & integration with Java EE, see IIOP.NET
You can also consider BSON, which is used by MongoDB.
If it is OK for your C#/Java programs to communicate with a MongoDB database, you could store your objects there and read them with the appropriate driver.
Regarding BSON itself, see BSON and Data Interchange at the mongoDB blog.
I want to separate modules of my program to communicate with each other. They could be on the same computer, but possibly on different ones.
I was considering 2 methods:
Create a class with all the details. Send it off to the communication layer, which serializes it and sends it; the other side deserializes it back into the class and then handles it further.
Create a hashtable (a key/value structure). Put all the data in it. Send it off to the communication layer, etc.
So it boils down to hashtable vs class.
If I think 'loosely coupled', I favor the hashtable. It's easy to have one module updated, including new extra params in the hashtable, without updating the other side.
Then again with a class I get compile-time type checking, instead of runtime.
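To make the two options concrete, here is a minimal sketch (type and key names are illustrative):

```csharp
using System.Collections.Generic;

// Option 1 (strongly typed): the compiler checks every field,
// but both sides must agree on (a version of) this class.
public class ContactMessage
{
    public string Name { get; set; }
    public string PhoneNumber { get; set; }
}

static class MessageShapes
{
    static void Build()
    {
        // Option 2 (loosely coupled): new keys can be added without touching
        // the receiver, but typos and wrong types only surface at runtime.
        var loose = new Dictionary<string, object>
        {
            { "Name", "Alice" },
            { "PhoneNumber", "555-0100" }
        };

        // Option 1: field names and types are checked at compile time.
        var typed = new ContactMessage { Name = "Alice", PhoneNumber = "555-0100" };
    }
}
```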
Has anyone tackled this previously and has suggestions about this?
Thanks!
edit:
I've awarded points to the answer which was most relevant to my original question, although it isn't the one which was upvoted the most
It sounds like you simply want to incorporate some IPC (Inter-Process Communication) into your system.
The best way of accomplishing this in .NET (3.0 onwards) is with the Windows Communication Foundation (WCF) - a generic framework developed by Microsoft for communication between programs in various different manners (transports) on a common basis.
Although I suspect you will probably want to use named pipes for the purposes of efficiency and robustness, there are a number of other transports available such as TCP and HTTP (see this MSDN article), not to mention a variety of serialisation formats from binary to XML to JSON.
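As a hedged illustration of that named-pipe option (the contract, address, and names here are mine, not from the answer):

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IGreeter
{
    [OperationContract]
    string Greet(string name);
}

public class Greeter : IGreeter
{
    public string Greet(string name) { return "Hello, " + name; }
}

static class PipeHost
{
    static void Main()
    {
        // Host the service over a named pipe: same-machine IPC, no TCP stack.
        var host = new ServiceHost(typeof(Greeter),
            new Uri("net.pipe://localhost/greeter"));
        host.AddServiceEndpoint(typeof(IGreeter),
            new NetNamedPipeBinding(), string.Empty);
        host.Open();

        // A client in another process would use:
        // var factory = new ChannelFactory<IGreeter>(
        //     new NetNamedPipeBinding(),
        //     new EndpointAddress("net.pipe://localhost/greeter"));
        // IGreeter proxy = factory.CreateChannel();

        Console.ReadLine();
        host.Close();
    }
}
```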
One tends to hit this kind of problem in distributed systems design. It surfaces in web services (with WSDL defining the parameters and return types) and in messaging systems, where the format of messages might be XML or some other well-defined format. The problem of controlling the coupling of client and server remains in all cases.
What happens with your hash table? Suppose your request contains "NAME" and "PHONE-NUMBER", and suddenly you realise that you need to differentiate "LANDLINE-NUMBER" and "CELL-NUMBER". If you just change the hash table entries to use new values, then your server needs changing at the same time. Suppose at this point you don't just have one client and one server, but are perhaps dealing with some kind of exchange or broker systems, many clients implemented by many teams, many servers implemented by many teams. Asking all of them to upgrade to a new message format at the same time is quite an undertaking.
Hence we tend to seek back-compatible solutions such as additive change: we preserve "PHONE-NUMBER" and add the new fields. The server then tolerates messages in either the old or the new format.
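In code, that additive change could look like the following DataContractSerializer sketch (type and member names are illustrative); absent optional fields simply deserialize to their defaults:

```csharp
using System.Runtime.Serialization;

[DataContract]
public class ContactRequest
{
    [DataMember] // original field, preserved so existing clients keep working
    public string PhoneNumber { get; set; }

    // Additive change: old messages simply leave these null on deserialization.
    [DataMember(IsRequired = false)]
    public string LandlineNumber { get; set; }

    [DataMember(IsRequired = false)]
    public string CellNumber { get; set; }
}
```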
Different distribution technologies have different built-in degrees of tolerance for back-compatibility. When dealing with serialized classes, can you handle old and new versions? When dealing with WSDL, will the message parsers tolerate additive change?
I would follow this thought process:
1) Will you have a simple relationship between client and server? For example, do you code and control both, and are you free to dictate their release cycles? If "no", then favour flexibility and use hashtables or XML.
2) Even if you are in control, look at how easily your serialization framework supports versioning. A strongly typed, serialized class interface is likely to be easier to work with, provided you have a clear picture of what it will take to make a change to the interface.
You can use Sockets, Remoting, or WCF; each has pros and cons.
If performance is not crucial, you can use WCF and serialize and deserialize your classes; for maximum performance, I recommend sockets.
Whatever happened to the built-in support for Remoting?
http://msdn.microsoft.com/en-us/library/aa185916.aspx
It works over TCP/IP or IPC if you want. It's quicker than WCF, and is pretty transparent to your code.
In our experience of using WCF extensively over the last few years with various bindings, we found WCF not to be worth the hassle.
It is just too complicated to use WCF correctly, including handling errors on channels properly while retaining good performance (we gave up on high performance with WCF early on).
For authenticated client scenarios we switched to HTTP REST (without WCF) with JSON/protobuf payloads.
For high-speed non-authenticated scenarios (or at least non-Kerberos-authenticated scenarios) we are using ZeroMQ and protobuf now.
I want to use object serialization to communicate over the network between a Mono server and Silverlight clients.
It is pretty important that serialization is space efficient and pretty fast, as the server is going to host multiple real time games.
What technique should I use? The BinaryFormatter adds a lot of overhead to serialized classes (Version, culture, class name, property names, etc.) that is not required within this application.
What can I do to make this more space efficient?
You can use Protocol Buffers. I'm changing all my serialization code from BinaryFormatter with compression to Protocol Buffers and obtaining very good results. It's more efficient in both time and space.
There are two .NET implementations by Jon Skeet and Marc Gravell.
Update: Official .NET implementation can be found here.
I have some benchmarks for the leading .NET serializers available based on the Northwind dataset.
@marcgravell's binary protobuf-net is the fastest implementation benchmarked: it is about 7x faster than Microsoft's fastest serializer available in the BCL (the XML DataContractSerializer).
I also maintain some open-source, high-performance .NET text serializers:
JSV TypeSerializer, a compact, clean, JSON+CSV-like format that's 3.1x quicker than the DataContractSerializer,
as well as a JsonSerializer that's 2.6x quicker.
As the author, I would invite you to try protobuf-net; it ships with binaries for both Mono 2.0 and Silverlight 2.0, and is fast and efficient. If you have any problems whatsoever, just drop me an e-mail (see my Stack Overflow profile); support is free.
Jon's version (see the earlier accepted answer) is also very good, but IMO the protobuf-net version is more idiomatic for C# - Jon's would be ideal if you were talking C# to Java, so you could have a similar API at both ends.
I had a similar problem, although I'm just using .NET. I wanted to send data over the Internet as quickly and easily as possible. I didn't find anything that would be optimized enough, so I made my own serializer, named NetSerializer.
NetSerializer has its limitations, but they didn't affect my use case. And I haven't done benchmarks for a while, but it was much much faster than anything else I found.
I haven't tried it on Mono or Silverlight. I'd bet it works on Mono, but I'm not sure what the level of support is for DynamicMethods on Silverlight.
You could try using JSON. It's not as bandwidth efficient as Protocol Buffers, but it would be a lot easier to monitor messages with tools like Wireshark, which helps a lot when debugging problems. .NET 3.5 comes with a JSON serializer.
You could pass the data through a DeflateStream or GZipStream to compress it prior to transmission. These classes live in the System.IO.Compression namespace.
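Putting those two suggestions together, a sketch (the Message type is illustrative) that serializes with .NET 3.5's DataContractJsonSerializer and then compresses with GZipStream:

```csharp
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;

[DataContract]
public class Message // illustrative payload
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Body { get; set; }
}

static class JsonGzip
{
    public static byte[] Pack(Message m)
    {
        var serializer = new DataContractJsonSerializer(typeof(Message));
        using (var buffer = new MemoryStream())
        {
            // Compress while writing; leaveOpen keeps the buffer readable after
            // the gzip stream is disposed (disposing flushes the gzip footer).
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
                serializer.WriteObject(gzip, m);
            return buffer.ToArray(); // ready for transmission
        }
    }
}
```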
I had a very similar problem - saving to a file. But the following can also be used over a network as it was actually designed for remoting.
The solution is to use Simon Hewitt's library - see Optimizing Serialization in .NET - part 2.
Part 1 of the article states:
"... If you've ever used .NET remoting for large amounts of
data, you will have found that there are problems with
scalability. For small amounts of data, it works well
enough, but larger amounts take a lot of CPU and memory,
generate massive amounts of data for transmission, and
can fail with Out Of Memory exceptions. There is also a big
problem with the time taken to actually perform the
serialization - large amounts of data can make it unfeasible
for use in apps ...."
I got a similar result for my particular application: 40 times faster saving and 20 times faster loading (from minutes to seconds). The size of the serialised data was also much reduced; I don't remember exactly, but it was smaller by at least a factor of 2-3.
It is quite easy to get started. However, there is one gotcha: only use .NET serialisation for the very highest-level data structure (to get serialisation/deserialisation started), and then call the serialisation/deserialisation functions directly for the fields in the highest-level data structure. Otherwise there will not be any speed-up. For instance, if a particular data structure (say Generic.List) is not supported by the library, then .NET serialisation will be used instead, and this is a no-no. Instead, serialise the list in client code (or similar). For an example, see near "'This is our own encoding." in the same function as listed below.
For reference: code from my application - see near "Note: this is the only place where we use the built-in .NET ...".
You can try BOIS, which focuses on packed data size and provides the best packing so far (I haven't seen better optimization yet).
https://github.com/salarcode/Bois