Version control a protobuf-net serialised C# object - c#

I have serialized a C# class using protobuf-net. The resultant byte array is stored in a database. This is for performance reasons and I probably won't be able to change this design. The C# language doesn't make it possible to prevent classes being modified, and the class structure being passed in for deserialization with time may require changes that will not match that used for serialization, causing retrieval to fail.
Other than the wrapper technique suggested here, is there a pattern or technique for handling this kind of problem?

The only othe technique that comes to my mind is to version the classes that need to be deserialized in order to not loose anything when you need to make some changes. When you serialize an instance of those classes, you have to serialize also the version of the class (it could be a field of the the class itself).
I don't think this is the best solution but a solution.
The versioning strategy could become very difficult to manage when the changes (and the versions) start to grow.

Related

How to deserialize a binary formatted object whose type got modified as an abstract class?

Our application uses event sourcing. Earlier we had an event called PropertyUpdated which was raised by an aggregate. This has been serialized and stored in our event store. Now the schema of this class has got changed and also the event has been made as an abstract class. We now have more specific derived classes like TemplatePropertyUpdated, SystemPropertyUpdated, etc.
As you can see, if I load older aggregate events and try to hydrate the state, I'll get errors during deserialization. I'm working on a tool to load older events and upgrade them as per the latest changes. We are using BinaryFormatter for Serialization and Deserialization. The older (non-abstract) event is now giving "cannot create an abstract class" exception during deserialization.
What is the best approach to create a dynamic object from this old event so that a newer version can be derived?
My preferred way to deal with things like this is to have a serialization-layer. All objects get converted to a corresponding serialization object, and back again.
If there is a significant change in the object hierarchy you would rename the serialization object to something like MyObjectLegacy or MyObject_v0 . The conversion code can then convert the old serialization object to whatever new object(s) you want to use. This helps keep serialization separated from the rest of the object model.
If you lack a serialization layer I see few other options than keeping the old class around just so you can unpack it. You might be able to use binding redirects or other stuff to be able to rename the class, but this is a bit fragile in my experience.
I would encourage you to switch to something other than BinaryFormatter. It is slow, fragile, and produces relatively large files. There are much better alternatives, like protobuf.Net, Bson, json etc. Take a look at a comparison. We had some problems when BinaryFormatter changed format between different .Net versions, not fun.

Serialize custom external class that is not serializable

I wonder if there is any possibility of serializing a class described in a topic.
Suppose we have someone's library that is shared as binary DLL file. Additionally a creator of this lib created a class that is not Serializable. How to serialize such a class? I know I can create a twin-class that contains all the poperties etc. that can be serialized. But is there any other, easier solution to do this? How do you serialize classes that are "not yours" and are stored as binary only?
The 3rd party class is an implementation detail; frankly, it is a very bad idea to involve this in your serialization, as you are then completely fenced into a corner, and can never change implementation. You would also face significant risk of versioning issues - something that BinaryFormatter simply doesn't handle well.
It might not be what you want to hear, but I offer two recommendations:
do not serialize implementation details; serialize the data (only); this may indeed require you to write a DTO that mirrors the implementation, but this is usually a trivial job
make sure you understand the implications of BinaryFormatter; frankly, I never recommend it - it has... glitches.
As for workarounds: you can investigate serialization surrogates, but that isn't a trivial thing to do inside BinaryFormatter, and is basically just a re-statement of the first bullet.
If it was me (although I am hugely biased), I would change serializer; protobuf-net (disclosure: I'm the author) works as a binary serializer, and has easy-to-implement support for surrogates if the third-party model is already coupled to your model.

What's the difference between DataContractJsonSerializer and JavaScriptSerializer?

The .NET Framework ships with System.Runtime.Serialization.Json.DataContractJsonSerializer and System.Web.Script.Serialization.JavaScriptSerializer, both of which de/serialize JSON. How do I know when to choose one of these types over the other? MSDN doesn't make it clear what their relative advantages are.
We have several projects that consume or emit JSON, and the class selected for each thus far has depended on the opinion of the primary dev on each project. Some are simple, two have complex logic regarding producing managed types from JSON (the types do not map closely to the streams) but don't have any emphasis on speed, one requires speed. None interact with WCF, at least as of now.
While I'm interested in alternative libraries, I am hoping that somebody might have an answer to my question too.
The DataContractJsonSerializer is intended for use with WCF client applications where the serialized types are typically POCO classes with the DataContract attribute applied to them. No DataContract, no serialization. The mapping mechanism of WCF makes the sending and receiving very simple, but only if your platform is homogeneous. If you start mixing in different toolsets, your program might go sideways.
The JavaScriptSerializer can serialize any type, including anonymous types (one way), and does so in a more conformant way. You lose the "automagic" of WCF, but you gain more integration options.
As you can see by the comments, there are a lot of options out there for AJAX serialization, and to address your speed vs. maintainability questions, it might be worth investigating them to find a solution that meets the needs of all the teams, to reduce maintainability issues in the long term as everybody does things their own way.
2014-04-07 UPDATE:
I suggest using JSON.NET if you can. See http://james.newtonking.com/json Feature Comparison for a review of the 3 libraries considered in this question.
2015-05-26 UPDATE:
If your company requires the use of commercially licensable products, or you need every last bit of performance, you may also want to check out https://servicestack.net/.
Both do approximately the same but using very different infrastructure thus applying different restrictions on the classes you want to serialize/deserialize and providing different degree of flexibility in tuning the serialization/deserialization process.
For DataContractJsonSerializer you must mark all classes you want to serialize using DataContract atrtibute and all members using DataMember attribute. As well as if some of you classes have enum members, then the enums also must be marked as DataContract and each enum member - with EnumMember attribute.
Also DataContractJsonSerializer allows you fine control over the whole process of serialization/deserialization by altering types resolution logic and replacing the types you serialize with surrogates.
For JavaScriptSerializer you must provide parameterless constructor if you plan on deserializing objects from json string.
For me, I usually use JavaScriptSerializer in presentation logic, where there's a simple model I want to render in Json together with page, without additional ajax requests. And I even usually don't have to deserialize them back to c# - so there's no overhead at all. But if it's persistence logic, where I want to save objects into a data store (usually no-sql storage), to load them later, I prefer using DataContractJsonSerializer because the overhead of putting attributes is worth of flexibility in the serialization/deserialization process tuning, especially when it comes to loading of serialized data into the objects of the newer version, with updated definitions
Personally, I think that DataContractJsonSerializer reeks of over-engineering. I'd skip it and go with JavaScriptSerializer. In the event where JavaScriptSerializer isn't available, you can use FridayThe13th (a library I wrote ;p).

Best way to update a database's serialized objects after changing the object form

I often write my objects out to a database in xml form.
However, if I change the form of my objects, say by changing the name or by changing the fields, I can no longer read them from the database, which makes somewhat difficult the task of reading them, converting them to their new form, and writing them back out to the database.
I'd rather not have to rename my classes everytime I change something about them.
*Note: I am relying on C#'s XmlSerialization/Deserialization of objects for generating the Xml. Perhaps this is not desirable if I change the format of the objects.
If you implement the ISerializable interface on your objects, then you can implement custom serialization/deserialization routines that provide backwards compatibility with older versions of the objects.
See here for an example of how it can be done: ISerializable and backward compatibility
It ultimately depends on how you serialize your objects.
One convenient option is to store them as hash (key-value pairs). This way, if to class Dog having property name I add another property breed, existing objects won't be invalidated. They'll just have have breed = nil.
Exact storage format (xml, json or separate 'properties' table) is not important in this case, the important thing is how you convert objects to it and back.
But I don't think anyone could give specific suggestions without knowing specific platform.

Why is Serializable Attribute required for an object to be serialized

Based on my understanding, SerializableAttribute provides no compile time checks, as it's all done at runtime. If that's the case, then why is it required for classes to be marked as serializable?
Couldn't the serializer just try to serialize an object and then fail? Isn't that what it does right now? When something is marked, it tries and fails. Wouldn't it be better if you had to mark things as unserializable rather than serializable? That way you wouldn't have the problem of libraries not marking things as serializable?
As I understand it, the idea behind the SerializableAttribute is to create an opt-in system for binary serialization.
Keep in mind that, unlike XML serialization, which uses public properties, binary serialization grabs all the private fields by default.
Not only this could include operating system structures and private data that is not supposed to be exposed, but deserializing it could result in corrupt state that can crash an application (silly example: a handle for a file open in a different computer).
This is only a requirement for BinaryFormatter (and the SOAP equivalent, but nobody uses that). Diego is right; there are good reasons for this in terms of what it does, but it is far from the only option - indeed, personally I only recommend BinaryFormatter for talking between AppDomains - it is not (IMO) a good way to persist data (to disk, in cache, to a database BLOB, etc).
If this behaviour causes you trouble, consider using any of the alternatives:
XmlSerializer, which works on public members (not just the fields), but demands a public parameterless constructor and public type
DataContractSerializer, which can work fully opt-in (using [DataContract]/[DataMember]), but which can also (in 3.5 and above) work against the fields instead
Also - for a 3rd-party option (me being the 3rd party); protobuf-net may have options here; "v2" (not fully released yet, but available as source) allows the model (which members to serialize, etc) to be described independently of the type, so that it can be applied to types that you don't control. And unlike BinaryFormatter the output is version-tolerant, known public format, etc.

Categories