Can strong naming cause problems with object serialization in C#?

I serialize some configuration objects and store the result bytes within a database.
new BinaryFormatter().Serialize(memoryStream, instance);
Convert.ToBase64String(memoryStream.ToArray());
These objects will be deserialized later.
new BinaryFormatter().Deserialize(memoryStream);
It's possible that the application has some newer assembly versions at the time of deserialization. In general it works well, but sometimes I get a file load exception:
"The located assembly's manifest definition does not match the assembly reference." All of the assemblies are strong-named; can that be the problem, and how could I avoid it?
Thanks for help

Absolutely, using BinaryFormatter with database (i.e. long-term) storage is a bad idea; BinaryFormatter has three big faults (by default):
it includes type metadata (shucks if you move/rename your types... this can mean strong name/versioning too)
it includes field names (fields are private details!)
it is .NET specific (which is a pain if you ever want to use anything else)
My blog post here raises two specific issues with this - obfuscation and automatically implemented properties... I won't repeat the text here, but you may find it interesting.
I recommend the use of a contract based serialization. XmlSerializer or DataContractSerializer would suffice normally. If you want small efficient binary, then protobuf-net might be of interest. Unlike BinaryFormatter, the binary from this is portable between implementations, extensible (for new fields), etc. And it is quicker and smaller, too.
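As a rough sketch of what the contract-based approach looks like with DataContractSerializer (the type and member names below are purely illustrative):
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract(Name = "Config")]
public class ConfigSettings
{
    [DataMember(Name = "timeout", Order = 1)]
    public int Timeout { get; set; }

    [DataMember(Name = "server", Order = 2)]
    public string Server { get; set; }
}

class Demo
{
    static void Main()
    {
        var serializer = new DataContractSerializer(typeof(ConfigSettings));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, new ConfigSettings { Timeout = 30, Server = "db1" });
            stream.Position = 0;
            var copy = (ConfigSettings)serializer.ReadObject(stream);
            Console.WriteLine(copy.Server); // "db1"
        }
    }
}
Because the wire format is driven by the contract (the names and order you declare) rather than by the CLR type identity, renaming the type or moving it to a different strong-named or re-versioned assembly doesn't invalidate data you stored earlier.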

I think WCF might be your best bet. It can handle passing unknown fields through to its consumer even if it doesn't know how to deserialize them.
Example:
Service A: Knows about version 2 of the Widget class which has a Description field
Service B: Knows about version 1 of the Widget class which doesn't have a Description field
Service C: Knows about version 2 of the Widget class which has a Description field
If service A calls service B passing a Widget object, and then service B calls service C passing on the same Widget object, then service C will get the Description field as it was passed from service A. Service B won't have any Description field, but when it deserializes and re-serializes the object it will just pass the Description field through without knowing what it is.
So, you could use WCF services with in-proc communication.
See this link for more on versioning WCF contracts.
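The pass-through behaviour described above comes from IExtensibleDataObject; a minimal sketch of how a service that only knows version 1 of the contract would declare it (Widget here is just the hypothetical type from the example):
using System.Runtime.Serialization;

// Version 1 of the contract, as service B might define it. Because it implements
// IExtensibleDataObject, any members it doesn't know about (e.g. Description from
// version 2) are captured in ExtensionData during deserialization and written
// back out when the object is re-serialized.
[DataContract]
public class Widget : IExtensibleDataObject
{
    [DataMember]
    public string Name { get; set; }

    public ExtensionDataObject ExtensionData { get; set; }
}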

Related

Serialization and object versioning in C#

If I want to serialize an object I have to use the [Serializable] attribute, and all member variables will be written to the file. What I don't know is how to do versioning, e.g. if I add a new member variable (or rename or remove a variable) and then open (deserialize) the file, how can I determine the object/file version so I can correctly set the new member or run some kind of migration? How can I determine whether a variable was initialized during the load or was ignored by the deserializer?
I know that there are version-tolerant approaches and that I can mark variables with the [OptionalField(VersionAdded = 1)] attribute. If I open an old file the framework will ignore this optional (new) variable and it will just be zero/null. But again, how can I determine whether the variable was initialized by the load or was ignored?
I can write the class/object version number to the stream: use the ISerializable approach and read this version number in the (SerializationInfo info, StreamingContext context) constructor. That tells me exactly which class version is in the stream.
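In code, the approach I mean looks roughly like this (Settings is just an example type):
using System;
using System.Runtime.Serialization;

[Serializable]
public class Settings : ISerializable
{
    private const int CurrentVersion = 2;

    public string Name;
    public int Timeout;   // added in version 2

    public Settings() { }

    // deserialization constructor - read the version we wrote ourselves,
    // then decide which members to expect (old data without the version
    // key would still need separate handling)
    protected Settings(SerializationInfo info, StreamingContext context)
    {
        int version = info.GetInt32("__version");
        Name = info.GetString("Name");
        Timeout = version >= 2 ? info.GetInt32("Timeout") : 30; // default for old data
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("__version", CurrentVersion);
        info.AddValue("Name", Name);
        info.AddValue("Timeout", Timeout);
    }
}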
However, I expected this kind of versioning to already be implemented by the serialization framework in C#. I tried to obtain the assembly version from the SerializationInfo, but it is always set to the current version, not to the version that was used when the object was saved.
What is the preferred approach? I found a lot of articles on the net, but I could not find a good solution for this which addresses versioning...
Any help is appreciated
Thanks,
Abyss
Forgive me if some of what I write is too obvious,
First of all, please! You must stop thinking that you are serializing an object...
That is simply incorrect, as the methods which are part of your object are not being persisted.
You are persisting information - and so... DATA only.
.NET serialization also serializes the type name of your object, which contains the assembly name and its version, so when you deserialize, it compares the persisted assembly information with the type the data is about to be materialized as - if they are not the same it will throw an exception.
Besides the versioning problem - not everything can be serialized so easily... try to serialize a System.Drawing.Color type and you will begin to understand the problems with the over-simplistic mechanism of .NET serialization.
Unless you plan to serialize something really simple that has no plans to evolve, I wouldn't use the serialization mechanism provided by .NET.
Getting the focus back to your question, you can read here about the version-tolerance abilities provided for BinaryFormatter:
http://msdn.microsoft.com/en-us/library/ms229752(v=vs.80).aspx
You should also check XML serialization, which has some nice abilities, but the biggest benefit is that you get XML which is human-readable, so your data will never be lost even if you have complications with the versioning of your types.
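For example (only a sketch, with made-up type names):
using System;
using System.IO;
using System.Xml.Serialization;

public class AppConfig
{
    public string Name { get; set; }
    public int Timeout { get; set; }
}

class Demo
{
    static void Main()
    {
        var serializer = new XmlSerializer(typeof(AppConfig));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, new AppConfig { Name = "prod", Timeout = 30 });
            // human-readable XML; elements missing from older files simply
            // leave the corresponding properties at their default values
            Console.WriteLine(writer.ToString());
        }
    }
}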
Finally, I recommend you either use a database with Entity Framework to persist your data or write your own flat-file manager. While EF is very good for most solutions, sometimes you might want something lighter to persist something very simple.
(My implication is that I can no longer see a scenario where .NET serialization is relevant.)
I hope this helps, Good luck.

WCF - Complex Objects - KnownTypes

OK, not really sure how to word this, but I will try my best.
I have a number of WCF services that are set up and running, awaiting an object to come in for processing.
WCFServiceA
WCFServiceB
WCFServiceC
Service A will run some processing and decide to send the object onto Service B or C.
So my object has [DataContract] attribute on all classes in it and [DataMember] on all properties.
So, so far so good.
But now I will lose all the functionality from my object, as this is now basically a serialised version of the object.
So, if I want to use a full complex object, is it best practice to include the same assembly in all 3 services as a reference and send things across as "KnownTypes", while providing the basic DataContract and DataMember for anything using the services that does not know these types, so they can still create these objects for the services to run with?
Hope I have worded this correctly and you understand my question here.
:EDIT:
To try and clarify.
The object I am sending can have a "Policy" attached to it; this policy object is a class and can be one of several types: vehicle, house, life, pet policy, etc.
But the actual type will not be known by the receiving service. Hence the need for KnownTypes.
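Something like this is what I have in mind (the policy types here are simplified):
using System.Runtime.Serialization;

// The base contract declares which derived types the serializer may encounter,
// so the receiving service can deserialize a VehiclePolicy it only knows as a Policy.
[DataContract]
[KnownType(typeof(VehiclePolicy))]
[KnownType(typeof(HousePolicy))]
public class Policy
{
    [DataMember]
    public string PolicyNumber { get; set; }
}

[DataContract]
public class VehiclePolicy : Policy
{
    [DataMember]
    public string Registration { get; set; }
}

[DataContract]
public class HousePolicy : Policy
{
    [DataMember]
    public string Address { get; set; }
}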
I think I just answered my own question!! :)
That was a good explanation of the problem. The drawback I see in this approach is that if you are going to update the object, say by adding new properties or removing some, all 3 services need to be updated with the new assembly.
Using known types can sometimes lead to backward-compatibility issues when you want to upgrade the objects in a live environment, depending on the setup.
Alternatively, create a DTO (data transfer object) with just the properties and pass it across the services as a data contract, and strip the complex logic out into a helper class which can be referenced by the services.
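Roughly, that split could look like this (all names are illustrative):
using System.Runtime.Serialization;

// dumb DTO that crosses the wire - data only, no behaviour
[DataContract]
public class PolicyDto
{
    [DataMember] public string PolicyNumber { get; set; }
    [DataMember] public decimal MonthlyPremium { get; set; }
}

// shared helper referenced by whichever services need the logic
public static class PolicyCalculator
{
    public static decimal AnnualCost(PolicyDto policy)
    {
        return policy.MonthlyPremium * 12; // stand-in for the real "complex logic"
    }
}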

Why is Serializable Attribute required for an object to be serialized

Based on my understanding, SerializableAttribute provides no compile time checks, as it's all done at runtime. If that's the case, then why is it required for classes to be marked as serializable?
Couldn't the serializer just try to serialize an object and then fail? Isn't that what it does right now? When something is marked, it tries and fails. Wouldn't it be better if you had to mark things as unserializable rather than serializable? That way you wouldn't have the problem of libraries not marking things as serializable?
As I understand it, the idea behind the SerializableAttribute is to create an opt-in system for binary serialization.
Keep in mind that, unlike XML serialization, which uses public properties, binary serialization grabs all the private fields by default.
Not only could this include operating system structures and private data that is not supposed to be exposed, but deserializing it could result in corrupt state that can crash an application (silly example: a handle for a file open on a different computer).
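In other words (a trivial sketch):
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

class Unmarked { public int Value; }            // no [Serializable] - opted out by default

[Serializable]
class Marked { public int Value; }              // explicitly opted in

class Demo
{
    static void Main()
    {
        var formatter = new BinaryFormatter();
        using (var ms = new MemoryStream())
        {
            formatter.Serialize(ms, new Marked { Value = 1 });    // works
            formatter.Serialize(ms, new Unmarked { Value = 1 });  // throws SerializationException
        }
    }
}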
This is only a requirement for BinaryFormatter (and the SOAP equivalent, but nobody uses that). Diego is right; there are good reasons for this in terms of what it does, but it is far from the only option - indeed, personally I only recommend BinaryFormatter for talking between AppDomains - it is not (IMO) a good way to persist data (to disk, in cache, to a database BLOB, etc).
If this behaviour causes you trouble, consider using any of the alternatives:
XmlSerializer, which works on public members (not just the fields), but demands a public parameterless constructor and public type
DataContractSerializer, which can work fully opt-in (using [DataContract]/[DataMember]), but which can also (in 3.5 and above) work against the fields instead
Also - for a 3rd-party option (me being the 3rd party) - protobuf-net may have options here; "v2" (not fully released yet, but available as source) allows the model (which members to serialize, etc.) to be described independently of the type, so that it can be applied to types that you don't control. And unlike BinaryFormatter, the output is version-tolerant, a known public format, etc.
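A rough sketch of what describing a type you don't control might look like with the v2 RuntimeTypeModel (treat the exact calls as illustrative and check the current source for the precise API):
using System.IO;
using ProtoBuf.Meta;

// a type we can't decorate with attributes (illustrative)
public class LegacyConfig
{
    public string Server { get; set; }
    public int Timeout { get; set; }
}

class Demo
{
    static void Main()
    {
        // describe the contract at runtime instead of with attributes
        var model = RuntimeTypeModel.Create();
        var meta = model.Add(typeof(LegacyConfig), false); // false = don't apply default behaviour
        meta.Add(1, "Server");
        meta.Add(2, "Timeout");

        using (var ms = new MemoryStream())
        {
            model.Serialize(ms, new LegacyConfig { Server = "db1", Timeout = 30 });
        }
    }
}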

WCF serialization of Method Bodies

I'm writing some kind of Computing farm with central server giving tasks and nodes that compute them.
I wanted to write it in such a way that the nodes don't know exactly what they are computing. They get (from the server) an object that implements the IComputable interface, which has one method, .compute(), that returns an IResult object, and they send the result to the server.
The server is responsible for preparing these objects and serving them through a .getWork() method on a WCF service, and it gets the results via a .submitResult(IResult result) method.
The problem is that the worker nodes need to know not only the interface, but the full object implementation.
I know that Java can serialize methods (probably to bytecode) through RMI. Is that possible with C#?
What you will have to do is put the type which implements the method you are describing into a separate assembly. You can then send the assembly as a byte array to your server, where it will load the assembly, inspect it for types that fit your interface, and then load them. This is the basic pattern for plug-ins using .NET.
Some care has to be taken, though. If you are accepting code from arbitrary sources, you will have to lock down what these loaded assemblies can do (and it is good practice to do so even if you trust the source).
A good classic example for how to do this is the Terrarium project. It is a case study that Microsoft produced that involved the viral spreading of arbitrary assemblies in a secure fashion.
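A rough sketch of the loading side (IComputable/IResult are the interfaces from the question, with casing adjusted to C# conventions; the rest is illustrative and ignores the sandboxing concerns above):
using System;
using System.Linq;
using System.Reflection;

// shared contract assembly, referenced by both server and nodes
public interface IResult { }
public interface IComputable { IResult Compute(); }

public static class WorkLoader
{
    // rawAssembly is the byte[] received over the wire
    public static IComputable CreateWork(byte[] rawAssembly)
    {
        Assembly assembly = Assembly.Load(rawAssembly);

        // find the first concrete type implementing the shared interface
        Type workType = assembly.GetTypes()
            .First(t => typeof(IComputable).IsAssignableFrom(t) && !t.IsAbstract);

        return (IComputable)Activator.CreateInstance(workType);
    }
}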
You can do
System.Linq.Expressions.Expression<Func<IResult>> lambda = () => MyFunction();
and then you can serialize the expression to a string and deserialize it on the server

Method code in c# serialization

When an object is serialized (by remoting to be sent across the wire) does the instance method code get serialized? Or are just the class level instance fields serialized?
I am asking this as some of my objects have large methods, and I want to know whether I should be using DTOs (data transfer objects) for sending data across the wire.
I'm guessing it's just the data plus some type/version data... am I right?
Thanks
Methods are never serialized.
Re "fields" - it all depends on the serializer; BinaryFormatter will do fields; you mention "remoting", which suggests BinaryFormatter, but remoting is largely a hangover now - from MSDN (on remoting):
This topic is specific to a legacy technology that is retained for backward compatibility with existing applications and is not recommended for new development. Distributed applications should now be developed using the Windows Communication Foundation (WCF).
If you use web-services or WCF: XmlSerializer does public fields+properties; DataContractSerializer will do marked fields, etc.
Regular classes are often reusable as DTOs, but if you need lots of control over the wire (or have versioning issues), a separate DTO can be helpful.
(edit/additional) note also that there are other reasons not to like BinaryFormatter - it can be very brittle with versioning, and very painful to fix (although achievable). Other (more tolerant) serializers exist if this is likely to be an issue... if so, let me know and I'll update.
What gets saved is the data plus tags corresponding to your class and property names. The code itself doesn't get serialized.
