Based on my understanding, SerializableAttribute provides no compile time checks, as it's all done at runtime. If that's the case, then why is it required for classes to be marked as serializable?
Couldn't the serializer just try to serialize an object and then fail? Isn't that what it does right now? When something is marked, it tries and fails. Wouldn't it be better if you had to mark things as unserializable rather than serializable? That way you wouldn't have the problem of libraries not marking things as serializable?
As I understand it, the idea behind the SerializableAttribute is to create an opt-in system for binary serialization.
Keep in mind that, unlike XML serialization, which uses public properties, binary serialization grabs all the private fields by default.
Not only could this include operating system structures and private data that is not supposed to be exposed, but deserializing it could also result in corrupt state that can crash an application (silly example: a handle for a file that is open on a different computer).
This is only a requirement for BinaryFormatter (and the SOAP equivalent, but nobody uses that). Diego is right; there are good reasons for this in terms of what it does, but it is far from the only option - indeed, personally I only recommend BinaryFormatter for talking between AppDomains - it is not (IMO) a good way to persist data (to disk, in cache, to a database BLOB, etc).
If this behaviour causes you trouble, consider using any of the alternatives:
XmlSerializer, which works on public members (not just the fields), but demands a public parameterless constructor and public type
DataContractSerializer, which can work fully opt-in (using [DataContract]/[DataMember]), but which can also (in 3.5 and above) work against the fields instead
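To make the difference between those two concrete, here is a minimal sketch. The `Account` type and its members are made up for illustration; the point is only that XmlSerializer looks at public members, while DataContractSerializer in opt-in mode looks at whatever is marked [DataMember], regardless of visibility.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml.Serialization;

// Hypothetical type, used only for illustration.
[DataContract]
public class Account
{
    [DataMember]
    private string secret;              // private, but explicitly opted in

    public string Name { get; set; }    // public, but not a [DataMember]

    public Account() { }                // XmlSerializer needs a public parameterless constructor

    public Account(string name, string secret)
    {
        Name = name;
        this.secret = secret;
    }
}

public static class OptInDemo
{
    // XmlSerializer: public read/write members only, so Name is written and secret is not.
    public static string WithXmlSerializer(Account account)
    {
        var serializer = new XmlSerializer(typeof(Account));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, account);
            return writer.ToString();
        }
    }

    // DataContractSerializer: only [DataMember] members, so secret is written and Name is not.
    public static string WithDataContractSerializer(Account account)
    {
        var serializer = new DataContractSerializer(typeof(Account));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, account);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var account = new Account("demo", "s3cr3t");
        Console.WriteLine(WithXmlSerializer(account));
        Console.WriteLine(WithDataContractSerializer(account));
    }
}
```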
Also - for a 3rd-party option (me being the 3rd party); protobuf-net may have options here; "v2" (not fully released yet, but available as source) allows the model (which members to serialize, etc) to be described independently of the type, so that it can be applied to types that you don't control. And unlike BinaryFormatter the output is version-tolerant, known public format, etc.
I wonder if there is any possibility of serializing a class described in a topic.
Suppose we have someone's library that is shared as a binary DLL file. Additionally, the creator of this lib created a class that is not Serializable. How do I serialize such a class? I know I can create a twin class that contains all the properties etc. that can be serialized. But is there any other, easier solution? How do you serialize classes that are "not yours" and are stored as binary only?
The 3rd party class is an implementation detail; frankly, it is a very bad idea to involve this in your serialization, as you are then completely fenced into a corner, and can never change implementation. You would also face significant risk of versioning issues - something that BinaryFormatter simply doesn't handle well.
It might not be what you want to hear, but I offer two recommendations:
do not serialize implementation details; serialize the data (only); this may indeed require you to write a DTO that mirrors the implementation, but this is usually a trivial job
make sure you understand the implications of BinaryFormatter; frankly, I never recommend it - it has... glitches.
As for workarounds: you can investigate serialization surrogates, but that isn't a trivial thing to do inside BinaryFormatter, and is basically just a re-statement of the first bullet.
If it was me (although I am hugely biased), I would change serializer; protobuf-net (disclosure: I'm the author) works as a binary serializer, and has easy-to-implement support for surrogates if the third-party model is already coupled to your model.
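For the "serialize the data, not the implementation" recommendation, a DTO can look roughly like this. It is only a sketch: `VendorCustomer` stands in for the third-party class, `CustomerDto` and the `Save`/`Load` helpers are invented names, and DataContractSerializer is just one serializer you could plug in here.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Stand-in for a third-party class that you cannot modify (hypothetical).
public sealed class VendorCustomer
{
    private readonly string name;
    private readonly int loyaltyPoints;

    public VendorCustomer(string name, int loyaltyPoints)
    {
        this.name = name;
        this.loyaltyPoints = loyaltyPoints;
    }

    public string Name { get { return name; } }
    public int LoyaltyPoints { get { return loyaltyPoints; } }
}

// The DTO carries only the data and is the only type the serializer ever sees.
[DataContract]
public class CustomerDto
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int LoyaltyPoints { get; set; }

    public static CustomerDto FromModel(VendorCustomer customer)
    {
        return new CustomerDto { Name = customer.Name, LoyaltyPoints = customer.LoyaltyPoints };
    }

    public VendorCustomer ToModel()
    {
        return new VendorCustomer(Name, LoyaltyPoints);
    }
}

public static class DtoDemo
{
    public static byte[] Save(VendorCustomer customer)
    {
        var serializer = new DataContractSerializer(typeof(CustomerDto));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, CustomerDto.FromModel(customer));
            return stream.ToArray();
        }
    }

    public static VendorCustomer Load(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(CustomerDto));
        using (var stream = new MemoryStream(data))
        {
            return ((CustomerDto)serializer.ReadObject(stream)).ToModel();
        }
    }

    public static void Main()
    {
        var restored = Load(Save(new VendorCustomer("Ada", 120)));
        Console.WriteLine(restored.Name + " / " + restored.LoyaltyPoints);
    }
}
```

If the vendor later changes its internals, only `FromModel` and `ToModel` need to follow; the stored format stays the same.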
The .NET Framework ships with System.Runtime.Serialization.Json.DataContractJsonSerializer and System.Web.Script.Serialization.JavaScriptSerializer, both of which de/serialize JSON. How do I know when to choose one of these types over the other? MSDN doesn't make it clear what their relative advantages are.
We have several projects that consume or emit JSON, and the class selected for each thus far has depended on the opinion of the primary dev on each project. Some are simple, two have complex logic regarding producing managed types from JSON (the types do not map closely to the streams) but don't have any emphasis on speed, one requires speed. None interact with WCF, at least as of now.
While I'm interested in alternative libraries, I am hoping that somebody might have an answer to my question too.
The DataContractJsonSerializer is intended for use with WCF client applications where the serialized types are typically POCO classes with the DataContract attribute applied to them. No DataContract, no serialization. The mapping mechanism of WCF makes the sending and receiving very simple, but only if your platform is homogeneous. If you start mixing in different toolsets, your program might go sideways.
The JavaScriptSerializer can serialize any type, including anonymous types (one way), and does so in a more conformant way. You lose the "automagic" of WCF, but you gain more integration options.
As you can see by the comments, there are a lot of options out there for AJAX serialization, and to address your speed vs. maintainability questions, it might be worth investigating them to find a solution that meets the needs of all the teams, to reduce maintainability issues in the long term as everybody does things their own way.
2014-04-07 UPDATE:
I suggest using JSON.NET if you can. See the Feature Comparison at http://james.newtonking.com/json for a review of the 3 libraries considered in this question.
2015-05-26 UPDATE:
If your company requires the use of commercially licensable products, or you need every last bit of performance, you may also want to check out https://servicestack.net/.
Both do approximately the same thing but use very different infrastructure, so they place different restrictions on the classes you want to serialize/deserialize and provide different degrees of flexibility in tuning the serialization/deserialization process.
For DataContractJsonSerializer you must mark all classes you want to serialize with the DataContract attribute and all members with the DataMember attribute. Likewise, if some of your classes have enum members, then the enums must also be marked as DataContract and each enum member with the EnumMember attribute.
Also DataContractJsonSerializer allows you fine control over the whole process of serialization/deserialization by altering types resolution logic and replacing the types you serialize with surrogates.
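As a rough sketch of what that attribute markup looks like in practice (the `Subscriber` and `PlanKind` types and the JSON member names are invented for the example, not taken from any real project):

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical types, used only for illustration.
[DataContract]
public enum PlanKind
{
    [EnumMember] Free,
    [EnumMember] Paid
}

[DataContract]
public class Subscriber
{
    [DataMember(Name = "name")] public string Name { get; set; }
    [DataMember(Name = "plan")] public PlanKind Plan { get; set; }
}

public static class JsonDemo
{
    public static string ToJson(Subscriber subscriber)
    {
        var serializer = new DataContractJsonSerializer(typeof(Subscriber));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, subscriber);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static Subscriber FromJson(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(Subscriber));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (Subscriber)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        string json = ToJson(new Subscriber { Name = "Ada", Plan = PlanKind.Paid });
        Console.WriteLine(json);

        Subscriber copy = FromJson(json);
        Console.WriteLine(copy.Name + " / " + copy.Plan);
    }
}
```

The `Name = "..."` arguments are optional; they are there to show that the JSON member names can be decoupled from the C# property names.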
For JavaScriptSerializer you must provide a parameterless constructor if you plan on deserializing objects from a JSON string.
For me, I usually use JavaScriptSerializer in presentation logic, where there's a simple model I want to render as JSON together with the page, without additional AJAX requests. I usually don't even have to deserialize it back to C#, so there's no overhead at all. But for persistence logic, where I want to save objects into a data store (usually NoSQL storage) and load them later, I prefer DataContractJsonSerializer: the overhead of adding attributes is worth the flexibility in tuning the serialization/deserialization process, especially when it comes to loading serialized data into objects of a newer version with updated definitions.
Personally, I think that DataContractJsonSerializer reeks of over-engineering. I'd skip it and go with JavaScriptSerializer. In the event where JavaScriptSerializer isn't available, you can use FridayThe13th (a library I wrote ;p).
On my own object I can add the metatag [Serializable] to make it serializable. Now I use a 3rd party library that I need to be serializable. I inspected the code and it should not be a problem. Is there a way to fix this without altering the 3rd party code?
My advice would be: serialize data, not implementation. The fact of the existence of a 3rd-party object is nothing to do with the data; that is an implementation detail. As such, I always offer the same advice: if serialization ever gets complex, the first thing to do is to introduce a separate DTO model that represents the data in isolation of the implementation, and just map the current state to that DTO. This allows you to handle implementation changes without impact on the storage, and allows otherwise non-serializable objects to be serialized.
Some serializers offer workarounds - for example with protobuf-net you can a: supply the serialization information for any type at runtime, and b: supply a "surrogate" to use automatically when it gets tricky, but - using a DTO model is simpler and easier to maintain.
Your use of [Serializable] suggests BinaryFormatter; in my opinion, this is almost never a good choice for any kind of storage, since BinaryFormatter relies on implementation details. It works nicely for passing data between two in-sync app-domains, though.
If the types are public you should be able to use the XmlSerializer to do what you want.
There's more information on this in the XmlSerializer documentation:
Serializes and deserializes objects into and from XML documents. The XmlSerializer enables you to control how objects are encoded into XML.
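Exactly: subclass the type and make the subclass serializable. (Note that BinaryFormatter also expects the base class to be serializable, unless the subclass implements ISerializable and handles the base state itself.)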
[Serializable] public class Foo: Bar {}
Write an adapter or be prepared to do something more extreme like disassembling the assembly, injecting the serializable attribute and reassembling.
I would like to know the most common scenarios where xml serialization may fail in .NET.
I'm thinking mainly of XmlSerializer here:
it is limited to tree-like data; it can't handle full object graphs
it is limited to public members, on public classes
it can't really do much with object members
it has some weaknesses around generics
like many serializers, it won't touch instance properties on a collection (bad practice in the first place)
xml simply isn't always a good choice for large data (not least, for performance)
requires a public parameterless constructor
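The constructor requirement is easy to trip over, so here is a small sketch of it. Both types are hypothetical, and the helper relies on my understanding that XmlSerializer rejects such a type with an InvalidOperationException when the serializer is created, not later when you call Serialize.

```csharp
using System;
using System.Xml.Serialization;

// Hypothetical type with no public parameterless constructor.
public class Temperature
{
    public double Celsius { get; set; }

    public Temperature(double celsius)
    {
        Celsius = celsius;
    }
}

// Same shape, but XmlSerializer can construct it during deserialization.
public class TemperatureReading
{
    public double Celsius { get; set; }

    public TemperatureReading() { }
}

public static class ConstructorDemo
{
    // Returns false when XmlSerializer refuses the type while reflecting over it.
    public static bool CanCreateSerializer(Type type)
    {
        try
        {
            new XmlSerializer(type);
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanCreateSerializer(typeof(Temperature)));
        Console.WriteLine(CanCreateSerializer(typeof(TemperatureReading)));
    }
}
```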
DataContractSerializer solves some of these, but has its own limitations:
it can't handle values in attributes
requires .NET 3.0 (so not much use in 2.0)
Cannot easily serialize generic collections.
See another question: C# XML Serialization Gotchas
Depending on the serializer, cyclic references may not work
Using the Shadows keyword has also broken serialization and deserialization for me, because shadowing creates a new implementation of the property, which makes it incompatible for proper reconstruction. Only use overloads if you want to retype to the specific for a subclass.
TimeSpan objects are not serializable. IDictionary-implementing types are not serializable either (although they can be serialized with some manual massaging).
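One way to do the "manual massaging" for dictionaries is to project them into a list of key/value entries before handing them to XmlSerializer. A minimal sketch, where the `Entry` type and the helper names are my own invention:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical helper type: one dictionary entry in a shape XmlSerializer accepts.
public class Entry
{
    public string Key { get; set; }
    public int Value { get; set; }
}

public static class DictionaryXml
{
    // Flatten the dictionary into a List<Entry> and serialize that instead.
    public static string Serialize(IDictionary<string, int> source)
    {
        var entries = source.Select(kv => new Entry { Key = kv.Key, Value = kv.Value }).ToList();
        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, entries);
            return writer.ToString();
        }
    }

    // Read the list back and rebuild the dictionary from it.
    public static Dictionary<string, int> Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var reader = new StringReader(xml))
        {
            var entries = (List<Entry>)serializer.Deserialize(reader);
            return entries.ToDictionary(e => e.Key, e => e.Value);
        }
    }

    public static void Main()
    {
        var stock = new Dictionary<string, int> { { "apples", 3 }, { "pears", 5 } };
        string xml = Serialize(stock);
        Console.WriteLine(xml);
        Console.WriteLine(Deserialize(xml)["pears"]);
    }
}
```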
AFAIK, classes marked as [Obsolete] are not serialized by XmlSerializer since .NET 2.0
In C#, if I want to serialize an instance with XmlSerializer, the object's type doesn't have to be marked with the [Serializable] attribute. However, other serialization approaches, such as DataContractSerializer, need the class to be marked as [Serializable] or [DataContract].
Is there any standard or pattern about serialization requirement?
This is because XmlSerializer only serializes public fields/properties. Other forms of serialization can serialize private data, which constitutes a potential security risk, so you have to "opt in" using an attribute.
Security isn't the only issue; simply, serialization only makes sense for certain classes. For example, it makes little sense to serialize a "connection". A connection string, sure, but the connection itself? Nah. Likewise, anything that requires an unmanaged pointer/handle is not going to serialize very well. Nor are delegates.
Additionally, XmlSerializer and DataContractSerializer (by default) are tree serializers, not graph serializers - so any recursive links (like Parent) will cause it to break.
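Here is a small sketch of the Parent problem. The `Node` type is hypothetical. As far as I know, XmlSerializer reports the cycle as an InvalidOperationException, while DataContractSerializer can cope if you opt in to reference tracking with `IsReference = true` on the contract.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml.Serialization;

// Hypothetical tree node with a back-link to its parent, which makes the object graph cyclic.
[DataContract(IsReference = true)]   // opt in to reference tracking for DataContractSerializer
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Parent { get; set; }
    [DataMember] public List<Node> Children { get; set; }

    public Node()
    {
        Children = new List<Node>();
    }
}

public static class CycleDemo
{
    public static Node BuildTree()
    {
        var root = new Node { Name = "root" };
        var child = new Node { Name = "child", Parent = root };
        root.Children.Add(child);
        return root;
    }

    // Returns true if XmlSerializer fails on the cyclic graph.
    public static bool XmlSerializerFails(Node root)
    {
        var serializer = new XmlSerializer(typeof(Node));
        try
        {
            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, root);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Round-trips the graph through DataContractSerializer.
    public static Node RoundTrip(Node root)
    {
        var serializer = new DataContractSerializer(typeof(Node));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0;
            return (Node)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Node root = BuildTree();
        Console.WriteLine(XmlSerializerFails(root));

        Node copy = RoundTrip(root);
        Console.WriteLine(ReferenceEquals(copy.Children[0].Parent, copy));
    }
}
```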
Marking the class with the serializer's preferred token is simply a way of saying "and it should make sense".
IIRC, both XmlSerializer and DataContractSerializer used to be very rigid about demanding things like [Serializable], [DataContract] or IXmlSerializable, but they have become a bit more liberal lately.
Right now there are really 3 forms of serialization in the .Net Framework.
XmlSerialization - By default works on public fields and properties. Can still be controlled via XmlElementAttribute, XmlAttributeAttribute, etc ...
BinarySerialization - Controlled by the SerializableAttribute. Deeply integrated into the CLR
WCF Serialization - DataContractAttribute, etc ...
There unfortunately is no standard overall pattern for serialization. All 3 frameworks have different requirements and quirks.