What XML serialization method should I use for a public API?

I'm writing a program that builds up a tree structure made up of classes that inherit from an abstract Node class. There are a number of different types of nodes built into my program. However, I also want to allow more advanced users to reference my library and write their own derivations of Node. These plug-in libraries are then loaded when my app starts up through Assembly.Load(). Thus the full set of Node types used by my application will not be known until run time.
In addition, I want to be able to serialize and deserialize these trees to and from XML files. I have some experience with XmlSerializer, DataContractSerializer, and implementing IXmlSerializable. Typically, I go with DataContractSerializer, as it usually requires less code than implementing IXmlSerializable and can serialize private fields where XmlSerializer cannot.
Yet with this project I also have to consider that other users will be creating classes that derive from my class, and will also have to add whatever code or attributes are required to serialize them as well.
Considering this, are there reasons I should go with one serialization mechanism over another?

If the serialization and deserialization will only occur within your application, and if there is no requirement that anyone else be able to read the serialized data, then the serialization format doesn't impact the API: as far as a user of the API is concerned, you will serialize into an opaque file and deserialize from the same.
In this case, use DataContractSerializer, as it can also write a binary XML encoding if necessary (for example through XmlDictionaryWriter.CreateBinaryWriter).
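Since the plug-in node types are only known at run time, one way to make this work is to collect them by reflection and hand them to the serializer as known types. The following is a minimal sketch under that assumption, not a definitive design: `TextNode`, `NodeStore`, and `FindNodeTypes` are hypothetical names, and `Node` here only stands in for the asker's base class. Plug-in authors would need to mark their classes with [DataContract]/[DataMember].

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization;

// Stand-in for the abstract base class exposed by the host library.
[DataContract]
public abstract class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public List<Node> Children { get; private set; } = new List<Node>();
}

// Hypothetical node type, as a plug-in author might write it.
[DataContract]
public class TextNode : Node
{
    [DataMember] private string text;   // private fields can be data members

    public TextNode(string text) { this.text = text; }
    public string Text => text;
}

public static class NodeStore
{
    // Collect every concrete Node subclass from the given (plug-in) assemblies.
    public static Type[] FindNodeTypes(IEnumerable<Assembly> assemblies) =>
        assemblies.SelectMany(a => a.GetTypes())
                  .Where(t => !t.IsAbstract && typeof(Node).IsAssignableFrom(t))
                  .ToArray();

    public static void Save(Node root, Stream output, Type[] knownTypes) =>
        new DataContractSerializer(typeof(Node), knownTypes).WriteObject(output, root);

    public static Node Load(Stream input, Type[] knownTypes) =>
        (Node)new DataContractSerializer(typeof(Node), knownTypes).ReadObject(input);
}
```

The same `knownTypes` array has to be supplied when loading, so the host would typically build it once after the plug-in assemblies have been loaded.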

Related

Deserialize an object graph with private members in C#

I want to deserialize an object graph in C#. The objects in the graph will have object and collection properties, and some of the properties may be private, but I do not need to worry about cyclic object references. My intent is to use the deserialized object graph as test data while an application is being built; for this reason the objects need to be deserializable from XML before any serialization has taken place. I would like it to be as easy as possible to freely edit the XML to vary the objects that are constructed. I want the deserialization process not to require nested loops or nested LINQ to XML statements for each tier in the object graph.
I found the DataContractSerializer lacking. It can indeed deserialize to private fields and properties with a private setter, but it appears to be incredibly brittle with regard to the processing of the XML input. All it takes is for an element in the XML to be not in quite the right order and it fails. What's more, the order it expects the data to be declared in does not necessarily match the order the object members are declared in the class declaration, making it impossible to determine what XML will work without having the data in the objects to start with so that you can serialize it and check what it expects.
The XmlSerializer does not appear to be able to deserialize into non-public members of any kind.
Since the purpose is to generate test input data for what might be quite simple applications during development, I'd rather not have to resort to heavyweight ORM technologies like Entity Framework or NHibernate.
Is there a simple solution?
[Update]
@Chuck Savage
Thanks very much for your reply. I'm responding in this edit due to the comment character limit.
In the technique you suggested, the logic to deserialize each tier of the object hierarchy is maintained in each class, so in a sense you do have nested LINQ to XML, just spread out across the various classes involved. This technique also keeps a reference in each class to the XElement from which the object gets its values, so in that sense the object isn't so much deserialized as wrapped around the XML. In the scenario I have in mind I'd ideally like to deserialize the actual business objects the application will use, so an XML wrapper object like this wouldn't work very well, since it would require a distinctly different implementation for test usage compared to production usage.
What I'm really after is something akin to what the XmlSerializer can do, but which can also deserialize private fields (or at least properties with no setter). The reason is that the XmlSerializer does what it does with minimal impact on the 'normal' production use of the classes involved (and hence no impact on their implementation).

How about something like this: https://stackoverflow.com/a/10158569/353147
You will have to create your own boilerplate code to go back and forth to XML, but with the included extensions that can be minimized.
Here is another example: https://stackoverflow.com/a/9035905/353147
You can also search my answers on the topic with: user:353147 XElement in the StackOverflow search.
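To give an idea of what such hand-written boilerplate can look like, here is a minimal sketch using plain LINQ to XML rather than the extensions from the linked answers. `Order` and `OrderLine` are hypothetical business objects invented for illustration. Unlike the wrapper approach, the objects copy their values out of the XElement and keep no reference to it, and element order in the XML does not matter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical business objects with private setters.
public class Order
{
    public int Id { get; private set; }
    public IReadOnlyList<OrderLine> Lines { get; private set; }

    // Each class reads only its own tier and delegates to its children.
    public static Order FromXml(XElement e) => new Order
    {
        Id = (int)e.Attribute("id"),
        Lines = e.Elements("line").Select(OrderLine.FromXml).ToList()
    };
}

public class OrderLine
{
    public string Product { get; private set; }
    public int Quantity { get; private set; }

    public static OrderLine FromXml(XElement e) => new OrderLine
    {
        Product = (string)e.Element("product"),
        // A missing <quantity> element falls back to 1 instead of failing.
        Quantity = (int?)e.Element("quantity") ?? 1
    };
}
```

A test fixture could then call `Order.FromXml(XElement.Parse(text))` on hand-edited XML. The cost is one small factory method per class, which is the boilerplate the answer mentions.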

Why is Serializable Attribute required for an object to be serialized

Based on my understanding, SerializableAttribute provides no compile time checks, as it's all done at runtime. If that's the case, then why is it required for classes to be marked as serializable?
Couldn't the serializer just try to serialize an object and then fail? Isn't that what it does right now? When something is marked, it tries and fails. Wouldn't it be better if you had to mark things as unserializable rather than serializable? That way you wouldn't have the problem of libraries not marking things as serializable.

As I understand it, the idea behind the SerializableAttribute is to create an opt-in system for binary serialization.
Keep in mind that, unlike XML serialization, which uses public properties, binary serialization grabs all the private fields by default.
Not only could this include operating system structures and private data that is not supposed to be exposed, but deserializing it could result in corrupt state that can crash an application (silly example: a handle for a file that is open on a different computer).

This is only a requirement for BinaryFormatter (and the SOAP equivalent, but nobody uses that). Diego is right; there are good reasons for this in terms of what it does, but it is far from the only option - indeed, personally I only recommend BinaryFormatter for talking between AppDomains - it is not (IMO) a good way to persist data (to disk, in cache, to a database BLOB, etc).
If this behaviour causes you trouble, consider using any of the alternatives:
XmlSerializer, which works on public members (not just the fields), but demands a public parameterless constructor and public type
DataContractSerializer, which can work fully opt-in (using [DataContract]/[DataMember]), but which can also (in 3.5 and above) work against the fields instead
Also - for a 3rd-party option (me being the 3rd party); protobuf-net may have options here; "v2" (not fully released yet, but available as source) allows the model (which members to serialize, etc) to be described independently of the type, so that it can be applied to types that you don't control. And unlike BinaryFormatter the output is version-tolerant, known public format, etc.
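As an illustration of the opt-in style with DataContractSerializer, here is a small sketch. `LogFile` and `OptInDemo` are hypothetical names; the handle field echoes the "file handle from another computer" example and is deliberately left unmarked so that it is never written.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class LogFile
{
    [DataMember] private string filePath;  // opted in, although private
    private IntPtr handle;                 // not marked, so never serialized

    public LogFile(string filePath, IntPtr handle)
    {
        this.filePath = filePath;
        this.handle = handle;
    }

    public string FilePath => filePath;
    public IntPtr Handle => handle;
}

public static class OptInDemo
{
    // Serializes to an in-memory buffer and reads the object back.
    public static LogFile RoundTrip(LogFile original)
    {
        var serializer = new DataContractSerializer(typeof(LogFile));
        using (var buffer = new MemoryStream())
        {
            serializer.WriteObject(buffer, original);
            buffer.Position = 0;
            return (LogFile)serializer.ReadObject(buffer);
        }
    }
}
```

After the round trip the path survives while the handle comes back as `IntPtr.Zero`, because the serializer neither wrote the field nor ran the constructor when recreating the object.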

a question about serialization

What is the better approach to serializing a custom class: using XmlSerializer, or using BinaryFormatter with the [Serializable] attribute on the class?

It's not possible to answer this, without knowing how you will use the resulting file, and the lifetime of it.
The decision is based on the fact that it is harder to "upgrade" the binary format. If your object model changes, it won't deserialise correctly. But if you've implemented a custom XML serialisation/deserialisation, then you can handle the "new" cases appropriately, and life will be good.
So decide more about how you will use it, who you are sharing information with, and what the possible changes to the model are.
FWIW, I sometimes use both types of serialisation in a given project.

That really depends on how you use the serialized class. If you want to pass it to other programs or want to debug it easily, use XML (but mind that XmlSerializer can produce output that is not well-formed, such as multiple root elements, for instance when several objects are serialized to the same stream).
In all other cases, you can use the binary formatter. But note that XML is more suitable if you change the class later: you can use [XmlIgnore] and the like to keep the XML format intact.
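A short sketch of the [XmlIgnore] point, with a hypothetical `Settings` class and helper: a property added in a later version is excluded from the XML, so files written by the old and new versions have the same shape.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Settings
{
    public string Title { get; set; }

    // Added in a later version; excluded so existing files keep their shape.
    [XmlIgnore]
    public DateTime LastOpened { get; set; }
}

public static class SettingsXml
{
    public static string Write(Settings settings)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, settings);
            return writer.ToString();
        }
    }

    public static Settings Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var reader = new StringReader(xml))
        {
            return (Settings)serializer.Deserialize(reader);
        }
    }
}
```

By default XmlSerializer also skips elements it does not recognise when reading, which helps in the other direction, when an older build opens a file written by a newer one.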

The decision will sometimes also be made for you based on what the serialized output will be used for. While you could expose a web service that takes a byte array containing a binary-serialized item, you couldn't easily use that web service from anything but .NET (and the end client would probably need a reference to the type).
Using XML means that the service could be exposed to any end client, regardless of the platform or environment on the end client.

Maintaining xml hierarchy (ie parent-child) information in objects generated by XmlSerializer

For some time now I have been trying to solve the following problem, and I'm starting to run out of ideas:
I have generated a set of C# classes from an XSD schema using the xsd.exe tool, and deserializing XML files works fine. The problem is that, apart from the convenience and safety of using the auto-generated classes, I also need information about the XML hierarchy, i.e. I need to establish parent-child relationships between the objects created during deserialization. Note that I want to avoid keeping a separate XML hierarchy structure (like a DOM tree), but rather make the generated objects keep track of their parents and children.
I have managed to pull this off in Java using JAXB by:
Defining a common base class for all deserialized objects. This base class contains a list of children and a reference to a parent object (if any).
Using the Unmarshaller.Listener functionality that provides a callback on completed object deserialization. This callback provides a reference to the parent of the recently deserialized object, which makes establishing parent-child relationships trivial.
How would I go about doing this in C#? I have had a look at the MSDN docs and done quite a lot of googling, but haven't been able to find any useful information.

I wrote an article some time ago about this exact problem, perhaps it can help you.
http://www.thomaslevesque.com/2009/06/12/c-parentchild-relationship-and-xml-serialization/

XmlSerializer should maintain simple object hierarchies for serialization and deserialization. Complex things such as arrays or lists containing more than one type of object are a bit trickier, but possible.
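One simple way to get parent references, sketched below, is to let XmlSerializer build the tree and then walk it once to fill in an ignored `Parent` property. This is not the technique from the linked article and not an equivalent of JAXB's Unmarshaller.Listener; `TreeItem`, `AttachChildren`, and `TreeXml` are hypothetical names, and with xsd.exe output the extra members would have to live in a partial class or common base class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class TreeItem
{
    [XmlAttribute] public string Name { get; set; }

    [XmlElement("Item")]
    public List<TreeItem> Children { get; set; } = new List<TreeItem>();

    // Not part of the XML; filled in after deserialization.
    [XmlIgnore] public TreeItem Parent { get; private set; }

    // Recursively points every descendant back at its parent.
    public void AttachChildren()
    {
        foreach (TreeItem child in Children)
        {
            child.Parent = this;
            child.AttachChildren();
        }
    }
}

public static class TreeXml
{
    public static TreeItem Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(TreeItem));
        using (var reader = new StringReader(xml))
        {
            var root = (TreeItem)serializer.Deserialize(reader);
            root.AttachChildren();   // single fix-up pass over the finished tree
            return root;
        }
    }
}
```

The fix-up pass costs one extra traversal, but it keeps the hierarchy inside the objects themselves instead of in a separate DOM-like structure.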

Serialize in memory object with C#

I've got a program that picks up some code from script files and compiles it, and it works fine.
The problem is: in the scripts I declare a couple of classes and I want to serialize them.
Obviously the C# serializers (XML and binary) don't like to serialize and then deserialize objects defined in an in-memory assembly.
I would prefer not to leave the in-memory assembly, so I'm looking for another way of serializing. Failing that, is it possible to build the assembly in memory and eventually write it to a file?

You could always write your own ToXml function using reflection to write out your property data to a string. Then your object would deserialize itself.
Just a thought.
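A rough sketch of that idea, assuming objects whose public properties hold simple values (strings, numbers and the like) and whose type names are valid XML names. `ReflectionXml.ToXml` is a hypothetical helper, not an existing API.

```csharp
using System;
using System.Reflection;
using System.Xml.Linq;

public static class ReflectionXml
{
    // Writes each readable public instance property as a child element.
    public static XElement ToXml(object source)
    {
        Type type = source.GetType();
        var element = new XElement(type.Name);

        foreach (PropertyInfo property in
                 type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (!property.CanRead || property.GetIndexParameters().Length > 0)
                continue;

            object value = property.GetValue(source);
            if (value != null)
                element.Add(new XElement(property.Name, value));
        }

        return element;
    }
}
```

Because it only relies on reflection over the runtime type, this does not care whether the type comes from an on-disk or an in-memory assembly. Nested objects and collections would need extra handling that is left out here.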

If you want to create assemblies dynamically look into IL emitting via reflection. Here is a good article to get you started.

So just to clarify, are you asking how you can serialize a type if it hasn't got the [Serializable] attribute applied?
One solution is to use the WCF Data Contract Serializer: http://msdn.microsoft.com/en-us/library/ms731923.aspx.
Obviously this will only work if you can target .NET 3.0 or higher.
Alternately you can implement an ISerializationSurrogate. Jeffrey Richter has a great introduction at http://msdn.microsoft.com/en-us/magazine/cc188950.aspx.

I would avoid all built-in serialization whenever possible; both kinds are badly broken. For example, XML serialization doesn't support dictionaries, and the SOAP formatter doesn't support generics. And both have versioning issues.
It is time-consuming, but creating ToXML and FromXML methods is probably the most effective way to go.

Have a look here for custom serializers; it is a sample of serializing a dictionary to XML.

I'm slightly confused by the statement that the XmlSerializer can't serialize dynamically generated types. The XmlSerializer generates its own serialization code dynamically during construction as well, so there should be no issue with it serializing your type.
You may need to decorate your dynamic classes with the appropriate attributes, depending on what you are generating (like derived classes), but there shouldn't be any issue with using the XmlSerializer in the situation you described.
If you could post details about the issues the XmlSerializer is giving you I can help you work out what the problem is.
Also, I'm of the belief that auto-generating code is in general a blessing. All too often have I had to go back into a class to fix one or all of the copy/paste/save/load functions, just because someone forgot to update them when adding a new variable. Save/load code is boilerplate code. Let the computers write it.
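Regarding derived classes: when they cannot be declared with attributes on the base class (for example because they are generated at run time), they can be passed to the XmlSerializer constructor as extra types instead. A minimal sketch, with hypothetical `Shape`, `Circle`, and `ShapeXml` names:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public abstract class Shape
{
    public string Label { get; set; }
}

// Stands in for a type that is only discovered at run time.
public class Circle : Shape
{
    public double Radius { get; set; }
}

public static class ShapeXml
{
    public static string Write(Shape shape, Type[] derivedTypes)
    {
        // The extra types tell the serializer which subclasses may appear.
        var serializer = new XmlSerializer(typeof(Shape), derivedTypes);
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, shape);
            return writer.ToString();
        }
    }

    public static Shape Read(string xml, Type[] derivedTypes)
    {
        var serializer = new XmlSerializer(typeof(Shape), derivedTypes);
        using (var reader = new StringReader(xml))
        {
            return (Shape)serializer.Deserialize(reader);
        }
    }
}
```

The concrete type is recorded in the output through an xsi:type attribute. One caveat: serializers created with this constructor overload are not cached by the framework, so in real code it is usually worth creating the instance once and reusing it.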
