I want to deserialize an object graph in C#. The objects in the graph will have object and collection properties, and some of the properties may be private, but I do not need to worry about cyclic object references. My intent is to use the deserialized object graph as test data while an application is being built, so the objects need to be deserializable from XML before any serialization has taken place. I would like it to be as easy as possible to edit the XML freely to vary the objects that are constructed. I do not want the deserialization process to require nested loops or nested LINQ to XML statements for each tier in the object graph.
I found the DataContractSerializer lacking. It can indeed deserialize to private fields and properties with a private setter, but it appears to be incredibly brittle in how it processes the XML input. All it takes is for an element in the XML to be slightly out of order and it fails. What's more, the order in which it expects the data does not necessarily match the order in which the members are declared in the class, so it is impossible to determine what XML will work without first having the data in the objects so that you can serialize it and check what it expects.
The XmlSerializer does not appear to be able to deserialize into non-public members of any type.
Since the purpose is to generate test input data for what might be quite simple applications during development, I'd rather not have to resort to heavyweight ORM technologies like Entity Framework or NHibernate.
Is there a simple solution?
[Update]
@Chuck Savage
Thanks very much for your reply. I'm responding in this edit due to the comment character limit.
In the technique you suggested, the logic to deserialize each tier of the object hierarchy is maintained in each class, so in a sense you do have nested LINQ to XML, just spread out across the various classes involved. This technique also keeps, in each class, a reference to the XElement from which each object gets its values, so in that sense the objects aren't so much deserialized as wrapped around the XML. In the scenario I have in mind, I'd ideally like to deserialize the actual business objects the application will use, so an XML wrapper object like this wouldn't work very well: it would require a distinctly different implementation for test usage compared to production usage.
What I'm really after is something that can do something akin to what the XmlSerializer can do, but which can also deserialize private fields, (or at least properties with no setter). The reason being that the XmlSerializer does what it does with minimal impact on the 'normal' production use of the classes involved (and hence no impact on their implementation).
How about something like this: https://stackoverflow.com/a/10158569/353147
You will have to write your own boilerplate code to go back and forth to XML, but with the included extensions that can be kept to a minimum.
Here is another example: https://stackoverflow.com/a/9035905/353147
You can also search my answers on the topic with: user:353147 XElement in the StackOverflow search.
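To give an idea of what such hand-written boilerplate might look like, here is a minimal sketch using plain LINQ to XML. The `Order` and `OrderLine` classes and their `FromXml` methods are hypothetical names made up for illustration; they are not the extension methods from the linked answers. Unlike the wrapper approach discussed in the question's update, this version copies the values out and keeps no reference to the XElement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical business objects; note the private setters.
public class OrderLine
{
    public string Product { get; private set; }
    public int Quantity { get; private set; }

    // Each class builds itself from its own element, so no nested queries are needed.
    public static OrderLine FromXml(XElement e) => new OrderLine
    {
        Product = (string)e.Element("Product"),
        Quantity = (int?)e.Element("Quantity") ?? 0   // a missing element falls back to 0
    };
}

public class Order
{
    public string Customer { get; private set; }
    public List<OrderLine> Lines { get; private set; } = new List<OrderLine>();

    public static Order FromXml(XElement e) => new Order
    {
        Customer = (string)e.Element("Customer"),
        Lines = e.Elements("Line").Select(OrderLine.FromXml).ToList()
    };
}

public static class Program
{
    public static void Main()
    {
        // Elements are looked up by name, so their order in the XML does not matter.
        XElement xml = XElement.Parse(
            "<Order><Line><Quantity>2</Quantity><Product>Tea</Product></Line>" +
            "<Customer>Ann</Customer></Order>");

        Order order = Order.FromXml(xml);
        Console.WriteLine($"{order.Customer}: {order.Lines[0].Quantity} x {order.Lines[0].Product}");
    }
}
```

The trade-off is the one already raised in the question: the mapping logic lives in the business classes themselves, one `FromXml` per class.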
Related
I have a very large class (500+ properties and nested complex objects) and we are mapping to another class with the same properties i.e. it is a one-to-one mapping.
Please no comments about why we are doing this (a long story, but this is a legacy system that is in the process of being re-architected and this is a stepping stone to the next stage of refactoring out services) or why we are not using AutoMapper etc. Data mapping is hand coded in C#.
I could create a test object, map and compare the mapped object, however there are SO many properties to populate, this in itself is a major task which we hope to avoid.
Any thoughts on whether I could use reflection or serialize/deserialize or some test libraries or maybe use AutoMapper in some way to fill object, map and compare?
We need to ensure a) all properties are mapped and b) each property is mapped to the correct property (the properties on each object are named the same).
I suspect a manual code review is probably the only feasible solution but I'm reaching out...
UPDATE
OK, not sure why people have down-voted this. It is a valid question with some potentially complex technical solutions. Thanks to those of you who have responded with useful suggestions!
Any thoughts on whether I could use reflection or serialize/deserialize or some test libraries or maybe use AutoMapper in some way to fill object, map and compare?
You could just use a serializer and serialize one object and deserialize the other. Could be a three-to-five-liner if your objects are plain data classes that don't do exotic stuff.
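A rough sketch of that idea, assuming System.Text.Json is available (Json.NET or another serializer should work along the same lines). All type and method names here are made up for illustration, and `MapByHand` merely stands in for the real hand-coded mapper. The source is serialized and deserialized as the target type, which matches properties purely by name, and the result is compared against the hand-coded mapping.

```csharp
using System;
using System.Text.Json;

// Hypothetical source and target types with identically named properties.
public class SourceAddress { public string City { get; set; } }
public class SourceCustomer
{
    public string Name { get; set; }
    public int Age { get; set; }
    public SourceAddress Address { get; set; }
}

public class TargetAddress { public string City { get; set; } }
public class TargetCustomer
{
    public string Name { get; set; }
    public int Age { get; set; }
    public TargetAddress Address { get; set; }
}

public static class MappingCheck
{
    // Stand-in for the hand-coded mapper under test.
    public static TargetCustomer MapByHand(SourceCustomer s) => new TargetCustomer
    {
        Name = s.Name,
        Age = s.Age,
        Address = new TargetAddress { City = s.Address.City }
    };

    // Serialize one type, deserialize as the other: properties are matched by name only.
    public static TTarget MapViaJson<TSource, TTarget>(TSource source) =>
        JsonSerializer.Deserialize<TTarget>(JsonSerializer.Serialize(source));

    // Compare both results by their JSON text instead of writing an Equals per class.
    public static bool MappingsAgree(SourceCustomer source)
    {
        string expected = JsonSerializer.Serialize(MapViaJson<SourceCustomer, TargetCustomer>(source));
        string actual = JsonSerializer.Serialize(MapByHand(source));
        return expected == actual;
    }
}

public static class Program
{
    public static void Main()
    {
        var source = new SourceCustomer
        {
            Name = "Ann",
            Age = 30,
            Address = new SourceAddress { City = "Oslo" }
        };
        Console.WriteLine(MappingCheck.MappingsAgree(source) ? "mappings agree" : "mappings differ");
    }
}
```

This only catches a missing or crossed mapping when the source properties hold distinct, non-default values, so the source object still has to be populated somehow, for example by reflection.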
I have serialized a C# class using protobuf-net. The resultant byte array is stored in a database. This is for performance reasons and I probably won't be able to change this design. C# has no way to prevent a class from being modified, so over time the class passed in for deserialization may change in ways that no longer match the one used for serialization, causing retrieval to fail.
Other than the wrapper technique suggested here, is there a pattern or technique for handling this kind of problem?
The only other technique that comes to my mind is to version the classes that need to be deserialized, so that you don't lose anything when you need to make changes. When you serialize an instance of those classes, you also have to serialize the version of the class (it could be a field of the class itself).
I don't think this is the best solution, but it is a solution.
The versioning strategy could become very difficult to manage when the changes (and the versions) start to grow.
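To make the idea concrete, here is a minimal sketch that writes a version number in front of the payload and branches on it when reading. It deliberately uses BinaryWriter and BinaryReader as a stand-in for the protobuf-net payload so that it needs no external package; `VersionedStore` and its record layout (a name in version 1, a name plus an email in version 2) are assumptions made up for the example.

```csharp
using System;
using System.IO;

public static class VersionedStore
{
    public const int CurrentVersion = 2;

    // Writes the version first, then the version-2 payload (name + email).
    public static byte[] Save(string name, string email)
    {
        using (var ms = new MemoryStream())
        {
            using (var w = new BinaryWriter(ms))
            {
                w.Write(CurrentVersion);
                w.Write(name);
                w.Write(email);
            }
            return ms.ToArray();   // ToArray still works after the writer closed the stream
        }
    }

    // Reads the version and picks the matching layout, upgrading old data on the fly.
    public static (string Name, string Email) Load(byte[] data)
    {
        using (var r = new BinaryReader(new MemoryStream(data)))
        {
            int version = r.ReadInt32();
            switch (version)
            {
                case 1: return (r.ReadString(), "");               // v1 had no email
                case 2: return (r.ReadString(), r.ReadString());
                default: throw new NotSupportedException($"Unknown version {version}");
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var (name, email) = VersionedStore.Load(VersionedStore.Save("Ann", "ann@example.com"));
        Console.WriteLine($"{name} <{email}>");
    }
}
```

Every new version adds another case to `Load`, which is exactly where the management burden mentioned above comes from.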
The .NET Framework ships with System.Runtime.Serialization.Json.DataContractJsonSerializer and System.Web.Script.Serialization.JavaScriptSerializer, both of which de/serialize JSON. How do I know when to choose one of these types over the other? MSDN doesn't make it clear what their relative advantages are.
We have several projects that consume or emit JSON, and the class selected for each thus far has depended on the opinion of the primary dev on each project. Some are simple, two have complex logic regarding producing managed types from JSON (the types do not map closely to the streams) but don't have any emphasis on speed, one requires speed. None interact with WCF, at least as of now.
While I'm interested in alternative libraries, I am hoping that somebody might have an answer to my question too.
The DataContractJsonSerializer is intended for use with WCF client applications where the serialized types are typically POCO classes with the DataContract attribute applied to them. No DataContract, no serialization. The mapping mechanism of WCF makes sending and receiving very simple, but only if your platform is homogeneous. If you start mixing in different toolsets, interoperability can break down.
The JavaScriptSerializer can serialize any type, including anonymous types (one way), and does so in a more conformant way. You lose the "automagic" of WCF, but you gain more integration options.
As you can see by the comments, there are a lot of options out there for AJAX serialization, and to address your speed vs. maintainability questions, it might be worth investigating them to find a solution that meets the needs of all the teams, to reduce maintainability issues in the long term as everybody does things their own way.
2014-04-07 UPDATE:
I suggest using JSON.NET if you can. See the Feature Comparison at http://james.newtonking.com/json for a review of the 3 libraries considered in this question.
2015-05-26 UPDATE:
If your company requires the use of commercially licensable products, or you need every last bit of performance, you may also want to check out https://servicestack.net/.
Both do approximately the same thing, but they are built on very different infrastructure. As a result they place different restrictions on the classes you want to serialize or deserialize and offer different degrees of flexibility in tuning the process.
For DataContractJsonSerializer you must mark every class you want to serialize with the DataContract attribute and every member with the DataMember attribute. If some of your classes have enum members, the enums must also be marked with DataContract and each enum member with the EnumMember attribute.
Also DataContractJsonSerializer allows you fine control over the whole process of serialization/deserialization by altering types resolution logic and replacing the types you serialize with surrogates.
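As a small illustration of the attribute-based opt-in described above, here is a sketch of a round trip through DataContractJsonSerializer. The `Product` class and the `JsonDemo` helper are hypothetical names; only the attributes and the serializer's `WriteObject` and `ReadObject` calls come from the framework.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

[DataContract]
public class Product
{
    [DataMember(Name = "name")]
    public string Name { get; set; }

    [DataMember(Name = "price")]
    public decimal Price { get; set; }

    // No [DataMember]: left out of the JSON because the class opts in via [DataContract].
    public string InternalNote { get; set; }
}

public static class JsonDemo
{
    public static string ToJson<T>(T value)
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        using (var ms = new MemoryStream())
        {
            serializer.WriteObject(ms, value);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public static T FromJson<T>(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (T)serializer.ReadObject(ms);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var original = new Product { Name = "Tea", Price = 4.5m, InternalNote = "not serialized" };
        string json = JsonDemo.ToJson(original);
        Console.WriteLine(json);

        Product copy = JsonDemo.FromJson<Product>(json);
        Console.WriteLine($"{copy.Name} {copy.Price} note: {copy.InternalNote ?? "(null)"}");
    }
}
```

The `Name` argument on DataMember is one of the simpler tuning knobs: it decouples the JSON property names from the C# member names.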
For JavaScriptSerializer you must provide a parameterless constructor if you plan on deserializing objects from a JSON string.
For me, I usually use JavaScriptSerializer in presentation logic, where there's a simple model I want to render as JSON together with the page, without additional AJAX requests. I usually don't even have to deserialize them back to C#, so there's no overhead at all. But for persistence logic, where I want to save objects into a data store (usually NoSQL storage) and load them later, I prefer DataContractJsonSerializer, because the overhead of adding attributes is worth the flexibility in tuning the serialization/deserialization process, especially when loading serialized data into objects of a newer version with updated definitions.
Personally, I think that DataContractJsonSerializer reeks of over-engineering. I'd skip it and go with JavaScriptSerializer. In the event where JavaScriptSerializer isn't available, you can use FridayThe13th (a library I wrote ;p).
I ran into a fellow programmer and was discussing a method I needed to write. From an OOP perspective, a Dictionary<T,U> is perfect, but I voiced concerns about the size and structure of the XML it is translated to during serialization. My buddy said, very directly, that I should be using a wrapper object that contains the key and value, and return a list of those instead of a dictionary. Are there some .NET objects that just shouldn't be serialized over SOAP, for which simpler, custom objects should be created instead?
The main things you need to worry about are:
Don't send unnecessary information.
Don't make too many service calls.
Try to balance size of data against number of calls (optimally reduce both of these to a minimum).
As a rule most people avoid passing data structures which contain complicated logic, such as Dictionary.
Serializing a List is fine (it will be serialized as an IEnumerable).
Don't feel that your data objects have to look like your Entity objects - think of packets of information rather than Entities. When you receive the data at the client end you should convert it into Entity objects.
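The wrapper-object idea from the question could look roughly like the sketch below, assuming an XmlSerializer-based SOAP stack (XmlSerializer does not accept types that implement IDictionary). `Entry` and `DictionaryTransfer` are hypothetical names: the dictionary is flattened into a list of key/value packets for the wire and rebuilt on the receiving side.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Simple packet type: one key/value pair, written as two attributes to keep the XML small.
public class Entry
{
    [XmlAttribute("key")]
    public string Key { get; set; }

    [XmlAttribute("value")]
    public int Value { get; set; }
}

public static class DictionaryTransfer
{
    // Sender side: flatten the dictionary into a serializable list.
    public static List<Entry> ToEntries(Dictionary<string, int> source) =>
        source.Select(kv => new Entry { Key = kv.Key, Value = kv.Value }).ToList();

    // Receiver side: rebuild the dictionary from the packets.
    public static Dictionary<string, int> ToDictionary(IEnumerable<Entry> entries) =>
        entries.ToDictionary(e => e.Key, e => e.Value);

    public static string ToXml(List<Entry> entries)
    {
        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, entries);
            return writer.ToString();
        }
    }

    public static List<Entry> FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var reader = new StringReader(xml))
        {
            return (List<Entry>)serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var stock = new Dictionary<string, int> { ["tea"] = 3, ["coffee"] = 7 };
        string xml = DictionaryTransfer.ToXml(DictionaryTransfer.ToEntries(stock));
        Console.WriteLine(xml);

        Dictionary<string, int> rebuilt = DictionaryTransfer.ToDictionary(DictionaryTransfer.FromXml(xml));
        Console.WriteLine(rebuilt["coffee"]);
    }
}
```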
For some time now I have been trying to solve the following problem and I'm starting to run out of ideas:
I have generated a set of C# classes from an XSD schema using the xsd.exe tool, and deserializing XML files works fine. The problem is that, apart from the convenience and safety of using the auto-generated classes, I also need information about the XML hierarchy, i.e. I need to establish parent-child relationships between the objects created during deserialization. Note that I want to avoid keeping a separate XML hierarchy structure (like a DOM tree), and would rather make the generated objects keep track of their parents and children.
I have managed to pull this off in Java using JAXB by:
Defining a common base class for all deserialized objects. This base class contains a list of children and a reference to a parent object (if any).
Using the Unmarshaller.Listener functionality that provides a callback on completed object deserialization. This callback provides a reference to the parent of the recently deserialized object, which makes establishing parent-child relationships trivial.
How would I go about doing this in C#? I have had a look at the MSDN docs and done quite a lot of googling, but haven't been able to find any useful information.
I wrote an article some time ago about this exact problem, perhaps it can help you.
http://www.thomaslevesque.com/2009/06/12/c-parentchild-relationship-and-xml-serialization/
XmlSerializer should maintain simple object hierarchies for serialization and deserialization. Complex things such as arrays or lists containing more than one type of object are a bit trickier, but possible.
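One simple way to get parent references, sketched below, is to deserialize normally and then walk the tree once to link each child to its parent. This is a plain post-processing pass, not the collection-based technique from the linked article and not an equivalent of JAXB's Unmarshaller.Listener. The `Node` class, `LinkParents` and `TreeLoader` are hypothetical; since xsd.exe emits partial classes, a method like `LinkParents` could presumably live in a separate file next to the generated code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Node
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    // Child <Node> elements are read directly into this list.
    [XmlElement("Node")]
    public List<Node> Children { get; set; } = new List<Node>();

    // Not part of the XML; ignoring it also avoids a circular reference when serializing.
    [XmlIgnore]
    public Node Parent { get; set; }

    // Recursively points every descendant back at its parent.
    public void LinkParents()
    {
        foreach (Node child in Children)
        {
            child.Parent = this;
            child.LinkParents();
        }
    }
}

public static class TreeLoader
{
    public static Node Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Node));
        using (var reader = new StringReader(xml))
        {
            var root = (Node)serializer.Deserialize(reader);
            root.LinkParents();   // one pass after deserialization completes
            return root;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Node root = TreeLoader.Load(
            "<Node name='root'><Node name='a'><Node name='a1'/></Node><Node name='b'/></Node>");
        Node leaf = root.Children[0].Children[0];
        Console.WriteLine($"{leaf.Name} -> {leaf.Parent.Name} -> {leaf.Parent.Parent.Name}");
    }
}
```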