Data Member Order and XML Deserialization - c#

I have a RESTful WCF application that makes use of custom classes as service method parameters. These classes are decorated with the [DataContract] attribute and each of their properties is decorated with the [DataMember] attribute.
The deserializer works consistent with the following "Data Member Order" page at MSDN:
http://msdn.microsoft.com/en-us/library/ms729813.aspx.
That is, it expects the elements in XML formatted input data to follow the order so described. In fact, if one of the elements is out of order, after deserialization it does not have the submitted value but rather is null.
Is there a good way to allow the calling program to order the xml elements freely (i.e., in any order) and to have the deserialization come out right for every ordering of the elements?

Most XML does not permit elements to be entered in arbitrary order. There's no good reason to permit this, as far as I know.
The Data Contract Serializer does not support this at all. It would add overhead, and provide no value.
Why can't your callers just send the correct XML?

Related

How can I extract values as strings from an xml file based on the element/property name in a generated .Net class or the original XSD?

I have a large complex XSD set.
I have C# classes generated from those XSDs using xsd.exe. Naturally, though the majority of properties in the generated classes are strings, many are decimals, DateTimes, enums or bools, just as they should be.
Now, I have some UNVALIDATED data that is structured in the correct XML format, but may well NOT be able to pass XSD validation, let alone be put into an instance of the relevant .Net object. For example, at this stage, for all we know the value for the element that should be a DateTime could be "ABC" - not even parseable as a DateTime - let alone other string elements respecting maxLength or regex pattern restrictions. This data is ready to be passed in to a rules engine that we already have to make everything valid, including defaulting things appropriately depending on other data items, etc.
I know how to use the various types in System.Xml to read the string value of a given element by name. Clearly I could just hand craft code to get out all the elements that exist today by name - but if the XSD changes, the code would need to be reworked. I'd like to be able to either directly read the XSD or use reflection on the generated classes (including attributes like [System.Xml.Serialization.XmlTypeAttribute(TypeName=...] where necessary) to find exactly how to recursively query the XML down to the the raw, unverified string version of any given element to pass through to the ruleset, and then after the rules have made something valid of it, either put it back into the strongly typed object or back into a copy of the XML for serialization into the object.
(It has occurred to me that an alternative approach would be to somehow automatically generate a 'stringly typed' version of the object - where there are not DateTimes etc; nothing but strings - and serialize the xml into that. I have even madly thought of taking the xsd.exe generated .cs file and search/replacing all the enums and base types that aren't strings to strings, but there has to be a better way.)
In other words, is there an existing generic way to pull the XElement or attribute value from some XML that would correspond to a given item in a .Net class if it were serialized, without actually serializing it?
Sorry to self-answer, and sorry for the lack of actual code in my answer, but I don't yet have the permission of my employer to share the actual code on this. Working on it, I'll update here when there is movement.
I was able to implement something I called a Tolerant XML Reader. Unlike most XML deserializing, it starts by using reflection to look at the structure of the required .Net type, and then attempts to find the relevant XElements and interpret them. Any extra elements are ignored (because they are never looked for), any elements not found are defaulted, and any elements found are further interpreted.
The main method signature, in C#, is as follows:
public static T TolerantDeserializeIntoType<T>(
XDocument doc,
out List<string> messagesList,
out bool isFromSuppliedData,
XmlSchemaSet schemas = null,
bool tolerant = true)
A typical call to it might look like this:
List<string> messagesList;
bool defaultOnly;
SomeType result = TolerantDeserializeIntoType<SomeType>(someXDocument, out messagesList, out defaultOnly);
(you may use var; I just explicitly put the type there for clarity in this example).
This will take any XDocument (so the only criteria of the original was that it was well-formed), and make an instance of the specified type (SomeType, in this example) from it.
Note that even if nothing at all in the XML is recognized, it will still not fail. The new instance will simply have all properties / public fields nulled or defaulted, the MessageList would list all the defaulting done, and the boolean out paramater would be FALSE.
The recursive method that does all the work has a similar signature, except it takes an XElement instead of an XDocument, and it does not take a schemaSet. (The present implementation also has an explicit bool to indicate a recursive call defaulting to false. This is a slightly dirty way to allow it to gather all failure messages up to the end before throwing an error if tolerant is false; in a future version I will refactor that to only expose publicly a version without that, if I even want to make the XElement version public at all):
public static T TolerantDeserializeXElementIntoType<T>(
ref XElement element,
ref List<string> messagesList,
out bool isFromSuppliedValue,
bool tolerant = true,
bool recursiveCall = false)
How it works, detail
Starting with the main call, the one with with an XDocument and optional SchemaSet:
If a schema Set that will compile is supplied (actually, it also looks for xsi:noNamespaceSchemaLocation as well) the initial XDocument and schemaSet call runs a standard XDocument.Validate() across the supplied XDocument, but this only collects any issued validation error callbacks. It won't throw an exception, and is done for only two reasons:
it will give some useful messages for the MessageList, and
it will populate the SchemaInfo of all XElements to
possibly use later in the XElement version.
(note, however, that the
schema is entirely optional. It is actually only used to resolve
some ambiguous situations where it can be unclear from the C#
object if a given XElement is mandatory or not.)
From there, the recursive XElement version is called on the root node and the supplied C# type.
I've made the code look for the style of C# objects generated by xsd.exe, though most basic structured objects using Properties and Fields would probably work even without the CustomAttributes that xsd.exe supplies, if the Xml elements are named the same as the properties and fields of the object.
The method looks for:
Arrays
Simple value types, explicitly:
String
Enum
Bool
then anything
else by using the relevant TryParse() method, found by reflection.
(note that nulls/xsi:nill='true' values also have to be specially
handled)
objects, recursively.
It also looks for a boolean 'xxxSpecified' in the object for each field or property 'xxx' that it finds, and sets it appropriately. This is how xsd.exe indicates an element being omitted from the XML in cases where null won't suffice.
That's the outline. As I said, I may be able to put actual code somewhere like GitHub in due course. Hope this is helpful to someone.

DataContract deserialization fails due to incorrect ordering of XML nodes

I am baffled with the behavior of the DataContractSerializer. Our configuration is XML based. XML is used as source for DataContractSerializer.ReadObject method. Recently I have encountered a problem when some properties of deserialized object were not set. I have tracked the changes and discovered that those properties were added into XML manually. Which is OK in my opinion. Apparently it was not OK in DataContractSerializer's opinion because it appears it expects XML nodes to be ordered alphabetically. Really?! Deserialization seems like really straightforward thing - read XML sequentially, parse node name, set corresponding property. What's the purpose of ordering?
Is there a workaround? Maybe some sort of settings for DataContractSerializer?
I encountered this problem recently. To work around it, I used the XmlSerializer and removed the explicit ordering from the XmlElement attributes:
set proxy_tool="C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\SvcUtil.exe" /nologo /t:code /ser:XmlSerializer /UseSerializerForFaults
set sed_tool="$(ProjectDir)sed.exe" -r -i "s/,?[[:space:]]*Order=[[:digit:]]+//"
%proxy_tool% /o:"Proxy1.cs" /n:*,Namespaces.Name1 "Proxy1.wsdl"
%sed_tool% "Proxy1.cs"
%proxy_tool% /o:"Proxy2.cs" /n:*,Namespaces.Name2 "Proxy2.wsdl"
%sed_tool% "Proxy2.cs"
...
There's some more information on my blog post.
If you want to know why the order matters, it's because a sequence in XSD has a defined order, and the web service contracts are defined with XSD.
From the specification:
The consequence of this definition is that any element appearing in an instance whose type is declared to be USAddress (e.g. shipTo in po.xml) must consist of five elements and one attribute. These elements must be called name, street, city, state and zip as specified by the values of the declarations' name attributes, and the elements must appear in the same sequence (order) in which they are declared.
You can use the Order member of DataMemberAttribute to help with this, but in most cases: XML is order-specific (for elements, not attributes) - so it isn't specifically wrong.
That said: if you want fine control over XML serialization, DataContractSerializer is a poor choice XmlSerializer offers more control - and is less fussy re order iirc.

How to set default values to the properties of dynamically loaded types at runtime for XML serialization

I need to serialize dynamically loaded types' classes using XMLSerializer.
When using XML serializer non initialized values are not being serialized. I dont have control over the assemblies I am working with so can not use XML attributes for specifying default values on properties. So I think I need to set all properties and sub properties to their default values recursively and then serialize. ( Please let me know if there is any better way )
Followed this :
Activator.CreateInstance(propType);
but above line complains about not having a parameterless constructor for some types.
Tried this :
subObject = FormatterServices.GetUninitializedObject(propType);
but this one gives an error "value was invalid" with no inner exception.
Please let me know if you need any further information.
If the types in question don't have public parameterless constructors, you'll struggle. You can get around the attributes issue by using the constructor overload that accepts a XmlAttributeOverrides object, which you can use to fully configure the serializer including the default value (via XmlAttributes.XmlDefaultValue), but some things you can't do - and get around the constructor limitation is one of them.
What is the scenario here?
if you want xml, then I would introduce a DTO layer: some objects that look like the ones you're talking about, but are simple and under your control. Ideal for XmlSerializer. You then write code to map between the two
if you just want serialization (and xml is an implementation detail) then there are other serializers that may help. DataContractSerializer or protobuf-net, for example; either would be more versatile here.

Is there an XML specific object (like XElement) that is binary serializable?

I have a use case where I am serializing objects over the wire via MSMQ (mostly strings). When I read the object off the queue I want to be able to tell if the user meant for the object to be a XML or a string. I was thinking a good way to do this would just be to check the type. If it's XmlElement than it becomes XML data otherwise it becomes string or CDATA. The reason I don't want to just check to see if the data is valid XML is that sometimes data will be provided that is supposed to be serialized as a string but is in fact valid XML. I want the caller to be able to control the de-serialization into string or XML.
Are there any types that are marked as serializable in the .NET Framework like XElement or XmlElement (both which are not marked serializable)?
Why don't you just add a property to the class of the serialized object that tells you what it is? I'd propose IsXml.

How to design to prompt users for new values for properties of deserialized objects?

Right now, I'm currently serializing a class like this:
class Session
{
String setting1;
String setting2;
...etc... (other member variables)
List<SessionAction> actionsPerformed;
}
Where SessionAction is an interface that just has one method. All implementations of the SessionAction interface have various properties describing what that specific SessionAction does.
Currently, I serialize this to a file which can be loaded again using the default .Net binary serializer. Now, I want to serialize this to a template. This template will just be the List of SessionActions serialized to a file, but upon loading it back into memory at another time, I want some properties of these SessionActions to require input from the user (which I plan to dynamically generate GUI controls on the fly depending on the property type). Right now, I'm stuck on determining the best way to do this.
Is there some way I could flag some properties so that upon using reflection, I could determine which properties need input from user? Or what are my other options? Feel free to leave comments if anything isn't clear.
For info, I don't recommend using BinaryFormatter for anything that you are storing long-term; it is very brittle between versions. It is fine for short-lived messages where you know the same version will be used for serialization and deserialization.
I would recommend any of: XmlSerializer, DataContractSerializer (3.0), or for fast binary, protobuf-net; all of these are contract-based, so much more version tolerant.
Re the question; you could use things like Nullable<T> for value-types, and null for strings etc - and ask for input for those that are null? There are other routes involving things like the ShouldSerialize* pattern, but this might upset the serialization APIs.
If you know from start what properties will have that SessionAction, you must implement IDeserializationCallback and put to those props the attribute [NonSerialized]. When you implement the OnDeserialization method you get the new values from the user.

Categories