Can XmlSerializer.Deserialize ever return null? - c#

I've tried several different ways to get XmlSerializer.Deserialize to return null, but it doesn't seem possible.
I tried a class that is null, malformed XML, and well-formed XML.
I might be missing something obvious here, but is it possible?
Just to clarify: given a class MyClass that is serializable, I want a test similar to the following to pass:
[Fact] // this is the test attribute when using xUnit
public void When_xml_Something_Then_serialize_returns_null()
{
    string serializedObject = "<?xml version=\"1.0\" encoding=\"utf-8\"?><MyClass xmlns:xsi=\"http://www.w3asdsadasdasd.org/2001/XMLSchema-instance\"></MyClass>";
    using (var stringReader = new StringReader(serializedObject))
    {
        Assert.Null(new XmlSerializer(typeof(MyClass)).Deserialize(stringReader));
    }
}
I tried different things in the serialized string, and I either get an exception or an empty instance of MyClass :(
Thanks
NOTE: there was a typo in this question, it is now corrected
NOTE 2: for a more detailed answer look at the comments.

Yes, Deserialize can return null when the input does not contain the XML that is expected. This is frequently seen when there is confusion of the XML namespaces. If the input contains a root element with the expected name, but in a different namespace, then null will be returned.
This is often seen when dealing with ASMX web services or with Web References, especially web references against RPC-style services, where the messages are described in terms of the XSD type of the message, and not in terms of the element.

Apparently, you can view or download the code for the System.Xml part of the .NET Framework. This lets you look in the source code to determine when it returns null.

For future readers: use the IsNullable attribute on those properties that can be null.
https://learn.microsoft.com/en-us/dotnet/api/system.xml.serialization.xmlelementattribute.isnullable
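A minimal sketch of what that looks like, assuming a made-up MyClass with a single Name property (neither is from the question). With IsNullable = true, an element marked xsi:nil="true" is read back as null rather than as an empty value. As far as I can tell, the same flag on [XmlRoot] combined with a root element marked xsi:nil="true" is one input for which Deserialize itself returns null.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for illustration only; not the asker's class.
[XmlRoot("MyClass", IsNullable = true)]
public class MyClass
{
    [XmlElement(IsNullable = true)]
    public string Name { get; set; }
}

public static class NullableDemo
{
    private const string Xsi = "http://www.w3.org/2001/XMLSchema-instance";

    // Deserializes the given XML text as MyClass and returns the raw result.
    public static object Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(MyClass));
        using (var reader = new StringReader(xml))
        {
            return serializer.Deserialize(reader);
        }
    }

    // Root element flagged as nil.
    public static string NilRoot()
    {
        return "<MyClass xmlns:xsi=\"" + Xsi + "\" xsi:nil=\"true\" />";
    }

    // Only the Name element is flagged as nil.
    public static string NilProperty()
    {
        return "<MyClass xmlns:xsi=\"" + Xsi + "\"><Name xsi:nil=\"true\" /></MyClass>";
    }

    public static void Main()
    {
        Console.WriteLine(Deserialize(NilRoot()) == null);

        var item = (MyClass)Deserialize(NilProperty());
        Console.WriteLine(item != null && item.Name == null);
    }
}
```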

Related

How can I extract values as strings from an xml file based on the element/property name in a generated .Net class or the original XSD?

I have a large complex XSD set.
I have C# classes generated from those XSDs using xsd.exe. Naturally, though the majority of properties in the generated classes are strings, many are decimals, DateTimes, enums or bools, just as they should be.
Now, I have some UNVALIDATED data that is structured in the correct XML format, but may well NOT be able to pass XSD validation, let alone be put into an instance of the relevant .Net object. For example, at this stage, for all we know the value for the element that should be a DateTime could be "ABC" - not even parseable as a DateTime - let alone other string elements respecting maxLength or regex pattern restrictions. This data is ready to be passed in to a rules engine that we already have to make everything valid, including defaulting things appropriately depending on other data items, etc.
I know how to use the various types in System.Xml to read the string value of a given element by name. Clearly I could just hand craft code to get out all the elements that exist today by name - but if the XSD changes, the code would need to be reworked. I'd like to be able to either directly read the XSD or use reflection on the generated classes (including attributes like [System.Xml.Serialization.XmlTypeAttribute(TypeName=...] where necessary) to find exactly how to recursively query the XML down to the raw, unverified string version of any given element to pass through to the ruleset, and then after the rules have made something valid of it, either put it back into the strongly typed object or back into a copy of the XML for serialization into the object.
(It has occurred to me that an alternative approach would be to somehow automatically generate a 'stringly typed' version of the object - where there are not DateTimes etc; nothing but strings - and serialize the xml into that. I have even madly thought of taking the xsd.exe generated .cs file and search/replacing all the enums and base types that aren't strings to strings, but there has to be a better way.)
In other words, is there an existing generic way to pull the XElement or attribute value from some XML that would correspond to a given item in a .Net class if it were serialized, without actually serializing it?
Sorry to self-answer, and sorry for the lack of actual code in my answer, but I don't yet have the permission of my employer to share the actual code on this. Working on it, I'll update here when there is movement.
I was able to implement something I called a Tolerant XML Reader. Unlike most XML deserializing, it starts by using reflection to look at the structure of the required .Net type, and then attempts to find the relevant XElements and interpret them. Any extra elements are ignored (because they are never looked for), any elements not found are defaulted, and any elements found are further interpreted.
The main method signature, in C#, is as follows:
public static T TolerantDeserializeIntoType<T>(
    XDocument doc,
    out List<string> messagesList,
    out bool isFromSuppliedData,
    XmlSchemaSet schemas = null,
    bool tolerant = true)
A typical call to it might look like this:
List<string> messagesList;
bool defaultOnly;
SomeType result = TolerantDeserializeIntoType<SomeType>(someXDocument, out messagesList, out defaultOnly);
(you may use var; I just explicitly put the type there for clarity in this example).
This will take any XDocument (so the only criterion for the original is that it is well-formed), and make an instance of the specified type (SomeType, in this example) from it.
Note that even if nothing at all in the XML is recognized, it will still not fail. The new instance will simply have all properties / public fields nulled or defaulted, the messagesList will list all the defaulting done, and the boolean out parameter will be false.
The recursive method that does all the work has a similar signature, except it takes an XElement instead of an XDocument, and it does not take a schemaSet. (The present implementation also has an explicit bool to indicate a recursive call defaulting to false. This is a slightly dirty way to allow it to gather all failure messages up to the end before throwing an error if tolerant is false; in a future version I will refactor that to only expose publicly a version without that, if I even want to make the XElement version public at all):
public static T TolerantDeserializeXElementIntoType<T>(
    ref XElement element,
    ref List<string> messagesList,
    out bool isFromSuppliedValue,
    bool tolerant = true,
    bool recursiveCall = false)
How it works, detail
Starting with the main call, the one with an XDocument and optional XmlSchemaSet:
If a schema set that will compile is supplied (actually, it also looks for xsi:noNamespaceSchemaLocation), the initial XDocument and schema set call runs a standard XDocument.Validate() across the supplied XDocument, but this only collects the validation error callbacks. It won't throw an exception, and is done for only two reasons:
1. it will give some useful messages for the messagesList, and
2. it will populate the SchemaInfo of all XElements to possibly use later in the XElement version.
(Note, however, that the schema is entirely optional. It is actually only used to resolve some ambiguous situations where it can be unclear from the C# object whether a given XElement is mandatory or not.)
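Not the actual code (which I still can't share), but the validation step can be sketched roughly like this. The Order schema and the helper names are invented for the example; XDocument.Validate with addSchemaInfo set to true is the standard System.Xml.Schema extension method.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class ValidationDemo
{
    // Hypothetical schema: an Order with one dateTime element.
    public const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='Order'><xs:complexType><xs:sequence>" +
        "<xs:element name='Placed' type='xs:dateTime' />" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    public static XmlSchemaSet LoadSchema(string xsd)
    {
        var schemas = new XmlSchemaSet();
        using (var reader = XmlReader.Create(new StringReader(xsd)))
        {
            // null means "use the targetNamespace declared in the schema itself".
            schemas.Add(null, reader);
        }
        return schemas;
    }

    // Validates without throwing: each callback is appended to the returned list.
    public static List<string> CollectValidationMessages(XDocument doc, XmlSchemaSet schemas)
    {
        var messages = new List<string>();
        // addSchemaInfo: true annotates the elements so GetSchemaInfo() can be used later.
        doc.Validate(schemas, (sender, e) => messages.Add(e.Severity + ": " + e.Message), true);
        return messages;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse("<Order><Placed>ABC</Placed></Order>");
        foreach (string message in CollectValidationMessages(doc, LoadSchema(Xsd)))
        {
            Console.WriteLine(message);
        }
    }
}
```

Because a handler is supplied, the invalid "ABC" value only produces a message; the document itself is left untouched for the later steps.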
From there, the recursive XElement version is called on the root node and the supplied C# type.
I've made the code look for the style of C# objects generated by xsd.exe, though most basic structured objects using Properties and Fields would probably work even without the CustomAttributes that xsd.exe supplies, if the Xml elements are named the same as the properties and fields of the object.
The method looks for:
- Arrays
- Simple value types, explicitly:
  - String
  - Enum
  - Bool
  - then anything else by using the relevant TryParse() method, found by reflection.
  (Note that nulls / xsi:nil='true' values also have to be specially handled.)
- Objects, recursively.
It also looks for a boolean 'xxxSpecified' in the object for each field or property 'xxx' that it finds, and sets it appropriately. This is how xsd.exe indicates an element being omitted from the XML in cases where null won't suffice.
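To make the outline concrete, here is a much-reduced sketch of the idea. It is not the real implementation: the Order and Status types are invented, it handles only writable properties, it matches elements by local name, and it leaves out arrays, xsi:nil, the xxxSpecified flags and the schema entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Xml.Linq;

// Hypothetical target types for the example.
public enum Status { Unknown, Open, Closed }

public class Order
{
    public string Customer { get; set; }
    public DateTime Placed { get; set; }
    public Status Status { get; set; }
    public decimal Total { get; set; }
}

public static class TolerantSketch
{
    // Starts from the .NET type and looks for matching child elements.
    // Assumes every object type has a public parameterless constructor.
    public static object Read(Type type, XElement element, List<string> messages)
    {
        object result = Activator.CreateInstance(type);
        foreach (PropertyInfo prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!prop.CanWrite) { continue; }

            XElement child = null;
            foreach (XElement candidate in element.Elements())
            {
                if (candidate.Name.LocalName == prop.Name) { child = candidate; break; }
            }
            if (child == null)
            {
                messages.Add(prop.Name + ": not found, defaulted");
                continue;
            }

            object value;
            if (TryConvert(prop.PropertyType, child, messages, out value))
            {
                prop.SetValue(result, value, null);
            }
            else
            {
                messages.Add(prop.Name + ": could not parse '" + child.Value + "', defaulted");
            }
        }
        return result;
    }

    private static bool TryConvert(Type target, XElement child, List<string> messages, out object value)
    {
        value = null;
        if (target == typeof(string)) { value = child.Value; return true; }
        if (target.IsEnum)
        {
            if (!Enum.IsDefined(target, child.Value)) { return false; }
            value = Enum.Parse(target, child.Value);
            return true;
        }

        // Find "static bool TryParse(string, out T)" on the target type by reflection.
        MethodInfo tryParse = target.GetMethod(
            "TryParse", BindingFlags.Public | BindingFlags.Static, null,
            new[] { typeof(string), target.MakeByRefType() }, null);
        if (tryParse != null)
        {
            object[] args = { child.Value, null };
            bool parsed = (bool)tryParse.Invoke(null, args);
            value = args[1]; // the out parameter is written back into the array
            return parsed;
        }

        // Anything else is treated as a nested object and read recursively.
        value = Read(target, child, messages);
        return true;
    }
}
```

Reading `<Order><Customer>Ann</Customer><Placed>ABC</Placed><Status>Open</Status><Extra>x</Extra></Order>` with this sketch fills Customer and Status, ignores Extra because it is never looked for, and records one message each for the unparseable Placed and the missing Total.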
That's the outline. As I said, I may be able to put actual code somewhere like GitHub in due course. Hope this is helpful to someone.

Json with a strange name field cannot be parsed

I have this JSON container that has a strange field called "48x48" for a photoUrl.
using Newtonsoft.Json;
(...)
dynamic issuesJson = JsonConvert.DeserializeObject(responseIssues.Content);
foreach (dynamic issue in issuesJson.issues) {
    Console.WriteLine(issue.name); //works properly
    Console.WriteLine(issue.48x48); //error -> expected;
}
For some reason Visual Studio doesn't accept the access to this runtime field of this dynamic object. How can I work around this problem?
Note: I cannot change the field name.
Thanks anyway.
For some reason Visual Studio doesn't accept the access to this runtime field of this dynamic object.
Well what you've provided is simply not valid C#. An identifier can't start with a digit. That's still enforced even when you're trying to resolve a member of dynamic.
We don't know what type you're using for issues, but basically you'll need to handle it as a key/value map which you can access by string. Quite how you do that will depend on the implementation of issue. It doesn't look like Json.NET guarantees anything there - you may be able to cast it to JObject, for example:
foreach (JObject issue in issuesJson.issues) {
    Console.WriteLine(issue["48x48"]);
}
Field names cannot start with a number. Sorry, no way around it.
You'll have to consult the documentation of your deserializer to see how it takes care of cases like that. It may be as simple as renaming the field "_48x48".
EDIT: actually, based on your code, you probably don't have a class representing this JSON object; I'm leaving my answer anyway, in case it helps someone else.
As others have mentioned, a C# identifier can't start with a digit. You just need to rename 48x48 to a valid name in your class, and map it to the actual JSON name using the [JsonProperty] attribute:
[JsonProperty("48x48")]
public string _48x48 { get; set; }
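Putting the two suggestions side by side, a small sketch that needs the Newtonsoft.Json package. The Issue and IssueList classes, the Avatar48 property name and the sample JSON are all made up for illustration, since the real response shape isn't shown in the question.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical classes mirroring the assumed JSON shape.
public class Issue
{
    [JsonProperty("name")]
    public string Name { get; set; }

    // The C# name can be anything valid; the attribute carries the JSON name.
    [JsonProperty("48x48")]
    public string Avatar48 { get; set; }
}

public class IssueList
{
    [JsonProperty("issues")]
    public List<Issue> Issues { get; set; }
}

public static class JsonDemo
{
    // Assumed sample payload, not the real service response.
    public const string Sample =
        "{\"issues\":[{\"name\":\"first\",\"48x48\":\"http://example.invalid/a.png\"}]}";

    public static void Main()
    {
        // Untyped: index by string, so the key never has to be a C# identifier.
        JObject root = JObject.Parse(Sample);
        foreach (JToken issue in root["issues"])
        {
            Console.WriteLine((string)issue["48x48"]);
        }

        // Typed: the attribute maps "48x48" onto Avatar48.
        IssueList typed = JsonConvert.DeserializeObject<IssueList>(Sample);
        Console.WriteLine(typed.Issues[0].Avatar48);
    }
}
```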

"Namespace prefix not defined" when it actually is defined

I'm having trouble deserializing XML with an "undefined" namespace prefix which really is defined.
We've published an internal web service in C# which serves a variety of clients. A new client's IDE insists on declaring xsi:type for every element in its XML output, and they can't turn off this "feature".
The XML message they produce goes like this, where "namespace" is the correct namespace.
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <myOperation xsi:type="ns1:namespace" xmlns="namespace" xmlns:ns1="namespace">
      <inputString xsi:type="xsd:string">ABCDEF</inputString>
      <books xsi:type="ns1:booksType">
        <bookID xsi:type="xsd:string">ABC123</bookID>
        <bookID xsi:type="xsd:string">DEF456</bookID>
      </books>
      <!-- ... snip... -->
    </myOperation>
  </soapenv:Body>
</soapenv:Envelope>
<books> is basically an array of strings.
The service method accepts it as an XmlNode, but XmlSerializer throws a "prefix 'ns1' not defined" error. (It is defined in a parent node, but apparently that is not good enough.) I have a similar problem using wsdl.exe to generate classes and deserialize the input for me.
Using XmlNamespaceManager to specify prefixes doesn't seem right -- akin to magic numbers, and I can't predict which prefix a given consumer will declare anyway. Is there a way to handle this without stripping the attributes out (books.Attributes.RemoveAll)? That doesn't feel particularly elegant either.
I've found that books.OuterXML does not contain any information for 'ns1' unless I hack the element inbound to use that prefix (), so I can see why it complains, but I don't yet understand why 'ns1' isn't recognized from its previous definition above.
Many thanks for any help, or at least education, someone can provide.
Edits: it works fine if I change <books> to use the prefix, i.e. <ns1:books xsi:type="ns1:booksType">. This works whether I've defined xmlns or not. That may be consistent with this answer, but I still don't see how I would feasibly declare the prefix in the service code.
@Chris, certainly. Hope I can strike a balance between "stingy with closed source" and "usable for those who would help". Here "books" is the XmlNode received in the service method parameter. (Not to get off topic, but I will also humbly take suggestions to improve it in general; I'm still a novice.)
XmlSerializer xmlSerializer = new XmlSerializer(typeof(booksType));
StringReader xmlDataReader = new StringReader(books.OuterXml);
books = (booksType)xmlSerializer.Deserialize(xmlDataReader);
The class is pretty much this:
[Serializable()]
[XmlRoot("books", Namespace = "namespace")]
[XmlTypeAttribute(TypeName = "booksType", Namespace = "namespace")]
public class booksType
{
    [XmlElement(ElementName = "bookID")]
    public string[] bookIDs { get; set; }
}
Your deserialization code could look something like this:
XmlSerializer sz = new XmlSerializer(typeof(booksType));
var reader = new XmlNodeReader(booksXmlNode);
var books = sz.Deserialize(reader);
[EDIT] This is better, because the namespace declarations are preserved with the XmlNode, whereas converting to an XML string via OuterXml appears to slice off the namespace declaration for the ns1 prefix, and the serializer then barfs on the type attribute value containing this prefix. I imagine this is a bug in the XML implementation but maybe an XML guru can confirm this.
This should get you past the error you are seeing, but whether it solves the problem completely I'm not sure.
[FURTHER EDIT] As noted in the comments below, there is a bug in the .NET XmlSerializer which is causing the deserialization to fail. Stepping through the deserialization code in the generated assembly, there is a point where the following condition is tested:
(object) ((System.Xml.XmlQualifiedName)xsiType).Namespace == (object)id2_namespace))
Although the Namespace property of the XmlQualifiedName has the same value ('namespace') as the string variable id2_namespace, the condition is evaluating to false because it is coded as an object identity test rather than a test for string value equivalence. Failing this condition leads directly to the exception reported by OP.
As far as I can see, this bug will always cause deserialization to fail whenever the XML for the object being deserialized uses one prefix on the object's root element name, and another prefix (defined as the same namespace) on that element's xsi:type attribute.
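The prefix-scoping part of this can be seen in isolation with a small sketch. It does not reproduce the serializer issue; it only shows that the ns1 declaration lives on the parent, so it is visible from the node in place but is not carried along in OuterXml. The URI "urn:example:books" stands in for the question's "namespace", and the document is cut down to the relevant elements.

```csharp
using System;
using System.Xml;

public static class NamespaceScopeDemo
{
    // Placeholder document: ns1 is declared on the parent, used only in an attribute value.
    public const string Xml =
        "<myOperation xmlns=\"urn:example:books\" xmlns:ns1=\"urn:example:books\" " +
        "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">" +
        "<books xsi:type=\"ns1:booksType\"><bookID>ABC123</bookID></books>" +
        "</myOperation>";

    public static XmlNode LoadBooks()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.DocumentElement.FirstChild; // the <books> element
    }

    public static void Main()
    {
        XmlNode books = LoadBooks();

        // The string form only redeclares prefixes used in element and attribute names.
        Console.WriteLine(books.OuterXml);
        Console.WriteLine(books.OuterXml.Contains("xmlns:ns1"));

        // In the DOM, the prefix still resolves through the parent element.
        Console.WriteLine(books.GetNamespaceOfPrefix("ns1"));

        // A reader over the node in place sees the same in-scope declarations.
        using (var reader = new XmlNodeReader(books))
        {
            reader.MoveToContent();
            Console.WriteLine(reader.LookupNamespace("ns1"));
        }
    }
}
```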

Why doesn't XElement have a GetAttributeValue method?

Sometimes I'd like to know the reasoning of certain API changes. Since Google hasn't helped me with this question, maybe StackOverflow can. Why did Microsoft choose to remove the GetAttribute helper method on XML elements? In the System.Xml world there was XmlElement.GetAttribute("x") like getAttribute in MSXML before it, both of which return either the attribute value or an empty string when missing. With XElement there's SetAttributeValue but GetAttributeValue wasn't implemented.
Certainly it's not too much work to modify logic to test and use the XElement.Attribute("x").Value property but it's not as convenient and providing the utility function one way (SetAttributeValue) but not the other seems weird. Does anyone out there know the reasons behind the decision so that I can rest easily and maybe learn something from it?
You are supposed to get attribute value like this:
var value = (TYPE) element.Attribute("x");
UPDATE:
Examples:
var value = (string) element.Attribute("x");
var value = (int) element.Attribute("x");
etc.
See this article: http://www.hanselman.com/blog/ImprovingLINQCodeSmellWithExplicitAndImplicitConversionOperators.aspx. Same thing works for attributes.
Not sure exactly the reason, but with C# extension methods, you can solve the problem yourself.
public static string GetAttributeValue(this XElement element, XName name)
{
    var attribute = element.Attribute(name);
    return attribute != null ? attribute.Value : null;
}
Allows:
element.GetAttributeValue("myAttributeName");
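For comparison, a short sketch of how the built-in conversions behave when the attribute is missing. The element and attribute names are arbitrary. Casting a missing attribute to string or to a nullable type gives null, while casting it to a non-nullable type such as int throws an ArgumentNullException, so the nullable form is the safer default.

```csharp
using System;
using System.Xml.Linq;

public static class AttributeDemo
{
    // Returns the attribute value, or null when the attribute is absent.
    public static string AsString(XElement element, string name)
    {
        return (string)element.Attribute(name);
    }

    // Returns the attribute as an int, or null when the attribute is absent.
    public static int? AsNullableInt(XElement element, string name)
    {
        return (int?)element.Attribute(name);
    }

    // Mimics the old XmlElement.GetAttribute behaviour: empty string when missing.
    public static string AsStringOrEmpty(XElement element, string name)
    {
        return (string)element.Attribute(name) ?? string.Empty;
    }

    public static void Main()
    {
        XElement element = XElement.Parse("<item x=\"42\" />");

        Console.WriteLine(AsString(element, "x"));
        Console.WriteLine(AsString(element, "y") == null);
        Console.WriteLine(AsNullableInt(element, "x"));
        Console.WriteLine(AsNullableInt(element, "y").HasValue);
        Console.WriteLine(AsStringOrEmpty(element, "y").Length);
    }
}
```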

.NET XmlSerializer to Element FormDefault=Unqualified XML?

I use C# code more-or-less like this to serialize an object to XML:
XmlSerializer xs1 = new XmlSerializer(typeof(YourClassName));
StreamWriter sw1 = new StreamWriter(@"c:\DeserializeYourObject.xml");
xs1.Serialize(sw1, objYourObjectFromYourClassName);
sw1.Close();
I want it to serialize like this:
<ns0:Header xmlns:ns0="https://mynamespace/">
<SchemaVersion>1.09</SchemaVersion>
<DateTime>2009-12-15T00:00:01-08:00</DateTime>
but instead, it is doing this:
<Header xmlns="https://mynamespace/">
<SchemaVersion xmlns="">V109</SchemaVersion>
<DateTime xmlns="">2010-03-08T18:21:09.100125-08:00</DateTime>
The way it is serializing doesn't work with the XPath I had planned to use, and doesn't match my BizTalk schema. Originally I built the class using XSD.exe from a BizTalk 2006 schema, then I use it for an argument to a WCF web service.
This might be related to an option called elementFormDefault = Qualified or Unqualified. In BizTalk, I have the schema set to "Unqualified", which is what I want.
Is there any way for the serializer to output "unqualified" results?
Thanks,
Neal Walters
Update:
Sample attribute on DateTime:
/// <remarks/>
[System.Xml.Serialization.XmlElementAttribute(Form = System.Xml.Schema.XmlSchemaForm.Unqualified)]
public System.DateTime DateTime
{
    get
    {
        return this.dateTimeField;
    }
    set
    {
        this.dateTimeField = value;
    }
}
BizTalk provides for what it calls promoted (or distinguished) fields, which use XPath to pull out values of individual elements. I checked the XPath of BizTalk in a tool called StylusStudio, and BizTalk's XPath didn't work with the xmlns='' fields above.
The first thing my WCF web service does is to serialize the object to a string (using UTF16 encoding) and store it in an XML column in a SQL database. It is from there I am seeing the above xml sample with the xmlns="".
XPath:
/*[local-name()='Header' and namespace-uri()='https://mynamespace/']/*[local-name()='DateTime' and namespace-uri()='']
The XPath you're using does not match the namespaces of your XML. Your Header element, for instance, is in the https://mynamespace/ namespace, but your XPath is searching in the http://mynamespace/ namespace.
My question was a bit muddled, so this answer may or may not help someone.
This is a fairly complex scenario, and half of my issues came from trying to simplify it to make an easy post here.
I was actually adding a new element programmatically with a C# routine (see "NewElement" below). The C# code did not set its namespace to an empty string, therefore I believe it is inheriting the namespace of the "Header" element.
I freaked out a little because I was jumping to the conclusion that DateTime should not have the xmlns="" when in fact it should. Even though DateTime falls under Header, it does not, and should not, inherit the Header's namespace.
In BizTalk, typically only complex types have their own namespace, and DateTime as well as NewElement are simple types.
<Header xmlns="https://mynamespace/">
<SchemaVersion xmlns="">V109</SchemaVersion>
<DateTime xmlns="">2010-03-08T18:21:09.100125-08:00</DateTime>
<NewElement>myvalue</NewElement>
So in effect, the two XML's I posted originally are identical as far as XPath goes. If I insert a new element, I need to make sure it follows the same pattern.
I had written the C# routine to add the element more than a year ago, and it worked fine then, so I didn't suspect that it was causing this problem.
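For anyone hitting the same thing, the pattern can be reproduced with a small sketch. This uses LINQ to XML rather than my original routine, so treat it as an illustration under that assumption. A child created without a namespace is written with xmlns="" and matches the namespace-uri()='' test, while a child created in the Header's namespace is written with no xmlns at all and does not match.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class UnqualifiedDemo
{
    public static XDocument Build()
    {
        XNamespace ns = "https://mynamespace/";
        return new XDocument(
            new XElement(ns + "Header",
                new XElement("SchemaVersion", "V109"),        // unqualified: no namespace
                new XElement(ns + "NewElement", "myvalue"))); // qualified: Header's namespace
    }

    // Selects an unqualified child of Header, in the style of the BizTalk XPath.
    public static XElement SelectUnqualified(XDocument doc, string localName)
    {
        string path =
            "/*[local-name()='Header' and namespace-uri()='https://mynamespace/']" +
            "/*[local-name()='" + localName + "' and namespace-uri()='']";
        return doc.XPathSelectElement(path);
    }

    public static void Main()
    {
        XDocument doc = Build();
        Console.WriteLine(doc.ToString(SaveOptions.DisableFormatting));
        Console.WriteLine(SelectUnqualified(doc, "SchemaVersion") != null);
        Console.WriteLine(SelectUnqualified(doc, "NewElement") != null);
    }
}
```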
