Xml Empty Tag Deserialization - c#

Could you please help me find a way to deserialize an XML file that contains an empty tag?
Here is an example:
<Report>
<ItemsCount></ItemsCount>
</Report>
I want to deserialize it into an object of a class like this:
public class Report{
public int? ItemsCount { get;set;}
}
The class I'm using for deserialization is:
[XmlRoot]
public partial class Report
{
private int? itemsCount;
[XmlElement(IsNullable = true)]
public int? ItemsCount {
get
{
return itemsCount;
}
set
{
itemsCount = value;
}
}
}
It works well if the ItemsCount tag is missing entirely, but if the tag exists and is empty, deserialization throws an exception pointing at the line where the tag is located in the XML.
I have looked at a lot of links here while trying to find a solution, without success.
Also, I don't want to simply ignore the tag in all cases; I want to get a null value when it is empty.

XmlSerializer tries to convert the empty string in the tag to an integer and fails. Change the type of the property to string, as below:
private string itemsCount;
[XmlElement]
public string ItemsCount {
get
{
return itemsCount;
}
set
{
itemsCount = value;
}
}
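This sets the ItemsCount property to an empty string in the case above.
To get a null value instead, keep IsNullable = true on the element and mark it as nil in the XML (the xsi prefix must be bound to the http://www.w3.org/2001/XMLSchema-instance namespace, for example on the root element):
<ItemsCount xsi:nil="true" />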

How about this approach?
Define the class as follows:
public class Report
{
[XmlIgnore]
public int? ItemsCount { get; set; }
}
Due to the XmlIgnore attribute, this tag will be treated as unknown.
When creating the serializer add the event handler:
var xs = new XmlSerializer(typeof(Report));
xs.UnknownElement += Xs_UnknownElement;
In the event handler interpret an empty string as null:
private void Xs_UnknownElement(object sender, XmlElementEventArgs e)
{
    var report = (Report)e.ObjectBeingDeserialized;
    if (e.Element.InnerText == string.Empty)
        report.ItemsCount = null;
    else
        report.ItemsCount = int.Parse(e.Element.InnerText);
}
Use the serializer as usual:
Report report;
using (var fs = new FileStream("test.xml", FileMode.Open))
{
report = (Report)xs.Deserialize(fs);
}

To my understanding, the described behaviour is correct; if the tag ItemsCount is missing, its value is null; if it is empty, its value cannot be converted from "" to a value of int?. That being said, it would be possible to implement some custom parsing into the accessors of ItemsCount, which would have to be of type string. However, this seems more like a workaround to me. If possible, the document should be changed to begin with.
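The custom parsing mentioned above could look roughly like the following sketch. The surrogate property name ItemsCountText is hypothetical (it is not in the question), and the sketch assumes that both an empty and a whitespace-only tag should map to null.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Report
{
    // The value the rest of the program works with; hidden from the serializer.
    [XmlIgnore]
    public int? ItemsCount { get; set; }

    // Hypothetical surrogate: the serializer reads and writes the raw element text here.
    [XmlElement("ItemsCount")]
    public string ItemsCountText
    {
        get { return ItemsCount.HasValue ? XmlConvert.ToString(ItemsCount.Value) : null; }
        set { ItemsCount = string.IsNullOrWhiteSpace(value) ? (int?)null : XmlConvert.ToInt32(value); }
    }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Report));
        using (var reader = new StringReader("<Report><ItemsCount></ItemsCount></Report>"))
        {
            var report = (Report)serializer.Deserialize(reader);
            // The empty tag ends up as null rather than raising an exception.
            Console.WriteLine(report.ItemsCount.HasValue); // False
        }
    }
}
```

The trade-off is an extra public string property on the class, which is why the answer above calls this a workaround.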

Related

IXmlSerializable ReadXml implementation

I am trying to get the name of an XML tag into a class property when performing XML deserialization. I need the name as a property since multiple XML tags share the same class. The XML and associated classes are defined below.
I have an XML response which I receive in the format:
<Data totalExecutionTime="00:00:00.0467241">
<ItemNumber id="1234" order="0" createdDate="2017-03-24T12:07:09.07" modifiedDate="2018-08-29T16:59:19.127">
<Value modifiedDate="2017-03-24T12:07:12.77">ABC1234</Value>
<Category id="5432" parentID="9876" itemOrder="0" modifiedDate="2017-03-24T12:16:23.687">The best category</Category>
... <!-- like 100 other elements -->
</ItemNumber>
</Data>
Deserialization is done as follows:
XmlSerializer serializer = new XmlSerializer(typeof(ItemData));
using (TextReader reader = new StringReader(response))
{
ItemData itemData = (ItemData)serializer.Deserialize(reader);
}
And a class for the top level, ItemData:
[Serializable]
[XmlRoot("Data")]
public class ItemData
{
[XmlAttribute("totalExecutionTime")]
public string ExecutionTime { get; set; }
[XmlElement("ItemNumber", Type = typeof(ItemBase))]
public List<ItemBase> Items { get; set; }
}
ItemBase is defined as:
[Serializable]
public class ItemBase
{
[XmlElement("Value")]
public virtual ItemProperty ItemNumber { get; set; } = ItemProperty.Empty;
[XmlElement("ItemName")]
public virtual ItemProperty Category { get; set; } = ItemProperty.Empty;
... // like 100 other properties
}
And finally ItemProperty:
public class ItemProperty : IXmlSerializable
{
public static ItemProperty Empty { get; } = new ItemProperty();
public ItemProperty()
{
this.Name = string.Empty;
this.Value = string.Empty;
this.Id = 0;
this.Order = 0;
}
public string Name { get; set; }
[XmlText] // no effect while using IXmlSerializable
public string Value { get; set; }
[XmlAttribute("id")] // no effect while using IXmlSerializable
public int Id { get; set; }
[XmlAttribute("itemOrder")] // no effect while using IXmlSerializable
public int Order { get; set; }
public XmlSchema GetSchema()
{
return null;
}
public void ReadXml(XmlReader reader)
{
reader.MoveToContent();
string name = reader.Name;
this.Name = name;
string val = reader.ReadElementString();
this.Value = val;
if (reader.HasAttributes)
{
string id = reader.GetAttribute("id");
this.Id = Convert.ToInt32(id);
string itemOrder = reader.GetAttribute("itemOrder");
this.Order = Convert.ToInt32(itemOrder);
string sequence = reader.GetAttribute("seq");
this.Sequence = Convert.ToInt32(sequence);
}
// it seems the reader doesn't advance to the next element after reading
if (reader.NodeType == XmlNodeType.EndElement && !reader.IsEmptyElement)
{
reader.Read();
}
}
public void WriteXml(XmlWriter writer)
{
throw new NotImplementedException();
}
}
I am implementing the IXmlSerializable interface because ultimately I need the name of the XML tag that is stored as an ItemProperty, and that information is not captured when using the XML class/property attributes. I believe this is the case because the attributes determine which class to use for the deserialization, and normally each XML tag would have an associated class. I don't want to go in that direction, since a large number of different tags may be in the response and they all share similar attributes. Hence the ItemProperty class.
I've tried also passing the name of the tag via a parameterized constructor in the ItemProperty and setting the name property there, but when deserialization is performed, it uses the default constructor and then sets the property values, so that is not an option.
Reflection doesn't work either since the class is always ItemProperty and therefore doesn't have a unique name.
Maybe how I'm structuring the XML attributes could be done differently to achieve what I'm trying to do, but I don't see it.
I'm open to any way to solve this problem, but I'm pretty sure it entails implementing IXmlSerializable.ReadXml(). I know it is the job of the XmlReader to read the entirety of the XML and advance the reader to the end of the text, but I'm a little unclear on how to do that.
TLDR: How do I properly implement IXmlSerializable.ReadXml() while capturing the XML tag name, tag value, and all attributes into the class properties?
Edit: with the updated ReadXml method, I get all the data needed at the ItemProperty level, but in class ItemData the Items list only ever has one item. I assume this is because I am not advancing the reader properly.
From the documentation for IXmlSerializable.ReadXml(XmlReader):
When this method is called, the reader is positioned on the start tag that wraps the information for your type. ... When this method returns, it must have read the entire element from beginning to end, including all of its contents. Unlike the WriteXml method, the framework does not handle the wrapper element automatically. Your implementation must do so. Failing to observe these positioning rules may cause code to generate unexpected runtime exceptions or corrupt data.
Your ReadXml() can be modified to meet these requirements as follows:
public void ReadXml(XmlReader reader)
{
    reader.MoveToContent();
    this.Name = reader.LocalName; // Do not include the prefix (if present) in the Name.
    if (reader.HasAttributes)
    {
        var id = reader.GetAttribute("id");
        if (id != null)
            // Since id is missing from some elements you might want to make it nullable
            this.Id = XmlConvert.ToInt32(id);
        var order = reader.GetAttribute("itemOrder");
        if (order != null)
            // Since itemOrder is missing from some elements you might want to make it nullable
            this.Order = XmlConvert.ToInt32(order);
        string sequence = reader.GetAttribute("seq");
        // There is no Sequence property?
        // this.Sequence = Convert.ToInt32(sequence);
    }
    // Read element value.
    // This method reads the start tag, the contents of the element, and moves the reader past the end element tag,
    // thus there is no need for an additional Read().
    this.Value = reader.ReadElementContentAsString();
}
Notes:
You are calling ReadElementString() whose documentation states:
We recommend that you use the ReadElementContentAsString() method to read a text element.
As suggested, I modified your ReadXml() to use this method. In turn, its documentation states:
This method reads the start tag, the contents of the element, and moves the reader past the end element tag.
Thus this method should leave the XmlReader positioned exactly as required by ReadXml(), ensuring the reader is advanced properly.
The XML attributes of each ItemProperty element must be processed before that element's content is read, since reading the content advances the reader past the element start -- and its attributes.
Utilities from the XmlConvert class should be used to parse and format XML primitives so that numerical and date/time values are not erroneously localized.
You probably don't want to include the namespace prefix (if any) in the Name property.
Demo fiddle here.
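The positioning rule from the notes can be observed with a plain XmlReader, outside of XmlSerializer. The following is only an illustrative sketch; the two-element XML snippet is made up for the demonstration and deliberately contains no whitespace between elements.

```csharp
using System;
using System.IO;
using System.Xml;

public static class Program
{
    public static void Main()
    {
        const string xml = "<Item><Value id=\"7\">ABC1234</Value><Category>Best</Category></Item>";
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("Value");

            // Attributes must be read while the reader is still on the start tag.
            string id = reader.GetAttribute("id");

            // Reads the start tag, the text content and the end tag in one call.
            string text = reader.ReadElementContentAsString();
            Console.WriteLine(id + " " + text); // 7 ABC1234

            // The reader now sits on the next sibling; no extra Read() is needed.
            Console.WriteLine(reader.NodeType + " " + reader.LocalName); // Element Category
        }
    }
}
```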

Generating XML document with C#

I need to generate an XML document that follows this specification
<productName locale="en_GB">Name</productName>
but using XML serialization I am getting the following
<productName locale="en_GB">
<Name>Name</Name>
</productName>
My C# code is like this:
[Serializable]
public class productName
{
public productName()
{
}
public string Name;
[XmlAttribute]
public string locale;
}
XmlAttribute is what is required to show the locale in the correct place, but I am unable to figure out how to export the Name field correctly.
Does anyone have an idea?
Thanks
EDIT:
This is the code to generate the XML
public static class XMLSerialize
{
public static void SerializeToXml<T>(string file, T value)
{
var serializer = new XmlSerializer(typeof(T));
using (var writer = XmlWriter.Create(file))
serializer.Serialize(writer, value);
}
public static T DeserializeFromXML<T>(string file)
{
XmlSerializer deserializer = new XmlSerializer(typeof(T));
TextReader textReader = new StreamReader(file);
T result;
result = (T)deserializer.Deserialize(textReader);
textReader.Close();
return result;
}
}
Instead of serializing Name as an element, serialize it as the element's text value by adding the [XmlText] attribute:
[XmlText]
public string Value { get; set; }
This contains not only a direct answer to your question, but also an indirect answer on how to solve similar issues in the future.
Start the other way around, with your XML: write the XML exactly as you want it and go from there, like this:
// assuming data.xml contains the xml as you'd like it
> xsd.exe data.xml // will generate data.xsd, ie xsd-descriptor
> xsd.exe data.xsd /classes // will generate data.cs, ie c# classes
> notepad.exe data.cs // have a look at data.cs with your favorite editor
Now just have a look at data.cs, this will contain an enormous amount of attributes and stuff and the namespaces are probably wrong, but at least you know how to solve your particular xml-issue.
The direct answer is to use the XmlTextAttribute on the given property, preferably named Value since that is the convention I've seen so far.
[Serializable]
public class productName {
public productName() { }
[XmlText]
public string Value {get; set;}
[XmlAttribute]
public string locale {get; set;}
}
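To check the result end to end, a small console sketch such as the following can be used. It assumes that suppressing the XML declaration and the default xsi/xsd namespace declarations is acceptable; the XmlWriterSettings and XmlSerializerNamespaces setup is an addition for the demonstration and is not part of the question's code.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class productName
{
    [XmlText]
    public string Value { get; set; }

    [XmlAttribute]
    public string locale { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(productName));

        // An empty prefix/namespace pair keeps xmlns:xsi and xmlns:xsd off the root element.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", "");

        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var output = new StringWriter();
        using (var writer = XmlWriter.Create(output, settings))
        {
            serializer.Serialize(writer, new productName { Value = "Name", locale = "en_GB" }, namespaces);
        }

        Console.WriteLine(output.ToString()); // <productName locale="en_GB">Name</productName>
    }
}
```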

Incorrect XML deserialization

I have the following class:
public class FtpDefinition
{
public FtpDefinition()
{
Id = Guid.NewGuid();
FtpServerAddress = string.Empty;
FtpPortSpecified = false;
FtpPort = "21";
}
[System.Xml.Serialization.XmlElement("Id")]
public System.Guid Id { get; set; }
[System.Xml.Serialization.XmlElement("FtpServerAddress")]
public string FtpServerAddress { get; set; }
[System.Xml.Serialization.XmlElement("FtpPortSpecified")]
public bool FtpPortSpecified { get; set; }
[System.Xml.Serialization.XmlElement("FtpPort")]
public string FtpPort { get; set; }
}
I have a method that gets the following XML string, and using the .net XML deserialization capability
deserializes it into an object of type FtpDefinition.
<FTPDefinition>
<Id>a0a940a7-6785-41be-ac3a-75ba5d4c13ee</Id>
<FtpServerAddress>ftp.noname.com</FtpServerAddress>
<FtpPortSpecified>false</FtpPortSpecified>
<FtpPort>21</FtpPort>
</FTPDefinition>
The problem is, that although the Id and FtpServerAddress fields get populated properly, FtpPort gets
populated with an empty string, and what's more weird is that FtpPortSpecified gets populated with the bool value TRUE instead of FALSE.
I replaced the automatic properties in the above code with actual return\... = value old style getter\setter, so that I can catch the setter getting hit. I was suspecting there's some user code setting the value, but this is not the case. In the call stack it clearly shows that the .net deserialization code is calling the setter with the value TRUE, but one can also see that the XML string provided as parameter to the deserializing method has the correct value (FALSE).
The deserialization code is simple:
XmlSerializer xs = ...(objectType);
using (StringReader stringReader = new StringReader(xml))
{
return xs.Deserialize(stringReader);
}
Please help me figure out what's going on.
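The Specified suffix has special behavior in XML serialization: XmlSerializer treats a bool member named XxxSpecified as a companion flag for the member Xxx. It sets the flag to true during deserialization when the Xxx element is present, and it consults the flag during serialization to decide whether Xxx is written. Simply rename FtpPortSpecified to something else.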
http://msdn.microsoft.com/en-us/library/office/bb402199(v=exchg.140).aspx
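The convention can be seen in isolation with a minimal sketch. The Sample class and its member names are made up for illustration; only the Specified suffix itself matters.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Sample
{
    public string Port { get; set; }

    // Companion flag for Port; it is not an XML element of its own.
    [XmlIgnore]
    public bool PortSpecified { get; set; }
}

public static class Program
{
    private static Sample Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(Sample));
        using (var reader = new StringReader(xml))
        {
            return (Sample)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // The flag is set by the serializer because the <Port> element is present.
        Console.WriteLine(Load("<Sample><Port>21</Port></Sample>").PortSpecified); // True

        // No <Port> element, so the flag keeps its default value.
        Console.WriteLine(Load("<Sample />").PortSpecified); // False
    }
}
```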

XML deserialization - throwing custom errors

So I have the following property:
private int? myIntField;
[System.Xml.Serialization.XmlElementAttribute(Form = System.Xml.Schema.XmlSchemaForm.Unqualified)]
public int? IntField{
get {
return this.myIntField;
}
set {
this.myIntField= value;
}
}
I am deserializing XML from a POST. If, for whatever reason, I receive a string such as "here is the int field: 55444" instead of 55444, the error I get in response is "Input string was not in a correct format.", which isn't very specific, especially considering that I will have more than one int field to verify.
Originally, I was planning something like this:
private string myIntField;
[System.Xml.Serialization.XmlElementAttribute(Form = System.Xml.Schema.XmlSchemaForm.Unqualified)]
public int? IntField{
get {
return this.myIntField.CheckValue();
}
set {
this.myIntField= value;
}
}
Here CheckValue performs a try-parse to an Int32; if it fails, it returns null and adds an error to a list. However, I can't seem to get this setup to work for the generated classes.
Is there a way I can throw a specific error if I am getting strings in place of ints, DateTimes, etc.?
It's easy if you have schema(s) for your XML and validate it against them before deserializing. Supposing you have schema(s) for your XML, you can initialize an XmlSchemaSet, add your schema(s) to it, and then:
var document = new XmlDocument();
document.LoadXml(xml); // this a string holding the XML
document.Schemas.XmlResolver = null; //if you don't need to resolve every references
document.Schemas.Add(SchemaSet); // System.Xml.Schema.XmlSchemaSet instance filled with schemas
document.Validate((sender, args) => { ... }); //args are of type ValidationEventArgs and hold problem if there is one...
Personally, I think this is the better approach, because you validate the XML before deserializing and can be sure it is correct. Otherwise the deserializer will most probably throw an exception when something is wrong, and you will almost never be able to give meaningful feedback to the user.
P.S. I recommend creating schema(s) describing the XML.
The "Input string was not in a correct format" message comes from a standard System.FormatException raised by a call to int.Parse in the automatically generated assembly that does the deserialization. I don't think you can add custom logic to that.
One solution is to do something like this:
[XmlElement("IntField")]
[Browsable(false)] // not displayed in grids
[EditorBrowsable(EditorBrowsableState.Never)] // not displayed by intellisense
public string IntFieldString
{
get
{
return DoSomeConvert(IntField);
}
set
{
IntField = DoSomeOtherConvert(value);
}
}
[XmlIgnore]
public int? IntField { get; set; }
It's not perfect, because you can still get access to the public IntFieldString, but at least, the "real" IntField property is used only programmatically, but not by the XmlSerializer (XmlIgnore), while the field that's holding the value back & forth is hidden from programmers (EditorBrowsable), grids (Browsable), etc... but not from the XmlSerializer.
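As a rough sketch of how the conversion in the setter could report which field was wrong, the example below collects messages in a list instead of throwing, in the spirit of the CheckValue idea from the question. The Payload class, the Errors list and the message text are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Payload
{
    private readonly List<string> errors = new List<string>();

    // Hypothetical error sink; ignored by the serializer.
    [XmlIgnore]
    public List<string> Errors { get { return errors; } }

    [XmlIgnore]
    public int? IntField { get; set; }

    // String surrogate that the serializer actually reads and writes.
    [XmlElement("IntField")]
    public string IntFieldString
    {
        get { return IntField.HasValue ? XmlConvert.ToString(IntField.Value) : null; }
        set
        {
            int parsed;
            if (int.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out parsed))
            {
                IntField = parsed;
            }
            else
            {
                IntField = null;
                errors.Add("IntField: '" + value + "' is not a valid integer.");
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Payload));
        const string xml = "<Payload><IntField>here is the int field: 55444</IntField></Payload>";
        using (var reader = new StringReader(xml))
        {
            var payload = (Payload)serializer.Deserialize(reader);
            // Each entry names the offending field and the raw text that was received.
            foreach (var error in payload.Errors)
                Console.WriteLine(error);
        }
    }
}
```

Collecting messages rather than throwing lets every bad field in one document be reported together; whether that suits the caller is a design choice.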
I have three approaches for you.
Assuming your data is being entered by a user in a user interface, use input validation to ensure the data is valid. It seems odd to allow random strings to be entered when it should be an integer.
Use exactly the approach you suggest above. Here's an example using LINQPad:
void Main()
{
using(var stream = new StringReader(
"<Items><Item><IntValue>1</IntValue></Item></Items>"))
{
var serializer = new XmlSerializer(typeof(Container));
var items = (Container)serializer.Deserialize(stream);
items.Dump();
}
}
[XmlRoot("Items")]
public class Container
{
[XmlElement("Item")]
public List<Item> Items { get; set; }
}
public class Item
{
[XmlElement("IntValue")]
public string _IntValue{get;set;}
[XmlIgnore]
public int IntValue
{
get
{
// TODO: check and throw appropriate exception
return Int32.Parse(_IntValue);
}
}
}
Take control of serialization using IXmlSerializable. Here's another example:
void Main()
{
using(var stream = new StringReader(
"<Items><Item><IntValue>1</IntValue></Item></Items>"))
{
var serializer = new XmlSerializer(typeof(Container));
var items = (Container)serializer.Deserialize(stream);
items.Dump();
}
}
[XmlRoot("Items")]
public class Container
{
[XmlElement("Item")]
public List<Item> Items { get; set; }
}
public class Item : IXmlSerializable
{
public int IntValue{get;set;}
public void WriteXml (XmlWriter writer)
{
writer.WriteElementString("IntValue", IntValue.ToString());
}
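public void ReadXml (XmlReader reader)
{
reader.ReadStartElement(); // consume the wrapping <Item> start tag
reader.MoveToContent();
var v = reader.ReadElementContentAsString(); // reads <IntValue>...</IntValue>
// TODO: check and throw appropriate exception
IntValue = int.Parse(v);
reader.ReadEndElement(); // consume the </Item> end tag
}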
public XmlSchema GetSchema()
{
return(null);
}
}

deserializing enums

I have an xml in which one of the elements has an attribute that can be blank.
For example:
<tests>
<test language="">
.....
</test>
</tests>
Now, language is an enum type in the classes created from the schema. It works fine if the language is specified, but it fails to deserialize if it is blank (as shown in the example).
Edit: Code for deserialization:
XmlSerializer xmlserializer = new XmlSerializer(type);
StringReader strreader = new StringReader(stringXML);
Object o = xmlserializer.Deserialize(strreader);
How can I handle this scenario?
You could declare the enum property as nullable:
public Language? Language { get; set; }
EDIT: OK, I just tried it, and it doesn't work for attributes. Here's another option: don't serialize/deserialize this property directly, but serialize a string property instead:
[XmlIgnore]
public Language Language { get; set; }
[XmlAttribute("language")] // XML names are case-sensitive; the question's XML uses lowercase
public string LanguageAsString
{
get { return Language.ToString(); }
set
{
if (string.IsNullOrEmpty(value))
{
Language = default(Language);
}
else
{
Language = (Language)Enum.Parse(typeof(Language), value);
}
}
}
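Put together as a small console program, the string-surrogate idea might look like this. The Language enum members and the Tests/Test class names are assumptions made for the sketch (the question does not show the generated classes), and Unknown is chosen as the default so that a blank attribute maps to it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public enum Language { Unknown, English, German }

public class Test
{
    [XmlIgnore]
    public Language Language { get; set; }

    // Surrogate for the attribute: blank or missing text maps to the enum default.
    [XmlAttribute("language")]
    public string LanguageAsString
    {
        get { return Language.ToString(); }
        set
        {
            Language = string.IsNullOrEmpty(value)
                ? default(Language)
                : (Language)Enum.Parse(typeof(Language), value);
        }
    }
}

[XmlRoot("tests")]
public class Tests
{
    [XmlElement("test")]
    public List<Test> Items { get; set; }
}

public static class Program
{
    public static void Main()
    {
        const string xml = "<tests><test language=\"\" /><test language=\"English\" /></tests>";
        var serializer = new XmlSerializer(typeof(Tests));
        using (var reader = new StringReader(xml))
        {
            var tests = (Tests)serializer.Deserialize(reader);
            Console.WriteLine(tests.Items[0].Language); // Unknown
            Console.WriteLine(tests.Items[1].Language); // English
        }
    }
}
```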
You probably need to mark up your enumeration, and add a default item that represents Unknown.
For example:
Public Enum EmployeeStatus
<XmlEnum("")> Unknown = 0
<XmlEnum("Single")> One = 1
<XmlEnum("Double")> Two = 2
<XmlEnum("Triple")> Three = 3
End Enum
For more information, see here.
You can do it this way:
namespace Example
{
public enum Language
{
[XmlEnum("en")]
English,
[XmlEnum("de")]
Deutsch
}
public class ExampleClass
{
private Language? language;
[XmlAttribute("language")] // XML names are case-sensitive; the question's XML uses lowercase
public Language Language
{
get { return language ?? Example.Language.English; }
set { language = value; }
}
.
.
.
}
}
What would you want the result to be?
A blank value cannot be mapped to a null reference, since an enum is a non-nullable value type.
object wontBeNull = couldBeNull ?? defaultIfNull;
is what I'd try. This is the null-coalescing operator; I use it when I want a default for null input.
