C# - XML deserialization - ignore elements with attribute

I need to deserialize some xml to c# objects. This is my class:
[XmlRoot("root")]
[Serializable]
public class MyRoot
{
[XmlElement("category")]
public List<Category> Categories { get; set; }
}
I'm deserializing like this:
root = (MyRoot)new XmlSerializer(typeof(MyRoot)).Deserialize(new StringReader(client.DownloadString(XmlUrl)));
But I want to ignore some Category elements with specified "id" attribute values. Is there some way I can do this?

Implementing IXmlSerializable is one way to go, but perhaps an easier path would be simply modifying the XML (using LINQ or XSLT?) ahead of time:
HashSet<string> badIds = new HashSet<string>();
badIds.Add("1");
badIds.Add("excludeme");
XDocument xd = XDocument.Load(new StringReader(client.DownloadString(XmlUrl)));
var badCategories = xd.Root.Descendants("category").Where(x => badIds.Contains((string)x.Attribute("id")));
if (badCategories != null && badCategories.Any())
badCategories.Remove();
MyRoot root = (MyRoot)new XmlSerializer(typeof(MyRoot)).Deserialize(xd.Root.CreateReader());
You could instead filter the resulting collection after deserialization, but that only works if the id is mapped to a property on Category, which you may not otherwise want or need.

Another approach is to have a property named something like ImportCategories with the [XmlElement("category")] attribute and then have Categories as a property that returns a filtered list from ImportCategories using LINQ.
Then your code would do the deserialization and then use root.Categories.
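A minimal sketch of that idea follows. The Category class, its Id property, and the ExcludedIds set are assumptions made for illustration, since the question does not show them; only the ImportCategories/Categories split comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical Category shape: the question does not show this class.
public class Category
{
    [XmlAttribute("id")]
    public string Id { get; set; }
}

[XmlRoot("root")]
public class MyRoot
{
    // Hypothetical exclusion list; replace with the ids you want to drop.
    private static readonly HashSet<string> ExcludedIds =
        new HashSet<string> { "1", "excludeme" };

    // Receives every <category> element during deserialization.
    [XmlElement("category")]
    public List<Category> ImportCategories { get; set; } = new List<Category>();

    // Filtered view over the imported list; [XmlIgnore] hides it from the serializer.
    [XmlIgnore]
    public IEnumerable<Category> Categories =>
        ImportCategories.Where(c => !ExcludedIds.Contains(c.Id));
}

public static class FilterDemo
{
    public static MyRoot Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(MyRoot));
        using (var reader = new StringReader(xml))
        {
            return (MyRoot)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        MyRoot root = Load(
            "<root><category id=\"1\"/><category id=\"2\"/><category id=\"excludeme\"/></root>");

        // Only the categories whose id is not excluded are enumerated.
        foreach (Category category in root.Categories)
            Console.WriteLine(category.Id);
    }
}
```

All elements are still deserialized, so this trades a little memory for not having to touch the XML or hand-write a reader.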

To do this the Microsoft way, you would need to implement the IXmlSerializable interface on the class that you want to serialize:
https://msdn.microsoft.com/en-us/library/system.xml.serialization.ixmlserializable(v=vs.110).aspx
It's going to require some hand-coding on your part: you basically have to implement the WriteXml and ReadXml methods, which receive an XmlWriter and an XmlReader respectively, to do what you need to do.
Just remember to keep your classes pretty atomic, so that you don't end up custom-serializing for the entire object graph (ugh).

Related

How to manipulate objects after JSON list deserialization

I'm trying to work with JSON files to store a Class and I'm stuck with the deserialization.
I'm using the following namespace:
using System.Text.Json.Serialization;
I have a very simple class, made of 2 properties:
public EnumOfType Type { get; set; }
public double Price { get; set; }
I have 4 instances of this class that I store in a list. When quitting the application, this list is saved in a JSON file.
string jsonString;
jsonString = JsonSerializer.Serialize(myListOfInstances);
File.WriteAllText(FileName, jsonString);
When I'm opening the Application, I want the JSON file to be loaded to recreate the instances.
I'm using the following method, which apparently works well.
string jsonString = File.ReadAllText(FileName);
myListOfInstances = JsonSerializer.Deserialize<List<MyClass>>(jsonString);
So far so good. When I check the content of the list, it is correctly populated and my 4 instances are there.
But then... how to use them?
Before the JSON, I was creating each instance (for example:)
MyClass FirstInstance = new MyClass();
FirstInstance.Type = EnumOfType.Type1;
FirstInstance.Price = 100.46;
Then I could manipulate it easily, simply calling FirstInstance.
myWindow.Label1.Content = FirstInstance.Price.ToString("C");
FirstInstance.Method1...
Now that the instances are in my list, I don't know how to manipulate them individually because I don't know how to call them.
It's probably obvious to most, but I'm still in the learning process.
Thank you for your help,
Fab
Based on how you have loaded the JSON file into your program, it looks like your variable myListOfInstances already contains all four MyClass objects ready to go. At this point you can use List accessors (or LINQ if you want to be fancy) and do things such as the following:
myListOfInstances[0] //Gives you the first item in the list accessed by index
myListOfInstances.First() //Gives you the first item in the list (using linq)
foreach(var item in myListOfInstances) {
// this will iterate through all four items in the list storing each instance in
//the 'item' variable
}
etc...
EDIT: From my comment below. If you need to access specific values in a list directly, you can search the list for items matching a condition using LINQ's 'Where' method. The syntax is something like this:
myListOfInstances.Where(x => x.Property == SomePropertyToMatch)
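As a rough sketch of how an individual instance can be picked out after loading, the snippet below uses FirstOrDefault, which returns a single item (or null) rather than the sequence that Where returns. The members of EnumOfType, the FindByType helper, and the inline JSON string are assumptions for illustration; the question only shows the Type and Price properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical enum members: the question does not list them.
public enum EnumOfType { Type1, Type2 }

public class MyClass
{
    public EnumOfType Type { get; set; }
    public double Price { get; set; }
}

public static class ListAccessDemo
{
    // Returns the first instance with the given type, or null when none matches.
    public static MyClass FindByType(List<MyClass> items, EnumOfType type) =>
        items.FirstOrDefault(x => x.Type == type);

    public static void Main()
    {
        // Stand-in for File.ReadAllText(FileName); enums are stored as numbers by default.
        const string json = "[{\"Type\":0,\"Price\":100.46},{\"Type\":1,\"Price\":20.5}]";
        List<MyClass> items = JsonSerializer.Deserialize<List<MyClass>>(json);

        // Give the instance a local name again, just like FirstInstance before.
        MyClass firstInstance = FindByType(items, EnumOfType.Type1);
        if (firstInstance != null)
            Console.WriteLine(firstInstance.Price.ToString("C"));
    }
}
```

If each type occurs exactly once, converting the list with items.ToDictionary(x => x.Type) is another option, since it lets you look instances up by key.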

Cannot access or find reference to System.Xml.Linq.LineInfoAnnotation. Why is this?

I have an application which takes an XML document and sorts it by certain attributes. I have information associated with each line of the XML document which I want to include in the sorted document. In order to do this,
When I load the file, I make sure the line info is loaded using XDocument.Load(file, LoadOptions.SetLineInfo).
Then I recursively iterate over each XElement and get its line info. When I ran the app, I noticed that each XElement has two annotations,
one of type System.Xml.Linq.LineInfoAnnotation
and one of type System.Xml.Linq.LineInfoEndElementAnnotation.
They contain the info that I need but in private fields.
I can't find any information on these classes, I can't instantiate them, they do not appear in the Object browser under System.Xml.Linq. Yet they exist and I can run "GetType()" on them and get information about the class.
If they exist, why are they not in MSDN references and why can't I instantiate them or extend them? Why can't I find them in the object browser?
P.S. My workaround for this was to use reflection to get the information contained inside each element. But I still can't pass a class name to tell the method what type it is, I have to isolate the object from XElement.Annotations(typeof(object)), and then run GetType() on it. I've illustrated this below.
public object GetInstanceField(Type type, object instance, string fieldName)
{
//reflective method that gets value of private field
}
XElement xEl = existingXElement; //existingXElement is passed in
var annotations = xEl.Annotations(typeof(object)); //contains two objects, start and end LineInfoAnnotation
var start = annotations.First();
var end = annotations.Last();
var startLineNumber = GetInstanceField(start.GetType(), start, "lineNumber"); //lineNumber is the private field I'm trying to access.
var endLineNumber = GetInstanceField(end.GetType(), end, "lineNumber");
This code works, but again, I can't just tell the method "typeof(LineInfoAnnotation)", instead I have to do GetType on the existing object. I cannot make sense of this.
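Those classes are not public (they are internal to System.Xml.Linq) - an implementation detail, if you will. That is why you cannot name them with typeof, instantiate them, or see them in the Object Browser.
All XObjects (elements, attributes) implement the IXmlLineInfo interface - but they implement the interface explicitly, so you must perform a cast to access the properties.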
Once you have your IXmlLineInfo, you can use the properties LineNumber and LinePosition.
var data =
@"<example>
<someElement
someAttribute=""val"">
</someElement></example>
";
var doc = XDocument.Load(new MemoryStream(Encoding.UTF8.GetBytes(data)), LoadOptions.SetLineInfo);
foreach(var element in doc.Descendants()) {
var elLineInfo = element as IXmlLineInfo;
Console.Out.WriteLine(
$"Element '{element.Name}' at {elLineInfo.LineNumber}:{elLineInfo.LinePosition}");
foreach(var attr in element.Attributes()) {
var attrLineInfo = attr as IXmlLineInfo;
Console.Out.WriteLine(
$"Attribute '{attr.Name}' at {attrLineInfo.LineNumber}:{attrLineInfo.LinePosition}");
}
}
Output:
Element 'example' at 1:2
Element 'someElement' at 2:2
Attribute 'someAttribute' at 3:3
To get the EndElement information, you have to use a plain old XML reader, since the XObject API doesn't expose any information about where the element ends.
using(var reader = doc.CreateReader()) {
while(reader.Read()) {
var lineInfo = reader as IXmlLineInfo;
Console.Out.WriteLine($"{reader.NodeType} {reader.Name} at {lineInfo.LineNumber}:{lineInfo.LinePosition}");
if(reader.NodeType == XmlNodeType.Element && reader.HasAttributes) {
while(reader.MoveToNextAttribute()) {
Console.Out.WriteLine($"{reader.NodeType} {reader.Name} at {lineInfo.LineNumber}:{lineInfo.LinePosition}");
}
}
}
}
Output:
Element example at 1:2
Element someElement at 2:2
Attribute someAttribute at 3:3
EndElement someElement at 5:5
EndElement example at 5:19

Why can't I deserialize a KnownType'd anyType?

I'm trying to use DataContractSerializer outside of WCF to serialize an object. The object in this case inherits from an old generic wrapper around CollectionBase e.g.
[KnownType(typeof(Foo))]
[CollectionDataContract]
class FooCollection : MyCollectionBase<Foo>
[KnownType(typeof(FooCollection))]
[KnownType(typeof(Foo))]
[CollectionDataContract]
class MyCollectionBase<T> : CollectionBase
[DataContract]
class Foo
{
[DataMember]
string Name;
[DataMember]
string Value;
}
When this is serialized, I'm getting the following structure:
<FooCollection xmlns="http://schemas.datacontract.org/ ...>
<anyType>
<Name>...</Name>
<Value>...</Value>
</anyType>
</ArrayOfAnyType>
On deserializing I get the error:
System.Runtime.Serialization.SerializationException : Element anyType from namespace http://schemas.datacontract.org/2004/07/MyAssembly cannot have child contents to be deserialized as an object. Please use XmlNode[] to deserialize this pattern of XML.
----> System.Xml.XmlException : End element 'anyType' from namespace 'http://schemas.datacontract.org/2004/07/MyAssembly' expected. Found element 'Name' from namespace 'http://schemas.datacontract.org/2004/07/MyAssembly'. Line 1, position xxx.
Googling this error shows a number of people who changed up their inheritance hierarchy to get the serialization working, or simply list problems with the approach. I haven't been able to find any examples of using XmlNode[] to deserialize this pattern of XML.
So my questions are:
Is there a way to convince the DataContractSerializer that the type stored in the underlying ArrayList is of type Foo?
How do I implement the XmlNode[] workaround?
Is the only solution to use a generic collection that isn't backed by a non-generic one?
I found the cause of the issue, not shown in my original question: MyCollectionBase implemented ISerializable.
The DataContractSerializer will use an ISerializable implementation before it uses any attributes. It also can't infer a contract from ISerializable, therefore there's no workaround for the anyType deserialization.
This is why the KnownTypeAttribute wasn't working.
Given your code above I used the following code to serialize and deserialize:
var serializer = new DataContractSerializer(typeof(FooCollection));
IList collection = new FooCollection();
var foo = new Foo();
foo.Name = "TestName";
foo.Value = "Test value";
collection.Add(foo);
StringBuilder sb = new StringBuilder();
StringWriter writer = new StringWriter(sb);
XmlTextWriter xmlWriter = new XmlTextWriter(writer);
serializer.WriteObject(xmlWriter, collection);
Console.WriteLine(sb.ToString());
var serialized = sb.ToString();
var reader = new StringReader(serialized);
var xmlReader = new XmlTextReader(reader);
var deserialized = serializer.ReadObject(xmlReader);
return;
without any problems using .NET 4.0 and Visual Studio 2010.
If you want to change the name of the element so that it is not 'anyType' you can change the CollectionDataContract attribute on the FooCollection class to read:
[CollectionDataContract(ItemName="Foo")]
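For comparison, here is a small round-trip sketch showing the effect of ItemName. It is an assumption-laden simplification: FooCollection derives from List<Foo> instead of the question's MyCollectionBase<Foo> (which is not shown and implemented ISerializable), and Foo uses public properties.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[DataContract]
public class Foo
{
    [DataMember] public string Name { get; set; }
    [DataMember] public string Value { get; set; }
}

// Assumption: a List<Foo>-based collection stands in for MyCollectionBase<Foo>.
[CollectionDataContract(ItemName = "Foo")]
public class FooCollection : List<Foo> { }

public static class CollectionContractDemo
{
    public static string Serialize(FooCollection collection)
    {
        var serializer = new DataContractSerializer(typeof(FooCollection));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, collection);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static FooCollection Deserialize(string xml)
    {
        var serializer = new DataContractSerializer(typeof(FooCollection));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (FooCollection)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var collection = new FooCollection { new Foo { Name = "TestName", Value = "Test value" } };

        string xml = Serialize(collection);
        Console.WriteLine(xml); // items appear as <Foo> elements rather than <anyType>

        FooCollection copy = Deserialize(xml);
        Console.WriteLine(copy[0].Name);
    }
}
```

Because the element type is known statically here, the serializer never has to fall back to anyType, which is the situation the ISerializable base class prevented in the original code.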

Parsing XML in C#

I am still working on a project and I am enjoying it greatly.
I wanted to see if I could implement a live updating feed using XML.
At the moment I don't even know how to parse this particular type of XML, as all the tutorials I have found are for parsing node values etc.,
but I was thinking something along the lines of this:
<Object name="ObjectName" type="ObjectType" size="ObjectSize" image="ObjectImage" />
If you guys could help me understand how to access the inner values of that node, that would be amazing, and if it is not too much to ask, just a small explanation so I understand. I know how to parse XML that looks like this using XElement:
<Object>
<Name>ObjectName</Name>
<Type>ObjectType</Type>
<Size>ObjectSize</Size>
<Image>ObjectImage</Image>
</Object>
I just can't seem to parse the example at the top. I don't mind if it's LINQ as long as it is in C#; maybe tell me why you would choose one over the other? Also, have you got any idea how to check whether the file has changed, so I could implement a live update?
Thanks for your help
John
The example at the top uses attributes instead of sub-elements but it's just as easy to work with:
XElement element = XElement.Parse(xml);
string name = (string) element.Attribute("name");
string type = (string) element.Attribute("type");
string size = (string) element.Attribute("size");
string image = (string) element.Attribute("image");
I usually prefer to use the explicit string conversion instead of the Value property as if you perform the conversion on a null reference, you just end up with a null string reference instead of a NullReferenceException. Of course, if it's a programming error for an attribute to be missing, then an exception is more appropriate and the Value property is fine. (The same logic applies to converting XElement values as well, by the way.)
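The difference is easiest to see with an attribute that is absent. The snippet below is a small illustration using a trimmed-down version of the element from the question; the "width" attribute and the ReadOptional helper are made up for the example.

```csharp
using System;
using System.Xml.Linq;

public static class AttributeConversionDemo
{
    // Explicit conversion: returns null when the attribute is missing.
    public static string ReadOptional(XElement element, string attributeName) =>
        (string)element.Attribute(attributeName);

    public static void Main()
    {
        XElement element = XElement.Parse("<Object name=\"ObjectName\" type=\"ObjectType\" />");

        string name = ReadOptional(element, "name");   // "ObjectName"
        string size = ReadOptional(element, "size");   // null, no exception
        int? width = (int?)element.Attribute("width"); // nullable conversions behave the same way

        Console.WriteLine(name);
        Console.WriteLine(size == null);
        Console.WriteLine(width.HasValue);

        try
        {
            // Attribute("size") returns null, so reading .Value dereferences null.
            string failing = element.Attribute("size").Value;
            Console.WriteLine(failing);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("Value threw because the attribute is missing");
        }
    }
}
```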
If you have a domain object that represents your document (usually the case), then the XmlSerializer is quite easy to use.
[XmlRoot("Object")]
public class Item
{
public string Name { get; set; }
public string Type { get; set; }
public string Size { get; set; }
public string Image { get; set; }
}
Usage:
XmlSerializer ser = new XmlSerializer(typeof(Item));
Item item = (Item)ser.Deserialize(someXmlStream);
I find using this approach easier than manual parsing when an entire document represents a domain object of some kind.
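Note that plain public properties are matched against child elements by XmlSerializer, so the class as written fits the element-based form of the document. For the attribute-based form from the top of the question, each property would need an [XmlAttribute] mapping. A sketch under that assumption (the lowercase attribute names are taken from the question's sample; the Parse helper is hypothetical):

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Object")]
public class Item
{
    // Each property is bound to an attribute instead of a child element.
    [XmlAttribute("name")] public string Name { get; set; }
    [XmlAttribute("type")] public string Type { get; set; }
    [XmlAttribute("size")] public string Size { get; set; }
    [XmlAttribute("image")] public string Image { get; set; }
}

public static class ItemDemo
{
    public static Item Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Item));
        using (var reader = new StringReader(xml))
        {
            return (Item)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Item item = Parse(
            "<Object name=\"ObjectName\" type=\"ObjectType\" size=\"ObjectSize\" image=\"ObjectImage\" />");

        Console.WriteLine(item.Name);
        Console.WriteLine(item.Image);
    }
}
```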
You can also use XElement.FirstAttribute to get the first attribute on the element and then XAttribute.NextAttribute to loop through them all. This doesn't rely on you knowing that the attribute is present.
XAttribute attribute = element.FirstAttribute;
while (attribute != null)
{
// Do stuff
attribute = attribute.NextAttribute;
}

XML serialization and LINQ

I've currently got an XML element in my database that maps to an object (Long story short the XML is complicated and dynamic enough to defy a conventional relational data structure without massive performance hits).
Around it I've wrapped a series of C# objects that encapsulate the structure of the XML. They work off a base class and there's multiple different possible classes it'll deserialize to with different data structures and different implemented methods. I'm currently wrapping the functionality to serialize/deserialize these into a partial class of the LINQ-generated database objects.
My current approach to this is:
public Options GetOptions()
{
if (XmlOptions == null) return null;
XmlSerializer xs = new XmlSerializer(typeof(Options));
return (Options)xs.Deserialize(XmlOptions.CreateReader());
}
public void SetOptions(Options options)
{
if (options == null) XmlOptions = null;
else
{
XmlSerializer xs = new XmlSerializer(typeof(Options));
using (MemoryStream ms = new MemoryStream())
{
xs.Serialize(ms, options);
XmlOptions = System.Xml.Linq.XElement.Parse(System.Text.UTF8Encoding.UTF8.GetString(ms.ToArray()));
}
}
}
(To help with reading given the changed names aren't too clear, XmlOptions is the XElement element from LINQ and Options is my class it deserializes into)
Now, it works. But that's not really enough to call it "finished" to me :P It just seems incredibly inefficient to serialize XML to a memory stream, convert it to a string, then re-parse it as XML. My question is - Is this the cleanest way to do this? Is there a more efficient/tidy mechanism for doing the serialization? Is there a better approach that'll give me the same benefits (I've only got test data in the system so far, so I can change the XML structure if required)?
PS: I've renamed the fields to be more generic - so don't really need comments about naming conventions ;)
XmlSerializer.Serialize has an overload that takes an XmlWriter.
You can create an XmlWriter that writes to an existing XElement by calling the CreateWriter method.
You can therefore write the following:
static readonly XmlSerializer xs = new XmlSerializer(typeof(Options));
...
var temp = new XElement("Parent");
using (var writer = temp.CreateWriter())
xs.Serialize(writer, options);
XmlOptions = (XElement)temp.FirstNode; // the serialized root element is the only child of the temporary parent
