Create a List from XElements Dynamically - c#

I am reading a bunch of XML files into a list (IEnumerable really) of XElements. Then I want to convert the XElement list (these XElements contain a bunch of child-elements) into a list of classes, so I can do later operations with the data more easily.
Now if I know in advance the structure of XElements, this would be easy; I'd just create a class that mimics the XElement structure and fill instances of it with the XElement contents. But here's the caveat; my XML file element structure is mostly similar, but there could be the odd element that has a different structure. To better illustrate the situation let me take an example.
Let's say my XML files contain a bunch of 'Person' elements. The Person elements have some common children that appear in ALL of them, but some children of Person can be found only in some of the elements.
For example all Person elements have these mandatory children:
<Person>
<Name/>
<Age/>
<City/>
<Country/>
</Person>
But, some Person elements may contain additional children as follows:
<Person>
<Name/>
<Age/>
<City/>
<Country/>
<EyeColor/>
<Profession/>
</Person>
To make things worse, these child elements can also have mostly similar structure that occasionally varies.
So is there a way that I can go through these XElements in just one loop, and put them into an instance that is somehow dynamically created, say, based on the element names or something similar? I could create a class with all the mandatory elements and leave a few additional member variables for the odd new ones, but that's not ideal for two reasons: one, it would be a waste of space, and two, there could be more child elements than I have extra variables in my class.
So I'm looking for a way to create the class instances dynamically to fit the XElement structure. In other words I'd really like to mimic the element structure right down to the deepest level.
Thanks in advance!

Personally, I think the best route is to get an XSD. If you cannot get one, make a serializable class that covers all the possibilities and reference that. For example: you have two fields, one that gets set sometimes and one that you have never seen set, but a spec somewhere says it may happen.
So let's make up a pretend class:
using System;
using System.Collections.Generic;
using System.Xml.Serialization;
namespace GenericTesting.Models
{
[Serializable()]
public class Location
{
[XmlAttribute()]
public int Id { get; set; }
[XmlAttribute()]
public double PercentUsed { get; set; }
[XmlElement]
public string ExtraGarbage { get; set; }
[XmlText]
public string UsedOnceInTheUniverse { get; set; }
}
}
And for the purpose of serializing/deserializing let me give extension methods for those:
using System.IO;
using System.Xml;
using System.Xml.Serialization;
namespace GenericTesting
{
static class ExtensionHelper
{
public static string SerializeToXml<T>(this T valueToSerialize)
{
var ns = new XmlSerializerNamespaces();
ns.Add("", "");
StringWriter sw = new StringWriter();
using (XmlWriter writer = XmlWriter.Create(sw, new XmlWriterSettings { OmitXmlDeclaration = true }))
{
var xmler = new XmlSerializer(valueToSerialize.GetType());
xmler.Serialize(writer, valueToSerialize, ns);
}
return sw.ToString();
}
public static T DeserializeXml<T>(this string xmlToDeserialize)
{
var serializer = new XmlSerializer(typeof(T));
using (TextReader reader = new StringReader(xmlToDeserialize))
{
return (T)serializer.Deserialize(reader);
}
}
}
}
And a simple main entry point in a console app:
static void Main(string[] args)
{
var locations = new List<Location>
{
new Location { Id = 1, PercentUsed = 0.5, ExtraGarbage = "really important I'm sure"},
new Location { Id = 2, PercentUsed = 0.6},
new Location { Id = 3, PercentUsed = 0.7},
};
var serialized = locations.SerializeToXml();
var deserialized = serialized.DeserializeXml<List<Location>>();
Console.ReadLine();
}
I know this is not exactly what you are asking for, but I personally think strongly typed is better for XML, and any third party you deal with should have at the very least some type of spec sheet or details on what they are giving you. Otherwise you are losing standards. XML should not be created dynamically through reflection or other means, as it is meant, if anything, to enforce strict typing.

If you just want to enumerate the child elements of <Person> and the XML is relatively small, you could use LINQ to XML:
var listOfElementChildNames = XDocument.Parse(xml).Element("Person")
.Elements()
.Select(e => e.Name)
.ToList();
Edit:
Instead of .Select(e => e.Name), we could map to any class:
public class Person
{
public string Name {get;set;}
public int Age {get;set;}
public string City {get;set;}
}
var xml = @"<Person>
<Name>John</Name>
<Age>25</Age>
<City>New York</City>
</Person>";
var people = XDocument.Parse(xml).Elements("Person")
.Select(p => new Person
{
Name = p.Element("Name").Value,
Age = int.Parse(p.Element("Age").Value),
City = p.Element("City").Value
}).ToList();
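Since the question says some children are optional, here is a minimal sketch of the same mapping that tolerates missing elements. It assumes the Person elements sit under a wrapper root (the name of the root does not matter here), and the PersonMapper class and the EyeColor property are made up for illustration. The explicit casts on XElement return null when the element is absent, whereas .Value would throw a NullReferenceException.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Person
{
    public string Name { get; set; }
    public int? Age { get; set; }
    public string City { get; set; }
    public string EyeColor { get; set; } // optional: stays null when the element is absent
}

// Hypothetical helper, named for this example only.
public static class PersonMapper
{
    // Assumes a wrapper root whose direct children are the <Person> elements.
    public static List<Person> Parse(string xml)
    {
        return XDocument.Parse(xml).Root
            .Elements("Person")
            .Select(p => new Person
            {
                // Casting a missing (null) XElement gives null instead of throwing.
                Name = (string)p.Element("Name"),
                Age = (int?)p.Element("Age"),
                City = (string)p.Element("City"),
                EyeColor = (string)p.Element("EyeColor")
            })
            .ToList();
    }
}
```

Note that an element that is present but empty, such as <Age/>, would still make the int? cast throw a FormatException, so this only covers the "element missing" case.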

Let me first apologize for the VB, but that is what I do.
If I understand what you are wanting you could use a Dictionary. I shortened your example to have fewer mandatory items, but hopefully you get the idea. Here is the person class that simply iterates the children adding them to the dictionary by their element name.
Public Class Person
Private _dict As New Dictionary(Of String, XElement)
Public Sub New(persEL As XElement)
'if the class is intended to modify the original XML
'use this declaration.
Dim aPers As XElement = persEL
'if the original XML will go away during the class lifetime
'use this declaration.
'Dim aPers As XElement =New XElement( persEL)
For Each el As XElement In aPers.Elements
Me._dict.Add(el.Name.LocalName, el)
Next
End Sub
'mandatory children are done like this
Public Property Name() As String
Get
Return Me._dict("Name").Value
End Get
Set(ByVal value As String)
Me._dict("Name").Value = value
End Set
End Property
Public Property Age() As Integer
Get
Return CInt(Me._dict("Age").Value)
End Get
Set(ByVal value As Integer)
Me._dict("Age").Value = value.ToString
End Set
End Property
'end mandatory children
Public Property OtherChildren(key As String) As String
Get
Return Me._dict(key).Value
End Get
Set(ByVal value As String)
Me._dict(key).Value = value
End Set
End Property
Public Function HasChild(key As String) As Boolean
Return Me._dict.ContainsKey(key)
End Function
End Class
Here is a simple test to see how it works
Dim onePersXE As XElement = <Person>
<Name>C</Name>
<Age>22</Age>
<Opt1>optional C1</Opt1>
<Opt2>optional C2</Opt2>
</Person>
Dim onePers As New Person(onePersXE)
onePers.Name = "new name"
onePers.Age = 42
onePers.OtherChildren("Opt1") = "new opt1 value"
onePers.OtherChildren("Opt2") = "opt 2 has new value"
As you can see there are two mandatory elements and in this case two optional children.
Here is another example to show how persons might work
Dim persons As XElement
persons = <persons>
<Person>
<Name>A</Name>
<Age>32</Age>
</Person>
<Person>
<Name>B</Name>
<Age>42</Age>
<Opt1>optional B1</Opt1>
<Opt2>optional B2</Opt2>
</Person>
</persons>
Dim persList As New List(Of Person)
For Each el As XElement In persons.Elements
persList.Add(New Person(el))
Next
Hope this at least gives you some ideas.
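For C# readers, the dictionary idea above can also be pushed down to every level of the tree, which is what the question asks for. This is only a sketch under my own assumptions: ElementNode is a made-up type, it keeps text only for leaf elements, and it ignores attributes and namespaces.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper type: mirrors an XElement tree of any shape.
public sealed class ElementNode
{
    public string Name { get; private set; }
    public string Text { get; private set; }   // text of a leaf element, null for containers
    public ILookup<string, ElementNode> Children { get; private set; }

    public static ElementNode From(XElement element)
    {
        return new ElementNode
        {
            Name = element.Name.LocalName,
            Text = element.HasElements ? null : element.Value,
            // A lookup keeps repeated child names and yields an empty
            // sequence (no exception) for names that are not present.
            Children = element.Elements()
                .Select(e => From(e))
                .ToLookup(child => child.Name)
        };
    }

    public bool Has(string childName)
    {
        return Children.Contains(childName);
    }

    // Text of the first child with that name, or null when there is none.
    public string TextOf(string childName)
    {
        var child = Children[childName].FirstOrDefault();
        return child == null ? null : child.Text;
    }
}
```

With that in place, something like xElements.Select(ElementNode.From).ToList() would build the whole list in one pass, and optional children are checked with Has or simply read with TextOf.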

Related

Adding an attribute to the root XML element when serializing an object

I have an object that I serialize into an XML file. Everything works as expected, but I'm wanting to add a Version attribute to the root element.
What's the best way to do this?
Here's an example of how I'm serializing:
MyProgram newProgram = new MyProgram()
{
ValueA = "A value.",
ValueB = "B value.",
ValueC = "C value."
};
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyProgram));
StreamWriter streamWriter = new StreamWriter(fileName);
xmlSerializer.Serialize(streamWriter, newProgram);
streamWriter.Close();
Right now, my XML looks something like this:
<MyProgram>
<ValueA>A value.</ValueA>
<ValueB>B value.</ValueB>
<ValueC>C value.</ValueC>
</MyProgram>
But I'd like to have this:
<MyProgram Version="1.0">
<ValueA>A value.</ValueA>
<ValueB>B value.</ValueB>
<ValueC>C value.</ValueC>
</MyProgram>
Thanks!
First
Modify your class to be like this, assuming you have a concrete class and not an anonymous one
[XmlRoot("MyProgram")]//specifies the name of the root element
public class MyProgram
{
[XmlAttribute("Version")]//name not required unless you want to change output to something different
public string Version{get;set;}
[XmlElement("ValueA")]//again, name not required if the name is the same
public ValueA ValueA{get;set;}
....
}
Then creating the MyProgram class as you specified will give you the desired output, also note that you might need to add XML tags to the ValueA/B/C classes if the end result is not as desired.
The second
Serialize the XML data to a string, and use simple Regex/String manipulation to insert the desired value and then save the string to the desired location
Third
You can use XElement to load (or create) your XML and then set the Version attribute
XElement root = XElement.Load("Your XML location"); // Load returns the root element, here <MyProgram>
root.SetAttributeValue("Version", "1.0");
root.Save("Your XML location");
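To make the first option concrete, here is a small self-contained sketch. It assumes ValueA, ValueB and ValueC are plain strings, as in the question's initializer, and MyProgramWriter is a made-up helper that serializes to a string so the result is easy to inspect.

```csharp
using System.IO;
using System.Xml.Serialization;

[XmlRoot("MyProgram")]
public class MyProgram
{
    // Written as an attribute on the root element instead of a child element.
    [XmlAttribute("Version")]
    public string Version { get; set; }

    public string ValueA { get; set; }
    public string ValueB { get; set; }
    public string ValueC { get; set; }
}

// Hypothetical helper for this example.
public static class MyProgramWriter
{
    public static string ToXml(MyProgram program)
    {
        var serializer = new XmlSerializer(typeof(MyProgram));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, program);
            return writer.ToString();
        }
    }
}
```

Setting Version = "1.0" on the instance before calling ToXml should put Version="1.0" on the root element; the xmlns:xsi and xmlns:xsd declarations that XmlSerializer adds by default will appear there as well.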

How to deserialize only part of a large xml file to c# classes?

I've already read some posts and articles on how to deserialize xml but still haven't figured out the way I should write the code to match my needs, so.. I'm apologizing for another question about deserializing xml ))
I have a large (50 MB) xml file which I need to deserialize. I use xsd.exe to get the xsd schema of the document and then autogenerate a c# classes file which I put into my project. I want to get some (not all) data from this xml file and put it into my sql database.
Here is the hierarchy of the file (simplified, xsd is very large):
public class yml_catalog
{
public yml_catalogShop[] shop { /*realization*/ }
}
public class yml_catalogShop
{
public yml_catalogShopOffersOffer[][] offers { /*realization*/ }
}
public class yml_catalogShopOffersOffer
{
// here goes all the data (properties) I want to obtain ))
}
And here is my code:
first approach:
yml_catalogShopOffersOffer catalog;
var serializer = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
var reader = new StreamReader(@"C:\div_kid.xml");
catalog = (yml_catalogShopOffersOffer) serializer.Deserialize(reader);//exception occurs
reader.Close();
I get InvalidOperationException: There is an error in the XML(3,2) document
second approach:
XmlSerializer ser = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
yml_catalogShopOffersOffer result;
using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
{
result = (yml_catalogShopOffersOffer)ser.Deserialize(reader); // exception occurs
}
InvalidOperationException: There is an error in the XML(0,0) document
third: I tried to deserialize the entire file:
XmlSerializer ser = new XmlSerializer(typeof(yml_catalog)); // exception occurs
yml_catalog result;
using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
{
result = (yml_catalog)ser.Deserialize(reader);
}
And I get the following:
error CS0030: The conversion of type "yml_catalogShopOffersOffer[]" into "yml_catalogShopOffersOffer" is not possible.
error CS0029: The implicit conversion of type "yml_catalogShopOffersOffer" into "yml_catalogShopOffersOffer[]" is not possible.
So, how to fix (or overwrite) the code to not get the exceptions?
edits: Also when I write:
XDocument doc = XDocument.Parse(@"C:\div_kid.xml");
The XmlException occurs: unpermitted data on root level, string 1, position 1.
Here is the first string of the xml file:
<?xml version="1.0" encoding="windows-1251"?>
edits 2:
The xml file short example:
<?xml version="1.0" encoding="windows-1251"?>
<!DOCTYPE yml_catalog SYSTEM "shops.dtd">
<yml_catalog date="2012-11-01 23:29">
<shop>
<name>OZON.ru</name>
<company>?????? "???????????????? ??????????????"</company>
<url>http://www.ozon.ru/</url>
<currencies>
<currency id="RUR" rate="1" />
</currencies>
<categories>
<category id="1126233">base category</category>
<category id="1127479" parentId="1126233">bla bla bla</category>
// here goes all the categories
</categories>
<offers>
<offer>
<price></price>
<picture></picture>
</offer>
// other offers
</offers>
</shop>
</yml_catalog>
P.S.
I've already accepted the answer (it's perfect). But now I need to find the "base category" for each Offer using categoryId. The data is hierarchical and the base category is the category that has no "parentId" attribute. So, I wrote a recursive method to find the "base category", but it never finishes. Seems like the algorithm is not very fast))
Here is my code: (in the main() method)
var doc = XDocument.Load(@"C:\div_kid.xml");
var offers = doc.Descendants("shop").Elements("offers").Elements("offer");
foreach (var offer in offers.Take(2))
{
var category = GetCategory(categoryId, doc);
// here goes other code
}
Helper method:
public static string GetCategory(int categoryId, XDocument document)
{
var tempId = categoryId;
var categories = document.Descendants("shop").Elements("categories").Elements("category");
foreach (var category in categories)
{
if (category.Attribute("id").ToString() == categoryId.ToString())
{
if (category.Attributes().Count() == 1)
{
return category.ToString();
}
tempId = Convert.ToInt32(category.Attribute("parentId"));
}
}
return GetCategory(tempId, document);
}
Can I use recursion in such situation? If not, how else can I find the "base category"?
Give LINQ to XML a try:
XElement result = XElement.Load(@"C:\div_kid.xml");
Querying in LINQ is brilliant but admittedly a little weird at the start. You select nodes from the document in a SQL-like syntax, or using lambda expressions. Then you create anonymous objects (or use existing classes) containing the data you are interested in.
Best is to see it in action.
miscellaneous examples of LINQ to XML
simple sample using xquery and lambdas
sample denoting namespaces
There is tons more on msdn. Search for LINQ to XML.
Based on your sample XML and code, here's a specific example:
var element = XElement.Load(@"C:\div_kid.xml");
var shopsQuery =
from shop in element.Descendants("shop")
select new
{
Name = (string) shop.Descendants("name").FirstOrDefault(),
Company = (string) shop.Descendants("company").FirstOrDefault(),
Categories =
from category in shop.Descendants("category")
select new {
Id = category.Attribute("id").Value,
Parent = (string) category.Attribute("parentId"), // null for a base category, which has no parentId
Name = category.Value
},
Offers =
from offer in shop.Descendants("offer")
select new {
Price = (string) offer.Descendants("price").FirstOrDefault(),
Picture = (string) offer.Descendants("picture").FirstOrDefault()
}
};
foreach (var shop in shopsQuery){
Console.WriteLine(shop.Name);
Console.WriteLine(shop.Company);
foreach (var category in shop.Categories)
{
Console.WriteLine(category.Name);
Console.WriteLine(category.Id);
}
foreach (var offer in shop.Offers)
{
Console.WriteLine(offer.Price);
Console.WriteLine(offer.Picture);
}
}
As an extra: Here's how to deserialize the tree of categories from the flat category elements.
You need a proper class to house them, for the list of Children must have a type:
class Category
{
public int Id { get; set; }
public int? ParentId { get; set; }
public List<Category> Children { get; set; }
public IEnumerable<Category> Descendants {
get
{
return (from child in Children
select child.Descendants).SelectMany(x => x).
Concat(new Category[] { this });
}
}
}
To create a list containing all distinct categories in the document:
var categories = (from category in element.Descendants("category")
orderby int.Parse( category.Attribute("id").Value )
select new Category()
{
Id = int.Parse(category.Attribute("id").Value),
ParentId = category.Attribute("parentId") == null ?
null as int? : int.Parse(category.Attribute("parentId").Value),
Children = new List<Category>()
}).Distinct().ToList();
Then organize them into a tree (Heavily borrowed from flat list to hierarchy):
var lookup = categories.ToLookup(cat => cat.ParentId);
foreach (var category in categories)
{
category.Children = lookup[category.Id].ToList();
}
var rootCategories = lookup[null].ToList();
To find the root which contains theCategory:
var root = (from cat in rootCategories
where cat.Descendants.Contains(theCategory)
select cat).FirstOrDefault();
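On the P.S. in the question: as far as I can tell, the recursive GetCategory never finishes because category.Attribute("id").ToString() produces the whole attribute text (id="1126233") rather than its value, so the comparison never matches and the method calls itself again with the same id. If all you need is the base category for an id, a plain loop over an id-to-parent map may be enough. The sketch below assumes category ids are unique within the document; CategoryLookup and its method names are invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for this example.
public static class CategoryLookup
{
    // Maps each category id to its parent id (null when there is no parentId attribute).
    // Assumes ids are unique; ToDictionary throws on duplicates.
    public static Dictionary<string, string> BuildParentMap(XContainer document)
    {
        return document.Descendants("category")
            .ToDictionary(
                c => (string)c.Attribute("id"),
                c => (string)c.Attribute("parentId"));
    }

    // Walks up the parent chain; returns null for an unknown id or a cyclic chain.
    public static string FindBaseCategory(string categoryId, IDictionary<string, string> parentOf)
    {
        var visited = new HashSet<string>();
        var current = categoryId;
        while (current != null && visited.Add(current))
        {
            string parent;
            if (!parentOf.TryGetValue(current, out parent))
                return null;        // id not present in the document
            if (parent == null)
                return current;     // no parentId: this is the base category
            current = parent;
        }
        return null;                // we came back to an id already seen
    }
}
```

Build the map once with BuildParentMap(doc) and reuse it for every offer; each lookup then only costs as many steps as the category is deep.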

Deserialization : several kind of nodes into the same object

I'm trying to deserialize an XML document, one of its nodes can be represented like this :
<n1 zone="00000" id="0000" />
or this :
<n2 zone="00000" id="0000" />
or this :
<n3 zone="00000" id="0000" />
In my document I will always have one "n1" node or one "n2" node or one "n3" node. I'd like to deserialize all these fragments into an instance of this class :
[Serializable]
public class N
{
[XmlAttribute("zone")]
public string Zone { get; set; }
[XmlAttribute("id")]
public string Id { get; set; }
}
But I didn't manage to do that. The documentation suggests to use the XmlChoiceIdentifier attribute in order to accomplish this, but maybe I used it in a wrong way.
Any idea ?
PS : I know I can create three classes : N1, N2 and N3, and map them to my different types of XML fragments. But I'd prefer a cleaner solution.
Assuming you have something like:
<?xml version="1.0"?>
<ns>
<n1 id="0000" zone="00000"/>
<n2 id="0000" zone="00000"/>
<n3 id="0000" zone="00000"/>
</ns>
You could use LINQ to XML:
XDocument document = XDocument.Load(path);
XElement parentNode = document.Element("ns");
var childNodes = parentNode.Elements().Where(x => x.Name.ToString().StartsWith("n")); // skips elements whose names don't start with "n"
List<N> list = new List<N>();
foreach (XElement element in childNodes) {
N n = new N(){
Id = element.Attribute("id").Value,
Zone = element.Attribute("zone").Value
};
list.Add(n);
}
Here is a fairly understandable read on how to use XmlChoiceIdentifier:
http://msdn.microsoft.com/en-us/magazine/cc164135.aspx
If you are really having trouble with this, then you could always use an XSL transform to do the mapping first.
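For reference, this is roughly how the XmlChoiceIdentifier pattern looks when n1, n2 and n3 all map to the same N class. Treat it as a sketch: the Container wrapper, the "ns" root name, the NKind enum and the ContainerReader helper are assumptions for illustration, and the document is assumed to contain exactly one of the three elements.

```csharp
using System.IO;
using System.Xml.Serialization;

public class N
{
    [XmlAttribute("zone")]
    public string Zone { get; set; }
    [XmlAttribute("id")]
    public string Id { get; set; }
}

// The member names must match the element names (or be mapped with [XmlEnum]).
[XmlType(IncludeInSchema = false)]
public enum NKind { n1, n2, n3 }

[XmlRoot("ns")] // hypothetical parent element
public class Container
{
    // All three element names deserialize into the same property...
    [XmlElement("n1", typeof(N))]
    [XmlElement("n2", typeof(N))]
    [XmlElement("n3", typeof(N))]
    [XmlChoiceIdentifier("Kind")]
    public N Node { get; set; }

    // ...and this member records which element name was actually used.
    [XmlIgnore]
    public NKind Kind { get; set; }
}

// Hypothetical helper for this example.
public static class ContainerReader
{
    public static Container Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Container));
        using (var reader = new StringReader(xml))
        {
            return (Container)serializer.Deserialize(reader);
        }
    }
}
```

The same Kind value also tells the serializer which element name to write when you serialize the object back out.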
You could use LINQ-To-Xml
XDocument x = XDocument.Parse(
"<root><n1 zone=\"0000\" id=\"0000\"/><n2 zone=\"0001\" id=\"0011\"/><n3 zone=\"0002\" id=\"0022\"/></root>");
var result = from c in x.Element("root").Descendants()
select new N { Zone = c.Attribute("zone").Value,
Id = c.Attribute("id").Value };
It's not using a serializer, which I think you are aiming for but it's a way forward.

c# XML and LINQ

I have an XML doc that looks like so:
<people>
<person>
<name>mike</name>
<address>1 main st</address>
<jobTitle>SE</jobTitle>
<children>
<name>mary</name>
<age>5</age>
</children>
</person>
<person>
<name>john</name>
<address>2 main st</address>
<jobTitle>SE</jobTitle>
</person>
</people>
So not all of the person blocks have a children block. Pretty simple. When I add a new person to the XML via C#, I am writing a function that takes a person object, and that person object has a collection of children objects (which may be 0 or more). I am having trouble writing the LINQ in that function. I can easily add a person object, but conditionally adding 1 or more children is tough. Here is what I have so far:
doc.Element("People").Add(
new XElement("Person",
new XElement("Name", person.name),
new XElement("Address", person.address),
new XElement("jobTitle", person.jobTitle)))
how can I conditionally add the children if they exist?
public class person
{
public List<child> childList;
public string name;
public string address;
public string jobTitle;
}
public class child
{
public string name;
public int age;
}
how can I conditionally add the children if they exist?
Three options:
Use a null argument in the XElement call; that will be ignored
Pass in an empty sequence of children; again, this will be irrelevant
Build the rest of the element, then just conditionally call Add afterwards.
It's hard to give more concrete advice without seeing the code for your Person type.
(As an aside, it looks like your element should actually be child rather than children, assuming that you have one element per child...)
EDIT: Now that we can see your code, it looks like you just want:
doc.Element("People").Add(
new XElement("Person",
new XElement("Name", person.name),
new XElement("jobTitle", person.jobTitle),
person.childList.Select(c => new XElement("children",
new XElement("Name", c.name),
new XElement("Age", c.age)))));
Note that you're currently being very inconsistent with your capitalization when it comes to element names, and also it's a bad idea to expose public fields like this.
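As a small illustration of the "empty sequence" option, here is a self-contained sketch. The Person and Child classes below are simplified stand-ins with properties rather than the question's exact types, the element names are my own choice, and PersonXml is a made-up helper.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Child
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public List<Child> Children { get; set; }
}

// Hypothetical helper for this example.
public static class PersonXml
{
    public static XElement ToElement(Person person)
    {
        // Treat a null list like an empty one; an empty sequence adds no elements.
        var children = person.Children ?? new List<Child>();
        return new XElement("person",
            new XElement("name", person.Name),
            children.Select(c => new XElement("child",
                new XElement("name", c.Name),
                new XElement("age", c.Age))));
    }
}
```

Because XElement expands any IEnumerable passed as content, no explicit "if there are children" branch is needed: a person without children simply ends up with only the name element.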

C# object to XML

I am creating an application which requires to convert c# object to XML.
I am using XML Serializer class to achieve this. Here is the code snippet:
public class Anwer
{
public int ID { get; set; }
public string XML { get; set; }
public Anwer(int ID, string XML)
{
this.ID = ID;
this.XML = XML;
}
public Anwer() { }
}
Here is the main function:
string AnswerXML = @"<Answer>1<Answer>";
List<Anwer> answerList = new List<Anwer>();
answerList.Add(new Anwer(1,AnswerXML));
AnswerXML = @"<Answer>2<Answer>";
answerList.Add(new Anwer(2, AnswerXML));
XmlSerializer x = new XmlSerializer(answerList.GetType());
x.Serialize(Console.Out, answerList);
The output is:
<?xml version="1.0" encoding="IBM437"?>
<ArrayOfAnwer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Anwer>
<ID>1</ID>
<XML>&lt;Answer&gt;1&lt;Answer&gt;</XML>
</Anwer>
<Anwer>
<ID>2</ID>
<XML>&lt;Answer&gt;2&lt;Answer&gt;</XML>
</Anwer>
</ArrayOfAnwer>
In the above output '<' and '>' are getting replaced by '&lt;' and '&gt;'.
How to avoid this?
I know string replace is one of the way, but I don't want to use it.
Thanks in advance.
You don't, basically. That's correctly serializing the object - the XML serializer doesn't want to have to deal with XML within strings messing things up, so it escapes the XML appropriately.
If you deserialize the XML later, you'll get back to the original object data.
If you're trying to build up an XML document in a custom fashion, I suggest you don't use XML serialization to start with. Either use LINQ to XML if you're happy to create elements etc explicitly, or if you really, really want to include arbitrary strings directly in your output, use XmlWriter.
If you could give us more information about the bigger picture of what you're trying to do, we may be able to suggest better alternatives - building XML strings directly is almost never a good idea.
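To see the round trip for yourself, a minimal sketch along these lines should do. The Answer class and the RoundTrip helper are made up for this example and are not the question's exact types.

```csharp
using System.IO;
using System.Xml.Serialization;

public class Answer
{
    public int ID { get; set; }
    public string XML { get; set; }
}

// Hypothetical helper for this example.
public static class RoundTrip
{
    public static string Serialize(Answer value)
    {
        var serializer = new XmlSerializer(typeof(Answer));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();   // markup characters in XML are escaped here
        }
    }

    public static Answer Deserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Answer));
        using (var reader = new StringReader(xml))
        {
            return (Answer)serializer.Deserialize(reader); // escapes are undone here
        }
    }
}
```

Serializing an Answer whose XML property is "<Answer>1</Answer>" gives text containing &lt;Answer, and deserializing that text gives back the original string unchanged, so nothing is lost by the escaping.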
XmlSerializer won't believe you that an element is xml unless you convince it, for example by exposing that property as an XmlDocument. Otherwise, it (correctly, IMO) always encodes such values. For example:
using System;
using System.Xml;
using System.Xml.Serialization;
public class Anwer
{
public int ID { get; set; }
public XmlDocument XML { get; set; }
public Anwer(int ID, string XML)
{
this.ID = ID;
XmlDocument doc = new XmlDocument();
doc.LoadXml(XML);
this.XML = doc;
}
public Anwer()
{ }
}
static class Program
{
static void Main()
{
var answer = new Anwer(123, "<Answer>2</Answer>");
var ser = new XmlSerializer(answer.GetType());
ser.Serialize(Console.Out, answer);
}
}
I am creating an application which requires to convert c# object to XML. I am using XML Serializer class to achieve this
If you're using the XML Serializer to do the work, then why the "XML" field where you're inserting hand-coded XML? Seems like you want something more like this (using your class name, though it looks like a misspelling):
public class Anwer
{
public int ID { get; set; }
public int Answer { get; set; }
}
..
List<Anwer> answerList = new List<Anwer>() {
new Anwer { ID=1, Answer=2 },
new Anwer { ID=2, Answer=3 },
};
XmlSerializer x = new XmlSerializer(answerList.GetType());
x.Serialize(Console.Out, answerList);
..
<ArrayOfAnwer ...>
<Anwer>
<ID>1</ID>
<Answer>2</Answer>
</Anwer>
...
Or if you actually want/need the Answer element to be nested in an XML element for some reason, you can alter your Anwer object to reflect that structure (as Oleg Kalenchuk suggests), or generate the XML yourself rather than using the serializer:
XElement xml = new XElement("AnwerList",
from anwer in answerList select
new XElement("Anwer",
new XElement("ID", anwer.ID),
new XElement("XML",
new XElement("Answer", anwer.Answer)
)
)
);
Console.Out.WriteLine(xml);
<AnwerList>
<Anwer>
<ID>1</ID>
<XML>
<Answer>2</Answer>
</XML>
</Anwer>
...
I prefer the latter anyway, because it gives you more control.
You're assigning a string containing the < and > signs to the XML element, so naturally the serializer replaces < and > with entity references. Even though you see &gt; in the serialized text, you'll get > back in your text when you deserialize the XML.
Create a new class AnswerXML with one integer "Answer" member
Change type of XML member to AnswerXML instead of string
Because '<' and '>' are characters used for the XML structure itself, they are automatically encoded (escaped). When you read it back in your app and deserialize it, '&lt;' and '&gt;' should be converted back to '<' and '>'.
If your goal is otherwise, use HTML-decode functionality.
If this doesn't help, just tell us what exactly you want to do with the XML data.
