Is there a good way in C# to walk an XML node list using the DOM and get a list of only the unique nodes, plus a list of each node's possible attributes?
The XML file in question has nodes with the same name but different attributes, and I want a list of all the possible ones. I would also like the node list to contain each node name only once rather than repeats (the node lists I generate at the moment might contain contact twice, three times, etc.). It needs to work for any XML document. Any ideas?
Here is an example:
<book id="bk112">
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
<genre>Computer</genre>
<price>49.95</price>
<publish_date>2001-04-16</publish_date>
</book>
<book id="bk162">
<genre>fiction</genre>
<popularity>High</popularity>
<price>20.00</price>
<publish_date>2002-03-12</publish_date>
</book>
<cd id="bk162">
<genre>jaz</genre>
<popularity>High</popularity>
<price>10.00</price>
</cd>
and get some sort of output like:
there are 2 of the type book
there are 1 of the type cd
there are 3 of the type genre
book may have the attributes author, title, genre, price, popularity, publish_date
but in a way that works for any xml file.
In the case of genre it doesn't need to be clever in any way; it just needs to know there are 3 genre nodes in the document.
Would this do it?
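XDocument xDoc = XDocument.Load("XMLFile1.xml");
// Group by element name and keep one representative per distinct name.
// (Filtering on Count() == 1 would drop every name that occurs more than once.)
List<XElement> distinctDocs = xDoc.Descendants().GroupBy(x => x.Name).Select(g => g.First()).ToList();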
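The output the question asks for (a count per element name, plus the child names a given element may have) can be sketched with the same GroupBy idea. This is only a sketch under stated assumptions: the sample fragment is wrapped in a hypothetical `<catalog>` root because the question's XML has none, and `ElementSummary`, `CountByName` and `ChildNamesOf` are illustrative names, not framework APIs. Note that what the question calls "attributes" are child elements; real XML attributes such as `id` would be read with `Attributes()` instead of `Elements()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementSummary
{
    // Number of elements per element name, anywhere in the document
    // (the root element is counted too).
    public static Dictionary<string, int> CountByName(XDocument doc)
    {
        return doc.Descendants()
                  .GroupBy(e => e.Name.LocalName)
                  .ToDictionary(g => g.Key, g => g.Count());
    }

    // Distinct child-element names seen under any element called parentName.
    public static List<string> ChildNamesOf(XDocument doc, string parentName)
    {
        return doc.Descendants(parentName)
                  .Elements()
                  .Select(e => e.Name.LocalName)
                  .Distinct()
                  .ToList();
    }

    public static void Main()
    {
        // Hypothetical <catalog> wrapper around a shortened copy of the sample.
        XDocument doc = XDocument.Parse(
            "<catalog>" +
            "<book id='bk112'><author>Galos, Mike</author><genre>Computer</genre></book>" +
            "<book id='bk162'><genre>fiction</genre><popularity>High</popularity></book>" +
            "<cd id='bk162'><genre>jaz</genre></cd>" +
            "</catalog>");

        foreach (KeyValuePair<string, int> pair in CountByName(doc))
            Console.WriteLine("there are " + pair.Value + " of the type " + pair.Key);

        Console.WriteLine("book may have: " + string.Join(", ", ChildNamesOf(doc, "book")));
    }
}
```

Because nothing here depends on specific element names, the same two methods should work for any well-formed document.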
Given an example XML file as such:
<libraries>
<library name="some library">
<book name="my book"/>
<book name="your book"/>
</library>
<library name="another library">
<book name="his book"/>
<book name="her book"/>
</library>
</libraries>
How would one iterate through each library and get only its children? E.g. if I was in the first library element and I went to retrieve all its descendants/children, it would only return with the two books inside it.
I've tried iterating and using XElement.Elements("book"), XElement.Elements(), XElement.Descendants(), etc. but all return every element that is a book (so it would pull the elements from the second library, too). Mostly I think I'm just struggling with understanding how XDocument keeps track of its elements and what's considered a descendant/child.
If possible, if one could explain as to how this would be done with XDocument for an element at any level it'd be appreciated (e.g. if each book had child elements, and if those elements had child elements, etc).
You can iterate over your XML by going through all the library descendants in the following way.
XDocument doc = XDocument.Load(XmlPath);
foreach (var item in doc.Descendants("library"))
{
    // The nodes (here, the book elements) inside this library only
    IEnumerable<XNode> nodes = item.DescendantNodes();
}
Sheer,
The problem is you are pulling all elements with "book".
If you want only the items that belong to a particular parent element, you have to supply a proper condition.
var v = from n in doc.Descendants("library")
where n.Attribute("name").Value == "some library"
select n.DescendantNodes();
Now, this will give you the descendant nodes of the library whose name is "some library".
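For the "any level" part of the question, it may help to see that `Elements()` and `Descendants()` are always relative to the element they are called on: `Elements()` returns only that element's direct children, while `Descendants()` returns everything beneath it. Calling either on the document or the root is what pulls in books from every library. A minimal sketch using the sample XML from the question (`LibraryWalk` and `BookNames` are illustrative names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LibraryWalk
{
    // Names of the books that are direct children of the given library element.
    public static List<string> BookNames(XElement library)
    {
        return library.Elements("book")
                      .Select(b => (string)b.Attribute("name"))
                      .ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<libraries>" +
            "<library name='some library'><book name='my book'/><book name='your book'/></library>" +
            "<library name='another library'><book name='his book'/><book name='her book'/></library>" +
            "</libraries>");

        // Walk the libraries one by one; each call below only sees that library's children.
        foreach (XElement library in doc.Root.Elements("library"))
        {
            Console.WriteLine((string)library.Attribute("name"));
            foreach (string name in BookNames(library))
                Console.WriteLine("  " + name);
        }
    }
}
```

If each book had its own children, the same pattern repeats one level down: call `Elements()` on the book element.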
Currently I have a working C# program that works as follows:
Accept .xls template with values (xls is manually created by user)
Save the values (matching fields) to the database
Convert and write .xls to XML. Please see below sample output:
Existing XML Structure
Now, what I want to do is:
Read the existing xml (the created xml)
Insert another set of nodes and subnodes (ReleaseLine and sub nodes). It must accept multiple ReleaseLine.
Save/create the new xml with appended nodes. Please see below output:
This is what I'm looking for:
My existing C# program is simple, but the XML hierarchy is bloody deep. I just created the C# code using the new XElement method, passing values for each node. Then I simply use the xmlDocument.Save() method to write the XML.
To add nodes or append content to existing XML data I'd use LINQ to XML.
XElement xml = XElement.Load("file.xml");
xml.Add( new XElement("uberNode",
new XElement("childNode", content),
new XElement("anotherChildNode", content)));
xml.Save("file.xml");
Here are some other related solutions.
Add to specific node (with example):
Given the following existing XML data:
`<Names>
<Name>
<prename>John</prename>
<lastname>Snow</lastname>
</Name>
<Name>
<prename>Harry</prename>
<lastname>Harry</lastname>
</Name>
</Names>`
Now I want to add an "age"-tag before the first "prename"-tag and a "family"-tag after the first "lastname"-tag.
XElement xml = XElement.Load("file.xml");
var childrens = xml.DescendantsAndSelf().ToArray();
var first_prename = childrens[2];
var first_lastname = childrens[3];
Console.WriteLine(childrens[0]); //prints out the whole content
first_prename.AddBeforeSelf(new XElement("age", 22));
first_lastname.AddAfterSelf(new XElement("family", new XElement("mother", "paula"), new XElement("father", "paul")));
xml.Save("file.xml");
Outcome:
`<Names>
<Name>
<age>22</age>
<prename>John</prename>
<lastname>Snow</lastname>
<family>
<mother>paula</mother>
<father>paul</father>
</family>
</Name>
<Name>
<prename>Harry</prename>
<lastname>Harry</lastname>
</Name>
</Names>`
I was facing the same problem, and LINQ gave me the easiest way to solve it!
There are other similar ways too, but after trying a bit more, DescendantsAndSelf() made it easier for me to walk the tree.
I found an answer to my question, here is the link http://www.xmlplease.com/add-xml-linq
Using XPathSelectElement method, I was able to find the right node and appended new block of XElement.
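The question's real XML is not shown, so here is only a sketch of the XPathSelectElement approach against the small Names sample from the other answer. `XPathSelectElement` is an extension method from `System.Xml.XPath`; the `XPathAppend` class, the `AppendTo` helper and the XPath expression are assumptions made for illustration.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathAppend
{
    // Finds the first element matching the XPath expression and appends newChild to it.
    public static XElement AppendTo(XDocument doc, string xpath, XElement newChild)
    {
        XElement target = doc.XPathSelectElement(xpath);
        if (target == null)
            throw new InvalidOperationException("No element matches " + xpath);

        target.Add(newChild);
        return target;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Names>" +
            "<Name><prename>John</prename><lastname>Snow</lastname></Name>" +
            "<Name><prename>Harry</prename><lastname>Harry</lastname></Name>" +
            "</Names>");

        // XPath positions are 1-based, so Name[2] is the second <Name>.
        AppendTo(doc, "/Names/Name[2]", new XElement("age", 30));
        Console.WriteLine(doc);
    }
}
```

Calling `AppendTo` repeatedly with the same XPath would add several blocks under the same parent, which is one way to handle multiple ReleaseLine nodes.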
I'm really new to Linq and C# and I'm stuck on what is probably an obvious problem.
I have an existing XML file
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<books>
<book>
<title>This is Title 1</title>
<author>John Doe</author>
<categories>
<category>How to</category>
<category>Technical</category>
</categories>
</book>
<book>
<title>This is Title 2</title>
<author>Jane Brown</author>
<categories>
<category>Fantasy</category>
</categories>
</book>
</books>
I want to add a 2nd category to the second book in this file.
I've gotten this far:
var thiscat = doc.Root
.Element("book")
.Element("categories");
thiscat.Add(new XElement("category", "novel"));
But this adds a 3rd category to the first book. I need to learn how to point 'thiscat' at the last categories element rather than the first one. I've been sniffing around LastNode but haven't managed to get the syntax right.
This is my first question here. Please let me know if I'm not being clear or if I'm doing anything wrong.
Pete,
Here is an example that will search for the book by title This is Title 2 and add another category.
var elem = doc.Root.Elements("book").FirstOrDefault(x => x.Element("title").Value.Equals("This is Title 2"));
if (elem != null)
{
var category = elem.Element("categories");
category.Add(new XElement("category", "novel"));
}
Edit: More explanation.
First we search the document's book elements for one whose title matches This is Title 2 (effectively your second entry). By calling the FirstOrDefault extension method we get either the first matching element (as an XElement) or null.
Because we could get null, we must check for it; if the value is not null, we move on to the next step of locating the categories element. This can be done by simply calling the elem.Element() method, as we only expect one element.
Finally we add a new XElement to the category element.
Hope this helps.
Cheers.
To answer your question quite literally, you could modify the statement as follows:
var thiscat = doc.Root
.Elements("book")
.Skip(1)
.First()
.Element("categories");
The "Element" function returns the first element of that type found. In this case, we used "Elements" instead to return an IEnumerable containing all of the elements named "book", and then we used the LINQ "skip" function to skip the first (returning another IEnumerable of all the remaining elements), and then we took just the first element in the IEnumerable (back to a single XElement).
Another way you could have gotten to the answer is as follows:
var thiscat = doc.Root
.Element("book")
.ElementsAfterSelf()
.First()
.Element("categories");
ElementsAfterSelf returns an IEnumerable of all the sibling elements after the calling object.
LINQ is a really critical part of programming in C# and it's good to see you're trying to learn it from the beginning. Although your methodology here in adding a specific element to a specific place programmatically is questionable (obviously it is a contrived example), in playing around like this you will probably learn a bit about LINQ and that is always good.
First you should get your second book element. According to your code:
var thiscat = doc.Root
.Element("book")
.Element("categories");
This statement returns just one categories element, the one that belongs to your first book, because you are using Element instead of Elements. Let's go step by step.
One way to get the second element is to use Descendants with ElementAt:
var secondBook = doc.Descendants("book").ElementAt(1);
Descendants returns a sequence of your book elements. It is an IEnumerable<XElement>, which has no indexer, so we take the second element with ElementAt(1). Now we need to select the categories element under that book element.
var categories = secondBook.Element("categories");
Now we have our categories element and we can add our new category and save Xml Document:
categories.Add(new XElement("category", "novel"));
doc.Save(path);
And that's all. If you understand that logic you can modify your XML file however you like. You can also do all of this in one statement:
doc.Descendants("book").ElementAt(1)
.Element("categories")
.Add(new XElement("category", "novel"));
This should work (a slightly lengthy solution, as it helps to understand the fundamentals better):
XmlElement rootNode = xd.DocumentElement; //gives <books> the root node
XmlNodeList cnodes= rootNode.ChildNodes; //gets the childnodes of <books>
XmlNode secondBook= cnodes.Item(1); //second child of <books> i.e., the <book> you want
XmlNodeList bnodes= secondBook.ChildNodes; //gets the childnodes of that <book>
XmlNode categories= bnodes.Item(2); //gets the third child i.e.,<categories>
//making the new <category> node; it must be created by xd, the document that owns <categories>,
//because AppendChild throws an ArgumentException for a node created by a different XmlDocument
XmlElement newNode = xd.CreateElement("category");
newNode.InnerText = "novel";
//making the new node completes
categories.AppendChild(newNode); //append the new node to <categories> as a child
I have several XDocuments that look like:
<Test>
<element
location=".\jnk.txt"
status="(modified)"/>
<element
location=".\jnk.xml"
status="(overload)"/>
</Test>
In C#, I create a new XDocument:
XDocument mergedXmlDocs = new XDocument(new XElement("ACResponse"));
And try to add the nodes from the other XDocuments:
for (ti = 0; (ti < 3); ++ti)
{
var query = from xElem in xDocs[(int)ti].Descendants("element")
select new XElement(xElem);
foreach (XElement xElem in query)
{
mergedXmlDocs.Add(xElem);
}
}
At runtime I get an error about how the Add would create a badly-formed document.
What am I doing wrong?
Thanks...
(I saw this question -- Merge XML documents -- but creating an XSLT transform seemed like extra trouble for what seems like a simple operation.)
You are very close. Try changing the line
mergedXmlDocs.Add(xElem);
to
mergedXmlDocs.Root.Add(xElem);
The problem is that each XML document can only contain 1 root node. Your existing code is trying to add all of the nodes at the root level. You need to add them to the existing top level node instead.
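Put together, the merge might look like the sketch below. The `XmlMerge` class and its `Merge` method are illustrative names, and the source documents are built inline rather than loaded from files. Adding an element that already has a parent to another tree makes LINQ to XML copy it, so the source documents are left intact.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlMerge
{
    // Copies every element named elementName from each source into one new document.
    public static XDocument Merge(IEnumerable<XDocument> sources, string rootName, string elementName)
    {
        XDocument merged = new XDocument(new XElement(rootName));
        foreach (XDocument source in sources)
        {
            // Add to Root, not to the document: a document may hold only one root element.
            merged.Root.Add(source.Descendants(elementName));
        }
        return merged;
    }

    public static void Main()
    {
        XDocument first = XDocument.Parse(
            "<Test><element location='.\\jnk.txt' status='(modified)'/></Test>");
        XDocument second = XDocument.Parse(
            "<Test><element location='.\\jnk.xml' status='(overload)'/></Test>");

        XDocument merged = Merge(new[] { first, second }, "ACResponse", "element");
        Console.WriteLine(merged);
    }
}
```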
I am not sure what programming language you are using, but most programming languages have extensive XML support classes. Most of them allow parsing and even adding elements. I would keep one main file around and then parse each new one, adding the elements from the new one into the master.
EDIT: Sorry it looks like you are already doing exactly this.
I have an XmlDocument which I can traverse with XmlNode or convert it to a XDocument and traverse it via LINQ.
<Dataset>
<Person>
<PayrollNumber>1234567</PayrollNumber>
<Surname>Smith-Rodrigez</Surname>
<Name>John-Jaime-Winston Junior</Name>
<Skills>
<Skill>ICP</Skill>
<Skill>R</Skill>
</Skills>
<HomePhone>08 8888 8888</HomePhone>
<MobilePhone>041 888 999</MobilePhone>
<Email>curly#stooge.com</Email>
</Person>
<Person>
<PayrollNumber>12342567</PayrollNumber>
<Surname>Smith-Rodrigez</Surname>
<Name>Steve</Name>
<Skills>
<Skill>Resus</Skill>
<Skill>Air</Skill>
</Skills>
<HomePhone>08 8888 8888</HomePhone>
<MobilePhone>041 888 999</MobilePhone>
<Email>curly#stooge.com</Email>
</Person>
</Dataset>
Question 1
I want to convert the Person records/nodes in the XML to a business entity object (POCO).
Therefore I have to iterate through one Person node at a time and then parse the individual values. This last bit is interesting in itself, but first I have to get the actual Person records. The problem I have is that if I select by individual nodes (using, say, XmlNodeList in XmlDocument), I end up aggregating all fields by name. I am reluctant to do this: if one of the Person nodes is incomplete, or a field is missing, I won't know which one is missing when I pass through and aggregate the fields into business objects. I will try to validate, see question 2.
I realize this can be done through reflection but I am interested.
I tried iterating through by Person object:
Option 1:
foreach (XObject o in xDoc.Descendants("Person"))
{
Console.WriteLine("Name" + o);
// [...]
}
This gets me 2 person records (correct) each a stringified complete XML doc - formatted as an XML document. Just a subset of the above XML document.
But how to split up the record now into separate nodes or fields - preferably as painless as possible?
Option 2:
foreach (XElement element in xDoc.Descendants("Person"))
{
// [...]
}
This gets me the XML nodes - values only - for each Person all in one string, e.g.
1234567Smith-RodrigezJohn-Jaime-Winston JuniorLevel 5, City Central Tower 2, 121 King William StNorth Adelaide 5000ICPR08 8888 8888041 888 999111111curly#stooge.comE
Again, not much use.
Question 2
I can validate an XDocument quite easily, there are some good examples on MSDN, but I'd like to know how can I flag a wrong record. Ideally, I'd like to be able to filter the good records out to a new XDocument on the fly leaving the old ones behind. Is this possible?
The problem is that you're just printing out the elements as strings. You need to write code to convert an XElement of <Person> into your business object. Admittedly I'd expect the full XML to be written out instead - are you sure you're not printing out XElement.Value (which concatenates all the descendant text nodes)?
(I'm not sure of the answer to your second question - I suggest you ask it as a separate question here, so that we don't get a mixture of answers in one page.)
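The conversion described above could be sketched as follows, reading each child element of a `<Person>` by name instead of printing the element as a string. The `Person` class here is a hypothetical POCO with only a few of the fields, and `PersonReader` is an illustrative name. Casting an `XElement` to `string` returns null when the element is missing, so an incomplete record shows up as a null property rather than shifting the other values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical business object; property names mirror the XML element names.
public class Person
{
    public string PayrollNumber { get; set; }
    public string Surname { get; set; }
    public string Name { get; set; }
    public List<string> Skills { get; set; }
    public string Email { get; set; }
}

public static class PersonReader
{
    public static List<Person> ReadPeople(XDocument doc)
    {
        return doc.Descendants("Person")
                  .Select(p => new Person
                  {
                      // (string)element is null if the child element is absent.
                      PayrollNumber = (string)p.Element("PayrollNumber"),
                      Surname = (string)p.Element("Surname"),
                      Name = (string)p.Element("Name"),
                      Skills = p.Elements("Skills").Elements("Skill")
                                .Select(s => s.Value)
                                .ToList(),
                      Email = (string)p.Element("Email")
                  })
                  .ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Dataset><Person><PayrollNumber>1234567</PayrollNumber>" +
            "<Surname>Smith-Rodrigez</Surname><Name>Steve</Name>" +
            "<Skills><Skill>Resus</Skill><Skill>Air</Skill></Skills></Person></Dataset>");

        foreach (Person person in ReadPeople(doc))
            Console.WriteLine(person.Name + ": " + string.Join(", ", person.Skills));
    }
}
```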
Why not use XML deserialization?
There are two ways to do that.
The first one is to modify the business object Person to match the given XML, by adding appropriate attributes to the Person class and its properties. The XML is quite simple, so you would probably just have to change the names where there is no 1:1 match between object properties and XML nodes. For example, you have to specify [XmlArray("Skills")] and [XmlArrayItem("Skill")] for the Skills collection.
The second one is to transform the given XML to the one which matches the default serialization of your Person object, then to deserialize.
The second solution will also give you the possibility to filter "bad" records very easily.
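A minimal sketch of the first approach, using `XmlSerializer` from `System.Xml.Serialization`. The `PersonDataset`, `PersonRecord` and `PersonXml` types are hypothetical and cover only some of the fields; elements with no matching property (such as HomePhone here) are skipped by the serializer by default.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical wrapper for the <Dataset> root element.
[XmlRoot("Dataset")]
public class PersonDataset
{
    // Each <Person> child of <Dataset> becomes one list entry.
    [XmlElement("Person")]
    public List<PersonRecord> People { get; set; } = new List<PersonRecord>();
}

public class PersonRecord
{
    public string PayrollNumber { get; set; }
    public string Surname { get; set; }
    public string Name { get; set; }

    // <Skills><Skill>...</Skill></Skills> maps to a list of strings.
    [XmlArray("Skills")]
    [XmlArrayItem("Skill")]
    public List<string> Skills { get; set; } = new List<string>();

    public string Email { get; set; }
}

public static class PersonXml
{
    public static PersonDataset Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(PersonDataset));
        using (StringReader reader = new StringReader(xml))
        {
            return (PersonDataset)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        PersonDataset data = Load(
            "<Dataset><Person><PayrollNumber>1234567</PayrollNumber><Name>Steve</Name>" +
            "<Skills><Skill>Resus</Skill><Skill>Air</Skill></Skills>" +
            "<HomePhone>08 8888 8888</HomePhone></Person></Dataset>");

        foreach (PersonRecord person in data.People)
            Console.WriteLine(person.Name + ": " + string.Join(", ", person.Skills));
    }
}
```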