Querying XML and updating certain elements - c#

I have an XML file in the following format
<?xml version="1.0" ?>
<AA someattrib="xyz">
<BB someOtherAttrib="xyz">
<Title></Title>
<CC>
<myNode rowid="">
<subNode1></subNode1>
<subNode2></subNode2>
<nodeOfInterest></nodeOfInterest>
</myNode >
<myNode rowid="">
<subNode1> </subNode1>
</myNode>
</CC>
</BB>
</AA>
I want to use Linq to pick out one node by the name 'MyNode' where the rowid is a particular number that I will be getting from a collection in an object. Once I get myNode I want to update the value of the child nodeOfInterest if it is present. If not present, then I would like to add it. Once done I want to save the file.
This is what I have at the moment but it may not be the right approach.
foreach (User employee in Users)
{
XPathNavigator node = xNav.SelectSingleNode("/AA/BB/CC/myNode[#rowid = '"+employee.ID.ToString()+"']");
XPathNodeIterator nodeIterator= node.SelectChildren("nodeOfInterest", "");
if (nodeIterator.Count == 1)
{
}
else
{
}
}
Is there a way this can be done using a direct join between the List and the xmldoc in memory? This will be a large list and an equally large xml file. I dont think running a loop and calling selectSingleNode is the most efficient way.
Thanks for your inputs

Well one starting point would be to create a Dictionary<string, XElement> mapping the row ID to the element:
var dictionary = doc.Element("AA").Element("BB").Element("CC").Elements("myNode")
.ToDictionary(x => x.Attribute("rowId").Value);
Then:
foreach (User employee in Users)
{
XElement myNode;
if (dictionary.TryGetValue(employee.ID, out myNode))
{
// Use myNode
}
else
{
// Employee not found
}
}
Personally I prefer using the selection methods provided by LINQ to XML (Elements, Element, Descendants etc) rather than SelectSingleNode, SelectChildren etc.

The full answer, with help from Jon's replies...
var doc = XDocument.Load("thefile.xml");
var dictionary = doc.Element("AA").Element("BB").Element("CC").Elements("myNode")
.ToDictionary(x => x.Attribute("rowId").Value);
foreach (User employee in Users)
{
XElement myNode;
if (dictionary.TryGetValue(employee.ID, out myNode))
{
XElement nodeOfInterest = myNode.Elements("nodeOfInterest").FirstOrDefault();
if (nodeOfInterest != null)
{
nodeOfInterest.Value = "update with this value";
}
else
{
XElement nodeOfInterest = new XElement("nodeOfInterest", "Add nodeOfInterest with this value");
myNode.Add(newElement);
}
}
}
doc.Save("TheFile.xml");

Related

Reading certain nodes from elements one at a time of xml file in c#

I'm trying to read through .xml -file and get information out of there. Here is a sample of the .xml -file I have:
<?xml version="1.0" encoding="UTF-8"?>
<XmlFile>
<xmlsource>
<Name>TestXml</Name>
<filename>MyXmlFile.xml</filename>
<Information Key="GeneralInfo"/>
<Products>
<Product>
<ProductName>Product1</ProductName>
<Name Key="SomeName"/>
<Usages>
<Usage>
<Specs>
<Spec1 Key="Moving"/>
<Spec2 Key="Lifting"/>
</Specs>
<Info1>
<MovingInfo1>yes</MovingInfo1>
</Info1>
<Info2>Noup</Info2>
<MoreSpecs>
<ProductModel1>
<DetInfo1>DetInfo1</DetInfo1>
<DetInfo2>DetInfo2</DetInfo2>
</ProductModel11>
</MoreSpecs>
</Usage>
</Usages>
</Product>
<Product>
<ProductName>Product2</ProductName>
<Name Key="SomeName2"/>
<Usages>
<Usage>
<Specs>
<Spec1 Key="Moving"/>
<Spec2 Key="Lifting"/>
</Specs>
<Info1>
<MovingInfo1>not</MovingInfo1>
</Info1>
<Info2>Yes</Info2>
<MoreSpecs>
<ProductModel1>
<DetInfo1>DetInfo1</DetInfo1>
</ProductModel1>
</MoreSpecs>
</Usage>
<Usage>
<Specs>
<Spec1 Key="Turning"/>
</Specs>
<Info1>
<TurningInfo1>Infoooo</TurningInfo1>
</Info1>
<Info2>No</Info2>
<MoreSpecs>
<ProductType1>
<DetInfo1>DetInfo1</DetInfo1>
</ProductType1>
</MoreSpecs>
</Usage>
</Usages>
</Product>
</Products>
</xmlsource>
(This is just a sample, original file has a lot more data in it.)
I want to know only the values of ProductName and Spec1. As you can see from the sample, 'Product2' has two different values of Spec1: 'Moving' and 'Turning'.
What I'm trying to achieve:
Read ProductName ("Product1") from the first <Product> and then the Spec1 ("Moving"), then do something with the information. After that, move to next <Product>, read ProductName ("Product2"), Spec1 ("Moving") and the other Spec1 ("Turning"), and skipping all the other possible Spec values - meaning, that I want only Spec1 value. And so on go through the hole file.
Here is what I have tried to do:
public void getNodes(string filepath)
{
xmlFilePath = filepath;
XmlDocument xDoc = new XmlDocument();
xDoc.Load(xmlFilePath);
XmlNodeList products = xDoc.SelectNodes("//Product");
XmlNodeList productnames = xDoc.SelectNodes("//Product/ProductName");
XmlNodeList specs = xDoc.SelectNodes("//Product//Spec1");
AllocConsole();
Console.WriteLine(products.Count);
Console.WriteLine(specs.Count);
foreach (XmlNode xn in specs)
{
XmlAttributeCollection spec1Atts = xn.Attributes;
Console.WriteLine(spec1Atts["Key"].Value.ToString());
}
for (int i = 0; i < products.Count; i++)
{
Console.WriteLine(products.Item(i).InnerText);
Console.ReadLine();
}
}
This is the closest I have got (closest to what I'm trying to do).
There, first I have load the .xml -file.
Then, in lines containing XmlNodeList etc. I'm filtering with those requirements.
Here (below), is being checked the amount of products specs:
Console.WriteLine(products.Count);
Console.WriteLine(specs.Count);
Finally I'm printing out the values which has been read. With this, the print-out is obviously:
First comes out the amounts
Second comes the specs
And finally the productnames
As said above, I want ProductNames and Spec1's to be "linked" together.
I tried many methods e.g. shown in here: Reading multiple child nodes of xml file
Somehow I couldn't make any example work in my situation. Maybe it's because in my case, there is so deep parent-child pairs?
I can't change the structure of the .xml -file. If I could, I would have changed it already...
So, my question is: Could someone show me a hint/way how to achieve my goal? Thanks in advance.
What you need to do is traverse the hierarchy. In the revised code below, I find the ProductName, then within that, I look for the next node and so on until I find the Specs that correspond to that product.
private static void getNodes(string filePath)
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(filePath);
var productNodes = xmlDoc.SelectNodes("//Product");
if (productNodes != null)
{
foreach (XmlNode product in productNodes)
{
var childNodes = product.ChildNodes;
foreach (XmlNode child in childNodes)
{
if (child.Name == "ProductName")
{
Console.WriteLine(child.InnerText);
}
else if (child.Name == "Usages")
{
var childNodes2 = child.ChildNodes;
foreach (XmlNode child2 in childNodes2)
{
if (child2.Name == "Usage")
{
var childNodes3 = child2.ChildNodes;
foreach (XmlNode child3 in childNodes3)
{
if (child3.Name == "Specs")
{
var childNodes4 = child3.ChildNodes;
foreach (XmlNode child4 in childNodes4)
{
foreach (XmlNode a in child4.Attributes)
{
Console.WriteLine($" {a.InnerText}");
}
}
}
}
}
}
}
}
}
}
Console.ReadLine();
}
Hope that helps. If so, please vote for my answer because I need the reputation. Thank you.
If I understand your question correctly, you are trying to retrieve the specs for each product and you want to use them together. (Apologies if I don't fully get it). if that is the case you can try probing the product element directly in your example loop. like
for (int i = 0; i < products.Count; i++) {
var specs = products[i].SelectNodes("Usages/Usage/Specs")[0].ChildNodes;
for (int j = 0; j < specs.Count; j++)
Console.WriteLine("{0}->{1}", products[i].FirstChild.InnerText, specs[j].Attributes["Key"].Value);
}
I hope this helps

How can I remove duplicate, invalid, child nodes from an XML document using Linq to XML?

I'm creating XML from JSON retrieved from an HttpWebRequest call, using JsonConvert. The JSON I'm getting back sometimes has duplicate nodes, creating duplicate nodes in the XML after conversion, which I then have to remove.
The processing of the JSON to XML conversion is being done in a generic service call wrapper that has no knowledge of the underlying data structure and so can't do any XPath queries based on a named node. The duplicates could be at any level within the XML.
I've got to the stage where I have a list of the names of duplicate nodes at each level but am not sure of the Linq query to use this to remove all but the first node with that name.
My code:
protected virtual void RemoveDuplicateChildren(XmlNode node)
{
if (node.NodeType != XmlNodeType.Element || !node.HasChildNodes)
{
return;
}
var xNode = XElement.Load(node.CreateNavigator().ReadSubtree());
var duplicateNames = new List<string>();
foreach (XmlNode child in node.ChildNodes)
{
var isBottom = this.IsBottomElement(child); // Has no XmlNodeType.Element type children
if (!isBottom)
{
this.RemoveDuplicateChildren(child);
}
else
{
var count = xNode.Elements(child.Name).Count();
if (count > 1 && !duplicateNames.Contains(child.Name))
{
duplicateNames.Add(child.Name);
}
}
}
if (duplicateNames.Count > 0)
{
foreach (var duplicate in duplicateNames)
{
xNode.Elements(duplicate).SelectMany(d => d.Skip(1)).Remove();
}
}
}
The final line of code obviously isn't correct but I can't find an example of how to rework it to retrieve and then remove all but the first matching element.
UPDATE:
I have found two ways of doing this now, one using the XElement and one the XmlNode, but neither actually removes the nodes.
Method 1:-
foreach (var duplicate in duplicateNames)
{
xNode.Elements(duplicate).Skip(1).Remove();
}
Method 2:-
foreach (var duplicate in duplicateNames)
{
var nodeList = node.SelectNodes(duplicate);
if (nodeList.Count > 1)
{
for (int i=1; i<nodeList.Count; i++)
{
node.RemoveChild(nodeList[i]);
}
}
}
What am I missing?
If you don't want any duplicate names: (assuming no namespaces)
XElement root = XElement.Load(file); // .Parse(string)
List<string> names = root.Descendants().Distinct(x => x.Name.LocalName).ToList();
names.ForEach(name => root.Descendants(name).Skip(1).Remove());
root.Save(file); // or root.ToString()
You might try to solve the problem at the wrong level. In XML is perfectly valid to have multiple nodes with the same name. JSON structures with duplicate property names should be invalid. You should try to do this sanitation at the JSON level and not after it was already transformed to XML.
For the xml cleanup this might be a starting point:
foreach (XmlNode child
in node.ChildNodes.Distinct(custom comparer that looks on node names))
{
.....
}

Relative references in an XML document

I am trying to pull out data from an XML document that seems to use relative references like this:
<action>
<topic reference="../../action[110]/topic"/>
<context reference="../../../../../../../../../../../../../contexts/items/context[2]"/>
</action>
Two questions:
Is this normal or common?
Is there a way to handle this with linq to XML / XDocument or would I need to manually traverse the document tree?
Edit:
To clarify, the references are to other nodes within the same XML document. The context node above references a list of contexts, and says to get the one at index 2.
The topic node worries me more because it's referencing a certain other action's topic, which could in turn reference a list of topics. If that wasn't happening I would have just loaded the lists of contexts and topics in a cache and looked them up that way.
You can use XPATH Query to extract the nodes and it is very efficient.
Step1: Load the XML into XMLDocument
Step2: use node.SelectNodes("//*[reference]")
Step3: After that you can loop through the XML nodes.
I ended up manually traversing the tree. But with extension methods it's all nice and out of the way. In case it might help anyone in the future, this is what I threw together for my use-case:
public static XElement GetRelativeNode(this XAttribute attribute)
{
return attribute.Parent.GetRelativeNode(attribute.Value);
}
public static string GetRelativeNode(this XElement node, string pathReference)
{
if (!pathReference.Contains("..")) return node; // Not relative reference
var parts = pathReference.Split(new string[] { "/"}, StringSplitOptions.RemoveEmptyEntries);
XElement current = node;
foreach (var part in parts)
{
if (string.IsNullOrEmpty(part)) continue;
if (part == "..")
{
current = current.Parent;
}
else
{
if (part.Contains("["))
{
var opening = part.IndexOf("[");
var targetNodeName = part.Substring(0, opening);
var ending = part.IndexOf("]");
var nodeIndex = int.Parse(part.Substring(opening + 1, ending - opening - 1));
current = current.Descendants(targetNodeName).Skip(nodeIndex-1).First();
}
else
{
current = current.Element(part);
}
}
}
return current;
}
And then you'd use it like this (item is an XElement):
item.Element("topic").Attribute("reference").GetRelativeNode().Value

How to select XML node by attribute and use it's child nodes data?

Here is my XML file
<?xml version="1.0" encoding="utf-8" ?>
<storage>
<Save Name ="Lifeline">
<Seconds>12</Seconds>
<Minutes>24</Minutes>
<Hours>9</Hours>
<Days>25</Days>
<Months>8</Months>
<Years>2010</Years>
<Health>90</Health>
<Mood>100</Mood>
</Save>
<Save Name ="Hellcode">
<Seconds>24</Seconds>
<Minutes>48</Minutes>
<Hours>18</Hours>
<Days>15</Days>
<Months>4</Months>
<Years>1995</Years>
<Health>50</Health>
<Mood>50</Mood>
</Save>
Here is a code which get's data from XML and loads it into application.
System.IO.StreamReader sr = new System.IO.StreamReader(#"Saves.xml");
System.Xml.XmlTextReader xr = new System.Xml.XmlTextReader(sr);
System.Xml.XmlDocument save = new System.Xml.XmlDocument();
save.Load(xr);
XmlNodeList saveItems = save.SelectNodes("Storage/Save");
XmlNode seconds = saveItems.Item(0).SelectSingleNode("Seconds");
sec = Int32.Parse(seconds.InnerText);
XmlNode minutes = saveItems.Item(0).SelectSingleNode("Minutes");
min = Int32.Parse(minutes.InnerText);
XmlNode hours = saveItems.Item(0).SelectSingleNode("Hours");
hour = Int32.Parse(hours.InnerText);
XmlNode days = saveItems.Item(0).SelectSingleNode("Days");
day = Int32.Parse(days.InnerText);
XmlNode months = saveItems.Item(0).SelectSingleNode("Months");
month = Int32.Parse(months.InnerText);
XmlNode years = saveItems.Item(0).SelectSingleNode("Years");
year = Int32.Parse(years.InnerText);
XmlNode health_ = saveItems.Item(0).SelectSingleNode("Health");
health = Int32.Parse(health_.InnerText);
XmlNode mood_ = saveItems.Item(0).SelectSingleNode("Mood");
mood = Int32.Parse(mood_.InnerText);
The problem is that this code loads data inly from "Lifeline" node. I would like to use a listbox and be able to choose from which node to load data.
I've tried to take string from listbox item content and then use such a line
XmlNodeList saveItems = save.SelectNodes(string.Format("storage/Save[#Name = '{0}']", name));
variable "name" is a string from listboxe's item. While compiled this code gives exception.
Do somebody knows a way how to select by attribute and load nedeed data from that XML?
If you can use XElement:
XElement xml = XElement.Load(file);
XElement storage = xml.Element("storage");
XElement save = storage.Elements("Save").FirstOrDefault(e => ((string)e.Attribute("Name")) == nameWeWant);
if(null != save)
{
// do something with it
}
Personally I like classes that have properties that convert to and from the XElement to hide that detail from the main program. IE say the Save class takes an XElement node in the constructor, saves it internally globally, and the properties read/write to it.
Example class:
public class MyClass
{
XElement self;
public MyClass(XElement self)
{
this.self = self;
}
public string Name
{
get { return (string)(self.Attribute("Name") ?? "some default value/null"); }
set
{
XAttribute x = source.Attribute("Name");
if(null == x)
source.Add(new XAttribute("Name", value));
else
x.ReplaceWith(new XAttribute("Name", value));
}
}
}
Then you can change the search to something like:
XElement save = storage.Elements("Save")
.FirstOrDefault(e => new MyClass(e).Name == NameWeWant);
Since it is not that much data, I'd suggest loading all information to a list of saves(constructor) and then drawing from there which one the user would like to use...
As for things not working, I personally use a lower level approach to get my data and it is not error prone. Remodeling it to fit your problem a bit:
int saves = 0;
List<Saves> saveGames = new List<Saves>();
saveGames.Add(new Saves());
while (textReader.Read())
{
if (textReader.NodeType == XmlNodeType.Element)
whatsNext = textReader.Name;
else if (textReader.NodeType == XmlNodeType.Text)
{
if (whatsNext == "name")
saveGames[saves].name = Convert.ToString(textReader.Value);
//else if statements for the rest of your attributes
else if (whatsNext == "Save")
{
saveGames.Add(new Saves());
saves++;
}
}
else if (textReader.NodeType == XmlNodeType.EndElement)
whatsNext = "";
}
Basically throw everything in the xml file into a list of objects and manipulate that list to fill the listbox. Instead of having Saves name = "...", have a name attribute as the first attribute in the save.
Code tags hate me. Why they break so easily ( ._.)
The select nodes is returning two XmlNode objects.
XmlNodeList saveItems = save.SelectNodes("Storage/Save");
Later in your code you seem to be selecting the first one and with saveItems.Item(0) and getting values from it which in this case would be the save node with the Name="LifeLine". So if you were to do saveItems.Item(1) and select nodes and its values then you would get the other set of nodes.

C# how can I get all elements name from a xml file

I'd like to get all the element name from a xml file, for example the xml file is,
<BookStore>
<BookStoreInfo>
<Address />
<Tel />
<Fax />
<BookStoreInfo>
<Book>
<BookName />
<ISBN />
<PublishDate />
</Book>
<Book>
....
</Book>
</BookStore>
I would like to get the element's name of "BookName". "ISBN" and "PublishDate " and only those names, not include " BookStoreInfo" and its child node's name
I tried several ways, but doesn't work, how can I do it?
Well, with XDocument and LINQ-to-XML:
foreach(var name in doc.Root.DescendantNodes().OfType<XElement>()
.Select(x => x.Name).Distinct())
{
Console.WriteLine(name);
}
There are lots of similar routes, though.
Using XPath
XmlDocument xdoc = new XmlDocument();
xdoc.Load(something);
XmlNodeList list = xdoc.SelectNodes("//BookStore");
gives you a list with all nodes in the document named BookStore
I agree with Adam, the ideal condition is to have a schema that defines the content of xml document. However, sometimes this is not possible. Here is a simple method for iterating all of the nodes of an xml document and using a dictionary to store the unique local names. I like to keep track of the depth of each local name, so I use a list of int to store the depth. Note that the XmlReader is "easy on the memory" since it does not load the entire document as the XmlDocument does. In some instances it makes little difference because the size of the xml data is small. In the following example, an 18.5MB file is read with an XmlReader. Using an XmlDocument to load this data would have been less effecient than using an XmlReader to read and sample its contents.
string documentPath = #"C:\Docs\cim_schema_2.18.1-Final-XMLAll\all_classes.xml";
Dictionary<string, List<int>> nodeTable = new Dictionary<string, List<int>>();
using (XmlReader reader = XmlReader.Create(documentPath))
{
while (!reader.EOF)
{
if (reader.NodeType == XmlNodeType.Element)
{
if (!nodeTable.ContainsKey(reader.LocalName))
{
nodeTable.Add(reader.LocalName, new List<int>(new int[] { reader.Depth }));
}
else if (!nodeTable[reader.LocalName].Contains(reader.Depth))
{
nodeTable[reader.LocalName].Add(reader.Depth);
}
}
reader.Read();
}
}
Console.WriteLine("The node table has {0} items.",nodeTable.Count);
foreach (KeyValuePair<string, List<int>> kv in nodeTable)
{
Console.WriteLine("{0} [{1}]",kv.Key, kv.Value.Count);
for (int i = 0; i < kv.Value.Count; i++)
{
if (i < kv.Value.Count-1)
{
Console.Write("{0}, ", kv.Value[i]);
}
else
{
Console.WriteLine(kv.Value[i]);
}
}
}
The purists way of doing this (and, to be fair, the right way) would be to have a schema contract definition and read it in that way. That being said, you could do something like this...
List<string> nodeNames = new List<string>();
foreach(System.Xml.XmlNode node in doc.SelectNodes("BookStore/Book"))
{
foreach(System.Xml.XmlNode child in node.Children)
{
if(!nodeNames.Contains(child.Name)) nodeNames.Add(child.Name);
}
}
This is, admittedly, a rudimentary method for obtaining the list of distinct node names for the Book node's children, but you didn't specify much else in the way of your environment (if you have 3.5, you could use LINQ to XML to make this a little prettier, for example), but this should get the job done regardless of your environment.
If you're using C# 3.0, you can do the following:
var data = XElement.Load("c:/test.xml"); // change this to reflect location of your xml file
var allElementNames =
(from e in in data.Descendants()
select e.Name).Distinct();
You can try doing it using XPATH.
XmlDocument doc = new XmlDocument();
doc.LoadXml("xml string");
XmlNodeList list = doc.SelectNodes("//BookStore/Book");
If BookStore is ur root element then u can try following code
XmlDocument doc = new XmlDocument();
doc.Load(configPath);
XmlNodeList list = doc.DocumentElement.GetElementsByTagName("Book");
if (list.Count != 0)
{
for (int i = 0; i < list[0].ChildNodes.Count; i++)
{
XmlNode child = list[0].ChildNodes[i];
}
}

Categories