I closed my last question because a comment said not enough research had been done. The more I research, the more confused I get. What I think should work, based on my understanding and on posts here and elsewhere, is not working.
XML sample
<?xml version="1.0" encoding="UTF-8" ?>
<multistatus xmlns="DAV:">
<schemata>
<classschema name="Space">
<base_class>Space</base_class>
</classschema>
<classschema name="Tapestry">
<base_class>File</base_class>
</classschema>
<classschema name="Document">
<base_class>File</base_class>
</classschema>
</schemata>
</multistatus>
I am trying to get the name attribute of the classschema nodes that have a base_class child with the value 'File', so the result should be 'Tapestry' and 'Document'.
I can easily return all classschema nodes with
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema", nsmgr))
{
    strNode = node.Attributes["name"].Value;
    responseString += strNode + " ";
}
return responseString;
And I can get the base_class values equal to 'File' by looping through all the base_class nodes like this:
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema/DAV:base_class", nsmgr))
{
    if (node.InnerText == "File")
    {
        strNode = node.InnerText;
        responseString += strNode + " ";
    }
}
return responseString;
But if I try to filter, or use an axis to reference the parent node from the child, I fail.
My filtering attempts all go through the SelectNodes method, for example:
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[/DAV:base_class(contains,'File')]", nsmgr))
or
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[/DAV:base_class=='File']", nsmgr))
along with many, many other variations. In the examples I have seen it is hard to tell whether they are LINQ2XML or XDocument, and with PHP and other languages mixed in where the language isn't always specified, I am now thoroughly jumbled.
My next attempt will be SelectNodes("//DAV:schemata/DAV:classschemata[/DAV:baseclass(contains,'File')]"#name,nsmgr);
and variations on that.
I have thought from other examples that they had exactly what I wanted but when implemented did not work, for reasons I cannot explain.
This should give you the result you are after. Basically it looks for all classschema elements whose first child element has Value == "File" and then selects the Value of their name attributes. Please note I also used the string.Join method (pretty handy for stuff like this) to turn the result into a space-delimited string, which is what you need.
var xmlDoc = XDocument.Load("YourFile.xml");
var result = xmlDoc.Descendants("{DAV:}classschema")
.Where(x => x.Elements().First().Value == "File")
.Attributes("name")
.Select(x => x.Value);
string spaceDelimited = string.Join(" ", result);
This is what I am using that now works. I would swear on a stack of Bibles I had done this in earlier testing and it failed. But it does work. In posting the code I see that the inner loop I was using is now commented out, and that was the problem. I was looping over something that did not need to be looped, so it was returning empty.
foreach (XmlNode node in xmlDoc.SelectNodes("//DAV:schemata/DAV:classschema[DAV:base_class='File']", nsmgr))
{
    //strNode = node.Attributes["name"].Value;
    //if (node.InnerText == "File")
    // {
    strNode = node.Attributes["name"].Value;
    //strNode = node.InnerText;
    responseString += strNode + " ";
    // }
}
return responseString;
Thank you for the help.
My next step will be to take each of the returned nodes, get all of its base_class elements, and filter so that only some of them are returned. I am not sure what I am looking for yet; I need to evaluate the XML and look for something unique that captures what I want.
In other words, now that I have only the classschema nodes whose child elements contain 'File', I want to get some of those child elements' siblings, but not all. A challenge for another day.
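The snippets in this thread all assume an nsmgr that is never shown. The following is a minimal sketch of the predicate approach with that setup included. The class and method names (ClassSchemaQuery, NamesWithBaseClass) are made up for illustration, and the XPath selects the name attributes directly rather than looping over the elements.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class ClassSchemaQuery
{
    // Returns the name attribute of every classschema element that has a
    // base_class child equal to baseClass.
    // Assumption: baseClass contains no single quote, because it is spliced
    // into the XPath string as a literal.
    public static List<string> NamesWithBaseClass(string xml, string baseClass)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // The prefix "DAV" is arbitrary; it only has to be mapped to the
        // document's default namespace URI, which is "DAV:".
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("DAV", "DAV:");

        // The predicate is relative to classschema, so it has no leading slash.
        string xpath = "//DAV:classschema[DAV:base_class='" + baseClass + "']/@name";

        var names = new List<string>();
        foreach (XmlNode attribute in doc.SelectNodes(xpath, nsmgr))
        {
            names.Add(attribute.Value);
        }
        return names;
    }
}
```

Calling string.Join(" ", ClassSchemaQuery.NamesWithBaseClass(xml, "File")) on the sample document gives the space-delimited string the question asks for. Note that the failed attempts put a leading slash inside the predicate, which makes the test start from the document root instead of from the current classschema.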
In C# I'm using the following to get some elements from an XML file:
var TestCaseDescriptions = doc.SelectNodes("//testcase/htmlcomment");
This works fine and gets the correct information but when my testcase has no htmlcomment it won't add any entry in the XmlNodeList TestCaseDescriptions.
When there is no htmlcomment, I would like TestCaseDescriptions to contain the string "null" in its place. So in the end I would have an XmlNodeList like
htmlcomment
htmlcomment
null
htmlcomment
htmlcomment
Can anyone describe or create a sample how to make this happen?
Your problem comes from the fact that if there is no htmlcomment, the number of selected nodes will be one less. The current answer shows what to do when the htmlcomment element is present but empty, but I think you need this instead, if the whole htmlcomment element is indeed missing:
var testCases = doc.SelectNodes("//testcase");
foreach (XmlElement element in testCases)
{
    var description = element.SelectSingleNode("child::htmlcomment");
    // Value is always null for an element node; InnerText returns its text content.
    string results = description == null ? "null" : description.InnerText;
}
In the code above, you go over each test case and select its htmlcomment child node. If it is not found, SelectSingleNode returns null, so the last line checks the result and yields "null" in that case, or the node's text otherwise.
To change this result into a node, you will have to create the node as a child to the current node. You said you want an XmlNodeList, so perhaps this works for you:
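var testCaseDescriptions = doc.SelectNodes("//testcase");
foreach (XmlElement element in testCaseDescriptions)
{
    var comment = element.SelectSingleNode("child::htmlcomment");
    if (comment == null)
    {
        // AppendChild returns the node that was appended, so chaining the calls
        // would attach the text node to the test case instead of the new element.
        var placeholder = doc.CreateElement("htmlcomment");
        placeholder.AppendChild(doc.CreateTextNode("none"));
        element.AppendChild(placeholder);
    }
}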
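After this, every test case has an htmlcomment child, so running doc.SelectNodes("//testcase/htmlcomment") again returns at least one node per test case.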
Note: apparently, the OP mentions that element.SelectSingleNode("child::htmlcomment"); does not work, but element.SelectSingleNode("./htmlcomment"); does, even though technically these are equivalent expressions from the point of view of XPath, and both should work according to Microsoft's documentation.
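If a list of strings is acceptable instead of an XmlNodeList, the document does not have to be modified at all. This is only a sketch under that assumption; the names TestCaseComments and Descriptions are hypothetical, and it uses the plain relative path "htmlcomment", which means the same as "child::htmlcomment".

```csharp
using System.Collections.Generic;
using System.Xml;

public static class TestCaseComments
{
    // Returns one entry per testcase element, in document order:
    // the text of its htmlcomment child, or the string "null" when there is none.
    public static List<string> Descriptions(XmlDocument doc)
    {
        var descriptions = new List<string>();
        foreach (XmlNode testCase in doc.SelectNodes("//testcase"))
        {
            // Relative XPath: looks only at direct children of this testcase.
            XmlNode comment = testCase.SelectSingleNode("htmlcomment");
            descriptions.Add(comment == null ? "null" : comment.InnerText);
        }
        return descriptions;
    }
}
```

Because the loop runs over the test cases rather than over the comments, the result always has exactly one entry per test case, which keeps it aligned with any other per-test-case list.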
Try this
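XmlDocument doc = new XmlDocument();
// Load the XML into doc before querying; a freshly created document is empty.
var TestCaseDescriptions = doc.SelectNodes("//testcase/htmlcomment");
foreach (XmlElement element in TestCaseDescriptions)
{
    // Value is always null for an element; InnerText is "" when the element has no text.
    string results = element.InnerText;
}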
Here's some fantastic example XML:
<root>
<section>Here is some text<mightbe>a tag</mightbe>might <not attribute="be" />. Things are just<label>a mess</label>but I have to parse it because that's what needs to be done and I can't <font stupid="true">control</font> the source. <p>Why are there p tags here?</p>Who knows, but there may or may not be spaces around them so that's awesome. The point here is, there's node soup inside the section node and no definition for the document.</section>
</root>
I'd like to just grab the text from the section node and all sub-nodes as strings. BUT, note that there may or may not be spaces around the sub-nodes, so I want to pad the sub-nodes and append a space.
Here's a more precise example of what input might look like, and what I'd like output to be:
<root>
<sample>A good story is the<book>Hitchhikers Guide to the Galaxy</book>. It was published<date>a long time ago</date>. I usually read at<time>9pm</time>.</sample>
</root>
I'd like the output to be:
A good story is the Hitchhikers Guide to the Galaxy. It was published a long time ago. I usually read at 9pm.
Note that the child nodes don't have spaces around them, so I need to pad them otherwise the words run together.
I was attempting to use this sample code:
XDocument doc = XDocument.Parse(xml);
foreach(var node in doc.Root.Elements("section"))
{
output += String.Join(" ", node.Nodes().Select(x => x.ToString()).ToArray()) + " ";
}
But the output includes the child tags, and is not going to work out.
Any suggestions here?
TL;DR: Was given node soup xml and want to stringify it with padding around child nodes.
In case you have nested tags to an unknown level (e.g. <date>a <i>long</i> time ago</date>), you might also want to recurse so that the formatting is applied consistently throughout. For example:
private static string Parse(XElement root)
{
    return root
        .Nodes()
        .Select(a => a.NodeType == XmlNodeType.Text ? ((XText)a).Value : Parse((XElement)a))
        .Aggregate((a, b) => String.Concat(a.Trim(), b.StartsWith(".") ? String.Empty : " ", b.Trim()));
}
You could try using xpath to extract what you need
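// Assuming xml holds the XML text (as in the question): XPathDocument(string)
// expects a URI, so wrap the text in a StringReader (System.IO).
var docNav = new XPathDocument(new StringReader(xml));
// Create a navigator to query with XPath.
var nav = docNav.CreateNavigator();
// Find the text of every element under the root node
var expression = "/root//*/text()";
// Execute the XPath expression; for a node-set expression like this one,
// Evaluate returns an XPathNodeIterator rather than a string
var resultString = nav.Evaluate(expression);
// Do some stuff with resultString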
....
References:
Querying XML, XPath syntax
Here is a possible solution following your initial code:
private string extractSectionContents(XElement section)
{
    string output = "";
    foreach(var node in section.Nodes())
    {
        if(node.NodeType == System.Xml.XmlNodeType.Text)
        {
            output += string.Format("{0}", node);
        }
        else if(node.NodeType == System.Xml.XmlNodeType.Element)
        {
            output += string.Format(" {0} ", ((XElement)node).Value);
        }
    }
    return output;
}
A problem with your logic is that periods will be preceded by a space when placed right after an element.
You are looking at "mixed content" nodes. There is nothing particularly special about them: just get all child nodes (text nodes are nodes too) and join their values with a space.
Something like
var result = String.Join(" ",
    root.Nodes().Select(x => x is XText ? ((XText)x).Value : ((XElement)x).Value));
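Putting the ideas from these answers together, here is one possible sketch that handles nesting of any depth by collecting every descendant text node, and then tidies the spacing. The names MixedContent and Flatten are made up, and removing the space in front of punctuation is a heuristic chosen to match the desired output in the question, not a general rule.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class MixedContent
{
    // Joins all text found anywhere below the element with single spaces,
    // regardless of how deeply the child elements are nested.
    public static string Flatten(XElement element)
    {
        var pieces = element.DescendantNodes()   // document order, all levels
            .OfType<XText>()                     // keep only the text nodes
            .Select(t => t.Value.Trim())
            .Where(s => s.Length > 0);

        string joined = string.Join(" ", pieces);

        // Collapse any run of whitespace that was inside a text node.
        joined = Regex.Replace(joined, @"\s+", " ");

        // Heuristic: drop the space that padding puts in front of punctuation.
        return Regex.Replace(joined, @" ([.,;:!?])", "$1");
    }
}
```

For the sample element from the question, MixedContent.Flatten(doc.Root.Element("sample")) produces the sentence with the words separated and no space before each full stop.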
I have just written some code which, as I was writing it, I thought was going to be a nice generic method for searching for a particular node. When I finished I realised it was actually a mess :D
public String sqlReading(String fileName, String path, String nodeId )
{
    XmlDocument doc = new XmlDocument();
    doc.Load(fileName);
    XmlNodeList names = doc.SelectNodes(path);
    foreach (XmlNode xmlDocSearchTerm in names)
    {
        //if the attribute of the node i start at is the same as where i am now
        if (xmlDocSearchTerm.Attributes.Item(0).Value.ToString().Equals(nodeId))
        {
            //get a list of all of its child nodes
            XmlNodeList childNodes = xmlDocSearchTerm.ChildNodes;
            foreach (XmlNode node in childNodes)
            {
                //if there is a node in here called gui display, go inside
                if (node.Name.Equals("GUIDisplay"))
                {
                    XmlNodeList list = node.ChildNodes;
                    //find the sqlsearchstring tag inside of here
                    foreach (XmlNode finalNode in list)
                    {
                        if (finalNode.Name.Equals("sqlSearchString"))
                        {
                            return node.InnerText;
                        }
                    }
                }
            }
        }
    }
    return "";
}
What I intended to do was, based on a path, start at each matched element and check whether it had the id I was looking for; if it did, I wanted to go inside it and keep going until I reached the sqlSearchString tag, which is buried two levels deeper. I have managed that, but the issue is that I now seem to have almost hardcoded a path to the tag as opposed to looping there. How could I change my code to avoid this?
It's from the second foreach onwards that it goes wrong, in my opinion.
Thanks
I haven't tested it, but I believe something like this would work, by using an XPath expression. However, I'm not sure of the name of the attribute, or is it always the first attribute?
public String sqlReading(String fileName, String path, String nodeId)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(fileName);
    // XmlNodeList has no SelectSingleNode, so the filter is appended to the path instead.
    // The attribute name "id" is a guess; adjust it to the real attribute name.
    XmlNode foundNode = doc.SelectSingleNode(path + "[@id='" + nodeId + "']/GUIDisplay/sqlSearchString");
    if (foundNode != null)
        return foundNode.InnerText;
    return string.Empty;
}
I'm not sure if this is exactly right (as I don't have an XML document to try it with), but something similar should work:
// XPathSelectElements is an extension method from System.Xml.XPath;
// Elements() takes an element name, not a path.
var innerTexts = XDocument.Load(fileName)
    .XPathSelectElements(path)
    .Where(n => n.Attributes().ElementAt(0).Value == nodeId)
    .SelectMany(n => n.Elements())
    .Where(n => n.Name == "GUIDisplay")
    .SelectMany(n => n.Elements())
    .Where(n => n.Name == "sqlSearchString")
    .Select(n => n.Value);
I would say recursion is a safe bet (for iterating through nested child nodes) Though, from what I gather, the structure remains the same. And with that in mind, why not use [XmlDocumentObj].SelectSingleNode("/[nodeId='"+nodeId+"']") (or some facsimile) instead? This has the ability to search by attribute name, unless the XML structure is always changed and you never have constant tag (in which case XPath is probably a good idea).
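To avoid spelling out every intermediate level, the descendant axis (//) can do the "keep going until I find it" part. This is a rough sketch only: the attribute name id is an assumption (the question never names it), SqlSearchLookup and Find are hypothetical names, and nodeId is assumed to contain no single quote.

```csharp
using System.Xml;

public static class SqlSearchLookup
{
    // Finds the first sqlSearchString element at any depth below the node
    // that matches `path` and carries the requested id.
    public static string Find(XmlDocument doc, string path, string nodeId)
    {
        // "//" after the predicate searches all descendants, so GUIDisplay
        // (or any other wrapper element) does not need to be named.
        string xpath = path + "[@id='" + nodeId + "']//sqlSearchString";

        XmlNode found = doc.SelectSingleNode(xpath);
        return found == null ? string.Empty : found.InnerText;
    }
}
```

With a document whose matching elements live at /items/item (an invented layout), SqlSearchLookup.Find(doc, "/items/item", "b") returns the text of the sqlSearchString nested anywhere inside the item with id b, or an empty string when nothing matches.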
I'm just learning XDocument and LINQ queries. Here's some simple XML (which doesn't look formatted exactly right in this forum in my browser, but you get the idea . . .)
<?xml version="1.0" encoding="utf-8"?>
<quiz
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.com/name XMLFile2.xsd"
title="MyQuiz1">
<q_a>
<q_a_num>1</q_a_num>
<q_>Here is question 1</q_>
<_a>Here is the answer to 1</_a>
</q_a>
<q_a>
<q_a_num>2</q_a_num>
<q_>Here is question 2</q_>
<_a>Here is the answer to 2</_a>
</q_a>
</quiz>
I can iterate across all elements in my XML file and display their Name, Value, and NodeType in a ListBox like this, no problem:
XDocument doc = XDocument.Load(sPath);
IEnumerable<XElement> elems = doc.Descendants();
IEnumerable<XElement> elem_list = from elem in elems
                                  select elem;
foreach (XElement element in elem_list)
{
    String str0 = "Name = " + element.Name.ToString() +
        ", Value = " + element.Value.ToString() +
        ", Nodetype = " + element.NodeType.ToString();
    System.Windows.Controls.Label strLabel = new System.Windows.Controls.Label();
    strLabel.Content = str0;
    listBox1.Items.Add(strLabel);
}
...but now I want to add a "where" clause to my query so that I only select elements with a certain name (e.g., "qa") but my element list comes up empty. I tried . . .
IEnumerable<XElement> elem_list = from elem in elems
where elem.Name.ToString() == "qa"
select elem;
Could someone please explain what I'm doing wrong? (and in general are there some good tips for debugging Queries?) Thanks in advance!
The problem is that the Name property is not a string, it's an XName. When you call ToString on it, you get the expanded name, which includes the namespace in braces whenever the element has one.
While it's possible to write the query in the way you're attempting to, also consider these possibilities (use one or the other):
//from elements immediately below the root element
IEnumerable<XElement> elem_list = doc.Root.Elements("q_a");
//from elements at all levels below the document
IEnumerable<XElement> elem_list = doc.Descendants("q_a");
I would perhaps change your query to something that looks more like this
var query = from q_a in document.Descendants("q_a")
select new
{
Number = (int)q_a.Element("q_a_num"),
Question = (string)q_a.Element("q_"),
Answer = (string)q_a.Element("_a")
};
With this, you'll pull from each of your q_a descendants the inner elements into an IEnumerable<[Anonymous Type]>, each object containing the number, question, and answer.
However, if you just want to extract the XElements where the name is q_a, you could do this using a where clause.
IEnumerable<XElement> elem_list = elems.Where(elem => elem.Name.LocalName == "q_a");
Of course, as David B showed, the where clause is not necessary here.
IEnumerable<XElement> elem_list = elems.Elements("q_a");
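To make the XName point concrete, here is a small sketch of the two usual ways to match an element name when a namespace may be involved. The class NameMatching, its method names, and the urn:example namespace used in the note below are all invented for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NameMatching
{
    // Matches on the local part of the name only, ignoring any namespace.
    public static List<XElement> ByLocalName(XDocument doc, string localName)
    {
        return doc.Descendants().Where(e => e.Name.LocalName == localName).ToList();
    }

    // Matches on the full name. Pass "" as namespaceUri for elements
    // that are in no namespace.
    public static List<XElement> ByQualifiedName(XDocument doc, string namespaceUri, string localName)
    {
        XNamespace ns = namespaceUri;            // implicit conversion from string
        return doc.Descendants(ns + localName).ToList();
    }
}
```

For a document whose root declares xmlns="urn:example", Name.ToString() on a q_a element yields "{urn:example}q_a", so comparing that string against "q_a" finds nothing, while ByLocalName(doc, "q_a") and ByQualifiedName(doc, "urn:example", "q_a") both find the elements. A quick way to debug an empty query is to print element.Name for a few elements and compare it with the string being tested.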
I would like to traverse every element and attribute in an XML file and grab the name and value without knowing the names of the elements in advance. I even have a book on LINQ to XML with C#, and it only tells me how to query to get the value of elements when I already know the name of the element.
The code below only gives me the most high level element information. I need to also reach all of the descending elements.
XElement reportElements = null;
reportElements = XElement.Load(filePathName.ToString());
foreach (XElement xe in reportElements.Elements())
{
MessageBox.Show(xe.ToString());
}
Elements only walks one level; Descendants walks the entire DOM for elements, and you can then (per-element) check the attributes:
foreach (var el in doc.Descendants()) {
    Console.WriteLine(el.Name);
    foreach (var attrib in el.Attributes()) {
        Console.WriteLine("> " + attrib.Name + " = " + attrib.Value);
    }
}
You should try:
reportElements.Descendants()
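As a sketch of how that could look when starting from an XElement as in the question, the helper below walks the root and everything under it and collects names and values into a list. XmlWalker and Describe are hypothetical names, and the output format is arbitrary.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class XmlWalker
{
    // Lists every element (the root included) and every attribute, in document order.
    public static List<string> Describe(XElement root)
    {
        var lines = new List<string>();
        foreach (XElement el in root.DescendantsAndSelf())
        {
            // A parent's Value is the concatenated text of all its descendants,
            // so a value is only reported for leaf elements.
            lines.Add(el.HasElements
                ? el.Name.LocalName
                : el.Name.LocalName + " = " + el.Value);

            foreach (XAttribute attr in el.Attributes())
            {
                lines.Add("  @" + attr.Name.LocalName + " = " + attr.Value);
            }
        }
        return lines;
    }
}
```

DescendantsAndSelf is used instead of Descendants so that attributes on the root element itself are not skipped. Swap Name.LocalName for Name if the namespace should be part of the output.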