Xpath in XML not working because of xml namespace field [duplicate] - c#

This question already has answers here:
Using Xpath With Default Namespace in C#
(14 answers)
Closed 3 years ago.
I have the following XML:
<?xml version="1.0" encoding="UTF-8" ?>
<bookstore xmlns="urn:hl7-org:v3" xmlns:voc="urn:hl7-org:v3/voc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:hl7-org:v3 PORT_MT020001.xsd" type="Observation" classCode="OBS" moodCode="EVN">
<book>
<title lang="en">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="en">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>
When I try to use XMLDocument.SelectNodes() on the above xml like this:
XmlNodeList xmlNodelist = doc.SelectNodes("//book");
Console.WriteLine(xmlNodelist.Count);
I get a result of:
0
When I change the xmlns attribute value in root node to empty like this:
<bookstore xmlns="" ...........>
then I get back the proper result of:
2
Why is this happening? The xmlns attribute value in root node is vital to me. Is there any solution to this problem?

Reason why your Count shows 0 for books is because books comes under a specific namespace. If it was outside the tags that had a namespace, then your query would have resulted in 2.
To make your queries work for tags that are part of a namespace (parent tags with namespace means all children inherits that namespace), you can use the code like this,
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", doc.DocumentElement.NamespaceURI);
XmlNodeList xmlNodelist = doc.DocumentElement.SelectNodes("//x:book", nsmgr);
Console.WriteLine(xmlNodelist.Count); // Prints 2
What this does is creates a NamespaceManager with default namespaceUri of the document and uses x (or you can use any letter / word) to associate the tags with in searches. When you search for nodes, use this letter and the namespacemanager to get the results you need.

Related

Get all possible XPath expressions with XPathNavigator class?

I already have an algorithm to retrieve the XPath expressions of an Xml:
Avoid recursion on this function XML related
However, it is imperfect, unsecure, and it needs a lot of additional decoration to format the obtained expressions.
( for an example of desired formatting I mean this: Get avaliable XPaths and its element names using HtmlAgilityPack )
Then, I recently discovered the XPathNavigator class, and to improve in any way the reliability of my current code, I would like to know if with the XPathNavigator class I could retrieve all the XPath exprressions of the Xml document, because that way my algorithm could be based in the efficiency of the .Net framework logic and their rules instead of the imperfect logic of a single programmer.
I search for a solution in C# or Vb.Net.
This is what I tried:
Dim xDoc As XDocument =
<?xml version="1.0"?>
<Document>
<Tests>
<Test>
<Name>A</Name>
<Value>0.01</Value>
<Result>Pass</Result>
</Test>
<Test>
<Name>A</Name>
<Value>0.02</Value>
<Result>Pass</Result>
</Test>
<Test>
<Name>B</Name>
<Value>1.01</Value>
<Result>Fail</Result>
</Test>
</Tests>
</Document>
Dim ms As New MemoryStream
xDoc.Save(ms, SaveOptions.None)
ms.Seek(0, SeekOrigin.Begin)
Dim xpathDoc As New XPathDocument(ms)
Dim xpathNavigator As XPathNavigator = xpathDoc.CreateNavigator
Dim nodes As XPathNodeIterator = xpathNavigator.Select(".//*")
For Each item As XPathNavigator In nodes
Console.WriteLine(item.Name)
Next
With that code I only managed to get this (undesired)kind of output:
Document
Tests
Test
Name
Value
Result
Test
Name
Value
Result
Test
Name
Value
Result
Test
Name
Value
Result
Is possibly to extract all the XPath expressions using the XPathNavigator class?.
No, that's not possible. There are many, many ways to select a particular node with XPath. You might settle on some notion of the "canonical" XPath for any given node, but even that sounds hard to specify, and XPathNavigator has no such notion built in to help you.

Find Element when XPath is variable

I am trying to compose an algorithm that will take XML as input, and find a value associated with a particular element, but the position of the element within the XML body varies. I have seen many examples of using XDocument.Descendants() but most (if not all) the examples expect the structure to be consistent, and descendants well known. I presume I will need to recurse the XML to find this value, but before heading that way, ask the general population.
What is the best way to find an element in an XDocument when the path for the element may be different on each call? Just need the first occurrence found, not in any particular order. Can be first occurrence found by going wide, or by going deep.
For example, if I am trying to find the element "FirstName", and if the input XML for Call one looks like..
<?xml version="1.0"?>
<PERSON><Name><FirstName>BOB</FirstName></Name></PERSON>
and the next call looks like:
<?xml version="1.0"?>
<PERSONS><PERSON><Name><FirstName>BOB</FirstName></Name></PERSON></PERSONS>
What do you recommend? Is there a "Find" option in XDocument that I have not seen?
UPDATE:
Simple example above works with lazyberezovsky answer of XDocument.Descendents, but real XML does not. My problematic XML...
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Header>
<To s:mustUnderstand="1" xmlns="http://schemas.microsoft.com/ws/2005/05/addressing/none">http://localhost:52087/Service1.svc</To>
<Action s:mustUnderstand="1" xmlns="http://schemas.microsoft.com/ws/2005/05/addressing/none">http://tempuri.org/IService1/GetDataUsingDataContract</Action>
</s:Header>
<s:Body>
<GetDataUsingDataContract xmlns="http://tempuri.org/">
<composite xmlns:a="http://schemas.datacontract.org/2004/07/WcfService2" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<a:BoolValue>false</a:BoolValue>
<a:Name>
<a:FirstName>BOB</a:FirstName>
</a:Name>
<a:StringValue i:nil="true" />
</composite>
</GetDataUsingDataContract>
</s:Body>
</s:Envelope>
UPDATE:
lazyberezovsky helped immensely showing me how Descendents is supposed to work. Be careful of namespaces in XML. Lesson learned. Found another . article with similar issues..
Search XDocument using LINQ without knowing the namespace
Resolved using the following snippet...
var xdoc = XDocument.Parse(xml);
var name = (from p in xdoc.Descendants() where p.Name.LocalName == "FirstName" select p.Value).FirstOrDefault();
When you are using Descendants for finding first occurrence of element, you don't need to know
structure of file. Following code will work for both your cases:
var xdoc = XDocument.Load(path_to_xml);
var name = (string)xdoc.Descendants("FirstName").FirstOrDefault();
Same with XPath
var name = (string)xdoc.XPathSelectElement("//FirstName[1]");
Without knowing all the possible permutations of the XML document (which is very unusual by the way) I don't think anyone could hope to give you any worthwhile recommendations.
"Just need the first occurrence found, not in any particular order." I think Descendants do trick. Look at this:
string xml = #"<?xml version=""1.0""?>
<PERSONS>
<PERSON>
<Name>
<FirstName>BOB</FirstName>
</Name>
</PERSON>
</PERSONS>";
XDocument doc = XDocument.Parse(xml);
Console.WriteLine(string.Join(",", doc.Descendants("FirstName").Select(e =>(string)e)));
xml = #"<?xml version=""1.0""?>
<PERSON>
<Name>
<FirstName>BOB</FirstName>
</Name>
</PERSON>";
doc = XDocument.Parse(xml);
Console.WriteLine(string.Join(",", doc.Descendants("FirstName").Select(e =>(string)e)));

How do I find a XML node by path in Linq-to-XML

If I get the path to a specific node as a string can I somehow easily find said node by using Linq/Method of the XElement ( or XDocument ).
There are so many different types of XML objects it would also be nice if as a added bonus you could point me to a guide on why/how to use different types.
EDIT: Ok after being pointed towards XPathSelectElement I'm trying it out so I can give him the right answer I can't quite get it to work though. This is the XML I'm trying out
<Product>
<Name>SomeName</Name>
<Type>SomeType</Type>
<Quantity>Alot</Quantity>
</Product>
and my code
string path = "Product/Name";
string name = xml.XPathSelectElement(path).Value;
note my string is coming from elsewhere so I guess it doesn't have to be literal ( at least in debug mode it looks like the one above). I've also tried adding / in front. It gives me a null ref.
Try using the XPathSelectElement extension method of XElement. You can pass the method an XPath expression to evaluate. For example:
XElement myElement = rootElement.XPathSelectElement("//Book[#ISBN='22542']");
Edit:
In reply to your edit, check your XPath expression. If your document only contains that small snippet then /Product/Name will work as the leading slash performs a search from the root of the document:
XElement element = document.XPathSelectElement("/Product/Name");
If there are other products and <Product> is not the root node you'll need to modify the XPath you're using.
You can also use XPathEvaluate
XDocument document = XDocument.Load("temp.xml");
var found = document.XPathEvaluate("/documents/items/item") as IEnumerable<object>;
foreach (var obj in found)
{
Console.Out.WriteLine(obj);
}
Given the following xml:
<?xml version="1.0" encoding="utf-8" ?>
<documents>
<items>
<item name="Jamie"></item>
<item name="John"></item>
</items>
</documents>
This should print the contents from the items node.

Parse XML document in C#

Duplicate: This is a duplicate of Best practices to parse xml files with C#? and many others (see https://stackoverflow.com/search?q=c%23+parse+xml). Please close it and do not answer.
How do you parse XML document from bottom up in C#?
For Example :
<Employee>
<Name> Test </name>
<ID> 123 </ID>
<Employee>
<Company>
<Name>ABC</company>
<Email>test#ABC.com</Email>
</company>
Like these there are many nodes..I need to start parsing from bottom up like..first parse <company> and then and so on..How doi go about this in C# ?
Try this:
XmlDocument doc = new XmlDocument();
doc.Load(#"C:\Path\To\Xml\File.xml");
Or alternatively if you have the XML in a string use the LoadXml method.
Once you have it loaded, you can use SelectNodes and SelectSingleNode to query specific values, for example:
XmlNode node = doc.SelectSingleNode("//Company/Email/text()");
// node.Value contains "test#ABC.com"
Finally, note that your XML is invalid as it doesn't contain a single root node. It must be something like this:
<Data>
<Employee>
<Name>Test</Name>
<ID>123</ID>
</Employee>
<Company>
<Name>ABC</Name>
<Email>test#ABC.com</Email>
</Company>
</Data>

Calling the DescendantNodes without repeating each node

I have an xml that I would like to get all of its elements. I tried getting those elements by Descendants() or DescendantNodes(), but both of them returned me repeated nodes
For example, here is my xml:
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FirstElement xsi:type="myType">
<SecondElement>A</SecondElement>
</FirstElement>
</Root>
and when I use this snippet:
XElement Elements = XElement.Parse(XML);
IEnumerable<XElement> xElement = Elements.Descendants();
IEnumerable<XNode> xNodes = Elements.DescendantNodes();
foreach (XNode node in xNodes )
{
stringBuilder.Append(node);
}
it gives me two nodes but repeating the <SecondElement>. I know Descendants call its children, and children of a child all the time, but is there any other way to avoid it?
Then, this is the content of my stringBuilder:
<FirstElement xsi:type="myType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SecondElement>A</SecondElement>
</FirstElement>
<SecondElement>A</SecondElement>
Well do you actually want all the descendants or just the top-level elements? If you only want the top level ones, then use the Elements() method - that returns all the elements directly under the current node.
The problem isn't that nodes are being repeated - it's that the higher-level nodes include the lower level nodes. So the higher-level node is being returned, then the lower-level one, and you're writing out the whole of both of those nodes, which means you're writing out the lower-level node twice.
If you just write out, say, the name of the node you're looking at, you won't see a problem. But you haven't said what you're really trying to do, so I don't know if that helps...
XmlDocument doc = new XmlDocument();
doc.LoadXml(XML);
XmlNodeList allElements = doc.SelectNodes("//*");
foreach(XmlElement element in allElements)
{
// your code here
}

Categories