Parse XML document in C#


Duplicate: This is a duplicate of Best practices to parse xml files with C#? and many others (see https://stackoverflow.com/search?q=c%23+parse+xml). Please close it and do not answer.
How do you parse an XML document from the bottom up in C#?
For example:
<Employee>
<Name> Test </Name>
<ID> 123 </ID>
</Employee>
<Company>
<Name>ABC</Name>
<Email>test@ABC.com</Email>
</Company>
There are many nodes like these. I need to parse from the bottom up: first parse <Company>, then <Employee>, and so on. How do I go about this in C#?

Try this:
XmlDocument doc = new XmlDocument();
doc.Load(@"C:\Path\To\Xml\File.xml");
Or alternatively if you have the XML in a string use the LoadXml method.
Once you have it loaded, you can use SelectNodes and SelectSingleNode to query specific values, for example:
XmlNode node = doc.SelectSingleNode("//Company/Email/text()");
// node.Value contains "test@ABC.com"
Finally, note that your XML is invalid as it doesn't contain a single root node. It must be something like this:
<Data>
<Employee>
<Name>Test</Name>
<ID>123</ID>
</Employee>
<Company>
<Name>ABC</Name>
<Email>test@ABC.com</Email>
</Company>
</Data>
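
As a minimal sketch of the steps above, the following loads the corrected XML from a string with LoadXml and then walks the root's children last to first. Reading "bottom up" as "last sibling first" is an assumption about what the question means, and the class and method names (BottomUpDemo, ChildNamesBottomUp) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class BottomUpDemo
{
    // The corrected sample from the answer, wrapped in a single <Data> root.
    public const string SampleXml =
        "<Data>" +
        "<Employee><Name>Test</Name><ID>123</ID></Employee>" +
        "<Company><Name>ABC</Name><Email>test@ABC.com</Email></Company>" +
        "</Data>";

    // Returns the names of the root's child elements, last child first.
    public static List<string> ChildNamesBottomUp(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // LoadXml parses a string; Load reads a file or stream

        var names = new List<string>();
        XmlNodeList children = doc.DocumentElement.ChildNodes;
        for (int i = children.Count - 1; i >= 0; i--)
        {
            // Skip whitespace, comments and other non-element nodes.
            if (children[i].NodeType == XmlNodeType.Element)
                names.Add(children[i].Name);
        }
        return names;
    }

    public static void Main()
    {
        // Prints "Company", then "Employee".
        foreach (string name in ChildNamesBottomUp(SampleXml))
            Console.WriteLine(name);

        // The XPath query from the answer, run against the same document.
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        XmlNode email = doc.SelectSingleNode("//Company/Email/text()");
        Console.WriteLine(email.Value);
    }
}
```

If "bottom up" should instead mean deepest elements first, the same loop could be applied recursively to each child before recording its name.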

Related

Xpath in XML not working because of xml namespace field [duplicate]

This question already has answers here:
Using Xpath With Default Namespace in C#
(14 answers)
Closed 3 years ago.
I have the following XML:
<?xml version="1.0" encoding="UTF-8" ?>
<bookstore xmlns="urn:hl7-org:v3" xmlns:voc="urn:hl7-org:v3/voc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:hl7-org:v3 PORT_MT020001.xsd" type="Observation" classCode="OBS" moodCode="EVN">
<book>
<title lang="en">Harry Potter</title>
<price>29.99</price>
</book>
<book>
<title lang="en">Learning XML</title>
<price>39.95</price>
</book>
</bookstore>
When I try to use XmlDocument.SelectNodes() on the above XML like this:
XmlNodeList xmlNodelist = doc.SelectNodes("//book");
Console.WriteLine(xmlNodelist.Count);
I get a result of:
0
When I change the xmlns attribute value in root node to empty like this:
<bookstore xmlns="" ...........>
then I get back the proper result of:
2
Why is this happening? The xmlns attribute value in root node is vital to me. Is there any solution to this problem?

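Your count is 0 because the book elements are in a namespace. The default namespace declared on <bookstore> applies to all of its unprefixed descendants, while an unprefixed name in an XPath expression only matches elements in no namespace. Had the book elements been outside any element that declares a namespace, your query would have returned 2.
To query elements that belong to a namespace (children inherit the default namespace declared on a parent), register the namespace under a prefix and use that prefix in the query: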
XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", doc.DocumentElement.NamespaceURI);
XmlNodeList xmlNodelist = doc.DocumentElement.SelectNodes("//x:book", nsmgr);
Console.WriteLine(xmlNodelist.Count); // Prints 2
This creates an XmlNamespaceManager, maps the prefix x (any letter or word will do) to the document's default namespace URI, and uses that prefix in the search. Pass the namespace manager together with the prefixed query to get the results you need.
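
To see the difference side by side, here is a small self-contained sketch that runs the unprefixed query, the prefixed query, and a local-name() query against a trimmed copy of the bookstore XML. The class and method names are hypothetical, and the local-name() variant is an alternative that the answer does not mention; it ignores namespaces altogether, so it will also match same-named elements from other namespaces.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Trimmed copy of the question's XML; only the default namespace is kept.
    public const string BookstoreXml =
        "<bookstore xmlns=\"urn:hl7-org:v3\">" +
        "<book><title lang=\"en\">Harry Potter</title><price>29.99</price></book>" +
        "<book><title lang=\"en\">Learning XML</title><price>39.95</price></book>" +
        "</bookstore>";

    // Unprefixed names in XPath 1.0 only match elements in no namespace.
    public static int CountWithoutPrefix(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("//book").Count;
    }

    // Maps the prefix "x" to the root element's namespace and queries with it.
    public static int CountWithPrefix(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("x", doc.DocumentElement.NamespaceURI);
        return doc.SelectNodes("//x:book", nsmgr).Count;
    }

    // Matches on the local name only, whatever the namespace is.
    public static int CountByLocalName(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("//*[local-name()='book']").Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithoutPrefix(BookstoreXml)); // 0
        Console.WriteLine(CountWithPrefix(BookstoreXml));    // 2
        Console.WriteLine(CountByLocalName(BookstoreXml));   // 2
    }
}
```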

Get all possible XPath expressions with XPathNavigator class?

I already have an algorithm to retrieve the XPath expressions of an Xml:
Avoid recursion on this function XML related
However, it is imperfect and insecure, and it needs a lot of additional decoration to format the resulting expressions.
( for an example of desired formatting I mean this: Get avaliable XPaths and its element names using HtmlAgilityPack )
I recently discovered the XPathNavigator class. To make my current code more reliable, I would like to know whether XPathNavigator can retrieve all the XPath expressions of an XML document, so that my algorithm could rely on the .NET Framework's own logic and rules instead of the imperfect logic of a single programmer.
I am looking for a solution in C# or VB.NET.
This is what I tried:
Dim xDoc As XDocument =
<?xml version="1.0"?>
<Document>
<Tests>
<Test>
<Name>A</Name>
<Value>0.01</Value>
<Result>Pass</Result>
</Test>
<Test>
<Name>A</Name>
<Value>0.02</Value>
<Result>Pass</Result>
</Test>
<Test>
<Name>B</Name>
<Value>1.01</Value>
<Result>Fail</Result>
</Test>
</Tests>
</Document>
Dim ms As New MemoryStream
xDoc.Save(ms, SaveOptions.None)
ms.Seek(0, SeekOrigin.Begin)
Dim xpathDoc As New XPathDocument(ms)
Dim xpathNavigator As XPathNavigator = xpathDoc.CreateNavigator
Dim nodes As XPathNodeIterator = xpathNavigator.Select(".//*")
For Each item As XPathNavigator In nodes
Console.WriteLine(item.Name)
Next
With that code I only managed to get this (undesired) kind of output:
Document
Tests
Test
Name
Value
Result
Test
Name
Value
Result
Test
Name
Value
Result
Is it possible to extract all the XPath expressions using the XPathNavigator class?

No, that's not possible. There are many, many ways to select a particular node with XPath. You might settle on some notion of the "canonical" XPath for any given node, but even that sounds hard to specify, and XPathNavigator has no such notion built in to help you.
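
If you do settle on one convention, it is not hard to generate a path per element yourself. The sketch below is only one possible notion of "canonical": element-only, absolute, with a positional index added when a parent has several children of the same name. It uses LINQ to XML (XElement) rather than XPathNavigator, ignores attributes and text nodes, and writes local names only, so the generated paths will not resolve as-is in a document that uses namespaces. XPathLister, PathOf and AllPaths are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XPathLister
{
    // Builds one positional path for an element, e.g. /Document/Tests/Test[2]/Name.
    public static string PathOf(XElement element)
    {
        var steps = new List<string>();
        for (XElement current = element; current != null; current = current.Parent)
        {
            string step = current.Name.LocalName;
            if (current.Parent != null)
            {
                // Add an index only when same-named siblings make the step ambiguous.
                int sameNamed = current.Parent.Elements(current.Name).Count();
                if (sameNamed > 1)
                {
                    int position = current.ElementsBeforeSelf(current.Name).Count() + 1;
                    step += "[" + position + "]";
                }
            }
            steps.Add(step);
        }
        steps.Reverse(); // steps were collected from the element up to the root
        return "/" + string.Join("/", steps);
    }

    // One path per element, in document order.
    public static List<string> AllPaths(XDocument doc)
    {
        return doc.Descendants().Select(PathOf).ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Document><Tests>" +
            "<Test><Name>A</Name></Test>" +
            "<Test><Name>B</Name></Test>" +
            "</Tests></Document>");

        foreach (string path in AllPaths(doc))
            Console.WriteLine(path);
    }
}
```

Counting siblings for every element makes this quadratic in the worst case, which is fine for small documents but worth reworking for large ones.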

Find Element when XPath is variable

I am trying to write an algorithm that takes XML as input and finds the value of a particular element, but the position of that element within the XML body varies. I have seen many examples using XDocument.Descendants(), but most (if not all) of them expect the structure to be consistent and the descendants to be well known. I presume I will need to recurse through the XML to find this value, but before heading that way I thought I would ask.
What is the best way to find an element in an XDocument when the path to the element may be different on each call? I just need the first occurrence found, in no particular order; it can be found by going wide or by going deep.
For example, if I am trying to find the element "FirstName", and the input XML for the first call looks like:
<?xml version="1.0"?>
<PERSON><Name><FirstName>BOB</FirstName></Name></PERSON>
and the next call looks like:
<?xml version="1.0"?>
<PERSONS><PERSON><Name><FirstName>BOB</FirstName></Name></PERSON></PERSONS>
What do you recommend? Is there a "Find" option in XDocument that I have not seen?
UPDATE:
The simple example above works with lazyberezovsky's answer using XDocument.Descendants, but my real XML does not. My problematic XML:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Header>
<To s:mustUnderstand="1" xmlns="http://schemas.microsoft.com/ws/2005/05/addressing/none">http://localhost:52087/Service1.svc</To>
<Action s:mustUnderstand="1" xmlns="http://schemas.microsoft.com/ws/2005/05/addressing/none">http://tempuri.org/IService1/GetDataUsingDataContract</Action>
</s:Header>
<s:Body>
<GetDataUsingDataContract xmlns="http://tempuri.org/">
<composite xmlns:a="http://schemas.datacontract.org/2004/07/WcfService2" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<a:BoolValue>false</a:BoolValue>
<a:Name>
<a:FirstName>BOB</a:FirstName>
</a:Name>
<a:StringValue i:nil="true" />
</composite>
</GetDataUsingDataContract>
</s:Body>
</s:Envelope>
UPDATE:
lazyberezovsky helped immensely by showing me how Descendants is supposed to work. Be careful of namespaces in XML. Lesson learned. I found another article with a similar issue:
Search XDocument using LINQ without knowing the namespace
Resolved using the following snippet:
var xdoc = XDocument.Parse(xml);
var name = (from p in xdoc.Descendants() where p.Name.LocalName == "FirstName" select p.Value).FirstOrDefault();

When you use Descendants to find the first occurrence of an element, you don't need to know the structure of the file. The following code will work for both of your cases:
var xdoc = XDocument.Load(path_to_xml);
var name = (string)xdoc.Descendants("FirstName").FirstOrDefault();
The same with XPath (requires using System.Xml.XPath):
var name = (string)xdoc.XPathSelectElement("//FirstName[1]");
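
Descendants("FirstName") only matches elements in no namespace, which is why it finds nothing in the SOAP example from the question's update. As a hedged sketch, the namespace can either be supplied through an XNamespace or skipped by comparing local names, as in the question's own fix. The XML below is a trimmed stand-in for the SOAP body, and FirstNameFinder and its methods are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FirstNameFinder
{
    // Trimmed stand-in for the SOAP body in the question's update.
    public const string SampleXml =
        "<composite xmlns:a=\"http://schemas.datacontract.org/2004/07/WcfService2\">" +
        "<a:Name><a:FirstName>BOB</a:FirstName></a:Name>" +
        "</composite>";

    // Namespace-qualified lookup: XNamespace + local name gives the full XName.
    public static string FindQualified(XDocument doc, XNamespace ns)
    {
        return (string)doc.Descendants(ns + "FirstName").FirstOrDefault();
    }

    // Namespace-agnostic lookup: compares local names only.
    public static string FindByLocalName(XDocument doc)
    {
        return doc.Descendants()
                  .Where(e => e.Name.LocalName == "FirstName")
                  .Select(e => e.Value)
                  .FirstOrDefault();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(SampleXml);
        XNamespace a = "http://schemas.datacontract.org/2004/07/WcfService2";

        // No namespace given, so nothing matches and the cast yields null.
        Console.WriteLine((string)doc.Descendants("FirstName").FirstOrDefault() ?? "(null)");
        Console.WriteLine(FindQualified(doc, a));   // BOB
        Console.WriteLine(FindByLocalName(doc));    // BOB
    }
}
```

The qualified lookup is stricter and usually preferable when the namespace is known; the local-name comparison trades that precision for not having to know it.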

Without knowing all the possible permutations of the XML document (which is very unusual, by the way), I don't think anyone could hope to give you any worthwhile recommendations.

"Just need the first occurrence found, not in any particular order." I think Descendants does the trick. Look at this:
string xml = @"<?xml version=""1.0""?>
<PERSONS>
<PERSON>
<Name>
<FirstName>BOB</FirstName>
</Name>
</PERSON>
</PERSONS>";
XDocument doc = XDocument.Parse(xml);
Console.WriteLine(string.Join(",", doc.Descendants("FirstName").Select(e =>(string)e)));
xml = @"<?xml version=""1.0""?>
<PERSON>
<Name>
<FirstName>BOB</FirstName>
</Name>
</PERSON>";
doc = XDocument.Parse(xml);
Console.WriteLine(string.Join(",", doc.Descendants("FirstName").Select(e =>(string)e)));

how to compare XML strings in C#?

In DB I have XML strings stored in a column. Below is my XML structure:
<?xml version="1.0"?>
<ProductAttributes xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Attribute Required="false" ID="2" Name="Color">
<Value ID="18">Light Pink</Value>
</Attribute>
<Attribute Required="false" ID="1" Name="Size">
<Value ID="9">XL</Value>
</Attribute>
</ProductAttributes>
Another XML is:
<?xml version="1.0"?>
<ProductAttributes xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Attribute Required="false" ID="1" Name="Size">
<Value ID="1">S</Value>
</Attribute>
<Attribute Required="false" ID="2" Name="Color">
<Value ID="4">Red</Value>
</Attribute>
<Attribute Required="false" ID="3" Name="Weight">
<Value ID="6">10gr</Value>
</Attribute>
</ProductAttributes>
Notes:
There can be n XML strings, and each XML string can have m tags.
Attribute nodes can be in a different order; for example, in the first XML the attribute with ID=1 can come first, while in the second it can come last.
The requirement is to compare these n XML strings and find whether any of them completely duplicates another's attributes (the comparison must look at values, because the order can differ).
Please guide and help me.

Don't compare strings of XML. Use them as input to an XML parser that will turn them into XML trees, then search the trees for matching elements and compare their lists of attributes.

You could iterate over all nodes of XML doc A and, for each node, look up its XPath in XML doc B. If any path is not found, or the path is found but the value is different, the docs are not 'the same'.
You'd then have to do the same for all nodes in B, checking the XPaths in A, to ensure there's nothing "in B but not in A".
Optimise by quitting as 'not equal' as soon as an XPath is not found or the values don't match.
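
For the ProductAttributes documents in the question, a set comparison may be simpler than walking XPaths in both directions. The sketch below parses each string, reduces it to a set of "attribute ID = value ID" keys, and compares the sets, so element order does not matter. Treating that ID pair as the identity of an attribute is an assumption about the data (the value text or Name could be used instead), and AttributeComparer, Signature and SameAttributes are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttributeComparer
{
    // Reduces a ProductAttributes document to an order-independent set of keys,
    // one "attributeId=valueId" key per <Attribute> element.
    public static HashSet<string> Signature(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        IEnumerable<string> keys = doc.Root
            .Elements("Attribute")
            .Select(a => (string)a.Attribute("ID") + "=" +
                         (string)a.Element("Value")?.Attribute("ID"));
        return new HashSet<string>(keys);
    }

    // True when both documents contain exactly the same attribute/value pairs.
    public static bool SameAttributes(string xmlA, string xmlB)
    {
        return Signature(xmlA).SetEquals(Signature(xmlB));
    }

    public static void Main()
    {
        string first =
            "<ProductAttributes>" +
            "<Attribute Required=\"false\" ID=\"2\" Name=\"Color\"><Value ID=\"18\">Light Pink</Value></Attribute>" +
            "<Attribute Required=\"false\" ID=\"1\" Name=\"Size\"><Value ID=\"9\">XL</Value></Attribute>" +
            "</ProductAttributes>";
        string reordered =
            "<ProductAttributes>" +
            "<Attribute Required=\"false\" ID=\"1\" Name=\"Size\"><Value ID=\"9\">XL</Value></Attribute>" +
            "<Attribute Required=\"false\" ID=\"2\" Name=\"Color\"><Value ID=\"18\">Light Pink</Value></Attribute>" +
            "</ProductAttributes>";
        string different =
            "<ProductAttributes>" +
            "<Attribute Required=\"false\" ID=\"1\" Name=\"Size\"><Value ID=\"1\">S</Value></Attribute>" +
            "</ProductAttributes>";

        Console.WriteLine(SameAttributes(first, reordered)); // True
        Console.WriteLine(SameAttributes(first, different)); // False
    }
}
```

With n strings, computing each signature once and grouping equal signatures avoids comparing every pair of documents.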

You may want to try The XML Diff and Patch GUI Tool which you can download from here. I've used it before and it works ok.

How do I find a XML node by path in Linq-to-XML

If I get the path to a specific node as a string, can I easily find that node using LINQ or a method of XElement (or XDocument)?
There are so many different types of XML objects that it would also be nice, as an added bonus, if you could point me to a guide on why and how to use the different types.
EDIT: OK, after being pointed towards XPathSelectElement I'm trying it out so I can accept the right answer, but I can't quite get it to work. This is the XML I'm trying out:
<Product>
<Name>SomeName</Name>
<Type>SomeType</Type>
<Quantity>Alot</Quantity>
</Product>
and my code:
string path = "Product/Name";
string name = xml.XPathSelectElement(path).Value;
Note that my string comes from elsewhere, so I guess it doesn't have to be a literal (at least in debug mode it looks like the one above). I've also tried adding / in front. It gives me a null reference exception.

Try using the XPathSelectElement extension method of XElement. You can pass the method an XPath expression to evaluate. For example:
XElement myElement = rootElement.XPathSelectElement("//Book[@ISBN='22542']");
Edit:
In reply to your edit, check your XPath expression. If your document only contains that small snippet then /Product/Name will work as the leading slash performs a search from the root of the document:
XElement element = document.XPathSelectElement("/Product/Name");
If there are other products and <Product> is not the root node you'll need to modify the XPath you're using.
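
One plausible cause of the null reference is the context node. This is an assumption, since the question does not show how xml was created: if xml is an XElement that is itself <Product>, the relative path Product/Name looks for a <Product> child inside <Product> and finds nothing, so .Value is called on null. The sketch below contrasts an XDocument context with an XElement context; XPathContextDemo and its methods are made-up names.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath; // XPathSelectElement is an extension method defined here

public static class XPathContextDemo
{
    public const string ProductXml =
        "<Product><Name>SomeName</Name><Type>SomeType</Type><Quantity>Alot</Quantity></Product>";

    // Evaluates the path with the document node as context.
    public static string FromDocument(string xml, string path)
    {
        XDocument doc = XDocument.Parse(xml);
        return (string)doc.XPathSelectElement(path); // null when nothing matches
    }

    // Evaluates the path with the <Product> element itself as context.
    public static string FromElement(string xml, string path)
    {
        XElement product = XElement.Parse(xml);
        return (string)product.XPathSelectElement(path); // null when nothing matches
    }

    public static void Main()
    {
        Console.WriteLine(FromDocument(ProductXml, "/Product/Name"));            // SomeName
        Console.WriteLine(FromDocument(ProductXml, "Product/Name"));             // SomeName
        Console.WriteLine(FromElement(ProductXml, "Name"));                      // SomeName
        Console.WriteLine(FromElement(ProductXml, "Product/Name") ?? "(null)");  // (null)
    }
}
```

Casting the result to string instead of reading .Value returns null for a missing element rather than throwing, which makes a wrong path easier to diagnose.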

You can also use XPathEvaluate:
XDocument document = XDocument.Load("temp.xml");
var found = document.XPathEvaluate("/documents/items/item") as IEnumerable<object>;
foreach (var obj in found)
{
Console.Out.WriteLine(obj);
}
Given the following XML:
<?xml version="1.0" encoding="utf-8" ?>
<documents>
<items>
<item name="Jamie"></item>
<item name="John"></item>
</items>
</documents>
This should print each <item> element under the items node.
