Getting an XElement with a namespace via XPathSelectElements - c#

I have an XML e.g.
<?xml version="1.0" encoding="utf-8"?>
<A1>
<B2>
<C3 id="1">
<D7>
<E5 id="abc" />
</D7>
<D4 id="1">
<E5 id="abc" />
</D4>
<D4 id="2">
<E5 id="abc" />
</D4>
</C3>
</B2>
</A1>
This is may sample code:
var xDoc = XDocument.Load("Test.xml");
string xPath = "//B2/C3/D4";
//or string xPath = "//B2/C3/D4[#id='1']";
var eleList = xDoc.XPathSelectElements(xPath).ToList();
foreach (var xElement in eleList)
{
Console.WriteLine(xElement);
}
It works perfectly, but if I add a namespace to the root node A1, this code doesn't work.
Upon searching for solutions, I found this one, but it uses the Descendants() method to query the XML. From my understanding, this solution would fail if I was searching for <E5> because the same tag exists for <D7>, <D4 id="1"> and <D4 id="2">
My requirement is to search if a node exists at a particular XPath. If there is a way of doing this using Descendants, I'd be delighted to use it. If not, please guide me on how to search using the name space.
My apologies in case this is a duplicate.

To keep using XPath, you can use something link this:
var xDoc = XDocument.Parse(#"<?xml version='1.0' encoding='utf-8'?>
<A1 xmlns='urn:sample'>
<B2>
<C3 id='1'>
<D7><E5 id='abc' /></D7>
<D4 id='1'><E5 id='abc' /></D4>
<D4 id='2'><E5 id='abc' /></D4>
</C3>
</B2>
</A1>");
// Notice this
XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
nsmgr.AddNamespace("sample", "urn:sample");
string xPath = "//sample:B2/sample:C3/sample:D4";
var eleList = xDoc.XPathSelectElements(xPath, nsmgr).ToList();
foreach (var xElement in eleList)
{
Console.WriteLine(xElement);
}

but it uses the Descendants() method to query the XML. From my understanding, this solution would fail if I was searching for because the same tag exists for , and
I'm pretty sure you're not quite understanding how that works. From the MSDN documentation:
Returns a filtered collection of the descendant elements for this document or element, in document order. Only elements that have a matching XName are included in the collection.
So in your case, just do this:
xDoc.RootNode
.Descendants("E5")
.Where(n => n.Parent.Name.LocalName == "B4");

Try this
var xDoc = XDocument.Parse("<A1><B2><C3 id=\"1\"><D7><E5 id=\"abc\" /></D7><D4 id=\"1\"><E5 id=\"abc\" /></D4><D4 id=\"2\"><E5 id=\"abc\" /></D4></C3></B2></A1>");
foreach (XElement item in xDoc.Element("A1").Elements("B2").Elements("C3").Elements("D4"))
{
Console.WriteLine(item.Element("E5").Value);//to get the value of E5
Console.WriteLine(item.Element("E5").Attribute("id").Value);//to get the value of attribute
}

Related

Retrieving specific data from XML file

Using LINQ to XML.
I have an XML file which looks like this:
<?xml version="1.0" encoding="utf-8"?>
<TileMap xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Title>title</Title>
<Abstract>Some clever text about this.</Abstract>
<SRS>OSGEO:41001</SRS>
<Profile>global-mercator or something</Profile>
</TileMap>
I can retrieve the <Title> from this with no problems by using this little piece of code:
string xmlString = AppDomain.CurrentDomain.BaseDirectory + #"Capabilities\" + name + ".xml";
string xmlText = File.ReadAllText(xmlString);
byte[] buffer = Encoding.UTF8.GetBytes(xmlText);
XElement element = XElement.Load(xmlString);
IEnumerable<XElement> title =
from el in element.Elements("Title")
select el;
foreach (XElement el in title)
{
var elementValue = el.Value;
}
However, this isn't very flexible because say I have an XML file that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<RootObject xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Services>
<TileMapService>
<Title>title</Title>
<href>http://localhost/root</href>
</TileMapService>
</Services>
</RootObject>
It can't find <Title> but it finds <Services> (I presume) but since it's not called "Title" it just ignores it. I'm not very strong in working with XML. How would I go about making a method that looks through the XML and fetches me "Title" or however you'd implement this?
You're currently just looking at the child elements of the root element.
Instead, if you want to find all descendants, use Descendants.
Additionally, there's no point in using a query expression of from x in y select x (or rather, there's a very limited point in some cases, but not here). So just use:
var titles = element.Descendants("Title");
Personally I would actually use XDocument here rather than XElement - you have after all got a whole document, complete with XML declaration, not just an element.
Change your LINQ query to:
IEnumerable<XElement> title =
from el in element.Descendants("Title")
select el;
Elements returns only the immediate children, Descendants returns all descendant nodes instead.
Descendants will select all the "Title" elements irrespective of the level. Please use xpath to correctly locate the element
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;
using System.IO;
public class Program
{
public static void Main()
{
string xmlFile = AppDomain.CurrentDomain.BaseDirectory + #"Capabilities\" + name + ".xml";
XElement xml=XElement.Load(xmlFile);
IEnumerable<XElement> titleElements = xml.XPathSelectElements("//Services/TileMapService/Title");
}
}

Searching through XML to find a list of items

I have an xml doc as such:
<?xml version="1.0" encoding="utf-8" ?>
<Categories>
<Category>
<Name>Fruit</Name>
<Items>
<Item>Apple</Item>
<Item>Banana</Item>
<Item>Peach</Item>
<Item>Strawberry</Item>
</Items>
</Category>
<Category>
<Name>Vegetable</Name>
<Items>
<Item>Carrots</Item>
<Item>Beets</Item>
<Item>Green Beans</Item>
<Item>Bell Pepper</Item>
</Items>
</Category>
<Category>
<Name>Seafood</Name>
<Items>
<Item>Crab</Item>
<Item>Lobster</Item>
<Item>Shrimp</Item>
<Item>Oysters</Item>
<Item>Salmon</Item>
</Items>
</Category>
</Categories>
I would like to be able to search on a term such as Category.Name = Fruit and get back the list of the Fruit Items.
Here is the incomplete code I've started so far:
string localPath = Server.MapPath("~/App_Data/Foods.xml");
XmlDocument doc = new XmlDocument();
doc.Load(localPath);
XmlNodeList list = doc.SelectNodes("Categories");
//Do something here to search the category names and get back the list of items.
This is my first attempt at parsing through XML so I'm a bit lost. Note: the application I am working on uses .Net 2.0
I'd suggest to read about XPath as you're limited to .NET 2.0, moreover XPath is very useful to work with XML even in more general context (not limited to .NET platform only).
In this particular case XPath become useful because SelectNodes() and SelectSingleNode() method accept XPath string as parameter. For example, to get all <Item> that corresponds to category name "Fruit" :
string localPath = Server.MapPath("~/App_Data/Foods.xml");
XmlDocument doc = new XmlDocument();
doc.Load(localPath);
XmlNodeList items = doc.SelectNodes("Categories/Category[Name='Fruit']/Items/Item");
foreach(XmlNode item in items)
{
Console.WriteLine(item.InnerText);
}
You can see XPath as a path, similar to file path in windows explorer. I'd try to explain only the bit that is different from common path expression in the above sample, particularly this bit :
...\Category[Name='Fruit']\...
the expression within square-brackets is a filter which say search for <Category> node having child node <Name> equals "Fruit".
You are on the right path. However, you would need to load the 'Categories' node first, then you can get it's child nodes.
I have added a filter to return only nodes where the name is "Fruit".
XmlNode cat = doc.SelectSingleNode("Categories");
var list = cat.SelectNodes("Category").Cast<XmlNode>()
.Where(c => c.SelectSingleNode("Name").InnerText == "Fruit");
foreach ( XmlNode item in list )
{
// process each node here
}

reading node from xml file in XMLDocument

i am trying to grab the TopicName how should i go after it and try different combination but somehow i am unable to get TopicName below is my source codee...
XmlDocument xdoc = new XmlDocument();//xml doc used for xml parsing
xdoc.Load(
"http://latestpackagingnews.blogspot.com/feeds/posts/default"
);//loading XML in xml doc
XmlNodeList xNodelst = xdoc.DocumentElement.SelectNodes("content");//reading node so that we can traverse thorugh the XML
foreach (XmlNode xNode in xNodelst)//traversing XML
{
//litFeed.Text += "read";
}
sample xml file
<content type="application/xml">
<CatalogItems xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="sitename.xsd">
<CatalogSource Acronym="ABC" OrganizationName="ABC Corporation" />
<CatalogItem Id="3212" CatalogUrl="urlname">
<ContentItem xmlns:content="sitename.xsd" TargetUrl="url">
<content:SelectionSpec ClassList="" ElementList="" />
<content:Language Value="eng" Scheme="ISO 639-2" />
<content:Source Acronym="ABC" OrganizationName="ABC Corporation" />
<content:Topics Scheme="ABC">
<content:Topic TopicName="Marketing" />
<content:Topic TopiccName="Coverage" />
</content:Topics>
</ContentItem>
</CatalogItem>
</CatalogItems>
</content>
The Topic nodes in your XML are using the content namespace - you need to declare and use the XML namespace in your code, then you can use SelectNodes() to grab the nodes of interest - this worked for me:
XmlNamespaceManager nsmgr = new XmlNamespaceManager(xdoc.NameTable);
nsmgr.AddNamespace("content", "sitename.xsd");
var topicNodes = xdoc.SelectNodes("//content:Topic", nsmgr);
foreach (XmlNode node in topicNodes)
{
string topic = node.Attributes["TopicName"].Value;
}
Just as a comparison see how easy this would be with Linq to XML:
XDocument xdoc = XDocument.Load("test.xml");
XNamespace ns = "sitename.xsd";
string topic = xdoc.Descendants(ns + "Topic")
.Select(x => (string)x.Attribute("TopicName"))
.FirstOrDefault();
To get all topics you can replace the last statement with:
var topics = xdoc.Descendants(ns + "Topic")
.Select(x => (string)x.Attribute("TopicName"))
.ToList();
If you just need a specific element, then I'd use XPath:
This is a guide to use XPath in C#:
http://www.codeproject.com/KB/XML/usingXPathNavigator.aspx
And this is the query that will get you a collection of your Topics:
//content/CatalogItems/CatalogItem/ContentItem/content:Topics/content:Topic
You could tweak this query depending on what it is you're trying to accomplish, grabbing just a specific TopicName value:
//content/CatalogItems/CatalogItem/ContentItem/content:Topics/content:Topic/#TopicName
XPath is pretty easy to learn. I've done stuff like this pretty quickly with no prior knowledge.
You can paste you XML and xpath query here to test your queries:
http://www.bit-101.com/xpath/
The following quick and dirty LINQ to XML code obtains your TopicNames and prints them on the console.
XDocument lDoc = XDocument.Load(lXmlDocUri);
foreach (var lElement in lDoc.Element("content").Element(XName.Get("CatalogItems", "sitename.xsd")).Elements(XName.Get("CatalogItem", "sitename.xsd")))
{
foreach (var lContentTopic in lElement.Element(XName.Get("ContentItem", "sitename.xsd")).Element(XName.Get("Topics", "sitename.xsd")).Elements(XName.Get("Topic", "sitename.xsd")))
{
string lTitle = lContentTopic.Attribute("TopicName").Value;
Console.WriteLine(lTitle);
}
}
It'd have been a lot shorter if it wasn't for all the namespaces :) (Instead of "XName.Get" you would just use the name of the element).

C# XPath Not Finding Anything

I'm trying to use XPath to select the items which have a facet with Location values, but currently my attempts even to just select all items fail: The system happily reports that it found 0 items, then returns (instead the nodes should be processed by a foreach loop). I'd appreciate help either making my original query or just getting XPath to work at all.
XML
<?xml version="1.0" encoding="UTF-8" ?>
<Collection Name="My Collection" SchemaVersion="1.0" xmlns="http://schemas.microsoft.com/collection/metadata/2009" xmlns:p="http://schemas.microsoft.com/livelabs/pivot/collection/2009" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FacetCategories>
<FacetCategory Name="Current Address" Type="Location"/>
<FacetCategory Name="Previous Addresses" Type="Location" />
</FacetCategories>
<Items>
<Item Id="1" Name="John Doe">
<Facets>
<Facet Name="Current Address">
<Location Value="101 America Rd, A Dorm Rm 000, Chapel Hill, NC 27514" />
</Facet>
<Facet Name="Previous Addresses">
<Location Value="123 Anywhere Ln, Darien, CT 06820" />
<Location Value="000 Foobar Rd, Cary, NC 27519" />
</Facet>
</Facets>
</Item>
</Items>
</Collection>
C#
public void countItems(string fileName)
{
XmlDocument document = new XmlDocument();
document.Load(fileName);
XmlNode root = document.DocumentElement;
XmlNodeList xnl = root.SelectNodes("//Item");
Console.WriteLine(String.Format("Found {0} items" , xnl.Count));
}
There's more to the method than this, but since this is all that gets run I'm assuming the problem lies here. Calling root.ChildNodes accurately returns FacetCategories and Items, so I am completely at a loss.
Thanks for your help!
Your root element has a namespace. You'll need to add a namespace resolver and prefix the elements in your query.
This article explains the solution. I've modified your code so that it gets 1 result.
public void countItems(string fileName)
{
XmlDocument document = new XmlDocument();
document.Load(fileName);
XmlNode root = document.DocumentElement;
// create ns manager
XmlNamespaceManager xmlnsManager = new XmlNamespaceManager(document.NameTable);
xmlnsManager.AddNamespace("def", "http://schemas.microsoft.com/collection/metadata/2009");
// use ns manager
XmlNodeList xnl = root.SelectNodes("//def:Item", xmlnsManager);
Response.Write(String.Format("Found {0} items" , xnl.Count));
}
Because you have an XML namespace on your root node, there is no such thing as "Item" in your XML document, only "[namespace]:Item", so when searching for a node with XPath, you need to specify the namespace.
If you don't like that, you can use the local-name() function to match all elements whose local name (the name part other than the prefix) is the value you're looking for. It's a bit ugly syntax, but it works.
XmlNodeList xnl = root.SelectNodes("//*[local-name()='Item']");

Search XDocument using LINQ without knowing the namespace

Is there a way to search an XDocument without knowing the namespace? I have a process that logs all SOAP requests and encrypts the sensitive data. I want to find any elements based on name. Something like, give me all elements where the name is CreditCard. I don't care what the namespace is.
My problem seems to be with LINQ and requiring a xml namespace.
I have other processes that retrieve values from XML, but I know the namespace for these other process.
XDocument xDocument = XDocument.Load(#"C:\temp\Packet.xml");
XNamespace xNamespace = "http://CompanyName.AppName.Service.Contracts";
var elements = xDocument.Root
.DescendantsAndSelf()
.Elements()
.Where(d => d.Name == xNamespace + "CreditCardNumber");
I really want to have the ability to search xml without knowing about namespaces, something like this:
XDocument xDocument = XDocument.Load(#"C:\temp\Packet.xml");
var elements = xDocument.Root
.DescendantsAndSelf()
.Elements()
.Where(d => d.Name == "CreditCardNumber")
This will not work because I don't know the namespace beforehand at compile time.
How can this be done?
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Request xmlns="http://CompanyName.AppName.Service.ContractA">
<Person>
<CreditCardNumber>83838</CreditCardNumber>
<FirstName>Tom</FirstName>
<LastName>Jackson</LastName>
</Person>
<Person>
<CreditCardNumber>789875</CreditCardNumber>
<FirstName>Chris</FirstName>
<LastName>Smith</LastName>
</Person>
...
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Request xmlns="http://CompanyName.AppName.Service.ContractsB">
<Transaction>
<CreditCardNumber>83838</CreditCardNumber>
<TransactionID>64588</FirstName>
</Transaction>
...
As Adam precises in the comment, XName are convertible to a string, but that string requires the namespace when there is one. That's why the comparison of .Name to a string fails, or why you can't pass "Person" as a parameter to the XLinq Method to filter on their name.
XName consists of a prefix (the Namespace) and a LocalName. The local name is what you want to query on if you are ignoring namespaces.
Thank you Adam :)
You can't put the Name of the node as a parameter of the .Descendants() method, but you can query that way :
var doc= XElement.Parse(
#"<s:Envelope xmlns:s=""http://schemas.xmlsoap.org/soap/envelope/"">
<s:Body xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"">
<Request xmlns=""http://CompanyName.AppName.Service.ContractA"">
<Person>
<CreditCardNumber>83838</CreditCardNumber>
<FirstName>Tom</FirstName>
<LastName>Jackson</LastName>
</Person>
<Person>
<CreditCardNumber>789875</CreditCardNumber>
<FirstName>Chris</FirstName>
<LastName>Smith</LastName>
</Person>
</Request>
</s:Body>
</s:Envelope>");
EDIT : bad copy/past from my test :)
var persons = from p in doc.Descendants()
where p.Name.LocalName == "Person"
select p;
foreach (var p in persons)
{
Console.WriteLine(p);
}
That works for me...
You could take the namespace from the root-element:
XDocument xDocument = XDocument.Load(#"C:\temp\Packet.xml");
var ns = xDocument.Root.Name.Namespace;
Now you can get all desired elements easily using the plus-operator:
root.Elements(ns + "CreditCardNumber")
I think I found what I was looking for. You can see in the following code I do the evaluation Element.Name.LocalName == "CreditCardNumber". This seemed to work in my tests. I'm not sure if it's a best practice, but I'm going to use it.
XDocument xDocument = XDocument.Load(#"C:\temp\Packet.xml");
var elements = xDocument.Root.DescendantsAndSelf().Elements().Where(d => d.Name.LocalName == "CreditCardNumber");
Now I have elements where I can encrypt the values.
If anyone has a better solution, please provide it. Thanks.
There's a couple answers with extension methods that have been deleted. Not sure why. Here's my version that works for my needs.
public static class XElementExtensions
{
public static XElement ElementByLocalName(this XElement element, string localName)
{
return element.Descendants().FirstOrDefault(e => e.Name.LocalName == localName && !e.IsEmpty);
}
}
The IsEmpty is to filter out nodes with x:nil="true"
There may be additional subtleties - so use with caution.
If your XML documents always defines the namespace in the same node (Request node in the two examples given), you can determine it by making a query and seeing what namespace the result has:
XDocument xDoc = XDocument.Load("filename.xml");
//Initial query to get namespace:
var reqNodes = from el in xDoc.Root.Descendants()
where el.Name.LocalName == "Request"
select el;
foreach(var reqNode in reqNodes)
{
XNamespace xns = reqNode.Name.Namespace;
//Queries making use of namespace:
var person = from el in reqNode.Elements(xns + "Person")
select el;
}
I a suffering from a major case of "I know that is the solution, but I am disappointed that that is the solution"... I recently wrote a query like the one below (which I will shortly replace, but it has educational value):
var result = xdoc.Descendants("{urn:schemas-microsoft-com:rowset}data")
.FirstOrDefault()?
.Descendants("{#RowsetSchema}row");
If I remove the namespaces from the XML, I can write the same query like this:
var result = xdoc.Descendants("data")
.FirstOrDefault()?
.Descendants("row");
I plan to write my own extension methods that should allow me to leave the namespaces alone and search for nodes like this:
var result = xdoc.Descendants("rs:data")
.FirstOrDefault()?
.Descendants("z:row");
//'rs:' {refers to urn:schemas-microsoft-com:rowset}
//'z:' {refers to xmlns:z=#RowsetSchema}
My comments just below the code point to how I would like to hide the ugliness of the solution in an Extension Methods library. Again, I'm aware of the solutions posted earlier - but I wish the API itself handled this more fluently. (See what I did there?)
Just use the Descendents method:
XDocument doc = XDocument.Load(filename);
String[] creditCards = (from creditCardNode in doc.Root.Descendents("CreditCardNumber")
select creditCardNode.Value).ToArray<string>();

Categories