Searching through XML to find a list of items - c#

I have an xml doc as such:
<?xml version="1.0" encoding="utf-8" ?>
<Categories>
<Category>
<Name>Fruit</Name>
<Items>
<Item>Apple</Item>
<Item>Banana</Item>
<Item>Peach</Item>
<Item>Strawberry</Item>
</Items>
</Category>
<Category>
<Name>Vegetable</Name>
<Items>
<Item>Carrots</Item>
<Item>Beets</Item>
<Item>Green Beans</Item>
<Item>Bell Pepper</Item>
</Items>
</Category>
<Category>
<Name>Seafood</Name>
<Items>
<Item>Crab</Item>
<Item>Lobster</Item>
<Item>Shrimp</Item>
<Item>Oysters</Item>
<Item>Salmon</Item>
</Items>
</Category>
</Categories>
I would like to be able to search on a term such as Category.Name = Fruit and get back the list of the Fruit Items.
Here is the incomplete code I've started so far:
string localPath = Server.MapPath("~/App_Data/Foods.xml");
XmlDocument doc = new XmlDocument();
doc.Load(localPath);
XmlNodeList list = doc.SelectNodes("Categories");
//Do something here to search the category names and get back the list of items.
This is my first attempt at parsing through XML so I'm a bit lost. Note: the application I am working on uses .Net 2.0

I'd suggest to read about XPath as you're limited to .NET 2.0, moreover XPath is very useful to work with XML even in more general context (not limited to .NET platform only).
In this particular case XPath become useful because SelectNodes() and SelectSingleNode() method accept XPath string as parameter. For example, to get all <Item> that corresponds to category name "Fruit" :
string localPath = Server.MapPath("~/App_Data/Foods.xml");
XmlDocument doc = new XmlDocument();
doc.Load(localPath);
XmlNodeList items = doc.SelectNodes("Categories/Category[Name='Fruit']/Items/Item");
foreach(XmlNode item in items)
{
Console.WriteLine(item.InnerText);
}
You can see XPath as a path, similar to file path in windows explorer. I'd try to explain only the bit that is different from common path expression in the above sample, particularly this bit :
...\Category[Name='Fruit']\...
the expression within square-brackets is a filter which say search for <Category> node having child node <Name> equals "Fruit".

You are on the right path. However, you would need to load the 'Categories' node first, then you can get it's child nodes.
I have added a filter to return only nodes where the name is "Fruit".
XmlNode cat = doc.SelectSingleNode("Categories");
var list = cat.SelectNodes("Category").Cast<XmlNode>()
.Where(c => c.SelectSingleNode("Name").InnerText == "Fruit");
foreach ( XmlNode item in list )
{
// process each node here
}

Related

Search through XML and grab another Node

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Product>
<SKU>33333-01</SKU>
</Product>
</Message>
</Envelope>
I've tried googling but whether I'm just not providing the correct search criteria I don't know.
I want to be able to search the XML file based on the MessageID and then grab the SKU.
I then want to search another XML file based on the SKU and remove that message completely.
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Inventory>
<SKU>33333-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
Meaning the XML above becomes:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
To confirm I cannot confirm that the MessageID will be the same over different XML files.
Thanks in advance for any help.
My questions:
How do I search through XML files?
How do I then grab another Nodes details
Can I remove a complete from an XML file based on a search?
You can use XmlDocument to load your XML document. Then, you can use XPath for searching any nodes.
XmlDocument document = new XmlDocument();
document.Load("C:\fileOnTheDisk.xml");
// or
document.LoadXml("<a>someXmlString</a>");
// Returns single element or null if not found
var singleNode = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
// Returns a NodeList
var nodesList = document.SelectNodes("Envelope/Message[MessageID = '1']");
Read more about XPath at w3schools.com.
Here is a good XPath Tester.
For example, you can use the following XPath to find nodes in your document by ID:
XmlDocument document = new XmlDocument();
document.Load("C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
var sku = node.SelectSingleNode("Inventory/SKU").InnerText;
Console.WriteLine("{0} node has SKU = {1}", 1, sku);
Or you can output all SKUs:
foreach (XmlNode node in document.SelectNodes("Envelope/Message"))
{
Console.WriteLine("{0} node has SKU = {1}",
node.SelectSingleNode("MessageID").InnerText,
node.SelectSingleNode("Inventory/SKU").InnerText);
}
It will produce:
1 node has SKU = 33333-01
2 node has SKU = 22222-01
Note that there are possible NullReferenceExceptions if nodes are not present.
You can simply remove it using RemoveChild() method of its parent.
XmlDocument document = new XmlDocument();
document.Load("C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
node.ParentNode.RemoveChild(node);
document.Save("C:\docNew.xml"); // will be without Message 1
You can use Linq to XML to do this:
var doc= XDocument.Load("input.xml");//path of your xml file in which you want to search based on message id.
var searchNode= doc.Descendants("MessageID").FirstOrDefault(d => d.Value == "1");// It will search message node where its value is 1 and get first of it
if(searchNode!=null)
{
var SKU=searchNode.Parent.Descendants("SKU").FirstOrDefault();
if(SKU!=null)
{
var searchDoc=XDocument.Load("search.xml");//path of xml file where you want to search based on SKU value.
var nodes =searchDoc.Descendants("SKU").Where(d=>d.Value==SKU.Value).Select(d=>d.Parent.Parent).ToList();
nodes.ForEach(node=>node.Remove());
searchDoc.Save("output.xml");//path of output file
}
}
I'd recommend you did this using LINQ to XML - it's much nicer to work with than the old XmlDocument API.
For all the examples, you can parse your XML string xml to an XDocument like so:
var doc = XDocument.Parse(xml);
1. How do I search through XML files?
You can get the SKU for a specific message ID by querying your document:
var sku = (string)doc.Descendants("Message")
.Where(e => (int)e.Element("MessageID") == 1)
.SelectMany(e => e.Descendants("SKU"))
.Single();
2. How do I then grab another Nodes details?
You can get the Message element with a specified SKU using a another query:
var message = doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Single();
3. Can I remove a complete element from an XML file based on a search?
Using your result from step 2, you can simple call Remove:
message.Remove();
Alternatively, you can combine the query from step 2 and simply execute a command to remove any messages that have a specific SKU:
doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Remove();
I tried to answer all your questions:
using System.Xml.XPath;
using System.Xml.Linq;
XDocument xdoc1 = XDocument.Load("xml1.xml");
XDocument xdoc2 = XDocument.Load("xml2.xml");
string sku = String.Empty;
string searchedID = "2";
//1.searching through an xml file based on path
foreach (XElement message in xdoc1.XPathSelectElements("Envelope/Message"))
{
if (message.Element("MessageID").Value.Equals(searchedID))
{
//2.grabbing another node's details
sku = message.XPathSelectElement("Inventory/SKU").Value;
}
}
foreach (XElement message in xdoc2.XPathSelectElements("Envelope/Message"))
{
if (message.XPathSelectElement("Inventory/SKU") != null && message.XPathSelectElement("Inventory/SKU").Value.Equals(sku))
{
//removing a node
message.Remove();
}
}
xdoc2.Save("xml2_del.xml");
}

Getting an XElement with a namespace via XPathSelectElements

I have an XML e.g.
<?xml version="1.0" encoding="utf-8"?>
<A1>
<B2>
<C3 id="1">
<D7>
<E5 id="abc" />
</D7>
<D4 id="1">
<E5 id="abc" />
</D4>
<D4 id="2">
<E5 id="abc" />
</D4>
</C3>
</B2>
</A1>
This is may sample code:
var xDoc = XDocument.Load("Test.xml");
string xPath = "//B2/C3/D4";
//or string xPath = "//B2/C3/D4[#id='1']";
var eleList = xDoc.XPathSelectElements(xPath).ToList();
foreach (var xElement in eleList)
{
Console.WriteLine(xElement);
}
It works perfectly, but if I add a namespace to the root node A1, this code doesn't work.
Upon searching for solutions, I found this one, but it uses the Descendants() method to query the XML. From my understanding, this solution would fail if I was searching for <E5> because the same tag exists for <D7>, <D4 id="1"> and <D4 id="2">
My requirement is to search if a node exists at a particular XPath. If there is a way of doing this using Descendants, I'd be delighted to use it. If not, please guide me on how to search using the name space.
My apologies in case this is a duplicate.
To keep using XPath, you can use something link this:
var xDoc = XDocument.Parse(#"<?xml version='1.0' encoding='utf-8'?>
<A1 xmlns='urn:sample'>
<B2>
<C3 id='1'>
<D7><E5 id='abc' /></D7>
<D4 id='1'><E5 id='abc' /></D4>
<D4 id='2'><E5 id='abc' /></D4>
</C3>
</B2>
</A1>");
// Notice this
XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
nsmgr.AddNamespace("sample", "urn:sample");
string xPath = "//sample:B2/sample:C3/sample:D4";
var eleList = xDoc.XPathSelectElements(xPath, nsmgr).ToList();
foreach (var xElement in eleList)
{
Console.WriteLine(xElement);
}
but it uses the Descendants() method to query the XML. From my understanding, this solution would fail if I was searching for because the same tag exists for , and
I'm pretty sure you're not quite understanding how that works. From the MSDN documentation:
Returns a filtered collection of the descendant elements for this document or element, in document order. Only elements that have a matching XName are included in the collection.
So in your case, just do this:
xDoc.RootNode
.Descendants("E5")
.Where(n => n.Parent.Name.LocalName == "B4");
Try this
var xDoc = XDocument.Parse("<A1><B2><C3 id=\"1\"><D7><E5 id=\"abc\" /></D7><D4 id=\"1\"><E5 id=\"abc\" /></D4><D4 id=\"2\"><E5 id=\"abc\" /></D4></C3></B2></A1>");
foreach (XElement item in xDoc.Element("A1").Elements("B2").Elements("C3").Elements("D4"))
{
Console.WriteLine(item.Element("E5").Value);//to get the value of E5
Console.WriteLine(item.Element("E5").Attribute("id").Value);//to get the value of attribute
}

Extract nodes from XmlNodeList, with namespace, and where same child appears at multiple levels

I am trying to extract those subnodes, but I had got just headache so far...
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<supplyCrew xmlns="http://site.ddf.com">
<login>
<login>XXXX</login>
<password>XXXX</password>
</login>
<flightInformation>
<flights>
<item>
<arrivalDateTime>2010-11-08T22:48:00.000Z</arrivalDateTime>
<arrivingCity>ORD</arrivingCity>
<crewMembers>
<item>
<employeeId>020040</employeeId>
<isDepositor>Y</isDepositor>
<isTransmitter>N</isTransmitter>
</item>
<item>
<employeeId>09000</employeeId>
<isDepositor>N</isDepositor>
<isTransmitter>Y</isTransmitter>
</item>
</crewMembers>
</item>
<item>
<arrivalDateTime>2010-11-08T20:29:00.000Z</arrivalDateTime>
<arrivingCity>JFK</arrivingCity>
<crewMembers>
<item>
<employeeId>0538</employeeId>
<isDepositor>Y</isDepositor>
<isTransmitter>N</isTransmitter>
</item>
<item>
<employeeId>097790</employeeId>
<isDepositor>N</isDepositor>
<isTransmitter>Y</isTransmitter>
</item>
with the code I can get them, but I do not know how to select each one according to their tag name to insert them into a database.
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("C:/Crew_Request_Sample.xml");
XmlNodeList elemList = xmlDoc.GetElementsByTagName("item");
foreach (XmlNode node in elemList)
{
Debug.WriteLine(node.InnerText);
}
I need some direction, please.
The problem with using GetElementsByTagName("item") here is that there are 2 levels of item node - one as a child of flights and another item as a child of crewMembers.
Edit Now that the full xml is pasted, it is clear that there is also a namespace involved as well. To handle namespaces, make use of a namespace manager to define aliases for the namespaces, which you can then use in the xpath queries:
var nsm = new XmlNamespaceManager(xmlDoc.NameTable);
nsm.AddNamespace("s", "http://site.ddf.com");
var elemList = xmlDoc.SelectNodes("//s:crewMembers/s:item", nsm);
foreach (var node in elemList)
{
Debug.WriteLine(node.SelectSingleNode("s:employeeId", nsm).InnerText);
Debug.WriteLine(node.SelectSingleNode("s:isDepositor", nsm).InnerText);
Debug.WriteLine(node.SelectSingleNode("s:isTransmitter", nsm).InnerText);
}
You can do it using LINQ2XML..
XElement doc=XElement.Load("C:/Crew_Request_Sample.xml");
XNamespace e = "http://schemas.xmlsoap.org/soap/envelope/";
XNamespace s = "http://site.ddf.com";
//this would access the nodes of item->crewMembers->item and put it into an Anonymous Type
var yourList=doc.Descendants(e+"Body")
.Descendants(s+"supplyCrew")
.Descendants(s+"flightInformation")
.Descendants(s+"flights")
.Descendants(s+"item")
.Descendants(s+"crewMembers")
.Descendants(s+"item")
.Select(
x=>new
{
//Anonymous Type
employeeId=x.Element(s+"employeeId").Value,
isDepositor=x.Element(s+"isDepositor").Value,
isTransmitter=x.Element(s+"isTransmitter").Value
}
);
You can then access yourList using for-each loop
foreach(var item in yourList)
{
Console.WriteLine(item.employeeId);
Console.WriteLine(item.isDepositor);
Console.WriteLine(item.isTransmitter);
}
I think you'll do it faster and easily using this technique
Linq To XML
There is a lot of examples in the site, so it 'll be easy to find what you want.
Hope it helps.

How do I put XML into a List of Dictionaries with C# on Windows Phone 7

Here is the XML I have in a file:
SPECIAL NOTE:
This is a question for Windows Phone 7, not general C#
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<item>
<date>01/01</date>
<word>aberrant</word>
<def>straying from the right or normal way</def>
</item>
<item>
<date>01/02</date>
<word>Zeitgeist</word>
<def>the spirit of the time.</def>
</item>
</rss>
I need it in a List (aka array) of Dictionary objects. Each Dictionary represents an <item>. Each element like <word> is the key with type string and each value like "Zeitgeist" is the value with type string.
Is there any easy way to do this? I'm coming from Objective-C and iOS so this is completely new to me with .NET and C#.
LINQ-to-XML makes it pretty easy. Here's a complete example:
public static void Main(string[] args)
{
string xml = #"
<rss version='2.0'>
<item>
<date>01/01</date>
<word>aberrant</word>
<def>straying from the right or normal way</def>
</item>
<item>
<date>01/02</date>
<word>Zeitgeist</word>
<def>the spirit of the time.</def>
</item>
</rss>";
var xdoc = XDocument.Parse(xml);
var result = xdoc.Root.Elements("item")
.Select(itemElem => itemElem.Elements().ToDictionary(e => e.Name.LocalName, e => e.Value))
.ToList();
}
Instead of loading from a string with XDocument.Parse(), you would probably do XDocument.Load(filename) but either way you get an XDocument object to work with (I did a string just for example).
You can use Linq-Xml to do this:
var doc = XDocument.Parse(xml); //xml is a String with your XML in it.
doc
.Root //Elements under the root element.
.Elements("item") //Select the elements called "item".
.Select( //Projecting each item element to something new.
item => //Selecting each element in the item.
item //And creating a new dictionary using the element name
.Elements() // as the key and element value as the value.
.ToDictionary(xe => xe.Name.LocalName, xe => xe.Value))
.ToList();
Yes, there is an easy way, it's called LINQ to XML.
Some resources:
Parsing complex XML with C#
LINQ to read XML
http://msdn.microsoft.com/en-us/library/bb387098.aspx
Hope this helps...

C# XPath Not Finding Anything

I'm trying to use XPath to select the items which have a facet with Location values, but currently my attempts even to just select all items fail: The system happily reports that it found 0 items, then returns (instead the nodes should be processed by a foreach loop). I'd appreciate help either making my original query or just getting XPath to work at all.
XML
<?xml version="1.0" encoding="UTF-8" ?>
<Collection Name="My Collection" SchemaVersion="1.0" xmlns="http://schemas.microsoft.com/collection/metadata/2009" xmlns:p="http://schemas.microsoft.com/livelabs/pivot/collection/2009" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FacetCategories>
<FacetCategory Name="Current Address" Type="Location"/>
<FacetCategory Name="Previous Addresses" Type="Location" />
</FacetCategories>
<Items>
<Item Id="1" Name="John Doe">
<Facets>
<Facet Name="Current Address">
<Location Value="101 America Rd, A Dorm Rm 000, Chapel Hill, NC 27514" />
</Facet>
<Facet Name="Previous Addresses">
<Location Value="123 Anywhere Ln, Darien, CT 06820" />
<Location Value="000 Foobar Rd, Cary, NC 27519" />
</Facet>
</Facets>
</Item>
</Items>
</Collection>
C#
public void countItems(string fileName)
{
XmlDocument document = new XmlDocument();
document.Load(fileName);
XmlNode root = document.DocumentElement;
XmlNodeList xnl = root.SelectNodes("//Item");
Console.WriteLine(String.Format("Found {0} items" , xnl.Count));
}
There's more to the method than this, but since this is all that gets run I'm assuming the problem lies here. Calling root.ChildNodes accurately returns FacetCategories and Items, so I am completely at a loss.
Thanks for your help!
Your root element has a namespace. You'll need to add a namespace resolver and prefix the elements in your query.
This article explains the solution. I've modified your code so that it gets 1 result.
public void countItems(string fileName)
{
XmlDocument document = new XmlDocument();
document.Load(fileName);
XmlNode root = document.DocumentElement;
// create ns manager
XmlNamespaceManager xmlnsManager = new XmlNamespaceManager(document.NameTable);
xmlnsManager.AddNamespace("def", "http://schemas.microsoft.com/collection/metadata/2009");
// use ns manager
XmlNodeList xnl = root.SelectNodes("//def:Item", xmlnsManager);
Response.Write(String.Format("Found {0} items" , xnl.Count));
}
Because you have an XML namespace on your root node, there is no such thing as "Item" in your XML document, only "[namespace]:Item", so when searching for a node with XPath, you need to specify the namespace.
If you don't like that, you can use the local-name() function to match all elements whose local name (the name part other than the prefix) is the value you're looking for. It's a bit ugly syntax, but it works.
XmlNodeList xnl = root.SelectNodes("//*[local-name()='Item']");

Categories