Filter XDocument more efficiently - c#

I would like to filter XML elements from an XML document with good performance.
Take for instance this XML file with contacts:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="asistentes.xslt"?>
<contactlist evento="Cena Navidad 2010" empresa="company">
<contact type="1" id="1">
<name>Name1</name>
<email>xxxx@zzzz.es</email>
<confirmado>SI</confirmado>
</contact>
<contact type="1" id="2">
<name>Name2</name>
<email>xxxxxxxxx@zzzze.es</email>
<confirmado>Sin confirmar</confirmado>
</contact>
</contactlist>
My current code to filter from this XML document:
using System;
using System.Xml.Linq;
class Test
{
static void Main()
{
string xml = @" the xml above";
XDocument doc = XDocument.Parse(xml);
foreach (XElement element in doc.Descendants("contact")) {
Console.WriteLine(element);
var id = element.Attribute("id").Value;
var valor = element.Descendants("confirmado").ToList()[0].Value;
var email = element.Descendants("email").ToList()[0].Value;
var name = element.Descendants("name").ToList()[0].Value;
if (valor.ToString() == "SI") { }
}
}
}
What would be the best way to optimize this code to filter on <confirmado> element content?

var doc = XDocument.Parse(xml);
var query = from contact in doc.Root.Elements("contact")
let confirmado = (string)contact.Element("confirmado")
where confirmado == "SI"
select new
{
Id = (int)contact.Attribute("id"),
Name = (string)contact.Element("name"),
Email = (string)contact.Element("email"),
Valor = confirmado
};
foreach (var contact in query)
{
...
}
Points of interest:
doc.Root.Elements("contact") selects only the <contact> elements in the document root, instead of searching the whole document for <contact> elements.
The XElement.Element method returns the first child element with the given name. No need to convert the child elements to a list and take the first element.
The XElement and XAttribute classes provide a wide selection of convenient conversion operators.
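The conversion operators mentioned above also help when an element or attribute may be missing: casting a null XElement to string, or a null XAttribute to int?, yields null, whereas reading .Value on a missing node throws. A minimal sketch follows; the class name, method name, and the reduced sample XML are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ConfirmedContacts
{
    // Returns the ids of contacts whose <confirmado> text equals "SI".
    public static List<int> ConfirmedIds(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Elements("contact")
            // The string cast yields null for a missing element instead of throwing.
            .Where(c => (string)c.Element("confirmado") == "SI")
            // The int? cast yields null for a missing attribute.
            .Select(c => (int?)c.Attribute("id"))
            .Where(id => id.HasValue)
            .Select(id => id.Value)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample: the second contact has no <confirmado>, the third has no id.
        string xml = @"<contactlist>
  <contact id=""1""><confirmado>SI</confirmado></contact>
  <contact id=""2""><name>Name2</name></contact>
  <contact><confirmado>SI</confirmado></contact>
</contactlist>";
        Console.WriteLine(string.Join(",", ConfirmedIds(xml)));
    }
}
```

Contacts that lack the element or the attribute are skipped rather than causing a NullReferenceException.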

You could use LINQ:
foreach (XElement element in doc.Descendants("contact").Where(c => c.Element("confirmado").Value == "SI"))

Related

Parsing SOAP response in C#

I am new to C#. I make a SOAP request and, in the SOAP response, I need to access the repeating 'ABC' nodes. This is what my SOAP response looks like:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Header>
<work:WorkContext xmlns:work="http://example.com/soap/workarea/">sdhjasdajsdhj=</work:WorkContext>
</env:Header>
<env:Body>
<ReadABCResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<ABC xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<asd xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true"/>
<xyz xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true"/>
</ABC>
<ABC xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<asd xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true"/>
<xyz xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true"/>
</ABC>
</ReadABCResponse>
</env:Body>
</env:Envelope>
My code is as below:
XmlDocument responseDoc = new XmlDocument();
responseDoc.LoadXml(responseString); //responseString is set to above SOAP response.
XmlNamespaceManager nsmgr = new XmlNamespaceManager(responseDoc.NameTable);
nsmgr.AddNamespace("env", "http://schemas.xmlsoap.org/soap/envelope/");
nsmgr.AddNamespace("work", "http://example.com/soap/workarea/");
nsmgr.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
nsmgr.AddNamespace("", "http://xmlns.example.com/abc/a6/AB/XYZ/V1");
XmlNodeList lst = responseDoc.SelectNodes("/env:Envelope/env:Body/ReadABCResponse/ABC", nsmgr);
Console.WriteLine("Count " + lst.Count);
// and then iterate over the repeating ABC nodes to do some work.
However, Count is always printed as 0. I have tried different XPath expressions in SelectNodes, including "//ABC", which I thought would return all the repeating 'ABC' nodes, but it does not.
What is wrong with my code? Can someone please point it out and help me?
I have looked around on this site but can't figure out what is wrong with this code.
The following shows how to use XDocument to read data from the XML.
Test.xml:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Header>
<work:WorkContext xmlns:work="http://example.com/soap/workarea/">sdhjasdajsdhj=</work:WorkContext>
</env:Header>
<env:Body>
<ReadABCResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<ABC xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<asd xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true">asd data 1</asd>
<xyz xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true">xyz data 1</xyz>
</ABC>
<ABC xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1">
<asd xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true">asd data 2</asd>
<xyz xmlns="http://xmlns.example.com/abc/a6/AB/XYZ/V1" xsi:nil="true">xyz data 2</xyz>
</ABC>
</ReadABCResponse>
</env:Body>
</env:Envelope>
Add the following using statements:
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
Create a class (name: ABC.cs)
public class ABC
{
public string Asd { get; set; }
public string Xyz { get; set; }
}
Option 1:
private void GetABC()
{
//ToDo: replace with your XML data
string xmlText = "your XML data...";
//parse XML
XDocument doc = XDocument.Parse(xmlText);
//create new instance
List<ABC> abcs = new List<ABC>();
foreach (XElement elem in doc.Descendants().Where(x => x.Name.LocalName == "ABC"))
{
//create new instance
ABC abc = new ABC();
foreach (XElement elemChild in elem.Descendants())
{
//Debug.WriteLine($"{elemChild.Name}: '{elemChild.Value?.ToString()}'");
if (elemChild.Name.LocalName == "asd")
abc.Asd = elemChild.Value?.ToString();
else if (elemChild.Name.LocalName == "xyz")
abc.Xyz = elemChild.Value?.ToString();
}
//add to List
abcs.Add(abc);
}
foreach (ABC abc in abcs)
{
Debug.WriteLine($"ABC: '{abc.Asd}' XYZ: '{abc.Xyz}'");
}
}
Option 2:
private void GetABC()
{
//ToDo: replace with your XML data
string xmlText = "your XML data...";
//parse XML
XDocument doc = XDocument.Parse(xmlText);
//get namespace
XNamespace nsABC = doc.Descendants().Where(x => x.Name.LocalName == "ABC").FirstOrDefault().GetDefaultNamespace();
List<ABC> abcs = doc.Descendants().Where(x => x.Name.LocalName == "ABC").Select(x2 => new ABC()
{
Asd = (string)x2.Element(nsABC + "asd"),
Xyz = (string)x2.Element(nsABC + "xyz")
}).ToList();
foreach (ABC abc in abcs)
{
Debug.WriteLine($"ABC: '{abc.Asd}' XYZ: '{abc.Xyz}'");
}
}
Resources:
XDocument
How to Read SOAP XML Response in VB.NET
You can also do this: copy the XML and paste it into Visual Studio using Paste Special -> Paste XML As Classes. You will then have an Envelope class, and you can deserialize the XML as in this example: https://learn.microsoft.com/en-us/dotnet/api/system.xml.serialization.xmlserializer.deserialize?view=net-7.0.
Soap messages can be quite complex and it would be easier to work with objects which you can modify and eventually serialize back if needed.
Thanks for your responses, guys, but I managed to solve it. In the namespace manager, I made the following change:
nsmgr.AddNamespace("x", "http://xmlns.example.com/abc/a6/AB/XYZ/V1");
and when listing the ABC nodes, I made the following change i.e. to prefix ABC with x:
XmlNodeList lst = responseDoc.SelectNodes("//x:ABC", nsmgr);
Rest of the code remains as it is and now I can loop through all the ABC nodes.
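The reason the prefix is needed is that XPath 1.0 treats an unprefixed name such as ABC as belonging to no namespace, so a default namespace registered under the empty prefix is not applied to it. The sketch below shows the working pattern in isolation; the class name, the prefix "x", and the reduced sample document with the namespace urn:demo are assumptions for illustration.

```csharp
using System;
using System.Xml;

public static class NamespacedXPath
{
    // Counts <ABC> elements in the given namespace using a prefixed XPath query.
    public static int CountAbc(string xml, string abcNamespace)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        // "x" is an arbitrary prefix chosen for the query; it need not appear in the XML.
        nsmgr.AddNamespace("x", abcNamespace);
        return doc.SelectNodes("//x:ABC", nsmgr).Count;
    }

    public static void Main()
    {
        // Hypothetical reduced response with a default namespace on the payload.
        string xml = @"<Body><R xmlns=""urn:demo""><ABC/><ABC/></R></Body>";
        Console.WriteLine(CountAbc(xml, "urn:demo"));
    }
}
```

Only elements whose namespace URI matches the one registered for the prefix are returned, so the URI passed to AddNamespace has to match the document exactly.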

C# : XML Parsing : Group XML on a node and then subGroup underneath the same group

<?xml version="1.0" encoding="UTF-8"?>
<Batch Id="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<PurchaseOrders>
<PurchaseOrder id="xx267681">
<Header>
<AccountNumber>999</AccountNumber>
<ShipDate>2/10/2009</ShipDate>
</Header>
<PurchaseOrderDetails>
<Item>
<ItemNumber>yy235240</ItemNumber>
<Quantity>200</Quantity>
</Item>
<Item>
<ItemNumber>yy336820</ItemNumber>
<Quantity>3</Quantity>
</Item>
</PurchaseOrderDetails>
</PurchaseOrder>
<PurchaseOrder id="zz267456">
<Header>
<AccountNumber>123</AccountNumber>
<ShipDate>2/10/2009</ShipDate>
</Header>
<PurchaseOrderDetails>
<Item>
<ItemNumber>nn235240</ItemNumber>
<Quantity>200</Quantity>
</Item>
</PurchaseOrderDetails>
</PurchaseOrder>
</PurchaseOrders>
</Batch>
Attached above is the XML file I am trying to parse. My current C# code finds all items in the XML file and assigns them to the PO#. However, I recently learned that the same XML file can contain multiple PO#s, so I now need to find only the items that belong to each PO#.
So in the above example, the PO with id xx267681 has 2 items, whereas the 2nd PO has only one item.
Here is what I tried so far.
try
{
ArrayList ItemsInFeed = new ArrayList();
XDocument xDoc = XDocument.Load(fileName);
foreach (var node in xDoc.Descendants("PurchaseOrder"))
{
poID = node.Attribute("id").Value;
}
foreach (var node in xDoc.Descendants("Item"))
{
Items itemRcd = new Items();
itemRcd.ItemNr = node.Descendants("ItemNumber")?.First().Value;
ItemsInFeed.Add(itemRcd);
}
if (ItemsInFeed.Count > 0)
{
// Do other logic based on the items linked to each PO#.
// Issue found : So far each XML file has one PO#, but latest XML file received has more than PO# and underlying items.
ItemsInFeed.Clear();
}
}
catch (Exception ex)
{
//Catch exception here
}
Try the following:
XDocument xDoc = XDocument.Load(fileName);
var results = xDoc.Descendants("PurchaseOrder").Select(x => new
{
poID = (string)x.Attribute("id"),
items = x.Descendants("Item").Select(y => new
{
itemNumber = (string)y.Element("ItemNumber"),
quantity = (int)y.Element("Quantity")
}).ToList()
}).ToList();
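Because the items are selected from each PurchaseOrder element rather than from the whole document, every order only sees its own items. A small self-contained sketch of the same idea, returning a dictionary keyed by PO id, is shown here; the class and method names are hypothetical, and it assumes every PurchaseOrder has a unique, non-empty id attribute (ToDictionary throws otherwise).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PurchaseOrderItems
{
    // Maps each PurchaseOrder id to the item numbers nested under that order.
    public static Dictionary<string, List<string>> ItemsByOrder(XDocument doc)
    {
        return doc.Descendants("PurchaseOrder").ToDictionary(
            po => (string)po.Attribute("id"),
            // Descendants is called on the order element, so only its own items are found.
            po => po.Descendants("Item")
                    .Select(i => (string)i.Element("ItemNumber"))
                    .ToList());
    }

    public static void Main()
    {
        // Reduced version of the sample batch from the question.
        var doc = XDocument.Parse(@"<Batch><PurchaseOrders>
  <PurchaseOrder id=""xx267681""><PurchaseOrderDetails>
    <Item><ItemNumber>yy235240</ItemNumber></Item>
    <Item><ItemNumber>yy336820</ItemNumber></Item>
  </PurchaseOrderDetails></PurchaseOrder>
  <PurchaseOrder id=""zz267456""><PurchaseOrderDetails>
    <Item><ItemNumber>nn235240</ItemNumber></Item>
  </PurchaseOrderDetails></PurchaseOrder>
</PurchaseOrders></Batch>");
        foreach (var pair in ItemsByOrder(doc))
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
    }
}
```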

Search through XML and grab another Node

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Product>
<SKU>33333-01</SKU>
</Product>
</Message>
</Envelope>
I've tried googling, but I don't know whether I'm just not using the right search terms.
I want to be able to search the XML file based on the MessageID and then grab the SKU.
I then want to search another XML file based on the SKU and remove that message completely.
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Inventory>
<SKU>33333-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
Meaning the XML above becomes:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
To be clear, I cannot guarantee that the MessageID will be the same across different XML files.
Thanks in advance for any help.
My questions:
How do I search through XML files?
How do I then grab another node's details?
Can I remove a complete element from an XML file based on a search?
You can use XmlDocument to load your XML document. Then, you can use XPath for searching any nodes.
XmlDocument document = new XmlDocument();
document.Load(@"C:\fileOnTheDisk.xml"); // verbatim string, so the backslash is not treated as an escape
// or
document.LoadXml("<a>someXmlString</a>");
// Returns single element or null if not found
var singleNode = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
// Returns a NodeList
var nodesList = document.SelectNodes("Envelope/Message[MessageID = '1']");
Read more about XPath at w3schools.com.
Here is a good XPath Tester.
For example, you can use the following XPath to find nodes in your document by ID:
XmlDocument document = new XmlDocument();
document.Load(@"C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
var sku = node.SelectSingleNode("Inventory/SKU").InnerText;
Console.WriteLine("{0} node has SKU = {1}", 1, sku);
Or you can output all SKUs:
foreach (XmlNode node in document.SelectNodes("Envelope/Message"))
{
Console.WriteLine("{0} node has SKU = {1}",
node.SelectSingleNode("MessageID").InnerText,
node.SelectSingleNode("Inventory/SKU").InnerText);
}
It will produce:
1 node has SKU = 33333-01
2 node has SKU = 22222-01
Note that there are possible NullReferenceExceptions if nodes are not present.
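One way to guard against those exceptions is the null-conditional operator, which short-circuits to null when a node is absent. The following is only a sketch: the class and method names are made up for this example, and it assumes the messageId value contains no single quote, since it is pasted into the XPath literal.

```csharp
using System;
using System.Xml;

public static class SafeSkuLookup
{
    // Returns the SKU for the given MessageID, or null when the message or its SKU is absent.
    public static string FindSku(XmlDocument document, string messageId)
    {
        // messageId is inserted into the XPath literal, so it must not contain a single quote.
        XmlNode node = document.SelectSingleNode(
            "Envelope/Message[MessageID = '" + messageId + "']");
        // ?. stops at the first missing node instead of throwing.
        return node?.SelectSingleNode("Inventory/SKU")?.InnerText;
    }

    public static void Main()
    {
        var document = new XmlDocument();
        document.LoadXml(@"<Envelope>
  <Message><MessageID>1</MessageID><Inventory><SKU>33333-01</SKU></Inventory></Message>
  <Message><MessageID>2</MessageID></Message>
</Envelope>");
        Console.WriteLine(FindSku(document, "1") ?? "(not found)");
        Console.WriteLine(FindSku(document, "2") ?? "(not found)");
        Console.WriteLine(FindSku(document, "9") ?? "(not found)");
    }
}
```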
You can simply remove it using RemoveChild() method of its parent.
XmlDocument document = new XmlDocument();
document.Load(@"C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
node.ParentNode.RemoveChild(node);
document.Save(@"C:\docNew.xml"); // will be without Message 1
You can use Linq to XML to do this:
var doc= XDocument.Load("input.xml");//path of your xml file in which you want to search based on message id.
var searchNode= doc.Descendants("MessageID").FirstOrDefault(d => d.Value == "1");// It will search message node where its value is 1 and get first of it
if(searchNode!=null)
{
var SKU=searchNode.Parent.Descendants("SKU").FirstOrDefault();
if(SKU!=null)
{
var searchDoc=XDocument.Load("search.xml");//path of xml file where you want to search based on SKU value.
var nodes =searchDoc.Descendants("SKU").Where(d=>d.Value==SKU.Value).Select(d=>d.Parent.Parent).ToList();
nodes.ForEach(node=>node.Remove());
searchDoc.Save("output.xml");//path of output file
}
}
I'd recommend you did this using LINQ to XML - it's much nicer to work with than the old XmlDocument API.
For all the examples, you can parse your XML string xml to an XDocument like so:
var doc = XDocument.Parse(xml);
1. How do I search through XML files?
You can get the SKU for a specific message ID by querying your document:
var sku = (string)doc.Descendants("Message")
.Where(e => (int)e.Element("MessageID") == 1)
.SelectMany(e => e.Descendants("SKU"))
.Single();
2. How do I then grab another Nodes details?
You can get the Message element with a specified SKU using another query:
var message = doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Single();
3. Can I remove a complete element from an XML file based on a search?
Using your result from step 2, you can simply call Remove:
message.Remove();
Alternatively, you can combine the query from step 2 and simply execute a command to remove any messages that have a specific SKU:
doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Remove();
I tried to answer all your questions:
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;
XDocument xdoc1 = XDocument.Load("xml1.xml");
XDocument xdoc2 = XDocument.Load("xml2.xml");
string sku = String.Empty;
string searchedID = "2";
//1.searching through an xml file based on path
foreach (XElement message in xdoc1.XPathSelectElements("Envelope/Message"))
{
if (message.Element("MessageID").Value.Equals(searchedID))
{
//2.grabbing another node's details
sku = message.XPathSelectElement("Inventory/SKU").Value;
}
}
//ToList() takes a snapshot so that removing a node does not disturb the enumeration
foreach (XElement message in xdoc2.XPathSelectElements("Envelope/Message").ToList())
{
if (message.XPathSelectElement("Inventory/SKU") != null && message.XPathSelectElement("Inventory/SKU").Value.Equals(sku))
{
//3.removing a node
message.Remove();
}
}
xdoc2.Save("xml2_del.xml");
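A general caution when deleting nodes in LINQ to XML: modifying the tree while a lazy query over it is still being enumerated can end the loop early or throw. Materializing the matches first avoids that. Here is a minimal sketch of the pattern; the class name, method name, and sample data are assumptions for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RemoveBySku
{
    // Removes every <Message> containing a <SKU> with the given value; returns how many were removed.
    public static int RemoveMessages(XDocument doc, string sku)
    {
        // ToList() takes a snapshot so the tree is not modified while the query is still running.
        var matches = doc.Descendants("Message")
            .Where(m => m.Descendants("SKU").Any(s => (string)s == sku))
            .ToList();
        foreach (XElement message in matches)
            message.Remove();
        return matches.Count;
    }

    public static void Main()
    {
        var doc = XDocument.Parse(@"<Envelope>
  <Message><MessageID>1</MessageID><Inventory><SKU>33333-01</SKU></Inventory></Message>
  <Message><MessageID>2</MessageID><Inventory><SKU>22222-01</SKU></Inventory></Message>
</Envelope>");
        int removed = RemoveMessages(doc, "33333-01");
        Console.WriteLine($"Removed {removed} message(s)");
        Console.WriteLine(doc);
    }
}
```

The Remove() extension method on IEnumerable<XElement> used in the earlier answer takes the same kind of snapshot internally, so either form should be safe.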

Reading XML File - reading a child node which has any number of subnodes

So I'm currently trying to parse an XML file which looks like so:
<employees>
<employee>
<id>1</id>
<projects>
<projectID>7</projectID>
<projectID>3</projectID>
</projects>
</employee>
<employee>
<id>2</id>
<projects>
<projectID>4</projectID>
</projects>
</employee>
</employees>
I'm trying to read in each employee and any number of projects which appear. The Employee object is a string and list(int).
Currently I have:
XmlDocument doc = new XmlDocument();
doc.Load(path);
XmlNodeList xmlNodes = doc.DocumentElement.SelectNodes("/employees/employee");
foreach (XmlNode xmlNode in xmlNodes)
{
string id;
List<int> projects = new List<int>();
id = xmlNode.SelectSingleNode("id").InnerText;
//this is the bit. What I have works but it feels like it could
//be majorly refined. Is there a better way to construct the foreach below?
foreach (XmlNode node in xmlNode.ChildNodes.Item(1))
//index 1 is the projects node
{
projects.Add(int.Parse(node.InnerText));
}
//
Employee e = new Employee(id, projects);
e.Add(e);
}
If the XML file itself is an issue, it can be changed to accommodate the parsing.
Thank you.
It will be much easier with LINQ to XML:
var xDoc = XDocument.Load(path);
var employees = (from e in xDoc.Root.Elements("employee")
let projects = e.Element("projects")
.Elements("projectID")
.Select(p => (int)p)
.ToList()
let id = (string)e.Element("id")
select new Employee(id, projects)).ToList();
You need using System.Linq and using System.Xml.Linq to make it work.
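For completeness, here is a self-contained sketch of the same approach in method syntax. The Employee class is hypothetical (the question only says it holds a string and a list of ints), and using Elements("projects") instead of Element("projects") is a small variation so that an employee without a <projects> node yields an empty list rather than a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical Employee type; the question only says it holds a string and a list of ints.
public class Employee
{
    public string Id { get; }
    public List<int> Projects { get; }

    public Employee(string id, List<int> projects)
    {
        Id = id;
        Projects = projects;
    }
}

public static class EmployeeReader
{
    // Reads every <employee> under the root together with its project IDs.
    public static List<Employee> Read(XDocument xDoc)
    {
        return xDoc.Root.Elements("employee")
            .Select(e => new Employee(
                (string)e.Element("id"),
                // Elements("projects") tolerates a missing <projects> node (no IDs are produced).
                e.Elements("projects").Elements("projectID").Select(p => (int)p).ToList()))
            .ToList();
    }

    public static void Main()
    {
        var xDoc = XDocument.Parse(@"<employees>
  <employee><id>1</id><projects><projectID>7</projectID><projectID>3</projectID></projects></employee>
  <employee><id>2</id><projects><projectID>4</projectID></projects></employee>
</employees>");
        foreach (Employee e in Read(xDoc))
            Console.WriteLine($"{e.Id}: {string.Join(", ", e.Projects)}");
    }
}
```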

Get the Value of XElements that dont has Child

How can I get the value of a node in an XDocument when it has no child elements?
<Contacts>
<Company>
<Name>Testing</Name>
<ID>123</ID>
</Company>
</Contacts>
In this case, I want to get the values of the <Name> and <ID> elements, because they have no child elements.
I'm trying the following:
protected void LeXMLNode(HttpPostedFile file)
{
XmlReader rdr = XmlReader.Create(file.FileName);
XDocument doc2 = XDocument.Load(rdr);
foreach (var name in doc2.Root.DescendantNodes().OfType<XElement>().Select(x => x.Name).Distinct())
{
XElement Contact = (from xml2 in doc2.Descendants(name.ToString())
where xml2.Descendants(name.ToString()).Count() == 0
select xml2).FirstOrDefault();
string nome = name.ToString();
}
}
but without success, because my foreach visits all elements and I want only the values of elements that have no children.
document.Root.Elements("Company").Elements()
.Where(item => !item.HasElements).ToList();
See XElement.HasElements: http://msdn.microsoft.com/en-us/library/system.xml.linq.xelement.haselements.aspx
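If the leaf elements can sit at any depth rather than directly under <Company>, the same HasElements test can be applied to all descendants. A short sketch under that assumption follows; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LeafElements
{
    // Returns (name, value) pairs for every element that has no child elements.
    public static List<KeyValuePair<string, string>> Leaves(XDocument doc)
    {
        return doc.Descendants()
            .Where(e => !e.HasElements)
            .Select(e => new KeyValuePair<string, string>(e.Name.LocalName, e.Value))
            .ToList();
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<Contacts><Company><Name>Testing</Name><ID>123</ID></Company></Contacts>");
        foreach (var leaf in Leaves(doc))
            Console.WriteLine($"{leaf.Key} = {leaf.Value}");
    }
}
```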
