Parse XML issue - c#

I am parsing a huge XML file using for-loops, SelectNodes, Attributes.GetNamedItem, etc.
I ran into a problem with the repeated CustLoyalty nodes shown in the extract below. How do I get the values of these identical nodes, given that they are not grouped inside a parent node of their own?
<Customer>
<PersonName>
<NamePrefix>Ms</NamePrefix>
<GivenName>Fra</GivenName>
<Surname>etti</Surname>
</PersonName>
<Telephone FormattedInd="false" PhoneLocationType="6" PhoneNumber="10" PhoneTechType="1"/>
<Telephone FormattedInd="false" PhoneLocationType="6" PhoneNumber="49" PhoneTechType="3"/>
<Email DefaultInd="true" EmailType="1">z@z</Email>
<Address Type="1">
<AddressLine>alace</AddressLine>
<StateProv StateCode="NY"/>
<CountryName Code="GB"/>
</Address>
<CustLoyalty MembershipID="3" ProgramID="Guest"/>
<CustLoyalty MembershipID="6" ProgramID="Freq"/>
<CustLoyalty MembershipID="56" ProgramID="teID"/>
<CustLoyalty MembershipID="N6" ProgramID="ID"/>
</Customer>
My code looks something like this:
XmlNodeList CustomerList = ProfileList[v].SelectNodes("df:Customer", mgr);
for (int w = 0; w < CustomerList.Count; w++)
{
XmlNodeList PersonNameList = CustomerList[w].SelectNodes("df:PersonName", mgr);
for (int x = 0; x < PersonNameList.Count; x++)
{
XmlNode NamePrefixNode = PersonNameList[x].SelectSingleNode("df:NamePrefix", mgr);
string NamePrefix = NamePrefixNode.InnerText;
XmlNode GivenNameNode = PersonNameList[x].SelectSingleNode("df:GivenName", mgr);
string GivenName = GivenNameNode.InnerText;
XmlNode SurnameNode = PersonNameList[x].SelectSingleNode("df:Surname", mgr);
string Surname = SurnameNode.InnerText;
myProfiles.GivenName = GivenName;
myProfiles.Surname = Surname;
myProfiles.NamePrefix = NamePrefix;
}
XmlNode TelephoneNode = CustomerList[w].SelectSingleNode("df:Telephone", mgr);
if (TelephoneNode != null)
{
string PhoneNumber = TelephoneNode.Attributes.GetNamedItem("PhoneNumber").Value;
myProfiles.Telephone = PhoneNumber;
}..........

Let's say you are parsing it with an XDocument object. Beware that XDocument.Parse throws an exception if your input isn't well-formed XML, and that xCustomer will be null if "Customer" is not the top-level element of xDoc.
XDocument xDoc = XDocument.Parse(YourStringHoldingXmlContent);
XElement xCustomer = xDoc.Element("Customer");
foreach (XElement custLoyalty in xCustomer.Elements("CustLoyalty"))
{
// The CustLoyalty elements are empty, so read their attributes rather than Value.
Console.WriteLine("{0}: {1}", (string)custLoyalty.Attribute("ProgramID"), (string)custLoyalty.Attribute("MembershipID"));
}
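The same idea also works with the XmlDocument API the question already uses, because SelectNodes returns every matching child rather than only the first one. The following is a minimal sketch, not the asker's actual code: the class and method names (LoyaltyDemo, ReadLoyalties) are made up for illustration, and the sample XML is a cut-down, namespace-free version of the extract above. With the namespaced document from the question, the query would presumably be "df:CustLoyalty" with the namespace manager passed as the second argument.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class LoyaltyDemo
{
    // Hypothetical helper: returns (ProgramID, MembershipID) pairs for every
    // CustLoyalty child of the given Customer node.
    public static List<KeyValuePair<string, string>> ReadLoyalties(XmlNode customer)
    {
        var result = new List<KeyValuePair<string, string>>();

        // SelectNodes returns all matching children, so repeated siblings are all visited.
        foreach (XmlNode node in customer.SelectNodes("CustLoyalty"))
        {
            // The indexer returns null when the attribute is missing, hence the ?. operator.
            string program = node.Attributes["ProgramID"]?.Value;
            string membership = node.Attributes["MembershipID"]?.Value;
            result.Add(new KeyValuePair<string, string>(program, membership));
        }
        return result;
    }

    public static void Main()
    {
        // Assumption: a namespace-free sample; the real document uses a namespace.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<Customer>" +
            "<CustLoyalty MembershipID=\"3\" ProgramID=\"Guest\"/>" +
            "<CustLoyalty MembershipID=\"6\" ProgramID=\"Freq\"/>" +
            "</Customer>");

        foreach (var pair in ReadLoyalties(doc.DocumentElement))
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```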

You can do the following:
1- Define a class CustomLoyalty
public class CustomLoyalty
{
public string Membership{get;set;}
public string Program{get;set;}
}
2- Declare a list and call it uniqueCustomLoyalty
private List<CustomLoyalty> uniqueCustomLoyalty=new List<CustomLoyalty>();
3- While you are looping over the CustLoyalty nodes of each customer, do this
foreach(XmlNode data in customLoyaltiesList)
{
// customLoyaltiesList is the list of CustLoyalty nodes for the current customer
// detail holds the values of the current CustLoyalty node
CustomLoyalty detail=new CustomLoyalty(){
Membership=data.Attributes.GetNamedItem("MembershipID").Value, // adapt this to however you read the attribute value
Program=data.Attributes.GetNamedItem("ProgramID").Value,
};
// check if the list already contains the current entry (FirstOrDefault needs using System.Linq)
var exists=uniqueCustomLoyalty.FirstOrDefault(t=>t.Membership==detail.Membership && t.Program==detail.Program);
if(exists==null) // list doesn't contain this data
uniqueCustomLoyalty.Add(detail); // add the detail to the list to compare it with the rest of the parsed data
else{
// the data is not unique, you can do whatever you want
}
}
hope this will help you
regards

Related

How do I get the parent XML based on the value of a child node?

I have a requirement: I have retrieved the id and supplier from an XML file that contains more than 40 ids and suppliers. Now I need to get the parent node of a particular id and supplier and append it to another XML document.
I have managed to retrieve the id and supplier; now I want to get the whole parent XML in C#. Any help would be appreciated.
c#
var action = xmlAttributeCollection["id"];
xmlActions[i] = action.Value;
var fileName = xmlAttributeCollection["supplier"];
xmlFileNames[i] = fileName.Value;
This is the code I used to get the id and supplier.
You may want to be more specific about how you are traversing the XML tree, and give your variables types so we can understand the problem more clearly. That said, here is my answer:
Assuming items[i] is an XmlNode, and in this case we are working with the "hotelId" node, there is a property called XmlNode.ParentNode which returns the immediate ancestor of a node, or null if the node has no parent.
XmlNode currentNode = items[i] as XmlNode; //hotelId
XmlNode parentNode = currentNode.ParentNode; //hotelDetail
string outerXml = parentNode.OuterXml; //returns a string representation of the entire parent node
Full example:
XmlDocument doc = new XmlDocument();
doc.Load("doc.xml");
XmlNode hotelIdNode = doc.SelectSingleNode("hoteldetail//hotelId"); //Find a hotelId Node
XmlNode hotelDetailNode = hotelIdNode.ParentNode; //Get the parent node
string hotelDetailXml = hotelDetailNode.OuterXml; //Get the Xml as a string
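You can get the parent XML like this: XmlNode node = doc.SelectSingleNode("//hoteldetail"); string xml = node.InnerXml; (use OuterXml instead if you want the hoteldetail element itself included).
I think you'll be better off using LINQ to XML.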
var xDoc = XDocument.Parse(yourXmlString);
foreach(var xElement in xDoc.Descendants("hoteldetail"))
{
//this is your <hoteldetail>....</hoteldetail>
var hotelDetail = xElement;
var hotelId = hotelDetail.Element("hotelId");
//this is your id
var id = hotelId.Attribute("id").Value;
//this is your supplier
var supplier = hotelId.Attribute("supplier").Value;
if (id == someId && supplier == someSupplier)
return hotelDetail;
}

Get XmlNodeList if a particular element value or its attribute value is present in a given list of strings

I would like to get an XmlNodeList from a huge XML file.
Conditions:
I have a list of unique ID values, say IDList.
Case I: Collect all the nodes where the element called ID has a value from IDList.
Case II: Collect all the nodes where the idName attribute of the ID element has a value from IDList.
In short, extract only the nodes that match the values given in IDList.
I did this by loading the XML into an XmlDocument and looping over all nodes and ID values, but I am looking for a faster, more direct method.
Looping over every node isn't practical for a large XML file.
My try:
try
{
using (XmlReader reader = XmlReader.Create(URL))
{
XmlDocument doc = new XmlDocument();
doc.Load(reader);
XmlNodeList nodeList = doc.GetElementsByTagName("idgroup");
foreach (XmlNode xn in nodeList)
{
string id = xn.Attributes["id"].Value;
string value = string.Empty;
if (IDList.Contains(id))
{
value = xn.ChildNodes[1].ChildNodes[1].InnerText; // <value>
if (!string.IsNullOrEmpty(value))
{
listValueCollection.Add(value);
}
}
}
}
}
catch
{}
XML (XLIFF) structure:
<XLIFF>
<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.2">
<file date="2013-07-17">
<body>
<id idName="test_001" >
<desc-group name="test_001">
<desc type="text"/>
</desc-group>
<result-unit idName="test_001_text">
<source>abcd</source>
<result>xyz</result>
</result-unit>
</id>
</body>
</file>
</xliff>
Collect all the nodes like above where idName matches.
EDIT
This is a test that can parse the example you are giving. It attempts to reach the result node directly, so that it stays as efficient as possible.
[Test]
public void TestXPathExpression()
{
var idList = new List<string> { "test_001" };
var resultsList = new List<string>();
// Replace with appropriate method to open your URL.
using (var reader = new XmlTextReader(File.OpenRead("fixtures\\XLIFF_sample_01.xlf")))
{
var doc = new XmlDocument();
doc.Load(reader);
var root = doc.DocumentElement;
// This is necessary, since your example is namespaced.
var nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", "urn:oasis:names:tc:xliff:document:1.2");
// Go directly to the node from which you want the result to come from.
foreach (var nodes in idList
.Select(id => root.SelectNodes("//x:file/x:body/x:id[@idName='" + id + "']/x:result-unit/x:result", nsmgr))
.Where(nodes => nodes != null && nodes.Count > 0))
resultsList.AddRange(nodes.Cast<XmlNode>().Select(node => node.InnerText));
}
// Print the resulting list.
resultsList.ForEach(Console.WriteLine);
}
You can extract only the nodes you need by using an XPath query. A brief example of how you'd go about it:
using (XmlReader reader = XmlReader.Create(URL))
{
XmlDocument doc = new XmlDocument();
doc.Load(reader);
foreach(var id in IDList) {
var nodes = doc.SelectNodes("//xliff/file/body/id[@idName='" + id + "']");
// XmlNodeList is not generic, so Cast<XmlNode>() is needed before LINQ operators (using System.Linq)
foreach(var node in nodes.Cast<XmlNode>().Where(x => !string.IsNullOrEmpty(x.ChildNodes[1].ChildNodes[1].InnerText)))
listValueCollection.Add(node.ChildNodes[1].ChildNodes[1].InnerText);
}
}
The xpath expression is of course an example. If you want, you can post an example of your XML so I can give you something more accurate.
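Both answers above run one XPath query per ID. A different technique, swapped in here purely as an illustration, is to walk the document once with LINQ to XML and test each idName against a HashSet. This is only a sketch under assumptions: the class and method names (XliffFilterDemo, ResultsFor) are invented, the sample document is a trimmed copy of the one in the question, and whether it is actually faster on a given file would need measuring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XliffFilterDemo
{
    // The sample document is namespaced, so every element name needs this prefix.
    static readonly XNamespace Ns = "urn:oasis:names:tc:xliff:document:1.2";

    // Hypothetical helper: returns the <result> text of every <id> element
    // whose idName attribute appears in ids.
    public static List<string> ResultsFor(XDocument doc, IEnumerable<string> ids)
    {
        // A HashSet makes each membership test cheap, so the document is walked only once.
        var wanted = new HashSet<string>(ids);

        return doc.Descendants(Ns + "id")
            .Where(e => wanted.Contains((string)e.Attribute("idName")))
            .SelectMany(e => e.Elements(Ns + "result-unit"))
            .Select(unit => (string)unit.Element(Ns + "result"))
            .Where(text => !string.IsNullOrEmpty(text))
            .ToList();
    }

    public static void Main()
    {
        // Assumption: a trimmed version of the XLIFF sample from the question.
        var doc = XDocument.Parse(
            "<xliff xmlns=\"urn:oasis:names:tc:xliff:document:1.2\" version=\"1.2\"><file><body>" +
            "<id idName=\"test_001\"><result-unit idName=\"test_001_text\">" +
            "<source>abcd</source><result>xyz</result></result-unit></id>" +
            "<id idName=\"test_002\"><result-unit idName=\"test_002_text\">" +
            "<source>efgh</source><result>uvw</result></result-unit></id>" +
            "</body></file></xliff>");

        foreach (string text in ResultsFor(doc, new[] { "test_001" }))
            Console.WriteLine(text);
    }
}
```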

How can I know the index of a XML Tag

How can I get the index of my current XML tag ?
Example:
<User>
<Contact>
<Name>Lucas</Name>
</Contact>
<Contact>
<Name>Andre</Name>
</Contact>
...
</User>
I'm trying the code below
foreach (var element in doc2.Root.Descendants())
{
String name = element.Name.LocalName;
String value = element.Value;
}
I want to know if I'm reading the first <Contact> tag, or the second, or the third...
Using the appropriate overload of Select will yield the index as you enumerate the collection.
var userContacts = doc2.Root
.Descendants()
.Where(element => element.Name == "Contact")
.Select((c, i) => new {Contact = c, Index = i});
foreach(var indexedContact in userContacts)
{
// indexedContact.Contact
// indexedContact.Index
}
Note: I added the .Where because .Descendants will recurse.
You can use a for statement, then you'll always know the index. I am making an assumption that Descendants() can be used in a for statement.
The other possibility is to create a count variable outside the foreach.
int count = 0;
foreach (var element in doc2.Root.Descendants())
{
String name = element.Name.LocalName;
String value = element.Value;
count++;
}
Replace your foreach loop with a normal for loop:
for (int i = 0; i < doc2.Root.Descendants().Count(); i++)
{
// Descendants() returns an IEnumerable<XElement>, which has no indexer, so use ElementAt
String name = doc2.Root.Descendants().ElementAt(i).Name.LocalName;
String value = doc2.Root.Descendants().ElementAt(i).Value;
}
Then use i to see if you're reading the first, second, third, etc. tag.
As far as I know, there is no way to get the index inside a foreach without using an external counter.
This also presents an efficiency problem, as the Descendants sequence is re-enumerated several times on every loop iteration, so I recommend keeping a List of the descendants outside of the for loop and using it like this:
var descendants = doc2.Root.Descendants().ToList();
for (int i = 0; i < descendants.Count; i++)
{
String name = descendants[i].Name.LocalName;
String value = descendants[i].Value;
}
Use a variable as counter and put the result into an array. The problem here is, that you need to know the size of the array in advance.
int i = 0;
foreach (var element in doc2.Root.Descendants()) {
name[i] = element.Name.LocalName;
value[i] = element.Value;
i++;
}
With a List<T> you don't have this problem:
var list = new List<KeyValuePair<string,string>>();
foreach (var element in doc2.Root.Descendants()) {
list.Add(new KeyValuePair<string,string>(element.Name.LocalName, element.Value));
}
I don't think you can with foreach, try using a normal for loop instead.
To get the position of your current node without counters (as the previous solutions used), you'll need to write a function that builds up the XPath of your current XmlElement. The way to do it is to traverse the document from your node using the parent node and previous siblings. That way you can build the exact XPath to access your node from the document. Here's a sample (taken from another source):
public static string GetXPath_UsingPreviousSiblings(this XmlElement element)
{
string path = "/" + element.Name;
XmlElement parentElement = element.ParentNode as XmlElement;
if (parentElement != null)
{
// Gets the position within the parent element, based on previous siblings of the same name.
// However, this position is irrelevant if the element is unique under its parent:
XPathNavigator navigator = parentElement.CreateNavigator();
int count = Convert.ToInt32(navigator.Evaluate("count(" + element.Name + ")"));
if (count > 1) // There's more than 1 element with the same name
{
int position = 1;
XmlElement previousSibling = element.PreviousSibling as XmlElement;
while (previousSibling != null)
{
if (previousSibling.Name == element.Name)
position++;
previousSibling = previousSibling.PreviousSibling as XmlElement;
}
path = path + "[" + position + "]";
}
// Climbing up to the parent elements:
path = parentElement.GetXPath_UsingPreviousSiblings() + path;
}
return path;
}
Assuming that's really what you need: depending on the document size, it could be resource intensive. If you only require the index, I'd recommend using one of the other methods.
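Since the question uses LINQ to XML (doc2.Root.Descendants()), one more option worth sketching is XElement.ElementsBeforeSelf, which lets an element report its own position among same-named siblings without an external counter. The helper name IndexAmongSiblings and the class name are made up for this illustration. Each call walks the preceding siblings, so for long sibling lists the Select((c, i) => ...) overload shown earlier is likely cheaper.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ContactIndexDemo
{
    // Hypothetical helper: zero-based position of an element among the
    // siblings that share its name (0 for the first <Contact>, 1 for the second...).
    public static int IndexAmongSiblings(XElement element)
    {
        // ElementsBeforeSelf(name) yields only the preceding siblings with that name.
        return element.ElementsBeforeSelf(element.Name).Count();
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<User>" +
            "<Contact><Name>Lucas</Name></Contact>" +
            "<Contact><Name>Andre</Name></Contact>" +
            "</User>");

        foreach (XElement contact in doc.Root.Elements("Contact"))
            Console.WriteLine("{0}: {1}", IndexAmongSiblings(contact), (string)contact.Element("Name"));
    }
}
```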

XMLReader reading XML file based on attribute value

I am trying to read the following file. I can read the attributes, but I can't get into a specific element (Address in this case) and read its child elements based on the attribute of that Address element. In short, I need to distinguish between work and home addresses, and I need to do this with the XmlReader class. Can you help?
<Address Label="Work">
<Name>Name1</Name>
<Street>PO 1</Street>
<City>City1</City>
<State>State 1</State>
</Address>
<Address Label="Home">
<Name>Name2</Name>
<Street>PO 2</Street>
<City>City2</City>
<State>State 2</State>
</Address>
Okay, here are some notes to think about. As I understand your use of XmlReader (there is no code example), you iterate over the document, since XmlReader is forward-only and read-only.
Because of this you need to iterate until you find the node you need. In the example below I find the Address element labeled "Work" and load that entire node. You can then query this node as you want.
using (var inFile = new FileStream(path, FileMode.Open))
{
using (var reader = new XmlTextReader(inFile))
{
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
if (reader.Name == "Address" && reader.GetAttribute(0) == "Work")
{
// Create a document, which will contain the address element as the root
var doc = new XmlDocument();
// Load from a reader that only reads the subtree <Address> ... </Address>
doc.Load(reader.ReadSubtree());
// Use XPath to query the nodes, here the "Name" node
var name = doc.SelectSingleNode("//Address/Name");
// Print node name and the inner text of the node
Console.WriteLine("Node: {0}, Inner text: {1}", name.Name, name.InnerText);
}
break;
}
}
}
}
Edit
Here is an example that does not use LINQ.
XML:
<Countries>
<Country name ="ANDORRA">
<state>Andorra (general)</state>
<state>Andorra</state>
</Country>
<Country name ="United Arab Emirates">
<state>Abu Z¸aby</state>
<state>Umm al Qaywayn</state>
</Country>
C#:
public void datass()
{
string file = HttpContext.Current.Server.MapPath("~/App_Data/CS.xml");
XmlDocument doc = new XmlDocument();
if (System.IO.File.Exists(file))
{
//Load the XML File
doc.Load(file);
}
//Get the root element
XmlElement root = doc.DocumentElement;
XmlNodeList subroot = root.SelectNodes("Country");
for (int i = 0; i < subroot.Count; i++)
{
XmlNode elem = subroot.Item(i);
string attrVal = elem.Attributes["name"].Value;
Response.Write(attrVal);
XmlNodeList sub = elem.SelectNodes("state");
for (int j = 0; j < sub.Count; j++)
{
XmlNode elem1 = sub.Item(j);
Response.Write(elem1.InnerText);
}
}
}
Using XPath you can easily write concise expressions to navigate an XML document.
You would do something like
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(myXMLString);
XmlNode workAddress = xDoc.SelectSingleNode("//Address[@Label='Work']");
Then do whatever you want with workAddress.
Read more about XPath on w3schools.
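Because the question asks specifically for XmlReader, here is a hedged sketch that stays with the reader until the wanted Address is found and only then materialises that one element. The names AddressReaderDemo and FindAddress are invented for illustration, and the <Contacts> wrapper element is an assumption, since the posted fragment has no root element. ReadSubtree is used so the main reader's position stays predictable if the loop is extended to collect several addresses.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class AddressReaderDemo
{
    // Hypothetical helper: returns the first <Address> whose Label attribute
    // equals label, or null when there is none.
    public static XElement FindAddress(TextReader input, string label)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            // ReadToFollowing moves forward to the next <Address> start tag.
            while (reader.ReadToFollowing("Address"))
            {
                if (reader.GetAttribute("Label") != label)
                    continue;

                // ReadSubtree exposes just this element and its children.
                using (XmlReader subtree = reader.ReadSubtree())
                    return XElement.Load(subtree);
            }
        }
        return null;
    }

    public static void Main()
    {
        // Assumption: the two addresses from the question wrapped in a made-up root.
        string xml =
            "<Contacts>" +
            "<Address Label=\"Work\"><Name>Name1</Name><Street>PO 1</Street></Address>" +
            "<Address Label=\"Home\"><Name>Name2</Name><Street>PO 2</Street></Address>" +
            "</Contacts>";

        XElement home = FindAddress(new StringReader(xml), "Home");
        if (home != null)
            Console.WriteLine("{0}, {1}", (string)home.Element("Name"), (string)home.Element("Street"));
    }
}
```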

How to select an XML node by attribute and use its child nodes' data?

Here is my XML file
<?xml version="1.0" encoding="utf-8" ?>
<storage>
<Save Name ="Lifeline">
<Seconds>12</Seconds>
<Minutes>24</Minutes>
<Hours>9</Hours>
<Days>25</Days>
<Months>8</Months>
<Years>2010</Years>
<Health>90</Health>
<Mood>100</Mood>
</Save>
<Save Name ="Hellcode">
<Seconds>24</Seconds>
<Minutes>48</Minutes>
<Hours>18</Hours>
<Days>15</Days>
<Months>4</Months>
<Years>1995</Years>
<Health>50</Health>
<Mood>50</Mood>
</Save>
Here is the code which gets data from the XML and loads it into the application.
System.IO.StreamReader sr = new System.IO.StreamReader(@"Saves.xml");
System.Xml.XmlTextReader xr = new System.Xml.XmlTextReader(sr);
System.Xml.XmlDocument save = new System.Xml.XmlDocument();
save.Load(xr);
XmlNodeList saveItems = save.SelectNodes("Storage/Save");
XmlNode seconds = saveItems.Item(0).SelectSingleNode("Seconds");
sec = Int32.Parse(seconds.InnerText);
XmlNode minutes = saveItems.Item(0).SelectSingleNode("Minutes");
min = Int32.Parse(minutes.InnerText);
XmlNode hours = saveItems.Item(0).SelectSingleNode("Hours");
hour = Int32.Parse(hours.InnerText);
XmlNode days = saveItems.Item(0).SelectSingleNode("Days");
day = Int32.Parse(days.InnerText);
XmlNode months = saveItems.Item(0).SelectSingleNode("Months");
month = Int32.Parse(months.InnerText);
XmlNode years = saveItems.Item(0).SelectSingleNode("Years");
year = Int32.Parse(years.InnerText);
XmlNode health_ = saveItems.Item(0).SelectSingleNode("Health");
health = Int32.Parse(health_.InnerText);
XmlNode mood_ = saveItems.Item(0).SelectSingleNode("Mood");
mood = Int32.Parse(mood_.InnerText);
The problem is that this code loads data only from the "Lifeline" node. I would like to use a listbox and be able to choose which node to load data from.
I've tried taking the string from the listbox item's content and then using a line like this:
XmlNodeList saveItems = save.SelectNodes(string.Format("storage/Save[@Name = '{0}']", name));
The variable "name" is the string from the listbox item. When compiled and run, this code throws an exception.
Does anybody know a way to select by attribute and load the needed data from that XML?
If you can use XElement:
// XElement.Load returns the root element, which here is <storage> itself
XElement storage = XElement.Load(file);
XElement save = storage.Elements("Save").FirstOrDefault(e => ((string)e.Attribute("Name")) == nameWeWant);
if(null != save)
{
// do something with it
}
Personally I like classes with properties that convert to and from the XElement, to hide that detail from the main program. For example, say the Save class takes an XElement node in its constructor, stores it in a field, and the properties read from and write to it.
Example class:
public class MyClass
{
XElement self;
public MyClass(XElement self)
{
this.self = self;
}
public string Name
{
get { return (string)self.Attribute("Name") ?? "some default value"; }
set
{
XAttribute x = self.Attribute("Name");
if(null == x)
self.Add(new XAttribute("Name", value));
else
x.Value = value;
}
}
}
Then you can change the search to something like:
XElement save = storage.Elements("Save")
.FirstOrDefault(e => new MyClass(e).Name == nameWeWant);
Since it is not that much data, I'd suggest loading all the information into a list of saves (via a constructor) and then picking from there whichever one the user would like to use.
As for things not working, I personally use a lower-level approach to get my data, which I find less error prone. Adapting it to fit your problem a bit:
int saves = 0;
List<Saves> saveGames = new List<Saves>();
saveGames.Add(new Saves());
while (textReader.Read())
{
if (textReader.NodeType == XmlNodeType.Element)
whatsNext = textReader.Name;
else if (textReader.NodeType == XmlNodeType.Text)
{
if (whatsNext == "name")
saveGames[saves].name = Convert.ToString(textReader.Value);
//else if statements for the rest of your attributes
else if (whatsNext == "Save")
{
saveGames.Add(new Saves());
saves++;
}
}
else if (textReader.NodeType == XmlNodeType.EndElement)
whatsNext = "";
}
Basically, put everything in the XML file into a list of objects and use that list to fill the listbox. Instead of having Save Name="...", have the name as the first item inside each Save, since the reader code above picks it up from element text.
SelectNodes returns two XmlNode objects.
XmlNodeList saveItems = save.SelectNodes("Storage/Save");
Later in your code you select the first one with saveItems.Item(0) and read values from it, which in this case is the Save node with Name="Lifeline". If you used saveItems.Item(1) instead, you would get the values from the other Save node.
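To pick a Save by the name chosen in the listbox rather than by position, an attribute predicate in the XPath is enough. The sketch below is illustrative only: SaveLookupDemo, FindSave and ReadInt are made-up names, the XML is a shortened copy of the file in the question with a closing tag added, and the name is assumed not to contain a single quote. XPath is case-sensitive, so the root step is written as "storage" to match the posted file.

```csharp
using System;
using System.Xml;

public static class SaveLookupDemo
{
    // Hypothetical helper: returns the <Save> whose Name attribute equals name, or null.
    // Assumption: name contains no single quote, which would break the XPath string.
    public static XmlNode FindSave(XmlDocument doc, string name)
    {
        return doc.SelectSingleNode("/storage/Save[@Name='" + name + "']");
    }

    // Hypothetical helper: reads one numeric child element of a save node.
    public static int ReadInt(XmlNode save, string childName)
    {
        return int.Parse(save.SelectSingleNode(childName).InnerText);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<storage>" +
            "<Save Name=\"Lifeline\"><Hours>9</Hours><Health>90</Health></Save>" +
            "<Save Name=\"Hellcode\"><Hours>18</Hours><Health>50</Health></Save>" +
            "</storage>");

        // In the real application this string would come from the selected listbox item.
        XmlNode save = FindSave(doc, "Hellcode");
        if (save != null)
            Console.WriteLine("Hours={0}, Health={1}", ReadInt(save, "Hours"), ReadInt(save, "Health"));
    }
}
```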
