Get partial data of XML file - c#

I have this xml file which contains information from tagchimp, but the file contains way to much information. How do i only load the information i need. I have found some code:
XmlDocument doc = new XmlDocument();
doc.Load(path);
XmlElement root = doc.DocumentElement;
XmlNodeList nodes = root.SelectNodes("movie");
foreach (XmlNode node in nodes)
{
string date = node["tagChimpID"].InnerText;
string name = node["locked"].InnerText;
Console.WriteLine("Id:" + date + " Locked:" + name);
}
but it only loads the elements not the child elements, some for example movieTitle or shortDescription
I found a way:
public static string Test(string path, XmlElement nodes)
{
string Name = "";
string releaseDate = "";
XmlNodeList xnList = nodes.SelectNodes("/items/movie");
XmlNode eNode;
foreach (XmlNode xn in xnList)
{
eNode = xn.SelectSingleNode("movieTags/info/movieTitle");
if (eNode != null)
{
Name = eNode.InnerText;
}
eNode= xn.SelectSingleNode("movieTags/info/releaseDate");
releaseDate = eNode.InnerText;
}
but it's not the most practical way to come by it.

Since you are using .NET, you might spend some time learning LINQ to XML, since it sounds like it would do what you want.
There is a lot of online documentation, in particular how to do various basic queries to get information from specific nodes: http://msdn.microsoft.com/en-us/library/bb943906.aspx
You could also get the same results with XPath.
Or you could manually traverse the nodes, although you would have to handle the child nodes properly (as you already discovered).

Related

How Can I Parse an XML file using XmlNodeList

I have been tasked with taking one XML file and converting it to a new XML I have no experience working with XML documents but I have been able to get some of the data from the first XML document using the code shown below. Note not all code is being shown just a small example.
XmlDocument rssXmlDoc = new XmlDocument();
// Load the RSS file from the RSS URL
rssXmlDoc.Load("https://agency.governmentjobs.com/jobfeed.cfm?agency=ocso");
// Setup name space
XmlNamespaceManager nsmgr = new XmlNamespaceManager(rssXmlDoc.NameTable);
nsmgr.AddNamespace("joblisting", "http://www.neogov.com/namespaces/JobListing");
// Parse the Items in the RSS file
XmlNodeList rssNodes = rssXmlDoc.SelectNodes("rss/channel/item/");
// Iterate through the items in the RSS file
foreach (XmlNode rssNode in rssNodes)
{
XmlNode rssSubNode = rssNode.SelectSingleNode("title");
string title = rssSubNode != null ? rssSubNode.InnerText : "";
using this code I am able to get most of the elements. I have run into a wall when trying to get data from a child element. The portion of the XML I cannot get is shown below.
<joblisting:department>Supply</joblisting:department>
<guid isPermaLink="true">https://www.governmentjobs.com/careers/ocso/Jobs/2594527</guid>
<joblisting:categories>
<joblisting:category xmlns:joblisting="http://www.neogov.com/namespaces/JobListing" xmlns:atom="http://www.w3.org/2005/Atom">
<CategoryCode>ClericalDataEntry</CategoryCode>
<Category>Clerical & Data Entry</Category>
</joblisting:category>
<joblisting:category
</joblisting:categories>
But I cannot get all of the data. How can I get the value for the element that starts with guid isPermaLink="true"
For the joblisting:categories I have used a foreach loop to read those values
foreach (var item in rssSubNode.SelectNodes("joblisting:categories", nsmgr))
{
rssSubNode = rssSubNode = rssNode.SelectSingleNode("joblisting:category", nsmgr);
string category = rssSubNode != null ? rssSubNode.InnerText : "";
}
How can read the values of those child elements?
To read guid node you can use the follow code. note that use selectSingleNode in node contains the "item" node.
public static void test() {
XmlDocument rssXmlDoc = new XmlDocument();
// Load the RSS file from the RSS URL
rssXmlDoc.Load("https://agency.governmentjobs.com/jobfeed.cfm?agency=ocso");
// Setup name space
XmlNamespaceManager nsmgr = new XmlNamespaceManager(rssXmlDoc.NameTable);
nsmgr.AddNamespace("joblisting", "http://www.neogov.com/namespaces/JobListing");
// Parse the Items in the RSS file
XmlNodeList rssNodes = rssXmlDoc.SelectNodes("rss/channel/item");
// Iterate through the items in the RSS file
foreach (XmlNode rssNode in rssNodes) {
var xmlnode = rssNode.SelectSingleNode("guid ");
System.Console.WriteLine("the value of guid is =>" + xmlnode.InnerText);
XmlNode rssSubNode = rssNode.SelectSingleNode("title");
string title = rssSubNode != null ? rssSubNode.InnerText : "";
}
}

How to get the sibling of an xml node

I have an xml file as below
<Games>
<Game>
<name>Tzoker</name>
<file>tzoker1</file>
</Game>
<Game>
<file>lotto770</file>
</Game>
<Game>
<name>Proto</name>
<file>proto220</file>
</Game>
</Games>
I want to get the values of name and file items for every Game node.
It is easy by using this query.
string query = String.Format("//Games/Game");
XmlNodeList elements1 = xml.SelectNodes(query);
foreach (XmlNode xn in elements1)
{
s1 = xn["name"].InnerText;
s2 = xn["file"].InnerText;
}
The problem is that there are some nodes that they don't have the name item. So the code above doesn't work.
I have solved the problem by using the following code
string query = String.Format("//Games/Game/name");
XmlNodeList elements1 = xml.SelectNodes(query);
foreach (XmlNode xn in elements1)
{
s1 = xn.InnerText;
string query1 = String.Format("//Games/Game[name='{0}']/file", s1);
XmlNodeList elements2 = xml.SelectNodes(query1);
foreach (XmlNode xn2 in elements2)
{
s2 = xn2.InnerText;
}
}
The problem is that there is a case that two or more nodes have the same name value. So, the s2 variable will get the file value of the last node that the loop finds. So, I would like to find a way to get the sibling file value of the current name item. How could I do it? I try do move to the parent node of the current node and then to move to the file item but without success by using the following code.
string query = String.Format("//Games/Game/name");
XmlNodeList elements1 = xml.SelectNodes(query);
foreach (XmlNode xn in elements1)
{
s1 = xn.InnerText;
string query1 = String.Format("../file");
XmlNodeList elements2 = xml.SelectNodes(query1);
foreach (XmlNode xn2 in elements2)
{
s2 = xn2.InnerText;
}
}
I hope there is a solution.
You can use Game[name] to filter Game elements to those with child element name. This is possible because child:: is the default axes which will be implied when no explicit axes mentioned. Extending this further to check for child element file as well, would be as simple as Game[name and file] :
string query = String.Format("//Games/Game[name]");
XmlNodeList elements1 = xml.SelectNodes(query);
foreach (XmlNode xn in elements1)
{
s1 = xn["name"].InnerText;
s2 = xn["file"].InnerText;
}
Now to answer your question literally, you can use following-sibling:: axes to get sibling element that follows current context element. So, given the context element is name, you can do following-sibling::file to return the sibling file element.
Your attempt which uses ../file should also work. The only problem was, that your code executes that XPath on xml, the XmlDocument, instead of executing it on current name element :
XmlNodeList elements2 = xn.SelectNodes("../file");
If I understand you correctly you want to find all games that have a name. You can do that using XPath. Here is a solution that uses LINQ to XML. I find that easier to work with than XmlDocument:
var xDocument = XDocument.Parse(xml);
var games = xDocument
.Root
.XPathSelectElements("Game[child::name]")
.Select(
gameElement => new {
Name = gameElement.Element("name").Value,
File = gameElement.Element("file").Value
}
);
The XPath to select all <Game> elements that have a <name> child element is Game[child::name].

Reading XML nodes causing issues

I have one XML string, I am trying to read that using C#, but I am not getting child nodes. I am getting entire XML as inner XML string. I am not able to read the nodes. Here is my XML string and my code.
<Filters FilterName="706337_test">
<MemberName>Dorvil</MemberName>
<MemberId />
<ProviderName />
<ProviderId>706337</ProviderId>
<SelectedProjects>5030003</SelectedProjects>
<CNAChartSelected>false</CNAChartSelected>
<OldProject>false</OldProject>
</Filters>
C# code trying to read the XML nodes
XmlDocument xml = new XmlDocument();
xml.LoadXml(xmlstring);
XmlNodeList xnList = xml.SelectNodes("/Filters");
I can see only one child node that too filters, I need to read MemberId, MemberName etc., how to read them?
This is because your string in SelectNodes is wrong:
var xml = new XmlDocument();
xml.LoadXml(xmlstring);
var xnList1 = xml.SelectNodes("/Filters"); //list of 1 element
var xnList2 = xml.SelectNodes("/Filters/*"); //list of 7 elements
foreach (XmlNode node in xnList2)
{
Console.WriteLine(node.OuterXml);
}
Also you can use this:
var xElements = XElement.Parse(xmlstring).Elements();
foreach (var element in xElements)
{
Console.WriteLine(element);
}
You need to tell the app which nodes to read..
XmlDocument xml = new XmlDocument();
xml.LoadXml(xmlstring);
XmlNodeList xnList = xml.SelectNodes("/Filters");
foreach (XmlNode node in xnList)
{
string memberName = node["MemberName"].InnerText;
}
This lets the app know to read what is inside the MemberName node.. Do the same for the other nodes and post back your results. Debug as you go to see what you are getting out of each node.

Can't parse XML data with C#

I have been trying to extract one value from a XML file, and then store it on the same file but in another node, I tried all the examples i've found on the net, read XPath Syntax documentation like hell and still can't get it to work.
I must take the <Documento ID="F60T33"> ID Value (which may vary) and copy it into <Reference URI="#F60T33">.
But I can't even do that if I can't manage to parse the lines, most of times node/variables/"", or I get an "object reference not established as object instance error".
Here's the code:
// Create a new XML document.
XmlDocument xmlDoc = new XmlDocument();
// Load an XML file into the XmlDocument object.
xmlDoc.PreserveWhitespace = true;
xmlDoc.Load(pfile);
//TEST !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! PROBLEMA
XmlNodeList Documentos = xmlDoc.GetElementsByTagName("//Documento");
XmlNodeList DatosDocumento = ((XmlElement)Documentos[0]).GetElementsByTagName("ID");
foreach (XmlElement nodo in DatosDocumento)
{
int I = 0;
XmlNodeList ID = nodo.GetElementsByTagName("ID");
Console.WriteLine("Elemento nombre ... {0}}", ID[i].InnerText);
}
//
XmlNodeList nodes = xmlDoc.SelectNodes("EnvioDTE");
XmlNode nodesimple = xmlDoc.SelectSingleNode("EnvioDTE");
Console.WriteLine("Lista Nodos: " + nodes.Count);
Console.WriteLine("Nodo Simple: " + nodesimple.InnerText);
foreach (XmlNode node in nodes)
{
string id = node.Attributes["ID"].InnerText;
Console.WriteLine(id);
}
I am almost certain the problem is on the XPath Syntax, but I can't get it to work.
Sadly I can't use XDocument as im using .NET 3.5 for this task, I would really appreciate some help on this, by behand apologize my bad english
As the XML file is too big, I'll put it here on this URL
http://puu.sh/bVNDc/31e4da5a26.xml
If you can get your references right for using System.Linq.XDocument you can do:
string idValue = xDocument.XPathSelectElement("//EnvioDTE/SetDTE")
.Attributes().SingleOrDefault(a => a.Name == "ID").Value;

Confused in getting nested elements and values via XmlDocument

I am trying to access quite a deep XML file, yet I get confused and just brute forced access to an element to get the elements within it, My problem is, it does not work (Object reference not set to an instance of an object) and my method of getting the "dictory" of the XML looks very inefficient,
My goal is to grab all the <ramStick> via a foreach loop since it can have 1-* ramSticks
Here is part of my C# code that I am having an issue with:
XmlDocument doc = new XmlDocument();
string xmlFilePath = #"C:\xampp\htdocs\userInfo.xml";
doc.Load(xmlFilePath);
XmlNodeList accountList = doc.GetElementsByTagName("account");
foreach (XmlNode node in accountList)
{
XmlElement accountElement = (XmlElement)node;
// I got inside this loop at this point and can get an accountElement with values
String hostname = accountElement.GetElementsByTagName("user")[0].InnerText;
// This is where I get confused and don't know what I am really doing and I just experiment
XmlNode accountRoot = accountElement.GetElementsByTagName("systemInfo")[0];
XmlElement ramNode = (XmlElement)accountRoot;
XmlNode ramInfo = ramNode.GetElementsByTagName("ramInfo")[0];
XmlElement ramList = (XmlElement)ramInfo;
XmlNodeList ramStick = ramList.GetElementsByTagName("ramStick");
// I want to run a foreach loop on each ramStick to get the values
foreach (XmlNode ramNodeForLoop in ramStick)
{
XmlElement ramData = (XmlElement)ramNodeForLoop;
String partNumber = ramData.GetElementsByTagName("partNumber")[0].InnerText;
}
}`
I have quite a big XML file which contains multiple <account> here is a sample of it:
<account>
<user>OYSTER-PC</user>
<systemInfo>
<dskInfo>
<dskInterface>
<deviceID>C:</deviceID><description>Local Fixed Disk</description><size>500000878592</size><freeSpace>377396776960</freeSpace><fileSystem>NTFS</fileSystem><volumeSerialNumber>4C922158</volumeSerialNumber>
</dskInterface>
<dskInterface>
<deviceID>D:</deviceID><description>CD-ROM Disc</description><size/><freeSpace/><fileSystem/><volumeSerialNumber/>
</dskInterface>
<dskInterface>
<deviceID>E:</deviceID><description>CD-ROM Disc</description><size/><freeSpace/><fileSystem/><volumeSerialNumber/>
</dskInterface>
</dskInfo>
<hddInfo>
<hddInterface>
<model>ST9500325AS ATA Device</model><interfaceType>IDE</interfaceType><name>\\.\PHYSICALDRIVE0</name><partitions>1</partitions><serialNumber>2020202020202020202020205636384547535146</serialNumber><status>OK</status>
</hddInterface>
</hddInfo>
<nicInfo>
<nicInterface>
<macAddress>48:5D:60:03:88:04</macAddress><description>Atheros AR9285 Wireless Network Adapter</description><ipAddress>192.168.1.10</ipAddress><ipSubnet>255.255.255.0</ipSubnet><defaultIpGateway>192.168.1.1</defaultIpGateway><dhcpServer>192.168.1.1</dhcpServer>
</nicInterface>
<nicInterface>
<macAddress>74:F0:6D:A8:4E:32</macAddress><description>Bluetooth Device (Personal Area Network)</description><ipAddress/><ipSubnet/><defaultIpGateway/><dhcpServer/></nicInterface>
<nicInterface>
<macAddress>20:41:53:59:4E:FF</macAddress><description>RAS Async Adapter</description><ipAddress/><ipSubnet/><defaultIpGateway/><dhcpServer/>
</nicInterface>
<nicInterface>
<macAddress>20:CF:30:55:0C:EF</macAddress><description>JMicron PCI Express Gigabit Ethernet Adapter</description><ipAddress/><ipSubnet/><defaultIpGateway/><dhcpServer/>
</nicInterface>
</nicInfo>
<ramInfo>
<ramStick>
<partNumber>M471B5273DH0-CK0 </partNumber><serialNumber>E2CE33AF</serialNumber><capacity>4294967296</capacity>
</ramStick>
<ramStick>
<partNumber>M471B5273DH0-CK0 </partNumber><serialNumber>630155D0</serialNumber><capacity>4294967296</capacity>
</ramStick>
</ramInfo>
</systemInfo>
</account>
You can use xpath like this:
foreach (XmlElement element in doc.SelectNodes("//account/systemInfo/ramInfo/ramStick"))
{
string partNumber = element["partNumber"].InnerText;
}
Note that since this call to SelectNodes returns only XmlElement, I can use XmlElement as type for the loop variable. Otherwise I'd have to use XmlNode.
After recoding so much,
Here is what I finally did...
XmlNode systemInfo = node.SelectSingleNode("systemInfo");
XmlNode ramInfo = systemInfo.SelectSingleNode("ramInfo");
XmlNodeList ramList = ramInfo.SelectNodes("ramStick");
foreach (XmlElement ramStick in ramList)
{
// add code here
}

Categories