This question already has answers here:
XPath doesn't work as desired in C#
(2 answers)
Closed 10 years ago.
I'm trying to use XmlDocument() to read an XML file node by node an output each element.
After much trial-and-error, I determined that having an xmlns attribute on my node causes no nodes to be returned from SelectNodes() call. Not sure why.
Since I can't change the output format and don't have access to the actual namespace, what are my options to get around this issue?
In addition, I have some elements that have subnodes. How do I access these while looping through the XML file? I.E., I need to decrypt the CipherValue elements, but not sure how to access this node anyway?
Sample below:
doc.Load(#"test.xml");
XmlNodeList logEntryNodeList = doc.SelectNodes("/Logs/LogEntry");
Console.WriteLine("Nodes = {0}", logEntryNodeList.Count);
foreach (XmlNode xn in logEntryNodeList)
{
string dateTime = xn["DateTime"].InnerText;
string sequence = xn["Sequence"].InnerText;
string appId = xn["AppID"].InnerText;
Console.WriteLine("{0} {1} {2}", dateTime, sequence, appId);
}
Sample XML looks like this:
<Logs>
<LogEntry Version="1.5" PackageVersion="10.10.0.10" xmlns="http://private.com">
<DateTime>2013-02-04T14:05:42.912349-06:00</DateTime>
<Sequence>5058</Sequence>
<AppID>TEST123</AppID>
<StatusDesc>
<EncryptedData>
<CipherData xmlns="http://www.w3.org/2001/04/xmlenc#">
<CipherValue>---ENCRYPTED DATA BASE64---</CipherValue>
</CipherData>
</EncryptedData>
</StatusDesc>
<Severity>Detail</Severity>
</LogEntry>
<LogEntry Version="1.5" PackageVersion="10.10.0.10" xmlns="http://private.com">
<DateTime>2013-02-04T14:05:42.912350-06:00</DateTime>
<Sequence>5059</Sequence>
<AppID>TEST123</AppID>
<StatusDesc>
<EncryptedData>
<CipherData xmlns="http://www.w3.org/2001/04/xmlenc#">
<CipherValue>---ENCRYPTED DATA BASE64---</CipherValue>
</CipherData>
</EncryptedData>
</StatusDesc>
<Severity>Detail</Severity>
</LogEntry>
</Logs>
After much trial-and-error, I determined that having an xmlns attribute on my node causes no nodes to be returned from SelectNodes() call. Not sure why.
The xmlns attribute effectively changes the default namespace within the element, including that element itself. So the namespace of your LogEntry element is "http://private.com". You'd need to include this appropriately in your XPath query, probably via an XmlNamespaceManager.
(If you can use LINQ to XML instead, it makes it much easier to work with namespaces.)
You can use Linq to Xml (as Jon stated). Here is code which parses your xml file with respect to namespaces. Result is a strongly typed sequence of anonymous objects (i.e. Date property has type of DateTime, Sequence is integer, and AppID is a string):
XDocument xdoc = XDocument.Load("test.xml");
XNamespace ns = "http://private.com";
var entries = xdoc.Descendants("Logs")
.Elements(ns + "LogEntry")
.Select(e => new {
Date = (DateTime)e.Element(ns + "DateTime"),
Sequence = (int)e.Element(ns + "Sequence"),
AppID = (string)e.Element(ns + "AppID")
}).ToList();
Console.WriteLine("Entries count = {0}", entries.Count);
foreach (var entry in entries)
{
Console.WriteLine("{0}\t{1} {2}",
entry.Date, entry.Sequence, entry.AppID);
}
Related
This question already has answers here:
How does one parse XML files? [closed]
(12 answers)
Closed 7 years ago.
I'm starting working with XML files and I know little about it.
What I need is read the value from a certain tag of a XML file
<Tag>
<Name>PBAS</Name>
<ID>1</ID>
<Description>Array of sequence characters as edited by user</Description>
<Type>char</Type>
<Elements>163</Elements>
<Size>163</Size>
<Value>TCCGAACAAGCGGCGAGTGCTGGGCGAGTGGCCGCGGGGCTGTGGTCCTGTGCACGGTGATCGCAAGGGCCCCCGGGGCTTGCTGGTCTTCCCTAACGGTTCGTGGAACCCCCGAAGCTGGGGGCTGGCCAGGGGCTTAAGGAACCCTTCCTAAATTAACTAC</Value>
</Tag>
I need, in the example above, getting the string between the "Value"(the string TCCGA...) of the tag named "PBAS" with "ID 1".
But I need this value just from the tag named PBAS with "ID1", since the XML document has other "values" from other tags named differently.
Thanks a lot!
var yourval = XDocument.Load(filename)
.XPathSelectElement("//Tag[Name='PBAS' and ID='1']")
.Element("Value")
.Value;
or
var yourval = (string)XDocument.Load(filename)
.XPathSelectElement("//Tag[Name='PBAS' and ID='1']/Value");
or (using only Linq)
var yourval = XDocument.Load(filename)
.Descendants("Tag")
.Where(t => (string)t.Element("Name") == "PBAS")
.Where(t => (string)t.Element("ID") == "1")
.First()
.Element("Value")
.Value;
PS: Necessary namespaces: System.Xml, System.Xml.XPath, and System.Xml.Linq . Also Read about Linq2Xml and Xpath
The technology to select nodes from XML is called XPath. Basically you create a path of element names separated by / like
/Tag/Value
To specify conditions, put them in square brackets at the place from where you want to start the condition:
/Tag[./Name/text()="PBAS"][./ID/text()="1"]/Value
Above XPath is good for demonstration purposes, since it explains how it works. In practice, this would be simplified to
/Tag[Name="PBAS"][ID="1"]/Value
Code:
using System.Xml;
var doc = new XmlDocument();
doc.LoadXml(xml);
var nodes = doc.SelectNodes("/Tag[./Name/text()=\"PBAS\"][./ID/text()=\"1\"]/Value");
foreach (XmlElement node in nodes)
{
Console.WriteLine(node.InnerText);
}
Console.ReadLine();
You can use the below code:
string xmlFile = File.ReadAllText(#"D:\Work_Time_Calculator\10-07-2013.xml");
XmlDocument xmldoc = new XmlDocument();
xmldoc.LoadXml(xmlFile);
XmlNodeList xnList = xmldoc.SelectNodes("/Tag");
string Value ="";
foreach (XmlNode xn in xnList)
{
string ID = xn["ID"].InnerText;
if(ID =="1")
{
value = xn["Value"].InnerText;
}
}
This is my XML:
<?xml version="1.0"?>
<formatlist>
<format>
<formatName>WHC format</formatName>
<delCol>ID</delCol>
<delCol>CDRID</delCol>
<delCol>TGIN</delCol>
<delCol>IPIn</delCol>
<delCol>TGOUT</delCol>
<delCol>IPOut</delCol>
<srcNum>SRCNum</srcNum>
<distNum>DSTNum</distNum>
<connectTime>ConnectTime</connectTime>
<duration>Duration</duration>
</format>
<format>
<formatName existCombineCol="1">Umobile format</formatName> //this format
<delCol>billing_operator</delCol>
<hideCol>event_start_date</hideCol>
<hideCol>event_start_time</hideCol>
<afCombineName dateType="DateTime" format="dd/MM/yyyy HH:mm:ss"> //node i want
<name>ConnectdateTimeAFcombine</name>
<combineDate>event_start_date</combineDate>
<combineTime>event_start_time</combineTime>
</afCombineName>
<afCombineName dateType="DateTime" format="dd/MM/yyyy HH:mm:ss"> //node i want
<name>aaa</name>
<combineDate>bbb</combineDate>
<combineTime>ccc</combineTime>
</afCombineName>
<modifyPerfixCol action="add" perfix="60">bnum</modifyPerfixCol>
<srcNum>anum</srcNum>
<distNum>bnum</distNum>
<connectTime>ConnectdateTimeAFcombine</connectTime>
<duration>event_duration</duration>
</format>
</formatlist>
I want to find format with Umobile format then iterate over those two nodes.
<afCombineName dateType="DateTime" format="dd/MM/yyyy HH:mm:ss"> //node i want
<name>ConnectdateTimeAFcombine</name>
<combineDate>event_start_date</combineDate>
<combineTime>event_start_time</combineTime>
</afCombineName>
<afCombineName dateType="DateTime" format="dd/MM/yyyy HH:mm:ss"> //node i want
<name>aaa</name>
<combineDate>bbb</combineDate>
<combineTime>ccc</combineTime>
</afCombineName>
and list all the two node's child nodes. The result should like this:
ConnectdateTimeAFcombine,event_start_date,event_start_time.
aaa,bbb,ccc
How can I do this?
foreach(var children in format.Descendants())
{
//Do something with the child nodes of format.
}
For all XML related traversing, you should get used to using XPath expressions. It is very useful. Even if you could perhaps do something easier in your specific case, it is good practice to use XPath. This way, if your scheme changes at some point, you just update your XPath expression and your code will be up and running.
For a complete example, you can have a look at this article.
You can use the System.Xml namespace APIs along with System.Xml.XPath namespace API. Here is a quick algorithm that will help you do your task:
Fetch the text node containing the string Umobile format using the below XPATH:
XmlNode umobileFormatNameNode = document.SelectSingleNode("//formatName[text()='Umobile format']");
Now the parent of umobileFormatNameNode will be the node that you are interested in:
XmlNode formatNode = umobileFormatNameNode.ParentNode;
Now get the children for this node:
XmlNodeList afCombineFormatNodes = formatNode.SelectNodes("afCombineName");
You can now process the list of afCombineFormatNodes
for(XmlNode xmlNode in afCombineNameFormtNodes)
{
//process nodes
}
This way you can access those elements:
var doc = System.Xml.Linq.XDocument.Load("PATH TO YOUR XML FILE");
var result = doc.Descendants("format")
.Where(x => (string)x.Element("formatName") == "Umobile format")
.Select(x => x.Element("afCombineName"));
Then you can iterate the result this way:
foreach (var item in result)
{
string format = item.Attribute("format").Value.ToString();
string name = item.Element("name").Value.ToString();
string combineDate = item.Element("combineDate").Value.ToString();
string combineTime = item.Element("combineTime").Value.ToString();
}
I have this XML file:
<MyXml>
<MandatoryElement1>value</MandatoryElement1>
<MandatoryElement2>value</MandatoryElement2>
<MandatoryElement3>value</MandatoryElement3>
<CustomElement1>value</CustomElement1>
<CustomElement2>value</CustomElement2>
<MyXml>
All 3 elements that are called 'MandatoryElementX' will always appear in the file. The elements called 'CustomElementX' are unknown. These can be added or removed freely by a user and have any name.
What I need is to fetch all the elements that are not MandatoryElements. So for the file above I would want this result:
<CustomElement1>value</CustomElement1>
<CustomElement2>value</CustomElement2>
I don't know what the names of the custom elements may be, only the names of the 3 MandatoryElements, so the query needs to somehow exclude these 3.
Edit:
Even though this was answered, I want to clarify the question. Here is an actual file:
<Partner>
<!--Mandatory elements-->
<Name>ALU FAT</Name>
<InterfaceName>Account Lookup</InterfaceName>
<RequestFolder>C:\Documents and Settings\user1\Desktop\Requests\ALURequests</RequestFolder>
<ResponseFolder>C:\Documents and Settings\user1\Desktop\Responses</ResponseFolder>
<ArchiveMessages>Yes</ArchiveMessages>
<ArchiveFolder>C:\Documents and Settings\user1\Desktop\Archive</ArchiveFolder>
<Priority>1</Priority>
<!--Custom elements - these can be anything-->
<Currency>EUR</Currency>
<AccountingSystem>HHGKOL</AccountingSystem>
</Partner>
The result here would be:
<Currency>EUR</Currency>
<AccountingSystem>HHGKOL</AccountingSystem>
You can define a list of mandatory names and use LINQ to XML to filter:
var mandatoryElements = new List<string>() {
"MandatoryElement1",
"MandatoryElement2",
"MandatoryElement3"
};
var result = xDoc.Root.Descendants()
.Where(x => !mandatoryElements.Contains(x.Name.LocalName));
Do you have created this xml or do you get it by another person/application?
If it's yours I would advise you not to number it. You can do something like
<MyXml>
<MandatoryElement id="1">value<\MandatoryElement>
<MandatoryElement id="2">value<\MandatoryElement>
<MandatoryElement id="3">value<\MandatoryElement>
<CustomElement id="1">value<\CustomElement>
<CustomElement id="2">value<\CustomElement>
<MyXml>
In the LINQ-Statement you don't need the List then.
Your question shows improperly formatted XML but I am assuming that is a typo and the real Xml can be loaded into the XDocument class.
Try this...
string xml = #"<MyXml>
<MandatoryElement1>value</MandatoryElement1>
<MandatoryElement2>value</MandatoryElement2>
<MandatoryElement3>value</MandatoryElement3>
<CustomElement1>value</CustomElement1>
<CustomElement2>value</CustomElement2>
</MyXml> ";
System.Xml.Linq.XDocument xDoc = XDocument.Parse(xml);
var result = xDoc.Root.Descendants()
.Where(x => !x.Name.LocalName.StartsWith("MandatoryElement"));
lets say TestXMLFile.xml will contain your xml,
XElement doc2 = XElement.Load(Server.MapPath("TestXMLFile.xml"));
List<XElement> _list = doc2.Elements().ToList();
List<XElement> _list2 = new List<XElement>();
foreach (XElement x in _list)
{
if (!x.Name.LocalName.StartsWith("Mandatory"))
{
_list2.Add(x);
}
}
foreach (XElement y in _list2)
{
_list.Remove(y);
}
I'm trying to parse results from the YouTube API. I'm getting the results correctly as a string, but am unable to parse it correctly.
I followed suggestions on a previous thread, but am not getting any results.
My sample code is:
string response = youtubeService.GetSearchResults(search.Term, "published", 1, 50);
XDocument xDoc = XDocument.Parse(response, LoadOptions.SetLineInfo);
var list = xDoc.Descendants("entry").ToList();
var entries = from entry in xDoc.Descendants("entry")
select new
{
Id = entry.Element("id").Value,
Categories = entry.Elements("category").Select(c => c.Value)
//Published = entry.Element("published").Value,
//Title = entry.Element("title").Value,
//AuthorName = entry.Element("author").Element("name").Value,
//Thumnail = entry.Element("media:group").Elements("media:thumnail").ToList().ElementAt(0)
};
foreach (var entry in entries)
{
// entry.Id and entry.Categories available here
}
The problem is that entries has a count of 0 even though the XDocument clearly has the valid values.
The value of the response variable (Sample XML) can be seen here: http://snipt.org/lWm
(FYI: The youTube schema is listed here: http://code.google.com/apis/youtube/2.0/developers_guide_protocol_understanding_video_feeds.html)
Can anyone tell me what I'm doing wrong here?
All the data is in the "http://www.w3.org/2005/Atom" namespace; you need to use this throughout:
XNamespace ns = XNamespace.Get("http://www.w3.org/2005/Atom");
...
from entry in xDoc.Descendants(ns + "entry")
select new
{
Id = entry.Element(ns + "id").Value,
Categories = entry.Elements(ns + "category").Select(c => c.Value)
...
};
etc (untested)
When you see prefix:name, it means that name is in the namespace whose prefix has been declared as prefix. If you look at the top of the document, you'll see an xmlns:media=something. The something is the namespace used for anything with the prefix media.
This means you need to create an XNamespace for each of the namespaces you need to reference:
XNamespace media = XNamespace.Get("http://search.yahoo.com/mrss/");
and then use media for the names in that namespace:
media + "group"
The namespaces in this document are:
xmlns="http://www.w3.org/2005/Atom"
xmlns:app="http://www.w3.org/2007/app"
xmlns:media="http://search.yahoo.com/mrss/"
xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/"
xmlns:gd="http://schemas.google.com/g/2005"
xmlns:gml="http://www.opengis.net/gml"
xmlns:yt="http://gdata.youtube.com/schemas/2007"
xmlns:georss="http://www.georss.org/georss"
You need to set the namespace.
Creating an XName in a Namespace
As with XML, an XName can be in a namespace, or it can be in no namespace.
For C#, the recommended approach for creating an XName in a namespace is to declare the XNamespace object, then use the override of the addition operator.
http://msdn.microsoft.com/en-us/library/system.xml.linq.xname.aspx
Say I call XElement.Parse() with the following XML string:
var xml = XElement.Parse(#"
<?xml version="1.0" encoding="UTF-8"?>
<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Owner>
<ID>7c75442509c41100b6a413b88b523bd6f46554cdbee5b6cbe27bc08cb3f6a865</ID>
<DisplayName>me</DisplayName>
</Owner>
<AccessControlList>
<Grant>
<Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Group">
...
");
When it comes time to query the element, I'm forced to use fully-qualified element names because that XML document contains an xmlns attribute in its root. This requires cumbersome creations of XName instances:
var AWS_XMLNS = "http://s3.amazonaws.com/doc/2006-03-01/";
var ownerElement = xml.Element(XName.Get("AccessControlPolicy", AWS_XMLNS)).Element(XName.Get("Owner", AWS_XMLNS));
When what I really want is simply,
var ownerElement = xml.Element("AccessControlPolicy").Element("Owner");
Is there a way to make LINQ to XML assume a specific namespace so I don't have to keep specifying it?
You could simplify by using
XNamespace ns = "http://s3.amazonaws.com/doc/2006-03-01/";
var ownerElement = xml.Element(ns + "AccessControlPolicy").Element(ns + "Owner");
I don't think you can (see Jon Skeet's comment), but there are a few tricks you can do.
1) create an extension method that appends the XNamespace to your string
2) Use VB?!?