Linq to XML - how do I get this element value - c#

Fairly simple one, but my knowledge is limited in this area. I'm using the following C# code to access the value of elements within my SGML and XML documents.
It works fine when the document contains only one element with the given name, but as soon as there is more than one element with the same name, Single() throws an exception, obviously!
I need to use XPath or some other way of specifying the location of the element whose value I'm trying to get.
XDocument doc = XDocument.Load(sgmlReader);
string system = doc.Descendants("chapnum").Single().Value;
return system;
This works fine if there is only one "chapnum" in the doc, but I need to specifically get the value of the "chapnum" that is nested inside "dmaddres".
How please?
Here is a sample of the XML doc. I'm trying to get the value of the "chapnum" element nested in the "dmaddres" element.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dmodule []>
<dmodule xmlns:dc="http://www.purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.s1000d.org/S1000D_2-3-1/xml_schema_flat/descript.xsd">
<idstatus>
<dmaddres>
<dmc><avee><modelic>xx</modelic><sdc>A</sdc><chapnum>29</chapnum>
<section>1</section><subsect>3</subsect><subject>54</subject><discode
>00</discode><discodev>AAA</discodev><incode>042</incode><incodev
>A</incodev><itemloc>D</itemloc></avee></dmc>
<dmtitle><techname>Switch</techname><infoname>Description of function</infoname>
</dmtitle>
<issno inwork="00" issno="001" type="new"/>
<issdate day="20" month="07" year="2012"/>
<language language="sx"/></dmaddres>
<status>
<security class="01"/><datarest><instruct><distrib>-</distrib><expcont
>Obey the national regulations for export control.</expcont></instruct>
<inform><copyright><para><refdm><avee><modelic>xx</modelic><sdc>A</sdc>
<chapnum>29</chapnum><section>1</section><subsect>3</subsect><subject
>54</subject><discode>00</discode><discodev>ZZZ</discodev><incode
>021</incode><incodev>Z</incodev><itemloc>D</itemloc></avee></refdm
></para></copyright><datacond>BREXREF=AJ-A-00-00-00-05ZZZ-022Z-D VERSUB=CDIM-V6</datacond>
</inform></datarest>
<rpc>xxxxx</rpc>
<orig>xxxxx</orig>
<applic>
<type>-</type>
<model model="xxxxx"><mfc>xxxxx</mfc><pnr>xxxxxxx</pnr></model>
</applic>
<brexref><refdm><avee><modelic>xx</modelic><sdc>A</sdc><chapnum>00</chapnum>
<section>0</section><subsect>0</subsect><subject>00</subject><discode
>05</discode><discodev>ZZZ</discodev><incode>022</incode><incodev
>Z</incodev><itemloc>D</itemloc></avee></refdm></brexref>

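Like this?
// chapnum is not a direct child of dmaddres (it sits under dmc/avee), so search the descendants
string system = doc.Descendants("dmaddres").Single()
.Descendants("chapnum").Single().Value;
Spelling out the full path from the root:
string system = doc.Root.Element("idstatus").Element("dmaddres").Element("dmc").Element("avee").Element("chapnum").Value;
would probably do just as well.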
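Since the question asks about XPath: a minimal sketch, assuming the element names from the sample document. XPathSelectElement is an extension method from System.Xml.XPath; the class and method names here are made up for illustration.

```csharp
using System.Xml.Linq;
using System.Xml.XPath;

public static class ChapnumLookup
{
    // Hypothetical helper: returns the value of the first chapnum found
    // anywhere below a dmaddres element, or null when there is none.
    public static string GetAddressChapnum(XDocument doc)
    {
        XElement chapnum = doc.XPathSelectElement("//dmaddres//chapnum");
        return chapnum?.Value;
    }
}
```

Calling ChapnumLookup.GetAddressChapnum(doc) on the loaded document ignores the chapnum elements under refdm, because the XPath only matches inside dmaddres.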

Related

C# Skip anything to next tag

I have a log file in xml format like
<log> // skip this node
<?xml version="1.0" encoding="UTF-8"?>
<qbean logger="main-logger">
</qbean>
</log>
<log> // go to this node
</log>
Now ReadToNextSibling("log") throws an exception, and I need to skip the content of the first "log" tag and move to the next "log" tag without an exception being thrown.
Is there a way?
Hint:
Your XML is invalid, since the <?xml version="1.0" encoding="UTF-8"?> declaration is only allowed at the very start of the document, before the root element. You can search for it and remove it if that fixes your problem, e.g. yourXml.Replace("<?xml version=\"1.0\" encoding=\"UTF-8\"?>", "")
You also have to wrap the <log> elements in a single root element for the XML to be valid for parsing.
Then you can use the XmlDocument class to parse the XML data and skip anything you want. You would need something like this:
var document = new XmlDocument();
document.LoadXml(yourXml); // yourXml: declaration removed, content wrapped in one root element
XmlNode secondLog = document.DocumentElement.ChildNodes[1]; // the second <log> element

C# XML Select Multiple Nodes

I am trying to modify a website that was built by some other web developers.
The part in question, reads an XML data file and pulls back data to display on a Google Map.
They have a line of code;
string path = Server.MapPath(OutageXmlVirtualPath); //path to XML file
OutageData outages = XMLUtil.Deserialize<OutageData>(path);
Outage outage = outages.Outages.FirstOrDefault(o => o.PostCodes.Any(p => FoundOutagePostcode(p)) && !o.Planned);
That pulls the first record in the XML that matches a postcode the user has entered into a textbox. (LastOrDefault works also.)
The issue with this, however, is that the postcode they enter might appear more than once, in another node in the XML. So what I want to do is pull back all of the records in the XML that match, not just the first. I can see that there are 'All' and 'SelectMany' methods, but I don't know how to use them in my code.
I would consider myself a complete novice in this area.
If anyone is able to lend any help that would be greatly appreciated.
Kind regards,
Chris
XML sample
<?xml version="1.0" encoding="utf-16"?>
<OutageData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<TimeStamp>2013-12-16T06:38:00.1706983+00:00</TimeStamp>
<Outages>
<Outage>
<Region>South West</Region>
<IncidentID>INCD-83651-m</IncidentID>
<ConfirmedOff>1</ConfirmedOff>
<PredictedOff>0</PredictedOff>
<Restored>0</Restored>
<Status>In Progress</Status>
<Planned>false</Planned>
<StartTime>2013-12-14T18:03:00</StartTime>
<ETR>2013-12-16T12:00:00</ETR>
<Voltage>LV</Voltage>
<PostCodes>
<string>PL1 4RL</string>
<string>PL2 1AF</string>
<string>PL2 1AG</string>
<string>PL2 1AH</string>
</PostCodes>
<Sensitive>1</Sensitive>
</Outage>
<Outage>
<Region>West Midlands</Region>
<IncidentID>INCD-12499-I</IncidentID>
<ConfirmedOff>0</ConfirmedOff>
<PredictedOff>0</PredictedOff>
<Restored>0</Restored>
<Status>In Progress</Status>
<Planned>true</Planned>
<StartTime>2013-12-13T10:00:00</StartTime>
<ETR xsi:nil="true" />
<Voltage>HV</Voltage>
<PostCodes>
<string>SY7 9AX</string>
<string>SY7 9AY</string>
<string>SY7 9AZ</string>
<string>SY7 9BE</string>
</PostCodes>
<Sensitive>0</Sensitive>
</Outage>
</Outages>
</OutageData>
Just use Where instead of FirstOrDefault:
var outagesFound = outages.Outages.Where(o => o.PostCodes.Any(p => FoundOutagePostcode(p)) && !o.Planned);
You can then iterate over outagesFound with a foreach loop.
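A self-contained sketch of the same idea. The Outage class below is a hypothetical stand-in for the deserialized type in the question, and a plain string comparison stands in for FoundOutagePostcode, whose implementation is not shown.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the deserialized Outage type in the question.
public class Outage
{
    public string IncidentID { get; set; }
    public bool Planned { get; set; }
    public List<string> PostCodes { get; set; } = new List<string>();
}

public static class OutageSearch
{
    // Returns every unplanned outage that lists the given postcode.
    public static List<Outage> FindUnplanned(IEnumerable<Outage> outages, string postcode)
    {
        return outages
            .Where(o => !o.Planned && o.PostCodes.Any(p => p == postcode))
            .ToList(); // materialize so the result can be enumerated more than once
    }
}
```

Where on its own is lazy: the filter runs each time the result is enumerated. Calling ToList() is optional, but it avoids re-running the filter if the matches are used in several places.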

Why does having a xmlns cause my C# program not to read XML?

I have a C# program that attempts to read the following xml, but can't read any elements:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Comments Here -->
<FileFeed
xmlns="http://www.mycompany.com/schemas/xxx/FileFeed/V1"
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somecompany.com/schemas/xxx/FileFeed/V1
FileFeed.xsd"
RecordCount = "1">
<Object>
<ID>PAMSMOKE110113xxx</ID>
<CorpID>12509</CorpID>
<AnotherID>201654702345</AnotherID>
<TimeStamp>2013-09-03</TimeStamp>
<Type>Some Type</Type>
<SIM_ID>89011704258012600767</SIM_ID>
<Code>ZZZ</Code>
<Year>2013</Year>
</Object>
</FileFeed>
With the above XML my C# program is unable to read any elements. For instance, the ID element is always null.
Now if I simply remove the first xmlns from the above XML, my program can read all the elements without any issues. The problem is I have to process the XML file in the format that's given to me, and can't change the file format. My program reads the below XML just fine: Note the line xmlns="http://www.mycompany.com/schemas/xxx/FileFeed/V1" is removed.
<?xml version="1.0" encoding="UTF-8"?>
<!-- Comments Here -->
<FileFeed
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somecompany.com/schemas/xxx/FileFeed/V1
FileFeed.xsd"
RecordCount = "1">
<Object>
<ID>PAMSMOKE110113xxx</ID>
<CorpID>12509</CorpID>
<AnotherID>201654702345</AnotherID>
<TimeStamp>2013-09-03</TimeStamp>
<Type>Some Type</Type>
<SomeNumber>89011704258012600767</SomeNumber>
<Code>ZZZ</Code>
<Year>2013</Year>
</Object>
</FileFeed>
I realize I'm not posting any code, but just wondering what possible issue could I be having, where simply removing the xmlns line resolves everything??
Your problem is with xml namespaces
Using Linq2Xml
XNamespace ns = "http://www.mycompany.com/schemas/xxx/FileFeed/V1";
var xDoc = XDocument.Load(fname);
var id = xDoc.Root.Element(ns + "Object").Element(ns + "ID").Value;
Your root element FileFeed declares a default namespace (the xmlns="..." attribute). This means that FileFeed and every unprefixed element inside it belong to that namespace.
The Element method takes an XName as its argument. Usually you pass a string, which gets implicitly converted into an XName with no namespace.
If you want to include a namespace, you create an XNamespace and add the local name to it as a string. Since XNamespace overloads the + operator, this also results in an XName.
XDocument doc = XDocument.Load("Test.xml");
// this will be null
XElement objectElementWithoutNS = doc.Root.Element("Object");
XNamespace ns = doc.Root.GetDefaultNamespace();
XElement objectElementWithNS = doc.Root.Element(ns + "Object");
XML namespaces are more or less like C# namespaces. Would you expect to reach a class the same way whether or not its namespace is set?
namespace My.Company.Schemas {
public class FileFeed { }
}
vs
public class FileFeed { }
They are two DISTINCT classes! The same applies to XML: by setting a namespace you can have documents with similar or even identical internal structure that nevertheless represent two distinct, non-interchangeable documents. This is really convenient.
If you'd like help with why your actual reading method doesn't take the namespace into account, you have to present the C# code. The general rule, though, is that any reading API makes it possible to specify the namespace when reading.
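As one illustration of that general rule, here is a sketch for the XmlDocument API, assuming the FileFeed layout from the question. The class name, method name and the "ff" prefix are arbitrary choices for this example; only the namespace URI has to match the document.

```csharp
using System.Xml;

public static class FileFeedReader
{
    // Reads the first Object/ID value from a FileFeed document,
    // or returns null when no such element exists.
    public static string ReadFirstId(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath has no notion of a default namespace, so the URI is bound
        // to a prefix that is then used in the query.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("ff", "http://www.mycompany.com/schemas/xxx/FileFeed/V1");

        XmlNode id = doc.SelectSingleNode("/ff:FileFeed/ff:Object/ff:ID", nsmgr);
        return id?.InnerText;
    }
}
```

Without the namespace manager, a query such as "/FileFeed/Object/ID" would select nothing, which matches the null result described in the question.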

XML DocumentElement is trashing the innerXml

I have a simple XML file, shown below, which I read in via a basic XmlDocument.Load("filename.xml"). If I load the file and inspect its InnerXml, it all looks normal. However, when I inspect the InnerXml of DocumentElement, it's a mess!!! I kept the example small, so you can easily see that it is well-formed:
<?xml version="1.0" encoding="UTF-8"?>
<fax:FaxService xmlns:fax="http://www.hp.com/schemas/imaging/con/service/fax/2009/02/11/" xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">
<fax:ServiceDefaults>
<fax:ServiceSendDefaults>
<fax:InternetFaxSettings>
<dd:FaxFileFormat>MTIFFG4</dd:FaxFileFormat>
<dd:UseEmailAsFaxAcctAddr>false</dd:UseEmailAsFaxAcctAddr>
<dd:AutoCompleteToNANP>false</dd:AutoCompleteToNANP>
<dd:RetryInterval>0</dd:RetryInterval>
<dd:MaxRetryAttempts>0</dd:MaxRetryAttempts>
</fax:InternetFaxSettings>
</fax:ServiceSendDefaults>
</fax:ServiceDefaults>
</fax:FaxService>
Now, try this in C# with this simple code:
...
XmlDocument xDoc = new XmlDocument();
xDoc.Load("*XMLSAMPLE.XML*");
textBox1.Text = xDoc.InnerXml;
textBox2.Text = xDoc.DocumentElement.InnerXml;
...
It's completely mangled, with the 2nd namespace repeated with every dd tag, and not even included in the top-most tag.
What am I doing wrong? This is driving me nuts!
The content returned by xDoc.DocumentElement.InnerXml is semantically identical to your original ServiceDefaults tag - if the first fragment conforms to your XML schema, the InnerXml fragment will also conform to the definition of the inner element. Just because the framework has re-arranged the namespace declarations does not change the semantics of the document.
Compare the output of the two properties:
xDoc.InnerXml:
<?xml version="1.0" encoding="UTF-8"?>
<fax:FaxService xmlns:fax="http://www.hp.com/schemas/imaging/con/service/fax/2009/02/11/" xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">
<fax:ServiceDefaults>
<fax:ServiceSendDefaults>
<fax:InternetFaxSettings>
<dd:FaxFileFormat>MTIFFG4</dd:FaxFileFormat>
<dd:UseEmailAsFaxAcctAddr>false</dd:UseEmailAsFaxAcctAddr>
<dd:AutoCompleteToNANP>false</dd:AutoCompleteToNANP>
<dd:RetryInterval>0</dd:RetryInterval>
<dd:MaxRetryAttempts>0</dd:MaxRetryAttempts>
</fax:InternetFaxSettings>
</fax:ServiceSendDefaults>
</fax:ServiceDefaults>
</fax:FaxService>
xDoc.DocumentElement.InnerXml:
<fax:ServiceDefaults xmlns:fax="http://www.hp.com/schemas/imaging/con/service/fax/2009/02/11/">
<fax:ServiceSendDefaults>
<fax:InternetFaxSettings>
<dd:FaxFileFormat xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">MTIFFG4</dd:FaxFileFormat>
<dd:UseEmailAsFaxAcctAddr xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">false</dd:UseEmailAsFaxAcctAddr>
<dd:AutoCompleteToNANP xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">false</dd:AutoCompleteToNANP>
<dd:RetryInterval xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">0</dd:RetryInterval>
<dd:MaxRetryAttempts xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">0</dd:MaxRetryAttempts>
</fax:InternetFaxSettings>
</fax:ServiceSendDefaults>
</fax:ServiceDefaults>
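One way to convince yourself that the two forms are equivalent is to re-parse the InnerXml fragment and check which namespace an element ends up in. This is a rough sketch: the class and method names are hypothetical, and it assumes the root element has exactly one child element, so that the fragment is itself a well-formed document.

```csharp
using System.Xml;

public static class InnerXmlCheck
{
    // Re-parses DocumentElement.InnerXml and returns the namespace URI of the
    // first element with the given local name, or null if none is found.
    public static string NamespaceAfterRoundTrip(string xml, string localName)
    {
        var original = new XmlDocument();
        original.LoadXml(xml);

        // Assumes a single child element under the root (see lead-in).
        var fragment = new XmlDocument();
        fragment.LoadXml(original.DocumentElement.InnerXml);

        foreach (XmlNode node in fragment.GetElementsByTagName("*"))
        {
            if (node.LocalName == localName)
            {
                return node.NamespaceURI;
            }
        }
        return null;
    }
}
```

For the fax document above, looking up FaxFileFormat this way should give the same dd namespace URI as in the original file, even though the declaration has moved onto each dd element.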
A look at the following link in MSDN will help shed light on your situation:
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.innerxml.aspx
Basically, xDoc.DocumentElement.InnerXml serializes the content of the root element, starting at the <fax:ServiceDefaults> node, whereas xDoc.InnerXml serializes the whole document, including the FaxService node. This is crucial to understanding your problem, because all of your xmlns declarations are on the FaxService node.
Make the following change to your XML document and notice what happens (basically, copy the xmlns declarations over to the ServiceDefaults node):
<?xml version="1.0" encoding="UTF-8"?>
<fax:FaxService xmlns:fax="http://www.hp.com/schemas/imaging/con/service/fax/2009/02/11/" xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">
<fax:ServiceDefaults xmlns:fax="http://www.hp.com/schemas/imaging/con/service/fax/2009/02/11/" xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/">
<fax:ServiceSendDefaults>
<fax:InternetFaxSettings>
<dd:FaxFileFormat>MTIFFG4</dd:FaxFileFormat>
<dd:UseEmailAsFaxAcctAddr>false</dd:UseEmailAsFaxAcctAddr>
<dd:AutoCompleteToNANP>false</dd:AutoCompleteToNANP>
<dd:RetryInterval>0</dd:RetryInterval>
<dd:MaxRetryAttempts>0</dd:MaxRetryAttempts>
</fax:InternetFaxSettings>
</fax:ServiceSendDefaults>
</fax:ServiceDefaults>
</fax:FaxService>
Suddenly your code will behave according to your expectations. So hopefully this helps you towards understanding the issue. What the permanent fix should be, that's up to you.
HTH!

XDocument.Descendants(itemName) - Problems finding qualified name

I'm trying to read an XML RSS feed from a website. To do so, I use an async download and create an XDocument with the XDocument.Parse() method.
The document is intended to be very simple, like this:
<root>
<someAttribute></someAttribute>
<item>...</item>
<item>...</item>
</root>
Now I want to read out all the items. Therefore I tried:
foreach (XElement NewsEntry in xDocument.Descendants("item"))
but this doesn't work. I then found a post on this board suggesting the qualified name, because some namespaces are defined in the root element:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns="http://purl.org/rss/1.0/">
well, I tried all 3 available namespaces - nothing worked for me:
XName itemName = XName.Get("item", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
XName itemName2 = XName.Get("item", "http://purl.org/dc/elements/1.1/");
XName itemName3 = XName.Get("item", "http://purl.org/rss/1.0/modules/syndication/");
Any help would be appreciated.
(Usually I'm doing the XML-Analysis with Regex - but this time I'm developing for a mobile device, and therefore need to care about performance.)
You have not tried the default namespace, which is the last declaration on the rdf:RDF element:
xmlns="http://purl.org/rss/1.0/"
This makes sense: elements written without a prefix, such as <item>, belong to the default namespace, so that is the namespace to combine with the local name.
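A minimal sketch of the lookup with the default namespace, assuming the feed uses the RSS 1.0 declaration quoted above. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssItems
{
    // Namespace URI taken from the xmlns="..." declaration on rdf:RDF.
    private static readonly XNamespace Rss = "http://purl.org/rss/1.0/";

    // Returns all item elements that live in the RSS 1.0 namespace.
    public static List<XElement> GetItems(XDocument feed)
    {
        return feed.Descendants(Rss + "item").ToList();
    }
}
```

The same XNamespace is needed again for the children of each item, for example entry.Element(Rss + "title"), provided the feed has such an element.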
Not directly a solution to the XDocument RSS read problem, but why aren't you using the provided SyndicationFeed class to load the feed? http://msdn.microsoft.com/en-us/library/system.servicemodel.syndication.syndicationfeed.aspx
Try this
var elements = from p in xDocument.Root.Elements()
where p.Name.LocalName == "item"
select p;
foreach(var element in elements)
{
//Do stuff
}
