Parsing specific part of XML file - c#

I have a .gpx XML file with the following sample:
<trk>
<name>Test</name>
<trkseg>
<trkpt lon="-84.89032996818423" lat="32.75810896418989">
<ele>225.0</ele>
<time>2011-04-02T11:57:48.000Z</time>
<extensions>
<gpxtpx:TrackPointExtension>
<gpxtpx:cad>0</gpxtpx:cad>
</gpxtpx:TrackPointExtension>
</extensions>
</trkpt>
</trkseg>
</trk>
I'm using Linq to XML to parse this but I'm having a difficult time parsing the extensions section. Here's the code I'm using:
var gpxDoc = LoadFromStream(document);
var gpx = GetGpxNameSpace();
var gpxtpx = XNamespace.Get("gpxtpx");
var tracks = from track in gpxDoc.Descendants(gpx + "trk")
select new
{
Name = DefaultStringValue(track, gpx, "name"),
Description = DefaultStringValue(track, gpx, "desc"),
Segments = (from trkSegment in track.Descendants(gpx + "trkseg")
select new
{
TrackSegment = trkSegment,
Points = (from trackpoint in trkSegment.Descendants(gpx + "trkpt")
select new
{
Lat = Double(trackpoint.Attribute("lat").Value),
Lng = Double(trackpoint.Attribute("lon").Value),
Ele = DefaultDoubleValue(trackpoint, gpx, "ele"),
Time = DefaultDateTimeValue(trackpoint, gpx, "time"),
Extensions = (
from ext in trackpoint.Descendants(gpx + "extensions").Descendants(gpxtpx + "TrackPointExtension")
select new
{
Cad = DefaultIntValue(ext, gpxtpx, "cad")
}).SingleOrDefault()
})
})
};
Here's the relevant helper code:
private static double? DefaultIntValue(XContainer element, XNamespace ns, string elementName)
{
var xElement = element.Element(ns + elementName);
return xElement != null ? Convert.ToInt32(xElement.Value) : (int?)null;
}
private XNamespace GetGpxNameSpace()
{
var gpx = XNamespace.Get("http://www.topografix.com/GPX/1/1");
return gpx;
}
The actual error I'm getting is
The following error occurred: Object reference not set to an instance of an object.
and it bombs on this code:
Extensions = (from ext in trackpoint.Descendants(gpx + "extensions").Descendants(gpxtpx + "TrackPointExtension")
select new
{
Cad = DefaultIntValue(ext, gpxtpx, "cad")
}).SingleOrDefault();
I just don't know how to fix it.

Since you never declare the namespace (xmlns:gpxtpx="http://www.topografix.com/GPX/1/1"), the query is never going to match. The XML fragment you provided is not well formed, because the gpxtpx prefix is used without being bound to a namespace. Note also that XNamespace.Get expects the namespace URI, not the prefix, so XNamespace.Get("gpxtpx") will not match elements written with the gpxtpx prefix.
If the fragment posted is snipped from a larger document, consider switching to XML APIs rather than string manipulation. If that is the entirety of the XML you receive from an outside system, wrap it in a root node on which you can declare the namespace:
<root xmlns:gpxtpx="http://www.topografix.com/GPX/1/1">
<!-- put your xml fragment here -->
</root>
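A minimal sketch of reading the cad value once the prefix is declared. The sample document and the Garmin namespace URI used for gpxtpx are assumptions for illustration; use whatever URI your file's root element actually declares. Looking the URI up with GetNamespaceOfPrefix avoids hard-coding it.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class GpxExtensionSketch
{
    static void Main()
    {
        // Assumed sample: the root declares the default GPX namespace and the gpxtpx prefix.
        const string xml = @"<gpx xmlns='http://www.topografix.com/GPX/1/1'
     xmlns:gpxtpx='http://www.garmin.com/xmlschemas/TrackPointExtension/v1'>
  <trk><trkseg>
    <trkpt lon='-84.89' lat='32.75'>
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:cad>0</gpxtpx:cad>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>";

        var doc = XDocument.Parse(xml);
        XNamespace gpx = "http://www.topografix.com/GPX/1/1";

        // Resolve the URI bound to the prefix instead of passing the prefix itself.
        XNamespace gpxtpx = doc.Root.GetNamespaceOfPrefix("gpxtpx");

        // The (int?) cast yields null when the element is missing, so no NullReferenceException.
        var cadences = doc.Descendants(gpx + "trkpt")
            .Select(pt => (int?)pt.Elements(gpx + "extensions")
                                  .Elements(gpxtpx + "TrackPointExtension")
                                  .Elements(gpxtpx + "cad")
                                  .FirstOrDefault());

        foreach (var cad in cadences)
            Console.WriteLine(cad.HasValue ? cad.Value.ToString() : "(none)");
    }
}
```

With the sample above this prints 0 for the single track point.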

Create XML file with custom formatting

I need to create an XML file with line breaks and tabs in Attributes and on few tags as well. So I tried like below.
string xmlID = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<PersonLib Version=\"1.0\"></PersonLib>";
XDocument doc = XDocument.Parse(xmlID);//, LoadOptions.PreserveWhitespace);
XElement device = doc.Root;
using (StringWriter str = new StringWriter())
using (XmlTextWriter xml = new XmlTextWriter(str))
{
xml.Formatting=Formatting.Indented;
xml.WriteStartElement("Details");
xml.WriteWhitespace("\n\t");
xml.WriteStartElement("Name");
xml.WriteWhitespace("\n\t\t");
xml.WriteStartElement("JohnDoe");
xml.WriteAttributeString("DOB", "10");
xml.WriteAttributeString("FirstName", "20");
xml.WriteAttributeString("LastName", "40");
xml.WriteAttributeString("\n\t\t\tAddress", "50");
xml.WriteAttributeString("\n\t\t\tPhoneNum", "60");
xml.WriteAttributeString("\n\t\t\tCity", "70");
xml.WriteAttributeString("\n\t\t\tState", "80");
xml.WriteAttributeString("\n\t\t\tCountry", "90");
//xml.WriteWhitespace("\n\t\t");
xml.WriteEndElement();
xml.WriteWhitespace("\n\t");
xml.WriteEndElement();
xml.WriteWhitespace("\n");
xml.WriteEndElement();
Console.WriteLine(str);
device.Add(XElement.Parse(str.ToString(), LoadOptions.PreserveWhitespace));
File.WriteAllText("MyXML.xml", device.ToString());
I can get the XML generated in format I need but the issue comes when I try to add it to the parent XMLElement device in this case. The formatting is all gone despite LoadOptions.PreserveWhitespace.
I get
<PersonLib Version="1.0">
<Details>
<Name>
<JohnDoe DOB="10" FirstName="20" LastName="40" Address="50" PhoneNum="60" City="70" State="80" Country="90" />
</Name>
</Details>
</PersonLib >
while I need
<PersonLib Version="1.0">
<Details>
<Name>
<JohnDoe DOB="10" FirstName="20" LastName="40"
Address="50"
PhoneNum="60"
City="70"
State="80"
Country="90" />
</Name>
</Details>
</PersonLib >
Not sure what am I missing.
You should take a look at this question: xdocument save preserve white space inside tags
LoadOptions.PreserveWhitespace (LoadOptions Enum)
If you preserve white space when loading, all insignificant white
space in the XML tree is materialized in the XML tree as is. If you do
not preserve white space, then all insignificant white space is
discarded.
This gives the impression that 'insignificant' whitespace between attributes would be preserved.
If however you look at XElement.Parse Method you see this:
If the source XML is indented, setting the PreserveWhitespace flag in
options causes the reader to read all white space in the source XML.
Nodes of type XText are created for both significant and insignificant
white space.
Looking at the class hierarchy you can see that XAttribute does not inherit from XNode. The long and the short of it is that whitespace between attributes is not preserved. Even if it were, you would still have to disable formatting on output (something like ToString(SaveOptions.DisableFormatting)).
I don't think that attributes were designed to be used as you have, but it is a very common usage. There is considerable diversity of opinion about this (see: Attribute vs Element)
Either way it sounds like you are stuck with both the design and format of what you were given. Unfortunately, this means you are also stuck with having to create a custom formatter to get the output you need.
Note the following code is meant only as an example of one possible way to implement code that creates the format you ask about.
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;
namespace FormatXml {
class Program {
static String OutputElement(int indentCnt, XElement ele) {
StringBuilder sb = new StringBuilder();
var indent = "".PadLeft(indentCnt, '\t');
var specialFormat = (ele.Parent == null) ? false : ((ele.Parent.Name == "Name") ? true : false);
sb.Append($"{indent}<{ele.Name}");
String FormatAttr(XAttribute attr) {
return $"{attr.Name} = '{attr.Value}'";
}
String FormatAttrByName(String name) {
var attr = ele.Attributes().Where(x => x.Name == name).FirstOrDefault();
var rv = "";
if (attr == null) {
rv = $"{name}=''";
}
else {
rv = FormatAttr(attr);
}
return rv;
}
if (specialFormat) {
var dob = FormatAttrByName("DOB");
var firstName = FormatAttrByName("FirstName");
var lastName = FormatAttrByName("LastName");
var address = FormatAttrByName("Address");
var phoneNum = FormatAttrByName("PhoneNum");
var city = FormatAttrByName("City");
var state = FormatAttrByName("State");
var country = FormatAttrByName("Country");
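// Leading space separates the element name from the first attribute.
sb.AppendLine($" {dob} {firstName} {lastName}");
var left = ele.Name.LocalName.Length + 5;
var fill = indent + "".PadLeft(left);
sb.AppendLine($"{fill}{address}");
sb.AppendLine($"{fill}{phoneNum}");
sb.AppendLine($"{fill}{city}");
sb.AppendLine($"{fill}{state}");
sb.AppendLine($"{fill}{country} />");
// The element is already closed with "/>" (it is assumed to have no children),
// so skip the generic ">" and end-tag output below.
return sb.ToString();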
}
else {
foreach (var attr in ele.Attributes()) {
sb.AppendFormat(" {0}", FormatAttr(attr));
}
}
sb.AppendLine(">");
foreach (var e in ele.Elements()) {
sb.Append(OutputElement(indentCnt + 1, e));
}
sb.AppendLine($"{indent}</{ele.Name}>");
return sb.ToString();
}
static void Main(string[] args) {
var txtEle = @"
<Details>
<Name>
<JohnDoe DOB = '10' FirstName = '20' LastName = '40'
Address = '50'
PhoneNum = '60'
City = '70'
State = '80'
Country = '90' />
</Name>
</Details>";
var plib = new XElement("PersonLib");
XDocument xdoc = new XDocument(plib);
var nameEle = XElement.Parse(txtEle, LoadOptions.PreserveWhitespace);
xdoc.Root.Add(nameEle);
var xml = OutputElement(0, (XElement)xdoc.Root);
Console.WriteLine(xml);
}
}
}
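If a layout with every attribute on its own line is acceptable, a custom formatter may not be needed. The sketch below uses the built-in XmlWriterSettings.NewLineOnAttributes option. It is not the exact layout asked for in the question (it also breaks after the first attributes and applies to every element), and the reduced attribute set is only for brevity.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Linq;

class NewLineOnAttributesSketch
{
    static void Main()
    {
        // Build a small tree similar to the one in the question.
        var root = new XElement("PersonLib", new XAttribute("Version", "1.0"),
            new XElement("Details",
                new XElement("Name",
                    new XElement("JohnDoe",
                        new XAttribute("DOB", "10"),
                        new XAttribute("FirstName", "20"),
                        new XAttribute("Address", "50"),
                        new XAttribute("Country", "90")))));

        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "\t",
            NewLineOnAttributes = true,   // each attribute is written on its own line
            OmitXmlDeclaration = true
        };

        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb, settings))
        {
            root.Save(writer);
        }

        Console.WriteLine(sb.ToString());
    }
}
```

The trade-off is control: the writer decides where the line breaks go, so grouping the first three attributes on one line still requires hand-written output such as the formatter above.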

Why is this program not accessing child nodes?

Here it gets the XML document and individual nodes, and inserts the nodes into a dictionary.
//create the xml document obj
XmlDocument inputXMLDoc = new XmlDocument();
fileref.isValid = false;
//load the xml document
#region
try
{
inputXMLDoc.XmlResolver = null;
inputXMLDoc.Load( strfile );//load the xml file
string input = inputXMLDoc.OuterXml;//get the string
Console.WriteLine( "success,loaded XML" );
logger.Log( "loaded xml:" + strfile );
fileref.importList = new Dictionary<string, XmlNode>();
nodeNames = new List<string> { "OrderId", "CustomerId", "CustomerName", "Addresses", "OrderStatus", "DateOrdered", "PaymentTime", "IncludeVAT", "OrderTotalIncVat", "OrderTotalVat", "Currency", "TypeOfSaleId" };
try
{
int i = 0;
foreach( string name in nodeNames )
{
Console.WriteLine( "Adding xml node " + name );
if( inputXMLDoc.GetElementsByTagName( name ) != null )
{
XmlNodeList xlist = inputXMLDoc.GetElementsByTagName( name );
foreach( XmlNode node in xlist )
{
fileref.importList.Add( name, node );
//add individual node within nodelist
Console.WriteLine( name );
}
} //add specified node from XML doc
else
{
nodeNames.RemoveAt( i );
}
i++;
}
}
}
Later, the nodes are accessed to save the information to a web service. However, nodes with child nodes within are not showing up this way.
Invoices.Address address = new Invoices.Address();
XmlNodeList oNodeList = fileref.importList["Addresses"].SelectNodes("/Delivery/Street");
foreach (XmlNode xn in oNodeList)
{
address.Street = xn.InnerText;
}
Sample XML document
<?xml version="1.0" encoding="utf-8"?>
<InvoiceOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<OrderId xmlns="http://24sevenOffice.com/webservices">35</OrderId>
<CustomerId xmlns="http://24sevenOffice.com/webservices">21</CustomerId>
<CustomerName xmlns="http://24sevenOffice.com/webservices">James Bond</CustomerName>
<Addresses xmlns="http://24sevenOffice.com/webservices">
<Delivery>
<Street>11 Shewell Walk</Street>
<State />
<PostalCode>CO1 1WG</PostalCode>
<PostalArea>Essex</PostalArea>
<Name />
<City>Colchester</City>
<Country>UK</Country>
</Delivery>
<Invoice>
<Street>10 Shewell Walk</Street>
<State />
<PostalCode>CO1 1WG</PostalCode>
<PostalArea>Essex</PostalArea>
<Name />
<City>Colchester</City>
<Country>UK</Country>
</Invoice>
</Addresses>
<OrderStatus xmlns="http://24sevenOffice.com/webservices">Offer</OrderStatus>
<DateOrdered xmlns="http://24sevenOffice.com/webservices">2015-06-15T14:00:00Z</DateOrdered>
<PaymentTime xmlns="http://24sevenOffice.com/webservices">14</PaymentTime>
<IncludeVAT xsi:nil="true" xmlns="http://24sevenOffice.com/webservices" />
<OrderTotalIncVat xmlns="http://24sevenOffice.com/webservices">480.0000</OrderTotalIncVat>
<OrderTotalVat xmlns="http://24sevenOffice.com/webservices">80.0000</OrderTotalVat>
<Currency xmlns="http://24sevenOffice.com/webservices">
<Symbol>LOCAL</Symbol>
</Currency>
<TypeOfSaleId xmlns="http://24sevenOffice.com/webservices">-100</TypeOfSaleId>
<InvoiceRows xmlns="http://24sevenOffice.com/webservices">
<InvoiceRow>
<ProductId>18</ProductId>
<RowId>4665754</RowId>
<Price>400.0000</Price>
<Name>17" Laptop Screen</Name>
<DiscountRate>0.0000</DiscountRate>
<Quantity>7.0000</Quantity>
<Cost>0.0000</Cost>
<InPrice>0.0000</InPrice>
</InvoiceRow>
</InvoiceRows>
</InvoiceOrder>
The reason your code doesn't work is likely that you're ignoring the namespace of the elements you're looking for. There are many questions covering how to do that, such as this one.
That said, XmlDocument is a creaky old API and the newer LINQ to XML is a huge improvement - I'd suggest you look into that.
I'm also not sure the dictionary is pulling its weight for such a small number of elements. You can simply query what you need straight from the XML. For example, to get all your fields as typed values:
var doc = XDocument.Load(strfile); // strfile is a file path in the question, so Load rather than Parse
var order = doc.Elements("InvoiceOrder").Single();
XNamespace ns = "http://24sevenOffice.com/webservices";
var orderId = (int)order.Element(ns + "OrderId");
var customerId = (int)order.Element(ns + "CustomerId");
var customerName = (string)order.Element(ns + "CustomerName");
var orderStatus = (string)order.Element(ns + "OrderStatus");
var dateOrdered = (DateTime)order.Element(ns + "DateOrdered");
var paymentTime = (int)order.Element(ns + "PaymentTime");
var totalIncVat = (decimal)order.Element(ns + "OrderTotalIncVat");
var totalVat = (decimal)order.Element(ns + "OrderTotalVat");
var currency = (string)order.Elements(ns + "Currency").Elements(ns + "Symbol").SingleOrDefault();
var typeOfSaleId = (int)order.Element(ns + "TypeOfSaleId");
You can use a similar technique to map your addresses to your strongly typed Address class:
var deliveryAddress = order.Elements(ns + "Addresses")
.Elements(ns + "Delivery")
.Select(e => new Invoices.Address
{
Street = (string)e.Element(ns + "Street"),
// ....
})
.Single();
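A self-contained sketch of the same approach, extended to the IncludeVAT element, which is marked xsi:nil="true" in the sample and therefore cannot be cast directly to bool. The Address class here is a hypothetical stand-in for the web service's Invoices.Address type, and the XML is a trimmed copy of the sample document.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for Invoices.Address.
class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

class InvoiceOrderSketch
{
    static void Main()
    {
        var doc = XDocument.Parse(@"<InvoiceOrder xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
  <Addresses xmlns='http://24sevenOffice.com/webservices'>
    <Delivery><Street>11 Shewell Walk</Street><City>Colchester</City></Delivery>
  </Addresses>
  <IncludeVAT xsi:nil='true' xmlns='http://24sevenOffice.com/webservices' />
</InvoiceOrder>");

        XNamespace ns = "http://24sevenOffice.com/webservices";
        XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
        var order = doc.Elements("InvoiceOrder").Single();

        // Child elements inherit the default namespace declared on <Addresses>.
        var delivery = order.Elements(ns + "Addresses")
            .Elements(ns + "Delivery")
            .Select(e => new Address
            {
                Street = (string)e.Element(ns + "Street"),
                City = (string)e.Element(ns + "City")
            })
            .Single();

        // Treat a missing element or xsi:nil="true" as "no value".
        var vatElement = order.Element(ns + "IncludeVAT");
        bool isNil = vatElement == null || (bool?)vatElement.Attribute(xsi + "nil") == true;
        bool? includeVat = isNil ? (bool?)null : (bool)vatElement;

        Console.WriteLine($"{delivery.Street}, {delivery.City}");
        Console.WriteLine(includeVat.HasValue ? includeVat.Value.ToString() : "(nil)");
    }
}
```

With this sample the first line printed is "11 Shewell Walk, Colchester" and the second is "(nil)".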
The problem you have is with namespaces. If you specify the namespace for each of those elements, the query works. I reached this conclusion through some searching and experimentation, so my explanation might not be spot on; I advise researching the issue further to understand it properly.
This code will work:
XmlNamespaceManager nsmgr = new XmlNamespaceManager(inputXMLDoc.NameTable);
nsmgr.AddNamespace("ns", "http://24sevenOffice.com/webservices");
var oNodeList = fileref.importList["Addresses"].SelectNodes("//ns:Delivery/ns:Street", nsmgr);
The reason is (I think) that your XML document specifies a default namespace for these elements (xmlns="http://24sevenOffice.com/webservices"), while your XPath does not specify that namespace. In my code I create a namespace manager containing that namespace and prefix the two element names with it, so they now match the elements in your document.
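For reference, a minimal runnable sketch of the namespace manager technique against a trimmed copy of the sample XML. It uses a path relative to the Addresses node instead of "//", which searches the whole document; the "ns" prefix is arbitrary and only has to match between AddNamespace and the XPath.

```csharp
using System;
using System.Xml;

class NamespaceManagerSketch
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(@"<InvoiceOrder>
  <Addresses xmlns='http://24sevenOffice.com/webservices'>
    <Delivery><Street>11 Shewell Walk</Street></Delivery>
    <Invoice><Street>10 Shewell Walk</Street></Invoice>
  </Addresses>
</InvoiceOrder>");

        // Bind a prefix of our choosing to the document's default namespace.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("ns", "http://24sevenOffice.com/webservices");

        // InvoiceOrder is in no namespace; Addresses and its children are in the ns namespace.
        XmlNode addresses = doc.SelectSingleNode("/InvoiceOrder/ns:Addresses", nsmgr);

        // No leading slash: the path is evaluated relative to the Addresses node.
        XmlNode street = addresses.SelectSingleNode("ns:Delivery/ns:Street", nsmgr);

        Console.WriteLine(street.InnerText); // 11 Shewell Walk
    }
}
```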

Finding a set of nodes in an XML [duplicate]

Can't get any result in feeds.
feedXML has the correct data.
XDocument feedXML = XDocument.Load(@"http://search.twitter.com/search.atom?q=twitter");
var feeds = from entry in feedXML.Descendants("entry")
select new
{
PublicationDate = entry.Element("published").Value,
Title = entry.Element("title").Value
};
What am I missing?
You need to specify the namespace:
// This is the default namespace within the feed, as specified
// xmlns="..."
XNamespace ns = "http://www.w3.org/2005/Atom";
var feeds = from entry in feedXML.Descendants(ns + "entry")
...
Namespace handling is beautifully easy in LINQ to XML compared with every other XML API I've ever used :)
You need to specify a namespace on both the Descendants and Element methods.
XDocument feedXML = XDocument.Load(@"http://search.twitter.com/search.atom?q=twitter");
XNamespace ns = "http://www.w3.org/2005/Atom";
var feeds = from entry in feedXML.Descendants(ns + "entry")
select new
{
PublicationDate = entry.Element(ns + "published").Value,
Title = entry.Element(ns + "title").Value
};
If you look at the XML returned by the HTTP request, you will see that it has an XML namespace defined:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" ...>
<id>tag:search.twitter.com,2005:search/twitter</id>
...
</feed>
XML is just like C#: if you use an element name with the wrong namespace, it is not considered to be the same element! You need to add the required namespace to your query:
public static class AtomConst
{
private static string AtomNamespace = "http://www.w3.org/2005/Atom";
public static XName Entry = XName.Get("entry", AtomNamespace);
public static XName Published = XName.Get("published", AtomNamespace);
public static XName Title = XName.Get("title", AtomNamespace);
}
var items = doc.Descendants(AtomConst.Entry)
.Select(entryElement => new FeedItemViewModel()
{
Title = entryElement.Descendants(AtomConst.Title).Single().Value,
...
});
The issue is in feedXML.Descendants("entry"), which returns 0 results.
According to the documentation, you need to pass a fully qualified XName.
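The difference between the unqualified and the qualified name can be seen in a small sketch. The inline feed below is an assumed stand-in for the Twitter search feed, so the example runs without network access.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class AtomNamespaceSketch
{
    static void Main()
    {
        // Assumed sample feed with the Atom default namespace.
        const string atom = @"<feed xmlns='http://www.w3.org/2005/Atom'>
  <entry><title>First</title><published>2011-01-01T00:00:00Z</published></entry>
  <entry><title>Second</title><published>2011-01-02T00:00:00Z</published></entry>
</feed>";

        var feedXML = XDocument.Parse(atom);
        XNamespace ns = "http://www.w3.org/2005/Atom";

        // Unqualified name: matches nothing, because every element is in the Atom namespace.
        Console.WriteLine(feedXML.Descendants("entry").Count()); // 0

        // Qualified name: matches both entries.
        var feeds = feedXML.Descendants(ns + "entry")
            .Select(entry => new
            {
                PublicationDate = (string)entry.Element(ns + "published"),
                Title = (string)entry.Element(ns + "title")
            });

        foreach (var f in feeds)
            Console.WriteLine($"{f.PublicationDate}  {f.Title}");
    }
}
```

Using the (string) cast instead of .Value means a missing child element yields null rather than throwing.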

Linq to XML dynamic XML Descendants

I'm parsing a lot of XML files using Linq to XML syntax. Everything works when I try to access top-level elements:
var indexroot = (from element in prodcutDocument.Root.Descendants("indexroot")
select new
{
model = (string)element.Element("MODEL"),
}).FirstOrDefault()
The problem occurs when I need to access lower-level children of that document. I tried:
var indexroot = (from element in prodcutDocument.Root.Descendants("indexroot")
select new
{
ddName = (string)element.Descendants("DD_NAME").Elements("name").First();
}).FirstOrDefault()
and
var indexroot = (from element in prodcutDocument.Root.Descendants("indexroot").Descendants("DD_NAME")
select new
{
ddName = (string)element.Element("name")
}).FirstOrDefault();
Sadly, none of that works, and I get the same error: "Sequence contains no elements". One more thing: sometimes the XML document contains those tags and sometimes it does not. Is something like this enough to handle that case?
var indexroot = (from element in prodcutDocument.Root.Descendants("indexroot").Descendants("DD_NAME")
select new
{
ddName = (string)element.Element("name") ?? "-"
}).FirstOrDefault();
Edit:
I don't think it is possible to paste a short version of the XML that would be simple, so here's the full version: http://pastebin.com/uDkP3rnR and here is the code example:
XDocument prodcutDocument = XDocument.Load(this.ServerPATHData + file);
var indexroot = (from element in prodcutDocument.Root.Descendants("indexroot")
select new
{
modelis = (string)element.Element("MODELIS"),
T2918_0 = (string)element.Descendants("dd_DARBINIS_GRAFIKAS_SPEC").First()
}).FirstOrDefault();
writeTxt.WriteLine("modelis: " + indexroot.modelis);
writeTxt.WriteLine("T2979_0" + indexroot.T2918_0);
In examining the sample XML that you posted on PasteBin, it appears to me that the elements that you mention appear only once. To access them, you can simply specify a path to each as follows:
XElement indexroot = document.Root.Element("indexroot");
XElement modelis = indexroot.Element("MODELIS");
XElement dd_dgs = indexroot.Element("dd_DARBINIS_GRAFIKAS_SPEC");
XElement voltageuv = dd_dgs.Element("VoltageUV");
string t2979_0 = (string)voltageuv.Element("T2979_0");
string t2861_60 = (string)voltageuv.Element("T2861_60");
string t2757_121 = (string)voltageuv.Element("T2757_121");
(Note that you may need to check for null if there is a chance that any of the elements you are trying to access may not be present. Without doing so, you'll encounter a NullReferenceException.)
Here is a snippet of the XML that you posted to give context to the above code:
<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<PDB>
<indexroot>
<ed_BENDRA_MAKS_SUV_GALIA>1.45</ed_BENDRA_MAKS_SUV_GALIA>
<ed_BENDRA_MAKS_SROVE>6.48</ed_BENDRA_MAKS_SROVE>
<TIPAS>1</TIPAS>
<MODELIS>RIS 2500 HW EC 3.0</MODELIS>
<dd_DARBINIS_GRAFIKAS_SPEC>
<VoltageUV>
<T2979_0>229,42</T2979_0>
<T2861_60>227,98</T2861_60>
<T2757_121>228,97</T2757_121>
</VoltageUV>
<CurrentIA>
<T2979_0>2,56</T2979_0>
<T2861_60>2,63</T2861_60>
<T2757_121>2,72</T2757_121>
</CurrentIA>
</dd_DARBINIS_GRAFIKAS_SPEC>
</indexroot>
</PDB>
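You can just change:
(string)element.Descendants("dd_DARBINIS_GRAFIKAS_SPEC").First()
to this, keeping the (string) cast so that ?? operates on a string (casting a null XElement to string yields null):
(string)element.Descendants("dd_DARBINIS_GRAFIKAS_SPEC").FirstOrDefault() ?? "-"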
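When any level of the path may be absent, the null checks mentioned above can be written compactly with the null-conditional operator (C# 6 or later). This is a sketch against a trimmed copy of the posted XML; NO_SUCH_ELEMENT is a made-up name used to show the fallback.

```csharp
using System;
using System.Xml.Linq;

class OptionalElementsSketch
{
    static void Main()
    {
        var doc = XDocument.Parse(@"<PDB>
  <indexroot>
    <MODELIS>RIS 2500 HW EC 3.0</MODELIS>
    <dd_DARBINIS_GRAFIKAS_SPEC>
      <VoltageUV><T2979_0>229,42</T2979_0></VoltageUV>
    </dd_DARBINIS_GRAFIKAS_SPEC>
  </indexroot>
</PDB>");

        XElement indexroot = doc.Root.Element("indexroot");

        // ?. stops at the first missing element; the (string) cast turns a null XElement into null.
        string modelis = (string)indexroot?.Element("MODELIS") ?? "-";
        string t2979 = (string)indexroot?.Element("dd_DARBINIS_GRAFIKAS_SPEC")
                                        ?.Element("VoltageUV")
                                        ?.Element("T2979_0") ?? "-";
        string missing = (string)indexroot?.Element("NO_SUCH_ELEMENT")?.Element("X") ?? "-";

        Console.WriteLine(modelis); // RIS 2500 HW EC 3.0
        Console.WriteLine(t2979);   // 229,42
        Console.WriteLine(missing); // -
    }
}
```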

XDocument Descendants and Element always return null values

Hey all, I have looked thoroughly through all the questions about XDocument, and while they all give an answer to what I'm looking for (mostly namespace issues), it seems it just won't work for me.
The problem I'm having is that I'm unable to select any value, be it an attribute or element.
Using this XML
I'm trying to retrieve the speaker's fullname.
public void GetEvent()
{
var xdocument = XDocument.Load(@"Shared\techdays2013.xml");
XNamespace xmlns = "http://www.w3.org/2001/XMLSchema-instance";
var data = from c in xdocument.Descendants(xmlns + "speaker")
select c.Element(xmlns + "fullname").Value;
}
You can omit the namespace declaration in your linq statement.
public void GetEvent()
{
var xdocument = XDocument.Load(@"Shared\techdays2013.xml");
//XNamespace xmlns = "http://www.w3.org/2001/XMLSchema-instance";
var data = from c in xdocument.Descendants("speaker")
select c.Element("fullname").Value;
}
You can omit WebClient because you have direct local access to a file. I'm just showing a way to process your file on my machine.
void Main()
{
string p = @"http://events.feed.comportal.be/agenda.aspx?event=TechDays&year=2013&speakerlist=c%7CExperts";
using (var client = new WebClient())
{
string str = client.DownloadString(p);
var xml = XDocument.Parse(str);
var result = xml.Descendants("speaker")
.Select(speaker => GetNameOrDefault(speaker));
//LinqPad specific call
result.Dump();
}
}
public static string GetNameOrDefault(XElement element)
{
var name = element.Element("fullname");
return name != null ? name.Value : "no name";
}
prints:
Bart De Smet
Daniel Pearson
Scott Schnoll
Ilse Van Criekinge
John Craddock
Corey Hynes
Bryon Surace
Jeff Prosise
1) You have to drop the namespace
2) You'll have to query more precisely. All your <speaker> elements inside <speakers> have a fullname but in the next section I spotted <speaker id="94" />
A simple fix (maybe not the best) :
//untested
var data = from c in xdocument.Root.Descendants("speakers").Descendants("speaker")
select c.Element("fullname").Value;
You may want to specify the path more precisely:
xdocument.Element("details").Element("tracks").Element("speakers").
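Another option is to keep the broad Descendants query and simply skip speaker elements that have no fullname child. The document below is an assumed, simplified shape (a speakers list plus a bare speaker reference elsewhere), not the real feed.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class SpeakerNamesSketch
{
    static void Main()
    {
        // Assumed shape: full speaker entries under <speakers>, bare references under <sessions>.
        var doc = XDocument.Parse(@"<details>
  <speakers>
    <speaker id='1'><fullname>Bart De Smet</fullname></speaker>
    <speaker id='94'><fullname>Jeff Prosise</fullname></speaker>
  </speakers>
  <sessions>
    <session><speaker id='94' /></session>
  </sessions>
</details>");

        // The (string) cast returns null for a missing <fullname>, which the Where clause filters out.
        var names = doc.Descendants("speaker")
            .Select(s => (string)s.Element("fullname"))
            .Where(n => n != null);

        foreach (var n in names)
            Console.WriteLine(n);
    }
}
```

With this sample the output is the two names from the speakers list; the bare reference is skipped instead of causing a NullReferenceException.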
