I have a given XML file that I need to process. For the sake of argument, let's say I've already loaded it in as a string.
<?xml version="1.0" encoding="UTF-8" ?>
<GROUP ID="_group_id" ORDERINFO="00000" TITLE="Group 1">
<GROUP ID="_group_id_2" TITLE="Group 2">
<LO ID="_id_code1" LANG="enUS" TYPE="_cust" TITLE="Title 1" />
<LO ID="_id_code2" LANG="enUS" TYPE="_cust" TITLE="Title 2" />
</GROUP>
<GROUP ID="_group_id_3" TITLE="Group 3">
<LO ID="_id_code1" LANG="enUS" TYPE="_cust" TITLE="Title 1" />
<LO ID="_id_code2" LANG="enUS" TYPE="_cust" TITLE="Title 2" />
</GROUP>
</GROUP>
There can be many LOs and many GROUPs in a given XML file. I've been trying various methods with no luck. I need something that will find the matching LO by ID to a given string and then allow me to retrieve the corresponding TYPE and TITLE into strings so that I may use them for processing.
I tried reading the file into an XmlDocument but once loaded I could not figure out how to find the appropriate elements.
You can use XmlDocument or XDocument to parse the Xml.
Here is an example with XDocument:
Data class:
public class Lo
{
public string Id { get; set; }
public string Lang { get; set; }
public string Type { get; set; }
public string Title { get; set; }
}
Code:
var document = XDocument.Parse(data);
var value = "_id_code1";
IEnumerable<Lo> result =
document.XPathSelectElements(".//LO")
.Where(x => x.Attribute("ID").Value == value)
.Select(x =>
new Lo
{
Id = x.Attribute("ID").Value,
Lang = x.Attribute("LANG").Value,
Type = x.Attribute("TYPE").Value,
Title = x.Attribute("TITLE").Value
});
When loaded into an XmlDocument, you can use XPath to locate nodes.
E.g:
XmlNode group = xmlDocument.SelectSingleNode("/GROUP/GROUP[@ID='_group_id_2']");
Or:
XmlNodeList groups = xmlDocument.SelectNodes("/GROUP/GROUP");
foreach(XmlNode group in groups)
{
string id = group.Attributes["ID"].Value;
}
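Applied to the original question (find an LO by ID, then read TYPE and TITLE), the same XPath approach might look like the minimal sketch below. The class and method names (LoLookup, FindLo) are made up for illustration, and the id is concatenated straight into the XPath, so the sketch assumes the id contains no single quote.

```csharp
using System;
using System.Xml;

static class LoLookup
{
    // Returns the first LO element, at any depth, whose ID attribute equals id (or null).
    // The id is concatenated into the XPath, so this assumes it contains no single quote.
    public static XmlNode FindLo(XmlDocument doc, string id)
    {
        return doc.SelectSingleNode("//LO[@ID='" + id + "']");
    }

    static void Main()
    {
        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(
            "<GROUP ID='_group_id' TITLE='Group 1'>" +
            "<GROUP ID='_group_id_2' TITLE='Group 2'>" +
            "<LO ID='_id_code1' LANG='enUS' TYPE='_cust' TITLE='Title 1' />" +
            "</GROUP></GROUP>");

        XmlNode lo = FindLo(xmlDocument, "_id_code1");
        if (lo != null)
        {
            // The indexer returns null for a missing attribute, hence the ?. operator.
            string type = lo.Attributes["TYPE"]?.Value;
            string title = lo.Attributes["TITLE"]?.Value;
            Console.WriteLine(type + ", " + title);
        }
    }
}
```

SelectSingleNode returns only the first match; SelectNodes with the same expression would return every LO with that ID across all groups.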
It is very easy. For a more complete walk through you should search the internet.
See the documentation:
Overview of XML in the .NET Framework.
XML Processing Options in the .NET Framework
It's better to cast the XAttribute to string than to access its Value property: if an attribute is not found, you get null instead of an exception. Query syntax is also more compact here.
string id = "_id_code1";
XDocument xdoc = XDocument.Parse(xml);
var query = from lo in xdoc.Descendants("LO")
where (string)lo.Attribute("ID") == id
select new {
Id = (string)lo.Attribute("ID"),
Language = (string)lo.Attribute("LANG"),
Type = (string)lo.Attribute("TYPE"),
Title = (string)lo.Attribute("TITLE")
};
This query returns a sequence of anonymous objects with the properties Id, Language, Type and Title. You can enumerate them with foreach.
I wrote a small test application for this, with your XML included as a string.
var xmlMessage = @"keep your xml here, I removed due to formatting";
var matchedElements = XDocument.Parse(xmlMessage).Descendants().Where(el => el.Name == "LO" && el.Attribute("ID").Value == "_id_code1");
foreach (var el in matchedElements)
{
Console.WriteLine("ElementName : {0}\nID = {1}\nLANG = {2}\nTYPE = {3}\nTITLE = {4}\n"
, el.Name.LocalName, el.Attribute("ID").Value, el.Attribute("LANG").Value, el.Attribute("TYPE").Value, el.Attribute("TITLE").Value);
}
This fetches all LO elements with the ID "_id_code1", irrespective of the GROUP element they are in.
If you need to consider the group, replace the second line code with this:
var matchedElements = XDocument.Parse(xmlMessage).Descendants().Where(el => el.Parent != null && el.Parent.Attribute("ID").Value == "_group_id_2" && el.Name == "LO" && el.Attribute("ID").Value == "_id_code1");
Here, I'm checking for the "_group_id_2", you can replace with your group id.
The required namespaces:
using System.Linq;
using System.Xml;
using System.Xml.Linq;
My XML File is structured as follows:
<File>
<Setting1></Setting1>
<Setting2></Setting2>
<Options>
<Option>
<NameStartsWith>Br</NameStartsWith>
<Data>1234</Data>
</Option>
<Option>
<NameStartsWith>Ch</NameStartsWith>
<Data>4567</Data>
</Option>
</Options>
</File>
What I would like to do is use LINQ for something like the below..
String Name = "Brian";
if(Name.StartsWith(LINQ.Any.NameStartsWith)))
{
Console.WriteLine("The Answer is: " 1234);
}
At present I perform the above by looping through the <Option> fields with foreach (XElement xe in Tests). But the real XML file is a lot more detailed than this and the loops are getting unmanageable. I would ideally like to use LINQ to search all fields at once and make it a simple if or statement.
Use LINQ to Xml
string name = "Brian";
XDocument doc = XDocument.Load(yourXmlFile);
var matches = doc.Root
.Descendants("Option")
.Where(option => name.StartsWith(option.Element("NameStartsWith").Value))
.Select(option => option.Element("Data").Value);
foreach(var data in matches)
{
Console.WriteLine($"The Answer is: {data}");
}
The XContainer.Descendants(XName) method returns all elements with the given name from every level of the hierarchy below the current element.
If the NameStartsWith element is optional inside Option, add a null check to the chain of LINQ methods. XElement.Element(XName name) returns null if no such element exists.
var matches = doc.Root
.Descendants("Option")
.Where(option => option.Element("NameStartsWith") != null)
.Where(option => name.StartsWith(option.Element("NameStartsWith").Value))
.Select(option => option.Element("Data").Value);
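As an alternative to the separate null check, the element can be cast to string, which yields null for a missing element instead of throwing. The sketch below is only an illustration: OptionMatcher and FindData are invented names, and skipping empty prefixes is an extra assumption (an empty NameStartsWith would otherwise match every name).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class OptionMatcher
{
    // Returns the Data of the first Option whose non-empty NameStartsWith
    // is a prefix of name, or null when nothing matches.
    public static string FindData(XContainer root, string name)
    {
        return root.Descendants("Option")
            .Where(o => (string)o.Element("NameStartsWith") is string prefix
                        && prefix.Length > 0
                        && name.StartsWith(prefix, StringComparison.Ordinal))
            .Select(o => (string)o.Element("Data"))
            .FirstOrDefault();
    }

    static void Main()
    {
        var doc = XDocument.Parse(
            "<File><Options>" +
            "<Option><NameStartsWith>Br</NameStartsWith><Data>1234</Data></Option>" +
            "<Option><Data>9999</Data></Option>" + // no NameStartsWith: skipped, no exception
            "</Options></File>");

        string data = FindData(doc, "Brian");
        if (data != null)
        {
            Console.WriteLine($"The Answer is: {data}");
        }
    }
}
```

FirstOrDefault turns the query into the single if-style check the question asked for; drop it to get every matching Data value.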
If an Option element contains more than one other element that needs to be selected, create a class that represents all the needed data and fill it inside the Select method.
public class Option
{
public string NameStartsWith {get; set; }
public string Data {get; set; }
public string ElementOne {get; set; }
public string ElementTwo {get; set; }
}
var matches = doc.Root
.Descendants("Option")
.Where(option => option.Element("NameStartsWith") != null)
.Where(option => name.StartsWith(option.Element("NameStartsWith").Value))
.Select(option => new Option
{
NameStartsWith = option.Element("NameStartsWith").Value,
Data = option.Element("Data").Value,
ElementOne = option.Element("ElementOne").Value,
ElementTwo = option.Element("ElementTwo").Value,
});
Of course, you can use an anonymous type instead of a declared class.
XPATH + Linq2Xml is also possible
string Name = "Brian";
var xDoc = XDocument.Parse(xmlstring); //or XDocument.Load(filename)
var matches = xDoc
.XPathSelectElements($"//Option/NameStartsWith[starts-with('{Name}', text())]");
I am trying to figure out the code to extract XML child elements (I think this is worded correctly). I have searched and tried many samples but cannot find how to drill down to pick out the section I want and return the information I need. Maybe all I need is someone to define the data I am trying to pull so I can read up on the issue; of course, any code would be very helpful and I will figure it out from there. Thanks in advance for any help!
Here is the xml file. I am trying to run an if statement to find the section named <STATISTICTYPE>PVCAP_CharactersSaved</STATISTICTYPE> and return the <JOBNAME>,<TIMEDELTA>,<VALUESUM>.
<?xml version="1.0" encoding="utf-8"?>
<PVCAPTURESTATISTICCONTAINTER xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<PVCAPTUREJOBSTATISTICS>
<PVCAPTURESTATISTICSUMMARY>
<STATISTICTYPE>PVCAP_CharactersSaved</STATISTICTYPE>
<STATISTICNAME>Characters saved</STATISTICNAME>
<JOBID>24</JOBID>
<JOBNAME>HEAT FILES</JOBNAME>
<TIMEDELTA>422</TIMEDELTA>
<VALUESUM>25432</VALUESUM>
</PVCAPTURESTATISTICSUMMARY>
<PVCAPTURESTATISTICSUMMARY>
<STATISTICTYPE>PVCAP_CharactersSaved_NoMM</STATISTICTYPE>
<STATISTICNAME>Characters saved (no match and merge)</STATISTICNAME>
<JOBID>24</JOBID>
<JOBNAME>HEAT FILES</JOBNAME>
<TIMEDELTA>422</TIMEDELTA>
<VALUESUM>25432</VALUESUM>
</PVCAPTURESTATISTICSUMMARY>
</PVCAPTUREJOBSTATISTICS>
<DOCUMENTCOUNT>762</DOCUMENTCOUNT>
<PAGECOUNT>3194</PAGECOUNT>
<IMAGECOUNT>3194</IMAGECOUNT>
<VERSION>2.0</VERSION>
</PVCAPTURESTATISTICCONTAINTER>
You can use LINQ to XML, particularly the XElement class.
var element = XElement.Parse(xmlStr).Element("PVCAPTUREJOBSTATISTICS")
.Elements("PVCAPTURESTATISTICSUMMARY")
.First(c => c.Element("STATISTICTYPE").Value == "PVCAP_CharactersSaved");
var jobName = element.Element("JOBNAME").Value;
var timeDelta = element.Element("TIMEDELTA").Value;
var valueSum = element.Element("VALUESUM").Value;
You'll want to add in some error handling and whatnot here, but this should get you going in the right direction.
You can do something like this:
XElement res = XElement.Parse(xmlResult);
foreach(var elem in res.Element("PVCAPTUREJOBSTATISTICS").Elements("PVCAPTURESTATISTICSUMMARY"))
{
if (elem.Element("STATISTICTYPE").Value.Equals("PVCAP_CharactersSaved", StringComparison.Ordinal))
{
string jobName = elem.Element("JOBNAME").Value;
string timeDelta = elem.Element("TIMEDELTA").Value;
string valueSum = elem.Element("VALUESUM").Value;
}
}
You can use XDocument and LINQ-to-XML to do that quite easily, for example :
string xml = "your xml content here";
XDocument doc = XDocument.Parse(xml);
//or if you have the xml file instead :
//XDocument doc = XDocument.Load("path_to_xml_file.xml");
var result = doc.Descendants("PVCAPTURESTATISTICSUMMARY")
.Where(o => (string) o.Element("STATISTICTYPE") == "PVCAP_CharactersSaved")
.Select(o => new
{
jobname = (string) o.Element("JOBNAME"),
timedelta = (string) o.Element("TIMEDELTA"),
valuesum = (string) o.Element("VALUESUM")
});
foreach (var r in result)
{
Console.WriteLine(r);
}
I have a .gpx XML file with the following sample:
<trk>
<name>Test</name>
<trkseg>
<trkpt lon="-84.89032996818423" lat="32.75810896418989">
<ele>225.0</ele>
<time>2011-04-02T11:57:48.000Z</time>
<extensions>
<gpxtpx:TrackPointExtension>
<gpxtpx:cad>0</gpxtpx:cad>
</gpxtpx:TrackPointExtension>
</extensions>
</trkpt>
</trkseg>
</trk>
I'm using Linq to XML to parse this but I'm having a difficult time parsing the extensions section. Here's the code I'm using:
var gpxDoc = LoadFromStream(document);
var gpx = GetGpxNameSpace();
var gpxtpx = XNamespace.Get("gpxtpx");
var tracks = from track in gpxDoc.Descendants(gpx + "trk")
select new
{
Name = DefaultStringValue(track, gpx, "name"),
Description = DefaultStringValue(track, gpx, "desc"),
Segments = (from trkSegment in track.Descendants(gpx + "trkseg")
select new
{
TrackSegment = trkSegment,
Points = (from trackpoint in trkSegment.Descendants(gpx + "trkpt")
select new
{
Lat = Double(trackpoint.Attribute("lat").Value),
Lng = Double(trackpoint.Attribute("lon").Value),
Ele = DefaultDoubleValue(trackpoint, gpx, "ele"),
Time = DefaultDateTimeValue(trackpoint, gpx, "time"),
Extensions = (
from ext in trackpoint.Descendants(gpx + "extensions").Descendants(gpxtpx + "TrackPointExtension")
select new
{
Cad = DefaultIntValue(ext, gpxtpx, "cad")
}).SingleOrDefault()
})
})
};
Here's the relevant helper code:
private static double? DefaultIntValue(XContainer element, XNamespace ns, string elementName)
{
var xElement = element.Element(ns + elementName);
return xElement != null ? Convert.ToInt32(xElement.Value) : (int?)null;
}
private XNamespace GetGpxNameSpace()
{
var gpx = XNamespace.Get("http://www.topografix.com/GPX/1/1");
return gpx;
}
The actual error I'm getting is
The following error occurred: Object reference not set to an instance of an object.
and it bombs on this code:
Extensions = (from ext in trackpoint.Descendants(gpx + "extensions").Descendants(gpxtpx + "TrackPointExtension")
select new
{
Cad = DefaultIntValue(ext, gpxtpx, "cad")
}).SingleOrDefault();
I just don't know how to fix it.
Since you never declare the namespace (xmlns:gpxtpx="http://www.topografix.com/GPX/1/1") it is never going to match. The xml fragment you provided is not well formed due to the lack of the namespace.
If the fragment posted is snipped from a larger document, consider switching to XML APIs rather than string manipulation. If that is the entirety of the XML you receive from an outside system, wrap it in a root node on which you can declare the namespace:
<root xmlns:gpxtpx="http://www.topografix.com/GPX/1/1">
<!-- put your xml fragment here -->
</root>
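Once the prefix is declared, the query side has to use the same namespace URI: XNamespace.Get expects the URI that the prefix is bound to, not the prefix text "gpxtpx". The sketch below is a minimal illustration under that assumption. CadenceReader and ReadCadences are invented names, and the URI "urn:example:track-point-extension" is a placeholder; use whatever value xmlns:gpxtpx has in your file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class CadenceReader
{
    // extNs must be the namespace URI bound to the gpxtpx prefix, not the prefix text itself.
    public static List<int?> ReadCadences(XContainer root, XNamespace extNs)
    {
        return root.Descendants(extNs + "TrackPointExtension")
            // The cast yields null when the cad element is missing.
            .Select(ext => (int?)ext.Element(extNs + "cad"))
            .ToList();
    }

    static void Main()
    {
        // Placeholder URI for illustration only; use the value of xmlns:gpxtpx from your file.
        XNamespace gpxtpx = "urn:example:track-point-extension";
        var doc = XDocument.Parse(
            "<gpx xmlns='http://www.topografix.com/GPX/1/1' " +
            "xmlns:gpxtpx='urn:example:track-point-extension'>" +
            "<trk><trkseg><trkpt lon='-84.89' lat='32.75'><extensions>" +
            "<gpxtpx:TrackPointExtension><gpxtpx:cad>0</gpxtpx:cad></gpxtpx:TrackPointExtension>" +
            "</extensions></trkpt></trkseg></trk></gpx>");

        foreach (int? cad in ReadCadences(doc, gpxtpx))
        {
            Console.WriteLine(cad);
        }
    }
}
```

The prefix itself never appears in the query; only the URI matters, so the document could use any prefix for that namespace.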
I've already read some posts and articles on how to deserialize xml but still haven't figured out the way I should write the code to match my needs, so.. I'm apologizing for another question about deserializing xml ))
I have a large (50 MB) xml file which I need to deserialize. I use xsd.exe to get the xsd schema of the document and then autogenerate a C# classes file which I put into my project. I want to get some (not all) data from this xml file and put it into my sql database.
Here is the hierarchy of the file (simplified, xsd is very large):
public class yml_catalog
{
public yml_catalogShop[] shop { /*realization*/ }
}
public class yml_catalogShop
{
public yml_catalogShopOffersOffer[][] offers { /*realization*/ }
}
public class yml_catalogShopOffersOffer
{
// here goes all the data (properties) I want to obtain ))
}
And here is my code:
first approach:
yml_catalogShopOffersOffer catalog;
var serializer = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
var reader = new StreamReader(@"C:\div_kid.xml");
catalog = (yml_catalogShopOffersOffer) serializer.Deserialize(reader); // exception occurs
reader.Close();
I get InvalidOperationException: There is an error in XML document (3, 2).
second approach:
XmlSerializer ser = new XmlSerializer(typeof(yml_catalogShopOffersOffer));
yml_catalogShopOffersOffer result;
using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
{
result = (yml_catalogShopOffersOffer)ser.Deserialize(reader); // exception occurs
}
InvalidOperationException: There is an error in XML document (0, 0).
third: I tried to deserialize the entire file:
XmlSerializer ser = new XmlSerializer(typeof(yml_catalog)); // exception occurs
yml_catalog result;
using (XmlReader reader = XmlReader.Create(@"C:\div_kid.xml"))
{
result = (yml_catalog)ser.Deserialize(reader);
}
And I get the following:
error CS0030: Cannot convert type "yml_catalogShopOffersOffer[]" to "yml_catalogShopOffersOffer".
error CS0029: Cannot implicitly convert type "yml_catalogShopOffersOffer" to "yml_catalogShopOffersOffer[]".
So, how to fix (or overwrite) the code to not get the exceptions?
edits: Also when I write:
XDocument doc = XDocument.Parse(@"C:\div_kid.xml");
An XmlException occurs: Data at the root level is invalid. Line 1, position 1.
Here is the first line of the xml file:
<?xml version="1.0" encoding="windows-1251"?>
edits 2:
The xml file short example:
<?xml version="1.0" encoding="windows-1251"?>
<!DOCTYPE yml_catalog SYSTEM "shops.dtd">
<yml_catalog date="2012-11-01 23:29">
<shop>
<name>OZON.ru</name>
<company>?????? "???????????????? ??????????????"</company>
<url>http://www.ozon.ru/</url>
<currencies>
<currency id="RUR" rate="1" />
</currencies>
<categories>
<category id="1126233">base category</category>
<category id="1127479" parentId="1126233">bla bla bla</category>
// here goes all the categories
</categories>
<offers>
<offer>
<price></price>
<picture></picture>
</offer>
// other offers
</offers>
</shop>
</yml_catalog>
P.S.
I've already accepted the answer (it's perfect). But now I need to find the "base category" for each offer using categoryId. The data is hierarchical and the base category is the category that has no "parentId" attribute. So, I wrote a recursive method to find the "base category", but it never finishes. Seems like the algorithm is not very fast))
Here is my code: (in the main() method)
var doc = XDocument.Load(@"C:\div_kid.xml");
var offers = doc.Descendants("shop").Elements("offers").Elements("offer");
foreach (var offer in offers.Take(2))
{
var category = GetCategory(categoryId, doc);
// here goes other code
}
Helper method:
public static string GetCategory(int categoryId, XDocument document)
{
var tempId = categoryId;
var categories = document.Descendants("shop").Elements("categories").Elements("category");
foreach (var category in categories)
{
if (category.Attribute("id").ToString() == categoryId.ToString())
{
if (category.Attributes().Count() == 1)
{
return category.ToString();
}
tempId = Convert.ToInt32(category.Attribute("parentId"));
}
}
return GetCategory(tempId, document);
}
Can I use recursion in such situation? If not, how else can I find the "base category"?
Give LINQ to XML a try. XElement result = XElement.Load(@"C:\div_kid.xml");
Querying in LINQ is brilliant but admittedly a little weird at the start. You select nodes from the Document in a SQL like syntax, or using lambda expressions. Then create anonymous objects (or use existing classes) containing the data you are interested in.
Best is to see it in action.
miscellaneous examples of LINQ to XML
simple sample using xquery and lambdas
sample denoting namespaces
There is much more on MSDN. Search for LINQ to XML.
Based on your sample XML and code, here's a specific example:
var element = XElement.Load(@"C:\div_kid.xml");
var shopsQuery =
from shop in element.Descendants("shop")
select new
{
Name = (string) shop.Descendants("name").FirstOrDefault(),
Company = (string) shop.Descendants("company").FirstOrDefault(),
Categories =
from category in shop.Descendants("category")
select new {
Id = category.Attribute("id").Value,
Parent = (string) category.Attribute("parentId"), // null for a base category, which has no parentId
Name = category.Value
},
Offers =
from offer in shop.Descendants("offer")
select new {
Price = (string) offer.Descendants("price").FirstOrDefault(),
Picture = (string) offer.Descendants("picture").FirstOrDefault()
}
};
foreach (var shop in shopsQuery){
Console.WriteLine(shop.Name);
Console.WriteLine(shop.Company);
foreach (var category in shop.Categories)
{
Console.WriteLine(category.Name);
Console.WriteLine(category.Id);
}
foreach (var offer in shop.Offers)
{
Console.WriteLine(offer.Price);
Console.WriteLine(offer.Picture);
}
}
As an extra: Here's how to deserialize the tree of categories from the flat category elements.
You need a proper class to house them, for the list of Children must have a type:
class Category
{
public int Id { get; set; }
public int? ParentId { get; set; }
public List<Category> Children { get; set; }
public IEnumerable<Category> Descendants {
get
{
return (from child in Children
select child.Descendants).SelectMany(x => x).
Concat(new Category[] { this });
}
}
}
To create a list containing all distinct categories in the document:
var categories = (from category in element.Descendants("category")
orderby int.Parse( category.Attribute("id").Value )
select new Category()
{
Id = int.Parse(category.Attribute("id").Value),
ParentId = category.Attribute("parentId") == null ?
null as int? : int.Parse(category.Attribute("parentId").Value),
Children = new List<Category>()
}).Distinct().ToList();
Then organize them into a tree (Heavily borrowed from flat list to hierarchy):
var lookup = categories.ToLookup(cat => cat.ParentId);
foreach (var category in categories)
{
category.Children = lookup[category.Id].ToList();
}
var rootCategories = lookup[null].ToList();
To find the root which contains theCategory:
var root = (from cat in rootCategories
where cat.Descendants.Contains(theCategory)
select cat).FirstOrDefault();
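If only the base category id is needed, a lighter alternative to building the whole tree is to follow the parentId links upward through a dictionary. The sketch below assumes ids are compared as strings and uses invented names (BaseCategoryFinder, BuildParentMap, FindBaseId). One likely reason the recursive version in the question never finishes is that XAttribute.ToString() returns the whole attribute, such as id="1126233", rather than its value, so the comparison never matches and the method keeps calling itself with the same id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class BaseCategoryFinder
{
    // Maps each category id to its parentId (null when the attribute is absent).
    public static Dictionary<string, string> BuildParentMap(XContainer root)
    {
        return root.Descendants("category")
            .Where(c => c.Attribute("id") != null)
            .GroupBy(c => (string)c.Attribute("id"))
            .ToDictionary(g => g.Key, g => (string)g.First().Attribute("parentId"));
    }

    // Follows parent links until a category without a parent (or an unknown id) is reached.
    // The visited set stops the loop if the data contains a cycle.
    public static string FindBaseId(string id, Dictionary<string, string> parents)
    {
        var visited = new HashSet<string>();
        while (visited.Add(id)
               && parents.TryGetValue(id, out string parent)
               && parent != null)
        {
            id = parent;
        }
        return id;
    }

    static void Main()
    {
        var doc = XDocument.Parse(
            "<categories>" +
            "<category id='1126233'>base category</category>" +
            "<category id='1127479' parentId='1126233'>bla bla bla</category>" +
            "</categories>");

        var parents = BuildParentMap(doc);
        Console.WriteLine(FindBaseId("1127479", parents));
    }
}
```

The map is built once, so each lookup costs only as many steps as the category is deep, instead of rescanning every category element on each call.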
I'm trying to deserialize an XML document, one of its nodes can be represented like this :
<n1 zone="00000" id="0000" />
or this :
<n2 zone="00000" id="0000" />
or this :
<n3 zone="00000" id="0000" />
In my document I will always have one "n1" node or one "n2" node or one "n3" node. I'd like to deserialize all these fragments into an instance of this class :
[Serializable]
public class N
{
[XmlAttribute("zone")]
public string Zone { get; set; }
[XmlAttribute("id")]
public string Id { get; set; }
}
But I didn't manage to do that. The documentation suggests to use the XmlChoiceIdentifier attribute in order to accomplish this, but maybe I used it in a wrong way.
Any idea ?
PS : I know I can create three classes : N1, N2 and N3, and map them to my different types of XML fragments. But I'd prefer a cleaner solution.
Assuming you have something like:
<?xml version="1.0"?>
<ns>
<n1 id="0000" zone="00000"/>
<n2 id="0000" zone="00000"/>
<n3 id="0000" zone="00000"/>
</ns>
You could use LINQ to XML:
XDocument document = XDocument.Load(path);
XElement parentNode = document.Element("ns");
var childNodes = parentNode.Elements().Where(x => x.Name.ToString().StartsWith("n")); // skips elements whose name does not start with "n"
List<N> list = new List<N>();
foreach (XElement element in childNodes) {
N n = new N(){
Id = element.Attribute("id").Value,
Zone = element.Attribute("zone").Value
};
list.Add(n);
}
Here is a fairly understandable read on how to use XmlChoiceIdentifier:
http://msdn.microsoft.com/en-us/magazine/cc164135.aspx
If you are really having trouble with this, then you could always use an XSL transform to do the mapping first.
You could use LINQ-To-Xml
XDocument x = XDocument.Parse(
"<root><n1 zone=\"0000\" id=\"0000\"/><n2 zone=\"0001\" id=\"0011\"/><n3 zone=\"0002\" id=\"0022\"/></root>");
var result = from c in x.Element("root").Descendants()
select new N { Zone = c.Attribute("zone").Value,
Id = c.Attribute("id").Value };
It's not using a serializer, which I think you are aiming for but it's a way forward.