XDocument Descendant Selector using Wildcard? - c#

I have some XML structured like this:
<form>
<section-1>
<item-1>
<value />
</item-1>
<item-2>
<value />
</item-2>
</section-1>
<section-2>
<item-3>
<value />
</item-3>
<item-4>
<value />
</item-4>
</section-2>
</form>
...and want to turn it into something sane like this:
<form>
<items>
<item id="1">
<value/>
</item>
<item id="2">
<value/>
</item>
<item id="3">
<value/>
</item>
<item id="4">
<value/>
</item>
</items>
</form>
I am struggling to turn the old XML into an array or object of values. Once in the new format I'd be able to do the following:
XDocument foo = XDocument.Load("form.xml");
var items = foo.Descendants("item")
.Select(i => new Item
{
value = i.Element("value").Value
});
...but with the XML in its current mess, can I wildcard the Descendants selector?
var items = foo.Descendants("item"*)
...or something? I tried to follow this question's answer but failed to adapt it to my purpose.

Ah-ha! It did click in the end. If I leave the Descendants selector blank and add a Where clause along the lines of what's in that question's answer:
.Where(d => d.Name.ToString().StartsWith("item-"))
Then we get:
XDocument foo = XDocument.Load("form.xml");
var items = foo.Descendants()
.Where(d => d.Name.ToString().StartsWith("item-"))
.Select(i => new Item
{
value = i.Element("value").Value
});
...and I'm now able to iterate through those values while outputting the new XML format. Happiness.
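For the conversion itself, here is a minimal sketch of how the old layout could be rebuilt into the new one. The class and method names (FormFlattener, Flatten) are made up for illustration, and it assumes every element to keep has a name starting with "item-" and that the id is whatever follows that prefix.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FormFlattener
{
    // Hypothetical helper: returns a new <form> whose <items> child holds
    // one <item id="N"> per <item-N> element found anywhere in the input.
    public static XElement Flatten(XElement form)
    {
        const string prefix = "item-";

        var items = form.Descendants()
            .Where(e => e.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal))
            .Select(e => new XElement("item",
                // The id is the part of the element name after "item-".
                new XAttribute("id", e.Name.LocalName.Substring(prefix.Length)),
                // Children that already have a parent are copied when added,
                // so the source tree is left unchanged.
                e.Elements()));

        return new XElement("form", new XElement("items", items));
    }
}
```

Calling FormFlattener.Flatten(foo.Root) on the loaded document should give a tree that the plain Descendants("item") query can read.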

Related

How to query xml of this structure?

<root>
<level1>
<item id="1" date="" name="" />
<item id="2" date="" name="" />
<item id="3" date="" name="" />
<item id="4" date="" name="" />
<item id="5" date="" name="" />
</level1>
</root>
I have an xml structure like the one above.
I used
XmlNodeList xnList = xmlDoc.SelectNodes("/level1");
If I used xmlnodelist as above, how can I specifically only get the element with id="3"?
or more useful if I could store all elements inside as elements in xnlist?
XmlNodeList xnList = xmlDoc.SelectNodes("//level1/item[@id='3']");
and if you want to use Linq To Xml
var xDoc = XDocument.Parse(xmlstring); // XDocument.Load(filename)
var items = xDoc.Descendants("level1")
.First()
.Elements("item")
.Select(item => new {
ID = item.Attribute("id").Value,
Name = item.Attribute("name").Value
})
.ToList();
You can even combine XPath and Linq2Xml
var item2 = xDoc.XPathSelectElements("//level1/item")
.Select(item => new {
ID = item.Attribute("id").Value,
Name = item.Attribute("name").Value
})
.ToList();
Besides the great answer from @L.B, I also use LINQ; personally I think it is a lot more readable:
xDoc.Root.Element("level1")
.Descendants("item")
.Where(x => x.Attribute("id").Value == "3").First();
but it all depends on your style ;)
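If only a single element is wanted, XPathSelectElement (an extension method from System.Xml.XPath) returns the first match or null. The sketch below is only an illustration: the ItemLookup class and FindById method are invented names, and it assumes the item elements are self-closed so that the document parses, and that the id value contains no quote characters.

```csharp
using System.Xml.Linq;
using System.Xml.XPath;

public static class ItemLookup
{
    // Hypothetical helper: finds the <item> under /root/level1 with the given id.
    // Assumes 'id' contains no single quotes, since it is spliced into the XPath.
    public static XElement FindById(XDocument doc, string id)
    {
        // '@id' selects the attribute; the result is null when nothing matches.
        return doc.XPathSelectElement("/root/level1/item[@id='" + id + "']");
    }
}
```

Because the result can be null, callers would check it before reading attributes.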

How do I access this XML structure using Linq in C#?

I'm pretty new to this stuff and having a hard time figuring out how to properly access my data. What I have is an XML tree in this form:
<bpm:ResponseData
xmlns:bpm="http://rest.bpm.ibm.com/v1/data">
<status>200</status>
<data
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:srch="http://rest.bpm.ibm.com/v1/data/search"
xsi:type="srch:SearchDetails">
<data>
<item key="assignedToUser"/>
<item key="bpdName">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
Some process name
</value>
</item>
<item key="instanceDueDate">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
2011-09-06T12:35:48Z
</value>
</item>
<item key="taskId">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:decimal">
218
</value>
</item>
<item key="taskSubject">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
Task: Some process related task
</value>
</item>
</data>
<data>
<item key="bpdName">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
Another process name
</value>
</item>
<item key="instanceStatus">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
Active
</value>
</item>
<item key="taskId">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:decimal">
253
</value>
</item>
<item key="taskSubject">
<value xmlns:ns5="http://www.w3.org/2001/XMLSchema" xsi:type="ns5:string">
Task: Another process related task
</value>
</item>
</data>
</data>
</bpm:ResponseData>
I need to extract exactly two things from this data: the taskSubject and the taskId. Preferably in a manner which would allow me to iterate over them. Something involving new{subject, id} would be nice.
I'm not quite sure how to handle this task...
With
var items = from feed in XMLDocument.Descendants("data").Descendants("data") select feed;
I get the two data items. Is there any way to drill them down further, returning the value of the descendant with a specific "key" attribute?
Regards,
Michael
EDIT:
I figured this would work:
var items = from feed in XMLDocument.Descendants("data").Descendants("data") select
new{
subject = from subjects in feed.Elements() where (subjects.Attribute("key").Value=="taskSubject") select subjects.Value,
id = from subjects in feed.Elements() where (subjects.Attribute("key").Value == "taskId") select subjects.Value
};
But that seems pretty "dirty"...
This is a bit hackish, but it should work (tested on Mono 2.10.2):
var items = from data in document.Descendants("data")
let taskId =
data.Elements("item")
.Where(i => (string)i.Attribute("key") == "taskId")
.FirstOrDefault()
where taskId != null
let taskSubject =
data.Elements("item")
.Where(i => (string)i.Attribute("key") == "taskSubject")
.FirstOrDefault()
where taskSubject != null
select new {
TaskId = taskId.Element("value").Value.Trim(),
TaskSubject = taskSubject.Element("value").Value.Trim()
};
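Another way to look at each inner data element is as a small key/value map. The following is a rough sketch rather than a drop-in replacement: TaskReader and ReadTasks are invented names, it needs C# 7 or later for the tuple syntax, and it assumes the keys are unique within one data row (ToDictionary throws on duplicates).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TaskReader
{
    // Hypothetical helper: returns (Id, Subject) pairs for every <data> row
    // that carries both a taskId and a taskSubject item.
    public static List<(string Id, string Subject)> ReadTasks(XDocument doc)
    {
        return doc.Descendants("data")
            // Keep only rows that directly contain <item> children;
            // this skips the outer <data> wrapper.
            .Where(d => d.Elements("item").Any())
            .Select(d => d.Elements("item")
                // Ignore items with no key or no <value>, e.g. assignedToUser.
                .Where(i => i.Attribute("key") != null && i.Element("value") != null)
                .ToDictionary(i => (string)i.Attribute("key"),
                              i => i.Element("value").Value.Trim()))
            .Where(m => m.ContainsKey("taskId") && m.ContainsKey("taskSubject"))
            .Select(m => (Id: m["taskId"], Subject: m["taskSubject"]))
            .ToList();
    }
}
```

The result can then be walked with foreach (var task in TaskReader.ReadTasks(document)), reading task.Id and task.Subject.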

LINQ to XML (Dynamic XML)

I have an XML file with a structure similar to the one below.
I would like to select the title and subitems using LINQ to XML. The difficulty is that an item sometimes has just one subitem and sometimes has 20, and I need to add them all to a List<string>.
<?xml version="1.0"?>
<items>
<item>
<title>Name of the title</title>
<subitem>Test</subitem>
<subitem1>Test</subitem1>
<subitem2>Test</subitem2>
<subitem3>Test</subitem3>
<subitem4>Test</subitem4>
<subitem5>Test</subitem5>
</item>
<item>
<title>Name of the title</title>
<subitem>Test</subitem>
<subitem1>Test</subitem1>
<subitem2>Test</subitem2>
<subitem3>Test</subitem3>
</item>
<item>
<title>Name of the title</title>
<subitem>Test</subitem>
<subitem1>Test</subitem1>
</item>
</items>
The solution, including getting the titles, is:
XDocument yourXDocument = XDocument.Load(yourXmlFilePath);
IEnumerable<Tuple<XElement, IEnumerable<XElement>>> yourSubItems =
yourXDocument.Root.Descendants()
.Where(xelem => xelem.Name == "title")
.Select(xelem => new Tuple<XElement, IEnumerable<XElement>>(xelem, xelem.Parent.Elements().Where(subelem => subelem.Name.LocalName.StartsWith("subitem"))));
XDocument xdoc = XDocument.Load(path_to_xml);
var query = from i in xdoc.Descendants("item")
select new
{
Title = (string)i.Element("title"),
Subitems = i.Elements()
.Where(e => e.Name.LocalName.StartsWith("subitem"))
.Select(e => (string)e)
.ToList()
};

XML Namespaces are confounding me

I have an XML document which is confounding me. I'd like to (to start) pull all of the document nodes (/database/document), but it only works if I remove all of the attributes on the database element. Specifically the xmlns tag causes an xpath query for /database/document to return nothing - remove it, and it works.
xmlns="http://www.lotus.com/dxl"
I take it this has to do with XML namespaces. What is it doing, and more to the point, how do I make it stop? I just want to parse the document for data.
<?xml version="1.0" encoding="utf-8"?>
<database xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.lotus.com/dxl xmlschemas/domino_7_0_3.xsd"
xmlns="http://www.lotus.com/dxl"
version="7.0"
maintenanceversion="3.0"
path="C:\LotusXML\test1.nsf"
title="test1">
<databaseinfo numberofdocuments="3">
<datamodified>
<datetime dst="true">20090812T142141,48-04</datetime>
</datamodified>
<designmodified>
<datetime dst="true">20090812T154850,91-04</datetime>
</designmodified>
</databaseinfo>
<document form="NameAddress">
<noteinfo noteid="8fa" unid="x" sequence="2">
<created>
<datetime dst="true">20090812T130308,71-04</datetime>
</created>
<modified>
<datetime dst="true">20090812T142049,36-04</datetime>
</modified>
<revised>
<datetime dst="true">20090812T142049,35-04</datetime>
</revised>
<lastaccessed>
<datetime dst="true">20090812T142049,35-04</datetime>
</lastaccessed>
<addedtofile>
<datetime dst="true">20090812T130321,57-04</datetime>
</addedtofile>
</noteinfo>
<updatedby>
<name>MOOSE</name>
</updatedby>
<revisions>
<datetime dst="true">20090812T130321,57-04</datetime>
</revisions>
<item name="Name">
<text>joe</text>
</item>
<item name="OtherName">
<text>dave</text>
</item>
<item name="Address">
<text>here at home</text>
</item>
<item name="PictureHere">
<richtext>
<pardef id="1" />
<par def="1">
</par>
<par def="1" />
</richtext>
</item>
</document>
<document form="NameAddress">
<noteinfo noteid="8fe" unid="x" sequence="2">
<created>
<datetime dst="true">20090812T130324,59-04</datetime>
</created>
<modified>
<datetime dst="true">20090812T142116,95-04</datetime>
</modified>
<revised>
<datetime dst="true">20090812T142116,94-04</datetime>
</revised>
<lastaccessed>
<datetime dst="true">20090812T142116,94-04</datetime>
</lastaccessed>
<addedtofile>
<datetime dst="true">20090812T130333,90-04</datetime>
</addedtofile>
</noteinfo>
<updatedby>
<name>MOOSE</name>
</updatedby>
<revisions>
<datetime dst="true">20090812T130333,90-04</datetime>
</revisions>
<item name="Name">
<text>fred</text>
</item>
<item name="OtherName">
<text>wilma</text>
</item>
<item name="Address">
<text>bedrock</text>
</item>
<item name="PictureHere">
<richtext>
<pardef id="1" />
<par def="1">
</par>
<par def="1" />
</richtext>
</item>
</document>
<document form="NameAddress">
<noteinfo noteid="902" unid="x" sequence="2">
<created>
<datetime dst="true">20090812T130337,09-04</datetime>
</created>
<modified>
<datetime dst="true">20090812T142141,48-04</datetime>
</modified>
<revised>
<datetime dst="true">20090812T142141,47-04</datetime>
</revised>
<lastaccessed>
<datetime dst="true">20090812T142141,47-04</datetime>
</lastaccessed>
<addedtofile>
<datetime dst="true">20090812T130350,20-04</datetime>
</addedtofile>
</noteinfo>
<updatedby>
<name>MOOSE</name>
</updatedby>
<revisions>
<datetime dst="true">20090812T130350,20-04</datetime>
</revisions>
<item name="Name">
<text>julie</text>
</item>
<item name="OtherName">
<text>mccarthy</text>
</item>
<item name="Address">
<text>the pen</text>
</item>
<item name="PictureHere">
<richtext>
<pardef id="1" />
<par def="1">
</par>
<par def="1" />
</richtext>
</item>
</document>
</database>
The xmlns="http://www.lotus.com/dxl" attribute sets a default namespace for the element and everything contained in it. It means that /database/document is really /{http://www.lotus.com/dxl}database/{http://www.lotus.com/dxl}document. Your XPath query will need to include the namespace:
XmlDocument doc = new XmlDocument();
doc.Load(fileName);
XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("tns", "http://www.lotus.com/dxl");
var documents = doc.SelectNodes("/tns:database/tns:document", ns);
When there is an XML namespace defined, each element name needs to be preceded by it to be correctly recognized.
If you were to use LINQ to XML to read in this data it would look something like this:
XDocument xdoc = XDocument.Load("file.xml");
XNamespace ns = "http://www.lotus.com/dxl";
var documents = xdoc.Descendants(ns + "document");
XML namespaces are similar in concept to C# namespaces (or those of any other language that supports them). If you define a class inside a namespace, you can't access it without first specifying the namespace (this is what using directives do for you).
You need to specify elements by their full name, including the namespace. The easy way to do this is to define the appropriate XNamespace and prepend it to the element name.
XDocument myDoc = XDocument.Load("file.xml");
XNamespace ns = "http://www.lotus.com/dxl";
XElement myElem = myDoc.Element(ns + "ElementName");
See MSDN for more information.
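If hard-coding the URI feels brittle, the namespace can also be read off the root element at run time. This is just a sketch under that assumption: DxlReader and GetDocuments are invented names, and it presumes the document elements sit directly under the root and share the root's namespace, as in the DXL sample.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DxlReader
{
    // Hypothetical helper: returns the <document> children of the root,
    // using whatever namespace the root element itself is in.
    public static List<XElement> GetDocuments(XDocument doc)
    {
        // XName.Namespace is XNamespace.None when the root has no namespace,
        // so the same code also handles a file without xmlns.
        XNamespace ns = doc.Root.Name.Namespace;
        return doc.Root.Elements(ns + "document").ToList();
    }
}
```

This keeps the query working whether or not the xmlns attribute is present, without stripping anything from the file.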

Query to retrieve names of group nodes

If I had some XML such as this loaded into an XDocument object:
<Root>
<GroupA>
<Item attrib1="aaa" attrib2="000" />
</GroupA>
<GroupB>
<Item attrib1="bbb" attrib2="111" />
<Item attrib1="ccc" attrib2="222" />
<Item attrib1="ddd" attrib2="333" />
</GroupB>
<GroupC>
<Item attrib1="eee" attrib2="444" />
<Item attrib1="fff" attrib2="555" />
</GroupC>
</Root>
What would a query look like to retrieve the names of the group nodes?
For example, I'd like a query to return:
GroupA
GroupB
GroupC
Something like this:
XDocument doc; // populate somehow
// this will give the names as XName
var names = from child in doc.Root.Elements()
select child.Name;
// if you want just the local (no-namespaces) name as a string, use this
var simpleNames = from child in doc.Root.Elements()
select child.Name.LocalName;
