Loading duplicate XML attributes using XDocument - c#

I need help loading xml using XDocument. The xml holds the data for a HierarchicalDataTemplate in WPF so each element has the same attributes.
I'm having a newbie problem with how to handle the duplicate attributes Name, image and fileLoc.
I was trying to get something like the code below to work, but as you can see duplicate attributes will not work.
public static List<MenuItem> Load(string MyMenuFile)
{
var mymenu = XDocument.Load(MyMenuFile).Root.Elements("Menu").Select(
x => new MenuItem(
(string)x.Attribute("id"),
(string)x.Attribute("name"),
(string)x.Attribute("image"),
(string)x.Attribute("fileLoc"),
(string)x.Element("itemlist"),
(string)x.Attribute("name"),
(string)x.Attribute("image"),
(string)x.Attribute("fileLoc"),
(string)x.Element("item"),
(string)x.Attribute("name"),
(string)x.Attribute("image"),
(string)x.Attribute("fileLoc")));
return mymenu.ToList();
}
Here is the xml:
<Menus>
<Menu id="1" Name="Level1" image="C:\lvl1.jpg" fileLoc="C:\lvl1.xml">
</Menu>
<Menu id="2" Name="Level2" image="C:\lvl2.jpg" >
<itemlist Name="Level2" image="C:\lvl2.jpg" fileLoc="C:\lvl2.xml">
</itemlist>
<itemlist Name="Level3" image="C:\lvl3.jpg">
<item Name="First" image="C:\first.jpg" fileLoc="C:\first.xml"></item>
<item Name="Second" image="C:\second.jpg" fileLoc="C:\second.xml"></item>
<item Name="Third" image="C:\third.jpg" fileLoc="C:\third.xml"></item>
</itemlist>
</Menu>
</Menus>
As you can see, different elements but duplicate attributes. Should I have 3 separate classes, but how would I combine them for the XDocument load? Any help would be great.

This assumes those are elements and attributes directly of MenuItem. I suspect that you need to read the attributes of the itemlist and item elements instead. I'm not sure how to do it with a single loop: you need to loop through the elements and, for each one, read the attributes of THAT element (not the parent element).

You are not being hierarchical in your processing.
I have adjusted your xml, but here is an example of how you should be processing it:
string xml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<Menus>
<Menu id=""1"" Name=""Level1 - Alpha"" image=""C:\lvl1.jpg"" fileLoc=""C:\lvl1.xml""/>
<Menu id=""2"" Name=""Level1 - Beta"" image=""C:\lvl2.jpg"" fileLoc=""C:\lvl1.xml"" >
<itemlist Name=""Level2-Gamma"" image=""C:\lvl2.jpg"" fileLoc=""C:\lvl2.xml""/>
<itemlist Name=""Level3-Zeta"" image=""C:\lvl3.jpg"" fileLoc=""C:\lvl1.xml"">
<item Name=""First"" image=""C:\first.jpg"" fileLoc=""C:\first.xml""></item>
<item Name=""Second"" image=""C:\second.jpg"" fileLoc=""C:\second.xml""></item>
<item Name=""Third"" image=""C:\third.jpg"" fileLoc=""C:\third.xml""></item>
</itemlist>
</Menu>
</Menus>";
var xd = XDocument.Parse(xml);
var result =
xd.Descendants("Menu")
.Select (l1 => new
{
Name = l1.Attribute("Name").Value,
Image = l1.Attribute("image").Value,
File = l1.Attribute("fileLoc"),
Children = l1.Descendants("itemlist")
.Select (l2 => new {
Name = l2.Attribute("Name").Value,
Image = l2.Attribute("image").Value,
File = l2.Attribute("fileLoc"),
Children = l2.Descendants("item")
.Select (l3 => new {
Name = l3.Attribute("Name").Value,
Image = l3.Attribute("image").Value,
File = l3.Attribute("fileLoc")
})
})
});
Console.WriteLine (result );
Look at the result in LINQPad to see how the data parses out: that is how you need to work with it to get it into the menu structure. There are no duplicate attributes. :-)
HTH
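On the "should I have 3 separate classes?" part of the question: since every level carries the same attributes, one recursive class is probably enough. Below is a minimal sketch, not tested against the asker's WPF project. `MenuNode`, `MenuLoader` and `FromElement` are hypothetical names made up for illustration, and a missing attribute simply comes back as null because of the `(string)` cast.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical type: one class for Menu, itemlist and item alike.
public class MenuNode
{
    public string Name { get; set; }
    public string Image { get; set; }
    public string FileLoc { get; set; }
    public List<MenuNode> Children { get; set; }
}

public static class MenuLoader
{
    // Reads the attributes of THIS element, then recurses into its direct children.
    public static MenuNode FromElement(XElement e)
    {
        return new MenuNode
        {
            Name = (string)e.Attribute("Name"),       // null if the attribute is absent
            Image = (string)e.Attribute("image"),
            FileLoc = (string)e.Attribute("fileLoc"),
            Children = e.Elements().Select(FromElement).ToList()
        };
    }

    // Assumes the root element contains the top-level <Menu> elements.
    public static List<MenuNode> Load(string xml)
    {
        return XDocument.Parse(xml).Root.Elements("Menu").Select(FromElement).ToList();
    }
}
```

With a shape like this, a HierarchicalDataTemplate could bind its ItemsSource to `Children`, assuming that property name is kept.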

Related

XDocument Descendant Selector using Wildcard?

I have some XML structured like this:
<form>
<section-1>
<item-1>
<value />
</item-1>
<item-2>
<value />
</item-2>
</section-1>
<section-2>
<item-3>
<value />
</item-3>
<item-4>
<value />
</item-4>
</section-2>
</form>
...and want to turn it into something sane like this:
<form>
<items>
<item id="1">
<value/>
</item>
<item id="2">
<value/>
</item>
<item id="3">
<value/>
</item>
<item id="4">
<value/>
</item>
</items>
</form>
I am struggling to turn the old XML into an array or object of values. Once in the new format I'd be able to do the following:
XDocument foo = XDocument.Load("form.xml");
var items = foo.Descendants("item")
.Select(i => new Item
{
value = i.Element("value").Value
});
...but with the XML in its current mess, can I wildcard the descendants selector?
var items = foo.Descendants("item"*)
...or something? I tried to follow this question's answer but failed to adapt it to my purpose.
Ah-ha! It did click in the end. If I leave the descendants selector blank and add in a where statement along the lines of what's in this question's answer
.Where(d => d.Name.ToString().StartsWith("item-"))
Then we get:
XDocument foo = XDocument.Load("form.xml");
var items = foo.Descendants()
.Where(d => d.Name.ToString().StartsWith("item-"))
.Select(i => new Item
{
value = i.Element("value").Value
});
...and I'm now able to iterate through those values while outputting the new XML format. Happiness.
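For the "outputting the new XML format" step, here is a rough sketch of how the same filter could feed a new tree. `FormConverter` and `Convert` are hypothetical names, and the sketch assumes every element to keep is named `item-` followed by its id, as in the sample.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class FormConverter
{
    // Flattens <section-N>/<item-N> into <items>/<item id="N">.
    public static XElement Convert(XElement form)
    {
        var items = form.Descendants()
            .Where(d => d.Name.LocalName.StartsWith("item-"))
            .Select(d => new XElement("item",
                new XAttribute("id", d.Name.LocalName.Substring("item-".Length)),
                d.Elements()));   // already-parented children are cloned when added here
        return new XElement("form", new XElement("items", items));
    }
}
```

Calling `FormConverter.Convert(XElement.Load("form.xml"))` would then return the new `<form>` element without modifying the loaded one.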

How to get elements without child elements?

Here is my xml:
<Root>
<FirstChild id="1" att="a">
<SecondChild id="11" att="aa">
<ThirdChild>123</ThirdChild>
<ThirdChild>456</ThirdChild>
<ThirdChild>789</ThirdChild>
</SecondChild>
<SecondChild id="12" att="ab">12</SecondChild>
<SecondChild id="13" att="ac">13</SecondChild>
</FirstChild>
<FirstChild id="2" att="b">2</FirstChild>
<FirstChild id="3" att="c">3</FirstChild>
</Root>
This XML document is very big and may be 1 GB in size or more. For better query performance, I want to read the XML document step by step. So, in the first step I want to read only the "FirstChild" elements and their attributes, like below:
<FirstChild id="1" att="a"></FirstChild>
<FirstChild id="2" att="b">2</FirstChild>
<FirstChild id="3" att="c">3</FirstChild>
And after that, I maybe want to get "SecondChild"s by id of their parent and so ...
<SecondChild id="11" att="aa"></SecondChild>
<SecondChild id="12" att="ab">12</SecondChild>
<SecondChild id="13" att="ac">13</SecondChild>
How can I do it?
Note: XDoc.Descendants() or XDoc.Elements() load all specific elements with all child elements!
Provided that you have memory available to hold the file, I suggest treating each search step as an item in the outer collection of a PLINQ pipeline.
I would start with an XName collection for the node collections that you want to retrieve. By nesting queries within XElement constructors, you can return new instances of your target nodes, with only name and attribute information.
With a .Where(...) statement or two, you could also filter the attributes being kept, allow for some child nodes to be retained, etc.
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
namespace LinqToXmlExample
{
public class Program
{
public static void Main(string[] args)
{
XElement root = XElement.Load("[your file path here]");
// element names are case-sensitive, so they must match the XML exactly
XName[] names = new XName[] { "FirstChild", "SecondChild", "ThirdChild" };
IEnumerable<XElement> elements =
names.AsParallel()
.Select(
name =>
new XElement(
$"result_{name}",
root.Descendants(name)
.AsParallel()
.Select(
x => new XElement(name, x.Attributes()))))
.ToArray();
}
}
}
I suggest creating a new element and copying the attributes.
XElement sourceElement = /* get <FirstChild id="1" att="a">...</FirstChild> through looping, XPath or any other method */;
var element = new XElement(sourceElement.Name);
foreach( var attribute in sourceElement.Attributes()){
element.Add(new XAttribute(attribute.Name, attribute.Value));
}
In VB you could do this to get a list of FirstChild elements:
'Dim yourpath As String = "your path here"
Dim xe As XElement
'to load from a file
'xe = XElement.Load(yourpath)
'for testing
xe = <Root>
<FirstChild id="1" att="a">
<SecondChild id="11" att="aa">
<ThirdChild>123</ThirdChild>
<ThirdChild>456</ThirdChild>
<ThirdChild>789</ThirdChild>
</SecondChild>
<SecondChild id="12" att="ab">12</SecondChild>
<SecondChild id="13" att="ac">13</SecondChild>
</FirstChild>
<FirstChild id="2" att="b">2</FirstChild>
<FirstChild id="3" att="c">3</FirstChild>
</Root>
Dim ie As IEnumerable(Of XElement)
ie = xe...<FirstChild>.Select(Function(el)
'create a copy
Dim foo As New XElement(el)
foo.RemoveNodes()
Return foo
End Function)
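None of the answers above avoids loading the whole document, which matters at 1 GB. A different technique, not the one used above, is to stream the file with XmlReader and build only the shallow elements. This is a sketch under some assumptions: `ShallowReader` and `ReadShallow` are hypothetical names, the document uses no XML namespaces, and text content (the "2" in the second FirstChild) is dropped, so it would need extra handling if you want it.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class ShallowReader
{
    // Yields each <elementName> as a new XElement carrying only its attributes.
    // The input is read forward-only, so child content is never materialized.
    public static IEnumerable<XElement> ReadShallow(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != elementName)
                    continue;

                var shallow = new XElement(elementName);
                if (reader.MoveToFirstAttribute())
                {
                    do
                    {
                        shallow.SetAttributeValue(reader.Name, reader.Value);
                    } while (reader.MoveToNextAttribute());
                    reader.MoveToElement();   // step back from the attributes to the element
                }
                yield return shallow;
            }
        }
    }
}
```

For a file you would pass something like `new StreamReader(path)`. Filtering SecondChild elements by the id of their parent would require tracking the most recent FirstChild while reading, which this sketch does not do.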

Remove all nodes in a specified namespace from XML

I have an XML document that contains some content in a namespace. Here is an example:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="urn:my-test-urn">
<Item name="Item one">
<test:AlternativeName>Another name</test:AlternativeName>
<Price test:Currency="GBP">124.00</Price>
</Item>
</root>
I want to remove all of the content that is within the test namespace - not just remove the namespace prefix from the tags, but actually remove all nodes (elements and attributes) from the document that (in this example) are in the test namespace. My required output is:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="urn:my-test-urn">
<Item name="Item one">
<Price>124.00</Price>
</Item>
</root>
I'm currently not overly concerned if the namespace declaration is still present, for now I'd be happy with removing just the content within the specified namespace. Note that there may be multiple namespaces in the document to be modified, so I'd like to be able to specify which one I want to have the content removed.
I've tried doing it using .Descendants().Where(e => e.Name.Namespace == "test") but that is only for returning an IEnumerable<XElement> so it doesn't help me with finding the attributes, and if I use .DescendantNodes() I can't see a way of querying the namespace prefix as that doesn't seem to be a property on XNode.
I can iterate through each element and then through each attribute on the element checking each one's Name.Namespace but that seems inelegant and hard to read.
Is there a way of achieving this using LINQ to Xml?
Iterating through elements then through attributes seems not too hard to read :
var xml = @"<?xml version='1.0' encoding='UTF-8'?>
<root xmlns:test='urn:my-test-urn'>
<Item name='Item one'>
<test:AlternativeName>Another name</test:AlternativeName>
<Price test:Currency='GBP'>124.00</Price>
</Item>
</root>";
var doc = XDocument.Parse(xml);
XNamespace test = "urn:my-test-urn";
//get all elements in specific namespace and remove
doc.Descendants()
.Where(o => o.Name.Namespace == test)
.Remove();
//get all attributes in specific namespace and remove
doc.Descendants()
.Attributes()
.Where(o => o.Name.Namespace == test)
.Remove();
//print result
Console.WriteLine(doc.ToString());
output :
<root xmlns:test="urn:my-test-urn">
<Item name="Item one">
<Price>124.00</Price>
</Item>
</root>
Give this a try. I had to pull the namespace from the root element and then run two separate LINQ queries:
1. Remove elements with the namespace
2. Remove attributes with the namespace
Code:
string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<root xmlns:test=\"urn:my-test-urn\">" +
"<Item name=\"Item one\">" +
"<test:AlternativeName>Another name</test:AlternativeName>" +
"<Price test:Currency=\"GBP\">124.00</Price>" +
"</Item>" +
"</root>";
XDocument xDocument = XDocument.Parse(xml);
if (xDocument.Root != null)
{
string namespaceValue = xDocument.Root.Attributes().Where(a => a.IsNamespaceDeclaration).FirstOrDefault().Value;
// Removes elements with the namespace
xDocument.Root.Descendants().Where(d => d.Name.Namespace == namespaceValue).Remove();
// Removes attributes with the namespace
xDocument.Root.Descendants().ToList().ForEach(d => d.Attributes().Where(a => a.Name.Namespace == namespaceValue).Remove());
Console.WriteLine(xDocument.ToString());
}
Results:
<root xmlns:test="urn:my-test-urn">
<Item name="Item one">
<Price>124.00</Price>
</Item>
</root>
If you want to remove the namespace declaration from the root element, add this line in the if statement after you get the namespaceValue:
xDocument.Root.Attributes().Where(a => a.IsNamespaceDeclaration).Remove();
Results:
<root>
<Item name="Item one">
<Price>124.00</Price>
</Item>
</root>

List is empty after parsing XML with LinQ

I have an xml file similar to the following:
<doc>
<file>
<header>
<source>
RNG
</source>
</header>
<body>
<item name="items.names.id1">
<property>propertyvalue1</property>
</item>
<!-- etc -->
<item name="items.names.id100">
<property>propertyvalue100</property>
</item>
<!-- etc -->
<item name="otheritems.names.id100">
<property>propertyvalue100</property>
</item>
</body>
</file>
</doc>
And the following class:
private class Item
{
public string Id;
public string Property;
}
The file has, for example, 100 item entries (labeled 1 to 100 in the name attribute). How can I use LINQ to XML to get hold of these nodes and place them in a list of Item?
Using Selman22's example, I'm doing the following:
var myList = xDoc.Descendants("item")
.Where(x => x.Attributes("name").ToString().StartsWith("items.names.id"))
.Select(item => new Item
{
Id = (string)item.Attribute("name"),
Name = (string)item.Element("property")
}).ToList();
However, the list is empty. What am I missing here?
Using LINQ to XML:
XDocument xDoc = XDocument.Load(filepath);
var myList = xDoc.Descendants("item").Select(item => new Item {
Id = (string)item.Attribute("name"),
Property = (string)item.Element("property")
}).ToList();
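As for why the filtered version in the question comes back empty: `x.Attributes("name")` returns a sequence of attributes, and as far as I can tell calling `ToString()` on that sequence gives the name of the sequence's type rather than the attribute value, so `StartsWith` never matches. The initializer in the question also assigns `Name`, which the `Item` class shown does not have. A sketch of the filter using the single attribute instead (`ItemQuery` and `Load` are hypothetical names; `Item` is copied from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Item
{
    public string Id;
    public string Property;
}

public static class ItemQuery
{
    public static List<Item> Load(XDocument xDoc)
    {
        return xDoc.Descendants("item")
            // Cast the single attribute to string; a missing attribute yields null.
            .Where(x => ((string)x.Attribute("name") ?? "")
                .StartsWith("items.names.id", StringComparison.Ordinal))
            .Select(x => new Item
            {
                Id = (string)x.Attribute("name"),
                Property = (string)x.Element("property")
            })
            .ToList();
    }
}
```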
You can use LINQ to XML to query the XML directly, or deserialize it and use LINQ to Objects. If you choose to deserialize, I suggest starting from the schema and generating the classes that represent your data model with xsd.exe. If you don't have the schema of your XML, xsd.exe can even infer one from an example XML file, but you will probably need to fine-tune the result.
Try this one:
XElement root = XElement.Load("your file name");
var items = (from item in root.Descendants("item")
select item).ToList();
Now iterate over the list and store the values.
The below is a way of getting information from XML using XDocument.
string input = "<Your xml>";
XDocument doc = XDocument.Parse(input);
var data = doc.Descendants("item");
List<Item> itemsList = new List<Item>();
foreach (var item in data)
{
// the item name is an attribute of <item>, not a child element
string itemname = item.Attribute("name").Value;
string property = item.Element("property").Value;
itemsList.Add(new Item { Id = itemname, Property = property });
}
I'm guessing you want the code, given how your question is phrased. I'm also assuming the real XML is as simple as the sample.
var items = from item in doc.Descendants("item")
select new Item()
{
Id = item.Attributes("name").First().Value,
Property = item.Elements().First().Value,
};
Just ensure that your xml is loaded into doc. You can load the xml in two ways:
// By a string with xml
var doc = XDocument.Parse(aStringWithXml);
// or by loading from uri (file)
var doc = XDocument.Load(aStringWhichIsAFile);

Find frequency of values in an Array or XML (C#)

I have an XML feed (which I don't control) and I am trying to figure out how to detect the volume of certain attribute values within the document.
I am also parsing the XML and separating attributes into Arrays (for other functionality)
Here is a sample of my XML
<items>
<item att1="ABC123" att2="uID" />
<item att1="ABC345" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="ABC678" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="XYZ123" att2="uID" />
<item att1="XYZ345" att2="uID" />
<item att1="XYZ678" att2="uID" />
</items>
I want to find the volume of nodes for each att1 value. The att1 values will change. Once I know the frequency of the att1 values, I need to pull the att2 value of that node.
I need to find the TOP 4 items and pull the values of their attributes.
All of this needs to be done in C# code behind.
If I was using Javascript I would create an associative array and have att1 be the key and the frequency be the value. But since I'm new to c# I don't know how to duplicate this in c#.
So I believe, first I need to find all unique att1 values in the XML. I can do this using:
IEnumerable<string> uItems = uItemsArray.Distinct();
// Where uItemsArray is a collection of all the att1 values in an array
Then I get stuck on how I compare each unique att1 value to the whole document to get the volume stored in a variable or array or whatever data set.
Here is the snippet I ended up using:
XDocument doc = XDocument.Load(@"temp/salesData.xml");
var topItems = from item in doc.Descendants("item")
select new
{
name = (string)item.Attribute("name"),
sku = (string)item.Attribute("sku"),
iCat = (string)item.Attribute("iCat"),
sTime = (string)item.Attribute("sTime"),
price = (string)item.Attribute("price"),
desc = (string)item.Attribute("desc")
} into node
group node by node.sku into grp
select new {
sku = grp.Key,
name = grp.ElementAt(0).name,
iCat = grp.ElementAt(0).iCat,
sTime = grp.ElementAt(0).sTime,
price = grp.ElementAt(0).price,
desc = grp.ElementAt(0).desc,
Count = grp.Count()
};
_topSellers = new SalesDataObject[4];
int topSellerIndex = 0;
foreach (var item in topItems.OrderByDescending(x => x.Count).Take(4))
{
SalesDataObject topSeller = new SalesDataObject();
topSeller.iCat = item.iCat;
topSeller.iName = item.name;
topSeller.iSku = item.sku;
topSeller.sTime = Convert.ToDateTime(item.sTime);
topSeller.iDesc = item.desc;
topSeller.iPrice = item.price;
_topSellers.SetValue(topSeller, topSellerIndex);
topSellerIndex++;
}
Thanks for all your help!
Are you using .NET 3.5? (It looks like it based on your code.) If so, I suspect this is pretty easy with LINQ to XML and LINQ to Objects. However, I'm afraid it's not clear from your example what you want. Do all the values with the same att1 also have the same att2? If so, it's something like:
var results = (from element in items.Elements("item")
group element by element.Attribute("att1").Value into grouped
orderby grouped.Count() descending
select grouped.First().Attribute("att2").Value).Take(4);
I haven't tested it, but I think it should work...
We start off with all the item elements
We group them (still as elements) by their att1 value
We sort the groups by their size, descending so the biggest one is first
From each group we take the first element to find its att2 value
We take the top four of these results
If you have the values, you should be able to use LINQ's GroupBy...
XDocument doc = XDocument.Parse(xml);
var query = from item in doc.Descendants("item")
select new
{
att1 = (string)item.Attribute("att1"),
att2 = (string)item.Attribute("att2") // if needed
} into node
group node by node.att1 into grp
select new { att1 = grp.Key, Count = grp.Count() };
foreach (var item in query.OrderByDescending(x=>x.Count).Take(4))
{
Console.WriteLine("{0} = {1}", item.att1, item.Count);
}
You can use LINQ/XLINQ to accomplish this. Below is a sample console application I just wrote, so the code might not be optimized but it works.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Text;
namespace FrequencyThingy
{
class Program
{
static void Main(string[] args)
{
string data = @"<items>
<item att1=""ABC123"" att2=""uID"" />
<item att1=""ABC345"" att2=""uID"" />
<item att1=""ABC123"" att2=""uID"" />
<item att1=""ABC678"" att2=""uID"" />
<item att1=""ABC123"" att2=""uID"" />
<item att1=""XYZ123"" att2=""uID"" />
<item att1=""XYZ345"" att2=""uID"" />
<item att1=""XYZ678"" att2=""uID"" />
</items>";
XDocument doc = XDocument.Parse(data);
var grouping = doc.Root.Elements().GroupBy(item => item.Attribute("att1").Value);
foreach (var group in grouping)
{
var groupArray = group.ToArray();
Console.WriteLine("Group {0} has {1} element(s).", groupArray[0].Attribute("att1").Value, groupArray.Length);
}
Console.ReadKey();
}
}
}
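Since the question mentions a JavaScript-style associative array, the closest direct equivalent in C# is probably a `Dictionary<string, int>`. This is only a sketch of that alternative to GroupBy; `FrequencyCounter` and `CountByAttribute` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class FrequencyCounter
{
    // Maps each distinct attribute value to the number of <item> elements that carry it.
    public static Dictionary<string, int> CountByAttribute(XDocument doc, string attributeName)
    {
        var counts = new Dictionary<string, int>();
        foreach (XElement item in doc.Descendants("item"))
        {
            string key = (string)item.Attribute(attributeName);
            if (key == null) continue;            // skip items without the attribute
            int current;
            counts.TryGetValue(key, out current); // current stays 0 when the key is new
            counts[key] = current + 1;
        }
        return counts;
    }
}
```

With `using System.Linq;` added, the top four could then be taken with `counts.OrderByDescending(kv => kv.Value).Take(4)`.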
