LINQ to XML select - c#

I have two xml documents with some elements like
doc1
<Item id="22"/>
<Item id="33"/>
<Item id="44"/>
...
doc2
<Item id="33"/>
<Item id="44"/>
<Item id="66"/>
<Item id="88"/>
...
I need a query to select
only those elements from doc1 that are missing in doc2 ignoring other doc2 elements.
In this case the result will be:
<Item id="22"/>
How do I do that?

Basically, you create a List with all ids from the second list, and check for each item of doc1 if it is in the list.
Performance-wise, I think it isn't the best choice, but it should work:
var qry = from item in doc1.Descendants("Item")
where
!(from item2 in doc2.Descendants("Item")
select item2.Attribute("id").Value
).ToList().Contains(item.Attribute("id").Value)
select item;
In the LINQ statement above, I think the list of ids is rebuilt for every element in doc1. A better option is to create the list first and then use it in the next statement:
List<string> items = (from item2 in doc2.Descendants("Item")
select item2.Attribute("id").Value
).ToList();
var qry = from item in doc1.Descendants("Item")
where !items.Contains(item.Attribute("id").Value)
select item;

Probably something along the lines of
doc1.Descendants("Item").Where(i1 => doc2.Descendants("Item").All(i2 => (string)i2.Attribute("id") != (string)i1.Attribute("id")))
could get you there.
HOWEVER, this is performing a subquery on doc2 for each element in doc1. Make sure they are small!
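If the documents are not small, one way to avoid the repeated scan of doc2 is to collect its ids into a HashSet once and test membership against that set. The following is only a sketch: the class name XmlDiff and the method name MissingFrom are made up for illustration, and it assumes the id attribute is what identifies an Item.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlDiff
{
    // Returns the Item elements of doc1 whose id does not occur in doc2.
    // (Hypothetical helper, not part of LINQ to XML.)
    public static List<XElement> MissingFrom(XDocument doc1, XDocument doc2)
    {
        // Build the set of doc2 ids once; lookups are O(1) on average.
        var idsInDoc2 = new HashSet<string>(
            doc2.Descendants("Item").Select(e => (string)e.Attribute("id")));

        // Keep only the doc1 items whose id is not in that set.
        return doc1.Descendants("Item")
                   .Where(e => !idsInDoc2.Contains((string)e.Attribute("id")))
                   .ToList();
    }
}
```

The cast to string returns null for an Item without an id attribute instead of throwing, so such elements are compared as having a null id.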

The easiest way is to use the ExceptBy method from the MoreLINQ library. Assuming the Item elements are directly under the root element:
var doc1 = XDocument.Load("doc1.xml");
var doc2 = XDocument.Load("doc2.xml");
var doc1Elements = doc1.Root.Elements("Item");
var doc2Elements = doc2.Root.Elements("Item");
var diff = doc1Elements.ExceptBy(doc2Elements, e => e.Attribute("id").Value);

Related

Parse XML to Tuple

New to LINQ.
Example xml:
<item name="XX">
<inside type="A"/>
<inside type="B" />
</item>
<item name="YY">
<inside type="C"/>
<inside type="D" />
</item>
I would like to parse it to tuples:
(XX, A)
(XX, B)
(YY, C)
(YY, D)
So far I can retrieve only the first line of each item, (XX, A) and (YY, C), using the code below:
var selected = (from item in doc.Root.Elements()
let inside = item.Element(XName.Get("inside", item.GetDefaultNamespace().NamespaceName))
select Tuple.Create(item.Attribute("name").Value,
inside.Attribute("type").Value)).ToList();
I believe I should modify item.Element to item.Elements but so far no luck.
Instead of using item.Element, you should iterate over item.Elements:
var selected = (from item in doc.Root.Elements()
from inside in item.Elements("inside")
select Tuple.Create(
item.Attribute("name").Value,
inside.Attribute("type").Value)
).ToList();
var tuples = XElement.Parse(xml).Descendants("item")
.SelectMany(item => item.Elements()
.Select(inside => Tuple.Create((string)item.Attribute("name"), (string)inside.Attribute("type"))));
You can use SelectMany to help you build a collection of tuples from a single element.

List is empty after parsing XML with LinQ

I have an xml file similar to the following:
<doc>
<file>
<header>
<source>
RNG
</source>
</header>
<body>
<item name="items.names.id1">
<property>propertyvalue1</property>
</item>
<!-- etc -->
<item name="items.names.id100">
<property>propertyvalue100</property>
</item>
<!-- etc -->
<item name="otheritems.names.id100">
<property>propertyvalue100</property>
</item>
</body>
</file>
</doc>
And the following class:
private class Item
{
public string Id;
public string Property;
}
The file has, for example, 100 item entries (labeled 1 to 100 in the name attribute). How can I use LINQ to XML to get hold of these nodes and place them in a list of Item?
Using Selman22's example, I'm doing the following:
var myList = xDoc.Descendants("item")
.Where(x => x.Attributes("name").ToString().StartsWith("items.names.id"))
.Select(item => new Item
{
Id = (string)item.Attribute("name"),
Property = (string)item.Element("property")
}).ToList();
However, the list is empty. What am I missing here?
Using LINQ to XML:
XDocument xDoc = XDocument.Load(filepath);
var myList = xDoc.Descendants("item").Select(item => new Item {
Id = (string)item.Attribute("name"),
Property = (string)item.Element("property")
}).ToList();
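As far as I can tell, the filtered query in the question returns nothing because Attributes("name") (plural) returns a sequence of attributes, and calling ToString() on that sequence gives a type name rather than the attribute's value, so StartsWith never matches. A sketch of the filter using Attribute (singular) follows; the ItemParser class and its Parse method are hypothetical names, and Item is a public copy of the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Item
{
    public string Id;
    public string Property;
}

public static class ItemParser
{
    // Hypothetical helper: returns the items whose name attribute starts with prefix.
    public static List<Item> Parse(XDocument xDoc, string prefix)
    {
        return xDoc.Descendants("item")
            // Attribute (singular) returns one XAttribute; the string cast
            // yields its value, or null when the attribute is absent.
            .Where(x => ((string)x.Attribute("name") ?? "")
                .StartsWith(prefix, StringComparison.Ordinal))
            .Select(x => new Item
            {
                Id = (string)x.Attribute("name"),
                Property = (string)x.Element("property")
            })
            .ToList();
    }
}
```

Called as ItemParser.Parse(xDoc, "items.names.id"), this should skip entries such as otheritems.names.id100.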
You can use LINQ to XML to query the XML directly, or deserialize it and use LINQ to Objects. If you choose to deserialize, I suggest starting from the schema and generating the classes that represent your data model with xsd.exe. If you don't have a schema for your XML, xsd.exe can even infer one from an example XML file, but you will probably need to fine-tune the result.
Try this one:
XElement root = XElement.Load("your file name");
var items = (from item in root.Descendants("item")
select item).ToList();
Now iterate over the list and store each item.
Below is a way of getting the information from the XML using XDocument.
string input = "<Your xml>";
XDocument doc = XDocument.Parse(input);
var data = doc.Descendants("item");
List<Item> itemsList = new List<Item>();
foreach (var item in data)
{
// the id is in the name attribute, the value in the property child element
string itemname = item.Attribute("name").Value;
string property = item.Element("property").Value;
itemsList.Add(new Item { Id = itemname, Property = property });
}
I'm guessing you want the code, given how your question is phrased. I'm also assuming the real XML is as simple as the sample.
var items = from item in doc.Descendants("item")
select new Item()
{
Id = item.Attributes("name").First().Value,
Property = item.Elements().First().Value,
};
Just ensure that your xml is loaded into doc. You can load the xml in two ways:
// By a string with xml
var doc = XDocument.Parse(aStringWithXml);
// or by loading from uri (file)
var doc = XDocument.Load(aStringWhichIsAFile);

C# XML get nodes based on attribute

I have the following xml:
<root ...>
<Tables>
<Table content="..">
</Table>
<Table content="interesting">
<Item ...></Item>
<Item ...></Item>
<Item ...></Item>
</Table>
...etc...
</Tables>
</root>
I'm using the following code to get the items from the 'interesting' node:
XElement xel = XElement.Parse(resp);
var nodes = from n in xel.Elements("Tables").Elements("Table")
where n.Attribute("content").Value == "interesting"
select n;
var items = from i in nodes.Elements()
select i;
Is there a simpler, cleaner way to achieve this?
Well there's no point in using a query expression for items, and you can wrap the whole thing up very easily in a single statement. I wouldn't even bother with a query expression for that:
var items = XElement.Parse(resp)
.Elements("Tables")
.Elements("Table")
.Where(n => n.Attribute("content").Value == "interesting")
.Elements();
Note that this (and your current query) will throw an exception for any Table element without a content attribute. If you'd rather just skip it, you can use:
.Where(n => (string) n.Attribute("content") == "interesting")
instead.
You can use XPath (extension is in System.Xml.XPath namespace) to select all items in one line:
var items = xel.XPathSelectElements("//Table[@content='interesting']/Item");
If you don't need nodes outside of your query for items, you can just do this:
var items = from n in xel.Elements("Tables").Elements("Table")
where n.Attribute("content").Value == "interesting"
from i in n.Elements()
select i;
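Using XmlDocument:
XmlDocument xdoc = new XmlDocument();
xdoc.LoadXml(resp); // load the XML before querying it
// GetElementsByTagName does not accept XPath expressions; SelectNodes does
var item = xdoc.SelectNodes("//Table[@content='interesting']/Item");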

Order XmlNodeList based on an attribute

I have an XmlNodeList that contains packets (item) from the root of the XML example below. I want to sort the XmlNodeList based on the node's key attribute value.
The sorting has to be very efficient, every millisecond counts.
Do you have any idea?
<root>
<item key="1000000020">
Content 20
</item>
<item key="1000000001">
Content 1
</item>
...
<item key="1043245231">
Content n
</item>
</root>
Edit:
I already have an XmlNodeList constructed from the items. I do not have access to the XmlDocument anymore, only the list of items.
You should try Linq to XML.
XDocument doc = XDocument.Load(file);
var nodeList = from ele in doc.Descendants("item")
orderby int.Parse(ele.Attribute("key").Value)
select ele;
You may try XPathNavigator and XPathExpression.
//I presume that variable xNodeList contains XmlNodeList.
XPathNavigator nav=xNodeList.Item(0).OwnerDocument.CreateNavigator();
XPathExpression exp = nav.Compile("root/item");
exp.AddSort("@key", XmlSortOrder.Ascending, XmlCaseOrder.None, "", XmlDataType.Number );
foreach (XPathNavigator t in nav.Select(exp))
{
Console.WriteLine(t.OuterXml );
}
Note: the xml variable is a string.
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
IEnumerable<XmlNode> rows = doc.SelectNodes("report/table/row").Cast<XmlNode>().OrderByDescending(r => Convert.ToDecimal(r.Attributes["conversions"].Value));
I solved the problem in a very non-elegant way:
I iterated my XmlNodeList
During iteration I extracted the timestamps
After extracting a timestamp I added the timestamp-XmlElement to a SortedDictionary
Converted the SortedDictionary to list (sortedKeys = sortedByDateDisctionary.Keys.ToList();)
If the nodes need to be sorted Descending then sortedKeys.Reverse();
Then the nodes can be accessed by the sorted keys
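When only the XmlNodeList is available, a shorter alternative to the SortedDictionary steps above might be to cast the list to XmlNode and sort it with LINQ to Objects. This is a sketch, not a benchmarked solution: NodeSorter and SortByKey are hypothetical names, and it assumes every node carries a numeric key attribute as in the sample XML.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class NodeSorter
{
    // Hypothetical helper: returns the nodes ordered by their numeric key attribute.
    public static List<XmlNode> SortByKey(XmlNodeList nodes)
    {
        // XmlNodeList is a non-generic IEnumerable, so Cast<XmlNode>() is needed
        // before the LINQ operators can be used.
        return nodes.Cast<XmlNode>()
                    .OrderBy(n => long.Parse(n.Attributes["key"].Value))
                    .ToList();
    }
}
```

Use OrderByDescending instead of OrderBy for descending order. Whether this is fast enough for the stated timing requirement would have to be measured.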

Find frequency of values in an Array or XML (C#)

I have an XML feed (which I don't control) and I am trying to figure out how to detect the volume of certain attribute values within the document.
I am also parsing the XML and separating attributes into Arrays (for other functionality)
Here is a sample of my XML
<items>
<item att1="ABC123" att2="uID" />
<item att1="ABC345" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="ABC678" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="XYZ123" att2="uID" />
<item att1="XYZ345" att2="uID" />
<item att1="XYZ678" att2="uID" />
</items>
I want to find the volume of nodes for each att1 value. The att1 values will change. Once I know the frequency of the att1 values, I need to pull the att2 value of that node.
I need to find the TOP 4 items and pull the values of their attributes.
All of this needs to be done in C# code behind.
If I were using JavaScript I would create an associative array with att1 as the key and the frequency as the value. But since I'm new to C#, I don't know how to duplicate this in C#.
So I believe, first I need to find all unique att1 values in the XML. I can do this using:
IEnumerable<string> uItems = uItemsArray.Distinct();
// Where uItemsArray is a collection of all the att1 values in an array
Then I get stuck on how I compare each unique att1 value to the whole document to get the volume stored in a variable or array or whatever data set.
Here is the snippet I ended up using:
XDocument doc = XDocument.Load(@"temp/salesData.xml");
var topItems = from item in doc.Descendants("item")
select new
{
name = (string)item.Attribute("name"),
sku = (string)item.Attribute("sku"),
iCat = (string)item.Attribute("iCat"),
sTime = (string)item.Attribute("sTime"),
price = (string)item.Attribute("price"),
desc = (string)item.Attribute("desc")
} into node
group node by node.sku into grp
select new {
sku = grp.Key,
name = grp.ElementAt(0).name,
iCat = grp.ElementAt(0).iCat,
sTime = grp.ElementAt(0).sTime,
price = grp.ElementAt(0).price,
desc = grp.ElementAt(0).desc,
Count = grp.Count()
};
_topSellers = new SalesDataObject[4];
int topSellerIndex = 0;
foreach (var item in topItems.OrderByDescending(x => x.Count).Take(4))
{
SalesDataObject topSeller = new SalesDataObject();
topSeller.iCat = item.iCat;
topSeller.iName = item.name;
topSeller.iSku = item.sku;
topSeller.sTime = Convert.ToDateTime(item.sTime);
topSeller.iDesc = item.desc;
topSeller.iPrice = item.price;
_topSellers.SetValue(topSeller, topSellerIndex);
topSellerIndex++;
}
Thanks for all your help!
Are you using .NET 3.5? (It looks like it based on your code.) If so, I suspect this is pretty easy with LINQ to XML and LINQ to Objects. However, I'm afraid it's not clear from your example what you want. Do all the values with the same att1 also have the same att2? If so, it's something like:
var results = (from element in items.Elements("item")
group element by element.Attribute("att1").Value into grouped
orderby grouped.Count() descending
select grouped.First().Attribute("att2").Value).Take(4);
I haven't tested it, but I think it should work...
We start off with all the item elements
We group them (still as elements) by their att1 value
We sort the groups by their size, descending so the biggest one is first
From each group we take the first element to find its att2 value
We take the top four of these results
If you have the values, you should be able to use LINQ's GroupBy...
XDocument doc = XDocument.Parse(xml);
var query = from item in doc.Descendants("item")
select new
{
att1 = (string)item.Attribute("att1"),
att2 = (string)item.Attribute("att2") // if needed
} into node
group node by node.att1 into grp
select new { att1 = grp.Key, Count = grp.Count() };
foreach (var item in query.OrderByDescending(x=>x.Count).Take(4))
{
Console.WriteLine("{0} = {1}", item.att1, item.Count);
}
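The associative array mentioned in the question corresponds roughly to a Dictionary<string, int> in C#. The sketch below counts att1 values by hand as an alternative to GroupBy; the Frequency class and its CountByAtt1 and Top methods are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class Frequency
{
    // Hypothetical helper: maps each att1 value to the number of items carrying it.
    public static Dictionary<string, int> CountByAtt1(XDocument doc)
    {
        var counts = new Dictionary<string, int>();
        foreach (var item in doc.Descendants("item"))
        {
            var key = (string)item.Attribute("att1");
            if (key == null) continue;           // skip items without att1

            int current;
            counts.TryGetValue(key, out current); // current stays 0 for a new key
            counts[key] = current + 1;
        }
        return counts;
    }

    // Hypothetical helper: the n most frequent entries, highest count first.
    public static List<KeyValuePair<string, int>> Top(Dictionary<string, int> counts, int n)
    {
        return counts.OrderByDescending(kv => kv.Value).Take(n).ToList();
    }
}
```

Frequency.Top(Frequency.CountByAtt1(doc), 4) then gives the four most frequent att1 values with their counts.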
You can use LINQ/XLINQ to accomplish this. Below is a sample console application I just wrote, so the code might not be optimized but it works.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Text;
namespace FrequencyThingy
{
class Program
{
static void Main(string[] args)
{
string data = @"<items>
<item att1=""ABC123"" att2=""uID"" />
<item att1=""ABC345"" att2=""uID"" />
<item att1=""ABC123"" att2=""uID"" />
<item att1=""ABC678"" att2=""uID"" />
<item att1=""ABC123"" att2=""uID"" />
<item att1=""XYZ123"" att2=""uID"" />
<item att1=""XYZ345"" att2=""uID"" />
<item att1=""XYZ678"" att2=""uID"" />
</items>";
XDocument doc = XDocument.Parse(data);
var grouping = doc.Root.Elements().GroupBy(item => item.Attribute("att1").Value);
foreach (var group in grouping)
{
var groupArray = group.ToArray();
Console.WriteLine("Group {0} has {1} element(s).", groupArray[0].Attribute("att1").Value, groupArray.Length);
}
Console.ReadKey();
}
}
}
