MoreLinq ExceptBy using several specific elements - c#

There are 2 xml files
First xml file contains:
<Prices>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>180</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
</Prices>
and the second xml file:
<Prices>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
</Prices>
What I want is to use MoreLinq's ExceptBy(), or a custom class implementing IEqualityComparer with LINQ's Except(), to return something like this (comparing the 1st xml file against the 2nd xml file, even though the third Price element in the 1st xml file has a different DistributorPriceFibrate value):
<Prices/>
Since Except() compares all values in the 'Price' node, I want to compare only the specific elements <ProductId> and <EffectiveDate>.
If they are the same, the result should be the empty tag <Prices/>. If they differ, return the Price elements from the 1st xml file whose ProductId and EffectiveDate do not match any Price element in the 2nd xml file.
What I've done so far is make the 1st xml file distinct:
var distinctItemsonxmldoc1 =
xmldoc1
.Descendants("Price")
.DistinctBy(element => new
{
ProductId = (string)element.Element("ProductId"),
EffectiveDate = (string)element.Element("EffectiveDate")
});
var afterdistinctxmldoc1 = new XElement("Prices");
foreach (var a in distinctItemsonxmldoc1 )
{
afterdistinctxmldoc1.Add(a);
}
Then I use Except() to compare the 2 files:
var afterexcept = afterdistinctxmldoc1.Descendants("Price").Cast<XNode>().Except(xmldoc2.Descendants("Price").Cast<XNode>(), new XNodeEqualityComparer());
But it compares every element value in the Price node.
How can I use ExceptBy() on specific elements only?
Or maybe a custom IEqualityComparer?
Thanks in advance.
EDIT
Already solved; see the answer by @dbc.

To confirm I understand your question: given two XML documents, you want to enumerate the Price elements in the first document that have distinct values for the child elements ProductId and EffectiveDate, skipping all those whose ProductId and EffectiveDate match a Price element in the second document, using MoreLinq.
In that case, you can do:
var diff = xmldoc1.Descendants("Price").ExceptBy(xmldoc2.Descendants("Price"),
e => new { ProductId = e.Elements("ProductId").Select(p => p.Value).FirstOrDefault(), EffectiveDate = e.Elements("EffectiveDate").Select(p => p.Value).FirstOrDefault() });
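The question also asks whether a custom IEqualityComparer with the standard Except() would work. A minimal sketch of that alternative follows, assuming the two documents look like the samples above; the names PriceKeyComparer and PriceDiff are hypothetical and not part of MoreLinq or LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparer: two Price elements are "equal" when their
// ProductId and EffectiveDate child values match; other children are ignored.
public sealed class PriceKeyComparer : IEqualityComparer<XElement>
{
    private static (string, string) Key(XElement e) =>
        ((string)e.Element("ProductId"), (string)e.Element("EffectiveDate"));

    public bool Equals(XElement x, XElement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return Key(x).Equals(Key(y));
    }

    public int GetHashCode(XElement e) => Key(e).GetHashCode();
}

public static class PriceDiff
{
    // Returns a new <Prices> element holding the Price elements of 'first'
    // whose key does not occur in 'second'. Except() has set semantics, so
    // duplicate keys within 'first' are collapsed as well.
    public static XElement Diff(XContainer first, XContainer second) =>
        new XElement("Prices",
            first.Descendants("Price")
                 .Except(second.Descendants("Price"), new PriceKeyComparer()));
}
```

Calling PriceDiff.Diff(xmldoc1, xmldoc2) on the two sample files should yield an empty Prices element, because every Price in the first file shares its ProductId and EffectiveDate with the one in the second file. Since Except() already removes duplicates from the first sequence, the separate DistinctBy() step is not needed with this approach.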

Related

How to select a single XML node in c# using multiple XPath queries

I'm trying to select a single node from an XML file based on two queries, I have a product ID for which I need the latest entry - highest issue number.
This is the format of my XML file:
<MyProducts>
<Product code="1011234">
<ProductName>Product Name A</ProductName>
<ProductId>101</ProductId>
<IssueNumber>1234</IssueNumber>
</Product>
<Product code="1029999">
<ProductName>Product Name B</ProductName>
<ProductId>102</ProductId>
<IssueNumber>9999</IssueNumber>
</Product>
<Product code="1015678">
<ProductName>Product Name A2</ProductName>
<ProductId>101</ProductId>
<IssueNumber>5678</IssueNumber>
</Product>
</MyProducts>
I need to get the <Product> node for a given ProductId that has the highest IssueNumber. For example if the ProductId is 101 I want the third node, if it's 102, I want the second node. There are around 50 different products in the file, split over three different product ids.
I've tried a number of XPath combinations using SelectSingleNode either by using the specific ProductID and IssueNumber nodes, or by using the code attribute of the product node (which is a combination of Id and Issue) without any success.
The code currently uses the code attribute, but only because we're also passing in the issue number and I want to be able to do this without the issue number (to decrease front end maintenance) as it's always the highest issue we want.
Current code is this:
XmlNode productNode = productXml.SelectSingleNode("/MyProducts/Product[@code='" + productCode + "']");
I've used these as well, they kind of work, but select the inner nodes, not the outer Product node:
XmlNodeList productNodes = productXml.SelectNodes("/MyProducts/Product/ProductId[text()='101']");
XmlNodeList productNodes = productXml.SelectNodes("/MyProducts/Product[not (../Product/IssueNumber > IssueNumber)]/IssueNumber");
I would like to use a combination of the two, something like this:
XmlNode productNode = productXml.SelectSingleNode("/MyProducts/Product/ProductId[text()='101'] and /MyProducts/Product[not (../Product/IssueNumber > IssueNumber)]/IssueNumber");
But that returns the error "...threw an exception of type 'System.Xml.XPath.XPathException'", but I also expect it won't return the Product node anyway.
Can this even be done in a single line, or will I have to loop through the nodes to find the right one?
Use LINQ to XML:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
namespace ConsoleApplication167
{
class Program
{
const string FILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
XDocument doc = XDocument.Load(FILENAME);
var products = doc.Descendants("Product")
.OrderByDescending(x => (int)x.Element("IssueNumber"))
.GroupBy(x => (int)x.Element("ProductId"))
.Select(x => x.First())
.ToList();
Dictionary<int, XElement> dict = products
.GroupBy(x => (int)x.Element("ProductId"), y => y)
.ToDictionary(x => x.Key, y => y.FirstOrDefault());
XElement highestId = dict[101];
}
}
}
Your last idea is almost there. You need to put the two clauses inside the [] selector. There is also max(), which I think clarifies the logic. Note that max() is an XPath 2.0 function, so the expression below needs an XPath 2.0 processor; the built-in System.Xml methods such as SelectSingleNode implement XPath 1.0 only and will reject it:
/MyProducts/Product[ProductId='101'
and IssueNumber=max(/MyProducts/Product[ProductId='101']/IssueNumber)]
This selects the Product which both has id 101 and has the highest IssueNumber of all id-101-products.
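Because System.Xml evaluates XPath 1.0, which has no max(), the same selection can be expressed there with the not(... > ...) idiom from the question. A minimal sketch, assuming the file layout shown in the question and a trusted, purely numeric productId (it is concatenated into the expression unescaped); the class name LatestIssue is hypothetical.

```csharp
using System.Xml;

public static class LatestIssue
{
    // Returns the Product node for productId with the highest IssueNumber,
    // or null if no such product exists. Both conditions sit inside a single
    // predicate, so the outer Product element itself is selected.
    public static XmlNode Find(XmlDocument productXml, string productId)
    {
        string xpath =
            "/MyProducts/Product[ProductId='" + productId + "' and " +
            "not(../Product[ProductId='" + productId + "']/IssueNumber > IssueNumber)]";
        return productXml.SelectSingleNode(xpath);
    }
}
```

The not(...) clause keeps a Product only when no other Product with the same ProductId has a larger IssueNumber, which is how a maximum is usually written in XPath 1.0.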

How to do operations (avg, cnt, etc) while parsing xml (c#)?

I have the following xml:
<bookstore>
<book IMDB="11-023-2022">
<title>Hamlet 2</title>
<comments>
<user rating="2">good enough</user>
<user rating="1">didnt read it</user>
<user rating="5">didnt read it but title is good</user>
</comments>
</book>
</bookstore>
I have an AverageUserRating property which I am supposed to fill while parsing, in the following format. I also have no idea how to turn the comments into a list. I tried everything, and I can't use NuGet packages like XPath. Thank you for your help.
return xdoc.Descendants("book").Select(n => new Books()
{
IMDB = n.Attribute("IMDB").Value,
Title = n.Element("title").Value,
//Comments = (List<string>)(n.Elements("user")), ???
//AverageUserRating= ???
}).ToList();
Comments = n.Element("comments").Elements("user").Select(u => u.Value).ToList(),
Explanation:
1) Element("comments") returns the child XML element named "comments"
2) Elements("user") returns all child elements named "user"
3) .Select(u => u.Value) selects the value of every user element, which is the comment text you need
4) .ToList() converts the result into a list of strings
AverageUserRating = n.Element("comments").Elements("user").Select(u => u.Attribute("rating").Value).Select(r => Convert.ToInt32(r)).Average()
Explanation:
1) Element("comments") returns the child XML element named "comments"
2) Elements("user") returns all child elements named "user"
3) .Select(u => u.Attribute("rating").Value) selects the value of the "rating" attribute from each element
4) .Select(r => Convert.ToInt32(r)) converts the string value of the attribute into an int (note that it throws an exception if the value is not a number)
5) .Average() calculates the arithmetic mean and returns a double
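Putting the two expressions above together, a self-contained sketch might look like the following. The Books class is not shown in the question, so its shape here (property names and types) is an assumption, as is the helper name BookParser; an average of 0 for a book without ratings is also just one possible choice.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Assumed shape of the question's Books class.
public class Books
{
    public string IMDB { get; set; }
    public string Title { get; set; }
    public List<string> Comments { get; set; }
    public double AverageUserRating { get; set; }
}

public static class BookParser
{
    public static List<Books> Parse(XDocument xdoc) =>
        xdoc.Descendants("book").Select(n =>
        {
            // All <user> elements under <comments>; an empty list if there are none.
            var users = n.Elements("comments").Elements("user").ToList();
            return new Books
            {
                IMDB = (string)n.Attribute("IMDB"),
                Title = (string)n.Element("title"),
                Comments = users.Select(u => u.Value).ToList(),
                // DefaultIfEmpty(0) avoids the exception Average() throws
                // on an empty sequence.
                AverageUserRating = users.Select(u => (int)u.Attribute("rating"))
                                         .DefaultIfEmpty(0)
                                         .Average()
            };
        }).ToList();
}
```

Only System.Xml.Linq and System.Linq are used, so no extra NuGet package is required.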
Maybe you should process the original XML with XSLT to get the data you need automatically. The resulting document could then be easier to parse. Take a look at "Calculate average with xslt" as an example.
It uses HTML as the output format, but you can do the same with XML.
Another option can be to create the classes with the same structure as your original XML so then you could employ automatic deserialization. Then, use LINQ or any other way to get the stats.

Linq to XML to retrieve value based on Attribute

I have a Linq to Xml query that needs to retrieve a value based on the attribute value of a particular node. I'm trying to retrieve a list of items and one of the nodes has an attribute that I can't seem to find a way to get the value.
Here's the XML:
<codelist-items>
<codelist-item>
<code>1</code>
<name>
<narrative>Planned start</narrative>
<narrative xml:lang="fr">Début prévu</narrative>
</name>
<description>
<narrative>
The date on which the activity is planned to start, for example the date of the first planned disbursement or when physical activity starts.
</narrative>
</description>
</codelist-item>
<codelist-item>
<code>2</code>
<name>
<narrative>Actual start</narrative>
<narrative xml:lang="fr">Début réel</narrative>
</name>
<description>
<narrative>
The actual date the activity starts, for example the date of the first disbursement or when physical activity starts.
</narrative>
</description>
</codelist-item>
</codelist-items>
I'm only displaying 2 items to keep it short. And here is my Linq query to try and retrieve the value from "name/narrative" where there is a "xml:lang='fr'" attribute:
XElement xelement = XElement.Load(xmlFile);
var elements = from adt in xelement.Elements("codelist-items").Elements("codelist-item")
select new ActivityDateType
{
Code = (string)adt.Element("code"),
NameEng = (string)adt.Element("name").Element("narrative"),
NameFra = (string)adt.Element("name").Element("narrative[@xml:lang='fr']"),
Description = (string)adt.Element("description")
};
return elements;
Anyone know how to get the value for NameFra?
Thanks
You can either use LINQ's FirstOrDefault() with a predicate that filters the elements by attribute value:
NameFra = (string)adt.Element("name")
.Elements("narrative")
.FirstOrDefault(o => (string)o.Attribute(XNamespace.Xml+"lang") == "fr"),
Or use the XPathSelectElement() extension (from the System.Xml.XPath namespace) to execute your XPath expression, which already contains the attribute filtering logic:
NameFra = (string)adt.Element("name")
.XPathSelectElement("narrative[@xml:lang='fr']", nsmgr),
The latter can be simplified further to the following:
NameFra = (string)adt.XPathSelectElement("name/narrative[@xml:lang='fr']", nsmgr),
nsmgr is assumed to have been declared previously as follows:
var nsmgr = new XmlNamespaceManager(new NameTable());
nsmgr is needed because the XPath contains the prefix xml (XPathSelectElement() complained when I used the overload that accepts just the XPath string argument without a namespace manager).

How to access element in XML files with attribute values?

This is my XML file
<colleges>
<college college_name="DYPSOE">
<departments>
<department department_name="Computer" id="10">
<![CDATA[I NEED TO CHANGE THIS COMMENT!]]>
</department>
<department department_name="Machanical" id="20">
<![CDATA[I NEED TO CHANGE THIS COMMENT!]]>
</department>
</departments>
</college>
<college college_name="DYPSOET">
<departments>
<department department_name="Computer" id="10">
<![CDATA[I NEED TO CHANGE THIS COMMENT!]]>
</department>
<department department_name="Machanical" id="20">
<![CDATA[I NEED TO CHANGE THIS COMMENT!]]>
</department>
</departments>
</college>
</colleges>
I have three attribute values, college_name, department_name and id, available in the program. I want to go to the particular department node identified by these three values and change the text in its CDATA section.
I'm trying to reach the node with different queries, but have failed so far.
var node = from e in doc.Descendants("college")
where e.Attribute("college_name").ToString() == college_name
select (XElement)e.Elements("department");
foreach (XElement data in node)
{
Console.WriteLine(data); //Just to look what I got
data.Value = ""; //To change the comment section
}
This is not working at all. If you guys could suggest me the query it would help me a lot.
Presuming you want to select a single department element based on college and department names, you can find it using a query like this one:
var query = from college in doc.Descendants("college")
where (string) college.Attribute("college_name") == "DYPSOE"
from department in college.Descendants("department")
where (string) department.Attribute("department_name") == "Computer"
select department;
var element = query.Single();
You can then replace the comment like this:
element.ReplaceNodes(new XCData("new comment"));
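The question mentions a third value, id, which the query above does not use. A minimal sketch that filters on all three attributes and tolerates a missing match follows; the class and method names (DepartmentEditor, SetComment) are hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class DepartmentEditor
{
    // Replaces the content of the matching department with a CDATA section.
    // Returns false when no department matches all three attribute values.
    public static bool SetComment(XDocument doc, string collegeName,
                                  string departmentName, string id, string comment)
    {
        XElement department = doc.Descendants("college")
            .Where(c => (string)c.Attribute("college_name") == collegeName)
            .Descendants("department")
            .FirstOrDefault(d => (string)d.Attribute("department_name") == departmentName
                              && (string)d.Attribute("id") == id);

        if (department == null) return false;
        department.ReplaceNodes(new XCData(comment));
        return true;
    }
}
```

Casting the XAttribute to string, rather than calling ToString() as in the question, yields the attribute's value (or null when the attribute is absent); ToString() would return the whole name="value" pair, which never equals the bare name.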
I think you can use System.Xml.XmlReader to read the nodes' attribute and write it with System.Xml.XmlWriter
How to: Parse XML with XmlReader has the example of parsing and writing.

Parsing data from a child element in an xml document

I'm having difficulty parsing a sub element from an xml document.
The document contains a series of elements containing pricing information that I need to extract the Euro price from. No matter what I do, I can't seem to extract the data that I need. The result is always null.
<departure>
<pricing xmlns="http://website.com/api/feeds/xmlns/20110926/">
<price age_group="Adult" label="1 Adult" max_age="100" max_passengers="100" min_age="12" min_passengers="1">
<USD>4249.00</USD>
<AUD>4299.00</AUD>
<CHF>3649.00</CHF>
<GBP>2749.00</GBP>
<NZD>5399.00</NZD>
<CAD>4399.00</CAD>
<EUR>3249.00</EUR> <------------this is what I need to parse
</price>
</pricing>
<pricing xmlns="http://website.com/api/feeds/xmlns/20110926/">
<price age_group="Adult" label="1 Adult" max_age="100" max_passengers="100" min_age="12" min_passengers="1">
<USD>4249.00</USD>
<AUD>4299.00</AUD>
<CHF>3649.00</CHF>
<GBP>2749.00</GBP>
<NZD>5399.00</NZD>
<CAD>4399.00</CAD>
<EUR>3249.00</EUR> <------------this is what I need to parse
</price>
</pricing>
</departure>
XmlNodeList departureNodes = xmlDoc.GetElementsByTagName("departure");
if (departureNodes.Count > 0)
{
foreach (XmlElement element in departureNodes)
{
string priceInEUR = xmlElement.SelectSingleNode("pricing/price/EUR"); // returns null
string priceInEUR2 = xmlElement.SelectSingleNode("//pricing/price/EUR"); // also returns null
}
}
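I recommend using XDocument and LINQ to XML. Note that the pricing elements declare a default namespace, so the element name has to be qualified with it:
using System.Collections.Generic;
using System.Xml.Linq;
// The namespace declared on each <pricing> element also applies to <EUR>.
XNamespace ns = "http://website.com/api/feeds/xmlns/20110926/";
IEnumerable<XElement> prices = doc.Root.Descendants(ns + "EUR");
foreach (XElement t in prices)
{
string priceInEUR = t.Value;
}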
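The XPath queries in the question return null for the same reason: pricing, price and EUR live in the default namespace declared on each pricing element, and an unprefixed name in XPath only matches elements in no namespace. A minimal sketch of the XmlDocument route, assuming the sample layout above; the prefix "p" and the class name EuroPrices are arbitrary choices made for illustration.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class EuroPrices
{
    // Collects the text of every EUR element under departure/pricing/price.
    public static List<string> Read(XmlDocument xmlDoc)
    {
        // Map a prefix of our choosing to the namespace used in the document.
        var nsmgr = new XmlNamespaceManager(xmlDoc.NameTable);
        nsmgr.AddNamespace("p", "http://website.com/api/feeds/xmlns/20110926/");

        var result = new List<string>();
        // departure itself is in no namespace, so it carries no prefix.
        foreach (XmlNode eur in xmlDoc.SelectNodes("//departure/p:pricing/p:price/p:EUR", nsmgr))
        {
            result.Add(eur.InnerText);
        }
        return result;
    }
}
```

The prefix used in the XPath does not have to match anything in the document; only the namespace URI it is mapped to matters.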
Using the extension methods I have documented here: http://searisen.com/xmllib/extensions.wiki
you can currently do the following (presuming departure is a child of the root node):
decimal[] euros = XElement.Load(xmlFile)
.GetEnumerable("departure/pricing",
x => x.Get("price/EUR", decimal.MinValue))
.ToArray();
This gets all the euros, the two you have listed.
