How to do operations (avg, count, etc.) while parsing XML (C#)?

I have the following xml:
<bookstore>
<book IMDB="11-023-2022">
<title>Hamlet 2</title>
<comments>
<user rating="2">good enough</user>
<user rating="1">didnt read it</user>
<user rating="5">didnt read it but title is good</user>
</comments>
</book>
</bookstore>
I have an AverageUserRating property which I am supposed to fill while parsing, in the following format. I also have no idea how to turn the comments into a list. I have tried everything, and I can't use NuGet packages such as XPath. Thank you for your help.
return xdoc.Descendants("book").Select(n => new Books()
{
IMDB = n.Attribute("IMDB").Value,
Title = n.Element("title").Value,
//Comments = (List<string>)(n.Elements("user")), ???
//AverageUserRating= ???
}).ToList();

Comments = n.Element("comments").Elements("user").Select(u => u.Value).ToList(),
Explanation:
1) Element("comments") returns the child XML element named "comments"
2) Elements("user") returns all child elements named "user"
3) .Select(u => u.Value) selects the value of every user element, which is the comment text that you need
4) .ToList() converts the result into a list of strings
AverageUserRating = n.Element("comments").Elements("user").Select(u => u.Attribute("rating").Value).Select(r => Convert.ToInt32(r)).Average()
Explanation:
1) Element("comments") returns the child XML element named "comments"
2) Elements("user") returns all child elements named "user"
3) .Select(u => u.Attribute("rating").Value) selects the value of the "rating" attribute from every element
4) .Select(r => Convert.ToInt32(r)) converts the string value of the attribute into an Int32 (note that it throws an exception if the value is not a number)
5) .Average() calculates the arithmetic mean and returns a double
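Putting the two expressions together, a minimal sketch could look like the following. It projects into an anonymous type instead of the asker's Books class (which is not shown in the question), and the null handling and DefaultIfEmpty guard are additions that the answer above does not include.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sample document from the question, inlined so the sketch is self-contained.
var xdoc = XDocument.Parse(@"<bookstore>
  <book IMDB=""11-023-2022"">
    <title>Hamlet 2</title>
    <comments>
      <user rating=""2"">good enough</user>
      <user rating=""1"">didnt read it</user>
      <user rating=""5"">didnt read it but title is good</user>
    </comments>
  </book>
</bookstore>");

var books = xdoc.Descendants("book").Select(n =>
{
    // Materialize the <user> elements once; both properties are derived from them.
    // A book without a <comments> element gets an empty list (assumption).
    List<XElement> users = n.Element("comments")?.Elements("user").ToList()
                           ?? new List<XElement>();
    return new
    {
        IMDB = (string)n.Attribute("IMDB"),
        Title = (string)n.Element("title"),
        Comments = users.Select(u => u.Value).ToList(),
        // The explicit (int) conversion on XAttribute parses the attribute text.
        // DefaultIfEmpty(0) keeps Average() from throwing when there are no ratings.
        AverageUserRating = users.Select(u => (int)u.Attribute("rating"))
                                 .DefaultIfEmpty(0)
                                 .Average()
    };
}).ToList();

foreach (var b in books)
    Console.WriteLine($"{b.Title}: {b.Comments.Count} comments, average rating {b.AverageUserRating}");
```

Reporting 0 as the average for a book with no ratings is only one possible choice; a nullable double may be a better fit if "no ratings" has to be told apart from a real average.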

Maybe you should process the original XML with XSLT so that the data you need is computed for you. The resulting document could then be easier to parse. See "Calculate average with xslt" for an example.
It uses HTML as the output format, but you can do the same with XML.

Another option can be to create the classes with the same structure as your original XML so then you could employ automatic deserialization. Then, use LINQ or any other way to get the stats.

Related

c# extract value from nextnode

I have the following xml part and am trying to extract the value where key is known. The example below is a snippet, from a larger xml that contains 1000's of nodes.
<?xml version="1.0" encoding="utf-8"?>
<DictionarySerializer>
<item>
<key>key1</key>
<value>CONTENT1</value>
</item>
<item>
<key>key2</key>
<value>CONTENT2</value>
</item>
</DictionarySerializer>
Assume the above is in a string called xml.
Then with
XDocument.Parse(xml)
.Descendants("key")
.Where(x => (string)x.Value == "key1")
.FirstOrDefault().NextNode.ToString()
I can get the string <value>CONTENT1</value>, but I simply cannot get my head around how to get the value of the value node, so to say.
I am afraid it is super simple, and I am just stuck in a caffeine loop :-)
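XDocument.Parse(xml)
.Descendants("key")
.Where(x => x.Value == "key1")
.FirstOrDefault()?.ElementsAfterSelf("value").FirstOrDefault()?.Value
Calling .Value on the key element itself would only return "key1", so read the .Value property of the <value> element that follows it instead of calling ToString() on NextNode.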
If you want to get all keys and values from all 1000 elements of the XML, you can use the following (here xml is assumed to be the already parsed XDocument):
Dictionary<string, string> elements = new Dictionary<string, string>();
xml.Root.Elements().ToList().ForEach(xmlElement =>
{
elements.Add(xmlElement.Descendants("key").First().Value,
xmlElement.Descendants("value").First().Value);
});
So, the elements dictionary will contain all of your 1000 nodes.
Try to cast NextNode to XElement and get Value from it.
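A minimal sketch of that cast, using a shortened copy of the sample XML from the question. The `as` cast and the null checks are defensive additions, not part of the suggestion above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xml = @"<DictionarySerializer>
  <item><key>key1</key><value>CONTENT1</value></item>
  <item><key>key2</key><value>CONTENT2</value></item>
</DictionarySerializer>";

var keyElement = XDocument.Parse(xml)
    .Descendants("key")
    .FirstOrDefault(x => x.Value == "key1");

// NextNode is typed as XNode, which has no Value property.
// `as XElement` yields null when the next node is not an element,
// e.g. a whitespace text node kept by LoadOptions.PreserveWhitespace.
var content = (keyElement?.NextNode as XElement)?.Value;

Console.WriteLine(content ?? "(no value element found)");
```

With the default parse options, insignificant whitespace between elements is not kept, so NextNode of a key element is the value element in this document.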
If you can use XPath expressions:
string expression = @"//item[key='key1']/value";
XmlNodeList nodeList = xmlDocument.SelectNodes(expression);
This gives you the value node(s) of the items whose key is key1. Then read the value of the desired node.
I believe that with XDocument you can also try (this needs using System.Xml.XPath):
string output = xDocument.XPathSelectElement(expression)?.Value;

Morelinq ExceptBy using several specific element

There are 2 xml files
First xml file contains:
<Prices>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>180</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
</Prices>
and the second xml file:
<Prices>
<Price>
<SalesOrg>700</SalesOrg>
<AreaOfPricing>D20</AreaOfPricing>
<ProductId>20228090</ProductId>
<EffectiveDate>2015-05-11T00:00:00+7</EffectiveDate>
<DistributorPriceFibrate>200</DistributorPriceFibrate>
<CustomerPriceFibrate>20</CustomerPriceFibrate>
<CustomerPriceInDozen>30</CustomerPriceInDozen>
<CustomerPriceinPC>80.00</CustomerPriceinPC>
<CompanyID>001</CompanyID>
<ValidTo>2999-12-31T00:00:00+7</ValidTo>
<UOM>CS</UOM>
<Currency>IDR</Currency>
</Price>
</Prices>
What I want is to use MoreLinq's ExceptBy(), or a custom class implementing IEqualityComparer with LINQ's Except(), to return something like this (comparing the 1st XML file against the 2nd, even though the third Price element in the 1st file has a different DistributorPriceFibrate value):
<Prices/>
Since Except() compares all values in the Price node, I want to compare only the <ProductId> and <EffectiveDate> elements.
If they are the same, the result should be the empty tag <Prices/>. If not, it should return the Price elements from the 1st XML file whose ProductId and EffectiveDate do not match any Price in the 2nd XML file.
What I've done so far is take the distinct items of the 1st XML file:
var distinctItemsonxmldoc1 =
xmldoc1
.Descendants("Price")
.DistinctBy(element => new
{
ProductId = (string)element.Element("ProductId"),
EffectiveDate = (string)element.Element("EffectiveDate")
});
var afterdistinctxmldoc1 = new XElement("Prices");
foreach (var a in distinctItemsonxmldoc1 )
{
afterdistinctxmldoc1.Add(a);
}
and then I use Except() to compare the 2 files:
var afterexcept = afterdistinctxmldoc1.Descendants("Price").Cast<XNode>().Except(xmldoc2.Descendants("Price").Cast<XNode>(), new XNodeEqualityComparer());
but it compares all element values in the Price node.
How can I use ExceptBy() on specific elements?
Or maybe a custom IEqualityComparer?
Thanks in advance.
EDIT
Already solved. See the answer by @dbc.
To confirm I understand your question: given two XML documents, you want to enumerate through instances of each Price element in the first document with distinct values for the child elements ProductId and EffectiveDate, skipping all those whose ProductId and EffectiveDate match a Price element in the second document, using MoreLinq.
In that case, you can do:
var diff = xmldoc1.Descendants("Price").ExceptBy(xmldoc2.Descendants("Price"),
e => new { ProductId = e.Elements("ProductId").Select(p => p.Value).FirstOrDefault(), EffectiveDate = e.Elements("EffectiveDate").Select(p => p.Value).FirstOrDefault() });
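For readers who cannot add MoreLinq, the same idea can be sketched with the ExceptBy method that ships with System.Linq since .NET 6. This is a different method from the MoreLinq one used above: it takes the keys of the second sequence rather than the sequence itself. The sample data is shortened, and the third Price (ProductId 99999999) is a made-up entry added only so that the result is not empty.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var xmldoc1 = XDocument.Parse(@"<Prices>
  <Price><ProductId>20228090</ProductId><EffectiveDate>2015-05-11</EffectiveDate><DistributorPriceFibrate>200</DistributorPriceFibrate></Price>
  <Price><ProductId>20228090</ProductId><EffectiveDate>2015-05-11</EffectiveDate><DistributorPriceFibrate>180</DistributorPriceFibrate></Price>
  <Price><ProductId>99999999</ProductId><EffectiveDate>2015-06-01</EffectiveDate><DistributorPriceFibrate>150</DistributorPriceFibrate></Price>
</Prices>");
var xmldoc2 = XDocument.Parse(@"<Prices>
  <Price><ProductId>20228090</ProductId><EffectiveDate>2015-05-11</EffectiveDate><DistributorPriceFibrate>200</DistributorPriceFibrate></Price>
</Prices>");

// Composite key as a value tuple; tuples compare member by member,
// so DistributorPriceFibrate plays no part in the comparison.
static (string ProductId, string EffectiveDate) Key(XElement e) =>
    ((string)e.Element("ProductId"), (string)e.Element("EffectiveDate"));

// Built-in ExceptBy: keep the Price elements of the first document whose key
// is not among the keys of the second document (set semantics, so duplicate
// keys within the first document are also collapsed).
var diff = xmldoc1.Descendants("Price")
    .ExceptBy(xmldoc2.Descendants("Price").Select(e => Key(e)), e => Key(e))
    .ToList();

// Adding elements that already have a parent copies them into the new tree.
var result = new XElement("Prices", diff);
Console.WriteLine(result);
```

With the two files from the question, every key of the first file also occurs in the second, so the result would be the empty <Prices /> element.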

How to delete certain root from xml file?

My '.xml' file looks this way:
<?xml version="1.0" encoding="utf-8"?>
<Requestes>
<Single_Request num="1">
<numRequest>1</numRequest>
<IDWork>1</IDWork>
<NumObject>1</NumObject>
<lvlPriority>Высокий</lvlPriority>
</Single_Request>
<Single_Request num="2">
<numRequest>2</numRequest>
<IDWork>2</IDWork>
<NumObject>2</NumObject>
<lvlPriority>Средний</lvlPriority>
</Single_Request>
<Periodic_Request num="1">
<numRequest>3</numRequest>
<IDWork>23</IDWork>
<pFrequency>23</pFrequency>
<lvlPriority>Низкий</lvlPriority>
<time_service>23</time_service>
<time_last_service>23</time_last_service>
<relative_time>23</relative_time>
</Periodic_Request>
</Requestes>
So I need to delete the Single_Request whose num attribute value equals sTxtBlock_numRequest.Text. I have tried to do it this way:
XDocument doc = XDocument.Load(FilePath);
IEnumerable<XElement> sRequest = doc.Root.Descendants("Single_Request").Where(
t => t.Attribute("num").Value =="sTxtBlock_numRequest.Text"); //I'm sure, that problem is here
sRequest.Remove();
doc.Save(FilePath);
Unfortunately, nothing happened, and I don't know how to solve the problem.
That is why I am looking forward to your help.
You are comparing the attribute value with the string literal "sTxtBlock_numRequest.Text". You should pass the value of the text box instead:
doc.Root.Elements("Single_Request")
.Where(t => (string)t.Attribute("num") == sTxtBlock_numRequest.Text)
.Remove();
Note - it's better to use Elements when you are getting Single_Request elements of root, because Descendants will search whole tree, instead of looking at direct children only. Also you can call Remove() without saving query to local variable.
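The removal can be tried without a UI or a file with a sketch like this one. The variable requested is a hypothetical stand-in for sTxtBlock_numRequest.Text, and the Trim() call is an extra precaution against stray whitespace in user input, not something the answer requires.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

var doc = XDocument.Parse(@"<Requestes>
  <Single_Request num=""1""><numRequest>1</numRequest></Single_Request>
  <Single_Request num=""2""><numRequest>2</numRequest></Single_Request>
  <Periodic_Request num=""1""><numRequest>3</numRequest></Periodic_Request>
</Requestes>");

// Hypothetical stand-in for sTxtBlock_numRequest.Text.
string requested = " 1 ".Trim();

// Elements() looks only at direct children of the root.
// The Remove() extension snapshots the matches before detaching them,
// so it is safe to call directly on the query.
doc.Root?.Elements("Single_Request")
    .Where(t => (string)t.Attribute("num") == requested)
    .Remove();

Console.WriteLine(doc);
```

The Periodic_Request with num="1" is left alone because the query filters on the element name first. In the real program, doc.Save(FilePath) would follow to write the change back.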

LINQ C#: Get the Distinct TAG NAMES (Not Values) that's common across all child XML elements

<root>
<abc:Description abc:about="XXX.XXX_CSData-2">
<xxx:Data.Curve abc:resource="XXX.XXX"/>
<xxx:Data.y2AData abc:datatype="#int">27</xxx:Data.y2AData>
<xxx:Data.y1AData abc:datatype="#int">-27</xxx:Data.y1AData>
<xxx:Data.xAData abc:datatype="#int">60</xxx:Data.xAData>
<xxx:IdentifiedObject.description abc:datatype="#string">SOME NAME</xxx:IdentifiedObject.description>
<xxx:IdentifiedObject.name abc:datatype="#string">XXX_CCC.XX</xxx:IdentifiedObject.name>
<abc:type abc:resource="http://iec.ch/TC57/2008/xxx-schema-xxx13#Data"/>
</abc:Description>
<abc:Description abc:about="XXX.XXX">
<xxx:ConnectivityNode.MemberOf_EquipmentContainer abc:resource="XXX.XXX"/>
<xxx:IdentifiedObject.description abc:datatype="#string">XXX.XXX</xxx:IdentifiedObject.description>
<xxx:IdentifiedObject.name abc:datatype="#string">XXX.XXX</xxx:IdentifiedObject.name>
<abc:type abc:resource="http://iec.ch/TC57/2008/xxx-schema-xxx13#ConnectivityNode"/>
<xxx:ConnectivityNode.Terminals abc:resource="XXX.XXX"/>
<xxx:ConnectivityNode.Terminals abc:resource="XXX.XXX"/>
<xxx:ConnectivityNode.Terminals abc:resource="JXXX.XXX"/>
<xxx:ConnectivityNode.Terminals abc:resource="JXXX.XXX"/>
</abc:Description>
</root>
Hello all,
In the above XML snippet, the tags "xxx:IdentifiedObject.description", "xxx:IdentifiedObject.name" and "abc:type" are common to the two child nodes.
I want to write a LINQ query that returns the tag names that are common to (appear at least once in) all the child elements. That is, I need the tag names, not the values of the tags:
1) "xxx:IdentifiedObject.description",
2) "xxx:IdentifiedObject.name" and
3) "abc:type"
It sounds like you want something like:
// Given x and y as the parent elements you're interested in
var commonNames = x.Elements()
.Select(e => e.Name)
.Intersect(y.Elements().Select(e => e.Name));
That will give you an IEnumerable<XName>.
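To extend this from two parents to any number of them, one option is to fold Intersect over all parents with Aggregate, as in this sketch. The namespace URIs are placeholders, because the snippet in the question does not show its xmlns declarations, and the sample is shortened.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Placeholder namespace URIs (assumption); the question omits the real ones.
var root = XElement.Parse(@"<root xmlns:abc=""urn:example:abc"" xmlns:xxx=""urn:example:xxx"">
  <abc:Description>
    <xxx:Data.Curve/>
    <xxx:IdentifiedObject.description>SOME NAME</xxx:IdentifiedObject.description>
    <xxx:IdentifiedObject.name>XXX_CCC.XX</xxx:IdentifiedObject.name>
    <abc:type/>
  </abc:Description>
  <abc:Description>
    <xxx:IdentifiedObject.description>XXX.XXX</xxx:IdentifiedObject.description>
    <xxx:IdentifiedObject.name>XXX.XXX</xxx:IdentifiedObject.name>
    <abc:type/>
    <xxx:ConnectivityNode.Terminals/>
    <xxx:ConnectivityNode.Terminals/>
  </abc:Description>
</root>");

// One set of child names per parent, then intersect them pairwise.
// Aggregate without a seed throws on an empty sequence, so this assumes
// the root has at least one child element.
var commonNames = root.Elements()
    .Select(parent => parent.Elements().Select(e => e.Name).Distinct())
    .Aggregate((acc, names) => acc.Intersect(names))
    .ToList();

foreach (var name in commonNames)
    Console.WriteLine(name); // XName.ToString() gives {namespaceUri}localName
```

XName compares by namespace and local name together, so two tags with the same local name in different namespaces are not treated as common.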

C# Linq XML pull out nodes from document

I'm trying to use LINQ to XML to select a number of nodes and their children, but I'm getting terribly confused!
In the example XML below I need to pull out all the <MostWanted> and all the <Wanted> nodes with their child nodes, but without the other nodes in between the MostWanted and Wanted nodes.
This is because each MostWanted can be followed by any number of Wanted nodes, and the Wanted nodes relate to the preceding MostWanted.
I'm even confusing myself typing this up!
How can I do this in C#?
<root>
<top>
<NotWanted3>
</NotWanted3>
<MostWanted>
<UniqueKey>1</UniqueKey>
<QuoteNum>1</QuoteNum>
</MostWanted>
<NotWanted2>
<UniqueKey>1</UniqueKey>
<QuoteNum>1</QuoteNum>
</NotWanted2>
<NotWanted1>
<UniqueKey>0001</UniqueKey>
</NotWanted1>
<Wanted>
<Seg>
<SegNum>1</SegNum>
</Seg>
</Wanted>
<Wanted>
<Seg>
<SegNum>2</SegNum>
</Seg>
</Wanted>
<NotWanted>
<V>x</V>
</NotWanted>
<NotWanted3>
</NotWanted3>
<MostWanted>
<UniqueKey>1</UniqueKey>
<QuoteNum>1</QuoteNum>
</MostWanted>
<NotWanted2>
<UniqueKey>1</UniqueKey>
<QuoteNum>1</QuoteNum>
</NotWanted2>
<NotWanted1>
<UniqueKey>0002</UniqueKey>
</NotWanted1>
<Wanted>
<Seg>
<SegNum>3</SegNum>
</Seg>
</Wanted>
<Wanted>
<Seg>
<SegNum>4</SegNum>
</Seg>
</Wanted>
<NotWanted>
<V>x</V>
</NotWanted>
</top>
</root>
Why don't you just use:
XName wanted = "Wanted";
XName mostWanted = "MostWanted";
var nodes = doc.Descendants()
.Where(x => x.Name == wanted || x.Name == mostWanted);
That will retrieve every element called "Wanted" or "MostWanted". From each of those elements you can get to the child elements etc.
If this isn't what you're after, please clarify your question.
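If the goal is also to keep each Wanted with the MostWanted that precedes it, the query above can feed a single pass in document order, as in this sketch. The grouping step goes beyond what the answer states, the tuple list is just one possible way to hold the result, and the sample XML is a shortened version of the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

var doc = XDocument.Parse(@"<root><top>
  <MostWanted><UniqueKey>1</UniqueKey></MostWanted>
  <NotWanted1><UniqueKey>0001</UniqueKey></NotWanted1>
  <Wanted><Seg><SegNum>1</SegNum></Seg></Wanted>
  <Wanted><Seg><SegNum>2</SegNum></Seg></Wanted>
  <NotWanted><V>x</V></NotWanted>
  <MostWanted><UniqueKey>1</UniqueKey></MostWanted>
  <Wanted><Seg><SegNum>3</SegNum></Seg></Wanted>
  <Wanted><Seg><SegNum>4</SegNum></Seg></Wanted>
</top></root>");

XName wanted = "Wanted";
XName mostWanted = "MostWanted";

// Descendants() yields elements in document order, so one pass can attach
// each Wanted to the most recent MostWanted seen before it.
var groups = new List<(XElement Head, List<XElement> Items)>();
foreach (var node in doc.Descendants().Where(x => x.Name == wanted || x.Name == mostWanted))
{
    if (node.Name == mostWanted)
        groups.Add((node, new List<XElement>()));
    else if (groups.Count > 0)
        groups[groups.Count - 1].Items.Add(node); // Wanted before any MostWanted is skipped
}

foreach (var (head, items) in groups)
{
    var segNums = items.Select(i => (string)i.Descendants("SegNum").FirstOrDefault());
    Console.WriteLine("MostWanted with segments: " + string.Join(", ", segNums));
}
```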
