How to get the count of repeated XML element values in WPF (C#)

In the figure above, the element Chandru is repeated twice,
so I have to count the repeated elements,
but I don't know how to get the count of a repeated element.
Please help me.
Here is the code I wrote:
public XML_3()
{
this.InitializeComponent();
XmlDocument doc = new XmlDocument();
doc.Load("D:/student_2.xml");
XmlNodeList student_list = doc.GetElementsByTagName("Student");
foreach (XmlNode node in student_list)
{
XmlElement student = (XmlElement)node;
string sname = student.GetElementsByTagName("Chandru")[0].InnerText;
string fname = student.GetElementsByTagName("FName")[0].InnerText;
string id = student.GetElementsByTagName("Chandru")[0].Attributes["ID"].InnerText;
Window.Content = sname + fname + id;
}
}

var count = student.GetElementsByTagName("Chandru").Count;

I think it would be easier using LINQ to XML and X-Classes:
var doc = XDocument.Load("D:/student_2.xml");
var results = (from s in doc.Root.Elements()
group s by s.Name into g
select new { Name = g.Key, Count = g.Count() }).ToList();
This will give you a list of items with two properties each: Name and Count. If you only want the names that occur more than once, add where g.Count() > 1 between the group and select clauses.
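As a rough sketch of the filtered version described above: the sample XML in the usage note is invented, since the original file is not shown, and RepeatedElements / CountRepeated are hypothetical names. It assumes the repeated elements are direct children of the root, as in the query above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RepeatedElements
{
    // Maps element name -> occurrence count, keeping only names that
    // appear more than once among the direct children of the root.
    public static Dictionary<string, int> CountRepeated(string xml)
    {
        var doc = XDocument.Parse(xml);
        return (from s in doc.Root.Elements()
                group s by s.Name.LocalName into g
                where g.Count() > 1          // drop names that occur only once
                select g)
               .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For example, CountRepeated("<Root><Chandru/><FName/><Chandru/></Root>") should return a single entry, Chandru with a count of 2.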

I am not a C# programmer, but I can give you the algorithm, and you can then try to apply it in your program.
Define an array of structs; each struct must have two fields: ElmntName and NbOccurance.
struct Elmnt
{
    public string ElmntName;
    public int NbOccurance;
}
Then, for each element you read, search your array. If the element is not found, save it as a new entry with NbOccurance = 1 (this is its first occurrence).
Otherwise, if it is found, increment its number of occurrences.
When you finish reading your XML, you will have a list containing the names of the elements and their occurrence counts.
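In C#, the same counting idea is usually written with a Dictionary instead of an array of structs. The following is only a sketch of that substitution: ElementCounter and CountElementNames are made-up names, and it counts every element in the document (including the root), which may be more than you need.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class ElementCounter
{
    // Walks every element in the document and tallies how often each name occurs.
    public static Dictionary<string, int> CountElementNames(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var counts = new Dictionary<string, int>();
        // "*" matches every element in the document.
        foreach (XmlNode node in doc.GetElementsByTagName("*"))
        {
            // seen stays 0 when the name has not been stored yet.
            counts.TryGetValue(node.Name, out int seen);
            counts[node.Name] = seen + 1;
        }
        return counts;
    }
}
```

Afterwards, any entry whose value is greater than 1 is a repeated element.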

Related

How to check the attribute value string from linq in c#

I have a string value in a column of a database table:
<Attributes><ProductAttribute ID="322"><ProductAttributeValue><Value>782</Value></ProductAttributeValue></ProductAttribute></Attributes>
There are multiple columns with the same format.
Now I need to check ProductAttributeValue and get the data using LINQ.
Currently I am doing it like this:
var id = 782;
var str = "<Attributes><ProductAttribute ID=\"322\"><ProductAttributeValue><Value>" + id + "</Value></ProductAttributeValue></ProductAttribute></Attributes>";
var value = sometable.Where(x => x.valueString == str).FirstOrDefault();
Is there any way to get this directly with LINQ?
This can be done using LINQ to XML.
using System.Linq;
using System.Xml.Linq;
...
var id = "Value To Find";
var str = "<Attributes><ProductAttribute ID=\"322\"><ProductAttributeValue><Value>" + id + "</Value></ProductAttributeValue></ProductAttribute></Attributes>";
var xml = XDocument.Parse(str);
var val = xml
.Element("Attributes")
.Element("ProductAttribute")
.Element("ProductAttributeValue")
.Element("Value")?.Value;
Since there is only one of each element in this XML structure, you can use Element; if there are multiple, you can use Elements and operate on them as a collection.
You can filter elements like usual using Where and other extension methods.
var valToFind = "782";
var val = xml
.Element("Attributes")
.Elements("ProductAttribute")
.Where(node => node
.Element("ProductAttributeValue")
?.Element("Value")
?.Value == valToFind
)
.FirstOrDefault();
The above will find the ProductAttribute node that has a ProductAttributeValue Value equal to the valToFind. valToFind is a string for quick comparison against the xml string value.
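To tie this back to the table lookup in the question, here is a sketch that parses each stored string and checks its Value elements. It assumes the column values are already in memory as strings; a database LINQ provider would most likely not be able to translate XDocument.Parse into SQL. AttributeLookup, HasValue and FirstMatch are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttributeLookup
{
    // True when the XML text contains a ProductAttributeValue/Value equal to valueToFind.
    public static bool HasValue(string valueString, string valueToFind)
    {
        return XDocument.Parse(valueString)
            .Descendants("ProductAttributeValue")
            .Elements("Value")
            .Any(v => v.Value == valueToFind);
    }

    // Returns the first stored string that contains the value, or null when none does.
    public static string FirstMatch(IEnumerable<string> columnValues, string valueToFind)
    {
        return columnValues.FirstOrDefault(s => HasValue(s, valueToFind));
    }
}
```

This avoids rebuilding the whole XML string just to compare it, so it still works when the ID attribute or the surrounding whitespace differs. Note that XDocument.Parse throws on malformed XML.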

HTMLAgilityPack selects nodes from first iteration through divs

I'm trying to use HTMLAgilityPack to parse some website for the first time. Everything works as expected but only for first iteration. On each iteration I get unique div with its data, but SelectNodes() always gets data from first iteration.
The code below shows the problem:
all the properties of each station get their values from the first iteration.
static void Main(string[] args)
{
List<Station> stations = new List<Station>();
wClient = new WebClient();
wClient.Proxy = null;
wClient.Encoding = encode;
for (int i = 1; i <= 1; i++)
{
HtmlDocument html = new HtmlDocument();
string link = string.Format("http://energybase.ru/powerPlant/index?PowerPlant_page={0}&pageSize=20&q=/powerPlant", i);
html.LoadHtml(wClient.DownloadString(link));
var stationList = html.DocumentNode.SelectNodes("//div[@class='items']").First().ChildNodes.Where(x=>x.Name=="div").ToList();//get list of nodes with PowerStation Data
foreach (var item in stationList) //each iteration returns Item with unique InnerHTML
{
Station st = new Station();
st.Name = item.SelectNodes("//div[@class='col-md-20']").First().SelectNodes("//div[@class='name']").First().ChildNodes["a"].InnerText;//gets name from first iteration
st.Url = item.SelectNodes("//div[@class='col-md-20']").First().SelectNodes("//div[@class='name']").First().ChildNodes["a"].Attributes["href"].Value;//gets url from first iteration and so on
st.Company = item.SelectNodes("//div[@class='col-md-20']").First().SelectNodes("//div[@class='name']").First().ChildNodes["small"].ChildNodes["em"].ChildNodes["a"].InnerText;
stations.Add(st);
}
}
}
Maybe I am missing some of the essentials of OOP?
Your code can be greatly simplified by using the full power of XPath.
var stationList = html.DocumentNode.SelectNodes("//div[@class='items']/div");
// XPath-expression may be so: "//div[@class='items'][1]/div"
// where [1] means first node
foreach (var item in stationList)
{
Station st = new Station();
st.Name = item.SelectSingleNode("div[@class='col-md-20']/div[@class='name']/a").InnerText;
st.Url = item.SelectSingleNode("div[@class='col-md-20']/div[@class='name']/a").Attributes["href"].Value;
string rawText = item.SelectSingleNode("div[@class='col-md-20']/div[@class='name']/small/em").InnerText;
st.Company = HttpUtility.HtmlDecode(rawText.Trim());
stations.Add(st);
}
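Your mistake was using the XPath descendant axis, //div. An expression that starts with // searches the whole document from the root, no matter which node you call it on, so every iteration returned the first match in the document. Use a path relative to the current node instead.
Even better, rewrite the code like this: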
var divName = item.SelectSingleNode("div[@class='col-md-20']/div[@class='name']");
var nodeA = divName.SelectSingleNode("a");
st.Name = nodeA.InnerText;
st.Url = nodeA.Attributes["href"].Value;
string rawText = divName.SelectSingleNode("small/em").InnerText;
st.Company = HttpUtility.HtmlDecode(rawText.Trim());
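The difference between an absolute and a relative XPath can be reproduced without HtmlAgilityPack. The sketch below uses the built-in System.Xml classes and an invented two-item document purely for illustration; XPathScope and NameFromSecondItem are hypothetical names. HtmlAgilityPack appears to follow the same rule, which is what the answer above relies on.

```csharp
using System.Xml;

public static class XPathScope
{
    // Invented sample document: two <item> nodes, each with its own <name>.
    const string Xml =
        "<items>" +
        "<item><name>first</name></item>" +
        "<item><name>second</name></item>" +
        "</items>";

    // Evaluates xpath with the second <item> as the context node
    // and returns the text of the <name> element it selects.
    public static string NameFromSecondItem(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlNode second = doc.SelectNodes("/items/item")[1];
        return second.SelectSingleNode(xpath).InnerText;
    }
}
```

NameFromSecondItem("//name") returns "first" because // restarts from the document root, while NameFromSecondItem("name") and NameFromSecondItem(".//name") both return "second" because they stay inside the current item.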
This article contains some good examples of various aspects of the HTML Agility Pack.
Have a look at this article; it will give you a quick start.

c# xml parsing - conversion from php

I currently have my application/project set up in PHP, and I am trying to get it working in C# so I can build an application around it.
I have come across some parts of the code that I am looking for help with.
XML data from: http://api.eve-central.com/api/marketstat?usesystem=30000142&hours=48&typeid=34&typeid=456
Above is XML data from a certain system containing 2 typeids (same as the other XML), again this will be around 100 items at a time.
I am using this code at the moment:
XDocument doc = XDocument.Load("http://api.eve-central.com/api/marketstat?typeid=34&typeid=35&usesystem=30000142");
var id = from stats in doc.Root.Elements("marketstat")
from type in stats.Elements("type")
select new
{
typeID = type.Attribute("id").Value
};
foreach (var itemids in id)
{
kryptonListBox4.Items.Add(itemids.typeID);
};
Which populates the ListBox as 34 and 456.
What I need is to be able to add the other xml data such as min sell and max buy
I can get the first min sell like this:
string minSell = doc.Descendants("sell")
.First()
.Element("min")
.Value;
But I need to have the minsell in relation to the typeID and being able to work with the data.
Second Problem
XML data from http://api.eve-marketdata.com/api/item_history2.xml?char_name=demo&days=10&region_ids=10000002&type_ids=34,456
Above is XML data from a certain region and contains 2 type_ids (this will be a much larger list when completed around 100 items at a time).
I have tried to use similar code as above but I cannot get it to return the correct data.
I need to be able to get the volume in total for each typeid
In PHP I use this:
foreach ($xml -> result -> rowset-> row as $row)
{
$id = (string)$row['typeID'];
$volume = $row['volume'];
if (!isset($volumes[$id]))
{
$volumes[$id] = 0;
}
$volumes[$id] = $volumes[$id] + $volume;
}
Any help would be greatly appreciated!
//Edit: Looks like I can use
var vRow = from xmlRow in marketstats.Descendants("emd").Descendants("result").Descendants("rowset").Descendants("row")
select xmlRow;
for the 2nd problem but I cannot seem to get the multidimensional array to work
For your second problem, if my understanding is right, this should be close to what you need.
var strxml = File.ReadAllText(@"D:\item_history2.xml");
var xml = XElement.Parse(strxml);
var typeIDs = (from obj in xml.Descendants("row")
select obj).Select(o => o.Attribute("typeID").Value).Distinct();
Dictionary<string, long> kv = new Dictionary<string, long>();
foreach (var item in typeIDs)
{
var sum = (from obj in xml.Descendants("row")
select obj).Where(o => o.Attribute("typeID").Value == item).Sum(p => long.Parse(p.Attribute("volume").Value));
kv.Add(item, sum);
}
The dictionary kv will then hold the sum of volumes for each typeID,
with the typeID as the key and the summed volume as the value.
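The same totals can also be computed in a single pass with GroupBy, which is closer to the PHP loop in the question. This is a sketch under the assumption that every row element carries both a typeID and a volume attribute; VolumeTotals and SumByTypeId are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class VolumeTotals
{
    // Groups all <row> elements by their typeID attribute and sums the volume attribute.
    public static Dictionary<string, long> SumByTypeId(string xml)
    {
        return XElement.Parse(xml)
            .Descendants("row")
            .GroupBy(r => (string)r.Attribute("typeID"))
            .ToDictionary(
                g => g.Key,
                g => g.Sum(r => (long)r.Attribute("volume"))); // explicit XAttribute -> long conversion
    }
}
```

Because each row is visited once, this avoids re-scanning the document for every distinct typeID.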
For your first problem,
check this out:
var minsell = (from obj in xml.Descendants("type")
select new
{
typeid = obj.Attribute("id").Value,
minsell = obj.Descendants("sell").Descendants("min").FirstOrDefault().Value
}
).ToArray();
This will give you the minsell value for each typeid.
I guess this is what you expect?
If not, please comment.

LINQ Type expected

I have the following LINQ code that tries to parse an XML file into a DataTable, but I get strange values in the resulting DataTable; all cell values show as
System.Xml.Linq.XContainer+<GetElements>d_11
Here is my LINQ:
XDocument doc1 = XDocument.Load(@"D:\m.xml");
var q = from address in doc1.Root.Elements("Address")
let name = address.Elements("Name")
let street = address.Elements("Street")
let city = address.Elements("city")
select new
{
Name = name,
Street = street,
City = city
};
var xdt = new DataTable();
xdt.Columns.Add("Name", typeof(string));
xdt.Columns.Add("Street", typeof(string));
xdt.Columns.Add("City", typeof(string));
foreach (var address in q)
{
xdt.Rows.Add(address.Name, address.Street, address.City);
}
dataGrid1.ItemsSource = xdt.DefaultView;
here is my xml:
<PurchaseOrder PurchaseOrderNumber="99503" OrderDate="1999-10-20">
<Address Type="Shipping">
<Name>Ellen Adams</Name>
<Street>123 Maple Street</Street>
<City>Mill Valley</City>
<State>CA</State>
<Zip>10999</Zip>
<Country>USA</Country>
</Address>
<Address Type="Billing">
<Name>Tai Yee</Name>
<Street>8 Oak Avenue</Street>
<City>Old Town</City>
<State>PA</State>
<Zip>95819</Zip>
<Country>USA</Country>
</Address>
</PurchaseOrder>
and here is the result i get!
You forgot to retrieve the inner text of the XElements, so you are selecting the whole element with its attributes etc. Use this code:
var q = from address in doc1.Root.Elements("Address")
let name = address.Element("Name")
let street = address.Element("Street")
let city = address.Element("City")
select new
{
Name = name.Value,
Street = street.Value,
City = city.Value
};
address.Elements("Name") is a collection of all of the elements of type "Name". It so happens that in your case it's a collection of size one, but it's still a collection. You want to get the first item out of that collection (since you know it will be the only one) and then get the text value of that element. If you use Element instead of Elements you'll get the first item that matches, rather than a collection of items.
Now that you have your single element, you also need to get the value of that element, rather than the element itself (which also contains lots of other information in the general case, even though there really isn't anything else interesting about it in this particular case).
var q = from address in doc1.Root.Elements("Address")
select new
{
Name = address.Element("Name").Value,
Street = address.Element("Street").Value,
City = address.Element("City").Value
};
How does this work for you?
var q = from address in doc1.Root.Elements("Address")
let name = (string)(address.Element("Name"))
let street = (string)(address.Element("Street"))
let city = (string)(address.Element("City"))
//...
http://msdn.microsoft.com/en-us/library/bb155263.aspx
You are calling Elements which returns n elements wrapped in a helper class.
You probably mean to call Element which returns the first element as an XElement object.
Try this:
var q = from address in doc1.Root.Elements("Address")
let name = address.Element("Name").Value
let street = address.Element("Street").Value
let city = address.Element("City").Value
Change address.Elements("Name") to address.Elements("Name").FirstOrDefault() and so on.
The Elements method returns an IEnumerable. Therefore, your let variables point to a sequence of elements, not a single element. You should take the single element returned, which will be an XElement, and then take its Value property to get the concatenated text of its contents. (As per documentation)
Instead of
select new
{
Name = name,
Street = street,
City = city
}
You should write:
select new
{
Name = name.Single().Value,
Street = street.Single().Value,
City = city.Single().Value
}
Either there, or directly in the let expressions. You may also find a helper method useful:
public static string StringValueOfElementNamed(XElement node, string elementName) {
return node.Elements(elementName).Single().Value;
}
Turn this helper method into an extension method if you wish to use member access syntax.
Edit: After reading concurrent answers, the better method to use would be:
public static string StringValueOfElementNamed(XElement node, string elementName) {
return node.Element(elementName).Value;
}
Element returns the first matching element. Beware the null reference returned when no element is found.
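One way to sidestep that null is the explicit string conversion on XElement, which yields null for a missing element instead of throwing. A minimal sketch, with ElementText and ValueOrNull as made-up names:

```csharp
using System.Xml.Linq;

public static class ElementText
{
    // Returns the child element's text, or null when no such child exists.
    // Calling .Value on the result of Element(...) would throw in that case.
    public static string ValueOrNull(XElement parent, string elementName)
    {
        return (string)parent.Element(elementName);
    }
}
```

With an address element parsed from <Address><City>Mill Valley</City></Address>, ValueOrNull(address, "City") returns "Mill Valley", while ValueOrNull(address, "city") returns null: element names are case-sensitive, which is worth checking against the lowercase "city" used in the question.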

Working with HtmlAgilityPack

I'm trying to get a link and another element from an HTML page, but I don't really know what to do. This is what I have right now:
var client = new HtmlWeb(); // Initialize HtmlAgilityPack's functions.
var url = "http://p.thedgtl.net/index.php?tag=-1&title={0}&author=&o=u&od=d&page=-1&"; // The site/page we are indexing.
var doc = client.Load(string.Format(url, textBox1.Text)); // Index the whole DB.
var nodes = doc.DocumentNode.SelectNodes("//a[@href]"); // Get every url.
string authorName = "";
string fileName = "";
string fileNameWithExt;
foreach (HtmlNode link in nodes)
{
string completeUrl = link.Attributes["href"].Value; // The complete plugin download url.
#region Get all jars
if (completeUrl.Contains(".jar")) // Check if the url contains .jar
{
fileNameWithExt = completeUrl.Substring(completeUrl.LastIndexOf('/') + 1); // Get the filename with extension.
fileName = fileNameWithExt.Remove(fileNameWithExt.LastIndexOf('.')); // Get the filename without extension.
Console.WriteLine(fileName);
}
#endregion
#region Get all Authors
if (completeUrl.Contains("?author=")) // Check if the url contains ?author=
{
authorName = completeUrl.Substring(completeUrl.LastIndexOf('=') + 1); // Get the author name.
Console.WriteLine(authorName);
}
#endregion
}
I am trying to list all the filenames and authors next to each other, but right now everything seems to be placed randomly. Why?
Can someone help me with this? Thanks!
If you look at the HTML, it is unfortunately not well-formed. There are a lot of unclosed tags, and HAP does not structure the document the way a browser does; it interprets the majority of the document as deeply nested. So you can't simply iterate through the rows of the table like you would in a browser; it gets a lot more complicated than that.
When dealing with such documents, you have to change your queries quite a bit. Rather than searching through child elements, you have to search through descendants adjusting for the change.
var title = System.Web.HttpUtility.UrlEncode(textBox1.Text);
var url = String.Format("http://p.thedgtl.net/index.php?title={0}", title);
var web = new HtmlWeb();
var doc = web.Load(url);
// select the rows in the table
var xpath = "//div[@class='content']/div[@class='pluginList']/table[2]";
var table = doc.DocumentNode.SelectSingleNode(xpath);
// unfortunately the `tr` tags are not closed so HAP interprets
// this table having a single row with multiple descendant `tr`s
var rows = table.Descendants("tr")
.Skip(1); // skip header row
var query =
from row in rows
// there may be a row with an embedded ad
where row.SelectSingleNode("td/script") == null
// each row has 6 columns so we need to grab the next 6 descendants
let columns = row.Descendants("td").Take(6).ToList()
let titleText = columns[1].Elements("a").Select(a => a.InnerText).FirstOrDefault()
let authorText = columns[2].Elements("a").Select(a => a.InnerText).FirstOrDefault()
let downloadLink = columns[5].Elements("a").Select(a => a.GetAttributeValue("href", null)).FirstOrDefault()
select new
{
Title = titleText ?? "",
Author = authorText ?? "",
FileName = Path.GetFileName(downloadLink ?? ""),
};
So now you can just iterate through the query and write out what you want for each of the rows.
foreach (var item in query)
{
Console.WriteLine("{0} ({1})", item.FileName, item.Author);
}
