Could someone tell me how to parse an XML string that I receive from a WCF REST service?
My web service's XML string looks like this:
<WS>
<Info>
<Name>Beta</Name>
<Description>Prototyps</Description>
</Info>
<Pages>
<Page>
<Name>Custom</Name>
<Description>toDo</Description>
</Page>
...many other pages...
</Pages>
</WS>
and my phone source code:
public void DownloadCompleted(Object sender, DownloadStringCompletedEventArgs e)
{
if (!e.Cancelled && e.Error == null)
{
var answer = XElement.Parse(e.Result).Descendants("WS"); // null
...
}
}
If I try to parse it with XDocument.Load(e.Result), I get the exception: File not found.
I just want the "unique" information of the Info node and a list of all Page nodes with their values.
Update
Even if I try to load the root element via var item = xdoc.Root.Descendants();, item is assigned the whole XML file.
Update 2: It seems the problem is caused by the namespaces in the root element. With namespaces, XDocument does not parse the web service output correctly. If I delete the namespaces, it works fine. Could someone explain this issue to me? And is there a handy solution for deleting all namespaces?
Update 3: A handy way of removing namespaces
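On the namespace issue from Update 2: a default xmlns on the root element puts every element of the document into that namespace, so a query for the plain name "Info" matches nothing. The following is only a minimal sketch of two ways around that, not necessarily what the linked approach does. The namespace URI and the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceSketch
{
    // Hypothetical payload: the namespace URI is invented for illustration.
    public const string Xml =
        "<WS xmlns=\"http://example.com/ws\">" +
        "<Info><Name>Beta</Name><Description>Prototyps</Description></Info>" +
        "</WS>";

    // Option 1: keep the namespace and qualify every element name with it.
    public static string InfoNameQualified(string xml)
    {
        XElement root = XElement.Parse(xml);
        XNamespace ns = root.Name.Namespace;
        return (string)root.Element(ns + "Info").Element(ns + "Name");
    }

    // Option 2: build a copy of the tree with every element in no namespace.
    public static XElement StripNamespaces(XElement source)
    {
        return new XElement(
            source.Name.LocalName,
            // Keep ordinary attributes, drop the xmlns declarations.
            source.Attributes()
                  .Where(a => !a.IsNamespaceDeclaration)
                  .Select(a => new XAttribute(a.Name.LocalName, a.Value)),
            // Recurse into child elements; text and CDATA nodes are copied as they are.
            source.Nodes()
                  .Select(n => (n is XElement) ? (object)StripNamespaces((XElement)n) : n));
    }
}
```

After StripNamespaces, plain queries such as Element("Info") work again. Option 1 is usually the safer choice if two namespaces in the same document could contain elements with the same local name.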
With really simple XML, if you know the format won't change, you might be interested in using XPath (XPathSelectElement is an extension method from System.Xml.XPath):
var xdoc = XDocument.Parse(e.Result);
var name = xdoc.XPathSelectElement("/WS/Info/Name");
For the multiple pages, though, maybe some LINQ to XML:
var xdoc = XDocument.Parse(xml);
var pages = xdoc.Descendants("Pages").Single();
var PagesList = pages.Elements().Select(x => new Page((string)x.Element("Name"), (string)x.Element("Description"))).ToList();
Where Page is a simple class:
public class Page
{
public string Name { get; set; }
public string Descrip { get; set; }
public Page(string name, string descrip)
{
Name = name;
Descrip = descrip;
}
}
Let me know if you need more explanation.
Also to select the Info without XPath:
var info = xdoc.Descendants("Info").Single();
var InfoName = info.Element("Name").Value;
var InfoDescrip = info.Element("Description").Value;
Viktor - XDocument.Load(string) attempts to load an XDocument by the supplied filename, not a string representation of an XML element.
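To make the difference concrete, here is a small sketch of two ways to get an XDocument from a string that already holds the markup. The class and method names are just placeholders.

```csharp
using System.IO;
using System.Xml.Linq;

public static class LoadVersusParse
{
    // Parse treats its argument as XML markup.
    public static XDocument FromString(string xml)
    {
        return XDocument.Parse(xml);
    }

    // Load(string) would treat the argument as a file name or URI,
    // so wrap the markup in a TextReader and use the Load(TextReader) overload instead.
    public static XDocument FromStringViaReader(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return XDocument.Load(reader);
        }
    }
}
```

In the DownloadCompleted handler, either method could be called with e.Result.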
You say var answer = XElement.Parse(e.Result).Descendants("WS"); // null, but which part is null? The parsed XElement or the attempt to grab a descendant? If <WS>...</WS> is your root element, would the .Descendants("WS") call return the root element? Based on the documentation for XElement.DescendantsAndSelf(), I'm guessing not. Have you instead tried calling:
var answer = XElement.Parse(e.Result).Descendants("Info");
A quick test on my end showed that, with WS as the root element, calling XElement.Parse(e.Result).Descendants("WS"); yielded no results, while XElement.Parse(e.Result).Descendants("Info"); yielded the <Info>...</Info> element.
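The quick test can be reproduced with something like the sketch below. It assumes the XML has no namespace, and the class and method names are made up. Descendants() never includes the element it is called on, while DescendantsAndSelf() does.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class DescendantsSketch
{
    // The sample from the question, without the elided pages.
    public const string Xml =
        "<WS><Info><Name>Beta</Name><Description>Prototyps</Description></Info>" +
        "<Pages><Page><Name>Custom</Name><Description>toDo</Description></Page></Pages></WS>";

    // Counts matches strictly below the root element.
    public static int CountDescendants(string name)
    {
        return XElement.Parse(Xml).Descendants(name).Count();
    }

    // Counts matches including the root element itself.
    public static int CountDescendantsAndSelf(string name)
    {
        return XElement.Parse(Xml).DescendantsAndSelf(name).Count();
    }
}
```

With this data, CountDescendants("WS") is 0 and CountDescendantsAndSelf("WS") is 1, because XElement.Parse already returns the <WS> element.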
Related
I'm trying to update an existing XML file, but whenever I update it by adding new tags, the xmlns="" attribute mysteriously appears on the new tags, and I haven't found a way to remove it.
private static void EditarXML(string path, List<SiteUrl> listaUrls, bool indice, string loc)
{
XmlDocument documentoXML = new XmlDocument();
documentoXML.Load(path);
XmlNode sitemap = documentoXML.CreateElement("sitemap");
XmlNode xloc = documentoXML.CreateElement("loc");
xloc.InnerText = loc;
sitemap.AppendChild(xloc);
XmlNode lastmod = documentoXML.CreateElement("lastmod");
lastmod.InnerText = DateTime.Now.ToShortDateString();
sitemap.AppendChild(lastmod);
documentoXML.DocumentElement.AppendChild(sitemap);
}
Any help or ideas would be appreciated.
This happens when the parent node you are appending to has a namespace, but you don't specify it in the CreateElement() call.
To handle this, you can get the namespace from the DocumentElement, like this (my sample just creates the document in memory, but the principle is the same), and pass it to CreateElement().
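// Assumed setup, inferred from the output below: a root <test> element in the "jdphenix" namespace.
var x = new XmlDocument();
x.LoadXml("<test xmlns=\"jdphenix\" />");
if (x.DocumentElement != null) {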
var xmlns = (x.DocumentElement.NamespaceURI);
var sitemap = x.CreateElement("sitemap", xmlns);
var xloc = x.CreateElement("loc", xmlns);
xloc.InnerText = "Hello";
sitemap.AppendChild(xloc);
var lastmod = x.CreateElement("lastmod", xmlns);
lastmod.InnerText = DateTime.Now.ToShortDateString();
sitemap.AppendChild(lastmod);
x.DocumentElement.AppendChild(sitemap);
}
Console.WriteLine(x.InnerXml);
Output
<test xmlns="jdphenix"><sitemap><loc>Hello</loc><lastmod>4/20/2015</lastmod></sitemap></test>
Note that if I do not pass the parent namespace to each CreateElement() call, the new elements end up in the empty namespace, and the serializer writes xmlns="" on the outermost of them (sitemap here).
// incorrect - appends xmlns=""
if (x.DocumentElement != null) {
var sitemap = x.CreateElement("sitemap");
var xloc = x.CreateElement("loc");
xloc.InnerText = "Hello";
sitemap.AppendChild(xloc);
var lastmod = x.CreateElement("lastmod");
lastmod.InnerText = DateTime.Now.ToShortDateString();
sitemap.AppendChild(lastmod);
x.DocumentElement.AppendChild(sitemap);
}
Console.WriteLine(x.InnerXml);
Output
<test xmlns="jdphenix"><sitemap xmlns=""><loc>Hello</loc><lastmod>4/20/2015</lastmod></sitemap></test>
Related reading: Why does .NET XML append an xlmns attribute to XmlElements I add to a document? Can I stop it?
How to prevent blank xmlns attributes in output from .NET's XmlDocument?
I'm trying to show weather information on my website from World Weather Online. I'm using VS2012 with C# to create this.
I was able to retrieve the data from World Weather Online in a function, into an XmlDocument variable named "WP_XMLdoc".
Take a look at the code below:
public static XmlDocument WeatherAPI(string sLocation)
{
HttpWebRequest WP_Request;
HttpWebResponse WP_Response = null;
XmlDocument WP_XMLdoc = null;
String Value;
string sKey = "xxxxxxxxxxxxxxxxxxxxxxxxx"; //The API key generated by World Weather Online
string sRequestUrl = "http://api.worldweatheronline.com/free/v1/weather.ashx?format=xml&"; //The request URL for XML format
try
{
//Here we are concatenating the parameters
WP_Request = (HttpWebRequest)WebRequest.Create(sRequestUrl + "q=" + Uri.EscapeDataString(sLocation) + "&key=" + sKey);
WP_Request.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4";
//Making the request
WP_Response = (HttpWebResponse)WP_Request.GetResponse();
WP_XMLdoc = new XmlDocument();
//Assigning the response to our XML object
WP_XMLdoc.Load(WP_Response.GetResponseStream());
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
if (WP_Response != null) WP_Response.Close(); // the request may have failed, leaving this null
return WP_XMLdoc;
}
}
So now I just want to take the XML data from the "WP_XMLdoc" variable and show a few details like temp_C, windspeed, time, etc. in my labels.
How can I do that?
The XML data stored in "WP_XMLdoc" is given below:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<request>
<type>City</type>
<query>London, United Kingdom</query>
</request>
<current_condition>
<observation_time>04:17 AM</observation_time>
<temp_C>17</temp_C>
<temp_F>63</temp_F>
<weatherCode>143</weatherCode>
<weatherIconUrl>
<![CDATA[http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0006_mist.png]]>
</weatherIconUrl>
<weatherDesc>
<![CDATA[Mist]]>
</weatherDesc>
<windspeedMiles>0</windspeedMiles>
<windspeedKmph>0</windspeedKmph>
<winddirDegree>62</winddirDegree>
<winddir16Point>ENE</winddir16Point>
<precipMM>0.0</precipMM>
<humidity>94</humidity>
<visibility>2</visibility>
<pressure>1010</pressure>
<cloudcover>50</cloudcover>
</current_condition>
<weather>
<date>2014-09-19</date>
<tempMaxC>28</tempMaxC>
<tempMaxF>82</tempMaxF>
<tempMinC>14</tempMinC>
<tempMinF>57</tempMinF>
<windspeedMiles>5</windspeedMiles>
<windspeedKmph>8</windspeedKmph>
<winddirection>SSE</winddirection>
<winddir16Point>SSE</winddir16Point>
<winddirDegree>149</winddirDegree>
<weatherCode>356</weatherCode>
<weatherIconUrl>
<![CDATA[http://cdn.worldweatheronline.net/images/wsymbols01_png_64/wsymbol_0010_heavy_rain_showers.png]]>
</weatherIconUrl>
<weatherDesc>
<![CDATA[Moderate or heavy rain shower]]>
</weatherDesc>
<precipMM>8.3</precipMM>
</weather>
</data>
Please help!
Assuming that your existing code successfully loads the XML data into the XmlDocument object, we can then use SelectSingleNode(), passing a suitable XPath expression as the argument, to get any particular part of the XML document. For example, to get the <temp_C> value:
string temp_c = WP_XMLdoc.SelectSingleNode("/data/current_condition/temp_C")
.InnerText;
Another option is to use the newer XML API, XDocument. It has a Load() method whose functionality is similar to XmlDocument.Load():
XDocument WP_XMLdoc = XDocument.Load(WP_Response.GetResponseStream());
Using this approach, we can simply cast the XElement to string to get its value (XPathSelectElement is an extension method from System.Xml.XPath):
string temp_c = (string)WP_XMLdoc.XPathSelectElement("/data/current_condition/temp_C");
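One thing to watch with the XmlDocument version: SelectSingleNode() returns null when nothing matches the XPath, so calling .InnerText directly throws if the service leaves an element out. A small helper along these lines (the names are made up for this sketch) keeps the label code simple:

```csharp
using System.Xml;

public static class WeatherFields
{
    // Returns the trimmed text of the first node matching xpath,
    // or null when the document has no such node.
    public static string ReadText(XmlDocument doc, string xpath)
    {
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText.Trim();
    }
}
```

A label could then be filled with something like WeatherFields.ReadText(WP_XMLdoc, "/data/current_condition/temp_C") ?? "n/a". The Trim() also removes any whitespace around the CDATA sections in elements such as weatherDesc.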
Try something like this, as an example:
var str = @"<your xml here>";
XDocument xdoc = XDocument.Parse(str);
var output = new List<string>();
foreach (var element in xdoc.Element("data").Element("current_condition").Elements())
{
output.Add(string.Format("{0} : {1}",element.Name, element.Value.ToString()));
}
This traverses the child elements of the current_condition node; you can adjust it as necessary to extract what you need.
OK, according to your answer in the comments, I believe you need to show multiple columns of data.
The best option would be to use a GridView and populate it with your XML data using ADO.NET. It's fairly easy.
Have a look at this SO thread
I am making an application for Windows Phone 8. The bit I am struggling with is getting the XML parsed.
Here is the XML file:
<ArrayOfThemeParkList xmlns="http://schemas.datacontract.org/2004/07/WCFServiceWebRole1" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ThemeParkList>
<id>1</id>
<name>Alton Towers</name>
</ThemeParkList>
<ThemeParkList>
<id>2</id>
<name>Thorpe Park</name>
</ThemeParkList>
<ThemeParkList>
<id>3</id>
<name>Chessington World Of Adventures</name>
</ThemeParkList>
<ThemeParkList>
<id>4</id>
<name>Blackpool Pleasure beach</name>
</ThemeParkList>
</ArrayOfThemeParkList>
And the C# code that tries to parse it is:
void ThemeParksNames_DownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
{
//Now need to get that data and display it on the page
//check for errors
if (e.Error == null)
{
//No errors have been passed now need to take this file and parse it
//Its in XML format
XDocument xdox = XDocument.Parse(e.Result);
//need a list for them to be put in to
List<ThemeParksClass> ParkList = new List<ThemeParksClass>();
//Now need to get every element and add it to the list
foreach (XElement item in xdox.Root.Elements("ThemeParkList"))
{
ThemeParksClass content = new ThemeParksClass();
content.ID = Convert.ToInt32(item.Element("id").Value);
content.ThemeParkName = item.Element("name").Value.ToString();
ParkList.Add(content);
}
parkList.ItemsSource = ParkList.ToList();
}
else
{
//There is an error
}
}
Now, when using breakpoints, it gets to the foreach loop but does not run the body at all; it just moves on. I am guessing I have the foreach loop set up wrong.
Many Thanks.
Your ThemeParkList elements are in a namespace http://schemas.datacontract.org/2004/07/WCFServiceWebRole1 - you'll need to adjust accordingly:
XNamespace ns = "http://schemas.datacontract.org/2004/07/WCFServiceWebRole1";
foreach (XElement item in xdox.Descendants(ns + "ThemeParkList"))
You'll need to handle the other elements in the same way.
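"The other elements" here means id and name: they inherit the default namespace from the root, so they need the same ns + "..." treatment. A minimal sketch of the whole loop follows. ThemePark and ThemeParkParser are stand-in names for this example, not the asker's ThemeParksClass.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-in for the asker's ThemeParksClass.
public class ThemePark
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ThemeParkParser
{
    // The default namespace declared on <ArrayOfThemeParkList>.
    static readonly XNamespace Ns = "http://schemas.datacontract.org/2004/07/WCFServiceWebRole1";

    public static List<ThemePark> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants(Ns + "ThemeParkList")
            .Select(item => new ThemePark
            {
                // The child elements are in the same namespace as their parent.
                Id = (int)item.Element(Ns + "id"),
                Name = (string)item.Element(Ns + "name")
            })
            .ToList();
    }
}
```

The explicit (int) and (string) conversions on XElement replace Convert.ToInt32(...Value) and .Value.ToString(). The (int) conversion throws if the id element is missing.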
I need to create an HTML parser that, given a blog URL, returns a list with all the posts on the page.
I.e. if a page has 10 posts, it should return a list of 10 divs, where each div contains an h1 and a p.
I can't use the RSS feed, because I need to know exactly what the page looks like to the user: whether it has any ads, images, etc. Also, some blogs show just a summary of the content on the page while the feed has it all, and vice versa.
Anyway, I've made one that downloads the feed and searches the HTML for similar content. It works very well for some blogs, but not for others.
I don't think I can make a parser that works for 100% of the blogs it parses, but I want to make the best possible.
What would be the best approach? Look for tags whose id attribute equals "post" or "content"? Look for p tags? Etc.
Thanks in advance for any help!
I don't think you will be successful with that. You might be able to parse one blog, but if the blog engine changes things, it won't work any more. I also don't think you'll be able to write a generic parser. You might even be partially successful, but it's going to be a fleeting success, because everything is so error-prone in this context. If you need content, you should go with RSS. If you need to store (simply store) how it looks, you can also do that. But parsing by the way it looks? I don't see concrete success there.
"Best possible" turns out to be "best reasonable," and you get to define what is reasonable. You can get a very large number of blogs by looking at how common blogging tools (WordPress, LiveJournal, etc.) generate their pages, and code specially for each one.
The general case turns out to be a very hard problem because every blogging tool has its own format. You might be able to infer things using "standard" identifiers like "post", "content", etc., but it's doubtful.
You'll also have difficulty with ads. A lot of ads are generated with JavaScript. So downloading the page will give you just the JavaScript code rather than the HTML that gets generated. If you really want to identify the ads, you'll have to identify the JavaScript code that generates them. Or, your program will have to execute the JavaScript to create the final DOM. And then you're faced with a problem similar to that above: figuring out if some particular bit of HTML is an ad.
There are heuristic methods that are somewhat successful. Check out Identifying a Page's Primary Content for answers to a similar question.
Use the HTML Agility pack. It is an HTML parser made for this.
I just did something like this for our company's blog, which uses WordPress. This works for us because our WordPress blog hasn't changed in years, but the others are right that if your HTML changes a lot, parsing becomes a cumbersome solution.
Here is what I recommend:
Using NuGet, install RestSharp and HtmlAgilityPack. Then download Fizzler and include those references in your project (http://code.google.com/p/fizzler/downloads/list).
Here is some sample code I used to implement the blog's search on my site.
using System;
using System.Collections.Generic;
using Fizzler.Systems.HtmlAgilityPack;
using RestSharp;
using RestSharp.Contrib;
namespace BlogSearch
{
public class BlogSearcher
{
const string Site = "http://yourblog.com";
public static List<SearchResult> Get(string searchTerms, int count=10)
{
var searchResults = new List<SearchResult>();
var client = new RestSharp.RestClient(Site);
//note 10 is the page size for the search results
var pages = (int)Math.Ceiling((double)count/10);
for (int page = 1; page <= pages; page++)
{
var request = new RestSharp.RestRequest
{
Method = Method.GET,
//the part after .com/
Resource = "page/" + page
};
//Your search params here
request.AddParameter("s", HttpUtility.UrlEncode(searchTerms));
var res = client.Execute(request);
searchResults.AddRange(ParseHtml(res.Content));
}
return searchResults;
}
public static List<SearchResult> ParseHtml(string html)
{
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var results = doc.DocumentNode.QuerySelectorAll("#content-main > div");
var searchResults = new List<SearchResult>();
foreach(var node in results)
{
bool add = false;
var sr = new SearchResult();
var a = node.QuerySelector(".posttitle > h2 > a");
if (a != null)
{
add = true;
sr.Title = a.InnerText;
sr.Link = a.Attributes["href"].Value;
}
var p = node.QuerySelector(".entry > p");
if (p != null)
{
add = true;
sr.Exceprt = p.InnerText;
}
if(add)
searchResults.Add(sr);
}
return searchResults;
}
}
public class SearchResult
{
public string Title { get; set; }
public string Link { get; set; }
public string Exceprt { get; set; }
}
}
Good luck,
Eric
Here's my XML file:
<?xml version="1.0" encoding="utf-8" ?>
<Hero>
<Legion>
<Andromeda>
<HeroType>Agility</HeroType>
<Damage>39-53</Damage>
<Armor>3.1</Armor>
<MoveSpeed>295</MoveSpeed>
<AttackType>Ranged(400)</AttackType>
<AttackRate>.75</AttackRate>
<Strength>16</Strength>
<Agility>27</Agility>
<Intelligence>15</Intelligence>
<Icon>Images/Hero/Andromeda.gif</Icon>
</Andromeda>
<WitchSlayer>
<HeroType>Agility</HeroType>
<Damage>39-53</Damage>
<Armor>3.1</Armor>
<MoveSpeed>295</MoveSpeed>
<AttackType>Ranged(400)</AttackType>
<AttackRate>.75</AttackRate>
<Strength>16</Strength>
<Agility>27</Agility>
<Intelligence>15</Intelligence>
<Icon>Images/Hero/Andromeda.gif</Icon>
</WitchSlayer>
</Legion>
</Hero>
Here's my method, but it isn't working so I don't know what to do.
public string GetHeroIcon(string Name)
{
//Fix later. Load the XML file from resource and not from the physical location.
HeroInformation = new XPathDocument(@"C:\Users\Sergio\Documents\Visual Studio 2008\Projects\Erth v0.1[WPF]\Tome of Newerth v0.1[WPF]\InformationRepositories\HeroRepository\HeroInformation.xml");
Navigator = HeroInformation.CreateNavigator();
Navigator.MoveToRoot();
Navigator.MoveToChild("Witch","Legion");
string x = "";
do
{
x += Navigator.Value;
} while (Navigator.MoveToNext());
return x;
}
I need help making a method that receives a string parameter "Name" and then returns all of the attributes of the XML element.
In pseudo-code:
public void FindHero(string HeroName)
{
//Find the "HeroName" element in the XML file.
//For each tag inside of the HeroName parent element,
//add it to a single string and blast it out through a MessageBox.
}
I'm LEARNING how to use this, please don't leave snobby remarks like, "we won't do this for you." I'm not asking for something groundbreaking here, just a simple use case for what I need on my program and for my learning nothing else. :D I'm doing the whole app in WPF and I can literally say that I've not done ONE single thing with previous knowledge, I'm doing this just to learn new things in my spare time.
Thanks a bunch SO, you rock!
private static string GetHeroIcon(string name)
{
XDocument doc = XDocument.Load("C:/test.xml");
return doc.Descendants(name).Single().Element("Icon").Value;
}
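The same LINQ to XML approach can cover the FindHero pseudo-code from the question. This is only a sketch under a few assumptions: the layout is Hero/Legion/<hero name> as in the posted file, and the class and method names are made up. It returns the text so the caller can decide whether to show it in a MessageBox.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class HeroLookup
{
    // Builds one "Tag = value" line per child element of the named hero,
    // or returns null when no hero with that name exists.
    public static string DescribeHero(XDocument doc, string heroName)
    {
        // Look only at the grandchildren of the root (Hero/Legion/<hero>), so that a
        // stat element such as <Agility> is not mistaken for a hero.
        XElement hero = doc.Root.Elements().Elements(heroName).FirstOrDefault();
        if (hero == null) return null;

        var sb = new StringBuilder();
        foreach (XElement stat in hero.Elements())
        {
            sb.AppendLine(stat.Name.LocalName + " = " + stat.Value);
        }
        return sb.ToString();
    }
}
```

In the WPF app this could be used as MessageBox.Show(HeroLookup.DescribeHero(doc, "Andromeda")). Note that heroName must be a valid XML element name, otherwise the lookup throws.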
First off, since you've tagged this question WPF, you should know that WPF has excellent support for binding directly to XML data. You can then for instance map an image in the GUI directly to the Icon element in the XML file. See this link for example: http://www.longhorncorner.com/UploadFile/cook451/DataBindingXAML12102005134820PM/DataBindingXAML.aspx (first hit on google for "wpf databinding xml")
From code, you can create an XPathDocument from your XML file, then get a Navigator and finally run custom XPath queries on it, like so:
// Gets the value of the <Icon> tag for a hero (the root element of the posted file is <Hero>, then <Legion>)
var node = myNavigator.SelectSingleNode("/Hero/Legion/" + nameOfHero + "/Icon");
var icon = node.Value;
// To get all the child nodes for that hero, you could do
var nodeIter = myNavigator.Select("/Hero/Legion/" + nameOfHero + "/*");
var sb = new StringBuilder();
while (nodeIter.MoveNext())
{
sb.AppendLine(nodeIter.Current.Name + " = " + nodeIter.Current.Value);
}
MessageBox.Show(sb.ToString());
See this kb article for an example.
DISCLAIMER: I copied and pasted the code from my code and did some refactoring in this window. It may not compile on first run but that may mean that it takes 10 minutes to get it to where it needs to be.
I would strongly recommend that you use XML deserialization. It's object oriented, type-safe, and just flat out slick.
Try this:
1) Create a series of classes: one each for Hero, Legion, WitchSlayer, and Andromeda.
Here is an example of the Andromeda class:
using System.Xml.Serialization;
[XmlRoot( "Andromeda" )]
public class Andromeda
{
[XmlElement( "Damage" )]
public String Damage
{
get;set;
}
[XmlElement( "Armor" )]
public double Armor
{
get;set;
}
}
The Hero class should contain an instance of Legion and Legion should contain the rest to mimic the layout of the XML packet.
2) Use the XmlSerializer to deserialize the data:
XmlSerializer xmlSerializer = new XmlSerializer( typeof( Hero ) );
using ( StringReader reader = new StringReader( xmlDataString ) )
{
Hero hero = ( Hero ) xmlSerializer.Deserialize( reader );
}
If you set it up right, you'll be left with a hero instance that contains the nested objects and all of the data. Cool, huh?
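For completeness, here is a rough sketch of how the containing classes could look. It deviates from the steps above in one respect: instead of one class per hero, it uses a single shared HeroStats type (a made-up name) for both <Andromeda> and <WitchSlayer>, since their child elements are identical in the posted file. Only a few of the stats are mapped.

```csharp
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Hero")]
public class Hero
{
    [XmlElement("Legion")]
    public Legion Legion { get; set; }
}

public class Legion
{
    // One property per hero element; both reuse the same stats type.
    [XmlElement("Andromeda")]
    public HeroStats Andromeda { get; set; }

    [XmlElement("WitchSlayer")]
    public HeroStats WitchSlayer { get; set; }
}

// Hypothetical shared type; property names match the element names,
// so no attributes are needed. Unmapped elements are skipped by the serializer.
public class HeroStats
{
    public string HeroType { get; set; }
    public string Damage { get; set; }
    public double Armor { get; set; }
    public int MoveSpeed { get; set; }
    public string Icon { get; set; }
}

public static class HeroReader
{
    public static Hero Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Hero));
        using (var reader = new StringReader(xml))
        {
            return (Hero)serializer.Deserialize(reader);
        }
    }
}
```

After HeroReader.Read(xmlDataString), the icon path would be available as hero.Legion.Andromeda.Icon. The downside of this design is that every new hero needs a new property on Legion, so for a long hero list the LINQ to XML or XPath lookups by name may be less work.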