Using LINQ to XML with HtmlAgilityPack - C#

I am using HtmlAgilityPack to create an HtmlDocument from a string, like this:
HtmlDocument updoc = new HtmlDocument();
updoc.LoadHtml(stringContents);
Now I want to insert the HtmlNodes as children of an XElement. I tried:
XDocument xdoc = XDocument.Load(path);
XElement body = xdoc.Descendants(ns + "body").Single();
body.Add(updoc.GetElementbyId("h"));
body.Add(updoc.GetElementbyId("m"));
body.Add(updoc.GetElementbyId("f"));
but the result contains only the object type names (HtmlAgilityPack.HtmlNode, ...) instead of the markup, so it does not work. Basically I am trying to combine HtmlAgilityPack with LINQ to XML. Is this possible?

I'm just googling around for stuff, so this may not work for you. But you need to use the properties of the HtmlNode returned by GetElementbyId() to create your element.
So something like this:
HtmlNode node = updoc.GetElementbyId("h");
XElement e;
body.Add(e = new XElement(node.Name, XElement.Parse(node.InnerHtml)));
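Note that XElement.Parse only succeeds when InnerHtml is well-formed XML with a single root element. If the node has HtmlAttribute(s), add them like: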
foreach(HtmlAttribute att in node.Attributes)
{
e.Add(new XAttribute(att.Name, att.Value));
}

Why not just use a StringBuilder to generate your XML and parse it with XDocument.Parse(string)?
Example:
StringBuilder xmlBuilder = new StringBuilder();
//Build xml with the builder
XDocument xDoc = XDocument.Parse(xmlBuilder.ToString());

Related

Get xml node using c#

I have a request that returns a large XML file. I have the file in an XmlDocument in my application. From that document, how can I read an element like this:
<gphoto:videostatus>final</gphoto:videostatus>
I would like to pull the value final from that element. Also, if I have multiple such elements, can I pull them into a list? Thanks for any advice.
If you already have an XmlDocument, then you can use the function GetElementsByTagName() to create an XmlNodeList that can be accessed much like an array.
http://msdn.microsoft.com/en-us/library/dc0c9ekk.aspx
//Create the XmlDocument.
XmlDocument doc = new XmlDocument();
doc.Load("books.xml");
//Display all the book titles.
XmlNodeList elemList = doc.GetElementsByTagName("title");
for (int i=0; i < elemList.Count; i++)
{
Console.WriteLine(elemList[i].InnerXml);
}
You can select nodes using XPath with SelectSingleNode or SelectNodes. Look at http://www.codeproject.com/Articles/9494/Manipulate-XML-data-with-XPath-and-XmlDocument-C for examples. Then you can use, for example, InnerText to get final. You may need to work with namespaces (gphoto). An XPath such as //videostatus selects all videostatus elements that are in no namespace, so for the prefixed element you will probably need //gphoto:videostatus together with an XmlNamespaceManager.
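A minimal sketch of that XPath approach, collecting every matching value into a list. The namespace URI bound to gphoto here is an assumption (copy the real one from the xmlns:gphoto declaration in your feed), and the class name, method name, and sample XML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class VideoStatusReader
{
    // Assumed namespace URI for the "gphoto" prefix; replace it with the
    // value of xmlns:gphoto in your own document.
    public const string GphotoNamespace = "http://schemas.google.com/photos/2007";

    // Returns the text of every gphoto:videostatus element in the document.
    public static List<string> ReadVideoStatuses(XmlDocument doc)
    {
        XmlNamespaceManager nsm = new XmlNamespaceManager(doc.NameTable);
        nsm.AddNamespace("gphoto", GphotoNamespace);

        List<string> values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("//gphoto:videostatus", nsm))
        {
            values.Add(node.InnerText);
        }
        return values;
    }

    public static void Main()
    {
        // Hypothetical sample standing in for the large response.
        string xml =
            "<feed xmlns:gphoto='" + GphotoNamespace + "'>" +
            "<entry><gphoto:videostatus>final</gphoto:videostatus></entry>" +
            "<entry><gphoto:videostatus>pending</gphoto:videostatus></entry>" +
            "</feed>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        foreach (string status in ReadVideoStatuses(doc))
        {
            Console.WriteLine(status); // prints "final", then "pending"
        }
    }
}
```

The prefix used in the XPath only has to match the one registered on the XmlNamespaceManager; what must agree with the document is the namespace URI.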
You can try using LINQ to XML:
XNamespace ns = XNamespace.Get(""); // use the xmlns namespace URI here
XElement element = XElement.Load(""); // XML file path
var result = element.Descendants(ns + "videostatus")
    .Select(o => o.Value).ToList();
foreach (var value in result)
{
    // use value here
}
Thanks
Deepu

Convert the following Linq to xml to .net 2.0

I am currently working on a .NET 2.0 project where I have to read some XML files and replace the values of certain elements.
How would you do the following without using LINQ to XML?
IEnumerable<XElement> cities= xmldoc.Descendants("City")
.Where(x => x.Value == "London");
foreach (XElement myElem in cities)
{
myElem.ReplaceWith(new XElement("City", "NewCity"));
}
or
var xElement = xdoc.Descendants("FirstName").Where(x => x.Value == "Max").First();
xElement.ReplaceWith(new XElement("FirstName", "NewValue"));
Any suggestions?
You can consider using XmlDocument, like this:
string xml = "<xml><data><test /><test /><test /><test /></data></xml>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml); // LoadXml parses a string; Load expects a path, stream, or reader
XmlNodeList oNodes = xmlDoc.SelectNodes("//test");
foreach (XmlNode oNode in oNodes)
{
    oNode.InnerText = "bla bla";
}
xmlDoc.Save("..path to xml file");
(In your case you can use InnerXml property of the document)
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.aspx
SelectNodes takes an XPath query; a reference can be found here:
http://www.w3schools.com/xpath/
Also, if your XML contains namespaces, you need to use an XmlNamespaceManager:
http://msdn.microsoft.com/en-us/library/system.xml.xmlnamespacemanager.aspx
Otherwise the XPath query won't match anything.
You will need to use XmlDocument and query it using XPath with SelectNodes.
It will not be as nice and succinct.
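For the City example from the question, a sketch along those lines might look like the following. It sticks to C# 2.0 features (no var, no lambdas); the class name, method name, and sample XML are hypothetical, and the XPath predicate stands in for the Where clause.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class CityReplacer
{
    // Replaces the text of every City element whose value equals oldValue.
    // Assumes oldValue contains no apostrophe, since it is embedded in the XPath.
    public static string ReplaceCity(string xml, string oldValue, string newValue)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Copy the matches first so that editing the document cannot
        // interfere with the XPath iteration.
        List<XmlNode> matches = new List<XmlNode>();
        foreach (XmlNode node in doc.SelectNodes("//City[. = '" + oldValue + "']"))
        {
            matches.Add(node);
        }

        foreach (XmlNode city in matches)
        {
            city.InnerText = newValue;
        }

        return doc.OuterXml;
    }

    public static void Main()
    {
        string xml = "<Cities><City>London</City><City>Paris</City><City>London</City></Cities>";
        Console.WriteLine(ReplaceCity(xml, "London", "NewCity"));
        // prints <Cities><City>NewCity</City><City>Paris</City><City>NewCity</City></Cities>
    }
}
```

To change only the first match, as in the FirstName snippet, SelectSingleNode with the same kind of predicate would do; check the result for null before assigning InnerText.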

Get content of XML node using c#

Simple question, but I've been dinking around with it for an hour and it's really starting to frustrate me. I have XML that looks like this:
<TimelineInfo>
<PreTrialEd>Not Started</PreTrialEd>
<Ambassador>Problem</Ambassador>
<PsychEval>Completed</PsychEval>
</TimelineInfo>
And all I want to do is use C# to get the string stored between <Ambassador> and </Ambassador>.
So far I have:
XmlDocument doc = new XmlDocument();
doc.Load("C:\\test.xml");
XmlNode x = doc.SelectSingleNode("/TimelineInfo/Ambassador");
which selects the node just fine. Now how in the world do I get the content in there?
May I suggest having a look at LINQ-to-XML (System.Xml.Linq)?
var doc = XDocument.Load("C:\\test.xml");
string result = (string)doc.Root.Element("Ambassador");
LINQ-to-XML is much more friendly than the Xml* classes (System.Xml).
Otherwise you should be able to get the value of the element by retrieving the InnerText property.
string result = x.InnerText;
The InnerText property should work fine for you.
http://msdn.microsoft.com/en-us/library/system.xml.xmlnode.innertext.aspx
FWIW, you might consider switching API to linq-to-xml (XElement and friends) as IMHO it's a friendly, easier API to interact with.
System.Xml version (NOTE: no casting to XmlElement needed)
var xml = @"<TimelineInfo>
<PreTrialEd>Not Started</PreTrialEd>
<Ambassador>Problem</Ambassador>
<PsychEval>Completed</PsychEval>
</TimelineInfo>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
var node = doc.SelectSingleNode("/TimelineInfo/Ambassador");
Console.WriteLine(node.InnerText);
linq-to-xml version:
var xml = @"<TimelineInfo>
<PreTrialEd>Not Started</PreTrialEd>
<Ambassador>Problem</Ambassador>
<PsychEval>Completed</PsychEval>
</TimelineInfo>";
var root = XElement.Parse(xml);
string ambassador = (string)root.Element("Ambassador");
Console.WriteLine(ambassador);
XmlDocument doc = new XmlDocument();
doc.Load("C:\\test.xml");
XmlNode x = doc.SelectSingleNode("/TimelineInfo/Ambassador");
x.InnerText will return the contents
Try using Linq to XML - it provides a very easy way to query xml datasources - http://msdn.microsoft.com/en-us/library/bb387098%28v=VS.100%29.aspx

Most elegant way to query XML string using XPath

I'm wondering what the most elegant way is in C# to query a STRING that is valid xml using XPath?
Currently, I am doing this (using LINQ):
var el = XElement.Parse(xmlString);
var h2 = el.XPathSelectElement("//h2");
Simple example using LINQ to XML:
XDocument doc = XDocument.Parse(someStringContainingXml);
var cats = from node in doc.Descendants("Animal")
where node.Attribute("Species").Value == "Cat"
select node.Attribute("Name").Value;
Much clearer than XPath IMHO...
Just for the record, I did not want to go with Linq2XML but XPath and found this way:
var xPathDoc = new XPathDocument(new StringReader("your XML string goes here"));
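To round that out, here is a small sketch of how the XPathDocument could then be queried. The class name, helper name, and sample markup are made up for illustration; XPathDocument is read-only, so this suits querying rather than editing.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class XPathStringQuery
{
    // Returns the string value of the first node matching xpath,
    // or null when nothing matches.
    public static string SelectValue(string xml, string xpath)
    {
        XPathDocument document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();
        XPathNavigator match = navigator.SelectSingleNode(xpath);
        return match == null ? null : match.Value;
    }

    public static void Main()
    {
        // Hypothetical input standing in for "your XML string".
        string xml = "<html><body><h1>Title</h1><h2>Subtitle</h2></body></html>";
        Console.WriteLine(SelectValue(xml, "//h2")); // prints "Subtitle"
    }
}
```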

Read attribute from xml

Can someone help me read the attribute ows_AZPersonnummer with ASP.NET using C# from this XML structure:
<listitems
xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:rs="urn:schemas-microsoft-com:rowset"
xmlns:z="#RowsetSchema"
xmlns="http://schemas.microsoft.com/sharepoint/soap/">
<rs:data ItemCount="1">
<z:row
ows_AZNamnUppdragsansvarig="Peter"
ows_AZTypAvUtbetalning="Arvode till privatperson"
ows_AZPersonnummer="196202081276"
ows_AZPlusgiro="5456436534"
ows_MetaInfo="1;#"
ows__ModerationStatus="0"
ows__Level="1" ows_ID="1"
ows_owshiddenversion="6"
ows_UniqueId="1;#{11E4AD4C-7931-46D8-80BB-7E482C605990}"
ows_FSObjType="1;#0"
ows_Created="2009-04-15T08:29:32Z"
ows_FileRef="1;#uppdragsavtal/Lists/Uppdragsavtal/1_.000"
/>
</rs:data>
</listitems>
And get value 196202081276.
Open this up in an XmlDocument object, then use the SelectSingleNode function with the following XPath:
//*[local-name() = 'row']/@ows_AZPersonnummer
Basically, this looks for every element named "row", regardless of depth and namespace, and returns the ows_AZPersonnummer attribute of it. Should help avoid any namespace issues you might be having.
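A rough sketch of that idea, using a trimmed-down copy of the XML from the question (the class and method names are hypothetical):

```csharp
using System;
using System.Xml;

public static class RowAttributeReader
{
    // Reads the named attribute from the first element whose local name is
    // "row", ignoring namespaces. Returns null when there is no match.
    public static string ReadAttribute(string xml, string attributeName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode attribute = doc.SelectSingleNode(
            "//*[local-name() = 'row']/@" + attributeName);
        return attribute == null ? null : attribute.Value;
    }

    public static void Main()
    {
        // Shortened version of the listitems document from the question.
        string xml =
            "<listitems xmlns:rs='urn:schemas-microsoft-com:rowset' " +
            "xmlns:z='#RowsetSchema' " +
            "xmlns='http://schemas.microsoft.com/sharepoint/soap/'>" +
            "<rs:data ItemCount='1'>" +
            "<z:row ows_AZPersonnummer='196202081276' ows_ID='1' />" +
            "</rs:data>" +
            "</listitems>";

        Console.WriteLine(ReadAttribute(xml, "ows_AZPersonnummer")); // prints 196202081276
    }
}
```

Because local-name() sidesteps the namespace, no XmlNamespaceManager is needed here; the trade-off is that a row element from any namespace would match.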
The XmlNamespaceManager is your friend:
string xml = "..."; //your xml here
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
XmlNamespaceManager nsm = new XmlNamespaceManager(new NameTable());
nsm.AddNamespace("z", "#RowsetSchema");
XmlNode n = doc.DocumentElement
    .SelectSingleNode("//@ows_AZPersonnummer", nsm);
Console.WriteLine(n.Value);
You can also use LINQ to XML:
XDocument xd = XDocument.Parse(xml);
XNamespace xns = "#RowsetSchema";
string result1 = xd.Descendants(xns + "row")
.First()
.Attribute("ows_AZPersonnummer")
.Value;
// Or...
string result2 =
(from el in xd.Descendants(xns + "row")
select el).First().Attribute("ows_AZPersonnummer").Value;
I'd say you need an XML parser, which I believe are common. This looks like a simple XML structure, so the handling code shouldn't be too hard.
Use <%# Eval("path to attribute") %>, but you need to load the XML as a DataSource.
Otherwise you can load it using XmlTextReader. Here's an example.
