I'm having trouble retrieving a single node by its explicit XPath, even though I have already found that node another way. I have the node and can get its XPath, but when I try to retrieve the same node again via node.XPath, I get the "expression must evaluate to a node-set" error. Shouldn't this work? I'm using HtmlAgilityPack in C# for the HtmlDocument.
HtmlDocument doc = new HtmlDocument();
doc.Load(@"..\..\test1.htm");
HtmlNode node = doc.DocumentNode.SelectSingleNode("(//node()[@id='something'])[1]");
HtmlNode same = doc.DocumentNode.SelectSingleNode(node.XPath);
BTW: this is the value of node.XPath:
"/html[1]/body[1]/table[1]/tr[1]/td[1]/div[1]/div[1]/div[2]/table[1]/tr[1]/td[1]/div[1]/div[1]/table[1]/tr[1]/td[1]/div[1]/div[1]/div[4]/div[2]/div[1]/div[1]/div[4]/#text[2]"
I was able to get it working by replacing #text with the function text(). I'm not sure why it didn't just emit the XPath that way in the first place.
HtmlNode same = doc.DocumentNode.SelectSingleNode(node.XPath.Replace("#text", "text()"));
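A minimal sketch of that workaround as a reusable helper. ToStandardXPath is a hypothetical name, not an HtmlAgilityPack API, and it assumes #text is the only non-standard step in node.XPath. It works purely on the string, so it runs without the library:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not an HtmlAgilityPack API): rewrite every "#text"
// step as the standard XPath node test text(), keeping any [n] index.
static string ToStandardXPath(string hapXPath)
{
    return Regex.Replace(hapXPath, @"(^|/)#text(?=\[|/|$)", "${1}text()");
}

// Shortened stand-in for the long path shown in the question.
string original = "/html[1]/body[1]/div[4]/#text[2]";
Console.WriteLine(ToStandardXPath(original));
```

The returned string can then be passed to doc.DocumentNode.SelectSingleNode(...) in place of node.XPath.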
Your XPath ends in "#text[2]", which means "the second 'text' attribute". Attributes aren't nodes, they're node metadata.
This is a common problem I've had with XPath: wanting the value of an attribute while the XPath operation absolutely has to extract a node.
The solution I've used for this is to wrap my XPath fetching in something that detects and strips off the attribute portion of the string (via a myXPathString.LastIndexOf("@") call), uses the truncated myXPathString to fetch the node, and then collects the desired attribute value as a second step.
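A rough sketch of that two-step approach. It uses System.Xml rather than HtmlAgilityPack so that it is self-contained, GetAttributeValue is a hypothetical name, and it assumes the attribute step is the last step of the path and is written with the standard @ prefix:

```csharp
using System;
using System.Xml;

// Hypothetical helper: split "/path/to/node/@attr" into a node path and an
// attribute name, select the node, then read the attribute as a second step.
static string GetAttributeValue(XmlNode root, string xpath)
{
    int at = xpath.LastIndexOf("/@", StringComparison.Ordinal);
    if (at < 0)
    {
        // No attribute step: return the node's text, or "" if nothing matched.
        XmlNode plain = root.SelectSingleNode(xpath);
        return plain == null ? "" : plain.InnerText;
    }
    string nodePath = xpath.Substring(0, at);   // e.g. "/items/item[2]"
    string attrName = xpath.Substring(at + 2);  // e.g. "name"
    XmlElement element = root.SelectSingleNode(nodePath) as XmlElement;
    return element == null ? "" : element.GetAttribute(attrName);
}

// Made-up sample document, only for illustration.
var sample = new XmlDocument();
sample.LoadXml("<items><item name='first'/><item name='second'/></items>");
Console.WriteLine(GetAttributeValue(sample, "/items/item[2]/@name"));
```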
Hope that helps,
J
Related
I'm trying to capture the attribute "descripcion" in this XML:
<ProductoModel xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/WebApi.Models">
<descripcion>descripcion 1</descripcion>
<fecha_registro>2016-03-01</fecha_registro>
<id_producto>1</id_producto>
<id_proveedor>1</id_proveedor>
<nombre_producto>producto 1</nombre_producto>
<precio>200</precio>
</ProductoModel>
My Code :
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(content);
XmlNamespaceManager manager = new XmlNamespaceManager(xDoc.NameTable);
manager.AddNamespace("MYNS", "http://schemas.datacontract.org/2004/07/WebApi.Models");
XmlNode node = xDoc.DocumentElement.SelectSingleNode("MYNS:ProductoModel", manager);
MessageBox.Show(node.Attributes.GetNamedItem("descripcion").Value);
The problem is that I cannot capture the attribute "descripcion"; I get the following error:
Object reference not set to an instance of an object.
How can I capture the required attribute?
<descripcion> is not an attribute; it is an element.
You can get any element (or attribute) with a single XPath query.
XmlNode node = xDoc.DocumentElement.SelectSingleNode("/MYNS:ProductoModel/MYNS:descripcion", manager);
MessageBox.Show(node.InnerText);
Note the / character at the beginning of the XPath expression.
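For reference, here is a self-contained sketch that can be run as-is. It uses a shortened copy of the XML from the question and Console.WriteLine instead of MessageBox.Show; the variable names missing and descripcion are illustrative only. It also shows why the original call produced a null reference: the relative path looks for a ProductoModel child of the root element, which does not exist.

```csharp
using System;
using System.Xml;

// Shortened copy of the XML from the question.
string content =
    "<ProductoModel xmlns:i=\"http://www.w3.org/2001/XMLSchema-instance\" " +
    "xmlns=\"http://schemas.datacontract.org/2004/07/WebApi.Models\">" +
    "<descripcion>descripcion 1</descripcion>" +
    "<precio>200</precio>" +
    "</ProductoModel>";

var xDoc = new XmlDocument();
xDoc.LoadXml(content);
var manager = new XmlNamespaceManager(xDoc.NameTable);
manager.AddNamespace("MYNS", "http://schemas.datacontract.org/2004/07/WebApi.Models");

// Relative path from the root element: looks for a child named ProductoModel, finds none.
XmlNode missing = xDoc.DocumentElement.SelectSingleNode("MYNS:ProductoModel", manager);
Console.WriteLine(missing == null ? "relative path matched nothing" : "matched");

// Absolute path: selects the descripcion element itself.
XmlNode descripcion = xDoc.SelectSingleNode("/MYNS:ProductoModel/MYNS:descripcion", manager);
Console.WriteLine(descripcion.InnerText);
```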
If you want another easy way to work with XML, check this out. It is a small tool for XML handling that is much easier to use and understand than XmlNode.
This is the line of code I am using, when I look in the watch window, 'c' is null.
HtmlNodeCollection c = doc.DocumentNode.SelectNodes("//*[@id=\"content\"]/table/tbody/tr[2]/td/center[2]/b");
But when I declare 'c' as this, the watch window shows it to be a valid HtmlNodeCollection
HtmlNodeCollection c = new HtmlNodeCollection(doc.DocumentNode.ParentNode);
If I then set 'c' to the first code snippet, it goes back to being null.
I know the XPath is correct, as I obtained it from the Chrome Inspect Element of the element I want to get.
SelectNodes returns null when nothing has been found.
You think your XPath is OK because it was constructed by a browser (Chrome, Firefox, etc.), but unfortunately that XPath describes the browser's document, which is not exactly the same as the markup you got from the network (or a file, or a raw stream).
Browsers rely on the in-memory DOM they use internally which can be dramatically different. That's why you see elements such as TBODY that only exist in DOM, not in markup (where they are optional).
So, I suggest you get back to the string/stream you give to the Html Agility Pack and check that XPATH again. I bet there is no TBODY, for a start.
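As a quick experiment along those lines, the tbody steps can be stripped from the browser-generated path before it is handed to the Html Agility Pack. This is only a sketch: StripTbody is a hypothetical helper, and it assumes the raw markup really has no tbody elements (if it does, the original path should be kept).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: remove "/tbody" steps (with an optional [n] index)
// that a browser added to the XPath but that may be absent from the raw markup.
static string StripTbody(string browserXPath)
{
    return Regex.Replace(browserXPath, @"/tbody(\[\d+\])?(?=/|$)", "");
}

string fromChrome = "//*[@id=\"content\"]/table/tbody/tr[2]/td/center[2]/b";
string forMarkup = StripTbody(fromChrome);
Console.WriteLine(forMarkup);

// With HtmlAgilityPack (not referenced here) the result still needs a null check,
// because SelectNodes returns null when nothing is found:
//   HtmlNodeCollection c = doc.DocumentNode.SelectNodes(forMarkup);
//   if (c == null) { /* nothing matched */ }
```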
I'm trying to write a method which takes a string variable (the name of a certain node) and gets the tr element containing an element with text same as the given string.
First step would be to find an element in my html with
element.text = string
But I can't get the XPath expression for that.
I tried
driver.FindElement(By.XPath(String.Format("//span[text()={0}]", stringVariable)));
This code throws an exception "cannot be evaluated or does not result in a Webelement."
Thanks in advance
EDIT:
neither
//tr[span[text()= 'variableValue']]
nor
//tr[span[contains(text(), 'variableValue')]]
works whereas
//tr[contains(text(), 'partOfVariableValueUntilFirstSpace')]
will work. I cannot explain why...
You can use the following XPath:
//tr[span[text()='variableValue']]
It will find a tr element that has a span child whose text equals the variable's value.
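Note that the String.Format call in the question inserts the value without quotes, so the XPath compares text() against an element name rather than a string. A small sketch of building the expression with a properly quoted literal; XPathLiteral is a hypothetical helper, and the Selenium call is left as a comment because the driver is not available here:

```csharp
using System;

// Hypothetical helper: wrap a value as an XPath 1.0 string literal.
// XPath 1.0 has no escape character, so mixed quotes need concat().
static string XPathLiteral(string value)
{
    if (!value.Contains("'")) return "'" + value + "'";
    if (!value.Contains("\"")) return "\"" + value + "\"";
    // Both quote kinds present: join single-quoted pieces with a quoted apostrophe.
    return "concat('" + value.Replace("'", "',\"'\",'") + "')";
}

string stringVariable = "variableValue";
string xpath = string.Format("//tr[span[text()={0}]]", XPathLiteral(stringVariable));
Console.WriteLine(xpath);
// driver.FindElement(By.XPath(xpath));  // Selenium call from the question, not run here
```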
I've found a lot of articles about how to get node content by using simple XPath expression and C#, for example:
XPath:
/bookstore/author/first-name
C#:
string xpathExpression = "/bookstore/author/first-name";
nodes = navigator.Select(xpathExpression);
I wonder how to get content that is inside an element which is itself nested several levels deep inside other elements.
Take a look at the markup below:
<Cell>
<CellContent>
<Para>
<ParaLine>
<String>ABCabcABC abcABC abc ABCABCABC.</String>
</ParaLine>
</Para>
</CellContent>
</Cell>
I only want to extract the content ABCabcABC abcABC abc ABCABCABC. from the String element.
Do you know how to solve this with an XPath expression in C# on .NET?
After googling "c# .net xpath" for a few seconds you'll find this article, which provides an example that you can easily modify to use XPathDocument, XPathNavigator and XPathNavigator.SelectSingleNode():
XPathNavigator nav;
XPathDocument docNav;
string xPath;
docNav = new XPathDocument("c:\\books.xml");
nav = docNav.CreateNavigator();
xPath = "/Cell/CellContent/Para/ParaLine/String/text()";
string value = nav.SelectSingleNode(xPath).Value;
I recommend more reading on xPath syntax. Much more.
navigator.SelectSingleNode("/Cell/CellContent/Para/ParaLine/String/text()").Value
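To try this without a file on disk, here is a small sketch that loads the markup from the question out of a string. It assumes the closing tag is spelled String to match the opening tag, since XML names are case-sensitive; the variable names are illustrative.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// The markup from the question, with matching <String>...</String> tags.
string xml =
    "<Cell><CellContent><Para><ParaLine>" +
    "<String>ABCabcABC abcABC abc ABCABCABC.</String>" +
    "</ParaLine></Para></CellContent></Cell>";

var docNav = new XPathDocument(new StringReader(xml));
XPathNavigator nav = docNav.CreateNavigator();

// SelectSingleNode returns null when nothing matches, so guard before reading Value.
XPathNavigator textNode = nav.SelectSingleNode("/Cell/CellContent/Para/ParaLine/String/text()");
string value = textNode == null ? "" : textNode.Value;
Console.WriteLine(value);
```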
You can also use LINQ to XML to get the value of the specified element:
var list = XDocument.Parse("xml string").Descendants("ParaLine")
.Select(x => x.Element("String").Value).ToList();
This query returns the value of every String element inside a ParaLine tag.
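A runnable variant of the LINQ to XML query, again assuming the element is consistently named String. It uses Elements("String") instead of Element("String").Value so that a ParaLine without a String child is skipped rather than causing a null reference:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// The markup from the question, with matching <String>...</String> tags.
string cellXml =
    "<Cell><CellContent><Para><ParaLine>" +
    "<String>ABCabcABC abcABC abc ABCABCABC.</String>" +
    "</ParaLine></Para></CellContent></Cell>";

// Collect the text of every String element that is a direct child of a ParaLine.
List<string> values = XDocument.Parse(cellXml)
    .Descendants("ParaLine")
    .Elements("String")
    .Select(e => e.Value)
    .ToList();

Console.WriteLine(string.Join(Environment.NewLine, values));
```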
I have the fragment of XML below; notice that the Reference node holds a URI which links to the Id attribute of the Body node.
<Reference URI="#Body">
<SOAP-ENV:Body Id="Body" xmlns:SOAP-ENV="http://www.dingo.org">
<ns0:Add xmlns:ns0="http://www.moo.com">
<ns0:a>2</ns0:a>
<ns0:b>3</ns0:b>
</ns0:Add>
</SOAP-ENV:Body>
If I had the value of the URI attribute, how would I then get the whole Body XmlNode? I presume this would be best done via an XPath expression, but I haven't any clue about XPath. Note that the XML will not always be so simple. I'm doing this in C# btw :)
Any ideas?
Thanks
Jon
EDIT: I wouldn't know the XML structure or namespaces beforehand; all I would know is that the Reference element has the ID of the XmlNode I want to retrieve. Hope this is slightly clearer.
You can add a condition that applies to a relative (or absolute) node to any step of an XPath expression.
In this case:
//*[@Id=substring-after(/Reference/@URI, '#')]
The //* matches all elements in the document. The part in [] is a condition. Inside the condition, the value of the URI attribute of the root Reference node is taken, ignoring the '#' (and anything before it). Attribute names are case-sensitive, so @Id must match the spelling used in the document.
Sample code, assuming you have loaded your XML into XPathDocument doc:
var nav = doc.CreateNavigator();
var found = nav.SelectSingleNode("//*[@Id=substring-after(/Reference/@URI, '#')]");
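A self-contained sketch of the same lookup. It assumes Reference is the root element and Body is nested inside it, which is what the /Reference/@URI step implies, and it loads the XML from a string; the variable names are illustrative. No namespace manager is needed because the expression contains no prefixed names.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// The fragment from the question, closed so that it is well-formed.
string soapXml =
    "<Reference URI=\"#Body\">" +
    "<SOAP-ENV:Body Id=\"Body\" xmlns:SOAP-ENV=\"http://www.dingo.org\">" +
    "<ns0:Add xmlns:ns0=\"http://www.moo.com\"><ns0:a>2</ns0:a><ns0:b>3</ns0:b></ns0:Add>" +
    "</SOAP-ENV:Body></Reference>";

var refDoc = new XPathDocument(new StringReader(soapXml));
XPathNavigator refNav = refDoc.CreateNavigator();

// Match any element whose Id equals the part of /Reference/@URI after '#'.
XPathNavigator target = refNav.SelectSingleNode("//*[@Id=substring-after(/Reference/@URI, '#')]");
Console.WriteLine(target == null ? "not found" : target.Name);
```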
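If you have the value of the URI attribute in a variable you could use
myXmlDocument.DocumentElement.SelectSingleNode("//SOAP-ENV:Body[@Id='" + pURI + "']", nsManager)
where pURI is the value of the URI attribute without its leading '#', myXmlDocument is the XmlDocument object, and nsManager is an XmlNamespaceManager in which the SOAP-ENV prefix has been registered (a prefixed name cannot be resolved without one).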
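Since the question says the namespaces are not known in advance, a namespace-agnostic variant may be easier. This is a sketch only: it matches any element by its Id attribute, and it assumes the id value contains no apostrophe, because the value is concatenated into the expression.

```csharp
using System;
using System.Xml;

// The fragment from the question, closed so that it is well-formed.
string bodyXml =
    "<Reference URI=\"#Body\">" +
    "<SOAP-ENV:Body Id=\"Body\" xmlns:SOAP-ENV=\"http://www.dingo.org\">" +
    "<ns0:Add xmlns:ns0=\"http://www.moo.com\"><ns0:a>2</ns0:a><ns0:b>3</ns0:b></ns0:Add>" +
    "</SOAP-ENV:Body></Reference>";

var myXmlDocument = new XmlDocument();
myXmlDocument.LoadXml(bodyXml);

// Read the URI attribute ("#Body") and drop the leading '#'.
string uriValue = myXmlDocument.DocumentElement.GetAttribute("URI");
string pURI = uriValue.TrimStart('#');

// //* avoids naming the element, so no XmlNamespaceManager is required.
XmlNode body = myXmlDocument.SelectSingleNode("//*[@Id='" + pURI + "']");
Console.WriteLine(body == null ? "not found" : body.OuterXml);
```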
Something like this:
XmlDocument requestDocument = new XmlDocument();
requestDocument.LoadXml(yourXmlString);
String someXml = requestDocument.SelectSingleNode(@"/*[local-name()='Reference']/*[local-name()='Body']").InnerXml;