I'm trying to check an XML document for two rules via XPath evaluate.
The rules are:
/root/path1/text()='TABLE1'
/root/path2/text()='TABLE2'
My code looks like:
XPathDocument document = new XPathDocument(myDocument);
XPathNavigator navigator = document.CreateNavigator();
XmlNamespaceManager xpathNsMgr = new XmlNamespaceManager(navigator.NameTable);
xpathNsMgr.AddNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
var result = (bool)navigator.Evaluate("((/root/path1/text()='TABLE1') and (/root/path2/text()='TABLE2'))", xpathNsMgr);
If I evaluate each XPath expression on its own, everything works. But if I combine them as shown in the code above, I get the following error:
xsltcontext is needed for this query because of an unknown function
Why isn't it possible to combine both XPath expressions and evaluate them together? I thought "and", "or" etc. have been valid operators since XPath 1.0...
.NET XPathNavigator supports XPath 1.0 only.
You can model your check easily by turning the condition into a predicate (square brackets) and see if the resulting node-set is empty or not.
var result = navigator.Evaluate("/*[path1 = 'TABLE1' and path2 = 'TABLE2']", xpathNsMgr);
Here /* selects the document element. Write /root instead if the actual name of the document element is important.
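A minimal sketch of the predicate approach, assuming the document is available as a string; the class and method names (TableRuleCheck, BothRulesHold) are made up for illustration. For a node-set expression, Evaluate returns an XPathNodeIterator, so the check comes down to whether the iterator yields at least one node:

```csharp
using System.IO;
using System.Xml.XPath;

public static class TableRuleCheck
{
    // Hypothetical helper: true when the document element has
    // path1 = 'TABLE1' and path2 = 'TABLE2'.
    public static bool BothRulesHold(string xml)
    {
        var document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();

        // A node-set expression evaluates to an XPathNodeIterator.
        var matches = (XPathNodeIterator)navigator.Evaluate(
            "/*[path1 = 'TABLE1' and path2 = 'TABLE2']");

        // MoveNext() is true only if the node-set is non-empty.
        return matches.MoveNext();
    }
}
```

Wrapping the same expression in boolean(...) and casting the result to bool should work as well, since boolean() is an XPath 1.0 core function.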
As stated by @Tomalak and @Martin Honnen, my conditions are not fully supported by XPath 1.0. Functions like exists() are part of XPath 2.0.
Since .NET 4.5 (which I have to use) doesn't support XPath 2.0, I used the following NuGet package to solve my problem: https://www.nuget.org/packages/XPath2/
Just replace
navigator.Evaluate(...
with
navigator.XPath2Evaluate(...
And the expression can be evaluated.
Related
I get an invalid token exception while trying to evaluate the expression below via XPathNavigator:
var expression = "if(//DovizCins = 'YTL') then '1' else '2'";
var nav = doc.CreateNavigator();
XPathExpression xp = XPathExpression.Compile(expression);
var value = nav.Evaluate(xp);
return value?.ToString() ?? string.Empty;
Exception is:
System.Xml.XPath.XPathException: ''if(//DovizCins = 'YTL') then '1' else '2'' has an invalid token.'
I completely concur with Michael Kay.
The official MS documentation is mistaken: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath?view=netframework-4.7.1
Excerpt
"...The System.Xml.XPath namespace contains the classes that define a cursor model for navigating and editing XML information items as instances of the XQuery 1.0 and XPath 2.0 Data Model..."
XQuery 1.0 and XPath 2.0 are partially supported by MS SQL Server.
The .NET Framework doesn't support any XQuery, and its XPath is 1.0.
Microsoft's XML technology is way out of date. This is XPath 2.0 syntax, introduced in 2007, and Microsoft has yet to catch up: they're still shipping XPath 1.0.
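Since if/then/else is XPath 2.0 syntax, one workaround that stays within XPath 1.0 is to let XPath compute only the boolean test and do the branching in C#. This is a sketch under that assumption, not the original poster's code; the class and method names (CurrencyFlag, For) are hypothetical:

```csharp
using System.IO;
using System.Xml.XPath;

public static class CurrencyFlag
{
    // Hypothetical helper: "1" if any DovizCins element equals 'YTL', else "2".
    public static string For(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // boolean(...) is an XPath 1.0 core function, so Evaluate returns a bool.
        bool isYtl = (bool)nav.Evaluate("boolean(//DovizCins = 'YTL')");

        // The conditional that XPath 1.0 lacks is done in C# instead.
        return isYtl ? "1" : "2";
    }
}
```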
This is the XPath expression I tried to use with the HtmlAgilityPack C# parser.
//div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']
I evaluated the XPath expression with a Firefox XPath add-on and successfully got the required items, but the C# code throws a null reference exception.
HtmlAgilityPack.HtmlNodeCollection node = htmldoc.DocumentNode.SelectNodes("//div[@id ='sc1']/table/tbody/tr/td/span[@class='blacktxt']");
MessageBox.Show(node.ToString());
node is always null...
Please help me to find the way to get around this problem...
Thank you..
DOM Requires <tbody/> Tags to be Inserted
All common browser extensions for building XPath expressions work on the DOM. Unlike the HTML specs, the DOM specs require <tr/> elements to be inside <tbody/> elements, so browsers insert <tbody/> where it is missing. You can see the difference by comparing the HTML shown in Firebug (or similar developer tools that work on the DOM) with the raw page source (fetched with wget or a similar tool that does not interpret the markup).
The Solution
Remove the /tbody axis step, and your XPath expression will probably work.
//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt']
If you Need to Support Both HTML With and Without <tbody/> Tags
For a more general solution, you could replace the /tbody axis step with a descendant-or-self step //, but this could jump into "inner tables":
//div[@id = 'sc1']/table//tr/td/span[@class='blacktxt']
Better would be to use alternative XPath expressions:
//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt'] | //div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']
A cleaner XPath 2.0 only solution would be
//div[@id = 'sc1']/table/(tbody, self::*)/tr/td/span[@class='blacktxt']
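To illustrate how the XPath 1.0 union handles markup both with and without <tbody/>, here is a small sketch. It deliberately uses System.Xml.XPath on well-formed markup rather than HtmlAgilityPack, so it only demonstrates the expression itself; the class and method names (TbodyTolerantQuery, CountSpans) are invented for the example:

```csharp
using System.IO;
using System.Xml.XPath;

public static class TbodyTolerantQuery
{
    // Union of the two alternatives: rows directly under table, or under tbody.
    private const string Spans =
        "//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt']" +
        " | //div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']";

    // Hypothetical helper: counts matching spans in well-formed (X)HTML.
    public static int CountSpans(string xhtml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xhtml)).CreateNavigator();
        return nav.Select(Spans).Count;
    }
}
```

With HtmlAgilityPack the same expression string can be passed to SelectNodes; because SelectNodes returned null in the question when nothing matched, a null check before using the result is worth adding.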
I've found a lot of articles about how to get node content by using simple XPath expression and C#, for example:
XPath:
/bookstore/author/first-name
C#:
string xpathExpression = "/bookstore/author/first-name";
nodes = navigator.Select(xpathExpression);
I wonder how to get the content of an element that is nested several levels deep inside other elements.
Take a look at the XML below:
<Cell>
  <CellContent>
    <Para>
      <ParaLine>
        <String>ABCabcABC abcABC abc ABCABCABC.</String>
      </ParaLine>
    </Para>
  </CellContent>
</Cell>
I only want to extract the content ABCabcABC abcABC abc ABCABCABC. from the String element.
Do you know how to solve this using an XPath expression and .NET C#?
A few seconds of googling "c# .net xpath" turns up this article, which provides an example that you can easily adapt to use XPathDocument, XPathNavigator and XPathNavigator.SelectSingleNode():
XPathNavigator nav;
XPathDocument docNav;
string xPath;
docNav = new XPathDocument("c:\\books.xml");
nav = docNav.CreateNavigator();
xPath = "/Cell/CellContent/Para/ParaLine/String/text()";
string value = nav.SelectSingleNode(xPath).Value;
I recommend more reading on XPath syntax. Much more.
navigator.SelectSingleNode("/Cell/CellContent/Para/ParaLine/String/text()").Value
You can also use LINQ to XML to get the value of the specified element:
var list = XDocument.Parse("xml string").Descendants("ParaLine")
.Select(x => x.Element("String").Value).ToList();
The query above returns the values of all String elements inside ParaLine elements.
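Both approaches can be tried side by side against the sample XML from the question. The following is a self-contained sketch, assuming the XML is passed in as a string and that the element is spelled String in both its opening and closing tags; the class and method names (CellText, ViaXPath, ViaLinq) are made up:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class CellText
{
    // XPath variant: returns the text of the first String element, or null.
    public static string ViaXPath(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        XPathNavigator node = nav.SelectSingleNode(
            "/Cell/CellContent/Para/ParaLine/String/text()");
        // SelectSingleNode returns null when nothing matches.
        return node?.Value;
    }

    // LINQ to XML variant: values of all String children of ParaLine elements.
    public static List<string> ViaLinq(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("ParaLine")
            .Elements("String")
            .Select(e => e.Value)
            .ToList();
    }
}
```

Element names are case-sensitive in both APIs, so "string" and "String" are different names.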
My application needs to evaluate XPath expression against some XML data. Expression is provided by user at runtime. So, I cannot create XmlNamespaceManager to pass to XPathEvaluate because I don't know prefixes and namespaces at compile time.
Is there any possibility to specify namespaces declaration within xpath expression?
Answers to comments:
XML data has one default namespace but there can be nested elements with any namespaces. User knows namespaces of the data he works with.
User-provided xpath expression is to be evaluated against many XML documents, and every document can have its own prefixes for the same namespaces.
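If the same prefix can be bound to different namespaces and prefixes aren't known in advance, then the only pure XPath way to specify such expressions is to refer to elements in the form below. In XPath 1.0 an unprefixed name test matches only elements in no namespace, so the name itself is tested with local-name():
*[local-name() = 'someName' and namespace-uri() = 'exactNamespace']
So, a particular XPath expression would be:
/*/*[local-name() = 'a' and namespace-uri() = 'defaultNS']/*[local-name() = 'b' and namespace-uri() = 'NSB']
/*[local-name() = 'c' and namespace-uri() = 'defaultNS']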
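As a rough illustration of prefix-independent matching, the sketch below evaluates such an expression without any XmlNamespaceManager. It uses * together with local-name() and namespace-uri(), because an unprefixed name test in XPath 1.0 only matches elements in no namespace. The namespace URIs (urn:example:default, urn:example:b) and the class and method names are assumptions made up for the example:

```csharp
using System.IO;
using System.Xml.XPath;

public static class NamespaceAgnosticQuery
{
    // Matches /root/a/b/c by local name and namespace URI, ignoring prefixes.
    // The URIs are hypothetical placeholders.
    private const string Path =
        "/*/*[local-name() = 'a' and namespace-uri() = 'urn:example:default']" +
        "/*[local-name() = 'b' and namespace-uri() = 'urn:example:b']" +
        "/*[local-name() = 'c' and namespace-uri() = 'urn:example:default']";

    public static int CountMatches(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        // No namespace manager is needed: the expression contains no prefixes.
        return nav.Select(Path).Count;
    }
}
```

The same expression should match whether a document binds urn:example:b to the prefix x, y, or anything else.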
I don't know any way to define a namespace prefix in an XPath expression.
But you can write the XPath expression to be agnostic of namespace-prefixes by using local-name() and namespace-uri() functions where appropriate.
Or if you know the XML-namespaces in advance, you can register an arbitrary prefix for them in the XmlNamespaceManager and tell your user to use that prefix in the XPath expression. It doesn't matter if the XML document itself registers a different prefix or no prefix at all. Path resolution is based on the namespace alone, not on the prefix.
Another option would be to scan the document at runtime (use XmlReader for low resource overhead if you haven't loaded it already) and then add the used mappings in the document in the XmlNamespaceManager. I'm not sure if you can get the namespaces and prefixes from XmlDocument, but I see no direct method to do it. It's easy with XmlReader though, since it exposes NamespaceURI and Prefix members for each node.
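The scanning idea could look roughly like the sketch below, which reads the document once with XmlReader and registers every element prefix it encounters. It is a simplification under stated assumptions: only element prefixes are collected (not attribute prefixes), the default namespace is skipped because XPath 1.0 cannot address it without a prefix, and if a prefix is rebound to a different URI later in the document the last binding wins. The class and method names are hypothetical:

```csharp
using System.IO;
using System.Xml;

public static class NamespaceScanner
{
    // Hypothetical helper: collects prefix-to-URI mappings used by elements.
    public static XmlNamespaceManager Collect(string xml)
    {
        var manager = new XmlNamespaceManager(new NameTable());
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                    continue;

                // Skip unprefixed elements and the reserved "xml" prefix,
                // which AddNamespace does not accept.
                if (reader.Prefix.Length == 0 || reader.Prefix == "xml")
                    continue;

                manager.AddNamespace(reader.Prefix, reader.NamespaceURI);
            }
        }
        return manager;
    }
}
```

The returned manager can then be passed as the resolver argument of Evaluate or Select, so the user's expression may use the document's own prefixes.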
Is there any possibility to specify namespaces declaration within xpath expression?
The answer is no - it's always done in the calling environment (which is actually more flexible).
An alternative would be to use XQuery, which does allow declaring namespaces in the query prolog.
UPDATE (2020)
In XPath 3.1 you can use the syntax /*/Q{http://my-namespace}a.
Sadly, though, if you're still using Microsoft software, then the situation hasn't changed since 2011 - you're still stuck with XPath 1.0 with all its shortcomings.
I am trying to create a WinForms application that searches through an XML document.
For my search I need to convert the XML attribute in the XPath condition to lower case, using the lower-case() XPath function.
This causes a problem related to the function namespace.
I have tried to add the namespace manually:
XmlNamespaceManager nsMgr = new XmlNamespaceManager(prs.Doc.NameTable);
nsMgr.AddNamespace("fn", "http://www.w3.org/2005/02/xpath-functions");
XmlNodeList results = prs.Doc.SelectNodes("//function[starts-with(fn:lower-case(@name),'" + txtSearch.Text + "')]",nsMgr);
but still I get exception:
XsltContext is needed for this query because of an unknown function.
The lower-case() function is defined for XPath 2.0.
In XPath 1.0, to convert letters to lower case one can still use the translate() function as shown below:
translate(@attrName, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'abcdefghijklmnopqrstuvwxyz')
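Applied to the search from the question, the translate() workaround might look like this sketch. It assumes an already loaded XmlDocument and a search term without apostrophes (the term is concatenated into the expression, so an apostrophe would break the string literal); it also only folds the ASCII letters A-Z. The class and method names are made up:

```csharp
using System;
using System.Xml;

public static class FunctionSearch
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: case-insensitive "name starts with" search
    // using only XPath 1.0 functions.
    public static XmlNodeList FindByNamePrefix(XmlDocument doc, string searchText)
    {
        // Assumption: reject apostrophes instead of trying to escape them.
        if (searchText.Contains("'"))
            throw new ArgumentException("Search text must not contain an apostrophe.");

        // Lower-case the search term in C#, and the attribute via translate().
        string needle = searchText.ToLowerInvariant();
        string xpath =
            "//function[starts-with(translate(@name, '" + Upper + "', '" + Lower + "'), '"
            + needle + "')]";

        return doc.SelectNodes(xpath);
    }
}
```

No namespace manager is needed here, because translate() and starts-with() are XPath 1.0 core functions.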
fn:lower-case is defined in XQuery 1.0 and XPath 2.0. XSLT 2.0 works with XPath 2.0.
AFAIK, .NET doesn't support XPath 2.0 yet, and the XSLT version in .NET is still 1.0 as well, not 2.0.
I think CodeMelt is correct and gets my +1, but perhaps the Microsoft ms:string-compare extension function (with case-insensitive option) may help solve your problem?