XPath and C#

I am trying to create a WinForms application that searches through an XML document.
For my search I need to convert the XML attribute in the XPath condition to lower case, using the lower-case() XPath function.
This causes a problem related to the function namespace.
I have tried to add the namespace manually:
XmlNamespaceManager nsMgr = new XmlNamespaceManager(prs.Doc.NameTable);
nsMgr.AddNamespace("fn", "http://www.w3.org/2005/02/xpath-functions");
XmlNodeList results = prs.Doc.SelectNodes("//function[starts-with(fn:lower-case(@name),'" + txtSearch.Text + "')]", nsMgr);
but I still get an exception:
XsltContext is needed for this query because of an unknown function.

The lower-case() function is defined for XPath 2.0.
In XPath 1.0, to convert letters to lower case, one can still use the
translate() function as shown below:
translate(@attrName, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'abcdefghijklmnopqrstuvwxyz')
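
A minimal sketch of how translate() could be applied to the query from the question, assuming <function name="..."> elements as implied by the asker's expression. The class name, the sample XML, and the choice to reject search terms containing a single quote are illustrative assumptions, not part of the original code. No namespace manager is needed, since translate() and starts-with() are core XPath 1.0 functions.

```csharp
using System;
using System.Xml;

public static class LowerCaseAttributeSearch
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Counts <function> elements whose name attribute starts with the search
    // term, ignoring case for the letters A-Z only.
    public static int CountMatches(XmlDocument doc, string searchText)
    {
        // Assumption: the term is spliced into a single-quoted XPath literal,
        // so a term containing a single quote is rejected rather than escaped.
        if (searchText.Contains("'"))
            throw new ArgumentException("Search text must not contain a single quote.", nameof(searchText));

        string xpath = "//function[starts-with(translate(@name, '" + Upper + "', '" + Lower + "'), '"
                       + searchText.ToLowerInvariant() + "')]";
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        // Hypothetical sample data.
        doc.LoadXml("<root><function name='GetItem'/><function name='getAll'/><function name='SetItem'/></root>");
        Console.WriteLine(CountMatches(doc, "GET")); // matches GetItem and getAll
    }
}
```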

fn:lower-case is defined in XQuery 1.0 and XPath 2.0. XSLT 2.0 works with XPath 2.0.
AFAIK, .NET doesn't support XPath 2.0 yet, and the XSLT version in .NET is likewise 1.0, not 2.0.

I think CodeMelt is correct and gets my +1, but perhaps the Microsoft ms:string-compare extension function (with case-insensitive option) may help solve your problem?

Related

Why do I get an invalid token while using an XPath if-then-else expression with C# XPathNavigator.Evaluate?

I get an invalid token exception while trying to evaluate the expression below via XPathNavigator:
var expression = "if(//DovizCins = 'YTL') then '1' else '2'";
var nav = doc.CreateNavigator();
XPathExpression xp = XPathExpression.Compile(expression);
var value = nav.Evaluate(xp);
return value?.ToString() ?? string.Empty;
Exception is:
System.Xml.XPath.XPathException: ''if(//DovizCins = 'YTL') then '1' else '2'' has an invalid token.'

I completely concur with Michael Kay.
The official MS documentation is mistaken: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath?view=netframework-4.7.1
Excerpt
"...The System.Xml.XPath namespace contains the classes that define a cursor model for navigating and editing XML information items as instances of the XQuery 1.0 and XPath 2.0 Data Model..."
XQuery 1.0 and XPath 2.0 are partially supported by MS SQL Server.
The .NET Framework doesn't support XQuery at all, and its XPath implementation is version 1.0.

Microsoft's XML technology is way out of date. This is XPath 2.0 syntax, introduced in 2007, and Microsoft has yet to catch up: they're still shipping XPath 1.0.
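
One way to stay within XPath 1.0 is to evaluate only the condition in XPath and make the if/else choice in C#. The following is a sketch under that assumption. The class name and the sample XML are hypothetical, and it swaps the XPath 2.0 conditional for a C# conditional rather than reproducing it.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class ConditionalXPathSketch
{
    // Evaluates the comparison with XPath 1.0 and picks the result in C#.
    public static string Evaluate(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // A comparison is a boolean XPath expression, so Evaluate returns a boxed bool.
        bool isYtl = (bool)nav.Evaluate("//DovizCins = 'YTL'");
        return isYtl ? "1" : "2";
    }

    public static void Main()
    {
        // Hypothetical sample documents.
        Console.WriteLine(Evaluate("<Root><DovizCins>YTL</DovizCins></Root>")); // prints 1
        Console.WriteLine(Evaluate("<Root><DovizCins>USD</DovizCins></Root>")); // prints 2
    }
}
```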

How to combine two XPath queries in C#

I'm trying to check an XML document for two rules via XPath evaluate.
The rules are:
/root/path1/text()='TABLE1'
/root/path2/text()='TABLE2'
My code looks like:
XPathDocument document = new XPathDocument(myDocument);
XPathNavigator navigator = document.CreateNavigator();
XmlNamespaceManager xpathNsMgr = new XmlNamespaceManager(navigator.NameTable);
xpathNsMgr.AddNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
var result = (bool)navigator.Evaluate("((/root/path1/text()='TABLE1') and (/root/path2/text()='TABLE2'))", xpathNsMgr);
If I evaluate each XPath on its own, everything works. But if I check them combined as shown in the code above, I get the following error:
xsltcontext is needed for this query because of an unknown function
Why isn't it possible to combine both XPath and evaluate them together? I thought "and", "or" etc. are valid operators since XPath 1.0...

.NET XPathNavigator supports XPath 1.0 only.
You can model your check easily by turning the condition into a predicate (square brackets) and see if the resulting node-set is empty or not.
var result = navigator.Evaluate("/*[path1 = 'TABLE1' and path2 = 'TABLE2']", xpathNsMgr);
Here /* selects the document element. Write /root instead if the actual name of the document element is important.
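
If a plain bool is wanted, as in the question's cast, the predicate can be wrapped in the core XPath 1.0 function boolean(), which is true when the node-set is non-empty. A minimal sketch, assuming the document element is named root. The class name and the sample XML are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class CombinedRuleCheck
{
    // Returns true only if both rules hold for the document element.
    public static bool BothRulesHold(string xml)
    {
        XPathNavigator navigator = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // boolean(node-set) is true when the node-set contains at least one node.
        return (bool)navigator.Evaluate("boolean(/root[path1 = 'TABLE1' and path2 = 'TABLE2'])");
    }

    public static void Main()
    {
        // Hypothetical sample documents.
        Console.WriteLine(BothRulesHold("<root><path1>TABLE1</path1><path2>TABLE2</path2></root>")); // True
        Console.WriteLine(BothRulesHold("<root><path1>TABLE1</path1><path2>OTHER</path2></root>"));  // False
    }
}
```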

As stated by @Tomalak and @Martin Honnen, my conditions are not fully supported by XPath 1.0. Functions like exists() are part of XPath 2.0.
Since .NET 4.5 (which I have to use) doesn't support XPath 2.0, I used the following NuGet package to solve my problem: https://www.nuget.org/packages/XPath2/
Just replace
navigator.Evaluate(...
with
navigator.XPath2Evaluate(...
And the expression can be evaluated.

HtmlAgilityPack C#: SelectNodes always returns null

This is the XPath expression I tried to use with the HtmlAgilityPack C# parser.
//div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']
I evaluated the XPath expression with a Firefox XPath add-on and successfully got the required items, but the C# code throws a null reference exception.
HtmlAgilityPack.HtmlNodeCollection node = htmldoc.DocumentNode.SelectNodes("//div[@id ='sc1']/table/tbody/tr/td/span[@class='blacktxt']");
MessageBox.Show(node.ToString());
The node variable is always null.
Please help me find a way around this problem.
Thank you.

DOM Requires <tbody/> Tags to be Inserted
All common browser extensions for building XPath expressions work on the DOM. Contrary to the HTML specs, the DOM specs require <tr/> elements to be inside <tbody/> elements, so browsers add such elements if they are missing. You can easily see the difference by comparing the HTML shown in Firebug (or similar developer tools working on the DOM) with the raw page source (fetched with wget or a similar tool that does not interpret anything, if necessary).
The Solution
Remove the /tbody axis step, and your XPath expression will probably work.
//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt']
If you Need to Support Both HTML With and Without <tbody/> Tags
For a more general solution, you could replace the /tbody axis step with a descendant-or-self step //, but this could jump into "inner tables":
//div[@id = 'sc1']/table//tr/td/span[@class='blacktxt']
Better would be to use alternative XPath expressions:
//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt'] | //div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']
A cleaner XPath 2.0 only solution would be
//div[@id = 'sc1']/table/(tbody, self::*)/tr/td/span[@class='blacktxt']
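
The XPath 1.0 union variant can be tried out without any external package. The sketch below uses System.Xml on well-formed markup as a stand-in for HtmlAgilityPack, which is an assumption made only to keep the example self-contained. The class name and the sample markup are hypothetical.

```csharp
using System;
using System.Xml;

public static class TbodyUnionSketch
{
    // Union of the path without <tbody> and the path with <tbody>.
    public const string Query =
        "//div[@id = 'sc1']/table/tr/td/span[@class='blacktxt']" +
        " | //div[@id = 'sc1']/table/tbody/tr/td/span[@class='blacktxt']";

    public static int CountMatches(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml);
        return doc.SelectNodes(Query).Count;
    }

    public static void Main()
    {
        // Hypothetical sample markup, once without and once with <tbody>.
        string withoutTbody = "<div id='sc1'><table><tr><td><span class='blacktxt'>a</span></td></tr></table></div>";
        string withTbody = "<div id='sc1'><table><tbody><tr><td><span class='blacktxt'>a</span></td></tr></tbody></table></div>";

        Console.WriteLine(CountMatches(withoutTbody)); // prints 1
        Console.WriteLine(CountMatches(withTbody));    // prints 1
    }
}
```

With HtmlAgilityPack itself, SelectNodes returns null rather than an empty collection when nothing matches, as the question observes, so the result should be checked for null before it is used.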

XPath lower-case() function

I'm using XPath to select certain nodes from an XML document.
The user is able to insert a value for the location. It's working fine, but it does not work if different cases are used.
I've decided that changing both the XML values and the user's input to lower case before being compared is probably the best way to go about it.
I've got this as my selector at the moment:
NodeIter = nav.Select("/Houses/House/location[contains(../location, '" + location_input + "')]");
I've tried putting the lower-case() function in various locations, but it isn't happy with it.
How do I make it so that the value of ../location is compared as lower case?
Note: location_input is set to lower case using ToLower() within my C# code.

The lower-case() function is only supported from XPath 2.0 onwards. If your environment supports this version of the standard, you can write:
NodeIter = nav.Select("/Houses/House/location[contains(lower-case(.), '"
+ location_input + "')]");
However, chances are you're stuck with XPath 1.0. In that case, you can abuse the translate() function:
NodeIter = nav.Select("/Houses/House/location[contains(translate(., "
+ "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '"
+ location_input + "')]");

translate(../location, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') if you can get away with just A-Z

lower-case (http://www.w3.org/TR/xpath-functions/#func-lower-case) is part of XPath 2.0 and XQuery 1.0, so you need to use an XPath 2.0 or XQuery 1.0 implementation such as XQSharp or the .NET version of Saxon 9 if you want to use such functions.
With XPath 1.0 all you can do is
NodeIter = nav.Select(string.Format("/Houses/House/location[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{0}')]", location_input));

Note that strictly speaking, translating two strings to lower (or upper) case is not a correct way to do a case-blind comparison, because the mapping of lower-case to upper-case characters in Unicode is not one-to-one. In principle, in XPath 2.0 you should use a case-blind collation. Unfortunately though, although many XSLT 2.0 and XQuery 1.0 processors allow you to use a case-blind collation, there are no standards for collation URIs, so your code becomes processor-dependent.
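
In .NET, one processor-independent alternative is to select the candidate nodes with plain XPath 1.0 and do the case-blind comparison in C#. This is a sketch of that swapped-in technique, not a collation-based XPath 2.0 solution. The class name, the sample XML, and the choice of StringComparison.OrdinalIgnoreCase are assumptions, and a culture-aware comparison may suit some data better.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class HostSideCaseBlindSearch
{
    // Returns the location values that contain the input, ignoring case.
    public static List<string> FindLocations(string xml, string input)
    {
        var matches = new List<string>();
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // Select all candidates with XPath 1.0, then filter in C#.
        XPathNodeIterator iter = nav.Select("/Houses/House/location");
        while (iter.MoveNext())
        {
            string value = iter.Current.Value;
            // Assumption: ordinal, case-insensitive matching is good enough here.
            if (value.IndexOf(input, StringComparison.OrdinalIgnoreCase) >= 0)
                matches.Add(value);
        }
        return matches;
    }

    public static void Main()
    {
        // Hypothetical sample data.
        string xml = "<Houses><House><location>LONDON</location></House>"
                   + "<House><location>Paris</location></House></Houses>";
        Console.WriteLine(FindLocations(xml, "london").Count); // prints 1
    }
}
```

Because the user input never becomes part of the XPath string, this variant also avoids quoting problems in the query.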

As long as you are dealing with .NET, you can use a Microsoft extension to do a case-insensitive comparison: ms:string-compare
https://msdn.microsoft.com/en-us/library/ms256114(v=vs.120).aspx

I had the same dilemma using VS2017 (.NET Framework 4.6.1) and installed the XPath2 NuGet package. So far it has worked fine for me when using XPath 2 functions.

re:test() XPath to HtmlAgilityPack (get all p tags with matched regex internal)

I want all <p>=.+=</p> tags. The Regex works on its own, without the <p> tags.
Here's my XPath: "//p[re:test(.,'^=.+=$', 'i')]"
But I'm getting an exception when I plug it into,
HtmlNodeCollection pNodes = htmlDoc.DocumentNode.SelectNodes("//p[re:test(.,'^=.+=$', 'i')]");
The exception is:
Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
Edit: The HTML is generated by FCKEditor and has no namespace defined. Do I need to set something for this to work?
The HTML:
<p><style type="text/css">
h2 a { color: black; }</style></p>
<p>----</p>
<h2>test link</h2>
<p>== Heading 2 ==</p>
<p>----</p>
<p>=== Heading [http://searisen.com SeaRisen.com] ===</p>

Apparently HtmlAgilityPack doesn't handle namespaces (not that I had one). So I've come up with this hack,
var pNodes = htmlDoc.DocumentNode.SelectNodes("//p")
.Where(node => Regex.Match(node.InnerText, "^=.+=$").Success);
If there is an HtmlAgilityPack solution I'd love to hear it!

The error you have is due to the fact that the expression re:test uses an XPath function named test (declared in a namespace whose prefix is re) that is unknown to the XSLT context.
I don't know where you got that expression from, but it's not standard, so it means nothing in the Html Agility Pack context :-)
For an in-depth explanation, see this article: Adding Custom Functions to XPath. Note that you could make it work using these techniques.
That said, here is a "pure" Html Agility Pack / XPath implementation:
var pNodes = htmlDoc.DocumentNode.SelectNodes("//p[text()='=.+=']");
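It uses a filter (between [ and ]) and the standard XPath node test text(), which selects the element's text node children. Note that the predicate compares against the literal string '=.+=' and not against a regular expression, since XPath 1.0 has no regex support.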
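
The pattern ^=.+=$ can be approximated with core XPath 1.0 string functions: the text starts with '=', ends with '=', and is at least three characters long. The sketch below demonstrates this with System.Xml on well-formed markup, which is an assumption made to keep it self-contained. The same expression string should in principle be usable with HtmlAgilityPack's SelectNodes. The class name and the sample markup are hypothetical.

```csharp
using System;
using System.Xml;

public static class HeadingParagraphs
{
    // XPath 1.0 approximation of the regex ^=.+=$ :
    // first character '=', last character '=', at least three characters.
    public const string Query =
        "//p[starts-with(., '=') and substring(., string-length(.)) = '=' and string-length(.) >= 3]";

    public static int Count(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml);
        return doc.SelectNodes(Query).Count;
    }

    public static void Main()
    {
        // Hypothetical sample markup, loosely based on the HTML in the question.
        string xhtml = "<body><p>----</p><h2>test link</h2><p>== Heading 2 ==</p>"
                     + "<p>----</p><p>=== Heading ===</p><p>==</p></body>";
        Console.WriteLine(Count(xhtml)); // prints 2
    }
}
```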

To echo what Simon Mourier said, the re:test() function is not a core XPath function. It is available in Calibre's XPath function set (http://manual.calibre-ebook.com/xpath.html#term-re-test), but that is a non-standard extension. I am not aware of any other systems, besides Calibre, that may expose the re:test() function.
For a good summary of core XPath functions and XSLT extension functions, see https://developer.mozilla.org/en-US/docs/Web/XPath/Functions
