I'm using XPATH to select certain nodes from an XML document.
The user is able to insert a value for the location. It's working fine, but it does not work if different cases are used.
I've decided that changing both the XML values and the user's input to lower case before being compared is probably the best way to go about it.
I've got this as my selector at the moment:
NodeIter = nav.Select("/Houses/House/location[contains(../location, '" + location_input + "')]");
I've tried putting the lower-case() function in various locations, but it isn't happy with it.
How do I make it so that the value of ../location is compared as lower case?
Note: location_input is converted to lower case using ToLower() within my C# code.
The lower-case() function is only supported from XPath 2.0 onwards. If your environment supports this version of the standard, you can write:
NodeIter = nav.Select("/Houses/House/location[contains(lower-case(.), '"
+ location_input + "')]");
However, chances are you're stuck with XPath 1.0. In that case, you can abuse the translate() function:
NodeIter = nav.Select("/Houses/House/location[contains(translate(., "
+ "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '"
+ location_input + "')]");
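For reference, here is a minimal, self-contained sketch of the translate() approach. The class name, the helper methods and the sample XML are made up for illustration, and it assumes the user input contains no single quote (which would break the XPath string literal).

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class LocationSearch
{
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Builds an XPath 1.0 expression that lower-cases A-Z in each
    // location element before the contains() test.
    public static string BuildQuery(string locationInput)
    {
        // Assumption: locationInput has no single quote in it.
        string needle = locationInput.ToLowerInvariant();
        return "/Houses/House/location[contains(translate(., '"
            + Upper + "', '" + Lower + "'), '" + needle + "')]";
    }

    // Counts the location elements that match, ignoring A-Z case.
    public static int CountMatches(string xml, string locationInput)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator iter = nav.Select(BuildQuery(locationInput));
        return iter.Count;
    }

    public static void Main()
    {
        const string xml =
            "<Houses>" +
            "<House><location>New YORK</location></House>" +
            "<House><location>Boston</location></House>" +
            "</Houses>";
        Console.WriteLine(CountMatches(xml, "york")); // prints 1
    }
}
```

Only the 26 ASCII letters are folded here, so accented or non-Latin characters still compare case-sensitively.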
translate(../location, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') if you can get away with just A-Z
lower-case http://www.w3.org/TR/xpath-functions/#func-lower-case is part of XPath 2.0 and XQuery 1.0 so you need to use an XPath 2.0 or XQuery 1.0 implementation like XQSharp or like the .NET version of Saxon 9 if you want to use such functions.
With XPath 1.0 all you can do is NodeIter = nav.Select(string.Format("/Houses/House/location[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{0}')]", location_input));.
Note that strictly speaking, translating two strings to lower (or upper) case is not a correct way to do a case-blind comparison, because the mapping of lower-case to upper-case characters in Unicode is not one-to-one. In principle, in XPath 2.0 you should use a case-blind collation. Unfortunately though, although many XSLT 2.0 and XQuery 1.0 processors allow you to use a case-blind collation, there are no standards for collation URIs, so your code becomes processor-dependent.
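If the comparison does not have to happen inside the XPath expression, one possible workaround in .NET is to select the candidate nodes and do the case-insensitive test in C#. The sketch below swaps XPath for LINQ to XML, which is not what the answers above propose; the class and method names are hypothetical, and it assumes the document element directly contains the House elements. Note that StringComparison.OrdinalIgnoreCase is still not a linguistic collation, it just avoids the hand-written A-Z table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CaseBlindFilter
{
    // Returns the location elements whose text contains locationInput,
    // compared with an ordinal, case-insensitive comparison.
    public static List<XElement> FindLocations(XDocument doc, string locationInput)
    {
        return doc.Root
            .Elements("House")
            .Elements("location")
            .Where(loc => loc.Value.IndexOf(
                locationInput, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<Houses>" +
            "<House><location>New YORK</location></House>" +
            "<House><location>Boston</location></House>" +
            "</Houses>");

        foreach (XElement loc in FindLocations(doc, "york"))
        {
            Console.WriteLine(loc.Value); // prints "New YORK"
        }
    }
}
```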
As long as you are dealing with .net, you can use a Microsoft extension to do a case-insensitive comparison: ms:string-compare
https://msdn.microsoft.com/en-us/library/ms256114(v=vs.120).aspx
I had the same dilemma using VS2017 (.NET Framework 4.6.1) and installed the XPath2 NuGet package. So far it has worked fine for me when using XPath 2.0 functions.
I get an invalid token exception while trying to evaluate the expression below via XPathNavigator:
var expression = "if(//DovizCins = 'YTL') then '1' else '2'";
var nav = doc.CreateNavigator();
XPathExpression xp = XPathExpression.Compile(expression);
var value = nav.Evaluate(xp);
return value?.ToString() ?? string.Empty;
Exception is:
System.Xml.XPath.XPathException: ''if(//DovizCins = 'YTL') then '1' else '2'' has an invalid token.'
I completely concur with Michael Kay.
The official MS documentation is mistaken: https://learn.microsoft.com/en-us/dotnet/api/system.xml.xpath?view=netframework-4.7.1
Excerpt
"...The System.Xml.XPath namespace contains the classes that define a cursor model for navigating and editing XML information items as instances of the XQuery 1.0 and XPath 2.0 Data Model..."
XQuery 1.0 and XPath 2.0 are partially supported by MS SQL Server.
The .NET Framework doesn't support any XQuery, and its XPath is 1.0.
Microsoft's XML technology is way out of date. This is XPath 2.0 syntax, introduced in 2007, and Microsoft has yet to catch up: they're still shipping XPath 1.0.
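Until then, one way to stay within XPath 1.0 is to let XPath compute only the boolean condition and do the if/else in C#. This is a sketch under assumptions: the class name and the sample documents are invented, and only the DovizCins element name comes from the question.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class CurrencyCheck
{
    // Evaluates the condition with XPath 1.0 and branches in C#
    // instead of using the XPath 2.0 "if ... then ... else" syntax.
    public static string Evaluate(XmlDocument doc)
    {
        XPathNavigator nav = doc.CreateNavigator();
        // A node-set compared with a string is true when at least one
        // DovizCins node has the string value 'YTL'.
        bool isYtl = (bool)nav.Evaluate("//DovizCins = 'YTL'");
        return isYtl ? "1" : "2";
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Root><DovizCins>YTL</DovizCins></Root>");
        Console.WriteLine(Evaluate(doc)); // prints 1
    }
}
```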
I'm trying to check an XML document for two rules via XPath evaluate.
The rules are:
/root/path1/text()='TABLE1'
/root/path2/text()='TABLE2'
My code looks like:
XPathDocument document = new XPathDocument(myDocument);
XPathNavigator navigator = document.CreateNavigator();
XmlNamespaceManager xpathNsMgr = new XmlNamespaceManager(navigator.NameTable);
xpathNsMgr.AddNamespace("xsl", "http://www.w3.org/1999/XSL/Transform");
var result = (bool)navigator.Evaluate("((/root/path1/text()='TABLE1') and (/root/path2/text()='TABLE2'))", xpathNsMgr);
If I evaluate each XPath expression on its own, everything works. But if I combine them as shown in the code above, I get the following error:
xsltcontext is needed for this query because of an unknown function
Why isn't it possible to combine both XPath and evaluate them together? I thought "and", "or" etc. are valid operators since XPath 1.0...
.NET XPathNavigator supports XPath 1.0 only.
You can model your check easily by turning the condition into a predicate (square brackets) and see if the resulting node-set is empty or not.
var result = navigator.Evaluate("/*[path1 = 'TABLE1' and path2 = 'TABLE2']", xpathNsMgr);
Here /* selects the document element. Write /root instead if the actual name of the document element is important.
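If you want a bool back rather than a node-set, the predicate can be wrapped in the XPath 1.0 boolean() function. A small sketch, using /root as the document element name and invented sample XML; the class and method names are hypothetical:

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class RuleCheck
{
    // True when the document element is <root> and both rules hold.
    public static bool BothRulesHold(string xml)
    {
        var document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();
        // boolean() turns "node-set is not empty" into an XPath boolean,
        // which Evaluate returns as a boxed System.Boolean.
        return (bool)navigator.Evaluate(
            "boolean(/root[path1 = 'TABLE1' and path2 = 'TABLE2'])");
    }

    public static void Main()
    {
        const string xml =
            "<root><path1>TABLE1</path1><path2>TABLE2</path2></root>";
        Console.WriteLine(BothRulesHold(xml)); // prints True
    }
}
```

No namespace manager is needed here because the expression uses no prefixes.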
As stated by @Tomalak and @Martin Honnen, my conditions are not fully supported by XPath 1.0. Functions like exists() are part of XPath 2.0.
Since .NET 4.5 (which I have to use) doesn't support XPath 2.0, I used the following NuGet package to solve my problem: https://www.nuget.org/packages/XPath2/
Just replace
navigator.Evaluate(...
with
navigator.XPath2Evaluate(...
And the expression can be evaluated.
Using MS Visual Studio 2013 to create a C# application, I am trying to get the following output in an XML document.
<UnitsOfMeasure>
&uom-data;
</UnitsOfMeasure>
I keep getting
<UnitsOfMeasure>
&amp;uom-data;
</UnitsOfMeasure>
Here is the code I have tried
// Attempt 1
XElement uom = new XElement("UnitsOfMeasure");
uom.Add("\n" + tab2, new XText("&uom-data;"), "\n" + tab1);
sd.Add("\n" + tab1, uom);
sd.Add("\n");
// Attempt 2
XElement uom = new XElement("UnitsOfMeasure");
uom.Add("\n" + tab2, new XText((char)38 + "uom-data;"), "\n" + tab1);
sd.Add("\n" + tab1, uom);
sd.Add("\n");
Thanks
The problem is that & has a special meaning in XML - it's used to escape other things; see "Beware of the ampersand when using XML", for example. What's being written for you is the correct way to include an ampersand inside XML, and when an XML parser reads it back in, it should convert the &amp; back to &.
So perhaps, if anything, you may have a problem with whatever code is reading that XML back in again as it should be converting it back for you.
XML has things called "entities", which take the form ampersand-characters-semicolon.
An XML entity is an alias for a different block of text (although in most cases, entities are used just to insert a single character -- generally characters not on the keyboard).
&amp; is the most commonly used -- it's to insert an &. &copy; is for the copyright symbol (that one is predefined in HTML; XML itself predefines only &amp;, &lt;, &gt;, &quot; and &apos;).
In addition to the standard ones, you are allowed to define your own.
Since what you are trying to enter -- &uom-data; -- so neatly follows the entity format, I suspect that it really IS an entity and you are just missing the part where it's defined.
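If it really is meant to be an entity reference, one option is to build that part of the document with XmlDocument, which has a CreateEntityReference method, instead of XElement. This is only a sketch and swaps in a different API from the one in the question; the class name is made up, and the entity still has to be declared somewhere (for example in a DTD) for a parser to accept the output.

```csharp
using System;
using System.Xml;

public static class EntityRefDemo
{
    // Builds <UnitsOfMeasure> with an entity reference node as its child.
    public static string Build()
    {
        var doc = new XmlDocument();
        XmlElement uom = doc.CreateElement("UnitsOfMeasure");
        // The name is passed without the leading '&' and trailing ';'.
        uom.AppendChild(doc.CreateEntityReference("uom-data"));
        doc.AppendChild(uom);
        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Because the node is an entity reference rather than text, the serializer writes it as a reference instead of escaping the ampersand.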
I'm working on a Natural Language Processing (NLP) project in which I use a syntactic parser to create a syntactic parse tree out of a given sentence.
Example Input: I ran into Joe and Jill and then we went shopping
Example Output: [TOP [S [S [NP [PRP I]] [VP [VBD ran] [PP [IN into] [NP [NNP Joe] [CC and] [NNP Jill]]]]] [CC and] [S [ADVP [RB then]] [NP [PRP we]] [VP [VBD went] [NP [NN shopping]]]]]]
I'm looking for a C# utility that will let me do complex queries like:
Get the first VBD related to 'Joe'
Get the NP closest to 'Shopping'
Here's a Java utility that does this, I'm looking for a C# equivalent.
Any help would be much appreciated.
There are at least two NLP frameworks, i.e.
SharpNLP (NOTE: project inactive since 2006)
Proxem
And here you can find instructions to use a java NLP in .NET:
Using OpenNLP in .NET project
This page is about using java OpenNLP, but could apply to the java library you've mentioned in your post
Or use NLTK following these guidelines:
Open Source NLP in C# 3.5 using NLTK
We already use
One option would be to parse the output into C# objects and then encode it as XML, rendering every node as string.Format("<{0}>", this.Name); and string.Format("</{0}>", this._name); with all the child nodes placed recursively in between.
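A rough sketch of that conversion, going straight from the bracketed string to an XElement rather than through an intermediate node class. Everything here (class name, method names, error messages) is hypothetical. Labels are passed through XmlConvert.EncodeLocalName, since tags such as PRP$ are not valid XML names.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ParseTreeXml
{
    // Converts "[NP [NNP Joe]]" into <NP><NNP>Joe</NNP></NP>.
    public static XElement Parse(string tree)
    {
        int pos = 0;
        return ParseNode(tree, ref pos);
    }

    private static XElement ParseNode(string s, ref int pos)
    {
        SkipSpaces(s, ref pos);
        if (pos >= s.Length || s[pos] != '[')
            throw new FormatException("Expected '[' at position " + pos);
        pos++; // consume '['

        string label = ReadToken(s, ref pos);
        if (label.Length == 0)
            throw new FormatException("Missing label at position " + pos);
        var node = new XElement(XmlConvert.EncodeLocalName(label));

        while (true)
        {
            SkipSpaces(s, ref pos);
            if (pos >= s.Length) throw new FormatException("Unbalanced brackets");
            if (s[pos] == ']') { pos++; return node; }          // end of this node
            if (s[pos] == '[') node.Add(ParseNode(s, ref pos)); // child node
            else node.Add(ReadToken(s, ref pos));               // the word itself
        }
    }

    // Reads characters up to the next whitespace or bracket.
    private static string ReadToken(string s, ref int pos)
    {
        int start = pos;
        while (pos < s.Length && !char.IsWhiteSpace(s[pos])
               && s[pos] != '[' && s[pos] != ']') pos++;
        return s.Substring(start, pos - start);
    }

    private static void SkipSpaces(string s, ref int pos)
    {
        while (pos < s.Length && char.IsWhiteSpace(s[pos])) pos++;
    }

    // Nearest VBD to a word: walk up from the word's node and take the
    // first VBD found under the closest ancestor that has one.
    public static string NearestVbd(XElement root, string word)
    {
        XElement leaf = root.Descendants()
            .First(e => !e.HasElements && e.Value == word);
        return leaf.Ancestors()
            .SelectMany(a => a.Descendants("VBD"))
            .First().Value;
    }

    public static void Main()
    {
        XElement root = Parse(
            "[S [NP [PRP I]] [VP [VBD ran] [PP [IN into] " +
            "[NP [NNP Joe] [CC and] [NNP Jill]]]]]");
        Console.WriteLine(NearestVbd(root, "Joe")); // prints ran
    }
}
```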
After you do this, I would use a tool for querying XML/HTML to query the tree. Thousands of people already use query selectors and jQuery to query tree-like structures based on the relations between nodes. I think this is far superior to TRegex or other outdated and unmaintained Java utilities.
For example, this is to answer your first example:
var xml = CQ.Create(d.ToXml());
//this can be simpler with CSS selectors but I chose Linq since you'll probably find it easier
//Find joe, in our case the node that has the text 'Joe'
var joe = xml["*"].First(x => x.InnerHTML.Equals("Joe"));
//Find the last (deepest) element that meets the criteria: it has "Joe" in it and has a VBD in it
//in our case the VP
var closestToVbd = xml["*"].Last(x => x.Cq().Has(joe).Has("VBD").Any());
Console.WriteLine("Closest node to VBD:\n " +closestToVbd.OuterHTML);
//If we want the VBD itself we can just find the VBD in that element
Console.WriteLine("\n\n VBD itself is " + closestToVbd.Cq().Find("VBD")[0].OuterHTML);
Here is your second example
//Now for NP closest to 'Shopping', find the element with the text 'shopping' and find its closest NP
var closest = xml["*"].First(x => x.InnerHTML.Equals("shopping")).Cq()
.Closest("NP")[0].OuterHTML;
Console.WriteLine("\n\n NP closest to shopping is: " + closest);
I am trying to create a WinForms application that searches through an XML doc.
For my search I need to convert the XML attribute in the XPath condition to lower case, by using the lower-case() XPath function.
This causes a problem related to the function namespace.
I have tried to add the namespace manually:
XmlNamespaceManager nsMgr = new XmlNamespaceManager(prs.Doc.NameTable);
nsMgr.AddNamespace("fn", "http://www.w3.org/2005/02/xpath-functions");
XmlNodeList results = prs.Doc.SelectNodes("//function[starts-with(fn:lower-case(@name),'" + txtSearch.Text + "')]",nsMgr);
but still I get exception:
XsltContext is needed for this query because of an unknown function.
The lower-case() function is defined for XPath 2.0.
In XPath 1.0 to convert letters to lower case one can still use the
translate() function as shown below:
translate(@attrName, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
'abcdefghijklmnopqrstuvwxyz')
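Applied to the query from the question, that could look roughly like the sketch below. The class name, method name and sample XML are invented; only the function element and name attribute come from the question. It assumes the search text contains no single quote. No namespace manager is needed, because translate() and starts-with() are core XPath 1.0 functions.

```csharp
using System;
using System.Xml;

public static class FunctionSearch
{
    // Finds <function> elements whose name attribute starts with the
    // search text, ignoring the case of the letters A-Z.
    public static XmlNodeList Find(XmlDocument doc, string search)
    {
        // Assumption: search has no single quote in it.
        string needle = search.ToLowerInvariant();
        string xpath =
            "//function[starts-with(translate(@name, " +
            "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '" +
            needle + "')]";
        return doc.SelectNodes(xpath);
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<functions>" +
            "<function name='GetValue'/>" +
            "<function name='SetValue'/>" +
            "</functions>");
        Console.WriteLine(Find(doc, "GET").Count); // prints 1
    }
}
```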
fn:lower-case is defined in XQuery 1.0 and XPath 2.0. XSLT 2.0 works with XPATH 2.0.
AFAIK, .NET doesn't support XPath 2.0 yet, and the XSLT version in .NET is 1.0 as well, not 2.0.
I think CodeMelt is correct and gets my +1, but perhaps the Microsoft ms:string-compare extension function (with case-insensitive option) may help solve your problem?