Elasticsearch: searching a field by exact value, case-insensitively - c#

How can I search an Elasticsearch field by its exact value, but with a case-insensitive query?
For example, I have a field with the value { "Type": "Płatność kartą" },
and my query searches for the value "płatność kartą". I need to be able to search by a list of string parameters (e.g. "płatność kartą", "płatność gotówką", etc.). I tried the Elasticsearch terms query, but it returns nothing when the case differs. The field's index setting is not_analyzed.

If you choose not_analyzed when indexing, Elasticsearch does not analyze the values at index time, which means they are stored verbatim. So when you query with a different case, you get no results, because the query terms don't match the stored terms.
To be able to query in lowercase and still get the uppercase results, you need to use an analyzer in your mapping. The docs list the available analyzers.
If none of the available analyzers fits, you can define a custom one by specifying the filters you want applied. For example, using just the lowercase filter, Elasticsearch will index the RegisteredPaymentType field lowercased. Then, for analyzed queries such as match, the same analyzer is applied to the query text and you get the expected results. Term-level queries such as terms do not analyze their input, so with them the search values must already be lowercase.
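A minimal sketch of such a custom analyzer, written as a JavaScript object that could be serialized as the index-creation request body. The analyzer name lowercase_keyword is made up for illustration, the mapping layout assumes Elasticsearch 7 or later (older versions nest the properties under a type name, and pre-5.x versions use string instead of text), and the field name Type is taken from the question. Since a terms query is not analyzed, the sketch also lowercases the search values on the client.

```javascript
// Hypothetical index body: keep the whole value as one token, but lowercased.
const indexBody = {
  settings: {
    analysis: {
      analyzer: {
        // "lowercase_keyword" is an illustrative name, not a built-in analyzer.
        lowercase_keyword: {
          type: "custom",
          tokenizer: "keyword",  // emit the entire field value as a single token
          filter: ["lowercase"], // then lowercase that token
        },
      },
    },
  },
  mappings: {
    properties: {
      // Assumes the Elasticsearch 7+ mapping layout (no type name).
      Type: { type: "text", analyzer: "lowercase_keyword" },
    },
  },
};

// A terms query is not analyzed, so lowercase the values before sending them.
function buildTermsQuery(field, values) {
  return { query: { terms: { [field]: values.map((v) => v.toLowerCase()) } } };
}

const searchBody = buildTermsQuery("Type", ["Płatność kartą", "PŁATNOŚĆ GOTÓWKĄ"]);
console.log(JSON.stringify(searchBody));
```

Client-side lowercasing and the lowercase token filter can disagree for a few locale-specific characters, so it is worth checking against real data before relying on this.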

Related

Elastic Search Terms query not giving result but match query is giving

I am using a Kibana query
GET /MY_INDEX_HERE/_search
{
  "query": {
    "match": {
      "AccountId": "bc73afda-d4f2"
    }
  }
}
and it gives me the results I expect. But when I use a terms query, it returns no results, although the values are the same. There is no difference in any character or in case; everything is lowercase.
My final goal is to get the result in my C# code. My query is
.Query(q => q.Raw(strQuery) && q.Terms(c => c.Name("by_account").Field(p => p.AccountsList).Terms(accountTerms))
With this query I get zero results, but if I comment out the second query I get results:
.Query(q => q.Raw(strQuery) //&& q.Terms(c => c.Name("by_account").Field(p => p.AccountsList).Terms(accountTerms))
Right now my
strQuery ={\"bool\":{\"must\":[{\"match_phrase_prefix\":{\"fullname\":\"test\"}}]}}
and
List<string> accountTerms = new List<string>();
accountTerms.Add("bc73afda-d4f2");
The AccountId is a GUID. I can't paste the full value here, so the value shown is incomplete.
I assume that your data is stored in a text field, which is an analyzed field type.
If you are using the standard analyzer, your source value bc73afda-d4f2 is transformed into the two tokens bc73afda and d4f2. Therefore, if you query for the term bc73afda-d4f2, there is no token that matches.
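To make the mismatch concrete, here is a rough simulation in JavaScript. The roughStandardAnalyze function is a made-up stand-in that only approximates the standard analyzer (split on anything that is not a letter or digit, then lowercase); the real analyzer follows Unicode text segmentation rules and differs in edge cases.

```javascript
// Rough approximation of the standard analyzer, for illustration only.
function roughStandardAnalyze(text) {
  return text
    .split(/[^\p{L}\p{N}]+/u)              // break on anything that is not a letter or digit
    .filter((token) => token.length > 0)   // drop empty pieces
    .map((token) => token.toLowerCase());  // lowercase each token
}

// What ends up in the index for the text field.
const indexedTokens = roughStandardAnalyze("bc73afda-d4f2");
console.log(indexedTokens);

// A term-level query compares its input, unanalyzed, against the indexed tokens.
const termMatches = indexedTokens.includes("bc73afda-d4f2");

// A match query analyzes its input first; with the default OR operator,
// one matching token is enough.
const matchMatches = roughStandardAnalyze("bc73afda-d4f2").some((t) =>
  indexedTokens.includes(t)
);

console.log({ termMatches, matchMatches });
```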
You have different options now, depending on what you want to achieve exactly and which parts you can influence (Elasticsearch and/or C# code).
You could edit your second query to use match instead of terms.
You could add your second condition to the existing strQuery. That would be something along the lines of:
{\"bool\":{\"must\":[{\"match_phrase_prefix\":{\"fullname\":\"test\"}},{\"match\":{\"AccountId\":\"bc73afda-d4f2\"}}]}}
If your mapping is a text field with a keyword sub-field (a multi-field), or you can make it into one (or just use a keyword field instead of text), you can query the AccountId.keyword field with your terms query. keyword fields are not analyzed, so the terms query for bc73afda-d4f2 would be a hit.
You could use a custom analyzer and reindex your data. But this is probably overkill/the wrong way to approach your issue.
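As a sketch of the keyword-field option, the search body below keeps the original prefix match and adds the account restriction as a terms filter. It assumes the mapping really has an AccountId.keyword sub-field; buildAccountSearch is a hypothetical helper name.

```javascript
// Builds a search body that keeps the prefix match on fullname and adds an
// exact filter on the keyword sub-field. "AccountId.keyword" is assumed to exist.
function buildAccountSearch(namePrefix, accountIds) {
  return {
    query: {
      bool: {
        must: [{ match_phrase_prefix: { fullname: namePrefix } }],
        // filter clauses must match but do not contribute to scoring
        filter: [{ terms: { "AccountId.keyword": accountIds } }],
      },
    },
  };
}

const accountSearchBody = buildAccountSearch("test", ["bc73afda-d4f2"]);
console.log(JSON.stringify(accountSearchBody, null, 2));
```

The value of the query key is what would be passed as strQuery in the C# code from the question.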
Troubleshooting Kibana 'terms' Query
The terms query is an exact-match query: it does not analyze the query input, and compares it as-is against the terms stored in the index.
If the AccountId field is mapped as a keyword field, the value is indexed as a single term, and the input bc73afda-d4f2 matches only if the field value is exactly the same, including case. If it is mapped as a text field, the value is split into tokens at index time, so the full string never matches.
Check the mapping of the AccountId:
GET /MY_INDEX_HERE/_mapping/field/AccountId
If the type of the field is keyword, adjust your query to use the exact case and value of the field.
Alternatively, if the type of the field is text, you can use a match_phrase query instead of the terms query. Its input is analyzed, so with the standard analyzer it matches the phrase in the AccountId field regardless of case.
GET /MY_INDEX_HERE/_search
{
  "query": {
    "match_phrase": {
      "AccountId": "bc73afda-d4f2"
    }
  }
}

How do I programmatically set the sort field on a MongoDB query using the C# driver?

I have a collection of Tool objects, which I want to optionally filter and return in a paginated table on a web page. I've got it working with the filter and the pagination, but I'm having trouble with the sort. I'm using an Angular Material table, which lets the user choose the sort field and direction at run time.
Using the MongoDB C# driver, I build a list of tools that match theFilter (fo is the find options, set to case-insensitive). Skip and Limit provide the pagination (I know that's not necessarily efficient for big collections; that is not a concern here), and ToList sends it to the API.
tools = _tools.Find<Tool>(theFilter, fo)
    .Sort(Builders<Tool>.Sort.Descending(x => x.Description))
    .Skip(pageNo * pageSize)
    .Limit(pageSize)
    .ToList();
In that example, the Sort call correctly sorts the collection in descending order by the description field. I need to be able, at run time, to choose a different field (e.g. x.id, x.Name, x.location, x.whatever), and to be able to switch between descending and ascending order.
Attempts to use MongoDB's syntax:
.Sort("{ description: -1}")
fail, as does attempting to build a SortDefinition object using the field's name:
private SortDefinition<T> BuildSortDefinition<T>(string fieldName, string sortDirection)
{
    FieldDefinition<T> theField = new StringFieldDefinition<T>(fieldName);
    SortDefinition<T> theSort;
    if (sortDirection.ToLower() == "desc")
        theSort = Builders<T>.Sort.Descending(theField);
    else
        theSort = Builders<T>.Sort.Ascending(theField);
    return theSort;
}
I've only been able to make Sort work if I use a lambda expression. How can I either fix the lambda expression to use a configurable field; or use the .Sort properly in order to use a configurable field, in this scenario?
My problem was using the wrong casing for the field name: "description" instead of "Description". Once I passed the name with the correct case, it worked fine.
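Since MongoDB element names are case-sensitive, one way to avoid this class of bug is to resolve the column name coming from the UI against a list of known names before building the sort. The sketch below does that in JavaScript and returns a plain sort document. The SORTABLE_FIELDS list and the buildSortSpec helper are hypothetical, and the element names are guesses based on the question.

```javascript
// Element names as stored in the documents. These are assumptions based on the
// properties mentioned in the question; adjust them to the real schema.
const SORTABLE_FIELDS = ["Description", "Name", "Location"];

// Resolve a user-supplied column name to the stored element name, ignoring case,
// and return a MongoDB sort document such as { Description: -1 }.
function buildSortSpec(requestedField, sortDirection) {
  const wanted = String(requestedField).toLowerCase();
  const field = SORTABLE_FIELDS.find((f) => f.toLowerCase() === wanted);
  if (field === undefined) {
    throw new Error(`Unsupported sort field: ${requestedField}`);
  }
  const direction = String(sortDirection).toLowerCase() === "desc" ? -1 : 1;
  return { [field]: direction };
}

console.log(buildSortSpec("description", "desc"));
```

The list also acts as a whitelist, so arbitrary client input never reaches the query. The same lookup could sit in front of BuildSortDefinition in the C# code.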

Return Values That Are In Lowercase

We recently discovered a bug in our system whereby any serial numbers that have been entered in lowercase have not been processed correctly.
To correct this, we need to add a one-off function that will run through the database and re-process all items with lowercase serial numbers.
In LINQ, is there a query I can run that will return a list of such items?
Note: I am not asking how to convert lowercase to uppercase or the reverse, which is all Google will return. I need to generate a list of all database entries where the serial number has been entered in lowercase.
EDIT: I am using LINQ to MS SQL, which appears to be case insensitive.
Yes, there is. You can try something like this:
var result = serialnumber.Any(c => char.IsLower(c));
[EDIT]
Well, in the case of LINQ to Entities...
As stated in "Regex in Linq (EntityFramework), String processing in database", there are a few ways to work around it:
Change the database table structure. E.g. create a table Foo_Filter which will link your entities to filters, and then create a table Filters which will contain the filter data.
Execute the query in memory and use LINQ to Objects. This option will be slow, because you have to fetch all the data from the database into memory.
Note: link to MSDN documentation has been added by me.
For example:
var result = context.Serials.ToList().Where(sn => sn.Any(c => char.IsLower(c)));
Another way is to use the SqlMethods.Like method.
Finally, I'd strongly recommend reading this: Case sensitive search using Entity Framework and Custom Annotation

Search Substring on a Integer Value

Let's say we have a mongodb collection that has elements containing an int attribute value like: {"MyCollectionAttribute": 12345}
How can I search the string "234" inside the int using Query<T>. syntax?
For now it seems to work (as explained here) using a raw query like:
var query = new QueryDocument("$where", "/234/.test(this.MyCollectionAttribute)");
myCollection.Find(query);
Is it preferable to store the values directly as strings instead of integers, since a regex match will be slow? How do you approach these situations?
Edit
Context: a company can have some internal codes that are numbers. In SQL Server they can be stored as a column of type int, in order to have data integrity at the database level, and then queried from LINQ to SQL with something like:
.Where(item => item.CompanyCode.ToString().Contains("234"))
In this way there is both data integrity at db level and type safety of the query.
I asked the question in order to see how this scenario can be implemented using mongodb.
What you are asking does not make much sense.
Regular expressions search within strings, not within integers.
If you want to perform a substring search (for whatever reason), then store your numbers as strings and not as integers, obviously.
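If the integer has to stay for integrity reasons, one possible compromise (not part of the answer above) is to store a string copy next to it and run the regex against the copy. A sketch, where the element name MyCollectionAttributeText and the helper withSearchableCopy are made up, and the array filter is only an in-memory stand-in for what the server would do:

```javascript
// Hypothetical document shape: the number stays an integer, and a string copy
// ("MyCollectionAttributeText", a made-up name) is written alongside it.
function withSearchableCopy(doc) {
  return { ...doc, MyCollectionAttributeText: String(doc.MyCollectionAttribute) };
}

const docs = [{ MyCollectionAttribute: 12345 }, { MyCollectionAttribute: 67890 }]
  .map(withSearchableCopy);

// Filter document that could be passed to find(): an unanchored regex on the copy.
const substringFilter = { MyCollectionAttributeText: /234/ };

// In-memory stand-in for the server-side match, only to show which documents qualify.
const hits = docs.filter((d) =>
  substringFilter.MyCollectionAttributeText.test(d.MyCollectionAttributeText)
);
console.log(hits.map((d) => d.MyCollectionAttribute));
```

An unanchored regex like this still scans every candidate value, so it avoids the $where JavaScript evaluation but not the cost of a full scan.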

MongoDB use index in regular expression query

I am using the official C# MongoDB driver.
If I have an index on three elements {"firstname":1,"surname":1,"companyname":1} can I search the collection by using a regular expression that directly matches against the index value?
So, if someone enters "sun bat" as a search term, I would create a regex as follows
(?=.*\bsun)(?=.*\bbat).* and this should match any index entries where firstname or surname or companyname starts with 'sun' AND where firstname or surname or companyname starts with 'bat'.
If I can't do it this way, how can I do it? The user just types their search terms, so I won't know which element (firstname, surname, companyname) each search term (sun or bat) refers to.
Update: for MongoDB 2.4 and above you should not use this method but use MongoDB's text index instead.
Below is the original and still relevant answer for MongoDB < 2.4.
Great question. Keep this in mind:
MongoDB can only use one index per query.
Queries that use regular expressions only use an index when the regex is rooted and case sensitive.
The best way to do a search across multiple fields is to create an array of search terms (lower case) for each document and index that field. This takes advantage of the multi-keys feature of MongoDB.
So the document might look like:
{
"firstname": "Tyler",
"surname": "Brock",
"companyname": "Awesome, Inc.",
"search_terms": [ "tyler", "brock", "awesome inc"]
}
You would create an index: db.users.ensureIndex({ "search_terms": 1 })
Then when someone searches for "Tyler", you smash the case and search the collection using a case sensitive regex that matches the beginning of the string:
db.users.find({ "search_terms": /^tyler/ })
What MongoDB does when executing this query is try to match your term against every element of the array (the index is set up that way too, so it's speedy). Hopefully that will get you where you need to be, good luck.
Note: These examples are in the shell. I have never written a single line of C# but the concepts will translate even though the syntax may differ.
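For the multi-word case from the question ("sun bat"), the same idea could be extended as in the sketch below: build search_terms when saving a document, and turn each typed word into its own rooted, case-sensitive regex, all of which must match. The helper names normalizeTerm, buildSearchTerms and buildSearchFilter are made up, and the normalization rules (lowercase, strip punctuation) are an assumption inferred from the example document.

```javascript
// Normalize one value: lowercase, drop punctuation, collapse whitespace.
function normalizeTerm(value) {
  return value
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Build the search_terms array stored on each document.
function buildSearchTerms(doc) {
  return [doc.firstname, doc.surname, doc.companyname]
    .filter((v) => typeof v === "string" && v.length > 0)
    .map(normalizeTerm);
}

// One rooted, case-sensitive regex per typed word; every word must prefix-match
// some element of search_terms. normalizeTerm leaves only letters, digits and
// spaces, so no regex escaping is needed here.
function buildSearchFilter(userInput) {
  const words = normalizeTerm(userInput).split(" ").filter((w) => w.length > 0);
  if (words.length === 0) return {}; // $and must not be given an empty array
  return { $and: words.map((w) => ({ search_terms: new RegExp("^" + w) })) };
}

const tylerDoc = { firstname: "Tyler", surname: "Brock", companyname: "Awesome, Inc." };
console.log(buildSearchTerms(tylerDoc));
console.log(buildSearchFilter("Sun Bat"));
```

The resulting filter object has the shape db.users.find() expects in the shell.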
