Elastic Search Terms query not giving result but match query is giving - c#

I am using a Kibana query
GET /MY_INDEX_HERE/_search
{
  "query": {
    "match": {
      "AccountId": "bc73afda-d4f2"
    }
  }
}
and it gives me the results I expect. But when I use a terms query, it returns no results, although the values are the same. There is no difference in any character or in case; everything is lowercase.
My final goal is to get the result in my C# code. My query is
.Query(q => q.Raw(strQuery) && q.Terms(c => c.Name("by_account").Field(p => p.AccountsList).Terms(accountTerms)))
With this query I get zero results, but if I comment out the second query I get results:
.Query(q => q.Raw(strQuery) //&& q.Terms(c => c.Name("by_account").Field(p => p.AccountsList).Terms(accountTerms))
Right now my
string strQuery = "{\"bool\":{\"must\":[{\"match_phrase_prefix\":{\"fullname\":\"test\"}}]}}";
and
List<string> accountTerms = new List<string>();
accountTerms.Add("bc73afda-d4f2");
The AccountId is a GUID. I can't paste the full value here, so the value shown is truncated.

I assume that your data is stored in a text field, which is an analyzed field type.
If you are using the standard analyzer, your source value bc73afda-d4f2 is split into the two tokens bc73afda and d4f2 at index time. A terms query does not analyze its input, so it looks for the single term bc73afda-d4f2, and no indexed token matches it.
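To make the mismatch concrete, here is a rough sketch that imitates what happens at index and query time. The Analyze function is only an approximation of the standard analyzer (split on non-alphanumerics, lowercase), and all names in it are made up for illustration; it is not how Elasticsearch is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AnalyzerSketch
{
    // Approximation of the standard analyzer: split on anything that is not
    // a letter or digit, then lowercase. The real analyzer follows Unicode
    // text segmentation rules and differs in edge cases.
    public static List<string> Analyze(string text) =>
        Regex.Split(text, @"[^\p{L}\p{Nd}]+")
             .Where(token => token.Length > 0)
             .Select(token => token.ToLowerInvariant())
             .ToList();

    // Term-level lookup: the query string is compared unchanged
    // against the tokens produced at index time.
    public static bool TermMatches(string indexedValue, string queryTerm) =>
        Analyze(indexedValue).Contains(queryTerm);

    // Match-style lookup: the query string is analyzed first, and with the
    // default OR operator one shared token is enough for a hit.
    public static bool MatchMatches(string indexedValue, string queryText)
    {
        var indexedTokens = Analyze(indexedValue);
        return Analyze(queryText).Any(token => indexedTokens.Contains(token));
    }
}
```

With the value from the question, AnalyzerSketch.TermMatches("bc73afda-d4f2", "bc73afda-d4f2") is false, while AnalyzerSketch.MatchMatches with the same arguments is true. Note that because match uses OR by default, it can also hit other documents that merely share one segment of the GUID.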
You have different options now, depending on what you want to achieve exactly and which parts you can influence (Elasticsearch and/or C# code).
You could edit your second query to use match instead of terms.
You could add your second condition to the existing strQuery. That would be something along the lines of:
{\"bool\":{\"must\":[{\"match_phrase_prefix\":{\"fullname\":\"test\"}},{\"match\":{\"AccountId\":\"bc73afda-d4f2\"}}]}}
If your mapping is a text field with a keyword sub-field (a multi-field), or you can change it into one (or just use a keyword field instead of text), you can query the AccountId.keyword field with your terms query. keyword fields are not analyzed, so the terms query for bc73afda-d4f2 would be a hit.
You could use a custom analyzer and reindex your data. But this is probably overkill/the wrong way to approach your issue.
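If you go with the keyword option and want to keep everything in the raw query string, something like the following sketch could build it instead of hand-escaping JSON. It assumes the mapping really has an AccountId.keyword sub-field; the class and method names are hypothetical, and the resulting string is meant to be passed to q.Raw(...) as in the question.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class RawQueryBuilder
{
    // Builds a bool query: the name prefix goes in "must",
    // the exact account filter goes in "filter".
    // "AccountId.keyword" is an assumption: it only exists if the mapping
    // defines a keyword sub-field under AccountId.
    public static string Build(string namePrefix, IEnumerable<string> accountIds)
    {
        var query = new Dictionary<string, object>
        {
            ["bool"] = new Dictionary<string, object>
            {
                ["must"] = new object[]
                {
                    new Dictionary<string, object>
                    {
                        ["match_phrase_prefix"] =
                            new Dictionary<string, object> { ["fullname"] = namePrefix }
                    }
                },
                ["filter"] = new object[]
                {
                    new Dictionary<string, object>
                    {
                        ["terms"] =
                            new Dictionary<string, object> { ["AccountId.keyword"] = accountIds }
                    }
                }
            }
        };
        return JsonSerializer.Serialize(query);
    }
}
```

Usage would be along the lines of string strQuery = RawQueryBuilder.Build("test", accountTerms); followed by .Query(q => q.Raw(strQuery)).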

Troubleshooting Kibana 'terms' Query
The terms query is an exact-match query: it does not analyze the query input, and compares it as-is against the terms stored in the index.
If the AccountId field is mapped as keyword, the value bc73afda-d4f2 is stored as a single term and matches only an identical input, including case. If it is mapped as text, the value was split into tokens at index time, so a terms query for the full value finds nothing.
Check the mapping of the AccountId field:
GET /MY_INDEX_HERE/_mapping/field/AccountId
If the field type is keyword, adjust your query to use the exact case and value stored in the field. If it is text, a terms query on the full value cannot match.
Alternatively, you can use a match_phrase query instead of the terms query. It analyzes the input with the field's analyzer and matches the resulting tokens in order, which with the standard analyzer also makes it case-insensitive.
GET /MY_INDEX_HERE/_search
{
  "query": {
    "match_phrase": {
      "AccountId": "bc73afda-d4f2"
    }
  }
}

Related

Performing a wildcard search

I have a search where I use LINQ with EF. Whenever a search criterion is null or empty, I need to return everything. Originally I used if conditions, and from there I moved to a solution like this:
data = data
.Where(p => !string.IsNullOrEmpty(searchriteria1)? p.field1.Contains(searchriteria1) : true)
.Where(p => !string.IsNullOrEmpty(searchriteria2)? p.field2.Contains(searchriteria2) : true);
Is there a better way to do this? Maybe an extension method or some other approach?
You could check each search criterion beforehand and build up the query this way:
IQueryable<Foo> data = context.Foo.AsQueryable();
if(!string.IsNullOrEmpty(searchriteria1))
{
data = data.Where(p => p.field1.Contains(searchriteria1));
}
if (!string.IsNullOrEmpty(searchriteria2))
{
data = data.Where(p => p.field2.Contains(searchriteria2));
}
There are two parts to the question. How to filter dynamically and how to filter efficiently.
Dynamic criteria
For the first question, there's no need for a catch-all query when using LINQ. Catch-all queries result in inefficient execution plans, so it's best to avoid them.
LINQ isn't SQL though. You can construct your query part by part. The final query is translated to SQL only when you enumerate it. This means you can write:
if (!String.IsNullOrEmpty(searchCriteria1))
{
    query = query.Where(p => p.Field1.Contains(searchCriteria1));
}
You can chain multiple Where calls to get the equivalent of multiple AND criteria.
To generate more complex queries using e.g. OR, you'd have to construct the proper Expression<Func<...,bool>> objects, or use a library like LINQKit to make this bearable.
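Since the question asked about an extension, the conditional chaining above can be wrapped in a small helper. This is only a sketch: WhereIf is not part of LINQ or EF, and the Foo class and FooSearch.Search method are hypothetical stand-ins for the entity and search code in the question.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableExtensions
{
    // Applies the predicate only when the condition holds;
    // otherwise the source query is returned unchanged.
    public static IQueryable<T> WhereIf<T>(
        this IQueryable<T> source,
        bool condition,
        Expression<Func<T, bool>> predicate)
        => condition ? source.Where(predicate) : source;
}

// Hypothetical entity with the two searchable fields.
public class Foo
{
    public string Field1 { get; set; } = "";
    public string Field2 { get; set; } = "";
}

public static class FooSearch
{
    // Each criterion is added to the query only if it is non-empty,
    // so no catch-all condition ever reaches the database.
    public static IQueryable<Foo> Search(
        IQueryable<Foo> data, string searchCriteria1, string searchCriteria2) =>
        data.WhereIf(!string.IsNullOrEmpty(searchCriteria1),
                     p => p.Field1.Contains(searchCriteria1))
            .WhereIf(!string.IsNullOrEmpty(searchCriteria2),
                     p => p.Field2.Contains(searchCriteria2));
}
```

Because the helper takes an Expression rather than a Func, the filter still ends up in the generated SQL when the source is an EF query.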
Efficiency
Whether you can write an efficient query depends on the search criteria. The clause field LIKE '%potato%' can't use any indexes and will end up scanning the entire table.
On the other hand, field LIKE 'potato%' can take advantage of an index that covers field, because it is converted to a range search like field >= 'potato' AND field < 'potatp'.
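The range trick can be illustrated in a few lines. This is a simplified sketch using ordinal string comparison and an exclusive upper bound; the database applies its own collation rules, and the PrefixRange class is purely illustrative.

```csharp
using System;

public static class PrefixRange
{
    // Exclusive upper bound for "starts with prefix" under ordinal ordering:
    // bump the last character by one, e.g. "potato" becomes "potatp".
    // Assumes a non-empty prefix whose last character is not char.MaxValue.
    public static string UpperBound(string prefix)
    {
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("Prefix must not be empty.", nameof(prefix));

        char last = prefix[prefix.Length - 1];
        if (last == char.MaxValue)
            throw new ArgumentException("Last character cannot be incremented.", nameof(prefix));

        return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
    }

    // A prefix match is the same as lying inside the half-open range
    // [prefix, UpperBound(prefix)), which is what an index seek can use.
    public static bool InRange(string value, string prefix) =>
        string.CompareOrdinal(value, prefix) >= 0 &&
        string.CompareOrdinal(value, UpperBound(prefix)) < 0;
}
```

A value such as "potatoes" falls inside the range, while "sweet potato" does not, which is why only the prefix form of LIKE can be answered by an index seek.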
If you want to implement autocomplete or spell checking though, you often want to find text that has the fewest differences from the criteria.
Full Text Search
You can efficiently search for words, word variations and even full phrases using Full-Text Search indexes and FTS functions like CONTAINS or FREETEXT.
FTS is similar to how Google or ... StackOverflow searches for words or sentences.
Quoting from the docs:
CONTAINS can search for:
A word or phrase.
The prefix of a word or phrase.
A word near another word.
A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").
FREETEXT on the other hand is closer to how Google/SO work by searching for an entire phrase, returning close matches, not just exact matches.
Both CONTAINS and FREETEXT are available in EF Core 5 and later, through the EF.Functions.Contains and EF.Functions.FreeText methods of the SQL Server provider.
This means that if you want to search for a word or phrase, you could construct a proper FTS argument and use:
var searchCriteria1 = "Mountain OR Road";
if (!String.IsNullOrEmpty(searchCriteria1))
{
    query = query.Where(p => EF.Functions.Contains(p.Field1, searchCriteria1));
}
That's a lot easier than using LINQKit.
Or search for ride, riding, ridden with:
var searchCriteria1 = "FORMSOF(INFLECTIONAL, ride)";
Shorter syntax:
data.Where(p => (string.IsNullOrEmpty(searchriteria1) || p.field1.Contains(searchriteria1))
&& (string.IsNullOrEmpty(searchriteria2) || p.field2.Contains(searchriteria2)));
public static List<Test> getAll(Expression<Func<Test, bool>> filter = null)
{
return filter == null ? context.Set<Test>().ToList() : context.Set<Test>().Where(filter).ToList();
}
To filter:
var l = getAll(p => p.field1.Contains(searchriteria1) && p.field2.Contains(searchriteria2));
Without a filter:
var l = getAll();

Elasticsearch searching exact value of field with case insensitive

How can I search an Elasticsearch field by its exact value, but case-insensitively?
For example, I have a field with the value { "Type": "Płatność kartą" },
and my query searches for the value "płatność kartą". I need to be able to search by a list of string parameters (e.g. "płatność kartą", "płatność gotówką", etc.). I tried the Elasticsearch terms query, but it returns nothing when the case differs. The field's index option is set to not_analyzed.
If you choose not_analyzed when indexing, Elasticsearch does not analyze these values at index time, which means they are stored verbatim. So when you query with different casing, you get no results because the query terms don't match the stored values.
In order to be able to query with lowercase and get the uppercase results, too, you need to use an analyzer on your mapping. Here are the available options from the docs.
If none of the available analyzers fits, you can define a custom one by specifying the filters you want applied. For example, using just the lowercase filter, Elasticsearch will index the RegisteredPaymentType field lowercased. Then, when you query with an analyzed query such as match, the same analyzer is applied to the query text and you get the expected results.
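The idea can be sketched without Elasticsearch at all: fold the case once when storing and again when querying, and otherwise keep the value whole. The code below only imitates that behavior, assuming a keyword-style tokenizer combined with a lowercase filter; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LowercaseNormalizerSketch
{
    // Stand-in for a keyword tokenizer plus lowercase filter:
    // the whole value stays one term, only the case is folded.
    public static string Normalize(string value) => value.ToLowerInvariant();

    // "Index time": store the normalized form of every value.
    public static HashSet<string> Index(IEnumerable<string> values) =>
        new HashSet<string>(values.Select(v => Normalize(v)), StringComparer.Ordinal);

    // "Query time": normalize each query term the same way, then look it up
    // exactly. Any one matching term counts as a hit, as with a terms query.
    public static bool Matches(HashSet<string> index, IEnumerable<string> queryTerms) =>
        queryTerms.Any(term => index.Contains(Normalize(term)));
}
```

If you keep using a terms query, which does not analyze its input, the equivalent is to lowercase the query terms in the client before sending them.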

Textual Mining on the column Cell of Table that remove the Duplicates based on "##" notation

Let's assume I have a table in SQL Server that represents employee information.
I want to do text mining on the Degree column to remove the duplicates, which are separated by the "##" notation.
LINQ to SQL
I am using LINQ to SQL, so I am planning to load this data into a C# variable through the context, perform the operation on the string, and store it back in the same location.
Rules: I need to either update the data or generate a new table.
Is this the right way of doing it, and is it possible? Suggestions on this approach or alternatives are welcome.
So it looks like you need to break up the string based on the "##" delimiters, take the distinct items, and put them back in -- comma-delimited this time? The String.Split method to break up the string and then LINQ's Distinct extension method should get you just the unique ones.
Assuming you've got the text of the degree in a variable somewhere:
var uniques = degree
.Split(new String[] { "##" }, StringSplitOptions.None)
.Distinct();
String.Split usually works with a single character delimiter, but there's an overload that allows splitting on a larger string, so you'll have to use that one.
Then you can use String.Join to comma-delimit the unique items, or whatever else you need to do.
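Put together, the three steps could look like the helper below. The trimming and the removal of empty entries are assumptions on my part (the snippet above keeps empty entries), and the sample value in the usage note is made up since the original table isn't shown.

```csharp
using System;
using System.Linq;

public static class DegreeCleaner
{
    // Splits on the "##" delimiter, drops duplicates, and joins the
    // remaining items with commas.
    // Assumption: surrounding whitespace and empty items are not wanted.
    public static string Deduplicate(string degree) =>
        string.Join(",", degree
            .Split(new[] { "##" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(item => item.Trim())
            .Distinct());
}
```

For a hypothetical value such as "BSc##MSc##BSc##PhD", DegreeCleaner.Deduplicate returns the three distinct degrees BSc, MSc and PhD joined by commas.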
Edit: Apologies, I thought your original question was more about how to eliminate the duplicates than how to use LINQ to SQL.
Assuming you've got your DataContext and object model set up, you just need to select your object(s) out of the database using LINQ to SQL, make the changes you need to them, and then call SubmitChanges() on the DataContext.
For example:
var degrees = from d in context.GetTable<Employee>() select d;
foreach (var d in degrees)
{
d.Degree = String.Join(",", d.Degree
.Split(new String[] { "##" }, StringSplitOptions.None)
.Distinct());
}
context.SubmitChanges();
If you're new to LINQ to SQL, it may be worthwhile to run through a tutorial or two first. Here's part 1 of a pretty good series:
Lastly, you mentioned in your edit that you have the option of creating a new table after making your changes -- if that's the case, I'd consider storing the individual degrees in a table that links back to the employee record, rather than storing them as comma-separated values. It depends on your needs, of course, but SQL is designed to work in tables and sets, so the less string parsing/processing you can do the better.
Good luck!

How do I calculate a checksum on all columns in a row using LINQ and Entity Framework?

The query I am trying to execute is similar to this:
var checksum = from i in db.Items
where i.Id == id
select SqlFunctions.Checksum("*");
However, this returns the checksum value of the string "*" rather than evaluating the wildcard. Is there a way to calculate the checksum of all the columns instead?
Update:
var checksum = db.Database.SqlQuery<int?>("SELECT CHECKSUM(*) FROM [Catalog].[Item] WHERE Id = @p0", id);
This gives me the result I want but seems dirty. Is there a way to do this without inline SQL?
This can be done with the SqlFunctions class. This class lets LINQ to Entities code include methods that are easily converted to SQL.
First of all, regarding your current edit: using inline SQL is not 'dirty' and is totally fine in most (if not all) cases. ORMs don't provide everything, especially if there isn't a good object-column mapping. However, since you're using Entity Framework, you might as well get acquainted with the SqlFunctions static methods.
In this case there are a lot of overloads for performing a checksum, but all parameters of a given call must be of the same type. Since you didn't post what types your columns are or how many you have, I don't want to recommend the wrong overload in an example for you to use.
Here are your options:
SqlFunctions.Checksum():
bool?
byte[]
DateTime?
DateTimeOffset?
Decimal?
double?
Guid?
TimeSpan?
String
All of the above have overloads to allow up to 3 parameters (of the same type).
SqlFunctions.ChecksumAggregate():
IEnumerable<int>
IEnumerable<int?>
If you take a look at the documentation for these functions you'll see that the parameters that you're passing are VALUES, not column names. So you should be using them inside of a Select() clause. This is why when you passed "*" to the operation it checksummed the string containing a single asterisk instead of all columns. Also, keep in mind that these functions cannot be called directly, and must only be used within a Linq-To-Entities query.
Let's assume your columns named "ItemName" & "Description" are both strings, and you also want your id, which is an int:
var checksum = db.Items.Where(i => i.Id == id)
.Select(i => SqlFunctions.Checksum(i.Id.ToString(), i.ItemName, i.Description));
Unfortunately, as you see in the above example, we had to convert our int to a string. There are no overloads that accept parameters of different types, nor are there any that accept more than 3 parameters; however, as I mentioned above, sometimes you need an inline SQL command, and that is OK.

Search Substring on a Integer Value

Let's say we have a mongodb collection that has elements containing an int attribute value like: {"MyCollectionAttribute": 12345}
How can I search for the string "234" inside the int using the Query<T> syntax?
For now it seems to work (as explained here) using a raw query like:
var query = new QueryDocument("$where", "/234/.test(this.MyCollectionAttribute)");
myCollection.Find(query);
Is it preferable to store the values directly as strings instead of integers, since a regex match will be slow? How do you approach these situations?
Edit
Context: a company can have some internal codes that are numbers. In SQL Server they can be stored in a column of type int in order to have data integrity at the database level, and then queried from LINQ to SQL with something like:
.Where(item => item.CompanyCode.ToString().Contains("234"))
In this way there is both data integrity at db level and type safety of the query.
I asked the question in order to see how this scenario can be implemented using mongodb.
What you are asking does not make much sense.
Regular expressions are for searching within strings, not within integers.
If you want to perform a substring search (for whatever reason), then store your numbers as strings and not as integers.
