I am using the official C# MongoDB driver.
If I have an index on three elements {"firstname":1,"surname":1,"companyname":1} can I search the collection by using a regular expression that directly matches against the index value?
So, if someone enters "sun bat" as a search term, I would create a regex as follows:
(?=.*\bsun)(?=.*\bbat).*
This should match any index entries where firstname or surname or companyname starts with 'sun' AND where firstname or surname or companyname starts with 'bat'.
If I can't do it this way, how can I do it? The user just types their search terms, so I won't know which element (firstname, surname, companyname) each search term (sun or bat) refers to.
Update: for MongoDB 2.4 and above you should not use this method but use MongoDB's text index instead.
Below is the original and still relevant answer for MongoDB < 2.4.
Great question. Keep this in mind:
MongoDB can only use one index per query.
Queries that use regular expressions only use an index when the regex is rooted and case sensitive.
The best way to do a search across multiple fields is to create an array of search terms (lower case) for each document and index that field. This takes advantage of the multi-keys feature of MongoDB.
So the document might look like:
{
"firstname": "Tyler",
"surname": "Brock",
"companyname": "Awesome, Inc.",
"search_terms": [ "tyler", "brock", "awesome inc"]
}
You would create an index: db.users.ensureIndex({ "search_terms": 1 })
Then when someone searches for "Tyler", you smash the case and search the collection using a case sensitive regex that matches the beginning of the string:
db.users.find({ "search_terms": /^tyler/ })
When executing this query, MongoDB tries to match your term against every element of the array (the index is set up that way too, so it's speedy). Hopefully that will get you where you need to be, good luck.
Note: These examples are in the shell. I have never written a single line of C# but the concepts will translate even though the syntax may differ.
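For C# readers, here is a minimal sketch of the client-side half of this approach: building the lowercase search_terms array and turning the user's input into one rooted, case-sensitive pattern per word. It deliberately avoids the driver API. SearchTerms, Build, ToPatterns and Matches are hypothetical names, and Matches is only an in-memory stand-in for what the server would do against the multikey index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helpers; not part of the MongoDB C# driver.
public static class SearchTerms
{
    // Values to store in the document's "search_terms" array:
    // punctuation stripped, trimmed, lowercased.
    public static string[] Build(params string[] fieldValues) =>
        fieldValues
            .Where(v => !string.IsNullOrWhiteSpace(v))
            .Select(v => Regex.Replace(v, @"[^\w\s]", "").Trim().ToLowerInvariant())
            .ToArray();

    // One rooted, case-sensitive pattern per word the user typed.
    public static string[] ToPatterns(string userInput) =>
        userInput
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => "^" + Regex.Escape(w.ToLowerInvariant()))
            .ToArray();

    // In-memory imitation of the query: every pattern must match
    // at least one element of the array.
    public static bool Matches(IEnumerable<string> searchTerms, string userInput) =>
        ToPatterns(userInput).All(p => searchTerms.Any(t => Regex.IsMatch(t, p)));
}
```

For the "sun bat" case from the question, each pattern would become its own condition on search_terms, combined with AND (for example via $and in the shell), so the application never needs to know which field a word came from.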
I have a search where I use LINQ with EF. Whenever a search criterion is null or empty, I need to return everything. I started with if conditions, and from there moved to a solution like this:
data = data
.Where(p => !string.IsNullOrEmpty(searchriteria1)? p.field1.Contains(searchriteria1) : true)
.Where(p => !string.IsNullOrEmpty(searchriteria2)? p.field2.Contains(searchriteria2) : true);
Is there a better way to do this? maybe use an extension or any better approach?
You could check the search criteria field previously and build up the query this way:
IQueryable<Foo> data = context.Foo.AsQueryable();
if(!string.IsNullOrEmpty(searchriteria1))
{
data = data.Where(p => p.field1.Contains(searchriteria1));
}
if (!string.IsNullOrEmpty(searchriteria2))
{
data = data.Where(p => p.field2.Contains(searchriteria2));
}
There are two parts to the question. How to filter dynamically and how to filter efficiently.
Dynamic criteria
For the first question, there's no need for a catch-all query when using LINQ. Catch-all queries result in inefficient execution plans, so it's best to avoid them.
LINQ isn't SQL though. You can construct your query part by part. The final query will be translated to SQL only when you try to enumerate it. This means you can write :
if(!String.IsNullOrEmpty(searchCriteria1))
{
query = query.Where(p => p.Field1.Contains(searchCriteria1));
}
You can chain multiple Where calls to get the equivalent of multiple AND criteria.
To generate more complex queries using e.g. OR, you'd have to construct the proper Expression<Func<...,bool>> objects, or use a library like LINQKit to make this bearable.
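Since the question asks about an extension method, the chained-Where pattern can be wrapped in one. This is a minimal sketch: WhereIf, Item and ItemSearch are hypothetical names, not part of LINQ or EF. Because the method takes an Expression, the predicate should still be translatable when the source is an EF query, while the sketch itself only needs an in-memory IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical extension; not part of LINQ or EF.
public static class QueryableExtensions
{
    // Adds the predicate only when the condition holds;
    // otherwise the query is returned untouched.
    public static IQueryable<T> WhereIf<T>(
        this IQueryable<T> source,
        bool condition,
        Expression<Func<T, bool>> predicate) =>
        condition ? source.Where(predicate) : source;
}

// Illustrative entity with the two fields from the question.
public class Item
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
}

public static class ItemSearch
{
    // Empty or null criteria contribute nothing to the final query.
    public static IQueryable<Item> Apply(
        IQueryable<Item> query, string criteria1, string criteria2) =>
        query
            .WhereIf(!string.IsNullOrEmpty(criteria1), p => p.Field1.Contains(criteria1))
            .WhereIf(!string.IsNullOrEmpty(criteria2), p => p.Field2.Contains(criteria2));
}
```

The design point is the same as with the if blocks: skipped criteria never reach the generated SQL, so there is no catch-all condition for the database to plan around.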
Efficiency
Whether you can write an efficient query depends on the search criteria. The clause field LIKE '%potato%' can't use any indexes and will end up scanning the entire table.
On the other hand, field LIKE 'potato%' can take advantage of an index that covers field because it will be converted to a range search like field >= 'potato' AND field < 'potatp'.
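To make the range conversion concrete, here is a small sketch of how the exclusive upper bound can be derived from a prefix by incrementing its last character. PrefixRange is a hypothetical helper, and it compares strings ordinally; a real database applies the column's collation, so this only illustrates the idea.

```csharp
using System;

// Hypothetical helper illustrating the prefix-to-range idea.
public static class PrefixRange
{
    // Exclusive upper bound for a prefix search, e.g. "potato" -> "potatp".
    // Returns null when no bound can be derived (empty prefix, or last
    // character already at the maximum value).
    public static string UpperBound(string prefix)
    {
        if (string.IsNullOrEmpty(prefix)) return null;
        char last = prefix[prefix.Length - 1];
        if (last == char.MaxValue) return null;
        return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
    }

    // True when prefix <= value < UpperBound(prefix), using ordinal comparison.
    public static bool InRange(string value, string prefix)
    {
        string upper = UpperBound(prefix);
        return string.CompareOrdinal(value, prefix) >= 0
            && (upper == null || string.CompareOrdinal(value, upper) < 0);
    }
}
```

A leading wildcard offers no such bounds, which is why '%potato%' falls back to a scan.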
If you want to implement autocomplete or spell checking though, you often want to find text that has the fewest differences from the criteria.
Full Text Search
You can efficiently search for words, word variations and even full phrases using Full-Text Search indexes and FTS functions like CONTAINS or FREETEXT.
FTS is similar to how Google or ... StackOverflow searches for words or sentences.
Quoting from the docs:
CONTAINS can search for:
A word or phrase.
The prefix of a word or phrase.
A word near another word.
A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").
FREETEXT on the other hand is closer to how Google/SO work by searching for an entire phrase, returning close matches, not just exact matches.
Both CONTAINS and FREETEXT are available in EF Core 5 and later, through the EF.Functions.Contains and EF.Functions.FreeText methods (SQL Server provider extensions on DbFunctions).
This means that if you want to search for a word or phrase, you could construct a proper FTS argument and use :
var searchCriteria1 = "Mountain OR Road";
if(!String.IsNullOrEmpty(searchCriteria1))
{
query = query.Where(p => EF.Functions.Contains(p.Field1, searchCriteria1));
}
That's a lot easier than using LINQKit.
Or search for ride, riding, ridden with :
var searchCriteria1 = "FORMSOF(INFLECTIONAL, ride)";
A shorter syntax:
data.Where(p => (string.IsNullOrEmpty(searchriteria1) || p.field1.Contains(searchriteria1))
&& (string.IsNullOrEmpty(searchriteria2) || p.field2.Contains(searchriteria2)));
public static List<Test> getAll(Expression<Func<Test, bool>> filter = null)
{
return filter == null ? context.Set<Test>().ToList() : context.Set<Test>().Where(filter).ToList();
}
If you want to filter
var l=getAll(p => p.field1.Contains(searchriteria1)&&p.field2.Contains(searchriteria2));
no filter
var l=getAll();
I am new to Lucene.net. I would like to know how to build a Lucene search query that works much like an SQL query. Let me give more detail.
I have a set of parameter values, much like the parameters of a stored procedure. Now I want to build a query from all of these parameters.
searchParams.UseLast = Convert.ToBoolean(base.Arguments["UseLast"]);
searchParams.LastEditedFrom= Convert.ToDateTime(base.Arguments["LastEditedFrom"]);
searchParams.LastEditedTo = Convert.ToDateTime(base.Arguments["LastEditedTo"]);
searchParams.Reviewed = Convert.ToBoolean(base.Arguments["Reviewed"]);
searchParams.Approved = Convert.ToBoolean(base.Arguments["Approved"]);
searchParams.Include = Convert.ToBoolean(base.Arguments["Include"]);
searchParams.IsVisibleToUser = Convert.ToBoolean(base.Arguments["IsVisibleToUser"]);
searchParams.IsEntry = Convert.ToBoolean(base.Arguments["IsEntry"]);
searchParams.UserId = Convert.ToInt32(base.Arguments["UserId"]);
IEnumerable Categories = base.Arguments["Categories"] as IEnumerable;
IEnumerable Departments = base.Arguments["Departments"] as IEnumerable;
String mQuery = "How to construct it ....!!!"; // Need help in this
var query = queryParser.Parse(mQuery);
indexSearcher.Search(query, collector);
Here I want to fetch all records from the Lucene index that match the values of all the fields above.
I'm unclear what you are using searchParams for. In general, though, you can construct your query string (mQuery) in this case with any of the features of the Lucene query syntax. See the Query Parser Syntax documentation for Lucene.Net version 4.8.
In general, when multiple words are listed in the query they are combined with a logical OR, but documents that contain all the terms rank higher than documents with only one. For example, white dog would match docs containing white dog, or only white, or only dog. Put AND (operators must be uppercase) between the terms if you only want docs that match all of them: small AND white AND dog matches only docs that contain all three terms.
To specify the field to search, write the field name followed by a colon. For example, you can search UserId:ron AND Categories:dogs. There is much more to the Lucene query syntax, but hopefully that will get you started. For more details see the Lucene query syntax doc I referred to.
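As a rough sketch of how the question's mQuery could be assembled from those pieces, the helper below joins field:value clauses with AND, uses the range syntax for the dates and an OR group for the categories. Everything about the index is an assumption here: the field names (including the hypothetical LastEdited), booleans stored as the terms true/false, and dates stored as sortable yyyyMMdd strings. LuceneQueryText is a hypothetical name, not a Lucene.Net type.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper; field names and value formats are assumptions
// about how the index was built.
public static class LuceneQueryText
{
    public static string Build(
        int userId, bool reviewed, bool approved,
        DateTime lastEditedFrom, DateTime lastEditedTo,
        IEnumerable<string> categories)
    {
        var inv = CultureInfo.InvariantCulture;
        var clauses = new List<string>
        {
            "UserId:" + userId.ToString(inv),
            "Reviewed:" + (reviewed ? "true" : "false"),
            "Approved:" + (approved ? "true" : "false"),
            // Inclusive range over dates indexed as yyyyMMdd strings (assumed).
            "LastEdited:[" + lastEditedFrom.ToString("yyyyMMdd", inv)
                + " TO " + lastEditedTo.ToString("yyyyMMdd", inv) + "]"
        };

        // Quote each category as a phrase and escape backslashes and quotes.
        var quoted = categories
            .Select(c => "\"" + c.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"")
            .ToList();
        if (quoted.Count > 0)
            clauses.Add("Categories:(" + string.Join(" OR ", quoted) + ")");

        // Every clause is required, like ANDed conditions in a WHERE clause.
        return string.Join(" AND ", clauses);
    }
}
```

The resulting string is what would be handed to queryParser.Parse. If the fields were indexed as numeric types rather than plain terms, the parsed query may not match, and building the query objects programmatically would be the safer route.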
How can I search in elasticsearch field by its exact value but with case insensitive query?
For example I have field with value { "Type": "Płatność kartą" },
and my query will search by the value "płatność kartą". I need to be able to search by a list of string parameters (i.e. "płatność kartą", "płatność gotówką", etc.). I tried the Elasticsearch terms query, but it returned nothing when the case differed. The field's index setting is not_analyzed.
If you choose not_analyzed when indexing, Elastic does not analyze these terms at index time, which means they are stored verbatim. So when you query with different casing, you get no results because the query terms don't match the stored values.
In order to query with lowercase and get the uppercase results too, you need to use an analyzer in your mapping. The docs list the available options.
If none of the available analyzers fit, you can define a custom one by specifying the filters you want applied. For example, using just the lowercase filter, Elastic will index the RegisteredPaymentType field lowercased. Then, while querying, the same analyzer will be applied to the query and you will get the expected results.
Following is my schema
Product_Name (Analyzed),Category (Analyzed)
Scenario:
I want to search those products whose category is exactly "Cellphones & Accessories" and Product_Name is "sam*"
Equivalent SQL Query is
select * from products
where Product_Name like '%sam%' and Category='Cellphones & Accessories'
I am using lucene.net.
I need equivalent lucene.net statement.
As this is a few months old I'll be brief (I can expand if you're still interested)...
If you want to have an exact match to Category then do not analyze. Analyzers will chop the string up into bits which are then searchable. Matching case can be problematic so maybe just the lowercase analyzer would work for that field.
It might be useful to have several fields analyzed in different ways so that different queries can be used.
NOTE: "sam*" is not equivalent to "%sam%"
Do you want "sam" to be a prefix ie "sample" or a word "the sam product"?
If it's a word then a no stopword analyzer should be fine.
A nice trick is to create many fields (with the same name) with variations of the name. Probably with just a lower case analyzer
name: "some sample product"
name: "sample product"
name: "product"
Then have a look at "prefix queries". A query of (name:sam*) would then match.
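The variations listed above can be generated mechanically: lowercase the name, split it into words, and emit every suffix of the word list. A minimal sketch, assuming whitespace-separated words; NameVariations is a hypothetical helper and does not call Lucene.Net itself. Each returned string would be added to the document as another value of the same name field.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; produces the values to index under one field name.
public static class NameVariations
{
    // "Some Sample Product" -> "some sample product", "sample product", "product"
    public static List<string> Build(string name)
    {
        string[] words = name
            .ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        var variations = new List<string>();
        for (int i = 0; i < words.Length; i++)
        {
            // Drop the first i words and keep the rest.
            variations.Add(string.Join(" ", words.Skip(i)));
        }
        return variations;
    }
}
```

Because every word ends up at the start of some variation, a prefix search for sam finds the document even though "sample" is in the middle of the original name.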
Also have a look at the PerFieldAnalyzerWrapper in order to use a different analyzer for each field.
I'm trying to implement a search engine and wonder what the best way is to search a collection of entities, where an entity is a data object and the search criteria change from time to time, both in the number of fields to search by and in which fields those are. For example:
given a collection of itemEntity (an object containing id, name, gender, age, etc.), I would like the search to be flexible: you can search by name + gender, or by id only, and so on.
How to do it?
p.s.
I'm writing in c#
Scott Gu blogged about the dynamic linq expressions, you can find something useful there.
Never mind, and thanks for trying to help. I did it myself:
I go over all the search criteria (received in a dictionary) and, with a switch statement, apply a LINQ query for each one to get the results.
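A minimal sketch of what that dictionary-plus-switch approach might look like. The asker's actual code is not shown, so ItemEntity, ItemEntitySearch, the key names and the string-valued dictionary are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Illustrative entity with the fields named in the question.
public class ItemEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Gender { get; set; }
    public int Age { get; set; }
}

// Hypothetical reconstruction of the dictionary + switch idea.
public static class ItemEntitySearch
{
    public static IEnumerable<ItemEntity> Search(
        IEnumerable<ItemEntity> items, IDictionary<string, string> criteria)
    {
        foreach (var criterion in criteria)
        {
            string value = criterion.Value;
            switch (criterion.Key)
            {
                case "id":
                {
                    int id = int.Parse(value, CultureInfo.InvariantCulture);
                    items = items.Where(e => e.Id == id);
                    break;
                }
                case "name":
                    items = items.Where(e =>
                        string.Equals(e.Name, value, StringComparison.OrdinalIgnoreCase));
                    break;
                case "gender":
                    items = items.Where(e =>
                        string.Equals(e.Gender, value, StringComparison.OrdinalIgnoreCase));
                    break;
                case "age":
                {
                    int age = int.Parse(value, CultureInfo.InvariantCulture);
                    items = items.Where(e => e.Age == age);
                    break;
                }
                default:
                    throw new ArgumentException("Unknown search field: " + criterion.Key);
            }
        }
        // Each criterion narrowed the sequence, so the result is the AND of all of them.
        return items;
    }
}
```

An empty dictionary returns the collection unchanged, and adding a new searchable field means adding one case.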