How to construct a lucene search query with multiple parameters - c#

I am new to Lucene.net. I would like to know how to build a Lucene search query that works much like an SQL query. Let me give more detail.
I have a set of parameter values, much like the parameters of a stored procedure. Now I want to build a query from all of these parameters.
searchParams.UseLast = Convert.ToBoolean(base.Arguments["UseLast"]);
searchParams.LastEditedFrom= Convert.ToDateTime(base.Arguments["LastEditedFrom"]);
searchParams.LastEditedTo = Convert.ToDateTime(base.Arguments["LastEditedTo"]);
searchParams.Reviewed = Convert.ToBoolean(base.Arguments["Reviewed"]);
searchParams.Approved = Convert.ToBoolean(base.Arguments["Approved"]);
searchParams.Include = Convert.ToBoolean(base.Arguments["Include"]);
searchParams.IsVisibleToUser = Convert.ToBoolean(base.Arguments["IsVisibleToUser"]);
searchParams.IsEntry = Convert.ToBoolean(base.Arguments["IsEntry"]);
searchParams.UserId = Convert.ToInt32(base.Arguments["UserId"]);
IEnumerable Categories = base.Arguments["Categories"] as IEnumerable;
IEnumerable Departments = base.Arguments["Departments"] as IEnumerable;
String mQuery = "How to construct it ....!!!"; // Need help in this
var query = queryParser.Parse(mQuery);
indexSearcher.Search(query, collector);
Here I want to fetch all records from the Lucene index that have the given values for all of the above fields.

I'm unclear what you are using searchParams for, but in general you can construct your query string (mQuery) in this case with any of the features of the Lucene query syntax. See the Query Parser Syntax documentation for Lucene.Net version 4.8.
In general, when multiple words are listed in the query they are combined with a logical OR, but documents that contain all the terms are ranked higher than documents with only one term. For example, white dog would match documents containing both white and dog, only white, or only dog. Put AND between the terms if you only want documents that match all of them, for example small AND white AND dog. The operator must be written in capitals; a lowercase and is treated as just another search term.
To search a specific field, write the field name followed by a colon. For example, you can search UserId:ron AND Categories:dogs. There is much more to the Lucene query syntax, but hopefully that will get you started. For more details see the Lucene query syntax documentation mentioned above.
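To make the field:value approach concrete, here is a minimal sketch that assembles such a query string from optional parameters. The class name, the method signature and the field names (Approved, UserId, Categories) are assumptions for illustration, and it assumes the values were indexed as plain string terms; numeric or date fields indexed in another way would need a different kind of query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LuceneQueryBuilder
{
    // Builds "Field:value" clauses and joins them with AND so that every
    // supplied criterion must match. Field names are hypothetical and must
    // match the names used when the index was built.
    public static string Build(bool? approved, int? userId, IEnumerable<string> categories)
    {
        var clauses = new List<string>();

        if (approved.HasValue)
            clauses.Add("Approved:" + (approved.Value ? "true" : "false"));

        if (userId.HasValue)
            clauses.Add("UserId:" + userId.Value);

        // Any one of the categories may match, so group them with OR.
        // Each value is quoted, with backslashes and quotes escaped.
        var cats = (categories ?? Enumerable.Empty<string>())
            .Where(c => !string.IsNullOrWhiteSpace(c))
            .Select(c => "\"" + c.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"")
            .ToList();
        if (cats.Count > 0)
            clauses.Add("Categories:(" + string.Join(" OR ", cats) + ")");

        // Parameters that were not supplied simply contribute no clause.
        return string.Join(" AND ", clauses);
    }
}
```

Calling LuceneQueryBuilder.Build(true, 42, new[] { "dogs", "cats" }) returns Approved:true AND UserId:42 AND Categories:("dogs" OR "cats"), which can then be passed to queryParser.Parse.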

Related

Performing a wildcard search

I have a search where I use LINQ with EF. Whenever the search criteria are null or empty I need to return everything. Initially I used if conditions as a solution, and from that I moved to a solution like this:
data = data
.Where(p => !string.IsNullOrEmpty(searchriteria1)? p.field1.Contains(searchriteria1) : true)
.Where(p => !string.IsNullOrEmpty(searchriteria2)? p.field2.Contains(searchriteria2) : true);
Is there a better way to do this? Maybe an extension method or some other approach?
You could check each search criterion beforehand and build up the query this way:
IQueryable<Foo> data = context.Foo.AsQueryable();
if(!string.IsNullOrEmpty(searchriteria1))
{
data = data.Where(p => p.field1.Contains(searchriteria1));
}
if (!string.IsNullOrEmpty(searchriteria2))
{
data = data.Where(p => p.field2.Contains(searchriteria2));
}
There are two parts to the question. How to filter dynamically and how to filter efficiently.
Dynamic criteria
For the first question, there's no need for a catch-all query when using LINQ. Catch-all queries result in inefficient execution plans, so it's best to avoid them.
LINQ isn't SQL though. You can construct your query part by part. The final query will be translated to SQL only when you try to enumerate it. This means you can write :
if(!String.IsNullOrEmpty(searchCriteria1))
{
query = query.Where(p => p.Field1.Contains(searchCriteria1));
}
You can chain multiple Where calls to get the equivalent of multiple AND criteria.
To generate more complex queries using e.g. OR, you'd have to construct the proper Expression<Func<...,bool>> objects, or use a library like LINQKit to make this bearable.
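Since the question asks about an extension, the conditional chaining above could be wrapped in one. This is only a sketch: WhereIf is not part of LINQ or EF, and the Foo class and FooSearch helper are made up for the example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableExtensions
{
    // Applies the predicate only when 'condition' is true; otherwise the
    // source query is returned unchanged, so no catch-all clause is generated.
    public static IQueryable<T> WhereIf<T>(
        this IQueryable<T> source, bool condition, Expression<Func<T, bool>> predicate)
    {
        return condition ? source.Where(predicate) : source;
    }
}

// Hypothetical entity used only for this example.
public class Foo
{
    public string Field1 { get; set; } = "";
    public string Field2 { get; set; } = "";
}

public static class FooSearch
{
    public static IQueryable<Foo> Search(IQueryable<Foo> data, string criteria1, string criteria2)
    {
        // Each filter is added only when its criterion has a value.
        return data
            .WhereIf(!string.IsNullOrEmpty(criteria1), p => p.Field1.Contains(criteria1))
            .WhereIf(!string.IsNullOrEmpty(criteria2), p => p.Field2.Contains(criteria2));
    }
}
```

Passing null or an empty string for a criterion leaves that filter out of the query entirely, so the generated SQL only contains the conditions that are actually in use.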
Efficiency
Whether you can write an efficient query depends on the search criteria. The clause field LIKE '%potato%' can't use any indexes and will end up scanning the entire table.
On the other hand, field LIKE 'potato%' can take advantage of an index that covers field because it will be converted to a range search like field >= 'potato' AND field < 'potatp'.
If you want to implement autocomplete or spell checking though, you often want to find text that has the fewest differences from the criteria.
Full Text Search
You can efficiently search for words, word variations and even full phrases using Full-Text Search indexes and FTS functions like CONTAINS or FREETEXT.
FTS is similar to how Google or ... StackOverflow searches for words or sentences.
Quoting from the docs:
CONTAINS can search for:
- A word or phrase.
- The prefix of a word or phrase.
- A word near another word.
- A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
- A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").
FREETEXT on the other hand is closer to how Google/SO work by searching for an entire phrase, returning close matches, not just exact matches.
Both CONTAINS and FREETEXT are available in EF Core 5 and later through the EF.Functions.Contains and EF.Functions.FreeText methods of the SQL Server provider.
This means that if you want to search for a word or phrase, you could construct a proper FTS argument and use:
var searchCriteria1 = "Mountain OR Road";
if (!String.IsNullOrEmpty(searchCriteria1))
{
query = query.Where(p => EF.Functions.Contains(p.Field1, searchCriteria1));
}
That's a lot easier than using LINQKit.
Or search for ride, riding, ridden with:
var searchCriteria1 = "FORMSOF(INFLECTIONAL, ride)";
shorter syntax
data.Where(p => (string.IsNullOrEmpty(searchriteria1) || p.field1.Contains(searchriteria1))
&& (string.IsNullOrEmpty(searchriteria2) || p.field2.Contains(searchriteria2)));
public static List<Test> getAll(Expression<Func<Test, bool>> filter = null)
{
return filter == null ? context.Set<Test>().ToList() : context.Set<Test>().Where(filter).ToList();
}
If you want to filter
var l=getAll(p => p.field1.Contains(searchriteria1)&&p.field2.Contains(searchriteria2));
no filter
var l=getAll();

Stored procedure fulltext search

Can I have some help with the following?
My C# app calls a SQL stored procedure and passes it a search term (a string containing multiple words).
public List<ZapisModel> GetZapis_FullText(string Parametar)
{
using (IDbConnection connection = new System.Data.SqlClient.SqlConnection(SQLConn.CnnVal(Program.db)))
{
var output = connection.Query<ZapisModel>("dbo.spZapis_GetByTest_FTC @parametar", new { parametar = Parametar }).ToList();
return output;
}
}
The SQL stored procedure should take that parameter and do a full-text search with each word from the string. I want to let the user search for ALL words from the string or for ANY word from the string, which is why I need to split the string and combine the words in the search using AND or OR, based on user preference.
Also, I would like this job done entirely in the stored procedure: the app passes the search term, and the stored procedure processes it and returns the search result.
I suggest you don't parse your search criteria string at SQL procedure level. Instead, prepare the correct FTS search string using C#. In my projects I use a small (just 2 files) freeware library called EasyFTS.
Here is the EasyFTS project page on GitHub: https://github.com/SoftCircuits/FullTextSearchQuery
The syntax is very easy:
var fts = new EasyFts();
var orignalNonFTSEnabledQueryString = "Center Neck Spray Bottle";
var ftsEnabledsearchText = fts.ToFtsQuery(orignalNonFTSEnabledQueryString);
The above code will convert a normal, non-FTS-enabled search string, e.g. Center Neck Spray Bottle, to the FTS-enabled version like:
"(((FORMSOF(THESAURUS, Center) AND FORMSOF(THESAURUS, Neck)) AND FORMSOF(THESAURUS, Spray)) AND FORMSOF(THESAURUS, Bottle))"
which you can use in the CONTAINS/CONTAINSTABLE or FREETEXT/FREETEXTTABLE clauses.
There are many configuration options for the EasyFTS class. You can have it generate FTS query strings with words separated by AND or OR. Just examine their docs.
This article contains very thorough description of EasyFTS: http://www.blackbeltcoder.com/Articles/data/easy-full-text-search-queries .
Disclaimer: I'm not the author of the EasyFTS library.
HTH
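For the simple ALL-words / ANY-word case from the question, the search string can also be prepared in C# without a library. The following is a minimal sketch, not the EasyFTS implementation; the class and method names are made up, and it assumes the stored procedure passes the resulting string straight to CONTAINS.

```csharp
using System;
using System.Linq;

public static class FtsConditionBuilder
{
    // Turns "Center Neck Spray" into "\"Center\" AND \"Neck\" AND \"Spray\""
    // (or the OR equivalent). Returns an empty string when there are no words.
    public static string Build(string input, bool matchAll)
    {
        var words = (input ?? "")
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Replace("\"", ""))   // drop quotes so each term stays well formed
            .Where(w => w.Length > 0)
            .Select(w => "\"" + w + "\"")
            .ToArray();

        return string.Join(matchAll ? " AND " : " OR ", words);
    }
}
```

FtsConditionBuilder.Build("Center Neck Spray Bottle", true) returns "Center" AND "Neck" AND "Spray" AND "Bottle", and passing false for matchAll joins the words with OR instead. The caller should skip the search when the result is empty, since an empty search condition is not valid for CONTAINS.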

Linq split numeric list in search (kentico)

In Kentico, the standard way to get documents is shown below (I believe it is based on ObjectQuery and supports LINQ commands). I'm trying to filter it by one more field, "newsCategory", which contains data like "1|2|3". I can't simply add .Search("newsCategory", 1) etc., because I need to split the list before I can search it. What direction should I be looking in? A select sub-query? (I'm new to LINQ.)
// Get documents
var news = DocumentHelper.GetDocuments("CMS.News")
.OnSite("CorporateSite")
.Path("/News", PathTypeEnum.Children)
.Culture("en-us")
.CombineWithDefaultCulture(false);
Since this is a field from the coupled table, you can't access it through a property but have to use GetValue() instead. Once you have the value, you can work with it like a regular string:
var news = DocumentHelper.GetDocuments("CMS.News")
.OnSite("CorporateSite")
.Path("/News", PathTypeEnum.Children)
.Culture("en-us")
.CombineWithDefaultCulture(false)
.Where(d => d.GetStringValue("newsCategory","").Split('|').Contains("1"));
Are you sure your data is 1|2|3 and not 1|2|3| or |1|2|3?
If every ID is followed by a pipe (as in 1|2|3|), you could do .Where("NewsCategory", QueryOperator.Like, "%" + id + "|%")
Otherwise you may have to get back more results, and then loop through them and split the values to find the exact one you want.
EDIT: Check out this article that shows some more advanced where commands you can use with the Data Query API. You should be able to MacGyver a proper filter with those options.
I believe you're looking for:
.WhereLike("DocumentCategoryID", "CategoryID");
//OR
.WhereLike("DocumentCategory","CategoryName");
I don't have v8 installed to double check which exact key/value pair to filter by, but according to this Document Query API article you filter document sets with the WhereLike() method.
According to the API documentation, GetDocuments() returns a MultiDocumentQuery object. I'm not 100% certain if that implements IEnumerable, so you may not even be able to use LINQ with it.
I believe something like this would work. There is a WhereIn method that should be able to pull the value out. I'm not exactly sure how it would handle the scenario of having a 1 and then an 11, but it may be worth looking into.
// Get documents
var news = DocumentHelper.GetDocuments("CMS.News")
.OnSite("CorporateSite")
.Path("/News", PathTypeEnum.Children)
.Culture("en-us")
.CombineWithDefaultCulture(false)
.WhereIn("NewsCategory",1);
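The concern about 1 also matching 11 can be avoided by comparing whole delimited entries rather than substrings. Below is a minimal in-memory sketch of that idea; the PipeList class is hypothetical and not part of the Kentico API, and it would be applied to results after they have been fetched.

```csharp
using System;

public static class PipeList
{
    // True when 'id' is one of the '|'-separated entries in 'value'.
    // Wrapping both sides in delimiters prevents "1" from matching "11".
    public static bool ContainsId(string value, int id)
    {
        if (string.IsNullOrEmpty(value)) return false;

        // Normalise "1|2|3", "1|2|3|" and "|1|2|3" to "|1|2|3|".
        string wrapped = "|" + value.Trim('|') + "|";
        return wrapped.Contains("|" + id + "|");
    }
}
```

PipeList.ContainsId("1|2|3", 1) is true, while PipeList.ContainsId("11|2", 1) is false. The same trick works on the SQL side by wrapping the column in delimiters before applying LIKE, for example '|' + NewsCategory + '|' LIKE '%|1|%'.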

Using [0] or First() with LINQ OrderBy

If I have a structure like this
Albums
- Album
- Discs
- Tracks
and I want to order a collection of albums by the title of the first track on the first disc.
Is there something similar to the following I could do (keeping in mind I need to use the OrderBy extension method that accepts a string)?
albums.OrderBy("Discs[0].Tracks[0].Title")
I need to be able to sort using a string expression thus the need to use the OrderBy method i.e. albums.OrderBy("Track[0].Title"). The reason for this is our custom framework uses a sort expression (e.g. "Title") passed back from a GridView which is looked up in a dictionary (e.g. "Track[0].Title") to get the correct order by clause. That is, the field and direction of sorting is dynamically determined at runtime.
or
albums.OrderBy("Discs.First().Tracks.First().Title")
Untested, but how about:
var query = from album in albums
let disc = album.Discs.First()
let track = disc.Tracks.First()
orderby track.Title
select album;
LINQ has two ways to write a query: the "from . in .." query syntax and lambda expressions. The way you were almost writing it looked lambda-ish. Here would be the lambda expression:
albums.OrderBy(a=>a.Discs.First().Tracks.First().Title)
I used variable 'a' to indicate album but you can use any variable, this is identical to the first expression:
albums.OrderBy(album=>album.Discs.First().Tracks.First().Title)
or you can use the from obj in obj form as mentioned in the other answers.
How about this, in order to satisfy your need for an initial query that does not perform the sorting? This uses anonymous types to store the album information, plus the name of the first track so you can sort on it later.
var query = from album in albums
let disc = album.Discs.First()
let track = disc.Tracks.First()
select new { Album = album, FirstTrack = track.Title };
var sortedQuery = from album in query
orderby album.FirstTrack
select album.Album;
Sorry people,
It looks like the OrderBy method that I am asking about and trying to use is specific to the ORM (genom-e) that we are using and is not reflected on the .net Queryable or IEnumerable classes (unlike the majority of genom-e's LINQ functionality). There is no OrderBy overload that accepts a string in .net, this is specific to genom-e.
Those of you using .net encountering a similar problem should probably give either of the above two answers a try.
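For plain .NET, where OrderBy has no string overload, the dictionary lookup described in the question can map the sort key to a typed selector instead of a string expression. This is a sketch under assumed names: the Album, Disc and Track classes and the FirstTrackTitle key are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model mirroring the Albums > Discs > Tracks structure.
public class Track { public string Title { get; set; } = ""; }
public class Disc { public List<Track> Tracks { get; set; } = new List<Track>(); }
public class Album
{
    public string Title { get; set; } = "";
    public List<Disc> Discs { get; set; } = new List<Disc>();
}

public static class AlbumSorter
{
    // Maps the sort key coming from the grid to a typed key selector.
    // First() throws if an album has no discs or a disc has no tracks.
    private static readonly Dictionary<string, Func<Album, string>> Selectors =
        new Dictionary<string, Func<Album, string>>
        {
            ["Title"] = a => a.Title,
            ["FirstTrackTitle"] = a => a.Discs.First().Tracks.First().Title,
        };

    public static IEnumerable<Album> Sort(IEnumerable<Album> albums, string key, bool descending)
    {
        if (!Selectors.TryGetValue(key, out var selector))
            throw new ArgumentException("Unknown sort key: " + key, nameof(key));

        // Field and direction are both chosen at runtime.
        return descending
            ? albums.OrderByDescending(selector, StringComparer.Ordinal)
            : albums.OrderBy(selector, StringComparer.Ordinal);
    }
}
```

AlbumSorter.Sort(albums, "FirstTrackTitle", false) then orders the albums by the title of the first track on the first disc, and an unknown key fails fast with an ArgumentException instead of silently returning unsorted data.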

Converting user-entered search query to where clause for use in SQL Server full-text search

What's the best way to convert search terms entered by a user, into a query that can be used in a where clause for full-text searching to query a table and get back relevant results? For example, the following query entered by the user:
+"e-mail" +attachment -"word document" -"e-learning"
Should translate into something like:
SELECT * FROM MyTable WHERE (CONTAINS(*, '"e-mail"')) AND (CONTAINS(*, '"attachment"')) AND (NOT CONTAINS(*, '"word document"')) AND (NOT CONTAINS(*, '"e-learning"'))
I'm using a query parser class at the moment, which parses the query entered by users into tokens using a regular expression, and then constructs the where clause from the tokens.
However, given that this is probably a common requirement by a lot of systems using full-text search, I'm curious as to how other developers have approached this problem, and whether there's a better way of doing things.
How to implement the accepted answer using .Net / C# / Entity Framework...
Install Irony using nuget.
Add the sample class from:
http://irony.codeplex.com/SourceControl/latest#Irony.Samples/FullTextSearchQueryConverter/SearchGrammar.cs
Write code like this to convert the user-entered string to a query.
var grammar = new Irony.Samples.FullTextSearch.SearchGrammar();
var parser = new Irony.Parsing.Parser(grammar);
var parseTree = parser.Parse(userEnteredSearchString);
string query = Irony.Samples.FullTextSearch.SearchGrammar.ConvertQuery(parseTree.Root);
Perhaps write a stored procedure like this:
create procedure [dbo].[SearchLivingFish]
@Query nvarchar(2000)
as
select *
from Fish
inner join containstable(Fish, *, @Query, 100) as ft
on ft.[Key] = FishId
where IsLiving = 1
order by rank desc
Run the query.
var fishes = db.SearchLivingFish(query);
This may not be exactly what you are looking for but it may offer you some further ideas.
http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/
In addition to @franzo's answer above you probably also want to change the default stop word behaviour in SQL. Otherwise queries containing single digit numbers (or other stop words) will not return any results.
Either disable stop words, create your own stop word list and/or set noise words to be transformed as explained in SQL 2008: Turn off Stop Words for Full Text Search Query
To view the system list of (English) sql stop words, run:
select * from sys.fulltext_system_stopwords where language_id = 1033
I realize it's a bit of a side-step from your original question, but have you considered moving away from SQL fulltext indexes and using something like Lucene/Solr instead?
The easiest way to do this is to use dynamic SQL (I know, insert security issues here) and break the phrase into a correctly formatted string.
You can use a function to break the phrase into a table variable that you can use to create the new string.
A combination of GoldParser and Calitha should sort you out here.
This article: http://www.15seconds.com/issue/070719.htm has a googleToSql class as well, which does some of the translation for you.
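As a rough illustration of the regular-expression approach mentioned in the question, the sketch below converts +term / -term input into a single CONTAINS search condition. It is not the Irony grammar or the googleToSql class; the class name and the rules it applies (unprefixed terms are treated as required, quotes inside terms are dropped) are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchQueryConverter
{
    // Matches an optional +/- prefix followed by a quoted phrase or a bare word.
    private static readonly Regex Token =
        new Regex("([+-]?)(?:\"([^\"]*)\"|(\\S+))", RegexOptions.Compiled);

    public static string ToContainsCondition(string userQuery)
    {
        var include = new List<string>();
        var exclude = new List<string>();

        foreach (Match m in Token.Matches(userQuery ?? ""))
        {
            // Group 2 holds a quoted phrase, group 3 a bare word.
            string term = m.Groups[2].Success ? m.Groups[2].Value : m.Groups[3].Value;
            term = term.Replace("\"", "").Trim();
            if (term.Length == 0) continue;

            (m.Groups[1].Value == "-" ? exclude : include).Add("\"" + term + "\"");
        }

        // A CONTAINS condition cannot start with NOT, so at least one
        // positive term is required; otherwise return an empty string.
        if (include.Count == 0) return "";

        string condition = string.Join(" AND ", include);
        foreach (string term in exclude)
            condition += " AND NOT " + term;
        return condition;
    }
}
```

For the input +"e-mail" +attachment -"word document" -"e-learning" this returns "e-mail" AND "attachment" AND NOT "word document" AND NOT "e-learning", which can be passed as a parameter to CONTAINS or CONTAINSTABLE instead of being concatenated into dynamic SQL.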
