Can I get some help with the following?
My C# app calls a SQL stored procedure and passes it a search term (a string containing multiple words).
public List<ZapisModel> GetZapis_FullText(string Parametar)
{
    using (IDbConnection connection = new System.Data.SqlClient.SqlConnection(SQLConn.CnnVal(Program.db)))
    {
        var output = connection.Query<ZapisModel>("dbo.spZapis_GetByTest_FTC @parametar", new { parametar = Parametar }).ToList();
        return output;
    }
}
The SQL stored procedure should take the passed parameter and run a full-text search with each word from the string. I want to let the user search for ALL words from the string or for ANY word from the string, which is why I need to split the string and include each word in the search using AND or OR, based on the user's preference.
I would also like this job done entirely in the stored procedure: the app passes the search term, and the stored procedure processes it and returns the search result.
I suggest you don't parse your search criteria string at SQL procedure level. Instead, prepare the correct FTS search string using C#. In my projects I use a small (just 2 files) freeware library called EasyFTS.
Here is the EasyFTS project page on GitHub: https://github.com/SoftCircuits/FullTextSearchQuery
The syntax is very easy:
var fts = new EasyFts();
var orignalNonFTSEnabledQueryString = "Center Neck Spray Bottle";
var ftsEnabledsearchText = fts.ToFtsQuery(orignalNonFTSEnabledQueryString);
The above code will convert a normal, non-FTS-enabled search string, e.g. Center Neck Spray Bottle, to an FTS-enabled version like:
"(((FORMSOF(THESAURUS, Center) AND FORMSOF(THESAURUS, Neck)) AND FORMSOF(THESAURUS, Spray)) AND FORMSOF(THESAURUS, Bottle))"
which you can use in the CONTAINS/CONTAINSTABLE or FREETEXT/FREETEXTTABLE clauses.
There are many configuration options for the EasyFTS class. You can have it generate FTS query strings with words separated by AND or OR. Just examine their docs.
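If you only need the plain ALL words / ANY word behaviour from the question, the idea can also be sketched without a library. The class and method names below (FtsConditionBuilder, Build) are made up for illustration and are not part of EasyFTS; the sketch assumes whitespace-separated words and simply drops embedded double quotes:

```csharp
using System;
using System.Linq;

public static class FtsConditionBuilder
{
    // Builds a CONTAINS search condition such as "fish" AND "chips".
    // matchAll = true joins the words with AND, false joins them with OR.
    public static string Build(string searchTerm, bool matchAll)
    {
        if (string.IsNullOrWhiteSpace(searchTerm))
            return string.Empty;

        var words = searchTerm
            // A null separator array means "split on any whitespace".
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            // Drop embedded double quotes; each word is wrapped in quotes below.
            .Select(w => w.Replace("\"", string.Empty))
            .Where(w => w.Length > 0)
            .Select(w => "\"" + w + "\"");

        return string.Join(matchAll ? " AND " : " OR ", words);
    }
}
```

The resulting string would be passed to the stored procedure as an ordinary parameter and used there as the second argument of CONTAINS or CONTAINSTABLE.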
This article contains very thorough description of EasyFTS: http://www.blackbeltcoder.com/Articles/data/easy-full-text-search-queries .
Disclaimer: I'm not the author of the EasyFTS library.
HTH
Related
I have a search where I use LINQ with EF. Whenever the search criteria are null or empty, I need to return everything. I started out with if conditions, and from that I moved to a solution like this:
data = data
.Where(p => !string.IsNullOrEmpty(searchriteria1)? p.field1.Contains(searchriteria1) : true)
.Where(p => !string.IsNullOrEmpty(searchriteria2)? p.field2.Contains(searchriteria2) : true);
Is there a better way to do this? Maybe an extension method, or some other approach?
You could check the search criteria field previously and build up the query this way:
IQueryable<Foo> data = context.Foo.AsQueryable();
if(!string.IsNullOrEmpty(searchriteria1))
{
data = data.Where(p => p.field1.Contains(searchriteria1));
}
if (!string.IsNullOrEmpty(searchriteria2))
{
data = data.Where(p => p.field2.Contains(searchriteria2));
}
There are two parts to the question. How to filter dynamically and how to filter efficiently.
Dynamic criteria
For the first question, there's no need for a catch-all query when using LINQ. Catch-all queries result in inefficient execution plans, so it's best to avoid them.
LINQ isn't SQL though. You can construct your query part by part. The final query will be translated to SQL only when you try to enumerate it. This means you can write :
if (!String.IsNullOrEmpty(searchCriteria1))
{
    query = query.Where(p => p.Field1.Contains(searchCriteria1));
}
You can chain multiple Where calls to get the equivalent of multiple AND criteria.
To generate more complex queries using eg OR you'd have to construct the proper Expression<Func<...,bool>> objects, or use a library like LINQKit to make this bearable.
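As a rough idea of what constructing the expression by hand involves, here is a minimal sketch that ORs two predicates together. PredicateCombiner and its Or method are hypothetical names written for this example, not LINQKit's API; the visitor exists only so that both lambda bodies end up sharing one parameter:

```csharp
using System;
using System.Linq.Expressions;

public static class PredicateCombiner
{
    // Rewrites references to one lambda parameter so they point at another.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }

    // Returns an expression equivalent to: x => left(x) || right(x)
    public static Expression<Func<T, bool>> Or<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        // Make the right-hand body use the left-hand lambda's parameter.
        var rightBody = new ParameterReplacer(right.Parameters[0], left.Parameters[0])
            .Visit(right.Body);

        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, rightBody),
            left.Parameters);
    }
}
```

The combined expression can then be passed to a single Where call, so the OR still has a chance of being translated to SQL rather than evaluated in memory.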
Efficiency
Whether you can write an efficient query depends on the search criteria. The clause field LIKE '%potato%' can't use any indexes and will end up scanning the entire table.
On the other hand, field LIKE 'potato%' can take advantage of an index that covers field, because it will be converted to a range search like field >= 'potato' AND field < 'potatp'.
If you want to implement autocomplete or spell checking though, you often want to find text that has the fewest differences from the criteria.
Full Text Search
You can efficiently search for words, word variations and even full phrases using Full-Text Search indexes and FTS functions like CONTAINS or FREETEXT.
FTS is similar to how Google or ... StackOverflow searches for words or sentences.
Quoting from the docs:
CONTAINS can search for:
A word or phrase.
The prefix of a word or phrase.
A word near another word.
A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").
FREETEXT on the other hand is closer to how Google/SO work by searching for an entire phrase, returning close matches, not just exact matches.
Both CONTAINS and FREETEXT are available in EF Core 5 and later, through the EF.Functions.Contains and EF.Functions.FreeText methods (extension methods on DbFunctions).
This means that if you want to search for a word or phrase, you could construct a proper FTS argument and use:
// The value is sent as a parameter, so it needs no surrounding SQL quotes.
var searchCriteria1 = "Mountain OR Road";
if (!String.IsNullOrEmpty(searchCriteria1))
{
    query = query.Where(p => EF.Functions.Contains(p.Field1, searchCriteria1));
}
That's a lot easier than using LINQKit.
Or search for ride, riding, ridden with:
var searchCriteria1 = "FORMSOF(INFLECTIONAL, ride)";
A shorter syntax:
data.Where(p => (string.IsNullOrEmpty(searchriteria1) || p.field1.Contains(searchriteria1))
&& (string.IsNullOrEmpty(searchriteria2) || p.field2.Contains(searchriteria2)));
public static List<Test> getAll(Expression<Func<Test, bool>> filter = null)
{
return filter == null ? context.Set<Test>().ToList() : context.Set<Test>().Where(filter).ToList();
}
To filter:
var l = getAll(p => p.field1.Contains(searchriteria1) && p.field2.Contains(searchriteria2));
Without a filter:
var l = getAll();
I want to create a unique short string (<= 258 chars) that is suitable as a Windows filename.
This is to uniquely label an XML query result.
Here is a sample query:
SELECT * FROM ( SELECT [utcDT],
MAX(CASE WHEN[Symbol] = 'fish' THEN[Close] END) AS [fish],
MAX(CASE WHEN[Symbol] = 'chips' THEN[Close] END) AS [chips]
FROM [DATA].[1M].[ASTS_NOGAP]
WHERE [Date] >= '2011-12-27'
AND [Date] <= '2012-07-01'
AND [Symbol] IN ('fish','chips')
GROUP BY [utcDT] ) AS A
WHERE [utcDT] IS NOT NULL AND [fish] IS NOT NULL AND [chips] IS NOT NULL
ORDER BY [utcDT]
But it could be a longer query.
The compression is one-way only, i.e. I do NOT need to decompress.
I want to end up with a unique file name like:
ksdgfsbhdfjksgdjbajysjdgyasagfdjahgdkjasgjgfjkgjkgdjkfgjskdjfgsajgdjfgjsgy.xml
EDIT1:
The generated filename must be unique to the query - such that another
app would generate the same filename for the same query.
How can I achieve this?
There is a small risk for collisions, but this should do what you need:
public string GetUniqueFileNameForQuery(string sql)
{
    using (var hasher = SHA256.Create())
    {
        var queryBytes = Encoding.UTF8.GetBytes(sql);
        var queryHash = hasher.ComputeHash(queryBytes);
        // "/" may be included, but is not legal for file names
        return Convert.ToBase64String(queryHash).Replace("/", "-") + ".xml";
    }
}
This needs using System.Security.Cryptography; at the top of the file.
I also need to add a note about working with SQL from client code languages like C#.
Most queries are going to need input of some kind: an ID field for a lookup, a date range, a username, something to tell the query which records you need out of a larger set. It's very poor practice to substitute these inputs directly into the SQL string in your C# (or other language) code. That opens you up to an issue known as SQL Injection, and it's kind of a big deal.
Instead, for almost all queries, there will be a placeholder variable name for each input argument. It matters for this question because you'll have the same SQL query text for two queries that differ only by arguments.
For example, say you have this query:
SELECT * FROM Users WHERE Username = @Username
You run this query twice, once with 'jsmith' as the input, and once with 'jdoe'. The SQL didn't change, and therefore the encoded file name didn't change.
You may be inclined to ask for the value of the SQL after the parameter inputs are substituted into the query, but this misunderstands what happens. The parameter inputs are never, at any time, substituted into the SQL query. That's the whole point. Even the database server will instead treat them as procedure variables.
The point here is you also need a way to encode any parameter data used with your query. Here's one basic naive option:
public string GetUniqueFileNameForQuery(DbCommand query)
{
    var sql = query.CommandText;
    // The collection is non-generic, so the loop variable needs an explicit type.
    // DbParameter lives in System.Data.Common.
    foreach (DbParameter p in query.Parameters)
    {
        // Convert.ToString also copes with a null Value.
        sql = sql.Replace(p.ParameterName, Convert.ToString(p.Value));
    }
    using (var hasher = SHA256.Create())
    {
        var queryBytes = Encoding.UTF8.GetBytes(sql);
        var queryHash = hasher.ComputeHash(queryBytes);
        // "/" may be included, but is not legal for file names
        return Convert.ToBase64String(queryHash).Replace("/", "-") + ".xml";
    }
}
Note: this code could produce invalid SQL. For example, you might end up with something like this:
SELECT * FROM Users WHERE LastName = O'Brien
But since you're not actually trying to run the query, that should be okay. You also need to be careful with systems like OleDB, which uses positional matching and ? for all parameter placeholders. In this case, the parameter name won't match the placeholder, or even if it did, the first parameter would match the placeholder for all the others.
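One way around the placeholder-matching problem is to leave the SQL text alone and append the parameters to it before hashing. The sketch below is only one possible variation: QueryFileName and For are hypothetical names, the parameters are taken as plain name/value pairs rather than from a DbCommand, and the hash is written as hex instead of Base64 because Windows file names are case-insensitive while Base64 is not.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class QueryFileName
{
    // Hashes the SQL text plus every parameter (in order) into a file name.
    public static string For(string sql, IEnumerable<KeyValuePair<string, object>> parameters)
    {
        var text = new StringBuilder(sql);
        foreach (var p in parameters)
        {
            // Appending in order also works for positional ("?") placeholders.
            // Invariant culture keeps numbers and dates identical across machines.
            text.Append('\n')
                .Append(p.Key)
                .Append('=')
                .Append(Convert.ToString(p.Value, CultureInfo.InvariantCulture));
        }

        using (var hasher = SHA256.Create())
        {
            var hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(text.ToString()));

            // Lowercase hex: 64 characters, all legal in file names.
            var hex = new StringBuilder(hash.Length * 2);
            foreach (var b in hash)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex + ".xml";
        }
    }
}
```

Two apps only produce the same file name if they pass the same SQL text and the same parameters in the same order, so that ordering has to be part of whatever convention they share.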
I am new to Lucene.Net. I would like to know how to build a Lucene search query that works much like an SQL query. Let me give more detail.
I have a set of parameter values, much like a stored procedure has a set of parameters. Now I want to build a query from all of these parameters.
searchParams.UseLast = Convert.ToBoolean(base.Arguments["UseLast"]);
searchParams.LastEditedFrom= Convert.ToDateTime(base.Arguments["LastEditedFrom"]);
searchParams.LastEditedTo = Convert.ToDateTime(base.Arguments["LastEditedTo"]);
searchParams.Reviewed = Convert.ToBoolean(base.Arguments["Reviewed"]);
searchParams.Approved = Convert.ToBoolean(base.Arguments["Approved"]);
searchParams.Include = Convert.ToBoolean(base.Arguments["Include"]);
searchParams.IsVisibleToUser = Convert.ToBoolean(base.Arguments["IsVisibleToUser"]);
searchParams.IsEntry = Convert.ToBoolean(base.Arguments["IsEntry"]);
searchParams.UserId = Convert.ToInt32(base.Arguments["UserId"]);
IEnumerable Categories = base.Arguments["Categories"] as IEnumerable;
IEnumerable Departments = base.Arguments["Departments"] as IEnumerable;
String mQuery = "How to construct it ....!!!"; // Need help with this
var query = queryParser.Parse(mQuery);
indexSearcher.Search(query, collector);
Here I want to fetch all records from the Lucene index that have the given values for all of the above fields.
I'm unclear what you are using searchParams for; however, in general you can construct your query string (mQuery) in this case with any of the features of the Lucene query syntax. See the Lucene.Net version 4.8 Query Parser Syntax documentation.
In general, when multiple words are listed in the query they are combined with a logical OR, but docs that contain all of the terms are ranked higher than docs with only one term. So for example white dog would match docs containing white dog, or white, or dog. You can put AND between the terms (the operator must be uppercase) if you only want docs that match all of them, so for example you could say small AND white AND dog if you only want docs that contain all three terms.
To search a specific field, write the field name followed by a colon. So for example you can search UserId:ron AND Categories:dogs. There is much more to the Lucene query syntax but hopefully that will get you started. For more details see the Lucene query syntax doc I referred to.
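Putting the field and AND rules together, the mQuery string could be assembled along these lines. This is only a sketch: LuceneQueryText and AllOf are hypothetical names, and it assumes every field was indexed as plain text terms, so numeric or date fields indexed in another form would need a different kind of query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LuceneQueryText
{
    // Builds e.g.  UserId:"42" AND Approved:"true"  from field/value pairs.
    public static string AllOf(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var clauses = fields
            // Skip fields the caller left empty.
            .Where(f => !string.IsNullOrWhiteSpace(f.Value))
            // Quote each value as a phrase, escaping backslashes and quotes.
            .Select(f => f.Key + ":\"" +
                         f.Value.Replace("\\", "\\\\").Replace("\"", "\\\"") +
                         "\"");

        // AND must be uppercase to be read as an operator.
        return string.Join(" AND ", clauses);
    }
}
```

The returned string is what would be handed to queryParser.Parse in the question's code.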
We recently discovered a bug in our system whereby any serial numbers that have been entered in lowercase have not been processed correctly.
To correct this, we need to add a one-off function that will run through the database and re-process all items with lowercase serial numbers.
In LINQ, is there a query I can run that will return a list of such items?
Note: I am not asking how to convert lowercase to uppercase or reverse, which is all google will return. I need to generate a list of all database entries where the serial number has been entered in lowercase.
EDIT: I am using Linq to MS SQL, which appears to be case insensitive.
Yes, there is. You can try something like this:
var result = serialnumber.Any(c => char.IsLower(c));
[EDIT]
Well, in case of Linq to Entities...
As stated here: Regex in Linq (EntityFramework), String processing in database, there are a few ways to work around it.
Change database table structure. E.g. create table Foo_Filter which will link your entities to filters. And then create table Filters
which will contain filters data.
Execute query in memory and use Linq to Objects. This option will be slow, because you have to fetch all data from database to memory
Note: link to MSDN documentation has been added by me.
For example:
var result = context.Serials.ToList().Where(sn => sn.Any(c => char.IsLower(c)));
Another way is to use the SqlMethods.Like method.
Finally, I'd strongly recommend reading this: Case sensitive search using Entity Framework and Custom Annotation
What's the best way to convert search terms entered by a user, into a query that can be used in a where clause for full-text searching to query a table and get back relevant results? For example, the following query entered by the user:
+"e-mail" +attachment -"word document" -"e-learning"
Should translate into something like:
SELECT * FROM MyTable WHERE (CONTAINS(*, '"e-mail"')) AND (CONTAINS(*, '"attachment"')) AND (NOT CONTAINS(*, '"word document"')) AND (NOT CONTAINS(*, '"e-learning"'))
I'm using a query parser class at the moment, which parses the query entered by users into tokens using a regular expression, and then constructs the where clause from the tokens.
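For reference, a stripped-down version of that regex approach might look like the following. It is a sketch, not the actual parser class: SearchTermParser and ToWhereClause are names invented here, only the + and - prefixes and double-quoted phrases are recognised, and each term is emitted as a parameter instead of being concatenated into the SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchTermParser
{
    // Optional +/- sign, then either a "quoted phrase" or a bare word.
    private static readonly Regex Token = new Regex(
        "(?<sign>[+-]?)(?:\"(?<term>[^\"]+)\"|(?<term>[^\\s\"]+))");

    // Returns the WHERE clause text and fills 'parameters' with @pN -> value.
    public static string ToWhereClause(string input, IDictionary<string, string> parameters)
    {
        var clauses = new List<string>();
        foreach (Match m in Token.Matches(input ?? string.Empty))
        {
            var name = "@p" + parameters.Count;
            // Wrap every term in double quotes so phrases stay intact.
            parameters[name] = "\"" + m.Groups["term"].Value + "\"";

            var negated = m.Groups["sign"].Value == "-";
            clauses.Add(negated
                ? "(NOT CONTAINS(*, " + name + "))"
                : "(CONTAINS(*, " + name + "))");
        }
        return string.Join(" AND ", clauses);
    }
}
```

For the example input above this yields four clauses joined with AND, with the last two negated, and the parameter values "e-mail", "attachment", "word document" and "e-learning" (each wrapped in double quotes).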
However, given that this is probably a common requirement by a lot of systems using full-text search, I'm curious as to how other developers have approached this problem, and whether there's a better way of doing things.
How to implement the accepted answer using .Net / C# / Entity Framework...
Install Irony using nuget.
Add the sample class from:
http://irony.codeplex.com/SourceControl/latest#Irony.Samples/FullTextSearchQueryConverter/SearchGrammar.cs
Write code like this to convert the user-entered string to a query.
var grammar = new Irony.Samples.FullTextSearch.SearchGrammar();
var parser = new Irony.Parsing.Parser(grammar);
var parseTree = parser.Parse(userEnteredSearchString);
string query = Irony.Samples.FullTextSearch.SearchGrammar.ConvertQuery(parseTree.Root);
Perhaps write a stored procedure like this:
create procedure [dbo].[SearchLivingFish]
    @Query nvarchar(2000)
as
    select *
    from Fish
    inner join containstable(Fish, *, @Query, 100) as ft
        on ft.[Key] = FishId
    where IsLiving = 1
    order by rank desc
Run the query.
var fishes = db.SearchLivingFish(query);
This may not be exactly what you are looking for but it may offer you some further ideas.
http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/
In addition to @franzo's answer above you probably also want to change the default stop word behaviour in SQL. Otherwise queries containing single digit numbers (or other stop words) will not return any results.
Either disable stop words, create your own stop word list and/or set noise words to be transformed as explained in SQL 2008: Turn off Stop Words for Full Text Search Query
To view the system list of (English) sql stop words, run:
select * from sys.fulltext_system_stopwords where language_id = 1033
I realize it's a bit of a side-step from your original question, but have you considered moving away from SQL fulltext indexes and using something like Lucene/Solr instead?
The easiest way to do this is to use dynamic SQL (I know, insert security issues here) and break the phrase into a correctly formatted string.
You can use a function to break the phrase into a table variable that you can use to create the new string.
A combination of GoldParser and Calitha should sort you out here.
This article: http://www.15seconds.com/issue/070719.htm has a googleToSql class as well, which does some of the translation for you.