Optimize Entity Framework query with multiple contain statements - c#

I'm trying to optimize a query which is taking around 6 seconds to execute.
string[] filters = ...;
var data =
(from n in ctx.People
.Where(np => np.IsActive)
let isFilterMatch = filters.All(f => n.FirstName.ToLower().Contains(f) ||
n.Prefix.ToLower().Contains(f) ||
n.MiddleName.ToLower().Contains(f) ||
n.LastName.ToLower().Contains(f) ||
n.Information.Email.ToLower().Contains(f) ||
(n.Address!= null &&
(SqlFunctions.StringConvert((double)n.Address.Number).
Contains(f) ||
n.Address.Street.ToLower().Contains(f) ||
n.Address.ZipCode.ToLower().Contains(f) ||
n.Address.City.ToLower().Contains(f))))
where isFilterMatch
orderby n.LastName
select n
).Take(numberOfItems).ToList();
This is a query for a search dialog. The user can type in any text and it will then search for a person that matches the input. We split the user input into a string array and then do a Contains on the Person fields. The query cannot be precompiled because of the filter array.
How can I optimize this function? I heard about things like Full-Text Search in SQL Server or stored procedures. Could that help?
We are using SQL Server 2008, Entity Framework 4.0 (Model First) and C#.

I would not use a plain SQL / LINQ query for this kind of search. Ordinary queries are slow at text searching and only return exact matches; they do not compensate for spelling or grammar errors.
You might consider the Full-Text Search functionality of SQL Server, but the resulting performance might still be poor. See http://www.sql-server-performance.com/2010/full-text-search-2008/.
I would suggest using a search indexer such as Apache Lucene (available for .NET as Lucene.NET). Another option is to write your own Windows service that indexes all the records.
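Separately from the indexing options above, the filters.All(...) predicate from the question can also be written as one chained Where per search term, so the provider sees a plain conjunction of per-term conditions. Whether this is actually faster depends on the generated SQL and the data, so treat the following only as a sketch. The Person class is a trimmed-down, hypothetical stand-in for the entity in the question, ApplyFilters is a hypothetical helper, and any IQueryable<Person> (here it could be an in-memory list via AsQueryable) stands in for ctx.People.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the Person entity in the question.
public class Person
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public bool IsActive { get; set; }
}

public static class PersonSearch
{
    // Adds one Where clause per search term, so every term must match
    // at least one of the searched fields (same meaning as filters.All).
    public static IQueryable<Person> ApplyFilters(
        IQueryable<Person> people, IEnumerable<string> filters)
    {
        var query = people.Where(p => p.IsActive);
        foreach (var filter in filters)
        {
            // Copy to a local so each lambda captures its own term
            // (required on compilers older than C# 5).
            var term = filter.ToLower();
            query = query.Where(p => p.FirstName.ToLower().Contains(term) ||
                                     p.LastName.ToLower().Contains(term));
        }
        return query;
    }
}
```

Against a real context this would be used as PersonSearch.ApplyFilters(ctx.People, filters).OrderBy(n => n.LastName).Take(numberOfItems).ToList(), with the remaining fields from the question added to the predicate.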

Related

'Join' Method not supported in cosmos DB 3.3.0,Is there any other option

I am running queries for report generation. Below is one scenario that works:
var account_logs = container.GetItemLinqQueryable<AccountLog>(true).Where(u => u.AccessedOn > MinDate);
var TempGoodResetIDs = (from ll in account_logs
where (ll.AccessedOn >= StartDate) &&
(ll.AccessedOn <= EndDate)
&& ((ll.Activity == 3) &&
((ll.Result == (int)Log.AccountResult.PasswordReset) ||
(ll.Result == (int)Log.AccountResult.TempPWDSentThroughEmail)))
select ll);
This worked, and account_logs was filled with data.
Then I have something like this in the code:
var BadResetIDs = TempBadResetIDs.Select(ll => ll.ActivityID).Distinct().Except(GoodResetIDs);
var Captcha = (from ll in account_logs
join
b in BadResetIDs on ll.ActivityID equals b
where ((ll.Activity == 3) && (ll.Result == 5))
select ll.ActivityID).Count();
Here I got an exception that 'Join' is not supported in Cosmos. Is there a workaround to join Cosmos documents with BadResetIDs, which is an IQueryable containing activity IDs?
I've used SelectMany, but I'm not sure how to compare two different objects, AccountLog and BadResetIDs.
While Cosmos SQL has a JOIN operator it only works by joining data within a single document. Cosmos does not support joining several documents in a query so the LINQ join operator is not supported.
In your case you might be able to solve your problem by performing two queries. However, you will be moving data from the database to the client to create the second query and you run the risk of the database having changed in the meantime.
Wanting to join documents in a query could be a sign that you are retrofitting a relational database approach on top of Cosmos. Designing your system based on "NoSQL" thinking from the start can lead to a very different data model.
Strictly speaking, there is one exception to a query not being able to combine multiple documents: you can create a stored procedure in JavaScript that does that. However, a stored procedure can only execute within a single partition, so it's not a general solution for combining multiple documents in a single query.
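The two-query approach mentioned above could look roughly like the sketch below: the first query brings the IDs to the client, and the second filters by membership in that list instead of joining. This assumes the LINQ provider in use can translate Contains on an in-memory list, which should be checked against the SDK version. AccountLog here is a trimmed-down, hypothetical stand-in for the document type in the question, and CountCaptcha is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the AccountLog document in the question.
public class AccountLog
{
    public int ActivityID { get; set; }
    public int Activity { get; set; }
    public int Result { get; set; }
}

public static class ResetReport
{
    public static int CountCaptcha(
        IQueryable<AccountLog> accountLogs, IQueryable<int> badResetIds)
    {
        // Query 1: materialize the distinct IDs on the client.
        List<int> ids = badResetIds.ToList();

        // Query 2: filter by membership in that list instead of joining.
        return accountLogs
            .Where(ll => ll.Activity == 3 && ll.Result == 5 && ids.Contains(ll.ActivityID))
            .Select(ll => ll.ActivityID)
            .Count();
    }
}
```

As the answer notes, the data can change between the two queries, and a very large ID list may be impractical to send back to the database.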

MongoDB C# Using custom extensions in Linq or Lambda Query, for checking string similarity

I'm trying to get from a Mongo collection only the entries whose "x.SystemName" is at least 80% similar to an input string. It works when I load all entries from the collection into a variable, but I want to get the results directly from the database.
This code works:
IEnumerable<Company> companies = await FindAllCompanies();
IEnumerable<Company> _dbresult = (from company in companies
where company.SystemName.StringRateSimilarity(inputSystemName) >= 0.8 || company.SystemName.Contains(inputSystemName) || company.Name.ToLowerInvariant().Contains(name.ToLowerInvariant())
select company);
But I want to do it like this, without first loading all companies into a variable:
IEnumerable<Company> _dbresult = (from company in _db.DatabaseHost.GetCollection<Company>(collectionName).AsQueryable()
where company.SystemName.StringRateSimilarity(inputSystemName) >= 0.8 || company.SystemName.Contains(inputSystemName) || company.Name.ToLowerInvariant().Contains(name.ToLowerInvariant())
select company);
or like this:
var collection = _db.DatabaseHost.GetCollection<Company>(collectionName);
IEnumerable<Company> _dbresult = collection.AsQueryable().Where(x => (Services.API.StringRateSimilarity(x.SystemName, inputSystemName) >= 0.8) || (x.SystemName.Contains(inputSystemName)) || (x.Name.ToLowerInvariant().Contains(name.ToLowerInvariant()))).Select(e => e);
But this throws an error: InvalidOperationException: StringRateSimilarity({document}{systemName}, "TEST") is not supported.
In order for the MongoDB C# driver to be able to translate a C# Expression into a MongoDB query it needs to "know" how to do that. With your handwritten method, there is no way the driver could know what your method does and hence how MongoDB could do the same thing on the Database side (which may or may not be possible).
I would suggest you tackle that problem from the other side:
1. Try to express your "80% similarity" logic using a pure MongoDB query.
2. Translate that MongoDB query into C# using only the built-in string functions that are supported by the C# driver, or alternatively resort to sending that raw MongoDB query as a string.
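To make the translation problem concrete, the sketch below inspects the expression tree that a query provider receives for such a predicate. It uses no MongoDB types: Company is a trimmed-down stand-in for the class in the question, and StringRateSimilarity is a hypothetical placeholder body, since the original implementation is not shown. All a provider sees is a method call node with that name; the method body is not part of the tree.

```csharp
using System;
using System.Linq.Expressions;

// Trimmed-down stand-in for the Company class in the question.
public class Company
{
    public string SystemName { get; set; } = "";
}

public static class ExpressionDemo
{
    // Hypothetical placeholder; the real similarity logic is not shown in the question.
    public static double StringRateSimilarity(string a, string b) => a == b ? 1.0 : 0.0;

    // Returns the name of the method found on the left side of the comparison.
    public static string DescribeLeftSide()
    {
        // Assigning a lambda to Expression<...> builds a tree instead of compiling code.
        Expression<Func<Company, bool>> predicate =
            c => StringRateSimilarity(c.SystemName, "TEST") >= 0.8;

        var comparison = (BinaryExpression)predicate.Body;   // ... >= 0.8
        var call = (MethodCallExpression)comparison.Left;    // StringRateSimilarity(...)
        return call.Method.Name;
    }
}
```

A driver can only translate method call nodes it has a mapping for, which is why built-in string methods may work while a custom method does not.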

How to reduce Entity Framework Linq query execution time? [duplicate]

I am using Entity Framework in my ASP.NET MVC application, and I am facing an issue loading data from SQL Server via LINQ. My query returns the result in 4 seconds, but I need it to take less time, and I am still searching for a better solution.
Here is my linq query:
var user =
context.CreateObjectSet<DAL.ProductMaster>()
.AsNoTracking()
.Include("Product1").AsNoTracking()
.Include("Product2").AsNoTracking()
.Include("Product3").AsNoTracking()
.Include("Product4").AsNoTracking()
.Include("Product5").AsNoTracking()
.Where(x => x.Id == request.Id)
.FirstOrDefault();
No need to get multiple entities fulfilling the predicate in Where() only to then select the first one; simply call FirstOrDefault() directly.
if (request == null || organizationsList == null || !organizationsList.Any())
return;
var user = context.CreateObjectSet<DAL.User>()
.AsNoTracking()
.Include("UsersinPrograms.FollowUps")
.Include() ...
.FirstOrDefault(x => x.Id == request.userId && organizationsList.Contains(x.OrganizationId));
Also, remove the Include()'s that you don't need.
For this and any large-scale business logic operations, one should create stored procedures instead of basing them in LINQ. Then in EF one creates a DTO object to handle the result and consumes the procedure in EF.
At that point the speed gained or lost depends on how the procedure's SQL has been structured, with an eye on the indexes used within the database to possibly improve speed.
LINQ-generated SQL is really boilerplate and not designed for speed.
I have done this with multiple EF projects.
Also, one can build the LINQ query in LINQPad and then switch to the view of the generated SQL. That gives an idea of how to structure the SQL in a stored procedure.
Also, you could look into using SQL CTEs (Common Table Expressions) to build your query one step at a time until the data is just as needed.

How can I speed up these LINQ queries?

Can I combine the following two LINQ queries into a single one, to speed things up?
The first one, searches and performs the paging
Products.Data = db.Products.Where(x => x.ProductCode.Contains(search) ||
x.Name.Contains(search) ||
x.Description.Contains(search) ||
x.DescriptionExtra.Contains(search) ||
SqlFunctions.StringConvert(x.Price).Contains(search) ||
SqlFunctions.StringConvert(x.PriceOffer).Contains(search) ||
SqlFunctions.StringConvert(x.FinalPrice).Contains(search) ||
SqlFunctions.StringConvert(x.FinalPriceOffer).Contains(search))
.OrderBy(p => p.ProductID)
.Skip(PageSize * (page - 1))
.Take(PageSize).ToList();
while the second one counts the total filtered results.
int count = db.Products.Where(x => x.ProductCode.Contains(search) ||
x.Name.Contains(search) ||
x.Description.Contains(search) ||
x.DescriptionExtra.Contains(search) ||
SqlFunctions.StringConvert(x.Price).Contains(search) ||
SqlFunctions.StringConvert(x.PriceOffer).Contains(search) ||
SqlFunctions.StringConvert(x.FinalPrice).Contains(search) ||
SqlFunctions.StringConvert(x.FinalPriceOffer).Contains(search))
.Count();
Get rid of your ridiculously inefficient conversions.
SqlFunctions.StringConvert(x.Price).Contains(search) ||
No index use possible, full table scan, plus a conversion - that is as bad as it gets.
And make sure you have all indices.
Nothing else you can do.
I do not think you can combine them directly. This is one problem with paging: you need to know the total count of results anyway. A further problem with dynamic paging is that one page can be inconsistent with another, because each is fetched at a different time. You can easily miss an item completely because of this. If that is a problem, I would avoid dynamic paging. You can store the IDs of the whole result in a temporary table on the server and page from there, or you can return all IDs from a full-text search and query the rest of the data on demand.
There are some further optimizations: you can start returning results only once the search string is at least 3 characters long, or you can build a special table with count estimates for this purpose. You can also decide to return only the first ten pages, which saves server storage for the IDs (or client bandwidth for the IDs).
I am sad to see "stop using Contains" answers without an alternative. Searching in the middle of words is often a must. The fact is, SQL Server is terribly slow at text processing, and searching is no exception. AFAIK even full-text indexes would not help you much with in-the-middle substring searching.
For the presented query on 10k records I would expect about 40 ms per query to get counts or all results (on my desktop). You can add a persisted computed column to this table with all texts concatenated and all numbers converted, and query only that column. That would speed things up significantly (under 10 ms per query on my desktop).
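Even though the count and the page remain two separate round trips, the duplicated filter from the question can be built once and reused for both. The sketch below shows that shape under simplifying assumptions: Product is a trimmed-down stand-in with a single searchable field, and PagedResult and Search are hypothetical names. It is a tidier way to write the same two queries and is not expected to change their cost.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for the Product entity in the question.
public class Product
{
    public int ProductID { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical container for one page plus the total match count.
public class PagedResult
{
    public int TotalCount { get; set; }
    public List<Product> Items { get; set; } = new List<Product>();
}

public static class ProductPaging
{
    public static PagedResult Search(
        IQueryable<Product> products, string search, int page, int pageSize)
    {
        // Build the filter once; nothing runs yet because the query is deferred.
        var filtered = products.Where(x => x.Name.Contains(search));

        return new PagedResult
        {
            // First query: total number of matches.
            TotalCount = filtered.Count(),
            // Second query: only the requested page (pages are 1-based).
            Items = filtered.OrderBy(p => p.ProductID)
                            .Skip(pageSize * (page - 1))
                            .Take(pageSize)
                            .ToList()
        };
    }
}
```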
[computedCol] AS (((((((((([text1]+' ')+[text2])+' ')+[text3])+' ')+CONVERT([nvarchar](max),[d1]))+' ')+CONVERT([nvarchar](max),[d2]))+' ')+CONVERT([nvarchar](max),[d3])) PERSISTED
Stop using the 'contains' function if you can, because it is very slow.
Make sure that your queries can make use of indexes in the DB.
If you MUST have 'contains', then take a look at the full-text search capabilities of SQL Server, but you might need to change this to pure SQL or customise how your LINQ is translated to SQL to make use of full-text indexes.

LINQ can't use string.contains?

This is my code:
string queryString = "Marco".ToLower();
utenti = db.User.Where(p =>
queryString.Contains(p.Nickname.ToLower()) ||
queryString.Contains(p.Nome.ToLower()) ||
queryString.Contains(p.Cognome.ToLower())).ToList();
but I get:
Only arguments that can be evaluated on the client are supported for the String.Contains method.
Why? Can't I use .Contains()?
Try .IndexOf. It is not LINQ that can't do Contains, it's LINQ to Entities and LINQ to SQL that can't.
string queryString = "Marco";
utenti = db.User.Where(p =>
queryString.IndexOf(p.Nickname, StringComparison.OrdinalIgnoreCase) >= 0 ||
queryString.IndexOf(p.Nome, StringComparison.OrdinalIgnoreCase) >= 0 ||
queryString.IndexOf(p.Cognome, StringComparison.OrdinalIgnoreCase) >= 0)
.ToList();
Why?
LINQ uses deferred execution. This means it waits until you want to iterate over your query results before it does anything. There are 3 main types of LINQ:
LINQ to Objects - when your IEnumerable is already on the heap.
LINQ to Entities - when you want to query a database using Entity Framework.
LINQ to SQL - when you want to query a database using LINQ to SQL.
Deferred execution in the context of the latter two means that your query is not executed on the database until you enumerate the results in a foreach block, or invoke an enumeration method like .ToList, .ToArray, etc. Until then, your query is just stored as expression trees in memory.
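Deferred execution is easy to observe without a database. The sketch below uses LINQ to Objects, so there are no expression trees or SQL involved, but the timing behaviour is the same: defining the query does nothing, and the work happens when the results are enumerated. DeferredDemo is a hypothetical name used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static List<int> Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not run it.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        // This element is added after the query was defined...
        numbers.Add(4);

        // ...but it is still picked up, because enumeration happens here.
        return evens.ToList();
    }
}
```

Run() returns a list containing 2 and 4, even though 4 was added after the Where call.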
Your query would work just peachy if db.User was a collection in memory. However when the data is in a database, LINQ to Entities (or LINQ to SQL) must translate your expression trees to what it calls a "store expression" -- which is just fancy talk for "convert my LINQ expressions to SQL".
Now imagine you had a custom C# algorithm you wanted to use for your query, and you did something like this:
var result = db.User.Where(x => MyCustomMethod(x));
There is no way today that LINQ to Entities can convert arbitrary C# code like this into a SQL query (store expression). The same is true of many other .NET methods you rely on daily. Only a limited number of methods can be converted to store expressions, and the exact list depends on the provider and its version, so check its documentation; the suggestion above relies on .IndexOf being one of them.
However keep in mind that it is only the string object's Contains method that we are talking about here that is not supported for store expressions. LINQ to Entities does support .Contains on IEnumerables. The following is valid and will work with LINQ to Entities (not sure about LINQ to SQL):
var idsIWantToFind = new[] { 1, 2, 3 };
var users = db.User.Where(x => idsIWantToFind.Contains(x.UserId));
The above is the equivalent of doing a SQL WHERE UserId IN (1, 2, 3) predicate.
