Telerik Linq Extremly Slow When Using Query with OrderBy - c#

I am having trouble managing large dataset with linq and telerik model, I cant figure out the problem or how linq is working to execute the query.
I am queryng the database using linq to a 300000+ records and it seems that linq is executing the query before applying the take and skip paramenters.
I am executing this query using linq:
var result = repository.Documents.Where(p => p.TenantId = 1 && p.TipoDocumento == SriDocType && p.RucReceptor == ruc).OrderByDescending(p => p.FechaEmision).Take(20).Skip(0).ToList();
Then I run the same query by using sql:
var result = ((EddocumentRepository)repository).Model.ExecuteQuery<Riverminds.ShardLayer.Eddocument>("SELECT TOP 20 * FROM eddocuments WHERE TenantId = 1 and TipoDocumento = 1 and RucReceptor = '0990017514001' Order By FechaEmision Desc", (new List<System.Data.Common.DbParameter>()).ToArray()).ToList();
On the first Query I get a timeout exception, it takes more than a minute the query, if I change the maxexecutiontime, it will work but will take like 2 minutes.
Now If I run the second query that basically is the same thing but with sql text, it takes just a second or 2 seconds. It is really crazy but it is happening, and I need to use linq because I am working with Kendo Asp Net Mvc and using the ToDataSourceResult that is the same thing than linq. It takes a lot of time executing the query.
Any Idea?.
UPDATE
Doing some Linq Queries with the help of the posted comments, I can figure out that the problem is the paramenters, when I use "ruc" paramenters the LINQ query throw a timeout exception because it takes a lot of time to execute it, The same query with SQL takes 1 second.
Deleting the ruc condition, with Linq takes the same time as SQL, 1 second, I check mappings and it seems ok, the column is a 255 nvarchar nullable, so I think something is wrong with the parameter ruc. I am posting the mapping detail of the column and the linq and sql that takes 1 second.
configuration.HasProperty(x => x.RucReceptor).HasFieldName("_rucReceptor").WithDataAccessKind(DataAccessKind.ReadWrite).ToColumn("RucReceptor").IsNullable().HasColumnType("nvarchar").HasLength(255);
var result = repository.Documents.Where(p => p.TenantId = 1 && p.TipoDocumento == SriDocType).OrderByDescending(p => p.FechaEmision).Take(20).Skip(0).ToList();
var result = ((EddocumentRepository)repository).Model.ExecuteQuery<Riverminds.ShardLayer.Eddocument>("SELECT TOP 20 * FROM eddocuments WHERE TenantId = 1 and TipoDocumento = 1 Order By FechaEmision Desc", (new List<System.Data.Common.DbParameter>()).ToArray()).ToList();
Thanks for your Help
Santiago Munoz

This difference in the query speed usually is due to one of the following:
The generated SQL statement is not the most efficient, in your case I don't believe it is the case as your query is pretty straight forward. However you could inspect the generated SQL statement by executing
string sql = repository.Documents.Where(p => p.TenantId = 1 && p.TipoDocumento == SriDocType && p.RucReceptor == ruc).OrderByDescending(p => p.FechaEmision).Take(20).Skip(0).ToString()
There might be parameter type mismatch causing implicit type conversion at server side that prevents the SQL server to utilize the existing indexes. The usual suspects are "string" type of properties. From your examples I see this: RucReceptor = '0990017514001'. Check the mapping for this column if it is Unicode but in the db is varchar this definitely will affect the performance negatively. Fix the types in the mapping to correspond to those in the DB and it should run fast.
Hope this helps.

Related

My ForEach loop with a LINQ expression cannot be translated?

I need to go through all my chemicals and send out a warning to users if the expiration date is either 2 days away or 15 days away.
I have it set like that so the warning is sent out twice. Once 2 weeks before it expires, and once 2 days before it expires.
So I have this loop:
foreach (var chemical in _context.Chemical.Where(c => (c.DateOfExpiration - DateTime.Now).TotalDays == 15 || (c.DateOfExpiration - DateTime.Now).TotalDays == 2))
{
// send out notices
var base = _context.Base.Find(chemical.BaseId);
var catalyst = _context.Catalyst.Find(chemical.CatalystId);
_warningService.SendSafetyWarning(chemical, base, catalyst.Name);
}
But whenever I run it, I get this error:
System.InvalidOperationException: 'The LINQ expression 'DbSet<Chemical>
.Where(p => (p.DateOfExpiration - DateTime.Now).TotalDays == 15 || (p.DateOfExpiration - DateTime.Now).TotalDays == 2)'
could not be translated.
I'm not sure why it's giving me this error when I run it.
Is there a better way of doing what I'm trying to do?
Thanks!
DateTime.TotalDays returns a double.
You might have more luck changing your code to:
foreach (var chemical in _context.Chemical.Where(c => ((int)(c.DateOfExpiration - DateTime.Now).TotalDays == 15) || ((int)(c.DateOfExpiration - DateTime.Now).TotalDays == 2)))
{
// send out notices
var base = _context.Base.Find(chemical.BaseId);
var catalyst = _context.Catalyst.Find(chemical.CatalystId);
_warningService.SendSafetyWarning(chemical, base, catalyst.Name);
}
There are two solutions to this problem.
I'll first explain why it is throwing an error. EF is converting your code to an SQL Expression. When working with Expression you cannot parse in any C# code and expect it to work. EF needs to be able to translate it to SQL. That is why you are getting the exception saying that it can't translate your code.
When working directly in the Where clause on the DbSet or the DbContext you are actualy talking to the Where method for an IQueryable. Which has a parameter of Expression<Func<bool, T>> where T is your DbSet<T> type.
You either need to simplify your query so it can be translated to SQL. Or you can convert your IQueryable to an IEnumerable.
When you want to run your query on the SQL server, you need to use the first approach. But when you can run your query on the client you can use the second approach.
The first approached as mentioned by vivek nuna is using EntityFunctions. It might not be enough for your use case, because it has limited functionality.
The statement made that DateTime.Now is different on each iteratoion is not true. Because the query is converted once, it isn't running the query on every iteration. It simply doesn't know how to translate your query to SQL.
https://learn.microsoft.com/en-us/dotnet/api/system.data.objects.entityfunctions?view=netframework-4.8
The second approach means adding AsEnumerable after your DbSet. This will query all your data from your SQL server, and evaluate in C#. This is usually not recommended if you have a large data set. Example:
var chemicals = _context.Chemical.AsEnumerable(); // this will get the complete collection
chemicals.Where(i => i.Value == true); // everything will work now
And answer to a "better" approach. It's a bit cleaner code:
If you have navigation properties for your Base and Catalyst you can also include this in your query.
Note that you can still query server side before pulling everything client side using a Where on the _context.Chemical and then including and calling AsEnumerable.
var chemicals = _context.Chemical.Include(i => i.Base).Include(i => i.Catalyst).AsEnumerable();
foreach (var chemical in chemicals.Where(i => true)) // set correct where clause
{
_warningService.SendSafetyWarning(chemical, chemical.Base, chemical.Catalyst);
}
You can declare the variable before and then pass it to your expression. Like DateOfExpiration == nextFifteenDays …..
var nextFifteenDays = DateTime.Now.Adddays(15);
var nextTwoDays = DateTime.Now.AddDays(2);
Or you can use EntityFunctions.Diffdays
Reason for the error is well explained by #neil that DateTime.Now will keep on changing, so can’t be used.

Simulate Entity Framework's .Last() when using SQL Server

SQL server is able to translate EF .First() using its function TOP(1). But when using Entity Framework's .Last() function, it throws an exception. SQL server does not recognize such functions, for obvious reasons.
I used to work it around by sorting descending and taking the first corresponding line :
var v = db.Table.OrderByDescending(t => t.ID).FirstOrDefault(t => t.ClientNumber == ClientNumberDetected);
This does it with a single query, but sorting the whole table (million rows) before querying...
Do I have good reasons to think there will be speed issues if I abuse of this technique ?
I thought of something similar... but it requires two query :
int maxID_of_Client = db.Where(t => t.ClientNumber == ClientNumberDetected).Max(t => t.ID);
var v = db.First(t => t.ID == maxID_of_Client);
It's consisting of retrieving the max ID of the client, then use this ID to retrieve the last line of the client.
It doesn't seems faster to query two times...
There must be a way to optimize this and use a single query without sorting millions of datas.
Unless there is something I don't understand, I'm probably not the first to think about this problem and I want to solve it for good !
Thanks in advance.
The assumption driving this question is that result sets with no ordering clause come back from your DB in any predictable order at all.
In reality, result sets that come back from SQL have no implicit ordering and none should be assumed.
Therefore, the result of
db.Table.FirstOrDefault(t => t.ClientNumber == ClientNumberDetected)
is actually indeterminate.
Whether you're taking first or last, without ordering it's all meaningless anyway.
Now, what goes to SQL where you add an ordering clause to your LINQ? It will be something similar to...
SELECT TOP(1) something FROM somewhere WHERE foo=bar ORDER BY somevalue
or, in the the descending/last case
SELECT TOP(1) something FROM somewhere WHERE foo=bar ORDER BY somevalue DESC
From SQL's POV, there's no significant difference here and your DB will be optimized for this sort of query. The index can be scanned in either direction, and the cost of each query above is the same.
TL;DR :
db.Table.OrderByDescending(t => t.ID)
.FirstOrDefault(t => t.ClientNumber == ClientNumberDetected)
is just fine.

Entity Framework 5.0 query results are different from a direct SQL query on the database

So I've got a very simple query (against a single table) that is returning a different result set when run from Entity Framework vs when it's run directly SQL Server Management Studio. At first I thought EF was generating different SQL, but after rewriting it to be a direct SQL query it's still returning a different result.
Working Query:
SELECT *
FROM Users
WHERE IsActive = 1
AND IsAdvisor = 1
ORDER BY Initials
Failing Code:
//var query = this.BaseDB.Users.Where(x => x.IsAdvisor && x.IsActive).OrderBy(x => x.Initials);
var query = this.BaseDB.Users.SqlQuery("SELECT * FROM Users WHERE Users.IsActive = 1 AND Users.IsAdvisor = 1 ORDER BY Initials");
return query.ToList(); //this query (either variant, returns the same result set, but is missing values.
Any idea what's happening? By the way, I did double check to make sure I was querying the same DB.
Firstly, thanks for the responses. This was my own mistake, the previous developer was setting the display property for certain records individually :(, EF was returning the correct result set.

EF LINQ ToList is very slow

I am using ASP NET MVC 4.5 and EF6, code first migrations.
I have this code, which takes about 6 seconds.
var filtered = _repository.Requests.Where(r => some conditions); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes 6 seconds, has 8 items inside
I thought that this is because of relations, it must build them inside memory, but that is not the case, because even when I return 0 fields, it is still as slow.
var filtered = _repository.Requests.Where(r => some conditions).Select(e => new {}); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes still around 5-6 seconds, has 8 items inside
Now the Requests table is quite complex, lots of relations and has ~16k items. On the other hand, the filtered list should only contain proxies to 8 items.
Why is ToList() method so slow? I actually think the problem is not in ToList() method, but probably EF issue, or bad design problem.
Anyone has had experience with anything like this?
EDIT:
These are the conditions:
_repository.Requests.Where(r => ids.Any(a => a == r.Student.Id) && r.StartDate <= cycle.EndDate && r.EndDate >= cycle.StartDate)
So basically, I can checking if Student id is in my id list and checking if dates match.
Your filtered variable contains a query which is a question, and it doesn't contain the answer. If you request the answer by calling .ToList(), that is when the query is executed. And that is the reason why it is slow, because only when you call .ToList() is the query executed by your database.
It is called Deferred execution. A google might give you some more information about it.
If you show some of your conditions, we might be able to say why it is slow.
In addition to Maarten's answer I think the problem is about two different situation
some condition is complex and results in complex and heavy joins or query in your database
some condition is filtering on a column which does not have an index and this cause the full table scan and make your query slow.
I suggest start monitoring the query generated by Entity Framework, it's very simple, you just need to set Log function of your context and see the results,
using (var context = new MyContext())
{
context.Database.Log = Console.Write;
// Your code here...
}
if you see something strange in generated query try to make it better by breaking it in parts, some times Entity Framework generated queries are not so good.
if the query is okay then the problem lies in your database (assuming no network problem).
run your query with an SQL profiler and check what's wrong.
UPDATE
I suggest you to:
add index for StartDate and EndDate Column in your table (one for each, not one for both)
ToList executes the query against DB, while first line is not.
Can you show some conditions code here?
To increase the performance you need to optimize query/create indexes on the DB tables.
Your first line of code only returns an IQueryable. This is a representation of a query that you want to run not the result of the query. The query itself is only runs on the databse when you call .ToList() on your IQueryable, because its the first point that you have actually asked for data.
Your adjustment to add the .Select only adds to the existing IQueryable query definition. It doesnt change what conditions have to execute. You have essentially changed the following, where you get back 8 records:
select * from Requests where [some conditions];
to something like:
select '' from Requests where [some conditions];
You will still have to perform the full query with the conditions giving you 8 records, but for each one, you only asked for an empty string, so you get back 8 empty strings.
The long and the short of this is that any performance problem you are having is coming from your "some conditions". Without seeing them, its is difficult to know. But I have seen people in the past add .Where clauses inside a loop, before calling .ToList() and inadvertently creating a massively complicated query.
Jaanus. The most likely reason of this issue is complecity of generated SQL query by entity framework. I guess that your filter condition contains some check of other tables.
Try to check generated query by "SQL Server Profiler". And then copy this query to "Management Studio" and check "Estimated execution plan". As a rule "Management Studio" generatd index recomendation for your query try to follow these recomendations.

Linq to Entities : Count is very slow

I need to have parent and parent.child.count()....in the query.. when i do this it is taking 20 seconds....its not a huge database...Any ideas for optimization...
var plist = context.persons
.Select(p => new
{
p.fullName,
c.personID,
p.Status,
p.Birthdate,
p.Accounts.Count
}).ToList();
Here is a great article on using count() when you really meant to use any()
http://blogs.teamb.com/craigstuntz/2010/04/21/38598/
Do you need to use .count or could you use .any?
http://msdn.microsoft.com/en-us/library/bb534972.aspx
Since this is entity framework, open up the sql profiler and take a look at what sql queries are being sent to the database. It sounds like you may see that a single query is sent to fetch the group identifiers, and then another set of queries (one for each group) might be fetching the count. If that's happening, you'll have to post the linq query for someone to resolve the issue.
Based on the code you sent, it doesn't look like things should be taking that long. I have a few suggestions:
Use LinqPad to do this query. It will let you see the SQL that gets generated. Then run that SQL code in SQL Server Management Studio, and tell it to include the actual execution plan. This will help you learn whether there's a particular point in the query that's taking a lot of time. For example, if you don't have an index on the Account table's PersonId reference, this query will take a lot longer.
Look at how you're using this data. It's very rare that you really need to have all the people in your entire system in memory at the same time. In fact, I suspect that simply getting all this person data out of the database is probably taking a lot more time than the Count() is.
Are you displaying this data? If so, wouldn't it be better to "page" the results, only showing maybe ten entries at a time? You can use the .Take(int) method before calling .ToList() to get only as many entries as you need.
If you're processing and aggregating this data for the sake of site metrics, it's probably better to set up your query to return the end result before it gets evaluated.
If you can describe how this data is being used, or provide a screenshot of the SQL's execution, we can provide more feedback.
I solved a similar problem using the GroupBy method.
IEnumerable> accounts = Accounts.GroupBy(x => x.personID);
accounts.Count() will return the number of accounts that belong to the person.
accounts.Key will return the personID of the group.
I had a somehow similar problem, I tried these and worked out better :
child.count(x=> x.paretnID == inputParentID)
child.where(x=> x.parentID == inputParentID)
my original code which took around 15-20 seconds on each iteration was:
return (isEdit) ? db.ChasisBuys.Single(x => x.ChasisBuyID == long.Parse(Request.QueryString["chbid"])).Chasises.Count(y => y.Bikes.Count > 0 && y.ColorID == buyItems[(int)index].ColorID && y.ChasisTypeID == buyItems[(int)index].ChasisTypeID).ToString() : "-";
new code which runs good is :
**return (isEdit) ? db.Chasises.Where(x => x.ChasisBuyID == long.Parse(Request.QueryString["chbid"])).Count(y => y.Bikes.Count > 0 && y.ColorID == buyItems[(int)index].ColorID && y.ChasisTypeID == buyItems[(int)index].ChasisTypeID).ToString() : "-";**
Database has around 1000 records in chasises , about 5 in chasisBuys and about 20 in Bikes.
my opinion is that Linq to SQL queries does not do preevaluations such in logical statements which for instance if you write "return a && b && c;" if statement a is false other statements are not evaluated and I was expecting such thing in linq to sql but it's not the case.

Categories