Func Delegates cause LINQ-to-Entities to pull back the entire table - c#

Passing a Func<> as a Where/Count filter causes LINQ to pull back the entire table. Here's a simple example.
pdx.Database.Log = strWriter1.Write;
totalCount = pdx.Principals.Count(x => x.PrincipalNumber.ToLower().Contains("42"));
Looking at the log I see
SELECT [GroupBy1].[A1] AS [C1] FROM ( SELECT COUNT(1) AS [A1]
FROM [Dealer].[Principal] AS [Extent1]
WHERE LOWER([Extent1].[PrincipalNumber]) LIKE N'%42%'
) AS [GroupBy1]
Did not pull back the full table. Simple enough. Now let's assign that lambda to a Func<>
pdx.Database.Log = strWriter2.Write;
Func<Principal, bool> filter = (x => x.PrincipalNumber.ToLower().Contains("42"));
totalCount = pdx.Principals.Count(filter);
The log shows it's pulling down the entire table.
SELECT
[Extent1].[PrincipalNumber] AS [PrincipalNumber],
[Extent1].[Id] AS [Id],
[Extent1].[CompanyName] AS [CompanyName],
...
[Extent1].[DistrictSalesManagerId] AS [DistrictSalesManagerId]
FROM [Dealer].[Principal] AS [Extent1]
That's pretty bad for performance. I have functions that do LINQ queries. I want to pass lambda filters to these functions so I can filter on various things, but apparently I can't pass lambdas as Func<>s because it will kill the performance. What are my alternatives?
What I want to do...
public IEnumerable<DealerInfo> GetMyPage(Func<Principal, bool> filter, int pageNumber, int pageSize, out int totalCount)
{
List<DealerInfo> dealers;
using (MyContext pdx = new MyContext())
{
totalCount = pdx.Principals.Count(filter);
// More LINQ stuff here, but UGH the performance...
}
}

You need to pass an Expression<Func<TSource, bool>> instead. LINQ to Entities cannot translate a Func<T, bool> delegate to SQL, so change the signature to:
public IEnumerable<DealerInfo> GetMyPage(Expression<Func<Principal, bool>> filter, int pageNumber, int pageSize, out int totalCount)
{
List<DealerInfo> dealers;
using (MyContext pdx = new MyContext())
{
totalCount = pdx.Principals.Count(filter);
// More LINQ stuff here, but UGH the performance...
}
}
When you pass a Func<T, bool> to Count, the compiler binds to the Count extension method of IEnumerable<T>, which works on in-memory sequences. The whole table is therefore loaded into memory first, and the delegate is then executed against each loaded row. When you pass an Expression<Func<T, bool>>, the compiler binds to the Count extension method of IQueryable<T>, so the predicate is translated to SQL where possible and the correct query runs on the server.
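The overload choice happens at compile time and can be observed without a database. The following is a minimal sketch that uses an in-memory IQueryable<string> (via AsQueryable) as a stand-in for pdx.Principals; the class OverloadDemo and the helper Kind are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // Overload resolution picks the most specific applicable overload,
    // so these helpers report the static (compile-time) type of a query.
    public static string Kind<T>(IQueryable<T> query) => "IQueryable";
    public static string Kind<T>(IEnumerable<T> query) => "IEnumerable";

    public static string[] Run()
    {
        // In-memory stand-in for a DbSet; a real EF set is also an IQueryable<T>.
        IQueryable<string> numbers = new[] { "142", "7", "420" }.AsQueryable();

        Func<string, bool> asDelegate = s => s.Contains("42");
        Expression<Func<string, bool>> asExpression = s => s.Contains("42");

        // Binds to Enumerable.Where: the result is only an IEnumerable<string>.
        var viaDelegate = numbers.Where(asDelegate);
        // Binds to Queryable.Where: the result is still an IQueryable<string>.
        var viaExpression = numbers.Where(asExpression);

        return new[] { Kind(viaDelegate), Kind(viaExpression) };
    }
}
```

Once the static type has dropped to IEnumerable<T>, every operator chained after it also runs in memory, which is why a single Func parameter is enough to lose the server-side filter.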

Related

Use Entity Framework Select Single Row. Entity Framework did not generate sql statement right?

I'm using Entity Framework as my data provider, and for one specific table it does not generate the right SQL statement.
Although I pass a query condition, Entity Framework still generates a SQL statement for the whole table.
The code looks like this:
public IList<Device> GetDeviceByNodeId(string nodeId)
=> GetModels(device => device.DeviceNodeId == nodeId).ToList();
public virtual IEnumerable<T> GetModels(Func<T, bool> exp)
=> EntitySet.Where(exp);
And the generated SQL statement looks like this:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[DeviceTypeId] AS [DeviceTypeId],
[Extent1].[OriginalDeviceId] AS [OriginalDeviceId],
[Extent1].[DeviceCode] AS [DeviceCode],
[Extent1].[StatCode] AS [StatCode],
[Extent1].[DevicePassword] AS [DevicePassword],
[Extent1].[DeviceModuleGuid] AS [DeviceModuleGuid],
[Extent1].[DeviceNodeId] AS [DeviceNodeId],
[Extent1].[FirmwareSetId] AS [FirmwareSetId],
[Extent1].[ProjectId] AS [ProjectId],
[Extent1].[StartTime] AS [StartTime],
[Extent1].[PreEndTime] AS [PreEndTime],
[Extent1].[EndTime] AS [EndTime],
[Extent1].[Status] AS [Status],
[Extent1].[CameraId] AS [CameraId],
[Extent1].[DomainId] AS [DomainId],
[Extent1].[CreateDateTime] AS [CreateDateTime],
[Extent1].[CreateUserId] AS [CreateUserId],
[Extent1].[LastUpdateDateTime] AS [LastUpdateDateTime],
[Extent1].[LastUpdateUserId] AS [LastUpdateUserId],
[Extent1].[IsDeleted] AS [IsDeleted],
[Extent1].[IsEnabled] AS [IsEnabled]
FROM [dbo].[Devices] AS [Extent1]
Am I using LINQ the wrong way?
It is because you are using a function, and not an expression.
Remember that IQueryables are also IEnumerables, so you are really calling Enumerable.Where, not Queryable.Where. This will cause Entity Framework to execute the query and filter the results in memory.
Change your method signature from:
public virtual IEnumerable<T> GetModels(Func<T, bool> exp)
to
public virtual IEnumerable<T> GetModels(Expression<Func<T, bool>> exp)
See this post for more information about the differences between Expressions and Functions.
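As a rough illustration of the signature change, here is a sketch of a repository whose GetModels takes an expression. The Repository<T> class, its constructor, and RepositoryDemo are assumptions made for this example, and an in-memory IQueryable stands in for the real entity set. Unlike the signature above, it returns IQueryable<T> so that callers can keep composing the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical repository; EntitySet stands in for a DbSet<T>.
public class Repository<T>
{
    protected IQueryable<T> EntitySet { get; }

    public Repository(IQueryable<T> entitySet) { EntitySet = entitySet; }

    // Expression parameter: binds to Queryable.Where, so the filter
    // becomes part of the query's expression tree.
    public virtual IQueryable<T> GetModels(Expression<Func<T, bool>> exp)
        => EntitySet.Where(exp);
}

public static class RepositoryDemo
{
    public static string OutermostCall()
    {
        var repo = new Repository<string>(new[] { "a1", "b2" }.AsQueryable());
        var query = repo.GetModels(id => id == "b2");

        // The outermost node of the tree is the Where call that a provider
        // such as Entity Framework would translate into a SQL WHERE clause.
        var call = (MethodCallExpression)query.Expression;
        return call.Method.DeclaringType.Name + "." + call.Method.Name;
    }
}
```

Callers do not need to change: a lambda such as device => device.DeviceNodeId == nodeId converts to the expression type implicitly.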

Get SQL statement from a predicate?

Is it possible to build the 'WHERE' clause of a SQL statement based on a predicate?
For example:
public override IQueryable<Customer>
SearchFor(Expression<Func<Customer, bool>> predicate)
{ }
The base method just uses EF and all it does is:
return dbSet.Where(predicate);
However, in this particular scenario I need to override the method and, based on the predicate parameter, build a sql statement and execute that statement against the database directly (skipping EF).
So my new method would be:
public override IQueryable<Customer> SearchFor(Expression<Func<Customer, bool>> predicate)
{
var where = predicate.ToString(); //Not actual code!!
var sql = "SELECT id, name FROM customers WHERE " + where;
//Execute sql statement
}
And the caller would do:
var customers = customerRepository.SearchFor(x => x.CustomerType == "ABC" && x.Age > 21);
The Customer entity here is just an example. My reasons for building a SQL statement instead of using EF are:
I want to use Dapper for performance.
I will execute a stored procedure to fetch the records.
The entities I'm using are not mapped to a table. A table exists in the database, but these entities are just placeholder POCOs for retrieving records.
Is it possible?
Take a look at DataContext.GetCommand Method
Per MSDN:
Gets the information about SQL commands generated by LINQ to SQL.
This will get you the whole SQL, but some string manipulation will get you just the Where clause.
I wrote code to convert to SQL.
https://github.com/phnx47/MicroOrm.Dapper.Repositories - Include SqlGenerator
https://github.com/phnx47/MicroOrm.SqlGenerator
Example:
var account = accountsRepository.FindAsync(x => x.Id == id)
Generated:
SELECT Accounts.Id, Accounts.Name FROM Accounts WHERE Accounts.Id = @Id
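For simple predicates, another option is to walk the expression tree yourself. The sketch below is not how EF or the libraries linked above work internally; it is a minimal ExpressionVisitor that handles only comparisons combined with && and ||, assumes each column is named after its property, and does not treat null comparisons specially. WhereClauseBuilder, the p0/p1 parameter names, and the Customer class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

public sealed class WhereClauseBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();
    private readonly Dictionary<string, object> _parameters = new Dictionary<string, object>();

    public static (string Sql, Dictionary<string, object> Parameters) Build<T>(
        Expression<Func<T, bool>> predicate)
    {
        var builder = new WhereClauseBuilder();
        builder.Visit(predicate.Body);
        return (builder._sql.ToString(), builder._parameters);
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append('(');
        Visit(node.Left);
        _sql.Append(ToSqlOperator(node.NodeType));
        Visit(node.Right);
        _sql.Append(')');
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Expression is ParameterExpression)
        {
            // x.Age: assume the column is named after the property.
            _sql.Append(node.Member.Name);
        }
        else
        {
            // Captured variable or similar: evaluate it and pass the value as a parameter.
            AddParameter(Expression.Lambda(node).Compile().DynamicInvoke());
        }
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        AddParameter(node.Value);
        return node;
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Method calls such as Contains are outside the scope of this sketch.
        throw new NotSupportedException("Unsupported method call: " + node.Method.Name);
    }

    private void AddParameter(object value)
    {
        string name = "p" + _parameters.Count;
        _parameters[name] = value;
        _sql.Append('@').Append(name);
    }

    private static string ToSqlOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return " = ";
            case ExpressionType.NotEqual: return " <> ";
            case ExpressionType.GreaterThan: return " > ";
            case ExpressionType.GreaterThanOrEqual: return " >= ";
            case ExpressionType.LessThan: return " < ";
            case ExpressionType.LessThanOrEqual: return " <= ";
            case ExpressionType.AndAlso: return " AND ";
            case ExpressionType.OrElse: return " OR ";
            default: throw new NotSupportedException("Unsupported operator: " + type);
        }
    }
}

// Sample entity for illustration only; not mapped to anything.
public class Customer
{
    public string CustomerType { get; set; }
    public int Age { get; set; }
}
```

Calling WhereClauseBuilder.Build<Customer>(x => x.CustomerType == "ABC" && x.Age > 21) yields the text ((CustomerType = @p0) AND (Age > @p1)) together with the values "ABC" and 21. Because the values travel as parameters rather than as concatenated text, the clause can be appended after WHERE without opening the door to SQL injection.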

DbSet<T>.Where(where).ToList() - why SQL does not include where clause?

Why is EF 6 querying the database for all records with the following code?
public virtual List<T> Find(Func<T, bool> where = null)
{
_db.Configuration.LazyLoadingEnabled = false;
if (where == null) throw new NullReferenceException("The 'where' parameter of the Repository.Find() method is null.");
return _dbSet.Where(where).ToList();
}
Produces the following output
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Sequence] AS [Sequence],
[Extent1].[Description] AS [Description],
[Extent1].[Instructions] AS [Instructions],
[Extent1].[WorkCenterOperationId] AS [WorkCenterOperationId],
[Extent1].[JobId] AS [JobId],
[Extent1].[JobAssemblyId] AS [JobAssemblyId],
[Extent1].[RowVersion] AS [RowVersion]
FROM [dbo].[JobOperations] AS [Extent1]
Two questions:
Why isn't the query executed with the where statement?
How do I get the query to execute with the where statement?
You used a Func<T,bool> rather than an Expression<Func<T,bool>> and so you've forced (somewhere) a transition from the database Linq-to-Entities to Linq-to-Objects. So it's processed in memory.
And, as @Marc points out, a simple fix may be:
public virtual List<T> Find(Expression<Func<T, bool>> where = null)
...
But that, in turn, depends on whether the calling code is in a form that can generate either of Func<T,bool> or Expression<Func<T,bool>> (usually, a lambda will be convertible to either form)
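To make that caveat concrete, here is a small sketch (LambdaForms is a hypothetical name) showing that a lambda literal converts to either type, while a value that is already a Func cannot be converted back into an expression tree.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaForms
{
    public static bool[] Run()
    {
        // The same lambda text converts to either form, depending on the target type.
        Func<int, bool> asDelegate = n => n > 21;
        Expression<Func<int, bool>> asExpression = n => n > 21;

        // An expression tree can be compiled into a delegate...
        Func<int, bool> compiled = asExpression.Compile();

        // ...but an existing delegate cannot be turned back into a tree that a
        // provider could translate, so the following line would not compile:
        // Expression<Func<int, bool>> invalid = asDelegate;

        return new[] { asDelegate(30), compiled(30), compiled(10) };
    }
}
```

In practice this means callers that pass lambdas inline keep working after the signature change, while callers that store the filter in a Func variable first need to change that variable's type as well.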

Force Entity Framework to use SQL parameterization for better SQL proc cache reuse

Entity Framework always seems to use constants in generated SQL for values provided to Skip() and Take().
In the ultra-simplified example below:
int x = 10;
int y = 10;
var stuff = context.Users
.OrderBy(u => u.Id)
.Skip(x)
.Take(y)
.Select(u => u.Id)
.ToList();
x = 20;
var stuff2 = context.Users
.OrderBy(u => u.Id)
.Skip(x)
.Take(y)
.Select(u => u.Id)
.ToList();
the above code generates the following SQL queries:
SELECT TOP (10)
[Extent1].[Id] AS [Id]
FROM ( SELECT [Extent1].[Id] AS [Id], row_number() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
FROM [dbo].[User] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > 10
ORDER BY [Extent1].[Id] ASC
SELECT TOP (10)
[Extent1].[Id] AS [Id]
FROM ( SELECT [Extent1].[Id] AS [Id], row_number() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
FROM [dbo].[User] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > 20
ORDER BY [Extent1].[Id] ASC
Resulting in 2 Adhoc plans added to the SQL proc cache with 1 use each.
What I'd like to accomplish is to parameterize the Skip() and Take() logic so the following SQL queries are generated:
EXEC sp_executesql N'SELECT TOP (@p__linq__0)
[Extent1].[Id] AS [Id]
FROM ( SELECT [Extent1].[Id] AS [Id], row_number() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
FROM [dbo].[User] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > @p__linq__1
ORDER BY [Extent1].[Id] ASC',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=10,@p__linq__1=10
EXEC sp_executesql N'SELECT TOP (@p__linq__0)
[Extent1].[Id] AS [Id]
FROM ( SELECT [Extent1].[Id] AS [Id], row_number() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
FROM [dbo].[User] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > @p__linq__1
ORDER BY [Extent1].[Id] ASC',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=10,@p__linq__1=20
This results in 1 Prepared plan added to the SQL proc cache with 2 uses.
I have some fairly complex queries and am experiencing significant overhead (on the SQL Server side) on the first run, and much faster execution on subsequent runs (since it can use the plan cache). Note that these more advanced queries already use sp_executesql as other values are parameterized so I'm not concerned about that aspect.
The first set of queries generated above basically means any pagination logic will create a new entry in the plan cache for each page, bloating the cache and requiring the plan generation overhead to be incurred for each page.
Can I force Entity Framework to parameterize values? I've noticed for other values e.g. in Where clauses, sometimes it parameterizes values, and sometimes it uses constants.
Am I completely out to lunch? Is there any reason why Entity Framework's existing behavior is better than the behavior I desire?
Edit:
In case it's relevant, I should mention that I'm using Entity Framework 4.2.
Edit 2:
This question is not a duplicate of Entity Framework/Linq to SQL: Skip & Take, which merely asks how to ensure that Skip and Take execute in SQL instead of on the client. This question pertains to parameterizing these values.
Update: the Skip and Take extension methods that take lambda parameters described below are part of Entity Framework from version 6 and onwards. You can take advantage of them by importing the System.Data.Entity namespace in your code.
In general LINQ to Entities translates constants as constants and variables passed to the query into parameters.
The problem is that the Queryable versions of Skip and Take accept simple integer parameters and not lambda expressions, therefore while LINQ to Entities can see the values you pass, it cannot see the fact that you used a variable to pass them (in other words, methods like Skip and Take don't have access to the method's closure).
This not only affects the parameterization in LINQ to Entities but also the learned expectation that if you pass a variable to a LINQ query the latest value of the variable is used every time you re-execute the query. E.g., something like this works for Where but not for Skip or Take:
var letter = "";
var q = db.Beattles.Where(p => p.Name.StartsWith(letter));
letter = "p";
var beattle1 = q.First(); // Returns Paul
letter = "j";
var beattle2 = q.First(); // Returns John
Note that the same peculiarity also affects ElementAt but this one is currently not supported by LINQ to Entities.
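The difference between a captured variable and a copied value can be reproduced with an in-memory queryable. This is only a sketch: ClosureDemo is a hypothetical name, and LINQ to Objects stands in for LINQ to Entities, so string matching here is case-sensitive.

```csharp
using System;
using System.Linq;

public static class ClosureDemo
{
    public static string[] Run()
    {
        IQueryable<string> beatles =
            new[] { "John", "Paul", "George", "Ringo" }.AsQueryable();

        // Where takes a lambda, so the variable itself is captured.
        var letter = "";
        var byLetter = beatles.Where(name => name.StartsWith(letter));
        letter = "P";
        var first = byLetter.First();
        letter = "J";
        var second = byLetter.First();

        // Skip takes a plain int, so only the value at call time is recorded.
        int count = 1;
        var skipped = beatles.Skip(count);
        count = 3;
        var third = skipped.First();

        return new[] { first, second, third };
    }
}
```

Here first is "Paul" and second is "John" because the filter re-reads letter on each execution, while third is still "Paul" because Skip recorded the value 1 when it was called.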
Here is a trick that you can use to force the parameterization of Skip and Take and at the same time make them behave more like other query operators:
public static class PagingExtensions
{
private static readonly MethodInfo SkipMethodInfo =
typeof(Queryable).GetMethod("Skip");
public static IQueryable<TSource> Skip<TSource>(
this IQueryable<TSource> source,
Expression<Func<int>> countAccessor)
{
return Parameterize(SkipMethodInfo, source, countAccessor);
}
private static readonly MethodInfo TakeMethodInfo =
typeof(Queryable).GetMethod("Take");
public static IQueryable<TSource> Take<TSource>(
this IQueryable<TSource> source,
Expression<Func<int>> countAccessor)
{
return Parameterize(TakeMethodInfo, source, countAccessor);
}
private static IQueryable<TSource> Parameterize<TSource, TParameter>(
MethodInfo methodInfo,
IQueryable<TSource> source,
Expression<Func<TParameter>> parameterAccessor)
{
if (source == null)
throw new ArgumentNullException("source");
if (parameterAccessor == null)
throw new ArgumentNullException("parameterAccessor");
return source.Provider.CreateQuery<TSource>(
Expression.Call(
null,
methodInfo.MakeGenericMethod(new[] { typeof(TSource) }),
new[] { source.Expression, parameterAccessor.Body }));
}
}
The class above defines new overloads of Skip and Take that expect a lambda expression and can hence capture variables. Using the methods like this will result in the variables being translated to parameters by LINQ to Entities:
int x = 10;
int y = 10;
var query = context.Users.OrderBy(u => u.Id).Skip(() => x).Take(() => y);
var result1 = query.ToList();
x = 20;
var result2 = query.ToList();
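What the lambda overload changes can be seen by inspecting the expression tree rather than the SQL. The sketch below uses hypothetical names (SkipArgumentDemo, SkipBy, CountArgumentKind) and an in-memory queryable. It also looks up Skip by its parameter type rather than by name alone, on the assumption that name-only lookups can become ambiguous on newer runtimes where Queryable has gained extra overloads (for example, Take has had a Range overload since .NET 6).

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class SkipArgumentDemo
{
    // Find Queryable.Skip(IQueryable<T>, int) by its int parameter so that
    // additional overloads cannot make the lookup ambiguous.
    private static readonly MethodInfo SkipMethodInfo = typeof(Queryable)
        .GetMethods()
        .Single(m => m.Name == "Skip"
            && m.GetParameters().Length == 2
            && m.GetParameters()[1].ParameterType == typeof(int));

    // Same idea as the Parameterize helper: pass the lambda body, not its value.
    public static IQueryable<T> SkipBy<T>(
        IQueryable<T> source, Expression<Func<int>> countAccessor)
    {
        return source.Provider.CreateQuery<T>(
            Expression.Call(
                null,
                SkipMethodInfo.MakeGenericMethod(typeof(T)),
                source.Expression,
                countAccessor.Body));
    }

    private static ExpressionType CountArgumentKind<T>(IQueryable<T> query)
        => ((MethodCallExpression)query.Expression).Arguments[1].NodeType;

    public static ExpressionType[] Run()
    {
        IQueryable<int> ids = Enumerable.Range(1, 100).AsQueryable();
        int x = 10;
        return new[]
        {
            CountArgumentKind(ids.Skip(x)),          // value copied into the tree
            CountArgumentKind(SkipBy(ids, () => x)), // variable access kept in the tree
        };
    }
}
```

The built-in Skip(int) leaves a Constant node in the tree, whereas the lambda-based version leaves a MemberAccess node that refers to the captured variable. That node is what allows a provider to emit a parameter instead of a literal.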
Hope this helps.
The methods Skip and Top of ObjectQuery<T> can be parametrized. There is an example at MSDN.
I did a similar thing in a model of my own and sql server profiler showed the parts
SELECT TOP (@limit)
and
WHERE [Extent1].[row_number] > @skip
So, yes. It can be done. And I agree with others that this is a valuable observation you made here.

Entity Framework/Linq to SQL: Skip & Take

Just curious as to how Skip & Take are supposed to work. I'm getting the results I want to see on the client side, but when I hook up the AnjLab SQL Profiler and look at the SQL that is being executed it looks as though it is querying for and returning the entire set of rows to the client.
Is it really returning all the rows then sorting and narrowing down stuff with LINQ on the client side?
I've tried doing it with both Entity Framework and Linq to SQL; both appear to have the same behavior.
Not sure it makes any difference, but I'm using C# in VWD 2010.
Any insight?
public IEnumerable<Store> ListStores(Func<Store, string> sort, bool desc, int page, int pageSize, out int totalRecords)
{
var context = new TectonicEntities();
totalRecords = context.Stores.Count();
int skipRows = (page - 1) * pageSize;
if (desc)
return context.Stores.OrderByDescending(sort).Skip(skipRows).Take(pageSize).ToList();
return context.Stores.OrderBy(sort).Skip(skipRows).Take(pageSize).ToList();
}
Resulting SQL (Note: I'm excluding the Count query):
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name],
[Extent1].[LegalName] AS [LegalName],
[Extent1].[YearEstablished] AS [YearEstablished],
[Extent1].[DiskPath] AS [DiskPath],
[Extent1].[URL] AS [URL],
[Extent1].[SecureURL] AS [SecureURL],
[Extent1].[UseSSL] AS [UseSSL]
FROM [dbo].[tec_Stores] AS [Extent1]
After some further research, I found that the following works the way I would expect it to:
public IEnumerable<Store> ListStores(Func<Store, string> sort, bool desc, int page, int pageSize, out int totalRecords)
{
var context = new TectonicEntities();
totalRecords = context.Stores.Count();
int skipRows = (page - 1) * pageSize;
var qry = from s in context.Stores orderby s.Name ascending select s;
return qry.Skip(skipRows).Take(pageSize);
}
Resulting SQL:
SELECT TOP (3)
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name],
[Extent1].[LegalName] AS [LegalName],
[Extent1].[YearEstablished] AS [YearEstablished],
[Extent1].[DiskPath] AS [DiskPath],
[Extent1].[URL] AS [URL],
[Extent1].[SecureURL] AS [SecureURL],
[Extent1].[UseSSL] AS [UseSSL]
FROM ( SELECT [Extent1].[ID] AS [ID], [Extent1].[Name] AS [Name], [Extent1].[LegalName] AS [LegalName], [Extent1].[YearEstablished] AS [YearEstablished], [Extent1].[DiskPath] AS [DiskPath], [Extent1].[URL] AS [URL], [Extent1].[SecureURL] AS [SecureURL], [Extent1].[UseSSL] AS [UseSSL], row_number() OVER (ORDER BY [Extent1].[Name] ASC) AS [row_number]
FROM [dbo].[tec_Stores] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > 3
ORDER BY [Extent1].[Name] ASC
I really like the way the first option works; Passing in a lambda expression for sort. Is there any way to accomplish the same thing in the LINQ to SQL orderby syntax? I tried using qry.OrderBy(sort).Skip(skipRows).Take(pageSize), but that ended up giving me the same results as my first block of code. Leads me to believe my issues are somehow tied to OrderBy.
====================================
PROBLEM SOLVED
Had to wrap the incoming lambda function in Expression:
Expression<Func<Store,string>> sort
The following works and accomplishes the simplicity I was looking for:
public IEnumerable<Store> ListStores(Expression<Func<Store, string>> sort, bool desc, int page, int pageSize, out int totalRecords)
{
List<Store> stores = new List<Store>();
using (var context = new TectonicEntities())
{
totalRecords = context.Stores.Count();
int skipRows = (page - 1) * pageSize;
if (desc)
stores = context.Stores.OrderByDescending(sort).Skip(skipRows).Take(pageSize).ToList();
else
stores = context.Stores.OrderBy(sort).Skip(skipRows).Take(pageSize).ToList();
}
return stores;
}
The main thing that fixed it for me was changing the Func sort parameter to:
Expression<Func<Store, string>> sort
As long as you don't do it like queryable.ToList().Skip(5).Take(10), it won't return the whole recordset.
Take
Doing only Take(10).ToList(), does a SELECT TOP 10 * FROM.
Skip
Skip works a bit differently because there is no 'LIMIT' function in T-SQL. Instead, it creates a SQL query based on the approach described in this ScottGu blog post.
If you see the whole recordset returned, it probably is because you are doing a ToList() somewhere too early.
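The cost of an early ToList() can be shown with an in-memory analogy. This is not Entity Framework: Rows is a hypothetical lazy source that counts how many items are read from it, standing in for rows fetched from the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyToListDemo
{
    // Lazy source that counts how many rows are actually read from it.
    private static IEnumerable<int> Rows(int total, Action onRead)
    {
        for (int i = 0; i < total; i++)
        {
            onRead();
            yield return i;
        }
    }

    public static int[] Run()
    {
        // ToList() first: every row is materialized before paging.
        int eagerReads = 0;
        var eagerPage = Rows(1000, () => eagerReads++)
            .ToList().Skip(5).Take(10).ToList();

        // Paging first: reading stops once the page is complete.
        int lazyReads = 0;
        var lazyPage = Rows(1000, () => lazyReads++)
            .Skip(5).Take(10).ToList();

        return new[] { eagerReads, lazyReads, eagerPage.Count, lazyPage.Count };
    }
}
```

Both pages contain the same 10 items, but the first version reads all 1000 rows to get them. With a database the difference is larger still, since the skipped and unneeded rows also have to cross the network.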
Entity Framework 6 solution here...
http://anthonychu.ca/post/entity-framework-parameterize-skip-take-queries-sql/
e.g.
using System.Data.Entity;
....
int skip = 5;
int take = 10;
myQuery.Skip(() => skip).Take(() => take);
I created a simple extension:
public static IEnumerable<T> SelectPage<T, T2>(this IEnumerable<T> list, Func<T, T2> sortFunc, bool isDescending, int index, int length)
{
List<T> result = null;
if (isDescending)
result = list.OrderByDescending(sortFunc).Skip(index).Take(length).ToList();
else
result = list.OrderBy(sortFunc).Skip(index).Take(length).ToList();
return result;
}
Simple use:
using (var context = new TransportContext())
{
var drivers = (from x in context.Drivers where x.TransportId == trasnportId select x).SelectPage(x => x.Id, false, index, length).ToList();
}
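Note that the SelectPage extension above takes IEnumerable<T> and Func<T, T2>, so its OrderBy, Skip and Take bind to the Enumerable versions and run in memory once the filtered query has been fetched. A possible IQueryable counterpart is sketched below; SelectPageQueryable and QueryablePagingExtensions are hypothetical names, and the sketch assumes the provider can translate the sort expression.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryablePagingExtensions
{
    // IQueryable counterpart of SelectPage: sorting and paging are added
    // to the expression tree instead of being run in memory.
    public static IQueryable<T> SelectPageQueryable<T, TKey>(
        this IQueryable<T> query,
        Expression<Func<T, TKey>> sortExpression,
        bool isDescending,
        int index,
        int length)
    {
        var ordered = isDescending
            ? query.OrderByDescending(sortExpression)
            : query.OrderBy(sortExpression);

        // Still deferred: nothing executes until the caller enumerates.
        return ordered.Skip(index).Take(length);
    }
}
```

The call site stays the same apart from the method name, for example context.Drivers.Where(x => x.TransportId == transportId).SelectPageQueryable(x => x.Id, false, index, length).ToList(), where transportId, index and length are the caller's own variables.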
If you are using SQL Server as the database, a query like
context.Users.OrderBy(u => u.Id)
    .Skip(() => 10)
    .Take(() => 5)
    .ToList();
is translated to:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[UserName] AS [UserName]
FROM [dbo].[AspNetUsers] AS [Extent1]
ORDER BY [Extent1].[Id] ASC
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY
Reference: https://anthonychu.ca/post/entity-framework-parameterize-skip-take-queries-sql/
Try this:
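public IEnumerable<Store> ListStores(Expression<Func<Store, string>> sort, bool desc, int page, int pageSize, out int totalRecords)
{
    var context = new TectonicEntities();
    // Keep the query typed as IQueryable so ordering and paging run in SQL.
    IQueryable<Store> results = context.Stores;
    totalRecords = results.Count();
    int skipRows = (page - 1) * pageSize;
    // Skip needs an ordered query, so apply the sort in both directions.
    results = desc ? results.OrderByDescending(sort) : results.OrderBy(sort);
    return results.Skip(skipRows).Take(pageSize).ToList();
}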
In truth, that last .ToList() isn't really necessary, as you are returning IEnumerable.
There will be 2 database calls, one for the count and one when the ToList() is executed.
