Can anyone tell me why these two code snippets give me two different SQL executions:
First
return nHibernateSession.Query<TEntity>()
.Where(_filter)
.Select(_selector)
.FirstOrDefault();
Where I pass in the arguments
Func<Product, bool> _filter = x => x.Id == 10;
Func<Product, string> _selector = x => x.Name;
When inspecting the query using NHibernate Profiler, this shows that I fully hydrate the entire collection of products in an unbounded query. It selects all fields and all rows, and then I guess it filters the resultset and only returns the name.
Second
And one where I have explicitly specified my query. I tried this to debug the previous generic expression.
return nHibernateSession.Query<Product>()
.Where(x => x.Id == 10)
.Select(x => x.Name)
.FirstOrDefault();
This one behaves as I would expect the previous one to behave. It only selects the name column, and applies a WHERE clause that ensures I only get product 10.
NHibernate needs the expression tree of the filter in order to generate SQL from it. When you pass a Func to it, NHibernate cannot translate it to SQL, so it retrieves all data from DB and then applies the Func filter to it.
change the first one to
Expression<Func<Product, bool>> _filter = x => x.Id == 10;
Expression<Func<Product, string>> _selector = x => x.Name;
In this case you give the whole expression to NHibernate, therefore a restricted query can be generated based on it.
Related
I have a database that users can run a variety of calculations on. The calculations run on 4 different columns each calculation does not necessarily use every column i.e. calculation1 might turn into sql like
SELECT SUM(Column1)
FROM TABLE
WHERE Column1 is not null
and calculation2 would be
SELECT SUM(Column2)
WHERE Column2 is null
I am trying to generate this via linq and I can get the correct data back by calculating everything every time such as
table.Where(x => x.Column1 != null)
.Where(x => x.Column2 == null)
.GroupBy(x => x.Date)
.Select(dateGroup => new
{
Calculation1 = dateGroup.Sum(x => x.Column1 != null),
Calculation2 = dateGroup.Sum(x => x.Column2 == null)
}
The problem is that my dataset is very large, and so I do not want to perform a calculation unless the user has requested it. I have looked into dynamically generating Linq queries. All I have found so far is PredicateBuilder and DynamicSQL, which appear to only be useful for dynamically generating the Where predicate, and hardcoding the sql query itself as a string with the Sum(Column1) or Sum(Column2) being inserted when necessary.
How would one go about dynamically adding the different parts of the Select query into an anonymous type like this? Or should I be looking at an entirely different way of handling this
You can return your query without executing it, which will allow you to dynamically choose what to return.
That said, you cannot dynamically modify an anonymous type at runtime. They are statically typed at compile time. However, you can use a different return object to allow for dynamic properties without needing an external library.
var query = table
.Where(x => x.Column1 != null)
.Where(x => x.Column2 == null)
.GroupBy(x => x.Date);
You can then dyamically resolve queries with any one of the following:
dynamic
dynamic returnObject = new ExpandoObject();
if (includeOne)
returnObject.Calculation1 = groupedQuery.Select (q => q.Sum(x => x.Column1));
if (includeTwo)
returnObject.Calculation2 = groupedQuery.Select (q => q.Sum (x => x.Column2));
Concrete Type
var returnObject = new StronglyTypedObject();
if (includeOne)
returnObject.Calculation1 = groupedQuery.Select (q => q.Sum(x => x.BrandId));
Dictionary<string, int>
I solved this and kept myself from having to lose type safety with Dynamic Linq by using a hacky workaround. I have a object containing bools that correspond to what calculations I want to do such as
public class CalculationChecks
{
bool doCalculation1 {get;set;}
bool doCalculation2 {get;set;}
}
and then do a check in my select for whether or not I should do the calculation or return a constant, like so
Select(x => new
{
Calculation1 = doCalculation1 ? DoCalculation1(x) : 0,
Calculation2 = doCalculation2 ? DoCalculation2(x) : 0
}
However, this appears to be an edge case with linq to sql or ef, that causes the generated sql to still do the calculations specified in DoCalculation1() and DoCalculation2 and then use a case statement to decide whether or not its going to return the data to me. It runs signficantly slower,40-60% in testing, and the execution plan shows that it uses a much more inefficient query.
The solution to this problem was to use an ExpressionVisitor to go through the expression and remove the calculations if the corresponding bool was false. The code showing how implement this ExpressionVisitor was provided by #StriplingWarrior on this question Have EF Linq Select statement Select a constant or a function
Using both of these solutions together is still not creating sql that runs at 100% the speed of plain sql. In testing it was within 10s of plain sql no matter the size of the test set, and the major portions of the execution plan were the same
There is a list, policiesToDelete of entity class, MonitoringRelations. Out of this list I have selected two elements and construed a new list:
var policyKeysToDelete = policiesToDelete
.Select(r => new {r.PolicyId, r.GroupId})
.ToList();
Now, I have a query where I want to compare elements from policyKeysToDelete list.
var objectsToDelete = (from p in storageContext.MonitoringRelations
where policyKeysToDelete
.Any(x => x == new {p.PolicyId, p.GroupId})
select p)
.ToList();
The problem: the query above throws this exception:
NotSupportedException: Unable to create a constant value of type 'Anonymous type'. Only primitive types or enumeration types are supported in this context.
I have tried changing the anonymous list to a list<tuple<PolicyId, GroupId>> , but that also didn't help, throwing the almost same exception. I tried using Contains in place of Any but that also didn't help.
Any idea how can I solve this problem?
EF cannot translate a list of complex objects into the SQL query. What EF can do, is translate a list of simple values into SQL when you use it with the .Contains method.
So if you extract a list of PolicyId's and a list of GroupId's from the policyKeysToDelete, and use it to select as much as you can with EF, then you can do the full check in the resultset which is then in-memory, using Linq-to-objects.
Warning: you are extracting too much from the database, so depending on the amount of data, a different solution might be better.
var policyKeysToDelete = policiesToDelete
.Select(r => new {r.PolicyId, r.GroupId})
.ToList();
// List of values types, which can be translated to SQL
var policyIds = policyKeysToDelete.Select(x => x.PolicyId).ToList();
var groupIds = policyKeysToDelete.Select(x => x.GroupId).ToList();
var objectsToDelete = storageContext.MonitoringRelations
// Do the part that we can do in the database, which is select the records
// which have an corresponding PolicyId or GroupId
.Where(x => policyIds.Contains(x.PolicyId) || groupIds.Contains(x.GroupId))
// Use this method to indicate that whatever follows after should not be
// translated to SQL
.AsEnumerable()
// Do the full check in-memory
.Where(x => policyKeysToDelete
.Any(y => x.PolicyId == y.PolicyId && x.GroupId == y.GroupId)
)
.ToList();
I am new to EF and LINQ.
The following two pieces of code work:
dbContext.Categories.Where(cat => [...big ugly code for filtering...] );
&
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories.Where(
cat => [...big ugly code for filtering...] );
But I want somehow to create only one, reusable, expression or delegate for my filter. I have the following:
private static Expression<Func<Category, bool>> Filter(filter)
{
return cat => [...big ugly code for filtering...] ;
}
but I cannot use it in SelectMany.
I am aware that:
Where clause of standard query accepts Expression<Func<Category,bool>> and returns IQueryable<Category>
Where clause of SelectMany accepts Func<Category,bool> and returns IEnumerable<Category>.
What is the best way to accomplish this? Are any tricks here?
PS: I want in the end to get all categories of a product.
It looks like you're trying to use SelectMany as a filter. SelectMany is used to flatten a collection of collections (or a collection of a type that contains another collection) into one flat collection.
I think what you want is:
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories)
.Where(filter);
In which case you can reuse the same expression to filter.
EDIT
Based on your updated question it looks like you are applying Where to an IEnumerable<T> property, so the compiler is binding to IEnumerable.Where which takes a Func instead of an Expression.
You should be able to just call AsQueryable() on your collection property to bind to IQueryable.Where():
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories
.AsQueryable()
.Where(filter);
The next option would be to compile the expression to turn it into a Func:
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories
.Where(filter.Compile());
But it wouldn't surprise me if the underlying data provider isn't able to translate that to SQL.
All you need to do is call the Filter function before executing the query and store it in a local variable. When the query provider sees a method it attempts to translate that method into SQL, rather than executing the method and using the result. Whenever it encounters a local variable it doesn't attempt to translate it into SQL but rather evaluates the variable to its value, and then uses that value in the query.
As for the problems that you're having due to the fact that the relationship collection isn't an IQueryable, it's probably best to simply approach the query differently and just pull directly from the categories list instead:
var filter = Filter();
dbContext.Categories.Where(filter)
.Where(cat => cat.Product.PROD_UID == 1234);
After analyzing in more detail the SQL generated by EF, I realized that having the filter inside SelectMany is not more efficient.
So, both suggestions (initial one from DStanley and Servy's) should be ok for my case (many-to-many relation between Categories and Products)
/* 1 */
dbContext.Products.Where(prod => prod.PROD_UID == 1234)
.SelectMany(prod => prod.Categories)
.Where( Filter ); // reuseable expression
this will result into a INNER JOIN
/* 2 */
dbContext.Categories.Where( Filter ) // reuseable expression
.Where(c => c.Products.Where(prod.PROD_UID == 1234).Any());
this will result into a EXISTS (sub-select)
The execution plan seems to be absolutely identical for both in my case; so, I will choose for now #2, and will keep an eye on performance.
I have a method where I am trying to return all default customer addresses with the matching gender. I would like to be able to build up the filtering query bit by bit by passing in System.Func methods to the where clause.
var emailAddresses = new List<string>();
// get all customers.
IQueryable<Customer> customersQ = base.GetAllQueryable(appContext).Where(o => o.Deleted == false);
// for each customer filter, filter the query.
var genders = new List<string>() { "C" };
Func<Customer, bool> customerGender = (o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
emailAddresses = (from c in customersQ
select c.Email).Distinct().ToList();
return emailAddresses;
But this method calls the database for every address (8000) times which is very slow.
however if I replace the two lines
Func<Customer, bool> customerGender = (o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
with one line
customersQ = customersQ.Where(o => genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender)).AsQueryable();
Then the query only makes one call to the database and is very fast.
My question is why does this make a difference? How can I make the first method work with only calling the database once?
Use expression instead of Func:
Expression<Func<Customer, bool>> customerGender = (o =>
genders.Contains(o.Addresses.FirstOrDefault(a => a.IsDefaultAddress).Gender));
customersQ = customersQ.Where(customerGender).AsQueryable();
When you are using simple Func delegate, then Where extension of Enumerable is called. Thus all data goes into memory, where it is enumerated and lambda is executed for each entity. And you have many calls to database.
On the other hand, when you are using expression, then Where extension of Queryable is called, and expression is converted into SQL query. That's why you have single query in second case (if you use in-place lambda it is converted into expression).
I have made myself an ExpressionBuilder class that helps me put together expressions that can be used as a predicate when doing Linq to Sql queries. It has worked great. However, I just discovered Expressions can only be used to filter on Tables, and not on EntitySets??Why on earth is this the case?
For example if I have Company and an Employee with a Salary. I could create these two expressions:
Expression<Func<Company, bool>> cp = x => x.Name.StartsWith("Micro");
Expression<Func<Employee, bool>> ep = x => x.Name.StartsWith("John");
I would then expect to be able to do the following, however it only partially works:
var companies = dataContext.Companies
.Where(cp) // Goes fine
.Select(x => new
{
x.Name,
SumOfSalaries = x.Employees
.Where(ep) // Causes compile-time error
.Sum(y => y.Salary),
}
.ToList();
Also, if I do a ep.Compile() it compiles, but then I get an error when running the query.
Why is this the case? Am I missing something? I don't find this logical. Can I fix this somehow? Or do you have a good workaround?
I know that I in this case could just use Where(x => x.Name.StartsWith("John")) instead, but the problem is that the expressions I need are not that trivial. They are longer strings of AndAlsos and OrElses.
If you are going to pass a lambda expression to a LINQ to SQL provider don't create it as an Expression<T> - let the provider do that for you.
The following works for me - note both are compiled:
var companies = dataContext.Companies.Where(cp.Compile())
.Select(x => new
{
x.Name,
SumOfSalaries = x.Employees
.Where( ep.Compile() )
.Sum(y => y.Salary),
}
).ToList();
The expression parser seems to be losing type information in there somewhere after the first where clause when you put in the second. To be honest, I'm not sure why yet.
Edit: To be clear, I do understand that EntitySet doesn't support passing an expression into the where clause. What I don't completely understand is why it fails when you add the Where(ep.Compile()).
My theory is that in compiling the first where (Where(cp.Compile()), Linq2Sql quits parsing the expression - that it can't parse the ep.Compile() against an entityset, and is unable to decide to break up the query into two until you compile the first where clause.
I think you need to rewrite your query. Another way to ask for the details you want, is: "Give me the sum of the salaries for the selected employees in the selected companies, organized by the company name".
So, with the employee salaries in focus, we can write:
//Expression<Func<Employee, bool>> ep = x => x.Name.StartsWith("John");
//Expression<Func<Company, bool>> cp = x => x.Name.StartsWith("Micro");
Expression<Func<Employee, bool>> ep = x => x.Name.StartsWith("John");
Expression<Func<Employee, bool>> cp = x => x.Company.Name.StartsWith("Micro");
var salaryByCompany = dataContext.Employees
.Where(ep)
.Where(cp)
.GroupBy(employee => employee.Company.Name)
.Select(companyEmployees => new
{
Name = companyEmployees.Key,
SumOfSalaries = companyEmployees.Sum(employee => employee.Salary)
});
var companies = salaryByCompany.ToList();