Combine Expressions instead of using multiple queries in Entity Framework - c#

I have following generic queryable (which may already have selections applied):
IQueryable<TEntity> queryable = DBSet<TEntity>.AsQueryable();
Then there is the Provider class that looks like this:
public class Provider<TEntity>
{
public Expression<Func<TEntity, bool>> Condition { get; set; }
[...]
}
The Condition could be defined per instance in the following fashion:
Condition = entity => entity.Id == 3;
Now I want to select all Provider instances which have a Condition that is met at least by one entity of the DBSet:
List<Provider> providers = [...];
var matchingProviders = providers.Where(provider => queryable.Any(provider.Condition))
The problem with this: I'm starting a query for each Provider instance in the list. I'd rather use a single query to achieve the same result. This topic is especially important because of questionable performance. How can I achieve the same results with a single query and improve performance using Linq statements or Expression Trees?

Interesting challenge. The only way I see is to build dynamically UNION ALL query like this:
SELECT TOP 1 0 FROM Table WHERE Condition[0]
UNION ALL
SELECT TOP 1 1 FROM Table WHERE Condition[1]
...
UNION ALL
SELECT TOP 1 N-1 FROM Table WHERE Condition[N-1]
and then use the returned numbers as index to get the matching providers.
Something like this:
var parameter = Expression.Parameter(typeof(TEntity), "e");
var indexQuery = providers
.Select((provider, index) => queryable
.Where(provider.Condition)
.Take(1)
.Select(Expression.Lambda<Func<TEntity, int>>(Expression.Constant(index), parameter)))
.Aggregate(Queryable.Concat);
var indexes = indexQuery.ToList();
var matchingProviders = indexes.Select(index => providers[index]);
Note that I could have built the query without using Expression class by replacing the above Select with
.Select(_ => index)
but that would introduce unnecessary SQL query parameter for each index.

Here is another (crazy) idea that came in my mind. Please note that similar to my previous answer, it doesn't guarantee better performance (in fact it could be worse). It just presents a way to do what you are asking with a single SQL query.
Here we are going to create a query that returns a single string with length N consisting of '0' and '1' characters with '1' denoting a match (something like string bit array). The query will use my favorite group by constant technique to build dynamically something like this:
var matchInfo = queryable
.GroupBy(e => 1)
.Select(g =>
(g.Max(Condition[0] ? "1" : "0")) +
(g.Max(Condition[1] ? "1" : "0")) +
...
(g.Max(Condition[N-1] ? "1" : "0")))
.FirstOrDefault() ?? "";
And here is the code:
var group = Expression.Parameter(typeof(IGrouping<int, TEntity>), "g");
var concatArgs = providers.Select(provider => Expression.Call(
typeof(Enumerable), "Max", new[] { typeof(TEntity), typeof(string) },
group, Expression.Lambda(
Expression.Condition(
provider.Condition.Body, Expression.Constant("1"), Expression.Constant("0")),
provider.Condition.Parameters)));
var concatCall = Expression.Call(
typeof(string).GetMethod("Concat", new[] { typeof(string[]) }),
Expression.NewArrayInit(typeof(string), concatArgs));
var selector = Expression.Lambda<Func<IGrouping<int, TEntity>, string>>(concatCall, group);
var matchInfo = queryable
.GroupBy(e => 1)
.Select(selector)
.FirstOrDefault() ?? "";
var matchingProviders = matchInfo.Zip(providers,
(match, provider) => match == '1' ? provider : null)
.Where(provider => provider != null)
.ToList();
Enjoy:)
P.S. In my opinion, this query will run with constant speed (regarding number and type of the conditions, i.e. can be considered O(N) in the best, worst and average cases, where N is the number of the records in the table) because the database has to perform always a full table scan. Still it will be interesting to know what's the actual performance, but most likely doing something like this just doesn't worth the efforts.
Update: Regarding the bounty and the updated requirement:
Find a fast query that only reads a record of the table once and ends the query if already all conditions are met
There is no standard SQL construct (not even speaking about LINQ query translation) that satisfies both conditions. The constructs that allow early end like EXISTS can be used for a single condition, thus when executed for multiple conditions will violate the first rule of reading the table record only once. While the constructs that use aggregates like in this answer satisfy the first rule, but in order to produce the aggregate value they have to read all the records, thus cannot exit earlier.
Shortly, there is no query that can satisfy both requirements. What about the fast part, it really depends of the size of the data and the number and type of the conditions, table indexes etc., so again there is simply no "best" general solution for all cases.

Based on this Post by #Ivan I created an expression that is slightly faster in some cases.
It uses Any instead of Max to get the desired results.
var group = Expression.Parameter(typeof(IGrouping<int, TEntity>), "g");
var anyMethod = typeof(Enumerable)
.GetMethods()
.First(m => m.Name == "Any" && m.GetParameters()
.Count() == 2)
.MakeGenericMethod(typeof(TEntity));
var concatArgs = Providers.Select(provider =>
Expression.Call(anyMethod, group,
Expression.Lambda(provider.Condition.Body, provider.Condition.Parameters)));
var convertExpression = concatArgs.Select(concat =>
Expression.Condition(concat, Expression.Constant("1"), Expression.Constant("0")));
var concatCall = Expression.Call(
typeof(string).GetMethod("Concat", new[] { typeof(string[]) }),
Expression.NewArrayInit(typeof(string), convertExpression));
var selector = Expression.Lambda<Func<IGrouping<int, TEntity>, string>>(concatCall, group);
var matchInfo = queryable
.GroupBy(e => 1)
.Select(selector)
.First();
var MatchingProviders = matchInfo.Zip(Providers,
(match, provider) => match == '1' ? provider : null)
.Where(provider => provider != null)
.ToList();

The approach I tried here was to create Conditions and nest them into one Expression. If one of the Conditions is met, we get the index of the Provider for it.
private static Expression NestedExpression(
IEnumerable<Expression<Func<TEntity, bool>>> expressions,
int startIndex = 0)
{
var range = expressions.ToList();
range.RemoveRange(0, startIndex);
if (range.Count == 0)
return Expression.Constant(-1);
return Expression.Condition(
range[0].Body,
Expression.Constant(startIndex),
NestedExpression(expressions, ++startIndex));
}
Because the Expressions still may use different ParameterExpressions, we need an ExpressionVisitor to rewrite those:
private class PredicateRewriterVisitor : ExpressionVisitor
{
private readonly ParameterExpression _parameterExpression;
public PredicateRewriterVisitor(ParameterExpression parameterExpression)
{
_parameterExpression = parameterExpression;
}
protected override Expression VisitParameter(ParameterExpression node)
{
return _parameterExpression;
}
}
For the rewrite we only need to call this method:
private static Expression<Func<T, bool>> Rewrite<T>(
Expression<Func<T, bool>> exp,
ParameterExpression parameterExpression)
{
var newExpression = new PredicateRewriterVisitor(parameterExpression).Visit(exp);
return (Expression<Func<T, bool>>)newExpression;
}
The query itself and the selection of the Provider instances works like this:
var parameterExpression = Expression.Parameter(typeof(TEntity), "src");
var conditions = Providers.Select(provider =>
Rewrite(provider.Condition, parameterExpression)
);
var nestedExpression = NestedExpression(conditions);
var lambda = Expression.Lambda<Func<TEntity, int>>(nestedExpression, parameterExpression);
var matchInfo = queryable.Select(lambda).Distinct();
var MatchingProviders = Providers.Where((provider, index) => matchInfo.Contains(index));
Note: Another option which isn't really fast as well

Here is another view of the problem that has nothing to do with expressions.
Since the main goal is to improve the performance, if the attempts to produce the result with single query don't help, we could try improving the speed by parallelizing the execution of the original multi query solution.
Since it's really a LINQ to Objects query (which internally executes multiple EF queries), theoretically it should be a simple matter of turning it into a PLINQ query by inserting AsParallel like this (non working):
var matchingProviders = providers
.AsParallel()
.Where(provider => queryable.Any(provider.Condition))
.ToList();
However, it turns out that EF DbContext is not well suited for multi thread access, and the above simply generates runtime errors. So I had to resort to TPL using one of the Parallel.ForEach overloads that allows us to supply local state, which I used to allocate several DbContext instances during the execution.
The final working code looks like this:
var matchingProviders = new List<Provider<TEntity>>();
Parallel.ForEach(providers,
() => new
{
context = new MyDbContext(),
matchingProviders = new List<Provider<TEntity>>()
},
(provider, state, data) =>
{
if (data.context.Set<TEntity>().Any(provider.Condition))
data.matchingProviders.Add(provider);
return data;
},
data =>
{
data.context.Dispose();
if (data.matchingProviders.Count > 0)
{
lock (matchingProviders)
matchingProviders.AddRange(data.matchingProviders);
}
}
);
If you have a multi core CPU (which is normal nowadays) and a good database server, this should give you the improvement you are seeking for.

Related

How could I generate a c# .NET core function that can mimic the Include().ThenInclude.(Where) behaviour?

I'm trying to create a generic blazor component that and has a config object.
So far I've been able to successfully mimic .Where(), .Include().ThenInclude() separately, but now I need to add condition to the include.
My config file has the following structure
class Config
{
public List<string> includes {get;set;}
public List<Expression<Func<T,bool>>> conditions {get;set;}
}
database structure
For the example lets assume the following structure.
Dataset (1)<--(n) DatasetItem
DatasetItem (1)-->(1) Image
DatasetItem (1)-->(1) Label
Image (1)<--(n) Labels
Label(1)<--(n) Tags
where an image can be a part of more than one dataset and it's tags might or might not be included in the same dataset. Imagine an picture of a city park with dogs, people, and cars. Someone would want a dataset of vehicles and consider only the cars and ignore the labels about animals and people.
A Label include the coordinates and one or more TagNames (vehicle, car, red, fourWheeler, etc)
what works
And I can give conditions and includes this way:
var config = new Config<DatasetItem>()
{
includes = new() {"Image", "Image.Labels", "Image.Labels.Tags"}
conditions = new() {di => di.datasetId == 1, di => di.Label.width > 100}
}
the problems and limitations of this approach
For this to work I use the .Include(str) overload, but I don't like that approach since it is prone to fail.
The other problem with this approach is a limitation on the conditions.
For example, I can not add a condition to get only the Tags that belong to a specific category.
It seems that is not possible to add a condition that would give me this filter. Best I got was with the use of All or Any, but those do not give the desire result.
config.conditions.Add(
di => di.Images.Any(i => Labels.Any(l => l.Tags.Any(t => t.Name == "vehicle"))
)
what I'm looking for
Whith regular use of Include, obtaining the desire result would be like this.
DatasetItems.Include(di => di.Images)
.ThenInclude(i => i.Labels)
.ThenInclude(l => l.Tags.Where(t => t.Name == "vehicle"))
So, is there any way to achieve this?
I thing the solution would require to use Expression Trees, Reflection, and Recursive functions using MakeGenericMethod, but I havent found a way to achive this.
the ideal solution would be to find a structure like the one for conditions
List<Expression<Func<T,something>>> that lets me add the condition in this way
includes.add(di => di.Images.Labels.Tags.Where(t => t.Name == "vehicle"))
a failed attempt
I'm including a failed attempt I took to solve this, just in case this helps to narrow down the intention and for if someone can use it for solving the problem. This attempt is far from ideal.
property was a dot-separated string, like "Image.Labels.Tags" and filter was suppoused to be the Where expression.
This function does not compile.
public static IQueryable<T> FilteredInclude<T>(this IQueryable<T> query, string property, Func<dynamic, bool> filter) where T : class
{
var type = typeof(T);
var properties = property.Split('.');
var propertyInfo = type.GetProperty(properties[0]);
var parameter = Expression.Parameter(type, "x");
var propertyExpression = Expression.Property(parameter, propertyInfo);
var lambda = Expression.Lambda(propertyExpression, parameter);
var filteredQuery = query.Where(x => filter(propertyInfo.GetValue(x)));
if (properties.Length == 1)
{
var methodInfo = typeof(QueryableExtensions).GetMethod(nameof(Include));
methodInfo = methodInfo.MakeGenericMethod(type, propertyInfo.PropertyType);
var result = methodInfo.Invoke(null, new object[] { filteredQuery, lambda });
return (IQueryable<T>)result;
}
else
{
var nextProperty = string.Join(".", properties.Skip(1));
var result = filteredQuery.FilteredInclude(nextProperty, filter);
var thenIncludeMethod = typeof(Microsoft.EntityFrameworkCore.Query.Internal.IncludeExpression)
.GetMethod("ThenInclude")
.MakeGenericMethod(propertyInfo.PropertyType);
var resultWithThenInclude = thenIncludeMethod.Invoke(result, new object[] { lambda });
return (IQueryable<T>)resultWithThenInclude;
}
}

EF Core 5 - In memory object list filtering not working against DBset

I need to filter against a DBset using a list of objects passed in as a variable to my repo.
The following worked in EF6
List<MyFilterObj> incomingList = [{A: 1, B: 2}, {A: 3, B: 4}]
return await context.MyDBset
.Where(x => incomingList.Any(v =>
v.A == x.A &&
v.B == x.B)).ToListAsync();
If I try this same approach in EF Core 5 I get the following error:
The LINQ expression (MyDBset) could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
I can't seperate the incoming list into seperate A and B lists as the combination is essential to the filter accuracy. Otherwise I could have used this approach
return await context.MyDBset
.Where(x => incomingListA.Contains(x.A) &&
incomingListB.Contains(x.B)).ToListAsync();
This approach does work in EF Core 5
Basically I want EF to translate my query into something like:
SELECT * FROM MyTable AS m
Where (m.A = 1 AND m.B = 2) OR (m.A = 3 AND m.B = 4)
If I loop through my incomingList adding "where" filters to a query the resultant SQL uses "AND" instead of "OR" like so:
SELECT * FROM MyTable AS m
Where (m.A = 1 AND m.B = 2) AND (m.A = 3 AND m.B = 4)
Right now my work around involves binging the result set into memory with some basic filters then running an in memory filter using the original syntax. This is obviously not ideal.
Any suggestions? Thanks
Since EF Core is not willing to translate such expression (it's a well known limitation that Contains with primitive value is the only supported operation on in-memory collection in L2E query since it directly translates to SQL IN operator), you have to do it yourself. Any is basically equivalent of Or as you mentioned, so for not so big list and top level query it could easily be done by using some predicate builder implementation to dynamically build || predicate. Or directly using the the Expression class methods as in this custom extension method from my answer to How to simplify repetitive OR condition in Where(e => e.prop1.contains() || e.prop2.contains() || ...):
public static partial class QueryableExtensions
{
public static IQueryable<T> WhereAnyMatch<T, V>(this IQueryable<T> source, IEnumerable<V> values, Expression<Func<T, V, bool>> match)
{
var parameter = match.Parameters[0];
var body = values
// the easiest way to let EF Core use parameter in the SQL query rather than literal value
.Select(value => ((Expression<Func<V>>)(() => value)).Body)
.Select(value => Expression.Invoke(match, parameter, value))
.Aggregate<Expression>(Expression.OrElse);
var predicate = Expression.Lambda<Func<T, bool>>(body, parameter);
return source.Where(predicate);
}
}
with sample usage for your scenario:
context.MyDBset
.WhereAnyMatch(incomingList, (x, v) => v.A == x.A && v.B == x.B)
Normally instead of Expression.Invoke helper utilities like this would use custom ExpressionVisitor to replace the parameters of the source expression (thus simulation a "call"), which produces more naturally looking query expression, but for EF Core it doesn't really matter since it recognizes and correctly translates invocation expressions inside the expression tree.
But just in case Expression.Invoke doesn't work, here is the parameter replacing version - it should work with all query providers. First, the helpers
public static partial class ExpressionBuilder
{
public static Expression ReplaceParameter(this Expression source, ParameterExpression parameter, Expression value)
=> new ParameterReplacer { Parameter = parameter, Value = value }.Visit(source);
class ParameterReplacer : ExpressionVisitor
{
public ParameterExpression Parameter;
public Expression Value;
protected override Expression VisitParameter(ParameterExpression node)
=> node == Parameter ? Value : node;
}
}
and then simply replace the invocation
.Select(value => Expression.Invoke(match, parameter, value))
with
.Select(value => match.Body.ReplaceParameter(match.Parameters[1], value))
because specifically here we are reusing the first parameter (var parameter = match.Parameters[0];), otherwise it should be replaced as well.

Use Linq Any query or Expressions to build query for performance

we are Entity Framework in our project. Need to know the performance impact between .ANY() and Expressions to form Where clause.
In the below function i used two approach to get result:
APPROACH 1 - Form Lambda expression query using ANY()
From my observation using .Any() is not adding where clause when query is executed(checked in sql profiler) what EF does is gets all matched inner joined records store in-memory and then apply condition specified in .ANY()
APPROACH 2 - Form Expression Query Starts
With Expressions i'm explicitly forming where clause and executing.checked the same in SQL query Profiler i'm able to see where clause.
Note: To form expression where clause i'm doing additional loops and "CombinePredicates".
Now, my doubts are:
which approach will improve performance. Do i need to go with Lambda
with .ANY() or Expressions?
what is the right way to from where clause to improve performance?
If not the two approach suggest me the right way to do it
private bool GetClientNotifications(int clientId, IList<ClientNotification> clientNotifications)
{
IList<string> clientNotificationList = null;
var clientNotificationsExists = clientNotifications?.Select(x => new { x.Name, x.notificationId
}).ToList();
if (clientNotificationsExists?.Count > 0)
{
//**Approach 1 => Form Lamada Query starts**
notificationList = this._clientNotificationRepository?.FindBy(x => clientNotificationsExists.Any(x1 => x.notificationId == x1.notificationId && x.clientId ==
clientId)).Select(x => x.Name).ToList();
**//Form Lamada Query Ends**
//**Approach 2 =>Form Expression Query Starts**
var filterExpressions = new List<Expression<Func<DbModel.ClientNotification, bool>>>();
Expression<Func<DbModel.ClientNotification, bool>> predicate = null;
foreach (var clientNotification in clientNotificationsExists)
{
predicate = a => a.clientId == clientId && a.notificationId == clientNotification .notificationId;
filterExpressions.Add(predicate);
}
predicate = filterExpressions.CombinePredicates<DbModel.ClientNotification>(Expression.OrElse);
clientNotificationList = this._clientNotificationRepository?.FindBy(predicate).Select(x => x.Name).ToList();
//**Form Expression Query Ends**
}
return clientNotificationList;
}
If non of the approaches were good please suggest me the right way to do.
I noticed the clientId being used on both approaches is used on conditions linked to clientNotificationsExists but there is no actual relationship with its objects so this condition can be moved up one level. This might bring a minuscule benefit as the overall sql will be smaller without the duplicate clauses.
From the described behavior the first approach filter is not being translated to sql so it is being resolved at the client side however, if it could be translated the performance would be similar or identical.
If you want to generate an IN sql clause you can use an array and use the Contains method within your filter expression. That could perhaps improve a little bit but don't get your hopes too high.
Try the following:
private bool GetClientNotifications(int clientId, IList<ClientNotification> clientNotifications)
{
IList<string> clientNotificationList = null;
var clientNotificationsExists = clientNotifications?
.Select(x => new { x.Name, x.notificationId }).ToList();
if (clientNotificationsExists?.Count > 0)
{
var notificationIds = clientNotificationsExists.Select(x => x.notificationId).ToArray();
clientNotificationList = this._clientNotificationRepository?
.FindBy(x => x.clientId == clientId && notificationIds.Contains(x.notificationId));
}
return clientNotificationList;
}

C# Expression to use Guid bookmark

I have a requirement to work through several tables that need to be synchronized/backed up. All of these tables' classes implement ITrackModifiedDate:
interface ITrackModifiedDate
{
DateTime ModifiedDate { get; set; }
}
I need to work through these in batches, which means I need to bookmark where I last got up to. So I can sort by ModifiedDate, and just keep track of my last ModifiedDate. But there's a wrinkle in the plan: it can easily happen that multiple records have the identical ModifiedDate. That means I need a secondary identifier, and the only thing I can generically rely on being there is the key field, which is invariably a Guid.
I got some help in the Expression stuff from here, but when I tried tinkering to add in the "greater than" clause, things broke.
async Task<ICollection<T>> GetModifiedRecords<T>(DateTime modifiedSince, Guid lastId) where T : class, ITrackModifiedDate
{
var parameterExp = Expression.Parameter(typeof(T), "x");
var propertyExp = Expression.Property(parameterExp, keyField);
var target = Expression.Constant(lastId);
var greaterThanMethod = Expression.GreaterThan(propertyExp, target);
var lambda = Expression.Lambda<Func<T, bool>>(greaterThanMethod, parameterExp);
var query = db.Set<T>()
.Where(t => t.ModifiedDate > modifiedSince ||
t.ModifiedDate == modifiedSince && lambda.Invoke(t));
var orderByExp = Expression.Lambda(propertyExp, parameterExp);
var thenByMethodGeneric = typeof(Queryable)
.GetMethods()
.Single(mi => mi.Name == "ThenBy" && mi.GetParameters().Length == 2);
var thenByMethod = thenByMethodGeneric.MakeGenericMethod(typeof(T), propertyExp.Type);
// first order by date, then id
query = query.OrderBy(t => t.ModifiedDate)
.AsQueryable();
query = (IQueryable<T>)thenByMethod.Invoke(null, new object[] { query, orderByExp });
return await query.ToListAsync();
}
Attempting to run this query results in:
System.InvalidOperationException: The binary operator GreaterThan is not defined for the types 'System.Guid' and 'System.Guid'.
Oh dear. It seems Guids, like humans, don't like being compared with each other. Either that, or I'm using the wrong comparison expression.
The obvious solution that jumps to mind is to convert the Guid to a string for comparison purposes, but (a) that seems a little inefficient, and (b) I don't know how to write an Expression that converts a Guid to a string.
Is converting to a string the right way to go? If so, what Expression will do the job? If not, what's the correct approach?
The general approach is to replace the unsupported Guid operators with Guid.CompareTo(Guid) calls, e.g. instead of
guidA > guidB
use
guidA.CompareTo(guidB) > 0
In your case, replace
var greaterThanMethod = Expression.GreaterThan(propertyExp, target);
with
var compareTo = Expression.Call(propertyExp, "CompareTo", Type.EmptyTypes, target);
var greaterThanMethod = Expression.GreaterThan(compareTo, Expression.Constant(0));
This works for most of the query providers (LINQ to Objects, LINQ To Entities (EF6)). Unfortunately doesn't work with EF Core 2.x which requires a different approach. If that's the case, register the custom GuidFunctions class as explained in the linked answer, and use this instead:
var greaterThanMethod = Expression.Call(
typeof(GuidFunctions), "IsGreaterThan", Type.EmptyTypes,
propertyExp, target);

Apply "List" of IQueryables to a target?

I have this idea to create a "list" of IQueryables that do different kinds of operations.
So basically:
var query1 = Enumerable.Empty<Person>().AsQueryable().Where(e => e.Name == "Ronald");
var query2 = Enumerable.Empty<Person>().AsQueryable().Where(e => e.Age == 43);
var query3 = Enumerable.Empty<Person>().AsQueryable().Select(e => e.EyeColor);
var listOfQueries = new List<IQueryable<Person>
{
query1,
query2,
query3
};
Now, I also have this DbSet full of "Persons" and I would like to "apply" all my queries against that DbSet. How would I do that? Is it even possible?
An updated example:
var personQueryFactory = new PersonQueryFactory();
var personQueryByFirstname = personQueryFactory.CreateQueryByFirstname("Ronald"); //Query by Firstname.
var personQueryByAge = personQueryFactory.CreateQueryByAge(42); //Query by age.
var personQueryByHasChildWithAgeOver = personQueryFactory.CreateQueryByChildAgeOver(25); //Query using a "join" to the child-relationship.
var personQuerySkip = personQueryFactory.Take(5); //Only get the 5 first matching the queries.
var personQuery = personQueryFactory.AggregateQueries //Aggragate all the queries into one single query.
(
personQueryByFirstname,
personQueryByAge,
personQueryByHasChildWithAgeOver,
personQuerySkip
);
var personSurnames = personsService.Query(personQuery, e => new { Surname = e.Surname }); //Get only the surname of the first 5 persons with a firstname of "Ronald" with the age 42 and that has a child thats over 25 years old.
var personDomainObjects = personsService.Query<DomainPerson>(personQuery); //Get the first 5 persons as a domain-object (mapping behind the "scenes") with a firstname of "Ronald" with the age 42 and that has a child thats over 25 years old.
var personDaos = personsService.Query(personQuery); //Get the first 5 persons as a DAO-objects/entityframework-entities with a firstname of "Ronald" with the age 42 and that has a child thats over 25 years old.
The reason for doing this would be to create a more "unified" way of creating and re-using predefined queries, and then to be able to execute them against the DbSet and return the result as a domain-object and not an "entity-framework-model/object"
This is possible, but you need to reframe your approach slightly.
Storing the filters
var query1 = Enumerable.Empty<Person>().AsQueryable().Where(e => e.Name == "Ronald");
var query2 = Enumerable.Empty<Person>().AsQueryable().Where(e => e.Age == 43);
var query3 = Enumerable.Empty<Person>().AsQueryable().Select(e => e.EyeColor);
Reading your intention, you don't actually want to handle IQueryable objects, but rather the parameter that you supply to the Where method.
Edit: I missed that the third one was a Select(), not a Where(). Ive adjusted the rest of the answer as if this had also been a Where(). See the comments below for my response as to why you can't mix Select() with Where() easily.
Func<Person,bool> filter1 = (e => e.Name == "Ronald");
Func<Person,bool> filter2 = (e => e.Age == 43);
Func<Person,bool> filter3 = (e => e.EyeColor == "Blue");
This stores the same information (filter criteria), but it doesn't wrap each filter in an IQueryable of its own.
A short explanation
Notice the Func<A,B> notation. In this case, A is the input type (Person), and B is the output type (bool).
This can be extended further. A Func<string,Person,bool> has two input parameters (string, Person) and one output parameter (bool). A usage example:
Func<string, Person, bool> filter = (inputString, inputPerson) => inputString == "TEST" && inputPerson.Age > 35;
There is always one output parameter (the last type). Every other mentioned type is an input parameter.
Putting the filters in a list
Intuitively, since Func<Person,bool> represents a single filter; you can represent a list of filters by using a List<Func<Person,bool>>.
Nesting generic types get a bit hard to read, but it does work just like any other List<T>.
List<Func<Person,bool>> listOfFilters = new List<Func<Person,bool>>()
{
(e => e.Name == "Ronald"),
(e => e.Age == 43),
(e => e.EyeColor == "Blue")
};
Executing the filters
You're in luck. Because you want to apply all the filters (logical AND), you can do this very easily by stacking them:
var myFilteredData = myContext.Set<Person>()
.Where(filter1)
.Where(filter2)
.Where(filter3)
.ToList();
Or, if you're using a List<Func<Person,bool>>:
var myFilteredData = myContext.Set<Person>().AsQueryable();
foreach(var filter in listOfFilters)
{
myFilteredData = myFilteredData.Where(filter);
}
Fringe cases
However, if you were trying to look for all items which fit one or more filters (logical OR), it becomes slightly more difficult.
The full answer is quite complicated.. You can check it here.
However, assuming you have the filters set in known variables, there is a simpler method:
Func<Person, bool> filterCombined =
e => filter1(e) || filter2(e) || filter3(e);
var myFilteredData = myContext.Set<Person>()
.Where(filterCombined)
.ToList();
One of the problems is that the collection of IQueryable is only valid as long as your DbSet is valid. As soon as your DbContext is Disposed your carefully filled collection is worthless.
So you have to think of another method to reconstruct the query than the one that uses the DbSet<Person>
Although at first glance they seem the same, there is a difference between IEnumerable and IQueryable. An Enumerable has everything in it to enumerate over the resulting sequence.
A Queryable on the other hand holds an Expression and a Provider. The Provider knows where the data can be fetched. This is usually a database, but it can also be a CSV-file or other items where you can fetch sequences. It is the task of the Provider to interpret the Expression and to translate it info a format that the database can understand, usually SQL.
While concatenating the linq statements into one big linq statements, the database is not accessed. Only the Expression is changed.
Once you call GetEnumerator() and MoveNext() (usually by doing ForEach, or ToList(), or similar), the Expression is sent to the Provider who will translate it into a query format that the database understands and perform the query. the result of the query is an Enumerable sequence, so Getenumerator() and MoveNext() of the provider's query result are called.
Because your IQueryable holds this Provider, you can't enumerate anymore after the Provider has been disposed.
When using entity framework, the DbSet holds the Provider. In the Provider is the Database information held by the DbContext. Once you Dispose the DbContext you can't use the IQueryable anymore:
IQueryable<Person> query = null;
using (var dbContext = new MyDbcontext())
{
query = dbContext.Persons.Where(person => person.Age > 20);
}
foreach (var person in query)
{
// expect exception: the DbContext is already Disposed
}
So you can't put the Provider in your collection or possible queries. However, you could remember the Expression. The only thing your require from your Expression is that it returns a Person. You also need a function that takes this Expression and a QueryaProvider for Persons to convert it to an IQueryable.
Let's create a generic function for this, so It can be used for any type, not just for Persons:
static IQueryable<TSource> ToQueryable<TSource>(this IQueryProvider provider,
Expression expression)
{
return provider.CreateQuery(expression);
}
// well, let's add the following also:
static IQueryable<Tsource> ToQueryable<TSource>(this DbContext dbContext,
Expression expression)
{
return dbContext.Set<TSource>.Provider.ToQueryable<TSource>(expression);
}
For help on extension functions see Extension Functions Demystified
Now only once you create your collection of Expressions. For fast lookup make it a Dictionary:
enum PersonQuery
{
ByFirstname,
ByAge,
ByHasChildWithAgeOver,
Skip,
}
public IReadOnlyDictionary<PersonQuery, Expression> CreateExpressions()
{
Dictionary<PersonQuery, Expression> dict = new Dictionary<PersonQuery, Expression>();
using (var dbContext = new MyDbContext())
{
IQueryable<Person> queryByFirstName = dbContext.Persons
.Where(...);
dict.Add(PersonQuery.ByfirstName, queryByFirstName.Expression);
... // etc for the other queries
}
return dict.
}
Usage:
IReadOnlyCollection<Person> PerformQuery(PersonQuery queryId)
{
using (var dbContext = new MyDbContext())
{
// get the Expression from the dictionary:
var expression = this.QueryDictionary[queryId];
// translate to IQueryable:
var query = dbContext.ToQueryable<Person>(expression);
// perform the query:
return query.ToList();
// because all items are fetched by now, the DbContext may be Disposed
}
}

Categories