Implementing IQueryable.Count - c#

I'm working on an IQueryable provider. In my IQueryProvider I have the following code:
public TResult Execute<TResult>(Expression expression)
{
    var query = GetQueryText(expression);
    // Call the Web service and get the results.
    var items = myWebService.Select<TResult>(query);
    IQueryable<TResult> queryableItems = items.AsQueryable<TResult>();
    return (TResult)queryableItems;
}
GetQueryText does all the legwork and works out the query string for the expression tree. This is all working well, so Where, OrderBy and Take are sorted. The web service supports a count query using the following:
int count = myWebService.Count(query);
But I can't get my head round where I put this in the IQueryable or IQueryProvider.
I've basically worked from reading tutorials and open source examples, but can't seem to find one that does Count.

The answer appears simpler than I first thought. This blog post helped:
The Execute method is the entry point into your provider for actually executing query expressions. Having an explicit execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences. For example, the query “myquery.Count()” returns a single integer. The expression tree for this query is a method call to the Count method that returns the integer. The Queryable.Count method (as well as the other aggregates and the like) use this method to execute the query ‘right now’.
I did some debugging. For a query of myContext.Where(x => x.var1 > 5), Execute is called and TResult is an IEnumerable<MyClass>.
For myContext.Where(x => x.var1 > 5).Count(), Execute is called and TResult is an int.
So my Execute method just needs to check which result type is being asked for and return a value of that type.
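One way to sketch that check, assuming the provider only needs to tell a trailing Count() apart from a sequence query, is to inspect the outermost node of the expression tree. CountDetection and IsCountCall are hypothetical names introduced for illustration; they are not part of the original provider, and the web service calls are left out so the snippet stands alone.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class CountDetection
{
    // Hypothetical helper: true when the outermost node of the tree
    // is a call to one of the Queryable.Count overloads.
    public static bool IsCountCall(Expression expression)
    {
        return expression is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Queryable)
            && call.Method.Name == nameof(Queryable.Count);
    }

    // Builds two sample trees and reports how each one is classified.
    public static (bool counted, bool filtered) Demo()
    {
        Expression<Func<IQueryable<int>, int>> counted =
            q => q.Where(x => x > 5).Count();
        Expression<Func<IQueryable<int>, IQueryable<int>>> filtered =
            q => q.Where(x => x > 5);

        return (IsCountCall(counted.Body), IsCountCall(filtered.Body));
    }
}
```

Inside Execute, a positive result could route to myWebService.Count(query), with the int handed back as (TResult)(object)count. The query text would presumably have to be built from the Count call's source argument rather than from the whole tree, which depends on how GetQueryText is written.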

Related

Would using an extension method to query an IEnumerable be equivalent to using IQueryable?

I understand the main difference between IQueryable and IEnumerable. IQueryable executes the query in the database and returns the filtered results while IEnumerable brings the full result set into memory and executes the filtering query there.
I don't think using an extension method to query on the initial assignment of a variable will cause it to be executed in the database like IQueryable, but I just wanted to make sure.
This code will cause the full result set of People to be returned from the database, and then the filtering is done in memory:
(The People property on the context is of type DbSet)
IEnumerable<Person> people = context.People;
IEnumerable<Person> johns = people.Where(x => x.FirstName == "John");
Even though I am adding the filtering below as an extension method before assigning the item to my variable, I'm assuming this code should work the same way as the code above, and bring back the full result set into memory before filtering it, right?
var johns = context.People.Where(x => x.FirstName == "John");
EDIT:
Thanks for the replies guys. I modified the code example to show what I meant (removed the IEnumerable in the second paragraph of code).
Also, to clarify, context.People is of type DbSet, which implements both IQueryable and IEnumerable. So I'm not actually sure which .Where method is being called. IntelliSense tells me it is the IQueryable version, but can this be trusted? Is this always the case when working directly with a DbSet of a context?
IEnumerable<Person> people = context.People;
IEnumerable<Person> johns = people.Where(x => x.FirstName == "John");
... will call the Enumerable.Where extension method, which accepts a Func<TSource, bool> predicate parameter, forcing the filtering to happen in memory.
In contrast...
IEnumerable<Person> people = context.People.Where(x => x.FirstName == "John");
...will call the Queryable.Where extension method, which accepts an Expression<Func<TSource, bool>> predicate parameter. Notice that this is an expression, not a delegate, which allows the provider to translate the where condition to a database condition.
So it really does make a difference which extension method you invoke.
IQueryable<T> works on expressions. It is effectively a query-builder, accumulating information about a query without doing anything... until the moment you need a value from it, when:
a query is generated from the accumulated Expression, in the target language (e.g. SQL)
the query is executed, usually on the database,
results are converted back to a C# object.
IEnumerable<T> extensions work on compiled functions (delegates). When you need a value from one:
C# code in those functions is executed.
It is easy to confuse the two, because:
both have extension functions with similar names,
the lambda syntax is the same for Expressions and Functions - so you cannot tell them apart,
the use of "var" to declare variables removes the datatype (often the only clue as to which interface is being used).
IQueryable<int> a = new[] { 5, 15 }.AsQueryable();
IEnumerable<int> b = new[] { 5, 15 };
int x1 = a.FirstOrDefault(i => i > 10); // Expression passed in
int x2 = b.FirstOrDefault(i => i > 10); // Function passed in
Extension methods with the same name usually do the same thing (because they were written that way) but sometimes they don't.
So the answer is: No, they are not equivalent.
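As a small illustration of that point, the sketch below applies the same lambda to one object through two compile-time types. It uses an in-memory array instead of a DbSet, and OverloadDemo is a name made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OverloadDemo
{
    // Reports whether each Where call produced another IQueryable<int>.
    public static (bool viaQueryable, bool viaEnumerable) Run()
    {
        IQueryable<int> a = new[] { 5, 15, 25 }.AsQueryable();
        IEnumerable<int> b = a; // same object, different compile-time type

        object fromA = a.Where(i => i > 10); // binds to Queryable.Where (expression tree)
        object fromB = b.Where(i => i > 10); // binds to Enumerable.Where (delegate)

        return (fromA is IQueryable<int>, fromB is IQueryable<int>);
    }
}
```

Run() returns (true, false): the first call stays composable by the query provider, while the second has already dropped into LINQ to Objects even though the underlying object is the same.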

Using the same expression for EntityFramework queries and in-memory evaluation

The problem description
In my application, I have a search function that builds a complex search query based on user input on top of EntityFramework DbSet object and runs it against the database, something like this:
public static IQueryable<MyEntity> ApplySearchQuery(SearchSpec search, IQueryable<MyEntity> query)
{
    if (search.Condition1.HasValue)
        query = query.Where(e => e.SomeProperty == search.Condition1);
    if (search.Condition2.HasValue)
        query = query.Where(e => e.OtherProperty == search.Condition2);
    ....
}
This performs pretty well on the database side. Now I have come to a point that I have a single MyEntity at hand, and I want to see if a particular SearchSpec matches the entity or not. And I need to do this for a potentially large number of SearchSpec objects, so the performance is important.
I'd also really like to avoid duplicating the expressions.
Here's what I've thought so far:
I can call Expression.Compile() on the expressions to convert them to a delegate, and then call them. But since I have a parameter (the search parameter) I need to build expressions and compile them every time, making it a very inefficient way (I suppose, correct me if I'm wrong).
I can wrap my single entity in an IQueryable using new [] { myEntity }.AsQueryable() and then evaluate the query on it. Not sure how well this will perform.
The question:
Which of the above approaches is faster?
Are any of my assumptions (about the limitations) wrong?
Is there any other way that I haven't thought about yet?
Here PredicateBuilder could work miracles for you. With PredicateBuilder you can build an Expression that can be used against an IQueryable, but, when compiled, also against objects.
You could have a method that builds and returns the Expression
using LinqKit;
Expression<Func<MyEntity, bool>> CreateExpression(SearchSpec search)
{
    var predicate = PredicateBuilder.True<MyEntity>();
    if (search.Condition1.HasValue)
        predicate = predicate.And(e => e.SomeProperty == search.Condition1);
    if (search.Condition2.HasValue)
        predicate = predicate.And(e => e.OtherProperty == search.Condition2);
    ...
    return predicate;
}
To be used as:
var predicate = CreateExpression(search);
var result = query.Where(predicate.Expand()); // will be translated into SQL.
var match = predicate.Compile()(myEntity);
Notice the Expand call. Without it, EF will fail because under the hood Invoke will be called, which can't be translated into SQL. Expand replaces these calls so that EF can convert the expression to SQL.
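For readers who cannot take a LinqKit dependency, the same idea can be sketched with the System.Linq.Expressions factory methods alone. This is not LinqKit's PredicateBuilder and has not been checked against any particular EF version. The MyEntity and SearchSpec shapes below (two int properties, two nullable int conditions) are assumptions, since the original types are not shown, and SearchPredicates is a hypothetical name.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Assumed shapes; the real types are not shown in the question.
public class MyEntity
{
    public int SomeProperty { get; set; }
    public int OtherProperty { get; set; }
}

public class SearchSpec
{
    public int? Condition1 { get; set; }
    public int? Condition2 { get; set; }
}

public static class SearchPredicates
{
    // Builds one predicate tree that shares a single parameter "e",
    // so no Invoke nodes (and no Expand call) are involved.
    public static Expression<Func<MyEntity, bool>> Create(SearchSpec search)
    {
        ParameterExpression e = Expression.Parameter(typeof(MyEntity), "e");
        Expression body = Expression.Constant(true);

        if (search.Condition1.HasValue)
            body = Expression.AndAlso(body, Expression.Equal(
                Expression.Property(e, nameof(MyEntity.SomeProperty)),
                Expression.Constant(search.Condition1.Value)));

        if (search.Condition2.HasValue)
            body = Expression.AndAlso(body, Expression.Equal(
                Expression.Property(e, nameof(MyEntity.OtherProperty)),
                Expression.Constant(search.Condition2.Value)));

        return Expression.Lambda<Func<MyEntity, bool>>(body, e);
    }
}
```

The returned expression can be passed to query.Where(...) as is, or compiled once per SearchSpec with Compile() and the resulting delegate kept for repeated in-memory matching, so the compilation cost is paid per spec and not per entity.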

Timeout expired when using a Func<> instead of a lambda

Given:
public EntityAddress ReadSingle(Func<EntityAddress, bool> predicate)
{
    //var result = Context.CV3Address.FirstOrDefault(a => a.GUID == 1100222);
    var result = Context.CV3Address.FirstOrDefault(predicate);
    return result;
}
FirstOrDefault(a => a.GUID == 1100222); returns a result immediately.
FirstOrDefault(predicate); results in a timeout exception. Note that predicate = the lambda expression
My suspicion is that the latter method attempts to pull down all records which is not gonna happen with a table this large.
Why does this happen?
It happens because of the type of the predicate, which should instead have been
Expression<Func<CV3Address, bool>>
If the predicate is an expression tree (as above) and Context.CV3Address is an IQueryable<CV3Address> then EF can translate the expression tree to SQL and directly get the results from the database.
On the other hand, if the predicate is a Func<CV3Address, bool> (a delegate; a pointer to compiled code) then this cannot be translated to SQL. The call therefore binds to Enumerable.FirstOrDefault, which treats your repository as an IEnumerable<CV3Address>, a sequence that can only be filtered in memory. That has the side effect of querying the whole table and filtering the rows on the client.
If you hardcode the predicate then the compiler can treat it either as an expression tree or a delegate, and due to the type of Context.CV3Address it treats it as an expression tree.
FirstOrDefault(a => a.GUID == 1100222) creates an expression tree that uses LINQ to Entities to run the query on the DB server.
FirstOrDefault(predicate) queries the entire table and runs the filter locally.
You need to change your method to take an expression tree:
Expression<Func<CV3Address, bool>>
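A minimal sketch of the corrected shape follows, written as a generic helper over any IQueryable<T> so that it compiles without the original Context. Repository is a hypothetical name, and the in-memory array in the usage note merely stands in for the EF set.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class Repository
{
    // Because predicate is an Expression<Func<T, bool>>, this call binds to
    // Queryable.FirstOrDefault, so the provider behind "source" receives the
    // filter as an expression tree instead of an opaque delegate.
    public static T ReadSingle<T>(IQueryable<T> source, Expression<Func<T, bool>> predicate)
    {
        return source.FirstOrDefault(predicate);
    }
}
```

Callers keep writing the same lambda, for example Repository.ReadSingle(new[] { 1, 2, 3 }.AsQueryable(), x => x > 1). Only the parameter type changes, and with it the overload the compiler picks.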

Entity framework - COUNT rather than SELECT

If I call the GetFoo().Count() from a method outside of my DB class, Entity Framework 5 does a rather inefficient SELECT query rather than a COUNT. From reading a few other questions like this, I see that this is expected behaviour.
public IEnumerable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
I've therefore added a count method to my DB class, which correctly performs a COUNT query:
public int GetFooCount()
{
    return context.Items.Where(d => d.Foo.Equals("bar")).Count();
}
To save me from specifying queries multiple times, I'd like to change this to the following. However this again performs a SELECT, even though it's within the DB class. Why is this - and how can I avoid it?
public int GetFooCount()
{
    return this.GetFoo().Count();
}
Since GetFoo() returns an IEnumerable<DbItems>, the query is executed as a SELECT, then Count is applied to the collection of objects in memory and is not translated into the SQL.
One option is returning an IQueryable<DbItems> instead:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
But that may change the behavior of other callers that are expecting the collection to be loaded (with IQueryable, execution is deferred). This particularly affects callers that add .Where calls that cannot be translated to SQL. Unfortunately you won't know about those at compile time, so thorough testing will be necessary.
I would instead create a new method that returns an IQueryable:
public IQueryable<DbItems> GetFooQuery()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
so your existing usages aren't affected. If you wanted to re-use that code you could change GetFoo to:
public IEnumerable<DbItems> GetFoo()
{
    return GetFooQuery().AsEnumerable();
}
In order to understand this behavior you need to understand the difference between the IEnumerable<T> and IQueryable<T> extensions. The first set works with LINQ to Objects, which means in-memory queries. These queries are not translated into SQL, because they are plain .NET code. So, if you have some IEnumerable<T> value and you call Count() on it, this invokes the Enumerable.Count extension method, which is something like:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
    int num = 0;
    foreach (var item in source)
        num++;
    return num;
}
But it is a completely different story with the IQueryable<T> extensions. These methods are translated by the underlying LINQ provider (EF in your case) to something other than .NET code, e.g. to SQL. This translation occurs when you execute the query. The whole query is analyzed, and nice (well, not always nice) SQL is generated. This SQL is executed in the database and the result is returned to you as the result of the query execution.
So, your method returns IEnumerable<T>, which means you are using the Enumerable.Count() method, and that is executed in memory. Thus the following query is translated by EF into SQL
context.Items.Where(d => d.Foo.Equals("bar")) // translated into SELECT WHERE
and executed, and then the count of items is calculated in memory with the method above. But if you change the return type to IQueryable<T>, everything changes:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
Now Queryable.Count() is executed. This means the query continues building (well, actually Count() is the operator which forces query execution, but Count() becomes part of this query). And EF translates
context.Items.Where(d => d.Foo.Equals("bar")).Count()
into an SQL query which is executed on the server side.
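The in-memory half of that story can be observed without a database. The sketch below uses an iterator as a stand-in for rows streamed back from a SELECT. CountDemo, Rows and RowsPulled are names invented for this illustration and have no counterpart in EF.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Number of rows handed to the caller since the last reset.
    public static int RowsPulled;

    // Stand-in for rows streamed back from a SELECT.
    public static IEnumerable<string> Rows()
    {
        foreach (var foo in new[] { "bar", "baz", "bar" })
        {
            RowsPulled++;
            yield return foo;
        }
    }

    // Enumerable.Where + Enumerable.Count: every row is pulled and tested locally.
    public static int CountBars()
    {
        RowsPulled = 0;
        return Rows().Where(foo => foo.Equals("bar")).Count();
    }
}
```

CountBars() returns 2, and RowsPulled is 3 afterwards: all three rows had to reach the client just to produce one number, which is the cost a server-side COUNT avoids.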

Convert an IQueryable linq query to IEnumerable<T> cancels out linq optimized way to work?

I'm kind of a newbie on .NET, and I was wondering how LINQ works, since you can apply many LINQ queries, one after another, but none of them are really executed until they're used to transfer information or converted to a list, etc.
There are 2 important ways to get a LINQ query: by using IQueryable<T>, which applies the where filters directly in the SQL, and IEnumerable<T>, which gets all the records and then works with them in memory. However, let's take a look at this code:
//Linq dynamic library
IQueryable<Table> myResult = db.Categories
    .Where(a => a.Name.Contains(StringName))
    .OrderBy("Name")
    .Skip(0)
    .Take(10);
if (myResult != null)
{
    return myResult.AsEnumerable();
}
else
{
    return null;
}
Although I'm using the Linq dynamic library, the direct result of this query is held as an IQueryable<T>. If the query is finally returned as an IEnumerable, is it really being filtered in the SQL, or is it filtered in memory?
It's still going to execute in the database, don't worry. Basically it's all down to which implementation of Where etc is used. While you're calling methods on IQueryable<T> - via the Queryable extension methods - it will be using expression trees. When you start to fetch from that query, it will be turned into SQL and sent to the database.
On the other hand, if you use any of those methods after you've got it as an IEnumerable<T> (in terms of the compile-time type), that will use the extension methods in Enumerable, and all of the rest of the processing would be done in-process.
As an example, consider this:
var query = db.People
    .Where(x => x.Name.StartsWith("J"))
    .AsEnumerable()
    .Where(x => x.Age > 20);
Here AsEnumerable() just returns its input sequence, but typed as IEnumerable<T>. In this case, the database query would return only people whose name began with J - and then the age filtering would be done at the client instead.
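The claim that AsEnumerable() only changes the compile-time type can be checked with a reference comparison. This sketch uses an in-memory array in place of db.People, and AsEnumerableDemo is a name made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AsEnumerableDemo
{
    // True when AsEnumerable hands back the very same object it was given.
    public static bool ReturnsSameInstance()
    {
        IQueryable<int> query = new[] { 1, 2, 3 }.AsQueryable();
        IEnumerable<int> sequence = query.AsEnumerable();
        return object.ReferenceEquals(query, sequence);
    }
}
```

ReturnsSameInstance() returns true. Nothing is copied or executed by the call itself; it only steers later operators toward the Enumerable overloads.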
If you return an IEnumerable<T> and then further refine the query, then the further refinement happens in memory. The part that was expressed on an IQueryable<T> will get translated to the appropriate SQL statements (for the LINQ-to-SQL case, obviously).
See Returning IEnumerable<T> vs. IQueryable<T> for a longer and more detailed answer.
