Non-Generic AsEnumerable - c#

How would you suggest using AsEnumerable on a non-generic IQueryable?
I cannot use the Cast<T> or OfType<T> methods to get an IQueryable<object> before calling AsEnumerable, since these methods have their own explicit translation by the underlying IQueryProvider and will break the query translation if I use them with a non-mapped entity (obviously object is not mapped).
Right now, I have my own extension method for this (below), but I'm wondering if there's a way built into the framework.
public static IEnumerable AsEnumerable(this IQueryable queryable)
{
    foreach (var item in queryable)
    {
        yield return item;
    }
}
So, with the above extension method, I can now do:
IQueryable myQuery = // some query...
// this uses the built in AsEnumerable, but breaks the IQueryable's provider because object is not mapped to the underlying datasource (and cannot be)
var result = myQuery.Cast<object>().AsEnumerable().Select(x => ....);
// this works (and uses the extension method defined above), but I'm wondering if there's a way in the framework to accomplish this
var result = myQuery.AsEnumerable().Cast<object>().Select(x => ...);

Since the interface IQueryable inherits from IEnumerable why not:
IQueryable queryable;
IEnumerable<T> result = (queryable as IEnumerable).Cast<T>();
Edit
There are two Cast<> extension methods:
public static IQueryable<TResult> Cast<TResult>(this IQueryable source)
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
Which one is called is statically decided by the compiler. So casting an IQueryable as an IEnumerable will cause the second extension method to be called, where it will be treated as an IEnumerable.
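A minimal sketch of that static dispatch, assuming an in-memory array wrapped with AsQueryable() as a stand-in for a real provider. CastOverloadDemo and StaticTypeOf are hypothetical names used only for illustration; StaticTypeOf reports the compile-time type of its argument, which is what overload resolution works with.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CastOverloadDemo
{
    // Returns the compile-time (static) type of the argument, not its runtime type.
    public static Type StaticTypeOf<T>(T value) => typeof(T);

    public static Type ViaQueryable()
    {
        IQueryable source = new[] { 1, 2, 3 }.AsQueryable();
        // The static type is IQueryable, so the compiler binds to Queryable.Cast<TResult>.
        return StaticTypeOf(source.Cast<object>());
    }

    public static Type ViaEnumerable()
    {
        IQueryable source = new[] { 1, 2, 3 }.AsQueryable();
        // The static type is IEnumerable, so the compiler binds to Enumerable.Cast<TResult>.
        return StaticTypeOf(((IEnumerable)source).Cast<object>());
    }
}
```

ViaQueryable() yields typeof(IQueryable<object>) and ViaEnumerable() yields typeof(IEnumerable<object>), even though both start from the same object.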

JaredPar's answer in other words:
public static IEnumerable AsEnumerable(this IEnumerable source)
{
    return source;
}
Usage:
IQueryable queryable = // ...
IEnumerable enumerable = queryable.AsEnumerable();
IEnumerable<Foo> result = enumerable.Cast<Foo>();
// ↑
// Enumerable.Cast<TResult>(this IEnumerable source)

The type IQueryable inherits from the non-generic IEnumerable so this extension method doesn't seem to serve any purpose. Why not just use it as IEnumerable directly?

To clarify, you are trying to add a "force expression evaluation" step to your LINQ expression tree, so that part of the expression tree is evaluated against the underlying provider (LINQ to SQL, for example) and the rest is evaluated in memory (LINQ to Objects). But you want to be able to specify the entire expression tree without actually executing the query or reading any results into memory.
This is a good thing and it shows some insight into the way that Linq works. Good job.
The confusion I see here is your use of "AsEnumerable" as the method name. To me (and, I think, to many people who think in LINQ) "AsEnumerable" is too similar to the "AsQueryable" method, which is essentially a cast, not an actual evaluator. I propose you rename your method to "Evaluate". This is the Evaluate() method from my personal LINQ extensions library.
/// <summary>
/// This call marks the point at which evaluation switches from the underlying provider to memory.
/// Any earlier expressions are evaluated against the underlying IQueryable (perhaps
/// a Linq to SQL provider) once the result is enumerated, while any later expressions are
/// evaluated against the resulting object graph in memory.
/// This is one way to determine whether expressions get evaluated by the underlying provider or
/// by Linq to Objects in memory.
/// </summary>
public static IEnumerable<T> Evaluate<T>(this IEnumerable<T> expression)
{
    foreach (var item in expression)
    {
        yield return item;
    }
}
/// <summary>
/// This call marks the point at which evaluation switches from the underlying provider to memory.
/// Any earlier expressions are evaluated against the underlying IQueryable (perhaps
/// a Linq to SQL provider) once the result is enumerated, while any later expressions are
/// evaluated against the resulting object graph in memory.
/// This is one way to determine whether expressions get evaluated by the underlying provider or
/// by Linq to Objects in memory.
/// </summary>
public static IEnumerable Evaluate(this IEnumerable expression)
{
    foreach (var item in expression)
    {
        yield return item;
    }
}
This allows you to write a query where some of the query is evaluated by SQL (for example) and the rest is evaluated in memory. A good thing.

Related

Entity Framework - Expression vs Func

Consider these two code samples:
Using Func<T, bool>
public IQueryable<Blog> GetBlogs(Func<Blog, bool> predicate)
{
    return context.Blogs.Where(predicate).AsQueryable();
}
Using Expression<Func<T, bool>>
public IQueryable<Blog> GetBlogs(Expression<Func<Blog, bool>> predicate)
{
    return context.Blogs.Where(predicate); // No need of AsQueryable
}
So, in the first case, Entity Framework will always return all objects from the database, right? So what's the point in calling AsQueryable? Does it help anyway? Is it similar to the Expression version?
Does it help anyway?
No.
All it does is lie to the caller of the method, in that they think that they have an IQueryable that will translate any additional operators applied to it to SQL run in the database, when in fact you just have an IEnumerable in sheep's clothing. If you really want the operation to be performed in-memory, and not in the DB, then at least be explicit about it and leave the IEnumerable typed as an IEnumerable.
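To see why the Func version ends up in memory, here is a small sketch that assumes an in-memory IQueryable<int> instead of an EF context. PredicateBindingDemo and StaticTypeOf are hypothetical helpers; the point is only which Where overload the compiler binds to for each predicate type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PredicateBindingDemo
{
    // Returns the compile-time (static) type of the argument.
    public static Type StaticTypeOf<T>(T value) => typeof(T);

    public static Type WithFunc(IQueryable<int> source)
    {
        Func<int, bool> predicate = x => x > 1;
        // A Func cannot be passed to Queryable.Where, so Enumerable.Where is used
        // and the result is a plain IEnumerable<int>.
        return StaticTypeOf(source.Where(predicate));
    }

    public static Type WithExpression(IQueryable<int> source)
    {
        Expression<Func<int, bool>> predicate = x => x > 1;
        // An expression tree binds to Queryable.Where, so the result stays IQueryable<int>.
        return StaticTypeOf(source.Where(predicate));
    }
}
```

With the Func predicate the result is already IEnumerable<int> before AsQueryable() is ever called, so wrapping it afterwards cannot push the filter back to the provider.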

At what point is a LINQ data source determined?

Given the below two samples of LINQ, at what point is a LINQ data source determined?
int[] numbers = new int[7] { 0, 1, 2, 3, 4, 5, 6 };
IEnumerable<int> linqToObjects = numbers.Where(x => true);
XElement root = XElement.Load("PurchaseOrder.xml");
IEnumerable<XElement> linqToXML = root.Elements("Address").Where(x => true);
My understanding is that the underlying code used to query these two different data sources lives within the IEnumerable object produced by the LINQ methods.
My question is, at what point exactly is it determined whether code will be generated to use the Linq To Objects library or the Linq To XML library?
I would assume that the underlying code (the code which actually does the work of querying the data) used to query these data sources exist within their own libraries and are called upon dependent on the data source. I have looked at https://referencesource.microsoft.com/ to look at the code of the Where clause/extension method thinking that the call to the desired provider might be in there, but it appears to be generic.
How is the magic which goes into the IEnumerable determined?
The "data source" is determined immediately. For example, in your first example, the return value of Where is an object that implements IEnumerable<int> (the Enumerable.WhereArrayIterator<int> class in particular) that has a dependency on the numbers object (stored as a field). And the return value of Where in the second example is an enumerable object that has a dependency on the XML element object. So even before you start enumerating, the resulting enumerable knows where to get the data from.
My question is, at what point exactly is it determined whether code will be generated to use the Linq To Objects library or the Linq To XML library?
I think there is no code generation. LINQ just uses the datasource enumerator.
You have a class that implements IEnumerable:
Exposes the enumerator, which supports a simple iteration over a collection of a specified type.
So you can use the method GetEnumerator.
Returns an enumerator that iterates through the collection.
And this is all LINQ needs to work: an enumerator.
In your example you use the Where LINQ extension method to apply some filter.
IEnumerable<T> Where(this IEnumerable<T> source, Func<T, bool> predicate)
In the implementation we need to:
- get the enumerator (source.GetEnumerator())
- iterate through the collection and apply the filter (predicate)
In the Enumerable reference source you have the implementation of the method Where. You can see that it uses specific implementations for arrays (TSource[]) and lists (List<TSource>), but it uses WhereEnumerableIterator for all the other classes that implement IEnumerable.
So there is no code generation, the code is there.
I think you can understand the implementation of the class WhereEnumerableIterator, you only need to understand first how to implement IEnumerator.
Here you can see the implementation of MoveNext. They call source.GetEnumerator() and then they iterate through the collection (enumerator.MoveNext()) and apply the filter (predicate(item)).
public override bool MoveNext() {
    switch (state) {
        case 1:
            enumerator = source.GetEnumerator();
            state = 2;
            goto case 2;
        case 2:
            while (enumerator.MoveNext()) {
                TSource item = enumerator.Current;
                if (predicate(item)) {
                    current = item;
                    return true;
                }
            }
            Dispose();
            break;
    }
    return false;
}
XContainer.GetElements (the method behind Elements) returns an IEnumerable using the yield keyword.
When you use the yield keyword in a statement, you indicate that the method, operator, or get accessor in which it appears is an iterator. Using yield to define an iterator removes the need for an explicit extra class (the class that holds the state for an enumeration, see IEnumerator for an example) when you implement the IEnumerable and IEnumerator pattern for a custom collection type.
Thanks to the magic of the yield keyword we can obtain an IEnumerable, and we can enumerate the collection. And this is the only thing LINQ needs.
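The two steps described above (get the enumerator, then iterate and apply the predicate) can be sketched with yield in a few lines. This is a simplified stand-in and not the framework's actual implementation; WhereSketch and MyWhere are hypothetical names chosen to avoid clashing with Enumerable.Where.

```csharp
using System;
using System.Collections.Generic;

public static class WhereSketch
{
    // Works for any IEnumerable<T>: arrays, lists, XElement sequences, and so on.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        // foreach calls source.GetEnumerator() and MoveNext() behind the scenes.
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item; // hand back matching items one at a time
            }
        }
    }
}
```

Because the method only depends on IEnumerable<T>, the same code filters an int[] and the result of root.Elements("Address") without knowing which one it was given.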

Entity framework - COUNT rather than SELECT

If I call the GetFoo().Count() from a method outside of my DB class, Entity Framework 5 does a rather inefficient SELECT query rather than a COUNT. From reading a few other questions like this, I see that this is expected behaviour.
public IEnumerable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
I've therefore added a count method to my DB class, which correctly performs a COUNT query:
public int GetFooCount()
{
    return context.Items.Where(d => d.Foo.Equals("bar")).Count();
}
To save me from specifying queries multiple times, I'd like to change this to the following. However this again performs a SELECT, even though it's within the DB class. Why is this - and how can I avoid it?
public int GetFooCount()
{
    return this.GetFoo().Count();
}
Since GetFoo() returns an IEnumerable<DbItems>, the query is executed as a SELECT, and then Count is applied to the resulting collection of objects in memory rather than being translated to SQL.
One option is returning an IQueryable<DbItems> instead:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
But that may change the behavior of other callers that are expecting the collection to be loaded (with IQueryable it will be lazy-loaded), particularly methods that add .Where calls that cannot be translated to SQL. Unfortunately you won't know about those at compile time, so thorough testing will be necessary.
I would instead create a new method that returns an IQueryable:
public IQueryable<DbItems> GetFooQuery()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
so your existing usages aren't affected. If you wanted to re-use that code you could change GetFoo to:
public IEnumerable<DbItems> GetFoo()
{
    return GetFooQuery().AsEnumerable();
}
In order to understand this behavior, you need to understand the difference between the IEnumerable<T> and IQueryable<T> extensions. The first works with LINQ to Objects, that is, in-memory queries. These queries are not translated into SQL, because they are plain .NET code. So, if you have some IEnumerable<T> value and you execute Count(), this invokes the Enumerable.Count extension method, which is something like:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
    int num = 0;
    foreach (var item in source)
        num++;
    return num;
}
But it is a completely different story with the IQueryable<T> extensions. These methods are translated by the underlying LINQ provider (EF in your case) into something other than .NET code, e.g. SQL. This translation occurs when you execute the query: the whole query is analyzed, and nice (well, not always nice) SQL is generated. This SQL is executed in the database, and the result is returned to you as the result of the query execution.
So, your method returns IEnumerable<T>, which means you are using the Enumerable.Count() method, which is executed in memory. Thus the following query is translated by EF into SQL
context.Items.Where(d => d.Foo.Equals("bar")) // translated into SELECT WHERE
and executed, and then the count of items is calculated in memory with the method above. But if you change the return type to IQueryable<T>, then everything changes:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
Now Queryable.Count() is executed. This means the query continues to be built (well, actually Count() is the operator that forces query execution, but Count() becomes part of this query). And EF translates
context.Items.Where(d => d.Foo.Equals("bar")).Count()
into a SQL query which is executed on the server side.

When is ObjectQuery really an IOrderedQueryable?

Applied to entity framework, the extension methods Select() and OrderBy() both return an ObjectQuery, which is defined as:
public class ObjectQuery<T> : ObjectQuery, IOrderedQueryable<T>,
IQueryable<T>, <... more interfaces>
The return type of Select() is IQueryable<T> and that of OrderBy is IOrderedQueryable<T>. So you could say that both return the same type but in a different wrapper. Luckily so, because now we can apply ThenBy after OrderBy was called.
Now my problem.
Let's say I have this:
var query = context.Plots.Where(p => p.TrialId == 21);
This gives me an IQueryable<Plot>, which is an ObjectQuery<Plot>. But it is also an IOrderedQueryable:
var b = query is IOrderedQueryable<Plot>; // True!
But still:
var query2 = query.ThenBy(p => p.Number); // Does not compile.
// 'IQueryable<Plot>' does not contain a definition for 'ThenBy'
// and no extension method 'ThenBy' ....
When I do:
var query2 = ((IOrderedQueryable<Plot>)query).ThenBy(p => p.Number);
It compiles, but gives a runtime exception:
Expression of type 'IQueryable`1[Plot]' cannot be used for parameter of type 'IOrderedQueryable`1[Plot]' of method 'IOrderedQueryable`1[Plot] ThenBy[Plot,Nullable`1](IOrderedQueryable`1[Plot], Expressions.Expression`1[System.Func`2[Plot,System.Nullable`1[System.Int32]]])'
The cast is carried out (I checked), but the parameter of ThenBy is still seen as IQueryable (which puzzles me a bit).
Now suppose some method returns an ObjectQuery<Plot> to me as IQueryable<Plot> (like Select()). What if I want to know whether it is safe to call ThenBy on the returned object? How can I figure out whether the ObjectQuery is a "real" or a "fake" IOrderedQueryable without catching exceptions?
Expression Trees are genuinely good fun! (or perhaps I'm a little bit of a freak) and will likely become useful in many a developer's future if Project Roslyn is anything to go by! =)
In your case, simply inherit from MSDN's ExpressionVisitor, and override the VisitMethodCall method in an inheriting class with something that compares m.Method with OrderBy (i.e. if you're not too fussy, simply check the name; if you want to be fussy, use reflection to grab the actual OrderBy MethodInfo to compare with).
Let me know if/what you need examples of, but honestly, after copy/pasting the ExpressionVisitor you'll probably need no more than 10 lines of non-expression-tree code ;-)
Hope that helps
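As a rough illustration of the "not too fussy" name check, here is a sketch that looks only at the outermost call of the query's expression instead of walking the whole tree with a visitor. That is a deliberate simplification of the approach above, on the assumption that ThenBy only makes sense when the last operator applied was an ordering. OrderingCheck and EndsWithOrdering are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class OrderingCheck
{
    private static readonly string[] OrderingMethods =
        { "OrderBy", "OrderByDescending", "ThenBy", "ThenByDescending" };

    // True when the last operator applied to the query is a Queryable ordering method.
    public static bool EndsWithOrdering(IQueryable query)
    {
        var call = query.Expression as MethodCallExpression;
        return call != null
            && call.Method.DeclaringType == typeof(Queryable)
            && Array.IndexOf(OrderingMethods, call.Method.Name) >= 0;
    }
}
```

For an in-memory source such as new[] { 3, 1, 2 }.AsQueryable(), the check is false after Where, true after OrderBy, and false again after OrderBy followed by Where. Whether a particular provider accepts ThenBy in every case where this returns true is not something the sketch can guarantee.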
Although Expression Trees are good fun, wouldn't in this case the simple solution be to use OrderBy rather than ThenBy?
OrderBy is an extension on IQueryable and returns an IOrderedQueryable.
ThenBy is an extension on IOrderedQueryable and returns an IOrderedQueryable.
So if you have a IQueryable (as in your case above, where query is an IQueryable) and you want to apply an initial ordering to it, use OrderBy. ThenBy is only intended to apply additional ordering to an already ordered query.
If you have a LINQ result of some kind, but you aren't sure if it is an IQueryable or an IOrderedQueryable and want to apply additional ordering to it, you could make two methods like:
static IOrderedQueryable<T> ApplyAdditionalOrdering<T, TKey>(this IOrderedQueryable<T> source, Expression<Func<T, TKey>> orderBy)
{
    return source.ThenBy(orderBy);
}
And
static IOrderedQueryable<T> ApplyAdditionalOrdering<T, TKey>(this IQueryable<T> source, Expression<Func<T, TKey>> orderBy)
{
    return source.OrderBy(orderBy);
}
The compiler will figure out the correct one to call based on the compile-time type of your query object.

Implementing IQueryable.Count

I'm working on an IQueryable provider. In my IQueryProvider I have the following code:
public TResult Execute<TResult>(Expression expression)
{
    var query = GetQueryText(expression);
    // Call the Web service and get the results.
    var items = myWebService.Select<TResult>(query);
    IQueryable<TResult> queryableItems = items.AsQueryable<TResult>();
    return (TResult)queryableItems;
}
GetQueryText does all the leg work and works out the query string for the expression tree. This is all working well, so Where, OrderBy and Take are sorted. The webservice supports a count query using the following:
int count = myWebService.Count(query);
But I can't get my head round where I put this in the IQueryable or IQueryProvider.
I've basically worked from reading tutorials and open source examples, but can't seem to find one that does Count.
The answer appears simpler than I first thought. This blog post helped:
The Execute method is the entry point into your provider for actually executing query expressions. Having an explicit execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences. For example, the query “myquery.Count()” returns a single integer. The expression tree for this query is a method call to the Count method that returns the integer. The Queryable.Count method (as well as the other aggregates and the like) use this method to execute the query ‘right now’.
I did some debugging: for a query of myContext.Where(x => x.var1 > 5), Execute is called and TResult is an IEnumerable<MyClass>.
For myContext.Where(x => x.var1 > 5).Count(), Execute is called and TResult is an int.
So my Execute method just needs to return appropriately.
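One way Execute could tell the two cases apart is by inspecting the expression it receives. The sketch below is an assumption about how that check might look, not code from the provider in question; CountDetection, IsCountCall and the two Sample methods are hypothetical, and the sample trees are built with lambdas so nothing is actually executed.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class CountDetection
{
    // True when the outermost node is a call to Queryable.Count.
    public static bool IsCountCall(Expression expression)
    {
        var call = expression as MethodCallExpression;
        return call != null
            && call.Method.DeclaringType == typeof(Queryable)
            && call.Method.Name == "Count";
    }

    // Expression tree for source.Where(...).Count(), as Queryable.Count would hand it to Execute.
    public static Expression SampleCountExpression()
    {
        Expression<Func<IQueryable<int>, int>> lambda = q => q.Where(x => x > 5).Count();
        return lambda.Body;
    }

    // Expression tree for source.Where(...) with no aggregate on top.
    public static Expression SampleWhereExpression()
    {
        Expression<Func<IQueryable<int>, IQueryable<int>>> lambda = q => q.Where(x => x > 5);
        return lambda.Body;
    }
}
```

Inside Execute, a branch on IsCountCall(expression) could then call myWebService.Count(query) and return the integer as (TResult)(object)count, falling back to the Select path otherwise. Whether GetQueryText should receive the whole expression or only the Count call's source argument depends on how that method is written.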
