My question is how and when it makes sense to overload (if that's even possible) the Where() extension method of IQueryable when you are writing your own IQueryable implementation.
For example, in Entity Framework it's my understanding that a Where() call made against an ObjectSet changes the actual SQL that's passed to the database. Alternatively, if you call AsEnumerable() first, the filtering is done with LINQ to Objects rather than LINQ to Entities.
For instance:
new MyDBEntities().MyDbTable.Where(x => x.SomeProperty == "SomeValue");
// this is linq-to-entities and changes the database-level SQL
Versus:
new MyDBEntities().MyDbTable.AsEnumerable().Where(x => x.SomeProperty == "SomeValue");
// this is linq-to-objects and filters the IEnumerable
How do you deal with this when implementing your own IQueryable, given that IQueryable already has predefined extension methods such as Where()? Specifically, I want to make my own very simple ORM that uses native SQL and SqlDataReader under the hood, and I want a .Where() method that changes the native SQL before passing it to the database.
Should I even use IQueryable or just create my own class entirely? I want to be able to use lambda syntax and have my SQL command altered based on the lambda function(s) used for filtration.
You could create your own type analogous to IQueryable<T>, but that's probably not a good idea. You should probably write your own implementation of the interface. There is a nice series of articles on how to do it, but be prepared, doing so is not one of the easier tasks.
Related
If the IQueryable interface performs the query expression in the server rather than fetching all records like IEnumerable, why is IQueryable not replaced by IEnumerable where it can be faster and more efficient?
DBSet<T> has two flavors of Where (IQueryable and IEnumerable). Is there a way to call the IEnumerable version because the IQueryable is called by default, without calling ToList()?
If the IQueryable perform the query Expression in the server rather
than fetching all records like IEnumerable, why IQueryable not
replaced by IEnumerable where it can be faster and more efficient?
IQueryable and IEnumerable represent two different things. Think of an IQueryable as a "question": it does not hold any results itself. An IEnumerable is an "answer": it only has data connected to it, and you can't tell what generated that data.
It is not that an IQueryable is "faster" per se; it just lets you put your filtering and projections into the "question" you ask the SQL server, so that it returns only the answers you need (in the form of an IEnumerable, by calling .ToList() or similar).
If you only use an IEnumerable, the only question you can ask is "Give me everything you know", and you then perform your filtering and projections on the answer it gives you. That is why IQueryable is considered faster: far less data needs to be processed, because you were able to ask the server a more specific question.
The reason IQueryable has not replaced IEnumerable everywhere is that the thing you are asking has to be able to understand the question. It takes a lot of work to parse every possible thing you could ask it to filter or project on, so most implementations limit themselves to the common cases they know they need to answer. For example, when you ask Entity Framework a question it does not know how to handle, you will get an error similar to "Specified method is not supported" when you try to get an IEnumerable (an answer) from the IQueryable.
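To make the "question" versus "answer" distinction concrete, here is a minimal sketch. It assumes an in-memory array wrapped with AsQueryable() as a stand-in for a database table, so no SQL is involved; a real provider would translate the same expression tree into a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Top-level statements (C# 9 or later).
// In-memory stand-in for a table; this is an assumption for illustration only.
int[] numbers = { 1, 2, 3, 4, 5 };

// The "question": nothing has run yet, the query is only an expression tree.
IQueryable<int> question = numbers.AsQueryable().Where(n => n > 2);
Console.WriteLine(question.Expression.NodeType); // Call

// The "answer": ToList() executes the query and materializes the data.
List<int> answer = question.ToList();
Console.WriteLine(string.Join(", ", answer)); // 3, 4, 5
```

Until ToList() is called, the only thing that exists is the description of the query, which is what a provider gets to inspect and translate.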
DBSet<T> has two flavors of Where (IQueryable and IEnumerable).
is there a way to call the IEnumerable version because the
IQueryable is called by default, without calling ToList()?
The class DbSet<T> has no Where method of its own. The two Where functions come from two extension methods, Enumerable.Where and Queryable.Where. You can force the Enumerable overload by casting the object to IEnumerable<T> (or calling AsEnumerable()) before you call the extension method. However, remember that Queryable.Where filters the question, while Enumerable.Where only filters the result.
It is wasteful to ask for results from the server to then just throw them away, so I would not recommend doing this.
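The overload selection can be seen in a small sketch. It assumes an in-memory source wrapped with AsQueryable() in place of a DbSet<T>, and the IsQueryable helper is a hypothetical name used only to inspect the results.

```csharp
using System;
using System.Linq;

// Top-level statements (C# 9 or later).
// In-memory stand-in for a DbSet<T>; an assumption for illustration only.
IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();

// Hypothetical helper: reports whether a value still carries a query expression.
bool IsQueryable(object value) => value is IQueryable;

// Compile-time type IQueryable<int>: binds to Queryable.Where,
// which extends the expression tree (the "question").
var viaQueryable = source.Where(n => n % 2 == 0);

// Compile-time type IEnumerable<int>: binds to Enumerable.Where,
// which filters items in memory as they are enumerated.
var viaEnumerable = source.AsEnumerable().Where(n => n % 2 == 0);

Console.WriteLine(IsQueryable(viaQueryable));  // True
Console.WriteLine(IsQueryable(viaEnumerable)); // False
```

Both sequences contain the same elements here; the difference is where the filtering happens, which matters once the source is a real database.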
Let's say I need an extension method which selects only required properties from different sources. The source could be the database or in-memory collection. So I have defined such extension method:
public static IQueryable<TResult> SelectDynamic<T, TResult>(
    this IQueryable<T> source,
    ...)
This works fine for IQueryables. But, I have to call this function also for IEnumerables.
And in that case, I can call it with the help of .AsQueryable():
myEnumerable.AsQueryable()
.SelectDynamic(...)
.ToList();
Both work fine. And if both work fine, under which conditions do I have to create two different extension methods for the same purpose, one for IEnumerable and another for IQueryable?
My method has to send a query to the database in the Queryable case.
For example, here is the source of .Select extension method inside System.Linq namespace:
.Select for IEnumerable
.Select for IQueryable
I am repeating my main question again:
My method must send a query to the database in the Queryable case, but not when working with an IEnumerable. For now, I am using AsQueryable() for the enumerables because I don't want to write the same code again for Enumerable. Can this have side effects?
If your code only actually works when the objects it's dealing with are loaded in memory, just supply the IEnumerable variant and let your callers decide when they want to convert an IQueryable into an in-memory IEnumerable.
Generally, you won't implement new variations around IQueryable unless you're writing a new database provider.
myEnumerable.AsQueryable() returns a custom object: new EnumerableQuery<TElement>(myEnumerable); (source code)
This EnumerableQuery class implements IEnumerable<T> and IQueryable<T>
When using the EnumerableQuery result of .AsQueryable() as an IEnumerable, the implementation of the interface method IEnumerable<T>.GetEnumerator() simply returns the original source's enumerator, so there is no change and minimal overhead.
When using the result of .AsQueryable() as an IQueryable, the implementation of the interface property IQueryable.Expression simply returns Expression.Constant(this), ready to be evaluated later as an IEnumerable when the whole expression tree is consumed.
(All the other methods and code paths of EnumerableQuery are not really relevant, when the EnumerableQuery is constructed directly from an IEnumerable, as far as I can tell)
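The two observations above can be checked with a short sketch. It assumes a plain List<string> as the source; the behaviour shown is that of the framework's EnumerableQuery<T> as I understand it from the source code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Top-level statements (C# 9 or later).
var list = new List<string> { "a", "b" };
IQueryable<string> wrapped = list.AsQueryable();

// The wrapper is the framework's EnumerableQuery<T>.
Console.WriteLine(wrapped is EnumerableQuery<string>); // True

// With no operators applied yet, the expression is a constant holding the wrapper itself.
var root = (ConstantExpression)wrapped.Expression;
Console.WriteLine(ReferenceEquals(root.Value, wrapped)); // True

// Enumerating the wrapper yields the original elements unchanged.
Console.WriteLine(string.Join(", ", wrapped)); // a, b
```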
If I understand you correctly, you have implemented your method SelectDynamic<TResult>() in such a way that you construct an expression tree inside the method, which produces the desired result when compiled.
As far as I understand the source code, when you call e.g. myEnumerable.AsQueryable().SelectDynamic().ToList(), the expression tree you constructed is compiled and executed on myEnumerable, and the total overhead should be fairly minimal, since all this work is only done once per query, not once per element.
So I think there is nothing wrong with implementing your IEnumerable extension method like this:
public static IEnumerable<TResult> SelectDynamic<T, TResult>(
    this IEnumerable<T> source, ...)
{
    // Reuse the IQueryable overload; EnumerableQuery runs it in memory.
    return source.AsQueryable().SelectDynamic<T, TResult>(...);
}
There is some slight overhead, since the expression is compiled each time this method is called, and I am not sure the compiled result is cached anywhere. But I think that will not be noticeable in most circumstances, unless you execute this query thousands of times per second.
There should be no other side effects, apart from slight performance issues, in implementing the IEnumerable extension method in this way.
This code causes a NotSupportedException.
var detailList = context.Details.Where(x => x.GetType().GetProperty("Code").GetValue(x,null).ToString() == "00101").ToList();
But this code works.
var detailList = context.Details.AsEnumerable().Where(x => x.GetType().GetProperty("Code").GetValue(x,null).ToString() == "00101").ToList();
MSDN says:
- AsEnumerable() Returns the input typed as IEnumerable
- DbSet Is an IEnumerable
So why do we need to use the AsEnumerable() method?
DbSet is also IQueryable.
IQueryable has its own set of LINQ extension methods that build expression trees for the provider to translate into SQL, and the provider cannot translate reflection calls.
By calling AsEnumerable(), you change the compile-time type of the expression to IEnumerable<T>, forcing the extension methods to bind to the standard LINQ ones.
If you prefer to run your query on the server, you should build an expression tree instead of using reflection.
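As a rough sketch of that suggestion, the helper below builds x => x.Code == "00101" as an expression tree from a property name. WherePropertyEquals is a hypothetical name, the property is assumed to be of type string, and an in-memory array of anonymous objects stands in for context.Details so the example can run without a database.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Top-level statements (C# 9 or later).
// Hypothetical helper: filters on a string property chosen by name,
// without putting any reflection call inside the query itself.
IQueryable<T> WherePropertyEquals<T>(IQueryable<T> source, string propertyName, string value)
{
    ParameterExpression x = Expression.Parameter(typeof(T), "x");
    Expression property = Expression.Property(x, propertyName);              // x.Code
    Expression body = Expression.Equal(property, Expression.Constant(value)); // x.Code == "00101"
    return source.Where(Expression.Lambda<Func<T, bool>>(body, x));
}

// In-memory stand-in for context.Details (an assumption for illustration).
var details = new[] { new { Code = "00101" }, new { Code = "00202" } }.AsQueryable();

var matches = WherePropertyEquals(details, "Code", "00101").ToList();
Console.WriteLine(matches.Count); // 1
```

Because the resulting tree contains only a property access and a constant, a query provider has the same information it would get from a hand-written lambda.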
The first query attempts to have the query provider translate the query into SQL and execute it against the database. It fails to create a valid database query, so it fails with the error mentioned.
Using AsEnumerable types the query as an IEnumerable<T>, rather than an IQueryable<T>, statically, and as such ends up calling the LINQ to objects version of the query methods, pulling the entire table into memory and then performing all of the operations within the application.
When you're querying an IQueryable<T>, your method gets translated via an expression tree into SQL. AsEnumerable changes the compile-time type to IEnumerable<T>, so the entities are brought from your database into memory, where you can query them with reflection using LINQ to Objects.
I know these questions have been asked before, I'll start by listing a few of them (the ones I've read so far):
IEnumerable vs IQueryable
List, IList, IEnumerable, IQueryable, ICollection, which is most flexible return type?
Returning IEnumerable<T> vs. IQueryable<T>
IEnumerable<T> as return type
https://stackoverflow.com/questions/2712253/ienumerable-and-iqueryable
Views with business logic vs code
WPF IEnumerable<T> vs IQueryable<T> as DataSource
IEnumerable<T> VS IList<T> VS IQueryable<T>
What interface should my service return? IQueryable, IList, IEnumerable?
Should I return IEnumerable<T> or IQueryable<T> from my DAL?
As you can see, there's some great resources on SO alone on the subject, but there is one question/section of the question I'm still not sure about having read these all through.
I'm primarily concerned with the IEnumerable vs IQueryable question, and more specifically the coupling between the DAL and its consumers.
I've found varying opinions suggested regarding the two interfaces, which have been great. However, I'm concerned with the implications of a DAL returning IQueryable. As I understand it, IQueryable suggests/implies that there is a LINQ provider under the hood. That's concern number one: what if the DAL suddenly requires data from a non-LINQ-provided source? The following would work, but is it more of a hack?
public static IQueryable<Product> GetAll()
{
// this function used to use a L2S context or similar to return data
// from a database, however, now it uses a non linq provider
// simulate the non linq provider...
List<Product> results = new List<Product> { new Product() };
return results.AsQueryable();
}
So I can use the AsQueryable() extension, though I admit I don't know exactly what it does. I always imagine IQueryables as being the underlying expression trees which we can append to as necessary until we're ready to perform our query and fetch the results.
I could rectify this by changing the return type of the function to IEnumerable. I can then return IQueryable from the function because it inherits IEnumerable, and I get to keep the deferred loading. What I lose is the ability to append to the query expression:
var results = SomeClass.GetAll().Where(x => x.ProductTypeId == 5);
When returning IQueryable, as I understand it, this would simply append the expression. When returning IEnumerable, despite maintaining the deferred loading, the expression has to be evaluated so the results will be brought to memory and enumerated through to filter out incorrect ProductTypeIds.
How do other people get round this?
Provide more functions in the DAL - GetAllByProductType, GetAllByStartDate,... etc
Provide an overload that accepts predicates? i.e.
public static IEnumerable<Product> GetAll(Predicate<Product> predicate)
{
List<Product> results = new List<Product> { new Product() };
return results.Where(x => predicate(x));
}
One last part (Sorry, I know, really long question!).
I found IEnumerable to be the most recommended across all the questions I checked, but what about deferred loading's requirement for a datacontext to be available? As I understand it, if your function's return type is IEnumerable but you actually return an IQueryable, the IQueryable is reliant on an underlying datacontext. Because the result at this stage is actually an expression and nothing has been brought into memory, you cannot guarantee that the DAL's/function's consumer is going to perform the query, nor when. So do I have to keep the instance of the context that the results were derived from available somehow? Is this how/why the Unit of Work pattern comes into play?
Summary of the questions for clarity (I did a search for "?"...):
1. If using IQueryable as a return type, are you too tightly coupling your UI/Business Logic to Linq Providers?
2. Is using the AsQueryable() extension a good idea if you suddenly need to return data from a non-Linq Provided source?
3. Anyone have a good link describing how for example converting a standard list to AsQueryable works, what it actually does?
4. How do you handle additional filtering requirements supplied by business logic to your DAL?
5. It seems the deferred loading of both IEnumerable and IQueryable are subject to maintaining the underlying provider, should I be using a Unit of Work pattern or something else to handle this?
Thanks a lot in advance!
1. well, you aren't strictly coupled to any specific provider, but as a re-phrasing of that: you can't easily test the code, since each provider has different supported features (meaning: what works for one might not work for another - even something like .Single())
2. I don't think so, if there is any question in your mind about ever changing provider - see above
3. it just provides a decorated wrapper that uses .Compile() on any lambdas, and uses LINQ-to-Objects instead. Note LINQ-to-Objects has more support than any other provider, so this won't be an issue - except that it means that any "mocks" using this approach don't really test your actual code at all and are largely pointless (IMO)
4. yeah, tricky - see below
5. yeah, tricky - see below
Personally, I'd prefer well defined APIs here that take known parameters and return a loaded List<T> or IList<T> (or similar) of results; this gives you a testable/mockable API, and doesn't leave you at the mercy of deferred execution (closed connection hell, etc). It also means that any differences between providers is handled internally to the implementation of your data layer. It also makes a much closer fit for calling scenarios such as web-services, etc.
In short; given a choice between IEnumerable<T> and IQueryable<T>, I choose neither - opting instead to use IList<T> or List<T>. If I need additional filtering, then either:
I'll add that to the existing API via parameters, and do the filtering inside my data layer
I'll accept that oversized data is coming back, which I then need to filter out at the caller
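A minimal sketch of the first option, filtering via parameters inside the data layer and returning a loaded list. GetProducts, its productTypeId parameter, and the tuple-based product shape are hypothetical, and an in-memory array stands in for the real database call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Top-level statements (C# 9 or later).
// In-memory stand-in for the real data source (hypothetical data).
var store = new[]
{
    (Id: 1, ProductTypeId: 5),
    (Id: 2, ProductTypeId: 7),
    (Id: 3, ProductTypeId: 5),
};

// Filtering is a parameter of the data layer, and the result is fully loaded
// before it is returned, so callers never depend on an open connection.
List<(int Id, int ProductTypeId)> GetProducts(int? productTypeId = null)
{
    var query = store.AsEnumerable();
    if (productTypeId.HasValue)
        query = query.Where(p => p.ProductTypeId == productTypeId.Value);
    return query.ToList();
}

Console.WriteLine(GetProducts(productTypeId: 5).Count); // 2
Console.WriteLine(GetProducts().Count);                 // 3
```

The caller only sees known parameters and a materialized List<T>, so swapping the data source later does not change the API.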
I'm trying to mimic the LINQ Where extension method for my ADO.NET DAL methods.
Basically, my aim is to have a single method that I can call, such as:
Product p = Dal.GetProduct(x => x.ProductId == 32);
Product p2 = Dal.GetProduct(x => x.ProductName.Contains("Soap"));
I then want to dissect those Predicates and send the filter options to parameters in an ADO.NET Stored Procedure call.
Any comments greatly appreciated.
As @Daniel points out, this is far from simple. The solution outline is to let GetProduct take an argument of type Expression<Func<Product, bool>>. You then have to traverse the expression tree, generating the correct SQL for the functions you know and deciding how to handle unknown functions. There are basically two options for that:
Throw an error (as linq-to-sql does).
Skip it in the translation and then apply it on the returned result. The performance impact of this can of course be huge if a lot of data is retrieved just to be filtered out.
It would be a fun exercise to do it, but I can hardly see a way to justify it in the real world, when there are already linq2sql, linq2entities and linq2NHibernate that do the job.
In addition to Anders's answer, I just want to mention that you can analyze the expression tree by using an expression visitor. To do that, you can inherit from the ExpressionVisitor class (it's new in .NET 4, but you can find a 3.5 implementation in LinqKit) and override the methods for the node types you want to analyze.
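As an illustration of the visitor approach, here is a very small sketch that turns a simple predicate into a parameterized WHERE fragment. WhereBuilder and Product are hypothetical names, column names are assumed to match property names, and only literal constants and a handful of operators are handled. Anything else, including method calls such as Contains and captured local variables, throws NotSupportedException, which corresponds to the "throw an error" option.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

// Top-level statements (C# 9 or later); hypothetical usage.
var builder = new WhereBuilder();
string where = builder.Translate<Product>(p => p.ProductId == 32);
Console.WriteLine(where); // (ProductId = @p0)

// Hypothetical entity used for illustration.
class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}

// Hypothetical translator: walks the predicate and emits SQL text plus parameter values.
class WhereBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();
    public List<object> Parameters { get; } = new List<object>();

    public string Translate<T>(Expression<Func<T, bool>> predicate)
    {
        _sql.Clear();
        Parameters.Clear();
        Visit(predicate.Body);
        return _sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        string op = node.NodeType switch
        {
            ExpressionType.Equal => "=",
            ExpressionType.NotEqual => "<>",
            ExpressionType.GreaterThan => ">",
            ExpressionType.LessThan => "<",
            ExpressionType.AndAlso => "AND",
            ExpressionType.OrElse => "OR",
            _ => throw new NotSupportedException($"Operator {node.NodeType} is not supported."),
        };
        _sql.Append('(');
        Visit(node.Left);
        _sql.Append(' ').Append(op).Append(' ');
        Visit(node.Right);
        _sql.Append(')');
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Only direct property access on the lambda parameter maps to a column.
        if (node.Expression is ParameterExpression)
        {
            _sql.Append(node.Member.Name);
            return node;
        }
        throw new NotSupportedException($"Member {node.Member.Name} is not supported.");
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Values become SQL parameters rather than inline literals.
        _sql.Append("@p").Append(Parameters.Count);
        Parameters.Add(node.Value);
        return node;
    }

    protected override Expression VisitMethodCall(MethodCallExpression node) =>
        throw new NotSupportedException($"Method {node.Method.Name} is not supported.");
}
```

After Translate runs, Parameters holds the values to bind (here a single 32), which could then be passed to an ADO.NET command alongside the generated text.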
You might also be interested in these links:
Building a LINQ Provider
Walkthrough: Creating an IQueryable LINQ Provider