Entity Framework and IEnumerable

I wrote a paging function that I hoped would page in an EF-safe way. It seems, though, that this function executes the EF query BEFORE paging, whereas I would like the paging to be deferred to the database. I thought IEnumerable would be safe, but it appears it is not. What is happening here?
private IEnumerable<T> PageList<T>(IEnumerable<T> list, PaginationOptions options)
{
    return list.Skip((options.Page - 1) * options.ResultsPerPage).Take(options.ResultsPerPage);
}
If I compare calling this function with writing the same Skip/Take inline, the inline version generates SQL that contains the paging information and the function call doesn't.
var pagedStuff = this.PageList(activities, options); // doesn't create the correct query
var pagedStuff = activities.Skip((options.Page - 1) * options.ResultsPerPage).Take(options.ResultsPerPage);
P.s. I use IEnumerable as sometimes this function is used for normal Lists.

"I use IEnumerable as sometimes this function is used for normal Lists"
In that case, you may want to consider adding an IQueryable overload as well:
private IQueryable<T> PageList<T>(IQueryable<T> list, PaginationOptions options)
{
    return list.Skip((options.Page - 1) * options.ResultsPerPage).Take(options.ResultsPerPage);
}
The problem is that the parameter is typed as IEnumerable<T>, so Skip and Take bind to the LINQ-to-Objects (Enumerable) versions. The query runs without any paging, all records come back, and the paging is done in memory.
With an IQueryable<T> parameter, Skip and Take become part of the query itself, so the paging is included in the SQL when the query finally executes.
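To make the overload idea concrete, here is a minimal sketch that runs without a database. It assumes plain page/resultsPerPage ints in place of the question's PaginationOptions, and PagingSketch and EndsWithQueryableTake are names invented for illustration. The in-memory AsQueryable() source only stands in for an EF query; the point is which Skip/Take the compiler picks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PagingSketch
{
    // Chosen when the argument's compile-time type is IQueryable<T>:
    // Skip/Take bind to Queryable and are appended to the expression tree.
    public static IQueryable<T> PageList<T>(IQueryable<T> source, int page, int resultsPerPage)
    {
        return source.Skip((page - 1) * resultsPerPage).Take(resultsPerPage);
    }

    // Chosen for List<T>, arrays, or anything typed as IEnumerable<T>:
    // Skip/Take bind to Enumerable and run in memory.
    public static IEnumerable<T> PageList<T>(IEnumerable<T> source, int page, int resultsPerPage)
    {
        return source.Skip((page - 1) * resultsPerPage).Take(resultsPerPage);
    }

    // True when the outermost node of the query is a Queryable.Take call,
    // i.e. the paging is part of what a provider would get to translate.
    public static bool EndsWithQueryableTake(IQueryable query)
    {
        return query.Expression is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Queryable)
            && call.Method.Name == nameof(Queryable.Take);
    }
}
```

Passing a variable declared as IQueryable<T> selects the first overload; passing the same object through a variable declared as IEnumerable<T> selects the second, which is the situation in the question.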

You're invoking Enumerable.Skip/Take instead of Queryable.Skip/Take.
Try the AsQueryable method, which hands back the original query when the underlying object is already an IQueryable<T>:
return list.AsQueryable()
    .Skip((options.Page - 1) * options.ResultsPerPage)
    .Take(options.ResultsPerPage);
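A small sketch of why this works with a single method, again with plain ints instead of PaginationOptions and with AsQueryableSketch as a made-up name. Queryable.AsQueryable returns the same instance when the object already implements IQueryable<T>, and wraps anything else (a List<T>, say) in an in-memory queryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsQueryableSketch
{
    // One method for both kinds of input: a real query keeps composing,
    // while a plain collection gets wrapped and is paged in memory.
    public static IEnumerable<T> PageList<T>(IEnumerable<T> list, int page, int resultsPerPage)
    {
        return list.AsQueryable()
                   .Skip((page - 1) * resultsPerPage)
                   .Take(resultsPerPage);
    }
}
```

For plain lists the wrapper adds some overhead compared with calling Enumerable.Skip/Take directly, so the separate-overload approach may still be preferable if the method is called often with in-memory data.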

Related

Entity framework - COUNT rather than SELECT

If I call the GetFoo().Count() from a method outside of my DB class, Entity Framework 5 does a rather inefficient SELECT query rather than a COUNT. From reading a few other questions like this, I see that this is expected behaviour.
public IEnumerable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
I've therefore added a count method to my DB class, which correctly performs a COUNT query:
public int GetFooCount()
{
    return context.Items.Where(d => d.Foo.Equals("bar")).Count();
}
To save me from specifying queries multiple times, I'd like to change this to the following. However this again performs a SELECT, even though it's within the DB class. Why is this - and how can I avoid it?
public int GetFooCount()
{
    return this.GetFoo().Count();
}

Since GetFoo() returns an IEnumerable<DbItems>, the query is executed as a SELECT, and Count is then applied to the resulting objects in memory rather than being translated into SQL.
One option is returning an IQueryable<DbItems> instead:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
But that may change the behavior of other callers that expect any further operators to run in memory (with IQueryable they become part of the deferred database query), particularly methods that add .Where calls that cannot be translated to SQL. Unfortunately you won't find out about those at compile time, so thorough testing will be necessary.
I would instead create a new method that returns an IQueryable:
public IQueryable<DbItems> GetFooQuery()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
so your existing usages aren't affected. If you wanted to re-use that code you could change GetFoo to:
public IEnumerable<DbItems> GetFoo()
{
    return GetFooQuery().AsEnumerable();
}

To understand this behavior you need to understand the difference between the IEnumerable<T> and IQueryable<T> extensions. The first works with LINQ to Objects, i.e. in-memory queries. These queries are not translated into SQL, because they are plain .NET code. So, if you have an IEnumerable<T> value and call Count() on it, this invokes the Enumerable.Count extension method, which is something like:
public static int Count<TSource>(this IEnumerable<TSource> source)
{
    int num = 0;
    foreach (var item in source)
        num++;
    return num;
}
The story is completely different with the IQueryable<T> extensions. These methods are translated by the underlying LINQ provider (EF in your case) into something other than .NET code, e.g. SQL. The translation occurs when you execute the query: the whole query is analyzed, and nice (well, not always nice) SQL is generated. This SQL is executed in the database, and the result is returned to you as the result of the query.
So, your method returns IEnumerable<T>, which means you are using the Enumerable.Count() method, which executes in memory. Thus only the following query is translated by EF into SQL
context.Items.Where(d => d.Foo.Equals("bar")) // translated into SELECT WHERE
and executed, and then the count of items is calculated in memory with the method above. But if you change the return type to IQueryable<T>, everything changes:
public IQueryable<DbItems> GetFoo()
{
    return context.Items.Where(d => d.Foo.Equals("bar"));
}
Now Queryable.Count() is executed. This means the query keeps being built (well, actually Count() is the operator that forces query execution, but Count() becomes part of this query). And EF translates
context.Items.Where(d => d.Foo.Equals("bar")).Count()
into a SQL query that is executed on the server side.
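To see the in-memory side of this without a database, here is a rough sketch. CountingSource is a hypothetical stand-in for rows streamed back from a SELECT; it simply records how many items it was asked to produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class CountingSource
{
    // How many items have been pulled from Rows so far.
    public int ItemsProduced { get; private set; }

    // Stands in for rows coming back from the database one by one.
    public IEnumerable<int> Rows(int total)
    {
        for (int i = 0; i < total; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }
}
```

Calling source.Rows(1000).Count() pulls every one of the 1000 items just to count them, which is what happens to the materialized entities when Count() binds to Enumerable.Count. With Queryable.Count the provider can send a COUNT query instead and no rows need to be materialized.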

Does converting an IQueryable LINQ query to IEnumerable<T> cancel out LINQ's optimized way of working?

I'm fairly new to .NET, and I was wondering how LINQ works, since you can apply many LINQ queries one after another, but none of them is actually executed until the results are consumed, converted to a list, etc.
There are two important ways to hold a LINQ query: IQueryable<T>, which applies the where filters directly in the SQL, and IEnumerable<T>, which gets all the records and then works with them in memory. However, let's take a look at this code:
//Linq dynamic library
IQueryable<Table> myResult = db.Categories
    .Where(a => a.Name.Contains(StringName))
    .OrderBy("Name")
    .Skip(0)
    .Take(10);
if (myResult != null)
{
    return myResult.AsEnumerable();
}
else
{
    return null;
}
Despite my using the LINQ dynamic library, the direct result of this query is obtained as IQueryable<T>. If the query is finally returned as IEnumerable, is the query really being filtered in the SQL, or in memory?

It's still going to execute in the database, don't worry. Basically it's all down to which implementation of Where etc is used. While you're calling methods on IQueryable<T> - via the Queryable extension methods - it will be using expression trees. When you start to fetch from that query, it will be turned into SQL and sent to the database.
On the other hand, if you use any of those methods after you've got it as an IEnumerable<T> (in terms of the compile-time type), that will use the extension methods in Enumerable, and all of the rest of the processing would be done in-process.
As an example, consider this:
var query = db.People
    .Where(x => x.Name.StartsWith("J"))
    .AsEnumerable()
    .Where(x => x.Age > 20);
Here AsEnumerable() just returns its input sequence, but typed as IEnumerable<T>. In this case, the database query would return only people whose name began with J - and then the age filtering would be done at the client instead.
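The boundary can be checked with an in-memory queryable. This is only a sketch: Person and AsEnumerableSketch are illustrative names, and the list-backed source stands in for db.People.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class AsEnumerableSketch
{
    // Binds to Queryable.Where: part of what a provider would translate.
    public static IQueryable<Person> NamesStartingWithJ(IQueryable<Person> people)
    {
        return people.Where(x => x.Name.StartsWith("J"));
    }

    // After AsEnumerable, Where binds to Enumerable.Where and runs in process.
    public static IEnumerable<Person> OlderThan20(IQueryable<Person> serverSide)
    {
        return serverSide.AsEnumerable().Where(x => x.Age > 20);
    }
}
```

AsEnumerable() returns the very same object, only typed as IEnumerable<T>, and the sequence produced by the second Where is no longer an IQueryable<T>, so nothing after that point can reach the SQL.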

If you return an IEnumerable<T> and then further refine the query, the further refinement happens in memory. The part that was expressed on an IQueryable<T> will get translated to the appropriate SQL statements (for the LINQ-to-SQL case, obviously).
See Returning IEnumerable<T> vs. IQueryable<T> for a longer and more detailed answer.

mixing c# functions with Linq-To-SQL conditions

I am having trouble mixing C# functions with conditions in LINQ to SQL.
Suppose I have a database table "things" and a local C# function: bool isGood(thing, params)
I want to use that function to select rows from the table.
var bad = dataContext.Things.Where(t => t.type == mytype && !isGood(t, myparams));
dataContext.Things.DeleteAllOnSubmit(bad);
or
if (dataContext.Things.Any(t => t.type == mytype && isGood(t, myparams)))
{
    return false;
}
Of course this does not work, Linq has no way of translating my function into a SQL statement. So this will produce:
NotSupportedException: Method 'Boolean isGood(thing,params)' has no supported translation to SQL.
What is the best way to redesign this so that it will work?
I can split the statements and convert to a list like this:
List<Things> mythings = dataContext.Things.Where(t => t.type == mytype).ToList();
if (mythings.Any(t => isGood(t, myparams)))
{
    return false;
}
This works, but seems inefficient, since the whole list has to be generated in every case.
And I don't think I can do a DeleteAllOnSubmit with the result.
I could do a foreach over mythings instead of calling ToList(); that also works. Seems inelegant though.
What other options do I have, and what would be the recommended approach here?
Edit:
Calling AsEnumerable() seems to be another option, and seems better than ToList() at least, i.e.
dataContext.Things.Where(t => t.type == mytype).AsEnumerable().Any(t => isGood(t, myparams));

Pulling the whole list back from the database to run a local C# function on it might seem inefficient, but that's what you have to do if your isGood() function is local.
If you can express your isGood() logic as a LINQ expression, you could apply it before the ToList() call, so it would be translated into SQL and the whole list wouldn't be retrieved.
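As a rough illustration of "translating isGood into Linq": the rule can be returned as an expression tree instead of being run as a compiled method. Everything below is hypothetical, since the question never says what isGood checks; the Thing class, its Score property and the threshold rule are invented, and whether a real rule translates depends on the provider supporting the operations it uses.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public sealed class Thing
{
    public int Type { get; set; }
    public int Score { get; set; }
}

public static class GoodThings
{
    // Hypothetical rule: a thing is "good" when its score reaches a threshold.
    // Returning Expression<Func<...>> keeps the rule visible to the provider.
    public static Expression<Func<Thing, bool>> IsGood(int minScore)
    {
        return t => t.Score >= minScore;
    }

    // The negated rule, written out so it is also a plain expression tree.
    public static Expression<Func<Thing, bool>> IsBad(int minScore)
    {
        return t => t.Score < minScore;
    }

    // Any(...) takes the expression, so the whole check stays one query.
    public static bool AnyGood(IQueryable<Thing> things, int type, int minScore)
    {
        return things.Where(t => t.Type == type).Any(IsGood(minScore));
    }

    // Still an IQueryable, so it can be handed on to further query operators.
    public static IQueryable<Thing> Bad(IQueryable<Thing> things, int type, int minScore)
    {
        return things.Where(t => t.Type == type).Where(IsBad(minScore));
    }
}
```

The key difference from the original attempt is that the expression is passed as the argument of Where/Any rather than called inside another lambda, so the provider sees the comparison itself and not a method call it cannot translate.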

Implementing IQueryable.Count

I'm working on an IQueryable provider. In my IQueryProvider I have the following code:
public TResult Execute<TResult>(Expression expression)
{
    var query = GetQueryText(expression);
    // Call the Web service and get the results.
    var items = myWebService.Select<TResult>(query);
    IQueryable<TResult> queryableItems = items.AsQueryable<TResult>();
    return (TResult)queryableItems;
}
GetQueryText does all the leg work and works out the query string for the expression tree. This is all working well, so Where, OrderBy and Take are sorted. The webservice supports a count query using the following:
int count = myWebService.Count(query);
But I can't get my head round where I put this in the IQueryable or IQueryProvider.
I've basically worked from reading tutorials and open source examples, but can't seem to find one that does Count.

The answer appears simpler than I first thought. This blog post helped:
The Execute method is the entry point into your provider for actually executing query expressions. Having an explicit execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences. For example, the query “myquery.Count()” returns a single integer. The expression tree for this query is a method call to the Count method that returns the integer. The Queryable.Count method (as well as the other aggregates and the like) use this method to execute the query ‘right now’.
I did some debugging: for a query of myContext.Where(x => x.var1 > 5), Execute is called and TResult is an IEnumerable<MyClass>.
For myContext.Where(x => x.var1 > 5).Count(), Execute is called and TResult is an int.
So my Execute method just needs to return the appropriate type in each case.
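One way Execute might tell the two cases apart is to look at the root of the expression tree. The helper below is a sketch, not the code I ended up with; CountDetection is a made-up name, and it only recognises the parameterless Queryable.Count (the Count(predicate) overload would need its second argument handled as well).

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class CountDetection
{
    // True when the root of the tree is a call to Queryable.Count.
    public static bool IsCountCall(Expression expression)
    {
        return expression is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Queryable)
            && call.Method.Name == nameof(Queryable.Count);
    }

    // The query the count is applied to (the first argument of Count).
    public static Expression CountSource(MethodCallExpression countCall)
    {
        return countCall.Arguments[0];
    }

    // Inside Execute<TResult>, a provider could then branch roughly like this
    // (myWebService and GetQueryText are the members from the question):
    //   if (CountDetection.IsCountCall(expression))
    //   {
    //       var source = CountDetection.CountSource((MethodCallExpression)expression);
    //       return (TResult)(object)myWebService.Count(GetQueryText(source));
    //   }
}
```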

What's the difference between these LINQ queries?

I use LINQ-SQL as my DAL, I then have a project called DB which acts as my BLL. Various applications then access the BLL to read / write data from the SQL Database.
I have these methods in my BLL for one particular table:
public IEnumerable<SystemSalesTaxList> Get_SystemSalesTaxList()
{
    return from s in db.SystemSalesTaxLists
           select s;
}
public SystemSalesTaxList Get_SystemSalesTaxList(string strSalesTaxID)
{
    return Get_SystemSalesTaxList().Where(s => s.SalesTaxID == strSalesTaxID).FirstOrDefault();
}
public SystemSalesTaxList Get_SystemSalesTaxListByZipCode(string strZipCode)
{
    return Get_SystemSalesTaxList().Where(s => s.ZipCode == strZipCode).FirstOrDefault();
}
All pretty straightforward, I thought.
Get_SystemSalesTaxListByZipCode is always returning a null value though, even when it has a ZIP Code that exists in that table.
If I write the method like this, it returns the row I want:
public SystemSalesTaxList Get_SystemSalesTaxListByZipCode(string strZipCode)
{
    var salesTax = from s in db.SystemSalesTaxLists
                   where s.ZipCode == strZipCode
                   select s;
    return salesTax.FirstOrDefault();
}
Why does the other method not return the same, as the query should be identical ?
Note that, the overloaded Get_SystemSalesTaxList(string strSalesTaxID) returns a record just fine when I give it a valid SalesTaxID.
Is there a more efficient way to write these "helper" type classes ?
Thanks!

This is probably down to the different ways LINQ handles IEnumerable<T> and IQueryable<T>.
You have declared Get_SystemSalesTaxList as returning IEnumerable<SystemSalesTaxList>. That means that when, in your first code sample, you apply the Where operator to the results of Get_SystemSalesTaxList, it gets resolved to the Enumerable.Where extension method. (Note that what matters is the declared type. Yes, at runtime Get_SystemSalesTaxList is returning an IQueryable<SystemSalesTaxList>, but its declared type -- what the compiler sees -- is IEnumerable<SystemSalesTaxList>.) Enumerable.Where runs the specified .NET predicate over the target sequence. In this case, it iterates over all the SystemSalesTaxList objects returned by Get_SystemSalesTaxList, yielding the ones where the ZipCode property equals the specified zip code string (using the .NET String == operator).
But in your last code sample, you apply the Where operator to db.SystemSalesTaxLists, whose declared type (Table<SystemSalesTaxList> in LINQ to SQL) implements IQueryable<SystemSalesTaxList>. So the Where operator in that sample gets resolved to Queryable.Where, which translates the specified predicate expression to SQL and runs it on the database.
So what's different in the zip code methods is that the first one runs the C# s.ZipCode == strZipCode test in .NET, and the second translates that into a SQL query WHERE ZipCode = 'CA 12345' (parameterised SQL really but you get the idea). Why do these give different results? Hard to be sure, but the C# == predicate is case-sensitive, and depending on your collation settings the SQL may or may not be case-sensitive. So my suspicion is that strZipCode doesn't match the database zip codes in case, but in the second version SQL Server collation is smoothing this over.
The best solution is probably to change the declaration of Get_SystemSalesTaxList to return IQueryable<SystemSalesTaxList>. The major benefit of this is that it means queries built on Get_SystemSalesTaxList will be executed database side. At the moment, your methods are pulling back EVERYTHING in the database table and filtering it client side. Changing the declaration will get your queries translated to SQL and they will run much more efficiently, and hopefully will solve your zip code issue into the bargain.
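If the case-sensitivity suspicion is right, the difference between the two code paths comes down to something like the following. This is only a sketch of the hypothesis: ZipCodeComparison is an invented name, and the ignore-case comparison merely approximates what a case-insensitive collation would accept.

```csharp
using System;

public static class ZipCodeComparison
{
    // What Enumerable.Where effectively evaluates for s.ZipCode == strZipCode:
    // an ordinal, case-sensitive string comparison.
    public static bool OrdinalEquals(string a, string b)
    {
        return a == b;
    }

    // Rough stand-in for a case-insensitive database collation.
    public static bool IgnoreCaseEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

With "ca 12345" against a stored "CA 12345", the first comparison fails and the second succeeds, which would match one method returning null and the other returning the row.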

The real issue here is the use of IEnumerable<T>, which breaks "composition" of queries; this has two effects:
1. you are reading all (or at least, more than you need) of your table each time, even if you ask for a single row
2. you are running LINQ-to-Objects rules, so case-sensitivity applies
Instead, you want to be using IQueryable<T> inside your data layer, allowing you to combine multiple queries with additional Where, OrderBy, Skip, Take, etc as needed and have it build the TSQL to match (and use your db's case-sensitivity rules).
Is there a more efficient way to write these "helper" type classes ?
For something more efficient (less code to debug, doesn't stream the entire table, and makes better use of the identity map to short-circuit additional lookups via FirstOrDefault etc.):
public IEnumerable<SystemSalesTaxList> Get_SystemSalesTaxList()
{
    return db.SystemSalesTaxLists;
}
public SystemSalesTaxList Get_SystemSalesTaxList(string salesTaxID)
{
    return db.SystemSalesTaxLists.FirstOrDefault(s => s.SalesTaxID == salesTaxID);
}
public SystemSalesTaxList Get_SystemSalesTaxListByZipCode(string zipCode)
{
    return db.SystemSalesTaxLists.FirstOrDefault(s => s.ZipCode == zipCode);
}
