How to maintain LINQ deferred execution? - c#

Suppose I have an IQueryable<T> expression that I'd like to encapsulate the definition of, store it and reuse it or embed it in a larger query later. For example:
IQueryable<Foo> myQuery =
from foo in blah.Foos
where foo.Bar == bar
select foo;
Now I believe that I can just keep that myQuery object around and use it like I described. But some things I'm not sure about:
How best to parameterize it? Initially I've defined this in a method and then returned the IQueryable<T> as the result of the method. This way I can define blah and bar as method arguments and I guess it just creates a new IQueryable<T> each time. Is this the best way to encapsulate the logic of an IQueryable<T>? Are there other ways?
What if my query resolves to a scalar rather than IQueryable? For instance, what if I want this query to be exactly as shown but append .Any() to just let me know if there were any results that matched? If I add the (...).Any() then the result is bool and immediately executed, right? Is there a way to utilize these Queryable operators (Any, SindleOrDefault, etc.) without immediately executing? How does LINQ-to-SQL handle this?
Edit: Part 2 is really more about trying to understand what are the limitation differences between IQueryable<T>.Where(Expression<Func<T, bool>>) vs. IQueryable<T>.Any(Expression<Func<T, bool>>). It seems as though the latter isn't as flexible when creating larger queries where the execution is to be delayed. The Where() can be appended and then other constructs can be later appended and then finally executed. Since the Any() returns a scalar value it sounds like it will immediately execute before the rest of the query can be built.

You have to be really careful about passing around IQueryables when you're using a DataContext, because once the context get's disposed you won't be able to execute on that IQueryable anymore. If you're not using a context then you might be ok, but be aware of that.
.Any() and .FirstOrDefault() are not deferred. When you call them they will cause execution to occur. However, this may not do what you think it does. For instance, in LINQ to SQL if you perform an .Any() on an IQueryable it acts as a IF EXISTS( SQL HERE ) basically.
You can chain IQueryable's along like this if you want to:
var firstQuery = from f in context.Foos
where f.Bar == bar
select f;
var secondQuery = from f in firstQuery
where f.Bar == anotherBar
orderby f.SomeDate
select f;
if (secondQuery.Any()) //immediately executes IF EXISTS( second query in SQL )
{
//causes execution on second query
//and allows you to enumerate through the results
foreach (var foo in secondQuery)
{
//do something
}
//or
//immediately executes second query in SQL with a TOP 1
//or something like that
var foo = secondQuery.FirstOrDefault();
}

A much better option than caching IQueryable objects is to cache Expression trees. All IQueryable objects have a property called Expression (I believe), which represents the current expression tree for that query.
At a later point in time, you can recreate the query by calling queryable.Provider.CreateQuery(expression), or directly on whatever the provider is (in your case a Linq2Sql Data Context).
Parameterizing these expression trees is slightly harder however, as they use ConstantExpressions to build in a value. In order to parameterize these queries, you WILL have to rebuild the query every time you want different parameters.

Any() used this way is deferred.
var q = dc.Customers.Where(c => c.Orders.Any());
Any() used this way is not deferred, but is still translated to SQL (the whole customers table is not loaded into memory).
bool result = dc.Customers.Any();
If you want a deferred Any(), do it this way:
public static class QueryableExtensions
{
public static Func<bool> DeferredAny<T>(this IQueryable<T> source)
{
return () => source.Any();
}
}
Which is called like this:
Func<bool> f = dc.Customers.DeferredAny();
bool result = f();
The downside is that this technique won't allow for sub-querying.

Create a partial application of your query inside an expression
Func[Bar,IQueryable[Blah],IQueryable[Foo]] queryMaker =
(criteria, queryable) => from foo in queryable.Foos
where foo.Bar == criteria
select foo;
and then you can use it by ...
IQueryable[Blah] blah = context.Blah;
Bar someCriteria = new Bar();
IQueryable[Foo] someFoosQuery = queryMaker(blah, someCriteria);
The query could be encapsulated within a class if you want to make it more portable / reusable.
public class FooBarQuery
{
public Bar Criteria { get; set; }
public IQueryable[Foo] GetQuery( IQueryable[Blah] queryable )
{
return from foo in queryable.Foos
where foo.Bar == Criteria
select foo;
}
}

Related

What collection should I use in a linq-to-sql query? Queryable vs Enumerable vs List

Imagine the following classes:
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
class Underage
{
public int Age { get; set; }
}
And I do something like this:
var underAge = db.Underage.Select(u => u.Age) .ToList()/AsEnumerable()/AsQueryable()
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
What is the best option? If I call ToList(), the items will be retrieved once, but If I choose AsEnumerable or AsQueryable will they be executed everytime a person gets selected from database in the Where() (if that's how it works, I don't know much about what a database does in the background)?
Also is there a bigger difference when Underage would contain thousands of records vs a small amount?
None.
You definitely don't want ToList() here, as that would load all of the matching values into memory, and then use it to write a query for result that had a massive IN (…) clause in it. That's a waste when what you really want is to just get the matching values out of the database in the first place, with a query that is something like
SELECT *
FROM Persons
WHERE EXISTS(
SELECT NULL FROM
FROM Underage
WHERE Underage.age = Persons.age
)
AsEnumerable() often prevents things being done on the database, though providers will likely examine the source and undo the effects of that AsEnumerable(). AsQueryable() is fine, but doesn't actually do anything in this case.
You don't need any of them:
var underAge = db.Underage.Select(u => u.Age);
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
(For that matter, check you really do need that ToList() in the last line; if you're going to do further queries on it, it'll likely hurt them, if you're just going to enumerate through the results it's a waste of time and memory, if you're going to store them or do lots of in-memory operations that can't be done in Linq it'll be for the better).
If you just leave your first query as
var underage = db.Underage.Select(u => u.Age);
It will be of type IQueryable and it will not ping your database ( acutaly called deferred execution). Once you actually want to execute a call to the database you can use a greedy operator such as .ToList(), .ToArray(), .ToDictionary(). This will give your result variable an IEnumerable collection.
See linq deferred execution
Your should use join to fetch the data.
var query = (from p in db.Persons
join u in db.Underage
on u.Age equals p.Age
select p).ToList();
If you not use .ToList() in above query it return IEnumerable of type Person and actual data will be fetch when you use the object.
All are equally good ToList(), AsEnurmeable and Querable based on the scenario you want to use. For your case ToList looks good for me as you just want to fetch list of person have underage

Buffering an expression via property in LINQ (vs Lambda)

I have a large select expression to reuse in several classes. For the DRY principle I have chosen to create a property that returns the Expression to the caller code
protected virtual Expression<Func<SezioneJoin, QueryRow>> Select
{
get
{
return sj => new QueryRow
{
A01 = sj.A.A01,
A01a = sj.A.A01a,
A01b = sj.A.A01b,
A02 = sj.A.A02,
A03 = sj.A.A03,
A11 = sj.A.A11,
A12 = sj.A.A12,
A12a = sj.A.A12a,
A12b = sj.A.A12b,
A12c = sj.A.A12c,
A21 = sj.A.A21,
A22 = sj.A.A22,
..............
Lots of assignements
};
}
}
Now I can successfully use the property if I do
var query = dataContext.entity.Join(...).Where(x => ...).Select(Select);
But the following will not compile:
from SezioneJoin sj in (
from A a in ...
join D d in ... on new { ... } equals new { ... }
where
d.D13 == "086" &&
!String.IsNullOrEmpty(a.A32) && a.A32 != "086"
orderby a.A21
orderby a.prog
select new SezioneJoin{...})
select Select
Error is
Unable to cast 'System.Linq.IQueryable<System.Linq.Expressions.Expression<System.Func<DiagnosticoSite.Data.Query.SezioneJoin,DiagnosticoSite.Data.Query.QueryRow>>>' into 'System.Linq.IQueryable<DiagnosticoSite.Data.Query.QueryRow>'
I can understand that the LINQ syntax requires the body of the select statement to be the inner type of the IQueryable that it returns, so the compiler is fooled into returning a list of expressions. With the Lambda syntax, the expression is a parameter that is either compiled in-line or returned by some other method (even dynamically!).
I would like to ask if there is any way to circumvent this and avoid defining large select expressions inline
protected virtual Expression> Select
I'd avoid using the names of any of the Linq-mapped methods (Select, Where, GroupBy, OrderBy, OrderByDescending) as member names. It works in this case, but when it causes problems by matching the definitions for those it can be confusing if you aren't in the habit of just not using those names unless you deliberately want to override Linq.
On a related note. Consider that:
from var item in source select item.Something
is equivalent to:
source.Select(item => item.Something);
Therefore:
from SezioneJoin sj in (/*…*/) select Select;
is equivalent to:
(/*…*/).Select(sj => Select);
That is, you arent' creating a query that executes the expression in Select, but one that returns the expression itself.
You should either just use the form .Select(Select) or use select sj => (Select)(sj) but that second one will (if I even have the parentheses correct to stop it clashing with Queryable.Select, I haven't tested that) call the Select property every time so is at best wasteful and at worse not going to be something a query provider can make use of, so it will fail with most linq-providers. In all, use the .Select(Select) form (and change the name).
(On a separate note, if you're going to buffer an expression, actually buffer it; create a private Expression<Func<SezioneJoin, QueryRow>> once and return it in the property's getter, rather than creating it every time).
Simply use extension method in place of last LINQ select statement:
var query = from SezioneJoin sj ... select new SezioneJoin{...});
var projection = query.Select(Select);

Running a simple LINQ query in parallel

I'm still very new to LINQ and PLINQ. I generally just use loops and List.BinarySearch in a lot of cases, but I'm trying to get out of that mindset where I can.
public class Staff
{
// ...
public bool Matches(string searchString)
{
// ...
}
}
Using "normal" LINQ - sorry, I'm unfamiliar with the terminology - I can do the following:
var matchedStaff = from s
in allStaff
where s.Matches(searchString)
select s;
But I'd like to do this in parallel:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
When I check the type of matchedStaff, it's a list of bools, which isn't what I want.
First of all, what am I doing wrong here, and secondly, how do I return a List<Staff> from this query?
public List<Staff> Search(string searchString)
{
return allStaff.AsParallel().Select(/* something */).AsEnumerable();
}
returns IEnumerable<type>, not List<type>.
For your first question, you should just replace Select with Where :
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));
Select is a projection operator, not a filtering one, that's why you are getting an IEnumerable<bool> corresponding to the projection of all your Staff objects from the input sequence to bools returned by your Matches method call.
I understand it can be counter intuitive for you not to use select at all as it seems you are more familiar with the "query syntax" where select keyword is mandatory which is not the case using the "lambda syntax" (or "fluent syntax" ... whatever the naming), but that's how it is ;)
Projections operators, such a Select, are taking as input an element from the sequence and transform/projects this element somehow to another type of element (here projecting to bool type). Whereas filtering operators, such as Where, are taking as input an element from the sequence and either output the element as such in the output sequence or are not outputing the element at all, based on a predicate.
As for your second question, AsEnumerable returns an IEnumerable as it's name indicates ;)
If you want to get a List<Staff> you should rather call ToList() (as it's name indicates ;)) :
return allStaff.AsParallel().Select(/* something */).ToList();
Hope this helps.
There is no need to abandon normal LINQ syntax to achieve parallelism. You can rewrite your original query:
var matchedStaff = from s in allStaff
where s.Matches(searchString)
select s;
The parallel LINQ (“PLINQ”) version would be:
var matchedStaff = from s in allStaff.AsParallel()
where s.Matches(searchString)
select s;
To understand where the bools are coming from, when you write the following:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
That is equivalent to the following query syntax:
var matchedStaff = from s in allStaff.AsParallel() select s.Matches(searchString);
As stated by darkey, if you want to use the C# syntax instead of the query syntax, you should use Where():
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));

does putting Linq query inside a method affect deferred excecution?

Linq query is not executed until the sequence returned by the query is actually iterated.
I have a query that is used repeatedly, so I am going to encapuslate it inside a method. I'd like to know if it interferes with the deferred execution.
If I encapsulate a Linq query into a method like below,
the query gets executed at line 2, not line 1 where the method is called. Is this correct?
public IEnumerable<Person> GetOldPeopleQuery()
{
return personList.Where(p => p.Age > 60);
}
public void SomeOtherMethod()
{
var getWomenQuery = GetOldPeopleQuery().Where(p => p.Gender == "F"); //line 1
int numberOfOldWomen = getWomanQuery.Count(); //line 2
}
P.S. I am using Linq-To-EF, if it makes any difference.
The query is lazy evaluated when you enumerate the result the first time, that is not affected by putting it inside a method.
There is however another thing in your code that will be very inefficient. Once you've returned an IEnumerable the next linq statement applied to the collection will be a linq-to-objects query. That means that in your case you will load all old people from the database and then filter out the women in memory. The same with the count, it will be done in memory.
If you instead return an IQueryable<Person> those two questions will be evaluated using linq-to-entities and the filtering and summing can be done in the database.

C# - AsEnumerable Example

What is the exact use of AsEnumerable? Will it change non-enumerable collection to enumerable
collection?.Please give me a simple example.
From the "Remarks" section of the MSDN documentation:
The AsEnumerable<TSource> method has no effect
other than to change the compile-time
type of source from a type that
implements IEnumerable<T> to
IEnumerable<T> itself.
AsEnumerable<TSource> can be used to choose
between query implementations when a
sequence implements IEnumerable<T> but also has a different set
of public query methods available. For
example, given a generic class Table
that implements IEnumerable<T> and has its own methods such
as Where, Select, and SelectMany, a
call to Where would invoke the public
Where method of Table. A Table type
that represents a database table could
have a Where method that takes the
predicate argument as an expression
tree and converts the tree to SQL for
remote execution. If remote execution
is not desired, for example because
the predicate invokes a local method,
the AsEnumerable<TSource>
method can be used to hide the custom
methods and instead make the standard
query operators available.
If you take a look in reflector:
public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)
{
return source;
}
It basically does nothing more than down casting something that implements IEnumerable.
Nobody has mentioned this for some reason, but observe that something.AsEnumerable() is equivalent to (IEnumerable<TSomething>) something. The difference is that the cast requires the type of the elements to be specified explicitly, which is, of course, inconvenient. For me, that's the main reason to use AsEnumerable() instead of the cast.
AsEnumerable() converts an array (or list, or collection) into an IEnumerable<T> of the collection.
See http://msdn.microsoft.com/en-us/library/bb335435.aspx for more information.
From the above article:
The AsEnumerable<TSource>(IEnumerable<TSource>) method has no
effect other than to change the compile-time type of source from a type
that implements IEnumerable<T> to IEnumerable<T> itself.
After reading the answers, i guess you are still missing a practical example.
I use this to enable me to use linq on a datatable
var mySelect = from table in myDataSet.Tables[0].AsEnumerable()
where table["myColumn"].ToString() == "Some text"
select table;
AsEnumerable can only be used on enumerable collections. It just changes the type of the collection to IEnumerable<T> to access more easily the IEnumerable extensions.
No it doesn't change a non-enumerable collection to an enumerable one. What is does it return the collection back to you as an IEnumerable so that you can use it as an enumerable. That way you can use the object in conjunction with IEnumerable extensions and be treated as such.
Here's example code which may illustrate LukeH's correct explanation.
IEnumerable<Order> orderQuery = dataContext.Orders
.Where(o => o.Customer.Name == "Bob")
.AsEnumerable()
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
The first Where is Queryable.Where, which is translated into sql and run in the database (o.Customer is not loaded into memory).
The second Where is Enumerable.Where, which calls an in-memory method with an instance of something I don't want to send into the database.
Without the AsEnumerable method, I'd have to write it like this:
IEnumerable<Order> orderQuery =
((IEnumerable<Order>)
(dataContext.Orders.Where(o => o.Customer.Name == "Bob")))
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
Or
IEnumerable<Order> orderQuery =
Enumerable.Where(
dataContext.Orders.Where(o => o.Customer.Name == "Bob"),
(o => MyFancyFilterMethod(o, MyFancyObject));
Neither of which flow well at all.
static void Main()
{
/*
"AsEnumerable" purpose is to cast an IQueryable<T> sequence to IEnumerable<T>,
forcing the remainder of the query to execute locally instead of on database as below example so it can hurt performance. (bind Enumerable operators instead of Queryable).
In below example we have cars table in SQL Server and are going to filter red cars and filter equipment with some regex:
*/
Regex wordCounter = new Regex(#"\w");
var query = dataContext.Cars.Where(car=> article.Color == "red" && wordCounter.Matches(car.Equipment).Count < 10);
/*
SQL Server doesn’t support regular expressions therefore the LINQ-to-db providers will throw an exception: query cannot be translated to SQL.
TO solve this firstly we can get all cars with red color using a LINQ to SQL query,
and secondly filtering locally for Equipment of less than 10 words:
*/
Regex wordCounter = new Regex(#"\w");
IEnumerable<Car> sqlQuery = dataContext.Cars
.Where(car => car.Color == "red");
IEnumerable<Car> localQuery = sqlQuery
.Where(car => wordCounter.Matches(car.Equipment).Count < 10);
/*
Because sqlQuery is of type IEnumerable<Car>, the second query binds to the local query operators,
therefore that part of the filtering is run on the client.
With AsEnumerable, we can do the same in a single query:
*/
Regex wordCounter = new Regex(#"\w");
var query = dataContext.Cars
.Where(car => car.Color == "red")
.AsEnumerable()
.Where(car => wordCounter.Matches(car.Equipment).Count < 10);
/*
An alternative to calling AsEnumerable is ToArray or ToList.
*/
}
The Enumerable.AsEnumerable method can be used to hide a type's custom implementation of a standard query operator
Consider the following example. we have a custom List called MyList
public class MyList<T> : List<T>
{
public string Where()
{
return $"This is the first element {this[0]}";
}
}
MyList has a method called Where which is Enumerable.Where() exact same name. when I use it, actually I am calling my version of Where, not Enumerable's version
MyList<int> list = new MyList<int>();
list.Add(4);
list.Add(2);
list.Add(7);
string result = list.Where();
// the result is "This is the first element 4"
Now how can I find the elements which are less than 5 with the Enumerable's version of Where?
The answer is: Use AsEnumerable() method and then call Where
IEnumerable<int> result = list.AsEnumerable().Where(e => e < 5);
This time the result contains the list of elements that are less than 5

Categories