Linq query is not executed until the sequence returned by the query is actually iterated.
I have a query that is used repeatedly, so I am going to encapuslate it inside a method. I'd like to know if it interferes with the deferred execution.
If I encapsulate a Linq query into a method like below,
the query gets executed at line 2, not line 1 where the method is called. Is this correct?
public IEnumerable<Person> GetOldPeopleQuery()
{
return personList.Where(p => p.Age > 60);
}
public void SomeOtherMethod()
{
var getWomenQuery = GetOldPeopleQuery().Where(p => p.Gender == "F"); //line 1
int numberOfOldWomen = getWomanQuery.Count(); //line 2
}
P.S. I am using Linq-To-EF, if it makes any difference.
The query is lazy evaluated when you enumerate the result the first time, that is not affected by putting it inside a method.
There is however another thing in your code that will be very inefficient. Once you've returned an IEnumerable the next linq statement applied to the collection will be a linq-to-objects query. That means that in your case you will load all old people from the database and then filter out the women in memory. The same with the count, it will be done in memory.
If you instead return an IQueryable<Person> those two questions will be evaluated using linq-to-entities and the filtering and summing can be done in the database.
Related
I am trying to get data from linq in asp.net core. I have a table with a Position with a FacultyID field, how do I get it from the Position table with an existing userid. My query
var claimsIdentity = _httpContextAccessor.HttpContext.User.Identity as ClaimsIdentity;
var userId = claimsIdentity.FindFirst(ClaimTypes.NameIdentifier)?.Value.ToString();
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).???;
What can I add after the mark? to get the data. Thank you so much
There are several things you can do. An example in your case would be:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).FirstOrDefault();
If you expect more than 1 results, then you would do:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).ToList();
You have to be aware of the difference between a query and the result of a query.
The query does not represent the data itself, it represents the potential to fetch some data.
If you look closely to the LINQ methods, you will find there are two groups: the LINQ methods that return IQueryable<...> and the others.
The IQueryable methods don't execute the query. These functions are called lazy, they use deferred execution. You can find these terms in the remarks section of every LINQ method.
As long as you concatenate IQueryable LINQ methods, the query is not executed. It is not costly to concatenate LINQ methods in separate statements.
The query is executed as soon as you start enumerating the query. At its lowest level this is done using GetEnumerator and MoveNext / Current:
IQueryable<Customer> customers = ...; // Query not executed yet!
// execute the query and process the fetched data
using (IEnumerator<Customer> enumerator = customers.GetEnumerator())
{
while(enumerator.MoveNext())
{
// there is a Customer, it is in property Current:
Customer customer = enumerator.Current;
this.ProcessFetchedCustomer(customer);
}
}
This code, or something very similar is done when you use foreach, or one of the LINQ methods that don't return IQueryable<...>, like ToList, ToDictionary, FirstOrDefault, Sum, Any, ...
var data = dbContext.Positions
.Where(p => p.UserID.ToString() == userId)
.Select(x => x.FacultyID);
If you use your debugger, you will see that data is an IQueryable<Position>. You'll have to use one of the other LINQ methods to execute the query.
To get all Positions in the query:
List<Position> fetchedPositions result = data.ToList();
If you expect only one position:
Position fetchedPosition = data.FirstOrDefault();
If you want to know if there is any position at all:
if (positionAvailable = data.Any())
{
...
}
Be aware: if you use the IQueryable, the data will be fetched again from the DbContext. So if you want to do all three statements efficiently these, make sure you don't use the original data three times:
List<Position> fetchedPositions result = data.ToList();
Position firstPosition = fetchedPostion.FirstOrDefault();
if (firstPosition != null)
{
ProcessPosition(firstPosition);
}
I am getting the error about LINQ to Entities does not recognize the method 'System.String Format but in the past I was able to do this when I have included .AsEnumerable() is there something different I need to do because of the GroupBy section?
select new PresentationLayer.Models.PanelMeeting
{
GroupId = pg.GroupId,
MeetingId = pmd.MeetingId,
GuidelineName = pmv.GuidelineName,
PanelDisclosuresAttendanceURL = string.Format("{0}?MeetingId={1}&GroupId=0",PanelDisclosureLink, pmd.MeetingId),
}).GroupBy(g => new
{
g.MeetingId,
g.GroupId
})
.AsEnumerable()
.SelectMany(grp => grp.AsEnumerable()).ToList(),
You have to be aware of the difference between an IEnumerable<...> and an IQueryable<...>.
IEnumerable
An object that implements IEnumerable<...> represents a sequence of similar items. You can ask for the first element of the sequence, and as long as you've got elements you can ask for the next element. IEnumerable objects are supposed to be executed within your own process. IEnumerable objects hold everything to enumerate the sequence.
At its lowest level, this is done using GetEnumerator() / MoveNext() / Current:
IEnumerable<Customer> customers = ...
IEnumerator<Customer> enumerator = customers.GetEnumerator();
while (enumerator.MoveNext())
{
// There is a next Customer
Customer customer = enumerator.Current;
ProcessCustomer(customer);
}
If you use foreach, then internally GetEnumerator / MoveNext / Current are called.
If you look closely to LINQ, you will see that there are two groups of LINQ methods. Those that return IEnumerable<TResult> and those that dont't return IEnumerable<...>
LINQ functions from the first group won't enumerate the query. They use deferred execution, or lazy execution. In the comments section of every LINQ method, you'll find this description.
The LINQ functions of the other group will execute the query. If you look at the reference source of extension class Enumerable, you'll see that they internally use foreach, or at lower level use GetEnumerator / MoveNext / Current
IQueryable
An object that implements IQueryable<...> seems like an IEnumerable. However, it represents the potential to fetch data for an Enumerable sequence. The data is usually provided by a different process.
For this, the IQueryable holds an Expression and a Provider. The Expression represents what must be fetched in some generic format. The Provider knows who will provide the data (usually a database management system) and how to communicate with this DBMS (usually SQL).
When you start enumerating the sequence, deep inside using GetEnumerator, the Expression is sent to the Provider, who will try to translate it into SQL. The data is fetched from the DBMS, and returned as an Enumerable object. The fetched data is accessed by repeatedly calling MoveNext / Current.
Because the database is not contacted until you start enumerating, you'll have to keep the connection to the database open until you've finished enumerating. You've probably made the following mistake once:
IQueryable<Customer> customers;
using (var dbContext = new OrderDbContext(...))
{
customers = dbContext.Customers.Where(customer => customer...);
}
var fetchedCustomers = customers.ToList();
Back to your question
In your query, you use string.Format(...). Your Provider doesn't know how to translate this method into SQL. Your Provider also doesn't know any of your local methods. In fact, there are even several standard LINQ methods that are not supported by LINQ to entities. See Supported and Unsupported LINQ methods.
How to solve the problem?
If you need to call unsupported methods, you can use AsEnumerable to fetch the data. All LINQ methods after AsEnumerable are executed by your own process. Hence you can call any of your own functions.
Database Management systems are extremely optimized in table handling. One of the slower parts of a database query is the transport of the selected data to your local process. Hence, let the DBMS do all selecting, try to transport as little data as possible.
So let your DBMS do your Where / (Group-)Join / Sum / FirstOrDefault / Any etc. String formatting can be done best by you.
In your String.Format you use PanelDisclosureLink and pmd.MeetingId. It will probably be faster if your process does the formatting. Alas you forgot to give us the beginning or your query.
I'm not sure where your PanelDisclosureLink comes from. Is it a local variable? If that is the case, then PanelDisclosuresAttendanceURL will be the same string for every item in your group. Is this intended?
var panelDisclosureLine = ...;
var result = dbContext... // probably some joining with Pgs, Pmds and Pmvs,
.Select(... => new
{
GroupId = pg.GroupId,
MeetingId = pmd.MeetingId,
GuidelineName = pmv.GuidelineName,
})
// make groups with same combinations of [MeetingId, GroupId]
.GroupBy(joinResult => new
{
MeetingId = joinResult.MeetingId,
GroupId = joinResult.GroupId,
},
// parameter resultSelector: use the Key, and all JoinResult items that have this key
// to make one new:
(key, joinResultItemsWithThisKey) => new
{
MeetingId = key.MeetingId,
GroupId = key.GroupId,
GuideLineNames = joinResultsItemsWithThisKey
.Select(joinResultItem => joinResultItem.GuideLineName)
.ToList(),
})
So by now the DBMS has transformed your join result into objects with
[MeetingId, GroupId] combinations and a list of all GuideLineNames that have belong to
this [MeetingId, GroupId] combination.
Now you can move it to your local process and use String.Format.
.AsEnumerable()
.SelectMany (fetchedItem => fetchedItem.GuideLineNames,
(fetchedItem, guideLineName) => PresentationLayer.Models.PanelMeeting
{
GroupId = fetchedItem.GroupId,
MeetingId = fetchedItem.MeetingId,
GuidelineName = guidelineName,
PanelDisclosuresAttendanceURL = string.Format("...",
PanelDisclosureLink,
fetchedItem.MeetingId);
Note: in my parameter choice plurals are collections; singulars are elements of these collections.
PanelDisclosuresAttendanceURL = string.Format("{0}?MeetingId={1}&GroupId=0",PanelDisclosureLink, pmd.MeetingId),
}).
.GroupBy
If you want to use string.Format you first have to get the data from the server.
You can just move the .GroupBy( ... ) and then the .AsEnumerable() call to the top, before select new PresentationLayer.Models.PanelMeeting { ... }. If you are not selecting too much data that way...
I have the following methods:
private IEnumerable<CTNTransactionsView> RetrieveCTNTransactionsNotInTLS() {
IQueryable<int> talismanIdCollection = this._cc.TLSTransactionView.Select(x => x.kSECSYSTrans);
return this._cc.CTNTransactionView
.Where(x => !talismanIdCollection.Contains(x.kSECSYSTrans));
}
public IEnumerable<CTNTransactionsView> RetrieveCTNTransactionsNotInTLSPast24Hours() {
DateTime previousDate = DateTime.Now.Date.AddDays(-1.0);
return this.RetrieveCTNTransactionsNotInTLS()
.Where(x => x.dSECSYSTimeStamp >= previousDate);
}
public IEnumerable<CTNTransactionsView> RetrieveCTNTransactionsNotInTLSPast24HoursVersionTwo() {
DateTime previousDate = DateTime.Now.Date.AddDays(-1.0);
IQueryable<int> talismanIdCollection = this._cc.TLSTransactionView
.Select(x => x.kSECSYSTrans);
return this._cc.CTNTransactionView
.Where(x => !talismanIdCollection.Contains(x.kSECSYSTrans))
.Where(x=> x.dSECSYSTimeStamp >= previousDate);
}
For some reason, the SQL output generated by Entity Framework 6 does not match the results.
The RetrieveCTNTransactionsNotInTLSPast24HoursVersionTwo() method will properly give a SQL Output Statement that has the following:
select ...... from ... where ... AND ([Extent1].[dSECSYSTimeStamp] >= #p__linq__0)}
The other one does not have the filter for the dSECSYSTimeStamp when I View the SQL Statement Output.
The methods I am comparing are the RetrieveCTNTransactionsNotInTLSPast24Hours() and the RetrieveCTNTransactionsNotInTLSPast24HoursVersionTwo().
I have compared the SQL using VS as well as attaching a Debug.Writeline() to the Database.Log in the context.
From debugging and looking at the SQL output, one seems to contain the date filter whereas the other doesn't and yet they both provide the correct result.
I have tried looking at the SQL (by breakpointing and seeing the output) from using the following:
System.Diagnostics.Debug.WriteLine("Running first method");
var result = this.repo.RetrieveCTNTransactionsNotInTLSPast24Hours();
var count = result.Count();
System.Diagnostics.Debug.WriteLine("Running Second method");
var resultTwo = this.repo.RetrieveCTNTransactionsNotInTLSPast24HoursVersionTwo();
var count2 = resultTwo.Count();
I am using EF 6.0.
Note: The results are the same as both do exactly the same thing and output the same result. However, I am curious and would like to understand why the SQL Generated isn't the same?
The issue is you are returning an IEnumerable from your method. If you do this, you force SQL to run the query (unfiltered) and then use C# to then run the second query. Alter your internal query to return an IQueryable. This will allow the unexecuted expression tree to be passed into the second query, which will then be evaluated when your run it.
i.e.
private IQueryable<CTNTransactionsView> RetrieveCTNTransactionsNotInTLS()
You should then get the same SQL.
I think, you just define a query expression, but not use it immediately, in this time, I will add .ToList() in the Linq expression's end, but you have to change method's return type into a right type like List<kSECSYSTrans> before this operation. I just seldom use the type IEnumerable.
Imagine the following classes:
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
class Underage
{
public int Age { get; set; }
}
And I do something like this:
var underAge = db.Underage.Select(u => u.Age) .ToList()/AsEnumerable()/AsQueryable()
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
What is the best option? If I call ToList(), the items will be retrieved once, but If I choose AsEnumerable or AsQueryable will they be executed everytime a person gets selected from database in the Where() (if that's how it works, I don't know much about what a database does in the background)?
Also is there a bigger difference when Underage would contain thousands of records vs a small amount?
None.
You definitely don't want ToList() here, as that would load all of the matching values into memory, and then use it to write a query for result that had a massive IN (…) clause in it. That's a waste when what you really want is to just get the matching values out of the database in the first place, with a query that is something like
SELECT *
FROM Persons
WHERE EXISTS(
SELECT NULL FROM
FROM Underage
WHERE Underage.age = Persons.age
)
AsEnumerable() often prevents things being done on the database, though providers will likely examine the source and undo the effects of that AsEnumerable(). AsQueryable() is fine, but doesn't actually do anything in this case.
You don't need any of them:
var underAge = db.Underage.Select(u => u.Age);
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
(For that matter, check you really do need that ToList() in the last line; if you're going to do further queries on it, it'll likely hurt them, if you're just going to enumerate through the results it's a waste of time and memory, if you're going to store them or do lots of in-memory operations that can't be done in Linq it'll be for the better).
If you just leave your first query as
var underage = db.Underage.Select(u => u.Age);
It will be of type IQueryable and it will not ping your database ( acutaly called deferred execution). Once you actually want to execute a call to the database you can use a greedy operator such as .ToList(), .ToArray(), .ToDictionary(). This will give your result variable an IEnumerable collection.
See linq deferred execution
Your should use join to fetch the data.
var query = (from p in db.Persons
join u in db.Underage
on u.Age equals p.Age
select p).ToList();
If you not use .ToList() in above query it return IEnumerable of type Person and actual data will be fetch when you use the object.
All are equally good ToList(), AsEnurmeable and Querable based on the scenario you want to use. For your case ToList looks good for me as you just want to fetch list of person have underage
Suppose I have an IQueryable<T> expression that I'd like to encapsulate the definition of, store it and reuse it or embed it in a larger query later. For example:
IQueryable<Foo> myQuery =
from foo in blah.Foos
where foo.Bar == bar
select foo;
Now I believe that I can just keep that myQuery object around and use it like I described. But some things I'm not sure about:
How best to parameterize it? Initially I've defined this in a method and then returned the IQueryable<T> as the result of the method. This way I can define blah and bar as method arguments and I guess it just creates a new IQueryable<T> each time. Is this the best way to encapsulate the logic of an IQueryable<T>? Are there other ways?
What if my query resolves to a scalar rather than IQueryable? For instance, what if I want this query to be exactly as shown but append .Any() to just let me know if there were any results that matched? If I add the (...).Any() then the result is bool and immediately executed, right? Is there a way to utilize these Queryable operators (Any, SindleOrDefault, etc.) without immediately executing? How does LINQ-to-SQL handle this?
Edit: Part 2 is really more about trying to understand what are the limitation differences between IQueryable<T>.Where(Expression<Func<T, bool>>) vs. IQueryable<T>.Any(Expression<Func<T, bool>>). It seems as though the latter isn't as flexible when creating larger queries where the execution is to be delayed. The Where() can be appended and then other constructs can be later appended and then finally executed. Since the Any() returns a scalar value it sounds like it will immediately execute before the rest of the query can be built.
You have to be really careful about passing around IQueryables when you're using a DataContext, because once the context get's disposed you won't be able to execute on that IQueryable anymore. If you're not using a context then you might be ok, but be aware of that.
.Any() and .FirstOrDefault() are not deferred. When you call them they will cause execution to occur. However, this may not do what you think it does. For instance, in LINQ to SQL if you perform an .Any() on an IQueryable it acts as a IF EXISTS( SQL HERE ) basically.
You can chain IQueryable's along like this if you want to:
var firstQuery = from f in context.Foos
where f.Bar == bar
select f;
var secondQuery = from f in firstQuery
where f.Bar == anotherBar
orderby f.SomeDate
select f;
if (secondQuery.Any()) //immediately executes IF EXISTS( second query in SQL )
{
//causes execution on second query
//and allows you to enumerate through the results
foreach (var foo in secondQuery)
{
//do something
}
//or
//immediately executes second query in SQL with a TOP 1
//or something like that
var foo = secondQuery.FirstOrDefault();
}
A much better option than caching IQueryable objects is to cache Expression trees. All IQueryable objects have a property called Expression (I believe), which represents the current expression tree for that query.
At a later point in time, you can recreate the query by calling queryable.Provider.CreateQuery(expression), or directly on whatever the provider is (in your case a Linq2Sql Data Context).
Parameterizing these expression trees is slightly harder however, as they use ConstantExpressions to build in a value. In order to parameterize these queries, you WILL have to rebuild the query every time you want different parameters.
Any() used this way is deferred.
var q = dc.Customers.Where(c => c.Orders.Any());
Any() used this way is not deferred, but is still translated to SQL (the whole customers table is not loaded into memory).
bool result = dc.Customers.Any();
If you want a deferred Any(), do it this way:
public static class QueryableExtensions
{
public static Func<bool> DeferredAny<T>(this IQueryable<T> source)
{
return () => source.Any();
}
}
Which is called like this:
Func<bool> f = dc.Customers.DeferredAny();
bool result = f();
The downside is that this technique won't allow for sub-querying.
Create a partial application of your query inside an expression
Func[Bar,IQueryable[Blah],IQueryable[Foo]] queryMaker =
(criteria, queryable) => from foo in queryable.Foos
where foo.Bar == criteria
select foo;
and then you can use it by ...
IQueryable[Blah] blah = context.Blah;
Bar someCriteria = new Bar();
IQueryable[Foo] someFoosQuery = queryMaker(blah, someCriteria);
The query could be encapsulated within a class if you want to make it more portable / reusable.
public class FooBarQuery
{
public Bar Criteria { get; set; }
public IQueryable[Foo] GetQuery( IQueryable[Blah] queryable )
{
return from foo in queryable.Foos
where foo.Bar == Criteria
select foo;
}
}