Do multiple evaluations of an IQueryable object hit the database multiple times? - c#

In the Entity Framework, if I do the following:
var count = myIQueryable.Count();
var list = myIQueryable.ToList();
does that hit the database twice, or just once?

In order to effectively count the number of entries, the framework needs to evaluate the query, thus hitting the Database. However, because the query may have changed between the Count() and ToList(), it must evaluate again. Consider the following:
var myIQueryable = my.db<SomeModel>(); // IQueryable<SomeModel>
var countQuery = myIQueryable.Count(); // 10
MakeAdditions(myIQueryable, 10); // adds 10 items
var list = myIQueryable.ToList(); // List<SomeModel> count=20
MakeAdditions(myIQueryable, 10);
var countList = list.Count(); // still 20, List never changed
Put another way, all calls against an IQueryable are still subject to the way it runs its queries. After capturing a query into a List, you are exclusively dealing with your in-memory List, independent of changes that occur to the IQueryable's data source.

Yes, it does hit the database twice, as both Count and ToList are eagerly evaluated. If you just want to access it once, do the following:
var list = myIQueryable.ToList();
var count = list.Count;
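The difference can be made visible without a database. The sketch below is only an in-memory stand-in: the iterator method Source and the counter Runs are hypothetical names, and each fresh enumeration of Source plays the role of one database round trip. It is not Entity Framework code, but the re-evaluation behaviour it shows is the same idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCounter
{
    // How many times the source has been enumerated from the start.
    public static int Runs;

    // Hypothetical stand-in for a database query: the body runs once per enumeration.
    public static IEnumerable<int> Source()
    {
        Runs++;
        for (int i = 0; i < 10; i++)
        {
            yield return i;
        }
    }

    // Count() and ToList() each enumerate the query on their own.
    public static int TwoEvaluations()
    {
        Runs = 0;
        IEnumerable<int> query = Source().Where(n => n % 2 == 0);
        int count = query.Count();
        List<int> list = query.ToList();
        return Runs;
    }

    // ToList() enumerates once; List<T>.Count is just a property read.
    public static int OneEvaluation()
    {
        Runs = 0;
        IEnumerable<int> query = Source().Where(n => n % 2 == 0);
        List<int> list = query.ToList();
        int count = list.Count;
        return Runs;
    }

    public static void Main()
    {
        Console.WriteLine(TwoEvaluations()); // 2
        Console.WriteLine(OneEvaluation());  // 1
    }
}
```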

Related

How to get data from linq

I am trying to get data with LINQ in ASP.NET Core. I have a Position table with a FacultyID field. How do I get the FacultyID from the Position table for an existing user ID? My query:
var claimsIdentity = _httpContextAccessor.HttpContext.User.Identity as ClaimsIdentity;
var userId = claimsIdentity.FindFirst(ClaimTypes.NameIdentifier)?.Value.ToString();
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).???;
What can I add in place of the ??? to get the data? Thank you so much
There are several things you can do. An example in your case would be:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).FirstOrDefault();
If you expect more than one result, then you would do:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).ToList();
You have to be aware of the difference between a query and the result of a query.
The query does not represent the data itself, it represents the potential to fetch some data.
If you look closely at the LINQ methods, you will find there are two groups: the LINQ methods that return IQueryable<...> and the others.
The IQueryable methods don't execute the query. These functions are called lazy, they use deferred execution. You can find these terms in the remarks section of every LINQ method.
As long as you concatenate IQueryable LINQ methods, the query is not executed. It is not costly to concatenate LINQ methods in separate statements.
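A small sketch of that laziness, assuming an in-memory source wrapped with AsQueryable() instead of a real DbContext. Names and Reads are hypothetical helpers used only to count how often the source is actually read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyComposition
{
    // Counts how many times the underlying source is really enumerated.
    public static int Reads;

    // Hypothetical data source; the body only runs when someone enumerates it.
    static IEnumerable<string> Names()
    {
        Reads++;
        yield return "Ann";
        yield return "Bob";
        yield return "Cy";
    }

    // Returns { reads before executing the query, reads after executing it }.
    public static int[] Run()
    {
        Reads = 0;

        // Building the query in separate statements executes nothing.
        IQueryable<string> query = Names().AsQueryable();
        query = query.Where(n => n.Length > 2);
        IQueryable<int> lengths = query.Select(n => n.Length);
        int before = Reads;

        // ToList does not return IQueryable, so it executes the query.
        List<int> fetched = lengths.ToList();
        int after = Reads;

        return new[] { before, after };
    }

    public static void Main()
    {
        int[] reads = Run();
        Console.WriteLine(reads[0]); // 0
        Console.WriteLine(reads[1]); // 1
    }
}
```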
The query is executed as soon as you start enumerating the query. At its lowest level this is done using GetEnumerator and MoveNext / Current:
IQueryable<Customer> customers = ...; // Query not executed yet!
// execute the query and process the fetched data
using (IEnumerator<Customer> enumerator = customers.GetEnumerator())
{
while(enumerator.MoveNext())
{
// there is a Customer, it is in property Current:
Customer customer = enumerator.Current;
this.ProcessFetchedCustomer(customer);
}
}
This code, or something very similar, is what runs when you use foreach, or one of the LINQ methods that don't return IQueryable<...>, like ToList, ToDictionary, FirstOrDefault, Sum, Any, ...
var data = dbContext.Positions
.Where(p => p.UserID.ToString() == userId)
.Select(x => x.FacultyID);
If you use your debugger, you will see that data is still an IQueryable (of FacultyID values, because of the Select), not the fetched data. You'll have to use one of the other LINQ methods to execute the query.
To get all FacultyIDs in the query:
var fetchedFacultyIds = data.ToList();
If you expect only one FacultyID:
var fetchedFacultyId = data.FirstOrDefault();
If you want to know if there is any position at all:
bool positionAvailable = data.Any();
if (positionAvailable)
{
...
}
Be aware: every time you use the IQueryable, the data will be fetched again from the database. So if you want to do all three statements efficiently, make sure you don't execute the original query three times:
var fetchedFacultyIds = data.ToList(); // the only database round trip
bool positionAvailable = fetchedFacultyIds.Any(); // in memory
var firstFacultyId = fetchedFacultyIds.FirstOrDefault(); // in memory
if (positionAvailable)
{
ProcessFacultyId(firstFacultyId); // your own method
}

LINQ using String Format

I am getting the error that LINQ to Entities does not recognize the method 'System.String Format. In the past I was able to do this when I included .AsEnumerable(). Is there something different I need to do because of the GroupBy section?
select new PresentationLayer.Models.PanelMeeting
{
GroupId = pg.GroupId,
MeetingId = pmd.MeetingId,
GuidelineName = pmv.GuidelineName,
PanelDisclosuresAttendanceURL = string.Format("{0}?MeetingId={1}&GroupId=0",PanelDisclosureLink, pmd.MeetingId),
}).GroupBy(g => new
{
g.MeetingId,
g.GroupId
})
.AsEnumerable()
.SelectMany(grp => grp.AsEnumerable()).ToList(),
You have to be aware of the difference between an IEnumerable<...> and an IQueryable<...>.
IEnumerable
An object that implements IEnumerable<...> represents a sequence of similar items. You can ask for the first element of the sequence, and as long as you've got elements you can ask for the next element. IEnumerable objects are supposed to be executed within your own process. IEnumerable objects hold everything to enumerate the sequence.
At its lowest level, this is done using GetEnumerator() / MoveNext() / Current:
IEnumerable<Customer> customers = ...
IEnumerator<Customer> enumerator = customers.GetEnumerator();
while (enumerator.MoveNext())
{
// There is a next Customer
Customer customer = enumerator.Current;
ProcessCustomer(customer);
}
If you use foreach, then internally GetEnumerator / MoveNext / Current are called.
If you look closely at LINQ, you will see that there are two groups of LINQ methods: those that return IEnumerable<TResult> and those that don't.
LINQ functions from the first group won't enumerate the query. They use deferred execution, or lazy execution. In the remarks section of every LINQ method, you'll find this description.
The LINQ functions of the other group will execute the query. If you look at the reference source of extension class Enumerable, you'll see that they internally use foreach, or at a lower level use GetEnumerator / MoveNext / Current.
IQueryable
An object that implements IQueryable<...> seems like an IEnumerable. However, it represents the potential to fetch data for an Enumerable sequence. The data is usually provided by a different process.
For this, the IQueryable holds an Expression and a Provider. The Expression represents what must be fetched in some generic format. The Provider knows who will provide the data (usually a database management system) and how to communicate with this DBMS (usually SQL).
When you start enumerating the sequence, deep inside using GetEnumerator, the Expression is sent to the Provider, who will try to translate it into SQL. The data is fetched from the DBMS, and returned as an Enumerable object. The fetched data is accessed by repeatedly calling MoveNext / Current.
Because the database is not contacted until you start enumerating, you'll have to keep the connection to the database open until you've finished enumerating. You've probably made the following mistake once:
IQueryable<Customer> customers;
using (var dbContext = new OrderDbContext(...))
{
customers = dbContext.Customers.Where(customer => customer...);
}
var fetchedCustomers = customers.ToList(); // fails: the query runs here, but dbContext has already been disposed
Back to your question
In your query, you use string.Format(...). Your Provider doesn't know how to translate this method into SQL. Your Provider also doesn't know any of your local methods. In fact, there are even several standard LINQ methods that are not supported by LINQ to Entities. See Supported and Unsupported LINQ methods.
How to solve the problem?
If you need to call unsupported methods, you can use AsEnumerable to fetch the data. All LINQ methods after AsEnumerable are executed by your own process. Hence you can call any of your own functions.
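The shape of that switch can be sketched as follows. This is an illustration only: the helper BuildUrls, the sample ids and the example.test link are made up, and the IQueryable here is an in-memory one created with AsQueryable(), which would accept string.Format anywhere. With a real database provider, only the part before AsEnumerable() has to be translatable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormatAfterFetch
{
    // Hypothetical helper: filter on the IQueryable side, format locally.
    public static List<string> BuildUrls(IQueryable<int> meetingIds, string link)
    {
        return meetingIds
            .Where(id => id > 0)   // still part of the IQueryable expression
            .AsEnumerable()        // everything after this runs in your own process
            .Select(id => string.Format("{0}?MeetingId={1}&GroupId=0", link, id))
            .ToList();
    }

    public static void Main()
    {
        IQueryable<int> ids = new[] { 0, 7, 12 }.AsQueryable();
        foreach (string url in BuildUrls(ids, "https://example.test/disclosures"))
        {
            Console.WriteLine(url);
        }
        // https://example.test/disclosures?MeetingId=7&GroupId=0
        // https://example.test/disclosures?MeetingId=12&GroupId=0
    }
}
```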
Database management systems are extremely optimized for table handling. One of the slower parts of a database query is the transport of the selected data to your local process. Hence, let the DBMS do all the selecting, and try to transport as little data as possible.
So let your DBMS do your Where / (Group-)Join / Sum / FirstOrDefault / Any etc. String formatting is best done by your own process.
In your String.Format you use PanelDisclosureLink and pmd.MeetingId. It will probably be faster if your process does the formatting. Alas, you forgot to give us the beginning of your query.
I'm not sure where your PanelDisclosureLink comes from. Is it a local variable? If that is the case, then PanelDisclosuresAttendanceURL will be the same string for every item in your group. Is this intended?
var PanelDisclosureLink = ...;
var result = dbContext... // probably some joining with Pgs, Pmds and Pmvs,
.Select(... => new
{
GroupId = pg.GroupId,
MeetingId = pmd.MeetingId,
GuidelineName = pmv.GuidelineName,
})
// make groups with same combinations of [MeetingId, GroupId]
.GroupBy(joinResult => new
{
MeetingId = joinResult.MeetingId,
GroupId = joinResult.GroupId,
},
// parameter resultSelector: use the Key, and all JoinResult items that have this key
// to make one new:
(key, joinResultItemsWithThisKey) => new
{
MeetingId = key.MeetingId,
GroupId = key.GroupId,
GuideLineNames = joinResultItemsWithThisKey
.Select(joinResultItem => joinResultItem.GuidelineName)
.ToList(),
})
So by now the DBMS has transformed your join result into objects with [MeetingId, GroupId] combinations and a list of all GuideLineNames that belong to this [MeetingId, GroupId] combination.
Now you can move it to your local process and use String.Format.
.AsEnumerable()
.SelectMany(fetchedItem => fetchedItem.GuideLineNames,
(fetchedItem, guideLineName) => new PresentationLayer.Models.PanelMeeting
{
GroupId = fetchedItem.GroupId,
MeetingId = fetchedItem.MeetingId,
GuidelineName = guideLineName,
PanelDisclosuresAttendanceURL = string.Format("...",
PanelDisclosureLink,
fetchedItem.MeetingId),
});
Note: in my parameter choice plurals are collections; singulars are elements of these collections.
PanelDisclosuresAttendanceURL = string.Format("{0}?MeetingId={1}&GroupId=0",PanelDisclosureLink, pmd.MeetingId),
}).
.GroupBy
If you want to use string.Format you first have to get the data from the server.
You can just move the .GroupBy( ... ) and then the .AsEnumerable() call to the top, before select new PresentationLayer.Models.PanelMeeting { ... }. If you are not selecting too much data that way...

C# IEnumerable being reset in child method

I have the below method:
private static List<List<job>> SplitJobsByMonth(IEnumerable<job> inactiveJobs)
{
List<List<job>> jobsByMonth = new List<List<job>>();
DateTime cutOff = DateTime.Now.Date.AddMonths(-1).Date;
cutOff = cutOff.AddDays(-cutOff.Day + 1);
List<job> temp;
while (inactiveJobs.Count() > 0)
{
temp = inactiveJobs.Where(j => j.completeddt >= cutOff).ToList();
jobsByMonth.Add(temp);
inactiveJobs = inactiveJobs.Where(a => !temp.Contains(a));
cutOff = cutOff.AddMonths(-1);
}
return jobsByMonth;
}
It aims to split the jobs by month. 'job' is a class, not a struct. In the while loop, the passed in IEnumerable is reset with each iteration to remove the jobs that have been processed:
inactiveJobs = inactiveJobs.Where(a => !temp.Contains(a));
Typically this reduces the content of this collection by quite a lot. However, on the next iteration the line:
temp = inactiveJobs.Where(j => j.completeddt >= cutOff).ToList();
restores the inactiveJobs object to the state it was when it was passed into the method - so the collection is full again.
I have solved this problem by refactoring this method slightly, but I am curious as to why this issue occurs as I can't explain it. Can anyone explain why this is happening?
Why not just use a group by?
private static List<List<job>> SplitJobsByMonth(IEnumerable<job> inactiveJobs)
{
var jobsByMonth = (from job in inactiveJobs
group job by new DateTime(job.completeddt.Year, job.completeddt.Month, 1)
into g
select g.ToList()).ToList();
return jobsByMonth;
}
This happens because of deferred execution of LINQ's Where.
When you do this
inactiveJobs = inactiveJobs.Where(a => !temp.Contains(a));
no evaluation is actually happening until you start iterating the IEnumerable. If you add ToList after Where, the iteration would happen right away, so the content of inactiveJobs would be reduced:
inactiveJobs = inactiveJobs.Where(a => !temp.Contains(a)).ToList();
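One detail is worth spelling out: the lambda a => !temp.Contains(a) captures the variable temp, not the list it held at that moment. Because the filter is deferred, it reads whatever temp refers to when the sequence is finally enumerated, so reassigning temp on the next loop iteration silently changes what the earlier filter excludes. The sketch below reproduces this with plain integers; numbers and remaining are made-up names standing in for the job collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Returns the contents of the deferred sequence before and after temp is reassigned.
    public static string[] Run()
    {
        IEnumerable<int> numbers = new[] { 1, 2, 3, 4, 5, 6 };
        List<int> temp;

        temp = numbers.Where(n => n <= 2).ToList();   // temp is [1, 2]

        // Deferred: nothing is filtered yet, and the lambda holds the variable temp.
        IEnumerable<int> remaining = numbers.Where(n => !temp.Contains(n));
        string first = string.Join(",", remaining);   // filter sees temp = [1, 2]

        temp = remaining.Where(n => n <= 4).ToList(); // temp is now [3, 4]

        // Same query object, but the filter now excludes [3, 4]: 1 and 2 are back.
        string second = string.Join(",", remaining);

        return new[] { first, second };
    }

    public static void Main()
    {
        string[] result = Run();
        Console.WriteLine(result[0]); // 3,4,5,6
        Console.WriteLine(result[1]); // 1,2,5,6
    }
}
```

Calling ToList() on the filtered sequence, as in the line above, evaluates the filter while temp still holds the intended list, which is why it fixes the problem.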
In LINQ, queries have two different behaviors of execution: immediate and deferred.
The query is actually executed when the query variable is iterated over, not when the query variable is created. This is called deferred execution.
You can also force a query to execute immediately, which is useful for caching query results.
To do this, add .ToList() at the end of your line:
inactiveJobs = inactiveJobs.Where(a => !temp.Contains(a)).ToList();
This executes the created query immediately and writes result to your variable.

Does putting a LINQ query inside a method affect deferred execution?

A LINQ query is not executed until the sequence returned by the query is actually iterated.
I have a query that is used repeatedly, so I am going to encapsulate it inside a method. I'd like to know if that interferes with deferred execution.
If I encapsulate a LINQ query in a method like below, the query gets executed at line 2, not at line 1 where the method is called. Is this correct?
public IEnumerable<Person> GetOldPeopleQuery()
{
return personList.Where(p => p.Age > 60);
}
public void SomeOtherMethod()
{
var getWomenQuery = GetOldPeopleQuery().Where(p => p.Gender == "F"); //line 1
int numberOfOldWomen = getWomenQuery.Count(); //line 2
}
P.S. I am using Linq-To-EF, if it makes any difference.
The query is lazily evaluated when you enumerate the result for the first time; that is not affected by putting it inside a method.
There is however another thing in your code that will be very inefficient. Once you've returned an IEnumerable, the next LINQ operator applied to it will be a LINQ to Objects query. That means that in your case you will load all old people from the database and then filter out the women in memory. The same goes for the count: it will be done in memory.
If you instead return an IQueryable<Person>, those two operations will be evaluated using LINQ to Entities, and the filtering and counting can be done in the database.
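The effect of the declared return type can be checked without a database. In this sketch the Person class and the sample data are assumptions, and AsQueryable() over a list stands in for an EF set. The point is only which Where overload the compiler picks: Enumerable.Where for an IEnumerable<Person>, Queryable.Where for an IQueryable<Person>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Age { get; set; }
    public string Gender { get; set; }
}

public static class ReturnTypeDemo
{
    // Returns { is the IEnumerable-based query still IQueryable?, is the IQueryable-based one? }.
    public static bool[] Run()
    {
        var people = new List<Person>
        {
            new Person { Age = 70, Gender = "F" },
            new Person { Age = 65, Gender = "M" },
            new Person { Age = 30, Gender = "F" },
        };

        IQueryable<Person> oldPeople = people.AsQueryable().Where(p => p.Age > 60);

        // Same object, but typed as IEnumerable: the next Where is Enumerable.Where.
        IEnumerable<Person> asEnumerable = oldPeople;
        IEnumerable<Person> inMemory = asEnumerable.Where(p => p.Gender == "F");

        // Typed as IQueryable: the next Where is Queryable.Where and extends the expression.
        IQueryable<Person> composed = oldPeople.Where(p => p.Gender == "F");

        return new[] { inMemory is IQueryable<Person>, composed is IQueryable<Person> };
    }

    public static void Main()
    {
        bool[] result = Run();
        Console.WriteLine(result[0]); // False
        Console.WriteLine(result[1]); // True
    }
}
```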

Databinding multiple controls to a single LINQ query

I have a LINQ query I wish to run and drop the result into a var or IQueryable. However, I'm binding the result to multiple (4 to 10) controls and only want the query to run once.
When I just put the result into all the datasource values, the query runs for every control and the controls (comboboxes, for example), change selectedvalues to match each other whenever any of them is changed.
When I databind the controls to the result.ToList() or something similar, that fixes the synchronization problem (i.e. they behave independently as they should), but the query still runs once for every control.
This was easycakes in ADO.NET days. How can I make the LINQ query run once and still databind to multiple controls?
Pseudocode:
var result = from c in dc.whatevers select c;
ddlItem1.DataSource = result;
ddlItem2.DataSource = result;
ddlItem3.DataSource = result;
Also:
var result = from c in dc.whatevers select c;
ddlItem1.DataSource = result.ToList();
ddlItem2.DataSource = result.ToList();
ddlItem3.DataSource = result.ToList();
Also:
List<whatever> result = (from c in dc.whatevers select c).ToList();
ddlItem1.DataSource = result;
ddlItem2.DataSource = result;
ddlItem3.DataSource = result;
The last option in your example is the easiest.
That will execute the query once, read it into memory, and use the in memory representation to bind to the controls
Calling ToList() should force query execution a single time. Using the resulting list should NOT repeat the query but load the values from the in memory collection. Are you positive that the code you're running is both what you have above and actually running the queries four times?
Try this:
var result = from c in dc.whatevers select c;
List<whatevers> resultList = result.ToList(); // Query runs here
ddlItem1.DataSource = new List<whatevers>(resultList); // Copy of the list
ddlItem2.DataSource = new List<whatevers>(resultList); // Copy of the list
ddlItem3.DataSource = new List<whatevers>(resultList); // Copy of the list
