Entity framework .ToList() is slow? (Query all) - c#

public List<Employee> GetEmployees(){
var employee = new ApplicationDBContext().Employee;
return employee.ToList();
}
//somewhere in other part of code.
//Use GetEmployees.
var employees = GetEmployees();
var importantEmployees = employees.Where(e => e.IsImportant == true);
In terms of performance, is this approach feasible?
Is there any way to make it faster?
Thanks!

As soon as GetEmployees() executes ToList(), you retrieve all the records from the database, not just the "important" ones. By the time you execute the Where clause later on, it's too late.
Create another method, where you filter with Where before calling ToList().
public List<Employee> GetImportantEmployees()
{
var employee = new ApplicationDBContext().Employee;
return employee.Where(e => e.IsImportant).ToList();
}
Other than that, I'm not sure what else you can do to make it faster from your C# code. Apply more filters if you only need a subset of the "important" employees (also before calling ToList()).

Related

How to get data from linq

I am trying to get data with LINQ in ASP.NET Core. I have a Position table with a FacultyID field. How do I get the FacultyID from the Position table for an existing user ID? My query:
var claimsIdentity = _httpContextAccessor.HttpContext.User.Identity as ClaimsIdentity;
var userId = claimsIdentity.FindFirst(ClaimTypes.NameIdentifier)?.Value.ToString();
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).???;
What can I add in place of the ??? to get the data? Thank you so much.
There are several things you can do. An example in your case would be:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).FirstOrDefault();
If you expect more than one result, then you would do:
var data = _context.Positions.Where(p => p.UserID.ToString() == userId).Select(x => x.FacultyID).ToList();
You have to be aware of the difference between a query and the result of a query.
The query does not represent the data itself, it represents the potential to fetch some data.
If you look closely at the LINQ methods, you will find there are two groups: the LINQ methods that return IQueryable<...> and the others.
The IQueryable methods don't execute the query. These functions are called lazy, they use deferred execution. You can find these terms in the remarks section of every LINQ method.
As long as you concatenate IQueryable LINQ methods, the query is not executed. It is not costly to concatenate LINQ methods in separate statements.
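To see that building a query costs nothing until it is enumerated, here is a minimal sketch. It assumes LINQ to Objects (IEnumerable<int>) as a stand-in for a database-backed IQueryable, because a counting lambda cannot be translated by a query provider; the names predicateCalls and evens are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static void Main()
    {
        int predicateCalls = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        // Building the query runs nothing: the predicate has not been called yet.
        IEnumerable<int> evens = numbers.Where(n =>
        {
            predicateCalls++;
            return n % 2 == 0;
        });
        Console.WriteLine(predicateCalls); // 0

        // Enumerating (here via ToList) is what executes the query.
        List<int> result = evens.ToList();
        Console.WriteLine(predicateCalls); // 5
        Console.WriteLine(result.Count);   // 2
    }
}
```

With a real IQueryable the same principle applies, except that the work deferred is the round trip to the database rather than a lambda call.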
The query is executed as soon as you start enumerating the query. At its lowest level this is done using GetEnumerator and MoveNext / Current:
IQueryable<Customer> customers = ...; // Query not executed yet!
// execute the query and process the fetched data
using (IEnumerator<Customer> enumerator = customers.GetEnumerator())
{
while(enumerator.MoveNext())
{
// there is a Customer, it is in property Current:
Customer customer = enumerator.Current;
this.ProcessFetchedCustomer(customer);
}
}
This code, or something very similar is done when you use foreach, or one of the LINQ methods that don't return IQueryable<...>, like ToList, ToDictionary, FirstOrDefault, Sum, Any, ...
var data = dbContext.Positions
.Where(p => p.UserID.ToString() == userId)
.Select(x => x.FacultyID);
If you use your debugger, you will see that data is an IQueryable<...> of FacultyID values (because of the Select, the element type is whatever type FacultyID has, not Position). It is still only a query, so you'll have to use one of the other LINQ methods to execute it.
To get all FacultyIDs in the query:
var fetchedFacultyIds = data.ToList();
If you expect only one FacultyID:
var fetchedFacultyId = data.FirstOrDefault();
If you want to know if there is any position at all:
bool positionAvailable = data.Any();
if (positionAvailable)
{
...
}
Be aware: every time you enumerate the IQueryable, the query is executed again and the data is fetched again from the database. So if you want to do all three statements efficiently, make sure you don't use the original data three times. Fetch once, then work on the fetched list:
var fetchedFacultyIds = data.ToList();
if (fetchedFacultyIds.Count > 0)
{
ProcessFacultyId(fetchedFacultyIds[0]);
}

C# LINQ executing the same work over and over

Came across some legacy code where the logic attempts to prevent unnecessary multiple calls to an expensive query GetStudentsOnCourse(), but fails due to a misunderstanding of deferred execution.
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value));
var studentsToRemove = new List<Student>();
foreach (var record in studentsToRemoveRecords)
{
studentsToRemove.Add(
students.Single(s => s.Id == record.StudentId));
}
Here, if there are 2 records for the same course in studentsToRemoveRecords, the query GetStudentsOnCourse() will needlessly be called twice (with the same course id) instead of once.
You can solve this by converting students to a list beforehand and forcing it to memory (preventing the execution from being deferred). Or by simply rewriting the logic into something a bit simpler.
But I then realised I actually struggle to put into words exactly why GetStudentsOnCourse() is called twice in the scenario above... is it that LINQ is repeating the same work every time studentsToRemoveRecords is iterated over, even though the resulting input values are identical each time?
is it that LINQ is repeating the same work every time studentsToRemoveRecords is iterated over, even though the resulting input values are identical each time?
Yes, that's the nature of LINQ. Some Visual Studio Extensions, like ReSharper, give you warnings when you create code that might lead to multiple iterations of a LINQ Query.
If you want to avoid it, do this:
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value))
.ToList();
With ToList() the Query is executed immediately and the resulting entities are stored in a List<T>. Now you can iterate several times over students without having performance issues.
Edit to include comments:
Here is a link to some good documentation about it (thank you Sergio): LINQ Documentation
And some thoughts about your question how to handle this in a large code base:
Well, there are reasons for both scenarios - direct execution and storing the result into a new list, and deferred execution.
If you are familiar with SQL databases, you can think of a LINQ Query like a View or a Stored Procedure. You define what filtering/altering you want to execute on a base table to get the resulting entities. And each time you query that View/execute that Stored Procedure, it runs based on the current data in the base table.
Same for LINQ. Your Query (without ToList()) was just like the definition of the View. And each time you iterate over it, that definition gets executed based on the current Entities in studentsToRemoveRecords at that moment.
And maybe that's your intention. Maybe you know that this base list changes and you want to execute your query several times, expecting different results. Then do it without ToList().
But when you want to execute your query only once and then iterate multiple times over a fixed snapshot of the results, do it with ToList().
So both Scenarios are valid. And when you iterate only once, both scenarios are equal (disclaimer: when you iterate directly after defining the query). Maybe that's why you saw it so many times like this. It depends what you want.
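The view-versus-snapshot difference can be sketched in a few lines. This is only an illustration with an in-memory list; courseIds, liveQuery and snapshot are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ViewVersusSnapshot
{
    public static void Main()
    {
        var courseIds = new List<int> { 1, 2, 3 };

        // Deferred: re-evaluated against the current list on every enumeration.
        IEnumerable<int> liveQuery = courseIds.Where(id => id > 1);

        // Materialized: evaluated once, right here.
        List<int> snapshot = courseIds.Where(id => id > 1).ToList();

        courseIds.Add(4);

        Console.WriteLine(liveQuery.Count()); // 3 (sees 2, 3 and the newly added 4)
        Console.WriteLine(snapshot.Count);    // 2 (still only 2 and 3)
    }
}
```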
Unclear exactly how your classes are done, BUT:
public class Student
{
public int Id { get; set; }
}
public class StudentCourse
{
public int StudentId { get; set; }
public int? CourseId { get; set; }
}
public class StudentRepository
{
public StudentCourse[] StudentCourses = new[]
{
new StudentCourse { CourseId = 1, StudentId = 100 },
new StudentCourse { CourseId = 2, StudentId = 200 },
new StudentCourse { CourseId = 3, StudentId = 300 },
new StudentCourse { CourseId = 4, StudentId = 400 },
};
public Student[] GetStudentsOnCourse(int courseId)
{
Console.WriteLine($"{nameof(GetStudentsOnCourse)}({courseId})");
return StudentCourses.Where(x => x.CourseId == courseId).Select(x => new Student { Id = x.StudentId }).ToArray();
}
}
and then
static void Main(string[] args)
{
var studentRepository = new StudentRepository();
var studentsToRemoveRecords = studentRepository.StudentCourses.ToArray();
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value));
//.ToArray();
var studentsToRemove = new List<Student>();
foreach (var record in studentsToRemoveRecords)
{
studentsToRemove.Add(
students.Single(s => s.Id == record.StudentId));
}
}
Without .ToArray() the method is called 16 times; with .ToArray() it is called 4 times. Note that .Single() enumerates the full students collection to check that there is exactly one student with the "right" Id. Compare it with First(), which stops after finding one record with the right Id (10 total calls of the method). As I've said in my comment, the method is called studentsToRemoveRecords.Count() * studentsToRemoveRecords.Select(x => x.CourseId).Distinct().Count() times, so something like x ^ 2. Doing a .ToArray() "memoizes" the result of GetStudentsOnCourse.
Just out of curiosity, you can add this class to your code:
public static class Tools
{
public static IEnumerable<T> DebugEnumeration<T>(this IEnumerable<T> enu)
{
Console.WriteLine("Begin Enumeration");
foreach (var res in enu)
{
yield return res;
}
}
}
and then do:
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value))
.DebugEnumeration();
This will show you when the SelectMany is enumerated.
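The question also mentions rewriting the logic into something simpler. One possible shape, offered only as a sketch and not as what the original code base did, is to materialize the students once and index them by Id, so each record becomes a dictionary lookup instead of a Single() scan. The sample Ids and the names studentsById and studentIdsToRemove are assumptions for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int Id { get; set; }
}

public static class LookupDemo
{
    public static void Main()
    {
        // Stand-in for the materialized result of the SelectMany(...) query.
        List<Student> students = new[] { 100, 200, 300, 400 }
            .Select(id => new Student { Id = id })
            .ToList();

        // Index once by Id. ToDictionary throws if two students share an Id.
        Dictionary<int, Student> studentsById = students.ToDictionary(s => s.Id);

        int[] studentIdsToRemove = { 300, 100 };

        // The indexer throws KeyNotFoundException for an unknown Id.
        List<Student> studentsToRemove = studentIdsToRemove
            .Select(id => studentsById[id])
            .ToList();

        Console.WriteLine(string.Join(",", studentsToRemove.Select(s => s.Id))); // 300,100
    }
}
```

The failure behaviour is close to Single() but not identical: a duplicate Id fails when the dictionary is built, even if that student is never looked up.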

What collection should I use in a linq-to-sql query? Queryable vs Enumerable vs List

Imagine the following classes:
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
class Underage
{
public int Age { get; set; }
}
And I do something like this:
var underAge = db.Underage.Select(u => u.Age) .ToList()/AsEnumerable()/AsQueryable()
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
What is the best option? If I call ToList(), the items will be retrieved once, but if I choose AsEnumerable or AsQueryable, will they be executed every time a person gets selected from the database in the Where() (if that's how it works, I don't know much about what a database does in the background)?
Also is there a bigger difference when Underage would contain thousands of records vs a small amount?
None.
You definitely don't want ToList() here, as that would load all of the matching values into memory, and then use it to write a query for result that had a massive IN (…) clause in it. That's a waste when what you really want is to just get the matching values out of the database in the first place, with a query that is something like
SELECT *
FROM Persons
WHERE EXISTS(
SELECT NULL
FROM Underage
WHERE Underage.age = Persons.age
)
AsEnumerable() often prevents things being done on the database, though providers will likely examine the source and undo the effects of that AsEnumerable(). AsQueryable() is fine, but doesn't actually do anything in this case.
You don't need any of them:
var underAge = db.Underage.Select(u => u.Age);
var result = db.Persons.Where(p => underAge.Contains(p.Age)).ToList();
(For that matter, check you really do need that ToList() in the last line; if you're going to do further queries on it, it'll likely hurt them, if you're just going to enumerate through the results it's a waste of time and memory, if you're going to store them or do lots of in-memory operations that can't be done in Linq it'll be for the better).
If you just leave your first query as
var underage = db.Underage.Select(u => u.Age);
It will be of type IQueryable and it will not hit your database (this is called deferred execution). Once you actually want to execute a call to the database you can use a greedy operator such as .ToList(), .ToArray() or .ToDictionary(). This will give your result variable an in-memory collection.
See linq deferred execution
You should use a join to fetch the data.
var query = (from p in db.Persons
join u in db.Underage
on p.Age equals u.Age
select p).ToList();
If you do not use .ToList() in the above query, it returns an IQueryable<Person>, and the actual data will be fetched only when you enumerate it.
ToList(), AsEnumerable() and AsQueryable() are all equally good, depending on the scenario. In your case ToList() looks good to me, as you just want to fetch the list of persons whose age appears in Underage.

c# contains method usage

I'm adding a "search" functionality to a web app I'm working on and I have the following action method:
public PartialViewResult SearchEmployees(string search_employees)
{
var employeeList = _db.Employees.ToList();
var resultList = employeeList.Where(t => t.FirstName.Contains(search_employees)).ToList();
return PartialView(resultList);
}
here I'm trying to filter out all employees that have a first name that contains the search string, however I keep getting a null list. Am I using the lambda expression wrong?
another question, is .Contains case sensitive? (I know in java theres .equals and .equalsIgnoreCase, is there something similar to this for .Contains?)
The problem here was the .ToList() in the first line.
.NET's string.Contains() method is, by default, case sensitive. However, if you use .Contains() in a LINQ-to-Entities query, it is translated to SQL and follows the case sensitivity of the database, which depends on the collation in use (SQL Server's default collations, for example, are case insensitive).
When you called .ToList() in the first line, it pulled down all of your data from the database, so the second line was doing an ordinary .NET .Contains(). Not only did it give you unexpected results, it's also terrible for performance, so please make a point to use a query before you use .ToList() (if you even use .ToList() at all).
public PartialViewResult SearchEmployees(string search_employees)
{
var employeeList = _db.Employees;
var resultList = employeeList.Where(t => t.FirstName.Contains(search_employees))
.ToList();
return PartialView(resultList);
}
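On the second question (an equivalent of Java's equalsIgnoreCase for Contains): for strings that are already in memory, one common option is IndexOf with an explicit StringComparison. The sketch below assumes plain in-memory strings with made-up sample names; whether a given LINQ provider can translate this call to SQL is a separate matter and is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDemo
{
    public static void Main()
    {
        var firstNames = new List<string> { "Alice", "Malik", "Bob" };
        string search = "ALI";

        // Case-sensitive: plain string.Contains finds nothing for "ALI".
        int sensitive = firstNames.Count(n => n.Contains(search));

        // Case-insensitive: IndexOf with an explicit StringComparison.
        int insensitive = firstNames.Count(
            n => n.IndexOf(search, StringComparison.OrdinalIgnoreCase) >= 0);

        Console.WriteLine(sensitive);   // 0
        Console.WriteLine(insensitive); // 2
    }
}
```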
Can you try the following code?
public PartialViewResult SearchEmployees(string search_employees)
{
var employeeList = _db.Employees.ToList();
var resultList = employeeList;
if(!String.IsNullOrEmpty(search_employees))
resultList = employeeList.Where(t => t.FirstName.Contains(search_employees)).ToList();
return PartialView(resultList);
}
Thanks,
Amit

Can't get deferred LINQ statements to work in nHibernate

Trying to write dynamic queries using the LINQ provider for NHibernate, but I am having issues. My understanding was that LINQ queries were deferred until called, (i.e. with ToList()), so I have the following code:
string[] filteredIds = new[] { "someIdNotInUse"};
var result = _products
.GetAll()
.Skip(0)
.Take(10);
if (filteredIds != null)
{
result.Where(x => x.Child1.Child2.Any(z => filteredIds.Contains(z.Child3.Id)));
}
var r = result.ToList();
The Where filter in the conditional block is not applied; when I run .ToList, I get records where I expect none. However, if I remove the where filter and append it directly to the _products call, it works as expected. Am I misunderstanding how the LINQ provider works? How is creating a query like this possible, without rewriting the query for every possible filter condition and combination?
Methods in LINQ don't affect the object they're called on - they return a new object representing the result of the call. So you want:
if (filteredIds != null)
{
result = result.Where(...);
}
(Think of it as being a bit like calling Replace or Trim on a string - the string is immutable, so it's only the return value which matters.)
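A small runnable sketch of the same point, assuming an array wrapped with AsQueryable() as a stand-in for the NHibernate-backed query; the sample values and the prefix filter are hypothetical.

```csharp
using System;
using System.Linq;

public static class ConditionalFilterDemo
{
    public static void Main()
    {
        // AsQueryable() over an array stands in for the provider-backed query.
        IQueryable<string> query = new[] { "a1", "b2", "a3", "c4" }.AsQueryable();
        string prefix = "a";

        // Return value discarded: the original query is unchanged.
        query.Where(id => id.StartsWith(prefix));
        Console.WriteLine(query.Count()); // 4

        if (prefix != null)
        {
            // Keep the new query object that Where returns.
            query = query.Where(id => id.StartsWith(prefix));
        }
        Console.WriteLine(query.Count()); // 2
    }
}
```

Each optional filter can be added this way in its own if block, so there is no need to write out every combination of conditions.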
