Linq - Order by in Include - c#

I have a situation where OrderBy needs to be applied to an included collection. This is what I have tried so far:
Customers query = null;
try
{
query = _context.Customers
.Include(x => x.CustomerStatus)
.ThenInclude(x => x.StatusNavigation)
.Select(x => new Customers()
{
Id = x.Id,
Address = x.Address,
Contact = x.Contact,
Name = x.Name,
CustomerStatus = new List<CustomerStatus>
{
x.CustomerStatus.OrderByDescending(y => y.Date).FirstOrDefault()
}
})
.FirstOrDefault(x => x.Id == 3);
}
catch (Exception ex)
{
throw;
}
The above code successfully orders the included collection, but it does not include its child table.
E.g. Customer includes CustomerStatus, but CustomerStatus does not include the StatusNavigation table.
I also tried this, but it did not help either:
_context.Customers
.Include(x => x.CustomerStatus.OrderByDescending(y => y.Date).FirstOrDefault())
.ThenInclude(x => x.StatusNavigation).FirstOrDefault(x => x.Id == 3);
What am I doing wrong? Please guide me.
I even tried it this way:
var query = _context.CustomerStatus
.GroupBy(x => x.CustomerId)
.Select(x => x.OrderByDescending(y => y.Date).FirstOrDefault())
.Include(x => x.StatusNavigation)
.Join(_context.Customers, first => first.CustomerId, second => second.Id, (first, second) => new Customers
{
Id = second.Id,
Name = second.Name,
Address = second.Address,
Contact = second.Contact,
CustomerStatus = new List<CustomerStatus> {
new CustomerStatus
{
Id = first.Id,
CustomerId = first.CustomerId,
Date = first.Date,
StatusNavigation = first.StatusNavigation
}
},
}).FirstOrDefault(x => x.Id == 3);
but this hits the database three times and filters the result in memory.
It first selects all data from customer status, then from status, and then from customer, and only then filters everything in memory. Is there a more efficient way to do this?
This is how I have prepared my entity class

As @Chris Pratt mentioned, once you do new Customers inside the Select you are creating a new model and discarding the models built by Entity Framework. My suggestion would be to keep the query as just:
query = _context.Customers
.Include(x => x.CustomerStatus)
.ThenInclude(x => x.StatusNavigation);
This gives you an IQueryable, which is not executed until you materialize a result from it:
var customer3 = query.FirstOrDefault(x => x.Id == 3);
This returns the customer and the linked tables (CustomerStatus and StatusNavigation). Then you can create the object that you want:
var customer = new Customers()
{
Id = customer3.Id,
Address = customer3.Address,
Contact = customer3.Contact,
Name = customer3.Name,
CustomerStatus = new List<CustomerStatus>
{
// keep only the most recent status
customer3.CustomerStatus.OrderByDescending(y => y.Date).FirstOrDefault()
}
};
This way you can reuse the query to create different response objects and still query the database only once. The downside is that more memory is used than with the original query (though that shouldn't be much of an issue).
If the model originally returned from the database doesn't meet the requirements (i.e. you always need to do CustomerStatus = new List {...}), it might indicate that the database schema is not well suited to the needs of the application, so a refactoring might be needed.

What I think is happening is that you are actually overriding the Include and ThenInclude. Include is explicitly to eager-load a navigation property. However, you're doing a couple of things that are likely hindering this.
First, you're selecting into a new Customer. That alone may be enough to break the logic of Include. Second, you're overriding what gets put in the CustomerStatus collection. That should ideally just be loaded automatically via Include, but by altering it to have only the first entity, you're essentially throwing away the effect of Include. (Selecting a relationship is enough to cause a join to be issued, without explicitly calling Include.) Third, the ThenInclude is predicated on the Include, so overriding that is probably throwing out the ThenInclude as well.
All this is conjecture. I haven't done anything exactly like what you're doing here before, but nothing else makes sense.
Try selecting into a new CustomerStatus as well:
CustomerStatus = x.CustomerStatus.OrderByDescending(o => o.Date).Select(s => new CustomerStatus
{
Id = s.Id,
Status = s.Status,
Date = s.Date,
CustomerId = s.CustomerId,
Customer = s.Customer,
StatusNavigation = s.StatusNavigation
})
You can remove the Include/ThenInclude at that point, because the act of selecting these relationships will cause the join.
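The idea of projecting the nested entity explicitly can be sketched with plain in-memory collections. This is only an illustration using LINQ to Objects, not Entity Framework: the `Status`, `CustomerStatus` and `Customer` classes below are hypothetical stand-ins for the entities in the question, and `LatestStatusDemo` is a made-up helper name. The point is that the newest status keeps its `StatusNavigation` because the projection copies it explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's entities (not the real model).
public class Status
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class CustomerStatus
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public Status StatusNavigation { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<CustomerStatus> CustomerStatus { get; set; } = new List<CustomerStatus>();
}

public static class LatestStatusDemo
{
    // Projects each customer so that only its newest status remains,
    // copying the nested navigation property explicitly.
    public static List<Customer> WithLatestStatus(IEnumerable<Customer> customers)
    {
        return customers.Select(c => new Customer
        {
            Id = c.Id,
            Name = c.Name,
            CustomerStatus = c.CustomerStatus
                .OrderByDescending(s => s.Date)
                .Take(1)
                .Select(s => new CustomerStatus
                {
                    Id = s.Id,
                    Date = s.Date,
                    StatusNavigation = s.StatusNavigation
                })
                .ToList()
        }).ToList();
    }

    // Small sample data set: one customer with two statuses.
    public static List<Customer> Sample()
    {
        return new List<Customer>
        {
            new Customer
            {
                Id = 3,
                Name = "Acme",
                CustomerStatus =
                {
                    new CustomerStatus { Id = 1, Date = new DateTime(2020, 1, 1), StatusNavigation = new Status { Id = 10, Name = "New" } },
                    new CustomerStatus { Id = 2, Date = new DateTime(2021, 1, 1), StatusNavigation = new Status { Id = 20, Name = "Active" } }
                }
            }
        };
    }
}
```

Calling `LatestStatusDemo.WithLatestStatus(LatestStatusDemo.Sample())` returns one customer whose single remaining status is the 2021 entry. Whether a given EF Core version translates the equivalent query into a single SQL statement is not something this sketch demonstrates.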

After reading a couple of sources (Source 1 and Source 2), I think what is happening is that if you use Select after Include, the Include is disregarded, even if you use the included data in the Select. To work around this, call .AsEnumerable() before Select. Note that everything after AsEnumerable() runs in memory, so the Id filter in the final FirstOrDefault is no longer translated to SQL.
query = _context.Customers
.Include(x => x.CustomerStatus)
.ThenInclude(x => x.StatusNavigation)
.AsEnumerable()
.Select(x => new Customers()
{
Id = x.Id,
Address = x.Address,
Contact = x.Contact,
Name = x.Name,
CustomerStatus = new List<CustomerStatus>
{
x.CustomerStatus.OrderByDescending(y => y.Date).FirstOrDefault()
}
})
.FirstOrDefault(x => x.Id == 3);

Related

Linq one to many with filter

I have an Entity Framework database that I'm querying, so I'm using linq-to-entities.
Here's my query:
// 'Find' is just a wrapper method that returns IQueryable
var q = r.Find(topic =>
topic.PageId != null &&
!topic.Page.IsDeleted &&
topic.Page.IsActive)
// These are standard EF extension methods, which are used to include
// linked tables. Note: Page_Topic has a one-to-many relationship with topic.
.Include(topic => topic.Page.Route)
.Include(topic => topic.Page_Topic.Select(pt => pt.Page.Route))
// HERE'S THE QUESTION: This select statement needs to flatten Page_Topic (which it does). But it seems to do it in the wrong place. To explain, if I were to include another column that depended on Page_Topic (for example: 'PillarRoutName2', I'd have to apply the same flattening logic to that column too. Surely the filtering of Page_Topic should be done higher up the query in a DRY way.
.Select(x => new
{
TopicName = x.Name,
HubRouteName = x.Page.Route.Name,
PillarRouteName = x.Page_Topic.FirstOrDefault(y => y.IsPrimary).Page.Route.Name
}).ToList();
Surely the filtering of Page_Topic should be done higher up the query in a DRY way.
Correct! And it's easy to do this:
.Select(x => new
{
TopicName = x.Name,
HubRouteName = x.Page.Route.Name,
FirstTopic = x.Page_Topic.FirstOrDefault(y => y.IsPrimary)
})
.Select(x => new
{
TopicName = x.TopicName,
HubRouteName = x.HubRouteName,
PillarRouteName = x.FirstTopic.Page.Route.Name,
PillarRoutName2 = x.FirstTopic. ...
}).ToList();
Depending on where you start to get properties from FirstTopic you can also use x.Page_Topic.FirstOrDefault(y => y.IsPrimary).Page or .Page.Route in the first part.
Note that you don't need the Includes. They will be ignored because the query is a projection (Select(x => new ...)).

How can I reuse a subquery inside a select expression?

In my database I have two tables Organizations and OrganizationMembers, with a 1:N relationship.
I want to express a query that returns each organization with the first and last name of the first organization owner.
My current select expression works, but it's neither efficient nor does it look right to me, since every subquery gets defined multiple times.
await dbContext.Organizations
.AsNoTracking()
.Select(x =>
{
return new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).FirstName,
OwnerLastName = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).LastName,
OwnerEmailAddress = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).EmailAddress
};
})
.ToArrayAsync();
Is it somehow possible to summarize or reuse the subqueries, so I don't need to define them multiple times?
Note that I've already tried storing the subquery result in a variable. This doesn't work, because it requires converting the expression into a statement body, which results in a compiler error.
The subquery can be reused by introducing an intermediate projection (Select), which is the equivalent of the let operator in query syntax.
For instance:
dbContext.Organizations.AsNoTracking()
// intermediate projection
.Select(x => new
{
Organization = x,
Owner = x.Members
.Where(member => member.Role == RoleType.Owner)
.OrderBy(member => member.CreatedAt)
.FirstOrDefault()
})
// final projection
.Select(x => new OrganizationListItem
{
Id = x.Organization.Id,
Name = x.Organization.Name,
OwnerFirstName = x.Owner.FirstName,
OwnerLastName = x.Owner.LastName,
OwnerEmailAddress = x.Owner.EmailAddress
})
Note that before EF Core 3.0 you have to use FirstOrDefault instead of First if you want to avoid client evaluation.
Also, this does not make the generated SQL query better or faster: it still contains a separate inline subquery for each property included in the final select. Hence it improves readability, but not efficiency.
That's why it's usually better to project the nested object into an unflattened DTO property, i.e. instead of OwnerFirstName, OwnerLastName, OwnerEmailAddress, have a class with properties FirstName, LastName, EmailAddress and a property, let's say Owner, of that type in OrganizationListItem (similar to an entity with a reference navigation property). This way you will be able to use something like
dbContext.Organizations.AsNoTracking()
.Select(x => new
{
Id = x.Id,
Name = x.Name,
Owner = x.Members
.Where(member => member.Role == RoleType.Owner)
.OrderBy(member => member.CreatedAt)
.Select(member => new OwnerInfo // the new class
{
FirstName = member.FirstName,
LastName = member.LastName,
EmailAddress = member.EmailAddress
})
.FirstOrDefault()
})
Unfortunately in pre 3.0 versions EF Core will generate N + 1 SQL queries for this LINQ query, but in 3.0+ it will generate a single and quite efficient SQL query.
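For comparison, the intermediate projection described above can be written with a `let` clause in query syntax. The sketch below runs against in-memory collections (LINQ to Objects), so it shows only the shape of the query and says nothing about the SQL that EF Core would generate. The classes are simplified, hypothetical versions of the ones in the question, and `LetClauseDemo` is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum RoleType { Member, Owner }

// Simplified, hypothetical versions of the question's types.
public class OrganizationMember
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public RoleType Role { get; set; }
    public DateTime CreatedAt { get; set; }
}

public class Organization
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<OrganizationMember> Members { get; set; } = new List<OrganizationMember>();
}

public class OrganizationListItem
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string OwnerFirstName { get; set; }
    public string OwnerLastName { get; set; }
}

public static class LetClauseDemo
{
    public static List<OrganizationListItem> ToListItems(IEnumerable<Organization> organizations)
    {
        var query =
            from org in organizations
            // 'let' names the subquery once so it can be reused below.
            let owner = org.Members
                .Where(m => m.Role == RoleType.Owner)
                .OrderBy(m => m.CreatedAt)
                .FirstOrDefault()
            select new OrganizationListItem
            {
                Id = org.Id,
                Name = org.Name,
                // In memory, a missing owner is null and must be guarded against.
                OwnerFirstName = owner == null ? null : owner.FirstName,
                OwnerLastName = owner == null ? null : owner.LastName
            };

        return query.ToList();
    }
}
```

The compiler rewrites `let` into an intermediate `Select` over an anonymous type, which is the same structure as the method-syntax version shown earlier.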
How about this:
await dbContext.Organizations
.AsNoTracking()
.Select(x =>
{
var firstMember = x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner);
return new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = firstMember.FirstName,
OwnerLastName = firstMember.LastName,
OwnerEmailAddress = firstMember.EmailAddress
};
})
.ToArrayAsync();
How about doing this like
await dbContext.Organizations
.AsNoTracking()
.Select(x => new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).FirstName,
OwnerLastName = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).LastName,
OwnerEmailAddress = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).EmailAddress
})
.ToArrayAsync();

Entity Framework Structure

I am going through this tutorial to help me better understand the EF Structure. I currently use SQL.
https://learn.microsoft.com/en-us/aspnet/core/data/ef-rp/read-related-data?view=aspnetcore-2.1&tabs=visual-studio
In this example, it shows the instructor, office, student, course, grade, and assignments
public async Task OnGetAsync(int? id, int? courseID)
{
Instructor = new InstructorIndexData();
Instructor.Instructors = await _context.Instructors
.Include(i => i.OfficeAssignment)
.Include(i => i.CourseAssignments)
.ThenInclude(i => i.Course)
.ThenInclude(i => i.Department)
.Include(i => i.CourseAssignments)
.ThenInclude(i => i.Course)
.ThenInclude(i => i.Enrollments)
.ThenInclude(i => i.Student)
.AsNoTracking()
.OrderBy(i => i.LastName)
.ToListAsync();
if (id != null)
{
InstructorID = id.Value;
Instructor instructor = Instructor.Instructors.Where(
i => i.ID == id.Value).Single();
Instructor.Courses = instructor.CourseAssignments.Select(s => s.Course);
}
if (courseID != null)
{
CourseID = courseID.Value;
Instructor.Enrollments = Instructor.Courses.Where(
x => x.CourseID == courseID).Single().Enrollments;
}
}
To help me better understand the syntax would this SQL Statement be the equivalent?
SELECT *
FROM Instructor INNER JOIN
OfficeAssignment ON Instructor.ID = OfficeAssignment.InstructorID INNER JOIN
Department ON Instructor.ID = Department.InstructorID INNER JOIN
Course ON Department.DepartmentID = Course.DepartmentID INNER JOIN
Enrollment ON Course.CourseID = Enrollment.CourseID INNER JOIN
CourseAssignment ON Course.CourseID = CourseAssignment.CourseID INNER JOIN
Student ON Enrollment.StudentID = Student.ID
WHERE Instructor.ID = @ID AND Course.CourseID = @CourseID ORDER BY Instructor.Lastname
It helps to use entities as objects rather than thinking of them as tables. Yes, they typically correlate directly to the underlying tables, but that is a means to an end. You can leverage the relationships more directly than simply treating it as another way to write SQL.
For example:
Instructor.Instructors = await _context.Instructors
.Include(i => i.OfficeAssignment)
.Include(i => i.CourseAssignments)
.ThenInclude(i => i.Course)
.ThenInclude(i => i.Department)
.Include(i => i.CourseAssignments)
.ThenInclude(i => i.Course)
.ThenInclude(i => i.Enrollments)
.ThenInclude(i => i.Student)
.AsNoTracking()
.OrderBy(i => i.LastName)
.ToListAsync();
This will correspond roughly to an SQL statement with a bunch of inner joins and an OrderBy clause. In the realm of EF though, this would be considered bad practice. The reason is that like an SQL statement with inner joins, you are effectively doing a "SELECT *" across all of those tables. Do you really want all of the columns of all of the joined tables?
AsNoTracking() merely tells EF that for the data retrieved, you aren't going to modify it, so don't bother tracking dirty state. This is a performance tweak for read operations.
ToListAsync() performs the query as an awaitable operation, which frees up the thread the method was called on. There is no magic multi-threaded execution here: the call hands off to SQL Server, releases its thread, and is then assigned a new thread at the continuation point after the await.
One warning sign I see with the example is the use of the nullable parameters. Can this method validly be called with:
Neither an ID nor a course ID?
and
An ID with no course ID?
and
A course ID with no ID?
and
Both an ID and course ID?
If any of these combinations is invalid then the method should be split up or refined.
Getting back to the "SELECT *" behaviour, using EF you have a lot of power hiding behind the scenes ready to turn Linq map/reduce operations into SQL to run against the server and return you a meaningful, minimal set of data.
For example:
var query = _context.Instructors.AsQueryable();
if (id.HasValue)
query = query.Where(i => i.ID == id.Value);
query = query.OrderBy(i => i.LastName);
var instructors = await query.Select(i => new InstructorIndexData
{
InstructorId = i.ID,
// ...
Courses = i.CourseAssignments.Select(ca => new CourseData {
CourseId = ca.Course.ID,
CourseName = ca.Course.Name,
//..
})
}).ToListAsync();
if (courseID.HasValue)
{
var enrollments = await query.SelectMany(i => i.Courses.SingleOrDefault(c => c.CourseID == courseID.Value).Enrollments.Select(e => new EnrollmentData
{
InstructorId = i.ID,
EnrollmentId = e.EnrollmentID,
CourseId = e.Course.CourseID,
//...
})).ToListAsync();
// From here, group the Enrollments by Instructor ID and add them to the Instructor index data.
var groupedEnrollments = enrollments.GroupBy(e => e.InstructorId);
foreach (var group in groupedEnrollments)
{
// group.Key is the instructor ID the enrollments were grouped by
var instructor = instructors.Single(i => i.InstructorId == group.Key);
instructor.Enrollments = group.ToList();
}
}
Now the caveat here is that I'm basing this on memory and with a rough guess of your structure and desired output. The key points would be leveraging the IQueryable and issuing Select statements to just pull back the exact data you need to populate the objects you want to provide to a view.
I do this in 2 query executions, one to get the instructor(s), then the second to get the enrollments if requested based on the provided course ID. Personally I'd split this into two methods since I'd expect the enrollments would be optional. Also there is a difference between fetching one instructor, and all instructors. In cases where potentially large amounts of data are returned, you should look at establishing pagination with Skip() and Take() to avoid expensive queries bogging down the CPU, network, and memory usage.
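The Skip() and Take() pagination mentioned above could look roughly like the helper below. This is a sketch under assumptions: `PagingDemo` and `GetPage` are made-up names, the page number is taken to be 1-based, and the helper is exercised against an in-memory `IQueryable` rather than a real DbContext. With an EF query you would typically end with `ToListAsync()` instead of `ToList()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PagingDemo
{
    // Returns one page of results. pageNumber is 1-based (an assumption of this sketch).
    // An explicit ordering is applied first so that Skip/Take is deterministic.
    public static List<T> GetPage<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int pageNumber,
        int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return source
            .OrderBy(orderBy)
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

For example, `PagingDemo.GetPage(_context.Instructors, i => i.LastName, 2, 20)` would ask for instructors 21 to 40 by last name, assuming `_context.Instructors` is the DbSet from the tutorial.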

Retain default order for Linq Contains

I want to retain the default order that comes from SQL after processing with Linq as well. I know this question has been asked before. Here is a link: Linq Where Contains ... Keep default order.
But I still couldn't apply it to my Linq query correctly. Could anyone please help me with this? Thanks!
Here is the query
var x = db.ItemTemplates.Where(a => a.MainGroupId == mnId)
.Where(a => a.SubGruopId == sbId)
.FirstOrDefault();
var ids = new List<int> { x.Atribute1, x.Atribute2, x.Atribute3, x.Atribute4 };
var y = db.Atributes.Where(a => ids.Contains(a.AtributeId))
.Select(g => new
{
Name = g.AtributeName,
AtType = g.AtributeType,
Options = g.atributeDetails
.Where(w=>w.AtributeDetailId!=null)
.Select(z => new
{
Value=z.AtributeDetailId,
Text=z.AtDetailVal
})
});
Your assumption is wrong. SQL server is the one that is sending the results back in the order you are getting them. However, you can fix that:
var x = db.ItemTemplates.Where(a => a.MainGroupId == mnId)
.Where(a => a.SubGruopId == sbId)
.FirstOrDefault();
var ids = new List<int> { x.Atribute1, x.Atribute2, x.Atribute3, x.Atribute4 };
var y = db.Atributes.Where(a => ids.Contains(a.AtributeId))
.Select(g => new
{
Id = g.AtributeId,
Name = g.AtributeName,
AtType = g.AtributeType,
Options = g.atributeDetails
.Where(w=>w.AtributeDetailId!=null)
.Select(z => new
{
Value=z.AtributeDetailId,
Text=z.AtDetailVal
})
})
.ToList()
.OrderBy(z=>ids.IndexOf(z.Id));
Feel free to do another select after the orderby to create a new anonymous object without the Id if you absolutely need it to not contain the id.
PS. You might want to correct the spelling of Attribute, and you should be consistent in if you are going to prefix your property names, and how you do so. Your table prefixes everything with Atribute(sp?), and then when you go and cast into your anonymous object, you remove the prefix on all the properties except AtributeType, which you prefix with At. Pick one and stick with it, choose AtName, AtType, AtOptions or Name, Type, Options.
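The client-side reordering by position in the ids list can also be done with a lookup table instead of repeated `IndexOf` calls, which may matter when the list is long. The helper below is a minimal sketch with hypothetical names (`KeepIdOrderDemo`, `OrderByIdList`). It works on any in-memory sequence, so it would be applied after the query has been materialized.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeepIdOrderDemo
{
    // Orders items by the position of their id in 'ids'.
    // Unlike ids.IndexOf (which returns -1 for unknown ids, sorting them first),
    // this sketch places items with unknown ids at the end.
    public static List<T> OrderByIdList<T>(
        IEnumerable<T> items,
        IReadOnlyList<int> ids,
        Func<T, int> idSelector)
    {
        var position = new Dictionary<int, int>();
        for (int i = 0; i < ids.Count; i++)
        {
            // Keep the first occurrence, matching what IndexOf would report.
            if (!position.ContainsKey(ids[i]))
            {
                position[ids[i]] = i;
            }
        }

        return items
            .OrderBy(item => position.TryGetValue(idSelector(item), out var p) ? p : int.MaxValue)
            .ToList();
    }
}
```

With the query above this would be used as `KeepIdOrderDemo.OrderByIdList(y.ToList(), ids, z => z.Id)`, provided the projection includes the Id property as in the answer.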

How to avoid "select n+1" pattern in Linq

I have a query (including LinqKit) of the form:
Expression<Func<Country, DateTime, bool>> countryIndepBeforeExpr =
(ct, dt) => ct.IndependenceDate <= dt;
DateTime someDate = GetSomeDate();
var q = db.Continent.AsExpandable().Select(c =>
new
{
c.ID,
c.Name,
c.Area,
Countries = c.Countries.AsQueryable()
.Where(ct => countryIndepBeforeExpr.Invoke(ct, someDate))
.Select(ct => new {ct.ID, ct.Name, ct.IndependenceDate})
});
Now I want to iterate through q... but since the Countries property of each element is of type IQueryable, it will be lazy loaded, causing n+1 queries to be executed, which isn't very nice.
What is the correct way to write this query so that all necessary data will be fetched in a single query to the db?
EDIT
Hm, well it might have helped if I had actually run a Sql trace before asking this question. I assumed that because the inner property was of type IQueryable that it would be lazy-loaded... but after doing some actual testing, it turns out that Linq to Entities is smart enough to run the whole query at once.
Sorry to waste all your time. I would delete the question, but since it already has answers, I can't. Maybe it can serve as some kind of warning to others to test your hypothesis before assuming it to be true!
Include countries to your model when you call for continents. With something like this:
var continents = db.Continent.Include(c => c.Countries).ToArray();
Then you can perform your Linq operations without an IQueryable object.
I think this should work (moving AsExpandable() to root of IQueryable):
var q = db.Continent
.AsExpandable()
.Select(c => new
{
c.ID,
c.Name,
c.Area,
Countries = c.Countries
.Where(ct => countryIndepBeforeExpr.Invoke(ct, someDate))
.Select(ct => new {ct.ID, ct.Name, ct.IndependenceDate})
});
If not, create two IQueryable and join them together:
var continents = db.Continent;
var countries = db.Countries
.AsExpandable()
.Where(c => countryIndepBeforeExpr.Invoke(c, someDate))
.Select(c => new { c.ID, c.Name, c.IndependenceDate });
var q = continents.GroupJoin(countries,
continent => continent.ID,
country => country.ContinentId,
(continent, matchingCountries) => new
{
continent.ID,
continent.Name,
continent.Area,
Countries = matchingCountries.Select(c => new
{
c.ID,
c.Name,
c.IndependenceDate
})
});
