LINQ query dropping includes when adding `.Contains()` in where clause - c#

I have a somewhat complex query I'm trying to build in Linq (EntityFramework Core 2.1), and I hit behavior I can't comprehend. The below query runs well and seemingly efficiently:
var q = (
from n in TaskUpdates.Include(t => t.Status).Include("Task").Include("Task.Requirement").Include("User").Include("User.Employee")
where n.User.Employee.EmployeeNumber == 765448466
group n by n.UpdateDate into tu
select tu.OrderByDescending(t=>t.UpdateDate).FirstOrDefault()
)
.Select(x => x.Task.Requirement);
This works as I'd expect, does all the joins I want and includes the expected fields in the SELECT clause:
SELECT [t].[TaskUpdateID], [t].[Active], [t].[TaskId], [t].[Notes], [t].[StatusId], [t].[UpdateDate], [t].[UserId], [t.Task].[TaskID], [t.Task].[Active], [t.Task].[CreatedDate], [t.Task].[RequirementId], [t.Task].[UserId], [t.Task.Requirement].[RequirementID], [t.Task.Requirement].[Active], [t.Task.Requirement].[Description], [t.Task.Requirement].[Hours], [t.Task.Requirement].[Link], [t.Task.Requirement].[Name], [t.Task.Requirement].[RequirementTypeId], [t.Task.Requirement].[ExternalId], [t.Task.Requirement].[SortOrder], [t.Status].[StatusId], [t.Status].[Active], [t.Status].[IsComplete], [t.Status].[Title], [t.User].[UserId], [t.User].[Active], [t.User].[Created], [t.User].[EmployeeNumber], [t.User].[LastLogin], [t.User].[LastUpdated], [t.User.Employee].[EMPLOYEENUMBER], [t.User.Employee].[BEGINDATE], [t.User.Employee].[CITY], [t.User.Employee].[EMPLOYEETYPE], [t.User.Employee].[ENDDATE], [t.User.Employee].[FIRST_NAME], [t.User.Employee].[GENERATION_SUFFIX], [t.User.Employee].[STATUS], [t.User.Employee].[LAST_NAME], [t.User.Employee].[MIDDLE_NAME], [t.User.Employee].[MOBILE], [t.User.Employee].[ORGCODE], [t.User.Employee].[PHONE_NUMBER], [t.User.Employee].[PRIMARYEMAIL], [t.User.Employee].[STATE], [t.User.Employee].[STREET], [t.User.Employee].[TITLE], [t.User.Employee].[ZIPCODE], [t.User.Employee].[BUILDING], [t.User.Employee].[ROOM]
FROM [TaskUpdates] AS [t]
INNER JOIN [Tasks] AS [t.Task] ON [t].[TaskId] = [t.Task].[TaskID]
LEFT JOIN [Requirements] AS [t.Task.Requirement] ON [t.Task].[RequirementId] = [t.Task.Requirement].[RequirementID]
INNER JOIN [Status] AS [t.Status] ON [t].[StatusId] = [t.Status].[StatusId]
INNER JOIN [Users] AS [t.User] ON [t].[UserId] = [t.User].[UserId]
INNER JOIN [DirectoryPeople] AS [t.User.Employee] ON [t.User].[EmployeeNumber] = [t.User.Employee].[EMPLOYEENUMBER]
WHERE [t.User.Employee].[EMPLOYEENUMBER] = 765448466
ORDER BY [t].[UpdateDate]
GO
(I'm using LINQPad to experiment with this query and get the SQL.) In particular, the ending .Select(...) method correctly returns the Requirement object from the query.
What baffles me is if I want to make this query return data for multiple employees, and I change the where clause like so:
var employeeNumbers = new int[] { 765448466 };
var q = (
from n in TaskUpdates.Include(t => t.Status).Include("Task").Include("Task.Requirement").Include("User").Include("User.Employee")
//where n.User.Employee.EmployeeNumber == 765448466
where employeeNumbers.Contains(n.User.Employee.EmployeeNumber)
group n by n.UpdateDate into tu
select tu.OrderByDescending(t=>t.UpdateDate).FirstOrDefault()
)
.Select(x => x.Task.Requirement);
This changes the resulting SQL WHERE clause exactly as I would expect, but it now completely ignores the Includes in the from clause:
SELECT [t].[TaskUpdateID], [t].[Active], [t].[TaskId], [t].[Notes], [t].[StatusId], [t].[UpdateDate], [t].[UserId]
FROM [TaskUpdates] AS [t]
INNER JOIN [Users] AS [t.User] ON [t].[UserId] = [t.User].[UserId]
INNER JOIN [DirectoryPeople] AS [t.User.Employee] ON [t.User].[EmployeeNumber] = [t.User.Employee].[EMPLOYEENUMBER]
WHERE [t.User.Employee].[EMPLOYEENUMBER] IN (765448466)
ORDER BY [t].[UpdateDate]
GO
(only joins as necessary to execute the where) and the result of the final .Select(...) now returns null.
Is this known behavior, with or without explanation? Am I using the Include directives incorrectly, or is there a better way/place for them to go that will resolve this issue?

I can't say for certain what the cause is. I suspect EF is going down a different translation path with Contains and missing the Includes. However, as you can see, it isn't translating the GroupBy at all (there is no GROUP BY in either generated SQL statement), so the query can be reworked to better match what EF is able to translate.
TaskUpdates
.Include(x => x.Task)
.ThenInclude(x => x.Requirement)
.Where(x => employeeNumbers.Contains(x.User.Employee.EmployeeNumber))
.ToList()
.GroupBy(x => x.UpdateDate)
.Select(x => new {
UpdateDate = x.Key,
FirstRequirement = x.First().Task.Requirement
})
.ToList();
This should translate everything before the first ToList() into SQL, load the results into memory, and let C# do the GroupBy and the per-group selection on whole objects, which SQL would be unable to do.
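The "filter in SQL, group in memory" split described above can be sketched with plain LINQ to Objects. This is only an illustration, not the asker's model: the in-memory tuples and their field names are hypothetical stand-ins for the rows EF would return after the first ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows standing in for TaskUpdates already loaded from the database.
var updates = new List<(int EmployeeNumber, DateTime UpdateDate, string Requirement)>
{
    (765448466, new DateTime(2019, 1, 1), "Req A"),
    (765448466, new DateTime(2019, 1, 2), "Req B"),
    (123456789, new DateTime(2019, 1, 2), "Req C"),
};
var employeeNumbers = new[] { 765448466 };

// Step 1: the filter. In the real query this is the part EF translates to SQL (IN (...)).
var filtered = updates
    .Where(u => employeeNumbers.Contains(u.EmployeeNumber))
    .ToList();

// Step 2: grouping and picking one row per group happens in memory.
var firstPerDate = filtered
    .GroupBy(u => u.UpdateDate)
    .Select(g => (UpdateDate: g.Key, FirstRequirement: g.First().Requirement))
    .ToList();

foreach (var row in firstPerDate)
    Console.WriteLine($"{row.UpdateDate:yyyy-MM-dd}: {row.FirstRequirement}");
```

Because the grouping runs on materialized objects, navigation data only needs to have been loaded by the time ToList() returns, which is where the Include/ThenInclude calls matter.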

Related

EF How to query more entities with .include() and using repository pattern

I got the following sql statement that I want to implement with entity framework with linq (lambda expression). Here is the SQL:
select *
from tbl_ExampleStoneCatalog
join tbl_ExampleStoneCategory
on tbl_ExampleStoneCatalog.fk_ESC = tbl_ExampleStoneCategory.pk_ESC
join tbl_ExampleStones
on tbl_ExampleStoneCatalog.fk_ES = tbl_ExampleStones.pk_ES
join tbl_ExampleReviewStoneCatalog
on tbl_ExampleStones.pk_ES = tbl_ExampleReviewStoneCatalog.fk_ES
where .fk_StoneCategory = '%someParameter%'
I tried to use the .include() which brings me to this:
var res = (await this._exampleStoneCatalog.Query()
.include(esc => esc.ExampleStoneCategory)
.include(es => es.ExampleStones)
.include(es => es.ExampleStones.ExampleReviewStoneCatalog))
.Where(w => w.ExampleStones.ExampleReviewStoneCatalog.Any(
a => a.StoneCategoryID.Equals(%someParameter%)));
Unfortunately, the code above doesn't deliver the desired result. Furthermore, there is a nested Where condition in it => ExampleStones.ExampleReviewStoneCatalog.StoneCategoryID. From what I understand after some research, this is not easily solvable with .include().
Are there other ways to filter in nested queries using lambda expressions?
It seems like a many-to-many relationship. I always find it easiest to begin with the connecting table here.
var res = _tbl_B.Repository.Where(b => b.c.Value == "whatever" && b.a.Value == "whatever").Select(b => b.a);
I have found a workaround for this problem. The main challenge here is to filter in a nested SQL query. I could not find a solution with .include(). In particular, my current work environment, in which we are using the repository pattern, wouldn't allow me to filter within includes like this:
var res = await this._exampleStoneCatalog.Query().include(x => x.ExampleStones.ExampleReviewStoneCatalog.Where(w => w.StoneCategoryID.Equals(%SomeParameter%))).SelectAsync();
Hence I came up with the following solution using LINQ joins.
My solution:
var exampleStoneCatalogEnum = await this._exampleStoneCatalog.Query().SelectAsync();
var exampleStoneCategoryEnum = await this._exampleStoneCategoryRepository.Query().SelectAsync();
var exampleStonesEnum = await this.exampleStonesRepository.Query().SelectAsync();
var exampleReviewStoneCatalogEnum = await this.exampleReviewStoneCatalogRepository.Query().SelectAsync();
var result = from exampleStoneCatalog in exampleStoneCatalogEnum
join exampleStoneCategory in exampleStoneCategoryEnum on exampleStoneCatalog.Id equals exampleStoneCategory.Id
join exampleStones in exampleStonesEnum on exampleStoneCatalog.Id equals exampleStones.Id
join exampleReviewStoneCatalog in exampleReviewStoneCatalogEnum on exampleStones.Id equals exampleReviewStoneCatalog.Id
where exampleReviewStoneCatalog.StoneCategoryID.Equals(revCategory)
select exampleStoneCatalog;
return result;
As you can see, I first get the required data from each table and then join them in my result, with the where condition at the end. This returns the desired result.

Entity Framework (using In and Select Distinct)

I am relatively new to Entity Framework 6.0 and I have come across a situation where I want to execute a query in my C# app that would be similar to this SQL Query:
select * from periods where id in (select distinct periodid from ratedetails where rateid = 3)
Is it actually possible to execute a query like this in EF or would I need to break it into smaller steps?
Assuming that you have in your Context class:
DbSet<Period> Periods...
DbSet<RateDetail> RateDetails...
You could use some Linq like this:
var distincts = dbContext.RateDetails
.Where(i => i.rateId == 3)
.Select(i => i.PeriodId)
.Distinct();
var result = dbContext.Periods
    .Where(p => distincts.Contains(p.Id));
Edit: Depending on your entities, you will probably need a custom Comparer for Distinct(). You can find a tutorial here, and also here
or use some more Linq magic to split the results.
Yes, this can be done, but you should really provide a better example for your query. You are already providing a bad starting point there. Let's use this one:
SELECT value1, value2, commonValue
FROM table1
WHERE EXISTS (
SELECT 1
FROM table2
WHERE table1.commonValue = table2.commonValue
-- include some more filters here on table2
)
First, it's almost always better to use EXISTS instead of IN.
Now to turn this into a Lambda would be something like this, again you provided no objects or object graph so I will just make something up.
DbContext myContext = this.getContext();
// DbContext exposes Set<T>(), not DbSet<T>()
var myResults = myContext.Set<Type1>()
    .Where(x => myContext.Set<Type2>().Any(y => y.commonValue == x.commonValue))
    .Select(x => x);
EDIT - updated after you provided the new sql statement
Using your example objects this would produce the best result. Again, this is more efficient than a Contains which translates to an IN clause.
Sql you really want:
SELECT *
FROM periods
WHERE EXISTS (SELECT 1 FROM ratedetails WHERE rateid = 3 AND periods.id = ratedetails.periodid)
The lambda statement you are after:
DbContext myContext = this.getContext();
// DbContext exposes Set<T>(), not DbSet<T>()
var myResults = myContext.Set<Periods>()
    .Where(x => myContext.Set<RateDetails>().Any(y => y.periodid == x.id && y.rateid == 3))
    .Select(x => x);
Here is a good starting point for learning about lambdas and how to use them.
Lambda Expressions (C# Programming Guide).
This is the inner select (the subquery) of your query:
var periodIdList = ratedetails.Where(x => x.rateid == 3).Select(x => x.periodid).Distinct();
Now for the outer part of the query:
var selected = periods
    .Where(p => periodIdList.Contains(p.id))
    .ToList();
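The two shapes discussed in the answers above (a Contains over a distinct id list, which maps to IN, and a correlated Any, which maps to EXISTS) select the same rows. A minimal LINQ to Objects sketch, with made-up tuples standing in for the periods and ratedetails tables, shows the equivalence; it says nothing about which one the database executes faster.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the periods and ratedetails tables.
var periods = new[] { (Id: 1, Name: "P1"), (Id: 2, Name: "P2"), (Id: 3, Name: "P3") };
var rateDetails = new[]
{
    (RateId: 3, PeriodId: 1),
    (RateId: 3, PeriodId: 1), // duplicate on purpose
    (RateId: 3, PeriodId: 3),
    (RateId: 5, PeriodId: 2),
};

// IN-style: collect the distinct period ids first, then test membership.
var periodIds = rateDetails
    .Where(r => r.RateId == 3)
    .Select(r => r.PeriodId)
    .Distinct();
var viaContains = periods.Where(p => periodIds.Contains(p.Id)).Select(p => p.Id).ToList();

// EXISTS-style: a correlated Any per period; no Distinct is needed.
var viaAny = periods
    .Where(p => rateDetails.Any(r => r.RateId == 3 && r.PeriodId == p.Id))
    .Select(p => p.Id)
    .ToList();

Console.WriteLine(string.Join(",", viaContains));
Console.WriteLine(string.Join(",", viaAny));
```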

Entity Framework Group By Then Order By

I'm generating a query using Entity Framework which uses a group by clause and then attempts to order each of the groups to get specific data. I attempted to optimize the order by so that it only happens once by using a let statement. The query still executes, but the results are incorrect.
Concept:
var results =
(from n in noteEntities.NoteLog
where associatedIDs.Contains(n.AssociatedID)
group n by n.AssociatedID into gn
let ogn = gn.OrderByDescending(t => t.CreatedDateTime)
let successNote = ogn.FirstOrDefault(x => x.Type == "Success")
let lastStatusNote = ogn.FirstOrDefault()
select new { Success = successNote, Status = lastStatusNote, AssociatedID = gn.Key }).ToList();
However, the let variable ogn, which should hold the group ordered by descending date, is not actually ordered when it is used in the subsequent let statements, so I'm getting the wrong success and status notes. I've also tried creating a sub-query and referencing its result, but that doesn't seem to return an ordered list either, e.g.:
var subQuery =
(from n in noteEntities.NoteLog
where associatedIDs.Contains(n.AssociatedID)
group n by n.AssociatedID into gn
select gn.OrderByDescending(t => t.CreatedDateTime));
var results =
(from s in subQuery
let successNote = s.FirstOrDefault(x => x.Type == "Success")
let lastStatusNote = s.FirstOrDefault()
select new { Success = successNote, Status = lastStatusNote }).ToList();
I can make this work by using OrderByDescending twice in the select statement or let statements for the success and status notes but this becomes very slow, and redundant, when there are a lot of notes. Is there a way to run the order by only once and get the right results back?
In SQL, a subquery with ORDER BY must also have a TOP clause (yours does not). When LINQ detects that the ordered subquery has no FirstOrDefault or Take applied directly to it, it simply strips the OrderByDescending.
If you are having a performance problem with the query perhaps you should look into indexing the table.
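If the filtered notes are small enough to materialize, one way to sort each group exactly once is to do the grouping in LINQ to Objects, where the ordered list is kept and reused. This is a sketch under that assumption, not what EF generates; the tuple fields below are hypothetical stand-ins for NoteLog rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows standing in for NoteLog entries already loaded into memory.
var notes = new List<(int AssociatedID, string Type, DateTime Created)>
{
    (1, "Success", new DateTime(2020, 1, 1)),
    (1, "Info",    new DateTime(2020, 1, 3)),
    (1, "Success", new DateTime(2020, 1, 2)),
    (2, "Info",    new DateTime(2020, 1, 5)),
};

var results = notes
    .GroupBy(n => n.AssociatedID)
    .Select(g =>
    {
        // Sort once per group and materialize, so both lookups reuse the same order.
        var ordered = g.OrderByDescending(n => n.Created).ToList();
        return (
            AssociatedID: g.Key,
            Success: ordered.FirstOrDefault(n => n.Type == "Success"), // default tuple if none
            Status: ordered.FirstOrDefault());
    })
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.AssociatedID}: success={r.Success.Created:d}, status={r.Status.Type}");
```

Note that FirstOrDefault on a value tuple returns a default tuple rather than null when nothing matches, so with real entity classes the null check would look different.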

sub linq query is making this take a very long time, how can I make this faster?

I have a list of employees that I build like this:
var employees = db.employees.Where(e => e.isActive == true).ToList();
var latestSales = db.employee_sales.Where(x => x.returned == false);
Now what I want is a result like this:
int employeeId
List<DateTime> lastSaleDates
So I tried this, but the query takes a very very long time to finish:
var result =
    (from e in employees
     select new EmployeeDetails
     {
         EmployeeId = e.employeeId,
         LastSaleDates =
             (from lsd in latestSales.Where(x => x.EmployeeId == e.employeeId)
                                     .Select(x => x.SaleDate)
              select lsd).ToList()
     });
The above works, but literally takes 1 minute to finish.
What is a more efficient way to do this?
You can use join to get all data in single query
var result = from e in db.employees.Where(x => x.isActive)
             // the question keeps sales that were NOT returned
             join es in db.employee_sales.Where(x => !x.returned)
                 on e.employeeId equals es.EmployeeId into g
             select new {
                 EmployeeId = e.employeeId,
                 LastSaleDates = g.Select(x => x.SaleDate)
             };
Unfortunately, you can't call ToList() inside the projection with LINQ to Entities. So either map the anonymous objects manually to your EmployeeDetails, or change the type of LastSaleDates to IEnumerable<DateTime>.
Your calls to ToList are pulling things into memory. You should build up a LINQ expression instead of pulling an entire query into memory. In your second query, you are issuing a new query for each employee, since you are then operating in the LINQ-to-Objects domain (as opposed to in EF). Try removing your calls to ToList.
You should also look into using Foreign Key Association Properties to make this query a lot nicer. Association properties are some of the most powerful and useful parts of EF. Read more about them here. If you have the proper association properties, your query can look as nice as this:
var result = from e in employees
             select new EmployeeDetails
             {
                 EmployeeId = e.employeeId,
                 LastSaleDates = e.AssociatedSales
             };
You might also consider using a join instead. Read about Linq's Join method here.
Is there an association in your model between employees and latestSales? Have you checked SQL Profiler or other profiling tools to see the SQL that's generated? Make sure the ToList() isn't issuing a separate query for each employee.
If you can live with a result structure as IEnumerable<EmployeeId, IEnumerable<DateTime>>, you could consider modifying this to be:
var result = (from e in employees
              select new EmployeeDetails
              {
                  EmployeeId = e.employeeId,
                  // "equals" is only valid in a join clause; a where clause needs ==
                  LastSaleDates = (from lsd in latestSales
                                   where e.employeeId == lsd.EmployeeId
                                   select lsd.SaleDate)
              });
I have some more general recommendations at http://www.thinqlinq.com/Post.aspx/Title/LINQ-to-Database-Performance-hints to help track issues down.
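The underlying pattern in all of these answers is to avoid one sales lookup per employee. When both sets are already in memory, a lookup keyed by employee id does that in a single pass. The sketch below uses hypothetical tuples in place of the employees and employee_sales entities; with EF the join or association-property versions above would play the same role on the database side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the employees and employee_sales tables.
var employees = new[]
{
    (EmployeeId: 1, IsActive: true),
    (EmployeeId: 2, IsActive: true),
    (EmployeeId: 3, IsActive: false),
};
var sales = new[]
{
    (EmployeeId: 1, SaleDate: new DateTime(2021, 3, 1), Returned: false),
    (EmployeeId: 1, SaleDate: new DateTime(2021, 3, 5), Returned: true),
    (EmployeeId: 1, SaleDate: new DateTime(2021, 4, 2), Returned: false),
    (EmployeeId: 3, SaleDate: new DateTime(2021, 4, 9), Returned: false),
};

// Build the lookup once: employee id -> dates of sales that were not returned.
ILookup<int, DateTime> salesByEmployee = sales
    .Where(s => !s.Returned)
    .ToLookup(s => s.EmployeeId, s => s.SaleDate);

// Each employee then reads its dates from the lookup; a missing key yields an empty sequence.
var result = employees
    .Where(e => e.IsActive)
    .Select(e => (e.EmployeeId, LastSaleDates: salesByEmployee[e.EmployeeId].ToList()))
    .ToList();

foreach (var r in result)
    Console.WriteLine($"{r.EmployeeId}: {r.LastSaleDates.Count} sale(s)");
```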

Convert A Union Query To LINQ To Entity Query

Can someone help me convert this query to a LINQ to Entities query in the proper way? I am fairly new to LINQ and want to write these queries properly. This is a fairly involved one for me, with a UNION and subqueries in it.
SELECT pf.FileID, pf.ServerName, pf.MigrationType
FROM pOrders pf
WHERE pf.FileID IN (select GCMFileID FROM Signals
where SignalFileID = " + FileID + ")
UNION
SELECT pf.FileID, pf.ServerName, pf.MigrationType
FROM pOrders pf
WHERE pf.FileID = " + FileID + "
order by pf.MigrationType desc
I know, I saw comments... but
var signalIds = Signals.Where(s => s.SignalFileId == FILEID).Select(x => x.GCMFileID).ToArray();
pOrders.Where(pf => signalIds.Contains(pf.FileID))
.Union(
pOrders.Where(pf => pf.FileID == FILEID))
.OrderByDescending(u => u.MigrationType)
.Select(u => new {u.FileID, u.ServerName, u.MigrationType});
var innerquery = from t in db.Signals
                 where t.SignalFileID == FileID
                 select t.GCMFileID;
var query = (from p in db.pOrders
             where innerquery.Contains(p.FileID)
             select new { p.FileID, p.ServerName, p.MigrationType })
            .Union(
             from p in db.pOrders
             where p.FileID == FileID
             select new { p.FileID, p.ServerName, p.MigrationType })
            // order the combined result, as the SQL does
            .OrderByDescending(p => p.MigrationType);
I know this is an old question but I thought I'd add my two cents hoping I can save some time for someone who thinks as I originally did that Union() is the correct method to use.
My first misstep was to create a custom comparer with my entity's logical keys after I hit my first error that the xml column type cannot be used in a distinct. Then, Linq to Entities complained it did not recognize Union(). I notice the accepted answer calls ToArray. This brings the entire results of the first query into memory before doing the Union. The OP wants Linq to Entities so you need to act on an IQueryable. Use Concat. The entire query will run in the database.
var innerquery = (from t in db.Signals
                  where t.SignalFileID == FileID
                  // the SQL subquery selects GCMFileID
                  select t.GCMFileID);
var query = (from p in db.pOrders
             where innerquery.Contains(p.FileID)
             select new { p.FileID, p.ServerName, p.MigrationType })
            .Concat(from p in db.pOrders
                    where p.FileID == FileID
                    select new { p.FileID, p.ServerName, p.MigrationType })
            // the SQL orders by MigrationType desc
            .OrderByDescending(o => o.MigrationType);
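One thing to keep in mind when swapping Union for Concat: Concat keeps duplicates (it corresponds to SQL's UNION ALL), while Union removes them like SQL's UNION. If the same row can come from both halves, a Distinct() after Concat restores the UNION behavior. A small LINQ to Objects sketch with hypothetical rows:

```csharp
using System;
using System.Linq;

// Hypothetical rows; (2, "A") appears in both halves on purpose.
var fromSignals = new[] { (FileID: 1, MigrationType: "B"), (FileID: 2, MigrationType: "A") };
var fromFileId  = new[] { (FileID: 2, MigrationType: "A"), (FileID: 3, MigrationType: "C") };

var union   = fromSignals.Union(fromFileId).ToList();             // duplicates removed
var concat  = fromSignals.Concat(fromFileId).ToList();            // duplicates kept
var deduped = fromSignals.Concat(fromFileId).Distinct().ToList(); // same rows as Union

Console.WriteLine($"Union: {union.Count}, Concat: {concat.Count}, Concat+Distinct: {deduped.Count}");
// prints "Union: 3, Concat: 4, Concat+Distinct: 3"
```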
