Linq Query With Multiple Joins Not Giving Correct Results - c#

I have a LINQ query that replaces a database function. It is the first one I've written with multiple joins, and I can't figure out why it returns 0 results.
If you can spot any difference that could cause the incorrect result, it would be greatly appreciated. I've been trying to solve this for longer than I should have.
Linq Query
context.StorageAreaRacks
.Join(context.StorageAreas, sar => sar.StorageAreaId, sa => sa.Id, (sar, sa) => new { sar, sa })
.Join(context.StorageAreaTypes, xsar => xsar.sar.StorageAreaId, sat => sat.Id, (xsar, sat) => new { xsar, sat })
.Join(context.Racks, xxsar => xxsar.xsar.sar.RackId, r => r.Id, (xxsar, r) => new { xxsar, r })
.Where(x => x.xxsar.sat.IsManual == false)
.Where(x => x.r.IsEnabled == true)
.Where(x => x.r.IsVirtual == false)
.Select(x => new { x.xxsar.sat.Id, x.xxsar.sat.Name })
.Distinct()
.ToList();
This is the SQL generated by the LINQ query
SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[Id] AS [Id],
[Distinct1].[Name] AS [Name]
FROM ( SELECT DISTINCT
[Extent2].[Id] AS [Id],
[Extent2].[Name] AS [Name],
1 AS [C1]
FROM [dbo].[StorageAreaRacks] AS [Extent1]
INNER JOIN [dbo].[StorageAreaTypes] AS [Extent2] ON [Extent1].[StorageAreaId] = [Extent2].[Id]
INNER JOIN [dbo].[Racks] AS [Extent3] ON [Extent1].[RackId] = [Extent3].[Id]
WHERE (0 = [Extent2].[IsManual]) AND (1 = [Extent3].[IsEnabled]) AND (0 = [Extent3].[IsVirtual])
) AS [Distinct1]
SQL query which produces the required results
SELECT DISTINCT sat.Name, sat.Id
FROM StorageAreaRacks sar
JOIN StorageAreas sa on sa.id = sar.StorageAreaId
JOIN StorageAreaTypes sat on sat.id = sa.StorageAreaTypeId
JOIN Racks r on r.id = sar.RackId
WHERE sat.IsManual = 0
AND r.IsEnabled = 1
AND r.IsVirtual = 0

Joins written in LINQ method syntax are hard to read and error-prone.
Joins written in LINQ query syntax are better, but still error-prone (you can join on the wrong key, as you did) and tell you nothing about join cardinality.
For LINQ to Entities queries, the best option is to use navigation properties (as Gert Arnold suggested in the comments; see also "Don't use Linq's Join. Navigate!"), because they have none of the aforementioned drawbacks.
The whole query should be something like this:
var query = context.StorageAreaRacks
.Where(sar => !sar.StorageArea.StorageAreaType.IsManual
&& sar.Rack.IsEnabled && !sar.Rack.IsVirtual)
.Select(sar => new
{
sar.StorageArea.StorageAreaType.Id,
sar.StorageArea.StorageAreaType.Name,
})
.Distinct();
or
var query = (
from sar in context.StorageAreaRacks
let sat = sar.StorageArea.StorageAreaType
let r = sar.Rack
where !sat.IsManual && r.IsEnabled && !r.IsVirtual
select new { sat.Id, sat.Name })
.Distinct();
Simple, readable, and with almost no room for mistakes. Navigation properties are one of the most beautiful features of EF; don't miss out on them.

Your LINQ doesn't translate the SQL correctly: it joins StorageAreaTypes on StorageAreaRacks.StorageAreaId instead of on StorageAreas.StorageAreaTypeId, which is why EF drops the StorageAreas join (it has no effect on the outcome).
I think it is clearer if you flatten the anonymous objects at each join and name them after their members (the joined tables). There is also no reason to split the Where clauses, since LINQ can use && just as SQL uses AND. If you have boolean values, don't compare them to true or false. Finally, there is no reason to pass along range variables that aren't used later.
Putting it all together:
var ans = context.StorageAreaRacks
.Join(context.StorageAreas, sar => sar.StorageAreaId, sa => sa.Id, (sar, sa) => new { sar, sa })
.Join(context.StorageAreaTypes, sarsa => sarsa.sa.StorageAreaTypeId, sat => sat.Id, (sarsa, sat) => new { sarsa.sar, sat })
.Join(context.Racks, sarsat => sarsat.sar.RackId, r => r.Id, (sarsat, r) => new { sarsat.sat, r })
.Where(satr => !satr.sat.IsManual && satr.r.IsEnabled && !satr.r.IsVirtual)
.Select(satr => new { satr.sat.Id, satr.sat.Name })
.Distinct()
.ToList();
However, I think when multiple joins are involved and when translating SQL, LINQ comprehension syntax can be easier to understand:
var ans = (from sar in context.StorageAreaRacks
join sa in context.StorageAreas on sar.StorageAreaId equals sa.Id
join sat in context.StorageAreaTypes on sa.StorageAreaTypeId equals sat.Id
join r in context.Racks on sar.RackId equals r.Id
where !sat.IsManual && r.IsEnabled && !r.IsVirtual
select new {
sat.Name,
sat.Id
}).Distinct().ToList();
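To see the effect of the wrong join key without a database, here is a minimal LINQ to Objects sketch. The in-memory tuples and their Id values are made up for illustration (they are not the asker's data), and the Ids are deliberately chosen so that StorageArea Ids never coincide with StorageAreaType Ids, which is the situation in which the original query returns nothing.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the four tables (assumed data, not from the question).
var storageAreaTypes = new[]
{
    (Id: 1, Name: "Automatic", IsManual: false),
    (Id: 2, Name: "Manual", IsManual: true),
};
var storageAreas = new[]
{
    (Id: 10, StorageAreaTypeId: 1),
    (Id: 11, StorageAreaTypeId: 2),
};
var racks = new[]
{
    (Id: 100, IsEnabled: true, IsVirtual: false),
    (Id: 101, IsEnabled: true, IsVirtual: true),
};
var storageAreaRacks = new[]
{
    (StorageAreaId: 10, RackId: 100),
    (StorageAreaId: 10, RackId: 101),
    (StorageAreaId: 11, RackId: 100),
};

// Wrong key, as in the question: StorageAreaTypes is joined on sar.StorageAreaId.
var wrong = (from sar in storageAreaRacks
             join sa in storageAreas on sar.StorageAreaId equals sa.Id
             join sat in storageAreaTypes on sar.StorageAreaId equals sat.Id
             join r in racks on sar.RackId equals r.Id
             where !sat.IsManual && r.IsEnabled && !r.IsVirtual
             select new { sat.Id, sat.Name }).Distinct().ToList();

// Right key, as in the SQL: StorageAreaTypes is joined on sa.StorageAreaTypeId.
var right = (from sar in storageAreaRacks
             join sa in storageAreas on sar.StorageAreaId equals sa.Id
             join sat in storageAreaTypes on sa.StorageAreaTypeId equals sat.Id
             join r in racks on sar.RackId equals r.Id
             where !sat.IsManual && r.IsEnabled && !r.IsVirtual
             select new { sat.Id, sat.Name }).Distinct().ToList();

Console.WriteLine($"wrong key: {wrong.Count} row(s)");
Console.WriteLine($"right key: {right.Count} row(s)");
```

With this sample data the wrong key matches nothing, while the right key returns the single non-manual type.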

You are missing a Where for your rack ID != null in your LINQ statement, and a Distinct().

Related

Convert this SQL query to LINQ?

Using a SQL query within Microsoft SQL Server, I am able to get my desired results. Now I'm trying to utilize this query in my project via LINQ.
My SQL query is
SELECT distinct DeviceId, max (Head), max(Shoulder), max(Chest)
FROM EventUserOverPressure eop
JOIN UserEventInfo uei on uei.UserEventInfo_Id = eop.UserEventInfo_Id
JOIN BlastRecord br ON br.BlastRecord_Id = uei.BlastRecord_Id
JOIN WeaponsFiringLog wfl ON wfl.BlastRecord_Id = br.BlastRecord_Id
JOIN WeaponsFired wf ON wf.Blast_WFL_Id = wfl.Blast_WFL_Id
WHERE br.BlastRecord_Id = 1599
group BY DeviceId
Thus far, my LINQ query is
var myOverPressures = (from eop in db.EventUserOverPressures
join uei in ueiList on eop.UserEventInfo_Id equals uei.UserEventInfo_Id
join br in blastRecords on uei.BlastRecord_Id equals br
join wfl in weaponFiringLogss on uei.BlastRecord_Id equals wfl.BlastRecord_Id
join wf in weaponsFired on wfl.Blast_WFL_Id equals wf.Blast_WFL_Id
where (eop.Chest > 0 || eop.Head > 0 || eop.Shoulder > 0)
select new { eop.DeviceDataId, eop.Head, eop.Shoulder, eop.Chest }).Distinct().ToList();
I know the BlastRecord_Id is set to 1599 in the SQL while it's a variable in LINQ. That's intentional: I was trying to figure out my query in SQL, so I focused on a specific record. In LINQ it needs to work for all BlastRecord_Ids. Using LINQ, I'm able to group by DeviceDataId in a following step, outside of the initial query.
My goal is to group by DeviceDataId as part of this query and get the max values for Head, Shoulder, and Chest, like I did in the SQL query. If it matters, my end goal is to sort my results. I know my SQL query results give me what I need in order to sort how I want. I've spent an embarrassing amount of time trying to figure this out. Any help is greatly appreciated.
Try:
var myOverPressures = (
from eop in db.EventUserOverPressures
join uei in ueiList on eop.UserEventInfo_Id equals uei.UserEventInfo_Id
join br in blastRecords on uei.BlastRecord_Id equals br.BlastRecord_Id
join wfl in weaponFiringLogss on uei.BlastRecord_Id equals wfl.BlastRecord_Id
join wf in weaponsFired on wfl.Blast_WFL_Id equals wf.Blast_WFL_Id
where (eop.Chest > 0 || eop.Head > 0 || eop.Shoulder > 0)
select new { eop.DeviceDataId, eop.Head, eop.Shoulder, eop.Chest }
)
.GroupBy(r => r.DeviceDataId)
.Select(g => new {
DeviceDataId = g.Key,
maxHead = g.Max(r => r.Head),
maxShoulder = g.Max(r => r.Shoulder),
maxChest = g.Max(r => r.Chest)
})
.ToList();
The .GroupBy() maps the data to a collection of groups, each of which has a key and a collection of group member objects. The .Select() then extracts the key and calculates the max of the Head/Shoulder/Chest values within each group.
I removed the .Distinct(), as I believe it is unnecessary: each group key (DeviceDataId) should already be distinct.
As a side note: I noticed that the join structure of your query has what appears to be two independent one-to-many join relationships:
BlastRecord
+--> UserEventInfo --> EventUserOverPressure
+--> WeaponsFiringLog --> WeaponsFired
This may lead to the results being the Cartesian product of the two join paths, yielding duplicate data. This could be a problem if you were counting or summing the values, but if max() is the only aggregation used, I do not believe the results are affected.
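The point about duplicates can be checked with a small in-memory sketch. The data below is invented for illustration: one blast record with two pressure readings for a single device and three weapon-firing rows, so joining both paths produces 2 x 3 = 6 rows.

```csharp
using System;
using System.Linq;

// Hypothetical data (not from the question): two readings and three firings
// that all belong to the same blast record.
var readings = new[]
{
    (BlastRecordId: 1, DeviceId: "D1", Head: 2.0),
    (BlastRecordId: 1, DeviceId: "D1", Head: 5.0),
};
var firings = new[]
{
    (BlastRecordId: 1, Weapon: "A"),
    (BlastRecordId: 1, Weapon: "B"),
    (BlastRecordId: 1, Weapon: "C"),
};

// Joining both one-to-many paths repeats every reading once per firing row.
var joined = from r in readings
             join f in firings on r.BlastRecordId equals f.BlastRecordId
             select r;

var perDevice = joined
    .GroupBy(r => r.DeviceId)
    .Select(g => new
    {
        DeviceId = g.Key,
        Rows = g.Count(),             // 6, although there are only 2 readings
        MaxHead = g.Max(r => r.Head), // 5.0, unaffected by the duplication
        SumHead = g.Sum(r => r.Head), // 21.0, three times the true total of 7.0
    })
    .Single();

Console.WriteLine(perDevice);
```

Max survives the fan-out because repeating a value never changes the maximum, whereas Count and Sum scale with the number of duplicated rows.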
I believe the grouping and aggregation may also be done in the LINQ query syntax. Something like:
var myOverPressures = (
...
group eop by eop.DeviceDataId into g
select new {
DeviceDataId = g.Key,
maxHead = g.Max(r => r.Head),
maxShoulder = g.Max(r => r.Shoulder),
maxChest = g.Max(r => r.Chest)
}
)
.ToList();
(I am not 100% sure I have this right. If someone spots an error and comments, I will correct the above.)
Try this one
var query = (from eop in db.EventUserOverPressure
join uei in db.UserEventInfo on eop.UserEventInfo_Id equals uei.UserEventInfo_Id
join br in db.BlastRecord on uei.BlastRecord_Id equals br.BlastRecord_Id
join wfl in db.WeaponsFiringLog on br.BlastRecord_Id equals wfl.BlastRecord_Id
join wf in db.WeaponsFired on wfl.Blast_WFL_Id equals wf.Blast_WFL_Id
where br.BlastRecord_Id == 1599
group eop by eop.DeviceId into g
select new
{
DeviceId = g.Key,
Head = g.Max(x => x.Head),
Shoulder = g.Max(x => x.Shoulder),
Chest = g.Max(x => x.Chest)
});

How to convert LINQ query syntax with joins, groups and inline query to LINQ method syntax

I need to convert the following query in SQL to LINQ using the method syntax, but I'm getting confused when I try to include the inline query, joins and groupings:
SELECT CL.*, CB.*, CD.*, CC.*, CS.*
FROM clients AS CL
INNER JOIN (
SELECT A.client_id, SUM(A.amount) AS balance
FROM accounts AS A
WHERE A.account_id = 1
GROUP BY A.client_id
HAVING (SUM(A.amount) > 0.015) OR (SUM(A.amount) < -0.015)
) AS CB ON CL.client_id = CB.client_id
INNER JOIN client_details AS CD ON (CL.client_id = CD.client_id) AND (CL.audit_id = CD.audit_id)
LEFT JOIN client_categories AS CC ON CD.client_category_id = CC.client_category_id
LEFT JOIN statuses AS CS ON CD.status_id = CS.status_id
I did get the GROUP BY and HAVING portions working independently of the overall query, but I have not been able to merge this with the rest as an inline query (though I've tried .Any and other LINQ methods):
.Where(a => a.AccountId == 1)
.GroupBy(a => new { a.ClientId })
.Where(ag => ag.Sum(a => a.Amount) > 0.015M || ag.Sum(a => a.Amount) < -0.015M)
.Select(ag => new { Id = ag.Key.ClientId, Balance = ag.Sum(a => a.Amount) })
I have managed to convert it to LINQ using the query syntax, which works perfectly for me, but I need it in the method (Lambda) syntax:
var clients = from c in _context.Clients
join cb in (from a in _context.Accounts
where a.AccountId == 1
group a by new { Id = a.ClientId } into g
where g.Sum(gs => gs.Amount) > 0.015M || g.Sum(gs => gs.Amount) < -0.015M
select new { g.Key.Id, Balance = g.Sum(gs => gs.Amount) }) on c.Id equals cb.Id
join cd in _context.ClientDetails on new { c.Id, c.AuditId } equals new { cd.Id, cd.AuditId }
join cc in _context.ClientCategories on cd.ClientCategoryId equals cc.Id into ccj
from cc in ccj.DefaultIfEmpty()
join cs in _context.Statuses on cd.StatusId equals cs.Id into csj
from cs in csj.DefaultIfEmpty()
select new Client(c, cb.Balance, new ClientDetails(cd, cc, cs));
Any help greatly appreciated.
It appears that LinqPad offers the solution. Run the LINQ query syntax against your database and, when it finishes, click the Lambda symbol at the bottom of the page to see the method syntax:
Thanks LinqPad!
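For readers without LINQPad at hand, here is a rough sketch of how the pieces of such a query map to method syntax. It runs against invented in-memory data rather than the asker's context, uses simplified shapes (anonymous objects and tuples instead of the Client and ClientDetails classes), and shows only one of the two left joins, since the second follows the same pattern. The mapping itself is standard LINQ: an inner join is Join, a composite key is an anonymous type built identically on both sides, and a left join is GroupJoin followed by SelectMany over DefaultIfEmpty.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for the tables in the question.
var clients = new[] { (Id: 1, AuditId: 7), (Id: 2, AuditId: 8) };
var accounts = new[]
{
    (ClientId: 1, AccountId: 1, Amount: 10.00m),
    (ClientId: 1, AccountId: 1, Amount: -4.00m),
    (ClientId: 2, AccountId: 1, Amount: 0.01m), // removed by the HAVING-style filter
};
var details = new[] { (Id: 1, AuditId: 7, StatusId: 99) };
// Anonymous objects (reference types) so that DefaultIfEmpty yields null;
// there is no status 99, so the left join finds no match.
var statuses = new[] { new { Id = 5, Name = "Active" } };

// Inline query: GROUP BY ... HAVING becomes GroupBy + Select + Where.
var balances = accounts
    .Where(a => a.AccountId == 1)
    .GroupBy(a => a.ClientId)
    .Select(g => new { Id = g.Key, Balance = g.Sum(a => a.Amount) })
    .Where(b => b.Balance > 0.015m || b.Balance < -0.015m);

var result = clients
    // INNER JOIN on the inline query.
    .Join(balances, c => c.Id, cb => cb.Id, (c, cb) => new { c, cb })
    // INNER JOIN on a composite key: both selectors build the same anonymous type.
    .Join(details,
          x => new { x.c.Id, x.c.AuditId },
          cd => new { cd.Id, cd.AuditId },
          (x, cd) => new { x.c, x.cb, cd })
    // LEFT JOIN: GroupJoin, then SelectMany over DefaultIfEmpty.
    .GroupJoin(statuses, x => x.cd.StatusId, cs => cs.Id, (x, csj) => new { x, csj })
    .SelectMany(y => y.csj.DefaultIfEmpty(),
                (y, cs) => new { ClientId = y.x.c.Id, y.x.cb.Balance, Status = cs?.Name })
    .ToList();

foreach (var row in result)
    Console.WriteLine($"{row.ClientId}: {row.Balance} ({row.Status ?? "no status"})");
```

With this sample data only client 1 survives the balance filter, and its Status is null because the left join has no matching row.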

LINQ lambda - OrderByDescending is adding undesired select

I'm working with a pretty wild lambda query. Here is my initial LINQ lambda statement (not being sorted/ordered by):
var query = orders.Join(customers, o => o.CustomerID, c => c.ID, (o, c) => new { o, c })
.Join(ordersections, o => o.o.ID, os => os.OrderID, (o, os) => new { o.o, o.c, os })
.Join(tickets, o => o.os.ID, t => t.OrderSectionID, (o, t) => new { o.o, o.c, o.os, t })
.Join(events, o => o.t.EventID, e => e.id, (o, e) => new { o.o, o.c, o.os, o.t, e })
.Join(clients, o => o.e.ClientID, cl => cl.id, (o, cl) => new { o.o, o.c, o.os, o.t, o.e, cl })
.Join(venues, o => o.e.VenueID, v => v.VenueID, (o, v) => new ModelsCs.LINQ.CustomerSearchResult { order = o.o, customer = o.c, orderSection = o.os, ticket = o.t, evt = o.e, client = o.cl, venue = v })
.AsExpandable()
.Where(predicate) // from PredicateBuilder
.GroupBy(x => new
{
// variables to group by
})
.Select(s => new CustomerSearchResult
{
// Selecting the variables, all good and fun!
});
The SQL that is generated is as follows:
SELECT <correct variables to select>
FROM [dbo].[Order] AS [t0]
INNER JOIN [dbo].[Customer] AS [t1] ON [t0].[Customer] = ([t1].[Customer])
INNER JOIN [dbo].[OrderSection] AS [t2] ON [t0].[Order] = [t2].[Order]
INNER JOIN [dbo].[Ticket] AS [t3] ON [t2].[OrderSection] = [t3].[OrderSection]
INNER JOIN [dbo].[Event] AS [t4] ON [t3].[Event] = [t4].[Event]
INNER JOIN [dbo].[Client] AS [t5] ON [t4].[Client] = ([t5].[Client])
INNER JOIN [dbo].[Venue] AS [t6] ON [t4].[Venue] = ([t6].[Venue])
WHERE ([t5].[Brand] = @p0)
AND ([t0].[Brand] = @p1)
AND ([t4].[EventStart] >= @p2)
AND ([t0].[OrderDateTime] >= @p3)
AND ([t1].[email] LIKE @p4)
GROUP BY <correct group by variables>
Beautiful! But I need to order the results, so I also want this at the end:
...
ORDER BY SortingVariable1 desc
(^^^^ THIS IS WHAT I'M TRYING TO DO)
Here is what I have already tried. I added this to my LINQ lambda statement:
.OrderByDescending(x => x.SortingVariable1)
But this is now the SQL code that is generated:
SELECT <correct variables to select>
FROM (
SELECT <correct GROUP BY variables>
FROM [dbo].[Order] AS [t0]
INNER JOIN [dbo].[Customer] AS [t1] ON [t0].[Customer] = ([t1].[Customer])
INNER JOIN [dbo].[OrderSection] AS [t2] ON [t0].[Order] = [t2].[Order]
INNER JOIN [dbo].[Ticket] AS [t3] ON [t2].[OrderSection] = [t3].[OrderSection]
INNER JOIN [dbo].[Event] AS [t4] ON [t3].[Event] = [t4].[Event]
INNER JOIN [dbo].[Client] AS [t5] ON [t4].[Client] = ([t5].[Client])
INNER JOIN [dbo].[Venue] AS [t6] ON [t4].[Venue] = ([t6].[Venue])
WHERE ([t5].[Brand] = @p0)
AND ([t0].[Brand] = @p1)
AND ([t4].[EventStart] >= @p2)
AND ([t0].[OrderDateTime] >= @p3)
AND ([t1].[email] LIKE @p4)
GROUP BY <correct group by variables>
) AS [t7]
ORDER BY [t7].[SortingVariable1] DESC
No matter where in my lambda statement I put that .OrderByDescending, it doesn't work correctly.
My question: Does anyone know how I can alter my LINQ Lambda statement to correctly add an ORDER BY SortingVariable1 DESC to the end of the generated SQL statement?
The outer SELECT by itself is not a problem, because it adds no overhead of discernible magnitude. The extra level of nesting lets the SQL generator sort on any of the returned fields, even the calculated ones, without including the computation twice.
This behavior comes from restrictions on referring to a computed column by its alias, illustrated by the example below (whether it is rejected depends on the SQL dialect and on the clause that uses the alias):
SELECT A+B as A_plus_B
FROM MyTable
ORDER BY A_plus_B -- <=== This may not work
Where the alias is not accepted, the query must be rewritten either with the computation repeated, i.e.
SELECT A+B as A_plus_B
FROM MyTable
ORDER BY A+B -- <=== Computation is repeated
or with a nested query or a CTE:
SELECT A_plus_B FROM (
SELECT A+B as A_plus_B
FROM MyTable
) AS T
ORDER BY A_plus_B -- <=== This works
LINQ's SQL generator takes the second approach, producing the statement that you see.
It is correctly adding the Order By. It is in the nature of auto-generated code that it is often not going to be as pretty as human generated code. It'll often be more verbose in what it writes, simply because generating such code is often easier.
If you want to have exactly a certain set of SQL code, you'll need to write it by hand. If you want to let it be automatically generated for you then you'll have to be satisfied with less pretty but perfectly correct and equally functional code.

optimize Entity Framework Linq query (selected 1 unexpected field)

All,
Can anyone help me optimize the following EF/Linq query:
The EF/Linq query (taken from LinqPad):
Articles
.AsNoTracking()
.Where(a => a.Active == "J")
.SelectMany(a => KerlServices
.Where(ks => ks.Service.SAPProductNumber == a.SAPProductNumber))
.Select(ks => new {
ks.KerlCode,
ks.Service.SAPProductNumber,
ks.Service.Type })
.ToList()
The relation between Articles and Services (ks.Service.SAPProductNumber == a.SAPProductNumber) is in theory a 1:optional relation, which cannot be defined in EF. This is however not my question.
The resulting SQL query:
SELECT
[Join1].[F_SERVICESID] AS [F_SERVICESID],
[Join1].[F_KERLCOD] AS [F_KERLCOD],
[Join1].[F_SAPARTNUM] AS [F_SAPARTNUM],
[Join1].[F_TYPE] AS [F_TYPE]
FROM [dbo].[T_ART] AS [Extent1]
INNER JOIN (SELECT [Extent2].[F_KERLCOD] AS [F_KERLCOD], [Extent2].[F_SERVICESID] AS [F_SERVICESID], [Extent3].[F_SAPARTNUM] AS [F_SAPARTNUM], [Extent3].[F_TYPE] AS [F_TYPE]
FROM [dbo].[T_SERVICESKERL] AS [Extent2]
INNER JOIN [dbo].[T_SERVICES] AS [Extent3] ON [Extent2].[F_SERVICESID] = [Extent3].[F_ID] ) AS [Join1] ON [Extent1].[F_SAPARTNUM] = [Join1].[F_SAPARTNUM]
WHERE N'J' = [Extent1].[F_ACTIND]
Why does EF generate a query that selects [Join1].[F_SERVICESID]? I don't need this field. Does anyone know a way to prevent this?
Kind regards, Jan.
ADDITION 1:
KerlServices
.AsNoTracking()
.Select(ks => new {
ks.KerlCode,
ks.Service.SAPProductNumber,
ks.Service.Type })
.Join(
Articles,
ks => ks.SAPProductNumber,
a => a.SAPProductNumber,
(ks, a) => new { ks, a.Active })
.Where(ksa => ksa.Active == "J")
.Select(ksa => ksa.ks)
.ToList()
results in:
SELECT
[Extent1].[F_SERVICESID] AS [F_SERVICESID],
[Extent1].[F_KERLCOD] AS [F_KERLCOD],
[Extent2].[F_SAPARTNUM] AS [F_SAPARTNUM],
[Extent2].[F_TYPE] AS [F_TYPE]
FROM [dbo].[T_SERVICESKERL] AS [Extent1]
INNER JOIN [dbo].[T_SERVICES] AS [Extent2] ON [Extent1].[F_SERVICESID] = [Extent2].[F_ID]
INNER JOIN [dbo].[T_ART] AS [Extent3] ON [Extent2].[F_SAPARTNUM] = [Extent3].[F_SAPARTNUM]
WHERE N'J' = [Extent3].[F_ACTIND]
This 'improvement' does not answer my own question, but the result surely looks prettier to me.
UPDATE 1:
The query in Ivan Stoev's answer produces the following SQL:
SELECT
[Extent1].[F_SERVICESID] AS [F_SERVICESID],
[Extent1].[F_KERLCOD] AS [F_KERLCOD],
[Extent2].[F_SAPARTNUM] AS [F_SAPARTNUM],
[Extent2].[F_TYPE] AS [F_TYPE]
FROM [dbo].[T_SERVICESKERL] AS [Extent1]
INNER JOIN [dbo].[T_SERVICES] AS [Extent2] ON [Extent1].[F_SERVICESID] = [Extent2].[F_ID]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[T_ART] AS [Extent3]
WHERE (N'J' = [Extent3].[F_ACTIND]) AND ([Extent3].[F_SAPARTNUM] = [Extent2].[F_SAPARTNUM])
)
Why does EF generate a query that selects [Join1].[F_SERVICESID]? I don't need this field.
That's weird if true, I have no explanation for that.
Can anyone help me optimize the following EF/Linq query
It's worth trying the following, which for me represents the most logical way to retrieve the data in question:
KerlServices
.AsNoTracking()
.Select(ks => new {
ks.KerlCode,
ks.Service.SAPProductNumber,
ks.Service.Type })
.Where(ks => Articles.Any(a => a.Active == "J" && a.SAPProductNumber == ks.SAPProductNumber))
.ToList()
UPDATE: I've recently noticed that EF includes some additional fields in the generated SQL query when dealing with foreign key relations. These fields are not included in the projected result, so I don't think you need to worry about them. Take any of the queries above, execute it in the real code environment (VS Debug), and check the projected list; I'm pretty sure the field in question will not be there.

Returning (blanks) in many to many Linq query

A follow up to this question: Changing a linq query to filter on many-many
I have the following Linq query
public static List<string> selectedLocations = new List<string>();
// I then populate selectedLocations with a number of different strings, each
// corresponding to a valid Location
viewModel.people = (from c in db.People
select c)
.OrderBy(x => x.Name)
.ToList();
// Here I'm basically filtering my dataset to include Locations from
// my array of selectedLocations
viewModel.people = from c in viewModel.people
where (
from a in selectedLocations
where c.Locations.Any(o => o.Name == a)
select a
).Any()
select c;
How can I modify the query so that it also returns people that have NO location set at all?
You can do the filtering on the database side:
viewModel.people =
(from p in db.People
where !p.Locations.Any() ||
p.Locations.Any(l => selectedLocations.Contains(l.Name))
orderby p.Name
select p).ToList();
Or lambda syntax:
viewModel.people =
db.People.Where(p => !p.Locations.Any() ||
p.Locations.Any(l => selectedLocations.Contains(l.Name)))
.OrderBy(p => p.Name)
.ToList();
EF will generate two EXISTS subqueries in this case. Something like:
SELECT [Extent1].[Name],
[Extent1].[Id]
-- other fields from People table
FROM [dbo].[People] AS [Extent1]
WHERE (NOT EXISTS (SELECT 1 AS [C1]
FROM [dbo].[PeopleLocations] AS [Extent2]
WHERE [Extent2].[PersonId] = [Extent1].[Id])
OR EXISTS (SELECT 1 AS [C1]
FROM [dbo].[PeopleLocations] AS [Extent3]
WHERE [Extent3].[PersonId] = [Extent1].[Id]
AND [Extent3].[Name] IN ('location1', 'location2')))
ORDER BY [Extent1].[Name] ASC
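The same predicate can be tried out in memory. This is only a sketch with made-up people and location names, and Locations is simplified to a list of strings rather than entities with a Name property:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var selectedLocations = new List<string> { "London", "Paris" };

// Hypothetical people (not from the question); Locations holds location names.
var people = new[]
{
    (Name: "Ann", Locations: new List<string> { "London" }),
    (Name: "Bob", Locations: new List<string> { "Berlin" }),
    (Name: "Cid", Locations: new List<string>()),
};

// Keep people with no location at all, or with at least one selected location.
var filtered = people
    .Where(p => !p.Locations.Any()
             || p.Locations.Any(l => selectedLocations.Contains(l)))
    .OrderBy(p => p.Name)
    .Select(p => p.Name)
    .ToList();

Console.WriteLine(string.Join(", ", filtered)); // prints "Ann, Cid"
```

Bob is dropped because he has a location that is not selected, while Cid is kept because he has no location at all.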
