Optimising LINQ-to-SQL queries - c#

I have a very heavy LINQ-to-SQL query, which does a number of joins onto different tables to return an anonymous type. The problem is, if the number of rows returned is fairly large (> 200), then the query becomes awfully slow and ends up timing out. I know I can increase the data context timeout setting, but that's a last resort.
I'm just wondering if my query would work better if I split it up and did my comparisons as LINQ-to-Objects queries, so I could possibly even use PLINQ to maximise the processing power. But that's a foreign concept to me, and I can't get my head around how I would split it up. Can anyone offer any advice? I'm not asking for the code to be written for me; some general guidance on how I could improve this would be great.
Note I've ensured the database has all the correct keys that I'm joining on, and I've ensured these keys are up to date.
The query is below:
var cons = (from c in dc.Consignments
join p in dc.PODs on c.IntConNo equals p.Consignment into pg
join d in dc.Depots on c.DeliveryDepot equals d.Letter
join sl in dc.Accounts on c.Customer equals sl.LegacyID
join ss in dc.Accounts on sl.InvoiceAccount equals ss.LegacyID
join su in dc.Accounts on c.Subcontractor equals su.Name into sug
join sub in dc.Accountsubbies on ss.ID equals sub.AccountID into subg
where (sug.FirstOrDefault() == null
|| sug.FirstOrDefault().Customer == false)
select new
{
ID = c.ID,
IntConNo = c.IntConNo,
LegacyID = c.LegacyID,
PODs = pg.DefaultIfEmpty(),
TripNumber = c.TripNumber,
DropSequence = c.DropSequence,
TripDate = c.TripDate,
Depot = d.Name,
CustomerName = c.Customer,
CustomerReference = c.CustomerReference,
DeliveryName = c.DeliveryName,
DeliveryTown = c.DeliveryTown,
DeliveryPostcode = c.DeliveryPostcode,
VehicleText = c.VehicleReg + c.Subcontractor,
SubbieID = sug.DefaultIfEmpty().FirstOrDefault().ID.ToString(),
SubbieList = subg.DefaultIfEmpty(),
ScanType = ss.PODScanning == null ? 0 : ss.PODScanning
});
Here's the generated SQL as requested:
{SELECT [t0].[ID], [t0].[IntConNo], [t0].[LegacyID], [t6].[test], [t6].[ID] AS [ID2], [t6].[Consignment], [t6].[Status], [t6].[NTConsignment], [t6].[CustomerRef], [t6].[Timestamp], [t6].[SignedBy], [t6].[Clause], [t6].[BarcodeNumber], [t6].[MainRef], [t6].[Notes], [t6].[ConsignmentRef], [t6].[PODedBy], (
SELECT COUNT(*)
FROM (
SELECT NULL AS [EMPTY]
) AS [t10]
LEFT OUTER JOIN (
SELECT NULL AS [EMPTY]
FROM [dbo].[PODs] AS [t11]
WHERE [t0].[IntConNo] = [t11].[Consignment]
) AS [t12] ON 1=1
) AS [value], [t0].[TripNumber], [t0].[DropSequence], [t0].[TripDate], [t1].[Name] AS [Depot], [t0].[Customer] AS [CustomerName], [t0].[CustomerReference], [t0].[DeliveryName], [t0].[DeliveryTown], [t0].[DeliveryPostcode], [t0].[VehicleReg] + [t0].[Subcontractor] AS [VehicleText], CONVERT(NVarChar,(
SELECT [t16].[ID]
FROM (
SELECT TOP (1) [t15].[ID]
FROM (
SELECT NULL AS [EMPTY]
) AS [t13]
LEFT OUTER JOIN (
SELECT [t14].[ID]
FROM [dbo].[Account] AS [t14]
WHERE [t0].[Subcontractor] = [t14].[Name]
) AS [t15] ON 1=1
ORDER BY [t15].[ID]
) AS [t16]
)) AS [SubbieID],
(CASE
WHEN [t3].[PODScanning] IS NULL THEN @p0
ELSE [t3].[PODScanning]
END) AS [ScanType], [t3].[ID] AS [ID3]
FROM [dbo].[Consignments] AS [t0]
INNER JOIN [dbo].[Depots] AS [t1] ON [t0].[DeliveryDepot] = [t1].[Letter]
INNER JOIN [dbo].[Account] AS [t2] ON [t0].[Customer] = [t2].[LegacyID]
INNER JOIN [dbo].[Account] AS [t3] ON [t2].[InvoiceAccount] = [t3].[LegacyID]
LEFT OUTER JOIN ((
SELECT NULL AS [EMPTY]
) AS [t4]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t5].[ID], [t5].[Consignment], [t5].[Status], [t5].[NTConsignment], [t5].[CustomerRef], [t5].[Timestamp], [t5].[SignedBy], [t5].[Clause], [t5].[BarcodeNumber], [t5].[MainRef], [t5].[Notes], [t5].[ConsignmentRef], [t5].[PODedBy]
FROM [dbo].[PODs] AS [t5]
) AS [t6] ON 1=1 ) ON [t0].[IntConNo] = [t6].[Consignment]
WHERE ((NOT (EXISTS(
SELECT TOP (1) NULL AS [EMPTY]
FROM [dbo].[Account] AS [t7]
WHERE [t0].[Subcontractor] = [t7].[Name]
ORDER BY [t7].[ID]
))) OR (NOT (((
SELECT [t9].[Customer]
FROM (
SELECT TOP (1) [t8].[Customer]
FROM [dbo].[Account] AS [t8]
WHERE [t0].[Subcontractor] = [t8].[Name]
ORDER BY [t8].[ID]
) AS [t9]
)) = 1))) AND ([t2].[Customer] = 1) AND ([t3].[Customer] = 1)
ORDER BY [t0].[ID], [t1].[ID], [t2].[ID], [t3].[ID], [t6].[ID]
}

Try moving the subcontractor join up higher and push the where clause along with it. That way you're not unnecessarily making joins which would fail at the end.
I would also modify the select for the subcontractor id, so you don't get the Id of a potentially null value.
var cons = (from c in dc.Consignments
join su in dc.Accounts on c.Subcontractor equals su.Name into sug
where (sug.FirstOrDefault() == null || sug.FirstOrDefault().Customer == false)
join p in dc.PODs on c.IntConNo equals p.Consignment into pg
join d in dc.Depots on c.DeliveryDepot equals d.Letter
join sl in dc.Accounts on c.Customer equals sl.LegacyID
join ss in dc.Accounts on sl.InvoiceAccount equals ss.LegacyID
join sub in dc.Accountsubbies on ss.ID equals sub.AccountID into subg
let firstSubContractor = sug.DefaultIfEmpty().FirstOrDefault()
select new
{
ID = c.ID,
IntConNo = c.IntConNo,
LegacyID = c.LegacyID,
PODs = pg.DefaultIfEmpty(),
TripNumber = c.TripNumber,
DropSequence = c.DropSequence,
TripDate = c.TripDate,
Depot = d.Name,
CustomerName = c.Customer,
CustomerReference = c.CustomerReference,
DeliveryName = c.DeliveryName,
DeliveryTown = c.DeliveryTown,
DeliveryPostcode = c.DeliveryPostcode,
VehicleText = c.VehicleReg + c.Subcontractor,
SubbieID = firstSubContractor == null ? "" : firstSubContractor.ID.ToString(),
SubbieList = subg.DefaultIfEmpty(),
ScanType = ss.PODScanning == null ? 0 : ss.PODScanning
});
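The two suggestions above (filter on the subcontractor as early as possible, and guard against a missing subcontractor before reading its ID) can be tried out in memory. This is only a LINQ-to-Objects sketch with made-up rows, not the original LINQ-to-SQL query; the arrays `consignments` and `accounts` and their values are assumptions for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for dc.Consignments and dc.Accounts.
var consignments = new[]
{
    new { ID = 1, Subcontractor = "Acme" },
    new { ID = 2, Subcontractor = "Nobody" },   // no matching account
    new { ID = 3, Subcontractor = "BigCo" },    // matches an account flagged as a customer
};
var accounts = new[]
{
    new { ID = 10, Name = "Acme", Customer = false },
    new { ID = 11, Name = "BigCo", Customer = true },
};

var cons = (from c in consignments
            join su in accounts on c.Subcontractor equals su.Name into sug
            let firstSub = sug.FirstOrDefault()                     // null when there is no match
            where firstSub == null || firstSub.Customer == false   // filter before any further joins
            select new
            {
                c.ID,
                // Guard against the missing subcontractor instead of dereferencing null.
                SubbieID = firstSub == null ? "" : firstSub.ID.ToString()
            }).ToList();

foreach (var row in cons)
    Console.WriteLine($"{row.ID}: '{row.SubbieID}'");
```

Consignment 3 is filtered out because its subcontractor is flagged as a customer, and consignment 2 survives with an empty SubbieID rather than throwing.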

Related

Define nested derived table in Entity Framework

I am converting a SQL query to LINQ for EF 6.0 and have a problem converting a nested derived table. Below are the SQL query and the LINQ expression where I am stuck.
Select * FROM orders ords WITH ( NOLOCK )
INNER JOIN orders_list ol WITH ( NOLOCK ) ON ords.order_id = ol.order_id
--AND ol.order_list_id = @order_list_id
INNER JOIN product WITH ( NOLOCK ) ON product.id = ol.product_id
INNER JOIN OrderContact ca WITH ( NOLOCK ) ON ca.id= ords.OrdercontactRecipientId --and ca.is_primary_address = 1
INNER JOIN OrderContact ca1 WITH ( NOLOCK ) ON ca1.id=ords.OrderContactCustomerID--and ca.is_primary_address = 1
LEFT JOIN ( SELECT oslTemp.order_list_id ,
status_id
FROM Ticket_status_log
INNER JOIN ( SELECT osl.order_list_id ,
MAX(log_id) AS log_id
FROM Ticket_status_log osl
INNER JOIN orders_list
WITH ( NOLOCK ) ON orders_list.order_list_id = osl.order_list_id
WHERE osl.is_deleted = 0
AND osl.status_id > 0
AND osl.status_id < 3
GROUP BY osl.order_list_id
) AS oslTemp ON oslTemp.log_id = Ticket_status_log.log_id
) status_log ON ol.order_list_id = status_log.order_list_id
WHERE ords.is_deleted = 0
The following is the LINQ expression so far:
var query = (from ords in con.ordersDB
join ol in con.Orders_ListDB on ords.order_id equals ol.order_id
join product in con.ProductDB on ol.product_id equals product.ID
join ca in con.OrderContactDB on ords.OrderContactRecipientID equals ca.ID
join ca1 in con.OrderContactDB on ords.OrderContactCustomerID equals ca1.ID
join tktlog in con.ticket_status_logDB
)
I am stuck at the join to the nested derived table. How should I handle this?
You can use a let clause to solve this problem (see Microsoft Docs).
Write your subquery in the let clause, then join the let variable with the other table variables.
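As a rough illustration of the let suggestion, the sketch below picks the latest qualifying status row per order line with a subquery inside let. It runs against in-memory arrays (LINQ-to-Objects), so `orderLines`, `statusLog`, and their property names are hypothetical stand-ins for the EF sets in the question, and the SQL that EF generates for the equivalent query may look quite different.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for orders_list and Ticket_status_log.
var orderLines = new[]
{
    new { OrderListId = 1 },
    new { OrderListId = 2 },   // has no status rows at all
};
var statusLog = new[]
{
    new { LogId = 1, OrderListId = 1, StatusId = 1 },
    new { LogId = 2, OrderListId = 1, StatusId = 2 },
};

var result = (from ol in orderLines
              // The subquery lives in let: latest log row (highest LogId) for this order line.
              let latest = (from l in statusLog
                            where l.OrderListId == ol.OrderListId
                                  && l.StatusId > 0 && l.StatusId < 3
                            orderby l.LogId descending
                            select l).FirstOrDefault()
              select new
              {
                  ol.OrderListId,
                  // Behaves like the LEFT JOIN: null when no status row exists.
                  StatusId = latest == null ? (int?)null : latest.StatusId
              }).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.OrderListId}: {r.StatusId?.ToString() ?? "none"}");
```

Order line 1 gets status 2 (its latest log row), and order line 2 keeps a null status instead of being dropped.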
I figured it out myself. I am posting the answer so that anyone else with the same problem can find the solution.
var innerQuery2 = (from tktlog in con.ticket_status_logDB
join ordlist in con.Orders_ListDB on tktlog.order_list_id equals ordlist.order_list_id
where tktlog.is_deleted == false && tktlog.status_id > 0 && tktlog.status_id < 3
group tktlog by tktlog.order_list_id into grouped
select new
{
order_list_id = grouped.Key,
log_id = grouped.Select(x => x.log_id).Max()
});
var innerQuery1 = (from tktlog in con.ticket_status_logDB
join osltemp in innerQuery2 on tktlog.log_id equals osltemp.log_id
select new
{
osltemp.order_list_id,
tktlog.status_id
});
var query = (from ords in con.ordersDB
join ol in con.Orders_ListDB on ords.order_id equals ol.order_id
join product in con.ProductDB on ol.product_id equals product.ID
join ca in con.OrderContactDB on ords.OrderContactRecipientID equals ca.ID
join ca1 in con.OrderContactDB on ords.OrderContactCustomerID equals ca1.ID
join cityml in con.CityMLDB on ca.CityMLID equals cityml.ID into leftcityml
from citymlresult in leftcityml.DefaultIfEmpty()
join state in con.StateDB on ca.StateID equals state.ID into leftstate
from stateresult in leftstate.DefaultIfEmpty()
join area in con.AreaMLDB on ca.AreaMLID equals area.ID into leftarea
from arearesult in leftarea.DefaultIfEmpty()
join statuslog in innerQuery1 on ol.order_list_id equals statuslog.order_list_id into leftquery1
from status_log in leftquery1.DefaultIfEmpty()
where ords.is_deleted == false && ords.is_voided == false && ords.type_id == 2
&& (ol.item_number ?? 0) > 0 &&
(
(
(ol.parent_id == parintid || ol.parent_id == orderListId || ol.order_list_id == orderListId)
&& (ol.SubType ?? 0) == 4
)
|| (ol.order_list_id == orderListId)
)
orderby arearesult.Name ?? ca.AreaOther
select new {});

Linq Left Outer Join with Count

I want to create this SQL query:
SELECT
a.[Seat],
b.[PlayerId],
b.[UserName],
b.[NickName],
COUNT(c.PlayerId) AS Trophy
FROM [dbo].[tbl_PlayerTableSeat] AS a
INNER JOIN [dbo].[tbl_Player] AS b ON a.[PlayerId] = b.[PlayerId]
INNER JOIN [dbo].[tbl_GameVirtualTable] AS d ON d.GameVirtualTableId = a.GameVirtualTableId
LEFT OUTER JOIN [dbo].[tbl_PlayerTableWinning] AS c ON a.[PlayerId] = c.[PlayerId] AND c.GameTableId = d.GameTableId
WHERE a.GameVirtualTableId = 36
GROUP BY a.[Seat], b.[PlayerId], b.[UserName], b.[NickName]
I have this Linq
var virtualTableSeatList = (from s in db.PlayerTableSeat
join p in db.Player on s.PlayerId equals p.PlayerId
join v in db.GameVirtualTable on s.GameVirtualTableId equals v.GameVirtualTableId
join w in db.PlayerTableWinning on new { X1 = s.PlayerId, X2 = v.GameTableId } equals new { X1 = w.PlayerId, X2 = w.GameTableId } into gj
from g in gj.DefaultIfEmpty()
where s.GameVirtualTableId == virtualGameTableId
group new { p, s } by new { p.PlayerId, s.Seat, p.NickName, p.UserName } into grp
select new VirtualTableSeatDto
{
PlayerId = grp.Key.PlayerId,
Seat = grp.Key.Seat,
NickName = grp.Key.NickName,
UserName = grp.Key.UserName,
Trophy = grp.Count()
}
).ToList();
From SQL Profiler, the Linq generates this SQL query:
exec sp_executesql N'SELECT
[GroupBy1].[K2] AS [PlayerId],
CAST( [GroupBy1].[K1] AS int) AS [C1],
[GroupBy1].[K4] AS [NickName],
[GroupBy1].[K3] AS [UserName],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[Seat] AS [K1],
[Extent2].[PlayerId] AS [K2],
[Extent2].[UserName] AS [K3],
[Extent2].[NickName] AS [K4],
COUNT(1) AS [A1]
FROM [dbo].[tbl_PlayerTableSeat] AS [Extent1]
INNER JOIN [dbo].[tbl_Player] AS [Extent2] ON [Extent1].[PlayerId] = [Extent2].[PlayerId]
INNER JOIN [dbo].[tbl_GameVirtualTable] AS [Extent3] ON [Extent1].[GameVirtualTableId] = [Extent3].[GameVirtualTableId]
LEFT OUTER JOIN [dbo].[tbl_PlayerTableWinning] AS [Extent4] ON ([Extent1].[PlayerId] = [Extent4].[PlayerId]) AND ([Extent3].[GameTableId] = [Extent4].[GameTableId])
WHERE [Extent1].[GameVirtualTableId] = @p__linq__0
GROUP BY [Extent1].[Seat], [Extent2].[PlayerId], [Extent2].[UserName], [Extent2].[NickName]
) AS [GroupBy1]',N'@p__linq__0 int',@p__linq__0=36
I want to change COUNT(1) AS [A1] to COUNT([Extent4].[PlayerId]) AS [A1]
so it can return correct data.
I have no idea how to change the LINQ
Trophy = grp.Count()
so that it can count PlayerId of PlayerTableWinning instead of COUNT(1)
Updated: @Ivan Stoev
By adding g to the group:
group new { p, s, g }
and summing over the group:
Trophy = grp.Sum(item => item.g != null ? 1 : 0)
it returns the correct answer. However, it uses SUM instead of COUNT. The generated SQL query is as below:
exec sp_executesql N'SELECT
[GroupBy1].[K2] AS [PlayerId],
CAST( [GroupBy1].[K1] AS int) AS [C1],
[GroupBy1].[K4] AS [NickName],
[GroupBy1].[K3] AS [UserName],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Filter1].[K1] AS [K1],
[Filter1].[K2] AS [K2],
[Filter1].[K3] AS [K3],
[Filter1].[K4] AS [K4],
SUM([Filter1].[A1]) AS [A1]
FROM ( SELECT
[Extent1].[Seat] AS [K1],
[Extent2].[PlayerId] AS [K2],
[Extent2].[UserName] AS [K3],
[Extent2].[NickName] AS [K4],
CASE WHEN ( NOT (([Extent4].[GameTableId] IS NULL) AND ([Extent4].[PlayerId] IS NULL) AND ([Extent4].[GameRoundId] IS NULL))) THEN 1 ELSE 0 END AS [A1]
FROM [dbo].[tbl_PlayerTableSeat] AS [Extent1]
INNER JOIN [dbo].[tbl_Player] AS [Extent2] ON [Extent1].[PlayerId] = [Extent2].[PlayerId]
INNER JOIN [dbo].[tbl_GameVirtualTable] AS [Extent3] ON [Extent1].[GameVirtualTableId] = [Extent3].[GameVirtualTableId]
LEFT OUTER JOIN [dbo].[tbl_PlayerTableWinning] AS [Extent4] ON ([Extent1].[PlayerId] = [Extent4].[PlayerId]) AND ([Extent3].[GameTableId] = [Extent4].[GameTableId])
WHERE [Extent1].[GameVirtualTableId] = @p__linq__0
) AS [Filter1]
GROUP BY [K1], [K2], [K3], [K4]
) AS [GroupBy1]',N'@p__linq__0 int',@p__linq__0=36
The only (but significant) difference between SQL COUNT(field) and COUNT(1) is that the former excludes NULL values. When applied to a normally required field from the right side of a left outer join, as in your case, this produces a different result when there are no matching records: the former returns 0 while the latter returns 1.
The "natural" LINQ equivalent would be Count(field != null), but that unfortunately is translated to a quite different SQL by the current EF query provider. So in such cases I personally use the closer equivalent expression Sum(field != null ? 1 : 0) which produces a much better SQL.
To apply the above to your query, you need access to the left-joined range variable inside the grouping (that is g in your query, because w goes out of scope after into gj), so change
group new { p, s }
to
group new { p, s, g }
and then use
Trophy = grp.Sum(item => item.g != null ? 1 : 0)
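The difference between counting rows and counting non-null right-side values can be reproduced in memory. The following is a minimal LINQ-to-Objects sketch, not the EF query itself; the arrays `seats` and `winnings` and their contents are assumptions made up for the example.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for tbl_PlayerTableSeat and tbl_PlayerTableWinning.
var seats = new[]
{
    new { PlayerId = 1, Seat = 1 },
    new { PlayerId = 2, Seat = 2 },   // this player has no winnings
};
var winnings = new[]
{
    new { PlayerId = 1, GameTableId = 10 },
    new { PlayerId = 1, GameTableId = 10 },
};

var trophies = (from s in seats
                join w in winnings on s.PlayerId equals w.PlayerId into gj
                from win in gj.DefaultIfEmpty()   // left outer join: win is null when nothing matches
                group win by new { s.PlayerId, s.Seat } into grp
                select new
                {
                    grp.Key.PlayerId,
                    RowCount = grp.Count(),                    // like COUNT(1): counts the null placeholder row
                    Trophy = grp.Sum(x => x != null ? 1 : 0)   // like COUNT(field): ignores nulls
                }).ToList();

foreach (var t in trophies)
    Console.WriteLine($"Player {t.PlayerId}: rows={t.RowCount}, trophies={t.Trophy}");
```

For player 2, who has no winnings, Count() still reports one row (the placeholder produced by DefaultIfEmpty), while the conditional Sum reports zero trophies.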

Creating Linq from SQL with OrderBy and GroupBy

I have the following table structure.
TableA TableB TableC
- MID - PID - PID
- NAME - INIT_DATE - MID
This is the SQL Query that I need to translate into Linq
SELECT TOP 10 TableA.NAME,
COUNT(TableB.INIT_DATE) AS [TOTALCOUNT]
FROM TableC
INNER JOIN TableA ON TableC.MID = TableA.MID
LEFT OUTER JOIN TableB ON TableC.PID = TableB.PID
GROUP BY TableA.NAME
ORDER BY [TOTALCOUNT] DESC
I tried to reproduce the above query with this Linq query:
iqModel = (from tableC in DB.TableC
join tableA in DB.TableA on tableC.MID equals tableA.MID
select new { tableC, tableA } into TM
join tableB in DB.TableB on TM.tableC.PID equals tableB.PID into TJ
from D in TJ.DefaultIfEmpty()
select new { TM, D } into MD
group MD by MD.TM.tableA.NAME into results
let TOTALCOUNT = results.Select(item=>item.D.INIT_DATE).Count()
orderby TOTALCOUNT descending
select new SelectListItem
{
Text = results.Key.ToString(),
Value = TOTALCOUNT.ToString()
}).Take(10);
But I think I am doing something wrong.
The output of the LINQ and SQL queries is not the same. I think it is correct up to the JOIN or GROUP BY.
EDIT :-
I have also tried the following Linq query but still it's not working correctly.
var iqModel = (from c in DB.TableC
join a in DB.TableA on c.MID equals a.MID
join b in DB.TableB on c.PID equals b.PID into b_join
from b in b_join.DefaultIfEmpty()
select new SelectListItem { Text = a.NAME, Value = b.INIT_DATE != null ? b.INIT_DATE.ToString() : string.Empty });
var igModel = iqModel.GroupBy(item => item.Text);
var result = igModel.OrderByDescending(item => item.Select(r => r.Value).Count());
I want to understand what I am doing wrong and how it can be fixed.
I am a newbie to LINQ to SQL. I think I really overcomplicated the LINQ above by adding more selects.
I think the difference is caused by the fact that the SQL COUNT(field) function does not include NULL values. There is no direct equivalent construct in LINQ, but it could be simulated with Count(e => e.Field != null) or like this (which seems to produce better SQL):
var query =
(from a in db.TableA
join c in db.TableC on a.MID equals c.MID
join b in db.TableB on c.PID equals b.PID into joinB
from b in joinB.DefaultIfEmpty()
group b by a.Name into g
let TOTALCOUNT = g.Sum(e => e.INIT_DATE != null ? 1 : 0)
orderby TOTALCOUNT descending
select new SelectListItem { Text = g.Key, Value = TOTALCOUNT }
).Take(10);
which generates the following SQL
SELECT TOP (10)
[Project1].[C2] AS [C1],
[Project1].[Name] AS [Name],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Name],
1 AS [C2]
FROM ( SELECT
[Join2].[K1] AS [K1],
SUM([Join2].[A1]) AS [A1]
FROM ( SELECT
[Extent1].[Name] AS [K1],
CASE WHEN ([Extent3].[INIT_DATE] IS NOT NULL) THEN 1 ELSE 0 END AS [A1]
FROM [dbo].[TableAs] AS [Extent1]
INNER JOIN [dbo].[TableCs] AS [Extent2] ON [Extent1].[MID] = [Extent2].[MID]
LEFT OUTER JOIN [dbo].[TableBs] AS [Extent3] ON [Extent2].[PID] = [Extent3].[PID]
) AS [Join2]
GROUP BY [K1]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
I assume that you do not see a GROUP BY in the resulting query and that DISTINCT is used instead. Am I right?
The first query takes the distinct values of TableA.NAME and then calculates COUNT(TableB.INIT_DATE) with a subquery, like this:
select distinct1.Name, (select count() from *join query* where Name = distinct1.Name)
from (select distinct Name from *join query*) as distinct1
If so, do not worry about it. The translation from LINQ to the actual T-SQL is sometimes quite unpredictable (you cannot force the two to match unless the query is very simple), but both queries are equivalent and return the same results (compare them to make sure).

LINQ query optimization for slow grouping

I have a LINQ query that gets data via Entity Framework Code First from an SQL database. This works, but it runs very slowly.
This is the original query:
var tmpResult = from mdv in allMetaDataValues
where mdv.Metadata.InputType == MetadataInputType.String && mdv.Metadata.ShowInFilter && !mdv.Metadata.IsHidden && !string.IsNullOrEmpty(mdv.ValueString)
group mdv by new
{
mdv.ValueString,
mdv.Metadata
} into g
let first = g.FirstOrDefault()
select new
{
MetadataTitle = g.Key.Metadata.Title,
MetadataID = g.Key.Metadata.ID,
CollectionColor = g.Key.Metadata.Collection.Color,
CollectionID = g.Key.Metadata.Collection.ID,
MetadataValueCount = 0,
MetadataValueTitle = g.Key.ValueString,
MetadataValueID = first.ID
};
This is the generated SQL from the original query:
{SELECT
0 AS [C1],
[Project4].[Title] AS [Title],
[Project4].[ID] AS [ID],
[Extent9].[Color] AS [Color],
[Project4].[Collection_ID] AS [Collection_ID],
[Project4].[ValueString] AS [ValueString],
[Project4].[C1] AS [C2]
FROM (SELECT
[Project2].[ValueString] AS [ValueString],
[Project2].[ID] AS [ID],
[Project2].[Title] AS [Title],
[Project2].[Collection_ID] AS [Collection_ID],
(SELECT TOP (1)
[Filter4].[ID1] AS [ID]
FROM ( SELECT [Extent6].[ID] AS [ID1], [Extent6].[ValueString] AS [ValueString], [Extent7].[Collection_ID] AS [Collection_ID1], [Extent8].[ID] AS [ID2], [Extent8].[InputType] AS [InputType], [Extent8].[ShowInFilter] AS [ShowInFilter], [Extent8].[IsHidden] AS [IsHidden1]
FROM [dbo].[MetadataValue] AS [Extent6]
LEFT OUTER JOIN [dbo].[Media] AS [Extent7] ON [Extent6].[Media_ID] = [Extent7].[ID]
INNER JOIN [dbo].[Metadata] AS [Extent8] ON [Extent6].[Metadata_ID] = [Extent8].[ID]
WHERE ( NOT (([Extent6].[ValueString] IS NULL) OR (( CAST(LEN([Extent6].[ValueString]) AS int)) = 0))) AND ([Extent7].[IsHidden] <> cast(1 as bit))
) AS [Filter4]
WHERE (2 = CAST( [Filter4].[InputType] AS int)) AND ([Filter4].[ShowInFilter] = 1) AND ([Filter4].[IsHidden1] <> cast(1 as bit)) AND ([Filter4].[Collection_ID1] = @p__linq__0) AND (([Project2].[ValueString] = [Filter4].[ValueString]) OR (([Project2].[ValueString] IS NULL) AND ([Filter4].[ValueString] IS NULL))) AND (([Project2].[ID] = [Filter4].[ID2]) OR (1 = 0))) AS [C1]
FROM ( SELECT
[Distinct1].[ValueString] AS [ValueString],
[Distinct1].[ID] AS [ID],
[Distinct1].[Title] AS [Title],
[Distinct1].[Collection_ID] AS [Collection_ID]
FROM ( SELECT DISTINCT
[Filter2].[ValueString] AS [ValueString],
[Filter2].[ID3] AS [ID],
[Filter2].[InputType1] AS [InputType],
[Filter2].[Title1] AS [Title],
[Filter2].[ShowInFilter1] AS [ShowInFilter],
[Filter2].[IsHidden2] AS [IsHidden],
[Filter2].[Collection_ID2] AS [Collection_ID]
FROM ( SELECT [Filter1].[ValueString], [Filter1].[Collection_ID3], [Filter1].[IsHidden3], [Filter1].[ID3], [Filter1].[InputType1], [Filter1].[Title1], [Filter1].[ShowInFilter1], [Filter1].[IsHidden2], [Filter1].[Collection_ID2]
FROM ( SELECT [Extent1].[ValueString] AS [ValueString], [Extent2].[Collection_ID] AS [Collection_ID3], [Extent4].[IsHidden] AS [IsHidden3], [Extent5].[ID] AS [ID3], [Extent5].[InputType] AS [InputType1], [Extent5].[Title] AS [Title1], [Extent5].[ShowInFilter] AS [ShowInFilter1], [Extent5].[IsHidden] AS [IsHidden2], [Extent5].[Collection_ID] AS [Collection_ID2]
FROM [dbo].[MetadataValue] AS [Extent1]
LEFT OUTER JOIN [dbo].[Media] AS [Extent2] ON [Extent1].[Media_ID] = [Extent2].[ID]
INNER JOIN [dbo].[Metadata] AS [Extent3] ON [Extent1].[Metadata_ID] = [Extent3].[ID]
LEFT OUTER JOIN [dbo].[Metadata] AS [Extent4] ON [Extent1].[Metadata_ID] = [Extent4].[ID]
LEFT OUTER JOIN [dbo].[Metadata] AS [Extent5] ON [Extent1].[Metadata_ID] = [Extent5].[ID]
WHERE ( NOT (([Extent1].[ValueString] IS NULL) OR (( CAST(LEN([Extent1].[ValueString]) AS int)) = 0))) AND ([Extent2].[IsHidden] <> cast(1 as bit)) AND (2 = CAST( [Extent3].[InputType] AS int)) AND ([Extent3].[ShowInFilter] = 1)
) AS [Filter1]
WHERE [Filter1].[IsHidden3] <> cast(1 as bit)
) AS [Filter2]
WHERE [Filter2].[Collection_ID3] = @p__linq__0
) AS [Distinct1]
) AS [Project2] ) AS [Project4]
LEFT OUTER JOIN [dbo].[Collection] AS [Extent9] ON [Project4].[Collection_ID] = [Extent9].[ID]}
If we remove the "let first = g.FirstOrDefault()" and change "MetadataValueID = first.ID" to "MetadataValueID = 0" so that we just have a fixed ID = 0 for testing purposes, then the data loads very fast and the generated query itself is half the size compared to the original
So it seems that this part is making the query very slow:
let first = g.FirstOrDefault()
...
MetadataValueID = first.ID
};
How can this be rewritten?
If I try to rewrite the code, it is still slow:
MetadataValueID = g.Select(x => x.ID).FirstOrDefault()
or
let first = g.Select(x => x.ID).FirstOrDefault()
...
MetadataValueID = first
};
Any suggestions?
With EF I have always felt that it has trouble efficiently translating expressions like g.Key.Metadata.Collection, so I try to join more explicitly and to include only the fields that are necessary for the result. You can use Include instead of join when using the repository pattern.
Then your query would look like this:
from mdv in allMetaDataValues.Include("Metadata").Include("Metadata.Collection")
where mdv.Metadata.InputType == MetadataInputType.String &&
mdv.Metadata.ShowInFilter &&
!mdv.Metadata.IsHidden &&
!string.IsNullOrEmpty(mdv.ValueString)
group mdv by new
{
MetadataID = mdv.Metadata.ID,
CollectionID = mdv.Metadata.Collection.ID,
mdv.Metadata.Title,
mdv.Metadata.Collection.Color,
mdv.ValueString
} into g
let first = g.FirstOrDefault().ID
select new
{
MetadataTitle = g.Key.Title,
MetadataID = g.Key.MetadataID,
CollectionColor = g.Key.Color,
CollectionID = g.Key.CollectionID,
MetadataValueCount = 0,
MetadataValueTitle = g.Key.ValueString,
MetadataValueID = first
}
A good tool for experimenting with LINQ is LINQPad.
The problem is also that:
let first = g.FirstOrDefault().ID
cannot be easily translated to SQL (see this answer). But this rewrite at least simplifies the underlying query for it. It remains unclear to me why you need the first ID from a set without using orderby.
It could be rewritten like this:
let first = (from f in allMetaDataValues
where f.Metadata.ID == g.Key.MetadataID &&
f.ValueString == g.Key.ValueString select f.ID)
.FirstOrDefault()
This way you do not leave it to EF to write the query for you, and you can specify exactly how to do the select.
To speed up the query you can also consider adding indexes to the database based on the generated query, namely an index covering both columns used in the where clause of this let first query.
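On the earlier point about taking a first ID without an orderby: if any representative ID per group is acceptable, one option (a suggestion, not something tested against the original model) is to use an aggregate such as Min, which gives a deterministic result. The sketch below uses LINQ-to-Objects with a hypothetical `values` array; whether EF translates the equivalent query into a plain MIN inside the GROUP BY depends on the provider version, so the generated SQL should be checked.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the MetadataValue rows in the question.
var values = new[]
{
    new { ID = 7, MetadataID = 1, ValueString = "red" },
    new { ID = 3, MetadataID = 1, ValueString = "red" },
    new { ID = 5, MetadataID = 2, ValueString = "blue" },
};

var grouped = (from v in values
               group v by new { v.MetadataID, v.ValueString } into g
               select new
               {
                   g.Key.MetadataID,
                   MetadataValueTitle = g.Key.ValueString,
                   // Min is an aggregate, so the chosen ID does not depend on row order.
                   MetadataValueID = g.Min(x => x.ID)
               }).ToList();

foreach (var row in grouped)
    Console.WriteLine($"{row.MetadataID} / {row.MetadataValueTitle}: {row.MetadataValueID}");
```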
Try the following solution.
Replace FirstOrDefault() with Take(1). FirstOrDefault() is not lazily evaluated.
var tmpResult = from mdv in allMetaDataValues
where mdv.Metadata.InputType == MetadataInputType.String && mdv.Metadata.ShowInFilter && !mdv.Metadata.IsHidden && !string.IsNullOrEmpty(mdv.ValueString)
group mdv by new
{
mdv.ValueString,
mdv.Metadata
} into g
let first = g.Take(1)
select new
{
MetadataTitle = g.Key.Metadata.Title,
MetadataID = g.Key.Metadata.ID,
CollectionColor = g.Key.Metadata.Collection.Color,
CollectionID = g.Key.Metadata.Collection.ID,
MetadataValueCount = 0,
MetadataValueTitle = g.Key.ValueString,
MetadataValueID = first.Select(x => x.ID).FirstOrDefault() // first is a sequence here, so project the ID before taking it
};

How to limit a LINQ left outer join to one row

I have a left outer join (below) returning results as expected. I need to limit the results from the 'right' table to the 'first' hit. Can I do that somehow? Currently, I get a result for every record in both tables, I only want to see one result from the table on the left (items) no matter how many results I have in the right table (photos).
var query = from i in db.items
join p in db.photos
on i.id equals p.item_id into tempPhoto
from tp in tempPhoto.DefaultIfEmpty()
orderby i.date descending
select new
{
itemName = i.name,
itemID = i.id,
id = i.id,
photoID = tp.PhotoID.ToString()
};
GridView1.DataSource = query;
GridView1.DataBind();
This will do the job for you.
from i in db.items
let p = db.photos.Where(p2 => i.id == p2.item_id).FirstOrDefault()
orderby i.date descending
select new
{
itemName = i.name,
itemID = i.id,
id = i.id,
photoID = p == null ? null : p.PhotoID.ToString()
}
I got this sql when I generated it against my own model (and without the name and second id columns in the projection).
SELECT [t0].[Id] AS [Id], CONVERT(NVarChar,(
SELECT [t2].[PhotoId]
FROM (
SELECT TOP (1) [t1].[PhotoId]
FROM [dbo].[Photos] AS [t1]
WHERE [t1].[Item_Id] = ([t0].[Id])
) AS [t2]
)) AS [PhotoId]
FROM [dbo].[Items] AS [t0]
ORDER BY [t0].[Id] DESC
When I asked for the plan, it showed that the subquery is implemented by this join:
<RelOp LogicalOp="Left Outer Join" PhysicalOp="Nested Loops">
What you want to do is group the table. The best way to do this is:
var query = from i in db.items
join p in (from p in db.photos
group p by p.item_id into gp
where gp.Count() > 0
select new { item_id = gp.Key, Photo = gp.First() })
on i.id equals p.item_id into tempPhoto
from tp in tempPhoto.DefaultIfEmpty()
orderby i.date descending
select new
{
itemName = i.name,
itemID = i.id,
id = i.id,
photoID = tp.Photo.PhotoID.ToString()
};
Edit: This is Amy B speaking. I'm only doing this because Nick asked me to. Nick, please modify or remove this section as you feel is appropriate.
The SQL generated is quite large. The int 0 (to be compared with the count) is passed in via parameter.
SELECT [t0].X AS [id], CONVERT(NVarChar(MAX),(
SELECT [t6].Y
FROM (
SELECT TOP (1) [t5].Y
FROM [dbo].[Photos] AS [t5]
WHERE (([t4].Y IS NULL) AND ([t5].Y IS NULL)) OR (([t4].Y IS NOT NULL) AND ([t5].Y IS NOT NULL) AND ([t4].Y = [t5].Y))
) AS [t6]
)) AS [PhotoId]
FROM [dbo].[Items] AS [t0]
CROSS APPLY ((
SELECT NULL AS [EMPTY]
) AS [t1]
OUTER APPLY (
SELECT [t3].Y
FROM (
SELECT COUNT(*) AS [value], [t2].Y
FROM [dbo].[Photos] AS [t2]
GROUP BY [t2].Y
) AS [t3]
WHERE (([t0].X) = [t3].Y) AND ([t3].[value] > @p0)
) AS [t4])
ORDER BY [t0].Z DESC
The execution plan reveals three left joins. At least one is trivial and should not be counted (it brings in the zero). There is enough complexity here that I cannot clearly point to any problem for efficiency. It might run great.
You could do something like:
var q = from c in
(from s in args
select s).First()
select c;
Around the last part of the query. Not sure if it will work or what kind of wack SQL it will produce :)
Use an inner query. Include DefaultIfEmpty for the case of no photo and orderby for the case of more than one. The following example takes the photo with the greatest id.
var query =
from i in db.items
let p = (from p2 in db.photos where i.id == p2.item_id orderby p2.id select p2).DefaultIfEmpty().Last()
orderby i.date descending
select new {
itemName = i.name,
itemID = i.id,
id = i.id,
photoID = p.PhotoID
};
If you need to handle the case of no photo specially, you can omit DefaultIfEmpty and use FirstOrDefault/LastOrDefault instead.
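The "one row from the right table" idea shared by these answers can be sketched in memory as follows. This is a LINQ-to-Objects illustration only; the `items` and `photos` arrays are invented for the example, and the SQL produced by LINQ to SQL for the corresponding database query will depend on the provider.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for db.items and db.photos.
var items = new[]
{
    new { Id = 1, Name = "Chair" },
    new { Id = 2, Name = "Table" },   // has no photos
};
var photos = new[]
{
    new { PhotoId = 101, ItemId = 1 },
    new { PhotoId = 100, ItemId = 1 },
};

var rows = (from i in items
            // Correlated subquery: at most one photo per item, chosen by an explicit ordering.
            let p = photos.Where(x => x.ItemId == i.Id)
                          .OrderBy(x => x.PhotoId)
                          .FirstOrDefault()
            orderby i.Id
            select new
            {
                ItemName = i.Name,
                PhotoId = p == null ? null : p.PhotoId.ToString()   // null when the item has no photo
            }).ToList();

foreach (var r in rows)
    Console.WriteLine($"{r.ItemName}: {r.PhotoId ?? "(no photo)"}");
```

Each item appears exactly once: the chair with its lowest photo id, and the table with no photo.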
