I have a SQL statement which afaik is correct but the response from the SQL server is incorrect. I've debugged this issue and found that if I execute the SQL statement without the wrapping store procedure I get different results. All I have done is replaced the variable with the actual values
Linq generated code:
exec sp_executesql N'SELECT [t0].[RoomId], [t0].[Title], [t0].[Detail], [t0].[ThumbnailPath], [t0].[PageId], [t0].[TypeId], [t0].[LocationId], [t0].[TimeStamp], [t0].[DeleteStamp]
FROM [dbo].[Room] AS [t0]
INNER JOIN [dbo].[RoomType] AS [t1] ON [t1].[RoomTypeId] = [t0].[TypeId]
WHERE ([t1].[Sleeps] >= #p0) AND ([t0].[DeleteStamp] IS NULL) AND (((
SELECT COUNT(*)
FROM [dbo].[Booking] AS [t2]
INNER JOIN [dbo].[Order] AS [t3] ON [t3].[OrderId] = [t2].[OrderId]
WHERE ([t2].[StartStamp] <= #p1)
AND ([t2].[EndStamp] >= #p2)
AND (([t3].[Status] = #p3)
OR ([t3].[Status] = #p4)
OR (([t3].[Status] = #p5) AND ([t3].[CreatedStamp] > #p6)))
AND ([t2].[RoomId] = [t0].[RoomId])
)) = #p7)
',N'#p0 int,#p1 datetime,#p2 datetime,#p3 int,#p4 int,#p5 int,#p6 datetime,#p7 int',
#p0=1,#p1='2011-04-05 00:00:00',#p2='2011-04-04 00:00:00',#p3=3,#p4=5,#p5=0,#p6='2011-04-04 12:36:09.490',#p7=0
Without the SP
SELECT [t0].[RoomId], [t0].[Title], [t0].[Detail], [t0].[ThumbnailPath], [t0].[PageId], [t0].[TypeId], [t0].[LocationId], [t0].[TimeStamp], [t0].[DeleteStamp]
FROM [dbo].[Room] AS [t0]
INNER JOIN [dbo].[RoomType] AS [t1] ON [t1].[RoomTypeId] = [t0].[TypeId]
WHERE ([t1].[Sleeps] >= 1) AND ([t0].[DeleteStamp] IS NULL) AND (((
SELECT COUNT(*)
FROM [dbo].[Booking] AS [t2]
INNER JOIN [dbo].[Order] AS [t3] ON [t3].[OrderId] = [t2].[OrderId]
WHERE ([t2].[StartStamp] <= '2011-04-05 00:00:00')
AND ([t2].[EndStamp] >= '2011-04-04 00:00:00')
AND (([t3].[Status] = 3)
OR ([t3].[Status] = 4)
OR (([t3].[Status] = 5) AND ([t3].[CreatedStamp] > '2011-04-04 12:36:09.490')))
AND ([t2].[RoomId] = [t0].[RoomId])
)) = 0)
The first result set returns 1 row where as the 2nd returns me 21!!
Can anybody spot the difference as its driving me crazy.
You made an error replacing the variables!
You replaced p4 with 4 when you should have replaced it with 5 and p5 with 5 instead of 0.
Well, one difference is #p5=0 while you have [t3].[Status] = 5 in the other.
Related
Using C# and Linq to SQL, I found that my query with multiple where is orders of magnitude slower than with a single where / and.
Here is the query
using (TeradiodeDataContext dc = new TeradiodeDataContext())
{
var filterPartNumberID = 71;
var diodeIDsInBlades = (from bd in dc.BladeDiodes
select bd.DiodeID.Value).Distinct();
var diodesWithTestData = (from t in dc.Tests
join tt in dc.TestTypes on t.TestTypeID equals tt.ID
where tt.DevicePartNumberID == filterPartNumberID
select t.DeviceID.Value).Distinct();
var result = (from d in dc.Diodes
where d.DevicePartNumberID == filterPartNumberID
where diodesWithTestData.Contains(d.ID)
where !diodeIDsInBlades.Contains(d.ID)
orderby d.Name
select d);
var list = result.ToList();
// ~15 seconds
}
However, when the condition in the final query is this
where d.DevicePartNumberID == filterPartNumberID
& diodesWithTestData.Contains(d.ID)
& !diodeIDsInBlades.Contains(d.ID)
// milliseconds
it is very fast.
Comparing the SQL in result before calling ToList(), here are the queries (value 71 manually added in place of #params)
-- MULTIPLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t2].[value]
FROM (
SELECT [t1].[DiodeID] AS [value]
FROM [dbo].[BladeDiode] AS [t1]
) AS [t2]
) AS [t3]
WHERE [t3].[value] = [t0].[ID]
))) AND (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t6].[value]
FROM (
SELECT [t4].[DeviceID] AS [value], [t5].[DevicePartNumberID]
FROM [dbo].[Test] AS [t4]
INNER JOIN [dbo].[TestType] AS [t5] ON [t4].[TestTypeID] = ([t5].[ID])
) AS [t6]
WHERE [t6].[DevicePartNumberID] = (71)
) AS [t7]
WHERE [t7].[value] = [t0].[ID]
)) AND ([t0].[DevicePartNumberID] = 71)
ORDER BY [t0].[Name]
and
-- SINGLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE ([t0].[DevicePartNumberID] = 71) AND (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t3].[value]
FROM (
SELECT [t1].[DeviceID] AS [value], [t2].[DevicePartNumberID]
FROM [dbo].[Test] AS [t1]
INNER JOIN [dbo].[TestType] AS [t2] ON [t1].[TestTypeID] = ([t2].[ID])
) AS [t3]
WHERE [t3].[DevicePartNumberID] = (71)
) AS [t4]
WHERE [t4].[value] = [t0].[ID]
)) AND (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT DISTINCT [t6].[value]
FROM (
SELECT [t5].[DiodeID] AS [value]
FROM [dbo].[BladeDiode] AS [t5]
) AS [t6]
) AS [t7]
WHERE [t7].[value] = [t0].[ID]
)))
ORDER BY [t0].[Name]
The two SQL queries execute in < 1 second in SSMS and produce the same results.
So I'm wondering why the first is slower on the LINQ side. It's worrying to me because I know I've used multiple where elsewhere, without being aware of a such a severe performance impact.
This question even has answered with both multiple & and where. And this answer even suggests using multiple where clauses.
Can anyone explain why this happens in my case?
Because writing like this
if (someParam1 != 0)
{
myQuery = myQuery.Where(q => q.SomeField1 == someParam1)
}
if (someParam2 != 0)
{
myQuery = myQuery.Where(q => q.SomeField2 == someParam2)
}
is NOT(upd) the same as (in case when someParam1 and someParam2 != 0)
myQuery = from t in Table
where t.SomeField1 == someParam1
&& t.SomeField2 == someParam2
select t;
is (NOT deleted) the same as
myQuery = from t in Table
where t.SomeField1 == someParam1
where t.SomeField2 == someParam2
select t;
UPD
Yes, I do mistake. Second query is same, first is not same.
First and Second queries not EXACTLY the same. Let me show you what I mean.
1st query with lamda-expression writen as
t.Where(r => t.SomeField1 == someParam1 && t.SomeField2 == someParam2)
2nd query as
t.Where(r => r.SomeField1 == someParam1).Where(r => r.SomeField2 == someParam2)
In this case in generated SQL Predicate with SomeField2 goes first (it is important, see below)
In 1st case we getting this SQL:
SELECT <all field from Table>
FROM table t
WHERE t.SomeField1 = :someParam1
AND t.SomeField2 = :someParam2
In 2 case the SQL is:
SELECT <all field from Table>
FROM table t
WHERE t.SomeField2 = :someParam2
AND t.SomeField1 = :someParam1
As we see there are 2 'same' SQLs. As we see, the OP's SQLs are also 'same', they are different in order of predicates in WHERE clause (as in my example). And I guess that SQL optimizer generate 2 different execution plans and may be(!!!) doing NOT EXISTS, then EXISTS and then filtering take more time than do first filtering and after that do EXISTS and NOT EXISTS
UPD2
It is a 'problem' of Linq Provider (ORM). I'm using another ORM (linq2db), and it generates for me EXACTLY the same SQLs in both cases.
I'm having some trouble with SQL timeout for the following LINQ2SQL query:
DateTime date = DateTime.Parse("2013-08-01 00:00:00.000");
Clients.Where(e =>
(
!Orders.Any(f => f.ClientId.Equals(e.Id) && f.OrderDate >= date)
||
Comments.Any(f => f.KeyId.Equals(e.Id))
)
).Count().Dump();
When running this in LinqPad it will take forever to finish and will become an SQL timeout if running on the server.
The SQL-code generated:
-- Region Parameters
DECLARE #p0 DateTime = '2013-08-01 00:00:00.000'
-- EndRegion
SELECT COUNT(*) AS [value]
FROM [Clients] AS [t0]
WHERE (NOT (EXISTS(
SELECT NULL AS [EMPTY]
FROM [Orders] AS [t1]
WHERE ([t1].[ClientId] = [t0].[Id]) AND ([t1].[OrderDate] >= #p0)
))) OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [Comments] AS [t2]
WHERE [t2].[KeyId] = [t0].[Id]
))
Works fine in SQL-studio!
But:
SELECT COUNT(*) AS [value]
FROM [Clients] AS [t0]
WHERE
(NOT (EXISTS(SELECT NULL AS [EMPTY] FROM [Orders] AS [t1] WHERE ([t1].[ClientId] = [t0].[Id]) AND ([t1].[OrderDate] >= '2013-08-01 00:00:00.000'))))
OR
(EXISTS(SELECT NULL AS [EMPTY] FROM [Comments] AS [t2] WHERE [t2].[KeyId] = [t0].[Id]))
And will get me a the problem as actually running the query in LinqPad.
What is the difference of using DECLARE #p0 DateTime = '2013-08-01 00:00:00.000' compared to using the constant date and how do I get my Linq2SQL to work?
EDIT:
See execution plans for both queries:
Timeouts:
Fine:
Some other things I've noticed is that if I remove the NOT it works fine:
SELECT COUNT(*) AS [value]
FROM [Clients] AS [t0]
WHERE
((EXISTS(SELECT NULL AS [EMPTY] FROM [Orders] AS [t1] WHERE ([t1].[ClientId] = [t0].[Id]) AND ([t1].[OrderDate] >= '2013-08-01 00:00:00.000'))))
OR
(EXISTS(SELECT NULL AS [EMPTY] FROM [Comments] AS [t2] WHERE [t2].[KeyId] = [t0].[Id]))
Or if I remove the OR EXISTS parts it also works fine:
SELECT COUNT(*) AS [value]
FROM [Clients] AS [t0]
WHERE
((EXISTS(SELECT NULL AS [EMPTY] FROM [Orders] AS [t1] WHERE ([t1].[ClientId] = [t0].[Id]) AND ([t1].[OrderDate] >= '2013-08-01 00:00:00.000'))))
Thanks
/Niels
Your Orders table must be fairly large. You have an index on OrderDate right?
SQL Server actually generates 2 different execution plans in this example. Or if it generates the same plan, SQL gives greatly different numbers of returned rows for the 2 statements.
DECLARE #p0 DateTime = '2013-08-01 00:00:00.000'
SELECT * FROM Orders WHERE OrderDate >= #p0
SELECT * FROM Orders WHERE OrderDate >= '2013-08-01 00:00:00.000'
The first statement generates a parameterized query, plan optimizer will assume #p0 is unknown at the time and choose an execution plan that best fits unknown values.
The 2nd statement, optimizer will take into account that you supplied a fixed value. SQL will look at the index distribution and estimate how many rows will be filtered by >= '2013-08-01'
The execution plan is not visible, but in general sql performance recommendation don't use negations it always be a hit in performance. In your case try to use the <= instead the negation with the >=
And if you use many or it will hit your performance as well. Try to use subquerys as a work around to not use many or or negations.
The solution for me was to rebuild the index of OrderDate.
I'm newish to LinqToSQL and the project that I am working on cannot be changed to something else. I am translating some old SQL code to Linq. Not being that hot at linq, I used Linqer to do the translation for me. The query took about 90 seconds to run, so I thought it must be the linqToSQL. However, when I copied the query that the LinqToSQL produced and ran an ExecuteQuery on the datacontext it was super quick as I expected. I've copied the full queries, rather than trying to distil it down, but it looks like the issue is with something LinqToSQL is doing behind the scenes.
To summarise, if I copy the T-SQL created by linq and run
var results = DB.ExecuteQuery<InvoiceBalanceCheckDTO.InvoiceBalanceCheck>(#"T-SQL created by Linq - see below").ToList()
it completes with expected results in about 0.5 seconds.
It runs about the same time directly in SSMS. However, if I use the linqToSQL code that creates the T-SQL and do ToList() it takes ages. The result is only 9 records, although without the constraint to check the balance <> 0, there would be around 19,000 records. It's as if it's getting all 19,000 and then checking <> 0 after it's got the records.
I have also changed the Linq to project into the class used above, rather than to an anonymous type, but it makes not difference
This is the original SQL :
SELECT InvoiceNum, Max(AccountCode), Sum(AmountInc) AS Balance
FROM
(SELECT InvoiceNum, AccountCode, AmountInc From TourBookAccount WHERE AccDetailTypeE IN(20,30) AND InvoiceNum >= 1000
UNION ALL
SELECT InvoiceNum, '<no matching invoice>' AS AccountCode, AccountInvoiceDetail.AmountInc
FROM AccountInvoiceDetail
INNER JOIN AccountInvoice ON AccountInvoiceDetail.InvoiceID=AccountInvoice.InvoiceID
WHERE AccDetailTypeE IN(20,30)
AND InvoiceNum >= 1000
) as t
GROUP BY InvoiceNum
HAVING (Sum(t.AmountInc)<>0)
ORDER BY InvoiceNum
and this is the linq
var test = (from t in
(
//this gets the TourBookAccount totals
from tba in DB.TourBookAccount
where
detailTypes.Contains(tba.AccDetailTypeE) &&
tba.InvoiceNum >= dto.CheckInvoiceNumFrom
select new
{
InvoiceNum = tba.InvoiceNum,
AccountCode = tba.AccountCode,
Balance = tba.AmountInc
}
)
.Concat //note that concat, since it's possible that the AccountInvoice record does not actually exist
(
//this gets the Invoice detail totals.
from aid in DB.AccountInvoiceDetail
where
detailTypes.Contains(aid.AccDetailTypeE) &&
aid.AccountInvoice.InvoiceNum >= dto.CheckInvoiceNumFrom &&
select new
{
InvoiceNum = aid.AccountInvoice.InvoiceNum,
AccountCode = "<No Account Records>",
Balance = aid.AmountInc
}
)
group t by t.InvoiceNum into g
where Convert.ToDecimal(g.Sum(p => p.Balance)) != 0m
select new
{
InvoiceNum = g.Key,
AccountCode = g.Max(p => p.AccountCode),
Balance = g.Sum(p => p.Balance)
}).ToList();
and this is the T-SQL that the linq produces
SELECT [t5].[InvoiceNum], [t5].[value2] AS [AccountCode], [t5].[value3] AS [Balance]
FROM (
SELECT SUM([t4].[AmountInc]) AS [value], MAX([t4].[AccountCode]) AS [value2], SUM([t4].[AmountInc]) AS [value3], [t4].[InvoiceNum]
FROM (
SELECT [t3].[InvoiceNum], [t3].[AccountCode], [t3].[AmountInc]
FROM (
SELECT [t0].[InvoiceNum], [t0].[AccountCode], [t0].[AmountInc]
FROM [dbo].[TourBookAccount] AS [t0]
WHERE ([t0].[AccDetailTypeE] IN (20, 30)) AND ([t0].[InvoiceNum] >= 1000)
UNION ALL
SELECT [t2].[InvoiceNum],'<No Account Records>' AS [value], [t1].[AmountInc]
FROM [dbo].[AccountInvoiceDetail] AS [t1]
INNER JOIN [dbo].[AccountInvoice] AS [t2] ON [t2].[InvoiceID] = [t1].[InvoiceID]
WHERE ([t1].[AccDetailTypeE] IN (20, 30)) AND ([t2].[InvoiceNum] >= 1000)
) AS [t3]
) AS [t4]
GROUP BY [t4].[InvoiceNum]
) AS [t5]
WHERE [t5].[value] <> 0
I would bet money, that the problem is in this line:
where Convert.ToDecimal(g.Sum(p => p.Balance)) != 0m
What is probably happening, is that it can't translate this to SQL and silently tries to get all rows from db to memory, and then do filtering on in memory objects (LINQ to objects)
Maybe try to change this to something like:
where g.Sum(p=>.Balance!=0)
Well, the answer turned out not to be LinqToSQL itself (although possibly the way it creates the query could be blamed) , but the way SQL server handles the query. When I was running the query on the database to check speed (and running the created T=SQL in DB.ExecuteQuery) I had all the variables hardcoded. When I changed it to use the exact sql that Linq produces (i.e. with variables that are substituted) it ran just as slow in SSMS.
Looking at the execution plans of the two, they are quite different. A quick search on SO brought me to this page : Why does a parameterized query produces vastly slower query plan vs non-parameterized query which indicated that the problem was SQL server's "Parameter sniffing".
The culprit turned out to be the "No Account Records" string
For completeness, here is the generated T-SQL that Linq creates.
Change #p10 to the actual hardcoded string, and it's back to full speed !
In the end I just removed the line from the linq and set the account code afterwards and all was good.
Thanks #Botis,#Blorgbeard,#ElectricLlama & #Scott for suggestions.
DECLARE #p0 as Int = 20
DECLARE #p1 as Int = 30
DECLARE #p2 as Int = 1000
DECLARE #p3 as Int = 20
DECLARE #p4 as Int = 30
DECLARE #p5 as Int = 1000
DECLARE #p6 as Int = 40
DECLARE #p7 as Int = 10
DECLARE #p8 as Int = 0
DECLARE #p9 as Int = 1
DECLARE #p10 as NVarChar(4000)= '<No Account Records>' /*replace this parameter with the actual text in the SQl and it's way faster.*/
DECLARE #p11 as Decimal(33,4) = 0
SELECT [t5].[InvoiceNum], [t5].[value2] AS [AccountCode], [t5].[value3] AS [Balance]
FROM (
SELECT SUM([t4].[AmountInc]) AS [value], MAX([t4].[AccountCode]) AS [value2], SUM([t4].[AmountInc]) AS [value3], [t4].[InvoiceNum]
FROM (
SELECT [t3].[InvoiceNum], [t3].[AccountCode], [t3].[AmountInc]
FROM (
SELECT [t0].[InvoiceNum], [t0].[AccountCode], [t0].[AmountInc]
FROM [dbo].[TourBookAccount] AS [t0]
WHERE ([t0].[AccDetailTypeE] IN (#p0, #p1)) AND ([t0].[InvoiceNum] >= #p2)
UNION ALL
SELECT [t2].[InvoiceNum], #p10 AS [value], [t1].[AmountInc]
FROM [dbo].[AccountInvoiceDetail] AS [t1]
INNER JOIN [dbo].[AccountInvoice] AS [t2] ON [t2].[InvoiceID] = [t1].[InvoiceID]
WHERE ([t1].[AccDetailTypeE] IN (#p3, #p4)) AND ([t2].[InvoiceNum] >= #p5) AND ([t2].[InvoiceStatusE] <= #p6) AND ([t2].[InvoiceTypeE] = #p7) AND ([t1].[BookNum] <> #p8) AND ([t1].[AccDetailSourceE] = #p9)
) AS [t3]
) AS [t4]
GROUP BY [t4].[InvoiceNum]
) AS [t5]
WHERE [t5].[value] <> #p11
SELECT [t5].[InvoiceNum], [t5].[value2] AS [AccountCode], [t5].[value3] AS [Balance]
FROM (
SELECT SUM([t4].[AmountInc]) AS [value], MAX([t4].[AccountCode]) AS [value2], SUM([t4].[AmountInc]) AS [value3], [t4].[InvoiceNum]
FROM (
SELECT [t3].[InvoiceNum], [t3].[AccountCode], [t3].[AmountInc]
FROM (
SELECT [t0].[InvoiceNum], [t0].[AccountCode], [t0].[AmountInc]
FROM [dbo].[TourBookAccount] AS [t0]
WHERE ([t0].[AccDetailTypeE] IN (20, 30)) AND ([t0].[InvoiceNum] >= 1000)
UNION ALL
SELECT [t2].[InvoiceNum], '<No Account Records>' AS [value], [t1].[AmountInc]
FROM [dbo].[AccountInvoiceDetail] AS [t1]
INNER JOIN [dbo].[AccountInvoice] AS [t2] ON [t2].[InvoiceID] = [t1].[InvoiceID]
WHERE ([t1].[AccDetailTypeE] IN (20, 30)) AND ([t2].[InvoiceNum] >= 0) AND ([t2].[InvoiceStatusE] <= 40) AND ([t2].[InvoiceTypeE] = 10) AND ([t1].[BookNum] <> 0) AND ([t1].[AccDetailSourceE] = 1)
) AS [t3]
) AS [t4]
GROUP BY [t4].[InvoiceNum]
) AS [t5]
WHERE [t5].[value] <> 0
I have following SQL Query and would like to convert to LINQ to SQL which I will use in entity framework 5.0
var internationalDesksList =
from internationalDesks in _context.InternationalDesks
from subsection in
_context.Subsections.Where(
s =>
internationalDesks.EBALocationId == s.LocationId ||
internationalDesks.FELocationId == s.LocationId).DefaultIfEmpty()
where subsection.PublicationId == 1
select new {internationalDesks.Id, subsection.LocationId};
I have referred the following posts and answers. Though no luck.
LINQ to SQL - Left Outer Join with multiple join conditions Linq
left join on multiple (OR) conditions
When I tried this query in LINQPad I got the following answer which is correct.
-- Region Parameters
DECLARE #p0 Int = 1
-- EndRegion
SELECT [t0].[Id], [t1].[Id] AS [Id1]
FROM [InternationalDesks] AS [t0]
LEFT OUTER JOIN [Subsection] AS [t1] ON (([t0].[FELocationId]) = [t1].[LocationId]) OR (([t0].[EBALocationId]) = [t1].[LocationId])
WHERE [t1].[PublicationId] = #p0
However in entity framework 5 ( DBContext ) it is not providing me with the correct query. When I checked in SQL profiler all columns in subsection table is selected. That's it.
Following is the result:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Description] AS [Description],
[Extent1].[PracticeAreaId] AS [PracticeAreaId],
[Extent1].[LocationId] AS [LocationId],
...
FROM [dbo].[Subsection] AS [Extent1]
Don't know what could be problem. Please help me.
With LINQ you can't do LEFT OUTER JOIN on some boolean expression, only equijoins are supported. So, you can generate CROSS JOIN this way:
var internationalDesksList =
from internationalDesks in _context.InternationalDesks
from subsection in _context.Subsections
where subsection.PublicationId == 1 &&
(internationalDesks.EBALocationId == subsection.LocationId ||
internationalDesks.FELocationId == subsection.LocationId)
select new {
internationalDesks.Id,
subsection.LocationId
};
EF 5 will generate following SQL:
SELECT
[Extent1].[Id] AS [Id],
[Extent2].[LocationId] AS [LocationId]
FROM [dbo].[InternationalDesks] AS [Extent1]
CROSS JOIN [dbo].[Subsections] AS [Extent2]
WHERE (1 = [Extent2].[PublicationId]) AND
([Extent1].[EBALocationId] = [Extent2].[LocationId] OR
[Extent1].[FELocationId] = [Extent2].[LocationId])
As you can see, only required columns are selected. I also checked this query in LINQ to SQL - following query is generated:
DECLARE #p0 Int = 1
SELECT [t0].[Id], [t1].[LocationId]
FROM [InternationalDesks] AS [t0], [Subsections] AS [t1]
WHERE ([t1].[PublicationId] = #p0) AND
(([t0].[EBALocationId] = [t1].[LocationId]) OR
([t0].[FELocationId] = [t1].[LocationId]))
I am trying to convert a SQL query to LINQ. Somehow my count(distinct(x)) logic does not seem to be working correctly. The original SQL is quite efficient(or so i think), but the generated SQL is not even returning the correct result.
I am trying to fix this LINQ to do what the original SQL is doing, AND in an efficient way as the original query is doing. Help here would be really apreciated as I am stuck here :(
SQL which is working and I need to make a comparable LINQ of:
SELECT [t1].[PersonID] AS [personid]
FROM [dbo].[Code] AS [t0]
INNER JOIN [dbo].[phonenumbers] AS [t1] ON [t1].[PhoneCode] = [t0].[Code]
INNER JOIN [dbo].[person] ON [t1].[PersonID]= [dbo].[Person].PersonID
WHERE ([t0].[codetype] = 'phone') AND (
([t0].[CodeDescription] = 'Home') AND ([t1].[PhoneNum] = '111')
OR
([t0].[CodeDescription] = 'Work') AND ([t1].[PhoneNum] = '222') )
GROUP BY [t1].[PersonID] HAVING COUNT(DISTINCT([t1].[PhoneNum]))=2
The LINQ which I made is approximately as below:
var ids = context.Code.Where(predicate);
var rs = from r in ids
group r by new { r.phonenumbers.person.PersonID} into g
let matchcount=g.Select(p => p.phonenumbers.PhoneNum).Distinct().Count()
where matchcount ==2
select new
{
personid = g.Key
};
Unfortunately, the above LINQ is NOT generating the correct result, and is actually internally getting generated to the SQL shown below. By the way, this generated query is also reading ALL the rows(about 19592040) around 2 times due to the COUNTS :( Wich is a big performance issue too. Please help/point me to the right direction.
Declare #p0 VarChar(10)='phone'
Declare #p1 VarChar(10)='Home'
Declare #p2 VarChar(10)='111'
Declare #p3 VarChar(10)='Work'
Declare #p4 VarChar(10)='222'
Declare #p5 VarChar(10)='2'
SELECT [t9].[PersonID], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t13].[PhoneNum]
FROM [dbo].[Code] AS [t10]
INNER JOIN [dbo].[phonenumbers] AS [t11] ON [t11].[PhoneType] = [t10].[Code]
INNER JOIN [dbo].[Person] AS [t12] ON [t12].[PersonID] = [t11].[PersonID]
INNER JOIN [dbo].[phonenumbers] AS [t13] ON [t13].[PhoneType] = [t10].[Code]
WHERE ([t9].[PersonID] = [t12].[PersonID]) AND ([t10].[codetype] = #p0) AND ((([t10].[codetype] = #p1) AND ([t11].[PhoneNum] = #p2)) OR (([t10].[codetype] = #p3) AND ([t11].[PhoneNum] = #p4)))
) AS [t14]
) AS [cnt]
FROM (
SELECT [t3].[PersonID], (
SELECT COUNT(*)
FROM (
SELECT DISTINCT [t7].[PhoneNum]
FROM [dbo].[Code] AS [t4]
INNER JOIN [dbo].[phonenumbers] AS [t5] ON [t5].[PhoneType] = [t4].[Code]
INNER JOIN [dbo].[Person] AS [t6] ON [t6].[PersonID] = [t5].[PersonID]
INNER JOIN [dbo].[phonenumbers] AS [t7] ON [t7].[PhoneType] = [t4].[Code]
WHERE ([t3].[PersonID] = [t6].[PersonID]) AND ([t4].[codetype] = #p0) AND ((([t4].[codetype] = #p1) AND ([t5].[PhoneNum] = #p2)) OR (([t4].[codetype] = #p3) AND ([t5].[PhoneNum] = #p4)))
) AS [t8]
) AS [value]
FROM (
SELECT [t2].[PersonID]
FROM [dbo].[Code] AS [t0]
INNER JOIN [dbo].[phonenumbers] AS [t1] ON [t1].[PhoneType] = [t0].[Code]
INNER JOIN [dbo].[Person] AS [t2] ON [t2].[PersonID] = [t1].[PersonID]
WHERE ([t0].[codetype] = #p0) AND ((([t0].[codetype] = #p1) AND ([t1].[PhoneNum] = #p2)) OR (([t0].[codetype] = #p3) AND ([t1].[PhoneNum] = #p4)))
GROUP BY [t2].[PersonID]
) AS [t3]
) AS [t9]
WHERE [t9].[value] = #p5
Thanks!
I think the issue might be new { r.phonenumbers.person.PersonID}.
Why are you newing up a new object here rather than just grouping by r.phonenumbers.person directly? new {} is going to be a different object every time which will never group.
After grouping by person I would Select a group => new {person = group.person, phoneNumbers = group.person.phonenumbers} and then perform the check for how many phone numbers they have and then any final projection.
Drats! It looks like the fault was on my side(GIGO principle!)
In my ORM, I had created the associations from right to left, instead of the other way round. I think that was the issue.
Only issue left now is that somehow the LINQ is generating one INNER JOIN twice, and that is keeping the final result to be retrieved correctly. If i comment it out i nthe generated sql, i am getting correct result. That's the only issue now and I guess i'll open a new question for that. Thanks for your time!.