Linq to SQL Join Allowing Nulls When I don't want that - c#

I've written the below query to join a couple of tables. The la.UserProfileId is nullable for some ungodly reason.
When I write the equivalent SQL statement, I get 0 records as I should at this point.
var result = (from e in _ctx.Employees
join la in _ctx.LoginAudits on e.UserProfile.Id equals la.UserProfileId.Value
where la.LoginDate >= fromDate
&& e.Client.Id == clientID
select new
{
la.Id,
employeeID = e.Id,
e.Client.DisplayName,
la.UserProfileId
}).ToList();
The above LINQ code generates the below SQL.
exec sp_executesql N'SELECT
1 AS [C1],
[Extent2].[Id] AS [Id],
[Extent1].[Id] AS [Id1],
[Extent3].[DisplayName] AS [DisplayName],
[Extent2].[UserProfileId] AS [UserProfileId]
FROM [dbo].[Employees] AS [Extent1]
INNER JOIN [dbo].[LoginAudits] AS [Extent2] ON ([Extent1].[UserProfile_Id] = [Extent2].[UserProfileId]) OR (([Extent1].[UserProfile_Id] IS NULL) AND ([Extent2].[UserProfileId] IS NULL))
INNER JOIN [dbo].[Clients] AS [Extent3] ON [Extent1].[Client_Id] = [Extent3].[Id]
WHERE ([Extent2].[LoginDate] >= #p__linq__0) AND ([Extent1].[Client_Id] = #p__linq__1)',N'#p__linq__0 datetime2(7),#p__linq__1 bigint',#p__linq__0='2018-02-09 11:11:29.1047249',#p__linq__1=37
As you can see, it includes "OR (([Extent1].[UserProfile_Id] IS NULL) AND ([Extent2].[UserProfileId] IS NULL))"
That is the exact opposite of what I want. How do I make it do a normal inner join and not try to allow for null values?
I was able to work around this by adding && la.UserProfileId != null in my WHERE clause, but ideally I'd rather make the JOIN behave like normal INNER JOIN and not try to anticipate stuff I don't ask for.

That is the exact opposite of what I want. How do I make it do a normal inner join and not try to allow for null values?
The reasoning behind is that in C# null == null evaluates to true, while in SQL it evaluates to NULL (and basically is handled like FALSE). So EF is trying to emulate the C# behavior in order to get the same result as if you were running the same query in LINQ to Objects.
This is the default EF6 behavior. It is controlled by the UseDatabaseNullSemantics property, so if you want to use the SQL behavior, you should set it to true in your DbContext derived class constructor or from outside:
[dbContext.]Configuration.UseDatabaseNullSemantics = true;
But that's not enough though. It affects all the comparison operators, but they forgot to apply it to joins. The solution is to not use LINQ join operator, but correlated where (EF is smart enough to turn it into SQL JOIN).
So additionally to setting UseDatabaseNullSemantics to true, replace
join la in _ctx.LoginAudits on e.UserProfile.Id equals la.UserProfileId
with
from la in _ctx.LoginAudits where e.UserProfile.Id == la.UserProfileId
and you'll get the desired INNER JOIN.

Related

Entity Framework LINQ to Entities Join Query Timeout

I am executing the following LINQ to Entities query but it is stuck and does not return response until timeout. I executed the same query on SQL Server and it return 92000 in 3 sec.
var query = (from r in WinCtx.PartsRoutings
join s in WinCtx.Tab_Processes on r.ProcessName equals s.ProcessName
join p in WinCtx.Tab_Parts on r.CustPartNum equals p.CustPartNum
select new { r}).ToList();
SQL Generated:
SELECT [ I omitted columns]
FROM [dbo].[PartsRouting] AS [Extent1]
INNER JOIN [dbo].[Tab_Processes] AS [Extent2] ON ([Extent1].[ProcessName] = [Extent2].[ProcessName]) OR (([Extent1].[ProcessName] IS NULL) AND ([Extent2].[ProcessName] IS NULL))
INNER JOIN [dbo].[Tab_Parts] AS [Extent3] ON ([Extent1].[CustPartNum] = [Extent3].[CustPartNum]) OR (([Extent1].[CustPartNum] IS NULL) AND ([Extent3].[CustPartNum] IS NULL))
PartsRouting Table has 100,000+ records, Parts = 15000+, Processes = 200.
I tried too many things found online but nothing worked for me as to how I can achieve the result with same performance of SQL.
Based on the comments, looks like the issue is caused by the additional OR with IS NULL conditions in joins generated by the EF SQL translator. They were added in EF in order to emulate the C# == operator semantics which are different from SQL = for NULL values.
You can start by turning that EF behavior off through UseDatabaseNullSemantics property (it's false by default):
WinCtx.Configuration.UseDatabaseNullSemantics = true;
Unfortunately that's not enough, because it fixes the normal comparison operators, but they simply forgot to do the same for join conditions.
In case you are using joins just for filtering (as it seems), you can replace them with LINQ Any conditions which translates to SQL EXISTS and nowadays database query optimizers are treating it the same way as if it was an inner join:
var query = (from r in WinCtx.PartsRoutings
where WinCtx.Tab_Processes.Any(s => r.ProcessName == s.ProcessName)
where WinCtx.Tab_Parts.Any(p => r.CustPartNum == p.CustPartNum)
select new { r }).ToList();
You might also consider using just select r since creating anonymous type with single property just introdeces additional memory overhead with no advantages.
Update: Looking at the latest comment, you do need fields from joined tables (that's why it's important to not omit relevant parts of the query in question). In such case, you could try the alternative join syntax with where clauses:
WinCtx.Configuration.UseDatabaseNullSemantics = true;
var query = (from r in WinCtx.PartsRoutings
from s in WinCtx.Tab_Processes where r.ProcessName == s.ProcessName
from p in WinCtx.Tab_Parts where r.CustPartNum == p.CustPartNum
select new { r, s.Foo, p.Bar }).ToList();

EntityFramework generating poor tsql (linqpad generates much better)

I have a database structure as below
Family(1) ----- (*) FamilyPersons -----(1)Person(1)------() Expenses (1) -----(0..1)GroceriesDetails
Let me explain that relation, Family can have one or more than one person , we have a mapping table FamilyPersons between Family and Persons. Now each person can enter his expenses which go into Expenses Table. Expense Table has a column ExpenseType (groceries, entertainemnet etc)
and details of each of these expenses goes into their own Tables, so we have a GroceriesDetails table (similarly we have other tables), so we have 1 to 0..1 relation between Expense and Groceries.
Now I am writing a query to get Complete GroceriesDetails for a family
GroceriesDetails.Where (g => g.Expenses.Person.FamilyPersons.Any(fp =>
fp.FamilyId == 1) && g.Expenses.ExpenseType == "GC" )
For this the sql generated by EF is
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Amount] AS [Amount]
FROM [dbo].[GroceriesDetails] AS [Extent1]
INNER JOIN (SELECT [Extent3].[Id] AS [Id1]
FROM [dbo].[Expenses] AS [Extent2]
INNER JOIN [dbo].[GroceriesDetails] AS [Extent3] ON [Extent2].[Id] = [Extent3].[Id]
WHERE N'GC' = [Extent2].[ExpenseType] ) AS [Filter1] ON [Extent1].[Id] = [Filter1].[Id1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[Expenses] AS [Extent4]
INNER JOIN [dbo].[GroceriesDetails] AS [Extent5] ON [Extent4].[Id] = [Extent5].[Id]
INNER JOIN [dbo].[FamilyPerson] AS [Extent6] ON [Extent4].[PersonId] = [Extent6].[PersonId]
WHERE ([Extent1].[Id] = [Extent5].[Id]) AND (1 = [Extent6].[FamilyId])
)
In this query there is a full table join between Expenses and GroceriesDetails tables which is causing performance issues.
Whereas Linqpad generates a much better SQL
SELECT [t0].[Id], [t0].[Amount]
FROM [GroceriesDetails] AS [t0]
INNER JOIN [Expenses] AS [t1] ON [t1].[Id] = [t0].[Id]
WHERE (EXISTS(
SELECT NULL AS [EMPTY]
FROM [Expenses] AS [t2]
INNER JOIN [Person] AS [t3] ON [t3].[Id] = [t2].[PersonId]
CROSS JOIN [FamilyPerson] AS [t4]
WHERE ([t4].[FamilyId] = #p0) AND ([t2].[Id] = [t0].[Id]) AND ([t4].[PersonId] =
[t3].[Id])
)) AND ([t1].[ExpenseType] = #p1)
Please note that we are using WCF data services so this query is written against a WCF data service reference, so I can't traverse from top (family) to bottom (Groceries) as OData allows only one level of select.
Any help on optimizing this code is appreciated.
From the comments I learned that LinqPad uses Linq2SQL while the app uses EF, and that explains the difference.
The thing is that you have zero control on how EF generates SQL.
The only thing you can do is to rewrite your LINQ query to make it "closer" to desired SQL.
For example, instead of
GroceriesDetails.Where (g => g.Expenses.Person.FamilyPersons.Any(fp => fp.FamilyId == 1)
&& g.Expenses.ExpenseType == "GC" )
you can try to write something like (pseudocode):
from g in GrosseriesDetails
join e in Expenses on g.Id = e.GrosseryId
join p in Persons on p.Id = e.PersonId
join f in FamilyPersons on f.PersonId = p.Id
where f.FamilyId == 1 && e.ExpenseType == "GC"
It almost always helps as it tells an ORM a straightforward way to transform it into SQL. The idea is that the expression tree in the "original" case is more complex compare to the proposed scenario, and by simplifying the expression tree we make translator's job easier and more straightforward.
But besides manipulating the LINQ there is no control over how it generates SQL from the expression tree.

How to convert following SQL Query into a LINQ to SQL query

I have following SQL Query and would like to convert to LINQ to SQL which I will use in entity framework 5.0
var internationalDesksList =
from internationalDesks in _context.InternationalDesks
from subsection in
_context.Subsections.Where(
s =>
internationalDesks.EBALocationId == s.LocationId ||
internationalDesks.FELocationId == s.LocationId).DefaultIfEmpty()
where subsection.PublicationId == 1
select new {internationalDesks.Id, subsection.LocationId};
I have referred the following posts and answers. Though no luck.
LINQ to SQL - Left Outer Join with multiple join conditions Linq
left join on multiple (OR) conditions
When I tried this query in LINQPad I got the following answer which is correct.
-- Region Parameters
DECLARE #p0 Int = 1
-- EndRegion
SELECT [t0].[Id], [t1].[Id] AS [Id1]
FROM [InternationalDesks] AS [t0]
LEFT OUTER JOIN [Subsection] AS [t1] ON (([t0].[FELocationId]) = [t1].[LocationId]) OR (([t0].[EBALocationId]) = [t1].[LocationId])
WHERE [t1].[PublicationId] = #p0
However in entity framework 5 ( DBContext ) it is not providing me with the correct query. When I checked in SQL profiler all columns in subsection table is selected. That's it.
Following is the result:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Description] AS [Description],
[Extent1].[PracticeAreaId] AS [PracticeAreaId],
[Extent1].[LocationId] AS [LocationId],
...
FROM [dbo].[Subsection] AS [Extent1]
Don't know what could be problem. Please help me.
With LINQ you can't do LEFT OUTER JOIN on some boolean expression, only equijoins are supported. So, you can generate CROSS JOIN this way:
var internationalDesksList =
from internationalDesks in _context.InternationalDesks
from subsection in _context.Subsections
where subsection.PublicationId == 1 &&
(internationalDesks.EBALocationId == subsection.LocationId ||
internationalDesks.FELocationId == subsection.LocationId)
select new {
internationalDesks.Id,
subsection.LocationId
};
EF 5 will generate following SQL:
SELECT
[Extent1].[Id] AS [Id],
[Extent2].[LocationId] AS [LocationId]
FROM [dbo].[InternationalDesks] AS [Extent1]
CROSS JOIN [dbo].[Subsections] AS [Extent2]
WHERE (1 = [Extent2].[PublicationId]) AND
([Extent1].[EBALocationId] = [Extent2].[LocationId] OR
[Extent1].[FELocationId] = [Extent2].[LocationId])
As you can see, only required columns are selected. I also checked this query in LINQ to SQL - following query is generated:
DECLARE #p0 Int = 1
SELECT [t0].[Id], [t1].[LocationId]
FROM [InternationalDesks] AS [t0], [Subsections] AS [t1]
WHERE ([t1].[PublicationId] = #p0) AND
(([t0].[EBALocationId] = [t1].[LocationId]) OR
([t0].[FELocationId] = [t1].[LocationId]))

LinqToEntities produces incorrect SQL (Doubled Subquery)

I have a LinqToEntities query that double produces a subquery when creating the SQL. This causes the result set to come back with 0-3 results, every time the query is run. The subquery on its own produces a single random result (as it should). What is going on here?
The LINQ query:
from jpj in JobProviderJobs
where jpj.JobID == 4725
&& jpj.JobProviderID == (from jp2 in JobProviderJobs
where jp2.JobID == 4725
orderby Guid.NewGuid()
select jp2.JobProviderID).FirstOrDefault()
select new
{
JobProviderID = jpj.JobProviderID
}
Produce this SQL:
SELECT
[Filter2].[JobID] AS [JobID],
[Filter2].[JobProviderID1] AS [JobProviderID]
FROM (SELECT [Extent1].[JobID] AS [JobID], [Extent1].[JobProviderID] AS [JobProviderID1], [Limit1].[JobProviderID] AS [JobProviderID2]
FROM [dbo].[JobProviderJob] AS [Extent1]
LEFT OUTER JOIN (SELECT TOP (1) [Project1].[JobProviderID] AS [JobProviderID]
FROM ( SELECT
NEWID() AS [C1],
[Extent2].[JobProviderID] AS [JobProviderID]
FROM [dbo].[JobProviderJob] AS [Extent2]
WHERE 4725 = [Extent2].[JobID]
) AS [Project1]
ORDER BY [Project1].[C1] ASC ) AS [Limit1] ON 1 = 1
WHERE 4725 = [Extent1].[JobID] ) AS [Filter2]
LEFT OUTER JOIN (SELECT TOP (1) [Project2].[JobProviderID] AS [JobProviderID]
FROM ( SELECT
NEWID() AS [C1],
[Extent3].[JobProviderID] AS [JobProviderID]
FROM [dbo].[JobProviderJob] AS [Extent3]
WHERE 4725 = [Extent3].[JobID]
) AS [Project2]
ORDER BY [Project2].[C1] ASC ) AS [Limit2] ON 1 = 1
WHERE [Filter2].[JobProviderID1] = (CASE WHEN ([Filter2].[JobProviderID2] IS NULL) THEN 0 ELSE [Limit2].[JobProviderID] END)
EDIT:
So changing the subquery to this works, but I have no idea why
(from jp2 in JobProviderJobs
where jp2.JobID == 4725
orderby Guid.NewGuid()
select jp2).FirstOrDefault().JobProviderID
It's doing this due to the expected behavior of FirstOrDefault(). Calling FirstOrDefault() on an empty set of JobProviderJobs would produce a null value, but calling it on an empty set of ints would produce a 0. Recognizing this, LINQ to Entities tries to invoke a case statement at the end of the query to ensure that if there are no matching JobProviderJobs, the result of the select will be 0 instead of null.
In most cases, recreating the inner projection would cause no problems, but your use of NewGuid() obviously throws this logic off.
You found one solution. Another one would be to cast the result of the inner expression thusly:
from jpj in JobProviderJobs
where jpj.JobID == 4725
&& jpj.JobProviderID == (from jp2 in JobProviderJobs
where jp2.JobID == 4725
orderby Guid.NewGuid()
select (int?) jp2.JobProviderID).FirstOrDefault()
select new
{
JobProviderID = jpj.JobProviderID
}
A more correct implementation would reuse the initial SQL projection rather than creating two equivalent projections. It's likely that parser simply isn't complex enough to handle this properly, and the developers never saw any need to fix it because SQL Server should be able to optimize the identical projections away as long as they are deterministic.
You should probably log a bug report about this if one doesn't already exist.

Pulling Data from 1 DataContext to use in another's Method

I have been trying to convert this SQL statement to a LINQ one and am having trouble with the fact that part of the info returned is in a Seperate Database(Datacontext) from the rest. I am pretty sure this can be overcome however I seem to be failing at accomplishing this or finding examples of previous successful attempts.
Can someone offer some guidance on what I do to overcome that hurdle? Thanks
SELECT p.PersonID, p.FirstName, p.MiddleName, p.LastName, cp.EnrollmentID, cp.EnrollmentDate, cp.DisenrollmentDate
FROM [Connect].dbo.tblPerson AS p
INNER JOIN (
SELECT c.ClientID, c.EnrollmentID, c.EnrollmentDate, c.DisenrollmentDate
FROM [CMO].dbo.tblCMOEnrollment AS c
LEFT OUTER JOIN [CMO].dbo.tblWorkerHistory AS wh
ON c.EnrollmentID = wh.EnrollmentID
INNER JOIN [CMO].dbo.tblStaffExtended AS se
ON wh.Worker = se.StaffID
WHERE (wh.EndDate IS NULL OR wh.EndDate >= getdate())
AND wh.Worker = --WorkerGUID Param here
) AS cp
ON p.PersonID = cp.ClientID
ORDER BY p.PersonID
I have asked a similar question here before as was told I would need to create a View in order to accomplish this. Is that still true or was it ever?
I use LINQPad to do a lot of my LINQ to SQL. One of the features it allows is the use of multiple data contexts for one query.
for instance here is some code that I wrote in LINQPad
from template in RateTemplates
where
template.Policies.Any(p =>
Staging_history.Changes.Any(c =>
(c.Policies.Any(cp => cp.PolicyID == p.PolicyID) ||
c.PolicyFees.Any(cpf => cpf.PolicyID == p.PolicyID) ||
c.PolicyOptions.Any(cpo => cpo.PolicyID == p.PolicyID)) &&
c.ChangeTime > new DateTime(2012, 1, 11)
)
)
select new
{
TemplateID = template.ID,
UserID = template.UserID,
PropertyIDs = template.Properties.Select(ppty => ppty.PropertyID)
}
The table "RateTemplates" is a part of my first Data Context (With LINQPad you do not have to define the first data context in your code it is just assumed, but if you do this is C# you would need to specifically say which context to use etc). "Staging_history" is the second Data Context and I am using the table "Changes" from this one.
LINQ to SQL will do all sorts of magic in the background and the resulting SQL that gets executed is ...
-- Region Parameters
DECLARE #p0 DateTime = '2012-01-11 00:00:00.000'
-- EndRegion
SELECT [t0].[ID] AS [TemplateID], [t0].[UserID], [t1].[PropertyID], (
SELECT COUNT(*)
FROM [Property] AS [t7]
WHERE [t7].[RateTemplateID] = [t0].[ID]
) AS [value]
FROM [RateTemplate] AS [t0]
LEFT OUTER JOIN [Property] AS [t1] ON [t1].[RateTemplateID] = [t0].[ID]
WHERE EXISTS(
SELECT NULL AS [EMPTY]
FROM [Policy] AS [t2]
WHERE (EXISTS(
SELECT NULL AS [EMPTY]
FROM [staging_history].[dbo].[Change] AS [t3]
WHERE ((EXISTS(
SELECT NULL AS [EMPTY]
FROM [staging_history].[dbo].[Policy] AS [t4]
WHERE ([t4].[PolicyID] = [t2].[PolicyID]) AND ([t4].[ChangeID] = [t3].[ID])
)) OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [staging_history].[dbo].[PolicyFee] AS [t5]
WHERE ([t5].[PolicyID] = [t2].[PolicyID]) AND ([t5].[ChangeID] = [t3].[ID])
)) OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [staging_history].[dbo].[PolicyOption] AS [t6]
WHERE ([t6].[PolicyID] = [t2].[PolicyID]) AND ([t6].[ChangeID] = [t3].[ID])
))) AND ([t3].[ChangeTime] > #p0)
)) AND ([t2].[RateTemplateID] = [t0].[ID])
)
ORDER BY [t0].[ID], [t1].[PropertyID]
So it looks like you would just need to load up one data context for each database that you want to use and then just build up a LINQ query that makes use of both data contexts in one linq statement, like I have up above.
Hopefully this helps you out and gets you the results you are wanting without having to go creating views for each cross context queries that you want to do.
My understanding (I'm no guru on linqtosql) was the same, that it wasn't possible without using a view/sproc.
However, a quick search, found this on MSDN forums with a workaround. Quote from Damien's answer on there:
2.Add one of the tables to the other data context (Copy the DBML over and prefix the name attribute with the name of the database, e.g.
database2.dbo.MyTable)

Categories