Left outer join not being honoured - c#

We have the following EF code:
var qry =
from c in db.Contacts
join comp in db.Companies on c.CompanyId equals comp.CompanyId
into compLeft
from cj in compLeft.DefaultIfEmpty()
select new CompleteUserDlModel
{
CompanyName = cj.Company1,
CompanyId = c.CompanyId
};
which generates this SQL
SELECT
[Extent1].[CompanyId] AS [CompanyId],
[Extent2].[Company] AS [Company]
FROM [dbo].[Contacts] AS [Extent1]
INNER JOIN [dbo].[Company] AS [Extent2] ON [Extent1].[CompanyId] = [Extent2].[CompanyId]
but we actually want
SELECT
[Extent1].[CompanyId] AS [CompanyId],
[Extent2].[Company] AS [Company]
FROM [dbo].[Contacts] AS [Extent1]
LEFT OUTER JOIN [dbo].[Company] AS [Extent2] ON [Extent1].[CompanyId] = [Extent2].[CompanyId]
Could someone point out what we've done wrong, please?
All the refs on left outer joins in C# EF (i.e. LEFT OUTER JOIN in LINQ) point to the syntax we're using. Clearly, we're missing something.

Could someone point out what we've done wrong, please?
Probably you have Contact.CompanyId a typed as int instead of int?, making it a required property, and so EF assumes you have referential integrity when generating a query.
But, as always, left join in LINQ has bad code smell, and can almost always be replaced by just querying your target entity and traversing its Navigation Properties. EG:
from c in db.Contacts
select new
{
CompanyName = c.CompanyId.HasValue?c.Company.CompanyName : null,
CompanyId = c.CompanyId
};

var qry =
from c in db.Contacts
join comp in db.Companies on c.CompanyId equals comp.CompanyId
into compLeft
from cj in compLeft.DefaultIfEmpty()
select new CompleteUserDlModel
{
CompanyName = cj.Company1,
CompanyId = c?.CompanyId ?? String.Empty
};

Related

Creating Linq from SQL with OrderBy and GroupBy

I have the following table structure.
TableA TableB TableC
- MID - PID - PID
- NAME - INIT_DATE - MID
This is the SQL Query that I need to translate into Linq
SELECT TOP 10 TableA.NAME,
COUNT(TableB.INIT_DATE) AS [TOTALCOUNT]
FROM TableC
INNER JOIN TableA ON TableC.MID = TableA.MID
LEFT OUTER JOIN TableB ON TableC.PID = TableB.PID
GROUP BY TableA.NAME
ORDER BY [TOTALCOUNT] DESC
I tried to reproduce the above query with this Linq query:
iqModel = (from tableC in DB.TableC
join tableA in DB.TableA on tableC.MID equals tableA.MID
select new { tableC, tableA } into TM
join tableB in DB.TableB on TM.tableC.PID equals J.PID into TJ
from D in TJ.DefaultIfEmpty()
select new { TM, D } into MD
group MD by MD.TM.tableA.NAME into results
let TOTALCOUNT = results.Select(item=>item.D.INIT_DATE).Count()
orderby TOTALCOUNT descending
select new SelectListItem
{
Text = results.Key.ToString(),
Value = TOTALCOUNT.ToString()
}).Take(10);
But I think I am doing something wrong.
The Output of the LINQ and SQL is not same. I think up to JOIN or GROUPBY it is Correct.
EDIT :-
I have also tried the following Linq query but still it's not working correctly.
var iqModel = (from c in DB.TableC
join a in DB.TableA on c.MID equals a.MID
join b in DB.b on c.PID equals b.PID into b_join
from b in b_join.DefaultIfEmpty()
select new SelectListItem { Text = a.NAME, Value = b.INIT_DATE != null ? b.INIT_DATE.ToString() : string.Empty });
var igModel = iqModel.GroupBy(item => item.Text);
var result = igModel.OrderByDescending(item => item.Select(r => r.Value).Count());
I want to understand what am I doing wrong and how can it be fixed.
I am newbie to LINQ to SQL I think in above LINQ I really made it complicated by adding more select.
I think the difference is caused by the fact that the SQL COUNT(field) function does not include NULL values. There is no direct equivalent construct in LINQ, but it could be simulated with Count(e => e.Field != null) or like this (which seems to produce better SQL):
var query =
(from a in db.TableA
join c in db.TableC on a.MID equals c.MID
join b in db.TableB on c.PID equals b.PID into joinB
from b in joinB.DefaultIfEmpty()
group b by a.Name into g
let TOTALCOUNT = g.Sum(e => e.INIT_DATE != null ? 1 : 0)
orderby TOTALCOUNT descending
select new SelectListItem { Text = g.Key, Value = TOTALCOUNT }
).Take(10);
which generates the following SQL
SELECT TOP (10)
[Project1].[C2] AS [C1],
[Project1].[Name] AS [Name],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Name],
1 AS [C2]
FROM ( SELECT
[Join2].[K1] AS [K1],
SUM([Join2].[A1]) AS [A1]
FROM ( SELECT
[Extent1].[Name] AS [K1],
CASE WHEN ([Extent3].[INIT_DATE] IS NOT NULL) THEN 1 ELSE 0 END AS [A1]
FROM [dbo].[TableAs] AS [Extent1]
INNER JOIN [dbo].[TableCs] AS [Extent2] ON [Extent1].[MID] = [Extent2].[MID]
LEFT OUTER JOIN [dbo].[TableBs] AS [Extent3] ON [Extent2].[PID] = [Extent3].[PID]
) AS [Join2]
GROUP BY [K1]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
I assume, that you not see "group by" command at resulting query, instead of it "distinct" command is used. Am I right?
First query makes distinct by TableA.NAME and then calculates COUNT(TableB.INIT_DATE) with the help of subquery like this:
select distinct1.Name, (select count() from *join query* where Name = distinct1.Name)
from (select distinct Name from *join query*) as distinct1
If so, not worry about it. Because conversion from linq to real t-sql script sometimes very unpredictable (you can not force them to be equal, only when query is very simple), but both queries are equivalent one to another and return same results (compare them to make sure).

EntityFramework generating poor tsql (linqpad generates much better)

I have a database structure as below
Family(1) ----- (*) FamilyPersons -----(1)Person(1)------() Expenses (1) -----(0..1)GroceriesDetails
Let me explain that relation, Family can have one or more than one person , we have a mapping table FamilyPersons between Family and Persons. Now each person can enter his expenses which go into Expenses Table. Expense Table has a column ExpenseType (groceries, entertainemnet etc)
and details of each of these expenses goes into their own Tables, so we have a GroceriesDetails table (similarly we have other tables), so we have 1 to 0..1 relation between Expense and Groceries.
Now I am writing a query to get Complete GroceriesDetails for a family
GroceriesDetails.Where (g => g.Expenses.Person.FamilyPersons.Any(fp =>
fp.FamilyId == 1) && g.Expenses.ExpenseType == "GC" )
For this the sql generated by EF is
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Amount] AS [Amount]
FROM [dbo].[GroceriesDetails] AS [Extent1]
INNER JOIN (SELECT [Extent3].[Id] AS [Id1]
FROM [dbo].[Expenses] AS [Extent2]
INNER JOIN [dbo].[GroceriesDetails] AS [Extent3] ON [Extent2].[Id] = [Extent3].[Id]
WHERE N'GC' = [Extent2].[ExpenseType] ) AS [Filter1] ON [Extent1].[Id] = [Filter1].[Id1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[Expenses] AS [Extent4]
INNER JOIN [dbo].[GroceriesDetails] AS [Extent5] ON [Extent4].[Id] = [Extent5].[Id]
INNER JOIN [dbo].[FamilyPerson] AS [Extent6] ON [Extent4].[PersonId] = [Extent6].[PersonId]
WHERE ([Extent1].[Id] = [Extent5].[Id]) AND (1 = [Extent6].[FamilyId])
)
In this query there is a full table join between Expenses and GroceriesDetails tables which is causing performance issues.
Whereas Linqpad generates a much better SQL
SELECT [t0].[Id], [t0].[Amount]
FROM [GroceriesDetails] AS [t0]
INNER JOIN [Expenses] AS [t1] ON [t1].[Id] = [t0].[Id]
WHERE (EXISTS(
SELECT NULL AS [EMPTY]
FROM [Expenses] AS [t2]
INNER JOIN [Person] AS [t3] ON [t3].[Id] = [t2].[PersonId]
CROSS JOIN [FamilyPerson] AS [t4]
WHERE ([t4].[FamilyId] = #p0) AND ([t2].[Id] = [t0].[Id]) AND ([t4].[PersonId] =
[t3].[Id])
)) AND ([t1].[ExpenseType] = #p1)
Please note that we are using WCF data services so this query is written against a WCF data service reference, so I can't traverse from top (family) to bottom (Groceries) as OData allows only one level of select.
Any help on optimizing this code is appreciated.
From the comments I learned that LinqPad uses Linq2SQL while the app uses EF, and that explains the difference.
The thing is that you have zero control on how EF generates SQL.
The only thing you can do is to rewrite your LINQ query to make it "closer" to desired SQL.
For example, instead of
GroceriesDetails.Where (g => g.Expenses.Person.FamilyPersons.Any(fp => fp.FamilyId == 1)
&& g.Expenses.ExpenseType == "GC" )
you can try to write something like (pseudocode):
from g in GrosseriesDetails
join e in Expenses on g.Id = e.GrosseryId
join p in Persons on p.Id = e.PersonId
join f in FamilyPersons on f.PersonId = p.Id
where f.FamilyId == 1 && e.ExpenseType == "GC"
It almost always helps as it tells an ORM a straightforward way to transform it into SQL. The idea is that the expression tree in the "original" case is more complex compare to the proposed scenario, and by simplifying the expression tree we make translator's job easier and more straightforward.
But besides manipulating the LINQ there is no control over how it generates SQL from the expression tree.

Entity Framework Query for inner join

What would be the query for:
select s.* from Service s
inner join ServiceAssignment sa on sa.ServiceId = s.Id
where sa.LocationId = 1
in entity framework?
This is what I wrote:
var serv = (from s in db.Services
join sl in Location on s.id equals sl.id
where sl.id = s.id
select s).ToList();
but it's wrong. Can some one guide me to the path?
from s in db.Services
join sa in db.ServiceAssignments on s.Id equals sa.ServiceId
where sa.LocationId == 1
select s
Where db is your DbContext. Generated query will look like (sample for EF6):
SELECT [Extent1].[Id] AS [Id]
-- other fields from Services table
FROM [dbo].[Services] AS [Extent1]
INNER JOIN [dbo].[ServiceAssignments] AS [Extent2]
ON [Extent1].[Id] = [Extent2].[ServiceId]
WHERE [Extent2].[LocationId] = 1
In case anyone's interested in the Method syntax, if you have a navigation property, it's way easy:
db.Services.Where(s=>s.ServiceAssignment.LocationId == 1);
If you don't, unless there's some Join() override I'm unaware of, I think it looks pretty gnarly (and I'm a Method syntax purist):
db.Services.Join(db.ServiceAssignments,
s => s.Id,
sa => sa.ServiceId,
(s, sa) => new {service = s, asgnmt = sa})
.Where(ssa => ssa.asgnmt.LocationId == 1)
.Select(ssa => ssa.service);
You could use a navigation property if its available. It produces an inner join in the SQL.
from s in db.Services
where s.ServiceAssignment.LocationId == 1
select s

Optimising LINQ-to-SQL queries

I have a very heavy LINQ-to-SQL query, which does a number of joins onto different tables to return an anonymous type. The problem is, if the amount of rows returned is fairly large (> 200), then the query becomes awfully slow and ends up timing out. I know I can increase the data context timeout setting, but that's a last resort.
I'm just wondering if my query would work better if I split it up, and do my comparisons as LINQ-to-Objects queries so I can possibly even use PLINQ to maximise the the processing power. But I'm that's a foreign concept to me, and I can't get my head around on how I would split it up. Can anyone offer any advice? I'm not asking for the code to be written for me, just some general guidance on how I could improve this would be great.
Note I've ensured the database has all the correct keys that I'm joining on, and I've ensured these keys are up to date.
The query is below:
var cons = (from c in dc.Consignments
join p in dc.PODs on c.IntConNo equals p.Consignment into pg
join d in dc.Depots on c.DeliveryDepot equals d.Letter
join sl in dc.Accounts on c.Customer equals sl.LegacyID
join ss in dc.Accounts on sl.InvoiceAccount equals ss.LegacyID
join su in dc.Accounts on c.Subcontractor equals su.Name into sug
join sub in dc.Accountsubbies on ss.ID equals sub.AccountID into subg
where (sug.FirstOrDefault() == null
|| sug.FirstOrDefault().Customer == false)
select new
{
ID = c.ID,
IntConNo = c.IntConNo,
LegacyID = c.LegacyID,
PODs = pg.DefaultIfEmpty(),
TripNumber = c.TripNumber,
DropSequence = c.DropSequence,
TripDate = c.TripDate,
Depot = d.Name,
CustomerName = c.Customer,
CustomerReference = c.CustomerReference,
DeliveryName = c.DeliveryName,
DeliveryTown = c.DeliveryTown,
DeliveryPostcode = c.DeliveryPostcode,
VehicleText = c.VehicleReg + c.Subcontractor,
SubbieID = sug.DefaultIfEmpty().FirstOrDefault().ID.ToString(),
SubbieList = subg.DefaultIfEmpty(),
ScanType = ss.PODScanning == null ? 0 : ss.PODScanning
});
Here's the generated SQL as requested:
{SELECT [t0].[ID], [t0].[IntConNo], [t0].[LegacyID], [t6].[test], [t6].[ID] AS [ID2], [t6].[Consignment], [t6].[Status], [t6].[NTConsignment], [t6].[CustomerRef], [t6].[Timestamp], [t6].[SignedBy], [t6].[Clause], [t6].[BarcodeNumber], [t6].[MainRef], [t6].[Notes], [t6].[ConsignmentRef], [t6].[PODedBy], (
SELECT COUNT(*)
FROM (
SELECT NULL AS [EMPTY]
) AS [t10]
LEFT OUTER JOIN (
SELECT NULL AS [EMPTY]
FROM [dbo].[PODs] AS [t11]
WHERE [t0].[IntConNo] = [t11].[Consignment]
) AS [t12] ON 1=1
) AS [value], [t0].[TripNumber], [t0].[DropSequence], [t0].[TripDate], [t1].[Name] AS [Depot], [t0].[Customer] AS [CustomerName], [t0].[CustomerReference], [t0].[DeliveryName], [t0].[DeliveryTown], [t0].[DeliveryPostcode], [t0].[VehicleReg] + [t0].[Subcontractor] AS [VehicleText], CONVERT(NVarChar,(
SELECT [t16].[ID]
FROM (
SELECT TOP (1) [t15].[ID]
FROM (
SELECT NULL AS [EMPTY]
) AS [t13]
LEFT OUTER JOIN (
SELECT [t14].[ID]
FROM [dbo].[Account] AS [t14]
WHERE [t0].[Subcontractor] = [t14].[Name]
) AS [t15] ON 1=1
ORDER BY [t15].[ID]
) AS [t16]
)) AS [SubbieID],
(CASE
WHEN [t3].[PODScanning] IS NULL THEN #p0
ELSE [t3].[PODScanning]
END) AS [ScanType], [t3].[ID] AS [ID3]
FROM [dbo].[Consignments] AS [t0]
INNER JOIN [dbo].[Depots] AS [t1] ON [t0].[DeliveryDepot] = [t1].[Letter]
INNER JOIN [dbo].[Account] AS [t2] ON [t0].[Customer] = [t2].[LegacyID]
INNER JOIN [dbo].[Account] AS [t3] ON [t2].[InvoiceAccount] = [t3].[LegacyID]
LEFT OUTER JOIN ((
SELECT NULL AS [EMPTY]
) AS [t4]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t5].[ID], [t5].[Consignment], [t5].[Status], [t5].[NTConsignment], [t5].[CustomerRef], [t5].[Timestamp], [t5].[SignedBy], [t5].[Clause], [t5].[BarcodeNumber], [t5].[MainRef], [t5].[Notes], [t5].[ConsignmentRef], [t5].[PODedBy]
FROM [dbo].[PODs] AS [t5]
) AS [t6] ON 1=1 ) ON [t0].[IntConNo] = [t6].[Consignment]
WHERE ((NOT (EXISTS(
SELECT TOP (1) NULL AS [EMPTY]
FROM [dbo].[Account] AS [t7]
WHERE [t0].[Subcontractor] = [t7].[Name]
ORDER BY [t7].[ID]
))) OR (NOT (((
SELECT [t9].[Customer]
FROM (
SELECT TOP (1) [t8].[Customer]
FROM [dbo].[Account] AS [t8]
WHERE [t0].[Subcontractor] = [t8].[Name]
ORDER BY [t8].[ID]
) AS [t9]
)) = 1))) AND ([t2].[Customer] = 1) AND ([t3].[Customer] = 1)
ORDER BY [t0].[ID], [t1].[ID], [t2].[ID], [t3].[ID], [t6].[ID]
}
Try moving the subcontractor join up higher and push the where clause along with it. That way you're not unnecessarily making joins which would fail at the end.
I would also modify the select for the subcontractor id, so you don't get the Id of a potentially null value.
var cons = (from c in dc.Consignments
join su in dc.Accounts on c.Subcontractor equals su.Name into sug
where (sug.FirstOrDefault() == null || sug.FirstOrDefault().Customer == false)
join p in dc.PODs on c.IntConNo equals p.Consignment into pg
join d in dc.Depots on c.DeliveryDepot equals d.Letter
join sl in dc.Accounts on c.Customer equals sl.LegacyID
join ss in dc.Accounts on sl.InvoiceAccount equals ss.LegacyID
join sub in dc.Accountsubbies on ss.ID equals sub.AccountID into subg
let firstSubContractor = sug.DefaultIfEmpty().FirstOrDefault()
select new
{
ID = c.ID,
IntConNo = c.IntConNo,
LegacyID = c.LegacyID,
PODs = pg.DefaultIfEmpty(),
TripNumber = c.TripNumber,
DropSequence = c.DropSequence,
TripDate = c.TripDate,
Depot = d.Name,
CustomerName = c.Customer,
CustomerReference = c.CustomerReference,
DeliveryName = c.DeliveryName,
DeliveryTown = c.DeliveryTown,
DeliveryPostcode = c.DeliveryPostcode,
VehicleText = c.VehicleReg + c.Subcontractor,
SubbieID = firstSubContractor == null ? "" : firstSubContractor.ID.ToString(),
SubbieList = subg.DefaultIfEmpty(),
ScanType = ss.PODScanning == null ? 0 : ss.PODScanning
});

Linq Combine Left Join Data

Say I have the following database:
Users
-------
UserId (PK)
UserName
Roles
-----
RoleId (PK)
RoleName
UserRoles
---------
UserId (PK)
RoleId (PK)
Users 1-M UserRoles M-1 Roles
Using LinqToSQL, I want to return the following set:
[User1], [Role1, Role2, Role3]
[User2], [Role2, Role3]
[User3], []
Etc...
What is the most efficient way to create this LinqToSql Query?
In addition, if I want to create a filter to return only users that have Role1, what would that entail?
Thx.
Define "efficient". But otherwise...
from u in dataContext.Users
select new { User = u, Roles = u.UserRoles.Select(ur => ur.Role) }
And filtering users by RoleID:
from u in dataContext.Users
where u.UserRoles.Any(ur => ur.RoleID == 1)
select u
Or by some other Role attribute, say, Name:
from u in dataContext.Users
where u.UserRoles.Any(ur => ur.Role.Name == "Role 1")
select u
Combining it all together:
from u in dataContext.Users
select new
{
User = u,
Roles = from ur in u.UserRoles
where ur.RoleID == 1 || ur.Role.Name == "Role 1"
select ur.Role
}
This is the single query that I would construct to get your desired result set all at once
from u in Users
join ur in UserRoles on u.UserId equals ur.UserId
join r in Roles on ur.RoleId equals r.RoleId
group r by u into grouping
select grouping
It produces the following SQL:
SELECT [t0].[UserId], [t0].[Username]
FROM [Users] AS [t0]
INNER JOIN [UserRoles] AS [t1] ON [t0].[UserId] = [t1].[UserId]
INNER JOIN [Roles] AS [t2] ON [t1].[RoleId] = [t2].[RoleId]
GROUP BY [t0].[UserID], [t0].[Username]
GO
-- Region Parameters
DECLARE #x1 Int = 2
-- EndRegion
SELECT [t2].[RoleId], [t2].[RoleName]
FROM [Users] AS [t0]
INNER JOIN [UserRoles] AS [t1] ON [t0].[UserId] = [t1].[UserId]
INNER JOIN [Roles] AS [t2] ON [t1].[RoleId] = [t2].[RoleId]
WHERE #x1 = [t0].[UserId]
#Pavel's looks like it produces a better SQL statement:
SELECT [t0].[UserId], [t0].[Username], [t2].[RoleId], [t2].[RoleName] (
SELECT COUNT(*)
FROM [UserRoles] AS [t3]
INNER JOIN [Roles] AS [t4] ON [t4].[RoleId] = [t3].[RoleId]
WHERE [t3].[UserId] = [t0].[UserId]
) AS [value]
FROM [Users] AS [t0]
LEFT OUTER JOIN ([UserRoles] AS [t1]
INNER JOIN [Roles] AS [t2] ON [t2].[RoleId] = [t1].[RoleId]) ON [t1].[UserId] = [t0].[UserId]
ORDER BY [t0].[UserId], [t1].[UserRoleId], [t2].[RoleId]
In terms of efficient, testing is going to be the best way to figure out what most performant approach is for your situation.

Categories