Entity Framework: Include and Explicitly defined foreign keys [duplicate] - c#

If I use a join, the Include() method no longer works, e.g.:
from e in dc.Entities.Include("Properties")
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e
e.Properties is not loaded
Without the join, the Include() works
Lee

UPDATE: I recently added another tip that covers this and provides an alternative, probably better, solution. The idea is to delay the use of Include() until the end of the query; see Tip 22 - How to make include really include for more information.
There is a known limitation in the Entity Framework when using Include().
Certain operations are just not supported with Include.
It looks like you have run into one of those limitations. To work around it, try something like this:
var results =
from e in dc.Entities //Notice no include
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select new {Entity = e, Properties = e.Properties};
This will bring back the Properties, and if the relationship between entity and Properties is a one to many (but not a many to many) you will find that each resulting anonymous type has the same values in:
anonType.Entity.Properties
anonType.Properties
This is a side-effect of a feature in the Entity Framework called relationship fixup.
See Tip 1 in my EF Tips series for more information.

Try this:
var query = (ObjectQuery<Entities>)(from e in dc.Entities
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e);
return query.Include("Properties");

So what is the name of the navigation property on "Entity" which relates to "Item.Member" (i.e., the other end of the navigation)? You should be using this instead of the join. For example, if "Entity" had a property called Member with a cardinality of 1, and Member had a property called Items with a cardinality of many, you could do this:
from e in dc.Entities.Include("Properties")
where e.Member.Items.Any(i => i.Collection.ID == collectionID)
select e
I'm guessing at the properties of your model here, but this should give you the general idea. In most cases, using join in LINQ to Entities is wrong, because it suggests that either your navigational properties are not set up correctly, or you are not using them.

So, I realise I am late to the party here, however I thought I'd add my findings. This should really be a comment on Alex James's post, but as I don't have the reputation it'll have to go here.
So my answer is: it doesn't seem to work at all as you would intend. Alex James gives two interesting solutions, however if you try them and check the SQL, it's horrible.
The example I was working on is:
var theRelease = from release in context.Releases
where release.Name == "Hello World"
select release;
var allProductionVersions = from prodVer in context.ProductionVersions
where prodVer.Status == 1
select prodVer;
var combined = (from release in theRelease
join p in allProductionVersions on release.Id equals p.ReleaseID
select release).Include(release => release.ProductionVersions);
var allProductionsForChosenRelease = combined.ToList();
This follows the simpler of the two examples. Without the Include it produces this perfectly respectable SQL:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name]
FROM [dbo].[Releases] AS [Extent1]
INNER JOIN [dbo].[ProductionVersions] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ReleaseID]
WHERE ('Hello World' = [Extent1].[Name]) AND (1 = [Extent2].[Status])
But with, OMG:
SELECT
[Project1].[Id1] AS [Id],
[Project1].[Id] AS [Id1],
[Project1].[Name] AS [Name],
[Project1].[C1] AS [C1],
[Project1].[Id2] AS [Id2],
[Project1].[Status] AS [Status],
[Project1].[ReleaseID] AS [ReleaseID]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent2].[Id] AS [Id1],
[Extent3].[Id] AS [Id2],
[Extent3].[Status] AS [Status],
[Extent3].[ReleaseID] AS [ReleaseID],
CASE WHEN ([Extent3].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
FROM [dbo].[Releases] AS [Extent1]
INNER JOIN [dbo].[ProductionVersions] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ReleaseID]
LEFT OUTER JOIN [dbo].[ProductionVersions] AS [Extent3] ON [Extent1].[Id] = [Extent3].[ReleaseID]
WHERE ('Hello World' = [Extent1].[Name]) AND (1 = [Extent2].[Status])
) AS [Project1]
ORDER BY [Project1].[Id1] ASC, [Project1].[Id] ASC, [Project1].[C1] ASC
Total garbage. The key point to note here is the fact that it returns the outer joined version of the table which has not been limited by status=1.
This results in the WRONG data being returned:
Id  Id1  Name         C1  Id2  Status  ReleaseID
2   1    Hello World  1   1    2       1
2   1    Hello World  1   2    1       1
Note that the status of 2 is being returned there, despite our restriction. It simply does not work.
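The effect can be reproduced without a database. The following is a minimal sketch, not EF itself: it uses LINQ to Objects over made-up in-memory rows (chosen to match the two result rows above) and mimics the shape of the generated SQL, where the inner join carries the Status filter and a second, unfiltered outer join loads the collection. The variable names and the use of tuples are assumptions for illustration, and it assumes a compiler that supports top-level statements (.NET 6 or later).

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for Releases and ProductionVersions.
var releases = new[] { (Id: 1, Name: "Hello World") };
var productionVersions = new[]
{
    (Id: 1, Status: 2, ReleaseID: 1),
    (Id: 2, Status: 1, ReleaseID: 1),
};

// Same shape as the generated SQL: the inner join (Extent2) applies the
// Status filter, while the group join (Extent3) loads every child row.
var rows =
    from r in releases
    where r.Name == "Hello World"
    join matched in productionVersions on r.Id equals matched.ReleaseID
    where matched.Status == 1
    join loaded in productionVersions on r.Id equals loaded.ReleaseID into children
    from loaded in children.DefaultIfEmpty()
    select (ReleaseId: r.Id, VersionId: loaded.Id, loaded.Status);

var result = rows.ToList();

// The Status filter only decided which releases qualify; the loaded
// children are unfiltered, so the Status = 2 row comes back as well.
foreach (var row in result)
    Console.WriteLine($"{row.ReleaseId} {row.VersionId} {row.Status}");
```

In other words, the filter on the joined table restricts which parents are returned, and the Include then loads all children of those parents.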
If I have gone wrong somewhere, I would be delighted to find out, as this is making a mockery of Linq. I love the idea, but the execution doesn't seem to be usable at the moment.
Out of curiosity, I tried the LinqToSQL dbml rather than the LinqToEntities edmx that produced the mess above:
SELECT [t0].[Id], [t0].[Name], [t2].[Id] AS [Id2], [t2].[Status], [t2].[ReleaseID], (
SELECT COUNT(*)
FROM [dbo].[ProductionVersions] AS [t3]
WHERE [t3].[ReleaseID] = [t0].[Id]
) AS [value]
FROM [dbo].[Releases] AS [t0]
INNER JOIN [dbo].[ProductionVersions] AS [t1] ON [t0].[Id] = [t1].[ReleaseID]
LEFT OUTER JOIN [dbo].[ProductionVersions] AS [t2] ON [t2].[ReleaseID] = [t0].[Id]
WHERE ([t0].[Name] = @p0) AND ([t1].[Status] = @p1)
ORDER BY [t0].[Id], [t1].[Id], [t2].[Id]
Slightly more compact - weird count clause, but overall same total FAIL.
Has anybody actually ever used this stuff in a real business application? I'm really starting to wonder...
Please tell me I've missed something obvious, as I really want to like Linq!

Try the more verbose way, which obtains more or less the same results but with more data calls:
var mydata = from e in dc.Entities
join i in dc.Items
on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e;
foreach (Entity ent in mydata) {
if(!ent.Properties.IsLoaded) { ent.Properties.Load(); }
}
Do you still get the same (unexpected) result?
EDIT: Changed the first sentence, as it was incorrect. Thanks for the pointer comment!

Related

LINQ to SQL generates a different query for similar group by expressions

I noticed that when using GroupBy in Linq to SQL, there's a difference in the result query when providing a reference Id as the Key versus using the actual navigation property as the Key.
Example 1:
Employees.GroupBy(x => x.CompanyId).Select(g => g.Count())
Result SQL:
SELECT COUNT(*) AS [value]
FROM [Employees] AS [t0]
GROUP BY [t0].[CompanyId]
Example 2:
Employees.GroupBy(x => x.Company).Select(g => g.Count())
Result SQL:
SELECT [t1].[value]
FROM (
SELECT COUNT(*) AS [value], [t0].[DivisionDeductionID]
FROM [CheckDeductions] AS [t0]
GROUP BY [t0].[DivisionDeductionID]
) AS [t1]
LEFT OUTER JOIN [DivisionDeductions] AS [t2] ON [t2].[DivisionDeductionID] = [t1].[DivisionDeductionID]
Looking at Example #2, it is obvious that [t2] is never used other than in the LEFT JOIN itself. Why doesn't LINQ to SQL detect that and just use the same query as in Example #1? It groups by the ID field anyway.
This looks like EF's SQL generator has missed an opportunity to optimize the query: indeed, since [t2] is not used outside the outer join, it could be thrown away, along with a nested select.
It appears that the EF writers added a join for [t2] because they did not want to differentiate between (1) a situation where a navigation property is used only for its PK (so the corresponding FK could be used in its place) and (2) a situation where the query actually pulls additional fields from it.
This practice is completely justified, given that the RDBMS optimizes out the unnecessary join anyway.

How can I call Include in Entity Framework after filter query

I'm trying to call Include to load related entities in EF after the WHERE filter has been applied. How can I do this in EF? In SQL it looks like this:
SELECT * FROM T1 INNER JOIN ( SELECT * FROM T WHERE ... (filtering data)) as T2 ON T1.A = T2.A (loading data to filtered data)
If I write db.Include(..).Where(..) or db.Where(..).Include(..), SQL Server Profiler shows the following query:
SELECT ...
FROM T1 AS [Extent1]
INNER JOIN T2 AS [Extent2] ON [Extent1].[A] = [Extent2].[A]
LEFT OUTER JOIN T2 AS [Extent3] ON [Extent1].[A] = [Extent3].[A]
WHERE N'B1' = [Extent1].[B]
But here the join is performed first and the filtering afterwards.
Thanks in advance
AFAIK, unfortunately, this feature (filtering on Include) is not yet supported in Entity Framework.
The nearest you can get is to perform separate queries.
You can check out Larislav's answer for more information.
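As a rough sketch of the "separate queries" idea: filter the parent rows first, then load only the children that belong to them, and stitch the two result sets together in memory. The example below runs against made-up in-memory collections named after T1 and T2 from the question (the tuple shapes and column values are assumptions); against a real context the two Where calls would each become a database query, assuming the provider can translate Contains. It assumes a compiler that supports top-level statements (.NET 6 or later).

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for T1 (parents) and T2 (children).
var t1 = new[] { (A: 1, B: "B1"), (A: 2, B: "B2") };
var t2 = new[] { (Id: 10, A: 1), (Id: 11, A: 1), (Id: 12, A: 2) };

// Query 1: filter the parent rows first.
var parents = t1.Where(p => p.B == "B1").ToList();

// Query 2: load only the children that belong to the filtered parents.
var parentKeys = parents.Select(p => p.A).ToList();
var children = t2.Where(c => parentKeys.Contains(c.A)).ToList();

// Stitch the two result sets together in memory.
var childrenByParent = children.ToLookup(c => c.A);
foreach (var p in parents)
{
    var ids = string.Join(", ", childrenByParent[p.A].Select(c => c.Id));
    Console.WriteLine($"{p.A}: {ids}");
}
```

The second query can carry its own additional filter on the children, which is exactly what Include does not allow.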

Entity Framework generates inefficient select when using Find()

I am noticing that the Entity Framework is generating some inefficient queries when using the Find() method. For example, here is my C# code.
Model model = unit.Repository.DbSet.Find(model.ID);
Generated Find() query
DECLARE @p0 int = 1
SELECT
[Limit1].[ID] AS [ID],
[Limit1].[UserID] AS [UserID],
[Limit1].[Started] AS [Started],
[Limit1].[Updated] AS [Updated],
[Limit1].[Completed] AS [Completed]
FROM ( SELECT TOP (2)
[Extent1].[ID] AS [ID],
[Extent1].[UserID] AS [UserID],
[Extent1].[Started] AS [Started],
[Extent1].[Updated] AS [Updated],
[Extent1].[Completed] AS [Completed]
FROM [dbo].[Table] AS [Extent1]
WHERE [Extent1].[ID] = @p0
) AS [Limit1]
It seems to be running a whole other select query which is unnecessary. Here is the output using the SingleOrDefault() method.
Generated SingleOrDefault() query
DECLARE @p__linq__0 int = 1
SELECT TOP (2)
[Extent1].[ID] AS [ID],
[Extent1].[UserID] AS [UserID],
[Extent1].[Started] AS [Started],
[Extent1].[Updated] AS [Updated],
[Extent1].[Completed] AS [Completed]
FROM [dbo].[Table] AS [Extent1]
WHERE [Extent1].[ID] = @p__linq__0
Is there a reason why Find() is generating two selects? Should the Find() method be avoided in favor of the SingleOrDefault() method?
I doubt there is any performance difference between the two, for sql server at least. It looks like the first one just has an extra wrapper around the select. Running a similar query on a database that I have generates the exact same plan, so I would imagine the outer select gets optimized away in the execution plan.

Why linq-to-sql query is translated into subquery?

Why is this LINQ query:
(from c in Orders
select new
{
Id=c.Id,
DeliveryDate = c.DeliveryDate.Value
}).Take(10)
is translated into
SELECT TOP (10) [t1].[Id], [t1].[value] AS [DeliveryDate]
FROM (
SELECT [t0].[Id], [t0].[DeliveryDate] AS [value]
FROM [Orders] AS [t0]
) AS [t1]
but when I change DeliveryDate = c.DeliveryDate.Value into DeliveryDate = c.DeliveryDate SQL query looks as simple as:
SELECT TOP (10) [t0].[Id], [t0].[DeliveryDate]
FROM [Orders] AS [t0]
I think this is because the LINQ to SQL translator is under-optimized. Use of a "property" (Value) triggers creation of a sub-query, which turns out to be unnecessary.
It is worth noting that any RDBMS worth its salt would generate identical query plans for both SQL queries, so in the end it would not matter either way.
Possibly a bug or a missed optimization. I cannot explain it any other way.

Trouble understanding the SQL generated from this Entity Framework query

I created an Entity Framework model that contains two tables from the Northwind database to test some of its functionality: Products and Categories.
It automatically created an association between Category and Product which is 0..1 to *.
I wrote this simple query:
var beverages = from p in db.Products.Include("Category")
where p.Category.CategoryName == "Beverages"
select p;
var beverageList = beverages.ToList();
I ran SQL Profiler and ran the code so I could see the SQL that it generates, and this is what it generated:
SELECT
[Extent1].[ProductID] AS [ProductID],
[Extent1].[ProductName] AS [ProductName],
[Extent1].[SupplierID] AS [SupplierID],
[Extent1].[QuantityPerUnit] AS [QuantityPerUnit],
[Extent1].[UnitPrice] AS [UnitPrice],
[Extent1].[UnitsInStock] AS [UnitsInStock],
[Extent1].[UnitsOnOrder] AS [UnitsOnOrder],
[Extent1].[ReorderLevel] AS [ReorderLevel],
[Extent1].[Discontinued] AS [Discontinued],
[Extent3].[CategoryID] AS [CategoryID],
[Extent3].[CategoryName] AS [CategoryName],
[Extent3].[Description] AS [Description],
[Extent3].[Picture] AS [Picture]
FROM [dbo].[Products] AS [Extent1]
INNER JOIN [dbo].[Categories] AS [Extent2]
ON [Extent1].[CategoryID] = [Extent2].[CategoryID]
LEFT OUTER JOIN [dbo].[Categories] AS [Extent3]
ON [Extent1].[CategoryID] = [Extent3].[CategoryID]
WHERE N'Beverages' = [Extent2].[CategoryName]
I am curious why the query inner joins to Categories and then left joins to it. The select statement is using the fields from the left joined table. Can someone help me understand the reason for this? If I remove the left join and change the select list to pull from Extent2 I get the same results for this query. In what situation would this not be true?
[Extent3] is the realization of Include("Category"). Include should not affect which rows are selected from the "main" table Products, hence the LEFT JOIN (all records from Products plus any matching records from the right-hand table Categories).
[Extent2] is there to filter the records by the related table Categories on the name "Beverages", so in this case it is a strict restriction (INNER JOIN).
Why two? :) Because the expression is parsed piece by piece, and SQL is generated automatically for each statement (Include, Where).
You'll notice that the query is pulling all columns in the SELECT list from the copy of the Categories table aliased Extent3, but it's checking the CategoryName against the copy aliased Extent2.
In other words, in this scenario EF's query generation is not realizing that you're Include()ing and restricting the query via the same table, so it's blindly using two copies.
Unfortunately, beyond explaining what's going on, my experience with EF is not advanced enough to suggest a solution...
djacobson and igor explain pretty well why this happens. The way I personally use the Entity Framework, I avoid using Include altogether. Depending on what you're planning to do with the data, you could do something like this:
var beverages = from p in db.Products
select new {p, p.Category} into pc
where pc.Category.CategoryName == "Beverages"
select pc;
return beverages.ToList().Select(pc => pc.p);
... which, at least in EF 4.0, will produce just a single inner join. Entity Framework is smart enough to make it so that the product's Category property is populated with the category that came back from the database with it.
Of course, it's very likely that SQL Server optimizes things away so this won't actually gain you anything.
(Not directly an answer to your question if the queries are the same, but the comment field is too restricting for this)
If you leave out the .Include(), doesn't it load it anyway (because of the where)? Generally it makes more sense to me to use projections instead of Include():
var beverages = from p in db.Products
where p.Category.CategoryName == "Beverages"
select new { Product = p, Category = p.Category };
var beverageList = beverages.ToList();
