Entity Framework generates inefficient select when using Find()

Entity Framework generates inefficient select when using Find() - c#

I am noticing that the Entity Framework is generated some inefficient queries when using the Find() method. For example here is my C# code.
Model model = unit.Repository.DbSet.Find(model.ID);
Generate Find() Query
DECLARE #p0 int = 1
SELECT
[Limit1].[ID] AS [ID],
[Limit1].[UserID] AS [UserID],
[Limit1].[Started] AS [Started],
[Limit1].[Updated] AS [Updated],
[Limit1].[Completed] AS [Completed]
FROM ( SELECT TOP (2)
[Extent1].[ID] AS [ID],
[Extent1].[UserID] AS [UserID],
[Extent1].[Started] AS [Started],
[Extent1].[Updated] AS [Updated],
[Extent1].[Completed] AS [Completed]
FROM [dbo].[Table] AS [Extent1]
WHERE [Extent1].[ID] = #p0
) AS [Limit1]
It seems to be running a whole other select query which is unnecessary. Here is the output using the SingleOrDefault() method.
Generate SingleOrDefault() Query
DECLARE #p__linq__0 int = 1
SELECT TOP (2)
[Extent1].[ID] AS [ID],
[Extent1].[UserID] AS [UserID],
[Extent1].[Started] AS [Started],
[Extent1].[Updated] AS [Updated],
[Extent1].[Completed] AS [Completed]
FROM [dbo].[Table] AS [Extent1]
WHERE [Extent1].[ID] = #p__linq__0
Is there a reason why Find() is generating two selects? Should the Find() method be avoided in favor of the SingleOrDefault() method?

I doubt there is any performance difference between the two, for sql server at least. It looks like the first one just has an extra wrapper around the select. Running a similar query on a database that I have generates the exact same plan, so I would imagine the outer select gets optimized away in the execution plan.

Related

Why does System.Data.Linq generates ROW_NUMBER() for Paging instead of OFFSET/FETCH for SQL Server 2012

We are using Linq and Entity Framework to access a SQL Server 2012 database. We are having some performance issue, so after some investigation, we were able to fix some of the problems, but I would like to use SQL query with OFFSET/FETCH instead of ROW_NUMBER() and BETWEEN syntax.
The performance difference is not so big. OFFSET/FETCH is quicker by about 10%. Do you have any idea why the generated query uses ROW_NUMBER() and BETWEEN syntax? What can I do to force Linq to generate OFFSET/FETCH query?
C# code:
var orders = dc.Orders.OrderBy(q => q.LastModifiedTimestamp)
.Skip(q => skipCount)
.Take(q => takeCount)
.ToList();
The currently generated query:
-- Region Parameters
DECLARE #p0 Int = 10
DECLARE #p1 Int = 10
-- EndRegion
SELECT [t2].[OrderId], [t2].[CustomerId]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t1].[OrderId], [t1].[CustomerId]
FROM (
SELECT DISTINCT [t0].[OrderId], [t0].[CustomerId]
FROM [Order] AS [t0]
) AS [t1]
) AS [t2]
WHERE [t2].[ROW_NUMBER] BETWEEN #p0 + 1 AND #p0 + #p1
ORDER BY [t2].[ROW_NUMBER]
The preferred query:
SELECT *
FROM [Order]
ORDER BY LastModifiedTimestamp
OFFSET 10000 ROWS
FETCH NEXT 10000 ROWS ONLY
Do you have any idea why the generated query use ROW_NUMBER() and BETWEEN syntax? What can I do to force Linq to generate OFFSET/FETCH query?

Count(x => ...) vs Where(x => ...).Count()

Is there any difference between these two LINQ-to-Entities queries:
context.Table.Count(x => ...)
and
context.Table.Where(x => ...).Count()
in terms of performance and generated SQL?
I tried to look generated SQL myself, but I only know how to get the SQL from IQueryable, but Count returns the value directly.

I have managed to see the SQL (thanks to #dasblinkenlight), the answer is - no, both LINQ queries generate exactly the same SQL query, at least for a simple query without a grouping:
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
COUNT(1) AS [A1]
FROM [dbo].[Table] AS [Extent1]
WHERE <condition>
) AS [GroupBy1]

Entity Framework: Include and Explicitly defined foreign keys [duplicate]

If I use a join, the Include() method is no longer working, eg:
from e in dc.Entities.Include("Properties")
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e
e.Properties is not loaded
Without the join, the Include() works
Lee

UPDATE: Actually I recently added another Tip that covers this, and provides an alternate probably better solution. The idea is to delay the use of Include() until the end of the query, see this for more information: Tip 22 - How to make include really include
There is known limitation in the Entity Framework when using Include().
Certain operations are just not supported with Include.
Looks like you may have run into one on those limitations, to work around this you should try something like this:
var results =
from e in dc.Entities //Notice no include
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select new {Entity = e, Properties = e.Properties};
This will bring back the Properties, and if the relationship between entity and Properties is a one to many (but not a many to many) you will find that each resulting anonymous type has the same values in:
anonType.Entity.Properties
anonType.Properties
This is a side-effect of a feature in the Entity Framework called relationship fixup.
See this Tip 1 in my EF Tips series for more information.

Try this:
var query = (ObjectQuery<Entities>)(from e in dc.Entities
join i in dc.Items on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e)
return query.Include("Properties")

So what is the name of the navigation property on "Entity" which relates to "Item.Member" (i.e., is the other end of the navigation). You should be using this instead of the join. For example, if "entity" add a property called Member with the cardinality of 1 and Member had a property called Items with a cardinality of many, you could do this:
from e in dc.Entities.Include("Properties")
where e.Member.Items.Any(i => i.Collection.ID == collectionID)
select e
I'm guessing at the properties of your model here, but this should give you the general idea. In most cases, using join in LINQ to Entities is wrong, because it suggests that either your navigational properties are not set up correctly, or you are not using them.

So, I realise I am late to the party here, however I thought I'd add my findings. This should really be a comment on Alex James's post, but as I don't have the reputation it'll have to go here.
So my answer is: it doesn't seem to work at all as you would intend. Alex James gives two interesting solutions, however if you try them and check the SQL, it's horrible.
The example I was working on is:
var theRelease = from release in context.Releases
where release.Name == "Hello World"
select release;
var allProductionVersions = from prodVer in context.ProductionVersions
where prodVer.Status == 1
select prodVer;
var combined = (from release in theRelease
join p in allProductionVersions on release.Id equals p.ReleaseID
select release).Include(release => release.ProductionVersions);
var allProductionsForChosenRelease = combined.ToList();
This follows the simpler of the two examples. Without the include it produces the perfectly respectable sql:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name]
FROM [dbo].[Releases] AS [Extent1]
INNER JOIN [dbo].[ProductionVersions] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ReleaseID]
WHERE ('Hello World' = [Extent1].[Name]) AND (1 = [Extent2].[Status])
But with, OMG:
SELECT
[Project1].[Id1] AS [Id],
[Project1].[Id] AS [Id1],
[Project1].[Name] AS [Name],
[Project1].[C1] AS [C1],
[Project1].[Id2] AS [Id2],
[Project1].[Status] AS [Status],
[Project1].[ReleaseID] AS [ReleaseID]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent2].[Id] AS [Id1],
[Extent3].[Id] AS [Id2],
[Extent3].[Status] AS [Status],
[Extent3].[ReleaseID] AS [ReleaseID],
CASE WHEN ([Extent3].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
FROM [dbo].[Releases] AS [Extent1]
INNER JOIN [dbo].[ProductionVersions] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ReleaseID]
LEFT OUTER JOIN [dbo].[ProductionVersions] AS [Extent3] ON [Extent1].[Id] = [Extent3].[ReleaseID]
WHERE ('Hello World' = [Extent1].[Name]) AND (1 = [Extent2].[Status])
) AS [Project1]
ORDER BY [Project1].[Id1] ASC, [Project1].[Id] ASC, [Project1].[C1] ASC
Total garbage. The key point to note here is the fact that it returns the outer joined version of the table which has not been limited by status=1.
This results in the WRONG data being returned:
Id Id1 Name C1 Id2 Status ReleaseID
2 1 Hello World 1 1 2 1
2 1 Hello World 1 2 1 1
Note that the status of 2 is being returned there, despite our restriction. It simply does not work.
If I have gone wrong somewhere, I would be delighted to find out, as this is making a mockery of Linq. I love the idea, but the execution doesn't seem to be usable at the moment.
Out of curiosity, I tried the LinqToSQL dbml rather than the LinqToEntities edmx that produced the mess above:
SELECT [t0].[Id], [t0].[Name], [t2].[Id] AS [Id2], [t2].[Status], [t2].[ReleaseID], (
SELECT COUNT(*)
FROM [dbo].[ProductionVersions] AS [t3]
WHERE [t3].[ReleaseID] = [t0].[Id]
) AS [value]
FROM [dbo].[Releases] AS [t0]
INNER JOIN [dbo].[ProductionVersions] AS [t1] ON [t0].[Id] = [t1].[ReleaseID]
LEFT OUTER JOIN [dbo].[ProductionVersions] AS [t2] ON [t2].[ReleaseID] = [t0].[Id]
WHERE ([t0].[Name] = #p0) AND ([t1].[Status] = #p1)
ORDER BY [t0].[Id], [t1].[Id], [t2].[Id]
Slightly more compact - weird count clause, but overall same total FAIL.
Has anybody actually ever used this stuff in a real business application? I'm really starting to wonder...
Please tell me I've missed something obvious, as I really want to like Linq!

Try the more verbose way to do more or less the same thing obtain the same results, but with more datacalls:
var mydata = from e in dc.Entities
join i in dc.Items
on e.ID equals i.Member.ID
where (i.Collection.ID == collectionID)
select e;
foreach (Entity ent in mydata) {
if(!ent.Properties.IsLoaded) { ent.Properties.Load(); }
}
Do you still get the same (unexpected) result?
EDIT: Changed the first sentence, as it was incorrect. Thanks for the pointer comment!

Why linq-to-sql query is translated into subquery?

Why this linq query:
(from c in Orders
select new
{
Id=c.Id,
DeliveryDate = c.DeliveryDate.Value
}).Take(10)
is translated into
SELECT TOP (10) [t1].[Id], [t1].[value] AS [DeliveryDate]
FROM (
SELECT [t0].[Id], [t0].[DeliveryDate] AS [value]
FROM [Orders] AS [t0]
) AS [t1]
but when I change DeliveryDate = c.DeliveryDate.Value into DeliveryDate = c.DeliveryDate SQL query looks as simple as:
SELECT TOP (10) [t0].[Id], [t0].[DeliveryDate]
FROM [Orders] AS [t0]

I think this is because the LINQ2SQL's translator is under-optimized. Use of a "property" (Value) triggers creation of a sub-query, which turns out to be unnecessary.
It is worth to note that any RDBMS worth its salt would generate identical query plans for both SQL queries, so in the end it would not matter either way.

Possibly a bug / not optimized issue. I can not explain it different.

Improve query generated from entity framework

I have a query linq like this:
var query = from c in Context.Customers select c;
var result = query.ToList();
Linq query generate this tsql code:
exec sp_executesql N'SELECT
[Project1].[Id] AS [Id],
[Project1].[Name] AS [Name],
[Project1].[Email] AS [Email]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Email] AS [Email]
FROM [dbo].[Customers] AS [Extent1] ) AS [Project1]
Is there a way for not generate subquery?

Do you have any evidence that that query is causing performance problems? I imagine the query-optimizer would easily recognize that.
If you're certain after profiling that the query is a performance problem (doubtful) - and only then - you could simply turn the query into a stored procedure, and call that instead.

You use a tool like Linq because you don't want to write SQL, before abandoning that you should at least compare the query plan of your proposed SQL vs that generated by the tool. I don't have access to SQL Studio at the moment, but I would be a bit surprised if the query plans aren't identical...
EDIT: having had a chance to check out the query plans, they are in fact identical.

Short answer: No you cannot modify that query.
Long answer: If you want to reimplement Linq provider and query generator then perhaps there is a way but I doubt you want to do that. You can also implement custom EF provider wrapper which will take query passed from EF and reformat it but that will be hard as well - and slow. Are you going to write custom interpreter for SQL queries?

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Entity Framework generates inefficient select when using Find() - c#

Related

Why does System.Data.Linq generates ROW_NUMBER() for Paging instead of OFFSET/FETCH for SQL Server 2012

Count(x => ...) vs Where(x => ...).Count()

Entity Framework: Include and Explicitly defined foreign keys [duplicate]

Why linq-to-sql query is translated into subquery?

Improve query generated from entity framework

Categories

Resources