I have two related tables, let's say:
Table Products (Main Table)
ID
Name
Type
Table Parts (Child of Products, contains detailed information)
ID
ProductID
PartName
PartValue
I would like to get Products ordered by the value of a specific part (e.g. Engine).
I came up with the following:
// Code to construct a query to get all desired products
products = products.OrderBy(c => c.Parts.Where(d => d.PartName == "Engine").Select(d => d.PartValue).FirstOrDefault());
This works but is too slow. Can I improve the query, or will I have to redesign my database so that I don't sort like this in the first place?
Generated SQL-Query:
SELECT
[Project3].[ID] AS [ID],
[Project3].[Name] AS [Name]
FROM ( SELECT
[Project2].[ID] AS [ID],
[Project2].[Name] AS [Name],
[Project2].[C1] AS [C1]
FROM ( SELECT
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name],
(SELECT TOP (1)
[Extent2].[PartValue] AS [PartValue]
FROM [dbo].[Parts] AS [Extent2]
WHERE ([Extent1].[ID] = [Extent2].[ProductID]) AND (N'Engine' = [Extent2].[PartName])) AS [C1]
FROM [dbo].[Products] AS [Extent1]
WHERE (1 = [Extent1].[Type])
) AS [Project2]
) AS [Project3]
ORDER BY [Project3].[C1] ASC
If you want to try a join, it would be a bit like this:
Products
.Join(Parts, pd => pd.ID, pt => pt.ProductID, (pd, pt) => new { pd.Name, pt.PartName, pt.PartValue })
.Where(x => x.PartName == "Engine")
.OrderBy(x => x.PartValue);
This would result in a single SELECT statement without the inner selects from before.
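To make the trade-off of the join concrete, here is a minimal sketch that runs against in-memory data. The `Product` and `Part` records, the `JoinSortDemo` class, and the `NamesOrderedByPart` helper are hypothetical names invented for illustration, and `AsQueryable()` over arrays only stands in for the real entity sets. One thing it shows: because `Join` is an inner join, products that have no matching part drop out of the result, unlike the original `OrderBy` with `FirstOrDefault`, which keeps them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Products and Parts tables from the question.
public record Product(int ID, string Name, int Type);
public record Part(int ID, int ProductID, string PartName, string PartValue);

public static class JoinSortDemo
{
    // Orders products by the value of one named part using a single join.
    // Products without that part are not returned (inner join semantics).
    public static List<string> NamesOrderedByPart(
        IQueryable<Product> products, IQueryable<Part> parts, string partName)
    {
        return products
            .Join(parts.Where(pt => pt.PartName == partName),
                  pd => pd.ID,
                  pt => pt.ProductID,
                  (pd, pt) => new { pd.Name, pt.PartValue })
            .OrderBy(x => x.PartValue)
            .Select(x => x.Name)
            .ToList();
    }

    public static void Main()
    {
        // In-memory data; a real application would pass the context's sets instead.
        var products = new[]
        {
            new Product(1, "Car", 1),
            new Product(2, "Truck", 1),
            new Product(3, "Trailer", 1)   // has no Engine part
        }.AsQueryable();

        var parts = new[]
        {
            new Part(1, 1, "Engine", "V8"),
            new Part(2, 2, "Engine", "V6"),
            new Part(3, 1, "Wheels", "4")
        }.AsQueryable();

        // Prints "Truck, Car": Trailer is missing because it has no Engine row.
        Console.WriteLine(string.Join(", ", NamesOrderedByPart(products, parts, "Engine")));
    }
}
```

If products without the part must still appear in the list, the inner join is not a drop-in replacement, and a left join (for example via `GroupJoin` and `DefaultIfEmpty`) would be the closer equivalent.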
I have the following query that works fine
var myList = (from p in db.full
group p by p.object into g
orderby g.Count() descending
select new StringIntType
{
str = g.Key,
nbr = g.Count()
}).Take(50).ToList();
The problem is that it's a bit slow because I'm using Count(), which is translated to COUNT(*).
I need to know if there is a way to use COUNT(object) instead.
Here is what I got in SQL Server Profiler:
exec sp_executesql N'SELECT TOP (50)
[Project1].[C2] AS [C1],
[Project1].[object] AS [object],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [object],
1 AS [C2]
FROM ( SELECT
[Extent1].[object] AS [K1],
COUNT(1) AS [A1]
FROM (SELECT
[full].[mc_host_class] AS [mc_host_class],
[full].[event_handle] AS [event_handle],
[full].[mc_host_address] AS [mc_host_address],
[full].[mc_object_class] AS [mc_object_class],
[full].[mc_object] AS [mc_object],
[full].[mc_incident_time] AS [mc_incident_time],
[full].[date_reception] AS [date_reception],
[full].[status] AS [status],
[full].[mc_owner] AS [mc_owner],
[full].[msg] AS [msg],
[full].[duration] AS [duration],
[full].[repeat_count] AS [repeat_count],
[full].[mc_date_modification] AS [mc_date_modification],
[full].[event_class] AS [event_class],
[full].[bycn_ticket_remedy] AS [bycn_ticket_remedy],
[full].[mc_host] AS [mc_host],
[full].[acknowledge_by] AS [acknowledge_by],
[full].[acknowledge_by_time] AS [acknowledge_by_time],
[full].[assigned_by] AS [assigned_by],
[full].[assigned_to] AS [assigned_to],
[full].[assigned_by_time] AS [assigned_by_time],
[full].[closed_by] AS [closed_by],
[full].[closed_by_time] AS [closed_by_time],
[full].[blacked_out] AS [blacked_out],
[full].[bycn_liaison_type] AS [bycn_liaison_type],
[full].[bycn_liaison_debit] AS [bycn_liaison_debit],
[full].[cause] AS [cause],
[full].[mc_location] AS [mc_location],
[full].[mc_parameter] AS [mc_parameter]
FROM [dbo].[full] AS [full]) AS [Extent1]
GROUP BY [Extent1].[object]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC',N'@p__linq__0 datetime2(7),@p__linq__1 datetime2(7)',@p__linq__0='2015-03-14 00:00:00',@p__linq__1='2015-04-15 00:00:00'
Perhaps a couple of optimisations can do the trick:
Do the take first, before selecting
Count groups only once using a let keyword
A sketch of the idea (written without a compiler, so treat it as untested):
var topFifty = (
from p in db.full
group p by p.object into g
let groupedCount = g.Count()
orderby groupedCount descending
select new { g.Key, Count = groupedCount }
)
.Take(50).ToList();
// Project into the result type in memory, after only 50 rows have been fetched
var myList = topFifty.Select(x => new StringIntType
{
str = x.Key,
nbr = x.Count
}).ToList();
I want to ask why LINQ to Entities generates more joins than necessary, and whether there is a way to improve this.
Here is my code:
var items = dc.WHItems
.Select(c => new testModel()
{
ID = c.ID,
Name = c.Name,
Code = c.Code,
Discontinued = c.Discontinued,
Type = c.WHItemType.Name,
Category = c.WHItemType.WHItemCategory.Name
})
.Where(c => c.Discontinued == false)
.Take(10);
Resulting SQL, taken from SQL Profiler:
SELECT TOP (10)
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name],
[Extent1].[Code] AS [Code],
[Extent1].[Discontinued] AS [Discontinued],
[Extent2].[Name] AS [Name1],
[Extent4].[Name] AS [Name2]
FROM [dbo].[WHItems] AS [Extent1]
INNER JOIN [dbo].[WHItemTypes] AS [Extent2] ON [Extent1].[WHItemTypeID] = [Extent2].[ID]
LEFT OUTER JOIN [dbo].[WHItemTypes] AS [Extent3] ON [Extent1].[WHItemTypeID] = [Extent3].[ID]
LEFT OUTER JOIN [dbo].[WHItemCategories] AS [Extent4] ON [Extent3].[WHItemCategoryID] = [Extent4].[ID]
WHERE 0 = [Extent1].[Discontinued]
However, in LINQPad 4 the resulting query is what I expected:
SELECT TOP (10) [t0].[ID], [t0].[Name], [t0].[Code], [t0].[Discontinued], [t1].[Name] AS [Type], [t2].[Name] AS [Category]
FROM [WHItems] AS [t0]
INNER JOIN [WHItemTypes] AS [t1] ON [t1].[ID] = [t0].[WHItemTypeID]
INNER JOIN [WHItemCategories] AS [t2] ON [t2].[ID] = [t1].[WHItemCategoryID]
WHERE NOT ([t0].[Discontinued] = 1)
Here is another piece of code, where I try to get items from inventories. Table WHInventories uses a composite key on WHItemID and WHID, which I think explains the choice of an inner join instead of a left join as above.
var items = dc.WHInventories
.Select(c => new testModel()
{
ID = c.WHItemID,
Name = c.WHItem.Name,
Code = c.WHItem.Code,
Discontinued = c.WHItem.Discontinued
})
.Where(c => c.Discontinued == false)
.Take(10);
The resulting query in SQL Profiler shows an additional left outer join. Why does column Code have to be retrieved from Extent3, while columns Name and Discontinued both use Extent2?
SELECT TOP (10)
[Extent1].[WHItemID] AS [WHItemID],
[Extent2].[Name] AS [Name],
[Extent3].[Code] AS [Code],
[Extent2].[Discontinued] AS [Discontinued]
FROM [dbo].[WHInventories] AS [Extent1]
INNER JOIN [dbo].[WHItems] AS [Extent2] ON [Extent1].[WHItemID] = [Extent2].[ID]
LEFT OUTER JOIN [dbo].[WHItems] AS [Extent3] ON [Extent1].[WHItemID] = [Extent3].[ID]
WHERE 0 = [Extent2].[Discontinued]
If I use an explicit join instead, I lose the advantage of having navigation properties:
var items = dc.WHInventories
.Join(dc.WHItems, c => c.WHItemID, d => d.ID, (c, d) => new { c, d })
.Select(c => new testModel()
{
ID = c.c.WHItemID,
Name = c.d.Name,
Code = c.d.Code,
Discontinued = c.d.Discontinued
})
.Where(c => c.Discontinued == false)
.Take(10);
The resulting query in SQL Profiler shows what I expected. Column Code now uses Extent2:
SELECT TOP (10)
[Extent1].[WHItemID] AS [WHItemID],
[Extent2].[Name] AS [Name],
[Extent2].[Code] AS [Code],
[Extent2].[Discontinued] AS [Discontinued]
FROM [dbo].[WHInventories] AS [Extent1]
INNER JOIN [dbo].[WHItems] AS [Extent2] ON [Extent1].[WHItemID] = [Extent2].[ID]
WHERE 0 = [Extent2].[Discontinued]
I've been fooling around with some LINQ over Entities and I'm getting strange results and I would like to get an explanation...
Given the following LINQ query,
// Sample # 1
IEnumerable<GroupInformation> groupingInfo;
groupingInfo = from a in context.AccountingTransaction
group a by a.Type into grp
select new GroupInformation()
{
GroupName = grp.Key,
GroupCount = grp.Count()
};
I get the following SQL query (taken from SQL Profiler):
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [Type],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[Type] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[AccountingTransaction] AS [Extent1]
GROUP BY [Extent1].[Type]
) AS [GroupBy1]
So far so good.
If I change my LINQ query to:
// Sample # 2
groupingInfo = context.AccountingTransaction.
GroupBy(a => a.Type).
Select(grp => new GroupInformation()
{
GroupName = grp.Key,
GroupCount = grp.Count()
});
it yields the exact same SQL query. Makes sense to me.
Here comes the interesting part... If I change my LINQ query to:
// Sample # 3
IEnumerable<AccountingTransaction> accounts;
IEnumerable<IGrouping<object, AccountingTransaction>> groups;
IEnumerable<GroupInformation> groupingInfo;
accounts = context.AccountingTransaction;
groups = accounts.GroupBy(a => a.Type);
groupingInfo = groups.Select(grp => new GroupInformation()
{
GroupName = grp.Key,
GroupCount = grp.Count()
});
the following SQL is executed (I stripped a few of the fields from the actual query, but all the fields from the table (~ 15 fields) were included in the query, twice):
SELECT
[Project2].[C1] AS [C1],
[Project2].[Type] AS [Type],
[Project2].[C2] AS [C2],
[Project2].[Id] AS [Id],
[Project2].[TimeStamp] AS [TimeStamp],
-- <snip>
FROM ( SELECT
[Distinct1].[Type] AS [Type],
1 AS [C1],
[Extent2].[Id] AS [Id],
[Extent2].[TimeStamp] AS [TimeStamp],
-- <snip>
CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
FROM (SELECT DISTINCT
[Extent1].[Type] AS [Type]
FROM [dbo].[AccountingTransaction] AS [Extent1] ) AS [Distinct1]
LEFT OUTER JOIN [dbo].[AccountingTransaction] AS [Extent2] ON [Distinct1].[Type] = [Extent2].[Type]
) AS [Project2]
ORDER BY [Project2].[Type] ASC, [Project2].[C2] ASC
Why are the generated SQL queries so different? After all, the exact same code is executed; it's just that sample # 3 uses intermediate variables to get the same job done!
Also, if I do:
Console.WriteLine(groupingInfo.ToString());
for sample # 1 and sample # 2, I get the exact same query that was captured by SQL Profiler, but for sample # 3, I get:
System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Linq.IGrouping`2[System.Object,TestLinq.AccountingTransaction],TestLinq.GroupInformation]
What is the difference? Why can't I get the SQL query generated by LINQ if I split the LINQ query into multiple statements?
The ultimate goal is to be able to add operators to the query (Where, OrderBy, etc.) at run-time.
BTW, I've seen this behavior in EF 4.0 and EF 6.0.
Thank you for your help.
The reason is that in your third attempt you refer to accounts as IEnumerable<AccountingTransaction>, which causes the query operators to bind to LINQ to Objects (Enumerable.GroupBy and Enumerable.Select).
In your first and second attempts, on the other hand, the reference to AccountingTransaction is preserved as IQueryable<AccountingTransaction>, so the query is executed by LINQ to Entities, which translates it into the appropriate SQL statement.
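The effect of the declared variable type can be reproduced without a database. The sketch below is an illustration only: `AsQueryable()` over an array stands in for the entity set, and the `QueryableDemo` class with its `GroupedViaQueryable`, `GroupedViaEnumerable`, and `Build` helpers are hypothetical names. It checks whether the final query object is still an `IQueryable`, and shows one way to add operators at run-time while keeping the variable typed as `IQueryable<T>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity from the question.
public record AccountingTransaction(int Id, string Type);

public static class QueryableDemo
{
    // The variable stays IQueryable<T>, so GroupBy/Select bind to the Queryable
    // extension methods and the provider keeps receiving one expression tree.
    public static bool GroupedViaQueryable(IQueryable<AccountingTransaction> source)
    {
        IQueryable<AccountingTransaction> accounts = source;
        var groups = accounts.GroupBy(a => a.Type);
        var info = groups.Select(g => new { GroupName = g.Key, GroupCount = g.Count() });
        return (object)info is IQueryable;
    }

    // The variable is declared as IEnumerable<T>, so GroupBy/Select bind to the
    // Enumerable extension methods and run in memory (LINQ to Objects).
    public static bool GroupedViaEnumerable(IQueryable<AccountingTransaction> source)
    {
        IEnumerable<AccountingTransaction> accounts = source;
        var groups = accounts.GroupBy(a => a.Type);
        var info = groups.Select(g => new { GroupName = g.Key, GroupCount = g.Count() });
        return (object)info is IQueryable;
    }

    // Adds operators conditionally at run-time without leaving IQueryable<T>.
    public static IQueryable<AccountingTransaction> Build(
        IQueryable<AccountingTransaction> source, string typeFilter, bool sortById)
    {
        var query = source;
        if (!string.IsNullOrEmpty(typeFilter))
            query = query.Where(a => a.Type == typeFilter);
        if (sortById)
            query = query.OrderBy(a => a.Id);
        return query;
    }

    public static void Main()
    {
        var data = new[]
        {
            new AccountingTransaction(1, "Debit"),
            new AccountingTransaction(2, "Credit"),
            new AccountingTransaction(3, "Debit")
        }.AsQueryable();

        Console.WriteLine(GroupedViaQueryable(data));            // True
        Console.WriteLine(GroupedViaEnumerable(data));           // False
        Console.WriteLine(Build(data, "Debit", true).Count());   // 2
    }
}
```

The same rule applies to every intermediate variable: declaring it with `var` or as `IQueryable<T>` keeps the composition on the provider side, while any `IEnumerable<T>` declaration in the chain moves the remaining operators into memory.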
I have the following line:
WorkPlaces.FirstOrDefault()
.WorkSteps.Where(x=>x.Failcodes_Id != null)
.OrderByDescending(x=>x.Timestamp)
.FirstOrDefault()
There are about 10-20 workplaces and each workplace has thousands of worksteps. I would like to get the last workstep for each of the workplaces.
The code above is an example from LINQPad, because I couldn't believe that the generated SQL looks like this:
SELECT TOP (1)
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Description] AS [Description],
[Extent1].[Active] AS [Active],
[Extent1].[ProductionLine_Id] AS [ProductionLine_Id],
[Extent1].[DefaultTechnology_Id] AS [DefaultTechnology_Id],
[Extent1].[PrinterName] AS [PrinterName],
[Extent1].[Deleted] AS [Deleted],
[Extent2].[Id] AS [Id1],
[Extent1].[LoggedInUser_UserId] AS [LoggedInUser_UserId]
FROM [dbo].[WorkPlaces] AS [Extent1]
LEFT OUTER JOIN [dbo].[WorkplaceParameterSet] AS [Extent2] ON [Extent1].[Id] = [Extent2].[WorkPlace_Id]
GO
-- Region Parameters
DECLARE @EntityKeyValue1 Int = 1
-- EndRegion
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Timestamp] AS [Timestamp],
[Extent1].[Description] AS [Description],
[Extent1].[WorkPlace_Id] AS [WorkPlace_Id],
[Extent1].[WorkItemState_Id] AS [WorkItemState_Id],
[Extent1].[UserId] AS [UserId],
[Extent1].[WorkItem_Id] AS [WorkItem_Id],
[Extent1].[Technology_Id] AS [Technology_Id],
[Extent1].[Failcodes_Id] AS [Failcodes_Id],
[Extent1].[DrawingNo] AS [DrawingNo],
[Extent1].[ManualData] AS [ManualData],
[Extent1].[Deleted] AS [Deleted],
[Extent1].[WorkItemState_Arrival_Id] AS [WorkItemState_Arrival_Id]
FROM [dbo].[WorkSteps] AS [Extent1]
WHERE [Extent1].[WorkPlace_Id] = @EntityKeyValue1
Is there a way to get one line from worksteps without downloading 9000 records to pick one from the top of the list?
Rather than getting each workplace individually and then getting the workstep for that workplace in a separate query, you can use Select to project each workplace into the workstep that you want, all in one query:
var query = WorkPlaces.Select(workplace => workplace.WorkSteps
.Where(x => x.Failcodes_Id != null)
.OrderByDescending(x => x.Timestamp)
.FirstOrDefault());
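As a quick check of what this projection returns, here is a minimal in-memory sketch. The `WorkPlace` and `WorkStep` classes are reduced, hypothetical versions of the entities in the question, `LastFailedSteps` is an illustrative helper, and `AsQueryable()` over an array only stands in for the real context. It yields one entry per workplace, and that entry is null when a workplace has no workstep with a fail code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced, hypothetical versions of the entities from the question.
public class WorkStep
{
    public int Id { get; set; }
    public int? Failcodes_Id { get; set; }
    public DateTime Timestamp { get; set; }
}

public class WorkPlace
{
    public int Id { get; set; }
    public List<WorkStep> WorkSteps { get; set; } = new List<WorkStep>();
}

public static class LastStepDemo
{
    // One entry per workplace: its most recent workstep that has a fail code,
    // or null if the workplace has no such workstep.
    public static List<WorkStep> LastFailedSteps(IQueryable<WorkPlace> workPlaces)
    {
        return workPlaces
            .Select(wp => wp.WorkSteps
                .Where(x => x.Failcodes_Id != null)
                .OrderByDescending(x => x.Timestamp)
                .FirstOrDefault())
            .ToList();
    }

    public static void Main()
    {
        var workPlaces = new[]
        {
            new WorkPlace
            {
                Id = 1,
                WorkSteps =
                {
                    new WorkStep { Id = 1, Failcodes_Id = 5, Timestamp = new DateTime(2015, 1, 1) },
                    new WorkStep { Id = 2, Failcodes_Id = null, Timestamp = new DateTime(2015, 1, 3) },
                    new WorkStep { Id = 3, Failcodes_Id = 7, Timestamp = new DateTime(2015, 1, 2) }
                }
            },
            new WorkPlace { Id = 2 }   // no worksteps at all
        }.AsQueryable();

        foreach (var step in LastFailedSteps(workPlaces))
            Console.WriteLine(step == null ? "none" : step.Id.ToString());
        // Prints "3" then "none"
    }
}
```

Callers therefore need to handle null entries, or filter them out, before using the result.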
You should use the IQueryable interface instead of IEnumerable. Also check this.
I have Linq-to-SQL code that works with a many-to-many relationship, but note that the relationship itself has its own set of attributes (in this case, Products are in Many Categories, and each product-in-category relation has its own SortOrder attribute).
I have a Linq-to-SQL block that returns matching Products with Category membership information. When I execute the code it generates optimised T-SQL code like so:
exec sp_executesql N'SELECT [t0].[ProductId], [t0].[Name], [t1].[ProductId] AS [ProductId2], [t1].[CategoryId], [t1].[SortOrder] AS [SortOrder2], [t2].[CategoryId] AS [CategoryId2], [t2].[Name] AS [Name2], (
SELECT COUNT(*)
FROM [dbo].[ProductsInCategories] AS [t3]
INNER JOIN [dbo].[Categories] AS [t4] ON [t4].[CategoryId] = [t3].[CategoryId]
WHERE [t3].[ProductId] = [t0].[ProductId]
) AS [value]
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN ([dbo].[ProductsInCategories] AS [t1]
INNER JOIN [dbo].[Categories] AS [t2] ON [t2].[CategoryId] = [t1].[CategoryId]) ON [t1].[ProductId] = [t0].[ProductId]
WHERE (([t0].[OwnerId]) = @p0) AND ([t0].[Visible] = 1)
ORDER BY [t0].[SortOrder], [t0].[Name], [t0].[ProductId], [t1].[CategoryId]',N'@p0 bigint',@p0=3
However, when I add paging instructions (i.e. ".Skip(0).Take(50)") to the Linq expression the generated SQL becomes this:
exec sp_executesql N'SELECT TOP (50) [t0].[ProductId], [t0].[Name]
FROM [dbo].[Products] AS [t0]
WHERE (([t0].[OwnerId]) = @p0) AND ([t0].[Visible] = 1)
ORDER BY [t0].[SortOrder], [t0].[Name]',N'@p0 bigint',@p0=3
Which means the Category membership information isn't loaded anymore, so Linq-to-SQL then executes the manual loading code 50 times over (one for each member in the returned set):
exec sp_executesql N'SELECT [t0].[ProductId], [t0].[CategoryId], [t0].[SortOrder], [t1].[CategoryId] AS [CategoryId2], [t1].[Name]
FROM [dbo].[ProductsInCategories] AS [t0]
INNER JOIN [dbo].[Categories] AS [t1] ON [t1].[CategoryId] = [t0].[CategoryId]
WHERE [t0].[ProductId] = @x1',N'@x1 bigint',@x1=1141
(obviously the "@x1" ID parameter varies for each result from the original query).
So clearly Linq paging breaks the query and causes it to load data separately. Is there a way around this or should I do paging in my own software?
...fortunately the number of products in the database is small enough (<500) to do this, but it just feels dirty because there could be tens of thousands of products, and this just wouldn't be a good query.
EDIT:
Here is my Linq:
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Product>( p => p.ProductsInCategories );
dlo.LoadWith<ProductsInCategory>( pic => pic.Category );
this.LoadOptions = dlo;
query = from p in this.Products
select p;
// The lines below are added conditionally:
query = query.OrderBy( p => p.SortOrder ).ThenBy( p => p.Name );
query = query.Where( p => p.Visible );
query = query.Where( p => p.Name.Contains( filter ) || p.Description.Contains( filter ) );
query = query.Where( p => p.OwnerId == siteId );
The skip/take lines are added optionally, and are the only differences that cause the different T-SQL generation (as far as I know):
IQueryable<Product> query = GetProducts( siteId, category, filter, showHidden, sortBySortOrder );
///////////////////////////////////
total = query.Count();
var pagedProducts = query.Skip( pageIndex * pageSize ).Take( pageSize );
return pagedProducts;
An alternative answer which first pages the products and then selects products and categories in a parent-child structure would be like this:
var filter = "a";
var pageSize = 2;
var pageIndex = 1;
// get the correct products
var query = Products.AsQueryable();
query = query.Where (q => q.Name.Contains(filter));
query = query.OrderBy (q => q.SortOrder).ThenBy(q => q.Name);
// do paging
query = query.Skip(pageSize*pageIndex).Take(pageSize);
// now get products + categories as tree structure
var query2 = query.Select(
q=>new
{
q.Name,
Categories=q.ProductsInCategories.Select (pic => pic.Category)
});
Which produces a single SQL statement
-- Region Parameters
DECLARE @p0 NVarChar(1000) = '%a%'
DECLARE @p1 Int = 2
DECLARE @p2 Int = 2
-- EndRegion
SELECT [t2].[Name], [t4].[CategoryId], [t4].[Name] AS [Name2], [t4].[Visible], (
SELECT COUNT(*)
FROM (
SELECT [t5].[CategoryId]
FROM [ProductsInCategories] AS [t5]
WHERE [t5].[ProductId] = [t2].[ProductId]
) AS [t6]
INNER JOIN [Categories] AS [t7] ON [t7].[CategoryId] = [t6].[CategoryId]
) AS [value]
FROM (
SELECT [t1].[ProductId], [t1].[Name], [t1].[ROW_NUMBER]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[SortOrder], [t0].[Name], [t0].[ProductId]) AS [ROW_NUMBER], [t0].[ProductId], [t0].[Name]
FROM [Products] AS [t0]
WHERE [t0].[Name] LIKE @p0
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p1 + 1 AND @p1 + @p2
) AS [t2]
LEFT OUTER JOIN ([ProductsInCategories] AS [t3]
INNER JOIN [Categories] AS [t4] ON [t4].[CategoryId] = [t3].[CategoryId]) ON [t3].[ProductId] = [t2].[ProductId]
ORDER BY [t2].[ROW_NUMBER], [t3].[CategoryId], [t3].[ProductId]
Here is a workaround: construct your query with all your conditions and the ordering, but select only the primary key of your Product table (let's assume this is the ProductId column).
The next step is to take the total count (to calculate how many rows should be skipped and taken),
and the last step is to select all the records from your Product table whose ProductIds are in the query (note: the Skip and Take extension methods should be applied to the key query, not to the new select itself).
This will get you a SELECT statement similar to yours (from the first example) with related entities.
EDIT:
Just created a similar DB structure (according to the original SQL from the question):
Then used:
using (var db = new TestDataContext())
{
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Product>(p => p.ProductsInCategories);
options.LoadWith<ProductsInCategory>(pic => pic.Category);
db.LoadOptions = options;
var filter = "product";
var pageIndex = 1;
var pageSize = 10;
var query = db.Products
.OrderBy(p => p.SortOrder)
.ThenBy(p => p.Name)
.Where(p => p.Name.Contains(filter) || p.Description.Contains(filter))
.Select(p => p.ProductId);
var total = query.Count();
var products = db.Products
.Where(p => query.Skip(pageIndex * pageSize).Take(pageSize).Contains(p.ProductId))
.ToList();
}
After the .ToList() call, the products variable holds the products together with their product-category links and categories. This produced 2 SQL statements, one for the .Count() call:
exec sp_executesql N'SELECT COUNT(*) AS [value]
FROM [dbo].[Products] AS [t0]
WHERE ([t0].[Name] LIKE @p0) OR ([t0].[Description] LIKE @p1)',N'@p0 nvarchar(4000),@p1 nvarchar(4000)',@p0=N'%product%',@p1=N'%product%'
and another one for .ToList():
exec sp_executesql N'SELECT [t0].[ProductId], [t0].[Name], [t0].[Description], [t0].[SortOrder], [t1].[ProductId] AS [ProductId2], [t1].[CategoryId], [t1].[SortOrder] AS [SortOrder2], [t2].[CategoryId] AS [CategoryId2], [t2].[Name] AS [Name2], (
SELECT COUNT(*)
FROM (
SELECT NULL AS [EMPTY]
FROM [dbo].[ProductsInCategories] AS [t6]
INNER JOIN [dbo].[Category] AS [t7] ON [t7].[CategoryId] = [t6].[CategoryId]
WHERE [t6].[ProductId] = [t0].[ProductId]
) AS [t8]
) AS [value]
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN ([dbo].[ProductsInCategories] AS [t1]
INNER JOIN [dbo].[Category] AS [t2] ON [t2].[CategoryId] = [t1].[CategoryId]) ON [t1].[ProductId] = [t0].[ProductId]
WHERE EXISTS(
SELECT NULL AS [EMPTY]
FROM (
SELECT [t4].[ProductId]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t3].[SortOrder], [t3].[Name], [t3].[ProductId]) AS [ROW_NUMBER], [t3].[ProductId]
FROM [dbo].[Products] AS [t3]
WHERE ([t3].[Name] LIKE @p0) OR ([t3].[Description] LIKE @p1)
) AS [t4]
WHERE [t4].[ROW_NUMBER] BETWEEN @p2 + 1 AND @p2 + @p3
) AS [t5]
WHERE [t5].[ProductId] = [t0].[ProductId]
)
ORDER BY [t0].[ProductId], [t1].[CategoryId]',N'@p0 nvarchar(4000),@p1 nvarchar(4000),@p2 int,@p3 int',@p0=N'%product%',@p1=N'%product%',@p2=10,@p3=10
No more extra queries (as SQL Server Profiler said).