How do I speed up this EF query? - c#

How do I speed up this EntityFramework query? The profiler tells me that most of the time is spent in od.Order with ~5000 calls.
var orderDetails = context.OrderDetails.ToList();
foreach (OrderDetail od in orderDetails)
{
    var date = od.Order.date;
    if (!trPerDay.ContainsKey(date))
    {
        trPerDay.Add(date, od.quantity);
    }
    else
    {
        trPerDay[date] += od.quantity;
    }
}
Order property is defined like this:
[MetadataType(typeof(OrderDetailMetaData))]
public partial class OrderDetail
{
public int orderID { get; set; }
public string productID { get; set; }
public int quantity { get; set; }
public bool upgraded { get; set; }
public virtual Order Order { get; set; }
public virtual Product Product { get; set; }
}

What you posted loads the entire OrderDetails table in a single query, from a single thread. Then it tries to lazily load each order which results in a separate call to the database.
It's far faster to let the database do the calculations and only load the final results.
In this case it seems the loop is trying to calculate the total order quantity per order date. The SQL query that produces this would be something like :
SELECT Date,SUM(details.Quantity)
FROM Orders inner join OrderDetails details
on Orders.ID=details.OrderID
GROUP BY Orders.Date
The equivalent in LINQ can be :
var query=context.OrderDetails
.GroupBy(d=>d.Order.Date)
.Select(g=>new {
Date=g.Key,
Total=g.Sum(dt=>dt.Quantity)
});
var totals=await query.ToListAsync();
or
var totals=await query.ToDictionaryAsync(t=>t.Date,t=>t.Total);
In both cases, a GROUP BY query will be generated that calculates the totals by date.
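The shape of the grouped query can be tried out on in-memory data. This is only a LINQ to Objects sketch: `DetailRow` and `TotalsPerDaySketch` are hypothetical names, and no database or EF context is involved, so it shows the result of the grouping rather than the SQL that EF would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an OrderDetail row together with its Order's date.
public class DetailRow
{
    public DateTime OrderDate { get; set; }
    public int Quantity { get; set; }
}

public static class TotalsPerDaySketch
{
    // Same GroupBy/Sum/ToDictionary shape as the query above, evaluated in memory.
    public static Dictionary<DateTime, int> TotalsByDate(IEnumerable<DetailRow> rows)
    {
        return rows
            .GroupBy(r => r.OrderDate)
            .Select(g => new { Date = g.Key, Total = g.Sum(r => r.Quantity) })
            .ToDictionary(t => t.Date, t => t.Total);
    }

    public static void Main()
    {
        var rows = new[]
        {
            new DetailRow { OrderDate = new DateTime(2020, 1, 1), Quantity = 2 },
            new DetailRow { OrderDate = new DateTime(2020, 1, 1), Quantity = 3 },
            new DetailRow { OrderDate = new DateTime(2020, 1, 2), Quantity = 4 },
        };

        foreach (var pair in TotalsByDate(rows))
        {
            Console.WriteLine($"{pair.Key:yyyy-MM-dd}: {pair.Value}");
        }
    }
}
```

The dictionary holds one entry per distinct date, which is what the loop in the question builds by hand in trPerDay.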
This assumes that Date is what it says - a date. Either a date-typed field in the database, or a datetime without a time component. If it's actually a Date+Time, the query will have to be adjusted to use only the date part. Luckily, EF Core maps DateTime.Date to the equivalent SQL function call:
var query=context.OrderDetails
.GroupBy(d=>d.Order.Date.Date)
.Select(g=>new {
Date=g.Key,
Total=g.Sum(dt=>dt.Quantity)
});
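Why the time component matters can be seen with plain DateTime values. A small in-memory sketch, with `GroupByDateSketch` as a hypothetical name; it only counts groups and says nothing about how a particular provider translates `.Date`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDateSketch
{
    // Counts the groups produced when grouping on the full timestamp
    // (dateOnly == false) or on the date part only (dateOnly == true).
    public static int CountGroups(IEnumerable<DateTime> stamps, bool dateOnly)
    {
        return stamps.GroupBy(s => dateOnly ? s.Date : s).Count();
    }

    public static void Main()
    {
        var stamps = new[]
        {
            new DateTime(2020, 1, 1, 9, 30, 0),
            new DateTime(2020, 1, 1, 17, 45, 0),
        };

        Console.WriteLine(CountGroups(stamps, dateOnly: false)); // two orders on the same day, different times
        Console.WriteLine(CountGroups(stamps, dateOnly: true));
    }
}
```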

Related

Execute Linq to SQL queries on Navigation properties in Model

Is it possible to make Statistics.Sum(s => s.Conversions) linq query as Linq to SQL and not Linq to Object like in this code below. Every time when I access TotalConversions property, the whole Statistics table downloaded from database and then SUM linq executed locally. I want to do that in database server as SQL.
public class User : Entity
{
public int Id { get; set; }
public string Email { get; set; }
public virtual ICollection<Statistic> Statistics { get; set; }
[NotMapped]
public int TotalConversions
{
get
{
return Statistics.Sum(s => s.Conversions);
}
}
}
Yes, but you need a reference to the DbContext. This is one of the costs of having the Entities be persistence-ignorant.
Then the property would look something like:
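// Where + SelectMany keeps the whole expression a database query, so the SUM runs in SQL;
// Single(...) would materialize the User first and sum its Statistics in memory.
return db.Users.Where(u => u.Id == this.Id).SelectMany(u => u.Statistics).Sum(s => s.Conversions);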

Linq to Entities filter navigation collection properties

I have an order class that has two navigation properties of type collections; OrderDetails and Categories. There is a one to many relationship between Order and both OrderDetail and Category. An Order may or may not have a Category associated to it. An OrderDetail record has a CustomerID field.
I am trying to retrieve a list of Orders that have categories associated to them and their corresponding OrderDetail records for a specific customer. I want to achieve this using linq to entities if possible.
public class Order
{
    public Order()
    {
        OrderDetails = new List<OrderDetail>();
        Categories = new List<Category>();
    }
    public int OrderID { get; set; }
    public DateTime OrderDate { get; set; }
    public virtual List<OrderDetail> OrderDetails { get; set; }
    public virtual List<Category> Categories { get; set; }
}
public class OrderDetail
{
public int OrderDetailID { get; set; }
public int CustomerID { get; set; }
public virtual Order Order { get; set; }
}
public class Category
{
public int CategoryID { get; set; }
public string CategoryName { get; set; }
public virtual Order Order { get; set; }
}
I can get it to work if I start with the OrderDetail entity first as shown below but how would I write this if I want to start with the Order entity first?
var query = from od in _dbCtx.OrderDetails
.Include("Order")
.Include("Order.Categories")
where od.CustomerID == custID && od.Order.Categories.Count > 0
select od;
You can try this:
var query =_dbCtx.Orders.Include("OrderDetails")
.Include("Categories")
.Where(o=>o.Categories.Count>0)
.SelectMany(o=>o.OrderDetails.Where(od=>od.CustomerID == custID));
The key in this query is the SelectMany extension method, which is used to flatten the Where's result into one single collection.
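What SelectMany does here can be seen on in-memory collections. The following is a LINQ to Objects sketch with cut-down versions of the question's classes and made-up sample data; `SelectManySketch` and `DetailsForCustomer` are hypothetical names, and no EF context is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category { public string CategoryName { get; set; } }

public class OrderDetail
{
    public int OrderDetailID { get; set; }
    public int CustomerID { get; set; }
}

public class Order
{
    public int OrderID { get; set; }
    public List<OrderDetail> OrderDetails { get; set; } = new List<OrderDetail>();
    public List<Category> Categories { get; set; } = new List<Category>();
}

public static class SelectManySketch
{
    // Keeps orders that have at least one category, then flattens the
    // per-order detail lists into a single list for one customer.
    public static List<OrderDetail> DetailsForCustomer(IEnumerable<Order> orders, int custID)
    {
        return orders
            .Where(o => o.Categories.Count > 0)
            .SelectMany(o => o.OrderDetails.Where(od => od.CustomerID == custID))
            .ToList();
    }

    public static void Main()
    {
        var withCategory = new Order { OrderID = 1 };
        withCategory.Categories.Add(new Category { CategoryName = "Books" });
        withCategory.OrderDetails.Add(new OrderDetail { OrderDetailID = 1, CustomerID = 7 });
        withCategory.OrderDetails.Add(new OrderDetail { OrderDetailID = 2, CustomerID = 8 });

        var withoutCategory = new Order { OrderID = 2 };
        withoutCategory.OrderDetails.Add(new OrderDetail { OrderDetailID = 3, CustomerID = 7 });

        foreach (var od in DetailsForCustomer(new[] { withCategory, withoutCategory }, 7))
        {
            Console.WriteLine(od.OrderDetailID);
        }
    }
}
```

Without SelectMany, a plain Select would return one filtered list per order rather than one flat sequence of OrderDetail objects.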
Edit 1
Since you have disabled lazy loading, the Order navigation property of each OrderDetail returned by my query is null. One option is to use the Load method when you consume the result:
foreach(var od in query.ToList())
{
    // Load the order related to a given OrderDetail
    _dbCtx.Entry(od).Reference(p => p.Order).Load();
    // Load the Categories related to the order
    _dbCtx.Entry(od.Order).Collection(o => o.Categories).Load();
}
Another option could be returning an anonymous type:
var query =_dbCtx.Orders.Include("OrderDetails")
.Include("Categories")
.Where(o=>o.Categories.Count>0)
.SelectMany(o=>o.OrderDetails.Where(od=>od.CustomerID == custID).Select(od=>new {Order=o,OrderDetail=od}));
But I don't like either of these solutions. The most direct way is the query that you already had from the beginning.
The default setting for Entity Framework is to allow lazy loading and dynamic proxies.
And in this case, since you are using the virtual keyword on the navigation properties, these 'should' (unless you have disabled it in EF) load with lazy loading.
Lazy loading loads the related entities when you need them. Example:
var load = data.Orders.SelectMany(x => x.OrderDetails).ToList(); // Would load all OrderDetails into a list.
// Below would load all OrderDetails whose order has an OrderID smaller than 5
var loadSpecific = data.Orders.Where(x => x.OrderID < 5).SelectMany(x => x.OrderDetails).ToList();
The case you are describing is eager loading (the 'Include' statements). There is nothing wrong with it, but if you are planning on using it I would consider the syntax below instead. It gives a compilation error if you ever rename the navigation property.
// The lambda overload of Include needs: using System.Data.Entity;
var load = data.Orders
    .Include(x => x.OrderDetails)
    .Include(x => x.Categories);
I suggest you take 10-15 minutes of time and read up on it in this article:
https://msdn.microsoft.com/en-us/data/jj574232.aspx

I need help speeding up this EF LINQ query

I am using EntityFramework 6 and running into some major speed issues -- this query is taking over two seconds to run. I have spent the better part of the day using LinqPad in order to speed up the query but I could only get it down from 4 to two seconds. I have tried grouping, joins, etc. but the generated SQL looks overly complicated to me. I am guessing that I am just taking the wrong approach to writing the LINQ.
Here is what I am attempting to do
Find all A where Valid is null and AccountId isn't the current user
Make sure the Collection of B does not contain any B where AccountId is the current user
Order the resulting A by the number of B in its collection in descending order
Any A that doesn't have any B should be at the end of the returned results.
I have two models which look like this:
public class A
{
public int Id { get; set; }
public bool? Valid { get; set; }
public string AccountId { get; set; }
public virtual ICollection<B> Collection { get; set; }
}
public class B
{
public int Id { get; set; }
public bool Valid { get; set; }
public string AccountId { get; set; }
public DateTime CreatedDate { get; set; }
public virtual A Property { get; set; }
}
The table for A has about one million rows and B will eventually have around ten million. Right now B is sitting at 50,000.
Here is what the query currently looks like. It gives me the expected results but I have to run an orderby multiple times and do other unnecessary steps:
var filterA = this.context.A.Where(gt => gt.Valid == null && !gt.AccountId.Contains(account.Id));
var joinedQuery = from b in this.context.B.Where(gv => !gv.AccountId.Contains(account.Id))
                  join a in filterA on b.Property equals a
                  where !a.Collection.Any(v => v.AccountId.Contains(account.Id))
                  let count = a.Collection.Count()
                  orderby count descending
                  select new { A = a, Count = count };
IQueryable<A> output = joinedQuery
    .Where(t => t.A != null)
    .Select(t => t.A)
    .Distinct()
    .Take(20)
    .OrderBy(t => t.Collection.Count);
Thanks
Well, you could always try removing these two lines from joinedQuery:
where !a.Collection.Any(v => v.AccountId.Contains(account.Id))
and
orderby count descending
The first condition has already been applied in the first query. As for the orderby line, you do the ordering in the last query anyway, so there is no point in doing it twice.
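For reference, the four requirements listed in the question can be written as one filter followed by one descending sort. The sketch below is LINQ to Objects over cut-down versions of the A and B classes; `CandidateSketch` and `Candidates` are hypothetical names, the account comparison uses plain equality instead of the question's Contains (an assumption), and whether EF6 produces faster SQL from this shape has not been measured here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class B
{
    public string AccountId { get; set; }
}

public class A
{
    public int Id { get; set; }
    public bool? Valid { get; set; }
    public string AccountId { get; set; }
    public List<B> Collection { get; set; } = new List<B>();
}

public static class CandidateSketch
{
    // 1. Valid is null and the A does not belong to the current account.
    // 2. None of its B entries belongs to the current account.
    // 3. Most B entries first; an A without any B has count 0 and sorts last.
    public static List<A> Candidates(IEnumerable<A> all, string currentAccountId, int take)
    {
        return all
            .Where(a => a.Valid == null
                        && a.AccountId != currentAccountId
                        && !a.Collection.Any(b => b.AccountId == currentAccountId))
            .OrderByDescending(a => a.Collection.Count)
            .Take(take)
            .ToList();
    }

    public static void Main()
    {
        var items = new List<A>
        {
            new A { Id = 1, AccountId = "other", Collection = { new B { AccountId = "x" }, new B { AccountId = "y" } } },
            new A { Id = 2, AccountId = "other" },
            new A { Id = 3, AccountId = "me" },
            new A { Id = 4, AccountId = "other", Collection = { new B { AccountId = "me" } } },
        };

        foreach (var a in Candidates(items, "me", 20))
        {
            Console.WriteLine($"{a.Id} ({a.Collection.Count})");
        }
    }
}
```

Because the sort is descending on the count, no separate step is needed to move the A rows without any B to the end.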

Inserting item into DataGrid

I have the following tables:
I'm using Entity Framework Database First, therefore the following entity class is generated:
public partial class Sal1 {
public string SaleID { get; set; }
public string ItemID { get; set; }
public int Quantity { get; set; }
public decimal Total { get; set; }
public virtual Item Item { get; set; }
public virtual Sale Sale { get; set; }
}
Then put the Sal1 rows into a datagrid like this:
private List<Sal1> saleItems = new List<Sal1>();
...
var query = from sa in db.Sal1
where sa.SaleID.Equals(tempSale)
select sa;
foreach(Sal1 si in query) {
saleItems.Add(si);
}
...
dgDetails.ItemsSource = saleItems;
But it turns out like this:
My question is, how should I tweak the query above so that I get the equivalent of the following SQL:
select T0.SaleID, T0.ItemID, T1.Name, T0.Quantity, T0.Total
from Sal1 T0 inner join Item T1 on T0.ItemID = T1.ItemID;
Thanks in advance.
EDIT: I seem to have found a solution, but I had to do this:
var query = from sa in db.Sal1
where sa.SaleID.Equals(tempSale)
select new { sa.SaleID, sa.ItemID, sa.Item.Name,
sa.Item.Manufacturer, sa.Quantity, sa.Total };
And I had to change the type of saleItems to List<object>:
private List<object> saleItems = new List<object>();
Is this the best way to do it?
Just like SQL, LINQ also supports JOINs. You can read more about their syntax here. You should change your query accordingly to get your results. Instead of spoonfeeding the exact answer, I'm guiding you to a more detailed explanation, as it contains valuable information that will help you in the future too.
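As an illustration of the join syntax, here is a sketch that projects into a named row type instead of an anonymous one, so the list can stay strongly typed rather than List<object>. It runs on in-memory collections; `SaleRow` and `JoinSketch` are hypothetical names, the entity classes are cut down to the columns in the question's SQL, and the sample values are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string ItemID { get; set; }
    public string Name { get; set; }
}

public class Sal1
{
    public string SaleID { get; set; }
    public string ItemID { get; set; }
    public int Quantity { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical row type holding exactly the columns the grid should show.
public class SaleRow
{
    public string SaleID { get; set; }
    public string ItemID { get; set; }
    public string Name { get; set; }
    public int Quantity { get; set; }
    public decimal Total { get; set; }
}

public static class JoinSketch
{
    // Inner join of sale lines to items on ItemID, filtered to one sale.
    public static List<SaleRow> RowsForSale(IEnumerable<Sal1> sales, IEnumerable<Item> items, string saleId)
    {
        var query = from sa in sales
                    where sa.SaleID == saleId
                    join it in items on sa.ItemID equals it.ItemID
                    select new SaleRow
                    {
                        SaleID = sa.SaleID,
                        ItemID = sa.ItemID,
                        Name = it.Name,
                        Quantity = sa.Quantity,
                        Total = sa.Total,
                    };
        return query.ToList();
    }

    public static void Main()
    {
        var items = new[] { new Item { ItemID = "I1", Name = "Keyboard" } };
        var sales = new[] { new Sal1 { SaleID = "S1", ItemID = "I1", Quantity = 2, Total = 50m } };

        foreach (var row in RowsForSale(sales, items, "S1"))
        {
            Console.WriteLine($"{row.SaleID} {row.ItemID} {row.Name} {row.Quantity} {row.Total}");
        }
    }
}
```

With a type like this, saleItems could be declared as List<SaleRow>, and the grid would get one column per property.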

EF Code First improving performance for self referencing, one to many relationships

I have an AccountGroup which is a self-referencing entity. A leaf AccountGroup can contain 1 or more Accounts. Both entities have Balance property. Each AccountGroup has a Balance which is either a sum of Balances in sub-groups or sum of Balances of all Accounts (in case of leaf group).
In order to build a tree listing of all AccountGroups and Accounts I have to traverse this object graph recursively, which causes a lot (I mean a lot!!!) of calls to DB...
Is there any way to improve upon this in such way that # of DB calls is reduced?
Thanks
Here is the trimmed down code
Account (belongs to only 1 AccountGroup)
public class Account
{
public int Id { get; set; }
public int GroupId { get; set; }
public string Name { get; set; }
public decimal Balance { get; set; }
public string AccountType { get; set; }
public virtual AccountGroup Group { get; set; }
}
AccountGroup (has 0 or many AccountGroups, has 1 or more Accounts if it is a leaf)
public class AccountGroup
{
public AccountGroup()
{
Accounts = new HashSet<Account>();
Groups = new HashSet<AccountGroup>();
}
public int Id { get; set; }
public bool IsRoot { get { return Parent == null; } }
public bool IsLeaf { get { return !Groups.Any(); } }
public decimal Balance { get { return IsLeaf ? Accounts.Sum(a => a.Balance) : Groups.Sum(g => g.Balance); } } // if leaf group, get sum of all account balances, otherwise get sum of all subgroups
public int? ParentId { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public virtual ISet<Account> Accounts { get; private set; }
public virtual ISet<AccountGroup> Groups { get; private set; }
public virtual AccountGroup Parent { get; set; }
}
Calling Code
// start processing root groups (ones without parent)
foreach (var rootGroup in db.AccountGroups.Include(g=>g.Groups).Where(g => g.ParentId == null))
{
TraverseAccountGroup(rootGroup, 0);
}
// recursive method
private static void TraverseAccountGroup(AccountGroup accountGroup, int level)
{
//
// process account group
//
Console.WriteLine("{0}{1} ({2})", String.Empty.PadRight(level * 2, '.'), accountGroup.Name, level);
//
// if subgroups exist, process recursively
//
if (accountGroup.Groups.Any())
{
foreach (var subGroup in accountGroup.Groups)
{
TraverseAccountGroup(subGroup, level + 1);
}
}
//
// otherwise, process accounts belonging to leaf subgroup
//
else
{
foreach (var account in accountGroup.Accounts)
{
Console.WriteLine("ACCOUNT [{0}]", account.Name);
}
}
}
CTE Approach
There are two ways to speed up queries against tree-structured data. The first (and likely easiest) is to use a stored procedure and EF's raw SQL execution to load the tree. The stored procedure's execution plan is cached, which speeds up repeated execution. My recommendation for the query in the sproc would be a recursive CTE.
http://msdn.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
WITH <CTEName> AS
(
    SELECT
        <Root Query>
    FROM <TABLE>
    UNION ALL
    SELECT
        <Child Query>
    FROM <TABLE>
    INNER JOIN <CTEName>
        ON <CTEJoinCondition>
    WHERE
        <TERMINATION CONDITION>
)
SELECT * FROM <CTEName>
Edit
Execute your sproc or CTE inline with:
DbContext ctx = new SampleContext();
ctx.Database.SqlQuery<YourEntityType>(@"SQL OR SPROC COMMAND HERE", new[] { "Param1", "Param2", "Etc" });
Flatten Your Tree Structure
The second approach is to build a flat representation of your tree. You can flatten a tree into a flat structure for quick querying and then use a linkage between the flat structure and the actual tree node to cut out the self referencing entity. You can build the flat structure using the above recursive CTE query.
This is just one approach but there are many papers on the subject:
http://www.governor.co.uk/news-plus-views/2010/5/17/depth-first-tree-flattening-with-the-yield-keyword-in-c-sharp/
EDIT: Adding additional clarification
Just a note: the recursive CTE caches the symbols for the query before iterating over the structure. This is the fastest and simplest way to write a query to solve your problem. However, this HAS to be a SQL query. You can execute the SQL directly or you can execute a sproc. Sprocs cache the execution plan after being run, so they perform better than ad hoc queries that have to build an execution plan prior to running. This is entirely up to you.
The issue with a flat representation of your tree is that you have to routinely rebuild or constantly maintain the flat structure. Your query path determines which flattening algorithm you should use, but the end result remains the same. The flat structure is the only way to "accomplish" what you want to do inside EF without having to cheat and execute raw SQL through the DbConnection.
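Once the groups are available as flat rows, whether from the CTE, a flattened table, or a single read of the whole table, the tree can be assembled in memory without further database calls. This sketch covers only that in-memory step and is not the CTE itself; `GroupRow` and `TreeSketch` are hypothetical names, the rows are invented, and the data is assumed to contain no cycles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row: one per AccountGroup, as a single query might return it.
public class GroupRow
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string Name { get; set; }
}

public static class TreeSketch
{
    // Returns one line per group in depth-first order, indented by level.
    public static List<string> Render(IEnumerable<GroupRow> rows)
    {
        // Index the children once; a lookup accepts a null key, which holds the roots.
        var childrenByParent = rows.ToLookup(r => r.ParentId);
        var lines = new List<string>();

        void Visit(int? parentId, int level)
        {
            foreach (var row in childrenByParent[parentId])
            {
                lines.Add(new string('.', level * 2) + row.Name);
                Visit(row.Id, level + 1);
            }
        }

        Visit(null, 0);
        return lines;
    }

    public static void Main()
    {
        var rows = new[]
        {
            new GroupRow { Id = 1, ParentId = null, Name = "Assets" },
            new GroupRow { Id = 2, ParentId = 1, Name = "Cash" },
            new GroupRow { Id = 3, ParentId = 1, Name = "Bank" },
            new GroupRow { Id = 4, ParentId = null, Name = "Liabilities" },
        };

        foreach (var line in Render(rows))
        {
            Console.WriteLine(line);
        }
    }
}
```

The traversal has the same shape as TraverseAccountGroup in the question, but every child lookup is served from memory, so the number of round trips no longer grows with the depth of the tree.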
