"Execution Timeout" on converting LINQ query results using ToList() - c#

As the title states, I'm getting a "Wait operation timed out" message (inner exception message: "Timeout expired") on a module I'm maintaining. Every time the app tries to materialize the query results using ToList(), it times out regardless of the number of results.
Reason this needs to be converted to a list: the results need to be exported to Excel for download.
Below is the code:
public Tuple<IEnumerable<ProductPriceSearchResultDto>, int> GetProductPriceSearchResults(ProductPriceFilterDto filter, int? pageNo = null)
{
//// Predicate builder
var predicate = GetProductPriceSearchFilter(filter);
//// This runs for approx. 1 minute before throwing a "Wait operation timed out" message...
var query = this.GetProductPriceSearchQuery()
.Where(predicate)
.Distinct()
.OrderBy(x => x.DosageFormName)
.ToList();
return Tuple.Create<IEnumerable<ProductPriceSearchResultDto>, int>(query, 0);
}
My query:
var query = (from price in this.context.ProductPrice.AsExpandable()
join product in this.context.vwDistributorProducts.AsExpandable()
on price.DosageFormCode equals product.DosageFormCode
join customer in this.context.vwCustomerBranch.AsExpandable()
on price.CustCd equals customer.CustomerCode
where price.CountryId == CurrentUserService.Identity.CountryId && !product.IsInactive
select new { price.PriceKey, price.EffectivityDateFrom, price.ContractPrice, price.ListPrice,
product.DosageFormName, product.MpgCode, product.DosageFormCode,
customer.CustomerName }).GroupBy(x => x.DosageFormCode)
.Select(x => x.OrderByDescending(y => y.EffectivityDateFrom).FirstOrDefault())
.Select(
x =>
new ProductPriceSearchResultDto
{
PriceKey = x.PriceKey,
DosageFormCode = x.DosageFormCode,
DosageFormName = x.DosageFormName,
EffectiveFrom = x.EffectivityDateFrom,
Price = x.ListPrice,
MpgCode = x.MpgCode,
ContractPrice = x.ContractPrice,
CustomerName = x.CustomerName
});
return query;
Notes:
ProductPrice is a table and has a non-clustered index pointing at columns CountryId and DosageFormCode.
vwDistributorProducts and vwCustomerBranch are views copied from the client's database.
I'm already at my wit's end. How do I get rid of this error? Is there something in the code that I need to change?
Edit: As far as possible, I don't want to resort to setting a command timeout because 1) the app has been doing fine without it so far, except for this function, and 2) this is already a huge application and I don't want to put the other modules' performance at risk.
Any help would be greatly appreciated. Thank you.

I'd try logging the SQL this translates into.
The actual SQL can then be used to get the query plan, which may lead you closer to the root cause.
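A minimal sketch of that logging step, assuming the module uses Entity Framework 6 (the question does not state the version). `QueryLogging` and `EnableSqlLogging` are hypothetical names; `Database.Log` is the EF6 hook that receives each generated command as text:

```csharp
using System.Data.Entity;
using System.Diagnostics;

public static class QueryLogging
{
    // Sends every SQL command EF6 generates for this context to the debug output.
    // Assumption: 'context' is the EF6 DbContext behind this.context in the question.
    public static void EnableSqlLogging(DbContext context)
    {
        context.Database.Log = sql => Debug.WriteLine(sql);
    }
}
```

Call `QueryLogging.EnableSqlLogging(this.context)` before the `ToList()` call, copy the logged SELECT into SQL Server Management Studio, and inspect its execution plan there.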

Related

Large database linq query to sql server takes forever

Background
So, I am using a React frontend and a .NET Core 3.1 backend for a web app where I display a view with a list of data. The list is often several thousand rows long; in this case it's around 7500. We virtualize it to prevent sluggishness. Along with the data, every row has a column showing the latest log change someone made on that data row. The logs and the rest of the data for every row come from two different applications with their own databases. The log data, which consists of the name and the date of when the log was made, is also supposed to render for every row.
The problem
When you route to the page, a useEffect fires that fetches the rows from one of the databases. When I get the response, I extract all of the ids from the data and then post that list to the other endpoint to request the latest log for every id. This endpoint queries the logging database. The number of ids I am passing to the endpoint is about 7200+. It won't always be this many, but sometimes it is.
Troubleshooting
This is the query that is giving me trouble in the log endpoint
public async Task<IActionResult> GetLatestLog(ODataActionParameters parameters)
{
var LogIds= (LogIds)parameters["LogIds"];
var results = await context.Set<LogEvent>()
.Where(x => LogIds.Ids.Contains(x.Id)).ToListAsync(); //55 600 entities
results = results
.GroupBy(x => x.ContextId)
.Select(x => x.OrderByDescending(p => p.CreationDate).First()).ToList(); //7 500 entities
var transformed = results.Select(MapEntityToLogEvent).ToList();
return Ok(transformed);
}
The first db query takes around 25 seconds (!) and returns around 56000 entities.
The second LINQ step takes about 2 seconds and returns around 7500 entities, and the mapping takes around 1 second.
The database is SQL server, and there are three indexes, one of which is Id, the other two are irrelevant for this assignment.
I have tried different queries, AsNoTracking, but to no avail.
Obviously this is horrible. Do you know of a way to optimize this query?
There are two ways to improve your query:
Pure EF Core
We can rewrite the LINQ query so that it is translatable and avoids pulling unnecessary records to the client side. Note that your original GroupBy will work with EF Core 6:
public async Task<IActionResult> GetLatestLog(ODataActionParameters parameters)
{
var LogIds = (LogIds)parameters["LogIds"];
var results = context.Set<LogEvent>()
.Where(x => LogIds.Ids.Contains(x.Id));
results =
from d in results.Select(d => new { d.ContextId }).Distinct()
from r in results
.Where(r => r.ContextId == d.ContextId)
.OrderByDescending(r => r.CreationDate)
.Take(1)
select r;
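// MapEntityToLogEvent is a C# method, so it is applied in memory after the query has executed
var transformed = (await results.ToListAsync()).Select(MapEntityToLogEvent).ToList();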
return Ok(transformed);
}
Using third party extension
With linq2db.EntityFrameworkCore we can use the full power of SQL and build the most efficient query for this case.
A big list of ids can be copied quickly to a temporary table and used in the resulting query.
Retrieving only the latest record per ContextId can be done efficiently with the window function ROW_NUMBER.
Disclaimer: I'm the maintainer of this library.
// helper class for creating temporary table
class IdsTable
{
public int Id { get; set; }
}
public async Task<IActionResult> GetLatestLog(ODataActionParameters parameters)
{
var LogIds = (LogIds)parameters["LogIds"];
using var db = context.CreateLinqToDBConnection();
TempTable<IdsTable>? idsTable = null;
var results = context.Set<LogEvent>().AsQueryable();
try
{
// avoid using temporary table for small amount of Ids
if (LogIds.Ids.Count() < 20)
{
results = results.Where(x => LogIds.Ids.Contains(x.Id));
}
else
{
// initializing temporary table
idsTable = await db.CreateTempTableAsync(LogIds.Ids.Select(id => new IdsTable { Id = id }), tableName: "temporaryIds");
// filter via join
results =
from t in idsTable
join r in results on t.Id equals r.Id
select r;
}
// selecting last log
results =
from r in results
select new
{
r,
rn = Sql.Ext.RowNumber().Over()
.PartitionBy(r.ContextId)
.OrderByDesc(r.CreationDate)
.ToValue()
} into s
where s.rn == 1
select s.r;
// MapEntityToLogEvent is a C# method, so it is applied in memory after the query has executed
var entities = await results
.ToListAsyncLinqToDB(); // we have to use our extension because of name collision with EF Core extensions
return Ok(entities.Select(MapEntityToLogEvent).ToList());
}
finally
{
// dropping temporary table if it was used
idsTable?.Dispose();
}
}
Warning
Also note that the number of logs will keep growing, so you should limit the result set by date and probably also cap the number of retrieved records.
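A minimal sketch of such a limit, written against plain `IQueryable` so it can sit in front of either query above. The `LogEvent` class here is a stand-in with only the members used, and `LimitToRecent`, `since` and `maxRows` are hypothetical names:

```csharp
using System;
using System.Linq;

// Stand-in for the LogEvent entity from the question; only the members used here.
public class LogEvent
{
    public int Id { get; set; }
    public int ContextId { get; set; }
    public DateTime CreationDate { get; set; }
}

public static class LogEventQueryExtensions
{
    // Keeps only events created on or after 'since', newest first, capped at 'maxRows'.
    public static IQueryable<LogEvent> LimitToRecent(
        this IQueryable<LogEvent> source, DateTime since, int maxRows)
    {
        return source
            .Where(x => x.CreationDate >= since)
            .OrderByDescending(x => x.CreationDate)
            .Take(maxRows);
    }
}
```

For example, `context.Set<LogEvent>().LimitToRecent(DateTime.UtcNow.AddDays(-90), 100000)` would bound the scan; both numbers are arbitrary placeholders. Note that a row cap applied before the "latest per ContextId" step can drop contexts whose newest log falls outside the cap, so the date filter is the safer of the two limits.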

C# Linq expressions can't loop through data results?

I am trying to loop through the IQueryable results data but I get an error at the loop?
var pivot = from f in query
group f by new
{
Account = f.Account
}
into g
select new
{
Account = g.Key.Account,
Com = g.Where(d => d.Party == "Com").Sum(d => d.Amount),
};
foreach (var item in pivot)
{
Console.WriteLine($"\t {item.Account} {item.Com}");
}
I just want to see what is my data after I manipulate it.
The error message I get is:
System.InvalidOperationException
"Processing of the LINQ expression
'AsQueryable(Where(\r\n source:
NavigationTreeExpression\r\n Value:
default(IGrouping<<>f__AnonymousType1, StepTwo>)\r\n
Expression: (Unhandled parameter: e6), \r\n predicate: (d) =>
d.Party == \"Com\"))' by 'NavigationExpandingExpressionVisitor'
failed. This may indicate either a bug or a limitation in EF Core. See
https://go.microsoft.com/fwlink/?linkid=2101433 for more detailed
information."
Below is the query used to create query
var query = from inn in db.InputTE.Take(getRecord)
join y in db.InputYEM on inn.YPerform equals y.YPerform
select new StageTwo
{
Party = inn.Party,
Account = y.Account,
Amount = inn.Amount
};
The error message is, essentially, saying that LINQ to Entities isn't able to adequately translate your expression into SQL. It would have to load the entire data set into memory to process it. The link in the error message goes into great detail about the problem; it's worth reading.
This error is new in EF Core 3.0. Previously, EF would quietly proceed to load the data set into memory, which often led to devs writing inefficient queries without realizing it.
Try simplifying the query by moving the where clause before the group by in your query.
This expression might not be exactly what you need, but I think it's close.
from f in query
where f.Party == "Com"
group f by f.Account into g
select new { Account = g.Key, Com = g.Sum(d => d.Amount) }
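Filtering before grouping drops every account that has no "Com" rows. If those accounts should still appear with a total of 0, a conditional sum inside the group is an alternative. The sketch below uses a stand-in `StageTwo` class and a hypothetical `ComTotals` helper; whether EF Core 3.x translates the conditional into `SUM(CASE WHEN ...)` for this particular model is an assumption to verify against the generated SQL:

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the projected row type in the question.
public class StageTwo
{
    public string Party { get; set; }
    public string Account { get; set; }
    public decimal Amount { get; set; }
}

public static class PivotQueries
{
    // Sums the "Com" amounts per account; accounts without "Com" rows get a total of 0.
    public static Dictionary<string, decimal> ComTotals(IQueryable<StageTwo> query)
    {
        var pivot =
            from f in query
            group f by f.Account into g
            select new
            {
                Account = g.Key,
                Com = g.Sum(d => d.Party == "Com" ? d.Amount : 0m)
            };

        // The query runs when ToDictionary enumerates it.
        return pivot.ToDictionary(x => x.Account, x => x.Com);
    }
}
```

The returned dictionary can then be looped over with `foreach`, just like the original `pivot`.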

Entity Framework Group By Then Order By

I'm generating a query using Entity Framework which uses a group by clause and then attempts to order each of the groups to get specific data. I attempted to optimize the order by so that it only happens once, using a let statement; the query still executes, but the results are incorrect.
Concept:
var results =
(from n in noteEntities.NoteLog
where associatedIDs.Contains(n.AssociatedID)
group n by n.AssociatedID into gn
let ogn = gn.OrderByDescending(t => t.CreatedDateTime)
let successNote = ogn.FirstOrDefault(x => x.Type == "Success")
let lastStatusNote = ogn.FirstOrDefault()
select new { Success = successNote, Status = lastStatusNote, AssociatedID = gn.Key }).ToList();
However, the problem is that ogn, which should be the ordered let variable, is not treated as a list ordered descending in the subsequent let statements, so I'm getting the wrong success and status notes. I've also tried changing things up to create a sub-query and reference the result, but that doesn't seem to return an ordered list either, e.g.:
var subQuery =
(from n in noteEntities.NoteLog
where associatedIDs.Contains(n.AssociatedID)
group n by n.AssociatedID into gn
select gn.OrderByDescending(t => t.CreatedDateTime));
var results =
(from s in subQuery
let successNote = s.FirstOrDefault(x => x.Type == "Success")
let lastStatusNote = s.FirstOrDefault()
select new { Success = successNote, Status = lastStatusNote }).ToList();
I can make this work by using OrderByDescending twice in the select statement or let statements for the success and status notes but this becomes very slow, and redundant, when there are a lot of notes. Is there a way to run the order by only once and get the right results back?
In SQL a subquery with ORDER BY must have a TOP clause (yours does not). When LINQ detects that no FirstOrDefault or Take statement is applied directly to the ordered subquery, it just strips the OrderByDescending.
If you are having a performance problem with the query perhaps you should look into indexing the table.
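If the filtered notes fit comfortably in memory, one way to order only once is to let the database do the filtering and do the grouping and ordering with LINQ to Objects, where the ordering of `ogn` is preserved. This is a sketch under that assumption, not the approach from the answer above; `Note`, `NoteSummary` and `Summarize` are hypothetical stand-ins:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the NoteLog entity; only the members used here.
public class Note
{
    public int AssociatedID { get; set; }
    public string Type { get; set; }
    public DateTime CreatedDateTime { get; set; }
}

public class NoteSummary
{
    public int AssociatedID { get; set; }
    public Note Success { get; set; }
    public Note Status { get; set; }
}

public static class NoteQueries
{
    public static List<NoteSummary> Summarize(IQueryable<Note> notes, List<int> associatedIDs)
    {
        // The Where still belongs to the provider query; AsEnumerable() moves
        // everything after it (grouping, ordering) into memory.
        var filtered = notes
            .Where(n => associatedIDs.Contains(n.AssociatedID))
            .AsEnumerable();

        return (from n in filtered
                group n by n.AssociatedID into gn
                // Ordered once per group; ToList() keeps the sorted result for both lookups.
                let ogn = gn.OrderByDescending(t => t.CreatedDateTime).ToList()
                select new NoteSummary
                {
                    AssociatedID = gn.Key,
                    Success = ogn.FirstOrDefault(x => x.Type == "Success"),
                    Status = ogn.FirstOrDefault()
                }).ToList();
    }
}
```

The trade-off is that every matching note row is transferred to the application, so this only pays off when the filter is selective.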

Better loading performance with EF code first and MVC 4

I am trying to get better (= faster) responses in my MVC 4 project, mainly in the Web API part. I added MiniProfiler to see where the slow loading comes from, but I can't figure it out.
step                                                duration (ms)  from start (ms)  query time (ms)
http://www.url.com:80/api/day?city=param (example)         1396.1             +0.0  1 sql 173.8
logging                                                       9.3           +520.9
EF query                                                   4051.5           +530.2  2 sql 169.6
then when I tried same url again I have these numbers:
http://www.url.com:80/api/day?city=param (example)          245.6             +0.0  1 sql 50.6
logging                                                       8.6            +19.6
EF query                                                      7.7            +28.3
but when I tried it after 2 mins later I get again big numbers like in first example.
Same with loading Home Index:
http://www.blanskomenu.amchosting.cz:80/   333.0    +0.0
Controller: HomeController.Index            71.0  +286.8
Find: Index                                100.4  +387.8
Render: Index                             2468.1  +494.6
This is my method for Web Api in first example
[OutputCache(CacheProfile = "Cache1Hour", VaryByParam = "city")]
public IEnumerable<RestaurantDayMealsView> GetDay(string city)
{
var profiler = MiniProfiler.Current;
using (profiler.Step("logging"))
{
var logFile = new LogFile(System.Web.HttpContext.Current.Server.MapPath("~/Logs/"), DateTime.Today);
logFile.Write(String.Format("{0},api/daymenu,{1}", DateTime.Now, city));
}
using (profiler.Step("EF query"))
{
var meals = repo.GetAllDayMealsForCity(city);
if (meals == null)
{
throw new HttpResponseException(Request.CreateResponse(HttpStatusCode.NotFound));
}
return meals;
}
}
and my repository method:
public IEnumerable<RestaurantDayMealsView> GetAllDayMealsForCity(string city)
{
return db.Restaurants
.Include(rest => rest.Meals)
.Where(rest => rest.City.Name == city)
.OrderBy(r => r.Order)
.AsEnumerable()
.Select(r => new RestaurantDayMealsView()
{
Id = r.Id,
Name = r.Name,
Meals = r.Meals.Where(meal => meal.Date == DateTime.Today).ToList(),
IsPropagated = r.IsPropagated
}).Where(r => r.Meals.Count > 0);
}
for my Home Index I have in my controller just:
public ActionResult Index()
{
return View();
}
So my questions are:
Why is rendering of Index taking so long? I have just the default website, so I don't think the problem is with CSS and other things.
What is taking so long in the "EF query" step when it is not the SQL query itself? How can I fix these problems?
I was looking at these links: SO list and ASP.NET MVC Overview - Performance, and I tried some tricks and read about others, but nothing helped much. Is it possible that the problem is with the hosting? Or where? Thanks
It looks like you've got an N+1 query issue in your repository method. Using Include is only optimized if you don't modify the collection (i.e. use something like Where on it). When you do that, EF will re-fetch the records from the database. You need to convert Meals to a List first, and then run your Where clause. That will essentially freeze the pre-selected results for Meals and then filter them in memory instead of at the database.
Meals = r.Meals.ToList().Where(meal => meal.Date == DateTime.Today).ToList(),
1.
In your Repository.GetAllDayMealsForCity() method:
return db.Restaurants
.Include(rest => rest.Meals)
.Where(rest => rest.City.Name == city)
.OrderBy(r => r.Order)
.AsEnumerable() // <-- Materializing the query before projection
.Select(r => new RestaurantDayMealsView()
{
Id = r.Id,
Name = r.Name,
Meals = r.Meals.Where(meal => meal.Date == DateTime.Today).ToList(),
IsPropagated = r.IsPropagated
}).Where(r => r.Meals.Count > 0);
You call AsEnumerable() before projecting the results with the Select method. Remember that AsEnumerable() marks the point where the query is materialized: everything before it is executed by the database, and everything after it runs on in-memory objects. Because you call it before Select, the query does not limit the results to the data needed by RestaurantDayMealsView; the projection is done in memory rather than in the data store.
Also, your last Where could be moved before the AsEnumerable() call.
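A sketch of the repository method with the filtering and projection moved in front of `AsEnumerable()`. The entity classes are stand-ins with only the members used, and passing the `IQueryable<Restaurant>` in as a parameter is an assumption made so the example is self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the entities in the question; only the members used here.
public class City { public string Name { get; set; } }
public class Meal { public DateTime Date { get; set; } }
public class Restaurant
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Order { get; set; }
    public bool IsPropagated { get; set; }
    public City City { get; set; }
    public List<Meal> Meals { get; set; } = new List<Meal>();
}
public class RestaurantDayMealsView
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Meal> Meals { get; set; }
    public bool IsPropagated { get; set; }
}

public static class DayMealsQuery
{
    public static List<RestaurantDayMealsView> GetAllDayMealsForCity(
        IQueryable<Restaurant> restaurants, string city)
    {
        var today = DateTime.Today; // captured once and passed to the query as a value

        return restaurants
            .Where(r => r.City.Name == city)
            .Where(r => r.Meals.Any(m => m.Date == today)) // replaces the trailing Count > 0 filter
            .OrderBy(r => r.Order)
            .Select(r => new
            {
                r.Id,
                r.Name,
                r.IsPropagated,
                Meals = r.Meals.Where(m => m.Date == today)
            })
            .AsEnumerable() // everything below runs in memory on the already reduced rows
            .Select(r => new RestaurantDayMealsView
            {
                Id = r.Id,
                Name = r.Name,
                Meals = r.Meals.ToList(),
                IsPropagated = r.IsPropagated
            })
            .ToList();
    }
}
```

In the original repository this would be called with `db.Restaurants`; the `Include` is no longer needed because the projection names the meals it wants.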
2.
The reason for the significant difference in your profiling results between the first and second hit could be that after the first time Entity Framework is querying for the data from SQL Server, it internally caches the results in memory for better performance.

Using Count with Take with LINQ

Is there a way to get the whole count when using the Take operator?
You can do both.
IEnumerable<T> query = ...complicated query;
int c = query.Count();
query = query.Take(n);
Just execute the count before the take. This will cause the query to be executed twice, but I believe that is unavoidable.
If this is in a LINQ to SQL context, as your comment implies, then this will in fact query the database twice. As far as lazy loading goes, though, it will depend on how the result of the query is actually used.
For example, say you have two tables, Product and ProductVersion, where each Product has multiple ProductVersions associated via a foreign key.
If this is your query:
var query = db.Products.Where(p => complicated condition).OrderBy(p => p.Name).ThenBy(...).Select(p => p);
where you are just selecting Products, but after executing the query:
var results = query.ToList();//forces query execution
results[0].ProductVersions;//<-- Lazy loading occurs
If you reference any foreign key or related object that was not part of the original query, then it will be lazy loaded. In your case, the count will not cause any lazy loading because it simply returns an int, but depending on what you actually do with the result of the Take(), lazy loading may or may not occur. Sometimes it can be difficult to tell whether lazy loading is occurring; to check, you should log your queries using the DataContext.Log property.
The easiest way would be to just do a Count of the query, and then do Take:
var q = ...;
var count = q.Count();
var result = q.Take(...);
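The count-then-take pattern can be wrapped in a small helper. This is a sketch; `CountThenTake` is a hypothetical name, and against a database-backed `IQueryable` it still issues two statements:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PagingExtensions
{
    // Returns the total number of rows matched by 'query' plus the first 'n' of them.
    public static (int Total, List<T> Page) CountThenTake<T>(this IQueryable<T> query, int n)
    {
        int total = query.Count();         // first execution: counts all rows
        var page = query.Take(n).ToList(); // second execution: fetches at most n rows
        return (total, page);
    }
}
```

Usage would look like `var (total, page) = q.CountThenTake(10);`. The query should carry an OrderBy so that which rows Take returns is well defined.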
It is possible to do this in a single Linq-to-SQL query (where only one SQL statement will be executed). The generated SQL does look unpleasant though, so your performance may vary.
If this is your query:
IQueryable<Person> yourQuery = People
.Where(x => /* complicated query .. */);
You can append the following to it:
var result = yourQuery
.GroupBy (x => true) // This will match all of the rows from your query ..
.Select (g => new {
// .. so 'g', the group, will then contain all of the rows from your query.
CountAll = g.Count(),
TakeFive = g.Take(5),
// We could also query for a max value.
MaxAgeFromAll = g.Max(x => x.PersonAge)
})
.FirstOrDefault();
Which will let you access your data like so:
// Check that result is not null before access.
// If there are no records to find, then 'result' will return null (because of the grouping)
if(result != null) {
var count = result.CountAll;
var firstFiveRows = result.TakeFive;
var maxPersonAge = result.MaxAgeFromAll;
}
