I am facing a performance issue when doing a LINQ subquery in C#.
The query is as follows:
var _result = _db.Prices
.Join(_db.Vendors,
e => e.VendorId,
v => v.VendorId,
(e, v) => new {
TotalPrice = e.TotalPrice,
Vendor = v.Title,
AmountPaid = _db.Amounts
.Where(p => p.PrId == e.PrId )
.GroupBy(p => p.PrId )
.Select(grp => (decimal?)grp.Sum(k => k.AmountPaid) ?? 0M)
.FirstOrDefault()
});
Prices, Vendors and Amounts are Entity Framework objects.
Firstly I am doing a join of Prices to Vendors table on VendorId and on the result I am adding a calculated field AmountPaid. The result is a subquery which returns the sum of AmountPaid column in Amounts table for the given record matched by PrId index.
All the above is returned as a JSON object in ASP.NET web api.
The result of the above for the records in a database is only 1992 rows.
The above command neeeds about 20 seconds to return the result. If I remove the AmountPaid calculation it needs only 1 second.
If I run the equivalent SQL command in SQL Server studio it needs only half a second with the AmountPaid calculation included:
SELECT Prices.*, Vendors.Title,
AmountPaid = (SELECT SUM(AmountPaid) from Amounts where Amounts.PrId = Prices.PrId GROUP BY PrId) FROM Amounts
INNER JOIN Vendors ON Amounts.VendorId = Vendors.VendorId
Is there a way to fix this?
I cannot understand the issue since the SQL command does not have an issue in execution.
Is there an error on the LINQ syntax, or another way to get in LINQ the same result set?
Related
I have a table in which I want to calculate the rank of my data based on Amount.
var result = await context.Table
.Select(i => new Table {
Rank = // want to calculate the rank based on amount
}).OrderByDescending(i => i.Amount)
.ToListAsync();
I tried using the element index but for that, I need to fetch all the data to the client first.
There's no direct implementation for RANK() in EF. You are better off with using a custom LINQ query to do what you want, like,
var query = from t in context.Table
orderby t.Amount descending
select new
{
Rank = (from o in context.Table
where o.Amount > s.Amount
select o).Count() + 1
};
If you know the SQL query you want to execute, you can create a custom function or stored procedure and then execute it using Entity Framework. Please refer this.
I have a database with the following schema:
Now, I'm trying to pull all landingpages for a domain and sort those by the first UrlFilter's FilterType that matches a certain group. This is the LINQ I've come up with so far:
var baseQuery = DbSet.AsNoTracking()
.Where(e => EF.Functions.Contains(EF.Property<string>(e, "Url"), $"\"{searchTerm}*\""))
.Where(e => e.DomainLandingPages.Select(lp => lp.DomainId).Contains(domainId));
var count = baseQuery.Count();
var page = baseQuery
.Select(e => new
{
LandingPage = e,
UrlFilter = e.LandingPageUrlFilters.FirstOrDefault(f => f.UrlFilter.GroupId == groupId)
})
.Select(e => new
{
e.LandingPage,
FilterType = e.UrlFilter == null ? UrlFilterType.NotCovered : e.UrlFilter.UrlFilter.UrlFilterType
})
.OrderBy(e => e.FilterType)
.Skip(10).Take(75).ToList();
Now, while this technically works, it's quite slow with execution times ranging from 10-30 seconds, which is not good enough for the use case. The LINQ is translated to the following SQL:
SELECT [l1].[Id], [l1].[LastUpdated], [l1].[Url], CASE
WHEN (
SELECT TOP(1) [l].[LandingPageId]
FROM [LandingPageUrlFilters] AS [l]
INNER JOIN [UrlFilters] AS [u] ON [l].[UrlFilterId] = [u].[Id]
WHERE ([l1].[Id] = [l].[LandingPageId]) AND ([u].[GroupId] = #__groupId_3)) IS NULL THEN 4
ELSE (
SELECT TOP(1) [u0].[UrlFilterType]
FROM [LandingPageUrlFilters] AS [l0]
INNER JOIN [UrlFilters] AS [u0] ON [l0].[UrlFilterId] = [u0].[Id]
WHERE ([l1].[Id] = [l0].[LandingPageId]) AND ([u0].[GroupId] = #__groupId_3))
END AS [FilterType]
FROM [LandingPages] AS [l1]
WHERE CONTAINS([l1].[Url], #__Format_1) AND #__domainId_2 IN (
SELECT [d].[DomainId]
FROM [DomainLandingPages] AS [d]
WHERE [l1].[Id] = [d].[LandingPageId]
)
ORDER BY CASE
WHEN (
SELECT TOP(1) [l2].[LandingPageId]
FROM [LandingPageUrlFilters] AS [l2]
INNER JOIN [UrlFilters] AS [u1] ON [l2].[UrlFilterId] = [u1].[Id]
WHERE ([l1].[Id] = [l2].[LandingPageId]) AND ([u1].[GroupId] = #__groupId_3)) IS NULL THEN 4
ELSE (
SELECT TOP(1) [u2].[UrlFilterType]
FROM [LandingPageUrlFilters] AS [l3]
INNER JOIN [UrlFilters] AS [u2] ON [l3].[UrlFilterId] = [u2].[Id]
WHERE ([l1].[Id] = [l3].[LandingPageId]) AND ([u2].[GroupId] = #__groupId_3))
END
OFFSET #__p_4 ROWS FETCH NEXT #__p_5 ROWS ONLY
Now my question is, how can I improve the execution time of this? Either by SQL or LINQ
EDIT: So I've been tinkering with some raw SQL and this is what I've come up with:
with matched_urls as (
select l.id, min(f.urlfiltertype) as Filter
from landingpages l
join landingpageurlfilters lpf on lpf.landingpageid = l.id
join urlfilters f on lpf.urlfilterid = f.id
where f.groupid = #groupId
and contains(Url, '"barz*"')
group by l.id
) select l.id, 5 as Filter
from landingpages l
where #domainId in (
select domainid
from domainlandingpages dlp
where l.id = dlp.landingpageid
) and l.id not in (select id from matched_urls ) and contains(Url, '"barz*"')
union select * from matched_urls
order by Filter
offset 10 rows fetch next 30 rows only
This performs somewhat okay, cutting the execution time down to ~5 seconds. As this is to be used for a table search I would however like to get it down even further. Is there any way to improve this SQL?
You're right to have a look at the generated SQL. In general, I would advise to learn SQL, write a performing SQL query and work your way back (either use a stored procedure or raw SQL, or design your LINQ query with that same philosophy.
I suspect this will be better (not tested):
var page = (
from e in baseQuery
let urlFilter = e.LandingPageUrlFilters.OrderBy(f => f.UrlFilterType).FirstOrDefault(f => f.UrlFilter.GroupId == groupId)
let filterType = urlFilter == null ? UrlFilterType.NotCovered : e.UrlFilter.UrlFilter.UrlFilterType
select new
{
LandingPage = e,
FilterType = filterType
}
).Skip(10).Take(75).ToList();
one of the way to improve the execution time is see execution plan in SSMS (SQL Server Management Studio).
After look on the execution plan you can design some indexes, or if you have no experiences with this, you can see if SSMS recommends some indexes.
Next try to create the indexes and execute the query again and see if execution time was improved.
Note: this is only one of many possible ways to improve execution time...
I am trying to do quite a simple group by, and sum, with EF Core 3.0
However am getting a strange error:
System.InvalidOperationException: 'Processing of the LINQ expression
'AsQueryable((Unhandled parameter:
y).TransactionLines)' by 'NavigationExpandingExpressionVisitor'
failed. This may indicate either a bug or a limitation in EF Core.
var creditBalances = await context.Transaction
.Include(x => x.TransactionLines)
.Include(x=>x.CreditAccount)
.Where(x => x.CreditAccount.UserAccount.Id == userAccount.Id)
.GroupBy(x => new
{
x.CreditAccount.ExternalId
})
.Select(x => new
{
x.Key.ExternalId,
amount = x.Sum(y => y.TransactionLines.Sum(z => z.Amount))
})
.ToListAsync();
I'm battling to see where an issue can arise, so not even sure where to start. I am trying to get a sum of all the transaction amounts (Which is a Sum of all the TransactionLines for each transaction - i.e. A Transaction amount is made of the lines associated to it).
I then sum up all the transactions, grouping by then CreditAccount ID.
The line, Unhandled parameter: y is worrying. Maybe my grouping and summing is out.
So start at the TransactionLines level and this is as simple as:
var q = from c in context.TransactionLines
where c.Transaction.CreditAccount.UserAccount.Id == userAccount.Id
group c by c.Transaction.CreditAccount.ExternalId into g
select new
{
ExternalId = g.Key,
Amount = g.Sum(x => x.Amount)
};
var creditBalances = await q.ToListAsync();
( You don't need any Include() since you're not returning an Entity with related data. You're projecting a custom data shape. )
Which translates to:
SELECT [c].[ExternalId], SUM([t].[Amount]) AS [Amount]
FROM [TransactionLines] AS [t]
LEFT JOIN [Transaction] AS [t0] ON [t].[TransactionId] = [t0].[Id]
LEFT JOIN [CreditAccounts] AS [c] ON [t0].[CreditAccountId] = [c].[Id]
LEFT JOIN [UserAccount] AS [u] ON [c].[UserAccountId] = [u].[Id]
WHERE [u].[Id] = #__userAccount_Id_0
GROUP BY [c].[ExternalId]
I have the next query:
select VisitLines.ProcedureId, COUNT(DISTINCT VisitLines.VisitId) as nt
from Visits
LEFT JOIN VisitLines ON Visits.Id = VisitLines.VisitId
WHERE Visits.VisitStatusId = 1 AND Visits.IsActive = 1 AND VisitLines.IsActive = 1
GROUP BY VisitLines.ProcedureId
Main question: Does ability exists to grouping by column from join using linq ? I'm wondering how to do it using 'collection' column.
Is it possible to force EF to generate COUNT(DISTINCT column) ? IQueryable.GroupBy.Select(x => x.Select(n => n.Number).Distinct().Count()) generate query with few subqueries which much slower then COUNT(DISTINCT )
I found. Need to use SelectMany with second parameter resultSelector:
dbContext.Visits.Where(x => x.IsActive)
.SelectMany(x => x.VisitLines, (v, vl) => new
{
v.Id,
vl.ProcedureId
})
.GroupBy(x => x.ProcedureId)
.Select(x => new
{
Id = x.Key,
VisitCount = x.Count()
}).ToArray();
It generates the desired SQL, but with exception that I need distinct count by visit.
And if I change VisitCount = x.Distinct().Count() then EF generates a query with few subqueries again. But the main issue resolved
My ASP.Net application has the following Linq to SQL function to get a distinct list of height values from the product table.
public static List<string> getHeightList(string catID)
{
using (CategoriesClassesDataContext db = new CategoriesClassesDataContext())
{
var heightTable = (from p in db.Products
join cp in db.CatProducts on p.ProductID equals cp.ProductID
where p.Enabled == true && (p.CaseOnly == null || p.CaseOnly == false) && cp.CatID == catID
select new { Height = p.Height, sort = Convert.ToDecimal(p.Height.Replace("\"", "")) }).Distinct().OrderBy(s => s.sort);
List<string> heightList = new List<string>();
foreach (var s in heightTable)
{
heightList.Add(s.Height.ToString());
}
return heightList;
}
}
I ran Redgate SQL Monitor which shows that this query is using a lot of resources.
Redgate is also showing that I am running the following query:
select count(distinct [height]) from product p
join catproduct cp on p.productid = cp.productid
join cat c on cp.catid = c.catid
where p.enabled=1 and p.displayfilter = 1 and c.catid = 'C2-14'
My questions are:
A suggestion to change the function so that it uses less resources?
Also, how does linq to sql generate the above query from my function? (I did not write select count(distinct [height]) from product anywhere in the code)
There are 90,000 records in the products. This category which I am trying to get the distinct list of heights has 50,000 product records
Thank you in advance,
Nick
First of all your posted sql query and linq query doesn't match at all. it's not the LINQ query rather the underlying SQL query itself performing slow. Make sure, all the columns involved in JOIN ON clause and WHERE clause and ORDER BY clause are indexed properly in order to have a better execution plan; else you will end up getting a FULL Table Scan and a File Sort and query will deemed to perform slow.
The join multiplies the number of Products the query returns. To undo that, you apply Distinct at the end. It will certainly reduce db resources if you return unique Products right away:
var heightTable = (from p in db.Products
where p.CatProducts.Any(cp => cp.CatID == catID)
&& p.Enabled && (p.CaseOnly == null || !p.CaseOnly)
select new
{
Height = p.Height,
sort = Convert.ToDecimal(p.Height.Replace("\"", ""))
}).OrderBy(s => s.sort);
This changes the join into a where clause. It saves the db engine the trouble of deduplicating the result.
If that still performs poorly, you should try to do the conversion and ordering in memory, i.e. after receiving the raw results from the database.
As for the count. I don't know where it comes from. Such queries typically get generated by paging libraries such as PagedList, but I see no trace of that in your code.
Side note: you can return ...
heightList.Select(x => x.Height.ToString()).ToList()
... instead of creating the list yourself.