Weird LINQ to SQL Timeout Issue using FirstOrDefault() - c#

I have the following code which times out:
using (var ts = new TransactionScope(TransactionScopeOption.Required, new TransactionOptions { IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted }))
{
ECWSDataContext dc = new ECWSDataContext();
IQueryable<Ticket> results = dc.Tickets;
Business.TicketStatistic statistic = results
.Select(r => new
{
GroupID = 1,
IsVoided = r.IsVoided ? 1 : 0,
IsWarning = r.TicketFilingTypeID == 5 ? 1 : 0,
TotalFelonies = r.TotalFelonies,
TotalMisdemeanors = r.TotalMisdemeanors,
TotalInfractions = r.TotalInfractions,
TotalOrdinances = r.TotalOrdinances,
TotalWarnings = r.TotalWarnings
})
.GroupBy(t => t.GroupID)
.Select(g => new Business.TicketStatistic()
{
TotalTickets = g.Count(),
TotalVoids = g.Sum(x => x.IsVoided),
TotalTicketWarnings = g.Sum(x => x.IsWarning),
TotalFelonies = g.Sum(x => x.TotalFelonies),
TotalMisdemeanors = g.Sum(x => x.TotalMisdemeanors),
TotalInfractions = g.Sum(x => x.TotalInfractions),
TotalOrdinances = g.Sum(x => x.TotalOrdinances),
TotalOffenseWarnings = g.Sum(x => x.TotalWarnings)
}).FirstOrDefault();
}
I profiled the SQL using SQL Server Profiler and grabbed the executed SQL. As expected, it contains a TOP 1. When I run the exact SQL in SQL Server Management Studio, it comes back in no time at all. Yet it continues to time out in the code. Amazingly, changing it to the following works just fine:
using (var ts = new TransactionScope(TransactionScopeOption.Required, new TransactionOptions { IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted }))
{
ECWSDataContext dc = new ECWSDataContext();
IQueryable<Ticket> results = dc.Tickets;
var stats = results
.Select(r => new
{
GroupID = 1,
IsVoided = r.IsVoided ? 1 : 0,
IsWarning = r.TicketFilingTypeID == 5 ? 1 : 0,
TotalFelonies = r.TotalFelonies,
TotalMisdemeanors = r.TotalMisdemeanors,
TotalInfractions = r.TotalInfractions,
TotalOrdinances = r.TotalOrdinances,
TotalWarnings = r.TotalWarnings
})
.GroupBy(t => t.GroupID)
.Select(g => new Business.TicketStatistic()
{
TotalTickets = g.Count(),
TotalVoids = g.Sum(x => x.IsVoided),
TotalTicketWarnings = g.Sum(x => x.IsWarning),
TotalFelonies = g.Sum(x => x.TotalFelonies),
TotalMisdemeanors = g.Sum(x => x.TotalMisdemeanors),
TotalInfractions = g.Sum(x => x.TotalInfractions),
TotalOrdinances = g.Sum(x => x.TotalOrdinances),
TotalOffenseWarnings = g.Sum(x => x.TotalWarnings)
}).ToArray();
Business.TicketStatistic statistic = stats.FirstOrDefault();
}
I understand that I am now enumerating the results before applying FirstOrDefault() to the in-memory collection. But it seems strange that executing the SQL generated by the first scenario directly in SQL Server had no problems.
Can somebody explain what is going on here? In this instance, it was a group query that always returned one row regardless, so I am lucky that I can enumerate before applying FirstOrDefault(). But for future reference, what if that query returned thousands of rows of which I only wanted the TOP 1?
ADDITIONAL INFO
The SQL using .FirstOrDefault():
SELECT TOP 1 Field1, Field2...
FROM
(
SELECT SUM(Field) as Field1, ...
FROM ...
) SUB
The SQL using .ToArray():
SELECT SUM(Field) as Field1, ...
FROM ...
Executing either one directly in SQL Mgt Studio produced the same results in the same amount of time. However, when LINQ executes the first one, I get a timeout.

This is a common problem when using LINQ to SQL. In SQL terms, a GroupBy followed by FirstOrDefault asks the server to aggregate and then un-aggregate. It is hard for SQL to deal with the individual elements of a group, since it has to run multiple queries to reach them.
When you call ToArray, you pull the data back into memory, and the grouping is held in memory along with its individual elements, so reaching them is a lot faster.

Related

How to get better performance query result on filtering data

I have a query that needs to filter a large set of data by some search criteria.
The search runs across 3 tables: Products, ProductPrimaryCodes, ProductCodes.
The largest data set is in the ProductCodes table (around 2000 records, so not that large, but larger than the other tables).
Here is an example of what I've done.
var result = products.Where(x => x.Code.Contains(se) ||
x.ProductPrimaryCodes.Any(p => p.Code.Contains(se)) ||
x.ProductCodes.Any(p => p.Code.Contains(se)))
.Select(x => new ProductDto
{
Id = x.Id,
Name = x.Name,
InStock = x.InStock,
BrandId = (BrandType)x.BrandId,
Code = x.Code,
CategoryName = x.Category.Name,
SubCategoryName = x.SubCategory.Name,
});
The query takes around 8-9 seconds to execute, which I believe is quite long for this kind of search. Note that without the ProductCodes.Any() condition, the query executes in less than a second and returns the result to the page.
ProductCodes table:
Id,
Code,
ProductId
Any suggestions on how to improve the performance of the query?
This is the solution that worked for me.
var filteredProductsByCode = products.Where(x => x.Code.Contains(se));
var filteredProducts = products.Where(x => x.ProductCodes.Any(p => p.Code.Contains(se))
|| x.ProductPrimaryCodes.Any(p => p.Code.Contains(se)));
return filteredProductsByCode.Union(filteredProducts).Select(x => new ProductDto
{
Id = x.Id,
Name = x.Name,
InStock = x.InStock,
BrandId = (BrandType)x.BrandId,
Code = x.Code,
CategoryName = x.Category.Name,
SubCategoryName = x.SubCategory.Name,
}).OrderByDescending(x => x.Id);
Clearly not the cleanest, but I will also consider introducing stored procedures for this kind of query.
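As a rough illustration of why the split version returns the same rows, here is a minimal in-memory sketch (LINQ to Objects, not the real database provider). The sample products and the flattened ProductCodes string arrays are made up for the example; only the shape of the two filters and the Union follows the solution above. It says nothing about the SQL that a provider would generate or about timing.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for the Products table and its child codes.
var products = new[]
{
    new { Id = 1, Code = "ABC-1", ProductCodes = new[] { "X1" } },
    new { Id = 2, Code = "ZZZ",   ProductCodes = new[] { "ABC-9" } },
    new { Id = 3, Code = "ABC-3", ProductCodes = new[] { "ABC-7" } }, // matches both filters
    new { Id = 4, Code = "QQQ",   ProductCodes = new[] { "Y" } },
};
string se = "ABC";

// Single predicate with OR, as in the original slow query.
var combinedIds = products
    .Where(x => x.Code.Contains(se) || x.ProductCodes.Any(c => c.Contains(se)))
    .OrderByDescending(x => x.Id)
    .Select(x => x.Id)
    .ToList();

// Two separate filters merged with Union, which also removes duplicates.
var byCode = products.Where(x => x.Code.Contains(se));
var byChildCode = products.Where(x => x.ProductCodes.Any(c => c.Contains(se)));
var unionIds = byCode.Union(byChildCode)
    .OrderByDescending(x => x.Id)
    .Select(x => x.Id)
    .ToList();

Console.WriteLine(string.Join(",", combinedIds));
Console.WriteLine(string.Join(",", unionIds));
```

Product 3 matches both filters but appears only once in the union, which is why Union (rather than Concat) is the operator to use here.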

Multiple fields in GroupBy clause in LINQ (where one is computed field)

I have a LINQ query to which I need to add two fields as group-by keys. I can easily group by as many plain column fields as I like, but the problem occurs when one of the fields is a calculated field. I can't get my head around how to add the second key in this case.
var values = intermediateValues
//.GroupBy(x => new {x.Rate, x.ExpiryDate })
.GroupBy(r => new { Rate = ((int)(r.Rate / BucketSize)) * BucketSize })
.Select(y => new FXOptionScatterplotValue
{
Volume = y.Sum(z => z.TransactionType == "TERMINATION" ? -z.Volume : z.Volume),
Rate = y.Key.Rate,
ExpiryDate = y.Key.ExpiryDate,
Count = y.Count()
}).ToArray();
In the above code sample I would like to add ExpiryDate to my existing GroupBy clause, which already has the computed field Rate.
So just include it as you have in the commented-out code:
.GroupBy(r => new { Rate = ((int)(r.Rate / BucketSize)) * BucketSize,
r.ExpiryDate })
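To make the composite key concrete, here is a small self-contained sketch run against in-memory data. The sample rows and the BucketSize value are invented for illustration, and the result is projected into an anonymous type rather than FXOptionScatterplotValue, whose definition is not shown in the question.

```csharp
using System;
using System.Linq;

// Hypothetical bucket size and sample rows; field names mirror the question.
const decimal BucketSize = 0.5m;
var intermediateValues = new[]
{
    new { Rate = 1.10m, ExpiryDate = new DateTime(2024, 1, 31), Volume = 100m, TransactionType = "NEW" },
    new { Rate = 1.40m, ExpiryDate = new DateTime(2024, 1, 31), Volume = 50m,  TransactionType = "TERMINATION" },
    new { Rate = 1.60m, ExpiryDate = new DateTime(2024, 1, 31), Volume = 30m,  TransactionType = "NEW" },
    new { Rate = 1.20m, ExpiryDate = new DateTime(2024, 2, 29), Volume = 20m,  TransactionType = "NEW" },
};

var values = intermediateValues
    .GroupBy(r => new
    {
        Rate = ((int)(r.Rate / BucketSize)) * BucketSize, // computed member needs an explicit name
        r.ExpiryDate                                      // member name is inferred from the property
    })
    .Select(g => new
    {
        g.Key.Rate,
        g.Key.ExpiryDate,
        Volume = g.Sum(z => z.TransactionType == "TERMINATION" ? -z.Volume : z.Volume),
        Count = g.Count()
    })
    .ToArray();

foreach (var v in values)
    Console.WriteLine($"{v.Rate} {v.ExpiryDate:yyyy-MM-dd} volume={v.Volume} count={v.Count}");
```

The first two rows fall into the same rate bucket with the same expiry date and are summed together; the last row lands in the same bucket but a different expiry date, so it forms its own group.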
This might help you
var values = intermediateValues
//.GroupBy(x => new {x.Rate, x.ExpiryDate })
.GroupBy(r => new { Rate = ((int)(r.Rate / BucketSize) ) * BucketSize,ExpiryDate1 = r.ExpiryDate })
.Select(y => new FXOptionScatterplotValue
{
Volume = y.Sum(z => z.TransactionType == "TERMINATION" ? -z.Volume : z.Volume),
Rate = y.Key.Rate,
ExpiryDate = y.Key.ExpiryDate1,
Count = y.Count()
}).ToArray();
Just give the anonymous type a member named ExpiryDate1 and use that name when reading the key.

Converting SQL Query to LINQ for application

I have a query in TSQL that I am trying to convert to LINQ for use in our web application, but I am really struggling with this one. It is MVC5 with EF6 and the database is SQL Server 2008 R2. Any help is appreciated!
SQL Query
select MAX(ShipFromCompanyName) as Supplier, COUNT(*) as AllSupplierCount,
SUM(isnull(cast(TransportationCharges as decimal(18,2)),0)) as AllFreightCharges,
SUM(isnull(cast(TransportationCharges as decimal(18,2)),0)) * .45 as FreightSavings
from table
group by ShipFromCompanyName
order by ShipFromCompanyName
ShipFromCompanyName and TransportationCharges are both stored as varchar in the database, and unfortunately I am unable to change the data type of TransportationCharges to decimal.
LINQ
var Scorecard = (from upsid in _db.table select upsid).GroupBy(x => new { x.ShipFromCompanyName, x.TransportationCharges })
.Select(x => new
{
x.Key.ShipFromCompanyName,
SupplierCount = x.Count(),
FreightCharges = x.Key.TransportationCharges.Cast<decimal>().Sum(),
}).ToList();
I think you are going to need to do it in post-processing rather than having SQL do it. Have SQL do as much as it can, then do the rest in memory:
var Scorecard = _db.table
.Select(t => new { t.ShipFromCompanyName, t.TransportationCharges }) // SQL returns only the two columns needed
.AsEnumerable() // everything below runs in memory (LINQ to Objects)
.GroupBy(t => t.ShipFromCompanyName)
.OrderBy(g => g.Key)
.Select(g => new
{
ShipFromCompanyName = g.Key,
SupplierCount = g.Count(),
// the varchar values are parsed here; null counts as 0, like ISNULL in the SQL
FreightCharges = g.Sum(t => t.TransportationCharges == null ? 0m : decimal.Parse(t.TransportationCharges))
}).ToList();
Didn't test this code but it should give you the idea.
var Scorecard = (from upsid in _db.table select upsid)
.GroupBy(x => new { x.ShipFromCompanyName, x.TransportationCharges })
.Select(x => new
{
x.Key.ShipFromCompanyName,
SupplierCount = x.Count(),
FreightCharges = x.Key.TransportationCharges.Select(tc=>decimal.Parse(tc)).Sum()*0.45,
}).ToList();
Since I didn't understand your query clearly I am not sure, but this may work.

LINQ: two Select statements, where the second one uses the first one's result

This LINQ query works well.
var qry = context.Boxes
.GroupBy(k=>k.Box_ID)
.Select( group => new {
Box_ID = group.Key,
TotalA = group.Sum(p => p.A),
TotalC = group.Sum(p => p.C)
})
.Select(p => new {
Box_ID = p.Box_ID,
TotalA = p.TotalA,
TotalC = p.TotalC,
DiffAC = p.TotalA - p.TotalC
});
But I have seen this type of chained Select, where the second one uses the anonymous type produced by the first, written like this:
var qry = context.Boxes
.GroupBy(k => k.Box_ID)
.Select(group => new
{
Box_ID = group.Key,
TotalA = group.Sum(p => p.A),
TotalC = group.Sum(p => p.C)
})
.Select(p => new
{
Box_ID, //*** compiler error
TotalA, //I'm asking about these 3 lines, is this syntax possible
TotalC, //TotalC = p.TotalC,
DiffAC = p.TotalA - p.TotalC // calculate
});
The comments contain the details.
When I try to compile the second query, the compiler gives me the error "The name 'Box_ID' does not exist in the current context".
There is nothing wrong with the first syntax, but the second one is more readable. How can I use the second syntax, or under which conditions can I use it?
.Select(p => new
{
p.Box_ID,
p.TotalA,
p.TotalC,
DiffAC = p.TotalA - p.TotalC // calculate
});
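A short sketch of when the shorthand compiles, using in-memory data made up for the example. In an anonymous type initializer, C# infers the member name from a member access such as p.Box_ID, or from a simple name that is in scope, such as a local variable. A bare Box_ID inside the lambda fails only because no variable with that name exists there.

```csharp
using System;
using System.Linq;

// Hypothetical totals standing in for the result of the first Select.
var totals = new[]
{
    new { Box_ID = 1, TotalA = 10, TotalC = 4 },
    new { Box_ID = 2, TotalA = 7,  TotalC = 9 },
};

var qry = totals.Select(p => new
{
    p.Box_ID,                     // shorthand for Box_ID = p.Box_ID
    p.TotalA,
    p.TotalC,
    DiffAC = p.TotalA - p.TotalC  // an expression needs an explicit member name
}).ToList();

// The bare-name form works when an identifier with that name is in scope.
int Box_ID = 42;
var fromLocal = new { Box_ID };   // shorthand for Box_ID = Box_ID

foreach (var row in qry)
    Console.WriteLine($"{row.Box_ID}: {row.DiffAC}");
Console.WriteLine(fromLocal.Box_ID);
```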

LINQ group by with multiple counts

I am having trouble doing multiple counts on a single table in a LINQ query. I am using NHibernate, LINQ to NHibernate and C#.
query is a populated list. I have a table with a boolean column called FullRef. I want a LINQ query that gives a count of the occurrences of FullRef = false and FullRef = true for each TrackId. A TrackId gets a new row each time it is tracked with track.Source == "UserRef".
The following query gives the correct count of FullRef = true (in FullRefTrueCount), but FullRefFalseCount comes out as an unexplained wrong number.
var query2 = from track in query
where track.Source == "UserRef"
group track by new { TrackId = track.TrackId, FullRef = track.FullRef } into d
select new FullReferrer
{
Customer = d.Key.TrackId,
FullRefFalseCount = d.Where(x => x.FullRef == false).Count(),
FullRefTrueCount = d.Where(x => x.FullRef == true).Count()
};
Does anyone have any idea how to fix it? I am pretty certain the .Where() clause is ignored and the "group by" is what is tripping me up.
If I could somehow
group track by new { TrackId = track.TrackId, FullRefTrue = track.FullRef, FullRefFalse = !track.FullRef }
it would work. Is there some way to do this?
You should group by TrackId only if you want results per TrackId:
var query2 = query
.Where(m => m.Source == "UserRef")
.GroupBy(m => m.TrackId)
.Select(g => new FullReferrer {
Customer = g.Key,
FullRefFalseCount = g.Count(x => !x.FullRef),
FullRefTrueCount = g.Count(x => x.FullRef)
});
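The effect of grouping by TrackId alone can be checked with a small in-memory sketch. The rows below are invented, and the code runs on LINQ to Objects, so it shows the intended counting logic only; whether LINQ to NHibernate translates Count with a predicate inside a group is an assumption not verified here.

```csharp
using System;
using System.Linq;

// Hypothetical tracking rows; property names mirror the question.
var query = new[]
{
    new { TrackId = "A", Source = "UserRef", FullRef = true },
    new { TrackId = "A", Source = "UserRef", FullRef = false },
    new { TrackId = "A", Source = "UserRef", FullRef = false },
    new { TrackId = "B", Source = "UserRef", FullRef = true },
    new { TrackId = "B", Source = "Other",   FullRef = false }, // excluded by the Where filter
};

var counts = query
    .Where(m => m.Source == "UserRef")
    .GroupBy(m => m.TrackId) // one group per TrackId, so both counts see all of its rows
    .Select(g => new
    {
        Customer = g.Key,
        FullRefFalseCount = g.Count(x => !x.FullRef),
        FullRefTrueCount = g.Count(x => x.FullRef)
    })
    .ToList();

foreach (var c in counts)
    Console.WriteLine($"{c.Customer}: false={c.FullRefFalseCount}, true={c.FullRefTrueCount}");
```

With FullRef in the grouping key, as in the question, each group holds only true rows or only false rows, so one of the two counts is always zero for any given result row.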
