I'm currently battling a LINQ query for my application using Entity Framework (6.1.3).
The query is as follows:
var productPeriods = (from pp in ctx.ProductPeriods
where pp.IsActive && pp.Product.IsBuyBackForProduct == null && !pp.Product.ProductAddOns.Any() && pp.PowerRegionID == powerRegionId
select new
{
ProductPeriod = pp,
Price = pp.Prices
.OrderByDescending(x => x.Created)
.GroupBy(x => x.FirmID)
.Select(pr => pr.FirstOrDefault())
.OrderByDescending(x => x.ProductPrice)
.FirstOrDefault()
}).ToList();
The purpose of the query is to find the latest price from each firm in a product period's prices collection (grouped by firm ID), and then select the best of those latest prices.
This works perfectly in LINQPad, but the first OrderByDescending(x => x.Created) doesn't work when used in the context of Entity Framework.
Does anyone know why? And perhaps have a solution for it? :-)
Thanks in advance!
Update
Thanks for all replies. I've tried the following:
select new {
ProductPeriod = p,
Price = p.Prices.GroupBy(x => x.FirmID).Select(pr => pr.OrderByDescending(x => x.Created).ThenByDescending(x => x.ProductPrice).FirstOrDefault())
}
But it seems like ThenByDescending(x => x.ProductPrice) gets ignored as well. The prices are not sorted correctly in the output. They're output like this:
Price: 0,22940, Created: 06-03-2015 10:15:09,
Price: 0,23150, Created: 06-03-2015 10:05:48
Price: 0,20040, Created: 06-03-2015 09:24:24
Update 2 (solution for now)
I came to the solution that the initial query just returns the latest price from each firm. There are currently three firms, so the performance should be alright.
Later in my code, where I'm actually using the latest and best price, I simply do an .OrderByDescending(x => x.ProductPrice).FirstOrDefault() and check if it's not null.
I.e:
var productPeriods = (from pp in ctx.ProductPeriods
where pp.IsActive && pp.Product.IsBuyBackForProduct == null && !pp.Product.ProductAddOns.Any() && pp.PowerRegionID == powerRegionId
select new
{
ProductPeriod = pp,
Prices = pp.Prices.GroupBy(x => x.FirmID).Select(pr => pr.OrderByDescending(x => x.Created).FirstOrDefault())
}).ToList();
Later in my code:
var bestPriceOfToday = period.Prices.OrderByDescending(x => x.ProductPrice).FirstOrDefault()
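For anyone who wants to check the two-step logic without a database, here is a minimal in-memory sketch (LINQ to Objects, not Entity Framework). The PriceRow class and the BestPriceDemo helper are hypothetical names made up for this illustration; only the property names FirmID, ProductPrice and Created come from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Price entity; only the three
// properties used by the query are modelled.
public class PriceRow
{
    public int FirmID { get; set; }
    public decimal ProductPrice { get; set; }
    public DateTime Created { get; set; }
}

public static class BestPriceDemo
{
    // Step 1: keep the newest price of each firm.
    // Step 2: pick the highest of those newest prices.
    // Returns null when there are no prices at all.
    public static PriceRow BestOfLatest(IEnumerable<PriceRow> prices) =>
        prices
            .GroupBy(p => p.FirmID)
            .Select(g => g.OrderByDescending(p => p.Created).First())
            .OrderByDescending(p => p.ProductPrice)
            .FirstOrDefault();
}
```

In memory this chain behaves as described; whether a given EF version translates the same chain to SQL correctly is exactly what the question is about, so the sketch says nothing about that.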
The problem is the commands you are using. OrderBy and OrderByDescending do NOT add an additional ORDER BY term to the resulting query; instead they CREATE a new ordering that replaces any ordering applied before them.
To sort by multiple keys you need to do the following:
OrderBy or OrderByDescending
ThenBy or ThenByDescending
The ThenBy statements can be used one or more times; they just add additional order terms to the resulting query.
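A small in-memory sketch of the difference, using made-up (Firm, Price) rows that are not taken from the question's schema. Note that this is LINQ to Objects, where sorting is stable, so the earlier order still survives as a tie-breaker; a query provider such as Entity Framework may drop the earlier ordering altogether.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Hypothetical sample rows, for illustration only.
    public static readonly (string Firm, decimal Price)[] Rows =
    {
        ("B", 2m), ("A", 3m), ("B", 1m), ("A", 1m)
    };

    // The second OrderBy becomes the primary sort key: the result is
    // ordered by Price, and Firm only decides between equal prices.
    public static List<(string Firm, decimal Price)> TwoOrderBys() =>
        Rows.OrderBy(r => r.Firm).OrderBy(r => r.Price).ToList();

    // ThenBy keeps Firm as the primary key and uses Price to break ties.
    public static List<(string Firm, decimal Price)> OrderByThenBy() =>
        Rows.OrderBy(r => r.Firm).ThenBy(r => r.Price).ToList();
}
```

TwoOrderBys yields the firms in the order A, B, B, A (sorted by price), while OrderByThenBy yields A, A, B, B with prices ascending inside each firm.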
According to your update, omit the Select and type:
select new {
ProductPeriod = p,
Price = p.Prices.GroupBy(x => x.FirmID)
.OrderByDescending(x =>x.Created).ThenByDescending(x=>x.ProductPrice).FirstOrDefault()
}
That Select was useless and could be the cause of the problem.
Related
How would you write a linq query with the following SQL statement. I've tried several methods referenced on stackoverflow but they either don't work with the EF version I'm using (EF core 3.5.1) or the DBMS (SQL Server).
select a.ProductID, a.DateTimeStamp, a.LastPrice
from Products a
where a.DateTimeStamp = (select max(DateTimeStamp) from Products where a.ProductID = ProductID)
For reference, a couple that I've tried (both get run-time errors).
var results = _context.Products
.GroupBy(s => s.ProductID)
.Select(s => s.OrderByDescending(x => x.DateTimeStamp).FirstOrDefault());
var results = _context.Products
.GroupBy(x => new { x.ProductID, x.DateTimeStamp })
.SelectMany(y => y.OrderByDescending(z => z.DateTimeStamp).Take(1))
Thanks!
As I understand it, you would like a list of the latest price of each product?
First of all, I would prefer a GROUP BY version over the first query:
select a.ProductID, a.DateTimeStamp, a.LastPrice
from Products a
where a.DateTimeStamp IN (select max(DateTimeStamp) from Products group by ProductID)
Then in LINQ:
var maxDateTimeStamps = _context.Products
.GroupBy(s => s.ProductID)
.Select(s => s.Max(x => x.DateTimeStamp)).ToArray();
var results = _context.Products.Where(s=>maxDateTimeStamps.Contains(s.DateTimeStamp));
This all assumes that the max DateTimeStamp values are unique across products.
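To see why that uniqueness assumption matters, here is a minimal in-memory sketch with made-up rows (the LatestPerProductDemo class and its sample data are hypothetical). Product 2 has an older row whose timestamp happens to equal product 1's latest timestamp, so filtering on the timestamp alone lets that stale row through, while matching on product and timestamp together does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestPerProductDemo
{
    private static DateTime Day(int d) => new DateTime(2020, 1, d);

    // Hypothetical sample data: (ProductID, DateTimeStamp).
    public static readonly (int ProductID, DateTime DateTimeStamp)[] Rows =
    {
        (1, Day(10)), (1, Day(20)),
        (2, Day(20)), (2, Day(30))   // (2, Day 20) is stale for product 2
    };

    // Filters on timestamp only, like the Contains-based query.
    public static List<(int ProductID, DateTime DateTimeStamp)> ByTimestampOnly()
    {
        var maxStamps = Rows.GroupBy(r => r.ProductID)
                            .Select(g => g.Max(r => r.DateTimeStamp))
                            .ToArray();
        return Rows.Where(r => maxStamps.Contains(r.DateTimeStamp)).ToList();
    }

    // Matches each row against the max timestamp of its own product.
    public static List<(int ProductID, DateTime DateTimeStamp)> ByProductAndTimestamp()
    {
        var maxPerProduct = Rows.GroupBy(r => r.ProductID)
                                .ToDictionary(g => g.Key, g => g.Max(r => r.DateTimeStamp));
        return Rows.Where(r => maxPerProduct[r.ProductID] == r.DateTimeStamp).ToList();
    }
}
```

With this data the timestamp-only filter returns three rows and the per-product match returns two.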
I've managed to do it with the following which replicates the correlated sub query in the original post (other than using TOP and order by instead of the Max aggregate), though I feel like there must be a more elegant way to do this.
var results = from x
in _context.Products
where x.DateTimeStamp == (from y
in _context.Products
where y.ProductID == x.ProductID
orderby y.DateTimeStamp descending
select y.DateTimeStamp
).FirstOrDefault()
select x;
I prefer to break up these queries into IQueryable parts, so you can debug each "step".
Something like this:
IQueryable<ProductOrmEntity> pocoPerParentMaxUpdateDates =
entityDbContext.Products
//.Where(itm => itm.x == 1)/*if you need where */
.GroupBy(i => i.ProductID)
.Select(g => new ProductOrmEntity
{
ProductID = g.Key,
DateTimeStamp = g.Max(row => row.DateTimeStamp)
});
//// next line is for debugging only; do not leave it in production code
var temppocoPerParentMaxUpdateDates = await pocoPerParentMaxUpdateDates.ToListAsync(CancellationToken.None);
IQueryable<ProductOrmEntity> filteredChildren =
from itm
in entityDbContext.Products
join pocoMaxUpdateDatePerParent in pocoPerParentMaxUpdateDates
on new { a = itm.DateTimeStamp, b = itm.ProductID }
equals
new { a = pocoMaxUpdateDatePerParent.DateTimeStamp, b = pocoMaxUpdateDatePerParent.ProductID }
// where
select itm;
// ToListAsync returns a Task, so it has to be awaited (inside an async method)
IEnumerable<ProductOrmEntity> hereIsWhatIWantItems = await filteredChildren.ToListAsync(CancellationToken.None);
In that last step you decide what to select. You can project into a "new ProductOrmEntity() { ProductID = pocoMaxUpdateDatePerParent.ProductID }" with only the columns you need, or you can select the FULL ProductOrmEntity object. From your original code, I can't tell whether you want all columns of the Product object or only some of them.
Hello this is a LINQ Query but it doesn't sort properly because four different dates are involved.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
orderby i.English_seen_by_executive_on descending
orderby i.Brief_seen_by_executive_on descending
orderby i.French_seen_by_executive_on descending
orderby i.Russian_seen_by_executive_on descending
select i).ToList();
All I want is for it to somehow combine the four dates and sort on them together, not one by one.
For example, at the moment it sorts all English reports by the date the executive saw them, then the Brief reports, and so on.
But I want it to check which one was seen first, and so on. For example, if the first report seen is French, then Brief, then English, then Russian, it should sort accordingly.
Is it possible?
You need to have them all in one column. This is the approach I would take, assuming that a date is null whenever it should not take part in the ordering:
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select new
{
Date =
(
i.English_seen_by_executive_on != null ? i.English_seen_by_executive_on :
i.Brief_seen_by_executive_on != null ? i.Brief_seen_by_executive_on :
i.French_seen_by_executive_on != null ? i.French_seen_by_executive_on :
i.Russian_seen_by_executive_on
)
}).ToList().OrderBy(a => a.Date);
In the select clause you could add more columns if you wish.
Reference taken from here.
Why not just use .Min() or .Max() on the dates and then .OrderBy() or .OrderByDescending() based on that?
The logic is to create a new enumerable (here, an array) with the 4 dates of the current row and calculate the Max/Min of them, which gives the latest/earliest of the 4. Then order the records by this value.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select i)
.OrderBy(i => new[]{
i.English_seen_by_executive_on,
i.Brief_seen_by_executive_on,
i.French_seen_by_executive_on,
i.Russian_seen_by_executive_on
}.Max())
.ToList();
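Here is a minimal in-memory sketch of the same idea, assuming the four dates are nullable. The Report class, its shortened property names and the SeenDateDemo helper are hypothetical. Max() over DateTime? values skips nulls and returns null only when all four are null. Whether a given database provider can translate the array-plus-Max expression to SQL is something to verify separately; this sketch only shows the LINQ to Objects behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a pm_main_repz row.
public class Report
{
    public string Name { get; set; }
    public DateTime? English { get; set; }
    public DateTime? Brief { get; set; }
    public DateTime? French { get; set; }
    public DateTime? Russian { get; set; }
}

public static class SeenDateDemo
{
    // Latest of the four dates; nulls are ignored, all-null gives null.
    public static DateTime? LatestSeen(Report r) =>
        new[] { r.English, r.Brief, r.French, r.Russian }.Max();

    // Most recently seen reports first; never-seen reports end up last.
    public static List<string> NamesByLatestSeen(IEnumerable<Report> reports) =>
        reports.OrderByDescending(r => LatestSeen(r))
               .Select(r => r.Name)
               .ToList();
}
```

Swapping Max() for Min() and OrderByDescending for OrderBy sorts by the earliest date instead.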
Your problem is not a problem if you use method syntax for your LINQ query instead of query syntax.
var EventReportRemarks = _context.pm_main_repz
.Where(rep => rep.Department.DepartmentName == "Finance")
.OrderByDescending(rep => rep.English_seen_by_executive_on)
.ThenByDescending(rep => rep.Brief_seen_by_executive_on)
.ThenByDescending(rep => rep.French_seen_by_executive_on)
.ThenByDescending(rep => rep.Russian_seen_by_executive_on)
.Select(rep => ...);
Optimization
One of the slower parts of a database query is the transport of selected data from the DBMS to your local process. Hence it is wise to limit the transported data to values you actually plan to use.
You transport way more data than you need to.
For example: every pm_main_repz (my, you do love to use easy identifiers for your items, don't you?) has zero or more Employees. Every Employee belongs to exactly one pm_main_repz using a foreign key like pm_main_repzId.
If you use Include to transport pm_main_repz 4 with its 1000 Employees, every Employee will have a pm_main_repzId with value 4. You'll transport this value 1001 times, while once would have been enough.
Always use Select to select data from the database, and Select only the properties you actually plan to use. Only use Include if you plan to update the fetched objects.
Consider using a proper Select where you only select the items that you actually plan to use:
.Select(rep => new
{
// only Select the rep properties you actually plan to use:
Id = rep.Id,
Name = rep.Name,
...
Employees = rep.Employees.Select(employee => new
{
// again: select only the properties you plan to use
Id = employee.Id,
Name = employee.Name,
// not needed: foreign key to pm_main_repz
// pm_main_repzId = rep.pm_main_repzId,
})
.ToList(),
Department = new
{
Id = rep.Department.Id,
...
}
// etc. for PM_Evt_Cat and provncs
});
I'm working on a report right now that runs great with our on-premises DB (just refreshed from PROD). However, when I deploy the site to Azure, I get a SQL Timeout during its execution. If I point my development instance at the SQL Azure instance, I get a timeout as well.
Goal: To output a list of customers that have had an activity created during the search range, and when that customer is found, get some other information about that customer regarding policies, etc. I've removed some of the properties below for brevity (as best I can)...
UPDATE
After lots of trial and error, I can get the entire query to run fairly consistently within 1000 ms so long as this block of code is not executed.
CurrentStatus = a.Activities
.Where(b => b.ActivityType.IsReportable)
.OrderByDescending(b => b.DueDateTime)
.Select(b => b.Status.Name)
.FirstOrDefault(),
With this code in place, things begin to go haywire. I think this Where clause is a big part of it: .Where(b => b.ActivityType.IsReportable). What is the best way to grab the status name?
EXISTING CODE
Any thoughts as to why SQL Azure would time out whereas on-premises would turn this around in less than 100 ms?
return db.Customers
.Where(a => a.Activities.Where(
b => b.CreatedDateTime >= search.BeginDateCreated
&& b.CreatedDateTime <= search.EndDateCreated).Count() > 0)
.Where(a => a.CustomerGroup.Any(d => d.GroupId== search.GroupId))
.Select(a => new CustomCustomerReport
{
CustomerId = a.Id,
Manager = a.Manager.Name,
Customer = a.FirstName + " " + a.LastName,
ContactSource= a.ContactSource!= null ? a.ContactSource.Name : "Unknown",
ContactDate = a.DateCreated,
NewSale = a.Sales
.Where(p => p.Employee.IsActive)
.OrderByDescending(p => p.DateCreated)
.Select(p => new PolicyViewModel
{
//MISC PROPERTIES
}).FirstOrDefault(),
ExistingSale = a.Sales
.Where(p => p.CancellationDate == null || p.CancellationDate <= myDate)
.Where(p => p.SaleDate < myDate)
.OrderByDescending(p => p.DateCreated)
.Select(p => new SalesViewModel
{
//MISC PROPERTIES
}).FirstOrDefault(),
CurrentStatus = a.Activities
.Where(b => b.ActivityType.IsReportable)
.OrderByDescending(b => b.DueDateTime)
.Select(b => b.Disposition.Name)
.FirstOrDefault(),
CustomerGroup = a.CustomerGroup
.Where(cd => cd.GroupId == search.GroupId)
.Select(cd => new GroupViewModel
{
//MISC PROPERTIES
}).FirstOrDefault()
}).ToList();
I cannot give you a definite answer but I would recommend approaching the problem by:
Run SQL profiler locally when this code is executed and see what SQL is generated and run. Look at the query execution plan for each query and look for table scans and other slow operations. Add indexes as needed.
Check your lambdas for things that cannot be easily translated into SQL. You might be pulling the contents of a table into memory and running lambdas on the results, which will be very slow. Change your lambdas or consider writing raw SQL.
Is the Azure database the same as your local database? If not, pull the data locally so your local system is indicative.
Remove sections (i.e. CustomerGroup then CurrentDisposition then ExistingSale then NewSale) and see if there is a significant performance improvement after removing the last section. Focus on the last removed section.
Looking at the code itself:
You use ".Count() > 0" on line 4. Use ".Any()" instead, since the former asks for an exact count of all matching rows when you just want to know whether at least one row satisfies the requirements.
Ensure fields referenced in where clauses have indexes, such as IsReportable.
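The Count() versus Any() point can be illustrated in memory with a small sketch. The AnyVsCountDemo class and its counter are made up for this illustration. It only shows LINQ to Objects behaviour; how a query provider translates each form to SQL can differ, so treat it as an analogy rather than a measurement of the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVsCountDemo
{
    // Number of items actually pulled from the source in the last call.
    public static int Visited;

    // Yields 1..n and records every item that gets enumerated.
    private static IEnumerable<int> Numbers(int n)
    {
        for (int i = 1; i <= n; i++)
        {
            Visited++;
            yield return i;
        }
    }

    // Count() has to walk the whole sequence before comparing with 0.
    public static int VisitedByCount(int n)
    {
        Visited = 0;
        _ = Numbers(n).Where(x => x > 0).Count() > 0;
        return Visited;
    }

    // Any() stops at the first item that satisfies the predicate.
    public static int VisitedByAny(int n)
    {
        Visited = 0;
        _ = Numbers(n).Any(x => x > 0);
        return Visited;
    }
}
```

For a source of 1000 items, the Count() form enumerates all 1000 while the Any() form enumerates one.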
Short answer: use memory.
Long answer:
Because of either bad maintenance plans or limited hardware, running this query in one big lump is what's causing it to fail on Azure. Even if that weren't the case, because of all the navigation properties you're using, this query would generate a staggering number of joins. The answer here is to break it down in smaller pieces that Azure can run. I'm going to try to rewrite your query into multiple smaller, easier to digest queries that use the memory of your .NET application. Please bear with me as I make (more or less) educated guesses about your business logic/db schema and rewrite the query accordingly. Sorry for using the query form of LINQ but I find things such as join and group by are more readable in that form.
var activityFilterCustomerIds = db.Activities
.Where(a =>
a.CreatedDateTime >= search.BeginDateCreated &&
a.CreatedDateTime <= search.EndDateCreated)
.Select(a => a.CustomerId)
.Distinct()
.ToList();
var groupFilterCustomerIds = db.CustomerGroup
.Where(g => g.GroupId == search.GroupId)
.Select(g => g.CustomerId)
.Distinct()
.ToList();
var customers = db.Customers
.AsNoTracking()
.Where(c =>
activityFilterCustomerIds.Contains(c.Id) &&
groupFilterCustomerIds.Contains(c.Id))
.ToList();
var customerIds = customers.Select(x => x.Id).ToList();
var newSales =
(from s in db.Sales
where customerIds.Contains(s.CustomerId)
&& s.Employee.IsActive
group s by s.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Sale = grouped
.OrderByDescending(x => x.DateCreated)
.Select(x => new PolicyViewModel
{
// properties
})
.FirstOrDefault()
}).ToList();
var existingSales =
(from s in db.Sales
where customerIds.Contains(s.CustomerId)
&& (s.CancellationDate == null || s.CancellationDate <= myDate)
&& s.SaleDate < myDate
group s by s.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Sale = grouped
.OrderByDescending(x => x.DateCreated)
.Select(x => new SalesViewModel
{
// properties
})
.FirstOrDefault()
}).ToList();
var currentStatuses =
(from a in db.Activities.AsNoTracking()
where customerIds.Contains(a.CustomerId)
&& a.ActivityType.IsReportable
group a by a.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Status = grouped
.OrderByDescending(x => x.DueDateTime)
.Select(x => x.Disposition.Name)
.FirstOrDefault()
}).ToList();
var customerGroups =
(from cg in db.CustomerGroups
where cg.GroupId == search.GroupId
group cg by cg.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Group = grouped
.Select(x =>
new GroupViewModel
{
// ...
})
.FirstOrDefault()
}).ToList();
return customers
.Select(c =>
new CustomCustomerReport
{
// ... simple props
// ...
// ...
NewSale = newSales
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Sale)
.FirstOrDefault(),
ExistingSale = existingSales
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Sale)
.FirstOrDefault(),
CurrentStatus = currentStatuses
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Status)
.FirstOrDefault(),
CustomerGroup = customerGroups
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Group)
.FirstOrDefault(),
})
.ToList();
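One possible refinement of the final stitching step, offered as a sketch rather than as part of the answer above: each per-customer Where(...).FirstOrDefault() scans a whole list, so with many customers it may be cheaper to index each list by CustomerId once. The StitchDemo class and the simplified tuple shapes are hypothetical, and the sketch assumes at most one row per customer, which is what the grouped queries produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StitchDemo
{
    // Combines customer ids with their status using a dictionary lookup
    // instead of scanning the status list once per customer.
    public static List<(int CustomerId, string Status)> Stitch(
        IEnumerable<int> customerIds,
        IEnumerable<(int CustomerId, string Status)> currentStatuses)
    {
        // Assumes CustomerId is unique in currentStatuses;
        // ToDictionary throws on duplicate keys.
        var statusByCustomer = currentStatuses
            .ToDictionary(s => s.CustomerId, s => s.Status);

        return customerIds
            .Select(id => (
                CustomerId: id,
                // Customers without a status row get null, like FirstOrDefault().
                Status: statusByCustomer.TryGetValue(id, out var status) ? status : null))
            .ToList();
    }
}
```

The same pattern would apply to newSales, existingSales and customerGroups.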
Hard to suggest anything without seeing the actual table definitions, especially the indexes and foreign keys on the Activities entity.
As far as I understand, Activity is (CustomerId, ActivityTypeId, DueDateTime, DispositionId). If this is a standard warehousing table (DateTime, ClientId, Activity), I'd suggest the following:
If number of Activities is reasonably small, then force the use of CONTAINS by
var activities = db.Activities.Where( x => x.IsReportable ).ToList();
...
.Where( b => activities.Contains(b.Activity) )
You can even help the optimiser by specifying that you want ActivityId.
Indexes on the Activity entity should be up to date. For this particular query I suggest (CustomerId, ActivityId, DueDateTime DESC).
Precache the Disposition table; my crystal ball tells me that it's a dictionary table.
For a similar task, to avoid constantly hitting the Activity table, I made another small table (CustomerId, LastActivity, LastValue) and updated it as the status changed.
My Environment: ASP.net and C# in VS 2013 Express.
I have been through many similar SO articles trying to work this out. I am an amateur with LINQ to SQL queries and C# in general.
I'm trying to use LINQ to SQL to get the top 5 most recent distinct values from a column, then add them to a list. My application is ASP.NET using C# and a .dbml file for data abstraction.
I've tried it many different ways. I either get a non-distinct yet sorted list, or I get a distinct unsorted list. What I have so far is below:
var Top5MFG = (from mfg in db.orders
where mfg.manufacturer.Length > 0 && mfg.customerid == "blahblahblahblahblah"
select new {
manufacturer = mfg.manufacturer,
date = mfg.date_created
})
.Distinct()
.OrderByDescending(s => s.date);
I'm thinking my "Distinct" is looking at the "ID" column, and perhaps I need to tell it I want it to look at the "manufacturer" column, but I haven't worked out how / if it's possible to do that.
I could do this with ease by using a storedproc, but I'm really trying to do it with c# code directly if possible. This is my first post to SO, I hope I have put it together properly. Any help much appreciated.
Thanks
No, Distinct compares the manufacturer and date pairs. If you want to get distinct records by manufacturer, then I recommend the DistinctBy method. It's in the MoreLINQ library. Since it's a third-party library method, it's not supported in LINQ to SQL, but you can still use it by fetching the records from the DB and doing the rest in memory:
(from mfg in db.orders
where mfg.manufacturer.Length > 0 && mfg.customerid == "blahblahblahblahblah"
select new {
manufacturer = mfg.manufacturer,
date = mfg.date_created
})
.AsEnumerable()
.DistinctBy(x => x.manufacturer)
.OrderByDescending(s => s.date)
.Take(5);
I think you can use the GroupBy to do what you want.
var Top5MFG = db.orders
.Where (x => x.manufacturer.Length > 0 && x.customerid == "blahblahblahblahblah")
.GroupBy(mfg => mfg.manufacturer)
.Select(g => g.First())
.OrderByDescending(d => d.date_created)
.Take(5);
One way you can distinct by a certain field is to replace:
...
.Distinct()
...
with:
...
.GroupBy(x => x.manufacturer )
.Select(g => g.First())
...
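As a minimal in-memory sketch of the "top 5 most recent distinct manufacturers" goal: instead of relying on which row First() happens to return, this variant takes the latest date per manufacturer explicitly. The TopManufacturersDemo class and the tuple shape are hypothetical, and this is LINQ to Objects, so it says nothing about how LINQ to SQL would translate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopManufacturersDemo
{
    // Returns up to `count` distinct manufacturers, ordered by the
    // date of their most recent order (newest first).
    public static List<string> MostRecentDistinct(
        IEnumerable<(string Manufacturer, DateTime DateCreated)> orders,
        int count) =>
        orders
            .GroupBy(o => o.Manufacturer)
            .Select(g => new
            {
                Manufacturer = g.Key,
                Latest = g.Max(o => o.DateCreated)   // newest order per manufacturer
            })
            .OrderByDescending(x => x.Latest)
            .Take(count)
            .Select(x => x.Manufacturer)
            .ToList();
}
```

Grouping first makes the manufacturer the only key that decides distinctness, and the per-group Max makes the sort key well defined.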
I have 3 tables
A project table
A product table
An update table
The product table holds different products from a project, and the update table holds updates made to various products and holds a reference to the user who did it.
Basically what I want is to have a query that returns all products (since products to projects is a many to one relation) ordered by the date they were last updated by the user who is currently logged in.
This is my current query:
IEnumerable<ProjectProduct> list =
from joined in
(from product in db.GetTable<Product>()
join project in db.GetTable<Project>()
on product.ProjectId equals project.ID
select new { product, project })
join projectupd in db.GetTable<ProjectUpdate>()
on joined.product.ID equals projectupd.ProductID
where projectupd.CreatedBy == ParamUser
orderby projectupd.LastUpdate
select new ProjectProduct(joined.project, joined.product);
However, the result I'm getting is only the entries in the update table, and not all the existing products. I know that the "where" clause makes it only select the updates created by a specific user, so I'm on the right track, but I have tried a couple of things to make the query successful, without luck though.
Does anybody have a suggestion on how to get the desired result?
Here's an answer that's a little verbose, and it uses method-chain syntax, but I do think it does what you're looking for:
var products = db.GetTable<Product>();
var projects = db.GetTable<Project>();
var projectUpdates = db.GetTable<ProjectUpdate>();
var latestProjectUpdatesForUser = projectUpdates
.Where(x => x.CreatedBy == paramUser)
.GroupBy(x => x.ProductId)
.Select(g => g.OrderByDescending(x => x.LastUpdate).First());
var list = products
.Join(
projects,
product => product.ProjectId,
project => project.Id,
(product, project) => new
{
Product = product,
Project = project,
Update = latestProjectUpdatesForUser.FirstOrDefault(u => u.ProductId == product.Id)
}
)
.OrderByDescending(x => x.Update != null ? (DateTime?)x.Update.LastUpdate : null)
.ThenBy(x => x.Project.Id)
.ThenBy(x => x.Product.Id)
.Select(x => new ProjectProduct { Project = x.Project, Product = x.Product});
It takes advantage of the fact that DateTime? is sortable and that null values end up last when using OrderByDescending.
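The null-ordering behaviour can be checked in memory with a tiny sketch (the NullsLastDemo class is a made-up name). It shows the LINQ to Objects default comparer, which treats null as smaller than any value; how NULLs sort in a translated SQL query depends on the database, so that part should be verified against your DBMS.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullsLastDemo
{
    // Descending: real dates newest first, nulls at the end.
    public static List<DateTime?> Descending(IEnumerable<DateTime?> values) =>
        values.OrderByDescending(v => v).ToList();

    // Ascending: nulls come first, then dates oldest first.
    public static List<DateTime?> Ascending(IEnumerable<DateTime?> values) =>
        values.OrderBy(v => v).ToList();
}
```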