I have two tables, Order and OrderItem. I need to get all order items for a range of orders (for example, all orders for an entire month). Both tables have many columns, so using Include("OrderItem") is extremely slow.
There are around 20K orders per month, with around 50K-60K order items.
Since the range is date-based (e.g. daily, weekly, monthly, yearly), I am currently doing two selects:
var orders = (from rec in Orders.AsNoTracking()
where (rec.OrderDate >= startDate) && (rec.OrderDate <= endDate)
orderby rec.OrderDate descending
select rec).ToList();
var orderItems = (from rec in OrderItems.AsNoTracking()
join recOrder in Orders.AsNoTracking() on new { rec.LocationId, rec.StoreId, rec.OrderId } equals new { recOrder.LocationId, recOrder.StoreId, recOrder.OrderId }
orderby recOrder.OrderDate descending
where (recOrder.OrderDate >= startDate) && (recOrder.OrderDate <= endDate)
select rec).ToList();
This all works fine. However, when I do paging (a page can display 25 to 1000 records, depending on user preference), I am thinking of moving away from two selects and using Union or a predicate (via PredicateBuilder in LinqKit). The orders query would look something like this:
var orders = (from rec in Orders.AsNoTracking()
where (rec.OrderDate >= startDate) && (rec.OrderDate <= endDate)
orderby rec.OrderDate descending
select rec).Skip(page * recPerPage).Take(recPerPage).ToList();
Now, to get the order items, I am considering three options. Please let me know which one is better for a small number of records (25 to 1000 per page).
Option #1: Use another select with Skip and Take for order items.
Option #2: Use Union.
IQueryable<OrderItem> queries = null;
foreach (Tuple<int, int, int> key in orderIds)
{
int locationId = key.Item1, storeId = key.Item2, orderId = key.Item3;
var recs = (from rec in OrderItems.AsNoTracking()
where (rec.LocationId == locationId) && (rec.StoreId == storeId) && (rec.OrderId == orderId)
select rec);
queries = (queries == null) ? recs : queries.Union(recs);
}
var orderItems = queries.ToList();
Option #3: Use PredicateBuilder.
var predicate = PredicateBuilder.New<OrderItem>();
foreach (Tuple<int, int, int> key in orderIds)
{
int locationId = key.Item1, storeId = key.Item2, orderId = key.Item3;
predicate = predicate.Or(rec => (rec.LocationId == locationId) && (rec.StoreId == storeId) && (rec.OrderId == orderId));
}
var orderItems = OrderItems.AsNoTracking().AsExpandable().Where(predicate).ToList();
I prefer Union or PredicateBuilder because then the function that gets the order items doesn't have to know whether the orders were selected by date range, by product, by customer, or by any future search option. All the OrderItem query needs to know is the list of order IDs.
So my question is: in terms of speed and performance, which option is best?
NOTE:
If I try to get 1000 records, Union and PredicateBuilder sometimes throw a StackOverflowException, so I limit each call to 500 records. If the user preference is 1000 records, I make two calls.
I also tried to use LINQPad to look at SQL generated by predicate builder, but it couldn't recognize AsExpandable() (Yes, I have included LinqKit DLL). So, no luck here.
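The 500-record limit described in the note can be sketched as a small batching helper. This is only an illustration: `KeyBatching` and `Split` are hypothetical names, not part of LinqKit or Entity Framework, and the key shape is taken from the `Tuple<int, int, int>` used in the options above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits the order keys into batches so that each
// Union / PredicateBuilder query is built from a bounded number of keys.
public static class KeyBatching
{
    public static List<List<Tuple<int, int, int>>> Split(
        IReadOnlyList<Tuple<int, int, int>> orderIds, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<Tuple<int, int, int>>>();
        for (int start = 0; start < orderIds.Count; start += batchSize)
        {
            // Take at most batchSize keys starting at the current offset.
            batches.Add(orderIds.Skip(start).Take(batchSize).ToList());
        }
        return batches;
    }
}
```

Each batch would then be passed to the Option #2 or Option #3 query, and the resulting lists concatenated, presumably with AddRange.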
Please help.
Thanks.
Related
I'm trying to get the latest record for each group in LINQ, but I want the ID rather than the date, because the dates can sometimes be exactly the same as those of other records.
The following code gives me the key and the last date:
var latestPositions = from bs in Scan
group bs by bs.Asset into op
select new
{
Asset = op.Key,
LastSeen = op.Max(x => x.LastSeen)
};
Returns something like this...
Asset LastSeen
BS1 2020-05-10
BS2 2020-07-10
This is what I need, but I then need to get the rest of the data from that table row. If I join back on the two columns, I can get multiple rows. Is there a way to return the ID column from the group by, so that I can join on that instead?
Thanks
GroupBy cannot help here because of a SQL limitation, but you can write a workaround:
var latestPositions = from bs in Scan
where !Scan.Any(s => s.Asset == bs.Asset && s.LastSeen > bs.LastSeen)
select bs;
But I should mention that the fastest way is to use window functions, which are not available in EF Core:
SELECT
sc.Id
FROM (
SELECT
s.Id,
ROW_NUMBER() OVER (PARTITION BY s.Asset ORDER BY s.LastSeen DESC) AS RN
FROM Scan s
) sc
WHERE sc.RN = 1
However, there is an EF Core extension, linq2db.EntityFrameworkCore, which makes this possible via LINQ (I assume that Asset is just an ID, not a navigation property):
var queryRn = from bs in Scan
select new
{
Entity = bs,
RN = Sql.Ext.RowNumber().Over()
.PartitionBy(bs.Asset).OrderByDesc(bs.LastSeen).ToValue()
};
// switch to alternative translator
queryRn = queryRn.ToLinqToDB();
var latestPositions = from q in queryRn
where q.RN == 1
select q.Entity;
I think the code below does what you want, and I wrote the full code on this link.
If it's not what you want, can you describe what you need a little more clearly?
var temp = from l in list
group l by l.Asset into lg
orderby lg.Key
select new { Asset = lg.Key, LastSeen = lg.Max(x => x.LastSeen), ID = lg.Where(x => x.LastSeen == lg.Max(y => y.LastSeen)).First().ID }; // First() rather than Single(): several rows may share the latest LastSeen
So every Scan has a property Asset, a DateTime property LastSeen, and zero or more other properties.
Requirement: given a sequence of Scans, give me, per Asset, all or some of the properties of the Scan with the latest LastSeen.
The following will do the trick:
var lastSeenScan = dbContext.Scans.GroupBy(scan => scan.Asset,
// parameter resultSelector: take every Asset, and all Scans that have this Asset value
// and order these Scans by descending value of lastSeen:
(asset, scansWithThisAssetValue) => scansWithThisAssetValue
.OrderByDescending(scan => scan.LastSeen)
// Select the scan properties that you plan to use.
// Not needed if you want the complete Scan
.Select(scan => new
{
Id = scan.Id,
Operator = scan.Operator,
...
})
// and keep only the First one, which is the Last seen one:
.FirstOrDefault());
In words: divide your table of Scans into groups of scans that have the same value for property Asset. Order all scans in each group by descending value of LastSeen. This makes the most recently seen Scan the first one.
From all scans in the group select only the properties that you plan to use, and take the first one.
Result: for every used Asset, you get the (selected properties of the) scan that has the highest value of LastSeen.
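The same grouping idea can be tried on an in-memory list. The sketch below is an assumption-laden illustration: the `Scan` class shape (an int `Id`, a string `Asset`, a `LastSeen` date) is guessed from the question, and the tie-break on `Id` is an added choice to handle the identical dates the question mentions. It runs as LINQ to Objects; whether a given EF version translates the same shape to SQL is not something this sketch demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed entity shape; the real Scan table may have more columns.
public sealed class Scan
{
    public int Id { get; set; }
    public string Asset { get; set; } = "";
    public DateTime LastSeen { get; set; }
}

public static class LatestScans
{
    // Returns one Scan per Asset: the one with the latest LastSeen.
    // When several scans share that date, the highest Id wins (assumption).
    public static List<Scan> LatestPerAsset(IEnumerable<Scan> scans)
    {
        return scans
            .GroupBy(s => s.Asset)
            .Select(g => g.OrderByDescending(s => s.LastSeen)
                          .ThenByDescending(s => s.Id)
                          .First())
            .OrderBy(s => s.Asset, StringComparer.Ordinal)
            .ToList();
    }
}
```

Because each group yields a whole `Scan`, the Id and every other column are available without joining back on Asset and LastSeen.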
I have a VacancyApply table that contains status IDs, and I need the top 5 records for each status. Status is an int, such as 1, 2, 3.
My Query
var result = (from ui in _context.VacancyApply
join s in _context.UserProfile on ui.UserId equals s.UserId
join x in _context.Vacancy on ui.VacancyId equals x.VacancyId
join st in _context.Status on ui.StatusId equals st.StatusId
where ui.UserId == userId && ui.IsActive == true
orderby ui.StatusId
select new VacancyApply
{
VacancyApplyId = ui.VacancyApplyId,
VacancyId = ui.VacancyId,
UserId = ui.UserId,
StatusId = ui.StatusId,
VacancyName = x.VacancyName,
VacancyStack = x.VacancyStack,
VacancyEndDate = x.VacancyEndDate,
StatusName = st.StatusName,
UserName = s.FirstName
}).ToList();
What I can see from the output is that it contains one VacancyId and one VendorId.
I have a feeling that you have a many-to-many relationship between the Vacancy and Status tables.
Nevertheless, the answer is simple: use the LINQ Take extension method. It is best to call it after OrderBy, because taking the top items doesn't make sense without a defined order:
var output = (logic to join, filter, etc.).OrderBy(lambda).Take(N); // N is the number of
// items you want to select
If you want to take the top items from Vacancy first and only then join them with Status, do this:
var output = Vacancy.OrderBy(lambda).Take(N).(now join, filter, etc. with other tables);
However, if you want to group the rows by status and only then take the top items from each group, use GroupBy:
var output = (logic to join, filter, etc.).GroupBy(st => st.StatusId)
.Select(group => group.OrderBy(lambda).Take(N));
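To make the GroupBy variant concrete, here is a minimal in-memory sketch. The `Application` class and the choice to rank by descending `VacancyApplyId` are assumptions for illustration, since the question does not say what "top" means. The code runs as LINQ to Objects; EF may or may not translate the same shape, depending on the version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed, simplified row shape standing in for the joined VacancyApply result.
public sealed class Application
{
    public int VacancyApplyId { get; set; }
    public int StatusId { get; set; }
}

public static class TopPerStatus
{
    // Returns at most n rows per StatusId, flattened into a single list.
    // "Top" is taken to mean highest VacancyApplyId first (assumption).
    public static List<Application> TopN(IEnumerable<Application> rows, int n)
    {
        return rows
            .GroupBy(r => r.StatusId)
            .OrderBy(g => g.Key)
            .SelectMany(g => g.OrderByDescending(r => r.VacancyApplyId).Take(n))
            .ToList();
    }
}
```

SelectMany flattens the per-status groups back into one sequence, which is usually what a list view needs. Use Select instead to keep the groups separate.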
My ASP.Net application has the following Linq to SQL function to get a distinct list of height values from the product table.
public static List<string> getHeightList(string catID)
{
using (CategoriesClassesDataContext db = new CategoriesClassesDataContext())
{
var heightTable = (from p in db.Products
join cp in db.CatProducts on p.ProductID equals cp.ProductID
where p.Enabled == true && (p.CaseOnly == null || p.CaseOnly == false) && cp.CatID == catID
select new { Height = p.Height, sort = Convert.ToDecimal(p.Height.Replace("\"", "")) }).Distinct().OrderBy(s => s.sort);
List<string> heightList = new List<string>();
foreach (var s in heightTable)
{
heightList.Add(s.Height.ToString());
}
return heightList;
}
}
I ran Redgate SQL Monitor which shows that this query is using a lot of resources.
Redgate is also showing that I am running the following query:
select count(distinct [height]) from product p
join catproduct cp on p.productid = cp.productid
join cat c on cp.catid = c.catid
where p.enabled=1 and p.displayfilter = 1 and c.catid = 'C2-14'
My questions are:
A suggestion to change the function so that it uses less resources?
Also, how does linq to sql generate the above query from my function? (I did not write select count(distinct [height]) from product anywhere in the code)
There are 90,000 records in the products. This category which I am trying to get the distinct list of heights has 50,000 product records
Thank you in advance,
Nick
First of all, the SQL query and the LINQ query you posted don't match at all. The problem is not the LINQ query but the underlying SQL query, which performs slowly. Make sure all the columns involved in the JOIN ... ON, WHERE, and ORDER BY clauses are indexed properly so that you get a better execution plan; otherwise you will end up with a full table scan and a file sort, and the query will be slow.
The join multiplies the number of Products the query returns. To undo that, you apply Distinct at the end. It will certainly reduce db resources if you return unique Products right away:
var heightTable = (from p in db.Products
where p.CatProducts.Any(cp => cp.CatID == catID)
&& p.Enabled == true && (p.CaseOnly == null || p.CaseOnly == false)
select new
{
Height = p.Height,
sort = Convert.ToDecimal(p.Height.Replace("\"", ""))
}).OrderBy(s => s.sort);
This changes the join into a where clause. It saves the db engine the trouble of deduplicating the result.
If that still performs poorly, you should try to do the conversion and ordering in memory, i.e. after receiving the raw results from the database.
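A minimal sketch of the in-memory variant, assuming the database query has been reduced to returning only the raw Height strings (for example by selecting p.Height and materializing the result). `HeightSorting` and `SortHeights` are hypothetical names, and the invariant-culture parsing is an assumption about how the heights are stored.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class HeightSorting
{
    // Deduplicates the raw height strings and orders them numerically,
    // ignoring the trailing inch mark ("), entirely on the client side.
    public static List<string> SortHeights(IEnumerable<string> rawHeights)
    {
        return rawHeights
            .Distinct()
            .OrderBy(h => decimal.Parse(h.Replace("\"", ""), CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

This keeps Convert.ToDecimal and Replace out of the generated SQL, so the database only has to filter rows and return one column.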
As for the count, I don't know where it comes from. Such queries are typically generated by paging libraries such as PagedList, but I see no trace of that in your code.
Side note: you can return ...
heightTable.Select(x => x.Height.ToString()).ToList()
... instead of creating the list yourself.
I'm trying to get a list of students based on their status, grouped by their college.
So I have two tables, Students and Colleges. Each student record has a status, which can be 'Prospect', 'Accepted' or 'WebApp'. What I need to do is get a list of students based on the selected status, and then display each college's name along with the number of students who go to that college and whose status matches the status passed in. I think this needs to be an aggregate query, since the counts come from the string Status field.
I'm not sure how to do this in MS SQL, since the count is coming from the same table and it's based on the status field's value.
Here is the start of my query, which takes in the search parameters, but I can't figure out how to filter on the status to return the counts.
SELECT Colleges.Name, [Status], Count([Status])
FROM Students
JOIN Colleges ON Students.UniversityId = Colleges.id OR Students.CurrentCollege = Colleges.Name
GROUP BY Students.[Status], Colleges.Name
ORDER BY Colleges.Name;
Accepts = Status('Accepted')
WebApps = Status('WebApp')
Total = Sum(Accepts + WebApps)
Select
Colleges.Name,
SUM(Case when Students.Status like 'Accepted' then 1 else 0 end) Accepts,
SUM(Case when Students.Status like 'WebApp' then 1 else 0 end) WebApps,
COUNT(*) Total
from Students
join Colleges on Students.UniversityId = Colleges.Id OR Students.CurrentCollege = Colleges.Name
Group by Colleges.Name
The LINQ:
var results =
(from c in db.Colleges // db is your DataContext
select new
{
CollegeName = c.Name,
AcceptedStatus = db.Students.Count(r => r.Status.ToUpper() == "ACCEPTED" && (r.UniversityId == c.Id || r.CurrentCollege == c.Name)),
WebAppStatus = db.Students.Count(r => r.Status.ToUpper() == "WEBAPP" && (r.UniversityId== c.Id || r.CurrentCollege == c.Name)),
Total = db.Students.Count(s => s.UniversityId == c.Id || s.CurrentCollege == c.Name)
}).ToList();
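The SUM(CASE ...) pattern from the SQL above can also be written as a single grouped pass in LINQ. The sketch below is a simplified in-memory illustration: the `Student` class is an assumed shape in which the college name is already resolved on each student, so the join on UniversityId or CurrentCollege is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed, simplified shape: the college name is already attached to the student.
public sealed class Student
{
    public string Status { get; set; } = "";
    public string College { get; set; } = "";
}

public static class StatusCounts
{
    // One row per college with conditional counts, mirroring
    // SUM(CASE WHEN Status = '...' THEN 1 ELSE 0 END) and COUNT(*).
    public static List<(string College, int Accepts, int WebApps, int Total)> PerCollege(
        IEnumerable<Student> students)
    {
        return students
            .GroupBy(s => s.College)
            .OrderBy(g => g.Key, StringComparer.Ordinal)
            .Select(g => (
                College: g.Key,
                Accepts: g.Count(s => s.Status == "Accepted"),
                WebApps: g.Count(s => s.Status == "WebApp"),
                Total: g.Count()))
            .ToList();
    }
}
```

As in the SQL, Total counts every student of the college, including those with status 'Prospect'. If it should be Accepts plus WebApps only, the two counts can be added instead.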
Try this: http://www.linqpad.net/
It's free, and you can use it to see the SQL generated from your LINQ queries.
I have the following LINQ query:
var quantity = (from p in context.StoreInventory
where p.BookId== BookId
&& p.StoreAddress == StoreAddress
select p).Sum(i => i.Quantity);
I am getting this error:
The method 'Sum' is not supported
Can anyone tell me the reason and the required changes?
var quantity = (from p in context.StoreInventory
where p.BookId== BookId
&& p.StoreAddress == StoreAddress
select p.Quantity).Sum();
This should work: the sum is performed on the 'Quantity' column, which is projected by the select clause. That's because Sum(expression) is not supported by LINQ to Entities, but the parameterless Sum() is.
All the work should be done by the database, so the application retrieves no rows, just a single number.
Use Enumerable.ToList before you call Sum to convert the query to collection.
var quantity = (from p in context.StoreInventory
where p.BookId== BookId
&& p.StoreAddress == StoreAddress
select p).ToList().Sum(i => i.Quantity);
Edit: This brings back all the rows and then applies the sum in memory, which is not efficient. Since you only need to sum the quantity, you can select the quantity instead of the whole row:
var quantity = (from p in context.StoreInventory
where p.BookId== BookId
&& p.StoreAddress == StoreAddress
select p.Quantity).Sum();