Linq is very slow with predicate and order by clause - c#

We have a List holding 20K objects, each of which contains a date. We want to find the most recent date in that list that satisfies one condition, using the code below.
listObject.Where(r => r.Date <= asOfDate).OrderByDescending(r => r.Date).FirstOrDefault();
This is taking much longer than expected.
Can you please suggest a better way to do it?
Thank You!

You can do this (based on @Barns' comment):
var maxDate = listObject.Where(r => r.Date <= asOfDate).Max(r => r.Date);
var item = listObject.FirstOrDefault(r => r.Date == maxDate);
This loops over your list only twice instead of sorting it.

Try using Aggregate:
listObject
.Where(r => r.Date <= asOfDate)
.Aggregate((acc, curr) => curr.Date > acc.Date ? curr : acc);
Performance-wise, this can be improved by moving the filtering logic inside Aggregate and seeding it with a null accumulator (with null handling inside the lambda), but if performance is a big concern, just switch to a for loop.
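The for loop mentioned above could look something like the following. This is only a sketch: the Record class and the FindLatestOnOrBefore name are made up for illustration, since the question does not show the element type. It filters and tracks the maximum in a single pass, and returns null when nothing qualifies, which mirrors FirstOrDefault.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type; the question only tells us it has a Date.
public sealed class Record
{
    public DateTime Date { get; set; }
}

public static class LatestFinder
{
    // Returns the record with the greatest Date that is <= asOfDate,
    // or null when no record qualifies.
    public static Record FindLatestOnOrBefore(IReadOnlyList<Record> records, DateTime asOfDate)
    {
        Record best = null;
        for (int i = 0; i < records.Count; i++)
        {
            Record current = records[i];
            if (current.Date > asOfDate)
            {
                continue; // fails the filter
            }
            // Strict '>' keeps the earliest element among equal dates.
            if (best == null || current.Date > best.Date)
            {
                best = current;
            }
        }
        return best;
    }
}
```

Because the comparison is strict, ties on Date resolve to the element that appears first in the list, which is what the stable OrderByDescending followed by FirstOrDefault returns as well.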

You currently have three operations:
.Where(r => r.Date <= asOfDate) - time complexity O(n)
.OrderByDescending(r => r.Date) - time complexity (I imagine) O(n log(n))
.FirstOrDefault(); - time complexity O(1)
You could do the following and get the same result:
var maxDate = listObject.Where(r => r.Date <= asOfDate).Max(r => r.Date); - time complexity O(n)
var result = listObject.FirstOrDefault(r => r.Date == maxDate); - time complexity O(n)

Why not combine Where and Max operations like this:
var maxDate = listObject.Max(r => r.Date <= asOfDate ? r.Date : DateTime.MinValue);
var item = listObject.FirstOrDefault(r => r.Date == maxDate);
This will run over the list only twice.

Have you tried sorting the collection first?
listObject
.OrderByDescending(ordr => ordr.Date)
.Where(obj => obj.Date <= asOfDate)
.FirstOrDefault();

Related

Getting the count of most repeated records in Linq

I am working on an application in which I have to store play history of a song in the data table. I have a table named PlayHistory which has four columns.
Id | SoundRecordingId(FK) | UserId(FK) | DateTime
Now I have to implement a query that returns the songs that are trending, i.e. the most played. I have written the following query in SQL Server, which returns data close to what I want.
select COUNT(*) as High,SoundRecordingId
from PlayHistory
where DateTime >= GETDATE()-30
group by SoundRecordingId
Having COUNT(*) > 1
order by SoundRecordingId desc
It returned the following data:
High SoundRecordingId
2 5
2 3
This means the songs with Ids 5 and 3 were played the most, i.e. 2 times each.
How can I implement this with LINQ in C#?
I have done this so far:
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.Take(20)
.ToList();
It returns a list of the whole table with the count of SoundRecording objects, but I just want the count of the most repeated records.
Thanks
There is an overload of the .GroupBy method which will solve your problem.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Take(20)
.ToList();
I have simply added the result selector to the GroupBy method call here which does the same transformation you have written in your SQL.
The method overload in question is documented here
To go further into your problem, you will probably want to do another OrderByDescending to get your results in popularity order. To match the SQL statement you also have to filter for only counts > 1.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Where(x=>x.High>1)
.OrderByDescending(x=>x.High)
.ToList();
I like the LINQ query syntax; it's similar to SQL:
var query = from history in _db.PlayHistories
where history.DateTime >= monthBefore
group history by history.SoundRecordingId into historyGroup
where historyGroup.Count() > 1
orderby historyGroup.Key
select new { High = historyGroup.Count(), SoundRecordingId = historyGroup.Key };
var data = query.Take(20).ToList();
You're almost done. Just order your list by the count and take the first:
var max =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.OrderByDescending(x => x.Count())
.First();
This gives you a single group whose Key is your SoundRecordingId; calling Count() on it gives the number of its occurrences in your input list.
EDIT: To get all records with that count, choose this instead:
var grouped =
_db.PlayHistories
.OrderByDescending(x => x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x => x.SoundRecordingId)
.Select(x => new { Id = x.Key, Count = x.Count() })
.OrderByDescending(x => x.Count)
.ToList();
var maxCount = grouped.First().Count;
var result = grouped.Where(x => x.Count == maxCount);
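For reference, here is a self-contained LINQ to Objects sketch of the same idea. The Play class, its PlayedAt property, and the MostPlayed method are made-up names for illustration and are not the asker's entity model. It also guards against an empty result, where calling First() would throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a PlayHistory row.
public sealed class Play
{
    public int SoundRecordingId { get; set; }
    public DateTime PlayedAt { get; set; }
}

public static class Trending
{
    // Returns (SoundRecordingId, play count) pairs for every recording
    // that ties for the highest play count since the given date.
    public static List<KeyValuePair<int, int>> MostPlayed(IEnumerable<Play> plays, DateTime since)
    {
        var counts = plays
            .Where(p => p.PlayedAt >= since)
            .GroupBy(p => p.SoundRecordingId)
            .Select(g => new KeyValuePair<int, int>(g.Key, g.Count()))
            .ToList();

        if (counts.Count == 0)
        {
            return counts; // nothing played in the window
        }

        int maxCount = counts.Max(c => c.Value);
        return counts
            .Where(c => c.Value == maxCount)
            .OrderByDescending(c => c.Key)
            .ToList();
    }
}
```

Whether an equivalent query is translated to a single SQL statement depends on the LINQ provider, so treat this as a description of the logic rather than of what runs on the server.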
This solves the problem by giving you what you asked for. Your query in LINQ, returning just the play counts.
var list = _db.PlayHistories.Where(x => x.DateTimeProp > DateTime.Now.AddMonths(-1))
.GroupBy(x => x.SoundRecordingId)
.OrderByDescending(g => g.Count())
.ThenBy(g => g.Key)
.Select(g => g.Count()).Take(20).ToList();

Entity Framework - paging with group by skipping group

I have the following query:
var enumerable = repository.Elemtents.Where((s) =>
DbFunctions.TruncateTime(s.Timestamp) <= parameter.To.Date &&
DbFunctions.TruncateTime(s.Timestamp) >= parameter.From.Date)
.OrderByDescending((s) => s.Timestamp)
.GroupBy((s) => new {Date = DbFunctions.TruncateTime(s.Timestamp), s.Timestamp.Hour})
.OrderByDescending((s) => s.Key.Date);
I now want to apply paging with Skip() and Take(). My table (protocol entries) can contain a large amount of data, so I could do the following, but it would perform poorly.
var result = enumerable
.ToList()
.SelectMany((x) => x)
.Skip(0)
.Take(2);
I want to apply Skip() and Take() on the query directly so that it will be done on the sql server. If I do the following, I get weird results:
var result = repository.Elemtents.Where((s) =>
DbFunctions.TruncateTime(s.Timestamp) <= parameter.To.Date &&
DbFunctions.TruncateTime(s.Timestamp) >= parameter.From.Date)
.OrderByDescending((s) => s.Timestamp)
.GroupBy((s) => new {Date = DbFunctions.TruncateTime(s.Timestamp), s.Timestamp.Hour})
.OrderByDescending((s) => s.Key.Date)
.Skip(0)
.Take(2)
.ToList();
Does anyone know how to resolve this?

Linq statement distinct not working on dates

I've been looking all over stackoverflow.com and the Interwebz to find out how to use Linq with Distinct(), but I'm not having any luck with my situation.
What I'm trying to do is show a list of dates (Aug 2015, July 2015, etc) with each date showing just once.
What happens is that the months are duplicated because you can publish a new blog post more than once in a month. I thought using Distinct would help, but it seems like it's not doing anything. When I try adding GroupBy(), my OrderByDescending stops working, and I'm not seasoned enough to simply turn this into an IList, which I've seen in a couple other examples.
BlogRepeater.DataSource = _PostRepository.GetAll(ConfigurationManager.GetSiteID())
.Where(x => x.DatePublished != null && x.DatePublished >= DateTime.Now.AddYears(-1))
.Distinct().OrderByDescending(x => x.DatePublished)
.Select(x => x.DatePublished.Value.Date);
What is the best way to be doing this?
I've tried taking pieces from other examples, to no avail. Thanks in advance.
UPDATE: Thanks for the help! Here is working code in hopes it can help someone else in the future.
Code Behind:
BlogRepeater.DataSource =
_PostRepository
.GetAll(ConfigurationManager.GetSiteID())
.Where(x => x.DatePublished != null && x.DatePublished >= DateTime.Now.AddYears(-1) && x.Status == "Published")
.Select(x => x.DatePublished.Value.ToString("MMM yyyy"))
.Distinct()
.OrderBy(x => x);
BlogRepeater.DataBind();
Front End:
<%#(Container.DataItem)%>
If your DatePublished field contains DateTime values with different times, the .Distinct() will not behave how you expect it to, because those values are essentially different.
If you want distinct dates, not date/times, then you can move the last .Select before the .Distinct():
BlogRepeater.DataSource =
_PostRepository
.GetAll(ConfigurationManager.GetSiteID())
.Where(x => x.DatePublished != null && x.DatePublished >= DateTime.Now.AddYears(-1))
.Select(x => x.DatePublished.Value.Date)
.Distinct()
.OrderByDescending(x => x);
If you want to find the distinct months, not dates, then you have to change the
.Select(x => x.DatePublished.Value.Date)
line to
.Select(x => x.DatePublished.Value.ToString("MMM yyyy"))
Feel free to change the "MMM yyyy" format to anything else you find suitable.
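One thing to watch for: once the values are formatted as "MMM yyyy" strings, ordering them sorts alphabetically ("Apr 2015" before "Aug 2015" before "Jul 2015"), not chronologically. A possible way around that, sketched here on a plain in-memory list of DateTime values, is to reduce each date to the first day of its month, apply Distinct and the ordering on those DateTime values, and format last. The MonthList class and DistinctMonths method are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class MonthList
{
    // Returns one "MMM yyyy" label per distinct month, newest month first.
    public static List<string> DistinctMonths(IEnumerable<DateTime> published)
    {
        return published
            .Select(d => new DateTime(d.Year, d.Month, 1)) // collapse to the month
            .Distinct()
            .OrderByDescending(d => d)                     // chronological, not alphabetical
            .Select(d => d.ToString("MMM yyyy", CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

The invariant culture is used here only to make the month abbreviations predictable; a site would normally format with its own culture.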

Have lambda return 0 if count is null

Struggling to get my count correct. I believe I am close, but I need it to return 0 if there are no records in the database. The compiler doesn't like what I have now. Any help would be appreciated. Thanks
var count = (_db.cart.Where(c => c.UserId == id)
.Select(c => (int) c.Quantity)).ToList().Count() ?? 0;
You want to use Sum, not Count
var totalQuantity = _db.cart.Where(c => c.UserId == id)
.Select(c => c.Quantity)
.DefaultIfEmpty(0)
.Sum();
No need for the null coalesce.
var count = _db.cart.Count(c => c.UserId == id); // get record count
If you're actually trying to get a sum:
var total = _db.cart.Where(c => c.UserId == id)
.Select(c => (int?)c.Quantity)
.Sum() ?? 0; // get total
FYI, regarding the code you originally posted...
You wouldn't want to call ToList().Count(), because calling ToList() will execute your query and pull back data. You'll end up pulling back a list of quantities and performing the count locally, instead of generating an SQL statement that simply performs the count and returns a single number.
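To make the difference between the record count and the quantity total concrete, here is a small in-memory sketch. The CartLine class and the CartTotals helper are made-up names for illustration; the real cart entity is not shown in the question. Both methods return 0 for a user with no rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a cart row.
public sealed class CartLine
{
    public int UserId { get; set; }
    public int Quantity { get; set; }
}

public static class CartTotals
{
    // Number of cart rows for the user; Count never returns null.
    public static int CountLines(IEnumerable<CartLine> cart, int userId)
    {
        return cart.Count(c => c.UserId == userId);
    }

    // Sum of quantities for the user; the nullable cast plus ?? 0 keeps the
    // result at 0 even if a provider yields null for an empty set.
    public static int TotalQuantity(IEnumerable<CartLine> cart, int userId)
    {
        return cart.Where(c => c.UserId == userId)
                   .Select(c => (int?)c.Quantity)
                   .Sum() ?? 0;
    }
}
```

In LINQ to Objects the sum over an empty sequence is already 0, so the ?? 0 matters mainly when the query is translated to SQL, where SUM over no rows yields NULL.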
Try this:
var count = _db.cart.Where(c => c.UserId == id).Count();

LINQ - How to query a range of effective dates that only has start dates

I'm using C# 3.5 and EntityFramework. I have a list of items in the database that contain interest rates. Unfortunately this list only contains the Effective Start Date. I need to query this list for all items within a range.
However, I can't see a way to do this without querying the database twice. (Although I'm wondering if delayed execution with EntityFramework is making only one call.) Regardless, I'm wondering if I can do this without using my context twice.
internal IQueryable<Interest> GetInterests(DateTime startDate, DateTime endDate) {
var FirstDate = Context.All().Where(x => x.START_DATE < startDate).Max(x => x.START_DATE);
IQueryable<Interest> listOfItems = Context.All().Where(x => x.START_DATE >= FirstDate && x.START_DATE <= endDate);
return listOfItems;
}
If you could use a LINQ query, you can use let to do this:
(from c in dbContext.Table
let firstdate = dbContext.Table.Where(i => i.StartDate < startDate).Max(i => i.StartDate)
where c.StartDate >= firstdate
&& c.StartDate <= endDate
select c)
I'm not sure if the max will work this way, so you may need to alternatively do:
(from c in dbContext.Table
let firstdate = dbContext.Table.Select(i => i.StartDate).Where(d => d < startDate).Max()
where c.StartDate >= firstdate
&& c.StartDate <= endDate
select c)
Something like that.
I haven't tried this on EF but on Linq to objects it works fine:
var result = source
.OrderBy(x => x.start)
.GroupBy(x => x.start < startDate)
.SelectMany((x, i) => i == 0 ? new[] {new { value = x.Last().value, start = x.Last().start }} : x.Where(y => y.start < endDate));
The issue is that C# LINQ is missing an operator which gives you access to the previous item in a sequence. F# apparently can handle this. Workarounds involved either a GroupBy or an Aggregate operation. In this case, GroupBy can handle it.
It's not pretty and I wouldn't recommend using it over the two phase approach.
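For comparison, the two-phase approach can be sketched in LINQ to Objects as below. The Interest class shape and the RateLookup.InRange name are assumptions made for illustration; the question only shows a START_DATE column. The nullable cast lets Max return null instead of throwing when no rate starts before startDate, in which case the range simply begins at startDate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for an interest-rate row.
public sealed class Interest
{
    public DateTime StartDate { get; set; }
    public decimal Rate { get; set; }
}

public static class RateLookup
{
    // Returns the rate already in effect at startDate (if any) plus every
    // rate that becomes effective up to and including endDate.
    public static List<Interest> InRange(IEnumerable<Interest> rates, DateTime startDate, DateTime endDate)
    {
        List<Interest> all = rates.ToList();

        // Phase 1: latest start date strictly before the range, or null.
        DateTime? firstDate = all
            .Where(r => r.StartDate < startDate)
            .Select(r => (DateTime?)r.StartDate)
            .Max();

        DateTime lowerBound = firstDate ?? startDate;

        // Phase 2: everything from that date through endDate.
        return all
            .Where(r => r.StartDate >= lowerBound && r.StartDate <= endDate)
            .OrderBy(r => r.StartDate)
            .ToList();
    }
}
```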
