Simple LINQ query to count by month - c#

This should be a fairly basic LINQ query but I am confused on how to get this in the format I need.
Here's what I have right now, which I realize isn't even close:
var accidents = DataParser
.Parser
.ParseData()
.Where(w => w.EventDate.Year == year)
.GroupBy(g => g.EventDate.Month)
.Count()
.ToList();
The ToList isn't going to work here due to the Count.
I need the data in the simple format of:
Month Number | EventCount
1 | 45
2 | 62
3 | 42
... etc through month 12, preferably as a List<Events> (the DataParser returns Events objects).

You need to use your group like this:
var accidents = DataParser
.Parser
.ParseData()
.Where(w => w.EventDate.Year == year)
.GroupBy(g => g.EventDate.Month)
.Select(g => new { Month = g.Key, Count = g.Count() })
.ToList();
That will give you a list of anonymous objects. If you need a list of Events, create an Event object instead: .Select(g => new Event() { Month = g.Key, Count = g.Count() })
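For reference, here is a minimal, self-contained sketch of the same grouping run against in-memory data. The `Accident` and `MonthCount` records and the `MonthlyCounts.ByMonth` helper are hypothetical stand-ins, since the question does not show the real Events type returned by DataParser. The sketch also fills in months that have no events, because the question asks for all twelve months and a plain GroupBy only produces groups that exist in the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: the question does not show the real Events type.
public record Accident(DateTime EventDate);
public record MonthCount(int Month, int EventCount);

public static class MonthlyCounts
{
    // Counts events per month for one year, including months with zero events.
    public static List<MonthCount> ByMonth(IEnumerable<Accident> events, int year)
    {
        // Group once, then look the counts up by month number.
        var counts = events
            .Where(e => e.EventDate.Year == year)
            .GroupBy(e => e.EventDate.Month)
            .ToDictionary(g => g.Key, g => g.Count());

        // Emit months 1..12 so that empty months appear with a count of 0.
        return Enumerable.Range(1, 12)
            .Select(m => new MonthCount(m, counts.TryGetValue(m, out var c) ? c : 0))
            .ToList();
    }
}
```

Calling `MonthlyCounts.ByMonth(events, year)` returns twelve entries in month order, which maps directly onto the "Month Number | EventCount" layout from the question.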

Related

Getting the count of most repeated records in Linq

I am working on an application in which I have to store play history of a song in the data table. I have a table named PlayHistory which has four columns.
Id | SoundRecordingId(FK) | UserId(FK) | DateTime
Now I have to implement a query that returns the songs that are trending, i.e. the most played ones. I have written the following query in SQL Server, which returns data fairly close to what I want.
select COUNT(*) as High,SoundRecordingId
from PlayHistory
where DateTime >= GETDATE()-30
group by SoundRecordingId
Having COUNT(*) > 1
order by SoundRecordingId desc
It returned me following data:
High SoundRecordingId
2 5
2 3
This means the songs with IDs 5 and 3 were played the most, i.e. 2 times each.
How can I implement this through LINQ in C#?
I have done this so far:
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.Take(20)
.ToList();
It returns the whole table grouped by SoundRecordingId, but I want just the counts of the most repeated records.
Thanks
There is an overload of the .GroupBy method which will solve your problem.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Take(20)
.ToList();
I have simply added the result selector to the GroupBy method call here which does the same transformation you have written in your SQL.
The overload in question is the GroupBy overload that takes a key selector and a result selector.
To go further into your problem, you will probably want to do another OrderByDescending to get your results in popularity order. To match the SQL statement you also have to filter for only counts > 1.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Where(x=>x.High>1)
.OrderByDescending(x=>x.High)
.ToList();
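To try the grouping without a database, here is a small sketch of the same pipeline on in-memory data. The `Play` record and the `Trending.MostPlayed` helper are hypothetical names, not part of the question's model. It uses value tuples for brevity, which only works in LINQ to Objects: tuple literals are not allowed inside expression trees, so against `_db.PlayHistories` you would keep the anonymous type shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a PlayHistory row.
public record Play(int SoundRecordingId, DateTime PlayedAt);

public static class Trending
{
    // Returns (id, count) pairs for recordings played more than once since `since`,
    // most played first; ties are ordered by id so the result is deterministic.
    public static List<(int SoundRecordingId, int High)> MostPlayed(
        IEnumerable<Play> plays, DateTime since, int take = 20)
    {
        return plays
            .Where(p => p.PlayedAt >= since)
            // GroupBy overload with a result selector: one output item per key.
            .GroupBy(p => p.SoundRecordingId,
                     (key, values) => (SoundRecordingId: key, High: values.Count()))
            .Where(x => x.High > 1)          // mirrors HAVING COUNT(*) > 1
            .OrderByDescending(x => x.High)  // popularity order
            .ThenBy(x => x.SoundRecordingId)
            .Take(take)
            .ToList();
    }
}
```

Note that filtering on the count happens after the grouping, just as HAVING runs after GROUP BY in the SQL version.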
I prefer the LINQ query syntax, since it's similar to SQL:
var query = from history in _db.PlayHistories
where history.DateTime >= monthBefore
group history by history.SoundRecordingId into historyGroup
where historyGroup.Count() > 1
orderby historyGroup.Key
select new { High = historyGroup.Count(), SoundRecordingId = historyGroup.Key };
var data = query.Take(20).ToList();
You're almost done. Just order your groups by the count and take the first:
var max =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.OrderByDescending(x => x.Count())
.First();
This gives you a single group (an IGrouping) whose Key is your SoundRecordingId; calling Count() on it gives the number of its occurrences in your input list.
EDIT: To get all records with that count, choose this instead:
var grouped =
_db.PlayHistories
.OrderByDescending(x => x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x => x.SoundRecordingId)
.Select(x => new { Id = x.Key, Count = x.Count() })
.OrderByDescending(x => x.Count)
.ToList();
var maxCount = grouped.First().Count;
var result = grouped.Where(x => x.Count == maxCount);
This gives you the IDs of the 20 most played recordings from the last month, ordered by play count:
var monthBefore = DateTime.Now.AddMonths(-1);
var list = _db.PlayHistories.Where(x => x.DateTime >= monthBefore)
.GroupBy(x => x.SoundRecordingId)
.OrderByDescending(g => g.Count())
.ThenBy(g => g.Key)
.Select(g => g.Key).Take(20).ToList();

Selecting fields in grouping

I have data like the following inside my DataTable:
id vrn seenDate
--- ---- --------
1 ABC 2017-01-01 20:00:05
2 ABC 2017-01-01 18:00:09
3 CCC 2016-05-05 00:00:00
I am trying to modify the data to only show vrn values with the most recent date. This is what I have done so far:
myDataTable.AsEnumerable().GroupBy(x => x.Field<string>("vrn")).Select(x => new { vrn = x.Key, seenDate = x.Max(y => y.Field<DateTime>("seenDate")) });
I need to modify the above to also select the id field (i.e. I do not want to group on this field, but I want to have it included in the resulting data set).
I cannot put x.Field<int>("id") in the Select() part, because x is a grouping there rather than a row, so Field is not available on it.
You need an equivalent of the MaxBy method from MoreLINQ.
In standard LINQ it can be emulated with OrderByDescending + First calls:
var result = myDataTable.AsEnumerable()
.GroupBy(x => x.Field<string>("vrn"))
.Select(g => g.OrderByDescending(x => x.Field<DateTime>("seenDate")).First())
.Select(x => new
{
vrn = x.Field<string>("vrn"),
id = x.Field<int>("id"),
seenDate = x.Field<DateTime>("seenDate"),
});
You can use select new like this to select anything you want from your data
var query = from pro in db.Projects
select new { pro.ProjectName, pro.ProjectId };
If several ids in the same vrn may share the same max date, the following would work:
IEnumerable<DataRow> rows = myDataTable.AsEnumerable()
.GroupBy(x => x.Field<string>("vrn"))
.Select(x => new
{
Grouping = x,
MaxSeenDate = x.Max(y => y.Field<DateTime>("seenDate"))
})
.SelectMany(arg =>
arg.Grouping.Where(y => y.Field<DateTime>("seenDate") == arg.MaxSeenDate));
It will return an IEnumerable of the original DataRow objects, so you have all your fields there.
Or you can add another select to have only the fields you need.
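As a quick way to check the "latest row per group" idea outside a DataTable, here is a sketch using plain records. The `Sighting` record and the `LatestSightings.LatestPerVrn` helper are hypothetical names chosen for illustration; the columns follow the sample data in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed stand-in for one DataTable row from the question.
public record Sighting(int Id, string Vrn, DateTime SeenDate);

public static class LatestSightings
{
    // One row per vrn: the complete row with the most recent SeenDate,
    // so Id is available without grouping on it.
    public static List<Sighting> LatestPerVrn(IEnumerable<Sighting> rows) =>
        rows.GroupBy(r => r.Vrn)
            // Emulates MaxBy: sort each group newest first and keep the first row.
            .Select(g => g.OrderByDescending(r => r.SeenDate).First())
            .ToList();
}
```

Because the whole row is kept rather than only the aggregated date, every other column comes along for free. If ties on the latest date must all be returned, the SelectMany variant above is the one to use.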

Getting most occured string value using LINQ C#

I have a list of strings which I group by their occurrence in the list like following (where key is the list of those strings):
mostCommonKeywords = key.GroupBy(v => v)
.OrderByDescending(g => g.Count()) // here I get the count number
.Select(g => g.Key)
.Distinct()
.ToList();
What I want to do now is, once they are sorted, get their counts, since LINQ can clearly determine the counts in order to sort them. How can I get the occurrence count of each string in the list?
Edit:
Let's say I have count values that look like this:
124
68
55
48
32
19
13
10
I cannot simply add one of these values to a variable called "Count" as @octavioccl suggested. I clearly have to store them in some kind of list.
Using the Select and projecting in an anonymous type:
var result=key.GroupBy(v => v)
.Select(g => new {g.Key, Count=g.Count()})
.OrderByDescending(e => e.Count)
// .Distinct()// Don't need this call
.ToList();
Update
You can also project your query using a DTO:
public class CustomDTO
{
public string Key{get;set;} // Change the type in case you need to
public int Count{get;set;}
}
So, your query would be:
var result=key.GroupBy(v => v)
.Select(g => new CustomDTO{Key=g.Key, Count=g.Count()})
.OrderByDescending(e => e.Count)
.ToList();
The same thing in query syntax:
from a in key
group a by a into g
select new { g.Key, Count = g.Count() };
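If the counts need to leave the method where they are computed, an anonymous type is awkward to return. One option, sketched here under the assumption that a simple key/value pair is enough, is to project into `KeyValuePair<string, int>`. The `KeywordCounts.CountOccurrences` helper is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeywordCounts
{
    // Returns each distinct keyword with its number of occurrences, most frequent first.
    public static List<KeyValuePair<string, int>> CountOccurrences(IEnumerable<string> keywords) =>
        keywords
            .GroupBy(k => k)                  // one group per distinct string
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)  // highest count first
            .ToList();
}
```

The keys of the groups are already unique, which is why no Distinct() call is needed after GroupBy.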

Determine Duplicate data using LINQ to EF

I have a dataset that I want to group to find duplicate data.
For example, I have a dataset that looks like this:
|id | Number | ContactID
1 1234 5
2 9873 6
3 1234 7
4 9873 6
Now I want to select the data that has more than one occurrence of Number, but only if the ContactID is not the same.
So basically return
| Number | Count |
1234 2
Any help would be appreciated using LINQ to EF, thanks.
Update:
All thanks to @DrCopyPaste, who told me that I had misunderstood your problem. Here is the correct solution:
var result = from c in db.list
group c by c.Number into g
let count = g.GroupBy(x => x.ContactID).Where(x => x.Count() == 1).Count()
where count != 0
select new
{
Number = g.Key,
Count = count
};
Sample Fiddle.
This query avoids a custom IEqualityComparer because, if I remember correctly, those don't play well with EF.
var results = data.GroupBy(number => number.Number)
.Where(number => number.Count() > 1)
.Select(number => new
{
Number = number.Key,
Count = number.GroupBy(contactId => contactId.ContactId).Count(x => x.Count() == 1)
})
.Where(x => x.Count > 0).ToList();
Fiddle
It does an initial GroupBy to get all Numbers that are duplicated. It then selects a new type that contains the number and a second GroupBy that groups by ContactId then counts all groups with exactly one entry. Then it takes all results whose count is greater than zero.
I have not tested it against EF, but the query uses only standard LINQ operators, so EF shouldn't have any issues translating it.
Another way of doing this(using 1 level of grouping):
var results = data
.Where(x => data.Any(y => y.Id != x.Id && y.Number == x.Number && y.ContactId != x.ContactId))
.GroupBy(x => x.Number)
.Select(grp => new { Number = grp.Key, Count = grp.Count() })
.ToList();
Fiddle
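To see how the two-level grouping behaves on the sample data from the question, here is an in-memory sketch. The `Phone` record and the `Duplicates.SharedNumbers` helper are hypothetical names, and the value tuple in the projection is a LINQ to Objects convenience, not something to use inside an EF query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed stand-in for one row of the sample dataset.
public record Phone(int Id, int Number, int ContactId);

public static class Duplicates
{
    // Numbers that occur more than once, counted by contacts that hold them exactly once.
    public static List<(int Number, int Count)> SharedNumbers(IEnumerable<Phone> data) =>
        data.GroupBy(p => p.Number)
            .Where(g => g.Count() > 1)  // keep only repeated numbers
            .Select(g => (Number: g.Key,
                          // Inner grouping: count the contacts that appear exactly once.
                          Count: g.GroupBy(p => p.ContactId).Count(c => c.Count() == 1)))
            .Where(x => x.Count > 0)    // drop numbers repeated only by one contact
            .ToList();
}
```

With the four sample rows, 9873 is dropped because both occurrences belong to contact 6, while 1234 is kept with a count of 2.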

Multiple group by with aggregate in Linq

I currently have this code:
foreach (var newsToPolitician in news.NewsToPoliticians)
{
var politician = newsToPolitician.Politician;
var votes = (from s in db.Scores
where s.IDPolitician == politician.IDPolitician
&& s.IDNews == IDNews
group s by s.IDAtribute
into g
select new{
Atribute= g.Key,
TotalScore= g.Sum(x => x.Score)
}).ToList();
}
It works alright, but I want to avoid making multiple queries to my database in foreach loop.
My table Scores looks like this:
IDScore | IDNews | IDUser | IDPolitician | IDAtribute | Score
1 40 1010 35 1 1
2 40 1010 35 2 -1
3 40 1002 35 1 1
4 40 1002 35 2 1
5 40 1002 40 1 -1
...
My goal is to aggregate all the scores for all politicians in a news item. A news item can have up to 7 politicians.
Is it expensive to call my database up to seven times in a foreach loop? I know that isn't best practice, so I'm interested in whether there is any way to avoid it in this particular case: make one call to the database and then process the result on the server side.
Update: Following the comments, I have reworked this to try to ensure the aggregation happens on the server.
In this case we can group on the server by both IDPolitician and IDAtribute, and then pull the groups into a local lookup with ToLookup, like so:
var result = db.Scores.Where(s => s.IDNews == IDNews)
.Where(s => news.NewsToPoliticians
.Select(n => n.Politician.IDPolitician)
.Contains(s.IDPolitician))
.GroupBy(s => new
{
s.IDPolitician,
s.IDAtribute
},
(k,g ) => new
{
k.IDPolitician,
k.IDAtribute,
Sum = g.Sum(x => x.Score)
})
.ToLookup(anon => anon.IDPolitician,
anon => new { anon.IDAtribute, anon.Sum });
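The shape of the result may be easier to see on in-memory data. The following sketch assumes a hypothetical `ScoreRow` record mirroring the Scores table and a hypothetical `ScoreTotals.ByPolitician` helper. It runs in LINQ to Objects, so it says nothing about how a given provider translates the query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed stand-in for one row of the Scores table.
public record ScoreRow(int IDNews, int IDPolitician, int IDAtribute, int Score);

public static class ScoreTotals
{
    // Sums scores per (politician, attribute) for one news item,
    // then indexes the totals by politician id.
    public static ILookup<int, (int IDAtribute, int Sum)> ByPolitician(
        IEnumerable<ScoreRow> scores, int idNews, IEnumerable<int> politicianIds)
    {
        var ids = politicianIds.ToHashSet();
        return scores
            .Where(s => s.IDNews == idNews && ids.Contains(s.IDPolitician))
            // Composite key: anonymous types compare by value, so this groups on both columns.
            .GroupBy(s => new { s.IDPolitician, s.IDAtribute },
                     (k, g) => new { k.IDPolitician, k.IDAtribute, Sum = g.Sum(x => x.Score) })
            .ToLookup(x => x.IDPolitician, x => (x.IDAtribute, x.Sum));
    }
}
```

Indexing the lookup with a politician id, for example `totals[35]`, yields that politician's per-attribute totals, and an id with no scores yields an empty sequence rather than throwing.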
Legacy -
You want to use GroupJoin here, it would be something along the lines of:
var result = news.NewsToPoliticians
.GroupJoin( db.Scores.Where(s => s.IDNews == IDNews),
p => p.IDPolitician,
s => s.IDPolitician,
(k,g) => new
{
PoliticianId = k.IDPolitician,
GroupedVotes = g.GroupBy(s => s.IDAtribute,
(id, group) => new
{
Atribute = id,
TotalScore = group.Sum(x => x.Score)
})
})
.ToList();
However, you are at the mercy of your provider as to how it translates this, so it might still result in multiple queries. To get around this you could use something like:
var politicianIds = news.NewsToPoliticians.Select(p => p.IDPolitician).ToList();
var result = db.Scores.Where(s => s.IDNews == IDNews)
.Where(s => politicianIds.Contains(s.IDPolitician))
.GroupBy(p => p.IDPolitician,
(k,g) => new
{
PoliticianId = k,
GroupedVotes = g.GroupBy(s => s.IDAtribute,
(id, group) => new
{
Atribute = id,
TotalScore = group.Sum(x => x.Score)
})
})
.ToList();
This should hopefully be at most 2 queries (depending on whether NewsToPoliticians is db dependent). You'll just have to try it out and see.
Use a stored procedure and get the SQL Server engine to do all the work. You can still use LINQ to call the stored procedure, and this will minimize the number of calls to the database.
