Using Linq to get maximum value of only duplicates - c#

I am trying to use Linq to return all of the items in my object that have a specific property value duplicated, along with the maximum value of another property across those duplicates.
My object has a property CourseInfoId, which is how I want to check for duplicates, and a property Priority, for which I want the maximum value (plus lots of other properties).
I thought this would work, but it's giving me every item in the object.
var group = from a in r
            group a by a.CourseInfoId into b
            let maxPriority = b.Max(d => d.Priority)
            where b.Skip(1).Any()
            from c in b
            where c.Priority == maxPriority
            select c;
Where am I going wrong?

What you'll want to do is group by CourseInfoId, then keep only the groups that have more than one item, which gives you all of the duplicate items. Next, flatten the groups again and take the maximum property value from the results.
var maxPriority = items
    .GroupBy(i => i.CourseInfoId)
    .Where(g => g.Count() > 1)
    .SelectMany(g => g)
    .Max(i => i.Priority);
EDIT: I see now that you only want to check the properties of the duplicates, not all of the items with a duplicate ID. All you have to do is skip the first item of each group in the .SelectMany() call:
var maxPriority = items
    .GroupBy(i => i.CourseInfoId)
    .Where(g => g.Count() > 1)
    .SelectMany(g => g.Skip(1))
    .Max(i => i.Priority);
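For anyone who wants to try this out, here is a minimal in-memory sketch of the first query. The Course record and the sample data are made up for illustration (the real object has many more properties), and MaxPerDuplicatedId is an extra variant I am assuming might be closer to what was asked, since it keeps one maximum per duplicated id instead of a single overall value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's object; only the two relevant properties are modelled.
public record Course(int CourseInfoId, int Priority);

public static class DuplicatePriority
{
    // Highest Priority among all items whose CourseInfoId occurs more than once.
    public static int MaxAmongDuplicates(IEnumerable<Course> items) =>
        items.GroupBy(i => i.CourseInfoId)
             .Where(g => g.Count() > 1)
             .SelectMany(g => g)
             .Max(i => i.Priority);

    // Variant (an assumption about the intent): one maximum per duplicated CourseInfoId.
    public static Dictionary<int, int> MaxPerDuplicatedId(IEnumerable<Course> items) =>
        items.GroupBy(i => i.CourseInfoId)
             .Where(g => g.Count() > 1)
             .ToDictionary(g => g.Key, g => g.Max(i => i.Priority));

    public static void Main()
    {
        var items = new List<Course>
        {
            new Course(1, 5), new Course(1, 9),
            new Course(2, 20),                  // id 2 is not duplicated, so 20 is ignored
            new Course(3, 2), new Course(3, 4),
        };

        Console.WriteLine(MaxAmongDuplicates(items)); // prints 9

        foreach (var pair in MaxPerDuplicatedId(items))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Keep in mind that Max over a non-nullable int throws an InvalidOperationException when the sequence is empty, so if there might be no duplicates at all you would need to guard for that.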

Related

How to get count of number of rows where one of the columns entry is the same

First, sorry for the bad title, I don't really know what this is called.
I know that I can use context.Model.Where(a => a.Entity == "example").Count(). But I want something more generic, where I can get a count of how many rows share the same entry in one of the columns.
The end result I want is a list of the counts, like: 3, 1, etc.
You can use a GroupBy statement for this to group by a value of your items, and then Select the result you want from it:
var result = await db.Model
    .GroupBy(x => x.Age)
    .Select(g => new {
        Age = g.Key,
        Count = g.Count(),
    })
    .ToListAsync();
The result is a list of objects that have an Age property with the age value, and a Count property with the number of items that had that Age value.
If you just want the counts, then you can just return those from the Select expression directly:
var result = await db.Model
    .GroupBy(x => x.Age)
    .Select(g => g.Count())
    .ToListAsync();
Note that this will obviously prevent you from saying what age an individual count is representing.
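If you want to see the shape of the result without a database, the same grouping can be sketched against an in-memory list. The Person record and the sample rows here are hypothetical, and this runs as LINQ to Objects rather than through db.Model, so it only illustrates the logic, not the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; Age plays the role of the column being counted.
public record Person(string Name, int Age);

public static class AgeCounts
{
    // One (Age, Count) pair per distinct Age value.
    public static List<(int Age, int Count)> CountByAge(IEnumerable<Person> people) =>
        people.GroupBy(p => p.Age)
              .Select(g => (Age: g.Key, Count: g.Count()))
              .ToList();

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person("Ann", 30), new Person("Bob", 30),
            new Person("Cid", 30), new Person("Dee", 41),
        };

        foreach (var (age, count) in CountByAge(people))
            Console.WriteLine($"Age {age}: {count}");
    }
}
```

With this sample data the counts come out as 3 and 1, which matches the kind of list the question asks for.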
Try using the Distinct method: project to an anonymous object, then apply Count to the result.
context.Model.Where(a => a.Entity == "example").Select(a => new {a.User, a.Address, a.Age}).Distinct().Count()

Remove every first element of grouped collection

I have a collection of elements, and some of these elements are duplicated. I need to extract all records but only the first record if the record is one of a duplicate set.
I was able to group the elements and find all elements that have duplicates, but how do I remove the first element of every group?
var records =
    dbContext.Competitors
        .GroupBy(x => x.Email)
        .Select(x => new { Properties = x,
                           Count = x.Count() }) // count the rows in the group, not the characters in the key
        .Where(x => x.Count > 1)
        .ToList();
EDIT: It seems to be impossible to accomplish this task with EF, because it fails to translate the desired Linq expression to SQL. I'll be happy if someone offers a different approach.
To exclude the first record from each email-address group with more than one entry, you could do this:
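var records = dbContext.Competitors
    .GroupBy(x => x.Email)
    // OrderBy(t => t) only works if the entity type is comparable; ordering by a concrete property is safer.
    .SelectMany(x => (x.Count() == 1) ? x : x.OrderBy(t => t).Skip(1))
    .ToList();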
This is the logic:
Group by a property > Select every group > (Possibly) sort it > Skip the first one
This can be turned into Linq code like this:
// use SelectMany to flatten the groups back into one sequence
var x = list.GroupBy(g => g.Key).Select(grp => grp.Skip(1)).SelectMany(i => i);
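The two answers above behave differently for emails that occur only once, which is easy to miss. Here is a small in-memory sketch that shows both side by side. The Competitor record, its Id property (used for ordering), and the sample data are assumptions for illustration, and the queries run as LINQ to Objects; if EF cannot translate them, calling AsEnumerable() before GroupBy is one way to run this part client-side, at the cost of loading the rows into memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; Id is assumed to exist so the groups can be ordered deterministically.
public record Competitor(int Id, string Email);

public static class SkipFirstOfGroup
{
    // Keeps records with a unique email; drops the first record of every duplicated email.
    public static List<Competitor> KeepSingles(IEnumerable<Competitor> source) =>
        source.GroupBy(c => c.Email)
              .SelectMany(g => g.Count() == 1
                  ? g.AsEnumerable()
                  : g.OrderBy(c => c.Id).Skip(1))
              .ToList();

    // Drops the first record of every group, so unique emails disappear entirely.
    public static List<Competitor> DuplicatesOnly(IEnumerable<Competitor> source) =>
        source.GroupBy(c => c.Email)
              .SelectMany(g => g.OrderBy(c => c.Id).Skip(1))
              .ToList();

    public static void Main()
    {
        var competitors = new List<Competitor>
        {
            new Competitor(1, "a@example.com"), new Competitor(2, "b@example.com"),
            new Competitor(3, "a@example.com"), new Competitor(4, "a@example.com"),
        };

        Console.WriteLine(string.Join(",", KeepSingles(competitors).Select(c => c.Id)));    // prints 3,4,2
        Console.WriteLine(string.Join(",", DuplicatesOnly(competitors).Select(c => c.Id))); // prints 3,4
    }
}
```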

Group and count Dictionary Items

I have a Dictionary<string, CachedImage> with which I'd like to group the items, count the number of items in the group and iterate through each group if the count is greater than 6500.
The CachedImage class contains the Path and ExpiresUtc properties that I am interested in.
My Linq-fu is sadly lacking with complex queries so this is as far as I have got and I think I've already messed up. I'm assuming what I want is possible.
Any help would be appreciated, especially with a quick walkthrough.
Regex searchTerm = new Regex(@"(jpeg|png|bmp|gif)");
var groups = PersistantDictionary.Instance.ToList()
    .GroupBy(x => searchTerm.Match(x.Value.Path))
    .Select(y => new
    {
        Path = y.Key,
        Expires = y.Select(z => z.Value.ExpiresUtc),
        Count = y.Sum(z => z.Key.Count())
    })
    .AsEnumerable();
Try this:
var groups = PersistantDictionary.Instance.ToList()
    .GroupBy(x => searchTerm.Match(x.Value.Path).Value)
    .Where(g => g.Count() > 6500);
foreach (var group in groups)
{
    Console.WriteLine("{0} images for extension {1}", group.Count(), group.Key);
    foreach (KeyValuePair<string, CachedImage> pair in group)
    {
        //Do stuff with each CachedImage.
    }
}
So to break this down:
PersistantDictionary.Instance.ToList()
Produces a list of KeyValuePair<string, CachedImage>.
.GroupBy(x => searchTerm.Match(x.Value.Path).Value)
Groups the list by the Regex match against the Path of the CachedImage. Note that I have used the Value property: the Match method returns a Match object, so it is better to group by the actual text of the match. The outcome of this step is an IEnumerable<IGrouping<string, KeyValuePair<string, CachedImage>>>.
.Where(g => g.Count() > 6500);
This ensures that only those groups with > 6500 items will be retrieved.
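To make the walkthrough easy to try, here is a self-contained sketch of the same three steps on a tiny dictionary. The CachedImage record is a hypothetical reduction of the real class, a plain Dictionary stands in for PersistantDictionary.Instance, and the threshold is passed in as a parameter (minCount, my own name) so the example does not need 6500 entries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical reduction of the real class to the two properties mentioned in the question.
public record CachedImage(string Path, DateTime ExpiresUtc);

public static class ImageGroups
{
    private static readonly Regex SearchTerm = new Regex(@"(jpeg|png|bmp|gif)");

    // Groups entries by the matched extension text and keeps groups with more than minCount items.
    public static Dictionary<string, int> LargeGroups(
        Dictionary<string, CachedImage> cache, int minCount) =>
        cache.GroupBy(x => SearchTerm.Match(x.Value.Path).Value)
             .Where(g => g.Count() > minCount)
             .ToDictionary(g => g.Key, g => g.Count());

    public static void Main()
    {
        var expires = DateTime.UtcNow.AddDays(1);
        var cache = new Dictionary<string, CachedImage>
        {
            ["k1"] = new CachedImage("/img/a.png", expires),
            ["k2"] = new CachedImage("/img/b.png", expires),
            ["k3"] = new CachedImage("/img/c.gif", expires),
        };

        foreach (var pair in LargeGroups(cache, 1))
            Console.WriteLine("{0} images for extension {1}", pair.Value, pair.Key);
    }
}
```

One thing to watch: a failed match has an empty Value, so any paths that contain none of the four extensions end up together in a group whose key is the empty string.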

Linq group objects by unique values and make two lists

I have an array of objects with a property Number. I need to group them by value. For example, the objects contain these sample values:
1 2 3 3 3 4 5 6 6 6 7 7
I have to group them like this:
listOfUnique = {1,2,4,5}
listOfDuplicates1 = {3,3,3}
listOfDuplicates2 = {6,6,6}
listOfDuplicates3 = {7,7}
...
I tried to use a distinct approach with First(), but that keeps the first occurrence of each value and only removes the later duplicates. I also want to remove the first occurrence of an object if it has duplicates, and move all of them to another list.
List<Reports> distinct = new List<Reports>();
distinct = ArrayOfObjects.GroupBy(p => p.Number).Select(g => g.First()).ToList();
Any ideas how I could do this?
To get the elements whose group has just one element, use this:
distinct = ArrayOfObjects.GroupBy(p => p.Number)
    .Where(g => g.Count() == 1)
    .Select(g => g.Single()) // unwrap each one-element group so the result is a List<Reports>
    .ToList();
And to get a list of the groups with more than one element, use this:
nonDistinct = ArrayOfObjects.GroupBy(p => p.Number)
    .Where(g => g.Count() > 1)
    .Select(g => g.ToList())
    .ToList();
First group the items:
var groups = values.GroupBy(p => p.Number).ToList();
The unique ones are the ones with a group count of one:
var unique = groups.Where(g => g.Count() == 1).Select(g => g.Single()).ToList();
The ones with duplicates are the other ones:
var nonUnique = groups.Where(g => g.Count() > 1).ToList();
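Putting those three steps together on the sample values from the question gives something like the following sketch. The Report record is a hypothetical stand-in for the asker's Reports type with only the Number property, and Split is just a name I picked for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's type; only Number matters here.
public record Report(int Number);

public static class UniqueAndDuplicates
{
    // Returns the items whose Number occurs once, and one list per Number that occurs more than once.
    public static (List<Report> Unique, List<List<Report>> Duplicates) Split(IEnumerable<Report> reports)
    {
        // Materialise the groups once so the source is not enumerated twice.
        var groups = reports.GroupBy(r => r.Number).ToList();

        var unique = groups.Where(g => g.Count() == 1).Select(g => g.Single()).ToList();
        var duplicates = groups.Where(g => g.Count() > 1).Select(g => g.ToList()).ToList();
        return (unique, duplicates);
    }

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 7 };
        var (unique, duplicates) = Split(numbers.Select(n => new Report(n)));

        Console.WriteLine(string.Join(",", unique.Select(r => r.Number))); // prints 1,2,4,5
        foreach (var list in duplicates)
            Console.WriteLine(string.Join(",", list.Select(r => r.Number)));
    }
}
```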

Counting grouped data with Linq to Sql

I have a database of documents in an array, each with an owner and a document type, and I'm trying to get a list of the 5 most common document types for a specific user.
var docTypes = _documentRepository.GetAll()
    .Where(x => x.Owner.Id == LoggedInUser.Id)
    .GroupBy(x => x.DocumentType.Id);
This returns all the documents belonging to a specific owner, grouped as I need them. I now need a way to extract the ids of the most common document types. I'm not too familiar with Linq to Sql, so any help would be great.
This would order the groups by count descending and then take the top 5 of them. You could adapt it to another number, or take out the Take() completely if it's not needed in your case:
var mostCommon = docTypes.OrderByDescending( x => x.Count()).Take(5);
To just select the top document keys:
var mostCommonDocTypes = docTypes.OrderByDescending(x => x.Count())
    .Select(x => x.Key)
    .Take(5);
You can of course also combine this with your original query by chaining it onto the end; I only separated it for clarity in this answer.
Using the Select you can get the value from the Key of the Grouping (the Id) and then a count of each item in the grouping.
var docTypes = _documentRepository.GetAll()
    .Where(x => x.Owner.Id == LoggedInUser.Id)
    .GroupBy(x => x.DocumentType.Id)
    .Select(groupingById =>
        new
        {
            Id = groupingById.Key,
            Count = groupingById.Count(),
        })
    .OrderByDescending(x => x.Count);
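As a rough end-to-end illustration, here is the whole chain (filter by owner, group, order by count, take the top N keys) as an in-memory sketch. The flattened Document record, the MostCommonTypes name, and the sample data are assumptions; the real entities have nested Owner and DocumentType objects, and against Linq to Sql the query would be translated rather than run in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened document; the real entity exposes Owner.Id and DocumentType.Id instead.
public record Document(int OwnerId, int DocumentTypeId);

public static class TopDocumentTypes
{
    // Ids of the `take` most common document types for one owner, most common first.
    public static List<int> MostCommonTypes(IEnumerable<Document> documents, int ownerId, int take) =>
        documents.Where(d => d.OwnerId == ownerId)
                 .GroupBy(d => d.DocumentTypeId)
                 .OrderByDescending(g => g.Count())
                 .Take(take)
                 .Select(g => g.Key)
                 .ToList();

    public static void Main()
    {
        var documents = new List<Document>
        {
            new Document(1, 7), new Document(1, 7), new Document(1, 7),
            new Document(1, 8), new Document(1, 8),
            new Document(1, 9),
            new Document(2, 9), new Document(2, 9), // belongs to another owner, so it is filtered out
        };

        Console.WriteLine(string.Join(",", MostCommonTypes(documents, 1, 2))); // prints 7,8
    }
}
```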
