Remove every first element of grouped collection - c#

I have a collection of elements, some of which are duplicates. I need to extract all records, but only the first record of each set of duplicates.
I was able to group the elements and find all elements that have duplicates, but how do I remove the first element of every group?
var records =
dbContext.Competitors
.GroupBy(x => x.Email)
.Select(x => new { Properties = x,
Count = x.Count() }) // count the records in the group, not the characters in the key
.Where(x => x.Count > 1)
.ToList();
EDIT: It seems this is impossible to accomplish with EF, because it fails to translate the desired LINQ expression to SQL. I'd be happy if someone could offer a different approach.

To exclude the first record from each email-address group with more than one entry, you could do this:
var records = dbContext.Competitors
.GroupBy(x => x.Email)
.SelectMany(x => (x.Count() == 1) ? x : x.OrderBy(t=>t).Skip(1))
.ToList();

This is the logic:
Group by a property > Select every group > (Possibly) sort it > Skip the first one
This can be turned into LINQ code like this:
// use SelectMany to flatten the groups into a single sequence
var x = list.GroupBy(g => g.Key).Select(grp => grp.Skip(1)).SelectMany(i => i);
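Since the question's edit says EF could not translate the query, one possible workaround is to load the rows first (for example with AsEnumerable() or ToList()) and do the grouping in memory with LINQ to Objects. The sketch below is only an illustration under that assumption: the Competitor record and its Id property are hypothetical (only Email appears in the question), and Id stands in for whatever key defines which record is "first". Note that the two answers above differ for e-mails that occur only once: the first keeps such records, while a plain Skip(1) drops them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: only Email comes from the question, Id is assumed.
public record Competitor(int Id, string Email);

public static class DuplicateDemo
{
    // Every record except the first (lowest Id) of each e-mail group.
    // Groups with a single record contribute nothing, because Skip(1)
    // on a one-element sequence is empty.
    public static List<Competitor> SkipFirstOfEachGroup(IEnumerable<Competitor> source) =>
        source
            .GroupBy(c => c.Email)
            .SelectMany(g => g.OrderBy(c => c.Id).Skip(1))
            .ToList();

    // Only the first record (lowest Id) of each e-mail group.
    public static List<Competitor> FirstOfEachGroup(IEnumerable<Competitor> source) =>
        source
            .GroupBy(c => c.Email)
            .Select(g => g.OrderBy(c => c.Id).First())
            .ToList();
}
```

Ordering by an explicit key such as Id avoids relying on the entity itself being comparable, which OrderBy(t => t) would require.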

Related

How to do this in LINQ (C#)

So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously last line doesn't work because the grouping doesn't support array indexing like that. The item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Where(g => g.Count() == 2) // Make sure we have exactly two items
.Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x))).Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it would make sense for ReportItem to accept an IEnumerable rather than just two strings?
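Another option is to materialize each group into a list, which does support indexing. The sketch below assumes a ReportItem shaped like the constructor described in the question, and takes the question's GetReportName as a delegate parameter so the example stays self-contained. Because the order in which files are enumerated is not something to rely on, it sorts each pair explicitly so that the file without the "-input" suffix comes first; that sorting rule is an assumption based on the sample file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical shape, modelled on the (Name, str1, str2) constructor in the question.
public record ReportItem(string Name, string First, string Second);

public static class ReportBuilder
{
    public static List<ReportItem> Build(
        IEnumerable<string> files,
        Func<string, string> getReportName) =>
        files
            .GroupBy(f => getReportName(Path.GetFileNameWithoutExtension(f)))
            .Select(g => new
            {
                Name = g.Key,
                // false sorts before true, so the non-"-input" file comes first
                Items = g.OrderBy(f => Path.GetFileNameWithoutExtension(f)
                                           .EndsWith("-input", StringComparison.OrdinalIgnoreCase))
                         .ToList()
            })
            .Where(x => x.Items.Count == 2) // ignore incomplete or oversized groups
            .Select(x => new ReportItem(x.Name, x.Items[0], x.Items[1]))
            .ToList();
}
```

Calling ToList() once per group also avoids enumerating the grouping several times, as Count(), First() and Last() would.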

Using Linq to get maximum value of only duplicates

I am trying to use Linq to return me all of the items in my object that have a specific property value duplicated, and the maximum value of another specific property for all of the duplicates.
My object has a property CourseInfoId, which is how I want to check for duplicates, and a property Priority, for which I want the maximum value (plus lots of other properties).
I thought this would work, but it's giving me every item in the object.
var group = from a in r
group a by a.CourseInfoId into b
let maxPriority = b.Max(d => d.Priority)
where b.Skip(1).Any()
from c in b
where c.Priority == maxPriority
select c;
Where am I going wrong?
What you'll want to do is group by CourseInfoId, then filter on groups that have more than 1 item which will get you all of the duplicate items. Next, you'll have to flatten out the groups again and get the maximum property value from the results.
var maxPriority = items
.GroupBy(i => i.CourseInfoId)
.Where(g => g.Count() > 1)
.SelectMany(g => g)
.Max(i => i.Priority);
EDIT: I see now that you only want to check the properties of the duplicates, not all of the items with a duplicate ID. All you have to do is skip the first item of each group in the .SelectMany() call:
var maxPriority = items
.GroupBy(i => i.CourseInfoId)
.Where(g => g.Count() > 1)
.SelectMany(g => g.Skip(1))
.Max(i => i.Priority);
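The queries above produce a single maximum across all duplicates. If the goal is instead one maximum per duplicated CourseInfoId, which is another possible reading of the question, a sketch along these lines might help. The Course record is hypothetical and carries only the two properties named in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with just the two properties the question mentions.
public record Course(int CourseInfoId, int Priority);

public static class DuplicatePriority
{
    // Maps each CourseInfoId that occurs more than once to its highest Priority.
    public static Dictionary<int, int> MaxPriorityPerDuplicate(IEnumerable<Course> items) =>
        items
            .GroupBy(i => i.CourseInfoId)
            .Where(g => g.Skip(1).Any()) // at least two items, without counting them all
            .ToDictionary(g => g.Key, g => g.Max(i => i.Priority));
}
```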

GroupBy and Sum

I read a lot of GroupBy + Sum topics but I didn't understand how to use it.
I have a list of contacts, and from this list I want to get the state that appears most often.
So my code is:
contacts.GroupBy(i => i.Address.State.ToUpperInvariant());
From this GroupBy, I want to know the state that appears most often (ignoring "", because an empty state is not important to me).
How do I do it?
I was thinking of something like this:
contacts.GroupBy(i => i.Address.State.ToUpperInvariant()).Select(i => i.Max());
Thanks in advance!
You want something like:
var counts = contacts
.Where(c => c.Address.State != string.Empty)
.GroupBy(i => i.Address.State, StringComparer.OrdinalIgnoreCase)
.Select(grp => new { State = grp.Key, Count = grp.Count() });
GroupBy returns an IEnumerable<IGrouping<TKey, TSource>>. Since IGrouping<TKey, TSource> implements IEnumerable<TSource>, you can use the Count extension method to get the number of elements in the group.
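To go from the per-state counts to the single most frequent state, the groups can be ordered by count and the first key taken. This is a minimal sketch; the Contact and Address records are hypothetical stand-ins for the types in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types mirroring the contact.Address.State path used in the question.
public record Address(string State);
public record Contact(Address Address);

public static class StateStats
{
    // Returns the state that occurs most often, ignoring case and skipping
    // null or empty states; returns null when no contact has a state.
    public static string MostCommonState(IEnumerable<Contact> contacts) =>
        contacts
            .Select(c => c.Address.State)
            .Where(s => !string.IsNullOrEmpty(s))
            .GroupBy(s => s, StringComparer.OrdinalIgnoreCase)
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .FirstOrDefault();
}
```

Passing the comparer to GroupBy avoids calling ToUpperInvariant() on every value.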

Counting grouped data with Linq to Sql

I have a database of documents in an array, each with an owner and a document type, and I'm trying to get a list of the 5 most common document types for a specific user.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id);
This returns all the documents belonging to a specific owner, grouped as I need them. I now need a way to extract the ids of the most common document types. I'm not too familiar with Linq to Sql, so any help would be great.
This orders the groups by count, descending, and then takes the top 5 of them. You could change the number, or remove the Take() entirely if it's not needed in your case:
var mostCommon = docTypes.OrderByDescending( x => x.Count()).Take(5);
To just select the top document keys:
var mostCommonDocTypes = docTypes.OrderByDescending( x => x.Count())
.Select( x=> x.Key)
.Take(5);
You can also of course combine this with your original query by appending/chaining it, just separated for clarity in this answer.
Using the Select you can get the value from the Key of the Grouping (the Id) and then a count of each item in the grouping.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id)
.Select(groupingById=>
new
{
Id = groupingById.Key,
Count = groupingById.Count(),
})
.OrderByDescending(x => x.Count);

LINQ-To-SQL row number greater than

If I have an object of type Photo and a result set that is sorted in a particular order, is there a way to get the position of the current Photo object in the result set, and then get all the objects that follow it?
You can do something like this (not terribly efficient):
var result =
photos.Select((p, i) => new { Index = i, Photo = p })
.SkipWhile(x => x.Photo != photo).Skip(1);
This will give you all photos following photo combined with their index in the original collection.
If you're sorting against an Id:
// gets the previous photo, or null if none:
var previousPhoto = db.Photos
.Where(p => p.Id < currentPhotoId)
.OrderByDescending(p => p.Id)
.FirstOrDefault();
// gets the next photo, or null if none:
var nextPhoto = db.Photos
.Where(p => p.Id > currentPhotoId)
.OrderBy(p => p.Id)
.FirstOrDefault();
If you have custom ordering, you'd need to replace the OrderBy/OrderByDescending expression with your custom ordering. You'd also need to use the same ordering criteria in Where() to get only those photos before or after the current photo.
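As an illustration of mirroring the ordering criteria in Where(), here is a sketch for a hypothetical ordering by a TakenOn date with Id as a tie-breaker. The Photo record and the TakenOn property are assumptions, and the example runs over an in-memory sequence rather than a database table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: TakenOn is an assumed sort column, Id breaks ties.
public record Photo(int Id, DateTime TakenOn);

public static class PhotoNavigation
{
    // Next photo in (TakenOn, Id) order, or null if current is the last one.
    public static Photo Next(IEnumerable<Photo> photos, Photo current) =>
        photos
            .Where(p => p.TakenOn > current.TakenOn
                     || (p.TakenOn == current.TakenOn && p.Id > current.Id))
            .OrderBy(p => p.TakenOn)
            .ThenBy(p => p.Id)
            .FirstOrDefault();

    // Previous photo in (TakenOn, Id) order, or null if current is the first one.
    public static Photo Previous(IEnumerable<Photo> photos, Photo current) =>
        photos
            .Where(p => p.TakenOn < current.TakenOn
                     || (p.TakenOn == current.TakenOn && p.Id < current.Id))
            .OrderByDescending(p => p.TakenOn)
            .ThenByDescending(p => p.Id)
            .FirstOrDefault();
}
```

The tie-breaker matters: without it, photos sharing the same TakenOn value would be skipped or returned in an unpredictable order.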
Not sure if I understand correctly but this might help:
var result = photos.Skip(100); // 100 would be the position of current object.
// if you don't know the index of the current object:
// This won't work on LINQ to SQL directly, do it on a list or something.
var result = photos.SkipWhile(x => x != currentObject).Skip(1);
In reality, if you are dealing with a database, there should be some identifier (a set of columns) you are sorting by. If you want to do the whole thing on the server side, you can grab the properties of the current object and filter the result set specifically for objects that would come after that in the sort order you want.
Index of the photo:
result.IndexOf(photo);
Items after it:
result.SkipWhile((q, i) => i <= result.IndexOf(photo));
