Counting grouped data with Linq to Sql - c#

I have a database of documents in an array, each with an owner and a document type, and I'm trying to get a list of the 5 most common document types for a specific user.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id);
This returns all the documents belonging to a specific owner, grouped as I need them. I now need a way to extract the ids of the most common document types. I'm not too familiar with Linq to Sql, so any help would be great.

This would order the groups by count descending and then take the top 5 of them. You could adapt it to another number, or remove the Take() entirely if it's not needed in your case:
var mostCommon = docTypes.OrderByDescending( x => x.Count()).Take(5);
To just select the top document keys:
var mostCommonDocTypes = docTypes.OrderByDescending( x => x.Count())
.Select( x=> x.Key)
.Take(5);
You can of course also combine this with your original query by chaining it on; it is only separated here for clarity.

Using Select, you can project each grouping to its Key (the Id) together with a count of the items in that grouping.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id)
.Select(groupingById=>
new
{
Id = groupingById.Key,
Count = groupingById.Count(),
})
.OrderByDescending(x => x.Count);
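For anyone who wants to try the group/count/take pattern without a database, here is a minimal in-memory sketch. The tuple array and loggedInUserId are hypothetical stand-ins for _documentRepository and LoggedInUser, so this runs as LINQ to Objects rather than Linq to Sql.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the repository: (owner id, document type id) pairs.
var documents = new[]
{
    (OwnerId: 1, DocumentTypeId: 10),
    (OwnerId: 1, DocumentTypeId: 10),
    (OwnerId: 1, DocumentTypeId: 20),
    (OwnerId: 2, DocumentTypeId: 30),
    (OwnerId: 1, DocumentTypeId: 10),
    (OwnerId: 1, DocumentTypeId: 20),
    (OwnerId: 1, DocumentTypeId: 40),
};
int loggedInUserId = 1; // assumed id of the current user

var topDocTypes = documents
    .Where(d => d.OwnerId == loggedInUserId)
    .GroupBy(d => d.DocumentTypeId)
    .Select(g => new { Id = g.Key, Count = g.Count() }) // one row per document type
    .OrderByDescending(x => x.Count)                    // most common first
    .Take(5)
    .ToList();

foreach (var t in topDocTypes)
    Console.WriteLine($"{t.Id}: {t.Count}");
// prints:
// 10: 3
// 20: 2
// 40: 1
```

Projecting to an Id/Count pair before ordering keeps the count available for display, which the Key-only version does not.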

Related

How to sort something in LINQ based on many dates?

Hello, this is a LINQ query, but it doesn't sort properly because four different dates are involved.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
orderby i.English_seen_by_executive_on descending
orderby i.Brief_seen_by_executive_on descending
orderby i.French_seen_by_executive_on descending
orderby i.Russian_seen_by_executive_on descending
select i).ToList();
All I want is for it to somehow combine the four dates and sort on them together, not one by one.
For example, at the moment it sorts all English reports based on the date the executive saw them, then the Brief reports, and so on.
But I want it to check which report was seen first, and so on. For example, if the first report seen is French, then Brief, then English, then Russian, it should sort accordingly.
Is it possible?
You need to have them all in one column. This is the approach I would take, assuming the respective cells are null when you don't want them to take part in the ordering:
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select new
{
Date =
(
i.English_seen_by_executive_on != null ? i.English_seen_by_executive_on :
i.Brief_seen_by_executive_on != null ? i.Brief_seen_by_executive_on :
i.French_seen_by_executive_on != null ? i.French_seen_by_executive_on :
i.Russian_seen_by_executive_on
)
}).ToList().OrderBy(a => a.Date);
In the select clause you could add more columns if you wish.
Why not just use .Min() or .Max() on the dates and then .OrderBy() or .OrderByDescending() based on that?
The logic is to create a new enumerable (here, an array) holding the 4 dates for the current row and calculate the Max/Min of them, which gives the latest/earliest of the 4. Then order the records by this value.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select i)
.OrderBy(i => new[]{
i.English_seen_by_executive_on,
i.Brief_seen_by_executive_on,
i.French_seen_by_executive_on,
i.Russian_seen_by_executive_on
}.Max())
.ToList();
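A small in-memory sketch of the Max-of-dates idea, in case it helps to see it run. The tuple rows and the D helper are made up for illustration, and this is LINQ to Objects only; whether a given EF provider can translate the inline array and Max() to SQL is something I have not verified.

```csharp
using System;
using System.Linq;

// Hypothetical helper: builds a nullable date in 2020.
DateTime? D(int month, int day) => new DateTime(2020, month, day);

// Hypothetical rows: a name plus four nullable "seen on" dates.
var reports = new[]
{
    (Name: "A", English: D(1, 5), Brief: (DateTime?)null, French: D(3, 1), Russian: (DateTime?)null),
    (Name: "B", English: (DateTime?)null, Brief: (DateTime?)null, French: (DateTime?)null, Russian: D(2, 10)),
    (Name: "C", English: D(1, 1), Brief: D(4, 2), French: (DateTime?)null, Russian: (DateTime?)null),
};

var ordered = reports
    // Max() over nullable dates skips nulls, so each row is keyed by its latest non-null date.
    .OrderByDescending(r => new[] { r.English, r.Brief, r.French, r.Russian }.Max())
    .Select(r => r.Name)
    .ToList();

Console.WriteLine(string.Join(", ", ordered)); // C, A, B
```

Swapping Max() for Min() keys each row by its earliest date instead. A row whose four dates are all null gets a null key, which sorts last with OrderByDescending.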
Your problem is not a problem if you use method syntax for your LINQ query instead of query syntax.
var EventReportRemarks = _context.pm_main_repz
.Where(rep => rep.Department.DepartmentName == "Finance")
.OrderByDescending(rep => rep.English_seen_by_executive_on)
.ThenByDescending(rep => rep.Brief_seen_by_executive_on)
.ThenByDescending(rep => rep.French_seen_by_executive_on)
.ThenByDescending(rep => rep.Russian_seen_by_executive_on)
.Select(rep => ...);
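To illustrate why chaining matters, here is a tiny in-memory sketch with made-up rows. Repeating OrderByDescending (which is what several separate orderby clauses in query syntax amount to) re-sorts by the last key, while ThenByDescending only breaks ties left by the previous key.

```csharp
using System;
using System.Linq;

// Hypothetical rows with two sort keys.
var rows = new[]
{
    (Name: "x", First: 2, Second: 1),
    (Name: "y", First: 1, Second: 9),
    (Name: "z", First: 2, Second: 5),
};

// Primary key First, ties broken by Second.
var chained = rows
    .OrderByDescending(r => r.First)
    .ThenByDescending(r => r.Second)
    .Select(r => r.Name)
    .ToList();

// The second OrderByDescending starts a new sort, so Second becomes the primary key.
var reSorted = rows
    .OrderByDescending(r => r.First)
    .OrderByDescending(r => r.Second)
    .Select(r => r.Name)
    .ToList();

Console.WriteLine(string.Join(", ", chained));  // z, x, y
Console.WriteLine(string.Join(", ", reSorted)); // y, z, x
```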
Optimization
One of the slower parts of a database query is the transport of selected data from the DBMS to your local process. Hence it is wise to limit the transported data to values you actually plan to use.
You transport way more data than you need to.
For example: every pm_main_repz (my, you do love to use easy identifiers for your items, don't you?) has zero or more Employees. Every Employee belongs to exactly one pm_main_repz, using a foreign key like pm_main_repzId.
If you use Include to transport pm_main_repz 4 with its 1000 Employees, every Employee will have a pm_main_repzId with value 4. You'll transport this value 1001 times, while once would have been enough.
Always use Select to fetch data from the database, and Select only the properties you actually plan to use. Only use Include if you plan to update the fetched objects.
Consider using a proper Select where you only select the items that you actually plan to use:
.Select(rep => new
{
// only Select the rep properties you actually plan to use:
Id = rep.Id,
Name = rep.Name,
...
Employees = rep.Employees.Select(employee => new
{
// again: select only the properties you plan to use
Id = employee.Id,
Name = employee.Name,
// not needed: foreign key to pm_main_repz
// pm_main_repzId = employee.pm_main_repzId,
})
.ToList(),
Department = new
{
Id = rep.Department.Id,
...
}
// etc. for PM_Evt_Cat and provncs
});

How to do this in Linq C#

So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously the last line doesn't work because the grouping doesn't support array indexing like that. The item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Where(g => g.Count() == 2) // Make sure we have exactly two items
.Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it makes sense for ReportItem to accept an IEnumerable<string> rather than just two strings?
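Another option, sketched below under some assumptions, is to materialize each group into a list so it can be indexed the way the question wanted. The GetReportName here is a hypothetical reimplementation based on the description in the question (strip "-input", title-case the name), the file names are hard-coded instead of read from a folder, and a tuple stands in for ReportItem.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical stand-in for GetReportName: strips "-input" and title-cases the name.
string GetReportName(string fileNameWithoutExtension)
{
    const string suffix = "-input";
    var baseName = fileNameWithoutExtension.EndsWith(suffix, StringComparison.Ordinal)
        ? fileNameWithoutExtension.Substring(0, fileNameWithoutExtension.Length - suffix.Length)
        : fileNameWithoutExtension;
    return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(baseName);
}

// Hard-coded file names instead of Directory.EnumerateFiles; "orphan.json" has no partner.
var files = new[] { "abc.json", "abc-input.json", "def.json", "def-input.json", "orphan.json" };

var items = files
    .GroupBy(f => GetReportName(Path.GetFileNameWithoutExtension(f)))
    // Shorter name first, so the main file reliably comes before its "-input" partner.
    .Select(g => (Name: g.Key, Files: g.OrderBy(f => f.Length).ToList()))
    .Where(x => x.Files.Count == 2) // skip incomplete pairs
    .Select(x => (Name: x.Name, Main: x.Files[0], Input: x.Files[1]))
    .ToList();

foreach (var item in items)
    Console.WriteLine($"{item.Name}: {item.Main}, {item.Input}");
// prints:
// Abc: abc.json, abc-input.json
// Def: def.json, def-input.json
```

Sorting inside the group avoids relying on the order in which the file system happens to return the two files.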

Linq - Group By Id, Order By and Then select top 5 of each grouping

Is there some way in LINQ to group by Id, order by descending, and then select the top 5 of each grouping? Right now I have the code shown below, but I used .Take(5) and it obviously selects the top 5 regardless of grouping.
Items = list.GroupBy(x => x.Id)
.Select(x => x.OrderByDescending(y => y.Value))
.Select(y => new Home.SubModels.Item {
Name= y.FirstOrDefault().Name,
Value = y.FirstOrDefault().Value,
Id = y.FirstOrDefault().Id
})
You are almost there. Use Take in the Select statement:
var items = list.GroupBy(x => x.Id)
//For each IGrouping - order nested items and take 5 of them
.Select(x => x.OrderByDescending(y => y.Value).Take(5));
This will return an IEnumerable<IEnumerable<T>>. If you want it flattened, replace Select with SelectMany.
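Here is a minimal runnable sketch of the flattened variant, using made-up in-memory data and taking the top 2 per group instead of 5 to keep the sample small.

```csharp
using System;
using System.Linq;

// Hypothetical items: several values per Id.
var list = new[]
{
    (Id: 1, Name: "a", Value: 5),
    (Id: 1, Name: "b", Value: 9),
    (Id: 1, Name: "c", Value: 7),
    (Id: 2, Name: "d", Value: 3),
    (Id: 2, Name: "e", Value: 8),
};
int n = 2; // top n per group

var topPerGroup = list
    .GroupBy(x => x.Id)
    // Order inside each group, keep the first n, and flatten the groups back into one sequence.
    .SelectMany(g => g.OrderByDescending(y => y.Value).Take(n))
    .ToList();

Console.WriteLine(string.Join(", ", topPerGroup.Select(x => x.Name))); // b, c, e, d
```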

c# linq join with projection

So in my shop app I'm allowing users to favorite items.
What I wish to do is show the list of favorite items on the user profile page, sorted by the like date in descending order.
We have 2 tables that we need to join: one is items and the other is favorites.
How would one join these two tables so that the result meets these criteria:
The result will be a list of items that were favorited by this particular user.
The results will come with the list of comments for each item (each item has a list of comments).
The results will be sorted correctly.
So far I came up with this:
Items =
await _context.Favorites
.Join(
_context.Items,
f => f.ItemId,
i => i.Id,
(f, i) => new { f, i })
.Distinct()
.OrderByDescending(x => x.f.FavDate)
.Select(x => x.i)
.Skip(skip).Take(take)
.Include(c => c.ListOfComments)
.ToListAsync();
This works but does not meet the first criterion, which is that only items favorited by a particular user are returned; it returns the items favorited by all users.
I tried to add a where clause before the join (_context.Favorites.Where(f.UserVoterId.equals(profileId))), but it throws an exception.
One way to approach this is to:
include the user id as the join key and
load the data in separate steps
To select the favorite items of a specific user (profileId) you need this query:
var favorites = _context.Favorites.OrderByDescending(f => f.FavDate)
.Join(_context.Items,
fav => new { fav.ItemId, UserId = fav.UserVoterId },
item => new { ItemId = item.Id, UserId = profileId },
(fav, item) => item)
.Skip(pageIndex * pageSize)
.Take(pageSize)
.ToList();
And to load the comments just try one of the following (whichever works):
var itemIds = favorites.Select(f => f.Id);
var comments = _context.Comments.Where(c => itemIds.Contains(c.ItemId))
.GroupBy(c => c.ItemId)
.ToDictionary(g => g.Key, g => g.ToArray());
Or
// favorites is the outer sequence, so each favorite item is paired with its comments
var items = favorites
.GroupJoin(_context.Comments,
favorite => favorite.Id,
comment => comment.ItemId,
(fav, comments) => new
{
Item = fav,
Comments = comments.ToArray()
});
In the first case, the comments are added to a Dictionary<TItemId, Comment[]>, where TItemId is the type of item.Id, and to get the comments for an item you'd use
var itemComments = comments[item.Id];
which is an O(1) operation.
In the second case the items collection will have all the data you need so you'll have to project it into the structure that suits your needs.
NB I mentioned earlier whichever works because I'm not entirely sure that GroupJoin is properly translated to SQL and I'm not sure if I missed some requirements for the GroupJoin method.
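Both options can be tried in memory first. The sketch below uses made-up tuples in place of the Items and Comments tables, so it says nothing about SQL translation. One thing it does show: the dictionary indexer throws a KeyNotFoundException for an item that has no comments, so TryGetValue is the safer lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: two favorite items; only item 1 has comments.
var favorites = new[] { (Id: 1, Title: "Lamp"), (Id: 2, Title: "Chair") };
var comments = new[]
{
    (ItemId: 1, Text: "Nice"),
    (ItemId: 1, Text: "Bright"),
    (ItemId: 3, Text: "Unrelated"),
};

// Option 1: group the relevant comments into a dictionary keyed by item id.
var itemIds = favorites.Select(f => f.Id).ToList();
var commentsByItem = comments
    .Where(c => itemIds.Contains(c.ItemId))
    .GroupBy(c => c.ItemId)
    .ToDictionary(g => g.Key, g => g.ToArray());

foreach (var item in favorites)
{
    // Items without comments have no dictionary entry, hence TryGetValue.
    var texts = commentsByItem.TryGetValue(item.Id, out var found)
        ? found.Select(c => c.Text)
        : Enumerable.Empty<string>();
    Console.WriteLine($"{item.Title}: {string.Join(", ", texts)}");
}

// Option 2: GroupJoin pairs every favorite with its (possibly empty) comment list.
var itemsWithComments = favorites
    .GroupJoin(comments,
        fav => fav.Id,
        c => c.ItemId,
        (fav, cs) => (Item: fav, Comments: cs.ToArray()))
    .ToList();
```

With GroupJoin the favorites sequence drives the result, so its existing order (for example, by FavDate) is preserved and items without comments still appear with an empty array.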

Remove every first element of grouped collection

I have a collection of elements, and some of these elements are duplicates. I need to extract all records, but only the first record if the record is one of a duplicate set.
I was able to group the elements and find all elements that have duplicates, but how do I remove every first element of a group?
var records =
dbContext.Competitors
.GroupBy(x => x.Email)
.Select(x => new { Properties = x,
Count = x.Count() })
.Where(x => x.Count > 1)
.ToList();
EDIT: It seems impossible to accomplish this task with EF, because it fails to translate the desired LINQ expression to SQL. I'd be happy if someone could offer a different approach.
To exclude the first record from each email-address group with more than one entry, you could do this:
var records = dbContext.Competitors
.GroupBy(x => x.Email)
.SelectMany(x => (x.Count() == 1) ? x : x.OrderBy(t=>t).Skip(1))
.ToList();
This is the logic:
Group by a property > Select every group > (Possibly) sort it > Skip the first one
This can be turned into some LINQ code like this:
//use SelectMany to flatten the groups into one sequence
var x = list.GroupBy(g => g.Key).Select(grp => grp.Skip(1)).SelectMany(i => i);
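Since the question can be read two ways, here is an in-memory sketch of both: dropping the first record of each group, and keeping only the first record of each group. The tuple data and the use of Id as the ordering key are assumptions for illustration. Given the EDIT above about translation problems, one workaround is to load the rows first (for example with ToList()) and run this part as LINQ to Objects.

```csharp
using System;
using System.Linq;

// Hypothetical competitors; a@example.com appears three times.
var competitors = new[]
{
    (Id: 1, Email: "a@example.com"),
    (Id: 2, Email: "b@example.com"),
    (Id: 3, Email: "a@example.com"),
    (Id: 4, Email: "a@example.com"),
};

// Reading 1: drop the first record of every group, leaving only the extra duplicates.
var withoutFirst = competitors
    .GroupBy(c => c.Email)
    .SelectMany(g => g.OrderBy(c => c.Id).Skip(1))
    .ToList();

// Reading 2: keep only the first record of every group.
var firstOfEach = competitors
    .GroupBy(c => c.Email)
    .Select(g => g.OrderBy(c => c.Id).First())
    .ToList();

Console.WriteLine(string.Join(", ", withoutFirst.Select(c => c.Id))); // 3, 4
Console.WriteLine(string.Join(", ", firstOfEach.Select(c => c.Id)));  // 1, 2
```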
