LINQ with two groupings - c#

I am struggling with a double grouping using C#/LINQ on data similar in shape to the following example. I'm starting with a list of TickItems coming from the data layer, and I have now got it shaped as such:
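public class TickItem
{
    public string ItemName { get; set; }
    public int QtyNeeded { get; set; }
    public int QtyFulfilled { get; set; }
    public string Category { get; set; }
}
var items = new List<TickItem>
{
    new TickItem { ItemName = "apple", QtyNeeded = 1, QtyFulfilled = 0, Category = "food" },
    new TickItem { ItemName = "orange", QtyNeeded = 1, QtyFulfilled = 1, Category = "food" },
    new TickItem { ItemName = "orange", QtyNeeded = 1, QtyFulfilled = 0, Category = "food" },
    new TickItem { ItemName = "bicycle", QtyNeeded = 1, QtyFulfilled = 1, Category = "vehicle" },
    new TickItem { ItemName = "apple", QtyNeeded = 1, QtyFulfilled = 1, Category = "food" },
    new TickItem { ItemName = "apple", QtyNeeded = 1, QtyFulfilled = 0, Category = "food" },
    new TickItem { ItemName = "car", QtyNeeded = 1, QtyFulfilled = 1, Category = "vehicle" },
    new TickItem { ItemName = "truck", QtyNeeded = 1, QtyFulfilled = 0, Category = "vehicle" },
    new TickItem { ItemName = "banana", QtyNeeded = 1, QtyFulfilled = 0, Category = "food" },
};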
I need to group this data by Category, with the sum of each numeric column in the end result. In the end, it should be shaped like this:
{ "food": { "apple" : 3, 1 },
{ "banana" : 1, 0 },
{ "orange" : 2, 1 } },
{ "vehicle": { "bicycle": 1, 1 },
{ "car" : 1, 1 },
{ "truck" : 1, 0} }
I have been able to do each of the groupings individually (group by ItemName and group by Category), but I have not been able to perform both groupings in a single LINQ statement. My code so far:
var groupListItemName = items.GroupBy(ti => ti.ItemName).ToList();
var groupListCategory = items.GroupBy(ti => ti.Category).ToList();
Can anyone help?
[edit: I can use either method or query syntax, whichever is easier to visualize the process with]

Please have a look at this post.
http://sohailnedian.blogspot.com/2012/12/linq-groupby-with-aggregate-functions.html
Grouping by multiple keys can be done with an anonymous type created with the new keyword:
empList.GroupBy(_ => new { _.DeptId, _.Position })
    .Select(_ => new
    {
        MaximumSalary = _.Max(deptPositionGroup => deptPositionGroup.Salary),
        DepartmentId = _.Key.DeptId,
        Position = _.Key.Position
    }).ToList();
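Applied to the question's data, the composite-key approach might look like the sketch below. It assumes the items are held as named tuples rather than TickItem objects (purely for brevity), and it assumes C# 9 or later for top-level statements. Note that it produces a flat list of (Category, ItemName) totals, not the nested shape the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question, as named tuples (an assumption for brevity).
var items = new List<(string ItemName, int QtyNeeded, int QtyFulfilled, string Category)>
{
    ("apple", 1, 0, "food"),
    ("orange", 1, 1, "food"),
    ("orange", 1, 0, "food"),
    ("bicycle", 1, 1, "vehicle"),
    ("apple", 1, 1, "food"),
    ("apple", 1, 0, "food"),
    ("car", 1, 1, "vehicle"),
    ("truck", 1, 0, "vehicle"),
    ("banana", 1, 0, "food"),
};

// Category and ItemName together form the grouping key.
var totals = items
    .GroupBy(i => new { Category = i.Category, ItemName = i.ItemName })
    .Select(g => new
    {
        g.Key.Category,
        g.Key.ItemName,
        QtyNeeded = g.Sum(x => x.QtyNeeded),
        QtyFulfilled = g.Sum(x => x.QtyFulfilled)
    })
    .OrderBy(t => t.Category)
    .ThenBy(t => t.ItemName)
    .ToList();

foreach (var t in totals)
    Console.WriteLine($"{t.Category} / {t.ItemName}: {t.QtyNeeded}, {t.QtyFulfilled}");
```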

var query = from i in items
            group i by i.Category into categoryGroup
            select new
            {
                Category = categoryGroup.Key,
                Items = categoryGroup.GroupBy(g => g.ItemName)
                    .Select(g => new
                    {
                        ItemName = g.Key,
                        QtyNeeded = g.Sum(x => x.QtyNeeded),
                        QtyFulfilled = g.Sum(x => x.QtyFulfilled)
                    }).ToList()
            };
This query returns a sequence of anonymous objects, one per category. Each category object holds a list of anonymous objects containing the totals for each item name.
foreach (var group in query)
{
    // group.Category
    foreach (var item in group.Items)
    {
        // item.ItemName
        // item.QtyNeeded
        // item.QtyFulfilled
    }
}

GroupBy has an overload that lets you specify result transformation, for example:
var result = items.GroupBy(i => i.Category,
    (category, categoryElements) => new
    {
        Category = category,
        Elements = categoryElements.GroupBy(i => i.ItemName,
            (item, itemElements) => new
            {
                Item = item,
                QtyNeeded = itemElements.Sum(i => i.QtyNeeded),
                QtyFulfilled = itemElements.Sum(i => i.QtyFulfilled)
            })
    });
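If the result should be looked up by key, as the shape in the question suggests ("food" then "apple"), one option is to materialize the nested grouping into a dictionary of dictionaries. This is only a sketch: the tuple representation of the items and the name byCategory are assumptions, not part of the original code, and top-level statements require C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question, as named tuples (an assumption for brevity).
var items = new List<(string ItemName, int QtyNeeded, int QtyFulfilled, string Category)>
{
    ("apple", 1, 0, "food"),
    ("orange", 1, 1, "food"),
    ("orange", 1, 0, "food"),
    ("bicycle", 1, 1, "vehicle"),
    ("apple", 1, 1, "food"),
    ("apple", 1, 0, "food"),
    ("car", 1, 1, "vehicle"),
    ("truck", 1, 0, "vehicle"),
    ("banana", 1, 0, "food"),
};

// Outer key: Category. Inner key: ItemName. Value: the summed quantities.
var byCategory = items
    .GroupBy(i => i.Category)
    .ToDictionary(
        c => c.Key,
        c => c.GroupBy(i => i.ItemName)
              .ToDictionary(
                  g => g.Key,
                  g => (QtyNeeded: g.Sum(x => x.QtyNeeded),
                        QtyFulfilled: g.Sum(x => x.QtyFulfilled))));

foreach (var category in byCategory)
{
    Console.WriteLine(category.Key);
    foreach (var item in category.Value.OrderBy(kv => kv.Key))
        Console.WriteLine($"  {item.Key}: {item.Value.QtyNeeded}, {item.Value.QtyFulfilled}");
}
```

A dictionary gives direct access such as byCategory["food"]["apple"], at the cost of running the whole query eagerly.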

Related

Check if a set exactly includes a subset using Linq taking into account duplicates

var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
subset.All(s => superset.Contains(s))
This code returns true because 9 is included in the superset, even though it appears there only once. I want an implementation that takes duplicates into account, so that it would return false here.
My thought was that you could group both sequences by value, then test that the superset groups contain every key from the subset groups and that, in each case, the superset count is greater than or equal to the corresponding subset count. I think I've achieved that with the following:
var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
var subGroups = subset.GroupBy(n => n).ToArray();
var superGroups = superset.GroupBy(n => n).ToArray();
var basicResult = subset.All(n => superset.Contains(n));
var advancedResult = subGroups.All(subg => superGroups.Any(supg => subg.Key == supg.Key && subg.Count() <= supg.Count()));
Console.WriteLine(basicResult);
Console.WriteLine(advancedResult);
I did a few extra tests and it seemed to work but you can test some additional data sets to be sure.
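To try additional data sets, the same counting idea can be wrapped in a small helper, as in the sketch below. The helper name ContainsWithDuplicates and the extra test inputs are hypothetical, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: true when every value in subSeq occurs in superSeq
// at least as many times as it occurs in subSeq.
bool ContainsWithDuplicates(IEnumerable<int> superSeq, IEnumerable<int> subSeq)
{
    var superCounts = superSeq.GroupBy(n => n).ToDictionary(g => g.Key, g => g.Count());
    return subSeq.GroupBy(n => n).All(g =>
        superCounts.TryGetValue(g.Key, out var available) && g.Count() <= available);
}

var superset = new[] { 9, 10, 5, 3, 3, 3 };

Console.WriteLine(ContainsWithDuplicates(superset, new[] { 9, 3, 9 }));    // False: only one 9 available
Console.WriteLine(ContainsWithDuplicates(superset, new[] { 9, 3, 3 }));    // True
Console.WriteLine(ContainsWithDuplicates(superset, new[] { 3, 3, 3, 3 })); // False: only three 3s available
Console.WriteLine(ContainsWithDuplicates(superset, new[] { 7 }));          // False: 7 is missing entirely
Console.WriteLine(ContainsWithDuplicates(superset, new int[0]));           // True: nothing to find
```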
Here is another solution:
var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
var subsetGroup = subset.GroupBy(x => x).Select(x => new { key = x.Key, count = x.Count() });
var supersetDict = superset.GroupBy(x => x).ToDictionary(x => x.Key, y => y.Count());
// TryGetValue avoids a KeyNotFoundException when a subset value is missing from the superset.
bool results = subsetGroup.All(x => supersetDict.TryGetValue(x.key, out var available) && available >= x.count);
This works for me:
var subsetLookup = subset.ToLookup(x => x);
var supersetLookup = superset.ToLookup(x => x);
bool flag =
subsetLookup
.All(x => supersetLookup[x.Key].Count() >= subsetLookup[x.Key].Count());
That's not how sets and set operations work. Sets cannot contain duplicates.
You should treat the two arrays not as sets, but as (unordered) sequences. A possible algorithm would be: make a list from the sequence superset, then remove one by one each element of the sequence subset from the list until you are unable to find such an element in the list.
bool IsSubList(IEnumerable<int> sub, IEnumerable<int> super)
{
    var list = super.ToList();
    foreach (var item in sub)
    {
        if (!list.Remove(item))
            return false; // not found in list, so sub is not a "sub-list" of super
    }
    return true; // all elements of sub were found in super
}
var subset = new[] { 9, 3 };
var superset = new[] { 9, 10, 5, 3, 1, 3, 3 };
var isSubSet = IsSubList(subset, superset);

Using LINQ, how would you filter out all but one item of a particular criteria from a list?

I realize my title probably isn't very clear so here's an example:
I have a list of objects with two properties, A and B.
public class Item
{
    public int A { get; set; }
    public int B { get; set; }
}
var list = new List<Item>
{
    new Item() { A = 0, B = 0 },
    new Item() { A = 0, B = 1 },
    new Item() { A = 1, B = 0 },
    new Item() { A = 2, B = 0 },
    new Item() { A = 2, B = 1 },
    new Item() { A = 2, B = 2 },
    new Item() { A = 3, B = 0 },
    new Item() { A = 3, B = 1 },
};
Using LINQ, what's the most elegant way to collapse all the A = 2 items into the first A = 2 item and return along with all the other items? This would be the expected result.
var list = new List<Item>
{
    new Item() { A = 0, B = 0 },
    new Item() { A = 0, B = 1 },
    new Item() { A = 1, B = 0 },
    new Item() { A = 2, B = 0 },
    new Item() { A = 3, B = 0 },
    new Item() { A = 3, B = 1 },
};
I'm not a LINQ expert and already have a "manual" solution but I really like the expressiveness of LINQ and was curious to see if it could be done better.
How about:
var collapsed = list.GroupBy(i => i.A)
.SelectMany(g => g.Key == 2 ? g.Take(1) : g);
The idea is to first group them by A and then select those again (flattening it with .SelectMany) but in the case of the Key being the one we want to collapse, we just take the first entry with Take(1).
One way you can accomplish this is with GroupBy. Group the items by A, and use a SelectMany to project each group into a flat list again. In the SelectMany, check if A is 2 and if so Take(1), otherwise return all results for that group. We're using Take instead of First because the result has to be IEnumerable.
var grouped = list.GroupBy(g => g.A);
var collapsed = grouped.SelectMany(g =>
{
    if (g.Key == 2)
    {
        return g.Take(1);
    }
    return g;
});
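As a quick check, the GroupBy/SelectMany approach can be run against the question's data. The sketch below represents the items as (A, B) tuples (an assumption for brevity) and assumes C# 9 or later for top-level statements. It also shows an index-based variation that is not taken from the answers here: GroupBy gathers all items with the same key together, so if equal A values were not adjacent in the source, the grouped output would reorder them, whereas filtering by index leaves every item in its original position.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The question's list, as (A, B) tuples (an assumption for brevity).
var list = new List<(int A, int B)>
{
    (0, 0), (0, 1), (1, 0), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1),
};

// GroupBy/SelectMany version: keep only the first item of the A == 2 group.
var collapsed = list
    .GroupBy(i => i.A)
    .SelectMany(g => g.Key == 2 ? g.Take(1) : g)
    .ToList();

// Index-based variation: drop an item only if it is a later A == 2 item.
int firstIndex = list.FindIndex(i => i.A == 2);
var collapsedInPlace = list
    .Where((item, index) => item.A != 2 || index == firstIndex)
    .ToList();

Console.WriteLine(string.Join(", ", collapsed));
Console.WriteLine(collapsed.SequenceEqual(collapsedInPlace));
```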
One possible solution (if you insist on LINQ):
int a = 2;
var output = list.GroupBy(o => o.A == a ? a.ToString() : Guid.NewGuid().ToString())
.Select(g => g.First())
.ToList();
This puts all items with A = 2 into a single group (its key is the string "2"), while every other item gets a unique group key (a new GUID) and so ends up in a group of its own. Taking the first item of each group then yields the collapsed list.
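Yet another way (note that it moves the A = 2 item to the end of the list, and that First throws if there is no such item):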
var newlist = list.Where (l => l.A != 2 ).ToList();
newlist.Add( list.First (l => l.A == 2) );
An alternative to other answers based on GroupBy can be Aggregate:
// Aggregate lets you iterate a sequence and accumulate a result (the first argument)
var list2 = list.Aggregate(new List<Item>(), (result, next) => {
    // Add the item from the source sequence if A != 2 or,
    // if A == 2, only when the result does not already
    // contain an item with A == 2.
    if (next.A != 2 || !result.Any(item => item.A == 2)) result.Add(next);
    return result;
});
What about this:
list.RemoveAll(l => l.A == 2 && l != list.FirstOrDefault(i => i.A == 2));
If you would like a more efficient way, it would be:
var first = list.FirstOrDefault(i => i.A == 2);
list.RemoveAll(l => l.A == 2 && l != first);

How to get row value where a corresponding value changed - LINQ

Sorry for the title being a little vague, couldn't think of a good one.
I have a list of objects that holds some maximum and minimum limit values along with a timestamp.
To illustrate, my grid used to show the contents of that list could be like this (very simplified):
LimitMin | LimitMax | Start Time
1        | 2        | 08:00
1        | 2        | 08:01
1        | 2        | 08:03
2        | 5        | 08:05
2        | 5        | 08:06
2        | 5        | 08:10
Right now, I just do a select distinct, to get the distinct limits and add them to a list, like this:
var limitdistinct = printIDSPC.Select(x => new { x.LimitMin, x.LimitMax }).Distinct();
But I would also like to get the timestamp at which the limits changed (08:05 in the example above), and I cannot figure out how to accomplish this. I thought about how Distinct works behind the scenes and whether you could somehow get the timestamp from the select statement. Do I have to go through the entire list in a foreach loop and compare the values to see where it changed?
Any help?
The trick here is to use GroupBy instead of Distinct. You could then either get the minimum timestamp for each limits pair:
items
    .GroupBy(x => new { x.LimitMin, x.LimitMax })
    .Select(x => new {
        x.Key.LimitMin,
        x.Key.LimitMax,
        MinStartTime = x.Min(y => y.StartTime)
    });
or, as GroupBy (in LINQ to Objects) preserves the order of the original items, get the first timestamp for each:
items
    .GroupBy(x => new { x.LimitMin, x.LimitMax })
    .Select(x => new {
        x.Key.LimitMin,
        x.Key.LimitMax,
        FirstStartTime = x.First().StartTime
    });
Try this:
var limitdistinct = printIDSPC.GroupBy(x => new { x.LimitMax, x.LimitMin })
    .Select(x => new
    {
        LimitMin = x.Key.LimitMin,
        LimitMax = x.Key.LimitMax,
        MinTime = x.OrderBy(y => y.StartTime).First().StartTime
    });
One solution would be to group by min/max, then order by start time and finally select the first time value:
var list = new List<Foo>
{
    new Foo { LimitMin = 1, LimitMax = 2, StartTime = TimeSpan.Parse("08:00") },
    new Foo { LimitMin = 1, LimitMax = 2, StartTime = TimeSpan.Parse("08:01") },
    new Foo { LimitMin = 1, LimitMax = 2, StartTime = TimeSpan.Parse("08:03") },
    new Foo { LimitMin = 2, LimitMax = 5, StartTime = TimeSpan.Parse("08:05") },
    new Foo { LimitMin = 2, LimitMax = 5, StartTime = TimeSpan.Parse("08:06") },
    new Foo { LimitMin = 2, LimitMax = 5, StartTime = TimeSpan.Parse("08:10") },
};
var tmp = list
    .GroupBy(z => new { z.LimitMin, z.LimitMax })
    .Select(z =>
        new
        {
            Time = z.OrderBy(z2 => z2.StartTime).First().StartTime,
            Min = z.Key.LimitMin,
            Max = z.Key.LimitMax
        })
    .ToList();
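One caveat with the GroupBy answers: they report one row per distinct limit pair, so if the limits later return to an earlier value, that second change is merged into the first group. If every change between consecutive rows matters, a comparison with the previous row is closer to what the question describes. The sketch below is one way to do that, assuming the rows are already sorted by start time. The tuple representation and the extra 08:15 row are hypothetical additions for illustration, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var rows = new List<(int LimitMin, int LimitMax, TimeSpan StartTime)>
{
    (1, 2, TimeSpan.Parse("08:00")),
    (1, 2, TimeSpan.Parse("08:01")),
    (1, 2, TimeSpan.Parse("08:03")),
    (2, 5, TimeSpan.Parse("08:05")),
    (2, 5, TimeSpan.Parse("08:06")),
    (2, 5, TimeSpan.Parse("08:10")),
    (1, 2, TimeSpan.Parse("08:15")), // hypothetical extra row: limits return to an earlier value
};

// Keep a row when it is the first one, or when its limits differ from the previous row's.
var changes = rows
    .Where((row, index) => index == 0
        || row.LimitMin != rows[index - 1].LimitMin
        || row.LimitMax != rows[index - 1].LimitMax)
    .ToList();

foreach (var c in changes)
    Console.WriteLine($"{c.LimitMin} {c.LimitMax} {c.StartTime}");
```

On this data the sketch reports three change points (08:00, 08:05 and 08:15), whereas grouping by the limit pair would report only two.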

"in" operator in linq c#?

I have a generic list which contains member details, and I have a string array of memberIds. I need to filter the list and get the members whose IDs appear in that array. How can I achieve this using LINQ?
I tried the following:
string[] memberList = hdnSelectedMemberList.Value.Split(',');
_lstFilteredMembers = lstMainMembers.Where(p => memberList.Contains(p.MemberId))
    .ToList();
But the above query gives me only the results that match the first member ID. So if I have memberIds 1,2,3,4 in the memberList array, the result of the query contains only the members with member ID 1, even though the actual list has 1,2,3,4,5 in it.
Can you please guide me what I am doing wrong.
Thanks and appreciate your feedback.
Strings make terrible primary keys. Try trimming the list:
string[] memberList = hdnSelectedMemberList.Value
    .Split(',')
    .Select(p => p.Trim())
    .ToArray(); // ToArray rather than ToList, because the variable is declared as string[]
_lstFilteredMembers = lstMainMembers.Where(p => memberList.Contains(p.MemberId)).ToList();
Because I have a feeling hdnSelectedMemberList may be "1, 2, 3, 4".
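A small sketch of why that guess would explain the symptom: if the hidden field holds "1, 2, 3, 4", Split(',') leaves a leading space on every ID except the first, so only "1" matches. The variable names and the sample value below are assumptions for illustration, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical hidden-field value with a space after each comma.
var raw = "1, 2, 3, 4";

// Without trimming, the pieces are "1", " 2", " 3", " 4".
var untrimmed = raw.Split(',');
var trimmed = raw.Split(',').Select(id => id.Trim()).ToArray();

var memberIds = new[] { "1", "2", "3", "4", "5" };

var matchedUntrimmed = memberIds.Where(id => untrimmed.Contains(id)).ToList();
var matchedTrimmed = memberIds.Where(id => trimmed.Contains(id)).ToList();

Console.WriteLine(string.Join(",", matchedUntrimmed)); // 1
Console.WriteLine(string.Join(",", matchedTrimmed));   // 1,2,3,4
```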
Use a join:
var memquery = from member in lstMainMembers
               join memberid in memberList
               on member.MemberId equals memberid
               select member;
Like jmh, I'd use a join:
var members = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var ids = new[] { 1, 3, 6, 14 };
var result = members.Join(ids, m => m, id => id, (m, id) => m);
foreach (var r in result)
    Console.WriteLine(r); // prints 1, 3, 6
The code you are showing is correct, and works in a Unit Test:
public class Data
{
    public string MemberId { get; set; }
}
[TestMethod]
public void Your_Code_Works()
{
    // Arrange fake data.
    var hdnSelectedMemberList = "1,2,3,4";
    var lstMainMembers = new[]
    {
        new Data { MemberId = "1" },
        new Data { MemberId = "2" },
        new Data { MemberId = "3" },
        new Data { MemberId = "4" },
        new Data { MemberId = "5" }
    };
    // Act - copy/pasted from StackOverflow
    string[] memberList = hdnSelectedMemberList.Split(',');
    var _lstFilteredMembers = lstMainMembers.Where(p => memberList.Contains(p.MemberId)).ToList();
    // Assert - All pass.
    Assert.AreEqual(4, _lstFilteredMembers.Count);
    Assert.AreEqual("1", _lstFilteredMembers[0].MemberId);
    Assert.AreEqual("2", _lstFilteredMembers[1].MemberId);
    Assert.AreEqual("3", _lstFilteredMembers[2].MemberId);
    Assert.AreEqual("4", _lstFilteredMembers[3].MemberId);
}
There must be something wrong with your code outside what you have shown.
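Try Enumerable.Intersect to get the intersection of two collections:
http://msdn.microsoft.com/en-us/library/system.linq.enumerable.intersect.aspx
// Intersect needs two sequences of the same element type, so compare the member IDs.
// Note that this yields the matching IDs, not the member objects.
var matchingIds = lstMainMembers.Select(p => p.MemberId).Intersect(memberList).ToList();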
Why not just project the IDs list into a list of members?
var result = memberList.Select(m => lstMainMembers.SingleOrDefault(mm => mm.MemberId == m));
Of course, that will give you a list that contains null entries for IDs that don't match any member.
You could filter those out, if you wanted to...
result = result.Where(r => r != null);
Or you could filter it before the initial select...
memberList.Where(m => lstMainMembers.Any(mm => mm.MemberId == m))
    .Select(m => lstMainMembers.Single(mm => mm.MemberId == m));
That's pretty ugly, though.

Applying a filter to subsequences of a sequence using Linq

If I have a List<MyType> like so, with each line representing an item in the collection:
{{ Id = 1, Year = 2010 },
{ Id = 1, Year = 2009 },
{ Id = 1, Year = 2008 },
{ Id = 2, Year = 2010 },
{ Id = 2, Year = 2009 },
{ Id = 2, Year = 2008 }}
I wish to retrieve a collection from this collection of the most recent item for each Id. What will the Linq for this look like?
Desired output:
{{ Id = 1, Year = 2010 },
{ Id = 2, Year = 2010 }}
I have a naive implementation using a second list variable and a foreach loop, but it's inefficient.
// naive implementation
// ...
var mostRecentItems = new List<MyType>();
var ids = collection.Select(i => i.Id).Distinct();
foreach (var id in ids)
{
    mostRecentItems.Add(collection.Where(i => i.Id == id).OrderByDescending(i => i.Year).First());
}
return mostRecentItems;
Most simply:
var mostRecentById = from item in list
                     group item by item.Id into g
                     select g.OrderByDescending(x => x.Year).First();
Group by id, then select the first item in each group ordered in a descending fashion.
var mostRecentItems = collection.GroupBy( c => c.Id )
.Select( g => g.OrderByDescending( i => i.Year ).First() );
Or, more simply still, if you only need the Id and Year values rather than the original objects:
var result = list
.GroupBy(i => i.Id)
.Select(g => new {Id = g.Key, Year = g.Max(y => y.Year)});
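On newer runtimes there is also a way to keep the original item without sorting each group: Enumerable.MaxBy, which is available from .NET 6 onward. The sketch below assumes .NET 6 or later, C# 9 or later for top-level statements, and a tuple representation of the items (MyType itself is not shown in the question).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The question's data, as (Id, Year) tuples (an assumption for brevity).
var list = new List<(int Id, int Year)>
{
    (1, 2010), (1, 2009), (1, 2008),
    (2, 2010), (2, 2009), (2, 2008),
};

// For each Id, pick the whole item with the largest Year.
var mostRecent = list
    .GroupBy(i => i.Id)
    .Select(g => g.MaxBy(i => i.Year))
    .ToList();

foreach (var item in mostRecent)
    Console.WriteLine($"Id = {item.Id}, Year = {item.Year}");
```

Unlike the g.Max(y => y.Year) projection, this returns the original items, which matters when they carry more properties than Id and Year.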
