I am trying to get the most frequent values in an array using LINQ in C#.
For example,
int[] input = {1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8};
output = {1, 6}
int[] input = {1, 2, 2, 3, 3, 3, 5}
output = {3}
Please let me know how to build this LINQ query.
Please read carefully.
This is a different problem from "Select most frequent value using LINQ".
I have to choose only the most frequent values. The code below is similar, but I can't use Take(5) because I don't know the number of results in advance.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
IEnumerable<int> top5 = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count())
.Take(5)
.Select(g => g.Key);
This output is {1, 2, 3, 4, 5},
but my expected output is {1, 2}.
Please read the question carefully before answering.
Thanks and regards.
Just to add to the plethora of answers:
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var result = input
.GroupBy(i => i)
.GroupBy(g => g.Count())
.OrderByDescending(g => g.Key)
.First()
.Select(g => g.Key)
.ToArray();
Console.WriteLine(string.Join(", ", result)); // Prints "1, 6"
[EDIT]
In case anyone finds this interesting, I compared the performance of the above between .net 4.8 and .net 5.0 as follows:
(1) Added a Comparer class to instrument the number of comparisons made:
class Comparer : IComparer<int>
{
    public int Compare(int x, int y)
    {
        Console.WriteLine($"Comparing {x} with {y}");
        return x.CompareTo(y);
    }
}
(2) Modified the call to OrderByDescending() to pass a Comparer:
.OrderByDescending(g => g.Key, new Comparer())
(3) Multi-targeted my test console app to "net48" and "net5.0".
After making those changes the output was as follows:
For .net 4.8:
Comparing 1 with 3
Comparing 1 with 1
Comparing 1 with 2
Comparing 3 with 3
Comparing 3 with 2
Comparing 3 with 3
1, 6
For .net 5.0:
Comparing 3 with 1
Comparing 3 with 2
1, 6
As you can see, .net 5.0 is better optimised. For .net Framework, however (as /u/mjwills mentions below), it would likely be more performant to use a MaxBy() extension to avoid having to use OrderByDescending() - but only if instrumentation indicates that the sort is causing a performance issue.
If you want to do it in pure LINQ in one query, you can group the groups by count and select the one with the maximum count:
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.OrderByDescending(gr => gr.Key)
.Take(1)
.SelectMany(gr => gr.Select(g => g.Key))
.ToList();
Note that this is not the most efficient or clearest solution.
UPD
Here is a slightly more efficient version that uses Aggregate to perform a MaxBy. Note that, unlike the previous one, it will fail for empty collections:
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.Aggregate((max, curr) => curr.Key > max.Key ? curr : max)
.Select(gr => gr.Key);
You can also use MaxBy from MoreLinq or the one introduced in .NET 6.
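For example, with the MaxBy introduced in .NET 6, the query could look like this (just a sketch; like the Aggregate version it assumes nums is not empty, because MaxBy returns null for an empty source here):
// .NET 6+ sketch: pick the group-of-groups with the largest count key
var tops = nums
    .GroupBy(i => i)
    .GroupBy(grouping => grouping.Count())
    .MaxBy(gr => gr.Key)
    .Select(gr => gr.Key)
    .ToList();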
You can store your result in an IEnumerable of tuples, with the first item being the number and the second item being the count of that number in your input array. Then you look at the count of the group with the most elements and take all the tuples where the second item equals that maximum.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var intermediate = nums
.GroupBy(i => i)
.Select(g => (g.Key,g.Count()));
int amount = intermediate.Max(x => x.Item2);
IEnumerable<int> mostFrequent = intermediate
.Where(x => x.Item2 == amount)
.Select(x => x.Item1);
Online demo: https://dotnetfiddle.net/YCVGam
Capture the number of items in the first group, then use TakeWhile to get all the groups with that number of items.
void Main()
{
    var input = new[] { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
    var output = input
        .GroupBy(i => i)
        .OrderByDescending(group => group.Count());
    var maxNumberOfItems = output.FirstOrDefault()?.Count() ?? 0;
    var finalOutput = output.TakeWhile(group => group.Count() == maxNumberOfItems).ToList();
    foreach (var item in finalOutput)
    {
        Console.WriteLine($"Value {item.Key} has {item.Count()} members");
    }
}
You can do this as a single query as well:
int? numberOfItems = null;
var finalOutput = input
.GroupBy(i => i)
.OrderByDescending(group => group.Count())
.TakeWhile(i =>
{
var count = i.Count();
numberOfItems ??= count;
return count == numberOfItems;
})
.ToList();
You could consider adding an extension method, something like:
public static IEnumerable<T> TakeWhileEqual<T, T2>(this IEnumerable<T> collection, Func<T, T2> predicate)
    where T2 : IEquatable<T2>
{
    using var iter = collection.GetEnumerator();
    if (iter.MoveNext())
    {
        var first = predicate(iter.Current);
        yield return iter.Current;
        while (iter.MoveNext() && predicate(iter.Current).Equals(first))
        {
            yield return iter.Current;
        }
    }
}
This has the advantage of being efficient, not needing to iterate over the collection more than once. But it does require some more code, even if this can be hidden in an extension method.
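For example, applied to the grouped and ordered sequence from the question, a usage sketch could be:
// Usage sketch: group, order by frequency, then take the leading run of equal counts
var mostFrequent = nums
    .GroupBy(i => i)
    .OrderByDescending(g => g.Count())
    .TakeWhileEqual(g => g.Count())
    .Select(g => g.Key)
    .ToArray(); // {1, 2} for the nums array from the question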
I think you probably want to use TakeWhile rather than Take:
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var n = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count());
var c = n.First().Count();
var r = n.TakeWhile(g => g.Count() == c)
.Select(g => g.Key);
If you want to do this in a single pass, without LINQ, you can use a Dictionary and a List to track
a) how many times you saw each value,
b) what value you saw the most times, and
c) what other values you saw that many times.
We step through the list, trying to look the current value up in the dictionary. It either works or it doesn't - if it works, TryGetValue tells us how many times the current value has been seen. If it doesn't, TryGetValue gives us a seen of 0. We increment seen, then look at how it compares to the max we've seen so far:
It's greater - we have a new leader in the "most frequent" contest; clear the current leaders list and start over with the new n as the leader. Also note the new max.
It's equal - we have a tie for the lead; add the current n in among its peers.
It's less - we don't care.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
int maxSeen = int.MinValue;
var seens = new Dictionary<int, int>();
var maxes = new List<int>();
foreach (var n in nums) {
    seens.TryGetValue(n, out var seen);
    seens[n] = ++seen;
    if (seen > maxSeen) {
        maxes = new() { n };
        maxSeen = seen;
    } else if (seen == maxSeen) {
        maxes.Add(n);
    }
}
You'll end up with maxes as a List<int> containing the numbers that appear most often.
If you care about allocations of the List's internal array, you could consider clearing the list instead of newing it; I new'd because it was a handy one-liner to use an initializer with the new leader.
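For reference, the clearing variant of that branch might look like this (just a sketch of the alternative described above):
if (seen > maxSeen)
{
    maxes.Clear();   // reuse the existing backing array instead of allocating a new list
    maxes.Add(n);
    maxSeen = seen;
}
With the nums above, maxes ends up as {1, 2} either way.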
You can first group the input like this:
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var tmpResult = from i in input
                group i by i into k
                select new
                {
                    k.Key,
                    count = k.Count()
                };
Then you can find the maximum count among the groups like this:
var max = tmpResult.Max(s => s.count);
After that, a filter is enough:
int[] result = tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
You can also create an extension method for this:
public static class Extension
{
    public static int[] GetMostFrequent(this int[] input)
    {
        var tmpResult = from i in input
                        group i by i into k
                        select new
                        {
                            k.Key,
                            count = k.Count()
                        };
        var max = tmpResult.Max(s => s.count);
        return tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
    }
}
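Usage would then look something like this (a sketch, reusing the input array from above):
int[] result = input.GetMostFrequent();
Console.WriteLine(string.Join(", ", result)); // 1, 6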
You were very close - just a couple more lines are needed.
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var counts = input
.GroupBy(i => i)
.Select(i => new { Number = i.Key, Count = i.Count()})
.OrderByDescending(i => i.Count);
var maxCount = counts.First().Count;
var result = counts
.Where(i=> i.Count == maxCount)
.Select(i => i.Number);
// result: {1, 6}
I've got a little problem in my C# hobby project that I can't quite work out. I have been stuck in lots of messy and complicated nested loops. I hope someone can shed some light.
I have a list of lists of int, i.e. List<List<int>>. Assume each list of int contains unique items, and that the outer list contains at least 5 inner lists. I need to find exactly two lists of int (List A and List B) that share exactly three common items, and another list of int (List X) that contains exactly one of these common items. One more condition must hold: none of the other lists contain any of these three items.
For example:
List<List<int>> allLists = new List<List<int>>();
allLists.Add(new List<int>() {1, 2, 3, 4});
allLists.Add(new List<int>() {1, 2});
allLists.Add(new List<int>() {3, 4});
allLists.Add(new List<int>() {3, 4, 5, 6, 7, 8, 9});
allLists.Add(new List<int>() {4, 6, 8});
allLists.Add(new List<int>() {5, 7, 9, 11});
allLists.Add(new List<int>() {6, 7, 8});
For the above example, I would hope to find a solution as:
ListA and ListB: [3, 5] // indices of allLists
ListX: 6 // index of allLists
The three shared items: [5, 7, 9]
The matching item in ListX: 7
Note: Depending on the contents of the lists, there may be multiple solutions. There may also be situations where no lists are found matching the above conditions.
I was stuck in some messy nested loops, and was wondering if anyone could come up with a simple and efficient solution (possibly with LINQ?).
Originally I had something stupid like the following:
for (var i = 0; i < allLists.Count - 1; i++)
{
    if (allLists[i].Count > 2)
    {
        for (var j = i + 1; j < allLists.Count; j++)
        {
            List<int> sharedItems = allLists[i].Intersect(allLists[j]).ToList();
            if (sharedItems.Count == 3)
            {
                foreach (var item in sharedItems)
                {
                    int itemCount = 0;
                    int? possibleListXIndex = null;
                    for (var k = 0; k < allLists.Count; k++)
                    {
                        if (k != i && k != j && allLists[k].Contains(item))
                        {
                            // nested loops getting very ugly here... also not sure what to do....
                        }
                    }
                }
            }
        }
    }
}
Extended Problem
There is an extended version of this problem in my project. It is in the same fashion:
find exactly three lists of int (List A, List B and List C) that share exactly four common items
find another list of int (List X) that contains exactly one of the above common items
none of the other lists contain any of these four items.
I was hoping the original algorithm could scale to also cover the extended version, without having to write another algorithm from scratch. With my nested loops above, I think I would have no choice but to add at least two deeper levels of loops to cover four items and three lists.
I thank everyone for your contributions in advance! Truly appreciated.
Here's a solution that comes up with your answer. I wouldn't exactly call it efficient, but it's pretty simple to follow.
It breaks the work into two steps. First it constructs a list of initial candidates that have exactly three matches. The second step adds the ListX property and checks whether the remaining criteria are met.
var matches = allLists.Take(allLists.Count - 1)
    .SelectMany((x, xIdx) => allLists
        .Skip(xIdx + 1)
        .Select(y => new { ListA = x, ListB = y, Shared = x.Intersect(y) })
        .Where(y => y.Shared.Count() == 3))
    .SelectMany(x => allLists
        .Where(y => y != x.ListA && y != x.ListB)
        .Select(y => new
        {
            x.ListA,
            x.ListB,
            x.Shared,
            ListX = y,
            SingleShared = x.Shared.Intersect(y)
        })
        .Where(y => y.SingleShared.Count() == 1
            && !allLists.Any(z => z != y.ListA
                && z != y.ListB
                && z != y.ListX
                && z.Intersect(y.Shared).Any())));
You get the output below after running the following code.
ListA. 3: [3, 4, 5, 6, 7, 8, 9] ListB. 5: [5, 7, 9, 11] => [5, 7, 9], ListX. 6:[6, 7, 8] => 7
matches.ToList().ForEach(x => {
    Console.WriteLine("ListA. {0}: [{1}] ListB. {2}: [{3}] => [{4}], ListX. {5}:[{6}] => {7}",
        allLists.IndexOf(x.ListA),
        string.Join(", ", x.ListA),
        allLists.IndexOf(x.ListB),
        string.Join(", ", x.ListB),
        string.Join(", ", x.Shared),
        allLists.IndexOf(x.ListX),
        string.Join(", ", x.ListX),
        string.Join(", ", x.SingleShared));
});
I will leave further work, such as deciding which list matches which other one, as an exercise, given your fairly generic requirement. Here I find the lists that share 3 values with a given list, processing all lists - so there are duplicates, where an A matches a B and that B also matches A, for example.
This should give you something you can work from:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
    public static void Main()
    {
        Console.WriteLine("Hello World");
        var s = new List<int>();
        List<List<int>> allLists = new List<List<int>>();
        allLists.Add(new List<int>() {1, 2, 3, 4});
        allLists.Add(new List<int>() {1, 2});
        allLists.Add(new List<int>() {3, 4});
        allLists.Add(new List<int>() {3, 4, 5, 6, 7, 8, 9});
        allLists.Add(new List<int>() {4, 6, 8});
        allLists.Add(new List<int>() {5, 7, 9, 11});
        allLists.Add(new List<int>() {6, 7, 8});

        /*
        // To iterate over it.
        foreach (List<int> subList in allLists)
        {
            foreach (int item in subList)
            {
                Console.WriteLine(item);
            }
        }
        */

        var countMatch = 3;
        /* iterate over our lists */
        foreach (var sub in allLists)
        {
            /* not the sub list */
            var ns = allLists.Where(g => g != sub);
            Console.WriteLine("Check:{0}", ns.Count()); // 6 of the 7 lists so 6 to check against
            //foreach (var glist in ns) - all of them, now refactor to filter them:
            foreach (var glist in ns.Where(n => n.Intersect(sub).Count() == countMatch))
            {
                var r = sub.Intersect(glist); // get all the matches of glist and sub
                Console.WriteLine("Matches:{0} in {1}", r.Count(), glist.Count());
                foreach (int item in r)
                {
                    Console.WriteLine(item);
                }
            }
        }
    }
}
This will output:
Hello World
Check:6
Check:6
Check:6
Check:6
Matches:3 in 3
4
6
8
Matches:3 in 4
5
7
9
Matches:3 in 3
6
7
8
Check:6
Matches:3 in 7
4
6
8
Check:6
Matches:3 in 7
5
7
9
Check:6
Matches:3 in 7
6
7
8
I think it would be better to break this functionality into several methods; then it will be easier to read.
var allLists = new List<List<int>>();
allLists.Add(new List<int>() {1, 2, 3, 4});
allLists.Add(new List<int>() {1, 2});
allLists.Add(new List<int>() {3, 4});
allLists.Add(new List<int>() {3, 4, 5, 6, 7, 8, 9});
allLists.Add(new List<int>() {4, 6, 8});
allLists.Add(new List<int>() {5, 7, 9, 11});
allLists.Add(new List<int>() {6, 7, 8});
var count = allLists.Count;
for (var i = 0; i < count - 1; i++)
{
    var left = allLists[i];
    if (left.Count > 2)
    {
        for (var j = i + 1; j < count; j++)
        {
            var right = allLists[j];
            var sharedItems = left.Intersect(right).ToList();
            if (sharedItems.Count == 3)
            {
                for (int k = 0; k < count; k++)
                {
                    if (k == i || k == j)
                        continue;
                    var intersected = allLists[k].Intersect(sharedItems).ToList();
                    if (intersected.Count == 1)
                    {
                        Console.WriteLine($"Found index k:{k},i:{i},j:{j}, Intersected numbers:{string.Join(",", intersected)}");
                    }
                }
            }
        }
    }
}
I just thought that if I try to find the item in List X first, I might need fewer loops in practice. Correct me if I am wrong.
public static void match(List<List<int>> allLists, int numberOfMainListsToExistIn, int numberOfCommonItems)
{
    var possibilitiesToCheck = allLists.SelectMany(i => i).GroupBy(e => e).Where(e => (e.Count() == numberOfMainListsToExistIn + 1));
    foreach (var pGroup in possibilitiesToCheck)
    {
        int p = pGroup.Key;
        List<int> matchingListIndices = allLists.Select((l, i) => l.Contains(p) ? i : -1).Where(i => i > -1).ToList();
        for (int i = 0; i < matchingListIndices.Count; i++)
        {
            int aIndex = matchingListIndices[i];
            int bIndex = matchingListIndices[(i + 1) % matchingListIndices.Count];
            int indexOfListXIndex = (i - 1 + matchingListIndices.Count) % matchingListIndices.Count;
            int xIndex = matchingListIndices[indexOfListXIndex];
            IEnumerable<int> shared = allLists[aIndex].Intersect(allLists[bIndex]).OrderBy(e => e);
            IEnumerable<int> xSingle = shared.Intersect(allLists[xIndex]);
            bool conditionsHold = false;
            if (shared.Count() == numberOfCommonItems && xSingle.Count() == 1 && xSingle.Contains(p))
            {
                conditionsHold = true;
                for (int j = 2; j < matchingListIndices.Count - 1; j++)
                {
                    int cIndex = matchingListIndices[(i + j) % matchingListIndices.Count];
                    if (!Enumerable.SequenceEqual(shared, allLists[aIndex].Intersect(allLists[cIndex]).OrderBy(e => e)))
                    {
                        conditionsHold = false;
                        break;
                    }
                }
                if (conditionsHold)
                {
                    // all list indices (0 .. Count - 1), minus the matching ones
                    List<int> theOtherListIndices = Enumerable.Range(0, allLists.Count).Except(matchingListIndices).ToList();
                    if (theOtherListIndices.Any(x => shared.Intersect(allLists[x]).Count() > 0))
                    {
                        conditionsHold = false;
                    }
                }
            }
            if (conditionsHold)
            {
                matchingListIndices.RemoveAt(indexOfListXIndex);
                Console.Write("List A and B: {0}. ", String.Join(", ", matchingListIndices));
                Console.Write("Common items: {0}. ", String.Join(", ", shared));
                Console.Write("List X: {0}.", xIndex);
                Console.WriteLine("Common item in list X: {0}. ", p);
            }
        }
    }
}
For the above example, I will just call the method like this:
match(allLists, 2, 3);
This method will also work with the extended problem:
match(allLists, 3, 4);
... and even more if the problem is extended further to (4, 5) and so on...
I searched, but I found only answers related to two lists. But what about when there are more than two?
List 1 = 1,2,3,4,5
List 2 = 6,7,8,9,1
List 3 = 3,6,9,2,0,1
List 4 = 1,2,9,0,5
List 5 = 1,7,8,6,5,4
List 6 = 1
List 7 =
How do I get the common items? As you can see, one of the lists is empty, so the intersection would be empty, but I need to skip empty lists.
var data = new List<List<int>> {
new List<int> {1, 2, 3, 4, 5},
new List<int> {6, 7, 2, 8, 9, 1},
new List<int> {3, 6, 9, 2, 0, 1},
new List<int> {1, 2, 9, 0, 5},
new List<int> {1, 7, 8, 6, 2, 5, 4},
new List<int> {1, 7, 2}
};
List<int> res = data
.Aggregate<IEnumerable<int>>((a, b) => a.Intersect(b))
.ToList();
The type argument of Aggregate is given explicitly; otherwise the result of aggregating two Lists would have to be a List too. It can be easily adapted to run in parallel:
List<int> res = data
.AsParallel<IEnumerable<int>>()
.Aggregate((a, b) => a.Intersect(b))
.ToList();
EDIT
Except... it does not run in parallel. The problem is that operations on IEnumerable are deferred, so even if they are logically merged in a parallel context, the actual merging occurs in the ToList(), which is single-threaded. For parallel execution it would be better to drop the IEnumerable type and go back to Lists:
List<int> res = data
.AsParallel()
.Aggregate((a, b) => a.Intersect(b).ToList());
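If you also need to skip null or empty lists, as the question asks, a sketch along the same lines filters them out before aggregating (it still throws if nothing is left after filtering):
List<int> res = data
    .Where(l => l != null && l.Count > 0)   // skip empty (and null) lists per the question
    .Aggregate<IEnumerable<int>>((a, b) => a.Intersect(b))
    .ToList();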
You can chain Intersect:
List<int> List1 = new List<int> {1, 2, 3, 4, 5};
List<int> List2 = new List<int> { 6, 7, 8, 9, 1 };
List<int> List3 = new List<int> { 3, 6, 9, 2, 0, 1 };
List<int> List4 = new List<int> { 1, 2, 9, 0, 5 };
List<int> List5 = new List<int> { 1, 7, 8, 6, 5, 4 };
List<int> List6 = new List<int> { 1 };
List<int> common = List1
.Intersect(List2)
.Intersect(List3)
.Intersect(List4)
.Intersect(List5)
.Intersect(List6)
.ToList();
var data = new [] {
new List<int> {1, 2, 3, 4, 5},
new List<int> {6, 7, 8, 9, 1},
new List<int> {3, 6, 9, 2, 0, 1},
new List<int> {1, 2, 9, 0, 5},
new List<int> {1, 7, 8, 6, 5, 4},
new List<int> {1},
new List<int> {},
null
};
IEnumerable<int> temp = null;
foreach (var arr in data)
    if (arr != null && arr.Count != 0)
        temp = temp == null ? arr : arr.Intersect(temp);
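A usage sketch to materialize and print the result of the loop above:
var common = (temp ?? Enumerable.Empty<int>()).ToList();
Console.WriteLine(string.Join(", ", common)); // prints "1" for the data above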
One way is to use a HashSet. You put the items of the first collection in the hash set, then iterate each collection after the first and create a new hash set that you add items from the current collection to if they are in the previous set. At the end of each iteration you assign that common hash set to the overall one and break if it's ever empty. Finally you just return the overall hash set.
public IEnumerable<T> CommonItems<T>(IEnumerable<IEnumerable<T>> collections)
{
    if (collections == null)
        throw new ArgumentNullException(nameof(collections));

    using (var enumerator = collections.GetEnumerator())
    {
        if (!enumerator.MoveNext())
            return Enumerable.Empty<T>();

        var overall = new HashSet<T>(enumerator.Current);
        while (enumerator.MoveNext())
        {
            var common = new HashSet<T>();
            foreach (var item in enumerator.Current)
            {
                if (overall.Contains(item))
                    common.Add(item);
            }
            overall = common;
            if (overall.Count == 0)
                break;
        }
        return overall;
    }
}
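A usage sketch, assuming the lists live in a collection like the data variable from the other answers, and filtering out empty lists as the question asks:
var nonEmpty = data.Where(l => l != null && l.Count > 0); // 'data' assumed to be a List<List<int>> as above
foreach (var item in CommonItems<int>(nonEmpty))
    Console.WriteLine(item);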
I have an ordered list, largest to smallest.
{ 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 }
I want a list like this
{ 10, 8, 6, 4, 2, 1, 3, 5, 7, 9 }
As you can see, the new order is: first each odd index ascending, then each even index descending.
The idea is that each half of the list has roughly the same weight. e.g
{ 10, 8, 6, 4, 2 } = 30
{ 1, 3, 5, 7, 9 } = 25
The title is the best one-sentence explanation I can give of what I'm looking for, which is why I had trouble finding the answer on Google.
Here is my go at it in C#. I welcome any comments on my attempt, but I'm only looking for the algorithm's name, if it has one.
var firstHalf = new List<int>();
var secondHalf = new List<int>();
for (int i = 0; i < originalList.Count; i++)
{
    if (i % 2 == 0)
    {
        firstHalf.Add(originalList[i]);
    }
    else
    {
        secondHalf.Add(originalList[i]);
    }
}
secondHalf.Reverse();
var finalList = new List<int>(firstHalf);
finalList.AddRange(secondHalf);
This might not be the most efficient way, but it's easy:
var yourlist = originalList.Where(i => i % 2 == 0)
    .OrderByDescending(i => i)
    .Concat(originalList.Where(i => i % 2 != 0)
        .OrderBy(i => i))
    .ToList();
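Note that the query above splits on the values' parity, which happens to line up with position for this particular input. A sketch that splits by index instead (using the index-aware Where overload) might be:
var byPosition = originalList.Where((value, index) => index % 2 == 0)
    .Concat(originalList.Where((value, index) => index % 2 == 1).Reverse())
    .ToList(); // {10, 8, 6, 4, 2, 1, 3, 5, 7, 9} for the sample list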
This might be either impossible or so obvious I keep passing over it.
I have a list of objects (let's say ints for this example):
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6 };
I'd like to be able to group by pairs with no regard to order or any other comparison, returning a new IGrouping object.
i.e.,
list.GroupBy(i => someLogicToProducePairs);
There's the very real possibility I may be approaching this problem from the wrong angle; however, the goal is to group a set of objects by a constant capacity. Any help is greatly appreciated.
Do you mean like this:
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6 };
IEnumerable<IGrouping<int, int>> groups =
    list
        .Select((n, i) => new { Group = i / 2, Value = n })
        .GroupBy(g => g.Group, g => g.Value);

foreach (IGrouping<int, int> group in groups) {
    Console.WriteLine(String.Join(", ", group.Select(n => n.ToString()).ToArray()));
}
Output
1, 2
3, 4
5, 6
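Since the question mentions grouping by a constant capacity, the same idea generalizes by dividing the index by the desired group size; on .NET 6+ there is also Enumerable.Chunk. A sketch:
const int capacity = 2; // any constant group size
var chunks = list
    .Select((value, index) => new { Group = index / capacity, Value = value })
    .GroupBy(x => x.Group, x => x.Value);
// Or, on .NET 6 and later:
// IEnumerable<int[]> chunked = list.Chunk(capacity);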
You can do something like this:
List<int> integers = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var p = integers.Select((x, index) => new { Num = index / 2, Val = x })
.GroupBy(y => y.Num);
int counter = 0;
// this function returns the keys for our groups.
Func<int> keyGenerator = () =>
{
    int keyValue = counter / 2;
    counter += 1;
    return keyValue;
};
var groups = list.GroupBy(i => keyGenerator());
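One caveat on the counter-based version: it relies on the side effect of incrementing counter, so enumerate groups only once (or materialize it immediately with ToList()); re-enumerating would keep incrementing the counter and produce different keys. A usage sketch:
foreach (var g in groups.ToList())
    Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");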