I am trying to get the most frequent values in an array using LINQ in C#.
For example,
int[] input = {1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8};
output = {1, 6}
int[] input = {1, 2, 2, 3, 3, 3, 5};
output = {3}
Please let me know how to write the LINQ query.
Please read carefully.
This is a different problem from "Select most frequent value using LINQ".
I have to choose only the most frequent values. The code below is similar, but I can't use Take(5) because I don't know the number of results.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
IEnumerable<int> top5 = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count())
.Take(5)
.Select(g => g.Key);
This outputs {1, 2, 3, 4, 5},
but my expected output is {1, 2}.
Please read the question carefully before answering.
Thanks and regards.
Just to add to the plethora of answers:
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var result = input
.GroupBy(i => i)
.GroupBy(g => g.Count())
.OrderByDescending(g => g.Key)
.First()
.Select(g => g.Key)
.ToArray();
Console.WriteLine(string.Join(", ", result)); // Prints "1, 6"
[EDIT]
In case anyone finds this interesting, I compared the performance of the above between .NET Framework 4.8 and .NET 5.0 as follows:
(1) Added a Comparer class to instrument the number of comparisons made:
class Comparer : IComparer<int>
{
public int Compare(int x, int y)
{
Console.WriteLine($"Comparing {x} with {y}");
return x.CompareTo(y);
}
}
(2) Modified the call to OrderByDescending() to pass a Comparer:
.OrderByDescending(g => g.Key, new Comparer())
(3) Multi-targeted my test console app to "net48" and "net5.0".
After making those changes the output was as follows:
For .NET Framework 4.8:
Comparing 1 with 3
Comparing 1 with 1
Comparing 1 with 2
Comparing 3 with 3
Comparing 3 with 2
Comparing 3 with 3
1, 6
For .NET 5.0:
Comparing 3 with 1
Comparing 3 with 2
1, 6
As you can see, .NET 5.0 is better optimised: it makes fewer comparisons for the same query. For .NET Framework, however (as /u/mjwills mentions below), it would likely be more performant to use a MaxBy() extension and avoid OrderByDescending() altogether, but only if instrumentation indicates that the sort is causing a performance issue.
If you want to do it in pure LINQ in a single query, you can group the groups by count and select the one with the highest count:
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.OrderByDescending(gr => gr.Key)
.Take(1)
.SelectMany(g => g.Select(inner => inner.Key))
.ToList();
Note that this is neither the most efficient nor the clearest solution.
Update:
A slightly more efficient version uses Aggregate to perform a MaxBy. Note that, unlike the previous one, it throws on empty collections:
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.Aggregate((max, curr) => curr.Key > max.Key ? curr : max)
.Select(gr => gr.Key);
You can also use MaxBy from MoreLINQ or the one introduced in .NET 6.
You can store your result in an IEnumerable of tuples, with the first item being the number and the second item being the count of that number in your input array. Then you find the highest count among the groups and take all the tuples whose second item equals that maximum.
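As a rough sketch of the built-in MaxBy route, assuming .NET 6 or later (the class and method names here are only illustrative, not from the answer above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MostFrequentSketch
{
    // Hypothetical helper; assumes .NET 6+, where Enumerable.MaxBy exists.
    public static int[] MostFrequent(IEnumerable<int> source)
    {
        return source
            .GroupBy(i => i)                 // one group per distinct value
            .GroupBy(g => g.Count())         // bucket the values by how often they occur
            .MaxBy(bucket => bucket.Key)     // bucket with the highest count; null for empty input
            ?.Select(g => g.Key)             // the values inside that bucket
            .ToArray()
            ?? Array.Empty<int>();           // empty input yields an empty result
    }
}
```

Unlike the Aggregate version, this returns an empty array for an empty input, because MaxBy returns null for an empty sequence of reference-type elements.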
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var intermediate = nums
.GroupBy(i => i)
.Select(g => (g.Key,g.Count()));
int amount = intermediate.Max(x => x.Item2);
IEnumerable<int> mostFrequent = intermediate
.Where(x => x.Item2 == amount)
.Select(x => x.Item1);
Online demo: https://dotnetfiddle.net/YCVGam
Use a variable to capture the number of items in the first group, then use TakeWhile to get all the groups with that number of items.
void Main()
{
var input = new[] { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var output = input
.GroupBy(i => i)
.OrderByDescending(group => group.Count());
var maxNumberOfItems = output.FirstOrDefault()?.Count() ?? 0;
var finalOutput = output.TakeWhile(group => group.Count() == maxNumberOfItems).ToList();
foreach (var item in finalOutput)
{
Console.WriteLine($"Value {item.Key} has {item.Count()} members");
}
}
You can do this as a single query as well:
int? numberOfItems = null;
var finalOutput = input
.GroupBy(i => i)
.OrderByDescending(group => group.Count())
.TakeWhile(i =>
{
var count = i.Count();
numberOfItems ??= count;
return count == numberOfItems;
})
.ToList();
You could consider adding an extension method, something like:
public static IEnumerable<T> TakeWhileEqual<T, T2>(this IEnumerable<T> collection, Func<T, T2> predicate)
where T2 : IEquatable<T2>
{
using var iter = collection.GetEnumerator();
if (iter.MoveNext())
{
var first = predicate(iter.Current);
yield return iter.Current;
while (iter.MoveNext() && predicate(iter.Current).Equals(first))
{
yield return iter.Current;
}
}
}
This has the advantage of being efficient, not needing to iterate over the collection more than once. But it does require some more code, even if this can be hidden in an extension method.
I think you probably want to use TakeWhile rather than Take:
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var n = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count());
var c = n.First().Count();
var r = n.TakeWhile(g => g.Count() == c)
.Select(g => g.Key);
If you want to do this in a single pass, without LINQ, you can use a Dictionary and a List to track:
a) how many times you saw each value,
b) which value you saw the most times, and
c) which other values you saw that many times.
We step through the list, trying to look the current value up in the dictionary. It either works or it doesn't - if it works, TryGetValue tells us how many times the current value has been seen. If it doesn't, TryGetValue gives us a seen of 0. We increment seen, then take a look at how it compares to the max we've seen so far:
It's greater - we have a new leader in the "most frequent" contest - clear the current leaders list and start over with the new n as the leader. Also note the new max.
It's equal - we have a tie for the lead; add the current n in among its peers.
It's less - we don't care.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
int maxSeen = int.MinValue;
var seens = new Dictionary<int, int>();
var maxes = new List<int>();
foreach(var n in nums){
seens.TryGetValue(n, out var seen);
seens[n] = ++seen;
if(seen > maxSeen){
maxes = new(){n};
maxSeen = seen;
} else if(seen == maxSeen)
maxes.Add(n);
}
You'll end up with maxes as a List<int> that is the list of numbers that appear most.
If you care about allocations of the List's internal array, you could consider clearing the list instead of newing; I new'd because it was a handy one-liner to use an initializer with the new leader.
You can first group the input like this:
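A minimal sketch of that clearing variant, wrapped in a method so it can be reused (the class and method names are made up for illustration; the logic is otherwise the same single pass as above):

```csharp
using System.Collections.Generic;

public static class SinglePassSketch
{
    // Hypothetical wrapper around the single-pass loop shown above.
    public static List<int> MostFrequent(IEnumerable<int> nums)
    {
        int maxSeen = 0;
        var seens = new Dictionary<int, int>();   // value -> times seen so far
        var maxes = new List<int>();              // current leaders

        foreach (var n in nums)
        {
            seens.TryGetValue(n, out var seen);   // seen stays 0 if n is new
            seens[n] = ++seen;

            if (seen > maxSeen)
            {
                maxes.Clear();                    // reuse the list's internal array
                maxes.Add(n);
                maxSeen = seen;
            }
            else if (seen == maxSeen)
            {
                maxes.Add(n);                     // tie for the lead
            }
        }
        return maxes;
    }
}
```

Clear() keeps the list's capacity, so the backing array is not reallocated each time a new leader appears.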
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var tmpResult = from i in input
group i by i into k
select new
{
k.Key,
count = k.Count()
};
Then you can find the highest count among the groups like this:
var max = tmpResult.Max(s => s.count);
After that, filtering on that maximum is enough:
int[] result = tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
You can also create an extension method for this:
public static class Extension
{
public static int[] GetMostFrequent(this int[] input)
{
var tmpResult = from i in input
group i by i into k
select new
{
k.Key,
count = k.Count()
};
var max = tmpResult.Max(s => s.count);
return tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
}
}
You were very close. Just add one more line to your code.
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var counts = input
.GroupBy(i => i)
.Select(i => new { Number = i.Key, Count = i.Count()})
.OrderByDescending(i => i.Count);
var maxCount = counts.First().Count;
var result = counts
.Where(i=> i.Count == maxCount)
.Select(i => i.Number);
Result:
{1, 6}
Let's say we have the following three lists:
{ 1, 2, 2, 3 }
{ 2, 3, 3, 4 }
{ 2, 3, 4, 5, 5, 5 }
How can we convert the above into a single list in which each item is repeated the maximum number of times it is found in any one list? I.e.,
{1, 2, 2 (found twice in list 1), 3, 3 (twice in list 2), 4, 5, 5, 5 (three times in list 3)}
I can achieve the above through loops, however, I am looking for a LINQ method that might already be there.
The question is similar to "list union with duplicates" in Python.
LINQ in one line:
int[][] items = { new[]{ 1, 2, 2, 3 }, new[] { 2, 3, 3, 4 }, new[] { 2, 3, 4, 5, 5, 5 } };
var result = items.SelectMany(x => x.GroupBy(y => y)).GroupBy(x => x.Key).Select(x => x.OrderByDescending(y => y.Count()).First()).SelectMany(x => x);
https://dotnetfiddle.net/kZhseg
Here you go:
var xs = new [] { 1, 2, 2, 3 };
var ys = new [] { 2, 3, 3, 4 };
var zs = new [] { 2, 3, 4, 5, 5, 5 };
var result =
xs
.ToLookup(x => x)
.Concat(ys.ToLookup(x => x))
.Concat(zs.ToLookup(x => x))
.GroupBy(x => x.Key)
.Select(x => new { x.Key, count = x.Max(y => y.Count()) })
.SelectMany(x => Enumerable.Repeat(x.Key, x.count));
It gives the result you want.
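The same idea can be sketched for any number of lists instead of three fixed variables. This is only one possible shape; the class and method names are hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultisetUnionSketch
{
    // Hypothetical helper: repeats each value as many times as its
    // highest count in any single input list.
    public static List<int> MaxMultiplicityUnion(IEnumerable<IEnumerable<int>> lists)
    {
        return lists
            // per list: (value, how often it occurs in that list)
            .SelectMany(list => list
                .GroupBy(x => x)
                .Select(g => (Value: g.Key, Count: g.Count())))
            // across lists: collect all counts for the same value
            .GroupBy(p => p.Value)
            // repeat the value by its largest per-list count
            .SelectMany(g => Enumerable.Repeat(g.Key, g.Max(p => p.Count)))
            .ToList();
    }
}
```

Values come out in order of first appearance across the lists, because GroupBy preserves the order in which keys are first seen.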
I want to convert this part of code to LINQ.
Can anyone help me?
var list = new List<int[]>();
list.Add(new int[] { 1, 2, 3, 4 });
list.Add(new int[] { 5, 4, 2, 1 });
list.Add(new int[] { 5, 9, 3, 5 });
var result = new int[4];
foreach (var item in list)
{
for (int i = 0; i < 4; i++)
{
result[i] += item[i];
}
}
The result must be { 11, 15, 8, 10 } because those are the per-index sums.
I think this is the most readable version. No need to GroupBy, you can Sum every index of every array:
int[] result = Enumerable.Range(0, 4)
.Select(index => list.Sum(arr => arr[index]))
.ToArray();
Since the OP also uses a for-loop from 0 to 3, the arrays all seem to have the same size.
If that's not the case, you could use this safer approach, which treats a missing element as 0:
int maxLength = list.Max(arr => arr.Length);
int[] result = Enumerable.Range(0, maxLength)
.Select(index => list.Sum(arr => arr.ElementAtOrDefault(index)))
.ToArray();
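To illustrate how the ElementAtOrDefault version behaves with arrays of different lengths, here is a small sketch (the wrapper class, the method name and the empty-input guard are my additions, not part of the answer):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ColumnSumSketch
{
    // Hypothetical wrapper around the "safe" query above.
    public static int[] SumByIndex(IReadOnlyCollection<int[]> rows)
    {
        // Max() throws on an empty sequence, so handle that case up front.
        if (rows.Count == 0) return new int[0];

        int maxLength = rows.Max(arr => arr.Length);
        return Enumerable.Range(0, maxLength)
            // ElementAtOrDefault returns 0 when an array is too short
            .Select(index => rows.Sum(arr => arr.ElementAtOrDefault(index)))
            .ToArray();
    }
}
```

For example, the rows { 1, 2, 3, 4 }, { 5, 4 } and { 5, 9, 3 } give { 11, 15, 6, 4 }: shorter arrays simply contribute nothing past their end.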
The first thing that pops into my head:
var list = new List<int[]>();
list.Add(new int[] { 1, 2, 3, 4 });
list.Add(new int[] { 5, 4, 2, 1 });
list.Add(new int[] { 5, 9, 3, 5 });
var result = list.SelectMany(item => item.Select((innerItem, index) => new { index, innerItem }))
.GroupBy(item => item.index, (key, group) => group.Sum(item => item.innerItem))
.ToList();
Tim's approach above is cleaner and better.
You can try this one:
var list = new List<int[]>();
list.Add(new int[] { 1, 2, 3, 4 });
list.Add(new int[] { 5, 4, 2, 1 });
list.Add(new int[] { 5, 9, 3, 5 });
var result = list.SelectMany(x => x.Select((z, i) => new {z, i}))
.GroupBy(x=>x.i).Select(x=>x.Sum(z=>z.z)).ToArray();
You want to aggregate, so why not use LINQ's Aggregate?
var list = new List<int[]>();
list.Add(new int[] { 1, 2, 3, 4 });
list.Add(new int[] { 5, 4, 2, 1 });
list.Add(new int[] { 5, 9, 3, 5 });
var addArrayValues = new Func<int[], int[], int[]>(
(source, destination) =>
{
for (int i = 0; i < source.Length; i++)
destination[i] += source[i];
return destination;
});
var aggregateResult = list.Aggregate(new int[4],
(accumulator, current) => addArrayValues(current, accumulator));
I have a database field which contains string values.
I am looking for a way to find the 10 most frequently occurring words in that field.
First get all the words from that field:
IEnumerable<string> allWords = from entry in table
from word in entry.Field.Split(' ')
select word;
Then group them by word and order the groups by count, descending:
IEnumerable<string> result = from word in allWords
group word by word into grouped
let count = grouped.Count()
orderby count descending
select grouped.Key;
Get top 10 results:
result.Take(10);
var result =
Regex.Matches(s, @"\b\w+\b").OfType<Match>()
.GroupBy(k => k.Value, (g, u) => new { Word = g, Count = u.Count() })
.OrderByDescending(n => n.Count) // most frequent first
.Take(10);
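The regex approach can be fleshed out into a self-contained helper along these lines. The names are hypothetical, and the case-insensitive grouping is an assumption of mine; drop the comparer if "The" and "the" should count separately:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordFrequencySketch
{
    // Hypothetical helper: returns the 'take' most frequent words in 'text'.
    public static List<(string Word, int Count)> TopWords(string text, int take = 10)
    {
        return Regex.Matches(text, @"\b\w+\b")      // every run of word characters
            .Cast<Match>()
            // Assumption: words are compared case-insensitively.
            .GroupBy(m => m.Value, StringComparer.OrdinalIgnoreCase)
            .Select(g => (Word: g.Key, Count: g.Count()))
            .OrderByDescending(x => x.Count)        // most frequent first
            .Take(take)
            .ToList();
    }
}
```

Because OrderByDescending is a stable sort, words with equal counts keep the order in which they first appear in the text.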
Here you have an easy example with numbers:
class Program
{
static void Main(string[] args)
{
int[] nums = new int[] { 2, 3, 4, 5, 6, 1, 2, 3, 1, 1, 1, 7, 12, 451, 13,
46, 1, 1, 3, 2, 3, 4, 5, 3, 2, 4, 4, 5, 6, 6, 8, 9, 0};
var numberGroups =
(from n in nums
group n by n into g
orderby g.Count() descending
select new { Number = g.Key, Count = g.Count() }
).Take(10);
Console.ReadLine();
}
}
Regards
This might be either impossible or so obvious I keep passing over it.
I have a list of objects (let's say ints for this example):
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6 };
I'd like to be able to group by pairs with no regard to order or any other comparison, returning a new IGrouping object.
i.e.,
list.GroupBy(i => someLogicToProducePairs);
There's the very real possibility I may be approaching this problem from the wrong angle, however, the goal is to group a set of objects by a constant capacity. Any help is greatly appreciated.
Do you mean like this:
List<int> list = new List<int>() { 1, 2, 3, 4, 5, 6 };
IEnumerable<IGrouping<int,int>> groups =
list
.Select((n, i) => new { Group = i / 2, Value = n })
.GroupBy(g => g.Group, g => g.Value);
foreach (IGrouping<int, int> group in groups) {
Console.WriteLine(String.Join(", ", group.Select(n=>n.ToString()).ToArray()));
}
Output
1, 2
3, 4
5, 6
You can do something like this:
List<int> integers = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var p = integers.Select((x, index) => new { Num = index / 2, Val = x })
.GroupBy(y => y.Num);
int counter = 0;
// this function returns the keys for our groups.
Func<int> keyGenerator =
() =>
{
int keyValue = counter / 2;
counter += 1;
return keyValue;
};
var groups = list.GroupBy(i => keyGenerator());
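If an IGrouping is not strictly required, a different technique from the answers above is Enumerable.Chunk, which splits a sequence into fixed-size arrays. This is a minimal sketch assuming .NET 6 or later; the class and method names are illustrative only:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketch
{
    // Hypothetical helper; assumes .NET 6+, where Enumerable.Chunk exists.
    // Returns arrays rather than IGrouping objects.
    public static List<int[]> InGroupsOf(IEnumerable<int> source, int size)
    {
        return source.Chunk(size).ToList();   // the last chunk may be shorter
    }
}
```

Calling it with { 1, 2, 3, 4, 5 } and a size of 2 gives { 1, 2 }, { 3, 4 } and { 5 }. It also avoids relying on a side-effecting counter inside the key selector.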