How to get specific columnar maximums in List of arrays?

How to get specific columnar maximums in List of arrays? - c#

I have a list "XYZarr" filled with array of integers as shown below.
int[] intarr1 = new int[3]{ 1, 4, 15};
int[] intarr2 = new int[3] { 10, 5, 8};
int[] intarr3 = new int[3] { 7, 8, 12 };
int[] intarr4 = new int[3] { 7, 8, 9 };
List<int[]> XYZarr = new List<int[]>() { intarr1, intarr2, intarr3, intarr4 };
I want to get only three indexes from the list where int[2] is maximum(i.e third entry). so the resulting indexes should be 0,2,3 of the list "XYZarr"
I can do it by looping through the list , but i want to use LinQ for this. I am not able to phrase the question properly, but..i hope you get my point

How about this:
var result = Enumerable.Range(0, XYZarr.Count)
.OrderByDescending(i => XYZarr[i][2])
.Take(3)
.ToList();

I am assuming that "int[2] is maximum" means that the 3rd element in the array is the maximum element in that array.
Here is how:
var result =
XYZarr
.Select((list,index)
=> new {list,index})
.Where(x => x.list.Max() == x.list[2])
.Select(x => x.index)
.ToList();
If XYZarr contains more than 3 arrays and you just want to get 3 results, then use the Take method like this:
var result =
XYZarr
.Select((list,index)
=> new {list,index})
.Where(x => x.list.Max() == x.list[2])
.Select(x => x.index)
.Take(3)
.ToList();

Related

Check if a set exactly includes a subset using Linq taking into account duplicates

var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
subset.All(s => superset.Contains(s))
This code would return true, because 9 is included in the superset,but only once, I want an implementation that would take into account the duplicates, so it would return false

My thought was that you could group both sets by count, then test that the super group list contained every key from the sub group list and, in each case, the super count was greater than or equal to the corresponding subcount. I think that I've achieved that with the following:
var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
var subGroups = subset.GroupBy(n => n).ToArray();
var superGroups = superset.GroupBy(n => n).ToArray();
var basicResult = subset.All(n => superset.Contains(n));
var advancedResult = subGroups.All(subg => superGroups.Any(supg => subg.Key == supg.Key && subg.Count() <= supg.Count()));
Console.WriteLine(basicResult);
Console.WriteLine(advancedResult);
I did a few extra tests and it seemed to work but you can test some additional data sets to be sure.

Here is another solution :
var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
var subsetGroup = subset.GroupBy(x => x).Select(x => new { key = x.Key, count = x.Count() });
var supersetDict = superset.GroupBy(x => x).ToDictionary(x => x.Key, y => y.Count());
Boolean results = subsetGroup.All(x => supersetDict[x.key] >= x.count);

This works for me:
var subsetLookup = subset.ToLookup(x => x);
var supersetLookup = superset.ToLookup(x => x);
bool flag =
subsetLookup
.All(x => supersetLookup[x.Key].Count() >= subsetLookup[x.Key].Count());

That's not how sets and set operations work. Sets cannot contain duplicates.
You should treat the two arrays not as sets, but as (unordered) sequences. A possible algorithm would be: make a list from the sequence superset, then remove one by one each element of the sequence subset from the list until you are unable to find such an element in the list.
bool IsSubList(IEnumerable<int> sub, IEnumerable<int> super)
{
var list = super.ToList();
foreach (var item in sub)
{
if (!list.Remove(item))
return false; // not found in list, so sub is not a "sub-list" of super
}
return true; // all elements of sub were found in super
}
var subset = new[] { 9, 3 };
var superset = new[] { 9, 10, 5, 3,1, 3, 3 };
var isSubSet = IsSubList(subset, superset);

I want to get most frequent values using LINQ

I am trying to get the most frequent values in an array using LINQ in C#.
For example,
int[] input = {1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8};
output = {1, 6}
int[] input = {1, 2, 2, 3 ,3, 3, 5}
output = {3}
Please let me know how to build LINQ.
Please read be careful.
This is a different problem with Select most frequent value using LINQ
I have to choose only the most frequent values. The code below is similar, but I can't use Take(5) because I don't know the number of results.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
IEnumerable<int> top5 = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count())
.Take(5)
.Select(g => g.Key);
this output is {1, 2, 3, 4, 5}
but my expected output = {1, 2}
Please read the questions carefully and answer.
Thanks and regards.

Just to add to the plethora of answers:
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var result = input
.GroupBy(i => i)
.GroupBy(g => g.Count())
.OrderByDescending(g => g.Key)
.First()
.Select(g => g.Key)
.ToArray();
Console.WriteLine(string.Join(", ", result)); // Prints "1, 6"
[EDIT]
In case anyone finds this interesting, I compared the performance of the above between .net 4.8 and .net 5.0 as follows:
(1) Added a Comparer class to instrument the number of comparisons made:
class Comparer : IComparer<int>
{
public int Compare(int x, int y)
{
Console.WriteLine($"Comparing {x} with {y}");
return x.CompareTo(y);
}
}
(2) Modified the call to OrderByDescending() to pass a Comparer:
.OrderByDescending(g => g.Key, new Comparer())
(3) Multi-targeted my test console app to "net48" and "net5.0".
After making those changes the output was as follows:
For .net 4.8:
Comparing 1 with 3
Comparing 1 with 1
Comparing 1 with 2
Comparing 3 with 3
Comparing 3 with 2
Comparing 3 with 3
1, 6
For .net 5.0:
Comparing 3 with 1
Comparing 3 with 2
1, 6
As you can see, .net 5.0 is better optimised. For .net Framework however, (as /u/mjwills mentions below) it would likely be more performant to use a MaxBy() extension to avoid having to use OrderByDescending() - but only if instrumentation indicates that the sort is causing a performance issue.

If you want to do it in pure LINQ in one query you can group groups by count and select the max one:
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.OrderByDescending(gr => gr.Key)
.Take(1)
.SelectMany(g => g.Select(g => g.Key))
.ToList();
Note that it is not a most effective and clear solution.
UPD
A little bit more effective version using Aggregate to perform MaxBy. Note that it will fail for empty collections unlike the previous one:
var tops = nums
.GroupBy(i => i)
.GroupBy(grouping => grouping.Count())
.Aggregate((max, curr) => curr.Key > max.Key ? curr : max)
.Select(gr => gr.Key);
Also you can use MaxBy from MoreLinq or one introduced in .NET 6.

You can store your result in an IEnumerable of tuples with the first item being the number, the second item being the count of the number in your input array. Then you look at the count of your group with most elements, and take all the tuples where the second items equals your maximum.
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var intermediate = nums
.GroupBy(i => i)
.Select(g => (g.Key,g.Count()));
int amount = intermediate.Max(x => x.Item2);
IEnumerable<int> mostFrequent = intermediate
.Where(x => x.Item2 == amount)
.Select(x => x.Item1);
Online demo: https://dotnetfiddle.net/YCVGam

Use a variable to capture the number of items for the first item, then use TakeWhile to get all the groups with that number of items.
void Main()
{
var input = new[] { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
int numberOfItems = 0;
var output = input
.GroupBy(i => i)
.OrderByDescending(group => group.Count());
var maxNumberOfItems = output.FirstOrDefault()?.Count() ?? 0;
var finalOutput = output.TakeWhile(group => group.Count() == maxNumberOfItems).ToList();
foreach (var item in finalOutput)
{
Console.WriteLine($"Value {item.Key} has {item.Count()} members");
}
}
You can do this as a single query as well:
int? numberOfItems = null;
var finalOutput = input
.GroupBy(i => i)
.OrderByDescending(group => group.Count())
.TakeWhile(i =>
{
var count = i.Count();
numberOfItems ??= count;
return count == numberOfItems;
})
.ToList();

You could consider adding an extension-method. Something like
public static IEnumerable<T> TakeWhileEqual<T, T2>(this IEnumerable<T> collection, Func<T, T2> predicate)
where T2 : IEquatable<T2>
{
using var iter = collection.GetEnumerator();
if (iter.MoveNext())
{
var first = predicate(iter.Current);
yield return iter.Current;
while (iter.MoveNext() && predicate(iter.Current).Equals(first))
{
yield return iter.Current;
}
}
}
This has the advantage of being efficient, not needing to iterate over the collection more than once. But it does require some more code, even if this can be hidden in an extension method.

I think you probably want to use TakeWhile rather than Take;
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var n = nums
.GroupBy(i => i)
.OrderByDescending(g => g.Count());
var c = n.First().Count();
var r = n.TakeWhile(g => g.Count() == c)
.Select(g => g.Key);
If you want to do this in a single pass, without LINQ, you can use a Dictionary and a List track
a) how many times you saw a value and
b) what value you saw the most times
c) what other most-values you saw that many times
We skip through the list, trying to look the current value up in the dictionary. It either works or it doesn't - if it works, TryGetValue tells us how many times the current value has been seen. IF it doesn't, TryGetValue gives use a seen of 0. We increment seen. We take a look at how it compares to the max we've seen so far:
It's greater - we have a new leader in the "most frequent" contest - clear the current leaders list and start over with the new n as the leader. Also note the new max
It's equal - we have a tie for the lead; add the current n in among its peers
It's less - we don't care
int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
int maxSeen = int.MinValue;
var seens = new Dictionary<int, int>();
var maxes = new List<int>();
foreach(var n in nums){
seens.TryGetValue(n, out var seen);
seens[n] = ++seen;
if(seen > maxSeen){
maxes = new(){n};
maxSeen = seen;
} else if(seen == maxSeen)
maxes.Add(n);
}
You'll end up with maxes as a List<int> that is the list of numbers that appear most.
If you care about allocations of the List's internal array, you could consider clearing the list instead of newing; I new'd because it was a handy one liner to use an initializer with the new leader

You may first group the first input like that.
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var tmpResult = from i in input
group i by i into k
select new
{
k.Key,
count = k.Count()
};
then you can filter the max value of group like that;
var max = tmpResult.Max(s => s.count);
after you should make a filter is enough
int[] result = tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
Also you can create an Extension method for this.
public static class Extension
{
public static int[] GetMostFrequent(this int[] input)
{
var tmpResult = from i in input
group i by i into k
select new
{
k.Key,
count = k.Count()
};
var max = tmpResult.Max(s => s.count);
return tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
}

You were very close. Just add one more line to your code.
int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
var counts = input
.GroupBy(i => i)
.Select(i => new { Number = i.Key, Count = i.Count()})
.OrderByDescending(i => i.Count);
var maxCount = counts.First().Count;
var result = counts
.Where(i=> i.Count == maxCount)
.Select(i => i.Number);
result
{1,6}

C# Linq 2 Dimensional List Filter

Edit - Figuered this out thanks to #Monofuse:
List<List<int>> list1,
list2,
list2_flattened = list2.SelectMany(x => x).ToList(); // 1d list
list1 = list1.Select(x => x.Where(y => !list2_flattened.Contains(y)).ToList()).ToList(); // 2d list (definitely not the most efficient function, but my list is constrained to a size of about 20)
Given the 2 lists:
List<List<int>> list1;
List<List<int>> list2;
How would you filter the items in List1, such that you end up with items that don't exist in list2?
forgot to mention: list1 must keep the original structure (that is List<List>, so SelectMany isn't an option)
I'm looking for a linq solution
Thanks!

If I've understood you right, and you want to exclude list2 int values from list1 you can put
var result = list1
.Select(list => list
.Where(item => !list2
.SelectMany(dropList => dropList)
.Any(drop => drop == item))
.ToList())
.ToList();
For instance
List<List<int>> list1 = new List<List<int>>() {
new List<int>() { 1, 2, 2, 3, 3, 4, 4},
new List<int>() { 2,},
new List<int>() { 5, 6,}
};
// We should remove 2, 5, 3 whenever they appear in list1
List<List<int>> list2 = new List<List<int>>() {
new List<int>() { 2, 5},
new List<int>() { 3, 3},
};
var result = list1
.Select(list => list
.Where(item => !list2
.SelectMany(dropList => dropList)
.Any(drop => drop == item))
.ToList())
.ToList();
string report = string.Join(Environment.NewLine, result
.Select(line => $"[{string.Join(", ", line)}]"));
Console.Write(report);
Outcome:
[1, 4, 4]
[]
[6]

Not sure if you wanted to remove list1 lists where there is no longer anything in it. Also wasn't sure if you wanted to check that every item in a sublist matches all of a sublist of list2.
var list1 = new List<List<int>>();
var list2 = new List<List<int>>();
var flatList2 = list2.SelectMany(l2 => l2).Distinct();
var result = list1
.Select(o => o
.Where(inner => !flatList2.Contains(inner)))
.Where(o => o.Any());
The below one is exactly the same, it just has different variable names. I think it might help people understand a little more. As we are dealing with a two dimensional array, I always find it a little easier to think of it like a table.
var table = new List<List<int>>();
var table2 = new List<List<int>>();
var distinctColumns = table2.SelectMany(row => row).Distinct();
var result = table
.Select(row => row
.Where(column => !distinctColumns.Contains(column)))
.Where(row => row.Any());

Convert a sequence of numbers into a zero based order?

Suppose I have:
var correctOrder = new[] {2, 1, 0};
var actualPositionsFound = new[] {63,62,61];
How can I easily convert actualPositionsFound to a zero based sequence?
So if I had:
var actualPositionsFound = new[] {100,50,200];
I would like to end up with :
var result = new[] {1,0,2};
Update: In an attempt to make this clearer to avoid closing, what I believe is being asked for is to translate a list of numbers into another list of numbers representing the ascending order of the other list like a sort map, 0-based.
So { 16, 19, 2, 4 } would create a map { 2, 3, 0, 1 }, being 0-based.

If there are no duplicates:
var actualPositionsFound = new[] { 100, 50, 200 };
var indices = actualPositionsFound.OrderBy(n => n)
.Select((n, i) => new { n, i })
.ToDictionary(o => o.n, o => o.i);
var result = actualPositionsFound.Select(n => indices[n]).ToList();

Is it that you are looking for?
actualPositionsFound.Select((elem, idx) => new { elem, idx })
.OrderBy(wrap => wrap.elem)
.Select((wrap, idx) => new { wrap.idx, newIdx = idx })
.OrderBy(wrap => wrap.idx)
.Select(wrap => wrap.newIdx)
.ToArray();

actualPositionsFound
.OrderBy(x => x).ToList()
.Select(x => Array.IndexOf(actualPositionsFound,x)).ToArray();
This won't handle duplicates.

How to return duplicate values from an array?

Is there a way that I could return duplicate values from an array in C#? also im looking to write a small algorithm that would return the most number of duplicate values in an array. for example
[1, 2,2,2 3,3] I need to return the duplicate values with the most number of occurrences and the number of occurrences as well.
I think I saw some post which said that It could be done using Linq but I have no clue what Linq is
Any help would be much appreciated.

Try this:
int[] data = new int[] { 1, 2, 2, 2, 3, 3 };
IGrouping<int, int> mostOccurrences = data
.GroupBy(value => value)
.OrderByDescending(group => group.Count())
.First();
Console.WriteLine("Value {0} occurred {1} time(s).", mostOccurrences.Key, mostOccurrences.Count());
Note that if multiple values occur the same number of times (such as if you added another 3 to that list), the above code will only list one of them. To handle that situation, try this:
int[] data = new int[] { 1, 2, 2, 2, 3, 3, 3 };
var occurrenceInfos = data
.GroupBy(value => value)
.Select(group =>
new {
Count = group.Count(),
Value = group.Key
}
);
int maxOccurrenceCount = occurrenceInfos.Max(info => info.Count);
IEnumerable<int> maxOccurrenceValues = occurrenceInfos
.Where(info => info.Count == maxOccurrenceCount)
.Select(info => info.Value);
foreach (int value in maxOccurrenceValues)
Console.WriteLine("Value {0} occurred {1} time(s).", value, maxOccurrenceCount);

Here's my take on this:
var data = new[] { 1, 2, 2, 2, 3, 3, };
var occurences =
data
.ToLookup(x => x)
.ToDictionary(x => x.Key, x => x.Count());
var mostOccurences =
occurences
.OrderByDescending(x => x.Value)
.First();
These will give you the following results:

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

How to get specific columnar maximums in List of arrays? - c#

How about this: var result = Enumerable.Range(0, XYZarr.Count) .OrderByDescending(i => XYZarr[i][2]) .Take(3) .ToList();

Related

Check if a set exactly includes a subset using Linq taking into account duplicates

I want to get most frequent values using LINQ

C# Linq 2 Dimensional List Filter

Convert a sequence of numbers into a zero based order?

How to return duplicate values from an array?

Categories

Resources