Figuring out GroupBy in this code snippet - c#

I have a piece of code that I am struggling with figuring out. Not sure what's going on. It should return the most occurring numbers within an array (and it does).
It outputs the following => [2, 3].
I have tried to make my questions as readable as possible, sorry for any eye-strain.
I am struggling to understand the following code:
.GroupBy(..., numbersGroup => numbersGroup.Key),
.OrderByDescending(supergroup => supergroup.Key)
.First()
Could someone help explain this code to me?
I will write down comments inside of the code as far as I have understood it.
int[] numbers1 = { 1, 2, 3, 3, 2, 4 };
// First in GroupBy(x => x) I group all numbers within the array (remove all
// duplicates too?), now my array looks like this [1,2,3,4].
int[] result = numbers1.GroupBy(x => x)
// In GroupBy(numbersGroup => numbersGroup.Count()) I collect all the
// different amounts of occurrences within the array, that would be 1 (1, 4)
// and 2 for (2, 3) so my array should look like this now [1, 2].
// Now this is where things get out of hand, what happens at the rest of it? I
// have tried for 4 hours now and can't figure it out. What exactly happens in
// numbersGroup => numbersGroup.Key? .OrderByDescending(supergroup => supergroup.Key)?
.GroupBy(numbersGroup => numbersGroup.Count(), numbersGroup => numbersGroup.Key)
.OrderByDescending(supergroup => supergroup.Key)
.First()
.ToArray();

Code with my comments:
int[] numbers1 = { 1, 2, 3, 3, 2, 4 };
// First in GroupBy(x => x) all numbers are grouped by their values, so the data is now a sequence of IGrouping<int, int> like this (formatted as a dict for readability, {key: value}): {1: [1], 2: [2, 2], 3: [3, 3], 4: [4]} - the number is the key, the value is the list of its occurrences.
int[] result = numbers1.GroupBy(x => x)
// Group again, this time by the element count of each group. You get something like this: {1: [1, 4], 2: [2, 3]} - the element count is the key, the value is the list of previous keys
.GroupBy(numbersGroup => numbersGroup.Count(), numbersGroup => numbersGroup.Key)
// sort groups by elements count descending: {2: [2, 3], 1: [1, 4]}
.OrderByDescending(supergroup => supergroup.Key)
// select group with max key (2): [2, 3]
.First()
// create array from this group: [2, 3]
.ToArray();

Every once in a while, I'll come across code that has a lot of chaining, like:
int[] noIdeaWhyThisIsAnArray = Something.DosomethingElse().AndThenAnotherThing().Etc();
Whenever I have trouble understanding this, I break it into steps and use the "var" keyword to simplify things:
var step1 = Something.DosomethingElse();
var step2 = step1.AndThenAnotherThing();
var step3 = step2.Etc();
Then add a breakpoint after step3 is assigned, run the debugger/application, and then start checking out the variables in the Locals tab. In your case, the code would look like:
int[] numbers1 = { 1, 2, 3, 3, 2, 4 };
var step1 = numbers1.GroupBy(x => x);
var step2 = step1.GroupBy(numbersGroup => numbersGroup.Count(), numbersGroup => numbersGroup.Key);
var step3 = step2.OrderByDescending(supergroup => supergroup.Key);
var step4 = step3.First();
var step5 = step4.ToArray();
That said, to answer your specific question:
The first GroupBy simply creates groups of each value/number. So all the 1s go into the first group, then all the 2s go into the next group, all the 3s go into the next group, etc. For example, the second group has two entries in it, both of them containing "2".
So at this point, there are a total of 4 groups (because there are 4 unique values). Two of the groups have 1 value each, and then the other two groups have 2 values each.
The next step then groups them by that count, so you end up with two groups, where the key indicates how many of each item there are. So the first group has a key of 1, and two values - "1" and "4", which means "1" and "4" both showed up once. The second group has a key of 2, and two values - "2" and "3", which means that "2" and "3" both showed up twice.
The third step orders that result in descending order of the key (and remember, the key indicates how many times those values showed up), so the MOST-frequently-occurring number(s) will be the first element in the result, and the LEAST-frequently-occurring number(s) will be the last element in the result.
The fourth step just takes that first result, which again, is the list of MOST-frequently-occurring numbers, in this case "2" and "3".
Finally, the fifth step takes that list and converts it to an array so instead of it being a Linq grouping object, it's just a simple array of those two numbers.
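If no debugger is at hand, the same five steps can be printed to the console instead. This is only a sketch of the walkthrough above; the variable names byValue, byCount and mostFrequent are made up for illustration and are not part of the original snippet.

```csharp
using System;
using System.Linq;

int[] numbers1 = { 1, 2, 3, 3, 2, 4 };

// Step 1: one group per distinct value; each group holds every occurrence.
var byValue = numbers1.GroupBy(x => x);
foreach (var valueGroup in byValue)
    Console.WriteLine($"value {valueGroup.Key}: [{string.Join(", ", valueGroup)}]");

// Step 2: regroup by occurrence count; the second lambda keeps only the value.
var byCount = byValue.GroupBy(g => g.Count(), g => g.Key);
foreach (var countGroup in byCount)
    Console.WriteLine($"count {countGroup.Key}: [{string.Join(", ", countGroup)}]");

// Steps 3-5: highest count first, take that group, copy it into an array.
int[] mostFrequent = byCount.OrderByDescending(sg => sg.Key).First().ToArray();
Console.WriteLine(string.Join(", ", mostFrequent));
```

The last line should print the two most frequent numbers, matching the [2, 3] the question reports.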

Related

Should I use the Sum method and Count/Length to find the element of an array that is closest to the middle value of all elements?

If I have arr=[1,3,4,-7,9,11], the average value is (1+3+4-7+9+11) / 6 = 3.5. Elements 3 and 4 are equally distant from 3.5, but the smaller of them is 3, so 3 is the result.
You need to find out what the average is first. That involves a loop, either implemented explicitly or invoked implicitly. So, let's assume that you already know what the average value is, because your question refers to the way some values related to the average can be obtained. Let's implement a comparison function:
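protected bool isBetter(double a, double b, double avg) {
double absA = Math.Abs(a - avg);
double absB = Math.Abs(b - avg);
// a is better when it is strictly closer to the average
if (absA < absB) return true;
else if (absA > absB) return false;
// on a tie, the smaller value wins
return a < b;
}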
And now you can iterate your array, always compare via isBetter the current value with the best so far and if it's better, then it will be the new best. Whatever number ended up to be the best will be the result.
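The iteration described above might look roughly like the following. It is a sketch that assumes a helper returning true when its first argument is the better candidate; the local function IsBetter and the variable best are names chosen here for illustration.

```csharp
using System;
using System.Linq;

int[] arr = { 1, 3, 4, -7, 9, 11 };
double avg = arr.Average();

// Start with the first element and let every later element challenge it.
int best = arr[0];
foreach (int candidate in arr.Skip(1))
{
    if (IsBetter(candidate, best, avg))
        best = candidate;
}
Console.WriteLine(best);

// True when a is closer to avg than b; on a tie the smaller value wins.
static bool IsBetter(double a, double b, double avg)
{
    double absA = Math.Abs(a - avg);
    double absB = Math.Abs(b - avg);
    if (absA != absB) return absA < absB;
    return a < b;
}
```

For the array in the question this ends with 3 as the best value, since 3 and 4 are both 0.5 away from 3.5 and the tie goes to the smaller one.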
Assuming you have worked out the average (avg below), you can get the diff for each item, order by the diff, and take the first item. This will give you the closest item in the array:
var nearestDiff = arr.Select(x => new { Value=x, Diff = Math.Abs(avg-x)})
.OrderBy(x => x.Diff)
.First().Value;
Live example: https://dotnetfiddle.net/iKvmhp
If instead you must get the closest item lower than the average:
var lowerDiff = arr.Where(x => x<avg)
.OrderByDescending(x =>x)
.First();
You'll need using System.Linq for either of the above to work.
Using GroupBy is a good way to do it
var arr = new int[] { 1, 4, 3, -7, 9, 11 };
var avg = arr.Average();
var result = arr.GroupBy(x=>Math.Abs(avg-x)).OrderBy(g=>g.Key).First().OrderBy(x=>x).First();
Original Array
[1,4,3,-7,9,11]
After grouping, key is abs distance from average, items are grouped according to that
[2.5, [1]]
[0.5, [4, 3]]
[5.5, [9]]
[7.5, [11]]
[10.5, [-7]]
Order by group keys
[0.5, [4, 3]]
[2.5, [1]]
[5.5, [9]]
[7.5, [11]]
[10.5, [-7]]
Take first group
[4, 3]
Order group items
[3, 4]
Take first item
3
changed array to [1,4,3,-7,9,11], reversing order of 3 and 4 because they are naturally ordered according to the output originally, and this is necessary to prove the last step

Get IndexOf Second int record in a sorted List in C#

I am having a problem while trying to get the index of the first and second record (not the second highest/lowest integer) from a sorted List. Let's say the list consists of three records that, in order, are: 0, 0, 1.
I tried like this:
int FirstNumberIndex = MyList.IndexOf(MyList.OrderBy(item => item).Take(1).ToArray()[0]); //returns first record index, true
int SecondNumberIndex = MyList.IndexOf(MyList.OrderBy(item => item).Take(2).ToArray()[1]); //doesn't seem to work
As I explained, I am trying to get the indexes of first two zeros (they are not necessarily in ascending order before the sort) and not of zero and 1.
So if there was a list {0, 2, 4, 0} I need to get Indexes 0 and 3. But this may apply to any number that is smallest and repeats itself in the List.
However, it must also work when the smallest value does not repeat itself.
SecondNumberIndex is set to 0 because
MyList.OrderBy(item => item).Take(2).ToArray()[1] == 0
then you get
MyList.IndexOf(0)
that finds the first occurrence of 0. 0 is equal to every other 0. So every time you ask for IndexOf(0), the very first 0 on the list gets found.
You can get what you want by using this sort of approach:
int FirstNumberIndex = MyList.IndexOf(0); //returns first record index, true
int SecondNumberIndex = MyList.IndexOf(0, FirstNumberIndex + 1 ); //will start the search just after the last occurrence
From your code I guess you confuse some kind of "instance equality" with regular "equality".
int is a value type; IndexOf will not search for an occurrence of your specific instance of 0.
Keep in mind that this code, even if we imagine actual objects instead of ints:
MyList.OrderBy(item => item).Take(2).ToArray()[1]
relies on OrderBy being a stable sort to return equal objects in their original relative order from the input list; an unstable sort such as List<T>.Sort gives no such guarantee.
EDIT
This cannot be adapted to the general case of getting the indexes of ordered values from the original, unordered list.
If you are searching for the indexes of any number of equal values, then setting a bigger and bigger offset for the second parameter of IndexOf is OK.
But let's consider a case when there are no duplicates. Such an approach will work only when the input list is actually ordered ;)
You can preprocess your input list into pairs (value = list[i], idx = i), then sort those pairs by value, and then iterate over the sorted pairs and print the idx values.
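A minimal sketch of that pairing idea, using the {0, 2, 4, 0} list from the question. The names sorted, firstIndex and secondIndex are illustrative, and the ThenBy call is an addition made here so that ties are resolved by original position explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 0, 2, 4, 0 };

// Pair each value with its original position, then sort by value.
// ThenBy makes equal values come out in their original order.
var sorted = list
    .Select((value, idx) => (value, idx))
    .OrderBy(pair => pair.value)
    .ThenBy(pair => pair.idx)
    .ToList();

// The first two sorted entries still remember where they came from.
int firstIndex = sorted[0].idx;
int secondIndex = sorted[1].idx;
Console.WriteLine($"{firstIndex}, {secondIndex}");
```

Because each pair carries its own index, this also works when the smallest value does not repeat: the second entry is then simply the next-smallest value.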
You are probably asking about something like this:
var list = new List<int>{0,0,1};
var result = list.Select((val,i)=> new {value = val, idx = i}).Where(x=>x.value == 0);
foreach(var r in result) //anonymous type enumeration
Console.WriteLine(r.idx);
You can try using FindIndex.
var MyList = new List<int>() {3, 5, 1, 2, 4};
int firstIndex = MyList.FindIndex(a => a == MyList.OrderBy(item => item).Take(1).ToArray()[0]);
int secondIndex = MyList.FindIndex(a => a == MyList.OrderBy(item => item).Take(2).ToArray()[1]);
You could calculate the offset of the first occurrence, then use IndexOf on the list after skipping the offset.
int offset = ints.IndexOf(0) + 1;
int secondIndex = ints.Skip(offset).ToList().IndexOf(0) + offset;

Aggregate function. Sequence of steps

Here is a simple task and its solution, which uses the Aggregate function. I have a general idea of how to use this function (for example, summing elements or multiplying numbers). However, I cannot figure out the exact sequence of steps in this solution.
We have an array that contains 4 distinct integer values, and a string whose characters are 1-based indexes into that array.
int[] nums = new int[] {1, 2, 3, 4};
string str = "123214";
We need to count the number of appearances of each index, multiply it by the corresponding value, and then sum it all up, so that the answer is 13.
Here is the solution, using aggregate function:
str.Aggregate(0, (i, c) => i + nums[c - '1']);
What is the sequence of steps that this function performs?
In this case you can work it out using equational reasoning:
str.Aggregate(0, (i, c) => i + nums[c - '1'])
= "123214".Aggregate(0,(i,c) => i+nums[c-'1'])
(i := 0, c := '1' => 0+nums['1'-'1'] = 0+nums[0]=0+1 = 1)
= "23214".Aggregate(1, ...)
(i := 1, c := '2' => 1+nums[1]=1+2=3)
= "3214".Aggregate(3, ...)
(i := 3, c:='3' => 3+nums[2]=3+3=6)
= "214".Aggregate(6,...)
(i:=6, c:='2' => 6+nums[1]=6+2=8)
= "14".Aggregate(8,...)
(i:=8,c:='1' => 8+nums[0]=8+1=9)
= "4".Aggregate(9,...)
(i:=9,c:='4' => 9+nums[3]=9+4=13)
= "".Aggregate(13,...)
= 13
I hope this helps
To really learn things like this I would encourage you to look into functional programming (look for folds) - basically the first parameter in Aggregate is a state that gets passed around and second parameter is a function taking the old state and the next element in the enumeration (in this case the next character in your string) and has to produce a new state. So in your case the state is just a number and you calculate it by adding up the old state with a lookup from your nums array based on the numeric value of your character (as index).
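The "state that gets passed around" can also be written out as a plain loop. The following is a sketch of what Aggregate does with a seed, not its actual implementation; the variable names state and viaAggregate are made up here.

```csharp
using System;
using System.Linq;

int[] nums = { 1, 2, 3, 4 };
string str = "123214";

// Hand-written fold: start from the seed, update the state once per character.
int state = 0;
foreach (char c in str)
{
    // c - '1' turns the character '1'..'4' into the index 0..3.
    state = state + nums[c - '1'];
}

// The same computation expressed with Aggregate.
int viaAggregate = str.Aggregate(0, (acc, ch) => acc + nums[ch - '1']);

Console.WriteLine($"{state} {viaAggregate}");
```

Both values come out as 13, matching the trace above.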

C# Calculate items in List<int> values vertically

I have a list of int values, something like below (the upper and lower bounds are dynamic)
1, 2, 3
4, 6, 0
5, 7, 1
I want to sum the values column-wise, like
1 + 4 + 5 = 10
2 + 6 + 7 = 15
3 + 0 + 1 = 4
Expected Result = 10,15,4
Any help would be appreciated
Thanks
Deepu
Here's the input data using array literals, but the subsequent code works exactly the same on arrays or lists.
var grid = new []
{
new [] {1, 2, 3},
new [] {4, 6, 0},
new [] {5, 7, 1},
};
Now produce a sequence with one item for each column (take the number of elements in the shortest row), in which the value of the item is the sum of the row[column] value:
var totals = Enumerable.Range(0, grid.Min(row => row.Count()))
.Select(column => grid.Sum(row => row[column]));
Print that:
foreach (var total in totals)
Console.WriteLine(total);
If you use a 2D array you can just sum the first, second,... column of each row.
If you use a 1D array you can simply use a modulo:
int[] results = new int[colCount];
for (int i = 0; i < list.Count; i++)
{
// element i belongs to column i % colCount
results[i % colCount] += list[i];
}
Do you have to use a List object? If not, I would use a two-dimensional array.
Otherwise, you could work out how to reach rows and columns separately, so you can add the numbers within a simple for-loop. It depends on the methods of the List object.
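With a two-dimensional array, the column sums could be computed along these lines. This is a sketch that assumes the data from the question is stored as an int[,]; the names grid and totals are illustrative.

```csharp
using System;

int[,] grid =
{
    { 1, 2, 3 },
    { 4, 6, 0 },
    { 5, 7, 1 },
};

int rows = grid.GetLength(0);   // size of the first dimension
int cols = grid.GetLength(1);   // size of the second dimension
int[] totals = new int[cols];

// Walk every cell and add it to the running total of its column.
for (int r = 0; r < rows; r++)
    for (int c = 0; c < cols; c++)
        totals[c] += grid[r, c];

Console.WriteLine(string.Join(",", totals));
```

For the sample data this prints 10,15,4, the expected result from the question. GetLength keeps the loop bounds dynamic, so the same code works for other grid sizes.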
Quite inflexible based on the question, but how about:
int ans = 0;
for(int i = 0; i < list.Count; i+=3)
{
ans+= list[i];
}
You could either run the same thing 3 times with a different initial iterator value, or put the whole thing in another loop with startValue as an iterator that runs 3 times.
Having said this, you may want to a) look at a different way of storing your data if, indeed, they are in a single list, or b) look at more flexible ways to do this, or wrap it in a function that lets you take into account different column counts, etc.
Cheers,
Adam

LINQ - is SkipWhile broken?

I'm a bit surprised to find the results of the following code, where I simply want to remove all 3s from a sequence of ints:
var sequence = new [] { 1, 1, 2, 3 };
var result = sequence.SkipWhile(i => i == 3); // Oh noes! Returns { 1, 1, 2, 3 }
Why isn't 3 skipped?
My next thought was, OK, the Except operator will do the trick:
var sequence = new [] { 1, 1, 2, 3 };
var result = sequence.Except(new [] { 3 }); // Oh noes! Returns { 1, 2 }
In summary,
Except removes the 3, but also removes non-distinct elements. Grr.
SkipWhile doesn't skip the last element, even if it matches the condition. Grr.
Can someone explain why SkipWhile doesn't skip the last element? And can anyone suggest what LINQ operator I can use to remove the '3' from the sequence above?
It's not broken. SkipWhile will only skip items in the beginning of the IEnumerable<T>. Once that condition isn't met it will happily take the rest of the elements. Other elements that later match it down the road won't be skipped.
int[] sequence = { 3, 3, 1, 1, 2, 3 };
var result = sequence.SkipWhile(i => i == 3);
// Result: 1, 1, 2, 3
// To remove every 3, filter with Where instead:
var filtered = sequence.Where(i => i != 3);
// Result: 1, 1, 2
The SkipWhile and TakeWhile operators skip or return elements from a sequence while a predicate function passes (returns true). The first element that doesn't pass the predicate function ends the process of evaluation.
//Bypasses elements in a sequence as long as a specified condition is true and returns the remaining elements.
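To see the difference side by side, here is a small sketch that runs SkipWhile, TakeWhile and Where over the same input. The variable names skipped, taken and filtered are chosen here for illustration.

```csharp
using System;
using System.Linq;

int[] sequence = { 3, 3, 1, 1, 2, 3 };

// SkipWhile drops only the leading run of 3s; the trailing 3 survives.
int[] skipped = sequence.SkipWhile(i => i == 3).ToArray();

// TakeWhile keeps only that leading run and stops at the first non-3.
int[] taken = sequence.TakeWhile(i => i == 3).ToArray();

// Where tests every element, so all 3s are removed and duplicates are kept.
int[] filtered = sequence.Where(i => i != 3).ToArray();

Console.WriteLine(string.Join(", ", skipped));   // 1, 1, 2, 3
Console.WriteLine(string.Join(", ", taken));     // 3, 3
Console.WriteLine(string.Join(", ", filtered));  // 1, 1, 2
```

Where is the operator that matches the original goal of removing every 3 without collapsing the repeated 1s.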
One solution you may find useful is the List<T>.FindAll method.
List<int> aggregator = new List<int> { 1, 2, 3, 3, 3, 4 };
List<int> result = aggregator.FindAll(b => b != 3);
Ahmad already answered your question, but here's another option:
var result = from i in sequence where i != 3 select i;
