How to get the mean, median and stdev from a list - C#

Is it possible to get the mean, median and stdev from a list of lists?
Here is the initial data I need to compute them from:
var myList = new List<List<double>>();
myList.Add(new List<double> { 1, 3, 6, 8});
myList.Add(new List<double> { 1, 2, 3, 4});
myList.Add(new List<double> { 1, 4, 8, 12});
The expected result is the mean, median, and stdev of the first and last index only:
Mean: 1, 8
Median: 1, 8
Stdev: 0, 3.265986324
I tried looping over the list to get the average, but I'm not sure this is the best way:
foreach(var i in myList )
{
Console.WriteLine(i[0].Average());
}
Any suggestions/comments? TIA

This will get you started by showing how to calculate the Average. Note the use of First or Last to get the first / last entry from each of the sub-lists.
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp5
{
class Program
{
static void Main(string[] args)
{
var myList = new List<List<double>>();
myList.Add(new List<double> { 1, 3, 6, 8 });
myList.Add(new List<double> { 1, 2, 3, 4 });
myList.Add(new List<double> { 1, 4, 8, 12 });
var averageFirst= myList.Select(z => z.First()).Average();
var averageLast = myList.Select(z => z.Last()).Average();
Console.WriteLine(averageFirst);
Console.WriteLine(averageLast);
Console.ReadLine();
}
}
}

I'll join forces with @mjwills (first and last)
Which is to say,
var listOfFirstValues = myList.Select(z => z.First()).ToList();
var listOfLastValues = myList.Select(z => z.Last()).ToList();
Mean
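// Average the extracted values; myList itself is a list of lists and cannot be averaged directly
var mean = listOfFirstValues.Average();
Median: the middle of a sorted list of numbers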
public double Median(List<double> numbers)
{
if (numbers.Count == 0)
return 0;
numbers = numbers.OrderBy(n=>n).ToList();
var halfIndex = numbers.Count()/2;
if (numbers.Count() % 2 == 0)
return (numbers[halfIndex] + numbers[halfIndex - 1]) / 2.0;
return numbers[halfIndex];
}
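Standard deviation: a quantity expressing by how much the members of a group differ from the mean value for the group
private double CalculateStdDev(List<double> values)
{
    // The sample formula needs at least two values (avoids dividing by zero)
    if (values.Count < 2)
        return 0;
    var avg = values.Average();
    var sum = values.Sum(d => Math.Pow(d - avg, 2));
    // Count - 1 gives the sample standard deviation; divide by Count instead
    // for the population value (3.265986324 in the question)
    return Math.Sqrt(sum / (values.Count - 1));
}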
Note : Totally untested, and lacking any sanity checks
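For reference, here is a minimal sketch that ties the pieces together. The class and method names (ColumnStats, Median, PopulationStdDev) are made up for illustration. It assumes the population formula (dividing by N), since that is what reproduces the 3.265986324 expected in the question; the sample formula (dividing by N - 1) would give 4 for the last values instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; names are illustrative only.
public static class ColumnStats
{
    // Median: the middle value of the sorted data,
    // or the mean of the two middle values for an even count.
    public static double Median(IEnumerable<double> values)
    {
        var sorted = values.OrderBy(v => v).ToList();
        if (sorted.Count == 0)
            throw new ArgumentException("Sequence is empty.", nameof(values));
        int mid = sorted.Count / 2;
        return sorted.Count % 2 == 0
            ? (sorted[mid - 1] + sorted[mid]) / 2.0
            : sorted[mid];
    }

    // Population standard deviation: divides by N, not N - 1.
    public static double PopulationStdDev(IEnumerable<double> values)
    {
        var list = values.ToList();
        if (list.Count == 0)
            throw new ArgumentException("Sequence is empty.", nameof(values));
        double mean = list.Average();
        double sumOfSquares = list.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(sumOfSquares / list.Count);
    }
}
```

With the data from the question, passing myList.Select(z => z.Last()) to these methods gives a median of 8 and a standard deviation of roughly 3.266, and myList.Select(z => z.First()) gives 1 and 0.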

Related

Add index position Value of an array of array using Linq

We can do sum using arr.Sum() function. But if it is an array of arrays. How will we add all values.
suppose data is
Array/List is [[1,2,3],[3,4,5],[5,4,3]]
how will you get s1 , sum of all first index value, s2 , sum of second index value and so on using LINQ.
If you want to sum up columns' values with a help of Linq:
int[][] source = new int[][] {
new int[] { 1, 2, 3},
new int[] { 3, 4, 5},
new int[] { 5, 4, 3},
};
int maxCol = source.Max(item => item.Length);
var colsSum = Enumerable
.Range(0, maxCol)
.Select(index => source.Sum(item => item.Length > index ? item[index] : 0))
.ToArray(); // let's materialize into an array
Test:
Console.Write(string.Join(", ", colsSum));
Outcome:
9, 10, 11
Summing up lines' values is easier:
// [6, 12, 12]
var linesSum = source
.Select(item => item.Sum())
.ToArray();
If you want total sum:
// 30
var total = source
.Select(item => item.Sum())
.Sum();
or
// 30
var total = source
.SelectMany(item => item)
.Sum();
Use combination of Aggregate and Zip
var arrays = new[]
{
new[] { 1, 2, 3 },
new[] { 3, 4, 5 },
new[] { 5, 4, 3 }
};
var result =
arrays.Aggregate(Enumerable.Repeat(0, 3),
(total, array) => total.Zip(array, (sum, current) => sum + current));
// result = { 9, 10, 11 }
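Enumerable.Zip applies the provided function to the items at the same index of both sequences. Note that the seed above hard-codes a width of 3, and Zip stops at the end of the shorter sequence.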
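If hard-coding the number of columns is a concern, one possible variation is to seed the fold with the first row instead. This is only a sketch: ColumnSums and SumColumns are hypothetical names, it assumes at least one row, and because Zip stops at the shorter sequence the result is only as wide as the shortest row.

```csharp
using System;
using System.Linq;

// Hypothetical helper; not part of the answer above.
public static class ColumnSums
{
    // Folds the rows together pairwise with Zip, starting from the first row.
    // Assumes rows contains at least one array.
    public static int[] SumColumns(int[][] rows)
    {
        return rows
            .Skip(1)
            .Aggregate(
                rows[0].AsEnumerable(),
                (totals, row) => totals.Zip(row, (sum, current) => sum + current))
            .ToArray();
    }
}
```

For rows of unequal length, the extra trailing values of the longer rows are dropped rather than summed, so the GroupBy or Dictionary approaches are a better fit for jagged input.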
A possible LINQ based approach (which will handle variable number of columns in each row):
using System;
using System.Collections.Generic;
using System.Linq;
namespace Test
{
public class Program
{
private static IEnumerable<int> GetTotalsPerColumn(int[][] inputData)
{
var data = inputData.SelectMany(z =>
{
return z.Select((item, index) => new { item, index });
})
.GroupBy(z => z.index)
.OrderBy(z => z.Key)
.Select(y => y.Select(z => z.item).Sum()
);
return data;
}
static void Main(string[] args)
{
var inputData = new[] {
new[] { 1, 2, 3, 5},
new[] { 3, 4, 5, 6},
new[] { 5, 4, 3},
};
var values = GetTotalsPerColumn(inputData);
foreach (var value in values)
{
Console.WriteLine(value);
}
Console.ReadLine();
}
}
}
If you are happy to avoid LINQ, this is another approach you could consider.
GetTotalsPerColumn populates a Dictionary where the key is the column number, and the value is the sum.
using System;
using System.Collections.Generic;
namespace Test
{
public class Program
{
static void Main(string[] args)
{
var inputData = new[] {
new[] { 1, 2, 3, 5},
new[] { 3, 4, 5, 6},
new[] { 5, 4, 3},
};
var values = GetTotalsPerColumn(inputData);
foreach (var value in values)
{
Console.WriteLine(value.Key + " - " + value.Value);
}
Console.ReadLine();
}
private static Dictionary<int, int> GetTotalsPerColumn(int[][] inputData)
{
var values = new Dictionary<int, int>();
foreach (var line in inputData)
{
for (int i = 0; i < line.Length; i++)
{
int tempValue;
values.TryGetValue(i, out tempValue);
tempValue += line[i];
values[i] = tempValue;
}
}
return values;
}
}
}

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
This does not include the pair, or 2 combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal. I think it includes the intersects of both pair A, B and pair B, A, which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
Maybe the fastest code (but not the most readable), complexity O(N):
// "sequences" is the jagged array initialised above
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
for (int i = 0; i < sequences.Length; i++)
{
    subHash.Clear();
    for (int j = 0; j < sequences[i].Length; j++)
    {
        int n = sequences[i][j];
        // Add returns false for a duplicate within the same sub-array; skip it
        if (!subHash.Add(n))
            continue;
        int counter;
        // counter stays 0 on the first occurrence
        dic.TryGetValue(n, out counter);
        dic[n] = counter + 1;
    }
}
// keep the values that were seen in at least two sequences
var result = dic.Where(p => p.Value > 1).Select(p => p.Key);
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
You can skip the pairs of sequences that have already been intersected; this way it will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How does it work?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 has already been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 have already been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach. It might be more efficient, and it also allows you to specify the minimum intersection count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct is using a set this should be pretty efficient. That can help to speed things up when there are many duplicates across the sequences.
The inner loop just checks whether each sequence contains the unique item. Therefore it uses Enumerable.Contains, which stops as soon as one matching item is found (so duplicates are no issue).
If the intersection count reaches the minimum intersection count, this item is yielded and the next (unique) item is checked.
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure there are no duplicates within each sequence. Then you join all sequences into a single sequence and look for duplicates in it.
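The same two steps can also be wrapped in an extension method with a configurable threshold. This is only a sketch; InAtLeast and its minCount parameter are hypothetical names, not something from the answers above, and it relies on the default equality comparer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical extension; names are illustrative only.
public static class SequenceExtensions
{
    // Returns the values that occur in at least minCount of the sequences.
    public static IEnumerable<T> InAtLeast<T>(
        this IEnumerable<IEnumerable<T>> source, int minCount = 2)
    {
        return source
            .SelectMany(seq => seq.Distinct())          // each sequence contributes a value once
            .GroupBy(value => value)                    // one group per distinct value
            .Where(group => group.Count() >= minCount)  // group size = number of sequences containing it
            .Select(group => group.Key);
    }
}
```

With the default minCount of 2 this matches the behaviour asked for in the question; the order of the results follows the first occurrence of each value, so add an OrderBy if sorted output matters.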

LINQ: Separating single list to multiple lists

I have a single array with these entries:
{1, 1, 2, 2, 3,3,3, 4}
and I want to transform them to (3 lists in this case):
{1,2,3,4}
{1,2,3}
{3}
Is there any way to do this with LINQ or SQL? I guess there's a mathematical term for this operation, which I don't know unfortunately...
Or do I have to do it with loops?
=======
EDIT: I can't really describe the logic, so here are more examples. It more or less loops over the array multiple times and takes every number once (but every number only once per round) until there are no numbers left.
{1, 1, 2, 2, 3,3,3, 4, 5}
would be
{1,2,3,4,5}
{1,2,3}
{3}
or
{1, 1, 2, 2,2, 3,3,3, 4, 5}
would be
{1,2,3,4,5}
{1,2,3}
{2,3}
private IEnumerable<List<int>> FooSplit(IEnumerable<int> items)
{
List<int> source = new List<int>(items);
while (source.Any())
{
var result = source.Distinct().ToList();
yield return result;
result.ForEach(item => source.Remove(item));
}
}
Usage:
int[] items = { 1, 1, 2, 2, 3, 3, 3, 4 };
foreach(var subList in FooSplit(items))
{
// here you have your three sublists
}
Here is another solution, which is less readable but should perform better:
private IEnumerable<IEnumerable<int>> FooSplit(IEnumerable<int> items)
{
    var groups = items.GroupBy(i => i).Select(g => g.ToList()).ToList();
    while (groups.Count > 0)
    {
        // ToList() runs the removals now; a lazy Select would defer them until
        // the caller enumerates, so RemoveAll below might never see an empty group
        yield return groups.Select(g =>
            { var i = g[0]; g.RemoveAt(g.Count - 1); return i; }).ToList();
        groups.RemoveAll(g => g.Count == 0);
    }
}
This does the job:
static void Main(string[] args)
{
int[] numbers = {1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5};
List<int> nums = new List<int>(numbers.Length);
nums.AddRange(numbers);
while (nums.Count > 0)
{
int[] n = nums.Distinct().ToArray();
for (int i = 0; i < n.Count(); i++)
{
Console.Write("{0}\t", n[i]);
nums.Remove(n[i]);
}
Console.WriteLine();
}
Console.Read();
}
Here's an alternative console app:
class Program
{
class Freq
{
public int Num { get; set; }
public int Count { get; set; }
}
static void Main(string[] args)
{
var nums = new[] { 1, 1, 2, 2, 3, 3, 3, 4 };
var groups = nums.GroupBy(i => i).Select(g => new Freq { Num = g.Key, Count = g.Count() }).ToList();
while (groups.Any(g => g.Count > 0))
{
var list = groups.Where(g => g.Count > 0).Select(g => g.Num).ToList();
list.ForEach(li => groups.First(g => g.Num == li).Count--);
Console.WriteLine(String.Join(",", list));
}
Console.ReadKey();
}
}
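Since the question asks whether this can be done with LINQ alone, here is one possible sketch without explicit loops. The idea is to number each occurrence of a value (0 for the first 1, 1 for the second 1, and so on) and then group by that number, so that all n-th occurrences end up in the n-th list. RoundSplitter and SplitIntoRounds are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; names are illustrative only.
public static class RoundSplitter
{
    public static List<List<int>> SplitIntoRounds(IEnumerable<int> items)
    {
        return items
            .GroupBy(value => value)
            // Tag each occurrence with its 0-based position within its value group.
            .SelectMany(group => group.Select((value, round) => new { value, round }))
            // All n-th occurrences together form the n-th output list.
            .GroupBy(pair => pair.round, pair => pair.value)
            .OrderBy(group => group.Key)
            .Select(group => group.ToList())
            .ToList();
    }
}
```

For { 1, 1, 2, 2, 2, 3, 3, 3, 4, 5 } this yields { 1, 2, 3, 4, 5 }, { 1, 2, 3 } and { 2, 3 }, matching the last example in the question.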

Align multiple sorted lists

If I have, for example the following List<int>s
{ 1, 2, 3, 4 } //list1
{ 2, 3, 5, 6 } //list2
...
{ 3, 4, 5 } //listN
What is the best way to retrieve the following corresponding List<int?>s?
{ 1, 2, 3, 4, null, null } //list1
{ null, 2, 3, null, 5, 6 } //list2
...
{ null, null, 3, 4, 5, null } //listN
I'm posting the solution we discussed in chat. I had an unoptimized version using Linq for all things loopy/filtering:
http://ideone.com/H4gCoE (live demo)
However, I suspect it won't be too performant because of all the enumerator classes created, and the collections being instantiated/modified along the way.
So I took the time to optimize it into handwritten loops, with some bookkeeping to keep track of the active iterators instead of modifying the iters collection. Here it is:
See http://ideone.com/FuZIDy for full live demo.
Note: I assume the lists are pre-sorted according to Comparer<T>.Default, since I use LINQ's Min() extension method without a custom comparer.
public static IEnumerable<IEnumerable<T>> AlignSequences<T>(this IEnumerable<IEnumerable<T>> sequences)
{
var iters = sequences
.Select((s, index) => new { active=true, index, enumerator = s.GetEnumerator() })
.ToArray();
var isActive = iters.Select(it => it.enumerator.MoveNext()).ToArray();
var numactive = isActive.Count(flag => flag);
try
{
while (numactive > 0)
{
T min = iters
.Where(it => isActive[it.index])
.Min(it => it.enumerator.Current);
var row = new T[iters.Count()];
for (int j = 0; j < isActive.Length; j++)
{
if (!isActive[j] || !Equals(iters[j].enumerator.Current, min))
continue;
row[j] = min;
if (!iters[j].enumerator.MoveNext())
{
isActive[j] = false;
numactive -= 1;
}
}
yield return row;
}
}
finally
{
foreach (var iter in iters) iter.enumerator.Dispose();
}
}
Use it like this:
public static void Main(string[] args)
{
var list1 = new int?[] { 1, 2, 3, 4, 5 };
var list2 = new int?[] { 3, 4, 5, 6, 7 };
var list3 = new int?[] { 6, 9, 9 };
var lockstep = AlignSequences(new[] { list1, list2, list3 });
foreach (var step in lockstep)
Console.WriteLine(string.Join("\t", step.Select(i => i.HasValue ? i.Value.ToString() : "null").ToArray()));
}
It prints (for demo purposes I print the results sideways):
1 null null
2 null null
3 3 null
4 4 null
5 5 null
null 6 6
null 7 null
null null 9
null null 9
Note: You might like to change the interface to accept an arbitrary number of lists, instead of a single sequence of sequences:
public static IEnumerable<IEnumerable<T>> AlignSequences<T>(params IEnumerable<T>[] sequences)
That way you could just call
var lockstep = AlignSequences(list1, list2, list3);
Here's another approach using List.BinarySearch.
sample data:
var list1 = new List<int>() { 1, 2, 3, 4 };
var list2 = new List<int>() { 2, 3, 5, 6, 7, 8 };
var list3 = new List<int>() { 3, 4, 5 };
var all = new List<List<int>>() { list1, list2, list3 };
calculate min/max and all nullable-lists:
int min = all.Min(l => l.Min());
int max = all.Max(l => l.Max());
// start from smallest number and end with highest, fill all between
int count = max - min + 1;
List<int?> l1Result = new List<int?>(count);
List<int?> l2Result = new List<int?>(count);
List<int?> l3Result = new List<int?>(count);
foreach (int val in Enumerable.Range(min, count))
{
if (list1.BinarySearch(val) >= 0)
l1Result.Add(val);
else
l1Result.Add(new Nullable<int>());
if (list2.BinarySearch(val) >= 0)
l2Result.Add(val);
else
l2Result.Add(new Nullable<int>());
if (list3.BinarySearch(val) >= 0)
l3Result.Add(val);
else
l3Result.Add(new Nullable<int>());
}
output:
Console.WriteLine(string.Join(",", l1Result.Select(i => !i.HasValue ? "NULL" : i.Value.ToString())));
Console.WriteLine(string.Join(",", l2Result.Select(i => !i.HasValue ? "NULL" : i.Value.ToString())));
Console.WriteLine(string.Join(",", l3Result.Select(i => !i.HasValue ? "NULL" : i.Value.ToString())));
1,2,3,4,NULL,NULL,NULL,NULL
NULL,2,3,NULL,5,6,7,8
NULL,NULL,3,4,5,NULL,NULL,NULL
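The BinarySearch version above fills every integer between the minimum and maximum and is written for exactly three lists. A possible generalisation, sketched below, handles any number of lists and only creates columns for values that actually occur, which is the shape shown in the question. ListAligner and Align are hypothetical names, and the sketch assumes each input list contains no duplicates.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; names are illustrative only.
public static class ListAligner
{
    // Assumes each input list contains no duplicate values.
    public static List<List<int?>> Align(IEnumerable<List<int>> lists)
    {
        var all = lists.ToList();
        // Column keys: every distinct value across all lists, ascending.
        var keys = all.SelectMany(list => list).Distinct().OrderBy(v => v).ToList();
        return all
            .Select(list =>
            {
                var present = new HashSet<int>(list);
                // Emit the value where the list has it, null where it does not.
                return keys.Select(k => present.Contains(k) ? (int?)k : null).ToList();
            })
            .ToList();
    }
}
```

For the three lists in the question this produces { 1, 2, 3, 4, null, null }, { null, 2, 3, null, 5, 6 } and { null, null, 3, 4, 5, null }. Lists with repeated values would need the enumerator-based approach instead.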

How to display how many times an array element appears

I am new to C# and hope I can get some help on this topic. I have an array with elements and I need to display how many times every item appears.
For instance, in [1, 2, 3, 4, 4, 4, 3], 1 appears one time, 4 appears three times, and so on.
I have done the following but don't know what to put in the foreach/if statement...
int[] List = new int[]{1,2,3,4,5,4,4,3};
foreach(int d in List)
{
if("here I want to check for the elements")
}
Thank you, and sorry if this is a very basic one...
You can handle this via Enumerable.GroupBy. I recommend looking at the C# LINQ samples section on Count and GroupBy for guidance.
In your case, this can be:
int[] values = new []{1,2,3,4,5,4,4,3};
var groups = values.GroupBy(v => v);
foreach(var group in groups)
Console.WriteLine("Value {0} has {1} items", group.Key, group.Count());
You can keep a Dictionary of items found as well as their associated counts. In the example below, dict[d] refers to an element by its value. For example d = 4.
int[] List = new int[]{1,2,3,4,5,4,4,3};
var dict = new Dictionary<int, int>();
foreach(int d in List)
{
if (dict.ContainsKey(d))
dict[d]++;
else
dict.Add(d, 1);
}
When the foreach loop terminates you'll have one entry per unique value in dict. You can get the count of each item by accessing dict[d], where d is some integer value from your original list.
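To make the last step concrete, here is a small sketch that builds the same kind of dictionary and then reads the counts back. OccurrenceCounter, CountOccurrences and PrintCounts are hypothetical names; it uses TryGetValue instead of ContainsKey so that each value needs only one lookup before the update.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative only.
public static class OccurrenceCounter
{
    // Builds a value -> count dictionary in a single pass.
    public static Dictionary<int, int> CountOccurrences(IEnumerable<int> values)
    {
        var counts = new Dictionary<int, int>();
        foreach (int value in values)
        {
            int current;
            // current is left at 0 when the key is not present yet
            counts.TryGetValue(value, out current);
            counts[value] = current + 1;
        }
        return counts;
    }

    // Writes one line per distinct value.
    public static void PrintCounts(Dictionary<int, int> counts)
    {
        foreach (KeyValuePair<int, int> entry in counts)
        {
            Console.WriteLine("{0} appears {1} time(s)", entry.Key, entry.Value);
        }
    }
}
```

For the array from the question, counts[4] is 3 and counts[3] is 2.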
The LINQ answers are nice, but if you're trying to do it yourself:
int[] numberFound = new int[6];
int[] List = new int[] { 1, 2, 3, 4, 5, 4, 4, 3 };
foreach (int d in List)
{
numberFound[d]++;
}
var list = new int[] { 1, 2, 3, 4, 5, 4, 4, 3 };
var groups = list.GroupBy(i => i).Select(i => new { Number = i.Key, Count = i.Count() });
private static void CalculateNumberOfOccurenceSingleLoop()
{
int[] intergernumberArrays = { 1, 2, 3, 4, 1, 2, 4, 1, 2, 3, 5, 6, 1, 2, 1, 1, 2 };
Dictionary<int, int> NumberOccurence = new Dictionary<int, int>();
for (int i = 0; i < intergernumberArrays.Length; i++)
{
if (NumberOccurence.ContainsKey(intergernumberArrays[i]))
{
// Direct indexer lookup instead of a linear Where scan over the dictionary
NumberOccurence[intergernumberArrays[i]]++;
}
else
{
NumberOccurence.Add(intergernumberArrays[i], 1);
}
}
foreach (KeyValuePair<int, int> item in NumberOccurence)
{
Console.WriteLine(item.Key + " " + item.Value);
}
Console.ReadLine();
}
