LINQ Grouping by Sum Value - c#

Say I have a class like so:
public class Work
{
public string Name;
public double Time;
public Work(string name, double time)
{
Name = name;
Time = time;
}
}
And I have a List<Work> with about 20 values that are all filled in:
List<Work> workToDo = new List<Work>();
// Populate workToDo
Is there any possible way that I can group workToDo into segments where each segments sum of Time is a particular value? Say workToDo has values like so:
Name | Time
A | 3.50
B | 2.75
C | 4.25
D | 2.50
E | 5.25
F | 3.75
If I want the sum of times to be 7, each segment or List<Work> should have a bunch of values where the sum of all the Times is 7 or close to it. Is this even remotely possible or is it just a stupid question/idea? I am using this code to separate workToDo into segments of 4:
var query = workToDo.Select(x => x.Time)
.Select((x, i) => new { Index = i, Value = x})
.GroupBy(y => y.Index / 4)
.ToList();
But I am not sure how to do it based on the Times.

Here's a query that segments your data in groups where the times are near to 7, but not over:
Func<List<Work>,int,int,double> sumOfRange = (list, start, end) => list
.Skip(start)
.TakeWhile ((x, index) => index <= end)
.ToList()
.Sum (l => l.Time);
double segmentSize = 7;
var result = Enumerable.Range(0, workToDo.Count ())
.Select (index => workToDo
.Skip(index)
.TakeWhile ((x,i) => sumOfRange(workToDo, index, i)
<= segmentSize));
The output for your example data set is:
A 3.5
B 2.75
total: 6.25
B 2.75
C 4.25
total: 7
C 4.25
D 2.5
total: 6.75
D 2.5
total: 2.5
E 5.25
total: 5.25
F 3.75
total: 3.75
If you want to allow a segments to total over seven, then you could increase the segmentSize variable by 25% or so (i.e. make it 8.75).

This solution recurses through all combinations and returns the ones whose sums are close enough to the target sum.
Here is the pretty front-end method that lets you specify the list of work, the target sum, and how close the sums must be:
public List<List<Work>> GetCombinations(List<Work> workList,
double targetSum,
double threshhold)
{
return GetCombinations(0,
new List<Work>(),
workList,
targetSum - threshhold,
targetSum + threshhold);
}
Here is the recursive method that does all of the work:
private List<List<Work>> GetCombinations(double currentSum,
List<Work> currentWorks,
List<Work> remainingWorks,
double minSum,
double maxSum)
{
// Filter out the works that would go over the maxSum.
var newRemainingWorks = remainingWorks.Where(x => currentSum + x.Time <= maxSum)
.ToList();
// Create the possible combinations by adding each newRemainingWork to the
// list of current works.
var sums = newRemainingWorks
.Select(x => new
{
Works = currentWorks.Concat(new [] { x }).ToList(),
Sum = currentSum + x.Time
})
.ToList();
// The initial combinations are the possible combinations that are
// within the sum range.
var combinations = sums.Where(x => x.Sum >= minSum).Select(x => x.Works);
// The additional combinations get determined in the recursive call.
var newCombinations = from index in Enumerable.Range(0, sums.Count)
from combo in GetCombinations
(
sums[index].Sum,
sums[index].Works,
newRemainingWorks.Skip(index + 1).ToList(),
minSum,
maxSum
)
select combo;
return combinations.Concat(newCombinations).ToList();
}
This line will get combinations that sum to 7 +/- 1:
GetCombinations(workToDo, 7, 1);

What you are describing is a packing problem (where the tasks are being packed into 7-hour containers). Whilst it would be possible to use LINQ syntax in a solution to this problem, there is no solution inherent in LINQ that I am aware of.

Related

How can calculate the mode using array. Im only a beginner in programming. its for our finals project [duplicate]

This question already has answers here:
Find character with most occurrences in string?
(12 answers)
Closed 7 years ago.
I want to find the Mode in an Array. I know that I have to do nested loops to check each value and see how often the element in the array appears. Then I have to count the number of times the second element appears. The code below doesn't work, can anyone help me please.
for (int i = 0; i < x.length; i ++)
{
x[i]++;
int high = 0;
for (int i = 0; i < x.length; i++)
{
if (x[i] > high)
high = x[i];
}
}
Using nested loops is not a good way to solve this problem. It will have a run time of O(n^2) - much worse than the optimal O(n).
You can do it with LINQ by grouping identical values and then finding the group with the largest count:
int mode = x.GroupBy(v => v)
.OrderByDescending(g => g.Count())
.First()
.Key;
This is both simpler and faster. But note that (unlike LINQ to SQL) LINQ to Objects currently doesn't optimize the OrderByDescending when only the first result is needed. It fully sorts the entire result set which is an O(n log n) operation.
You might want this O(n) algorithm instead. It first iterates once through the groups to find the maximum count, and then once more to find the first corresponding key for that count:
var groups = x.GroupBy(v => v);
int maxCount = groups.Max(g => g.Count());
int mode = groups.First(g => g.Count() == maxCount).Key;
You could also use the MaxBy extension from MoreLINQ method to further improve the solution so that it only requires iterating through all elements once.
A non LINQ solution:
int[] x = new int[] { 1, 2, 1, 2, 4, 3, 2 };
Dictionary<int, int> counts = new Dictionary<int, int>();
foreach( int a in x ) {
if ( counts.ContainsKey(a) )
counts[a] = counts[a]+1
else
counts[a] = 1
}
int result = int.MinValue;
int max = int.MinValue;
foreach (int key in counts.Keys) {
if (counts[key] > max) {
max = counts[key];
result = key;
}
}
Console.WriteLine("The mode is: " + result);
As a beginner, this might not make too much sense, but it's worth providing a LINQ based solution.
x
.GroupBy(i => i) //place all identical values into groups
.OrderByDescending(g => g.Count()) //order groups by the size of the group desc
.Select(g => g.Key) //key of the group is representative of items in the group
.First() //first in the list is the most frequent (modal) value
Say, x array has items as below:
int[] x = { 1, 2, 6, 2, 3, 8, 2, 2, 3, 4, 5, 6, 4, 4, 4, 5, 39, 4, 5 };
a. Getting highest value:
int high = x.OrderByDescending(n => n).First();
b. Getting modal:
int mode = x.GroupBy(i => i) //Grouping same items
.OrderByDescending(g => g.Count()) //now getting frequency of a value
.Select(g => g.Key) //selecting key of the group
.FirstOrDefault(); //Finally, taking the most frequent value

Group dateTime by hour range

I got a list like this:
class Article
{
...
Public DateTime PubTime{get;set}
...
}
List<Article> articles
Now I want to group this list with hour range :[0-5,6-11,12-17,18-23]
I know there is a cumbersome way to do this:
var firstRange = articles.Count(a => a.PubTime.Hour >= 0 && a.PubTime.Hour <= 5);
But I want to use a elegant way. How can I do that?Use Linq Or anything others?
Group by Hour / 6:
var grouped = articles.GroupBy(a => a.PubTime.Hour / 6);
IDictionary<int, int> CountsByHourGrouping = grouped.ToDictionary(g => g.Key, g => g.Count());
The key in the dictionary is the period (0 representing 0-5, 1 representing 6-11, 2 representing 12-17, and 3 representing 18-23). The value is the count of articles in that period.
Note that your dictionary will only contain values where those times existed in the source data, so it won't always contain 4 items.
You could write a CheckRange Function, which takes your values and returns a bool. To make your code more reusable and elegant.
Function Example:
bool CheckRange (this int number, int min, int max)
=> return (number >= min && number <= max);
You could now use this function to check if the PubTime.Hour is in the correct timelimit.
Implementation Example:
var firstRange = articles.Count(a => a.CheckRange(0, 5));

Min-Max DataPoint Normilization

I have a list of DataPoint such as
List<DataPoint> newpoints=new List<DataPoint>();
where the DataPoint is a class consists of nine double features from A to I , and the
newpoints.count=100000 double points (i.e each point consists of nine double features from A to I)
I need to apply the normilization for List newpoints using Min-Max normilization method and the scale_range between 0 and 1.
I have implemneted so far the following steps
each DataPoints feature is assigned to one dimensional array. for example, the code for feature A
for (int i = 0; i < newpoints.Count; i++)
{ array_A[i] = newpoints[i].A;} and so on for all nine double features
I have applied the max-min normilization method. for example, the code for feature A:
normilized_featureA= (((array_A[i] - array_A.Min()) * (1 - 0)) /
(array_A.Max() - array_A.Min()))+0;
the method is succssfuly done but it takes more time (i.e. 3 minutes and 45 seconds)
how can i apply the Max_min normilization using the LINQ code in C# to reduce my time to a few seconds?
I found this question in Stackoverflow How to normalize a list of int values but my problem is
double valueMax = list.Max(); // I need Max point for feature A for all 100000
double valueMin = list.Min(); //I need Min point for feature A for all 100000
and so on for all others nine features
your help will be highly appreciated.
As an alternative to modelling your 9 features as double properties on a class "DataPoint", you could also model a datapoint of 9 doubles as an array, with the benefit being that you can do all 9 calculations in one pass, again, using LINQ:
var newpoints = new List<double[]>
{
new []{1.23, 2.34, 3.45, 4.56, 5.67, 6.78, 7.89, 8.90, 9.12},
new []{2.34, 3.45, 4.56, 5.67, 6.78, 7.89, 8.90, 9.12, 12.23},
new []{3.45, 4.56, 5.67, 6.78, 7.89, 8.90, 9.12, 12.23, 13.34},
new []{4.56, 5.67, 6.78, 7.89, 8.90, 9.12, 12.23, 13.34, 15.32}
};
var featureStats = newpoints
// We make the assumption that all 9 data points are present on each row.
.First()
// 2 Anon Projections - first to determine min / max as a function of column
.Select((np, idx) => new
{
Idx = idx,
Max = newpoints.Max(x => x[idx]),
Min = newpoints.Min(x => x[idx])
})
// Second to add in the dynamic Range
.Select(x => new {
x.Idx,
x.Max,
x.Min,
Range = x.Max - x.Min
})
// Back to array for O(1) lookups.
.ToArray();
// Do the normalizaton for the columns, for each row.
var normalizedFeatures = newpoints
.Select(np => np.Select(
(i, idx) => (i - featureStats[idx].Min) / featureStats[idx].Range));
foreach(var datapoint in normalizedFeatures)
{
Console.WriteLine(string.Join(",", datapoint.Select(x => x.ToString("0.00"))));
}
Result:
0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
0.33,0.33,0.33,0.33,0.34,0.47,0.23,0.05,0.50
0.67,0.67,0.67,0.67,0.69,0.91,0.28,0.75,0.68
1.00,1.00,1.00,1.00,1.00,1.00,1.00,1.00,1.00
Stop recalculating the maximum/minimum over and over again, it doesn't change.
double maxInFeatureA = array_A.Max();
double minInFeatureA = array_A.Min();
// somewher in the loop:
normilized_featureA= (((array_A[i] - minInFeatureA ) * (1 - 0)) /
(maxInFeatureA - minInFeatureA ))+0;
Max/Min is really expensive for array when used in foreach/for with many elements.
I suggest you to take this code: Array data normalization
and use it as
var normalizedPoints = newPoints.Select(x => x.A)
.NormalizeData(1, 1)
.ToList();
double min = newpoints.Min(p => p.A);
double max = newpoints.Max(p => p.A);
double readonly normalizer = 1 / (max - min);
var normalizedFeatureA = newpoints.Select(p => (p.A - min) * normalizer);

Grouping by every n minutes

I'm playing around with LINQ and I was wondering how easy could it be to group by minutes, but instead of having each minute, I would like to group by every 5 minutes.
For example, currently I have:
var q = (from cr in JK_ChallengeResponses
where cr.Challenge_id == 114
group cr.Challenge_id
by new { cr.Updated_date.Date, cr.Updated_date.Hour, cr.Updated_date.Minute }
into g
select new {
Day = new DateTime(g.Key.Date.Year, g.Key.Date.Month, g.Key.Date.Day, g.Key.Hour, g.Key.Minute, 0),
Total = g.Count()
}).OrderBy(x => x.Day);
What do I have to do to group my result for each 5 minutes?
To group by n of something, you can use the following formula to create "buckets":
((int)(total / bucket_size)) * bucket_size
This will take the total, divide it, cast to integer to drop off any decimals, and then multiply again, which ends up with multiples of bucket_size. So for instance (/ is integer division, so casting isn't necessary):
group cr.Challenge_id
by new { cr.Updated_Date.Year, cr.Updated_Date.Month,
cr.Updated_Date.Day, cr.Updated_Date.Hour,
Minute = (cr.Updated_Date.Minute / 5) * 5 }
//Data comes for every 3 Hours
where (Convert.ToDateTime(capcityprogressrow["Data Captured Date"].ToString()).Date != DateTime.Now.Date || (Convert.ToDateTime(capcityprogressrow["Data Captured Date"].ToString()).Date == DateTime.Now.Date && (Convert.ToInt16(capcityprogressrow["Data Captured Time"])) % 3 == 0))
group capcityprogressrow by new { WCGID = Convert.ToInt32(Conversions.GetIntEntityValue("WCGID", capcityprogressrow)), WCGName = Conversions.GetEntityValue("WIRECENTERGROUPNAME", capcityprogressrow).ToString(), DueDate = Convert.ToDateTime(capcityprogressrow["Data Captured Date"]), DueTime = capcityprogressrow["Data Captured Time"].ToString() } into WCGDateGroup
// For orderby with acsending order
.OrderBy(x => x.WcgName).ThenBy(x => x.DataCapturedDateTime).ThenBy(x => x.DataCapturedTime).ToList();

Generating Numbers in sequence

How can i generate numbers using LinQ in this sequence given the startIndex,count of numbers and the maximum number.For example:
Sample Numbers = 1,2,3,4
StartIndex = 1 (i.e it should start from 1)
Sequence number count = 3
Maximum number = 4 (i.e till 4)
Expected result given the above details :
1,2,3
1,3,4
1,2,4
Is there a way to do it using linQ?
If you didn't need the length of you sequences to be dynamic, then you could use:
var startindex=1;
var maxindex=4;
var data = Enumerable.Range(startindex,maxindex);
var qry = from x in data
where x == startindex
from y in data
where x < y
from z in data
where y < z
select new { x, y, z };
foreach (var tuple in qry) {
Console.WriteLine("{0} {1} {2}", tuple.x, tuple.y, tuple.z);
}
The sequence length is hardcoded to 3, because there are 3 enumerables being joined: x, y, z.
If you want to dynamically join an arbitrary number of enumerables, then you can use Eric Lippert's Cartesian Product Linq example.
You pass a set of k sequences of N items, and it will return a set of all combinations of length k.
Now, you don't want repeated elements in your results.
So, I added the following to Eric's example:
where accseq.All(accitem => accitem < item)
Here's the final solution (edited for clarity):
var startindex=1;
var maxindex=7;
var count = 3;
// Make a set of count-1 sequences, whose values range from startindex+1 to maxindex
List<List<int>> sequences = new List<List<int>>();
// We only want permutations starting with startindex. So, the first input sequence to be joined should only have the value startindex.
List<int> list1 = new List<int>();
list1.Add(startindex);
sequences.Add(list1);
// The rest of the input sequences to be joined should contain the range startindex+1 .. maxindex
for (int i=1; i< count; i++)
{
sequences.Add(Enumerable.Range(startindex+1,maxindex-startindex).ToList());
}
// Generate the permutations of the input sequences
IEnumerable<IEnumerable<int>> emptyProduct = new[] { Enumerable.Empty<int>() };
var result = sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
where accseq.All(accitem => accitem < item)
select accseq.Concat(new[] {item}));
// Show the result
foreach (var x in result)
{
Console.WriteLine(x);
}
Try this function.
public static IEnumerable<IEnumerable<T>> Permute<T>(IEnumerable<T> list)
{
if (list.Count() == 1)
return new List<IEnumerable<T>> { list };
return list.Select((a, i1) => Permute(list.Where((b, i2) => i2 != i1)).Select(b => (new List<T> { a }).Union(b)))
.SelectMany(c => c);
}
//Here Range(startindex, count)
List<int> list1 = Enumerable.Range(1, 3).ToList();
//generates all permutations
var permutationlist = Permute(list1);
Okay, first let's state the problem clearly. I'll assume your numbers are int but that's a mostly irrelevant detail but the concreteness makes thinking go more smo
You have a sequence a_0, a_1, a_2, ..., a_N of int.
You have an integer k satisfying 1 <= k <= N + 1.
You have a starting index start >=0 and an ending index end <= N.
You want all subsequences a_i0, a_i1, a_i2, ..., a_ik of length k such that i0 = start and ik = end.
Then your algorithm is simple. You want to produce all combinations of size k - 2 of { start + 1, ..., end - 1 }. Given such a combination j1, j2, ..., j(k-1), order it, call the resulting ordered sequence i1, i2, ..., i(k-1) and return the sequence a_start, a_i1, a_i2, ..., a_i(k-1), a_end.
Now that you know a formal statement of the problem, and what you need to solve it, resources abound for generating said combinations. cf. Google search : Generating combinations C# or Knuth Volume 4A.

Categories