Grouping by every n minutes - c#

I'm playing around with LINQ and was wondering how easy it would be to group by minutes, except that instead of one group per minute I would like one group per 5 minutes.
For example, currently I have:
var q = (from cr in JK_ChallengeResponses
         where cr.Challenge_id == 114
         group cr.Challenge_id
            by new { cr.Updated_date.Date, cr.Updated_date.Hour, cr.Updated_date.Minute }
            into g
         select new {
             Day = new DateTime(g.Key.Date.Year, g.Key.Date.Month, g.Key.Date.Day, g.Key.Hour, g.Key.Minute, 0),
             Total = g.Count()
         }).OrderBy(x => x.Day);
What do I have to do to group my result for each 5 minutes?

To group by n of something, you can use the following formula to create "buckets":
((int)(total / bucket_size)) * bucket_size
This takes the total, divides it by the bucket size, casts the result to an integer to drop the fractional part, and then multiplies again, which always yields a multiple of bucket_size. For instance (with int operands, / in C# is already integer division, so the cast isn't necessary):
group cr.Challenge_id
   by new { cr.Updated_date.Year, cr.Updated_date.Month,
            cr.Updated_date.Day, cr.Updated_date.Hour,
            Minute = (cr.Updated_date.Minute / 5) * 5 }
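As a rough illustration of the bucket formula, here is a minimal in-memory (LINQ to Objects) sketch written as C# 9+ top-level statements. The sample timestamps and the names stamps, bucketSize and buckets are made up for the example; whether a query provider such as LINQ to SQL can translate a DateTime constructor inside the grouping key is not checked here, which is one reason the answer above groups by an anonymous type instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample timestamps standing in for cr.Updated_date.
var stamps = new List<DateTime>
{
    new DateTime(2020, 1, 1, 10, 1, 0),
    new DateTime(2020, 1, 1, 10, 4, 59),
    new DateTime(2020, 1, 1, 10, 5, 0),
    new DateTime(2020, 1, 1, 10, 14, 30),
};

const int bucketSize = 5; // minutes per bucket

// Integer division drops the remainder, so (Minute / 5) * 5 maps
// minutes 0-4 to 0, minutes 5-9 to 5, minutes 10-14 to 10, and so on.
var buckets = stamps
    .GroupBy(t => new DateTime(t.Year, t.Month, t.Day, t.Hour,
                               (t.Minute / bucketSize) * bucketSize, 0))
    .OrderBy(g => g.Key)
    .Select(g => (Start: g.Key, Total: g.Count()))
    .ToList();

foreach (var b in buckets)
    Console.WriteLine($"{b.Start:HH:mm} {b.Total}");
```

With this sample data the two timestamps before 10:05 share one bucket, while the other two land in buckets of their own.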

// Data arrives every 3 hours
where (Convert.ToDateTime(capcityprogressrow["Data Captured Date"].ToString()).Date != DateTime.Now.Date
       || (Convert.ToDateTime(capcityprogressrow["Data Captured Date"].ToString()).Date == DateTime.Now.Date
           && (Convert.ToInt16(capcityprogressrow["Data Captured Time"])) % 3 == 0))
group capcityprogressrow by new
{
    WCGID = Convert.ToInt32(Conversions.GetIntEntityValue("WCGID", capcityprogressrow)),
    WCGName = Conversions.GetEntityValue("WIRECENTERGROUPNAME", capcityprogressrow).ToString(),
    DueDate = Convert.ToDateTime(capcityprogressrow["Data Captured Date"]),
    DueTime = capcityprogressrow["Data Captured Time"].ToString()
} into WCGDateGroup
// Sort the results in ascending order
.OrderBy(x => x.WcgName).ThenBy(x => x.DataCapturedDateTime).ThenBy(x => x.DataCapturedTime).ToList();

Related

Summing hours under a threshold using linq

I am working on a staff rota app and want to sum only the hours below a threshold of 2250 minutes (37.5 hours). I am struggling to isolate these hours, for two reasons. Firstly, I'm new to LINQ.
Secondly, there are two different pay types that can be entered into the app, so I have to sum both pay types using .Sum(), which is fine. The problem I'm having is isolating only the summed hours below 37.5 hours.
I am grouping the results and running something like
g.Sum(x => x.Start >= start && x.End <= End ? x.Type1 : 0) +
g.Sum(x => x.Start >= start && x.End <= End ? x.Type2 : 0) <= 2250 ? .....
// then count the hours below here.
Now I get that this example will return zero if the count exceeds 2250, but how do I create a subset of the values below 2250 only?
Assuming that Type1 is minutes worked for pay type 1 and Type2 is minutes worked for pay type 2, you could go about it as follows:
1. Filter the relevant items based on your Start and End conditions.
2. For each item, calculate the total minutes worked (Type1 + Type2).
3. Filter the total minutes by comparing them to the threshold.
4. Sum the total minutes that are at or below the threshold.
In LINQ:
- filtering can be done using .Where()
- projecting each item in the collection to a new value can be done using .Select()
Implementation:
int threshold = 2250;
int filteredTotalMinutes = g
.Where(x => x.Start >= start && x.End <= end)
.Select(x => x.Type1 + x.Type2)
.Where(minutesWorked => minutesWorked <= threshold)
.Sum();
Another possible approach is to do all the filtering first, and then sum the total minutes worked:
int filteredTotalMinutes = g
.Where(x => x.Start >= start && x.End <= end)
.Where(x => x.Type1 + x.Type2 <= threshold)
.Sum(x => x.Type1 + x.Type2);
These implementations will only take into account the work time of the employees that have worked less than or equal to the threshold.
If you instead need to include work time for all employees, but cap each employee's contribution at the threshold (i.e. for each employee, use sum = x.Type1 + x.Type2 if sum is less than or equal to the threshold; otherwise use the threshold), you can use Math.Min() to take the lower of the total minutes worked (x.Type1 + x.Type2) and the threshold.
The implementation can now be simplified:
int filteredTotalMinutes = g
.Where(x => x.Start >= start && x.End <= end)
.Sum(x => Math.Min(x.Type1 + x.Type2, threshold));
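To make the difference between the two behaviours concrete, here is a small sketch that runs both on the same made-up rows. It assumes plain int minutes, leaves out the Start/End filter for brevity, and uses hypothetical names (rows, filtered, capped); it is written as C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rota rows: minutes worked under each pay type.
var rows = new List<(int Type1, int Type2)>
{
    (1200, 600),   // 1800 in total, under the threshold
    (2000, 500),   // 2500 in total, over the threshold
    (2250, 0),     // exactly at the threshold
};

int threshold = 2250;

// Variant A: drop rows whose total exceeds the threshold.
int filtered = rows
    .Select(x => x.Type1 + x.Type2)
    .Where(minutes => minutes <= threshold)
    .Sum();

// Variant B: keep every row, but cap each contribution at the threshold.
int capped = rows.Sum(x => Math.Min(x.Type1 + x.Type2, threshold));

Console.WriteLine($"filtered = {filtered}, capped = {capped}");
```

Here the filtered variant ignores the 2500-minute row entirely (1800 + 2250 = 4050), whereas the capped variant counts it as 2250 (1800 + 2250 + 2250 = 6300).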
If Type1 and/or Type2 are nullable (e.g. int?), you need to ensure that Math.Min() actually has int values to work with. You will then need to provide a fallback for each nullable value.
This can be achieved by replacing x.Type* with (x.Type* ?? 0), which reads:
Take the value of x.Type* if x.Type* is not null; else, take 0.
If both Type* properties are nullable, the implementation hence becomes:
int filteredTotalMinutes = g
.Where(x => x.Start >= start && x.End <= end)
.Sum(x => Math.Min((x.Type1 ?? 0) + (x.Type2 ?? 0), threshold));
If you cannot use Math.Min(), you could instead use the conditional (ternary) operator to select the desired portion of work time for each employee. I would first calculate the total minutes worked for each employee, and then decide whether the total or the threshold value should be used:
int filteredTotalMinutes = g
.Where(x => x.Start >= start && x.End <= end)
.Select(x => (x.Type1 ?? 0) + (x.Type2 ?? 0))
.Sum(minutesWorked => minutesWorked < threshold
? minutesWorked
: threshold);

Group dateTime by hour range

I have a list like this:
class Article
{
    ...
    public DateTime PubTime { get; set; }
    ...
}
List<Article> articles;
Now I want to group this list by hour range: [0-5, 6-11, 12-17, 18-23].
I know there is a cumbersome way to do this:
var firstRange = articles.Count(a => a.PubTime.Hour >= 0 && a.PubTime.Hour <= 5);
But I would like a more elegant way. How can I do that, using LINQ or anything else?
Group by Hour / 6:
var grouped = articles.GroupBy(a => a.PubTime.Hour / 6);
IDictionary<int, int> CountsByHourGrouping = grouped.ToDictionary(g => g.Key, g => g.Count());
The key in the dictionary is the period (0 representing 0-5, 1 representing 6-11, 2 representing 12-17, and 3 representing 18-23). The value is the count of articles in that period.
Note that your dictionary will only contain values where those times existed in the source data, so it won't always contain 4 items.
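If you always want all four periods, one possible approach is to walk the period numbers 0 to 3 and fall back to a count of 0 where the dictionary has no entry. The following is only a sketch with made-up publication times; pubTimes, counts and allPeriods are illustrative names, and it is written as C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical publication times; none fall in 12-17 or 18-23.
var pubTimes = new List<DateTime>
{
    new DateTime(2020, 1, 1, 1, 15, 0),
    new DateTime(2020, 1, 1, 5, 59, 0),
    new DateTime(2020, 1, 1, 6, 0, 0),
};

// Counts only for the periods that actually occur in the data.
var counts = pubTimes
    .GroupBy(t => t.Hour / 6)
    .ToDictionary(g => g.Key, g => g.Count());

// Walk all four periods and use 0 for the ones that are missing.
var allPeriods = Enumerable.Range(0, 4)
    .ToDictionary(p => p, p => counts.TryGetValue(p, out var c) ? c : 0);

foreach (var kv in allPeriods)
    Console.WriteLine($"{kv.Key * 6:00}-{kv.Key * 6 + 5:00}: {kv.Value}");
```

With this data, counts has two entries while allPeriods has four, two of them zero.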
You could write a CheckRange function that takes your values and returns a bool, which makes your code more reusable and readable. Because it is written as an extension method (note the this modifier), it has to be declared static inside a static class.
Function Example:
public static bool CheckRange(this int number, int min, int max)
    => number >= min && number <= max;
You can now use this function to check whether PubTime.Hour falls within the desired range.
Implementation Example:
var firstRange = articles.Count(a => a.PubTime.Hour.CheckRange(0, 5));

C# - group list by TimeSpan starting from specific start point

I want to group my list by a time step (hour, day, week, etc.) and compute the sum for each group, but starting from a specific time.
This is my input list:
TIME VALUE
11:30 2
11:50 2
12:00 6
12:30 10
12:50 2
and hour step
var timeStep=new TimeSpan(1,0,0);
and I'm grouping my list with something like this
var myList = list.GroupBy(x =>
{
return x.Time.Ticks / timeStep.Ticks;
})
.Select(g => new { Time = new DateTime(g.Key * timeStep.Ticks), Value = g.Sum(x => x.Value) }).ToList();
It works fine (also for any other step, e.g. daily, weekly) and gives result:
TIME SUM
11:00 4
12:00 18
But now I have to group my list with an hourly step that starts at, e.g., minute 30 of each hour. What can I do to get something like this:
TIME SUM
11:30 10
12:30 12
It is preferable to use a custom DateTime comparer:
internal class DateTimeComparer : IEqualityComparer<DateTime>
{
    public bool Equals(DateTime x, DateTime y)
    {
        // In general, Equals shouldn't be written in terms of GetHashCode, because
        // GetHashCode(x) can equal GetHashCode(y) even if x != y.
        // But with this comparer we have: x == y <=> GetHashCode(x) == GetHashCode(y)
        return GetHashCode(x) == GetHashCode(y);
    }
    public int GetHashCode(DateTime obj)
    {
        return (int)((obj - new TimeSpan(0, 30, 0)).Ticks / new TimeSpan(1, 0, 0).Ticks);
    }
}
with:
var myList = list.GroupBy(x => x.Time, new DateTimeComparer())
.Select(g => new { Time = g.Key, Value = g.Sum(x => x.Value) }).ToList();
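An alternative that avoids a custom comparer is to fold the offset into the question's own tick arithmetic: subtract the offset, floor to a multiple of the step, then add the offset back. This is a sketch rather than the answer's method; the offset variable and the arbitrary date are assumptions for the example, and it is written as C# 9+ top-level statements. One difference worth noting: with a comparer-based GroupBy, g.Key is the time of the first element in each group, whereas here the key is converted back to the start of the bucket.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The sample rows from the question, placed on an arbitrary date.
var day = new DateTime(2020, 1, 1);
var list = new List<(DateTime Time, int Value)>
{
    (day + new TimeSpan(11, 30, 0), 2),
    (day + new TimeSpan(11, 50, 0), 2),
    (day + new TimeSpan(12, 0, 0), 6),
    (day + new TimeSpan(12, 30, 0), 10),
    (day + new TimeSpan(12, 50, 0), 2),
};

var timeStep = new TimeSpan(1, 0, 0);
var offset = new TimeSpan(0, 30, 0); // hypothetical: buckets start at minute 30

// Shift back by the offset, floor to a multiple of the step,
// then shift forward again to recover the bucket's start time.
var grouped = list
    .GroupBy(x => (x.Time.Ticks - offset.Ticks) / timeStep.Ticks)
    .OrderBy(g => g.Key)
    .Select(g => (Time: new DateTime(g.Key * timeStep.Ticks + offset.Ticks),
                  Value: g.Sum(x => x.Value)))
    .ToList();

foreach (var g in grouped)
    Console.WriteLine($"{g.Time:HH:mm} {g.Value}");
```

For the question's data this yields two buckets, starting at 11:30 (sum 10) and 12:30 (sum 12).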

LINQ Grouping by Sum Value

Say I have a class like so:
public class Work
{
public string Name;
public double Time;
public Work(string name, double time)
{
Name = name;
Time = time;
}
}
And I have a List<Work> with about 20 values that are all filled in:
List<Work> workToDo = new List<Work>();
// Populate workToDo
Is there any way to group workToDo into segments where the sum of Time in each segment is a particular value? Say workToDo has values like so:
Name | Time
A | 3.50
B | 2.75
C | 4.25
D | 2.50
E | 5.25
F | 3.75
If I want the sum of times to be 7, each segment or List<Work> should have a bunch of values where the sum of all the Times is 7 or close to it. Is this even remotely possible or is it just a stupid question/idea? I am using this code to separate workToDo into segments of 4:
var query = workToDo.Select(x => x.Time)
.Select((x, i) => new { Index = i, Value = x})
.GroupBy(y => y.Index / 4)
.ToList();
But I am not sure how to do it based on the Times.
Here's a query that segments your data in groups where the times are near to 7, but not over:
Func<List<Work>,int,int,double> sumOfRange = (list, start, end) => list
.Skip(start)
.TakeWhile ((x, index) => index <= end)
.ToList()
.Sum (l => l.Time);
double segmentSize = 7;
var result = Enumerable.Range(0, workToDo.Count ())
.Select (index => workToDo
.Skip(index)
.TakeWhile ((x,i) => sumOfRange(workToDo, index, i)
<= segmentSize));
The output for your example data set is:
A 3.5
B 2.75
total: 6.25
B 2.75
C 4.25
total: 7
C 4.25
D 2.5
total: 6.75
D 2.5
total: 2.5
E 5.25
total: 5.25
F 3.75
total: 3.75
If you want to allow a segment to total over seven, you could increase the segmentSize variable by 25% or so (i.e. make it 8.75).
This solution recurses through all combinations and returns the ones whose sums are close enough to the target sum.
Here is the pretty front-end method that lets you specify the list of work, the target sum, and how close the sums must be:
public List<List<Work>> GetCombinations(List<Work> workList,
                                        double targetSum,
                                        double threshold)
{
    return GetCombinations(0,
                           new List<Work>(),
                           workList,
                           targetSum - threshold,
                           targetSum + threshold);
}
Here is the recursive method that does all of the work:
private List<List<Work>> GetCombinations(double currentSum,
List<Work> currentWorks,
List<Work> remainingWorks,
double minSum,
double maxSum)
{
// Filter out the works that would go over the maxSum.
var newRemainingWorks = remainingWorks.Where(x => currentSum + x.Time <= maxSum)
.ToList();
// Create the possible combinations by adding each newRemainingWork to the
// list of current works.
var sums = newRemainingWorks
.Select(x => new
{
Works = currentWorks.Concat(new [] { x }).ToList(),
Sum = currentSum + x.Time
})
.ToList();
// The initial combinations are the possible combinations that are
// within the sum range.
var combinations = sums.Where(x => x.Sum >= minSum).Select(x => x.Works);
// The additional combinations get determined in the recursive call.
var newCombinations = from index in Enumerable.Range(0, sums.Count)
from combo in GetCombinations
(
sums[index].Sum,
sums[index].Works,
newRemainingWorks.Skip(index + 1).ToList(),
minSum,
maxSum
)
select combo;
return combinations.Concat(newCombinations).ToList();
}
This line will get combinations that sum to 7 +/- 1:
GetCombinations(workToDo, 7, 1);
What you are describing is a packing problem (where the tasks are being packed into 7-hour containers). Whilst it would be possible to use LINQ syntax in a solution to this problem, there is no solution inherent in LINQ that I am aware of.
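To give a feel for the packing view, here is a sketch of first-fit decreasing, a common greedy heuristic for bin packing. It is not taken from any of the answers above and does not guarantee the fewest segments; the tuple representation and the names capacity and segments are assumptions for the example, which is written as C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The sample data from the question as (Name, Time) tuples.
var workToDo = new List<(string Name, double Time)>
{
    ("A", 3.50), ("B", 2.75), ("C", 4.25),
    ("D", 2.50), ("E", 5.25), ("F", 3.75),
};

double capacity = 7; // maximum total Time per segment

// First-fit decreasing: take the longest items first and put each one into
// the first segment that still has room, opening a new segment if none does.
var segments = new List<List<(string Name, double Time)>>();
foreach (var item in workToDo.OrderByDescending(w => w.Time))
{
    var target = segments.FirstOrDefault(
        s => s.Sum(w => w.Time) + item.Time <= capacity);
    if (target == null)
    {
        target = new List<(string Name, double Time)>();
        segments.Add(target);
    }
    target.Add(item);
}

foreach (var s in segments)
{
    var names = string.Join(", ", s.Select(w => w.Name));
    Console.WriteLine($"{names} total: {s.Sum(w => w.Time)}");
}
```

Unlike the overlapping windows in the first answer, every item ends up in exactly one segment, and no segment exceeds the capacity.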

Is this LINQ query with averaging and grouping by hour written efficiently?

This is my first real-world LINQ-to-SQL query. I was wondering if I am making any large, obvious mistakes.
I have a medium-large sized (2M+ records, adding 13k a day) table with data, dataTypeID, machineID, and dateStamp. I'd like to get the average, min, and max of data from all machines and of a specific dataType within a 4 hour period, going back for 28 days.
E.g.
DateTime          Avg   Min   Max
1/1/10 12:00AM    74.2  72.1  75.7
1/1/10 04:00AM    74.5  73.1  76.2
1/1/10 08:00AM    73.7  71.5  74.2
1/1/10 12:00PM    73.2  71.2  76.1
etc.
1/28/10 12:00AM   73.1  71.3  75.5
So far I have only been able to group the averages by 1 hour increments, but I could probably deal with that if the alternatives are overly messy.
Code:
var q =
from d in DataPointTable
where d.dateStamp > DateTime.Now.AddDays(-28) && (d.dataTypeID == (int)dataType + 1)
group d by new {
d.dateStamp.Year,
d.dateStamp.Month,
d.dateStamp.Day,
d.dateStamp.Hour
} into groupedData
orderby groupedData.Key.Year, groupedData.Key.Month, groupedData.Key.Day, groupedData.Key.Hour ascending
select new {
date = Convert.ToDateTime(
groupedData.Key.Year.ToString() + "-" +
groupedData.Key.Month.ToString() + "-" +
groupedData.Key.Day.ToString() + " " +
groupedData.Key.Hour.ToString() + ":00"
),
avg = groupedData.Average(d => d.data),
max = groupedData.Max(d => d.data),
min = groupedData.Min(d => d.data)
};
If you want 4 hour increments divide the hour by 4 (using integer division) and then multiply by 4 when creating the new datetime element. Note you can simply use the constructor that takes year, month, day, hour, minute, and second instead of constructing a string and converting it.
var q =
from d in DataPointTable
where d.dateStamp > DateTime.Now.AddDays(-28) && (d.dataTypeID == (int)dataType + 1)
group d by new {
d.dateStamp.Year,
d.dateStamp.Month,
d.dateStamp.Day,
Hour = d.dateStamp.Hour / 4
} into groupedData
orderby groupedData.Key.Year, groupedData.Key.Month, groupedData.Key.Day, groupedData.Key.Hour ascending
select new {
date = new DateTime(
groupedData.Key.Year,
groupedData.Key.Month,
groupedData.Key.Day,
(groupedData.Key.Hour * 4),
0, 0),
avg = groupedData.Average(d => d.data),
max = groupedData.Max(d => d.data),
min = groupedData.Min(d => d.data)
};
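The divide-then-multiply grouping can be checked in memory before running it against the database. The sketch below uses LINQ to Objects with made-up readings standing in for DataPointTable rows; readings and summary are illustrative names, and it says nothing about the SQL that LINQ to SQL would generate. It is written as C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical readings standing in for rows of DataPointTable.
var readings = new List<(DateTime Stamp, double Data)>
{
    (new DateTime(2010, 1, 1, 0, 30, 0), 72.0),
    (new DateTime(2010, 1, 1, 3, 59, 0), 76.0),
    (new DateTime(2010, 1, 1, 4, 0, 0), 73.0),
    (new DateTime(2010, 1, 1, 7, 15, 0), 75.0),
};

// Hour / 4 is 0 for hours 0-3, 1 for hours 4-7, and so on up to 5 for
// hours 20-23; multiplying by 4 recovers the first hour of each block.
var summary = readings
    .GroupBy(r => r.Stamp.Date.AddHours(r.Stamp.Hour / 4 * 4))
    .OrderBy(g => g.Key)
    .Select(g => (Start: g.Key,
                  Avg: g.Average(r => r.Data),
                  Min: g.Min(r => r.Data),
                  Max: g.Max(r => r.Data)))
    .ToList();

foreach (var s in summary)
    Console.WriteLine($"{s.Start:HH:mm} avg={s.Avg} min={s.Min} max={s.Max}");
```

The 03:59 reading stays in the block starting at 00:00, while the 04:00 reading opens the next block.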
To improve efficiency you might want to consider adding an index on the dateStamp column. Given that you're only selecting a potentially small range of the dates, using an index should be a significant advantage. I would expect the query plan to do an index seek for the first date, making it even faster.
