Sum up every x values in a row - c#

I have a column calendar week and a column amount. Now I want to sum up the amount for every 4 calendar weeks starting from the first calendar week in April. E.g. if I have 52 (rows) calendar weeks in my initial table I would have 13 (rows) weeks in my final table (I talk about table here since I will try to bind the outcome later to a DGV).
I am using LINQ to DataSet and have looked at several sources for a hint on how to solve this, but group by and aggregate didn't help me. Maybe there are applications of them that I don't understand, so I hope one of you LINQ experts can help me.
Usually I post code but I can just give you the backbone since I have no starting point.
_dataset = New Dataset1
_adapter.Fill(_dataset.Table1)
dim query = from dt in _dataset.Table1.AsEnumerable()

Divide the week by four (using integer division to truncate the result) to generate a value that you can group on.
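As a rough sketch of that idea, the snippet below groups in-memory rows by (week - firstWeek) / 4. The WeekBuckets class, the tuple shape, and the default firstWeek of 14 (a stand-in for "first calendar week in April") are assumptions for illustration and are not taken from the question. Weeks before firstWeek are simply dropped.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WeekBuckets
{
    // Sums amounts per block of four calendar weeks, counting from firstWeek.
    // firstWeek = 14 is only an assumed value for the first week of April.
    public static List<decimal> SumPerFourWeeks(
        IEnumerable<(int Week, decimal Amount)> rows, int firstWeek = 14)
    {
        return rows
            .Where(r => r.Week >= firstWeek)            // ignore weeks before the start
            .GroupBy(r => (r.Week - firstWeek) / 4)     // integer division truncates
            .OrderBy(g => g.Key)
            .Select(g => g.Sum(r => r.Amount))
            .ToList();
    }
}
```

If the weeks before April should wrap around to the end of the sequence, the grouping key would need to account for that; this sketch does not.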

You can group by anything you like. You could, theoretically, use a simple incrementing number for that:
var groups = dt
.Select((row, i) => Tuple.Create(row, i / 4))
.GroupBy(t => t.Item2);
(c# notation)
Then you can calculate the sum for each group:
var sums = groups
.Select(g => g.Sum(t => t.Item1.Amount));
You mention you want to start at a certain month, e.g. April. You can skip rows by using Skip or Where:
dt.Skip(12).Select(...)
The index i always starts at 0, so the first group is guaranteed to contain 4 weeks. However, to know exactly how many weeks to skip or where to start, you need more calendar information. I presume your rows have a field holding the corresponding week's start date:
dt.Where(row => row.StartDate >= firstOfApril).Select(...)

First convert the data into a lookup keyed by the four-week period (the calendar week divided by four, rounded up), with the amount as the value. Note that your columns need to be called CalendarWeek and Amount. Then convert to a dictionary and sum the values per key:
var grouped = dt.AsEnumerable().ToLookup(o => Math.Ceiling((double) o.Field<int>("CalendarWeek")/4), o => o.Field<double>("Amount"));
var results = grouped.ToDictionary(result => result.Key, result => result.Sum());
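For anyone who wants to try this without a database, here is a small self-contained variant. The LookupDemo class and its sample table are made up for illustration. It also swaps Math.Ceiling for the integer expression (week + 3) / 4, which gives the same period number for positive weeks but keeps the dictionary keys as int rather than double.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class LookupDemo
{
    // Builds a tiny table with the two column names the answer expects.
    public static DataTable BuildSample(int weeks, double amountPerWeek)
    {
        var dt = new DataTable();
        dt.Columns.Add("CalendarWeek", typeof(int));
        dt.Columns.Add("Amount", typeof(double));
        for (int week = 1; week <= weeks; week++)
            dt.Rows.Add(week, amountPerWeek);
        return dt;
    }

    // Weeks 1-4 map to period 1, weeks 5-8 to period 2, and so on.
    public static Dictionary<int, double> SumByPeriod(DataTable dt)
    {
        return dt.AsEnumerable()
            .ToLookup(r => (r.Field<int>("CalendarWeek") + 3) / 4,
                      r => r.Field<double>("Amount"))
            .ToDictionary(g => g.Key, g => g.Sum());
    }
}
```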

Related

Sum of Values in SortedList and Condition on Keys

I want to sum up certain values (time range start till end) of a SortedList<DateTime, double> with linq. The Keys contain dates of workingdays and the Values contain the number of possible workhours for the given day. The question I want to answer is, how many hours are possible in a given timeframe.
I managed to get a count of the keys, but I'm stuck now at the sum.
The code to count the keys (Thanks to stackoverflow) looks like this:
double ats = (from n in DaysAndHours.Keys
where n >= start
where n <= end
select n).Count();
How do I have to change it, to fill ats with values in the date range?
Thanks!
Assuming you want to add up the values after applying the condition, use the following code and modify it as required:
double result = DaysAndHours.Where(n => n.Key >= start)
.Where(n => n.Key <= end)
.Sum(n => n.Value);
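A self-contained version of the same query, wrapped in a helper so it can be tried out directly. The WorkHours class and the HoursBetween name are hypothetical; both bounds are treated as inclusive, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WorkHours
{
    // Sums the hours of all days whose key lies in [start, end], bounds included.
    public static double HoursBetween(
        SortedList<DateTime, double> daysAndHours, DateTime start, DateTime end)
    {
        return daysAndHours
            .Where(kv => kv.Key >= start && kv.Key <= end)
            .Sum(kv => kv.Value);
    }
}
```

Enumerating a SortedList yields KeyValuePair items, which is why the filter reads kv.Key and the sum reads kv.Value.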

How to calculate days between status on datatable

I have a datatable with steps history for a request.
Here is an example of the data:
I need to put into the days column how many days have passed between each status.
Of course if there is only one status then the days should be the days elapsed between the transaction day and today's date.
Any clue?
This may help you to solve the issue:
Iterate through the collection and subtract the date column value of the previous row from that of the current row. The TotalDays property of the resulting TimeSpan gives you a double value indicating the difference in days. This can be implemented as follows:
var temTable = myDatatable.AsEnumerable().Select(x => x).ToList();
for (int i = 0; i < temTable.Count; i++)
{
double days = 0;
if (i != 0)
days = ((DateTime)temTable[i]["date"] - (DateTime)temTable[i - 1]["date"]).TotalDays;
temTable[i]["days"] = days;
}
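The loop above leaves the first row at 0 and does not cover the "only one status" case from the question. A possible variant, under the assumption that each status lasts until the next one begins and the latest status lasts until today, could look like this. The StatusDurations class is hypothetical, it works on a plain list of dates already sorted ascending, and today is passed in so the result is reproducible.

```csharp
using System;
using System.Collections.Generic;

public static class StatusDurations
{
    // For each status date, returns the whole days until the next status,
    // or until 'today' for the last (or only) status.
    // Assumes statusDates is sorted in ascending order.
    public static List<int> DaysInEachStatus(
        IReadOnlyList<DateTime> statusDates, DateTime today)
    {
        var result = new List<int>(statusDates.Count);
        for (int i = 0; i < statusDates.Count; i++)
        {
            DateTime until = i + 1 < statusDates.Count ? statusDates[i + 1] : today;
            // .Date drops the time of day so only calendar days are counted
            result.Add((until.Date - statusDates[i].Date).Days);
        }
        return result;
    }
}
```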
SELECT r.dDate,r.dHour,r.dStatus,DATEDIFF(
DAY,LAG(r.dDate,1,GETDATE())
OVER (ORDER BY r.dDate),r.dDate) AS DAYS
FROM tbData r;
I believe the above query will produce the desired output in SQL Server.
However, updating the table with a new column "days" requires some additional code. I'm still working on it; if you find the solution, please post it.
I didn't take the time of day into consideration; you can add that too.
Using a single field of type datetime (replacing date and hour) would reduce the risk of errors.
WITH newTable AS (
SELECT dStatus,ABS(DATEDIFF(
DAY,LAG(dDate,1,GETDATE())
OVER (ORDER BY dDate),dDate)) AS dDays FROM tbData
)
UPDATE tbData SET dDays=nt.dDays
FROM newTable nt
WHERE nt.dStatus = tbData.dStatus;
I believe this will resolve your problem.
The hour field is not considered.

Processing data in DataTable - how to find minimum and maximum for each column?

I have DataTable object that is filled with numeric data from my database. I have to handle all possible numeric-like data types (int32, int64, float, even datetime, but that last is not necessary).
I have to normalize all data in columns, each column separately, so I need to find maximum and minimum values for each column to calculate normalization coefficient.
Do I have to iterate "manually" through all rows to find these max and min values?
Some background:
I don't want to do this in SQL, because it's a kind of scientific application where the user works in SQL and writes very complicated SQL queries. I don't want to force the user to write even more complicated queries to get normalized data or min/max values.
Columns are fully dynamic; they depend on the SQL query written by the user.
Do I have to iterate "manually" through all rows to find these max and min values?
Define manually. Yes, you have to calculate the Min and Max values by enumerating all DataRows. But that can be done either with the old DataTable.Compute method or with Linq:
int minVal = table.AsEnumerable().Min(r => r.Field<int>("ColName"));
int maxVal = table.AsEnumerable().Max(r => r.Field<int>("ColName"));
DataTable.Compute:
int maxVal = (int)table.Compute("Max(ColName)", "");
Try this:
var min = myTable.AsEnumerable().Min(x => (int)x["column"]);
var max = myTable.AsEnumerable().Max(x => (int)x["column"]);
You'll need to make sure you have a reference to System.Data.DataSetExtensions, which is not added to new projects by default.
You can also use
Convert.ToInt32(datatable.Compute("min(columnname)", string.Empty));
Convert.ToInt32(datatable.Compute("max(columnname)", string.Empty));
You can do it by using DataTable.Select():
DataRow[] dr = dataTable.Select("ID = MAX(ID)");
// Select never returns null; it returns an empty array when nothing matches
if (dr.Length > 0)
{
// Console.WriteLine(dr[0]["ID"]);
int maxVal = Convert.ToInt32(dr[0]["ID"]);
}
This worked fine for me
int max = Convert.ToInt32(datatable_name.AsEnumerable()
.Max(row => row["column_Name"]));
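Since the question says the columns are fully dynamic, the snippets above would have to be repeated per column with the right type. One way to sketch that, assuming it is acceptable to convert every numeric value to double, is to loop over table.Columns. The ColumnRanges class and its IsNumeric helper are hypothetical, and the list of numeric types is not exhaustive.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ColumnRanges
{
    // Returns (Min, Max) for every numeric column, keyed by column name.
    // Null cells are skipped; columns with no non-null values are left out.
    public static Dictionary<string, (double Min, double Max)> MinMaxPerColumn(DataTable table)
    {
        var result = new Dictionary<string, (double Min, double Max)>();
        foreach (DataColumn col in table.Columns)
        {
            if (!IsNumeric(col.DataType)) continue;

            var values = table.AsEnumerable()
                .Where(r => !r.IsNull(col))
                .Select(r => Convert.ToDouble(r[col]))
                .ToList();

            if (values.Count == 0) continue;
            result[col.ColumnName] = (values.Min(), values.Max());
        }
        return result;
    }

    // Assumed set of "numeric-like" types; extend as needed.
    private static bool IsNumeric(Type t) =>
        t == typeof(byte) || t == typeof(short) || t == typeof(int) || t == typeof(long) ||
        t == typeof(float) || t == typeof(double) || t == typeof(decimal);
}
```

Converting to double loses precision for very large long or decimal values, which may or may not matter for a normalization coefficient.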

Comparing Data grid view column and columns

I'm stuck on one part and have no idea how to solve it. Basically, I have one table, "Shifthours", and another one, "employeeshift". The Shifthours table has shift_Start and shift_Stop; the employeeshift table has StartTime and EndTime. I have linked these two tables with a foreign key. I want to compare shift_Start with StartTime and shift_Stop with EndTime to see which shift the employee fits, and then show that shift's shift_Start and shift_Stop in the row of the employee who is eligible for it.
Currently I have code that only joins the two tables but does not compare the timings.
private void LoadAllEmpShift()
{
using (testEntities Setupctx = new testEntities())
{
BindingSource BS = new BindingSource();
var Viewemp = from ES in Setupctx.employeeshifts
join shifthour sh in Setupctx.shifthours on ES.ShiftHourID equals sh.idShiftHours
select new
{
ES.EmployeeShiftID,
ShiftHour_Start = sh.shiftTiming_start,
ShiftHour_Stop = sh.shiftTiming_stop,
ES.EmployeeName,
ES.StartTime,
ES.EndTime,
ES.Date
};
BS.DataSource = Viewemp;
dgvShift.DataSource = BS;
}
}
Anyone knows how to do this?
Edit:
You said you were trying to find where the employee hours match with a set of shift times. It would be nice to have some sample data and the algorithm that you want to use to determine what is a good shift time match.
I have assumed here that the best way to do that is to base the employee's start time off the nearest shift start time.
In the following code, I use the let clause to look through the shift hours and find the set of shift hours nearest to the employee's start time.
var Viewemp = from ES in Setupctx.employeeshifts
join sh in Setupctx.shifthours on ES.ShiftHourID equals sh.idShiftHours
into shifts
let diff = shifts
.OrderBy (s =>
// this is the line that needs attention:
System.Math.Abs((int)(ES.StartTime - s.shiftTiming_start))
)
.First ()
select new
{
ES.EmployeeShiftID,
ShiftHour_Start = diff.shiftTiming_start,
ShiftHour_Stop = diff.shiftTiming_stop,
ES.EmployeeName,
ES.StartTime,
ES.EndTime,
ES.Date
};
Update
My type in database for the StartTime and EndTime is string instead of time
In the above code the important logic is finding the absolute difference between ES.StartTime and s.shiftTiming_start; the smallest difference indicates the best match for the shift hours. Unfortunately, your database stores this data as strings, and you need to compare them as numbers.
LINQ to Entities does not offer an easy way to convert a string to an int.
I think your next step would be to look into how you can convert those string values to int values. Take a look at this question as I think it might help you out:
Convert String to Int in EF 4.0
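If pulling the rows into memory first is acceptable, another option is to do the comparison with LINQ to Objects, where parsing strings is not a problem. The sketch below assumes the times are stored as "HH:mm" strings, which the question does not confirm, and the ShiftMatcher class and tuple shapes are hypothetical. It does not handle shifts that wrap around midnight.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ShiftMatcher
{
    // Assumed storage format: two-digit hours and minutes, e.g. "07:45".
    private static TimeSpan Parse(string hhmm) =>
        TimeSpan.ParseExact(hhmm, @"hh\:mm", CultureInfo.InvariantCulture);

    // Picks the shift whose start time is closest to the employee's start time.
    public static (string Start, string Stop) NearestShift(
        string employeeStart, IEnumerable<(string Start, string Stop)> shifts)
    {
        TimeSpan emp = Parse(employeeStart);
        return shifts
            .OrderBy(s => (Parse(s.Start) - emp).Duration())  // Duration() is the absolute value
            .First();
    }
}
```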

Finding similar records in LINQ

I have the following LINQ query, which will be used to find any consignments that are 'similar':
from c in cons
group c by new { c.TripDate.Value, c.DeliveryPostcode, c.DeliveryName } into cg
let min = cg.Min(a => a.DeliverFrom)
let max = cg.Max(a => a.DeliverFrom)
let span = max - min
where span.TotalMinutes <= 59
select cg;
The main thing is the min, max and span. Basically, any consignments that are in the 'group', that have a DeliverFrom datetime within 59 minutes of any other one in the group, will be returned in the group.
The code above looked good to me originally, but on closer inspection it fails when there are more than 2 records in the group: if 2 have DeliverFrom dates within 59 minutes of each other and one has a DeliverFrom date not within 59 minutes of any other, the query does not return that group, because it selects the min and the max and sees that the difference is more than 59 minutes. What I want is for it to see that there are 2 consignments in the group with DeliverFrom dates close enough, and select a group containing just those two.
How would I go about doing this?
EDIT: Doh, another clause has been added in this. There's a field called 'Weight' and one called 'Spaces', each group can have a max of 26 Weight and 26 Spaces
If I'm not mistaken, what you are looking for is a statistical problem called cluster identification, and if so it is a far more complex problem than you might think.
As a thought exercise, imagine if you had 3 entries, at 1:00, 1:30, and 2:00. How would you want to group these? Either the first two or the last two would work as a group (less than 59 minutes apart), but all 3 would not.
If you just want to keep chaining items together into a group as long as they are within 59 minutes of any other item in the group, you'd need to keep iterating until you stop finding new items to add to any cluster.
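For the chaining interpretation, timestamps are one-dimensional, so sorting them first should let a single pass do the job: a new cluster starts whenever the gap to the previous item exceeds the limit. This is a sketch on bare DateTime values, not on the consignment type from the question, and the TimeClusters class is hypothetical. It ignores the Weight and Spaces limits entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimeClusters
{
    // Chains sorted times into clusters; an item joins the current cluster
    // when it is within maxGap of the previous item, otherwise it starts a new one.
    public static List<List<DateTime>> Chain(IEnumerable<DateTime> times, TimeSpan maxGap)
    {
        var clusters = new List<List<DateTime>>();
        DateTime? previous = null;
        foreach (DateTime t in times.OrderBy(x => x))
        {
            if (previous == null || t - previous.Value > maxGap)
                clusters.Add(new List<DateTime>());
            clusters[clusters.Count - 1].Add(t);
            previous = t;
        }
        return clusters;
    }
}
```

With the 1:00, 1:30, 2:00 example and a 59 minute limit, all three end up in one cluster even though the first and last are an hour apart, which is exactly the trade-off of chaining described above.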
I'd group the consignments with the same logic as you do, but use the overload of GroupBy that takes a resultSelector instead, which lets me project each group of consignments into another type. Here that type would be an enumerable sequence of groups of consignments, where each element represents consignments that not only were in the same group to begin with, but should also all be delivered within the span of an hour. So the signature of resultSelector would be
Func<anontype, IEnumerable<Consignment>, IEnumerable<IEnumerable<Consignment>>>
At this point it becomes clear that it would probably be a good idea to define a type for the grouping key so that you can get rid of the anonymous type in the above signature; otherwise you'd be forced to define your resultSelector as a lambda.
Within resultSelector, you need to first of all sort the incoming group of consignments by DeliverFrom and then return sub-groups based on that time. So it might look like this:
IEnumerable<IEnumerable<Consignment>>
Partitioner(ConsignmentGroupKey key, IEnumerable<Consignment> cg)
{
cg = cg.OrderBy(c => c.DeliverFrom);
var startTime = cg.First().DeliverFrom;
var subgroup = new List<Consignment>();
foreach(var cons in cg) {
if ((cons.DeliverFrom - startTime).TotalMinutes < 60) {
subgroup.Add(cons);
}
else {
yield return subgroup;
startTime = cons.DeliverFrom;
subgroup = new List<Consignment>() { cons };
}
}
if (subgroup.Count > 0) {
yield return subgroup;
}
}
I haven't tried this, but as far as I can tell it should work.
