Fill the first missing null elements after shifting and rolling window - c#

I'm recreating a strategy originally written in Python with pandas. I think my code works, even though I haven't compared the values yet, because I'm getting an exception. Basically, the problem is that .Shift(20) removes the first 20 elements and .Window(12 * 60 / 15) removes another 47. The typical-price series has 10180 elements by default; after the shift and rolling window it has 10113. I tried using .FillMissing(), but it doesn't seem to prepend the missing values to the front of the series.
def populate_indicators(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
    if not {'buy', 'sell'}.issubset(dataframe.columns):
        dataframe.loc[:, 'buy'] = 0
        dataframe.loc[:, 'sell'] = 0
    dataframe['typical'] = qtpylib.typical_price(dataframe)
    dataframe['typical_sma'] = qtpylib.sma(dataframe['typical'], window=10)
    min = dataframe['typical'].shift(20).rolling(int(12 * 60 / 15)).min()
    max = dataframe['typical'].shift(20).rolling(int(12 * 60 / 15)).max()
    dataframe['daily_mean'] = (max + min) / 2
    return dataframe
My code (C#)
public override List<TradeAdvice> Prepare(List<OHLCV> candles)
{
    var result = new List<TradeAdvice>();
    var typicalPrice = candles.TypPrice().Select(e => e ?? 0).ToList();
    var typicalSma = typicalPrice.Sma(10);
    var series = typicalPrice.ToOrdinalSeries();
    var min = series.Shift(20).Window(12 * 60 / 15).Select(kvp => kvp.Value.Min()).FillMissing(); // 10113 elements / 10180 expected
    var max = series.Shift(20).Window(12 * 60 / 15).Select(kvp => kvp.Value.Max()).FillMissing(); // 10113 elements / 10180 expected
    var dailyMean = (max + min) / 2;
    var asd = dailyMean.SelectValues(e => Convert.ToDecimal(e)).Values.ToList();
    var crossedBelow = asd.CrossedBelow(typicalPrice);
    var crossedAbove = asd.CrossedAbove(typicalPrice);
    for (int i = 0; i < candles.Count; i++)
    {
        if (i < StartupCandleCount - 1)
            result.Add(TradeAdvice.WarmupData);
        else if (crossedBelow[i]) // crossedBelow has 10113 elements instead of 10180...
            result.Add(TradeAdvice.Buy);
        else if (crossedAbove[i]) // crossedAbove has 10113 elements instead of 10180...
            result.Add(TradeAdvice.Sell);
        else
            result.Add(TradeAdvice.NoAction);
    }
    return result;
}
public class OHLCV
{
    public DateTime Timestamp { get; set; }
    public decimal Open { get; set; }
    public decimal High { get; set; }
    public decimal Low { get; set; }
    public decimal Close { get; set; }
    public decimal Volume { get; set; }
}

If you have an ordinal series created with ToOrdinalSeries, the index of the series is an automatically generated numerical value from 0 to the length of your series - 1. However, this is still a real index, and Deedle keeps the mapping when you use operations like Shift.
If your index was a date, say 01/01 => a, 02/01 => b, 03/01 => c, then Shift would shift the values and drop the keys that are no longer needed, i.e. you may get 02/01 => a, 03/01 => b.
It works the same with ordinal indices, so if you have 0 => a, 1 => b, 2 => c and shift the data, you will get something like 1 => a, 2 => b.
If you then want to get 0 => <default>, 1 => a, 2 => b, you can do this using Realign, which takes the new list of keys that you want, followed by FillMissing. For example:
var ts = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }.ToOrdinalSeries();
var mins = ts.Shift(2).Window(2).Select(kvp => kvp.Value.Min());
var realigned = mins.Realign(Enumerable.Range(0, 10)).FillMissing(-1);
ts.Print(); // Starts from key '0'
mins.Print(); // Starts from key '3' because of Shift & Window
realigned.Print(); // Starts from key '0' with three -1 values at the start
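Applied back to the code in the question, the same pattern would replace the min/max lines. This is a sketch assuming the Deedle Series API as used above; the fill value of 0 is just an illustration, pick whatever suits the strategy:

var keys = Enumerable.Range(0, series.KeyCount);
var min = series.Shift(20).Window(12 * 60 / 15)
                .Select(kvp => kvp.Value.Min())
                .Realign(keys)
                .FillMissing(0.0); // 10180 elements again, aligned with the candles
var max = series.Shift(20).Window(12 * 60 / 15)
                .Select(kvp => kvp.Value.Max())
                .Realign(keys)
                .FillMissing(0.0);
var dailyMean = (max + min) / 2; // now safe to index with i in the candle loop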

Related

How can I use C#/LINQ to calculate weighted averages

This is to process stock data; the data is in this format:
public class A
{
    public int Price;
    public int Available;
}
let's take this data for example:
var items = new List<A>
{
    new A { Price = 10, Available = 1000 },
    new A { Price = 15, Available = 500 },
    new A { Price = 20, Available = 2000 },
};
My query returns the average price for a requested volume; for example:
if I have a requested volume of 100, my average price is 10
if I have a requested volume of 1200, I take the first 1000 at a price of 10, then the next 200 at a price of 15
etc
I have implemented that in C#, but I am trying to find if this could be done with LINQ directly with the database iterator.
I get data that is already sorted by price, but I don't see how to solve this without iteration.
Edit:
this is the code:
public static double PriceAtVolume(IEnumerable<A> Data, long Volume)
{
    var PriceSum = 0.0;
    var VolumeSum = 0L;
    foreach (var D in Data)
    {
        if (D.Available < Volume)
        {
            PriceSum += D.Price * D.Available;
            VolumeSum += D.Available;
            Volume -= D.Available;
        }
        else
        {
            PriceSum += D.Price * Volume;
            VolumeSum += Volume;
            Volume = 0;
        }
        if (Volume == 0) break;
    }
    return PriceSum / VolumeSum;
}
and the test code:
var a = new List<A>
{
    new A { Price = 10, Available = 1000 },
    new A { Price = 15, Available = 500 },
    new A { Price = 20, Available = 2000 }
};
var P0 = PriceAtVolume(a, 100);
var P1 = PriceAtVolume(a, 1200);
Clarification:
Above I said I'd like to move it to LINQ to use the database iterator, so I'd like to avoid scanning the entire data and stop iterating when the answer is calculated. The data is already sorted by price in the database.
This is probably the most LINQy you can get. It uses the Aggregate method, specifically the most complex of its three overloads, the one that accepts three arguments. The first argument is the seed, initialized here with a zeroed ValueTuple<long, decimal>. The second argument is the accumulator function, with the logic to combine the seed and the current element into a new seed. The third argument takes the final accumulated values and projects them to the desired average.
public static decimal PriceAtVolume(IEnumerable<A> data, long requestedVolume)
{
    return data.Aggregate(
        (Volume: 0L, Price: 0M), // Seed
        (sum, item) => // Accumulator function
        {
            if (sum.Volume == requestedVolume)
                return sum; // Goal reached, quick return
            if (item.Available < requestedVolume - sum.Volume)
                return // Consume all of it
                (
                    sum.Volume + item.Available,
                    sum.Price + item.Price * item.Available
                );
            return // Consume part of it (and we are done)
            (
                requestedVolume,
                sum.Price + item.Price * (requestedVolume - sum.Volume)
            );
        },
        sum => sum.Volume == 0M ? 0M : sum.Price / sum.Volume // Result selector
    );
}
Update: I changed the return type from double to decimal, because a decimal is the preferred type for currency values.
Btw, in case this function is called very often with the same data and the list of data is huge, it could be optimized by storing the accumulated summaries in a List<(long, decimal)> and applying BinarySearch to quickly find the desired entry. It becomes complex though, and I don't expect the prerequisites for this optimization to come up very often.
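A rough sketch of that optimization (my own illustration, not part of the original answer; it assumes the data is already sorted by price, as the question states):

public class PriceAtVolumeIndex
{
    private readonly List<long> _cumVolume = new List<long>();
    private readonly List<decimal> _cumPrice = new List<decimal>();

    public PriceAtVolumeIndex(IEnumerable<A> data) // data sorted by price
    {
        long volume = 0;
        decimal price = 0M;
        foreach (var item in data)
        {
            volume += item.Available;
            price += (decimal)item.Price * item.Available;
            _cumVolume.Add(volume);
            _cumPrice.Add(price);
        }
    }

    public decimal PriceAtVolume(long requestedVolume)
    {
        if (requestedVolume <= 0 || _cumVolume.Count == 0) return 0M;
        // find the first entry whose cumulative volume covers the request
        int i = _cumVolume.BinarySearch(requestedVolume);
        if (i < 0) i = ~i;              // not found: complement is the insertion point
        if (i == _cumVolume.Count) i--; // request exceeds total volume: consume everything
        long prevVolume = i == 0 ? 0L : _cumVolume[i - 1];
        decimal prevPrice = i == 0 ? 0M : _cumPrice[i - 1];
        long usedVolume = Math.Min(requestedVolume, _cumVolume[i]);
        // unit price of the partially consumed entry
        decimal unitPrice = (_cumPrice[i] - prevPrice) / (_cumVolume[i] - prevVolume);
        return (prevPrice + unitPrice * (usedVolume - prevVolume)) / usedVolume;
    }
}

Building the index is O(n) once; each query is then O(log n) instead of a linear scan.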
This is working as well (although not a one-liner):
private static decimal CalculateWeightedAverage(List<A> amountsAndPrices, int requestedVolume)
{
    int originalRequestedVolume = requestedVolume;
    return (decimal)amountsAndPrices.Sum(amountAndPrice =>
    {
        int partialResult = Math.Min(amountAndPrice.Available, requestedVolume) * amountAndPrice.Price;
        requestedVolume = Math.Max(requestedVolume - amountAndPrice.Available, 0);
        return partialResult;
    }) / originalRequestedVolume;
}
Take the sum of price * available as long as the requested volume is greater than 0, subtracting each item's available amount from the requested volume in every "sum iteration". Finally, divide by the original requested volume.
You could do something to generate the items' prices as a sequence, e.g.:
public class A
{
    public int Price;
    public int Available;
    public IEnumerable<int> Inv => Enumerable.Repeat(Price, Available);
}
var avg1 = items.SelectMany(i => i.Inv).Take(100).Average(); // 10
var avg2 = items.SelectMany(i => i.Inv).Take(1200).Average(); // 10.8333333333333
I think the best you can do with LINQ is minimize the running total computation done on the server and compute most of it on the client, but minimize the amount downloaded from the server.
I assume the items are already projected down to the two minimum columns (Price, Availability). If not, a Select can be added before pulling the data from the database into orderedItems.
// find price of last item needed; worst case there won't be one
var lastPriceItem = items
    .Select(i => new { i.Price, RT = items.Where(it => it.Price <= i.Price).Sum(it => it.Available) })
    .FirstOrDefault(irt => irt.RT > origReqVol);
// bring over items below that price
var orderedItems = items.OrderBy(i => i.Price).Where(i => i.Price <= lastPriceItem.Price).ToList();
// compute running total on client
var rtItems = orderedItems.Select(i => new
{
    Item = i,
    RT = orderedItems.Where(i2 => i2.Price <= i.Price).Sum(i2 => i2.Available)
});
// compute average price
var reqVol = origReqVol;
var ans = rtItems
    .Select(irt => new { Price = irt.Item.Price, Quantity = Math.Min((reqVol -= irt.Item.Available) + irt.Item.Available, irt.Item.Available) })
    .Sum(pq => pq.Price * pq.Quantity) / (double)origReqVol;

C#: read text file and process it

I need a C# program which writes out:
how many Eric Clapton songs were played on the radios;
whether there is any Eric Clapton song that all 3 radios played;
how much time they broadcast Eric Clapton songs in total.
The first column contains the radio identifier (1, 2 or 3).
The second column is the song playtime in minutes.
The third column is the song playtime in seconds.
The last two fields are performer : song.
So the file looks like this:
1 5 3 Deep Purple:Bad Attitude
2 3 36 Eric Clapton:Terraplane Blues
3 2 46 Eric Clapton:Crazy Country Hop
3 3 25 Omega:Ablakok
2 4 23 Eric Clapton:Catch Me If You Can
1 3 27 Eric Clapton:Willie And The Hand Jive
3 4 33 Omega:A szamuzott
.................
...and about 670 more lines.
So far I have this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;

namespace radiplaytime
{
    public struct Adat
    {
        public int rad;
        public int min;
        public int sec;

        public Adat(string a, string b, string c)
        {
            rad = Convert.ToInt32(a);
            min = Convert.ToInt32(b);
            sec = Convert.ToInt32(c);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            String[] lines = File.ReadAllLines(@"...\zenek.txt");
            //var adatlista = from adat in lines
            List<Adat> adatlista = (from adat in lines
                                    select new Adat(adat.Split(' ')[0],
                                                    adat.Split(' ')[1],
                                                    adat.Split(' ')[2])).ToList<Adat>();
            var timesum = (from adat in adatlista
                           group adat by adat.rad into ertekek
                           select new
                           {
                               rad = ertekek.Key,
                               hour = (ertekek.Sum(adat => adat.min) +
                                       ertekek.Sum(adat => adat.sec) / 60) / 60,
                               min = (ertekek.Sum(adat => adat.min) +
                                      ertekek.Sum(adat => adat.sec) / 60) % 60,
                               sec = ertekek.Sum(adat => adat.sec) % 60,
                           }).ToArray();
            for (int i = 0; i < timesum.Length; i++)
            {
                Console.WriteLine("{0}. radio: {1}:{2}:{3} playtime",
                    timesum[i].rad,
                    timesum[i].hour,
                    timesum[i].min,
                    timesum[i].sec);
            }
            Console.ReadKey();
        }
    }
}
You can define a custom class to store the values of each line. You will need to use Regex to split each line and populate your custom class. Then you can use LINQ to get the information you need.
public class Plays
{
    public int RadioID { get; set; }
    public int PlayTimeMinutes { get; set; }
    public int PlayTimeSeconds { get; set; }
    public string Performer { get; set; }
    public string Song { get; set; }
}
So you then read your file and populate the custom Plays:
String[] lines = File.ReadAllLines(@"songs.txt");
List<Plays> plays = new List<Plays>();
foreach (string line in lines)
{
    var matches = Regex.Match(line, @"^(\d+)\s(\d+)\s(\d+)\s(.+)\:(.+)$"); //this will split your line into groups
    if (matches.Success)
    {
        Plays play = new Plays();
        play.RadioID = int.Parse(matches.Groups[1].Value);
        play.PlayTimeMinutes = int.Parse(matches.Groups[2].Value);
        play.PlayTimeSeconds = int.Parse(matches.Groups[3].Value);
        play.Performer = matches.Groups[4].Value;
        play.Song = matches.Groups[5].Value;
        plays.Add(play);
    }
}
Now that you have your list of songs, you can use LINQ to get what you need:
//Get Total Eric Clapton songs played - assuming distinct songs
var ericClaptonSongsPlayed = plays.Where(x => x.Performer == "Eric Clapton").GroupBy(y => y.Song).Count();
//get eric clapton songs played on all radio stations
var radioStations = plays.Select(x => x.RadioID).Distinct();
var commonEricClaptonSong = plays.Where(x => x.Performer == "Eric Clapton").GroupBy(y => y.Song).Where(z => z.Count() == radioStations.Count());
etc.
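For the third requirement (total broadcast time), a sketch along the same lines (my own addition, using the plays list built above):

//total Eric Clapton broadcast time across all radios
var totalSeconds = plays.Where(x => x.Performer == "Eric Clapton")
                        .Sum(x => x.PlayTimeMinutes * 60 + x.PlayTimeSeconds);
var total = TimeSpan.FromSeconds(totalSeconds);
Console.WriteLine("{0}:{1:00}:{2:00} of Eric Clapton in total",
    (int)total.TotalHours, total.Minutes, total.Seconds);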
String splitting works only if the text is really simple and doesn't have to deal with fixed-length fields. It also generates a lot of temporary strings, which can cause your program to consume many times the size of the original file in RAM and harm performance due to the constant allocations and garbage collection.
Riv's answer shows how to use a Regex to parse this file. It can be improved in several ways though:
var pattern = @"^(\d+)\s(\d+)\s(\d+)\s(.+)\:(.+)$";
var regex = new Regex(pattern);
var plays = from line in File.ReadLines(filePath)
            let matches = regex.Match(line)
            select new Plays
            {
                RadioID = int.Parse(matches.Groups[1].Value),
                PlayTimeMinutes = int.Parse(matches.Groups[2].Value),
                PlayTimeSeconds = int.Parse(matches.Groups[3].Value),
                Performer = matches.Groups[4].Value,
                Song = matches.Groups[5].Value
            };
ReadLines returns an IEnumerable<string> instead of returning all of the lines in a buffer. This means that parsing can start immediately.
By using a single regular expression, we don't have to rebuild the regex for each line.
No list is needed. The query returns an IEnumerable to which other LINQ operations can be applied directly.
For example :
var durations = plays.GroupBy(p => p.RadioID)
                     .Select(grp => new
                     {
                         RadioID = grp.Key,
                         Hours = grp.Sum(p => p.PlayTimeMinutes + p.PlayTimeSeconds / 60) / 60,
                         Mins = grp.Sum(p => p.PlayTimeMinutes + p.PlayTimeSeconds / 60) % 60,
                         Secs = grp.Sum(p => p.PlayTimeSeconds) % 60
                     });
A further improvement could be to give names to the groups:
var pattern = @"^(?<station>\d+)\s(?<min>\d+)\s(?<sec>\d+)\s(?<performer>.+)\:(?<song>.+)$";
...
select new Plays
{
    RadioID = int.Parse(matches.Groups["station"].Value),
    PlayTimeMinutes = int.Parse(matches.Groups["min"].Value),
    ...
};
You can also get rid of the Plays class and use a single, slightly more complex LINQ query :
var durations = from line in File.ReadLines(filePath)
                let matches = regex.Match(line)
                let play = new
                {
                    RadioID = int.Parse(matches.Groups["station"].Value),
                    Minutes = int.Parse(matches.Groups["min"].Value),
                    Seconds = int.Parse(matches.Groups["sec"].Value)
                }
                group play by play.RadioID into grp
                select new
                {
                    RadioID = grp.Key,
                    Hours = grp.Sum(p => p.Minutes + p.Seconds / 60) / 60,
                    Mins = grp.Sum(p => p.Minutes + p.Seconds / 60) % 60,
                    Secs = grp.Sum(p => p.Seconds) % 60
                };
In this case, no strings are generated for Performer and Song. That's another benefit of regular expressions. Matches and groups are just indexes into the original string. No string is generated until the .Value is read. This would reduce the RAM used in this case by about 75%.
Once you have the results, you can iterate over them :
foreach (var duration in durations)
{
    Console.WriteLine("{0}. radio: {1}:{2}:{3} playtime",
        duration.RadioID,
        duration.Hours,
        duration.Mins,
        duration.Secs);
}

Filter, merge, sort and page data from multiple sources

At the moment I'm retrieving data from the DB through a method that returns an IQueryable<T1>, then filtering, sorting and paging it (all of these on the DB, basically), before returning the result to the UI for display in a paged table.
I need to integrate results from another DB, and paging seems to be the main issue.
models are similar but not identical (same fields, different names, will need to map to a generic domain model before returning);
joining at the DB level is not possible;
there are ~1000 records at the moment between both DBs (added during the past 18 months), and they are likely to grow at mostly the same (slow) pace;
results always need to be sorted by 1-2 fields (date-wise).
I'm currently torn between these 2 solutions:
Retrieve all data from both sources, merge, sort and then cache them; then simply filter and page on said cache when receiving requests - but I need to invalidate the cache when the collection is modified (which I can);
Filter data on each source (again, at the DB level), then retrieve, merge, sort & page them, before returning.
I'm looking to find a decent algorithm performance-wise. The ideal solution would probably be a combination between them (caching + filtering at the DB level), but I haven't wrapped my head around that at the moment.
I think you can use the following algorithm. Suppose your page size is 10, then for page 0:
Get 10 results from database A, filtered and sorted at db level.
Get 10 results from database B, filtered and sorted at db level (in parallel with the above query)
Combine those two result sets to get the 10 records in the correct sort order. You now have 20 sorted records, but take only the first 10 of them and display them in the UI.
Then for page 1:
Note how many items from databases A and B you used for the UI in the previous step. For example, you used 2 items from database A and 8 items from database B.
Get 10 results from database A, filtered and sorted, but starting at position 2 (skip 2), because those two have already been shown in the UI.
Get 10 results from database B, filtered and sorted, but starting at position 8 (skip 8).
Merge the same way as above to get 10 records out of 20. Suppose this time you used 5 items from A and 5 items from B. In total you have now shown 7 items from A and 13 items from B. Use those numbers for the next step.
This will not allow you to (easily) skip pages, but as I understand it, that is not a requirement.
The performance should be effectively the same as when querying a single database, because the queries to A and B can be done in parallel.
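A minimal sketch of that merge step (my own illustration, not the answerer's code; fetchPageA and fetchPageB stand in for the per-database filtered, sorted skip/take queries):

//merges one UI page from two pre-sorted, pre-filtered sources,
//tracking how many items of each source were consumed
public static (List<TItem> Page, int UsedA, int UsedB) MergePage<TItem, TKey>(
    Func<int, int, IReadOnlyList<TItem>> fetchPageA, // (skip, take) => rows from DB A
    Func<int, int, IReadOnlyList<TItem>> fetchPageB, // (skip, take) => rows from DB B
    Func<TItem, TKey> sortKey,
    int skippedA, int skippedB, int pageSize)
    where TKey : IComparable<TKey>
{
    var a = fetchPageA(skippedA, pageSize);
    var b = fetchPageB(skippedB, pageSize);
    var page = new List<TItem>(pageSize);
    int ia = 0, ib = 0;
    while (page.Count < pageSize && (ia < a.Count || ib < b.Count))
    {
        bool takeA = ib >= b.Count ||
                     (ia < a.Count && sortKey(a[ia]).CompareTo(sortKey(b[ib])) <= 0);
        page.Add(takeA ? a[ia++] : b[ib++]);
    }
    //persist skippedA + ia and skippedB + ib as the skip counts for the next page request
    return (page, ia, ib);
}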
I've created something here; I will come back with explanations if needed.
I'm not sure my algorithm works correctly for all edge cases. It covers all of the cases I had in mind, but you never know. I'll leave the code here for your pleasure; if you need an explanation of what is done there, leave a comment and I will answer.
Also, perform multiple tests with lists of items with large gaps between the values.
using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        //each time one of these objects is accessed, consider it a database call
        private static IQueryable<model1> dbsetModel_1;
        private static IQueryable<model2> dbsetModel_2;

        private static void InitDBSets()
        {
            var rnd = new Random();
            List<model1> dbsetModel1 = new List<model1>();
            List<model2> dbsetModel2 = new List<model2>();
            for (int i = 1; i < 300; i++)
            {
                if (i % 2 == 0)
                {
                    dbsetModel1.Add(new model1() { Id = i, OrderNumber = rnd.Next(1, 10), Name = "Test " + i.ToString() });
                }
                else
                {
                    dbsetModel2.Add(new model2() { Id2 = i, OrderNumber2 = rnd.Next(1, 10), Name2 = "Test " + i.ToString() });
                }
            }
            dbsetModel_1 = dbsetModel1.AsQueryable();
            dbsetModel_2 = dbsetModel2.AsQueryable();
        }

        public static void Main()
        {
            //generate some sort of db data
            InitDBSets();
            //test
            var result2 = GetPage(new PagingFilter() { Page = 5, Limit = 10 });
            var result3 = GetPage(new PagingFilter() { Page = 6, Limit = 10 });
            var result5 = GetPage(new PagingFilter() { Page = 7, Limit = 10 });
            var result6 = GetPage(new PagingFilter() { Page = 8, Limit = 10 });
            var result7 = GetPage(new PagingFilter() { Page = 4, Limit = 20 });
            var result8 = GetPage(new PagingFilter() { Page = 200, Limit = 10 });
        }

        private static PagedList<Item> GetPage(PagingFilter filter)
        {
            int pos = 0;
            //load only the start margins of the page intervals from both databases
            //this part should become a stored procedure on DB one (skip/take) returning the start value of each frame
            var framesBordersModel1 = new List<Item>();
            dbsetModel_1.OrderBy(x => x.Id).ThenBy(z => z.OrderNumber).ToList().ForEach(i =>
            {
                pos++;
                if (pos - 1 == 0)
                {
                    framesBordersModel1.Add(new Item() { criteria1 = i.Id, criteria2 = i.OrderNumber, model = i });
                }
                else if ((pos - 1) % filter.Limit == 0)
                {
                    framesBordersModel1.Add(new Item() { criteria1 = i.Id, criteria2 = i.OrderNumber, model = i });
                }
            });
            pos = 0;
            //this part should become a stored procedure on DB two (skip/take) returning the start value of each frame
            var framesBordersModel2 = new List<Item>();
            dbsetModel_2.OrderBy(x => x.Id2).ThenBy(z => z.OrderNumber2).ToList().ForEach(i =>
            {
                pos++;
                if (pos - 1 == 0)
                {
                    framesBordersModel2.Add(new Item() { criteria1 = i.Id2, criteria2 = i.OrderNumber2, model = i });
                }
                else if ((pos - 1) % filter.Limit == 0)
                {
                    framesBordersModel2.Add(new Item() { criteria1 = i.Id2, criteria2 = i.OrderNumber2, model = i });
                }
            });

            //decide the position of your cursor based on the start margins
            //int mainCursor = 0;
            int cursor1 = 0;
            int cursor2 = 0;
            //filter pages start from 1; filter.Page cannot be 0 (if you do have a page 0, adjust the logic a bit)
            if (framesBordersModel1.Count + framesBordersModel2.Count < filter.Page) throw new Exception("Out of range");
            while (cursor1 + cursor2 < filter.Page - 1)
            {
                if (framesBordersModel1[cursor1].criteria1 < framesBordersModel2[cursor2].criteria1)
                {
                    cursor1++;
                }
                else if (framesBordersModel1[cursor1].criteria1 > framesBordersModel2[cursor2].criteria1)
                {
                    cursor2++;
                }
                //you shouldn't get here since the main key shouldn't be duplicated, but anyhow
                else
                {
                    if (framesBordersModel1[cursor1].criteria2 < framesBordersModel2[cursor2].criteria2)
                    {
                        cursor1++;
                    }
                    else
                    {
                        cursor2++;
                    }
                }
                //mainCursor++;
            }

            //magic starts
            //in part skippable
            int skipEndResult = 0;
            List<Item> dbFramesMerged = new List<Item>();
            if ((cursor1 + cursor2) % 2 == 0)
            {
                dbFramesMerged.AddRange(
                    dbsetModel_1.OrderBy(x => x.Id)
                        .ThenBy(z => z.OrderNumber)
                        .Skip(cursor1 * filter.Limit)
                        .Take(filter.Limit)
                        .Select(x => new Item() { criteria1 = x.Id, criteria2 = x.OrderNumber, model = x })
                        .ToList()); //consider as db call EF or Stored Procedure
                dbFramesMerged.AddRange(
                    dbsetModel_2.OrderBy(x => x.Id2)
                        .ThenBy(z => z.OrderNumber2)
                        .Skip(cursor2 * filter.Limit)
                        .Take(filter.Limit)
                        .Select(x => new Item() { criteria1 = x.Id2, criteria2 = x.OrderNumber2, model = x })
                        .ToList()); //consider as db call EF or Stored Procedure
            }
            else
            {
                skipEndResult = filter.Limit;
                if (cursor1 > cursor2)
                {
                    cursor1--;
                }
                else
                {
                    cursor2--;
                }
                dbFramesMerged.AddRange(
                    dbsetModel_1.OrderBy(x => x.Id)
                        .ThenBy(z => z.OrderNumber)
                        .Skip(cursor1 * filter.Limit)
                        .Take(filter.Limit)
                        .Select(x => new Item() { criteria1 = x.Id, criteria2 = x.OrderNumber, model = x })
                        .ToList()); //consider as db call EF or Stored Procedure
                dbFramesMerged.AddRange(
                    dbsetModel_2.OrderBy(x => x.Id2)
                        .ThenBy(z => z.OrderNumber2)
                        .Skip(cursor2 * filter.Limit)
                        .Take(filter.Limit)
                        .Select(x => new Item() { criteria1 = x.Id2, criteria2 = x.OrderNumber2, model = x })
                        .ToList()); //consider as db call EF or Stored Procedure
            }

            IQueryable<Item> qItems = dbFramesMerged.AsQueryable();
            PagedList<Item> result = new PagedList<Item>();
            result.AddRange(qItems.OrderBy(x => x.criteria1).ThenBy(z => z.criteria2).Skip(skipEndResult).Take(filter.Limit).ToList());

            //here again you need db calls to get the total count
            result.Total = dbsetModel_1.Count() + dbsetModel_2.Count();
            result.Limit = filter.Limit;
            result.Page = filter.Page;
            return result;
        }
    }

    public class PagingFilter
    {
        public int Limit { get; set; }
        public int Page { get; set; }
    }

    public class PagedList<T> : List<T>
    {
        public int Total { get; set; }
        public int? Page { get; set; }
        public int? Limit { get; set; }
    }

    public class Item : Criteria
    {
        public object model { get; set; }
    }

    public class Criteria
    {
        public int criteria1 { get; set; }
        public int criteria2 { get; set; }
        //more criteria if you need them for ordering
    }

    public class model1
    {
        public int Id { get; set; }
        public int OrderNumber { get; set; }
        public string Name { get; set; }
    }

    public class model2
    {
        public int Id2 { get; set; }
        public int OrderNumber2 { get; set; }
        public string Name2 { get; set; }
    }
}

Find a gap in a group of ranges with optional inequalities

There's a bunch of similar questions on SO, but none that seem to answer what I'm asking.
I have a class like so:
public partial class PercentileWeight
{
    public virtual Guid VotingWeightId { get; set; }
    public virtual decimal LowerBoundPercentageRanking { get; set; }
    public virtual bool LowerBoundInclusive { get; set; }
    public virtual decimal UpperBoundPercentageRanking { get; set; }
    public virtual bool UpperBoundInclusive { get; set; }
    public virtual decimal PercentageWeight { get; set; }
}
... the concept here is, if a data source is ranked within a certain percentile, the value of their data may count more or less in a decision tree that consumes that data. For example, if the data source is ranked in the top 10%, I might want to double the value weight of the data source. The object for such a PercentileWeight would look something like this:
var pw = new PercentileWeight
{
    UpperBoundPercentageRanking = 100M,
    UpperBoundInclusive = true,
    LowerBoundPercentageRanking = 90M,
    LowerBoundInclusive = false,
    PercentageWeight = 200M
};
Note the UpperBoundInclusive and LowerBoundInclusive values. In this model, a ranking of exactly 90% would not qualify, but a value of exactly 100% would qualify. There will also be logic to make sure that none of the ranges overlap.
What I'd like to do is programmatically identify "gaps" in a collection of these objects, so I can show the user "uncovered ranges" for which to create PercentileWeight objects.
I want to present the user with a "prefab" PercentileWeight object covering the first gap; for example, if the above object was already in the system, the user would be presented with a potential object resembling:
var pw = new PercentileWeight
{
    UpperBoundPercentageRanking = 90M,
    UpperBoundInclusive = true,
    LowerBoundPercentageRanking = 0M,
    LowerBoundInclusive = true,
    PercentageWeight = 100M
};
Here's the problem: it seems this should be relatively straightforward, but I have no idea how to do this. Can someone suggest a relatively performant way of identifying the gaps in a collection of such ranges?
This is one of those problems that seems simple but is a little tricky to implement in practice. Here is an extension method which will create new PercentileWeight objects to fill in all the gaps within a covering range.
public static class PercentileWeightExtension
{
    public const decimal Delta = 0.00000000000000000000000001M;

    public static IEnumerable<PercentileWeight> CoveringRange(this IEnumerable<PercentileWeight> inputs, PercentileWeight coveringRange)
    {
        //todo: the following code expects no overlaps; check that none exist
        //create lower and upper weights from coveringRange
        var lower = new PercentileWeight(decimal.MinValue, true, coveringRange.LowerBoundPercentageRanking, !coveringRange.LowerBoundInclusive);
        var upper = new PercentileWeight(coveringRange.UpperBoundPercentageRanking, !coveringRange.UpperBoundInclusive, decimal.MaxValue, true);
        //union the new lower and upper weights with the incoming list and order them for processing
        var orderedInputs = inputs.Union(new[] { lower, upper })
                                  .OrderBy(item => item.LowerBoundPercentageRanking)
                                  .ToList();
        //process the list in order, filling in the gaps
        for (var i = 1; i < orderedInputs.Count; i++)
        {
            var gap = GetPercentileWeightBetweenLowerAndUpper(orderedInputs[i - 1], orderedInputs[i]);
            if (gap != null)
            {
                yield return gap;
            }
            //don't output the last input; it is the fake upper made above and wasn't in the original input
            if (i < (orderedInputs.Count - 1))
            {
                yield return orderedInputs[i];
            }
        }
    }

    private static PercentileWeight GetPercentileWeightBetweenLowerAndUpper(PercentileWeight lowerWeight, PercentileWeight upperWeight)
    {
        var lower = lowerWeight.UpperBoundPercentageRanking;
        var lowerInclusive = lowerWeight.UpperBoundInclusive;
        var upper = upperWeight.LowerBoundPercentageRanking;
        var upperInclusive = upperWeight.LowerBoundInclusive;
        //see if there is a gap between lower and upper (offset by a small delta for non-inclusive ranges)
        var diff = (upper + (upperInclusive ? 0 : Delta)) - (lower - (lowerInclusive ? 0 : Delta));
        if (diff > Delta)
        {
            //there was a gap, so return a new weight to fill it
            return new PercentileWeight
            {
                LowerBoundPercentageRanking = lower,
                LowerBoundInclusive = !lowerInclusive,
                UpperBoundPercentageRanking = upper,
                UpperBoundInclusive = !upperInclusive
            };
        }
        return null;
    }
}
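Note that the code above and the usage below assume a convenience constructor on PercentileWeight that isn't shown in the question; something like this (the parameter order is my assumption, inferred from calls like new PercentileWeight(90, false, 95, true)):

public partial class PercentileWeight
{
    public PercentileWeight() { }

    //(lower, lowerInclusive, upper, upperInclusive)
    public PercentileWeight(decimal lower, bool lowerInclusive, decimal upper, bool upperInclusive)
    {
        LowerBoundPercentageRanking = lower;
        LowerBoundInclusive = lowerInclusive;
        UpperBoundPercentageRanking = upper;
        UpperBoundInclusive = upperInclusive;
    }
}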
This can be used pretty easily, like this:
public class Program
{
    public static void Main(string[] args)
    {
        //existing weights
        var existingWeights = new[]
        {
            new PercentileWeight(90, false, 95, true) { VotingWeightId = Guid.NewGuid() },
            new PercentileWeight(50, true, 60, false) { VotingWeightId = Guid.NewGuid() }
        };
        //get the entire range with gaps filled in, from 0 (non-inclusive) to 100 (inclusive)
        var entireRange = existingWeights.CoveringRange(new PercentileWeight(0, false, 100, true)).ToList();
    }
}
Which outputs a new list containing these items (all new items have a VotingWeightId of Guid.Empty):
0 (non-inclusive) to 50 (non-inclusive) (New)
50 (inclusive) to 60 (non-inclusive)
60 (inclusive) to 90 (inclusive) (New)
90 (non-inclusive) to 95 (inclusive)
95 (non-inclusive) to 100 (inclusive) (New)

Sum up column values in datagrid based on the value of another column in c# windows

I have a datagrid with many columns.
Below are two of those columns.
I need to add up the Count values for the P and F values of the P/F column separately and compare them: for P the sum is 3 and for F it is 7. I need to display the sum with the greater value. Is there any way I can achieve this?
P/F | Count
------------------------
P | 2
P | 1
F | 5
F | 2
Using LINQ:
var p_sum = from p_col in dataGridView1 //--> am getting error here (group by not found)
            group p_col by p_col.Status into g
            select g.Sum(p => p.weightagepercent);
You could use LINQ to do something like:
var p_sum = from p_col in datagrid
            group p_col by p_col.datagrid_p_f_column into g
            select g.Sum(p => p.datagrid_value_column);
Do the same for F, and then just show the bigger one with a simple smaller/bigger comparison, or Max on both variables.
First, learn LINQ. It'll make your life easier :) Here are the official 101 LINQ Samples.
Let's assume I have a class called ExtraStringDataPoint, defined like this:
public class ExtraStringDataPoint : IDataPoint
{
    public ExtraStringDataPoint(double x, double y, string s)
    {
        X = x;
        Y = y;
        Extra = s;
    }

    public double X { get; set; }
    public double Y { get; set; }
    public string Extra { get; set; }
}
Now I have a collection of those points:
List<ExtraStringDataPoint> MyList = new List<ExtraStringDataPoint>()
{
    new ExtraStringDataPoint(10, 10, "ten"),
    new ExtraStringDataPoint(20, 20, "twenty"),
    new ExtraStringDataPoint(30, 30, "thirty")
};
You could use LINQ to sum all the points' Y values where the X value is bigger than 15, for example:
var bigger_than_15 = from point in MyList
                     where point.X > 15
                     select point;
var total_y = bigger_than_15.Sum(point => point.Y);
So, back to your case. LINQ will group the items according to your datagrid by the column that holds the name (P or F). p_col is a temp variable; datagrid_p_f_column should be the column name (or a property on an object, it's the same). Once the grouping is done, it will sum the values: for each group p, it sums p.datagrid_value_column, which should be the name of the column holding your numeric value. In the end, p_sum will hold the sum of the values of the rows where the name is P. Rinse and repeat for F, compare, and that's it.
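As a concrete sketch of that idea (my own example; it assumes the grid rows have been read into a small Row class first, since a DataGridView itself is not directly queryable with LINQ, which is likely the cause of the "group by not found" error above):

public class Row
{
    public string Status { get; set; } // "P" or "F"
    public int Count { get; set; }
}

//group by status, sum each group's Count, then keep the larger sum
var rows = new List<Row>
{
    new Row { Status = "P", Count = 2 },
    new Row { Status = "P", Count = 1 },
    new Row { Status = "F", Count = 5 },
    new Row { Status = "F", Count = 2 }
};
var winner = rows.GroupBy(r => r.Status)
                 .Select(g => new { Status = g.Key, Total = g.Sum(r => r.Count) })
                 .OrderByDescending(x => x.Total)
                 .First();
Console.WriteLine("{0}: {1}", winner.Status, winner.Total); // F: 7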
int countF = 0;
int countP = 0;
foreach (DataRow row in dataTable.Rows)
{
    if (row.ItemArray[0].ToString().Equals("F"))
    {
        countF++;
    }
    if (row.ItemArray[0].ToString().Equals("P"))
    {
        countP++;
    }
}
if (countF > countP)
{
    //display
}
else
{
    //display
}
//for DataGridView
dataGridView1.Rows.Add("P", "1");
dataGridView1.Rows.Add("F", "2");
int countF = 0;
int countP = 0;
for (int i = 0; i < dataGridView1.Rows.Count - 1; i++)
{
    string a = dataGridView1[0, i].Value.ToString();
    string b = dataGridView1[1, i].Value.ToString();
    if (a == "F")
    {
        countF += Convert.ToInt32(b);
    }
    if (a == "P")
    {
        countP += Convert.ToInt32(b);
    }
}
if (countF > countP)
{
    //display
}
else
{
    //display
}
I don't use LINQ. Below is simple working code:
double sumP = 0;
double sumF = 0;
//the starting row and the cell indexes 6 and 9 are specific to the author's grid layout
for (int i = 6; i < dataGridView1.Rows.Count - 1; ++i)
{
    if (dataGridView1.Rows[i].Cells[6].Value.Equals("P"))
    {
        sumP += Convert.ToDouble(dataGridView1.Rows[i].Cells[9].Value);
    }
    else if (dataGridView1.Rows[i].Cells[6].Value.Equals("F"))
    {
        sumF += Convert.ToDouble(dataGridView1.Rows[i].Cells[9].Value);
    }
}
if (sumF > sumP)
{
    label2.Text = "Fail";
}
else
{
    label2.Text = "Pass";
}
