Find a gap in a group of ranges with optional inequalities - c#

There's a bunch of similar questions on SO, but none that seem to answer what I'm asking.
I have a class like so:
public partial class PercentileWeight
{
public virtual Guid VotingWeightId { get; set; }
public virtual decimal LowerBoundPercentageRanking { get; set; }
public virtual bool LowerBoundInclusive { get; set; }
public virtual decimal UpperBoundPercentageRanking { get; set; }
public virtual bool UpperBoundInclusive { get; set; }
public virtual decimal PercentageWeight { get; set; }
}
... the concept here is, if a data source is ranked within a certain percentile, the value of their data may count more or less in a decision tree that consumes that data. For example, if the data source is ranked in the top 10%, I might want to double the value weight of the data source. The object for such a PercentileWeight would look something like this:
var pw = new PercentileWeight
{
UpperBoundPercentageRanking = 100M,
UpperBoundInclusive = true,
LowerBoundPercentageRanking = 90M,
LowerBoundInclusive = false,
PercentageWeight = 200M
};
Note the UpperBoundInclusive and LowerBoundInclusive values. In this model, a ranking of exactly 90% would not qualify, but a value of exactly 100% would qualify. There will also be logic to make sure that none of the ranges overlap.
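As a sketch of those inclusive/exclusive semantics, a membership test could look like the following. The `Covers` helper and the `PercentileWeightChecks` class are hypothetical names used only for illustration, and the class is a trimmed-down copy of the one above:

```csharp
using System;

// Trimmed-down copy of the class above, just enough for the example.
public class PercentileWeight
{
    public decimal LowerBoundPercentageRanking { get; set; }
    public bool LowerBoundInclusive { get; set; }
    public decimal UpperBoundPercentageRanking { get; set; }
    public bool UpperBoundInclusive { get; set; }
    public decimal PercentageWeight { get; set; }
}

public static class PercentileWeightChecks
{
    // Hypothetical helper: true when 'ranking' falls inside the range,
    // honoring the inclusive/exclusive flag on each bound.
    public static bool Covers(PercentileWeight w, decimal ranking)
    {
        bool aboveLower = w.LowerBoundInclusive
            ? ranking >= w.LowerBoundPercentageRanking
            : ranking > w.LowerBoundPercentageRanking;
        bool belowUpper = w.UpperBoundInclusive
            ? ranking <= w.UpperBoundPercentageRanking
            : ranking < w.UpperBoundPercentageRanking;
        return aboveLower && belowUpper;
    }
}
```

For the top-10% object above, `Covers(pw, 90M)` is false and `Covers(pw, 100M)` is true.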
What I'd like to do is programmatically identify "gaps" in a collection of these objects, so I can show the user the "uncovered ranges" and let them create PercentileWeight objects for them.
I want to present the user with a "prefab" PercentileWeight object covering the first gap; for example, if the above object was already in the system, the user would be presented with a potential object resembling:
var pw = new PercentileWeight
{
UpperBoundPercentageRanking = 90M,
UpperBoundInclusive = true,
LowerBoundPercentageRanking = 0M,
LowerBoundInclusive = true,
PercentageWeight = 100M
};
Here's the problem: it seems this should be relatively straightforward, but I have no idea how to do this. Can someone suggest a relatively performant way of identifying the gaps in a collection of such ranges?

This is one of those problems that seems simple but is a little tricky to implement in practice. Here is an extension method that creates new PercentileWeight objects to fill in all the gaps within a covering range, and returns them together with the existing ones in order.
public static class PercentileWeightExtension
{
public const decimal Delta = 0.00000000000000000000000001M;
public static IEnumerable<PercentileWeight> CoveringRange(this IEnumerable<PercentileWeight> inputs, PercentileWeight coveringRange)
{
//todo: the following code expects no overlaps; check that none exist
//create lower and upper weights from coveringRange
var lower = new PercentileWeight(decimal.MinValue, true, coveringRange.LowerBoundPercentageRanking, !coveringRange.LowerBoundInclusive);
var upper = new PercentileWeight(coveringRange.UpperBoundPercentageRanking, !coveringRange.UpperBoundInclusive, decimal.MaxValue, true);
//union new lower and upper weights with incoming list and order to process
var orderedInputs = inputs.Union(new [] { lower, upper })
.OrderBy(item => item.LowerBoundPercentageRanking)
.ToList();
//process list in order filling in the gaps
for (var i = 1; i < orderedInputs.Count; i++)
{
var gap = GetPercentileWeightBetweenLowerAndUpper(orderedInputs[i - 1], orderedInputs[i]);
if (gap != null)
{
yield return gap;
}
//don't output the last item: it is the fake upper weight created above and wasn't in the original input
if (i < (orderedInputs.Count - 1))
{
yield return orderedInputs[i];
}
}
}
private static PercentileWeight GetPercentileWeightBetweenLowerAndUpper(PercentileWeight lowerWeight, PercentileWeight upperWeight)
{
var lower = lowerWeight.UpperBoundPercentageRanking;
var lowerInclusive = lowerWeight.UpperBoundInclusive;
var upper = upperWeight.LowerBoundPercentageRanking;
var upperInclusive = upperWeight.LowerBoundInclusive;
//see if there is a gap between lower and upper (offset by a small delta for non inclusive ranges)
var diff = (upper + (upperInclusive ? 0 : Delta)) - (lower - (lowerInclusive ? 0 : Delta));
if (diff > Delta)
{
//there was a gap so return a new weight to fill it
return new PercentileWeight
{
LowerBoundPercentageRanking = lower,
LowerBoundInclusive = !lowerInclusive,
UpperBoundPercentageRanking = upper,
UpperBoundInclusive = !upperInclusive
};
}
return null;
}
}
This can be used quite easily, like this:
public class Program
{
public static void Main(string[] args)
{
//existing weights
var existingWeights = new[] {
new PercentileWeight(90, false, 95, true) { VotingWeightId = Guid.NewGuid() },
new PercentileWeight(50, true, 60, false) { VotingWeightId = Guid.NewGuid() }
};
//get entire range with gaps filled in from 0 (non inclusive) to 100 (inclusive)
var entireRange = existingWeights.CoveringRange(new PercentileWeight(0, false, 100, true)).ToList();
}
}
Which outputs a new list containing these items (all new items have a VotingWeightId of Guid.Empty)
0 (non inclusive) to 50 (non inclusive) (New)
50 (inclusive) to 60 (non inclusive)
60 (inclusive) to 90 (inclusive) (New)
90 (non inclusive) to 95 (inclusive)
95 (non inclusive) to 100 (inclusive) (New)
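Note that the code above calls a four-argument PercentileWeight constructor and also uses an object initializer, and neither constructor is shown in the question's class. A minimal sketch of what seems to be assumed, with the argument order inferred from the calls above (lower bound, lower inclusive, upper bound, upper inclusive). The `Describe` helper is hypothetical and only prints a range in the format of the list above:

```csharp
using System;
using System.Globalization;

public class PercentileWeight
{
    // Parameterless constructor, needed for the object-initializer syntax.
    public PercentileWeight() { }

    // Assumed convenience constructor; argument order inferred from the calls above.
    public PercentileWeight(decimal lowerBound, bool lowerInclusive,
                            decimal upperBound, bool upperInclusive)
    {
        LowerBoundPercentageRanking = lowerBound;
        LowerBoundInclusive = lowerInclusive;
        UpperBoundPercentageRanking = upperBound;
        UpperBoundInclusive = upperInclusive;
    }

    public Guid VotingWeightId { get; set; }
    public decimal LowerBoundPercentageRanking { get; set; }
    public bool LowerBoundInclusive { get; set; }
    public decimal UpperBoundPercentageRanking { get; set; }
    public bool UpperBoundInclusive { get; set; }
    public decimal PercentageWeight { get; set; }

    // Hypothetical helper: formats the range like "90 (non inclusive) to 95 (inclusive)".
    public string Describe() =>
        $"{LowerBoundPercentageRanking.ToString(CultureInfo.InvariantCulture)} ({Label(LowerBoundInclusive)}) to " +
        $"{UpperBoundPercentageRanking.ToString(CultureInfo.InvariantCulture)} ({Label(UpperBoundInclusive)})";

    private static string Label(bool inclusive) => inclusive ? "inclusive" : "non inclusive";
}
```

New items created by the gap-filling code never set VotingWeightId, which is why they come back with Guid.Empty.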

Related

Find Range of Most Profitable Products

I have over 1,000 records and I am using this to find the highest value of (profit * volume).
In this case it's "DEF", but then I have to open Excel, sort by volume, and find the range that produces the highest profit. Say Excel column 200 to column 800, and then I'm left with, say, volume 13450 to volume 85120 as the best range for profits. How can I code something like that in C# so that I can stop using Excel?
public class Stock {
public string StockSymbol { get; set; }
public double Profit { get; set; }
public int Volume { get; set; }
public Stock(string Symbol, double p, int v) {
StockSymbol = Symbol;
Profit = p;
Volume = v;
}
}
private ConcurrentDictionary<string, Stock> StockData = new();
private void button1_Click(object sender, EventArgs e) {
StockData["ABC"] = new Stock("ABC", 50, 14000);
StockData["DEF"] = new Stock("DEF", 50, 105000);
StockData["GHI"] = new Stock("GHI", -70, 123080);
StockData["JKL"] = new Stock("JKL", -70, 56500);
StockData["MNO"] = new Stock("MNO", 50, 23500);
var DictionaryItem = StockData.OrderByDescending((u) => u.Value.Profit * u.Value.Volume).First();
MessageBox.Show( DictionaryItem.Value.StockSymbol + " " + DictionaryItem.Value.Profit);
}
I wrote up something that may or may not meet your requirements. It uses random to seed a set of test data (you can ignore all of that).
private void GetStockRange()
{
var stocks = new Stock[200];
var stockChars = Enumerable.Range(0, 26).Select(n => ((char)('A' + n)).ToString()).ToArray();
var random = new Random(DateTime.Now.Millisecond);
for (int i = 0; i < stocks.Length; i++)
{
stocks[i] = new Stock(stockChars[random.Next(0, 26)], random.NextDouble() * random.Next(-250, 250), random.Next(1, 2000));
}
var highestPerformanceSelectionCount = 3;
var highestPerformanceIndices = stocks
.OrderByDescending(stock => stock.Performance)
.Take(Math.Max(2, highestPerformanceSelectionCount))
.Select(stock => Array.IndexOf(stocks, stock))
.OrderBy(i => i);
var startOfRange = highestPerformanceIndices.First();
var endOfRange = highestPerformanceIndices.Last();
var rangeCount = endOfRange - startOfRange + 1;
var stocksRange = stocks
.Skip(startOfRange)
.Take(rangeCount);
var totalPerformance = stocks.Sum(stock => stock.Performance);
var rangedPerformance = stocksRange.Sum(stock => stock.Performance);
MessageBox.Show(
"Range Start: " + startOfRange + "\r\n" +
"Range End: " + endOfRange + "\r\n" +
"Range Cnt: " + rangeCount + "\r\n" +
"Total P: " + totalPerformance + "\r\n" +
"Range P: " + rangedPerformance
);
}
The basic idea of this algorithm is to get some of the highest performance points (configured using highestPerformanceSelectionCount, with a minimum of 2) and, using those indices, construct a range that contains those items. Then take the sum over that range to get its total.
Not sure if I am way off from your question. This may also not be the best way to handle the range. I wanted to post what I had before heading home.
I also added a Performance property to the stock class, which is simply Profit * Volume
EDIT
There is a mistake in the use of the selected indices. The indices selected should be used against the ordered set in order to produce correct ranged results.
Rather than taking the stocksRange from the original unsorted array, instead create the range from the ordered set.
var stocksRange = stocks
.OrderByDescending(stock => stock.Performance)
.Skip(startOfRange)
.Take(rangeCount);
The indices should be gathered from the ordered set as well. Caching the ordered set is probably the easiest route to go.
As is generally the case, there are any number of ways you can go about this.
First, your sorting code above (the OrderByDescending call). It does what you appear to want, more or less, in that it produces an ordered sequence of KeyValuePair<string, Stock> that you can then choose from. Personally I'd just have sorted StockData.Values to avoid all that .Value indirection. Once sorted you can take the top performer as you're doing, or use the .Take() method to grab the top n items:
var byPerformance = StockData.Values.OrderByDescending(s => s.Profit * s.Volume);
var topPerformer = byPerformance.First();
var top10 = byPerformance.Take(10).ToArray();
If you want to filter by a particular performance value or range then it helps to pre-calculate the number and do your filtering on that. Either store the Performance value in the class, calculate it in the class with a computed property, or tag the Stock records with a calculated performance using an intermediate type:
Store in the class
public class Stock {
// marking these 'init' makes them read-only after creation
public string StockSymbol { get; init; }
public double Profit { get; init; }
public int Volume { get; init; }
public double Performance { get; init; }
public Stock(string symbol, double profit, int volume)
{
StockSymbol = symbol;
Profit = profit;
Volume = volume;
Performance = profit * volume;
}
}
Calculate in class
public class Stock
{
public string StockSymbol { get; set; }
public double Profit { get; set; }
public int Volume { get; set; }
public double Performance => Profit * Volume;
// you know what the constructor looks like
}
Intermediate Type (with range filtering)
// let's look for those million-dollar items
var minPerformance = 1000000d;
var highPerformance = StockData.Values
// create a stream of intermediate types with the performance
.Select(s => new { Performance = s.Profit * s.Volume, Stock = s })
// sort them
.OrderByDescending(i => i.Performance)
// filter by our criteria
.Where(i => i.Performance >= minPerformance)
// pull out the stocks themselves
.Select(i => i.Stock)
// and fix into an array so we don't have to do this repeatedly
.ToArray();
Ultimately though you'll probably end up looking for ways to store the data between runs, update the values and so forth. I strongly suggest looking at starting with a database and learning how to do things there. It's basically the same, you just end up with a lot more flexibility in the way you handle the data. The code to do the actual queries looks basically the same.
Once you have the data in your program, there are very few limits on how you can manipulate it. Anything you can do in Excel with the data, you can do in C#. Usually easily, sometimes with a little work.
LINQ (Language-Integrated Query) makes a lot of those manipulations trivial, with extensions for all sorts of things. You can take the average performance (with .Average()) and then filter on those that perform 10% above it with some simple math. If the data follows some sort of Normal Distribution you can add your own extension (or use this one) to figure out the standard deviation... and now we're doing statistics!
The basic concepts of LINQ, and the database languages it was roughly based on, give you plenty of expressive power. And Stack Overflow is full of people who can help you figure out how to get there.
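As a rough sketch of the average and standard deviation ideas mentioned above: the `StockPerformance` and `PerformanceStats` names are made up for this example, and "10% above average" is interpreted here as exceeding the average by more than 10% of its absolute value, which is an assumption (it keeps the threshold above the average even when the average is negative).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal type: a symbol plus its precomputed Profit * Volume.
public class StockPerformance
{
    public StockPerformance(string symbol, double performance)
    {
        Symbol = symbol;
        Performance = performance;
    }

    public string Symbol { get; }
    public double Performance { get; }
}

public static class PerformanceStats
{
    // Items whose performance exceeds the average by more than 10% of |average|.
    public static List<StockPerformance> AboveAverageBy10Percent(IEnumerable<StockPerformance> source)
    {
        var items = source.ToList();
        if (items.Count == 0)
            return new List<StockPerformance>();

        double average = items.Average(s => s.Performance);
        double threshold = average + Math.Abs(average) * 0.10;
        return items.Where(s => s.Performance > threshold).ToList();
    }

    // Population standard deviation (divides by N, not N - 1); returns 0 for an empty input.
    public static double StandardDeviation(this IEnumerable<double> values)
    {
        var list = values.ToList();
        if (list.Count == 0)
            return 0;

        double mean = list.Average();
        return Math.Sqrt(list.Sum(v => (v - mean) * (v - mean)) / list.Count);
    }
}
```

With the five stocks from the question the average performance is negative, so the three profitable stocks (ABC, DEF, MNO) pass the filter.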
Try the following:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
List<Stock> stocks = null;
public Form1()
{
InitializeComponent();
stocks = new List<Stock>() {
new Stock("ABC", 50, 14000),
new Stock("DEF", 50, 105000),
new Stock("GHI", -70, 123080),
new Stock("JKL", -70, 56500),
new Stock("MNO", 50, 23500)
};
}
private void button1_Click(object sender, EventArgs e)
{
DataTable dt = Stock.GetTable(stocks);
dataGridView1.DataSource = dt;
}
}
public class Stock {
public string StockSymbol { get; set; }
public double Profit { get; set; }
public int Volume { get; set; }
public Stock(string Symbol, double p, int v) {
StockSymbol = Symbol;
Profit = p;
Volume = v;
}
public static DataTable GetTable(List<Stock> stocks)
{
DataTable dt = new DataTable();
dt.Columns.Add("Symbol", typeof(string));
dt.Columns.Add("Profit", typeof(double));
dt.Columns.Add("Volume", typeof(int));
dt.Columns.Add("Volume x Profit", typeof(double));
foreach(Stock stock in stocks)
{
dt.Rows.Add(new object[] { stock.StockSymbol, stock.Profit, stock.Volume, stock.Profit * stock.Volume });
}
dt = dt.AsEnumerable().OrderByDescending(x => x.Field<double>("Volume x Profit")).CopyToDataTable();
return dt;
}
}
}

Fill the first missing null elements after shifting and rolling window

I'm recreating a strategy made in python with pandas. I think my code works, even tho I haven't compared the values yet, because I'm getting an exception. Basically, the problem is that .Shift(20) removes the first 20 elements and .Window(12 * 60 / 15) removes 47 elements. The typical prices are 10180 by default. They become 10113 after the shifting and rolling window. I tried using .FillMissing(), but it doesn't seem to append the first null values to the series.
def populate_indicators(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
if not {'buy', 'sell'}.issubset(dataframe.columns):
dataframe.loc[:, 'buy'] = 0
dataframe.loc[:, 'sell'] = 0
dataframe['typical'] = qtpylib.typical_price(dataframe)
dataframe['typical_sma'] = qtpylib.sma(dataframe['typical'], window=10)
min = dataframe['typical'].shift(20).rolling(int(12 * 60 / 15)).min()
max = dataframe['typical'].shift(20).rolling(int(12 * 60 / 15)).max()
dataframe['daily_mean'] = (max+min)/2
return dataframe
My code (C#)
public override List<TradeAdvice> Prepare(List<OHLCV> candles)
{
var result = new List<TradeAdvice>();
var typicalPrice = candles.TypPrice().Select(e => e ?? 0).ToList();
var typicalSma = typicalPrice.Sma(10);
var series = typicalPrice.ToOrdinalSeries();
var min = series.Shift(20).Window(12 * 60 / 15).Select(kvp => kvp.Value.Min()).FillMissing(); // 10113 elements / 10180 expected
var max = series.Shift(20).Window(12 * 60 / 15).Select(kvp => kvp.Value.Max()).FillMissing(); // 10113 elements / 10180 expected
var dailyMean = (max + min) / 2;
var asd = dailyMean.SelectValues(e => Convert.ToDecimal(e)).Values.ToList();
var crossedBelow = asd.CrossedBelow(typicalPrice);
var crossedAbove = asd.CrossedAbove(typicalPrice);
for (int i = 0; i < candles.Count; i++)
{
if (i < StartupCandleCount - 1)
result.Add(TradeAdvice.WarmupData);
else if (crossedBelow[i]) // crossedBelow is 10113 elements instead of 10180...
result.Add(TradeAdvice.Buy);
else if (crossedAbove[i]) // crossedAbove is 10113 elements instead of 10180...
result.Add(TradeAdvice.Sell);
else
result.Add(TradeAdvice.NoAction);
}
return result;
}
public class OHLCV
{
public DateTime Timestamp { get; set; }
public decimal Open { get; set; }
public decimal High { get; set; }
public decimal Low { get; set; }
public decimal Close { get; set; }
public decimal Volume { get; set; }
}
If you have an ordinal series that you create with ToOrdinalSeries, it means that the index of the series will be an automatically generated numerical value from 0 to the length of your series - 1. However, this is still a real index and Deedle keeps the mapping when you use operations like Shift.
If your index was a date, say 01/01 => a, 02/01 => b, 03/01 => c, then Shift would shift the values and drop the keys that are no longer needed, i.e. you may get 02/01 => a, 03/01 => b.
It works the same with ordinal indices, so if you have 0 => a, 1 => b, 2 => c and shift the data, you will get something like 1 => a, 2 => b.
If you then want to get 0 => <default>, 1 => a, 2 => b, then you can do this using Realign which takes the new list of keys that you want to have followed by FillMissing. For example:
var ts = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }.ToOrdinalSeries();
var mins = ts.Shift(2).Window(2).Select(kvp => kvp.Value.Min());
var realigned = mins.Realign(Enumerable.Range(0, 10)).FillMissing(-1);
ts.Print(); // Starts from key '0'
mins.Print(); // Starts from key '3' because of Shift & Window
realigned.Print(); // Starts from key '0' with three -1 values at the start
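For comparison, here is a sketch of the same shift-then-rolling-minimum calculation in plain C# without Deedle, padding the start so the result keeps the input length. The `RollingWindow` class and `ShiftedRollingMin` method are hypothetical names, and null stands in for the missing leading values:

```csharp
using System;
using System.Collections.Generic;

public static class RollingWindow
{
    // Mimics pandas' series.shift(shift).rolling(window).min():
    // the result has the same length as the input, and positions that do not
    // yet have a full window of shifted values are left as null.
    public static double?[] ShiftedRollingMin(IReadOnlyList<double> values, int shift, int window)
    {
        if (shift < 0) throw new ArgumentOutOfRangeException(nameof(shift));
        if (window < 1) throw new ArgumentOutOfRangeException(nameof(window));

        var result = new double?[values.Count];

        // After shifting, position i holds values[i - shift]. A full window ending
        // at i therefore covers the original indices (i - shift - window + 1)..(i - shift).
        for (int i = shift + window - 1; i < values.Count; i++)
        {
            int start = i - shift - window + 1;
            double min = values[start];
            for (int j = start + 1; j <= i - shift; j++)
                min = Math.Min(min, values[j]);
            result[i] = min;
        }
        return result;
    }
}
```

With shift 20 and window 12 * 60 / 15 = 48, the first 20 + 47 = 67 positions stay null, which accounts for the 10180 - 10113 difference in the question. The inner loop makes this O(n * window), which is simple rather than fast.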

Find a Date range from a list of Date ranges where they overlap

I'm having a bit of trouble trying to process a list of objects which have simple From and To properties which are both DateTimes where I want the result to be a list of the same type of objects which show the ranges where there are overlaps, tbh, I think I've gone a bit code/logic blind now!
For example (please note, dates are in dd/MM/yyyy format):
TS1: 01/01/2020 to 10/01/2020
TS2: 08/01/2020 to 20/01/2020
So in this case I would expect to get 2 objects, both containing the same data:
TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020
A more complex example:
TS1: 01/01/2020 to 10/01/2020
TS2: 08/01/2020 to 20/01/2020
TS3: 18/01/2020 to 22/01/2020
So in this case I would expect to get 4 objects, two sets of two containing the same data:
TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020
TSC: 18/01/2020 to 20/01/2020
TSD: 18/01/2020 to 20/01/2020
One more example:
TS1: 01/01/2020 to 01/10/2020
TS2: 01/02/2020 to 01/09/2020
TS3: 01/03/2020 to 01/04/2020
So in this case I would expect to get 3 objects, all containing the same data:
TSA: 01/03/2020 to 01/04/2020
TSB: 01/03/2020 to 01/04/2020
TSC: 01/03/2020 to 01/04/2020
I've tried researching an algorithm online, but without any luck finding exactly what I want, or the answers are SQL based.
Any suggestions would be very welcome.
Edit:
Just to explain what this is going to be used for so it might make it a bit clearer for some of the commenters below.
Each of these date ranges denotes a room which is in use. This system is meant to report back date ranges when there are no rooms available at all. As I already know the quantity of rooms, I can determine if there is any availability from these results and return the no-availability date ranges.
I've also edited the expected results after trying some of the answers below
The following algorithm calculates the result in O(n log(n)) in the common case, although it is still O(n^2) in the worst case.
First, a record class.
public class DateRange
{
public DateRange(DateTime from, DateTime to)
{
From = from;
To = to;
}
public DateTime From { get; set; }
public DateTime To { get; set; }
}
My algorithm is as follows. I added some comments to the algorithm, so I hope it is comprehensible. In principle, it exploits the fact that most ranges do (hopefully) not overlap with more than a few other ranges, by processing the input in sorted order, dropping older input entries from consideration once the current input has moved past their end time.
public static IEnumerable<DateRange> FindOverlaps(IList<DateRange> dateRanges)
{
if (dateRanges.Count < 2)
{
return Enumerable.Empty<DateRange>();
}
// Sort the input ranges by start time, in ascending order, to process them in that order.
var orderedRanges = dateRanges.OrderBy(x => x.From).ToList();
// Keep a list of previously processed values.
var previousRanges = new List<DateRange>
{
orderedRanges.First(),
};
var result = new List<DateRange>();
foreach (var value in orderedRanges.Skip(1))
{
var toDelete = new List<DateRange>();
// Go through all ranges that start before the current one, and pick those among
// them that end after the current one starts as result values, and also, delete all
// those that end before the current one starts from the list -- given that the input
// is sorted, they will never overlap with future input values.
foreach (var dateRange in previousRanges)
{
if (value.From >= dateRange.To)
{
toDelete.Add(dateRange);
}
else
{
result.Add(new DateRange(value.From, value.To < dateRange.To ? value.To : dateRange.To));
}
}
foreach (var candidate in toDelete)
{
previousRanges.Remove(candidate);
}
previousRanges.Add(value);
}
return result;
}
Note that it is possible that all n values in the input overlap. In this case, there are n*(n-1)/2 overlapping pairs, so the algorithm will necessarily run in O(n^2). However, in the well-formed case where each date range overlaps only a small number of other date ranges, the complexity will be roughly O(n log(n)), with the expensive operation being the .OrderBy() call on the input.
One more consideration. Consider you have a list of input values like so:
var example = new[]
{
new DateRange(new DateTime(2000, 1, 1), new DateTime(2010, 1, 10)),
new DateRange(new DateTime(2000, 2, 1), new DateTime(2000, 10, 10)),
new DateRange(new DateTime(2000, 3, 11), new DateTime(2000, 9, 12)),
new DateRange(new DateTime(2000, 4, 11), new DateTime(2000, 8, 12)),
};
In this case, not only do all the values overlap, they are also contained within one another. My algorithm as posted above will report such regions multiple times (for example, it will return the range from 2000-04-11 to 2000-08-12 three times, because it overlaps three other date ranges). In case you don't want overlapping regions to be reported multiple times like that, you can feed the output of the above function to the following function to filter them down:
public static IEnumerable<DateRange> MergeRanges(IList<DateRange> dateRanges)
{
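// Nothing to merge (for example when FindOverlaps found no overlaps).
if (dateRanges.Count == 0)
{
return Enumerable.Empty<DateRange>();
}
var currentOverlap = dateRanges.First();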
var r = new List<DateRange>();
foreach (var dateRange in dateRanges.Skip(1))
{
if (dateRange.From > currentOverlap.To)
{
r.Add(currentOverlap);
currentOverlap = dateRange;
}
else
{
currentOverlap.To = currentOverlap.To > dateRange.To ? currentOverlap.To : dateRange.To;
}
}
r.Add(currentOverlap);
return r;
}
This does not affect overall algorithmic complexity, as it's obviously O(n)-ish.
Let's assume you defined a type to store the date ranges like this:
public class DataObject
{
public DateTime From { get; set; }
public DateTime To { get; set; }
}
Then you can compare the items in your list to each other to determine if they overlap, and if so return the overlapping period of time (just to point you in the right direction, I did not thoroughly test this algorithm)
public DataObject[] GetOverlaps(DataObject[] objects)
{
var result = new List<DataObject>();
if (objects.Length > 1)
{
for (var i = 0; i < objects.Length - 1; i++)
{
var pivot = objects[i];
for (var j = i + 1; j < objects.Length; j++)
{
var other = objects[j];
// Do both ranges overlap?
if (pivot.From > other.To || pivot.To < other.From)
{
// No
continue;
}
result.Add(new DataObject
{
From = pivot.From >= other.From ? pivot.From : other.From,
To = pivot.To <= other.To ? pivot.To : other.To,
});
}
}
}
return result.ToArray();
}
Some of the requirements around nested ranges and other corner cases are not exactly clear. But here's another algorithm that seems to do what you're after. The usual caveats of limited testing apply of course: it may not work at all for corner cases I didn't test.
The algorithm relies on sorting the data first. If you can't do that then this won't work. The algorithm itself is as follows:
static IEnumerable<DateRange> ReduceToOverlapping(IEnumerable<DateRange> source)
{
if (!source.Any())
yield break;
Stack<DateRange> stack = new Stack<DateRange>();
foreach (var r in source)
{
while (stack.Count > 0 && r.Start > stack.Peek().End)
stack.Pop();
foreach (var left in stack)
{
if (left.GetOverlap(r) is DateRange overlap)
yield return overlap;
}
stack.Push(r);
}
}
The DateRange is a simple class to hold the dates you presented. It looks like this:
class DateRange
{
public DateRange(DateRange other)
{ this.Start = other.Start; this.End = other.End; }
public DateRange(DateTime start, DateTime end)
{ this.Start = start; this.End = end; }
public DateRange(string start, string end)
{
const string format = "dd/MM/yyyy";
this.Start = DateTime.ParseExact(start, format, CultureInfo.InvariantCulture);
this.End = DateTime.ParseExact(end, format, CultureInfo.InvariantCulture);
}
public DateTime Start { get; set; }
public DateTime End { get; set; }
public DateRange GetOverlap(DateRange next)
{
if (this.Start <= next.Start && this.End >= next.Start)
{
return new DateRange(next.Start, this.End < next.End ? this.End : next.End);
}
return null;
}
}
As I mentioned this is used by sorting it first. Example of sorting and calling the method on some test data is here:
static void Main(string[] _)
{
foreach (var (inputdata, expected) in TestData)
{
var sorted = inputdata.OrderBy(x => x.Start).ThenBy(x => x.End);
var reduced = ReduceToOverlapping(sorted).ToArray();
if (!Enumerable.SequenceEqual(reduced, expected, new CompareDateRange()))
throw new ArgumentException("failed to produce correct result");
}
Console.WriteLine("all results correct");
}
To test, you'll need an equality comparer and the test data, which are here:
class CompareDateRange : IEqualityComparer<DateRange>
{
public bool Equals([AllowNull] DateRange x, [AllowNull] DateRange y)
{
if (null == x && null == y)
return true;
if (null == x || null == y)
return false;
return x.Start == y.Start && x.End == y.End;
}
public int GetHashCode([DisallowNull] DateRange obj)
{
return obj.Start.GetHashCode() ^ obj.End.GetHashCode();
}
}
public static (DateRange[], DateRange[])[] TestData = new (DateRange[], DateRange[])[]
{
(new DateRange[]
{
new DateRange("01/01/2020", "18/01/2020"),
new DateRange("08/01/2020", "17/01/2020"),
new DateRange("09/01/2020", "15/01/2020"),
new DateRange("14/01/2020", "20/01/2020"),
},
new DateRange[]
{
new DateRange("08/01/2020", "17/01/2020"),
new DateRange("09/01/2020", "15/01/2020"),
new DateRange("09/01/2020", "15/01/2020"),
new DateRange("14/01/2020", "15/01/2020"),
new DateRange("14/01/2020", "17/01/2020"),
new DateRange("14/01/2020", "18/01/2020"),
}),
(new DateRange[]
{
new DateRange("01/01/2020", "10/01/2020"),
new DateRange("08/01/2020", "20/01/2020"),
},
new DateRange[]
{
new DateRange("08/01/2020", "10/01/2020"),
}),
(new DateRange[]
{
new DateRange("01/01/2020", "10/01/2020"),
new DateRange("08/01/2020", "20/01/2020"),
new DateRange("18/01/2020", "22/01/2020"),
},
new DateRange[]
{
new DateRange("08/01/2020", "10/01/2020"),
new DateRange("18/01/2020", "20/01/2020"),
}),
(new DateRange[]
{
new DateRange("01/01/2020", "18/01/2020"),
new DateRange("08/01/2020", "10/01/2020"),
new DateRange("18/01/2020", "22/01/2020"),
},
new DateRange[]
{
new DateRange("08/01/2020", "10/01/2020"),
new DateRange("18/01/2020", "18/01/2020"),
}),
};
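Given the goal stated in the question's edit (reporting the periods when no rooms are available), a different technique from the pairwise-overlap answers above may fit more directly: a sweep over start and end events that tracks how many bookings are active. The sketch below is an alternative under stated assumptions, not a drop-in replacement. `Availability`, `FullyBookedPeriods`, and the `roomCount` parameter are hypothetical names, and bookings are treated as half-open intervals, so a booking ending at an instant does not clash with one starting at the same instant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Availability
{
    // Returns the periods during which at least 'roomCount' bookings are active at once.
    // Assumption: bookings are half-open [From, To). Adjacent result periods are not merged.
    public static List<(DateTime From, DateTime To)> FullyBookedPeriods(
        IEnumerable<(DateTime From, DateTime To)> bookings, int roomCount)
    {
        if (roomCount < 1) throw new ArgumentOutOfRangeException(nameof(roomCount));

        // +1 when a booking starts, -1 when it ends. At the same instant,
        // ends (-1) sort before starts (+1), which gives the half-open behavior.
        var events = bookings
            .SelectMany(b => new[] { (Time: b.From, Delta: 1), (Time: b.To, Delta: -1) })
            .OrderBy(e => e.Time)
            .ThenBy(e => e.Delta)
            .ToList();

        var result = new List<(DateTime From, DateTime To)>();
        int active = 0;
        DateTime? fullSince = null;

        foreach (var e in events)
        {
            active += e.Delta;
            if (active >= roomCount && fullSince == null)
            {
                // All rooms have just become occupied.
                fullSince = e.Time;
            }
            else if (active < roomCount && fullSince != null)
            {
                // A room has just been freed; close the fully booked period.
                if (e.Time > fullSince.Value)
                    result.Add((fullSince.Value, e.Time));
                fullSince = null;
            }
        }
        return result;
    }
}
```

For the question's second example with two rooms, this yields 08/01/2020 to 10/01/2020 and 18/01/2020 to 20/01/2020, each reported once. The sort dominates, so the cost is O(n log(n)) regardless of how many ranges overlap.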

c# use one variable value to set a second from a fixed list

I'm parsing a CSV file in a C# .NET Windows Forms app, taking each line into a class I've created. However, I only need access to some of the columns AND the files being taken in are not standardized. That is to say, the number of fields present could be different and the columns could appear in any position.
CSV Example 1:
Position, LOCATION, TAG, NAME, STANDARD, EFFICIENCY, IN USE,,
1, AFT-D3, P-D3101A, EQUIPMENT 1, A, 3, TRUE
2, AFT-D3, P-D3103A, EQUIPMENT 2, B, 3, FALSE
3, AFT-D3, P-D2301A, EQUIPMENT 3, A, 3, TRUE
...
CSV Example 2:
Position, TAG, STANDARD, NAME, EFFICIENCY, LOCATION, BACKUP, TESTED,,
1, P-D3101A, A, EQUIPMENT 1, 3, AFT-D3, FALSE, TRUE
2, P-D3103A, A, EQUIPMENT 2, 3, AFT-D3, TRUE, FALSE
3, P-D2301A, A, EQUIPMENT 3, 3, AFT-D3, FALSE, TRUE
...
As you can see, I will never know the format of the file I have to analyse, the only thing I know for sure is that it will always contain the few columns that I need.
My solution to this was to ask the user to enter the required columns as strings, then, using their entry, convert each to a corresponding integer that I could then use as a location.
string standardInpt = "";
string nameInpt = "";
string efficiencyInpt = "";
user would then enter a value from A to ZZ.
int standardLocation = 0;
int nameLocation = 0;
int efficiencyLocation = 0;
when the form is submitted. the ints get their final value by running through an if else... statement:
if(standard == "A")
{
standardLocation = 0;
}
else if(standard == "B")
{
standardLocation = 1;
}
...
etc running all the way to if VAR1 == ZZ and then the code is repeated for VAR2 and for VAR3 etc..
My class would partially look like:
class Equipment
{
public string Standard { get; set;}
public string Name { get; set; }
public int Efficiency { get; set; }
static Equipment FromLine(string line)
{
var data = line.Split(',');
return new Equipment()
{
Standard = data[standardLocation],
Name = data[nameLocation],
Efficiency = int.Parse(data[efficiencyLocation]),
};
}
}
I've got more code in there but i think this highlights where I would use the variables to set the indexes.
I'm very new to this and I'm hoping there has got to be a significantly better way to achieve this without having to write so much potentially excessive, repetitive If Else logic. I'm thinking some kind of lookup table maybe, but i cant figure out how to implement this, any pointers on where i could look?
You could make it automatic by finding the indexes of the columns in the header, and then use them to read the values from the correct place from the rest of the lines:
class EquipmentParser {
public IList<Equipment> Parse(string[] input) {
var result = new List<Equipment>();
var header = input[0].Split(',').Select(t => t.Trim().ToLower()).ToList();
var standardPosition = GetIndexOf(header, "std", "standard", "st");
var namePosition = GetIndexOf(header, "name", "nm");
var efficiencyPosition = GetIndexOf(header, "efficiency", "eff");
foreach (var s in input.Skip(1)) {
var line = s.Split(',');
result.Add(new Equipment {
Standard = line[standardPosition].Trim(),
Name = line[namePosition].Trim(),
Efficiency = int.Parse(line[efficiencyPosition])
});
}
return result;
}
private int GetIndexOf(IList<string> input, params string[] needles) {
return Array.FindIndex(input.ToArray(), needles.Contains);
}
}
You can use reflection and attributes.
List the possible header names for each property, comma-separated, in a DisplayName attribute.
First call GetIndexes with the csv header string as parameter to get the mapping dictionary of class properties and csv fields.
Then call FromLine with each line and the mapping dictionary you just got.
class Equipment
{
[DisplayName("STND, STANDARD, ST")]
public string Standard { get; set; }
[DisplayName("NAME")]
public string Name { get; set; }
[DisplayName("EFFICIENCY, EFFI")]
public int Efficiency { get; set; }
// You can add any other property
public static Equipment FromLine(string line, Dictionary<PropertyInfo, int> map)
{
var data = line.Split(',').Select(t => t.Trim()).ToArray();
var ret = new Equipment();
Type type = typeof(Equipment);
foreach (PropertyInfo property in type.GetProperties())
{
int index = map[property];
property.SetValue(ret, Convert.ChangeType(data[index],
property.PropertyType));
}
return ret;
}
public static Dictionary<PropertyInfo, int> GetIndexes(string headers)
{
var headerArray = headers.Split(',').Select(t => t.Trim()).ToArray();
Type type = typeof(Equipment);
var ret = new Dictionary<PropertyInfo, int>();
foreach (PropertyInfo property in type.GetProperties())
{
var fieldNames = property.GetCustomAttribute<DisplayNameAttribute>()
.DisplayName.Split(',').Select(t => t.Trim()).ToArray();
for (int i = 0; i < headerArray.Length; ++i)
{
if (!fieldNames.Contains(headerArray[i], StringComparer.OrdinalIgnoreCase)) continue; // match headers regardless of case
ret[property] = i;
break;
}
}
return ret;
}
}
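To show how the two calls fit together, here is a minimal, self-contained sketch of the same idea. The names EquipmentRow and CsvAliasMapper are made up for this illustration and are not part of the answer above; the sketch also assumes that header matching should ignore case and that properties without a matching column should simply be skipped.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical row type; the aliases mirror the ones used in the answer above.
public class EquipmentRow
{
    [DisplayName("STND, STANDARD, ST")]
    public string Standard { get; set; }

    [DisplayName("NAME")]
    public string Name { get; set; }

    [DisplayName("EFFICIENCY, EFFI")]
    public int Efficiency { get; set; }
}

// Hypothetical helper, generic over the row type.
public static class CsvAliasMapper
{
    // Maps each property that carries a DisplayName alias list to the index
    // of the first header column matching one of its aliases.
    public static Dictionary<PropertyInfo, int> GetIndexes<T>(string headerLine)
    {
        var headers = headerLine.Split(',').Select(h => h.Trim()).ToArray();
        var map = new Dictionary<PropertyInfo, int>();
        foreach (var property in typeof(T).GetProperties())
        {
            var attribute = property.GetCustomAttribute<DisplayNameAttribute>();
            if (attribute == null) continue; // no aliases: leave unmapped

            var aliases = attribute.DisplayName.Split(',').Select(a => a.Trim()).ToArray();
            int index = Array.FindIndex(
                headers, h => aliases.Contains(h, StringComparer.OrdinalIgnoreCase));
            if (index >= 0) map[property] = index; // unmatched columns are skipped
        }
        return map;
    }

    // Builds one object from a data line, using the map from GetIndexes.
    public static T FromLine<T>(string line, Dictionary<PropertyInfo, int> map)
        where T : new()
    {
        var cells = line.Split(',').Select(c => c.Trim()).ToArray();
        var item = new T();
        foreach (var pair in map)
        {
            object value = Convert.ChangeType(
                cells[pair.Value], pair.Key.PropertyType, CultureInfo.InvariantCulture);
            pair.Key.SetValue(item, value);
        }
        return item;
    }
}
```

Usage would look like var map = CsvAliasMapper.GetIndexes<EquipmentRow>("Name, Effi, St"); followed by CsvAliasMapper.FromLine<EquipmentRow>("Pump, 87, ISO", map) for each data line. Building the map once and reusing it keeps the reflection cost out of the per-line loop.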
Try this, if it helps:
public int GetIndex(string input)
{
input = input.ToUpper();
char low = input[input.Length - 1];
char? high = input.Length == 2 ? input[0] : (char?)null;
int indexLow = low - 'A';
int? indexHigh = high.HasValue ? high.Value - 'A' : (int?)null;
return (indexHigh.HasValue ? (indexHigh.Value + 1) * 26 : 0) + indexLow;
}
You can use the letter's ASCII code for that, so there is no need to add another if/else branch every time.
For example:
byte[] ASCIIValues = Encoding.ASCII.GetBytes(standard);
standardLocation = ASCIIValues[0]-65;
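Both snippets above rely on the letter's offset from 'A'. Assuming the input is a spreadsheet-style column label (an assumption, since the snippets only handle one or two letters), the same arithmetic can be extended to any length by treating the label as a base-26 number. ColumnIndex and FromLetters are hypothetical names used only for this sketch.

```csharp
using System;

// Hypothetical helper: converts a column label to a zero-based index.
public static class ColumnIndex
{
    // "A" -> 0, "Z" -> 25, "AA" -> 26, "AB" -> 27, "ZZ" -> 701
    public static int FromLetters(string letters)
    {
        if (string.IsNullOrEmpty(letters))
            throw new ArgumentException("Label must not be empty.", nameof(letters));

        int result = 0;
        foreach (char raw in letters)
        {
            char c = char.ToUpperInvariant(raw);
            if (c < 'A' || c > 'Z')
                throw new ArgumentException("Label may only contain A-Z.", nameof(letters));

            // Each letter is a digit 1..26; shift the previous digits by one place.
            result = result * 26 + (c - 'A' + 1);
        }
        return result - 1; // convert from 1-based to 0-based
    }
}
```

For one- and two-letter labels this returns the same values as GetIndex above, for example 26 for "AA".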

How to count occurrences of a number stored in a file containing multiple delimiters?

This is my input, stored in a file:
50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4
Now I want to take each proportion in turn, starting with the first row (P:4, then P:00, and so on), and count its occurrences in every other row, so the expected output would be:
P:4 3 (found in 2nd row, 3rd row, 4th row (last cell))
P:00 2 (found in 2nd row, 4th row)
P:1 0 (no occurrences)
P:10 1
P:16 0
etc.
Likewise, I would like to print the occurrence count of each and every proportion.
So far I have succeeded in splitting the file row by row and storing each row in an object of my class, like this:
public class Planets
{
//My rest fields
public string ProportionConcat { get; set; }
public List<proportion> proportion { get; set; }
}
public class proportion
{
public int Number { get; set; }
}
I have already filled my planet objects as shown below, and my final list of planet objects looks like this:
List<Planets> Planets = new List<Planets>();
Planets[0]:
{
Number:50
name: Carbon
object:Mercury
ProportionConcat:P:4;P:00;P:1
proportion[0]:
{
Number:4
},
proportion[1]:
{
Number:00
},
proportion[2]:
{
Number:1
}
}
Etc...
I know I can loop through the list and search and count, but that would require 2 to 3 loops and the code would be a little messy, so I am looking for a cleaner way to do this.
How do I search for each proportion and count its occurrences in the other rows of my planet list?
Well, if you have already parsed the proportions, you can create a new class for the output data:
// Class that stores the result for one proportion
public class Values
{
public int Count; // number of times the proportion occurs
public readonly HashSet<int> Rows = new HashSet<int>(); // row numbers where the proportion occurs
/// <summary>Records one occurrence of the proportion.</summary>
/// <param name="rowNumber">Number of the row where the proportion occurs</param>
public void Increment(int rowNumber)
{
++Count; // count this occurrence
Rows.Add(rowNumber); // remember the row it was found in
}
}
And use this code to fill it. I'm not sure it's "messy", and I don't see the need to complicate the code with LINQ. What do you think about it?
// Dictionary that stores the results: the key is the proportion, the value holds
// how often the proportion occurs and the rows where it occurs.
var result = new Dictionary<int, Values>();
for (var i = 0; i < Planets.Count; i++) // for instead of foreach, because i doubles as the row number
{
var planet = Planets[i];
foreach (var proportion in planet.proportion)
{
if (!result.ContainsKey(proportion.Number)) // first time we see this proportion
result.Add(proportion.Number, new Values()); // create its result entry
result[proportion.Number].Increment(i); // count the occurrence and record the row number
}
}
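For comparison only, the loop above can also be written as a single LINQ pipeline. This is a sketch under assumptions: ProportionStats and Summarize are hypothetical names, and the input is simplified to one sequence of parsed numbers per row (in the question that would be something like Planets.Select(p => p.proportion.Select(x => x.Number))). Like the loop, it counts occurrences across all rows and records zero-based row numbers.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; not part of the answer above.
public static class ProportionStats
{
    // rows: one sequence of parsed proportion numbers per input row.
    // Returns, per number, how often it occurs and in which rows.
    public static Dictionary<int, (int Count, List<int> Rows)> Summarize(
        IEnumerable<IEnumerable<int>> rows)
    {
        return rows
            // Pair every number with the index of the row it came from.
            .SelectMany((row, rowIndex) =>
                row.Select(number => (Number: number, Row: rowIndex)))
            .GroupBy(entry => entry.Number)
            .ToDictionary(
                group => group.Key,
                group => (Count: group.Count(),
                          Rows: group.Select(entry => entry.Row).Distinct().ToList()));
    }
}
```

Whether this reads better than the explicit loop is a matter of taste; both make one pass over the data.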
You can use var count = Regex.Matches(lineString, input).Count; to count the matches in each line. Try this example:
var list = new List<string>
{
"50|Carbon|Mercury|P:4;P:00;P:1",
"90|Oxygen|Mars|P:10;P:4;P:00",
"90|Serium|Jupiter|P:4;P:16;P:10",
"85|Hydrogen|Saturn|P:00;P:10;P:4"
};
int totalCount;
var result = CountWords(list, "P:4", out totalCount);
Console.WriteLine("Total Found: {0}", totalCount);
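foreach (var foundWords in result)
{
// Print the fields explicitly; passing the object itself would only print its type name.
Console.WriteLine("{0}: {1}", foundWords.LineNumber, foundWords.Found);
}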
public class FoundWords
{
public string LineNumber { get; set; }
public int Found { get; set; }
}
private List<FoundWords> CountWords(List<string> words, string input, out int total)
{
total = 0;
int[] index = {0};
var result = new List<FoundWords>();
foreach (var f in words.Select(word => new FoundWords {Found = Regex.Matches(word, Regex.Escape(input)).Count, LineNumber = "Line Number: " + (index[0] + 1)}))
{
result.Add(f);
total += f.Found;
index[0]++;
}
return result;
}
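One caveat with a plain Regex.Matches(line, input) count: a search for "P:1" also matches the start of "P:10" and "P:16". A minimal sketch of an exact-token count follows, assuming tokens are bounded by anything that is not an ASCII letter or digit (such as | and ; in the sample data). TokenCounter and CountExact are hypothetical names.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the answer above.
public static class TokenCounter
{
    // Counts occurrences of token in line, rejecting matches that are only
    // part of a longer token (e.g. "P:1" inside "P:10").
    public static int CountExact(string line, string token)
    {
        // Regex.Escape treats the token as literal text; the lookarounds
        // require that no letter or digit touches the match on either side.
        string pattern =
            @"(?<![A-Za-z0-9])" + Regex.Escape(token) + @"(?![A-Za-z0-9])";
        return Regex.Matches(line, pattern).Count;
    }
}
```

With the sample line "90|Serium|Jupiter|P:4;P:16;P:10", CountExact(line, "P:1") returns 0, whereas the unanchored pattern would report 2.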
I made a DotNetFiddle for you here: https://dotnetfiddle.net/z9QwmD
string raw =
@"50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4";
string[] splits = raw.Split(
new string[] { "|", ";", "\r\n", "\n" }, // "\r\n" first, so Windows line endings leave no stray '\r'
StringSplitOptions.None
);
foreach (string p in splits.Where(s => s.ToUpper().StartsWith(("P:"))).Distinct())
{
Console.WriteLine(
string.Format("{0} - {1}",
p,
splits.Count(s => s.ToUpper() == p.ToUpper())
)
);
}
Basically, you can use .Split to split on multiple delimiters at once, it's pretty straightforward. After that, everything is gravy :).
Obviously my code simply outputs the results to the console, but that part is fairly easy to change. Let me know if there's anything you didn't understand.
