Truncate/delete first x rows from a Frame - c#

I have a Frame<int, string> that contains OHLCV data. I'm calculating technical analysis indicators for that Frame, and since the first few records aren't accurate because they are at the very beginning, I have to remove them. How do I do that?
public override Frame<int, string> PopulateIndicators(Frame<int, string> dataFrame)
{
var candles = dataFrame.Rows.Select(kvp => new Candle
{
Timestamp = kvp.Value.GetAs<DateTime>("Timestamp"),
Open = kvp.Value.GetAs<decimal>("Open"),
High = kvp.Value.GetAs<decimal>("High"),
Low = kvp.Value.GetAs<decimal>("Low"),
Close = kvp.Value.GetAs<decimal>("Close"),
Volume = kvp.Value.GetAs<decimal>("Volume")
}).Observations.Select(e => e.Value).ToList<IOhlcv>();
// TODO: Truncate/remove the first 50 rows
dataFrame.AddColumn("Rsi", candles.Rsi(14));
}

Most operations in Deedle are expressed in terms of row keys rather than indices. The idea is that if you work with ordered data, you should have ordered row keys.
This means the operation is easier to do based on row keys. However, if you have an ordered row index, you can get the key at a certain location and then use it for filtering in Where. I would try something like:
// GetRowKeyAt is zero-based, so offset 50 is the first row to keep
var firstKey = dataFrame.GetRowKeyAt(50);
var after50 = dataFrame.Where(kvp => kvp.Key >= firstKey);


Searching dictionary for matching key from a textbox C#

I'm reading a csv file that contains abbreviations and the full version for example LOL,Laughing out Loud. I've created a dictionary where the abbreviation is the key and full version is the value.
private void btnFilter_Click(object sender, RoutedEventArgs e)
{
var keys = new List<string>();
var values = new List<string>();
using (var rd = new StreamReader("textwords.csv"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(',');
keys.Add(splits[0]);
values.Add(splits[1]);
}
}
var dictionary = keys.Zip(values, (k, v) => new { Key = k, Value = v }).ToDictionary(x => x.Key, x => x.Value);
foreach (var k in dictionary)
{
//
}
aMessage.MessageContent = txtContent.Text;
}
I'm using a button that will check if the textbox txtContent contains any of the abbreviations in the text and change this to the full version. So if the textbox contained the following. "Going to be AFK" after the button click it would change this to "Going to be away from keyboard".
I want to write a foreach loop that would check for any abbreviations, elongate them and then save this to a string variable MessageContent.
Would anyone be able to show me the best way to go about this as I'm not sure how to do this with the input from a textbox?
Thanks
You can just use LINQ to read and create a dictionary object:
var dictionary = File.ReadAllLines(@"textwords.csv")
    .Select(x => x.Split(",", StringSplitOptions.RemoveEmptyEntries))
    .ToDictionary(key => key.FirstOrDefault().Trim(),
                  value => value.Skip(1).FirstOrDefault().Trim());
If the abbreviations must match exactly, i.e. you don't need fuzzy matching, you can look a key up in the dictionary directly, which is quick:
if (dictionary.ContainsKey(myKey))
{
}
I did that freehand, so it might be dictionary.Keys.Contains() or similar. The point is that you can search the dictionary directly rather than loop over it.
If you need a fuzzier search, you can use LINQ to query collections of objects in very few lines.
As for your code, you would iterate over dictionary.Keys to search the keys, but I still don't see why you need the extra code.
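To get from a lookup to the replacement the question asks for, here is a minimal sketch. AbbreviationExpander and Expand are hypothetical names that do not appear in the original code, and the sketch assumes every abbreviation is a single whole word:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AbbreviationExpander
{
    // Returns the text with every whole word that has a dictionary entry
    // replaced by its full form; all other words are left unchanged.
    public static string Expand(string text, IReadOnlyDictionary<string, string> abbreviations)
    {
        // \b\w+\b matches one word at a time, so "AFK" is looked up as a key
        // while punctuation and whitespace are copied through untouched.
        return Regex.Replace(text, @"\b\w+\b", match =>
            abbreviations.TryGetValue(match.Value, out var full) ? full : match.Value);
    }
}
```

In the click handler this would be called as aMessage.MessageContent = AbbreviationExpander.Expand(txtContent.Text, dictionary);. If lower-case input such as afk should match too, one option is to build the dictionary with StringComparer.OrdinalIgnoreCase.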

Serially assign values to OrderedDictionary in C#

I have two key-value pairs, and now I want to fill up the larger one with values from the smaller one in a serial manner.
OrderedDictionary pickersPool = new OrderedDictionary(); // Small
OrderedDictionary pickersToTicketMap = new OrderedDictionary(); // Big
pickersPool.Add("emp1", 44);
pickersPool.Add("emp2", 543);
Now I need to update pickersToTicketMap to look like this:
("100", 44);
("109", 543);
("13", 44);
("23", 543);
So basically I need the pickersPool value to cycle through the keys of the pickersToTicketMap dictionary.
I need pickerPool values to keep cycling pickersToTicketMap and updating its value serially.
The pickersToTicketMap orderedlist initially has a value of:
("100", "null");
("109", "null");
("13", "null");
("23", "null");
so I need for the values of PickerPool orderedDictionary to fill up those nulls in a repeated fashion.
It sounds like you should start with a List<string> (or possibly a List<int>, given that they all seem to be integers...) rather than populating your map with empty entries to start with. So something like:
List<string> tickets = new List<string> { "100", "109", "13", "23" };
Then you can populate your pickersToTicketMap as:
var pickers = pickersPool.Values;
var pickerIterator = pickers.GetEnumerator();
foreach (var ticket in tickets)
{
if (!pickerIterator.MoveNext())
{
// Start the next picker...
pickerIterator = pickers.GetEnumerator();
if (!pickerIterator.MoveNext())
{
throw new InvalidOperationException("No pickers available!");
}
}
ticketToPickerMap[ticket] = pickerIterator.Current;
}
Note that I've changed the name from pickersToTicketMap to ticketToPickerMap because that appears to be what you really mean - the key is the ticket, and the value is the picker.
Also note that I'm not disposing of the iterator from pickers. That's generally a bad idea, but in this case I'm assuming that the iterator returned by OrderedDictionary.Values.GetEnumerator() doesn't need disposal.
This may be what you are looking for:
using System.Linq;
...
int i = 0;
// Cast OrderedDictionary to IEnumerable<DictionaryEntry> to be able to use System.Linq
object[] keys = pickersToTicketMap.Cast<DictionaryEntry>().Select(x => x.Key).ToArray();
IEnumerable<DictionaryEntry> pickersPoolEnumerable = pickersPool.Cast<DictionaryEntry>();
// iterate over all keys (in insertion order)
foreach (object key in keys)
{
    // Set the value of key to element i % pickersPool.Count
    // For Count = 2, i % pickersPool.Count yields
    // 0, 1, 0, 1, 0, ...
    pickersToTicketMap[key] = pickersPoolEnumerable
        .ElementAt(i % pickersPool.Count).Value;
    i++;
}
PS: The ToArray() is required to have a separate copy of the keys, so you don't get an InvalidOperationException from changing the collection you are iterating over.
So you want to update the large dictionary's values with consecutive and repeating values from the possibly smaller one? I have two approaches in mind, one simpler:
You can repeat the smaller collection with Enumerable.Repeat. You have to calculate the count. Then you can use SelectMany to flatten it and ToList to create a collection. Then you can use a for loop to update the larger dictionary with the values in the list via an index:
IEnumerable<int> values = pickersPool.Values.Cast<int>();
if (pickersPool.Count < pickersToTicketMap.Count)
{
// Repeat this collection until it has the same size as the larger collection
values = Enumerable.Repeat( values,
pickersToTicketMap.Count / pickersPool.Count
+ pickersToTicketMap.Count % pickersPool.Count
)
.SelectMany(intColl => intColl);
}
List<int> valueList = values.ToList();
// Stop at the larger dictionary's size: valueList may hold surplus values
for (int i = 0; i < pickersToTicketMap.Count; i++)
    pickersToTicketMap[i] = valueList[i];
I would prefer the above approach, because it's more readable than my second which uses an "infinite" sequence. This is the extension method:
public static IEnumerable<T> RepeatEndless<T>(this IEnumerable<T> sequence)
{
while (true)
foreach (var item in sequence)
yield return item;
}
Now you can use this code to update the larger dictionary's values:
var endlessPickersPool = pickersPool.Cast<DictionaryEntry>().RepeatEndless();
IEnumerator<DictionaryEntry> endlessEnumerator;
IEnumerator<string> ptmKeyEnumerator;
using ((endlessEnumerator = endlessPickersPool.GetEnumerator()) as IDisposable)
using ((ptmKeyEnumerator = pickersToTicketMap.Keys.Cast<string>().ToList().GetEnumerator()) as IDisposable)
{
while (endlessEnumerator.MoveNext() && ptmKeyEnumerator.MoveNext())
{
DictionaryEntry pickersPoolItem = (DictionaryEntry)endlessEnumerator.Current;
pickersToTicketMap[ptmKeyEnumerator.Current] = pickersPoolItem.Value;
}
}
Note that it's important that I use pickersToTicketMap.Keys.Cast<string>().ToList(), because I can't enumerate the original Keys collection directly. You get an exception if you change the dictionary during enumeration.
Thanks to @Jon Skeet, although he modified my objects too much while trying to provide a hack for this.
After looking at your solution, I implemented the following, which works well for all my objects.
var pickerIterator = pickerPool.GetEnumerator();
foreach (DictionaryEntry ticket in tickets)
{
if (!pickerIterator.MoveNext())
{
// Start the next picker...
pickerIterator = pickerPool.GetEnumerator();
if (!pickerIterator.MoveNext())
{
throw new InvalidOperationException("No pickers available!");
}
}
ticketToPickerMap[ticket.Key] = pickerIterator.Value.ToString();
}

Is this a DDD rule?

Ok so I have a database-table with 1000 rows.
From these I need to randomly extract 4 entries.
This is a business rule. I could easily do the random thing in LINQ or SQL. But my Domain project must be independent, not referencing any other project.
So I should have a list there, load it with all the 1000 rows and randomly extract 4 to be DDD-clean.
Is this ok? What if the db-table has 100k rows?
If the primary keys were sequential and not interrupted then that would yield large performance benefits for the 100k or beyond tables. Even if they are not sequential I believe you can just check for that and iterate lightly to find it.
Basically you are going to want to get a count of the table
var rowCount = db.Set<TableName>().Count(); // EF-style pseudocode
And then get 4 random numbers in that range
var rand = new Random();
var randIds = Enumerable.Range(0, 4).Select(i => rand.Next(0, rowCount)).ToArray();
And then iterate through them, getting the records by id (this could also be done with a single Contains query on the id; I was not sure which was faster, but with only 4 lookups Find should execute quickly).
var randomRecords = new List<TableName>();
foreach (int id in randIds)
{
    // the foreach variable is read-only, so probe with a local copy
    var candidateId = id;
    var match = db.Set<TableName>().Find(candidateId);
    // if the id is not present, then you can iterate a little to find it
    // or provide a custom column for the exact record as you indicate in comments
    while (match == null)
    {
        match = db.Set<TableName>().Find(++candidateId);
    }
    randomRecords.Add(match);
}
Building on Travis's code, this should work for you. It basically gets a count of records, generates 4 random numbers, then asks for the nth record in the table and adds it to the result list.
var rowCount = db.TableName.Count();
var rand = new Random();
var randIds = Enumerable.Range(0,4).Select(i => rand.Next(0,rowCount));
var randomRecords = new List<TableName>();
foreach(int id in randIds)
{
var match = db.TableName
.OrderBy(x=>x.id) // Order by the primary key -- I assumed id
.Skip(id).First();
randomRecords.Add(match);
}
You could also do something like this, if you have an auto-incrementing id field that is the primary key. The caveat is that the running time isn't fixed, since you can't know in advance how many loop iterations will be required:
var idMax = db.TableName.Max(t=>t.id);
var rand = new Random();
var randomRecords = new List<TableName>();
while(randomRecords.Count()<4)
{
var match = db.TableName.Find(rand.Next(0,idMax));
if(match!=null)
randomRecords.Add(match);
}
If you don't need uniform randomness, this is the fastest method, requiring only one database round trip. Be aware that it is far from uniformly random: a record that follows a gap in the ids is picked more often than the others.
var idMax = db.TableName.Max(t=>t.id);
var rand = new Random();
var randIds = Enumerable.Range(0,4).Select(i => rand.Next(1,idMax));
var query=db.TableName.Where(t=>false);
foreach(int id in randIds)
{
query=query.Concat(db.TableName.OrderBy(t=>t.id).Where(t=>t.id>=id).Take(1));
}
var randomRecords=query.ToList();
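None of these queries has to live in the Domain project. One common arrangement, sketched here with hypothetical names (Entry, IEntryRepository, InMemoryEntryRepository are all made up for illustration), is for the domain to declare an interface stating what it needs and for another project to implement it. The in-memory implementation below is only a stand-in so the sketch runs on its own; a database-backed implementation would run one of the queries above instead of loading every row:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Domain project: states the need and knows nothing about EF or SQL.
public sealed class Entry
{
    public int Id { get; set; }
}

public interface IEntryRepository
{
    // Returns up to 'count' distinct entries chosen at random.
    IReadOnlyList<Entry> GetRandom(int count);
}

// Outside the domain: a stand-in implementation over an in-memory list.
public sealed class InMemoryEntryRepository : IEntryRepository
{
    private readonly List<Entry> entries;
    private readonly Random random = new Random();

    public InMemoryEntryRepository(IEnumerable<Entry> entries)
    {
        this.entries = entries.ToList();
    }

    public IReadOnlyList<Entry> GetRandom(int count)
    {
        // Partial Fisher-Yates shuffle on a copy: after i steps, the first
        // i slots hold i distinct entries picked uniformly at random.
        var pool = new List<Entry>(entries);
        var take = Math.Min(count, pool.Count);
        for (var i = 0; i < take; i++)
        {
            var j = random.Next(i, pool.Count);
            (pool[i], pool[j]) = (pool[j], pool[i]);
        }
        return pool.GetRange(0, take);
    }
}
```

Domain code then depends only on IEntryRepository, so the business rule "pick 4 at random" stays in the domain while the way the rows are fetched can change with the table size.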

Optimizing array iteration with nested Where (LINQ) clauses

I am creating a (C#) tool that has a search functionality. The search is kind of similar to a "go to anywhere" search (like ReSharper has or VS2013).
The search context is a string array that contains all items up front:
private string[] context; // contains thousands of elements
Searching is incremental and occurs with every new input (character) the user provides.
I have implemented the search using the LINQ Where extension method:
// User searched for "c"
var input = "c";
var results = context.Where(s => s.Contains(input));
When the user searches for "ca", I attempted to use the previous results as the search context. However, this causes (I think?) a nested Where iteration and does not run very well. Think of something like this code:
// Cache these results.
var results = context.Where(s => s.Contains(input));
// Next search uses the previous search results
var newResults = results.Where(s => s.Contains(input));
Is there any way to optimize this scenario?
Converting the IEnumerable into an array with every search causes high memory allocations and runs poorly.
Presenting the user with thousands of search results is pretty useless. You should add a "top" (Take in LINQ) clause to your query before presenting the result to the user.
var results = context.Where(s => s.Contains(input)).Take(100);
And if you want to present the next 100 results to the user:
var results = context.Where(s => s.Contains(input)).Skip(100).Take(100);
Also, just use the original array for every search. A nested Where has no benefit unless you materialize the query, because a deferred query re-runs every predicate over the original array each time it is enumerated.
I have a couple of useful points to add, too many for a comment.
First off, I agree with the other comments that you should start with .Take(100) to decrease the load time. Even better, add one result at a time:
var results = context.Where(s => s.Contains(input));
var resultEnumerator = results.GetEnumerator();
Loop over the resultEnumerator to display results one at a time, and stop when the screen is full or a new search is initiated.
Second, throttle your input. If the user writes Hello, you do not want to shoot off 5 searches for H, He, Hel, Hell and Hello; you want to search for just Hello. When the user later adds world, it could be worthwhile to take your old result and add Hello world to the where clause.
results = results.Where(s => s.Contains(input));
resultEnumerator = results.GetEnumerator();
And of course, cancel the current in-progress search when the user adds new text.
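For the cancelling part, a rough sketch using CancellationTokenSource could look like the following. CancellableSearch and StartAsync are hypothetical names, and the sketch assumes StartAsync is always called from the same (UI) thread, since the current field is not synchronised:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class CancellableSearch
{
    private readonly string[] context;
    private CancellationTokenSource current;

    public CancellableSearch(string[] context)
    {
        this.context = context;
    }

    // Cancels the search that is still running (if any) and starts a new one.
    public Task<List<string>> StartAsync(string input, int maxResults)
    {
        current?.Cancel();
        var cts = new CancellationTokenSource();
        current = cts;
        var token = cts.Token;

        return Task.Run(() =>
        {
            var matches = new List<string>();
            foreach (var item in context)
            {
                // Abandon the scan as soon as a newer search has replaced this one.
                token.ThrowIfCancellationRequested();
                if (item.Contains(input))
                {
                    matches.Add(item);
                    if (matches.Count == maxResults)
                    {
                        break;
                    }
                }
            }
            return matches;
        }, token);
    }
}
```

A task that was replaced by a newer search ends up cancelled, so the caller should only display the result of the most recent call and ignore or catch the cancellation of earlier ones.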
Using Rx, the throttle part is easy, you would get something like this:
var result = context.AsEnumerable();
var oldStr = "";
var resultEnumerator = result.GetEnumerator();
Observable.FromEventPattern(h => txt.TextChanged += h, h => txt.TextChanged -= h)
.Select(s => txt.Text)
.DistinctUntilChanged().Throttle(TimeSpan.FromMilliseconds(300))
.Subscribe(s =>
{
if (s.Contains(oldStr))
result = result.Where(t => t.Contains(s));
else
result = context.Where(t => t.Contains(s));
resultEnumerator = result.GetEnumerator();
oldStr = s;
// and probably start iterating resultEnumerator again,
// but perhaps not on this thread.
});
If allocs are your concern and you don't want to write a trie implementation or use third party code, you should get away with partitioning your context array successively to clump matching entries together in the front. Not very LINQ-ish, but fast and has zero memory cost.
The partitioning extension method, based on C++'s std::partition
/// <summary>
/// All elements for which predicate is true are moved to the front of the array.
/// </summary>
/// <param name="start">Index to start with</param>
/// <param name="end">Index to end with</param>
/// <param name="predicate"></param>
/// <returns>Index of the first element for which predicate returns false</returns>
static int Partition<T>(this T[] array, int start, int end, Predicate<T> predicate)
{
while (start != end)
{
// move start to the first not-matching element
while ( predicate(array[start]) )
{
if ( ++start == end )
{
return start;
}
}
// move end to the last matching element
do
{
if (--end == start)
{
return start;
}
}
while (!predicate(array[end]));
// swap the two
var temp = array[start];
array[start] = array[end];
array[end] = temp;
++start;
}
return start;
}
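So now you need to store the last partition index, which should be initialised with the context length (do this in the constructor, since a field initialiser cannot refer to another instance field):
private int resultsCount; // set to context.Length in the constructor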
Then for each change in input that's incremental you can run:
resultsCount = context.Partition(0, resultsCount, s => s.Contains(input));
Each time this will only do the checks for elements that haven't been filtered out previously, which is exactly what you are after.
For each non-incremental change you'll need to reset resultsCount to the original value.
You can expose results in a convenient, debugger and LINQ friendly way:
public IEnumerable<string> Matches
{
get { return context.Take(resultsCount); }
}
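Putting the pieces together, the incremental and non-incremental cases might be handled as in the sketch below. IncrementalSearch, Update and lastInput are hypothetical names, the partition loop is a simplified variant of the one above rather than a copy of it, and the constructor copies the items once so that reordering does not disturb the caller's array:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class IncrementalSearch
{
    private readonly string[] context;
    private int resultsCount;
    private string lastInput = "";

    public IncrementalSearch(IEnumerable<string> items)
    {
        context = items.ToArray();
        resultsCount = context.Length;
    }

    // The current matches are always the first resultsCount elements.
    public IEnumerable<string> Matches => context.Take(resultsCount);

    public void Update(string input)
    {
        // If the new input does not contain the previous one, the change is
        // not incremental and every element becomes a candidate again.
        if (!input.Contains(lastInput))
        {
            resultsCount = context.Length;
        }
        resultsCount = Partition(context, 0, resultsCount, s => s.Contains(input));
        lastInput = input;
    }

    // Moves the elements in [start, end) that satisfy the predicate to the
    // front and returns the index of the first element that does not.
    private static int Partition(string[] array, int start, int end, Predicate<string> predicate)
    {
        while (start < end)
        {
            if (predicate(array[start]))
            {
                start++;
            }
            else
            {
                end--;
                (array[start], array[end]) = (array[end], array[start]);
            }
        }
        return start;
    }
}
```

Typing "c" and then "ca" only re-checks the elements that matched "c"; typing something unrelated afterwards resets the range to the whole array before partitioning.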

sorting List<string[]> by many columns

I have a List<string[]> which I would like to sort by several columns. For example, each string[] has 5 elements (5 columns) and the List has 10 elements (10 rows). I would like to sort by the 1st column first, then by the 3rd, and then by the 4th.
How could it be done in the easiest way with C#?
I thought about such algorithm:
Delete values corresponding to those columns that I don't want to use for sorting
Find for each of columns that are left, the longest string that can be used to store their value
Change each row to string, where each cell occupies as many characters as there is maximum number of characters for the value for the given column
Assign int with index for each of those string values
Sort these string values
Sort the real data, with help of already sorted indices
But I think this algorithm is very bad. Could you suggest me any better way, if possible, that uses already existing features of C# and .NET?
List<string[]> list = .....
// columns are zero-based: the 1st, 3rd and 4th columns are indices 0, 2 and 3
var newList = list.OrderBy(x => x[0]).ThenBy(x => x[2]).ThenBy(x => x[3]).ToList();
Something like this:
var rows = new List<string[]>();
var sortColumnIndex = 2;
rows.Sort((a, b) => a[sortColumnIndex].CompareTo(b[sortColumnIndex]));
This will perform an in-place sort -- that is, it will sort the contents of the list.
Sorting on multiple columns is possible, but requires more logic in your comparer delegate.
If you're happy to create another collection, you can use the Linq approach given in another answer.
EDIT: here's the multi-column, in-place sorting example:
var rows = new List<string[]>();
// zero-based indices of the 1st, 3rd and 4th columns
var sortColumnIndices = new[] { 0, 2, 3 };
rows.Sort((a, b) => {
    foreach (var index in sortColumnIndices)
    {
        // the first column that differs decides the order
        var result = a[index].CompareTo(b[index]);
        if (result != 0)
            return result;
    }
    return 0;
});
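If a new list is acceptable, the LINQ approach can also be driven by a list of column indices instead of a hard-coded chain. This is a sketch only: RowSorting and SortByColumns are hypothetical names, and the ordinal string comparison is my assumption, so swap in another comparer if culture-aware ordering is needed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowSorting
{
    // Returns a new list sorted by the given zero-based column indices,
    // with earlier indices taking precedence over later ones.
    public static List<string[]> SortByColumns(IEnumerable<string[]> rows, params int[] columns)
    {
        if (columns.Length == 0)
        {
            return rows.ToList();
        }

        IOrderedEnumerable<string[]> ordered =
            rows.OrderBy(row => row[columns[0]], StringComparer.Ordinal);

        foreach (var column in columns.Skip(1))
        {
            var index = column; // local copy captured by the lambda
            ordered = ordered.ThenBy(row => row[index], StringComparer.Ordinal);
        }

        return ordered.ToList();
    }
}
```

Sorting by the 1st, 3rd and 4th columns would then be RowSorting.SortByColumns(rows, 0, 2, 3).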
