Testing delays in Rx - c#

I'm trying to work out how one could go about testing the following function, which adds monitoring around the internal queue of Observable.ObserveOn.
public IObservable<T> MonitorBuffer<T>(IObservable<T> source, Action<int> monitor, TimeSpan interval, IScheduler scheduler)
{
return Observable.Create<T>(ob =>
{
int count = 0;
return new CompositeDisposable(source
.Do(_ => Interlocked.Increment(ref count))
.ObserveOn(scheduler)
.Do(_ => Interlocked.Decrement(ref count))
.Subscribe(ob),
Observable.Interval(interval, scheduler).Select(_ => count).DistinctUntilChanged().Subscribe(monitor)
);
});
}
I envisage something like this:
var ts = new TestScheduler();
var input = Enumerable.Range(1, 8).Select(i => OnNext(i * 10, i)).ToArray();
var hot = ts.CreateHotObservable(input);
var observer = ts.CreateObserver<int>();
var log = new Subject<int>();
var monitor = ts.CreateObserver<int>();
var ticks = TimeSpan.FromTicks(5);
var buffered = MonitorBuffer(hot, log.OnNext, ticks, ts);
log.Subscribe(monitor);
buffered.Do(x => { /*if(x == 3) Introduce delay here */}).Subscribe(observer);
ts.AdvanceTo(100);
observer.Messages.AssertEqual(...);
monitor.Messages.AssertEqual(...);
The question is: what can I put in the Do to get the desired effect of a temporary downstream delay?
I'm looking for results something like this:
//time: 0--------10--------20--------30--------40--------50--------60--------70--------
//source: ---------1---------2---------3---------4---------5---------6---------7---------
//output: ---------1---------2-----------------------------345-------6---------7---------
//log: ----0-------------------------1---------2---------2----0-----------------------
(NB: I asked a similar question a while ago, but it wasn't very clear, and it's a bit late for a complete rewrite now).

I think I've nailed it...
The secret is to have two schedulers which can be advanced independently.
Building on the test code in the question:
var inputscheduler = new TestScheduler();
(...)
//different scheduler for buffer/observeOn
var bufferScheduler = new TestScheduler();
var buffered = MonitorBuffer(hot, log.OnNext, ticks, bufferScheduler);
log.Subscribe(monitor);
buffered.Subscribe(observer);
//instead of inserting something downstream, use scheduler advances
for (int i = 3; i < 80; i++)
{
inputscheduler.AdvanceTo(i);
if (i < 25 || i > 45) bufferScheduler.AdvanceTo(i);
}
observer.Messages.AssertEqual(...);
monitor.Messages.AssertEqual(...);
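To illustrate the idea in isolation, here is a minimal sketch (not the original test) showing that items pushed under one TestScheduler stay queued inside ObserveOn until the second scheduler is advanced. It assumes the System.Reactive and Microsoft.Reactive.Testing packages; the names source and received are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using Microsoft.Reactive.Testing;

// Two virtual-time schedulers: one drives the source, the other drives ObserveOn.
var inputScheduler = new TestScheduler();
var bufferScheduler = new TestScheduler();

var source = new Subject<int>();   // hypothetical stand-in for the hot observable
var received = new List<int>();    // what the downstream observer has seen so far

using var subscription = source
    .ObserveOn(bufferScheduler)
    .Subscribe(x => received.Add(x));

// Produce three items at ticks 10, 20 and 30 of the input clock.
for (var i = 1; i <= 3; i++)
{
    var value = i;
    inputScheduler.Schedule(TimeSpan.FromTicks(i * 10), () => source.OnNext(value));
}

// Advance only the input clock: the items are produced, but ObserveOn
// cannot hand them over because its scheduler has not moved.
inputScheduler.AdvanceTo(30);
Console.WriteLine($"Delivered while the buffer clock is frozen: {received.Count}");

// Let the buffer scheduler run its queued work, which drains the ObserveOn queue.
bufferScheduler.Start();
Console.WriteLine($"Delivered after the buffer clock runs: {received.Count}");
```

Freezing bufferScheduler for a range of ticks plays the role of the "temporary downstream delay" asked about in the question.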

Related

C# not waiting for all Tasks to be performed

I'm trying to execute multiple requests at the same time against a Pi digits API. The main problem is that, despite the 'Task.WhenAll(ExecuteRequests()).Wait();' line, not all tasks appear to complete. It should execute 50 requests and add their results to the pi Dictionary, but after the code runs the dictionary has only about 44~46 items.
I tried adding a check for available ThreadPool threads, so I could guarantee there are enough threads, but nothing changed.
The other problem is that sometimes when I run the code, I get an error saying I'm trying to add an already-added key to the dictionary. Wasn't the lock statement supposed to guarantee this error doesn't occur?
const int TotalRequests = 50;
static int requestsCount = 0;
static Dictionary<int, string> pi = new();
static readonly object lockState = new();
static void Main(string[] args)
{
var timer = new Stopwatch();
timer.Start();
Task.WhenAll(ExecuteRequests()).Wait();
timer.Stop();
foreach (var item in pi.OrderBy(x => x.Key))
Console.Write(item.Value);
Console.WriteLine($"\n\n{timer.ElapsedMilliseconds}ms");
Console.WriteLine($"\n{pi.Count} items");
}
static List<Task> ExecuteRequests()
{
var tasks = new List<Task>();
for (int i = 0; i < TotalRequests; i++)
{
ThreadPool.GetAvailableThreads(out int workerThreads, out int completionPortThreads);
while (workerThreads < 1)
{
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
Thread.Sleep(100);
}
tasks.Add(Task.Run(async () =>
{
var currentRequestId = 0;
lock (lockState)
currentRequestId = requestsCount++;
var httpClient = new HttpClient();
var result = await httpClient.GetAsync($"https://api.pi.delivery/v1/pi?start={currentRequestId * 1000}&numberOfDigits=1000");
if (result.StatusCode == System.Net.HttpStatusCode.OK)
{
var json = await result.Content.ReadAsStringAsync();
var content = JsonSerializer.Deserialize<JsonObject>(json)!["content"]!.ToString();
//var content = (await JsonSerializer.DeserializeAsync<JsonObject>(new MemoryStream(System.Text.Encoding.UTF8.GetBytes(json)!)!)!)!["content"]!.ToString();
pi.Add(currentRequestId, content);
}
}));
}
return tasks;
}
There's only one problem: you protected just one of the two places in your code that have threading issues:
lock (lockState)
currentRequestId = requestsCount++;
But there's another one:
pi.Add(currentRequestId, content);
The problem comes from how Dictionary is designed: it supports many concurrent readers, but only one writer at a time and only while nobody else is reading. That is why you saw the exception; if you add a try/catch, you will see an AggregateException, which in code like this usually points to a threading issue. So you need to do this:
lock (lockState)
pi.Add(currentRequestId, content);
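As an alternative to locking every access, a thread-safe collection can be used. The following is a minimal sketch, not the original code: Task.Delay stands in for the HTTP call, and the "chunk-N" strings are made-up placeholder data. It also takes the request id from the loop index rather than from a shared counter, so no lock is needed there either.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// ConcurrentDictionary allows concurrent writers without an explicit lock.
var results = new ConcurrentDictionary<int, string>();

var tasks = Enumerable.Range(0, 50)
    .Select(id => Task.Run(async () =>
    {
        // Placeholder for the real HTTP request (assumption for this sketch).
        await Task.Delay(10);

        // TryAdd is atomic; it returns false instead of throwing on a duplicate key.
        results.TryAdd(id, $"chunk-{id}");
    }))
    .ToArray();

// Await instead of blocking with .Wait().
await Task.WhenAll(tasks);

Console.WriteLine($"{results.Count} items");
```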
I put a lock statement around the dictionary manipulation, as @AlexeiLevenkov mentioned, and it worked fine.
tasks.Add(Task.Run(async () =>
{
var currentRequestId = 0;
lock (lockState)
currentRequestId = requestsCount++;
var httpClient = new HttpClient();
var result = await httpClient.GetAsync($"https://api.pi.delivery/v1/pi?start={currentRequestId * 1000}&numberOfDigits=1000");
if (result.StatusCode == System.Net.HttpStatusCode.OK)
{
var json = await result.Content.ReadAsStringAsync();
var content = JsonSerializer.Deserialize<JsonObject>(json)!["content"]!.ToString();
//var content = (await JsonSerializer.DeserializeAsync<JsonObject>(new MemoryStream(System.Text.Encoding.UTF8.GetBytes(json)!)!)!)!["content"]!.ToString();
lock (lockState)
pi.Add(currentRequestId, content);
}
}));
I'm not directly answering the question, just suggesting that you can use Microsoft's Reactive Framework (aka Rx) - NuGet System.Reactive and add using System.Reactive.Linq; - then you can do this:
static void Main(string[] args)
{
var timer = new Stopwatch();
timer.Start();
(int currentRequestId, string content)[] results = ExecuteRequests(50).ToArray().Wait();
timer.Stop();
foreach (var item in results.OrderBy(x => x.currentRequestId))
Console.Write(item.content);
Console.WriteLine($"\n\n{timer.ElapsedMilliseconds}ms");
Console.WriteLine($"\n{results.Count()} items");
}
static IObservable<(int currentRequestId, string content)> ExecuteRequests(int totalRequests) =>
Observable
.Defer(() =>
from currentRequestId in Observable.Range(0, totalRequests)
from content in Observable.Using(() => new HttpClient(), hc =>
from result in Observable.FromAsync(() => hc.GetAsync($"https://api.pi.delivery/v1/pi?start={currentRequestId * 1000}&numberOfDigits=1000"))
where result.StatusCode == System.Net.HttpStatusCode.OK
from json in Observable.FromAsync(() => result.Content.ReadAsStringAsync())
select JsonSerializer.Deserialize<JsonObject>(json)!["content"]!.ToString())
select (currentRequestId, content));

Semaphore slim to handle throttling per time period

I have a requirement from a client, to call their API, however, due to the throttling limit, we can only make 100 API calls in a minute. I am using SemaphoreSlim to handle that, Here is my code.
async Task<List<IRestResponse>> GetAllResponses(List<string> locationApiCalls)
{
var semaphoreSlim = new SemaphoreSlim(initialCount: 100, maxCount: 100);
var failedResponses = new ConcurrentBag<IReadOnlyCollection<IRestResponse>>();
var passedResponses = new ConcurrentBag<IReadOnlyCollection<IRestResponse>>();
var tasks = locationApiCalls.Select(async locationApiCall =>
{
await semaphoreSlim.WaitAsync();
try
{
var response = await RestApi.GetResponseAsync(locationApiCall);
if (response.IsSuccessful)
{
passedResponses.Add((IReadOnlyCollection<IRestResponse>)response);
}
else
{
failedResponses.Add((IReadOnlyCollection<IRestResponse>)response);
}
}
finally
{
semaphoreSlim.Release();
}
});
await Task.WhenAll(tasks);
var passedResponsesList = passedResponses.SelectMany(x => x).ToList();
}
However this line
var passedResponsesList = passedResponses.SelectMany(x => x).ToList();
never gets executed, and I see lots of failedResponses as well. I guess I have to add a Task.Delay (for 1 minute) somewhere in the code as well.
You need to keep track of the time when each of the previous 100 requests was executed. In the sample implementation below, the ConcurrentQueue<TimeSpan> records the relative completion time of each of these previous 100 requests. By dequeuing the first (and hence earliest) time from this queue, you can check how much time has passed since 100 requests ago. If it's been less than a minute, then the next request needs to wait for the remainder of the minute before it can be executed.
async Task<List<IRestResponse>> GetAllResponses(List<string> locationApiCalls)
{
var semaphoreSlim = new SemaphoreSlim(initialCount: 100, maxCount: 100);
var total = 0;
var stopwatch = Stopwatch.StartNew();
var completionTimes = new ConcurrentQueue<TimeSpan>();
var failedResponses = new ConcurrentBag<IReadOnlyCollection<IRestResponse>>();
var passedResponses = new ConcurrentBag<IReadOnlyCollection<IRestResponse>>();
var tasks = locationApiCalls.Select(async locationApiCall =>
{
await semaphoreSlim.WaitAsync();
if (Interlocked.Increment(ref total) > 100)
{
completionTimes.TryDequeue(out var earliest);
var elapsed = stopwatch.Elapsed - earliest;
var delay = TimeSpan.FromSeconds(60) - elapsed;
if (delay > TimeSpan.Zero)
await Task.Delay(delay);
}
try
{
var response = await RestApi.GetResponseAsync(locationApiCall);
if (response.IsSuccessful)
{
passedResponses.Add((IReadOnlyCollection<IRestResponse>)response);
}
else
{
failedResponses.Add((IReadOnlyCollection<IRestResponse>)response);
}
}
finally
{
completionTimes.Enqueue(stopwatch.Elapsed);
semaphoreSlim.Release();
}
});
await Task.WhenAll(tasks);
var passedResponsesList = passedResponses.SelectMany(x => x).ToList();
return passedResponsesList;
}
If you're calling this method from the UI thread of a WinForms or WPF application, remember to add ConfigureAwait(false) to its await statements.
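The same bookkeeping can be tried out at a smaller scale. Below is a rough sketch of the technique with a limit of 3 calls per second instead of 100 per minute; Task.Delay stands in for the real API call, and the names Limit, window and CallAsync are hypothetical. Unlike the code above, the wait is placed inside the try block so the semaphore is released even if the delay is interrupted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int Limit = 3;                        // assumed limit: 3 calls...
var window = TimeSpan.FromSeconds(1);       // ...per 1-second window
var semaphore = new SemaphoreSlim(Limit, Limit);
var stopwatch = Stopwatch.StartNew();
var completionTimes = new ConcurrentQueue<TimeSpan>();
var started = 0;

async Task CallAsync(int id)
{
    await semaphore.WaitAsync();
    try
    {
        // From call number Limit + 1 onward, look at when the call
        // "Limit calls ago" completed and wait out the rest of the window.
        if (Interlocked.Increment(ref started) > Limit
            && completionTimes.TryDequeue(out var earliest))
        {
            var delay = window - (stopwatch.Elapsed - earliest);
            if (delay > TimeSpan.Zero)
                await Task.Delay(delay);
        }

        await Task.Delay(50);               // placeholder for the real request
        Console.WriteLine($"call {id} finished at {stopwatch.Elapsed.TotalSeconds:F2}s");
    }
    finally
    {
        // Record the completion time before releasing the slot, so the next
        // caller that acquires the semaphore always finds an entry to dequeue.
        completionTimes.Enqueue(stopwatch.Elapsed);
        semaphore.Release();
    }
}

await Task.WhenAll(Enumerable.Range(1, 7).Select(id => CallAsync(id)));
```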

Observable wait until no more changes for time x and then notify [duplicate]

I'm using reactive extensions to collate data into buffers of 100ms:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100))
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
This works fine. However, I want slightly different behavior than that provided by the Buffer operation. Essentially, I want to reset the timer if another data item is received. Only when no data has been received for the entire 100ms do I want to handle it. This opens up the possibility of never handling the data, so I should also be able to specify a maximum count. I would imagine something along the lines of:
.SlidingBuffer(TimeSpan.FromMilliseconds(100), 10000)
I've had a look around and haven't been able to find anything like this in Rx? Can anyone confirm/deny this?
This is possible by combining the built-in Window and Throttle methods of Observable. First, let's solve the simpler problem where we ignore the maximum count condition:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay)
{
var closes = stream.Throttle(delay);
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
The powerful Window method did the heavy lifting. Now it's easy enough to see how to add a maximum count:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay, Int32? max=null)
{
var closes = stream.Throttle(delay);
if (max != null)
{
var overflows = stream.Where((x,index) => index+1>=max);
closes = closes.Merge(overflows);
}
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
I'll write a post explaining this on my blog. https://gist.github.com/2244036
Documentation for the Window method:
http://leecampbell.blogspot.co.uk/2011/03/rx-part-9join-window-buffer-and-group.html
http://enumeratethis.com/2011/07/26/financial-charts-reactive-extensions/
I wrote an extension to do most of what you're after - BufferWithInactivity.
Here it is:
public static IObservable<IEnumerable<T>> BufferWithInactivity<T>(
this IObservable<T> source,
TimeSpan inactivity,
int maximumBufferSize)
{
return Observable.Create<IEnumerable<T>>(o =>
{
var gate = new object();
var buffer = new List<T>();
var mutable = new SerialDisposable();
var subscription = (IDisposable)null;
var scheduler = Scheduler.ThreadPool;
Action dump = () =>
{
var bts = buffer.ToArray();
buffer = new List<T>();
if (o != null)
{
o.OnNext(bts);
}
};
Action dispose = () =>
{
if (subscription != null)
{
subscription.Dispose();
}
mutable.Dispose();
};
Action<Action<IObserver<IEnumerable<T>>>> onErrorOrCompleted =
onAction =>
{
lock (gate)
{
dispose();
dump();
if (o != null)
{
onAction(o);
}
}
};
Action<Exception> onError = ex =>
onErrorOrCompleted(x => x.OnError(ex));
Action onCompleted = () => onErrorOrCompleted(x => x.OnCompleted());
Action<T> onNext = t =>
{
lock (gate)
{
buffer.Add(t);
if (buffer.Count == maximumBufferSize)
{
dump();
mutable.Disposable = Disposable.Empty;
}
else
{
mutable.Disposable = scheduler.Schedule(inactivity, () =>
{
lock (gate)
{
dump();
}
});
}
}
};
subscription =
source
.ObserveOn(scheduler)
.Subscribe(onNext, onError, onCompleted);
return () =>
{
lock (gate)
{
o = null;
dispose();
}
};
});
}
With Rx 2.0, you can address both requirements with a new Buffer overload that accepts a timeout and a size:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100), 1)
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
See https://msdn.microsoft.com/en-us/library/hh229200(v=vs.103).aspx for the documentation.
I guess this can be implemented on top of Buffer method as shown below:
public static IObservable<IList<T>> SlidingBuffer<T>(this IObservable<T> obs, TimeSpan span, int max)
{
return Observable.CreateWithDisposable<IList<T>>(cl =>
{
var acc = new List<T>();
return obs.Buffer(span)
.Subscribe(next =>
{
if (next.Count == 0) //no activity in time span
{
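cl.OnNext(new List<T>(acc)); //emit a copy, because acc is cleared right after
acc.Clear();
}
else
{
acc.AddRange(next);
if (acc.Count >= max) //max items collected
{
cl.OnNext(new List<T>(acc)); //emit a copy, because acc is cleared right after
acc.Clear();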
}
}
}, err => cl.OnError(err), () => { cl.OnNext(acc); cl.OnCompleted(); });
});
}
NOTE: I haven't tested it, but I hope it gives you the idea.
Colonel Panic's solution is almost perfect. The only thing that is missing is a Publish component, in order to make the solution work with cold sequences too.
/// <summary>
/// Projects each element of an observable sequence into a buffer that's sent out
/// when either a given inactivity timespan has elapsed, or it's full,
/// using the specified scheduler to run timers.
/// </summary>
public static IObservable<IList<T>> BufferUntilInactive<T>(
this IObservable<T> source, TimeSpan dueTime, int maxCount,
IScheduler scheduler = default)
{
if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
scheduler ??= Scheduler.Default;
return source.Publish(published =>
{
var combinedBoundaries = Observable.Merge
(
published.Throttle(dueTime, scheduler),
published.Skip(maxCount - 1)
);
return published
.Window(() => combinedBoundaries)
.SelectMany(window => window.ToList());
});
}
Beyond adding the Publish, I've also replaced the original .Where((_, index) => index + 1 >= maxCount) with the equivalent but shorter .Skip(maxCount - 1). For completeness there is also an IScheduler parameter, which configures the scheduler where the timer is run.
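A usage sketch, under stated assumptions: it repeats the operator (with a mandatory scheduler parameter, to keep the example short) and drives it with a HistoricalScheduler so that the inactivity timer runs on virtual time. The subject, the 100 ms delay and the count of 3 are arbitrary choices for illustration. The first batch is expected to close because of inactivity, the second because it reaches the maximum count.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var scheduler = new HistoricalScheduler();   // virtual clock, advanced manually
var source = new Subject<int>();

using var subscription = source
    .BufferUntilInactive(TimeSpan.FromMilliseconds(100), 3, scheduler)
    .Subscribe(batch => Console.WriteLine($"[{string.Join(", ", batch)}]"));

// Two items, then a quiet period long enough for the Throttle timer to fire.
source.OnNext(1);
source.OnNext(2);
scheduler.AdvanceBy(TimeSpan.FromMilliseconds(100));

// Three items in a row with no time passing: the count boundary closes the window.
source.OnNext(3);
source.OnNext(4);
source.OnNext(5);

public static class BufferExtensions
{
    // Same shape as the operator above; the scheduler is required here.
    public static IObservable<IList<T>> BufferUntilInactive<T>(
        this IObservable<T> source, TimeSpan dueTime, int maxCount,
        IScheduler scheduler)
    {
        if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
        return source.Publish(published =>
        {
            var combinedBoundaries = Observable.Merge(
                published.Throttle(dueTime, scheduler),
                published.Skip(maxCount - 1));
            return published
                .Window(() => combinedBoundaries)
                .SelectMany(window => window.ToList());
        });
    }
}
```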

.NET ReactiveExtensions: Use Sample() with variable timespan

Given a high-frequency observable stream of data, I want to only emit an item every XX seconds.
This is usually done in RX by using .Sample(TimeSpan.FromSeconds(XX))
However... I want the time-interval to vary based on some property on the data.
Let's say my data is:
class Position
{
...
public int Speed;
}
If Speed is less than 100, I want to emit data every 5 seconds. If Speed is higher than 100, it should be every 2 seconds.
Is that possible with off-the-shelf Sample() or do I need to build something myself?
Here is a low level implementation, utilizing the System.Reactive.Concurrency.Scheduler.SchedulePeriodic extension method as a timer.
public static IObservable<TSource> Sample<TSource>(this IObservable<TSource> source,
Func<TSource, TimeSpan> intervalSelector, IScheduler scheduler = null)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (intervalSelector == null)
throw new ArgumentNullException(nameof(intervalSelector));
scheduler = scheduler ?? Scheduler.Default;
return Observable.Create<TSource>(observer =>
{
TimeSpan currentInterval = Timeout.InfiniteTimeSpan;
IDisposable timer = null;
TSource latestItem = default;
bool latestEmitted = true;
object locker = new object();
Action periodicAction = () =>
{
TSource itemToEmit;
lock (locker)
{
if (latestEmitted) return;
itemToEmit = latestItem;
latestItem = default;
latestEmitted = true;
}
observer.OnNext(itemToEmit);
};
return source.Subscribe(onNext: item =>
{
lock (locker)
{
latestItem = item;
latestEmitted = false;
}
var newInterval = intervalSelector(item);
if (newInterval != currentInterval)
{
timer?.Dispose();
timer = scheduler.SchedulePeriodic(newInterval, periodicAction);
currentInterval = newInterval;
}
}, onError: ex =>
{
timer?.Dispose();
observer.OnError(ex);
}, onCompleted: () =>
{
timer?.Dispose();
observer.OnCompleted();
});
});
}
Usage example:
observable.Sample(x => TimeSpan.FromSeconds(x.Speed < 100 ? 5.0 : 2.0));
The timer is restarted every time the intervalSelector callback returns a different interval. In the extreme case that the interval is changed with every new item, then this custom operator will behave more like the built-in Throttle than the built-in Sample.
Unlike Sample, Throttle's period is a sliding window. Each time Throttle receives a value, the window is reset. (citation)
Let me know if this works:
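var query =
source
.Publish(ss =>
ss
.Select(s => s.Speed < 100 ? 5.0 : 2.0)
//react each time the interval changes, not only the first time a value is seen
.DistinctUntilChanged()
.Select(x => ss.Sample(TimeSpan.FromSeconds(x)))
//flatten to the most recent sampled stream
.Switch());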

Nested Threads (Tasks) Timing Out Prematurely

I have the following code, what it does I don't believe is important, but I'm getting strange behavior.
When I run just the months on separate threads, it runs fine (as shown below), but when I multi-thread the years (uncomment the tasks), it times out every time. The timeout is set to 5 minutes for months and 20 minutes for years, yet it times out within a minute.
Is there a known reason for this behavior? Am I missing something simple?
public List<PotentialBillingYearItem> GeneratePotentialBillingByYear()
{
var years = new List<PotentialBillingYearItem>();
//var tasks = new List<Task>();
var startYear = new DateTime(DateTime.Today.Year - 10, 1, 1);
var range = new DateRange(startYear, DateTime.Today.LastDayOfMonth());
for (var i = range.Start; i <= range.End; i = i.AddYears(1))
{
var yearDate = i;
//tasks.Add(Task.Run(() =>
//{
years.Add(new PotentialBillingYearItem
{
Total = GeneratePotentialBillingMonths(new PotentialBillingParameters { Year = yearDate.Year }).Average(s => s.Total),
Date = yearDate
});
//}));
}
//Task.WaitAll(tasks.ToArray(), TimeSpan.FromMinutes(20));
return years;
}
public List<PotentialBillingItem> GeneratePotentialBillingMonths(PotentialBillingParameters Parameters)
{
var items = new List<PotentialBillingItem>();
var tasks = new List<Task>();
var year = new DateTime(Parameters.Year, 1, 1);
var range = new DateRange(year, year.LastDayOfYear());
range.Start = range.Start == range.End ? DateTime.Now.FirstDayOfYear() : range.Start.FirstDayOfMonth();
if (range.End > DateTime.Today) range.End = DateTime.Today.LastDayOfMonth();
for (var i = range.Start; i <= range.End; i = i.AddMonths(1))
{
var firstDayOfMonth = i;
var lastDayOfMonth = i.LastDayOfMonth();
var monthRange = new DateRange(firstDayOfMonth, lastDayOfMonth);
tasks.Add(Task.Run(() =>
{
using (var db = new AlbionConnection())
{
var invoices = GetInvoices(lastDayOfMonth);
var timeslipSets = GetTimeslipSets();
var item = new PotentialBillingItem
{
Date = firstDayOfMonth,
PostedInvoices = CalculateInvoiceTotals(invoices.Where(w => w.post_date <= lastDayOfMonth), monthRange),
UnpostedInvoices = CalculateInvoiceTotals(invoices.Where(w => w.post_date == null || w.post_date > lastDayOfMonth), monthRange),
OutstandingDrafts = CalculateOutstandingDraftTotals(timeslipSets)
};
items.Add(item);
}
}));
}
Task.WaitAll(tasks.ToArray(), TimeSpan.FromMinutes(5));
return items;
}
You might consider pre-allocating a larger number of thread pool threads. The thread pool is very slow to allocate new threads. The code below takes only 10 seconds (the theoretical minimum) to run when the minimum number of thread pool threads is set to 2.5k, but commenting out the SetMinThreads call makes it take over a minute and a half.
static void Main(string[] args)
{
ThreadPool.SetMinThreads(2500, 10);
Stopwatch sw = Stopwatch.StartNew();
RunTasksOutter(10);
sw.Stop();
Console.WriteLine($"Finished in {sw.Elapsed}");
}
public static void RunTasksOutter(int num) => Task.WaitAll(Enumerable.Range(0, num).Select(x => Task.Run(() => RunTasksInner(10))).ToArray());
public static void RunTasksInner(int num) => Task.WaitAll(Enumerable.Range(0, num).Select(x => Task.Run(() => Thread.Sleep(10000))).ToArray());
You could also be running out of threadpool threads. Per: https://msdn.microsoft.com/en-us/library/0ka9477y(v=vs.110).aspx one of the times to not use the threadpool (which is used by tasks) is:
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
Since IO is being done on these threads maybe consider replacing them with async code or starting them with the LongRunning option? https://msdn.microsoft.com/en-us/library/system.threading.tasks.taskcreationoptions(v=vs.110).aspx
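Both suggestions can be sketched as follows. This is an illustration only: Thread.Sleep and Task.Delay stand in for the blocking and asynchronous versions of the real database work, and the count of 20 is arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Option 1: mark blocking work as LongRunning, which hints to the default
// scheduler that the work should not occupy a regular thread pool worker.
var sw = Stopwatch.StartNew();
var blockingTasks = Enumerable.Range(0, 20)
    .Select(_ => Task.Factory.StartNew(
        () => Thread.Sleep(1000),            // placeholder for blocking I/O
        CancellationToken.None,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default))
    .ToArray();
Task.WaitAll(blockingTasks);
Console.WriteLine($"LongRunning tasks finished in {sw.Elapsed}");

// Option 2: use asynchronous I/O so that no thread is blocked while waiting.
sw.Restart();
await Task.WhenAll(Enumerable.Range(0, 20).Select(_ => Task.Delay(1000)));
Console.WriteLine($"Async tasks finished in {sw.Elapsed}");
```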
