Rx.Net: observe asynchronous events indefinitely - c#

I have a helper class with a method that saves text messages to the local file system. The method returns a Task, so it is asynchronous.
I want to observe every call to this method, so that I can continuously monitor the size and length of the buffer and make decisions based on that.
I am trying to implement this using the Reactive Extensions for .NET. However, I can't come up with a design that lets me continuously listen for messages being added to the buffer. Below is my current implementation:
public IObservable<Unit> Receive(InternalMessage message)
{
var observable = FileBuffer.BufferMessage(message.MessageId.ToString(), message, DateTime.UtcNow).ToObservable(); //This returns a Task, which I convert into an Observable
return observable;
}
Here is how I subscribe to the observable:
IObservable<Unit> receiverObservable = batchHandler.Receive(message);
receiverObservable.Subscribe(
x => Console.WriteLine("On next"),
ex => { /* TODO: handle the error */ },
() => { /* completed */ });
I want the subscriber to be called every time the method Receive is called. However, AFAIK, once this method is called, the observable completes and the sequence is terminated, so future calls to Receive won't be listened to.
Can someone recommend a way to use the Rx.Net libraries to implement this observable pattern that I am looking for, that is, how to keep the sequence open and feed it with results for async methods?

Receive, as you've coded it, returns IObservable<Unit>, representing the completion of a single task. What you want to subscribe to is an IObservable<IObservable<Unit>>, representing a stream of task completions.
There are a number of ways to do this, the best of which probably depends on how your class is set up and how you're calling it.
Here's the laziest one:
You declare a class-level variable subject that represents a stream of your calls:
Subject<IObservable<Unit>> subject = new Subject<IObservable<Unit>>();
subject.Merge().Subscribe(
x => Console.WriteLine("On next"),
ex => { }, //TODO
() => { } // Completed
);
Then when you have a new call, you just add it to the subject.
IObservable<Unit> receiverObservable = batchHandler.Receive(message);
subject.OnNext(receiverObservable);
The reason this is really lazy is that Rx is functional at its core, and functional style tends to look down on mutable state. Subjects are basically mutable state.
The better way to do it is to figure out when/why you're calling Receive, and structure that as an observable. Once that's done, you can work off of that:
IObservable<Unit> sourceReasonsToCallReceive; // Most likely sourced from event
sourceReasonsToCallReceive.SelectMany(_ => batchHandler.Receive(message))
.Subscribe(
x => Console.WriteLine("On next"),
ex => { }, //TODO
() => { } // Completed
);
Hope that helps.
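A minimal sketch of the second approach, assuming the System.Reactive package. The names ReceiveSketch, SaveAsync and SavedMessages are hypothetical; SaveAsync merely stands in for FileBuffer.BufferMessage, and a Subject<string> plays the role of whatever event actually produces the messages. The point is that the outer sequence stays open across many saves, while each save is wrapped with Observable.FromAsync and flattened with SelectMany:

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading.Tasks;

public static class ReceiveSketch
{
    // Hypothetical stand-in for FileBuffer.BufferMessage: pretends to save a message.
    public static async Task SaveAsync(string message)
    {
        await Task.Delay(10);
    }

    // Maps every incoming message to one "saved" notification.
    // The result only completes once the source completes and all saves have finished.
    public static IObservable<string> SavedMessages(IObservable<string> incoming)
    {
        return incoming.SelectMany(msg =>
            Observable.FromAsync(() => SaveAsync(msg)).Select(_ => msg));
    }

    public static async Task Main()
    {
        var incoming = new Subject<string>();   // assumption: messages arrive through a subject
        var saved = new List<string>();
        var done = new TaskCompletionSource<bool>();

        using (SavedMessages(incoming).Subscribe(
            msg => saved.Add(msg),              // called once per saved message
            ex => done.TrySetException(ex),
            () => done.TrySetResult(true)))
        {
            incoming.OnNext("first");
            incoming.OnNext("second");
            incoming.OnCompleted();             // only done here so the demo can end
            await done.Task;
        }

        Console.WriteLine(saved.Count + " messages saved");
    }
}
```

In a real application the subject would normally be replaced by an observable built from the actual message source, and OnCompleted would not be called until shutdown.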

Related

Observable timers disposing

I'm using the Reactive Extensions for .NET and I'm wondering about disposal. I know that in some cases it's good to end a subscription like this: .TakeUntil(Observable.Timer(TimeSpan.FromMinutes(x))).
First case
In this case, I have a timer that triggers after x seconds and then it completes and should be disposed.
public void ScheduleOrderCancellationIfNotFilled(string pair, long orderId, int waitSecondsBeforeCancel)
{
Observable.Timer(TimeSpan.FromSeconds(waitSecondsBeforeCancel))
.Do(e =>
{
var result = _client.Spot.Order.GetOrder(pair, orderId);
if (result.Success)
{
if (result.Data?.Status != OrderStatus.Filled)
{
_client.Spot.Order.CancelOrder(pair, orderId);
}
}
})
.Subscribe();
}
Second case
In this case, the timer fires after one second and then repeats every 29 minutes. It should live until its defining class is disposed. I believe this one should be disposed through an IDisposable implementation. How?
var keepAliveListenKey = Observable.Timer(TimeSpan.FromSeconds(1), TimeSpan.FromMinutes(29))
.Do(async e =>
{
await KeepAliveListenKeyAsync().ConfigureAwait(false);
})
.Subscribe();
Edit
I also want it to be using a Subject<T> which makes it easier to dispose and to reset the subscription.
For example, see "Reset and Dispose observable subscriber, Reactive Extensions" (@Enigmativity):
public class UploadDicomSet : ImportBaseSet
{
IDisposable subscription;
Subject<IObservable<long>> subject = new Subject<IObservable<long>>();
public UploadDicomSet()
{
subscription = subject.Switch().Subscribe(s => CheckUploadSetList(s));
subject.OnNext(Observable.Interval(TimeSpan.FromMinutes(2)));
}
void CheckUploadSetList(long interval)
{
subject.OnNext(Observable.Never<long>());
// Do other things
}
public void AddDicomFile(SharedLib.DicomFile dicomFile)
{
subject.OnNext(Observable.Interval(TimeSpan.FromMinutes(2)));
// Reset the subscription to go off in 2 minutes from now
// Do other things
}
}
In the first case the subscription is disposed automatically. This is actually a common way to get automatic subscription management, and it is a nice and elegant way to work with Rx.
In the second case you have over-engineered. Observable.Timer(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1)) is by itself sufficient to generate a sequence of ascending longs over time. Since this stream is endless by nature, you are right: explicit subscription management is required. So it is enough to have:
var sub = Observable.Timer(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1)).Subscribe();
...and call sub.Dispose() later.
P.S. Note that in your code you pass an async lambda to .Do. Most probably that is not what you want: .Do does not wait for the returned task, so the operation is fire-and-forget and its exceptions never reach the observable. Use SelectMany to ensure that the async operation is properly awaited and its exceptions are handled.
Answering your questions in the comments section:
What about disposing using Subject instead?
Well, there is nothing special about it. Subject implements both IObserver<> and IObservable<>, so it resembles classic .NET events (a list of callbacks to be called when something happens). It makes no difference with respect to your question and use case.
Could you give an example of .Do with exception handling?
Sure. The idea is to translate the async/await code, encapsulated in some Task<T>, into an IObservable<T> in a way that preserves both cancellation and error signals. For that, SelectMany must be used (the same idea as SelectMany in LINQ). So just change your .Do to .SelectMany.
Observable
.Timer(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1))
.SelectMany(_ => Observable.FromAsync(() => /* that's the point where your Task<> becomes Observable */ myTask))
I'm confused again. Do I need IObservable<IObservable> (Select) or IObservable (SelectMany)
Most probably you don't need Switch. Why? It exists mainly to avoid race conditions: whenever a new event is emitted, the operation started by the previous one (which may still be in progress) is guaranteed to be cancelled, i.e. unsubscribed. Without that, overlapping operations can (and will) corrupt your state.
SelectMany, by contrast, cancels nothing. It subscribes to the inner observable for every event as it arrives and merges the results, so every callback runs to completion, but callbacks may overlap if one takes longer than the timer period. If they must run strictly one after another, use Select followed by Concat instead.
Reactive Observable Subscription Disposal (@Enigmativity)
The disposable returned by the Subscribe extension methods is returned solely to allow you to manually unsubscribe from the observable before the observable naturally ends.
If the observable completes - with either OnCompleted or OnError - then the subscription is already disposed for you.
One important thing to note: the garbage collector never calls .Dispose() on observable subscriptions, so you must dispose of your subscriptions if they have not (or may not have) naturally ended before your subscription goes out of scope.
First case
Looks like I don't need to manually .Dispose() the subscription in the first case scenario because it ends naturally.
Dispose is being triggered at the end.
var xs = Observable.Create<long>(o =>
{
var d = Observable.Timer(TimeSpan.FromSeconds(5))
.Do(e =>
{
Console.WriteLine("5 seconds elapsed.");
})
.Subscribe(o);
return Disposable.Create(() =>
{
Console.WriteLine("Disposed!");
d.Dispose();
});
});
var subscription = xs.Subscribe(x => Console.WriteLine(x));
Second case
but in the second case, where it doesn't end "naturally", I should dispose it.
Dispose is not triggered unless manually disposed.
var xs = Observable.Create<long>(o =>
{
var d = Observable.Timer(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1))
.Do(e =>
{
Console.WriteLine("Test.");
})
.Subscribe(o);
return Disposable.Create(() =>
{
Console.WriteLine("Disposed!");
d.Dispose();
});
});
var subscription = xs.Subscribe(x => Console.WriteLine(x));
Conclusion
He gave such nice examples that they are worth a look if you are asking yourself the same question.

System.Reactive Throttling an async method

I have been putting off using reactive extensions for so long, and I thought this would be a good use. Quite simply, I have a method that can be called for various reasons on various code paths
private async Task GetProductAsync(string blah) {...}
I need to be able to throttle this method. That is to say, I want to hold back the flow of calls until no more calls have been made for a specified period of time. Put more clearly: if 10 calls to this method happen within a certain time period, I want to limit (throttle) them to a single call, made a set period after the last call arrived.
I can see an example using a method with IEnumerable, this kind of makes sense
static IEnumerable<int> GenerateAlternatingFastAndSlowEvents()
{ ... }
...
var observable = GenerateAlternatingFastAndSlowEvents().ToObservable().Timestamp();
var throttled = observable.Throttle(TimeSpan.FromMilliseconds(750));
using (throttled.Subscribe(x => Console.WriteLine("{0}: {1}", x.Value, x.Timestamp)))
{
Console.WriteLine("Press any key to unsubscribe");
Console.ReadKey();
}
Console.WriteLine("Press any key to exit");
Console.ReadKey();
However (and this has always been my major issue with Rx), how do I create an Observable from a simple async method?
Update
I have managed to find an alternative approach using ReactiveProperty
Barcode = new ReactiveProperty<string>();
Barcode.Select(text => Observable.FromAsync(async () => await GetProductAsync(text)))
.Throttle(TimeSpan.FromMilliseconds(1000))
.Switch()
.ToReactiveProperty();
The premise is that I catch it at the text property Barcode. However, this has its own drawbacks: ReactiveProperty takes care of notification, and I can't silently update the backing field because it is already managed.
To summarise, how can I convert an async method call to an Observable, so I can use the Throttle method?
Unrelated to your question, but probably helpful: Rx's Throttle operator is really a debounce operator. The closest thing to a throttling operator is Sample. Here's the difference (assuming you want to throttle or debounce to one item / 3 seconds):
items : --1-23----4-56-7----8----9-
throttle: --1--3-----4--6--7--8-----9
debounce: --1-------4--6------8----9-
Sample/throttle will bunch items that arrive in the sensitive time and emit the last one on the next sampling tick. Debounce throws away items that arrive in the sensitive time, then re-starts the clock: The only way for an item to emit is if it was preceded by Time-Range of silence.
RX.Net's Throttle operator does what debounce above depicts. Sample does what throttle above depicts.
If you want something different, describe how you want to throttle.
There are two key ways of converting a Task to an Observable, with an important difference between them.
Observable.FromAsync(()=>GetProductAsync("test"));
and
GetProductAsync("test").ToObservable();
The first will not start the Task until you subscribe to it.
The second will create (and start) the task and the result will either immediately or sometime later appear in the observable, depending on how fast the Task is.
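The difference can be made visible with a counter. This is only a sketch, assuming the System.Reactive package; TaskConversionSketch and its Calls counter are hypothetical, and GetProductAsync here is a stand-in that completes immediately:

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Threading.Tasks;
using System.Threading.Tasks;

public static class TaskConversionSketch
{
    public static int Calls;

    // Hypothetical stand-in for GetProductAsync; counts how often it is started.
    public static Task<string> GetProductAsync(string code)
    {
        Calls++;
        return Task.FromResult("product:" + code);
    }

    public static async Task Main()
    {
        // Deferred: the lambda is not invoked until someone subscribes.
        var deferred = Observable.FromAsync(() => GetProductAsync("a"));
        Console.WriteLine("after FromAsync: " + Calls);

        // Eager: GetProductAsync runs right here, before any subscription exists.
        var eager = GetProductAsync("b").ToObservable();
        Console.WriteLine("after ToObservable: " + Calls);

        // Awaiting an observable subscribes to it; each subscription to the
        // deferred version starts a fresh task.
        await deferred;
        await deferred;
        Console.WriteLine("after two deferred subscriptions: " + Calls);

        // Subscribing to the eager version again only replays the same task's result.
        await eager;
        await eager;
        Console.WriteLine("after two eager subscriptions: " + Calls);
    }
}
```

The deferred form is usually the one to combine with operators such as Throttle or Switch, because work only starts for the items that actually get subscribed to.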
Looking at your question in general though, it seems that you want to stop the flow of calls. You do not want to throttle the flow of results, which would result in unnecessary computation and loss.
If this is your aim, your GetProductAsync could be seen as an observer of call events, and the GetProductAsync should throttle those calls. One way of achieving that would be to declare a
public event Action<string> GetProduct;
and use
var callStream= Observable.FromEvent<string>(
handler => GetProduct+= handler ,
handler => GetProduct-= handler);
The problem then becomes how to return the result and what should happen when your 'caller's' call is throttled out and discarded.
One approach there could be to declare a type "GetProductCall" which would have the input string and output result as properties.
You could then have a setup like:
var callStream= Observable.FromEvent<GetProductCall>(
handler => GetProduct+= handler ,
handler => GetProduct-= handler)
.Throttle(...)
.SelectMany(async r => { r.Result = await GetProductAsync(r.Input); return r; });
(code not tested, just illustrative)
Another approach might include the Merge(N) overload that limits the max number of concurrent observables.
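Throttling the calls rather than the results could look roughly like the sketch below, assuming the System.Reactive package. The names ThrottledCalls and Products are hypothetical, a Subject<string> stands in for the event-based call stream, and GetProductAsync is assumed to return a value. Requests are debounced first, and only the surviving request triggers the async call:

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Reactive.Threading.Tasks;
using System.Threading.Tasks;

public static class ThrottledCalls
{
    // Hypothetical stand-in for the real lookup; completes immediately.
    public static Task<string> GetProductAsync(string barcode)
    {
        return Task.FromResult("product for " + barcode);
    }

    // Debounce the requests, then start the lookup only for the request that survives.
    // Switch drops the result of an older lookup if a newer one has started.
    public static IObservable<string> Products(IObservable<string> requests, TimeSpan quietPeriod)
    {
        return requests
            .Throttle(quietPeriod)
            .Select(code => Observable.FromAsync(() => GetProductAsync(code)))
            .Switch();
    }

    public static async Task Main()
    {
        var requests = new Subject<string>();

        // ToTask subscribes immediately, so no request is missed.
        var results = Products(requests, TimeSpan.FromMilliseconds(200)).ToList().ToTask();

        // Three requests in quick succession: only the last one should be looked up.
        requests.OnNext("1");
        requests.OnNext("2");
        requests.OnNext("3");
        requests.OnCompleted(); // completing flushes the pending request

        foreach (var product in await results)
        {
            Console.WriteLine(product);
        }
    }
}
```

Callers whose requests are discarded get no result at all in this sketch, which is exactly the open question raised above.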

Observable.Range being repeated?

New to Rx -- I have a sequence that appears to be functioning correctly except for the fact that it appears to repeat.
I think I'm missing something around calls to Select() or SelectMany() that triggers the range to re-evaluate.
Explanation of Code & What I'm trying to Do
For all numbers, loop through a method that retrieves data (paged from a database).
Eventually, this data will be empty (I only want to keep processing while it retrieves data).
For each of those records retrieved, I only want to process ones that should be processed
Of those that should be processed, I'd like to process up to x of them in parallel (according to a setting).
I want to wait until the entire sequence is completed to exit the method (hence the wait call at the end).
Problem With the Code Below
I run the code through with a data set that I know only has 1 item.
So, page 0 returns 1 item, and page 1 returns 0 items.
My expectation is that the process runs once for the one item.
However, I see that both page 0 and 1 are called twice and the process thus runs twice.
I think this has something to do with a call that causes the range to re-evaluate beginning from 0, but I can't figure out what it is.
The Code
var query = Observable.Range(0, int.MaxValue)
.Select(pageNum =>
{
_etlLogger.Info("Calling GetResProfIDsToProcess with pageNum of {0}", pageNum);
return _recordsToProcessRetriever.GetResProfIDsToProcess(pageNum, _processorSettings.BatchSize);
})
.TakeWhile(resProfList => resProfList.Any())
.SelectMany(records => records.Where(x=> _determiner.ShouldProcess(x)))
.Select(resProf => Observable.Start(async () => await _schoolDataProcessor.ProcessSchoolsAsync(resProf)))
.Merge(maxConcurrent: _processorSettings.ParallelProperties)
.Do(async trackingRequests =>
{
await CreateRequests(trackingRequests.Result, createTrackingPayload);
var numberOfAttachments = SumOfRequestType(trackingRequests.Result, TrackingRecordRequestType.AttachSchool);
var numberOfDetachments = SumOfRequestType(trackingRequests.Result, TrackingRecordRequestType.DetachSchool);
var numberOfAssignmentTypeUpdates = SumOfRequestType(trackingRequests.Result,
TrackingRecordRequestType.UpdateAssignmentType);
_etlLogger.Info("Extractor generated {0} attachments, {1} detachments, and {2} assignment type changes.",
numberOfAttachments, numberOfDetachments, numberOfAssignmentTypeUpdates);
});
var subscription = query.Subscribe(
trackingRequests =>
{
//Nothing really needs to happen here. Technically we're just doing something when it's done.
},
() =>
{
_etlLogger.Info("Finished! Woohoo!");
});
await query.Wait();
This is because you subscribe to the sequence twice. Once at query.Subscribe(...) and again at query.Wait().
Observable.Range(0, int.MaxValue) is a cold observable: every time you subscribe to it, it is evaluated again. You could make the observable hot by publishing it with Publish(), then subscribing to it, then calling Connect() and finally Wait(). This carries the risk of an InvalidOperationException if you call Wait() after the last element has already been yielded. A better alternative is LastOrDefaultAsync().
That would get you something like this:
var connectable = query.Publish();
var subscription = connectable.Subscribe(...);
subscription = new CompositeDisposable(connectable.Connect(), subscription);
await connectable.LastOrDefaultAsync();
Or you can avoid await and return a task directly with ToTask() (do remove async from your method signature).
return connectable.LastOrDefaultAsync().ToTask();
Once converted to a task, you can synchronously wait for it with Wait() (do not confuse Task.Wait() with Observable.Wait()).
connectable.LastOrDefaultAsync().ToTask().Wait();
However, most likely you do not want to wait at all! Blocking in an async context makes little sense. What you should do is put the rest of the code that needs to run after the sequence completes into the OnCompleted part of the subscription. If you have (clean-up) code that needs to run even when you unsubscribe (Dispose), consider Observable.Using or the Finally(...) method to ensure this code is run.
As already mentioned the cause of the Observable.Range being repeated is the fact that you're subscribing twice - once with .Subscribe(...) and once with .Wait().
In this kind of circumstance I would go with a very simple blocking call to get the values. Just do this:
var results = query.ToArray().Wait();
The .ToArray() turns a multi-valued IObservable<T> into a single-valued IObservable<T[]>. The .Wait() turns this into T[]. It's the easy way to ensure only one subscription, blocking, and getting all of the values out.
In your case you may not need all values, but I think this is a good habit to get into.
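The double subscription is easy to reproduce in isolation. The following sketch assumes the System.Reactive package; ColdSubscriptionSketch, Pages and the Subscriptions counter are hypothetical and only there to count how often the cold source is evaluated:

```csharp
using System;
using System.Reactive.Linq;

public static class ColdSubscriptionSketch
{
    public static int Subscriptions;

    // A cold source: the Defer factory runs once per subscription.
    public static IObservable<int> Pages()
    {
        return Observable.Defer(() =>
        {
            Subscriptions++;
            return Observable.Range(0, 3);
        });
    }

    public static void Main()
    {
        var query = Pages();

        // Same shape as the question: Subscribe and Wait each subscribe separately.
        using (query.Subscribe(_ => { })) { }
        query.Wait();
        Console.WriteLine("Subscribe + Wait evaluated the source " + Subscriptions + " times");

        // ToArray().Wait() subscribes once, blocks, and returns every value.
        Subscriptions = 0;
        int[] pages = Pages().ToArray().Wait();
        Console.WriteLine("ToArray().Wait() evaluated the source " + Subscriptions
            + " time(s) and returned " + pages.Length + " values");
    }
}
```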

How to get intermediate results from long running operation?

Take the following class and suppose Calculate is a very calculation intensive function.
class Algorithm
{
FinalResultObject Calculate()
{
longPartialCalculation();
//signal to caller that that part is ready of type MidResult1
morePartialCalculation();
//signal more is ready, different type of MidResult2
moreWork();
return finalResult;
}
}
Now suppose, intermediate results need to be shown to the user whenever they're ready.
The options I see are:
use separate events to signal
use constructor injection to inject a handler class whose methods are called
use RX observables
I'm new to RX but I'm liking the idea that I can easily do the event handling on the UI thread. I'm wondering though if this is overkill and not as intended since it's not really a whole stream of data but just one result for each observable. On the other hand though just as with events subscription and unsubscription seems to be so cumbersome.
Any hints?
The Rx way of tackling this problem is to define a cold observable as follows:
IObservable<Result> Calculate(IScheduler scheduler)
{
return Observable.Create<Result>(observer =>
scheduler.Schedule(() =>
{
observer.OnNext(longPartialCalculation());
observer.OnNext(morePartialCalculation());
observer.OnNext(moreWork());
observer.OnCompleted();
}));
}
// Depending upon your needs, you could use inheritance as follows:
public abstract class Result { ... }
public class MidResult1 : Result { ... }
public class MidResult2 : Result { ... }
public class FinalResultObject : Result { ... }
You could also define an overload that specifies a default scheduler, such as ThreadPoolScheduler if you want to introduce concurrency or CurrentThreadScheduler if you don't.
To use the observable that is returned by this method, simply call Subscribe with an observer. You can provide an OnNext handler to inspect each Result object as it arrives and an OnCompleted handler to handle completion. You can also provide an OnError handler to handle an Exception, if you must.
(Edit: Note that OnError isn't called by my example though.)
If you want to ensure that all of these handlers execute on the UI thread, and you've passed in a concurrency-introducing scheduler such as ThreadPoolScheduler to the Calculate method, then you can also apply the ObserveOn operator (or ObserveOnDispatcher on XAML-based platforms) to marshal all notifications to the UI thread for observation.
algo.Calculate(ThreadPoolScheduler.Instance)
.ObserveOnDispatcher()
.Subscribe(OnNextResult, OnCompleted);
Note that one of the primary benefits of Rx is the ability to query; e.g., a simple filter:
algo.Calculate(ThreadPoolScheduler.Instance)
.Where(result => result.HasRequiredState)
.ObserveOnDispatcher()
.Subscribe(result => handle(result.RequiredState));
You can use .NET's Progress<T>. You create an instance, passing a handler or registering for its event, and report through it throughout your long-running process:
var progress = new Progress<string>(value => Console.WriteLine(value));
Calculate(progress);
FinalResultObject Calculate(IProgress<string> progress)
{
longPartialCalculation();
progress.Report("MidResult1");
morePartialCalculation();
progress.Report("MidResult2");
moreWork();
return finalResult;
}
In this case, the report is writing a string to console, but you can of course use for any type you want.
Progress<T> also captures the current SynchronizationContext on creation, so you can create it on the UI thread and pass it to a non-UI thread without any synchronization issues.

.Net RX: tracking progress of parallel execution

I need to execute multiple long-running operations in parallel and would like to report progress in some way. From my initial research it seems that IObservable fits this model. The idea is that I call a method that returns an IObservable of int, where the int is the reported percent complete. Parallel execution starts immediately upon exiting the method. The observable must be hot, so that all subscribers see the same progress information at a given point in time; for example, a late subscriber may only learn that the whole execution is complete and there is no more progress to track.
The closest approach to this problem that I found is to use Observable.ForkJoin and Observable.Start, but I can't come to understanding how to make them a single observable that I can return from a method. 
Please share your ideas of how can it be achieved or maybe there is another approach to this problem using .Net RX.
To make a hot observable, I would probably start with a method that uses a BehaviorSubject as the return value and the way the operations report progress. If you just want the example, skip to the end. The rest of this answer explains the steps.
I will assume for the sake of this answer that your long-running operations do not have their own way to be called asynchronously. If they do, the next step may be a little different. The next thing to do is to send the work to another thread using an IScheduler. You may allow the caller to select where the work happens by making an overload that takes the scheduler as a parameter if desired (in which case the overload that does not will pick a default scheduler). There are quite a few overloads of IScheduler.Schedule, of which several are extension methods, so you should look through them to see which is most appropriate for your situation; I'm using the one that takes only an Action here. If you have multiple operations that can all run in parallel, you can call scheduler.Schedule multiple times.
The hardest part of this will probably be determining what the progress is at any given point. If you have multiple operations going on at once, you will probably need to keep track of how many have completed to know what the current progress is. With the information you provided, I can't be more specific than that.
Finally, if your operations are cancellable, you may want to take a CancellationToken as a parameter. You can use this to cancel the operation while it is in the scheduler's queue before it starts. If you write your operation code correctly, it can use the token for cancellation as well.
IObservable<int> DoStuff(/*args*/,
CancellationToken cancel,
IScheduler scheduler)
{
//BehaviorSubject needs an initial value; late subscribers get the latest one
var progress = new BehaviorSubject<int>(0);
//if you don't take it as a parameter, pick a scheduler
//IScheduler scheduler = Scheduler.ThreadPool;
var disp = scheduler.Schedule(() =>
{
//do stuff that needs to run on another thread
//report progress
progress.OnNext(25);
});
var disp2 = scheduler.Schedule(...);
//if the operation is cancelled before the scheduler has started it,
//you need to dispose the return from the Schedule calls
var allOps = new CompositeDisposable(disp, disp2);
cancel.Register(allOps.Dispose);
return progress;
}
Here is one approach
// set up a function that does some work
// and reports its own partial progress
Func<string, IObservable<int>> doPartialWork =
(arg) => Observable.Create<int>(obsvr => {
// created outside the scheduled action so that disposing the
// subscription can actually stop the loop while it is running
var cancel = new BooleanDisposable();
var scheduled = Scheduler.TaskPool.Schedule(arg,(sched,state) => {
var progress = 0;
while(progress < 10 && !cancel.IsDisposed)
{
// do work with arg
Thread.Sleep(550);
obsvr.OnNext(1); //report progress
progress++;
}
obsvr.OnCompleted();
return Disposable.Empty;
});
return new CompositeDisposable(scheduled, cancel);
});
var myArgs = new[]{"Arg1", "Arg2", "Arg3"};
// run all the partial bits of work
// use SelectMany to get a flat stream of
// partial progress notifications
var xsOfPartialProgress =
myArgs.ToObservable(Scheduler.NewThread)
.SelectMany(arg => doPartialWork(arg))
.Replay().RefCount();
// use Scan to get a running aggregation of progress
var xsProgress = xsOfPartialProgress
.Scan(0d, (prog,nextPartial)
=> prog + (nextPartial/(myArgs.Length*10d)));
