Testing an IConnectableObservable with the TestScheduler - c#

Ok, it's late but I can't for the life of me work out why the following is happening.
I am trying to test the following (simplified) IConnectableObservable<long>:
private const int PollingIntervalMinutes = 5;
private IConnectableObservable<long> CreateObservable(IScheduler scheduler)
{
return Observable
.Interval(TimeSpan.FromMinutes(PollingIntervalMinutes), scheduler)
.StartWith(0)
.Publish();
}
If I test it "long hand" as follows the test passes:
[Test]
public void ShouldReturnExpectedNumberOfMessagesLongHand()
{
var scheduler = new TestScheduler();
var observed = scheduler.CreateObserver<long>();
var observable = CreateObservable(scheduler);
observable.Subscribe(observed);
observable.Connect();
Assert.That(observed.Messages.Count, Is.EqualTo(1));
scheduler.AdvanceBy(TimeSpan.FromMinutes(PollingIntervalMinutes).Ticks);
Assert.That(observed.Messages.Count, Is.EqualTo(2));
scheduler.AdvanceBy(TimeSpan.FromMinutes(PollingIntervalMinutes).Ticks);
Assert.That(observed.Messages.Count, Is.EqualTo(3));
scheduler.AdvanceBy(TimeSpan.FromMinutes(PollingIntervalMinutes).Ticks);
Assert.That(observed.Messages.Count, Is.EqualTo(4));
}
However, if I use the TestScheduler.Start approach - as follows - the test hangs and never reaches the Assert:
[Test]
public void ShouldReturnExpectedNumberOfMessages()
{
var scheduler = new TestScheduler();
var observable = CreateObservable(scheduler);
var observed = scheduler.Start(() => { observable.Connect(); return observable; }, TimeSpan.FromMinutes(PollingIntervalMinutes * 3).Ticks);
Assert.That(observed.Messages.Count, Is.EqualTo(4));
}
By placing a breakpoint in the observable (i.e. on an additional Select or Do) I can see that the call to scheduler.Start is causing the underlying observable to spin (i.e. hits the breakpoint thousands of times) instead of respecting the scheduled times.
I've tried various ways of calling Connect on the IConnectableObservable (e.g. connecting before calling Start, scheduling a call to Connect on the TestScheduler, etc.), but to no avail.
It is definitely related to testing an IConnectableObservable, as removing the Publish (i.e. making it a normal cold observable) makes the test pass.
A sanity check and/or suggestions would be greatly appreciated.

The Undisposed Publisher strikes again.
The usual suspects:
var observable = CreateObservable(scheduler);
scheduler.Start(() => { observable.Connect(); return observable; }, ...
To actually dispose of the interval timer, you need a way to dispose the subscription created by observable.Connect(), not the subscription made by the Start method.
Once you connect, the interval keeps cranking out items (as fast as the test scheduler allows), and disposing the subscription that Start creates does nothing to stop it, so the test scheduler never runs out of scheduled work and Start never returns.
One way of ensuring the disposal of resources, in general, is to use Using.
scheduler.Start(() => Observable.Using(() => observable.Connect(), _ => observable), ...
But a simpler way of ensuring that the underlying connection is disposed when the downstream observable is unsubscribed is to use RefCount.
scheduler.Start(() => CreateObservable(scheduler).RefCount(), ...
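For completeness, here is a sketch of what the Start-based test might look like with RefCount, assuming the same CreateObservable helper and constants from the question. The explicit created/subscribed/disposed arguments are used because TestScheduler.Start would otherwise subscribe at virtual time 200, shifting the interval ticks:
[Test]
public void ShouldReturnExpectedNumberOfMessagesWithRefCount()
{
    var scheduler = new TestScheduler();

    // RefCount connects on the first subscription and disposes the connection
    // when the subscription created by Start is disposed.
    var observed = scheduler.Start(
        () => CreateObservable(scheduler).RefCount(),
        created: 0,
        subscribed: 0,
        disposed: TimeSpan.FromMinutes(PollingIntervalMinutes * 3).Ticks + 1);

    // One message from StartWith(0) plus interval ticks at 5, 10 and 15 minutes.
    Assert.That(observed.Messages.Count, Is.EqualTo(4));
}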

Related

How to unit test that tasks are run synchronously

In my code I have a method such as:
void PerformWork(List<Item> items)
{
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
foreach (var item in items)
{
await itemHandler.PerformIndividualWork(item);
}
});
}
where Item is just a known model and itemHandler does some work based on the model (the ItemHandler class is defined in a separately maintained code base shipped as a NuGet package, which I'd rather not modify).
The purpose of this code is to have the work for a list of items done in the background, but synchronously (one item at a time).
As part of the work, I would like to create a unit test to verify that when this method is called, the items are handled synchronously. I'm pretty sure the issue can be simplified down to this:
await MyTask(1);
await MyTask(2);
Assert.IsTrue(/* MyTask with arg 1 was completed before MyTask with arg 2 */);
The first part of this code I can easily unit test is that the sequence is maintained. For example, using NSubstitute I can check method call order on the library code:
Received.InOrder(() =>
{
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "First item"));
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "Second item"));
itemHandler.PerformIndividualWork(Arg.Is<Item>(arg => arg.Name == "Third item"));
});
But I'm not quite sure how to ensure that they aren't run in parallel. I've had several ideas that seem bad, like mocking the library to add an artificial delay when PerformIndividualWork is called, and then either checking the elapsed time of the whole queued background task or checking the timestamps of the itemHandler's received calls for a minimum gap between calls. For instance, if I have PerformIndividualWork mocked to delay 500 milliseconds and I'm expecting three items, then I could check elapsed time:
stopwatch.Start();
// I have an interface instead of directly calling HostingEnvironment, so I can access the task being queued here
backgroundTask.Invoke(...);
stopwatch.Stop();
Assert.IsTrue(stopwatch.ElapsedMilliseconds > 1500);
But that doesn't feel right and could lead to false positives. Perhaps the solution lies in modifying the code itself; however, I can't think of a way of meaningfully changing it to make this sort of unit test (testing that tasks are run in order) possible. We'll definitely have system/integration testing to ensure the issue caused by asynchronous handling of the individual items doesn't happen, but I would like to cover it at this level as well.
Not sure if this is a good idea, but one approach could be to use an itemHandler that will detect when items are handled in parallel. Here is a quick and dirty example:
public class AssertSynchronousItemHandler : IItemHandler
{
private volatile int concurrentWork = 0;
public List<Item> Items = new List<Item>();
public Task PerformIndividualWork(Item item) =>
Task.Run(() => {
var result = Interlocked.Increment(ref concurrentWork);
if (result != 1) {
throw new Exception($"Expected 1 work item running at a time, but got {result}");
}
Items.Add(item);
var after = Interlocked.Decrement(ref concurrentWork);
if (after != 0) {
throw new Exception($"Expected 0 work items running once this item finished, but got {after}");
}
});
}
There are probably big problems with this, but the basic idea is to check how many items are already being handled when we enter the method, and then, once this item is finished, decrement the counter and check that no other items were being handled in the meantime. With threading it is very hard to make guarantees from tests alone, but with enough items processed this can give us a little confidence that it is working as expected:
[Fact]
public void Sample() {
var handler = new AssertSynchronousItemHandler();
var subject = new Subject(handler);
var input = Enumerable.Range(0, 100).Select(x => new Item(x.ToString())).ToList();
subject.PerformWork(input);
// With the code from the question we don't have a way of detecting
// when `PerformWork` finishes. If we can't change this we need to make
// sure we wait "long enough". Yes this is yuck. :)
Thread.Sleep(1000);
Assert.Equal(input, handler.Items);
}
If I modify PerformWork to do things in parallel I get the test failing:
public void PerformWork2(List<Item> items) {
Task.WhenAll(
items.Select(item => itemHandler.PerformIndividualWork(item))
).Wait(2000);
}
// ---- System.Exception : Expected 1 work item running at a time, but got 4
That said, if it is very important to run synchronously, and that is not apparent from glancing at the async/await implementation, then maybe it is worth using a more obviously synchronous design, like a queue serviced by only one thread (see the sketch below), so that you're guaranteed sequential execution by design and people won't inadvertently change it to run in parallel during refactoring (i.e. it is deliberately synchronous and documented that way).
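A minimal sketch of that queue-based design, assuming the IItemHandler and Item types from the question (the class and member names here are made up for illustration):
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class SequentialItemWorker
{
    private readonly BlockingCollection<Item> _queue = new BlockingCollection<Item>();
    private readonly IItemHandler _itemHandler;
    private readonly Task _consumer;

    public SequentialItemWorker(IItemHandler itemHandler)
    {
        _itemHandler = itemHandler;
        // A single consumer loop guarantees one-at-a-time processing by construction.
        _consumer = Task.Run(ProcessQueueAsync);
    }

    public void Enqueue(IEnumerable<Item> items)
    {
        foreach (var item in items)
            _queue.Add(item);
    }

    private async Task ProcessQueueAsync()
    {
        foreach (var item in _queue.GetConsumingEnumerable())
            await _itemHandler.PerformIndividualWork(item);
    }

    public Task CompleteAsync()
    {
        // Lets a test (or caller) await until the queue has fully drained.
        _queue.CompleteAdding();
        return _consumer;
    }
}
A test could then call CompleteAsync() and wait on the returned task instead of sleeping, and the handler-order assertion from earlier still applies.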

Rx Cache - Replay operator Clear

I am using the following code from here; it looks to me like there is an issue in clearing the "Replay cache":
https://gist.github.com/leeoades/4115023
If I change the call and code as follows, I see that there is a bug in the replay behaviour, i.e. it is never cleared. Can someone please help to rectify this?
private Cache<string> GetCalculator()
{
var calculation = Observable.Create<string>(o =>
{
_calculationStartedCount++;
return Observable.Timer(_calculationDuration, _testScheduler)
.Select(_ => "Hello World!" + _calculationStartedCount) // suffixed the string with count to test the behaviour of Replay clearing
.Subscribe(o);
});
return new Cache<string>(calculation);
}
[Test]
public void After_Calling_GetResult_Calling_ClearResult_and_GetResult_should_perform_calculation_again()
{
// ARRANGE
var calculator = GetCalculator();
calculator.GetValue().Subscribe();
_testScheduler.Start();
// ACT
calculator.Clear();
string result = null;
calculator.GetValue().Subscribe(r => result = r);
_testScheduler.Start();
// ASSERT
Assert.That(_calculationStartedCount, Is.EqualTo(2));
Assert.That(result, Is.EqualTo("Hello World!2")); // always returns Hello World!1 and not Hello World!2
Assert.IsNotNull(result);
}
The problem is a subtle one. The source sequence Timer completes after it emits an event, which in turn calls OnCompleted on the internal ReplaySubject created by Replay. When a Subject completes it no longer accepts any new values even if a new Observable shows up.
When you resubscribe to the underlying Observable it executes again, but isn't able to restart the Subject, so your new Observer can only receive the most recent value before the ReplaySubject completed.
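The completed-subject behaviour is easy to demonstrate in isolation (this snippet is illustrative only, not part of the gist):
var subject = new ReplaySubject<string>(1);
subject.OnNext("Hello World!1");
subject.OnCompleted();

// A completed subject ignores any further values...
subject.OnNext("Hello World!2");

// ...so a late subscriber only receives the value buffered before completion.
subject.Subscribe(Console.WriteLine); // prints "Hello World!1"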
The simplest solution would probably just be to never let the source stream complete (untested):
public Cache(IObservable<T> source)
{
    // Not sure why you are wrapping this in an Observable.Create
    _source = source.Concat(Observable.Never<T>())
                    .Replay(1, Scheduler.Immediate);
}

Is it in general dubious to call Task.Factory.StartNew(async () => {}) in Subscribe?

I have a situation where I need to use a custom scheduler to run tasks (these need to be tasks) and the scheduler does not set a synchronization context (so no ObserveOn, SubscribeOn, SynchronizationContextScheduler, etc., I gather). The following is how I ended up doing it, but I'm not really sure this is the best way of making asynchronous calls and awaiting their results. Is this all right, or is there a more robust or idiomatic way?
var orleansScheduler = TaskScheduler.Current;
var someObservable = ...;
someObservable.Subscribe(i =>
{
Task.Factory.StartNew(async () =>
{
return await AsynchronousOperation(i);
}, CancellationToken.None, TaskCreationOptions.None, orleansScheduler);
});
What if awaiting weren't needed?
Edit: I found a concrete, simplified example of what I'm doing here. Basically I'm using Rx in Orleans, and the above code is a bare-bones illustration of what I'm up to, though I'm also interested in this situation in general.
The final code
It turns out this was a bit tricky in the Orleans context. I don't see how I could use ObserveOn, which would otherwise be just the thing I'd like to use. The problem is that when using it, the Subscribe would never get called. The code:
var orleansScheduler = TaskScheduler.Current;
var factory = new TaskFactory(orleansScheduler);
var rxScheduler = new TaskPoolScheduler(factory);
var someObservable = ...;
someObservable
//.ObserveOn(rxScheduler) This doesn't look useful since...
.SelectMany(i =>
{
//... we need to set the custom scheduler here explicitly anyway.
//See Async SelectMany at http://log.paulbetts.org/rx-and-await-some-notes/.
//Doing the "shorthand" form of .SelectMany(async... would call Task.Run, which
//in turn runs always on .NET ThreadPool and not on Orleans scheduler and hence
//the following .Subscribe wouldn't be called.
return Task.Factory.StartNew(async () =>
{
//In reality this is an asynchronous grain call. Doing the "shorthand way"
//(and optionally using ObserveOn) would get the grain called, but not the
//following .Subscribe.
return await AsynchronousOperation(i);
}, CancellationToken.None, TaskCreationOptions.None, orleansScheduler).Unwrap().ToObservable();
})
.Subscribe(i =>
{
Trace.WriteLine(i);
});
Also, a link to a related thread on the CodePlex Orleans forums.
I strongly recommend against StartNew for any modern code. It does have a use case, but it's very rare.
If you have to use a custom task scheduler, I recommend using ObserveOn with a TaskPoolScheduler constructed from a TaskFactory wrapper around your scheduler. That's a mouthful, so here's the general idea:
var factory = new TaskFactory(customScheduler);
var rxScheduler = new TaskPoolScheduler(factory);
someObservable.ObserveOn(rxScheduler)...
Then you could use SelectMany to start an asynchronous operation for each event in a source stream as they arrive.
An alternative, less ideal solution is to use async void for your subscription "events". This is acceptable, but you have to watch your error handling. As a general rule, don't allow exceptions to propagate out of an async void method.
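That async void option might look something like the sketch below (reusing someObservable and AsynchronousOperation from the question); the try/catch is the important part:
someObservable.Subscribe(async i =>
{
    try
    {
        await AsynchronousOperation(i);
    }
    catch (Exception ex)
    {
        // Exceptions cannot propagate out of an async void handler,
        // so deal with them here (log, swallow, signal, etc.).
        Trace.WriteLine(ex);
    }
});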
There is a third alternative, where you hook an observable into a TPL Dataflow block. A block like ActionBlock can specify its task scheduler, and Dataflow naturally understands asynchronous handlers. Note that by default, Dataflow blocks will throttle the processing to a single element at a time.
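As a rough sketch of the Dataflow option (assuming an int element type and the orleansScheduler and AsynchronousOperation names from the question; requires System.Threading.Tasks.Dataflow):
var block = new ActionBlock<int>(
    async i => await AsynchronousOperation(i),
    new ExecutionDataflowBlockOptions
    {
        TaskScheduler = orleansScheduler,
        MaxDegreeOfParallelism = 1 // one element at a time is already the default
    });

// AsObserver() feeds Rx events into the block and completes or faults it
// when the observable completes or errors.
someObservable.Subscribe(block.AsObserver());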
Generally speaking, instead of subscribing to execute, it's better/more idiomatic to project the task parameters into the task execution and subscribe just for the results. That way you can compose with further Rx downstream.
e.g. Given a random task like:
static async Task<int> DoubleAsync(int i, Random random)
{
Console.WriteLine("Started");
await Task.Delay(TimeSpan.FromSeconds(random.Next(10) + 1));
return i * 2;
}
Then you might do:
void Main()
{
var random = new Random();
// stream of task parameters
var source = Observable.Range(1, 5);
// project the task parameters into the task execution, collect and flatten results
source.SelectMany(i => DoubleAsync(i, random))
// subscribe just for results, which turn up as they are done
// gives you flexibility to continue the rx chain here
.Subscribe(result => Console.WriteLine(result),
() => Console.WriteLine("All done."));
}

How to cancel a Select in RX if it is not finished before the next event arrives

I have the following setup
IObservable<Data> source = ...;
source
.Select(data=>VeryExpensiveOperation(data))
.Subscribe(data=>Console.WriteLine(data));
Normally the events come separated by a reasonable time frame. Imagine a user updating a text box in a form. Our VeryExpensiveOperation might take 5 seconds to complete, and while it runs an hourglass is displayed on the screen.
However, if during those 5 seconds the user updates the textbox again, I would want to send a cancellation to the current VeryExpensiveOperation before the new one starts.
I would imagine a scenario like
source
.SelectWithCancel((data, cancelToken)=>VeryExpensiveOperation(data, token))
.Subscribe(data=>Console.WriteLine(data));
So every time the lambda is called, it is called with a cancelToken which can be used to cancel the Task. However, now we are mixing Task, CancellationToken and Rx, and I'm not quite sure how to fit it all together. Any suggestions?
Bonus points for figuring out how to test the operator using xUnit :)
FIRST ATTEMPT
public static IObservable<U> SelectWithCancellation<T, U>( this IObservable<T> This, Func<CancellationToken, T, Task<U>> fn )
{
CancellationTokenSource tokenSource = new CancellationTokenSource();
return This
.ObserveOn(Scheduler.Default)
.Select(v=>{
tokenSource.Cancel();
tokenSource=new CancellationTokenSource();
return new {tokenSource.Token, v};
})
.SelectMany(o=>Observable.FromAsync(()=>fn(o.Token, o.v)));
}
Not tested yet. I'm hoping that a task that does not complete generates an IObservable that completes without firing any OnNext events.
You have to model VeryExpensiveOperation as a cancellable asynchronous operation, either a Task or an IObservable. I'll assume it is a task that accepts a CancellationToken:
Task<TResult> VeryExpensiveOperationAsync<TSource, TResult>(TSource item, CancellationToken token);
Then you do it like so:
source
    .Select(item => Observable.DeferAsync(async token =>
    {
        // do not yield the observable until after the operation is completed
        // (i.e. do not just do VeryExpensiveOperation(...).ToObservable())
        // because DeferAsync() will dispose of the token source as soon
        // as you provide the observable (instead of when the observable completes)
        var result = await VeryExpensiveOperationAsync(item, token);
        return Observable.Return(result);
    }))
    .Switch();
The Select just creates a deferred observable that, when subscribed, will create a token and kick off the operation. If the observable is unsubscribed before the operation finishes, the token will be cancelled.
The Switch subscribes to each new observable that comes out of Select, unsubscribing from the previous observable it was subscribed to.
This has the effect you want.
P.S. This is easily testable. Just provide a mock source and a mock VeryExpensiveOperation that uses a TaskCompletionSource provided by the unit test, so the unit test can control exactly when new source items are produced and when tasks are completed. Something like this:
void SomeTest()
{
// create a test source where the values are how long
// the mock operation should wait to do its work.
var source = _testScheduler.CreateColdObservable<int>(...);
// records the actions (whether they completed or canceled)
List<bool> mockActionsCompleted = new List<bool>();
var resultStream = source.SelectWithCancellation((token, delay) =>
{
var tcs = new TaskCompletionSource<string>();
var tokenRegistration = new SingleAssignmentDisposable();
// schedule an action to complete the task
var d = _testScheduler.ScheduleRelative(delay, () =>
{
mockActionsCompleted.Add(true);
tcs.SetResult("done " + delay);
// stop listening to the token
tokenRegistration.Dispose();
});
// listen to the token and cancel the task if the token signals
tokenRegistration.Disposable = token.Register(() =>
{
mockActionsCompleted.Add(false);
tcs.TrySetCanceled();
// cancel the scheduled task
d.Dispose();
});
return tcs.Task;
});
// subscribe to resultStream
// start the scheduler
// assert the mockActionsCompleted has the correct sequence
// assert the results observed were what you expected.
}
You might run into trouble using testScheduler.Start() because of the new actions being scheduled dynamically. A while loop with testScheduler.AdvanceBy(1) might work better.
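That loop might look something like this (the end time is just an illustrative horizon; pick the smallest value that covers all the delays scheduled in the test):
var end = TimeSpan.FromSeconds(30).Ticks;
while (_testScheduler.Clock < end)
{
    // Advancing one tick at a time picks up actions that are scheduled
    // dynamically while earlier actions run.
    _testScheduler.AdvanceBy(1);
}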
Why not just use a Throttle?
http://rxwiki.wikidot.com/101samples#toc30
Throttle stops the flow of events until no more events are produced for a specified period of time. For example, if you throttle a TextChanged event of a textbox to .5 seconds, no events will be passed until the user has stopped typing for .5 seconds. This is useful in search boxes where you do not want to start a new search after every keystroke, but want to wait until the user pauses.
SearchTextChangedObservable = Observable.FromEventPattern<TextChangedEventArgs>(this.textBox, "TextChanged");
_currentSubscription = SearchTextChangedObservable.Throttle(TimeSpan.FromSeconds(.5)).ObserveOnDispatcher

Unit testing code that uses Task.Factory.StartNew().ContinueWith()

so I have some code
Task.Factory.StartNew(() => this.listener.Start()).ContinueWith(
(task) =>
{
if (task.IsCompleted)
{
this.status = WorkerStatus.Started;
this.RaiseStatusChanged();
this.LogInformationMessage("Worker Started.");
}
});
When I am testing, I am mocking all the dependent objects (namely this.listener.Start()). The problem is that the test finishes executing before the ContinueWith continuation is called. When I debug, it gets called fine because of the extra delay introduced by stepping through the code.
So how can I, from test code in a different assembly, ensure that the continuation has run before my test hits its asserts?
I could just use Thread.Sleep ... but this seems like a really hacky way of doing it.
I guess I am looking for the Task version of Thread.Join.
Consider the following:
public class SomeClass
{
public void Foo()
{
var a = new Random().Next();
}
}
public class MyUnitTest
{
public void MyTestMethod()
{
var target = new SomeClass();
target.Foo(); // What to assert, what is the result?..
}
}
What is the value assigned to a? You cannot tell, unless the result is returned outside the method Foo() (as the return value, a public property, an event, etc.).
The process of "coordinating the actions of threads for a predictable outcome" is called Synchronization.
One of the easiest solutions in your case might be to return the Task instance and then use its Wait() method:
var task = Task.Factory.StartNew(() => Method1())
    .ContinueWith(t => Method2());
There is no need to wait separately for the first task, because ContinueWith() creates a continuation that executes asynchronously when the target Task completes (MSDN); waiting on the continuation is enough:
task.Wait();
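Applied to the code in the question, that could mean having the method under test hand the continuation task back to its caller; a rough sketch (the StartListening name is invented here):
public Task StartListening()
{
    // Returning the continuation gives the test something concrete to Wait on.
    return Task.Factory.StartNew(() => this.listener.Start()).ContinueWith(task =>
    {
        if (task.IsCompleted)
        {
            this.status = WorkerStatus.Started;
            this.RaiseStatusChanged();
            this.LogInformationMessage("Worker Started.");
        }
    });
}
The test can then call worker.StartListening().Wait() (ideally with a timeout) before running its asserts.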
I don't think there is an easy-yet-practical way of doing this. I ran into the same problem myself just now, and Thread.Sleep(X) is by far the simplest (if not the most elegant) way of getting around it.
The only other solution I considered is hiding the Task.Factory.StartNew() call behind an interface that you can mock from your test, thus removing the actual execution of the task entirely in the test scenario (but still having an expectation that the interface method will be called). For example:
public interface ITaskWrapper
{
void TaskMethod();
}
And your concrete implementation:
public class MyTask : ITaskWrapper
{
public void TaskMethod()
{
Task.Factory.StartNew(() => DoSomeWork());
}
}
Then just mock ITaskWrapper in your test method and set an expectation on TaskMethod being called.
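For instance, a hand-rolled fake is enough if you don't want to bring in a mocking framework (the Worker type and its usage here are hypothetical):
public class FakeTaskWrapper : ITaskWrapper
{
    public int Calls { get; private set; }

    public void TaskMethod()
    {
        // No background task is started; the test only verifies the call happened.
        Calls++;
    }
}

// In the test:
// var wrapper = new FakeTaskWrapper();
// var sut = new Worker(wrapper);  // hypothetical system under test taking ITaskWrapper
// sut.DoWork();
// Assert.AreEqual(1, wrapper.Calls);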
If there's any way for you to be notified when the processing has ended (can you add a handler for that StatusChanged event?), use a ManualResetEvent and wait on it with a reasonable timeout. If the timeout expires, fail the test; otherwise go on and perform your assertions.
E.g.
var waitHandle = new ManualResetEvent(false);
sut.StatusChanged += (s, e) => waitHandle.Set();
sut.DoStuff();
Assert.IsTrue(waitHandle.WaitOne(someTimeout), "timeout expired");
// do asserts here
The continuation task will still run regardless of whether the initial task completed before the ContinueWith() call or not. I double checked this with the following:
// Task immediately exits
var task = Task.Factory.StartNew(() => { });
Thread.Sleep(100);
// Continuation on already-completed task
task.ContinueWith(t => { MessageBox.Show("!"); });
Debug further. Maybe your task is failing.
When the code under test uses Reactive Extensions, one approach to dealing with asynchronous processing is to use a TestScheduler. The TestScheduler can be moved forward in time, drained of all scheduled tasks, etc. Your code under test can take an IScheduler, for which you provide a TestScheduler instance; your test can then manipulate virtual time without needing to actually sleep, wait or synchronize. An improvement on this approach is Lee Campbell's ISchedulerProvider approach.
If you use Observable.Start instead of Task.Factory.StartNew in your code, you can then use your TestScheduler in the unit test to push through all the scheduled tasks.
For example, your code under test could look something like this:
//Task.Factory.StartNew(() => DoSomething())
// .ContinueWith(t => DoSomethingElse())
Observable.Start(() => DoSomething(), schedulerProvider.ThreadPool)
.ToTask()
.ContinueWith(t => DoSomethingElse())
and in your unit test:
// ... test code to execute the code under test
// run the tasks on the ThreadPool scheduler
testSchedulers.ThreadPool.Start();
// assertion code can now run
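The schedulerProvider / testSchedulers objects above aren't defined in this answer; a minimal sketch along the lines of Lee Campbell's ISchedulerProvider idea (names assumed) could be:
public interface ISchedulerProvider
{
    IScheduler ThreadPool { get; }
}

// Production implementation hands out the real scheduler.
public sealed class SchedulerProvider : ISchedulerProvider
{
    public IScheduler ThreadPool => ThreadPoolScheduler.Instance;
}

// Test implementation exposes a TestScheduler so the test controls virtual time.
public sealed class TestSchedulers : ISchedulerProvider
{
    public TestScheduler ThreadPool { get; } = new TestScheduler();

    IScheduler ISchedulerProvider.ThreadPool => ThreadPool;
}
The code under test asks the provider for its ThreadPool scheduler; the unit test injects TestSchedulers and calls testSchedulers.ThreadPool.Start() as shown above.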
