Using BufferBlock as Observable without consuming the elements - c#

We use BufferBlocks to build a small simulation tool in which we want to find areas that take a long time to complete. Producers and consumers of the blocks essentially sleep for x amount of time and then post a message to another block.
We decided to use the Observer pattern. However, I see some behavior I did not expect. Whenever the OnNext method of the observers is called, the BufferBlock is already empty (Count == 0). This is problematic, as I want only one observer to be able to fetch the value from the queue.
Is there a way to change this behavior? If not, how should I handle consumption from the BufferBlocks?
Currently I want to be able to do something like the following, where I post the messages and have all observers try to fetch them:
public void OnNext(Object value)
{
var res =this.AsConsumer().ConsumeQueue.ReceiveAsync().Result;
Thread.Sleep(this.TimeToConsume );
ProduceQueue.Post(someOtherValue);
}
I have written some tests to show the behavior of the BufferBlock.
[Test]
public void WhenObservingMocks_CallsOnNextForAllMocks()
{
var firstObserver = new Mock<IObserver<int>>();
var secondObserver = new Mock<IObserver<int>>();
var block = new BufferBlock<int>();
block.AsObservable().Subscribe(firstObserver.Object);
block.AsObservable().Subscribe(secondObserver.Object);
block.Post(2);
Thread.Sleep(TimeSpan.FromMilliseconds(50));
firstObserver.Verify(e => e.OnNext(It.IsAny<int>()), Times.Once);
secondObserver.Verify(e => e.OnNext(It.IsAny<int>()), Times.Once);
}
[Test]
public void WhenHavingObservers_DoesConsumesTheElementFromQueue()
{
var firstObserver = new Mock<IObserver<int>>();
var secondObserver = new Mock<IObserver<int>>();
var block = new BufferBlock<int>();
block.AsObservable().Subscribe(firstObserver.Object);
block.AsObservable().Subscribe(secondObserver.Object);
block.Post(2);
Assert.Zero(block.Count);
}
[Test]
public void WhenPostingOnce_CanOnlyReceiveOnce()
{
var block = new BufferBlock<int>();
block.Post(2);
Assert.True(block.TryReceive(out int _));
Assert.False(block.TryReceive(out int _));
}
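One possible way to get "only one consumer per message" (a sketch, not from the original post) is to link the BufferBlock to competing target blocks instead of subscribing observers. The names CompetingConsumers and MakeConsumer and the 20 ms delay are made up for illustration; the delay stands in for TimeToConsume. It assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class CompetingConsumers
{
    static async Task Main()
    {
        var block = new BufferBlock<int>();
        var seen = new ConcurrentBag<(string Consumer, int Value)>();

        // BoundedCapacity = 1 makes a busy consumer decline new items,
        // so the buffer offers them to the other linked target instead.
        var options = new ExecutionDataflowBlockOptions { BoundedCapacity = 1 };
        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };

        // Hypothetical helper: builds a consumer that records what it handled.
        ActionBlock<int> MakeConsumer(string name) =>
            new ActionBlock<int>(async value =>
            {
                await Task.Delay(20); // stands in for TimeToConsume
                seen.Add((name, value));
            }, options);

        var first = MakeConsumer("first");
        var second = MakeConsumer("second");
        block.LinkTo(first, linkOptions);
        block.LinkTo(second, linkOptions);

        for (int i = 0; i < 6; i++) block.Post(i);
        block.Complete();
        await Task.WhenAll(first.Completion, second.Completion);

        Console.WriteLine(seen.Count); // 6: every item was handled by exactly one consumer
    }
}
```

With LinkTo, each message is handed to a single target, whereas every observer subscribed through AsObservable() sees every message.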

Related

Exception handling in RX.Net when using ToEventPattern and Timeout

I am writing some code using RX in C# that must interface with an older system by emitting events.
In summary, I have an observable and need to emit one event when the observable completes and another event if a timeout exception is detected. The main problem is how best to handle the exception.
I'm relatively new to RX, so although I have found a solution, I can't be sure that there isn't a better or more appropriate way that uses the RX extensions better.
This is not the real code but indicates the pattern of my thinking:
public delegate void SuccessHandler(object sender, SuccessEventArgs e);
public event SuccessHandler OnSuccess;
public delegate void TimeoutHandler(object sender, TimeoutEventArgs e);
public event TimeoutHandler OnTimeout;
var id;
var o = Observable.Return() // <- this would be a fetch from an asynchronous source
.Where(r => r.status == "OK")
.Timeout(new TimeSpan(0, 0, 30))
.Do(r => {
id = r.Id; // <-- Ugh! I know this shouldn't be done!
})
.Subscribe(r => {
var statusResponse = new StatusResponse()
{
Id = r.Id,
Name = r.Name,
Message = "The operation completed successfully",
Status = Status.Success
};
if (OnSuccess == null) return;
OnSuccess(this, new SuccessEventArgs(statusResponse));
},
e =>
{
_logger.LogError(e, "A matching response was not returned in a timely fashion");
if (OnTimeout == null) return;
OnTimeout(this, new TimeoutEventArgs(id));
});
If I didn't need to detect and act upon the timeout it would be fine; I have already worked out how to substitute the Subscribe for ToEventPattern:
...
.Select(r =>
{
var statusResponse= new StatusResponse()
{
Id = r.Id,
Name = r.Name,
Message = "The operation completed successfully",
Status = Status.Success
};
return new EventPattern<SuccessEventArgs>(this, new SuccessEventArgs(statusResponse));
})
.ToEventPattern();
However, I'd like to be able to detect the timeout (and possibly other exceptions). My experiments with Catch have been unsuccessful because I can't get the types to line up correctly, probably because I don't really understand what is going on.
I'd very much appreciate opinions on this. Is this an acceptable solution? How can I improve it? Can anyone point me to some good online references that will explain how this kind of flow-control and exception handling can be done (all the examples I've seen so far seem to stop short of the real-world case where you want to emit an event and combine that with exception handling).
Thanks in advance
You can branch from observables quite easily, e.g.
var a = Observable.Range(0, 10);
var b = a.Select(x => x * x);
var c = a.Select(x => x * 10);
A word of warning - if the observable is cold, this will cause the producer function to run for each subscription. Look up the difference between hot and cold observables if this isn't clear.
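To make the hot/cold warning concrete, here is a small sketch (the HotColdDemo class and the producerRuns counter are made up for illustration). It counts how often the producer function runs when two branches subscribe to a cold source, and again when the source is shared with Publish() and an explicit Connect(). It assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Reactive.Linq;

class HotColdDemo
{
    static void Main()
    {
        int producerRuns = 0;

        // Cold source: the factory passed to Defer runs once per subscription.
        var cold = Observable.Defer(() =>
        {
            producerRuns++;
            return Observable.Range(0, 3);
        });

        cold.Select(x => x * x).Subscribe(_ => { });
        cold.Select(x => x * 10).Subscribe(_ => { });
        Console.WriteLine(producerRuns); // 2: one run per branch

        producerRuns = 0;

        // Publish shares a single subscription; nothing runs until Connect is called.
        var shared = cold.Publish();
        shared.Select(x => x * x).Subscribe(_ => { });
        shared.Select(x => x * 10).Subscribe(_ => { });
        using (shared.Connect())
        {
            Console.WriteLine(producerRuns); // 1: both branches share the same run
        }
    }
}
```

Connect() is used here rather than RefCount() because this source completes synchronously; with RefCount() the first subscriber would finish before the second one subscribes, and the producer would run twice anyway.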
I've created a solution that creates two branches from the source observable and turns each into an event:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Hello World!");
var service = new Service();
var apiCall = service.CallApi();
apiCall.OnSuccess.OnNext += (_, __) => Console.WriteLine("Success!");
apiCall.OnTimeout.OnNext += (_, __) => Console.WriteLine("Timeout!");
Console.ReadLine();
}
}
class SuccessEventArgs{}
class TimeoutEventArgs{}
class ApiCall
{
public IEventPatternSource<SuccessEventArgs> OnSuccess {get;}
public IEventPatternSource<TimeoutEventArgs> OnTimeout {get;}
public ApiCall(IEventPatternSource<SuccessEventArgs> onSuccess, IEventPatternSource<TimeoutEventArgs> onTimeout)
{
OnSuccess = onSuccess;
OnTimeout = onTimeout;
}
}
class Service
{
public ApiCall CallApi()
{
var apiCall = Observable
.Timer(TimeSpan.FromSeconds(3))
.Do(_ => Console.WriteLine("Api Called"))
.Select(_ => new EventPattern<SuccessEventArgs>(null, new SuccessEventArgs()))
// .Timeout(TimeSpan.FromSeconds(2)) // uncomment to time out
.Timeout(TimeSpan.FromSeconds(4))
// the following two lines turn the "cold" observable "hot"
// comment them out and see how often "Api Called" is logged
.Publish()
.RefCount();
var success = apiCall
// ignore the TimeoutException and return an empty observable
.Catch<EventPattern<SuccessEventArgs>, TimeoutException>(_ => Observable.Empty<EventPattern<SuccessEventArgs>>())
.ToEventPattern();
var timeout = apiCall
.Materialize() // turn the exception into a call to OnNext rather than OnError
.Where(x => x.Exception is TimeoutException)
.Select(_ => new EventPattern<TimeoutEventArgs>(null, new TimeoutEventArgs()))
.ToEventPattern();
return new ApiCall(success, timeout);
}
}

Parallel processing using TPL in windows service

I have a Windows service that consumes a messaging system to fetch messages. I have also created a callback mechanism, using the Timer class, that checks for a message at a fixed interval and then fetches and processes it. Previously, the service processed messages one by one. Now I want processing to run in parallel once a message arrives: when the first message arrives it should be processed on one task, and even if that processing has not finished, the next message should be picked up after the configured interval (the callback already works) and processed on a different task.
Below is my code:
Task.Factory.StartNew(() =>
{
Subsriber<Message> subsriber = new Subsriber<Message>()
{
Interval = 1000
};
subsriber.Callback(Process, m => m != null);
});
public static void Process(Message message)
{
if (message != null)
{
// Processing logic
}
else
{
}
}
But with Task.Factory I cannot control the number of tasks running in parallel. How can I configure the number of tasks, so that messages are processed as tasks become available?
Update:
Updated my above code to add multiple tasks
Below is the code:
// Extension methods must live in a static class; the name Program is assumed here.
public static class Program
{
    private static void Main()
    {
        try
        {
            int taskCount = 5;
            Task.Factory.StartNewAsync(() =>
            {
                Subscriber<Message> consumer = new Subscriber<Message>()
                {
                    Interval = 1000
                };
                consumer.CallBack(Process, msg => msg != null);
            }, taskCount);
            Console.ReadLine();
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }

    // Starts the same action on taskCount separate tasks.
    public static void StartNewAsync(this TaskFactory target, Action action, int taskCount)
    {
        var tasks = new Task[taskCount];
        for (int i = 0; i < taskCount; i++)
        {
            tasks[i] = target.StartNew(action);
        }
    }

    public static void Process(Message message)
    {
        if (message != null)
        {
        }
        else
        { }
    }
}
I think what you're looking for would result in quite a large sample, so I'm just trying to demonstrate how you would do this with ActionBlock<T>. There are still a lot of unknowns, so I left the sample as a skeleton you can build on. In the sample, the ActionBlock handles and processes all your messages in parallel as they're received from your messaging system.
public class Processor
{
private readonly IMessagingSystem _messagingSystem;
private readonly ActionBlock<Message> _handler;
private bool _pollForMessages;
public Processor(IMessagingSystem messagingSystem)
{
_messagingSystem = messagingSystem;
_handler = new ActionBlock<Message>(msg => Process(msg), new ExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 5 //or any configured value
});
}
public async Task Start()
{
_pollForMessages = true;
while (_pollForMessages)
{
var msg = await _messagingSystem.ReceiveMessageAsync();
await _handler.SendAsync(msg);
}
}
public void Stop()
{
_pollForMessages = false;
}
private void Process(Message message)
{
//handle message
}
}
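As a rough check of how MaxDegreeOfParallelism behaves, and of how to drain the block on shutdown (the skeleton's Stop() only flips a flag), here is a self-contained sketch. The ParallelismDemo class, the counters and the 50 ms delay are made up for illustration, and it assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class ParallelismDemo
{
    static async Task Main()
    {
        var gate = new object();
        int running = 0, peak = 0;

        var handler = new ActionBlock<int>(async msg =>
        {
            // Track how many handlers are active at the same time.
            lock (gate) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(50); // stands in for the real processing logic
            lock (gate) { running--; }
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5 });

        for (int i = 0; i < 20; i++) handler.Post(i);

        // Complete() stops the block accepting new messages;
        // Completion finishes once everything already queued has been processed.
        handler.Complete();
        await handler.Completion;

        Console.WriteLine(peak <= 5); // True: the block never ran more than 5 handlers at once
    }
}
```

In the Processor skeleton, calling Complete() and awaiting Completion from Stop() (or a StopAsync variant) would let in-flight messages finish before the service shuts down.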
More Examples and Ideas
OK, sorry, I'm short on time, but here's the general idea/skeleton of what I was thinking of as an alternative.
If I'm honest, though, I think ActionBlock<T> is the better option, as so much is done for you; the only limit is that you can't dynamically scale the amount of work it does at once, although I think the limit can be set quite high. Doing it this way gives you more control, or a dynamic number of running tasks, but you'll have to do a lot of things manually. For example, if you want to limit the number of tasks running at a time, you'd have to implement a queueing system (something ActionBlock handles for you) and then maintain it. I guess it depends on how many messages you're receiving and how fast your process handles them.
You'll have to check it out and think about how it applies to your use case, as I think some of the details around the ConcurrentBag idea are a little sketchily implemented on my side.
So the idea behind what I've thrown together here is that you can start any number of tasks, or add to the tasks running or cancel tasks individually by using the collection.
The main thing I think is just making the method that the Callback runs fire off a thread that does the work, instead of subscribing within a separate thread.
I used Task.Factory.StartNew as you did, but stored the returned Task object in an object (TaskInfo) which also has its CancellationTokenSource and its Id (assigned externally) as properties, and then added that to a collection of TaskInfo, which is a property on the class this is all a part of:
Updated - to avoid this being too confusing, I've just updated the code that was here previously.
You'll have to update bits of it and fill in the blanks in places like with whatever you have for my HeartbeatController, and the few events that get called because they're beyond the scope of the question but the idea would be the same.
public class TaskContainer
{
private ConcurrentBag<TaskInfo> Tasks;
public TaskContainer(){
Tasks = new ConcurrentBag<TaskInfo>();
}
//entry point
//UPDATED
public void StartAndMonitor(int processorCount)
{
for (int i = 0; i < processorCount; i++)
{
Processor task = new Processor(i);
CreateProcessorTask(task);
}
this.IsRunning = true;
MonitorTasks();
}
private void CreateProcessorTask(Processor processor)
{
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
Task taskInstance = Task.Factory.StartNew(
() => processor.Start(cancellationTokenSource.Token)
);
//bind status update event
processor.ProcessorStatusUpdated += ReportProcessorProcess;
Tasks.Add(new TaskInfo()
{
ProcessorId = processor.ProcessorId,
Task = taskInstance,
CancellationTokenSource = cancellationTokenSource
});
}
//this method gets called once but the HeartbeatController gets an action as a param that it then
//executes on a timer. I haven't included that but you get the idea
//This method also checks for tasks that have stopped and restarts them if the manifest call says they should be running.
//Will also start any new tasks included in the manifest and stop any that aren't included in the manifest.
internal void MonitorTasks()
{
HeartbeatController.Beat(() =>
{
HeartBeatHappened?.Invoke(this, null);
List<int> tasksToStart = new List<int>();
//this is an api call or whatever drives your config that says what tasks must be running.
var newManifest = this.GetManifest(Properties.Settings.Default.ResourceId);
//task Removed Check - If a Processor is removed from the task pool, cancel it if running and remove it from the Tasks List.
List<int> instanceIds = new List<int>();
newManifest.Processors.ForEach(x => instanceIds.Add(x.ProcessorId));
var removed = Tasks.Select(x => x.ProcessorId).ToList().Except(instanceIds).ToList();
if (removed.Count() > 0)
{
foreach (var extaskId in removed)
{
var task = Tasks.FirstOrDefault(x => x.ProcessorId == extaskId);
task.CancellationTokenSource?.Cancel();
}
}
foreach (var newtask in newManifest.Processors)
{
var oldtask = Tasks.FirstOrDefault(x => x.ProcessorId == newtask.ProcessorId);
//Existing task check
if (oldtask != null && oldtask.Task != null)
{
if (!oldtask.Task.IsCanceled && (oldtask.Task.IsCompleted || oldtask.Task.IsFaulted))
{
var ex = oldtask.Task.Exception;
tasksToStart.Add(oldtask.ProcessorId);
continue;
}
}
else //New task Check
tasksToStart.Add(newtask.ProcessorId);
}
foreach (var item in tasksToStart)
{
var taskToRemove = Tasks.FirstOrDefault(x => x.ProcessorId == item);
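// NOTE: ConcurrentBag<T> has no Remove method, so this line assumes a collection that supports
// removal (for example a ConcurrentDictionary keyed by ProcessorId).
if (taskToRemove != null)
Tasks.Remove(taskToRemove);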
var task = newManifest.Processors.FirstOrDefault(x => x.ProcessorId == item);
if (task != null)
{
CreateProcessorTask(task);
}
}
});
}
}
//UPDATED
public class Processor{
public int ProcessorId { get; } // read by TaskContainer
private Subsriber<Message> subsriber;
public Processor(int processorId) => ProcessorId = processorId;
public void Start(CancellationToken token)
{
Subsriber<Message> subsriber = new Subsriber<Message>()
{
Interval = 1000
};
subsriber.Callback(Process, m => m != null);
}
private void Process(Message message)
{
//do work
}
}
Hope this gives you an idea of how else you can approach your problem and that I didn't miss the point :).
Update
To use events to update progress or which tasks are processing, I'd extract them into their own class, which then has subscribe methods on it, and when creating a new instance of that class, assign the event to a handler in the parent class which can then update your UI or whatever you want it to do with that info.
So the content of Process() would look more like this:
Processor processor = new Processor();
Task task = Task.Factory.StartNew(() => processor.ProcessMessage(cancellationTokenSource.Token));
processor.StatusUpdated += ReportProcess;

Buffer tasks data with timeout

I have a rather tricky problem to solve. I have multiple (up to a hundred or more) tasks, each of which produces a piece of data, say, a string. These tasks can be spawned at any moment; there can be a huge number of them at one time and none at another. Each task must receive a bool indicating whether it was completed correctly or not (that's important).
I want to implement some kind of buffer to aggregate data from the tasks and flush it to an external service, returning the operation state (ok or fail). The buffer must also be flushed on a timeout (to avoid waiting too long for new tasks to generate data).
So far I have tried a shared list of items. Tasks add items to the list, and another task checks a timer or the item count and flushes the list. But with this approach I can't report the status of the flush operation back to the task, which is very bad for me.
I'd be grateful for any approach to solving my problem.
As I understand it, you need to save the result of each task to a database/service, but you don't want to do it immediately.
There can be more than one solution to your problem, but it's difficult to come up with the best one, so I'll describe how I would have done it ... quickly.
A container for data you need to save/send.
public class TaskResultEventArgs : EventArgs
{
public bool Result { get; set; }
}
A notifier which also runs the task for you. I assumed you can delay execution of tasks.
public class NotifyingTaskRunner
{
public event EventHandler<TaskResultEventArgs> TaskCompleted;
public void RunAndNotify(Task<bool> task)
{
task.ContinueWith(t =>
{
OnTaskCompleted(this, new TaskResultEventArgs { Result = t.Result });
}, TaskContinuationOptions.OnlyOnRanToCompletion);
task.Start();
}
protected virtual void OnTaskCompleted(object sender, TaskResultEventArgs e)
{
var h = TaskCompleted;
if (h != null)
{
h.Invoke(sender, e);
}
}
}
A listener which can buffer and/or flush results (or you might want to delegate this to another class).
public class Listener
{
private ConcurrentQueue<bool> _queue = new ConcurrentQueue<bool>();
public Listener(NotifyingTaskRunner runner)
{
runner.TaskCompleted += Flush;
}
public void Flush(object sender, TaskResultEventArgs e)
{
// Enqueue status to flush everything later (or flush it immediately)
_queue.Enqueue(e.Result);
}
}
And this is how you can use everything together.
var runner = new NotifyingTaskRunner();
var listener = new Listener(runner);
var t1 = new Task<bool>(() => { return true; });
var t2 = new Task<bool>(() => { return false; });
runner.RunAndNotify(t1);
runner.RunAndNotify(t2);
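The listener above buffers results but does not report the flush status back to the producing task. One possible way to do that (a sketch of a different technique, not part of this answer) is to pair each item with a TaskCompletionSource<bool> and batch the pairs with a TPL Dataflow BatchBlock. FlushingBuffer, Entry, AddAsync and the flush delegate are hypothetical names. The timer here flushes on a fixed period rather than on a timeout measured from the first buffered item. It assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical type: not part of the original answer.
public sealed class FlushingBuffer : IDisposable
{
    private sealed class Entry
    {
        public string Data;
        public TaskCompletionSource<bool> Done;
    }

    private readonly BatchBlock<Entry> _batch;
    private readonly ActionBlock<Entry[]> _flusher;
    private readonly Timer _timer;

    // flush is an assumed delegate that sends one batch to the external service.
    public FlushingBuffer(int batchSize, TimeSpan interval, Func<string[], Task<bool>> flush)
    {
        _batch = new BatchBlock<Entry>(batchSize);
        _flusher = new ActionBlock<Entry[]>(async entries =>
        {
            bool ok;
            try { ok = await flush(entries.Select(e => e.Data).ToArray()); }
            catch { ok = false; } // a failed flush is reported as false to every producer
            foreach (var entry in entries) entry.Done.TrySetResult(ok);
        });
        _batch.LinkTo(_flusher);
        // Periodically force out a partial batch; TriggerBatch does nothing when the buffer is empty.
        _timer = new Timer(_ => _batch.TriggerBatch(), null, interval, interval);
    }

    // Each producer awaits the returned task to learn whether its item was flushed successfully.
    public Task<bool> AddAsync(string data)
    {
        var entry = new Entry
        {
            Data = data,
            Done = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously)
        };
        _batch.Post(entry);
        return entry.Done.Task;
    }

    public void Dispose() => _timer.Dispose();
}

public static class Demo
{
    public static async Task Main()
    {
        // The flush delegate here just pretends the external call succeeded.
        using (var buffer = new FlushingBuffer(100, TimeSpan.FromMilliseconds(200),
                   items => Task.FromResult(items.Length > 0)))
        {
            bool[] results = await Task.WhenAll(buffer.AddAsync("a"), buffer.AddAsync("b"));
            Console.WriteLine(string.Join(",", results)); // True,True
        }
    }
}
```

The batch is flushed either when batchSize items have accumulated or when the timer fires, whichever comes first, and every producer in the batch receives the same ok/fail result.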

Query on Queues and Thread Safety

Thread safety is not an aspect I have worried about much, as the simple apps and libraries I have written usually only run on the main thread, or do not directly modify properties or fields in any classes that I needed to worry about.
However, I have started working on a personal project in which I use a WebClient to download data asynchronously from a remote server. There is a Queue<Uri> that contains a pre-built series of URIs to download data from.
So consider the following snippet (this is not my real code, but something I am hoping illustrates my question):
private WebClient webClient = new WebClient();
private Queue<Uri> requestQueue = new Queue<Uri>();
public Boolean DownloadNextASync()
{
if (webClient.IsBusy)
return false;
if (requestQueue.Count == 0)
return false;
var uri = requestQueue.Dequeue();
webClient.DownloadDataAsync(uri);
return true;
}
If I am understanding correctly, this method is not thread-safe (assuming this specific instance of this object is known to multiple threads). My reasoning is that webClient could become busy between the IsBusy check and the DownloadDataAsync() call. Likewise, requestQueue could become empty between the Count check and the moment the next item is dequeued.
My question is what is the best way to handle this type of situation to make it thread-safe?
This is more of an abstract question as I realize for this specific method that there would have to be an exceptionally inconvenient timing for this to actually cause a problem, and to cover that case I could just wrap the method in an appropriate try-catch since both pieces would throw an exception. But is there another option? Would a lock statement be applicable here?
If you're targeting .NET 4.0, you could use the Task Parallel Library for help:
var queue = new BlockingCollection<Uri>();
var maxClients = 4;
// Optionally provide another producer/consumer collection for the data
// var data = new BlockingCollection<Tuple<Uri,byte[]>>();
// Optionally implement CancellationTokenSource
var clients = from id in Enumerable.Range(0, maxClients)
select Task.Factory.StartNew(
() =>
{
var client = new WebClient();
while (!queue.IsCompleted)
{
Uri uri;
if (queue.TryTake(out uri))
{
byte[] datum = client.DownloadData(uri); // already "async"
// Optionally pass datum along to the other collection
// or work on it here
}
else Thread.SpinWait(100);
}
});
// Add URI's to search
// queue.Add(...);
// Notify our clients that we've added all the URI's
queue.CompleteAdding();
// Wait for all of our clients to finish
// (the LINQ query is lazy, so ToArray() is also what actually starts the tasks)
Task.WaitAll(clients.ToArray());
To use this approach for progress indication you can use TaskCompletionSource<TResult> to manage the Event based parallelism:
public static Task<byte[]> DownloadAsync(Uri uri, Action<double> progress)
{
var source = new TaskCompletionSource<byte[]>();
Task.Factory.StartNew(
() =>
{
var client = new WebClient();
client.DownloadProgressChanged
+= (sender, e) => progress(e.ProgressPercentage);
client.DownloadDataCompleted
+= (sender, e) =>
{
if (!e.Cancelled)
{
if (e.Error == null)
{
source.SetResult((byte[])e.Result);
}
else
{
source.SetException(e.Error);
}
}
else
{
source.SetCanceled();
}
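};
// Start the download; without this call the handlers above never fire.
client.DownloadDataAsync(uri);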
});
return source.Task;
}
Used like so:
// var urls = new List<Uri>(...);
// var progressBar = new ProgressBar();
Task.Factory.StartNew(
() =>
{
foreach (var uri in urls)
{
var task = DownloadAsync(
uri,
p =>
progressBar.Invoke(
new MethodInvoker(
delegate { progressBar.Value = (int)p; })) // ProgressPercentage is already 0-100
);
// Will Block!
// data = task.Result;
}
});
I highly recommend reading "Threading In C#" by Joseph Albahari. I have taken a look through it in preparation for my first (mis)adventure into threading and it's pretty comprehensive.
You can read it here: http://www.albahari.com/threading/
Both of the thread-safety concerns you raised are valid. Furthermore, both WebClient and Queue are documented as not being thread-safe (at the bottom of the MSDN docs). For example, if two threads were dequeuing simultaneously, they might cause the queue to become internally inconsistent or get nonsensical return values. Suppose the implementation of Dequeue() was something like:
1. var valueToDequeue = this._internalList[this._startPointer];
2. this._startPointer = (this._startPointer + 1) % this._internalList.Count;
3. return valueToDequeue;
and two threads each executed line 1 before either continued to line 2, then both would return the same value (there are other potential issues here as well). This would not necessarily throw an exception, so you should use a lock statement to guarantee that only one thread can be inside the method at a time:
private readonly object _lock = new object();
...
lock (this._lock) {
// body of method
}
You could also lock on the WebClient or the Queue if you know that no-one else will be synchronizing on them.
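Applied to the queue part of the question, the lock approach could look like the sketch below. RequestQueue and its member names are made up for illustration, and the WebClient is left out so the example stays self-contained. The key point is that the emptiness check and the dequeue happen inside the same lock, so no other thread can slip in between them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: serializes all access to a plain Queue<Uri>.
public class RequestQueue
{
    private readonly object _lock = new object();
    private readonly Queue<Uri> _queue = new Queue<Uri>();

    public void Enqueue(Uri uri)
    {
        lock (_lock) { _queue.Enqueue(uri); }
    }

    // Check-then-dequeue as one atomic step.
    public bool TryDequeue(out Uri uri)
    {
        lock (_lock)
        {
            if (_queue.Count == 0)
            {
                uri = null;
                return false;
            }
            uri = _queue.Dequeue();
            return true;
        }
    }
}

public static class RequestQueueDemo
{
    public static void Main()
    {
        var queue = new RequestQueue();
        queue.Enqueue(new Uri("http://example.com/a"));

        Console.WriteLine(queue.TryDequeue(out var first));  // True
        Console.WriteLine(queue.TryDequeue(out var second)); // False: the queue is now empty
    }
}
```

On .NET 4.0 and later, ConcurrentQueue<T>.TryDequeue offers the same check-and-take semantics without an explicit lock.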

Understand the flow of control when calling a blocking code from non-blocking block?

I have the following code
static void Main(string[] args)
{
//var source = BlockingMethod();
var source2 = NonBlocking();
source2.Subscribe(Console.WriteLine);
//source.Subscribe(Console.WriteLine);
Console.ReadLine();
}
private static IObservable<string> BlockingMethod()
{
var subject = new ReplaySubject<string>();
subject.OnNext("a");
subject.OnNext("b");
subject.OnCompleted();
Thread.Sleep(1000);
return subject;
}
private static IObservable<string> NonBlocking()
{
return Observable.Create<string>(
observable =>
{
observable.OnNext("c");
observable.OnNext("d");
observable.OnCompleted();
//Thread.Sleep(1000);
var source = BlockingMethod();
source.Subscribe(Console.WriteLine);
return Disposable.Create(() => Console.WriteLine("Observer has unsubscribed"));
//or can return an Action like
//return () => Console.WriteLine("Observer has unsubscribed");
});
}
}
which prints
c
d
Observer has unsubscribed
a
b
Can anyone help me understand the flow of control in this program? I tried reading the call stack etc., but could not understand everything.
EDIT
Why do I get the above output (which I assume is right) instead of
c
d
a
b
Observer has unsubscribed
The difference between your expected behaviour and the actual behaviour comes from the following line:
var subject = new ReplaySubject<string>();
By default a ReplaySubject uses the Scheduler.CurrentThread. It's as if you declared it like so:
var subject = new ReplaySubject<string>(Scheduler.CurrentThread);
When scheduling using the current thread you get your actions queued up - waiting for the currently executing code to complete before it starts. If you want the code to run immediately you need to use Scheduler.Immediate like so:
var subject = new ReplaySubject<string>(Scheduler.Immediate);
Does this explain it sufficiently?
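The queuing behaviour of the two schedulers can be seen in isolation with a small sketch (the SchedulerDemo class and the Run helper are made up for illustration). Each scheduler runs an outer action that schedules an inner action on the same scheduler. It assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Reactive.Concurrency;

class SchedulerDemo
{
    // Hypothetical helper: schedules an outer action that schedules an inner one.
    static void Run(IScheduler scheduler, string name)
    {
        scheduler.Schedule(() =>
        {
            Console.WriteLine($"{name}: outer start");
            scheduler.Schedule(() => Console.WriteLine($"{name}: inner"));
            Console.WriteLine($"{name}: outer end");
        });
    }

    static void Main()
    {
        // CurrentThread queues the inner action until the outer one has finished:
        // outer start, outer end, inner
        Run(Scheduler.CurrentThread, "CurrentThread");

        // Immediate runs the inner action inline, in the middle of the outer one:
        // outer start, inner, outer end
        Run(Scheduler.Immediate, "Immediate");
    }
}
```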
