dispatch work from many threads to one synchronous thread - c#

Say I have 10 threads busily doing something and they sometimes call a method
public void HandleWidgets(Widget w) { HeavyLifting(w); }
However, I don't want my 10 threads to wait on HeavyLifting(w), but instead, dispatch the HeavyLifting(w) work to an 11th thread, the HeavyLifter thread and continue asynchronously. The HeavyLifter thread dispatched to should always be the same thread, and I don't want to make multiple threads (hence, I can't do something quite like this: C# Asynchronous call without EndInvoke?).
HeavyLifting(w) is "fire and forget" in that the threads that call HandleWidgets() don't need a callback or anything like that.
What's a healthy practice for this?

I'm surprised none of the other answers here mention TPL DataFlow. You connect a bunch of blocks together and post data through the chain. You can control the concurrency of each block explicitly, so you could do the following:
var transformBlk =
new TransformBlock<int,int>(async i => {
await Task.Delay(i);
return i * 10;
}, new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = 10});
var processorBlk=
new ActionBlock<int>(async i => {
Console.WriteLine(i);
},new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = 1});
transformBlk.LinkTo(processorBlk);
var data = Enumerable.Range(1, 20).Select(x => x * 1000);
foreach(var x in data)
{
transformBlk.Post(x);
}

Basically you have threads that are producers of work and one thread that is a consumer of it.
Create a thread and have it Take from a BlockingCollection in a loop. This is your consumer thread, which will call HeavyLifting. It simply waits until an item is available and then processes it:
A call to Take may block until an item is available to be removed.
The other threads can simply add items to the collection.
Note that BlockingCollection doesn't guarantee ordering of items added/removed by itself:
The order in which an item is removed depends on the type of collection used to create the BlockingCollection instance. When you create a BlockingCollection object, you can specify the type of collection to use
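A minimal sketch of the consumer thread described above, assuming the default FIFO backing collection. The class name HeavyLifter and its Post method are made up for illustration; heavyLifting stands in for the question's HeavyLifting(w).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical wrapper: one dedicated thread drains a BlockingCollection.
public sealed class HeavyLifter<T> : IDisposable
{
    // Default backing store is a ConcurrentQueue<T>, so items come out FIFO.
    private readonly BlockingCollection<T> _queue = new BlockingCollection<T>();
    private readonly Thread _worker;

    public HeavyLifter(Action<T> heavyLifting)
    {
        _worker = new Thread(() =>
        {
            // Blocks while the queue is empty; the loop ends after
            // CompleteAdding() once the remaining items are processed.
            // Note: an exception thrown by heavyLifting is not handled here.
            foreach (T item in _queue.GetConsumingEnumerable())
            {
                heavyLifting(item);
            }
        });
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Called from any producer thread; returns without waiting for the work.
    public void Post(T item) => _queue.Add(item);

    // Stop accepting work and wait for the worker to drain the queue.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

With this in place, HandleWidgets(w) would just call lifter.Post(w) and return immediately.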

You can create a limited concurrency TaskScheduler to be used as a Task factory, as provided in this example from MSDN, with limitation to single thread:
var lcts = new LimitedConcurrencyLevelTaskScheduler(1);
TaskFactory factory = new TaskFactory(lcts);
Then implement your function as:
public void HandleWidgets(Widget w)
{
factory.StartNew(() => HeavyLifting(w));
}

Create a queue which is shared among all threads and one semaphore, which is also shared among all threads. Worker threads (that should not wait for HeavyLifter) post requests like this:
lock (queue)
{
queue.Enqueue(request);
semaphore.Release();
}
HeavyLifter is a background thread (doesn't stop process from exiting) and runs the following code in infinite loop:
while (true)
{
semaphore.WaitOne();
Request item = null;
lock (queue)
{
item = queue.Dequeue();
}
this.ProcessRequest(item);
}

--- EDIT ---
I just noticed you need "fire and forget", in which case a blocking collection alone would be enough. The solution below is really for more complex scenarios where you need to return a result, or propagate an exception, or compose tasks in some fashion (e.g. via async/await) etc...
Use TaskCompletionSource to expose the work done in the "synchronous" thread as a Task-based API to the "client" threads.
For each invocation of HandleWidgets (CT = "client thread", "ST" = synchronous thread):
CT: Create a separate TaskCompletionSource.
CT: Dispatch HeavyLifting to the ST (probably through a BlockingCollection; also pass the TaskCompletionSource to it, so it can do the last step below).
CT: Return TaskCompletionSource's Task to the caller without waiting for the work on ST to finish.
CT: Continue normally. If/when it is no longer possible to continue without waiting on HeavyLifting to finish (in the ST), wait on the task above.
ST: When HeavyLifting finishes, call SetResult (or SetException or SetCanceled, as appropriate), which unblocks any CTs that might currently wait on the task.
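The steps above could be sketched roughly as follows. SingleThreadDispatcher and Run are hypothetical names, and the queue of Action delegates is just one way to carry the TaskCompletionSource to the synchronous thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs delegates on one dedicated thread ("ST")
// and hands each caller ("CT") a Task for the result.
public sealed class SingleThreadDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _work = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SingleThreadDispatcher()
    {
        _thread = new Thread(() =>
        {
            foreach (Action action in _work.GetConsumingEnumerable())
            {
                action();
            }
        }) { IsBackground = true };
        _thread.Start();
    }

    // CT: create the TaskCompletionSource, enqueue the work, return the Task.
    public Task<TResult> Run<TResult>(Func<TResult> func)
    {
        // RunContinuationsAsynchronously keeps awaiting callers from
        // running their continuations on the dispatcher thread.
        var tcs = new TaskCompletionSource<TResult>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _work.Add(() =>
        {
            // ST: complete the task with the result or the exception.
            try { tcs.SetResult(func()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    public void Dispose()
    {
        _work.CompleteAdding();
        _thread.Join();
        _work.Dispose();
    }
}
```

A caller can ignore the returned task for fire-and-forget use, or await it later when it actually needs the result.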

Related

Waiting on a continuous UI background polling task

I am somewhat new to parallel programming in C# (when I started my project I worked through the MSDN examples for TPL) and would appreciate some input on the following example code.
It is one of several background worker tasks. This specific task pushes status messages to a log.
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new ConcurrentQueue<string>();
var backgroundUiTask = new Task(
() =>
{
while (!uiCts.IsCancellationRequested)
{
while (globalMsgQueue.Count > 0)
ConsumeMsgQueue();
Thread.Sleep(backgroundUiTimeOut);
}
},
uiCts.Token);
// Somewhere else entirely
backgroundUiTask.Start();
Task.WaitAll(backgroundUiTask);
I'm asking for professional input after reading several topics like Alternatives to using Thread.Sleep for waiting, Is it always bad to use Thread.Sleep()?, When to use Task.Delay, when to use Thread.Sleep?, Continuous polling using Tasks
Which prompts me to use Task.Delay instead of Thread.Sleep as a first step and introduce TaskCreationOptions.LongRunning.
But I wonder what other caveats I might be missing? Is polling the MsgQueue.Count a code smell? Would a better version rely on an event instead?
First of all, there's no reason to use Task.Start or use the Task constructor. Tasks aren't threads, they don't run themselves. They are a promise that something will complete in the future and may or may not produce any results. Some of them will run on a threadpool thread. Use Task.Run to create and run the task in a single step when you need to.
I assume the actual problem is how to create a buffered background worker. .NET already offers classes that can do this.
ActionBlock< T >
The ActionBlock class already implements this and a lot more - it allows you to specify how big the input buffer is, how many tasks will process incoming messages concurrently, supports cancellation and asynchronous completion.
A logging block could be as simple as this :
_logBlock=new ActionBlock<string>(msg=>File.AppendAllText("myLog.txt",msg));
The ActionBlock class itself takes care of buffering the inputs, feeding new messages to the worker function as they arrive, potentially blocking senders if the buffer gets full, etc. There's no need for polling.
Other code can use Post or SendAsync to send messages to the block :
_logBlock.Post("some message");
When we are done, we can tell the block to Complete() and await its Completion task so that any remaining messages get processed :
_logBlock.Complete();
await _logBlock.Completion;
Channels
A newer, lower-level option is to use Channels. You can think of channels as a kind of asynchronous queue, although they can be used to implement complex processing pipelines. If ActionBlock was written today, it would use Channels internally.
With channels, you need to provide the "worker" task yourself. There's no need for polling though, as the ChannelReader class allows you to read messages asynchronously or even use await foreach.
The writer method could look like this :
public ChannelWriter<string> LogIt(string path,CancellationToken token=default)
{
var channel=Channel.CreateUnbounded<string>();
var writer=channel.Writer;
_=Task.Run(async ()=>{
await foreach(var msg in channel.Reader.ReadAllAsync(token))
{
File.AppendAllText(path,msg);
}
},token).ContinueWith(t=>writer.TryComplete(t.Exception));
return writer;
}
....
_logWriter=LogIt(somePath);
Other code can send messages by using WriteAsync or TryWrite, eg :
_logWriter.TryWrite(someMessage);
When we're done, we can call Complete() or TryComplete() on the writer :
_logWriter.TryComplete();
The line
.ContinueWith(t=>writer.TryComplete(t.Exception));
is needed to ensure the channel is closed even if an exception occurs or the cancellation token is signaled.
This may seem too cumbersome at first. Channels allow us to easily run initialization code or carry state from one message to the next. We could open a stream before the loop starts and use it instead of reopening the file each time we call File.AppendAllText, eg :
public ChannelWriter<string> LogIt(string path,CancellationToken token=default)
{
var channel=Channel.CreateUnbounded<string>();
var writer=channel.Writer;
_=Task.Run(async ()=>{
//***** Can't do this with an ActionBlock ****
// Named fileWriter so it doesn't clash with the channel writer above
using(var fileWriter=File.AppendText(path))
{
await foreach(var msg in channel.Reader.ReadAllAsync(token))
{
fileWriter.WriteLine(msg);
//Or
//await fileWriter.WriteLineAsync(msg);
}
}
},token).ContinueWith(t=>writer.TryComplete(t.Exception));
return writer;
}
Definitely Task.Delay is better than Thread.Sleep, because you will not be blocking the thread on the pool, and during the wait the thread on the pool will be available to handle other tasks. Then, you don't need to make your task long-running. Long-running tasks are run in a dedicated thread, and then Task.Delay is meaningless.
Instead, I will recommend a different approach. Just use System.Threading.Timer and make your life simple. Timers are kernel objects that will run their callback on the thread pool, and you will not have to worry about delay or sleep.
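As a rough sketch of the timer approach, assuming the messages still sit in a ConcurrentQueue as in the question: TimerDrainedLog and its members are hypothetical names, and consume stands in for whatever ConsumeMsgQueue does with one message. The timer is armed as a one-shot and re-armed after each drain, so callbacks never overlap.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: drains a queue on a thread-pool timer callback.
public sealed class TimerDrainedLog : IDisposable
{
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    private readonly Action<string> _consume;
    private readonly TimeSpan _interval;
    private readonly Timer _timer;

    public TimerDrainedLog(Action<string> consume, TimeSpan interval)
    {
        _consume = consume;
        _interval = interval;
        // Create the timer disarmed, then arm it once the field is assigned.
        _timer = new Timer(_ => Drain(), null,
            Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
        _timer.Change(_interval, Timeout.InfiniteTimeSpan);
    }

    public void Enqueue(string message) => _messages.Enqueue(message);

    private void Drain()
    {
        while (_messages.TryDequeue(out string message))
        {
            _consume(message);
        }
        // Re-arm for the next tick; ignore the race with Dispose().
        try { _timer.Change(_interval, Timeout.InfiniteTimeSpan); }
        catch (ObjectDisposedException) { }
    }

    public void Dispose() => _timer.Dispose();
}
```

This is still polling, just without a thread parked in Sleep between polls.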
The TPL Dataflow library is the preferred tool for this kind of job. It allows building efficient producer-consumer pairs quite easily, and more complex pipelines as well, while offering a complete set of configuration options. In your case using a single ActionBlock should be enough.
A simpler solution you might consider is to use a BlockingCollection. It has the advantage of not requiring the installation of any package (because it is built-in), and it's also much easier to learn. You don't have to learn more than the methods Add, CompleteAdding, and GetConsumingEnumerable. It also supports cancellation. The drawback is that it's a blocking collection, so it blocks the consumer thread while waiting for new messages to arrive, and the producer thread while waiting for available space in the internal buffer (only if you specify a boundedCapacity in the constructor).
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new BlockingCollection<string>();
var backgroundUiTask = new Task(() =>
{
foreach (var item in globalMsgQueue.GetConsumingEnumerable(uiCts.Token))
{
ConsumeMsgQueueItem(item);
}
}, uiCts.Token);
By default, the BlockingCollection uses a ConcurrentQueue internally as a buffer.

EventHub ForEach Parallel Async

Always managing to confuse myself working with async, I'm after a bit of validation/confirmation here that i'm doing what i think i'm doing in the following scenarios..
given the following trivial example:
// pretend / assume these are json msgs or something ;)
var strEvents = new List<string> { "event1", "event2", "event3" };
i can post each event to an eventhub simply as follows:
foreach (var e in strEvents)
{
// Do some things
outEventHub.Add(e); // ICollector
}
the foreach will run on a single thread, and execute each thing inside sequentially.. the posting to eventhub will also remain on the same thread too i guess??
Changing ICollector to IAsyncCollector, and achieve the following:
foreach (var e in strEvents)
{
// Do some things
await outEventHub.AddAsync(e);
}
I think i am right here in saying that the foreach will run on a single thread, the actual sending to the event hub will be pushed off elsewhere? Or at least not block that same thread..
Changing to Parallel.ForEach, as these events will be arriving 100+ or so at a time:
Parallel.ForEach(strEvents, async (e) =>
{
// Do some things
await outEventHub.AddAsync(e);
});
Starting to get a bit hazy now, as i am not sure what really is going on now... afaik each event has its own thread (within the bounds of the hardware) and steps within that thread do not block it.. so this trivial example aside.
Finally, i could turn them all in to Tasks i thought..
private static async Task DoThingAsync(string e, IAsyncCollector<string> outEventHub)
{
await outEventHub.AddAsync(e);
}
var t = new List<Task>();
foreach (var e in strEvents)
{
t.Add(DoThingAsync(e, outEventHub));
}
await Task.WhenAll(t);
now i am really hazy, and i think this is prepping everything on a single thread.. and then running everything exactly at the same time, on any thread available??
I appreciate that in order to determine which is right for the job at hand benchmarking is required... but an explanation of what the framework is doing in each situation would be super helpful for me right now..
Parallel != async
This is the main idea here. Both of them have their uses, and they can be used together, but they are very different. You are mostly right with your assumptions, but let me clarify:
Simple foreach
This is non-parallel and non-async. Nothing to talk about.
Await inside foreach
This is async code that is non-parallel.
foreach (var e in strEvents)
{
// Do some things
await outEventHub.AddAsync(e);
}
This will all take place on a single thread. It takes an event, starts adding it to your event hub, and while that is being completed (I'm guessing it does some sort of network IO) it hands the thread back to the thread pool (or the UI if it was called on a UI thread) so it can do other work while waiting on AddAsync to return. But as you said, it is not parallel at all.
Parallel Foreach (async)
This one is a trap! In short, Parallel.ForEach is designed for synchronous workloads. We'll get back to this, but first let's assume you used it with the non-async code.
Parallel foreach (sync)
A.k.a. Parallel but not async.
Parallel.ForEach(strEvents, (e) =>
{
// Do some things
outEventHub.Add(e);
});
Each item will get its own "Task", but they won't spawn a thread. Creating threads is expensive, and in an optimal case there is no point in having more threads than CPU cores. Instead these tasks run on a ThreadPool, which has just as many Threads as optimal. Each thread takes a task, works on it, then takes another one, etc.
You can think of it as - on a 4 core machine - having 4 workers around a pile of tasks, so 4 of them are being run at a time. You can imagine that this is not ideal in case of IO bound workloads (which this most likely is). If your network is slow, you can have all 4 threads blocked on trying to send the event out, while they could be doing useful work. This leads us to...
Tasks
Async and potentially parallel (depends on the usage).
Your description is correct here too, except for the ThreadPool part: it kicks off all the tasks at once (on the main thread), and they then run on the pool's threads. While they are running, the main thread is released and can do other work as needed. Up to this point it is the same as the Parallel.ForEach case. But:
What happens is that a ThreadPool thread picks up a task, does the necessary preprocessing, then sends out the network request asynchronously. This means that the task will not block while waiting for the network, but rather it releases the ThreadPool thread to pick up another work item. When the network request completes, the task's continuation (the remaining code lines after the network request) is scheduled back to the list of tasks.
You can see that theoretically this is the most efficient process, so fast that you have to be careful not to flood your network.
Back to Parallel.ForEach and async
At this point you should be able to spot the problem. All your async lambda async (e) => { await outEventHub.AddAsync(e); } does is kick off the work; it returns as soon as it hits the await. (Remember that async/await releases the thread while waiting.) Parallel.ForEach returns right after it has started all of them. But nothing is awaiting these tasks! They become fire and forget, which is usually bad practice. It is as if you had deleted the await Task.WhenAll call from your task example.
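To see the difference for yourself, here is a small sketch that counts how many sends have finished by the time each approach returns. ParallelAsyncTrap and SendAsync are hypothetical stand-ins; Task.Delay simulates the network call made by AddAsync.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelAsyncTrap
{
    // Hypothetical stand-in for an asynchronous send such as AddAsync.
    private static async Task SendAsync(int item, Action onSent)
    {
        await Task.Delay(500); // simulated network latency
        onSent();
    }

    // Returns how many sends had completed when Parallel.ForEach returned.
    public static int CompletedWhenParallelForEachReturns(int count)
    {
        int completed = 0;
        // The async lambda becomes an "async void" Action<int>;
        // Parallel.ForEach has no task to wait on.
        Parallel.ForEach(Enumerable.Range(0, count), async i =>
        {
            await SendAsync(i, () => Interlocked.Increment(ref completed));
        });
        return Volatile.Read(ref completed);
    }

    // Returns how many sends had completed after awaiting Task.WhenAll.
    public static async Task<int> CompletedAfterWhenAll(int count)
    {
        int completed = 0;
        var tasks = Enumerable.Range(0, count)
            .Select(i => SendAsync(i, () => Interlocked.Increment(ref completed)));
        await Task.WhenAll(tasks);
        return Volatile.Read(ref completed);
    }
}
```

With a handful of items, the first method should typically report fewer completed sends than it started (usually none), while the second reports all of them.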
I hope this cleared most things for you, if not, let me know what to improve on.
Why don't you send those events asynchronously in parallel, like this:
var tasks = new List<Task>();
foreach( var e in strEvents )
{
tasks.Add(outEventHub.AddAsync(e));
}
await Task.WhenAll(tasks);
await outEventHub.FlushAsync();

How can I have two separate task schedulers?

I am writing a game, and using OpenGL I require that some work be offloaded to the rendering thread where an OpenGL context is active, but everything else is handled by the normal thread pool.
Is there a way I can force a Task to be executed in a special thread-pool, and any new tasks created from an async also be dispatched to that thread pool?
I want a few specialized threads for rendering, and I would like to be able to use async and await for example for creating and filling a vertex buffer.
If I just use a custom task scheduler and a new Factory(new MyScheduler()) it seems that any subsequent Task objects will be dispatched to the thread pool anyway where Task.Factory.Scheduler suddenly is null.
The following code should show what I want to be able to do:
public async Task Initialize()
{
// The two following tasks should run on the rendering thread pool
// They cannot run synchronously because that will cause them to fail.
this.VertexBuffer = await CreateVertexBuffer();
this.IndexBuffer = await CreateIndexBuffer();
// This should be dispatched, or run synchronously, on the normal thread pool
Vertex[] vertices = CreateVertices();
// Issue task for filling vertex buffer on rendering thread pool
var fillVertexBufferTask = FillVertexBufffer(vertices, this.VertexBuffer);
// This should be dispatched, or run synchronously, on the normal thread pool
short[] indices = CreateIndices();
// Wait for tasks on the rendering thread pool to complete.
await FillIndexBuffer(indices, this.IndexBuffer);
await fillVertexBufferTask; // Wait for the rendering task to complete.
}
Is there any way to achieve this, or is it outside the scope of async/await?
This is possible, and it is basically the same thing Microsoft did for the Windows Forms and WPF synchronization contexts.
First Part - You are in the OpenGL thread, and want to put some work into the thread pool, and after this work is done you want back into the OpenGL thread.
I think the best way for you to go about this is to implement your own SynchronizationContext. This thing basically controls how the TaskScheduler works and how it schedules the task. The default implementation simply sends the tasks to the thread pool. What you need to do is to send the task to a dedicated thread (that holds the OpenGL context) and execute them one by one there.
The key to the implementation is to override the Post and Send methods. Both methods are expected to execute the callback; Send has to wait for the call to finish and Post does not. In the default implementation, which uses the thread pool, Send simply calls the callback directly and Post delegates the callback to the thread pool.
For the execution queue of your OpenGL thread, I think a thread that reads from a BlockingCollection should do nicely. Just send the callbacks to this queue. You may also need some callback in case your Post method is called from the wrong thread and you need to wait for the task to finish.
But all in all this approach should work. async/await ensures that the SynchronizationContext is restored after an async call that is executed on the thread pool, for example. So you should be able to return to the OpenGL thread after you have put some work off onto another thread.
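A minimal sketch of such a context, assuming a single dedicated thread stands in for the OpenGL thread. SingleThreadSynchronizationContext and ThreadId are hypothetical names, and exceptions thrown by callbacks are not handled here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical context that runs every callback on one dedicated thread.
public sealed class SingleThreadSynchronizationContext : SynchronizationContext, IDisposable
{
    private readonly BlockingCollection<KeyValuePair<SendOrPostCallback, object>> _queue =
        new BlockingCollection<KeyValuePair<SendOrPostCallback, object>>();
    private readonly Thread _thread;

    public SingleThreadSynchronizationContext()
    {
        _thread = new Thread(() =>
        {
            // Make this context current on the dedicated thread, so that
            // an await started here resumes here.
            SetSynchronizationContext(this);
            foreach (var work in _queue.GetConsumingEnumerable())
            {
                work.Key(work.Value);
            }
        }) { IsBackground = true };
        _thread.Start();
    }

    public int ThreadId => _thread.ManagedThreadId;

    // Post: queue the callback and return immediately.
    public override void Post(SendOrPostCallback d, object state) =>
        _queue.Add(new KeyValuePair<SendOrPostCallback, object>(d, state));

    // Send: run the callback on the dedicated thread and wait for it.
    public override void Send(SendOrPostCallback d, object state)
    {
        if (Thread.CurrentThread == _thread)
        {
            d(state); // already on the right thread; queuing would deadlock
            return;
        }
        using (var done = new ManualResetEventSlim())
        {
            Post(_ => { try { d(state); } finally { done.Set(); } }, null);
            done.Wait();
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}
```

An async method that is started through Post on this context will, after each await, continue on the same dedicated thread, because await captures SynchronizationContext.Current by default.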
Second Part - You are in another thread and want to send some work into the OpenGL thread and await the completion of that work.
This is possible too. My idea in this case is that you don't use Tasks but other awaitable objects. In general every object can be awaitable. It just has to expose a public method GetAwaiter() that returns an object implementing the INotifyCompletion interface (the awaiter also needs an IsCompleted property and a GetResult method). What await does is put the remainder of the method into a new Action and pass this action to the OnCompleted method of that interface. The awaiter is expected to call the scheduled actions once the operation it is awaiting is done. The awaiter also has to ensure that the SynchronizationContext is captured and that the continuations are executed on the captured SynchronizationContext. That sounds complicated, but once you get the hang of it, it goes fairly easily. What helped me a lot is the reference source of the YieldAwaiter (this is basically what happens if you use await Task.Yield()). This is not what you need, but I think it is a place to start.
The method that returns the awaiter has to take care of sending the actual work to the thread that has to execute it (you maybe already have the execution queue from the first part) and the awaiter has to trigger once that work is done.
Conclusion
Make no mistake: that is a lot of work. But if you do all that, you will have fewer problems down the line, because you can seamlessly use the async/await pattern as if you were working inside Windows Forms or WPF, and that is a huge plus.
First, realize that await introduces the special behavior after the method is called; that is to say, this code:
this.VertexBuffer = await CreateVertexBuffer();
is pretty much the same as this code:
var createVertexBufferTask = CreateVertexBuffer();
this.VertexBuffer = await createVertexBufferTask;
So, you'll have to explicitly schedule code to execute a method within a different context.
You mention using a MyScheduler but I don't see your code using it. Something like this should work:
this.factory = new TaskFactory(CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskContinuationOptions.None, new MyScheduler());
public async Task Initialize()
{
// Since you mention OpenGL, I'm assuming this method is called on the UI thread.
// Run these methods on the rendering thread pool.
this.VertexBuffer = await this.factory.StartNew(() => CreateVertexBuffer()).Unwrap();
this.IndexBuffer = await this.factory.StartNew(() => CreateIndexBuffer()).Unwrap();
// Run these methods on the normal thread pool.
Vertex[] vertices = await Task.Run(() => CreateVertices());
var fillVertexBufferTask = Task.Run(() => FillVertexBufffer(vertices, this.VertexBuffer));
short[] indices = await Task.Run(() => CreateIndices());
await Task.Run(() => FillIndexBuffer(indices, this.IndexBuffer));
// Wait for the rendering task to complete.
await fillVertexBufferTask;
}
I would look into combining those multiple Task.Run calls, or (if Initialize is called on a normal thread pool thread) removing them completely.

How to make asynchronous methods that use a queue thread safe

I have a service that ensures that at most one popup is displayed at a time. AddPopupAsync can be called concurrently, i.e. a popup is open while another 10 AddPopupAsync requests drop in. The code:
public async Task<int> AddPopupAsync(Message message)
{
//[...]
while (popupQueue.Count != 0)
{
await popupQueue.Peek();
}
return await Popup(interaction);
}
But I can spot two unwanted things that can happen because of the lack of thread safety:
If the queue is empty, Peek will throw an exception
If a thread A is preempted before the first statement in Popup, another thread B will not wait for the pending popup A since the queue is still empty.
Popup method works with a TaskCompletionSource and before calling its SetResult method, popupQueue.Dequeue() is called.
I think about using the atomic TryPeek from ConcurrentQueue in order to make #1 thread safe:
do
{
Task<int> result;
bool success = popupQueue.TryPeek(out result);
if (!success) break;
await result;
}
while (true);
Then I read about a AsyncProducerConsumerCollection but I'm not sure if that is the most simple solution.
How can I ensure thread safety in a simple way? Thank you.
To simply add thread-safety you should use a ConcurrentQueue. It's a thread-safe implementation of a queue and you can use it just the same way you would use a queue without worrying about concurrency.
However if you want a queue you can await asynchronously (which means that you're not busy-waiting or blocking a thread while waiting) you should use TPL Dataflow's BufferBlock which is very similar to AsyncProducerConsumerCollection only already implemented for you by MS:
var buffer = new BufferBlock<int>();
await buffer.SendAsync(3); // Instead of enqueue.
int item = await buffer.ReceiveAsync(); // instead of dequeue.
ConcurrentQueue is fine for thread-safeness, but wasteful in high levels of contention. BufferBlock is not only thread-safe it also gives you asynchronous coordination (usually between a consumer and producer).

How to wait for a boolean without looping (using any kind of wait / semaphore / event / mutex, etc)

I need to stop a thread until another thread sets a boolean value and I don't want to share between them an event.
What I currently have is the following code using a Sleep (and that's the code I want to change):
while (!_engine.IsReadyToStop())
{
System.Threading.Thread.Sleep(Properties.Settings.Default.IntervalForCheckingEngine);
}
Any ideas?
EDIT TO CLARIFY THINGS:
There is an object called _engine of a class that I don't own. I cannot modify it, that's why I don't want to share an event between them. I need to wait until a method of that class returns true.
SpinWait.SpinUntil is the right answer, regardless where you're gonna place this code. SpinUntil offers "a nice mix of spinning, yielding, and sleeping in between invocations".
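For illustration, a minimal sketch of the SpinUntil call with a timeout. EngineWaiter and WaitUntilReady are hypothetical names; isReadyToStop stands in for _engine.IsReadyToStop from the question.

```csharp
using System;
using System.Threading;

public static class EngineWaiter
{
    // Returns true once the condition holds, or false if the timeout
    // elapses first. The condition is still polled, only with the
    // spin/yield/sleep back-off that SpinWait applies between checks.
    public static bool WaitUntilReady(Func<bool> isReadyToStop, TimeSpan timeout)
    {
        return SpinWait.SpinUntil(isReadyToStop, timeout);
    }
}
```

In the question's code this would be called as EngineWaiter.WaitUntilReady(_engine.IsReadyToStop, someTimeout), where someTimeout is whatever upper bound makes sense for the engine.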
If you are using C# 4.0, you can use:
Task t = Task.Factory.StartNew (() => SomeCall(..));
t.Wait();
By using Task.Wait method.
If you have more than one task run one after another, you can use Task.ContinueWith:
Task t = Task.Factory.StartNew (() =>SomeCall(..)).
ContinueWith(prev => ExecuteAfterThisTaskFinishes(...));
t.Wait();
declare as
AutoResetEvent _ReadyToStop = new AutoResetEvent(false);
and use as
_ReadyToStop.WaitOne();
and
_ReadyToStop.Set();
For more info see the Synchronization Primitives in .Net
A condition variable is the synchronization primitive you can use for waiting on a condition.
It does not natively exist in .NET. But the following link provides 100% managed code for a condition variable class implemented in terms of SemaphoreSlim, AutoResetEvent and Monitor classes. It allows thread to wait on a condition. And can wake up one or more threads when condition is satisfied. In addition, it supports timeouts and CancellationTokens.
To wait on a condition you write code similar to the following:
object queueLock = new object();
ConditionVariable notEmptyCondition = new ConditionVariable();
ConditionVariable notFullCondition = new ConditionVariable();
T Take() {
lock(queueLock) {
while(queue.Count == 0) {
// wait for queue to be not empty
notEmptyCondition.Wait(queueLock);
}
T item = queue.Dequeue();
if(queue.Count < 100) {
// notify producer queue not full anymore
notFullCondition.Pulse();
}
return item;
}
}
Then in another thread you can wake up one or more threads waiting on condition.
lock(queueLock) {
//..add item here
notEmptyCondition.Pulse(); // or PulseAll
}
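For comparison, when only one condition per lock is needed, the built-in Monitor.Wait and Monitor.Pulse give similar wait/notify behaviour tied to the lock object itself. A minimal sketch, with MonitorQueue as a hypothetical name and only the "not empty" condition covered:

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical unbounded queue whose Take() blocks while it is empty.
public sealed class MonitorQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly object _lock = new object();

    public void Add(T item)
    {
        lock (_lock)
        {
            _queue.Enqueue(item);
            // Wake one thread waiting in Take().
            Monitor.Pulse(_lock);
        }
    }

    public T Take()
    {
        lock (_lock)
        {
            // Wait releases the lock while blocked and reacquires it
            // before returning; the loop re-checks the condition.
            while (_queue.Count == 0)
            {
                Monitor.Wait(_lock);
            }
            return _queue.Dequeue();
        }
    }
}
```

This does not cover the separate "not full" condition, timeouts or CancellationTokens that the class above supports.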
