Awaiting the generation of multiple items in a single thread - c#

(Note: English is not my native language and I understand that the title of this question is far from good, but I tried my best to make the question itself clear)
Let's suppose I have an IEnumerable<T> ts that has MANY items and that each MoveNext() is VERY expensive - let's say ts was generated using a yield return method that makes expensive computations.
Look at this piece of code:
var enumerator = ts.GetEnumerator();
while (true) {
T t = await TaskEx.Run<T>(() => enumerator.MoveNext() ?
enumerator.Current :
null);
if (t == null) break;
t.DoSomeLightweigthOperation();
}
It consumes the collection ts asynchronously, without blocking the main thread (which, in my case is a UI thread). The problem is: it spawns one thread for each item in ts. Is there a (simple) way to do the same thing using only ONE thread that does the entire job?
So, just to make myself clear, I want to generate these items using only one thread, but I need to get some kind of collection of Tasks (or any other class with a GetAwaiter) so that I can await the generation of each of these items.

Your existing solution does not spawn one thread for each item. Rather, it creates a thread pool work item that is queued to the thread pool, once for each item. It is possible (even likely) that each MoveNext will actually be done by the same threadpool thread.
So, I think your existing solution will work, given your constraints.
If you can change the enumerable at all, I'd consider IAsyncEnumerator<T>, which has a Task<bool> MoveNext() member. IAsyncEnumerator<T>/IAsyncEnumerable<T> are part of Ix_Experimental-Async. I also wrote a nearly-identical IAsyncEnumerator<T>, which is part of Nito.AsyncEx. Asynchronous enumeration is a closer match to what you're trying to do; there's a Channel9 video that describes asynchronous enumeration (though the API has changed slightly since then).
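As a rough sketch of what asynchronous enumeration can look like, the example below wraps a synchronous enumerable so that each MoveNext() runs on the thread pool while the consumer awaits it. It assumes the IAsyncEnumerable<T> and await foreach support built into C# 8 / .NET Core 3.0 and later, which is not the Ix_Experimental-Async or Nito.AsyncEx type mentioned above. The names ToAsyncEnumerable, ExpensiveItems and ConsumeAsync are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncEnumerationSketch
{
    // Hypothetical helper: runs each expensive MoveNext() on the thread pool
    // and exposes the sequence as an IAsyncEnumerable<T>.
    public static async IAsyncEnumerable<T> ToAsyncEnumerable<T>(IEnumerable<T> source)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            // Only one MoveNext() is in flight at a time, so the enumerator
            // is never accessed concurrently.
            while (await Task.Run(() => enumerator.MoveNext()))
            {
                yield return enumerator.Current;
            }
        }
    }

    // Stand-in for the expensive yield return method from the question.
    public static IEnumerable<int> ExpensiveItems()
    {
        for (int i = 0; i < 3; i++)
        {
            Thread.Sleep(50); // simulate an expensive computation
            yield return i * i;
        }
    }

    public static async Task<List<int>> ConsumeAsync()
    {
        var results = new List<int>();
        await foreach (int item in ToAsyncEnumerable(ExpensiveItems()))
        {
            results.Add(item); // the lightweight per-item work goes here
        }
        return results;
    }

    public static async Task Main()
    {
        List<int> items = await ConsumeAsync();
        Console.WriteLine(string.Join(", ", items));
    }
}
```

When ConsumeAsync is awaited from a UI thread, the body of the await foreach loop resumes on that thread by default, so the lightweight operation can touch UI state while the expensive part stays on the thread pool.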

If all you want to do is run through ts on another thread you could do something like this:
ThreadPool.QueueUserWorkItem(_ =>
{
foreach(T t in ts)
{
t.DoSomeLightweigthOperation();
}
});
Which just runs through it on a thread from the threadpool.
UPDATE: So the operation does UI work, which means it needs to be queued on the UI thread. You don't say which UI framework you use, but all of them have ways of doing that.
For instance, if you are using WPF you can use the Dispatcher.

Related

How to store results from tasks running in threadpool?

I have a problem with thread pool efficiency. I'm not sure I understand the whole concept. I did a lot of reading before asking this question and I know that a thread pool is a good solution if you have a lot of small, relatively quick functions AND, more importantly, non-blocking tasks. Using lock is very bad in a thread pool.
And here is my question: How to return values from threadpool functions? If you have functions to run they probably produce some results, right? It's good to store those results somewhere. Where?
I'm running ca. 200k very quick functions in a thread pool. I store the results in a List. Of course I have to do:
lock(lockobj)
{
myList.Add(result);
}
So, is this the right way? I mean, if your functions return SOMETHING, you have to store the results in some kind of collection. It has to be a blocking collection. So, I started thinking... "Blocking is very bad in a thread pool, but you have to do this at least once, at the end of every function."
How to store/return results from functions running in threadpool?
Thanks!
JB
EDIT: By "function" I mean...
ThreadPool.QueueUserWorkItem(state =>
{
Result r = function(); // previously named "Task"
lock(lockobj)
{
allResults.Add(r);
}
});
If you don't want to block the ThreadPool threads use a lock-free approach. ConcurrentQueue is currently lock-free (as of .NET 4.6.2) when you enqueue items.
So simply do this:
public static ConcurrentQueue<Result> AllResults { get; } = new ConcurrentQueue<Result>();
ThreadPool.QueueUserWorkItem(state =>
{
Result r = function();
AllResults.Enqueue(r);
});
This will guarantee you don't block ThreadPool threads.
Any kind of collection that is thread-safe/synchronized will do. There are plenty in the .NET Framework.
You can also use volatile variables to share data between multiple threads, but this is usually considered bad practice.
Another approach is to schedule those operations as tasks that produce results. They run on the thread pool by default, and you can get the return values by awaiting the tasks or by checking the Result of the Task that is returned.
Finally, you can write your own code to synchronize access to certain regions of code or variables, using things like lock, semaphores, mutexes, etc.
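The task-based approach mentioned above could look roughly like the following sketch. Compute is a hypothetical stand-in for the question's quick function, and RunAllAsync is an illustrative name. Each task carries its own result, so no shared list or lock is involved.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskResultsSketch
{
    // Hypothetical stand-in for the question's quick function.
    public static int Compute(int input) => input * 2;

    public static async Task<int[]> RunAllAsync(int count)
    {
        // Each Task.Run queues one work item to the thread pool.
        Task<int>[] tasks = Enumerable.Range(0, count)
            .Select(i => Task.Run(() => Compute(i)))
            .ToArray();

        // WhenAll returns the results in the same order as the tasks,
        // so no shared collection or lock is needed.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        int[] results = await RunAllAsync(1000);
        Console.WriteLine(results.Length);
    }
}
```

For a very large number of tiny functions, the per-task overhead may matter more than the locking it replaces, so this is worth measuring before committing to it.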

C# await on a List<T> Count

I am upgrading some legacy WinForms code and I am trying to figure out what the "right way" is, as of .NET 4.6.1, to refactor the following.
The current code is doing a tight while(true) loop while checking a bool property. This property puts a lock() on a generic List<T> and then returns true if it has no items (list.Count == 0).
The loop has the dreaded Application.DoEvents() in it to make sure the message pump continues processing, otherwise it would lock up the application.
Clearly, this needs to go.
My confusion is how to start a basic refactoring where it still can check the queue length, while executing on a thread and not blowing out the CPU for no reason. A delay between checks here is fine, even a "long" one like 100ms+.
I was going to go with an approach that makes the method async and lets a Task run to do the check:
await Task.Run(() => KeepCheckingTheQueue());
Of course, this keeps me in the situation of the method needing to ... loop to check the state of the queue.
Between the waiting, awaiting, and various other methods that can be used to move this stuff to the thread pool ... any suggestion on how best to handle this?
What I need is how to best "poll" a boolean member (or property) while freeing the UI, without the DoEvents().
The answer you're asking for:
private async Task WaitUntilAsync(Func<bool> func)
{
while (!func())
await Task.Delay(100);
}
await WaitUntilAsync(() => list.Count == 0);
However, polling like this is a really poor approach. If you can describe the actual problem your code is solving, then you can get better solutions.
For example, if the list represents some queue of work, and your code is wanting to asynchronously wait until it's done, then this can be better coded using an explicit signal (e.g., TaskCompletionSource<T>) or a true producer/consumer queue (e.g., TPL Dataflow).
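To illustrate the explicit-signal idea, here is a minimal sketch, assuming the list really is a queue of pending work. SignalingWorkList and its members are hypothetical names, not part of the original code. The wrapper completes a task when the last item is removed, so a caller can await that task instead of polling Count.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical wrapper around the list: it completes a task when the last
// item is removed, so callers can await that instead of polling Count.
public sealed class SignalingWorkList<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _items = new List<T>();
    private TaskCompletionSource<bool> _empty = CreateCompletedSource();

    private static TaskCompletionSource<bool> NewSource() =>
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    private static TaskCompletionSource<bool> CreateCompletedSource()
    {
        TaskCompletionSource<bool> source = NewSource();
        source.SetResult(true); // a new, empty list starts out "drained"
        return source;
    }

    public void Add(T item)
    {
        lock (_gate)
        {
            if (_items.Count == 0)
                _empty = NewSource(); // no longer empty: reset the signal
            _items.Add(item);
        }
    }

    public bool Remove(T item)
    {
        lock (_gate)
        {
            bool removed = _items.Remove(item);
            if (removed && _items.Count == 0)
                _empty.TrySetResult(true); // wake up everyone awaiting WhenEmptyAsync
            return removed;
        }
    }

    // Returns a task that completes once the list has been emptied.
    public Task WhenEmptyAsync()
    {
        lock (_gate) return _empty.Task;
    }
}

public static class SignalDemo
{
    public static async Task Main()
    {
        var work = new SignalingWorkList<int>();
        work.Add(1);
        Task drained = work.WhenEmptyAsync();
        _ = Task.Run(() => work.Remove(1)); // a worker finishes the last item
        await drained;                      // no polling loop, no DoEvents()
        Console.WriteLine("drained");
    }
}
```

The locking stays inside the wrapper, and a UI event handler only needs `await work.WhenEmptyAsync();`.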
It's generally never a good idea for client code to worry about locking a collection (or sprinkling your code with lock() blocks everywhere) before querying it. Best to encapsulate that complexity out.
Instead I recommend using one of the .NET concurrent collections such as ConcurrentBag. No need for creating a Task which is somewhat expensive.
If your collection does not change much you might want to consider one of the immutable thread-safe collections such as ImmutableList<>.
EDIT: Upon reading your comments, I suggest you use a WinForms Timer, OnApplicationIdle, or BackgroundWorker. The problem with async is that you still need to periodically call it. Using a timer or app idle callback offers the benefit of using the GUI thread.
Depending on the use case, you could start a background thread or a background worker. Or maybe even a timer.
Those are executed in a different thread, and are therefore not locking the execution of your other form related code. Invoke the original thread if you have to perform actions on the UI thread.
I would also recommend avoiding locking as much as possible, for example by doing a check before actually locking:
if (list.Count == 0)
{
lock (lockObject)
{
if (list.Count == 0)
{
// execute your code here
}
}
}
That way you are only locking if you really need to and you avoid unnecessary blocking of your application.
I think what you're after here is the ability to await Task.Yield().
class TheThing {
private readonly List<int> _myList = new List<int>();
public async Task WaitForItToNotBeEmpty() {
bool hadItems;
do {
await Task.Yield();
lock (_myList) // Other answers have touched upon this locking concern
hadItems = _myList.Count != 0;
} while (!hadItems);
}
// ...
}

Best practice for queueing events in WPF

I have a WPF (MVVM) project where I have multiple view-models, each with a button that launches different analyses on the same data source, which in this case is a file. The file cannot be shared, so if the buttons are pressed near the same time the second call will fail.
I need a way to queue the button clicks so that each analysis can be run sequentially, but I can't seem to get it to work. I tried using a static Semaphore, SemaphoreSlim and Mutex, but they appear to stop everything (the Wait() function appears to block the currently running analysis). I tried a lock() command with a static object but it didn't seem to block either event (I get the file share error). I also tried a thread pool (with a max concurrent thread count of 1), but it gives threading errors updating the UI (this may be solvable with Invoke() calls).
My question is what might be considered best practice in this situation with WPF?
EDIT: I created a mockup which exhibits the problem I'm having. It is at http://1drv.ms/1s4oQ1T.
What you need here is an asynchronous queue, so that you can enqueue these tasks without actually having anything blocking your threads. SemaphoreSlim actually has a WaitAsync method that makes creating such a queue rather simple:
public class TaskQueue
{
private SemaphoreSlim semaphore;
public TaskQueue()
{
semaphore = new SemaphoreSlim(1);
}
public async Task<T> Enqueue<T>(Func<Task<T>> taskGenerator)
{
await semaphore.WaitAsync();
try
{
return await taskGenerator();
}
finally
{
semaphore.Release();
}
}
public async Task Enqueue(Func<Task> taskGenerator)
{
await semaphore.WaitAsync();
try
{
await taskGenerator();
}
finally
{
semaphore.Release();
}
}
}
This allows you to enqueue operations that will be all executed sequentially, rather than in parallel, and without blocking any threads at any time. The operations can also be any type of asynchronous operation, whether that is CPU bound work in another thread, IO bound work, etc.
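A small usage sketch may help show the effect. To keep it self-contained it repeats the semaphore idea in a reduced, hypothetical SerialQueue class, and uses a hypothetical ConcurrencyProbe whose FakeAnalysisAsync stands in for an analysis that reads the file. All operations are enqueued at once, yet the probe should never see more than one running at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Reduced copy of the queue idea above (non-generic overload only).
public sealed class SerialQueue
{
    private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1);

    public async Task Enqueue(Func<Task> taskGenerator)
    {
        await _semaphore.WaitAsync();
        try { await taskGenerator(); }
        finally { _semaphore.Release(); }
    }
}

// Hypothetical helper that records how many analyses overlap.
public sealed class ConcurrencyProbe
{
    private readonly object _gate = new object();
    private int _running;

    public int MaxRunning { get; private set; }

    public async Task FakeAnalysisAsync()
    {
        lock (_gate)
        {
            _running++;
            if (_running > MaxRunning) MaxRunning = _running;
        }
        await Task.Delay(20); // stands in for the work on the shared file
        lock (_gate) { _running--; }
    }
}

public static class TaskQueueUsageSketch
{
    // Enqueues all analyses immediately and returns the highest number
    // that were observed running at the same time.
    public static async Task<int> RunAnalysesAsync(int count)
    {
        var queue = new SerialQueue();
        var probe = new ConcurrencyProbe();
        var pending = new List<Task>();

        for (int i = 0; i < count; i++)
            pending.Add(queue.Enqueue(() => probe.FakeAnalysisAsync()));

        await Task.WhenAll(pending);
        return probe.MaxRunning;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAnalysesAsync(5));
    }
}
```

In the WPF case, each view-model's button handler would await an Enqueue call on one shared queue instance, and the UI updates after the await would run back on the UI thread without any Invoke() calls.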
I would do two things to solve this problem:
First, encapsulate the analysis operations in a command pattern. If you aren't familiar with it, the simplest implementation is an interface with a single function Execute. When you want to perform an analysis operation, just create one of these. You could also use the built-in ICommand interface to help, but be aware that this interface has more to it than the generic command pattern.
Of course, creation is only half the battle, so after doing so I would add it to a BlockingCollection. This collection is .NET's solution to the Producer-Consumer problem. Have a background thread that consumes this collection (executing the command objects contained within) using a foreach on the collection's GetConsumingEnumerable method and your buttons will "feed" it.
foreach (var item in bc.GetConsumingEnumerable())
{
item.Execute();
}
MSDN for Blocking Collection: http://msdn.microsoft.com/en-us/library/dd267312(v=vs.110).aspx
Now, all the semaphores, waits, etc. are done for you, and you can just add an operation to the queue (if it needs to be a queue, consider using ConcurrentQueue as the backing collection for BlockingCollection) and return on the UI thread. The background thread will pick the task up and run it.
You will need to Invoke any UI updates from the background thread of course, no getting around that issue :).
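Put together, the command-plus-BlockingCollection approach might be sketched like this. IAnalysisCommand, RecordingCommand and CommandQueueSketch are hypothetical names for illustration, and the command here only records its id rather than running a real analysis.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical minimal command interface, as described above.
public interface IAnalysisCommand
{
    void Execute();
}

public sealed class RecordingCommand : IAnalysisCommand
{
    private readonly int _id;
    private readonly List<int> _log;

    public RecordingCommand(int id, List<int> log)
    {
        _id = id;
        _log = log;
    }

    // Only the single consumer thread calls Execute, so the list needs no lock.
    public void Execute() => _log.Add(_id);
}

public static class CommandQueueSketch
{
    public static List<int> RunCommands(int count)
    {
        var log = new List<int>();

        // ConcurrentQueue is already the default backing store; it is spelled
        // out here to make the first-in, first-out ordering explicit.
        using (var commands = new BlockingCollection<IAnalysisCommand>(
            new ConcurrentQueue<IAnalysisCommand>()))
        {
            var consumer = new Thread(() =>
            {
                // Blocks while the collection is empty and ends once
                // CompleteAdding has been called and the queue is drained.
                foreach (IAnalysisCommand command in commands.GetConsumingEnumerable())
                    command.Execute();
            });
            consumer.IsBackground = true;
            consumer.Start();

            // This is what each button click would do on the UI thread.
            for (int i = 0; i < count; i++)
                commands.Add(new RecordingCommand(i, log));

            commands.CompleteAdding(); // no more work: lets the consumer finish
            consumer.Join();
        }

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunCommands(3)));
    }
}
```

In a real application the consumer thread would live as long as the application and CompleteAdding would only be called at shutdown. The Join here exists only so the sketch can return the log.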
I'd recommend a queue, in a scheduling object shared by the view-models, with a consumer task that waits on the queue to have an item added to it. When a button is pressed, the view-model adds a work item to the queue. The consumer task takes one item from the queue each time, does the analysis contained in the work item, and then checks the queue for another item, waiting for more work items to be added if there are no work items to be processed.

Comparison of Join and WaitAll

When waiting on multiple threads, can anyone compare the pros and cons of using WaitHandle.WaitAll and Thread.Join?
WaitHandle.WaitAll has a 64 handle limit so that is obviously a huge limitation. On the other hand, it is a convenient way to wait for many signals in only a single call. Thread.Join does not require creating any additional WaitHandle instances. And since it could be called individually on each thread the 64 handle limit does not apply.
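For a side-by-side view, here is a minimal sketch of both styles. The method names are hypothetical, and the "work" is just incrementing a counter. Thread.Join needs the Thread objects themselves, while WaitHandle.WaitAll needs one handle per work item and is subject to the 64 handle limit mentioned above.

```csharp
using System;
using System.Threading;

public static class JoinVersusWaitAllSketch
{
    // Waits by joining each thread in turn; no extra handles are created.
    public static int WaitWithJoin(int threadCount)
    {
        int completed = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() => Interlocked.Increment(ref completed));
            threads[i].Start();
        }

        foreach (Thread thread in threads)
            thread.Join();

        return completed;
    }

    // Waits on one event per work item with a single WaitAll call.
    // workItemCount must not exceed 64, or WaitAll throws.
    public static int WaitWithWaitAll(int workItemCount)
    {
        int completed = 0;
        var handles = new WaitHandle[workItemCount];

        for (int i = 0; i < workItemCount; i++)
        {
            var done = new ManualResetEvent(false);
            handles[i] = done;
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Interlocked.Increment(ref completed);
                done.Set();
            });
        }

        WaitHandle.WaitAll(handles);

        foreach (WaitHandle handle in handles)
            handle.Dispose();

        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(WaitWithJoin(8));
        Console.WriteLine(WaitWithWaitAll(8));
    }
}
```

Note that the WaitAll version works with thread pool work items, where there is no Thread object to join. Waiting on multiple handles is also not supported on an STA thread, which is worth keeping in mind in UI applications.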
Personally, I have never used WaitHandle.WaitAll. I prefer a more scalable pattern when I want to wait on multiple signals. You can create a counting mechanism that counts up or down and, once a specific value is reached, signals a single shared event. The CountdownEvent class conveniently packages all of this into a single class.
var finished = new CountdownEvent(1);
for (int i = 0; i < NUM_WORK_ITEMS; i++)
{
finished.AddCount();
SpawnAsynchronousOperation(
() =>
{
try
{
// Place logic to run in parallel here.
}
finally
{
finished.Signal();
}
});
}
finished.Signal();
finished.Wait();
Update:
The reason why you want to signal the event from the main thread is subtle. Basically, you want to treat the main thread as if it were just another work item. After all, it, along with the other real work items, is running concurrently as well.
Consider for a moment what might happen if we did not treat the main thread as a work item. It will go through one iteration of the for loop and add a count to our event (via AddCount) indicating that we have one pending work item, right? Let's say the SpawnAsynchronousOperation completes and gets the work item queued on another thread. Now, imagine if the main thread gets preempted before swinging around to the next iteration of the loop. The thread executing the work item gets its fair share of the CPU and starts humming along and actually completes the work item. The Signal call in the work item runs and decrements our pending work item count to zero, which will change the state of the CountdownEvent to signalled. In the meantime the main thread wakes up and goes through all iterations of the loop and hits the Wait call, but since the event got prematurely signalled it passes on by even though there are still pending work items.
Again, avoiding this subtle race condition is easy when you treat the main thread as a work item. That is why the CountdownEvent is initialized with one count and the Signal method is called before the Wait.
I like @Brian's answer as a comparison of the two mechanisms.
If you are on .NET 4, it would be worthwhile exploring the Task Parallel Library to achieve task parallelism via System.Threading.Tasks, which allows you to manage tasks across multiple threads at a higher level of abstraction. The signalling you asked about in this question to manage thread interactions is hidden or much simplified, and you can concentrate on properly defining what each Task consists of and how to coordinate them.
This may seem offtopic but as Microsoft themselves say in the MSDN docs:
"in the .NET Framework 4, tasks are the preferred API for writing multi-threaded, asynchronous, and parallel code."
The WaitAll mechanism involves kernel-mode objects. I don't think the same is true for the Join mechanism. I would prefer Join, given the opportunity.
Technically though, the two are not equivalent. IIRC Join can only operate on one thread. Waitall can hold for the signalling of multiple kernel objects.

C# Improvement on a Fire-and-Forget

Greetings
I have a program that creates multiple instances of a class, runs the same long-running Update method on all instances and waits for completion. I'm following Kev's approach from this question of adding the Update to ThreadPool.QueueUserWorkItem.
In the main program, I'm sleeping for a few minutes and then checking a Boolean in the last child to see if it is done:
while(!child[child.Length-1].isFinished) {
Thread.Sleep(...);
}
This solution is working the way I want, but is there a better way to do this? Both for the independent instances and checking if all work is done.
Thanks
UPDATE:
There doesn't need to be locking. The different instances each have a different web service url they request from, and do similar work on the response. They're all doing their own thing.
If you know the number of operations that will be performed, use a countdown and an event:
Activity[] activities = GetActivities();
int remaining = activities.Length;
using (ManualResetEvent finishedEvent = new ManualResetEvent(false))
{
foreach (Activity activity in activities)
{
ThreadPool.QueueUserWorkItem(s =>
{
activity.Run();
if (Interlocked.Decrement(ref remaining) == 0)
finishedEvent.Set();
});
}
finishedEvent.WaitOne();
}
Don't poll for completion. The .NET Framework (and the Windows OS in general) has a number of threading primitives specifically designed to prevent the need for spinlocks, and a polling loop with Sleep is really just a slow spinlock.
You can try Semaphore.
A blocking way of waiting is a bit more elegant than polling. See the Monitor.Wait/Monitor.Pulse (Semaphore works ok too) for a simple way to block and signal. C# has some syntactic sugar around the Monitor class in the form of the lock keyword.
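As a rough illustration of the Monitor.Wait/Monitor.Pulse idea, the sketch below blocks the main thread until every work item has reported completion. CompletionGate and MonitorSketch are hypothetical names, and the work item body is a placeholder for the long-running Update call.

```csharp
using System;
using System.Threading;

// Hypothetical helper: blocks a waiter until a known number of
// work items have each called SignalOne.
public sealed class CompletionGate
{
    private readonly object _gate = new object();
    private int _remaining;

    public CompletionGate(int workItemCount)
    {
        _remaining = workItemCount;
    }

    public void SignalOne()
    {
        lock (_gate)
        {
            _remaining--;
            if (_remaining == 0)
                Monitor.PulseAll(_gate); // wake any thread blocked in WaitForAll
        }
    }

    public void WaitForAll()
    {
        lock (_gate)
        {
            // Wait releases the lock while blocked and reacquires it on wake-up.
            // The loop re-checks the condition in case of an early wake-up.
            while (_remaining > 0)
                Monitor.Wait(_gate);
        }
    }
}

public static class MonitorSketch
{
    public static int RunAll(int count)
    {
        var gate = new CompletionGate(count);
        int done = 0;

        for (int i = 0; i < count; i++)
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try
                {
                    Interlocked.Increment(ref done); // placeholder for Update()
                }
                finally
                {
                    gate.SignalOne(); // signal even if the work throws
                }
            });
        }

        gate.WaitForAll(); // blocks without polling or sleeping
        return done;
    }

    public static void Main()
    {
        Console.WriteLine(MonitorSketch.RunAll(10));
    }
}
```

Unlike checking a flag on the last child, this waits for all work items regardless of the order in which they finish.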
This doesn't look good. There is almost never a valid reason to assume that when the last thread is completed that the other ones are done as well. Unless you somehow interlock the worker threads, which you should never do. It also makes little sense to Sleep(), waiting for a thread to complete. You might as well do the work that thread is doing.
If you've got multiple threads going, give them each a ManualResetEvent. You can wait on completion with WaitHandle.WaitAll(). Counting down a thread counter with the Interlocked class can work too. Or use a CountdownLatch.
