How to await Parallel Linq actions to complete - c#

I'm not sure how I'm supposed to mix plinq and async-await. Suppose that I have the following interface
public interface IDoSomething {
Task Do();
}
I have a list of these which I would like to execute in parallel and be able to await the completion of all.
public async Task DoAll(IDoSomething[] doers) {
//Execute all doers in parallel ideally using plinq and
//continue when all are complete
}
How to implement this? I'm not sure how to go from parallel linq to Tasks and vice versa.
I'm not terribly worried about exception handling. Ideally the first one would fire and break the whole process as I plan to discard the entire thing on error.
Edit: A lot of people are saying Task.WaitAll. I'm aware of this but my understanding (unless someone can demonstrate otherwise) is that it won't actively parallelize things for you to multiple available processor cores. What I'm specifically asking is twofold -
If I await a Task within a Plinq Action, does that get rid of a lot of the advantage since it schedules a new thread?
If I call doers.AsParallel().ForAll(async d => await d.Do()), which takes about 5 seconds on average, how do I not spin the invoking thread in the meantime?

What you're looking for is this:
public Task DoAllAsync(IEnumerable<IDoSomething> doers)
{
return Task.WhenAll(doers.Select(doer => Task.Run(() => doer.Do())));
}
Task.Run uses a ThreadPool thread to execute the synchronous part of each Do call in parallel, while Task.WhenAll asynchronously waits for all of the asynchronous parts, which execute concurrently.
This is a good idea only if you have substantial synchronous parts in these async methods (i.e. the parts before an await) for example:
async Task Do()
{
for (int i = 0; i < 10000; i++)
{
Math.Pow(i,i);
}
await Task.Delay(10000);
}
Otherwise, there's no need for parallelism and you can just fire the asynchronous operations concurrently and wait for all the returned tasks using Task.WhenAll:
public Task DoAllAsync(IEnumerable<IDoSomething> doers)
{
return Task.WhenAll(doers.Select(doer => doer.Do()));
}
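As a rough sketch, the two variants can be put side by side. FlagDoer is a hypothetical implementation of the question's IDoSomething interface, invented here only for illustration; Task.Delay stands in for whatever asynchronous work a real Do would perform.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public interface IDoSomething
{
    Task Do();
}

// Hypothetical implementation, used only to make the sketch self-contained.
public sealed class FlagDoer : IDoSomething
{
    public bool Done { get; private set; }

    public async Task Do()
    {
        await Task.Delay(50); // stands in for asynchronous I/O
        Done = true;          // runs once the awaited delay has completed
    }
}

public static class DoAllSketch
{
    // Offloads the synchronous part of each Do() call to the thread pool.
    public static Task DoAllOffloadedAsync(IEnumerable<IDoSomething> doers) =>
        Task.WhenAll(doers.Select(doer => Task.Run(() => doer.Do())));

    // Starts every Do() on the calling thread, then waits for all of them together.
    public static Task DoAllAsync(IEnumerable<IDoSomething> doers) =>
        Task.WhenAll(doers.Select(doer => doer.Do()));
}
```

Either method returns a single Task that the caller can await; which one is preferable depends on how much synchronous work each Do performs before its first await.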

public async Task DoAll(IDoSomething[] doers) {
//using ToArray to materialize the query right here
//so we don't accidentally run it twice later.
var tasks = doers.Select(d => Task.Run(()=>d.Do())).ToArray();
await Task.WhenAll(tasks);
}

Related

What is the difference between starting a worker via Task.Run and just calling an async method (don't await) with await Task.Yield inside? [duplicate]

I would like to ask you on your opinion about the correct architecture when to use Task.Run. I am experiencing laggy UI in our WPF .NET 4.5
application (with Caliburn Micro framework).
Basically I am doing (very simplified code snippets):
public class PageViewModel : IHandle<SomeMessage>
{
...
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
// Makes UI very laggy, but still not dead
await this.contentLoader.LoadContentAsync();
HideLoadingAnimation();
}
}
public class ContentLoader
{
public async Task LoadContentAsync()
{
await DoCpuBoundWorkAsync();
await DoIoBoundWorkAsync();
await DoCpuBoundWorkAsync();
// I am not really sure what all I can consider as CPU bound as slowing down the UI
await DoSomeOtherWorkAsync();
}
}
From the articles/videos I read/saw, I know that await async is not necessarily running on a background thread and to start work in the background you need to wrap it with await Task.Run(async () => ... ). Using async await does not block the UI, but still it is running on the UI thread, so it is making it laggy.
Where is the best place to put Task.Run?
Should I just
(1) wrap the outer call, because this is less threading work for .NET, or
(2) wrap only the CPU-bound methods internally with Task.Run, as this makes them reusable in other places? I am not sure here if starting work on background threads deep in core is a good idea.
Ad (1), the first solution would be like this:
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
await Task.Run(async () => await this.contentLoader.LoadContentAsync());
HideLoadingAnimation();
}
// Other methods do not use Task.Run as everything regardless
// if I/O or CPU bound would now run in the background.
Ad (2), the second solution would be like this:
public async Task DoCpuBoundWorkAsync()
{
await Task.Run(() => {
// Do lot of work here
});
}
public async Task DoSomeOtherWorkAsync()
{
// I am not sure how to handle these methods -
// probably need to test one by one, if it is slowing down UI
}
Note the guidelines for performing work on a UI thread, collected on my blog:
Don't block the UI thread for more than 50ms at a time.
You can schedule ~100 continuations on the UI thread per second; 1000 is too much.
There are two techniques you should use:
1) Use ConfigureAwait(false) when you can.
E.g., await MyAsync().ConfigureAwait(false); instead of await MyAsync();.
ConfigureAwait(false) tells the await that you do not need to resume on the current context (in this case, "on the current context" means "on the UI thread"). However, for the rest of that async method (after the ConfigureAwait), you cannot do anything that assumes you're in the current context (e.g., update UI elements).
For more information, see my MSDN article Best Practices in Asynchronous Programming.
2) Use Task.Run to call CPU-bound methods.
You should use Task.Run, but not within any code you want to be reusable (i.e., library code). So you use Task.Run to call the method, not as part of the implementation of the method.
So purely CPU-bound work would look like this:
// Documentation: This method is CPU-bound.
void DoWork();
Which you would call using Task.Run:
await Task.Run(() => DoWork());
Methods that are a mixture of CPU-bound and I/O-bound should have an Async signature with documentation pointing out their CPU-bound nature:
// Documentation: This method is CPU-bound.
Task DoWorkAsync();
Which you would also call using Task.Run (since it is partially CPU-bound):
await Task.Run(() => DoWorkAsync());
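To make the split between implementation and call site concrete, here is a minimal sketch. PrimeCounter and PrimeCaller are hypothetical names chosen for illustration; the point is only that the CPU-bound method stays synchronous and Task.Run appears at the caller.

```csharp
using System.Threading.Tasks;

public static class PrimeCounter
{
    // Documentation: This method is CPU-bound.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n <= limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }
}

public static class PrimeCaller
{
    // Application-level caller: Task.Run is used here, at the call site,
    // so the calling (for example UI) thread is free while the loop runs.
    public static async Task<int> CountPrimesInBackgroundAsync(int limit)
    {
        return await Task.Run(() => PrimeCounter.CountPrimes(limit));
    }
}
```

Keeping CountPrimes synchronous leaves the decision about threading to each caller, which is the reason given above for not putting Task.Run inside reusable code.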
One issue with your ContentLoader is that internally it operates sequentially. A better pattern is to parallelize the work and then synchronize at the end, so we get:
public class PageViewModel : IHandle<SomeMessage>
{
...
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
// makes UI very laggy, but still not dead
await this.contentLoader.LoadContentAsync();
HideLoadingAnimation();
}
}
public class ContentLoader
{
public async Task LoadContentAsync()
{
var tasks = new List<Task>();
tasks.Add(DoCpuBoundWorkAsync());
tasks.Add(DoIoBoundWorkAsync());
tasks.Add(DoCpuBoundWorkAsync());
tasks.Add(DoSomeOtherWorkAsync());
await Task.WhenAll(tasks).ConfigureAwait(false);
}
}
Obviously, this doesn't work if any of the tasks require data from other earlier tasks, but should give you better overall throughput for most scenarios.
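When the concurrent operations return values, the same start-everything-then-wait pattern can be sketched as follows. LoadPartAsync is a hypothetical stand-in for an I/O-bound operation, not one of the methods from the question.

```csharp
using System.Threading.Tasks;

public static class ConcurrentLoadSketch
{
    // Hypothetical stand-in for an I/O-bound operation that yields a value.
    private static async Task<int> LoadPartAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs).ConfigureAwait(false);
        return id;
    }

    public static async Task<int[]> LoadAllAsync()
    {
        // All three operations are started before any of them is awaited.
        Task<int> first = LoadPartAsync(1, 60);
        Task<int> second = LoadPartAsync(2, 20);
        Task<int> third = LoadPartAsync(3, 40);

        // The result array follows the order of the arguments,
        // not the order in which the operations happen to finish.
        return await Task.WhenAll(first, second, third).ConfigureAwait(false);
    }
}
```

Note that calling a method such as DoCpuBoundWorkAsync() directly still runs its synchronous portion on the calling thread; WhenAll only overlaps the parts that are genuinely asynchronous.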

Run I\O bunch threads in asynchronous manner [duplicate]

Ok, so basically I have a bunch of tasks (10) and I want to start them all at the same time and wait for them to complete. When completed I want to execute other tasks. I read a bunch of resources about this but I can't get it right for my particular case...
Here is what I currently have (code has been simplified):
public async Task RunTasks()
{
var tasks = new List<Task>
{
new Task(async () => await DoWork()),
//and so on with the other 9 similar tasks
};
Parallel.ForEach(tasks, task =>
{
task.Start();
});
Task.WhenAll(tasks).ContinueWith(done =>
{
//Run the other tasks
});
}
//This function perform some I/O operations
public async Task DoWork()
{
var results = await GetDataFromDatabaseAsync();
foreach (var result in results)
{
await ReadFromNetwork(result.Url);
}
}
So my problem is that when I'm waiting for tasks to complete with the WhenAll call, it tells me that all tasks are over even though none of them are completed. I tried adding Console.WriteLine in my foreach and when I have entered the continuation task, data keeps coming in from my previous Tasks that aren't really finished.
What am I doing wrong here?
You should almost never use the Task constructor directly. In your case that task only fires the actual task that you can't wait for.
You can simply call DoWork and get back a task, store it in a list and wait for all the tasks to complete. Meaning:
tasks.Add(DoWork());
// ...
await Task.WhenAll(tasks);
However, async methods run synchronously until the first await on an uncompleted task is reached. If you worry about that part taking too long then use Task.Run to offload it to another ThreadPool thread and then store that task in the list:
tasks.Add(Task.Run(() => DoWork()));
// ...
await Task.WhenAll(tasks);
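The reason the constructor-created task finishes early can be shown with a small sketch. The names here are hypothetical, and a TaskCompletionSource is used as a gate so that the inner work cannot finish until the sketch allows it.

```csharp
using System.Threading.Tasks;

public static class TaskConstructorPitfall
{
    // Observes the constructor-created task while the inner work is still blocked.
    public static (bool OuterCompleted, bool WorkFinished) WithConstructor()
    {
        var gate = new TaskCompletionSource<bool>();
        bool workFinished = false;

        // The Task constructor only accepts Action-style delegates,
        // so this async lambda is compiled as async void.
        var outer = new Task(async () =>
        {
            await gate.Task;     // the delegate returns to its caller here
            workFinished = true;
        });
        outer.Start();
        outer.Wait();            // returns as soon as the first await is reached

        var observed = (outer.IsCompleted, workFinished);
        gate.SetResult(true);    // release the inner work so nothing is left pending
        return observed;
    }

    // Task.Run has a Func<Task> overload, so its task tracks the whole async lambda.
    public static async Task<(bool CompletedEarly, bool WorkFinished)> WithTaskRunAsync()
    {
        var gate = new TaskCompletionSource<bool>();
        bool workFinished = false;

        Task run = Task.Run(async () =>
        {
            await gate.Task;
            workFinished = true;
        });

        bool completedEarly = run.IsCompleted; // the gate is still closed at this point
        gate.SetResult(true);
        await run;
        return (completedEarly, workFinished);
    }
}
```

In the first method the outer task reports completion while the work is still pending, which matches the symptom described in the question; in the second, the task returned by Task.Run completes only after the lambda has run to the end.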
If you want to run those tasks in parallel on different threads using the TPL, you may need something like this:
public async Task RunTasks()
{
var tasks = new List<Func<Task>>
{
DoWork,
//...
};
await Task.WhenAll(tasks.AsParallel().Select(async task => await task()));
//Run the other tasks
}
This approach parallelizes only a small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task. Also, for such a small number of tasks, parallelizing can take more time than just running them asynchronously. It makes sense only if your tasks do some longer (synchronous) work before their first await.
In most cases a better way is:
public async Task RunTasks()
{
await Task.WhenAll(new []
{
DoWork(),
//...
});
//Run the other tasks
}
In my opinion, in your code:
You should not wrap your code in Task before passing to Parallel.ForEach.
You can just await Task.WhenAll instead of using ContinueWith.
Essentially you're mixing two incompatible async paradigms; i.e. Parallel.ForEach() and async-await.
For what you want, do one or the other. E.g. you can just use Parallel.For[Each]() and drop the async-await altogether. Parallel.For[Each]() will only return when all the parallel tasks are complete, and you can then move onto the other tasks.
The code has some other issues too:
you mark the method async but don't await in it (the await you do have is in the delegate, not the method);
you almost certainly want .ConfigureAwait(false) on your awaits, especially if you aren't trying to use the results immediately in a UI thread.
The DoWork method is an asynchronous I/O method. It means that you don't need multiple threads to execute several of them, as most of the time the method will asynchronously wait for the I/O to complete. One thread is enough to do that.
public async Task RunTasks()
{
var tasks = new List<Task>
{
DoWork(),
//and so on with the other 9 similar tasks
};
await Task.WhenAll(tasks);
//Run the other tasks
}
You should almost never use the Task constructor to create a new task. To create an asynchronous I/O task, simply call the async method. To create a task that will be executed on a thread pool thread, use Task.Run. You can read this article for a detailed explanation of Task.Run and other options of creating tasks.
Also add a try-catch block around the wait.
NB: When you block with Task.WaitAll() (or read Task.Result), failures surface as a System.AggregateException, which wraps one or more exceptions thrown by the tasks so that all of them can be reported. By contrast, await Task.WhenAll(...) rethrows only the first of those exceptions; the full set remains available on the Exception property of the task returned by WhenAll.
try
{
Task.WaitAll(tasks.ToArray());
}
catch(AggregateException ex)
{
foreach (Exception inner in ex.InnerExceptions)
{
Console.WriteLine(String.Format("Exception type {0} from {1}", inner.GetType(), inner.Source));
}
}
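For code that awaits instead of blocking, exception handling looks slightly different. The following is a sketch with a hypothetical FailAsync helper: awaiting the combined task rethrows a single exception, while the task's Exception property still holds every failure.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllExceptions
{
    // Hypothetical helper that fails asynchronously.
    private static async Task FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    // Returns the type name caught by await and the number of
    // exceptions stored on the combined task.
    public static async Task<(string CaughtType, int StoredCount)> ObserveAsync()
    {
        Task all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        string caughtType = "";
        try
        {
            await all;
        }
        catch (Exception ex)
        {
            // await unwraps the AggregateException and rethrows one inner exception.
            caughtType = ex.GetType().Name;
        }

        // Keeping a reference to the combined task lets us inspect all failures.
        int storedCount = all.Exception?.InnerExceptions.Count ?? 0;
        return (caughtType, storedCount);
    }
}
```

Holding on to the task returned by Task.WhenAll is what makes the remaining exceptions reachable after the await has thrown.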

Is await Task.Run(() => PublicVoidNotAsyncMethod()) equivalent to PublicVoidNotAsyncMethod()?

I have an async method:
public async void BillSubscriptions()
{
await Task.Run(() => ProcessSubscriptions(_subscriptionRepository));
await Task.Run(() => ProcessNonRecurringSubscriptions(_subscriptionRepository));
await Task.Run(() => ProcessTrialSubscriptions(_subscriptionRepository));
}
Note that ProcessSubscriptions, ProcessNonRecurringSubscriptions and ProcessTrialSubscriptions are private void methods and not async.
All of those methods retrieve data from the database, process it, and update the database based on some algorithms.
My question is, is the above code equivalent to this code below?
public async void BillSubscriptions()
{
ProcessSubscriptions(_subscriptionRepository);
ProcessNonRecurringSubscriptions(_subscriptionRepository);
ProcessTrialSubscriptions(_subscriptionRepository);
}
They are not the same at all.
In your first example you have the following:
public async void BillSubscriptions()
{
await Task.Run(() => ProcessSubscriptions(_subscriptionRepository));
await Task.Run(() => ProcessNonRecurringSubscriptions(_subscriptionRepository));
await Task.Run(() => ProcessTrialSubscriptions(_subscriptionRepository));
}
This is a public async void method named BillSubscriptions, so anyone can invoke it. It awaits three Task.Run calls in a specific order. Each call takes a lambda expression that resolves to an Action delegate, queues the corresponding private method to run on a ThreadPool thread, and returns a Task object representing that operation. Because each Task is awaited before the next Task.Run is issued, the three methods still execute sequentially.
See Task.Run here for details on its functionality.
The other operation is as follows:
public async void BillSubscriptions()
{
ProcessSubscriptions(_subscriptionRepository);
ProcessNonRecurringSubscriptions(_subscriptionRepository);
ProcessTrialSubscriptions(_subscriptionRepository);
}
Again we have a public method marked async named BillSubscriptions that executes three private methods sequentially. The difference is that these all run on the current thread and block it, whereas in the previous example the calling thread is not blocked and the methods could potentially (and are likely to) execute on different threads. I have made some modifications to demonstrate the differences:
Here is the link for the .NET fiddle that will hopefully make this more clear.
Here is the output:
Is async = True, ProcessSubscriptions :: Thread ID10
Is async = True, ProcessNonRecurringSubscriptions :: Thread ID11
Is async = True, ProcessTrialSubscriptions :: Thread ID10
Is async = False, ProcessSubscriptions :: Thread ID9
Is async = False, ProcessNonRecurringSubscriptions :: Thread ID9
Is async = False, ProcessTrialSubscriptions :: Thread ID9
Note:
Avoid using async void: the caller cannot await the method, and exceptions it throws cannot be caught by the caller.
Since the methods are private to the class, make them async instead. Rename them with an Async suffix (MethodNameAsync), make them return Task, and have their bodies return Task.Run(() => { ... });
Since it appears that you are looking to understand if there is an advantage to having async code there is...very much so in fact. Since the three private methods do not need to wait for the return value of another, they could all run in parallel. You could use Task.WhenAll to see a dramatic performance gain. For example if each method took nearly 1 second to execute, that would take at least 3 seconds for them to run synchronously, however, if executing in parallel -- it would only take as long as the longest execution of the three.
If your methods are doing database calls then you should use async await all the way down (ToListAsync(), db.SaveChangesAsync() in case you are using EF) and not wrap them in Task.Run(). The reason why you would want to do that is that thread is not blocked while waiting on I/O-Bound operation (DB call in this case) to complete and can be assigned to do something else (for example process another HTTP request in case of web app).
There is a difference.
In your first example, you execute the Process methods one after another, each on a thread pool thread, awaiting each in turn.
Depending on your execution context (e.g. ASP.NET or a XAML app) the BillSubscriptions method may continue to run on the main thread (as you are not using ConfigureAwait(false)) or it may be continued on some thread pool thread that executed the Process method.
In the second example, you just execute synchronously. There is less scheduling and task / context switching going on.
Unless I'm mistaken, this version gains nothing from the async keyword: it compiles (with a warning), but because nothing is awaited it runs entirely synchronously.
The async keyword alone won't do anything for you in this case anyway, as async methods run synchronously until they reach the first await.
If you wanted to perform all 3 operations without blocking you have 3 options:
make the Process methods async and await them (without Task.Run)
use your second version without the async keyword and start it with Task.Run(() => BillSubscriptions())
if the Process methods don't have to be executed in order and can run concurrently you could modify your first version to this:
public async void BillSubscriptions()
{
await Task.WhenAll(
Task.Run(() => ProcessSubscriptions(_subscriptionRepository)),
Task.Run(() => ProcessNonRecurringSubscriptions(_subscriptionRepository)),
Task.Run(() => ProcessTrialSubscriptions(_subscriptionRepository))
);
}
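A variant that returns Task instead of void lets callers await the whole batch and observe failures. This is only a sketch: the three Process methods below are hypothetical stand-ins that sleep briefly and bump a counter, and the repository parameter from the question is omitted.

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed class BillingSketch
{
    private int _processed;

    // Number of Process methods that have run to completion.
    public int Processed => Volatile.Read(ref _processed);

    // Hypothetical synchronous stand-ins for the three Process methods.
    private void ProcessSubscriptions() { Thread.Sleep(20); Interlocked.Increment(ref _processed); }
    private void ProcessNonRecurringSubscriptions() { Thread.Sleep(20); Interlocked.Increment(ref _processed); }
    private void ProcessTrialSubscriptions() { Thread.Sleep(20); Interlocked.Increment(ref _processed); }

    // Returns Task rather than void, so the caller can await completion
    // and see any exception thrown by the three operations.
    public Task BillSubscriptionsAsync()
    {
        return Task.WhenAll(
            Task.Run(() => ProcessSubscriptions()),
            Task.Run(() => ProcessNonRecurringSubscriptions()),
            Task.Run(() => ProcessTrialSubscriptions()));
    }
}
```

This only applies if the three operations are independent of each other; if they must run in order, awaiting them one at a time remains the right shape.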
To be honest, no. But if you only care about execution order, BillSubscriptions#2 is equivalent to BillSubscriptions#1. That's because the code after each await is scheduled as a continuation, as if it had been passed to Task.ContinueWith.
Without async/await, BillSubscriptions would look like:
public void BillSubscriptions()
{
Task t1 = Task.Run(() => ProcessSubscriptions(_subscriptionRepository));
t1.ContinueWith(t =>
{
Task t2 = Task.Run(() => ProcessNonRecurringSubscriptions(_subscriptionRepository));
t2.ContinueWith(tt =>
{
Task.Run(() => ProcessTrialSubscriptions(_subscriptionRepository));
});
});
}

Task await vs Task.WaitAll

In terms of parallelism are these equivalent?
async Task TestMethod1()
{
Task<int> t1 = GetInt1();
Task<int> t2 = GetInt2();
await Task.WhenAll(t1, t2);
}
async Task TestMethod2()
{
Task<int> t1 = GetInt1();
await GetInt2();
await t1;
}
In TestMethod2, I am mainly interested in understanding whether GetInt1() starts executing while awaiting GetInt2().
Yes, in terms of "parallelism" (actually concurrency), they are pretty much the same.
In particular, the TAP docs state that returned tasks are "hot", that is:
All tasks that are returned from TAP methods must be activated... Consumers of a TAP method may safely assume that the returned task is active
So, your code is starting the asynchronous operations by calling their methods. The tasks they return are already in progress. In both examples, both tasks are running concurrently.
It doesn't matter terribly much whether you use two awaits or a single await Task.WhenAll. I prefer the Task.WhenAll approach because IMO it more clearly communicates the intent of concurrency. Also, it only interrupts the source context (e.g., UI thread) once instead of twice, but that's just a minor concern.
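The "hot task" behavior can be observed with a small sketch. GetIntAsync is a hypothetical stand-in for GetInt1 and GetInt2 that records when its synchronous part runs.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class HotTaskSketch
{
    // Hypothetical stand-in for GetInt1/GetInt2 that logs when it starts.
    private static async Task<int> GetIntAsync(string name, int value, ConcurrentQueue<string> log)
    {
        log.Enqueue(name + " started"); // runs synchronously when the method is called
        await Task.Delay(20);
        return value;
    }

    // Mirrors TestMethod2 and returns the order in which things happened.
    public static async Task<string[]> RunAsync()
    {
        var log = new ConcurrentQueue<string>();

        Task<int> t1 = GetIntAsync("GetInt1", 1, log); // already in progress after this line
        log.Enqueue("before awaiting GetInt2");
        await GetIntAsync("GetInt2", 2, log);
        await t1;

        return log.ToArray();
    }
}
```

The log shows GetInt1 starting before GetInt2 is even called, so the first operation is already under way while the second is being awaited.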
