I did a sample to simulate concurrency using the code below:
var threads = new Thread[200];
//starting threads logic
for (int i = 0; i < 200; i++)
{
threads[i].Start();
}
for (int i = 0; i < 200; i++)
{
threads[i].Join();
}
The code is supposed to insert thousands of records into the database, and it seems to work well, as the threads finish at almost the same time.
But, when I use:
var tasks = new List<Task<int>>();
for (int i = 0; i < 200; i++)
{
tasks.Add(insert(i));
// await insert(i);
}
int[] result = await Task.WhenAll(tasks);
it takes a lot of time to finish the same logic.
Can someone explain to me what's the difference? I thought that Await should create threads.
If you need to replicate your original Thread-based behaviour, you can use Task.Factory.StartNew(... , TaskCreationOptions.LongRunning) to schedule your work, and then block until the worker tasks complete via Task.WaitAll. I do not recommend this approach, but in terms of behaviour this will be very close to how your code was working previously.
A more in-depth analysis as to why you may not be getting the expected performance in your scenario is as follows:
Explanation, part 1 (async does not mean "on a different thread")
Methods marked with the async keyword do not magically run asynchronously. They are merely capable of combining awaitable operations (that may or may not run asynchronously themselves), into a single larger unit (generally Task or Task<T>).
If your insert method is async, it is still likely that it performs at least some of the work synchronously. This will definitely be the case with all of your code preceding the first await statement. This work will execute on the "main" thread (thread which calls insert) - and that will be your bottleneck or at least part thereof as the degree of parallelism for that section of your code will be 1 while you're calling insert in a tight loop, regardless of whether you await the resulting task.
To illustrate the above point, consider the following example:
void Test()
{
Debug.Print($"Kicking off async chain (thread {Thread.CurrentThread.ManagedThreadId}) - this is the main thread");
OuterTask().Wait(); // Do not block on Tasks - educational purposes only.
}
async Task OuterTask()
{
Debug.Print($"OuterTask before await (thread {Thread.CurrentThread.ManagedThreadId})");
await InnerTask().ConfigureAwait(false);
Debug.Print($"OuterTask after await (thread {Thread.CurrentThread.ManagedThreadId})");
}
async Task InnerTask()
{
Debug.Print($"InnerTask before await (thread {Thread.CurrentThread.ManagedThreadId})");
await Task.Delay(10).ConfigureAwait(false);
Debug.Print($"InnerTask after await (thread {Thread.CurrentThread.ManagedThreadId}) - we are now on the thread pool");
}
This produces the following output:
Kicking off async chain (thread 6) - this is the main thread
OuterTask before await (thread 6)
InnerTask before await (thread 6)
InnerTask after await (thread 8) - we are now on the thread pool
OuterTask after await (thread 8)
Note that the code before the first await inside OuterTask and even InnerTask still executes on the "main" thread. Our chain actually executes synchronously, on the same thread which kicked off the outer task, until we await the first truly async operation (in this case Task.Delay).
Additionally
If you are running in an environment where SynchronizationContext.Current is not null (e.g. Windows Forms, WPF) and you're not using ConfigureAwait(false) on the tasks awaited inside your insert method, then continuations scheduled by the async state machine after the first await statement will also likely execute on the "main" thread, although this is not guaranteed in certain environments (e.g. ASP.NET).
Explanation, part 2 (executing Tasks on the thread pool)
If, as part of your insert method, you are opting to start any Tasks manually, then you are most likely scheduling your work on the thread pool by using Task.Run or any other method of starting a new task that does not specify TaskCreationOptions.LongRunning. Once the thread pool gets saturated any newly started tasks will be queued, thus reducing the throughput of your parallel system.
Proof:
IEnumerable<Task> tasks = Enumerable
.Range(0, 200)
.Select(_ => Task.Run(() => Thread.Sleep(100))); // Using Thread.Sleep to simulate blocking calls.
await Task.WhenAll(tasks); // Completes in 2+ seconds.
Now with TaskCreationOptions.LongRunning:
IEnumerable<Task> tasks = Enumerable
.Range(0, 200)
.Select(_ => Task.Factory.StartNew(
() => Thread.Sleep(100), TaskCreationOptions.LongRunning
));
await Task.WhenAll(tasks); // Completes in under 130 milliseconds.
It is generally not a good idea to spawn 200 threads (this will not scale well), but if massive parallelisation of blocking calls is an absolute requirement, the above snippet shows you one way to do it with TPL.
In the first example you created threads manually. In the second you created tasks. Tasks most likely run on the thread pool, which has a limited number of threads. So most tasks are waiting in the queue, while a few of them execute in parallel on the available threads.
Presumptions/Prelude:
In previous questions, we note that Thread.Sleep blocks threads; see: When to use Task.Delay, when to use Thread.Sleep?.
We also note that console apps have three threads: The main thread, the GC thread & the finalizer thread IIRC. All other threads are debugger threads.
We know that async does not spin up new threads, and it instead runs on the synchronization context, "uses time on the thread only when the method is active". https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/task-asynchronous-programming-model
Setup:
In a sample console app, we can see that neither the sibling nor the parent code are affected by a call to Thread.Sleep, at least until the await is called (unknown if further).
var sw = new Stopwatch();
sw.Start();
Console.WriteLine($"{sw.Elapsed}");
var asyncTests = new AsyncTests();
var go1 = asyncTests.WriteWithSleep();
var go2 = asyncTests.WriteWithoutSleep();
await go1;
await go2;
sw.Stop();
Console.WriteLine($"{sw.Elapsed}");
Stopwatch sw1 = new Stopwatch();
public async Task WriteWithSleep()
{
sw1.Start();
await Task.Delay(1000);
Console.WriteLine("Delayed 1 seconds");
Console.WriteLine($"{sw1.Elapsed}");
Thread.Sleep(9000);
Console.WriteLine("Delayed 10 seconds");
Console.WriteLine($"{sw1.Elapsed}");
sw1.Stop();
}
public async Task WriteWithoutSleep()
{
await Task.Delay(3000);
Console.WriteLine("Delayed 3 second.");
Console.WriteLine($"{sw1.Elapsed}");
await Task.Delay(6000);
Console.WriteLine("Delayed 9 seconds.");
Console.WriteLine($"{sw1.Elapsed}");
}
Question:
If the thread is blocked from execution during Thread.Sleep, how is it that it continues to process the parent and sibling? Some answer that it is background threads, but I see no evidence of multithreading background threads. What am I missing?
I see no evidence of multithreading background threads. What am I missing?
Possibly you are looking in the wrong place, or using the wrong tools. There's a handy property that might be of use to you, in the form of Thread.CurrentThread.ManagedThreadId. According to the docs,
A thread's ManagedThreadId property value serves to uniquely identify that thread within its process.
The value of the ManagedThreadId property does not vary over time
This means that all code running on the same thread will always see the same ManagedThreadId value. If you sprinkle some extra WriteLines into your code, you'll be able to see that your tasks may run on several different threads during their lifetimes. It is even entirely possible for some async applications to have all their tasks run on the same thread, though you probably won't see that behaviour in your code under normal circumstances.
Here's some example output from my machine, not guaranteed to be the same on yours, nor is it necessarily going to be the same output on successive runs of the same application.
00:00:00.0000030
* WriteWithSleep on thread 1 before await
* WriteWithoutSleep on thread 1 before first await
* WriteWithSleep on thread 4 after await
Delayed 1 seconds
00:00:01.0203244
* WriteWithoutSleep on thread 5 after first await
Delayed 3 second.
00:00:03.0310891
* WriteWithoutSleep on thread 6 after second await
Delayed 9 seconds.
00:00:09.0609263
Delayed 10 seconds
00:00:10.0257838
00:00:10.0898976
The business of running tasks on threads is handled by a TaskScheduler. You could write one that forces code to be single threaded, but that's not often a useful thing to do. The default scheduler uses a threadpool, and as such tasks can be run on a number of different threads.
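The extra WriteLines suggested above could look roughly like the following minimal sketch. WriteWithThreadIds is a hypothetical stand-in for the methods in the question, and the thread id seen after the await is not predictable; only the part before the first await is certain to run on the calling thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int callerThread = Thread.CurrentThread.ManagedThreadId;
int beforeAwaitThread = -1;
int afterAwaitThread = -1;

// Hypothetical stand-in for WriteWithSleep, instrumented with thread ids.
async Task WriteWithThreadIds()
{
    // Runs synchronously on the thread that called this method.
    beforeAwaitThread = Thread.CurrentThread.ManagedThreadId;
    Console.WriteLine($"* before await on thread {beforeAwaitThread}");

    await Task.Delay(100);

    // Runs wherever the continuation was scheduled (usually a thread pool thread
    // in a console app, since there is no SynchronizationContext).
    afterAwaitThread = Thread.CurrentThread.ManagedThreadId;
    Console.WriteLine($"* after await on thread {afterAwaitThread}");
}

Console.WriteLine($"* caller on thread {callerThread}");
await WriteWithThreadIds();
```

Comparing the ids printed before and after the await is usually enough to see whether a continuation moved to another thread.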
The Task.Delay method is implemented basically like this (simplified¹):
public static Task Delay(int millisecondsDelay)
{
var tcs = new TaskCompletionSource();
_ = new Timer(_ => tcs.SetResult(), null, millisecondsDelay, -1);
return tcs.Task;
}
The Task is completed on the callback of a System.Threading.Timer component, and according to the documentation this callback is invoked on a ThreadPool thread:
The method does not execute on the thread that created the timer; it executes on a ThreadPool thread supplied by the system.
So when you await the task returned by the Task.Delay method, the continuation after the await runs on the ThreadPool. The ThreadPool typically has more than one thread available immediately on demand, so it's not difficult to introduce concurrency and parallelism if you create 2 tasks at once, like you do in your example. The main thread of a console application is not equipped with a SynchronizationContext by default, so there is no mechanism in place to prevent the observed concurrency.
¹ For demonstration purposes only. The Timer reference is not stored anywhere, so it might be garbage collected before the callback is invoked, resulting in the Task never completing.
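One way to avoid the problem described in the footnote, as a rough sketch only: let the callback itself reference the timer, so that the timer stays reachable for as long as its callback is pending, and dispose it once it has fired. DelaySketch and the holder array are hypothetical names introduced for illustration; this is not how Task.Delay is actually implemented.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: same idea as the simplified Delay above, but the callback
// closure references the timer, and the timer is disposed after it fires.
Task DelaySketch(int millisecondsDelay)
{
    var tcs = new TaskCompletionSource<bool>();
    var holder = new Timer[1]; // lets the callback refer to the timer it belongs to

    // Create the timer disarmed, so the callback cannot run before holder[0] is set.
    holder[0] = new Timer(_ =>
    {
        holder[0].Dispose();
        tcs.TrySetResult(true);
    }, null, Timeout.Infinite, Timeout.Infinite);

    // Arm it: fire once after the requested delay, no periodic signalling.
    holder[0].Change(millisecondsDelay, Timeout.Infinite);
    return tcs.Task;
}

Task delay = DelaySketch(200);
Console.WriteLine($"Completed right after the call: {delay.IsCompleted}");
await delay;
Console.WriteLine($"Completed after the await: {delay.IsCompleted}");
```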
I am not accepting my own answer, I will accept someone else's answer because they helped me figure this out. First, in the context of my question, I was using async Main. It was very hard to choose between Theodor's & Rook's answer. However, Rook's answer provided me with one thing that helped me fish: Thread.CurrentThread.ManagedThreadId
These are the results of my running code:
1 00:00:00.0000767
Not Delayed.
1 00:00:00.2988809
Delayed 1 second.
4 00:00:01.3392148
Delayed 3 second.
5 00:00:03.3716776
Delayed 9 seconds.
5 00:00:09.3838139
Delayed 10 seconds
4 00:00:10.3411050
4 00:00:10.5313519
I notice that there are 3 threads here. The initial thread (1) runs the first calling method and the part of WriteWithSleep() up to the point where Task.Delay is started and later awaited. When that Task.Delay completes, execution does not come back to Thread 1; the remainder of WriteWithSleep and of the main method runs on Thread 4 instead.
WriteWithoutSleep uses its own Thread(5).
So my error was believing that there were only 3 threads. I believed the answer to this question: https://stackoverflow.com/questions/3476642/why-does-this-simple-net-console-app-have-so-many-threads#:~:text=You%20should%20only%20see%20three,see%20are%20debugger%2Drelated%20threads.
However, that question may not have been async, or may not have considered these additional worker threads from the threadpool.
Thank you all for your assistance in figuring out this question.
I am working on some legacy code which repeatedly calls a long running task in a new thread:
var jobList = spGetSomeJobIds.ToList();
jobList.ForEach((jobId) =>
{
var myTask = Task.Factory.StartNew(() => CallExpensiveStoredProc(jobId),
TaskCreationOptions.LongRunning);
myTask.Wait();
});
As the calling thread immediately calls Wait and blocks until the task completes, I can't see any point in the Task.Factory.StartNew code. Am I missing something? Is there something about TaskCreationOptions.LongRunning which might add value?
As MSDN says:
Waits for the Task to complete execution.
in addition, there is the following statement:
Wait blocks the calling thread until the task completes.
So myTask.Wait(); looks redundant, as the method CallExpensiveStoredProc returns nothing.
As a good practice, it would be better to use the async and await operators when you deal with asynchronous operations such as database operations.
UPDATE:
What we have is:
We run LongRunning, so a new Thread is created. It can be seen in the source files.
Then we call myTask.Wait();. This method just waits until myTask finishes its work. So all jobList iterations will be executed sequentially, not in parallel. So now we need to decide how our jobs should be executed: sequentially (case A) or in parallel (case B).
Case A: Sequential execution of our jobs
If your jobs should be executed sequentially, then a few questions arise:
Why use multithreading if our code executes sequentially? Our code should be clean and simple, so we can avoid multithreading in this case.
When we create a new thread, we add overhead to the thread pool, because the thread pool tries to determine the optimal number of threads and creates at least one thread per core. That means that when all of the thread pool threads are busy, the task might wait (in extreme cases infinitely long) until it actually starts executing.
To sum up, there is no gain in creating a new Thread in this case, especially a new thread created with the LongRunning option.
Case B: Parallel execution of our jobs
If our goal is to run all jobs in parallel, then myTask.Wait(); should be eliminated, because it makes the code execute sequentially.
Code to test:
var jobs = new List<int>(){1, 2, 3 };
jobs.ForEach(j =>
{
var myTask = Task.Factory.StartNew(() =>
{
Console.WriteLine($"This is a current number of executing task: { j }");
Thread.Sleep(5000); // Imitation of long-running operation
Console.WriteLine($"Executed: { j }");
}, TaskCreationOptions.LongRunning);
myTask.Wait();
});
Console.WriteLine($"All jobs are executed");
To conclude, in case B there is no gain in creating a new Thread, especially a new thread created with the LongRunning option, because creating a thread is expensive both in the time it takes and in memory consumption.
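For comparison, here is a minimal sketch of case B in which the jobs really do overlap: every task is started first, and the blocking wait happens once, after the loop. It keeps the LongRunning option only to stay close to the test code above; the 500 ms sleep and the job ids are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var jobs = new List<int> { 1, 2, 3 };
var stopwatch = Stopwatch.StartNew();

// Start every job before waiting on any of them.
Task[] running = jobs
    .Select(j => Task.Factory.StartNew(() =>
    {
        Console.WriteLine($"Started: {j}");
        Thread.Sleep(500); // Imitation of a long-running operation
        Console.WriteLine($"Executed: {j}");
    }, TaskCreationOptions.LongRunning))
    .ToArray();

// Block only once, after all jobs have been started.
Task.WaitAll(running);
stopwatch.Stop();

Console.WriteLine($"All jobs executed in {stopwatch.ElapsedMilliseconds} ms");
```

With the Wait inside the loop the three jobs would take roughly three times the sleep duration; here the total should be close to a single sleep.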
Say I have the following action method:
[HttpPost]
public async Task<IActionResult> PostCall()
{
var tasks = new List<Task<bool>>();
for (int i = 0; i < 10; i++)
tasks.Add(Manager.SomeMethodAsync(i));
// Is this line necessary to ensure that all tasks will finish successfully?
Task.WaitAll(tasks.ToArray());
if (tasks.Exists(x => x.Result))
return new ObjectResult("At least one task returned true");
else
return new ObjectResult("No tasks returned true");
}
Is Task.WaitAll(tasks.ToArray()) necessary to ensure that all tasks will finish successfully? Will the tasks whose Result happened not to get accessed by the Exists finish their execution in the background successfully? Or is there a chance that some of the tasks (that weren't waited for) get dropped since they would not be attached to the request? Is there a better implementation I'm missing?
Under your provided implementation, the Task.WaitAll call blocks the calling thread until all tasks have completed. It would only proceed to the next line and perform the Exists check after this has happened. If you remove the Task.WaitAll, then the Exists check would cause the calling thread to block on each task in order; i.e. it first blocks on tasks[0]; if this returns false, then it would block on tasks[1], then tasks[2], and so on. This is not desirable since it doesn't allow for your method to finish early if the tasks complete out of order.
If you only need to wait until whichever task returns true first, then you could use Task.WhenAny. This will make your asynchronous method resume as soon as any task completes. You can then check whether it evaluated to true and return success immediately; otherwise, you keep repeating the process for the remaining collection of tasks until there are none left.
If your code was running as an application (WPF, WinForms, Console), then the remaining tasks would continue running on the thread pool until completion, unless the application is shut down. Thread-pool threads are background threads, so they won't keep the process alive if all foreground threads have terminated (e.g. because all windows were closed).
Since you're running a web app, you incur the risk of having your app pool recycled before the tasks have completed. The unawaited tasks are fire-and-forget and therefore untracked by the runtime. To prevent this from happening, you can register them with the runtime through the HostingEnvironment.QueueBackgroundWorkItem method, as suggested in the comments.
[HttpPost]
public async Task<IActionResult> PostCall()
{
var tasks = Enumerable
.Range(0, 10)
.Select(Manager.SomeMethodAsync)
.ToList();
foreach (var task in tasks)
HostingEnvironment.QueueBackgroundWorkItem(_ => task);
while (tasks.Any())
{
var readyTask = await Task.WhenAny(tasks);
tasks.Remove(readyTask);
if (await readyTask)
return new ObjectResult("At least one task returned true");
}
return new ObjectResult("No tasks returned true");
}
Yes, the tasks are not guaranteed to complete unless something waits for them (with something like an await)
In your case, the main change you should make is replacing the Task.WaitAll call with
await Task.WhenAll(tasks);
So it is actually asynchronous. If you just want to wait for a task to return, use WhenAny instead.
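A minimal sketch of that change outside of a controller, assuming a hypothetical SomeMethodAsync that stands in for Manager.SomeMethodAsync (the real method is not shown in the question). Task.WhenAll on Task<bool> instances yields a bool[], so there is no need to touch .Result afterwards.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for Manager.SomeMethodAsync: true for one input only.
async Task<bool> SomeMethodAsync(int i)
{
    await Task.Delay(50);
    return i == 7;
}

// Start all ten operations, then await them together without blocking a thread.
var tasks = Enumerable.Range(0, 10).Select(i => SomeMethodAsync(i)).ToList();
bool[] results = await Task.WhenAll(tasks);

bool anyTrue = results.Any(r => r);
Console.WriteLine(anyTrue
    ? "At least one task returned true"
    : "No tasks returned true");
```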
They start out even, but eventually processTasks never gets hit.
Originally I had this as two threads when the tasks were simple. Someone suggested async/await tasks, and being new to C#, I had no reason to doubt them.
Task monitorTasks= new Task (monitor.start );
Task processTasks= new Task( () => processor.process(ref param, param2) );
monitorTasks.Start();
processTasks.Start();
await processTasks;
Have I executed this wrong? Is my problem inevitable while running two tasks? Should they be threads? How to avoid.
edit
To clarify. The tasks are never intended to end. They will always be processing and monitoring while triggering events that notify watchers of monitor outputs or processor outputs.
If you await on a Task.WhenAll then it will wait until all tasks have been processed
await Task.WhenAll(monitorTasks, processTasks);
https://msdn.microsoft.com/en-us/library/system.threading.tasks.task.whenall(v=vs.110).aspx
Task.WaitAll blocks the current thread until everything has completed.
Task.WhenAll returns a task which represents the action of waiting until everything has completed.
Task.WhenAll Method
Creates a task that will complete when all of the Task objects in an
enumerable collection have completed.
Task.WaitAll Method
Waits for all of the provided Task objects to complete execution.
If you want to block wait on started tasks (which is seemingly what you want)
Task monitorTasks= new Task (monitor.start );
Task processTasks= new Task( () => processor.process(ref param, param2) );
monitorTasks.Start();
processTasks.Start();
Task.WaitAll(new Task[]{monitorTasks,processTasks});
If you are using async await, see Asynchronous programming with async and await
Then you could do something like this
var task1 = DoWorkAsync();
var task2 = DoMoreWorkAsync();
await Task.WhenAll(task1, task2);
I couldn't get tasks to run evenly.
The monitor task was getting constantly flooded whereas the processor task was getting tasks less frequently, which is when I suspect the monitor task took over.
Since no one could help me,
My solution was to turn them back into threads and set the priority of the threads: lower than normal for the monitor task, and higher than normal for the processor task.
This seems to have solved my problem.
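A rough sketch of that arrangement, with placeholder loops standing in for monitor.start and processor.process (neither is shown in the question). The stop flag and the 200 ms run time exist only so the sketch terminates, and how strongly the priorities are honoured depends on the operating system.

```csharp
using System;
using System.Threading;

long monitorTicks = 0;
long processorTicks = 0;
bool stop = false;

// Hypothetical stand-in for monitor.start: runs below normal priority.
var monitorThread = new Thread(() =>
{
    while (!Volatile.Read(ref stop))
    {
        Interlocked.Increment(ref monitorTicks);
        Thread.Sleep(1);
    }
})
{ Priority = ThreadPriority.BelowNormal, IsBackground = true };

// Hypothetical stand-in for processor.process: runs above normal priority.
var processorThread = new Thread(() =>
{
    while (!Volatile.Read(ref stop))
    {
        Interlocked.Increment(ref processorTicks);
        Thread.Sleep(1);
    }
})
{ Priority = ThreadPriority.AboveNormal, IsBackground = true };

monitorThread.Start();
processorThread.Start();

Thread.Sleep(200);              // let both loops run for a short while
Volatile.Write(ref stop, true); // ask both loops to finish
monitorThread.Join();
processorThread.Join();

Console.WriteLine($"monitor iterations: {monitorTicks}, processor iterations: {processorTicks}");
```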
Let's say I have a method like SaveAsync(Item item) and I need to call it on 10 Items and the calls are independent of one another. I imagine the ideal way in terms of threading is like
Thread A | Run `SaveAsync(item1)` until we hit the `await` | ---- ... ---- | Run `SaveAsync(item10)` until we hit the `await` | ---------------------------------------|
Thread B | --------------------------------------------------- | Run the stuff after the `await` in `SaveAsync(item1)` | ------------------ ... -----------------------|
Thread C | ------------------------------------------------------ | Run the stuff after the `await` in `SaveAsync(item2)` | ------------------ ... --------------------|
.
.
.
(with it being possible that some of the stuff after the await for multiple items is run in the same thread, perhaps even Thread A)
I'm wondering how to write that in C#? Is it a parallel foreach, or a loop with await SaveAsync(item), or what?
Per default async tasks will always return to the thread context they were started on. You can change this by adding
await task.ConfigureAwait(false)
This tells the runtime that you do not care on which thread context the task will resume, so the runtime can omit the capture of the current thread context (which is quite costly).
However, per default you will always be scheduled on the thread context that started the task.
There are a few default contexts, such as the UI thread context or the thread pool context. A task started on the UI thread context will be scheduled back to the UI thread context.
A task started on the thread pool context will be scheduled to the next free thread from the pool, not necessarily the same thread the task was started on.
However you can provide your own context if you need more control over the task scheduling.
How can you start multiple tasks in the fashion you described above? A simple loop will not help here. Let's take this example.
foreach(var item in items)
{
await SaveAsync(item);
}
The await here will wait until the SaveAsync finishes. So all saves are processed in sequence.
How to save truly asynchronously?
The trick is to start all tasks, but not await them until all tasks are started. You then await all tasks with WhenAll(IEnumerable<Task>).
Here is an example.
var tasks = new List<Task>();
foreach(var item in items)
{
tasks.Add(SaveAsync(item)); // No await here
}
await Task.WhenAll(tasks); // will only continue when all tasks are finished (or cancelled or failed)
Because of the missing await, all "Save-Actions" are placed in the Async/Await state machine. As soon as the first task yields back, the second will be executed. This will result in a behavior somewhat similar to the one described in your question.
The only main difference here is that all tasks are executed in the same thread. This is completely OK most of the time, because all Save methods usually need to access the same resources. Parallelizing them gives no real advantage, because the bottleneck is this resource.
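The difference between the two loops can be measured with a small sketch like the one below. SaveAsync here is a hypothetical placeholder in which Task.Delay stands in for the real I/O, so the timings only illustrate the overlap, not real save performance.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical SaveAsync: Task.Delay stands in for the real asynchronous I/O.
async Task SaveAsync(int item)
{
    await Task.Delay(100);
}

int[] items = { 1, 2, 3, 4, 5 };

// Variant 1: await inside the loop, so each save starts after the previous one ends.
var sequential = Stopwatch.StartNew();
foreach (var item in items)
{
    await SaveAsync(item);
}
sequential.Stop();

// Variant 2: start all saves first, then await them together.
var concurrent = Stopwatch.StartNew();
var tasks = items.Select(item => SaveAsync(item)).ToList();
await Task.WhenAll(tasks);
concurrent.Stop();

Console.WriteLine($"await in loop: {sequential.ElapsedMilliseconds} ms");
Console.WriteLine($"WhenAll:       {concurrent.ElapsedMilliseconds} ms");
```

The first figure should be roughly the sum of the individual delays, while the second should be close to a single delay.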
How to use multiple threads
You can execute a task on a new thread by using
Task.Run(() => SaveAsync(item));
This will execute the method on a thread taken from the thread pool, but there is no way to start the method on a new thread and finish it on the UI thread.
To execute all items on different thread, you can use nearly the same code as before:
var tasks = new List<Task>();
foreach(var item in items)
{
tasks.Add(Task.Run(() => SaveAsync(item))); // No await here
}
await Task.WhenAll(tasks); // will only continue when all tasks are finished (or cancelled or failed)
The only difference here is that we take the task returned from Task.Run.
One remark: Using Task.Run does not guarantee you a new thread. It will execute the task on the next free thread from the thread pool. This depends on your local settings as well as the local configuration (e.g. a heavy bare-metal server will have a lot more threads than any consumer laptop).
Whether you get a new thread or have to wait for an occupied thread to finish is completely up to the thread pool. (The thread pool usually does a really great job. For more info, here is a really great article on thread pool performance: CLR-Thread-Pool)
This is where people make most of their mistakes with async/await:
1) Either people think that everything after calling an async method, with or without awaiting it, moves to a ThreadPool thread.
2) Or people think that async runs synchronously.
The truth is somewhere in between, and @Iqon's statement about the next block of code is actually incorrect: "The only main difference here, is all tasks are executed in the same thread."
var tasks = new List<Task>();
foreach(var item in items)
{
tasks.Add(SaveAsync(item)); // No await here
}
await Task.WhenAll(tasks); // will only continue when all tasks are finished (or cancelled or failed)
Making a statement like this would suggest that the async method SaveAsync(item) is actually capable of executing fully and completely synchronously.
Here are examples:
async Task SaveAsync1(Item item)
{
//no awaiting at all
}
async Task SaveAsync2(Item item)
{
//awaiting already completed task
int i = await Task.FromResult(0);
}
Methods like these really would run synchronously on the thread the async method was called on. But async methods of this kind are special snowflakes. No operation is actually awaited here; everything is complete even though there is an await inside the method, because the first case does not await at all and the second awaits an already completed task, so it continues synchronously after the await, and these two calls would be the same:
var taskA = SaveAsync2(item); // it would return a task that has already run to completion
// same here, the await won't suspend, as the returned task has already run to completion
await SaveAsync2(item);
So the statement that the async methods here execute synchronously is correct only in this special case:
var tasks = new List<Task>();
foreach(var item in items)
{
tasks.Add(SaveAsync2(item));
}
await Task.WhenAll(tasks); // will only continue when all tasks are finished (or cancelled or failed)
And there is no need to store the tasks and await Task.WhenAll(tasks); everything is already done, and this would be enough:
foreach(var item in items)
{
SaveAsync2(item);
//it will execute synchronously because there is
//nothing to await for in the method
}
Now let's explore the real case: an async method that actually awaits something inside, or starts an awaitable operation:
async Task SaveAsync3(Item item)
{
//awaiting already completed task
int i = await Task.FromResult(0);
await Task.Delay(1000);
Console.WriteLine(i);
}
Now what would this do?
var tasks = new List<Task>();
foreach(var item in items)
{
tasks.Add(SaveAsync3(item));
}
await Task.WhenAll(tasks); // will only continue when all tasks are finished (or cancelled or failed)
Would it run synchronously? No!
Would it run concurrently in parallel? NO!
As I said at the beginning, the truth is somewhere in between with async methods, unless they are special snowflakes like SaveAsync1 and SaveAsync2.
So what did the code do? It executed each SaveAsync3 synchronously up to await Task.Delay, where it found that the returned task was incomplete, returned to the caller, and handed back an incomplete task, which was stored in tasks. The next SaveAsync3 was then executed in the same way.
Now await Task.WhenAll(tasks); really has a meaning, because it is awaiting incomplete operations which will run outside this thread context and in parallel.
All the parts of the SaveAsync3 method after await Task.Delay will be scheduled to the ThreadPool and will run in parallel, unless there is a special case such as a UI thread context, in which case ConfigureAwait(false) after Task.Delay would be needed.
I hope you understand what I want to say. You cannot really say how an async method will run unless you have more information about it, or its code.
This exercise also raises the question of when to use Task.Run on an async method.
It is often misused, and I think that there are really just 2 main cases:
1) When you want to break away from the current thread's context, like UI, ASP.NET, etc.
2) When the async method has a synchronous part (up to the first incomplete await) which is computationally intensive and you want to offload it as well, not just the incomplete await part. This would be the case if SaveAsync3 spent a long time computing the variable i, let's say with Fibonacci :).
For example, you do not have to use Task.Run on something like SaveAsync, which would open a file and save something into it asynchronously, unless the synchronous part before the first await inside SaveAsync is an issue and takes time. Then Task.Run is in order, as part of case 2).
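Case 2) can be illustrated with a small sketch. The SaveAsync below is hypothetical: it takes a callback purely so the sketch can record which thread runs the synchronous part, and a naive recursive Fib plays the role of the expensive computation before the first await.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int callerThread = Thread.CurrentThread.ManagedThreadId;
int directPrefixThread = -1;
bool offloadedPrefixOnPool = false;

// Deliberately naive Fibonacci, used only as CPU-bound work.
long Fib(int n) => n < 2 ? n : Fib(n - 1) + Fib(n - 2);

// Hypothetical SaveAsync with a synchronous, CPU-heavy part before its first await.
async Task SaveAsync(Action recordThread)
{
    recordThread();           // runs on whichever thread called SaveAsync
    long value = Fib(25);     // synchronous part: the caller is blocked while this runs
    await Task.Delay(50);     // first incomplete await: control returns to the caller here
    Console.WriteLine($"Saved {value}");
}

// Called directly: the synchronous part runs on the calling thread.
Task direct = SaveAsync(() => directPrefixThread = Thread.CurrentThread.ManagedThreadId);

// Wrapped in Task.Run: the synchronous part is offloaded to a thread pool thread too.
Task offloaded = Task.Run(() =>
    SaveAsync(() => offloadedPrefixOnPool = Thread.CurrentThread.IsThreadPoolThread));

await Task.WhenAll(direct, offloaded);

Console.WriteLine($"Direct call ran its synchronous part on the caller thread: {directPrefixThread == callerThread}");
Console.WriteLine($"Task.Run ran its synchronous part on a pool thread: {offloadedPrefixOnPool}");
```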