Basic overview: program should launch task to parse some array of data and occasionally enqueue tasks to process it one at a time. Test rig have a button an two labels to display debug info. TaskQueue is a class for SemaphoreSlim from this thread
Dispatcher dispath = Application.Current.Dispatcher;
async void Test_Click(s, e)
{
TaskQueue queue = new TaskQueue();
// Blocks thread if SimulateParse does not have await inside
await SimulateParse(queue);
//await Task.Run(() => SimulateParse(queue));
lblStatus2.Content = string.Format("Awaiting queue"));
await queue.WaitAsync(); //this is just SemaphoreSlim.WaitAsync()
lblStatus.Content = string.Format("Ready"));
lblStatus2.Content = string.Format("Ready"));
MessageBox.Show("Ok");
}
async Task SimulateParse(TaskQueue queue)
{
Random rnd = new Random();
int counter = 0; // representing some piece of data
for(int i = 0; i < 500; i++)
{
dispatch.Invoke(() => lblStatus2.Content = string.Format("Check {0}", ++counter));
Thread.Sleep(25); //no await variant
//await Task.Delay(25);
// if some condition matched - queue work
if (rnd.Next(1, 11) < 2)
{
// Blocks thread even though Enqueue() has await inside
queue.Enqueue(SimulateWork, counter);
//Task.Run(() => queue.Enqueue(SimulateWork, counter));
}
}
}
async Task SimulateWork(object par)
{
dispatch.Invoke(() => lblStatus.Content = string.Format("Working with {0}", par));
Thread.Sleep(400); //no await variant
//await Task.Delay(400);
}
It seems, that it works only if launched task have await inside itself, i.e. if you trying to launch task without await inside it, it will block current thread.
This rig will work as intended, if commented lines are used, but it looks like excessive amount of calls, also, real versions of SimulateParse and SimulateWork does not need to await anything. Main question is - what is the optimal way to launch task with non-async function inside of it? Do i just need to encase them in a Task.Run() like in commented rows?
TaskQueue is used here to run task one by one
It will run them one at a time, yes. SemaphoreSlim does have an implicit queue, but it's not strictly a FIFO-queue. Most synchronization primitives have a mostly-but-not-quite-FIFO implementation, which is Close Enough. This is because they are synchronization primitives, and not queues.
If you want an actual queue (i.e., with guaranteed FIFO order), then you should use a queue, such as TPL Dataflow or System.Threading.Channels.
if you trying to launch task without await inside it, it will block current thread.
All async methods begin executing on the current thread, as described on my blog. async does not mean "run on a different thread". If you want to run a method on a thread pool thread, then wrap that method call in Task.Run. That's a much cleaner solution than sprinkling Task.Delay throughout, and it's more efficient, too (no delays).
Related
private static async Task FuncAsync(DataTable dt, DataRow dr)
{
try
{
await Task.Delay(3000); //assume this is an async http post request that takes 3 seconds to respond
Thread.Sleep(1000) //assume this is some synchronous code that takes 2 second
}
catch (Exception e)
{
Thread.Sleep(1000); //assume this is synchronous code that takes 1 second
}
}
private async void Button1_Click(object sender, EventArgs e)
{
List<Task> lstTasks = new List<Task>();
DataTable dt = (DataTable)gridview1.DataSource;
foreach (DataRow dr in dt.Rows)
{
lstTasks.Add(FuncAsync(dr["colname"].ToString());
}
while (lstTasks.Any())
{
Task finishedTask = await Task.WhenAny(lstTasks);
lstTasks.Remove(finishedTask);
await finishedTask;
progressbar1.ReportProgress();
}
}
Assuming the datatable has got 10000 rows.
In the code, on button click, at the 1st iteration of the for loop, an async api request is made. While it takes 3 seconds, the control immediately goes to the caller. So the for loop can make the next iteration, and so on.
When the api response arrives, the code below the await runs as a callback. Thus blocking the UI thread and any incomplete for loop iterations will be delayed until the callback completes irrespective of whether I use await WhenAny or WhenAll.
All code runs on the UI thread due to the presence of synchronization context. I can do ConfigureAwait false on Task.Delay so the callbacks run on separate threads in order to unblock the ui thread.
Say 1000 iterations are made when the 1st await returns and when the 1st iterations await call back runs the following iterations will have completed completed awaits so their callbacks will run. Effectively callbacks will run one after the other if configure await is true. If false then they will run in parallel on separate threads.
So I think that the progress bar that I am updating in the while loop is incorrect because - by the time the code reaches the while block, most of the initial for loop iterations will have been already completed. I hope that I have understood correctly so far.
I have the following options to report progress from inside the task:
using IProgress (I think this is more suitable to report progress from another thread [for example when using Task.Run], or in usual async await if the configure await is false, resulting in code below the await to run in separate thread otherwise it will not show the progress bar moving as the ui thread will be blocked running the callbacks. In my current example code always runs on the same UI thread). So I was thinking the below point may be more appropriate solution.
making the Task non-static so that I can access the progress bar from within the Task and do porgressbar1.PerformStep().
Another thing I have noticed is that await WhenAll doesn't guarantee that IProgress is fully executed.
The IProgress<T> implementation offered natively by the .NET platform, the Progress<T> class, has the interesting characteristic of notifying the captured SynchronizationContext asynchronously, by invoking its Post method. This characteristic sometimes results to unexpected behavior. For example can you guess what effect has the code below to the Label1 control?
IProgress<string> progress = new Progress<string>(s => Label1.Text = s);
progress.Report("Hello");
Label1.Text = "World";
What text will be eventually written to the label, "Hello" or "World"? The correct answer is: "Hello". The delegate s => Label1.Text = s is invoked asynchronously, so it runs after the execution of the Label1.Text = "World" line, which is invoked synchronously.
Implementing a synchronous version of the Progress<T> class is quite trivial. All you have to do is copy-paste Microsoft's source code, rename the class from Progress<T> to SynchronousProgress<T>, and change the line m_synchronizationContext.Post(... to m_synchronizationContext.Send(.... This way every time you invoke the progress.Report method, the call will block until the invocation of the delegate on the UI thread is completed. The unfortunate implication of this is that if the UI thread is blocked for some reason, for example because you used the .Wait() or the .Result to wait synchronously for the task to complete, your application will deadlock.
The asynchronous nature of the Progress<T> class is rarely a problem in practice, but if you want to avoid thinking about it you can just manipulate the ProgressBar1 control directly. After all you are not writing a library, you are just writing code in the event handler of a button to make some HTTP requests. My suggestion is to forget about the .ConfigureAwait(false) hackery, and just let the main workflow of your asynchronous event handler to stay on the UI thread from start to end. If you have synchronous blocking code that needs to be offloaded to a ThreadPool thread, use the Task.Run method to offload it. To create your tasks, instead of manually adding tasks to a List<Task>, use the handly LINQ Select operator to project each DataRow to a Task. Also add a reference to the System.Data.DataSetExtensions assembly, so that the DataTable.AsEnumerable extension method becomes available. Finally add a throttler (a SemaphoreSlim), so that your application makes efficient use of the available network bandwidth, and it doesn't overburden the target machine:
private async void Button1_Click(object sender, EventArgs e)
{
Button1.Enabled = false;
const int maximumConcurrency = 10;
var throttler = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
DataTable dataTable = (DataTable)GridView1.DataSource;
ProgressBar1.Minimum = 0;
ProgressBar1.Maximum = dataTable.Rows.Count;
ProgressBar1.Step = 1;
ProgressBar1.Value = 0;
Task[] tasks = dataTable.AsEnumerable().Select(async row =>
{
await throttler.WaitAsync();
try
{
await Task.Delay(3000); // Simulate an asynchronous HTTP request
await Task.Run(() => Thread.Sleep(2000)); // Simulate synchronous code
}
catch
{
await Task.Run(() => Thread.Sleep(1000)); // Simulate synchronous code
}
finally
{
throttler.Release();
}
ProgressBar1.PerformStep();
}).ToArray();
await Task.WhenAll(tasks);
Button1.Enabled = true;
}
You can simply add a wrapper function:
private IProgress<double> _progress;
private int _jobsFinished = 0;
private int _totalJobs = 1000;
private static async Task FuncAsync()
{
try
{
await Task.Delay(3000); //assume this is an async http post request that takes 3 seconds to respond
Thread.Sleep(1000); //assume this is some synchronous code that takes 2 second
}
catch (Exception e)
{
Thread.Sleep(1000); //assume this is synchronous code that takes 1 second
}
}
private async Task AwaitAndUpdateProgress()
{
await FuncAsync(); // Can also do Task.Run(FuncAsync) to run on a worker thread
_jobsFinished++;
_progress.Report((double) _jobsFinished / _totalJobs);
}
And then just WhenAll after adding all the calls.
This is the code that I wrote to better understand asynchronous methods. I knew that an asynchronous method is not the same as multithreading, but it does not seem so in this particular scenario:
class Program
{
static void Main(string[] args)
{
Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
//the line above just makes sure that the console output uses . to represent doubles instead of ,
ExecuteAsync();
Console.ReadLine();
}
private static async Task ParallelAsyncMethod() //this is the method where async parallel execution is taking place
{
List<Task<string>> tasks = new List<Task<string>>();
for (int i = 0; i < 5; i++)
{
tasks.Add(Task.Run(() => DownloadWebsite()));
}
var strings = await Task.WhenAll(tasks);
foreach (var str in strings)
{
Console.WriteLine(str);
}
}
private static string DownloadWebsite() //Imitating a website download
{
Thread.Sleep(1500); //making the thread sleep for 1500 miliseconds before returning
return "Download finished";
}
private static async void ExecuteAsync()
{
var watch = Stopwatch.StartNew();
await ParallelAsyncMethod();
watch.Stop();
Console.WriteLine($"It took the machine {watch.ElapsedMilliseconds} milliseconds" +
$" or {Convert.ToDouble(watch.ElapsedMilliseconds) / 1000} seconds to complete this task");
Console.ReadLine();
}
}
//OUTPUT:
/*
Download finished
Download finished
Download finished
Download finished
Download finished
It took the machine 1537 milliseconds or 1.537 seconds to complete this task
*/
As you can see, the DownloadWebsite method waits for 1.5 seconds and then returns "a". The method called ParallelAsyncMethod adds five of these methods into the "tasks" list and then starts the parallel asynchronous execution. As you can see, I also tracked the amount of time that it takes for the ExecuteAsync method to be executed. The result is always somewhere around 1540 milliseconds. Here is my question: if the DownloadWebsite method required a thread to sleep 5 times for 1500 milliseconds, does it mean that the parallel execution of these methods required 5 different threads? If not, then how come it only took the program 1540 milliseconds to be executed and not ~7500 ms?
I knew that an asynchronous method is not the same as multi-threading
That is correct, an asynchronous method releases the current thread whilst I/O occurs, and schedules a continuation after it's completion.
Async and threads are completely unrelated concepts.
but it does not seem so in this particular scenario
That is because you explicitly run DownloadWebsite on the ThreadPool using Task.Run, which imitates asynchronous code by returning a Task after instructing the provided delegate to run.
Because you are not waiting for each Task to complete before starting the next, multiple threads can be used simultaneously.
Currently each thread is being blocked, as you have used Thread.Sleep in the implementation of DownloadWebsite, meaning you are actually running 5 synchronous methods on the ThreadPool.
In production code your DownloadWebsite method should be written asynchronously, maybe using HttpClient.GetAsync:
private static async Task<string> DownloadWebsiteAsync()
{
//...
await httpClinet.GetAsync(//...
//...
}
In that case, GetAsync returns a Task, and releases the current thread whilst waiting for the HTTP response.
You can still run multiple async methods concurrently, but as the thread is released each time, this may well use less than 5 separate threads and may even use a single thread.
Ensure that you dont use Task.Run with an asynchronous method; this simply adds unnecessary overhead:
var tasks = new List<Task<string>>();
for (int i = 0; i < 5; i++)
{
tasks.Add(DownloadWebsiteAsync()); // No need for Task.Run
}
var strings = await Task.WhenAll(tasks);
As an aside, if you want to imitate an async operation, use Task.Delay instead of Thread.Sleep as the former is non-blocking:
private static async Task<string> DownloadWebsite() //Imitating a website download
{
await Task.Delay(1500); // Release the thread for ~1500ms before continuing
return "Download finished";
}
Lets say i have the following code for example:
private async Task ManageClients()
{
for (int i =0; i < listClients.Count; i++)
{
if (list[i] == 0)
await DoSomethingWithClientAsync();
else
await DoOtherThingAsync();
}
DoOtherWork();
}
My questions are:
1. Will the for() continue and process other clients on the list?, or it
will await untill it finishes one of the tasks.
2. Is even a good practice to use async/await inside a loop?
3. Can it be done in a better way?
I know it was a really simple example, but I'm trying to imagine what would happen if that code was a server with thousands of clients.
In your code example, the code will "block" when the loop reaches await, meaning the other clients will not be processed until the first one is complete. That is because, while the code uses asynchronous calls, it was written using a synchronous logic mindset.
An asynchronous approach should look more like this:
private async Task ManageClients()
{
var tasks = listClients.Select( client => DoSomethingWithClient() );
await Task.WhenAll(tasks);
DoOtherWork();
}
Notice there is only one await, which simultaneously awaits all of the clients, and allows them to complete in any order. This avoids the situation where the loop is blocked waiting for the first client.
If a thread that is executing an async function is calling another async function, this other function is executed as if it was not async until it sees a call to a third async function. This third async function is executed also as if it was not async.
This goes on, until the thread sees an await.
Instead of really doing nothing, the thread goes up the call stack, to see if the caller was not awaiting for the result of the called function. If not, the thread continues the statements in the caller function until it sees an await. The thread goes up the call stack again to see if it can continue there.
This can be seen in the following code:
var taskDoSomething = DoSomethingAsync(...);
// because we are not awaiting, the following is done as soon as DoSomethingAsync has to await:
DoSomethingElse();
// from here we need the result from DoSomethingAsync. await for it:
var someResult = await taskDoSomething;
You can even call several sub-procedures without awaiting:
var taskDoSomething = DoSomethingAsync(...);
var taskDoSomethingElse = DoSomethingElseAsync(...);
// we are here both tasks are awaiting
DoSomethingElse();
Once you need the results of the tasks, if depends what you want to do with them. Can you continue processing if one task is completed but the other is not?
var someResult = await taskDoSomething;
ProcessResult(someResult);
var someOtherResult = await taskDoSomethingelse;
ProcessBothResults(someResult, someOtherResult);
If you need the result of all tasks before you can continue, use Task.WhenAll:
Task[] allTasks = new Task[] {taskDoSomething, taskDoSomethingElse);
await Task.WhenAll(allTasks);
var someResult = taskDoSomething.Result;
var someOtherResult = taskDoSomethingElse.Result;
ProcessBothResults(someResult, someOtherResult);
Back to your question
If you have a sequence of items where you need to start awaitable tasks, it depends on whether the tasks need the result of other tasks or not. In other words can task[2] start if task[1] has not been completed yet? Do Task[1] and Task[2] interfere with each other if they run both at the same time?
If they are independent, then start all Tasks without awaiting. Then use Task.WhenAll to wait until all are finished. The Task scheduler will take care that not to many tasks will be started at the same time. Be aware though, that starting several tasks could lead to deadlocks. Check carefully if you need critical sections
var clientTasks = new List<Task>();
foreach(var client in clients)
{
if (list[i] == 0)
clientTasks.Add(DoSomethingWithClientAsync());
else
clientTasks.Add(DoOtherThingAsync());
}
// if here: all tasks started. If desired you can do other things:
AndNowForSomethingCompletelyDifferent();
// later we need the other tasks to be finished:
var taskWaitAll = Task.WhenAll(clientTasks);
// did you notice we still did not await yet, we are still in business:
MontyPython();
// okay, done with frolicking, we need the results:
await taskWaitAll;
DoOtherWork();
This was the scenario where all Tasks where independent: no task needed the other to be completed before it could start. However if you need Task[2] to be completed before you can start Task[3] you should await:
foreach(var client in clients)
{
if (list[i] == 0)
await DoSomethingWithClientAsync());
else
await DoOtherThingAsync();
}
I'm still getting up to speed with async & multi threading. I'm trying to monitor when the Task I Start is still running (to show in a UI). However it's indicating that it is RanToCompletion earlier than I want, when it hits an await, even when I consider its Status as still Running.
Here is the sample I'm doing. It all seems to be centred around the await's. When it hits an await, it is then marked as RanToCompletion.
I want to keep track of the main Task which starts it all, in a way which indicates to me that it is still running all the way to the end and only RanToCompletion when it is all done, including the repo call and the WhenAll.
How can I change this to get the feedback I want about the tskProdSeeding task status?
My Console application Main method calls this:
Task tskProdSeeding;
tskProdSeeding = Task.Factory.StartNew(SeedingProd, _cts.Token);
Which the runs this:
private async void SeedingProd(object state)
{
var token = (CancellationToken)state;
while (!token.IsCancellationRequested)
{
int totalSeeded = 0;
var codesToSeed = await _myRepository.All().ToListAsync(token);
await Task.WhenAll(Task.Run(async () =>
{
foreach (var code in codesToSeed)
{
if (!token.IsCancellationRequested)
{
try
{
int seedCountByCode = await _myManager.SeedDataFromLive(code);
totalSeeded += seedCountByCode;
}
catch (Exception ex)
{
_logger.InfoFormat(ex.ToString());
}
}
}
}, token));
Thread.Sleep(30000);
}
}
If you use async void the outer task can't tell when the task is finished, you need to use async Task instead.
Second, once you do switch to async Task, Task.Factory.StartNew can't handle functions that return a Task, you need to switch to Task.Run(
tskProdSeeding = Task.Run(() => SeedingProd(_cts.Token), _cts.Token);
Once you do both of those changes you will be able to await or do a .Wait() on tskProdSeeding and it will properly wait till all the work is done before continuing.
Please read "Async/Await - Best Practices in Asynchronous Programming" to learn more about not doing async void.
Please read "StartNew is Dangerous" to learn more about why you should not be using StartNew the way you are using it.
P.S. In SeedingProd you should switch it to use await Task.Delay(30000); insetad of Thread.Sleep(30000);, you will then not tie up a thread while it waits. If you do this you likely could drop the
tskProdSeeding = Task.Run(() => SeedingProd(_cts.Token), _cts.Token);
and just make it
tskProdSeeding = SeedingProd(_cts.Token);
because the function no-longer has a blocking call inside of it.
I'm not convinced that you need a second thread (Task.Run or StartNew) at all. It looks like the bulk of the work is I/O-bound and if you're doing it asynchronously and using Task.Delay instead of Thread.Sleep, then there is no thread consumed by those operations and your UI shouldn't freeze. The first thing anyone new to async needs to understand is that it's not the same thing as multithreading. The latter is all about consuming more threads, the former is all about consuming fewer. Focus on eliminating the blocking and you shouldn't need a second thread.
As others have noted, SeedingProd needs to return a Task, not void, so you can observe its completion. I believe your method can be reduced to this:
private async Task SeedingProd(CancellationToken token)
{
while (!token.IsCancellationRequested)
{
int totalSeeded = 0;
var codesToSeed = await _myRepository.All().ToListAsync(token);
foreach (var code in codesToSeed)
{
if (token.IsCancellationRequested)
return;
try
{
int seedCountByCode = await _myManager.SeedDataFromLive(code);
totalSeeded += seedCountByCode;
}
catch (Exception ex)
{
_logger.InfoFormat(ex.ToString());
}
}
await Task.Dealy(30000);
}
}
Then simply call the method, without awaiting it, and you'll have your task.
Task mainTask = SeedingProd(token);
When you specify async on a method, it compiles into a state machine with a Task, so SeedingProd does not run synchronously, but acts as a Task even if returns void. So when you call Task.Factory.StartNew(SeedingProd) you start a task that kick off another task - that's why the first one finishes immediately before the second one. All you have to do is add the Task return parameter instead of void:
private async Task SeedingProdAsync(CancellationToken ct)
{
...
}
and call it as simply as this:
Task tskProdSeeding = SeedingProdAsync(_cts.Token);
I am not an advanced developer. I'm just trying to get a hold on the task library and just googling. I've never used the class SemaphoreSlim so I would like to know what it does. Here I present code where SemaphoreSlim is used with async & await but which I do not understand. Could someone help me to understand the code below.
1st set of code
await WorkerMainAsync();
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
while (true)
{
await ss.WaitAsync();
// you should probably store this task somewhere and then await it
var task = DoPollingThenWorkAsync();
}
}
async Task DoPollingThenWorkAsync(SemaphoreSlim semaphore)
{
var msg = Poll();
if (msg != null)
{
await Task.Delay(3000); // process the I/O-bound job
}
// this assumes you don't have to worry about exceptions
// otherwise consider try-finally
semaphore.Release();
}
Firstly, the WorkerMainAsync will be called and a SemaphoreSlim is used. Why is 10 passed to the constructor of SemaphoreSlim?
When does the control come out of the while loop again?
What does ss.WaitAsync(); do?
The DoPollingThenWorkAsync() function is expecting a SemaphoreSlim but is not passed anything when it is called. Is this typo?
Why is await Task.Delay(3000); used?
They could simply use Task.Delay(3000) but why do they use await here instead?
2nd set of code for same purpose
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
List<Task> trackedTasks = new List<Task>();
while (DoMore())
{
await ss.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
DoPollingThenWorkAsync();
ss.Release();
}));
}
await Task.WhenAll(trackedTasks);
}
void DoPollingThenWorkAsync()
{
var msg = Poll();
if (msg != null)
{
Thread.Sleep(2000); // process the long running CPU-bound job
}
}
Here is a task & ss.Release added to a list. I really do not understand how tasks can run after adding to a list?
trackedTasks.Add(Task.Run(async () =>
{
await DoPollingThenWorkAsync();
ss.Release();
}));
I am looking forward for a good explanation & help to understand the two sets of code. Thanks
why 10 is passing to SemaphoreSlim constructor.
They are using SemaphoreSlim to limit to 10 tasks at a time. The semaphore is "taken" before each task is started, and each task "releases" it when it finishes. For more about semaphores, see MSDN.
they can use simply Task.Delay(3000) but why they use await here.
Task.Delay creates a task that completes after the specified time interval and returns it. Like most Task-returning methods, Task.Delay returns immediately; it is the returned Task that has the delay. So if the code did not await it, there would be no delay.
just really do not understand after adding task to list how they can run?
In the Task-based Asynchronous Pattern, Task objects are returned "hot". This means they're already running by the time they're returned. The await Task.WhenAll at the end is waiting for them all to complete.