To provide producer-consumer functionality that can queue and execute async methods one after the other, I'm trying to implement an async queue. I noticed major performance issues using it in a large application.
async Task Loop() {
while (Verify()) {
if (!_blockingCollection.TryTake(out var func, 1000, _token)) continue;
await func.Invoke();
}
}
Implementation of AsyncQueue.Add:
public void Add(Func<Task> func) {
_blockingCollection.Add(func);
}
Example usage from arbitrary thread:
controller.OnEvent += (o, a) => _queue.Add(async (t) => await handle(a));
Execution paths' depend on the state of the application and include
async network requests that internally use TaskCompletionSource to return result
IO operations
tasks that get added to a list and are awaited using Task.WhenAll(...)
an async void method that converts an array and awaits a network request
Symptoms:
The application slows down gradually.
When I replace await func.Invoke() with func.Invoke().Wait() instead of awaiting it properly, performance improves dramatically and it does not slow down.
Why is that? Is an async queue that uses BlockingCollection a bad idea?
What is a better alternative?
Why is that?
There isn't enough information in the question to provide an answer to this.
As others have noted, there's a CPU-consuming spin issue with the loop as it currently is.
In the meantime, I can at least answer this part:
Is an async Queue that uses BlockingCollection a bad idea?
Yes.
What is a better alternative?
Use an async-compatible queue. E.g., Channels, or BufferBlock/ActionBlock from TPL Dataflow.
Example using Channels:
async Task Loop() {
await foreach (var func in channelReader.ReadAllAsync()) {
await func.Invoke();
}
}
or if you're not on .NET Core yet:
async Task Loop() {
while (await channelReader.WaitToReadAsync()) {
while (channelReader.TryRead(out var func)) {
await func.Invoke();
}
}
}
Related
I'm struggling to understand what's happening in this simple program.
In the example below I have a task factory that uses the LimitedConcurrencyLevelTaskScheduler from ParallelExtensionsExtras with maxDegreeOfParallelism set to 2.
I then start 2 tasks that each call an async method (e.g. an async Http request), then gets the awaiter and the result of the completed task.
The problem seem to be that Task.Delay(2000) never completes. If I set maxDegreeOfParallelism to 3 (or greater) it completes. But with maxDegreeOfParallelism = 2 (or less) my guess is that there is no thread available to complete the task. Why is that?
It seems to be related to async/await since if I remove it and simply do Task.Delay(2000).GetAwaiter().GetResult() in DoWork it works perfectly. Does async/await somehow use the parent task's task scheduler, or how is it connected?
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Schedulers;
namespace LimitedConcurrency
{
class Program
{
static void Main(string[] args)
{
var test = new TaskSchedulerTest();
test.Run();
}
}
class TaskSchedulerTest
{
public void Run()
{
var scheduler = new LimitedConcurrencyLevelTaskScheduler(2);
var taskFactory = new TaskFactory(scheduler);
var tasks = Enumerable.Range(1, 2).Select(id => taskFactory.StartNew(() => DoWork(id)));
Task.WaitAll(tasks.ToArray());
}
private void DoWork(int id)
{
Console.WriteLine($"Starting Work {id}");
HttpClientGetAsync().GetAwaiter().GetResult();
Console.WriteLine($"Finished Work {id}");
}
async Task HttpClientGetAsync()
{
await Task.Delay(2000);
}
}
}
Thanks in advance for any help
await by default captures the current context and uses that to resume the async method. This context is SynchronizationContext.Current, unless it is null, in which case it is TaskScheduler.Current.
In this case, await is capturing the LimitedConcurrencyLevelTaskScheduler used to execute DoWork. So, after starting the Task.Delay both times, both of those threads are blocked (due to the GetAwaiter().GetResult()). When the Task.Delay completes, the await schedules the remainder of the HttpClientGetAsync method to its context. However, the context will not run it since it already has 2 threads.
So you end up with threads blocked in the context until their async methods complete, but the async methods cannot complete until there is a free thread in the context; thus a deadlock. Very similar to the standard "don't block on async code" style of deadlock, just with n threads instead of one.
Clarifications:
The problem seem to be that Task.Delay(2000) never completes.
Task.Delay is completing, but the await cannot continue executing the async method.
If I set maxDegreeOfParallelism to 3 (or greater) it completes. But with maxDegreeOfParallelism = 2 (or less) my guess is that there is no thread available to complete the task. Why is that?
There are plenty of threads available. But the LimitedConcurrencyTaskScheduler only allows 2 threads at a time to run in its context.
It seems to be related to async/await since if I remove it and simply do Task.Delay(2000).GetAwaiter().GetResult() in DoWork it works perfectly.
Yes; it's the await that is capturing the context. Task.Delay does not capture a context internally, so it can complete without needing to enter the LimitedConcurrencyTaskScheduler.
Solution:
Task schedulers in general do not work very well with asynchronous code. This is because task schedulers were designed for Parallel Tasks rather than asynchronous tasks. So they only apply when code is running (or blocked). In this case, LimitedConcurrencyLevelTaskScheduler only "counts" code that's running; if you have a method that's doing an await, it won't "count" against that concurrency limit.
So, your code has ended up in a situation where it has the sync-over-async antipattern, probably because someone was trying to avoid the problem of await not working as expected with limited concurrency task schedulers. This sync-over-async antipattern has then caused the deadlock problem.
Now, you could add in more hacks by using ConfigureAwait(false) everywhere and continue blocking on asynchronous code, or you could fix it better.
A more proper fix would be to do asynchronous throttling. Toss out the LimitedConcurrencyLevelTaskScheduler completely; concurrency-limiting task schedulers only work with synchronous code, and your code is asynchronous. You can do asynchronous throttling using SemaphoreSlim, as such:
class TaskSchedulerTest
{
private readonly SemaphoreSlim _mutex = new SemaphoreSlim(2);
public async Task RunAsync()
{
var tasks = Enumerable.Range(1, 2).Select(id => DoWorkAsync(id));
await Task.WhenAll(tasks);
}
private async Task DoWorkAsync(int id)
{
await _mutex.WaitAsync();
try
{
Console.WriteLine($"Starting Work {id}");
await HttpClientGetAsync();
Console.WriteLine($"Finished Work {id}");
}
finally
{
_mutex.Release();
}
}
async Task HttpClientGetAsync()
{
await Task.Delay(2000);
}
}
I think you are encountering a sync deadlock. You are waiting for a thread to complete that is waiting for your thread to complete. Never going to happen. If you make your DoWork method async so you can await the HttpClientGetAsync() call, and you'll avoid the deadlock.
using MassTransit.Util;
using System;
using System.Linq;
using System.Threading.Tasks;
//using System.Threading.Tasks.Schedulers;
namespace LimitedConcurrency
{
class Program
{
static void Main(string[] args)
{
var test = new TaskSchedulerTest();
test.Run();
}
}
class TaskSchedulerTest
{
public void Run()
{
var scheduler = new LimitedConcurrencyLevelTaskScheduler(2);
var taskFactory = new TaskFactory(scheduler);
var tasks = Enumerable.Range(1, 2).Select(id => taskFactory.StartNew(() => DoWork(id)));
Task.WaitAll(tasks.ToArray());
}
private async Task DoWork(int id)
{
Console.WriteLine($"Starting Work {id}");
await HttpClientGetAsync();
Console.WriteLine($"Finished Work {id}");
}
async Task HttpClientGetAsync()
{
await Task.Delay(2000);
}
}
}
https://medium.com/rubrikkgroup/understanding-async-avoiding-deadlocks-e41f8f2c6f5d
TLDR never call .result, which I'm sure .GetResult(); was doing
Nearly every introduction about async programming for C# warns against using the Sleep instruction, because it would block the whole thread.
But I found that during sleep, the Tasks from the queue are being fetched and executed. See:
using System;
using System.Threading.Tasks;
namespace TestApp {
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Main");
Program.step1();
for (int i = 0; i < 6; i++) {
System.Threading.Thread.Sleep(200);
Console.WriteLine("Sleep-Loop");
}
}
private static async void step1() {
await Task.Delay(400);
Console.WriteLine("Step1");
Program.step2();
}
private static async void step2() {
await Task.Delay(400);
Console.WriteLine("Step2");
}
}
}
The output:
Main
Sleep-Loop
Sleep-Loop
Step1
Sleep-Loop
Sleep-Loop
Step2
Sleep-Loop
Sleep-Loop
My questions:
Is Sleep really allows the queued tasks to execute, or something else happens?
If yes, then does this also happen in every other cases of idleness? For example during polling?
In the above example, if we comment out the loop, then the application exits before any tasks could get executed. Is there another way to prevent that?
In C# 7.3 you can have async entry points, I suggest using that.
Some notes :
Don't use async void, it has subtleties with the way it deals with errors, if you see yourself writing async void then think about what you are doing. If it's not for an event handler you are probably doing something wrong
If you want to wait for a bunch of tasks to finish, use Task.WhenAll
Modified example
static async Task Main(string[] args)
{
Console.WriteLine("Start Task");
var task = Program.step1();
for (int i = 0; i < 6; i++)
{
await Task.Delay(100);
Console.WriteLine("Sleep-Loop");
}
Console.WriteLine("waiting for the task to finish");
await task;
Console.WriteLine("finished");
Console.ReadKey();
}
private static async Task step1()
{
await Task.Delay(1000);
Console.WriteLine("Step1");
await Program.step2();
}
private static async Task step2()
{
await Task.Delay(1000);
Console.WriteLine("Step2");
}
It's important to note Tasks are not threads and async is not parallel, however they can be.
9 times out of 10 if you are using the async await pattern it is for IO bound work to use operating system I/O completion ports so you can free up threads. It's a scalability and UI responsiveness feature.
If you aren't doing any I/O work, then there is actually very little need for the async await pattern at all, and as such CPU work should probably be just wrapped in a Task.Run at the point of calling. Not wrapped in an async method.
At this point it's also good to note just using tasks are not the async and await pattern. Although they both have tasks in common, they are not the same thing.
Lastly, if you find you need to use asynchronous code in a fire and forget way, think very carefully how you will handle any errors.
Here are some guidelines.
If you want to do I/O work, use the async await pattern.
If you want to do CPU work use Task.Run.
Never use async void unless it's for an event handler.
Never wrap CPU work in an async method, let the caller use Task.Run
If you need to wait for a task, await it, never call Result, or Wait or use Task.WhenAll
I have case there i want to call one asyn method inside paralle.Foreach loop
public void ItemCheck<T>(IList<T> items,int id)
{
Parallel.ForEach(items, (current) =>
{
PostData(current,id);
});
Console.log("ItemCheck executed")
}
public async void PostData<T>(T obj, int id)
{
Console.lgo("PosstData executed")
}
Output :
ItemCheck executed
PosstData executed
Why it happens like this ?? Before completing execution of PostData method,next line is executed.How can i solve this issue.Anyone help on this
Why it happens like this ??
Because you're using async void.
Also - as Jon Skeet mentioned - there's no point in doing parallelism here. Parallel processing is splitting the work over multiple threads. What you really want is concurrency, not parallelism, which can be done with Task.WhenAll:
public async Task ItemCheckAsync<T>(IList<T> items, int id)
{
var tasks = items.Select(current => PostDataAsync(current, id));
await Task.WhenAll(tasks);
}
public async Task PostDataAsync<T>(T obj, int id)
The phrase "in parallel" is commonly used to mean "doing more than one thing at a time", but that usage has misled you into using Parallel, which is not the appropriate tool in this case. This is one reason why I strongly prefer the term "concurrent", and reserve the term "parallel" for things that the Parallel class does.
The problem is that your PostData method is executed asynchronous and nothing tells the parallel loop to wait until completion of all task.
An alternative i would use to sync the execution flow:
var tasks = items
.AsParallel()
.WithDegreeOfParallelisum(...)
.Select(async item => await PostData(item, id))
.ToArray();
Task.WaitAll(tasks); // this will wait for all tasks to finnish
Also your async methods, even void they should return Task not void. Being more explicit about your code is one plus of this approach, additionally, you can use the task for any other operations, like in the case be waited to finish.
public async Task PostData<T>(T obj, int id)
Do you even need to create/start async task in parallel and then wait for them? The result of this is just creating tasks in parallel (which is not a heavy operation, so why do it in parallel ?).
If you dont do any additional heavy work in the parallel loop except the PostData method, i think you don't need the parallel loop at all.
Is it possible to rewrite the code below using the async await syntax?
private void PullTablePages()
{
var taskList = new List<Task>();
var faultedList = new List<Task>();
foreach (var featureItem in featuresWithDataName)
{
var task = Task.Run(() => PullTablePage();
taskList.Add(task);
if(taskList.Count == Constants.THREADS)
{
var index = Task.WaitAny(taskList.ToArray());
taskList.Remove(taskList[index]);
}
}
if (taskList.Any())
{
Task.WaitAll(taskList.ToArray());
}
//todo: do something with faulted list
}
When I rewrote it as below, the code doesn't block and the console application finishes before most of the threads complete.
It seems like the await syntax doesn't block as I expected.
private async void PullTablePagesAsync()
{
var taskList = new List<Task>();
var faultedList = new List<Task>();
foreach (var featureItem in featuresWithDataName)
{
var task = Task.Run(() => PullTablePage();
taskList.Add(task);
if(taskList.Count == Constants.THREADS)
{
var anyFinished = await Task.WhenAny(taskList.ToArray());
await anyFinished;
for (var index = taskList.Count - 1; index >= 0; index--)
{
if (taskList[index].IsCompleted)
{
taskList.Remove(taskList[index]);
}
}
}
}
if (taskList.Any())
{
await Task.WhenAll(taskList);
}
//todo: what to do with faulted list?
}
Is it possible to do so?
WaitAll doesn't seem to wait for all tasks to complete. How do I get it to do so? The return type says that it returns a task, but can't seem to figure out the syntax.## Heading ##
New to multithreading, please excuse ignorance.
Is it possible to rewrite the code below using the async await syntax?
Yes and no. Yes, because you already did it. No, because while being equivalent from the function implementation standpoint, it's not equivalent from the function caller perspective (as you already experienced with your console application). Contrary to many other C# keywords, await is not a syntax sugar. There is a reason why the compiler forces you to mark your function with async in order to enable await construct, and the reason is that now your function is no more blocking and that puts additional responsibility to the callers - either put themselves to be non blocking (async) or use a blocking calls to your function. Every async method in fact is and should return Task or Task<TResult> and then the compiler will warn the callers that ignore that fact. The only exception is async void which is supposed to be used only for event handlers, which by nature should not care what the object being notified is doing.
Shortly, async/await is not for rewriting synchronous (blocking code), but for easy turning it to asynchronous (non blocking). If your function is supposed to be synchronous, then just keep it the way it is (and your original implementation is perfectly doing that). But if you need asynchronous version, then you should change the signature to
private async Task PullTablePagesAsync()
with the await rewritten body (which you already did correctly). And for backward compatibility provide the old synchronous version using the new implementation like this
private void PullTablePages() { PullTablePagesAsync().Wait(); }
It seems like the await syntax doesn't block as I expected.
You're expecting the wrong thing.
The await syntax should never block - it's just that the execution flow should not continue until the task is finished.
Usually you are using async/await in methods that return a task. In your case you're using it in a method with a return type void.
It takes times to get your head around async void methods, that's why their use is usually discouraged. Async void methods run synchronously (block) to the calling method until the first (not completed) task is awaited. What happens after depends on the execution context (usually you're running on the pool). What's important: The calling method (the one that calls PullTablePAgesAsync) does not know of continuations and can't know when all code in PullTablePagesAsync is done.
Maybe take a look on the async/await best practices on MSDN
I'd like to write a bunch of methods querying the Oracle Database in the async/await way. Since ODP.NET seems to support neither awaitable *Async methods nor Begin/EndOperationName pairs, what options do I have to implement this manually?
All examples for I/O-intensive async methods I've seen so far call only other async methods from the .NET library, but no light is shed on the way the context switching is done internally. The documentation says that in these cases no separate thread is used and the multithreading overhead is apparently worth only for CPU-intensive operations. So I guess using Task.Run() is not an option, or am I wrong?
As long as I know Oracle ODP is a sync wrapper for an async library. I found this post as I just wondering the same: Will the introduction of an async pattern for Oracle ODP calls improve performance? (I am using WCF on IIS NET TCP).
But, as it have been said, as long as the introduction of the async pattern is done creating a new Task and the calling thread is already from the thread pool, no improvement can't be done, and it will be just a overhead.
you can always use a Task.Factory.StartNew with the TaskCreationOptions.LongRunning, so that .NET will create a new thread rather than using a threadpool thread. Below is the manual asynchronous code that you can apply to your operation.
private static void ManualAsyncOperation()
{
Task<string> task = Task.Factory.StartNew(() =>
{
Console.WriteLine("Accessing database .....");
//Mimic the DB operation
Thread.Sleep(1000);
return "Hello wolrd";
},TaskCreationOptions.LongRunning);
var awaiter =task.GetAwaiter();
awaiter.OnCompleted(() =>
{
try
{
var result = awaiter.GetResult();
Console.WriteLine("Result: {0}", result);
}
catch (Exception exception)
{
Console.WriteLine("Exception: {0}",exception);
}
});
Console.WriteLine("Continuing on the main thread and waiting for the result ....");
Console.WriteLine();
Console.ReadLine();
}
I'm using this
public static class TaskHelper
{
public async static Task<T> AsAsync<T>(Func<T> function, TaskCreationOptions taskCreationOptions = TaskCreationOptions.None)
{
return await Task.Factory.StartNew(function, taskCreationOptions);
}
public async static Task AsAsync(Action action, TaskCreationOptions taskCreationOptions = TaskCreationOptions.None)
{
await Task.Factory.StartNew(action, taskCreationOptions);
}
}
Any synchronous function can be made asynchronous and you can await of it.