Issue with calling async method inside Parallel.Foreach method - c#

I have a case where I want to call an async method inside a Parallel.ForEach loop:
public void ItemCheck<T>(IList<T> items, int id)
{
Parallel.ForEach(items, (current) =>
{
PostData(current, id);
});
Console.WriteLine("ItemCheck executed");
}
public async void PostData<T>(T obj, int id)
{
Console.WriteLine("PostData executed");
}
Output:
ItemCheck executed
PostData executed
Why does it happen like this? The next line is executed before the PostData method has completed. How can I solve this?

Why does it happen like this?
Because you're using async void.
Also - as Jon Skeet mentioned - there's no point in doing parallelism here. Parallel processing is splitting the work over multiple threads. What you really want is concurrency, not parallelism, which can be done with Task.WhenAll:
public async Task ItemCheckAsync<T>(IList<T> items, int id)
{
var tasks = items.Select(current => PostDataAsync(current, id));
await Task.WhenAll(tasks);
}
public async Task PostDataAsync<T>(T obj, int id)
The phrase "in parallel" is commonly used to mean "doing more than one thing at a time", but that usage has misled you into using Parallel, which is not the appropriate tool in this case. This is one reason why I strongly prefer the term "concurrent", and reserve the term "parallel" for things that the Parallel class does.
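For reference, here is a minimal, self-contained sketch of the Task.WhenAll approach. The body of PostDataAsync is an assumption: Task.Delay merely stands in for whatever asynchronous work the real method performs, and the class name ItemChecker is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ItemChecker
{
    // Starts one PostDataAsync call per item and completes when all of them have finished.
    public static async Task ItemCheckAsync<T>(IList<T> items, int id)
    {
        var tasks = items.Select(current => PostDataAsync(current, id));
        await Task.WhenAll(tasks);
        Console.WriteLine("ItemCheck executed");
    }

    // Hypothetical body: Task.Delay stands in for the real asynchronous I/O.
    public static async Task PostDataAsync<T>(T obj, int id)
    {
        await Task.Delay(100);
        Console.WriteLine($"PostData executed for {obj}");
    }

    public static async Task Main()
    {
        await ItemCheckAsync(new List<string> { "a", "b", "c" }, 42);
    }
}
```

Because ItemCheckAsync awaits the combined task, "ItemCheck executed" is printed only after every PostDataAsync call has completed. The order of the individual "PostData executed" lines is not guaranteed.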

The problem is that your PostData method executes asynchronously and nothing tells the parallel loop to wait until all the tasks have completed.
An alternative I would use to synchronize the execution flow:
var tasks = items
.AsParallel()
.WithDegreeOfParallelism(...)
.Select(async item => await PostData(item, id))
.ToArray();
Task.WaitAll(tasks); // this will wait for all tasks to finish
Also, your async methods should return Task rather than void, even when they produce no result. This makes the code more explicit, and it gives you a task you can use for other operations, such as waiting for it to finish.
public async Task PostData<T>(T obj, int id)
Do you even need to create and start async tasks in parallel and then wait for them? All this does is create the tasks in parallel, which is not a heavy operation, so why do it in parallel?
If you don't do any additional heavy work in the parallel loop apart from calling PostData, I think you don't need the parallel loop at all.

Related

Async Queue implementation .Wait() faster than await

To provide producer-consumer functionality that can queue and execute async methods one after the other, I'm trying to implement an async queue. I noticed major performance issues using it in a large application.
async Task Loop() {
while (Verify()) {
if (!_blockingCollection.TryTake(out var func, 1000, _token)) continue;
await func.Invoke();
}
}
Implementation of AsyncQueue.Add:
public void Add(Func<Task> func) {
_blockingCollection.Add(func);
}
Example usage from arbitrary thread:
controller.OnEvent += (o, a) => _queue.Add(async () => await handle(a));
Execution paths depend on the state of the application and include:
async network requests that internally use TaskCompletionSource to return result
IO operations
tasks that get added to a list and are awaited using Task.WhenAll(...)
an async void method that converts an array and awaits a network request
Symptoms:
The application slows down gradually.
When I replace await func.Invoke() with func.Invoke().Wait() instead of awaiting it properly, performance improves dramatically and it does not slow down.
Why is that? Is an async queue that uses BlockingCollection a bad idea?
What is a better alternative?
Why is that?
There isn't enough information in the question to provide an answer to this.
As others have noted, there's a CPU-consuming spin issue with the loop as it currently is.
In the meantime, I can at least answer this part:
Is an async Queue that uses BlockingCollection a bad idea?
Yes.
What is a better alternative?
Use an async-compatible queue. E.g., Channels, or BufferBlock/ActionBlock from TPL Dataflow.
Example using Channels:
async Task Loop() {
await foreach (var func in channelReader.ReadAllAsync()) {
await func.Invoke();
}
}
or if you're not on .NET Core yet:
async Task Loop() {
while (await channelReader.WaitToReadAsync()) {
while (channelReader.TryRead(out var func)) {
await func.Invoke();
}
}
}
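To show how the pieces fit together, here is a small self-contained sketch of a queue built on System.Threading.Channels. It is only an illustration under assumptions: the AsyncQueue shape mirrors the question's Add and Loop, while the Complete method and the ChannelDemo class are made up for the example.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class AsyncQueue
{
    private readonly Channel<Func<Task>> _channel = Channel.CreateUnbounded<Func<Task>>();

    // Safe to call from any thread; writing to an unbounded channel does not block.
    public void Add(Func<Task> func) => _channel.Writer.TryWrite(func);

    // Hypothetical helper: signals that no more work will be added, so Loop can end.
    public void Complete() => _channel.Writer.Complete();

    // Runs the queued delegates one after the other, without polling or timeouts.
    public async Task Loop()
    {
        await foreach (var func in _channel.Reader.ReadAllAsync())
        {
            await func();
        }
    }
}

public static class ChannelDemo
{
    public static async Task Main()
    {
        var queue = new AsyncQueue();
        var loop = queue.Loop();
        for (var i = 0; i < 3; i++)
        {
            var n = i; // capture a copy for the lambda
            queue.Add(async () =>
            {
                await Task.Delay(10);
                Console.WriteLine($"handled {n}");
            });
        }
        queue.Complete();
        await loop; // completes once every queued delegate has run
    }
}
```

The consumer waits asynchronously while the channel is empty, so no thread is blocked and there is no timeout loop to spin on. Items are processed strictly in the order they were added.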

use Task.Run() inside of Select LINQ method

Suppose I have the following code (just for learning purposes):
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
var tasks = Enumerable.Range(0, 100).Select(async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
await Task.WhenAll(tasks);
Console.WriteLine($"Items in dictionary {results.Count}");
}
static async Task<int> DoAsyncJob(int i)
{
// simulate some I/O bound operation
await Task.Delay(100);
return i * 10;
}
I want to know what will be the difference if I make it as follows:
var tasks = Enumerable.Range(0, 100)
.Select(index => Task.Run(async () =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
}));
I get the same results in both cases. But does the code execute in the same way?
Task.Run is there to execute CPU-bound, synchronous operations on a thread pool thread. As it is, the operation you're running is already asynchronous, so using Task.Run means you're scheduling work to run on a thread pool thread, and that work merely starts an asynchronous operation, which returns almost immediately and goes off to do whatever asynchronous work it has to do without blocking that thread pool thread. So by using Task.Run you're waiting for work to be scheduled on the thread pool, but that work doesn't actually do anything meaningful. You're better off just starting the asynchronous operation on the current thread.
The only exception would be if DoAsyncJob were implemented improperly and for some reason wasn't actually asynchronous, contrary to its name and signature, and actually did a lot of synchronous work before returning. But if it is doing that, you should just fix that buggy method, rather than using Task.Run to call it.
On a side note, there's no reason to have a ConcurrentDictionary to collect the results here. Task.WhenAll returns a collection of the results of all of the tasks you've executed. Just use that. Now you don't even need a method to wrap your asynchronous method and process the result in any special way, simplifying the code further:
var tasks = Enumerable.Range(0, 100).Select(DoAsyncJob);
var results = await Task.WhenAll(tasks);
Console.WriteLine($"Items in results {results.Length}");
Yes, both cases execute similarly. Actually, they execute in exactly the same way.

Async - Which of these is correct

Of the scenarios below, which one is the correct way of doing asynchronous programming in C#?
Scenario-1
public async Task<T1> AddSomethingAsync(param)
{
return await SomeOtherFunctionFromThirdPartyLibraryForIOAsync(param);
}
then
List<Task> Tasks=new List<Task>();
foreach(var task in FromAllTasks())
{
Tasks.Add(AddSomethingAsync(task));
}
await Task.WhenAll(Tasks.AsParallel());
Scenario-2
public async Task<T1> AddSomethingAsync(param)
{
return SomeOtherFunctionFromThirdPartyLibraryForIOAsync(param);
}
then
List<Task> Tasks=new List<Task>();
foreach(var task in FromAllTasks())
{
Tasks.Add(SomeOtherFunctionFromThirdPartyLibraryForIOAsync(task));
}
await Task.WhenAll(Tasks.AsParallel());
The only difference between the two is that the latter does not have the await keyword inside the AddSomethingAsync function.
Update: what I want to achieve is that all tasks are executed in parallel and asynchronously. My thinking is that in Scenario 1 the call will be awaited inside AddSomethingAsync, which will hurt the upper layer by blocking the next loop iteration from executing. Can someone confirm?
Scenario 3
public Task<T1> AddSomethingAsync(param)
{
return SomeOtherFunctionFromThirdPartyLibraryForIOAsync(param);
}
then
List<Task> Tasks=new List<Task>();
foreach(var task in FromAllTasks())
{
Tasks.Add(SomeOtherFunctionFromThirdPartyLibraryForIOAsync(task));
}
await Task.WhenAll(Tasks);
If you are not awaiting anything - you don't need async keyword. Doing AsParallel will do nothing in this case too.
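A runnable sketch of Scenario 3 may make this concrete. FakeIoAsync is a hypothetical stand-in for SomeOtherFunctionFromThirdPartyLibraryForIOAsync, and the int parameter and result types are assumptions chosen only so the example compiles.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Scenario3Demo
{
    // Hypothetical stand-in for the third-party I/O method.
    private static async Task<int> FakeIoAsync(int value)
    {
        await Task.Delay(200);
        return value * 2;
    }

    // No async/await needed: the method only forwards the task it receives.
    public static Task<int> AddSomethingAsync(int value) => FakeIoAsync(value);

    public static async Task<int[]> RunAllAsync(IEnumerable<int> inputs)
    {
        var tasks = new List<Task<int>>();
        foreach (var input in inputs)
        {
            // Each call starts its operation right away; nothing is awaited inside the loop.
            tasks.Add(AddSomethingAsync(input));
        }
        // Results come back in the same order as the tasks in the list.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var watch = Stopwatch.StartNew();
        var results = await RunAllAsync(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(", ", results)); // 2, 4, 6
        Console.WriteLine($"Elapsed: {watch.ElapsedMilliseconds} ms");
    }
}
```

The elapsed time should be close to one delay rather than three, because all operations are in flight before the first await. The same holds for Scenario 1: an await inside AddSomethingAsync returns control to the caller at that point, so it does not stop the loop from starting the next call.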
In my opinion they are the same. await is a marker: it means this line executes on the same thread as the method and waits there.
Actually, async and await are designed for methods that return void, so if the method doesn't return a result it can be put on a thread of its own. Any async methods it calls will use the same thread as the void method, and if those sub-methods need a result to be awaited, the wait happens inside this thread.
And putting them in a List<Task> makes no difference.

How to convert WaitAll() to async and await

Is it possible to rewrite the code below using the async await syntax?
private void PullTablePages()
{
var taskList = new List<Task>();
var faultedList = new List<Task>();
foreach (var featureItem in featuresWithDataName)
{
var task = Task.Run(() => PullTablePage());
taskList.Add(task);
if(taskList.Count == Constants.THREADS)
{
var index = Task.WaitAny(taskList.ToArray());
taskList.Remove(taskList[index]);
}
}
if (taskList.Any())
{
Task.WaitAll(taskList.ToArray());
}
//todo: do something with faulted list
}
When I rewrote it as below, the code doesn't block and the console application finishes before most of the threads complete.
It seems like the await syntax doesn't block as I expected.
private async void PullTablePagesAsync()
{
var taskList = new List<Task>();
var faultedList = new List<Task>();
foreach (var featureItem in featuresWithDataName)
{
var task = Task.Run(() => PullTablePage());
taskList.Add(task);
if(taskList.Count == Constants.THREADS)
{
var anyFinished = await Task.WhenAny(taskList.ToArray());
await anyFinished;
for (var index = taskList.Count - 1; index >= 0; index--)
{
if (taskList[index].IsCompleted)
{
taskList.Remove(taskList[index]);
}
}
}
}
if (taskList.Any())
{
await Task.WhenAll(taskList);
}
//todo: what to do with faulted list?
}
Is it possible to do so?
Task.WhenAll doesn't seem to wait for all tasks to complete. How do I get it to do so? The return type says that it returns a task, but I can't seem to figure out the syntax.
New to multithreading, please excuse my ignorance.
Is it possible to rewrite the code below using the async await syntax?
Yes and no. Yes, because you already did it. No, because although it is equivalent from the standpoint of the function's implementation, it is not equivalent from the caller's perspective (as you already experienced with your console application). Unlike many other C# keywords, await is not just syntactic sugar. There is a reason why the compiler forces you to mark your function with async in order to enable the await construct: your function is no longer blocking, and that puts additional responsibility on the callers, which must either become non-blocking (async) themselves or make a blocking call to your function. Every async method should in fact return Task or Task<TResult>, so that the compiler can warn callers that ignore that fact. The only exception is async void, which is supposed to be used only for event handlers, which by nature should not care what the object being notified is doing.
In short, async/await is not for rewriting synchronous (blocking) code, but for easily turning it into asynchronous (non-blocking) code. If your function is supposed to be synchronous, then just keep it the way it is (your original implementation does that perfectly well). But if you need an asynchronous version, then you should change the signature to
private async Task PullTablePagesAsync()
with the body rewritten to use await (which you already did correctly). For backward compatibility, provide the old synchronous version on top of the new implementation, like this:
private void PullTablePages() { PullTablePagesAsync().Wait(); }
It seems like the await syntax doesn't block as I expected.
You're expecting the wrong thing.
The await syntax should never block; it only means that the rest of the method does not continue until the task is finished.
Usually you use async/await in methods that return a task. In your case you're using it in a method whose return type is void.
It takes time to get your head around async void methods, which is why their use is usually discouraged. Async void methods run synchronously (blocking the calling method) until the first task that has not yet completed is awaited. What happens after that depends on the execution context (usually you're running on the thread pool). What's important: the calling method (the one that calls PullTablePagesAsync) does not know about the continuations and can't know when all the code in PullTablePagesAsync is done.
Maybe take a look at the async/await best practices on MSDN.
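A minimal sketch of the difference for a console application follows. The method body is an assumption (Task.Delay stands in for the real work), and the class name AwaitableDemo is made up for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitableDemo
{
    // Returning Task (not void) gives the caller something to await.
    public static async Task PullTablePagesAsync()
    {
        await Task.Delay(500); // hypothetical stand-in for the real work
        Console.WriteLine("All pages pulled");
    }

    // An async Main (C# 7.1 or later) keeps the process alive until the awaited task completes.
    public static async Task Main()
    {
        await PullTablePagesAsync();
        Console.WriteLine("Done");
    }
}
```

With this shape the program prints "All pages pulled" and then "Done". If PullTablePagesAsync were declared async void instead, Main would have nothing to await and could return while the work was still in progress, which matches the behaviour described in the question.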

async all the way down issue

I have an async asp.net controller. This controller calls an async method. The method that actually performs the async IO work is deep down in my application. The series of methods between the controller and the last method in the chain are all marked with the async modifier. Here is an example of how I have the code setup:
public async Task<ActionResult> Index(int[] ids)
{
List<int> listOfDataPoints = dataPointService(ids);
List<Task> dpTaskList = new List<Task>();
foreach (var x in listOfDataPoints)
{
dpTaskList.Add(C_Async(x));
}
await Task.WhenAll(dpTaskList);
return View();
}
private async Task C_Async(int id)
{
//this method executes very fast
var idTemp = paddID(id);
await D_Async(idTemp);
}
private async Task D_Async(string id)
{
//this method executes very fast
await E_Async(id);
}
private async Task E_Async(string url)
{
//this method performs the actual async IO
var result = await new WebClient().DownloadStringTaskAsync(new Uri(url));
saveContent(result);
}
As you can see, the controller calls C_Async(x) asynchronously, then there is a chain of async methods down to E_Async. There are methods between the controller and E_Async, and all of them have the async modifier. Is there a performance penalty, since there are methods using the async modifier but not doing any async IO work?
Note: This is a simplified version of the real code; there are more async methods between the controller and the E_Async method.
Yes. There is a penalty (though not a huge one), and if you don't need to be async don't be. This pattern is often called "return await" where you can almost always remove both the async and the await. Simply return the task you already have that represents the asynchronous operations:
private Task C_Async(int id)
{
// This method executes very fast
var idTemp = paddID(id);
return D_Async(idTemp);
}
private Task D_Async(string id)
{
// This method executes very fast
return E_Async(id);
}
In this specific case Index will only await the tasks that E_Async returns. That means that after all the I/O is done the next line of code will directly be return View();. C_Async and D_Async already ran and finished in the synchronous call.
You must be careful about the thread message pumps and what async really does. The sample below calls into an async method which calls two other async methods which start two tasks to do the actual work which wait 2 and 3 seconds.
13.00 6520 .ctor Calling async method
13.00 6520 RunSomethingAsync Before
13.00 6520 GetSlowString Before
13.00 5628 OtherTask Sleeping for 2s
15.00 5628 OtherTask Sleeping done
15.00 6520 GetVerySlow Inside
15.00 2176 GetVerySlow Sleeping 3s
18.00 2176 GetVerySlow Sleeping Done
18.00 6520 RunSomethingAsync After GetSlowOtherTaskResultGetVerySlowReturn
As you can see, the calls are serialized, which might not be what you want when you are after performance. Perhaps the two distinct await calls do not depend on each other and can be started directly as tasks.
All methods up to the "Before" line of GetSlowString are called on the UI or ASP.NET thread that started the async operation (if it has a message pump). Only the last call, with the result of the operation, is marshalled back to the initiating thread.
The performance penalty is somewhere in the region of a context switch to wake up an already existing thread. This should be at the microsecond level. The most expensive part would be the creation of the managed objects and the garbage collector cleaning up the temporary objects. If you call this in a tight loop you will be GC bound, because there is an upper limit on how many threads can be created. In that case the TPL will buffer your tasks in queues, which requires memory allocations, and then drain the queues with n worker threads from the thread pool.
On my Core i7 I get an overhead of 2 microseconds for each call (comment out the Debug.Print line) and a memory consumption of 6.5 GB for 5 million calls in a WPF application, which gives a memory overhead of about 1.3 KB per asynchronous operation chain. If you are after high scalability you need to keep an eye on your GC. Until Joe Duffy has finished his new language, we have to use the CLR we currently have.
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
Print("Calling async method");
RunSomethingAsync();
}
private async void RunSomethingAsync()
{
Print("Before");
string msg = await GetSlowString();
Print("After " + msg);
cLabel.Content = msg;
}
void Print(string message, [CallerMemberName] string method = "")
{
Debug.Print("{0:N2} {1} {2} {3}", DateTime.Now.Second, AppDomain.GetCurrentThreadId(), method, message);
}
private async Task<string> GetSlowString()
{
Print("Before");
string otherResult = await OtherTask();
return "GetSlow" + otherResult + await GetVerySlow();
}
private Task<string> OtherTask()
{
return Task.Run(() =>
{
Print("Sleeping for 2s");
Thread.Sleep(2 * 1000);
Print("Sleeping done");
return "OtherTaskResult";
});
}
private Task<string> GetVerySlow()
{
Print("Inside");
return Task.Run(() =>
{
Print("Sleeping 3s");
Thread.Sleep(3000);
Print("Sleeping Done");
return "GetVerySlowReturn";
});
}
}
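If the two awaited operations in GetSlowString really are independent, one way to avoid the serialization is to start both before awaiting either. The sketch below is an illustration only: it replaces the WPF window with a console program, and OtherTask and GetVerySlow are stand-ins that use Task.Delay instead of Thread.Sleep.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ConcurrentAwaitDemo
{
    // Stand-in for OtherTask: simulates 2 seconds of work.
    private static async Task<string> OtherTask()
    {
        await Task.Delay(2000);
        return "OtherTaskResult";
    }

    // Stand-in for GetVerySlow: simulates 3 seconds of work.
    private static async Task<string> GetVerySlow()
    {
        await Task.Delay(3000);
        return "GetVerySlowReturn";
    }

    public static async Task<string> GetSlowString()
    {
        // Start both operations before awaiting either one.
        Task<string> other = OtherTask();
        Task<string> verySlow = GetVerySlow();
        await Task.WhenAll(other, verySlow);
        // Both tasks are complete here, so these awaits return immediately.
        return "GetSlow" + await other + await verySlow;
    }

    public static async Task Main()
    {
        var watch = Stopwatch.StartNew();
        Console.WriteLine(await GetSlowString());
        Console.WriteLine($"Elapsed: {watch.Elapsed.TotalSeconds:F1} s");
    }
}
```

The result string is the same as in the serialized version, but the total time should be roughly that of the slower operation (about 3 seconds) instead of the sum of both (about 5 seconds).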
