Stopwatch with an async action which awaits calls from a database - c#

How can I properly use the Stopwatch class with an async action that awaits asynchronous calls to my database inside an async Post method on .NET Core?
Why
To time my code and check for bottlenecks. This is a simplified test method that will contain more code as time goes on.
Errors
I tried using an Action event, a Task event and a Func<Task> event without success. The errors always occur when I am awaiting a call to my database asynchronously using EF Core.
When I use Action event
An unhandled exception of type 'System.ObjectDisposedException' occurred in System.Private.CoreLib.dll. Cannot access a disposed context instance.
When I use Func<Task>
System.Threading.Tasks.TaskCanceledException: A task was canceled.
It doesn't print anything when I use Task event and the rest of the code executes without errors
Code
Post Method
public async Task<JsonResult> OnPostTest() {
// my Database Context
using DatabaseContext dc = _dbContext;
// list of json data that will be returned back to the client
List<object>? listJsonData = null;
// stopwatch initialization
Stopwatch sw = new();
sw.LogActionAsync(nameof(OnPostTest), (async () => { // new Task(async () => { // new Func<Task>(async() => {
// get list of data from TableTest database with a valid name and are not marked as delete
List<TableTest> listValidTableTest = await dc.ListTest.AsNoTracking().Where(t => !string.IsNullOrWhiteSpace(t.strName) && !t.blnDelete).ToListAsync(); //<-- returns a list asynchronously and where the error occurs
// initialize list that will be returned
listJsonData = new();
foreach (TableTest t in listValidTableTest) {
// object that will be in the list of returned json objects
var returnData = new {
t.strName,
t.arrPrices,
t.strStartDate,
t.strEndDate
};
listJsonData.Add(returnData);
}
}));
return new JsonResult(new {
// return list or an empty array if list has not been initialized
arrJsonData = listJsonData?.ToArray() ?? Array.Empty<object>(),
blnGetStatus = bool.TrueString
});
}
Stopwatch Extension Class
public static async void LogActionAsync(this Stopwatch sw, string? strMethodName, Action asyncAction, int intNbOfIterations = 1) {
sw.Reset();
sw.Start();
List<Task> listOfTasks = new();
for (int i = 0; i < intNbOfIterations; i++) {
listOfTasks.Add(Task.Factory.StartNew(asyncAction)); // for Action event
//listOfTask.Add(asyncTask); // for Task event
}
await Task.WhenAll(listOfTasks);
//await asyncFuncTask; // for Func<Task> event
sw.Stop();
// log duration to a file using Serilog
Log.Debug($"{strMethodName} Action Duration: '{sw.Elapsed.Duration()}'");
}
EDIT:
I changed my stopwatch extension method to async Task LogActionAsync... and my stopwatch object to await sw.LogActionAsync... but now nothing is being logged. Any ideas?

There are a lot of bugs in this code. To summarize:
async void in two places.
Missing awaits.
Using a single database context concurrently.
Adding to a list of results concurrently.
So, let's fix these one by one.
async void in two places.
Missing awaits.
As another answer noted, LogActionAsync should not be async void but should be async Task and awaited.
I changed my stopwatch extension method to async Task LogActionAsync... and my stopwatch object to await sw.LogActionAsync...
You're still missing one more async void. It's a tricky one: async lambdas, when assigned to Action variables, become async void. The proper delegate type for an asynchronous method without a return value is Func<Task>, not Action.
Code:
public static async Task LogActionAsync(this Stopwatch sw, string? strMethodName, Func<Task> asyncAction, int intNbOfIterations = 1) {
sw.Reset();
sw.Start();
List<Task> listOfTasks = new();
for (int i = 0; i < intNbOfIterations; i++) {
listOfTasks.Add(asyncAction());
}
await Task.WhenAll(listOfTasks);
sw.Stop();
// log duration to a file using Serilog
Log.Debug($"{strMethodName} Action Duration: '{sw.Elapsed.Duration()}'");
}
And now you can properly use await everywhere.
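To see why the delegate type matters, here is a minimal sketch (the class and method names are made up for illustration and are not part of the question's code). The same async lambda is stored once as an Action and once as a Func<Task>; only the second form hands the caller a Task to await.

```csharp
using System;
using System.Threading.Tasks;

public static class DelegateTypeDemo
{
    // Returns whether the lambda's work had finished when the call returned.
    public static bool RunAsAction()
    {
        bool finished = false;
        // Stored as Action, the async lambda compiles to async void:
        // no Task is produced, so the caller has nothing to await.
        Action fireAndForget = async () =>
        {
            await Task.Delay(200);
            finished = true;
        };
        fireAndForget();   // returns as soon as the lambda hits its first await
        return finished;   // false: the delay has not elapsed yet
    }

    public static async Task<bool> RunAsFuncAsync()
    {
        bool finished = false;
        // Stored as Func<Task>, the same lambda returns a Task.
        Func<Task> awaitable = async () =>
        {
            await Task.Delay(200);
            finished = true;
        };
        await awaitable(); // waits for the whole lambda to complete
        return finished;   // true
    }

    public static async Task Main()
    {
        Console.WriteLine($"Action finished before return: {RunAsAction()}");
        Console.WriteLine($"Func<Task> finished before return: {await RunAsFuncAsync()}");
    }
}
```

This is the same effect that produced the ObjectDisposedException in the question: with Action, the page handler carried on (and disposed its context) while the lambda was still running.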
Using a single database context concurrently.
Adding to a list of results concurrently.
As another answer noted, you will need one database context per action lambda. This is a limitation of Entity Framework (in turn imposed by a limitation of most SQL on-the-wire protocols).
The List<T>.Add method is also not threadsafe, and the code is potentially invoking it from multiple threads concurrently. It's possible to use a concurrent collection, but it's easier and cleaner to return result data instead of modifying a shared collection as a side effect.
But, really, I suspect that the concurrency in the posted code is an accident. It seems very odd to run N "iterations" of something concurrently when doing timing; I believe the desired semantics are to run N iterations of something serially.
If my assumption is correct, then the code should look like this:
public static async Task LogActionAsync(this Stopwatch sw, string? strMethodName, Func<Task> asyncAction, int intNbOfIterations = 1) {
sw.Reset();
sw.Start();
for (int i = 0; i < intNbOfIterations; i++) {
await asyncAction();
}
sw.Stop();
// log duration to a file using Serilog
Log.Debug($"{strMethodName} Action Duration: '{sw.Elapsed.Duration()}'");
}
public static async Task<T> LogActionAsync<T>(this Stopwatch sw, string? strMethodName, Func<Task<T>> asyncFunc, int intNbOfIterations = 1) {
sw.Reset();
sw.Start();
T result = default;
for (int i = 0; i < intNbOfIterations; i++) {
result = await asyncFunc();
}
sw.Stop();
// log duration to a file using Serilog
Log.Debug($"{strMethodName} Action Duration: '{sw.Elapsed.Duration()}'");
return result;
}
public async Task<JsonResult> OnPostTest() {
// my Database Context
using DatabaseContext dc = _dbContext;
// list of json data that will be returned back to the client
List<object>? listJsonData = null;
// stopwatch initialization
Stopwatch sw = new();
listJsonData = await sw.LogActionAsync(nameof(OnPostTest), (async () => {
// get list of data from TableTest database with a valid name and are not marked as delete
List<TableTest> listValidTableTest = await dc.ListTest.AsNoTracking().Where(t => !string.IsNullOrWhiteSpace(t.strName) && !t.blnDelete).ToListAsync();
// initialize list that will be returned
var jsonData = new List<object>();
foreach (TableTest t in listValidTableTest) {
// object that will be in the list of returned json objects
var returnData = new {
t.strName,
t.arrPrices,
t.strStartDate,
t.strEndDate
};
jsonData.Add(returnData);
}
return jsonData;
}));
return new JsonResult(new {
// return list or an empty array if list has not been initialized
arrJsonData = listJsonData?.ToArray() ?? Array.Empty<object>(),
blnGetStatus = bool.TrueString
});
}
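If concurrent iterations really were intended, the "return result data instead of modifying a shared collection" advice could look roughly like the sketch below. LoadNamesAsync is a hypothetical stand-in for the database query; with EF Core each concurrent invocation would also need its own context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentResultsDemo
{
    // Hypothetical stand-in for a query; each call returns its own list.
    private static async Task<List<string>> LoadNamesAsync(int iteration)
    {
        await Task.Delay(50); // simulated I/O
        return new List<string> { $"row-{iteration}-a", $"row-{iteration}-b" };
    }

    public static async Task<List<string>> RunConcurrentlyAsync(int iterations)
    {
        // Start every iteration; nothing writes to a shared list.
        Task<List<string>>[] tasks = Enumerable.Range(0, iterations)
            .Select(i => LoadNamesAsync(i))
            .ToArray();

        // WhenAll returns the results in the same order as the tasks.
        List<string>[] perIteration = await Task.WhenAll(tasks);
        return perIteration.SelectMany(list => list).ToList();
    }

    public static async Task Main()
    {
        List<string> all = await RunConcurrentlyAsync(3);
        Console.WriteLine(string.Join(", ", all));
    }
}
```

The results are merged only after every task has completed, so no locking or concurrent collection is needed.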

You're not awaiting your call to LogActionAsync, so your call happens after your page action is over, which is why you get all those disposed exceptions. Your entire page and all its DI objects and database contexts and everything have long been disposed by that point.
async void should be considered a debugging tool: it exposes, right away, any async mistake an inexperienced developer makes!

The problem in your code has nothing to do with Stopwatch and everything to do with Entity Framework.
An Entity Framework DbContext is not safe for concurrent use.
You need to move the creation and disposal of the DbContext inside the Task.
Additionally, you should not be using Task.Factory.StartNew due to weird exception handling. And in this case, you should not use Task.Run nor Task.Factory.StartNew because you do not need a thread for concurrency.
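A rough sketch of "one context per operation", without Task.Run or Task.Factory.StartNew, follows. FakeContext is a hypothetical stand-in so the example runs without a database; in EF Core the factory role is typically played by IDbContextFactory<TContext>.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for an EF Core DbContext (not a real EF type).
public sealed class FakeContext : IDisposable
{
    private bool _disposed;

    public async Task<int> CountRowsAsync()
    {
        await Task.Delay(20); // simulated I/O
        if (_disposed) throw new ObjectDisposedException(nameof(FakeContext));
        return 42;
    }

    public void Dispose() => _disposed = true;
}

public static class ContextPerOperationDemo
{
    // Each operation creates, uses and disposes its own context.
    public static async Task<int> QueryAsync(Func<FakeContext> createContext)
    {
        using FakeContext context = createContext();
        // "return await" keeps the context alive until the query completes.
        return await context.CountRowsAsync();
    }

    public static async Task Main()
    {
        // Three concurrent operations, three separate contexts, no extra threads.
        Task<int>[] queries = Enumerable.Range(0, 3)
            .Select(_ => QueryAsync(() => new FakeContext()))
            .ToArray();
        int[] counts = await Task.WhenAll(queries);
        Console.WriteLine(string.Join(", ", counts));
    }
}
```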


How to limit number of async IO tasks to database?

I have a list of id's and I want to get data for each of those id in parallel from database. My below ExecuteAsync method is called at very high throughput and for each request we have around 500 ids for which I need to extract data.
So I have got below code where I am looping around list of ids and making async calls for each of those id in parallel and it works fine.
private async Task<List<T>> ExecuteAsync<T>(IList<int> ids, IPollyPolicy policy,
Func<CancellationToken, int, Task<T>> mapper) where T : class
{
var tasks = new List<Task<T>>(ids.Count);
// invoking multiple id in parallel to get data for each id from database
for (int i = 0; i < ids.Count; i++)
{
tasks.Add(Execute(policy, ct => mapper(ct, ids[i])));
}
// wait for all id response to come back
var responses = await Task.WhenAll(tasks);
var excludeNull = new List<T>(ids.Count);
for (int i = 0; i < responses.Length; i++)
{
var response = responses[i];
if (response != null)
{
excludeNull.Add(response);
}
}
return excludeNull;
}
private async Task<T> Execute<T>(IPollyPolicy policy,
Func<CancellationToken, Task<T>> requestExecuter) where T : class
{
var response = await policy.Policy.ExecuteAndCaptureAsync(
ct => requestExecuter(ct), CancellationToken.None);
if (response.Outcome == OutcomeType.Failure)
{
if (response.FinalException != null)
{
// log error
throw response.FinalException;
}
}
return response?.Result;
}
Question:
Now as you can see I am looping all ids and making bunch of async calls to database in parallel for each id which can put lot of load on database (depending on how many request is coming). So I want to limit the number of async calls we are making to database. I modified ExecuteAsync to use Semaphore as shown below but it doesn't look like it does what I want it to do:
private async Task<List<T>> ExecuteAsync<T>(IList<int> ids, IPollyPolicy policy,
Func<CancellationToken, int, Task<T>> mapper) where T : class
{
var throttler = new SemaphoreSlim(250);
var tasks = new List<Task<T>>(ids.Count);
// invoking multiple id in parallel to get data for each id from database
for (int i = 0; i < ids.Count; i++)
{
await throttler.WaitAsync().ConfigureAwait(false);
try
{
tasks.Add(Execute(policy, ct => mapper(ct, ids[i])));
}
finally
{
throttler.Release();
}
}
// wait for all id response to come back
var responses = await Task.WhenAll(tasks);
// same excludeNull code check here
return excludeNull;
}
Does Semaphore work on threads or tasks? From what I read here, it looks like Semaphore is for threads and SemaphoreSlim is for tasks.
Is this correct? If yes, what's the best way to fix this and limit the number of async I/O tasks we make to the database here?
Task is an abstraction on threads, and doesn’t necessarily create a new thread. Semaphore limits the number of threads that can access that for loop. Execute returns a Task which aren’t threads. If there’s only 1 request, there will be only 1 thread inside that for loop, even if it is asking for 500 ids. The 1 thread sends off all the async IO tasks itself.
Sort of. I would not say that tasks are related to threads at all. There are actually two kinds of tasks: a delegate task (which is kind of an abstraction of a thread), and a promise task (which has nothing to do with threads).
Regarding the SemaphoreSlim, it does limit the concurrency of a block of code (not threads).
I recently started playing with C# so my understanding is not right looks like w.r.t Threads and Tasks.
I recommend reading my async intro and best practices. Follow up with There Is No Thread if you're interested more about how threads aren't really involved.
I modified ExecuteAsync to use Semaphore as shown below but it doesn't look like it does what I want it to do
The current code is only throttling the adding of the tasks to the list, which is only done one at a time anyway. What you want to do is throttle the execution itself:
private async Task<List<T>> ExecuteAsync<T>(IList<int> ids, IPollyPolicy policy, Func<CancellationToken, int, Task<T>> mapper) where T : class
{
var throttler = new SemaphoreSlim(250);
var tasks = new List<Task<T>>(ids.Count);
// invoking multiple id in parallel to get data for each id from database
for (int i = 0; i < ids.Count; i++)
tasks.Add(ThrottledExecute(ids[i]));
// wait for all id response to come back
var responses = await Task.WhenAll(tasks);
// same excludeNull code check here
return excludeNull;
async Task<T> ThrottledExecute(int id)
{
await throttler.WaitAsync().ConfigureAwait(false);
try {
return await Execute(policy, ct => mapper(ct, id)).ConfigureAwait(false);
} finally {
throttler.Release();
}
}
}
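The effect of throttling the execution itself can be checked with a small self-contained sketch. The simulated work (Task.Delay) and the names below are assumptions for illustration; the method reports the highest number of operations that were ever in flight at once.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs `count` simulated I/O operations with at most `limit` in flight
    // and returns the highest concurrency that was actually observed.
    public static async Task<int> RunAsync(int count, int limit)
    {
        using var throttler = new SemaphoreSlim(limit);
        object gate = new object();
        int inFlight = 0, peak = 0;

        async Task WorkAsync()
        {
            await throttler.WaitAsync(); // asynchronous wait, no blocked thread
            try
            {
                lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
                await Task.Delay(20);    // simulated database call
            }
            finally
            {
                lock (gate) { inFlight--; }
                throttler.Release();
            }
        }

        // All tasks are created immediately; the semaphore gates their bodies.
        await Task.WhenAll(Enumerable.Range(0, count).Select(_ => WorkAsync()));
        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(count: 20, limit: 3);
        Console.WriteLine($"Peak concurrency: {peak}");
    }
}
```

Because the semaphore is held for the duration of the awaited work, the observed peak cannot exceed the limit.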
Your colleague probably has in mind the Semaphore class, which is indeed a thread-centric throttler, with no asynchronous capabilities.
Limits the number of threads that can access a resource or pool of resources concurrently.
The SemaphoreSlim class is a lightweight alternative to Semaphore, which includes the asynchronous method WaitAsync, that makes all the difference in the world. The WaitAsync doesn't block a thread, it blocks an asynchronous workflow. Asynchronous workflows are cheap (usually less than 1000 bytes each). You can have millions of them "running" concurrently at any given moment. This is not the case with threads, because of the 1 MB of memory that each thread reserves for its stack.
As for the ExecuteAsync method, here is how you could refactor it by using the LINQ methods Select, Where, ToArray and ToList:
Update: The Polly library supports capturing and continuing on the current synchronization context, so I added a bool executeOnCurrentContext argument to the API. I also renamed the asynchronous Execute method to ExecuteAsync, to be in line with the guidelines.
private async Task<List<T>> ExecuteAsync<T>(IList<int> ids, IPollyPolicy policy,
Func<CancellationToken, int, Task<T>> mapper,
int concurrencyLevel = 1, bool executeOnCurrentContext = false) where T : class
{
var throttler = new SemaphoreSlim(concurrencyLevel);
Task<T>[] tasks = ids.Select(async id =>
{
await throttler.WaitAsync().ConfigureAwait(executeOnCurrentContext);
try
{
return await ExecuteAsync(policy, ct => mapper(ct, id),
executeOnCurrentContext).ConfigureAwait(false);
}
finally
{
throttler.Release();
}
}).ToArray();
T[] results = await Task.WhenAll(tasks).ConfigureAwait(false);
return results.Where(r => r != null).ToList();
}
private async Task<T> ExecuteAsync<T>(IPollyPolicy policy,
Func<CancellationToken, Task<T>> function,
bool executeOnCurrentContext = false) where T : class
{
var response = await policy.Policy.ExecuteAndCaptureAsync(
ct => executeOnCurrentContext ? function(ct) : Task.Run(() => function(ct)),
CancellationToken.None, continueOnCapturedContext: executeOnCurrentContext)
.ConfigureAwait(executeOnCurrentContext);
if (response.Outcome == OutcomeType.Failure)
{
if (response.FinalException != null)
{
ExceptionDispatchInfo.Throw(response.FinalException);
}
}
return response?.Result;
}
You are throttling the rate at which you add tasks to the list. You are not throttling the rate at which tasks are executed. To do that, you'd probably have to implement your semaphore calls inside the Execute method itself.
If you can't modify Execute, another way to do it is to poll for completed tasks, sort of like this:
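for (int i = 0; i < ids.Count; i++)
{
// re-count on every pass; a count taken once before the loop would never change
while (tasks.Count( t => !t.IsCompleted ) >= 500) await Task.Yield();
int index = i; // copy so the lambda does not capture the loop variable
tasks.Add(Execute(policy, ct => mapper(ct, ids[index])));
}
await Task.WhenAll( tasks );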
Actually, the TPL is capable of controlling task execution and limiting concurrency. You can test how many parallel tasks are suitable for your use case. No need to think about threads; the TPL will manage everything for you.
To use limited concurrency see this answer, credits to #panagiotis-kanavos
.Net TPL: Limited Concurrency Level Task scheduler with task priority?
The example code is (even using different priorities, you can strip that):
QueuedTaskScheduler qts = new QueuedTaskScheduler(TaskScheduler.Default,4);
TaskScheduler pri0 = qts.ActivateNewQueue(priority: 0);
TaskScheduler pri1 = qts.ActivateNewQueue(priority: 1);
Task.Factory.StartNew(()=>{ },
CancellationToken.None,
TaskCreationOptions.None,
pri0);
Just throw all your tasks to the queue and with Task.WhenAll you can wait till everything is done.

Using cancellation token properly in c#

I was recently exposed to C# language and was working on getting data out of cassandra so I was working with below code which gets data from Cassandra and it works fine.
Only problem I have is in my ProcessCassQuery method - I am passing CancellationToken.None to my requestExecuter Function which might not be the right thing to do. What should be the right way to handle that case and what should I do to handle it correctly?
/**
*
* Below method does multiple async calls on each table for their corresponding id's by limiting it down using Semaphore.
*
*/
private async Task<List<T>> ProcessCassQueries<T>(IList<int> ids, Func<CancellationToken, int, Task<T>> mapperFunc, string msg) where T : class
{
var tasks = ids.Select(async id =>
{
await semaphore.WaitAsync();
try
{
return await ProcessCassQuery(ct => mapperFunc(ct, id), msg);
}
finally
{
semaphore.Release();
}
});
return (await Task.WhenAll(tasks)).Where(e => e != null).ToList();
}
// this might not be good idea to do it. how can I improve below method?
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(CancellationToken.None);
}
As said in the official documentation, the cancellation token allows propagating a cancellation signal. This can be useful for example, to cancel long-running operations that for some reason do not make sense anymore or that are simply taking too long.
The CancellationTokenSource will allow you to get a custom token that you can pass to the requestExecuter. It will also provide the means for cancelling a running Task.
private CancellationTokenSource cts = new CancellationTokenSource();
// ...
private Task<T> ProcessCassQuery<T>(Func<CancellationToken, Task<T>> requestExecuter, string msg) where T : class
{
return requestExecuter(cts.Token);
}
Example
Let's take a look at a different minimal/dummy example so we can look at the inside of it.
Consider the following method, GetSomethingAsync, which yields an incrementing integer every second.
The call to token.ThrowIfCancellationRequested makes sure an OperationCanceledException is thrown if the process has already been cancelled by an outside action. Once the loop is running, Task.Delay(1000, token) observes the token and throws a TaskCanceledException (a subclass of OperationCanceledException). Other approaches can be taken, for example checking whether token.IsCancellationRequested is true and reacting to it.
private static async IAsyncEnumerable<int> GetSomethingAsync(CancellationToken token)
{
Console.WriteLine("starting to get something");
token.ThrowIfCancellationRequested();
for (var i = 0; i < 100; i++)
{
await Task.Delay(1000, token);
yield return i;
}
Console.WriteLine("finished getting something");
}
Now let's build the main method to call the above method.
public static async Task Main()
{
var cts = new CancellationTokenSource();
// cancel it after 3 seconds, just for demo purposes
cts.CancelAfter(3000);
// or: Task.Delay(3000).ContinueWith(_ => { cts.Cancel(); });
await foreach (var i in GetSomethingAsync(cts.Token))
{
Console.WriteLine(i);
}
}
If we run this, we will get an output that should look like:
starting to get something
0
1
Unhandled exception. System.Threading.Tasks.TaskCanceledException: A task was canceled.
Of course, this is just a dummy example, the cancellation could be triggered by a user action, or some event that happens, it does not have to be a timer.
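If an unhandled exception is not the desired outcome, the caller (or the operation itself) can catch OperationCanceledException and finish gracefully. A minimal sketch, with made-up names and timings:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Counts 100 ms steps until the token is cancelled, then returns
    // how many steps completed instead of letting the exception escape.
    public static async Task<int> CountUntilCancelledAsync(CancellationToken token)
    {
        int completed = 0;
        try
        {
            while (true)
            {
                await Task.Delay(100, token); // throws when the token is cancelled
                completed++;
            }
        }
        catch (OperationCanceledException)
        {
            // Also catches TaskCanceledException, which derives from it.
        }
        return completed;
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource();
        cts.CancelAfter(350); // demo timer; any event could call cts.Cancel()
        int steps = await CountUntilCancelledAsync(cts.Token);
        Console.WriteLine($"Cancelled after {steps} completed steps");
    }
}
```

Catching the base type OperationCanceledException covers both ThrowIfCancellationRequested and token-aware APIs such as Task.Delay.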

What is wrong with my Code (SendPingAsync)

I'm writing a C# ping application.
I started with a synchronous ping method, but I figured out that pinging several servers with one click takes more and more time.
So I decided to try the asynchronous method.
Can someone help me out?
public async Task<string> CustomPing(string ip, int amountOfPackets, int sizeOfPackets)
{
// timeout
int Timeout = 2000;
// PaketSize logic
string packet = "";
for (int j = 0; j < sizeOfPackets; j++)
{
packet += "b";
};
byte[] buffer = Encoding.ASCII.GetBytes(packet);
// time-var
long ms = 0;
// Main Method
using (Ping ping = new Ping())
for (int i = 0; i < amountOfPackets; i++)
{
PingReply reply = await ping.SendPingAsync(ip, Timeout, buffer);
ms += reply.RoundtripTime;
};
return (ms / amountOfPackets + " ms");
};
I defined a "Server"-Class (Ip or host, City, Country).
Then I create a "server"-List:
List<Server> ServerList = new List<Server>()
{
new Server("www.google.de", "Some City,", "Some Country")
};
Then I loop through this list and I try to call the method like this:
foreach (var server in ServerList)
ListBox.Items.Add("The average response time of your custom server is: " + server.CustomPing(server.IP, amountOfPackets, sizeOfPackets));
Unfortunately, this is much more complicated than the synchronous method, and at the point where my method should return the value, it returns
System.Threading.Tasks.Task`1[System.String]
Since you have an async method, it returns a Task when it is called like this:
Task<string> task = server.CustomPing(server.IP, amountOfPackets, sizeOfPackets);
When you add it directly to your ListBox while concatenating it with a string, the ToString method is used, which by default prints the full class name of the object. This should explain your output:
System.Threading.Tasks.Task`1[System.String]
The [System.String] part actually tells you the type of the task's result. This is what you want, and to get it you need to await the task, like this:
foreach (var server in ServerList)
ListBox.Items.Add("The average response time of your custom server is: " + await server.CustomPing(server.IP, amountOfPackets, sizeOfPackets));
1) this has to be done in another async method and
2) this will ruin all the parallelism that you are aiming for, because it waits for each method call to finish before starting the next.
What you can do is to start all tasks one after the other, collect the returning tasks and wait for all of them to finish. Preferably you would do this in an async method like a clickhandler:
private async void Button1_Click(object sender, EventArgs e)
{
Task<string> [] allTasks = ServerList.Select(server => server.CustomPing(server.IP, amountOfPackets, sizeOfPackets)).ToArray();
// WhenAll will wait for all tasks to finish and return the return values of each method call
string [] results = await Task.WhenAll(allTasks);
// now you can execute your loop and display the results:
foreach (var result in results)
{
ListBox.Items.Add(result);
}
}
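The difference between awaiting inside the loop and starting everything before awaiting can be measured with a small sketch. FakePingAsync is a hypothetical stand-in for CustomPing that simply waits 200 ms.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class PingTimingDemo
{
    // Hypothetical stand-in for CustomPing: pretends every host answers in 200 ms.
    private static async Task<string> FakePingAsync(string host)
    {
        await Task.Delay(200);
        return host + ": 200 ms";
    }

    // Awaiting inside the loop: the next ping starts only after the previous one ends.
    public static async Task<string[]> SequentialAsync(string[] hosts)
    {
        var results = new List<string>();
        foreach (string host in hosts)
            results.Add(await FakePingAsync(host));
        return results.ToArray();
    }

    // Start all pings first, then await them together.
    public static Task<string[]> ConcurrentAsync(string[] hosts) =>
        Task.WhenAll(hosts.Select(host => FakePingAsync(host)));

    public static async Task Main()
    {
        string[] hosts = { "a.example", "b.example", "c.example" };

        var sw = Stopwatch.StartNew();
        await SequentialAsync(hosts);
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        await ConcurrentAsync(hosts);
        Console.WriteLine($"Concurrent: {sw.ElapsedMilliseconds} ms");
    }
}
```

With three hosts the sequential version takes roughly three delays and the concurrent one roughly a single delay; both return results in the order of the input.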
The class System.Threading.Tasks.Task<TResult> is a helper class for multitasking. While it resides in the Threading namespace, it works for threadless multitasking just as well. Indeed, if you see a function return a Task, you can usually use it for any form of multitasking. Tasks are very agnostic in how they are used. You can even run one synchronously, if you do not mind the little extra overhead of a Task that does not do a lot.
Task helps with some of the most important rules/conventions of multitasking:
Do not accidentally swallow exceptions. Thread-based multitasking is notoriously good at doing just that.
Do not use the result after a cancellation.
It does that by throwing exceptions in your face (usually an AggregateException) if you try to access the Result property when convention says you should not.
It also has all those other useful properties for multitasking.
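How a faulted Task surfaces its exception can be sketched as follows (FailAsync is a made-up method for illustration): reading Result wraps the failure in an AggregateException, while await rethrows the original exception.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskExceptionDemo
{
    // Hypothetical operation that always fails asynchronously.
    private static async Task<int> FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }

    // Blocking on Result: the exception arrives wrapped in AggregateException.
    public static string ObserveViaResult()
    {
        try
        {
            return FailAsync().Result.ToString();
        }
        catch (AggregateException ex)
        {
            return "Result: " + ex.InnerException?.GetType().Name;
        }
    }

    // Awaiting: the original exception type is rethrown directly.
    public static async Task<string> ObserveViaAwaitAsync()
    {
        try
        {
            return (await FailAsync()).ToString();
        }
        catch (InvalidOperationException ex)
        {
            return "await: " + ex.GetType().Name;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(ObserveViaResult());
        Console.WriteLine(await ObserveViaAwaitAsync());
    }
}
```

Either way the exception is not silently lost, which is the point made above. Blocking on Result is shown only for comparison; in UI or classic ASP.NET code it can deadlock, so await is the safer default.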

How to properly execute a List of Tasks async in C#

I have a list of objects that I need to run a long running process on and I would like to kick them off asynchronously, then when they are all finished return them as a list to the calling method. I've been trying different methods that I have found, however it appears that the processes are still running synchronously in the order that they are in the list. So I am sure that I am missing something in the process of how to execute a list of tasks.
Here is my code:
public async Task<List<ShipmentOverview>> GetShipmentByStatus(ShipmentFilterModel filter)
{
if (string.IsNullOrEmpty(filter.Status))
{
throw new InvalidShipmentStatusException(filter.Status);
}
var lookups = GetLookups(false, Brownells.ConsolidatedShipping.Constants.ShipmentStatusType);
var lookup = lookups.SingleOrDefault(sd => sd.Name.ToLower() == filter.Status.ToLower());
if (lookup != null)
{
filter.StatusId = lookup.Id;
var shipments = Shipments.GetShipments(filter);
var tasks = shipments.Select(async model => await GetOverview(model)).ToList();
ShipmentOverview[] finishedTask = await Task.WhenAll(tasks);
return finishedTask.ToList();
}
else
{
throw new InvalidShipmentStatusException(filter.Status);
}
}
private async Task<ShipmentOverview> GetOverview(ShipmentModel model)
{
String version;
var user = AuthContext.GetUserSecurityModel(Identity.Token, out version) as UserSecurityModel;
var profile = AuthContext.GetProfileSecurityModel(user.Profiles.First());
var overview = new ShipmentOverview
{
Id = model.Id,
CanView = true,
CanClose = profile.HasFeatureAction("Shipments", "Close", "POST"),
CanClear = profile.HasFeatureAction("Shipments", "Clear", "POST"),
CanEdit = profile.HasFeatureAction("Shipments", "Get", "PUT"),
ShipmentNumber = model.ShipmentNumber.ToString(),
ShipmentName = model.Name,
};
var parcels = Shipments.GetParcelsInShipment(model.Id);
overview.NumberParcels = parcels.Count;
var orders = parcels.Select(s => WareHouseClient.GetOrderNumberFromParcelId(s.ParcelNumber)).ToList();
overview.NumberOrders = orders.Distinct().Count();
//check validations
var vals = Shipments.GetShipmentValidations(model.Id);
if (model.ValidationTypeId == Constants.OrderValidationType)
{
if (vals.Count > 0)
{
overview.NumberOrdersTotal = vals.Count();
overview.NumberParcelsTotal = vals.Sum(s => WareHouseClient.GetParcelsPerOrder(s.ValidateReference));
}
}
return overview;
}
It looks like you're using asynchronous methods while you really want threads.
An asynchronous method yields control back to its caller when it reaches an await on an operation that has not completed yet, and resumes once that operation has completed. You can see how it works here.
Basically, async/await is about not blocking the calling thread while waiting (for example, so that a UI stays responsive); it does not by itself run your code in parallel.
If you want to fire multiple processings in parallel, you will want to use threads, like such:
using System.Threading.Tasks;
public void MainMethod() {
// Parallel.ForEach will automagically run the "right" number of threads in parallel
Parallel.ForEach(shipments, shipment => ProcessShipment(shipment));
// do something when all shipments have been processed
}
public void ProcessShipment(Shipment shipment) { ... }
Marking the method as async doesn't auto-magically make it execute in parallel. Since you're not using await at all, it will in fact execute completely synchronously as if it wasn't async. You might have read somewhere that async makes functions execute asynchronously, but this simply isn't true - forget it. The only thing it does is build a state machine to handle task continuations for you when you use await and actually build all the code to manage those tasks and their error handling.
If your code is mostly I/O bound, use the asynchronous APIs with await to make sure the methods actually execute in parallel. If they are CPU bound, a Task.Run (or Parallel.ForEach) will work best.
Also, there's no point in doing .Select(async model => await GetOverview(model)). It's almost equivalent to .Select(model => GetOverview(model)). In any case, since the method actually doesn't return an asynchronous task, it will be executed while doing the Select, long before you get to the Task.WhenAll.
Given this, even the GetShipmentByStatus's async is pretty much useless - you only use await to await the Task.WhenAll, but since all the tasks are already completed by that point, it will simply complete synchronously.
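For the CPU-bound (or blocking) case, a minimal sketch of the Task.Run approach is shown below. BuildOverview is a hypothetical stand-in for a synchronous GetOverview; Thread.Sleep simulates the blocking work.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OffloadDemo
{
    // Hypothetical stand-in for a synchronous, blocking GetOverview.
    private static int BuildOverview(int id)
    {
        Thread.Sleep(200); // simulated blocking work
        return id * 10;
    }

    // Task.Run queues each call to the thread pool, so the calls can overlap.
    public static Task<int[]> BuildAllAsync(IEnumerable<int> ids) =>
        Task.WhenAll(ids.Select(id => Task.Run(() => BuildOverview(id))));

    public static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        int[] overviews = await BuildAllAsync(new[] { 1, 2, 3 });
        Console.WriteLine($"{string.Join(", ", overviews)} in {sw.ElapsedMilliseconds} ms");
    }
}
```

On a machine with spare thread pool threads the three calls overlap, so the total is close to one call's duration rather than three. For genuinely I/O-bound work, truly asynchronous APIs awaited directly are the better fit.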
If your tasks are CPU bound and not I/O bound, then here is the pattern I believe you're looking for:
static void Main(string[] args) {
Task firstStepTask = Task.Run(() => firstStep());
Task secondStepTask = Task.Run(() => secondStep());
//...
Task finalStepTask = Task.Factory.ContinueWhenAll(
new Task[] { firstStepTask, secondStepTask }, //more if more than two steps...
(previousTasks) => finalStep());
finalStepTask.Wait();
}

How do you use AsParallel with the async and await keywords?

I was looking at someone's sample code for async and noticed a few issues with the way it was implemented. Whilst looking at the code I wondered if it would be more efficient to loop through a list using AsParallel, rather than just looping through the list normally.
As far as I can tell there is very little difference in performance: both use up every processor, and both take around the same amount of time to complete.
This is the first way of doing it
var tasks= Client.GetClients().Select(async p => await p.Initialize());
And this is the second
var tasks = Client.GetClients().AsParallel().Select(async p => await p.Initialize());
Am I correct in assuming there is no difference between the two?
The full program can be found below
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication2
{
class Program
{
static void Main(string[] args)
{
RunCode1();
Console.WriteLine("Here");
Console.ReadLine();
RunCode2();
Console.WriteLine("Here");
Console.ReadLine();
}
private async static void RunCode1()
{
Stopwatch myStopWatch = new Stopwatch();
myStopWatch.Start();
var tasks= Client.GetClients().Select(async p => await p.Initialize());
Task.WaitAll(tasks.ToArray());
Console.WriteLine("Time ellapsed(ms): " + myStopWatch.ElapsedMilliseconds);
myStopWatch.Stop();
}
private async static void RunCode2()
{
Stopwatch myStopWatch = new Stopwatch();
myStopWatch.Start();
var tasks = Client.GetClients().AsParallel().Select(async p => await p.Initialize());
Task.WaitAll(tasks.ToArray());
Console.WriteLine("Time ellapsed(ms): " + myStopWatch.ElapsedMilliseconds);
myStopWatch.Stop();
}
}
class Client
{
public static IEnumerable<Client> GetClients()
{
for (int i = 0; i < 100; i++)
{
yield return new Client() { Id = Guid.NewGuid() };
}
}
public Guid Id { get; set; }
//This method has to be called before you use a client
//For the sample, I don't put it on the constructor
public async Task Initialize()
{
await Task.Factory.StartNew(() =>
{
Stopwatch timer = new Stopwatch();
timer.Start();
while(timer.ElapsedMilliseconds<1000)
{}
timer.Stop();
});
Console.WriteLine("Completed: " + Id);
}
}
}
There should be very little discernible difference.
In your first case:
var tasks = Client.GetClients().Select(async p => await p.Initialize());
The executing thread will (one at a time) start executing Initialize for each element in the client list. Initialize immediately queues a method to the thread pool and returns an uncompleted Task.
In your second case:
var tasks = Client.GetClients().AsParallel().Select(async p => await p.Initialize());
The executing thread will fork to the thread pool and (in parallel) start executing Initialize for each element in the client list. Initialize has the same behavior: it immediately queues a method to the thread pool and returns.
The two timings are nearly identical because you're only parallelizing a small amount of code: the queueing of the method to the thread pool and the return of an uncompleted Task.
If Initialize did some longer (synchronous) work before its first await, it may make sense to use AsParallel.
Remember, all async methods (and lambdas) start out being executed synchronously (see the official FAQ or my own intro post).
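The "starts out synchronously" behaviour can be made visible by recording the order of events. This is an illustrative sketch with made-up names:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SyncStartDemo
{
    private static async Task WorkAsync(List<string> log)
    {
        // Runs on the caller's thread, before the call returns.
        log.Add("work: before first await");
        await Task.Delay(50);
        // Runs later, as a continuation.
        log.Add("work: after first await");
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        Task pending = WorkAsync(log);    // executes up to the first await
        log.Add("caller: call returned"); // only now does the caller continue
        await pending;
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

The log shows the part before the first await, then the caller's entry, then the continuation, which is why only the code before the first await benefits from AsParallel.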
There's a singular major difference.
In the following code, you are taking it upon yourself to perform the partitioning. In other words, you're creating one Task object per item from the IEnumerable<T> that is returned from the call to GetClients():
var tasks= Client.GetClients().Select(async p => await p.Initialize());
In the second, the call to AsParallel is internally going to use Task instances to execute partitions of the IEnumerable<T> and you're going to have the initial Task that is returned from the lambda async p => await p.Initialize():
var tasks = Client.GetClients().AsParallel().
Select(async p => await p.Initialize());
Finally, you're not really doing anything by using async/await here. Granted, the compiler might optimize this out, but you're just waiting on a method that returns a Task and then returning a continuation that does nothing back through the lambda. That said, since the call to Initialize is already returning a Task, it's best to keep it simple and just do:
var tasks = Client.GetClients().Select(p => p.Initialize());
Which will return the sequence of Task instances for you.
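One caveat worth illustrating: Select is lazy, so the sequence of tasks is re-created every time it is enumerated. The sketch below (names are made up) counts how often the operation is started when the same sequence is passed to Task.WhenAll twice, with and without materializing it first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LazySelectDemo
{
    // Returns how many times the operation was started for 3 items
    // when the task sequence is consumed twice.
    public static async Task<int> CountStartsAsync(bool materialize)
    {
        int started = 0;

        async Task InitializeAsync()
        {
            Interlocked.Increment(ref started);
            await Task.Delay(10);
        }

        IEnumerable<Task> tasks = Enumerable.Range(0, 3).Select(_ => InitializeAsync());
        if (materialize)
            tasks = tasks.ToArray(); // run the query once and keep the tasks

        await Task.WhenAll(tasks);   // first enumeration
        await Task.WhenAll(tasks);   // lazy: starts everything again; array: same tasks
        return started;
    }

    public static async Task Main()
    {
        Console.WriteLine($"Lazy sequence: {await CountStartsAsync(false)} starts");
        Console.WriteLine($"Materialized:  {await CountStartsAsync(true)} starts");
    }
}
```

Calling ToArray or ToList once, as the question's own code does before Task.WaitAll, avoids starting the work twice by accident.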
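To improve on the above 2 answers, this is the simplest way to start all the operations and await them together:
await Task.WhenAll(Client.GetClients()
.Select(p => p.Initialize()));
This starts Initialize for every client and completes once all of them have finished. Note that Task.WhenAll does not create threads itself; in this sample the parallelism comes from the Task.Factory.StartNew call inside Initialize. An async lambda without an await, such as async p => p.Initialize(), would wrap each task in an outer Task<Task> that completes immediately, so avoid that form. Hope that helps someone. Took me quite a while to figure this out properly, since it is far from obvious: AsParallel() seems to be what you want, but it doesn't understand async/await.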
