I have the following code, that I intend to run asynchronously. My goal is that GetPictureForEmployeeAsync() is called in parallel as many times as needed. I'd like to make sure that 'await' on CreatePicture does not prevent me from doing so.
public Task<Picture[]> GetPictures(IDictionary<string, string> tags)
{
var query = documentRepository.GetRepositoryQuery();
var employees = query.Where(doc => doc.Gender == tags["gender"]);
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
}
private Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
var base64PictureTask = blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var documentTask = documentRepository.GetItemAsync(employee.ID.ToString());
return CreatePicture(tags, base64PictureTask, documentTask);
}
private static async Task<Picture> CreatePicture(IDictionary<string, string> tags, Task<string> base64PictureTask, Task<Employee> documentTask)
{
var document = await documentTask;
return new Picture
{
EmployeeID = document.ID,
Data = await base64PictureTask,
ID = document.ID.ToString(),
Tags = tags,
};
}
If I understand it correctly, Task.WhenAll() is not affected by the two awaited tasks inside CreatePicture() because GetPictureForEmployeeAsync() is not awaited. Am I right about this? If not, how should I restructure the code to achieve what I want?
I'd like to make sure that 'await' on CreatePicture does not prevent me from doing so.
It doesn't.
If I understand it correctly, Task.WhenAll() is not affected by the two awaited tasks inside CreatePicture() because GetPictureForEmployeeAsync() is not awaited. Am I right about this?
Yes and no. The WhenAll isn't limited in any way by the awaited tasks in CreatePicture, but that has nothing to do with whether GetPictureForEmployeeAsync is awaited or not. These two lines of code are equivalent in terms of behavior:
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
return Task.WhenAll(employees.Select(async employee => await GetPictureForEmployeeAsync(employee, tags)));
I recommend reading my async intro to get a good understanding of how async and await work with tasks.
Also, since GetPictures has non-trivial logic (GetRepositoryQuery and evaluating tags["gender"]), I recommend using async and await for GetPictures, as such:
public async Task<Picture[]> GetPictures(IDictionary<string, string> tags)
{
var query = documentRepository.GetRepositoryQuery();
var employees = query.Where(doc => doc.Gender == tags["gender"]);
var tasks = employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)).ToList();
return await Task.WhenAll(tasks);
}
As a final note, you may find your code cleaner if you don't pass around "tasks meant to be awaited" - instead, await them first and pass their result values:
async Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
var base64PictureTask = blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var documentTask = documentRepository.GetItemAsync(employee.ID.ToString());
await Task.WhenAll(base64PictureTask, documentTask);
return CreatePicture(tags, await base64PictureTask, await documentTask);
}
static Picture CreatePicture(IDictionary<string, string> tags, string base64Picture, Employee document)
{
return new Picture
{
EmployeeID = document.ID,
Data = base64Picture,
ID = document.ID.ToString(),
Tags = tags,
};
}
The thing to keep in mind about calling an async method is that, as soon as an await statement is reached inside that method, control immediately goes back to the code that invoked the async method -- no matter where the await statement happens to be in the method. With a 'normal' method, control doesn't go back to the code that invokes that method until the end of that method is reached.
So in your case, you can do the following:
private async Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
// As soon as we get here, control immediately goes back to the GetPictures
// method -- no need to store the task in a variable and await it within
// CreatePicture as you were doing
var picture = await blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var document = await documentRepository.GetItemAsync(employee.ID.ToString());
return CreatePicture(tags, picture, document);
}
Because the first line of code in GetPictureForEmployeeAsync has an await, control will immediately go right back to this line...
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
...as soon as it is invoked. This will have the effect of all of the employee items getting processed in parallel (well, sort of -- the number of threads that will be allotted to your application will be limited).
As an additional word of advice, if this application is hitting a database or web service to get the pictures or documents, this code will likely cause you issues with running out of available connections. If this is the case, consider using System.Threading.Tasks.Parallel and setting the maximum degree of parallelism, or use SemaphoreSlim to control the number of connections used simultaneously.
Related
I have an API which needs to be run in a loop for Mass processing.
Current single API is:
public async Task<ActionResult<CombinedAddressResponse>> GetCombinedAddress(AddressRequestDto request)
We are not allowed to touch/modify the original single API. However can be run in bulk, using foreach statement. What is the best way to run this asychronously without locks?
Current Solution below is just providing a list, would this be it?
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>();
foreach(AddressRequestDto request in requests)
{
var newCombinedAddress = (await GetCombinedAddress(request)).Value;
combinedAddressResponses.Add(newCombinedAddress);
}
return combinedAddressResponses;
}
Update:
In debugger, it has to go to combinedAddressResponse.Result.Value
combinedAddressResponse.Value = null
and Also strangely, writing combinedAddressResponse.Result.Value gives error below "Action Result does not contain a definition for for 'Value' and no accessible extension method
I'm writing this code off the top of my head without an IDE or sleep, so please comment if I'm missing something or there's a better way.
But effectively I think you want to run all your requests at once (not sequentially) doing something like this:
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>(requests.Count);
var tasks = new List<Task<ActionResult<CombinedAddressResponse>>(requests.Count);
foreach (var request in requests)
{
tasks.Add(Task.Run(async () => await GetCombinedAddress(request));
}
//This waits for all the tasks to complete
await tasks.WhenAll(tasks.ToArray());
combinedAddressResponses.AddRange(tasks.Select(x => x.Result.Value));
return combinedAddressResponses;
}
looking for a way to speed things up and run in parallel thanks
What you need is "asynchronous concurrency". I use the term "concurrency" to mean "doing more than one thing at a time", and "parallel" to mean "doing more than one thing at a time using threads". Since you're on ASP.NET, you don't want to use additional threads; you'd want to use a form of concurrency that works asynchronously (which uses fewer threads). So, Parallel and Task.Run should not be parts of your solution.
The way to do asynchronous concurrency is to build a collection of tasks, and then use await Task.WhenAll. E.g.:
public async Task<ActionResult<IReadOnlyList<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
// Build the collection of tasks by doing an asynchronous operation for each request.
var tasks = requests.Select(async request =>
{
var combinedAddressResponse = await GetCombinedAdress(request);
return combinedAddressResponse.Value;
}).ToList();
// Wait for all the tasks to complete and get the results.
var results = await Task.WhenAll(tasks);
return results;
}
I'm not sure if this code is Asynchronous. I call this function using await from my main controller and within the function I use await on the LINQ query and .ToListAsync() - but after the query I have the foreach loop which may defeat the purpose of async on the query.
Main Controller Call:
case "getassets":
reply = await GetAssets();
break;
Function:
public async Task<ReplyObj> GetAssets()
{
ReplyObj obj = new ReplyObj();
obj.Result = new List<dynamic>();
dynamic AssetRecords = await _context.Asset.FromSql("SELECT * FROM Asset").ToListAsync();
foreach (var objAsset in AssetRecords)
{
obj.Result.Add(new Asset()
{
AssetId = objAsset.AssetId,
Name = objAsset.Name,
Description = objAsset.Description,
PriceDecimals = objAsset.PriceDecimals
});
}
obj.Success = true;
obj.Message = "";
return obj;
}
This call will have many requests hitting it, I want to know for sure that its using async correctly. Thank you!
To begin, here's a couple of references for async/await in C# that I'd suggest reviewing:
Microsoft Docs
SO Community Answer
The simple (high-level) answer is that awaiting your sql call will return control up the call stack and continue execution. In this case, that means it will go up to:
reply = await GetAssets();
Which will in turn return control to whatever function called that, etc. etc..
With that said, if all of your async calls in your call stack are immediately being awaited, then async won't end up buying you anything/changing the flow of control. To say, keep in mind that async != threading.
Few things that I want to point out:
dynamic AssetRecords = await _context.Asset.FromSql("SELECT * FROM Asset").ToListAsync(); When you are using _context.Asset it will return all the Asset rows that you have. Why would you execute another query on the table when the Asset it self is giving all that you need? Hence, to me this is redundant thing.
And if you use select method and then get the list asyncronously then you will have eliminated the the foreach loop and it will cost only one await call and query processing. See code sample below:
public async Task<ReplyObj> GetAssets()
{
ReplyObj obj = new ReplyObj();
return obj.Result = await _context.Asset.Select(s => new
{
AssetId = s.AssetId,
Name = s.Name,
Description = s.Description,
PriceDecimals = s.PriceDecimals
}).ToListAsync();
}
This could now be your true async method.
PS:
If you want to show your code to experts for review, I would suggest joining StackExchange CodeReview Community
I'm submitting a series of select statements (queries - thousands of them) to a single database synchronously and getting back one DataTable per query (Note: This program is such that it has knowledge of the DB schema it is scanning only at run time, hence the use of DataTables). The program runs on a client machine and connects to DBs on a remote machine. It takes a long time to run so many queries. So, assuming that executing them async or in parallel will speed things up, I'm exploring TPL Dataflow (TDF). I want to use the TDF library because it seems to handle all of the concerns related to writing multi-threaded code that would otherwise need to be done by hand.
The code shown is based on http://blog.i3arnon.com/2016/05/23/tpl-dataflow/. Its minimal and is just to help me understand the basic operations of TDF. Please do know I've read many blogs and coded many iterations trying to crack this nut.
None-the-less, with this current iteration, I have one problem and a question:
Problem
The code is inside a button click method (Using a UI, a user selects a machine, a sql instance, and a database, and then kicks off the scan). The two lines with the await operator return an error at build time: The 'await' operator can only be used within an async method. Consider marking this method with the 'async' modifier and changing its return type to 'Task'. I can't change the return type of the button click method. Do I need to somehow isolate the button click method from the async-await code?
Question
Although I've found beau-coup write-ups describing the basics of TDF, I can't find an example of how to get my hands on the output that each invocation of the TransformBlock produces (i.e., a DataTable). Although I want to submit the queries async, I do need to block until all queries submitted to the TransformBlock are completed. How do I get my hands on the series of DataTables produced by the TransformBlock and block until all queries are complete?
Note: I acknowledge that I have only one block now. At a minimum, I'll be adding a cancellation block and so do need/want to use TPL.
private async Task ToolStripButtonStart_Click(object sender, EventArgs e)
{
UserInput userInput = new UserInput
{
MachineName = "gat-admin",
InstanceName = "",
DbName = "AdventureWorks2014",
};
DataAccessLayer dataAccessLayer = new DataAccessLayer(userInput.MachineName, userInput.InstanceName);
//CreateTableQueryList gets a list of all tables from the DB and returns a list of
// select statements, one per table, e.g., SELECT * from [schemaname].[tablename]
IList<String> tableQueryList = CreateTableQueryList(userInput);
// Define a block that accepts a select statement and returns a DataTable of results
// where each returned record is: schemaname + tablename + columnname + column datatype + field data
// e.g., if the select query returns one record with 5 columns, then a datatable with 5
// records (one per field) will come back
var transformBlock_SubmitTableQuery = new TransformBlock<String, Task<DataTable>>(
async tableQuery => await dataAccessLayer._SubmitSelectStatement(tableQuery),
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 2,
});
// Add items to the block and start processing
foreach (String tableQuery in tableQueryList)
{
await transformBlock_SubmitTableQuery.SendAsync(tableQuery);
}
// Enable the Cancel button and disable the Start button.
toolStripButtonStart.Enabled = false;
toolStripButtonStop.Enabled = true;
//shut down the block (no more inputs or outputs)
transformBlock_SubmitTableQuery.Complete();
//await the completion of the task that procduces the output DataTable
await transformBlock_SubmitTableQuery.Completion;
}
public async Task<DataTable> _SubmitSelectStatement(string queryString )
{
try
{
.
.
await Task.Run(() => sqlDataAdapter.Fill(dt));
// process dt into the output DataTable I need
return outputDt;
}
catch
{
throw;
}
}
The cleanest way to retrieve the output of a TransformBlock is to perform a nested loop using the methods OutputAvailableAsync and TryReceive. It is a bit verbose, so you could consider encapsulating this functionality in an extension method ToListAsync:
public static async Task<List<T>> ToListAsync<T>(this IReceivableSourceBlock<T> source,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(source);
List<T> list = new();
while (await source.OutputAvailableAsync(cancellationToken).ConfigureAwait(false))
{
while (source.TryReceive(out T item))
{
list.Add(item);
}
}
Debug.Assert(source.Completion.IsCompleted);
await source.Completion.ConfigureAwait(false); // Propagate possible exception
return list;
}
Then you could use the ToListAsync method like this:
private async Task ToolStripButtonStart_Click(object sender, EventArgs e)
{
TransformBlock<string, DataTable> transformBlock = new(async query => //...
//...
transformBlock.Complete();
foreach (DataTable dataTable in await transformBlock.ToListAsync())
{
// Do something with each dataTable
}
}
Note: this ToListAsync implementation is destructive, meaning that in case of an error the consumed messages are discarded. To make it non-destructive, just remove the await source.Completion line. In this case you'll have to remember to await the Completion of the block after processing the list with the consumed messages, otherwise you won't be aware if the TransformBlock failed to process all of its input.
Alternative ways to retrieve the output of a dataflow block do exist, for example this one by dcastro uses a BufferBlock as a buffer and is slightly more performant, but personally I find the approach above to be safer and more straightforward.
Instead of waiting for the completion of the block before retrieving the output, you could also retrieve it in a streaming manner, as an IAsyncEnumerable<T> sequence:
public static async IAsyncEnumerable<T> ToAsyncEnumerable<T>(
this IReceivableSourceBlock<T> source,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(source);
while (await source.OutputAvailableAsync(cancellationToken).ConfigureAwait(false))
{
while (source.TryReceive(out T item))
{
yield return item;
cancellationToken.ThrowIfCancellationRequested();
}
}
Debug.Assert(source.Completion.IsCompleted);
await source.Completion.ConfigureAwait(false); // Propagate possible exception
}
This way you will be able to get your hands to each DataTable immediately after it has been cooked, without having to wait for the processing of all queries. To consume an IAsyncEnumerable<T> you simply move the await before the foreach:
await foreach (DataTable dataTable in transformBlock.ToAsyncEnumerable())
{
// Do something with each dataTable
}
Advanced: Below is a more sophisticated version of the ToListAsync method, that propagates all the errors of the underlying block, in the same direct way that are propagated by methods like the Task.WhenAll and Parallel.ForEachAsync. The original simple ToListAsync method wraps the errors in a nested AggregateException, using the Wait technique that is shown in this answer.
/// <summary>
/// Asynchronously waits for the successful completion of the specified source, and
/// returns all the received messages. In case the source completes with error,
/// the error is propagated and the received messages are discarded.
/// </summary>
public static Task<List<T>> ToListAsync<T>(this IReceivableSourceBlock<T> source,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(source);
async Task<List<T>> Implementation()
{
List<T> list = new();
while (await source.OutputAvailableAsync(cancellationToken)
.ConfigureAwait(false))
while (source.TryReceive(out T item))
list.Add(item);
await source.Completion.ConfigureAwait(false);
return list;
}
return Implementation().ContinueWith(t =>
{
if (t.IsCanceled) return t;
Debug.Assert(source.Completion.IsCompleted);
if (source.Completion.IsFaulted)
{
TaskCompletionSource<List<T>> tcs = new();
tcs.SetException(source.Completion.Exception.InnerExceptions);
return tcs.Task;
}
return t;
}, default, TaskContinuationOptions.DenyChildAttach |
TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default).Unwrap();
}
.NET 6 update: A new API DataflowBlock.ReceiveAllAsync was introduced in .NET 6, with this signature:
public static IAsyncEnumerable<TOutput> ReceiveAllAsync<TOutput> (
this IReceivableSourceBlock<TOutput> source,
CancellationToken cancellationToken = default);
It is similar with the aforementioned ToAsyncEnumerable method. The important difference is that the new API does not propagate the possible exception of the consumed source block, after propagating all of its messages. This behavior is not consistent with the analogous API ReadAllAsync from the Channels library. I have reported this consistency on GitHub, and the issue is currently labeled by Microsoft as a bug.
As it turns out, to meet my requirements, TPL Dataflow is a bit overkill. I was able to meet my requirements using async/await and Task.WhenAll. I used the Microsoft How-To How to: Extend the async Walkthrough by Using Task.WhenAll (C#) as a model.
Regarding my "Problem"
My "problem" is not a problem. An event method signature (in my case, a "Start" button click method that initiates my search) can be modified to be async. In the Microsoft How-To GetURLContentsAsync solution, see the startButton_Click method signature:
private async void startButton_Click(object sender, RoutedEventArgs e)
{
.
.
}
Regarding my question
Using Task.WhenAll, I can wait for all my queries to finish then process all the outputs for use on my UI. In the Microsoft How-To GetURLContentsAsync solution, see the SumPageSizesAsync method, i.e,, the array of int named lengths is the sum of all outputs.
private async Task SumPageSizesAsync()
{
.
.
// Create a query.
IEnumerable<Task<int>> downloadTasksQuery = from url in urlList select ProcessURLAsync(url);
// Use ToArray to execute the query and start the download tasks.
Task<int>[] downloadTasks = downloadTasksQuery.ToArray();
// Await the completion of all the running tasks.
Task<int[]> whenAllTask = Task.WhenAll(downloadTasks);
int[] lengths = await whenAllTask;
.
.
}
Using Dataflow blocks properly results in both cleaner and faster code. Dataflow blocks aren't agents or tasks. They're meant to work in a pipeline of blocks, connected with LinkTo calls, not manual coding.
It seems the scenario is to download some data, eg some CSVs, parse them and insert them to a database. Each of those steps can go into its own block:
a Downloader with a DOP>1, to allow multiple downloads run concurrently without flooding the network.
a Parser that converts the files into arrays of objects
an Importer that uses SqlBulkCopy to bulk insert the rows into the database in the fastest way possible, using minimal logging.
var downloadDOP=8;
var parseDOP=2;
var tableName="SomeTable";
var linkOptions=new DataflowLinkOptions { PropagateCompletion = true};
var downloadOptions =new ExecutionDataflowBlockOptions {
MaxDegreeOfParallelism = downloadDOP,
};
var parseOptions =new ExecutionDataflowBlockOptions {
MaxDegreeOfParallelism = parseDOP,
};
With these options, we can construct a pipeline of blocks
//HttpClient is thread-safe and reusable
HttpClient http=new HttpClient(...);
var downloader=new TransformBlock<(Uri,string),FileInfo>(async (uri,path)=>{
var file=new FileInfo(path);
using var stream =await httpClient.GetStreamAsync(uri);
using var fileStream=file.Create();
await stream.CopyToAsync(stream);
return file;
},downloadOptions);
var parser=new TransformBlock<FileInfo,Foo[]>(async file=>{
using var reader = file.OpenText();
using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
var records = csv.GetRecords<Foo>().ToList();
return records;
},parseOptions);
var importer=new ActionBlock<Foo[]>(async recs=>{
using var bcp=new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock);
bcp.DestinationTableName=tableName;
//Map columns if needed
...
using var reader=ObjectReader.Create(recs);
await bcp.WriteToServerAsync(reader);
});
downloader.LinkTo(parser,linkOptions);
parser.LinkTo(importer,linkOptions);
Once the pipeline is complete, you can start posting Uris to the head block and await until the tail block completes :
IEnumerable<(Uri,string)> filesToDownload = ...
foreach(var pair in filesToDownload)
{
await downloader.SendAsync(pair);
}
downloader.Complete();
await importer.Completion;
The code uses CsvHelper to parse the CSV file and FastMember's ObjectReader to create an IDataReader wrapper over the CSV records.
In each block you can use a Progress instance to update the UI based on the pipeline's progress
Sorry for a subjective question like this:
I current have a sequential loading method which was extremely slow, I have converted this into a async method:
public async void LoadData(int releaseId, int projectId, bool uiThread)
Within this method I start up an Await task (and set the ConfigureAwait to be false) as it does not need to capture and resume from this context.
await Task.Run(() =>
{
//make several DB calls as below
}).ConfigureAwait(False);
Within this task I make several async calls to EF / the database, each call looks something like this:
public async virtual Task<List<X>> FindXAsync()
{
var q = from c in context.X
select c;
return await q.ToListAsync();
}
But in the task I am awaiting the response by using result see below:
X = sm.FindXAsync().Result;
From my limited knowledge of using asynchronous programming with EF would each call run sequentially from inside the task?
Will the current set up return multiple return sets concurrently or would I have to create multiple tasks and await them separatly.
Again sorry for the vague question but i'm sure you guys are a lot more experienced on this topic than me ^^
Edit: I release there wasn't really a proper question in there, I guess what I am wondering is would x,y and z return concurrently or sequentially from within the task.
await Task.Run(() =>
{
x = sm.FindXAsync().Result;
y = new ObservableCollection<Y>(sm.FindYAsync().Result);
z = new ObservableCollection<Z>(sm.FindZAsync().Result);
}).ConfigureAwait(False)
Thanks,
Chris
Yes, the will run sequentially when you use the Result property. Also they will run sequentially with the following code: `
x = await sm.FindXAsync().ConfigureAwait(False);
y = new ObservableCollection<Y>( await sm.FindYAsync().ConfigureAwait(False));
z = new ObservableCollection<Z>(await sm.FindZAsync().ConfigureAwait(False));
Also update
public async void LoadData to return Task -> `public async Task LoadData`.
If you want to run all code in parallel add the taks in an array and then call await Task.WhenAll(tasks)
I have a list of objects that I need to run a long running process on and I would like to kick them off asynchronously, then when they are all finished return them as a list to the calling method. I've been trying different methods that I have found, however it appears that the processes are still running synchronously in the order that they are in the list. So I am sure that I am missing something in the process of how to execute a list of tasks.
Here is my code:
public async Task<List<ShipmentOverview>> GetShipmentByStatus(ShipmentFilterModel filter)
{
if (string.IsNullOrEmpty(filter.Status))
{
throw new InvalidShipmentStatusException(filter.Status);
}
var lookups = GetLookups(false, Brownells.ConsolidatedShipping.Constants.ShipmentStatusType);
var lookup = lookups.SingleOrDefault(sd => sd.Name.ToLower() == filter.Status.ToLower());
if (lookup != null)
{
filter.StatusId = lookup.Id;
var shipments = Shipments.GetShipments(filter);
var tasks = shipments.Select(async model => await GetOverview(model)).ToList();
ShipmentOverview[] finishedTask = await Task.WhenAll(tasks);
return finishedTask.ToList();
}
else
{
throw new InvalidShipmentStatusException(filter.Status);
}
}
private async Task<ShipmentOverview> GetOverview(ShipmentModel model)
{
String version;
var user = AuthContext.GetUserSecurityModel(Identity.Token, out version) as UserSecurityModel;
var profile = AuthContext.GetProfileSecurityModel(user.Profiles.First());
var overview = new ShipmentOverview
{
Id = model.Id,
CanView = true,
CanClose = profile.HasFeatureAction("Shipments", "Close", "POST"),
CanClear = profile.HasFeatureAction("Shipments", "Clear", "POST"),
CanEdit = profile.HasFeatureAction("Shipments", "Get", "PUT"),
ShipmentNumber = model.ShipmentNumber.ToString(),
ShipmentName = model.Name,
};
var parcels = Shipments.GetParcelsInShipment(model.Id);
overview.NumberParcels = parcels.Count;
var orders = parcels.Select(s => WareHouseClient.GetOrderNumberFromParcelId(s.ParcelNumber)).ToList();
overview.NumberOrders = orders.Distinct().Count();
//check validations
var vals = Shipments.GetShipmentValidations(model.Id);
if (model.ValidationTypeId == Constants.OrderValidationType)
{
if (vals.Count > 0)
{
overview.NumberOrdersTotal = vals.Count();
overview.NumberParcelsTotal = vals.Sum(s => WareHouseClient.GetParcelsPerOrder(s.ValidateReference));
}
}
return overview;
}
It looks like you're using asynchronous methods while you really want threads.
Asynchronous methods yield control back to the calling method when an async method is called, then wait until the methods has completed on the await. You can see how it works here.
Basically, the only usefulness of async/await methods is not to lock the UI, so that it stays responsive.
If you want to fire multiple processings in parallel, you will want to use threads, like such:
using System.Threading.Tasks;
public void MainMethod() {
// Parallel.ForEach will automagically run the "right" number of threads in parallel
Parallel.ForEach(shipments, shipment => ProcessShipment(shipment));
// do something when all shipments have been processed
}
public void ProcessShipment(Shipment shipment) { ... }
Marking the method as async doesn't auto-magically make it execute in parallel. Since you're not using await at all, it will in fact execute completely synchronously as if it wasn't async. You might have read somewhere that async makes functions execute asynchronously, but this simply isn't true - forget it. The only thing it does is build a state machine to handle task continuations for you when you use await and actually build all the code to manage those tasks and their error handling.
If your code is mostly I/O bound, use the asynchronous APIs with await to make sure the methods actually execute in parallel. If they are CPU bound, a Task.Run (or Parallel.ForEach) will work best.
Also, there's no point in doing .Select(async model => await GetOverview(model). It's almost equivalent to .Select(model => GetOverview(model). In any case, since the method actually doesn't return an asynchronous task, it will be executed while doing the Select, long before you get to the Task.WhenAll.
Given this, even the GetShipmentByStatus's async is pretty much useless - you only use await to await the Task.WhenAll, but since all the tasks are already completed by that point, it will simply complete synchronously.
If your tasks are CPU bound and not I/O bound, then here is the pattern I believe you're looking for:
static void Main(string[] args) {
Task firstStepTask = Task.Run(() => firstStep());
Task secondStepTask = Task.Run(() => secondStep());
//...
Task finalStepTask = Task.Factory.ContinueWhenAll(
new Task[] { step1Task, step2Task }, //more if more than two steps...
(previousTasks) => finalStep());
finalStepTask.Wait();
}