Flow context/state through generated continuations

Flow context/state through generated continuations - c#

First, some context (pardon the pun). Consider the following two async methods:
public async Task Async1() {
PreWork();
await Async2();
PostWork();
}
public async Task Async2() {
await Async3();
}
Thanks to the async and await keywords, this creates the illusion of a nice simple call stack:
Async1
PreWork
Async2
Async3
PostWork
But, if I understand correctly, in the generated code the call to PostWork is tacked on as a continuation to the Async2 task, so at runtime the flow of execution is actually more like this:
Async1
PreWork
Async2
Async3
PostWork
(and actually it's even more complicated than that, because in reality the use of async and await causes the compiler to generate state machines for each async method, but we might be able to ignore that detail for this question)
Now, my question: Is there any way to flow some sort of context through these auto-generated continuations, such that by the time I hit PostWork, I have accumulated state from Async1 and Async2?
I can get something similar to what I want with tools like AsyncLocal and CallContext.LogicalSetData, but they aren't quite what I need because the contexts are "rolled back" as you work your way back up the async chain. For example, calling Async1 in the following code will print "1,3", not "1,2,3":
private static readonly AsyncLocal<ImmutableQueue<String>> _asyncLocal = new AsyncLocal<ImmutableQueue<String>>();
public async Task Async1() {
_asyncLocal.Value = ImmutableQueue<String>.Empty.Enqueue("1");
await Async2();
_asyncLocal.Value = _asyncLocal.Value.Enqueue("3");
Console.WriteLine(String.Join(",", _asyncLocal.Value));
}
public async Task Async2() {
_asyncLocal.Value = _asyncLocal.Value.Enqueue("2");
await Async3();
}
I understand why this prints "1,3" (the execution context flows down to Async2 but not back up to Async1) but that isn't what I'm looking for. I really want to accumulate state through the actual execution chain, such that I'm able to print "1,2,3" at the end because that was the actual way in which the methods were executed leading up to the call to Console.WriteLine.
Note that I don't want to blindly accumulate all state across all async work, I only want to accumulate state that is causally related. In this scenario I want to print "1,2,3" because that "2" came from a dependent (awaited) task. If instead the call to Async2 was just a fire-and-forget task then I wouldn't expect to see "2" because its execution would not be in the actual chain leading up to the Console.WriteLine.
Edit: I do not want to solve this problem just passing around parameters and return values because I need a generic solution that will work across a large code base without having to modify every method to pass around this metadata.

Related

What's the generally accepted best practice when calling NON-Async functions, from within other async functions?

It seems that in a Blazor Server app we are encouraged to use Async where possible, and I generally understand why. That said - would someone please be kind enough to explain the general expectation when using Async functions and fill in some missing knowledge for me - I'm confused by all the information that's out there and need something specific to what I'm doing, to help me understand.
I'm trying to stick to using async where possible and in most cases it's fairly easy to do so, but when calling some functions, it seems overkill to make them asynchronous (or is it?). In the the example below - the NON-async 'ValidateEvent' function is being called from within an async function.
So my question is ...Should I:
a) Call it normally from within the Async function (which seems to defeat the point of async) eg: "var validationResult = ValidateEvent(objDto);"?
b) Call it using Task.Run eg: "await Task.Run(() =>ValidateEvent(objDto));"?
c) Convert this simple IF/ELSE method into an Async function?
Thanks in advance for any help/advice.
//example async function, that itself calls the other non-async function.
public async Task<Response> AddAsync(ObjDto objDto)
{
// a) Call it normally?
var validationResult = ValidateEvent(objDto);
// b) Calling it using Task.Run?
var validationResult = await Task.Run(() =>ValidateEvent(objDto));
//Do stuff asynchronously here
...
await _db.AddAsync(objDto);
await _db.SaveChangesAsync();
...
}
The validation function:
c) Should I really be converting this to async since it's just a series of 'IFs and ELSEs' (async conversion further below)
//Non-Async version
public ResponseObj ValidateEvent(ObjDto obj)
{
ResponseObj responseObj = new();
string stringErrors = "";
//If not full day, check that end date is not before start date
if (!obj.IsFullDay)
{
if (obj.EndTime < obj.StartTime)
stringErrors += ("End date cannot be before the start date<br />");
}
...other code removed for brevity
if (string.IsNullOrEmpty(stringErrors)) //Success
{
responseObj.Status = responseObj.Success;
}
else //errors
{
responseObj.Status = responseObj.Error;
responseObj.Message = stringErrors;
}
return responseObj;
}
//Example Async conversion - is it worth converting this using Task.Run?
//Async conversion
public async Task<ResponseObj> ValidateEvent(ObjDto obj)
{
ResponseObj responseObj = new();
string stringErrors = "";
await Task.Run(() =>
{
//If not full day, check that end date is not before start date
if (!obj.IsFullDay)
{
if (obj.EndTime < obj.StartTime)
stringErrors += ("End date cannot be before the start date<br />");
}
if (string.IsNullOrEmpty(stringErrors)) //Success
{
responseObj.Status = responseObj.Success;
}
else //errors
{
responseObj.Status = responseObj.Error;
responseObj.Message = stringErrors;
}
}
);
return responseObj;
}
Again, thanks in advance for any help/advice in understaning the best way to go with this in general.

a) Call it normally from within the Async function ...
Yes
... (which seems to defeat the point of async)
No it doesn't.

It depends on what the ValidateEvent method does.
In case it does a trivial amount of work that is not going to be increased over time, like your example suggests, then it's completely OK to invoke it from inside the asynchronous AddAsync method.
In case it blocks for a considerable amount of time, say for more than 50 msec, then invoking it from inside an asynchronous method would violates Microsoft's guidelines about how asynchronous methods are expected to behave:
An asynchronous method that is based on TAP can do a small amount of work synchronously, such as validating arguments and initiating the asynchronous operation, before it returns the resulting task. Synchronous work should be kept to the minimum so the asynchronous method can return quickly.
So what you should do if the ValidateEvent method blocks, and you can't do anything about it? AFAIK you are in a gray area, with all options being unsatisfactory for one reason or another.
You can leave it as is, accepting that the method violates Microsoft's guidelines, document this behavior, and leave to the callers the option to wrap the AddAsync in a Task.Run, if this is beneficial for them.
You can wrap the ValidateEvent in a Task.Run, like you did in your question. This violates (partially) another guideline by Microsoft: Do not expose asynchronous wrappers for synchronous methods.
You can split your AddAsync method into two parts, the synchronous ValidateAdd and the asynchronous AddAsync, expecting that the callers will invoke the one method after the other.
If your AddAsync method is part of an application-agnostic library, I would say go with the option 1. That's what Microsoft is doing with some built-in APIs. If the AddAsync method is application-specific, and the application has a GUI (WinForms, WPF etc), then it is tempting to go with the option 2. Keeping the UI responsive is paramount for a GUI application, so the Task.Run should be included somewhere, and putting it inside the AddAsync seems like a convenient place to put it. The option 3 is probably the least attractive option for this particular case, because validating and adding are tightly coupled operations, but it may be a good option for other scenarios.

How to convert synchronous method to asynchronous without changing it's signature

I have a large scale C# solution with 40-ish modules.
I'm trying to convert a service used solution-wide from synchronous to asynchronous.
the problem is I can't find a way to do so without changing the signature of the method.
I've tried wrapping said asynchronous desired operation with Task but that requires changing the method signature.
I've tried changing the caller to block itself while the method is operating but that screwed my system pretty good because it's a very long calling-chain and changing each of the members in the chain to block itself is a serious issue.
public SomeClass Foo()
{
// Synchronous Code
}
Turn this into:
public SomeClass Foo()
{
//Asynchronous code
}
whilst all callers stay the same
public void DifferentModule()
{
var class = Foo();
}

Any implementation that fundamentally changes something from sync to async is going to involve a signature change. Any other approach is simply not going to work well. Fundamentally: async and sync demand different APIs, which means: different signatures. This is unavoidable, and frankly "How to convert synchronous method to asynchronous without changing it's signature?" is an unsolvable problem (and more probably: the wrong question). I'm sorry if it seems like I'm not answering the question there, but... sometimes the answer is "you can't, and anyone who says you can is tempting you down a very bad path".
In the async/Task<T> sense, the most common way to do this without breaking compatibility is to add a new / separate method, so you have
SomeReturnType Foo();
and
Task<SomeReturnType> FooAsync(); // or ValueTask<T> if often actually synchoronous
nothing that Foo and FooAsync here probably have similar but different implementations - one designed to exploit async, one that works purely synchronously. It is not a good idea to spoof one by calling the other - both "sync over async" (the synchronous version calling the async version) and "async over sync" (the async version calling the sync version) are anti-patterns, and should be avoided (the first is much more harmful than the second).
If you really don't want to do this, you could also do things like adding a FooCompleted callback event (or similar), but : this is still fundamentally a signature change, and the caller will still have to use the API differently. By the time you've done that - you might as well have made it easy for the consumer by adding the Task<T> API instead.

The common pattern is to add an Async to the method name and wrap the return type in a Task. So:
public SomeClass Foo()
{
// Synchronous Code
}
becomes:
public Task<SomeClass> FooAsync()
{
// Asynchronous Code
}
You'll end up with two versions of the method, but it will allow you to gradually migrate your code over to the async approach, and it won't break the existing code whilst you're doing the migration.

If you desperately need to do this, it can be achieved by wrapping the Synchronous code that needs to become Asynchronous in a Task this can be done like this:
public SomeClass Foo()
{
Task t = Task.Run(() =>
{
// Do stuff, code in here will run asynchronously
}
t.Wait();
// or if you need a return value: var result = t.Wait();
return someClass;
// or return result
}
Code you write inside the Task.Run(() => ...) will run asynchronously
Short explanation: with Task t = Task.Run(() => ...) we start a new Task, the "weird" parameter is a Lambda expression, basically we're passing a anonymous Method into the Run method which will get executed by the Task
We then wait for the task to finish with t.Wait();. The Wait method can return a value, you can return a value from an anonymous method just like from any method, with the return keyword
Note: This can, but should not be done. See Sean's answer for more

Does adding a Task<T> to the return of a method make it run in a thread?

I am struggling to find any solid information online, but lets pretend I have a method like below:
public int DoSomething()
{
// some sync logic that takes a while
return someResult
}
So lets assume it is a method that is completely sync and currently will block the thread the caller is on for 100ms. So then lets say I wanted to create a version that would not block that thread while it executes, is it as simple as just doing:
public Task<int> DoSomethingAsync()
{
// some sync logic that takes a while
return Task.FromResult(someResult)
}
As I know Task.Run will cause code to be executed within a thread, but does .net basically interpret any method with a Task/<T> return type to be something that should be non blocking?
(I am hesitant to use the word async as it has mixed meanings in .net, but as shown above there is no await use here)

No. All that changing the return type does is... change the return type.
You're changing your promise from "when I return I will give you an int" to "when I return, I will give you something that will eventually provide an int" (with the caveat that the value may, in fact, already be available).
How your method comes up with the Task is entirely an internal implementation detail that is a concern for your method. In your current changed implementation, it does it by running entirely on the same thread from which it was called and then just stuffing the result into a Task at the end - to no real benefit.
If you had added the async keyword and left the return someResult; line you'd have been given more clues about this by the compiler:
This async method lacks 'await' operators and will run synchronously. Consider using ...

Awaiting last method call

A few posts on stack overflow have suggested the following:
Any async method where you have a single await expression awaiting a Task or Task < T >, right at the end of the method with no further processing, would be better off being written without using async/await.
Architecture for async/await
Await or Return
Is this advice just for particular circumstances? On a web server, isn't one of the main reasons to use async/await, being that while awaiting something like the UpdateDataAsync method below, the Thread will return to the ThreadPool, allowing the server to work on other requests?
Should SaveDataAsync await that DB update call given it's the last call in the method?
public async Task WorkWithDataAsync(Data data)
{
ManipulateData(data);
await SaveDataAsync(data);
await SendDataSomewhereAsync(data);
}
public async Task SaveDataAsync(Data data)
{
FixData(data);
await DBConnection.UpdateDataAsync(data);
}
Also, given you don't know where SaveDataAsync will be used, making it synchronous would hurt a method like WorkWithDataAsync would it not?

Removing the await and just returning the Task doesn't make the method synchronous. await is a tool that makes it easier to make a method asynchronous. It's not the only way to make a method asynchronous. It's there because it allows you to add continuations to task much more easily than you could without it, but in the method that you've shown you're not actually leveraging the await keyword to accomplish anything and as such you can remove it and have the code simply function identically.
(Note that technically the semantics of exceptions will have been slightly changed by removing async; thrown exceptions will be thrown from the method, not wrapped in the returned Task, if it matters to you. As far as any caller is concerned, this is the only observable difference.)

Writing these methods without the async-await keywords doesn't make them synchronous.
The idea is to return the task directly instead of having the overhead of generating the state machine.
The actual difference is about exception handling. An async method encapsulates exceptions inside the returned task while a simple task returning method doesn't.
If for example FixData throws an exception it will be captured in an async method but directly thrown in a simple task returning one. Both options would be equally asynchronous.

Write your own async method

I would like to know how to write your own async methods the "correct" way.
I have seen many many posts explaining the async/await pattern like this:
http://msdn.microsoft.com/en-us/library/hh191443.aspx
// Three things to note in the signature:
// - The method has an async modifier.
// - The return type is Task or Task<T>. (See "Return Types" section.)
// Here, it is Task<int> because the return statement returns an integer.
// - The method name ends in "Async."
async Task<int> AccessTheWebAsync()
{
// You need to add a reference to System.Net.Http to declare client.
HttpClient client = new HttpClient();
// GetStringAsync returns a Task<string>. That means that when you await the
// task you'll get a string (urlContents).
Task<string> getStringTask = client.GetStringAsync("http://msdn.microsoft.com");
// You can do work here that doesn't rely on the string from GetStringAsync.
DoIndependentWork();
// The await operator suspends AccessTheWebAsync.
// - AccessTheWebAsync can't continue until getStringTask is complete.
// - Meanwhile, control returns to the caller of AccessTheWebAsync.
// - Control resumes here when getStringTask is complete.
// - The await operator then retrieves the string result from getStringTask.
string urlContents = await getStringTask;
// The return statement specifies an integer result.
// Any methods that are awaiting AccessTheWebAsync retrieve the length value.
return urlContents.Length;
}
private void DoIndependentWork()
{
resultsTextBox.Text += "Working........\r\n";
}
This works great for any .NET Method that already implements this functionality like
System.IO opertions
DataBase opertions
Network related operations (downloading, uploading...)
But what if I want to write my own method that takes quite some time to complete where there just is no Method I can use and the heavy load is in the DoIndependentWork method of the above example?
In this method I could do:
String manipulations
Calculations
Handling my own objects
Aggregating, comparing, filtering, grouping, handling stuff
List operations, adding, removing, coping
Again I have stumbled across many many posts where people just do the following (again taking the above example):
async Task<int> AccessTheWebAsync()
{
HttpClient client = new HttpClient();
Task<string> getStringTask = client.GetStringAsync("http://msdn.microsoft.com");
await DoIndependentWork();
string urlContents = await getStringTask;
return urlContents.Length;
}
private Task DoIndependentWork()
{
return Task.Run(() => {
//String manipulations
//Calculations
//Handling my own objects
//Aggregating, comparing, filtering, grouping, handling stuff
//List operations, adding, removing, coping
});
}
You may notice that the changes are that DoIndependentWork now returns a Task and in the AccessTheWebAsync task the method got an await.
The heavy load operations are now capsulated inside a Task.Run(), is this all it takes?
If that's all it takes is the only thing I need to do to provide async Method for every single method in my library the following:
public class FooMagic
{
public void DoSomeMagic()
{
//Do some synchron magic...
}
public Task DoSomeMagicAsync()
{
//Do some async magic... ?!?
return Task.Run(() => { DoSomeMagic(); });
}
}
Would be nice if you could explain it to me since even a high voted question like this:
How to write simple async method? only explains it with already existing methods and just using asyn/await pattern like this comment of the mentioned question brings it to the point:
How to write simple async method?

Actual Answer
You do that using TaskCompletionSource, which has a Promise Task that doesn't execute any code and only:
"Represents the producer side of a Task unbound to a delegate, providing access to the consumer side through the Task property."
You return that task to the caller when you start the asynchronous operation and you set the result (or exception/cancellation) when you end it. Making sure the operation is really asynchronous is on you.
Here is a good example of this kind of root of all async method in Stephen Toub's AsyncManualResetEvent implementation:
class AsyncManualResetEvent
{
private volatile TaskCompletionSource<bool> _tcs = new TaskCompletionSource<bool>();
public Task WaitAsync() { return _tcs.Task; }
public void Set() { _tcs.TrySetResult(true); }
public void Reset()
{
while (true)
{
var tcs = _tcs;
if (!tcs.Task.IsCompleted ||
Interlocked.CompareExchange(ref _tcs, new TaskCompletionSource<bool>(), tcs) == tcs)
return;
}
}
}
Background
There are basically two reasons to use async-await:
Improved scalability: When you have I/O intensive work (or other inherently asynchronous operations), you can call it asynchronously and so you release the calling thread and it's capable of doing other work in the mean time.
Offloading: When you have CPU intensive work, you can call it asynchronously, which moves the work off of one thread to another (mostly used for GUI threads).
So most of the .Net framework's asynchronous calls support async out of the box and for offloading you use Task.Run (as in your example). The only case where you actually need to implement async yourself is when you create a new asynchronous call (I/O or async synchronization constructs for example).
These cases are extremely rare, which is why you mostly find answers that
"Only explains it with already existing methods and just using async/await pattern"
You can go deeper in The Nature of TaskCompletionSource

Would be nice if you could explain it to me: How to write simple async
method?
First, we need to understand what an async method means. When one exposes an async method to the end user consuming the async method, you're telling him: "Listen, this method will return to you quickly with a promise of completing sometime in the near future". That is what you're guaranteeing to your users.
Now, we need to understand how Task makes this "promise" possible. As you ask in your question, why simply adding a Task.Run inside my method makes it valid to be awaited using the await keyword?
A Task implements the GetAwaiter pattern, meaning it returns an object called an awaiter (Its actually called TaskAwaiter). The TaskAwaiter object implements either INotifyCompletion or ICriticalNotifyCompletion interfaces, exposing a OnCompleted method.
All these goodies are in turn used by the compiler once the await keyword is used. The compiler will make sure that at design time, your object implements GetAwaiter, and in turn use that to compile the code into a state machine, which will enable your program to yield control back to the caller once awaited, and resume when that work is completed.
Now, there are some guidelines to follow. A true async method doesn't use extra threads behind the scenes to do its job (Stephan Cleary explains this wonderfully in There Is No Thread), meaning that exposing a method which uses Task.Run inside is a bit misleading to the consumers of your api, because they will assume no extra threading involved in your task. What you should do is expose your API synchronously, and let the user offload it using Task.Run himself, controlling the flow of execution.
async methods are primarily used for I/O Bound operations, since these naturally don't need any threads to be consumed while the IO operation is executing, and that is why we see them alot in classes responsible for doing IO operations, such as hard drive calls, network calls, etc.
I suggest reading the Parallel PFX teams article Should I expose asynchronous wrappers for synchronous methods? which talks exactly about what you're trying to do and why it isn't recommended.

TL;DR:
Task.Run() is what you want, but be careful about hiding it in your library.
I could be wrong, but you might be looking for guidance on getting CPU-Bound code to run asynchronously [by Stephen Cleary]. I've had trouble finding this too, and I think the reason it's so difficult is that's kinda not what you're supposed to do for a library - kinda...
Article
The linked article is a good read (5-15 minutes, depending) that goes into a decent amount of detail about the hows and whys of using Task.Run() as part of an API vs using it to not block a UI thread - and distinguishes between two "types" of long-running process that people like to run asynchronously:
CPU-bound process (one that is crunching / actually working and needs a separate thread to do its work)
Truly asynchronous operation (sometimes called IO-bound - one that is doing a few things here and there with a bunch of waiting time in between actions and would be better off not hogging a thread while it's sitting there doing nothing).
The article touches on the use of API functions in various contexts, and explains whether the associated architectures "prefer" sync or async methods, and how an API with sync and async method signatures "looks" to a developer.
Answer
The last section "OK, enough about the wrong solutions? How do we fix this the right way???" goes into what I think you're asking about, ending with this:
Conclusion: do not use Task.Run in the implementation of the method; instead, use Task.Run to call the method.
Basically, Task.Run() 'hogs' a thread, and is thus the thing to use for CPU-bound work, but it comes down to where it's used. When you're trying to do something that requires a lot of work and you don't want to block the UI Thread, use Task.Run() to run the hard-work function directly (that is, in the event handler or your UI based code):
class MyService
{
public int CalculateMandelbrot()
{
// Tons of work to do in here!
for (int i = 0; i != 10000000; ++i)
;
return 42;
}
}
...
private async void MyButton_Click(object sender, EventArgs e)
{
await Task.Run(() => myService.CalculateMandelbrot());
}
But... don't hide your Task.Run() in an API function suffixed -Async if it is a CPU-bound function, as basically every -Async function is truly asynchronous, and not CPU-bound.
// Warning: bad code!
class MyService
{
public int CalculateMandelbrot()
{
// Tons of work to do in here!
for (int i = 0; i != 10000000; ++i)
;
return 42;
}
public Task<int> CalculateMandelbrotAsync()
{
return Task.Run(() => CalculateMandelbrot());
}
}
In other words, don't call a CPU-bound function -Async, because users will assume it is IO-bound - just call it asynchronously using Task.Run(), and let other users do the same when they feel it's appropriate. Alternately, name it something else that makes sense to you (maybe BeginAsyncIndependentWork() or StartIndependentWorkTask()).

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.