TaskCompletionSource usage in IO Async methods - c#

The implementation of the ExecuteNonQueryAsync() method in System.Data.SqlClient.SqlCommand is as follows:
public override Task<int> ExecuteNonQueryAsync(CancellationToken cancellationToken) {
Bid.CorrelationTrace("<sc.SqlCommand.ExecuteNonQueryAsync|API|Correlation> ObjectID%d#, ActivityID %ls\n", ObjectID);
SqlConnection.ExecutePermission.Demand();
TaskCompletionSource<int> source = new TaskCompletionSource<int>();
CancellationTokenRegistration registration = new CancellationTokenRegistration();
if (cancellationToken.CanBeCanceled) {
if (cancellationToken.IsCancellationRequested) {
source.SetCanceled();
return source.Task;
}
registration = cancellationToken.Register(CancelIgnoreFailure);
}
Task<int> returnedTask = source.Task;
try {
RegisterForConnectionCloseNotification(ref returnedTask);
Task<int>.Factory.FromAsync(BeginExecuteNonQueryAsync, EndExecuteNonQueryAsync, null).ContinueWith((t) => {
registration.Dispose();
if (t.IsFaulted) {
Exception e = t.Exception.InnerException;
source.SetException(e);
}
else {
if (t.IsCanceled) {
source.SetCanceled();
}
else {
source.SetResult(t.Result);
}
}
}, TaskScheduler.Default);
}
catch (Exception e) {
source.SetException(e);
}
return returnedTask;
}
Which I would summarize as:
Create TaskCompletionSource<int> source = new TaskCompletionSource<int>();
Create a new task using Task<int>.Factory.FromAsync, using the APM "Begin/End" API
Invoke source.SetResult() when the task finishes.
Return source.Task
What is the point of using TaskCompletionSource here, and why not return the task created by Task<int>.Factory.FromAsync() directly? That task also carries the result and the exception (if any).
The book C# in a Nutshell, in the Asynchronous Programming and Continuations section, states:
In writing Delay, we used TaskCompletionSource, which is a standard way to implement “bottom-level” I/O-bound asynchronous methods.
For compute-bound methods, we use Task.Run to initiate thread-bound concurrency. Simply by returning the task to the caller, we create an asynchronous method.
Why is it that the compute-bound methods can be implemented using Task.Run(), but not the I/O bound methods?

Note that for a definitive answer, you would have to ask the author of the code. Barring that, we can only speculate. However, I think it's reasonable to make some inferences with reasonable accuracy…
What is the point of using TaskCompletionSource here and why not to return the task created by Task.Factory.FromAsync() directly?
In this case, it appears to me that the main reason is to allow the implementation to deregister the registered callback CancelIgnoreFailure() before the task is actually completed. This ensures that by the time the client code receives completion notification, the API itself has completely cleaned up from the operation.
A secondary reason might be simply to provide a complete abstraction. I.e. to not allow any of the underlying implementation to "leak" from the method, in the form of a Task object that a caller might inspect or (worse) manipulate in a way that interferes with the correct and reliable operation of the task.
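The "clean up before the caller sees completion" idea can be sketched as follows. This is a minimal illustration, not the SqlCommand implementation: the WrapWithCleanup helper and its cleanup parameter are hypothetical names introduced here, and the cleanup delegate stands in for something like disposing a CancellationTokenRegistration.

```csharp
using System;
using System.Threading.Tasks;

static class CleanupBeforeCompletion
{
    // Hypothetical helper: wraps 'inner' so that 'cleanup' always runs
    // before the returned task transitions to a completed state.
    public static Task<T> WrapWithCleanup<T>(Task<T> inner, Action cleanup)
    {
        var source = new TaskCompletionSource<T>();
        inner.ContinueWith(t =>
        {
            cleanup(); // e.g. dispose a CancellationTokenRegistration
            if (t.IsFaulted) source.SetException(t.Exception.InnerExceptions);
            else if (t.IsCanceled) source.SetCanceled();
            else source.SetResult(t.Result);
        }, TaskScheduler.Default);
        // The caller only ever sees this task, never 'inner'.
        return source.Task;
    }

    static async Task Main()
    {
        bool cleanedUp = false;
        Task<int> inner = Task.Delay(50).ContinueWith(_ => 42);
        int result = await WrapWithCleanup(inner, () => cleanedUp = true);
        Console.WriteLine($"{result}, cleanup ran first: {cleanedUp}");
    }
}
```

Returning inner directly would let the caller's continuation run before (or concurrently with) the cleanup continuation; routing completion through the TaskCompletionSource fixes the order.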
Why is it that the compute-bound methods can be implemented using Task.Run(), but not the I/O bound methods?
You can implement I/O bound operations using Task.Run(), but why would you? Doing so commits a thread to the operation which, for an operation that would not otherwise require a thread, is wasteful.
I/O bound operations generally have support from an I/O completion port and the IOCP thread pool (the threads of which handle completions of an arbitrarily large number of IOCPs) and so it is more efficient to simply use the existing asynchronous I/O API, rather than to use Task.Run() to call a synchronous I/O method.
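To make the difference concrete, here is a rough sketch of two ways to write a delay. The method names DelayWithTcs and DelayWithThread are made up for this example; the first follows the TaskCompletionSource-plus-timer shape the book quote refers to, the second ties up a pool thread for the whole wait.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class DelayExamples
{
    // Promise-style: no thread is blocked while waiting.
    // A timer callback completes the task when the time is up.
    public static Task DelayWithTcs(int milliseconds)
    {
        var tcs = new TaskCompletionSource<object>();
        Timer timer = null;
        timer = new Timer(_ =>
        {
            timer.Dispose();
            tcs.TrySetResult(null);
        }, null, Timeout.Infinite, Timeout.Infinite);
        // Start the timer only after 'timer' has been assigned.
        timer.Change(milliseconds, Timeout.Infinite);
        return tcs.Task;
    }

    // Thread-bound: a pool thread sits in Sleep for the whole duration.
    public static Task DelayWithThread(int milliseconds)
    {
        return Task.Run(() => Thread.Sleep(milliseconds));
    }

    static async Task Main()
    {
        await DelayWithTcs(100);
        await DelayWithThread(100);
        Console.WriteLine("both delays completed");
    }
}
```

Both methods look the same to the caller; the cost difference only shows up when many such operations are in flight at once.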

Related

Long running synchronous implementation of an interface that returns a Task

I'm using this question as a basis for my question.
TL;DR: If you're not supposed to wrap synchronous code in an async wrapper, how do you deal with long-running, thread-blocking methods that implement an interface method which expects an asynchronous implementation?
Let's say I have an application that runs continuously to process a work queue. It's a server-side application (running mostly unattended) but it has a UI client to give a more-fine-grained control over the behavior of the application as required by the business processes: start, stop, tweak parameters during execution, get progress, etc.
There's a business logic layer into which services are injected as dependencies.
The BLL defines a set of interfaces for those services.
I want to keep the client responsive: allow UI client to interact with the running process, and I also want threads to be used efficiently because the process needs to be scalable: there could be any number of asynchronous database or disk operations depending on the work in the queue. Thus I'm employing async/await "all the way".
To that end, I have methods in the service interfaces that are obviously designed to encourage async/await and support for cancellation because they take a CancellationToken, are named with "Async", and return Tasks.
I have a data repository service that performs CRUD operations to persist my domain entities. Let's say that at the present time, I'm using an API for this that doesn't natively support async. In the future, I may replace this with one that does, but for the time being the data repository service performs the majority of its operations synchronously, many of them long-running operations (because the API blocks on the database IO).
Now, I understand that methods returning Tasks can run synchronously. The methods in my service class that implement the interfaces in my BLL will run synchronously as I explained, but the consumer (my BLL, client, etc) will assume they are either 1: running asynchronously or 2: running synchronously for a very short time. What the methods shouldn't do is wrap synchronous code inside an async call to Task.Run.
I know I could define both sync and async methods in the interface.
In this case I don't want to do that because of the async "all the way" semantics I'm trying to employ and because I'm not writing an API to be consumed by a customer; as mentioned above, I don't want to change my BLL code later from using the sync version to using the async version.
Here's the data service interface:
public interface IDataRepository
{
Task<IReadOnlyCollection<Widget>>
GetAllWidgetsAsync(CancellationToken cancellationToken);
}
And its implementation:
public sealed class DataRepository : IDataRepository
{
public Task<IReadOnlyCollection<Widget>> GetAllWidgetsAsync(
CancellationToken cancellationToken)
{
/******* The idea is that this will
/******* all be replaced hopefully soon by an ORM tool. */
var ret = new List<Widget>();
// use synchronous API to load records from DB
var ds = Api.GetSqlServerDataSet(
"SELECT ID, Name, Description FROM Widgets", DataResources.ConnectionString);
foreach (DataRow row in ds.Tables[0].Rows)
{
cancellationToken.ThrowIfCancellationRequested();
// build a widget for the row, add to return.
}
// simulate long-running CPU-bound operation.
DateTime start = DateTime.Now;
while (DateTime.Now.Subtract(start).TotalSeconds < 10) { }
return Task.FromResult((IReadOnlyCollection<Widget>) ret.AsReadOnly());
}
}
The BLL:
public sealed class WorkRunner
{
private readonly IDataRepository _dataRepository;
public WorkRunner(IDataRepository dataRepository) => _dataRepository = dataRepository;
public async Task RunAsync(CancellationToken cancellationToken)
{
var allWidgets = await _dataRepository
.GetAllWidgetsAsync(cancellationToken).ConfigureAwait(false);
// I'm using Task.Run here because I want this on
// another thread even if the above runs synchronously.
await Task.Run(async () =>
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
foreach (var widget in allWidgets) { /* do something */ }
await Task.Delay(2000, cancellationToken); // wait some arbitrary time.
}
}).ConfigureAwait(false);
}
}
Presentation and presentation logic:
private async void HandleStartStopButtonClick(object sender, EventArgs e)
{
if (!_isRunning)
{
await DoStart();
}
else
{
DoStop();
}
}
private async Task DoStart()
{
_isRunning = true;
var runner = new WorkRunner(_dependencyContainer.Resolve<IDataRepository>());
_cancellationTokenSource = new CancellationTokenSource();
try
{
_startStopButton.Text = "Stop";
_resultsTextBox.Clear();
await runner.RunAsync(_cancellationTokenSource.Token);
// set results info in UI (invoking on UI thread).
}
catch (OperationCanceledException)
{
_resultsTextBox.Text = "Canceled early.";
}
catch (Exception ex)
{
_resultsTextBox.Text = ex.ToString();
}
finally
{
_startStopButton.Text = "Start";
}
}
private void DoStop()
{
_cancellationTokenSource.Cancel();
_isRunning = false;
}
So the question is: how do you deal with long-running, blocking methods that implement an interface method which expects an asynchronous implementation? Is this an example where it's preferable to break the "no async wrapper for sync code" rule?
You are not exposing asynchronous wrappers for synchronous methods. You are not the author of the external library, you are the client. As the client, you are adapting the library API to your service interface.
The key reasons for the advice against using asynchronous wrappers for synchronous methods are (summarised from the MSDN article referenced in the question):
(1) to ensure the client has knowledge of the true nature of any synchronous library function
(2) to give the client control over how to invoke the function (async or sync)
(3) to avoid increasing the surface area of the library by having 2 versions of every function
With respect to your service interface, by defining only async methods you are choosing to invoke the library operations asynchronously no matter what. You are effectively saying, I've made my choice for (2) regardless of (1). And you've given a reasonable reason - long term you know your sync library API will be replaced.
As a side point, even though your external library API functions are synchronous, they are not long-running CPU bound. As you said, they block on IO. They are actually IO-bound. They just block the thread waiting for IO rather than releasing it.
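As a sketch of what "adapting the library API to your service interface" could look like: the names below (IWidgetSource, LegacyApi, LegacyWidgetSource) are hypothetical stand-ins, not types from the question, and Task.Run is used here only as the adapter's way of keeping the caller's thread free while the blocking call runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical async-only service interface owned by the application.
public interface IWidgetSource
{
    Task<IReadOnlyCollection<string>> GetAllAsync(CancellationToken cancellationToken);
}

// Hypothetical stand-in for a blocking third-party API.
static class LegacyApi
{
    public static List<string> LoadNames()
    {
        Thread.Sleep(200); // simulates blocking on database IO
        return new List<string> { "a", "b" };
    }
}

// Adapter: the application (the library's client) chooses to call it asynchronously.
public sealed class LegacyWidgetSource : IWidgetSource
{
    public Task<IReadOnlyCollection<string>> GetAllAsync(CancellationToken cancellationToken)
    {
        return Task.Run<IReadOnlyCollection<string>>(() =>
        {
            cancellationToken.ThrowIfCancellationRequested();
            return LegacyApi.LoadNames().AsReadOnly();
        }, cancellationToken);
    }
}

static class Program
{
    static async Task Main()
    {
        IWidgetSource source = new LegacyWidgetSource();
        var names = await source.GetAllAsync(CancellationToken.None);
        Console.WriteLine(names.Count);
    }
}
```

When a truly asynchronous data API becomes available, only LegacyWidgetSource needs to change; callers of IWidgetSource stay as they are.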

Is awaiting methods from synchronous sources with await Task.Run(() => good practice?

I have a method marked with the async keyword that returns a Task. It returns a string that comes from new JwtSecurityTokenHandler().WriteToken(t). The thing is, none of the statements in the body of the method are awaitable, so I get warning CS1998. It says you shouldn't use async for synchronous methods, which makes complete sense. But then it adds that you can use await Task.Run(() => { ... }). So is it good practice to do this?
public async Task<object> GenerateMyUserJwtToken(string email, IdentityUser user)
{
// code that isn't awaitable
var u = await Task.Run(() =>
{
return new JwtSecurityTokenHandler().WriteToken(token);
});
return u;
}
Edit: I did not ask what the warning was. I asked whether it is a good idea to use await Task.Run(() => ...) in an async method that otherwise has nothing to await. I also asked about another async method that awaits this one. Here is the code:
//awaiting method:
public async Task<object> LoginAsync(LoginDto model)
{
return await GenerateMyUserJwtToken(model.Email, appUser);
}
//controller:
[HttpPost("login")]
public async Task<object> Login([FromBody] LoginDto model)
{
var logMeIn = await new AuthUserService().LoginAsync(model);
return logMeIn; //returns token
}
My question is: is this async all the way, or does Task.Run break that chain?
Using Task.Run just to make something async is generally bad practice, but that cannot be stated as an absolute rule.
If the sync method may take a long time to execute, then it can be a solution. Please note that Task.Run assigns the work to a pool thread, which is not always desirable. It is a common misunderstanding that async methods always use, or should use, threads somewhere at the end of the async-await chain. However, async-await has nothing to do with threads. It is about asynchronicity (chaining deferred tasks), and creating threads is just one way to create awaitable tasks.
So what are the options?
The method to call is fast and never blocks the caller for a long time (more than 100 ms or so): do not use async at all. In this case Task.FromResult(result) is a tempting solution but is highly discouraged because it is misleading for the caller. Use it only in unit tests or if you are forced to implement an async method of an interface you cannot change.
The method takes a long time to execute because it is CPU bound: now you can use a thread. But I typically would not use pool threads for long-lasting tasks, as that can cause nasty side effects if the thread pool runs out of threads. Use await Task.Factory.StartNew(() => MyLongRunningTask(), cancellationToken, TaskCreationOptions.LongRunning, TaskScheduler.Default); instead, which creates a brand new thread instead of bothering the pool.
The method takes a long time to execute because it is IO bound (e.g. sending/receiving packets via hardware): use TaskCompletionSource<T>, add a hook to whatever completion event the device offers (e.g. an OS hook or IRQ notification), set the result of the completion source from that hook, and return its task.
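The IO-bound option can be sketched like this. FakeDevice, its DataReceived event, and ReadOnceAsync are invented for illustration; a real device or OS API would expose its own completion callback, and the Task.Delay inside BeginRead only simulates the hardware signalling later.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical device that reports completion through an event.
sealed class FakeDevice
{
    public event EventHandler<int> DataReceived;

    public void BeginRead()
    {
        // Simulates the hardware signalling completion about 100 ms later.
        Task.Delay(100).ContinueWith(_ => { DataReceived?.Invoke(this, 7); });
    }
}

static class Program
{
    // Hooks the completion event, and completes the task from the hook.
    static Task<int> ReadOnceAsync(FakeDevice device)
    {
        var tcs = new TaskCompletionSource<int>();
        EventHandler<int> handler = null;
        handler = (sender, value) =>
        {
            device.DataReceived -= handler; // one-shot: unhook before completing
            tcs.TrySetResult(value);
        };
        device.DataReceived += handler;
        device.BeginRead();
        return tcs.Task; // no thread waits while the read is pending
    }

    static async Task Main()
    {
        int value = await ReadOnceAsync(new FakeDevice());
        Console.WriteLine(value);
    }
}
```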

How does a Task get to know when it has completed?

For performance reasons, a Task does not maintain a wait handle; it constructs one lazily, only if the code asks for it.
How then does a Task know it has been completed?
One would argue that the implementer sets the result on the TaskCompletionSource in their implementation but that would explain only the modern implementations and re-writes such as System.IO.FileStream.Begin/EndReadTask.
I followed the Task.IsCompleted property; in almost every instance, an internal bitwise flag field (m_stateFlags) is set by the TrySetResult / TrySetException methods to indicate the status of the task.
But that does not cover all cases.
What about a method such as this?
public async Task FooAsync()
{
await Task.Run(() => { });
}
How then does a Task know it has been completed?
As I describe on my blog (overview, more detail), there are two kinds of tasks: Delegate Tasks (which execute code) and Promise Tasks (which represent an event).
Delegate Tasks complete themselves when their delegate completes.
Promise Tasks are completed from an external signal, using TaskCompletionSource<T> (or equivalent methods that are internal to the BCL).
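A small sketch of the two kinds side by side (the variable names are only for illustration):

```csharp
using System;
using System.Threading.Tasks;

static class Program
{
    static async Task Main()
    {
        // Delegate task: has code to run, and completes when that code returns.
        Task<int> delegateTask = Task.Run(() => 1 + 1);

        // Promise task: has no code of its own; it completes when
        // something calls SetResult / SetException / SetCanceled.
        var tcs = new TaskCompletionSource<int>();
        Task<int> promiseTask = tcs.Task;

        Console.WriteLine(promiseTask.IsCompleted); // False
        tcs.SetResult(2);                           // the external signal
        Console.WriteLine(promiseTask.IsCompleted); // True

        Console.WriteLine(await delegateTask + await promiseTask); // 4
    }
}
```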
I am answering my own question because I have suddenly remembered that I know the answer to it.
When using the C# Language Support Features
It's the state machine.
If the implementer of the asynchronous method used the C# language support such as the async keyword in the method declaration and the await keyword inside the method body to await an operation intrinsic to the task, then to the extent of the task he is implementing, the state machine signals task completion by setting the result of the task.
For e.g. if his implementation was as such:
// client code
public async void TopLevelMethod()
{
await MyMethodAsync();
}
// library code -- his implementation
public async Task MyMethodAsync()
{
await AnotherOperationAsync();
}
Then the completion of MyMethodAsync will be entrusted to the compiler generated state machine.
Of course, the signaling of completion of AnotherOperationAsync will also be taken care of by the compiler generated state machine, but that is not the point here.
Recall the states inside the MoveNext method indicate the task completion states and in the block inside of MoveNext that invokes the continuation callback, it also calls SetResult on the AsyncXXXMethodBuilder.
When not using the C# Language Support Features
If, however, the implementer of the asynchronous method did not make use of the C# language features, then it is the duty of the implementer to signal the completion of the task by setting the relevant result, exception or cancelled properties on the TaskCompletionSource object.
For e.g.
public Task MyMethodAsync()
{
var tcs = new TaskCompletionSource<object>();
try
{
AnotherOperation();
tcs.SetResult(null); // TaskCompletionSource<object>.SetResult requires an argument
}
catch(Exception ex)
{
tcs.SetException(ex);
}
return tcs.Task;
}
If the implementer did not use TPL support or invoked another operation asynchronously using the older .NET API, then too, it is the implementer's responsibility to signal task completion by explicitly setting the status of the task through one of the Try/SetResult/Exception etc. methods.
For e.g.
public Task MyMethodAsync()
{
var tcs = new TaskCompletionSource...
var autoResetEvent = ...
ThreadPool.QueueUserWorkItem(new WaitCallback(_ =>
{
/* Work */
Thread.SpinWait(1000);
tcs.SetResult(...);
autoResetEvent.Set();
})...;
return tcs.Task;
}
An Ill-Advised Case
The best way to await a task is, of course, to use the await keyword. If, however, when implementing an asynchronous API, the implementer does this:
public Task MyMethodAsync()
{
return Task.Run(...);
}
That would leave the consumer of his API with a sour mouth, I suppose?
Task.Run should only ever be used in a fire and forget scenario where you do not care about the point in time when the task will be completed.
The one exception to this is if you awaited the task returned by the call to Task.Run using the await keyword, like the code snippet shown below, in which case, you would be using the language support as described in the first section.
public async Task MyMethodAsync()
{
await Task.Run(...);
}

How to wait synchronously on cancellation of async task

I have a generic class which runs certain work packages asynchronously. It has the possibility to cancel the execution of all tasks and wait synchronously on the completion of the canceled tasks. The cancellation is triggered before a new transaction (the user does something) starts. This is necessary because the asynchronous task as well as the new transaction would change the same objects, but the asynchronous task would do it while assuming the state before the transaction.
Here is some sample code showing why this behavior is so important:
private void Transaction()
{
asyncExecution.AbortAllAsyncWork();
DoTransaction();
}
This method is called synchronously, and DoTransaction changes objects which are also changed in the asynchronous tasks. For example, lists could be changed while they are being iterated.
Previously I achieved this behavior with the ContinueWith method on tasks where I passed a synchronous task scheduler. All in all it was hard to understand and seemed kind of dirty. Therefore I wondered if I could achieve the same behavior with the new async-await feature. The problem here lies in a deadlock described here. The code so far with the deadlock problem:
public void RunAsync<TWork, TResult>(IIncrementalAsyncExecutable<TWork, TResult> executable, TWork initialWork) where TWork : class
{
Task workingTask = RunAsyncInternal(executable, initialWork, CancellationTokenSource);
if (IsRunning(workingTask))
{
workingTasks.Add(workingTask);
}
}
private async Task RunAsyncInternal<TWork, TResult>(IIncrementalAsyncExecutable<TWork, TResult> executable, TWork initialWork, CancellationTokenSource tokenSource) where TWork : class
{
while (!executable.WorkDone)
{
TResult result = await Task.Run(() => executable.CalculateNextStep(initialWork));
executable.SyncResult(result);
if (tokenSource.IsCancellationRequested)
{
return;
}
}
}
public void AbortAllAsyncWork()
{
CancellationTokenSource.Cancel();
foreach (Task workingTask in workingTasks)
{
if (IsRunning(workingTask))
{
workingTask.Wait(); // here is the deadlock problem
}
}
}
Is there a possibility to achieve this behavior with the new async-await feature without deadlock?
I have a generic class which runs certain work packages asynchronously.
It's really much easier to use built-in types for this, like TPL Dataflow. They've done all the hard work.
The problem is, that the application is designed synchronously.
Yes, but note that the problem is in the application's design. Waiting for tasks to complete is an inherently asynchronous operation, and the best solution is definitely Yuval's.
Previously I achieved this behavior with the ContinueWith method on tasks where I passed a synchronous task scheduler.
I don't see how that could possibly avoid the deadlock you're seeing.
I want to find a solution, where there is no refactoring necessary.
What you're really asking is "how do I do sync-over-async". The best answer is "don't". But there are a few hacks that you can use to force it to work in some scenarios. There is no general-purpose solution that always works.
These hacks are: just block, push the operations onto a background thread, and run a nested message loop. They're described in more detail on Stephen Toub's blog.
If each Task is going to be cancelled, you can simply await on them and catch the OperationCanceledException:
public async Task AbortAllAsyncWork()
{
CancellationTokenSource.Cancel();
foreach (Task workingTask in workingTasks.Keys)
{
if (IsRunning(workingTask))
{
try
{
await workingTask;
}
catch (OperationCanceledException)
{
// Do something useful
}
}
}
}
Or, you could simply await Task.WhenAll on all the tasks:
await Task.WhenAll(workingTasks.Keys);
Assuming Keys is an IEnumerable<Task>, the returned Task would be in a Canceled state.
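A minimal sketch of that behavior, using Task.Delay with an infinite timeout as a stand-in for the real work (an assumption made so the example is self-contained): awaiting the WhenAll task rethrows the cancellation, so the await still needs a try/catch.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Program
{
    static async Task Main()
    {
        var cts = new CancellationTokenSource();

        // Stand-ins for the working tasks: they only end when canceled.
        Task[] workingTasks =
        {
            Task.Delay(Timeout.Infinite, cts.Token),
            Task.Delay(Timeout.Infinite, cts.Token)
        };

        cts.Cancel();

        Task all = Task.WhenAll(workingTasks);
        try
        {
            await all; // throws because the tasks were canceled
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("all working tasks were canceled");
        }

        Console.WriteLine(all.IsCanceled);
    }
}
```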

HttpClient.SendAsync using the thread-pool instead of async IO?

So I've been digging into the implementation of HttpClient.SendAsync via Reflector. What I wanted to find out was the flow of execution of these methods, and which API gets called to execute the asynchronous IO work.
After exploring the various classes inside HttpClient, I saw that internally it uses HttpClientHandler which derives from HttpMessageHandler and implements its SendAsync method.
This is the implementation of HttpClientHandler.SendAsync:
protected internal override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
if (request == null)
{
throw new ArgumentNullException("request", SR.net_http_handler_norequest);
}
this.CheckDisposed();
this.SetOperationStarted();
TaskCompletionSource<HttpResponseMessage> source = new TaskCompletionSource<HttpResponseMessage>();
RequestState state = new RequestState
{
tcs = source,
cancellationToken = cancellationToken,
requestMessage = request
};
try
{
HttpWebRequest request2 = this.CreateAndPrepareWebRequest(request);
state.webRequest = request2;
cancellationToken.Register(onCancel, request2);
if (ExecutionContext.IsFlowSuppressed())
{
IWebProxy proxy = null;
if (this.useProxy)
{
proxy = this.proxy ?? WebRequest.DefaultWebProxy;
}
if ((this.UseDefaultCredentials || (this.Credentials != null)) || ((proxy != null) && (proxy.Credentials != null)))
{
this.SafeCaptureIdenity(state);
}
}
Task.Factory.StartNew(this.startRequest, state);
}
catch (Exception exception)
{
this.HandleAsyncException(state, exception);
}
return source.Task;
}
What I found weird is that the above uses Task.Factory.StartNew to execute the request while generating a TaskCompletionSource<HttpResponseMessage> and returning the Task created by it.
Why do I find this weird? Well, we go on a lot about how I/O bound async operations have no need for extra threads behind the scenes, and how it's all about overlapped IO.
Why is this using Task.Factory.StartNew to fire an async I/O operation? This means that SendAsync isn't using purely async control flow to execute this method, but is spinning up a ThreadPool thread "behind our back" to execute its work.
this.startRequest is a delegate that points to StartRequest which in turn uses HttpWebRequest.BeginGetResponse to start async IO. HttpClient is using async IO under the covers, just wrapped in a thread-pool Task.
That said, note the following comment in SendAsync
// BeginGetResponse/BeginGetRequestStream have a lot of setup work to do before becoming async
// (proxy, dns, connection pooling, etc). Run these on a separate thread.
// Do not provide a cancellation token; if this helper task could be canceled before starting then
// nobody would complete the tcs.
Task.Factory.StartNew(startRequest, state);
This works around a well-known problem with HttpWebRequest: Some of its processing stages are synchronous. That is a flaw in that API. HttpClient is avoiding blocking by moving that DNS work to the thread-pool.
Is that good or bad? It is good because it makes HttpClient non-blocking and suitable for use in a UI. It is bad because we are now using a thread for long-running blocking work although we expected to not use threads at all. This reduces the benefits of using async IO.
Actually, this is a nice example of mixing sync and async IO. There is nothing inherently wrong with using both. HttpClient and HttpWebRequest are using async IO for long-running blocking work (the HTTP request). They are using threads for short-running work (DNS, ...). That's not a bad pattern in general. We are avoiding most blocking and we only have to make a small part of the code async. A typical 80-20 trade-off. It is not good to find such things in the BCL (a library) but in application level code that can be a very smart trade-off.
It seems it would have been preferable to fix HttpWebRequest. Maybe that is not possible for compatibility reasons.
