Is using Task.Run a bad practice?

Generally, when implementing an asynchronous method on a class, I write something like this:
public Task<Guid> GetMyObjectIdAsync(string objectName)
{
    return Task.Run(() => GetMyObjectId(objectName));
}
private Guid GetMyObjectId(string objectName)
{
    using (var unitOfWork = _myUnitOfWorkFactory.CreateUnitOfWork())
    {
        var myObject = unitOfWork.MyObjects.Single(o => o.Name == objectName);
        return myObject.Id;
    }
}
This sort of pattern allows me to use the same logic synchronously and asynchronously, depending on the situation (most of my work is in an old code base, not a lot supports async calls yet), as I could expose the synchronous method publicly and get maximum compatibility if I need to.
Recently I've read several SO posts that suggest using Task.Run() is a bad idea, and should only be used under certain circumstances, but those circumstances did not seem very clear.
Is the pattern I've depicted above actually a bad idea? Am I losing some of the functionality/ intended purpose of async calls doing it this way? Or is this a legit implementation?

What you are doing is offloading a synchronous operation to another thread. If your thread is "special" then that's perfectly fine. One example of a "special" thread is a UI thread. In that case you may want to offload work off of it to keep the UI responsive (another example is some kind of listener).
In most cases however you're just moving work around from one thread to another. This doesn't add any value and does add unnecessary overhead.
So:
Is the pattern I've depicted above actually a bad idea?
Yes, it is. It's a bad idea to offload synchronous work to the ThreadPool and pretend as if it's asynchronous.
Am I losing some of the functionality/ intended purpose of async calls doing it this way?
There's actually nothing asynchronous about this operation to begin with. If you're executing this on a remote machine and you can benefit from doing it asynchronously, the operation itself needs to be truly asynchronous, meaning:
var myObject = await unitOfWork.MyObjects.SingleAsync(o => o.Name == objectName);
What you're currently doing is called "async over sync" and you probably shouldn't do it. More in Should I expose asynchronous wrappers for synchronous methods?
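To make the contrast concrete, here is a minimal sketch. The class and method names are hypothetical, and Thread.Sleep and Task.Delay are only stand-ins for a blocking call and a naturally asynchronous one; this is not the asker's data layer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncOverSyncSketch
{
    // Hypothetical stand-in for a blocking lookup: the calling thread is held for the full wait.
    public static Guid GetIdBlocking()
    {
        Thread.Sleep(100);
        return Guid.NewGuid();
    }

    // "Async over sync": looks asynchronous, but a thread-pool thread is blocked inside Thread.Sleep.
    public static Task<Guid> GetIdWrappedAsync()
    {
        return Task.Run(() => GetIdBlocking());
    }

    // Naturally asynchronous stand-in: Task.Delay is timer-based, so no thread is held while waiting.
    public static async Task<Guid> GetIdTrulyAsync()
    {
        await Task.Delay(100);
        return Guid.NewGuid();
    }
}
```

Both methods return a Task<Guid> that completes after roughly the same time. The difference is what happens during the wait: the wrapped version occupies a pool thread, the other does not.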

Recently I've read several SO posts that suggest using Task.Run() is a bad idea, and should only be used under certain circumstances, but those circumstances did not seem very clear.
The absolutely bare bones rules of thumb I tell people who are new to asynchrony are:
First, understand the purpose. Asynchrony is for mitigating the important inefficiencies of high-latency operations.
Is the thing you're doing low-latency? Then don't make it asynchronous in any way. Just do the work. It's fast. Using a tool to mitigate latency on low-latency tasks is just making your program unnecessarily complex.
Is the thing you're doing high-latency because it is waiting on a disk to spin or a packet to show up? Make this asynchronous but do not put it on another thread. You don't hire a worker to sit by your mailbox waiting for letters to arrive; the postal system is already running asynchronously to you. You don't need to hire people to make it more asynchronous. Read "There Is No Thread" if that's not clear.
Is the high-latency work waiting on a CPU to do some enormous computation? Like a computation that is going to take well over 10 ms? Then offload that task onto a thread so that the thread can be scheduled to an idle CPU.
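The last two rules can be sketched as follows. The names are hypothetical, the prime count is just an arbitrary CPU-heavy stand-in, and File.ReadAllTextAsync assumes .NET Core 2.0 or later.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class LatencySketch
{
    // I/O-bound: await the asynchronous API directly; no Task.Run is involved.
    public static async Task<int> CountCharsAsync(string path)
    {
        string text = await File.ReadAllTextAsync(path);
        return text.Length;
    }

    // CPU-bound work stays a plain synchronous method.
    public static int CountPrimesBelow(int limit)
    {
        int count = 0;
        for (int n = 2; n < limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; (long)d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // A caller that must stay responsive offloads the computation to the thread pool.
    public static Task<int> CountPrimesOffloadedAsync(int limit)
    {
        return Task.Run(() => CountPrimesBelow(limit));
    }
}
```

The Task.Run sits with the caller that needs it (for example a UI event handler), while the computation itself is exposed as an ordinary synchronous method.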

public Task<Guid> GetMyObjectIdAsync(string objectName)
When I see this, I expect there to be some advantage in using this method rather than just wrapping it in Task.Run() myself.
In particular, I'd expect it to release the thread when it hits some I/O or otherwise has the opportunity to do so.
Now consider if I have the code:
_resource = GetResourceForID(GetMyObjectIdAsync(SomeLongRunningWayToGetName()));
If I have a reason to need to have this done in a task, and I'm in the sort of situation where Task.Run() does actually make sense (I have a reason to offload it onto another thread) the best way to do this would be to wrap the whole thing:
Task task = Task.Run(() => _resource = GetResourceForID(GetMyObjectIdAsync(SomeLongRunningWayToGetName())));
Here Task.Run() might be a bad idea for me as the caller, or it might be good because I really am gaining from what it gives me.
However, if I see your signature I'm going to think that the best way to do this with your code would be to turn it into code that uses that method.
Task task = SomeLongRunningWayToGetName()
.ContinueWith(t => GetMyObjectIdAsync(t.Result))
.ContinueWith(t => _resource = GetResourceForIDAsync(t.Result));
(Or similar using async and await).
At best this has less good chunking of the Task.Run(). At worst I'm awaiting this just to gain from the better asynchronicity that it doesn't offer, in a context that could make use of it if it were really there. (E.g., I might have used this in an MVC action that I'd made asynchronous because I thought the extra overhead would be repaid in better thread-pool use).
So while Task.Run() is sometimes useful, in this case it's always bad. If you can't offer me greater asynchronicity than I can bring to the use of the class myself, don't lead me to believe you do.
Only offer a public XXXAsync() method if it really does call into asynchronous I/O.
If you really need to stub out an asynchronous method to e.g. match a signature of a shared base or interface, then it would be better as:
public Task<Guid> GetMyObjectIdAsync(string objectName)
{
    return Task.FromResult(GetMyObjectId(objectName));
}
This is bad too (the caller would still have been better off just calling GetMyObjectId() directly), but at least if code awaits it then while it operates on the same thread there's no overhead of using yet another thread to do the work, so if it's mixed in with other awaits the negative impact is reduced. It's therefore useful if you really need to return a Task but can't add anything useful in how you call it.
But if you don't really need to offer it, just don't.
(A private method calling Run() because every call site benefits from it is different, and there you're just adding convenience rather than calling Run() in several places, but that should be well-documented as such).
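One detail worth noting with a Task.FromResult stub: the synchronous work runs before any task exists, so an exception is thrown directly at the call site rather than stored in the returned task. If callers expect failures to surface when they await, a sketch along these lines may help. The class name and the stand-in lookup are hypothetical; Task.FromException requires .NET Framework 4.6 or later.

```csharp
using System;
using System.Threading.Tasks;

public class ObjectIdLookup
{
    // Hypothetical synchronous implementation; throws when no name is given.
    public Guid GetMyObjectId(string objectName)
    {
        if (string.IsNullOrEmpty(objectName))
            throw new ArgumentException("A name is required.", nameof(objectName));
        return Guid.NewGuid();
    }

    // Synchronous stub behind a Task-returning signature.
    public Task<Guid> GetMyObjectIdAsync(string objectName)
    {
        try
        {
            return Task.FromResult(GetMyObjectId(objectName));
        }
        catch (Exception ex)
        {
            // Store the failure in the task so it is observed on await, not at the call.
            return Task.FromException<Guid>(ex);
        }
    }
}
```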

Related

Refactoring old code to use async keyword

I have started refactoring old code to become async. If a method is not straightforward to convert to async, I copy the old method, create a new MethodAsync(), and then upgrade the clients one by one to use the new async method when I can.
But I would instead like to avoid creating these duplicate methods. So I'm wondering if there are performance issues with the following two scenarios:
UPDATE: Ignore point 1
Method is async with await keyword(s). But the client is still not async so it is using .Result :- forcing it to sync behavior. Is this way of "forcing" sync behavior more bad than calling a method that does not use async keyword?
Update: The Async State machine created with the async keyword would introduce overhead. So another question would be how much impact does that have, keep in mind this project is huge in the numbers of several thousand methods.
And 4:- Are there any other factors than the creation of Async state machine that comes to play with the performance while the methods are not yet true async?
But the client is still not async so it is using .Result :- forcing it to sync behavior. Is this way of "forcing" sync behavior more bad than calling a method that does not use async keyword?
It's generally considered an antipattern. The main reason is that it can cause deadlocks, depending on the client.
Update: The Async State machine created with the async keyword would introduce overhead. So another question would be how much impact does that have, keep in mind this project is huge in the numbers of several thousand methods.
Any overhead from the async keyword is minimal, compared to the overhead of blocking threads on asynchronous code.
I would instead like to avoid creating these duplicate methods.
I recommend reading my article on Brownfield Async Development; in particular, the "boolean argument hack" is one I've used with some success in this scenario.
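As a rough sketch of how the "boolean argument hack" is usually described: a single private core method takes a flag saying whether it must run synchronously, and the public sync and async methods are thin wrappers around it. The names below are hypothetical and the stream read is only an example operation; the article itself is the authoritative description.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class StreamSketch
{
    // Single implementation shared by both public entry points.
    private static async Task<int> ReadFirstBytesCore(Stream stream, byte[] buffer, bool sync)
    {
        if (sync)
        {
            // On the sync path nothing is awaited, so the returned task is already complete.
            return stream.Read(buffer, 0, buffer.Length);
        }
        return await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
    }

    public static int ReadFirstBytes(Stream stream, byte[] buffer)
    {
        // Does not actually block: the sync path completed before returning.
        return ReadFirstBytesCore(stream, buffer, sync: true).GetAwaiter().GetResult();
    }

    public static Task<int> ReadFirstBytesAsync(Stream stream, byte[] buffer)
    {
        return ReadFirstBytesCore(stream, buffer, sync: false);
    }
}
```

The pattern only holds if every operation on the sync path is genuinely synchronous; a single await that does not complete immediately brings back the blocking problem.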
Is this way of "forcing" sync behavior more bad than calling a method that does not use async keyword?
As mentioned by @Mong Zhu, this risks causing deadlocks. Say that the method is awaiting some work that needs to be done on the main thread; calling .Result on the main thread will prevent the main thread from being used, effectively deadlocking the application.
The Async State machine created with the async keyword would introduce overhead. So another question would be how much impact does that have
This is almost impossible to answer since performance spans multiple orders of magnitude. The overhead should be small when doing things Async/await is intended for, i.e. IO operations or compute bound operations sufficiently slow for the user to notice. The general recommendation is as always: measure.
Are there any other factors than the creation of Async state machine that comes to play with the performance while the methods are not yet true async
The state machine is simply a hidden class with a large switch statement. I have not measured this, but I suspect the overhead of the state machine is smaller than scheduling work on another thread. However, marking a method as async is pointless unless you are using await. If you are returning tasks but not using await you can either:
Return the task returned from another method. Say that you do some initial work and call ReadToEndAsync that returns a Task<string>, just return the task as is.
Use Task.FromResult(myObject) to create a completed task, or Task.CompletedTask.
Edit: Return a ValueTask instead, this is either a task or a TResult. Since this is a struct there is very little overhead to create one.
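A minimal sketch of the ValueTask option, assuming a hypothetical cache in front of an asynchronous load (Task.Delay stands in for the real I/O):

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class CachedLookup
{
    private readonly ConcurrentDictionary<string, int> _cache = new ConcurrentDictionary<string, int>();

    public ValueTask<int> GetLengthAsync(string key)
    {
        // Cache hit: wrap the value directly, with no Task allocation.
        if (_cache.TryGetValue(key, out int cached))
            return new ValueTask<int>(cached);

        // Cache miss: wrap the Task produced by the asynchronous path.
        return new ValueTask<int>(LoadAsync(key));
    }

    private async Task<int> LoadAsync(string key)
    {
        await Task.Delay(10); // stand-in for real asynchronous I/O
        int value = key.Length;
        _cache[key] = value;
        return value;
    }
}
```

Callers await the ValueTask<int> once, exactly as they would a Task<int>.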

How to properly implement an interface that was designed for async usage?

Starting with the following (simplified) interface which has async/await in mind, I want to implement it by using LiteDB database.
public interface IDataService
{
    Task<User> GetUserAsync(int key);
}
Unfortunately, LiteDB currently does not support async methods. All methods are synchronous.
I tried the following, but later ran into some issues with that (process did not really wait for whatever reason). Following that, I read that you should only use Task.Run() for CPU bound algorithms, and therefore not for database access.
public Task<User> GetUserAsync(int key)
{
    var task = Task.Run(() =>
    {
        return _users
            .Find(x => x.Key == key)
            .SingleOrDefault();
    });
    return task;
}
Even though I read a couple of blog articles and also SO questions, it is still unclear to me how to write a proper awaitable method for synchronous, I/O-based code.
So how can I write a properly awaitable method?
Note: I can't change the interface. This is given.
You must not use Task.Run. It is not a question of whether that is the morally correct thing to do. As you note, it is morally wrong to assign worker threads to tasks that are not CPU bound, but that's not the issue here. The issue here is that LiteDB is single-threaded and is not built to be robust in the face of attempting to move calls into it onto worker threads.
UPDATE: Apparently LiteDB is thread safe in version 4, according to a commenter who is likely to be more informed than I am on this issue. So, consult LiteDB documentation for what kind of thread safe it is. Not all thread safe objects can be used without restriction on any thread.
Your choices are:
Abandon use of this interface.
As the other answer suggests, lie. Say that your implementation is async when it is in fact synchronous. This can cause your user interface to hang, but hey, you were going to hang your user interface anyways when you made the synchronous long-running call. Explicitly calling FromResult is the right way to do that.
Get a better database that supports asynchrony. Or wait for LiteDB to be a better database, that supports asynchrony.
Move all of the calls to LiteDB, including the creation and destruction of every object associated with this database, onto a dedicated thread. (Probably in-process, but hey, you could put it in its own process if you like.) Implement your own asynchronous front end to LiteDB that marshals calls to and from this dedicated thread appropriately. Yes, you burn an entire thread that spends most of its time sleeping, but that's the price you pay for using a synchronous database.
The solution that meets all your stated constraints is the last one. It's also the one that is the most work for you. As woodworkers like to say: what you save on cheap materials you'll spend on labour. Trying to retrofit a correct async interface onto a non-asynchronous but I/O bound single-threaded library is a tricky problem. Good luck!
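The dedicated-thread option could be sketched roughly as below. DedicatedThreadRunner is a hypothetical helper, not part of LiteDB or the framework; it only shows the marshalling idea, with one thread draining a queue of work items and each call completing a TaskCompletionSource.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class DedicatedThreadRunner : IDisposable
{
    private readonly BlockingCollection<Action> _work = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public DedicatedThreadRunner()
    {
        _thread = new Thread(() =>
        {
            // Runs queued work items one at a time until CompleteAdding is called.
            foreach (Action item in _work.GetConsumingEnumerable())
                item();
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    // Queues a synchronous call onto the dedicated thread and exposes its outcome as a task.
    public Task<T> RunAsync<T>(Func<T> call)
    {
        // RunContinuationsAsynchronously keeps awaiting code off the dedicated thread.
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        _work.Add(() =>
        {
            try { tcs.SetResult(call()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    public void Dispose()
    {
        _work.CompleteAdding();
        _thread.Join();
        _work.Dispose();
    }
}
```

Under these assumptions, GetUserAsync could return something like _runner.RunAsync(() => _users.Find(x => x.Key == key).SingleOrDefault()), with the database objects themselves also created and disposed through RunAsync so that they only ever live on that one thread.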
One option is to use Task.FromResult and implement your interface synchronously.
return Task.FromResult(_users.Find(x => x.Key == key).SingleOrDefault());
The payoff of async/await is that, for truly asynchronous I/O (on Windows, backed by I/O completion ports), the thread is released to serve other incoming requests until the awaited I/O operation completes. Since your implementation can't make use of this, wrapping your work in Task.Run or similar has no benefit.

How to synchronize TPL Tasks, by using Monitor / Mutex / Semaphore? Or should one use something else entirely?

I'm trying to move some of my old projects from ThreadPool and standalone Thread to TPL Task, because it supports some very handy features, like continuations with Task.ContinueWith (and from C# 5 with async\await), better cancellation, exception capturing, and so on. I'd love to use them in my project. However I already see potential problems, mostly with synchronization.
I've written some code which shows a Producer / Consumer problem, using a classic stand-alone Thread:
class ThreadSynchronizationTest
{
    private int CurrentNumber { get; set; }
    private object Synchro { get; set; }
    private Queue<int> WaitingNumbers { get; set; }
    public void TestSynchronization()
    {
        Synchro = new object();
        WaitingNumbers = new Queue<int>();
        var producerThread = new Thread(RunProducer);
        var consumerThread = new Thread(RunConsumer);
        producerThread.Start();
        consumerThread.Start();
        producerThread.Join();
        consumerThread.Join();
    }
    private int ProduceNumber()
    {
        CurrentNumber++;
        // Long running method. Sleeping as an example
        Thread.Sleep(100);
        return CurrentNumber;
    }
    private void ConsumeNumber(int number)
    {
        Console.WriteLine(number);
        // Long running method. Sleeping as an example
        Thread.Sleep(100);
    }
    private void RunProducer()
    {
        while (true)
        {
            int producedNumber = ProduceNumber();
            lock (Synchro)
            {
                WaitingNumbers.Enqueue(producedNumber);
                // Notify consumer about a new number
                Monitor.Pulse(Synchro);
            }
        }
    }
    private void RunConsumer()
    {
        while (true)
        {
            int numberToConsume;
            lock (Synchro)
            {
                // Ensure we met our wait condition
                while (WaitingNumbers.Count == 0)
                {
                    // Wait for pulse
                    Monitor.Wait(Synchro);
                }
                numberToConsume = WaitingNumbers.Dequeue();
            }
            ConsumeNumber(numberToConsume);
        }
    }
}
In this example, ProduceNumber generates a sequence of increasing integers, while ConsumeNumber writes them to the Console. If producing runs faster, numbers will be queued for consumption later. If consumption runs faster, the consumer will wait until a number is available. All synchronization is done using Monitor and lock (internally also Monitor).
When trying to 'TPL-ify' similar code, I already see a few issues I'm not sure how to go about. If I replace new Thread().Start() with Task.Run():
TPL Task is an abstraction, which does not even guarantee that the code will run on a separate thread. In my example, if the producer control method runs synchronously, the infinite loop will cause the consumer to never even start. According to MSDN, providing a TaskCreationOptions.LongRunning parameter when running the task should hint the TaskScheduler to run the method appropriately, however I didn't find any way to ensure that it does. Supposedly TPL is smart enough to run tasks the way the programmer intended, but that just seems like a bit of magic to me. And I don't like magic in programming.
If I understand how this works correctly, a TPL Task is not guaranteed to resume on the same thread as it started. If it resumes on a different thread, in this case it would try to release a lock it doesn't own while the other thread holds the lock forever, resulting in a deadlock. I remember a while ago Eric Lippert writing that it's the reason why await is not allowed in a lock block. Going back to my example, I'm not even sure how to go about solving this issue.
These are the few issues that crossed my mind, although there may be (probably are) more. How should I go about solving them?
Also, this made me think, is using the classical approach of synchronizing via Monitor, Mutex or Semaphore even the right way to do TPL code? Perhaps I'm missing something that I should be using instead?
Your question pushes the limits of broadness for Stack Overflow. Moving from plain Thread implementations to something based on Task and other TPL features involves a wide variety of considerations. Taken individually, each concern has almost certainly been addressed in a prior Stack Overflow Q&A, and taken in aggregate there are too many considerations to address competently and comprehensively in a single Stack Overflow Q&A.
So, with that said, let's look just at the specific issues you've asked about here.
TPL Task is an abstraction, which does not even guarantee that the code will run on a separate thread. In my example, if the producer control method runs synchronously, the infinite loop will cause the consumer to never even start. According to MSDN, providing a TaskCreationOptions.LongRunning parameter when running the task should hint the TaskScheduler to run the method appropriately, however I didn't find any way to ensure that it does. Supposedly TPL is smart enough to run tasks the way the programmer intended, but that just seems like a bit of magic to me. And I don't like magic in programming.
It is true that the Task object itself does not guarantee asynchronous behavior. For example, an async method which returns a Task object could contain no asynchronous operations at all, and could run for an extended period of time before returning an already-completed Task object.
On the other hand, Task.Run() is guaranteed to operate asynchronously. It is documented as such:
Queues the specified work to run on the ThreadPool and returns a task or Task<TResult> handle for that work
While the Task object itself abstracts the idea of a "future" or "promise" (to use synonymous terms found in programming), the specific implementation is very much tied to the thread pool. When used correctly, you can be assured of asynchronous operation.
If I understand how this works correctly, a TPL Task is not guaranteed to resume on the same thread as it started. If it resumes on a different thread, in this case it would try to release a lock it doesn't own while the other thread holds the lock forever, resulting in a deadlock. I remember a while ago Eric Lippert writing that it's the reason why await is not allowed in a lock block. Going back to my example, I'm not even sure how to go about solving this issue.
Only some synchronization objects are thread-specific. For example, Monitor is. But Semaphore is not. Whether this is useful to you or not depends on what you are trying to implement. For example, you can implement the producer/consumer pattern with a long running thread that uses BlockingCollection<T>, without needing to call any explicit synchronization objects at all. If you did want to use TPL techniques, you could use SemaphoreSlim and its WaitAsync() method.
Of course, you could also use the Dataflow API. For some scenarios this would be preferable. For very simple producer/consumer, it would probably be overkill. :)
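As an illustration of the BlockingCollection<T> approach, here is a minimal sketch. The names are hypothetical, and it uses a finite run instead of the question's infinite loops so that it terminates; both loops are started as LongRunning tasks.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    public static List<int> Run(int itemCount)
    {
        var consumed = new List<int>();
        using (var queue = new BlockingCollection<int>(boundedCapacity: 10))
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 1; i <= itemCount; i++)
                    queue.Add(i);          // blocks while the queue is full
                queue.CompleteAdding();    // lets the consumer's loop end
            }, TaskCreationOptions.LongRunning);

            Task consumer = Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty; ends after CompleteAdding once drained.
                foreach (int number in queue.GetConsumingEnumerable())
                    consumed.Add(number);
            }, TaskCreationOptions.LongRunning);

            Task.WaitAll(producer, consumer);
        }
        return consumed;
    }
}
```

The collection handles the waiting and signalling internally, so no explicit lock, Monitor.Wait, or Monitor.Pulse appears in the calling code.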
Also, this made me think, is using the classical approach of synchronizing via Monitor, Mutex or Semaphore even the right way to do TPL code? Perhaps I'm missing something that I should be using instead?
IMHO, this is the crux of the matter. Moving from Thread-based programming to the TPL is not simply a matter of a straight-forward mapping from one construct to another. In some cases, doing so would be inefficient, and in other cases it simply won't work.
Indeed, I would say a key feature of TPL and especially of async/await is that synchronization of threads is much less necessary. The general idea is to perform operations asynchronously, with minimal interaction between threads. Data flows between threads only at well-defined points (i.e. retrieved from the completed Task objects), reducing or even eliminating the need for explicit synchronization.
It's impossible to suggest specific techniques, as how best to implement something will depend on what exactly the goal is. But the short version is to understand that when using TPL, very often it is simply unnecessary to use synchronization primitives such as what you're used to using with the lower-level API. You should strive to develop enough experience with the TPL idioms that you can recognize which ones apply to which programming problems, so that you apply them directly rather than trying to mentally map your old knowledge.
In a way, this is (I think) analogous to learning a new human language. At first, one spends a lot of time mentally translating literally, possibly remapping to adjust to grammar, idioms, etc. But ideally at some point, one internalizes the language and is able to express oneself in that language directly. Personally, I've never gotten to that point when it comes to human languages, but I understand the concept in theory :). And I can tell you firsthand, it works quite well in the context of programming languages.
By the way, if you are interested in seeing how TPL ideas taken to extremes work out, you might like to read through Joe Duffy's recent blog articles on the topic. Indeed, the most recent version of .NET and associated languages have borrowed heavily from concepts developed in the Midori project he's describing.
Tasks in .NET are a hybrid. TPL brought tasks in .NET 4.0, but async-await only came with .NET 4.5.
There's a difference between the original tasks and the truly asynchronous tasks that came with async-await. The first is simply an abstraction of a "unit of work" that runs on some thread, but asynchronous tasks don't need a thread, or run anywhere at all.
The regular tasks (or Delegate Tasks) are queued on some TaskScheduler (usually by Task.Run that uses the ThreadPool) and are executed by the same thread throughout the task's lifetime. There's no problem at all in using a traditional lock here.
The asynchronous tasks (or Promise Tasks) usually don't have code to execute, they just represent an asynchronous operation that will complete in the future. Take Task.Delay(10000) for example. The task is created, and completed after 10 seconds but there's nothing running in the meantime. Here you can still use the traditional lock when appropriate (but not with an await inside the critical section) but you can also lock asynchronously with SemaphoreSlim.WaitAsync (or other async synchronization constructs)
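A minimal sketch of locking asynchronously with SemaphoreSlim.WaitAsync follows. The class is hypothetical; the Task.Delay inside the critical section is there only to show an await in a place where a lock block would not permit one.

```csharp
using System.Threading;
using System.Threading.Tasks;

public class AsyncCounter
{
    // Initial count 1, maximum count 1: at most one caller inside the critical section.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int _value;

    public async Task<int> IncrementAsync()
    {
        await _gate.WaitAsync();
        try
        {
            int current = _value;
            await Task.Delay(1);   // awaiting here is fine; inside lock it would not compile
            _value = current + 1;
            return _value;
        }
        finally
        {
            // SemaphoreSlim has no thread affinity, so releasing from another thread is fine.
            _gate.Release();
        }
    }
}
```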
Is using the classical approach of synchronizing via Monitor, Mutex or Semaphore even the right way to do TPL code?
It may be; that depends on what the code actually does and whether it uses TPL (i.e. Tasks) or async-await. However, there are many other tools you can now use, like async synchronization constructs (AsyncLock) and async data structures (TPL Dataflow).

Best solution for async chicken and egg story

I have been applying async best practices to all my libraries. Basically it means:
Only use async when it's truly async (libraries shouldn't lie)
Define a synchronous method if and only if you have a faster synchronous method that won’t deadlock.
Postfix all async methods with Async
I worked on a library that is synchronous by nature. This means it has only sync methods. If the user wants to run the work on a separate thread than the UI thread, they can do that themselves by using Task.Factory (responsibility of the caller).
However, inside a handler / method / extensibility point, we want to show the user a message box. This is an async method (for example, WinRT ShowDialogAsync). Then this gives us the following options:
A. Move everything to async (so we have the option to use await in our handlers and don't block anything).
public async Task MyMethodAsync()
{
    await _messageService.ShowAsync();
}
The advantage is that users can add async methods without having to use .Wait(). The downside is that we are lying as a library (it's not truly async).
I have considered making everything async, but I don't think that's a good idea either. It would make all libraries lie but prepare them in case we would need it. Remember that making everything async out of the box has a (small) performance impact as well.
B. Inside the handler that requires user input, call .Wait()
public void MyMethod()
{
    _messageService.ShowAsync().Wait();
}
The advantage is that this will allow us to use async code inside sync methods. But... it will never be callable from the UI thread because the _messageService dispatches to the UI thread (but it cannot do that because it's still waiting for the method, resulting in a deadlock). This method will work when used inside a Task.Run block (but the responsibility is up to the end-user):
await Task.Run(() => MyMethod());
The question
I feel that both have pros and cons, but what would you choose? Let the library lie (A) or only allow the method to be called from a background thread (B)? Or maybe there are other options I've overlooked.
If I go for A, it means I have to bump the major version every time (because it's actually a breaking change) whenever a user requests to convert a method to an async signature method.
Define a synchronous method if and only if you have a faster synchronous method that won’t deadlock.
I'd say "define a synchronous method if you have synchronous work to do". It doesn't matter how fast it is. The burden is on the caller to determine if it's too slow and they need to use Task.Run.
However, inside a handler / method / extensibility point
If this is an Observer kind of extensibility, consider just using events or observables.
However, it sounds like you want more of a Strategy kind of extensibility, where your invoking code must wait for and/or change its behavior based on the result of the callback.
I have considered making everything async, but I don't think that's a good idea either.
Async all the way is a guideline, not a strict command. It definitely applies in the 99% case, but this could be one of the exceptions. I would try not to make a library async just for the sake of a possibly-async Strategy pattern; I'd investigate other extension possibilities first. There is a valid argument for making the library async, if you view the Strategy callback as a dependency (the library would be async because its dependency is (possibly) async).
As you've discovered, there's no clean way to do sync-over-async. There are a few different hacks (such as blocking from a background thread), but you'll first need to decide whether you need to call your library from the UI thread.
If you do, then there are just two options: make the library async, or use a nested message loop. I strongly avoid nested message loops, especially in libraries; I'm just mentioning it for the sake of completeness.
If you can impose on the user a requirement to only call the library from a non-UI thread, then you can apply other hacks. E.g., blocking the background thread.
There's not an easy solution, sorry.
As far as me personally... if the library needs an async Strategy, then I would lean towards making the library async. But it does depend on what kind of library it is, whether there were backwards-compatibility issues, etc. And the first thing I'd look into is a different kind of extensibility point.
As you can read here:
https://msdn.microsoft.com/en-us/magazine/jj991977.aspx
Async All the Way
Asynchronous code reminds me of the story of a fellow who mentioned that the world was suspended in space and was immediately challenged by an elderly lady claiming that the world rested on the back of a giant turtle. When the man enquired what the turtle was standing on, the lady replied, “You’re very clever, young man, but it’s turtles all the way down!” As you convert synchronous code to asynchronous code, you’ll find that it works best if asynchronous code calls and is called by other asynchronous code—all the way down (or “up,” if you prefer). Others have also noticed the spreading behavior of asynchronous programming and have called it “contagious” or compared it to a zombie virus. Whether turtles or zombies, it’s definitely true that asynchronous code tends to drive surrounding code to also be asynchronous. This behavior is inherent in all types of asynchronous programming, not just the new async/await keywords.
“Async all the way” means that you shouldn’t mix synchronous and asynchronous code without carefully considering the consequences. In particular, it’s usually a bad idea to block on async code by calling Task.Wait or Task.Result. This is an especially common problem for programmers who are “dipping their toes” into asynchronous programming, converting just a small part of their application and wrapping it in a synchronous API so the rest of the application is isolated from the changes. Unfortunately, they run into problems with deadlocks. After answering many async-related questions on the MSDN forums, Stack Overflow and e-mail, I can say this is by far the most-asked question by async newcomers once they learn the basics: “Why does my partially async code deadlock?”

Web API Sync Calls Best Practice

Probably this question has already been asked, but I never found a definitive answer. Let's say that I have a Web API 2.0 application hosted on IIS. I think I understand that the best practice (to prevent deadlocks on the client) is to always use async methods from the GUI event down to the HttpClient calls. This is good and it works. But what is the best practice if I have a client application that does not have a GUI (e.g. Windows Service, Console Application) but only synchronous methods from which to make the call? In this case, I use the following logic:
void MySyncMethodOnMyWindowServiceApp()
{
    list = GetDataAsync().Result.ToObject<List<MyClass>>();
}
async Task<JArray> GetDataAsync()
{
    var response = await Client.GetAsync(<...>).ConfigureAwait(false);
    return await response.Content.ReadAsAsync<JArray>().ConfigureAwait(false);
}
But unfortunately this can still cause deadlocks on the client that occur at random times on random machines.
The client app stops at this point and never returns:
var response = await Client.GetAsync(<...>).ConfigureAwait(false);
If it's something that can be run in the background and isn't forced to be synchronous, try wrapping the code (that calls the async method) in a Task.Run(). I'm not sure that'll solve a "deadlock" problem (if it's something out of sync, that's another issue), but if you want to benefit from async/await, if you don't have async all the way down, I'm not sure there's a benefit unless you run it in a background thread. I had a case where adding Task.Run() in a few places (in my case, from an MVC controller which I changed to be async) and calling async methods not only improved performance slightly, but it improved reliability (not sure that it was a "deadlock" but seemed like something similar) under heavier load.
You will find that using Task.Run() is regarded by some as a bad way to do it, but I really couldn't see a better way to do it in my situation, and it really did seem to be an improvement. Perhaps this is one of those things where there's the ideal way to do it vs. the way to make it work in the imperfect situation that you're in. :-)
[Updated due to requests for code]
So, as someone else posted, you should do "async all the way down". In my case, my data access wasn't async, but my UI was. So I went async as far down as I could, then wrapped my data calls with Task.Run in such a way that it made sense. That's the trick, I think: figure out whether things can sensibly run in parallel. Otherwise you're just being synchronous (if you start an async operation and immediately resolve it, you force the caller to wait for the answer). I had a number of reads that I could perform in parallel.
In the above example, I think you have to go async up as far as makes sense, and then at some point determine where you can spin off a thread and perform the operation independently of the other code. Let's say you have an operation that saves data, but you don't really need to wait for a response: you're saving it and you're done. The only thing you might have to watch out for is not to close the program without waiting for that thread/task to finish. Where it makes sense in your code is up to you.
The syntax is pretty easy. I took existing code and changed the controller action to an async method returning a Task of the class that was formerly returned directly.
var myTask = Task.Run(() =>
{
// ...some code that can run independently (in my case, loading data);
// it must return a value for myTask.Result to be available below
});
// ...other code that can run at the same time as the above...
// (otherTask is a second task started the same way)
await Task.WhenAll(myTask, otherTask);
// ...or...
await myTask;
// At this point, the result is available from the task
myDataValue = myTask.Result;
See MSDN for probably better examples:
https://msdn.microsoft.com/en-us/library/hh195051(v=vs.110).aspx
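As a self-contained illustration of the pattern described above (wrapping synchronous reads in Task.Run so they overlap), here is a minimal sketch. The names ParallelReads, LoadOrderCount, LoadCustomerName and LoadBothAsync are hypothetical stand-ins, and Thread.Sleep only simulates a blocking data call; this is not the code from my project.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelReads
{
    // Hypothetical synchronous data-access calls; Thread.Sleep simulates blocking I/O.
    public static int LoadOrderCount()
    {
        Thread.Sleep(200);
        return 42;
    }

    public static string LoadCustomerName()
    {
        Thread.Sleep(200);
        return "Contoso";
    }

    public static async Task<(int Orders, string Name)> LoadBothAsync()
    {
        // Start both blocking reads on thread-pool threads so they can overlap.
        Task<int> ordersTask = Task.Run(() => LoadOrderCount());
        Task<string> nameTask = Task.Run(() => LoadCustomerName());

        // The caller is not blocked here; it resumes once both tasks are done.
        await Task.WhenAll(ordersTask, nameTask);

        // Both tasks have completed, so these awaits return immediately.
        return (await ordersTask, await nameTask);
    }

    public static async Task Main()
    {
        var result = await LoadBothAsync();
        Console.WriteLine($"{result.Orders} orders for {result.Name}");
    }
}
```

Note that each Task.Run still ties up a thread-pool thread for the duration of the blocking call, so this only pays off when there really is independent work to overlap.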
[Update 2, more relevant for the original question]
Let's say that your data read is an async method.
private async Task<MyClass> Read()
You can call it, save the task, and await on it when ready:
var runTask = Read();
//... do other code that can run in parallel
await runTask;
So, for this purpose (calling async code, which is what the original poster is asking about), I don't think you need Task.Run(). Note, though, that you can't use "await" unless you're in an async method, so a synchronous caller needs an alternative such as Wait().
The trick is that without some code to run in parallel, there's little point in it, so thinking about multi-threading is still the point.
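For a synchronous caller that cannot use await, a minimal sketch of the blocking alternatives might look like this. SyncOverAsync, ReadAsync and ReadBlocking are hypothetical names, and Task.Delay stands in for a real async read.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsync
{
    // Hypothetical async read; Task.Delay simulates asynchronous I/O.
    public static async Task<int> ReadAsync()
    {
        await Task.Delay(100).ConfigureAwait(false);
        return 7;
    }

    // A synchronous method cannot use await, so it has to block on the task.
    public static int ReadBlocking()
    {
        Task<int> runTask = ReadAsync();

        // ...other synchronous work could run here while the read is in flight...

        // Blocks the calling thread until the task completes.
        // GetAwaiter().GetResult() rethrows the original exception on failure,
        // whereas Wait() and Result wrap it in an AggregateException.
        return runTask.GetAwaiter().GetResult();
    }

    public static void Main()
    {
        Console.WriteLine(ReadBlocking());
    }
}
```

All three forms (Wait(), Result, GetAwaiter().GetResult()) block the calling thread; they differ only in how failures surface.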
Using Task<T>.Result is the equivalent of Wait which will perform a synchronous block on the thread. Having async methods on the WebApi and then having all the callers synchronously blocking them effectively makes the WebApi method synchronous. Under load you will deadlock if the number of simultaneous Waits exceeds the server/app thread pool.
So remember the rule of thumb "async all the way down". You want the long-running task (getting the collection as a List) to be async. If the calling method must be sync, you want to make the conversion from async to sync (using either Result or Wait) as close to the "ground" as possible. Keep the long-running process async and keep the sync portion as short as possible. That will greatly reduce the length of time that threads are blocked.
So for example you can do something like this.
void MySyncMethodOnMyWindowServiceApp()
{
List<MyClass> myClasses = GetMyClassCollectionAsync().Result;
}
async Task<List<MyClass>> GetMyClassCollectionAsync()
{
var data = await GetDataAsync(); // <- long running call to remote WebApi?
return data.ToObject<List<MyClass>>();
}
The key part is the long running task remains async and not blocked because await is used.
Also, don't confuse responsiveness with scalability. Both are valid reasons for async. Yes, responsiveness is a reason for using async (to avoid blocking the UI thread). You are correct that this wouldn't apply to a back-end service; however, this isn't why async is used in a WebApi. The WebApi is also a non-GUI back-end process. If the only advantage of async code were responsiveness of the UI layer, then WebApi would be sync code from start to finish. The other reason for using async is scalability (avoiding deadlocks), and this is the reason why WebApi calls are plumbed async. Keeping the long-running processes async helps IIS make more efficient use of a limited number of threads. By default there are only 12 worker threads per core. This can be raised, but that isn't a magic bullet either, as threads are relatively expensive (about 1MB overhead per thread). await allows you to do more with less: more concurrent long-running processes on fewer threads before a deadlock occurs.
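To make the "more with less" point concrete, here is a rough sketch that starts many simulated long-running calls at once. ManyWaits, SimulatedRemoteCallAsync and RunAllAsync are hypothetical names, and Task.Delay stands in for a remote WebApi call; real I/O will behave differently in detail.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ManyWaits
{
    // Hypothetical remote call; no thread is held while the delay is pending.
    public static async Task<int> SimulatedRemoteCallAsync(int id)
    {
        await Task.Delay(500).ConfigureAwait(false);
        return id;
    }

    // Starts all calls up front, then asynchronously waits for every one of them.
    public static async Task<int> RunAllAsync(int count)
    {
        Task<int>[] calls = Enumerable.Range(0, count)
            .Select(i => SimulatedRemoteCallAsync(i))
            .ToArray();

        int[] results = await Task.WhenAll(calls);
        return results.Length;
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        int completed = await RunAllAsync(1000);
        Console.WriteLine($"{completed} calls finished in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Because the waits overlap and none of them occupies a thread, the total time should be much closer to one delay than to the sum of all of them. Replacing the await with Thread.Sleep inside Task.Run would instead need one blocked thread per call.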
The problem you are having with deadlocks must stem from something else. Your use of ConfigureAwait(false) prevents deadlocks here. Solve the bug and you are fine.
See "Should we switch to use async I/O by default?", to which the answer is "no". You should decide on a case-by-case basis and choose async when the benefits outweigh the costs. It is important to understand that async IO has a productivity cost associated with it. In non-GUI scenarios, only a few targeted scenarios derive any benefit at all from async IO. The benefits can be enormous, but only in those cases.
Here's another helpful post: https://stackoverflow.com/a/25087273/122718
