I have an async call (DoWorkAsync()) that I would like to start in a fire-and-forget way, i.e. I'm not interested in its result and would like the calling thread to continue even before the async method has finished.
What is the proper way to do this? I need this in both .NET Framework 4.6 and .NET Core 2, in case there are differences.
public async Task<MyResult> DoWorkAsync() { ... }

public void StarterA()
{
    Task.Run(() => DoWorkAsync());
}

public void StarterB()
{
    Task.Run(async () => await DoWorkAsync());
}
Is it one of those two or something different/better?
//edit: Ideally without any extra libraries.
What is the proper way to do this?
First, you need to decide whether you really want fire-and-forget. In my experience, about 90% of people who ask for this actually don't want fire-and-forget; they want a background processing service.
Specifically, fire-and-forget means:
You don't care when the action completes.
You don't care if there are any exceptions when executing the action.
You don't care if the action completes at all.
So the real-world use cases for fire-and-forget are astoundingly small. An action like updating a server-side cache would be OK. Sending emails, generating documents, or anything business related is not OK, because you would (1) want the action to be completed, and (2) get notified if the action had an error.
The vast majority of the time, people don't want fire-and-forget at all; they want a background processing service. The proper way to build one of those is to add a reliable queue (e.g., Azure Queue / Amazon SQS, or even a database), and have an independent background process (e.g., Azure Function / Amazon Lambda / .NET Core BackgroundService / Win32 service) processing that queue. This is essentially what Hangfire provides (using a database for a queue, and running the background process in-proc in the ASP.NET process).
Is it one of those two or something different/better?
In the general case, there are a number of small behavioral differences when eliding async and await. It's not something you would want to do "by default".
However, in this specific case - where the async lambda is only calling a single method - eliding async and await is fine.
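For the rare case where fire-and-forget really is what you want, here is a minimal sketch that needs no extra libraries. The names RunAndLogAsync, Starter and the stand-in DoWorkAsync are hypothetical, not from the question, and the logging callback is an assumption about how you would want to surface failures.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForgetSketch
{
    // Hypothetical stand-in for the question's DoWorkAsync().
    public static async Task<int> DoWorkAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    // Runs the work on the thread pool and swallows any exception after
    // passing it to the log callback, so the returned task never faults.
    public static async Task RunAndLogAsync(Func<Task> work, Action<Exception> log)
    {
        try
        {
            await Task.Run(work).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            log(ex);
        }
    }

    // Fire-and-forget caller: the discard makes it explicit that the task
    // is intentionally not awaited.
    public static void Starter()
    {
        _ = RunAndLogAsync(() => DoWorkAsync(), ex => Console.Error.WriteLine(ex));
    }
}
```

Note that this still gives no guarantee that the work completes; if the process shuts down first, the work is simply lost, which is exactly the trade-off described above.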
It depends on what you mean by proper :)
For instance: are you interested in the exceptions thrown in your "fire and forget" calls? If not, then this is sort of fine. What you might still need to think about is the environment in which the task lives.
For instance, suppose this is an ASP.NET application and you start the task while handling a request to an .aspx or .svc endpoint. Task.Run queues the work to a thread-pool thread, which is a background thread, and ASP.NET does not know the work exists. The application pool may recycle the process before your "fire and forget" task has completed.
So also think about which thread, and which process, your tasks live in.
I think this article gives you some useful information on that:
https://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
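Also note that a task whose result is never observed will not report its exceptions to you: the exception is stored on the Task and only surfaces when the task is awaited or when Wait() or Result is used. Source for that is the reference book for Microsoft exam 70-483.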
There is probably a free version of that online somewhere ;P https://www.amazon.com/Exam-Ref-70-483-Programming-C/dp/0735676828
It may also be useful to know that if you have an async method being called from a non-async method and you need its result, you can use .GetAwaiter().GetResult(). Note that this blocks the calling thread until the task completes.
Also I think it is important to note the difference between async and multi-threading.
Async is mainly useful when an operation waits on something other than the CPU, such as networking or other I/O. Using async then tells the system to go ahead and use the CPU somewhere else instead of blocking a thread just to wait for a response.
Multi-threading is the distribution of operations across different threads. Threads in .NET are either foreground or background: foreground threads keep the process alive, while background threads (including the thread-pool threads that tasks normally run on) do not. Once all foreground threads have finished, the process exits and any background threads still running are stopped.
This allows the CPU to work on different tasks at the same time.
Combining the two makes sure that on, say, a 4-thread CPU you are not stuck with 4 threads that are all blocked; threads that would otherwise wait on I/O are free to pick up other work.
I hope this gives you the information needed to do whatever it is you are doing :)
I have a C# console app processing around 100,000 JSON messages from RabbitMQ every 1 min
After getting each/a bunch of messages from RabbitMQ I then call
await Task.Run(async () =>
{
    // do lots of CPU stuff here, including 2 external API calls using await
});
Everything I've read says use await Task.Run for CPU bound operations. And use await async for the HTTP external calls.
If I change it to:
await Task.Run(() =>
Then it complains as I have an async API call in the lines below, so it needs the async keyword in the Task.Run statement.
There are about 2000+ lines of code (complex if/then business rules) in this section, and sometimes the API call is not needed.
So I'm faced with either a massive restructure of the application, with lots of testing needed, or, if it's OK to do API calls alongside the CPU-bound operations, leaving it as is.
To summarise, is this bad practice, or is it ok to have CPU bound work and API calls inside the same task? The task is processing one JSON message.
Everything I've read says use await task.run for cpu bound operations . And use await async for the http external calls
The general guideline is to use async/await for I/O. Task.Run is useful for CPU-bound operations if you need to offload them from a UI thread. In server scenarios such as ASP.NET, on the other hand, you wouldn't want to use Task.Run for CPU-bound code, because ASP.NET already schedules your code on a separate thread pool thread.
In your case, you have a Console application, which doesn't have a UI thread. But it also doesn't have that automatic scheduling onto a thread pool thread that ASP.NET gives you.
if its ok to do api calls alongside the cpu bound operations then i'll leave it as is.
This is fine either way. Since the code is awaiting the Task.Run, it won't continue (presumably processing the next message) until the operation completes on another thread pool thread. So the Task.Run isn't helping much, but it isn't hurting much, either.
If you need more performance - specifically, processing messages concurrently - then you should look into something like TPL Dataflow or System.Threading.Channels that would allow you to replace the Task.Run with a queue of work that can run in parallel. That would give you something more like what ASP.NET provides out of the box.
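As a rough illustration of the queue-of-work idea, here is a minimal sketch using System.Threading.Channels. ProcessAllAsync, the handle callback, the worker count and the capacity of 100 are all hypothetical choices, not something from the question; on .NET Framework the Channels types come from a NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelPipelineSketch
{
    // Processes every message with up to workerCount concurrent workers.
    // handle is a hypothetical per-message handler (CPU work plus awaited API calls).
    public static async Task ProcessAllAsync<T>(
        IEnumerable<T> messages, int workerCount, Func<T, Task> handle)
    {
        // Bounded, so the producer cannot run arbitrarily far ahead of the workers.
        Channel<T> channel = Channel.CreateBounded<T>(100);

        // Each worker drains the channel until the writer is completed.
        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(async () =>
            {
                while (await channel.Reader.WaitToReadAsync())
                {
                    while (channel.Reader.TryRead(out T message))
                    {
                        await handle(message);
                    }
                }
            }))
            .ToArray();

        foreach (T message in messages)
        {
            await channel.Writer.WriteAsync(message);
        }
        channel.Writer.Complete();

        await Task.WhenAll(workers);
    }
}
```

The sketch has no error handling: if handle throws, that worker stops, so a real implementation would need to decide whether to log and continue or to shut the pipeline down.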
General
(...) use await Task.Run for CPU bound operations. And use await async for the HTTP external calls.
This advice comes from the fact that if you run code that doesn't 'let go' enough then you may not get a lot of the benefit Tasks give, because the current thread / thread pool just cannot handle the work. In the extreme case, when you run 100% synchronous code, you won't get any parallelism because the current thread cannot let go and cannot do any other work; your tasks would be executed sequentially. It's important to remember that this is not a style issue: running busy synchronous code with Tasks just doesn't work well in some scenarios. In this sense the problem polices itself: if you structure the solution incorrectly, it doesn't do what you need.
If you run a mixture of busy and waiting then Task.Run may or may not be great and it will depend on the specific workload. If it works for you, it's fine - you're not doing anything incorrect.
Generally the picture is nuanced and tasks can be and are used to do all kinds of jobs. In certain circumstances the situation is clear cut - e.g. if you run long running work in the UI thread you will lock the UI which is bad or if you have (long-running) busy synchronous code. It's worth keeping in mind that this has been a problem before C# had Tasks.
BTW, if you look at the reference documentation for the Task.WhenAll method, it contains examples with both I/O (ping) and CPU (dummy for loop) style work. Yes, these are toy examples, but they show it isn't incorrect to run both types of work with tasks.
Parallel.ForEachAsync?
If you can use .NET 6, Parallel.ForEachAsync could improve performance of your solution and/or make the code look cleaner. Example on Twitter (as picture!).
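A minimal sketch of what that could look like, assuming .NET 6 or later. ProcessAllAsync, maxParallel and the handle callback are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncSketch
{
    // Runs handle for every message, with at most maxParallel running at once.
    // handle is a hypothetical per-message handler that may await API calls.
    public static Task ProcessAllAsync<T>(
        IEnumerable<T> messages,
        int maxParallel,
        Func<T, CancellationToken, ValueTask> handle)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        return Parallel.ForEachAsync(messages, options, handle);
    }
}
```

A caller would pass an async lambda such as async (message, ct) => { ... } as the handler and await the returned task.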
I've been struggling for some days now with deciding where to await and where not to.
I have a Repository class which fetches the data from the database.
Using Entity Framework, the code would be something like this:
public async Task<List<Object>> GetAsync()
{
return await context.Set<Object>().ToListAsync();
}
and the consumer:
var data = await GetAsync();
and at the top level I'm awaiting this method too.
Should I use await in only one of these methods?
Is there a performance penalty, e.g. does each await use extra resources or create a new thread?
I have checked the questions listed in the comments; they do not refer to the performance issues and just say that you can do it. I want the best practice and the reason why to do so or not.
I'd like to add to that.
There are some async methods where there is no need to use async/await keywords. It is important to detect this kind of misuse because adding the async modifier comes at a price.
E.g., you don't need the async/await keywords in your example:
public Task<List<Object>> GetAsync()
{
return context.Set<Object>().ToListAsync();
}
And then:
var data = await GetAsync();
Will be just fine. In this case, you are returning the Task<List<Object>> and then you are awaiting that in the place you directly work with objects.
I recommend installing async await helper
Let me get the essence of your question first: the confusion is about where to use async in the complete chain of calls and where not to, and how to assess the performance impact, since it might lead to the creation of more threads. If the question goes beyond this, add details in the comments and I will try to answer them too.
Let's divide and tackle each of them one by one.
Where to use async in the chain of calls, and where not?
Since you are using Entity Framework to access a database, I can safely assume you are doing I/O-based asynchronous processing, which is the most prominent use case for async across languages and frameworks. Use cases for CPU-based asynchronous processing are relatively limited (I will explain them too).
Async is a scalability feature, especially for I/O processing, rather than a performance feature. In simple words, with async processing a hosted server can cater to many times more I/O calls, because calls do not block: they just hand the request over to the network while the thread goes back to the pool, ready to serve another request. The whole hand-over takes a few milliseconds.
When the processing is complete, a software thread just needs to receive the result and pass it back to the client. That again takes a few milliseconds, often less than 1 ms if it is a pure pass-through I/O call with no logic.
What are the benefits?
Imagine making a synchronous I/O call to a database instead. Each thread involved just waits for the result to arrive, which may take a few seconds, so the impact is highly negative. In general, depending on the thread pool size, you may serve 25 to 50 requests at most, and those too rely on the number of cores available; the threads continuously waste resources while they sit idle waiting for the response.
With synchronous calls there is no way to serve 1000+ requests on the same setup, and I am being extremely conservative. Async can have a huge scalability impact: for lightweight calls, a single hosted process may serve millions of requests with ease.
With that background, where to use async in the complete chain?
Everywhere feasible, from beginning to end: from the entry point to the actual exit point that makes the I/O call, since that is the call that actually relieves the pool thread as it dispatches the request over the network.
Do remember, though, that an await at a given point does not let further code in the same method proceed, even though it relieves the thread. So if there are multiple independent calls, it is better to aggregate them with Task.WhenAll and await the representative task; it completes when all of them have finished, whether in success or error.
If the async chain is broken in between by something like Task.Wait or Task.Result, it is no longer a pure async call and will block the calling thread-pool thread.
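The Task.WhenAll point can be sketched as follows. FetchAsync and FetchBothAsync are hypothetical names, and Task.Delay merely stands in for a real database or HTTP call.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical I/O-bound call; Task.Delay stands in for a database or HTTP request.
    public static async Task<int> FetchAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<int> FetchBothAsync()
    {
        // Start both operations before awaiting either, so the waits overlap.
        Task<int> first = FetchAsync(1, 50);
        Task<int> second = FetchAsync(2, 50);

        // Results come back in the same order as the tasks were passed in.
        int[] results = await Task.WhenAll(first, second);
        return results[0] + results[1];
    }
}
```

Awaiting FetchAsync twice in sequence would give the same result, but the second call would not start until the first had finished.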
How can async be further enhanced?
In pure library calls, where the async operation is initiated from the thread pool, the dispatching thread can differ from the receiving one, and the call does not need to re-enter the same context, you should use ConfigureAwait(false). It means the continuation will not wait to re-enter the original context, which is a performance boost.
Like await, it makes sense to use ConfigureAwait(false) across the whole chain, from entry to end. This is valid only for libraries which rely extensively on thread pools.
Is there a thread created?
Best read this: Stephen Cleary, There Is No Thread.
A genuine I/O async call uses hardware-based concurrency to do the work and will not block the software threads.
Variations
CPU-based asynchronous processing, where you take work into the background because the current thread needs to stay responsive, mostly in UI frameworks like WPF.
Use cases
All kinds of systems, especially non-MS frameworks like Node.js, have async processing as their underlying principle, and the database server cluster on the receiving end is tuned to receive millions of calls and process them.
For B2C calls, it is expected that each request is lightweight with a limited payload.
Edit 1:
Just in this specific case as listed here, ToListAsync is asynchronous by default, so you can skip async/await as noted in various comments. Do review Stephen Cleary's article, though: in general that may not be a very good strategy, since the gains are minimal and the negative impact of incorrect usage can be high.
I'm making an app using Xamarin.Forms (XF) PCL. Even though I have already launched my app on the store, I'm still a newbie in the C# world. I'm having trouble especially with threads.
In XF/iOS, after my app has been running for a while (longer than a day), none of the Task.Run() calls in my code start a new thread any more. Someone suggested that I might be starting many threads that somehow never terminate, so new threads can't be started.
So I searched my project and I have Task.Run at about 20 places in my code.
I used it whenever I called an 'async Task' method, even when a background thread was not necessary.
So I'm going to change it to use 'async void'. I have already changed some of it like this, with no problems.
Let's say AAAAA() is an 'async Task' method from some NuGet library I'm using, so I cannot change the method.
void Something()
{
...
Task.Run(async () => await XXXXX.AAAAA());
...
}
to
async void Something()
{
...
await XXXXX.AAAAA();
...
}
But sometimes I can't easily change a method to async. In those cases I'm going to change it like this:
void Something()
{
...
AA();
...
}
async void AA()
{
await XXXXX.AAAAA();
}
Is this OK as long as a background thread is not necessary?
I ask because I have watched lots of videos saying not to use "async void".
I wonder if I can use it like this if there seems to be no problem.
Any advice will help me.
Thanks.
Don't use async void. It is considered a bad practice for several reasons.
Instead, try to solve your threading problems from the root with a good approach to asynchronous programming.
1. Define your task boundaries
Do not just "fire and forget". Expect your task to end and release resources. There are good reasons not to do Task.Run(...) and forget about it.
Async methods exist for a reason. They return in the future (to borrow the term Future from the Java world). If you fire too many async tasks that take a long time to complete or get stuck in a loop, you drain your system resources and may end up unable to spawn new tasks.
So analyze your problem; don't just run random methods from random packages. Design your workflow and identify the parallelism.
A simple, straightforward solution is Task.Run(() => ...).Wait(). This destroys all parallelism but constrains the resources and, most importantly, fits your synchronous programming.
2. A Task is not a Thread
While I discourage the unbounded/uncontrolled use of threads, the truth is that Task.Run(...) won't necessarily spawn new threads. Under some circumstances the queued work may not even start for a while.
For example, I was forced to do this to force starting a new thread:
Task.Factory.StartNew(() => ...,
    cancellationToken: tokenSource.Token,
    creationOptions: TaskCreationOptions.LongRunning,
    scheduler: TaskScheduler.Default);
TaskCreationOptions.LongRunning hints to the scheduler that the work may run for a long time, and the default scheduler typically responds by giving it a dedicated thread instead of a thread-pool thread. Task.Run, by contrast, queues the work to the thread pool, where it waits for a free pool thread. If your synchronous code blocks the pool threads, the runtime may not get around to running other queued tasks.
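To see the difference for yourself, a small sketch like the following can help. RanOnPoolThread is a hypothetical helper; that LongRunning yields a non-pool thread is typical behavior of the default scheduler rather than a documented guarantee, so treat that branch's result as an observation, not a contract.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningSketch
{
    // Reports whether the delegate ran on a thread-pool thread.
    public static Task<bool> RanOnPoolThread(bool longRunning)
    {
        if (!longRunning)
        {
            // Task.Run queues the delegate to the thread pool.
            return Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);
        }

        // LongRunning is only a hint; the default scheduler usually answers it
        // with a dedicated thread that does not belong to the pool.
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```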
3. TPL is made for 2 things
One is responsiveness. If your application is completely asynchronous, then a good use of the TPL leaves your UI thread responsive during waits, e.g. if you click on a button you won't see the whole window greyed out and "stuck". This behaviour was introduced by Microsoft to help developers who are unfamiliar with proper multithreaded programming.
The other is I/O optimization. If you need to download 5 files, parse a text file from disk and store a bunch of rows in the database, you can fire 7 tasks that overlap the I/O wait times of each task (e.g. SSL handshake, disk buffering, SQL response wait), so that all 7 tasks will reasonably complete within the time of the longest one.
If you just invoke asynchronous methods because you found them in your NuGet library, you are doing it wrong, as you may need to invoke the corresponding synchronous version.
Summarizing
Your question reveals a lack of understanding of parallel programming. In fact you said you are new to C#. Welcome to the world of .NET.
Parallel programming is not easy, and without knowledge of your application design it is impossible to help you in a single short answer. You need to study several examples and/or ask questions about specific best practices for some parts of your application by posting real or realistic code.
This question has probably been asked already, but I never found a definitive answer. Let's say that I have a Web API 2.0 application hosted on IIS. I think I understand that the best practice (to prevent deadlocks on the client) is to always use async methods from the GUI event down to the HttpClient calls. This is good and it works. But what is the best practice when the client application does not have a GUI (e.g. Windows Service, Console Application) and only has synchronous methods from which to make the call? In this case, I use the following logic:
void MySyncMethodOnMyWindowServiceApp()
{
    // Result is a property, not a method; it blocks until the task completes.
    var list = GetDataAsync().Result.ToObject<List<MyClass>>();
}

async Task<JArray> GetDataAsync()
{
    var response = await Client.GetAsync(<...>).ConfigureAwait(false);
    return await response.Content.ReadAsAsync<JArray>().ConfigureAwait(false);
}
But unfortunately this can still cause deadlocks on client that occur at random times on random machines.
The client app stops at this point and never returns:
var response = await Client.GetAsync(<...>).ConfigureAwait(false);
If it's something that can be run in the background and isn't forced to be synchronous, try wrapping the code (that calls the async method) in a Task.Run(). I'm not sure that'll solve a "deadlock" problem (if it's something out of sync, that's another issue), but if you want to benefit from async/await, if you don't have async all the way down, I'm not sure there's a benefit unless you run it in a background thread. I had a case where adding Task.Run() in a few places (in my case, from an MVC controller which I changed to be async) and calling async methods not only improved performance slightly, but it improved reliability (not sure that it was a "deadlock" but seemed like something similar) under heavier load.
You will find that using Task.Run() is regarded by some as a bad way to do it, but I really couldn't see a better way to do it in my situation, and it really did seem to be an improvement. Perhaps this is one of those things where there's the ideal way to do it vs. the way to make it work in the imperfect situation that you're in. :-)
[Updated due to requests for code]
So, as someone else posted, you should do "async all the way down". In my case, my data wasn't async, but my UI was. So, I went async down as far as I could, then I wrapped my data calls with Task.Run in such a way that it made sense. That's the trick, I think: to figure out if it makes sense that things can run in parallel. Otherwise, you're just being synchronous (if you use async and immediately resolve it, forcing it to wait for the answer). I had a number of reads that I could perform in parallel.
In the above example, I think you have to async up as far as makes sense, and then at some point determine where you can spin off a thread and perform the operation independently of the other code. Let's say you have an operation that saves data, but you don't really need to wait for a response: you're saving it and you're done. The only thing you might have to watch out for is not to close the program without waiting for that thread/task to finish. Where it makes sense in your code is up to you.
Syntax is pretty easy. I took existing code, changed the controller to an async returning a Task of my class that was formerly being returned.
var myTask = Task.Run(() =>
{
    // ...some code that can run independently... In my case, loading data
    return LoadData(); // LoadData is a hypothetical synchronous data call
});
// ...other code that can run at the same time as the above...
await Task.WhenAll(myTask, otherTask); // otherTask: another task started elsewhere
// ...or...
await myTask;
// At this point, the result is available from the task
myDataValue = myTask.Result;
See MSDN for probably better examples:
https://msdn.microsoft.com/en-us/library/hh195051(v=vs.110).aspx
[Update 2, more relevant for the original question]
Let's say that your data read is an async method.
private async Task<MyClass> Read()
You can call it, save the task, and await on it when ready:
var runTask = Read();
//... do other code that can run in parallel
await runTask;
So, for this purpose, calling async code, which is what the original poster is requesting, I don't think you need Task.Run(), although I don't think you can use "await" unless you're an async method -- you'll need an alternate syntax for Wait.
The trick is that without having some code to run in parallel, there's little point in it, so thinking about multi-threading is still the point.
Using Task<T>.Result is the equivalent of Wait which will perform a synchronous block on the thread. Having async methods on the WebApi and then having all the callers synchronously blocking them effectively makes the WebApi method synchronous. Under load you will deadlock if the number of simultaneous Waits exceeds the server/app thread pool.
So remember the rule of thumb "async all the way down". You want the long-running task (getting a collection of List) to be async. If the calling method must be sync, you want to make that conversion from async to sync (using either Result or Wait) as close to the "ground" as possible. Keep the long-running process async and keep the sync portion as short as possible. That will greatly reduce the length of time that threads are blocked.
So for example you can do something like this.
void MySyncMethodOnMyWindowServiceApp()
{
    List<MyClass> myClasses = GetMyClassCollectionAsync().Result;
}

async Task<List<MyClass>> GetMyClassCollectionAsync()
{
    var data = await GetDataAsync(); // <- long running call to remote WebApi?
    return data.ToObject<List<MyClass>>();
}
The key part is the long running task remains async and not blocked because await is used.
Also, don't confuse responsiveness with scalability. Both are valid reasons for async. Yes, responsiveness is a reason for using async (to avoid blocking the UI thread). You are correct that this wouldn't apply to a back-end service; however, this isn't why async is used on a WebApi. The WebApi is also a non-GUI back-end process. If the only advantage of async code were responsiveness of the UI layer, then WebApi would be sync code from start to finish. The other reason for using async is scalability (avoiding deadlocks), and this is the reason why WebApi calls are plumbed async. Keeping the long-running processes async helps IIS make more efficient use of a limited number of threads. By default there are only 12 worker threads per core. This can be raised, but that isn't a magic bullet either, as threads are relatively expensive (about 1 MB of overhead per thread). await allows you to do more with less: more concurrent long-running processes on fewer threads before a deadlock occurs.
The problem you are having with deadlocks must stem from something else. Your use of ConfigureAwait(false) prevents deadlocks here. Solve the bug and you are fine.
See Should we switch to use async I/O by default? to which the answer is "no". You should decide on a case by case basis and choose async when the benefits outweigh the costs. It is important to understand that async IO has a productivity cost associated with it. In non-GUI scenarios only a few targeted scenarios derive any benefit at all from async IO. The benefits can be enormous, though, but only in those cases.
Here's another helpful post: https://stackoverflow.com/a/25087273/122718
In MSDN, there is a paragraph like this:
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active. You can use Task.Run to move CPU-bound work to a
background thread, but a background thread doesn't help with a process
that's just waiting for results to become available.
But it looks like I need a little more help with the bold text, since I am not sure what exactly it means. So how can it be async without using threads?
Source: http://msdn.microsoft.com/en-us/library/hh191443.aspx
There are many asynchronous operations which don't require the use of multiple threads. Things like Asynchronous IO work by having interrupts which signal when data is available. This allows you to have an asynchronous call which isn't using extra threads - when the signal occurs, the operation completes.
Task.Run can be used to make your own CPU-based async methods, which will run on its own separate thread. The paragraph was intended to show that this isn't the only option, however.
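As a small illustration of an asynchronous operation that has no thread of its own, here is a sketch built on TaskCompletionSource. PendingOperation and OnDataAvailable are hypothetical names; in a real system the callback would be invoked by whatever delivers the completion signal, for example an I/O completion callback.

```csharp
using System;
using System.Threading.Tasks;

// Represents an operation that completes when some external signal arrives.
// No thread is blocked or dedicated to waiting while the operation is pending.
public sealed class PendingOperation
{
    private readonly TaskCompletionSource<string> _tcs =
        new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);

    // Callers await this task; it stays incomplete until the signal arrives.
    public Task<string> Completion => _tcs.Task;

    // Hypothetical callback invoked by whatever delivers the signal.
    public void OnDataAvailable(string data) => _tcs.TrySetResult(data);
}
```

Code that does await operation.Completion simply resumes once OnDataAvailable has been called; until then the task is just an object sitting in memory.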
async/await is not just about using more threads. It's about using the threads you have more effectively. When operations block, such as waiting on a download or file read, the async/await pattern allows you to use that existing thread for something else. The compiler handles all the magic plumbing underneath, making it much easier to develop with.
See http://msdn.microsoft.com/en-us/magazine/hh456401.aspx for the problem description and the whitepaper at http://www.microsoft.com/en-us/download/details.aspx?id=14058.
Not the code generated by the async and await keywords themselves, no. They create code that runs on your current thread, assuming it has a synchronization context. If it doesn't, then you actually do get threads, but that's using the pattern for no good reason. The await expression, i.e. what you write on the right side of the await keyword, is what causes threads to run.
But that thread is often not observable; it may be a device driver thread, which reports that it is done through an I/O completion port. That is pretty common, and I/O is always a good reason to use await. If not already forced on you by WinRT, it is the real reason that async/await got added.
A note about "having a synchronization context": you have one on a thread if the SynchronizationContext.Current property is not null. This is almost only ever the case on the main thread of a GUI app, which is also the only place where you normally worry about delays freezing your user interface.
Essentially, this is what happens when you run an async method without awaiting the call:
Start the method and do as much as possible synchronously.
When necessary, pause the method and put the rest of it into a continuation.
When the async part is completed (is no longer being waited on), schedule the continuation to run on the same thread.
Whatever you want can run on this thread as normal. You can even examine/manipulate the Task returned from the async method.
When the thread becomes available, it will run the rest of your method.
The 'async part' could be file IO, a web request, or pretty much anything, as long as calling code can wait on this task to complete. This includes, but is not limited to, a separate thread. As Reed Copsey pointed out, there are other ways of performing async operations, like interrupts.
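The steps above can be traced with a small sketch. ContinuationOrderSketch and its log messages are hypothetical, the TaskCompletionSource named gate stands in for the 'async part', and local functions require C# 7 or later.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ContinuationOrderSketch
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>(); // stands in for the "async part"

        async Task WorkAsync()
        {
            log.Add("work: started synchronously");
            await gate.Task; // pauses here; the rest becomes a continuation
            log.Add("work: continuation ran");
        }

        Task work = WorkAsync();          // started, but not awaited yet
        log.Add("caller: still running"); // the caller carries on in the meantime
        gate.SetResult(true);             // the "async part" completes
        await work;                       // wait for the continuation to finish
        return log;
    }
}
```

The returned log lists the synchronous start first, then the caller's entry, then the continuation.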