Why use async methods with Azure table storage?

I'm using Microsoft.WindowsAzure.Storage.Table and couldn't figure out what's the difference between async methods and regular methods, for example CloudTable.Execute and CloudTable.ExecuteAsync. When and why should I use each of them? Is this even related to storage design and the module I'm using or am I misunderstanding the concept of async methods (I'm new to c# and Azure)?
Edit: If I should always use async methods, why are regular methods implemented, available, and moreover used in most Azure table storage guides?
Thanks in advance!

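Basically, when you await CloudTable.ExecuteAsync inside an async method, the compiler generates a state machine for that method in the background, so the calling thread is not blocked while the storage request is in flight. This lets you avoid performance bottlenecks and improves the overall responsiveness of your application.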

This applies not just to table storage: whenever a service offers an async operation, prefer it. We use async to free the calling thread while the operation is in progress. That thread is then ready to take more requests while the operation completes; once it has completed, control returns to the code after the await. If you don't use async, you can run into a problem called resource starvation, where the queue of pending requests keeps growing and eventually your application hangs.
The following article explains this in detail:
Synchronous I/O antipattern
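To make the difference concrete, here is a minimal sketch. It does not use the storage library at all: ExecuteBlocking and ExecuteNonBlockingAsync are hypothetical stand-ins for CloudTable.Execute and CloudTable.ExecuteAsync, with Thread.Sleep and Task.Delay simulating the network wait.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SyncVsAsyncDemo
{
    // Hypothetical stand-in for CloudTable.Execute:
    // the calling thread is blocked for the whole simulated network wait.
    public static int ExecuteBlocking(int id)
    {
        Thread.Sleep(200);
        return id;
    }

    // Hypothetical stand-in for CloudTable.ExecuteAsync:
    // no thread is held while the simulated request is in flight.
    public static async Task<int> ExecuteNonBlockingAsync(int id)
    {
        await Task.Delay(200);
        return id;
    }

    // Starts all operations first, then awaits them together.
    public static async Task<int[]> RunManyAsync(int count)
    {
        var tasks = Enumerable.Range(0, count).Select(i => ExecuteNonBlockingAsync(i));
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 5; i++)
        {
            ExecuteBlocking(i); // each call ties up this thread until it returns
        }
        Console.WriteLine($"5 blocking calls: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        int[] results = await RunManyAsync(50);
        Console.WriteLine($"{results.Length} async calls: {sw.ElapsedMilliseconds} ms");
    }
}
```

The async calls overlap their waiting time without needing one thread per call, which is the effect that matters on a busy server.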

Related

Proper way to start and fire-and-forget asynchronous calls?

I have an async call (DoWorkAsync()) that I would like to start in a fire-and-forget way, i.e. I'm not interested in its result and would like the calling thread to continue even before the async method is finished.
What is the proper way to do this? I need this in both, .NET Framework 4.6 as well as .NET Core 2, in case there are differences.
public async Task<MyResult> DoWorkAsync(){...}
public void StarterA(){
    Task.Run(() => DoWorkAsync());
}
public void StarterB(){
    Task.Run(async () => await DoWorkAsync());
}
Is it one of those two or something different/better?
//edit: Ideally without any extra libraries.

What is the proper way to do this?
First, you need to decide whether you really want fire-and-forget. In my experience, about 90% of people who ask for this actually don't want fire-and-forget; they want a background processing service.
Specifically, fire-and-forget means:
You don't care when the action completes.
You don't care if there are any exceptions when executing the action.
You don't care if the action completes at all.
So the real-world use cases for fire-and-forget are astoundingly small. An action like updating a server-side cache would be OK. Sending emails, generating documents, or anything business related is not OK, because you would (1) want the action to be completed, and (2) get notified if the action had an error.
The vast majority of the time, people don't want fire-and-forget at all; they want a background processing service. The proper way to build one of those is to add a reliable queue (e.g., Azure Queue / Amazon SQS, or even a database), and have an independent background process (e.g., Azure Function / Amazon Lambda / .NET Core BackgroundService / Win32 service) processing that queue. This is essentially what Hangfire provides (using a database for a queue, and running the background process in-proc in the ASP.NET process).
Is it one of those two or something different/better?
In the general case, there's a number of small behavior differences when eliding async and await. It's not something you would want to do "by default".
However, in this specific case - where the async lambda is only calling a single method - eliding async and await is fine.
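As a rough illustration of both points (the two forms behave the same here, and a fire-and-forget caller never sees failures), consider the sketch below. DoWorkAsync is a hypothetical stand-in that returns int instead of MyResult, and the fail parameter is added only for the demonstration.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Hypothetical worker standing in for DoWorkAsync from the question.
    public static async Task<int> DoWorkAsync(bool fail)
    {
        await Task.Delay(50);
        if (fail) throw new InvalidOperationException("work failed");
        return 42;
    }

    // Form A from the question: async/await elided inside the lambda.
    public static Task<int> StartElided(bool fail) =>
        Task.Run(() => DoWorkAsync(fail));

    // Form B from the question: async lambda that awaits.
    public static Task<int> StartAwaited(bool fail) =>
        Task.Run(async () => await DoWorkAsync(fail));

    public static async Task Main()
    {
        Task<int> a = StartElided(fail: true);
        Task<int> b = StartAwaited(fail: true);

        // A real fire-and-forget caller would return here and never look at a or b.
        // We wait (without rethrowing) only to inspect what happened.
        await Task.WhenAll(a.ContinueWith(_ => { }), b.ContinueWith(_ => { }));

        // Both tasks captured the exception; nothing was thrown at the call site.
        Console.WriteLine($"A faulted: {a.IsFaulted}, B faulted: {b.IsFaulted}");
        // prints "A faulted: True, B faulted: True"
    }
}
```

If you do care about the failure, that is a sign you want the queue-based background processing described above rather than fire-and-forget.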

It depends on what you mean by proper :)
For instance: are you interested in the exceptions being thrown in your "fire and forget" calls? If not, then this is sort of fine. What you might need to think about, though, is the environment in which the task lives.
For instance, suppose this is an ASP.NET application and you do this inside the lifetime of a thread instantiated due to a call to an .aspx or .svc. The Task then runs as a background thread of that (foreground) thread. The foreground thread might get cleaned up by the application pool before your "fire and forget" task is completed.
So also think about in which thread your tasks live.
I think this article gives you some useful information on that:
https://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
Also note that if you do not return a value in your Tasks, a task will not return exception info. Source for that is the ref book for microsoft exam 70-483
There is probably a free version of that online somewhere ;P https://www.amazon.com/Exam-Ref-70-483-Programming-C/dp/0735676828
It may also be useful to know that if you have an async method being called by a non-async method and you need its result, you can use .GetAwaiter().GetResult().
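A small sketch of what that looks like, and how it differs from reading .Result, is below. FailAsync and CatchWith are hypothetical helpers written only for this illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical async method that always fails.
    public static async Task<int> FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }

    // Blocks on the task using the supplied strategy and
    // returns the type name of the exception the caller sees.
    public static string CatchWith(Func<Task<int>, int> block)
    {
        try
        {
            block(FailAsync());
            return "no exception";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        // .Result wraps the failure in an AggregateException.
        Console.WriteLine(CatchWith(t => t.Result));
        // prints "AggregateException"

        // GetAwaiter().GetResult() rethrows the original exception.
        Console.WriteLine(CatchWith(t => t.GetAwaiter().GetResult()));
        // prints "InvalidOperationException"
    }
}
```

Both variants block the calling thread. In environments with a synchronization context (UI apps, classic ASP.NET) blocking like this can deadlock, so treat it as a last resort.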
Also I think it is important to note the difference between async and multi-threading.
Async is only useful when an operation uses parts of the computer other than the CPU, such as networking or other I/O operations. Using async then tells the system to go ahead and use CPU power somewhere else instead of "blocking" a thread that is just waiting for a response.
Multi-threading is the allocation of operations to different threads on a CPU (for instance, creating a task that runs on a background thread. Foreground threads are the primary threads that make up your application and keep the process alive; background threads do not. Once all foreground threads have exited, any remaining background threads are stopped as well).
This allows the CPU to work on different tasks at the same time.
Combining the two makes sure the CPU does not get tied up on just 4 threads if it is a 4-thread CPU; it can pick up other work while async tasks are waiting for I/O operations.
I hope this gives you the information you need to do whatever it is you are doing :)

How does a library provide async methods if it is bad practice to use Task.Run?

This article, https://blog.stephencleary.com/2013/11/taskrun-etiquette-examples-dont-use.html , advises against using Task.Run. However, there are a lot of libraries that provide methods ending in Async, and hence I expect those methods to return a running task that I can await (which is not guaranteed, however, since those libraries could decide to return a synchronously completed task).
The context is an ASP.NET application. How am I supposed to make a method run in parallel?
What I understand is that async calls are executed in parallel if they contain at least one "await" operator inside. The problem is that the innermost call must itself be parallel to achieve that, and to do that I somehow have to resort to Task.Run.
I have also seen some examples using TaskCompletionSource. Is this necessary to implement the "innermost async method" in order to run a method in parallel in an ASP.NET application?

In an ASP.NET application we tend to value requests/s over individual response times [1], certainly if we're directly trading off one against the other. So we don't try to focus more CPU power on satisfying one request.
And really, focusing more CPU power on a task is what Task.Run is for: it's for when you have a distinct chunk of work to be done, you can't do it on the current thread (because it's got its own work to do), and you're free to use as much CPU as possible.
In ASP.NET, where async shines is when we're dealing with I/O. Nasty slow things like accessing the file system or talking to a database across the network. And wonderfully, at the lowest level, the Windows I/O system is async already and we don't have to devote a thread just to waiting for things to finish.
So, you won't be using Task.Run. Instead you'll be looking for I/O-related objects that expose Async methods. And those methods themselves will not, as above, be using Task.Run. What this does allow us to do is to stop using any threads for servicing our particular request whilst there's no work to be done, and so improve our requests/s metric.
[1] This is a generalization, but single user/request ASP.NET sites are rare in my experience.
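On the TaskCompletionSource part of the question: it is one way a library can expose an Async method without Task.Run, by completing a task when some completion callback fires. The sketch below is only an illustration under assumptions: BeginFakeIo is a hypothetical callback-style API (a timer stands in for the OS signalling that an I/O operation finished), and FakeIoAsync is the hypothetical wrapper.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TcsDemo
{
    // Hypothetical callback-style API. The timer stands in for an I/O completion
    // notification; no thread sits waiting while the "operation" is pending.
    public static void BeginFakeIo(int value, Action<int> onCompleted)
    {
        // This constructor passes the timer itself as the callback state.
        var timer = new Timer(state =>
        {
            ((Timer)state).Dispose();
            onCompleted(value * 2);
        });
        timer.Change(100, Timeout.Infinite); // fire once after 100 ms
    }

    // Task-returning wrapper: no Task.Run, the task completes when the callback fires.
    public static Task<int> FakeIoAsync(int value)
    {
        var tcs = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        BeginFakeIo(value, result => tcs.TrySetResult(result));
        return tcs.Task;
    }

    public static async Task Main()
    {
        int result = await FakeIoAsync(21);
        Console.WriteLine(result); // prints 42
    }
}
```

Application code rarely needs to write this itself; the point is that an awaitable method can exist without any thread being dedicated to the wait.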

Must async methods be supported by the OS, or is async a program-level feature?

Tutorials sometimes show how to implement your own async methods, for example with this code:
public static async Task GetHttpResponseAsync()
{
    using (HttpClient httpClient = new HttpClient())
    {
        HttpResponseMessage response = await httpClient.GetAsync(...);
        Console.WriteLine(response.Something);
    }
}
How async works in general is clear to me, but none of the tutorials explains how
httpClient.GetAsync(...);
is implemented internally, which is really important for understanding how asynchronous code works in detail. What makes me curious is whether the internal operations of GetAsync (or any other async method) are registered in some kind of container where the code is executed. Must async methods be supported by the operating system (e.g. by using the Windows API)? If I wanted to implement my own asynchronous file downloader (from disk, and without the essential parts of the .NET Framework), how would I implement it? Should I register my method somewhere for later invocation?
It's pretty clear to me that internally the compiler builds a state machine, and that after the DoSomething() method has done what it has to do, it just invokes this state machine again to resume executing the code after the await.
What is also unclear to me is how async code can run on the same thread. I think the state machine must be maintained on the same thread, but how can the code from httpClient.GetAsync() run on that same thread without interrupting other operations (e.g. the GUI)? There must be something that makes this code run on a separate thread (in all cases). Am I wrong? What have I missed?
Additional explanation of my question: in JavaScript, as far as I know and understand, async methods work by being registered in some kind of container, which runs them one by one on a separate thread. After the method has finished executing, the result is returned to the user context. That is clear to me. Does it work the same way here?

In short, true asynchrony must be provided at the OS level. As you've noted, async/await is a language level feature that uses compiler generated state machines to "break up" your method into pieces that can run asynchronously, but it relies on OS primitives (interrupts, threads) to actually perform this work in an asynchronous manner.
However, it's important to note that there will often not be a thread created to handle your async operation. I'll defer to this expertly-written article to describe why this is the case: http://blog.stephencleary.com/2013/11/there-is-no-thread.html
On Windows, the primary mechanism for performing asynchronous work is with I/O Completion Ports. This is a Windows API that is used under the hood by many .NET types, including the HttpClient you're using.
Note that for non-I/O operations, you can also always use Threads, or better yet, the Thread Pool API, to perform background work that will complete asynchronously.
The ReadFile (or newer ReadFileEx) function in the Windows API is designed to work with async I/O. When you open the file with CreateFile, you can specify the FILE_FLAG_OVERLAPPED flag and then pass an OVERLAPPED structure to the lpOverlapped argument of ReadFile, which enables async reads. I would encourage you to use this API to design your file downloader.
In summary:
There will not always be a thread created for async operations.
async/await is a language feature, but relies on various Windows APIs to achieve true asynchrony.
If the work you are doing is I/O bound, consider using Asynchronous I/O: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365683(v=vs.85).aspx
If the work you are doing is CPU bound, consider using Threading: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684841(v=vs.85).aspx
Now that .NET Core is open source on GitHub, you can actually inspect the source code to see what's going on under the hood. Here is HttpClient.cs, which uses HttpMessageHandler.cs -> HttpClientHandler.Windows.cs -> WinHttpHandler.cs -> Interop.winhttp.cs -> PInvoke into the WinHTTP API native DLL -> Winsock sockets with an I/O completion port.
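If you stay in managed code rather than calling the Windows API directly, a minimal sketch of an asynchronous file read might look like the following. FileOptions.Asynchronous asks the runtime to open the file for asynchronous I/O (overlapped I/O on Windows). The class and method names and the temp-file setup are just assumptions for the demo.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncFileReadDemo
{
    // Reads a whole file using a stream opened for asynchronous I/O.
    public static async Task<string> ReadAllTextAsync(string path)
    {
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            bufferSize: 4096, options: FileOptions.Asynchronous))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            // The calling thread is released while the read is pending.
            return await reader.ReadToEndAsync();
        }
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello async file I/O");
            Console.WriteLine(await ReadAllTextAsync(path));
            // prints "hello async file I/O"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```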

When should I use Async Controllers in ASP.NET MVC?

I have some concerns using async actions in ASP.NET MVC. When does it improve performance of my apps, and when does it not?
Is it good to use async action everywhere in ASP.NET MVC?
Regarding awaitable methods: shall I use async/await keywords when I want to query a database (via EF/NHibernate/other ORM)?
How many times can I use await keywords to query the database asynchronously in one single action method?

You may find my MSDN article on the subject helpful; I took a lot of space in that article describing when you should use async on ASP.NET, not just how to use async on ASP.NET.
I have some concerns using async actions in ASP.NET MVC. When it improves performance of my apps, and when - not.
First, understand that async/await is all about freeing up threads. On GUI applications, it's mainly about freeing up the GUI thread so the user experience is better. On server applications (including ASP.NET MVC), it's mainly about freeing up the request thread so the server can scale.
In particular, it won't:
Make your individual requests complete faster. In fact, they will complete (just a teensy bit) slower.
Return to the caller/browser when you hit an await. await only "yields" to the ASP.NET thread pool, not to the browser.
First question is - is it good to use async action everywhere in ASP.NET MVC?
I'd say it's good to use it everywhere you're doing I/O. It may not necessarily be beneficial, though (see below).
However, it's bad to use it for CPU-bound methods. Sometimes devs think they can get the benefits of async by just calling Task.Run in their controllers, and this is a horrible idea, because that code ends up freeing up the request thread by taking up another thread, so there's no benefit at all (and in fact, they're taking the penalty of extra thread switches)!
Shall I use async/await keywords when I want to query database (via EF/NHibernate/other ORM)?
You could use whatever awaitable methods you have available. Right now most of the major players support async, but there are a few that don't. If your ORM doesn't support async, then don't try to wrap it in Task.Run or anything like that (see above).
Note that I said "you could use". If you're talking about ASP.NET MVC with a single database backend, then you're (almost certainly) not going to get any scalability benefit from async. This is because IIS can handle far more concurrent requests than a single instance of SQL server (or other classic RDBMS). However, if your backend is more modern - a SQL server cluster, Azure SQL, NoSQL, etc - and your backend can scale, and your scalability bottleneck is IIS, then you can get a scalability benefit from async.
Third question - How many times I can use await keywords to query database asynchronously in ONE single action method?
As many as you like. However, note that many ORMs have a one-operation-per-connection rule. In particular, EF only allows a single operation per DbContext; this is true whether the operation is synchronous or asynchronous.
Also, keep in mind the scalability of your backend again. If you're hitting a single instance of SQL Server, and your IIS is already capable of keeping SQLServer at full capacity, then doubling or tripling the pressure on SQLServer is not going to help you at all.

Asynchronous action methods are useful when an action must perform several independent long running operations.
A typical use for the AsyncController class is long-running Web
service calls.
Should my database calls be asynchronous ?
The IIS thread pool can often handle many more simultaneous blocking requests than a database server. If the database is the bottleneck, asynchronous calls will not speed up the database response. Without a throttling mechanism, efficiently dispatching more work to an overwhelmed database server by using asynchronous calls merely shifts more of the burden to the database. If your DB is the bottleneck, asynchronous calls won’t be the magic bullet.
You should have a look at 1 and 2 references
Derived from @PanagiotisKanavos's comments:
Moreover, async doesn't mean parallel. Asynchronous execution frees a
valuable threadpool thread from blocking for an external resource, for
no complexity or performance cost. This means the same IIS machine can
handle more concurrent requests, not that it will run faster.
You should also consider that blocking calls start with a
CPU-intensive spinwait. During stress times, blocking calls will
result in escalating delays and app pool recycling. Asynchronous calls
simply avoid this

is it good to use async action everywhere in ASP.NET MVC?
As usual in programming, it depends. There is always a trade-off when going down a certain path.
async-await shines in places where you know you'll be receiving concurrent requests to your service and you want to be able to scale out well. How does async-await help with scaling out? When you invoke an IO call synchronously, such as a network call or a database query, the current thread responsible for the execution is blocked waiting for the request to finish. When you use async-await, you enable the framework to create a state machine for you, which makes sure that after the IO call is complete, your method continues executing from where it left off, without holding a thread in the meantime.
A thing to note is that this state machine has a subtle overhead. Making a method asynchronous does not make it execute faster, and that is an important factor to understand and a misconception many people have.
Another thing to take into consideration when using async-await is the fact that it is async all the way, meaning that you'll see async penetrate your entire call stack, top to bottom. This means that if you want to expose synchronous APIs, you'll often find yourself duplicating a certain amount of code, as async and sync don't mix very well.
Shall I use async/await keywords when I want to query database (via
EF/NHibernate/other ORM)?
If you choose to go down the path of using async IO calls, then yes, async-await will be a good choice, as more and more modern database providers expose async methods implementing the TAP (Task-based Asynchronous Pattern).
How many times I can use await keywords to query database
asynchronously in ONE single action method?
As many as you want, as long as you follow the rules stated by your database provider. There is no limit to the number of async calls you can make. If you have queries which are independent of each other and can be made concurrently, you can start a new task for each and use await Task.WhenAll to wait for all of them to complete.

async actions help most when the action performs I/O operations against a DB or other network-bound calls, where the thread that processes the request would otherwise be stalled until it gets an answer from the DB or network call you just invoked. It's best to use await with them, and it will really improve the responsiveness of your application (because fewer ASP.NET I/O threads will be stalled while waiting for the DB or any other operation like that). In all my applications, whenever many calls to the DB are necessary, I've always wrapped them in an awaitable method and called that with the await keyword.

My 5 cents:
1. Use async/await if and only if you do an IO operation, like a DB call or an external web service call.
2. Always prefer async calls to the DB.
3. Each time you query the DB.
P.S. There are exceptional cases for point 1, but you need to have a good understanding of async internals for this.
As an additional advantage, you can do a few IO calls in parallel if needed:
var task1 = FooAsync(); // launch it, but don't wait for the result (FooAsync is assumed to return Task<T>)
var task2 = BarAsync(); // launch bar; now both foo and bar are running
await Task.WhenAll(task1, task2); // this is better in regard to exception handling
// both tasks have completed, so task1.Result and task2.Result no longer block

As you know, MVC supports asynchronous controllers and you should take advantage of them. If your controller performs a lengthy operation (it might be disk-based I/O or a network call to a remote service) and the request is handled synchronously, the IIS thread is busy the whole time. As a result, the thread is just waiting for the lengthy operation to complete. It could be better utilized by serving other requests while the operation requested first is in progress. This helps in serving more concurrent requests.
Your web service will be highly scalable and will not easily run into the C10k problem.
It is a good idea to use async/await for DB queries, and yes, you can use them as many times as you see fit.
Take a look here for excellent advice.

My experience is that today a lot of developers use async/await as a default for controllers.
My suggestion would be, use it only when you know it will help you.
The reason is, as Stephen Cleary and others already mentioned, it can introduce performance issues, rather than resolving them, and it will help you only in a specific scenario:
High-traffic controllers
Scalable backend

Is it good to use async action everywhere in ASP.NET MVC?
It's good to do so wherever you can use an async method, especially when you have performance issues at the worker process level, which happens with massive data and calculation operations. Otherwise there is no need, because unit testing will need casting.
Regarding awaitable methods: shall I use async/await keywords when I
want to query a database (via EF/NHibernate/other ORM)?
Yes, it's better to use async for any DB operation as much as possible, to avoid performance issues at the level of worker processes.
Note that EF has created many async alternatives for most operations, such as:
.ToListAsync()
.FirstOrDefaultAsync()
.SaveChangesAsync()
.FindAsync()
How many times can I use await keywords to query the database
asynchronously in one single action method?
The sky is the limit

.NET async framework - is there a limitation with thread-local data

I am looking to write a Windows Service that will start various "jobs".
Each "job" will:
be distinct in what it accomplishes
run for the lifetime of the Service, so "long running". Typically, a job will get 10 tasks from the database and process them, then sleep, and then repeat this cycle again and again.
share the same "context". The application will be loosely coupled and call an IoC container to get classes. It will also store some data on this context
I need each job to be able to run in parallel and effectively run as separate programs.
My first thought was to create one thread per job. This is okay but has the drawback that a ManualResetEvent stops the thread in its tracks, and the Abort doesn't allow much chance for the Thread to exit in a graceful manner.
I then explored some of the new async framework in .NET 4.5 and boy does it seem to simplify coding.
However, whilst some of the data held on the context may be freely shared between each job, some cannot: each job requires its own copy of certain data.
I attempted to solve this using ThreadLocal<T> properties. However, whilst this works fine for a specific thread that I've created, this doesn't work for the async methods. The thread that starts an async method is often not the thread that finishes the method, particularly when the method uses "await".
So, what is the preferred pattern for what I am attempting to accomplish?
FYI: Albahari's posting was a great help.
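For what it's worth, one option that may fit this situation is AsyncLocal<T>, whose value follows the async flow rather than the thread. Note the assumption: it requires .NET Framework 4.6 or later, not 4.5. This is only a sketch; JobName and RunJobAsync are hypothetical names standing in for the per-job data and the job loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class JobContextDemo
{
    // Each async flow ("job") sees its own value, even if continuations hop threads.
    private static readonly AsyncLocal<string> JobName = new AsyncLocal<string>();

    // Hypothetical job body: stores per-job data, awaits, then reads it back.
    public static async Task<string> RunJobAsync(string name)
    {
        JobName.Value = name;
        // After this await the method may resume on a different thread.
        await Task.Delay(50).ConfigureAwait(false);
        return JobName.Value;
    }

    public static async Task Main()
    {
        string[] seen = await Task.WhenAll(RunJobAsync("job-A"), RunJobAsync("job-B"));
        Console.WriteLine(string.Join(", ", seen)); // prints "job-A, job-B"
    }
}
```

A value set inside an async method does not leak back to its caller, which is why the two jobs started from the same method do not overwrite each other.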
