This is such a basic (read noob) question regarding the .NET async/await library but I thought I'd ask it anyhow before rewriting our APIs to be awaitable.
The Question
Why wouldn't the runtime simply evaluate any given thread that has a lot of idle time and automatically operate asynchronously whenever it gets to the blocking call?
Example of some routine
1. Web request to some app...
2. App starts database call...
3. Wait for response...(idle and long)
4. Receive recordset...
5. Return to client...
If I were the runtime environment, wouldn't it be wise to simply jot down that step 3 takes a while so I should use the current thread at that point, during its idle moments, to help out other routines that would normally be waiting for our current thread to be available?
Isn't it possible that at some point in the future we'll be able to toggle a flag in the app.config (or web.config) that says <system.runtime><asyncBehavior enableAsynchronousWhenIdle=true /></system.runtime>?
Sure it's possible but it completely breaks the current programming model. Before when you had a blocking call you were guaranteed that no other code would run on your thread. This change now allows re-entrant calls on the same thread.
For instance consider this case:
static int _processCount;
static object _lockObj = new object();

public Response ProcessRequest(Request request) {
    lock (_lockObj) {
        _processCount++;
        var savedCount = _processCount;
        // Make long running request
        if (savedCount != _processCount)
            throw new InvalidOperationException("Is my lock broken?");
    }
}
Before we allow processing requests during the long running process this code is fine, but if we allow new requests to be processed on the thread while it is making a long running request we open up the possibility of this case.
Process Request A
Process Request A waits for the long running operation
The idle processing uses the thread to process Request B.
Request B enters the lock because locks have thread affinity
Request B waits for the long running operation
Request A returns from the long running operation and throws an exception because its state has been corrupted.
So the code needs to be written in such a way that it is aware of the reentrancy potential. There is no way for the Framework to know if your code will break so that change will never happen.
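The phrase "locks have thread affinity" can be checked with a small sketch. The class and method names below are made up for illustration. It relies only on the documented behavior that Monitor (which the lock statement uses) is re-entrant for the thread that already owns it, which is why work injected onto the same thread would walk straight through the lock.

```csharp
using System;
using System.Threading;

static class ReentrancyDemo
{
    static readonly object Gate = new object();

    // The thread that already holds the lock can take it again.
    public static bool SameThreadCanReenter()
    {
        lock (Gate)
        {
            bool entered = Monitor.TryEnter(Gate); // returns immediately
            if (entered) Monitor.Exit(Gate);
            return entered; // true
        }
    }

    // A different thread is kept out while the lock is held.
    public static bool OtherThreadCanEnter()
    {
        bool entered = true;
        lock (Gate)
        {
            var worker = new Thread(() =>
            {
                entered = Monitor.TryEnter(Gate); // does not wait
                if (entered) Monitor.Exit(Gate);
            });
            worker.Start();
            worker.Join();
        }
        return entered; // false
    }
}
```

Calling ReentrancyDemo.SameThreadCanReenter() returns true and ReentrancyDemo.OtherThreadCanEnter() returns false, so a lock protects against other threads but not against other work running on the same thread.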
Neither .NET nor any other framework is that smart. Unless you explicitly program code to run asynchronously, it has no way of knowing whether any particular code should run asynchronously. It can't look at your code and say, "Oh, here's some code that runs a while and blocks the UI thread, so I should run this in a separate thread so that the UI can update." As far as the framework is concerned, you intended it to operate that way, and it has no intelligence to override the way you coded it with something more efficient.
Basically because both synchronous and asynchronous approaches are useful, and sometimes blocking is fine. Asynchrony isn't a silver bullet.
On the other hand, turning all synchronous code into asynchronous operations may break a lot of existing code, both in the base class library (BCL) and in third-party code, because asynchronous operations must synchronize access to shared resources and objects, and current synchronous code that does not do so could be a ticking bomb!
Related
I have an async call (DoWorkAsync()) that I would like to start in a fire-and-forget way, i.e. I'm not interested in its result and would like the calling thread to continue even before the async method is finished.
What is the proper way to do this? I need this in both, .NET Framework 4.6 as well as .NET Core 2, in case there are differences.
public async Task<MyResult> DoWorkAsync(){...}

public void StarterA(){
    Task.Run(() => DoWorkAsync());
}

public void StarterB(){
    Task.Run(async () => await DoWorkAsync());
}
Is it one of those two or something different/better?
//edit: Ideally without any extra libraries.
What is the proper way to do this?
First, you need to decide whether you really want fire-and-forget. In my experience, about 90% of people who ask for this actually don't want fire-and-forget; they want a background processing service.
Specifically, fire-and-forget means:
You don't care when the action completes.
You don't care if there are any exceptions when executing the action.
You don't care if the action completes at all.
So the real-world use cases for fire-and-forget are astoundingly small. An action like updating a server-side cache would be OK. Sending emails, generating documents, or anything business related is not OK, because you would (1) want the action to be completed, and (2) get notified if the action had an error.
The vast majority of the time, people don't want fire-and-forget at all; they want a background processing service. The proper way to build one of those is to add a reliable queue (e.g., Azure Queue / Amazon SQS, or even a database), and have an independent background process (e.g., Azure Function / Amazon Lambda / .NET Core BackgroundService / Win32 service) processing that queue. This is essentially what Hangfire provides (using a database for a queue, and running the background process in-proc in the ASP.NET process).
Is it one of those two or something different/better?
In the general case, there's a number of small behavior differences when eliding async and await. It's not something you would want to do "by default".
However, in this specific case - where the async lambda is only calling a single method - eliding async and await is fine.
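As a rough sketch of why the two starters behave the same here: both lambdas are accepted by the Task.Run overload that takes a Func<Task<TResult>>, so both calls return a task that completes with the inner method's result. The DoWorkAsync body and the int result type below are placeholders, not the question's real code.

```csharp
using System;
using System.Threading.Tasks;

static class ElideDemo
{
    // Hypothetical stand-in for the question's DoWorkAsync.
    static async Task<int> DoWorkAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    // Eliding async/await: the lambda returns the inner task directly.
    public static Task<int> StartA()
    {
        return Task.Run(() => DoWorkAsync());
    }

    // Keeping async/await: one extra state machine, same observable result here.
    public static Task<int> StartB()
    {
        return Task.Run(async () => await DoWorkAsync());
    }
}
```

In a real fire-and-forget call the returned task would simply be ignored. It is returned here only so the two variants can be compared.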
It depends on what you mean by proper :)
For instance: are you interested in the exceptions being thrown in your "fire and forget" calls? If not, then this is sort of fine. Though what you might need to think about is in what environment the task lives.
For instance, suppose this is an ASP.NET application and you start the task while handling a request to an .aspx or .svc. The task runs on a thread pool thread, which is a background thread, and ASP.NET does not know about that work. The application pool might be recycled or shut down before your "fire and forget" task is completed.
So also think about in which thread your tasks live.
I think this article gives you some useful information on that:
https://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
Also note that if you do not return a value in your Tasks, a task will not return exception info. Source for that is the ref book for microsoft exam 70-483
There is probably a free version of that online somewhere ;P https://www.amazon.com/Exam-Ref-70-483-Programming-C/dp/0735676828
It may be useful to know that if you have an async method being called from non-async code and you need its result, you can use .GetAwaiter().GetResult(). Note that this blocks the calling thread until the task completes.
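A small sketch of what .GetAwaiter().GetResult() does compared with .Result, using a made-up failing method. Both block the calling thread, but they surface exceptions differently. Blocking like this can deadlock on a UI thread or in classic ASP.NET request code, so treat it as a last resort.

```csharp
using System;
using System.Threading.Tasks;

static class BlockingDemo
{
    // Hypothetical async method that always fails.
    static async Task<int> FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }

    // GetAwaiter().GetResult() rethrows the original exception.
    public static string ViaGetResult()
    {
        try { FailAsync().GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // .Result wraps the exception in an AggregateException.
    public static string ViaResult()
    {
        try { int unused = FailAsync().Result; return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```

In a console application, ViaGetResult() returns "InvalidOperationException" and ViaResult() returns "AggregateException".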
Also I think it is important to note the difference between async and multi-threading.
Async is only useful if there are operations that use other parts of a computer that is not the CPU. So things like networking or I/O operations. Using async then tells the system to go ahead and use CPU power somewhere else instead of "blocking" that thread in the CPU for just waiting for a response.
Multi-threading is the distribution of operations across different threads (for instance, Task.Run queues work to a thread pool thread, which is a background thread. Foreground threads keep your application's process alive; background threads do not. When the last foreground thread ends, any background threads still running are stopped as well.)
This allows the CPU to work on different tasks at the same time.
Combining these two makes sure the CPU does not get blocked up on just 4 threads if it is a 4 thread CPU. But can open more while it waits for async tasks that are waiting for I/O operations.
I hope this gives you the information needed to do whatever it is you are doing :)
I would like to preface this question with the following:
I'm familiar with the IAsyncStateMachine implementation that the await keyword in C# generates.
My question is not about the basic flow of control that ensues when you use the async and await keywords.
Assumption A
The default threading behaviour in any threading environment, whether it be at the Windows operating system level or in POSIX systems or in the .NET thread pool, has been that when a thread makes a request for an I/O bound operation, say for a disk read, it issues the request to the disk device driver and enters a waiting state. Of course, I am glossing over the details because they are not of moment to our discussion.
Importantly, that thread can do nothing useful until it is unblocked by an interrupt from the device driver notifying it of completion. During this time, the thread remains on the wait queue and cannot be re-used for any other work.
I would first like a confirmation of the above description.
Assumption B
Secondly, even with the introduction of TPL, and its enhancements done in v4.5 of the .NET framework, and with the language level support for asynchronous operations involving tasks, this default behaviour described in Assumption A has not changed.
Question
Then, I'm at a loss trying to reconcile Assumptions A and B with the claim that suddenly emerged in all TPL literature that:
When the, say, main thread, starts this request for this I/O bound
work, it immediately returns and continues executing the rest of
the queued up messages in the message pump.
Well, what makes that thread return back to do other work? Isn't that thread supposed to be in the waiting state in the wait queue?
You might be tempted to reply that the code in the state machine launches the task awaiter and if the awaiter hasn't completed, the main thread returns.
That begs the question -- what thread does the awaiter run on?
And the answer that springs up to mind is: whatever the implementation of the method be, of whose task it is awaiting.
That drives us down the rabbit hole further until we reach the last of such implementations that actually delivers the I/O request.
Where is that part of the source code in the .NET framework that changes this underlying fundamental mechanism about how threads work?
Side Note
With some blocking asynchronous methods such as WebClient.DownloadDataTaskAsync, if one were to follow their code
through their (the method's and not one's own) oral tract into their
intestines, one would see that they ultimately either execute the
download synchronously, blocking the current thread if the operation
was requested to be performed synchronously
(Task.RunSynchronously()) or if requested asynchronously, they
offload the blocking I/O bound call to a thread pool thread using the
Asynchronous Programming Model (APM) Begin and End methods.
This surely will cause the main thread to return immediately because
it just offloaded blocking I/O work to a thread pool thread, thereby
adding approximately diddlysquat to the application's scalability.
But this was a case where, within the bowels of the beast, the work
was secretly offloaded to a thread pool thread. In the case of an API
that doesn't do that, say an API that looks like this:
public async Task<string> GetDataAsync()
{
    var tcs = new TaskCompletionSource<string>();

    // If GetDataInternalAsync makes the network request
    // on the same thread as the calling thread, it will block, right?
    // How then do they claim that the thread will return immediately?
    // If you look inside the state machine, it just asks the TaskAwaiter
    // if it completed the task, and if it hasn't it registers a continuation
    // and comes back. But that implies that the awaiter is on another thread
    // and that thread is happily sleeping until it gets a kick in the butt
    // from a wait handle, right?
    // So, the only way would be to delegate the making of the request
    // to a thread pool thread, in which case, we have not really improved
    // scalability but only improved responsiveness of the main/UI thread
    var s = await GetDataInternalAsync();

    tcs.SetResult(s); // omitting SetException and
                      // cancellation for the sake of brevity

    // An async method declared as Task<string> must return a string,
    // so the completion source's task is awaited rather than returned.
    return await tcs.Task;
}
Please be gentle with me if my question appears to be nonsensical. The extent of knowledge of things in almost all matters is limited. I am just learning anything.
When you are talking about an async I/O operation, the truth, as pointed out here by Stephen Cleary (http://blog.stephencleary.com/2013/11/there-is-no-thread.html) is that there is no thread. An async I/O operation is completed at a lower level than the threading model. It generally occurs within interrupt handler routines. Therefore, there is no I/O thread handling the request.
You ask how a thread that launches a blocking I/O request returns immediately. The answer is because an I/O request is not at its core actually blocking. You could block a thread such that you are intentionally saying not to do anything else until that I/O request finishes, but it was never the I/O that was blocking, it was the thread deciding to spin (or possibly yield its time slice).
The thread returns immediately because nothing has to sit there polling or querying the I/O operation. That is the core of true asynchronicity. An I/O request is made, and ultimately the completion bubbles up from an ISR. Yes, this may bubble up into the thread pool to set the task completion, but that happens in a nearly imperceptible amount of time. The work itself never had to be run on a thread. The request itself may have been issued from a thread, but as it is an asynchronous request, the thread can immediately return.
Let's forget C# for a moment. Let's say I am writing some embedded code and I request data from a SPI bus. I send the request, continue my main loop, and when the SPI data is ready, an ISR is triggered. My main loop resumes immediately precisely because my request is asynchronous. All it has to do is push some data into a shift register and continue on. When data is ready for me to read back, an interrupt triggers. This is not running on a thread. It may interrupt a thread to complete the ISR, but you could not say that it actually ran on that thread. Just because it's C#, this process is not ultimately any different.
Similarly, let's say I want to transfer data over USB. I place the data in a DMA location, set a flag to tell the bus to transfer my URB, and then immediately return. When I get a response back it is also moved into memory, an interrupt occurs and sets a flag to let the system know: hey, here's a packet of data sitting in a buffer for you.
So once again, I/O is never truly blocking. It could appear to block, but that is not what is happening at the low level. It is higher level processes that may decide that an I/O operation has to happen synchronously with some other code. This is not to say of course that I/O is instant. Just that the CPU is not stuck doing work to service the I/O. It COULD block if implemented that way, and this COULD involve threads. But that is not how async I/O is implemented.
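A loose C# analogy for the pattern described above, offered as a sketch only: a timer callback stands in for the hardware interrupt, and a TaskCompletionSource stands in for the pending I/O request. Real asynchronous I/O in .NET is completed through OS mechanisms rather than a timer, and the names here are invented. What it shows is that the method returns a pending task immediately and no thread waits in between.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class NoThreadDemo
{
    // Returns at once with a pending task. No thread blocks while "waiting";
    // the timer callback (our pretend interrupt) completes the task later.
    public static Task<string> SimulatedIoAsync(int milliseconds)
    {
        var tcs = new TaskCompletionSource<string>();
        Timer timer = null;
        timer = new Timer(_ =>
        {
            timer?.Dispose();          // one-shot: release the timer
            tcs.TrySetResult("done");  // signal completion to any awaiter
        }, null, milliseconds, Timeout.Infinite);
        return tcs.Task;
    }
}
```

A caller would write var s = await NoThreadDemo.SimulatedIoAsync(50); and get "done" once the callback fires. The callback itself runs briefly on a thread pool thread, much like the completion "bubbling up" in the description above.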
I'm trying to wrap my head around all of the Async stuff that's been added into the .NET framework with the more recent versions. I understand some of it, but to be honest, personally I don't think it makes writing asynchronous code easier. I find it rather confusing most of the time and actually harder to read than the more conventional approaches that we used before the advent of async/await.
Anyway, my question is a simple one. I see a lot of code like this:
var stream = await file.readAsStreamAsync()
What's going on here? Isn't this equivalent to just calling the blocking variant of the method, i.e.
var stream = file.readAsStream()
If so, what's the point in using it here like this? It doesn't make the code any easier to read so please tell me what I am missing.
The result of both calls is the same.
The difference is that var stream = file.readAsStream() will block the calling thread until the operation completes.
If the call was made in a GUI app from the UI thread, the application will freeze until the IO completes.
If the call was made in a server application, the blocked thread will not be able to handle other incoming requests. The thread pool will have to create a new thread to 'replace' the blocked one, which is expensive. Scalability will suffer.
On the other hand, var stream = await file.readAsStreamAsync() will not block any thread. The UI thread in a GUI application can keep the application responding, a worker thread in a server application can handle other requests.
When the async operation completes, the OS will notify the thread pool and the rest of the method will be executed.
To make all this 'magic' possible, a method with async/await is compiled into a state machine. Async/await lets you make complicated asynchronous code look as simple as its synchronous counterpart.
It makes writing asynchronous code enormously easier. As you noted in your own question, it looks as if you were writing the synchronous variant - but it's actually asynchronous.
To understand this, you need to really know what asynchronous and synchronous means. The meaning is really simple - synchronous means in a sequence, one after another. Asynchronous means out of sequence. But that's not the whole picture here - the two words are pretty much useless on their own, most of their meaning comes from context. You need to ask: synchronous with respect to what, exactly?
Let's say you have a Winforms application that needs to read a file. In the button click, you do a File.ReadAllText, and put the results in some textbox - all fine and dandy. The I/O operation is synchronous with respect to your UI - the UI can do nothing while you wait for the I/O operation to complete. Now, the customers start complaining that the UI seems hung for seconds at a time when it reads the file - and Windows flags the application as "Not responding". So you decide to delegate the file reading to a background worker - for example, using BackgroundWorker, or Thread. Now your I/O operation is asynchronous with respect to your UI and everyone is happy - all you had to do is extract your work and run it in its own thread, yay.
Now, this is actually perfectly fine - as long as you're only really doing one such asynchronous operation at a time. However, it does mean you have to explicitly define where the UI thread boundaries are - you need to handle the proper synchronization. Sure, this is pretty simple in Winforms, since you can just use Invoke to marshal UI work back to the UI thread - but what if you need to interact with the UI repeatedly, while doing your background work? Sure, if you just want to publish results continuously, you're fine with the BackgroundWorker's ReportProgress - but what if you also want to handle user input?
The beauty of await is that you can easily manage when you're on a background thread, and when you're on a synchronization context (such as the Windows Forms UI thread):
string line;
while ((line = await streamReader.ReadLineAsync()) != null)
{
    // Assuming tbxLog is a TextBox: it has no AppendLine, so AppendText is used
    if (line.StartsWith("ERROR:")) tbxLog.AppendText(line + Environment.NewLine);
    if (line.StartsWith("CRITICAL:"))
    {
        if (MessageBox.Show(line + "\r\n" + "Do you want to continue?",
                            "Critical error", MessageBoxButtons.YesNo) == DialogResult.No)
        {
            return;
        }
    }

    await httpClient.PostAsync(...);
}
This is wonderful - you're basically writing synchronous code as usual, but it's still asynchronous with respect to the UI thread. And the error handling is again exactly the same as with any synchronous code - using, try-finally and friends all work great.
Okay, so you don't need to sprinkle BeginInvoke here and there, what's the big deal? The real big deal is that, without any effort on your part, you actually started using the real asynchronous APIs for all those I/O operations. The thing is, there aren't really any synchronous I/O operations as far as the OS is concerned - when you do that "synchronous" File.ReadAllText, the OS simply posts an asynchronous I/O request, and then blocks your thread until the response comes back. As should be evident, the thread is wasted doing nothing in the meantime - it still uses system resources, it adds a tiny amount of work for the scheduler etc.
Again, in a typical client application, this isn't a big deal. The user doesn't care whether you have one thread or two - the difference isn't really that big. Servers are a different beast entirely, though; where a typical client only has one or two I/O operations at the same time, you want your server to handle thousands! On a typical 32-bit system, you could only fit about 2000 threads with default stacksize in your process - not because of the physical memory requirements, but just by exhausting the virtual address space. 64-bit processes are not as limited, but there's still the thing that starting up new threads and destroying them is rather pricy, and you are now adding considerable work to the OS thread scheduler - just to keep those threads waiting.
But the await-based code doesn't have this problem. It only takes up a thread when it's doing CPU work - waiting on an I/O operation to complete is not CPU work. So you issue that asynchronous I/O request, and your thread goes back to the thread pool. When the response comes, another thread is taken from the thread pool. Suddenly, instead of using thousands of threads, your server is only using a couple (usually about two per CPU core). The memory requirements are lower, the multi-threading overheads are significantly lowered, and your total throughput increases quite a bit.
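The thread-count argument can be illustrated with a rough sketch in which Task.Delay stands in for an I/O wait (an assumption; real requests would await database or network calls). A thousand operations are in flight at once, yet none of them holds a thread while waiting.

```csharp
using System;
using System.Threading.Tasks;

static class ManyWaitsDemo
{
    // Hypothetical "request": waits without occupying a thread, then returns its id.
    static async Task<int> HandleAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs); // stand-in for an asynchronous I/O call
        return id;
    }

    // Starts all operations first, then awaits them together.
    public static async Task<int> RunManyAsync(int count, int delayMs)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = HandleAsync(i, delayMs);
        }
        int[] results = await Task.WhenAll(tasks);
        return results.Length;
    }
}
```

Awaiting ManyWaitsDemo.RunManyAsync(1000, 100) should finish in roughly the time of one delay rather than a thousand, because the waits overlap instead of each blocking its own thread.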
So - in a client application, await is only really a thing of convenience. In any larger server application, it's a necessity - because suddenly your "start a new thread" approach simply doesn't scale. And the alternatives to using await are all those old-school asynchronous APIs, which look nothing like synchronous code, and where handling errors is very tedious and tricky.
var stream = await file.readAsStreamAsync();
DoStuff(stream);
is conceptually more like
file.readAsStreamAsync(stream => {
    DoStuff(stream);
});
where the lambda is automatically called when the stream has been fully read. You can see this is quite different from the blocking code.
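The callback form above is conceptual. With real Task objects, a comparable sketch uses ContinueWith. The ReadAsync and DoStuff helpers below are invented for illustration, and the comparison ignores details that await adds, such as resuming on the captured context and unwrapping exceptions.

```csharp
using System;
using System.Threading.Tasks;

static class ContinuationDemo
{
    // Hypothetical asynchronous source standing in for readAsStreamAsync.
    static async Task<string> ReadAsync()
    {
        await Task.Delay(10);
        return "data";
    }

    static int DoStuff(string s)
    {
        return s.Length;
    }

    // The await version: reads like sequential code.
    public static async Task<int> WithAwait()
    {
        var s = await ReadAsync();
        return DoStuff(s);
    }

    // The callback version: the lambda runs once the read has completed.
    public static Task<int> WithContinuation()
    {
        return ReadAsync().ContinueWith(t => DoStuff(t.Result));
    }
}
```

Both methods return a task that eventually produces 4, the length of "data". Neither blocks the caller while the read is pending.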
If you're building a UI application for example, and implementing a button handler:
private async void HandleClick(object sender, EventArgs e)
{
    ShowProgressIndicator();

    var response = await GetStuffFromTheWebAsync();
    DoStuff(response);

    HideProgressIndicator();
}
This is drastically different from the similar synchronous code:
private void HandleClick(object sender, EventArgs e)
{
    ShowProgressIndicator();

    var response = GetStuffFromTheWeb();
    DoStuff(response);

    HideProgressIndicator();
}
Because in the second code the UI will lock up and you'll never see the progress indicator (or at best it'll flash briefly) since the UI thread will be blocked until the entire click handler is completed. In the first code the progress indicator shows and then the UI thread gets to run again while the web call happens in the background, and then when the web call completes the DoStuff(response); HideProgressIndicator(); code gets scheduled on the UI thread and it nicely finishes its work and hides the progress indicator.
What's going on here? Isn't this equivalent to just calling the
blocking variant of the method, i.e.
No, it is not a blocking call. This is syntactic sugar that the compiler uses to create a state machine, which at run time is used to execute your code asynchronously.
It makes your code more readable and almost similar to code that runs synchronously.
It looks like you're missing what this whole async / await concept is about.
The async keyword lets the compiler know that the method may perform asynchronous operations, so it is not compiled like a normal method; instead it is turned into a state machine. The first part of the method (let's call it Part 1) runs on the calling thread until it reaches an await on an operation that has not completed yet. At that point the calling thread is released, and the rest of the method (Part 2) is registered as a continuation that runs when the operation completes: on the captured synchronization context if there is one, otherwise typically on a ThreadPool thread. If an asynchronous operation is started without the await keyword, it is not awaited and the calling thread continues to run until the method is finished. In most cases this is not desirable. That's when we need to use the await keyword.
So a typical scenario is:
Thread 1 enters the async method and executes code Part1 ->
Thread 1 starts the async operation ->
Thread 1 is released, the operation is underway, and Part2 is scheduled to run when it completes ->
Some thread (possibly the same Thread 1 if it's free) continues to run the method till its end (Part2)
I've been reading some async articles here: http://www.asp.net/web-forms/tutorials/aspnet-45/using-asynchronous-methods-in-aspnet-45 and the author says :
When you’re doing asynchronous work, you’re not always using a thread.
For example, when you make an asynchronous web service request,
ASP.NET will not be using any threads between the async method call
and the await.
So what I am trying to understand is, how does it become async if we don't use any Threads for concurrent execution? What does it mean "you're not always using a thread."?
Let me first explain what I know regarding working with threads (A quick example, of course Threads can be used in different situations other than UI and Worker methodology here)
You have UI Thread to take input, give output.
You can handle things in UI Thread but it makes the UI unresponsive.
So lets say we have a stream-related operation and we need to download some sort of data.
And we also allow users to do other things while it is being downloaded.
We create a new worker thread which downloads the file and changes the progress bar.
Once it is done, there is nothing to do so thread is killed.
We continue from UI thread.
We can either wait for the worker thread in UI thread depending on the situation but before that while the file is being downloaded, we can do other things with UI thread and then wait for the worker thread.
Isn't it the same for async programming? If not, what's the difference? I read that async programming uses the ThreadPool to pull threads from, though.
Threads are not necessary for asynchronous programming.
"Asynchronous" means that the API doesn't block the calling thread. It does not mean that there is another thread that is blocking.
First, consider your UI example, this time using actual asynchronous APIs:
You have UI Thread to take input, give output.
You can handle things in UI Thread but it makes the UI unresponsive.
So lets say we have a stream-related operation and we need to download some sort of data.
And we also allow users to do other things while it is being downloaded.
We use asynchronous APIs to download the file. No worker thread is necessary.
The asynchronous operation reports its progress back to the UI thread (which updates the progress bar), and it also reports its completion to the UI thread (which can respond to it like any other event).
This shows how there can be only one thread involved (the UI thread), yet also have asynchronous operations going on. You can start up multiple asynchronous operations and yet only have one thread involved in those operations - no threads are blocked on them.
async/await provides a very nice syntax for starting an asynchronous operation and then returning, and having the rest of the method continue when that operation completes.
ASP.NET is similar, except it doesn't have a main/UI thread. Instead, it has a "request context" for every incomplete request. ASP.NET threads come from a thread pool, and they enter the "request context" when they work on a request; when they're done, they exit their "request context" and return to the thread pool.
ASP.NET keeps track of incomplete asynchronous operations for each request, so when a thread returns to the thread pool, it checks to see if there are any asynchronous operations in progress for that request; if there are none, then the request is complete.
So, when you await an incomplete asynchronous operation in ASP.NET, the thread will increment that counter and return. ASP.NET knows the request isn't complete because the counter is non-zero, so it doesn't finish the response. The thread returns to the thread pool, and at that point: there are no threads working on that request.
When the asynchronous operation completes, it schedules the remainder of the async method to the request context. ASP.NET grabs one of its handler threads (which may or may not be the same thread that executed the earlier part of the async method), the counter is decremented, and the thread executes the async method.
ASP.NET vNext is slightly different; there's more support for asynchronous handlers throughout the framework. But the general concept is the same.
For more information:
My async/await intro post tries to be both an intro yet also reasonably complete picture of how async and await work.
The official async/await FAQ has lots of great links that go into a lot of detail.
The MSDN magazine article It's All About the SynchronizationContext exposes some of the plumbing underneath.
The first time I saw async and await, I thought they were C# syntactic sugar for the Asynchronous Programming Model. I was wrong; async and await are more than that. They support a brand new asynchronous pattern, the Task-based Asynchronous Pattern (TAP); http://www.microsoft.com/en-us/download/details.aspx?id=19957 is a good article to get started. Most of the FCL classes that implement TAP call the APM methods (BeginXXX() and EndXXX()). Here are two code snippets for TAP and APM:
TAP sample:
static void Main(string[] args)
{
    GetResponse();
    Console.ReadLine();
}

private static async Task<WebResponse> GetResponse()
{
    var webRequest = WebRequest.Create("http://www.google.com");
    // Await the task instead of blocking on .Result
    WebResponse response = await webRequest.GetResponseAsync();
    Console.WriteLine(new StreamReader(response.GetResponseStream()).ReadToEnd());
    return response;
}
APM sample:
static void Main(string[] args)
{
    var webRequest = WebRequest.Create("http://www.google.com");
    webRequest.BeginGetResponse(EndResponse, webRequest);
    Console.ReadLine();
}

static void EndResponse(IAsyncResult result)
{
    var webRequest = (WebRequest) result.AsyncState;
    var response = webRequest.EndGetResponse(result);
    Console.WriteLine(new StreamReader(response.GetResponseStream()).ReadToEnd());
}
In the end these two are the same, because GetResponseAsync() calls BeginGetResponse() and EndGetResponse() internally. If we decompile GetResponseAsync() with Reflector, we get code like this:
task = Task<WebResponse>.Factory.FromAsync(
    new Func<AsyncCallback, object, IAsyncResult>(this.BeginGetResponse),
    new Func<IAsyncResult, WebResponse>(this.EndGetResponse), null);
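The same wrapping technique can be tried on any Begin/End pair. As a sketch, the helper below (an invented name) wraps Stream.BeginRead and Stream.EndRead with Task.Factory.FromAsync, which turns an APM operation into an awaitable task. A MemoryStream is used in the usage note only to keep the example self-contained; it is not real I/O.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class FromAsyncDemo
{
    // Wraps the APM pair BeginRead/EndRead in a Task<int> (number of bytes read).
    public static Task<int> ReadViaApmAsync(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead,   // begin method: starts the operation
            stream.EndRead,     // end method: collects the result
            buffer, 0, buffer.Length,
            null);              // no state object
    }
}
```

For example, with var ms = new MemoryStream(Encoding.ASCII.GetBytes("hello")); and a 16-byte buffer, awaiting FromAsyncDemo.ReadViaApmAsync(ms, buffer) yields 5.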
For APM, BeginXXX() takes a callback method as an argument, which is invoked when the operation (typically an I/O-heavy one) completes. Both creating a new thread and making an asynchronous call return immediately on the main thread, so neither blocks it. On the performance side, creating a new thread costs more resources for I/O-bound operations such as reading files, database operations and network reads. There are two disadvantages to creating a new thread:
As mentioned in the article you cite, there is a memory cost, and the CLR limits the size of the thread pool.
Context switches will happen. On the other hand, an asynchronous call does not create any thread manually, and it does not incur a context switch when the I/O-bound operation returns.
Here is a picture which can help to understand the differences:
This diagram is from the MSDN article "Asynchronous Pages in ASP.NET 2.0", which explains in great detail how the old asynchronous model works in ASP.NET 2.0.
For more detail about the Asynchronous Programming Model, see Jeffrey Richter's article "Implementing the CLR Asynchronous Programming Model"; there is also more detail in chapter 27 of his book "CLR via C#, 3rd Edition".
Let’s imagine that you are implementing a web application and as each client request comes in to
your server, you need to make a database request. When a client request comes in, a thread pool
thread will call into your code. If you now issue a database request synchronously, the thread will block
for an indefinite amount of time waiting for the database to respond with the result. If during this time
another client request comes in, the thread pool will have to create another thread and again this
thread will block when it makes another database request. As more and more client requests come in,
more and more threads are created, and all these threads block waiting for the database to respond.
The result is that your web server is allocating lots of system resources (threads and their memory) that
are barely even used!
And to make matters worse, when the database does reply with the various results, threads become
unblocked and they all start executing. But since you might have lots of threads running and relatively
few CPU cores, Windows has to perform frequent context switches, which hurts performance even
more. This is no way to implement a scalable application.
To read data from the file, I now call ReadAsync instead of Read. ReadAsync internally allocates a
Task object to represent the pending completion of the read operation. Then, ReadAsync
calls Win32’s ReadFile function (#1). ReadFile allocates its IRP, initializes it just like it did in the
synchronous scenario (#2), and then passes it down to the Windows kernel (#3). Windows adds the IRP
to the hard disk driver’s IRP queue (#4), but now, instead of blocking your thread, your thread is
allowed to return to your code; your thread immediately returns from its call to ReadAsync (#5, #6,
and #7). Now, of course, the IRP has not necessarily been processed yet, so you cannot have code after
ReadAsync that attempts to access the bytes in the passed-in Byte[].
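As a rough illustration of the quoted passage, here is a minimal sketch of an asynchronous file read. The names AsyncReadSketch and ReadFirstBytesAsync and the 4096-byte stream buffer are placeholders chosen for this example, not something taken from the book.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncReadSketch
{
    // Reads up to maxBytes from the start of the file without blocking the caller.
    public static async Task<byte[]> ReadFirstBytesAsync(string path, int maxBytes)
    {
        var buffer = new byte[maxBytes];

        // FileOptions.Asynchronous requests asynchronous (on Windows: overlapped) I/O,
        // so the read is handed to the OS instead of tying up a thread.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, FileOptions.Asynchronous))
        {
            // ReadAsync hands back a Task<int>. The buffer contents are only
            // meaningful once that task has completed, which is what await ensures.
            int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);

            // The read may return fewer bytes than requested; trim to what arrived.
            Array.Resize(ref buffer, bytesRead);
            return buffer;
        }
    }
}
```

The point to notice is that nothing touches buffer between the call to ReadAsync and the completion of the await, which matches the warning at the end of the quote.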
I am implementing a protocol library. Here is a simplified description.
The main thread in the main function constantly checks whether data is available on the NetworkStream (of a TcpClient). Let us say response is the received message and thread is a running thread.
thread = new Thread(new ThreadStart(function));
thread.IsBackground = true;
thread.Start();
while(true){
response = receiveMessage();
if (response != null)
{
thread.Suspend();
//I am searching for an alternative for the line above and not thread.Abort().
thread2 = new Thread(new ThreadStart(function2));
thread2.IsBackground = true;
thread2.Start();
}
}
So far so good. There are actually more messages handled within the while loop, and there is also a state machine for handling the different sorts of incoming messages, but this should be enough.
(There are also more than just the functions "function" and "function2").
Anyway, what the functions look like inside is not known to this application, since the protocol is hidden from the programmer and meant to be a library. This means the protocol will start some programmer-defined functions as threads, depending on which state of the protocol the program is in.
So if a special response is then received (e.g. a callAnotherFunction message), I want to terminate a thread (here named "thread") abruptly, let's say within 100 ms. But I do not know whether it executes within a loop or not, or how much processing is needed until it terminates.
How can I stop these threads without the deprecated Suspend or the exception-throwing Abort function?
(Note that I cannot force the programmer of the functions to catch the ThreadAbortException.)
Or do I need a different program architecture?
(By the way, I decided to put the loop that polls the network stream via receiveMessage into the main function, since a message can arrive at any time.)
Starting a thread without a reliable way to terminate it is bad practice. Suspend and Abort are among the unreliable ways to terminate a thread, because you may stop the thread in a state that corrupts your entire program, and you have no way to prevent that from happening.
You can see how to kill a thread safely here: Killing a .NET thread
If the "user" is giving you a method to run in a thread, then the user should also give you a method to stop the code from running. Think of it as a contract: you promise the user that you will call the stop method and they promise that the stop method will actually stop the thread. If your user violates that contract then they will be responsible for the issues that arise, which is good because you don't want to be responsible for your user's errors :).
Note that I cannot force the programmer of the functions to catch the ThreadAbortException.
Since Suspend/Abort are bad practice, the programmer doesn't need to catch the ThreadAbortException, however they should catch the ThreadInterruptedException as part of their "contract."
Remember that there are two situations you need to worry about:
The thread is executing some code.
The thread is in a blocking state.
In the case that the thread is executing some code, all you can do is notify the thread that it can exit and wait until it processes the notification. You may also skip the waiting and assume that you've leaked a resource, in which case it's the user's fault again because they didn't design their stop method to terminate their thread in a timely fashion.
In the case where the thread is in a blocking state and it's not blocking on a notification construct (e.g. a semaphore, manual reset event, etc.), you should call Thread.Interrupt() to get it out of the blocking state; the user must handle the ThreadInterruptedException.
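The contract described above could look roughly like the following sketch. The names IStoppableWork, PollingWork, and WorkerHost are hypothetical, and for simplicity the host always calls Thread.Interrupt() rather than first checking what the thread is blocked on.

```csharp
using System;
using System.Threading;

// Hypothetical contract: the user supplies both the work and a way to stop it.
public interface IStoppableWork
{
    void Run();   // executed on the library's thread
    void Stop();  // must cause Run to return promptly
}

// Example user implementation: checks a flag and blocks between checks.
public sealed class PollingWork : IStoppableWork
{
    private volatile bool _stopRequested;

    public bool ExitedCleanly { get; private set; }

    public void Run()
    {
        try
        {
            while (!_stopRequested)
            {
                // Stands in for a blocking call that is not a notification construct.
                Thread.Sleep(Timeout.Infinite);
            }
        }
        catch (ThreadInterruptedException)
        {
            // Part of the contract: treat an interrupt as a request to exit.
        }
        ExitedCleanly = true;
    }

    public void Stop()
    {
        _stopRequested = true;
    }
}

public static class WorkerHost
{
    // Returns true if the thread ended within the timeout.
    public static bool StopThread(Thread thread, IStoppableWork work, TimeSpan timeout)
    {
        work.Stop();         // situation 1: the thread is executing code and will see the flag
        thread.Interrupt();  // situation 2: the thread is (or will next be) blocked
        return thread.Join(timeout);
    }
}
```

If StopThread returns false, the user's Stop did not honor the contract in time, and the library can decide to carry on and treat the thread as leaked.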
Suspend is really evil, especially the way you are trying to use it: to stop thread execution forever. The suspended thread keeps every lock it holds and never releases its resources.
Thread.Abort is slightly better, since it at least tries to terminate the thread more cleanly, and locks have a chance to be released.
To do this properly, you really need your thread's code to cooperate in its termination. Events, semaphores, or even a simple bool value checked by the thread may be enough.
It may be better to re-architect your solution to have a queue of messages and process them on a separate thread. A special message may simply empty the queue.
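A minimal sketch of that queue-based design, assuming string messages and a hypothetical "CLEAR" message that empties the queue (MessagePump and ClearMessage are names made up for this example):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class MessagePump : IDisposable
{
    // Hypothetical "special message" that discards everything still queued.
    public const string ClearMessage = "CLEAR";

    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Action<string> _handler;
    private readonly Thread _worker;

    public MessagePump(Action<string> handler)
    {
        _handler = handler;
        _worker = new Thread(ProcessLoop) { IsBackground = true };
        _worker.Start();
    }

    // Called by the receiving loop for every incoming message.
    public void Post(string message)
    {
        if (message == ClearMessage)
        {
            // Drop every message that has not been picked up by the worker yet.
            string discarded;
            while (_queue.TryTake(out discarded)) { }
            return;
        }
        _queue.Add(message);
    }

    private void ProcessLoop()
    {
        // Blocks while the queue is empty; ends once CompleteAdding has been
        // called and the remaining messages have been handled.
        foreach (var message in _queue.GetConsumingEnumerable())
        {
            _handler(message);
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

Note that clearing the queue only removes work that has not started. A handler that is already running still has to finish or cooperate in its own cancellation.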
You need some sort of cancellation protocol between your application and wherever function comes from. Then you can share some sort of cancellation token between function and your message loop. If the message loop recognizes that function needs to be stopped, it signals that by setting the token, which function must test at appropriate points. The simplest way would be to share a flag variable that can be set atomically from within your message loop and read atomically from function.
I'd, however, consider using the proper asynchronous I/O patterns combined with Tasks, which the .NET Framework provides out of the box, along with its proper cancellation mechanisms.
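With the framework's own types, such a cancellation protocol might be sketched as follows. CancellationSketch, CountUntilCancelled, and RunThenCancelAsync are hypothetical names; CountUntilCancelled stands in for a programmer-defined function that agrees to check the token.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Stand-in for a user-defined function: works in small steps and
    // tests the shared token between them.
    public static int CountUntilCancelled(CancellationToken token)
    {
        int steps = 0;
        while (!token.IsCancellationRequested)
        {
            steps++;
            // Waiting on the token's handle acts as a sleep that ends early on cancel.
            token.WaitHandle.WaitOne(10);
        }
        return steps;
    }

    // Message-loop side: start the function, let it run for a while, then signal it to stop.
    public static async Task<int> RunThenCancelAsync(TimeSpan runFor)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Task<int> work = Task.Run(() => CountUntilCancelled(token));

            await Task.Delay(runFor);  // stands in for "a special response arrived"
            cts.Cancel();              // set the token

            return await work;         // the function notices the token and returns
        }
    }
}
```

This only works when the function cooperates by testing the token; it does not stop code that never checks it.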
So function refers to code which you have little control over? This is pretty typical of 3rd party libraries. Most of the time they do not have built-in abilities to gracefully terminate long-running operations. Since you have no idea how these functions are implemented, you have very few options. In fact, your only guaranteed safe option is to spin these operations up in their own process and communicate with them via WCF. That way, if you need to terminate the operation abruptly, you would just kill the process. Killing another process will not corrupt the state of the current process, unlike calling Thread.Abort on a thread within the current process.