APM Pattern Use a Thread from Threadpool? - c#

I wonder whether the existing I/O-bound APM calls in the .NET API (BeginGetResponse, BeginRead, etc.) use a thread from the thread pool or run on the current thread until the callback. I know that it is "async" all the way down to the hardware/network card, and I also know that the callback is executed on the thread pool. My question is: is all of BeginGetResponse executed on the thread pool, or is the part up to the point of waiting for I/O executed on the current thread, with the rest executed on the thread pool?
I hope the question is clear. I really wonder how BeginGetResponse is implemented under the hood.

APM is a more general mechanism, but the cases you are talking about use the operating system's support for I/O completion ports. The general idea is that your main thread calls the BeginXxx() method. Under the hood, that method calls ThreadPool.BindHandle(), which sets up the plumbing for the completion port to automatically start a thread-pool thread when the I/O operation completes. That thread then calls your callback method.
The core idea is that no thread is waiting while the I/O operation takes place.
This is supported for MessageQueue, FileStream, PipeStream, Socket, FileSystemWatcher, IpcChannel and SerialPort.
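A minimal sketch of the mechanism described above, using FileStream (the file name and buffer size are arbitrary choices for the demo). Passing FileOptions.Asynchronous is what makes FileStream register its handle with the completion port internally; the callback then typically fires on an IOCP thread-pool thread rather than the caller's thread:

```csharp
using System;
using System.IO;
using System.Threading;

class ApmDemo
{
    static void Main()
    {
        File.WriteAllText("demo.txt", "hello apm");
        var done = new ManualResetEvent(false);

        // FileOptions.Asynchronous makes FileStream bind its handle to the
        // I/O completion port (via ThreadPool.BindHandle) under the hood.
        var fs = new FileStream("demo.txt", FileMode.Open, FileAccess.Read,
            FileShare.Read, 4096, FileOptions.Asynchronous);
        var buffer = new byte[32];

        Console.WriteLine("Caller thread:   {0}",
            Thread.CurrentThread.ManagedThreadId);

        fs.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            // Usually runs on an IOCP thread-pool thread, not the caller.
            Console.WriteLine("Callback thread: {0}",
                Thread.CurrentThread.ManagedThreadId);
            int read = fs.EndRead(ar);
            Console.WriteLine("Read {0} bytes", read);
            done.Set();
        }, null);

        done.WaitOne();  // demo only: a real app would not block here
        fs.Dispose();
    }
}
```

No thread sits blocked between BeginRead and the completion; the event here only keeps the console process alive long enough to see the callback.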

BeginXxx executes on the current thread. You can easily verify this for yourself using e.g. Reflector. Moreover, sometimes the callback is executed on the current thread too: one case is when an error occurs early, and another is when the actual asynchronous I/O operation blocks, which happens sometimes, as asynchronous I/O is not guaranteed not to block.

The IAsyncResult approach using worker pool threads is available only for some tasks, such as file I/O (not directory enumeration), LDAP queries (v2.0), and ADO.NET queries.
If an APM implementation exists and you can handle the complexity, use it. These implementations are usually built by the .NET team, since getting them right takes some effort.
Otherwise, use a hand-built approach only if you think it will gain you speed.
Using explicit threads gives you more control. Specifically, you can choose to have foreground threads, which will keep your application "alive" after the main thread returns from Main. Explicit threads can also specify their COM threading apartment.
The general rule is to use the threadpool when you have a queue of work items to do, and use an explicit thread when you have an architectural need for them.

Many operations use IO completion ports.
This means that no thread is used while waiting for the operation. Once the operation is complete, the callback is called on a thread-pool thread or using some other synchronization context.

Related

How do I run a large number of blocking operations in a performant way?

I want to execute a library method that does a blocking I/O operation many times (up to 67840 calls). The library does not provide an async version of the method.
Since in most cases the call just waits for a timeout, I want to run multiple calls in parallel. My method is async, therefore it would be good if I could await the result.
Since the ThreadPool should not be used for blocking operations, I would like to do the following:
Start a number of threads (e.g. 1024)
Run the blocking calls on these threads
await the completion (e.g. via TaskCompletionSource) and process the result of each call in normal Tasks on the ThreadPool
Are there existing classes in .NET with which I could achieve something like this? I am aware of TaskCreationOptions.LongRunning, but as far as I can see this would create a new thread for each call.
blocking I/O operation... The library does not provide an async version of the method.
Just from this, you know you won't end up with an "ideal" solution. Ideally, I/O is performed asynchronously. In fact, on Windows, all I/O is performed asynchronously at the OS level, with each synchronous API call just blocking the current thread until that asynchronous operation completes.
So, the first thing you should accept is that you'll need to bend the rules a little.
Since in most cases the call just waits for a timeout, I want to run multiple calls in parallel.
Yes. Parallelism is an appropriate solution. If it were possible to do the I/O asynchronously, then parallelism would not be the appropriate solution, but since the I/O is blocking (and you have no control over that), then parallelism is the best solution you're left with.
My method is async, therefore it would be good if I could await the result.
This doesn't necessarily follow. It's acceptable for asynchronous methods to be partially blocking, as long as that's clearly documented. The asynchronous signature (i.e., "returns a Task" and has an *Async suffix) implies that the method may be asynchronous, not that it must be asynchronous.
Personally, I prefer not to do thread offloading in my logic methods, and only do it when calling them from the UI layer (link to my blog).
Since the ThreadPool should not be used for blocking operations
Well, this is one of those rules you can consider bending. The thread pool does work just fine with blocking operations, and in fact it's my first suggested solution.
Start a number of threads (e.g. 1024)... Run the blocking calls on these threads
If you toss out the "I want my own threads" part and just use the thread pool, then the answer is quite simple: Parallel or PLINQ would work quite nicely. You can set a maximum level of parallelism for both of these approaches, and you can set a larger than normal minimum thread count on the thread pool to scale up the number of threads more quickly if you want.
This does toss a lot of blocking work on the thread pool, which is generally not recommended but can work in some scenarios. Specifically, client applications like console apps or GUI apps would work fine with this. If this is in a web app, though, then you would not want to fill up the thread pool with blocking calls. In that case, I'd actually recommend splitting up the scanning to a separate app using a basic distributed architecture (link to my blog).
await the completion (e.g. via TaskCompletionSource) and process the result of each call in normal Tasks on the ThreadPool
If you want to do the parallel work on a separate thread, then you can wrap it in await Task.Run(...); mucking around with TCS isn't necessary.
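A sketch of the suggestion above, under the assumption that the library's blocking call is something like the hypothetical BlockingProbe below. PLINQ caps the number of concurrent blocking calls, and a single Task.Run makes the whole scan awaitable without any TaskCompletionSource plumbing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class BlockingScan
{
    // Hypothetical stand-in for the library's blocking call.
    static int BlockingProbe(int target)
    {
        Thread.Sleep(10); // simulate waiting on a timeout
        return target * 2;
    }

    static async Task<int[]> ScanAsync(IEnumerable<int> targets)
    {
        // One Task.Run offloads the whole parallel loop; the caller awaits it.
        return await Task.Run(() =>
            targets.AsParallel()
                   .WithDegreeOfParallelism(64) // cap concurrent blocking calls
                   .Select(BlockingProbe)
                   .ToArray());
    }

    static void Main()
    {
        int[] results = ScanAsync(Enumerable.Range(0, 100)).Result;
        Console.WriteLine(results.Length); // 100
    }
}
```

Parallel.ForEach with ParallelOptions.MaxDegreeOfParallelism is the equivalent choice if you don't need the results as a projected sequence.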

How are asynchronous I/O methods processed

After reading a lot about async-await, I can only find the benefits of using it on the GUI thread (WPF/WinForms).
In what scenarios does it reduce the creation of threads in WCF services?
Must a programmer use async-await on every method in the service when choosing to implement async-await in a web service? Does making some methods non-async in a service otherwise full of async-await reduce the efficiency of my service? How?
Last question: some say that using 'await Task.Run(()=>...)' is not "real async-await". What do they mean by that?
Thanks in advance,
Stav.
EDIT:
Both answers are excellent, but for an even deeper explanation of how async-await works, I suggest reading Stephen Cleary's answer here:
https://stackoverflow.com/a/7663734/806963
The following topics are required to understand his answer:
SynchronizationContext, SynchronizationContext.Current, TaskScheduler, TaskScheduler.Current, ThreadPool.
The real benefit of async/await in server applications (like WCF) is asynchronous I/O.
When you call a synchronous I/O method, the calling thread is blocked waiting for the I/O to complete. The thread cannot be used by other requests; it just waits for the result. When more requests arrive, the thread pool creates more threads to handle them, wasting a lot of resources: memory, context switches when the waiting threads get unblocked, and so on.
If you use async IO, the thread is not blocked. After starting the asynchronous IO operation, it is again available to be used by the thread pool. When the async operation is finished, the thread pool assigns a thread to continue processing the request. No resources wasted.
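The contrast can be sketched like this (a minimal example, assuming .NET 4.5+ for async/await; the class and file names are made up for illustration). In the synchronous version the request thread sits in a wait state for the whole read; in the asynchronous version it returns to the pool at the await:

```csharp
using System.IO;
using System.Threading.Tasks;

class FileService
{
    // Synchronous: the calling thread blocks until the read finishes.
    public string Load(string path)
    {
        using (var reader = new StreamReader(path))
            return reader.ReadToEnd(); // thread waits here, doing nothing
    }

    // Asynchronous: the thread is released at the await; a pool thread
    // resumes the method when the OS signals that the I/O is complete.
    public async Task<string> LoadAsync(string path)
    {
        using (var reader = new StreamReader(path))
            return await reader.ReadToEndAsync();
    }
}
```

In a server application, the async version is what lets one thread serve many in-flight requests.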
From MSDN (it's about file I/O, but it applies to other kinds too):
In synchronous file I/O, a thread starts an I/O operation and immediately enters a wait state until the I/O request has completed. A thread performing asynchronous file I/O sends an I/O request to the kernel by calling an appropriate function. If the request is accepted by the kernel, the calling thread continues processing another job until the kernel signals to the thread that the I/O operation is complete. It then interrupts its current job and processes the data from the I/O operation as necessary.
Now you probably can see why await Task.Run() will not give any benefit if the IO in the task is done synchronously. A thread will get blocked anyway, just not the one that called the Task.Run().
You don't need to implement every method asynchronously to see improvement in performance (although it should become a habit to always perform I/O asynchronously).
In what scenarios does it reduce the creation of threads in WCF services?
If you have an action that will wait on an IO operation (reading from the database, calling an external web service, ...), using async/await frees up the managed thread that your WCF request is being processed on. That makes the thread available for other requests, pending completion of your IO. It makes for more efficient use of the thread pool.
After reading a lot about async-await, I can only find the benefits of using it on the GUI thread
For client applications that is the key benefit that I'm aware of, since you are far less likely to run out of managed threads than you are in a server application.
some say that using 'await Task.Run(()=>...)' is not a "real async-await".
You occupy a managed (pool) thread to run your new task, so you are not saving any managed threads.

How are threads managed for Begin/Async calls (like socket IO)?

The .NET Socket async API manages threads automatically when using the BeginXXX methods. For example, if I have 100 active connections sending and receiving TCP messages, around 3 threads will be used. And it makes me curious.
How does the API perform this thread management?
How is the flow of connections divided among the threads for processing?
How does the manager prioritize which connections/reads/writes must be processed first?
My questions may not make sense because I don't know how it works or what to ask specifically, so sorry. Basically, I need to know how this whole process works at a low level.
The .Net Socket async API manages threads automatically when using the BeginXXX methods.
This is not quite correct. The APM Begin/End-style socket APIs do not manage threads at all. Rather, the completion AsyncCallback is called on a "random" thread: the thread on which the asynchronous socket I/O operation happened to complete. Most likely, this will be an IOCP (I/O completion port) thread-pool thread, different from the thread on which you called the BeginXXX method. For more details, check Stephen Cleary's "There Is No Thread".
How the manager prioritizes which connections/readings/writings must be processed first?
The case when there are no IOCP threads available to handle the completion of an async I/O operation is called ThreadPool starvation. It happens when all pool threads are busy executing some code (e.g., processing received socket messages) or are blocked in a blocking call like WaitHandle.WaitOne(). In this case, the I/O completion routine is queued to the ThreadPool and executed when a thread becomes available, on a FIFO basis.
You have the option to increase the size of the ThreadPool with the SetMinThreads/SetMaxThreads APIs, but doing so isn't always a good idea. The number of actually concurrent threads is limited by the number of CPUs/cores anyway, so you'd rather finish any CPU-bound processing work as soon as possible and release the thread back to the pool.
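A small sketch of the tuning knob mentioned above (the value 64 is an arbitrary example, not a recommendation). Raising the minimum lets the pool inject threads immediately under a burst of completions instead of throttling their creation:

```csharp
using System;
using System.Threading;

class PoolTuning
{
    static void Main()
    {
        int workerMin, ioMin;
        ThreadPool.GetMinThreads(out workerMin, out ioMin);
        Console.WriteLine("Min worker: {0}, min IOCP: {1}", workerMin, ioMin);

        // Raise the IOCP floor so a burst of I/O completions doesn't have to
        // wait for the pool's slow thread-injection heuristic. Use with care:
        // more runnable threads than cores just adds context switching.
        bool ok = ThreadPool.SetMinThreads(workerMin, Math.Max(ioMin, 64));
        Console.WriteLine("Raised IOCP floor: {0}", ok);
    }
}
```

SetMinThreads returns false if the requested values are out of range (for example, above the current maximums), so it is worth checking the result.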

ReadToEndAsync from UI Thread

If I call await ReadToEndAsync from the UI thread on Windows Phone 8, in what context will ReadToEndAsync do its work? Will a task get queued for processing by the UI thread itself, or will a new thread do the work?
Based on this:
http://blogs.msdn.com/b/ericlippert/archive/2010/11/04/asynchrony-in-c-5-0-part-four-it-s-not-magic.aspx
it seems like it will run on the UI thread.
This is an essential truth of async in its purest form: There is no thread.
For a truly asynchronous stream, ReadToEndAsync has almost no work to do. When you call that method, it merely asks the runtime to read to the end and notify it when the operation is complete (via a Task). The runtime turns to the OS, asks it to read, and to notify it when the operation is complete (e.g., via an IOCP). The OS turns to the device driver, asks it to read, and to notify it when the operation is complete (e.g., via an IRP). The device driver turns to the device, asks it to read, and to notify it when the operation is complete (e.g., via an IRQ).
There is no thread.
This is an ideal situation, of course. In the real world, at some point the "read to end" operation is broken up into several "read n byte" operations, and those need to be stitched back together. That (tiny) amount of work is done using borrowed threads: unknowable threads for kernel-mode code and thread pool threads for user-mode code.
Also, there are some situations where an asynchronous API does not exist. In those cases, asynchronous work is faked using a thread pool thread. For example, if you call ReadToEndAsync on a MemoryStream, there are no asynchronous APIs for reading from memory, so that is a fake asynchronous operation that will run on the thread pool.
But the idea that there always must be a thread to execute an asynchronous operation is not the truth. Do not try to control the thread — that's impossible. Instead, only try to realize the truth: there is no thread.
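The MemoryStream case mentioned above can be seen directly: wrapping a MemoryStream in a StreamReader and calling ReadToEndAsync involves no device, no IRP and no IOCP, so the returned Task is completed by ordinary borrowed/pool work (a minimal sketch; the string is arbitrary):

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class FakeAsyncDemo
{
    static void Main()
    {
        var bytes = Encoding.UTF8.GetBytes("all in memory");
        using (var reader = new StreamReader(new MemoryStream(bytes)))
        {
            // Reading from memory has no true asynchronous path, so this is
            // a "fake" async operation; it typically completes immediately.
            Task<string> task = reader.ReadToEndAsync();
            Console.WriteLine(task.Result); // all in memory
        }
    }
}
```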
Edit: Expanded this answer into a blog post.

Which threads work method I need to use?

I have an audio player application (C# .NET 4.0 WPF) that gets an audio stream from the web and plays it. The app also displays waveforms and spectrums, and saves the audio to the local disk. It also does a few more things.
My question is: when I receive a new packet of bytes from the web and need to play it (and maybe write it to the local disk, etc.), do I need to use threads? I tried doing everything on the main thread and it seems to work well.
I could use the thread pool for every byte packet received on my connection. Would this be a reasonable approach?
For this you can use the Task Parallel Library (TPL), a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in .NET Framework 4. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications. The TPL scales the degree of concurrency dynamically to make the most efficient use of all available processors. In addition, it handles the partitioning of work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details.
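For the packet scenario in the question, this could look roughly as follows (a sketch; Play and Save are hypothetical stand-ins for the app's real audio and disk operations). Task.Factory.StartNew is the .NET 4.0 idiom (Task.Run arrived in 4.5):

```csharp
using System.Threading.Tasks;

class PacketHandler
{
    // Hypothetical stand-ins for the app's real operations.
    void Play(byte[] packet) { /* feed the audio output */ }
    void Save(byte[] packet) { /* append to the local file */ }

    // Called on the main (UI) thread for each packet that arrives.
    public Task OnPacketReceived(byte[] packet)
    {
        // The TPL schedules this work on a ThreadPool thread,
        // keeping the UI thread free to stay responsive.
        return Task.Factory.StartNew(() =>
        {
            Play(packet);
            Save(packet);
        });
    }
}
```

Returning the Task lets the caller observe completion or failures instead of firing and forgetting.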
Another option (if the operations you were performing were sufficiently long running) is the BackgroundWorker class. The BackgroundWorker component gives you the ability to execute time-consuming operations asynchronously ("in the background"), on a thread different from your application's main UI thread. To use a BackgroundWorker, you simply tell it what time-consuming worker method to execute in the background, and then you call the RunWorkerAsync method. Your calling thread continues to run normally while the worker method runs asynchronously. When the method is finished, the BackgroundWorker alerts the calling thread by firing the RunWorkerCompleted event, which optionally contains the results of the operation. This may not be the best option for you if you have many operations to undertake sequentially.
The next alternative, which has been largely replaced by the TPL, is the Thread class. It is not as easy to use as the TPL, and you can do almost everything with the TPL that you can with the Thread class, while the TPL is much more user-friendly.
I hope this helps.
I suggest using two threads: in one you download packets from the web and put them in a queue (this can be the UI thread if you use an async download operation), and in another you drain the queue and process the packets from it.
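The two-thread producer/consumer suggestion above maps naturally onto BlockingCollection (available in .NET 4.0); a minimal console sketch, with the "download" replaced by a loop that enqueues dummy packets:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class PacketPipeline
{
    static void Main()
    {
        var queue = new BlockingCollection<byte[]>();

        // Consumer thread: blocks when the queue is empty, wakes per packet.
        var consumer = Task.Factory.StartNew(() =>
        {
            foreach (var packet in queue.GetConsumingEnumerable())
                Console.WriteLine("Processing {0} bytes", packet.Length);
        }, TaskCreationOptions.LongRunning);

        // Producer (here the main thread): downloads and enqueues packets.
        for (int i = 0; i < 3; i++)
            queue.Add(new byte[128]);

        queue.CompleteAdding(); // signal "no more packets"
        consumer.Wait();
    }
}
```

GetConsumingEnumerable ends cleanly once CompleteAdding has been called and the queue is drained, so the consumer needs no explicit stop flag.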
