I know .NET has a good asynchronous model for network I/O, which uses completion ports under the hood. But all the callbacks happen on I/O threads from the thread pool.
Is there a .NET counterpart to Java's Selector, which handles multiple streams on a single thread? Or do the thread pool callbacks scale better than that single-threaded approach?
Thanks,
Dodd
For async operations, an I/O handle is associated with the thread pool. When an async operation completes, the callback (for each stream) may or may not execute on the same thread; any available thread pool thread could process it, and depending on runtime conditions the same thread could end up processing several callbacks or just one.
Hope this helps
EDIT: Adding reply to Dodd's comment
I'm not intimately familiar with the Selector API, but from looking at an example posted here it seems that Selector waits until all events occur. Is that true? If so, the caller would have to wait for all events to occur even when one event arrives sooner than another.
But if the Selector works by processing an event as soon as it occurs, you could run into a situation where the selector is processing the callback for one event while another event arrives (I would imagine the incoming event gets queued somewhere, or else you would be dropping events). Even so, it would still reduce throughput when the events are orthogonal and should be processed as soon as they occur.
The async model in .NET is centered around the thread pool to reduce the overhead of creating new threads (thread creation is an expensive operation). If you are observing that the thread pool is maxing out, you can increase the number of threads in the pool, as documented here. Bear in mind, though, that at the end of the day you are limited by the number of processors: on a dual-core box only two threads can actually be running at once, with all the others blocked, so that might be something to take into account.
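For illustration, a minimal sketch of reading and raising those limits (the value 50 is an arbitrary example, not a recommendation):

    using System;
    using System.Threading;

    class PoolTuning
    {
        static void Main()
        {
            // Read the current limits for worker and I/O completion threads.
            int workers, io;
            ThreadPool.GetMaxThreads(out workers, out io);
            Console.WriteLine("Max worker threads: " + workers + ", max I/O threads: " + io);

            // Raise the minimum so the pool injects threads faster under bursty load.
            ThreadPool.SetMinThreads(50, 50);
        }
    }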
Hope this helps.
Thanks, Abhijeet.
Well, my concern is that in some extremely busy scenarios, many async callbacks happen simultaneously and we run out of threads: then context switching becomes a nightmare. In this particular case, is the async callback the right choice? Or should we use Select()?
Related
I have a pretty specific question about the .NET threadpool.
I would say I have a pretty fair understanding of the threadpool, but one thing still boggles my mind.
Let's assume I run a web application which serves requests, but also performs a lot of heavy duty CPU-bound work by rendering / editing uploaded media.
Common advice when it comes to separating I/O-bound and CPU-bound tasks in an application is to dispatch the CPU-bound work to the .NET ThreadPool. Concretely, that would mean dispatching the call with Task.Run(...). So far so good.
However, I do wonder what would happen if this is done for a lot of requests. Let's say several hundred or thousand, enough to put an enormous strain on a machine, even up to the point where the ThreadPool just can't handle it anymore. Adding more threads would obviously only go so far when your CPU can't handle more. I would say at this point the ThreadPool's threads are also at the mercy of the CPU itself and the scheduling algorithm.
What implications would this have on I/O bound async operations?
Would this cause I/O bound async operations to struggle with executing their continuation? Given we are in a runtime environment which executes async/await continuations on the Threadpool and discards the SynchronizationContext, what would ensure that these would still execute properly?
Does the ThreadPool make any sophisticated assumptions as to which thread receives scheduling priority, to ensure throughput even when it's absolutely polluted with work?
It would be especially interesting to know how ASP.NET Core deals with this, since the request handlers themselves supposedly run on ThreadPool threads.
Let's assume I run a web application which serves requests, but also performs a lot of heavy duty CPU-bound work by rendering / editing uploaded media.
Common advice when it comes to separating I/O-bound and CPU-bound tasks in an application is to dispatch the CPU-bound work to the .NET ThreadPool. Concretely, that would mean dispatching the call with Task.Run(...). So far so good.
No, that's bad advice.
ASP.NET is already handling the request on a thread pool thread, so switching to another thread pool thread via Task.Run isn't going to help anything - in fact, it'll make things worse.
Task.Run is fine to offload CPU work to the thread pool when the calling method is a GUI thread. However, it's not a good idea to use Task.Run on ASP.NET, generally speaking.
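To make the extra hop concrete, here's a hypothetical controller (MediaController and RenderMedia are made-up names, not from the question):

    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;

    public class MediaController : Controller
    {
        [HttpPost]
        public async Task<IActionResult> Upload()
        {
            // ASP.NET Core is already running this action on a thread pool thread.
            // Task.Run moves the CPU-bound work to a *second* pool thread and parks
            // this one on the await: an extra context switch for no gain.
            var processed = await Task.Run(() => RenderMedia());

            // Doing the work inline would use one pool thread instead of two:
            // var processed = RenderMedia();

            return Ok(processed);
        }

        // Stand-in for the CPU-bound media work described in the question.
        private static byte[] RenderMedia() => new byte[0];
    }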
However, I do wonder what would happen if this is done for a lot of requests. Let's say several hundred or thousand, enough to put an enormous strain on a machine, even up to the point where the ThreadPool just can't handle it anymore. Adding more threads would obviously only go so far when your CPU can't handle more.
The thread pool will inject new threads whenever it is over-full. However, the injection rate is limited, so the thread pool grows slowly.
What implications would this have on I/O bound async operations? Would this cause I/O bound async operations to struggle with executing their continuation? ... what would ensure that these would still execute properly?
First off, the I/O requests themselves (and their lowest-level, BCL-internal continuations) are not affected. That's because "the" thread pool is actually two thread pools: there's worker threads (that execute queued work) and there's I/O threads (that enlist in the I/O completion port and handle low-level I/O completion).
However, at some point most continuations do transition to the worker thread pool, so by the time your code continues, it needs a regular thread pool thread to do so. And yes, that means that if the (worker) thread pool is saturated, then that can starve await continuations.
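You can observe the two pools separately; a minimal console sketch:

    using System;
    using System.Threading;

    class PoolCounts
    {
        static void Main()
        {
            // The two numbers confirm that worker threads and I/O completion
            // threads are counted (and throttled) separately.
            int worker, io;
            ThreadPool.GetAvailableThreads(out worker, out io);
            Console.WriteLine("Available worker threads: " + worker);
            Console.WriteLine("Available I/O completion threads: " + io);
        }
    }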
Having ASP.NET handlers do heavy CPU work is unusual. The thread pool does have a lot of knobs to tweak if you do need to support it. And there's always the option of splitting the CPU-bound APIs out into a separate service, which would give you two different ASP.NET apps: one I/O-bound and the other CPU-bound, letting you tune the thread pool appropriately for each.
I am building on top of an existing application and using the C# SerialPort class to handle data transactions over USB. I've noticed that when the application spawns a number of long-running threads and does not specify them as long-running using the TaskCreationOptions.LongRunning option, these threads can temporarily starve the SerialPort DataReceived event and prevent it from triggering for some extended amount of time.
Is this some fundamental result of the way in which C# handles thread management?
Is there any way to increase the "priority" of the DataReceived event?
If not, would a "solution" be to have a constantly running thread that polls the serial port data flags rather than using the DataReceived event?
Thanks!
That is pretty fundamental, yes. Your DataReceived event handler is called on a thread-pool thread. When you've got too many of them active for other purposes, like Tasks that are not LongRunning, it can be a while before your event handler gets a shot at running. The normal operating system scheduling algorithm that boosts a thread's priority when it completes an I/O call is ineffective here; it only gets the thread-pool thread itself scheduled efficiently :)
This is a fundamental fire-hose problem: you are expecting the machine to accomplish more work than it can perform. The ballpark way to know whether you are doing it right is to look at Task Manager: CPU usage should be pegged at 100%. If it is substantially less, you are not using the thread pool efficiently; some tasks are hogging the pool but not executing enough code, typically because they spend too much time waiting on a sync object or for an I/O operation to complete. You ought to fix that, either by using a plain Thread or with TaskCreationOptions.LongRunning. ThreadPool.SetMinThreads() is often quoted as a quick fix; it is a dirty one. Upgrading the machine spec, with more cores, is a cleaner one.
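A minimal sketch of those two options (DoHeavyWork is a placeholder for the blocking work):

    using System.Threading;
    using System.Threading.Tasks;

    class Offload
    {
        static void Main()
        {
            // Option 1: a dedicated Thread, so pool threads stay free for
            // short-lived work like the DataReceived callbacks.
            var worker = new Thread(DoHeavyWork) { IsBackground = true };
            worker.Start();

            // Option 2: the LongRunning hint, which (in the current
            // implementation) also gives the task its own thread.
            Task.Factory.StartNew(DoHeavyWork, TaskCreationOptions.LongRunning);

            Thread.Sleep(1000); // keep the process alive for the demo
        }

        // Placeholder for the long-running, blocking work.
        static void DoHeavyWork() { Thread.Sleep(500); }
    }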
There's an even simpler solution -- use the serial port the way the device driver writers intended. There is absolutely nothing about serial port communications that warrants use of a worker thread, either manually created or from the thread pool. I encourage you to read my entire blog post "If you must use .NET System.IO.Ports.SerialPort", but the short version is to use BaseStream.ReadAsync with async+await if you are on a recent version of .NET, and BaseStream.BeginRead with a callback otherwise.
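For illustration, a minimal sketch of the async+await approach (the port name, baud rate, and Process handler are placeholders):

    using System;
    using System.IO.Ports;
    using System.Threading.Tasks;

    class SerialReader
    {
        // Reads continuously from the port without DataReceived and without
        // dedicating a thread to polling.
        static async Task ReadLoopAsync(SerialPort port)
        {
            var buffer = new byte[4096];
            while (true)
            {
                int n = await port.BaseStream.ReadAsync(buffer, 0, buffer.Length);
                if (n == 0) break;      // port closed
                Process(buffer, n);     // hand the bytes to your own handler
            }
        }

        // Placeholder for application-specific handling of received bytes.
        static void Process(byte[] data, int count) { }

        static async Task Main()
        {
            using (var port = new SerialPort("COM1", 9600))
            {
                port.Open();
                await ReadLoopAsync(port);
            }
        }
    }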
Also, thanks for providing me with yet another reason the DataReceived event is horrible. The list is getting quite long.
I have a piece of code (on a server) that uses an async method to receive data on sockets, like this:
    asyncRes = connectionSocket.BeginReceive(receiveBuffer, 0, RECEIVING_BUFFER_SIZE,
        SocketFlags.None, out error, new AsyncCallback(ReceiveDataDone), null);
In the handler (ReceiveDataDone), there are cases where Thread.Sleep(X) is used in order to wait for other things (a questionable implementation, indeed). I know this is questionable design, but I wonder whether code like this could explain the explosion of threads created in my application, given the other pending sockets on the server that have their ReceiveDataDone called. (When the server handles many connections, the number of threads created explodes.) I wonder how the BeginReceive method on .NET sockets works, as that could explain the huge number of threads I see.
You absolutely should not perform any kind of blocking action in APM callbacks. These are run in the ThreadPool. The ThreadPool is designed for the invocation of short-lived tasks. If you block (or take a long time to execute) you are tying up (a finite number of) threads and causing ThreadPool starvation. Because the ThreadPool does not spin up extra threads easily (in fact, it's quite slow to start extra threads), you're bottlenecking on the timing that controls how quickly the ThreadPool is allowed to spin up new threads.
Despite answering a different question, this answer I provided a while back explains the same issue:
https://stackoverflow.com/a/1733226/14357
You should not use Thread.Sleep to wait in ThreadPool threads; it blocks the thread, which then cannot accept any further work items for as long as it is blocked.
You can use a Timer with a TimerCallback for such a use case. It lets the ThreadPool schedule other work in the meantime instead of holding the waiting thread.
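A minimal sketch (the five-second due time is an arbitrary stand-in for the wait):

    using System;
    using System.Threading;

    class DeferredWork
    {
        static void Main()
        {
            // Instead of Thread.Sleep(5000) inside the callback, schedule the
            // follow-up work with a Timer; the pool thread returns immediately.
            var timer = new Timer(OnTimer, null, 5000, Timeout.Infinite);

            Console.ReadLine();   // keep the process alive for the demo
            GC.KeepAlive(timer);  // keep the timer from being collected early
        }

        // Runs on a thread pool thread when the due time elapses.
        static void OnTimer(object state)
        {
            Console.WriteLine("Deferred work running now.");
        }
    }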
I'm making a multi-threaded application using delegates to handle the processing of requests in a WCF service. I want the clients to be able to send the request, then disconnect and wait for a callback announcing the work is done (which will most likely be a search through a database). I don't know how many requests may come in at once; it could be one every once in a while, or it could spike to dozens.
As far as I know, .Net's threadpool has 25 threads available to use. What happens when I spawn 25 delegates or more? Does it throw an error, does it wait, does it pause an existing operation and start working on the new delegate, or some other behavior?
Beyond that, what happens if I want to spawn 25 or more delegates while other operations (such as incoming/outgoing connections) want to start, and/or when another operation is running and I want to spawn another delegate?
I want to make sure this is scalable without being too complex.
Thanks
All operations are queued (I am assuming that you are using the threadpool directly or indirectly). It is the job of the threadpool to munch through the queue and dispatch operations onto threads. Eventually all threads may become busy, which will just mean that the queue will grow until threads are free to start processing queued work items.
You're confusing delegates with threads, and with the number of concurrent connections.
With WCF 2-way bindings, the connection remains open while waiting for the callback.
IIS 7 or above, on modern hardware, should have no difficulty maintaining a few thousand concurrent connections if they're sitting idle.
Delegates are just method pointers - you can have as many as you wish. That doesn't mean they're being invoked concurrently.
If you are using ThreadPool.QueueUserWorkItem then it just queues the extra items until a thread is available.
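A minimal sketch demonstrating that queueing (the item count and sleep are arbitrary):

    using System;
    using System.Threading;

    class QueueDemo
    {
        static void Main()
        {
            // Queue more work items than there are pool threads; the extras
            // simply wait in the queue until a thread frees up.
            for (int i = 0; i < 100; i++)
            {
                int id = i;
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine("Work item " + id + " on thread "
                        + Thread.CurrentThread.ManagedThreadId);
                    Thread.Sleep(100); // simulate a short unit of work
                });
            }
            Console.ReadLine(); // keep the process alive while items drain
        }
    }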
The ThreadPool's default maximum number of threads is 250, not 25! You can still set a higher limit for the ThreadPool if you need one.
If your ThreadPool runs out of threads, two things may happen: all operations are queued until the next resource is available; and if there are finished threads that might still be "in use", the GC will trigger and free some of them up, providing you with new resources.
However, you can also create threads without using the ThreadPool.
Scenario
We have a C# .NET web application that records incidents. An external database needs to be queried when an incident is approved by a supervisor. The queries to this external database sometimes take a while to run, and this lag is experienced through the browser.
Possible Solution
I want to use threading to eliminate the perceived hang in the browser. I have used the Thread class before and have heard about the ThreadPool, but I just found BackgroundWorker in this post.
MSDN states:
The BackgroundWorker class allows you to run an operation on a separate, dedicated thread. Time-consuming operations like downloads and database transactions can cause your user interface (UI) to seem as though it has stopped responding while they are running. When you want a responsive UI and you are faced with long delays associated with such operations, the BackgroundWorker class provides a convenient solution.
Is BackgroundWorker the way to go when handling long running queries?
What happens when two or more BackgroundWorker processes are run simultaneously? Is it handled like a pool?
Yes, BackgroundWorker can significantly simplify your threading code for long-running operations. The key is registering for the DoWork, ProgressChanged, and RunWorkerCompleted events. These help you avoid having to pass a bunch of synchronization objects back and forth with the thread to determine the progress of the operation.
Also, I believe the progress events are called on the UI thread, avoiding the need for calls to Control.Invoke to update your UI.
To answer your last question: yes, threads are allocated from the .NET thread pool, so while you may instantiate as many BackgroundWorker objects as you'd like, you can only run as many concurrent operations as the thread pool will allow.
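A minimal console sketch of that event wiring (the two-second sleep stands in for the slow query; in a WinForms app the completion event would come back on the UI thread):

    using System;
    using System.ComponentModel;
    using System.Threading;

    class WorkerDemo
    {
        static void Main()
        {
            var worker = new BackgroundWorker();

            // Runs on a thread pool thread.
            worker.DoWork += (s, e) =>
            {
                Thread.Sleep(2000);     // stand-in for the slow query
                e.Result = "query result";
            };

            // Raised when DoWork finishes.
            worker.RunWorkerCompleted += (s, e) =>
                Console.WriteLine("Done: " + e.Result);

            worker.RunWorkerAsync();
            Console.ReadLine(); // keep the process alive for the demo
        }
    }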
If you're using .NET 4 (or can use the TPL backport from the Rx Framework), then one nice option is to use a Task created with the LongRunning hint.
This provides many options difficult to accomplish via the ThreadPool or BackgroundWorker, including allowing for continuations to be specified at creation time, as well as allowing for clean cancellation and exception/error handling.
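A minimal sketch of those pieces together (the work and result are placeholders):

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class LongRunningDemo
    {
        static void Main()
        {
            var cts = new CancellationTokenSource();

            // LongRunning hints that the work deserves its own thread.
            var task = Task.Factory.StartNew(() =>
                {
                    cts.Token.ThrowIfCancellationRequested();
                    return 42; // stand-in for the long-running result
                },
                cts.Token,
                TaskCreationOptions.LongRunning,
                TaskScheduler.Default);

            // Continuation specified up front; also observes faults.
            task.ContinueWith(t =>
                Console.WriteLine(t.IsFaulted ? "failed" : "result: " + t.Result));

            Console.ReadLine(); // keep the process alive for the demo
        }
    }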
I ran into a similar situation with long-running queries. I used the asynchronous invocation provided by delegates: you can use the delegate's BeginInvoke method.
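A minimal sketch of that pattern (RunQuery is a stand-in for the long-running query; note that delegate BeginInvoke is .NET Framework only and throws on .NET Core / .NET 5+):

    using System;
    using System.Threading;

    class DelegateAsyncDemo
    {
        // Stand-in for the long-running query.
        static string RunQuery() { Thread.Sleep(2000); return "rows"; }

        static void Main()
        {
            Func<string> query = RunQuery;

            // BeginInvoke runs the delegate on a thread pool thread
            // (.NET Framework only).
            query.BeginInvoke(ar =>
            {
                string result = query.EndInvoke(ar);
                Console.WriteLine("Query finished: " + result);
            }, null);

            Console.ReadLine(); // keep the process alive for the demo
        }
    }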
BackgroundWorkers are just like any other threads, except they can be killed or quit without exiting the main thread and your application.
BackgroundWorker draws its threads from the ThreadPool. The pool is the preferred approach for most multi-threading scenarios because .NET manages the threads for you, reusing them instead of creating new ones as needed, which is an expensive process.
Such threading scenarios are great for processor-intensive code.
For something like a query that runs externally, you also have the option of asynchronous data access. You can hand off the query request, give it the name of your callback method, and that method will be called when the query is finished; then do something with the result (i.e., update the UI status or display the returned data).
.NET has built-in support for asynchronous data querying:
http://www.devx.com/dotnet/Article/26747
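For illustration, a minimal sketch of the Begin/End pattern that article covers (the connection string and query are placeholders; on pre-4.5 frameworks the connection string also needs Asynchronous Processing=True):

    using System;
    using System.Data.SqlClient;

    class AsyncQueryDemo
    {
        static void Main()
        {
            // Placeholder connection string and table name.
            var conn = new SqlConnection("Data Source=.;Initial Catalog=Incidents;"
                + "Integrated Security=True;Asynchronous Processing=True");
            var cmd = new SqlCommand("SELECT COUNT(*) FROM ExternalTable", conn);

            conn.Open();
            cmd.BeginExecuteReader(ar =>
            {
                // Callback runs when the query completes; no thread was
                // blocked while the database was working.
                using (var reader = cmd.EndExecuteReader(ar))
                {
                    while (reader.Read())
                        Console.WriteLine(reader[0]);
                }
                conn.Close();
            }, null);

            Console.ReadLine(); // keep the process alive for the demo
        }
    }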