How does a Handle relate to a thread? - c#

How exactly does a handle relate to a thread? I am writing a service that accepts an HTTP request and calls a method before returning a response. I have written a test client that sends out 10,000 HTTP requests (using a semaphore to make sure that only 1,000 requests are in flight at a time).
If I call the method (the one processed before returning a response) through the ThreadPool, or through a generic Action<T>.BeginInvoke, the service's handle count goes way up and stays there until all the requests have finished, but the service's thread count stays pretty much flat.
However, if I synchronously call the method before returning the response, the thread count goes up, but the handle count goes through extreme peaks and valleys.
This is C# on a Windows machine (Server 2008).

Your description is too vague to support a good diagnosis, but the ThreadPool was designed to carefully throttle the number of active threads. It avoids running more threads than you have CPU cores; only when a thread gets "stuck" will it schedule an extra one. That explains why the number of threads doesn't increase wildly, and, indirectly, why the handle count stays stable: the machine is doing less work at any one time.

You can think of a handle as an abstraction of a pointer. There are lots of things in Windows that use handles: when you open a file at the API level you get a handle to the file, when you create a window the window has a handle, a thread has a handle, and so on. So your handle count probably has to do with operations that are occurring on your threads. If you have more threads running, more stuff is going on at the same time, so you will see more handles open.
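If you want to observe those two counters yourself, System.Diagnostics.Process exposes both the handle count and the thread list. A small illustrative probe (not from the original posts; the sleeping work items are just a stand-in for real work):

    using System;
    using System.Diagnostics;
    using System.Threading;

    class CounterProbe
    {
        static void Main()
        {
            Process self = Process.GetCurrentProcess();

            // Queue some work so the pool has something to do.
            for (int i = 0; i < 100; i++)
                ThreadPool.QueueUserWorkItem(_ => Thread.Sleep(100));

            for (int i = 0; i < 10; i++)
            {
                self.Refresh(); // re-read the OS counters
                Console.WriteLine("Handles: {0}, Threads: {1}",
                    self.HandleCount, self.Threads.Count);
                Thread.Sleep(200);
            }
        }
    }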

Related

ThreadPool and methods with while(true) loops?

The ThreadPool recycles threads for better performance compared with creating individual instances of the Thread class. However, how does this apply to processing methods with while loops inside the ThreadPool?
As an example, if we were to apply a thread in the ThreadPool to a client that has connected to a TCP server, that client would need a while loop to keep checking for incoming data. The loop can be exited to disconnect the client, but only if the server closes or if the client demands a disconnection.
If that is the case, then how would having a ThreadPool help when masses of clients connect? Either way the same amount of memory is used if the clients stay connected. If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
On the other hand it was suggested to me to use the NetworkStream.BeginRead and NetworkStream.EndRead asynchronous methods to avoid threads altogether to save RAM usage and CPU usage. Is this true or not?
Either way the same amount of memory is used if the clients stay connected.
So far this is true. It's up to your app to decide how much state it needs to keep per client.
If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
This is untrue, because it assumes that all interesting operations performed by these threads are synchronous. That is a naive mode of operation; real-world code is asynchronous: a thread makes a call to request an action and is then free to do other things. When a result becomes available from that action, some thread looking for other things to do will run the code that acts on the result.
On the other hand it was suggested to me to use the NetworkStream.BeginRead and NetworkStream.EndRead asynchronous methods to avoid threads altogether to save RAM usage and CPU usage. Is this true or not?
As explained above, async methods like these will allow you to service a potentially very large number of clients with only a small number of worker threads -- but by itself it will do nothing to either help or hurt the memory situation.
You are correct. Slow, blocking code can cause poor performance on both the client side and the server side. You can run slow work on a separate thread, and that might work well enough on the client side, but it may not help on the server side. Blocking methods on the server can diminish its overall performance, because they can lead to a situation where the server has a large number of threads running, all of them blocked. Then even a simple request might end up taking a long time. For slow-running work like yours, it is better to use asynchronous APIs where they are available. (Note: even if an asynchronous operation is not available, you can implement one, for example with a custom awaiter class.) This is better for clients as well as servers. The main point of asynchronous code is to reduce the number of threads: handling a given number of clients with fewer threads improves scalability, which lets a server keep a larger number of requests in progress simultaneously.
If you don't need more control over the threads or the thread pool, you can go with the asynchronous approach.
Also, each thread reserves about 1 MB of address space for its stack by default, so asynchronous methods will definitely help reduce memory usage. However, I think the nature of the work you have described here is going to take pretty much the same amount of time with the multi-threaded approach as with the asynchronous one.
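A minimal sketch of that asynchronous approach, using NetworkStream.BeginRead; no thread is dedicated to a client while it is idle. The ClientState class and the buffer size are illustrative, not from the original posts:

    using System;
    using System.Net.Sockets;

    // Per-client state carried through the async calls.
    class ClientState
    {
        public NetworkStream Stream;
        public byte[] Buffer = new byte[4096];
    }

    static class AsyncReceiver
    {
        // Start an asynchronous read; returns immediately.
        public static void BeginReceive(ClientState state)
        {
            state.Stream.BeginRead(state.Buffer, 0, state.Buffer.Length,
                                   OnRead, state);
        }

        static void OnRead(IAsyncResult ar)
        {
            var state = (ClientState)ar.AsyncState;
            int bytesRead = state.Stream.EndRead(ar);
            if (bytesRead == 0)
            {
                // Client disconnected; no thread was ever held for it.
                state.Stream.Close();
                return;
            }

            // ... process the first bytesRead bytes of state.Buffer here ...

            BeginReceive(state); // post the next read and return immediately
        }
    }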

How to create a "Spool" service for a class in C#

I am getting into C# programming and am fairly new to the language. I would like to think I have a good understanding of object-oriented programming in general, and of what running multiple threads means at a high level, but when it comes to actual implementation I am, as I said, new.
What I am looking to do is create a tool that will have many threads running and interacting with each other independently; each will serve its own task and may call the others.
My strategy for ensuring communication (without losing anything when multiple updates occur at the same time from different threads) is to give each class a spool-like task queue that can be called externally to add tasks for a given thread, or a spool service for these. I am not sure whether I should place this on the class itself, or keep it external and have the class call the spool for new tasks and keep track of it. In particular I am considering how to signal the class when an empty spool receives a task: a listener approach (so tasks can subscribe to pools if they want to be woken when new work arrives), or a "check every X seconds whether we are out of tasks and no task is scheduled next" approach.
What would a good strategy be for creating this; should I build it into the actual class, or keep it external? And what are the critical regions in the implementation? The busy-wait check only needs to protect adding and removing jobs on the actual spool, while the signalling approach requires adding/removing jobs and the go-to-sleep step to be critical as well. That suddenly places high demands on the spool for what to do when a critical region has been entered, as this could result in blocks, causing other blocks, and possibly unforeseen deadlocks.
I use such a model often, on various systems. I define a class for the agents, say 'AgentClass', and one for the requests, say 'RequestClass'. The agent has two abstract methods, 'Submit(RequestClass message)' and 'Signal()'. Typically, a thread in the agent constructs a producer-consumer queue and waits on it for RequestClass instances; the Submit() method queues the passed RequestClass instances onto the queue. The RequestClass usually contains a 'command' enumeration that tells the agent what needs doing, together with all the data required to perform the request and the 'sender' agent instance. When an agent gets a request, it switches on the enumeration to call the correct function to do the request. The agent acts only on the data in the RequestClass - results, error messages etc. are placed in data members of the RequestClass. When the agent has performed the request (or failed and generated error data), it can either Submit() the request back to the sender (i.e. the request has been performed asynchronously), or call the sender's Signal() method, which signals an event upon which the sender was waiting (i.e. the request was performed synchronously).
I usually construct a fixed number of RequestClass instances at startup and store them in a global 'pool' P-C queue. Any agent/thread/whatever that needs to send a request can dequeue a RequestClass instance, fill in its data, Submit() it to the agent, and then wait asynchronously or synchronously for the request to be performed. When done with, the RequestClass is returned to the pool. I do this to avoid continual malloc/free/new/dispose, to ease debugging (I dump the pool level to a status bar using a timer, so I always notice if a request leaks or gets double-freed), and to eliminate the need for explicit thread termination on app close (if multiple threads only ever read/write data areas that outlive the application forms etc., the app will close easily and the OS can deal with all the threads - there are hundreds of posts about 'cleanly shutting down threads upon app close' - I never bother!).
Such message-passing designs are quite resistant to deadlocks since the only locks, (if any), are in the P-C queues, though you can certainly achieve it if you try hard enough:)
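A minimal sketch of that agent/request pattern in C#, using BlockingCollection<T> as the producer-consumer queue. The member names follow the description above; the Command enumeration and Handle method are illustrative:

    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    enum Command { DoWork, Shutdown }

    class RequestClass
    {
        public Command Command;
        public object Data;         // everything the agent needs for the work
        public object Result;       // results/error data filled in by the agent
        public AgentClass Sender;   // who to hand the request back to
    }

    abstract class AgentClass
    {
        private readonly BlockingCollection<RequestClass> queue =
            new BlockingCollection<RequestClass>();

        protected AgentClass()
        {
            // One thread per agent, waiting on its own P-C queue.
            var worker = new Thread(Run);
            worker.IsBackground = true;
            worker.Start();
        }

        // Called by other agents/threads to hand work to this agent.
        public void Submit(RequestClass request)
        {
            queue.Add(request);
        }

        // Signalled when a synchronously-submitted request has completed.
        public abstract void Signal();

        private void Run()
        {
            foreach (RequestClass request in queue.GetConsumingEnumerable())
            {
                if (request.Command == Command.Shutdown)
                    return;

                request.Result = Handle(request.Data);
                if (request.Sender != null)
                    request.Sender.Submit(request); // asynchronous completion
            }
        }

        protected abstract object Handle(object data);
    }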
Is this the sort of system that you seem to need, or have I got it wrong?
Rgds,
Martin

Watch dog for blocking function call

I have a closed-source API for a hardware sensor that I use to query that sensor. The API comes as a DLL that I use through C# interop. The API's functions are blocking. They usually return error values, but in some cases they just won't return.
I need to be able to detect this situation and, in that case, kill the blocked thread. How can this be done in C#?
The thread they're being invoked on is created through a BackgroundWorker. I'm looking for a simple watchdog for blocking function calls that I can set up before calling the function and reset when I'm back. It should just sit there and wait for me to come back. If I don't, it should kill the thread so that 1) the API is freed up again and no thread of my application is left hanging around doing anything should the call eventually return, and 2) I can take other recovery measures, like re-initialising the API, to continue working with it.
One approach might be to set up a System.Threading.Timer before the API call to fire after a certain timeout interval, then dispose the Timer after the call completes. If the Timer fires, it'll fire on a ThreadPool thread, and you can then take appropriate action to kill the offending thread.
Note that you'll need to P/Invoke to the Win32 TerminateThread API, since .NET's Thread.Abort() won't work if you're blocked in unmanaged code.
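Here's a minimal sketch combining the Timer and the P/Invoke, assuming the managed thread maps one-to-one onto an OS thread (true on the desktop CLR). The class and method names are invented for illustration, and all the caveats below still apply:

    using System;
    using System.Runtime.InteropServices;
    using System.Threading;

    static class BlockingCallWatchdog
    {
        [DllImport("kernel32.dll")]
        static extern uint GetCurrentThreadId();

        [DllImport("kernel32.dll", SetLastError = true)]
        static extern IntPtr OpenThread(uint desiredAccess, bool inheritHandle,
                                        uint threadId);

        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool TerminateThread(IntPtr hThread, uint exitCode);

        [DllImport("kernel32.dll", SetLastError = true)]
        static extern bool CloseHandle(IntPtr handle);

        const uint THREAD_TERMINATE = 0x0001;

        // Runs blockingCall on the current thread; if it has not returned
        // within the timeout, the watchdog terminates the thread, so this
        // method then never returns on that thread.
        public static void RunWithWatchdog(Action blockingCall, TimeSpan timeout)
        {
            uint nativeId = GetCurrentThreadId(); // thread about to block
            int completed = 0;

            // One-shot timer; TimeSpan.FromMilliseconds(-1) means no repeat.
            using (new Timer(_ =>
            {
                if (Interlocked.CompareExchange(ref completed, 0, 0) != 0)
                    return;           // call finished in time; nothing to do

                IntPtr hThread = OpenThread(THREAD_TERMINATE, false, nativeId);
                if (hThread != IntPtr.Zero)
                {
                    TerminateThread(hThread, 1); // last resort; see caveats
                    CloseHandle(hThread);
                }
            }, null, timeout, TimeSpan.FromMilliseconds(-1)))
            {
                blockingCall();       // the possibly-hanging API call
                Interlocked.Exchange(ref completed, 1);
            }
        }
    }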
Also note that it's very unlikely your process will be in a safe state after forcibly killing a thread, as the terminated thread might be holding synchronization objects, might have been in the middle of mutating shared memory state, or any other such critical operation. As a result of terminating it, other threads may hang, the process may crash, data may be corrupted, dogs and cats might start living together; there's no way of being sure what'll happen, but chances are it'll be bad. The safest approach, if possible, would be to isolate usage of the API into a separate process that you communicate with via some remoting channel. Then you can kill that external process on demand, as killing a process is a lot safer than killing a thread.
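A rough sketch of that isolation idea, assuming the sensor work is moved into a hypothetical helper executable (SensorHost.exe is an invented name, and the actual communication channel - remoting, named pipes, etc. - is omitted):

    using System;
    using System.Diagnostics;

    static class SensorHostSupervisor
    {
        // Starts the helper process and kills it if it exceeds the timeout.
        // Killing a process is far safer than terminating a thread in-process.
        public static bool RunSensorQuery(TimeSpan timeout)
        {
            using (Process host = Process.Start("SensorHost.exe"))
            {
                if (host.WaitForExit((int)timeout.TotalMilliseconds))
                    return true;      // helper finished normally

                host.Kill();          // blocked in the vendor API: kill it
                host.WaitForExit();   // then re-launch it for the next query
                return false;
            }
        }
    }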

Using delegates in C# .Net, what happens when I run out of threads in the .Net threadpool?

I'm making a multi-threaded application using delegates to handle the processing of requests in a WCF service. I want the clients to be able to send the request, then disconnect and wait for a callback announcing that the work is done (which will most likely be searching through a database). I don't know how many requests may come in at once; it could be one every once in a while, or it could spike to dozens.
As far as I know, .Net's threadpool has 25 threads available to use. What happens when I spawn 25 delegates or more? Does it throw an error, does it wait, does it pause an existing operation and start working on the new delegate, or some other behavior?
Beyond that, what happens if I want to spawn 25 or more delegates while other operations (such as incoming/outgoing connections) want to start, or when another operation is already working and I want to spawn another delegate?
I want to make sure this is scalable without being too complex.
Thanks
All operations are queued (I am assuming that you are using the threadpool directly or indirectly). It is the job of the threadpool to munch through the queue and dispatch operations onto threads. Eventually all threads may become busy, which will just mean that the queue will grow until threads are free to start processing queued work items.
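A quick illustrative way to observe that queuing behaviour (not from the original answer; the artificially small pool size just makes the effect visible):

    using System;
    using System.Threading;

    class QueueDemo
    {
        static void Main()
        {
            // Artificially small pool so queuing is easy to observe.
            // (The call is ignored if 4 is below this machine's minimum.)
            ThreadPool.SetMaxThreads(4, 4);

            var done = new CountdownEvent(20);
            for (int i = 0; i < 20; i++)
            {
                int id = i; // copy the loop variable for the closure
                // Items beyond the available threads simply wait in the queue.
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine("item {0} on thread {1}",
                        id, Thread.CurrentThread.ManagedThreadId);
                    Thread.Sleep(500);
                    done.Signal();
                });
            }
            done.Wait(); // all 20 items eventually run; none are rejected
        }
    }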
You're confusing delegates with threads, and with the number of concurrent connections.
With WCF two-way (duplex) bindings, the connection remains open while waiting for the callback.
IIS 7 or above, on modern hardware, should have no difficulty maintaining a few thousand concurrent connections if they're sitting idle.
Delegates are just method pointers - you can have as many as you wish. That doesn't mean they're being invoked concurrently.
If you are using ThreadPool.QueueUserWorkItem then it just queues the extra items until a thread is available.
The ThreadPool's default maximum number of threads is 250 per core, not 25! You can still set a higher limit for the ThreadPool if you need it.
If your ThreadPool runs out of threads, all operations are queued until the next thread becomes available; threads that have finished their current work item return to the pool and pick up queued items.
However, you can also create threads yourself without using the ThreadPool.

Letting the client pull data

I'm writing a server in C# which creates a (long, possibly even infinite) IEnumerable<Result> in response to a client request, and then streams those results back to the client.
Can I set it up so that, if the client is reading slowly (or possibly not at all for a couple of seconds at a time), the server won't need a thread stalled waiting for buffer space to clear up so that it can pull the next couple of Results, serialize them, and stuff them onto the network?
Is this how NetworkStream.BeginWrite works? The documentation is unclear (to me) about when the callback method will be called. Does it happen basically immediately, just on another thread which then blocks on EndWrite waiting for the actual writing to happen? Does it happen when some sort of lower-level buffer in the sockets API underflows? Does it happen when the data has been actually written to the network? Does it happen when it has been acknowledged?
I'm confused, so there's a possibility that this whole question is off-base. If so, could you turn me around and point me in the right direction to solve what I'd expect is a fairly common problem?
I'll answer the third part of your question in a bit more detail.
The MSDN documentation states that:
When your application calls BeginWrite, the system uses a separate thread to execute the specified callback method, and blocks on EndWrite until the NetworkStream sends the number of bytes requested or throws an exception.
As far as my understanding goes, whether or not the callback method is called immediately after calling BeginWrite depends upon the underlying implementation and platform. For example, if I/O completion ports are available on Windows, it won't be; a thread from the thread pool will block before calling it.
In fact, the NetworkStream's BeginWrite method simply calls the underlying socket's BeginSend method in my .NET implementation, which uses the underlying WSASend Winsock function with completion ports where available. This makes it far more efficient than simply creating your own thread for each send/write operation, even if you were to use a thread pool.
The Socket.BeginSend method then calls the OverlappedAsyncResult.CheckAsyncCallOverlappedResult method if the result of WSASend was IOPending, which in turn calls the native RegisterWaitForSingleObject Win32 function. This will cause one of the threads in the thread pool to block until the WSASend method signals that it has completed, after which the callback method is called.
The Socket.EndSend method, called by NetworkStream.EndWrite, will wait for the send operation to complete. The reason it has to do this is that, if I/O completion ports are not available, the callback method will be called straight away.
I must stress again that these details are specific to my implementation of .Net and my platform, but that should hopefully give you some insight.
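To make the discussion concrete, here is a minimal sketch of the BeginWrite/EndWrite pattern for streaming results without a stalled thread. GetNextChunk stands in for "pull the next couple of Results and serialize them" and is an invented placeholder:

    using System;
    using System.Net.Sockets;

    class ResultStreamer
    {
        readonly NetworkStream _stream;
        readonly Func<byte[]> _getNextChunk; // returns null when the sequence ends

        public ResultStreamer(NetworkStream stream, Func<byte[]> getNextChunk)
        {
            _stream = stream;
            _getNextChunk = getNextChunk;
        }

        public void Start()
        {
            SendNext();
        }

        void SendNext()
        {
            byte[] chunk = _getNextChunk(); // e.g. serialize the next Result
            if (chunk == null)
            {
                _stream.Close();
                return;
            }
            // Returns immediately; the callback runs when the send completes.
            _stream.BeginWrite(chunk, 0, chunk.Length, OnWritten, null);
        }

        void OnWritten(IAsyncResult ar)
        {
            _stream.EndWrite(ar); // completes (or throws) for the previous send
            SendNext();           // pull, serialize, and post the next chunk
        }
    }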
First, the only way your main thread can keep executing while other work is being done is through the use of another thread. A thread can't do two things at once.
However, I think what you're trying to avoid is messing with the Thread object, and yes, that is possible through the use of BeginWrite. To take your questions in turn:
The documentation is unclear (to me) about when the callback method will be called.
The call is made after the network driver reads the data into its buffers.
Does it happen basically immediately, just on another thread which then blocks on EndWrite waiting for the actual writing to happen?
Nope, just until it's in the buffers handled by the network driver.
Does it happen when some sort of lower-level buffer in the sockets API underflows?
If by underflow you mean it has room for the data, then yes.
Does it happen when the data has been actually written to the network?
No.
Does it happen when it has been acknowledged?
No.
EDIT
Personally, I would try using a Thread. There's a lot going on behind the scenes with BeginWrite that you should probably be aware of... plus I'm weird and I like controlling my threads.
