Watch dog for blocking function call - c#

I have a closed-source API for a hardware sensor that I use to query that sensor. The API comes as a DLL that I use through C# interop. The API's functions are blocking. They usually return error values, but in some cases they simply never return.
I need to be able to detect this situation and in that case kill the blocked thread. How can this be done in C#?
The thread they're being invoked on is created through a BackgroundWorker. I'm looking for a simple watch dog for blocking function calls that I can set up before calling the function and reset when I'm back. It should just sit there and wait for me to come back. If I don't, it shall kill the thread so that 1) the API is freed up again and no thread of my application is still hanging around and doing anything should it eventually return and 2) I can take other recovery measures like re-initialising the API to continue working with it.

One approach might be to set up a System.Threading.Timer before the API call to fire after a certain timeout interval, then dispose the Timer after the call completes. If the Timer fires, it'll fire on a ThreadPool thread, and you can then take appropriate action to kill the offending thread.
Note that you'll need to P/Invoke to the Win32 TerminateThread API, since .NET's Thread.Abort() won't work if you're blocked in unmanaged code.
Also note that it's very unlikely your process will be in a safe state after forcibly killing a thread, as the terminated thread might be holding synchronization objects, might have been in the middle of mutating shared memory state, or any other such critical operation. As a result of terminating it, other threads may hang, the process may crash, data may be corrupted, dogs and cats might start living together; there's no way of being sure what'll happen, but chances are it'll be bad. The safest approach, if possible, would be to isolate usage of the API into a separate process that you communicate with via some remoting channel. Then you can kill that external process on demand, as killing a process is a lot safer than killing a thread.
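A minimal sketch of the timer-based watchdog described above, assuming the blocking API call can be wrapped in a delegate. The names `Watchdog`, `Guard`, and `onTimeout` are hypothetical. What `onTimeout` does (logging, killing a helper process, re-initialising the API) is left to the caller, and the sketch deliberately does not terminate any thread itself.

```csharp
using System;
using System.Threading;

public static class Watchdog
{
    // Runs 'call' on the current thread. If it has not returned within
    // 'timeout', 'onTimeout' is invoked once on a ThreadPool thread.
    // Returns true if the call came back before the watchdog fired.
    public static bool Guard(Action call, TimeSpan timeout, Action onTimeout)
    {
        int fired = 0;
        using (var timer = new Timer(
            _ => { Interlocked.Exchange(ref fired, 1); onTimeout(); },
            null, timeout, Timeout.InfiniteTimeSpan))
        {
            call();   // the blocking API call goes here
        }             // disposing the timer resets the watchdog
        return Volatile.Read(ref fired) == 0;
    }
}
```

A call site might look like `Watchdog.Guard(() => QuerySensor(), TimeSpan.FromSeconds(5), Recover)`, where `QuerySensor` and `Recover` stand in for your own methods. If the call never returns, `Guard` never returns either; only the callback runs.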

Related

What would be a use case for Thread.Sleep(Timeout.Infinite)?

I happened to lay my eyes on an intellisense tool tip regarding the parameter passed to System.Threading.Thread.Sleep(int millisecondsTimeout), saying something like "(…) Specify System.Threading.Timeout.Infinite to block the thread indefinitely". And I am intrigued.
I can understand why one might include short inactive delays within a possibly endless loop, yielding processing power to other threads when the sleeping thread has nothing to do immediately. I typically prefer implementing such delays with EventWaitHandles, though, so that I can avoid waiting out a full sleep delay when I signal the thread from a different thread to end its execution gracefully.
But I cannot see when I might need to suspend a thread indefinitely, and in a way that, as far as I can tell, can only be interrupted through a rather ungraceful Thread.Abort()/ThreadAbortException pair.
So what would be a working scenario where I might want to suspend a thread indefinitely?
It is a pretty long story and I have to wave my hands a bit to make it understandable. Most programmers think that Thread.Sleep() puts the thread to sleep and prevents it from executing any code. This is not accurate. Thread.Sleep(Infinite) is equivalent to Application.Run(). No kidding.
This doesn't happen very often in real life; it is mostly relevant in custom hosting scenarios. Getting code to run on a specific thread is in general an important feature for dealing with code that is not thread-safe, and it is the major reason why Application.Run() exists. But Windows exposes another way to do this at a much lower level; the underlying API for it is QueueUserAPC(). The .NET analogue of this function is BeginInvoke().
This requires the thread to co-operate, just like it does when it calls Application.Run(). The thread must be in an "alertable wait state", executing a blocking function that can be interrupted. The CLR does not execute the sleep by itself, it passes the job to the CLR host. Most hosts will simply execute SleepEx(), passing TRUE for the bAlertable argument. The thread is now in a state to execute any requests posted by QueueUserAPC(). Just like it will be when it is actively executing inside the Application.Run() dispatcher loop.
The kernel feature is not otherwise exposed at all in the framework. It is the kind of code that is very hard to get right, and re-entrancy bugs are pretty nasty, as most programmers who were bitten by Application.DoEvents() or a poorly placed MessageBox.Show() can attest. It is, however, valid in a custom hosting scenario, where the host can get C# code to run on a specific thread using this mechanism. So it is possible to pass Infinite because the designers did not want to intentionally disable this scenario. If a host author makes this possible at all, they will let you know about it. I don't know of a practical example.
More practically, you do use this feature every day. It is the way that System.Threading.Timer and System.Timers.Timer are implemented: a thread inside the CLR, started as soon as you use any timer, uses SleepEx(INFINITE, TRUE) at its core.
You can use .Interrupt() to wake a sleeping thread (causing ThreadInterruptedException in the code that was calling .Sleep(), which can be caught and handled), so this provides a mechanism to say "sleep until someone prods you". I'm not saying it is necessarily the best mechanism for this, but: it may have uses for you.
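A small sketch of the "sleep until someone prods you" idea, assuming a thread that has nothing to do until it is interrupted. The class and method names are hypothetical.

```csharp
using System;
using System.Threading;

public static class InterruptDemo
{
    // Starts a thread that sleeps indefinitely, interrupts it, and
    // reports whether the sleeper observed the interruption.
    public static bool SleepUntilProdded()
    {
        bool woken = false;
        var sleeper = new Thread(() =>
        {
            try
            {
                Thread.Sleep(Timeout.Infinite);
            }
            catch (ThreadInterruptedException)
            {
                woken = true;   // somebody called Interrupt() on us
            }
        });
        sleeper.Start();
        // If the sleeper is not blocked yet, the interrupt is delivered
        // as soon as it next blocks, so there is no startup race here.
        sleeper.Interrupt();
        sleeper.Join();
        return woken;
    }
}
```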

Terminate loopless thread instantly without Abort or Suspend

I am implementing a protocol library. Here is a simplified description.
The main thread, within the main function, continually checks whether data is available on the NetworkStream (of a TcpClient). Let us say response is the received message and thread is a running thread.
thread = new Thread(new ThreadStart(function));
thread.IsBackground = true;
thread.Start();
while (true)
{
    response = receiveMessage();
    if (response != null)
    {
        thread.Suspend();
        // I am searching for an alternative to the line above, other than thread.Abort().
        thread2 = new Thread(new ThreadStart(function2));
        thread2.IsBackground = true;
        thread2.Start();
    }
}
So far so good. There are actually more messages to come within the while loop, and there is also a state machine for handling different sorts of incoming messages, but this should be enough.
(There are also more functions than just "function" and "function2".)
What the functions look like inside is not known to this code, since the protocol is hidden from the programmer and meant to be a library. This means the protocol will start some programmer-defined functions as threads, depending on the state the protocol is in.
So if a special response is received (e.g. a callAnotherFunction message), I want to terminate a thread (here named "thread") abruptly, let's say within 100 ms. But I do not know whether it executes within a loop or not, or how much processing is needed until it terminates.
How can I stop these threads without the deprecated Suspend or the exception-throwing Abort?
(Note that I cannot force the programmer of the functions to catch the ThreadAbortException.)
Or do I need a different program architecture?
(By the way, I have decided to put the loop that polls the network stream via receiveMessage into the main function, since a message can appear at any time.)
Starting a thread without having a reliable way to terminate it is bad practice. Suspend and Abort are among the unreliable ways to terminate a thread, because you may terminate it in a state that corrupts your entire program, and you have no way to prevent that from happening.
You can see how to kill a thread safely here: Killing a .NET thread
If the "user" is giving you a method to run in a thread, then the user should also give you a method to stop the code from running. Think of it as a contract: you promise the user that you will call the stop method and they promise that the stop method will actually stop the thread. If your user violates that contract then they will be responsible for the issues that arise, which is good because you don't want to be responsible for your user's errors :).
Note that I cannot force the programmer of the functions to catch the ThreadAbortException.
Since Suspend/Abort are bad practice, the programmer doesn't need to catch the ThreadAbortException, however they should catch the ThreadInterruptedException as part of their "contract."
Remember that there are two situations you need to worry about:
The thread is executing some code.
The thread is in a blocking state.
In the case that the thread is executing some code, all you can do is notify the thread that it can exit and wait until it processes the notification. You may also skip the waiting and assume that you've leaked a resource, in which case it's the user's fault again because they didn't design their stop method to terminate their thread in a timely fashion.
In the case where the thread is in a blocking state and it's not blocking on a notification construct (e.g. a semaphore, a manual reset event, etc.), you should call Thread.Interrupt() to get it out of the blocking state; the user must handle the ThreadInterruptedException.
Suspend is really evil, especially the way you are trying to use it: to stop thread execution forever. The thread keeps every lock it held and never releases its resources.
Thread.Abort is slightly better, since it at least tries to terminate the thread more cleanly and gives locks a chance to be released.
To do this properly, you really need your thread's code to cooperate in termination. Events, semaphores, or even a simple bool value checked by the thread may be enough.
It may be better to re-architect your solution to have a queue of messages and process them on a separate thread. A special message may then simply empty the queue.
You need some sort of cancellation protocol between your application and wherever function comes from. Then you can share some sort of cancellation token between function and your message loop. If the message loop recognizes that function needs to be stopped, it signals that by setting the token, which function must test at suitable points. The simplest way would be to share a flag that can be atomically set from within your message loop and atomically read from function.
I'd however consider using the proper Asynchronous IO patterns combined with Tasks provided by the .NET framework out-of-the box along with proper cancellation mechanisms.
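A minimal sketch of such a cancellation protocol using the framework's CancellationTokenSource, assuming the programmer-defined function agrees to accept and check a token. `CooperativeWorker`, `Function`, and `StartThenStop` are hypothetical names, and the Thread.Sleep call is a placeholder for real work.

```csharp
using System;
using System.Threading;

public static class CooperativeWorker
{
    // Hypothetical stand-in for a programmer-defined protocol function.
    // Its side of the contract: check the token between units of work.
    public static void Function(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            Thread.Sleep(1);   // placeholder for one unit of real work
        }
    }

    // Starts Function on its own thread, requests cancellation, and
    // reports whether the thread actually stopped within 'deadline'.
    public static bool StartThenStop(TimeSpan deadline)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            var thread = new Thread(() => Function(token)) { IsBackground = true };
            thread.Start();
            cts.Cancel();                  // e.g. on a callAnotherFunction message
            return thread.Join(deadline);  // false: the function broke the contract
        }
    }
}
```

The library can only promise the 100 ms deadline if every function checks the token at least that often; a false return tells you which function did not.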
So function refers to code which you have little control over? This is pretty typical of third-party libraries. Most of the time they do not have built-in abilities to gracefully terminate long-running operations. Since you have no idea how these functions are implemented, you have very few options. In fact, your only guaranteed safe option is to spin these operations up in their own process and communicate with them via WCF. That way, if you need to terminate the operation abruptly, you would just kill the process. Killing another process will not corrupt the state of the current process the way calling Thread.Abort on a thread within the current process could.

Is it possible to kill WaitForSingleObject(handle, INFINITE)?

I am having problems closing an application that uses WaitForSingleObject() with an INFINITE timeout.
The full picture is this. I am doing the following to allow my application to handle the device wakeup event:
Register the event with:
CeRunAppAtEvent("\\\\.\\Notifications\\NamedEvents\\WakeupEvent",
    NOTIFICATION_EVENT_WAKEUP);
Start a new thread to wait on:
Thread waitForWakeThread = new Thread(new ThreadStart(WaitForWakeup));
waitForWakeThread.Start();
Then do the following in the target method:
private void WaitForWakeup()
{
    IntPtr handle = CreateEvent(IntPtr.Zero, 0, 0, "WakeupEvent");
    while (true)
    {
        WaitForSingleObject(handle, INFINITE);
        MessageBox.Show("Wakey wakey");
    }
}
This all works fine until I try to close the application when, predictably, WaitForSingleObject continues to wait and does not allow the app to close properly. We only allow one instance of our app to run at a time and we check for this on startup. It appears to continue running until the device is soft reset.
Is there a way to kill the handle that WaitForSingleObject is waiting for, to force it to return?
Many thanks.
Use WaitForMultipleObjects instead, and pass two handles: the existing one, and one for an event called something like 'exit'. During app shutdown, call SetEvent on the exit event; WaitForMultipleObjects will return and you can have the thread exit gracefully.
You need to switch on the return value of WaitForMultipleObjects to take the appropriate action depending on which of the handles was signalled.
Possibly, also, you can set the thread to be a background thread. This will prevent it from stopping your application from shutting down when the main thread terminates.
See:
http://msdn.microsoft.com/en-us/library/system.threading.thread.isbackground.aspx
This is what I would do...
Use the EventWaitHandle class instead of calling CreateEvent directly. There shouldn't be any need to use the Windows API other than CeRunAppAtEvent (and API calls make code ugly...). Get this working first.
Before creating the thread, create a ManualResetEvent variable that is not initially flagged. Call it "TerminateEvent".
Replace the WaitForSingleObject API call with WaitHandle.WaitAny(WaitHandle[]) and pass an array containing "TerminateEvent" and the EventWaitHandle class wrapping the CeRunAppAtEvent notification.
Your loop can use the return value of WaitAny to determine what to do. The return value is the array index of the wait handle that unblocked the thread, so you can determine whether to continue the loop or not.
To cleanly end the thread, you can call "Set" on your "TerminateEvent" and then "Join" the thread to wait for it to terminate.
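The steps above can be sketched roughly as follows. This is written against the desktop .NET API and assumes the wakeup notification is available as an auto-reset EventWaitHandle; `WakeupWaiter` and its members are hypothetical names, and a counter stands in for the MessageBox call.

```csharp
using System;
using System.Threading;

public sealed class WakeupWaiter
{
    private readonly ManualResetEvent terminateEvent = new ManualResetEvent(false);
    private readonly EventWaitHandle wakeupEvent;
    private readonly Thread thread;
    private int wakeups;

    public WakeupWaiter(EventWaitHandle wakeupEvent)
    {
        this.wakeupEvent = wakeupEvent;
        thread = new Thread(Run) { IsBackground = true };
        thread.Start();
    }

    // Number of wakeup notifications handled so far.
    public int Wakeups => Volatile.Read(ref wakeups);

    private void Run()
    {
        // Index 0 is the terminate event, so it wins if both are signalled.
        var handles = new WaitHandle[] { terminateEvent, wakeupEvent };
        while (true)
        {
            int index = WaitHandle.WaitAny(handles);
            if (index == 0)
                return;                      // asked to shut down
            Interlocked.Increment(ref wakeups);  // placeholder for "Wakey wakey"
        }
    }

    // Signals the thread to exit and waits for it to finish.
    public void Stop()
    {
        terminateEvent.Set();
        thread.Join();
    }
}
```

Calling `Stop()` from the application's shutdown path lets the waiting thread leave its loop instead of blocking forever.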
'This all works fine until I try to close the application when, predictably, WaitForSingleObject continues to wait and does not allow the app to close properly.'
Any app can close, no matter what its threads are doing. If you call ExitProcess(0) from any thread in your app, the app will close, no matter whether there are threads waiting INFINITE on some API or synchronization object, sleeping, running on another processor, whatever. The OS will change the state of all threads that are not running to 'never run again' and use its interprocessor driver to hard-interrupt any other processors that are actually running your thread code. Once all the threads are stopped, the OS frees handles, segments, etc., and your app no longer exists.
Problems arise when developers try to 'cleanly' shut down threads that are stuck - like yours, when the app is closing. So..
Do you have a TThread.WaitFor, or similar, in an OnClose/OnCloseQuery handler, FormDestroy or destructor? If you have, and have no vital reason to ensure that the thread is terminated, just comment it out!
This allows the main form to close, and so your code will finally reach the ExitProcess() it has been trying to get at since you clicked on the red cross button.
You could, of course, just call ExitProcess() yourself, but this may leave you with resources leaked in other processes - database connections, for example.
'216/217 errors on close if I don't stop the threads'. This often happens because developers have followed the er... 'unfortunate' Delphi thread examples and communicate with threads by directly exchanging data between secondary thread fields and main thread fields (e.g. TThread.Synchronize). This just sucks and is hell-bent on causing problems, even while the app is running, never mind at shutdown, when a form has been destroyed and a thread is trying to write to it, or a thread has been destroyed and a main-thread form is trying to call methods on it. It is much safer to communicate asynchronously with threads by queueing/PostMessaging objects that outlive both of them, e.g. objects created in the thread/form and freed in the form/thread, or by means of a (thread-safe) pool of objects created in an initialization section. Forms can then close/free safely while associated threads may continue to pointlessly fill up objects for handling until the main form closes, ExitProcess() is reached and the OS annihilates the threads.
'My Form handle is invalid because it has closed but my thread tries to post a message to it'. If the PostMessage raises an exception, exit your thread. A better way is similar to the approach above: only post messages to a window that outlives all forms. Create one in an initialization section with a trivial WndProc that only handles one const message number that all threads use for posting. You can use wParam to pass the TWinControl instance that the thread is trying to communicate with (usually a form variable), while lParam passes the object being communicated. When it gets a message from a thread, WndProc calls 'Perform' on the TWinControl passed, and the TWinControl will get the comms object in a message handler. A simple global boolean, 'AppClosing', say, can stop the WndProc calling Perform() on TWinControls that are freeing themselves during shutdown. This approach also avoids problems arising when the OS recreates your form window with a different handle: the Delphi form handle is not used, and Windows will not recreate/change the handle of the simple window created in initialization.
I have followed these approaches for decades and do not get any shutdown problems, even with apps with dozens of threads slinging objects around on queues.
Rgds,
Martin
Of course, the preferable way to solve this is to use a function that is able to wait on multiple criteria (such as WaitForMultipleObjects or MsgWaitForMultipleObjects).
However, if you have no control over which function is used, there are some tricky methods to solve this.
You may hook the functions imported from a system DLL by altering, in memory, the import table of any module. Since WaitForSingleObject is exported from kernel32.dll, this is possible.
Using this technique, you can redirect the caller into your own code, and there you will be able to use WaitForMultipleObjects.

C# Communication between threads

I am using .NET 3.5 and am trying to wrap my head around a problem (I'm no threading expert, so bear with me).
I have a Windows service with a very intensive process that is always running. I have put this process onto a separate thread so that the main thread of my service can handle operational tasks, i.e. service audit cycles, handling configuration changes, etc.
I'm starting the thread via the typical ThreadStart to a method which kicks the process off - call it workerthread.
On this workerthread I am sending data to another server. As is to be expected, the server reboots every now and again, the connection is lost, and I need to re-establish it (I am notified of the loss of connection via an event). From there I do my reconnect logic and I am back up and running. However, I quickly noticed that I was creating this worker thread over and over again each time (not what I want).
Now I could kill the workerthread when I lose the connection and start a new one but this seems like a waste of resources.
What I really want to do, is marshal the call (i.e., my thread start method) back to the thread that is still in memory although not doing anything.
Please post any examples or docs you have that would be of use.
Thanks.
You should avoid killing the worker thread. When you forcibly kill a Win32 thread, not all of its resources are fully recovered. I believe the reserved virtual address space (or is it the root page?) for the thread stack is not recovered when a Win32 thread is killed. It may not be much, but in a long-running server service process, it will add up over time and eventually bring down your service.
If the thread is allowed to exit its threadproc to terminate normally, all the resources are recovered.
If the background thread will be running continuously (not sleeping), you could just use a global boolean flag to communicate state between the main thread and the background thread, as long as the background thread checks this flag periodically. If the flag is set, the thread can shut itself down cleanly and exit. No need for locking semantics if the main thread is the only writer and the background thread only reads the flag value.
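A minimal sketch of the flag approach, with hypothetical names throughout. One assumption beyond the text: the flag is declared volatile so that the worker is guaranteed to observe the main thread's write, even though no lock is used.

```csharp
using System;
using System.Threading;

public sealed class FlagWorker
{
    // Written only by the controlling thread, read only by the worker.
    private volatile bool stopRequested;
    private readonly Thread thread;
    private long iterations;

    public FlagWorker()
    {
        thread = new Thread(Run) { IsBackground = true };
        thread.Start();
    }

    public long Iterations => Interlocked.Read(ref iterations);

    private void Run()
    {
        // The worker checks the flag once per unit of work and then
        // leaves its thread procedure normally.
        while (!stopRequested)
        {
            Interlocked.Increment(ref iterations);  // placeholder for real work
            Thread.Sleep(1);
        }
    }

    // Requests shutdown and waits up to 'timeoutMs' for a clean exit.
    public bool Stop(int timeoutMs)
    {
        stopRequested = true;
        return thread.Join(timeoutMs);
    }
}
```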
When the background thread loses the connection to the server that it's sending data to, why doesn't it perform the reconnect on its own? It's not clear to me why the main thread needs to tear down the background thread to start another.
You can use the Singleton pattern. In your case, make the connection a static object. Both threads can access the object, which means construct it and use it.
The main thread could construct it whenever required, and the worker thread access it whenever it is available.
Call the method using ThreadPool.QueueUserWorkItem instead. This method grabs a thread from the thread pool and kicks off a method. It appears to be ideal for the task of starting a method on another thread.
Also, when you say "typical ThreadStart" do you mean you're creating and starting a new Thread with a ThreadStart parameter, or you're creating a ThreadStart and calling Invoke on it?
Have you considered a BackgroundWorker?
From what I understand, you just have a single thread that's doing work, unless the need arises to cancel its processing.
I would kill (but end gracefully if possible) the worker thread anyway. Everything gets garbage-collected, and you can start from scratch.
How often does this server reboot happen? If it happens often enough for resources to be a problem, it's probably happening too often.
The BackgroundWorker is a bit slower than using plain threads, but it has the option of supporting the CancelAsync method.
Basically, BackgroundWorker is a wrapper around a worker thread with some extra options and events.
The CancelAsync method only works when WorkerSupportsCancellation is set.
When CancelAsync is called, CancellationPending is set.
The worker thread should periodically check CancellationPending to see if it needs to quit prematurely.
--jeroen
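A short sketch of the BackgroundWorker cancellation flow just described, assuming a worker whose loop can poll between units of work. `WorkerDemo` and `RunAndCancel` are hypothetical names, and the Thread.Sleep call stands in for real work.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class WorkerDemo
{
    // Starts a worker, cancels it, and reports whether the worker
    // finished in the cancelled state.
    public static bool RunAndCancel()
    {
        var done = new ManualResetEvent(false);
        bool cancelled = false;
        var worker = new BackgroundWorker { WorkerSupportsCancellation = true };

        worker.DoWork += (sender, e) =>
        {
            // Poll CancellationPending between units of work.
            while (!worker.CancellationPending)
                Thread.Sleep(10);   // placeholder for real work
            e.Cancel = true;        // tell the completed handler why we stopped
        };
        worker.RunWorkerCompleted += (sender, e) =>
        {
            cancelled = e.Cancelled;
            done.Set();
        };

        worker.RunWorkerAsync();
        worker.CancelAsync();       // only sets CancellationPending
        done.WaitOne();
        return cancelled;
    }
}
```

CancelAsync does not stop anything by itself; a DoWork handler that is stuck inside a blocking call never reaches the check.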

How do you handle a thread that has a hung call?

I have a thread that goes out and attempts to make a connection. In the thread, I make a call to a third party library. Sometimes, this call hangs, and never returns. On the UI thread, I want to be able to cancel the connection attempt by aborting the thread, which should abort the hung call to the third party library.
I've called Thread.Abort, but have now read that Thread.Abort only works when control returns to managed code. I have observed that this is true, because the thread never aborts, and I've been sitting on Thread.Join for ten minutes now. What should I do with this hung thread? Should I just null the reference and move on? I'd like to be as clean as possible--
Random thought: I wonder if you could write a second assembly as a small console exe that does this communication... launch it with Process.Start and capture results either via the file system or by intercepting stdout. Then if it hangs you can kill the process.
A bit harsh, maybe - and obviously it has overheads of spawning a process - but it should at least be possible to kill it.
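A rough sketch of the separate-process idea, assuming the third-party call has been moved into a helper executable of your own. `IsolatedCall` and `RunWithTimeout` are hypothetical names, and collecting the helper's results (stdout, a file) is omitted.

```csharp
using System;
using System.Diagnostics;

public static class IsolatedCall
{
    // Starts the helper executable and waits up to 'timeoutMs' for it.
    // Returns true if it exited on its own, false if it had to be killed.
    public static bool RunWithTimeout(string fileName, string arguments, int timeoutMs)
    {
        var info = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process helper = Process.Start(info))
        {
            if (helper.WaitForExit(timeoutMs))
                return true;
            try
            {
                helper.Kill();   // the hung call dies with its process
            }
            catch (InvalidOperationException)
            {
                // The helper exited between the timeout and the Kill call.
            }
            helper.WaitForExit();
            return false;
        }
    }
}
```

The process-start overhead is paid on every call; a long-lived helper that is restarted only after a hang would avoid that, at the cost of a more involved communication channel.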
This function in your third-party library doesn't have a timeout or cancel function? If so, that's pretty poor design. There's not going to be any pretty solution here, methinks...
Unfortunately, there's no way you're going to get around it, short of using the Win32 API to kill the thread manually, which is certainly not going to be clean. However, if this third-party library is not giving you any other options, it may be the thing to do. The TerminateThread function is what you'll want to use, but observe the warning! TerminateThread takes a thread handle, and the Thread class doesn't expose the native thread directly, so you need further Win32 API calls to get one. The approach here is to store the result of GetCurrentThreadId in a volatile class variable at the start of the managed thread method, and later pass that thread ID to OpenThread (requesting THREAD_TERMINATE access) to obtain the handle that TerminateThread needs.
Not sure if this will do it or be acceptable, but it's worth a shot.
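// hThread is a thread handle (e.g. from OpenThread with THREAD_TERMINATE access), not a thread ID.
[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool TerminateThread(IntPtr hThread, uint dwExitCode);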
From the documentation
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
Managed threads can't directly stop native threads. So if the call is blocked in native code, the best you can do is have the managed thread check a flag and terminate once the call returns. If it never returns, maybe there's a version of the call with a timeout?
If not, killing the thread (through win32) is not usually a good idea...
It is never a good idea to wait on a thread indefinitely (in any language), especially if you are making external calls. Always use a join with a timeout, or a spin loop that monitors a shared atomic variable until it changes or a timeout is reached. I'm not a C# guy, but these are all sound concurrency practices.
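In C#, a join with a timeout might look like the sketch below. `BoundedCall` and `TryRun` are hypothetical names. Note that a false result means the thread was abandoned, not stopped: it still exists and may return later.

```csharp
using System;
using System.Threading;

public static class BoundedCall
{
    // Runs 'call' on a background thread and waits at most 'timeout'.
    // Returns true if the call finished in time.
    public static bool TryRun(Action call, TimeSpan timeout)
    {
        // IsBackground keeps a hung call from blocking process exit.
        var thread = new Thread(() => call()) { IsBackground = true };
        thread.Start();
        return thread.Join(timeout);
    }
}
```

The UI thread stays responsive this way, but any resources held by the hung call remain held until the process ends.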
