So I have been playing around with threads for the last couple of months, and while my output is as expected, I have a feeling I'm not doing this the best way. I can't seem to get a straight answer from anyone I work with on what best practice is, so I thought I would ask you guys.
Question: I'm going to try to make this simple, so bear with me. Say I have a form with a start and a stop button. The start button fires an event that starts a thread. Inside this thread's DoWork it calls 3 methods. Method1() prints "A\n" to the console 10 times with a 10-second pause between each write. Method2() and Method3() are exactly the same, just with a different letter and different pause times between calls to Console.WriteLine. Now, when you press the stop button, you want the response to be immediate; I don't want to have to wait for the methods to complete. How do I go about this?
The way I have been doing this is passing my BackgroundWorker to each method and checking worker.CancellationPending, like so:
public void Method1(BackgroundWorker worker)
{
    for (int i = 0; i < 10 && !worker.CancellationPending; ++i)
    {
        Console.WriteLine("A");

        // Sleep in short slices so a cancellation request is noticed quickly.
        for (int j = 0; j < 100 && !worker.CancellationPending; ++j)
        {
            Thread.Sleep(100);
        }
    }
}
Like I said, this gives me the desired result. However, imagine that Method1 becomes a lot more complex; let's say it is using a DLL to write that has a key down and a key up. If I just abort the thread I could possibly leave myself in an undesired state as well. I find myself littering my code with !worker.CancellationPending; practically every code block checks CancellationPending. I have looked at a lot of examples online and I rarely see people passing a worker around like I am. What is best practice here?
Consider using iterators (yield return) to break up the steps.
public void Method1(BackgroundWorker worker)
{
    foreach (var discard in Method1Steps())
    {
        if (worker.CancellationPending)
            return;
    }
}

private IEnumerable<object> Method1Steps()
{
    for (int i = 0; i < 10; ++i)
    {
        yield return null;
        Console.WriteLine("A");

        for (int j = 0; j < 100; ++j)
        {
            Thread.Sleep(100);
            yield return null;
        }
    }
}
This solution may be harder to implement if you have a bunch of try/catch/finally blocks or a bunch of method calls that also need to know about cancellation.
Yes, you are doing it correctly. It may seem awkward at first, but it really is the best option. It is definitely far better than aborting a thread. Loop iterations, as you have discovered, are ideal candidates for checking CancellationPending. This is because a loop iteration often isolates a logical unit of work and thus easily delineates a safe point. Safe points are markers in the execution of a thread where termination can be accomplished without corrupting any data.
The trick is to poll CancellationPending at safe points frequently enough to provide timely feedback to the caller that cancellation has completed, but not so frequently that you hurt performance or "litter the code".
In your specific case the inner loop is the best place to poll CancellationPending; I would omit the check on the outer loop. The reason is that the inner loop is where most of the time is spent. The check on the outer loop would be pointless because the outer loop does very little actual work except to get the inner loop going.
Now, on the GUI side you might want to grey out the stop button to let the user know that the cancellation request was accepted. You could display a message like "cancellation pending" to make it clear. Once you get the feedback that cancellation is complete, you can remove the message.
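For what it's worth, here is a minimal sketch of how that wiring could look in WinForms. The control names (startButton, stopButton, statusLabel) are my own illustrative assumptions, not from the question; Method1/Method2/Method3 are the methods described above.
// Sketch only: control names are assumptions, not from the question.
public partial class Form1 : Form
{
    private readonly BackgroundWorker worker = new BackgroundWorker
    {
        WorkerSupportsCancellation = true
    };

    public Form1()
    {
        InitializeComponent();

        worker.DoWork += (s, args) =>
        {
            var bw = (BackgroundWorker)s;
            Method1(bw);
            if (bw.CancellationPending) { args.Cancel = true; return; }
            Method2(bw);
            if (bw.CancellationPending) { args.Cancel = true; return; }
            Method3(bw);
        };

        // Runs back on the UI thread once DoWork returns.
        worker.RunWorkerCompleted += (s, args) =>
        {
            startButton.Enabled = true;
            stopButton.Enabled = false;
            statusLabel.Text = args.Cancelled ? "Cancelled" : "Finished";
        };
    }

    private void startButton_Click(object sender, EventArgs e)
    {
        startButton.Enabled = false;
        stopButton.Enabled = true;
        statusLabel.Text = "Running...";
        worker.RunWorkerAsync();
    }

    private void stopButton_Click(object sender, EventArgs e)
    {
        worker.CancelAsync();          // request cooperative cancellation
        stopButton.Enabled = false;    // grey out: the request was accepted
        statusLabel.Text = "Cancellation pending...";
    }
}
The point of the RunWorkerCompleted handler is that it is raised on the UI thread, so that is the natural place to clear the "cancellation pending" state.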
Well, if you are in the situation where you have to abort a CPU-intensive thread, then you are somewhat stuck with testing an 'Abort' boolean (or cancellation token) in one loop or another (maybe not the innermost one - it depends on how long that takes). AFAIK, you can just 'return' from the inner loop and so exit the method - no need to check at every level! To minimize the overhead of this, try to make it a local-ish boolean, i.e. try not to dereference it through half a dozen classes every time.
Maybe inherit classes from 'Stoppable', which has an 'Abort' method and a 'Stop' boolean? Your example thread above spends most of its time sleeping, so you get 50 ms average latency before you get to check anything. In such a case, you could wait on some event with a timeout instead of sleeping. Override 'Abort' to set the event as well as calling the inherited Abort, and so terminate the wait early. You could also set the event in the cancellationToken delegate/callback, should you implement the newer functionality described by Dan.
There are actually very few Windows APIs etc. that are not easily 'unstickable' or that don't have asynchronous 'Ex' versions, so it's err.. 'nearly' always possible to cancel, one way or another, e.g. closing a socket to force a socket read to throw, or writing a temporary file to force Folder Change Notifications to return.
Rgds,
Martin
Related
I understand Thread.Abort() is evil from the multitude of articles I've read on the topic, so I'm currently in the process of ripping out all of my aborts in order to replace them with a cleaner way. After comparing user strategies from people here on Stack Overflow and then reading "How to: Create and Terminate Threads (C# Programming Guide)" from MSDN, both of which describe much the same approach -- a volatile bool checking strategy -- which is nice, I still have a few questions....
What immediately stands out to me here is: what if you do not have a simple worker process that is just running a loop of crunching code? For instance, my process is a background file uploader. I do in fact loop through each file, so that's something, and sure, I could add my while (!_shouldStop) at the top, which covers me every loop iteration, but there are many more business processes which occur before it hits its next loop iteration. I want this cancel procedure to be snappy; don't tell me I need to sprinkle these while loops every 4-5 lines down throughout my entire worker function?!
I really hope there is a better way. Could somebody please advise me on whether this is, in fact, the correct [and only?] approach, or share strategies they have used in the past to achieve what I am after?
Thanks gang.
Further reading: All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me. What if it is a linear, but lengthy, background operation?
Unfortunately there may not be a better option. It really depends on your specific scenario. The idea is to stop the thread gracefully at safe points. That is the crux of why Thread.Abort is not good: it is not guaranteed to occur at a safe point. By sprinkling the code with a stopping mechanism you are effectively defining the safe points manually. This is called cooperative cancellation. There are basically 4 broad mechanisms for doing this. You can choose the one that best fits your situation.
Poll a stopping flag
You have already mentioned this method. This is a pretty common one. Make periodic checks of the flag at safe points in your algorithm and bail out when it gets signalled. The standard approach is to mark the variable volatile. If that is not possible or inconvenient then you can use a lock. Remember, you cannot mark a local variable as volatile, so if a lambda expression captures it through a closure, for example, then you would have to resort to a different method for creating the memory barrier that is required. There is not a whole lot else that needs to be said for this method.
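A minimal sketch of the pattern (the class and method names here are illustrative, not from the question):
using System;
using System.Threading;

class Worker
{
    // Written by the requesting thread, read by the worker thread.
    private volatile bool _shouldStop;

    public void RequestStop()
    {
        _shouldStop = true;
    }

    public void DoWork()
    {
        // Safe point: the flag is checked once per unit of work.
        while (!_shouldStop)
        {
            DoOneUnitOfWork();
        }
        // Perform any cleanup here before the thread exits.
    }

    private void DoOneUnitOfWork()
    {
        Thread.Sleep(100);   // stand-in for real work
    }
}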
Use the new cancellation mechanisms in the TPL
This is similar to polling a stopping flag except that it uses the newer cancellation data structures in the TPL. It is still based on cooperative cancellation patterns. You need to get a CancellationToken and then periodically check its IsCancellationRequested property. To request cancellation you call Cancel on the CancellationTokenSource that originally provided the token. There is a lot you can do with the new cancellation mechanisms. You can read more about them here.
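A minimal sketch with the TPL types (Task.Factory.StartNew is used here because it was what shipped with these APIs; Task.Run works the same way on later frameworks, and the "work" is just a placeholder):
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task worker = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                if (token.IsCancellationRequested)
                    return;               // cooperative exit at a safe point
                Thread.Sleep(100);        // stand-in for real work
            }
        });

        Thread.Sleep(500);
        cts.Cancel();    // request cancellation from the calling thread
        worker.Wait();   // worker notices the request at its next check
    }
}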
Use wait handles
This method can be useful if your worker thread needs to wait for a specific interval or for a signal during its normal operation. You can Set a ManualResetEvent, for example, to let the thread know it is time to stop. You can test the event using the WaitOne function, which returns a bool indicating whether the event was signalled. WaitOne takes a parameter that specifies how long to wait before the call returns if the event was not signalled in that time. You can use this technique in place of Thread.Sleep and get the stopping indication at the same time. It is also useful if there are other WaitHandle instances the thread may have to wait on; you can call WaitHandle.WaitAny to wait on any event (including the stop event) in one call. Using an event can be better than calling Thread.Interrupt since you have more control over the flow of the program (Thread.Interrupt throws an exception, so you would have to strategically place try-catch blocks to perform any necessary cleanup).
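A sketch of how the timed wait replaces Thread.Sleep (the class shape and names are illustrative):
using System;
using System.Threading;

class Worker
{
    private readonly ManualResetEvent _stopEvent = new ManualResetEvent(false);

    public void RequestStop()
    {
        _stopEvent.Set();
    }

    public void DoWork()
    {
        for (int i = 0; i < 10; i++)
        {
            Console.WriteLine("A");

            // Wait up to 10 seconds, but return immediately if the stop event is set.
            // WaitOne returns true when the event was signalled.
            if (_stopEvent.WaitOne(TimeSpan.FromSeconds(10)))
                return;
        }
    }
}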
Specialized scenarios
There are several one-off scenarios that have very specialized stopping mechanisms. It is definitely outside the scope of this answer to enumerate them all (never mind that it would be nearly impossible). A good example of what I mean here is the Socket class. If the thread is blocked on a call to Send or Receive then calling Close will interrupt whatever blocking call the socket was in, effectively unblocking it. I am sure there are several other areas in the BCL where similar techniques can be used to unblock a thread.
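A rough sketch of that socket idea; note that the exact exception you observe can vary by framework version and socket state, so treat this as an outline rather than a definitive recipe:
using System;
using System.Net.Sockets;

class SocketWorker
{
    // Assumes a connected Socket shared between the worker and the cancelling thread.
    private readonly Socket _socket;

    public SocketWorker(Socket socket) { _socket = socket; }

    public void ReceiveLoop()
    {
        var buffer = new byte[4096];
        try
        {
            while (true)
            {
                int read = _socket.Receive(buffer);   // blocks here
                if (read == 0) break;                 // remote side closed
                // ... process buffer ...
            }
        }
        catch (SocketException)
        {
            // Receive was interrupted by Close(); treat as cancellation.
        }
        catch (ObjectDisposedException)
        {
            // The socket was disposed by the cancelling thread.
        }
    }

    public void Cancel()
    {
        _socket.Close();   // interrupts the pending Receive on the worker thread
    }
}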
Interrupt the thread via Thread.Interrupt
The advantage here is that it is simple and you do not have to sprinkle your code with anything. The disadvantage is that you have little control over where the safe points are in your algorithm. The reason is that Thread.Interrupt works by injecting an exception into one of the canned BCL blocking calls: Thread.Sleep, WaitHandle.WaitOne, Thread.Join, etc. So you have to be wise about where you place them. However, most of the time the algorithm dictates where they go, and that is usually fine anyway, especially if your algorithm spends most of its time in one of these blocking calls. If your algorithm does not use one of the blocking calls in the BCL then this method will not work for you. The theory here is that the ThreadInterruptedException is only generated from a .NET waiting call, so it is likely to occur at a safe point. At the very least you know that the thread cannot be in unmanaged code or bail out of a critical section leaving a lock dangling in an acquired state. Despite this being less invasive than Thread.Abort I still discourage its use, because it is not obvious which calls respond to it and many developers will be unfamiliar with its nuances.
Well, unfortunately in multithreading you often have to compromise "snappiness" for cleanliness... you can exit a thread immediately if you Interrupt it, but it won't be very clean. So no, you don't have to sprinkle the _shouldStop checks every 4-5 lines, but if you do interrupt your thread then you should handle the exception and exit out of the loop in a clean manner.
Update
Even if it's not a looping thread (i.e. perhaps it's a thread that performs some long-running asynchronous operation or some type of blocking input operation), you can Interrupt it, but you should still catch the ThreadInterruptedException and exit the thread cleanly. I think the examples you've been reading are very appropriate.
Update 2.0
Yes I have an example... I'll just show you an example based on the link you referenced:
public class InterruptExample
{
    private Thread t;
    private volatile bool alive;

    public InterruptExample()
    {
        alive = false;
        t = new Thread(() =>
        {
            try
            {
                while (alive)
                {
                    /* Do work. */
                }
            }
            catch (ThreadInterruptedException)
            {
                /* Clean up. */
            }
        });
        t.IsBackground = true;
    }

    public void Start()
    {
        alive = true;
        t.Start();
    }

    public void Kill(int timeout = 0)
    {
        // Somebody tells you to stop the thread.
        t.Interrupt();

        // Optionally you can block the caller
        // by making them wait until the thread exits.
        // If they leave the default timeout,
        // then they will not wait at all.
        t.Join(timeout);
    }
}
If cancellation is a requirement of the thing you're building, then it should be treated with as much respect as the rest of your code--it may be something you have to design for.
Let's assume that your thread is doing one of two things at all times.
Something CPU bound
Waiting for the kernel
If you're CPU bound in the thread in question, you probably have a good spot to insert the bail-out check. If you're calling into someone else's code to do some long-running CPU-bound task, then you might need to fix the external code, move it out of process (aborting threads is evil, but aborting processes is well-defined and safe), etc.
If you're waiting for the kernel, then there's probably a handle (or fd, or mach port, ...) involved in the wait. Usually if you destroy the relevant handle, the kernel will return with some failure code immediately. If you're in .net/java/etc. you'll likely end up with an exception. In C, whatever code you already have in place to handle system call failures will propagate the error up to a meaningful part of your app. Either way, you break out of the low-level place fairly cleanly and in a very timely manner without needing new code sprinkled everywhere.
A tactic I often use with this kind of code is to keep track of a list of handles that need to be closed and then have my abort function set a "cancelled" flag and then close them. When the function fails it can check the flag and report failure due to cancellation rather than due to whatever the specific exception/errno was.
You seem to be implying that an acceptable granularity for cancellation is at the level of a service call. This is probably not good thinking--you are much better off cancelling the background work synchronously and joining the old background thread from the foreground thread. It's way cleaner because:
It avoids a class of race conditions when old background-work threads come back to life after unexpected delays.
It avoids hidden thread/memory leaks from hanging background work, because the effects of a hanging background thread can no longer hide.
There are two reasons to be scared of this approach:
You don't think you can abort your own code in a timely fashion. If cancellation is a requirement of your app, the decision you really need to make is a resource/business decision: do a hack, or fix your problem cleanly.
You don't trust some code you're calling because it's out of your control. If you really don't trust it, consider moving it out-of-process. You get much better isolation from many kinds of risks, including this one, that way.
The best answer largely depends on what you're doing in the thread.
Like you said, most answers revolve around polling a shared boolean every couple of lines. Even though you may not like it, this is often the simplest scheme. If you want to make your life easier, you can write a method like ThrowIfCancelled(), which throws some kind of exception if you're done. The purists will say this is (gasp) using exceptions for control flow, but then again cancelling is exceptional, imo.
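For example, a sketch of that helper; the flag, the step names, and the choice of OperationCanceledException are illustrative assumptions:
using System;

class Worker
{
    private volatile bool _cancelled;

    public void Cancel()
    {
        _cancelled = true;
    }

    private void ThrowIfCancelled()
    {
        if (_cancelled)
            throw new OperationCanceledException();
    }

    public void DoWork()
    {
        try
        {
            StepOne();
            ThrowIfCancelled();
            StepTwo();
            ThrowIfCancelled();
            StepThree();
        }
        catch (OperationCanceledException)
        {
            // One place to clean up and exit instead of checks littered everywhere.
        }
    }

    private void StepOne()   { /* ... */ }
    private void StepTwo()   { /* ... */ }
    private void StepThree() { /* ... */ }
}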
If you're doing IO operations (like network stuff), you may want to consider doing everything using async operations.
If you're doing a sequence of steps, you could use the IEnumerable trick to make a state machine. Example:
abstract class StateMachine : IDisposable
{
    public abstract IEnumerable<object> Main();

    public virtual void Dispose()
    {
        // ... override with freeing code ...
    }

    bool wasCancelled;

    public void Cancel()
    {
        // ... set wasCancelled using locking scheme of choice ...
    }

    public Thread Run()
    {
        var thread = new Thread(() =>
        {
            try
            {
                if (wasCancelled) return;
                foreach (var x in Main())
                {
                    if (wasCancelled) return;
                }
            }
            finally { Dispose(); }
        });
        thread.Start();
        return thread;
    }
}

class MyStateMachine : StateMachine
{
    public override IEnumerable<object> Main()
    {
        DoSomething();
        yield return null;
        DoSomethingElse();
        yield return null;
    }
}

// then call new MyStateMachine().Run() to run.
Overengineering? It depends on how many state machines you use. If you just have 1, yes. If you have 100, then maybe not. Too tricky? Well, it depends. Another bonus of this approach is that it lets you (with minor modifications) move your operation into a Timer.Tick callback and avoid threading altogether if that makes sense.
and do everything that blucz says too.
Perhaps a piece of the problem is that you have such a long method / while loop. Whether or not you are having threading issues, you should break it down into smaller processing steps. Let's suppose those steps are Alpha(), Bravo(), Charlie() and Delta().
You could then do something like this:
public void MyBigBackgroundTask()
{
    Action[] tasks = new Action[] { Alpha, Bravo, Charlie, Delta };
    int workStepSize = 0;

    while (!_shouldStop)
    {
        tasks[workStepSize++]();
        workStepSize %= tasks.Length;
    }
}
So yes it loops endlessly, but checks if it is time to stop between each business step.
You don't have to sprinkle while loops everywhere. The outer while loop just checks if it's been told to stop and if so doesn't make another iteration...
If you have a straight "go do something and close out" thread (no loops in it) then you just check the _shouldStop boolean either before or after each major spot inside the thread. That way you know whether it should continue on or bail out.
for example:
public void DoWork()
{
    RunSomeBigMethod();
    if (_shouldStop) { return; }

    RunSomeOtherBigMethod();
    if (_shouldStop) { return; }

    //....
}
Instead of adding a while loop where a loop doesn't otherwise belong, add something like if (_shouldStop) CleanupAndExit(); wherever it makes sense to do so. There's no need to check after every single operation or sprinkle the code all over with them. Instead, think of each check as a chance to exit the thread at that point and add them strategically with this in mind.
All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me
There are not a lot of ways to make code take a long time. Looping is a pretty essential programming construct. Making code take a long time without looping takes a huge amount of statements. Hundreds of thousands.
Or you are calling some other code that does the looping for you. Yes, it's hard to make that code stop on demand; that just doesn't work.
I have this code doing what I want:
TriggerSomeExternalProcess();

double secondsElapsed = 0;
DateTime startTime = DateTime.UtcNow;
double timeoutInSeconds = 10;

while (secondsElapsed < timeoutInSeconds)
{
    // TODO: this seems bad...
    secondsElapsed = DateTime.UtcNow.Subtract(startTime).TotalSeconds;
}

CheckStatusOfExternalProcess();
The goal is to TriggerSomeExternalProcess and then CheckStatusOfExternalProcess - but that process runs on the same thread, so I can't do Thread.Sleep(). It's an ongoing process that can't be awaited.
I feel like the above while loop is wrong - what pattern do you employ when you need to wait without blocking your thread?
copy-pasted from a comment on one of the answers
unfortunately I can't touch the code in the ExternalProcess. I'm writing a test and those are the methods I have access to. I know it's less than ideal
Instead of using a CheckStatusOfExternalProcess() method, you may be able to add a StatusChanged event to the ExternalProcess thing and attach an event handler to it. That way your event handler gets called when the status has changed.
Is that a possibility for you?
Btw: if both of your processes run on the same thread, how can that not be blocking?
I assume the process that's external is not your own.
Therefore it can't take a callback action.
It can't give a heartbeat (periodically send back its current status).
And you can't subscribe to its status changes.
These would be the normal ways to deal with it.
In which case you could just use something like this:
Task.Delay(TimeSpan.FromSeconds(10)).ContinueWith(t => CheckStatusOfExternalProcess());
ContinueWith will fire as soon as the first task is complete, but now you can continue on in your code without worrying about it.
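Roughly, on .NET 4.5+ you could also write the same thing with async/await. This is a sketch only: the two externally-named methods are the ones from the question, stubbed out here so the snippet stands alone.
using System;
using System.Threading.Tasks;

class ExternalProcessTest
{
    // Stubs standing in for the question's methods.
    void TriggerSomeExternalProcess()   { /* ... */ }
    void CheckStatusOfExternalProcess() { /* ... */ }

    public async Task TriggerAndCheckAsync()
    {
        TriggerSomeExternalProcess();

        // Waits 10 seconds without blocking the current thread.
        await Task.Delay(TimeSpan.FromSeconds(10));

        CheckStatusOfExternalProcess();
    }
}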
I'm writing an application working with a big and ugly 3rd party system via a complicated API.
Sometimes errors happen in the system, but if we wait for my program to run into those errors it can be too late.
So, I use a separate thread to check the system state as following:
while (true)
{
    ask_state();
    check_state();
    System.Threading.Thread.Sleep(TimeSpan.FromSeconds(1));
}
It doesn't really matter whether I check the system state once every 100 ms or once a minute.
But I have heard that using Thread.Sleep() is a bad practice. Why? And what can I do in this situation?
One reason is that Thread.Sleep() blocks that thread from doing anything else. The recent trend is to block as little as possible; Node.js, for example, is built around non-blocking I/O.
Update: I don't know about the internals of the Timer class in C#. Maybe it also blocks.
You can schedule a task to check that third-party API every 100 ms. This way, during that 100 ms, your program can do other tasks.
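For instance, a sketch with System.Threading.Timer; ask_state/check_state stand in for the question's methods and are stubbed here so the snippet compiles on its own:
using System;
using System.Threading;

class StatePoller : IDisposable
{
    private readonly Timer _timer;

    public StatePoller()
    {
        // Fire immediately, then once per second; the callback runs on a
        // thread-pool thread, so the rest of the program keeps working.
        _timer = new Timer(_ =>
        {
            ask_state();
            check_state();
        }, null, TimeSpan.Zero, TimeSpan.FromSeconds(1));
    }

    public void Dispose()
    {
        _timer.Dispose();
    }

    private void ask_state()   { /* query the 3rd-party system */ }
    private void check_state() { /* validate what came back */ }
}
Note that timer callbacks can overlap if a check takes longer than the interval, so guard against re-entrancy if that matters.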
Update: This analogy might help. If we compare the operating system to a hospital, and the threads to nurses in that hospital, the supervisor (programmer) can choose a policy:
Either ask each nurse (thread) to watch one, and only one, patient (a job, a task to be done), even if she waits an hour between checks (the Sleep() method);
Or ask each nurse to check a patient and, during the interval until the next check, go on and check other patients.
The first model is blocking and doesn't scale. In the second model, even with a few nurses, you might be able to serve many patients.
Because the only way to shut down this thread if it's waiting inside the Sleep is to either a) wait for the Sleep to end, or b) use one of Thread.Abort or Thread.Interrupt.1
If it's a long sleep, then (a) isn't really suitable if you're trying to be responsive. And the options in (b) are pretty obnoxious if the code happens not to actually be inside the Sleep at the time.
It's far better, if you want to be able to interrupt the sleeping behaviour in a suitable fashion, to use a waitable object (such as a ManualResetEvent) - you might then even be able to place the wait on the waitable object in the while conditional, to make it clear what will cause the thread to exit (see the sketch below).
1 I've used shutdown in this instance because it's a very common scenario where cross-thread communication is required. But for any other cross-thread signalling or communication the same arguments apply, and if it's not shutdown then Thread.Abort or Thread.Interrupt are even less suitable.
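Putting the wait in the loop conditional, as suggested, might look like this. It is only a sketch: the class name is mine, and ask_state/check_state from the question are stubbed so the snippet stands alone.
using System;
using System.Threading;

class StateWatcher
{
    private readonly ManualResetEvent _stopRequested = new ManualResetEvent(false);

    public void MonitorLoop()
    {
        // WaitOne returns false on timeout (keep polling) and true once Set() is called (exit).
        while (!_stopRequested.WaitOne(TimeSpan.FromSeconds(1)))
        {
            ask_state();
            check_state();
        }
    }

    public void Shutdown()
    {
        _stopRequested.Set();   // wakes the wait immediately instead of after the full second
    }

    private void ask_state()   { /* query the 3rd-party system */ }
    private void check_state() { /* validate what came back */ }
}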
I would set a timer to whatever interval you want and wait for the check methods to complete. By the way, do you want to use an eternal loop, or is that not the complete code you showed up there?
OK, this is a sample of what I'm talking about:
public void myFunction()
{
    int startCount = Environment.TickCount;
    ask_state();
    check_state();

    while (true)
    {
        if (Environment.TickCount - startCount >= 20000) // 20 seconds
        {
            break;
        }
        Application.DoEvents();
    }
}
// Now you have an organized function that performs the task you want.
// Just call it at every time interval; again, you can use a timer to do that for you.
private void timer_Tick(object sender, EventArgs e)
{
    myFunction();
}
good luck
I have a loop that I don't want to continue until LoadAmazonDataByBatch() has returned. I know there must be a straightforward way of doing it, and I'm almost certain I'm approaching the problem wrong.
const int batchSize = 500;

for (int i = 0; i < total; i = i + batchSize)
{
    LoadAmazonDataByBatch(i, batchSize, fileList, total, amazonLogHandler, stopWatch);
}
LoadAmazonDataByBatch() does a bunch of things on worker threads including creating a temporary DataSet that would get very large without the batching. I don't want to create a new DataSet until the old one is processed and disposed (by LoadAmazonDataByBatch).
Obviously the way this is written now everything happens almost all at once.
How can I approach this better?
You need to do some sort of thread synchronization.
It's not clear where you got LoadAmazonDataByBatch(), but I'd suggest checking the doc for that function to see if there is a synchronous version of the operation.
If no doc is available, then you will need to roll up your sleeves. It may require viewing or modifying the source of LoadAmazonDataByBatch(). Look for a ManualResetEvent that is set by the workers when they are finished, or maybe there is a regular .NET event raised by that method when it completes. If those things don't exist, you'll need to add something like that.
It's very likely that LoadAmazonDataByBatch creates a bunch of threads. You have to call Join on all created threads to wait till they complete.
Surely the only way this wouldn't wait for the function to return is if it's written asynchronously?
The relevant code isn't the loop you posted, it's the definition of LoadAmazonDataByBatch() that we need to see.
If that function has a callback (stopWatch?), perhaps you could call the function (LoadAmazonDataByBatch) within the callback.
If LoadAmazonDataByBatch() generates child threads, and it runs until each of those threads is finished, you can use the Thread.Join() method to make it wait for the child threads to finish. I am not sure how that would work for multiple children but I think it should be OK.
Reference: Threading in C#
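A rough sketch of the Join idea, assuming (purely hypothetically) that LoadAmazonDataByBatch could be changed to hand back the threads it starts; the variables are the ones from the question:
const int batchSize = 500;

for (int i = 0; i < total; i += batchSize)
{
    // Hypothetical variant that returns the worker threads it spawned.
    List<Thread> workers = LoadAmazonDataByBatch(i, batchSize, fileList, total, amazonLogHandler, stopWatch);

    // Block until this batch is fully processed (and its DataSet disposed)
    // before the loop moves on and creates the next one.
    foreach (Thread worker in workers)
    {
        worker.Join();
    }
}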