Let's say I have a C# program with a GUI, and the update/refresh/display of the GUI takes 0.2 seconds.
Suppose that while it is still computing the display (within those 0.2 seconds), a new update request arrives, so the current one is outdated. How can I make it stop this meaningless, outdated work and start computing for the new request?
It might not be only about UI updates, either. Perhaps, for any function call, how can I make it so it will become "If another of the same call is issued, abandon the current work and go with the new data/situation instead"?
Thanks.
Perhaps, for any function call, how can I make it so it will become "If another of the same call is issued, abandon the current work and go with the new data/situation instead"?
Why would you want that? You would lose all provability in your code. You would never be able to ensure a consistent state in your system. If you want to simulate it, just mess with the program counter (PC): design a program that arbitrarily pushes the PC back to the top of whatever method it is in. You would quickly see the system devolve.
You're talking about some fairly advanced threading issues. Using the built-in system control structure, there's no way to prevent the message loop from completing any more than there is a method for interrupting (gracefully) another method.
Now, if this capability is very important to you, you COULD build all custom controls and within your control's painting code you could check (in a thread-safe manner, of course) a boolean value indicating whether or not painting should continue.
But if I can take a stab in the dark, I'm going to guess that you're not actually explicitly doing any multithreading. If this is the case, then the scenario that you describe can't ever actually happen, as the process of refreshing the GUI is going to complete before another one can begin (namely, this anonymous process you're describing that calls for another refresh or deems the current one stale). Because code on the same thread executes sequentially, there's really no opportunity for an unrelated piece of code to cause an update.
The semantics of how and when repaints take place (the difference between Invalidate() and Refresh(), and their respective impacts on this logic, for example) is a topic that's probably really not of interest to you. Just know that if you're...
Doing multithreading, then you'll have to implement your own code for checking whether or not the current operation should continue (for the UI, this means custom controls with this logic in the paint logic)
Not doing multithreading, then what you describe can never happen.
Hope this is helpful!
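One possible way is to start a thread that performs the GUI update and, when a new request arrives, abort it and start another. This is generally not a recommended practice, because Thread.Abort can stop the thread at an arbitrary point (and on .NET Core and .NET 5+ it is not supported at all and throws PlatformNotSupportedException), but on the .NET Framework you should be able to get away with it in a simple case like this.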
using System;
using System.Threading;
using System.Windows.Forms;

public static class ControlExtensions
{
    // Runs func on the control's UI thread when called from another thread.
    public static TResult InvokeEx<TControl, TResult>(this TControl control,
        Func<TControl, TResult> func)
        where TControl : Control
    {
        if (control.InvokeRequired)
            return (TResult)control.Invoke(func, control);
        else
            return func(control);
    }
}

public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
    }

    Thread guiUpdateThread = null;

    public void BeginLongGuiUpdate(MyState state)
    {
        // IsAlive is true only while the thread has started and not yet finished
        if (guiUpdateThread != null && guiUpdateThread.IsAlive)
        {
            guiUpdateThread.Abort();
            guiUpdateThread.Join(); // wait for thread to abort
        }
        guiUpdateThread = new Thread(LongGuiUpdate);
        guiUpdateThread.Start(state);
    }

    private void LongGuiUpdate(object state)
    {
        MyState myState = state as MyState;
        // ...
        Thread.Sleep(200);
        this.InvokeEx(f => f.Text = myState.NewTitle);
        // ...
    }
}
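For comparison, here is a rough sketch of the same "newest request wins" idea done cooperatively, without Thread.Abort. It assumes .NET 4.5 or later (for Task.Run); LatestOnlyRunner and its members are made-up names for illustration, not part of any library. Each new request cancels the token handed to the previous job, and the job is expected to check that token at points where it is safe to stop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: starting a new job cancels the previous one
// cooperatively instead of aborting its thread.
public sealed class LatestOnlyRunner
{
    private readonly object _gate = new object();
    private CancellationTokenSource _current;

    // work receives a token and should poll it at safe points.
    public Task Run(Action<CancellationToken> work)
    {
        CancellationTokenSource mine = new CancellationTokenSource();
        CancellationTokenSource previous;
        lock (_gate)
        {
            previous = _current;   // remember the job we are replacing
            _current = mine;
        }
        if (previous != null)
            previous.Cancel();     // tell the stale job to stop
        return Task.Run(() => work(mine.Token));
    }
}
```

A stale job written as `token => { while (!token.IsCancellationRequested) { /* one slice of work */ } }` returns shortly after the next Run call. In a WinForms program the job would still have to marshal its final result to the UI thread, for example with Control.Invoke as in the code above.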
I don't know if this maps to what you need, but here goes.
One way to achieve this kind of behavior is to reverse the problem and delay the actual rendering.
Any time you get a request or change, fire off a timer. Every request coming in will either start or restart the timer.
When the timer actually elapses, carry out the rendering.
It does not do exactly what you describe, but it might actually do what you need in the end, which is not to render for each request because rendering takes too long.
If you do not have continuous requests, this works fairly well. Obviously, if you do, you never get anything displayed...
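A minimal sketch of that timer idea, assuming System.Threading.Timer; the Debouncer class, its render callback, and the delay value are hypothetical names chosen for illustration. Every call to Request() pushes the deadline back, so the render runs once after requests stop arriving.

```csharp
using System;
using System.Threading;

// Hypothetical debouncer: each Request() (re)starts a quiet-period timer;
// the render callback runs once the requests stop arriving.
public sealed class Debouncer : IDisposable
{
    private readonly Timer _timer;
    private readonly int _delayMs;

    public Debouncer(Action render, int delayMs)
    {
        _delayMs = delayMs;
        // Created disarmed; Change() arms it.
        _timer = new Timer(_ => render(), null, Timeout.Infinite, Timeout.Infinite);
    }

    // Call on every update request; restarts the countdown.
    public void Request()
    {
        _timer.Change(_delayMs, Timeout.Infinite);
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

Note that this timer fires its callback on a thread-pool thread, so a GUI program would still need to marshal the render onto the UI thread. It also has the weakness mentioned above: a continuous stream of requests keeps postponing the render.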
Related
I understand from the multitude of articles I've read on the topic that Thread.Abort() is evil, so I'm currently in the process of ripping out all of my aborts and replacing them with something cleaner. After comparing strategies from people here on Stack Overflow and reading "How to: Create and Terminate Threads (C# Programming Guide)" on MSDN, both of which recommend much the same approach (polling a volatile bool), which is nice, I still have a few questions....
What immediately stands out to me is: what if you do not have a simple worker process that just runs a loop of crunching code? In my case, the process is a background file uploader. I do in fact loop through each file, so that's something, and sure, I could add my while (!_shouldStop) at the top, which covers me on every loop iteration. But many more business processes occur before it hits its next loop iteration, and I want this cancel procedure to be snappy; don't tell me I need to sprinkle these while loops every 4-5 lines throughout my entire worker function?!
I really hope there is a better way, could somebody please advise me on if this is in fact, the correct [and only?] approach to do this, or strategies they have used in the past to achieve what I am after.
Thanks gang.
Further reading: All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me. What if it is a linear, but timely background operation?
Unfortunately there may not be a better option. It really depends on your specific scenario. The idea is to stop the thread gracefully at safe points. That is the crux of the reason why Thread.Abort is not good: it is not guaranteed to occur at safe points. By sprinkling the code with a stopping mechanism you are effectively manually defining the safe points. This is called cooperative cancellation. There are basically five broad mechanisms for doing this. You can choose the one that best fits your situation.
Poll a stopping flag
You have already mentioned this method. This is a pretty common one. Make periodic checks of the flag at safe points in your algorithm and bail out when it gets signalled. The standard approach is to mark the variable volatile. If that is not possible or is inconvenient, then you can use a lock. Remember, you cannot mark a local variable as volatile, so if a lambda expression captures it through a closure, for example, then you would have to resort to a different method for creating the memory barrier that is required. There is not a whole lot else that needs to be said for this method.
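To illustrate the closure case, here is a small sketch assuming .NET 4.5 or later, where Volatile.Read and Volatile.Write are available; FlagPollingDemo and RunUntilStopped are hypothetical names. The flag is a captured local, so the volatile keyword cannot be used, and the explicit calls stand in for it.

```csharp
using System.Threading;

public static class FlagPollingDemo
{
    // Returns the number of work units completed before the stop was seen.
    public static int RunUntilStopped(int stopAfterMs)
    {
        bool stop = false;      // captured local: cannot be declared volatile
        int iterations = 0;

        var worker = new Thread(() =>
        {
            // Volatile.Read gives the barrier the keyword would have provided.
            while (!Volatile.Read(ref stop))
            {
                iterations++;       // stand-in for one unit of real work
                Thread.Sleep(1);
            }
        });

        worker.Start();
        Thread.Sleep(stopAfterMs);
        Volatile.Write(ref stop, true);  // request the stop
        worker.Join();                   // wait until the worker reaches a safe point
        return iterations;
    }
}
```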
Use the new cancellation mechanisms in the TPL
This is similar to polling a stopping flag except that it uses the new cancellation data structures in the TPL. It is still based on cooperative cancellation patterns. You need to get a CancellationToken and then periodically check IsCancellationRequested. To request cancellation you would call Cancel on the CancellationTokenSource that originally provided the token. There is a lot you can do with the new cancellation mechanisms. You can read more about it here.
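A short sketch of the token version; TokenPollingDemo and ProcessItems are made-up names, and the Thread.Sleep call is only a placeholder for real work.

```csharp
using System;
using System.Threading;

public static class TokenPollingDemo
{
    // Hypothetical worker: processes items until it is done or cancelled.
    public static int ProcessItems(int itemCount, CancellationToken token)
    {
        int processed = 0;
        for (int i = 0; i < itemCount; i++)
        {
            if (token.IsCancellationRequested)
                break;              // safe point: nothing is half-finished here
            Thread.Sleep(1);        // stand-in for real work
            processed++;
        }
        return processed;
    }
}
```

The caller creates a CancellationTokenSource, passes its Token property to the worker, and calls Cancel() on the source from any thread when the work should stop.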
Use wait handles
This method can be useful if your worker thread needs to wait for a specific interval or for a signal during its normal operation. You can Set a ManualResetEvent, for example, to let the thread know it is time to stop. You can test the event using the WaitOne function, which returns a bool indicating whether the event was signalled. WaitOne takes a timeout parameter that specifies how long to wait; if the event is not signalled within that time, the call returns false. You can use this technique in place of Thread.Sleep and get the stopping indication at the same time. It is also useful if there are other WaitHandle instances that the thread may have to wait on. You can call WaitHandle.WaitAny to wait on any event (including the stop event) all in one call. Using an event can be better than calling Thread.Interrupt since you have more control over the flow of the program (Thread.Interrupt throws an exception, so you would have to strategically place the try-catch blocks to perform any necessary cleanup).
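As a sketch of using WaitOne in place of Thread.Sleep (PeriodicWorker and its members are hypothetical names for illustration): the worker wakes up every intervalMs to do its work, but wakes immediately when the stop event is set.

```csharp
using System;
using System.Threading;

public sealed class PeriodicWorker
{
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);
    private Thread _thread;
    public int Ticks;   // number of completed work cycles

    public void Start(int intervalMs)
    {
        _thread = new Thread(() =>
        {
            // WaitOne returns true as soon as _stop is set, or false after
            // intervalMs elapses, so it doubles as an interruptible sleep.
            while (!_stop.WaitOne(intervalMs))
            {
                Ticks++;    // stand-in for the periodic work
            }
        });
        _thread.Start();
    }

    public void Stop()
    {
        _stop.Set();      // wakes the worker immediately if it is waiting
        _thread.Join();
    }
}
```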
Specialized scenarios
There are several one-off scenarios that have very specialized stopping mechanisms. It is definitely outside the scope of this answer to enumerate them all (never mind that it would be nearly impossible). A good example of what I mean here is the Socket class. If the thread is blocked on a call to Send or Receive, then calling Close will interrupt the socket on whatever blocking call it was in, effectively unblocking it. I am sure there are several other areas in the BCL where similar techniques can be used to unblock a thread.
Interrupt the thread via Thread.Interrupt
The advantage here is that it is simple and you do not have to focus on sprinkling your code with anything really. The disadvantage is that you have little control over where the safe points are in your algorithm. The reason is that Thread.Interrupt works by injecting an exception inside one of the canned BCL blocking calls. These include Thread.Sleep, WaitHandle.WaitOne, Thread.Join, etc. So you have to be wise about where you place them. However, most of the time the algorithm dictates where they go, and that is usually fine anyway, especially if your algorithm spends most of its time in one of these blocking calls. If your algorithm does not use one of the blocking calls in the BCL then this method will not work for you. The theory here is that the ThreadInterruptedException is only generated from a .NET waiting call, so it is likely at a safe point. At the very least you know that the thread cannot be in unmanaged code or bail out of a critical section leaving a dangling lock in an acquired state. Despite this being less invasive than Thread.Abort, I still discourage its use because it is not obvious which calls respond to it and many developers will be unfamiliar with its nuances.
Well, unfortunately in multithreading you often have to compromise "snappiness" for cleanliness... you can exit a thread immediately if you Interrupt it, but it won't be very clean. So no, you don't have to sprinkle the _shouldStop checks every 4-5 lines, but if you do interrupt your thread then you should handle the exception and exit out of the loop in a clean manner.
Update
Even if it's not a looping thread (i.e. perhaps it's a thread that performs some long-running asynchronous operation or some type of block for input operation), you can Interrupt it, but you should still catch the ThreadInterruptedException and exit the thread cleanly. I think that the examples you've been reading are very appropriate.
Update 2.0
Yes I have an example... I'll just show you an example based on the link you referenced:
using System.Threading;

public class InterruptExample
{
    private Thread t;
    private volatile bool alive;

    public InterruptExample()
    {
        alive = false;
        t = new Thread(() =>
        {
            try
            {
                while (alive)
                {
                    /* Do work. */
                }
            }
            catch (ThreadInterruptedException)
            {
                /* Clean up. */
            }
        });
        t.IsBackground = true;
    }

    public void Start()
    {
        alive = true;
        t.Start();
    }

    public void Kill(int timeout = 0)
    {
        // somebody tells you to stop the thread
        alive = false;   // ends the loop even if the work never blocks
        t.Interrupt();   // wakes the thread if it is inside a blocking call
        // Optionally you can block the caller
        // by making them wait until the thread exits.
        // If they leave the default timeout,
        // then they will not wait at all
        t.Join(timeout);
    }
}
If cancellation is a requirement of the thing you're building, then it should be treated with as much respect as the rest of your code--it may be something you have to design for.
Let's assume that your thread is doing one of two things at all times.
Something CPU bound
Waiting for the kernel
If you're CPU bound in the thread in question, you probably have a good spot to insert the bail-out check. If you're calling into someone else's code to do some long-running CPU-bound task, then you might need to fix the external code, move it out of process (aborting threads is evil, but aborting processes is well-defined and safe), etc.
If you're waiting for the kernel, then there's probably a handle (or fd, or mach port, ...) involved in the wait. Usually if you destroy the relevant handle, the kernel will return with some failure code immediately. If you're in .net/java/etc. you'll likely end up with an exception. In C, whatever code you already have in place to handle system call failures will propagate the error up to a meaningful part of your app. Either way, you break out of the low-level place fairly cleanly and in a very timely manner without needing new code sprinkled everywhere.
A tactic I often use with this kind of code is to keep track of a list of handles that need to be closed and then have my abort function set a "cancelled" flag and then close them. When the function fails it can check the flag and report failure due to cancellation rather than due to whatever the specific exception/errno was.
You seem to be implying that an acceptable granularity for cancellation is at the level of a service call. This is probably not good thinking--you are much better off cancelling the background work synchronously and joining the old background thread from the foreground thread. It's way cleaner because:
It avoids a class of race conditions when old bgwork threads come back to life after unexpected delays.
It avoids potential hidden thread/memory leaks caused by hanging background processes, because the effects of a hanging background thread can no longer stay hidden.
There are two reasons to be scared of this approach:
You don't think you can abort your own code in a timely fashion. If cancellation is a requirement of your app, the decision you really need to make is a resource/business decision: do a hack, or fix your problem cleanly.
You don't trust some code you're calling because it's out of your control. If you really don't trust it, consider moving it out-of-process. You get much better isolation from many kinds of risks, including this one, that way.
The best answer largely depends on what you're doing in the thread.
Like you said, most answers revolve around polling a shared boolean every couple of lines. Even though you may not like it, this is often the simplest scheme. If you want to make your life easier, you can write a method like ThrowIfCancelled(), which throws some kind of exception if you're done. The purists will say this is (gasp) using exceptions for control flow, but then again cancelling is exceptional imo.
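For what it's worth, the framework's CancellationToken has a ThrowIfCancellationRequested() method that behaves like the ThrowIfCancelled() idea. Below is a sketch of a linear (non-looping) job using it; LinearJob, its step letters, and the afterFirstStep hook are invented purely for illustration.

```csharp
using System;
using System.Threading;

public static class LinearJob
{
    // Hypothetical three-step job; each letter is a stand-in for real work.
    // Returns the steps that completed, followed by "!" if it was cancelled.
    public static string Run(CancellationToken token, Action afterFirstStep = null)
    {
        string log = "";
        try
        {
            token.ThrowIfCancellationRequested();
            log += "A";
            if (afterFirstStep != null) afterFirstStep();

            token.ThrowIfCancellationRequested();
            log += "B";

            token.ThrowIfCancellationRequested();
            log += "C";
        }
        catch (OperationCanceledException)
        {
            log += "!";     // single place for clean-up when cancelled
        }
        return log;
    }
}
```

The checks are one-liners between steps, and the clean-up lives in one catch block instead of being repeated after every check.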
If you're doing IO operations (like network stuff), you may want to consider doing everything using async operations.
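A rough sketch of what that can look like with async/await (this assumes C# 5 and .NET 4.5 or later). AsyncUploadDemo is a made-up name, and Task.Delay stands in for a real network call; many async I/O methods accept a CancellationToken in the same way.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncUploadDemo
{
    // Hypothetical upload loop: Task.Delay is a placeholder for an async
    // network call that accepts a CancellationToken.
    public static async Task<int> UploadAllAsync(int fileCount, CancellationToken token)
    {
        int uploaded = 0;
        try
        {
            for (int i = 0; i < fileCount; i++)
            {
                await Task.Delay(20, token);   // throws if cancelled mid-wait
                uploaded++;
            }
        }
        catch (OperationCanceledException)
        {
            // cancelled: fall through and report what finished
        }
        return uploaded;
    }
}
```

Because the wait itself observes the token, cancellation takes effect in the middle of an operation rather than only between operations.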
If you're doing a sequence of steps, you could use the IEnumerable trick to make a state machine. Example:
using System;
using System.Collections.Generic;
using System.Threading;

abstract class StateMachine : IDisposable
{
    public abstract IEnumerable<object> Main();

    public virtual void Dispose()
    {
        // ... override with free-ing code ...
    }

    volatile bool wasCancelled;

    public void Cancel()
    {
        // volatile write; swap in the locking scheme of your choice
        wasCancelled = true;
    }

    public Thread Run()
    {
        var thread = new Thread(() =>
        {
            try
            {
                if (wasCancelled) return;
                foreach (var x in Main())
                {
                    // each yield return in Main() is a cancellation point
                    if (wasCancelled) return;
                }
            }
            finally { Dispose(); }
        });
        thread.Start();
        return thread;
    }
}

class MyStateMachine : StateMachine
{
    public override IEnumerable<object> Main()
    {
        DoSomething();
        yield return null;
        DoSomethingElse();
        yield return null;
    }

    // placeholders for the real steps
    void DoSomething() { }
    void DoSomethingElse() { }
}
// then call new MyStateMachine().Run() to run.
Overengineering? It depends how many state machines you use. If you just have 1, yes. If you have 100, then maybe not. Too tricky? Well, it depends. Another bonus of this approach is that it lets you (with minor modifications) move your operation into a Timer.Tick callback and avoid threading altogether if it makes sense.
and do everything that blucz says too.
Perhaps a piece of the problem is that you have such a long method / while loop. Whether or not you are having threading issues, you should break it down into smaller processing steps. Let's suppose those steps are Alpha(), Bravo(), Charlie() and Delta().
You could then do something like this:
public void MyBigBackgroundTask()
{
    Action[] tasks = new Action[] { Alpha, Bravo, Charlie, Delta };
    int workStepSize = 0;
    while (!_shouldStop)
    {
        // run the next step, then wrap around to the first one
        tasks[workStepSize++]();
        workStepSize %= tasks.Length;
    }
}
So yes it loops endlessly, but checks if it is time to stop between each business step.
You don't have to sprinkle while loops everywhere. The outer while loop just checks if it's been told to stop and if so doesn't make another iteration...
If you have a straight "go do something and close out" thread (no loops in it) then you just check the _shouldStop boolean either before or after each major spot inside the thread. That way you know whether it should continue on or bail out.
for example:
public void DoWork() {
    RunSomeBigMethod();
    if (_shouldStop){ return; }
    RunSomeOtherBigMethod();
    if (_shouldStop){ return; }
    //....
}
Instead of adding a while loop where a loop doesn't otherwise belong, add something like if (_shouldStop) CleanupAndExit(); wherever it makes sense to do so. There's no need to check after every single operation or sprinkle the code all over with them. Instead, think of each check as a chance to exit the thread at that point and add them strategically with this in mind.
All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me
There are not a lot of ways to make code take a long time. Looping is a pretty essential programming construct. Making code take a long time without looping takes a huge number of statements. Hundreds of thousands.
Or calling some other code that is doing the looping for you. Yes, hard to make that code stop on demand. That just doesn't work.
I have a C# program that seems to get stuck at random times and then, after a random while, recovers by itself! While it is stuck, I can see the memory growing, and when it recovers, the memory usage drops back to normal. The CPU usage seems normal throughout, and no files are written or read (as designed).
The program calls an external (3rd party) DLL function to communicate with hardware, and updates the UI from the DLL's callback, which runs on a different thread. I have checked the code and found nothing suspicious apart from the following code (redacted):
private void Func(StructType para) {
    if (labelA.InvokeRequired) {
        labelA.BeginInvoke(new MethodInvoker(() => Func(para)));
        return;
    }
    if (labelB.InvokeRequired) {
        labelB.BeginInvoke(new MethodInvoker(() => Func(para)));
        return;
    }
    labelA.Text = para.A;
    labelB.Text = para.B;
}
I wonder if this is a proper way to update UI elements from another thread? If not, how should I revise it?
In fact, I invoke 6 labels and (optionally) another form. It seems to work fine most of the time but occasionally gets stuck. I can't post all the code here for obvious reasons, so I'm just trying to troubleshoot starting from the part I doubt most.
Thanks in advance for any suggestions!
You don't need to check each control individually to determine whether you need to invoke: there is only one UI thread, so that check is only useful once. Keep in mind that modifying any UI component is almost certain to cascade into a whole bunch of other reads/writes to other UI components; as a result, if you are touching any UI object, you have to assume you're touching all UI components.
With that in mind, I'd recommend you perform your invoke check once, and I recommend performing the check and invoke on the parent control of both labels.
Assuming this refers to the class that is the parent to both those labels, I would modify your code as follows:
private void Func(StructType para) {
    if (this.InvokeRequired) {
        // BeginInvoke takes a Delegate, so the lambda needs an explicit delegate type.
        this.BeginInvoke(new MethodInvoker(() => Func(para)));
        // No return statement needed: the else branch is skipped on this path.
    }
    else {
        labelA.Text = para.A;
        labelB.Text = para.B;
    }
}
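Be aware that InvokeRequired returns false if no window handle can be found for the control (for example, because the handle has not been created yet or the object is disposed), even if the calling thread is not the UI thread.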
I imagine this may be marked as repetitious and closed, but I cannot for the life of me find a clear, concise answer to this question. All the replies and resources deal almost exclusively with Windows Forms and utilizing pre-built utility classes such as BackgroundWorker. I would very much like to understand this concept at its core, so I can apply the fundamental knowledge to other threading implementations.
A simple example of what I would like to achieve:
//timer running on a separate thread and raising events at set intervals
//incomplete, but functional, except for the cross-thread event raising
class Timer
{
    //how often the Alarm event is raised
    float _alarmInterval;
    //stopwatch to keep time
    Stopwatch _stopwatch;
    //this Thread used to repeatedly check for events to raise
    Thread _timerThread;
    //used to pause the timer
    bool _paused;
    //used to determine Alarm event raises
    float _timeOfLastAlarm = 0;
    //this is the event I want to raise on the Main Thread
    public event EventHandler Alarm;

    //Constructor
    public Timer(float alarmInterval)
    {
        _alarmInterval = alarmInterval;
        _stopwatch = new Stopwatch();
        _timerThread = new Thread(new ThreadStart(Initiate));
    }

    //toggles the Timer
    //do I need to marshal this data back and forth as well? or is the
    //_paused boolean in a shared data pool that both threads can access?
    public void Pause()
    {
        _paused = (!_paused);
    }

    //little Helper to start the Stopwatch and loop over the Main method
    void Initiate()
    {
        _stopwatch.Start();
        while (true) Main();
    }

    //checks for Alarm events
    void Main()
    {
        if (_paused && _stopwatch.IsRunning) _stopwatch.Stop();
        if (!_paused && !_stopwatch.IsRunning) _stopwatch.Start();
        if (_stopwatch.Elapsed.TotalSeconds > _timeOfLastAlarm)
        {
            _timeOfLastAlarm = _stopwatch.Elapsed.Seconds;
            RaiseAlarm();
        }
    }
}
Two questions here. Primarily, how do I get the event to the main thread to alert the interested parties of the Alarm event?
Secondarily, regarding the Pause() method, which will be called by an object running on the main thread: can I directly manipulate the Stopwatch that was created on the background thread by calling _stopwatch.Start()/_stopwatch.Stop()? If not, can the main thread adjust the _paused boolean as illustrated above, such that the background thread can then see the new value of _paused and use it?
I swear, I've done my research, but these (fundamental and critical) details have not made themselves clear to me yet.
Disclaimer: I am aware that there are classes available that will provide the exact particular functionality that I am describing in my Timer class. (In fact, I believe the class is called just that, Threading.Timer) However, my question is not an attempt to help me implement the Timer class itself, rather understand how to execute the concepts that drive it.
Note: I'm writing this here because there's not enough space in the comments; this is of course not a complete, nor even half a complete, answer:
I've always used Events to signal unrelated code to do something, so that was how I described my intent. Forgive me though, I'm not sure I see the difference between marshaling and event versus marshaling another type of data (signal).
Conceptually both can be treated as events. The difference between using the provided sync/signaling objects and trying to implement something like this by yourself is who gets the job done, and how.
An event in .NET is just a delegate, a list of pointers to methods that should be executed when the provider of the event fires it.
What you're talking about (marshaling the event), if I understand you correctly, is sharing the event object when something happens, while the concept of signaling usually involves an object which is shared to start with, and both threads "know" something happened by checking its state either manually or automatically (relying on tools provided by both .NET and Windows).
In the most basic scenario, you can implement such a signaling concept by using a boolean variable, with one thread constantly looping to check whether the value of the boolean is true, and another setting it to true as a way to signal that something happened. The different signaling tools provided by .NET do this in a less resource-wasting manner, by not executing the waiting thread at all as long as there's no signal (the boolean equals false), but conceptually, it is the same idea.
You cannot magically execute code on an existing thread.
Instead, you need the existing thread to explicitly execute your code, using a thread-safe data structure to tell it what to do.
This is how Control.Invoke works (which is in turn how BackgroundWorker works).
WinForms runs a message loop in Application.Run() which looks roughly like this:
while(true) {
    var message = GetMessage(); //Windows API call
    ProcessMessage(message);
}
Control.Invoke() sends a Windows message (using thread-safe message passing code within Windows) telling it to run your delegate. ProcessMessage (which executes on the UI thread) will catch that message and execute the delegate.
If you want to do this yourself, you will need to write your own loop. You can use the new thread-safe Producer-Consumer collections in .NET 4.0 for this, or you can use a delegate field (with Interlocked.CompareExchange) and an AutoResetEvent and do it yourself.
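Here is one way such a loop could look, as a sketch using BlockingCollection<T> from System.Collections.Concurrent; WorkQueueThread and its members are hypothetical names, not an existing API. One thread owns the queue and executes whatever delegates other threads post to it, which is the same shape as the message loop above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical minimal "message loop": one thread owns the queue and runs
// whatever delegates other threads post to it.
public sealed class WorkQueueThread : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public WorkQueueThread()
    {
        _thread = new Thread(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding()
            // once the remaining items have been drained.
            foreach (Action action in _queue.GetConsumingEnumerable())
                action();
        });
        _thread.Start();
    }

    public int ThreadId { get { return _thread.ManagedThreadId; } }

    // Thread-safe: may be called from any thread (compare Control.BeginInvoke).
    public void Post(Action action)
    {
        _queue.Add(action);
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
    }
}
```

A timer thread like the one in the question could call Post(() => RaiseAlarm()) so that the handlers run on the thread that owns the queue rather than on the timer thread.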
I understand Thread.Abort() is evil from the multitude of articles I've read on the topic, so I'm currently in the process of ripping out all of my abort's in order to replace it for a cleaner way; and after comparing user strategies from people here on stackoverflow and then after reading "How to: Create and Terminate Threads (C# Programming Guide)" from MSDN both which state an approach very much the same -- which is to use a volatile bool approach checking strategy, which is nice, but I still have a few questions....
Immediately what stands out to me here, is what if you do not have a simple worker process which is just running a loop of crunching code? For instance for me, my process is a background file uploader process, I do in fact loop through each file, so that's something, and sure I could add my while (!_shouldStop) at the top which covers me every loop iteration, but I have many more business processes which occur before it hits it's next loop iteration, I want this cancel procedure to be snappy; don't tell me I need to sprinkle these while loops every 4-5 lines down throughout my entire worker function?!
I really hope there is a better way, could somebody please advise me on if this is in fact, the correct [and only?] approach to do this, or strategies they have used in the past to achieve what I am after.
Thanks gang.
Further reading: All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me. What if it is a linear, but timely background operation?
Unfortunately there may not be a better option. It really depends on your specific scenario. The idea is to stop the thread gracefully at safe points. That is the crux of the reason why Thread.Abort is not good; because it is not guaranteed to occur at safe points. By sprinkling the code with a stopping mechanism you are effectively manually defining the safe points. This is called cooperative cancellation. There are basically 4 broad mechanisms for doing this. You can choose the one that best fits your situation.
Poll a stopping flag
You have already mentioned this method. This a pretty common one. Make periodic checks of the flag at safe points in your algorithm and bail out when it gets signalled. The standard approach is to mark the variable volatile. If that is not possible or inconvenient then you can use a lock. Remember, you cannot mark a local variable as volatile so if a lambda expression captures it through a closure, for example, then you would have to resort to a different method for creating the memory barrier that is required. There is not a whole lot else that needs to be said for this method.
Use the new cancellation mechanisms in the TPL
This is similar to polling a stopping flag except that it uses the new cancellation data structures in the TPL. It is still based on cooperative cancellation patterns. You need to get a CancellationToken and the periodically check IsCancellationRequested. To request cancellation you would call Cancel on the CancellationTokenSource that originally provided the token. There is a lot you can do with the new cancellation mechanisms. You can read more about here.
Use wait handles
This method can be useful if your worker thread requires waiting on an specific interval or for a signal during its normal operation. You can Set a ManualResetEvent, for example, to let the thread know it is time to stop. You can test the event using the WaitOne function which returns a bool indicating whether the event was signalled. The WaitOne takes a parameter that specifies how much time to wait for the call to return if the event was not signaled in that amount of time. You can use this technique in place of Thread.Sleep and get the stopping indication at the same time. It is also useful if there are other WaitHandle instances that the thread may have to wait on. You can call WaitHandle.WaitAny to wait on any event (including the stop event) all in one call. Using an event can be better than calling Thread.Interrupt since you have more control over of the flow of the program (Thread.Interrupt throws an exception so you would have to strategically place the try-catch blocks to perform any necessary cleanup).
Specialized scenarios
There are several one-off scenarios that have very specialized stopping mechanisms. It is definitely outside the scope of this answer to enumerate them all (never mind that it would be nearly impossible). A good example of what I mean here is the Socket class. If the thread is blocked on a call to Send or Receive then calling Close will interrupt the socket on whatever blocking call it was in effectively unblocking it. I am sure there are several other areas in the BCL where similiar techniques can be used to unblock a thread.
Interrupt the thread via Thread.Interrupt
The advantage here is that it is simple and you do not have to focus on sprinkling your code with anything really. The disadvantage is that you have little control over where the safe points are in your algorithm. The reason is because Thread.Interrupt works by injecting an exception inside one of the canned BCL blocking calls. These include Thread.Sleep, WaitHandle.WaitOne, Thread.Join, etc. So you have to be wise about where you place them. However, most the time the algorithm dictates where they go and that is usually fine anyway especially if your algorithm spends most of its time in one of these blocking calls. If you algorithm does not use one of the blocking calls in the BCL then this method will not work for you. The theory here is that the ThreadInterruptException is only generated from .NET waiting call so it is likely at a safe point. At the very least you know that the thread cannot be in unmanaged code or bail out of a critical section leaving a dangling lock in an acquired state. Despite this being less invasive than Thread.Abort I still discourage its use because it is not obvious which calls respond to it and many developers will be unfamiliar with its nuances.
Well, unfortunately in multithreading you often have to compromise "snappiness" for cleanliness... you can exit a thread immediately if you Interrupt it, but it won't be very clean. So no, you don't have to sprinkle the _shouldStop checks every 4-5 lines, but if you do interrupt your thread then you should handle the exception and exit out of the loop in a clean manner.
Update
Even if it's not a looping thread (i.e. perhaps it's a thread that performs some long-running asynchronous operation or some type of block for input operation), you can Interrupt it, but you should still catch the ThreadInterruptedException and exit the thread cleanly. I think that the examples you've been reading are very appropriate.
Update 2.0
Yes I have an example... I'll just show you an example based on the link you referenced:
using System.Threading;

public class InterruptExample
{
    private Thread t;
    private volatile bool alive;

    public InterruptExample()
    {
        alive = false;
        t = new Thread(() =>
        {
            try
            {
                while (alive)
                {
                    /* Do work. The interrupt is only delivered when the
                       thread blocks (Sleep, Join, WaitOne, ...). */
                }
            }
            catch (ThreadInterruptedException)
            {
                /* Clean up. */
            }
        });
        t.IsBackground = true;
    }

    public void Start()
    {
        alive = true;
        t.Start();
    }

    public void Kill(int timeout = 0)
    {
        // somebody tells you to stop the thread
        alive = false; // lets the loop end even if the work never blocks
        t.Interrupt();
        // Optionally you can block the caller
        // by making them wait until the thread exits.
        // If they leave the default timeout,
        // then they will not wait at all
        t.Join(timeout);
    }
}
If cancellation is a requirement of the thing you're building, then it should be treated with as much respect as the rest of your code--it may be something you have to design for.
Let's assume that your thread is doing one of two things at all times.
Something CPU bound
Waiting for the kernel
If you're CPU bound in the thread in question, you probably have a good spot to insert the bail-out check. If you're calling into someone else's code to do some long-running CPU-bound task, then you might need to fix the external code, move it out of process (aborting threads is evil, but aborting processes is well-defined and safe), etc.
If you're waiting for the kernel, then there's probably a handle (or fd, or mach port, ...) involved in the wait. Usually if you destroy the relevant handle, the kernel will return with some failure code immediately. If you're in .net/java/etc. you'll likely end up with an exception. In C, whatever code you already have in place to handle system call failures will propagate the error up to a meaningful part of your app. Either way, you break out of the low-level place fairly cleanly and in a very timely manner without needing new code sprinkled everywhere.
A tactic I often use with this kind of code is to keep track of a list of handles that need to be closed and then have my abort function set a "cancelled" flag and then close them. When the function fails it can check the flag and report failure due to cancellation rather than due to whatever the specific exception/errno was.
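A minimal sketch of that tactic, assuming the handles can be represented as IDisposable objects. The class CancellableOperation and its Track/Run/Cancel members are hypothetical names for illustration, not an existing API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers which handles to close when the work is aborted.
public sealed class CancellableOperation
{
    private readonly object gate = new object();
    private readonly List<IDisposable> handles = new List<IDisposable>();
    private bool cancelled;

    public bool IsCancelled
    {
        get { lock (gate) { return cancelled; } }
    }

    // Register a handle (socket, stream, ...) that Cancel() should close.
    public T Track<T>(T handle) where T : IDisposable
    {
        lock (gate)
        {
            if (cancelled)
            {
                handle.Dispose();
                throw new OperationCanceledException();
            }
            handles.Add(handle);
        }
        return handle;
    }

    // Set the flag first, then close everything so blocked calls fail quickly.
    public void Cancel()
    {
        List<IDisposable> toClose;
        lock (gate)
        {
            cancelled = true;
            toClose = new List<IDisposable>(handles);
            handles.Clear();
        }
        foreach (var h in toClose) h.Dispose();
    }

    // Runs the work; any failure after Cancel() is reported as a cancellation
    // instead of as whatever low-level exception the closed handle produced.
    public void Run(Action work)
    {
        try { work(); }
        catch (Exception) when (IsCancelled)
        {
            throw new OperationCanceledException();
        }
    }
}
```

The worker would wrap its body in Run and pass every socket or stream it opens through Track, while the thread that wants to abort simply calls Cancel.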
You seem to be implying that an acceptable granularity for cancellation is at the level of a service call. This is probably not good thinking--you are much better off cancelling the background work synchronously and joining the old background thread from the foreground thread. It's way cleaner because:
It avoids a class of race conditions when old bgwork threads come back to life after unexpected delays.
It avoids potential hidden thread/memory leaks caused by hanging background processes by making it impossible for the effects of a hanging background thread to hide.
There are two reasons to be scared of this approach:
You don't think you can abort your own code in a timely fashion. If cancellation is a requirement of your app, the decision you really need to make is a resource/business decision: do a hack, or fix your problem cleanly.
You don't trust some code you're calling because it's out of your control. If you really don't trust it, consider moving it out-of-process. You get much better isolation from many kinds of risks, including this one, that way.
The best answer largely depends on what you're doing in the thread.
Like you said, most answers revolve around polling a shared boolean every couple of lines. Even though you may not like it, this is often the simplest scheme. If you want to make your life easier, you can write a method like ThrowIfCancelled(), which throws some kind of exception if the work has been cancelled. The purists will say this is (gasp) using exceptions for control flow, but then again cancelling is exceptional, IMO.
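A sketch of what such a ThrowIfCancelled() helper could look like. The class name CancellableWorker and the three step methods are placeholders; the exception type is my choice, and .NET's CancellationToken.ThrowIfCancellationRequested() is a built-in equivalent if you are on a version that has it.

```csharp
using System;
using System.Threading;

public class CancellableWorker
{
    private volatile bool cancelled;

    public void Cancel() { cancelled = true; }

    // One short call replaces "if (cancelled) { cleanup; return; }" at each check point.
    private void ThrowIfCancelled()
    {
        if (cancelled) throw new OperationCanceledException();
    }

    // Returns true if every step ran, false if the work was cancelled part-way.
    public bool DoWork()
    {
        try
        {
            StepOne();
            ThrowIfCancelled();
            StepTwo();
            ThrowIfCancelled();
            StepThree();
            return true;
        }
        catch (OperationCanceledException)
        {
            // Single place for clean-up, however deep the check that fired.
            return false;
        }
    }

    // Placeholder steps standing in for real work.
    private void StepOne() { Thread.Sleep(50); }
    private void StepTwo() { Thread.Sleep(50); }
    private void StepThree() { Thread.Sleep(50); }
}
```

The benefit over a plain flag check is that the exception unwinds through nested calls, so helper methods several levels down can check too without every caller having to test a return value.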
If you're doing IO operations (like network stuff), you may want to consider doing everything using async operations.
If you're doing a sequence of steps, you could use the IEnumerable trick to make a state machine. Example:
using System;
using System.Collections.Generic;
using System.Threading;

abstract class StateMachine : IDisposable
{
    // Each "yield return" in Main() marks a point where cancellation is checked.
    public abstract IEnumerable<object> Main();

    public virtual void Dispose()
    {
        // ... override with freeing code ...
    }

    // volatile stands in for the "locking scheme of choice" here
    volatile bool wasCancelled;

    public void Cancel()
    {
        wasCancelled = true;
    }

    public Thread Run()
    {
        var thread = new Thread(() =>
        {
            try
            {
                if (wasCancelled) return;
                foreach (var x in Main())
                {
                    if (wasCancelled) return;
                }
            }
            finally { Dispose(); }
        });
        thread.Start();
        return thread;
    }
}

class MyStateMachine : StateMachine
{
    public override IEnumerable<object> Main()
    {
        DoSomething();
        yield return null;
        DoSomethingElse();
        yield return null;
    }

    void DoSomething() { /* first step */ }
    void DoSomethingElse() { /* second step */ }
}

// then call new MyStateMachine().Run() to run.
Overengineering? It depends how many state machines you use. If you just have 1, yes. If you have 100, then maybe not. Too tricky? Well, it depends. Another bonus of this approach is that it lets you (with minor modifications) move your operation into a Timer.Tick callback and avoid threading altogether if it makes sense.
and do everything that blucz says too.
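One way the timer variant of the state machine could look, as a sketch only: the iterator is advanced one step per tick instead of on its own thread. SteppedOperation and SteppedDemo are made-up names, and the code assumes that Step() and Cancel() are always called from the same thread (as they would be with a UI timer), which is why there is no locking.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical single-threaded variant: one step of the iterator per timer tick.
public sealed class SteppedOperation : IDisposable
{
    private readonly IEnumerator<object> steps;
    private bool finished;

    public SteppedOperation(IEnumerable<object> main)
    {
        steps = main.GetEnumerator();
    }

    // Cancelling just disposes the iterator; later ticks become no-ops.
    public void Cancel() { Dispose(); }

    // Call once per tick. Returns false once the work is done or cancelled.
    public bool Step()
    {
        if (finished) return false;
        if (!steps.MoveNext()) { Dispose(); return false; }
        return true;
    }

    public void Dispose()
    {
        if (finished) return;
        finished = true;
        steps.Dispose(); // runs any pending finally blocks inside the iterator
    }
}

public static class SteppedDemo
{
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<object> Work()
    {
        Log.Add("step 1");
        yield return null;
        Log.Add("step 2");
        yield return null;
        Log.Add("step 3");
    }

    public static void Main()
    {
        var op = new SteppedOperation(Work());
        op.Step();   // first tick runs up to the first yield
        op.Cancel(); // a newer request arrived
        op.Step();   // ignored: the operation was cancelled
        Console.WriteLine(string.Join(", ", Log)); // prints "step 1"
    }
}
```

In a real application the tick handler would call Step() and stop the timer when it returns false.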
Perhaps a piece of the problem is that you have such a long method / while loop. Whether or not you are having threading issues, you should break it down into smaller processing steps. Let's suppose those steps are Alpha(), Bravo(), Charlie() and Delta().
You could then do something like this:
public void MyBigBackgroundTask()
{
    Action[] tasks = new Action[] { Alpha, Bravo, Charlie, Delta };
    int workStepSize = 0; // index of the next step to run
    while (!_shouldStop)
    {
        tasks[workStepSize++]();
        workStepSize %= tasks.Length;
    }
}
So yes it loops endlessly, but checks if it is time to stop between each business step.
You don't have to sprinkle while loops everywhere. The outer while loop just checks if it's been told to stop and if so doesn't make another iteration...
If you have a straight "go do something and close out" thread (no loops in it) then you just check the _shouldStop boolean either before or after each major spot inside the thread. That way you know whether it should continue on or bail out.
for example:
public void DoWork() {
    RunSomeBigMethod();
    if (_shouldStop){ return; }
    RunSomeOtherBigMethod();
    if (_shouldStop){ return; }
    //....
}
Instead of adding a while loop where a loop doesn't otherwise belong, add something like if (_shouldStop) CleanupAndExit(); wherever it makes sense to do so. There's no need to check after every single operation or sprinkle the code all over with them. Instead, think of each check as a chance to exit the thread at that point and add them strategically with this in mind.
All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me
There are not a lot of ways to make code take a long time. Looping is a pretty essential programming construct. Making code take a long time without looping takes a huge number of statements. Hundreds of thousands.
Or calling some other code that is doing the looping for you. Yes, hard to make that code stop on demand. That just doesn't work.
I have a service responsible for many tasks, one of which is to launch jobs (one at a time) on a separate thread (threadJob child), these jobs can take a fair amount of time and
have various phases to them which I need to report back.
Every so often a calling application requests the status from the service (GetStatus), this means that somehow the service needs to know at what point the job (child thread) is
at, my hope was that at some milestones the child thread could somehow inform (SetStatus) the parent thread (service) of its status and the service could return that information
to the calling application.
For example - I was looking to do something like this:
class Service
{
    private Thread threadJob;
    private int JOB_STATUS;

    public Service()
    {
        JOB_STATUS = "IDLE";
    }

    public void RunTask()
    {
        threadJob = new Thread(new ThreadStart(PerformWork));
        threadJob.IsBackground = true;
        threadJob.Start();
    }

    public void PerformWork()
    {
        SetStatus("STARTING");
        // do some work //
        SetStatus("PHASE I");
        // do some work //
        SetStatus("PHASE II");
        // do some work //
        SetStatus("PHASE III");
        // do some work //
        SetStatus("FINISHED");
    }

    private void SetStatus(int status)
    {
        JOB_STATUS = status;
    }

    public string GetStatus()
    {
        return JOB_STATUS;
    }
};
So, when a job needs to be performed RunTask() is called and this launches the thread (threadJob). This will run and perform some steps (using SetStatus to set the new status at
various points) and finally finish. Now, there is also function GetStatus() which should return the STATUS whenever requested (from a calling application using IPC) - this status
should reflect the current status of the job running by threadJob.
So, my problem is simple enough...
How can threadJob (or more specifically PerformWork()) return to Service the change in status in a thread-safe manner (I assume my example above of SetStatus/GetStatus is
unsafe)? Do I need to use events? I assume I cannot simply change JOB_STATUS directly ... Should I use a LOCK (if so on what?)...
You may have already looked into this, but the BackgroundWorker class gives you a nice interface for running tasks on background threads, and provides events to hook into for notifications that progress has changed.
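A rough sketch of how BackgroundWorker could carry the phase names from the question. The class name BackgroundWorkerDemo and the percentage values are invented for the example. Note that in a console program (no UI synchronization context) the ProgressChanged handlers run on thread-pool threads and may not arrive in order, whereas in a WinForms or WPF app they are raised on the UI thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class BackgroundWorkerDemo
{
    // Returns true if the worker signalled completion within the timeout.
    public static bool Run()
    {
        var done = new ManualResetEvent(false);
        var worker = new BackgroundWorker { WorkerReportsProgress = true };

        worker.DoWork += (sender, e) =>
        {
            var self = (BackgroundWorker)sender;
            self.ReportProgress(0, "STARTING");
            // do some work
            self.ReportProgress(50, "PHASE I");
            // do some work
            self.ReportProgress(100, "FINISHED");
        };

        // The second argument of ReportProgress arrives here as e.UserState.
        worker.ProgressChanged += (sender, e) =>
            Console.WriteLine(e.ProgressPercentage + "% " + e.UserState);

        worker.RunWorkerCompleted += (sender, e) => done.Set();

        worker.RunWorkerAsync();
        return done.WaitOne(5000);
    }

    public static void Main()
    {
        Run();
    }
}
```

The service could store the latest e.UserState in a field inside the ProgressChanged handler and return it from GetStatus.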
You could create an event in the Service class and then invoke it in a thread-safe manner. Pay very close attention to how I have implemented the SetStatus method.
class Service
{
    public delegate void JobStatusChangeHandler(string status);

    // Event add/remove auto implemented code is already thread-safe.
    public event JobStatusChangeHandler JobStatusChange;

    public void PerformWork()
    {
        SetStatus("STARTING");
        // stuff
        SetStatus("FINISHED");
    }

    private void SetStatus(string status)
    {
        JobStatusChangeHandler snapshot;
        lock (this)
        {
            // Get a snapshot of the invocation list for the event handler.
            snapshot = JobStatusChange;
        }
        // This is threadsafe because multicast delegates are immutable.
        // If you did not extract the invocation list into a local variable then
        // the event may have all delegates removed after the check for null,
        // which would result in a NullReferenceException when you attempt to
        // invoke it.
        if (snapshot != null)
        {
            snapshot(status);
        }
    }
}
I'd have the child thread raise a 'statusupdate' event, passing a struct with the information necessary for the parent and have the parent subscribe to it when launching it.
You can use the Event-Based Async Pattern.
I would go with a delegate/event from the thread to the caller. If the caller were a UI or something along those lines, I would be nice to the message pump and use the appropriate Invoke() calls to serialize notifications onto the UI thread when required.
I once wrote an app that needed a marker showing the progress a thread was making. I just used a shared global variable between them. The parent would just read the value, and the thread would just update it. No need to synchronize as only the parent read it, and only the child wrote it atomically. As it happened the parent was redrawing things frequently enough anyhow that it didn't even need to be poked by the child when the child updated the variable. Sometimes the simplest possible way works well.
Your current code mixes strings and ints for JOB_STATUS, which can't work. I'm assuming strings here, but it doesn't really matter, as I'll explain.
Your current implementation is thread safe in the sense that no memory corruption will occur, since all assignments to reference type fields are guaranteed to be atomic. The CLR demands this, otherwise you could potentially access unmanaged memory if you could somehow access partially updated references. Your processor gives you that atomicity for free, however.
So as long as you're using reference types like strings, you won't get any memory corruption. The same is true for primitives like ints (and smaller) and enums based on them. (Just avoid longs and bigger, and non-primitive value types such as nullable integers.)
But that is not the end of the story: this implementation is not guaranteed to always represent the current state. The reason for this is that the thread that calls GetStatus might be looking at a stale copy of the JOB_STATUS field, because the assignment in SetStatus contains no so-called memory barrier. That is: the new value for JOB_STATUS need not be sent to your main RAM right away. There are several reasons why this can be delayed:
Writing to main RAM is inherently slow (relatively speaking), which is the reason your processor has all kinds of buffers and L-something caches in the first place, so the processor usually delays memory synchronization. Not for very long, but it will probably delay. This can be quite noticeable on multicore processors, as these usually have separate caches per core.
The JIT might have stored the value of JOB_STATUS in a register earlier on, as part of some optimization strategy. Again, registers are far more efficient to use than your main RAM. However, this does mean that it might not see changes early enough, as it's still looking at the old copy in the register. (We're not talking minutes here, but still.)
So, if you want to be 100% certain that each thread & processor core is immediately aware of the changed status, declare your field as volatile:
private volatile int JOB_STATUS;
Now, GetStatus/SetStatus, without any locking constructs, is truly thread safe, as volatile demands that the value is read from and written to main RAM immediately (or something 100% equivalent, if the processor can do that more efficiently).
Note that if you don't declare your field as volatile you must use synchronization primitives, such as lock. Generally speaking, you need to use them in both Get and Set; otherwise you won't solve the problem that volatile fixes.
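For comparison, the lock-based alternative might look like the sketch below. StatusHolder, statusLock and jobStatus are names chosen for the example; the important part is that both the getter and the setter take the same lock.

```csharp
using System;
using System.Threading;

public class StatusHolder
{
    // A private lock object, so no outside code can lock on the same instance.
    private readonly object statusLock = new object();
    private string jobStatus = "IDLE";

    public void SetStatus(string status)
    {
        lock (statusLock) { jobStatus = status; }
    }

    public string GetStatus()
    {
        // Locking on the read side too is what guarantees a fresh value.
        lock (statusLock) { return jobStatus; }
    }
}

public static class StatusDemo
{
    public static void Main()
    {
        var holder = new StatusHolder();
        var job = new Thread(() =>
        {
            holder.SetStatus("STARTING");
            // do some work
            holder.SetStatus("FINISHED");
        });
        job.Start();
        job.Join();
        Console.WriteLine(holder.GetStatus()); // prints "FINISHED"
    }
}
```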
Mind you, as you're doing IPC calls to get the status, I'd wager that you won't ever actually be able to observe any difference between non-volatile and volatile, given the overhead of the IPC calls and the thread synchronizations undoubtedly performed behind the scenes.
For more information on volatile, see volatile (C#) on MSDN.