Threads and garbage collection


I have a Windows service which runs continuously and creates some threads to do some work. I want to make sure that these threads are properly disposed of (garbage collected) after they are finished.
However, I also want to be able to check to see if they are alive periodically and terminate them if they are. I know I can't keep any references to them, though, because then they wouldn't be garbage collected.
Is there an alternative way to check for the existence/state of user-defined threads? I was thinking maybe something like the following using WeakReference: (I can't fully test right now or I'd just test it myself)
List<WeakReference> weakReferences = new List<WeakReference>();
Thread myThread = new Thread(() => Foo());
WeakReference wr = new WeakReference(myThread);
weakReferences.Add(wr); //adds a reference to the thread but still allows it to be garbage collected
myThread.Start();
myThread = null; //get rid of reference so thread can be garbage collected
and then at the beginning of my onTimeElapsed event (run every 5 minutes):
foreach (WeakReference wr in weakReferences)
{
    Thread target = wr.Target as Thread; //not sure if this cast is really possible; Target is null once the thread object has been collected
    if (target != null && target.IsAlive && otherLogic)
    {
        target.Abort();
    }
}
But I'm not sure exactly how WeakReference works. Any ideas on how to properly do this?

Is myThread a method variable? or...?
In most scenarios, the thread will simply be garbage collected when possible. There is no need to set myThread to null if myThread is a method variable, because that won't exist at the time.
I would, however, note that threads are actually pretty expensive objects (the stack alone is a pain to allocate). If possible, I would suggest either using the ThreadPool (if each item is short-lived), or a bespoke work queue (if longer), potentially with multiple workers servicing a single queue.
As for terminating/aborting a thread... that is never a good idea; you have no idea what the thread is doing at that point. After that, it is possible that your entire process is doomed. If at all possible, consider having the worker check an "abort" flag occasionally. If not possible, consider doing the work in a separate process. A process is even more expensive than a thread, but it has the advantage that it is isolated; you can kill it without impacting yourself. Of course, you could still corrupt any files it was working on, etc...
Frankly, the main time I would ever consider aborting a thread is if my process is already dying, and I'm trying to put it out of misery ASAP.
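To illustrate the "abort flag" suggestion above, here is a minimal sketch that uses a CancellationToken as the flag. The class name CooperativeWorker, the method RunUntilCancelled, and the 10 ms sleep standing in for real work are all assumptions made for the example, not anything from the question.

```csharp
using System;
using System.Threading;

public static class CooperativeWorker
{
    // Starts a worker that checks the token between units of work, requests
    // a stop after `runFor`, and waits for the worker to exit on its own.
    // Returns the number of iterations the worker completed.
    public static int RunUntilCancelled(TimeSpan runFor)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            int iterations = 0;

            var worker = new Thread(() =>
            {
                // The "abort flag" check: the worker stops only at a point
                // where it knows it is safe to do so.
                while (!token.IsCancellationRequested)
                {
                    Thread.Sleep(10); // hypothetical stand-in for one unit of work
                    iterations++;
                }
            });
            worker.IsBackground = true;
            worker.Start();

            Thread.Sleep(runFor);
            cts.Cancel();   // ask the worker to stop; nothing is torn down forcibly
            worker.Join();  // wait until it has actually finished
            return iterations;
        }
    }
}
```

The point of the design is that the worker, not the caller, picks the moment it stops, so it never dies halfway through an update the way an aborted thread can.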

Use the thread pool. Don't spawn threads yourself and don't reinvent the wheel.

Related

How InvokeRequired and Invoke let us make app thread safe

How InvokeRequired and Invoke let us make our apps thread safe.
Let's consider such code:
private void ThreadSafeUpdate(string message)
{
if (this.textBoxSome.InvokeRequired)
{
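// SetTextCallback is assumed to be declared as: delegate void SetTextCallback(string message);
SetTextCallback d = new SetTextCallback(ThreadSafeUpdate);
this.Invoke(d, new object[] { message });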
}
else
{
// It's on the same thread, no need for Invoke
this.textBoxSome.Text = message;
}
}
Is it possible to change state of InvokeRequired after InvokeRequired and before Invoke? If not, then why?
How does Invoking make it thread safe?
If InvokeRequired indicates whether the current thread owns the control, how does the thread know whether it is the owner?
Let's consider that SomeMethod() is currently running on Thread1. We would like to call it from Thread2. Internally this method updates some field. Does Method.Invoke contain some kind of lock mechanism internally?
What if SomeMethod() takes a very long time and we would like to run something else on the control's owner thread? Does Invoke lock the owner thread, or is it some kind of thread-safe background task?
ThreadSafeUpdate() //takes 5 minutes in Thread2
ThreadSafeUpdate() //after 2 minutes, we are running it in other thread2
ThreadSafeUpdate() //next run from Thread3
I think it is some kind of general pattern which can be implemented outside of winforms, what's its name?

Is it possible to change state of InvokeRequired
Yes, and it is a pretty common occurrence. Either because you started the thread too soon, before the form's Load event fired. Or because the user closed the window just as this code is running. In both cases this code fails with an exception. InvokeRequired fails when the thread races ahead of the window creation, the invoked code fails when the UI thread races ahead of the thread. The odds for an exception are low, too low to ever diagnose the bug when you test the code.
How Invoking make it thread safe?
You cannot make it safe with this code, it is a fundamental race. It must be made safe by interlocking the closing of the window with the thread execution. You must make sure that the thread stopped before allowing the window to close. The subject of this answer.
how would he know that he is or he is not owner.
This is something that can be discovered with a winapi call, GetWindowThreadProcessId(). The Handle property is the fundamental oracle for that. Pretty decent test, but with the obvious flaw that it cannot work when the Handle is no longer valid. Using an oracle in general is unwise, you should always know when code runs on a worker thread. Such code is very fundamentally different from code that runs on the UI thread. It is slow code.
We would like to call it from Thread2
This is not in general possible. Marshaling a call from one thread to a specific other thread requires that other thread to co-operate. It must solve the producer-consumer problem. Take a look at the link, the fundamental solution to that problem is a dispatcher loop. You probably recognize it, that's how the UI thread of a program operates. Which must solve this problem, it gets notifications from arbitrary other threads and UI is never thread-safe. But worker threads in general don't try to solve this problem themselves, unless you write it explicitly, you need a thread-safe Queue and a loop that empties it.
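The "thread-safe Queue and a loop that empties it" can be sketched roughly like this. SimpleDispatcher, Post and Send are hypothetical names chosen for the example; BlockingCollection<T> is the only library piece it relies on. This is an illustration of the idea, not how Winforms implements its message loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// A minimal dispatcher: one owner thread drains a thread-safe queue of actions.
public sealed class SimpleDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _owner;

    public SimpleDispatcher()
    {
        _owner = new Thread(() =>
        {
            // The dispatcher loop: blocks until work arrives and ends
            // once CompleteAdding has been called and the queue is empty.
            foreach (Action action in _queue.GetConsumingEnumerable())
            {
                action();
            }
        });
        _owner.IsBackground = true;
        _owner.Start();
    }

    public int OwnerThreadId
    {
        get { return _owner.ManagedThreadId; }
    }

    // Fire-and-forget, loosely comparable to Control.BeginInvoke.
    public void Post(Action action)
    {
        _queue.Add(action);
    }

    // Blocks the caller until the owner thread has run the action,
    // loosely comparable to Control.Invoke. Must not be called from
    // the owner thread itself, or it would wait on itself forever.
    public void Send(Action action)
    {
        using (var done = new ManualResetEventSlim())
        {
            _queue.Add(() =>
            {
                try { action(); }
                finally { done.Set(); }
            });
            done.Wait();
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // lets the loop finish the remaining items
        _owner.Join();
        _queue.Dispose();
    }
}
```

Any thread can call Post or Send, but every action runs on the single owner thread, which is the cooperation the answer describes. The sketch does not handle exceptions thrown by an action; a real implementation would have to decide how to report them.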
What's if SomeMethod() takes very long time
Not sure I follow, the point of using threads is to let code that takes a long time not do anything to harm the responsiveness of the user interface.
I think it is some kind of general pattern
There is, it doesn't look like this. This kind of code tends to be written when you have an oh-shoot moment and discover that your UI is freezing. Bolting threading on top of code that was never designed to support threading is forever a bad idea. You'll overlook too many nasty little details. Very important to minimize the number of times the worker thread interacts with the UI thread, your code is doing the opposite. Fall in the pit of success with the BackgroundWorker class, its RunWorkerCompleted event gives a good synchronized way to update UI with the result of the background operation. And if you like Tasks then the TaskScheduler.FromCurrentSynchronizationContext() method helps you localize the interactions.

Usually, no. But it could happen if you're using await between the InvokeRequired check and Invoke call without capturing the execution context. Of course, if you're already using await, you're probably not going to be using Invoke and InvokeRequired.
EDIT: I just noticed that InvokeRequired will return false when the control handle hasn't been created yet. It shouldn't make much of a difference, because your call will fail anyway when the control hasn't quite been created yet, but it is something to keep in mind.
It doesn't make it thread-safe. It just adds the request to the control's queue, so that it's executed the next available time on the same thread the control was created on. This has more to do with windows architecture than with general thread-safety. The end result, however, is that the code runs on a single thread - of course, this still means you need to handle shared state synchronization manually, if any.
Well, it's complicated. But in the end, it boils down to comparing the thread ID of the thread that created the control, and the current thread ID. Internally, this calls the native method GetWindowThreadProcessId - the operating system keeps track of the controls (and more importantly, their message loops).
Invoke cannot return until the GUI thread returns to its message loop. Invoke itself only posts the command to the queue and waits for it to be processed. But the command is run on the GUI thread, not the Invoke-caller. So the SomeMethod calls in your example will be serialized, and the Invoke call itself will wait until the second call finishes.
This should already be answered. The key point is "only run GUI code on the GUI thread". That's how you get reliable and responsive GUI at all times.
You can use it anywhere you've got a loop or a wait on some queue. It probably isn't all that useful, although I have actually used it already a few times (mostly in legacy code).
However, all of this is just a simple explanation of the workings. The truth is, you shouldn't really need InvokeRequired... well, ever. It's an artifact of a different age. This is really mostly about juggling threads with little order, which isn't exactly a good practice. The uses I've seen are either lazy coding, or hotfixes for legacy code - using this in new code is silly. The argument for using InvokeRequired is usually like "it allows us to handle this business logic safely whether it runs in the GUI thread or not". Hopefully, you can see the problem with that logic :)
Also, it's not free thread-safety. It does introduce delays (especially when the GUI thread is also doing some work that isn't GUI - very likely in code that uses InvokeRequired in the first place). It does not protect you from accesses to the shared state from other threads. It can introduce deadlocks. And don't even get me started on doing anything with code that uses Application.DoEvents.
And of course, it's even less useful once you take await into consideration - writing asynchronous code is vastly easier, and it allows you to make sure the GUI code always runs in the GUI context, and the rest can run wherever you want (if it uses a thread at all).

Most efficient way to use and communicate with many threads

I am coding an application that runs many threads in the background which have to report back to the main thread so it can update a table in the interface. In the past, the worker threads were ordinary separate classes (named Citizen) which I ran from the main thread using something like
new Thread(new ThreadStart(citizen.ProcessActions)).Start();
where the ProcessActions function was the main function which did all the background work. Before actually starting the thread, I would register event handlers so the Citizen threads could log/report some stuff to the interface. Usually, there are tens of these Citizen threads (around 50) and they're pretty big classes - each has its own HTTP client and it browses the web.
Is this a good way to manage threads? Probably not, to be frank; I'm pretty sure the threads aren't gracefully exiting - once the ProcessActions function is done, I remove the event handlers and that's it - the memory usage keeps rising with each new Citizen started.
What would be the best way to manage many (50+) threads with which you have to communicate often? I believe I wouldn't have to worry much about thread safety for Citizen variables, as I wouldn't be accessing them from any thread but their own.

I think what you're looking for is a thread pool. Here's an MSDN article on them; the ThreadPool class is part of the .NET Framework, so it is available in .NET 4.0.
The idea would be to create a thread pool, set its count to some high number(say 50), and then start assigning threads to tasks. If the pool needs to expand, it can, but by declaring a high number up front, you get all the expensive creation of threads out of the way.
It might be beneficial to 'queue' tasks that you want to get done, and assign those tasks as threads become available.
Also, memory leaks can be hard to find, but I would start by testing the simple case: Take out all threads(just run one Citizen after another from the main thread) and let it run for a long time. If it's still leaking memory, your thread management isn't the issue.
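As a rough sketch of the "queue tasks and hand them out as threads become available" idea, here is one way to do it with Task.Run and a SemaphoreSlim. It uses the async/await syntax from later C# versions, and CitizenRunner, RunAllAsync and the `work` delegate are made-up names for the example, with an int result standing in for whatever a Citizen would report.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CitizenRunner
{
    // Runs `work` once per id on the thread pool, with at most
    // `maxConcurrency` items running at the same time.
    // Results are returned in id order.
    public static async Task<int[]> RunAllAsync(int count, int maxConcurrency, Func<int, int> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            Task<int>[] tasks = Enumerable.Range(0, count).Select(async id =>
            {
                await gate.WaitAsync(); // queue here until a slot is free
                try
                {
                    // A pool thread does the actual work.
                    return await Task.Run(() => work(id));
                }
                finally
                {
                    gate.Release();
                }
            }).ToArray();

            return await Task.WhenAll(tasks);
        }
    }
}
```

A call such as `CitizenRunner.RunAllAsync(50, 8, id => id * 2)` would queue 50 items but run no more than 8 at once, so the pool stays small and no thread objects need tracking by hand.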

Resource usage of ThreadPool RegisterWaitForSingleObject

I am writing a server application which processes requests from multiple clients. For the processing of requests I am using the threadpool.
Some of these requests modify a database record, and I want to restrict the access to that specific record to one threadpool thread at a time. For this I am using named semaphores (other processes are also accessing these records).
For each new request that wants to modify a record, the thread should wait in line for its turn.
And this is where the question comes in:
As I don't want the threadpool to fill up with threads waiting for access to a record, I found the RegisterWaitForSingleObject method in the threadpool.
But when I read the documentation (MSDN) under the section Remarks:
New wait threads are created automatically when required. ...
Does this mean that the threadpool will fill up with wait-threads? And how does this affect the performance of the threadpool?
Any other suggestions to boost performance are more than welcome!
Thanks!

Your solution is a viable option. In the absence of more specific details I do not think I can offer other tangible options. However, let me try to illustrate why I think your current solution is, at the very least, based on sound theory.
Let's say you have 64 requests that came in simultaneously. It is reasonable to assume that the thread pool could dispatch each one of those requests to a thread immediately. So you might have 64 threads that immediately begin processing. Now let's assume that the mutex has already been acquired by another thread and it is held for a really long time. That means those 64 threads will be blocked for a long time waiting for the thread that currently owns the mutex to release it. That means those 64 threads are wasted on doing nothing.
On the other hand, if you choose to use RegisterWaitForSingleObject as opposed to using a blocking call to wait for the mutex to be released, then you can immediately release those 64 waiting threads (work items) and allow them to be put back into the pool. If I were to implement my own version of RegisterWaitForSingleObject, then I would use the WaitHandle.WaitAny method, which allows me to specify up to 64 handles (I did not randomly choose 64 for the number of requests after all) in a single blocking method call. I am not saying it would be easy, but I could replace my 64 waiting threads with a single thread from the pool. I do not know how Microsoft implemented the RegisterWaitForSingleObject method, but I am guessing they did it in a manner that is at least as efficient as my strategy. To put this another way, you should be able to reduce the number of pending work items in the thread pool by at least a factor of 64 by using RegisterWaitForSingleObject.
So you see, your solution is based on sound theory. I am not saying that your solution is optimal, but I do believe your concern is unwarranted in regards to the specific question asked.
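For reference, a minimal sketch of what registering a wait looks like. WaitRegistrationDemo and RunOnce are invented for the example, and a local AutoResetEvent stands in for the named semaphore from the question.

```csharp
using System;
using System.Threading;

public static class WaitRegistrationDemo
{
    // Registers a callback that the pool runs once `signal` is set, so no
    // worker thread sits blocked while waiting. Returns true if the callback
    // ran because of the signal (not a timeout) within five seconds.
    public static bool RunOnce()
    {
        using (var signal = new AutoResetEvent(false))
        using (var callbackDone = new ManualResetEventSlim())
        {
            bool callbackTimedOut = true;

            RegisteredWaitHandle registration = ThreadPool.RegisterWaitForSingleObject(
                signal,
                (state, timedOut) =>
                {
                    callbackTimedOut = timedOut; // false when the handle was signalled
                    callbackDone.Set();
                },
                null,              // state object passed to the callback
                Timeout.Infinite,  // wait without a timeout
                true);             // executeOnlyOnce

            signal.Set(); // simulates the record becoming available
            bool ran = callbackDone.Wait(TimeSpan.FromSeconds(5));
            registration.Unregister(null); // release the registration when done
            return ran && !callbackTimedOut;
        }
    }
}
```

One caveat worth checking against the documentation: a Mutex is owned by the thread that acquires it, and the callback runs on a different pool thread than the one that waited, so a Semaphore with a maximum count of 1 is usually the better fit for this pattern.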

IMHO you should let the database do its own synchronization. All you need to do is to ensure that you're sync'ed within your process.
Interlocked class might be a premature optimization that is too complex to implement. I would recommend using higher-level sync objects, such as ReaderWriterLockSlim. Or better yet, a Monitor.

An approach to this problem that I've used before is to have the first thread that gets one of these work items be responsible for any other ones that occur while it's processing the work item(s). This is done by queueing the work items, then dropping into a critical section to process the queue. Only the 'first' thread will drop into the critical section. If a thread can't get the critical section, it'll leave and let the thread already operating in the critical section handle the queued object.
It's really not very complicated - the only thing that might not be obvious is that when leaving the critical section, the processing thread has to do it in a way that doesn't potentially leave a late-arriving workitem on the queue. Basically, the 'processing' critical section lock has to be released while holding the queue lock. If not for this one requirement, a synchronized queue would be sufficient, and the code would really be simple!
Pseudo code:
// `WorkItem` is a (hypothetical) type that contains the database modification request
//
// `queue` is a Queue<WorkItem> that can hold these workitem requests
//
// `queue_lock` is an object used to guard access to `queue`
// (Queue<T> does not expose SyncRoot directly)
//
// `processing_lock` is an object used to provide a lock
// to indicate a thread is processing the queue
// any number of threads can call this function, but only one
// will end up processing all the workitems.
//
// The other threads will simply drop the workitem in the queue
// and leave
void threadpoolHandleDatabaseUpdateRequest(WorkItem workitem)
{
    // put the workitem on a queue
    Monitor.Enter(queue_lock);
    queue.Enqueue(workitem);
    Monitor.Exit(queue_lock);
    bool doProcessing = false;
    Monitor.TryEnter(processing_lock, ref doProcessing);
    if (!doProcessing) {
        // another thread has the processing lock, it'll
        // handle the workitem
        return;
    }
    for (;;) {
        Monitor.Enter(queue_lock);
        if (queue.Count == 0) {
            // done processing the queue
            // release locks in an order that ensures
            // a workitem won't get stranded on the queue
            Monitor.Exit(processing_lock);
            Monitor.Exit(queue_lock);
            break;
        }
        workitem = queue.Dequeue();
        Monitor.Exit(queue_lock);
        // this will get the database mutex, do the update and release
        // the database mutex
        doDatabaseModification(workitem);
    }
}

The ThreadPool creates one wait thread for roughly every 64 registered waitable objects.
Good comments are here: Thread.sleep vs Monitor.Wait vs RegisteredWaitHandle?

Safely Closing A Thread

I'm writing a bit of code that will open a MessageBox on a separate thread to prevent the MessageBox from stopping the program. It is very very important that starting a new thread will not crash the program that I am running, but I don't know enough about threads to make sure this happens.
My question is, after starting the thread, how can I safely dispose of it after the MessageBox closes? I imagine closing/disposing of it is necessary so it's not just floating around after it is created and started.
Please advise, thanks!
var Thread = new Thread
(
()=>
{
MessageBox.Show("Buy pizza, Pay with snakes");
}
);
Thread.Start();

You don't need to do anything special.
Thread instances are automatically "cleaned up" (rather, they become candidates for garbage collection) when there are no references to them (in your code) and their main method body has terminated. In fact, Thread doesn't implement IDisposable - so speaking of its "disposal" is incorrect.
In your example, once the lambda method completes (ie the message box is closed), the thread will automatically terminate. You don't need to do anything extra.
Now there's a difference between reclaiming allocated memory and having objects become candidates for disposal/collection. Any objects allocated will remain on the GC heap until the next collection cleans them up ... but you shouldn't have to care about that.
A separate issue you may need to contend with is performing UI operations on a thread other than the main UI thread. While it is possible, you have to be careful not to reference any UI elements that were created on a different thread from the one your code is running on.

The thread will close automatically after the scope of the lambda expression is left... in your case you don't need to worry about anything.
In general it's also good practice to set the thread to background, because if your application is closed you might get a message box just hanging out there by itself:
var thread = new Thread(
()=>
{
MessageBox.Show("Buy pizza, pay with snakes");
});
thread.IsBackground = true;
thread.Start();
Note: it's preferred that your variables start with a lowercase letter. For details on naming conventions please see the Microsoft Naming Guidelines.

A Thread will automatically clean itself up once the code contained within it completes. You don't have to manually dispose of it (and, in fact, it's not IDisposable!).

A few things first...
Threads don't "crash" the program unless an unhandled exception is thrown from within one.
You don't need to dispose of a thread. Finishing its main routine is enough.
If necessary, you can make your program wait for the end of the thread execution using the Join() method on your Thread instance.
And then a suggestion: it seems that you need a modeless MessageBox. AFAIK, the feasible way of doing this is creating a custom form and displaying it through Show() instead of ShowDialog().

In C#, you shouldn't have to care all that much once the thread goes out of scope. It's a simple answer, but simple is good: let the computer do what it's good at. :-)

You should be aware that if an exception is thrown by your worker thread and is not caught, then your application may abort (as Humberto mentioned in point #1). The example you provided is trivial, and I can't imagine that it would throw an exception, but you may want to consider at least wrapping the worker thread logic in a try/catch.
I would suggest not using a separate thread for this purpose. Create your own form for displaying the message and show it with the Show method. Creating a form like this isn't too difficult; I recommend making use of the Button.DialogResult, Form.AcceptButton, and Form.CancelButton properties. You also get more control over the appearance of the form.
In terms of reliability, an advantage of keeping your code out of a worker thread is that you can subscribe to the Application.ThreadException event in order to handle any exceptions that were not caught by your application's logic. This allows you to prevent your application from crashing due to an unhandled exception, but be aware that this will affect your entire application.
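If you do keep the worker thread, the try/catch advice from the answers above could look something like this sketch. GuardedThread and Run are hypothetical names; the caller passes in whatever the thread is supposed to do (showing the message box, in the question's case).

```csharp
using System;
using System.Threading;

public static class GuardedThread
{
    // Runs `body` on a new background thread and catches anything it throws,
    // so an exception on the worker cannot take the process down.
    // Returns the caught exception, or null if the body completed normally.
    public static Exception Run(Action body)
    {
        Exception caught = null;
        var thread = new Thread(() =>
        {
            try
            {
                body();
            }
            catch (Exception ex)
            {
                caught = ex; // record it instead of letting it escape the thread
            }
        });
        thread.IsBackground = true; // does not keep the process alive on exit
        thread.Start();
        thread.Join();              // waiting is optional; no disposal is needed
        return caught;
    }
}
```

The Join call is only there so the example can hand back the result; in the message box scenario you would skip it, since the whole point is not to block the caller.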

C# Managed Thread Cleanup

After my application creates a thread using a ParameterizedThreadStart delegate, that thread performs some initialization and runs to completion. Later on, I can observe that this thread is no longer active because its IsAlive property is false and its ThreadState property is ThreadState.Stopped.
Once threads reach this state, they remain in my application, still existing as thread objects until my application shuts down. Are there any steps I can take to dispose of them once they're no longer active? I would like to delete the objects and deallocate any resources so that at any given moment the only thread objects I have are active threads. Thread doesn't implement IDisposable, though, so I'm not sure how I should do this.

You're holding onto the reference to the thread in your code.
If you have written code that will check the state of the thread, then that code inherently will keep the thread object alive until the GC collects it.
Once you are finished with a thread, or ideally if you don't need to access it, make sure you null all references to it. Thread doesn't implement IDisposable because, as you've made clear, this wouldn't make sense for a thread.
Threads are native in .NET so you don't have to worry about leaks. If you're certain they will stop, then just delete them from your list once you are sure they have finished.
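A small sketch of the "delete them from your list" approach, assuming the threads are tracked in a list. ThreadRegistry, StartNew and Prune are names invented for the example.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class ThreadRegistry
{
    private readonly object _gate = new object();
    private readonly List<Thread> _threads = new List<Thread>();

    // Starts a background thread and remembers it so its state can be checked later.
    public Thread StartNew(ThreadStart body)
    {
        var thread = new Thread(body) { IsBackground = true };
        lock (_gate)
        {
            // Start before adding, under the lock, so Prune never sees
            // a thread that has not started yet (IsAlive would be false).
            thread.Start();
            _threads.Add(thread);
        }
        return thread;
    }

    // Drops the references to threads that have finished, which makes those
    // Thread objects eligible for garbage collection.
    // Returns the number of threads that are still alive.
    public int Prune()
    {
        lock (_gate)
        {
            _threads.RemoveAll(t => !t.IsAlive);
            return _threads.Count;
        }
    }
}
```

Calling Prune from a timer keeps only live threads in the list; the finished ones are reclaimed whenever the GC next runs, with no explicit disposal step.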

It sounds like you need to let go of your reference to the Thread object, so the garbage collector can discard it. Just set the reference you have to null, and let the GC do its job when it's ready.
Depending on your situation, you may wish to use a WeakReference (or my friend Cyrus' WeakReference<T>).

Is the unmanaged thread still there? Did the thread actually return from its ParameterizedThreadStart method? Also, try setting IsBackground = false.
