How to kill a specific thread from an array of threads - C#

I am creating an array of threads based on the number of records in a database. Each thread polls an IP address, sleeps for a time, and then polls again. I periodically check the database for any change in the number of hosts. If there are more hosts, I start another thread. If there are fewer hosts, I need to kill the specific thread that was monitoring the removed host. How do I kill that specific thread?
protected static void GetThreads()
{
    Thread[] threads;
    do
    {
        dt = getIP_Poll_status();
        threads = new Thread[dt.Rows.Count];
        Console.WriteLine(dt.Rows.Count + " Threads");
        for (int i = 0; i < threads.Length; ++i)
        {
            string ip = dt.Rows[i][0].ToString();
            int sleep = Convert.ToInt32(dt.Rows[i][1].ToString());
            string status = dt.Rows[i][2].ToString();
            string host = dt.Rows[i][3].ToString();
            Hosts.Add(host);
            string port = dt.Rows[i][4].ToString();
            //Console.WriteLine("starting on " + ip + " delay " + sleep + ". current status " + status);
            threads[i] = new Thread(PollingThreadStart);
            threads[i].Name = host; // name the thread before starting it
            threads[i].Start(new MyThreadParameters(ip, sleep, status, host, port));
        }
        Thread.Sleep(50000);
    }
    while (true);
}

Killing threads forcibly is a bad idea. It can leave the system in an indeterminate state.
You should set a flag (in a thread-safe way) so that the thread will terminate itself appropriately next time it checks. See my threading article for more details and sample code.
I would add that using Sleep is almost always the wrong thing to do, by the way. You should use something which allows for a graceful wake-up, such as Monitor.Wait. That way when there are changes (e.g. the polling thread should die) something can wake the thread up if it's waiting, and it can notice the change immediately.
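A minimal sketch of that flag-plus-Monitor.Wait pattern, applied to the polling loop in the question (the class and member names here are hypothetical):

```csharp
// One poller object per host; Stop() asks the thread to finish and wakes it immediately.
class HostPoller
{
    private readonly object _gate = new object();
    private bool _stopRequested;

    public void Run(string ip, int sleepMs)
    {
        while (true)
        {
            // ... poll the host at `ip` here, outside the lock ...

            lock (_gate)
            {
                if (_stopRequested) return;
                // Wait out the interval, but return early if Stop() pulses us.
                Monitor.Wait(_gate, sleepMs);
                if (_stopRequested) return;
            }
        }
    }

    public void Stop()
    {
        lock (_gate)
        {
            _stopRequested = true;
            Monitor.Pulse(_gate); // wake the thread if it is waiting
        }
    }
}
```

Keeping a Dictionary<string, HostPoller> keyed by host name then lets you call Stop() on exactly the poller whose host disappeared from the database.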

Given that most of your threads will spend the majority of their time doing nothing, your design might be better realised as a single thread that keeps a list of ip addresses and the time they're due to be polled next. Keep it sorted in order of next poll time.
Pseudocode:
1. What time does the next IP address need to be polled?
2. Sleep till then.
3. Poll the address.
4. Update the poll time for that address to now + interval.
5. Re-sort the list.
6. Repeat.
Whenever you have a DB update, update the list and then wake the thread so it re-evaluates when it next needs to run (or whether a host should stop being polled at all).
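That loop could be sketched with a single thread like this (PollAddress, interval, and the initial schedule contents are illustrative assumptions):

```csharp
// One scheduler thread replaces the whole thread array.
var interval = TimeSpan.FromSeconds(30);
var schedule = new SortedSet<(DateTime due, string ip)>();
// ... fill `schedule` with one (nextPollTime, ip) entry per host ...

while (true)
{
    var next = schedule.Min;                 // earliest due time; the set stays sorted for us
    var delay = next.due - DateTime.UtcNow;
    if (delay > TimeSpan.Zero)
        Thread.Sleep(delay);                 // Monitor.Wait here would allow early wake-up on DB changes
    PollAddress(next.ip);                    // hypothetical per-host polling method
    schedule.Remove(next);
    schedule.Add((DateTime.UtcNow + interval, next.ip)); // re-inserted in sorted order
}
```

Adding or removing a host is then just an insert or remove on `schedule` under a lock, rather than starting or killing a thread.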

You don't specify the language you are targeting, but in general you use the same method regardless of the language: simply use a shared variable to signal the thread when it is time to stop running. The thread should periodically check the value and, if it is set, stop in a graceful fashion. Typically, this is the only safe method to stop a thread.

I would say:
Don't kill threads. Ask them to die, nicely (via events or a shared flag).
Be careful when creating exactly one thread per DB entry. Unexpected DB activity that suddenly produces many rows could translate into so many threads that you overwhelm the OS. Definitely put a limit on the number of threads.

You could send an interrupt signal to the thread you want to end. The signalled thread would need to catch the ThreadInterruptedException that is raised, and exit gracefully.
There are other, possibly better, ways to achieve what you want, but they are more complicated...
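A small sketch of that interrupt approach (the worker body is illustrative). Note that Thread.Interrupt only takes effect while the target is blocked in a wait, sleep, or join, so the worker catches the exception around its sleep:

```csharp
var worker = new Thread(() =>
{
    try
    {
        while (true)
        {
            // ... poll the host here ...
            Thread.Sleep(TimeSpan.FromSeconds(30)); // interruptible sleep between polls
        }
    }
    catch (ThreadInterruptedException)
    {
        // Woken out of Sleep by Interrupt(): clean up and exit gracefully.
    }
});
worker.Start();

// Later, to end this specific thread:
worker.Interrupt();
worker.Join();
```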

What you are trying to do is a bad idea for the reasons mentioned above. However, you can still give each thread a name, which can be the host name. You can then find the thread by checking its name and kill it.

Related

Python timeout lock with option to stop

I am using Python 3.6 to sync multiple threads. I have a "master thread" that gives work to all the other threads. When a worker thread finishes its work, it signals the master thread to give it more work.
To achieve that, the master thread waits for one (or more) threads to finish before collecting new data to process.
while True:
    while freeWorkers > 0:
        # Give the worker more work...
        pass
    time.sleep(5)  # wait 5 seconds before checking again for free workers
Basically, it's working. I want to upgrade it in this way: after a worker finishes its job, it should somehow report to the "master" thread. Because the master thread is really quick, in most cases it will be sleeping; I want to make it stop sleeping, which will trigger giving more work to the free workers.
In C#, I did this trick in that way:
An object to handle the syncing around
public object SyncingClock { get; private set; } = new object();
Entering sleep in that way:
lock (SyncingClock)
    Monitor.Wait(SyncingClock, 5000);
Worker thread report completion in that way:
lock (SyncingClock)
    Monitor.Pulse(SyncingClock);
So, I am looking for a way to perform this C# trick in Python (or any other alternative).
Thanks.
I think you should look at event-driven programming (https://emptypage.jp/notes/pyevent.en.html) rather than having a while loop poll for finished workers.
For example, something like this:
def create_thread(self, work_finished_method):
    t = some_method_to_create_and_prepare_a_thread()
    t.event_finished += work_finished_method
    return t

class MyThread:
    name = "SomeNameForTheThread"
    event_finished = event.Event(name + " has finished.")

    def finished(self):
        self.event_finished()

    def do_work(self):
        do_something()
        self.finished()
and when the work_finished method is called in the main thread you can assign new work to the thread.
This is done with a Condition object.

self.condition = threading.Condition()

To wait for a timeout or a pulse:

with service.condition:
    service.condition.wait(5)

To notify:

with self.condition:
    self.condition.notify_all()
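For completeness, here is a minimal runnable sketch of that Condition pattern (the names are illustrative): the master blocks for up to 5 seconds, and a worker's notify_all wakes it immediately.

```python
import threading

condition = threading.Condition()
results = []

def worker():
    # Simulate a worker finishing a job, then pulse the master.
    with condition:
        results.append("done")
        condition.notify_all()  # wakes the master out of its timed wait

t = threading.Thread(target=worker)
with condition:
    t.start()
    # Master: wait up to 5 seconds, but wake as soon as a worker notifies.
    # wait_for releases the lock while waiting, so the worker can acquire it.
    woke = condition.wait_for(lambda: bool(results), timeout=5)
t.join()
print(results)  # ['done']
```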

Cross Process Event - Release all waiters reliably

I have created a cross process event via ManualResetEvent. When this event does occur potentially n threads in n different processes should be unblocked and start running to fetch the new data. The problem is that it seems that ManualResetEvent.Set followed by an immediate Reset does not cause all waiting threads to wake up. The docs are pretty vague there
http://msdn.microsoft.com/en-us/library/windows/desktop/ms682396(v=vs.85).aspx
When the state of a manual-reset event object is signaled, it remains
signaled until it is explicitly reset to nonsignaled by the ResetEvent
function. Any number of waiting threads, or threads that subsequently
begin wait operations for the specified event object, can be released
while the object's state is signaled.
There is a method called PulseEvent which seems to do exactly what I need but unfortunately it is also flawed.
A thread waiting on a synchronization object can be momentarily
removed from the wait state by a kernel-mode APC, and then returned to
the wait state after the APC is complete. If the call to PulseEvent
occurs during the time when the thread has been removed from the wait
state, the thread will not be released because PulseEvent releases
only those threads that are waiting at the moment it is called.
Therefore, PulseEvent is unreliable and should not be used by new
applications. Instead, use condition variables.
Now MS does recommend to use condition variables.
Condition variables are synchronization primitives that enable threads
to wait until a particular condition occurs. Condition variables are
user-mode objects that cannot be shared across processes.
Following the docs I seem to have run out of luck to do it reliably. Is there an easy way to accomplish the same thing without the stated limitations with one ManualResetEvent or do I need to create for each listener process a response event to get an ACK for each subscribed caller? In that case I would need a small shared memory to register the pids of the subscribed processes but that seems to bring in its own set of problems. What does happen when one process crashes or does not respond? ....
To give some context. I have new state to publish which all other processes should read from a shared memory location. It is ok to miss one update when several updates occur at once but the process must read at least the last up to date value. I could poll with a timeout but that seems not like a correct solution.
Currently I am down to
ChangeEvent = new EventWaitHandle(false, EventResetMode.ManualReset, counterName + "_Event");
ChangeEvent.Set();
Thread.Sleep(1); // increase odds to release all waiters
ChangeEvent.Reset();
One general purpose option for handling the case where producers must wake all consumers and the number of consumers is evolving is to use a moving fence approach. This option requires a shared memory IPC region too. The method does sometimes result in consumers being woken when no work is present, especially if lots of processes need scheduling and load is high, but they will always wake except on hopelessly overloaded machines.
Create several manual reset events and have the producers maintain a counter to the next event that will be set. All Events are left set, except the NextToFire event. Consumer processes wait on the NextToFire event. When the producer wishes to wake all consumers it resets the Next+1 event and sets the current event. All consumers will eventually be scheduled and then wait on the new NextToFire event. The effect is that only the producer uses ResetEvent, but consumers always know which event will be next to wake them.
All Users Init: (pseudo code is C/C++, not C#)
// Create Shared Memory and initialise NextToFire;
pSharedMemory = MapMySharedMemory();
if (First to create memory) pSharedMemory->NextToFire = 0;
HANDLE Array[4];
Array[0] = CreateEvent(NULL, 1, 0, "Event1");
Array[1] = CreateEvent(NULL, 1, 0, "Event2");
Array[2] = CreateEvent(NULL, 1, 0, "Event3");
Array[3] = CreateEvent(NULL, 1, 0, "Event4");
Producer to Wake all
long CurrentNdx = pSharedMemory->NextToFire;
long NextNdx = (CurrentNdx+1) & 3;
// Reset next event so consumers block
ResetEvent(Array[NextNdx]);
// Flag to consumers new value
long Actual = InterlockedIncrement(&pSharedMemory->NextToFire) & 3;
// Next line needed if multiple producers active.
// Not a perfect solution
if (Actual != NextNdx) ResetEvent(Array[Actual]);
// Now wake them all up
SetEvent(Array[CurrentNdx]);
Consumers wait logic
long CurrentNdx = (pSharedMemory->NextToFire) & 3;
WaitForSingleObject(Array[CurrentNdx], Timeout);
Since .NET 4.0, you can use MemoryMappedFile to share memory between processes. In this case, write a counter to the MemoryMappedFile and decrement it from the worker processes. When the counter reaches zero, the main process is allowed to reset the event. Here is the sample code.
Main Process
//number of WorkerProcess
int numWorkerProcess = 5;
//Create MemroyMappedFile object and accessor. 4 means int size.
MemoryMappedFile mmf = MemoryMappedFile.CreateNew("test_mmf", 4);
MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor();
EventWaitHandle ChangeEvent = new EventWaitHandle(false, EventResetMode.ManualReset, counterName + "_Event");
//write counter to MemoryMappedFile
accessor.Write(0, numWorkerProcess);
//.....
ChangeEvent.Set();
//spin wait until all worker processes decrement the counter
SpinWait.SpinUntil(() => {
int numLeft = accessor.ReadInt32(0);
return (numLeft == 0);
});
ChangeEvent.Reset();
WorkerProcess
//Open the existing MemoryMappedFile object created by the main process.
MemoryMappedFile mmf = MemoryMappedFile.OpenExisting("test_mmf");
MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor();
//This mutex object is used for decrementing the counter.
Mutex mutex = new Mutex(false, "test_mutex");
EventWaitHandle ChangeEvent = new EventWaitHandle(false, EventResetMode.ManualReset, "start_Event");
//....
ChangeEvent.WaitOne();
//some job...
//decrement counter with mutex lock.
mutex.WaitOne();
int count = accessor.ReadInt32(0);
--count;
accessor.Write(0, count);
mutex.ReleaseMutex();
If environment is less than .NET 4.0, you could realize by using CreateFileMapping function from win32 API.
You wrote: “PulseEvent which seems to do exactly what I need but unfortunately it is also flawed”. This is true that PulseEvent is flawed, but I cannot agree that manual-reset event is flawed. It is very reliable. There are just cases when you can use manual-reset events, and there are cases where you cannot use them. It is not one-fits-for-all. There are lots of other tools, like auto-reset events, pipes, etc.
The best way to just notify a thread, if you need to notify it periodically but don't need to send data across processes, is an auto-reset event. You just need a separate event for each thread, so you have as many events as there are threads.
If you need to just send data to processes, it's better to use named pipes. Unlike auto-reset events, you don't need a separate pipe for each of the processes. Each named pipe has a server and one or more clients. When there are many clients, many instances of the same named pipe are automatically created by the operating system for each of the clients. All instances of a named pipe share the same pipe name, but each instance has its own buffers and handles, and provides a separate conduit for client/server communication. The use of instances enables multiple pipe clients to use the same named pipe simultaneously. Any process can act as both a server for one pipe and a client for another pipe, and vice versa, making peer-to-peer communication possible.
If you will use a named pipe, there would be no need in the events at all in your scenario, and the data will have guaranteed delivery no matter what happens with the processes – each of the processes may get long delays (e.g. by a swap) but the data will be finally delivered ASAP without your special involvement.
One event for all threads (processes) is only OK if the notification happens only once. In that case, you need the manual-reset event, not an auto-reset one. For example, if you need to notify that your application will very soon exit, you may signal this common manual-reset event. But, as I wrote before, in your scenario named pipes are the best choice.

How to kill a thread in C# effectively?

I am not trying to beat a dead horse, honestly. And I've read all the advice on thread killing, however, please consider the code. It does the following:
It starts a thread (via StartThread method)
It calls the database looking for anything in the ServiceBroker queue. Note the WAITFOR command: it means the call will sit there until there is something in the queue. All this happens in the MonitorQueue method.
Kill the thread. I tried .Interrupt - it seems to do absolutely nothing. Then I tried .Abort, which should never be used, but even that did nothing.
Thread thxMonitor = new Thread(MonitorQueue);

void StartThread() {
    thxMonitor.Start();
}

void MonitorQueue(object obj) {
    var conn = new SqlConnection(connString);
    conn.Open();
    var cmd = conn.CreateCommand();
    cmd.CommandTimeout = 0; // forever and ever
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = "WAITFOR (RECEIVE CONVERT(int, message_body) AS Message FROM SBQ)";
    var dataTable = new DataTable();
    var da = new SqlDataAdapter(cmd);
    da.Fill(dataTable);
    da.Dispose();
}

void KillThreadByAnyMeansNecessary() {
    thxMonitor.Interrupt();
    thxMonitor.Abort();
}
Is it actually possible to kill a thread?
Set an Abort flag to tell the thread it needs to terminate. Append a dummy record to the ServiceBroker queue. The WAITFOR then returns. The thread then checks its Abort flag and, finding it set, deletes the dummy record from the queue and exits.
Another variant would be to add a 'real' poison-pill record to the specification for the table monitored by the ServiceBroker - an illegal record number, or the like. That would avoid touching the threads at all in any direct manner - always a good thing :) This might be more complex, especially if each work thread is expected to notify upon actual termination, but it would still be effective if the work threads, ServiceBroker and DB were all on different boxes. I added this as an edit because, having thought a bit more about it, it seems more flexible; after all, if the threads normally only communicate via the DB, why not shut them down with only the DB? No Abort(), no Interrupt() and, hopefully, no lockup-generating Join().
I hate to not answer your question, but consider going about this a different way. T-SQL allows a TIMEOUT parameter to be specified with WAITFOR, such that if a message is not received in a certain period of time, the statement will quit and have to be tried again. You see this over and over again in patterns where you have to wait. The tradeoff is that you don't immediately get the thread to die when requested -- you have to wait for your timeout to expire before your thread dies.
The quicker you want this to happen, the smaller your timeout interval. Want it to happen instantly? Then you should be polling instead.
static volatile bool _quit = false; // volatile so the worker sees the update promptly

Thread thxMonitor = new Thread(MonitorQueue);

void StartThread() {
    thxMonitor.Start();
}

void MonitorQueue(object obj) {
    var conn = new SqlConnection(connString);
    conn.Open();
    var cmd = conn.CreateCommand();
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = "WAITFOR (RECEIVE CONVERT(int, message_body) AS Message FROM SBQ), TIMEOUT 500";
    var dataTable = new DataTable();
    while (!_quit && !dataTable.AsEnumerable().Any()) {
        using (var da = new SqlDataAdapter(cmd)) {
            da.Fill(dataTable);
        }
    }
}

void KillThreadByAnyMeansNecessary() {
    _quit = true;
}
EDIT:
Although this can feel like polling the queue, it's not really. When you poll, you're actively checking something, and then you're waiting to avoid a "spinning" condition where you're constantly burning up CPU (though sometimes you don't even wait).
Consider what happens in a polling scenario when you check for entries, then wait 500ms. If nothing's in the queue and 200ms later a message arrives, you have to wait another 300ms when polling to get the message. With a timeout, if a message arrives 200ms into the timeout of the "wait" method, the message gets processed immediately.
That time delay forced by the wait when polling vs. a constant high CPU when polling in a tight loop is why polling is often unsatisfactory. Waiting with a timeout has no such disadvantages -- the only tradeoff is you have to wait for your timeout to expire before your thread can die.
Don't do this! Seriously!
The function that you need to call to kill a thread is the TerminateThread function, which you can call via P/Invoke. All the reasons as to why you shouldn't use this method are right there in the documentation
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you
know exactly what the target thread is doing, and you control all of
the code that the target thread could possibly be running at the time
of the termination. For example, TerminateThread can result in the
following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
The important thing to note is the condition that you control all of the code the target thread could possibly be running, and the fact that under the CLR / .NET Framework you are never in that situation (unless you happen to write the CLR).
To clarify: calling TerminateThread on a thread running .NET code could quite possibly deadlock your process or otherwise leave it in a completely unrecoverable state.
If you can't find some way to abort the connection then you are far better off just leaving that thread running in the background than trying to kill it with TerminateThread. Other people have already posted alternative suggestions on how to achieve this.
The Thread.Abort method is slightly safer in that it raises a ThreadAbortException rather than immediately tearing down your thread. However, it doesn't always work: the CLR can only throw the exception if it is actually running code on that thread, and in this case the thread is probably sitting waiting for an IO request to complete in native SQL Server client code. That is why your call to Thread.Abort isn't doing anything, and it won't do anything until control returns to the CLR.
Thread.Abort also has its own problems anyway and is generally considered a bad thing to be doing; however, it probably won't completely hose your process (although it still might, depending on what the running code is doing).
Instead of killing your thread, change your code to use WAITFOR with a small timeout.
After the timeout elapses, check whether the thread has been interrupted.
If not, loop back around and do your WAITFOR again.
Yes, the entire point of WAITFOR is to wait for something. But if you want something to be responsive, you can't ask one thread to wait forever and then expect it to listen to anything else.
It is not easy to terminate a thread right away, and there is a potential problem associated with doing so:
your thread acquires a lock, and then you kill it before it releases the lock. Now the threads that require the lock get stuck.
You can use a global variable to tell the thread to stop. In your thread code, you have to check that variable periodically and return if it indicates you should stop.
Please refer to this question discussing the same thing:
How to kill a thread instantly in C#?

Limiting the number of threadpool threads

I am using ThreadPool in my application. I have first set the limit of the thread pool by using the following:
ThreadPool.SetMaxThreads(m_iThreadPoolLimit,m_iThreadPoolLimit);
m_Events = new ManualResetEvent(false);
and then I have queued up the jobs using the following
WaitCallback objWcb = new WaitCallback(abc);
ThreadPool.QueueUserWorkItem(objWcb, m_objThreadData);
Here abc is the name of the function that I am calling.
After this I am doing the following so that all my threads come to 1 point and the main thread takes over and continues further
m_Events.WaitOne();
My thread limit is 3. The problem I am facing is that, despite the thread pool limit being set to 3, my application is processing more than 3 files at the same time, whereas it was supposed to process only 3 files at a time. Please help me solve this issue.
What kind of computer are you using?
From MSDN
You cannot set the number of worker
threads or the number of I/O
completion threads to a number smaller
than the number of processors in the
computer.
If you have 4 cores, then the smallest you can have is 4.
Also note:
If the common language runtime is
hosted, for example by Internet
Information Services (IIS) or SQL
Server, the host can limit or prevent
changes to the thread pool size.
If this is a web site hosted by IIS then you cannot change the thread pool size either.
A better solution involves the use of a Semaphore, which can throttle concurrent access to a resource. [1] In your case the resource would simply be a block of code that processes work items.
var finished = new CountdownEvent(1); // Used to wait for the completion of all work items.
var throttle = new Semaphore(3, 3); // Used to throttle the processing of work items.
foreach (WorkItem item in workitems)
{
    finished.AddCount();
    WorkItem capture = item; // Needed to safely capture the loop variable.
    ThreadPool.QueueUserWorkItem(
        (state) =>
        {
            throttle.WaitOne();
            try
            {
                ProcessWorkItem(capture);
            }
            finally
            {
                throttle.Release();
                finished.Signal();
            }
        }, null);
}
finished.Signal();
finished.Wait();
In the code above WorkItem is a hypothetical class that encapsulates the specific parameters needed to process your tasks.
The Task Parallel Library makes this pattern a lot easier. Just use the Parallel.ForEach method and specify a ParallelOptions.MaxDegreeOfParallelism that throttles the concurrency.
var options = new ParallelOptions();
options.MaxDegreeOfParallelism = 3;
Parallel.ForEach(workitems, options,
    (item) =>
    {
        ProcessWorkItem(item);
    });
[1] I should point out that I do not like blocking ThreadPool threads with a Semaphore or any other blocking device; it basically wastes the threads. You might want to rethink your design entirely.
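One sketch of such a rethink, assuming .NET 4.5+ and a hypothetical async version of the work method (ProcessWorkItemAsync): SemaphoreSlim.WaitAsync throttles to three items in flight without blocking any pool thread while waiting.

```csharp
// Throttled async processing; WorkItem and ProcessWorkItemAsync are assumed names.
async Task ProcessAllAsync(IEnumerable<WorkItem> workitems)
{
    var throttle = new SemaphoreSlim(3, 3);      // at most 3 items in flight
    var tasks = workitems.Select(async item =>
    {
        await throttle.WaitAsync();              // waits without tying up a thread
        try
        {
            await ProcessWorkItemAsync(item);    // the actual work, made async
        }
        finally
        {
            throttle.Release();
        }
    }).ToList();
    await Task.WhenAll(tasks);                   // wait for everything to finish
}
```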
You should use a Semaphore object to limit concurrent threads.
You say the files are open: are they actually being actively processed, or just left open?
If you're leaving them open: been there, done that! Relying on connections and resources (it was a DB connection in my case) to close at end of scope should work, but it can take a while for dispose / garbage collection to kick in.

Multiple Threads

I post a lot here regarding multithreading, and the great Stack Overflow community has helped me a lot in understanding multithreading.
All the examples I have seen online only deal with one thread.
My application is a scraper for an insurance company (family company... all free of charge). Anyway, the user is able to select how many threads they want to run. So let's say, for example, the user wants the application to scrape 5 sites at one time, and then later in the day he chooses 20 threads because his computer isn't doing anything else and has the resources to spare.
Basically the application builds a list of say 1000 sites to scrape. A thread goes off and does that and updates the UI and builds the list.
When that's finished, another thread is called to start the scraping. Depending on the number of threads the user has set, it will create x number of threads.
What's the best way to create these threads? Should I create 1000 threads in a list and loop through them? If the user has set 5 threads to run, it will loop through 5 at a time.
I understand threading, but it's the application logic which is catching me out.
Any ideas or resources on the web that can help me out?
You could consider using a thread pool for that:
using System;
using System.Threading;

public class Example
{
    public static void Main()
    {
        ThreadPool.SetMaxThreads(100, 10);
        // Queue the task.
        ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadProc));
        Console.WriteLine("Main thread does some work, then sleeps.");
        Thread.Sleep(1000);
        Console.WriteLine("Main thread exits.");
    }

    // This thread procedure performs the task.
    static void ThreadProc(Object stateInfo)
    {
        Console.WriteLine("Hello from the thread pool.");
    }
}
This scraper, does it use a lot of CPU when its running?
If it does a lot of communication with these 1000 remote sites, downloading their pages, that may be taking more time than the actual analysis of the pages.
And how many CPU cores does your user have? If they have 2 (which is common these days) then beyond two simultaneous threads performing analysis, they aren't going to see any speed up.
So you probably need to "parallelize" the downloading of the pages. I doubt you need to do the same for the analysis of the pages.
Take a look into asynchronous IO, instead of explicit multi-threading. It lets you launch a bunch of downloads in parallel and then get called back when each one completes.
If you really just want the application, use something someone else already spent time developing and perfecting:
http://arachnode.net/
arachnode.net is a complete and comprehensive .NET web crawler for
downloading, indexing and storing
Internet content including e-mail
addresses, files, hyperlinks, images,
and Web pages.
Whether interested or involved in
screen scraping, data mining, text
mining, research or any other
application where a high-performance
crawling application is key to the
success of your endeavors,
arachnode.net provides the solution
you need for success.
If you also want to write one yourself because it's a fun thing to write (I wrote one not long ago, and yes, it is a lot of fun), then you can refer to this PDF provided by arachnode.net, which explains in detail the theory behind a good web crawler:
http://arachnode.net/media/Default.aspx?Sort=Downloads&PageIndex=1
Download the PDF entitled "Crawling the Web" (second link from top). Scroll to Section 2.6, entitled "Multi-threaded Crawlers". That's what I used to build my crawler, and I must say, I think it works quite well.
I think this example is basically what you need.
public class WebScraper
{
    private readonly int totalThreads;
    private readonly List<System.Threading.Thread> threads;
    private readonly List<Exception> exceptions;
    private readonly object locker = new object();
    private volatile bool stop;

    public WebScraper(int totalThreads)
    {
        this.totalThreads = totalThreads;
        threads = new List<System.Threading.Thread>(totalThreads);
        exceptions = new List<Exception>();
        for (int i = 0; i < totalThreads; i++)
        {
            var thread = new System.Threading.Thread(Execute);
            thread.IsBackground = true;
            threads.Add(thread);
        }
    }

    public void Start()
    {
        foreach (var thread in threads)
        {
            thread.Start();
        }
    }

    public void Stop()
    {
        stop = true;
        foreach (var thread in threads)
        {
            if (thread.IsAlive)
            {
                thread.Join();
            }
        }
    }

    private void Execute()
    {
        try
        {
            while (!stop)
            {
                // Scrape away!
            }
        }
        catch (Exception ex)
        {
            lock (locker)
            {
                // You could have a thread checking this collection and
                // reporting it as you see fit.
                exceptions.Add(ex);
            }
        }
    }
}
The basic logic is:
You have a single queue into which you put the URLs to scrape; then you create your threads, each with access to that queue. Let each thread run this loop:
1. Lock the queue.
2. Check if there are items in the queue; if not, unlock the queue and end the thread.
3. Dequeue the first item in the queue.
4. Unlock the queue.
5. Process the item.
6. Invoke an event that updates the UI (remember to lock the UI controller).
7. Return to step 1.
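That loop can be sketched with ConcurrentQueue, which folds the lock/check/dequeue/unlock steps into one atomic call (ProcessSite, ReportProgress, urlsToScrape and userThreadCount are hypothetical names):

```csharp
// A shared work queue drained by a user-chosen number of threads.
var queue = new ConcurrentQueue<string>(urlsToScrape);

void Worker()
{
    while (queue.TryDequeue(out string url))  // atomically take the next item, if any
    {
        ProcessSite(url);                     // scrape one site
        ReportProgress(url);                  // marshal this back to the UI thread
    }
    // Queue empty: the thread simply ends.
}

for (int i = 0; i < userThreadCount; i++)
    new Thread(Worker) { IsBackground = true }.Start();
```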
Just let the threads do the "get stuff from the queue" part (pulling the jobs) instead of handing them the URLs (pushing the jobs); that way you can just say
YourThreadManager.StartThreads(numberOfThreadsTheUserWants);
and everything else happens automagically. See the other replies to find out how to create and manage the threads .
I solved a similar problem by creating a worker class that uses a callback to signal the main app that a worker is done. Then I create a queue of 1000 threads and then call a method that launches threads until the running thread limit is reached, keeping track of the active threads with a dictionary keyed by the thread's ManagedThreadId. As each thread completes, the callback removes its thread from the dictionary and calls the thread launcher.
If a connection is dropped or times out, the callback reinserts the thread back into the queue. Lock around the queue and the dictionary. I create threads rather than using the thread pool because the overhead of creating a thread is insignificant compared to the connection time, and it allows me to have a lot more threads in flight. The callback also provides a convenient place to update the user interface, even allowing you to change the thread limit while it's running. I've had over 50 open connections at one time. Remember to increase the max connections setting in your app.config (the default is two).
I would use a queue and a condition variable and mutex, and start just the requested number of threads, for example, 5 or 20 (and not start 1,000).
Each thread blocks on the condition variable. When woken up, it dequeues the first item, unlocks the queue, works with the item, locks the queue and checks for more items. If the queue is empty, sleep on the condition variable. If not, unlock, work, repeat.
While the mutex is locked, it can also check if the user has requested the count of threads to be reduced. Just check if count > max_count, and if so, the thread terminates itself.
Any time you have more sites to queue, just lock the mutex and add them to the queue, then broadcast on the condition variable. Any threads that are not already working will wake up and take new work.
Any time the user increases the requested thread count, just start them up and they will lock the queue, check for work, and either sleep on the condition variable or get going.
Each thread will be continually pulling more work from the queue, or sleeping. You don't need more than 5 or 20.
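A sketch of that queue-plus-condition-variable design using C#'s Monitor (all names here are illustrative):

```csharp
// Shared state: a work queue, a lock that doubles as the condition variable,
// and a user-adjustable thread limit.
private readonly Queue<string> queue = new Queue<string>();
private readonly object gate = new object();
private int maxThreads = 5;   // the user can change this at runtime

private void Worker(object idObj)
{
    int id = (int)idObj;      // 0-based worker index
    while (true)
    {
        string url;
        lock (gate)
        {
            // Sleep while there is no work, unless this thread is now surplus.
            while (queue.Count == 0 && id < maxThreads)
                Monitor.Wait(gate);
            if (id >= maxThreads)
                return;       // user reduced the thread count: terminate ourselves
            url = queue.Dequeue();
        }
        Scrape(url);          // hypothetical: do the work outside the lock
    }
}

private void AddWork(IEnumerable<string> urls)
{
    lock (gate)
    {
        foreach (var u in urls)
            queue.Enqueue(u);
        Monitor.PulseAll(gate);   // wake any sleeping workers
    }
}
```

Lowering maxThreads should likewise be followed by a Monitor.PulseAll under the lock, so surplus sleepers wake up, notice they are over the limit, and exit.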
Consider using the event-based asynchronous pattern (AsyncOperation and AsyncOperationManager Classes)
You might want to take a look at the ProcessQueue article on CodeProject.
Essentially, you'll want to create (and start) the number of threads that are appropriate, in your case that number comes from the user. Each of these threads should process a site, then find the next site needed to process. Even if you don't use the object itself (though it sounds like it would suit your purposes pretty well, though I'm obviously biased!) it should give you some good insight into how this sort of thing would be done.
