Using C# on the .NET 4.0 Framework, I have a Windows Forms main thread (the only one, until now) that waits for filesystem events and then must start some predefined processing on the files provided by those events.
I am planning to do the following:
A1. Immediately create a separate thread when the main process starts;
A2. Have the main thread put the file names to be processed in a Queue (FIFO);
A3. Have the new thread be triggered by a timer every n seconds;
A4. Have the new thread read the queue; if there is an item, perform the processing, then remove the queue item just processed.
Because I have never programmed threads before (I am basically using the Albahari as my compass) but I definitely want to, I have a few questions just to spot possible heavy headaches in advance:
Q1. Could I run into concurrency problems on the Queue if the main thread only writes and the new thread only removes queue items? In other words: is synchronization a significant issue in this case?
Q2. I have seen that I could create a new thread from scratch or reuse one of the threads made available from an existing pool. Is it safer / simpler to use threads from the pool in this context?
Q3. Are there any drawbacks to keeping the new thread alive indefinitely, responding only to the timer, until the main process is closed?
If you are targeting .NET Framework 4, BlockingCollection sounds like it will solve your issues; i.e. a thread-pool thread picks up "work" items as they become available on the queue (added to the queue in the event handler when new files arrive) and processes them asynchronously on that thread.
You can use one in a producer/consumer queue, e.g.:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

/// <summary>
/// Producer/consumer queue. When a task needs executing, it's enqueued to ensure order,
/// allowing the caller to get on with other things. The number of consumers can be defined,
/// each running on a thread pool task thread.
/// Adapted from: http://www.albahari.com/threading/part5.aspx#_BlockingCollectionT
/// </summary>
public class ProducerConsumerQueue : IDisposable
{
    private readonly BlockingCollection<Action> _taskQ = new BlockingCollection<Action>();

    public ProducerConsumerQueue(int workerCount)
    {
        // Create and start a separate Task for each consumer:
        for (int i = 0; i < workerCount; i++)
        {
            Task.Factory.StartNew(Consume);
        }
    }

    public void Dispose()
    {
        _taskQ.CompleteAdding();
    }

    public void EnqueueTask(Action action)
    {
        _taskQ.Add(action);
    }

    private void Consume()
    {
        // This sequence that we're enumerating will block when no elements
        // are available and will end when CompleteAdding is called.
        // Note: this removes AND returns items from the collection.
        foreach (Action action in _taskQ.GetConsumingEnumerable())
        {
            // Perform task.
            action();
        }
    }
}
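A minimal usage sketch for the original scenario; the watched path and the ProcessFile method are illustrative placeholders, not from the question:

// Requires System.IO for FileSystemWatcher.
var queue = new ProducerConsumerQueue(workerCount: 1);
var watcher = new System.IO.FileSystemWatcher(@"C:\watched"); // hypothetical path
// The event handler only enqueues; the pooled consumer thread does the work.
watcher.Created += (s, e) => queue.EnqueueTask(() => ProcessFile(e.FullPath)); // ProcessFile is assumed
watcher.EnableRaisingEvents = true;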
Related
I have the following requirements:
A thread which receives the messages and enqueues them.
A thread which processes the enqueued messages.
Now, the second thread always has to be alive, for which I have used an infinite while loop as follows:
private AutoResetEvent messageReset;
private Queue<byte[]> messageQueue;

//thread 2 method
private void ProcessIncomingMessages()
{
    messageReset.WaitOne(); //wait for signal
    while (true)
    {
        if (messageQueue.Count > 0)
        {
            //processing messages
        }
    }
}

public void SubmitMessageForProcessing(byte[] message)
{
    messageQueue.Enqueue(message); //enqueue message
    // Release the thread
    messageReset.Set();
}
Now, this infinite while loop is driving CPU utilization very high.
Is there any workaround to lower the CPU utilization?
NOTE: I can't add any Thread.Sleep statement, as the incoming messages have to be displayed on the UI with minimum delay.
Just use a BlockingCollection instead of a Queue. It is thread-safe and will block on Take until a producer adds an item:
// Use default constructor to make BlockingCollection FIFO
private BlockingCollection<byte[]> messageQueue = new BlockingCollection<byte[]>();

//thread 2 method
private void ProcessIncomingMessages()
{
    while (true)
    {
        // Will block until thread 1 adds a message.
        byte[] message = messageQueue.Take();
        // Processing messages.
    }
}

public void SubmitMessageForProcessing(byte[] message)
{
    messageQueue.Add(message); //enqueue message
}
EDIT 2: I forgot to mention that, by using the default constructor, BlockingCollection will be FIFO. It will actually use a ConcurrentQueue as the item container.
If you wanted BlockingCollection to behave like a LIFO collection, you would need to pass an IProducerConsumerCollection that is LIFO to the constructor. The usual class for that would be ConcurrentStack.
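For example, a hypothetical LIFO variant would look like this:

// Back the BlockingCollection with a ConcurrentStack to get LIFO ordering.
var lifoCollection = new BlockingCollection<byte[]>(new ConcurrentStack<byte[]>());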
EDIT: Some explanation of how your Queue is not thread-safe and how this could lead to problems with your current code.
From the Microsoft documentation on Queue:
A Queue can support multiple readers concurrently, as long as the collection is not modified.
This means you cannot read and write from multiple threads at the same time.
Look at the following sequence, which also applies to the other answers that suggest just moving messageReset.WaitOne() into your while(true) block:
1. SubmitMessageForProcessing is called and signals messageReset.Set().
2. Thread 2 wakes up and tries to read data.
3. While thread 2 reads data, SubmitMessageForProcessing is called a second time.
4. Now you are writing and reading at the same time, resulting in unexpected behavior (usually some kind of exception).
In your example, the while loop will busy-wait until the queue has at least one element. You can move the signal into that loop to reduce the busy-waiting and use less CPU.
private void ProcessIncomingMessages()
{
    while (true)
    {
        messageReset.WaitOne(100); //wait for signal
        while (messageQueue.Count > 0)
        {
            //processing messages
        }
    }
}
P.S. Unless you have some sort of custom locking mechanism, you must use a ConcurrentQueue<T> instead of a Queue<T> if you want to be thread-safe. Also, I put a timeout on the WaitOne call because there is a slim chance the signal will get set after you check Count but before the WaitOne call is reached. There may be other threading issues in your solution. If you're not confident about threading concerns, you might want to use a BlockingCollection, which takes care of a lot of the details for you.
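A minimal sketch of that combination, assuming the same fields as above with the queue swapped for a ConcurrentQueue<T>:

private readonly ConcurrentQueue<byte[]> messageQueue = new ConcurrentQueue<byte[]>();

private void ProcessIncomingMessages()
{
    while (true)
    {
        // The timeout covers the race between checking the queue and waiting.
        messageReset.WaitOne(100);
        byte[] message;
        while (messageQueue.TryDequeue(out message))
        {
            // process the message
        }
    }
}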
I have a Window in WPF, and the user can start very long operations from it. The user must be able to cancel those operations.
All of my operations are in separate threads. So my question is:
Can I terminate, at any time, all threads that were started from that Window, without killing the UI thread obviously?
In places where I need to do long operations, threads are created and started like this:
Thread thread = new Thread(new ThreadStart(delegate { ... }));
thread.Start();
How do I pass that object to it? Is it possible? If it is important at all: I do not care about gracefully closing the threads; they can be killed, and it would still be a solution. Is the Window object aware of the threads to which it is parent?
Thank you in advance.
Typically you won't want to create/destroy threads. There's much more overhead in creating a Thread every time you need one than there is in using thread pools and Tasks (this applies, as noted, when you need to create a significant number of Threads over the lifetime of your process).
The preferred approach (especially if you're using .NET 4.0, or even better, 4.5) is to use Tasks.
There's tons of documentation on how to use Tasks and how to cancel them. @xxbbcc posted a link in a comment on your question.
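A minimal sketch of cooperative cancellation with Tasks; the field and method names are illustrative, not from the question:

private readonly CancellationTokenSource _cts = new CancellationTokenSource();

private void StartLongOperation()
{
    var token = _cts.Token;
    Task.Factory.StartNew(() =>
    {
        while (!token.IsCancellationRequested)
        {
            // do a slice of the long-running work here
        }
    }, token);
}

// Called from the Window, e.g. a Cancel button handler or OnClosing:
private void CancelAllOperations()
{
    _cts.Cancel();
}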
However, if you still think that dealing with Threads is your best choice, you could keep track of all the threads. Then, whenever you (as a developer) or your user decides to kill the threads, you can just iterate through them and call the Abort() method on each one.
public class MyExampleClass
{
    private List<Thread> MyThreads { get; set; }

    public MyExampleClass()
    {
        MyThreads = new List<Thread>();
        InstantiateThreadsWithSomeSuperImportantOperations();
    }

    private void InstantiateThreadsWithSomeSuperImportantOperations()
    {
        // Thread has no parameterless constructor; it needs a delegate to run.
        var thread = new Thread(SomeSuperImportantOperation);
        // some code here
        MyThreads.Add(thread);
        thread.Start();
    }

    private void SomeSuperImportantOperation()
    {
        // long-running work here
    }

    public void KillAllThreads()
    {
        foreach (var t in MyThreads)
        {
            if (t.IsAlive)
                t.Abort(); // Note this isn't guaranteed to stop the thread.
        }
    }
}
I am looking for a TaskScheduler that:
Allows me to define a number of dedicated threads (e.g. 8) - a standard LimitedConcurrencyLevelTaskScheduler (which uses thread-pool threads) or a WorkStealingTaskScheduler does this.
Allows me to create sub-TaskSchedulers that are fully ordered but schedules the tasks on the dedicated threads of the parent scheduler.
At the moment we use TaskScheduler.Default for the general pool (at the mercy of the threadpool growth algorithm etc) and new OrderedTaskScheduler() whenever we want to order tasks. I want to keep this behavior but limit both requirements to my own pool of dedicated threads.
QueuedTaskScheduler seems to get pretty close. I thought the QueuedTaskScheduler.ActivateNewQueue() method, which returns a child TaskScheduler, would execute tasks IN ORDER on the pool of workers from the parent, but that doesn't seem to be the case. The child TaskSchedulers seem to have the same level of parallelization as the parent.
I don't necessarily want the child taskscheduler tasks to be prioritised over the parent taskscheduler tasks (although it might be a nice feature in the future).
I have seen a related question here: Limited concurrency level task scheduler (with task priority) handling wrapped tasks but my requirements do not need to handle async tasks (all my enqueued tasks are completely synchronous from start to end, with no continuations).
I assume by "fully ordered" you also mean "one at a time".
In that case, I believe there's a built-in solution that should do quite well: ConcurrentExclusiveSchedulerPair.
Your "parent" scheduler would be a concurrent scheduler:
TaskScheduler _parent = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, 8)
    .ConcurrentScheduler;
And the "child" schedulers would be an exclusive scheduler that uses the concurrent scheduler underneath:
var myScheduler = new ConcurrentExclusiveSchedulerPair(_parent).ExclusiveScheduler;
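A usage sketch (DoWork and DoOrderedWork are placeholders): unordered work targets the shared concurrent scheduler, while work that must run one at a time targets a child exclusive scheduler.

// Unordered work, up to 8 at a time across the shared pool:
Task.Factory.StartNew(DoWork, CancellationToken.None,
    TaskCreationOptions.None, _parent);
// Fully ordered work, one at a time, still on the parent's threads:
Task.Factory.StartNew(DoOrderedWork, CancellationToken.None,
    TaskCreationOptions.None, myScheduler);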
After carefully considering the other answers, I decided for my uses it was easier to create a custom QueuedTaskScheduler given I don't need to worry about async tasks or IO completion (although that has given me something to think about).
Firstly, when we grab work from the child work pools, we add a semaphore-based lock inside FindNextTask_NeedsLock:
var items = queueForTargetTask._workItems;
if (items.Count > 0
    && queueForTargetTask.TryLock() /* This is added */)
{
    targetTask = items.Dequeue();
For the dedicated thread version, inside ThreadBasedDispatchLoop:
// ... and if we found one, run it
if (targetTask != null)
{
    queueForTargetTask.ExecuteTask(targetTask);
    queueForTargetTask.Release();
}
For the task scheduler version, inside ProcessPrioritizedAndBatchedTasks:
// Now if we finally have a task, run it. If the task
// was associated with one of the round-robin schedulers, we need to use it
// as a thunk to execute its task.
if (targetTask != null)
{
    if (queueForTargetTask != null)
    {
        queueForTargetTask.ExecuteTask(targetTask);
        queueForTargetTask.Release();
    }
    else
    {
        TryExecuteTask(targetTask);
    }
}
Where we create the new child queues:
/// <summary>Creates and activates a new scheduling queue for this scheduler.</summary>
/// <returns>The newly created and activated queue at priority 0 and max concurrency of 1.</returns>
public TaskScheduler ActivateNewQueue() { return ActivateNewQueue(0, 1); }

/// <summary>Creates and activates a new scheduling queue for this scheduler.</summary>
/// <param name="priority">The priority level for the new queue.</param>
/// <param name="maxConcurrency">The maximum concurrency for the new queue.</param>
/// <returns>The newly created and activated queue at the specified priority.</returns>
public TaskScheduler ActivateNewQueue(int priority, int maxConcurrency)
{
    // Create the queue
    var createdQueue = new QueuedTaskSchedulerQueue(priority, maxConcurrency, this);
    ...
}
Finally, inside the nested QueuedTaskSchedulerQueue:
// This is added.
private readonly int _maxConcurrency;
private readonly Semaphore _semaphore;

internal bool TryLock()
{
    return _semaphore.WaitOne(0);
}

internal void Release()
{
    _semaphore.Release();
    _pool.NotifyNewWorkItem();
}

/// <summary>Initializes the queue.</summary>
/// <param name="priority">The priority associated with this queue.</param>
/// <param name="maxConcurrency">Max concurrency for this scheduler.</param>
/// <param name="pool">The scheduler with which this queue is associated.</param>
internal QueuedTaskSchedulerQueue(int priority, int maxConcurrency, QueuedTaskScheduler pool)
{
    _priority = priority;
    _pool = pool;
    _workItems = new Queue<Task>();
    // This is added.
    _maxConcurrency = maxConcurrency;
    _semaphore = new Semaphore(_maxConcurrency, _maxConcurrency);
}
I hope this might be useful for someone trying to do the same as me: interleaving unordered tasks with ordered tasks on a single, easy-to-use scheduler (that can use the default thread pool, or any other scheduler).
=== UPDATE ===
Inspired by Stephen Cleary, I ended up using:
private static readonly Lazy<TaskScheduler> Scheduler = new Lazy<TaskScheduler>(
    () => new WorkStealingTaskScheduler(16));

public static TaskScheduler Default
{
    get
    {
        return Scheduler.Value;
    }
}

public static TaskScheduler CreateNewOrderedTaskScheduler()
{
    return new QueuedTaskScheduler(Default, 1);
}
I understand your tasks have dependencies, which is why you want to (partially) order them. You could do this with ContinueWith chains: just keep track of the latest task in any given chain. When a new one comes in, you set up the next continuation off that task and store the new task, dropping the old one.
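A minimal sketch of that bookkeeping (all names are illustrative; a TaskCompletionSource provides an already-completed task to start a chain on .NET 4.0):

private readonly Dictionary<string, Task> _chainTails = new Dictionary<string, Task>();
private readonly object _gate = new object();

public Task EnqueueOrdered(string chainId, Action work)
{
    lock (_gate)
    {
        Task tail;
        if (!_chainTails.TryGetValue(chainId, out tail))
        {
            // Start the chain from an already-completed task.
            var tcs = new TaskCompletionSource<object>();
            tcs.SetResult(null);
            tail = tcs.Task;
        }
        var next = tail.ContinueWith(_ => work());
        _chainTails[chainId] = next; // keep only the latest task; drop the old one
        return next;
    }
}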
Alternative solution: have one SemaphoreSlim per chain and use await sem.WaitAsync() to manually control the degree of parallelism very flexibly. Note that async-waiting on a semaphore does not block any thread; it costs just a little memory. No OS resource at all is being used, so you can have extremely many semaphores in use.
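A sketch of that approach, assuming .NET 4.5 for SemaphoreSlim.WaitAsync; one semaphore per chain with an initial count of 1 gives full ordering:

private readonly SemaphoreSlim _chainSem = new SemaphoreSlim(1, 1);

public async Task RunInChainAsync(Func<Task> work)
{
    await _chainSem.WaitAsync(); // waits without blocking a thread
    try
    {
        await work();
    }
    finally
    {
        _chainSem.Release();
    }
}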
I don't think schedulers are the right abstraction. Schedulers are for CPU-based work. The other coordination tools can work with any Task including async IO. Consider preferring ordinary task combinators and coordination primitives.
I'm playing around with a simple console app that creates one thread, and I do some inter-thread communication between the main and the worker thread.
I'm posting objects from the main thread to a concurrent queue, and the worker thread is dequeuing them and doing some processing.
What strikes me as odd is that when I profile this app, even though I have two cores, one core is 100% free while the other core has done all the work; I see that both threads have been running on that core.
Why is this?
Is it because I use a wait handle that sets when I post a message and releases when the processing is done?
This is my sample code, now using two worker threads.
It still behaves the same: main, worker1 and worker2 are running on the same core.
Ideas?
[EDIT]
It sort of works now; at least I get twice the performance compared to yesterday.
The trick was to slow down the consumer just enough to avoid signaling via the AutoResetEvent.
public class SingleThreadDispatcher
{
    public long Count;
    private readonly ConcurrentQueue<Action> _queue = new ConcurrentQueue<Action>();
    private volatile bool _hasMoreTasks;
    private volatile bool _running = true;
    private int _status;
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);

    public SingleThreadDispatcher()
    {
        var thread = new Thread(Run)
        {
            IsBackground = true,
            Name = "worker" + Guid.NewGuid(),
        };
        thread.Start();
    }

    private void Run()
    {
        while (_running)
        {
            _signal.WaitOne();
            do
            {
                _hasMoreTasks = false;

                Action task;
                while (_queue.TryDequeue(out task) && _running)
                {
                    Count++;
                    task();
                }
                // Wait a short while to let _hasMoreTasks maybe be set to true.
                // This avoids the roundtrip to the AutoResetEvent; that is, if there
                // is intense pressure on the pool, we let some new tasks have the
                // chance to arrive and be processed without signaling.
                if (!_hasMoreTasks)
                    Thread.Sleep(5);

                Interlocked.Exchange(ref _status, 0);
            } while (_hasMoreTasks);
        }
    }

    public void Schedule(Action task)
    {
        _hasMoreTasks = true;
        _queue.Enqueue(task);
        SetSignal();
    }

    private void SetSignal()
    {
        if (Interlocked.Exchange(ref _status, 1) == 0)
        {
            _signal.Set();
        }
    }
}
Is it because I use a wait handle that sets when I post a message and releases when the processing is done?
Without seeing your code it is hard to say for sure, but from your description it appears that the two threads you wrote act as coroutines: when the main thread is running, the worker thread has nothing to do, and vice versa. It looks like the .NET scheduler is smart enough not to load the second core when this happens.
You can change this behavior in several ways, for example:
- by doing some work on the main thread before waiting on the handle, or
- by adding more worker threads that would compete for the tasks your main thread posts, and could both get a task to work on.
OK, I've figured out what the problem is.
The producer and consumer are pretty much equally fast in this case.
This results in the consumer finishing all its work quickly and then looping back to wait on the AutoResetEvent.
The next time the producer sends a task, it has to touch the AutoResetEvent and set it.
The solution was to add a very small delay in the consumer, making it slightly slower than the producer.
This way, when the producer sends a task, it notices that the consumer is already active and just has to post to the worker queue without touching the AutoResetEvent.
The original behavior resulted in a sort of ping-pong effect, which can be seen in the screenshot.
Dasblinkelight (probably) has the right answer.
Apart from that, it would also be the correct behaviour when one of your threads is I/O bound (that is, it's not stuck on the CPU) - in that case, you've got nothing to gain from using multiple cores, and .NET is smart enough to just change contexts on one core.
This is often the case for UI threads - they have very little work to do, so there usually isn't much of a reason for them to occupy a whole core. And yes, if your concurrent queue is not used properly, it could simply mean that the main thread waits for the worker thread - again, in that case, there's no need to switch cores, since the original thread is waiting anyway.
You should use BlockingCollection rather than ConcurrentQueue. By default, BlockingCollection uses a ConcurrentQueue under the hood, but it has a much easier-to-use interface. In particular, it does non-busy waits. In addition, BlockingCollection supports cancellation, so your consumer becomes very simple. Here's an example:
public class SingleThreadDispatcher
{
    public long Count;
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly CancellationTokenSource _cancellation = new CancellationTokenSource();

    public SingleThreadDispatcher()
    {
        var thread = new Thread(Run)
        {
            IsBackground = true,
            Name = "worker" + Guid.NewGuid(),
        };
        thread.Start();
    }

    private void Run()
    {
        foreach (var task in _queue.GetConsumingEnumerable(_cancellation.Token))
        {
            Count++;
            task();
        }
    }

    public void Schedule(Action task)
    {
        _queue.Add(task);
    }
}
The loop with GetConsumingEnumerable will do a non-busy wait on the queue. There's no need to do it with a separate event. It will wait for an item to be added to the queue, or it will exit if you set the cancellation token.
To stop it normally, you just call _queue.CompleteAdding(). That tells the consumer that no more items will be added to the queue. The consumer will empty the queue and then exit.
If you want to quit early, then just call _cancellation.Cancel(). That will cause GetConsumingEnumerable to exit.
In general, you shouldn't ever have to use ConcurrentQueue directly. BlockingCollection is easier to use and provides equivalent performance.
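For completeness, hypothetical additions to the dispatcher above, sketching both shutdown paths:

// Graceful stop: drain any remaining items, then Run exits normally.
public void Stop()
{
    _queue.CompleteAdding();
}

// Early stop: GetConsumingEnumerable throws OperationCanceledException
// when the token is canceled, so Run should catch it if you use this path.
public void StopNow()
{
    _cancellation.Cancel();
}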
I would like to start x number of threads from my .NET application, and I would like to keep track of them, as I will need to terminate them manually or when my application closes later on.
Example: start thread Alpha, start thread Beta... then at any point in my application I should be able to say "terminate thread Beta".
What is the best way to keep track of open threads in .NET, and what do I need to know (an id?) about a thread to terminate it?
You could save yourself the donkey work and use this Smart Thread Pool. It provides a unit-of-work system which allows you to query each thread's status at any point, and terminate them.
If that is too much bother, then, as mentioned, an IDictionary<string, Thread> is probably the simplest solution. Or even simpler: give each of your threads a name, and use an IList<Thread>:
public class MyThreadPool
{
    private IList<Thread> _threads;
    private readonly int MAX_THREADS = 25;

    public MyThreadPool()
    {
        _threads = new List<Thread>();
    }

    public void LaunchThreads()
    {
        for (int i = 0; i < MAX_THREADS; i++)
        {
            Thread thread = new Thread(ThreadEntry);
            thread.IsBackground = true;
            thread.Name = string.Format("MyThread{0}", i);
            _threads.Add(thread);
            thread.Start();
        }
    }

    public void KillThread(int index)
    {
        string id = string.Format("MyThread{0}", index);
        foreach (Thread thread in _threads)
        {
            if (thread.Name == id)
                thread.Abort();
        }
    }

    void ThreadEntry()
    {
    }
}
You can of course get a lot more involved and complicated with it. If killing your threads isn't time-sensitive (for example, if you don't need to kill a thread within 3 seconds in a UI), then a Thread.Join() is better practice.
And if you haven't already read it, then Jon Skeet has this good discussion and solution for the "don't use abort" advice that is common on SO.
You can create a Dictionary of threads and assign them ids, like:
Dictionary<string, Thread> threads = new Dictionary<string, Thread>();
for (int i = 0; i < numOfThreads; i++)
{
    string id = "Thread" + i; // any naming scheme you want
    Thread thread = new Thread(new ThreadStart(MethodToExe));
    thread.Name = id;
    thread.Start(); // if you wish to start them straight away and call MethodToExe
    threads.Add(id, thread);
}
If you don't want to save threads against an id, you can use a list and later just enumerate it to kill the threads.
When you wish to terminate them, you can abort them, but it's better to have some condition in your MethodToExe that allows the method to return, letting the thread terminate gracefully. Something like:
void MethodToExe()
{
    while (_isRunning)
    {
        // your code here
        if (!_isRunning)
        {
            break;
        }
        // your code here
    }
}
To abort, you can enumerate the dictionary and call Thread.Abort(). Be ready to catch the ThreadAbortException.
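For example, a sketch of trapping the abort inside the worker method:

void MethodToExe()
{
    try
    {
        while (_isRunning)
        {
            // your code here
        }
    }
    catch (ThreadAbortException)
    {
        // Clean up here. Note the runtime re-throws the exception at the
        // end of this catch block unless Thread.ResetAbort() is called.
    }
}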
I asked a similar question and received a bunch of good answers: Shutting down a multithreaded application
Note: my question did not require a graceful exit, but people still recommended that I gracefully exit from the loop of each thread.
The main thing to remember is that, if you want to avoid having your threads prevent your process from terminating, you should set all your threads to background:
Thread thread = new Thread(new ThreadStart(testObject.RunLoop));
thread.IsBackground = true;
thread.Start();
The preferred way to start and manage threads is in a ThreadPool, but just about any container out there can be used to keep a reference to your threads. Your threads should always have a flag that will tell them to terminate and they should continually check it.
Furthermore, for better control you can supply your threads with a CountdownLatch: whenever a thread exits its loop, it signals on the CountdownLatch. Your main thread calls the CountdownLatch.Wait() method, which blocks until all the threads have signaled; this ensures that all your threads have shut down before you start cleaning up.
public class CountdownLatch
{
    private int m_remain;
    private EventWaitHandle m_event;

    public CountdownLatch(int count)
    {
        Reset(count);
    }

    public void Reset(int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException();

        m_remain = count;
        m_event = new ManualResetEvent(false);
        if (m_remain == 0)
        {
            m_event.Set();
        }
    }

    public void Signal()
    {
        // The last thread to signal also sets the event.
        if (Interlocked.Decrement(ref m_remain) == 0)
            m_event.Set();
    }

    public void Wait()
    {
        m_event.WaitOne();
    }
}
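A usage sketch (workerCount and the _keepRunning stop flag are illustrative, not from the answer):

var latch = new CountdownLatch(workerCount);
for (int i = 0; i < workerCount; i++)
{
    var worker = new Thread(() =>
    {
        try
        {
            while (_keepRunning) { /* do work */ }
        }
        finally
        {
            latch.Signal(); // always signal, even if the loop throws
        }
    });
    worker.IsBackground = true;
    worker.Start();
}
// On shutdown: clear _keepRunning, then block until every worker has signaled.
latch.Wait();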
It's also worth mentioning that the Thread.Abort() method does some strange things:
When a thread calls Abort on itself, the effect is similar to throwing an exception; the ThreadAbortException happens immediately, and the result is predictable. However, if one thread calls Abort on another thread, the abort interrupts whatever code is running. There is also a chance that a static constructor could be aborted. In rare cases, this might prevent instances of that class from being created in that application domain. In the .NET Framework versions 1.0 and 1.1, there is a chance the thread could abort while a finally block is running, in which case the finally block is aborted.
The thread that calls Abort might block if the thread that is being aborted is in a protected region of code, such as a catch block, finally block, or constrained execution region. If the thread that calls Abort holds a lock that the aborted thread requires, a deadlock can occur.
After creating your thread, you can set its Name property. Assuming you store it in some collection, you can access it conveniently via LINQ in order to retrieve (and abort) it:
var myThread = (from thread in threads
                where thread.Name == "myThread"
                select thread).FirstOrDefault();
if (myThread != null)
    myThread.Abort();
Wow, there are so many answers..
You can simply use an array to hold the threads. This will only work if access to the array is sequential, but if another thread accesses the array, you will need to synchronize access.
You can use the thread pool, but the thread pool is very limited and can only hold a fixed amount of threads.
As mentioned above, you can create your own thread pool, which in .NET v4 becomes much easier with the introduction of safe collections.
You can manage them by holding a list of mutex objects which determine when those threads should finish; the threads query the mutex each time they run, before doing anything else, and if it's set, they terminate. You can manage the mutexes from anywhere, and since mutexes are by definition thread-safe, it's fairly easy.
I can think of another 10 ways, but those seem to work. Let me know if they don't fit your needs.
Depends on how sophisticated you need it to be. You could implement your own type of ThreadPool with helper methods etc. However, I think it's as simple as just maintaining a list/array and adding/removing the threads to/from the collection accordingly.
You could also use a Dictionary collection and retrieve them with your own type of key, i.e. Guids/strings.
As you start each thread, put its ManagedThreadId into a Dictionary as the key and the thread instance as the value. Use a callback from each thread to return its ManagedThreadId, which you can use to remove the thread from the Dictionary when it terminates. You can also walk the Dictionary to abort threads if needed. Make the threads background threads so that they terminate if your app terminates unexpectedly.
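A minimal sketch of that bookkeeping, assuming a ConcurrentDictionary is acceptable (WorkLoop is a placeholder):

private readonly ConcurrentDictionary<int, Thread> _threads =
    new ConcurrentDictionary<int, Thread>();

public void StartTrackedThread()
{
    var thread = new Thread(() =>
    {
        try
        {
            WorkLoop(); // placeholder for the thread's real work
        }
        finally
        {
            // Remove ourselves from the dictionary on exit.
            Thread removed;
            _threads.TryRemove(Thread.CurrentThread.ManagedThreadId, out removed);
        }
    });
    thread.IsBackground = true; // dies with the process
    _threads.TryAdd(thread.ManagedThreadId, thread);
    thread.Start();
}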
You can use a separate callback to signal threads to continue or halt, which reflects a flag set by your UI, for a graceful exit. You should also trap the ThreadAbortException in your threads so that you can do any cleanup if you have to abort threads instead.