I have an integration service which runs a calculation-heavy, data-bound process. I want to make sure that there are never more than, say, n = 5 (n will be configurable and changeable at runtime) of these processes running at the same time. The idea is to throttle the load on the server to a safe level. The amount of data processed by the method is limited by batching, so I don't need to worry about one process representing a much bigger load than another.
The processing method is called by another process, where requests to run payroll are held on a queue, and I can insert some logic at that point to determine whether to process this request now, or leave it on the queue.
So I want a separate method on the same service as the processing method which can tell me whether the server can accept another call to the processing method. It's going to ask, "how many payroll runs are going on? Is that less than n?" What's a neat way of achieving this?
-----------edit------------
I think I need to make it clear: the process that decides whether to take the request off the queue is separated from the service that processes the payroll data by a WCF boundary. Stopping a thread in the payroll-processing process isn't going to prevent more requests coming in.
You can use a Semaphore to do this.
using System.Threading;

public class Foo
{
    private Semaphore semaphore;

    public Foo(int numConcurrentCalls)
    {
        semaphore = new Semaphore(numConcurrentCalls, numConcurrentCalls);
    }

    public bool IsReady()
    {
        // Note: WaitOne(0) acquires a slot when it returns true, so a caller
        // that checks IsReady() and then calls Bar() would consume two slots.
        // Treat a true result as having reserved the slot, or release it here.
        return semaphore.WaitOne(0);
    }

    public void Bar()
    {
        // It will only get past this line if there are fewer than
        // "numConcurrentCalls" threads in this method currently.
        semaphore.WaitOne();
        try
        {
            // do stuff
        }
        finally
        {
            semaphore.Release();
        }
    }
}
Review the Object Pool pattern. This is what you're describing. While not strictly required by the pattern, you can expose the number of objects currently in the pool, the maximum (configured) number, the high-watermark, etc.
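As a rough illustration of that bookkeeping (all names here are invented for the sketch, not from any library), the pool can be as small as a counter with a configurable maximum:

using System.Threading;

public class RunnerPool
{
    private readonly int max;
    private int inUse;

    public RunnerPool(int max) { this.max = max; }

    public int Maximum => max;
    public int InUse => Volatile.Read(ref inUse);

    // Non-blocking check-and-reserve: answers "can the server take another run?"
    public bool TryAcquire()
    {
        while (true)
        {
            int current = Volatile.Read(ref inUse);
            if (current >= max) return false;
            // Only claim the slot if no other thread changed the count meanwhile.
            if (Interlocked.CompareExchange(ref inUse, current + 1, current) == current)
                return true;
        }
    }

    public void Release() => Interlocked.Decrement(ref inUse);
}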
I think that you might want a BlockingCollection, where each item in the collection represents one of the concurrent calls.
Also see IProducerConsumerCollection.
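For instance, a sketch of that idea (the names and the maxConcurrentRuns setting are invented here): seed a BlockingCollection with one token per allowed concurrent call, take a token before processing, and return it afterwards.

private BlockingCollection<byte> tokens = new BlockingCollection<byte>();

// Seed one token per allowed concurrent call.
public void Init(int maxConcurrentRuns)
{
    for (int i = 0; i < maxConcurrentRuns; i++)
        tokens.Add(0);
}

// Advisory check for the queue-side logic; the real gate is Take().
public bool CanAcceptCall()
{
    return tokens.Count > 0;
}

public void ProcessPayroll()
{
    byte token = tokens.Take();   // blocks while every slot is busy
    try
    {
        // ... run the payroll batch ...
    }
    finally
    {
        tokens.Add(token);        // return the slot
    }
}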
If you were just using threads, I'd suggest you look at the methods for limiting thread concurrency (e.g. the TaskScheduler.MaximumConcurrencyLevel property, and this example).
Also see ParallelEnumerable.WithDegreeOfParallelism
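For example, a quick PLINQ sketch (ProcessBatch and batches are stand-ins for your own work):

// Cap a PLINQ pipeline at 5 concurrent workers.
var results = batches.AsParallel()
    .WithDegreeOfParallelism(5)
    .Select(batch => ProcessBatch(batch))
    .ToList();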
void ThreadTest()
{
    ConcurrentQueue<int> q = new ConcurrentQueue<int>();
    int MaxCount = 5;
    Random r = new Random();
    for (int i = 0; i < 10000; i++)
    {
        q.Enqueue(r.Next(100000, 200000));
    }
    // Each worker loops until the queue is drained. A loop is used instead of
    // recursion so a long queue cannot overflow the stack, and the shared
    // Random is locked because Random is not thread-safe.
    ThreadStart proc = () =>
    {
        int read;
        while (q.TryDequeue(out read))
        {
            Console.WriteLine("[{1:HH:mm:ss}.{1:fff}] starting: {0}... #Thread {2}", read, DateTime.Now, Thread.CurrentThread.ManagedThreadId);
            int delay;
            lock (r) { delay = r.Next(100, 1000); }
            Thread.Sleep(delay);
            Console.WriteLine("[{1:HH:mm:ss}.{1:fff}] {0} ended! #Thread {2}", read, DateTime.Now, Thread.CurrentThread.ManagedThreadId);
        }
    };
    // Start exactly MaxCount worker threads.
    for (int i = 0; i < MaxCount; i++)
    {
        new Thread(proc).Start();
    }
}
This is an example about Thread Local Storage (TLS) from the Apress parallel programming book. I know that on a computer with 4 cores, 4 threads can run in parallel at the same time. In this example we create 10 tasks and we assume a 4-core computer, so when the 10 tasks start in parallel, only 4 threads execute them. Each thread-local slot lives on one thread, so we have 4 TLS objects and 10 tasks trying to change them. I want to ask: how does TLS prevent data races when the thread count is less than the task count?
using System;
using System.Threading;
using System.Threading.Tasks;

namespace Listing_04
{
    class BankAccount
    {
        public int Balance { get; set; }
    }

    class Listing_04
    {
        static void Main(string[] args)
        {
            // create the bank account instance
            BankAccount account = new BankAccount();
            // create an array of tasks
            Task<int>[] tasks = new Task<int>[10];
            // create the thread local storage
            ThreadLocal<int> tls = new ThreadLocal<int>();
            for (int i = 0; i < 10; i++)
            {
                // create a new task
                tasks[i] = new Task<int>((stateObject) =>
                {
                    // get the state object and use it
                    // to set the TLS data
                    tls.Value = (int)stateObject;
                    // enter a loop for 1000 balance updates
                    for (int j = 0; j < 1000; j++)
                    {
                        // update the TLS balance
                        tls.Value++;
                    }
                    // return the updated balance
                    return tls.Value;
                }, account.Balance);
                // start the new task
                tasks[i].Start();
            }
            // get the result from each task and add it to
            // the balance
            for (int i = 0; i < 10; i++)
            {
                account.Balance += tasks[i].Result;
            }
            // write out the counter value
            Console.WriteLine("Expected value {0}, Balance: {1}", 10000, account.Balance);
            // wait for input before exiting
            Console.WriteLine("Press enter to finish");
            Console.ReadLine();
        }
    }
}
We have 4 TLS so 10 tasks try to change 4 thread-local storage objects
In your example, you could have anywhere between 1 and 10 TLS slots. This is because a) you are not managing your threads explicitly and so the tasks are executed using the thread pool, and b) the thread pool creates and destroys threads over time according to demand.
A loop of only 1000 iterations will complete almost instantaneously. So it's likely all ten of your tasks will get through the thread pool before the pool decides a work item has been waiting long enough to justify adding any new threads. But there is no guarantee of this.
Some important parts of the documentation include these statements:
By default, the minimum number of threads is set to the number of processors on a system
and
When demand is low, the actual number of thread pool threads can fall below the minimum values.
In other words, on your four-core system, the default minimum number of threads is four, but the actual number of threads active in the thread pool could in fact be less than that. And if the tasks take long enough to execute, the number of active threads could rise above that.
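You can observe those configured bounds directly (a quick sketch; the actual number of live threads may still differ):

// Inspect the pool's configured limits and the processor count.
ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
Console.WriteLine($"worker threads: min {minWorkers}, max {maxWorkers}");
Console.WriteLine($"processor count: {Environment.ProcessorCount}");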
The biggest thing to keep in mind here is that using TLS in the context of a thread pool is almost certainly the wrong thing to do.
You use TLS when you have control over the threads, and you want a thread to be able to maintain some data private or unique to that thread. That's the opposite of what happens when you are using the thread pool. Even in the simplest case, multiple tasks can use the same thread, and so would wind up sharing TLS. And in more complicated scenarios, such as when using await, a single task could wind up executed in different threads, and so that one task could wind up using different TLS values depending on what thread is assigned to that task at that moment.
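For contrast, here is a small sketch of TLS used as intended, with threads you own: each dedicated thread accumulates its own private counter, so there is nothing shared to race on.

var perThread = new ThreadLocal<int>(() => 0);
var threads = new List<Thread>();
for (int i = 0; i < 4; i++)
{
    var t = new Thread(() =>
    {
        for (int j = 0; j < 1000; j++)
            perThread.Value++;   // no race: each thread sees only its own slot
        Console.WriteLine($"thread {Thread.CurrentThread.ManagedThreadId}: {perThread.Value}");
    });
    threads.Add(t);
    t.Start();
}
threads.ForEach(t => t.Join());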
how does TLS prevent data races when the thread count is less than the task count?
That depends on what "data race problem" you're talking about.
The fact is, the code you posted is filled with problems that are at the very least odd, if not outright wrong. For example, you are passing account.Balance as the initial value for each task. But why? This value is evaluated when you create the task, before it could ever be modified later, so what's the point of passing it?
And if you thought you were passing whatever the current value is when the task starts, that seems like that would be wrong too. Why would it be valid to make the starting value for a given task vary according to how many tasks had already completed and been accounted for in your later loop? (To be clear: that's not what's happening…but even if it were, it'd be a strange thing to do.)
Beyond all that, it's not clear what you thought using TLS here would accomplish anyway. When each task starts, you reinitialize the TLS value to 0 (i.e. the value of account.Balance that you've passed to the Task<int> constructor). So no thread involved ever sees a value other than 0 during the context of executing any given task. A local variable would accomplish exactly the same thing, without the overhead of TLS and without confusing anyone who reads the code and tries to figure out why TLS was used when it adds no value to the code.
So, does TLS solve some sort of "data race problem"? Not in this example, it doesn't appear to. So asking how it does that is impossible to answer. It doesn't do that, so there is no "how".
For what it's worth, I modified your example slightly so that it would report the individual threads that were assigned to the tasks. I found that on my machine, the number of threads used varied between two and eight. This is consistent with my eight-core machine, with the variation due to how much the first thread in the pool can get done before the pool has initialized additional threads and assigned tasks to them. Most commonly, I would see the first thread completing between three and five of the tasks, with the remaining tasks handled by remaining individual threads.
In each case, the thread pool created eight threads as soon as the tasks were started. But most of the time, at least one of those threads wound up unused, because the other threads were able to complete the tasks before the pool was saturated. That is, there is overhead in the thread pool just managing the tasks, and in your example the tasks are so inexpensive that this overhead allows one or more thread pool threads to finish one task before the thread pool needs that thread for another.
I've copied that version below. Note that I also added a delay between trial iterations, to allow the thread pool to terminate the threads it created (on my machine, this took 20 seconds, hence the delay time hard-coded…you can see the threads being terminated in the debugger output).
static void Main(string[] args)
{
    while (_PromptContinue())
    {
        // create the bank account instance
        BankAccount account = new BankAccount();
        // create an array of tasks
        Task<int>[] tasks = new Task<int>[10];
        // create the thread local storage
        ThreadLocal<int> tlsBalance = new ThreadLocal<int>();
        ThreadLocal<(int Id, int Count)> tlsIds = new ThreadLocal<(int, int)>(
            () => (Thread.CurrentThread.ManagedThreadId, 0), true);
        for (int i = 0; i < 10; i++)
        {
            int k = i;
            // create a new task
            tasks[i] = new Task<int>((stateObject) =>
            {
                // get the state object and use it
                // to set the TLS data
                tlsBalance.Value = (int)stateObject;
                (int id, int count) = tlsIds.Value;
                tlsIds.Value = (id, count + 1);
                Console.WriteLine($"task {k}: thread {id}, initial value {tlsBalance.Value}");
                // enter a loop for 1000 balance updates
                for (int j = 0; j < 1000; j++)
                {
                    // update the TLS balance
                    tlsBalance.Value++;
                }
                // return the updated balance
                return tlsBalance.Value;
            }, account.Balance);
            // start the new task
            tasks[i].Start();
        }
        // Make sure this thread isn't busy at all while the thread pool threads are working
        Task.WaitAll(tasks);
        // get the result from each task and add it to
        // the balance
        for (int i = 0; i < 10; i++)
        {
            account.Balance += tasks[i].Result;
        }
        // write out the counter value
        Console.WriteLine("Expected value {0}, Balance: {1}", 10000, account.Balance);
        Console.WriteLine("{0} thread ids used: {1}",
            tlsIds.Values.Count,
            string.Join(", ", tlsIds.Values.Select(t => $"{t.Id} ({t.Count})")));
        System.Diagnostics.Debug.WriteLine("done!");
        _Countdown(TimeSpan.FromSeconds(20));
    }
}

private static void _Countdown(TimeSpan delay)
{
    System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
    TimeSpan remaining = delay - sw.Elapsed,
        sleepMax = TimeSpan.FromMilliseconds(250);
    int cchMax = $"{delay.TotalSeconds,2:0}".Length;
    string format = $"\r{{0,{cchMax}:0}}", previousText = null;
    while (remaining > TimeSpan.Zero)
    {
        string nextText = string.Format(format, remaining.TotalSeconds);
        if (previousText != nextText)
        {
            Console.Write(format, remaining.TotalSeconds);
            previousText = nextText;
        }
        Thread.Sleep(remaining > sleepMax ? sleepMax : remaining);
        remaining = delay - sw.Elapsed;
    }
    Console.Write(new string(' ', cchMax));
    Console.Write('\r');
}

private static bool _PromptContinue()
{
    Console.Write("Press Esc to exit, any other key to proceed: ");
    try
    {
        return Console.ReadKey(true).Key != ConsoleKey.Escape;
    }
    finally
    {
        Console.WriteLine();
    }
}
I have a counter that counts the number of large reports currently being processed:
private int processedLargeReports;
and I'm generating and starting five threads, where each thread accesses this method:
public bool GenerateReport(EstimatedReportSize reportSize)
{
    var currentDateTime = DateTimeFactory.Instance.DateTimeNow;
    bool allowLargeReports = (this.processedLargeReports < Settings.Default.LargeReportLimit);
    var reportOrderNextInQueue = this.ReportOrderLogic.GetNextReportOrderAndLock(
        currentDateTime.AddHours(this.timeoutValueInHoursBeforeReleaseLock),
        reportSize,
        CorrelationIdForPickingReport,
        allowLargeReports);
    if (reportOrderNextInQueue.IsProcessing)
    {
        Interlocked.Increment(ref this.processedLargeReports);
    }
    var currentReport = this.GetReportToBeWorked(reportOrderNextInQueue);
    var works = this.WorkTheReport(reportOrderNextInQueue, currentReport, currentDateTime);
    if (reportOrderNextInQueue.IsProcessing)
    {
        Interlocked.Decrement(ref this.processedLargeReports);
    }
    return works;
}
the "reportOrderNextInQueue" variable gets a reportorder from the database and checks whether the report order is either "Normal" or "Large" (this is achieved by defining the bool IsProcessing property of reportOrderNextInQueue variable). In case of a large report, the system then Interlock Increments the processedLargeReport int and processes the large report. Once the large report is processed, the system Interlock Decrements the value.
The whole idea is that I'll only allow a single report to be processed at a time, so once a thread is processing a large report, the other threads should not be able to access a large report in the database. The bool allowLargeReport variable checks whether the processedLargeReports int and is above the limit or not.
I'm curious whether this is the proper implementation, since I cannot test it before Monday. I'm not sure whether I have to use the InterLocked class or just define the processedLargeReports variable as a volatile member.
Say you have 5 threads starting to run the code above, and LargeReportLimit is 1. They will all read processedLargeReports as 0, allowLargeReports will be true for all of them, and they will start processing 5 items simultaneously, even though your limit is 1. So I don't really see how this code achieves your goal, if I understand it correctly.
To expand on it a bit: you read processedLargeReports and then act on it (use it to check whether you should allow a report to be processed). You act as if this variable cannot change between the read and the act, but that is not true. Any number of threads can do anything with processedLargeReports between your read and your act, because you have no locking. Interlocked in this case only ensures that processedLargeReports eventually gets back to 0 after all threads have finished all tasks, but that is all.
If you need to limit concurrent access to some resource, just use the appropriate tool for the job: the Semaphore or SemaphoreSlim classes. Create a semaphore that allows LargeReportLimit threads in. Before processing a report, Wait on your semaphore; this blocks when the number of concurrent threads processing reports is reached. When processing is done, Release the semaphore to let waiting threads in. There is no need for the Interlocked class here.
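A minimal sketch of that approach (the field and method names are invented; the limit comes from the question's settings):

private static readonly SemaphoreSlim largeReportGate =
    new SemaphoreSlim(Settings.Default.LargeReportLimit);

public void ProcessLargeReport()
{
    largeReportGate.Wait();      // blocks once LargeReportLimit workers are inside
    try
    {
        // ... fetch and work the large report ...
    }
    finally
    {
        largeReportGate.Release();
    }
}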
volatile does not provide thread safety. As usual with multithreading, you need some synchronization; it can be based on Interlocked, lock, or any other synchronization primitive, depending on your needs. You have chosen Interlocked, fine, but you have a race condition: you read the processedLargeReports field outside of any synchronization block and make a decision based on that value, yet it can change immediately after you read it, so the whole logic will not work. The correct way is to always do an Interlocked.Increment and base your logic on the returned value. Something like this:
First, let's use a better name for the field
private int processingLargeReports;
and then
public bool GenerateReport(EstimatedReportSize reportSize)
{
    var currentDateTime = DateTimeFactory.Instance.DateTimeNow;
    bool allowLargeReports =
        (Interlocked.Increment(ref this.processingLargeReports) <= Settings.Default.LargeReportLimit);
    if (!allowLargeReports)
        Interlocked.Decrement(ref this.processingLargeReports);
    var reportOrderNextInQueue = this.ReportOrderLogic.GetNextReportOrderAndLock(
        currentDateTime.AddHours(this.timeoutValueInHoursBeforeReleaseLock),
        reportSize,
        CorrelationIdForPickingReport,
        allowLargeReports);
    if (allowLargeReports && !reportOrderNextInQueue.IsProcessing)
        Interlocked.Decrement(ref this.processingLargeReports);
    var currentReport = this.GetReportToBeWorked(reportOrderNextInQueue);
    var works = this.WorkTheReport(reportOrderNextInQueue, currentReport, currentDateTime);
    if (allowLargeReports && reportOrderNextInQueue.IsProcessing)
        Interlocked.Decrement(ref this.processingLargeReports);
    return works;
}
Note that this also contains race conditions, but holds your LargeReportLimit constraint.
EDIT: Now that I think about it: since your processing depends both on whether large reports are allowed and on whether the report is large, Interlocked is not a good choice; better to use a Monitor-based approach like:
private int processingLargeReports;
private object processingLargeReportsLock = new object();

private void AcquireProcessingLargeReportsLock(ref bool lockTaken)
{
    Monitor.Enter(this.processingLargeReportsLock, ref lockTaken);
}

private void ReleaseProcessingLargeReportsLock(ref bool lockTaken)
{
    if (!lockTaken) return;
    Monitor.Exit(this.processingLargeReportsLock);
    lockTaken = false;
}

public bool GenerateReport(EstimatedReportSize reportSize)
{
    bool lockTaken = false;
    try
    {
        this.AcquireProcessingLargeReportsLock(ref lockTaken);
        bool allowLargeReports = (this.processingLargeReports < Settings.Default.LargeReportLimit);
        if (!allowLargeReports)
        {
            this.ReleaseProcessingLargeReportsLock(ref lockTaken);
        }
        var currentDateTime = DateTimeFactory.Instance.DateTimeNow;
        var reportOrderNextInQueue = this.ReportOrderLogic.GetNextReportOrderAndLock(
            currentDateTime.AddHours(this.timeoutValueInHoursBeforeReleaseLock),
            reportSize,
            CorrelationIdForPickingReport,
            allowLargeReports);
        if (reportOrderNextInQueue.IsProcessing)
        {
            this.processingLargeReports++;
        }
        // Release unconditionally here so the lock is never held while the
        // report is being worked (the release method is a no-op if the lock
        // was already given up above).
        this.ReleaseProcessingLargeReportsLock(ref lockTaken);
        var currentReport = this.GetReportToBeWorked(reportOrderNextInQueue);
        var works = this.WorkTheReport(reportOrderNextInQueue, currentReport, currentDateTime);
        if (reportOrderNextInQueue.IsProcessing)
        {
            this.AcquireProcessingLargeReportsLock(ref lockTaken);
            this.processingLargeReports--;
        }
        return works;
    }
    finally
    {
        this.ReleaseProcessingLargeReportsLock(ref lockTaken);
    }
}
Let's say I have a business object that is very expensive to instantiate, and I would never want to create more than say 10 instances of that object in my application. So, that would mean I would never want to have more than 10 concurrent worker threads running at one time.
I'd like to use the new System.Threading.Tasks to create a task like this:
var task = Task.Factory.StartNew(() => myPrivateObject.DoSomethingProductive());
Is there a sample out there that would show how to:
create an 'object pool' for use by the TaskFactory?
limit the TaskFactory to a specified number of threads?
lock an instance in the object pool so it can only be used by one task at a time?
Igby's answer led me to this excellent blog post from Justin Etheridge, which then prompted me to write this sample:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

namespace MyThreadedApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            // build a list of 10 expensive working object instances
            var expensiveStuff = new BlockingCollection<ExpensiveWorkObject>();
            for (int i = 65; i < 75; i++)
            {
                expensiveStuff.Add(new ExpensiveWorkObject(Convert.ToChar(i)));
            }
            Console.WriteLine("{0} expensive objects created", expensiveStuff.Count);

            // build a list of work to be performed
            Random r = new Random();
            var work = new ConcurrentQueue<int>();
            for (int i = 0; i < 1000; i++)
            {
                work.Enqueue(r.Next(10000));
            }
            Console.WriteLine("{0} items in work queue", work.Count);

            // process the list of work items in fifteen threads
            for (int i = 0; i < 15; i++)
            {
                Task.Factory.StartNew(() =>
                {
                    while (true)
                    {
                        var expensiveThing = expensiveStuff.Take();
                        try
                        {
                            int workValue;
                            if (!work.TryDequeue(out workValue))
                            {
                                break; // the queue is drained; let this worker exit
                            }
                            expensiveThing.DoWork(workValue);
                        }
                        finally
                        {
                            expensiveStuff.Add(expensiveThing);
                        }
                    }
                });
            }
        }
    }
}

class ExpensiveWorkObject
{
    char identity;

    public void DoWork(int someDelay)
    {
        System.Threading.Thread.Sleep(someDelay);
        Console.WriteLine("{0}: {1}", identity, someDelay);
    }

    public ExpensiveWorkObject(char identifier)
    {
        identity = identifier;
    }
}
So, I'm using the BlockingCollection as an object pool, and the worker threads don't check the queue for available work until they have exclusive control over one of the expensive object instances. I think this meets my requirements, but I would really like feedback from people who know this stuff better than I do...
Two thoughts:
Limited Concurrency Scheduler
You can use a custom task scheduler which limits the number of concurrent tasks. Internally it runs at most n tasks at a time; if you pass it more tasks than it can currently run, it puts them in a queue. Adding custom schedulers like this is a design feature of the TPL.
Here is a good example of such a scheduler. I have successfully used a modified version of this.
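Usage looks roughly like this, assuming the LimitedConcurrencyLevelTaskScheduler from that linked sample:

// Limit the factory to 10 concurrent tasks via the sample scheduler.
var scheduler = new LimitedConcurrencyLevelTaskScheduler(10);
var factory = new TaskFactory(scheduler);
var task = factory.StartNew(() => myPrivateObject.DoSomethingProductive());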
Object Pool
Another option is to use an object pool. It's a very similar concept except that instead of putting the limitation at the task level, you put it on the number of object instances, and force tasks to wait for a free instance to become available. This has the benefit of reducing the overhead of object creation, but you need to ensure the object is written in a way that allows instances of it to be recycled. You could create an object pool around a concurrent producer-consumer collection such as ConcurrentStack where the consumer adds the instance back to the collection when it's finished.
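A bare-bones sketch of that shape (an illustrative type, not a library class):

using System.Collections.Concurrent;
using System.Collections.Generic;

// Pool built on ConcurrentStack: TryTake pops a free instance, Return pushes it back.
public class ObjectPool<T>
{
    private readonly ConcurrentStack<T> items = new ConcurrentStack<T>();

    public ObjectPool(IEnumerable<T> seed)
    {
        foreach (var item in seed)
            items.Push(item);
    }

    public bool TryTake(out T item) => items.TryPop(out item);

    public void Return(T item) => items.Push(item);
}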
I have a queue, a list with producer threads and a list with consumer threads.
My code looks like this:
public class Runner
{
    List<Thread> Producers;
    List<Thread> Consumers;
    Queue<int> queue;
    Random random;

    public Runner()
    {
        Producers = new List<Thread>();
        Consumers = new List<Thread>();
        for (int i = 0; i < 2; i++)
        {
            Thread thread = new Thread(Produce);
            Producers.Add(thread);
        }
        for (int i = 0; i < 2; i++)
        {
            Thread thread = new Thread(Consume);
            Consumers.Add(thread);
        }
        queue = new Queue<int>();
        random = new Random();
        Producers.ForEach((thread) => { thread.Start(); });
        Consumers.ForEach((thread) => { thread.Start(); });
    }

    protected void Produce()
    {
        while (true)
        {
            int number = random.Next(0, 99);
            queue.Enqueue(number);
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Produce: " + number);
        }
    }

    protected void Consume()
    {
        while (true)
        {
            if (queue.Any())
            {
                int number = queue.Dequeue();
                Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Consume: " + number);
            }
            else
            {
                Console.WriteLine("No items to consume");
            }
        }
    }
}
Shouldn't this fail miserably because of the missing lock keyword?
It failed once because it tried to dequeue when the queue was empty; using the lock keyword will fix that, right?
If the lock keyword is not needed for the above code, when is it needed then?
Thank you in advance! =)
Locking is done to eliminate aberrant behavior in an application, most specifically in multithreading. The most common goal is the elimination of a "race condition", which causes non-deterministic program behavior.
This is the behavior you saw: in one run you get an error because the queue has no items, in another run you have no issues. This is a race condition. Proper use of locking will eliminate this scenario.
Using a Queue without locks is indeed not thread safe. But rather than using locks, you can try ConcurrentQueue. Google for "C# ConcurrentQueue" and you will find quite a lot of examples, e.g. this one compares the use and performance of Queue with a lock versus ConcurrentQueue.
To clarify the existing answers, if you have a multithreading problem (such as a race condition) then it isn't guaranteed to always fail - it may fail, in a very unpredictable manner.
The reason is that two (or more) threads that are accessing a resource may try to access it at different times - precisely when each of them tries to access it will depend on many factors (how fast your CPU is, how many processor cores it has available, what other programs are running at the time, whether you are running a release or debug build, or running under a debugger, etc). You could run it many times without the failure showing up, and then have it suddenly and "inexplicably" fail - this can make these errors extremely hard to track down because they don't often show up while you're writing the faulty code, but more often when you are writing a different unrelated piece of code.
If you are going to use multithreading, it is vital that you read up on the subject and gain an understanding of what can go wrong, when, and how to handle it properly. Bad use of locking can be just as dangerous (if not more so) as not using locks at all, since locking can cause deadlocks where your program simply "locks up". This area of programming must be approached carefully!
Yes this code will fail. The queue needs to support multi-threading. Use a ConcurrentQueue. See http://msdn.microsoft.com/en-us/library/dd267265.aspx
By running your code I received an InvalidOperationException: "Collection was modified after the enumerator was instantiated." It means that you are modifying data from several threads at once.
You can take a lock every time you Enqueue or Dequeue, because you modify the queue from several threads. A far better option is to use ConcurrentQueue, as it is a thread-safe, lock-free concurrent collection. It also provides better performance.
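The lock-per-operation version looks like this (a sketch wrapping the question's queue field; the method names are invented):

private readonly object queueLock = new object();

private void SafeEnqueue(int number)
{
    lock (queueLock) { queue.Enqueue(number); }
}

private bool TrySafeDequeue(out int number)
{
    lock (queueLock)
    {
        // Checking Count inside the same lock also fixes the empty-dequeue race.
        if (queue.Count > 0)
        {
            number = queue.Dequeue();
            return true;
        }
        number = 0;
        return false;
    }
}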
Yep, you would definitely need to synchronize access to the Queue to make it thread-safe. But you have another problem: there is no mechanism to keep the consumers from spinning wildly around the loop. Synchronizing access to the Queue or using ConcurrentQueue will not fix that problem.
The simplest way to implement the producer-consumer pattern is to use a blocking queue. Fortunately, .NET 4.0 provides the BlockingCollection which is, despite the name, an implementation of a blocking queue.
public class Runner
{
    private BlockingCollection<int> queue = new BlockingCollection<int>();
    private Random random = new Random();

    public Runner()
    {
        for (int i = 0; i < 2; i++)
        {
            var thread = new Thread(Produce);
            thread.Start();
        }
        for (int i = 0; i < 2; i++)
        {
            var thread = new Thread(Consume);
            thread.Start();
        }
    }

    protected void Produce()
    {
        while (true)
        {
            int number = random.Next(0, 99);
            queue.Add(number);
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Produce: " + number);
        }
    }

    protected void Consume()
    {
        while (true)
        {
            int number = queue.Take();
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " Consume: " + number);
        }
    }
}
Current implementation: waits until parallelCount values are collected, uses the ThreadPool to process the values, waits until all threads complete, collects another set of values, and so on...
Code:
private static int parallelCount = 5;
private int taskIndex;
private object[] paramObjects = new object[parallelCount];
// Signals when the current batch of queued work items has finished.
private ManualResetEvent resetEvent = new ManualResetEvent(false);
// Each ThreadPool thread should access only one item of the array,
// release object when done, to be used by another thread
private object[] reusableObjects = new object[parallelCount];

private void MultiThreadedGenerate(object paramObject)
{
    paramObjects[taskIndex] = paramObject;
    taskIndex++;
    if (taskIndex == parallelCount)
    {
        MultiThreadedGenerate();
        // Reset
        taskIndex = 0;
    }
}

/*
 * Called when 'paramObjects' array gets filled
 */
private void MultiThreadedGenerate()
{
    int remainingToGenerate = paramObjects.Length;
    resetEvent.Reset();
    for (int i = 0; i < paramObjects.Length; i++)
    {
        ThreadPool.QueueUserWorkItem(delegate(object obj)
        {
            try
            {
                int currentIndex = (int)obj;
                Generate(currentIndex, paramObjects[currentIndex], reusableObjects[currentIndex]);
            }
            finally
            {
                if (Interlocked.Decrement(ref remainingToGenerate) == 0)
                {
                    resetEvent.Set();
                }
            }
        }, i);
    }
    resetEvent.WaitOne();
}
I've seen significant performance improvements with this approach, however there are a number of issues to consider:
[1] Collecting values in paramObjects and synchronizing with resetEvent can be avoided, as there is no dependency between the threads (or between the current set of values and the next set). I'm only doing this to manage access to reusableObjects: when a set of paramObjects is done processing, I know that all objects in reusableObjects are free, so taskIndex is reset and each new task in the next set of values gets its own unique reusable object to work with.
[2] There is no real connection between the size of reusableObjects and the number of threads the ThreadPool uses. I might initialize reusableObjects with 10 objects, and if, due to some limitation, the ThreadPool can run only 3 threads for my MultiThreadedGenerate() method, then I'm wasting memory.
So, by getting rid of paramObjects, how can the above code be refined so that as soon as one thread completes its job, it returns the taskIndex (or the reusableObj) it used and no longer needs, making it available for the next value? Also, the code should create a reusable object and add it to some collection only when there is demand for it. Is using a Queue here a good idea?
Thank you.
There's really no reason to do your own manual threading and task management any more. You could restructure this to a more loosely-coupled model using Task Parallel Library (and possibly System.Collections.Concurrent for result collation).
Performance could be further improved if you don't need to wait for a full complement of work before handing off each Task for processing.
TPL came along in .Net 4.0 but was back-ported to .Net 3.5. Download here.
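To make the shape concrete, here is a rough sketch along those lines, using a BlockingCollection as the pool of reusable objects (ReusableObject and the two-argument Generate call are stand-ins for the question's own types and method):

var reusables = new BlockingCollection<ReusableObject>();
for (int i = 0; i < parallelCount; i++)
    reusables.Add(new ReusableObject());

Task StartGenerate(object paramObject)
{
    return Task.Factory.StartNew(() =>
    {
        var reusable = reusables.Take();        // blocks until an instance is free
        try
        {
            Generate(paramObject, reusable);    // stand-in for the question's Generate(...)
        }
        finally
        {
            reusables.Add(reusable);            // hand the instance back immediately
        }
    });
}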