Lock a single access variable for parallel threads in C#

Hello, I have this code:
var queue = new BlockingCollection<int>();
queue.Add(0);
var producers = Enumerable.Range(1, 3)
    .Select(_ => Task.Factory.StartNew(() =>
    {
        Enumerable.Range(1, queue.Count)
            .ToList().ForEach(i =>
            {
                lock (queue)
                {
                    if (!queue.Contains(i))
                    {
                        Console.WriteLine("Thread" + Task.CurrentId.ToString());
                        queue.Add(i);
                    }
                }
                Thread.Sleep(100);
            });
    }))
    .ToArray();
Task.WaitAll(producers);
queue.CompleteAdding();
foreach (var item in queue.GetConsumingEnumerable())
{
    Console.WriteLine(item.ToString());
}
But I need the Enumerable.Range(1, queue.Count) to be increased each time a single thread adds something via queue.Add(i), so that the code keeps executing until there are no more items to be added to the queue. I hope you understand the question.
In other words, I need this action to run indefinitely until I tell it to stop.
Any suggestions?

I'm sorry to say, but I can't understand your motives for writing something like that without further explanation :(
Is the following code useful to you in any way? Because I don't think it is :P
int n = 2;
Task[] producers = Enumerable.Range(1, 3).Select(_ =>
    Task.Factory.StartNew(() =>
    {
        while (queue.Count < n)
        {
            lock (queue)
            {
                if (!queue.Contains(n))
                {
                    Console.WriteLine("Thread" + Task.CurrentId);
                    queue.Add(n);
                    Interlocked.Increment(ref n);
                }
            }
            Thread.Sleep(100);
        }
    }))
    .ToArray();
I mean, it will just go on and on. It's like a reeeeeeaaallllyyy strange way of just adding numbers to a List.
Please explain your objective and we might be able to help you.

I see - what you need is a BlockingCollection, which came with .NET 4.0.
It allows you to implement the Producer-Consumer pattern.
Multiple threads or tasks can add items to the collection concurrently. Multiple consumers can remove items concurrently, and if the collection becomes empty, the consuming threads will block and wait until a producer adds an item. Over and over again ...
... until a special method is called by the producer to mark the end, telling the consumer: "Hey, stop waiting there - nothing will come anymore!"
I am not posting code samples, because there are some under the given link. You can find much more if you just google for the Producer-Consumer pattern and/or BlockingCollection.
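For a quick taste, though, here is a minimal sketch of the pattern (the int items and the counts are placeholders, not code from the question):
var queue = new BlockingCollection<int>();

// Producer: adds items, then signals that nothing more will come.
var producer = Task.Run(() =>
{
    for (int i = 0; i < 10; i++)
        queue.Add(i);
    queue.CompleteAdding(); // the special method: "stop waiting - nothing will come anymore!"
});

// Consumer: blocks whenever the collection is empty, and exits
// once it is both empty and marked as complete.
var consumer = Task.Run(() =>
{
    foreach (var item in queue.GetConsumingEnumerable())
        Console.WriteLine(item);
});

Task.WaitAll(producer, consumer);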

Related

In C#, how do I process a large text file with multiple threads/tasks, but with conditions?

I am writing a file-processing program in C#. I have a HUGE text file, with 5 columns of data, each separated by a bar (|). The first column in each row contains a person's name, and each person has a unique name.
It's a very large text file, so I want to process it concurrently using multiple tasks. But I want every row with the same name to be processed by the SAME task, not a different task. For example, if (part of) my file reads:
Jason|BMW|354|23|1/1/2000|1:03
Jason|BMW|354|23|1/1/2000|1:03
Jason|BMW|354|23|1/1/2000|1:03
Jason|Acura|354|23|1/1/2000|1:03
Jason|BMW|354|23|1/1/2000|1:03
Jason|BMW|354|23|1/1/2000|1:03
Jason|Hyundai|392|17|1/1/2000|1:06
Mike|Infiniti|335|18|8/24/2005|7:11
Mike|Infiniti|335|18|8/24/2005|7:11
Mike|Infiniti|335|18|8/24/2005|7:11
Mike|Dodge|335|18|8/24/2005|7:18
Mike|Infiniti|335|18|8/24/2005|7:11
Mike|Infiniti|335|18|8/24/2005|7:14
Then I want one task processing ALL the Jason rows, and another task processing ALL the Mike rows. I don't want the first task processing any Mike rows, and conversely I don't want the second task processing any Jason rows. Essentially, how can I make it so that all rows of a certain name are all processed by the SAME task? ALSO, how will I know when all the processing of all the rows has been completed? I've been racking my tiny brain and I can't come up with a solution.
One idea is to implement the producer-consumer pattern, with one producer that reads the file line-by-line, and multiple consumers that process the lines, one consumer per name. Since the number of unique names may be large, it would be impractical to dedicate a Thread to each consumer, so the consumers should process the data asynchronously. Each consumer should have its own private queue with data to process.

The most efficient asynchronous queue currently available in .NET is the Channel<T> class, and using it as a building block would be a good idea, but I will suggest something higher-level than this: an ActionBlock<T> from the TPL Dataflow library. This component combines a processor and a queue, is async-enabled, and is highly configurable. So it will make for a succinct, quite readable, and hopefully quite efficient solution:
var processors = new Dictionary<string, ActionBlock<string>>();

foreach (var line in File.ReadLines(filePath))
{
    string name = ExtractName(line); // Reads the first part of the line
    if (!processors.TryGetValue(name, out ActionBlock<string> processor))
    {
        processor = CreateProcessor(name);
        processors.Add(name, processor);
    }
    var accepted = processor.Post(line);
    if (!accepted) break; // The processor has failed
}

// Signal that no more lines will be sent to the processors
foreach (var processor in processors.Values) processor.Complete();

// Aggregate the completion of all processors
Task allCompletions = Task.WhenAll(processors.Values.Select(p => p.Completion));

// Wait for the completion of all processors, and allow errors to propagate
allCompletions.Wait(); // or await allCompletions;

static ActionBlock<string> CreateProcessor(string name)
{
    return new ActionBlock<string>((string line) =>
    {
        // Process the line
    }, new ExecutionDataflowBlockOptions()
    {
        // Configure the options if the defaults are not optimal
    });
}
I'd go for a concurrent dictionary of concurrent queues, keyed by name.
In the main thread (call it the reader), loop line by line, enqueueing the lines to the appropriate concurrent queue (call these the worker queues), creating a new worker queue and dedicated task as needed when a new name is encountered.
It would look something like this (note: this is semi-pseudo code and semi-real code and has no error checking, so treat it as a base for a solution, not the solution).
class FileProcessor
{
    private ConcurrentDictionary<string, Worker> workers = new ConcurrentDictionary<string, Worker>();

    class Worker
    {
        public Worker() => Task = Task.Run(Process);

        private void Process()
        {
            foreach (var row in Queue.GetConsumingEnumerable())
            {
                if (row.Length == 0) break;
                ProcessRow(row);
            }
        }

        private void ProcessRow(string[] row)
        {
            // your implementation here
        }

        public Task Task { get; }
        public BlockingCollection<string[]> Queue { get; } = new BlockingCollection<string[]>(new ConcurrentQueue<string[]>());
    }

    void ProcessFile(string fileName)
    {
        foreach (var line in GetLinesOfFile(fileName))
        {
            var row = line.Split('|');
            var name = row[0];
            // create worker as needed
            var worker = workers.GetOrAdd(name, x => new Worker());
            // add a row for the worker to work on
            worker.Queue.Add(row);
        }

        // send an empty array to each worker to signal end of input
        foreach (var worker in workers.Values)
            worker.Queue.Add(new string[0]);

        // now wait for all workers to be done
        Task.WaitAll(workers.Values.Select(x => x.Task).ToArray());
    }

    private static IEnumerable<string> GetLinesOfFile(string fileName)
    {
        // this helps limit memory consumption by not loading
        // the whole file at once
        return File.ReadLines(fileName);
    }
}
I suggest that your reader thread stream the file rather than reading the entire file up front; you stated the file was huge, so streaming would be memory friendly. That reader thread is I/O bound, so if you can async/await it, that would be better than my simple Process() doing a foreach with no awaiting.
The features of this approach:
dedicated task per person's name
use of a sentinel value to signal end of input
use of Task.WaitAll to join back to the main thread
assumes the tasks are CPU bound. If they are I/O bound, consider using async/await and Task.WhenAll instead
file is streamed into memory with File.ReadLines()
names do not need to be sorted because the queue to enqueue to is selected by name on-demand
Refinements
In the interest of completeness, the approach above is a bit naive and can be refined by... reading all of the comments and answers; users Zoulias and Mercer in particular have good points. We can refine this approach with
adapt this to TPL Channels and use CompleteAdding. These are not only better abstractions, but also more efficient (abstraction and efficiency can often be at odds, but not in this case); see the sketch after the partitioning example below.
reduce the name-to-thread or name-to-task dedication, which can exhaust resources in the case of a large number of names, and instead map names to buckets or partitions where each bucket/partition has a dedicated task/thread.
For the second point, for example, you could have
// create worker as needed
var worker = workers.GetOrAdd(GetPartitionKey(name), x => new Worker());
where GetPartitionKey() could be implemented something like
private string GetPartitionKey(string name) =>
    name[0] switch
    {
        >= 'a' and <= 'f' => "A thru F bucket",
        >= 'A' and <= 'F' => "A thru F bucket",
        >= 'g' and <= 'k' => "G thru K bucket",
        >= 'G' and <= 'K' => "G thru K bucket",
        _ => "everything else bucket"
    };
or whatever algorithm you want to use as a partition selector.
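For the first refinement, a rough sketch of the same Worker rebuilt on Channel<T> (this assumes the System.Threading.Channels package and C# 8; the names and options here are illustrative, not prescriptive):
class ChannelWorker
{
    private readonly Channel<string[]> _channel =
        Channel.CreateUnbounded<string[]>(new UnboundedChannelOptions
        {
            SingleReader = true // exactly one consumer task per worker
        });

    public ChannelWorker() => Completion = Task.Run(ProcessAsync);

    public Task Completion { get; }

    public ValueTask AddAsync(string[] row) => _channel.Writer.WriteAsync(row);

    // The channel equivalent of CompleteAdding: no sentinel row needed.
    public void Complete() => _channel.Writer.Complete();

    private async Task ProcessAsync()
    {
        await foreach (var row in _channel.Reader.ReadAllAsync())
            ProcessRow(row);
    }

    private void ProcessRow(string[] row)
    {
        // your implementation here
    }
}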
how can I make it so that all rows of a certain name are all processed by the SAME task?
A System.Threading.Tasks.Task can be created with various TaskCreationOptions that dictate how and when its threads and resources are managed during its lifetime. For an operation that consumes a large amount of data and, furthermore, segregates that consumption to specific threads, you may want to create the tasks responsible for individual names with the option TaskCreationOptions.LongRunning, which provides a hint to the task scheduler that an additional thread might be required, so that the task does not block the forward progress of other threads or work items on the local thread-pool queue.
For the actual how, I would recommend starting various 'Worker' threads, each with their own Task, plus a way for your main task (the one reading the file, or parsing the JSON data) to tell them that more work needs to be completed.
Consider the use of thread-safe collections such as ConcurrentQueue<T> or the other various collections that may help you in safely streaming data between threads for consumption.
Here's a very limited example of the structure you may want to consider:
void Worker(ConcurrentQueue<string> Queue, CancellationToken Token)
{
    // keep the worker in a loop until cancellation is requested
    while (Token.IsCancellationRequested is false)
    {
        // check to see if the queue has stuff, and consume it
        if (Queue.TryDequeue(out string line))
        {
            Console.WriteLine($"Consumed Line {line} {Thread.CurrentThread.ManagedThreadId}");
        }
        // yield the thread in case other threads have work to do
        Thread.Sleep(10);
    }
    Console.WriteLine("Finished Work");
}

// data could be a reader, list, array - anything really
IEnumerable<string> Data()
{
    yield return "JASON";
    yield return "Mike";
    yield return "JASON";
    yield return "Mike";
}

void Reader()
{
    // create some collections to stream the data to other tasks
    ConcurrentQueue<string> Jason = new();
    ConcurrentQueue<string> Mike = new();

    // make sure we have a way to cancel the workers if we need to
    CancellationTokenSource tokenSource = new();

    // start some worker tasks that will consume the data
    Task[] workers = {
        new Task(() => Worker(Jason, tokenSource.Token), TaskCreationOptions.LongRunning),
        new Task(() => Worker(Mike, tokenSource.Token), TaskCreationOptions.LongRunning)
    };
    for (int i = 0; i < workers.Length; i++)
    {
        workers[i].Start();
    }

    // iterate the data and send it off to the queues for consumption
    foreach (string line in Data())
    {
        switch (line)
        {
            case "JASON":
                Console.WriteLine($"Sent line to JASON {Thread.CurrentThread.ManagedThreadId}");
                Jason.Enqueue(line);
                break;
            case "Mike":
                Console.WriteLine($"Sent line to Mike {Thread.CurrentThread.ManagedThreadId}");
                Mike.Enqueue(line);
                break;
            default:
                Console.WriteLine($"Disposed unknown line {Thread.CurrentThread.ManagedThreadId}");
                break;
        }
    }

    // make sure that worker threads are cancelled if the parent task has been cancelled
    try
    {
        // wait for workers to finish by checking collections;
        // keep waiting while EITHER queue still has items
        do
        {
            Thread.Sleep(10);
        } while (Jason.IsEmpty is false || Mike.IsEmpty is false);
    }
    finally
    {
        // cancel the worker threads, if they haven't been already
        tokenSource.Cancel();
    }
}

// make sure we have a way to cancel the reader if we need to
CancellationTokenSource tokenSource = new();

// start the reader thread
Task[] tasks = { Task.Run(Reader, tokenSource.Token) };

Console.WriteLine("Starting Reader");
Task.WaitAll(tasks);
Console.WriteLine("Finished Reader");

// cleanup the tasks if they are still running somehow
tokenSource?.Cancel();

// dispose of the IDisposable object
tokenSource?.Dispose();

Console.ReadLine();

What are the functional benefits of recursive scheduling in System.Reactive?

I'm currently reading http://www.introtorx.com/ and I'm getting really interested in stripping Subject<T> out of my reactive code. I'm starting to understand how to encapsulate sequence generation so that I can reason better about a given sequence. I read a few SO questions and ended up reading about scheduling. Of particular interest is recursive scheduling, using the Schedule(this IScheduler scheduler, Action<TState, Action<TState>>) overloads - like this one.
The book is starting to show its age in a few areas, and the biggest I see is that it never compares its techniques to alternatives that may be achieved using the Task and async/await language features. I always end up feeling like I could write less code by ignoring the book's advice and using the asynchronous toys, but the back of my mind nags me about being lazy and not learning the pattern properly.
With that, here is my question. If I wanted to schedule a sequence at an interval, support cancellation, and perform work on a background thread, I might do this:
static void Main(string[] args)
{
    var sequence = Observable.Create<object>(o =>
    {
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
        DoWerk(o, cancellationTokenSource);
        return cancellationTokenSource.Cancel;
    });

    sequence.Subscribe(p => Console.Write(p));
    Console.ReadLine();
}

private static async void DoWerk(IObserver<object> o, CancellationTokenSource cancellationTokenSource)
{
    string message = "hello world!";
    for (int i = 0; i < message.Length; i++)
    {
        await Task.Delay(250, cancellationTokenSource.Token);
        o.OnNext(message[i]);
        if (cancellationTokenSource.IsCancellationRequested)
        {
            break;
        }
    }
    o.OnCompleted();
}
Note the use of async void to create concurrency without explicitly borrowing a thread pool thread with Task.Run(). await Task.Delay() will, however, do just that, but it will not lease the thread for long.
What are the limitations and pitfalls here? What are the reasons that you might prefer to use recursive scheduling?
I personally wouldn't use await Task.Delay(250, cancellationTokenSource.Token); as a way to slow down a loop. It's better than Thread.Sleep(250), but it's still a code smell to me.
My take is that you should use a built-in operator in preference to a roll-your-own solution like this.
The operator you need is one of the most powerful, but often overlooked. Try Observable.Generate. Here's how:
static void Main(string[] args)
{
    IObservable<char> sequence = Observable.Create<char>(o =>
    {
        string message = "hello world!";
        return
            Observable
                .Generate(
                    0,
                    n => n < message.Length,
                    n => n + 1,
                    n => message[n],
                    n => TimeSpan.FromMilliseconds(250.0))
                .Subscribe(o);
    });

    using (sequence.Subscribe(p => Console.Write(p)))
    {
        Console.ReadLine();
    }
}
This is self-cancelling (when you call .Dispose() on the subscription) and produces values every 250.0 milliseconds.
I've continued to use the Observable.Create operator to ensure that the message variable is encapsulated within the observable - otherwise it is possible for someone to change the value of message as the observable is working with it and thus break it.
As an alternative that might not be as efficient with memory, but is self-encapsulating, try this:
IObservable<char> sequence =
    Observable
        .Generate(
            "hello world!",
            n => !String.IsNullOrEmpty(n),
            n => n.Substring(1),
            n => n[0],
            n => TimeSpan.FromMilliseconds(250.0));
And, finally, there's nothing "recursive" about the scheduling in your question. What did you mean by that?
I finally figured out what you're looking at. I missed it in the question.
Here's an example using recursive scheduling:
IObservable<char> sequence = Observable.Create<char>(o =>
{
    string message = "hello world!";
    return Scheduler.Default.Schedule<string>(message, TimeSpan.FromMilliseconds(250.0), (state, schedule) =>
    {
        if (!String.IsNullOrEmpty(state))
        {
            o.OnNext(state[0]);
            schedule(state.Substring(1), TimeSpan.FromMilliseconds(250.0));
        }
        else
        {
            o.OnCompleted();
        }
    });
});

Start threads in the order that they were requested, only when the previous thread has finished

Sorry for the confusing title, but that's basically what I need. I could do something with global variables, but that would only be viable for 2 threads that are requested one after the other.
Here is some pseudo code that might explain it better.
/* Async function that gets requests from a server */
if () // received request from server
{
    new Thread(() =>
    {
        //do stuff
        //in the meantime a new thread has been requested from server
        //and another one 10 seconds later.. etc.
        //wait for this current thread to finish
        //fire up the first thread that was requested while this ongoing thread
        //after the second thread is finished fire up the third thread that was requested 10 seconds after this thread
        //etc...
    }).Start();
}
I don't know when each thread will be requested, as it is based on the server sending info to the client, so I can't do Task.ContinueWith as it's dynamic.
So Michael suggested that I look into queues, and I came up with this:
static Queue<Action> myQ = new Queue<Action>();

static void Main(string[] args)
{
    new Thread(() =>
    {
        while (1 == 1)
        {
            if (myQ.FirstOrDefault() == null)
                break;
            myQ.FirstOrDefault().Invoke();
        }
    }).Start();

    myQ.Enqueue(() =>
    {
        TestQ("First");
    });
    myQ.Enqueue(() =>
    {
        TestQ("Second");
    });

    Console.ReadLine();
}

private static void TestQ(string s)
{
    Console.WriteLine(s);
    Thread.Sleep(5000);
    myQ.Dequeue();
}
I commented the code; I basically need to check whether the action is first in the queue or not.
EDIT: So I re-made it, and now it works. Surely there is a better way to do this? Because I can't afford to use an infinite while loop.
You will have to use a global container for the threads. Maybe check Queues.
This class implements a queue as a circular array. Objects stored in a Queue are inserted at one end and removed from the other.
Queues and stacks are useful when you need temporary storage for information; that is, when you might want to discard an element after retrieving its value. Use Queue if you need to access the information in the same order that it is stored in the collection. Use Stack if you need to access the information in reverse order. Use ConcurrentQueue(Of T) or ConcurrentStack(Of T) if you need to access the collection from multiple threads concurrently.
Three main operations can be performed on a Queue and its elements:
Enqueue adds an element to the end of the Queue.
Dequeue removes the oldest element from the start of the Queue.
Peek returns the oldest element that is at the start of the Queue but does not remove it from the Queue.
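A quick sketch of those three operations (this example is mine, not part of the quoted documentation):
var queue = new Queue<string>();
queue.Enqueue("first");             // Enqueue adds at the end
queue.Enqueue("second");
Console.WriteLine(queue.Peek());    // prints "first" - the element stays in the queue
Console.WriteLine(queue.Dequeue()); // prints "first" - the element is removed
Console.WriteLine(queue.Count);     // prints 1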
EDIT (From what you added)
Here is how I would change your example code to implement the infinite loop and keep it under your control.
static Queue<Action> myQ = new Queue<Action>();

static void Main(string[] args)
{
    myQ.Enqueue(() =>
    {
        TestQ("First");
    });
    myQ.Enqueue(() =>
    {
        TestQ("Second");
    });

    Thread thread = new Thread(() =>
    {
        while (true)
        {
            Thread.Sleep(5000);
            if (myQ.Count > 0)
            {
                myQ.Dequeue().Invoke();
            }
        }
    });
    thread.Start();

    // Do other stuff, eventually signaling the thread to stop its infinite loop.
    Console.ReadLine();
}

private static void TestQ(string s)
{
    Console.WriteLine(s);
}
You could put the requests that you receive into a queue if there is a thread currently running. Then, to find out when threads return, they could fire an event. When this event fires, if there is something in the queue, start a new thread to process this new request.
The only thing with this is that you have to be careful about race conditions, since you are essentially communicating between multiple threads.
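To make that concrete, here is a rough sketch of the event-driven hand-off (all names are made up; the retry loop guards the obvious enqueue-versus-finish race mentioned above):
class SequentialRunner
{
    private readonly ConcurrentQueue<Action> _pending = new ConcurrentQueue<Action>();
    private int _running; // 0 = idle, 1 = a worker thread is active

    // Called whenever a request arrives from the server.
    public void Submit(Action work)
    {
        _pending.Enqueue(work);
        TryStartNext();
    }

    private void TryStartNext()
    {
        while (true)
        {
            // Only start a worker if none is currently running.
            if (Interlocked.CompareExchange(ref _running, 1, 0) != 0) return;

            if (_pending.TryDequeue(out var work))
            {
                new Thread(() =>
                {
                    try { work(); }
                    finally
                    {
                        Interlocked.Exchange(ref _running, 0);
                        TryStartNext(); // the "event": kick off the next queued request
                    }
                }).Start();
                return;
            }

            Interlocked.Exchange(ref _running, 0);
            // Re-check: a request may have been enqueued while we held the flag.
            if (_pending.IsEmpty) return;
        }
    }
}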

How to wait for a function execution for a specific duration

My C# application stops responding for a long time; when I break in the debugger, it stops on a function.
foreach (var item in list)
{
    xmldiff.Compare(item, secondary, output);
    ...
}
I guess the running time of this function is long, or it hangs. Anyway, I want to wait a certain time (e.g. 5 seconds) for this function to execute, and if it exceeds that time, skip it and go to the next item in the loop. How can I do it? I found some similar questions, but they are mostly about processes or asynchronous methods.
You can do it the brutal way: spin up a thread to do the work, join it with a timeout, then abort it if the join didn't work.
Example:
var worker = new Thread(() => { xmlDiff.Compare(item, secondary, output); });
worker.Start();
if (!worker.Join(TimeSpan.FromSeconds(1)))
    worker.Abort();
But be warned - aborting threads is not considered nice and can make your app unstable. If at all possible try to modify Compare to accept a CancellationToken to cancel the comparison.
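If you can change Compare, a cooperative version might look like the sketch below (illustrative only: the chunked loop stands in for whatever the real comparison does, and the placeholder objects stand in for your data):
object item = new object(), secondary = new object(), output = new object(); // placeholders

using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5))) // give up after 5 seconds
{
    try
    {
        Compare(item, secondary, output, cts.Token);
    }
    catch (OperationCanceledException)
    {
        // timed out - skip this item and move on to the next one
    }
}

static void Compare(object a, object b, object c, CancellationToken token)
{
    for (int step = 0; step < 1000; step++) // stand-in for the real comparison loop
    {
        token.ThrowIfCancellationRequested(); // cheap check between units of work
        // ... compare the next chunk ...
    }
}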
I would avoid directly using threads and use Microsoft's Reactive Extensions (NuGet "Rx-Main") to abstract away the management of the threads.
I don't know the exact signature of xmldiff.Compare(item, secondary, output) but if I assume it produces an integer then I could do this with Rx:
var query =
    from item in list.ToObservable()
    from result in
        Observable
            .Start(() => xmldiff.Compare(item, secondary, output))
            .Timeout(TimeSpan.FromSeconds(5.0), Observable.Return(-1))
    select new { item, result };

var subscription =
    query
        .Subscribe(x =>
        {
            /* do something with `x.item` and/or `x.result` */
        });
This automatically iterates through each item and starts a background computation of xmldiff.Compare, but only allows each computation to take as much as 5.0 seconds before returning a default value of -1.
The subscription variable is an IDisposable, so if you want to abort the entire query before it completes just call .Dispose().
I skip it and go to the next item in the loop
By "skip it", do you mean "leave it there" or "cancel it"? The two scenarios are quite different. But for both two I suggest you use Task.
//generate 10 example tasks
var tasks = Enumerable
    .Range(0, 10)
    .Select(n => new Task(() => DoSomething(n)))
    .ToList();

var maxExecutionTime = TimeSpan.FromSeconds(5);

foreach (var task in tasks)
{
    task.Start(); // a bare new Task(...) is created cold and must be started explicitly

    if (task.Wait(maxExecutionTime))
    {
        //the task is finished in time
    }
    else
    {
        // the task is over time
        // just leave it there
        // the loop continues
        // if you want to cancel it, see
        // http://stackoverflow.com/questions/4783865/how-do-i-abort-cancel-tpl-tasks
    }
}
One thing to improve: do you really need to run your tasks one by one? If they are independent, you can run them in parallel.
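For the parallel variant, a minimal sketch (DoSomething is the same placeholder as above; note the single shared time budget instead of 5 seconds per task):
var tasks = Enumerable
    .Range(0, 10)
    .Select(n => Task.Run(() => DoSomething(n))) // Task.Run creates AND starts the task
    .ToList();

// Wait up to 5 seconds for the whole batch.
bool allDone = Task.WaitAll(tasks.ToArray(), TimeSpan.FromSeconds(5));
// allDone == false means at least one task is still running - leave it or cancel it as above.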

Parallel.ForEach fails to execute messages on long-running IEnumerable

Why will Parallel.ForEach not finish executing a series of tasks until MoveNext returns false?
I have a tool that monitors a combination of MSMQ and Service Broker queues for incoming messages. When a message is found, it hands that message off to the appropriate executor.
I wrapped the check for messages in an IEnumerable, so that I could hand the Parallel.ForEach method the IEnumerable plus a delegate to run. The application is designed to run continuously w/ the IEnumerator.MoveNext processing in a loop until it's able to get work, then the IEnumerator.Current giving it the next item.
Since the MoveNext will never die until I set the CancelToken to true, this should continue to process for ever. Instead what I'm seeing is that once the Parallel.ForEach has picked up all the messages and the MoveNext is no longer returning "true", no more tasks are processed. Instead it seems like the MoveNext thread is the only thread given any work while it waits for it to return, and the other threads (including waiting and scheduled threads) do not do any work.
Is there a way to tell the Parallel to keep working while it waits for a response from the MoveNext?
If not, is there another way to structure the MoveNext to get what I want? (having it return true and then the Current returning a null object spawns a lot of bogus Tasks)
Bonus Question: Is there a way to limit how many messages the Parallel pulls off at once? It seems to pull off and schedule a lot of messages at once (the MaxDegreeOfParallelism only seems to limit how much work it does at once, it doesn't stop it from pulling off a lot of messages to be scheduled)
Here is the IEnumerator for what I've written (w/o some extraneous code):
public class DataAccessEnumerator : IEnumerator<TransportMessage>
{
    public TransportMessage Current
    { get { return _currentMessage; } }

    public bool MoveNext()
    {
        while (_cancelToken.IsCancellationRequested == false)
        {
            TransportMessage current;
            foreach (var task in _tasks)
            {
                if (task.QueueType.ToUpper() == "MSMQ")
                    current = _msmq.Get(task.Name);
                else
                    current = _serviceBroker.Get(task.Name);

                if (current != null)
                {
                    _currentMessage = current;
                    return true;
                }
            }
            WaitHandle.WaitAny(new[] { _cancelToken.WaitHandle }, 500);
        }
        return false;
    }

    public DataAccessEnumerator(IDataAccess<TransportMessage> serviceBroker, IDataAccess<TransportMessage> msmq, IList<JobTask> tasks, CancellationToken cancelToken)
    {
        _serviceBroker = serviceBroker;
        _msmq = msmq;
        _tasks = tasks;
        _cancelToken = cancelToken;
    }

    private readonly IDataAccess<TransportMessage> _serviceBroker;
    private readonly IDataAccess<TransportMessage> _msmq;
    private readonly IList<JobTask> _tasks;
    private readonly CancellationToken _cancelToken;
    private TransportMessage _currentMessage;
}
Here is the Parallel.ForEach call where _queueAccess is the IEnumerable that holds the above IEnumerator and RunJob processes a TransportMessage that is returned from that IEnumerator:
var parallelOptions = new ParallelOptions
{
    CancellationToken = _cancelTokenSource.Token,
    MaxDegreeOfParallelism = 8
};

Parallel.ForEach(_queueAccess, parallelOptions, x => RunJob(x));
It sounds to me like Parallel.ForEach isn't really a good match for what you want to do. I suggest you use BlockingCollection<T> to create a producer/consumer queue instead - create a bunch of threads/tasks to service the blocking collection, and add work items to it as and when they arrive.
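A sketch of that reshaping, reusing the types and fields from the question (the consumer count of 8 and the bound of 100 are arbitrary; the bound also addresses the bonus question, because producers block once the collection is full):
var work = new BlockingCollection<TransportMessage>(boundedCapacity: 100);

// A fixed pool of consumers replaces Parallel.ForEach.
var consumers = Enumerable
    .Range(0, 8)
    .Select(_ => Task.Factory.StartNew(() =>
    {
        foreach (var msg in work.GetConsumingEnumerable(_cancelTokenSource.Token))
            RunJob(msg);
    }, TaskCreationOptions.LongRunning))
    .ToArray();

// The polling loop becomes the producer; the consumers never see it block.
while (!_cancelTokenSource.IsCancellationRequested)
{
    foreach (var task in _tasks)
    {
        var msg = task.QueueType.ToUpper() == "MSMQ" ? _msmq.Get(task.Name) : _serviceBroker.Get(task.Name);
        if (msg != null) work.Add(msg); // blocks while 100 items are already queued
    }
    Thread.Sleep(500);
}

work.CompleteAdding();
Task.WaitAll(consumers);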
Your problem might be to do with the Partitioner being used.
In your case, the TPL will choose the Chunk Partitioner, which will take multiple items from the enum before passing them on to be processed. The number of items taken in each chunk will increase with time.
When your MoveNext method blocks, the TPL is left waiting for the next item and won't process the items that it has already taken.
You have a couple of options to fix this:
1) Write a Partitioner that always returns individual items. Not as tricky as it sounds (see the note at the end of this answer).
2) Use the TPL instead of Parallel.ForEach:
foreach (var item in _queueAccess)
{
    var capturedItem = item;
    Task.Factory.StartNew(() => RunJob(capturedItem));
}
The second solution changes the behaviour a bit. The foreach loop will complete when all the Tasks have been created, not when they have finished. If this is a problem for you, you can add a CountdownEvent:
var ce = new CountdownEvent(1);

foreach (var item in _queueAccess)
{
    ce.AddCount();
    var capturedItem = item;
    Task.Factory.StartNew(() => { RunJob(capturedItem); ce.Signal(); });
}

ce.Signal();
ce.Wait();
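As a footnote to the first option: if you are on .NET 4.5 or later, you may not need to write the partitioner by hand at all, because Partitioner.Create has an overload that disables chunking (worth verifying against your framework version):
// Same call as in the question, but items are handed out one at a time
// instead of in growing chunks.
var noChunking = Partitioner.Create(_queueAccess, EnumerablePartitionerOptions.NoBuffering);
Parallel.ForEach(noChunking, parallelOptions, x => RunJob(x));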
I haven't gone to the effort to make sure of this, but the impression I'd received from discussions of Parallel.ForEach was that it would pull all the items out of the enumerable, then make appropriate decisions about how to divide them across threads. Based on your problem, that seems correct.
So, to keep most of your current code, you should probably pull the blocking code out of the iterator and place it into a loop around the call to Parallel.ForEach (which uses the iterator).
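Concretely, that restructuring might look like the following sketch (TryGetNextMessage is a hypothetical non-blocking helper wrapping the MSMQ/Service Broker polling that currently lives inside MoveNext):
// Drain whatever is currently available, process it as a finite batch, repeat.
while (!_cancelToken.IsCancellationRequested)
{
    var batch = new List<TransportMessage>();
    TransportMessage msg;
    while ((msg = TryGetNextMessage()) != null) // hypothetical non-blocking poll
        batch.Add(msg);

    if (batch.Count > 0)
        Parallel.ForEach(batch, parallelOptions, x => RunJob(x)); // finite input, so chunking is harmless
    else
        WaitHandle.WaitAny(new[] { _cancelToken.WaitHandle }, 500); // idle wait, as in the iterator
}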
