I'm running into a synchronization issue when reporting progress inside a Parallel.ForEach. I recreated a simplified version of the problem in a console app; the example only uses one item in the list. Here's the code:
class Program
{
static void Main(string[] args)
{
int tracker = 0;
Parallel.ForEach(Enumerable.Range(1, 1), (item) =>
{
var progress = new Progress<int>((p) =>
{
tracker = p;
Console.WriteLine(String.Format("{0}", p));
});
Test(progress);
});
Console.WriteLine("The last value is: {0}", tracker);
Console.ReadKey();
}
static void Test(IProgress<int> progress)
{
for (int i = 0; i < 20; i++)
{
progress.Report(i);
}
}
}
As you can see, the line I expect to be printed last isn't output last, and it doesn't contain 20. But if I remove the progress reporting and just write to the console in the for loop, like this:
class Program
{
static void Main(string[] args)
{
int tracker = 0;
Parallel.ForEach(Enumerable.Range(1, 1), (item) =>
{
tracker = Test();
});
Console.WriteLine("The last value is: {0}", tracker);
Console.ReadKey();
}
static int Test()
{
int i;
for (i = 0; i < 20; i++)
{
Console.WriteLine(i.ToString());
}
return i;
}
}
it behaves like I expect. As far as I know, Parallel.ForEach creates a Task for each item in the list, and Progress<T> captures the context in which it's created. Given that it's a console app, I didn't think that would matter. Help please!
The explanation is pretty much exactly what's written in the docs:
Any handler provided to the constructor or event handlers registered with the ProgressChanged event are invoked through a SynchronizationContext instance captured when the instance is constructed. If there is no current SynchronizationContext at the time of construction, the callbacks will be invoked on the ThreadPool.
By using Progress<T>.Report you're effectively queueing 20 tasks on the thread pool. There's no guarantee as to what order they're executed in.
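If the goal is simply to see the reports arrive in order (and to have tracker hold its final value before the summary line is written), one workaround is to skip the context capture entirely and invoke the callback synchronously on the reporting thread. This is only a sketch; SynchronousProgress<T> is a hypothetical helper, not a BCL type:
// Hypothetical helper: invokes the callback inline on the calling thread
// instead of posting it to a captured SynchronizationContext or the thread pool.
class SynchronousProgress<T> : IProgress<T>
{
    private readonly Action<T> _handler;
    public SynchronousProgress(Action<T> handler)
    {
        _handler = handler;
    }
    public void Report(T value)
    {
        _handler(value); // runs immediately, so reports come out in order
    }
}
With var progress = new SynchronousProgress<int>(p => { tracker = p; Console.WriteLine(p); }); every report completes before Test returns, so all 20 lines are printed (and tracker holds the last reported value) before the "The last value is" line.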
Related
I am a C# programmer and I've run into a threading problem.
Assets are entities, and I need to process each asset in parallel by running a method "doSomethingOnAsset" on it.
I have a program with 100 threads (i.e. one thread per asset that I'm manipulating). Generally each thread runs on the same interval, and each one calls the "doSomethingOnAsset" method.
Each thread's interval is, say, 10 milliseconds.
I don't want so many threads, so I created one queue for each asset, but when the central method "doSomethingOnAsset" is called, the threads no longer run on the same interval.
E.g. the 1st thread's running cycle becomes 300 milliseconds,
the 2nd thread's running cycle becomes 700 milliseconds,
the 3rd thread's running cycle becomes 2 seconds,
...
What is the best way to run a predefined method 100 times in parallel? (The parallel trigger may be an external service that, when running, raises an event which runs my "doSomethingOnAsset" code.)
public void doSomethingOnAsset(object obj)
{
// infinite loop while this thread runs.
while (true)
{
doSomething(obj);
Thread.Sleep(100);
}
}
public void doSomething(object obj)
{
// do something.
}
public void Run()
{
Thread t;
for (int i = 0; i < 100; i++)
{
t = new Thread(new ParameterizedThreadStart(this.doSomethingOnAsset));
t.Start(new object());
}
Console.ReadLine();
}
Or should doSomething be called on an event signal, when an external program triggers it?
Thanks :)
For these kinds of producer-consumer situations I usually define a blocking collection, create one or more consumers, and start adding data to the collection. Each consumer instance tries to take an item and, if one is available, consumes it; otherwise it waits for an item.
You could add a cancellation token to support stopping the processing (see the sketch after the usage example below).
You can scale it easily by adding more consumers. Of course, the most efficient number depends on the machine and the number of cores, in combination with the processing time per item.
The consumer:
public class MyConsumer<T> {
public MyConsumer(BlockingCollection<T> collection, Action<T> action) {
_collection = collection;
_action = action;
}
private readonly BlockingCollection<T> _collection;
private readonly Action<T> _action;
public void StartConsuming() {
new Task(Consume).Start();
}
private void Consume() {
while (true) {
var obj = _collection.Take();
_action(obj);
}
}
}
Usage:
public void doSomething(object obj) {
// do something.
}
public void Run() {
var collection = new BlockingCollection<object>();
// Start workers
for (int i = 0; i < 5; i++) {
new MyConsumer<object>(collection, doSomething).StartConsuming();
}
// Create object to consume
for (int i = 0; i < 100; i++) {
collection.Add(new object());
}
}
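Regarding the cancellation token mentioned earlier, here is a minimal sketch of how the consumer loop could support it (assuming the token is handed to the consumer through its constructor or StartConsuming; that plumbing is omitted):
private void Consume(CancellationToken token)
{
    try
    {
        while (true)
        {
            // Take blocks until an item is available, or throws
            // OperationCanceledException once the token is cancelled.
            var obj = _collection.Take(token);
            _action(obj);
        }
    }
    catch (OperationCanceledException)
    {
        // Cancellation was requested while waiting for an item; exit quietly.
    }
}
Alternatively, having the producer call collection.CompleteAdding() and letting the consumer iterate collection.GetConsumingEnumerable() ends the loop naturally once the collection is drained.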
So here is the program again.
As you can see, I have created 2 methods, each with a working loop,
and created 2 threads in Main pointing at these methods, and they are started.
The output I get is that both loops print their values one after the other, each on a new line.
But what I want is to make them appear side by side in the same row, as if the page were divided into 2 parts with things written line by line in each half.
I do not want them to work separately, but at the same time, on the same line, in different columns.
I know it can be achieved by writing both values in the same Console.WriteLine, but I want to achieve it this way, with these 2 threads.
Please provide solutions that would work.
Thanks
using System;
using System.Threading;
class Program
{
static void Main(string [] args)
{
Thread t1 = new Thread(code1);
Thread t2= new Thread (code2);
t1.Start();
t2.Start();
}
static void code1()
{
for(int i=0;i<50;i++)
{
Console.WriteLine(i);
Thread.Sleep(1000);
}
}
static void code2()
{
for(int i=0;i<50;i++)
{
Console.WriteLine("/t/t"+i);
Thread.Sleep(1000);
}
}}
You have to use the Console.SetCursorPosition(int left, int top) method, so you can write on the Console starting from any position you want, also back in the previous rows.
Obviously, you have to keep track of the position for each Thread. That is, the current row of that Thread, and its first column.
In my example I made 2 threads, one with the first column in position 0, and the second with the first column in position 50. Be careful about the width of the strings that you need to write, or they will overflow their own space on the Console.
Also, because you are doing it in a multithreading app, you need a lock on the Console. Otherwise, suppose this: a Thread sets the CursorPosition, then another Thread sets it, then the scheduler returns to the first Thread... the first Thread writes on the second Thread's position!
This is a very simple Console Program that gets the point:
using System;
using System.Threading;
namespace StackOverflow_3_multithread_on_console
{
class Program
{
static Random _random = new Random();
static void Main(string[] args)
{
var t1 = new Thread(Run1);
var t2 = new Thread(Run2);
t1.Start();
t2.Start();
}
static void Run1()
{
for(int i = 0; i < 30; i++)
{
Thread.Sleep(_random.Next(2000)); //for test
ConsoleLocker.Write("t1:" + i.ToString(), 0, i);
}
}
static void Run2()
{
for (int i = 0; i < 30; i++)
{
Thread.Sleep(_random.Next(2000)); //for test
ConsoleLocker.Write("t2:" + i.ToString(), 30, i);
}
}
}
static class ConsoleLocker
{
private static object _lock = new object();
public static void Write(string s, int left, int top)
{
lock (_lock)
{
Console.SetCursorPosition(left, top);
Thread.Sleep(100); //for test
Console.Write(s);
}
}
}
}
All the Thread.Sleep calls are there just to demonstrate that the lock works well. You can remove them all, especially the one in ConsoleLocker.
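If the same program also needs ordinary sequential output, one possible variation (just a sketch) is to save and restore the cursor position inside the lock, so the positioned writes don't leave the cursor wherever the last column write ended:
public static void Write(string s, int left, int top)
{
    lock (_lock)
    {
        // Remember where the cursor currently is...
        int prevLeft = Console.CursorLeft;
        int prevTop = Console.CursorTop;
        Console.SetCursorPosition(left, top);
        Console.Write(s);
        // ...and put it back so other console output is not disturbed.
        Console.SetCursorPosition(prevLeft, prevTop);
    }
}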
How can I run the code below in LINQPad as a "C# Program"? Thank you...
class ThreadTest
{
static void Main()
{
Thread t = new Thread (WriteY); // Kick off a new thread
t.Start(); // running WriteY()
// Simultaneously, do something on the main thread.
for (int i = 0; i < 1000; i++) Console.Write ("x");
}
static void WriteY()
{
for (int i = 0; i < 1000; i++) Console.Write ("y");
}
}
Result Expected
xxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ...
So far I've come up with:
static void Main()
{
Thread t = new Thread (ThreadTest.WriteY); // Kick off a new thread
t.Start(); // running WriteY()
// Simultaneously, do something on the main thread.
for (int i = 0; i < 1000; i++) Console.Write ("x");
}
class ThreadTest
{
public static void WriteY()
{
for (int i = 0; i < 1000; i++) Console.Write ("y");
}
}
Actual Result
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy...
As seen in the expected result, the X's and Y's should be mixed.
Unfortunately, the actual result is 1000 X's followed by 1000 Y's.
UPDATE
This sample - along with all the others in the concurrency chapters of C# 5 in a Nutshell are downloadable as a LINQPad sample library. Go to LINQPad's samples TreeView and click 'Download/Import more samples' and choose the first listing. – Joe Albahari
Thread switching is by nature non-deterministic. I can run your program multiple times and get varying results.
If you want the switching to be more evident, add some pauses:
static void Main()
{
Thread t = new Thread (ThreadTest.WriteY); // Kick off a new thread
t.Start(); // running WriteY()
// Simultaneously, do something on the main thread.
for (int i = 0; i < 1000; i++)
{
Console.Write ("x");
Thread.Sleep(1);
}
}
class ThreadTest
{
public static void WriteY()
{
for (int i = 0; i < 1000; i++)
{
Console.Write ("y");
Thread.Sleep(1);
}
}
}
I cannot explain why this works, but switching to Dump() seems to make it behave the way the OP wants, with the x's and y's "mixed" on every run (although with a newline between each output):
void Main()
{
Thread t = new Thread (ThreadTest.WriteY); // Kick off a new thread
t.Start(); // running WriteY()
// Simultaneously, do something on the main thread.
for (int i = 0; i < 1000; i++) "x".Dump();
}
class ThreadTest
{
public static void WriteY()
{
for (int i = 0; i < 1000; i++) "y".Dump();
}
}
From the LINQPad documentation:
LINQPad's Dump command feeds the output into an XHTML stream which it
displays using an embedded web browser (you can see this by
right-clicking a query result and choosing 'View Source'. The
transformation into XHTML is done entirely using LINQ to XML, as one
big LINQ query! The deferred expansion of results works via
JavaScript, which means the XHTML is fully prepopulated after a query
finishes executing. The lambda window populates using a custom
expression tree visitor (simply calling ToString on an expression tree
is no good because it puts the entire output on one line).
I also know that LINQPad overrides the default Console.WriteLine behavior, so perhaps that has something to do with it.
I have a class in C# like this:
public class MyClass
{
public void Start() { ... }
public void Method_01() { ... }
public void Method_02() { ... }
public void Method_03() { ... }
}
When I call the "Start()" method, an external class start to work and will create many parallel threads that those parallel threads call the "Method_01()" and "Method_02()" form above class. after end of working of the external class, the "Method_03()" will be run in another parallel thread.
Threads of "Method_01()" or "Method_02()" are created before creation of thread of Method_03(), but there is no guaranty to end before start of thread of "Method_03()". I mean the "Method_01()" or the "Method_02()" will lost their CPU turn and the "Method_03" will get the CPU turn and will end completely.
In the "Start()" method I know the total number of threads that are supposed to create and run "Method_01" and "Method_02()". The question is that I'm searching for a way using semaphore or mutex to ensure that the first statement of "Method_03()" will be run exactly after end of all threads which are running "Method_01()" or "Method_02()".
Three options that come to mind are:
Keep an array of Thread instances and call Join on all of them from Method_03.
Use a single CountdownEvent instance and call Wait from Method_03.
Allocate one ManualResetEvent for each Method_01 or Method_02 call and call WaitHandle.WaitAll on all of them from Method_03 (this is not very scalable).
I prefer to use a CountdownEvent because it is a lot more versatile and is still super scalable.
public class MyClass
{
// Initial count of 1 represents the Start thread itself; it is released by the
// Signal call at the end of Start. (A count of 0 would already be set, and
// calling AddCount on a set CountdownEvent throws.)
private CountdownEvent m_Finished = new CountdownEvent(1);
public void Start()
{
for (int i = 0; i < NUMBER_OF_THREADS; i++)
{
m_Finished.AddCount(); // Increment to indicate another active thread.
new Thread(Method_01).Start();
}
for (int i = 0; i < NUMBER_OF_THREADS; i++)
{
m_Finished.AddCount(); // Increment to indicate another active thread.
new Thread(Method_02).Start();
}
new Thread(Method_03).Start();
m_Finished.Signal(); // Signal to indicate that this thread is done.
}
private void Method_01()
{
try
{
// Add your logic here.
}
finally
{
m_Finished.Signal(); // Signal to indicate that this thread is done.
}
}
private void Method_02()
{
try
{
// Add your logic here.
}
finally
{
m_Finished.Signal(); // Signal to indicate that this thread is done.
}
}
private void Method_03()
{
m_Finished.Wait(); // Wait for all signals.
// Add your logic here.
}
}
This appears to be a perfect job for Tasks. Below I assume that Method01 and Method02 are allowed to run concurrently, with no specific order of invocation or finishing (no guarantees, this was just typed from memory without testing):
int cTaskNumber01 = 3, cTaskNumber02 = 5;
Task tMaster = new Task(() => {
for (int tI = 0; tI < cTaskNumber01; ++tI)
new Task(Method01, TaskCreationOptions.AttachedToParent).Start();
for (int tI = 0; tI < cTaskNumber02; ++tI)
new Task(Method02, TaskCreationOptions.AttachedToParent).Start();
});
// after master and its children are finished, Method03 is invoked
tMaster.ContinueWith(t => Method03());
// let it go...
tMaster.Start();
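Because the child tasks are created inside the parent's delegate with TaskCreationOptions.AttachedToParent, tMaster does not complete until all of its attached children have finished, which is what allows the ContinueWith continuation to run Method03 at the right time.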
What it sounds like you need to do is create a ManualResetEvent (initialized to unset) or some other WaitHandle for each of Method_01 and Method_02, and then have Method_03's thread use WaitHandle.WaitAll on the set of handles.
Alternatively, if you can reference the Thread variables used to run Method_01 and Method_02, you could have Method_03's thread use Thread.Join to wait on both. This assumes, however, that those threads actually terminate when they complete execution of Method_01 and Method_02; if they do not, you need to resort to the first solution I mention.
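A rough sketch of the Thread.Join variant (illustrative only; NUMBER_OF_THREADS stands in for the count the question says is known inside Start()):
public class MyClass
{
    // Keep references to the worker threads so Method_03 can join on them.
    private readonly List<Thread> _workers = new List<Thread>();
    public void Start()
    {
        for (int i = 0; i < NUMBER_OF_THREADS; i++)
        {
            _workers.Add(new Thread(Method_01));
            _workers.Add(new Thread(Method_02));
        }
        foreach (var worker in _workers)
        {
            worker.Start();
        }
        new Thread(Method_03).Start();
    }
    private void Method_03()
    {
        // Block until every Method_01/Method_02 thread has terminated.
        foreach (var worker in _workers)
        {
            worker.Join();
        }
        // ... the rest of Method_03 ...
    }
    private void Method_01() { /* ... */ }
    private void Method_02() { /* ... */ }
}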
Why not use a static variable, static volatile int threadRuns, initialized with the number of times Method_01 and Method_02 will be run?
Then you modify each of those two methods to decrement threadRuns just before exit:
...
lock(typeof(MyClass)) {
--threadRuns;
}
...
Then in the beginning of Method_03 you wait until threadRuns is 0 and then proceed:
while(threadRuns != 0)
Thread.Sleep(10);
Did I understand the question correctly?
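Pulling those fragments together into one place (a sketch only; NUMBER_OF_THREADS again stands in for the thread count known inside Start):
public class MyClass
{
    private static volatile int threadRuns;
    public void Start()
    {
        threadRuns = NUMBER_OF_THREADS * 2; // one decrement expected per Method_01/Method_02 thread
        for (int i = 0; i < NUMBER_OF_THREADS; i++)
        {
            new Thread(Method_01).Start();
            new Thread(Method_02).Start();
        }
        new Thread(Method_03).Start();
    }
    private void Method_01()
    {
        // ... work ...
        lock (typeof(MyClass)) { --threadRuns; }
    }
    private void Method_02()
    {
        // ... work ...
        lock (typeof(MyClass)) { --threadRuns; }
    }
    private void Method_03()
    {
        // Poll until every worker has decremented the counter, then continue.
        while (threadRuns != 0)
            Thread.Sleep(10);
        // ... the rest of Method_03 ...
    }
}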
There is actually an alternative in the Barrier class, which is new in .NET 4.0. It simplifies how you can do the signalling across multiple threads.
You could do something like the following code, but this is mostly useful when synchronizing different processing threads.
public class Synchro
{
private Barrier _barrier;
public void Start(int numThreads)
{
_barrier = new Barrier((numThreads * 2)+1);
for (int i = 0; i < numThreads; i++)
{
new Thread(Method1).Start();
new Thread(Method2).Start();
}
new Thread(Method3).Start();
}
public void Method1()
{
//Do some work
_barrier.SignalAndWait();
}
public void Method2()
{
//Do some other work.
_barrier.SignalAndWait();
}
public void Method3()
{
_barrier.SignalAndWait();
//Do some other cleanup work.
}
}
I would also like to suggest that, since your problem statement was quite abstract, actual problems that are solved using CountdownEvent are now often better solved using the new Parallel or PLINQ capabilities. If you were actually processing a collection or something similar in your code, you might have something like the following.
public class Synchro
{
public void Start(List<someClass> collection)
{
new Thread(() => Method3(collection)).Start();
}
public void Method1(someClass item)
{
//Do some work.
}
public void Method2(someClass item)
{
//Do some other work.
}
public void Method3(List<someClass> collection)
{
//Do your work on each item in parallel threads.
Parallel.ForEach(collection, x => { Method1(x); Method2(x); });
//Do some work on the total collection like sorting or whatever.
}
}
I have the following piece of code:
private Dictionary<object, object> items = new Dictionary<object, object>();
public IEnumerable<object> Keys
{
get
{
foreach (object key in items.Keys)
{
yield return key;
}
}
}
Is this thread-safe? If not do I have to put a lock around the loop or the yield return?
Here is what I mean:
Thread1 accesses the Keys property while Thread2 adds an item to the underlying dictionary. Is Thread1 affected by the add of Thread2?
What exactly do you mean by thread-safe?
You certainly shouldn't change the dictionary while you're iterating over it, whether in the same thread or not.
If the dictionary is being accessed in multiple threads in general, the caller should take out a lock (the same one covering all accesses) so that they can lock for the duration of iterating over the result.
EDIT: To respond to your edit, no, it in no way corresponds to the lock code. There is no lock automatically taken out by an iterator block - and how would it know about syncRoot anyway?
Moreover, just locking the return of the IEnumerable<TKey> doesn't make it thread-safe either - because the lock only affects the period of time when it's returning the sequence, not the period during which it's being iterated over.
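For example, the caller-side locking could look roughly like this (a sketch; myClass, SyncRoot and Add are illustrative names for whatever object and members actually guard and mutate the dictionary):
// Writer thread: take the shared lock for every mutation.
lock (myClass.SyncRoot)
{
    myClass.Add(someKey, someValue);
}
// Reader thread: hold the same lock for the entire iteration, not just
// for the call that returns the IEnumerable.
lock (myClass.SyncRoot)
{
    foreach (object key in myClass.Keys)
    {
        Console.WriteLine(key);
    }
}
Another option is for the Keys property to copy the keys into a list under its own lock and return that snapshot, so the iteration never touches the live dictionary.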
Check out this post on what happens behind the scenes with the yield keyword:
Behind the scenes of the C# yield keyword
In short - the compiler takes your yield keyword and generates an entire class in the IL to support the functionality. You can check out the page after the jump and look at the code that gets generated... and that code looks like it tracks the thread ID to keep things safe.
OK, I did some testing and got an interesting result.
It seems that it is more an issue with the enumerator of the underlying collection than with the yield keyword. The enumerator (actually its MoveNext method) throws (if implemented correctly) an InvalidOperationException because the underlying collection has changed. According to the MSDN documentation of the MoveNext method, this is the expected behavior.
Because enumerating through a collection is usually not thread-safe, a yield return is not either.
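A quick way to see that behaviour with the dictionary from the question, no threads required (the threading only makes the timing unpredictable):
var items = new Dictionary<object, object> { { "a", 1 } };
foreach (object key in items.Keys)
{
    // Mutating the dictionary invalidates the live enumerator;
    // the next MoveNext call throws InvalidOperationException.
    items.Add("b", 2);
}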
I believe it is, but I cannot find a reference that confirms it. Each time any thread calls foreach on an iterator, a new thread local* instance of the underlying IEnumerator should get created, so there should not be any "shared" memory state that two threads can conflict over...
*Thread local - in the sense that its reference variable is scoped to a method stack frame on that thread.
I believe the yield implementation is thread-safe. Indeed, you can run this simple program at home and you will notice that the state of the listInt() method is correctly saved and restored for each thread, with no side effects from other threads.
public class Test
{
public void Display(int index)
{
foreach (int i in listInt())
{
Console.WriteLine("Thread {0} says: {1}", index, i);
Thread.Sleep(1);
}
}
public IEnumerable<int> listInt()
{
for (int i = 0; i < 5; i++)
{
yield return i;
}
}
}
class MainApp
{
static void Main()
{
Test test = new Test();
for (int i = 0; i < 4; i++)
{
int x = i;
Thread t = new Thread(p => { test.Display(x); });
t.Start();
}
// Wait for user
Console.ReadKey();
}
}
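The example below goes a step further: one thread adds items to a collection while another enumerates it, and the collection guards both the Add call and the enumeration with a single lock. Note that because the lock statement sits inside the iterator block, it is held for as long as a caller is iterating AllValues.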
class Program
{
static SomeCollection _sc = new SomeCollection();
static void Main(string[] args)
{
// Create one thread that adds entries and
// one thread that reads them
Thread t1 = new Thread(AddEntries);
Thread t2 = new Thread(EnumEntries);
t2.Start(_sc);
t1.Start(_sc);
}
static void AddEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 20; x++)
{
Trace.WriteLine("adding");
sc.Add(x);
Trace.WriteLine("added");
Thread.Sleep(x * 3);
}
}
static void EnumEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 10; x++)
{
Trace.WriteLine("Loop" + x);
foreach (int item in sc.AllValues)
{
Trace.Write(item + " ");
}
Thread.Sleep(30);
Trace.WriteLine("");
}
}
}
class SomeCollection
{
private List<int> _collection = new List<int>();
private object _sync = new object();
public void Add(int i)
{
lock(_sync)
{
_collection.Add(i);
}
}
public IEnumerable<int> AllValues
{
get
{
lock (_sync)
{
foreach (int i in _collection)
{
yield return i;
}
}
}
}
}