C# defining multiple threads for the same method - c#

I'm using the code below to initialise 4 thread with the same method so that each one can perform the same process but on a separate file.
for (int i = 0; i < 4; i++)
{
Thread newProcessThread = new Thread(ThreadProcessFile)
{
Priority = ThreadPriority.BelowNormal,
IsBackground = true
};
newProcessThread.Start();
}
Inside the ThreadProcessFile method, it starts like this so each thread knows what is its ID is. _threadInitCount is declared in the same class.
int threadID = _threadInitCount;
_threadInitCount += 1;
However, I'm getting a weird behaviour where a number might be missed or duplicated. e.g. the first thread might have an ID of 1 and not 0 or 2 will be missing from the set of four threads.
Can anyone explain this behaviour or advise on a better way of doing this?

Each thread already has an unique ID assigned to it. You can access it with
Thread.CurrentThread.ManagedThreadId
property.
The reason you get duplicated/missing numbers is that a context switch might occur at every moment. Imagine, for instance, that your threads get executed one-by-one line-by-line. The first thread assigns its threadID variable to _threadInitCount, which is 0. Then the second does the same, its threadID is 0 too. Then the third gets its threadID=0 and so on. Then the first thread turns on again to increase _threadInitCount to 1, then the second increases it to 2 and so on.

Related

Reliable way to prove that int++ is not atomic

I can already see it's not by the incorrect increments, but there's just one small piece of the puzzle I can't quite seem to catch.
We have the following code:
internal class StupidObject
{
static public SemaphoreSlim semaphore = new SemaphoreSlim(0, 100);
private int counter;
public bool MethodCall() => counter++ == 0;
public int GetCounter() => counter;
}
And the following test code to try and see if it's an atomic operation:
var sharedObj = new StupidObject();
var resultTasks = new Task[100];
for (int i = 0; i < 100; i++)
{
resultTasks[i] = Task.Run(async () =>
{
await StupidObject.semaphore.WaitAsync();
if (sharedObj.MethodCall())
{
Console.WriteLine("True");
};
});
}
Console.WriteLine("Done");
Console.ReadLine();
StupidObject.semaphore.Release(100);
Console.ReadLine();
Console.WriteLine(sharedObj.GetCounter());
Console.ReadLine();
I expect to see multiple True's written to the console, but I ever see a single one.
Why is that? By my understanding, a ++ operation reads the value, increments the read value, and then stores that value to the variable.
Those are 3 operations. If we had a race condition, where thread A did the following:
Reads value to be 0.
Increments read value by 1.
And another thread B did the same things, but beat thread A to the third operation as following:
Writes read value to variable.
When A finishes writing the incremented read value, it should print back 0, same with thread B after it has done its write operation.
Am I missing something at the design aspect of things, or is my test not good enough to make this exact situation come to fruition?
Example without the Task Parallel Library (still yields a single True to the console):
var sharedObj = new StupidObject();
var resultTasks = new Thread[10000];
for (int i = 0; i < 10000; i++)
{
resultTasks[i] = new Thread(() =>
{
StupidObject.semaphore.Wait();
if (sharedObj.MethodCall())
{
Console.WriteLine("True");
};
});
resultTasks[i].IsBackground = false;
resultTasks[i].Start();
}
Console.WriteLine("Done");
Console.ReadLine();
StupidObject.semaphore.Release(10000);
What Liam said about Console.WriteLine is possible, but also there's another thing.
Starting Tasks doesn't equal starting threads, and even starting threads doesn't guarantee that all threads will begin immediatelly. Starting 100 short tasks probably won't even fill .Net's thread pool significantly, because those tasks end quickly and thread pool's manager probably won't start more than 3-5 threads. That's not the "immediate" and "parallel" you'd like to see when you want to start parallel 100 increments to race with each other, right? Remember that Tasks are queued first, then assigned to threads.
Note that the StupidObject's counter starts with zero and that's the ONLY MOMENT EVER that the value is zero. If ANY thread wins the race and successfully writes an update to that integer, you'll get FALSE in all future tasks, because it's already 1.
And if there are many tasks on the thread pool's queue, something first has to notice that fact. At program's start, thread pool lacks threads. They are not started in dozens right at program start. They are started on demand. Most probably you fill up the queue with 100 tasks, threadpool's thread is created, picks first task, bumps counter to 1, then maybe thread pool starts new threads to consume tasks faster.
To get a bit better image what's happening, instead of printing out 'true', collect values observed by return counter++: let each task run, finish, store its value in Task's .Result, then run threads/tasks, then wait for all of then to stop, then collect .Results and write a histogram of those values. Even if you don't see 5 zeros, maybe you will see 3 ones, 7 twos, 2 threes and so on.

c# multiple tasks do not run in parallel

I tried to create a multithreaded dice rolling simulation - just for curiosity, the joy of multithreaded progarmming and to show others the effects of "random results" (many people can't understand that if you roll a laplace dice six times in a row and you already had 1, 2, 3, 4, 5 that the next roll is NOT a 6.). To show them the distribution of n rolls with m dice I created this code.
Well, the result is fine BUT even though I create a new task for each dice the program runs single threaded.
Multithreaded would be reasonable to simulate "millions" of rerolls with 6 or more dice as the time to finish will grow rapidly.
I read several examples from msdn that all indicate that there should be several tasks running simultanously.
Can someone please give me a hint, why this code does not utilize many threads / cores? (Not even, when I try to run it for 400 dice at once and 10 Million rerolls)
At first I initialize the jagged Array that stores the results. 1st dimension: 1 Entry per dice, the second dimension will be the distribution of eyes rolled with each dice.
Next I create an array of tasks that each return an array of results (the second dimension, as described above)
Each of these arrays has 6 entries that represent eachs side of a laplace W6 dice. If the dice roll results in 1 eye the first entry [0] is increased by +1. So you can visualize how often each value has been rolled.
Then I use a plain for-loop to start all threads. There is no indication to wait for a thread until all are started.
At the end I wait for all to finish and sum up the results. It does not make any difference if change
Task.WaitAll(tasks); to
Task.WhenAll(tasks);
Again my quation: Why doesn't that code utilize more than one core of my CPU? What do I have to change?
Thanks in advance!
Here's the code:
private void buttonStart_Click(object sender, RoutedEventArgs e)
{
int tries = 1000000;
int numberofdice = 20 ;
int numberofsides = 6; // W6 = 6
var rnd = new Random();
int[][] StoreResult = new int[numberofdice][];
for (int i = 0; i < numberofdice; i++)
{
StoreResult[i] = new int[numberofsides];
}
Task<int[]>[] tasks = new Task<int[]>[numberofdice];
for (int ctr = 0; ctr < numberofdice; ctr++)
{
tasks[ctr] = Task.Run(() =>
{
int newValue = 0;
int[] StoreTemp = new int[numberofsides]; // Array that represents how often each value appeared
for (int i = 1; i <= tries; i++) // how often to roll the dice
{
newValue = rnd.Next(1, numberofsides + 1); // Roll the dice; UpperLimit for random int EXCLUDED from range
StoreTemp[newValue-1] = StoreTemp[newValue-1] + 1; //increases value corresponding to eyes on dice by +1
}
return StoreTemp;
});
StoreResult[ctr] = tasks[ctr].Result; // Summing up the individual results for each dice in an array
}
Task.WaitAll(tasks);
// do something to visualize the results - not important for the question
}
}
The issue here is tasks[ctr].Result. The .Result portion itself waits for the function to complete before storing the resulting int array into StoreResult. Instead, make a new loop after Task.WaitAll to get your results.
You may consider doing a Parallel.Foreach loop instead of manually creating separate tasks for this.
As others have indicated, when you try to aggregate this you just end up waiting for each individual task to finish, so this isn't actually multi-threaded.
Very important note: The C# random number generator is not thread safe (see also this MSDN blog post for discussion on the topic). Don't share the same instance between multiple threads. From the documentation:
...Random objects are not thread safe. If your app calls Random
methods from multiple threads, you must use a synchronization object
to ensure that only one thread can access the random number generator
at a time. If you don't ensure that the Random object is accessed in a
thread-safe way, calls to methods that return random numbers return 0.
Also, just to be nit-picky, using a Task is not really the same thing as doing multithreading; while you are, in fact, doing multithreading here, it's also possible to do purely asynchronous, non-multithreaded code with async/await. This is used mostly for I/O-bound operations where it's largely pointless to create a separate thread just to wait for a result (but it's desirable to allow the calling thread to do other work while it's waiting for the result).
I don't think you should have to worry about thread safety while assigning to the main array (assuming that each thread is assigning only to a specific index in the array and that no one else is assigning to the same memory location); you only have to worry about locking when multiple threads are accessing/modifying shared mutable state at the same time. If I'm reading this correctly, this is mutable state (but it's not shared mutable state).

Mysterious Negative Counter Value

I'm creating new threads for every sql call for a project. There are millions of sql calls so I'm calling a procedure in a new thread to handle the sql calls.
In doing so, I wanted to increment and decrement a counter so that I know when these threads have completed the sql query.
To my amazement the output shows NEGATIVE values in the counter. HOW? When I am starting with 0 and adding 1 at the beginning of the process and subtracting 1 at the end of the process?
This int is not called anywhere else in the program.. the following is the code..
public static int counter=0;
while(!txtstream.EndOfStream)
{
new Thread(delegate()
{
processline();
}).Start();
Console.WriteLine(counter);
}
public static void processline()
{
counter++;
sql.ExecuteNonQuery();
counter--;
}
Output looks something like this:
1
21
-2
-2
5
Nothing mysterious about it, you are using threading, right?
The ++ and -- operator aren't thread safe. Do this.
public static void processline()
{
Interlocked.Increment(ref counter);
sql.ExecuteNonQuery();
Interlocked.Decrement(ref counter);
}
How to overcome
Use Interlocked.Increment and Interlocked.Decrement to safely change the value of the counter.
Why this happensYou have counter as variable, which is shared across multiple threads. This variable is non-volatile and not wrapped by any synchronization block, so each thread has its own copy of that variable. So if two threads try to change it value at the same time, value would be overrriden by copy from thread which accessed it last. Imagine you start your code in two different threads:
Initially counter equals zero, both threads havy copy of that
Both thread invoke increment their cached copies, and than change counter. So thread1 increments its copy to 1 and overrides counter, thread2 also increments its copy (still equal to zero) to 1 and overrides counter to, again, 1. After that that value is propagated to all threads (all copies are refreshed)
Both threads invoke sql query. Due to variability in sql performance, these queries are completed in different time.
Thread1 ends sql query, decrements counter from 1 to 0. Counter value is propagated to all threads
After some time, Thread2 ends sql query, decrement counter from already propagated 0 to -1. Counter value is propagated to all threads. And it is -1.

Parallel.For: Is it safe to lock the value?

Im trying to figure out the output of this code:
Dictionary<int, MyRequest> request = new Dictionary<int, MyRequest>();
for (int i = 0; i < 1000; i++ )
{
request.Add(i, new MyRequest() { Name = i.ToString() });
}
var ids = request.Keys.ToList();
Parallel.For(0, ids.Count, (t) =>
{
var id = ids[t];
var b = request[id];
lock (b)
{
if (b.Name == 4.ToString())
{
Thread.Sleep(10000);
}
Console.WriteLine(b.Name);
}
});
Console.WriteLine("done");
Console.Read();
output:
789
800
875
.
.
.
4
5
6
7
done
MyRequest is just a dummy class used for demonstration (it is not doing anything but holding values). Is my lock blocking the execution or are the last 4 being put on their own thread?
This is a .NET 4.0 demo.
UPDATE
Ok I did figure out they were on teh same thread, but i would still like to know if the lock does anything to block execution. I cant imagine it does.
If ids does not contain duplicates, that lock won't block anything. But if there are duplicates in ids, then yes, there might be contention at the lock, as different threads fight for access to the same request.
Your lock will only be blocking execution if the ids line up such that you retrieve the same request more than once. Since different names are being printed each time, that shouldn't be a concern.
Parallel.For uses a thread pool to process your loop. As soon as one of its threads is free, it assigns it to the next element. This is non-deterministic, because you don't know how many threads there are in the pool, and you don't control the CPU time given to each thread. This means that some threads may finish sooner or later than you would "naturally" expect.
Your lock isn't doing anything. A lock blocks delimits sections of code that attempt to use the same object. In your case, you're not ever using the same object twice in the loop. The fact that the last IDs processed seem consistent is probably purely coincidental.

One thread can change the reference to the array while another thread is iterating through it

I have a static array variable which is being shared by two threads.
I'd like to understand what happens if I assign the array variable to another array object in Thread1 while Thread2 is iterating over the array.
I.e
In thread 1
MyStaticClass.MyArray = SomeOtherArray
While in Thread 2:
for (int i = 0; i < MyStaticClass.MyArray.length; i++)
{
//do something with the i'th element
}
Assuming MyStaticClass.MyArray is just a field or simple property.
Nothing good or predictable will happen that's for sure.
I'd say most likely are:
The loop may read one half of old array and rest from new array
The new array may be shorter than the last giving a exception when you access [i]
Or Thread 2 may actually completely ignore the change to the array! And what's worse, this behaviour may be different in Release build to Debug.
The last situation is due to compiler optimisations and/or the way the memory model works in .net (and other languages like Java BTW). There's a whole keyword to get around this issue (volatile) http://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/
To prevent this, you want to use a lock to lock the critical section. Basically wrapping your itteration in a lock will prevent the other thread from overwriting the array while you are processing it
The lock keyword ensures that one thread does not enter a critical section of code while another thread is in the critical section. If another thread tries to enter a locked code, it will wait, block, until the object is released.
https://msdn.microsoft.com/en-us/library/c5kehkcz.aspx
The condition in your for loop is evaluated for every iteration. So if another thread changes the reference in MyStaticClass.MyArray, you use this new reference in the next iteration.
For example, this code:
int[] a = {1,2,3,4,5};
int[] b = {10, 20, 30, 40, 50, 60, 70};
for (int i = 0; i < a.Length; i++)
{
Console.WriteLine(a[i]);
a = b;
}
gives this output:
1
20
30
40
50
60
70
To avoid this, you could use foreach:
foreach(int c in a)
{
Console.WriteLine(c);
a = b;
}
gives:
1
2
3
4
5
because foreach is translated so that it calls a.GetEnumerator() once and then uses this enumerator (MoveNext() and Current) for all iterations, not accessing a again.
There will be no consequences other than thread 2 at some point starting to work on the new array instead of the old. When exactly this happens is more or less random, and as weston writes in his comment, it may never even happen at all.
If the new array has fewer elements than the old one this can (but doesn't have to) cause a runtime exception.

Categories