I'm creating a new thread for every SQL call in a project. There are millions of SQL calls, so I call a procedure in a new thread to handle each one.
In doing so, I wanted to increment and decrement a counter so that I know when these threads have completed their SQL queries.
To my amazement, the output shows NEGATIVE values in the counter. HOW, when I start at 0, add 1 at the beginning of the process, and subtract 1 at the end?
This int is not used anywhere else in the program. The following is the code:
public static int counter=0;
while(!txtstream.EndOfStream)
{
new Thread(delegate()
{
processline();
}).Start();
Console.WriteLine(counter);
}
public static void processline()
{
counter++;
sql.ExecuteNonQuery();
counter--;
}
Output looks something like this:
1
21
-2
-2
5
Nothing mysterious about it, you are using threading, right?
The ++ and -- operators aren't thread-safe. Do this:
public static void processline()
{
Interlocked.Increment(ref counter);
sql.ExecuteNonQuery();
Interlocked.Decrement(ref counter);
}
How to overcome
Use Interlocked.Increment and Interlocked.Decrement to safely change the value of the counter.
Why this happens
You have counter as a variable that is shared across multiple threads. It is not protected by any synchronization, and counter++ is not a single step: each thread reads the value into its own working copy (a register), changes that copy, and writes it back. So if two threads try to change the value at the same time, it is overwritten by the copy from whichever thread writes last. Imagine you start your code in two different threads:
Initially counter equals zero, and both threads read that value.
Both threads increment their copies and then write them to counter. So thread1 increments its copy to 1 and overwrites counter; thread2 also increments its copy (still equal to zero) to 1 and overwrites counter with, again, 1. One increment has been lost.
Both threads invoke the SQL query. Due to variability in SQL performance, the queries complete at different times.
Thread1 finishes its SQL query and decrements counter from 1 to 0.
After some time, Thread2 finishes its SQL query and decrements counter from 0 to -1. The counter is now -1.
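The lost update described in the steps above can be made visible with a small experiment. This is only a sketch: the class name CounterDemo, the Run method, and the worker and iteration counts are made up for illustration and are not part of the original program. It increments one plain int and one int through Interlocked.Increment from several parallel workers.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; none of these names come from the original question.
public static class CounterDemo
{
    private static int _plainCounter;        // updated with ++ (not atomic)
    private static int _interlockedCounter;  // updated with Interlocked.Increment (atomic)

    // Runs `workers` parallel loops; each increments both counters `iterations` times.
    // Returns { plain total, interlocked total }.
    public static int[] Run(int workers, int iterations)
    {
        _plainCounter = 0;
        _interlockedCounter = 0;

        Parallel.For(0, workers, worker =>
        {
            for (int i = 0; i < iterations; i++)
            {
                _plainCounter++;                                 // read, add, write: updates can be lost
                Interlocked.Increment(ref _interlockedCounter);  // one atomic read-modify-write
            }
        });

        return new[] { _plainCounter, _interlockedCounter };
    }
}
```

Calling CounterDemo.Run(8, 100000) should always give 800000 for the Interlocked counter. The plain counter can never exceed that, and on a multi-core machine it will usually come out lower, although how much lower varies from run to run.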
Related
I can already see it's not by the incorrect increments, but there's just one small piece of the puzzle I can't quite seem to catch.
We have the following code:
internal class StupidObject
{
static public SemaphoreSlim semaphore = new SemaphoreSlim(0, 100);
private int counter;
public bool MethodCall() => counter++ == 0;
public int GetCounter() => counter;
}
And the following test code to try and see if it's an atomic operation:
var sharedObj = new StupidObject();
var resultTasks = new Task[100];
for (int i = 0; i < 100; i++)
{
resultTasks[i] = Task.Run(async () =>
{
await StupidObject.semaphore.WaitAsync();
if (sharedObj.MethodCall())
{
Console.WriteLine("True");
};
});
}
Console.WriteLine("Done");
Console.ReadLine();
StupidObject.semaphore.Release(100);
Console.ReadLine();
Console.WriteLine(sharedObj.GetCounter());
Console.ReadLine();
I expect to see multiple True's written to the console, but I only ever see a single one.
Why is that? By my understanding, a ++ operation reads the value, increments the read value, and then stores that value to the variable.
Those are 3 operations. If we had a race condition, where thread A did the following:
Reads value to be 0.
Increments read value by 1.
And another thread B did the same things, but beat thread A to the third operation as following:
Writes read value to variable.
When A finishes writing the incremented read value, it should print back 0, same with thread B after it has done its write operation.
Am I missing something at the design aspect of things, or is my test not good enough to make this exact situation come to fruition?
Example without the Task Parallel Library (still yields a single True to the console):
var sharedObj = new StupidObject();
var resultTasks = new Thread[10000];
for (int i = 0; i < 10000; i++)
{
resultTasks[i] = new Thread(() =>
{
StupidObject.semaphore.Wait();
if (sharedObj.MethodCall())
{
Console.WriteLine("True");
};
});
resultTasks[i].IsBackground = false;
resultTasks[i].Start();
}
Console.WriteLine("Done");
Console.ReadLine();
StupidObject.semaphore.Release(10000);
What Liam said about Console.WriteLine is possible, but also there's another thing.
Starting Tasks doesn't equal starting threads, and even starting threads doesn't guarantee that all threads will begin immediately. Starting 100 short tasks probably won't even fill .NET's thread pool significantly, because those tasks end quickly and the thread pool's manager probably won't start more than 3-5 threads. That's not the "immediate" and "parallel" you'd like to see when you want 100 increments to start in parallel and race with each other, right? Remember that Tasks are queued first, then assigned to threads.
Note that the StupidObject's counter starts with zero and that's the ONLY MOMENT EVER that the value is zero. If ANY thread wins the race and successfully writes an update to that integer, you'll get FALSE in all future tasks, because it's already 1.
And if there are many tasks on the thread pool's queue, something first has to notice that fact. At program's start, thread pool lacks threads. They are not started in dozens right at program start. They are started on demand. Most probably you fill up the queue with 100 tasks, threadpool's thread is created, picks first task, bumps counter to 1, then maybe thread pool starts new threads to consume tasks faster.
To get a better picture of what's happening, instead of printing 'True', collect the values observed by counter++: have each task return the value it saw (so it ends up in the Task's .Result), start the threads/tasks, wait for all of them to finish, then collect the .Results and write a histogram of those values. Even if you don't see 5 zeros, maybe you will see 3 ones, 7 twos, 2 threes and so on.
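A rough sketch of that histogram idea follows. The names HistogramDemo and Run are invented for this example, and it swaps the semaphore from the question for a ManualResetEventSlim as the starting gate. It also asks for LongRunning tasks so that each one is likely to get its own thread instead of queueing on the pool; that is a scheduling hint, not a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; not part of the original question's code.
public static class HistogramDemo
{
    private static int _counter;

    // Starts `taskCount` tasks that all wait on one gate, then each returns the
    // value it observed from counter++. Returns: observed value -> number of tasks that saw it.
    public static SortedDictionary<int, int> Run(int taskCount)
    {
        _counter = 0;
        var tasks = new Task<int>[taskCount];

        using (var gate = new ManualResetEventSlim(false))
        {
            for (int i = 0; i < taskCount; i++)
            {
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    gate.Wait();        // hold every task until the gate opens
                    return _counter++;  // deliberately unsynchronized
                }, TaskCreationOptions.LongRunning);
            }

            gate.Set();                 // release all tasks at once
            Task.WaitAll(tasks);
        }

        var histogram = new SortedDictionary<int, int>();
        foreach (Task<int> task in tasks)
        {
            int seen = task.Result;
            histogram[seen] = histogram.ContainsKey(seen) ? histogram[seen] + 1 : 1;
        }
        return histogram;
    }
}
```

Any key whose count is greater than 1 marks a value that several tasks observed at the same time, which is exactly the race the single "True" was hiding.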
I'm using the code below to initialise 4 thread with the same method so that each one can perform the same process but on a separate file.
for (int i = 0; i < 4; i++)
{
Thread newProcessThread = new Thread(ThreadProcessFile)
{
Priority = ThreadPriority.BelowNormal,
IsBackground = true
};
newProcessThread.Start();
}
Inside the ThreadProcessFile method, it starts like this so each thread knows what its ID is. _threadInitCount is declared in the same class.
int threadID = _threadInitCount;
_threadInitCount += 1;
However, I'm getting a weird behaviour where a number might be missed or duplicated. e.g. the first thread might have an ID of 1 and not 0 or 2 will be missing from the set of four threads.
Can anyone explain this behaviour or advise on a better way of doing this?
Each thread already has a unique ID assigned to it. You can access it with the
Thread.CurrentThread.ManagedThreadId
property.
The reason you get duplicated/missing numbers is that a context switch might occur at any moment, including between reading _threadInitCount and incrementing it. Imagine, for instance, that your threads get executed one-by-one, line-by-line. The first thread assigns _threadInitCount, which is 0, to its threadID variable. Then the second does the same, so its threadID is 0 too. Then the third gets threadID=0, and so on. Then the first thread runs again and increases _threadInitCount to 1, then the second increases it to 2, and so on.
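If sequential IDs starting at 0 are wanted rather than ManagedThreadId values, one option is to reserve the ID with a single atomic operation instead of a separate read and increment. The sketch below assumes that approach; ThreadIdDemo, NextThreadId, and Run are hypothetical names, and only _threadInitCount comes from the question.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical demo class; only _threadInitCount is taken from the question.
public static class ThreadIdDemo
{
    private static int _threadInitCount;

    // Interlocked.Increment returns the value after incrementing,
    // so subtracting 1 makes the first ID 0.
    private static int NextThreadId()
    {
        return Interlocked.Increment(ref _threadInitCount) - 1;
    }

    // Starts `threadCount` threads, lets each reserve an ID, and returns the sorted IDs.
    public static List<int> Run(int threadCount)
    {
        _threadInitCount = 0;
        var ids = new List<int>();
        var threads = new List<Thread>();

        for (int i = 0; i < threadCount; i++)
        {
            var thread = new Thread(() =>
            {
                int threadID = NextThreadId();
                lock (ids) { ids.Add(threadID); }  // List<T> is not thread-safe, so guard it
            });
            threads.Add(thread);
            thread.Start();
        }

        foreach (Thread thread in threads) thread.Join();
        ids.Sort();
        return ids;
    }
}
```

With four threads, the sorted result contains 0, 1, 2 and 3 exactly once each, regardless of which thread ran first.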
I am currently reading this excellent article on threading and read the following text:
Thread.Sleep(0) relinquishes the thread’s current time slice immediately, voluntarily handing over the CPU to other threads.
I wanted to test this and below is my test code:
static string s = "";
static void Main(string[] args)
{
//Create two threads that append string s
Thread threadPoints = new Thread(SetPoints);
Thread threadNewLines = new Thread(SetNewLines);
//Start threads
threadPoints.Start();
threadNewLines.Start();
//Wait one second for threads to manipulate string s
Thread.Sleep(1000);
//Threads have an infinite loop so we have to close them forcefully.
threadPoints.Abort();
threadNewLines.Abort();
//Print string s and wait for user-input
Console.WriteLine(s);
Console.ReadKey();
}
The functions that threadPoints and threadNewLines run:
static void SetPoints()
{
while(true)
{
s += ".";
}
}
static void SetNewLines()
{
while(true)
{
s += "\n";
Thread.Sleep(0);
}
}
If I understand Thread.Sleep(0) correctly, the output should be something like this:
............ |
.............. |
................ | <- End of console
.......... |
............. |
............... |
But I get this as output:
....................|
....................|
.... |
|
|
....................|
....................|
................. |
|
Seeing as the article mentioned in the beginning of the post is highly recommended by many programmers, I can only assume that my understanding of Thread.Sleep(0) is wrong. So if someone could clarify, I'd be much obliged.
What Thread.Sleep(0) does is free the CPU to handle other threads, but that doesn't mean the calling thread can't be the next one to run. If you're trying to hand control to a specific other thread, use some sort of signal.
If you have access to a machine (or perhaps a VM) with only a single core/processor, try running your code on that machine. You may be surprised with how the results vary. Just because two threads refer to the same variable "s", does not mean they actually refer to the same value at the same time, due to various levels of caching that can occur on modern multi-core (and even just parallel pipeline) CPUs. If you want to see how the yielding works irrespective of the caching issues, try wrapping each s += expression inside a lock statement.
If you expanded the width of the console to 5 times its current width, you'd see what you expect: lines that don't reach the console width. The problem is that one time slice is actually very long. So, to get the expected effect with a normal console width, you'd have to slow down the Points thread, but without using Sleep. Instead of the while (true) loop, try this:
for (int i = 0;; i++)
{
if (i % 10 == 0)
s += ".";
}
To slow down the thread even more, replace 10 with a bigger number.
Which thread the processor runs next is up to the scheduler, and it could even be the same thread that just called Thread.Sleep(0). To find out whether another thread actually got to run, call Thread.Yield() and check its return value: it returns true if the OS switched execution to another thread, and false otherwise.
You should (almost) never abort threads. The best practice is to signal them to stop on their own.
This is normally accomplished by setting some boolean variable; the threads inspect its value to decide whether or not to continue executing.
You are modifying a shared string variable named "s" from two threads, so you will run into race conditions. The string type itself is immutable, but s += "." reads s, builds a new string, and writes the reference back, and that sequence is not atomic. You can wrap the operations that manipulate it in a lock or use a built-in type that is thread-safe.
Always check the documentation to find out whether the types you use are thread-safe.
Because of this, you can't rely on your results: your program is not thread-safe.
If you run the program several times, my guess is that you'll get different outputs.
Note: When using a boolean to share some state to cancel threads, make sure it is marked as volatile. Otherwise the JIT might optimize the code so that it never looks at the changed value.
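Put together, those suggestions might look roughly like the sketch below. It is not the asker's program: CooperativeStopDemo, AppendLoop, and Run are invented names, a StringBuilder guarded by a lock stands in for the shared string, and Thread.Yield() is used in both loops purely for illustration.

```csharp
using System.Text;
using System.Threading;

// Hypothetical demo class showing a volatile stop flag plus a lock around shared state.
public static class CooperativeStopDemo
{
    private static volatile bool _stopRequested;  // volatile so the loops re-read it every pass
    private static readonly object _gate = new object();
    private static readonly StringBuilder _buffer = new StringBuilder();

    private static void AppendLoop(string text)
    {
        while (!_stopRequested)
        {
            lock (_gate) { _buffer.Append(text); }  // only one thread touches the buffer at a time
            Thread.Yield();                         // give other ready threads a chance to run
        }
    }

    // Runs both appenders for roughly `milliseconds`, asks them to stop, and returns the text.
    public static string Run(int milliseconds)
    {
        _stopRequested = false;
        lock (_gate) { _buffer.Clear(); }

        var points = new Thread(() => AppendLoop("."));
        var newLines = new Thread(() => AppendLoop("\n"));
        points.Start();
        newLines.Start();

        Thread.Sleep(milliseconds);
        _stopRequested = true;  // signal instead of Abort()

        points.Join();          // wait for both loops to notice the flag and exit
        newLines.Join();
        return _buffer.ToString();
    }
}
```

Because both threads exit through the flag and are joined before the buffer is read, nothing is torn down mid-append, which is the main thing Abort() cannot promise.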
Im trying to figure out the output of this code:
Dictionary<int, MyRequest> request = new Dictionary<int, MyRequest>();
for (int i = 0; i < 1000; i++ )
{
request.Add(i, new MyRequest() { Name = i.ToString() });
}
var ids = request.Keys.ToList();
Parallel.For(0, ids.Count, (t) =>
{
var id = ids[t];
var b = request[id];
lock (b)
{
if (b.Name == 4.ToString())
{
Thread.Sleep(10000);
}
Console.WriteLine(b.Name);
}
});
Console.WriteLine("done");
Console.Read();
output:
789
800
875
.
.
.
4
5
6
7
done
MyRequest is just a dummy class used for demonstration (it is not doing anything but holding values). Is my lock blocking the execution or are the last 4 being put on their own thread?
This is a .NET 4.0 demo.
UPDATE
OK, I did figure out that they were on the same thread, but I would still like to know if the lock does anything to block execution. I can't imagine it does.
If ids does not contain duplicates, that lock won't block anything. But if there are duplicates in ids, then yes, there might be contention at the lock, as different threads fight for access to the same request.
Your lock will only be blocking execution if the ids line up such that you retrieve the same request more than once. Since different names are being printed each time, that shouldn't be a concern.
Parallel.For uses a thread pool to process your loop. As soon as one of its threads is free, it assigns it to the next element. This is non-deterministic, because you don't know how many threads there are in the pool, and you don't control the CPU time given to each thread. This means that some threads may finish sooner or later than you would "naturally" expect.
Your lock isn't doing anything. A lock delimits a section of code that only one thread at a time may execute for a given lock object. In your case, you never use the same object twice in the loop, so no two iterations ever compete for the same lock. The fact that the last IDs processed seem consistent is probably purely coincidental.
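One way to convince yourself that locks on different objects don't block each other is the small experiment sketched here. LockScopeDemo and BothInsideAtOnce are made-up names, and the 500 ms timeout is an arbitrary choice. Two threads each take a lock and then wait at a two-party Barrier while still holding it, so both can only arrive if both are inside their lock blocks at the same time.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; not part of the original question's code.
public static class LockScopeDemo
{
    // Returns true if two threads managed to be inside their lock blocks simultaneously.
    public static bool BothInsideAtOnce(object lockA, object lockB)
    {
        bool firstArrived = false;
        bool secondArrived = false;

        using (var barrier = new Barrier(2))
        {
            var first = new Thread(() =>
            {
                lock (lockA)
                {
                    // Wait (while holding the lock) for the other thread to arrive too.
                    firstArrived = barrier.SignalAndWait(TimeSpan.FromMilliseconds(500));
                }
            });
            var second = new Thread(() =>
            {
                lock (lockB)
                {
                    secondArrived = barrier.SignalAndWait(TimeSpan.FromMilliseconds(500));
                }
            });

            first.Start();
            second.Start();
            first.Join();
            second.Join();
        }

        return firstArrived && secondArrived;
    }
}
```

BothInsideAtOnce(new object(), new object()) should return true, which is the situation in the Parallel.For loop where every iteration locks its own MyRequest. Passing the same object twice should return false after the timeouts, because the second thread cannot enter until the first has left.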
I am reading http://www.mono-project.com/ThreadsBeginnersGuide.
The first example looks like this:
public class FirstUnsyncThreads {
private int i = 0;
public static void Main (string[] args) {
FirstUnsyncThreads myThreads = new FirstUnsyncThreads ();
}
public FirstUnsyncThreads () {
// Creating our two threads. The ThreadStart delegate is points to
// the method being run in a new thread.
Thread firstRunner = new Thread (new ThreadStart (this.firstRun));
Thread secondRunner = new Thread (new ThreadStart (this.secondRun));
// Starting our two threads. Thread.Sleep(10) gives the first Thread
// 10 miliseconds more time.
firstRunner.Start ();
Thread.Sleep (10);
secondRunner.Start ();
}
// This method is being excecuted on the first thread.
public void firstRun () {
while(this.i < 10) {
Console.WriteLine ("First runner incrementing i from " + this.i +
" to " + ++this.i);
// This avoids that the first runner does all the work before
// the second one has even started. (Happens on high performance
// machines sometimes.)
Thread.Sleep (100);
}
}
// This method is being excecuted on the second thread.
public void secondRun () {
while(this.i < 10) {
Console.WriteLine ("Second runner incrementing i from " + this.i +
" to " + ++this.i);
Thread.Sleep (100);
}
}
}
Output:
First runner incrementing i from 0 to 1
Second runner incrementing i from 1 to 2
Second runner incrementing i from 3 to 4
First runner incrementing i from 2 to 3
Second runner incrementing i from 5 to 6
First runner incrementing i from 4 to 5
First runner incrementing i from 6 to 7
Second runner incrementing i from 7 to 8
Second runner incrementing i from 9 to 10
First runner incrementing i from 8 to 9
Wow, what is this? Unfortunately, the explanation in the article is inadequate for me. Can you explain to me why the increments happened in a jumbled order?
Thanks!
I think the writer of the article has confused things.
VoteyDisciple is correct that ++i is not atomic and a race condition can occur if the target is not locked during the operation but this will not cause the issue described above.
If a race condition occurs calling ++i then internal operations of the ++ operator will look something like:-
1st thread reads value 0
2nd thread reads value 0
1st thread increments value to 1
2nd thread increments value to 1
1st thread writes value 1
2nd thread writes value 1
The order of operations 3 to 6 is unimportant, the point is that both the read operations, 1 and 2, can occur when the variable has value x resulting in the same incrementation to y, rather than each thread performing incrementations for distinct values of x and y.
This may result in the following output:-
First runner incrementing i from 0 to 1
Second runner incrementing i from 0 to 1
What would be even worse is the following:-
1st thread reads value 0
2nd thread reads value 0
2nd thread increments value to 1
2nd thread writes value 1
2nd thread reads value 1
2nd thread increments value to 2
2nd thread writes value 2
1st thread increments value to 1
1st thread writes value 1
2nd thread reads value 1
2nd thread increments value to 2
2nd thread writes value 2
This may result in the following output:-
First runner incrementing i from 0 to 1
Second runner incrementing i from 0 to 1
Second runner incrementing i from 1 to 2
Second runner incrementing i from 1 to 2
And so on.
Furthermore, there is a possible race condition between reading i and performing ++i since the Console.WriteLine call concatenates i and ++i. This may result in output like:-
First runner incrementing i from 0 to 1
Second runner incrementing i from 1 to 3
First runner incrementing i from 1 to 2
The jumbled console output which the writer has described can only result from the unpredictability of the console output and has nothing to do with a race condition on the i variable. Taking a lock on i whilst performing ++i or whilst concatenating i and ++i will not change this behaviour.
When I run this (on a dualcore), my output is
First runner incrementing i from 0 to 1
Second runner incrementing i from 1 to 2
First runner incrementing i from 2 to 3
Second runner incrementing i from 3 to 4
First runner incrementing i from 4 to 5
Second runner incrementing i from 5 to 6
First runner incrementing i from 6 to 7
Second runner incrementing i from 7 to 8
First runner incrementing i from 8 to 9
Second runner incrementing i from 9 to 10
As I would have expected. You are running two loops, both executing Sleep(100). That is very ill suited to demonstrate a race-condition.
The code does have a race condition (as VoteyDisciple describes) but it is very unlikely to surface.
I can't explain the lack of order in your output (is it a real output?), but the Console class will synchronize output calls.
If you leave out the Sleep() calls and run the loops 1000 times (instead of 10) you might see two runners both incrementing from 554 to 555 or something.
Synchronization is essential when multiple threads are present. In this case you are seeing that both threads read and write this.i, but no real attempt is made to synchronize these accesses. Since both of them concurrently modify the same memory location, you observe the jumbled output.
The call to Sleep is dangerous; it is an approach that reliably leads to bugs. You cannot assume that the threads will always be offset by the initial 10 ms.
In short: Never use Sleep for synchronization :-) but instead adopt some kind of thread synchronization technique (e.g. locks, mutexes, semaphores). Always try to use the lightest possible lock that will fulfill your need.
A useful resource is the book by Joe Duffy, Concurrent Programming on Windows.
The increments are not happening out of order, the Console.WriteLine(...) is writing the output from multiple threads into a single-threaded console, and the synchronization from many threads to one thread is causing the messages to appear out of order.
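To see the difference between the increments and the order in which they are reported, one option is to make the read, the increment, and the logging a single locked step. The sketch below assumes that approach and records lines in a list instead of writing to the console; OrderedLogDemo, Runner, and Run are hypothetical names and not the article's code.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical demo class; the message format mirrors the article's output.
public static class OrderedLogDemo
{
    private static readonly object _gate = new object();
    private static readonly List<string> _log = new List<string>();
    private static int _i;

    private static void Runner(string name, int limit)
    {
        while (true)
        {
            lock (_gate)
            {
                if (_i >= limit) return;  // check and update under the same lock
                int from = _i;
                int to = ++_i;
                // Logging inside the lock keeps report order equal to increment order.
                _log.Add(name + " runner incrementing i from " + from + " to " + to);
            }
        }
    }

    // Runs two runners until i reaches `limit` and returns the recorded lines.
    public static List<string> Run(int limit)
    {
        _i = 0;
        lock (_gate) { _log.Clear(); }

        var first = new Thread(() => Runner("First", limit));
        var second = new Thread(() => Runner("Second", limit));
        first.Start();
        second.Start();
        first.Join();
        second.Join();

        return new List<string>(_log);
    }
}
```

Which runner performs which increment still varies from run to run, but line k always reports "from k to k+1", because no other thread can slip in between building the message and recording it.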
I assume this example attempted to create a race condition, and in your case failed. Unfortunately, concurrency issues, such as a race condition and deadlocks, are hard to predict and reproduce due to their nature. You might want to try and run it a few more times, alter it to use more threads and each thread should increment more times (say 100,000). Then you might see that the end result will not equal the sum of all the increments (caused by a race condition).