I have a problem when multiple threads try to increment an int. Here's my code:
private int _StoreIndex;
private readonly List<Store> _Stores = new List<Store>();
public void TestThreads()
{
_StoreIndex = 0;
for (int i = 0; i < 20; i++)
{
Thread thread = new Thread(() =>
{
while (_StoreIndex < _Stores.Count - 1)
{
_Stores[Interlocked.Increment(ref _StoreIndex)].CollectData();
}
});
thread.Start();
}
}
I would expect the int to be increased by one each time a thread executes this code. However, it is not. I have also tried using lock (new object()), but that doesn't work either. The problem is that not all the stores collect data because, when debugging, _StoreIndex goes 0, 1, 1, 3, 4, 5, for example. The second object in the list is obviously skipped.
What am I doing wrong? Thanks in advance.
In your case I would use the TPL to avoid all of these problems with manual thread creation and indexes in the first place:
Parallel.ForEach(_Stores, (store) => store.CollectData());
I think it should be corrected to:
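Thread thread = new Thread(() =>
{
    int index;
    // Increment returns the new value, so subtract 1 to start at element 0;
    // compare against Count (not Count - 1) so the last store is not skipped.
    while ((index = Interlocked.Increment(ref _StoreIndex) - 1) < _Stores.Count)
    {
        _Stores[index].CollectData();
    }
});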
Now index is local, so there is no interference, while _StoreIndex is only used atomically in a single place.
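To make the index-claiming idea concrete, here is a minimal, self-contained sketch. It is not the asker's code: IndexClaimDemo, Run, hits and next are hypothetical names, and an int array stands in for the list of stores. Each thread claims a slot by atomically incrementing the shared counter and keeps the result in a local variable. Because Interlocked.Increment returns the new value, the sketch subtracts 1 so that slot 0 is claimed first.

```csharp
using System;
using System.Threading;

public static class IndexClaimDemo
{
    // Returns, for each of itemCount slots, how many times it was processed.
    public static int[] Run(int itemCount, int threadCount)
    {
        int[] hits = new int[itemCount]; // stand-in for the list of stores
        int next = 0;                    // shared counter: next unclaimed slot
        Thread[] threads = new Thread[threadCount];

        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                int index;
                // Claim a slot atomically and keep it in a local variable.
                while ((index = Interlocked.Increment(ref next) - 1) < itemCount)
                {
                    hits[index]++; // only the claiming thread touches this slot
                }
            });
            threads[t].Start();
        }

        // Wait for every worker before reading the results.
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
        return hits;
    }
}
```

Calling IndexClaimDemo.Run(1000, 8) should return an array in which every entry is 1, since no slot can be claimed twice and none is skipped.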
This is not an atomic operation:
_Stores[Interlocked.Increment(ref _StoreIndex)].CollectData();
Increment is atomic, but this line contains more code than a simple increment. You may need to sort out your indices first, then use a thread-safe collection such as ConcurrentBag to hold your stores, and perhaps consider the TPL and classes like Task and Parallel to perform the workload.
Let's start with the code.
checkedUnlock is a HashSet<ulong>
_hashsetLock is an object
lock (_hashsetLock)
newMap = checkedUnlock.Add(uniqueId);
vs
fun is an int
SpinWait.SpinUntil(() => Interlocked.CompareExchange(ref fun, 1, 0) == 1);
newMap = checkedUnlock.Add(uniqueId);
fun = 0;
My understanding is that the SpinWait in this scenario should work like the lock(), but more items end up in the HashSet: sometimes the count matches the lock version, and sometimes there are 1 to 5 more items in it, which makes it obvious that it doesn't work.
Is my understanding flawed?
edit
I tried this and it seems to work; my tests show the same number as lock() so far:
SpinWait spin = new SpinWait();
while (Interlocked.CompareExchange(ref fun, 1, 0) == 1)
spin.SpinOnce();
So why would it work with this but not with SpinWait.SpinUntil()?
edit #2
A small, complete application to reproduce it:
In this code, SpinWait.SpinUntil will sometimes blow up (the Add will throw an exception), and when it does work, the count is different, so my expectation of how it behaves is wrong.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var list = new List<int>();
var rnd = new Random(42);
for (var i = 0; i < 1000000; ++i)
list.Add(rnd.Next(500000));
object _lock1 = new object();
var hashset1 = new HashSet<int>();
int _lock2 = 0;
var hashset2 = new HashSet<int>();
int _lock3 = 0;
var hashset3 = new HashSet<int>();
Parallel.ForEach(list, item =>
{
/******************/
lock (_lock1)
hashset1.Add(item);
/******************/
/******************/
SpinWait.SpinUntil(() => Interlocked.CompareExchange(ref _lock2, 1, 0) == 1);
hashset2.Add(item);
_lock2 = 0;
/******************/
/******************/
SpinWait spin = new SpinWait();
while (Interlocked.CompareExchange(ref _lock3, 1, 0) == 1)
spin.SpinOnce();
hashset3.Add(item);
_lock3 = 0;
/******************/
});
Console.WriteLine("Lock: {0}", hashset1.Count);
Console.WriteLine("SpinWaitUntil: {0}", hashset2.Count);
Console.WriteLine("SpinWait: {0}", hashset3.Count);
Console.ReadKey();
}
}
}
The condition used in SpinWait.SpinUntil is wrong.
Interlocked.CompareExchange returns the original value of the variable.
The MSDN documentation for SpinWait.SpinUntil describes the condition parameter as:
A delegate to be executed over and over until it returns true.
You want to spin until a 0 -> 1 transition occurs, so the condition should be
Interlocked.CompareExchange(ref fun, 1, 0) == 0
Subsequent calls to CompareExchange on other threads return 1, so they will wait until the fun flag is restored to 0 by the "winner" thread.
Some further remarks:
fun = 0; should work on x86 architecture, but I'm not sure it's correct everywhere. If you use Interlocked to access a field, it's a best practice to use Interlocked for all access to that field. So I suggest Interlocked.Exchange(ref fun, 0) instead.
SpinWait is rarely a good solution regarding performance as it prevents the OS putting the spinning thread into an idle state. It should be used for very short waits only. (An example of a proper usage). Simple locks (aka Monitor.Enter/Exit) or SemaphoreSlim will do in general or you can consider ReaderWriterLockSlim if # of reads >> # of writes.
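Putting the corrected condition and the Interlocked release together, a minimal sketch might look like the following. SpinFlagDemo, CountDistinct and flag are hypothetical names chosen for illustration, and the try/finally is an added safety measure rather than something the original code had. As noted above, a plain lock is usually the better choice; this only shows the spin variant working.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SpinFlagDemo
{
    // Adds every item to a HashSet from parallel workers, guarded by an int flag.
    public static int CountDistinct(IEnumerable<int> items)
    {
        var set = new HashSet<int>();
        int flag = 0; // 0 = free, 1 = held

        Parallel.ForEach(items, item =>
        {
            // Acquire: CompareExchange returns the ORIGINAL value, so a result
            // of 0 means this thread performed the 0 -> 1 transition.
            SpinWait.SpinUntil(() => Interlocked.CompareExchange(ref flag, 1, 0) == 0);
            try
            {
                set.Add(item); // HashSet is not thread-safe; only the holder gets here
            }
            finally
            {
                // Release through Interlocked too, so every access to the flag
                // goes through the same mechanism.
                Interlocked.Exchange(ref flag, 0);
            }
        });

        return set.Count;
    }
}
```

With this condition, the count should match what the lock() version produces for the same input.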
I want to assign an index in the range (0 to MaxDegreeOfParallelism) to each incoming thread in a Parallel.For. I have tried using CurrentThread.ManagedThreadId and CurrentThread.Name, but they are not in a given range and I think they sometimes change values (is this right?). What I am using now is a ConcurrentStack in the following way, which works, but I think it is making my loop horribly slow. Any idea how I could solve this problem? Thanks!!
int nCPU = 16; //number of CORES
ConcurrentStack<int> cs = new ConcurrentStack<int>();
for (int i = 0; i < nCPU; i++) { cs.Push(i); }
options.MaxDegreeOfParallelism = nCPU - 1;
Parallel.For(0, 10000, options, tt =>
{
int threadIndex;
cs.TryPop(out threadIndex); //assign an index to the incoming thread
//..... do stuff with all my variables matrix1[threadIndex][i][j], matrix2.......//
cs.Push(threadIndex); //the thread leaves the previously assigned index
});
I have the following TPL function:
int arrayIndex = 0;
Dictionary < string, int > customModel = new Dictionary < string, int > ();
Task task = Task.Factory.StartNew(() =>
// process each employee holiday
Parallel.ForEach < EmployeeHolidaysModel > (holidays,
new ParallelOptions() {
MaxDegreeOfParallelism = System.Environment.ProcessorCount
},
item => {
customModel.Add(item.HolidayName, arrayIndex);
// increment the index
arrayIndex++;
})
);
//wait for all Tasks to finish
Task.WaitAll(task);
The problem is that arrayIndex won't have unique values because of the Parallelism.
Is there a way I can control the arrayIndex variable so between parallel tasks the value is unique?
Basically in my customModel I can't have a duplicate arrayIndex value.
Appreciate any help.
Three problems here:
You are writing to shared variables (both the int and the dictionary). This is unsafe. You must either synchronize or use thread-safe collections.
The amount of work that you're doing per iteration is so small that the overhead of parallelism will be multiple orders of magnitude bigger. This is not a good case for parallelism. Expect major slowdowns.
You start a task, then wait for it. What did you mean to accomplish by doing that?
I think you need a basic tutorial about threading. These are very basic issues. You won't be having fun using multi-threading at your current level of knowledge...
You'll need to use Interlocked.Increment(). You should probably also use ConcurrentDictionary to be safe, assuming that's not just sample-code you cooked up for the question.
Similarly, the Task isn't necessary here, since you're just waiting on it to finish filling customModel. Obviously, your scenario may be more complex.
But given the code you posted, I'd do something like:
int arrayIndex = 0;
ConcurrentDictionary<string,int> customModel
= new ConcurrentDictionary<string,int>();
Parallel.ForEach<EmployeeHolidaysModel>(
holidays,
new ParallelOptions() {
MaxDegreeOfParallelism = System.Environment.ProcessorCount
},
item => customModel.TryAdd(
item.HolidayName,
Interlocked.Increment(ref arrayIndex)
)
);
NowYouCanDoSomethingWith(customModel);
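As a self-contained variant of the same idea, the sketch below uses plain strings in place of EmployeeHolidaysModel; UniqueIndexDemo, Build and counter are hypothetical names. One detail worth knowing: Interlocked.Increment returns the incremented value, so the counter starts at -1 here to make the first assigned index 0. If a name occurs twice, TryAdd fails for the duplicate and that index is simply left unused.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class UniqueIndexDemo
{
    // Maps each name to an index that no other name receives.
    public static ConcurrentDictionary<string, int> Build(IEnumerable<string> names)
    {
        var model = new ConcurrentDictionary<string, int>();
        int counter = -1; // first Increment yields 0

        Parallel.ForEach(names, name =>
        {
            // Increment is atomic, so no two iterations can obtain the same value.
            int index = Interlocked.Increment(ref counter);
            model.TryAdd(name, index);
        });

        return model;
    }
}
```

The indexes are unique but not ordered: which name gets which index depends on how the iterations happen to be scheduled.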
I have a List to loop over using multiple threads. I get the first item of the List, do some processing, then remove the item.
When the count of the List is no longer greater than 0, I fetch more data from the database.
In a word:
I have a lot of records in my database. I need to publish them to my server. Publishing must be multithreaded, and the number of threads may be 10 or fewer.
For example:
private List<string> list;
void LoadDataFromDatabase(){
list=...;//load data from database...
}
void DoMethod()
{
while(list.Count>0)
{
var item=list.FirstOrDefault();
list.RemoveAt(0);
DoProcess();//how to use multi-thread (custom the count of threads)?
if(list.Count<=0)
{
LoadDataFromDatabase();
}
}
}
Please help me. I'm a beginner in C#; I have searched for a lot of solutions, but found nothing similar.
Also, I need to be able to customize the number of threads.
Does your processing of the list have to be sequential? In other words, is it impossible to process element n + 1 before element n has finished processing? If so, then multithreading is not the right solution.
Otherwise, if your elements can be processed fully independently, you can use m threads, dividing the list so that each thread works on Elements.Count / m elements.
Example: printing a list:
List<int> a = new List<int> { 1, 2, 3, 4,5 , 6, 7, 8, 9 , 10 };
int num_threads = 2;
int thread_elements = a.Count / num_threads;
// start the threads
Thread[] threads = new Thread[num_threads];
for (int i = 0; i < num_threads; ++i)
{
threads[i] = new Thread(new ParameterizedThreadStart(Work)); // Work takes an argument, so ThreadStart would not compile
threads[i].Start(i);
}
// this works fine if the total number of elements is divisible by num_threads
// but if we have 500 elements and 7 threads, then thread_elements = 500 / 7 = 71
// and 71 * 7 = 497, so 3 elements are left unprocessed
// process them here:
int actual = thread_elements * num_threads;
for (int i = actual; i < a.Count; ++i)
Console.WriteLine(a[i]);
// wait all threads to finish
for (int i = 0; i < num_threads; ++i)
{
threads[i].Join();
}
void Work(object arg)
{
Console.WriteLine("Thread #" + arg + " has begun...");
// calculate my working range [start, end)
int id = (int)arg;
int mystart = id * thread_elements;
int myend = (id + 1) * thread_elements;
// start work on my range !!
for (int i = mystart; i < myend; ++i)
Console.WriteLine("Thread #" + arg + " Element " + a[i]);
}
ADD For your case (uploading to a server), it is the same as the code above. You choose a number of threads and each thread is assigned a number of elements (calculated automatically in the variable thread_elements, so you only need to change num_threads). In the Work method, all you need to do is replace the line Console.WriteLine("Thread #" + arg + " Element " + a[i]); with your uploading code.
One more thing to keep in mind is that multithreading depends on your machine's CPU. If your CPU has 4 cores, for example, then the best performance for CPU-bound work is obtained with at most 4 threads, one per core. If you have 10 threads, they will be slower than 4 threads because they compete for CPU cores, unless the threads are mostly idle, waiting for some event to occur (e.g. an upload). In that case 10 threads can run, because they don't take 100% of the CPU.
WARNING: DO NOT modify the list (add, remove, set element...) while any thread is working, and do not assign the same element to two threads. Such things cause a lot of bugs and exceptions!
This is a simple scenario that can be expanded in multiple ways if you add some details to your requirements:
IEnumerable<Data> LoadDataFromDatabase()
{
return ...
}
void ProcessInParallel()
{
while(true)
{
var data = LoadDataFromDatabase().ToList();
if(!data.Any()) break;
data.AsParallel().ForAll(ProcessSingleData); // PLINQ's method is ForAll; ParallelQuery has no ForEach
}
}
void ProcessSingleData(Data d)
{
// do something with data
}
There are many ways to approach this. You can create threads and partition the list yourself, or you can take advantage of the TPL and use Parallel.ForEach. In the linked example you can see that an Action is called for each member of the list being iterated over. If this is your first taste of threading, I would also attempt to do it the old-fashioned way.
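Since the question also asks for a configurable number of threads, here is a minimal sketch of Parallel.ForEach with a cap on concurrency. BoundedParallelDemo, ProcessAll and Publish are hypothetical names, and Publish is only a placeholder for the real upload. MaxDegreeOfParallelism is an upper bound: the TPL may use fewer workers, but never more.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelDemo
{
    // Processes every record with at most maxThreads concurrent workers
    // and returns how many records were handled.
    public static int ProcessAll(IReadOnlyList<string> records, int maxThreads)
    {
        int processed = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(records, options, record =>
        {
            Publish(record);
            Interlocked.Increment(ref processed); // shared counter, so update atomically
        });

        return processed;
    }

    // Placeholder for the real work (e.g. uploading one record to a server).
    private static void Publish(string record)
    {
        Thread.Sleep(1);
    }
}
```

For example, BoundedParallelDemo.ProcessAll(records, 10) handles each record exactly once without the caller creating or joining any threads.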
Here is my opinion ;)
You can avoid multithreading altogether if your "List" is not really huge.
Instead of a List, you can use a Queue (FIFO - First In, First Out). Then simply use the Dequeue() method to take one element from the Queue, do some work on it, and take the next one. Something like:
while(queue.Count > 0)
{
var temp = DoSomeWork(queue.Dequeue());
}
I think that this will be better for your purpose.
I will get the first item of the List and do some processing,then remove the item.
Bad.
First, you want a queue, not a list.
Second, you do not process then remove, you remove THEN process.
Why?
So that you keep the locks short. Lock the queue (note that you need to synchronize access), remove the item, THEN unlock immediately, and only then process. If you take, process, then remove, you are basically single-threaded, because you have to hold the lock while processing so that the next thread does not take the same item again.
And since you need to synchronize access and want multiple threads, this is about the only way.
Read up on the lock statement for a start (you can later move to something like SpinLock). Do NOT use raw threads unless you have to; instead schedule Tasks (using the Task API introduced in .NET 4.0), which gives you more flexibility.
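A minimal sketch of the "remove under the lock, process outside it" pattern could look like this. LockedQueueDemo, SumWithWorkers, gate and sums are hypothetical names, and adding each item to a per-worker sum stands in for the real processing. The lock is held only for the Count check and the Dequeue.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LockedQueueDemo
{
    // Drains the queue with workerCount tasks and returns the sum of all items.
    public static long SumWithWorkers(IEnumerable<int> items, int workerCount)
    {
        var queue = new Queue<int>(items);
        object gate = new object();          // guards every access to the queue
        long[] sums = new long[workerCount]; // one slot per worker, never shared
        Task[] workers = new Task[workerCount];

        for (int w = 0; w < workerCount; w++)
        {
            int id = w; // copy the loop variable so each lambda sees its own value
            workers[w] = Task.Run(() =>
            {
                while (true)
                {
                    int item;
                    lock (gate)
                    {
                        if (queue.Count == 0) return; // nothing left: worker ends
                        item = queue.Dequeue();       // remove first...
                    }
                    sums[id] += item;                 // ...then process, lock released
                }
            });
        }

        Task.WaitAll(workers);

        long total = 0;
        foreach (long s in sums) total += s;
        return total;
    }
}
```

ConcurrentQueue<T> with TryDequeue would remove the need for the explicit lock; the version above keeps the lock visible to match the explanation.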
I wrote this experiment to demonstrate to someone that accessing shared data concurrently with multiple threads was a big no-no. To my surprise, regardless of how many threads I created, I was not able to create a concurrency issue and the value always resulted in a balanced value of 0. I know that the increment operator is not thread-safe, which is why there are methods like Interlocked.Increment() and Interlocked.Decrement() (also noted here: Is the ++ operator thread safe?).
If the increment/decrement operator is not thread-safe, then why does the code below execute without any issues and result in the expected value?
The snippet below creates 2,000 threads: 1,000 constantly incrementing and 1,000 constantly decrementing, to ensure that the data is being accessed by multiple threads at the same time. What makes it worse is that in a normal program you would not have nearly as many threads. Yet despite the exaggerated numbers in an effort to create a concurrency issue, the value always ends up being a balanced value of 0.
static void Main(string[] args)
{
Random random = new Random();
int value = 0;
for (int x=0; x<1000; x++)
{
Thread incThread = new Thread(() =>
{
for (int y=0; y<100; y++)
{
Console.WriteLine("Incrementing");
value++;
}
});
Thread decThread = new Thread(() =>
{
for (int z=0; z<100; z++)
{
Console.WriteLine("Decrementing");
value--;
}
});
incThread.Start();
decThread.Start();
}
Thread.Sleep(TimeSpan.FromSeconds(15));
Console.WriteLine(value);
Console.ReadLine();
}
I'm hoping someone can provide me with an explanation so that I know that all my effort into writing thread-safe software is not in vain, or perhaps this experiment is flawed in some way. I have also tried with all threads incrementing and using the ++i instead of i++. The value always results in the expected value.
You'll usually only see issues if you have two threads which are incrementing and decrementing at very close times. (There are also memory model issues, but they're separate.) That means you want them spending most of the time incrementing and decrementing, in order to give you the best chance of the operations colliding.
Currently, your threads will be spending the vast majority of the time sleeping or writing to the console. That's massively reducing the chances of collision.
Additionally, I'd note that absence of evidence is not evidence of absence - concurrency issues can indeed be hard to provoke, particularly if you happen to be running on a CPU with a strong memory model and internally-atomic increment/decrement instructions that the JIT can use. It could be that you'll never provoke the problem on your particular machine - but that the same program could fail on another machine.
IMO these loops are too short. I bet that by the time the second thread starts the first thread has already finished executing its loop and exited. Try to drastically increase the number of iterations that each thread executes. At this point you could even spawn just two threads (remove the outer loop) and it should be enough to see wrong values.
For example, with the following code I'm getting totally wrong results on my system:
static void Main(string[] args)
{
Random random = new Random();
int value = 0;
Thread incThread = new Thread(() =>
{
for (int y = 0; y < 2000000; y++)
{
value++;
}
});
Thread decThread = new Thread(() =>
{
for (int z = 0; z < 2000000; z++)
{
value--;
}
});
incThread.Start();
decThread.Start();
incThread.Join();
decThread.Join();
Console.WriteLine(value);
}
In addition to Jon Skeet's answer:
A simple test that, at least on my little dual-core machine, shows the problem easily:
Sub Main()
Dim i As Long = 1
Dim j As Long = 1
Dim f = Sub()
While Interlocked.Read(j) < 10 * 1000 * 1000
i += 1
Interlocked.Increment(j)
End While
End Sub
Dim l As New List(Of Task)
For n = 1 To 4
l.Add(Task.Run(f))
Next
Task.WaitAll(l.ToArray)
Console.WriteLine("i={0} j={1}", i, j)
Console.ReadLine()
End Sub
i and j should both have the same final value. But they don't!
EDIT
And in case you think that C# is cleverer than VB:
static void Main(string[] args)
{
long i = 1;
long j = 1;
Task[] t = new Task[4];
for (int k = 0; k < 4; k++)
{
t[k] = Task.Run(() => {
while (Interlocked.Read(ref j) < (long)(10*1000*1000))
{
i++;
Interlocked.Increment(ref j);
}});
}
Task.WaitAll(t);
Console.WriteLine("i = {0} j = {1}", i, j);
Console.ReadLine();
}
It isn't ;)
The result: i is around 15% (percent!) lower than j on my machine. A machine with eight hardware threads would probably make the effect even more pronounced, because the error is more likely to happen if several tasks truly run in parallel and are not just pre-empted.
The above code is flawed, of course :(
If a task is pre-empted just AFTER i++, all other tasks continue to increment i and j, so i would be expected to differ from j even if "++" were atomic. There is a simple solution, though:
static void Main(string[] args)
{
long i = 0;
int runs = 10*1000*1000;
Task[] t = new Task[Environment.ProcessorCount];
Stopwatch stp = Stopwatch.StartNew();
for (int k = 0; k < t.Length; k++)
{
t[k] = Task.Run(() =>
{
for (int j = 0; j < runs; j++ )
{
i++;
}
});
}
Task.WaitAll(t);
stp.Stop();
Console.WriteLine("i = {0} should be = {1} ms={2}", i, runs * t.Length, stp.ElapsedMilliseconds);
Console.ReadLine();
}
Now a task could be pre-empted somewhere in the loop statements, but that wouldn't affect i. So the only way to see an effect on i is for a task to be interrupted in the middle of the i++ statement itself. And that is what was to be shown: it CAN happen, and it's more likely to happen when you have fewer but longer-running tasks.
If you write Interlocked.Increment(ref i); instead of i++, the code runs much longer (because every increment becomes an atomic, synchronized operation), but i is exactly what it should be!
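For a side-by-side comparison, the two variants can be sketched as follows. CounterDemo, CountUnsafe and CountAtomic are hypothetical names. The unsafe version may or may not lose updates on any given run or machine, so only the atomic version's result is predictable.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Plain ++ from several tasks: read, add and write are separate steps,
    // so concurrent increments can overwrite each other.
    public static long CountUnsafe(int taskCount, int runs)
    {
        long total = 0;
        Task[] tasks = new Task[taskCount];
        for (int k = 0; k < taskCount; k++)
        {
            tasks[k] = Task.Run(() =>
            {
                for (int j = 0; j < runs; j++)
                {
                    total++;
                }
            });
        }
        Task.WaitAll(tasks);
        return total;
    }

    // Interlocked.Increment performs the whole read-modify-write atomically.
    public static long CountAtomic(int taskCount, int runs)
    {
        long total = 0;
        Task[] tasks = new Task[taskCount];
        for (int k = 0; k < taskCount; k++)
        {
            tasks[k] = Task.Run(() =>
            {
                for (int j = 0; j < runs; j++)
                {
                    Interlocked.Increment(ref total);
                }
            });
        }
        Task.WaitAll(tasks);
        return total;
    }
}
```

CountAtomic(4, 1000000) always returns 4000000, while CountUnsafe with the same arguments will often, though not necessarily, return a smaller number.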