So I have a method that gets a Dictionary of List<myObj>, then cycles through the keys of the dictionary and passes each List<myObj> to a separate thread.
Here is some code / pseudo-code:
public static void ProcessEntries() {
    Dictionary<string, List<myObj>> myDictionary = GetDictionary();
    foreach (string key in myDictionary.Keys)
    {
        List<myObj> myList = myDictionary[key];
        Thread myThread = new System.Threading.Thread(new System.Threading.ThreadStart(delegate() {
            ProcessList(myList);
        }));
        myThread.Start();
    }
}
public static void ProcessList(List<myObj> myList) {
// Process entries
// read-only operations on myList
}
The problem is that during execution of ProcessList the myList parameter simply changes.
I have looped through the list before kicking off the thread, and then immediately inside the thread, and I've found the results to be different.
I have since solved the problem (I think!) by making the Dictionary variable global. Using the [ThreadStatic] attribute is next on the list of possible fixes.
What I really want to know is: why does the myList object change inside ProcessList(), presumably when the myList variable is re-assigned in ProcessEntries()? Are these not two different Lists? If all parameter passing is by value by default, why does the ProcessList() function not have a local copy of myList? (Does it?)
Is there a way to specify that you want to pass a parameter to a thread and not have it be altered by the parent thread or other threads during execution? (This would be similar to the [ThreadStatic] attribute for static fields.)
I suspect your pseudo-code isn't actually an accurate reflection of your real code. I suspect your real code looks like this:
foreach(var pair in myDictionary)
{
Thread myThread = new Thread(delegate() {
ProcessList(pair.Value);
});
myThread.Start();
}
If that's the case, the problem is that the pair variable is being captured - so by the time your thread starts, it may be referring to a different key/value pair.
The way to fix it is to make the code more like your pseudo-code, copying the value into a variable declared inside the loop:
foreach(var pair in myDictionary)
{
// You'll get a new list variable on each iteration
var list = pair.Value;
Thread myThread = new Thread(delegate() {
ProcessList(list);
});
myThread.Start();
}
See Eric Lippert's blog post on this for more information.
If this isn't what's going wrong, please give a real example rather than pseudo-code. A short but complete example demonstrating the problem would be ideal.
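The capture behaviour described above can be reproduced without any threads. The following is a minimal sketch, not the asker's code: CaptureDemo and its method names are made up for illustration. It uses a for loop, whose loop variable is shared by all iterations. Since C# 5 the foreach iteration variable is a fresh variable on each iteration, so the foreach form of the bug only shows up with older compilers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class CaptureDemo
{
    // Every delegate captures the same variable 'i', so each one
    // sees the value 'i' has when the delegate is finally invoked.
    public static int[] CaptureLoopVariable()
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            readers.Add(() => i);
        }
        return readers.Select(read => read()).ToArray();
    }

    // Copying into a variable declared inside the loop body gives
    // each delegate its own variable to capture.
    public static int[] CaptureLocalCopy()
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            readers.Add(() => copy);
        }
        return readers.Select(read => read()).ToArray();
    }
}
```

CaptureLoopVariable() yields 3, 3, 3 because the delegates run after the loop has finished, while CaptureLocalCopy() yields 0, 1, 2. A thread body that starts late behaves like the first method.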
Also make sure that other threads don't modify the data your thread is using; protect shared state with locks (monitors). I had some issues with that just a few weeks ago.
You are passing a reference by value in that case, so if you modify the object it refers to somewhere, the change will be visible everywhere.
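To make "a reference passed by value" concrete, here is a small sketch (ReferenceByValueDemo and its methods are hypothetical names, not from the question). The callee gets its own copy of the reference: mutating the list through it is visible to the caller, while pointing the parameter at a new list is not.

```csharp
using System.Collections.Generic;

// Hypothetical demo class; the names are illustrative only.
public static class ReferenceByValueDemo
{
    // Mutates the object the caller's reference points to: visible to the caller.
    public static void Mutate(List<int> list) => list.Add(42);

    // Rebinds only this method's copy of the reference: invisible to the caller.
    public static void Reassign(List<int> list)
    {
        list = new List<int> { 1, 2, 3 };
    }

    public static int CountAfterBoth()
    {
        var original = new List<int>();
        Mutate(original);    // original now holds one item
        Reassign(original);  // original still refers to the same one-item list
        return original.Count;
    }
}
```

CountAfterBoth() returns 1: the Add went through, the reassignment did not. So ProcessList does have a local copy, but only of the reference, not of the list itself.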
Related
So I have a IList as the value in my ConcurrentDictionary.
ConcurrentDictionary<int, IList<string>> list1 = new ConcurrentDictionary<int, IList<string>>();
In order to update a value in a list I do this:
if (list1.ContainsKey(key))
{
IList<string> templist;
list1.TryGetValue(key, out templist);
templist.Add("helloworld");
}
However, does adding a string to templist update the ConcurrentDictionary? If so, is the update thread-safe so that no data corruption would occur?
Or is there a better way to update or create a list inside the ConcurrentDictionary?
EDIT
If I were to use a ConcurrentBag instead of a List, how would I implement this? More specifically, how could I update it? ConcurrentDictionary's TryUpdate method feels a bit excessive.
Does ConcurrentBag.Add update the ConcurrentDictionary in a thread-safe manner?
ConcurrentDictionary<int, ConcurrentBag<string>> list1 = new ConcurrentDictionary<int, ConcurrentBag<string>>();
Firstly, there's no need to call both ContainsKey() and TryGetValue().
You should just do this:
IList<string> templist;
if (list1.TryGetValue(key, out templist))
templist.Add("helloworld");
In fact, your code as written has a race condition.
Between one thread's calls to ContainsKey() and TryGetValue(), a different thread may have removed the item with that key. TryGetValue() will then return false and set templist to null, and you'll get a NullReferenceException when you call templist.Add().
Secondly, yes: There's another possible threading issue here. You don't know that the IList<string> stored inside the dictionary is threadsafe.
Therefore calling templist.Add() is not guaranteed to be safe.
You could use ConcurrentQueue<string> instead of IList<string>. This is probably going to be the most robust solution.
Note that simply locking access to the IList<string> wouldn't be sufficient.
This is no good:
if (list1.TryGetValue(key, out templist))
{
lock (locker)
{
templist.Add("helloworld");
}
}
unless you also use the same lock everywhere else that the IList may be accessed. This is not easy to achieve, hence it's better to either use a ConcurrentQueue<> or add locking to this class and change the architecture so that no other threads have access to the underlying IList.
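As a rough illustration of the ConcurrentQueue<string> suggestion, the sketch below stores a queue per key and appends through GetOrAdd. QueueValuesDemo, Append and AppendInParallel are hypothetical names, and the sketch assumes that keys are never removed while other threads are appending (an item enqueued on a queue that was just removed would be lost).

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper; assumes keys are never removed concurrently.
public static class QueueValuesDemo
{
    public static void Append(ConcurrentDictionary<int, ConcurrentQueue<string>> map, int key, string value)
    {
        // GetOrAdd returns the queue that is actually stored for the key,
        // even if two threads raced to create it; Enqueue is thread-safe.
        map.GetOrAdd(key, k => new ConcurrentQueue<string>()).Enqueue(value);
    }

    // Appends 'count' items to one key from many threads and reports how many arrived.
    public static int AppendInParallel(int count)
    {
        var map = new ConcurrentDictionary<int, ConcurrentQueue<string>>();
        Parallel.For(0, count, i => Append(map, 1, "helloworld" + i));
        return map[1].Count;
    }
}
```

No explicit lock is needed here because both the dictionary lookup and the append are thread-safe operations on their own collections.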
Operations on a thread-safe dictionary are thread-safe by key, so to speak. So as long as you access each value (in this case an IList<T>) from only one thread, you're good to go.
The ConcurrentDictionary does not prevent two threads from accessing the value belonging to one key at the same time.
You can use the ConcurrentDictionary.AddOrUpdate method to add an item to the list. It's simpler and looked like it should work fine (but see the update below).
var list1 = new ConcurrentDictionary<int, IList<string>>();
list1.AddOrUpdate(key,
new List<string>() { "test" }, (k, l) => { l.Add("test"); return l;});
UPD
According to the docs and the source, the factory delegates passed to the AddOrUpdate method run outside the dictionary's internal locks, so calling List methods inside the update delegate is NOT thread-safe.
See comments under this answer.
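One way around the problem described in the update is to make the stored value immutable, so the update delegate never mutates shared state and simply returns a new value. This is a different technique from the answer above, sketched under the assumption that System.Collections.Immutable is available (it ships with .NET Core and later, and as a NuGet package for .NET Framework). ImmutableValuesDemo and its methods are hypothetical names.

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;
using System.Threading.Tasks;

// Hypothetical helper; swaps the mutable IList<string> for ImmutableList<string>.
public static class ImmutableValuesDemo
{
    public static void Append(ConcurrentDictionary<int, ImmutableList<string>> map, int key, string value)
    {
        map.AddOrUpdate(
            key,
            k => ImmutableList.Create(value),
            // Returns a new list and leaves 'existing' untouched. If another thread
            // replaced the value in the meantime, AddOrUpdate calls this delegate again.
            (k, existing) => existing.Add(value));
    }

    // Appends 'count' items to one key from many threads and reports how many arrived.
    public static int AppendInParallel(int count)
    {
        var map = new ConcurrentDictionary<int, ImmutableList<string>>();
        Parallel.For(0, count, i => Append(map, 1, "test"));
        return map[1].Count;
    }
}
```

The trade-off is that the delegate may run more than once under contention, so it should stay cheap and free of side effects.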
The ConcurrentDictionary has no effect on whether you can apply changes to value objects in a thread-safe manner or not. That is the responsibility of the value object (the IList implementation in your case).
Looking at the answers to "No ConcurrentList<T> in .Net 4.0?", there are some good reasons why there is no ConcurrentList implementation in .NET.
Basically you have to take care of thread-safe changes yourself. The simplest way is to use the lock statement, e.g.:
lock (templist)
{
templist.Add("hello world");
}
Another way is to use the ConcurrentBag in the .NET Framework. But this is only useful if you do not rely on the IList interface or on the ordering of items.
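A lock only helps if every access to the list goes through it, reads included. The sketch below is one way to arrange that, assuming all code reaches the lists through these two helpers; LockedListDemo, Append and Snapshot are hypothetical names.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helper; assumes no other code touches the lists directly.
public static class LockedListDemo
{
    public static void Append(ConcurrentDictionary<int, List<string>> map, int key, string value)
    {
        List<string> list = map.GetOrAdd(key, k => new List<string>());
        lock (list)   // writers lock on the list instance itself
        {
            list.Add(value);
        }
    }

    public static string[] Snapshot(ConcurrentDictionary<int, List<string>> map, int key)
    {
        if (!map.TryGetValue(key, out List<string> list))
            return new string[0];
        lock (list)   // readers take the same lock and copy out
        {
            return list.ToArray();
        }
    }
}
```

Returning a copy from Snapshot keeps callers from enumerating the live list outside the lock.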
The best solution, a ConcurrentDictionary with ConcurrentBag values, has already been mentioned. I'm just going to add how to do that:
ConcurrentBag<string> bag = new ConcurrentBag<string>();
bag.Add("inputString");
list1.AddOrUpdate(key, bag, (k, v) => {
    v.Add("inputString");
    return v;
});
does adding a string to templist update the ConcurrentDictionary?
It does not.
Your thread safe collection (Dictionary) holds references to non-thread-safe collections (IList). So changing those is not thread safe.
I suppose you should consider using mutexes.
If you use ConcurrentBag<T>:
var dic = new ConcurrentDictionary<int, ConcurrentBag<string>>();
Something like this could work OK:
public static class ConcurentDictionaryExt
{
public static ConcurrentBag<V> AddToInternal<K, V>(this ConcurrentDictionary<K, ConcurrentBag<V>> dic, K key, V value)
=> dic.AddOrUpdate(key,
k => new ConcurrentBag<V>() { value },
(k, existingBag) =>
{
existingBag.Add(value);
return existingBag;
}
);
public static ConcurrentBag<V> AddRangeToInternal<K, V>(this ConcurrentDictionary<K, ConcurrentBag<V>> dic, K key, IEnumerable<V> values)
=> dic.AddOrUpdate(key,
k => new ConcurrentBag<V>(values),
(k, existingBag) =>
{
foreach (var v in values)
existingBag.Add(v);
return existingBag;
}
);
}
I didn't test it yet :)
In my program, I'm iterating through a list of 'Group' objects using a Parallel.ForEach loop. Inside this loop, I first check whether my ConcurrentDictionary contains a key, and whether the value for that key already contains the group's RoleID. I then add an object to a list depending on whether or not the dictionary has the key and value. Code shown below:
var roleUsers = new ExtendedBindingList<RoleUser>();
ConcurrentDictionary<int, List<int>> roleMatch = new ConcurrentDictionary<int, List<int>>();
Parallel.ForEach(groupsWithRole, group =>
{
foreach (var u in usersInThisGroup[group.GroupID])
{
if (roleMatch.ContainsKey(u.UserID) && roleMatch[u.UserID].Contains(group.RoleID))
continue;
//
//Unimportant logic
//
lock (writelock)
{
roleUsers.Add(roleUser);
if (!roleMatch.ContainsKey(u.UserID))
roleMatch.TryAdd(u.UserID, new List<int>());
roleMatch[u.UserID].Add(group.RoleID);
}
}
});
I've come to find that the list roleUsers doesn't always have the same number of objects in it, when it very much should. It's quite obviously a threading issue. My question is: is there any way, besides locking the whole thing, to read and write to the ConcurrentDictionary safely?
How to write a thread-safe list using copy-on-write model in .NET?
Below is my current implementation, but after lots of reading about threading, memory barriers, etc, I know that I need to be cautious when multi-threading without locks is involved. Could someone comment if this is the correct implementation?
class CopyOnWriteList
{
private List<string> list = new List<string>();
private object listLock = new object();
public void Add(string item)
{
lock (listLock)
{
list = new List<string>(list) { item };
}
}
public void Remove(string item)
{
lock (listLock)
{
var tmpList = new List<string>(list);
tmpList.Remove(item);
list = tmpList;
}
}
public bool Contains(string item)
{
return list.Contains(item);
}
public string Get(int index)
{
return list[index];
}
}
EDIT
To be more specific: is the above code thread-safe, or should I add something more? Also, will all threads eventually see the change in the list reference? Or maybe I should add the volatile keyword to the list field, or a Thread.MemoryBarrier in the Contains method between reading the reference and calling a method on it?
Here, for example, is a Java implementation that looks like my code above, but is such an approach also thread-safe in .NET?
And here is the same question, but also in Java.
Here is another question related to this one.
The implementation is correct because reference assignment is atomic, according to the "Atomicity of variable references" section of the C# specification. I would add volatile to the list field.
Your approach looks correct, but I'd recommend using a string[] rather than a List<string> to hold your data. When you're adding an item, you know exactly how many items are going to be in the resulting collection, so you can create a new array of exactly the size required. When removing an item, you can grab a copy of the list reference and search it for your item before making a copy; if it turns out that the item doesn't exist, there's no need to remove it. If it does exist, you can create a new array of the exact required size, and copy to the new array all the items preceding or following the item to be removed.
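A minimal sketch of the array-based variant described above, keeping the question's lock for writers and adding volatile to the field as another answer suggests. CopyOnWriteArrayList is a hypothetical name, and for brevity the search for the item to remove is done inside the lock rather than before it.

```csharp
using System;

// Hypothetical array-backed copy-on-write list; readers never take the lock.
public class CopyOnWriteArrayList
{
    private volatile string[] items = new string[0];
    private readonly object writeLock = new object();

    public void Add(string item)
    {
        lock (writeLock)
        {
            string[] old = items;
            var updated = new string[old.Length + 1];   // exact size is known up front
            Array.Copy(old, updated, old.Length);
            updated[old.Length] = item;
            items = updated;                            // publish the new array
        }
    }

    public bool Remove(string item)
    {
        lock (writeLock)
        {
            string[] old = items;
            int index = Array.IndexOf(old, item);
            if (index < 0) return false;                // nothing to remove, no copy made
            var updated = new string[old.Length - 1];
            Array.Copy(old, 0, updated, 0, index);
            Array.Copy(old, index + 1, updated, index, old.Length - index - 1);
            items = updated;
            return true;
        }
    }

    // Readers work on whichever array was current when they read the field.
    public bool Contains(string item) => Array.IndexOf(items, item) >= 0;

    public int Count => items.Length;
}
```

Because a published array is never modified afterwards, a reader that grabbed the old array keeps seeing a consistent, if slightly stale, view.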
Another thing you might want to consider would be to use an int[1] as your lock flag, and use a pattern something like:
static string[] withAddedItem(string[] oldList, string dat)
{
    string[] result = new string[oldList.Length + 1];
    Array.Copy(oldList, result, oldList.Length);
    result[oldList.Length] = dat; // store the new item in the last slot
    return result;
}
int Add(string dat) // Returns index of newly-added item
{
    string[] oldList, newList;
    if (listLock[0] == 0)
    {
        // Fast path: no contention has been observed, so try a lock-free swap
        oldList = list;
        newList = withAddedItem(oldList, dat);
        if (System.Threading.Interlocked.CompareExchange(ref list, newList, oldList) == oldList)
            return newList.Length - 1;
    }
    // Slow path: flag the contention and serialize the retries behind the lock
    System.Threading.Interlocked.Increment(ref listLock[0]);
    lock (listLock)
    {
        do
        {
            oldList = list;
            newList = withAddedItem(oldList, dat);
        } while (System.Threading.Interlocked.CompareExchange(ref list, newList, oldList) != oldList);
    }
    System.Threading.Interlocked.Decrement(ref listLock[0]);
    return newList.Length - 1;
}
If there is no write contention, the CompareExchange will succeed without having to acquire a lock. If there is write contention, writes will be serialized by the lock. Note that the lock here is neither necessary nor sufficient to ensure correctness. Its purpose is to avoid thrashing in the event of write contention. It is possible that thread #1 might get past its first "if" test, and get task-switched out while many other threads simultaneously try to write the list and start using the lock. If that occurs, thread #1 might then "surprise" the thread in the lock by performing its own CompareExchange. Such an action would result in the lock-holding thread having to waste time making a new array, but that situation should arise rarely enough that the occasional cost of an extra array copy shouldn't matter.
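Since the lock in that pattern is only there to limit thrashing, the correctness-critical part is the compare-and-swap retry loop on its own. Here is a stripped-down sketch of just that loop; LockFreeAppendList is a hypothetical name, and it deliberately omits the contention flag and lock from the answer.

```csharp
using System;
using System.Threading;

// Hypothetical append-only list that uses only a CompareExchange retry loop.
public class LockFreeAppendList
{
    private string[] items = new string[0];

    // Returns the index of the newly added item.
    public int Add(string item)
    {
        while (true)
        {
            string[] old = Volatile.Read(ref items);
            var updated = new string[old.Length + 1];
            Array.Copy(old, updated, old.Length);
            updated[old.Length] = item;
            // Swap only if no other writer replaced the array since we read it;
            // otherwise discard 'updated' and retry against the newer array.
            if (Interlocked.CompareExchange(ref items, updated, old) == old)
                return old.Length;
        }
    }

    public int Count => Volatile.Read(ref items).Length;
}
```

Under heavy write contention every failed attempt costs one wasted array copy, which is exactly the thrashing the lock in the answer is meant to reduce.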
Yes, it is thread-safe:
Collection modifications in Add and Remove are done on separate collections, so it avoids concurrent access to the same collection from Add and Remove or from Add/Remove and Contains/Get.
Assignment of the new collection is done inside a lock, which is just a pair of Monitor.Enter and Monitor.Exit calls. Both of these act as a full memory barrier, as noted here, which means that after the lock is released all threads should observe the new value of the list field.
As far as thread safety goes, is this OK to do, or do I need to use a different collection?
List<FileMemberEntity> fileInfo = getList();
Parallel.ForEach(fileInfo, fileMember =>
{
    // Modify each fileMember
});
As long as you are only modifying the contents of the item that is passed to the method, there is no locking needed.
(Provided, of course, that there are no duplicate references in the list, i.e. two references to the same FileMemberEntity instance.)
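A small sketch of the lock-free case described above: each iteration touches only the item handed to it, and the list itself is never modified. FileMember is a hypothetical stand-in for FileMemberEntity, whose real members are not shown in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for FileMemberEntity.
public class FileMember
{
    public string Name { get; set; }
    public bool Processed { get; set; }
}

public static class PerItemUpdateDemo
{
    public static List<FileMember> ProcessAll(int count)
    {
        // Every element is a distinct instance, so no two iterations share state.
        List<FileMember> members = Enumerable.Range(0, count)
            .Select(i => new FileMember { Name = "file" + i })
            .ToList();

        Parallel.ForEach(members, member =>
        {
            // Only the current item is modified; the list is left alone.
            member.Name = member.Name.ToUpperInvariant();
            member.Processed = true;
        });
        return members;
    }
}
```

If the same instance appeared twice in the list, two iterations could update it at once and the guarantee would no longer hold.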
If you need to modify the list itself, create a copy that you can iterate, and use a lock when you modify the list:
List<FileMemberEntity> fileInfo = getList();
List<FileMemberEntity> copy = new List<FileMemberEntity>(fileInfo);
object sync = new Object();
Parallel.ForEach(copy, fileMember => {
// do something
lock (sync) {
// here you can add or remove items from the fileInfo list
}
// do something
});
You're safe since you are just reading. Just don't modify the list while you are iterating over its items.
We should lock less often to make it faster. With the thread-local overload of Parallel.ForEach, each worker collects into its own local list and takes the lock only once, when its results are merged:
List<FileMemberEntity> copy = new List<FileMemberEntity>(fileInfo);
object sync = new Object();
Parallel.ForEach<FileMemberEntity, List<FileMemberEntity>>(
copy,
() => { return new List<FileMemberEntity>(); },
(itemInCopy, state, localList) =>
{
// collect items in the thread-local list; no lock is needed here
localList.Add(itemInCopy);
return localList;
},
// merge once per worker into fileInfo (not into copy, which is still being iterated)
(finalResult) => { lock (sync) fileInfo.AddRange(finalResult); }
);
// do something
Reference: http://msdn.microsoft.com/en-gb/library/ff963547.aspx
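For a self-contained look at the thread-local overload used above, here is a sketch that filters a sequence in parallel. ThreadLocalMergeDemo and CollectEven are hypothetical names; the even-number filter is only there to give the loop body something to do.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo of Parallel.ForEach with localInit / body / localFinally.
public static class ThreadLocalMergeDemo
{
    public static List<int> CollectEven(IEnumerable<int> source)
    {
        var result = new List<int>();
        object sync = new object();

        Parallel.ForEach<int, List<int>>(
            source,
            () => new List<int>(),              // localInit: one private list per worker
            (item, loopState, local) =>
            {
                if (item % 2 == 0)
                    local.Add(item);            // no lock: 'local' is not shared
                return local;
            },
            local =>
            {
                lock (sync)                     // localFinally: one locked merge per worker
                    result.AddRange(local);
            });

        result.Sort();                          // merge order is not deterministic
        return result;
    }
}
```

The lock is taken once per worker rather than once per item, which is where the saving comes from.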
If it does not matter what order the FileMemberEntity objects are acted on, you can use List<T> because you are not modifying the list.
If you must ensure some sort of ordering, you can use OrderablePartitioner<T> as a base class and implement an appropriate partitioning scheme. For example, if the FileMemberEntity has some sort of categorization and you must process each of the categories in some specific order, you would want to go this route.
Hypothetically if you have
Object 1 Category A
Object 2 Category A
Object 3 Category B
there is no guarantee that Object 2 Category A will be processed before Object 3 Category B is processed when iterating a List<T> using Parallel.ForEach.
The MSDN documentation you link to provides an example of how to do that.
I am trying to update entries in a ConcurrentDictionary something like this:
class Class1
{
public int Counter { get; set; }
}
class Test
{
private ConcurrentDictionary<int, Class1> dict =
new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dict)
{
foo.Value.Counter = foo.Value.Counter + 1; // Simplified example
}
}
}
Essentially I need to iterate over the dictionary and update a field on each Value. I understand from the documentation that I need to avoid using the Value property. Instead I think I need to use TryUpdate except that I don’t want to replace my whole object. Instead, I want to update a field on the object.
After reading this blog entry on the PFX team blog, perhaps I need to use AddOrUpdate and simply do nothing in the add delegate.
Does anyone have any insight as to how to do this?
I have tens of thousands of objects in the dictionary which I need to update every thirty seconds or so. Creating new ones in order to update the property is probably not feasible. I would need to clone the existing object, update it and replace the one in the dictionary. I’d also need to lock it for the duration of the clone/add cycle. Yuck.
What I’d like to do is iterate over the objects and update the Counter property directly if possible.
My latest research has led me to Parallel.ForEach, which sounds great, but it is not supposed to be used for actions that update state.
I also saw mention of Interlocked.Increment which sounds great but I still need to figure out how to use it on each element in my dictionary in a thread safe way.
First, to solve your locking problem:
class Class1
{
// this must be a variable so that we can pass it by ref into Interlocked.Increment.
private int counter;
public int Counter
{
get{return counter; }
}
public void Increment()
{
// this is about as thread safe as you can get.
// From MSDN: Increments a specified variable and stores the result, as an atomic operation.
Interlocked.Increment(ref counter);
// you can return the result of Increment if you want the new value,
//but DO NOT set the counter to the result :[i.e. counter = Interlocked.Increment(ref counter);] This will break the atomicity.
}
}
Iterating just the values should be faster than iterating the key/value pairs. [Though I think iterating a list of keys and doing the look-ups will be faster still on the ConcurrentDictionary in most situations.]
class Test
{
private ConcurrentDictionary<int, Class1> dictionary = new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dictionary.Values)
{
foo.Increment();
}
}
public void TestItParallel()
{
Parallel.ForEach(dictionary.Values,x=>x.Increment() );
}
}
ConcurrentDictionary doesn't help you with accessing members of stored values concurrently, just with the elements themselves.
If multiple threads call TestIt, you should get a snapshot of the collection and lock the shared resources (which are the individual dictionary values):
foreach (KeyValuePair<int, Class1> kvp in dict.ToArray())
{
Class1 value = kvp.Value;
lock (value)
{
value.Counter = value.Counter + 1;
}
}
However, if you want to update the counter for a specific key, ConcurrentDictionary can help you with atomically adding a new key value pair if the key does not exist:
Class1 value = dict.GetOrAdd(42, key => new Class1());
lock (value)
{
value.Counter = value.Counter + 1;
}
AddOrUpdate and TryUpdate indeed are for cases in which you want to replace the value for a given key in a ConcurrentDictionary. But, as you said, you don't want to change the value, you want to change a property of the value.
You can use the AddOrUpdate method when the value itself is the counter (for example, in a ConcurrentDictionary<int, int>).
Here is how you can increment the current value by 1:
dict.AddOrUpdate(key, 1, (k, oldValue) => oldValue + 1);
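A quick sketch of that AddOrUpdate pattern under concurrent load, assuming the dictionary maps a key directly to an int counter rather than to a Class1 object. CounterDemo, CountHits and the "page" key are hypothetical.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical demo; the dictionary value is the counter itself.
public static class CounterDemo
{
    public static int CountHits(int hits)
    {
        var counters = new ConcurrentDictionary<string, int>();
        Parallel.For(0, hits, i =>
            // Adds 1 for a new key, otherwise replaces the old value with old + 1.
            // If another thread changed the value first, the update delegate is re-run.
            counters.AddOrUpdate("page", 1, (k, oldValue) => oldValue + 1));
        return counters["page"];
    }
}
```

This works because the delegate computes a replacement value from the old one without mutating anything shared, which is not the case when the value is a mutable object whose property you change in place.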