Thread-safe way to check ConcurrentDictionary while adding to it?

In my program, I'm iterating through a list of 'Group' objects using a Parallel.ForEach loop. Inside this loop, I first check whether my ConcurrentDictionary contains a key and whether the value stored under that key contains the group's RoleID. I then add an object to a list depending on whether or not the dictionary has the key and value. Code shown below:
var roleUsers = new ExtendedBindingList<RoleUser>();
ConcurrentDictionary<int, List<int>> roleMatch = new ConcurrentDictionary<int, List<int>>();
Parallel.ForEach(groupsWithRole, group =>
{
foreach (var u in usersInThisGroup[group.GroupID])
{
if (roleMatch.ContainsKey(u.UserID) && roleMatch[u.UserID].Contains(group.RoleID))
continue;
//
//Unimportant logic
//
lock (writelock)
{
roleUsers.Add(roleUser);
if (!roleMatch.ContainsKey(u.UserID))
roleMatch.TryAdd(u.UserID, new List<int>());
roleMatch[u.UserID].Add(group.RoleID);
}
}
}
});
I've found that the list roleUsers doesn't always end up with the same number of objects in it, when it very much should. It's quite obviously a threading issue. My question is: is there any way, besides locking the whole thing, to read and write to the ConcurrentDictionary safely?
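The underlying problem is that the check (ContainsKey plus Contains) and the later insert are separate steps, so two threads can both pass the check for the same user and role. One possible way around that, shown here only as a minimal sketch, is to make the insert itself the check: use a nested ConcurrentDictionary as a concurrent set and let TryAdd decide which thread "wins". The class name RoleMatchSketch, the method CollectUniquePairs, and the tuple-based input are hypothetical stand-ins for the question's Group and RoleUser types, and the sketch assumes the work can be done after the pair has been claimed rather than before.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: not the asker's code, just an illustration of the pattern.
public static class RoleMatchSketch
{
    // Returns each distinct (userId, roleId) pair exactly once,
    // even though the candidates are processed in parallel.
    public static List<(int UserId, int RoleId)> CollectUniquePairs(
        IEnumerable<(int UserId, int RoleId)> candidates)
    {
        // The inner dictionary acts as a concurrent set; the byte value is a dummy.
        var roleMatch = new ConcurrentDictionary<int, ConcurrentDictionary<int, byte>>();
        // ConcurrentBag replaces the locked list; order is not preserved.
        var results = new ConcurrentBag<(int UserId, int RoleId)>();

        Parallel.ForEach(candidates, pair =>
        {
            // GetOrAdd always hands every thread the same inner set for a given user.
            var roles = roleMatch.GetOrAdd(
                pair.UserId, _ => new ConcurrentDictionary<int, byte>());

            // TryAdd is an atomic "check and insert": it returns true for
            // exactly one thread per (user, role) pair.
            if (roles.TryAdd(pair.RoleId, 0))
            {
                results.Add(pair);
            }
        });

        return new List<(int UserId, int RoleId)>(results);
    }
}
```

If the result list must be an ExtendedBindingList<RoleUser> as in the question, the bag could be copied into it after the parallel loop finishes, which keeps the UI-bound list out of the multithreaded part entirely.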

Related

How could i make this deep-copy / iterating dictionary work, and how could i make it cleaner?

I know my question isn't really detailed, but I didn't know how to ask it, since I don't really know how to explain my issue without writing many lines. Anyway, here is my actual code:
var numberOfItemInTempDict = 0;
Dictionary<int, Dictionary<string, List<dynamic>>> tempDict = new Dictionary<int, Dictionary<string, List<dynamic>>>();
foreach (KeyValuePair<int, Dictionary<string, List<dynamic>>> kvp in alertSortedByCompanyAndType.ToList())
{
// We iterate through all the companies id
foreach (var kvp2 in kvp.Value)
{
// We iterate through all array of type in one company
foreach (var item in kvp2.Value)
{
// We iterate through all the data for one array of type in one company
if (!tempDict.ContainsKey(kvp.Key))
{
tempDict.Add(kvp.Key, new Dictionary<string, List<dynamic>>());
}
if (!tempDict[kvp.Key].ContainsKey(kvp2.Key))
{
tempDict[kvp.Key].Add(kvp2.Key, new List<dynamic>());
}
// We add the item to the tempDictionary
tempDict[kvp.Key][kvp2.Key].Add(item);
// And after adding it, we delete it from the original dictionary
alertSortedByCompanyAndType[kvp.Key][kvp2.Key].Remove(item);
numberOfItemInTempDict++;
if (numberOfItemInTempDict >= 250)
{
break;
}
}
if (numberOfItemInTempDict >= 250)
{
break;
}
}
if (numberOfItemInTempDict >= 250)
{
break;
}
}
The problem here seems to be that I'm deleting items from the alertSortedByCompanyAndType dictionary while I'm also iterating over it, so an InvalidOperationException is thrown because the collection was modified during iteration.
But I also saw somewhere that calling ToList() creates a copy of alertSortedByCompanyAndType, so you can then do whatever you want with the original dictionary, since the loop iterates over the copy. I could have misinterpreted that as well, and that's probably the case.
(If you have any improvements for my code, feel free to tell me or explain what I'm doing wrong; I'm quite new. The following text is background needed to answer the question.)
The few if statements at the end of each foreach are there because I want to copy only 250 items in total, but there could be 250 items for one company, or 1 item for each of 250 companies, or anything in between. Maybe I could move this into a function and simply return, instead of using ugly if statements like I do.
I use the tempDict above because both dictionaries can be populated asynchronously, and the first dictionary could contain something like 2000 items, but I want tempDict to hold only 250 of them so I can pass it to another function afterwards.
As for the dynamic keyword on the list: it contains items that can come from 9 different classes (for now at least) that I receive as JSON from another program. I deserialize them using dynamic since I don't need access to all of the data in them, only part of it. I then use the objects to register alerts with Amazon SQS.
You can try to split the copying and the deleting. Put the items you want to remove in a list instead of removing them right away, then remove them from the actual dictionary once the loops are done.
Something like this (pseudocode, untested):
// kvp.Key is an int and kvp2.Key is a string in the question's code; adjust the types to match yours
var toBeDeleted = new List<(int, string, object)>();
....
// instead of alertSortedByCompanyAndType[kvp.Key][kvp2.Key].Remove(item);
toBeDeleted.Add((kvp.Key, kvp2.Key, item));
....
// after the nested foreach loops
foreach (var d in toBeDeleted)
alertSortedByCompanyAndType[d.Item1][d.Item2].Remove(d.Item3);
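The question also asks whether the repeated if/break checks could be replaced by a function that simply returns. A minimal sketch of that idea follows, under some assumptions: the names BatchSketch and TakeBatch are hypothetical, List<object> stands in for the question's List<dynamic>, and nothing else touches the dictionaries while the method runs. Instead of removing items one at a time, it copies a slice of each inner list with GetRange and removes the same slice with RemoveRange. The dictionaries themselves are never modified during enumeration (only the lists they reference are), so the enumerators stay valid.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: moves up to maxItems items out of `source` into a new dictionary.
public static class BatchSketch
{
    public static Dictionary<int, Dictionary<string, List<object>>> TakeBatch(
        Dictionary<int, Dictionary<string, List<object>>> source, int maxItems)
    {
        var batch = new Dictionary<int, Dictionary<string, List<object>>>();
        int taken = 0;

        foreach (var company in source)
        {
            foreach (var byType in company.Value)
            {
                List<object> items = byType.Value;
                // Take as many items as still fit into the batch.
                int count = Math.Min(maxItems - taken, items.Count);
                if (count == 0) continue; // nothing to take from an empty list

                if (!batch.TryGetValue(company.Key, out var types))
                {
                    types = new Dictionary<string, List<object>>();
                    batch.Add(company.Key, types);
                }

                // Copy the first `count` items, then drop them from the source list.
                types[byType.Key] = items.GetRange(0, count);
                items.RemoveRange(0, count);

                taken += count;
                if (taken >= maxItems) return batch; // replaces the nested breaks
            }
        }
        return batch;
    }
}
```

Calling BatchSketch.TakeBatch(alertSortedByCompanyAndType, 250) would then play the role of the whole nested loop, assuming the dictionary is declared with a matching value type. If the dictionaries really are filled from other threads at the same time, this sketch is not enough on its own and some form of synchronization would still be needed.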

Updating List in ConcurrentDictionary

So I have an IList as the value in my ConcurrentDictionary.
ConcurrentDictionary<int, IList<string>> list1 = new ConcurrentDictionary<int, IList<string>>();
In order to update a value in a list I do this:
if (list1.ContainsKey(key))
{
IList<string> templist;
list1.TryGetValue(key, out templist);
templist.Add("helloworld");
}
However, does adding a string to templist update the ConcurrentDictionary? If so, is the update thread-safe so that no data corruption would occur?
Or is there a better way to update or create a list inside the ConcurrentDictionary?
EDIT
If I were to use a ConcurrentBag instead of a List, how would I implement this? More specifically, how could I update it? ConcurrentDictionary's TryUpdate method feels a bit excessive.
Does ConcurrentBag.Add update the ConcurrentDictionary in a thread-safe manner?
ConcurrentDictionary<int, ConcurrentBag<string>> list1 = new ConcurrentDictionary<int, ConcurrentBag<string>>();
Firstly, there's no need to do ContainsKey() and TryGetValue().
You should just do this:
IList<string> templist;
if (list1.TryGetValue(key, out templist))
templist.Add("helloworld");
In fact, your code as written has a race condition.
In between one thread calling ContainsKey() and TryGetValue(), a different thread may have removed the item with that key. TryGetValue() will then return false and leave templist as null, and you'll get a NullReferenceException when you call templist.Add().
Secondly, yes: There's another possible threading issue here. You don't know that the IList<string> stored inside the dictionary is threadsafe.
Therefore calling tempList.Add() is not guaranteed to be safe.
You could use ConcurrentQueue<string> instead of IList<string>. This is probably going to be the most robust solution.
Note that simply locking access to the IList<string> wouldn't be sufficient.
This is no good:
if (list1.TryGetValue(key, out templist))
{
lock (locker)
{
templist.Add("helloworld");
}
}
unless you also use the same lock everywhere else that the IList may be accessed. This is not easy to achieve, hence it's better to either use a ConcurrentQueue<> or add locking to this class and change the architecture so that no other threads have access to the underlying IList.
Operations on a thread-safe dictionary are thread-safe per key, so to speak. So as long as you access each value (in this case an IList<T>) from only one thread, you're good to go.
The ConcurrentDictionary does not prevent two threads from accessing the value belonging to one key at the same time.
You can use the ConcurrentDictionary.AddOrUpdate method to add an item to the list. It's simpler, but see the update below regarding thread safety.
var list1 = new ConcurrentDictionary<int, IList<string>>();
list1.AddOrUpdate(key,
new List<string>() { "test" }, (k, l) => { l.Add("test"); return l;});
Update:
According to the docs and sources, the factory delegates passed to AddOrUpdate run outside the dictionary's lock, so calling List methods inside the factory delegate is NOT thread-safe.
See comments under this answer.
The ConcurrentDictionary has no effect on whether you can apply changes to value objects in a thread-safe manner or not. That is the responsibility of the value object (the IList implementation in your case).
Looking at the answers to "No ConcurrentList<T> in .Net 4.0?", there are some good reasons why there is no ConcurrentList implementation in .NET.
Basically you have to take care of thread-safe changes yourself. The simplest way is to use the lock statement. E.g.
lock (templist)
{
templist.Add("hello world");
}
Another way is to use the ConcurrentBag from the .NET Framework. But this is only useful if you do not rely on the IList interface or on the ordering of items.
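Putting the lock suggestion together with the earlier point that every access to the list must use the same lock, a minimal sketch might look like the following. The class ListValueSketch and its methods AddItem and Snapshot are hypothetical names, and the sketch assumes that all code goes through these helpers, that the list object itself is an acceptable lock target, and that keys are never removed from the dictionary (otherwise a thread could add to a list that is no longer stored).

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helpers: all reads and writes of the inner lists go through here.
public static class ListValueSketch
{
    public static void AddItem(
        ConcurrentDictionary<int, List<string>> map, int key, string item)
    {
        // GetOrAdd returns the single list stored for this key, creating it if needed.
        List<string> list = map.GetOrAdd(key, _ => new List<string>());

        // The dictionary only protects its own structure; the list needs its own lock.
        lock (list)
        {
            list.Add(item);
        }
    }

    public static string[] Snapshot(
        ConcurrentDictionary<int, List<string>> map, int key)
    {
        if (!map.TryGetValue(key, out List<string> list))
            return new string[0];

        // Readers take the same lock so they never observe a list mid-update.
        lock (list)
        {
            return list.ToArray();
        }
    }
}
```

Readers get a copy of the list rather than the list itself, which keeps the locked region small and avoids handing out a reference that could be enumerated while another thread is adding to it.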
It has already been mentioned that the best solution would be a ConcurrentDictionary with a ConcurrentBag. I'm just going to add how to do that:
ConcurrentBag<string> bag = new ConcurrentBag<string>();
bag.Add("inputString");
list1.AddOrUpdate(key, bag, (k, v) => {
v.Add("inputString");
return v;
});
does adding a string to templist update the ConcurrentDictionary?
It does not.
Your thread safe collection (Dictionary) holds references to non-thread-safe collections (IList). So changing those is not thread safe.
I suppose you should consider using mutexes.
If you use ConcurrentBag<T>:
var dic = new ConcurrentDictionary<int, ConcurrentBag<string>>();
Something like this could work OK:
public static class ConcurentDictionaryExt
{
public static ConcurrentBag<V> AddToInternal<K, V>(this ConcurrentDictionary<K, ConcurrentBag<V>> dic, K key, V value)
=> dic.AddOrUpdate(key,
k => new ConcurrentBag<V>() { value },
(k, existingBag) =>
{
existingBag.Add(value);
return existingBag;
}
);
public static ConcurrentBag<V> AddRangeToInternal<K, V>(this ConcurrentDictionary<K, ConcurrentBag<V>> dic, K key, IEnumerable<V> values)
=> dic.AddOrUpdate(key,
k => new ConcurrentBag<V>(values),
(k, existingBag) =>
{
foreach (var v in values)
existingBag.Add(v);
return existingBag;
}
);
}
I didn't test it yet :)

What is the correct approach when trying to remove some items in ConcurrentDictionary

Is this better:
public void Test()
{
ConcurrentDictionary<int, string> dictionary = new ConcurrentDictionary<int, string>();
dictionary.TryAdd(0, "A");
dictionary.TryAdd(1, "B");
dictionary.TryAdd(2, "A");
dictionary.TryAdd(3, "D");
foreach (var item in dictionary)
{
string foundItem;
if (dictionary.TryGetValue(item.Key, out foundItem))
{
if (foundItem == "A")
{
if (dictionary.TryRemove(item.Key, out foundItem))
{
// Success
}
}
}
}
}
Than this?:
public void Test2()
{
ConcurrentDictionary<int, string> dictionary = new ConcurrentDictionary<int, string>();
dictionary.TryAdd(0, "A");
dictionary.TryAdd(1, "B");
dictionary.TryAdd(2, "A");
dictionary.TryAdd(3, "D");
foreach (var item in dictionary)
{
string foundItem;
if (item.Value == "A")
{
if (dictionary.TryRemove(item.Key, out foundItem))
{
// Success
}
}
}
}
This method will be accessed by multiple threads.
My confusion is this: whenever I want to remove an item, I try to get it first and then remove it. But since I am already inside a foreach loop, I have already got the item. Any ideas would be appreciated.
I don't see any benefit in the first approach. I'd just use LINQ to find the items though:
foreach (var entry in dictionary.Where(e => e.Value == "A"))
{
string ignored;
// Do you actually need to check the return value?
dictionary.TryRemove(entry.Key, out ignored);
}
Of course, you need to consider what you want to happen if another thread adds a new entry with value "A" or updates an existing entry (possibly to make the value "A", possibly to make the value not-"A") while you're iterating... does it matter to you whether or not that entry is removed? It's not guaranteed what will happen. (The iterator doesn't take a snapshot, but isn't guaranteed to return entirely up-to-date data either.)
You may want to check that the value you've removed really is "A" by checking the variable I've called ignored afterwards. It really depends on your context. When you've got multiple threads modifying the map, you need to think that anything can happen at any time - within the operations that your code actually performs.
There's also the fact that you're effectively having to trawl through the whole dictionary... are you looking up by key elsewhere?
Second approach sounds good. If you have multiple threads trying to remove the item some of them will fail in the TryRemove but one will succeed, which should be good.
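If it matters that an entry is removed only while its value is still "A", one option is to remove by key and value together. ConcurrentDictionary explicitly implements ICollection<KeyValuePair<TKey, TValue>>, and its Remove(pair) removes the entry only when both the key and the value match. The sketch below is an illustration under that assumption; the class ConditionalRemoveSketch and the method RemoveAllWithValue are hypothetical names.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helper: removes entries only if they still hold the target value.
public static class ConditionalRemoveSketch
{
    public static int RemoveAllWithValue(
        ConcurrentDictionary<int, string> dictionary, string target)
    {
        // The key-and-value Remove is only reachable through the interface.
        var asCollection = (ICollection<KeyValuePair<int, string>>)dictionary;
        int removed = 0;

        // Enumerating a ConcurrentDictionary while removing from it is allowed.
        foreach (KeyValuePair<int, string> entry in dictionary)
        {
            // Remove succeeds only if the key is present AND its value still equals
            // entry.Value, so a concurrent update to another value is left untouched.
            if (entry.Value == target && asCollection.Remove(entry))
                removed++;
        }
        return removed;
    }
}
```

With the question's sample data (0 to "A", 1 to "B", 2 to "A", 3 to "D"), a single-threaded call with target "A" removes keys 0 and 2 and leaves the other two entries in place.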

Loop in Dictionary

I use this:
foreach(KeyValuePair<String,String> entry in MyDic)
{
// do something with entry.Value or entry.Key
}
The problem is that I can't change the value of entry.Value or entry.Key.
My question is: how can I change the value or key when looping through a dictionary?
And does a dictionary allow duplicate keys? If yes, how can we avoid them?
Thank you
You cannot change the value of a dictionary entry while looping through the items in the dictionary, although you can modify a property on the value if it's an instance of a reference type.
For example,
public class MyClass
{
public int SomeNumber { get; set;}
}
foreach(KeyValuePair<string, MyClass> entry in myDict)
{
entry.Value.SomeNumber = 3; // is okay
myDict[entry.Key] = new MyClass(); // is not okay
}
Trying to modify a dictionary (or any collection) while looping through its elements will result in an InvalidOperationException saying the collection was modified.
To answer your specific questions,
My question is that how can i change the value or key when looping through a dictionary?
The approach to both will be pretty much the same. You can either loop over a copy of the dictionary as Anthony Pengram said in his answer, or you can loop once through all the items to figure out which ones you need to modify and then loop again through a list of those items:
List<string> keysToChange = new List<string>();
foreach(KeyValuePair<string, string> entry in myDict)
{
if(...) // some check to see if it's an item you want to act on
{
keysToChange.Add(entry.Key);
}
}
foreach(string key in keysToChange)
{
myDict[key] = "new value";
// or "rename" a key
myDict["new key"] = myDict[key];
myDict.Remove(key);
}
And, does dictionary allow duplicated key? And if yes, how can we avoid ?
A dictionary does not allow duplicate keys. If you want a collection of <string, string> pairs that does, check out NameValueCollection.
Updating the dictionary in the loop is going to be a problem, as you cannot modify the dictionary as it is being enumerated. However, you can work around this pretty easily by converting the dictionary to a list of KeyValuePair<> objects. You enumerate that list, and then you can modify the dictionary.
foreach (var pair in dictionary.ToList())
{
// to update the value
dictionary[pair.Key] = "Some New Value";
// or to change the key => remove it and add something new
dictionary.Remove(pair.Key);
dictionary.Add("Some New Key", pair.Value);
}
For the second part, the key in a dictionary must be unique.
KeyValuePair's Key and Value are read-only. But you can change a value through the dictionary's indexer:
dictionary[key] = newValue;
But if you want to change the key, you will have to remove the old key and add a new one.
And no, a Dictionary does not allow duplicate keys; calling Add with a key that already exists throws an ArgumentException.
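The difference between Add and the indexer when a key already exists can be seen in a short sketch. The class DuplicateKeySketch and its method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo of how Dictionary reacts to an existing key.
public static class DuplicateKeySketch
{
    // Add refuses an existing key and throws ArgumentException.
    public static bool AddThrowsOnDuplicate()
    {
        var dict = new Dictionary<string, string> { { "a", "1" } };
        try
        {
            dict.Add("a", "2");
            return false; // not reached: the key already exists
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // The indexer silently replaces the value stored under an existing key.
    public static string IndexerOverwrites()
    {
        var dict = new Dictionary<string, string> { { "a", "1" } };
        dict["a"] = "2";
        return dict["a"];
    }
}
```

So "avoiding" duplicates usually comes down to choosing between the two: use Add when a repeated key indicates a bug, and the indexer (or a ContainsKey check first) when overwriting is acceptable.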
You cannot modify keys while enumerating them.
One method I use for changes to the collection while enumerating them is that I do break; out of the foreach loop when a match is found and item is modified, and am restarting the whole enumeration all over again. That's one way of handling it...
No, Dictionary can't have duplicate keys. If you want something that will sort by key and allow duplicates, you should use some other data structure.
You can do this like
for (int i = 0; i < MyDic.Count; i++)
{
KeyValuePair<string, string> s = MyDic.ElementAt(i);
MyDic.Remove(s.Key);
MyDic.Add(s.Key, "NewValue");
}
And Dictionary doesn't allow duplicates

Updating fields of values in a ConcurrentDictionary

I am trying to update entries in a ConcurrentDictionary something like this:
class Class1
{
public int Counter { get; set; }
}
class Test
{
private ConcurrentDictionary<int, Class1> dict =
new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dict)
{
foo.Value.Counter = foo.Value.Counter + 1; // Simplified example
}
}
}
Essentially I need to iterate over the dictionary and update a field on each Value. I understand from the documentation that I need to avoid using the Value property. Instead I think I need to use TryUpdate except that I don’t want to replace my whole object. Instead, I want to update a field on the object.
After reading a blog entry on the PFX team blog, I think that perhaps I need to use AddOrUpdate and simply do nothing in the add delegate.
Does anyone have any insight as to how to do this?
I have tens of thousands of objects in the dictionary which I need to update every thirty seconds or so. Creating new ones in order to update the property is probably not feasible. I would need to clone the existing object, update it and replace the one in the dictionary. I’d also need to lock it for the duration of the clone/add cycle. Yuck.
What I’d like to do is iterate over the objects and update the Counter property directly if possible.
My latest research has led me to Parallel.ForEach, which sounds great, but it is not supposed to be used for actions that update state.
I also saw mention of Interlocked.Increment which sounds great but I still need to figure out how to use it on each element in my dictionary in a thread safe way.
First, to solve your locking problem:
class Class1
{
// this must be a variable so that we can pass it by ref into Interlocked.Increment.
private int counter;
public int Counter
{
get{return counter; }
}
public void Increment()
{
// this is about as thread safe as you can get.
// From MSDN: Increments a specified variable and stores the result, as an atomic operation.
Interlocked.Increment(ref counter);
// you can return the result of Increment if you want the new value,
//but DO NOT set the counter to the result :[i.e. counter = Interlocked.Increment(ref counter);] This will break the atomicity.
}
}
Iterating just the values should be faster than iterating the key-value pairs. [Though I think iterating a list of keys and doing the look-ups will be faster still on the ConcurrentDictionary in most situations.]
class Test
{
private ConcurrentDictionary<int, Class1> dictionary = new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dictionary.Values)
{
foo.Increment();
}
}
public void TestItParallel()
{
Parallel.ForEach(dictionary.Values,x=>x.Increment() );
}
}
ConcurrentDictionary doesn't help you with accessing members of stored values concurrently, just with the elements themselves.
If multiple threads call TestIt, you should get a snapshot of the collection and lock the shared resources (which are the individual dictionary values):
foreach (KeyValuePair<int, Class1> kvp in dict.ToArray())
{
Class1 value = kvp.Value;
lock (value)
{
value.Counter = value.Counter + 1;
}
}
However, if you want to update the counter for a specific key, ConcurrentDictionary can help you with atomically adding a new key value pair if the key does not exist:
Class1 value = dict.GetOrAdd(42, key => new Class1());
lock (value)
{
value.Counter = value.Counter + 1;
}
AddOrUpdate and TryUpdate indeed are for cases in which you want to replace the value for a given key in a ConcurrentDictionary. But, as you said, you don't want to change the value, you want to change a property of the value.
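You can use the AddOrUpdate method, but only when the dictionary's value is itself the counter (for example a ConcurrentDictionary<int, int>) rather than an object with a Counter property.
Here is how you can increment the current value by 1:
dict.AddOrUpdate(key, 1, (k, oldValue) => oldValue + 1);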
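To make the AddOrUpdate variant concrete, here is a minimal sketch that assumes the dictionary stores plain int counters instead of Class1 objects. The class CounterSketch and the method CountInParallel are hypothetical names. Because the update delegate is a pure function of the old value and does not mutate shared state, the dictionary can safely retry it when another thread changes the value in between.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical demo: counters are plain ints stored directly in the dictionary.
public static class CounterSketch
{
    public static ConcurrentDictionary<int, int> CountInParallel(
        int keys, int incrementsPerKey)
    {
        var counts = new ConcurrentDictionary<int, int>();

        Parallel.For(0, keys * incrementsPerKey, i =>
        {
            int key = i % keys;
            // Inserts 1 for a new key, otherwise stores oldValue + 1.
            // The delegate may run more than once under contention, which is
            // harmless here because it has no side effects.
            counts.AddOrUpdate(key, 1, (k, oldValue) => oldValue + 1);
        });

        return counts;
    }
}
```

For example, CounterSketch.CountInParallel(5, 200) yields five keys that each hold 200. For objects such as Class1, the Interlocked.Increment and lock approaches described in the other answers remain the better fit.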
