I am trying to update entries in a ConcurrentDictionary something like this:
class Class1
{
public int Counter { get; set; }
}
class Test
{
private ConcurrentDictionary<int, Class1> dict =
new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dict)
{
foo.Value.Counter = foo.Value.Counter + 1; // Simplified example
}
}
}
Essentially I need to iterate over the dictionary and update a field on each Value. I understand from the documentation that I need to avoid using the Value property. Instead I think I need to use TryUpdate except that I don’t want to replace my whole object. Instead, I want to update a field on the object.
After reading a blog entry on the PFX team blog, I think perhaps I need to use AddOrUpdate and simply do nothing in the add delegate.
Does anyone have any insight as to how to do this?
I have tens of thousands of objects in the dictionary which I need to update every thirty seconds or so. Creating new ones in order to update the property is probably not feasible. I would need to clone the existing object, update it and replace the one in the dictionary. I’d also need to lock it for the duration of the clone/add cycle. Yuck.
What I’d like to do is iterate over the objects and update the Counter property directly if possible.
My latest research has led me to Parallel.ForEach, which sounds great, but it is not supposed to be used for actions that update state.
I also saw mention of Interlocked.Increment which sounds great but I still need to figure out how to use it on each element in my dictionary in a thread safe way.
First, to solve your locking problem:
class Class1
{
// this must be a variable so that we can pass it by ref into Interlocked.Increment.
private int counter;
public int Counter
{
get{return counter; }
}
public void Increment()
{
// this is about as thread safe as you can get.
// From MSDN: Increments a specified variable and stores the result, as an atomic operation.
Interlocked.Increment(ref counter);
// You can return the result of Increment if you want the new value,
// but DO NOT assign it back to the field (i.e. counter = Interlocked.Increment(ref counter);). That would break the atomicity.
}
}
Iterating just the values should be faster than iterating the key-value pairs. [Though I think iterating a list of keys and doing the look-ups will be faster still on the ConcurrentDictionary in most situations.]
class Test
{
private ConcurrentDictionary<int, Class1> dictionary = new ConcurrentDictionary<int, Class1>();
public void TestIt()
{
foreach (var foo in dictionary.Values)
{
foo.Increment();
}
}
public void TestItParallel()
{
Parallel.ForEach(dictionary.Values,x=>x.Increment() );
}
}
ConcurrentDictionary doesn't help you with accessing members of stored values concurrently, just with the elements themselves.
If multiple threads call TestIt, you should get a snapshot of the collection and lock the shared resources (which are the individual dictionary values):
foreach (KeyValuePair<int, Class1> kvp in dict.ToArray())
{
Class1 value = kvp.Value;
lock (value)
{
value.Counter = value.Counter + 1;
}
}
However, if you want to update the counter for a specific key, ConcurrentDictionary can help you with atomically adding a new key value pair if the key does not exist:
Class1 value = dict.GetOrAdd(42, key => new Class1());
lock (value)
{
value.Counter = value.Counter + 1;
}
AddOrUpdate and TryUpdate indeed are for cases in which you want to replace the value for a given key in a ConcurrentDictionary. But, as you said, you don't want to change the value, you want to change a property of the value.
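You can use the AddOrUpdate function when the dictionary's values are the counters themselves (for example, a ConcurrentDictionary<int, int>).
Here is how you can increment the current value by 1, or insert 1 if the key is missing:
// The lambda parameter is named k so that it does not clash with the local variable key.
dict.AddOrUpdate(key, 1, (k, oldValue) => oldValue + 1);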
In my app I have a Dictionary<ContainerControl, int>.
I need to check if a key is present in the dictionary and alter its corresponding value if key is found or add the key if not already present.
The key for my dictionary is a ContainerControl object.
I could use this method:
var dict = new Dictionary<ContainerControl, int>();
/*...*/
var c = GetControl();
if (dict.ContainsKey(c))
{
dict[c] = dict[c] + 1;
}
else
{
dict.Add(c, 0);
}
but I think that this way if the key is already present, my dictionary is iterated three times: once in ContainsKey and twice in the if branch.
I wonder if there is a more efficient way to do this, something like
var dict = new Dictionary<ContainerControl, int>();
/*...*/
var c = GetControl();
var kvp = dict.GetKeyValuePair(c); /* there is no such function in Dictionary */
if (kvp != null)
{
kvp.Value++;
}
else
{
dict.Add(c, 0);
}
This is possible using LINQ:
var kvp = dict.SingleOrDefault(x => x.Key == c);
but what about performance?
As noted in comments, finding a key in a dictionary doesn't mean iterating over the whole dictionary. But in some cases it's still worth trying to reduce the lookups.
KeyValuePair<,> is a struct anyway, so if GetKeyValuePair did exist, your kvp.Value++ wouldn't compile (as Value is read-only) and wouldn't work even if it did (as the pair wouldn't be the "original" in the dictionary).
You can use TryGetValue to reduce this to a single "read" operation and a single "write" operation:
// value will be 0 if TryGetValue returns false
if (dict.TryGetValue(c, out var value))
{
value++;
}
dict[c] = value;
Or change to ConcurrentDictionary and use AddOrUpdate to perform the change in a single call.
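As a minimal sketch of that AddOrUpdate suggestion: the helper below mirrors the question's logic (insert 0 for a new key, otherwise add 1). The class name HitCounter, the method name Touch, and the string keys standing in for ContainerControl are assumptions made for illustration only.

```csharp
using System.Collections.Concurrent;

public static class HitCounter
{
    // Hypothetical helper; string keys stand in for the ContainerControl keys in the question.
    public static int Touch(ConcurrentDictionary<string, int> counts, string key)
    {
        // Inserts 0 when the key is absent; otherwise stores oldValue + 1.
        // The update delegate may be invoked more than once under contention,
        // so it should stay free of side effects.
        return counts.AddOrUpdate(key, 0, (k, oldValue) => oldValue + 1);
    }
}
```

AddOrUpdate returns the value that ended up in the dictionary, so the caller gets the new count without a second lookup.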
You could also store a reference type in the dict. This means an extra allocation when you insert an item, but you can mutate items without another dictionary access. You'll need a profiler to tell you whether this is a net improvement!
class IntBox
{
public int Value { get; set; }
}
if (dict.TryGetValue(c, out var box))
{
box.Value++;
}
else
{
dict[c] = new IntBox();
}
With .NET 6 you can use CollectionsMarshal.GetValueRefOrAddDefault for a single lookup:
ref int value = ref CollectionsMarshal.GetValueRefOrAddDefault(dict, c, out bool exists);
if(exists) value++; // changes the value in the dictionary even if it's a value type
Demo: https://dotnetfiddle.net/tnW9P5
I am trying to change the value of Keys in my dictionary as follows:
//This part loads the data in the iterator
List<Recommendations> iterator = LoadBooks().ToList();
//This part adds the data to a list
List<Recommendations> list = new List<Recommendations>();
foreach (var item in iterator.Take(100))
{
list.Add(item);
}
//This part adds Key and List as key pair value to the Dictionary
if (!SuggestedDictionary.ContainsKey(bkName))
{
SuggestedDictionary.Add(bkName, list);
}
//This part loops over the dictionary contents
for (int i = 0; i < 10; i++)
{
foreach (var entry in SuggestedDictionary)
{
rec.Add(new Recommendations() { bookName = entry.Key, Rate = CalculateScore(bkName, entry.Key) });
entry.Key = entry.Value[i];
}
}
But it says "Property or indexer 'KeyValuePair<TKey, TValue>.Key' cannot be assigned to -- it is read only". What I want to do is change a dictionary key here and assign it another value.
The only way to do this will be to remove and re-add the dictionary item.
Why? It's because a dictionary is a hash table that stores its entries in buckets and resolves collisions by chaining.
When an item is added to a dictionary, it is placed in the bucket that its key hashes to and, if there's already an entry there, it's prepended to a chained list. If you were to change the key, the entry would have to go through the process of working out where it belongs all over again. So the easiest and most sane solution is to just remove and re-add the item.
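To illustrate why a key cannot simply be changed in place, here is a small sketch with a hypothetical key type (MutableKey, not from the question) whose hash code depends on a mutable field. Once the field changes, the entry is still filed under the old hash code, so a lookup with the mutated key no longer finds it.

```csharp
using System.Collections.Generic;

// Hypothetical key type whose hash code depends on a mutable field.
public sealed class MutableKey
{
    public int Id;
    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;
    public override int GetHashCode() => Id;
}

public static class KeyMutationDemo
{
    // Returns whether the dictionary can still find the key after it was mutated.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var dict = new Dictionary<MutableKey, string> { [key] = "value" };

        key.Id = 2; // the entry is still stored under the hash code for Id == 1

        return dict.ContainsKey(key);
    }
}
```

The entry is still in the dictionary (its Count is unchanged), but it can no longer be reached through its key, which is the situation that a read-only Key property prevents.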
Solution
var data = SomeFunkyDictionary[key];
SomeFunkyDictionary.Remove(key);
SomeFunkyDictionary.Add(newKey,data);
Or make yourself an extension method
public static class Extensions
{
public static void ReplaceKey<T, U>(this Dictionary<T, U> source, T key, T newKey)
{
if(!source.TryGetValue(key, out var value))
throw new ArgumentException("Key does not exist", nameof(key));
source.Remove(key);
source.Add(newKey, value);
}
}
Usage
SomeFunkyDictionary.ReplaceKey(oldKey, newKey);
Side note: Adding and removing from a dictionary incurs a penalty; if you don't need fast lookups, it may be more suitable not to use a dictionary at all, or to use some other strategy.
In my program, I'm iterating through a list of 'Group' objects using a Parallel.ForEach loop. Inside this loop, I first check my ConcurrentDictionary to see whether a key exists and whether its value contains a Group property. I then add an object to a list depending on whether or not the dictionary has the key and value. Code shown below:
var roleUsers = new ExtendedBindingList<RoleUser>();
ConcurrentDictionary<int, List<int>> roleMatch = new ConcurrentDictionary<int, List<int>>();
Parallel.ForEach(groupsWithRole, group =>
{
foreach (var u in usersInThisGroup[group.GroupID])
{
if (roleMatch.ContainsKey(u.UserID) && roleMatch[u.UserID].Contains(group.RoleID))
continue;
//
//Unimportant logic
//
lock (writelock)
{
roleUsers.Add(roleUser);
if (!roleMatch.ContainsKey(u.UserID))
roleMatch.TryAdd(u.UserID, new List<int>());
roleMatch[u.UserID].Add(group.RoleID);
}
}
});
I've come to find that the list roleUsers doesn't always have the same number of objects in it, when it very much should. It's quite obviously a threading issue. My question is, is there any way, besides locking the whole thing, to read and write to the ConcurrentDictionary safely?
How to write a thread-safe list using copy-on-write model in .NET?
Below is my current implementation, but after lots of reading about threading, memory barriers, etc, I know that I need to be cautious when multi-threading without locks is involved. Could someone comment if this is the correct implementation?
class CopyOnWriteList
{
private List<string> list = new List<string>();
private object listLock = new object();
public void Add(string item)
{
lock (listLock)
{
list = new List<string>(list) { item };
}
}
public void Remove(string item)
{
lock (listLock)
{
var tmpList = new List<string>(list);
tmpList.Remove(item);
list = tmpList;
}
}
public bool Contains(string item)
{
return list.Contains(item);
}
public string Get(int index)
{
return list[index];
}
}
EDIT
To be more specific: is the above code thread-safe, or should I add something more? Also, will all threads eventually see the change in the list reference? Or maybe I should add the volatile keyword to the list field, or a Thread.MemoryBarrier in the Contains method between accessing the reference and calling a method on it?
Here is, for example, a Java implementation that looks like my code above, but is such an approach also thread-safe in .NET?
And here is the same question, but also in Java.
Here is another question related to this one.
The implementation is correct because reference assignment is atomic, per the "Atomicity of variable references" section of the C# language specification. I would add volatile to the list field.
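A minimal sketch of that suggestion, assuming the same lock-on-write design as the question; the class name VolatileCopyOnWriteList and the Count property are additions for illustration. Writers build a complete copy under the lock and publish it with a single reference assignment, while readers take one snapshot of the volatile field and work only on that snapshot.

```csharp
using System.Collections.Generic;

public class VolatileCopyOnWriteList
{
    // volatile: each read of the field loads the most recently published reference.
    private volatile List<string> list = new List<string>();
    private readonly object listLock = new object();

    public void Add(string item)
    {
        lock (listLock)
        {
            // Build the new list completely before publishing it.
            var copy = new List<string>(list) { item };
            list = copy; // single atomic reference assignment
        }
    }

    public bool Contains(string item)
    {
        var snapshot = list; // read the field once, then use only the snapshot
        return snapshot.Contains(item);
    }

    public int Count => list.Count;
}
```

Because a published list is never modified afterwards, readers need no lock; they may simply observe a slightly older snapshot.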
Your approach looks correct, but I'd recommend using a string[] rather than a List<string> to hold your data. When you're adding an item, you know exactly how many items are going to be in the resulting collection, so you can create a new array of exactly the size required. When removing an item, you can grab a copy of the list reference and search it for your item before making a copy; if it turns out that the item doesn't exist, there's no need to remove it. If it does exist, you can create a new array of the exact required size, and copy to the new array all the items preceding or following the item to be removed.
Another thing you might want to consider would be to use an int[1] as your lock flag, and use a pattern something like:
// Assumed fields: string[] list = new string[0]; int[] listLock = new int[1];
static string[] withAddedItem(string[] oldList, string dat)
{
string[] result = new string[oldList.Length+1];
Array.Copy(oldList, result, oldList.Length);
result[oldList.Length] = dat; // store the new item in the last slot
return result;
}
int Add(string dat) // Returns index of newly-added item
{
string[] oldList, newList;
if (listLock[0] == 0)
{
// Fast path: no writer is known to be contending, so try a lock-free swap.
oldList = list;
newList = withAddedItem(oldList, dat);
if (System.Threading.Interlocked.CompareExchange(ref list, newList, oldList) == oldList)
return newList.Length - 1;
}
// Slow path: flag the contention and serialize the retries behind the lock.
System.Threading.Interlocked.Increment(ref listLock[0]);
lock (listLock)
{
do
{
oldList = list;
newList = withAddedItem(oldList, dat);
} while (System.Threading.Interlocked.CompareExchange(ref list, newList, oldList) != oldList);
}
System.Threading.Interlocked.Decrement(ref listLock[0]);
return newList.Length - 1;
}
If there is no write contention, the CompareExchange will succeed without having to acquire a lock. If there is write contention, writes will be serialized by the lock. Note that the lock here is neither necessary nor sufficient to ensure correctness. Its purpose is to avoid thrashing in the event of write contention. It is possible that thread #1 might get past its first "if" test and get task-switched out while many other threads simultaneously try to write the list and start using the lock. If that occurs, thread #1 might then "surprise" the thread in the lock by performing its own CompareExchange. Such an action would result in the lock-holding thread having to waste time making a new array, but that situation should arise rarely enough that the occasional cost of an extra array copy shouldn't matter.
Yes, it is thread-safe:
Collection modifications in Add and Remove are done on separate collections, so it avoids concurrent access to the same collection from Add and Remove or from Add/Remove and Contains/Get.
Assignment of the new collection is done inside the lock, which is just a pair of Monitor.Enter and Monitor.Exit calls, both of which perform a full memory barrier. This means that after the lock is released, all threads should observe the new value of the list field.
I have this segment of code. A lot of things are skipped for brevity, but the scenario is this:
public class Billing
{
private List<PrecalculateValue> Values = new List<PrecalculateValue>();
public int GetValue(DateTime date)
{
var preCalculated = Values.SingleOrDefault(g => g.date == date).value;
//if exist in Values, return it
if(preCalculated != null)
{
return preCalculated;
}
// if it does not exist calculate it and store it in Values
int value = GetValueFor(date);
Values.Add(new PrecalculateValue{date = date, value = value});
return value;
}
private object GetValueFor(DateTime date)
{
//some logic here
}
}
I have a List<PrecalculateValue> Values where I store all the values I have already calculated for later use. I do this mainly because I don't want to recalculate things for the same client: each calculation involves a lot of operations and takes between 500 and 1000 ms, and there is a big chance of reusing that value because of some recursion involved in the whole billing class.
All of this worked perfectly until I ran a test where I hit two simultaneous calculations for two different clients, and the line Values.Single(g => g.date == date).value threw an exception because it found more than one result in the collection.
So I checked the list, and it had stored values of both clients in the same list. What can I do to avoid this little problem?
Well, first of all, this line:
return Values.Single(g => g.date == date).value;
makes it so that the subsequent lines will never be called. I'm guessing you've paraphrased your code a little bit here?
If you want to synchronize writes to your Values list, the most straightforward way would be to lock on a common object everywhere in the code that you're modifying the list:
int value = GetValueFor(date);
lock (dedicatedLockObject) {
Values.Add(new PrecalculateValue{date = date, value = value});
}
return value;
But here's something else worth noting: since it looks like you want to have one PrecalculateValue per DateTime, a more appropriate data structure would probably be a Dictionary<DateTime, PrecalculateValue> -- it will provide lightning-fast, O(1) lookup based on your DateTime key, as compared to a List<PrecalculateValue> which would have to iterate to find what you're looking for.
With this change in place, your code might look something like this:
public class Billing
{
private Dictionary<DateTime, PrecalculateValue> Values =
new Dictionary<DateTime, PrecalculateValue>();
private readonly object commonLockObject = new object();
public int GetValue(DateTime date)
{
PrecalculateValue cachedCalculation;
// Note: for true thread safety, you need to lock reads as well as
// writes, to ensure that a write happening concurrently with a read
// does not corrupt state.
lock (commonLockObject) {
if (Values.TryGetValue(date, out cachedCalculation))
return cachedCalculation.value;
}
int value = GetValueFor(date);
// Here we need to check if the key exists again, just in case another
// thread added an item since we last checked.
// Also be sure to lock ANYWHERE ELSE you're manipulating
// or reading from the collection.
lock (commonLockObject) {
if (!Values.ContainsKey(date))
Values[date] = new PrecalculateValue{date = date, value = value};
}
return value;
}
private int GetValueFor(DateTime date)
{
//some logic here
}
}
And one last piece of advice: unless it's critical that no more than one of a particular value exist in your collection, the Single method is overkill. If you'd rather just get the first value and disregard potential duplicates, First is both safer (as in, less chance of an exception) and faster (because it doesn't have to iterate over the entire collection).
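The difference can be shown with a short sketch; the class and method names here are hypothetical, and plain integers stand in for the cached values. Single throws an InvalidOperationException when more than one element matches, whereas First returns the first match and stops.

```csharp
using System;
using System.Linq;

public static class SingleVersusFirst
{
    // Reports whether Single rejects the input because of duplicate matches.
    public static bool SingleThrowsOnDuplicates(int[] values, int target)
    {
        try
        {
            values.Single(v => v == target);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // First stops at the first matching element and ignores any later duplicates.
    public static int FirstMatch(int[] values, int target) => values.First(v => v == target);
}
```

Note that both methods still throw when nothing matches; SingleOrDefault and FirstOrDefault are the variants that return a default value in that case.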
You could use something like this:
public int GetValue(DateTime date)
{
// SingleOrDefault returns null when nothing matches; Single would throw instead.
var result = Values.SingleOrDefault(g => g.date == date) ?? GetValueFor(date);
lock (Values)
{
if (!Values.Contains(result)) Values.Add(result);
}
return result.value;
}
private PrecalculateValue GetValueFor(DateTime date)
{
//logic
return new PrecalculateValue() ;
}
I would advise using a dictionary, though, for a collection of key-value pairs.
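As one possible way to follow the dictionary advice without explicit locks, here is a sketch that swaps in a different technique from the answers above: a ConcurrentDictionary holding Lazy<int> entries. The class name BillingCache and the calculate delegate (standing in for GetValueFor) are assumptions, and the sketch presumes one cache instance per client so that values from different clients never mix.

```csharp
using System;
using System.Collections.Concurrent;

public class BillingCache
{
    private readonly ConcurrentDictionary<DateTime, Lazy<int>> values =
        new ConcurrentDictionary<DateTime, Lazy<int>>();

    // Hypothetical stand-in for the expensive GetValueFor(date) calculation.
    private readonly Func<DateTime, int> calculate;

    public BillingCache(Func<DateTime, int> calculate)
    {
        this.calculate = calculate;
    }

    public int GetValue(DateTime date)
    {
        // Under a race GetOrAdd may construct more than one Lazy, but only one is stored,
        // and only the stored Lazy has its Value read, so calculate runs once per date.
        var lazy = values.GetOrAdd(date, d => new Lazy<int>(() => calculate(d)));
        return lazy.Value;
    }
}
```

Wrapping the value in Lazy matters because the factory passed to GetOrAdd is not guaranteed to run only once when several threads ask for the same key at the same time.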