Thread safe re-initialization of concurrent dictionary - c#

I want to know if the following code is thread safe, which I assume it is not. And how I could possibly make it thread safe?
Basically I have a ConcurrentDictionary which acts as a cache for a database table. I want to query the DB every 10 seconds and update the db cache. There will be other threads querying this dictionary the whole time.
I can't just use TryAdd, as there may also be elements which have been removed. So instead of searching through the entire dictionary to update, add or remove entries, I decided to just reinitialize the dictionary. Please do tell me if this is a silly idea.
My concern is that when I reinitialize the dictionary, the querying threads will no longer be thread safe at the instant the reinitialization takes place. For that reason I have used a lock on the dictionary when updating it. However, I am not sure this is correct, as the object changes inside the lock.
private static System.Timers.Timer updateTimer;
private static volatile Boolean _isBusyUpdating = false;
private static ConcurrentDictionary<int, string> _contactIdNames;
public Constructor()
{
// Setup Timers for data updater
updateTimer = new System.Timers.Timer();
updateTimer.Interval = TimeSpan.FromSeconds(10).TotalMilliseconds; // every 10 seconds
updateTimer.Elapsed += OnTimedEvent;
// Start the timer
updateTimer.Enabled = true;
}
private void OnTimedEvent(Object source, System.Timers.ElapsedEventArgs e)
{
if (!_isBusyUpdating)
{
_isBusyUpdating = true;
// Get new data values and update the list
try
{
var tmp = new ConcurrentDictionary<int, string>();
using (var db = new DBEntities())
{
foreach (var item in db.ContactIDs.Select(x => new { x.Qualifier, x.AlarmCode, x.Description }).AsEnumerable())
{
int key = (item.Qualifier * 1000) + item.AlarmCode;
tmp.TryAdd(key, item.Description);
}
}
if (_contactIdNames == null)
{
_contactIdNames = tmp;
}
else
{
lock (_contactIdNames)
{
_contactIdNames = tmp;
}
}
}
catch (Exception ex) // "e" is already taken by the ElapsedEventArgs parameter
{
Debug.WriteLine("Error occurred in update ContactId db store: " + ex);
}
_isBusyUpdating = false;
}
}
/// Use the dictionary from another Thread
public int GetIdFromClientString(string Name)
{
try
{
int pk;
if (_contactIdNames.TryGetValue(Name, out pk))
{
return pk;
}
}
catch { }
//If all else fails return -1
return -1;
}

You're right, your code is not thread safe.
You need to lock the _isBusyUpdating variable.
You need to lock _contactIdNames every time, not only when it's not null.
Also, this code is similar to the singleton pattern and has the same problem with initialization. You can solve it with double-checked locking. However, you also need double-checked locking when accessing entries.
Since you are replacing the whole dictionary at once, you need to lock the current value every time you access it. Otherwise you can access it while it's still changing and get an error. So you either need to lock the variable each time or use Interlocked.
As MSDN says, volatile should do the trick for _isBusyUpdating; it should be thread safe.
If you don't want to keep track of the thread safety of _contactIdNames, try updating each entry in the same dictionary instead. The problem then becomes detecting the differences between the DB and the current values (which entries have been removed or added; the others can simply be overwritten), not thread safety, since ConcurrentDictionary is already thread safe.
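As a rough sketch of the Interlocked option mentioned above, assuming the whole dictionary is rebuilt and then swapped in one step: the names ContactCache, Refresh, GetName and loadFromDb are hypothetical, and the database query is replaced by a delegate so the example stands on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public class ContactCache
{
    // 0 = idle, 1 = a refresh is running. Interlocked makes the check-and-set atomic.
    private int _isBusyUpdating;
    private ConcurrentDictionary<int, string> _contactIdNames =
        new ConcurrentDictionary<int, string>();

    // loadFromDb is a hypothetical stand-in for the database query.
    // Note: the ConcurrentDictionary constructor throws if the source has duplicate keys.
    public bool Refresh(Func<IEnumerable<KeyValuePair<int, string>>> loadFromDb)
    {
        // Only one caller wins the 0 -> 1 transition; everyone else skips this round.
        if (Interlocked.CompareExchange(ref _isBusyUpdating, 1, 0) != 0)
            return false;
        try
        {
            var tmp = new ConcurrentDictionary<int, string>(loadFromDb());
            // Publish the fully built dictionary in one atomic step.
            Interlocked.Exchange(ref _contactIdNames, tmp);
            return true;
        }
        finally
        {
            Interlocked.Exchange(ref _isBusyUpdating, 0);
        }
    }

    public string GetName(int key)
    {
        // Read the reference once so the lookup uses a single instance.
        var snapshot = Volatile.Read(ref _contactIdNames);
        string name;
        return snapshot.TryGetValue(key, out name) ? name : null;
    }
}
```

Readers never take a lock here; they either see the old dictionary or the new one, never a half-filled one.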

You seem to be making a lot of work for yourself. Here's how I would tackle this task:
public class Constructor
{
private volatile Dictionary<int, string> _contactIdNames;
public Constructor()
{
Observable
.Interval(TimeSpan.FromSeconds(10.0))
.StartWith(-1)
.Select(n =>
{
using (var db = new DBEntities())
{
return db.ContactIDs.ToDictionary(
x => x.Qualifier * 1000 + x.AlarmCode,
x => x.Description);
}
})
.Subscribe(x => _contactIdNames = x);
}
public string TryGetValue(int key)
{
string value = null;
_contactIdNames.TryGetValue(key, out value);
return value;
}
}
I'm using Microsoft's Reactive Extensions (Rx) Framework - NuGet "Rx-Main" - for the timer to update the dictionary.
The Rx should be fairly straightforward. If you haven't seen it before in very simple terms it's like LINQ meets events.
If you don't like Rx then just go with your current timer model.
All this code does is create a new dictionary every 10 seconds from the DB. I'm just using a plain dictionary since it is only being created from one thread. Since reference assignment is atomic then you can just re-assign the dictionary when you like with complete thread-safety.
Multiple threads can safely read from a dictionary as long as the elements don't change.
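If you stay with a timer instead of Rx, the same idea might look roughly like this. It is only a sketch: ContactNameCache and the load delegate are hypothetical names, and the delegate stands in for the DBEntities query. It also assumes a load finishes well within the interval, since System.Threading.Timer callbacks can overlap.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class ContactNameCache : IDisposable
{
    // Readers only ever see a fully built dictionary that is never mutated afterwards.
    private volatile Dictionary<int, string> _contactIdNames = new Dictionary<int, string>();
    private readonly Func<Dictionary<int, string>> _load; // hypothetical DB loader
    private readonly Timer _timer;

    public ContactNameCache(Func<Dictionary<int, string>> load, TimeSpan interval)
    {
        _load = load;
        Reload(); // initial fill, playing the role of StartWith(-1)
        _timer = new Timer(_ => Reload(), null, interval, interval);
    }

    private void Reload()
    {
        try
        {
            _contactIdNames = _load(); // atomic reference swap
        }
        catch (Exception ex)
        {
            // Keep serving the previous dictionary if the load fails.
            Console.Error.WriteLine("Reload failed: " + ex.Message);
        }
    }

    public string TryGetValue(int key)
    {
        string value;
        _contactIdNames.TryGetValue(key, out value);
        return value;
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```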

"I want to know if the following code is thread safe, which I assume it is not. And how I could possibly make it thread safe?"
I believe it's not. First of all, I'd create a property for the ConcurrentDictionary and check inside the getter whether an update is underway; if it is, I'd return the previous version of the object:
private object obj = new object();
private ConcurrentDictionary<int, string> _contactIdNames;
private ConcurrentDictionary<int, string> _contactIdNamesOld;
private volatile bool _isBusyUpdating = false;
public ConcurrentDictionary<int, string> ContactIdNames
{
get
{
if (!_isBusyUpdating) return _contactIdNames;
return _contactIdNamesOld;
}
private set
{
if(_isBusyUpdating) _contactIdNamesOld =
new ConcurrentDictionary<int, string>(_contactIdNames);
_contactIdNames = value;
}
}
And your method can be :
private void OnTimedEvent(Object source, System.Timers.ElapsedEventArgs e) // not static: it uses the instance fields above
{
if (_isBusyUpdating) return;
lock (obj)
{
_isBusyUpdating = true;
// Get new data values and update the list
try
{
ContactIdNames = new ConcurrentDictionary<int, string>();
using (var db = new DBEntities())
{
foreach (var item in db.ContactIDs.Select(x => new { x.Qualifier, x.AlarmCode, x.Description }).AsEnumerable())
{
int key = (item.Qualifier * 1000) + item.AlarmCode;
_contactIdNames.TryAdd(key, item.Description);
}
}
}
catch (Exception ex) // "e" is already taken by the ElapsedEventArgs parameter
{
Debug.WriteLine("Error occurred in update ContactId db store: " + ex);
_contactIdNames = _contactIdNamesOld;
}
finally
{
_isBusyUpdating = false;
}
}
}
P.S.
"My concern is that when I reinitialize the dictionary the querying threads will no longer be thread safe for the instance when the initialization takes place. For that reason I have used a lock for the dictionary when updating it, However I am not sure if this is correct as the object changes in the lock?"
It's the ConcurrentDictionary<TKey, TValue> type that is thread safe, not one particular instance of it, so even if you create a new instance and change the reference to point to it, it's not something to worry about.

Related

How do I accomplish concurrent, thread safe access to a list?

What if I obtain an object reference safely within a lock, and then begin to use it outside the lock without synchronization? Imagine that the other threads can NOT use this object after the lock is released. Would it be guaranteed that I have no stale data or data cached in the processor? Would I see all the changes other threads made before I got the lock? Generally, would it be thread safe?
Let me show some code.
private List<string> list1 = new List<string>();
private List<string> list2 = new List<string>();
private List<string> list;
private bool flagList;
private readonly object locker = new object();
// in constructor
list = list1;
flagList = true;
public void Write(string data)
{
lock (locker)
{
list.Add(data);
}
}
// guaranteed non-reentrant code
private void TimerElapsed(object obj)
{
List<string> l;
lock (locker)
{
l = list;
list = (flagList) ? list2 : list1;
flagList = !flagList;
}
// process l Count/items here, then clear
...
l.Clear();
// restart timer
}
So, TimerElapsed is guaranteed to run only one at a time, and Write could be called by multiple different threads at any time. The question is: would I always see through l all the additions made since the previous TimerElapsed?
So if that is your question, Write will not add any items while you are within the block of the lock statement in TimerElapsed.
You can simplify your code, though, by using only one list.
private List<string> list = new List<string>();
private readonly object locker = new object();
public void Write(string data)
{
lock (locker)
{
list.Add(data);
}
}
// guaranteed non-reentrant code
private void TimerElapsed(object obj)
{
List<string> copy;
lock (locker)
{
copy = list;
list = new List<string>();
}
// process copy
}
In your example, l.Clear() and your other processing occurs after you have released the lock. That means that by the time l.Clear() is executed, another thread could have acquired the lock and started executing code.
Since you've released the lock, while you're processing another thread could call Write. The timer could elapse again and another thread could acquire the lock, get a reference to the same list, and start processing it. This could happen multiple times. (You don't want to make any assumptions about the length of the timer interval vs. how long it takes to process.)
Instead of making access to the List thread safe, what you really need is for the collection itself to be thread safe. Otherwise, as you said, any other thread that obtained a reference "safely" can then do whatever it wants with that object, whenever it wants.
One solution is to use ConcurrentQueue. This allows you to safely insert and remove on separate threads, and it handles the locking internally for you.
public class UsesConcurrentQueue
{
private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();
public void Write(string data)
{
_queue.Enqueue(data);
}
private void TimerElapsed(object obj)
{
string dataToProcess = null;
while(_queue.TryDequeue(out dataToProcess))
{
// process each string;
}
}
}
Now when you add an item into the queue you don't have to care whether another thread is reading from the same queue.
When the timer elapses and you start processing, you'll keep processing until the queue is empty. You could keep adding items to the queue while you're processing and it would be okay.
If you need to ensure that you don't process the queue on two threads (suppose the timer elapses again before you finish processing the first time) then you could just stop the timer when you enter that method and restart it when you exit.
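Stopping and restarting the timer could be sketched like this, assuming a System.Timers.Timer with AutoReset set to false so that it only fires again once it has been re-armed. QueueProcessor and its members are hypothetical names, and TimerElapsed is public only so the sketch is easy to try out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Timers;

public class QueueProcessor : IDisposable
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();
    private readonly Timer _timer;

    public QueueProcessor(double intervalMs)
    {
        // AutoReset = false: the timer fires once and stays stopped until restarted.
        _timer = new Timer(intervalMs) { AutoReset = false };
        _timer.Elapsed += (sender, e) => TimerElapsed();
    }

    public void Start() { _timer.Start(); }

    public void Write(string data) { _queue.Enqueue(data); }

    // Drains the queue and returns the number of items handled.
    public int TimerElapsed()
    {
        int processed = 0;
        try
        {
            string item;
            while (_queue.TryDequeue(out item))
            {
                processed++; // placeholder for the real per-item work
            }
        }
        finally
        {
            _timer.Start(); // re-arm only after this pass has finished
        }
        return processed;
    }

    public void Dispose() { _timer.Dispose(); }
}
```

Because the timer is re-armed in the finally block, the interval is measured from the end of one pass to the start of the next, so two passes cannot overlap.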
Or suppose you've got some reason why you want to process a list of items instead of processing them one at a time, like you're processing them as a batch. Then you could do this:
private void TimerElapsed(object obj)
{
string dataToProcess = null;
var list = new List<string>();
while (_queue.TryDequeue(out dataToProcess))
{
list.Add(dataToProcess);
}
// Now you've got a list.
}
Now you've got a List of items, but it's a local variable, not a class-level field. So you can operate on it, even send it to another method, without having to worry that some other thread is going to modify it.

Is Garbage Collection occurring on my static property before my background worker has fully consumed it?

I have a background worker in a web page which processes a large file import. I have a static property containing a Dictionary of values which I need my background worker to access. To prevent issues with garbage collection, I stringify the Dictionary when passing it into the background worker. The problem is, 1 out of 20 or so times, the Dictionary appears to be garbage collected before it is stringified.
static readonly Dictionary<int, int> MyDictionary = new Dictionary<int, int>();
// When button is clicked, I fire the background worker
// Assume, for the sake of argument, that I've filled the Dictionary with a list of values and those values exist at the time the worker is queued up.
protected void OnProcessClickHandler(object sender, EventArgs e)
{
ThreadPool.QueueUserWorkItem(ProcessInBackground, new object[] {
DictionaryIntIntToCsv(MyDictionary)
});
}
// Example of the Background Process
private void ProcessInBackground(object state)
{
object[] parms = state as object[];
if (parms != null && parms.Length > 0)
{
var MyNewDictionary = DictionaryIntIntFromCsv(parms[0] as string);
//... Doing something with the Dictionary
}
}
// Here are some helper methods I am using to stringify the Dictionary. You can ignore these unless you think they have something to do with the issue at hand.
public static Dictionary<int, int> DictionaryIntIntFromCsv(string csv)
{
var dictionary = new Dictionary<int, int>();
foreach (var pair in csv.Split(','))
{
var arrNameValue = pair.Split(':');
if (arrNameValue.Count() != 2) continue;
var key = 0;
var val = 0;
int.TryParse(arrNameValue[0], out key);
int.TryParse(arrNameValue[1], out val);
if (key > 0 && val > 0)
{
dictionary.Add(key, val);
}
}
return dictionary;
}
public static string DictionaryIntIntToCsv(Dictionary<int, int> dictionary)
{
var str = "";
foreach (var key in dictionary.Keys)
{
var value = 0;
dictionary.TryGetValue(key, out value);
if (key == 0 || value == 0) continue;
var item = key + ":" + value;
str += (item + ",");
}
return str;
}
I know there is an issue with Garbage Collection. My theory is sometimes the main thread completes and garbage collection is run before the background worker has a chance to stringify the Dictionary. Would I be correct in assuming I could avoid issues with Garbage Collection if I stringify the Dictionary before queuing the background worker? Like so:
protected void OnProcessClickHandler(object sender, EventArgs e)
{
var MyString = DictionaryIntIntToCsv(MyDictionary);
ThreadPool.QueueUserWorkItem(ProcessInBackground, new object[] {
MyString
});
}
NOTE: The page is interactive and does several postbacks before firing off the background worker.
There is really a lot of misinformation and bizarre implementation in this question, so much so that it cannot actually be answered without further clarification.
What leads you to believe the dictionary will be collected once you've "stringified" it? What are you actually doing with the dictionary values in the ProcessInBackground() method? Is the processing actually more expensive than serializing and deserializing a dictionary to and from a string for no reason? If so, why is there a background worker being used at all? Why is the string passed in inside an object array instead of simply the string itself? Further on that point, why is the dictionary being serialized at all? Is there any good reason it can't be passed in as the state argument directly?
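To illustrate the last point, here is a minimal sketch of handing the worker a copy of the dictionary as the state argument instead of a CSV string. ImportWorker, Set, QueueImport and the Sync lock object are hypothetical names; the copy under a lock is an assumption, made in case other requests modify the shared dictionary while the worker runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ImportWorker
{
    private static readonly Dictionary<int, int> MyDictionary = new Dictionary<int, int>();
    private static readonly object Sync = new object();

    public static void Set(int key, int value)
    {
        lock (Sync) { MyDictionary[key] = value; }
    }

    // Copies the shared dictionary under a lock and hands the copy to the worker.
    public static void QueueImport(Action<Dictionary<int, int>> process)
    {
        Dictionary<int, int> snapshot;
        lock (Sync) { snapshot = new Dictionary<int, int>(MyDictionary); }

        // The queued work item references the snapshot, so it stays reachable
        // (and therefore is not collected) until the work item has run.
        ThreadPool.QueueUserWorkItem(
            state => process((Dictionary<int, int>)state), snapshot);
    }
}
```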
You are likely initializing the property on page load. The reference to the property is tied to the instance of the page which existed on page load. After the server delivered the initial page to you, the class was eligible for garbage collection.
I believe you are seeing a race condition between how long it takes the user to do the postback on the page and how long it takes the server to collect the first instance of the class.
If the property were non-static, the values would not be there on postback. However, since it is a static property, it will exist in memory until the Garbage collector cleans it up.
Here you create a local variable with the dictionary:
if (parms != null && parms.Length > 0)
{
var MyNewDictionary = DictionaryIntIntFromCsv(parms[0] as string);
}
The local variable above is completely separate from the static field below, so assigning to it does not affect the field in any way. Nowhere else in your code do you ever populate the static field MyDictionary:
static readonly Dictionary<int, int> MyDictionary = new Dictionary<int, int>();

WPF MultiThreading: How can this still cause an exception? (Careful with that Dispatcher!)

I have the following code in a message handler (that can be invoked on any thread):
private readonly Dictionary<string,IView> _openedViews = new Dictionary<string,IView>();
private readonly object _lockObject = new object();
public MainView()
{
Messenger.Default.Register<ViewChangeMessage>(this, "adminView", m =>
{
var key = m.ViewName;
lock (_lockObject)
{
if (_openedViews.ContainsKey(key) == false)
_openedViews.Add(key, GetView(key));
content.Content = _openedViews[key];
}
//...
});
//...
How can I still get this exception: An element with the same key already exists in the System.Collections.Generic.Dictionary<TKey,TValue>.
The exception is produced if I rapidly cause the message to be sent multiple times.
EDIT: added more context to the code, Messenger is from Galasoft.MVVMLight
Well in that code you posted I don't see any data race.
If GetView cannot cause a data race you could try to replace that entire block of locked code with a ConcurrentDictionary.GetOrAdd:
private readonly ConcurrentDictionary<string,IView> _openedViews =
new ConcurrentDictionary<string,IView>();
public MainView()
{
Messenger.Default.Register<ViewChangeMessage>(this, "adminView", m =>
{
var key = m.ViewName;
content.Content = _openedViews.GetOrAdd(key, k => GetView(k)); // factory overload: GetView is not called when the key already exists
//...
});
//...
Have you made sure that all threads are using the same instance of _lockObject? If they're not, then it won't stop multiple threads from getting to your add code.
Move var key = m.ViewName; inside the lock statement.
Here is what happened:
GetView instantiated a view which had a long-running operation somewhere (waiting on a background thread), and so that the waiting wouldn't lock up the UI, someone introduced this code:
public static void DoEvents()
{
DispatcherFrame frame = new DispatcherFrame();
Dispatcher.CurrentDispatcher.BeginInvoke(DispatcherPriority.Background,
new DispatcherOperationCallback(ExitFrame), frame);
Dispatcher.PushFrame(frame);
}
and thanks to PushFrame, a second message was handled ON THE SAME THREAD as the first, hence the lock did nothing to stop it.
Once I rearranged the code to this, the problem went away:
if (_openedViews.ContainsKey(key) == false)
{
_openedViews.Add(key, null);
_openedViews[key] = ServiceRegistry.GetService<IShellService>().GetView(key);
}
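The same-thread re-entry described above can be reproduced without WPF, because a C# lock is reentrant for the thread that already holds it. In this sketch the nested call inside GetView is a hypothetical stand-in for what PushFrame does (handling another message on the current thread), and views are plain strings.

```csharp
using System;
using System.Collections.Generic;

public static class ReentrancyDemo
{
    private static readonly object LockObject = new object();
    private static readonly Dictionary<string, string> OpenedViews =
        new Dictionary<string, string>();

    // Stand-in for GetView: while "building" the view it handles a nested
    // message on the same thread, much as a pumped dispatcher frame would.
    private static string GetView(string key, bool pumpNestedMessage)
    {
        if (pumpNestedMessage)
            HandleMessage(key, false);
        return "view:" + key;
    }

    // Returns false when the add failed because the key was already present.
    public static bool HandleMessage(string key, bool pumpNestedMessage)
    {
        lock (LockObject) // the owning thread can take this lock again
        {
            if (!OpenedViews.ContainsKey(key))
            {
                var view = GetView(key, pumpNestedMessage);
                try
                {
                    OpenedViews.Add(key, view);
                }
                catch (ArgumentException)
                {
                    return false; // the nested call already added the key
                }
            }
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(HandleMessage("admin", true)); // prints "False"
    }
}
```

The outer call passes the ContainsKey check, the nested call adds the key while the outer one is still inside GetView, and the outer Add then fails with the duplicate-key exception.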

Multiple timers accessing dictionary object in a singleton object

I have a singleton object and have a dictionary defined in it.
public class MyClass
{
public static readonly MyClass Instance = new MyClass();
private MyClass()
{}
public Dictionary<int, int> MyDictionary = new Dictionary<int, int>();
}
Now, I have two System.Timers.Timer objects updating MyDictionary.
System.Timers.Timer timer1 = new System.Timers.Timer(5);
timer1.AutoReset = false;
timer1.Elapsed += new System.Timers.ElapsedEventHandler(MyTimer1Handler);
timer1.Enabled = true;
timer1.Start();
System.Timers.Timer timer2 = new System.Timers.Timer(5);
timer2.AutoReset = false;
timer2.Elapsed += new System.Timers.ElapsedEventHandler(MyTimer2Handler);
timer2.Enabled = true;
timer2.Start();
private void MyTimer1Handler(object sender, ElapsedEventArgs e)
{
MyClass.Instance.MyDictionary[1] = 100;
}
private void MyTimer2Handler(object sender, ElapsedEventArgs e)
{
MyClass.Instance.MyDictionary[2] = 100;
}
My question is now, considering the elapsed event handler of timers operate uniquely on index 1 and index 2 of MyDictionary, do I need any lock on MyDictionary ?
Yes, you have to.
http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
That says reading is thread safe, but editing is not. It also says it isn't really safe to iterate the Dictionary.
If you are able to use .NET 4, you can use a ConcurrentDictionary, which is thread safe.
http://msdn.microsoft.com/en-us/library/dd287191.aspx
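A minimal sketch of the ConcurrentDictionary suggestion: the two tasks below stand in for the two timer handlers, and the TwoWriters class name is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class TwoWriters
{
    public static ConcurrentDictionary<int, int> Run()
    {
        var dictionary = new ConcurrentDictionary<int, int>();
        // Each task plays the role of one timer handler writing its own key.
        var t1 = Task.Run(() => { dictionary[1] = 100; });
        var t2 = Task.Run(() => { dictionary[2] = 100; });
        Task.WaitAll(t1, t2);
        return dictionary;
    }

    public static void Main()
    {
        var d = Run();
        Console.WriteLine("{0} {1}", d[1], d[2]); // prints "100 100"
    }
}
```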
For the specific example you posted, yes, you have to. Strictly speaking, though, it is not always necessary; it depends on your usage pattern.
For example, if you have two predetermined keys, you are not modifying the shared state of the dictionary as long as one thread's operation does not affect the state of the other's; that is, if you know that you are not adding or removing keys and that each thread will access its own specific key.
Let's consider the following simplified example, where we simply increment the previous value of two given keys in parallel:
class Program
{
static Dictionary<string, int> _dictionary = new Dictionary<string, int>();
static void Main(string[] args)
{
_dictionary["key1"] = 0;
_dictionary["key2"] = 0;
Action<string> updateEntry = (key) =>
{
for (int i = 0; i < 10000000; i++)
{
_dictionary[key] = _dictionary[key] + 1;
}
};
var task1 = Task.Factory.StartNew(() =>
{
updateEntry("key1");
});
var task2 = Task.Factory.StartNew(() =>
{
updateEntry("key2");
});
Task.WaitAll(task1, task2);
Console.WriteLine("Key1 = {0}", _dictionary["key1"]);
Console.WriteLine("Key2 = {0}", _dictionary["key2"]);
Console.ReadKey();
}
}
What do you think the value of each of the keys of the dictionary will be after iterating in 2 separate threads simultaneously on the same dictionary for more than 10 million times within a loop?
Well you get
Key1 = 10000000
Key2 = 10000000
No extra synchronization is necessary in the above example simply to assign values to existing keys in a dictionary.
Of course, if you wanted to add or remove keys then you need to consider synchronizing or using data structures such as ConcurrentDictionary<TKey,TValue>
In your case you are actually adding values to the dictionary, so you have to use some form of synchronization.
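One simple form of synchronization, sketched here under the assumption that the dictionary is made private and every access goes through the singleton: the Set and TryGet members and the _sync lock object are hypothetical additions to the question's MyClass.

```csharp
using System.Collections.Generic;

public class MyClass
{
    public static readonly MyClass Instance = new MyClass();

    private readonly object _sync = new object();
    private readonly Dictionary<int, int> _myDictionary = new Dictionary<int, int>();

    private MyClass() { }

    // Adds the key or overwrites its value; safe to call from both timer handlers.
    public void Set(int key, int value)
    {
        lock (_sync) { _myDictionary[key] = value; }
    }

    // Reads take the same lock, so they never observe a half-finished insert.
    public bool TryGet(int key, out int value)
    {
        lock (_sync) { return _myDictionary.TryGetValue(key, out value); }
    }
}
```

A timer handler would then call MyClass.Instance.Set(1, 100) instead of touching the dictionary directly.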

Is yield return in C# thread-safe?

I have the following piece of code:
private Dictionary<object, object> items = new Dictionary<object, object>();
public IEnumerable<object> Keys
{
get
{
foreach (object key in items.Keys)
{
yield return key;
}
}
}
Is this thread-safe? If not do I have to put a lock around the loop or the yield return?
Here is what I mean:
Thread1 accesses the Keys property while Thread2 adds an item to the underlying dictionary. Is Thread1 affected by the add of Thread2?
What exactly do you mean by thread-safe?
You certainly shouldn't change the dictionary while you're iterating over it, whether in the same thread or not.
If the dictionary is being accessed in multiple threads in general, the caller should take out a lock (the same one covering all accesses) so that they can lock for the duration of iterating over the result.
EDIT: To respond to your edit, no it in no way corresponds to the lock code. There is no lock automatically taken out by an iterator block - and how would it know about syncRoot anyway?
Moreover, just locking the return of the IEnumerable<TKey> doesn't make it thread-safe either - because the lock only affects the period of time when it's returning the sequence, not the period during which it's being iterated over.
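If callers can live with a point-in-time copy of the keys, one way around the problem described above is to copy the keys while holding the lock and let callers enumerate the copy. This is only a sketch: ItemStore, Add and _syncRoot are hypothetical names, and every access to the dictionary is assumed to go through the same lock.

```csharp
using System;
using System.Collections.Generic;

public class ItemStore
{
    private readonly Dictionary<object, object> _items = new Dictionary<object, object>();
    private readonly object _syncRoot = new object();

    public void Add(object key, object value)
    {
        lock (_syncRoot) { _items.Add(key, value); }
    }

    public IEnumerable<object> Keys
    {
        get
        {
            // Copy while holding the lock; callers enumerate the copy,
            // so later Add calls cannot invalidate their enumerator.
            lock (_syncRoot) { return new List<object>(_items.Keys); }
        }
    }
}
```

The trade-off is an extra allocation per call, and the caller may see keys that are slightly out of date.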
Check out this post on what happens behind the scenes with the yield keyword:
Behind the scenes of the C# yield keyword
In short - the compiler takes your yield keyword and generates an entire class in the IL to support the functionality. You can check out the page after the jump and check out the code that gets generated...and that code looks like it tracks thread id to keep things safe.
OK, I did some testing and got an interesting result.
It seems that it is more an issue of the enumerator of the underlying collection than the yield keyword. The enumerator (actually its MoveNext method) throws (if implemented correctly) an InvalidOperationException because the enumeration has changed. According to the MSDN documentation of the MoveNext method this is the expected behavior.
Because enumerating through a collection is usually not thread-safe, a yield return is not either.
I believe it is, but I cannot find a reference that confirms it. Each time any thread calls foreach on an iterator, a new thread-local* instance of the underlying IEnumerator should get created, so there should not be any "shared" memory state that two threads can conflict over...
* Thread-local in the sense that its reference variable is scoped to a method stack frame on that thread.
I believe the yield implementation is thread-safe. Indeed, you can run this simple program at home and you will notice that the state of the listInt() method is correctly saved and restored for each thread, without side effects from other threads.
public class Test
{
public void Display(int index)
{
foreach (int i in listInt())
{
Console.WriteLine("Thread {0} says: {1}", index, i);
Thread.Sleep(1);
}
}
public IEnumerable<int> listInt()
{
for (int i = 0; i < 5; i++)
{
yield return i;
}
}
}
class MainApp
{
static void Main()
{
Test test = new Test();
for (int i = 0; i < 4; i++)
{
int x = i;
Thread t = new Thread(p => { test.Display(x); });
t.Start();
}
// Wait for user
Console.ReadKey();
}
}
class Program
{
static SomeCollection _sc = new SomeCollection();
static void Main(string[] args)
{
// Create one thread that adds entries and
// one thread that reads them
Thread t1 = new Thread(AddEntries);
Thread t2 = new Thread(EnumEntries);
t2.Start(_sc);
t1.Start(_sc);
}
static void AddEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 20; x++)
{
Trace.WriteLine("adding");
sc.Add(x);
Trace.WriteLine("added");
Thread.Sleep(x * 3);
}
}
static void EnumEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 10; x++)
{
Trace.WriteLine("Loop" + x);
foreach (int item in sc.AllValues)
{
Trace.Write(item + " ");
}
Thread.Sleep(30);
Trace.WriteLine("");
}
}
}
class SomeCollection
{
private List<int> _collection = new List<int>();
private object _sync = new object();
public void Add(int i)
{
lock(_sync)
{
_collection.Add(i);
}
}
public IEnumerable<int> AllValues
{
get
{
lock (_sync)
{
foreach (int i in _collection)
{
yield return i;
}
}
}
}
}
