Is yield return in C# thread-safe?

I have the following piece of code:
private Dictionary<object, object> items = new Dictionary<object, object>();
public IEnumerable<object> Keys
{
get
{
foreach (object key in items.Keys)
{
yield return key;
}
}
}
Is this thread-safe? If not, do I have to put a lock around the loop or the yield return?
Here is what I mean:
Thread1 accesses the Keys property while Thread2 adds an item to the underlying dictionary. Is Thread1 affected by the add of Thread2?

What exactly do you mean by thread-safe?
You certainly shouldn't change the dictionary while you're iterating over it, whether in the same thread or not.
If the dictionary is being accessed in multiple threads in general, the caller should take out a lock (the same one covering all accesses) so that they can lock for the duration of iterating over the result.
EDIT: To respond to your edit: no, it in no way corresponds to the lock code. There is no lock automatically taken out by an iterator block - and how would it know about syncRoot anyway?
Moreover, just locking the return of the IEnumerable<TKey> doesn't make it thread-safe either - because the lock only affects the period of time when it's returning the sequence, not the period during which it's being iterated over.
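For example, a caller-side sketch of that pattern might look like the following (the _sync field and the SafeKeyStore wrapper are illustrative assumptions, not code from the question):
using System;
using System.Collections.Generic;

class SafeKeyStore
{
    private readonly object _sync = new object();
    private readonly Dictionary<object, object> items = new Dictionary<object, object>();

    public void Add(object key, object value)
    {
        lock (_sync) { items[key] = value; }
    }

    // The caller-side equivalent of the advice above: hold the same lock
    // for the whole enumeration, not just while obtaining the sequence.
    public void PrintKeys()
    {
        lock (_sync)
        {
            foreach (object key in items.Keys)
            {
                Console.WriteLine(key);
            }
        }
    }
}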

Check out this post on what happens behind the scenes with the yield keyword:
Behind the scenes of the C# yield keyword
In short - the compiler takes your yield keyword and generates an entire class in the IL to support the functionality. You can check out the generated code on the page after the jump... and that code looks like it tracks the thread ID to keep things safe.
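As a rough illustration (not the actual generated code), a hand-written approximation of such an iterator class might look like this; note that the thread-ID check only decides whether the instance can be reused as its own enumerator - it does not synchronize anything:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// A heavily simplified, hand-written approximation of the kind of class the
// compiler emits for the Keys iterator above. Names and structure are
// illustrative only; the real generated code looks different.
class KeysIterator : IEnumerable<object>, IEnumerator<object>
{
    private readonly Dictionary<object, object> _items;
    private readonly int _initialThreadId;
    private int _state;                        // 0 = fresh, 1 = enumerating, 2 = finished
    private IEnumerator<object> _keys;
    private object _current;

    public KeysIterator(Dictionary<object, object> items)
    {
        _items = items;
        _initialThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    public IEnumerator<object> GetEnumerator()
    {
        if (_state == 0 && Thread.CurrentThread.ManagedThreadId == _initialThreadId)
        {
            _state = 1;
            return this;                       // the creating thread reuses this instance
        }
        var copy = new KeysIterator(_items);   // other threads (or repeat calls) get a fresh one
        copy._state = 1;
        return copy;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public object Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        if (_state != 1) return false;
        if (_keys == null) _keys = _items.Keys.GetEnumerator();
        if (_keys.MoveNext())                  // still throws if the dictionary is modified
        {
            _current = _keys.Current;
            return true;
        }
        _state = 2;
        return false;
    }

    public void Dispose() { if (_keys != null) _keys.Dispose(); }
    public void Reset() { throw new NotSupportedException(); }
}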

OK, I did some testing and got an interesting result.
It seems that it is more an issue of the enumerator of the underlying collection than of the yield keyword. The enumerator (actually its MoveNext method) throws (if implemented correctly) an InvalidOperationException because the collection has been modified during enumeration. According to the MSDN documentation for the MoveNext method, this is the expected behavior.
Because enumerating through a collection is usually not thread-safe, a yield return is not either.
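A minimal repro of that behavior might look like this (illustrative only):
using System;
using System.Collections.Generic;

class ModifiedWhileEnumerating
{
    static void Main()
    {
        var dict = new Dictionary<int, string> { { 1, "a" }, { 2, "b" } };
        try
        {
            foreach (int key in dict.Keys)
            {
                dict.Add(3, "c");   // mutating the dictionary mid-enumeration
            }
        }
        catch (InvalidOperationException ex)
        {
            // "Collection was modified; enumeration operation may not execute."
            Console.WriteLine(ex.Message);
        }
    }
}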

I believe it is, but I cannot find a reference that confirms it. Each time any thread calls foreach on an iterator, a new thread local* instance of the underlying IEnumerator should get created, so there should not be any "shared" memory state that two threads can conflict over...
*Thread-local - in the sense that its reference variable is scoped to a method stack frame on that thread
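A quick way to see that independence (an illustrative sketch, not from the original answer):
using System;
using System.Collections.Generic;

class EnumeratorDemo
{
    static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 3; i++) yield return i;
    }

    static void Main()
    {
        IEnumerable<int> seq = Numbers();
        using (var e1 = seq.GetEnumerator())
        using (var e2 = seq.GetEnumerator())
        {
            e1.MoveNext(); e1.MoveNext();               // advance the first enumerator twice
            e2.MoveNext();                              // advance the second once
            Console.WriteLine(e1.Current);              // 1
            Console.WriteLine(e2.Current);              // 0 - state is not shared
            Console.WriteLine(ReferenceEquals(e1, e2)); // False - separate instances
        }
    }
}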

I believe the yield implementation is thread-safe. Indeed, you can run this simple program at home and you will notice that the state of the listInt() method is correctly saved and restored for each thread, without side effects from other threads.
using System;
using System.Collections.Generic;
using System.Threading;
public class Test
{
public void Display(int index)
{
foreach (int i in listInt())
{
Console.WriteLine("Thread {0} says: {1}", index, i);
Thread.Sleep(1);
}
}
public IEnumerable<int> listInt()
{
for (int i = 0; i < 5; i++)
{
yield return i;
}
}
}
class MainApp
{
static void Main()
{
Test test = new Test();
for (int i = 0; i < 4; i++)
{
int x = i;
Thread t = new Thread(p => { test.Display(x); });
t.Start();
}
// Wait for user
Console.ReadKey();
}
}

using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
class Program
{
static SomeCollection _sc = new SomeCollection();
static void Main(string[] args)
{
// Create one thread that adds entries and
// one thread that reads them
Thread t1 = new Thread(AddEntries);
Thread t2 = new Thread(EnumEntries);
t2.Start(_sc);
t1.Start(_sc);
}
static void AddEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 20; x++)
{
Trace.WriteLine("adding");
sc.Add(x);
Trace.WriteLine("added");
Thread.Sleep(x * 3);
}
}
static void EnumEntries(object state)
{
SomeCollection sc = (SomeCollection)state;
for (int x = 0; x < 10; x++)
{
Trace.WriteLine("Loop" + x);
foreach (int item in sc.AllValues)
{
Trace.Write(item + " ");
}
Thread.Sleep(30);
Trace.WriteLine("");
}
}
}
class SomeCollection
{
private List<int> _collection = new List<int>();
private object _sync = new object();
public void Add(int i)
{
lock(_sync)
{
_collection.Add(i);
}
}
public IEnumerable<int> AllValues
{
get
{
lock (_sync)
{
foreach (int i in _collection)
{
yield return i;
}
}
}
}
}

Related

Locking and ref values from external function

I'm trying to track down a bug that I think might be related to the applications multithreading. I've simplified the code below:
class Outer {
private static int count;
//this function is called from multiple threads in quick succession
public void DoFoo() {
Inner.Increment(ref count);
}
}
class Inner {
private readonly static object mLock = new object();
public static string Increment(ref int count) {
lock (mLock) {
if (count > 1000)
count = 0;
count++;
}
return count.ToString();
}
}
Can the locking guarantee the safety of a variable passed in that way? Is there any copying of count going on that seems non-obvious and may break the memory safety? I was thinking it might return a new int and do the assignment at the end of the method or something. Apart from that it's my understanding that the lock section would handle any caching issues.
The error which brought the issue to our attention was seemingly one of the threads having a stale version of count.
The problem here is that some other thread could read Outer.count directly and see it as 0, because Outer.count can be accessed without first obtaining the lock (as written in your code, count can normally be 0 only before the first call to Inner.Increment; from then on it can only have a value between 1 and 1001).
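Before going lock-free, one possible fix within the original locking design is simply to read the counter under the same lock; the Read helper below is a hypothetical addition for illustration:
class Inner
{
    private static readonly object mLock = new object();

    public static string Increment(ref int count)
    {
        lock (mLock)
        {
            if (count > 1000)
                count = 0;
            count++;
            return count.ToString();
        }
    }

    // Hypothetical helper: reading under the same lock guarantees a fresh value.
    public static int Read(ref int count)
    {
        lock (mLock)
        {
            return count;
        }
    }
}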
Lockless can be done in this way:
class Inner
{
public static string Increment(ref int count)
{
while (true)
{
int original = count;
int next = original;
if (next > 1000)
{
next = 0;
}
next++;
if (Interlocked.CompareExchange(ref count, next, original) == original)
{
return next.ToString();
}
}
}
}
I'm calculating the next value and storing it (through Interlocked.CompareExchange) only if count hasn't changed in the meantime.
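A quick, hedged sanity check of that loop (assuming the Inner class above is in scope; the other names are illustrative):
using System;
using System.Threading.Tasks;

class IncrementDemo
{
    static int count;

    static void Main()
    {
        // Hammer the lock-free Increment from many threads; the counter always
        // lands somewhere in the 1..1001 range because CompareExchange retries
        // whenever another thread got there first.
        Parallel.For(0, 100000, i => Inner.Increment(ref count));
        Console.WriteLine(count);
    }
}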

IProgress<T> and Parallel.ForEach Sync Issues

I'm running into a sync issue involving reporting progress inside of a Parallel.ForEach. I recreated a simplified version of the problem in a Console App. The example actually only uses one item in the list. Here's the code:
class Program
{
static void Main(string[] args)
{
int tracker = 0;
Parallel.ForEach(Enumerable.Range(1, 1), (item) =>
{
var progress = new Progress<int>((p) =>
{
tracker = p;
Console.WriteLine(String.Format("{0}", p));
});
Test(progress);
});
Console.WriteLine("The last value is: {0}", tracker);
Console.ReadKey();
}
static void Test(IProgress<int> progress)
{
for (int i = 0; i < 20; i++)
{
progress.Report(i);
}
}
}
As you can see, the line I expect to see last isn't output last and doesn't contain 20. But if I remove progress reporting and just write to output in the for loop like this:
class Program
{
static void Main(string[] args)
{
int tracker = 0;
Parallel.ForEach(Enumerable.Range(1, 1), (item) =>
{
tracker = Test();
});
Console.WriteLine("The last value is: {0}", tracker);
Console.ReadKey();
}
static int Test()
{
int i;
for ( i = 0; i < 20; i++)
{
Console.WriteLine(i.ToString());
}
return i;
}
}
it behaves like I expect. As far as I know, Parallel.ForEach creates a Task for each item in the list, and IProgress<T> captures the context in which it's created. Given that it's a console app, I didn't think that would matter. Help please!
The explanation is pretty much exactly what's written in the docs:
Any handler provided to the constructor or event handlers registered with the ProgressChanged event are invoked through a SynchronizationContext instance captured when the instance is constructed. If there is no current SynchronizationContext at the time of construction, the callbacks will be invoked on the ThreadPool.
By using Progress<T>.Report you're effectively queueing 20 tasks on the thread pool. There's no guarantee as to what order they're executed in.
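If ordered, inline reporting is what you are after, one possible workaround (an illustrative sketch, not a BCL type) is an IProgress<T> that invokes the handler synchronously on the reporting thread:
using System;

// Report invokes the callback inline instead of posting to a SynchronizationContext
// or the thread pool, so values arrive in the order they are reported.
sealed class SynchronousProgress<T> : IProgress<T>
{
    private readonly Action<T> _handler;

    public SynchronousProgress(Action<T> handler)
    {
        _handler = handler;
    }

    public void Report(T value)
    {
        _handler(value);    // runs on the calling thread
    }
}
Passing an instance of this into Test would print 0 through 19 in order and leave tracker at 19, since Report is never called with 20.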

BeginInvoke causes error due to skipped null check

I encountered an error I cannot explain, caused by using Dispatcher.BeginInvoke within a multi-threaded application.
My program contains a List of objects through which I loop using multiple threads and perform some calculations.
I have simplified (and slightly modified) my code structure to the very bare essentials, so it will hopefully be easier to understand:
public class Foo
{
public void DoCalc()
{
//do calculations
}
}
public class Foo2
{
public Foo foo = new Foo();
public Ellipse ellipse;
}
public partial class MainWindow : Window
{
List<Foo2> myList = new List<Foo2>();
List<Tuple<int, int>> indicesList = new List<Tuple<int, int>>();
public MainWindow()
{
InitializeComponent();
for (int i = 0; i < 10; ++i)
myList.Add(new Foo2 { ellipse = new Ellipse() });
for (int i = 10; i < 20; ++i)
myList.Add(new Foo2());
indicesList.Add(new Tuple<int, int>(0, 9));
indicesList.Add(new Tuple<int, int>(10, 19));
}
private void OnStart(object sender, RoutedEventArgs e)
{
foreach (var t in indicesList)
ThreadPool.QueueUserWorkItem(new WaitCallback(Loop), t);
}
private void Loop(object o)
{
Tuple<int, int> indices = o as Tuple<int, int>;
for(int i = indices.Item1; i <= indices.Item2; ++i)
{
myList[i].foo.DoCalc();
if (myList[i].ellipse == null)
continue;
Application.Current.Dispatcher.BeginInvoke(DispatcherPriority.Normal, new Action(() => myList[i].ellipse.Fill = Brushes.Black));
}
}
}
The first half of the items in myList have ellipse pointing to actual objects, while the second half has ellipse set to null. Within Loop I check at every iteration whether ellipse is null and skip that iteration if necessary. What is weird is that in the second thread, where all ellipses are null, the program still ends up calling the BeginInvoke action on the first item (index 10), causing the program to crash with a null reference exception.
If I put a breakpoint on the line myList[i].foo.DoCalc(); and go through the program slowly step by step, no error occurs. Also when I change BeginInvoke to Invoke no error occurs.
As I understand it, BeginInvoke works asynchronously by sending a request to the Dispatcher to be performed at some point, while Invoke blocks the calling thread until the Dispatcher has performed the request. However, since I neither access the same elements in the two loops nor change anything about the list itself, I don't understand how multithreading or the asynchronous nature of BeginInvoke could in any way interfere with what I am doing here.
Unfortunately, on SO I have only found BeginInvoke errors connected with lists when items are added or removed, but nothing where an error occurs even though all threads seem to access different items at all times. If my code has some fundamental issue that I simply do not understand (which might keep my program from actually working), then please clear this up for me or send me a link to an answer I couldn't find. I have been stuck on this issue for a whole day now!
You are running into variable capture. The i that gets used in the invoked action will likely be indices.Item2 + 1, not the value i had at the time you called BeginInvoke. You must copy i into a local variable that is created fresh each loop iteration.
private void Loop(object o)
{
Tuple<int, int> indices = o as Tuple<int, int>;
for(int i = indices.Item1; i <= indices.Item2; ++i)
{
myList[i].foo.DoCalc();
if (myList[i].ellipse == null)
continue;
int iLocal = i;
Application.Current.Dispatcher.BeginInvoke(DispatcherPriority.Normal, new Action(() => myList[iLocal].ellipse.Fill = Brushes.Black));
}
}
foreach prior to C# 5 had the same issue, see here for more info.
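A tiny self-contained illustration of the capture behavior described above (illustrative code, not from the question):
using System;
using System.Collections.Generic;

class CaptureDemo
{
    static void Main()
    {
        var actions = new List<Action>();
        for (int i = 0; i < 3; i++)
        {
            actions.Add(() => Console.WriteLine(i));   // captures the variable i, not its current value
        }
        foreach (var a in actions) a();                // prints 3, 3, 3
    }
}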
This could have something to do with closures...
Try this:
var current = myList[i];
current.foo.DoCalc();
if (current.ellipse == null)
continue;
Application.Current.Dispatcher.BeginInvoke(DispatcherPriority.Normal,
new Action(() => current.ellipse.Fill = Brushes.Black));

Thread safe re-initialization of concurrent dictionary

I want to know if the following code is thread safe, which I assume it is not. And how I could possibly make it thread safe?
Basically I have a ConcurrentDictionary which acts as a cache for a database table. I want to query the DB every 10 seconds and update the db cache. There will be other threads querying this dictionary the whole time.
I can't just use TryAdd, as there may also be elements which have been removed. So instead of searching through the entire dictionary to possibly update, add, or remove entries, I decided to just reinitialize the dictionary. Please do tell me if this is a silly idea.
My concern is that when I reinitialize the dictionary, the querying threads will no longer be thread safe at the instant the initialization takes place. For that reason I have used a lock on the dictionary when updating it. However, I am not sure if this is correct, as the object changes inside the lock?
private static System.Timers.Timer updateTimer;
private static volatile Boolean _isBusyUpdating = false;
private static ConcurrentDictionary<int, string> _contactIdNames;
public Constructor()
{
// Setup Timers for data updater
updateTimer = new System.Timers.Timer();
updateTimer.Interval = new TimeSpan(0, 0, 10).TotalMilliseconds;
updateTimer.Elapsed += OnTimedEvent;
// Start the timer
updateTimer.Enabled = true;
}
private void OnTimedEvent(Object source, System.Timers.ElapsedEventArgs e)
{
if (!_isBusyUpdating)
{
_isBusyUpdating = true;
// Get new data values and update the list
try
{
var tmp = new ConcurrentDictionary<int, string>();
using (var db = new DBEntities())
{
foreach (var item in db.ContactIDs.Select(x => new { x.Qualifier, x.AlarmCode, x.Description }).AsEnumerable())
{
int key = (item.Qualifier * 1000) + item.AlarmCode;
tmp.TryAdd(key, item.Description);
}
}
if (_contactIdNames == null)
{
_contactIdNames = tmp;
}
else
{
lock (_contactIdNames)
{
_contactIdNames = tmp;
}
}
}
catch (Exception e)
{
Debug.WriteLine("Error occurred in update ContactId db store", e);
}
_isBusyUpdating = false;
}
}
/// Use the dictionary from another Thread
public int GetIdFromClientString(string Name)
{
try
{
int pk;
if (_contactIdNames.TryGetValue(Name, out pk))
{
return pk;
}
}
catch { }
//If all else fails return -1
return -1;
}
You're right, your code is not thread safe.
You need to lock the _isBusyUpdating variable.
You need to lock _contactIdNames every time, not only when it's not null.
Also, this code is similar to the singleton pattern, and it has the same problem with initialization. You can solve it with double-checked locking. However, you also need double-checked locking when accessing entries.
In the case where you replace the whole dictionary at once, you need to lock the current value every time you access it. Otherwise you can access it while it's still changing and get an error. So you either need to lock the variable each time or use Interlocked.
As MSDN says, volatile should do the trick with _isBusyUpdating; it should be thread safe.
If you don't want to keep track of _contactIdNames thread safety, try updating each entry in the same dictionary instead. The problem will then be detecting the differences between the DB and the current values (which entries have been removed or added; the others can simply be rewritten), but not thread safety, since ConcurrentDictionary is already thread safe.
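One hedged way to do the reference swap mentioned above, using Interlocked and Volatile (the names and class shape are illustrative, not from the question):
using System.Collections.Concurrent;
using System.Threading;

static class ContactIdCache
{
    private static ConcurrentDictionary<int, string> _contactIdNames =
        new ConcurrentDictionary<int, string>();

    // Publish the freshly built dictionary atomically.
    public static void Publish(ConcurrentDictionary<int, string> fresh)
    {
        Interlocked.Exchange(ref _contactIdNames, fresh);
    }

    // Take one snapshot of the reference, then read from that snapshot.
    public static string TryGetDescription(int key)
    {
        var snapshot = Volatile.Read(ref _contactIdNames);
        string value;
        snapshot.TryGetValue(key, out value);
        return value;
    }
}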
You seem to be making a lot of work for yourself. Here's how I would tackle this task:
public class Constructor
{
private volatile Dictionary<int, string> _contactIdNames;
public Constructor()
{
Observable
.Interval(TimeSpan.FromSeconds(10.0))
.StartWith(-1)
.Select(n =>
{
using (var db = new DBEntities())
{
return db.ContactIDs.ToDictionary(
x => x.Qualifier * 1000 + x.AlarmCode,
x => x.Description);
}
})
.Subscribe(x => _contactIdNames = x);
}
public string TryGetValue(int key)
{
string value = null;
_contactIdNames.TryGetValue(key, out value);
return value;
}
}
I'm using Microsoft's Reactive Extensions (Rx) Framework - NuGet "Rx-Main" - for the timer to update the dictionary.
The Rx should be fairly straightforward. If you haven't seen it before in very simple terms it's like LINQ meets events.
If you don't like Rx then just go with your current timer model.
All this code does is create a new dictionary from the DB every 10 seconds. I'm just using a plain dictionary since it is only being created from one thread. Since reference assignment is atomic, you can just reassign the dictionary whenever you like with complete thread safety.
Multiple threads can safely read from a dictionary as long as the elements don't change.
I want to know if the following code is thread safe, which I assume it
is not. And how I could possibly make it thread safe?
I believe it's not. First of all, I'd create a property for the ConcurrentDictionary and check inside the getter whether an update is underway; if it is, I'd return the previous version of the object:
private object obj = new object();
private ConcurrentDictionary<int, string> _contactIdNames;
private ConcurrentDictionary<int, string> _contactIdNamesOld;
private volatile bool _isBusyUpdating = false;
public ConcurrentDictionary<int, string> ContactIdNames
{
get
{
if (!_isBusyUpdating) return _contactIdNames;
return _contactIdNamesOld;
}
private set
{
if(_isBusyUpdating) _contactIdNamesOld =
new ConcurrentDictionary<int, string>(_contactIdNames);
_contactIdNames = value;
}
}
And your method can be:
private void OnTimedEvent(Object source, System.Timers.ElapsedEventArgs e)
{
if (_isBusyUpdating) return;
lock (obj)
{
_isBusyUpdating = true;
// Get new data values and update the list
try
{
ContactIdNames = new ConcurrentDictionary<int, string>();
using (var db = new DBEntities())
{
foreach (var item in db.ContactIDs.Select(x => new { x.Qualifier, x.AlarmCode, x.Description }).AsEnumerable())
{
int key = (item.Qualifier * 1000) + item.AlarmCode;
_contactIdNames.TryAdd(key, item.Description);
}
}
}
catch (Exception e)
{
Debug.WriteLine("Error occurred in update ContactId db store", e);
_contactIdNames = _contactIdNamesOld;
}
finally
{
_isBusyUpdating = false;
}
}
}
P.S.
My concern is that when I reinitialize the dictionary the querying
threads will not longer by thread safe for the instance when the
initialization takes place. For that reason I have used a lock for the
dictionary when updating it, However I am not sure if this is correct
as the object changes in the lock?
It's the ConcurrentDictionary<TKey, TValue> type that is thread-safe, not just one particular instance of it, so even if you create a new instance and change the reference to it, that is not something to worry about.

Making a "modify-while-enumerating" collection thread-safe

I want to create a thread-safe collection that can be modified while being enumerated.
The sample ActionSet class stores Action handlers. It has the Add method that adds a new handler to the list and the Invoke method that enumerates and invokes all of the collected action handlers. The intended working scenarios include very frequent enumerations with occasional modifications while enumerating.
Normal collections throw an exception if you modify them using the Add method while an enumeration is not over.
There is an easy, but slow solution to the problem: Just clone the collection before enumeration:
class ThreadSafeSlowActionSet {
List<Action> _actions = new List<Action>();
public void Add(Action action) {
lock(_actions) {
_actions.Add(action);
}
}
public void Invoke() {
List<Action> actionsClone;
lock(_actions) {
actionsClone = _actions.ToList();
}
foreach (var action in actionsClone) {
action();
}
}
}
The problem with this solution is the enumeration overhead and I want enumeration to be very fast.
I've created a rather fast "recursion-safe" collection that allows adding new values even while enumerating. If you add new values while the main _actions collection is being enumerated, the values are added to the temporary _delta collection instead of the main one. After all enumerations are finished, the _delta values are added to the _actions collection. If you add some new values while the main _actions collection is being enumerated (creating the _delta collection) and then re-enter the Invoke method, we have to create a new merged collection (_actions + _delta) and replace _actions with it.
So, this collection looks "recursion-safe", but I want to make it thread-safe. I think that I need to use the Interlocked.* constructs, classes from System.Threading and other synchronization primitives to make this collection thread-safe, but I don't have a good idea on how to do that.
How to make this collection thread-safe?
class RecursionSafeFastActionSet {
List<Action> _actions = new List<Action>(); //The main store
List<Action> _delta; //Temporary buffer for storing added values while the main store is being enumerated
int _lock = 0; //The number of concurrent Invoke enumerations
public void Add(Action action) {
if (_lock == 0) { //_actions list is not being enumerated and can be modified
_actions.Add(action);
} else { //_actions list is being enumerated and cannot be modified
if (_delta == null) {
_delta = new List<Action>();
}
_delta.Add(action); //Storing the new values in the _delta buffer
}
}
public void Invoke() {
if (_delta != null) { //Re-entering Invoke after calling Add: Invoke->Add,Invoke
Debug.Assert(_lock > 0);
var newActions = new List<Action>(_actions); //Creating a new list for merging delta
newActions.AddRange(_delta); //Merging the delta
_delta = null;
_actions = newActions; //Replacing the original list (which is still being iterated)
}
_lock++;
foreach (var action in _actions) {
action();
}
_lock--;
if (_lock == 0 && _delta != null) {
_actions.AddRange(_delta); //Merging the delta
_delta = null;
}
}
}
Update: Added the ThreadSafeSlowActionSet variant.
A simpler approach (used, for example, by ConcurrentBag) is to have GetEnumerator() return an enumerator over a snapshot of the collection's contents. In your case this might look like:
public IEnumerator<Action> GetEnumerator()
{
lock(sync)
{
return _actions.ToList().GetEnumerator();
}
}
If you do this, you don't need a _delta field and the complexity it adds.
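For completeness, a hedged sketch of the question's class rebuilt around that snapshot enumerator (the class name and other details are illustrative):
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

class SnapshotActionSet : IEnumerable<Action>
{
    private readonly object _sync = new object();
    private readonly List<Action> _actions = new List<Action>();

    public void Add(Action action)
    {
        lock (_sync) { _actions.Add(action); }
    }

    public void Invoke()
    {
        // Enumerates a snapshot (see GetEnumerator), so handlers added during
        // an Invoke, even from other threads, simply show up next time.
        foreach (Action action in this)
        {
            action();
        }
    }

    public IEnumerator<Action> GetEnumerator()
    {
        lock (_sync)
        {
            return _actions.ToList().GetEnumerator();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}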
Here is your class modified for thread safety:
class SafeActionSet
{
Object _sync = new Object();
List<Action> _actions = new List<Action>(); //The main store
List<Action> _delta = new List<Action>(); //Temporary buffer for storing added values while the main store is being enumerated
int _lock = 0; //The number of concurrent Invoke enumerations
public void Add(Action action)
{
lock(_sync)
{
if (0 == _lock)
{ //_actions list is not being enumerated and can be modified
_actions.Add(action);
}
else
{ //_actions list is being enumerated and cannot be modified
_delta.Add(action); //Storing the new values in the _delta buffer
}
}
}
public void Invoke()
{
lock(_sync)
{
if (0 < _delta.Count)
{ //Re-entering Invoke after calling Add: Invoke->Add,Invoke
Debug.Assert(0 < _lock);
var newActions = new List<Action>(_actions); //Creating a new list for merging delta
newActions.AddRange(_delta); //Merging the delta
_delta.Clear();
_actions = newActions; //Replacing the original list (which is still being iterated)
}
++_lock;
}
foreach (var action in _actions)
{
action();
}
lock(_sync)
{
--_lock;
if ((0 == _lock) && (0 < _delta.Count))
{
_actions.AddRange(_delta); //Merging the delta
_delta.Clear();
}
}
}
}
I made a few other tweaks, for the following reasons:
- I reversed the IF expressions to have the constant value first, so that if I make a typo and put "=" instead of "==" or "!=" etc., the compiler will instantly tell me about it. (A habit I got into because my brain and fingers are often out of sync.)
- I preallocated _delta and call .Clear() instead of setting it to null, because I find it easier to read.
- The various lock(_sync) {...} blocks give you thread safety on all instance variable access - with the exception of the access to _actions in the enumeration itself.
Since I actually also needed to delete items from the collection, the implementation that I ultimately used was based on a rewritten LinkedList that locks adjacent nodes on deletion/insertion and doesn't complain if the collection was changed during enumeration.
I also added a Dictionary to make the element search fast.
