Is this Hashset lock threadsafe? - c#

private static readonly object MyMethodLockobject = new object();
private static readonly HashSet<long> ActiveWorkItem = new HashSet<long>();
public async Task MyMethod(long id)
{
lock (MyMethodLockobject)
{
if (ActiveWorkItem.Contains(id))
{
throw new AnotherRequestAlreadyInProgressException();
}
ActiveWorkItem.Add(id);
}
try
{
return await DoWork(id);
}
finally
{
ActiveWorkItem.Remove(id);
}
}
The purpose of ActiveWorkItem is to prevent concurrent calls for the same id. Only Add, Contains, and Remove are needed.
MyMethod is the only place that accesses ActiveWorkItem.
My concern is this block:
finally
{
ActiveWorkItem.Remove(id);
}
Or is it necessary to change it to:
finally
{
lock (MyMethodLockobject)
{
ActiveWorkItem.Remove(id);
}
}
Better alternatives are also appreciated.

Yes, it is necessary to protect the HashSet inside the finally block too.
finally
{
lock (MyMethodLockobject) ActiveWorkItem.Remove(id);
}
Otherwise one thread may be adding an item to the HashSet while other threads are concurrently removing items from the same HashSet, which can corrupt the object's internal state and result in undefined behavior.
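As for a better alternative, one option is to move the set and its lock into a small helper so that every access goes through the same lock. The following is a minimal sketch, not the asker's actual code: WorkItemGuard, TryEnter, Exit, and RunExclusiveAsync are hypothetical names, and InvalidOperationException stands in for the asker's AnotherRequestAlreadyInProgressException. It relies on HashSet<T>.Add returning false when the item is already present, so the Contains check and the Add collapse into one call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: all access to the HashSet happens under one lock.
public sealed class WorkItemGuard
{
    private readonly object _gate = new object();
    private readonly HashSet<long> _active = new HashSet<long>();

    // Returns false if the id is already in progress.
    public bool TryEnter(long id)
    {
        lock (_gate) return _active.Add(id);
    }

    // Removal takes the same lock as insertion.
    public void Exit(long id)
    {
        lock (_gate) _active.Remove(id);
    }

    // Runs the work only if no other call for the same id is active.
    public async Task<T> RunExclusiveAsync<T>(long id, Func<long, Task<T>> work)
    {
        if (!TryEnter(id))
            throw new InvalidOperationException($"Work item {id} is already in progress.");
        try
        {
            return await work(id).ConfigureAwait(false);
        }
        finally
        {
            Exit(id);
        }
    }
}
```

The lock is never held across the await, so it only protects the short set operations and not the work itself.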

Related

Change and read properties of objects in ConcurrentDictionary in thread safe manner

I use a ConcurrentDictionary to collect data in memory in a Web API application. Through API methods I add and update objects in the ConcurrentDictionary, and there is a background thread that analyzes and cleans up this dictionary based on object properties. I'm considering two approaches:
1. Use a lock on the dictionary item in the updateValueFactory of the AddOrUpdate method. The question is how to read the properties properly, so that I am sure I have the latest version and am not reading a property in an unstable state.
public class ThreadsafeService2
{
private readonly ConcurrentDictionary<string, ThreadSafeItem2> _storage =
new ConcurrentDictionary<string, ThreadSafeItem2>();
public void AddOrUpdate(string name)
{
var newVal = new ThreadSafeItem2();
_storage.AddOrUpdate(name, newVal, (key, oldVal) =>
{
//use lock
lock (oldVal)
{
oldVal.Increment();
}
return oldVal;
});
}
public void Analyze()
{
foreach (var key in _storage.Keys)
{
if (_storage.TryGetValue(key, out var item))
{
//how to read it properly?
long ticks = item.ModifiedTicks;
}
}
}
}
public class ThreadSafeItem2
{
private long _modifiedTicks;
private int _counter;
public void Increment()
{
//no interlocked here
_modifiedTicks = DateTime.Now.Ticks;
_counter++;
}
//no interlocked here
public long ModifiedTicks => _modifiedTicks;
public int Counter => _counter;
}
2. Use Interlocked and memory barriers at the property level without a lock, which looks a bit verbose to me.
public class ThreadsafeService1
{
private readonly ConcurrentDictionary<string, ThreadSafeItem1> _storage =
new ConcurrentDictionary<string, ThreadSafeItem1>();
public void AddOrUpdate(string name)
{
var newVal = new ThreadSafeItem1();
_storage.AddOrUpdate(name, newVal, (key, oldVal) =>
{
//no lock here
oldVal.Increment();
return oldVal;
});
}
public void Analyze()
{
foreach(var key in _storage.Keys)
{
if(_storage.TryGetValue(key, out var item))
{
//reading through interlocked
long ticks = item.ModifiedTicks;
}
}
}
}
public class ThreadSafeItem1
{
private long _modifiedTicks;
private int _counter;
public void Increment()
{
//make changes in atomic manner
Interlocked.Exchange(ref _modifiedTicks, DateTime.Now.Ticks);
Interlocked.Increment(ref _counter);
}
public long ModifiedTicks => Interlocked.Read(ref _modifiedTicks);
public int Counter => Thread.VolatileRead(ref _counter);
}
What are the best practices here?
So both of your implementations have major problems. The first solution locks when incrementing, but doesn't lock when reading, meaning the other places accessing the data can read invalid state.
A non-technical problem, but a major issue nonetheless, is that you've named your class ThreadSafeItem and yet it's not actually designed to be accessed safely from multiple threads. It's the caller's responsibility, in this implementation, to ensure that the item isn't accessed from multiple threads. If I see a class called ThreadSafeItem I'm going to assume it's safe to access it from multiple threads, and that I don't need to synchronize my access to it so long as each operation I perform is the only thing that needs to be logically atomic.
Your Interlocked solution is problematic in that you have two fields that you're modifying, that are conceptually tied together, but you don't synchronize their changes together, meaning someone can observe a modification to one and not the other, which is a problem for that code.
Next, your use of AddOrUpdate in both solutions isn't really appropriate. The whole point of the method call is to add an item or replace it with another item, not to mutate the provided item (that's why it takes a return value; you're supposed to produce a new item). If you want to go with the approach of getting a mutable item and mutating it, the way to go would be to call GetOrAdd to either get an existing item or create a new one, and then to mutate it in a thread safe manner using the returned value.
The whole solution is radically simplified by making ThreadSafeItem immutable. That lets you use AddOrUpdate on the ConcurrentDictionary for the update, so the only synchronization needed is the replacement of the value in the ConcurrentDictionary, which already synchronizes its own state. No synchronization is needed when accessing ThreadSafeItem, because immutable data is inherently safe to read from any thread. This means you never need to write any synchronization code yourself, which is exactly what you want to strive for whenever possible.
And finally, we have the actual code:
public class ThreadsafeService3
{
private readonly ConcurrentDictionary<string, ThreadSafeItem3> _storage =
new ConcurrentDictionary<string, ThreadSafeItem3>();
public void AddOrUpdate(string name)
{
_storage.AddOrUpdate(name, _ => new ThreadSafeItem3(), (_, oldValue) => oldValue.Increment());
}
public void Analyze()
{
foreach (var pair in _storage)
{
long ticks = pair.Value.ModifiedTicks;
//Note, the value may have been updated since we checked;
//you've said you don't care and it's okay for a newer item to be removed here if it loses the race.
if (isTooOld(ticks))
_storage.TryRemove(pair.Key, out _);
}
}
}
public class ThreadSafeItem3
{
public ThreadSafeItem3()
{
Counter = 0;
}
private ThreadSafeItem3(int counter)
{
Counter = counter;
}
public ThreadSafeItem3 Increment()
{
return new ThreadSafeItem3(Counter + 1);
}
public long ModifiedTicks { get; } = DateTime.Now.Ticks;
public int Counter { get; }
}
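To illustrate why the immutable approach needs no extra synchronization, here is a small sketch under stated assumptions: CounterItem and ImmutableUpdateDemo are hypothetical names, and the counter starts at 1 so that it equals the number of calls. ConcurrentDictionary.AddOrUpdate may invoke the update delegate more than once when it loses a race, which is harmless here because the delegate only builds a new object and has no side effects.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical immutable item: state is set once in the constructor.
public sealed class CounterItem
{
    public CounterItem(int counter, long modifiedTicks)
    {
        Counter = counter;
        ModifiedTicks = modifiedTicks;
    }

    public int Counter { get; }
    public long ModifiedTicks { get; }

    // Returns a new instance instead of mutating this one.
    public CounterItem Increment() => new CounterItem(Counter + 1, DateTime.UtcNow.Ticks);
}

public static class ImmutableUpdateDemo
{
    // Performs 'calls' concurrent updates on one key and returns the final counter.
    public static int CountAfterParallelUpdates(int calls)
    {
        var storage = new ConcurrentDictionary<string, CounterItem>();
        Parallel.For(0, calls, _ =>
            storage.AddOrUpdate(
                "key",
                k => new CounterItem(1, DateTime.UtcNow.Ticks), // first call adds
                (k, old) => old.Increment()));                   // later calls replace
        return storage["key"].Counter;
    }
}
```

Because each replacement only succeeds against the value it was computed from, no increment is lost even though no lock appears in the calling code.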
The solution proposed by Servy (using an immutable Item type) is probably the best solution for your scenario. I would also suggest switching from class to readonly struct for reducing the allocations, although the ConcurrentDictionary is probably going to wrap the struct in a reference-type Node internally, so you might not gain anything from this.
For the sake of completeness I will propose an alternative solution, which is to use the GetOrAdd instead of the AddOrUpdate, and lock on the Item whenever you are doing anything with it:
public class Item // Mutable and thread-unsafe
{
public long ModifiedTicks { get; private set; }
public int Counter { get; private set; }
public void Increment()
{
ModifiedTicks = DateTime.Now.Ticks;
Counter++;
}
}
public class Service
{
private readonly ConcurrentDictionary<string, Item> _storage = new();
public void AddOrUpdate(string name)
{
Item item = _storage.GetOrAdd(name, _ => new());
lock (item) item.Increment(); // Don't forget to lock!
}
public void Analyze()
{
foreach (var (key, item) in _storage.ToArray())
{
lock (item) // Don't forget to lock!
{
long ticks = item.ModifiedTicks;
}
}
}
}
This solution probably offers the best performance, but the burden of remembering to lock correctly everywhere should not be underestimated.
I can't comment on the specifics of what exactly you are doing, but Interlocked and ConcurrentDictionary are better than locks you write yourself.
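One way to reduce that burden, sketched here as an assumption rather than as part of the answer above, is to keep the lock inside the item so callers cannot forget it. SelfLockingItem and Snapshot are hypothetical names. Both fields are written and read under the same private lock, so a reader always sees a matching pair of values.

```csharp
using System;

// Hypothetical variant of Item that owns its lock.
public sealed class SelfLockingItem
{
    private readonly object _gate = new object();
    private long _modifiedTicks;
    private int _counter;

    // Updates both fields as one atomic step.
    public void Increment()
    {
        lock (_gate)
        {
            _modifiedTicks = DateTime.UtcNow.Ticks;
            _counter++;
        }
    }

    // Returns a consistent copy of both fields.
    public (long ModifiedTicks, int Counter) Snapshot()
    {
        lock (_gate) return (_modifiedTicks, _counter);
    }
}
```

A caller would then write something like `_storage.GetOrAdd(name, _ => new SelfLockingItem()).Increment();` with no lock statement of its own.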
I would question this approach though. Your data is important enough, but not so important to persist it? Depending on the usage of the application this approach will slow it down by some degree. Again, not knowing exactly what you are doing, you could throw each "Add" into an MSMQ, and then have an external exe run at some schedule to process the items. The website will just fire and forget, with no threading requirements.

A static property in static class when used concurrently

I have a static class 'Logger' with a public property called 'LogLevels' as in code below.
When the property is used concurrently in a multi-user or multi-threaded environment, could it cause problems?
Do I need to use thread synchronization for the code within the property 'LogLevels'?
public class Logger
{
private static List<LogLevel> _logLevels = null;
public static List<LogLevel> LogLevels
{
get
{
if (_logLevels == null)
{
_logLevels = new List<LogLevel>();
if (!string.IsNullOrWhiteSpace(System.Configuration.ConfigurationManager.AppSettings["LogLevels"]))
{
string[] lls = System.Configuration.ConfigurationManager.AppSettings["LogLevels"].Split(",".ToCharArray());
foreach (string ll in lls)
{
_logLevels.Add((LogLevel)System.Enum.Parse(typeof(LogLevel), ll));
}
}
}
if (_logLevels.Count == 0)
{
_logLevels.Add(LogLevel.Error);
}
return _logLevels;
}
}
}
UPDATE: I ended up using thread synchronization to solve concurrency problem in a static class, as in code below.
public class Logger
{
private static readonly System.Object _object = new System.Object();
private static List<LogLevel> _logLevels = null;
private static List<LogLevel> LogLevels
{
get
{
//Make sure that in a multi-threaded or multi-user scenario, we do not run into concurrency issues with this code.
lock (_object)
{
if (_logLevels == null)
{
_logLevels = new List<LogLevel>();
if (!string.IsNullOrWhiteSpace(System.Configuration.ConfigurationManager.AppSettings["SimpleDBLogLevelsLogger"]))
{
string[] lls = System.Configuration.ConfigurationManager.AppSettings["SimpleDBLogLevelsLogger"].Split(",".ToCharArray());
foreach (string ll in lls)
{
_logLevels.Add((LogLevel)System.Enum.Parse(typeof(LogLevel), ll));
}
}
}
if (_logLevels.Count == 0)
{
_logLevels.Add(LogLevel.Error);
}
}
return _logLevels;
}
}
}
When the property is used concurrently in a multi-user or multi-threaded environment, could it cause problems?
Absolutely. List<T> is not designed for multiple threads, except for the case where there are just multiple readers (no writers).
Do I need to use thread synchronization for the code within the property 'LogLevels'?
Well that's one approach. Or just initialize it on type initialization, and then return a read-only wrapper around it. (You really don't want multiple threads modifying it.)
Note that in general, doing significant amounts of work in a static constructor is a bad idea. Are you happy enough that if this fails, every access to this property will fail, forever?
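A sketch of the read-only wrapper idea follows. It swaps the static constructor for Lazy<T>, so the list is built on first access instead, and it reads the raw value from an environment variable only to stay self-contained; the original code reads it from ConfigurationManager.AppSettings. LogLevelSettings, Parse, the LOG_LEVELS variable name, and the members of the LogLevel enum are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum LogLevel { Debug, Info, Warning, Error } // hypothetical members

public static class LogLevelSettings
{
    // Built once, on first access; Lazy<T> is thread-safe by default.
    private static readonly Lazy<IReadOnlyList<LogLevel>> Levels =
        new Lazy<IReadOnlyList<LogLevel>>(
            () => Parse(Environment.GetEnvironmentVariable("LOG_LEVELS")));

    public static IReadOnlyList<LogLevel> LogLevels => Levels.Value;

    // Parses a comma-separated list; falls back to Error when nothing is configured.
    public static IReadOnlyList<LogLevel> Parse(string raw)
    {
        var levels = new List<LogLevel>();
        if (!string.IsNullOrWhiteSpace(raw))
        {
            foreach (string part in raw.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
                levels.Add((LogLevel)Enum.Parse(typeof(LogLevel), part.Trim(), true));
        }
        if (levels.Count == 0)
            levels.Add(LogLevel.Error);
        return levels.AsReadOnly(); // callers cannot modify the list
    }
}
```

Since no thread can write to the list after it is published, concurrent readers need no lock. The concern about failures still applies in a similar form: with the default mode, an exception thrown by the factory is cached and rethrown on every later access.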
This code has race conditions and cannot be safely executed from multiple threads. The primary problem is that the List<T> type is not thread-safe, yet this code freely writes to it. This means the writes can occur in parallel and hence break the implicit contract of List<T>.
The short answer is "yes" and "yes": you do need thread synchronization.
The other question is, why re-invent the wheel? You can use something like log4net or .NET logging framework.

Global thread-safe multi-value custom Dictionary with single instantiation

I would like to have a global object similar to a multi-value Dictionary that is shared among different Threads.
I would like the object to be created only once (for example getting the data from a Database) and then used by the different Threads.
The Object should be easily extendable with additional properties (currently have only JobName and URL).
If possible, I would prefer to avoid locking.
I am facing the following issues:
The current version displayed below is not Thread safe;
I cannot use a ConcurrentDictionary since I have extended the Dictionary object to allow multiple values for each key;
This is the object structure that should be modified easily:
public struct JobData
{
public string JobName;
public string URL;
}
I have extended the Dictionary object to allow multiple values for each key:
public class JobsDictionary : Dictionary<string, JobData>
{
public void Add(string key, string jobName, string url)
{
JobData data;
data.JobName = jobName;
data.URL = url;
this.Add(key, data);
}
}
Static class that is shared among Threads.
As you can see it creates a Dictionary entry for the specific Job the first time it is called for that Job.
For instance, the first time it is called for "earnings" it will create the "earnings" dictionary entry. This creates issues with Thread safety:
public static class GlobalVar
{
private static JobsDictionary jobsDictionary = new JobsDictionary();
public static JobData Job(string jobCat)
{
if (jobsDictionary.ContainsKey(jobCat))
return jobsDictionary[jobCat];
else
{
String jobName;
String url = null;
//TODO: get the Data from the Database
switch (jobCat)
{
case "earnings":
jobName="EarningsWhispers";
url = "http://www.earningswhispers.com/stocks.asp?symbol={0}";
break;
case "stock":
jobName="YahooStock";
url = "http://finance.yahoo.com/q?s={0}";
break;
case "functions":
jobName = "Functions";
url = null;
break;
default:
jobName = null;
url = null;
break;
}
jobsDictionary.Add(jobCat, jobName, url);
return jobsDictionary[jobCat];
}
}
}
In each Thread I get the specific Job property in this way:
//Get the Name
string JobName= GlobalVar.Job(jobName).JobName;
//Get the URL
string URL = string.Format((GlobalVar.Job(jobName).URL), sym);
How can I create a custom Dictionary that is "instantiated" once (I know it is not the right term since it is static...) and it is Thread-safe ?
Thanks
UPDATE
Ok, here is the new version.
I have simplified the code by removing the switch statement and loading all dictionary items at once (I need all of them anyway).
The advantage of this solution is that it is locked only once: when the dictionary data is added (the first Thread entering the lock will add data to the dictionary).
When the Threads access the dictionary for reading, it is not locked.
It should be Thread-Safe and it should not cause deadlocks since jobsDictionary is private.
public static class GlobalVar
{
private static JobsDictionary jobsDictionary = new JobsDictionary();
public static JobData Job(string jobCat)
{
JobData result;
if (jobsDictionary.TryGetValue(jobCat, out result))
return result;
//if the jobsDictionary is not initialized yet...
lock (jobsDictionary)
{
if (jobsDictionary.Count == 0)
{
//TODO: get the Data from the Database
jobsDictionary.Add("earnings", "EarningsWhispers", "http://www.earningswhispers.com/stocks.asp?symbol={0}");
jobsDictionary.Add("stock", "YahooStock", "http://finance.yahoo.com/q?s={0}");
jobsDictionary.Add("functions", "Functions", null);
}
return jobsDictionary[jobCat];
}
}
}
If you are populating the collection once, you don't need any locking at all, since a Dictionary is thread-safe when it is only read from. If you want to prevent multiple threads from initializing it multiple times, you can use a double-checked lock during initialization, like this:
static readonly object syncRoot = new object();
static Dictionary<string, JobData> cache;
static void Initialize()
{
if (cache == null)
{
lock (syncRoot)
{
if (cache == null)
{
cache = LoadFromDatabase();
}
}
}
}
Instead of allowing every thread to access the dictionary, hide it behind a facade that only exposes the operations you really need. This makes it much easier to reason about thread-safety. For instance:
public class JobDataCache : IJobData
{
readonly object syncRoot = new object();
Dictionary<string, JobData> cache;
public void AddJob(string key, JobData data)
{
lock (this.syncRoot)
{
cache[key] = data;
}
}
}
Trying to avoid locking without having measured that locking actually hurts performance too much is a bad idea; don't do it. Often a simple lock statement is much simpler than lock-free code. Concurrency bugs have a nasty property compared to normal software bugs: they are very hard to reproduce and very hard to track down. If you can, avoid writing concurrency bugs by writing the simplest code you can, even if it is slower. If it proves to be too slow, you can always optimize.
If you want to write lock-free code anyway, try using immutable data structures, or avoid changing existing data. This is one trick I used when writing the Simple Injector (a reusable library). In this framework, I never update the internal dictionary, but always completely replace it with a new one. The dictionary itself is therefore never changed; the reference to that instance is just replaced with a completely new dictionary. This lets you avoid locks completely. However, you must realize that it is possible to lose updates. In other words, when multiple threads are updating that dictionary, one can lose its changes, simply because each thread creates a new copy of that dictionary and adds its own value to its own copy, before making that reference public to other threads.
In other words, you can only use this method when external callers only read (and you can recover from lost changes, for instance by querying the database again).
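The replace-instead-of-update idea can be sketched as follows. This is not Simple Injector's code: CopyOnWriteCache, TryGet, and Set are hypothetical names, and the sketch goes one step beyond the plain replacement described above by retrying with Interlocked.CompareExchange, so a writer that loses a race starts over from the newer snapshot instead of silently dropping the other thread's change.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical copy-on-write cache: the published dictionary is never mutated.
public sealed class CopyOnWriteCache<TKey, TValue>
{
    private Dictionary<TKey, TValue> _snapshot = new Dictionary<TKey, TValue>();

    // Readers take no lock; they only ever see a fully built dictionary.
    public bool TryGet(TKey key, out TValue value)
    {
        return Volatile.Read(ref _snapshot).TryGetValue(key, out value);
    }

    // Writers copy, modify the copy, then try to publish it.
    public void Set(TKey key, TValue value)
    {
        while (true)
        {
            Dictionary<TKey, TValue> current = Volatile.Read(ref _snapshot);
            var copy = new Dictionary<TKey, TValue>(current) { [key] = value };

            // Publish only if nobody replaced the snapshot in the meantime.
            if (ReferenceEquals(Interlocked.CompareExchange(ref _snapshot, copy, current), current))
                return;
        }
    }
}
```

Every write copies the whole dictionary, so this shape only makes sense when reads vastly outnumber writes.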
UPDATE
Your updated version is still not thread-safe, for the reasons I explained on @ili's answer. The following will do the trick:
public static class GlobalVar
{
private static readonly object syncRoot = new object();
private static JobsDictionary jobsDictionary = null;
public static JobData Job(string jobCat)
{
Initialize();
return jobsDictionary[jobCat];
}
private static void Initialize()
{
// Double-checked lock.
if (jobsDictionary == null)
{
lock (syncRoot)
{
if (jobsDictionary == null)
{
jobsDictionary = CreateJobsDictionary();
}
}
}
}
private static JobsDictionary CreateJobsDictionary()
{
var jobs = new JobsDictionary();
//TODO: get the Data from the Database
jobs.Add("earnings", "EarningsWhispers", "http://...");
jobs.Add("stock", "YahooStock", "http://...");
jobs.Add("functions", "Functions", null);
return jobs;
}
}
You can also use the static constructor, which would save you from having to write the double-checked lock yourself. However, it is dangerous to call the database inside a static constructor, because a static constructor runs only once, and when it fails, the complete type will be unusable for as long as the AppDomain lives. In other words, your application must be restarted when this happens.
UPDATE 2:
You can also use .NET 4.0's Lazy<T>, which is safer than a double-checked lock, since it is easier to implement (and easier to implement correctly) and is also thread-safe on processor architectures with weak memory models (weaker than x86, such as ARM):
static Lazy<Dictionary<string, JobData>> cache =
new Lazy<Dictionary<string, JobData>>(() => LoadFromDatabase());
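A slightly fuller sketch of the Lazy<T> approach is shown below. JobCatalog, TryGetJob, and LoadCount are hypothetical names; JobData is redefined here as an immutable class for the example, and Load is a stand-in for the real database call. LoadCount exists only to make it observable that the factory runs once.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical immutable version of JobData.
public sealed class JobData
{
    public JobData(string jobName, string url) { JobName = jobName; Url = url; }
    public string JobName { get; }
    public string Url { get; }
}

public static class JobCatalog
{
    private static int _loadCount;

    // ExecutionAndPublication: exactly one thread runs the factory, others wait.
    private static readonly Lazy<IReadOnlyDictionary<string, JobData>> Cache =
        new Lazy<IReadOnlyDictionary<string, JobData>>(Load, LazyThreadSafetyMode.ExecutionAndPublication);

    // Number of times the factory has run (for demonstration only).
    public static int LoadCount => Volatile.Read(ref _loadCount);

    // Stand-in for loading from the database.
    private static IReadOnlyDictionary<string, JobData> Load()
    {
        Interlocked.Increment(ref _loadCount);
        return new Dictionary<string, JobData>
        {
            ["earnings"] = new JobData("EarningsWhispers", "http://www.earningswhispers.com/stocks.asp?symbol={0}"),
            ["stock"] = new JobData("YahooStock", "http://finance.yahoo.com/q?s={0}"),
            ["functions"] = new JobData("Functions", null),
        };
    }

    // Read-only lookup; safe from any thread because the data never changes.
    public static bool TryGetJob(string category, out JobData job)
    {
        return Cache.Value.TryGetValue(category, out job);
    }
}
```

Exposing the data as IReadOnlyDictionary keeps callers from adding entries after initialization, which is what makes lock-free reads safe here.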
1) Use the singleton pattern to have one instance (one of the ways is to use a static class, as you have done).
2) To make anything thread-safe you should use a lock or its equivalent. If you are afraid of unnecessary locks, do it like this:
public object GetValue(object key)
{
object result;
if(_dictionary.TryGetValue(key, out result))
return result;
lock(_dictionary)
{
if(_dictionary.TryGetValue(key, out result))
return result;
//some get data code
_dictionary[key]=result;
return result;
}
}

add items to a list from a different thread

I have a class as:
class SomeClass{
class Connection { /* some fields */ }
static List<Connection> connections { get; set; }
public SomeClass(/* params etc. */)
{
connections = new List<Connection>(); // initialize connections list
//initialize some other private vars
// ...
mainClassThreadMethod();
}
private void mainClassThreadMethod()
{
while (true)
{
Thread t;
Connection p = new Connection(/* instantiate the class */);
// this code will not execute until p is initialized... In other words this loop will not execute several times quickly.
t = new Thread(new ParameterizedThreadStart(startThread));
t.Start(p);
}
}
private void startThread(object o)
{
//add a new connection to the list
connections.Add((Connection)o);
}
public List<Connection> getConnections()
{
return connections;
}
}
Why is it that, after adding new connections to the list, calling the getConnections method returns a null list? I figure it is because I am adding the items from a different thread. How can I keep track of this?
You have several problems in the code above, but sticking to the question asked, to synchronize the list (allow adding from different threads) you can (1) implement your own locking, or (2) use http://msdn.microsoft.com/en-us/library/3azh197k.aspx.
I'd go for #2, but in your case:
No, it's probably not caused by adding anything from a different thread
Why would you even want to add from inside startThread? You have the connection object before you instantiate a thread, so you can easily call connections.Add(connection) from the same thread thereby eliminating the need for any locking.
Why is there a while(true) loop around the thread spin-up process?
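For option (1), implementing your own locking, a minimal sketch could look like the following. SynchronizedList and Snapshot are hypothetical names, not a framework type. Returning a copy from Snapshot, instead of the inner list as getConnections does, keeps callers from enumerating the list while another thread is adding to it.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: every access to the inner list takes the same lock.
public sealed class SynchronizedList<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _items = new List<T>();

    // Safe to call from any thread.
    public void Add(T item)
    {
        lock (_gate) _items.Add(item);
    }

    // Returns a point-in-time copy that the caller can enumerate freely.
    public T[] Snapshot()
    {
        lock (_gate) return _items.ToArray();
    }
}
```

Making the wrapper instance a readonly field initialized at declaration also avoids the separate problem of the list being reassigned in every constructor call.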

Problem with clearing a List<T>

I don't know why I have an IndexOutOfRangeException when I am clearing a System.Collections.Generic.List<T>. Does this make sense?
List<MyObject> listOfMyObject = new List<MyObject>();
listOfMyObject.Clear();
This typically happens if multiple threads are accessing the list simultaneously. If one thread deletes an element while another calls Clear(), this exception can occur.
The "answer" in this case is to synchronize this appropriately, locking around all of your List access.
Edit:
In order to handle this, the simplest method is to encapsulate your list within a custom class, and expose the methods you need, but lock as needed. You'll need to add locking to anything that alters the collection.
This would be a simple option:
public class MyClassCollection
{
// Private object for locking
private readonly object syncObject = new object();
private readonly List<MyObject> list = new List<MyObject>();
public MyObject this[int index]
{
get { return list[index]; }
set
{
lock(syncObject) {
list[index] = value;
}
}
}
public void Add(MyObject value)
{
lock(syncObject) {
list.Add(value);
}
}
public void Clear()
{
lock(syncObject) {
list.Clear();
}
}
// Do any other methods you need, such as remove, etc.
// Also, you can make this class implement IList<MyObject>
// or IEnumerable<MyObject>, but make sure to lock each
// of the methods appropriately, in particular, any method
// that can change the collection needs locking
}
Are you sure that that code throws an exception? I have
using System.Collections.Generic;
class MyObject { }
class Program {
static void Main(string[] args) {
List<MyObject> listOfMyObject = new List<MyObject>();
listOfMyObject.Clear();
}
}
and I do not get an exception.
Is your real-life example more complex? Perhaps you have multiple threads simultaneously accessing the list? Can we see a stack trace?
List<T>.Clear is really quite simple. Using Reflector:
public void Clear() {
if (this._size > 0) {
Array.Clear(this._items, 0, this._size);
this._size = 0;
}
this._version++;
}
When the list is already empty, that is never going to throw an exception. However, if you are modifying the list on another thread, Array.Clear could throw an IndexOutOfRangeException. If another thread removes an item from the list, then this._size (the number of items to clear) will be too big.
The documentation doesn't mention any exception that this method throws, so your problem is probably elsewhere.
List<T>.Clear
