I have a number of static lists in my application that store data from my database and are used when looking up information:
public static IList<string> Names;
I also have some methods to refresh this data from the database:
public static void GetNames()
{
    SQLEngine sql = new SQLEngine(ConnectionString);
    lock (Names)
    {
        Names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
I initially didn't have the lock() in place; however, I noticed that very occasionally the requesting thread couldn't find the information in the list. My assumption is that if the requesting thread tries to access the Names list, it can't until the list has been fully updated.
Is this the correct methodology and usage of the lock() statement?
As a side note, I noticed on MSDN that one shouldn't use lock() on public variables. Could someone please elaborate on my particular scenario?
lock is only useful if all places intended to be synchronized also apply the lock, so every time you access Names you would be required to lock. At the moment, the lock only stops two threads swapping Names at the same time - which frankly isn't a problem here, since reference swaps are atomic anyway.
Another problem: presumably Names starts off null? You can't lock on null. Equally, you shouldn't lock on something whose reference may change. If you want to synchronize, a common approach is something like:
// do not use for your scenario - see below
private static readonly object lockObj = new object();
then lock on lockObj instead of your data.
With regard to not locking things that are visible externally: yes. Some other code could arbitrarily choose to lock on it, which could cause unexpected blocking and quite possibly deadlocks.
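To illustrate the risk, here is a hypothetical fragment (not from the question; MyApp stands in for whatever class exposes the field):

// Hypothetical external code: because Names is publicly visible, nothing
// stops unrelated code from locking on it, which blocks GetNames() for as
// long as this lock is held.
lock (MyApp.Names)
{
    Thread.Sleep(Timeout.Infinite); // GetNames() can now never acquire the lock
}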
The other big risk is that some of your code obtains the names, and then does a sort/add/remove/clear/etc - anything that mutates the data. Personally, I would be using a read-only list here. In fact, with a read-only list, all you have is a reference swap; since that is atomic, you don't need any locking:
public static IList<string> Names { get; private set; }

public static void UpdateNames() {
    List<string> tmp = SomeSqlQuery();
    Names = tmp.AsReadOnly();
}
And finally: public fields are very rarely a good idea - hence the property above. The getter will be inlined by the JIT, so it carries no penalty.
No, it's not correct, since anyone can use Names directly.
public class SomeClass
{
    private List<string> _names;
    private readonly object _namesLock = new object();

    public IEnumerable<string> Names
    {
        get
        {
            if (_names == null)
            {
                lock (_namesLock)
                {
                    if (_names == null)
                        GetNames();
                }
            }
            return _names;
        }
    }

    public void UpdateNames()
    {
        lock (_namesLock)
            GetNames();
    }

    private void GetNames()
    {
        SQLEngine sql = new SQLEngine(ConnectionString);
        _names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
Try to avoid static methods. At least use a singleton.
The check-lock-check pattern is faster than lock-check, since the write only occurs once; after the field is initialized, readers skip the lock entirely.
Initializing a value the first time it is used is called lazy loading.
The _namesLock is required since you can't lock on null.
From the code you have shown, the first time GetNames() is called the Names property is null. Locking on a null reference throws an ArgumentNullException, so I would add a variable to lock on.
static object namesLock = new object();
Then in GetNames()
lock (namesLock)
{
    if (Names == null)
        Names = ...;
}
We do the if test inside of the lock() to stop race conditions. I'm assuming that the caller of GetNames() also does the same test.
Assuming the following case:
public Hashtable map = new Hashtable();

public void Cache(String fileName) {
    if (!map.ContainsKey(fileName))
    {
        map.Add(fileName, new Object());
        _Cache(fileName);
    }
}

private void _Cache(String fileName) {
    lock (map[fileName])
    {
        if (fileAlreadyCached)   // pseudocode in the original
            return;
        else {
            // cache the file (pseudocode in the original)
        }
    }
}
With the following consumers:
Task.Run(() => {
    Cache("A");
});

Task.Run(() => {
    Cache("A");
});
Would it be possible in any way for the Cache method to throw a duplicate-key exception, meaning that both tasks hit map.Add and try to add the same key?
Edit:
Would using the following data structure solve this concurrency problem?
public class HashMap<Key, Value>
{
    private HashSet<Key> Keys = new HashSet<Key>();
    private List<Value> Values = new List<Value>();

    public int Count => Keys.Count;

    public Boolean Add(Key key, Value value) {
        int oldCount = Keys.Count;
        Keys.Add(key);
        if (oldCount != Keys.Count) {
            Values.Add(value);
            return true;
        }
        return false;
    }
}
Yes, of course it would be possible. Consider the following fragment:
if (!map.ContainsKey(fileName))
{
    map.Add(fileName, new Object());
Thread 1 may execute if (!map.ContainsKey(fileName)) and find that the map does not contain the key, so it will proceed to add it. But before it gets the chance to, Thread 2 may also execute if (!map.ContainsKey(fileName)); it too will find that the map does not contain the key, so it will also proceed to add it. Of course, that will fail.
EDIT (after clarifications)
So, the problem seems to be how to keep the main map locked as briefly as possible, and how to prevent cached objects from being initialized twice.
This is a complex problem, so I cannot give you a ready-to-run answer (especially since I do not currently have a C# development environment handy), but generally speaking, I think you should proceed as follows (see the sketch after these steps):
Fully guard your map with lock().
Keep your map locked as little as possible; when an object is not found to be in the map, add an empty object to the map and exit the lock immediately. This will ensure that this map will not become a point of contention for all requests coming in to the web server.
After the check-if-present-and-add-if-not fragment, you are holding an object which is guaranteed to be in the map. However, this object may or may not be initialized at this point. That's fine; we will take care of that next.
Repeat the lock-and-check idiom, this time with the cached object: every single incoming request interested in that specific object will need to lock it, check whether it is initialized, and if not, initialize it. Of course, only the first request will suffer the penalty of initialization. Also, any requests that arrive before the object has been fully initialized will have to wait on their lock until the object is initialized. But that's all very fine, that's exactly what you want.
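A minimal sketch of those steps, assuming a hypothetical CachedFile placeholder type and LoadFrom method for the expensive caching work (neither is from the original post):

// Hypothetical placeholder for whatever "cache file" involves.
class CachedFile
{
    public bool Initialized;                  // guarded by lock on the entry
    public void LoadFrom(string fileName) { /* expensive work here */ }
}

private readonly Dictionary<string, CachedFile> map = new Dictionary<string, CachedFile>();
private readonly object mapLock = new object();

public void Cache(string fileName)
{
    CachedFile entry;
    lock (mapLock)                            // guard the map itself...
    {
        if (!map.TryGetValue(fileName, out entry))
        {
            entry = new CachedFile();         // empty, uninitialized object
            map.Add(fileName, entry);
        }
    }                                         // ...and release it immediately

    lock (entry)                              // lock-and-check on the cached object
    {
        if (!entry.Initialized)
        {
            entry.LoadFrom(fileName);         // only the first request pays this cost
            entry.Initialized = true;
        }
    }
}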
There are a great number of articles available regarding thread safe caching, here's an example:
private static object _lock = new object();

public void CacheData()
{
    SPListItemCollection oListItems;

    oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
    if (oListItems == null)
    {
        lock (_lock)
        {
            // Ensure that the data was not loaded by a concurrent thread
            // while waiting for lock.
            oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
            if (oListItems == null)
            {
                oListItems = DoQueryToReturnItems();
                Cache.Add("ListItemCacheName", oListItems, ..);
            }
        }
    }
}
However, this example depends on the request for the cache also rebuilding the cache.
I'm looking for a solution where the request and rebuild are separate. Here's the scenario.
I have a web service that I want to monitor for certain types of error. If an error occurs, I create a monitor object and cache it - it is updatable and is locked accordingly during updates. All's well so far.
Elsewhere, I check for the existence of the cached object, and the data it contains. This would work straight out of the box except for one particular scenario.
If the cache object is being updated - say a status change - I would like to wait and get the latest info rather than the current info, which, if returned, would be out of date. So for my fetch code, I need to check whether the object is currently being created/updated, and if so, wait and then retry.
As I pointed out, there are many examples of cache-locking patterns, but I can't seem to find one for this scenario. Any ideas as to how to go about this would be appreciated.
You can try the following code, which uses two locks. The write lock in the setter is quite simple and protects the cache from being written by more than one thread. The getter uses a simple double-check lock.
Now, the trick is in the Refresh() method, which uses the same lock as the getter. The method takes the lock and, as its first step, removes the list from the cache. This causes any getter to fail the first null check and wait on the lock. In the meantime, Refresh() fetches the items, sets the cache again, and releases the lock.
When control comes back to a getter, it reads the cache again, and now it contains the list.
public class CacheData
{
    private static object _readLock = new object();
    private static object _writeLock = new object();

    public SPListItemCollection ListItem
    {
        get
        {
            var oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
            if (oListItems == null)
            {
                lock (_readLock)
                {
                    oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
                    if (oListItems == null)
                    {
                        oListItems = DoQueryToReturnItems();
                        Cache.Add("ListItemCacheName", oListItems, ..);
                    }
                }
            }
            return oListItems;
        }
        set
        {
            lock (_writeLock)
            {
                Cache.Add("ListItemCacheName", value, ..);
            }
        }
    }

    public void Refresh()
    {
        lock (_readLock)
        {
            Cache.Remove("ListItemCacheName");
            var oListItems = DoQueryToReturnItems();
            ListItem = oListItems;
        }
    }
}
You can make the method and property static if you do not need a CacheData instance.
So, in order to have a separate context for each thread that the program is running, I set up a context-to-thread mapping class as follows:
public class ContextMap : IContextMap
{
    private static IContextMap _contextMap;
    private Dictionary<int, IArbContext2> ContextDict;
    private static string DbName;

    private ContextMap()
    {
        if (string.IsNullOrWhiteSpace(DbName))
            throw new InvalidOperationException("Setup must be called before accessing ContextMap");
        ContextDict = new Dictionary<int, IArbContext2>();
    }

    protected internal static void Setup(IContextMap map)
    {
        _contextMap = map;
    }

    public static void Setup(string dbName)
    {
        DbName = dbName;
    }

    public static IContextMap GetInstance()
    {
        return _contextMap ?? (_contextMap = new ContextMap());
    }

    public IArbContext2 GetOrCreateContext()
    {
        var threadId = Thread.CurrentThread.ManagedThreadId;
        if (!ContextDict.ContainsKey(threadId))
            ContextDict.Add(threadId, new ArbContext(DbName));
        return ContextDict[threadId];
    }

    public void DestroyContext()
    {
        if (ContextDict.ContainsKey(Thread.CurrentThread.ManagedThreadId))
            ContextDict.Remove(Thread.CurrentThread.ManagedThreadId);
    }
}
Somehow the code is (very rarely, but still) throwing a KeyNotFoundException in the GetOrCreateContext method. Is it possible for a thread to be sidetracked into a separate action (e.g. the overseeing thread forces it to perform another action that calls DestroyContext after it has checked that the dictionary has the key but before it returns it) and then resume where it left off? I never explicitly do this, but I can't see any other way this error could be thrown.
Thank You.
The problem here is that Dictionary is not thread-safe. There can be unexpected behaviour when multiple threads try to access it, even if they are all using unique keys, because creating or removing a key/value pair is not an atomic action.
The easiest fix would be to use a ConcurrentDictionary in its place for ContextDict, as sketched below.
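A minimal sketch of that change, reusing the names from the question (GetOrAdd and TryRemove replace the unguarded check-then-act sequences):

private readonly ConcurrentDictionary<int, IArbContext2> ContextDict =
    new ConcurrentDictionary<int, IArbContext2>();

public IArbContext2 GetOrCreateContext()
{
    var threadId = Thread.CurrentThread.ManagedThreadId;
    // Atomic check-and-create: no window between the lookup and the return
    // in which another thread (or DestroyContext) can interleave.
    return ContextDict.GetOrAdd(threadId, id => new ArbContext(DbName));
}

public void DestroyContext()
{
    IArbContext2 removed;
    ContextDict.TryRemove(Thread.CurrentThread.ManagedThreadId, out removed);
}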
Answering your literal question, NOT attempting to solve your problem. (@BenAaronsom has already done that.)
No: You have a "race condition" when the result of some computation depends on the order in which two or more threads access the same variable. If there is only one thread running in the race, then no matter how many times you run it, the same thread will always win. If a single-threaded program gives a non-deterministic answer, then whatever the problem is, it's not a race condition.
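As a concrete illustration of such a race (hypothetical code, not from the question):

int counter = 0;

void Increment()
{
    for (int i = 0; i < 100000; i++)
        counter++;   // read-modify-write: not atomic, so two threads can
                     // both read the same old value and one update is lost
}

// Two threads running Increment() concurrently rarely finish at 200000 -
// the result depends on scheduling. A single thread always finishes at 100000.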
I have the following code to cache instances of some class in a ConcurrentDictionary, which I use in a multi-threaded application.
Simply put, when I instantiate the class with the id parameter, it first checks whether an instance of PrivateClass with the given id exists in the dictionary; if not, it creates an instance of PrivateClass (which takes a long time, sometimes a couple of seconds) and adds it to the dictionary for future use.
public class SomeClass
{
    private static readonly ConcurrentDictionary<int, PrivateClass> SomeClasses =
        new ConcurrentDictionary<int, PrivateClass>();

    private readonly PrivateClass _privateClass;

    public SomeClass(int cachedInstanceId)
    {
        if (!SomeClasses.TryGetValue(cachedInstanceId, out _privateClass))
        {
            _privateClass = new PrivateClass(); // This takes a long time
            SomeClasses.TryAdd(cachedInstanceId, _privateClass);
        }
    }

    public int SomeCalculationResult()
    {
        return _privateClass.CalculateSomething();
    }

    private class PrivateClass
    {
        internal PrivateClass()
        {
            // this takes a long time
        }

        internal int CalculateSomething()
        {
            // Calculates and returns something
        }
    }
}
My question is: do I need to add a lock around the generation and assignment part of the outer class's constructor to make this code thread-safe, or is it good as it is?
Update:
After SLaks's suggestion, I tried to use the GetOrAdd() method of ConcurrentDictionary in combination with Lazy, but unfortunately the constructor of PrivateClass was still called more than once. See https://gist.github.com/3500955 for the test code.
Update 2:
You can see the final solution here:
https://gist.github.com/3501446
You're misusing ConcurrentDictionary.
In multi-threaded code, you should never check for the presence of an item, then add it if it's not there.
If two threads run that code at once, they will both end up adding it.
In general, there are two solutions to this kind of problem. You can wrap all of that code in a lock, or you can redesign it to do the whole thing in one atomic operation.
ConcurrentDictionary is designed for this kind of scenario.
You should simply call
_privateClass = SomeClasses.GetOrAdd(cachedInstanceId, key => new PrivateClass());
Locking is not necessary, but what you're doing is not thread-safe. Instead of first checking the dictionary for presence of an item and then adding it if necessary, you should use ConcurrentDictionary.GetOrAdd() to do it all in one atomic operation.
Otherwise, you're exposing yourself to the same problem that you'd have with a regular dictionary: another thread might add an entry to SomeClasses after you check for existence but before you insert.
Your sample code at https://gist.github.com/3500955 using ConcurrentDictionary and Lazy<T> is incorrect - you're writing:
private static readonly ConcurrentDictionary<int, PrivateClass> SomeClasses =
    new ConcurrentDictionary<int, PrivateClass>();

public SomeClass(int cachedInstanceId)
{
    _privateClass = SomeClasses.GetOrAdd(cachedInstanceId,
        (key) => new Lazy<PrivateClass>(() => new PrivateClass(key)).Value);
}
...which should have been:
private static readonly ConcurrentDictionary<int, Lazy<PrivateClass>> SomeClasses =
    new ConcurrentDictionary<int, Lazy<PrivateClass>>();

public SomeClass(int cachedInstanceId)
{
    _privateClass = SomeClasses.GetOrAdd(cachedInstanceId,
        (key) => new Lazy<PrivateClass>(() => new PrivateClass(key))).Value;
}
You need to use ConcurrentDictionary<TKey, Lazy<TVal>>, and not ConcurrentDictionary<TKey, TVal>.
The point is that you only access the Value of the Lazy after the correct Lazy object has been returned from GetOrAdd() - taking the Value of the Lazy object inside the GetOrAdd factory defeats the whole purpose of using it.
Edit: Ah - you got it in https://gist.github.com/mennankara/3501446 :)
I'd like to use .NET's Lazy<T> class to implement thread safe caching. Suppose we had the following setup:
class Foo
{
    Lazy<string> cachedAttribute;

    Foo()
    {
        invalidateCache();
    }

    string initCache()
    {
        string returnVal = "";
        // CALCULATE RETURNVAL HERE
        return returnVal;
    }

    public String CachedAttr
    {
        get
        {
            return cachedAttribute.Value;
        }
    }

    void invalidateCache()
    {
        cachedAttribute = new Lazy<string>(initCache, true);
    }
}
My questions are:
Would this work at all?
How would the locking have to work?
I feel like I'm missing a lock somewhere near the invalidateCache, but for the life of me I can't figure out what it is.
I'm sure there's a problem with this somewhere, I just haven't figured out where.
[EDIT]
OK, well, it looks like I was right - there were things I hadn't thought about. A thread seeing an outdated cache would be a very bad thing, so it looks like "Lazy" is not safe enough. The property is accessed a lot, though, so I was engaging in premature optimization in hopes that I could learn something and have a pattern to use in the future for thread-safe caching. I'll keep working on it.
P.S.: I decided to make the object thread-unsafe and instead have access to the object be carefully controlled.
Well, it's not thread-safe in the sense that one thread could still see the old value after another thread sees the new value following invalidation - because the first thread might not have seen the change to cachedAttribute. In theory, that situation could persist forever, although it's pretty unlikely :)
Using Lazy<T> as a cache of unchanging values seems like a better idea to me - more in line with how it was intended - but if you can cope with the possibility of using an old "invalidated" value for an arbitrarily long period in another thread, I think this would be okay.
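For comparison, a minimal sketch of that "unchanging values" use (illustrative names, not from the question):

class Foo
{
    // Built once, on first access, thread-safely - and never replaced,
    // so there is no invalidation and no stale-reference problem.
    private static readonly Lazy<string> cachedAttribute =
        new Lazy<string>(ComputeValue, isThreadSafe: true);

    public static string CachedAttr
    {
        get { return cachedAttribute.Value; }
    }

    private static string ComputeValue()
    {
        return "...";   // expensive calculation here
    }
}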
cachedAttribute is a shared resource that needs to be protected from concurrent modification.
Protect it with a lock:
private readonly object gate = new object();
public string CachedAttr
{
get
{
Lazy<string> lazy;
lock (gate) // 1. Lock
{
lazy = this.cachedAttribute; // 2. Get current Lazy<string>
} // 3. Unlock
return lazy.Value // 4. Get value of Lazy<string>
// outside lock
}
}
void InvalidateCache()
{
lock (gate) // 1. Lock
{ // 2. Assign new Lazy<string>
cachedAttribute = new Lazy<string>(initCache, true);
} // 3. Unlock
}
or use Interlocked.Exchange:
void InvalidateCache()
{
    Interlocked.Exchange(ref cachedAttribute, new Lazy<string>(initCache, true));
}
volatile might work as well in this scenario, but it makes my head hurt.
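For completeness, a sketch of that volatile variant (a volatile read always observes the most recent volatile write, so readers cannot keep seeing a stale reference indefinitely):

class Foo
{
    // volatile guarantees readers see the most recently assigned Lazy<string>,
    // so no lock is needed around the reference swap itself.
    private volatile Lazy<string> cachedAttribute;

    public Foo()
    {
        invalidateCache();
    }

    public string CachedAttr
    {
        get { return cachedAttribute.Value; }
    }

    void invalidateCache()
    {
        cachedAttribute = new Lazy<string>(initCache, true);
    }

    string initCache()
    {
        return "";   // calculate the value here
    }
}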