I've got a scenario where I need to temporarily cache information from a web API when it is first called. The API can be called a few times a second with the same parameters.
Due to performance restrictions I don't want each call fetching the data and putting it into the memory cache, so I've implemented a system with semaphores to try to allow one thread to initialize the cache and then let the rest just query that cache.
I've stripped down the code to show an example of what I'm currently doing.
private static MemoryCacher memCacher = new MemoryCacher();
private static ConcurrentDictionary<string, Semaphore> dictionary = new ConcurrentDictionary<string, Semaphore>();
private async Task<int[]> DoAThing(string requestHash)
{
// check for an existing cached result before hitting the dictionary
var cacheValue = memCacher.GetValue(requestHash);
if (cacheValue != null)
{
return ((CachedResult)cacheValue).CheeseBurgers;
}
Semaphore semi;
semi = dictionary.GetOrAdd(requestHash, new Semaphore(1, 1, requestHash));
semi.WaitOne();
//It's possible a previous thread has now filled up the cache. Have a squiz.
cacheValue = memCacher.GetValue(requestHash);
if (cacheValue != null)
{
dictionary.TryRemove(requestHash, out _);
semi.Release();
return ((CachedResult)cacheValue).CheeseBurgers;
}
// fetch the latest data from the relevant web api
var response = await httpClient.PostAsync(url, content);
// add the result to the cache
memCacher.Add(requestHash, new CachedResult() { CheeseBurgers = response.returnArray }, DateTime.Now.AddSeconds(30));
// We have added everything to the cacher so we don't need this semaphore in the dictionary anymore:
dictionary.TryRemove(requestHash, out _);
//Open the floodgates
semi.Release();
return response.returnArray;
}
Unfortunately I'm seeing weird issues where more than one thread at a time manages to get through the WaitOne() call, and a later Release() then fails because of the count restriction on the semaphore (which is there to make sure only one thread is working at a time).
I've tried using Mutexes and Monitors, but since IIS doesn't guarantee that an API call will always run on the same thread this causes it to fail regularly when the mutex is attempted to be released in a different thread.
Any suggestions on other ways to implement this would be welcome as well!
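For what it's worth, one commonly suggested alternative is SemaphoreSlim, which is not tied to the thread that acquired it and offers an awaitable WaitAsync(). Below is a minimal sketch under some assumptions: SingleFlightCache, GetAsync and the fetch delegate are made-up names, and a plain ConcurrentDictionary stands in for MemoryCacher, so there is no 30-second expiry here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class SingleFlightCache
{
    // One gate per request hash. SemaphoreSlim has no thread affinity, so it
    // can be released on a different thread than the one that acquired it.
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    // Hypothetical stand-in for the question's MemoryCacher (no expiry).
    private static readonly ConcurrentDictionary<string, int[]> Cache =
        new ConcurrentDictionary<string, int[]>();

    public static async Task<int[]> GetAsync(string requestHash, Func<Task<int[]>> fetch)
    {
        // Fast path: the value is already cached.
        if (Cache.TryGetValue(requestHash, out var cached))
            return cached;

        var gate = Gates.GetOrAdd(requestHash, _ => new SemaphoreSlim(1, 1));

        // Wait without blocking a thread.
        await gate.WaitAsync();
        try
        {
            // Another caller may have filled the cache while we waited.
            if (Cache.TryGetValue(requestHash, out cached))
                return cached;

            var fresh = await fetch();
            Cache[requestHash] = fresh;
            return fresh;
        }
        finally
        {
            // Always release, even if fetch() throws.
            gate.Release();
        }
    }
}
```

The gates are deliberately left in the dictionary in this sketch: removing one while other callers are still waiting on it would let a later caller create a second gate for the same key. If the set of keys is unbounded, some separate cleanup would be needed.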
Related
I'm trying to implement the following scenario and am unable to come up with a solution.
In my web service I have a cache object (containing static data) keyed by session ID. When a request is received, the service checks whether the cache contains a key for that session ID.
If not, it loads the data from the DB and stores it in the cache.
If so, it uses the cached data and continues with further processing.
Now, with multithreading enabled in this service, when multiple requests with the same session ID are sent, all of them try to load the data into the cache because none of them finds the key initially.
The question is: I want to block all the other threads until the first thread has loaded the static data into the cache. Once the first thread is done, the other threads should use the cache instead of trying to load the data again.
It looks trivial, but somehow I can't think of any multithreading feature that solves this.
My code looks something like this:
somemethod()
{
if (cache.Contains(someKey))
{
// use cache and do further processing
}
else
{
cache.add(someKey)
}
}
You can try the following logic:
1) Thread1 arrives and finds that the object is not present in the cache.
2) It puts a "wait" command object in the cache for this session ID. This object tells any other threads to wait until further notice.
3) Thread1 fetches the data from the DB and puts it into the cache.
4) Thread1 notifies the other threads that they can proceed, since the data is now available.
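The steps above map fairly closely onto Lazy<T> stored in a ConcurrentDictionary: the Lazy<T> plays the role of the "wait" object, and reading its Value blocks until the first thread has finished loading. This is only a sketch under assumptions; SessionCache, GetOrLoad and loadFromDb are hypothetical names, not anything from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class SessionCache<TValue>
{
    // Each entry is a Lazy<TValue>: a placeholder that loads on first access.
    private readonly ConcurrentDictionary<string, Lazy<TValue>> _entries =
        new ConcurrentDictionary<string, Lazy<TValue>>();

    public TValue GetOrLoad(string sessionId, Func<string, TValue> loadFromDb)
    {
        // GetOrAdd may construct more than one Lazy under contention, but only
        // one is stored, and only the stored one ever has its factory invoked.
        var entry = _entries.GetOrAdd(
            sessionId,
            id => new Lazy<TValue>(
                () => loadFromDb(id),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // The first caller runs the loader; concurrent callers block here
        // until it finishes and then receive the same value.
        return entry.Value;
    }
}
```

One caveat with this mode: if the loader throws, the Lazy<T> caches the exception and rethrows it on every later access, so a failed entry would have to be removed from the dictionary before a retry can succeed.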
The classic remedy against a race condition is mutual exclusion. Locking is the simplest way to provide it.
public class Cache
{
private object _locker = new object();
private SessionDataCollection _cache;
public SessionData Get(SessionId id)
{
lock (_locker)
{
if (!Contains(id))
Fetch(id);
return Retrieve(id);
}
}
private bool Contains(SessionId id)
{
//check if present in _cache
}
private void Fetch(SessionId id)
{
//get from db and store in _cache
}
private SessionData Retrieve(SessionId id)
{
//retrieve from _cache
}
}
We have a legacy ASP.NET 2.0 environment where each page execution is authenticated to a specific user, and therefore I have an integer representing the logged-in user's ID.
On one of the pages I need to run some code where I want to prevent the user from performing a duplicate action. I'm finding it difficult to guarantee this can't happen, even though we're doing basic dupe-prevention checking.
Obviously I could create a static object and do a lock(myObject) { ... } around the entire piece of code to try and help prevent some of these race conditions. But I don't want to create a bottleneck for everyone ... just want to stop the same logged-in user from running the code simultaneously or nearly simultaneously.
So I am thinking of creating an object instance for each user and storing it in a cache keyed by their user ID. Then I look up that object, and if it is found, I lock on it. If it is not found, I first create and cache it, then lock on it.
Does this make sense? Is there a better way to accomplish what I'm after?
Something like this is what I'm thinking:
public class MyClass
{
private static object lockObject = new object(); // global locking object
public void Page_Load()
{
string cachekey = "preventdupes:" + UserId.ToString();
object userSpecificLock = null;
// This part would synchronize among all requests, but should be quick
// as it is simply trying to find out if a user-specific lock object
// exists, and if so, it gets it. Otherwise, it creates and stores it.
lock (lockObject)
{
userSpecificLock = HttpRuntime.Cache.Get(cachekey);
if (userSpecificLock == null)
{
userSpecificLock = new object();
// Cache the locking object on a sliding 30 minute window
HttpRuntime.Cache.Add(cachekey, userSpecificLock, null,
System.Web.Caching.Cache.NoAbsoluteExpiration,
new TimeSpan(0, 30, 0),
System.Web.Caching.CacheItemPriority.AboveNormal, null);
}
}
// Now we have obtained an instance of an object specific to the user,
// and we'll lock the next block of code specifically to them.
lock (userSpecificLock)
{
try
{
// Perform some operations to check our database to see if the
// transaction already occurred for this user, and if not,
// perform the transaction, and then record it into our db.
}
catch (Exception)
{
// Rollback anything our code has done up until this exception,
// so that if the user tries again, it will work.
}
}
}
}
The solution is to use a mutex.
A mutex can be named, so you can name it after your user ID. Named mutexes work across the whole machine, so they also work if you have many processes under the same pool (web garden).
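As a rough illustration of the named-mutex idea, here is a minimal sketch. PerUserGate and TryRun are hypothetical names, the "preventdupes-" prefix is just borrowed from the question's cache key, and the code assumes a compiler and framework newer than the question's ASP.NET 2.0 (it uses var, lambdas and the non-generic Action delegate).

```csharp
using System;
using System.Threading;

public static class PerUserGate
{
    // Runs `action` while holding a machine-wide mutex named after the user.
    // Returns false if the mutex could not be acquired within the timeout.
    public static bool TryRun(int userId, TimeSpan timeout, Action action)
    {
        // Opens the named mutex if it already exists, otherwise creates it.
        using (var mutex = new Mutex(false, "preventdupes-" + userId))
        {
            bool acquired;
            try
            {
                acquired = mutex.WaitOne(timeout);
            }
            catch (AbandonedMutexException)
            {
                // A previous owner exited without releasing. We own the mutex
                // now, but the protected state may need to be re-checked.
                acquired = true;
            }

            if (!acquired)
                return false;

            try
            {
                action();
                return true;
            }
            finally
            {
                // A Mutex must be released by the thread that acquired it,
                // which is why `action` is synchronous here.
                mutex.ReleaseMutex();
            }
        }
    }
}
```

A call might look like PerUserGate.TryRun(UserId, TimeSpan.FromSeconds(5), () => { /* check db, perform transaction */ }), treating a false return as "another request for this user is still running".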
More to read:
http://en.wikipedia.org/wiki/Mutual_exclusion
Asp.Net. Synchronization access(mutex)
http://www.dotnetperls.com/mutex
MSDN Mutex with example
Some points
The lock: lock works only between threads inside the same process, so you can use it only to synchronize access to in-process state such as static variables. HttpRuntime.Cache is also static, per-process memory, which means that if you have many processes under the same pool (web garden), you have many different Cache instances.
The page is also automatically synchronized by the session. So if you have disabled the session for this page, the mutex has a point. If not, the session already locks Page_Load, and the mutex that you are going to add has no effect.
Some reference about:
ASP.NET Server does not process pages asynchronously
Is Session variable thread-safe within a Parallel.For loop in ASP.Net page
HttpContext.Current is null when in method called from PageAsyncTask
I am getting an XML feed and I pass it to my MQ server. Then I have a service that listens to the MQ server and reads all its messages.
I have a foreach loop that starts a new thread on each iteration in order to make the parsing faster, because there are around 500 messages in the MQ (meaning there are 500 XMLs).
foreach (System.Messaging.Message m in msgs)
{
byte[] bytes = new byte[m.BodyStream.Length];
m.BodyStream.Read(bytes, 0, (int)m.BodyStream.Length);
System.Text.ASCIIEncoding ascii = new System.Text.ASCIIEncoding();
ParserClass tst = new ParserClass(ascii.GetString(bytes, 0, (int)m.BodyStream.Length));
new Thread( new ThreadStart(tst.ProcessXML)).Start();
}
In the ParserClass I have this code:
private static object thLockMe = new object();
public string xmlString { get; set; }
public ParserClass(string xmlStringObj)
{
this.xmlString = xmlStringObj;
}
public void ProcessXML()
{
lock (thLockMe)
{
XDocument reader = XDocument.Parse(xmlString);
//Some more code...
}
}
The problem is that when I run this foreach loop with only 1 thread, it works perfectly, but slowly.
When I run it with more than 1 thread, I get the error "Object reference not set to an instance of an object".
I guess there is something wrong with my locking since I am not very experienced with threading.
I am kinda hopeless, hope you can help!
Cheers!
I note that you are running a bunch of threads with their entire code wrapped inside a lock statement. You might as well run the methods in a sequence this way, because you are not getting any parallelism.
Since you are creating a new ParserClass instance on every iteration of your loop, and also creating and starting a new thread on every iteration, you do not need a lock in your ProcessXML method.
The object you lock on is currently static, so it is not bound to an instance. That means that once one thread is inside your ProcessXML method, no other thread can do anything until the first has finished.
You are not sharing any data (from the code I can see) in your ParserClass among threads, so you don't need a lock inside your ProcessXML method.
If you are using data that is shared between threads, then you should have a lock.
If you're going to be using lots of threads, then you're better off using a ThreadPool: take a finite number of threads (4 perhaps) from the pool, assign them some work, and recycle them for the next 4 tasks.
Creating a thread is an expensive operation that requires a call into the OS kernel, so you do not want to do that 500 times. Also, the default reserved stack size for a thread on Windows is typically 1 MB, so that is around 500 MB of reserved stack space for your threads alone.
The optimal number of threads is roughly the number of cores in your machine. Since that's not realistic for most purposes, you can go to double or triple that, but then you're better off with a thread pool, where threads are recycled instead of new ones being created all the time.
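To make the "bounded number of workers" point concrete, here is a small sketch using Parallel.ForEach with MaxDegreeOfParallelism, which runs on pooled threads and caps how many items are processed at once. It assumes .NET 4 or later; BoundedParser, ParseAll and the parse delegate are made-up names standing in for the ParserClass work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParser
{
    // Processes every XML string using at most one worker per core.
    // Returns the number of items processed.
    public static int ParseAll(IEnumerable<string> xmlStrings, Action<string> parse)
    {
        var options = new ParallelOptions
        {
            // Cap concurrency at the core count instead of one thread per message.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        int processed = 0;
        Parallel.ForEach(xmlStrings, options, xml =>
        {
            parse(xml);
            // The counter is shared between workers, so update it atomically.
            Interlocked.Increment(ref processed);
        });

        // Parallel.ForEach returns only after all items are done.
        return processed;
    }
}
```

With 500 messages this reuses a handful of pooled threads rather than creating 500, and the only shared state (the counter) is updated atomically without a lock around the parsing itself.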
Even though this probably won't solve your problem, instead of creating 500 simultaneous threads you should just use the ThreadPool, which manages threads in a much more efficient way:
foreach (System.Messaging.Message m in msgs)
{
byte[] bytes = new byte[m.BodyStream.Length];
m.BodyStream.Read(bytes, 0, (int)m.BodyStream.Length);
System.Text.ASCIIEncoding ascii = new System.Text.ASCIIEncoding();
ParserClass tst = new ParserClass(ascii.GetString(bytes, 0, (int)m.BodyStream.Length));
ThreadPool.QueueUserWorkItem(x => tst.ProcessXML());
}
And to make sure they run as simultaneously as possible change your code in the ParserClass like this (assuming you indeed have resources you share between threads - if you don't have any, you don't have to lock at all):
private static object thLockMe = new object();
public string XmlString { get; set; }
public ParserClass(string xmlString)
{
XmlString = xmlString;
}
public void ProcessXML()
{
XDocument reader = XDocument.Parse(XmlString);
// some more code which doesn't need to access the shared resource
lock (thLockMe)
{
// the necessary code to access the shared resource (and only that)
}
// more code
}
Regarding your actual question:
Instead of calling OddService.InsertEvent(...) multiple times with the same parameters (that method reeks of remote calls and side effects...) you should call it once, store the result in a variable and do all subsequent operations on that variable. That way you can also conveniently check if it's not that precise method which returns null sometimes (when accessed simultaneously?).
Edit:
Does it work if you put all calls to OddService.* in lock blocks?
We're using the following pattern to handle caching of universal objects for our asp.net application.
private object SystemConfigurationCacheLock = new object();
public SystemConfiguration SystemConfiguration
{
get
{
if (HttpContext.Current.Cache["SystemConfiguration"] == null)
lock (SystemConfigurationCacheLock)
{
if (HttpContext.Current.Cache["SystemConfiguration"] == null)
HttpContext.Current.Cache.Insert("SystemConfiguration", GetSystemConfiguration(), null, DateTime.Now.AddMinutes(1), Cache.NoSlidingExpiration, new CacheItemUpdateCallback(SystemConfigurationCacheItemUpdateCallback));
}
return HttpContext.Current.Cache["SystemConfiguration"] as SystemConfiguration;
}
}
private void SystemConfigurationCacheItemUpdateCallback(string key, CacheItemUpdateReason reason, out object expensiveObject, out CacheDependency dependency, out DateTime absoluteExpiration, out TimeSpan slidingExpiration)
{
dependency = null;
absoluteExpiration = DateTime.Now.AddMinutes(1);
slidingExpiration = Cache.NoSlidingExpiration;
expensiveObject = GetSystemConfiguration();
}
private SystemConfiguration GetSystemConfiguration()
{
//Load system configuration
}
The problem is that when under load (~100,000 users) we see a huge jump in TTFB as the CacheItemUpdateCallback blocks all the other threads from executing until it has finished refreshing the cache from the database.
So what I figured we need is a solution where, when the first thread tries to access the cache after it has expired, an asynchronous thread is fired off to update the cache, while all other executing threads can still read from the old cache until it has been successfully updated.
Is there anything built into the .NET framework that can natively handle what I'm asking, or will I have to write it from scratch? Your thoughts please...
A couple of things...
The use of the HttpContext.Current.Cache is incidental and not necessarily essential as we've got no problem using private members on a singleton to hold the cached data.
Please don't comment on the cache times, SPROC efficiency, why we're caching in the first place etc., as it's not relevant. Thanks!
AppFabric might be a good fit for what you're looking for.
http://msdn.microsoft.com/en-us/windowsserver/ee695849
http://msdn.microsoft.com/en-us/library/ff383731.aspx
So it turns out, after several hours of investigation, that the problem is not the CacheItemUpdateCallback blocking other threads as I originally thought. In fact it did exactly what I wanted, asynchronously. It was the garbage collector stopping everything to clean up the LOH (large object heap).
I have the following code, in which I’m trying to process a large amount of data, and update the UI. I’ve tried the same thing using a background worker, but I get a similar issue. The problem seems to be that I’m trying to use a class that was not instantiated on the new thread (the actual error is that the current thread doesn't "own" the instance). My question is, is there a way that I can pass this instance between threads to avoid this error?
DataInterfaceClass dataInterfaceClass = new DataInterfaceClass();
private void OutputData(List<MyResult> Data)
{
progressBar1.Maximum = Data.Count;
progressBar1.Minimum = 1;
progressBar1.Value = 1;
foreach (MyResult res in Data)
{
// Add data to listview
UpdateStatus("Processing", res.Name);
foreach (KeyValuePair<int, string> dets in res.Details)
{
ThreadPool.QueueUserWorkItem((o) =>
{
// Get large amount of data from DB based on key
// – gives error because DataInterfaceClass was
// created in different thread.
MyResult tmpResult = dataInterfaceClass
.GetInfo(dets.DataKey);
if (tmpResult == null)
{
// Updates listview
UpdateStatus("Could not get details",
dets.DataKey);
}
else
{
UpdateStatus("Got Details", dets.DataKey);
}
progressBar1.Dispatcher.BeginInvoke(
(Action)(() => progressBar1.Value++));
});
}
}
}
EDIT:
DataInterfaceClass is actually defined and created outside of the function that it is used in, but it is an instance and not static.
UPDATE:
You seem to have modified the posted source code, so...
You should create an instance of the DataInterfaceClass exclusively for each background thread or task. Provide your task with enough input to create its own instance.
That being said, if you try to access data in a single database in a highly parallel way, this might result in database timeouts. Even if you can get your data access to work in a multithreaded way, I would recommend limiting the number of simultaneous background tasks to prevent this from occurring.
You could use a Semaphore (or similar) to ensure that no more than a certain number of tasks are running at the same time.
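A minimal sketch of that throttling idea, using SemaphoreSlim as the "or similar". Throttle, RunAsync and the work delegate are hypothetical names. In the question's code, work would be the per-key database fetch, ideally creating its own DataInterfaceClass instance.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs `work` for every item, with at most `maxConcurrent` running at once.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxConcurrent, Func<T, Task> work)
    {
        using (var slots = new SemaphoreSlim(maxConcurrent, maxConcurrent))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                // Wait for a free slot before starting the next item.
                await slots.WaitAsync();
                running.Add(RunOneAsync(item, work, slots));
            }

            // Wait for the stragglers before disposing the semaphore.
            await Task.WhenAll(running);
        }
    }

    private static async Task RunOneAsync<T>(
        T item, Func<T, Task> work, SemaphoreSlim slots)
    {
        try
        {
            await work(item);
        }
        finally
        {
            // Free the slot even if the work item fails.
            slots.Release();
        }
    }
}
```

For CPU-bound or blocking work, the delegate can wrap its body in Task.Run so that it runs on a pool thread, for example key => Task.Run(() => LoadDetails(key)), where LoadDetails is again a made-up name.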
Create a global instance for DataInterfaceClass inside the class that has OutputData method defined, that way you would be able to use it within the method.
However, you would need to be cautious in using it. If all the threads would use the same instance to read from the database, it would result in errors.
You should either create a new instance of the DataInterfaceClass in each thread, or have some lock implemented inside your GetInfo method to avoid multiple access issues.