Parallelize / Multi-Thread a singleton object method call

Parallelize / Multi-Thread a singleton object method call - c#

Code Details:
// Singleton class CollectionObject
public class CollectionObject
{
private static CollectionObject instance = null;
// GetInstance() is not called from multiple threads
public static CollectionObject GetInstance()
{
if (CollectionObject.instance == null)
CollectionObject.instance = new CollectionObject();
return CollectionObject.instance;
}
// Dictionary object contains Service ID (int) as key and Service object as the value
// Dictionary is filled up during initiation, before the method call ReadServiceMatrix detailed underneath
public Dictionary<int, Service> serviceCollectionDictionary = new Dictionary<int,Service>();
public Service GetServiceByIDFromDictionary(int servID)
{
if (this.serviceCollectionDictionary.ContainsKey(servID))
return this.serviceCollectionDictionary[servID];
else
return null;
}
}
DataTable serviceMatrix = new DataTable();
// Fill serviceMatrix data table from the database
private int ReadServiceMatrix()
{
// Access the Singleton class object
CollectionObject collectionObject = CollectionObject.GetInstance();
// Parallel processing of the data table rows
Parallel.ForEach<DataRow>(serviceMatrix.AsEnumerable(), row =>
{
//Access Service ID from the Data table
string servIDStr = row["ServID"].ToString().Trim();
// Access other column details for each row of the data table
string currLocIDStr = row["CurrLocId"].ToString().Trim();
string CurrLocLoadFlagStr = row["CurrLocLoadFlag"].ToString().Trim();
string nextLocIDStr = row["NextLocId"].ToString().Trim();
string nextLocBreakFlagStr = row["NextLocBreakFlag"].ToString().Trim();
string seqStr = row["Seq"].ToString().Trim();
int servID = Int32.Parse(servIDStr);
int currLocID = Int32.Parse(currLocIDStr);
int nextLocID = Int32.Parse(nextLocIDStr);
bool nextLocBreakFlag = Int32.Parse(nextLocBreakFlagStr) > 0 ? true : false;
bool currLocBreakFlag = Int32.Parse(CurrLocLoadFlagStr) > 0 ? true : false;
int seq = Int32.Parse(seqStr);
// Method call leading to the issue (definition in Collection Object class)
// Fetch service object using the Service ID from the DB
Service service = collectionObject.GetServiceByIDFromDictionary(servID);
// Call a Service class method
service.InitLanes.Add(new Service.LaneNode(currLoc.SequentialID, currLocBreakFlag, nextLoc.SequentialID, nextLocBreakFlag, seq));
}
Issue that happens is:
In the code above for all the Service objects in the dictionary, the subsequent method call is not made, leading to issues in further processing. It has to o with fetching the Service object from the dictionary in parallel mode
The db an dictionary contains all the Ids /Service objects, but my understanding is when processing in Parallel mode for the Singleton class, few of the objects are skipped leading to the issue.
In my understanding the service id passed and service object created is local to a thread, so there shouldn't be an issue that I am facing. This kind of issue is only possible, when for a given method call one thread replace service id value of another thread by its, thus both end up with Service object and few are thus skipped, which is strange in my view until and unless I do not understand the Multi threading in this case correctly
Currently I am able to run the same code in non threaded mode by using the foreach loop instead of Parallel.ForEach / Parallel.Invoke
Please review and let me know your view or any pointer that can help me resolve the issue

In my understanding the service id passed and service object created
is local to a thread
Your understanding is incorrect, if two threads request the same service id the two threads will be both working on the same singular object. If you wanted separate objects you would need to put some kind of new Service() call in GetServiceByIDFromDictionary instead of a dictionary of existing values.
Because multiple threads could be using the same service objects I think your problem lies from the fact that service.InitLanes.Add is likely not thread safe.
The easiest fix is to just lock on that single step
//...SNIP...
Service service = collectionObject.GetServiceByIDFromDictionary(servID);
// Call a Service class method, only let one thread do it for this specific service instance,
// other threads locking on other instances will not block, only other threads using the same instance will block
lock(service)
{
service.InitLanes.Add(new Service.LaneNode(currLoc.SequentialID, currLocBreakFlag, nextLoc.SequentialID, nextLocBreakFlag, seq));
}
}
This assumes that this Parallel.Foreach is the only location collectionObject.GetServiceByIDFromDictionary is used concurrently. If it is not, any other locations that could potentially be calling any methods on returned services must also lock on service.
However if Service is under your control and you can somehow modify service.InitLanes.Add to be thread safe (perhaps change InitLanes out with a thread safe collection from the System.Collections.Concurrent namespace) that would be a better solution than locking.

1.Implementing singleton always think about using of it in mulithreaded way. Always use multithreaded singleton pattern variant, one of them - lazy singleton. Use Lazy singleton using System.Lazy with appropriate LazyThreadSafeMode consturctor argument:
public class LazySingleton3
{
// static holder for instance, need to use lambda to enter code here
//construct since constructor private
private static readonly Lazy<LazySingleton3> _instance
= new Lazy<LazySingleton3>(() => new LazySingleton3(),
LazyThreadSafeMode.PublicationOnly);
// private to prevent direct instantiation.
private LazySingleton3()
{
}
// accessor for instance
public static LazySingleton3 Instance
{
get
{
return _instance.Value;
}
}
}
Read about it here
2.Use lock-ing of your service variable in parallel loop body
// Method call leading to the issue (definition in Collection Object class)
// Fetch service object using the Service ID from the DB
Service service = collectionObject.GetServiceByIDFromDictionary(servID);
lock (service)
{
// Call a Service class method
service.InitLanes.Add(new Service.LaneNode(currLoc.SequentialID,
currLocBreakFlag, nextLoc.SequentialID,
nextLocBreakFlag, seq));
}
3.Consider to use multithreading here. Using lock-ing code make your code not so perfomant as synchronous. So make sure you multithreaded/paralelised code gives you advantages
4.Use appropriate concurrent collections instead of reinventing wheel - System.Collections.Concurrent Namespace

Related

Chance of hitting the same function at the same time by two Threads/Tasks

Assuming the following case:
public HashTable map = new HashTable();
public void Cache(String fileName) {
if (!map.ContainsKey(fileName))
{
map.Add(fileName, new Object());
_Cache(fileName);
}
}
}
private void _Cache(String fileName) {
lock (map[fileName])
{
if (File Already Cached)
return;
else {
cache file
}
}
}
When having the following consumers:
Task.Run(()=> {
Cache("A");
});
Task.Run(()=> {
Cache("A");
});
Would it be possible in any ways that the Cache method would throw a Duplicate key exception meaning that both tasks would hit the map.add method and try to add the same key??
Edit:
Would using the following data structure solve this concurrency problem?
public class HashMap<Key, Value>
{
private HashSet<Key> Keys = new HashSet<Key>();
private List<Value> Values = new List<Value>();
public int Count => Keys.Count;
public Boolean Add(Key key, Value value) {
int oldCount = Keys.Count;
Keys.Add(key);
if (oldCount != Keys.Count) {
Values.Add(value);
return true;
}
return false;
}
}

Yes, of course it would be possible. Consider the following fragment:
if (!map.ContainsKey(fileName))
{
map.Add(fileName, new Object());
Thread 1 may execute if (!map.ContainsKey(fileName)) and find that the map does not contain the key, so it will proceed to add it, but before it gets the chance to add it, Thread 2 may also execute if (!map.ContainsKey(fileName)), at which point it will also find that the map does not contain the key, so it will also proceed to add it. Of course, that will fail.
EDIT (after clarifications)
So, the problem seems to be how to keep the main map locked for as little as possible, and how to prevent cached objects from being initialized twice.
This is a complex problem, so I cannot give you a ready-to-run answer that will work, (especially since I do not currently even have a C# development environment handy,) but generally speaking, I think that you should proceed as follows:
Fully guard your map with lock().
Keep your map locked as little as possible; when an object is not found to be in the map, add an empty object to the map and exit the lock immediately. This will ensure that this map will not become a point of contention for all requests coming in to the web server.
After the check-if-present-and-add-if-not fragment, you are holding an object which is guaranteed to be in the map. However, this object may and may not be initialized at this point. That's fine. We will take care of that next.
Repeat the lock-and-check idiom, this time with the cached object: every single incoming request interested in that specific object will need to lock it, check whether it is initialized, and if not, initialize it. Of course, only the first request will suffer the penalty of initialization. Also, any requests that arrive before the object has been fully initialized will have to wait on their lock until the object is initialized. But that's all very fine, that's exactly what you want.

block multiple request from same user id to a web method c#

I have a web method upload Transaction (ASMX web service) that take the XML file, validate the file and store the file content in SQL server database. we noticed that a certain users can submit the same file twice at the same time. so we can have the same codes again in our database( we cannot use unique index on the database or do anything on database level, don't ask me why). I thought I can use the lock statement on the user id string but i don't know if this will solve the issue. or if I can use a cashed object for storing all user id requests and check if we have 2 requests from the same user Id we will execute the first one and block the second request with an error message
so if anyone have any idea please help

Blocking on strings is bad. Blocking your webserver is bad.
AsyncLocker is a handy class that I wrote to allow locking on any type that behaves nicely as a key in a dictionary. It also requires asynchronous awaiting before entering the critical section (as opposed to the normal blocking behaviour of locks):
public class AsyncLocker<T>
{
private LazyDictionary<T, SemaphoreSlim> semaphoreDictionary =
new LazyDictionary<T, SemaphoreSlim>();
public async Task<IDisposable> LockAsync(T key)
{
var semaphore = semaphoreDictionary.GetOrAdd(key, () => new SemaphoreSlim(1,1));
await semaphore.WaitAsync();
return new ActionDisposable(() => semaphore.Release());
}
}
It depends on the following two helper classes:
LazyDictionary:
public class LazyDictionary<TKey,TValue>
{
//here we use Lazy<TValue> as the value in the dictionary
//to guard against the fact the the initializer function
//in ConcurrentDictionary.AddOrGet *can*, under some conditions,
//run more than once per key, with the result of all but one of
//the runs being discarded.
//If this happens, only uninitialized
//Lazy values are discarded. Only the Lazy that actually
//made it into the dictionary is materialized by accessing
//its Value property.
private ConcurrentDictionary<TKey, Lazy<TValue>> dictionary =
new ConcurrentDictionary<TKey, Lazy<TValue>>();
public TValue GetOrAdd(TKey key, Func<TValue> valueGenerator)
{
var lazyValue = dictionary.GetOrAdd(key,
k => new Lazy<TValue>(valueGenerator));
return lazyValue.Value;
}
}
ActionDisposable:
public sealed class ActionDisposable:IDisposable
{
//useful for making arbitrary IDisposable instances
//that perform an Action when Dispose is called
//(after a using block, for instance)
private Action action;
public ActionDisposable(Action action)
{
this.action = action;
}
public void Dispose()
{
var action = this.action;
if(action != null)
{
action();
}
}
}
Now, if you keep a static instance of this somewhere:
static AsyncLocker<string> userLock = new AsyncLocker<string>();
you can use it in an async method, leveraging the delights of LockAsync's IDisposable return type to write a using statement that neatly wraps the critical section:
using(await userLock.LockAsync(userId))
{
//user with userId only allowed in this section
//one at a time.
}
If we need to wait before entering, it's done asynchronously, freeing up the thread to service other requests, instead of blocking until the wait is over and potentially messing up your server's performance under load.
Of course, when you need to scale to more than one webserver, this approach will no longer work, and you'll need to synchronize using a different means (probably via the DB).

Executing part of code exactly 1 time inside Parallel.ForEach

I have to query in my company's CRM Solution(Oracle's Right Now) for our 600k users, and update them there if they exist or create them in case they don't. To know if the user already exists in Right Now, I consume a third party WS. And with 600k users this can be a real pain due to the time it takes each time to get a response(around 1 second). So I managed to change my code to use Parallel.ForEach, querying each record in just 0,35 seconds, and adding it to a List<User> of records to be created or to be updated (Right Now is kinda dumb so I need to separate them in 2 lists and call 2 distinct WS methods).
My code used to run perfectly before multithread, but took too long. The problem is that I can't make a batch too large or I get a timeout when I try to update or create via Web Service. So I'm sending them around 500 records at once, and when it runs the critical code part, it executes many times.
Parallel.ForEach(boDS.USERS.AsEnumerable(), new ParallelOptions { MaxDegreeOfParallelism = -1 }, row =>
{
...
user = null;
user = QueryUserById(row["USER_ID"].Trim());
if (user == null)
{
isUpdate = false;
gObject.ID = new ID();
}
else
{
isUpdate = true;
gObject.ID = user.ID;
}
... fill user attributes as generic fields ...
gObject.GenericFields = listGenericFields.ToArray();
if (isUpdate)
listUserUpdate.Add(gObject);
else
listUserCreate.Add(gObject);
if (i == batchSize - 1 || i == (boDS.USERS.Rows.Count - 1))
{
UpdateProcessingOptions upo = new UpdateProcessingOptions();
CreateProcessingOptions cpo = new CreateProcessingOptions();
upo.SuppressExternalEvents = false;
upo.SuppressRules = false;
cpo.SuppressExternalEvents = false;
cpo.SuppressRules = false;
RNObject[] results = null;
// <Critical_code>
if (listUserCreate.Count > 0)
{
results = _service.Create(_clientInfoHeader, listUserCreate.ToArray(), cpo);
}
if (listUserUpdate.Count > 0)
{
_service.Update(_clientInfoHeader, listUserUpdate.ToArray(), upo);
}
// </Critical_code>
listUserUpdate = new List<RNObject>();
listUserCreate = new List<RNObject>();
}
i++;
});
I thought about using lock or mutex, but it isn't gonna help me, since they will just wait to execute afterwards. I need some solution to execute only ONCE in only ONE thread that part of code. Is it possible? Can anyone share some light?
Thanks and kind regards,
Leandro

As you stated in the comments you're declaring the variables outside of the loop body. That's where your race conditions originate from.
Let's take variable listUserUpdate for example. It's accessed randomly by parallel executing threads. While one thread is still adding to it, e.g. in listUserUpdate.Add(gObject); another thread could already be resetting the lists in listUserUpdate = new List<RNObject>(); or enumerating it in listUserUpdate.ToArray().
You really need to refactor that code to
make each loop run as independent from each other as you can by moving variables inside the loop body and
access data in a synchronizing way using locks and/or concurrent collections

You can use the Double-checked locking pattern. This is usually used for singletons, but you're not making a singleton here so generic singletons like Lazy<T> do not apply.
It works like this:
Separate out your shared data into some sort of class:
class QuerySharedData {
// All the write-once-read-many fields that need to be shared between threads
public QuerySharedData() {
// Compute all the write-once-read-many fields. Or use a static Create method if that's handy.
}
}
In your outer class add the following:
object padlock;
volatile QuerySharedData data
In your thread's callback delegate, do this:
if (data == null)
{
lock (padlock)
{
if (data == null)
{
data = new QuerySharedData(); // this does all the work to initialize the shared fields
}
}
}
var localData = data
Then use the shared query data from localData By grouping the shared query data into a subordinate class you avoid the necessity of making its individual fields volatile.
More about volatile here: Part 4: Advanced Threading.
Update my assumption here is that all the classes and fields held by QuerySharedData are read-only once initialized. If this is not true, for instance if you initialize a list once but add to it in many threads, this pattern will not work for you. You will have to consider using things like Thread-Safe Collections.

How to implement locking in a shared cachecontroller?

I have a static class which handles the cache read/write for frequently used data.
The code is this:
public static T GetFromCache<T>(double seconds, string cacheId, Func<T> method) where T : class
{
HttpContext ctx = HttpContext.Current;
object temp = null;
temp = ctx.Cache[cacheId];
if (temp == null)
{
lock (Sync)
{
temp = ctx.Cache[cacheId];
if (temp == null)
{
temp = method.Invoke();
AddToCache(temp as T, seconds, cacheId);
return temp as T;
}
}
}
if (temp is T)
{
return (T)temp;
}
return null;
}
The code is used by various callers to read data from and write data to the cache.
Now I have a Sync object (private static readonly object Sync = new object();) which gets locked when data gets written to the cache.
As this code is called by multiple callers, I would like to create a List of Sync objects, one for each caller. (with caller I don't mean the user, but calling code. I then would identify a caller by the signature of the parameter method)
The reason I want this is that every piece of calling code can have it's own lock object; otherwise (I think) every call to this cachecontroller from different callers will use the same lock object. Then, the caching for the list of countries will also lock the caching of the list of states, and with two different lock objects, they will not be in each others way.
I would then use the CacheItemRemovedCallback method to remove the lockitems from the list.
The question is this: How can I do that?

By having one Sync object for each user will defy the purpose of synchronization as each use will hold its own lock and there will be a chance that for the same cacheId you will end up invoking the method multiple times. This might result in data becoming inconsistent.
If you wish to keep one Sync object per user then it's good to make use of session variables or per user cache or something similar.. otherwise each user will virtually end up messing up with each other's cacheId results.
If you have a scenario when there can be Multiple readers of data but at a time a single user can write it then try using ReaderWriterLockSlim. This is very fast compared to lock in a multi user scenario.
Update1
Considering the cacheId is unique and not common among the callers. You can use the following code.
No lock is needed here. Reason, HttpContext.Cache is ThreadSafe. Meaning, you can read/save values to Cache. But, if the value reference itself is being shared among more than one concurrent calls, then please synchronize it.
public static T GetFromCache<T>(double seconds, string cacheId, Func<T> method) where T : class
{
HttpContext ctx = HttpContext.Current;
object temp = null;
temp = ctx.Cache[cacheId];
if (temp == null)
{
temp = method.Invoke();
AddToCache(temp as T, seconds, cacheId);
return temp as T;
}
}
Regards

LINQ To SQL Thread Safety

I want to ask whether the following code is thread safe:
Consider that Save1 and Save2 will be executed concurrently. Is there any problem with the thread safety of the datacontext?
public class Test1()
{
private void Save1()
{
using(TestLinqToSQL obj = new TestLinqToSQL())
{
obj.SaveRecord(new Client (){Id = 1, Name = "John Doe");
}
}
private void Save2()
{
using(TestLinqToSQL obj = new TestLinqToSQL())
{
obj.SaveRecord(new Client (){Id = 2, Name = "Mike Tyson");
}
}
}
public class TestLinqToSQL : IDisposable
{
public void SaveRecord(Client newClient)
{
using(ClientDatacontext cont = new ClientDatacontext())
{
cont.InsertRecord(newClient);
}
}
}
Thanks in advance

In this case, no it is not a problem as each thread will get a separate DataContext instance since each method results in a new one being created. You would have a problem if the DataContext was shared between threads as the instance methods are not thread safe see MSDN

Thread safe doesn't really mean anything without context. You need to be much more detailed about what you would consider acceptable and unacceptable. In your specific case, because you have a separate data context for each method, you don't need to worry about one of the inserts being "in the middle of" another insert, or in some other way causing one of them to fail entirely as a result of unsynchronized access to a shared resource (that would potentially be a problem if the data context was shared between threads).
However, the order of the inserts is entirely indeterminate. If the order of those operations matters then it's "not thread safe".
Additionally, if you were performing multiple operations that comprised a "transaction" it may or may not be "thread safe' depending on how you define thread safe. If each method were inserting 5 items you couldn't be sure that all five inserts were either before or after the other method's inserts (unless you explicitly added a lock to ensure that).

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.