Single class instance serving multiple requests - c#

I have a singleton client class in my solution, which calls an external service/APIs. There are no locks inside this class (nothing protecting variables being accessed by multiple threads).
Suppose this singleton instance gets request 1 and, while handling it, gets another request (request 2). What happens? Does it process request 1 to completion and then serve request 2? Or does it start serving request 2 at the same time, which might in turn overwrite variables in the singleton class?
Thanks for the help!

When two threads concurrently execute a method on a single instance of a class, arguments passed to a method and variables defined within that method are not overwritten. The values of fields and properties, however, can be changed.
This class is thread safe:
public class Calculator
{
public long Add(int a, int b)
{
var result = a + b;
return result;
}
}
The arguments (a, b) and the variable (result) are stored in the stack frame, memory which is allocated each time the method is executed. So the result variable exists for each method call. Two method calls cannot share or overwrite that variable.
Similarly, if this method gets called while it's already executing, the a and b arguments are not overwritten.
As a result, any number of threads can safely call the Add method on a single instance of Calculator. They can do this concurrently. One execution does not wait for the other.
This class is not thread safe:
public class Calculator
{
private int _a;
private int _b;
public long Add(int a, int b)
{
_a = a;
_b = b;
var result = _a + _b;
return result;
}
}
It's a contrived example. The difference here is that calling Add modifies the state of the class, changing the _a and _b fields. It would be the same if these were properties instead of fields.
If two threads tried to execute this at once you would get unpredictable results. Right before the first thread adds _a and _b another thread might change the value of one of those fields.
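If the fields genuinely had to stay, one possible way to make the second Calculator safe is to serialize Add behind a private lock object. The following is only a minimal sketch under that assumption; the names LockedCalculator, _gate and LockedCalculatorDemo are hypothetical and not part of the original answer.

```csharp
using System.Threading;
using System.Threading.Tasks;

public class LockedCalculator
{
    // Private lock object (hypothetical name): nothing outside this class can lock on it.
    private readonly object _gate = new object();
    private int _a;
    private int _b;

    public long Add(int a, int b)
    {
        // Only one thread at a time may write the fields and then read them back.
        lock (_gate)
        {
            _a = a;
            _b = b;
            return (long)_a + _b;
        }
    }
}

public static class LockedCalculatorDemo
{
    // Calls Add concurrently and returns how many results were wrong.
    public static int CountWrongResults(int calls)
    {
        var calculator = new LockedCalculator();
        int wrong = 0;
        Parallel.For(0, calls, i =>
        {
            if (calculator.Add(i, i) != 2L * i)
            {
                Interlocked.Increment(ref wrong);
            }
        });
        return wrong;
    }
}
```

The lock makes concurrent callers wait for each other, which is exactly the cost the stateless version avoids.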
This is the simple version. There are more complicated scenarios. Suppose, for instance, we pass a List<int> as an argument to a method. If another thread has a reference to the same list, both threads could try to modify it, or one could modify it while the other is reading it, all with unexpected results. How to manage all of that is outside the scope of this answer.
Here are a few takeaways:
Don't add state to a class (fields and properties that change after its constructor is called) unless it's needed. There are scenarios where we must, and in some cases it's the whole purpose of the class. But if you can choose between the examples above, always choose the first one.
Whenever we pass around references to objects like lists or instances of our own classes, consider what would happen if two threads had access to that object. Then choose one of the following:
Don't pass them around if they're not thread safe
Make them thread safe before they're passed around
Be very, very careful. This is the worst option because it puts a burden on us and future developers to simulate the behavior of the code in our heads and see potential problems. It's so much better to prevent problems than to figure out how to navigate around them.
Another way of describing it: Concurrency is like plutonium. It's powerful and useful, but we must always know where it is and make sure it never leaks.
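As an illustration of the "make them thread safe before they're passed around" option, here is a minimal sketch of a list wrapper that guards every access with a private lock and only hands out copies. SafeIntList and its members are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public sealed class SafeIntList
{
    // Private lock object and private list: callers never see either directly.
    private readonly object _gate = new object();
    private readonly List<int> _items = new List<int>();

    public void Add(int value)
    {
        lock (_gate)
        {
            _items.Add(value);
        }
    }

    // Returns a copy, so readers never enumerate the list while it is being modified.
    public int[] Snapshot()
    {
        lock (_gate)
        {
            return _items.ToArray();
        }
    }
}
```

Any number of threads can then share one SafeIntList instance, at the price of copying on every read.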

Related

C# lock based on class property

I've seen many examples of the lock usage, and it's usually something like this:
private static readonly object obj = new object();
lock (obj)
{
// code here
}
Is it possible to lock based on a property of a class? I didn't want to lock globally for any calls to the method with the lock statement, I'd like to lock only if the object passed as argument had the same property value as another object which was being processed prior to that.
Is that possible? Does that make sense at all?
This is what I had in mind:
public class GmailController : Controller
{
private static readonly ConcurrentQueue<PushRequest> queue = new ConcurrentQueue<PushRequest>();
[HttpPost]
public IActionResult ProcessPushNotification(PushRequest push)
{
var existingPush = queue.FirstOrDefault(q => q.Matches(push));
if (existingPush == null)
{
queue.Enqueue(push);
existingPush = push;
}
try
{
// lock if there is an existing push in the
// queue that matches the requested one
lock (existingPush)
{
// process the push notification
}
}
finally
{
queue.TryDequeue(out existingPush);
}
}
}
Background: I have an API where I receive push notifications from Gmail's API when our users send/receive emails. However, if someone sends a message to two users at the same time, I get two push notifications. My first idea was querying the database before inserting (based on subject, sender, etc). In some rare cases, the query of the second call is made before the SaveChanges of the previous call, so I end up having duplicates.
I know that if I ever wanted to scale out, lock would become useless. I also know I could just create a job to check recent entries and eliminate duplicates, but I was trying something different. Any suggestions are welcome.
Let me first make sure I understand the proposal. The problem given is that we have some resource shared to multiple threads, call it database, and it admits two operations: Read(Context) and Write(Context). The proposal is to have lock granularity based on a property of the context. That is:
void MyRead(Context c)
{
lock(c.P) { database.Read(c); }
}
void MyWrite(Context c)
{
lock(c.P) { database.Write(c); }
}
So now if we have a call to MyRead where the context property has value X, and a call to MyWrite where the context property has value Y, and the two calls are racing on two different threads, they are not serialized. However, if we have, say, two calls to MyWrite and a call to MyRead, and in all of them the context property has value Z, those calls are serialized.
Is this possible? Yes. That doesn't make it a good idea. As implemented above, this is a bad idea and you shouldn't do it.
It is instructive to learn why it is a bad idea.
First, this simply fails if the property is a value type, like an integer. You might think, well, my context is an ID number, that's an integer, and I want to serialize all accesses to the database using ID number 123, and serialize all accesses using ID number 345, but not serialize those accesses with respect to each other. Locks only work on reference types, and boxing a value type always gives you a freshly allocated box, so the lock would never be contested even if the ids were the same. It would be completely broken.
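A minimal sketch of the boxing behavior described above, assuming the ID is converted to object before being used as a lock (the C# compiler rejects a lock statement on an int directly). BoxingDemo is a hypothetical name.

```csharp
public static class BoxingDemo
{
    // Boxes the same int twice and reports whether both boxes are the same object.
    public static bool BoxesShareReference(int id)
    {
        object first = id;   // boxing allocates a new object holding a copy of id
        object second = id;  // boxing again allocates a different object
        return object.ReferenceEquals(first, second);
    }
}
```

BoxesShareReference(123) returns false, so two threads locking on their own boxed copy of 123 would never block each other.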
Second, it fails badly if the property is a string. Locks are logically "compared" by reference, not by value. With boxed integers, you always get different references. With strings, you sometimes get different references! (Because of interning being applied inconsistently.) You could be in a situation where you are locking on "ABC" and sometimes another lock on "ABC" waits, and sometimes it does not!
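The string case can be sketched the same way. In this hypothetical StringIdentityDemo, a string built at run time is equal in value to a literal but is a different object, so a lock on one would not contend with a lock on the other.

```csharp
public static class StringIdentityDemo
{
    // Builds "ABC" at run time so it is not the literal's object.
    private static string BuildAbc()
    {
        return new string(new[] { 'A', 'B', 'C' });
    }

    public static bool EqualByValue()
    {
        return "ABC" == BuildAbc();
    }

    public static bool SameReference()
    {
        return object.ReferenceEquals("ABC", BuildAbc());
    }

    // string.Intern returns the canonical instance for a given value.
    public static bool SameReferenceAfterIntern()
    {
        return object.ReferenceEquals(string.Intern("ABC"), string.Intern(BuildAbc()));
    }
}
```

EqualByValue() is true and SameReference() is false, while SameReferenceAfterIntern() is true. Whether two "equal" strings happen to be the same object depends on how each one was produced.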
But the fundamental rule that is broken is: you must never lock on an object unless that object has been specifically designed to be a lock object, and the same code which controls access to the locked resource controls access to the lock object.
The problem here is not "local" to the lock but rather global. Suppose your property is a Frob where Frob is a reference type. You don't know if any other code in your process is also locking on that same Frob, and therefore you don't know what lock ordering constraints are necessary to prevent deadlocks. Whether a program deadlocks or not is a global property of a program. Just like you can build a hollow house out of solid bricks, you can build a deadlocking program out of a collection of locks that are individually correct. By ensuring that every lock is only taken out on a private object that you control, you ensure that no one else is ever locking on one of your objects, and therefore the analysis of whether your program contains a deadlock becomes simpler.
Note that I said "simpler" and not "simple". It reduces it to almost impossible to get correct, from literally impossible to get correct.
So if you were hell bent on doing this, what would be the right way to do it?
The right way would be to implement a new service: a lock object provider. LockProvider<T> needs to be able to hash and compare for equality two Ts. The service it provides is: you tell it that you want a lock object for a particular value of T, and it gives you back the canonical lock object for that T. When you're done, you say you're done. The provider keeps a reference count of how many times it has handed out a lock object and how many times it got it back, and deletes it from its dictionary when the count goes to zero, so that we don't have a memory leak.
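The shape of such a provider can be sketched as follows. This is a deliberately naive illustration, not a production design: it guards its dictionary with a single lock, so it does not meet the low-contention requirement, and the member names (Acquire, Release, Count) are assumptions made for this example.

```csharp
using System.Collections.Generic;

public sealed class LockProvider<T> where T : notnull
{
    // Each entry doubles as the canonical lock object for its key.
    private sealed class Entry
    {
        public int RefCount;
    }

    private readonly object _gate = new object();
    private readonly Dictionary<T, Entry> _entries = new Dictionary<T, Entry>();

    // Hands out the canonical lock object for key and counts the handout.
    public object Acquire(T key)
    {
        lock (_gate)
        {
            if (!_entries.TryGetValue(key, out Entry entry))
            {
                entry = new Entry();
                _entries.Add(key, entry);
            }
            entry.RefCount++;
            return entry;
        }
    }

    // Call once per Acquire; the entry is removed when nobody holds it any more.
    public void Release(T key)
    {
        lock (_gate)
        {
            if (_entries.TryGetValue(key, out Entry entry) && --entry.RefCount == 0)
            {
                _entries.Remove(key);
            }
        }
    }

    // Number of keys currently tracked (useful for checking for leaks).
    public int Count
    {
        get { lock (_gate) { return _entries.Count; } }
    }
}
```

A caller would Acquire the object for a key, take a lock on it inside a try block, and call Release in the finally block.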
Obviously the lock provider needs to be threadsafe and needs to be extremely low contention, because it is a mechanism designed to prevent contention, so it had better not cause any! If this is the road you intend to go down, you need to get an expert on C# threading to design and implement this object. It is very easy to get this wrong. As I have noted in comments to your post, you are attempting to use a concurrent queue as a sort of poor lock provider and it is a mass of race condition bugs.
This is some of the hardest code to get correct in all of .NET programming. I have been a .NET programmer for almost 20 years and implemented parts of the compiler and I do not consider myself competent to get this stuff right. Seek the help of an actual expert.
Although I find Eric Lippert's answer fantastic and marked it as the correct one (and I won't change that), his thoughts made me think, and I wanted to share an alternative solution I found to this problem (I'd appreciate any feedback). I'm not going to use it, though: I ended up using Azure Functions with my code (so this wouldn't make sense there) and a cron job to detect and eliminate possible duplicates.
public class UserScopeLocker : IDisposable
{
private static readonly object _obj = new object();
private static ICollection<string> UserQueue = new HashSet<string>();
private readonly string _userId;
protected UserScopeLocker(string userId)
{
this._userId = userId;
}
public static UserScopeLocker Acquire(string userId)
{
while (true)
{
lock (_obj)
{
if (UserQueue.Contains(userId))
{
continue;
}
UserQueue.Add(userId);
return new UserScopeLocker(userId);
}
}
}
public void Dispose()
{
lock (_obj)
{
UserQueue.Remove(this._userId);
}
}
}
...then you would use it like this:
[HttpPost]
public IActionResult ProcessPushNotification(PushRequest push)
{
using(var scope = UserScopeLocker.Acquire(push.UserId))
{
// process the push notification
// two threads can't enter here for the same UserId
// the second one will be blocked until the first disposes
}
}
The idea is:
UserScopeLocker has a protected constructor, ensuring you call Acquire.
_obj is private static readonly, only the UserScopeLocker can lock this object.
_userId is a private readonly field, ensuring even its own class can't change its value.
lock is done when checking, adding and removing, so two threads can't compete on these actions.
Possible flaws I detected:
Since UserScopeLocker relies on IDisposable to release some UserId, I can't guarantee the caller will properly use using statement (or manually dispose the scope object).
I can't guarantee the scope won't be used in a recursive function (thus possibly causing a deadlock).
I can't guarantee the code inside the using statement won't call another function which also tries to acquire a scope to the user (this would also cause a deadlock).
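One further point about the Acquire loop above is that a waiting thread spins, repeatedly taking and releasing the lock until the user ID is free. A hedged variant, assuming the same overall design, can block with Monitor.Wait and be woken by Monitor.PulseAll instead. KeyedScope is a hypothetical name, and the flaws listed above (missed Dispose, re-entrancy deadlocks) still apply to it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class KeyedScope : IDisposable
{
    private static readonly object Gate = new object();
    private static readonly HashSet<string> Held = new HashSet<string>();

    private readonly string _key;
    private bool _disposed;

    private KeyedScope(string key)
    {
        _key = key;
    }

    public static KeyedScope Acquire(string key)
    {
        lock (Gate)
        {
            // HashSet.Add returns false while another scope holds the key.
            while (!Held.Add(key))
            {
                // Releases Gate while waiting and reacquires it before returning.
                Monitor.Wait(Gate);
            }
            return new KeyedScope(key);
        }
    }

    public void Dispose()
    {
        lock (Gate)
        {
            if (_disposed)
            {
                return; // disposing twice must not release a key held by someone else
            }
            _disposed = true;
            Held.Remove(_key);
            Monitor.PulseAll(Gate); // wake waiters so they can retry their key
        }
    }
}
```

It is used the same way as UserScopeLocker, for example using (KeyedScope.Acquire(push.UserId)) { ... }.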

Calling static method via Multiple threads - Can they meddle with each other's Input Parameters

My code gets called by an AJAX UI (multiple threads), and after processing the data it sends output as JSON. Recently, while refactoring the code, we moved a lot of common and repeated methods to a separate file and made them static, since we are not working on any static / shared data.
The following is a sample design of our static method:
public class Helper
{
public static C Method1(List<A> aList, List<B> bList)
{
C objC = new C();
// Create ObjC based on inputs aList and bList
return objC;
}
}
Now, my understanding is that the following call will have no issue when called in a Parallel.ForEach or any other multithreaded scenario. Please verify.
C resultC = Helper.Method1(aList, bList);
However, we have a doubt: could there be a rare case where two threads make the above call and one thread's aList and bList data is replaced by the other thread's, giving a flawed result (maybe an exception)? Such a failure would be nearly impossible to debug and reproduce, since the two threads would have to execute together within the precise milliseconds the method takes to run.
Please share your view: are we on the right track with the above design, or are there pitfalls we are not able to see? We could easily replace it with an instance method, which would surely be thread safe in this scenario since each thread would have its own instance to work with, but I feel that may not be required, and it's troublesome to keep creating instances when we can conveniently work with a static call.
Please note till now I haven't seen an issue with code running, but as I said if this ever happens it will be corner case, for two threads to come at same time and one thread replace the input parameter while other thread is still processing result.
The short answer to your question is no, as long as you pass in different List instances across all your different threads. .NET handles threading fine, and in itself won't get itself into a tangle, it's only if your code encourages it to do so can things get messy.
The way things get mixed up is by sharing state across different threads. So as an example, having a static method you may think it a good idea to use a static variable somewhere:
private static int count;
public static void MyMethod()
{
count = count + 1;
if (count == 5)
{
Console.WriteLine("Equal to 5");
}
}
This sort of method is not thread safe because count can be modified by two different threads at the same time. In fact it's possible that count could be incremented to 5, and then another thread increment it to 6 before the if check, therefore you'd never log anything - which would obviously be a bit confusing.
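One hedged way to close that gap is to make the increment atomic and test the value that the increment itself returned, rather than re-reading the shared field. The sketch below uses Interlocked.Increment; HitCounter and its members are hypothetical names, and the counter is an instance field here only to keep the example easy to test.

```csharp
using System;
using System.Threading;

public sealed class HitCounter
{
    private int _count;
    private int _timesEqualToFive;

    // How many calls observed the value 5 (expected to be at most one).
    public int TimesEqualToFive
    {
        get { return Volatile.Read(ref _timesEqualToFive); }
    }

    public void MyMethod()
    {
        // Interlocked.Increment returns the incremented value atomically,
        // so exactly one caller sees 5 no matter how the threads interleave.
        int current = Interlocked.Increment(ref _count);
        if (current == 5)
        {
            Interlocked.Increment(ref _timesEqualToFive);
            Console.WriteLine("Equal to 5");
        }
    }
}
```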
The other way you can share state, is if you pass something in, hence my caveat at the start of the answer. If you pass the same object into the method from multiple threads this object should ideally be immutable so in your case a collection that can't be modified. This prevents the internals of the method modifying the object that could impact another thread. As already mentioned though, this is only a concern if you pass the same instances in, within different threads.
Method call will not interfere with other method calls from different thread. You have to think carefully only about static variables. Also if you share input parameters aList or bList between threads you can run into troubles.

Static variable population best practice

I have a class Meterage which I expect to be instantiated several times, probably in quick succession. Each class will need to know the location of the dropbox folder in the executing machine, and I have code for this.
The class currently has a variable:
private string dropboxPath = string.Empty;
to hold the path, but I am considering making this a static to save repeated execution of
this.LocateDropboxFolder();
in the constructor. But I am a little concerned by the switch: what if two constructors try to set this at the same time? Would this code in the constructor be safe (LocateDropboxFolder becomes static too in this example):
public Meterage()
{
if (dropboxPath == string.Empty)
{
LocateDropboxFolder();
}
}
I think my concerns are perhaps irrelevant as long as I don't have construction occurring in multiple threads?
If the field is made static then static field initializers or static constructors are the easy way to initialize them. This will be executed at most once in a thread safe manner.
private static string dropboxPath;
static Meterage()
{
LocateDropboxFolder();
}
If you don't want the field to be reassigned, I suggest using the readonly modifier. The code should then look like this:
private static readonly string dropboxPath;
static Meterage()
{
dropboxPath = LocateDropboxFolder();
}
LocateDropboxFolder needs to return a string in this case.
Field initializers (variables declared and assigned outside the constructor) are evaluated before the constructor body runs; the constructor can then assign the field again.
Do remember that you will end up having only one dropboxPath. If this is intended, it is okay to do so. Optionally, make LocateDropboxFolder a static method and call it from the static constructor.
If you want to prevent other constructors from overwriting the default, try this:
if (string.IsNullOrEmpty(dropboxPath))
{
LocateDropboxFolder();
}
Or, in a static constructor (at most called once):
static Meterage()
{
LocateDropboxFolder();
}
private static void LocateDropboxFolder()
{
...
}
Your example will be safe provided your code is executing synchronously. If multiple instances are created, their constructors will be called in the order they are created.
On the first run through, LocateDropboxFolder() will execute. When this completes, dropboxPath will be set.
On the second constructor execution, LocateDropboxFolder() will not execute because dropboxPath will no longer equal string.Empty (provided LocateDropboxFolder() does not return string.Empty).
However, if LocateDropboxFolder() is asynchronous or the objects are instantiated on different threads, then it is possible to create a second Meterage instance before dropBoxPath has been set by the LocateDropboxFolder() function. As such, multiple calls to the function will likely be made.
If you wish to guard against multithreading errors like this, you could consider using lock statements.
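As an alternative to a hand-written lock, Lazy<T> can run the lookup once, on first use, even when several threads construct instances at the same time (its default mode lets only one thread run the factory). This is a sketch under assumptions: the returned path is a placeholder, and LocateCalls and DropboxFolder are hypothetical members added only to make the behavior observable.

```csharp
using System;
using System.Threading;

public class Meterage
{
    private static int locateCalls;

    // The factory runs at most once; other threads wait for its result.
    private static readonly Lazy<string> DropboxPath =
        new Lazy<string>(LocateDropboxFolder);

    // Hypothetical helper: how many times the lookup actually ran.
    public static int LocateCalls
    {
        get { return Volatile.Read(ref locateCalls); }
    }

    public string DropboxFolder
    {
        get { return DropboxPath.Value; }
    }

    private static string LocateDropboxFolder()
    {
        Interlocked.Increment(ref locateCalls);
        return "/placeholder/dropbox"; // stand-in for the real lookup code
    }
}
```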
You might potentially end up running the LocateDropboxFolder multiple times if the object tries to be constructed multiple times in close succession from multiple threads. As long as the method returns the same result every time though this shouldn't be a problem since it will still be using the same value.
Additionally if you are setting the value of dropboxPath in the constructor then there is no point setting a default value for it. I'd just declare it (and not assign it) and then check for null in your constructor.
I have a feeling that your Meterage class is breaking the Single Responsibility Principle. What does the meterage have to do with file access? I would say you have two concerns here: your Meterage and, let's say, a FolderLocator. The second one should have some property or method like Dropbox, which could use the lazy evaluation pattern. It should be instantiated once, and this single instance can be injected into each Meterage instance.
Maybe not FolderLocator but FileSystem, with some more methods than just a single property? Not sure what you're actually doing. Anyway, make an interface for this. That would allow unit testing without using the actual Dropbox folder.

What Makes a Method Thread-safe? What are the rules?

Are there overall rules/guidelines for what makes a method thread-safe? I understand that there are probably a million one-off situations, but what about in general? Is it this simple?
If a method only accesses local variables, it's thread safe.
Is that it? Does that apply for static methods as well?
One answer, provided by @Cybis, was:
Local variables cannot be shared among threads because each thread gets its own stack.
Is that the case for static methods as well?
If a method is passed a reference object, does that break thread safety? I have done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method? If so, what are they?"
EDIT
A lot of good points have been made here. I think the real answer to this question is: "There are no simple rules to ensure thread safety." Cool. Fine. But in general I think the accepted answer provides a good, short summary. There are always exceptions. So be it. I can live with that.
If a method (instance or static) only references variables scoped within that method then it is thread safe because each thread has its own stack:
In this instance, multiple threads could call ThreadSafeMethod concurrently without issue.
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number; // each thread will have its own variable for number.
number = parameter1.Length;
return number;
}
}
This is also true if the method calls other class methods which only reference locally scoped variables:
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number;
number = this.GetLength(parameter1);
return number;
}
private int GetLength(string value)
{
int length = value.Length;
return length;
}
}
If a method accesses any (object state) properties or fields (instance or static) then you need to use locks to ensure that the values are not modified by a different thread:
public class Thing
{
private string someValue; // all threads will read and write to this same field value
public int NonThreadSafeMethod(string parameter1)
{
this.someValue = parameter1;
int number;
// Since access to someValue is not synchronised by the class, a separate thread
// could have changed its value between this thread setting its value at the start
// of the method and this line reading its value.
number = this.someValue.Length;
return number;
}
}
You should be aware that any parameters passed in to the method which are not either a struct or immutable could be mutated by another thread outside the scope of the method.
To ensure proper concurrency you need to use locking.
For further information, see the lock statement C# reference and ReaderWriterLockSlim.
lock is mostly useful for providing one-at-a-time functionality;
ReaderWriterLockSlim is useful if you need multiple readers and a single writer.
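A minimal sketch of the multiple-readers, single-writer case, using System.Threading.ReaderWriterLockSlim (the class's actual name). SettingsCache and its members are hypothetical and exist only for this example.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class SettingsCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    // Many threads may hold the read lock at the same time.
    public bool TryGet(string key, out string value)
    {
        _lock.EnterReadLock();
        try
        {
            return _values.TryGetValue(key, out value);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }

    // The write lock is exclusive: it waits for readers and blocks new ones.
    public void Set(string key, string value)
    {
        _lock.EnterWriteLock();
        try
        {
            _values[key] = value;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}
```

The try/finally pairs matter: a lock that is entered but never exited would block every later writer.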
If a method only accesses local variables, it's thread safe. Is that it?
Absolutely not. You can write a program with only a single local variable accessed from a single thread that is nevertheless not threadsafe:
https://stackoverflow.com/a/8883117/88656
Does that apply for static methods as well?
Absolutely not.
One answer, provided by @Cybis, was: "Local variables cannot be shared among threads because each thread gets its own stack."
Absolutely not. The distinguishing characteristic of a local variable is that it is only visible from within the local scope, not that it is allocated on the temporary pool. It is perfectly legal and possible to access the same local variable from two different threads. You can do so by using anonymous methods, lambdas, iterator blocks or async methods.
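A short sketch of the lambda case: the local variable total below is captured by the lambda, so every thread that Parallel.For uses works on the same variable. CapturedLocalDemo is a hypothetical name, and Interlocked.Increment is used so that the shared access is safe.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CapturedLocalDemo
{
    public static int CountWithSharedLocal(int iterations)
    {
        int total = 0; // a local variable, yet visible to every thread running the lambda

        Parallel.For(0, iterations, _ =>
        {
            // All threads increment the same captured local.
            Interlocked.Increment(ref total);
        });

        return total;
    }
}
```

CountWithSharedLocal(1000) returns 1000, which is only possible because all the threads updated one and the same local. Replacing the Interlocked call with a plain total++ would make the result unpredictable.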
Is that the case for static methods as well?
Absolutely not.
If a method is passed a reference object, does that break thread safety?
Maybe.
I've done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
You are going to have to learn to live with disappointment. This is a very difficult subject.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method?"
Nope. As you saw from my example earlier an empty method can be non-thread-safe. You might as well ask "is there a short list of rules that ensures a method is correct". No, there is not. Thread safety is nothing more than an extremely complicated kind of correctness.
Moreover, the fact that you are asking the question indicates your fundamental misunderstanding about thread safety. Thread safety is a global, not a local property of a program. The reason why it is so hard to get right is because you must have a complete knowledge of the threading behaviour of the entire program in order to ensure its safety.
Again, look at my example: every method is trivial. It is the way that the methods interact with each other at a "global" level that makes the program deadlock. You can't look at every method and check it off as "safe" and then expect that the whole program is safe, any more than you can conclude that because your house is made of 100% non-hollow bricks that the house is also non-hollow. The hollowness of a house is a global property of the whole thing, not an aggregate of the properties of its parts.
There is no hard and fast rule.
Here are some rules to make code thread safe in .NET and why these are not good rules:
Function and all functions it calls must be pure (no side effects) and use local variables. Although this will make your code thread-safe, there is also very little amount of interesting things you can do with this restriction in .NET.
Every function that operates on a common object must lock on a common thing. All locks must be done in same order. This will make the code thread safe, but it will be incredibly slow, and you might as well not use multiple threads.
...
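The "same order" part of the second rule can be sketched with the classic two-account transfer. This assumes every account has a unique Id that defines the lock order; Account and Transfer are hypothetical names for this illustration.

```csharp
public sealed class Account
{
    private readonly object _gate = new object();

    public int Id { get; }
    public decimal Balance { get; private set; }

    public Account(int id, decimal balance)
    {
        Id = id;
        Balance = balance;
    }

    public static void Transfer(Account from, Account to, decimal amount)
    {
        if (from.Id == to.Id)
        {
            return; // nothing to move
        }

        // Always lock the account with the lower Id first, whatever the direction
        // of the transfer, so two opposite transfers cannot wait on each other.
        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;

        lock (first._gate)
        {
            lock (second._gate)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }
}
```

If Transfer locked from and then to instead, a transfer A to B racing with a transfer B to A could deadlock, each thread holding one lock and waiting for the other.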
There is no rule that makes code thread safe. The only thing you can do is make sure your code works no matter how many threads are executing it at once, given that each thread can be interrupted at any point and each thread is in its own state and location. This applies to every function (static or otherwise) that accesses common objects.
It must be synchronized (for example with an object lock), stateless, or immutable.
link: http://docs.oracle.com/javase/tutorial/essential/concurrency/immutable.html

Are local variables within static methods thread safe?

If I have a static class with a static method, are the local variables within the method safe if multiple threads are calling it?
static class MyClass {
static int DoStuff(int n) {
int x = n; // <--- Can this be modified by another thread?
return x++;
}
}
Answers to this question which state that locals are stored on the stack and are therefore threadsafe are incomplete and perhaps dangerously wrong.
Do threads create their own scope when executing static methods?
Your question contains a common error. A "scope" in C# is purely a compile-time concept; a "scope" is a region of program text in which a particular entity (such as a variable or type) may be referred to by its unqualified name. Scopes help determine how the compiler maps a name to the concept that the name represents.
Threads do not create "scopes" at runtime because "scope" is purely a compile-time concept.
The scope of a local variable is connected -- loosely -- to its lifetime; roughly speaking, the runtime lifetime of a local variable typically begins when a thread of control enters code corresponding to the beginning of its scope, and ends when the thread of control leaves. However, the compiler and the runtime are both granted considerable discretion to lengthen or shorten that lifetime if they deem that doing so is efficient or necessary.
In particular, locals in iterator blocks and closed-over locals of anonymous functions have their lifetimes extended to beyond the point where control leaves the scope.
However, none of this has to do with thread safety. So let's abandon this badly-phrased question and move on to a somewhat better phrasing:
If I have a static class with a static method, are the instance variables within the method safe if multiple threads are calling it?
Your question contains an error. Instance variables are non-static fields. Obviously there are no non-static fields in a static class. You are confusing instance variables with local variables. The question you intended to ask is:
If I have a static class with a static method, are the local variables within the method safe if multiple threads are calling it?
Rather than answer this question directly I'm going to rephrase it into two questions that can more easily be answered.
Under what circumstances do I need to use locking or other special thread-safety techniques to ensure safe access to a variable?
You need to do so if there are two threads, both can access the variable, at least one of the threads is mutating it, and at least one of the threads is performing some non-atomic operation on it.
(I note that there may be other factors at play as well. For example, if you require consistently observed values of shared memory variables as seen from every thread then you need to use special techniques even if the operations are all atomic. C# does not guarantee sequential consistency of observations of shared memory even if the variable is marked as volatile.)
Super. Let's concentrate on that "both can access the variable" part. Under what circumstances can two threads both access a local variable?
Under typical circumstances, a local variable can only be accessed inside the method which declares it. Each method activation will create a different variable, and therefore no matter how many times the method is activated on different threads, the same variable is not accessed on two different threads.
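The typical case can be checked with a small sketch: DoStuff from the question is called from many threads at once, and each call still sees only its own x. LocalsDemo and CountMismatches are hypothetical names added for the check.

```csharp
using System.Threading;
using System.Threading.Tasks;

static class MyClass
{
    public static int DoStuff(int n)
    {
        int x = n;   // a fresh variable for every activation of the method
        return x++;  // returns the old value of x, which is n
    }
}

public static class LocalsDemo
{
    // Calls DoStuff concurrently and counts results that differ from the input.
    public static int CountMismatches(int calls)
    {
        int mismatches = 0;
        Parallel.For(0, calls, i =>
        {
            if (MyClass.DoStuff(i) != i)
            {
                Interlocked.Increment(ref mismatches);
            }
        });
        return mismatches;
    }
}
```

CountMismatches returns 0 because no activation of DoStuff can reach another activation's x.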
However, there are ways to access a local variable outside of the method which created it.
First, a "ref" or "out" parameter may be an alias to a local variable, but the CLR and the C# language have both been carefully designed so that aliases to local variables are only accessible from methods which were called by the method declaring the variable (directly or indirectly). Therefore, this should still be threadsafe; there should not be a way to get a ref from one thread to another, and thereby share the variable across threads.
Second, a local variable might be a closed-over local of a lambda or anonymous method. In this circumstance a local variable is not necessarily threadsafe. If the delegate is stored in shared memory then two threads can manipulate the local independently!
static class Foo
{
private static Func<int> f;
public static int Bar()
{
if (Foo.f == null)
{
int x = 0;
Foo.f = ()=>{ return x++; };
}
return Foo.f();
}
}
Here "Bar" has a local variable "x". If Bar is called on multiple threads, then first the threads race to determine who sets Foo.f. One of them wins. And from now on, calls to Bar on multiple threads all unsafely manipulate the same local variable x which was captured by the delegate created by the winning thread. Being a local variable is not a guarantee of thread safety.
Third, a local variable inside an iterator block has the same problem:
static class Foo
{
public static IEnumerable<int> f;
private static IEnumerable<int> Sequence()
{
int x = 0;
while(true) yield return x++;
}
public static void Bar() { Foo.f = Sequence(); }
}
If someone calls Foo.Bar() and then accesses Foo.f from two different threads, again a single local variable x can be unsafely mutated on two different threads. (And of course the mechanisms which run the iterator logic are also not threadsafe.)
Fourth, in code that is marked as "unsafe" a local variable may be shared across threads by sharing a pointer to the local across threads. If you mark a block of code as "unsafe" then you are responsible for ensuring that the code is threadsafe if it needs to be. Don't turn off the safety system unless you know what you are doing.
