C# Async and sync function variable collision [duplicate]

C# Async and sync function variable collision [duplicate] - c#

If I have a local variable like so:
Increment()
{
int i = getFromDb(); // get count for a customer from db
};
And this is an instance class which gets incremented (each time a customer - an instance object - makes a purchase), is this variable thread safe? I hear that local variables are thread safe because each thread gets its own stack, etc etc.
Also, am I right in thinking that this variable is shared state? What I'm lacking in the thinking dept is that this variable will be working with different customer objects (e.g. John, Paul, etc) so is thread safe but this is flawed thinking and a bit of inexperience in concurrent programming. This sounds very naive, but then I don't have a lot of experience in concurrent coding like I do in general, synchronous coding.
EDIT: Also, the function call getFromDb() isn't part of the question and I don't expect anyone to guess on its thread safety as it's just a call to indicate the value is assigned from a function which gets data from the db. :)
EDIT 2: Also, getFromDb's thread safety is guaranteed as it only performs read operations.

i is declared as a local (method) variable, so it only normally exists in the stack-frame of Increment() - so yes, i is thread safe... (although I can't comment on getFromDb).
except if:
Increment is an iterator block (i.e. uses yield return or yield break)
i is used in an anonymous method ( delegate { i = i + 1;} ) or lambda (foo => {i=i+foo;}
In the above two scenarios, there are some cases when it can be exposed outside the stack. But I doubt you are doing either.
Note that fields (variables on the class) are not thread-safe, as they are trivially exposed to other threads. This is even more noticeable with static fields, since all threads automatically share the same field (except for thread-static fields).

Your statement has two separate parts - a function call, and an assignment.
The assignment is thread safe, because the variable is local. Every different invocation of this method will get its own version of the local variable, each stored in a different stack frame in a different place in memory.
The call to getFromDb() may or may not be threadsafe - depending on its implementation.

As long as the variable is local to the method it is thread-safe. If it was a static variable, then it wouldn't be by default.
class Example
{
static int var1; //not thread-safe
public void Method1()
{ int var2; //thread-safe
}
}

While your int i is thread safe, your whole case is case is probably not thread safe. Your int i is, as you said, thread safe because each thread has its own stack trace and therefore each thread has his own i. However, your thread all share the same database, therefore your database accesses are not thread safe. You need to properly synchronize your database accesses to make sure that each thread will see the database only at the correct moment.
As usual with concurrency and multithreading, you do not need to synchronize on your DB if you only read information. You do need to synchronize as soon as two thread will try to read/write the same set of informations from your DB.

i will be "thread safe" as each thread will have it's own copy of i on the stack as you suggest. The real question will be are the contents of getFromDb() thread safe?

i is a local variable so it's not shared state.
If your getFromDb() is reading from, say, an oracle sequence or a sql server autoincrement field then the db is taking care of synchronization (in most scenarios, excluding replication/distributed DBs) so you can probably safely return the result to any calling thread. That is, the DB is guaranteeing that every getFromDB() call will get a different value.
Thread-safety is usually a little bit of work - changing the type of a variable will rarely get you thread safety since it depends on how your threads will access the data. You can save yourself some headache by reworking your algorithm so that it uses a queue that all consumers synchronize against instead of trying to orchestrate a series of locks/monitors. Or better yet make the algorithm lock-free if possible.

i is thread safe syntactically. But when you assign instance variable's values,instance method return value to i, then the shared data is manipulated by multiple threads.

Related

Is it possible to modify an object passed like a parameter to other thread with C#? [duplicate]

is there any way in c# to put objects in another thread? All I found is how to actually execute some methods in another thread. What I actually want to do is to instanciate an object in a new thread for later use of the methods it provides.
Hope you can help me,
Russo

Objects do not really belong to a thread. If you have a reference to an object, you can access it from many threads.
This can give problems with object that are not designed to be accessed from many threads, like (almost all) System.Windows.Forms classes, and access to COM objects.
If you only want to access an object from the same thread, store a reference to the thread in the object (or a wrapping object), and execute the methods via that thread.

There seems to be some confusion about how threads work here, so this is a primer (very short too, so you should find more material before venturing further into multi-threaded programming.)
Objects and memory are inherently multi-thread in the sense that all threads in a process can access them as they choose.
So objects do not have anything to do with threads.
However, code executes in a thread, and it is the thread the code executes in that you're probably after.
Unfortunately there is no way to just "put an object into a different thread" as you put it, you need to specifically start a thread and specify what code to execute in that thread. Objects used by that code can thus be "said" to belong to that thread, though that is an artificial limit you impose yourself.
So there is no way to do this:
SomeObject obj = new SomeObject();
obj.PutInThread(thatOtherThread);
obj.Method(); // this now executes in that other thread
In fact, a common trap many new multi-thread programmers fall into is that if they create an object in one thread, and call methods on it from another thread, all those methods execute in the thread that created the object. This is incorrect, methods always executes in the thread that called them.
So the following is also incorrect:
Thread 1:
SomeObject obj = new SomeObject();
Thread 2:
obj.Method(); // executes in Thread 1
The method here will execute in Thread 2. The only way to get the method to execute in the original thread is to cooperate with the original thread and "ask it" to execute that method. How you do that depends on the situation and there's many many ways to do this.
So to summarize what you want: You want to create a new thread, and execute code in that thread.
To do that, look at the Thread class of .NET.
But be warned: Multi-threaded applications are exceedingly hard to get correct, I would not add multi-threaded capabilities to a program unless:
That is the only way to get more performance out of it
And, you know what you're doing

All threads of a process share the same data (ignoring thread local storage) so there is no need to explicitly migrate objects between threads.
internal sealed class Foo
{
private Object bar = null;
private void CreateBarOnNewThread()
{
var thread = new Thread(this.CreateBar);
thread.Start();
// Do other stuff while the new thread
// creates our bar.
Console.WriteLine("Doing crazy stuff.");
// Wait for the other thread to finish.
thread.Join();
// Use this.bar here...
}
private void CreateBar()
{
// Creating a bar takes a long time.
Thread.Sleep(1000);
this.bar = new Object();
}
}

All threads can see the stack heap, so if the thread has a reference to the objects you need (passed in through a method, for example) then the thread can use those objects. This is why you have to be very careful accessing objects when multi-threading, as two threads might try and change the object at the same time.
There is a ThreadLocal<T> class in .NET that you can use to restrict variables to a specific thread: see http://msdn.microsoft.com/en-us/library/dd642243.aspx and http://www.c-sharpcorner.com/UploadFile/ddoedens/UseThreadLocals11212005053901AM/UseThreadLocals.aspx

Use ParameterizedThreadStart to pass an object to your thread.

"for later use of the methods it provides."
Using a class that contains method to execute on new thread and other data and methods, you can gain access from your thread to Data and methods from the new thread.
But ... if your execute a method from the class, you are executing on current thread.
To execute the method on the new thread needs some Thread syncronization.
System.Windows.Forms.Control.BeginInvoke do it, the Control thread is waiting until a request arrives.
WaitHandle class can help you.

There's a lot of jargon around threading, But it boils down something pretty simple.
For a simple program, you have one point of execution flowing from point a to b, one line at a time. Programming 101, right?
Ok, for multithreading, You now have more then one point of execution in your program. So, point 1 can be in one part of your program, and point 2 can be someplace else.
It's all the same memory, data and code, but you have more then one thing happening at a time. So, you can think, what happens of both points enter a loop at the same time, what do you think would happen? So techniques were created to keep that kind of issue either from happening, or to speed up some kind of process. (counting a value vs. say, networking.)
That's all it really is. It can be tricky to manage, and and it's easy to get lost in the jargon and theory, but keep this in mind and it will be much simpler.
There are other exceptions to the rule as always, but this is the basics of it.

If the method that you run in a thread resides in a custom class you can have members of this class to hold the parameters.
public class Foo
{
object parameter1;
object parameter2;
public void ThreadMethod()
{
...
}
}

Sorry to duplicate some previous work, but the OP said
What I actually want to do is to instanciate an object in a new thread for later use of the methods it provides.
Let me interpret that as:
What I actually want to do is have a new thread instantiate an object so that later I can use that object's methods.
Pray correct me if I've missed the mark. Here's the example:
namespace silly
{
public static class Program
{
//declared volatile to make sure the object is in a consistent state
//between thread usages -- For thread safety.
public static volatile Object_w_Methods _method_provider = null;
static void Main(string[] args)
{
//right now, _method_provider is null.
System.Threading.Thread _creator_thread = new System.Threading.Thread(
new System.Threading.ThreadStart(Create_Object));
_creator_thread.Name = "Thread for creation of object";
_creator_thread.Start();
//here I can do other work while _method_provider is created.
System.Threading.Thread.Sleep(256);
_creator_thread.Join();
//by now, the other thread has created the _method_provider
//so we can use his methods in this thread, and any other thread!
System.Console.WriteLine("I got the name!! It is: `" +
_method_provider.Get_Name(1) + "'");
System.Console.WriteLine("Press any key to exit...");
System.Console.ReadKey(true);
}
static void Create_Object()
{
System.Threading.Thread.Sleep(512);
_method_provider = new Object_w_Methods();
}
}
public class Object_w_Methods
{
//Synchronize because it will probably be used by multiple threads,
//even though the current implementation is thread safe.
[System.Runtime.CompilerServices.MethodImpl(
System.Runtime.CompilerServices.MethodImplOptions.Synchronized)]
public string Get_Name(int id)
{
switch (id)
{
case 1:
return "one is the name";
case 2:
return "two is the one you want";
default:
return "supply the correct ID.";
}}}}

Just like to elaborate on a previous answer. To get back to the problem, objects and memory space are shared by all threads. So they are always shared, but I am assuming you want to do so safely and work with results created by another thread.
Firstly try one of the trusted C# patterns. Async Patterns
There are set patterns to work with, that do transmit basic messages and data between threads.
Usually the one threat completes after it computes the results!
Life threats: Nothing is fool proof when going asynchronous and sharing data on life threats.
So basically keep it as simple as possible if you do need to go this route and try follow known patterns.
So now I just like to elaborate why some of the known patters have a certain structure:
Eventargs: where you create a deepcopy of the objects before passing it. (It is not foolproof because certain references might still be shared . )
Passing results with basic types like int floats, etc, These can be created on a constructor and made immutable.
Atomic key words one these types, or create monitors etc.. Stick to one thread reads the other writes.
Assuming you have complex data you like to work with on two threads simultaneously a completely different ways to solve this , which I have not yet tested:
You could store results in database and let the other executable read it. ( There locks occur on a row level but you can try again or change the SQL code and at least you will get reported deadlocks that can be solved with good design, not just hanging software!!) I would only do this if it actually makes sense to store the data in a database for other reasons.
Another way that helps is to program F# . There objects and all types are immutable by default/ So your objects you want to share should have a constructor and no methods allow the object to get changed or basic types to get incremented.
So you create them and then they don't change! So they are non mutable after that.
Makes locking them and working with them in parallel so much easier. Don't go crazy with this in C# classes because others might follow this "convention' and most things like Lists were just not designed to be immutable in C# ( readonly is not the same as immutable, const is but it is very limiting). Immutable versus readonly

Lock scope in C#: is returned object still "locked"?

Assuming I have an object A containing
// ...
private List<double> someList = new List<double>();
// ...
public List<double> SomeList
{
get { lock (this) { return someList; } }
}
// ...
would it be thread safe to perform operation on the list as in the code below. Knowing that several operations could be executed simultaneously by different threads.
A.SomeList.Add(2.0);
or
A.SomeList.RemoveAt(0);
In other words, when is the lock released?

There is no thread safety here.
The lock is released as soon as the block it protects is finished, just before the property returns, so the calls to Add ad RemoveAt are not protected by the lock.

The lock you shown in the question isn't of much use.
To make list operations thread safe you need to implement your own Add/Remove/etc methods wrapping those of the list.
public void Add(double item)
{
lock(_list)
{
_list.Add(item);
}
}
Also, it's a good idea to hide the list itself from the consumers of your class, i.e. make the field private.

The lock is released when you exit the body of the lock statement. That means that your code is not thread-safe.
In other words, you can be sure that two threads won't be executing return someList on the same object at the same time. But it's certainly possible that one thread will execute Add() at the same time as another thread will execute RemoveAt(), which is what makes it non thread-safe.

The lock is released when the code inside the lock is finished executing.
Also, locking on this will only affect the current instance of the object

Ok, just for the hell of it.
There IS a way to make your object threadsafe, by using architectures that are already threadsafe.
For example, you could make your object a single threaded COM object. The COM object will be thread safe, but you'll pay with performance (the price of being lazy and not managing your own locks).
Create a COM Object in C#

...others said already, but just to formalize a problem a bit...
First, lock (this) {...} suggests a 'scope' - e.g. like using (){} - it only locks (this in this case) for variables inside. And that's a 'good thing' :) actually, as if you couldn't rely on that the whole locks/synchronization concept would be very much useless,
lock (this) { return something; } is an oxymoron of a sort - it's returning something that unlocks the very same moment it returns,
And the problems I think is the understanding of how it works. 'lock()' is not 'persisted' in a state of the object, so that you could return it etc. Take a look here how it's implemented How does lock work exactly? - answer explains it. It's more of a 'critical section' - i.e. you protect certain parts of 'code', which uses the variable - not the variable itself. Any type of synchronization requires 'synchronization objects' to hold the locks - and to be disposed of once lock is no longer needed. Take a look at this post https://stackoverflow.com/a/251668/417747, Esteban formulated that very well,
"Finally, there is the common misconception that lock(this) actually modifies the object passed as a parameter, and in some way makes it read-only or inaccessible. This is false. The object passed as a parameter to lock merely serves as a key" (this is a quote)
you either (usually) lock 'private code' of a class method, property... - to synchronize access to something you're doing inside - and access being to a private member (again usually) so that nobody else can access it w/o going through your synchronized code.
Or you make a thread-safe 'structure' - like a list - which is already 'synchronized' inside so that you can access it in a thread-safe manner. But there are no such things (or not used, almost never) as passing locks around or having one place lock the variable, while the other part of the code unlocks it etc. (in that case it's some type of EventWaitHandle that's rather used to synchronize things in between 'distant' code where one fires off on another etc.)
In your case, the choice is I think to go with the 'synchronized structure', i.e. the list that's internally handled,

What Makes a Method Thread-safe? What are the rules?

Are there overall rules/guidelines for what makes a method thread-safe? I understand that there are probably a million one-off situations, but what about in general? Is it this simple?
If a method only accesses local variables, it's thread safe.
Is that it? Does that apply for static methods as well?
One answer, provided by #Cybis, was:
Local variables cannot be shared among threads because each thread gets its own stack.
Is that the case for static methods as well?
If a method is passed a reference object, does that break thread safety? I have done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method? If so, what are they?"
EDIT
A lot of good points have been made here. I think the real answer to this question is: "There are no simple rules to ensure thread safety." Cool. Fine. But in general I think the accepted answer provides a good, short summary. There are always exceptions. So be it. I can live with that.

If a method (instance or static) only references variables scoped within that method then it is thread safe because each thread has its own stack:
In this instance, multiple threads could call ThreadSafeMethod concurrently without issue.
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number; // each thread will have its own variable for number.
number = parameter1.Length;
return number;
}
}
This is also true if the method calls other class method which only reference locally scoped variables:
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number;
number = this.GetLength(parameter1);
return number;
}
private int GetLength(string value)
{
int length = value.Length;
return length;
}
}
If a method accesses any (object state) properties or fields (instance or static) then you need to use locks to ensure that the values are not modified by a different thread:
public class Thing
{
private string someValue; // all threads will read and write to this same field value
public int NonThreadSafeMethod(string parameter1)
{
this.someValue = parameter1;
int number;
// Since access to someValue is not synchronised by the class, a separate thread
// could have changed its value between this thread setting its value at the start
// of the method and this line reading its value.
number = this.someValue.Length;
return number;
}
}
You should be aware that any parameters passed in to the method which are not either a struct or immutable could be mutated by another thread outside the scope of the method.
To ensure proper concurrency you need to use locking.
for further information see lock statement C# reference and ReadWriterLockSlim.
lock is mostly useful for providing one at a time functionality,
ReadWriterLockSlim is useful if you need multiple readers and single writers.

If a method only accesses local variables, it's thread safe. Is that it?
Absolultely not. You can write a program with only a single local variable accessed from a single thread that is nevertheless not threadsafe:
https://stackoverflow.com/a/8883117/88656
Does that apply for static methods as well?
Absolutely not.
One answer, provided by #Cybis, was: "Local variables cannot be shared among threads because each thread gets its own stack."
Absolutely not. The distinguishing characteristic of a local variable is that it is only visible from within the local scope, not that it is allocated on the temporary pool. It is perfectly legal and possible to access the same local variable from two different threads. You can do so by using anonymous methods, lambdas, iterator blocks or async methods.
Is that the case for static methods as well?
Absolutely not.
If a method is passed a reference object, does that break thread safety?
Maybe.
I've done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
You are going to have to learn to live with disappointment. This is a very difficult subject.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method?
Nope. As you saw from my example earlier an empty method can be non-thread-safe. You might as well ask "is there a short list of rules that ensures a method is correct". No, there is not. Thread safety is nothing more than an extremely complicated kind of correctness.
Moreover, the fact that you are asking the question indicates your fundamental misunderstanding about thread safety. Thread safety is a global, not a local property of a program. The reason why it is so hard to get right is because you must have a complete knowledge of the threading behaviour of the entire program in order to ensure its safety.
Again, look at my example: every method is trivial. It is the way that the methods interact with each other at a "global" level that makes the program deadlock. You can't look at every method and check it off as "safe" and then expect that the whole program is safe, any more than you can conclude that because your house is made of 100% non-hollow bricks that the house is also non-hollow. The hollowness of a house is a global property of the whole thing, not an aggregate of the properties of its parts.

There is no hard and fast rule.
Here are some rules to make code thread safe in .NET and why these are not good rules:
Function and all functions it calls must be pure (no side effects) and use local variables. Although this will make your code thread-safe, there is also very little amount of interesting things you can do with this restriction in .NET.
Every function that operates on a common object must lock on a common thing. All locks must be done in same order. This will make the code thread safe, but it will be incredibly slow, and you might as well not use multiple threads.
...
There is no rule that makes the code thread safe, the only thing you can do is make sure that your code will work no matter how many times is it being actively executed, each thread can be interrupted at any point, with each thread being in its own state/location, and this for each function (static or otherwise) that is accessing common objects.

It must be synchronized, using an object lock, stateless, or immutable.
link: http://docs.oracle.com/javase/tutorial/essential/concurrency/immutable.html

What is exactly a "thread-safe type"? When do we need to use the "lock" statement?

I read all documentation about thread-safe types and the "lock" statement, but I am still not getting it 100%.
When exactly do I need to use the "lock" statement? How it relates to (non) thread-safe types? Thank you.

Imagine an instance of a class with a global variable in it. Imagine two threads call a method on that object at exactly the same time, and that method updates the global variable inside.
The likelihood is that value in the variable will get corrupted. Different languages and compilers/interpreters will deal with this in different ways (or not at all...) but the point is that you get "undesired" and "unpredictable" results.
Now imagine that the method obtains a "lock" on the variable before attempting to read from or write to it. The first thread to call the method will get a "lock" on the variable, the second thread to call the method will have to wait until the lock is released by the first thread. While you still have a race condition (i.e. the second thread might overwrite the value from the first) at least you have predictable results because no two threads (that are unaware of each other) can modify the value at the same time.
You use the lock statement to obtain that lock on the variable. Typically you'd define a separate object variable and use that for the lock object:
public class MyThreadSafeClass
{
private readonly object lockObject = new object();
private string mySharedString;
public void ThreadSafeMethod(string newValue)
{
lock (lockObject)
{
// Once one thread has got inside this lock statement, any others will have to wait outside for their turn...
mySharedString = newValue;
}
}
}
A type is deemed "thread-safe" if it applies the principle that no corruption will occur if shared data is accessed by multiple threads at the same time.
Beware the difference between "immutable" and "thread-safe". Thread-safe says that you have coded for the scenario and won't get corruption if two threads access shared state at the same time, whereas immutability is simply saying you return a new object rather than modifying it. Immutable objects are thread-safe, but not all thread-safe objects are immutable.

Thread safe code means code that can be accessed with many threads and still operate correctly.
In C#, this normally requires some sort of synchronization mechanism. A simple one is the lock statement (which is behind the scenes a call to Monitor.Enter). A code block that is surrounded by a lock block can only be accessed by one thread at a time.
Any use of a type that is not thread safe requires you to manage synchronization yourself.
A good resource to learn about threading in C# is the free eBook by Joe Albahari, found here.

http://en.wikipedia.org/wiki/Thread_safety

In C# 2.0+, is it necessary to lock around a closure that another thread will execute?

In C# 2.0+, is it necessary to lock around a closure that another thread will execute? Specifically, in the example below, is the locking necessary to ensure that the Maint thread flushes its value of x to shared memory and that thread t reads its value of x from shared memory?
I think it is, but if I'm wrong, please reference e.g. an authoritative (e.g., MSDN) article indicating why not. Thanks!
delegate void Foo();
public static void Main()
{
Foo foo = null;
object guard = new object();
int x = 1;
lock (guard)
{
foo = () =>
{
int temp;
lock (guard) temp = x;
Console.WriteLine(temp);
};
}
Thread t = new Thread(() => foo());
t.Start();
t.Join();
}
Edit: Clarified that I want to know for C# 2.0+, which is to say that the .NET 2.0+'s stronger memory model (than ECMA 335 CLI) is in effect.

Calling any constructor has release semantics in the .NET memory model. (Not in the CLI memory model, but in the .NET one.) So by calling the Thread constructor - and the delegate constructor itself - I believe you're okay. If you set x to 1 after the final constructor I'd be less sure. (EDIT: It's possible that the constructor business is more to do with the new Java memory model than .NET, given the later stuff about writes...)
In fact, I have a suspicion that all writes have release semantics (again, in the implemented .NET memory model) but not all reads have acquire semantics. In other words, the data will always be available to other threads, but it might not be used by those threads. In this case the new thread won't "have" the old value of the variable - there's no way it could have logically read the value before it started, so you're safe on that front.
I can check up on the "all writes" side of things - I suspect there's something in Joe Duffy's blog or book about it.
EDIT: From "Concurrent Programming on Windows" P516:
The major difference in the stronger
2.0+ model is that it prevents stores from being reordered.
I believe that's enough. I suspect that there are very few people on Stack Overflow who are capable of saying for sure - I wouldn't fully trust many people who aren't in the MS CLR or concurrency teams, except maybe Jeff Richter.

Unless more than 1 thread will execute it possibly the same time, no locking is needed.

You need two things:
The assignment should be atomic (not interrupted by another thread that modifies the same data).
The threads need to pick up the current value if it has been changed in another thread.
does not require locking in case of an int (see ECMA 344 C# specification, look for atomic).
is not guaranteed by locking. Use volatile to ensure that (see 17.4.3. in the same document).
Summarising: you do not need the lock, but you may need volatile.
EDIT (See remarks): volatile is not possible on a local variable (x in this case). On a local variable, I don't know of anything that will guarantee that multiple reads of x (in the same thread) will pick up changes that were made in other threads. In the provided code only one read is present, but I guess that this is a simplification.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.