Quick fix for parallel use of a non thread safe library

I'm using a library that isn't thread safe but doesn't use any other non-thread-safe external resources.
It's a bit of a pain to make it thread safe. The main reason it isn't is that it uses a lot of static fields that it reads and writes on a per-instance basis; it works just fine with many instances processed sequentially but obviously fails when multiple instances run in parallel.
I'm thinking of doing the following as a quick fix. It feels hackish, and I'm wondering if there is anything inherently wrong with the approach.
Create an app domain per thread, marshal the thread data to that domain, load the library in each domain, run as many app domains as I have threads, then unload them when done.
I understand this implies a slowdown (many library loads and lots of marshalling back and forth), but that aside, am I missing something obvious that makes this impossible?

How much of the library do you need to interact with? If it's not too huge you could build a wrapper around it. You call the wrapper's methods and it handles locking and makes thread safe calls to the inner library.
You can declare an interface, and the implementation is your wrapper class. Now in addition to solving the thread safety issue you've removed an untestable direct dependency on static methods.
It might be a little bit of work (if it's feasible, depending on the library) but it would definitely be a lot cleaner than creating app domains just to keep separate "instances" of static members.
// Class with static members
public class FooManager
{
    public static void ManageFoo(Foo foo)
    {
        // Does some non thread-safe stuff
    }
}

public interface IFooManager
{
    void ManageFoo(Foo foo);
}

public class FooManagerWrapper : IFooManager
{
    // One lock shared by all wrapper instances, because the state it guards is static
    private static readonly object FooManagerLock = new object();

    public void ManageFoo(Foo foo)
    {
        lock (FooManagerLock)
        {
            FooManager.ManageFoo(foo);
        }
    }
}
This blog post contains another example. Even if thread safety isn't an issue I don't want my code to have a direct dependency on static classes and methods (even if I wrote them.)
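To see the wrapper idea at work, here is a minimal sketch. LegacyCalc is a hypothetical stand-in for the real library (it is not from the question); it keeps scratch data in a static field, so unsynchronized parallel calls can return wrong results. ICalc, LockedCalc and CountErrors are likewise names made up for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the non-thread-safe library:
// every call writes to, then reads from, the same static field.
public static class LegacyCalc
{
    private static int scratch;

    public static int Square(int x)
    {
        scratch = x;               // written by every caller
        Thread.Yield();            // widen the race window
        return scratch * scratch;  // may read another caller's value
    }
}

public interface ICalc
{
    int Square(int x);
}

// Wrapper: a single process-wide lock serializes all calls into the library.
public sealed class LockedCalc : ICalc
{
    private static readonly object Gate = new object();

    public int Square(int x)
    {
        lock (Gate)
        {
            return LegacyCalc.Square(x);
        }
    }
}

public static class Program
{
    // Counts wrong results seen while calling calc from many threads.
    public static int CountErrors(ICalc calc, int n)
    {
        int errors = 0;
        Parallel.For(0, n, i =>
        {
            if (calc.Square(i) != i * i)
            {
                Interlocked.Increment(ref errors);
            }
        });
        return errors;
    }

    public static void Main()
    {
        Console.WriteLine(CountErrors(new LockedCalc(), 1000));  // prints 0
    }
}
```

The price is that calls into the library no longer run in parallel at all; the wrapper buys correctness, not throughput.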

Related

Is it possible to modify an object passed like a parameter to other thread with C#? [duplicate]

Is there any way in C# to put objects in another thread? All I found is how to actually execute some methods in another thread. What I actually want to do is to instantiate an object in a new thread for later use of the methods it provides.
Hope you can help me,
Russo

Objects do not really belong to a thread. If you have a reference to an object, you can access it from many threads.
This can cause problems with objects that are not designed to be accessed from many threads, like (almost all) System.Windows.Forms classes, and with access to COM objects.
If you only want to access an object from the same thread, store a reference to the thread in the object (or a wrapping object), and execute the methods via that thread.

There seems to be some confusion about how threads work here, so this is a primer (very short too, so you should find more material before venturing further into multi-threaded programming.)
Objects and memory are inherently multi-thread in the sense that all threads in a process can access them as they choose.
So objects do not have anything to do with threads.
However, code executes in a thread, and it is the thread the code executes in that you're probably after.
Unfortunately there is no way to just "put an object into a different thread" as you put it, you need to specifically start a thread and specify what code to execute in that thread. Objects used by that code can thus be "said" to belong to that thread, though that is an artificial limit you impose yourself.
So there is no way to do this:
SomeObject obj = new SomeObject();
obj.PutInThread(thatOtherThread);
obj.Method(); // this now executes in that other thread
In fact, a common misconception among programmers new to multi-threading is that if they create an object in one thread and call methods on it from another thread, all those methods execute in the thread that created the object. This is incorrect; methods always execute in the thread that called them.
So the following is also incorrect:
Thread 1:
SomeObject obj = new SomeObject();
Thread 2:
obj.Method(); // executes in Thread 1
The method here will execute in Thread 2. The only way to get the method to execute in the original thread is to cooperate with the original thread and "ask it" to execute that method. How you do that depends on the situation and there's many many ways to do this.
So to summarize what you want: You want to create a new thread, and execute code in that thread.
To do that, look at the Thread class of .NET.
But be warned: Multi-threaded applications are exceedingly hard to get correct, I would not add multi-threaded capabilities to a program unless:
That is the only way to get more performance out of it
And, you know what you're doing
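One of the many ways to "ask" another thread to execute a method is to give that thread a queue of work items that it drains in a loop. The sketch below assumes this queue-based design; WorkerThread and Invoke are hypothetical names, not an existing .NET type, and error handling is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// A thread that owns a queue of work items and runs them one by one.
public sealed class WorkerThread : IDisposable
{
    private readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();
    private readonly Thread thread;

    public WorkerThread()
    {
        thread = new Thread(() =>
        {
            // Blocks until an item arrives; the loop ends after CompleteAdding().
            foreach (Action work in queue.GetConsumingEnumerable())
            {
                work();
            }
        });
        thread.Start();
    }

    public int ThreadId
    {
        get { return thread.ManagedThreadId; }
    }

    // Ask the worker to run func and wait for its result.
    // Note: an exception thrown by func is not forwarded to the caller here.
    public T Invoke<T>(Func<T> func)
    {
        T result = default(T);
        using (var done = new ManualResetEventSlim(false))
        {
            queue.Add(() => { result = func(); done.Set(); });
            done.Wait();
        }
        return result;
    }

    public void Dispose()
    {
        queue.CompleteAdding();
        thread.Join();
        queue.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        using (var worker = new WorkerThread())
        {
            int ranOn = worker.Invoke(() => Thread.CurrentThread.ManagedThreadId);
            Console.WriteLine(ranOn == worker.ThreadId);                       // True
            Console.WriteLine(ranOn == Thread.CurrentThread.ManagedThreadId);  // False
        }
    }
}
```

The lambda passed to Invoke is created on the calling thread but executed on the worker, which is exactly the cooperation described above.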

All threads of a process share the same data (ignoring thread local storage) so there is no need to explicitly migrate objects between threads.
using System;
using System.Threading;

internal sealed class Foo
{
    private Object bar = null;

    private void CreateBarOnNewThread()
    {
        var thread = new Thread(this.CreateBar);
        thread.Start();

        // Do other stuff while the new thread
        // creates our bar.
        Console.WriteLine("Doing crazy stuff.");

        // Wait for the other thread to finish.
        thread.Join();

        // Use this.bar here...
    }

    private void CreateBar()
    {
        // Creating a bar takes a long time.
        Thread.Sleep(1000);
        this.bar = new Object();
    }
}

All threads can see the heap (each thread only has its own stack), so if a thread has a reference to the objects you need (passed in through a method, for example) then the thread can use those objects. This is why you have to be very careful when accessing objects from multiple threads, as two threads might try to change the same object at the same time.
There is a ThreadLocal<T> class in .NET that you can use to restrict variables to a specific thread: see http://msdn.microsoft.com/en-us/library/dd642243.aspx and http://www.c-sharpcorner.com/UploadFile/ddoedens/UseThreadLocals11212005053901AM/UseThreadLocals.aspx
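A minimal sketch of ThreadLocal<T>, assuming a simple per-thread counter (the names Counter and CountOnNewThread are made up for illustration). Each thread that touches Counter.Value gets its own copy, so the value set on the main thread is not seen by the new thread.

```csharp
using System;
using System.Threading;

public static class Program
{
    // Each thread that reads or writes Counter.Value gets its own int, starting at 0.
    private static readonly ThreadLocal<int> Counter = new ThreadLocal<int>(() => 0);

    public static int CountOnNewThread(int increments)
    {
        int seen = -1;
        var thread = new Thread(() =>
        {
            for (int i = 0; i < increments; i++)
            {
                Counter.Value++;
            }
            seen = Counter.Value;   // this thread's private copy
        });
        thread.Start();
        thread.Join();
        return seen;
    }

    public static void Main()
    {
        Counter.Value = 100;                      // main thread's copy
        Console.WriteLine(CountOnNewThread(5));   // 5
        Console.WriteLine(Counter.Value);         // 100
    }
}
```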

Use ParameterizedThreadStart to pass an object to your thread.
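As a rough sketch of that suggestion: the object is created on the calling thread and its reference is handed to the new thread through Thread.Start(object). The method names Fill and FillOnNewThread are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class Program
{
    // Signature required by ParameterizedThreadStart: void (object).
    private static void Fill(object state)
    {
        var list = (List<int>)state;   // the same object the caller created
        for (int i = 0; i < 3; i++)
        {
            list.Add(i);
        }
    }

    public static List<int> FillOnNewThread()
    {
        var shared = new List<int>();
        var thread = new Thread(new ParameterizedThreadStart(Fill));
        thread.Start(shared);   // hand the reference to the new thread
        thread.Join();          // wait before reading, so no lock is needed here
        return shared;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FillOnNewThread()));  // 0,1,2
    }
}
```

Because the caller only reads the list after Join returns, the two threads never touch it at the same time.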

"for later use of the methods it provides."
By using a class that contains the method to execute on the new thread along with other data and methods, the new thread gains access to that class's data and methods.
But if you call a method of the class yourself, it executes on your current thread.
Getting a method to execute on the new thread requires some thread synchronization.
System.Windows.Forms.Control.BeginInvoke does this: the control's thread waits until a request arrives.
The WaitHandle class can help you.

There's a lot of jargon around threading, but it boils down to something pretty simple.
For a simple program, you have one point of execution flowing from point a to b, one line at a time. Programming 101, right?
OK, with multithreading you now have more than one point of execution in your program. So point 1 can be in one part of your program, and point 2 can be someplace else.
It's all the same memory, data and code, but you have more than one thing happening at a time. So ask yourself: if both points enter a loop at the same time, what do you think would happen? Techniques were created either to keep that kind of issue from happening or to speed up some kind of process (counting a value vs., say, networking).
That's all it really is. It can be tricky to manage, and it's easy to get lost in the jargon and theory, but keep this in mind and it will be much simpler.
There are other exceptions to the rule as always, but this is the basics of it.

If the method that you run in a thread resides in a custom class you can have members of this class to hold the parameters.
public class Foo
{
object parameter1;
object parameter2;
public void ThreadMethod()
{
...
}
}

Sorry to duplicate some previous work, but the OP said
What I actually want to do is to instantiate an object in a new thread for later use of the methods it provides.
Let me interpret that as:
What I actually want to do is have a new thread instantiate an object so that later I can use that object's methods.
Pray correct me if I've missed the mark. Here's the example:
namespace silly
{
    public static class Program
    {
        //declared volatile so that the reference written by the creator
        //thread is visible to other threads -- For thread safety.
        public static volatile Object_w_Methods _method_provider = null;

        static void Main(string[] args)
        {
            //right now, _method_provider is null.
            System.Threading.Thread _creator_thread = new System.Threading.Thread(
                new System.Threading.ThreadStart(Create_Object));
            _creator_thread.Name = "Thread for creation of object";
            _creator_thread.Start();

            //here I can do other work while _method_provider is created.
            System.Threading.Thread.Sleep(256);

            _creator_thread.Join();

            //by now, the other thread has created the _method_provider
            //so we can use his methods in this thread, and any other thread!
            System.Console.WriteLine("I got the name!! It is: `" +
                _method_provider.Get_Name(1) + "'");
            System.Console.WriteLine("Press any key to exit...");
            System.Console.ReadKey(true);
        }

        static void Create_Object()
        {
            System.Threading.Thread.Sleep(512);
            _method_provider = new Object_w_Methods();
        }
    }

    public class Object_w_Methods
    {
        //Synchronize because it will probably be used by multiple threads,
        //even though the current implementation is thread safe.
        [System.Runtime.CompilerServices.MethodImpl(
            System.Runtime.CompilerServices.MethodImplOptions.Synchronized)]
        public string Get_Name(int id)
        {
            switch (id)
            {
                case 1:
                    return "one is the name";
                case 2:
                    return "two is the one you want";
                default:
                    return "supply the correct ID.";
            }
        }
    }
}

I'd just like to elaborate on a previous answer. To get back to the problem: objects and memory space are shared by all threads. So they are always shared, but I am assuming you want to share them safely and work with results created by another thread.
First, try one of the trusted C# patterns: Async Patterns
There are established patterns to work with that transmit basic messages and data between threads.
Usually one thread completes after it computes the results!
Live threads: nothing is foolproof when going asynchronous and sharing data between live threads.
So keep it as simple as possible if you do need to go this route, and try to follow known patterns.
Now I'd like to elaborate on why some of the known patterns have a certain structure:
EventArgs: you create a deep copy of the objects before passing them. (It is not foolproof, because certain references might still be shared.)
Passing results as basic types like int, float, etc. These can be set in a constructor and made immutable.
Atomic keywords on these types, or monitors, etc. Stick to one thread reading while the other writes.
Assuming you have complex data you'd like to work with on two threads simultaneously, here are two completely different ways to solve this, which I have not yet tested:
You could store results in a database and let the other executable read them. (Locks there occur at the row level, but you can retry or change the SQL code, and at least you will get reported deadlocks that can be solved with good design, not just hanging software!) I would only do this if it actually makes sense to store the data in a database for other reasons.
Another way that helps is to program in F#. There, objects and all types are immutable by default. So the objects you want to share should have a constructor and no methods that allow the object to be changed or basic types to be incremented.
So you create them and then they don't change! They are immutable after that.
This makes locking them and working with them in parallel so much easier. Don't go crazy with this in C# classes, because others might follow this "convention" and most things like Lists were just not designed to be immutable in C# (readonly is not the same as immutable; const is, but it is very limiting). Immutable versus readonly

Static or Non Static methods, thread safety is for the types not methods

I have had this confusion for some time: are static method implementations thread safe? Instance methods are certainly thread safe if we assign a separate instance to each thread, since then they do not interfere with each other. Then I realized that thread safety is more about types than methods, which are not in themselves a memory allocation. So let's take an example:
private static ConcurrentDictionary<int,int> cd;

public static void Method1(int userid)
{
    // Modify static object cd based on userid key
}

public void Method2(int userid)
{
    // Modify static object cd based on userid key
}
In essence there's no difference between two methods when accessed by multiple threads supplying different user ids at run time. I have tested the same but want to verify if my understanding is correct.

Static methods are not thread-safe because they are static.
They are thread-safe because someone made them thread-safe. Typically in the .NET framework, static methods are thread-safe because someone wrote them that way.
You can just as easily write non-thread-safe static members, there's no magic being done here.
The exact same rules you would have to follow to write a thread-safe instance member must be followed to write thread-safe static members.

are static method implementation threads safe?
No, if they modify shared data, then they are just as non-thread safe. Your example could be OK but only because the shared data is threadsafe itself being a ConcurrentDictionary of immutable types (ints).
instance methods are certainly thread safe, if we assign a separate instance to each thread
Well no, if an instance is accessed by a single thread then that doesn't make it threadsafe. That's just avoiding multi-thread issues.
In short static has nothing to do with multi-threading.
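To illustrate the point that the shared data decides thread safety, not the static keyword, here is a small sketch in which a static method and an instance method update the same ConcurrentDictionary. HitCounter and its members are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class HitCounter
{
    // Shared by every caller, static or instance.
    private static readonly ConcurrentDictionary<int, int> Hits =
        new ConcurrentDictionary<int, int>();

    // Static method touching the shared dictionary.
    public static void RecordStatic(int userId)
    {
        Hits.AddOrUpdate(userId, 1, (key, old) => old + 1);
    }

    // Instance method touching the same shared dictionary.
    public void RecordInstance(int userId)
    {
        Hits.AddOrUpdate(userId, 1, (key, old) => old + 1);
    }

    public static int Get(int userId)
    {
        int value;
        return Hits.TryGetValue(userId, out value) ? value : 0;
    }
}

public static class Program
{
    public static void Main()
    {
        var counter = new HitCounter();
        Parallel.For(0, 1000, i =>
        {
            HitCounter.RecordStatic(7);
            counter.RecordInstance(7);
        });
        Console.WriteLine(HitCounter.Get(7));  // 2000
    }
}
```

Both methods behave identically under contention because both rely on AddOrUpdate; swapping the field for a plain Dictionary<int, int> would make both of them unsafe in the same way.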

Thread safety has nothing to do with classes versus instances of classes. Both can be used in a manner that is unsafe for threading.
But usually objects like WinForms controls can't have their resources accessed from other threads, so they check whether you are accessing them from another thread, and you must make sure you use Invoke to run the code on the thread that owns the control...

instance methods are certainly thread safe, if we assign a separate instance to each thread
Yes. When a thread constructs an object, only that thread has a reference to it; no other thread can access the object, so no thread synchronization is required when invoking instance methods.
Thread safety does not mean synchronization.
Thread-safe means that data doesn't get corrupted if two threads attempt to access the data at the same time. Thread safety also depends on the type you are reading and writing: a single read or write of a data type that fits into a single word (int on a 32-bit processor, long on a 64-bit processor) is atomic.
Synchronization is one way to achieve thread safety; immutability of objects is another.
Back to your question: if, for example, your thread exposes the reference to the object through a static field, or passes it as a state argument to another thread's method, then synchronization is required if the threads could attempt simultaneous write access (but not for read-only access, which is different from read and write access).
So an object (static or instance, regardless of the method) that can be accessed by many threads at the same time for reading and writing should be made thread safe.
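One caveat worth sketching: even when a single read or write of an int is safe, a read-modify-write such as an increment is three steps and can lose updates. The example below, with made-up method names, contrasts a plain increment with Interlocked.Increment.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    // Unsafe: total++ is a read, an add and a write; concurrent updates can be lost.
    public static int CountUnsafe(int n)
    {
        int total = 0;
        Parallel.For(0, n, i => { total++; });
        return total;
    }

    // Safe: Interlocked.Increment performs the whole update atomically.
    public static int CountInterlocked(int n)
    {
        int total = 0;
        Parallel.For(0, n, i => { Interlocked.Increment(ref total); });
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(CountUnsafe(100000));       // may be less than 100000
        Console.WriteLine(CountInterlocked(100000));  // 100000
    }
}
```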

C# mutex through reference

I have a reasonably simple case of two threads interacting with the same data structure. The threads are hosted in their own responsible classes. Let's say these are class Alfons and class Belzebub:
class Alfons {
    public Mutex listMutex = new Mutex();

    private void ProcessListInfo()
    {
        listMutex.WaitOne();
        //
        // ... Process multi-access list stuff ...
        //
        listMutex.ReleaseMutex();
    }
}

class Belzebub {
    private Alfons mCachedAlfonsReference;

    private void ProcessListInfoDifferently()
    {
        mCachedAlfonsReference.listMutex.WaitOne();
        //
        // ... Process multi-access list stuff in a different fashion ...
        //
        mCachedAlfonsReference.listMutex.ReleaseMutex();
    }
}
My question is whether referencing a Mutex like this can create a concurrency issue, or whether it is recommended practice to do so. Is there a better way of doing this, and should I, for example, cache the mutex reference rather than accessing it through a reference?

There would be no concurrency issue - the mutex is supposed to be shared. As per the Mutex MSDN docs
This type is thread safe.
However, I'd say that the data structure itself should synchronize access coming from different threads. If the data structure doesn't support this (e.g., using SyncRoot), encapsulate it and add that feature.
Out of curiosity: which data structure are you using? You might consider using one of the System.Collections.Concurrent collections for lock-free/fine-grained locking solutions. Also, wouldn't using the lock keyword be simpler and less error-prone for your scenario?
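A minimal sketch of the "encapsulate it and add that feature" suggestion, using the lock keyword rather than a Mutex. SharedList<T> is a hypothetical wrapper written for this example; the question does not say which data structure is actually in use.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// The list and its lock live together; callers never see either directly.
public sealed class SharedList<T>
{
    private readonly object gate = new object();
    private readonly List<T> items = new List<T>();

    public void Add(T item)
    {
        lock (gate)
        {
            items.Add(item);
        }
    }

    public int Count
    {
        get
        {
            lock (gate)
            {
                return items.Count;
            }
        }
    }

    // Returns a copy so callers can enumerate without holding the lock.
    public T[] Snapshot()
    {
        lock (gate)
        {
            return items.ToArray();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var list = new SharedList<int>();
        Parallel.For(0, 1000, i => list.Add(i));
        Console.WriteLine(list.Count);  // 1000
    }
}
```

With this shape, neither class has to know about the other's lock: both simply call methods on the shared wrapper.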

Generally, since locking can be tricky and deadlocks will stop all fun, I try to reduce the code that is concerned with the mutex rather than passing it around. Otherwise it can be a headache to figure out which paths lead to a lock.
It may be better to encapsulate the resource and thread critical operations in a class and then:
lock (this)
{
}
Or see if there is a thread-safe version as suggested by dcastro.
Besides this, be very careful that there is no early exit (return, throw, etc.) between WaitOne() and ReleaseMutex(); otherwise other threads will be locked out indefinitely. A lock statement, or a finally block containing the ReleaseMutex call, is safer in this respect. As dcastro pointed out in the comments, the exception could be raised by another library.
Finally, I am assuming that it is the same resource that is being protected in ProcessListInfo() and ProcessListInfoDifferently(). If these are two different resources that are being protected, then you have extended the likelihood of unnecessary thread contention.
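The try/finally advice can be sketched as follows. RunGuarded and CanAcquireFromOtherThread are hypothetical helpers; the second one checks from a different thread because a Mutex can be re-acquired by the thread that already owns it.

```csharp
using System;
using System.Threading;

public static class Program
{
    private static readonly Mutex ListMutex = new Mutex();

    // Runs the critical section and always releases the mutex, even on an exception.
    public static bool RunGuarded(Action criticalSection)
    {
        ListMutex.WaitOne();
        try
        {
            criticalSection();
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;   // the finally block below still runs
        }
        finally
        {
            ListMutex.ReleaseMutex();
        }
    }

    // Tries to take the mutex from another thread without waiting.
    public static bool CanAcquireFromOtherThread()
    {
        bool acquired = false;
        var thread = new Thread(() =>
        {
            acquired = ListMutex.WaitOne(0);
            if (acquired)
            {
                ListMutex.ReleaseMutex();
            }
        });
        thread.Start();
        thread.Join();
        return acquired;
    }

    public static void Main()
    {
        RunGuarded(() => { throw new InvalidOperationException("boom"); });
        Console.WriteLine(CanAcquireFromOtherThread());  // True
    }
}
```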

I don't see how caching the mutex reference would make any difference, either way you are still accessing the same object through references, and if you don't do that then it defeats the point of a mutex.

Why Locking On a Public Object is a Bad Idea

Ok, I've used locks quite a bit, but I've never had this scenario before. I have two different classes that contain code used to modify the same MSAccess database:
public class DatabaseNinja
{
    public void UseSQLKatana()
    {
        //Code to execute queries against db.TableAwesome
    }
}

public class DatabasePirate
{
    public void UseSQLCutlass()
    {
        //Code to execute queries against db.TableAwesome
    }
}
This is a problem, because transactions to the database cannot be executed in parallel, and these methods (UseSQLKatana and UseSQLCutlass) are called by different threads.
In my research, I see that it is bad practice to use a public object as a lock object so how do I lock these methods so that they don't run in tandem? Is the answer simply to have these methods in the same class? (That is actually not so simple in my real code)

Well, first off, you could create a third class:
internal class ImplementationDetail
{
    private static readonly object lockme = new object();

    public static void DoDatabaseQuery(whatever)
    {
        lock(lockme)
            ReallyDoQuery(whatever);
    }
}
and now UseSQLKatana and UseSQLCutlass call ImplementationDetail.DoDatabaseQuery.
Second, you could decide to not worry about it, and lock an object that is visible to both types. The primary reason to avoid that is because it becomes difficult to reason about who is locking the object, and difficult to protect against hostile partially trusted code locking the object maliciously. If you don't care about either downside then you don't have to blindly follow the guideline.

The reason it's bad practice to lock on a public object is that you can never be sure who ELSE is locking on that object. Although unlikely, someone else someday can decide that they want to grab your lock object, and do some process that ends up calling your code, where you lock onto that same lock object, and now you have an impossible deadlock to figure out. (It's the same issue for using 'this').
A better way to do this would be to use a public Mutex object. These are much more heavyweight, but it's much easier to debug the issue.

Use a Mutex.
You can create a mutex in the main class and call its WaitOne method at the beginning of each method; the mutex is then held, so when the other method is called it will wait for the first one to finish.
And remember to release the mutex when exiting those methods...

I see two differing questions here:
Why is it a bad idea to lock on a public object?
The idea is that locking on an object restricts access while the lock is maintained - this means none of its members can be accessed, and other sources may not be aware of the lock and attempt to utilise the instance, even trying to acquire a lock themselves, hence causing problems.
For this reason, use a dedicated object instance to lock onto.
How do I lock these methods so that they don't run in tandem?
You could consider the Mutex class; creating a 'global' mutex will allow your classes to operate on the basis of knowing the state of the lock throughout the application. Or, you could use a shared ReaderWriterLockSlim instance, but I wouldn't really recommend the cross-class sharing of it.

You can use a public LOCK object as a lock object. You'll just have to specify that the object you're creating is a Lock object solely used for locking the Ninja and Pirate class.

Question about lock objects and sub classes

So, I have a base class which has a private locking object like so:
class A
{
    private object mLock = new object();

    public virtual void myMethod()
    {
        lock(mLock)
        {
            // CS
        }
    }
}
This locking object is used for most of A's operations... because they need to be thread safe.
Now, lets say I inherit from A like so:
class B : A
{
    public override void myMethod()
    {
        lock(???)
        {
            // CS of mymethod here
            // Call base (thread safe already)
            base.myMethod();
        }
    }
}
What is the convention for making B thread safe? Should B also have a private locking object like A does? What if I need to call the base method like above?
I guess I'm just curious what the convention is for making subclasses thread safe when the base class is already thread safe. Thanks!
Edit: Some are asking what I mean by "thread safe"... I'll clarify by adding that I'm trying to achieve thread safety through mutual exclusion... Only one thread at a time should be executing code which may alter the state of the object.

You would potentially expose the lock used by A:
protected object SyncLock { get { return mLock; } }
If B had its own lock object, you could easily end up with:
Different locks being used, potentially leading to race conditions. (It may be fine - if the operations occurring while holding lock B are orthogonal to those occurring while holding lock A, you may just get away with it.)
Locks being taken out recursively in different orders, leading to deadlocks. (Again, the orthogonality argument applies.)
As locks are recursive in .NET (for better or worse), if your override locks on the same object as A's implementation, it's fine to call base.myMethod and recursively acquire/release the lock.
Having said all of this, I'm keen on making most classes non-thread safe or immutable (only classes which are about threading need threading knowledge) and most classes don't need to be designed for inheritance IMO.
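A minimal sketch of the protected-lock approach, assuming a simple counter as the guarded state (the Add, Total and Calls members are invented for illustration). The override takes the lock exposed by A and then calls the base method, which re-enters the same lock.

```csharp
using System;
using System.Threading.Tasks;

public class A
{
    private readonly object mLock = new object();
    private int total;

    // Exposes the lock to subclasses only.
    protected object SyncLock
    {
        get { return mLock; }
    }

    public virtual void Add(int n)
    {
        lock (mLock)
        {
            total += n;
        }
    }

    public int Total
    {
        get { lock (mLock) { return total; } }
    }
}

public class B : A
{
    private int calls;   // extra state guarded by the same lock as A's state

    public override void Add(int n)
    {
        lock (SyncLock)
        {
            calls++;
            base.Add(n);   // re-acquires the same lock; this is allowed for the owning thread
        }
    }

    public int Calls
    {
        get { lock (SyncLock) { return calls; } }
    }
}

public static class Program
{
    public static void Main()
    {
        var b = new B();
        Parallel.For(0, 1000, i => b.Add(2));
        Console.WriteLine(b.Total);  // 2000
        Console.WriteLine(b.Calls);  // 1000
    }
}
```

Because both classes use one lock, the subclass counter and the base total always change together, and there is no second lock whose acquisition order could cause a deadlock.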

It depends, really. If your subclass only uses the safe methods from your base class and doesn't add any extra unsafe state, then you don't have to do anything (preferred). If it adds some extra state which is not correlated with the state of the base class, then you can create a separate lock object in the subclass and use that. If, on the other hand, you need to make sure that the new state and the state from the base class are changed in some sort of "transactional" way, then I would make the lock object in the base class protected and use that.

There's not really enough information to answer your question, but first things first.
Just using lock blocks does not make your code thread-safe.
All you're doing in A is making it impossible for more than one thread at a time to call any function whose body you enclose in the lock(mLock). Thread safety is a broad term, and preventing concurrent calls is just one aspect (and isn't always what you want or need). If that is what you need, then you've obviously taken the right approach. Just be sure about it.
Second, what you need to expose to your subclass is not evident from the code above. You have three scenarios:
B might call protected (or internal if it's in the same assembly as A) functions on A that are not enclosed in the lock(mLock) blocks
B will only call functions on A that are enclosed in the lock(mLock) blocks and doesn't provide any operations of its own that require you to prevent concurrent calls
B will only call functions on A that are enclosed in the lock(mLock) blocks and also provides operations of its own that require you to prevent concurrent calls.
Which really boils down into two unrelated questions:
Will B interact with A in a way that needs to be protected (in other words, in a way that isn't already protected)?
Will B expose functionality that needs to be protected, and, if so, should they use the same locking object?
If 1) is true, or 2) is true and they should use the same locking object, then you'll need to expose your locking object via a protected property and use that in all of your locks (I would suggest using it within A as well for readability).
If neither is true, then don't worry about it. If 2) is true but they don't need to use the same locking object (in other words, it would be acceptable to have concurrent calls to A and B), then don't worry about exposing the locking object. Remember that any calls that B makes into a function on A that's been protected by the lock(mLock) is going to lock on that object anyway, regardless of any outer locks.

What do you mean by "thread safe"? If this includes any ideas about reentrancy -- the idea that a method can only see the object in a good state, so only one method can be running at once -- then locking around methods may not do what you expect. For example, if your object ever calls out to another object in any way (an event?), and that object calls back in to the original object, the inner call will see the object in a "bad" state.
For this reason, when you're locking the whole object like this, you have to be careful about the code that runs inside the lock: "bad" code running inside the lock can compromise the correctness of your code.
Since subclassing introduces code that's outside the control of the original object, it's often seen as a dangerous way to work with self-locking classes like this, so one popular convention is "don't do it."
If, given the possible problems, you still want to take this route, then making the lockable object protected rather than private will allow you to include subclass operations inside the same base-class lock.

I thought the appropriate thing to do here is use lock(this), instead of creating an instance of another object just for locking. That should work for both cases.
