I would like to use exactly one of a set of resources from a multi-threaded C# application. These resources are not thread-safe, so some lock or mutex must guard them. How should I do it?
I would like to get something like the following pseudo code:
lockAny([obj1, obj2, obj3]) {
achievedLock = getAchievedLock(); // returns e. g. obj2
myResource = getResourceForLock(achievedLock); // some function written by me, looks up the resource belonging to the particular lock
myResource.DoSomething();
}
The best way to do this is with some sort of pooling. A pool object can attempt to ensure that only 1 thing has a reference at a time (but .NET can't guarantee that like Rust).
.NET more recently added ObjectPool<T> in the Microsoft.Extensions.ObjectPool package. You can configure how it creates the items through a pooled-object policy.
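For a fixed set of resources where a caller should wait until one becomes free, one possible alternative is a small blocking pool. The sketch below is only an illustration built on BlockingCollection<T>; the FixedResourcePool<T> class and its Use method are hypothetical names, not part of any library, and it is not the ObjectPool<T> API mentioned above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical fixed-size pool: each resource is handed to one caller at a time.
public sealed class FixedResourcePool<T>
{
    private readonly BlockingCollection<T> available;

    public FixedResourcePool(IEnumerable<T> resources)
    {
        // A FIFO queue of the free resources, wrapped so that Take() can block.
        available = new BlockingCollection<T>(new ConcurrentQueue<T>(resources));
    }

    // Blocks until some resource is free, runs the action with it,
    // and always puts the resource back, even if the action throws.
    public void Use(Action<T> action)
    {
        T resource = available.Take();
        try
        {
            action(resource);
        }
        finally
        {
            available.Add(resource);
        }
    }
}
```

A caller would write something like pool.Use(r => r.DoSomething()); the lambda plays the role of the body of the lockAny block in the question.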
Related
I have an API that is an endpoint for geographic coordinate requests: users can search for specific locations in their area, and at the same time new locations can be added. To make queries as fast as possible, I thought I would make the R-tree immutable. That is, there are no locks within the R-tree, since several threads can read at the same time without a race condition. Updates are collected, and once e.g. 100 updates have accumulated, I want to create a new R-tree and replace the old one. My question is how best to do this.
I have a SearchService, which is registered as a singleton and holds an R-tree as a private instance field.
In my Startup.cs
services.AddSingleton<ISearchService, SearchService>();
ISearchService.cs
public interface ISearchService
{
IEnumerable<GeoLocation> Get(RTreeQuery query);
void Update(IEnumerable<GeoLocation> data);
}
SearchService.cs
public class SearchService : ISearchService
{
private RTree rTree;
public IEnumerable<GeoLocation> Get(RTreeQuery query)
{
return rTree.Get(query);
}
public void Update(IEnumerable<GeoLocation> data)
{
var newTree = new RTree(data);
Interlocked.Exchange<RTree>(ref rTree, newTree);
}
}
My question is: if I exchange the reference with Interlocked.Exchange(), the operation is atomic and there should be no race condition. But what happens if threads are still using the old instance to process their requests? Could the garbage collector delete the old instance while threads still access it? After all, the service no longer holds a reference to the old instance.
I am relatively new to this topic, so any help is welcome. Thanks for your support!
Reads and writes of references are atomic, which means a thread will never observe a torn (partially written) reference. However, the reference it reads could be stale.
Section 12.6.6 of the CLI specs
Unless explicit layout control (see Partition II (Controlling Instance
Layout)) is used to alter the default behavior, data elements no
larger than the natural word size (the size of a native int) shall be
properly aligned. Object references shall be treated as though they
are stored in the native word size.
With regard to the GC, your trees are safe from garbage collection while they are running Get: a thread executing Get still holds a reference to the tree it started with, and that reference keeps the old instance alive until the call finishes.
So in summary, your methods are thread-safe as far as reference atomicity goes. You can also call Update and safely overwrite the reference; a plain assignment would do, so there is no need for Interlocked.Exchange. The worst that can happen with your current implementation is that a reader gets a stale tree, which you have mentioned is not an issue.
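The swap-the-reference pattern discussed above can be sketched as a small holder class. This is a minimal illustration, not the code from the question: SnapshotHolder<T> is a hypothetical name, and the use of Volatile.Read/Volatile.Write is my own addition to limit how long a reader might keep seeing a stale reference; a plain field read and write would be equally atomic.

```csharp
using System.Threading;

// Hypothetical holder for an immutable snapshot (stands in for the rTree field).
public sealed class SnapshotHolder<T> where T : class
{
    private T current;

    public SnapshotHolder(T initial)
    {
        current = initial;
    }

    // A reader copies the reference once and keeps using that instance
    // for the whole request, even if a newer one is published meanwhile.
    public T Read()
    {
        return Volatile.Read(ref current);
    }

    // A writer publishes a fully built replacement. Readers that still hold
    // the old instance keep it alive until they are done with it.
    public void Publish(T replacement)
    {
        Volatile.Write(ref current, replacement);
    }
}
```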
I am tasked with writing a system to process result files created by a different process (which I have no control over), and I am trying to modify my code to make use of Parallel.ForEach. The code works fine with a plain foreach, but I have some concerns about thread safety when using the parallel version. The base question I need answered here is: "Is the way I am doing this going to guarantee thread safety?" Or is this going to cause everything to go sideways on me?
I have tried to make sure all calls are to instances and have removed every static member except the initial static void Main. It is my current understanding that this will do a lot towards assuring thread safety.
I have basically the following, edited for brevity
static void Main(string[] args)
{
MyProcess process = new MyProcess();
process.DoThings();
}
And then in the actual process to do stuff I have
public class MyProcess
{
public void DoThings()
{
//Get some list of things
List<Thing> things = getThings();
Parallel.ForEach(things, item => {
//based on some criteria, take actions from MyActionClass
MyActionClass myAct = new MyActionClass(item);
string tempstring = myAct.DoOneThing();
if(somecondition)
{
myAct.DoOtherThing();
}
//...other similar calls to myAct below here
});
}
}
And over in the MyActionClass I have something like the following:
public class MyActionClass
{
private Thing _thing;
public MyActionClass(Thing item)
{
_thing = item;
}
public string DoOneThing()
{
return _thing.GetSubThings().FirstOrDefault();
}
public void DoOtherThing()
{
_thing.property1 = "Somenewvalue";
}
}
If I can explain this any better I'll try, but I think that's the basics of my needs
EDIT:
Something else I just noticed. If I change the value of a property of the item I'm working with while inside the Parallel.ForEach (in this case, a string value that gets written to a database inside the loop), will that have any effect on the rest of the loop iterations or just the one I'm on? Would it be better to create a new instance of Thing inside the loop to store the item I'm working with in this case?
There is no shared mutable state between actions in the Parallel.ForEach that I can see, so it should be thread-safe, because each object is touched by at most one thread at a time.
But that only covers what is visible here. It doesn't mean that everything in the actual code is as good as it seems in this excerpt.
Nor does it mean that you or a coworker will never make a change that turns some state both shared and mutable (in the Thing, for example). At that point you start getting difficult-to-reproduce crashes at best, or plain wrong behaviour at worst, which can go undetected for a long time.
So, perhaps you should try to go fully immutable near threading code?
Perhaps.
Immutability is good, but it is not a silver bullet: it is not always easy to use and implement, and not every task can be reasonably expressed through immutable objects. Even with it, an accidental "make shared and mutable" change may still happen, though it is much less likely.
It should at least be considered as a possible option/alternative.
About the EDIT
If I change the value of a property of the item I'm working with while
inside the Parallel.ForEach (in this case, a string value that gets
written to a database inside the loop), will that have any effect on
the rest of the loop iterations or just the one I'm on?
If you change a property and that object is not used anywhere else, and it doesn't rely on some global mutable state (for example, sort of a public static Int32 ChangesCount that increments with each state change), then you should be safe.
a string value that gets written to a database inside the loop - depending on the data access technology and how you use it, you may be in trouble, because most of them are not designed for multithreaded use; EF's DbContext, for example, is not thread-safe. And do not forget that dealing with concurrent access in a database is not always easy, though that is a bit away from our original theme.
Would it be better to create a new instance of Thing inside the loop to store the item I'm working with in this case - if there is no risk of external concurrent changes, then it is just unnecessary work. And if there is a chance of other threads (outside the Parallel.ForEach) making changes to the objects being persisted, then you already have bigger problems than Parallel.ForEach.
Objects should always have an observable consistent state (unlike when half of the properties are set by one thread and half by another while you try to persist that who-knows-what), and if they are used by many threads, they should already be thread-safe: there should be no way to put them into an inconsistent state.
And if they want to be persisted by external code, such objects should probably provide:
Either a SyncRoot property that external code can lock on while reading the properties.
Or some current-state snapshot DTO that is created internally by a thread-safe method like ThingSnapshot Thing.GetCurrentData() { lock() {} }.
Or something more exotic.
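The snapshot-DTO option from the list above might look roughly like this. It is a sketch under assumptions: the real Thing class is not shown in the question, so the property1 field, the revision counter and the SetProperty1 method are hypothetical, chosen only to show two fields that must stay consistent with each other.

```csharp
// Hypothetical immutable copy of a Thing's state at one moment in time.
public sealed class ThingSnapshot
{
    public ThingSnapshot(string property1, int revision)
    {
        Property1 = property1;
        Revision = revision;
    }

    public string Property1 { get; }
    public int Revision { get; }
}

public sealed class Thing
{
    private readonly object sync = new object();
    private string property1 = "";
    private int revision;

    // Both fields change under the same lock, so no reader sees one without the other.
    public void SetProperty1(string value)
    {
        lock (sync)
        {
            property1 = value;
            revision++;
        }
    }

    // Persisting code works on the returned copy and never touches live state.
    public ThingSnapshot GetCurrentData()
    {
        lock (sync)
        {
            return new ThingSnapshot(property1, revision);
        }
    }
}
```

The design choice here is that the lock stays private: callers cannot forget to take it, because the only way to read the state is through GetCurrentData.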
I have the following setup in an aspx page:
Object Original = obj;
System.Threading.Thread thread = new System.Threading.Thread(() => saveOriginalDetails(Original));
thread.Start();
The basic idea of this is that I have an object, and I want to save it exactly how it is before making any changes to it.
So I make a copy of the original object obj and store it as Original
I am starting a new thread because the saveOriginalDetails method is slowing the code down too much.
My question is, if I do this instead:
System.Threading.Thread thread = new System.Threading.Thread(() => saveOriginalDetails(obj));
thread.Start();
obj.name = "NewName";
Where I am now passing in the original object, and copy it inside the method that is running concurrently, like this:
private void saveOriginalDetails(object applicant)
{
object OriginalApplicant = applicant;
.....
}
Will the object passed in to the method:
saveOriginalDetails(obj);
have the updated name value, e.g. a name of "NewName"?
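First, don't use Thread. Use the newer Task classes instead (if possible - you didn't specify which .NET version you are using).
Secondly, you're only passing a reference to saveOriginalDetails. Assuming obj is a class instance, assigning it to another variable does not copy it, so whether the method sees "NewName" depends on whether the new thread reads the name before or after the assignment.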
Lastly, if your class is a model class (it sort of looks this way) and is serializable, you can relatively quickly create a perfect copy by serializing it and deserializing it (which has the benefit of working with any future changes you might make to your class). A faster-working solution (which, however, would require more actual programming work) would be to write your own code for cloning your class. That said, unless your class is really really large and complex, serializing it and deserializing it, while not the most optimal solution, should be fast enough.
Finally, unless you have an actual business need to store a copy of the DuoApplicant object, an in-memory copy, as described above, should suffice.
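The serialize-and-deserialize copy described above could be sketched as follows. This is only an illustration: the Applicant class is a hypothetical stand-in for the model class in the question, and System.Text.Json is my assumption for the serializer (it requires .NET Core 3.0 or later, or the NuGet package); any serializer the class already supports would work the same way.

```csharp
using System.Text.Json;

// Hypothetical model class; the real one from the question is not shown.
public class Applicant
{
    public string Name { get; set; }
}

public static class Cloner
{
    // Round-trips the object through JSON. Only public, serializable
    // properties are copied, and the result shares nothing with the source.
    public static T DeepCopy<T>(T source)
    {
        string json = JsonSerializer.Serialize(source);
        return JsonSerializer.Deserialize<T>(json);
    }
}
```

The important part is the order: take the copy on the calling thread first, for example var original = Cloner.DeepCopy(obj);, hand original to the background work, and only then change obj.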
Suppose I have a method like this:
public void MyCoolMethod(ref bool scannerEnabled)
{
try
{
CallDangerousMethod();
}
catch (FormatException exp)
{
try
{
//Disable scanner before validation.
scannerEnabled = false;
if (exp.Message == "FormatException")
{
MessageBox.Show(exp.Message);
}
}
finally
{
//Enable scanner after validation.
scannerEnabled = true;
}
}
}
And it is used like this:
MyCoolMethod(ref MyScannerEnabledVar);
The scanner can fire at any time on a separate thread. The idea is to not let it if we are handling an exception.
The question I have is, does the call to MyCoolMethod update MyScannerEnabledVar when scannerEnabled is set or does it update it when the method exits?
Note: I did not write this code, I am just trying to refactor it safely.
You can think of a ref as making an alias to a variable. It's not that the variable you pass is "passed by reference", it's that the parameter and the argument are the same variable, just with two different names. So updating one immediately updates the other, because there aren't actually two things here in the first place.
As SLaks notes, there are situations in VB that use copy-in-copy-out semantics. There are also, if I recall correctly, rare and obscure situations in which expression trees may be compiled into code that does copy-in-copy-out, but I do not recall the details.
If this code is intended to update the variable for reading on another thread, the fact that the variable is "immediately" updated is misleading. Remember, on multiple threads, reads and writes can be observed to move forwards and backwards in time with respect to each other if the reads and writes are not volatile. If the intention is to use the variable as a cross-thread communication mechanism, then use an object that is actually designed for that purpose and is safe for it: some sort of wait handle or mutex or whatever.
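The aliasing behaviour described above can be observed on a single thread with a small experiment. This is a sketch with hypothetical names (RefAliasDemo, SetAndObserve); it says nothing about what another thread would see, which is the separate visibility problem just discussed.

```csharp
public static class RefAliasDemo
{
    private static bool scannerEnabled = true;

    // Returns what the field held while the method was still running.
    public static bool ObservedInsideMethod()
    {
        return SetAndObserve(ref scannerEnabled);
    }

    private static bool SetAndObserve(ref bool flag)
    {
        flag = false;
        // 'flag' and 'scannerEnabled' are the same variable, so the field
        // already holds false here, before the method has returned.
        bool seenThroughField = scannerEnabled;
        flag = true;
        return seenThroughField;
    }

    public static bool CurrentValue
    {
        get { return scannerEnabled; }
    }
}
```

ObservedInsideMethod returns false, which shows that the write went through at the assignment and not at method exit.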
It gets updated live, as it is assigned inside the method.
When you pass a parameter by reference, the runtime passes (an equivalent to) a pointer to the field or variable that you referenced. When the method assigns to the parameter, it assigns directly to whatever the reference is pointing to.
Note, by the way, that this is not always true in VB.
Yes, it will be set when the variable is set within the method. Perhaps it would be best to return true or false to indicate whether the scanner is enabled rather than pass it in as a ref argument.
The situation calls for more than a simple refactor. The code you posted will be subject to race conditions. The easy solution is to lock the unsafe method, thereby forcing threads to wait in line. As it stands, there are bound to be bugs in the application due to this code, but it's impossible to say exactly what they are without knowing a lot more about your requirements and implementation. I recommend you proceed with caution: a mutex/lock is an easy fix, but it may have a great impact on performance. If that is a concern for you, then you should look into a better thread-safe solution.
At first I assumed I need a writer lock here, but I'm not sure (I don't have much experience with that) what happens if I don't use one.
On the server side, there is a client class instance for each connected client. Each instance contains a public list which every other instance can write to. Client requests are processed via thread pool work items.
class client
{
public List<string> A;
someEventRaisedMethod(param)
{
client OtherClient = GetClientByID(param); //selects client class by ID sent by msg sender
OtherClient.A.Add("blah");
}
}
What if two instances reference the same client and both try OtherClient.A.Add("blah")? Shouldn't there be some writer lock here? It works for me, but I encounter some strange issues that I think are due to this.
Thank you!
(update: as always, Eric Lippert has a timely blog entry)
If you don't use a lock, you risk missing data, state corruption, and probably the odd exception - but only very occasionally, so it is very hard to debug.
Absolutely you need to synchronize here. I would expose a lock on the client (so we can span multiple operations):
lock(otherClient.LockObject) {
otherClient.A.Add("blah");
}
You could make a synchronized Add method on otherClient, but it is often useful to span multiple operations - perhaps to check Contains and then Add only if missing, etc.
Just to clarify 2 points:
all access to the list (even reads) must also take the lock; otherwise it doesn't work
the LockObject should be a readonly reference-type
for the second, perhaps:
private readonly object lockObject = new object();
public object LockObject {get {return lockObject;}}
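Spanning several operations under the exposed lock, as suggested above, could look like the following sketch. The Client class here is a simplified stand-in for the one in the question, and ClientOps.AddIfMissing is a hypothetical helper, not something from the original code.

```csharp
using System.Collections.Generic;

public class Client
{
    private readonly object lockObject = new object();
    public object LockObject { get { return lockObject; } }

    // Every access to A, reads included, must happen while holding LockObject.
    public List<string> A { get; } = new List<string>();
}

public static class ClientOps
{
    // Contains and Add run under one lock, so no other thread
    // can add the same item between the check and the insert.
    public static bool AddIfMissing(Client otherClient, string item)
    {
        lock (otherClient.LockObject)
        {
            if (otherClient.A.Contains(item))
            {
                return false;
            }
            otherClient.A.Add(item);
            return true;
        }
    }
}
```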
From my point of view you should do the following:
Isolate the list into a separate class which implements either the IList Interface or only the subset which you require
Either add locking on a private object in the methods of your list class or use ReaderWriterLockSlim. Because the list is isolated, the synchronization lives in one single class and only needs changing in one place.
I don't know the C# internals, but I do remember reading a while back about a Java example that could cause a thread to loop endlessly if it was reading a collection whilst an insert was being done on it (I think it was a hashtable), so if you are using multiple threads, make sure that you lock on both read and write. Marc Gravell is correct that you should just create a global lock to handle this, since it sounds like you have fairly low volume.
ReaderWriterLockSlim is also a good option if you do a lot of reading and only a few write/update actions.
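Isolating the list in its own class and guarding it with ReaderWriterLockSlim, as suggested in the answers above, might be sketched like this. SynchronizedStringList is a hypothetical name, and it exposes only a small subset of list operations rather than the full IList interface.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: the list and its lock never leave this class.
public sealed class SynchronizedStringList
{
    private readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();
    private readonly List<string> items = new List<string>();

    // Writers get exclusive access.
    public void Add(string item)
    {
        rw.EnterWriteLock();
        try { items.Add(item); }
        finally { rw.ExitWriteLock(); }
    }

    // Several readers may hold the read lock at the same time.
    public bool Contains(string item)
    {
        rw.EnterReadLock();
        try { return items.Contains(item); }
        finally { rw.ExitReadLock(); }
    }

    // Returns a copy, so callers can enumerate it without holding any lock.
    public string[] ToArray()
    {
        rw.EnterReadLock();
        try { return items.ToArray(); }
        finally { rw.ExitReadLock(); }
    }
}
```

Whether this beats a plain lock depends on the read/write ratio; with few readers or very short operations, a simple lock is often just as fast and easier to reason about.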