I have a static class which is accessed by multiple remoting threads and other threads internal to the application. Part of this class's functionality is controlling read/write access to various files, so I've implemented a static ReaderWriterLock on the list of files. The project uses the .NET Framework 2.0 as part of the customer requirements.
However, when I stress test the system with a number of different clients (generally I'm using 16), each performing a large number of reads and writes, the system crashes very intermittently, and only after several hours or even days have passed with at least 500k+ transactions completed. OK, so we've got a bug..
But when I check the logs of all locking events I can see that the following has happened:
1: Thread A acquires the write lock directly; checking IsWriterLockHeld shows it to be true.
2: Thread B tries to acquire a reader lock and succeeds, even though Thread A still holds the write lock.
3: The system now crashes; the stack trace shows a NullReferenceException on the ReaderWriterLock.
This process has been run several hundred thousand times previously with no errors, and I can check the logs and see that in every previous case the read lock was blocked until the writer had exited. I have also tried implementing the ReaderWriterLock as a singleton, but the issue still occurs.
Has anybody ever seen anything like this before ??
A slimmed-down version of the ReaderWriterLock implementation used is shown below:
private const int readwriterlocktimeoutms = 5000;
private static ReaderWriterLock readerWriterLock = new ReaderWriterLock();

// this method will be called by thread A
public static void MethodA()
{
    // bool to indicate that we have the lock
    bool IsTaken = false;
    try
    {
        // get the lock
        readerWriterLock.AcquireWriterLock(readwriterlocktimeoutms);
        // log that we have the lock for debug
        // Logger.LogInfo("MethodA: acquired write lock; writer lock held {0}; reader lock held {1}",
        //     readerWriterLock.IsWriterLockHeld.ToString(), readerWriterLock.IsReaderLockHeld.ToString());
        // mark that we have taken the lock
        IsTaken = true;
    }
    catch (Exception e)
    {
        throw new Exception(string.Format("Error getting lock {0} {1}", e.Message, Environment.StackTrace));
    }
    try
    {
        // do some work
    }
    finally
    {
        if (IsTaken)
        {
            readerWriterLock.ReleaseWriterLock();
        }
    }
}
// this method will be called by thread B
public static void MethodB()
{
    // bool to indicate that we have the lock
    bool IsTaken = false;
    try
    {
        // get the lock
        readerWriterLock.AcquireReaderLock(readwriterlocktimeoutms);
        // log that we have the lock for debug
        // Logger.LogInfo("MethodB: acquired read lock; writer lock held {0}; reader lock held {1}",
        //     readerWriterLock.IsWriterLockHeld.ToString(), readerWriterLock.IsReaderLockHeld.ToString());
        // mark that we have taken the lock
        IsTaken = true;
    }
    catch (Exception e)
    {
        throw new Exception(string.Format("Error getting lock {0} {1}", e.Message, Environment.StackTrace));
    }
    try
    {
        // do some work
    }
    finally
    {
        if (IsTaken)
        {
            readerWriterLock.ReleaseReaderLock();
        }
    }
}
@All: I finally have a solution to this problem. @Yannick, you were on the right track when you pointed out that MSDN says it should be impossible to have a reader and a writer lock held at the same time.
Today I got confirmation from Microsoft that, under very heavy load on multiprocessor systems (note: I could never reproduce this problem on an AMD system, only on Intel), it is possible for ReaderWriterLock objects to become corrupted. The risk increases if the number of writers at any given stage grows, as these can back up in the queue.
For the last two weeks I've been running with the .NET 3.5 ReaderWriterLockSlim class and have not encountered the issue, which matches what Microsoft have confirmed: ReaderWriterLockSlim does not carry the same risk of corruption as the fat ReaderWriterLock class.
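For reference, this is a minimal sketch (not the production code) of how MethodA maps onto ReaderWriterLockSlim, reusing the same timeout constant; the main practical difference is that TryEnterWriteLock reports a timeout by returning false rather than throwing:

private static readonly ReaderWriterLockSlim readerWriterLockSlim = new ReaderWriterLockSlim();

public static void MethodA()
{
    // TryEnterWriteLock returns false on timeout rather than throwing
    if (!readerWriterLockSlim.TryEnterWriteLock(readwriterlocktimeoutms))
    {
        throw new Exception(string.Format("Error getting lock {0}", Environment.StackTrace));
    }
    try
    {
        // do some work
    }
    finally
    {
        readerWriterLockSlim.ExitWriteLock();
    }
}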
If MSDN says that it's impossible to have a reader and a writer lock held at the same time, is it possible that your process has two readerWriterLock objects at some point, for some other reason?
Another strange thing is that IsWriterLockHeld only reports on the current thread; checking it from a reader thread while debugging doesn't tell you whether another thread is writing.
How do you know that Thread A still holds the writer lock, and how do you know that it isn't the debug logging system that delays or "mixes" the instructions issued by the threads?
Another thought: is it possible that some other shared resource leads to a deadlock, and that this somehow results in a crash? (Although a NullReferenceException is still strange, unless we consider that the deadlock was cleaned up and the readerWriterLock reset.)
Your problem is strange, true.
And another question, which won't solve your problem: why do you use IsTaken, when in debugging your application you rely on IsWriterLockHeld (or IsReaderLockHeld)?
Why not use those in your finally blocks?
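For example, just a sketch of what I mean, based on the code you posted:

try
{
    readerWriterLock.AcquireWriterLock(readwriterlocktimeoutms);
    // do some work
}
finally
{
    // rely on the lock's own state instead of a separate bool
    if (readerWriterLock.IsWriterLockHeld)
    {
        readerWriterLock.ReleaseWriterLock();
    }
}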
So I have 16 threads that simultaneously run this method:
private void Work()
{
    int currentByte;
    char currentChar;
    try
    {
        while (true)
        {
            position++;
            currentByte = file.ReadByte();
            currentChar = Convert.ToChar(currentByte);
            entries.Add(new Entry(currentChar));
        }
    }
    catch (Exception) { }
}
And then I have one more thread running this method:
private void ManageThreads()
{
    bool done;
    for (; ; )
    {
        done = !threads.Any(x => x.IsAlive == true); // check if each thread is dead before continuing
        if (done)
            break;
        else
            Thread.Sleep(100);
    }
    PrintData();
}
Here is the problem: the PrintData method just prints everything in the 'entries' list to a text file, and this text file is different every time the program is run, even with the same input file. I am a bit of a noob when it comes to multi-threaded applications, so feel free to dish out the criticism.
In general, unless a type explicitly calls out thread safety in its documentation, you should assume it is not thread-safe*. Streams in .NET have no such section and should be treated as non-thread-safe: use appropriate synchronization (i.e. locks) to guarantee that each stream is accessed from one thread at a time.
With file streams there is another concern: the OS-level file object may be updated from other threads. FileStream tries to mitigate this by checking whether its internal state matches the OS state; see the FileStream remarks section on MSDN.
If you want a thread-safe stream you can try the Stream.Synchronized method, as shown in "C#, is there such a thing as a 'thread-safe' stream?".
Note that the code you have in the post will produce random results whether the stream is thread-safe or not. Thread safety of a stream only guarantees that all bytes show up in the output. With a non-thread-safe stream there are no guarantees at all: some bytes may show up multiple times, some may be skipped, and any other behavior (crashes, partial reads, ...) is possible.
* Thread-safe as in "the internal state of the instance will stay consistent whether it is called from one thread or from multiple threads". It does not mean that calling arbitrary methods from different threads will lead to useful behavior.
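To make that concrete, here is a rough sketch (reusing the field names from the question) of one way to synchronize the Work method. Because the read and the Add happen under one lock, each byte also lands in the list in file order, which is more than a thread-safe stream alone would give you:

private readonly object fileLock = new object();

private void Work()
{
    while (true)
    {
        lock (fileLock) // one thread at a time touches the stream and the list
        {
            position++;
            int currentByte = file.ReadByte();
            if (currentByte == -1) // ReadByte returns -1 at end of stream
                return;
            entries.Add(new Entry(Convert.ToChar(currentByte)));
        }
    }
}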
I have two threads: one that feeds updates and one that writes them to disk. Only the most recent update matters, so I don't need a producer/consumer queue.
In a nutshell:
The feeder thread drops the latest update into a buffer, then sets a flag to indicate a new update.
The writer thread checks the flag, and if it indicates new content, writes the buffered update to disk and disables the flag again.
I'm currently using a dedicated lock object to ensure that there's no inconsistency, and I'm wondering how that differs from locking the flag and buffer directly. The only difference I'm aware of is that a dedicated lock object requires trust that everyone who wants to manipulate the flag and buffer actually uses the lock.
Relevant code:
private object cacheStateLock = new object();
string textboxContents;
bool hasNewContents;

private void MainTextbox_TextChanged(object sender, TextChangedEventArgs e)
{
    lock (cacheStateLock)
    {
        textboxContents = MainTextbox.Text;
        hasNewContents = true;
    }
}

private void WriteCache() // running continually in a thread
{
    string toWrite;
    while (true)
    {
        lock (cacheStateLock)
        {
            if (!hasNewContents)
                continue;
            toWrite = textboxContents;
            hasNewContents = false;
        }
        File.WriteAllText(cacheFilePath, toWrite);
    }
}
First of all, if you're trying to use the bool flag in such a manner, you should mark it as volatile (which isn't recommended at all, but it's still better than your current code).
The second thing to note is that the lock statement is syntactic sugar for the Monitor class methods, so even if you could provide a value type for it (which is a compile error, by the way), two different threads would get their own boxed copy of the flag, making the lock useless. So you must provide a reference type for the lock statement.
The third thing is that strings are immutable in C#, so it's theoretically possible for some method to hold on to an old reference to the string and take the lock on the wrong object. Also, MainTextbox.Text could become null in your case, and locking on null throws at runtime, whereas a private object never changes (you should mark it as readonly, by the way).
So introducing a dedicated object for synchronization is the easiest and most natural way to separate the locking from the actual logic.
As for your initial code, it has a problem: MainTextbox_TextChanged could overwrite text that hasn't been written down yet. You can introduce some additional synchronization logic or use some library here. @Aron suggested Rx; I personally prefer TPL Dataflow, but it doesn't matter much.
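For completeness, "additional synchronization logic" without any library could be a Monitor.Wait/Pulse pair on the same lock object, which also removes the busy loop (a sketch only, using the field names from your question):

private readonly object cacheStateLock = new object();
private string textboxContents;
private bool hasNewContents;

private void MainTextbox_TextChanged(object sender, TextChangedEventArgs e)
{
    lock (cacheStateLock)
    {
        textboxContents = MainTextbox.Text;
        hasNewContents = true;
        Monitor.Pulse(cacheStateLock);  // wake the writer instead of letting it spin
    }
}

private void WriteCache() // running continually in a thread
{
    while (true)
    {
        string toWrite;
        lock (cacheStateLock)
        {
            while (!hasNewContents)
                Monitor.Wait(cacheStateLock);  // releases the lock while sleeping
            toWrite = textboxContents;
            hasNewContents = false;
        }
        File.WriteAllText(cacheFilePath, toWrite);
    }
}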
With Dataflow, you can add a BroadcastBlock linked to an ActionBlock<string>(WriteCache), which removes the infinite loop from the WriteCache method and the lock from both of your methods:
// the blocks can be fields; the LinkTo call belongs in a constructor or init method
BroadcastBlock<string> broadcast = new BroadcastBlock<string>(s => s);
ActionBlock<string> consumer = new ActionBlock<string>(s => WriteCache(s));
broadcast.LinkTo(consumer);

// fire and forget
private async void MainTextbox_TextChanged(object sender, TextChangedEventArgs e)
{
    await broadcast.SendAsync(MainTextbox.Text);
}

// no loop needed; the ActionBlock calls this for each posted value
private void WriteCache(string toWrite)
{
    File.WriteAllText(cacheFilePath, toWrite);
}
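(The reason BroadcastBlock<string> fits here is that it only keeps the most recently posted value, overwriting older ones as new text arrives, which matches the "only the latest update matters" requirement; the blocks live in the System.Threading.Tasks.Dataflow namespace, which ships as a separate NuGet package.)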
I have a S#arp Architecture app that implements a lightweight queue-processing mechanism whereby various threads pull entities from a list and set their status to mark that processing has started on those items.
Despite wrapping the start-processing bit in explicit transactions and using a C# lock(), I still sometimes see two threads start on the same item at the same time.
Do I regret not using MSMQ... well, yeah, but now this concurrency behaviour has me baffled. Evidently there's something I don't understand about NHibernate transactions and flushing. Can you help me out?
Here's the relevant bits of code:
private static object m_lock = new object();

private bool AbleToStartProcessing(int thingId)
{
    bool able = false;
    try
    {
        lock (m_lock)
        {
            this.thingRepository.DbContext.BeginTransaction();
            var thing = this.thingRepository.Get(thingId);
            if (thing.Status == ThingStatusEnum.PreProcessing)
            {
                able = true;
                thing.Status = ThingStatusEnum.Processing;
            }
            else
            {
                logger.DebugFormat("Not able to start processing {0} because status is {1}",
                    thingId, thing.Status.ToString());
            }
            this.thingRepository.DbContext.CommitTransaction();
        }
    }
    catch (Exception ex)
    {
        this.thingRepository.DbContext.RollbackTransaction();
        throw ex;
    }
    if (able)
        logger.DebugFormat("Starting processing of {0}", thingId);
    return able;
}
I would have expected this to guarantee that only one thread could change the status of a 'thing' at one time, but I get this in my logs pretty regularly:
2011-05-18 18:41:23,557 thread41 DEBUG src:MyApp.Blah.ThingJob - Starting processing of 78090
2011-05-18 18:41:23,557 thread51 DEBUG src:MyApp.Blah.ThingJob - Starting processing of 78090
.. and then both threads try and operate on the same thing and create a mess.
What am I missing? Thanks.
edit: changed code to reflect how my logging works in the real version
Set up concurrency in your NHibernate mappings; this post should help you get started:
http://ayende.com/blog/3946/nhibernate-mapping-concurrency
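Roughly, that means giving Thing a version property and mapping it as a <version> (or Version() in Fluent NHibernate); NHibernate then includes the version in the UPDATE's WHERE clause and throws a StaleObjectStateException when another session got there first. A sketch only, reusing the repository calls from your question:

public class Thing
{
    public virtual int Id { get; set; }
    public virtual int Version { get; set; }              // mapped as the version column
    public virtual ThingStatusEnum Status { get; set; }
}

private bool AbleToStartProcessing(int thingId)
{
    try
    {
        this.thingRepository.DbContext.BeginTransaction();
        var thing = this.thingRepository.Get(thingId);
        bool able = thing.Status == ThingStatusEnum.PreProcessing;
        if (able)
            thing.Status = ThingStatusEnum.Processing;
        this.thingRepository.DbContext.CommitTransaction();  // version check happens on flush
        return able;
    }
    catch (NHibernate.StaleObjectStateException)
    {
        // another thread or process updated the same row first; treat it as "not able"
        this.thingRepository.DbContext.RollbackTransaction();
        return false;
    }
}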
I think you are just crossed up on the status you are using to mark that you are processing versus the status you check to see whether something is already being processed. The first one in sets ThingStatusEnum.Processing, but the next guy checks for something different, ThingStatusEnum.PreProcessing. Because ThingStatusEnum.Processing != ThingStatusEnum.PreProcessing, your locking means the two threads are not excluding each other the way you expect.
How can I create a system/multiprocess Mutex to coordinate multiple processes that use the same unmanaged resource?
Background:
I've written a procedure that uses a file printer, which can only be used by one process at a time. If I wanted to use it from multiple programs running on the computer, I'd need a way to synchronize this across the system.
You can use the System.Threading.Mutex class, which has an OpenExisting method to open a named system mutex.
That doesn't answer the question:
How can I create a system/multiprocess Mutex
To create a system-wide mutex, call the System.Threading.Mutex constructor that takes a name string as an argument. This is also known as a 'named' mutex. To see whether it already exists, I can't seem to find a more graceful method than try/catch:
System.Threading.Mutex _mutey = null;
try
{
    _mutey = System.Threading.Mutex.OpenExisting("mutex_name");
    // we got Mutey and can try to obtain a lock by WaitOne
    _mutey.WaitOne();
}
catch
{
    // the specified mutex doesn't exist, we should create it
    _mutey = new System.Threading.Mutex(false, "mutex_name"); // these names need to match
}
Now, to be a good programmer, you need to release this mutex when your program ends:
_mutey.ReleaseMutex();
or, you can leave it, in which case it will be called 'abandoned' when your thread exits, and another process will be allowed to create it.
[EDIT]
As a side note to the last sentence describing the abandoned mutex: when another thread later acquires the mutex, a System.Threading.AbandonedMutexException is thrown, telling it that the mutex was found in an abandoned state.
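In code, that means the waiting side may want something like this (sketching with the _mutey variable from above):

try
{
    _mutey.WaitOne();
}
catch (System.Threading.AbandonedMutexException)
{
    // the previous owner exited without releasing; the wait still succeeded,
    // so this thread now owns the mutex but should treat shared state with suspicion
}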
[EDIT TWO]
I'm not sure why I answered the question that way years ago; there is (and was) a constructor overload that is much better at checking for an existing mutex. In fact, the code I gave seems to have a race condition! (And shame on you all for not correcting me! :-P )
Here's the race condition: Imagine two processes, they both try to open the existing mutex at the same time, and both get to the catch section of code. Then, one of the processes creates the mutex and lives happily ever after. The other process, however, tries to create the mutex, but this time it's already created! This checking/creating of a mutex needs to be atomic.
http://msdn.microsoft.com/en-us/library/bwe34f1k(v=vs.90).aspx
So...
var requestInitialOwnership = false;
bool mutexWasCreated;
Mutex m = new Mutex(requestInitialOwnership, "MyMutex", out mutexWasCreated);
I think the trick here is that it appears that you have an option that you don't actually have (looks like a design flaw to me). You sometimes can't tell if you own the mutex if you send true for requestInitialOwnership. If you pass true and it appears that your call created the mutex, then obviously you own it (confirmed by documentation). If you pass true and your call did not create the mutex, all you know is that the mutex was already created, you don't know if some other process or thread which perhaps created the mutex currently owns the mutex. So, you have to WaitOne to make sure you have it. But then, how many Releases do you do? If some other process owned the mutex when you got it, then only your explicit call to WaitOne needs to be Released. If your call to the constructor caused you to own the mutex, and you called WaitOne explicitly, you'll need two Releases.
I'll put these words into code:
var requestInitialOwnership = true; /* This appears to be a mistake. */
bool mutexWasCreated;
Mutex m = new Mutex(requestInitialOwnership, "MyMutex", out mutexWasCreated);

if (!mutexWasCreated)
{
    bool calledWaitOne = false;
    if (!iOwnMutex(m)) /* I don't know of a method like this */
    {
        calledWaitOne = true;
        m.WaitOne();
    }
    doWorkWhileHoldingMutex();
    m.ReleaseMutex(); // Mutex has no Release(); ReleaseMutex is the actual method
    if (calledWaitOne)
    {
        m.ReleaseMutex();
    }
}
Since I don't see a way to test whether you currently own the mutex, I will strongly recommend that you pass false to the constructor, so that you know you don't own the mutex and you know how many times to call ReleaseMutex.
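In code, the pattern I'm recommending comes out something like this (doWorkWhileHoldingMutex is still just a placeholder):

bool mutexWasCreated;
using (Mutex m = new Mutex(false, "MyMutex", out mutexWasCreated))
{
    m.WaitOne();             // we know we don't own it yet, so exactly one wait...
    try
    {
        doWorkWhileHoldingMutex();
    }
    finally
    {
        m.ReleaseMutex();    // ...and exactly one release
    }
}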
I have not had good luck using the system Mutex described above with Mono under Linux. I'm probably just doing something simple wrong, but the following works well and cleans up nicely if the process exits unexpectedly (kill -9). I would be interested to hear comments or criticisms.
class SocketMutex
{
    private Socket _sock;
    private IPEndPoint _ep;

    public SocketMutex()
    {
        _ep = new IPEndPoint(IPAddress.Parse("127.0.0.1"), 7177);
        _sock = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
        _sock.ExclusiveAddressUse = true; // most critical if you want this to be a system-wide mutex
    }

    public bool GetLock()
    {
        try
        {
            _sock.Bind(_ep); // throws 'SocketException: Address already in use' if another process holds the lock
        }
        catch (SocketException se)
        {
            Console.Error.WriteLine("SocketMutex Exception: " + se.Message);
            return false;
        }
        return true;
    }
}
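For what it's worth, a usage sketch (7177 is just the arbitrary port chosen above; any fixed, otherwise-unused port behaves the same way):

var mutex = new SocketMutex();
if (!mutex.GetLock())
{
    Console.Error.WriteLine("Another instance already holds the lock; exiting.");
    return;
}
// safe to use the shared resource here; the OS releases the bound port
// automatically when this process exits, even on kill -9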
I've got a web application that controls which web applications get served traffic from our load balancer. The web application runs on each individual server.
It keeps track of the "in or out" state for each application in an object in the ASP.NET application state, and the object is serialized to a file on disk whenever the state changes. The state is deserialized from the file when the web application starts.
While the site itself only gets a couple of requests a second at most, and the file is rarely accessed, I've found it extremely easy for some reason to get collisions while attempting to read from or write to the file. This mechanism needs to be extremely reliable, because we have an automated system that regularly does rolling deployments to the servers.
Before anyone makes any comments questioning the prudence of any of the above, allow me to simply say that explaining the reasoning behind it would make this post much longer than it already is, so I'd like to avoid moving mountains.
That said, the code that I use to control access to the file looks like this:
internal static Mutex _lock = null;

/// <summary>Executes the specified <see cref="Func{FileStream, Object}" /> delegate on
/// the filesystem copy of the <see cref="ServerState" />.
/// The work done on the file is wrapped in a lock statement to ensure there are no
/// locking collisions caused by attempting to save and load the file simultaneously
/// from separate requests.
/// </summary>
/// <param name="func">The logic to be executed on the
/// <see cref="ServerState" /> file.</param>
/// <returns>An object containing any result data returned by <paramref name="func" />.
/// </returns>
private static Boolean InvokeOnFile(Func<FileStream, Object> func, out Object result)
{
    var l = new Logger();
    if (ServerState._lock.WaitOne(1500, false))
    {
        l.LogInformation("Got lock to read/write file-based server state.",
            (Int32)VipEvent.GotStateLock);
        var fileStream = File.Open(ServerState.PATH, FileMode.OpenOrCreate,
            FileAccess.ReadWrite, FileShare.None);
        result = func.Invoke(fileStream);
        fileStream.Close();
        fileStream.Dispose();
        fileStream = null;
        ServerState._lock.ReleaseMutex();
        l.LogInformation("Released state file lock.",
            (Int32)VipEvent.ReleasedStateLock);
        return true;
    }
    else
    {
        l.LogWarning("Could not get a lock to access the file-based server state.",
            (Int32)VipEvent.CouldNotGetStateLock);
        result = null;
        return false;
    }
}
This usually works, but occasionally I cannot get access to the mutex (I see the "Could not get a lock" event in the log). I cannot reproduce this locally - it only happens on my production servers (Win Server 2k3/IIS 6). If I remove the timeout, the application hangs indefinitely (race condition??), including on subsequent requests.
When I do get the errors, looking at the event log tells me that the mutex lock was achieved and released by the previous request before the error was logged.
The mutex is instantiated in the Application_Start event. I get the same results when it is instantiated statically in the declaration.
Excuses, excuses: threading/locking is not my forté, as I generally don't have to worry about it.
Any suggestions as to why it randomly would fail to get a signal?
Update:
I've added proper error handling (how embarrassing!), but I am still getting the same errors - and for the record, unhandled exceptions were never the problem.
Only one process would ever be accessing the file - I don't use a web garden for this application's web pool, and no other applications use the file. The only exception I can think of would be when the app pool recycles, and the old WP is still open when the new one is created - but I can tell from watching the task manager that the issue occurs while there is only one worker process.
@mmr: How is using Monitor any different from using a Mutex? Based on the MSDN documentation, it looks like it is effectively doing the same thing, and if I can't get the lock with my Mutex, it does fail gracefully by just returning false.
Another thing to note: the issues I'm having seem to be completely random. If it fails on one request, it might work fine on the next. There doesn't seem to be a pattern, either (certainly not every other request, at least).
Update 2:
This lock is not used for any other call. The only time _lock is referenced outside the InvokeOnFile method is when it is instantiated.
The Func that is invoked is either reading from the file and deserializing into an object, or serializing an object and writing it to the file. Neither operation is done on a separate thread.
ServerState.PATH is a static readonly field, which I don't expect would cause any concurrency problems.
I'd also like to re-iterate my earlier point that I cannot reproduce this locally (in Cassini).
Lessons learned:
Use proper error handling (duh!)
Use the right tool for the job (and have a basic understanding of what that tool does and how). As sambo points out, using a Mutex apparently has a lot of overhead, which was causing issues in my application, whereas Monitor is designed specifically for .NET.
You should only be using Mutexes if you need cross-process synchronization.
Although a mutex can be used for intra-process thread synchronization, using Monitor is generally preferred, because monitors were designed specifically for the .NET Framework and therefore make better use of resources. In contrast, the Mutex class is a wrapper to a Win32 construct. While it is more powerful than a monitor, a mutex requires interop transitions that are more computationally expensive than those required by the Monitor class.
If you need to support inter-process locking you need a Global mutex.
The pattern being used is incredibly fragile: there is no exception handling, and you are not ensuring that your Mutex is released. That is really risky code and most likely the reason you see these hangs when there is no timeout.
Also, if your file operation ever takes longer than 1.5 seconds, then concurrent callers will not be able to grab the Mutex. I would recommend getting the locking right and avoiding the timeout.
I think it's best to re-write this to use a lock. Also, it looks like you are calling out to another method; if that takes forever, the lock will be held forever. That's pretty risky.
This is both shorter and much safer:
// if you want timeout support, use
//   bool taken = Monitor.TryEnter(m_syncObj, 2000);
//   try { ... } finally { if (taken) Monitor.Exit(m_syncObj); }
lock (m_syncObj)
{
    l.LogInformation("Got lock to read/write file-based server state.",
        (Int32)VipEvent.GotStateLock);
    using (var fileStream = File.Open(ServerState.PATH, FileMode.OpenOrCreate,
        FileAccess.ReadWrite, FileShare.None))
    {
        // the line below is risky: what will happen if the call to Invoke
        // never returns?
        result = func.Invoke(fileStream);
    }
}
l.LogInformation("Released state file lock.", (Int32)VipEvent.ReleasedStateLock);
return true;

// note: exceptions may leak out of this method; either handle them here
// or in the calling method.
// For example the file access may fail, or func.Invoke may fail.
If some of the file operations fail, the lock will not be released. Most probably that is the case. Put the file operations in a try/catch block, and release the lock in the finally block.
Anyway, if you read the file in your Global.asax Application_Start method, this will ensure that no one else is working on it (you said that the file is read on application start, right?). To avoid collisions on application pool restarts, etc., you can just try to read the file (assuming that the write operation takes an exclusive lock), and then wait one second and retry if an exception is thrown.
Now you have the problem of synchronizing the writes. Whatever method decides to change the file should take care not to invoke a write operation while another one is in progress, using a simple lock statement.
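A sketch of that startup read-with-retry (the retry count, delay, and Deserialize call are placeholders for whatever the application already uses):

// in Global.asax Application_Start
ServerState state = null;
for (int attempt = 0; attempt < 5 && state == null; attempt++)
{
    try
    {
        using (var fs = File.Open(ServerState.PATH, FileMode.Open,
            FileAccess.Read, FileShare.None))
        {
            state = Deserialize(fs);   // placeholder for the existing deserialization
        }
    }
    catch (IOException)
    {
        Thread.Sleep(1000);            // a writer has the file; wait a second and retry
    }
}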
I see a couple of potential issues here.
Edit for Update 2: If the function is a simple serialize/deserialize combination, I'd separate the two out into two different functions, one 'serialize' function and one 'deserialize' function. They really are two different tasks. You can then have different, lock-specific tasks. Invoke is nifty, but I've gotten into lots of trouble myself going for 'nifty' over 'working'.
1) Is your LogInformation function locking? You call it inside the mutex first, and then again after you release the mutex. So if there's a lock around writing to the log file/structure, you can end up with a race condition there. To avoid that, put the log call inside the lock.
2) Check out the Monitor class, which I know works in C# and I'd assume works in ASP.NET. With it, you can simply try to get the lock and fail gracefully otherwise. One way to use it is to just keep trying to get the lock. (Edit for why: see here; basically, a mutex works across processes while Monitor works within just one process, but Monitor was designed for .NET and so is preferred. No other real explanation is given by the docs.)
3) What happens if opening the filestream fails because someone else has the lock on the file? That would throw an exception, and that could cause this code to behave badly (i.e., the lock is still held by the thread that got the exception, and another thread can get at it).
4) What about the func itself? Does it start another thread, or does it run entirely within the one thread? What about its access to ServerState.PATH?
5) What other functions can access ServerState._lock? I prefer to have each function that requires a lock get its own lock object, to avoid race/deadlock conditions. If you have many, many threads, and each of them tries to lock on the same object but for totally different tasks, then you could end up with deadlocks and races without any easily understandable reason. I've changed the code to reflect that idea, rather than using some global lock. (I realize other people suggest a global lock; I really don't like that idea, because of the possibility of other things grabbing it for some task that is not this task.)
private static readonly Object MyLock = new Object();

private static Boolean InvokeOnFile(Func<FileStream, Object> func, out Object result)
{
    var l = new Logger();
    Boolean success = false;
    result = null;
    if (Monitor.TryEnter(MyLock, 1500))
    {
        try
        {
            l.LogInformation("Got lock to read/write file-based server state.", (Int32)VipEvent.GotStateLock);
            // 'using' means avoiding the dispose/close requirements
            using (var fileStream = File.Open(ServerState.PATH, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
            {
                result = func.Invoke(fileStream);
            }
            success = true;
        }
        catch
        {
            // your filestream access failed
            l.LogInformation("File access failed.", (Int32)VipEvent.ReleasedStateLock);
        }
        finally
        {
            l.LogInformation("About to release state file lock.", (Int32)VipEvent.ReleasedStateLock);
            Monitor.Exit(MyLock); // gets you out of the lock you've got
        }
    }
    else
    {
        // l.LogWarning("Could not get a lock to access the file-based server state.", (Int32)VipEvent.CouldNotGetStateLock);
        // if the lock doesn't show in the log, then it wasn't gotten; again, if your
        // logger is locking, then you could have some issues here
    }
    return success;
}