Simultaneously write to a file from multiple processes [duplicate]

I have a text file, ABC.txt, which will be read and written by multiple processes. So when one process is reading from or writing to ABC.txt, the file must be locked so that no other process can read from or write to it. I know the enum System.IO.FileShare may be the right way to handle this problem, but I used another way, and I'm not sure whether it is right. The following is my solution.
I added another file Lock.txt to the folder. Before I can read from or write to file ABC.txt, I must have the capability to read from file Lock.txt. And after I have read from or written to file ABC.txt, I have to release that capability. The following is the code.
#region Enter the lock
FileStream lockFileStream = null;
bool lockEntered = false;
while (lockEntered == false)
{
try
{
lockFileStream = File.Open("Lock.txt", FileMode.Open, FileAccess.Read, FileShare.None);
lockEntered = true;
}
catch (Exception)
{
Thread.Sleep(500);
}
}
#endregion
#region Do the work
// Read from or write to File ABC.txt
// Read from or write to other files
#endregion
#region Release the lock
try
{
if (lockFileStream != null)
{
lockFileStream.Dispose();
}
}
catch
{
}
#endregion
On my computer, this solution seems to work well, but I still cannot be sure whether it is appropriate.
Edit: Multi processes, not multi threads in the same process.

.NET's named EventWaitHandle is the way to go here. Create an instance of the wait handle in every process that wants to use the file, and give it a name that is shared by all such processes.
EventWaitHandle waitHandle = new EventWaitHandle(true, EventResetMode.AutoReset, "SHARED_BY_ALL_PROCESSES");
Then, when accessing the file, wait on waitHandle, and when you have finished processing the file, set it so the next process in the queue may access it. Set it in a finally block so the handle is signaled even if the file processing throws.
waitHandle.WaitOne();
try
{
    /* process file */
}
finally
{
    waitHandle.Set();
}
When you name an event wait handle, that name is shared across all processes in the operating system. Therefore, to avoid the possibility of collisions, use a GUID for the name ("SHARED_BY_ALL_PROCESSES" above).

A mutex in C# may be shared across multiple processes. Here is an example for multiple processes writing to a single file:
using (var mutex = new Mutex(false, "Strand www.jakemdrew.com"))
{
    mutex.WaitOne();
    try
    {
        File.AppendAllText(outputFilePath, theFileText);
    }
    finally
    {
        // Release even if the write throws, so other processes are not blocked.
        mutex.ReleaseMutex();
    }
}
You need to make sure that the mutex is given a unique name that will be shared across the entire system.
Additional reading here:
http://www.albahari.com/threading/part2.aspx#_Mutex

Your solution is error prone. You've basically implemented double-checked locking (http://en.wikipedia.org/wiki/Double-checked_locking) which can be very unsafe.
A better solution would be one of the following: introduce thread isolation, whereby only one thread ever accesses the file and does so by reading from a queue onto which other threads place their read or write requests (the queue itself being protected by mutually exclusive access); or have the threads synchronize themselves, either with synchronization devices (lock sections, mutexes, whatever) or with some other file access logic (for example, System.IO.FileShare came up in a few responses here).

If it was me, I would install something like SQL Server Compact Edition for reading/writing this data.
However, if you want to be able to lock access to a resource that is shared between multiple processes, you need to use a Mutex or a Semaphore.
The Mutex class is a .Net wrapper around an OS Level locking mechanism.
Overview of Synchronization Primitives

Related

Multi-threaded C# application log file locking issue [duplicate]

This question already has answers here:
How to write in a single file with multiple threads?
(3 answers)
Closed 8 years ago.
I have a bi-threaded c# application where both threads write to the same log and I'm getting an error.
'The process cannot access the file 'logfile.log' because it is being used by another process.'
I'm somewhat new to C# and multi-threading in general so if anyone can point me in the right direction here it would be much appreciated.

The error message says that the file is held open by another process, not thread. Are you sure the file is being closed properly each time you run your app?
Generally speaking, you should use a lock to control access to a shared resource inside your application. For example, if you have a LogWriter class with a Log() function, a minimal implementation with no queuing might look like this:
using System;
using System.IO;

public class LogWriter
{
    private readonly object _lock = new object();

    public void Log(string message)
    {
        lock (_lock)
        {
            // Write the message to <appName>.log in the current directory.
            string appName = Path.GetFileNameWithoutExtension(Environment.GetCommandLineArgs()[0]);
            string path = Path.Combine(Environment.CurrentDirectory, appName + ".log");
            // 'using' closes the file even if WriteLine throws.
            using (StreamWriter sw = new StreamWriter(path, true))
            {
                sw.WriteLine(string.Format("{0:u} {1}", DateTime.Now, message));
            }
        }
    }
}
The lock(_lock) ensures that only one thread accesses the file at a time, provided all threads share the same LogWriter instance. The using block ensures that the file is closed after every write and is not left open when the process terminates ;-)

As the error message states, you can't create a writer from multiple different threads. You'll need to find some way of either synchronizing the access to the file between threads, which is to say ensuring that one thread waits to try to access the file until all other threads are done with it, or you could designate a single thread as being the thread responsible for accessing that file, requiring all other threads that want to access that file to request the other thread to do it for them.

You need to create a list (or whatever data structure you like) that is accessible to all of the parallel threads. Then add records to that list from each thread:
lock (yourList) {
// here you can add items to the List
}
When finished, dump your list to a file.
Or create a new List for each thread, return them all, and then join the lists into one.
Or use a database and add records to a table (the most logical solution).
As an alternative solution:
1) Create a global List.
2) Pass that list to all threads; each thread locks it (as in the example above), writes to it, and unlocks it.
3) Create another thread that runs an infinite loop with a delay.
4) Inside the loop, after each delay, lock the list, get all the data, write it to the file, and empty the list.
5) Release the list.
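The alternative above (worker threads only enqueue, one dedicated thread writes) might be sketched as follows. This is a minimal sketch, not the answerer's code: the class name QueuedLogWriter is hypothetical, and it swaps the locked List plus timed polling loop for a BlockingCollection<string>, which does its own synchronization and blocks the writer until a message arrives.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: any thread may call Log(); only the background task touches the file.
public sealed class QueuedLogWriter : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Task _writerTask;

    public QueuedLogWriter(string path)
    {
        // The long-running task below is the only code that ever opens the file.
        _writerTask = Task.Factory.StartNew(() =>
        {
            using (var writer = new StreamWriter(path, true))
            {
                // GetConsumingEnumerable blocks until an item arrives and ends
                // once CompleteAdding has been called and the queue is empty.
                foreach (string line in _queue.GetConsumingEnumerable())
                {
                    writer.WriteLine(line);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Thread-safe: BlockingCollection handles its own synchronization.
    public void Log(string message)
    {
        _queue.Add(message);
    }

    // Stops accepting messages, then waits for the writer to drain the queue.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _writerTask.Wait();
        _queue.Dispose();
    }
}
```

Note that in this sketch the file stays open for the lifetime of the writer and buffered lines are only guaranteed to be on disk after Dispose returns; it also coordinates threads inside one process only, not separate processes.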

Why does File.Move allow 2 threads to move the same file at the same time?

We currently have one application that monitors a folder for new files. To make it fault tolerant and be able to process more files at once, we want to be able to run multiple instances of this application on different machines. We use File.Move to "lock" a file and make sure that only one thread can process a file at a time.
To test that only one application and/or thread can perform a File.Move on a file, I created a simple application (based on the original application's code) that creates 10 threads per instance and monitors a folder. When a thread detects a new file, it performs File.Move on it and changes the file's extension, to try to stop other threads from doing the same.
I have seen an issue when running multiple copies of this application (and when running a single copy), whereby 2 threads (either in the same application or in different ones) both successfully perform File.Move with no exception thrown, but the file ends up with the name chosen by the thread that performed it last (I can tell because I change the file's extension to include DateTime.Now.ToFileTime()).
I have looked at what File.Move does and it checks to see if the file exists before it performs the operation, then it calls out to Win32Native.MoveFile to perform the move.
All the other threads/applications throw an exception, as I would expect.
The reasons why this is an issue are:
I thought only 1 thread can perform a File.Move on a file at a time.
I need to reliably have only one application/thread be able to process a file at a time.
Here is the code that performs the File.Move:
public bool TryLock(string originalFile, out string changedFileName)
{
FileInfo fileInfo = new FileInfo(originalFile);
changedFileName = Path.ChangeExtension(originalFile, ".original." + DateTime.Now.ToFileTime());
try
{
File.Move(originalFile, changedFileName);
}
catch (IOException ex)
{
Console.WriteLine("{3} - Thread {1}-{2} File {0} is already in use", fileInfo.Name, Thread.CurrentThread.ManagedThreadId, id, DateTime.Now.ToLongTimeString());
return false;
}
catch (Exception ex)
{
Console.WriteLine("{3} - Thread {1}-{2} File {0} error {4}", fileInfo.Name, Thread.CurrentThread.ManagedThreadId, id, DateTime.Now.ToLongTimeString(), ex);
return false;
}
return true;
}
Note - id is just a sequential number I assigned to each thread for logging.
I am running Windows 7 Enterprise SP1 on a SSD with NTFS.

From the MSDN description I assume that File.Move does not open the file in exclusive mode.
If you try to move a file across disk volumes and that file is in use, the file is copied to the destination, but it is not deleted from the source.
Anyway, I think you are better off to create your own move mechanism and have it open the file in exclusive mode prior to copying it (and then deleting it):
File.Open(pathToYourFile, FileMode.Open, FileAccess.Read, FileShare.None);
Other threads won't be able to open it if the move operation is already in progress. You might have race condition issues between the moment the copy is finalized (thus you need to dispose of the file handle) and deleting it.

Using File.Move as a lock isn't going to work. As stated in @marceln's answer, it won't delete the source file if it is already in use elsewhere, and it doesn't have any "locking" behavior, so you can't rely on it.
What I would suggest is to use a BlockingCollection<T> to manage the processing of your files (note that this only coordinates threads within a single process):
// Assuming this BlockingCollection is already filled with all string file paths
private BlockingCollection<string> _blockingFileCollection = new BlockingCollection<string>();
public bool TryProcessFile(string originalFile, out string changedFileName)
{
FileInfo fileInfo = new FileInfo(originalFile);
changedFileName = Path.ChangeExtension(originalFile, ".original." + DateTime.Now.ToFileTime());
string itemToProcess;
// TryTake succeeds for only one thread per queued item; everyone else backs off.
if (!_blockingFileCollection.TryTake(out itemToProcess))
{
return false;
}
// The file should exclusively be moved by one thread,
// all other should return false.
File.Move(originalFile, changedFileName);
return true;
}

Are you moving across volumes or within a volume? In the latter case no copying is necessary.
@usr In production, once a thread has "locked" a file, we will be moving it across network shares
I'm not sure whether that is a true move or a copy operation. In any case, you could:
open the file exclusively
copy the data
delete the source by handle (Deleting or Renaming a file using an open handle)
That allows you to lock other processes out of that file for the duration of the move. It is more of a workaround than a real solution. Hope it helps.
Note, that for the duration of the move the file is unavailable and other processes will receive an error accessing it. You might need a retry loop with a time delay between operations.
Here's an alternative:
Copy the file to the target folder with a different extension that is being ignored by readers
Atomically rename the file to remove the extension
Renaming on the same volume is always atomic. Readers might receive a sharing violation error for a very short period of time. Again, you need a retry loop or tolerate a very small window of unavailability.
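The copy-then-rename alternative, together with the retry loop mentioned above, might look like the sketch below. The class and method names, the ".partial" extension, and the retry counts are all assumptions made for illustration; readers are assumed to ignore files ending in ".partial".

```csharp
using System;
using System.IO;
using System.Threading;

public static class AtomicPublish
{
    // Copies sourcePath into targetDirectory under a temporary ".partial" extension
    // (assumed to be ignored by readers), then renames it to its final name.
    public static string CopyThenRename(string sourcePath, string targetDirectory)
    {
        string finalPath = Path.Combine(targetDirectory, Path.GetFileName(sourcePath));
        string tempPath = finalPath + ".partial";

        File.Copy(sourcePath, tempPath, true);      // slow part; not yet visible under the final name
        MoveWithRetry(tempPath, finalPath, 5, 200); // rename within the same directory
        return finalPath;
    }

    // Retries File.Move on IOException (for example a sharing violation),
    // rethrowing once the given number of attempts has been used up.
    private static void MoveWithRetry(string from, string to, int attempts, int delayMs)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                File.Move(from, to);
                return;
            }
            catch (IOException)
            {
                if (attempt >= attempts) throw;
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

File.Move also throws IOException when the destination already exists, so in that case this sketch retries a few times and then rethrows; a real implementation would probably want to detect that case up front.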

Based on @marceln's and @YuvalItzchakov's answers/comments, I tried the following, which seems to give more reliable results:
using (var readFileStream = File.Open(originalFile, FileMode.Open, FileAccess.Read, FileShare.Delete))
{
readFileStream.Lock(0, readFileStream.Length - 1);
File.Move(originalFile, changedFileName);
readFileStream.Unlock(0, readFileStream.Length - 1);
}
I want to use Windows's own file copying as it should be more efficient than copying the stream and in production we will be moving the files from one network share to another.

Streamwriter Lock Not Working

I'm taking over a C# project, and when testing it out I'm getting errors. The error is that the log file cannot be written to because it is in use by another process. Here's the code:
public void WriteToLog(string msg)
{
if (!_LogExists)
{
this.VerifyOrCreateLogFile(); // Creates log file if it does not already exist.
}
// do the actual writing on its own thread so execution control can immediately return to the calling routine.
Thread t = new Thread(new ParameterizedThreadStart(WriteToLog));
t.Start((object)msg);
}
private void WriteToLog(object msg)
{
lock (_LogLock)
{
string message = msg as string;
using (StreamWriter sw = File.AppendText(LogFile))
{
sw.Write(message);
sw.Close();
}
}
}
_LogLock is defined as a class variable:
private object _LogLock = 0;
Based on my research and the fact that this has been working fine in a production system for a few years now, I don't know what the problem could be. The lock should prevent another thread from attempting to write to the log file.
The changes I've made that need to be tested are a lot more log usage. We're basically adding a debug mode to save much more info to the log than used to be saved.
Thanks for any help!
EDIT:
Thanks for the quick answers! The code for VerifyOrCreateLogFile() does use the _LogLock, so that shouldn't be an issue. It does do some writing to the log before it errors out, so it gets past creating the file just fine.
What seems to be the problem is that previously only one class created an instance of the log class, and now I've added instances to other classes. It makes sense that this would create problems. Changing the _LogLock field to be static fixes the issue.
Thanks again!

The lock should prevent another thread from attempting to write to the log file.
This is only true if you're using a single instance of this class.
If each (or even some) of the log requests use a separate instance, then the lock will not protect you.
You can easily "correct" this by making the _LogLock field static:
private static object _LogLock = 0;
This way, all instances will share the same lock.

I see 2 problems with the code:
The lock must be the same among all "users" of this Log class; the easiest solution is to make either _LogLock or the complete class static.
VerifyOrCreateLogFile could pose a problem if 2 or more parallel threads call WriteToLog when _LogExists is false...

One possibility is that the OS hasn't finished releasing the file lock by the time you exit the lock in WriteToLog, so another thread that was blocked waiting for the lock tries to open the file before the OS has released it. Yes, it can happen. You either need to sleep for a little before trying to open the file, or centralize the writing to the log in a dedicated object (so that it and only it has access to this file and you don't have to worry about file lock contention).
Another possibility is that you need to lock around
if (!_LogExists) {
this.VerifyOrCreateLogFile(); // Creates log file if it does not already exist.
}
The third possibility is that you have multiple instances of whatever class is housing these methods. The lock object won't be shared across instances (make it static to solve this).
At the end of the day, unless you're an expert in writing safe multi-threaded code, just let someone else worry about this stuff for you. Use a framework that handles these issues for you (log4net?).

You can simplify the code by removing sw.Close(); the using block already closes the writer when it exits. Try that and see whether the error goes away.

How to enable two different C# applications accessing the same directory in a continuous thread?

I have the same BackgroundWorker code piece in two simultaneously running applications. Will this code avoid the problem of same resource getting access by two processes and run smoothly?
void bw_DoWork(object sender, DoWorkEventArgs e)
{
    bool flag = false;
    System.Threading.Thread.Sleep(1000);
    while (flag.Equals(false))
    {
        string dir = @"C:\ProgramData\Msgs";
        try
        {
            if (Directory.GetFiles(dir).Length > 0)
            {
                flag = true;
            }
        }
        catch (Exception exc)
        {
            Logger.Log("Dir Access Exception: " + exc.Message);
            System.Threading.Thread.Sleep(10);
        }
    }
}

On one level, depending on what you're doing, there's nothing wrong with having multiple applications accessing the same directory or file. If it's just read access, then by all means, both can access it at once.
If you've got identical code in multiple applications, then a Boolean isn't going to cut it for synchronization, no matter what you do: each application has its own copy of the Boolean and cannot modify the other's.
For cross-application synchronization, I'd use the Mutex class. There's a constructor that takes a string parameter specifying the name of the Mutex. Named mutexes are visible across Windows, not just within your application (prefix the name with "Global\" if the processes run in different sessions). You can do Mutex m = new Mutex(false, "MySpecialMutex"); in two different applications, and each object will refer to the same thing.
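As a minimal sketch of the named-Mutex approach described above: the class name, the mutex name, and the 10-second timeout are hypothetical choices made for illustration, not part of the answer. The sketch releases the mutex in a finally block and treats an abandoned mutex (a previous owner exited without releasing it) as acquired, which is how .NET reports that situation.

```csharp
using System;
using System.IO;
using System.Threading;

public static class CrossProcessFile
{
    // Hypothetical name; every cooperating process must use the same string.
    // The "Global\" prefix makes the mutex visible across Windows sessions.
    private const string MutexName = @"Global\MyCompany.MyApp.AbcFile";

    public static void AppendLine(string path, string line)
    {
        using (var mutex = new Mutex(false, MutexName))
        {
            bool acquired = false;
            try
            {
                try
                {
                    acquired = mutex.WaitOne(TimeSpan.FromSeconds(10));
                }
                catch (AbandonedMutexException)
                {
                    // A previous owner exited without releasing; this thread now owns the mutex.
                    acquired = true;
                }

                if (!acquired)
                {
                    throw new TimeoutException("Could not acquire " + MutexName);
                }

                File.AppendAllText(path, line + Environment.NewLine);
            }
            finally
            {
                // Only release what this thread actually owns.
                if (acquired)
                {
                    mutex.ReleaseMutex();
                }
            }
        }
    }
}
```

A mutex is owned by a thread, so WaitOne and ReleaseMutex must run on the same thread; that is one reason this sketch keeps both calls inside a single synchronous method.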

No, it won't solve the issue because setting the boolean's value and checking it is not an atomic function and is thus not thread safe. You have to use either a Mutex or a Monitor object.
Check this link for more info: Monitor vs Mutex in c#

No, it will not -- at least, the code you have pasted will not accomplish any sort of meaningful process synchronization.
If you want a more detailed and helpful answer, you are going to need to be more specific about what you are doing.

You must come up with some kind of cross-process synchronization scheme - any locking mechanism you use in that code is irrelevant if you're trying to prevent collisions between two processes as opposed to two threads running on the same process.

A good way to do locking across processes like this is to use a file. First process in creates a file and opens it with exclusive access, and then deletes it when its done. The second process in will either see that the file exists and have to wait till it doesn't or it will fail when attempting to open the file exclusively.
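The lock-file idea above could be sketched like this. It is an illustration under assumptions: the class and method names, the 100 ms polling interval, and the use of FileOptions.DeleteOnClose to remove the file on release are my choices, not something the answer specifies.

```csharp
using System;
using System.IO;
using System.Threading;

public static class LockFile
{
    // Tries to become the exclusive owner of lockPath until the timeout expires.
    // Returns the open stream (dispose it to release the lock) or null on timeout.
    public static FileStream TryAcquire(string lockPath, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            try
            {
                // FileShare.None: no other handle can be opened while this one is held.
                // DeleteOnClose: the file is removed when the stream is disposed.
                return new FileStream(lockPath, FileMode.OpenOrCreate, FileAccess.ReadWrite,
                                      FileShare.None, 4096, FileOptions.DeleteOnClose);
            }
            catch (IOException)
            {
                // Someone else holds the lock; give up at the deadline, otherwise poll again.
                if (DateTime.UtcNow >= deadline) return null;
                Thread.Sleep(100);
            }
        }
    }
}
```

A caller would wrap the protected work in using (var handle = LockFile.TryAcquire(...)) and check for null first. One property of this approach is that the operating system closes the handle if the owning process dies, so a crashed process does not hold the lock forever.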

No, 'flag' is local to the scope of the method, which is local to the scope of the thread. In other words, each thread gets its own copy, which always starts out false.
This is what the lock statement is for. Use it like this:
In your class, declare a private object called gothread.
In your method, write it like this:
lock(gothread)
{
// put your code in here, one thread will not be able to enter when another thread is already
// in here
}

C# - Locking issues with Mutex

I've got a web application that controls which web applications get served traffic from our load balancer. The web application runs on each individual server.
It keeps track of the "in or out" state for each application in an object in the ASP.NET application state, and the object is serialized to a file on the disk whenever the state is changed. The state is deserialized from the file when the web application starts.
While the site itself only gets a couple of requests a second tops, and the file is rarely accessed, I've found that it was extremely easy for some reason to get collisions while attempting to read from or write to the file. This mechanism needs to be extremely reliable, because we have an automated system that regularly does rolling deployments to the server.
Before anyone makes any comments questioning the prudence of any of the above, allow me to simply say that explaining the reasoning behind it would make this post much longer than it already is, so I'd like to avoid moving mountains.
That said, the code that I use to control access to the file looks like this:
internal static Mutex _lock = null;
/// <summary>Executes the specified <see cref="Func{FileStream, Object}" /> delegate on
/// the filesystem copy of the <see cref="ServerState" />.
/// The work done on the file is wrapped in a lock statement to ensure there are no
/// locking collisions caused by attempting to save and load the file simultaneously
/// from separate requests.
/// </summary>
/// <param name="action">The logic to be executed on the
/// <see cref="ServerState" /> file.</param>
/// <returns>An object containing any result data returned by <param name="func" />.
///</returns>
private static Boolean InvokeOnFile(Func<FileStream, Object> func, out Object result)
{
var l = new Logger();
if (ServerState._lock.WaitOne(1500, false))
{
l.LogInformation( "Got lock to read/write file-based server state."
, (Int32)VipEvent.GotStateLock);
var fileStream = File.Open( ServerState.PATH, FileMode.OpenOrCreate
, FileAccess.ReadWrite, FileShare.None);
result = func.Invoke(fileStream);
fileStream.Close();
fileStream.Dispose();
fileStream = null;
ServerState._lock.ReleaseMutex();
l.LogInformation( "Released state file lock."
, (Int32)VipEvent.ReleasedStateLock);
return true;
}
else
{
l.LogWarning( "Could not get a lock to access the file-based server state."
, (Int32)VipEvent.CouldNotGetStateLock);
result = null;
return false;
}
}
This usually works, but occasionally I cannot get access to the mutex (I see the "Could not get a lock" event in the log). I cannot reproduce this locally - it only happens on my production servers (Win Server 2k3/IIS 6). If I remove the timeout, the application hangs indefinitely (race condition??), including on subsequent requests.
When I do get the errors, looking at the event log tells me that the mutex lock was achieved and released by the previous request before the error was logged.
The mutex is instantiated in the Application_Start event. I get the same results when it is instantiated statically in the declaration.
Excuses, excuses: threading/locking is not my forté, as I generally don't have to worry about it.
Any suggestions as to why it randomly would fail to get a signal?
Update:
I've added proper error handling (how embarrassing!), but I am still getting the same errors - and for the record, unhandled exceptions were never the problem.
Only one process would ever be accessing the file - I don't use a web garden for this application's web pool, and no other applications use the file. The only exception I can think of would be when the app pool recycles, and the old WP is still open when the new one is created - but I can tell from watching the task manager that the issue occurs while there is only one worker process.
@mmr: How is using Monitor any different from using a Mutex? Based on the MSDN documentation, it looks like it is effectively doing the same thing - and if I can't get the lock with my Mutex, it does fail gracefully by just returning false.
Another thing to note: The issues I'm having seem to be completely random - if it fails on one request, it might work fine on the next. There doesn't seem to be a pattern, either (certainly no every other, at least).
Update 2:
This lock is not used for any other call. The only time _lock is referenced outside the InvokeOnFile method is when it is instantiated.
The Func that is invoked is either reading from the file and deserializing into an object, or serializing an object and writing it to the file. Neither operation is done on a separate thread.
ServerState.PATH is a static readonly field, which I don't expect would cause any concurrency problems.
I'd also like to re-iterate my earlier point that I cannot reproduce this locally (in Cassini).
Lessons learned:
Use proper error handling (duh!)
Use the right tool for the job (and have a basic understanding of what/how that tool does). As sambo points out, using a Mutex apparently has a lot of overhead, which was causing issues in my application, whereas Monitor is designed specifically for .NET.

You should only be using Mutexes if you need cross-process synchronization.
Although a mutex can be used for intra-process thread synchronization, using Monitor is generally preferred, because monitors were designed specifically for the .NET Framework and therefore make better use of resources. In contrast, the Mutex class is a wrapper to a Win32 construct. While it is more powerful than a monitor, a mutex requires interop transitions that are more computationally expensive than those required by the Monitor class.
If you need to support inter-process locking you need a Global mutex.
The pattern being used is incredibly fragile: there is no exception handling and you are not ensuring that your Mutex is released. That is really risky code and most likely the reason you see these hangs when there is no timeout.
Also, if your file operation ever takes longer than 1.5 seconds, then there is a chance that concurrent callers will not be able to grab the Mutex. I would recommend getting the locking right and avoiding the timeout.
I think it's best to rewrite this to use a lock. Also, it looks like you are calling out to another method; if this takes forever, the lock will be held forever. That's pretty risky.
This is both shorter and much safer:
// If you want timeout support, use Monitor.TryEnter instead of lock:
// if (Monitor.TryEnter(m_syncObj, 2000)) { try { /* work */ } finally { Monitor.Exit(m_syncObj); } }
lock(m_syncObj)
{
l.LogInformation( "Got lock to read/write file-based server state."
, (Int32)VipEvent.GotStateLock);
using (var fileStream = File.Open( ServerState.PATH, FileMode.OpenOrCreate
, FileAccess.ReadWrite, FileShare.None))
{
// the line below is risky, what will happen if the call to invoke
// never returns?
result = func.Invoke(fileStream);
}
}
l.LogInformation("Released state file lock.", (Int32)VipEvent.ReleasedStateLock);
return true;
// Note: exceptions may leak out of this method. Either handle them here
// or in the calling method.
// For example, the file access may fail or func.Invoke may fail.
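The timeout variant mentioned in the comments of the snippet above can be written out as a small helper. This is a sketch under assumptions: TimedLock, TryRun, and SyncObj are hypothetical names. The point it illustrates is that Monitor.Exit must only be called when the lock was actually taken, which the ref bool overload of Monitor.TryEnter makes easy to track.

```csharp
using System;
using System.Threading;

public static class TimedLock
{
    private static readonly object SyncObj = new object();

    // Runs action under the lock; returns false if the lock was not obtained in time.
    public static bool TryRun(Action action, int timeoutMs)
    {
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(SyncObj, timeoutMs, ref lockTaken);
            if (!lockTaken)
            {
                return false;
            }
            action();
            return true;
        }
        finally
        {
            // Only exit the monitor if this thread actually entered it.
            if (lockTaken)
            {
                Monitor.Exit(SyncObj);
            }
        }
    }
}
```

Exceptions thrown by action still propagate to the caller, but the finally block releases the lock first, so a failed file operation cannot leave the lock held.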

If some of the file operations fail, the lock will not be released. Most probably that is the case. Put the file operations in a try/catch block, and release the lock in the finally block.
Anyway, if you read the file in your Global.asax Application_Start method, this will ensure that no one else is working on it (you said that the file is read on application start, right?). To avoid collisions on application pool restarts, etc., you can just try to read the file (assuming that the write operation takes an exclusive lock), and then wait 1 second and retry if an exception is thrown.
Now, you have the problem of synchronizing the writes. Whatever method decides to change the file should take care not to invoke a write operation if another one is in progress, using a simple lock statement.

I see a couple of potential issues here.
Edit for Update 2: If the function is a simple serialize/deserialize combination, I'd separate the two out into two different functions, one into a 'serialize' function, and one into a 'deserialize' function. They really are two different tasks. You can then have different, lock-specific tasks. Invoke is nifty, but I've gotten into lots of trouble myself going for 'nifty' over 'working'.
1) Is your LogInformation function locking? Because you call it inside the mutex first, and then once you release the mutex. So if there's a lock to write to the log file/structure, then you can end up with your race condition there. To avoid that, put the log inside the lock.
2) Check out using the Monitor class, which I know works in C# and I'd assume works in ASP.NET. For that, you can just simply try to get the lock, and fail gracefully otherwise. One way to use this is to just keep trying to get the lock. (Edit for why: see here; basically, a mutex is across processes, the Monitor is in just one process, but was designed for .NET and so is preferred. No other real explanation is given by the docs.)
3) What happens if the filestream opening fails, because someone else has the lock? That would throw an exception, and that could cause this code to behave badly (i.e., the lock is still held by the thread that threw the exception, and no other thread can get at it).
4) What about the func itself? Does that start another thread, or is it entirely within the one thread? What about accessing ServerState.PATH?
5) What other functions can access ServerState._lock? I prefer to have each function that requires a lock get its own lock, to avoid race/deadlock conditions. If you have many, many threads, and each of them tries to lock on the same object but for totally different tasks, then you could end up with deadlocks and races without any really easily understandable reason. I've changed the code to reflect that idea, rather than using some global lock. (I realize other people suggest a global lock; I really don't like that idea, because of the possibility of other things grabbing it for some task that is not this task).
// Must be static, because it is used from a static method.
private static readonly Object MyLock = new Object();

private static Boolean InvokeOnFile(Func<FileStream, Object> func, out Object result)
{
    var l = new Logger();
    Boolean success = false;
    result = null; // an out parameter must be assigned on every path
    if (Monitor.TryEnter(MyLock, 1500))
    {
        try
        {
            l.LogInformation("Got lock to read/write file-based server state.", (Int32)VipEvent.GotStateLock);
            using (var fileStream = File.Open(ServerState.PATH, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
            {
                result = func.Invoke(fileStream);
            } // 'using' means avoiding the dispose/close requirements
            success = true;
        }
        catch
        {
            // your filestream access failed
            l.LogInformation("File access failed.", (Int32)VipEvent.ReleasedStateLock);
        }
        finally
        {
            l.LogInformation("About to release state file lock.", (Int32)VipEvent.ReleasedStateLock);
            Monitor.Exit(MyLock); // gets you out of the lock you've got
        }
    }
    else
    {
        // l.LogWarning("Could not get a lock to access the file-based server state.", (Int32)VipEvent.CouldNotGetStateLock);
        // If the lock doesn't show in the log, then it wasn't gotten; again, if your logger is locking, then you could have some issues here.
    }
    return success;
}
