Writing to file system. How to lock effectively

I'm writing a StringBuilder to a file:
using (FileStream file = new FileStream(Filepath, FileMode.Append, FileAccess.Write, FileShare.Read))
using (StreamWriter writer = new StreamWriter(file, Encoding.Unicode))
{
    writer.Write(text.ToString());
}
This is equivalent (I think):
File.AppendAllText(Filepath, text.ToString());
Obviously, in a multithreaded environment these statements on their own would fail to write whenever they collided.
I've put a lock on this code, but that isn't ideal: it's too expensive and may exacerbate this bottleneck. Is there some other way of making one thread's file access block another's? I've been told "blocking, not locking". I thought lock did block, but they must be hinting at a cheaper way of preventing simultaneous use of the file system.
How do I block execution in a less time-expensive manner?

You can't have multiple threads write to the same file simultaneously, so there is no such "bottleneck". A lock makes perfect sense for this scenario. If you are concerned about this being expensive, just add the writes to a queue and let a single thread manage writing them to the file.
Pseudo code
public static readonly object logsLock = new object();
public static readonly List<string> logs = new List<string>();
// any thread
lock (logsLock)
{
    logs.Add(stringBuilderText);
}
// dedicated thread to writing
lock (logsLock)
{
    // ideally, this should be a "get in, get out" situation,
    // where you only need to make a copy of the logs, then exit the lock,
    // then write them, then lock the logsLock again, and remove only the logs
    // you successfully wrote to file, then exit the lock again.
    logs.ForEach(writeLogToFile);
}
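The "get in, get out" idea from the comments could look roughly like the sketch below. The names BufferedLog, Enqueue and Flush are made up for illustration and are not from the original answer. Note one simplification: the pending list is swapped out under the lock instead of being copied and trimmed afterwards, so a failed write loses that batch unless you re-queue it yourself.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BufferedLog
{
    private static readonly object LogsLock = new object();
    private static List<string> pending = new List<string>();

    // Called from any thread: only a cheap in-memory append happens under the lock.
    public static void Enqueue(string text)
    {
        lock (LogsLock)
        {
            pending.Add(text);
        }
    }

    // Called from the single writer thread: take the whole batch under the lock,
    // then do the slow file I/O after the lock has been released.
    // Returns the number of entries written.
    public static int Flush(string path)
    {
        List<string> batch;
        lock (LogsLock)
        {
            if (pending.Count == 0)
            {
                return 0;
            }
            batch = pending;
            pending = new List<string>();
        }

        var sb = new StringBuilder();
        foreach (string entry in batch)
        {
            sb.Append(entry);
        }
        File.AppendAllText(path, sb.ToString());
        return batch.Count;
    }
}
```

Producers then only ever wait for a list append, never for the disk.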

You can lock the stream using the FileStream.Lock method, which prevents other processes from reading from or writing to a range of the file.
http://msdn.microsoft.com/en-us/library/system.io.filestream.lock.aspx

Related

Is ConcurrentDictionary safe to use?

I have three separate codes running in separate threads.
Thread task 1: Reading data from a device and writing it into a ConcurrentDictionary.
Thread task 2: Writes the data in the ConcurrentDictionary to the computer as a separate file.
I have read many posts on the forum saying that ConcurrentDictionary is safe to use from separate threads. I've also read that lockout situations can occur. In fact, it left me with more than one question.
Is locking needed with ConcurrentDictionary? In which cases is locking required? How should I do it if locking is required? What problems does using it in the following way cause?
Thread code 1: Data comes in every second.
public void FillModuleBuffer(byte[] buffer, string IpPort)
{
if (!CommunicationDictionary.dataLogList.ContainsKey(IpPort))
{
CommunicationDictionary.dataLogList.TryAdd(IpPort, buffer);
}
}
Thread code 2: The code below works in a timer. The timer duration is 200 ms.
if (CommunicationDictionary.dataLogList.ContainsKey(IpPort))
{
using (stream = File.Open(LogFilename, FileMode.Append, FileAccess.Write))
{
using (BinaryWriter writer = new BinaryWriter(stream))
{
writer.Write(CommunicationDictionary.dataLogList[IpPort]);
writer.Flush();
writer.Close();
CommunicationDictionary.dataLogList.TryRemove(IpPort,out _);
}
}
}
Note: the code has been simplified for clarity.
Note 2: I used Dictionary before that. I encountered a very different problem. While active, after 2-3 hours, I got the error that the array was out of index range even though there was no data in the Dictionary.

The example code should be more or less thread-safe, but it shows a misunderstanding of how the concurrent dictionary should be used. For example:
if (!CommunicationDictionary.dataLogList.ContainsKey(IpPort))
{
CommunicationDictionary.dataLogList.TryAdd(IpPort, buffer);
}
This happens to work because there is only one thread that adds to the dictionary, but since there are separate statements the dictionary may change between them. If you look at the documentation for TryAdd you can see that it will return false if the key is already present. So no need for the ContainsKey. There are quite a few different methods with the purpose of doing multiple things at the same time, to ensure the entire operation is atomic.
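As a rough illustration of those combined operations, here is a small sketch. The LatestBuffers class and its method names are hypothetical; TryAdd, AddOrUpdate and TryRemove are the actual ConcurrentDictionary methods. Each wrapper is a single call, so there is no gap between a check and the action for another thread to slip into.

```csharp
using System.Collections.Concurrent;

public static class LatestBuffers
{
    private static readonly ConcurrentDictionary<string, byte[]> Buffers =
        new ConcurrentDictionary<string, byte[]>();

    // Adds the buffer only if the key is absent; returns false if it was already there.
    public static bool StoreIfAbsent(string ipPort, byte[] buffer)
        => Buffers.TryAdd(ipPort, buffer);

    // Adds the buffer, or replaces the existing one, in a single call.
    public static void StoreLatest(string ipPort, byte[] buffer)
        => Buffers.AddOrUpdate(ipPort, buffer, (key, old) => buffer);

    // Removes the entry and hands back its value in a single call.
    public static bool TryTake(string ipPort, out byte[] buffer)
        => Buffers.TryRemove(ipPort, out buffer);
}
```

Whether StoreIfAbsent or StoreLatest is the right choice depends on whether you want to keep the oldest unsaved buffer or the newest one.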
Same with the reading thread. All accesses to the ConcurrentDictionary should be replaced with one call to TryRemove:
if (CommunicationDictionary.dataLogList.TryRemove(IpPort, out var data))
{
    using (var stream = File.Open(LogFilename, FileMode.Append, FileAccess.Write))
    using (var writer = new BinaryWriter(stream))
    {
        // Disposing the writer flushes and closes it, so no explicit Flush/Close is needed.
        writer.Write(data);
    }
}
Note that this will save some data chunks and throw away others, without any hard guarantee about which chunks will be saved. This might be the intended behavior, but it would be more common to use a queue that ensures that all data is saved. A typical implementation would wrap a ConcurrentQueue in a BlockingCollection with one or more producing threads and one consuming thread. This avoids the need for a separate timer.
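A minimal sketch of that producer/consumer arrangement might look like this. QueuedFileWriter is a made-up name, and keeping the file open for the lifetime of the writer is my assumption rather than something the question requires.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class QueuedFileWriter : IDisposable
{
    private readonly BlockingCollection<byte[]> queue =
        new BlockingCollection<byte[]>(new ConcurrentQueue<byte[]>());
    private readonly Task consumer;

    public QueuedFileWriter(string path)
    {
        // Single consumer: the only code that ever touches the file.
        consumer = Task.Factory.StartNew(() =>
        {
            using (var stream = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read))
            {
                // Blocks while the queue is empty; ends after CompleteAdding once it is drained.
                foreach (byte[] chunk in queue.GetConsumingEnumerable())
                {
                    stream.Write(chunk, 0, chunk.Length);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Safe to call from any number of producer threads.
    public void Enqueue(byte[] chunk) => queue.Add(chunk);

    public void Dispose()
    {
        queue.CompleteAdding(); // no more items will arrive
        consumer.Wait();        // let the consumer drain what is left
        queue.Dispose();
    }
}
```

The device-reading thread would call Enqueue for every buffer, and nothing is dropped because each chunk stays queued until it has been written.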

Simultaneously write to a file from multiple processes [duplicate]

I have a text file ABC.txt which will be read and written by multiple processes. So when one process is reading from or writing to ABC.txt, the file must be locked so that no other process can read from or write to it. I know the enum System.IO.FileShare may be the right way to handle this problem, but I used another way and I'm not sure whether it is right. The following is my solution.
I added another file, Lock.txt, to the folder. Before I can read from or write to ABC.txt, I must be able to open Lock.txt for reading. After I have read from or written to ABC.txt, I have to release it. The following is the code.
#region Enter the lock
FileStream lockFileStream = null;
bool lockEntered = false;
while (lockEntered == false)
{
try
{
lockFileStream = File.Open("Lock.txt", FileMode.Open, FileAccess.Read, FileShare.None);
lockEntered = true;
}
catch (Exception)
{
Thread.Sleep(500);
}
}
#endregion
#region Do the work
// Read from or write to File ABC.txt
// Read from or write to other files
#endregion
#region Release the lock
try
{
if (lockFileStream != null)
{
lockFileStream.Dispose();
}
}
catch
{
}
#endregion
On my computer this solution seems to work well, but I still can't be sure whether it is appropriate.
Edit: Multi processes, not multi threads in the same process.

C#'s named EventWaitHandle is the way to go here. Create an instance of wait handle in every process which wants to use that file and give it a name which is shared by all such processes.
EventWaitHandle waitHandle = new EventWaitHandle(true, EventResetMode.AutoReset, "SHARED_BY_ALL_PROCESSES");
Then when accessing the file wait on waitHandle and when finished processing file, set it so the next process in the queue may access it.
waitHandle.WaitOne();
/* process file*/
waitHandle.Set();
When you name an event wait handle then that name is shared across all processes in the operating system. Therefore in order to avoid possibility of collisions, use a guid for name ("SHARED_BY_ALL_PROCESSES" above).

A mutex in C# may be shared across multiple processes. Here is an example for multiple processes writing to a single file:
using (var mutex = new Mutex(false, "Strand www.jakemdrew.com"))
{
    mutex.WaitOne();
    try
    {
        File.AppendAllText(outputFilePath, theFileText);
    }
    finally
    {
        // Release even if the write throws, so other processes are not blocked.
        mutex.ReleaseMutex();
    }
}
You need to make sure that the mutex is given a unique name that will be shared across the entire system.
Additional reading here:
http://www.albahari.com/threading/part2.aspx#_Mutex
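One thing the short example does not cover is what happens when another process dies while holding the mutex, or holds it for too long. The sketch below is one way to handle that, under my own assumptions: CrossProcessAppend and TryAppend are hypothetical names, and treating an abandoned mutex as "acquired" is a judgment call that is reasonable for append-only logging but may not be for other data.

```csharp
using System;
using System.IO;
using System.Threading;

public static class CrossProcessAppend
{
    // Returns false if the mutex could not be acquired within the timeout.
    public static bool TryAppend(string mutexName, string path, string text, TimeSpan timeout)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool owned;
            try
            {
                owned = mutex.WaitOne(timeout);
            }
            catch (AbandonedMutexException)
            {
                // A previous owner exited without releasing; this thread now owns the mutex.
                owned = true;
            }

            if (!owned)
            {
                return false;
            }

            try
            {
                File.AppendAllText(path, text);
                return true;
            }
            finally
            {
                // Must be released by the same thread that acquired it.
                mutex.ReleaseMutex();
            }
        }
    }
}
```

Every process has to pass the same mutexName for this to serialize their writes.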

Your solution is error prone. You've basically implemented double-checked locking (http://en.wikipedia.org/wiki/Double-checked_locking) which can be very unsafe.
A better solution would be to introduce thread isolation, whereby only one thread ever accesses the file, and does so by reading from a queue on which other threads place their read or write requests (the queue itself being protected by mutually exclusive access, of course). Alternatively, the threads can synchronize themselves, either with synchronization devices (lock sections, mutexes, whatever) or with some other file access logic (for example, System.IO.FileShare came up in a few responses here).

If it was me, I would install something like SQL Server Compact Edition for reading/writing this data.
However, if you want to be able to lock access to a resource that is shared between multiple processes, you need to use a Mutex or a Semaphore.
The Mutex class is a .NET wrapper around an OS-level locking mechanism.
Overview of Synchronization Primitives

Prevent Reading and Writing to a File at the Same Time

I have a process that needs to read and write to a file. The application has a specific order to its reads and writes and I want to preserve this order. What I would like to do is implement something that lets the first operation start and makes the second operation wait until the first is done, with a first-come, first-served kind of queue for access to the file. From what I have read, file locking seems like it might be what I am looking for, but I have not been able to find a very good example. Can anyone provide one?
Currently I am using a TextReader/Writer with .Synchronized but this is not doing what I hoped it would.
Sorry if this is a very basic question, threading gives me a headache :S

It should be as simple as this:
public static readonly object LockObj = new object();
public void AnOperation()
{
    lock (LockObj)
    {
        // File.Open requires a FileMode argument.
        using (var fs = File.Open("yourfile.bin", FileMode.Open))
        {
            // do something with file
        }
    }
}
public void SomeOperation()
{
    lock (LockObj)
    {
        using (var fs = File.Open("yourfile.bin", FileMode.Open))
        {
            // do something else with file
        }
    }
}
Basically, define a lock object, then whenever you need to do something with your file, make sure you get a lock using the C# lock keyword. On reaching the lock statement, execution will block indefinitely until a lock has been obtained.
There are other constructs you can use for locking, but I find the lock keyword to be the most straightforward.

If you're using a current version of the .Net Framework, you can benefit from Task.ContinueWith.
If your units of work are logically always, "read some, then write some", the following expresses that intent succinctly and should scale:
string path = "file.dat";
// Start a reader task
var task = Task.Factory.StartNew(() => ReadFromFile(path));
// Continue with a writer task
task.ContinueWith(tt => WriteToFile(path));
// We're guaranteed that the read will occur before the write
// and that the write will occur once the read completes.
// We also can check the antecedent task's result (tt.Result in our
// example) for any special error logic we need.

Create a file only if it doesn't exist

I want to create a file ONLY if it doesn't already exist.
Code like:
if (!File.Exists(fileName))
{
    FileStream fs = File.Create(fileName);
}
leaves it open to a race condition if the file is created between the "if" and the "create".
How can I avoid it?
EDIT:
Locks can't be used here because the contenders are different processes (multiple instances of the same application).

You can also use
FileStream fs = new FileStream(fileName, FileMode.OpenOrCreate);
However, you should look into thread locking as if more than one thread tries to access the file you'll probably get an exception.

Kristian Fenn's answer was almost what I needed, just with a different FileMode. This is what I was looking for:
FileStream fs = new FileStream(fileName, FileMode.CreateNew);
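For anyone wondering how that behaves when the file is already there: FileMode.CreateNew throws an IOException in that case, so the "check" and the "create" become one step. A small sketch of how it could be wrapped (AtomicCreate and TryCreate are names I made up; the exception filter assumes that any IOException raised while the file exists means another instance got there first):

```csharp
using System.IO;

public static class AtomicCreate
{
    // Returns true if this call created the file, false if it already existed.
    public static bool TryCreate(string fileName)
    {
        try
        {
            // CreateNew fails instead of opening or truncating an existing file.
            using (new FileStream(fileName, FileMode.CreateNew, FileAccess.Write, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException) when (File.Exists(fileName))
        {
            // Someone else created it first; other I/O errors still propagate.
            return false;
        }
    }
}
```

Only one of several competing processes gets true back, so that one can go on to initialize the file.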

Is this not a better solution? Also notice the using (var stream ...): use it to close the stream and avoid IO exceptions.
if (!File.Exists(filePath))
{
using (var stream = File.Create(filePath)) { }
}

If the contending attempts to create the file are in the same process, you can use a lock statement around your code to prevent contention.
If not, you may occasionally get an exception when you call File.Create. Just appropriately handle that exception. Checking whether the file exists before creating is probably advisable even if you are handling an exception when the file does exist because a thrown exception is relatively expensive. It would not be advisable only if the probability of the race condition is low.

First, you can use lock, Monitor.Enter, or Monitor.TryEnter to protect that portion of the code.
Second, you can use the FileStream API with FileMode.OpenOrCreate. If the file exists it is opened; otherwise it is created.

File Locking (Read/Write) in ASP.NET Application

I have two ASP.NET web applications. One is responsible for processing some info and writing to a log file, and the other is responsible for reading the log file and displaying the information based on user request.
Here's my code for the Writer
public static void WriteLog(String PathToLogFile, String Message)
{
Mutex FileLock = new Mutex(false, "LogFileMutex");
try
{
FileLock.WaitOne();
using (StreamWriter sw = File.AppendText(PathToLogFile))
{
sw.WriteLine(Message);
sw.Close();
}
}
catch (Exception ex)
{
LogUtil.WriteToSystemLog(ex);
}
finally
{
FileLock.ReleaseMutex();
}
}
And here's my code for the Reader :
private String ReadLog(String PathToLogFile)
{
FileStream fs = new FileStream(
PathToLogFile, FileMode.Open,
FileAccess.Read, FileShare.ReadWrite);
StreamReader Reader = new StreamReader(fs);
return Reader.ReadToEnd();
}
My question: is the above code enough to prevent locking in a web garden environment?
EDIT 1 : Dirty read is okay.
EDIT 2 : Creating Mutex with new Mutex(false, "LogFileMutex"), closing StreamWriter

Sounds like you're trying to implement a basic queue. Why not use a queue that gives you guaranteed availability? You could drop the messages into MSMQ, then implement a Windows service which reads from the queue and pushes the messages to the DB. If writing to the DB fails, you simply leave the message on the queue (although you will want to handle poison messages, so that a failure caused by bad data doesn't put you in an infinite loop).
This will get rid of all locking concerns and give you guaranteed delivery to your reader...

You should also be disposing of your mutex, as it derives from WaitHandle, and WaitHandle implements IDisposable:
using (Mutex FileLock = new Mutex(true, "LogFileMutex"))
{
// ...
}
Also, perhaps consider a more distinctive name (a GUID, perhaps) than "LogFileMutex", since another unrelated process could inadvertently use the same name.

Doing this in a web-based environment, you are going to have a lot of issues with file locks. Can you change this to use a database instead?
Most hosting solutions allow SQL databases of up to 250 MB.
Not only will a database help with the locking issues, it will also let you purge older data more easily. After a while, that log read is going to get really slow.

No, it won't. First, you're creating a brand new mutex with every call, so multiple threads are going to access the writing critical section. Second, you don't even use the mutex in the reading critical section, so one thread could be attempting to read the file while another is attempting to write. Also, you're not closing the stream in the ReadLog method, so once the first read request comes through, your app won't be able to write any log entries until garbage collection comes along and closes the stream for you... which could take a while.
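On the last point, a sketch of a reader that releases the file handle as soon as it is done could look like this. It keeps the FileShare.ReadWrite from the question, since dirty reads were said to be acceptable, and it does not add the mutex; the LogReader class name is mine.

```csharp
using System.IO;

public static class LogReader
{
    public static string ReadLog(string pathToLogFile)
    {
        // FileShare.ReadWrite lets the writer keep appending while we read.
        using (var fs = new FileStream(pathToLogFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(fs))
        {
            // Both the reader and the stream are disposed when this block exits.
            return reader.ReadToEnd();
        }
    }
}
```

With the using blocks in place the handle is closed deterministically instead of whenever the garbage collector gets to it.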
