-What is the most foolproof way of ensuring the folder or file I want to manipulate is accessible (not read-only)?
-I know I can use ACL to add/set entries (make the file/folder non-readonly), but how would I know if I need to use security permissions to ensure file access? Or can I just add this in as an extra measure and handle the exception/negative scenario?
-How do I know when to close or just flush a stream? For example, should I try to use the streams once in a method and then flush/close/dispose at the end? If I use Dispose(), do I still need to call Flush() and Close() explicitly?
I ask this question because ensuring a file is available is a core requirement yet difficult to guarantee, so some tips on the design of my code would be good.
Thanks
There is no way to guarantee access to a file. I know this isn't a popular response but it's 100% true. You can never guarantee access to a file even if you have an exclusive non-sharing open on a Win32 machine.
There are too many ways this can fail that you simply cannot control. The classic example is a file opened over the network. Open it any way you'd like with any account, I'll simply walk over and yank the network cable. This will kill your access to the file.
I'm not saying this to be mean or arrogant. I'm saying this to make sure that people understand that operating on the file system is a very dangerous operation. You must accept that the operation can and will fail. It's imperative that you have a fallback scenario for any operation that touches disk.
-What is the most foolproof way of ensuring the folder or file I want to manipulate is accessible (not read-only)?
Opening them in write-mode?
Try to write a new file into the folder and catch any exceptions. Along with that, do the normal sanity checks, like verifying the folder/file exists, etc.
You should never change the folder security in code, as the environment could drastically change and cause major headaches. Rather, ensure that the security is well documented and configured beforehand. Alternatively, use impersonation in your own code to ensure you are always running the required code as a user with full permissions to the folder/file.
Never call Dispose() unless you have no other choice. You always flush before closing the file or when you want to commit the content of the stream to the file/disk. The choice of when to do it depends on the amount of data that needs to be written and the time involved in writing the data.
100% foolproof way to ensure a folder is writable - create a file, close it, verify it is there, then delete it. A little tedious, but you asked for foolproof =)
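A rough sketch of that probe, assuming a hypothetical CanWriteTo helper (note the inherent race: the folder could become read-only right after the check succeeds):
using System;
using System.IO;

static bool CanWriteTo(string folder)
{
    // Probe file name is arbitrary; a GUID avoids clobbering real files
    string probe = Path.Combine(folder, Guid.NewGuid().ToString("N") + ".tmp");
    try
    {
        using (File.Create(probe)) { }      // create and immediately close
        bool exists = File.Exists(probe);   // verify it is there
        File.Delete(probe);                 // clean up
        return exists;
    }
    catch (Exception)                       // IOException, UnauthorizedAccessException, ...
    {
        return false;
    }
}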
Your better bet, which covers your question about ACL, is to handle the various exceptions if you cannot write to a file.
Also, I always call Close explicitly unless I need to read from a file before I'm done writing it (in which case I call flush then close).
Flush() - Synchronizes the in-memory buffer with the disk. Call when you want to write the buffer to the disk but keep the file open for further use.
Dispose(bool) - Releases the unmanaged resource (i.e. the OS file handle) and, if passed true, also releases the managed resources.
Close() - Calls Dispose(true) on the object.
Also, Dispose flushes the data before closing the handle so there is no need to call flush explicitly (although it might be a good idea to be flushing frequently anyway, depending on the amount and type of data you're handling).
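To make the distinction concrete, a minimal sketch (the file name is a placeholder):
using System.IO;

var writer = new StreamWriter("data.txt");
writer.WriteLine("first batch");
writer.Flush();       // commit the buffer to disk, keep the file open
writer.WriteLine("second batch");
writer.Dispose();     // flushes any remaining data, then releases the handle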
If you're doing relatively atomic operations to files and don't need a long-running handle, the "using" paradigm is useful to ensure you are handling files properly, e.g.:
using (StreamReader reader = new StreamReader("filepath"))
{
// Do some stuff
} // CLR automagically handles flushing and releasing resources
I am using the following code to copy a remote file to the local machine. How do you detect failure in the operation when copying large files?
Is there any better approach to detect failure, apart from handling System.IO exceptions?
File.Copy(remoteSrcFile, dest);
Is this the best method offered by the framework to copy large files whose size is in the gigabyte range?
Is there any better approach to detect failure, apart from handling System.IO exceptions?
Many possible errors, such as an invalid file name, can be checked beforehand. If File.Copy fails, it will throw an exception. There is no other error indicator.
Is this the best method offered by the framework to copy large files whose size is in the gigabyte range?
It depends on what other features you want. For example, if you want to show progress, File.Copy will not help you, since it just wraps the CopyFile API. However, calling the Windows API CopyFileEx can provide progress. See Can I show file copy progress using FileInfo.CopyTo() in .NET? for more info.
Copy failures are supposed to be an exceptional circumstance, so failures are modelled as exceptions.
You should wrap File.Copy calls in a try/catch to catch any exceptions you are able to explicitly accommodate.
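For example, a sketch of that handling (which exceptions you catch depends on what you can actually recover from):
using System;
using System.IO;

try
{
    File.Copy(remoteSrcFile, dest);
}
catch (UnauthorizedAccessException)   // no permission on source or destination
{
    // report and abort, or retry with different credentials
}
catch (IOException)                   // destination exists, disk full, network failure, ...
{
    // report and abort, or retry after a delay
}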
File.Copy and FileInfo.CopyTo are the only things "included" in the framework. But you can use them in different ways. Maybe spawn a thread to invoke it. You could also use basic IO to read data from one file to another to get better progress/cancellation; but it depends on what you want to achieve.
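A rough sketch of that basic-IO approach, assuming a hypothetical CopyWithProgress helper (the buffer size and progress callback are arbitrary choices):
using System;
using System.IO;

static void CopyWithProgress(string source, string dest, Action<double> onProgress)
{
    byte[] buffer = new byte[1024 * 1024];   // 1 MB chunks

    using (var src = new FileStream(source, FileMode.Open, FileAccess.Read))
    using (var dst = new FileStream(dest, FileMode.Create, FileAccess.Write))
    {
        long total = src.Length;
        long copied = 0;
        int read;
        while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
        {
            dst.Write(buffer, 0, read);
            copied += read;
            onProgress(total == 0 ? 1.0 : (double)copied / total);   // 0.0 .. 1.0
        }
    }
}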
In my program I'm using
logWriter = File.CreateText(logFileName);
to store logs.
Should I call logWriter.Close() and where? Should it be in a finalizer or something?
The normal approach is to wrap File.CreateText in a using statement
using (var logWriter = File.CreateText(logFileName))
{
//do stuff with logWriter
}
However, this is inconvenient if you want logWriter to live for the duration of your app since you most likely won't want the using statement wrapping around your app's Main method.
In which case you must make sure that you call Dispose on logWriter before the app terminates, which is exactly what using does for you behind the scenes.
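One way to sketch that, assuming a console app (the log path is a placeholder):
using System.IO;

static StreamWriter logWriter;

static void Main(string[] args)
{
    logWriter = File.CreateText("app.log");
    try
    {
        // ... run the application, logging via logWriter ...
    }
    finally
    {
        logWriter.Dispose();   // flushes the buffer and closes the file
    }
}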
Yes, you should close your file when you're done with it. You can create a log class (or use an existing one like log4net), implement IDisposable, and release the resources inside the Dispose method.
You can wrap it with a using-block, but I would rather have it in a separate class. This way you can handle more advanced logging in the future; for instance, what happens when your application runs on multiple threads and they try to write to the file at the same time?
log4net can be configured to use a text file or a database, and it's easy to change if the application grows.
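A minimal sketch of such a wrapper class (FileLogger is a placeholder name; the lock makes concurrent writes from multiple threads within one process safe):
using System;
using System.IO;

public sealed class FileLogger : IDisposable
{
    private readonly StreamWriter _writer;
    private readonly object _sync = new object();

    public FileLogger(string path)
    {
        _writer = File.CreateText(path);
    }

    public void Log(string message)
    {
        lock (_sync)   // serialize writers within this process
        {
            _writer.WriteLine("{0:u} {1}", DateTime.UtcNow, message);
        }
    }

    public void Dispose()
    {
        lock (_sync)
        {
            _writer.Dispose();   // flush and release the file handle
        }
    }
}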
If you have a log file which you wish to keep open, then the OS will release the file as part of shutting the process down when the application exits. You do not actually have to manage this explicitly.
One issue with letting the OS clean up your file handle is that your file writer will use buffering and may need flushing before it will write out the remains of its buffer. If you do not call Close/Dispose on it you may lose information. One way of forcing a flush is to hook the AppDomain unload event, which will get called when your .NET process shuts down, e.g.:
AppDomain.CurrentDomain.DomainUnload += delegate { logWriter.Dispose(); };
There is a time limit on what can occur in a domain unload event handler, but writing the remains of a file writer buffer out is well within this. I am assuming you have a default setup i.e. one default AppDomain, otherwise things get tricky all round including logging.
If you are keeping your file open, consider opening it with access rights that will allow other processes to have read access. This will enable a program such as a file tailer or text editor to be used to read the file whilst your program is running.
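For example, a sketch of such an open (the path is a placeholder):
using System.IO;

var stream = new FileStream("app.log", FileMode.Append,
                            FileAccess.Write, FileShare.Read);
var logWriter = new StreamWriter(stream);
// other processes can now open the file for reading while we write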
I have a text file and multiple threads/processes will write to it (it's a log file).
The file gets corrupted sometimes because of concurrent writings.
I want to use a file-writing mode from all threads that is sequential at the file-system level itself.
I know it's possible to use locks (mutex for multiple processes) and synchronize writing to this file but I prefer to open the file in the correct mode and leave the task to System.IO.
Is it possible? What's the best practice for this scenario?
Your best bet is just to use locks/mutexes. It's a simple approach, it works, and you can easily understand it and reason about it.
When it comes to synchronization it often pays to start with the simplest solution that could work and only try to refine if you hit problems.
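For cross-process synchronization, a sketch using a named mutex (the mutex name, log path and WriteLog helper are arbitrary; every writing process must use the same mutex name):
using System;
using System.IO;
using System.Threading;

static readonly Mutex LogMutex = new Mutex(false, @"Global\MyAppLogMutex");

static void WriteLog(string line)
{
    LogMutex.WaitOne();            // blocks until no other process holds it
    try
    {
        File.AppendAllText("shared.log", line + Environment.NewLine);
    }
    finally
    {
        LogMutex.ReleaseMutex();
    }
}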
To my knowledge, Windows doesn't have what you're looking for. There is no file handle object that does automatic synchronization by blocking all other users while one is writing to the file.
If your logging involves the three steps, open file, write, close file, then you can have your threads try to open the file in exclusive mode (FileShare.None), catch the exception if unable to open, and then try again until success. I've found that tedious at best.
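For completeness, a sketch of that open-retry approach (AppendExclusive is a hypothetical helper; the retry count and delay are arbitrary):
using System;
using System.IO;
using System.Threading;

static void AppendExclusive(string path, string line)
{
    for (int attempt = 0; attempt < 10; attempt++)
    {
        try
        {
            using (var fs = new FileStream(path, FileMode.Append,
                                           FileAccess.Write, FileShare.None))
            using (var writer = new StreamWriter(fs))
            {
                writer.WriteLine(line);
                return;
            }
        }
        catch (IOException)        // another writer currently holds the file
        {
            Thread.Sleep(50);
        }
    }
    throw new IOException("Could not acquire the log file.");
}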
In my programs that log from multiple threads, I created a TextWriter descendant that is essentially a queue. Threads call the Write or WriteLine methods on that object, which formats the output and places it into a queue (using a BlockingCollection). A separate logging thread services that queue, pulling things from it and writing them to the log file. This has a few benefits (a simplified sketch follows the list):
Threads don't have to wait on each other in order to log
Only one thread is writing to the file
It's trivial to rotate logs (i.e. start a new log file every hour, etc.)
There's zero chance of an error because I forgot to do the locking on some thread
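A simplified sketch of the idea, using a plain class rather than a full TextWriter descendant (QueuedLogger is a placeholder name):
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class QueuedLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Task _worker;

    public QueuedLogger(string path)
    {
        // The worker is the only thread that ever touches the file
        _worker = Task.Run(() =>
        {
            using (var writer = File.AppendText(path))
            {
                foreach (string line in _queue.GetConsumingEnumerable())
                    writer.WriteLine(line);
            }
        });
    }

    public void WriteLine(string line)
    {
        _queue.Add(line);   // callers return immediately; no waiting on I/O
    }

    public void Dispose()
    {
        _queue.CompleteAdding();   // let the worker drain the queue and exit
        _worker.Wait();
    }
}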
Doing this across processes would be a lot more difficult. I've never even considered trying to share a log file across processes. Were I to need that, I would create a separate application (a logging service). That application would do the actual writes, with the other applications passing the strings to be written. Again, that ensures that I can't screw things up, and my code remains simple (i.e. no explicit locking code in the clients).
You might be able to use File.Open() with a FileShare value set to None, and make each thread wait if it can't get access to the file.
BACKGROUND:
I use an offset into a file and the FileStream Lock/Unlock methods to control read/write access. I am using the following code to test if a lock is currently held on the file:
try
{
    fs.Lock( RESERVED_BYTE, 1 );
    fs.Unlock( RESERVED_BYTE, 1 );
    rc = 1;
}
catch
{
    rc = 0;
}
QUESTION:
My goal is to eliminate the try/catch block. Is there some better way to see if the lock exists?
EDIT:
Note: This question is not about if the file exists. I already know it does. It is about synchronizing write access.
You can call the LockFile Windows API function through the P/Invoke layer directly. You would use the handle returned by the SafeFileHandle property on the FileStream.
Calling the API directly will allow you to check the return value for an error condition as opposed to resorting to catching an exception.
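A sketch of the declarations involved (IsRegionUnlocked is a hypothetical helper; pass your RESERVED_BYTE as the offset, and note that error details via Marshal.GetLastWin32Error are omitted for brevity):
using System;
using System.IO;
using System.Runtime.InteropServices;

[DllImport("kernel32.dll", SetLastError = true)]
static extern bool LockFile(SafeHandle hFile,
    uint dwFileOffsetLow, uint dwFileOffsetHigh,
    uint nNumberOfBytesToLockLow, uint nNumberOfBytesToLockHigh);

[DllImport("kernel32.dll", SetLastError = true)]
static extern bool UnlockFile(SafeHandle hFile,
    uint dwFileOffsetLow, uint dwFileOffsetHigh,
    uint nNumberOfBytesToUnlockLow, uint nNumberOfBytesToUnlockHigh);

static bool IsRegionUnlocked(FileStream fs, uint offset)
{
    if (!LockFile(fs.SafeFileHandle, offset, 0, 1, 0))
        return false;                 // another process holds the lock
    UnlockFile(fs.SafeFileHandle, offset, 0, 1, 0);
    return true;
}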
Noah asks if there is any overhead in making the call to the P/Invoke layer vs a try/catch.
FileStream.Lock makes the same call through the P/Invoke layer and throws an exception if the call to LockFile returns 0. In your case, you aren't throwing an exception. In the event the file is locked, you will take less time because you aren't dealing with a stack unwind.
The actual P/Invoke setup is around seven instructions I believe (for comparison, COM interop is about 40), but that point is moot, since your call to LockFile is doing the same thing that the managed method does (use the P/Invoke layer).
Personally I would just catch a locked file when trying to open it. If it's unlocked now, it may be locked when you try to open it (even if it's just a few ms later).
My goal is to eliminate the try/catch block
Remember, the file system is volatile: just because your file is in one state for one operation doesn't mean it will be in the same state for the next operation. You have to be able to handle exceptions from the file system.
In some circumstances you can also use WCT (Wait Chain Traversal). It's usually implemented by debuggers or profilers; however, it can be used from any code, as the usual debugger requirement of being the thread that has the debug port open is not a prerequisite. As such, WCT provides very comprehensive and precise information regarding lock contention.
A managed example (albeit somewhat tricky) shows the utility of this specific subset of the native debug APIs on the CLR.
I don't think it's possible without try/catch.
Say I want to be informed whenever a file copy is launched on my system and get the file name, the destination where it is being copied or moved and the time of copy.
Is this possible? How would you go about it? Should you hook the CopyFile API function?
Is there any software that already accomplishes this?
Windows has the concept of I/O filters, which allow you to intercept all I/O operations and choose to perform additional actions as a result. They are primarily used for antivirus-type scenarios but can be programmed for a wide variety of tasks. The SysInternals Process Monitor, for example, uses an I/O filter to see file-level access.
You can view your current filters using the MS Filter Manager (fltmc.exe from a command prompt).
There is a kit to help you write filters; you can get the drivers and develop your own.
http://www.microsoft.com/whdc/driver/filterdrv/default.mspx is a starting place to get in depth info
As there is a .NET tag on this question, I would simply use System.IO.FileSystemWatcher that's in the .NET Framework. I'm guessing it is implemented using the I/O Filters that Andrew mentions in his answer, but I really do not know (nor care, exactly). Would that fit your needs?
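A minimal sketch (the watched path is a placeholder; note this reports file-system changes, not the fact that a copy caused them):
using System;
using System.IO;

var watcher = new FileSystemWatcher(@"C:\watched")
{
    IncludeSubdirectories = true
};
watcher.Created += (s, e) =>
    Console.WriteLine("{0:u} created: {1}", DateTime.Now, e.FullPath);
watcher.EnableRaisingEvents = true;   // start delivering events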
As Andrew says, a filter driver is the way to go.
There is no foolproof way of detecting a file copy as different programs copy files in different ways (some may use the CopyFile API, others may just read one file and write out the contents to another themselves). You could try calculating a hash in your filter driver of any file opened for reading, and then do the same after a program finishes writing to a file. If the hashes match you know you have a file copy. However this technique may be slow. If you just hook the CopyFile API you will miss file copies made without that API. Java programs (to name but one) have no access to the CopyFile API.
This is likely impossible as there is no guaranteed central method for performing a copy/move. You could hook into a core API (like CopyFile) but of course that means that you will still miss any copy/move that any application does without using this API.
Maybe you could watch the entire filesystem with I/O filters for open files and then draw conclusions yourself if two files with the same names and same file sizes are open at the same time. But that's not a 100% solution either.
As previously mentioned, a file copy operation can be implemented in various ways and may involve several disk and memory transfers; therefore it is not possible to simply get notified by the system when such an operation occurs.
Even for the user, there are multiple ways to duplicate content and entire files. Copy commands, "save as", "send to", move, using various tools. Under the hood the copy operation is a succession of read / write, correlated by certain parameters. That is the only way to guarantee successful auditing. Hooking on CopyFile will not give you the copy operations of Total Commander, for example. Nor will it give you "Save as" operations which are in fact file create -> file content moved -> closing of original file -> opening of the new file. Then, things are different when dealing with copy over network, impersonated copy operations where the file handle security context is different than the process security context, and so on. I do not think that there is a straightforward way to achieve all of the above.
However, there is software that can notify you of most of the common copy operations (i.e., when they are performed through Windows Explorer, Total Commander, the command prompt and other applications). It also gives you the source and destination file names, the timestamp and other relevant details. It can be found here: http://temasoft.com/products/filemonitor.
Note: I work for the company which develops this product.