I need to accomplish the following task:
Attempt to move a file. If the file is locked, schedule it to be moved as soon as it becomes available.
I am using File.Move, which is sufficient for my program. Now the problems are:
1) I can't find a good way to check whether the file I need to move is locked. I am catching System.IO.IOException, but reading other posts I discovered that the same exception may be thrown for other reasons as well.
2) Determining when the file gets unlocked. One way of doing this is probably using a timer/thread and checking the scheduled files, say, every 30 seconds and attempting to move them. But I hope there is a better way using FileSystemWatcher.
This is a .NET 3.5 WinForms application. Any comments/suggestions are appreciated. Thanks for your attention.
You should really just try and catch an IOException. Use Marshal.GetHRForException to check for the cause of the exception.
A notification would not be reliable. Another process might lock the file again before File.Move is executed.
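As a minimal sketch of that check (the helper name is mine; the HRESULT values are the standard Win32 sharing/lock violation codes from winerror.h):

using System;
using System.IO;
using System.Runtime.InteropServices;

static bool IsLockViolation(IOException ex)
{
    // 0x80070020 = ERROR_SHARING_VIOLATION, 0x80070021 = ERROR_LOCK_VIOLATION
    int hr = Marshal.GetHRForException(ex);
    return hr == unchecked((int)0x80070020) || hr == unchecked((int)0x80070021);
}

// usage around the move itself:
try
{
    File.Move(source, destination);
}
catch (IOException ex)
{
    if (IsLockViolation(ex))
    {
        // the file is locked: schedule it for a later retry
    }
    else
    {
        throw; // a different IO problem, don't retry blindly
    }
}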
One possible alternative is using MoveFileEx with the MOVEFILE_DELAY_UNTIL_REBOOT flag. If you don't have access to move the file right now, you can schedule it to be moved on the next reboot, when it's guaranteed to be accessible (the move happens very early in the boot sequence).
Depending on your specific application, you could inform the user that a reboot is necessary and initiate the reboot yourself in addition to scheduling the move.
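A hedged P/Invoke sketch of that approach (the flag value comes from winbase.h; error handling is omitted and the wrapper name is mine):

using System.Runtime.InteropServices;

static class NativeMethods
{
    private const int MOVEFILE_DELAY_UNTIL_REBOOT = 0x4;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern bool MoveFileEx(string lpExistingFileName, string lpNewFileName, int dwFlags);

    public static bool MoveOnNextReboot(string source, string destination)
    {
        // the pending move is recorded by Windows and performed early in the next boot
        return MoveFileEx(source, destination, MOVEFILE_DELAY_UNTIL_REBOOT);
    }
}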
It's simple:
static void Main(string[] args)
{
    // Create the watcher object.
    FileSystemWatcher watcher = new FileSystemWatcher(@"C:\MyFolder\");

    // Assign the event handler.
    watcher.Created += new FileSystemEventHandler(watcher_Created);

    // Start watching.
    watcher.EnableRaisingEvents = true;

    Console.ReadLine();
}

static void watcher_Created(object sender, FileSystemEventArgs e)
{
    try
    {
        File.Move(e.FullPath, @"C:\MyMovedFolder\" + e.Name);
    }
    catch (Exception)
    {
        // Something went wrong. You can do additional processing here,
        // like firing up a new thread to retry the move.
    }
}
This is not specific to your problem, but generally you will always need to retain the 'try it and gracefully deal with a failure' mode of operation for this sort of action.
That's because however clever your 'detect that the file is available' mechanism is, there will always be some amount of time between you detecting that the file is available and moving it, and in that time someone else might mess with the file.
The scheduled retry on exception (with increasing delays, up to a point) is probably the simplest way to achieve this (your (2)).
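A rough sketch of that retry loop, assuming a WinForms timer and a pending-move dictionary (all names here are mine, not from the question):

using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

private readonly Dictionary<string, string> _pendingMoves = new Dictionary<string, string>(); // source -> destination
private readonly Timer _retryTimer = new Timer { Interval = 30000 }; // retry every 30 s

private void StartRetryTimer()
{
    _retryTimer.Tick += delegate
    {
        // copy the keys so entries can be removed while iterating
        foreach (var source in new List<string>(_pendingMoves.Keys))
        {
            try
            {
                File.Move(source, _pendingMoves[source]);
                _pendingMoves.Remove(source);
            }
            catch (IOException)
            {
                // still locked (or another transient IO problem): try again on the next tick
            }
        }
    };
    _retryTimer.Start();
}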
To do it properly you're going to have to drop to system-level (kernel code) hooks to trap the file close event - which has its own idiosyncrasies. It's a big job - several orders of magnitude more complex than the scheduled retry method. It's up to you and your application case to make that call, but I don't know of anything effective in between.
Very old question, but Google led me here, so when I found a better answer I decided to post it:
There's some nice code I found in the dotnet CLI repo:
/// <summary>
/// Run Directory.Move and File.Move in Windows has a chance to get IOException with
/// HResult 0x80070005 due to Indexer. But this error is transient.
/// </summary>
internal static void RetryOnMoveAccessFailure(Action action)
{
    const int ERROR_HRESULT_ACCESS_DENIED = unchecked((int)0x80070005);
    int nextWaitTime = 10;
    int remainRetry = 10;

    while (true)
    {
        try
        {
            action();
            break;
        }
        catch (IOException e) when (e.HResult == ERROR_HRESULT_ACCESS_DENIED)
        {
            Thread.Sleep(nextWaitTime);
            nextWaitTime *= 2;
            remainRetry--;
            if (remainRetry == 0)
            {
                throw;
            }
        }
    }
}
There is also a method for just IOException. Here's the usage example:
FileAccessRetrier.RetryOnMoveAccessFailure(() => Directory.Move(packageDirectory.Value, tempPath));
Overall, this repo contains a lot of interesting ideas for file manipulation and installation/removal logic, like TransactionalAction, so I recommend reviewing it. Unfortunately, these functions are not available as a NuGet package.
Have a look at the FileSystemWatcher.
http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher(VS.90).aspx
Listens to the file system change notifications and raises events when a directory, or file in a directory, changes.
In my WebApi action method, I want to create/overwrite a folder using this code:
string myDir = "...";
if (Directory.Exists(myDir))
{
    Directory.Delete(myDir, true);
}
Directory.CreateDirectory(myDir);

// 1 - Check the dir
Debug.WriteLine("Double check if the Dir is created: " + Directory.Exists(myDir));

// Some other stuff here...

// 2 - Check the dir again
Debug.WriteLine("Check again if the Dir still exists: " + Directory.Exists(myDir));
Issue
Strangely, sometimes right after creating the directory, the directory does not exist!
Sometimes when checking the dir for the first time (where the number 1 is); Directory.Exist() returns true, other times false. Same happens when checking the dir for the second time (where the number 2 is).
Notes
This part of the code does not throw any exceptions.
I can only reproduce this when publishing the website on the server (Windows Server 2008).
Happens when accessing the same folder.
Questions
Is this a concurrency issue / race condition?
Doesn't WebApi or the Operating System handle the concurrency?
Is this the correct way to overwrite a folder?
Should I lock files manually when we have many API requests to the same file?
Or in General:
What's the reason for this strange behavior?
UPDATE:
Using DirectoryInfo and Refresh() instead of Directory does not solve the problem.
It only happens when the recursive option of Delete is true (and the directory is not empty).
Many filesystem operations are not synchronous on some filesystems (in the case of Windows, NTFS). Take for example the RemoveDirectory call (which is called by Directory.Delete at some point):
The RemoveDirectory function marks a directory for deletion on close. Therefore, the directory is not removed until the last handle to the directory is closed.
As you can see, it will not really delete the directory until all handles to it are closed, but Directory.Delete will complete fine. In your case that is also most likely such a concurrency problem - the directory has not really been created by the time you execute Directory.Exists.
So, just periodically check for what you need and don't assume filesystem calls in .NET are synchronous. You can also use FileSystemWatcher in some cases to avoid polling.
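For example, a minimal sketch of polling until the delete has actually completed before recreating (names and timeout are my own; the exact exception types during the delete-pending window can vary, so the catches are deliberately broad):

using System.Diagnostics;
using System.IO;
using System.Threading;

static void DeleteAndRecreate(string dir, int timeoutMs)
{
    if (Directory.Exists(dir))
        Directory.Delete(dir, true);

    // Directory.Delete only marks the directory for deletion, so both
    // Exists and CreateDirectory can misbehave until the last handle closes.
    // Retry the create until it sticks or the timeout expires.
    var sw = Stopwatch.StartNew();
    while (true)
    {
        try
        {
            Directory.CreateDirectory(dir);
            if (Directory.Exists(dir))
                return;
        }
        catch (IOException) { /* deletion still pending */ }
        catch (UnauthorizedAccessException) { /* deletion still pending */ }

        if (sw.ElapsedMilliseconds > timeoutMs)
            throw new IOException("Could not recreate directory: " + dir);
        Thread.Sleep(10);
    }
}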
EDIT: I was thinking about how to reproduce it, and here is the code:
internal class Program {
    private static void Main(string[] args) {
        const string path = "G:\\test_dir";
        while (true) {
            if (Directory.Exists(path))
                Directory.Delete(path);
            Directory.CreateDirectory(path);
            if (!Directory.Exists(path))
                throw new Exception("Confirmed");
        }
    }
}
You see that if all filesystem calls were synchronous (in .NET), this code should run without problems. Now, before running the code, create an empty directory at the specified path (preferably don't use an SSD for this) and open it with Windows Explorer. Now run the code. For me it either throws Confirmed (which exactly reproduces your issue) or throws on Directory.Delete saying that the directory does not exist (almost the same case). It does this 100% of the time for me.
Here is more code which, when run on my machine, confirms that it's certainly possible for File.Exists to return true directly after a File.Delete call:
internal class Program {
    private static void Main(string[] args) {
        while (true) {
            const string path = @"G:\test_dir\test.txt";
            if (File.Exists(path))
                File.Delete(path);
            if (File.Exists(path))
                throw new Exception("Confirmed");
            File.Create(path).Dispose();
        }
    }
}
To do this, I opened the G:\test_dir folder and, during execution of this code, repeatedly tried to open the constantly appearing and disappearing test.txt file. After a couple of tries, the Confirmed exception was thrown (while I didn't create or delete that file, and after the exception was thrown it was no longer present on the filesystem). So race conditions are possible in multiple cases and my answer is the correct one.
I wrote myself a little C# method for synchronous folder deletion using Directory.Delete(). Feel free to copy:
private bool DeleteDirectorySync(string directory, int timeoutInMilliseconds = 5000)
{
    if (!Directory.Exists(directory))
    {
        return true;
    }

    // Watch the parent folder for the deletion of the directory itself.
    // Filter matches names, not full paths, so use just the directory name.
    var watcher = new FileSystemWatcher
    {
        Path = Path.Combine(directory, ".."),
        NotifyFilter = NotifyFilters.DirectoryName,
        Filter = Path.GetFileName(directory),
    };

    var task = Task.Run(() => watcher.WaitForChanged(WatcherChangeTypes.Deleted, timeoutInMilliseconds));

    // we must not start deleting before the watcher is running
    while (task.Status != TaskStatus.Running)
    {
        Thread.Sleep(100);
    }

    try
    {
        Directory.Delete(directory, true);
    }
    catch
    {
        return false;
    }

    return !task.Result.TimedOut;
}
Note that getting task.Result will block the thread until the task is finished, while keeping the CPU load of this thread idle. That is the point where it becomes synchronous.
Sounds like a race condition to me. Not sure why - you did not provide enough details - but what you can do is wrap everything in a lock statement and see if the problem goes away. For sure this is not a production-ready solution; it is only a quick way to check. If it's indeed a race condition, you need to rethink your approach to rewriting folders. Maybe create a "GUID" folder and, when done, update the DB with the most recent GUID to point to the most recent folder?..
Before I go into too much detail, my program is written in Visual Studio 2010 using C# .NET 4.0.
I wrote a program that will generate a separate log file for each run. The log file is named after the time, accurate to the millisecond (for example, 20130726103042375.log). The program will also generate a master log file for the day if one does not already exist (for example, *20130726_Master.log*).
At the end of each run, I want to append the log file to the master log file. Is there a way to check whether I can append successfully? And retry after sleeping for, say, a second?
Basically, I have 1 executable, and multiple users (let's say there are 5 users).
All 5 users will access and run this executable at the same time. Since it's nearly impossible for all users to start at the exact same time (down to the millisecond), there will be no problem generating the individual log files.
However, the issue comes in when I attempt to merge those log files into the master log file. Though it is unlikely, I think the program will crash if multiple users are appending to the same master log file.
The method I use is
File.AppendAllText(masterLogFile, File.ReadAllText(individualLogFile));
I have looked into the lock object, but I don't think it works in my case, as there are multiple instances running rather than multiple threads in one instance.
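As an aside, a named Mutex does coordinate across processes, unlike lock. A minimal sketch (the mutex name is an arbitrary choice, not from the original post):

using System.IO;
using System.Threading;

// all processes that use the same name share this mutex, machine-wide
using (var mutex = new Mutex(false, "MyApp.MasterLogFile.Mutex"))
{
    mutex.WaitOne();
    try
    {
        File.AppendAllText(masterLogFile, File.ReadAllText(individualLogFile));
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}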
Another way I looked into is try/catch, something like this:
try
{
    stream = file.Open(FileMode.Open, FileAccess.ReadWrite, FileShare.None);
}
catch {}
But I don't think this solves the problem, because the status of the masterLogFile can change in that brief millisecond.
So my overall question is: Is there a way to append to the masterLogFile if it's not in use, and retry after a short timeout if it is? Or is there an alternative way to create the masterLogFile?
Thank you in advance, and sorry for the long message. I want to make sure I get my message across and explain what I've tried and looked into, so we are not wasting anyone's time.
Please let me know if there's any more information I can provide to help you help me.
Your try/catch is the way to do things. If the call to File.Open succeeds, then you can write to the file. The idea is to keep the file open. I would suggest something like:
bool openSuccessful = false;
while (!openSuccessful)
{
    try
    {
        using (var writer = new StreamWriter(masterlog, true)) // append
        {
            // successfully opened the file
            openSuccessful = true;
            try
            {
                foreach (var line in File.ReadLines(individualLogFile))
                {
                    writer.WriteLine(line);
                }
            }
            catch (IOException)
            {
                // something unexpected happened while writing.
                // handle the error and exit the loop.
                break;
            }
        }
    }
    catch (IOException)
    {
        // couldn't open the file.
        // If the exception is because it's opened in another process,
        // then delay and retry. Otherwise exit.
        Thread.Sleep(1000);
    }
}

if (!openSuccessful)
{
    // notify of error
}
So if you fail to open the file, you sleep and try again.
See my blog post, File.Exists is only a snapshot, for a little more detail.
I would do something along the lines of this, as I think it incurs the least overhead. Try/catch is going to generate a stack trace (which could take a whole second) if an exception is thrown. There has to be a better way to do this atomically still. If I find one I'll post it.
After a FileSystemWatcher.Error event was raised, I have no clue about what to do next.
The exception can be a [relatively] minor one, such as
too many changes at once in directory
which doesn't affect the watcher's watching process, but it can also be a big issue - such as the watched directory being deleted, in which case the watcher is no longer functional.
My question is what is the best way to handle the Error event?
Depends on the error surely?
If it is too much data because the buffer was overrun (many changes) do a list directory and grab the changes you're after.
If it is too much data because you're not processing the FileSystemWatcher events quickly enough, ensure you're processing it efficiently.
Deleted directory, can't do anything about it other than disposing the FileSystemWatcher, or maybe watching the parent for a recreation of that directory name again.
I would simply get the inner exception type, then decide on a per-error basis what to do ( restart or fail ).
So
myWatcher.Error += new ErrorEventHandler(OnError);
Followed by
private static void OnError(object source, ErrorEventArgs e)
{
    if (e.GetException().GetType() == typeof(InternalBufferOverflowException))
    {
        // This can happen if Windows is reporting many file system events quickly
        // and the internal buffer of the FileSystemWatcher is not large enough to
        // handle this rate of events. The InternalBufferOverflowException error
        // informs the application that some of the file system events are being lost.
        Console.WriteLine("The file system watcher experienced an internal buffer overflow: " + e.GetException().Message);
    }
}
I'm using FileSystemWatcher to check when a file is modified or deleted, but I'm wondering if there is any way to check when a file is read by another application.
Example:
I have the file C:\test.txt on my harddrive and am watching it using FileSystemWatcher. Another program (not under my control) goes to read that file; I would like to catch that event and, if possible, check what program is reading the file then modify the contents of the file accordingly.
It sounds like you want to write to your log file when your log file is read externally, or something to that effect. If that is the case, there is a NotifyFilters value, LastAccess. Make sure this is set as one of the flags in your FileSystemWatcher.NotifyFilter property. A change in the last access time will then fire the Changed event on FileSystemWatcher.
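A minimal setup sketch (path and file name are placeholders; note that NTFS has last-access-time updates disabled by default on newer Windows versions, which would keep this event from firing for reads):

using System;
using System.IO;

var watcher = new FileSystemWatcher(@"C:\", "test.txt");
// include LastAccess so that a read, which updates the access time, fires Changed
watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite;
watcher.Changed += (s, e) => Console.WriteLine("Read or changed: " + e.FullPath);
watcher.EnableRaisingEvents = true;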
Currently, FileSystemWatcher does not allow you to directly differentiate between a read and a change; they both fire the Changed event based on the "change" to LastAccess. So, it would be infeasible to watch for reads to a large number of files. However, you seem to know which file you're watching, so if you had a FileInfo object for that file, and FileSystemWatcher fired its Changed event, you could get a new one and compare LastAccessTime values. If the access time changed, and LastWriteTime didn't, your file is only being read.
Now, in simplest terms, changes you make to the file while it is being read are not going to immediately show up in the other app, nor are you going to be able to "get there first", lock the file and write to it before they see it. So, you cannot use FileSystemWatcher to "intercept" a read request and show the content you want that app to see. The only way the user of another application can see what you just wrote is if the application is also watching the file and re-loads the file. That will fire another Changed event, causing an infinite loop as long as the other application continues to reload the file.
You will also get a Changed event for a read and a write. Opening a file in a text editor (virtually any will do), making some changes, then saving will fire two Changed events if you're looking for changes to Last Access Time. The first one will go off when the file is opened by the editor; at that time, you may not be able to tell that a write will happen, so if you are looking for pure read-only accesses to the file then you're SOL.
The easiest way I can think of to do this would be with a timer (System.Threading.Timer) whose callback checks and stores the last
System.IO.File.GetLastAccessTime(path)
Something like (maybe with a bit more locking...)
// Note: EventArgs<T> is not a BCL type; a minimal version is defined here.
public class EventArgs<T> : EventArgs
{
    public EventArgs(T value) { Value = value; }
    public T Value { get; private set; }
}

public class FileAccessWatcher
{
    private readonly Dictionary<string, DateTime> _trackedFiles = new Dictionary<string, DateTime>();
    private readonly Timer _timer;

    public event EventHandler<EventArgs<string>> FileAccessed = delegate { };

    public FileAccessWatcher()
    {
        // System.Threading.Timer; first tick after 500 ms, rescheduled manually below
        _timer = new Timer(OnTimerTick, null, 500, Timeout.Infinite);
    }

    public void Watch(string path)
    {
        _trackedFiles[path] = File.GetLastAccessTime(path);
    }

    private void OnTimerTick(object state)
    {
        // ToList (System.Linq) copies the pairs so the dictionary can be updated while iterating
        foreach (var pair in _trackedFiles.ToList())
        {
            var accessed = File.GetLastAccessTime(pair.Key);
            if (pair.Value != accessed)
            {
                _trackedFiles[pair.Key] = accessed;
                FileAccessed(this, new EventArgs<string>(pair.Key));
            }
        }
        _timer.Change(500, Timeout.Infinite);
    }
}
There is SysInternals' FileMon program... It can trace every file access in the system. If you can find its source and understand what Win32 hooks it uses, you might marshal those functions in C# and get what you want.
You could use FileInfo.LastAccessTime and FileInfo.Refresh() in a polling loop.
http://msdn.microsoft.com/en-us/library/system.io.fileinfo_members.aspx
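A minimal sketch of that loop (the path and interval are arbitrary; FileInfo caches its values, hence the Refresh call):

using System;
using System.IO;
using System.Threading;

var info = new FileInfo(@"C:\test.txt");
DateTime lastAccess = info.LastAccessTime;

while (true)
{
    Thread.Sleep(500);
    info.Refresh(); // re-read the cached attribute values from disk
    if (info.LastAccessTime != lastAccess)
    {
        lastAccess = info.LastAccessTime;
        Console.WriteLine("File was accessed.");
    }
}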
Yes, using file system filter driver you can catch all read requests, analyze them and even substitute the data being read. Development of such driver yourself is possible, but very time-consuming and complicated. We offer a product called CallbackFilter, which includes a ready to use driver and lets you implement your filtering business logic in user-mode.
A little snippet that I found useful for detecting when another process has a lock:
static bool IsFileUsedbyAnotherProcess(string filename)
{
    try
    {
        // an exclusive open fails if any other process has the file open
        using (var file = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.None))
        {
        }
    }
    catch (System.IO.IOException)
    {
        return true;
    }
    return false;
}
I need to set up an application that watches for files being created in a directory, both locally and on a network drive.
Would FileSystemWatcher or polling on a timer be the best option? I have used both methods in the past, but not extensively.
What issues (performance, reliability, etc.) are there with either method?
I have seen the file system watcher fail in production and test environments. I now consider it a convenience, but I do not consider it reliable. My pattern has been to watch for changes with the file system watcher, but poll occasionally to catch missed file changes.
Edit: If you have a UI, you can also give your user the ability to "refresh" for changes instead of polling. I would combine this with a file system watcher.
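A rough sketch of that "watch, but also poll occasionally" pattern (the folder path, interval and ProcessFile are placeholders; ProcessFile must tolerate seeing the same file twice, since both paths can report it):

using System;
using System.IO;
using System.Threading;

var watcher = new FileSystemWatcher(@"C:\Drop");
watcher.Created += (s, e) => ProcessFile(e.FullPath);
watcher.EnableRaisingEvents = true;

// safety net: every five minutes, sweep the folder for anything the watcher missed
var poller = new Timer(_ =>
{
    foreach (var file in Directory.GetFiles(@"C:\Drop"))
        ProcessFile(file);
}, null, TimeSpan.Zero, TimeSpan.FromMinutes(5));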
The biggest problem I have had is missing files when the buffer gets full. Easy as pie to fix--just increase the buffer. Remember that it contains the file names and events, so increase it to the expected number of files (trial and error). It does use memory that cannot be paged out, so it could force other processes to page if memory gets low.
Here is the MSDN article on buffer :
FileSystemWatcher.InternalBufferSize Property
Per MSDN:
Increasing buffer size is expensive, as it comes from non paged memory that cannot be swapped out to disk, so keep the buffer as small as possible. To avoid a buffer overflow, use the NotifyFilter and IncludeSubdirectories properties to filter out unwanted change notifications.
We use 16MB due to a large batch expected at one time. Works fine and never misses a file.
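Setting the buffer is a single property assignment (a sketch; the size here is arbitrary, and as the quote above says, keep it as small as your workload allows):

using System.IO;

var watcher = new FileSystemWatcher(@"C:\Drop");
watcher.InternalBufferSize = 64 * 1024; // default is 8 KB; each entry holds a file name plus event data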
We also read all the files before beginning to process even one... get the file names safely cached away (in our case, into a database table), then process them.
For file locking issues I spawn a process which waits around for the file to be unlocked, waiting one second, then two, then four, et cetera. We never poll. This has been in production without error for about two years.
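A minimal sketch of that doubling wait (the helper name is mine; the exclusive open is just a probe for the lock):

using System.IO;
using System.Threading;

static void WaitUntilUnlocked(string path)
{
    int delayMs = 1000;
    while (true)
    {
        try
        {
            // opening with FileShare.None succeeds only when nobody else holds the file
            using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None)) { }
            return;
        }
        catch (IOException)
        {
            Thread.Sleep(delayMs);
            delayMs *= 2; // one second, then two, then four, et cetera
        }
    }
}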
The FileSystemWatcher may also miss changes during busy times, if the number of queued changes overflows the buffer provided. This is not a limitation of the .NET class per se, but of the underlying Win32 infrastructure. In our experience, the best way to minimize this problem is to dequeue the notifications as quickly as possible and deal with them on another thread.
As mentioned by @ChillTemp above, the watcher may not work on non-Windows shares. For example, it will not work at all on mounted Novell drives.
I agree that a good compromise is to do an occasional poll to pick up any missed changes.
Also note that the file system watcher is not reliable on file shares, particularly if the file share is hosted on a non-Windows server. FSW should not be used for anything critical, or should be used with an occasional poll to verify that it hasn't missed anything.
Personally, I've used the FileSystemWatcher on a production system, and it has worked fine. In the past 6 months, it hasn't had a single hiccup running 24x7. It is monitoring a single local folder (which is shared). We have a relatively small number of file operations that it has to handle (10 events fired per day). It's not something I've ever had to worry about. I'd use it again if I had to remake the decision.
I currently use the FileSystemWatcher on an XML file being updated on average every 100 milliseconds.
I have found that as long as the FileSystemWatcher is properly configured you should never have problems with local files.
I have no experience on remote file watching and non-Windows shares.
I would consider polling the file to be redundant and not worth the overhead unless you inherently distrust the FileSystemWatcher or have directly experienced the limitations everyone else here has listed (non-Windows shares, and remote file watching).
I have run into trouble using FileSystemWatcher on network shares. If you're in a pure Windows environment, it might not be an issue, but I was watching an NFS share and since NFS is stateless, there was never a notification when the file I was watching changed.
I'd go with polling.
Network issues cause the FileSystemWatcher to be unreliable (even when handling the error event).
Returning from the event method as quickly as possible, using another thread, solved the problem for me:
private void Watcher_Created(object sender, FileSystemEventArgs e)
{
    Task.Run(() => MySubmit(e.FullPath));
}
I had some big problems with FSW on network drives: deleting a file always raised the error event, never the deleted event. I did not find a solution, so I now avoid FSW and use polling.
Creation events on the other hand worked fine, so if you only need to watch for file creation, you can go for the FSW.
Also, I had no problems at all on local folders, no matter if shared or not.
Using both FSW and polling is a waste of time and resources, in my opinion, and I am surprised that experienced developers suggest it. If you need to use polling to check for any "FSW misses", then you can, naturally, discard FSW altogether and use only polling.
I am currently trying to decide whether I will use FSW or polling for a project I am developing. Reading the answers, it is obvious that there are cases where FSW covers the needs perfectly, while at other times you need polling. Unfortunately, no answer has actually dealt with the performance difference (if there is any), only with the "reliability" issues. Is there anyone who can answer that part of the question?
EDIT: nmclean's point about the validity of using both FSW and polling (you can read the discussion in the comments, if you are interested) appears to be a very rational explanation of why there can be situations where using both FSW and polling is efficient. Thank you for shedding light on that for me (and anyone else having the same opinion), nmclean.
A working solution using the Created event instead of Changed.
It works even for copy, cut, paste, and move.
class Program
{
    static void Main(string[] args)
    {
        string SourceFolderPath = "D:\\SourcePath";

        FileSystemWatcher watcher = new FileSystemWatcher();
        watcher.Path = SourceFolderPath;
        watcher.IncludeSubdirectories = false;
        watcher.NotifyFilter = NotifyFilters.FileName; // filter on the file name
        watcher.Filter = "*.txt";
        watcher.Created += FileSystemWatcher_Created; // raised only when a file is created by copy, cut/paste or move
        watcher.EnableRaisingEvents = true;

        Console.Read();
    }

    static void FileSystemWatcher_Created(object sender, FileSystemEventArgs e)
    {
        string DestinationFolderPath = "D:\\DestinationPath";
        try
        {
            // do something like move, copy, etc.
            File.Copy(e.FullPath, DestinationFolderPath + @"\" + e.Name);
        }
        catch
        {
        }
    }
}
And a solution for the same watcher when handling the file-change event, using static storage to skip duplicate triggers:
class Program
{
    static string IsSameFile = string.Empty; // static field used for tracking the last file seen

    static void Main(string[] args)
    {
        string SourceFolderPath = "D:\\SourcePath";

        FileSystemWatcher watcher = new FileSystemWatcher();
        watcher.Path = SourceFolderPath;
        watcher.IncludeSubdirectories = false;
        watcher.NotifyFilter = NotifyFilters.LastWrite;
        watcher.Filter = "*.txt";
        watcher.Changed += FileSystemWatcher_Changed;
        watcher.EnableRaisingEvents = true;

        Console.Read();
    }

    static void FileSystemWatcher_Changed(object sender, FileSystemEventArgs e)
    {
        if (e.Name == IsSameFile) // skips the duplicate triggers for the same file
        {
            return;
        }

        string DestinationFolderPath = "D:\\DestinationPath";
        try
        {
            // do something like move, copy, etc.
            File.Copy(e.FullPath, DestinationFolderPath + @"\" + e.Name);
        }
        catch
        {
        }

        IsSameFile = e.Name;
    }
}
This is a workaround for the problem of the event triggering multiple times.
I would say use polling, especially in a TDD scenario, as it is much easier to mock/stub the presence of files when the polling event is triggered than to rely on the more "uncontrolled" FSW event. Add to that having worked on a number of apps which were plagued by FSW errors.