FileSystemWatcher vs polling to watch for file changes

I need to set up an application that watches for files being created in a directory, both locally and on a network drive.
Would the FileSystemWatcher or polling on a timer be the better option? I have used both methods in the past, but not extensively.
What issues (performance, reliability etc.) are there with either method?

I have seen the file system watcher fail in production and test environments. I now consider it a convenience, but I do not consider it reliable. My pattern has been to watch for changes with the file system watcher, but poll occasionally to catch missed file changes.
Edit: If you have a UI, you can also give your user the ability to "refresh" for changes instead of polling. I would combine this with a file system watcher.
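The watch-plus-occasional-poll pattern described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name WatchAndPoll, the Scan method and the onNewFile callback are hypothetical, and it only tracks file creation in a single folder.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical helper: watcher events and periodic scans feed one de-duplicated callback.
public sealed class WatchAndPoll : IDisposable
{
    private readonly ConcurrentDictionary<string, bool> _seen =
        new ConcurrentDictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
    private readonly string _folder;
    private readonly Action<string> _onNewFile;
    private readonly FileSystemWatcher _watcher;
    private readonly Timer _pollTimer;

    public WatchAndPoll(string folder, TimeSpan pollInterval, Action<string> onNewFile)
    {
        _folder = folder;
        _onNewFile = onNewFile;
        _watcher = new FileSystemWatcher(folder);
        _watcher.Created += (s, e) => Report(e.FullPath);
        _watcher.EnableRaisingEvents = true;
        // The timer rescans the folder to catch anything the watcher missed.
        _pollTimer = new Timer(_ => Scan(), null, pollInterval, pollInterval);
    }

    // Lists the folder once and reports every file that has not been reported before.
    public void Scan()
    {
        try
        {
            foreach (string path in Directory.GetFiles(_folder))
                Report(path);
        }
        catch (IOException)
        {
            // Folder temporarily unavailable (e.g. network drop); the next poll tries again.
        }
    }

    // Invokes the callback only the first time a given path is reported.
    private void Report(string path)
    {
        if (_seen.TryAdd(path, true))
            _onNewFile(path);
    }

    public void Dispose()
    {
        _watcher.Dispose();
        _pollTimer.Dispose();
    }
}
```

Calling Scan() once at startup picks up files that already exist, and a UI "refresh" button could call the same method. Note that the set of seen paths grows without bound and a file re-created under the same name is not reported again, so a real implementation would need some way to forget processed files.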

The biggest problem I have had is missing files when the buffer gets full. Easy as pie to fix--just increase the buffer. Remember that it contains the file names and events, so increase it to the expected amount of files (trial and error). It does use memory that cannot be paged out, so it could force other processes to page if memory gets low.
Here is the MSDN article on the buffer:
FileSystemWatcher.InternalBufferSize Property
Per MSDN:
Increasing buffer size is expensive, as it comes from non paged memory that cannot be swapped out to disk, so keep the buffer as small as possible. To avoid a buffer overflow, use the NotifyFilter and IncludeSubdirectories properties to filter out unwanted change notifications.
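As a rough sketch of that advice, a watcher can be narrowed with NotifyFilter and IncludeSubdirectories before the buffer is touched at all. The WatcherFactory name is hypothetical, and the 64 KB value is only an illustrative choice, not a recommendation from the quoted article.

```csharp
using System.IO;

public static class WatcherFactory
{
    // Builds a watcher that reports only file name changes (create, delete, rename)
    // in a single folder, which keeps the number of buffered notifications down.
    public static FileSystemWatcher CreateNarrowWatcher(string folder)
    {
        return new FileSystemWatcher(folder)
        {
            NotifyFilter = NotifyFilters.FileName,
            IncludeSubdirectories = false,
            InternalBufferSize = 64 * 1024 // illustrative value; tune by trial and error
        };
    }
}
```

The caller still has to subscribe to the events it needs and set EnableRaisingEvents to true.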
We use 16MB due to a large batch expected at one time. Works fine and never misses a file.
We also read all the files before beginning to process even one...get the file names safely cached away (in our case, into a database table) then process them.
For file locking issues I spawn a process which waits around for the file to be unlocked waiting one second, then two, then four, et cetera. We never poll. This has been in production without error for about two years.
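The wait-one-second-then-two-then-four idea might look something like the sketch below. It runs in-process rather than spawning a separate process as described above, and the FileUnlockWaiter name and its parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileUnlockWaiter
{
    // Returns true once the file can be opened exclusively, false if every attempt fails.
    // The delay doubles after each failed attempt (for example 1 s, 2 s, 4 s, ...).
    public static bool WaitUntilUnlocked(string path, int maxAttempts, TimeSpan firstDelay)
    {
        TimeSpan delay = firstDelay;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                // FileShare.None fails while any other handle to the file is open.
                using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
                    return true;
            }
            catch (IOException)
            {
                if (attempt == maxAttempts)
                    break;
                Thread.Sleep(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
        return false;
    }
}
```

A call such as FileUnlockWaiter.WaitUntilUnlocked(path, 6, TimeSpan.FromSeconds(1)) would give up after roughly half a minute. A missing file also raises an IOException subclass, so it is retried and eventually reported as false. Another process can still lock the file between this check and the actual processing, so the processing step needs its own error handling.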

The FileSystemWatcher may also miss changes during busy times, if the number of queued changes overflows the buffer provided. This is not a limitation of the .NET class per se, but of the underlying Win32 infrastructure. In our experience, the best way to minimize this problem is to dequeue the notifications as quickly as possible and deal with them on another thread.
As mentioned by @ChillTemp above, the watcher may not work on non-Windows shares. For example, it will not work at all on mounted Novell drives.
I agree that a good compromise is to do an occasional poll to pick up any missed changes.
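One way to dequeue notifications quickly and deal with them on another thread is a producer/consumer queue. The following is a minimal sketch, assuming only Created events matter; QueuedWatcher and its members are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public sealed class QueuedWatcher : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly FileSystemWatcher _watcher;
    private readonly Task _worker;

    public QueuedWatcher(string folder, Action<string> process)
    {
        // A single background worker drains the queue in arrival order.
        _worker = Task.Factory.StartNew(() =>
        {
            foreach (string path in _queue.GetConsumingEnumerable())
                process(path);
        }, TaskCreationOptions.LongRunning);

        _watcher = new FileSystemWatcher(folder);
        // The handler only enqueues, so it returns to the watcher almost immediately.
        _watcher.Created += (s, e) => Enqueue(e.FullPath);
        _watcher.EnableRaisingEvents = true;
    }

    public void Enqueue(string path)
    {
        try { _queue.Add(path); }
        catch (InvalidOperationException) { /* queue already closed during shutdown */ }
    }

    public void Dispose()
    {
        _watcher.Dispose();
        _queue.CompleteAdding();
        _worker.Wait(); // rethrows (as AggregateException) if process() threw
        _queue.Dispose();
    }
}
```

Compared with starting a new task per event, a single consumer keeps files in order and limits concurrency, at the cost of one slow file delaying the ones behind it.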

Also note that the file system watcher is not reliable on file shares, particularly if the file share is hosted on a non-Windows server. FSW should not be used for anything critical, or it should be combined with an occasional poll to verify that it hasn't missed anything.

Personally, I've used the FileSystemWatcher on a production system, and it has worked fine. In the past 6 months, it hasn't had a single hiccup running 24x7. It is monitoring a single local folder (which is shared). We have a relatively small number of file operations that it has to handle (10 events fired per day). It's not something I've ever had to worry about. I'd use it again if I had to remake the decision.

I currently use the FileSystemWatcher on an XML file being updated on average every 100 milliseconds.
I have found that as long as the FileSystemWatcher is properly configured you should never have problems with local files.
I have no experience on remote file watching and non-Windows shares.
I would consider polling the file to be redundant and not worth the overhead unless you inherently distrust the FileSystemWatcher or have directly experienced the limitations everyone else here has listed (non-Windows shares, and remote file watching).

I have run into trouble using FileSystemWatcher on network shares. If you're in a pure Windows environment, it might not be an issue, but I was watching an NFS share and since NFS is stateless, there was never a notification when the file I was watching changed.

I'd go with polling.
Network issues cause the FileSystemWatcher to be unreliable (even when handling the Error event).

Returning from the event method as quickly as possible, using another thread, solved the problem for me:
private void Watcher_Created(object sender, FileSystemEventArgs e)
{
Task.Run(() => MySubmit(e.FullPath));
}

I had some big problems with FSW on network drives: Deleting a file always threw the error event, never the deleted event. I did not find a solution, so I now avoid the FSW and use polling.
Creation events on the other hand worked fine, so if you only need to watch for file creation, you can go for the FSW.
Also, I had no problems at all on local folders, no matter if shared or not.

Using both FSW and polling is a waste of time and resources, in my opinion, and I am surprised that experienced developers suggest it. If you need to use polling to check for any "FSW misses", then you can, naturally, discard FSW altogether and use only polling.
I am currently trying to decide whether I will use FSW or polling for a project I develop. Reading the answers, it is obvious that there are cases where FSW covers the needs perfectly, while other times you need polling. Unfortunately, no answer has actually dealt with the performance difference (if there is any), only with the "reliability" issues. Is there anyone that can answer that part of the question?
EDIT: nmclean's point for the validity of using both FSW and polling (you can read the discussion in the comments, if you are interested) appears to be a very rational explanation of why there can be situations where using both an FSW and polling is efficient. Thank you for shedding light on that for me (and anyone else having the same opinion), nmclean.

A working solution that uses the Created event instead of the Changed event.
It fires even for copy, cut, paste, and move.
class Program
{
static void Main(string[] args)
{
string SourceFolderPath = "D:\\SourcePath";
string DestinationFolderPath = "D:\\DestinationPath";
FileSystemWatcher FileSystemWatcher = new FileSystemWatcher();
FileSystemWatcher.Path = SourceFolderPath;
FileSystemWatcher.IncludeSubdirectories = false;
FileSystemWatcher.NotifyFilter = NotifyFilters.FileName; // ON FILE NAME FILTER
FileSystemWatcher.Filter = "*.txt";
FileSystemWatcher.Created +=FileSystemWatcher_Created; // TRIGGERED ONLY FOR FILE GOT CREATED BY COPY, CUT PASTE, MOVE
FileSystemWatcher.EnableRaisingEvents = true;
Console.Read();
}
static void FileSystemWatcher_Created(object sender, FileSystemEventArgs e)
{
string SourceFolderPath = "D:\\SourcePath";
string DestinationFolderPath = "D:\\DestinationPath";
try
{
// DO SOMETHING LIKE MOVE, COPY, ETC
File.Copy(e.FullPath, Path.Combine(DestinationFolderPath, e.Name));
}
catch
{
}
}
}
A solution for the Changed event firing multiple times on a file attribute change, using a static field to remember the last file handled:
class Program
{
static string IsSameFile = string.Empty; // USE STATIC FOR TRACKING
static void Main(string[] args)
{
string SourceFolderPath = "D:\\SourcePath";
string DestinationFolderPath = "D:\\DestinationPath";
FileSystemWatcher FileSystemWatcher = new FileSystemWatcher();
FileSystemWatcher.Path = SourceFolderPath;
FileSystemWatcher.IncludeSubdirectories = false;
FileSystemWatcher.NotifyFilter = NotifyFilters.LastWrite;
FileSystemWatcher.Filter = "*.txt";
FileSystemWatcher.Changed += FileSystemWatcher_Changed;
FileSystemWatcher.EnableRaisingEvents = true;
Console.Read();
}
static void FileSystemWatcher_Changed(object sender, FileSystemEventArgs e)
{
if (e.Name == IsSameFile) //SKIPS ON MULTIPLE TRIGGERS
{
return;
}
else
{
string SourceFolderPath = "D:\\SourcePath";
string DestinationFolderPath = "D:\\DestinationPath";
try
{
// DO SOMETHING LIKE MOVE, COPY, ETC
File.Copy(e.FullPath, Path.Combine(DestinationFolderPath, e.Name));
}
catch
{
}
}
IsSameFile = e.Name;
}
}
This is a workaround for the problem of the event being triggered multiple times.

I would say use polling, especially in a TDD scenario, as it is much easier to mock/stub the presence or absence of files when the polling event is triggered than to rely on the more "uncontrolled" FSW event. On top of that, I have worked on a number of apps which were plagued by FSW errors.
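To illustrate why polling is easy to stub, here is a small sketch in which the directory listing is injected as a delegate. FolderPoller is a hypothetical name, not an existing library type.

```csharp
using System;
using System.Collections.Generic;

public sealed class FolderPoller
{
    private readonly Func<IEnumerable<string>> _listFiles;
    private readonly HashSet<string> _known = new HashSet<string>();

    // listFiles is the only dependency on the file system, so a test can replace it.
    public FolderPoller(Func<IEnumerable<string>> listFiles)
    {
        _listFiles = listFiles;
    }

    // Returns the files that appeared since the previous call.
    public IReadOnlyList<string> Poll()
    {
        var added = new List<string>();
        foreach (string file in _listFiles())
        {
            if (_known.Add(file))
                added.Add(file);
        }
        return added;
    }
}
```

In production the delegate could be something like () => Directory.EnumerateFiles(folder), with Poll() called from a timer; in a unit test it can simply return an in-memory list, so no real files or watcher events are needed.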

Related

A robust solution for FileSystemWatcher firing events multiple times

FileSystemWatcher events can fire multiple times. Not good if I need predictable behaviour from my code.
This is described in the MSDN documentation:
Common file system operations might raise more than one event. For
example, when a file is moved from one directory to another, several
OnChanged and some OnCreated and OnDeleted events might be raised.
Moving a file is a complex operation that consists of multiple simple
operations, therefore raising multiple events. Likewise, some
applications (for example, antivirus software) might cause additional
file system events that are detected by FileSystemWatcher.
Good use of NotifyFilters with particular events has helped but won't give me 100% confidence in the consistency.
Here's an example, recreating a Notepad write example (but I have experienced this with other write actions too):
public ExampleAttributesChangedFiringTwice(string demoFolderPath)
{
var watcher = new FileSystemWatcher()
{
Path = @"c:\temp",
NotifyFilter = NotifyFilters.LastWrite,
Filter = "*.txt"
};
watcher.Changed += OnChanged;
watcher.EnableRaisingEvents = true;
}
private static void OnChanged(object source, FileSystemEventArgs e)
{
// This will fire twice if I edit a file in Notepad
}
Any suggestions for making this more resilient?
EDIT: meaning not repeating multiple actions when multiple events are triggered.

An approach utilising MemoryCache as a buffer that will 'throttle' additional events:
1. A file event (Changed in this example) is triggered.
2. The event is handled by OnChanged, but instead of completing the desired action, it stores the event in MemoryCache with a 1 second expiration and a CacheItemPolicy callback set up to execute on expiration.
3. When it expires, the callback OnRemovedFromCache completes the behaviour intended for that file event.
Note that I use AddOrGetExisting as a simple way to block any additional events firing within the cache period from being added to the cache.
class BlockAndDelayExample
{
private readonly MemoryCache _memCache;
private readonly CacheItemPolicy _cacheItemPolicy;
private const int CacheTimeMilliseconds = 1000;
public BlockAndDelayExample(string demoFolderPath)
{
_memCache = MemoryCache.Default;
var watcher = new FileSystemWatcher()
{
Path = demoFolderPath,
NotifyFilter = NotifyFilters.LastWrite,
Filter = "*.txt"
};
_cacheItemPolicy = new CacheItemPolicy()
{
RemovedCallback = OnRemovedFromCache
};
watcher.Changed += OnChanged;
watcher.EnableRaisingEvents = true;
}
// Add file event to cache for CacheTimeMilliseconds
private void OnChanged(object source, FileSystemEventArgs e)
{
_cacheItemPolicy.AbsoluteExpiration =
DateTimeOffset.Now.AddMilliseconds(CacheTimeMilliseconds);
// Only add if it is not there already (swallow others)
_memCache.AddOrGetExisting(e.Name, e, _cacheItemPolicy);
}
// Handle cache item expiring
private void OnRemovedFromCache(CacheEntryRemovedArguments args)
{
if (args.RemovedReason != CacheEntryRemovedReason.Expired) return;
// Now actually handle file event
var e = (FileSystemEventArgs) args.CacheItem.Value;
}
}
Could easily extend to:
Check file lock on expiry from cache and if not available, put it back in the cache again (sometimes events fire so fast the file isn't ready for some operations). Preferable to using try/catch loops.
Key cache on file name + event type combined

I use a FileSystemWatcher to check for MP4 files being uploaded that I ultimately have to do something with. The process doing the upload doesn't seem to establish any lock on the file, so I formerly struggled with starting the processing of them too early.
The technique I adopted in the end, which has been entirely successful for my case, was to consume the event and add the filepath to a Dictionary<string, long> of potentially interesting files, and start a timer. Periodically (60 seconds) I check the file size. The long dictionary value holds the file size from the last check, and if the current size is greater I deem it still being written to, store the new size and go back to sleep for another 60 seconds.
Upon there being a period of 60 seconds where no write activity has occurred, I can start processing.
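The bookkeeping behind that technique could be sketched like this. SizeStabilityTracker is a hypothetical name, and unlike the description above it treats any size change (not only growth) as "still being written". The class is not thread-safe, so calls from watcher and timer threads would need a lock.

```csharp
using System.Collections.Generic;

public sealed class SizeStabilityTracker
{
    // Path -> size seen at the previous check (-1 means "not measured yet").
    private readonly Dictionary<string, long> _lastSizes = new Dictionary<string, long>();

    // Called from the watcher event: remember a potentially interesting file.
    public void Track(string path)
    {
        if (!_lastSizes.ContainsKey(path))
            _lastSizes[path] = -1;
    }

    // Called on each timer tick with the file's current size.
    // Returns true when the size is unchanged since the previous check.
    public bool CheckStable(string path, long currentSize)
    {
        long previous;
        if (_lastSizes.TryGetValue(path, out previous) && previous == currentSize)
        {
            _lastSizes.Remove(path); // finished; stop tracking this file
            return true;
        }
        _lastSizes[path] = currentSize;
        return false;
    }
}
```

On each timer tick the caller would pass new FileInfo(path).Length for every tracked path and start processing the ones for which CheckStable returns true.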
If this isn't suitable for you, there are a few other things you could consider: hash the file every minute and store the hash instead, re-hashing it periodically until the content hasn't changed, or perhaps keep tabs on the Last Modified date in the file system.
Ultimately, consider that FileSystemWatcher might be a useful device not for notifying you which files you have to act on, but instead for flagging files that are potentially interesting; a separate process with more refined in-house logic can then decide whether a potentially interesting file should be acted on.

What's the best practice to recover from a FileSystemWatcher error?

After a FileSystemWatcher.Error event was raised, I have no clue about what to do next.
The exception can be a [relatively] minor one, such as
too many changes at once in directory
which doesn't affect the watcher's watching process, but it can also be a big issue - such as the watched directory being deleted, in which case the watcher is no longer functional.
My question is what is the best way to handle the Error event?

Depends on the error surely?
If it is too much data because the buffer was overrun (many changes) do a list directory and grab the changes you're after.
If it is too much data because you're not processing the FileSystemWatcher events quickly enough, ensure you're processing it efficiently.
Deleted directory, can't do anything about it other than disposing the FileSystemWatcher, or maybe watching the parent for a recreation of that directory name again.
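Those cases could be mapped to recovery actions along the following lines. This is a sketch only: RecoveryAction and WatcherErrorPolicy are hypothetical names, and treating every non-overflow error as fatal for the watcher is an assumption, not documented behaviour.

```csharp
using System;
using System.IO;

public enum RecoveryAction
{
    RescanDirectory, // watcher still works, but some events were lost
    RecreateWatcher  // assume the watcher is no longer functional
}

public static class WatcherErrorPolicy
{
    public static RecoveryAction Decide(Exception error)
    {
        // Buffer overrun: list the directory to find the changes that were missed.
        if (error is InternalBufferOverflowException)
            return RecoveryAction.RescanDirectory;

        // Anything else (deleted directory, lost network share, ...):
        // dispose the watcher and try to create a new one later.
        return RecoveryAction.RecreateWatcher;
    }
}
```

An Error handler would call WatcherErrorPolicy.Decide(e.GetException()) and then either rescan the folder or dispose the watcher and retry creating it on a timer until the directory exists again.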

I would simply get the inner exception type, then decide on a per-error basis what to do ( restart or fail ).
So
myWatcher.Error += new ErrorEventHandler(OnError);
Followed by
private static void OnError(object source, ErrorEventArgs e)
{
if (e.GetException().GetType() == typeof(InternalBufferOverflowException))
{
// This can happen if Windows is reporting many file system events quickly
// and internal buffer of the FileSystemWatcher is not large enough to handle this
// rate of events. The InternalBufferOverflowException error informs the application
// that some of the file system events are being lost.
Console.WriteLine(("The file system watcher experienced an internal buffer overflow: " + e.GetException().Message));
}
}

How to copy a file as it is being written in C#

I am currently working on Windows Service to copy data from our security cameras as it is being written to the Google Drive directory on the computer for instant upload. The files are accessible immediately after creation by the provided playback software so we would like if possible to immediately copy the data stream, that way we have some video even if the recording is interrupted (the files are 10 minute time blocks).
I currently have a service created which can watch the directory, however I am having some difficulty determining the best way to watch these files. Since they are modified continuously for 10 minutes, I will receive a large number of changed events. I was hoping there might be a way that I can capture the initial creation and start streaming the data to a second file. My concern here is that I need to ensure that I don't overrun the recording stream.
If this isn't possible or relatively simple, then I will just have to detect when the file is no longer being written to by using some logic with the last write time, but I am looking for suggestions on what the best way to do this might be. I am aware of the solutions proposed here, but I am unsure if they apply to the situation I am dealing with. There are a large number of files within sub-directories, so trying to keep track of which files are no longer triggering events could get very messy. Does anyone have any suggestions for how to do either of these methods?

Hmmm... You could try using a timer... This way, you can limit when it fires
// Requires: using System.IO; using System.Timers;
private static volatile bool TimeToCheck = false;
private static Timer timer;
private static FileSystemWatcher fileWatch;
public static void Run()
{
    timer = new Timer(2000); // 2 seconds
    fileWatch = new FileSystemWatcher();
    fileWatch.Path = "DirToWatch";
    fileWatch.Filter = "fileToWatch";
    fileWatch.Changed += new FileSystemEventHandler(OnChanged);
    fileWatch.Created += new FileSystemEventHandler(OnChanged);
    fileWatch.Deleted += new FileSystemEventHandler(OnChanged);
    // If you want renames as well, subscribe to the Renamed event too:
    // fileWatch.Renamed += new RenamedEventHandler(OnRenamed);
    timer.Elapsed += new ElapsedEventHandler(timer_done);
    fileWatch.EnableRaisingEvents = true;
    timer.Enabled = true; // Start the timer
}
private static void OnChanged(object source, FileSystemEventArgs e)
{
    if (TimeToCheck)
    {
        TimeToCheck = false;
        timer.Enabled = false; // Pause the timer while working
        // move the files
        timer.Enabled = true; // Resume the timer
    }
}
private static void OnRenamed(object source, RenamedEventArgs e)
{
    if (TimeToCheck)
    {
        TimeToCheck = false;
        timer.Enabled = false; // Pause the timer while working
        // move the files
        timer.Enabled = true; // Resume the timer
    }
}
private static void timer_done(object sender, ElapsedEventArgs e)
{
    // Allow the next file event to do work
    TimeToCheck = true;
}

You could try to do this but to be honest this seems like a hack and I'm skeptical that Windows has any supported method for doing what you're trying to do. Essentially you're trying to listen in on a write stream.
It sounds like whatever solution you're working with right now is a black box so accessing the stream directly probably isn't an option. However, there is another approach. I would look into how you can create a virtual drive with your app in windows. That way you can have the recording application writing to your virtual drive path which will allow you to handle the streams however you like. Which can include writing them to two separate locations at the same time. Both Google drive and some local storage of some kind for example.
Here's a StackOverflow question on how to create virtual drives that should get you started: C#: Create a virtual drive in Computer

Have you looked at the FileSystemWatcher object? If I'm understanding the question correctly, it may be something you want to use. If you were to put this security file within a certain directory, you could then use File.Copy to copy the updated security log into the Google Drive folder.

FileSystemWatcher changed event (for "LastWrite") is unreliable

I am trying to get a notification when a file is updated on disk. I am interested in getting this notifications as soon as a flush occurs, yet, it seems that the FileSystemWatcher would only send an event when the stream opens or closes.
In the code below I am writing to a file and repeatedly flush the buffer to the disk. Yet, the FileSystemWatcher only notified me once in the beginning of the writes and once when they end.
Is there another way to get these notifications? Or should I use polling for that?
The code:
class Program
{
static void Main(string[] args)
{
FileSystemWatcher watcher = new FileSystemWatcher(Environment.CurrentDirectory, "Test.txt");
watcher.Changed += watcher_Changed;
watcher.EnableRaisingEvents = true;
using(TextWriter writer = new StreamWriter("Test.txt"))
{
WriteData(writer);
WriteData(writer);
WriteData(writer);
WriteData(writer);
}
Thread.Sleep(10000);
}
private static void WriteData(TextWriter writer)
{
writer.WriteLine("Hello!");
writer.Flush();
Console.WriteLine(DateTime.Now.ToString("T") + "] Wrote data");
Thread.Sleep(3000);
}
static void watcher_Changed(object sender, FileSystemEventArgs e)
{
Console.WriteLine(DateTime.Now.ToString("T") + "] Watcher changed!");
}
}
UPDATE 1
I've corrected the ToString function of the DateTime to show seconds. Here is the output with the code above:
11:37:47 AM] Watcher changed!
11:37:47 AM] Wrote data
11:37:50 AM] Wrote data
11:37:53 AM] Wrote data
11:37:56 AM] Wrote data
11:37:59 AM] Watcher changed!
Thanks!

It has nothing to do with the FileSystemWatcher. The watcher reacts to updates on the LastWrite attribute for the filesystem.
E.g. NTFS does not update the LastWrite on every write. The value is cached and only written when the stream is closed or at some other unspecified time. This document says
Timestamps are updated at various times and for various reasons. The only guarantee about a file timestamp is that the file time is correctly reflected when the handle that makes the change is closed. [...] The NTFS file system delays updates to the last access time for a file by up to 1 hour after the last access
I assume a similar caching applies for write

I think one of your issues here is that all your writes get executed before the event gets a slice of that cpu-time.
MSDN states
The Changed event is raised when changes are made to the size, system attributes, last write time, last access time, or security permissions of a file or directory in the directory being monitored.
I did a test and inserted Sleeps after each call to WriteData(...); what I got was
09:32] Watcher changed!
09:32] Watcher changed!
09:32] -----> Wrote data
09:32] -----> Wrote data
09:32] -----> Wrote data
09:32] -----> Wrote data
09:32] Watcher changed!
09:32] Watcher changed!
I guess this kind of proves that the event is fired right after you call Flush(); it's just a question of when the event handler gets executed (and I assume it groups events as well).
I don't know the specific needs of your project, but I wouldn't poll. Seems like a waste since FileSystemWatcher does what you want it to do in my opinion.
Edit: Ok, I guess my brain wasn't quite ready yet for thinking when I posted this.
Your conclusion that it fires when you open and close the stream seems more logical and right.
I guess I was looking for "proof" that it fires when you call Flush and therefore found it - somehow.
Update
I just had a poke at the USN journal and it seems that won't get you what you want either, as it writes the record only when the file is closed. -> http://msdn.microsoft.com/en-us/library/aa363803(VS.85).aspx
I also found a USN-Viewer in C# and http://social.msdn.microsoft.com/Forums/en/csharpgeneral/thread/c1550294-d121-4511-ac32-31551497f64e might be interesting to read as well.
I also ran DiskMon to see if it gets the changes in realtime. It doesn't, but I don't know if that's intentional or not. However, the problem with both is that they require admin-rights to run.
So I guess you are stuck with FileSystemWatcher. (What is it you need the updates for anyway, it's not like you can read the file while it's open/locked by another program.)
ps: I just noticed you are a dev of BugAid, which I was made aware of only recently - it looks awesome :)

I know this is an old thread but the problem still exists 10 years later. I solved this by using a timer:
private Timer _checkTimer;
private DateTime _lastSaved;
Before initializing the timer, I get the last write time of the file:
_lastSaved = File.GetLastWriteTime(_watchedFile);
_checkTimer = new Timer() { Interval = 1000, AutoReset = true };
_checkTimer.Elapsed += CheckTimer_Elapsed;
_checkTimer.Start();
Then on each elapsed event I check if the last write time is newer than the one that I knew to be the last write time:
private void CheckTimer_Elapsed(object sender, ElapsedEventArgs e)
{
DateTime lastWriteTime = File.GetLastWriteTime(_watchedFile);
if (lastWriteTime > _lastSaved)
{
// Stop the timer to avoid concurrency problems
_checkTimer.Stop();
// Do your processing here
// Now, this is the new last known write time
_lastSaved = File.GetLastWriteTime(_watchedFile);
// Start the timer again
_checkTimer.Start();
}
}
This is how I personally solved this problem that I also had with FileSystemWatcher. This method is not as expensive as reading the file content or computing the file hash every second to check if there is a difference.

C# move file as soon as it becomes available

I need to accomplish the following task:
Attempt to move a file. If file is locked schedule for moving as soon as it becomes available.
I am using File.Move which is sufficient for my program. Now the problems are that:
1) I can't find a good way to check if the file I need to move is locked. I am catching System.IO.IOException but reading other posts around I discovered that the same exception may be thrown for different reasons as well.
2) Determining when the file gets unlocked. One way of doing this is probably using a timer/thread and checking the scheduled files lets say every 30 seconds and attempting to move them. But I hope there is a better way using FileSystemWatcher.
This is a .net 3.5 winforms application. Any comments/suggestions are appreciated. Thanks for attention.

You should really just try and catch an IOException. Use Marshal.GetHRForException to check for the cause of the exception.
A notification would not be reliable. Another process might lock the file again before File.Move is executed.
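A sketch of that HRESULT check follows. The LockDetection class is a hypothetical helper; the constants 32 and 33 are the Win32 codes ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION, which appear in the low 16 bits of the HRESULT.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class LockDetection
{
    private const int ERROR_SHARING_VIOLATION = 32; // file is open in another process
    private const int ERROR_LOCK_VIOLATION = 33;    // a region of the file is locked

    // Returns true when the IOException was caused by the file being locked,
    // as opposed to other I/O failures (disk full, path not found, ...).
    public static bool IsFileLocked(IOException ex)
    {
        int win32Code = Marshal.GetHRForException(ex) & 0xFFFF;
        return win32Code == ERROR_SHARING_VIOLATION || win32Code == ERROR_LOCK_VIOLATION;
    }
}
```

Inside catch (IOException ex) around File.Move, the file would be rescheduled only when IsFileLocked(ex) is true, and the exception rethrown otherwise.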

One possible alternative is by using MoveFileEx with a MOVEFILE_DELAY_UNTIL_REBOOT flag. If you don't have access to move the file right now, you can schedule it to be moved on the next reboot when it's guaranteed to be accessible (the moving happens very early in the boot sequence).
Depending on your specific application, you could inform the user a reboot is necessary and initiate the reboot yourself in addition to the moving scheduling.

It's simple:
static void Main(string[] args)
{
//* Create Watcher object.
FileSystemWatcher watcher = new FileSystemWatcher(@"C:\MyFolder\");
//* Assign event handler.
watcher.Created += new FileSystemEventHandler(watcher_Created);
//* Start watching.
watcher.EnableRaisingEvents = true;
Console.ReadLine();
}
static void watcher_Created(object sender, FileSystemEventArgs e)
{
try
{
File.Move(e.FullPath, @"C:\MyMovedFolder\" + e.Name);
}
catch (Exception)
{
//* Something went wrong. You can do additional processing here, like firing up a new thread to retry the move.
}
}

This is not specific to your problem, but generally you will always need to retain the 'try it and gracefully deal with a failure' mode of operation for this sort of action.
That's because however clever your 'detect that the file is available' mechanism is, there will always be some amount of time between you detecting that the file is available and moving it, and in that time someone else might mess with the file.

The scheduled retry on exception (probably increasing delays - up to a point) is probably the simplest way to achieve this (your (2) ).
To do it properly you're going to have to drop to system-level hooks (with kernel code) to trap the file close event, which has its own idiosyncrasies. It's a big job, several orders of magnitude more complex than the scheduled retry method. It's up to you and your application case to make that call, but I don't know of anything effective in between.

Very old question, but google led me here, so when I found a better answer I decided to post it:
There's some nice code I found in the dotnet CLI repo:
/// <summary>
/// Run Directory.Move and File.Move in Windows has a chance to get IOException with
/// HResult 0x80070005 due to Indexer. But this error is transient.
/// </summary>
internal static void RetryOnMoveAccessFailure(Action action)
{
const int ERROR_HRESULT_ACCESS_DENIED = unchecked((int)0x80070005);
int nextWaitTime = 10;
int remainRetry = 10;
while (true)
{
try
{
action();
break;
}
catch (IOException e) when (e.HResult == ERROR_HRESULT_ACCESS_DENIED)
{
Thread.Sleep(nextWaitTime);
nextWaitTime *= 2;
remainRetry--;
if (remainRetry == 0)
{
throw;
}
}
}
}
There is also a method for just IOException. Here's the usage example:
FileAccessRetrier.RetryOnMoveAccessFailure(() => Directory.Move(packageDirectory.Value, tempPath));
Overall, this repo contains a lot of interesting ideas for file manipulation and installation/removal logic, like TransactionalAction, so I recommend reviewing it. Unfortunately, these functions are not available as a NuGet package.

Have a look at the FileSystemWatcher.
http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher(VS.90).aspx
Listens to the file system change notifications and raises events when a directory, or file in a directory, changes.
