I'm monitoring a folder on a network mapped drive using FileSystemWatcher. I know it can be janky and report repeated events and various other issues, but I'm also running into a problem where it reports a file modification before my app is able to open the file (as in, after receiving multiple Changed events, I try to open the file and get a FileNotFoundException).
My current protection:
I keep a dictionary of per-filename timers and a list of things that were already processed:
static private Dictionary<string, Timer> fileChangeTimeout = new Dictionary<string, Timer>();
static private Mutex newFileHandlerLock = new Mutex();
static private List<(DateTime, string)> recentlyHandled = new List<(DateTime, string)>();
Modifications are registered to fire after waitTime (3s) of no changes:
private void ScheduleFileHandling(object sender, FileSystemEventArgs e){
newFileHandlerLock.WaitOne();
if(fileChangeTimeout.TryGetValue(e.FullPath, out var timer))
timer.Change(waitTime, Timeout.Infinite);
else
fileChangeTimeout[e.FullPath] = new Timer(HandleFile, e.FullPath, waitTime, Timeout.Infinite);
newFileHandlerLock.ReleaseMutex();
}
No-more-writes delay timer finally runs out:
private void HandleFile(object state)
{
newFileHandlerLock.WaitOne();
// remove the timer
// return if file in recentlyHandled with time difference < 10s
// add file to recentlyHandled
newFileHandlerLock.ReleaseMutex();
}
This eliminated all the issues I've had with files either being empty or not fully written. But even after all that delay I sometimes run into "file not found" situations which don't make sense. (the file is in the folder when checked manually)
Is there any way I can make this process more reliable, or do I have to resort to just retrying for a few seconds and hoping for the best?
Related
I am developing a .NET application in which I use the FileSystemWatcher class and have attached a handler to its Created event for a folder. I have to perform an action on this event (i.e. copy the file to some other location). When I put a large file into the watched folder, the event is raised immediately, even though the file copy has not yet completed. I don't want to check for this with the file.open method.
Is there any way to be notified that the file copy into the watched folder has completed, so that my event fires only then?
It is indeed a bummer that FileSystemWatcher (and the underlying ReadDirectoryChangesW API) provide no way to get notified when a new file has been fully created.
The best and safest way around this that I've come across so far (and that doesn't rely on timers) goes like this:
Upon receiving the Created event, start a thread that, in a loop, checks whether the file is still locked (using an appropriate retry interval and maximum retry count). The only way to check if a file is locked is by trying to open it with exclusive access: If it succeeds (not throwing an IOException), then the File is done copying, and your thread can raise an appropriate event (e.g. FileCopyCompleted).
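A minimal sketch of that retry loop, assuming a blocking helper is acceptable on a worker thread. The names `FileReadiness` and `WaitUntilUnlocked` are made up for illustration, and the retry interval and count are placeholders to tune:

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileReadiness
{
    // Returns true once the file can be opened exclusively, or false if it
    // never became available within maxRetries attempts.
    public static bool WaitUntilUnlocked(string path, TimeSpan retryInterval, int maxRetries)
    {
        for (int attempt = 0; attempt < maxRetries; attempt++)
        {
            try
            {
                // FileShare.None fails while another process still holds the file open.
                using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
                {
                    return true;
                }
            }
            catch (IOException)
            {
                // Still locked, or not visible yet
                // (FileNotFoundException derives from IOException).
                Thread.Sleep(retryInterval);
            }
        }
        return false;
    }
}
```

A Created handler could queue a call such as `FileReadiness.WaitUntilUnlocked(e.FullPath, TimeSpan.FromMilliseconds(500), 20)` on the thread pool and raise its own "copy completed" event when the call returns true.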
I have had the exact same problem, and solved it this way:
Set FileSystemWatcher to notify when files are created and when they are modified.
When a notification comes in:
a. If there is no timer set for this filename (see below), set a timer to expire in a suitable interval (I commonly use 1 second).
b. If there is a timer set for this filename, cancel the timer and set a new one to expire in the same interval.
When a timer expires, you know that the associated file has been created or modified and has been untouched for the time interval. This means that the copy/modify is probably done and you can now process it.
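The steps above can be sketched roughly as follows. This is an illustration rather than a drop-in implementation: `FileChangeDebouncer`, `Touch` and the `onQuiet` callback are hypothetical names, and the quiet period is whatever interval suits your files.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class FileChangeDebouncer : IDisposable
{
    private readonly object gate = new object();
    private readonly Dictionary<string, Timer> timers =
        new Dictionary<string, Timer>(StringComparer.OrdinalIgnoreCase);
    private readonly TimeSpan quietPeriod;
    private readonly Action<string> onQuiet;

    public FileChangeDebouncer(TimeSpan quietPeriod, Action<string> onQuiet)
    {
        this.quietPeriod = quietPeriod;
        this.onQuiet = onQuiet;
    }

    // Call this from both the Created and the Changed handlers.
    public void Touch(string fullPath)
    {
        lock (gate)
        {
            if (timers.TryGetValue(fullPath, out Timer existing))
            {
                // Step b: a timer already exists, so restart its countdown.
                existing.Change(quietPeriod, Timeout.InfiniteTimeSpan);
            }
            else
            {
                // Step a: first notification for this file, start a one-shot timer.
                timers[fullPath] = new Timer(Expired, fullPath, quietPeriod, Timeout.InfiniteTimeSpan);
            }
        }
    }

    private void Expired(object state)
    {
        string fullPath = (string)state;
        lock (gate)
        {
            if (timers.TryGetValue(fullPath, out Timer timer))
            {
                timers.Remove(fullPath);
                timer.Dispose();
            }
        }
        // The file has been untouched for the whole quiet period.
        onQuiet(fullPath);
    }

    public void Dispose()
    {
        lock (gate)
        {
            foreach (Timer timer in timers.Values) timer.Dispose();
            timers.Clear();
        }
    }
}
```

Wiring it up would look like `watcher.Changed += (s, e) => debouncer.Touch(e.FullPath);`. A notification that arrives at the very moment the timer fires can still slip through, so the callback should tolerate a file that is not quite finished.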
You could listen for the modified event, and start a timer. If the modified event is raised again, reset the timer. When the timer has reached a certain value without the modify event being raised you can try to perform the copy.
I subscribe to the Changed- and Renamed-event and try to rename the file on every Changed-event catching the IOExceptions. If the rename succeeds, the copy has finished and the Rename-event is fired only once.
There are three issues with FileSystemWatcher. The first is that it can send out duplicate creation events, so you check for that with something like:
this.watcher.Created += (s, e) =>
{
if (!this.seen.ContainsKey(e.FullPath)
|| (DateTime.Now - this.seen[e.FullPath]) > this.seenInterval)
{
this.seen[e.FullPath] = DateTime.Now;
ThreadPool.QueueUserWorkItem(
this.WaitForCreatingProcessToCloseFileThenDoStuff, e.FullPath);
}
};
where this.seen is a Dictionary<string, DateTime> and this.seenInterval is a TimeSpan.
Next, you have to wait for the file creator to finish writing it (the issue raised in the question). And, third, you must be careful: sometimes the creation event is raised before the file can actually be opened, which gives you a FileNotFoundException, and the file can also be removed before you get hold of it, which gives a FileNotFoundException as well.
private void WaitForCreatingProcessToCloseFileThenDoStuff(object threadContext)
{
// Make sure the just-found file is done being
// written by repeatedly attempting to open it
// for exclusive access.
var path = (string)threadContext;
DateTime started = DateTime.Now;
DateTime lastLengthChange = DateTime.Now;
long lastLength = 0;
var noGrowthLimit = new TimeSpan(0, 5, 0);
var notFoundLimit = new TimeSpan(0, 0, 1);
for (int tries = 0;; ++tries)
{
try
{
using (var fileStream = new FileStream(
path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
{
// Do Stuff
}
break;
}
catch (FileNotFoundException)
{
// Sometimes the file appears before it is there.
if (DateTime.Now - started > notFoundLimit)
{
// Should be there by now
break;
}
}
catch (IOException ex)
{
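// mask in severity, customer, and code
// (keep the masked value as a long: cast to int it becomes negative
// and can never equal the unsigned constants below)
long hr = ex.HResult & 0xA000FFFF;
if (hr != 0x80000020 && hr != 0x80000021)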
{
// not a share violation or a lock violation
throw;
}
}
try
{
var fi = new FileInfo(path);
if (fi.Length > lastLength)
{
lastLength = fi.Length;
lastLengthChange = DateTime.Now;
}
}
catch (Exception ex)
{
}
// still locked
if (DateTime.Now - lastLengthChange > noGrowthLimit)
{
// 5 minutes, still locked, no growth.
break;
}
Thread.Sleep(111);
}
}
You can, of course, set your own timeouts. This code leaves enough time for a 5 minute hang. Real code would also have a flag to exit the thread if requested.
This answer is a bit late, but if possible I'd get the source process to copy a small marker file after the large file or files and use the FileSystemWatcher on that.
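As a rough sketch of the marker-file idea, assume the producer writes `name.ext` first and then an empty `name.ext.done` next to it. The `.done` suffix and the `MarkerFileWatcher` class are conventions invented for this example, not something FileSystemWatcher provides:

```csharp
using System;
using System.IO;

public static class MarkerFileWatcher
{
    // Hypothetical convention: "big.iso" is complete once "big.iso.done" appears.
    public const string MarkerExtension = ".done";

    // Maps a marker path back to the payload file it announces.
    public static string PayloadPathFor(string markerPath)
    {
        return markerPath.Substring(0, markerPath.Length - MarkerExtension.Length);
    }

    // Watches only for marker files, so the large file itself never raises an event.
    public static FileSystemWatcher Watch(string folder, Action<string> onPayloadReady)
    {
        var watcher = new FileSystemWatcher(folder, "*" + MarkerExtension);
        watcher.Created += (sender, e) =>
        {
            string payload = PayloadPathFor(e.FullPath);
            if (File.Exists(payload)) onPayloadReady(payload);
        };
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The marker is tiny, so its own Created event arrives when it is effectively complete, which sidesteps the "is the big file finished yet" question as long as you control the producer.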
Try to set filters
myWatcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite;
I built a console app that monitors a set of folders on a Windows 2019 Server and copies any newly-created .txt files to another folder, using the same file name. So far it's working for the basic functionality. Now I have to handle the fact that most of the time, these files are large and take several minutes to complete creation. I have gone through several SO posts and pieced together the following code trying to accomplish this:
using System;
using System.IO;
namespace Folderwatch
{
class Program
{
static void Main(string[] args)
{
string sourcePath = @"C:\Users\me\Documents\SomeFolder";
FileSystemWatcher watcher = new FileSystemWatcher(sourcePath);
watcher.EnableRaisingEvents = true;
watcher.IncludeSubdirectories = true;
watcher.Filter = "*.txt";
// Add event handlers.
watcher.Created += new FileSystemEventHandler(OnCreated);
}
// Define the event handlers.
private static void OnCreated(object source, FileSystemEventArgs e)
{
// Specify what is done when a file is created.
FileInfo file = new FileInfo(e.FullPath);
string wctPath = e.FullPath;
string wctName = e.Name;
string createdFile = Path.GetFileName(wctName);
string destPath = @"C:\Users\SomeOtherFolder";
string sourceFile = wctPath;
string destFile = Path.Combine(destPath, createdFile);
WaitForFile(file);
File.Copy(sourceFile, destFile, true);
}
public static bool IsFileLocked(FileInfo file)
{
try
{
using (FileStream stream = file.Open(FileMode.Open, FileAccess.Read, FileShare.None))
{
stream.Close();
}
}
catch (IOException)
{
//the file is unavailable because it is:
//still being written to
//or being processed by another thread
//or does not exist (has already been processed)
return true;
}
//file is not locked
return false;
}
public static void WaitForFile(FileInfo filename)
{
//This will lock the execution until the file is ready
//TODO: Add some logic to make it async and cancelable
while (!IsFileLocked(filename)) { }
}
}
}
What I'm attempting to do in the OnCreated method is to check and wait until the file is done being created, and then copy the file to another destination. I don't seem to know what I'm doing with the WaitForFile(file) line - if I comment out that line and the file creation is instant, the file copies as intended. If I use the WaitForFile line, nothing ever happens. I took the IsFileLocked and WaitForFile methods from other posts on SO, but I'm clearly not implementing them correctly.
I've noted this Powershell version Copy File On Creation (once complete) and I'm not sure if the answer here could be pointing me in the right direction b/c I'm even less versed in PS than I am in C#.
EDIT #1: I should have tested for longer before accepting the answer - I think we're close but after about a minute of the program running, I got the following error before the program crashed:
Unhandled exception. System.IO.IOException: The process cannot access
the file
'C:\Users\me\Dropbox\test1.log'
because it is being used by another process. at
System.IO.FileSystem.CopyFile(String sourceFullPath, String
destFullPath, Boolean overwrite) at
Folderwatch.Program.OnCreated(Object source, FileSystemEventArgs e) in
C:\Users\me\OneDrive -
Development\Source\repos\FolderWatchCG\FolderWatchCG\Program.cs:line
61 at System.Threading.Tasks.Task.<>c.b__139_1(Object
state) at
System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
at System.Threading.ThreadPoolWorkQueue.Dispatch() at
System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
Any advice on this would be appreciated. As I further analyze the files in these folders, some of them are log files getting written in realtime, so it could be that the file is being written to for hours before it's actually completed. I am wondering if somehow one of the NotifyFilter comes into play here?
There's a bug in the WaitForFile() method, that is, it currently waits while the file is not locked (not the other way around). In addition to that, you need a way to confirm that the file actually exists. A simple way to achieve that would be to change the WaitForFile() method into something like this:
public static bool WaitForFile(FileInfo file)
{
while (IsFileLocked(file))
{
// The file is inaccessible. Let's check if it exists.
if (!file.Exists) return false;
}
// The file is accessible now.
return true;
}
This will keep waiting as long as the file exists and is inaccessible.
Then, you can use it as follows:
bool fileAvailable = WaitForFile(file);
if (fileAvailable)
{
File.Copy(sourceFile, destFile, true);
}
The problem with this approach though is that the while loop keeps the thread busy, which a) consumes a considerable amount of the CPU resources, and b) prevents the program from processing other files until it finishes waiting for that one file. So, it's probably better to use an asynchronous wait between each check.
Change the WaitForFile method to:
public static async Task<bool> WaitForFile(FileInfo file)
{
while (IsFileLocked(file))
{
// The file is inaccessible. Let's check if it exists.
if (!file.Exists) return false;
await Task.Delay(100);
}
// The file is accessible now.
return true;
}
Then, await it inside OnCreated like this:
private async static void OnCreated(object source, FileSystemEventArgs e)
{
// ...
bool fileAvailable = await WaitForFile(file);
if (fileAvailable)
{
File.Copy(sourceFile, destFile, true);
}
}
It's code that will execute 4 threads in 15-min intervals. The last time that I ran it, the first 15-minutes were copied fast (20 files in 6 minutes), but the 2nd 15-minutes are much slower. It's something sporadic and I want to make certain that, if there's any bottleneck, it's in a bandwidth limitation with the remote server.
EDIT: I'm monitoring the last run and the 15:00 and :45 copied in under 8 minutes each. The :15 hasn't finished and neither has :30, and both began at least 10 minutes before :45.
Here's my code:
static void Main(string[] args)
{
Timer t0 = new Timer((s) =>
{
Class myClass0 = new Class();
myClass0.DownloadFilesByPeriod(taskRunDateTime, 0, cts0.Token);
Copy0Done.Set();
}, null, TimeSpan.FromMinutes(20), TimeSpan.FromMilliseconds(-1));
Timer t1 = new Timer((s) =>
{
Class myClass1 = new Class();
myClass1.DownloadFilesByPeriod(taskRunDateTime, 1, cts1.Token);
Copy1Done.Set();
}, null, TimeSpan.FromMinutes(35), TimeSpan.FromMilliseconds(-1));
Timer t2 = new Timer((s) =>
{
Class myClass2 = new Class();
myClass2.DownloadFilesByPeriod(taskRunDateTime, 2, cts2.Token);
Copy2Done.Set();
}, null, TimeSpan.FromMinutes(50), TimeSpan.FromMilliseconds(-1));
Timer t3 = new Timer((s) =>
{
Class myClass3 = new Class();
myClass3.DownloadFilesByPeriod(taskRunDateTime, 3, cts3.Token);
Copy3Done.Set();
}, null, TimeSpan.FromMinutes(65), TimeSpan.FromMilliseconds(-1));
}
public struct FilesStruct
{
public string RemoteFilePath;
public string LocalFilePath;
}
private void DownloadFilesByPeriod(DateTime TaskRunDateTime, int Period, CancellationToken token)
{
FilesStruct[] Array = GetAllFiles(TaskRunDateTime, Period);
//Array has 20 files for the specific period.
using (Session session = new Session())
{
// Connect
session.Open(sessionOptions);
TransferOperationResult transferResult;
foreach (FilesStruct u in Array)
{
if (session.FileExists(u.RemoteFilePath)) //File exists remotely
{
if (!File.Exists(u.LocalFilePath)) //File does not exist locally
{
transferResult = session.GetFiles(u.RemoteFilePath, u.LocalFilePath);
transferResult.Check();
foreach (TransferEventArgs transfer in transferResult.Transfers)
{
//Log that File has been transferred
}
}
else
{
using (StreamWriter w = File.AppendText(Logger._LogName))
{
//Log that File exists locally
}
}
}
else
{
using (StreamWriter w = File.AppendText(Logger._LogName))
{
//Log that File does not exist remotely
}
}
if (token.IsCancellationRequested)
{
break;
}
}
}
}
Something is not quite right here. First thing is, you're setting 4 timers to run parallel. If you think about it, there is no need. You don't need 4 threads running parallel all the time. You just need to initiate tasks at specific intervals. So how many timers do you need? ONE.
The second problem is why TimeSpan.FromMilliseconds(-1)? What is the purpose of that? I can't figure out why you put that in there, but I wouldn't.
The third problem, not related to multi-programming, but I should point out anyway, is that you create a new instance of Class each time, which is unnecessary. It would be necessary if, in your class, you need to set constructors and your logic access different methods or fields of the class in some order. In your case, all you want to do is to call the method. So you don't need a new instance of the class every time. You just need to make the method you're calling static.
Here is what I would do:
Store the files you need to download in an array / List<>. Can't you spot out that you're doing the same thing every time? Why write 4 different versions of code for that? This is unnecessary. Store items in an array, then just change the index in the call!
Setup the timer at perhaps 5 seconds interval. When it reaches the 20 min/ 35 min/ etc. mark, spawn a new thread to do the task. That way a new task can start even if the previous one is not finished.
Wait for all threads to complete (terminate). When they do, check if they throw exceptions, and handle them / log them if necessary.
After everything is done, terminate the program.
For step 2, you have the option to use the new async keyword if you're using .NET 4.5. But it won't make a noticeable difference if you use threads manually.
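One way to sketch steps 2 and 3 is with tasks instead of a polling timer, which is a swapped-in technique: each period gets a task that waits for its start offset, runs the work, and all tasks are awaited together so failures can be collected. `PeriodScheduler`, `RunAsync` and the `work` delegate are hypothetical names, and `work` stands in for something like `DownloadFilesByPeriod`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PeriodScheduler
{
    // Starts work(period, token) after each offset and returns any exceptions thrown.
    public static async Task<IReadOnlyList<Exception>> RunAsync(
        IReadOnlyList<TimeSpan> startOffsets,
        Action<int, CancellationToken> work,
        CancellationToken token)
    {
        var tasks = new List<Task>();
        for (int period = 0; period < startOffsets.Count; period++)
        {
            int captured = period; // copy: the for-loop variable is shared across iterations
            tasks.Add(Task.Run(async () =>
            {
                await Task.Delay(startOffsets[captured], token);
                work(captured, token);
            }));
        }

        try
        {
            // A later period can start even if an earlier one is still running.
            await Task.WhenAll(tasks);
        }
        catch
        {
            // Individual failures are collected from the tasks below.
        }

        return tasks.Where(t => t.IsFaulted)
                    .SelectMany(t => t.Exception.InnerExceptions)
                    .ToList();
    }
}
```

With the offsets from the question, the call would pass `TimeSpan.FromMinutes(20)`, `35`, `50` and `65`, and the program can log the returned exceptions and exit once `RunAsync` completes.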
And why is it so slow...why don't you check your system status using task manager? Is the CPU high and running or is the network throughput occupied by something else or what? You can easily tell the answer yourself from there.
The problem was the sftp client.
The purpose of the console application was to loop through a List<> and download the files. I tried WinSCP and, even though it did the job, it was very slow. I also tested SharpSSH and it was even slower than WinSCP.
I finally ended up using SSH.NET which, at least in my particular case, was much faster than both WinSCP and SharpSSH. I think the problem with WinSCP is that there was no evident way of disconnecting after I was done. With SSH.NET I could connect/disconnect after every file download was made, something I couldn't do with WinSCP.
I am working on a logging system for a web application which logs a sequence of events in a dictionary object before sending it to my logging object using Task.Factory.StartNew(() => iLogEventSave()). The logger seemed to work fine, but in some instances some events were not being saved properly so I used the lock() statement to correct the issue. This seemed to do the trick, but the application's performance has dramatically decreased by doing this. How can I have the UI/Page render without having to wait for the Tasks to finish their job?
Below is the code
private static readonly object Locker = new object();
public void iLogEventSave(object state)
{
XmlDocument doc = new XmlDocument();
IDictionary<string, string> EventDetails = (IDictionary<string, string>)state;
string logFile = "";
if(ConfigurationManager.AppSettings["Log_File_Path"].ToString() =="")
{
logFile = HttpRuntime.AppDomainAppPath + "Logs\\" + DateTime.Now.ToString("yyyy_MM_dd") + ".txt";
}
else
{
logFile = ConfigurationManager.AppSettings["Log_File_Path"].ToString() + DateTime.Now.ToString("yyyy_MM_dd") + ".txt";
}
lock (Locker)
{
if (File.Exists(logFile))
{
doc.Load(logFile);
}
else
{
var root = doc.CreateElement("Log");
doc.AppendChild(root);
}
var el = (XmlElement)doc.DocumentElement.AppendChild(doc.CreateElement("Event"));
foreach (KeyValuePair<string, string> item in EventDetails)
{
XmlElement Desc = doc.CreateElement("Details");
Desc.SetAttribute(item.Key.ToString(), item.Value);
el.AppendChild(Desc);
}
doc.Save(logFile);
}
}
If your log did not save several events while being executed asynchronously, you have an unhandled error that you did not address. Considering that you're using a file, I'm going to go out on a limb and say that it failed because two threads were competing for access to the same log file and the first thread to grab it locked the other one out. This is why your lock would now work, it prevents other threads from trying to grab the file.
But logging to a file means that you've effectively restricted yourself to one thread at a time and dealing with the entire file as it grows. You have to load more and more, append more and more, and locking the thread means that the more threads are waiting to log the events, the higher your overhead. All this could certainly add up to a decrease in performance.
May I recommend using a database table to log events? File I/O is very expensive, resource and time-wise. Databases have less overhead and far better throughput by comparison in these very scenarios.
I am writing a WPF application in c# and I need to move some files--the rub is that I really REALLY need to know if the files make it. To do this, I wrote a check that makes sure that the file gets to the target directory after the move--the problem is that sometimes I get to the check before the file finishes moving:
System.IO.File.Move(file.FullName, endLocationWithFile);
System.IO.FileInfo[] filesInDirectory = endLocation.GetFiles();
foreach (System.IO.FileInfo temp in filesInDirectory)
{
if (temp.Name == shortFileName)
{
return true;
}
}
// The file we sent over has not gotten to the correct directory....something went wrong!
throw new IOException("File did not reach destination");
}
catch (Exception e)
{
//Something went wrong, return a fail;
logger.writeErrorLog(e);
return false;
}
Could somebody tell me how to make sure that the file actually gets to the destination?--The files that I will be moving could be VERY large--(Full HD mp4 files of up to 2 hours)
Thanks!
You could use streams with async/await to ensure the file is completely copied.
Something like this should work:
private void Button_Click(object sender, RoutedEventArgs e)
{
string sourceFile = @"\\HOMESERVER\Development Backup\Software\Microsoft\en_expression_studio_4_premium_x86_dvd_537029.iso";
string destinationFile = "G:\\en_expression_studio_4_premium_x86_dvd_537029.iso";
MoveFile(sourceFile, destinationFile);
}
private async void MoveFile(string sourceFile, string destinationFile)
{
try
{
using (FileStream sourceStream = File.Open(sourceFile, FileMode.Open))
{
using (FileStream destinationStream = File.Create(destinationFile))
{
await sourceStream.CopyToAsync(destinationStream);
if (MessageBox.Show("I made it in one piece :), would you like to delete me from the original file?", "Done", MessageBoxButton.YesNo) == MessageBoxResult.Yes)
{
sourceStream.Close();
File.Delete(sourceFile);
}
}
}
}
catch (IOException ioex)
{
MessageBox.Show("An IOException occurred during move, " + ioex.Message);
}
catch (Exception ex)
{
MessageBox.Show("An Exception occurred during move, " + ex.Message);
}
}
If you're using VS2010, you will have to install the Async CTP to use the new async/await syntax.
You could watch for the files to disappear from the original directory, and then confirm that they indeed appeared in the target directory.
I have not had great experience with file watchers. I would probably have the thread doing the move wait for an AutoResetEvent while a separate thread or timer runs to periodically check for the files to disappear from the original location, check that they are in the new location, and perhaps (depending on your environment and needs) perform a consistency check (e.g. MD5 check) of the files. Once those conditions are satisfied, the "checker" thread/timer would trigger the AutoResetEvent so that the original thread can progress.
Include some "this is taking way too long" logic in the "checker".
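For the consistency-check part, a sketch along these lines may help. It compares two files by length and then by hash, using SHA-256 where the text mentions MD5; `CopyVerifier` and `ContentsMatch` are names made up for this example. For a move, the source disappears afterwards, so this assumes you either hash the source beforehand or verify a copy before deleting the original:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class CopyVerifier
{
    // True when both files have the same length and the same SHA-256 hash.
    public static bool ContentsMatch(string firstPath, string secondPath)
    {
        // Cheap check first: files of different length cannot match.
        if (new FileInfo(firstPath).Length != new FileInfo(secondPath).Length)
        {
            return false;
        }
        return Hash(firstPath).SequenceEqual(Hash(secondPath));
    }

    private static byte[] Hash(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            // ComputeHash reads the stream in chunks, so large files are not loaded whole.
            return sha.ComputeHash(stream);
        }
    }
}
```

Hashing a two-hour HD video takes a noticeable amount of time, so the length comparison alone may be enough when the transfer path is trusted.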
Why not manage the copy yourself by copying streams?
//http://www.dotnetthoughts.net/writing_file_with_non_cache_mode_in_c/
const FileOptions FILE_FLAG_NO_BUFFERING = (FileOptions) 0x20000000;
//experiment with different buffer sizes for optimal speed
var bufLength = 4096;
using(var outFile =
new FileStream(
destPath,
FileMode.Create,
FileAccess.Write,
FileShare.None,
bufLength,
FileOptions.WriteThrough | FILE_FLAG_NO_BUFFERING))
using(var inFile = File.OpenRead(srcPath))
{
//either
//inFile.CopyTo(outFile);
//or
var fileSizeInBytes = inFile.Length;
var buf = new byte[bufLength];
long totalCopied = 0L;
int amtRead;
while((amtRead = inFile.Read(buf,0,bufLength)) > 0)
{
outFile.Write(buf,0,amtRead);
totalCopied += amtRead;
double progressPct =
Convert.ToDouble(totalCopied) * 100d / fileSizeInBytes;
progressPct.Dump();
}
}
//file is written
You most likely want the move to happen in a separate thread so that you aren't stopping the execution of your application for hours.
If the program cannot continue without the move being completed, then you could open a dialog and check in on the move thread periodically to update a progress tracker. This provides the user with feedback and will prevent them from feeling as if the program has frozen.
There's info and an example on this here:
http://hintdesk.com/c-wpf-copy-files-with-progress-bar-by-copyfileex-api/
Try checking periodically in a background task whether the copied file size has reached the size of the original file (you can also compare hashes of the two files).
Got similar problem recently.
OnBackupStarts();
//.. do stuff
new TaskFactory().StartNew(() =>
{
OnBackupStarts();
//.. do stuff
OnBackupEnds();
});
void OnBackupEnds()
{
if (BackupChanged != null)
{
BackupChanged(this, new BackupChangedEventArgs(BackupState.Done));
}
}
Do not wait; react to the event.
In the first place, consider that moving a file in an operating system does not "recreate" the file in the new directory. It only changes its location data in the file allocation table, because physically copying all the bytes and then deleting the old ones would be a waste of time.
Due to that reason, moving files is a very fast process, no matter the file size.
EDIT: As Mike Christiansen states in his comment, this "speedy" process only happens when files are moving inside the same volume (you know, C:\... to C:\...)
Thus, the copy/delete behavior proposed by "sa_ddam213" in his response will work, but it is not the optimal solution (it takes longer to finish, and it will not work if, for example, you don't have enough free disk space to hold the copy while the old file still exists).
The MSDN documentation for the File.Move(source, destination) method does not specify whether it waits for completion, but the example code it gives makes a simple File.Exists(…) check and says that still finding the original file there "is unexpected":
// Move the file.
File.Move(path, path2);
Console.WriteLine("{0} was moved to {1}.", path, path2);
// See if the original exists now.
if (File.Exists(path))
{
Console.WriteLine("The original file still exists, which is unexpected.");
}
else
{
Console.WriteLine("The original file no longer exists, which is expected.");
}
Perhaps, you could use a similar approach to this one, checking in a while loop for the existence of the new file, and the non existence of the old one, giving a “timer” exit for the loop just in case something unexpected happens at operating system level, and the files get lost:
// We perform the movement of the file
File.Move(source,destination);
// Set an "exit" datetime after which the loop will end, for example 15 seconds. The move should always be quicker than that (almost immediate) if the files are on the same volume, but not if they are on different ones
DateTime exitDateTime = DateTime.Now.AddSeconds(15);
bool exitLoopByExpiration = false;
// We stop here until the move is finished (source gone and destination present) or the time limit is exceeded
while ((File.Exists(source) || !File.Exists(destination)) && !exitLoopByExpiration ) {
// We compare the current datetime with the exit one to see if we have reached the exit time. If so, we set the flag to exit the loop because of expiration, not because the file was moved
if (DateTime.Now.CompareTo(exitDateTime) > 0) { exitLoopByExpiration = true; }
}
//
if (exitLoopByExpiration) {
// We can perform extra work here, like logging problems or throwing an exception, if the loop exits because of time expiration
}
I have checked this solution and seems to work without problems.