I have an application that requires two files to process data: a zip file containing the actual data, and a control file that says what to do with that data.
These files are downloaded via SFTP to a staging directory. Once the zip file is complete, I need to check whether the control file is there as well. They only share a naming prefix (e.g. 100001_ABCDEF_123456.zip is paired with 100001_ABCDEF_control_file.ctl).
I am trying to find a way to wait for the zip file to finish downloading and then move both files on the fly, while maintaining the directory structure, as that is important for the next step in processing.
Currently I wait until the SFTP worker finishes and then call robocopy to move everything. I would like a more polished approach.
I have tried several things and I get the same result: files download but never move. For some reason I just cannot get the comparison to work correctly.
I have tried using a FileSystemWatcher to look for the rename from .filepart to .zip, but it seems to miss several downloads, and for some reason the function dies when I get to the foreach that searches the directory for the control file.
Below is the setup for the FileSystemWatcher, followed by the event handler, which I wire up for both the Created and Changed events.
watcher.Path = @"C:\Sync\";
watcher.IncludeSubdirectories = true;
watcher.EnableRaisingEvents = true;
watcher.Filter = "*.zip";
watcher.NotifyFilter = NotifyFilters.Attributes |
                       NotifyFilters.CreationTime |
                       NotifyFilters.FileName |
                       NotifyFilters.LastAccess |
                       NotifyFilters.LastWrite |
                       NotifyFilters.Size |
                       NotifyFilters.Security |
                       NotifyFilters.DirectoryName;
watcher.Created += Watcher_Changed;
watcher.Changed += Watcher_Changed;
private void Watcher_Changed(object sender, FileSystemEventArgs e)
{
    var dir = new DirectoryInfo(e.FullPath.Substring(0, e.FullPath.Length - e.Name.Length));
    var files = dir.GetFiles();
    FileInfo zipFile = new FileInfo(e.FullPath);
    foreach (FileInfo file in files)
    {
        MessageBox.Show(file.Extension);
        if (file.Extension == "ctl" && file.Name.StartsWith(e.Name.Substring(0, (e.Name.Length - 14))))
        {
            file.CopyTo(@"C:\inp\");
            zipFile.CopyTo(@"C:\inp\");
        }
    }
}
Watcher_Changed is going to get called for all sorts of things, and you won't want to react every time it's called.
The first thing you should do in the event handler is try to open zipFile exclusively. If you cannot, ignore this event and wait for another one; if this is an FTP server, you'll get a Changed event every time a new chunk of data is written to disk. You could also put something on a "retry" queue or use some other mechanism to check whether the file is available at a later time. I have a similar need in our system, and we retry every 5 seconds after we notice the first change. Only once we can exclusively open the file for writing do we allow it to move on to the next step.
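A minimal sketch of that exclusive-open check might look like this (the helper name IsFileReady and where you call it are my own; it assumes System.IO is in scope):

// Sketch: returns true only once no other process holds the file open.
private static bool IsFileReady(string path)
{
    try
    {
        // Opening with FileShare.None fails while the SFTP worker is still writing the file.
        using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
        {
            return true;
        }
    }
    catch (IOException)
    {
        // Sharing violation or file not fully created yet - try again on a later event or timer tick.
        return false;
    }
}

In Watcher_Changed you would then bail out early whenever IsFileReady(e.FullPath) returns false.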
I would tighten up your assumptions about what the filename looks like. You're limiting the search to *.zip, but don't depend on only your .zip files existing in that target directory. Validate that the parsing you're doing of the filename isn't hitting unexpected values. You may also want to check that dir.Exists() before calling dir.GetFiles(). That could be throwing exceptions.
As to missing events, see this good answer on buffer overflows: FileSystemWatcher InternalBufferOverflow
The FileSystemWatcher class is notoriously tricky to use correctly, because you will get multiple events for a single file that is being written to, moved or copied, as @WillStoltenberg also mentioned in his answer.
I have found that it is much easier just to setup a task that runs periodically (e.g. every 30 seconds). For your problem, you could easily do something like the below. Note that a similar implementation using a Timer, instead of the Task.Delay, may be preferable.
using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class MyPeriodicWatcher
{
    private readonly string _watchPath;
    private readonly string _searchMask;
    private readonly Func<string, string> _commonPrefixFetcher;
    private readonly Action<FileInfo, FileInfo> _pairProcessor;
    private readonly TimeSpan _checkInterval;
    private readonly CancellationToken _cancelToken;

    public MyPeriodicWatcher(
        string watchPath,
        string searchMask,
        Func<string, string> commonPrefixFetcher,
        Action<FileInfo, FileInfo> pairProcessor,
        TimeSpan checkInterval,
        CancellationToken cancelToken)
    {
        _watchPath = watchPath;
        _searchMask = string.IsNullOrWhiteSpace(searchMask) ? "*.zip" : searchMask;
        _pairProcessor = pairProcessor;
        _commonPrefixFetcher = commonPrefixFetcher;
        _cancelToken = cancelToken;
        _checkInterval = checkInterval;
    }

    public async Task Watch()
    {
        while (!_cancelToken.IsCancellationRequested)
        {
            try
            {
                foreach (var file in Directory.EnumerateFiles(_watchPath, _searchMask))
                {
                    var pairPrefix = _commonPrefixFetcher(file);
                    if (!string.IsNullOrWhiteSpace(pairPrefix))
                    {
                        var match = Directory.EnumerateFiles(_watchPath, pairPrefix + "*.ctl").FirstOrDefault();
                        if (!string.IsNullOrEmpty(match) && !_cancelToken.IsCancellationRequested)
                            _pairProcessor(
                                new FileInfo(Path.Combine(_watchPath, file)),
                                new FileInfo(Path.Combine(_watchPath, match)));
                    }

                    if (_cancelToken.IsCancellationRequested)
                        break;
                }

                if (_cancelToken.IsCancellationRequested)
                    break;

                await Task.Delay(_checkInterval, _cancelToken).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break;
            }
        }
    }
}
You will need to provide it with
the path to monitor
the search mask for the first file (i.e. *.zip)
a function delegate that gets the common file name prefix from the zip file name
an interval
the delegate that will perform the moving and receives the FileInfo for the pair to be processed / moved.
and a cancellation token to cleanly cancel monitoring.
In your pairProcessor delegate, catch IO exceptions, and check for a sharing violation (which likely means writing the file has not yet completed).
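For illustration, usage might look like the sketch below. The destination folder, the prefix rule, and calling it from an async method are assumptions based on the question, not part of the class above:

var cts = new CancellationTokenSource();

var watcher = new MyPeriodicWatcher(
    @"C:\Sync\",
    "*.zip",
    zipPath =>
    {
        // Assumed naming rule: "100001_ABCDEF_123456.zip" -> prefix "100001_ABCDEF_"
        var name = Path.GetFileNameWithoutExtension(zipPath);
        return name.Substring(0, name.LastIndexOf('_') + 1);
    },
    (zip, ctl) =>
    {
        // Destination is an assumption; recreate your directory structure here as needed.
        zip.CopyTo(Path.Combine(@"C:\inp\", zip.Name), true);
        ctl.CopyTo(Path.Combine(@"C:\inp\", ctl.Name), true);
    },
    TimeSpan.FromSeconds(30),
    cts.Token);

await watcher.Watch();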
static void Main(string[] args)
{
    var fileSystemWatcher = new FileSystemWatcher(@"filepath")
    {
        Filter = "*.txt",
        NotifyFilter = NotifyFilters.FileName,
        EnableRaisingEvents = true
    };

    fileSystemWatcher.Created += OnActionOccurOnFolderPath;
    Console.ReadLine();
}

public static void OnActionOccurOnFolderPath(object sender, FileSystemEventArgs e)
{
    Console.WriteLine(e.ChangeType);
    Console.WriteLine(e.Name);
    Upload.upload();
}
This uploads any txt file that has been created in a specified path to the SFTP Server.
The server will generate a report about whether the upload and file processing was successful.
This usually takes about 2-3 minutes.
I then check with a timer every 60 seconds whether a new report has been created.
First I get a list of the files in the directory:
RemoteDirectoryInfo directoryInfo = session.ListDirectory(remotePath);
Here I select the latest file:
RemoteFileInfo latest =
    directoryInfo.Files
        .Where(file => !file.IsDirectory)
        .OrderByDescending(file => file.LastWriteTime)
        .FirstOrDefault();
I go on with downloading the file to check it for some parameters.
session.GetFileToDirectory(latest.FullName, localPath);
But whenever I upload multiple files, there will be multiple reports, yet I can only download the latest one.
What I want is to download everything that has been created in the last 60 seconds, while uploads of new data must still be possible.
So I suppose the code above for finding latest needs to be changed in some way.
To download files created in the last minute, use a file mask with the time constraint >=60S:
session.GetFilesToDirectory(remoteDirectory, localDirectory, "*>=60S").Check();
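Combined with the 60-second timer from the question, a polling loop could look roughly like this (it assumes the WinSCP session is already open; remotePath and localPath are placeholders):

// Sketch: every 60 seconds, pull down any report from the last minute.
var timer = new System.Timers.Timer(60000);
timer.Elapsed += (sender, e) =>
{
    // "*>=60S" restricts the transfer to files with a timestamp within the last 60 seconds.
    session.GetFilesToDirectory(remotePath, localPath, "*>=60S").Check();
};
timer.AutoReset = true;
timer.Start();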
I have a FileSystemWatcher for a local directory and it is working fine. I want to implement the same for FTP. Is there any way I can achieve that? I have checked many solutions, but none of them are clear to me.
Logic: I want to get files from FTP that are newer than some timestamp.
Problem faced: getting all the files from FTP and then filtering the result hurts performance (I used FtpWebRequest).
Is there a right way to do this? (WinSCP is on hold; I can't use it right now.)
FileSystemWatcher oFsWatcher = new FileSystemWatcher();
OFSWatchers.Add(oFsWatcher);
oFsWatcher.Path = sFilePath;
oFsWatcher.Filter = string.IsNullOrWhiteSpace(sFileFilter) ? "*.*" : sFileFilter;
oFsWatcher.NotifyFilter = NotifyFilters.FileName;
oFsWatcher.EnableRaisingEvents = true;
oFsWatcher.IncludeSubdirectories = bIncludeSubdirectories;
oFsWatcher.Created += new FileSystemEventHandler(OFsWatcher_Created);
You cannot use the FileSystemWatcher or any other way, because the FTP protocol does not have any API to notify a client about changes in the remote directory.
All you can do is to periodically iterate the remote tree and find changes.
It's actually rather easy to implement, if you use an FTP client library that supports recursive listing of a remote tree. Unfortunately, the built-in .NET FTP client, the FtpWebRequest does not. But for example with WinSCP .NET assembly, you can use the Session.EnumerateRemoteFiles method.
See the article Watching for changes in SFTP/FTP server:
// Setup session options
SessionOptions sessionOptions = new SessionOptions
{
Protocol = Protocol.Ftp,
HostName = "example.com",
UserName = "user",
Password = "password",
};
using (Session session = new Session())
{
// Connect
session.Open(sessionOptions);
List<string> prevFiles = null;
while (true)
{
// Collect file list
List<string> files =
session.EnumerateRemoteFiles(
"/remote/path", "*.*", EnumerationOptions.AllDirectories)
.Select(fileInfo => fileInfo.FullName)
.ToList();
if (prevFiles == null)
{
// In the first round, just print number of files found
Console.WriteLine("Found {0} files", files.Count);
}
else
{
// Then look for differences against the previous list
IEnumerable<string> added = files.Except(prevFiles);
if (added.Any())
{
Console.WriteLine("Added files:");
foreach (string path in added)
{
Console.WriteLine(path);
}
}
IEnumerable<string> removed = prevFiles.Except(files);
if (removed.Any())
{
Console.WriteLine("Removed files:");
foreach (string path in removed)
{
Console.WriteLine(path);
}
}
}
prevFiles = files;
Console.WriteLine("Sleeping 10s...");
Thread.Sleep(10000);
}
}
(I'm the author of WinSCP)
Though, if you actually want to just download the changes, it's way easier. Just use the Session.SynchronizeDirectories in a loop.
while (true)
{
    SynchronizationResult result =
        session.SynchronizeDirectories(
            SynchronizationMode.Local, "/remote/path", @"C:\local\path", true);
    result.Check();

    // You can inspect result.Downloads for a list of updated files

    Console.WriteLine("Sleeping 10s...");
    Thread.Sleep(10000);
}
This will update even modified files, not only new files.
Though using the WinSCP .NET assembly from a web application might be problematic. If you do not want to use a 3rd-party library, you will have to make do with the limitations of FtpWebRequest. For an example of how to recursively list a remote directory tree with FtpWebRequest, see my answer to List names of files in FTP directory and its subdirectories.
You have edited your question to say that you have performance problems with the solutions I've suggested. Though you have already asked a new question that covers this:
Get FTP file details based on datetime in C#
Unless you have access to the OS that hosts the service, it will be a bit harder.
FileSystemWatcher places a hook on the filesystem, which notifies your application as soon as something happens.
The FTP command specification has no such hook, and on top of that the protocol is always initiated by the client.
Therefore, to implement such logic you have to periodically perform an NLST to list the FTP directory contents and track the changes (or hashes, or perhaps modification times via MDTM) yourself.
More info:
FTP return codes
FTP
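As a rough sketch of that polling approach with plain FtpWebRequest (host, path, and credentials are placeholders; the diff against the previous snapshot is only hinted at in the trailing comments):

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;

class FtpPoller
{
    // Lists remote file names via NLST.
    public static List<string> ListRemoteFiles(string url, NetworkCredential credentials)
    {
        var request = (FtpWebRequest)WebRequest.Create(url);   // e.g. "ftp://example.com/inbox/"
        request.Method = WebRequestMethods.Ftp.ListDirectory;   // issues NLST
        request.Credentials = credentials;

        using (var response = (FtpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            var names = new List<string>();
            string line;
            while ((line = reader.ReadLine()) != null)
                names.Add(line);
            return names;
        }
    }
}

// In a timer or loop:
//   var current = FtpPoller.ListRemoteFiles(url, credentials);
//   var added = current.Except(previous);   // names not seen in the previous poll
//   previous = current;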
I have found an alternative solution that does what I need.
Explanation:
I am downloading the files from FTP (read permission required) with the same folder structure.
So every time the job/service runs, I can check whether the same file (full path) already exists in the local physical path. If it does not exist, it can be considered a new file, I can take whatever action is needed for it, and download it as well.
It's just an alternative solution.
Code Changes:
private static void GetFiles()
{
    using (FtpClient conn = new FtpClient())
    {
        string ftpPath = "ftp://myftp/";
        string downloadFileName = @"C:\temp\FTPTest\";
        downloadFileName += "\\";

        conn.Host = ftpPath;
        //conn.Credentials = new NetworkCredential("ftptest", "ftptest");
        conn.Connect();

        //Get all directories
        foreach (FtpListItem item in conn.GetListing(conn.GetWorkingDirectory(),
            FtpListOption.Modify | FtpListOption.Recursive))
        {
            // if this is a file
            if (item.Type == FtpFileSystemObjectType.File)
            {
                string localFilePath = downloadFileName + item.FullName;

                //Only newly created files will be downloaded.
                if (!File.Exists(localFilePath))
                {
                    conn.DownloadFile(localFilePath, item.FullName);
                    //Do any action here.
                    Console.WriteLine(item.FullName);
                }
            }
        }
    }
}
I can't find any documentation that outlines the correct way to use OneDrive to store and keep app files synchronised across devices in C#.
I have read the documentation at the OneDrive Dev Center, but I don't understand the HTTP code (I am self-taught in C# only).
I kind of understand that I should use the delta method to get changed files from OneDrive and then save them locally, but I can't figure out exactly how, so I have gotten around it by comparing local vs. OneDrive manually using the GetAsync<> methods.
My current implementation (below for reference) seems rather clunky compared to what is probably handled better in the API.
In addition, it doesn't appear that there is a reverse 'delta' function, i.e. one where I write a file locally in the app and then tell OneDrive to sync the change. Is that because I need to actually upload it using the PutAsync<> method? (That is what I am currently doing.)
public async Task<T> ReadFromXML<T>(string gamename, string filename)
{
    string filepath = _appFolder + @"\" + gamename + @"\" + filename + ".xml";
    T objectFromXML = default(T);
    var serializer = new XmlSerializer(typeof(T));
    Item oneDItem = null;
    int casenum = 0;

    //_userDrive is the IOneDriveClient
    if (_userDrive != null && _userDrive.IsAuthenticated)
    {
        try
        {
            oneDItem = await _userDrive.Drive.Special.AppRoot.ItemWithPath(filepath).Request().GetAsync();
            if (oneDItem != null) casenum += 1;
        }
        catch (OneDriveException)
        { }
    }

    StorageFile localfile = null;
    try
    {
        localfile = await ApplicationData.Current.LocalFolder.GetFileAsync(filepath);
        if (localfile != null) casenum += 2;
    }
    catch (FileNotFoundException)
    { }

    switch (casenum)
    {
        case 0:
            //Neither exists. Throws an exception to be caught by the calling method, which should then instantiate a new object of type <T>
            throw new FileNotFoundException();
        case 1:
            //OneDrive only - copies the stream to a new local file then returns the object
            StorageFile writefile = await ApplicationData.Current.LocalFolder.CreateFileAsync(filepath, CreationCollisionOption.ReplaceExisting);
            using (var newlocalstream = await writefile.OpenStreamForWriteAsync())
            {
                using (var oneDStream = await _userDrive.Drive.Special.AppRoot.ItemWithPath(filepath).Content.Request().GetAsync())
                {
                    oneDStream.CopyTo(newlocalstream);
                }
            }
            using (var newreadstream = await writefile.OpenStreamForReadAsync())
            { objectFromXML = (T)serializer.Deserialize(newreadstream); }
            break;
        case 2:
            //Local only - returns the object
            using (var existinglocalstream = await localfile.OpenStreamForReadAsync())
            { objectFromXML = (T)serializer.Deserialize(existinglocalstream); }
            break;
        case 3:
            //Both - compares last modified. If OneDrive is newer, replaces local data then returns the object
            var localinfo = await localfile.GetBasicPropertiesAsync();
            var localtime = localinfo.DateModified;
            var oneDtime = (DateTimeOffset)oneDItem.FileSystemInfo.LastModifiedDateTime;
            switch (oneDtime > localtime)
            {
                case true:
                    using (var newlocalstream = await localfile.OpenStreamForWriteAsync())
                    {
                        using (var oneDStream = await _userDrive.Drive.Special.AppRoot.ItemWithPath(filepath).Content.Request().GetAsync())
                        { oneDStream.CopyTo(newlocalstream); }
                    }
                    using (var newreadstream = await localfile.OpenStreamForReadAsync())
                    { objectFromXML = (T)serializer.Deserialize(newreadstream); }
                    break;
                case false:
                    using (var existinglocalstream = await localfile.OpenStreamForReadAsync())
                    { objectFromXML = (T)serializer.Deserialize(existinglocalstream); }
                    break;
            }
            break;
    }
    return objectFromXML;
}
Synchronization requires a few different steps, some of which the OneDrive API will help you with, some of which you'll have to do yourself.
Change Detection
The first stage is obviously to detect whether anything has changed. The OneDrive API provides two mechanisms to detect changes in the service:
Changes for individual files can be detected using a standard request with an If-None-Match:
await this.userDrive.Drive.Special.AppRoot.ItemWithPath(remotePath).Content.Request(new Option[] { new HeaderOption("If-None-Match", "etag") }).GetAsync();
If the file doesn't yet exist at all you'll get back a 404 Not Found.
Else if the file has not changed you'll get back a 304 Not Modified.
Else you'll get the current state of the file.
Changes for a hierarchy can be detected using the delta API:
await this.userDrive.Drive.Special.AppRoot.Delta(previousDeltaToken).Request().GetAsync();
This will return the current state for all items that changed since the last invocation of delta. If this is the first invocation, previousDeltaToken will be null and the API will return the current state for ALL items within the AppRoot. For each file in the response you'll need to make another round-trip to the service to get the content.
On the local side you'll need to enumerate all of the files of interest and compare the timestamps to determine if something has changed.
Obviously the previous steps require knowledge of the "last seen" state, and so your application will need to keep track of this in some form of database/data structure. I'd suggest tracking the following:
+------------------+---------------------------------------------------------------------------+
| Property | Why? |
+------------------+---------------------------------------------------------------------------+
| Local Path | You'll need this so that you can map a local file to its service identity |
| Remote Path | You'll need this if you plan to address the remote file by path |
| Remote Id | You'll need this if you plan to address the remote file by unique id |
| Hash | The hash representing the current state of the file |
| Local Timestamp | Needed to detect local changes |
| Remote Timestamp | Needed for conflict resolution |
| Remote ETag | Needed to detect remote changes |
+------------------+---------------------------------------------------------------------------+
Additionally, if using the delta approach you'll need to store the token value from the delta response. This is item independent, so would need to be stored in some global field.
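Purely as an illustration, the tracked state could be modeled along these lines (the type and property names are mine, not part of the OneDrive SDK):

using System;
using System.Collections.Generic;

// Hypothetical per-file sync record mirroring the table above.
public class SyncedItemState
{
    public string LocalPath { get; set; }               // maps a local file to its service identity
    public string RemotePath { get; set; }              // for addressing the remote file by path
    public string RemoteId { get; set; }                // for addressing the remote file by unique id
    public string Hash { get; set; }                    // hash of the last synced content
    public DateTimeOffset LocalTimestamp { get; set; }  // to detect local changes
    public DateTimeOffset RemoteTimestamp { get; set; } // for conflict resolution
    public string RemoteETag { get; set; }              // to detect remote changes
}

// Stored once globally, not per item: the token from the last delta response.
public class SyncState
{
    public string DeltaToken { get; set; }
    public List<SyncedItemState> Items { get; set; } = new List<SyncedItemState>();
}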
Conflict Resolution
If changes were detected on both sides your app will need to go through a conflict resolution process. An app that lacks an understanding of the files being synced would need to either prompt the user for manual conflict resolution, or do something like fork the file so there are now two copies. However, apps that are dealing with custom file formats should have enough knowledge to effectively merge the files without any form of user interaction. What that entails is obviously completely dependent on the file being synced.
Apply Changes
The final step is to push the merged state to wherever is required (e.g. if the change was local then push remote, if the change was remote then push local, otherwise if the change was in both places push both places). It's important to make sure this step occurs in such a way as to avoid replacing content that was written after the "Change Detection" step has taken place. Locally you'd probably accomplish this by locking the file during the process, however you cannot do that with the remote file. Instead you'll want to use the etag value to make sure the service only accepts the request if the state is still what you expect:
await this.userDrive.Drive.Special.AppRoot.ItemWithPath(remotePath).Content.Request(new Option[] { new HeaderOption("If-Match", "etag") }).PutAsync(newContentStream);
I have a console app which checks a logfile for newly updated data and, if there is any, copies it into a database.
Currently I'm using FileSystemWatcher to track any changes in the log.
What I would like to do is track whether a new logfile is created and, if so, invoke myMethod() on the new logfile.
Since I'm new to C#, I have a question: should I create a second FileSystemWatcher, or is there a way to update the current one so that it also covers this case? I already checked Stack Overflow, and although I found many posts on FileSystemWatcher, I'm not really sure how to proceed.
Here's my code:
foreach (string path in paths)
{
    if (path.Length > 0)
    {
        logFile.read(path, errorListAndLevel);

        FileSystemWatcher logWatcher = new FileSystemWatcher();
        logWatcher.Path = Path.GetDirectoryName(path);
        logWatcher.Filter = Path.GetFileName(path);
        logWatcher.IncludeSubdirectories = false;
        logWatcher.NotifyFilter = NotifyFilters.CreationTime | NotifyFilters.DirectoryName | NotifyFilters.FileName | NotifyFilters.LastAccess | NotifyFilters.LastWrite | NotifyFilters.Size;
        logWatcher.Changed += new FileSystemEventHandler(delegate (object sender, FileSystemEventArgs e)
        {
            logFile.myMethod(errorListAndLevel, e.FullPath.ToString());
        });
        logWatcher.EnableRaisingEvents = true;
    }
}
I thought of adding a logWatcher.Created handler, but it isn't working.
logWatcher.Created += new FileSystemEventHandler(delegate (object sender, FileSystemEventArgs e)
{
    logFile.myMethod(errorListAndLevel, e.FullPath.ToString());
});
The problem is the following line:
logWatcher.Filter = Path.GetFileName(path);
It filters out all other files and makes the watcher watch only that one file. This means the Created event will never fire unless a second file with the exact same name is created.
When you comment out that line it should work fine.
Additionally, you could do this instead:
logWatcher.Filter = "*.log";
This will match all files with the extension .log, which is what you want from what I can see.
I am using the FileSystemWatcher to get notified when new files are created in a network directory. We process the text files (about 5 KB in size) and delete them immediately once a new file is created in the directory. If the FileSystemWatcher Windows service stops for some reason, we have to look for the unprocessed files after it is back up and running. How can I handle a new file arriving while I am still processing the old files from the directory? Any examples please?
Thank you,
Here is the code example I have, with a simple form.
public partial class Form1 : Form
{
    private System.IO.FileSystemWatcher watcher;
    string tempDirectory = @"C:\test\";

    public Form1()
    {
        InitializeComponent();
        CreateWatcher();
        GetUnprocessedFiles();
    }

    private void CreateWatcher()
    {
        //Create a new FileSystemWatcher.
        watcher = new FileSystemWatcher();
        watcher.Filter = "*.txt";
        watcher.NotifyFilter = NotifyFilters.FileName;

        //Subscribe to the Created event.
        watcher.Created += new FileSystemEventHandler(watcher_FileCreated);
        watcher.Path = @"C:\test\";
        watcher.EnableRaisingEvents = true;
    }

    void watcher_FileCreated(object sender, FileSystemEventArgs e)
    {
        //Parse text file.
        FileInfo objFileInfo = new FileInfo(e.FullPath);
        if (!objFileInfo.Exists) return;
        ParseMessage(e.FullPath);
    }

    void ParseMessage(string filePath)
    {
        // Parse text file here
    }

    void GetUnprocessedFiles()
    {
        // Put all txt files into array.
        // Directory.GetFiles returns full paths, so they can be passed straight to ParseMessage.
        string[] array1 = Directory.GetFiles(@"C:\test\");
        foreach (string path in array1)
        {
            ParseMessage(path);
        }
    }
}
When the process starts do the following:
first get the contents of the folder
process every file (and delete them as you already do now)
repeat until no files are in the folder (check again here, since a new file could have been placed in the folder).
start the watcher
For any of our services that use the FileSystemWatcher we always process all the files that exist in the directory first, prior to starting the watcher. After the watcher has been started we then start a timer (with a fairly long interval) to handle any files that appear in the directory without triggering the watcher (it does happen from time to time). That usually covers all the possibilities.
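A bare-bones sketch of that startup sequence plus the safety-net timer (the directory, the *.txt filter, and the Process method are placeholders; it ignores the possibility of the timer and the watcher touching the same file at once):

// Sketch: drain existing files, then watch, then sweep periodically for anything missed.
var directory = @"C:\test\";

// 1. Process whatever is already there; repeat until the folder is empty.
//    Process is assumed to parse and then delete the file, so this loop terminates.
string[] pending;
while ((pending = Directory.GetFiles(directory, "*.txt")).Length > 0)
{
    foreach (var file in pending)
    {
        Process(file);
    }
}

// 2. Start the watcher for new arrivals.
var watcher = new FileSystemWatcher(directory, "*.txt");
watcher.Created += (s, e) => Process(e.FullPath);
watcher.EnableRaisingEvents = true;

// 3. Safety net: a long-interval timer that picks up files the watcher missed.
var sweep = new System.Timers.Timer(TimeSpan.FromMinutes(5).TotalMilliseconds);
sweep.Elapsed += (s, e) =>
{
    foreach (var file in Directory.GetFiles(directory, "*.txt"))
    {
        Process(file);
    }
};
sweep.Start();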