Ignore files in flight - C#

When I run the code below, it fills my array with a list of files in the specified directory.
This is good.
However, it also grabs files that are 'in flight' - meaning files that are currently being copied to that directory.
This is bad.
How do I go about ignoring those 'in-flight' files? Is there a way to check each file to make sure it's 'fully there' before I process it?
string[] files = Directory.GetFiles(ConfigurationSettings.AppSettings.Get("sourcePath"));
if (files.Length > 0)
{
    foreach (string filename in files)
    {
        string filenameonly = Path.GetFileName(filename);
        AMPFileEntity afe = new AMPFileEntity(filenameonly);
        afe.processFile();
    }
}

Unfortunately, there is no way to achieve what you are looking for. Robert's suggestion of opening the file for writing solves a subset of the problem, but it does not solve the bigger issue, which is this:
The file system is best viewed as a multi-threaded object over which you have no synchronization capabilities.
No matter what synchronization construct you try to use to put the file system into a "known state", there is a way for the user to beat it.
The best way to approach this problem is to process the files as normal and catch the exceptions that result from using files that are "in flight". This is the only sane way to deal with the file system.
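A minimal sketch of that "process and catch" approach might look like the following. The class name, the method name and the idea of returning the skipped files for a later retry are assumptions made for illustration; the only point is that an IOException (such as a sharing violation) is treated as "not ready yet" rather than as a fatal error.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class InFlightTolerantProcessor
{
    // Runs 'process' on every file in 'directory'. A file that throws an
    // IOException (for example a sharing violation while it is still being
    // copied) is skipped and returned so the caller can retry it later.
    public static List<string> ProcessAll(string directory, Action<string> process)
    {
        var deferred = new List<string>();
        foreach (string path in Directory.GetFiles(directory))
        {
            try
            {
                process(path);
            }
            catch (IOException)
            {
                // Assumption: any I/O failure means "try again on the next pass".
                deferred.Add(path);
            }
        }
        return deferred;
    }
}
```

A caller would pass its own processing routine as the delegate and simply call ProcessAll again on the next polling interval.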

Yes, you can try to open the file for writing. If you are able to open for writing without an exception then it's likely not "in-flight" anymore. Unfortunately, I have encountered this problem several times before and have not come across a better solution.
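As a rough illustration of that check, here is a small helper. The names FileReadiness and CanOpenExclusively are made up for this sketch. It only reports whether the file could be opened at that moment, so it is a hint rather than a guarantee: another process can still open the file right after the check returns.

```csharp
using System.IO;

public static class FileReadiness
{
    // Returns true if the file could be opened for read/write access with no
    // sharing, i.e. no other handle appeared to be holding it at that moment.
    public static bool CanOpenExclusively(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // Covers sharing violations and also a file that no longer exists.
            return false;
        }
    }
}
```

Note that a read-only file raises UnauthorizedAccessException instead, which this sketch deliberately does not catch.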

Get a list of files in the directory.
Get a list of open handles (see below).
Remove the latter from the former.
You can get a list of open handles by p/invoking NtQuerySystemInformation. There's a project on CodeProject that shows how to do this. Alternatively, you can call Handle.exe, from Sysinternals, and parse its output.

Try to rename/move the file. If you can rename it, it's no longer in use.

Carra's answer gave me an idea.
If you have access to the program that copies the files to this directory, modify it so that it:
Writes files to a temporary directory on the same disk.
Moves the files to the appropriate folder after they're finished writing to disk.
On the same filesystem, a move operation just updates the directory entries rather than changing the file's physical location on disk, which means that it's extremely fast.
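On the writing side, that could be sketched as below. StagedWriter, its parameters and the use of File.WriteAllText are placeholders for whatever the real copying program does; the sketch assumes both directories live on the same volume so that File.Move is a rename and not a copy.

```csharp
using System.IO;

public static class StagedWriter
{
    // Writes the file into a staging directory first and moves it into the
    // watched directory only after the write has completed and the file is closed.
    public static string WriteThenMove(string stagingDir, string targetDir, string fileName, string contents)
    {
        Directory.CreateDirectory(stagingDir);
        Directory.CreateDirectory(targetDir);

        string stagingPath = Path.Combine(stagingDir, fileName);
        string finalPath = Path.Combine(targetDir, fileName);

        File.WriteAllText(stagingPath, contents);   // file is closed when this returns
        File.Move(stagingPath, finalPath);          // throws IOException if finalPath already exists
        return finalPath;
    }
}
```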

Related

FileSystemWatcher Filter - Detect Zipped Files?

I'm using a FileSystemWatcher to detect that a text file is created in directory A and subsequently created in directory B.
The issue I'm having is, the process which moves the file from directory A to directory B also zips the file up, changing the filename from say "999_XXX_001.txt" to "999_XXX_001.txt.zip"
Three problems with this:
1) I can no longer open and read the file to analyse the contents
2) The filename has changed
3) The FileSystemWatcher appears to only support a single extension
Solution
Using two watchers, one for ".zip" and one for ".txt", I'm removing the .zip and comparing filenames, because moved files no longer exist to be compared byte-for-byte. I guess the real question here was how I can use the watcher to detect ".txt.zip" as an extension!

1) Why? You would have to wait until the process has finished its zipping magic, and afterwards you can open the zip file with your framework of choice.
2) Why is it a problem in itself that the filename has changed?
3) No, the file watcher will detect any change to all files within the given directory.
But maybe it is better to describe what you actually try to achieve here. There is probably a better solution to what you actually need.
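For what it's worth, a single watcher with no filter could map the zipped name back to the original one, roughly as follows. ZipAwareWatcher, OriginalName and the callback shape are assumptions for illustration, not something taken from the question.

```csharp
using System;
using System.IO;

public static class ZipAwareWatcher
{
    // Strips a trailing ".zip" so that "name.txt.zip" compares equal to "name.txt".
    // Names without a ".zip" suffix are returned unchanged.
    public static string OriginalName(string fileName)
    {
        return fileName.EndsWith(".zip", StringComparison.OrdinalIgnoreCase)
            ? Path.GetFileNameWithoutExtension(fileName)
            : fileName;
    }

    // Watches every file in the directory (the default Filter matches all files)
    // and reports each created file under its un-zipped name.
    public static FileSystemWatcher Watch(string directory, Action<string> onCreated)
    {
        var watcher = new FileSystemWatcher(directory);
        watcher.Created += (sender, e) => onCreated(OriginalName(e.Name));
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The caller is responsible for disposing the returned watcher when it is no longer needed.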

How to merge 2 zip files together into 1 zip

I am trying to make a custom launcher for Minecraft in C# but I have come across a bump.
I want to add something into it, Minecraft Forge, but the only way I could think of is to change the extension of minecraft.jar to minecraft.zip, extract the contents of the Minecraft Forge.zip and the minecraft.zip into the same folder and then zip that entire folder up into minecraft.jar.
However minecraft.jar has a file named aux.class so whenever my extract script (Made in java) tries to extract it, it simply says:
Unable to find file G:\Programming\C#\Console\Forge Installer\Forge Installer\bin\Debug\Merge\aux.class.
The only other way I can think of is to merge minecraft_forge.zip into minecraft.zip, I have spent around 2 hours looking on Google (watch as someone sees it within a couple of minutes) but it always shows me results for "How to zip multiple files", "How to make a zip file in C#" etc.
So I have come here looking for my answer, sorry if this is a lot to read but I always see comments on here saying "You didn't give enough information for us to help you with".
EDIT: The question in case it wasn't clear is: How am I able to put the contents of minecraft_forge.zip into minecraft.zip?

In your case, if you cannot unzip the files due to OS limitations, you need to "skip" unzipping temporary files to zip them. Instead, only handle input & output streams, as suggested in the answers found here: How can I add entries to an existing zip file in Java?
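Since the launcher itself is written in C#, the same stream-only idea can be sketched with System.IO.Compression, which never extracts aux.class to disk. ZipMerger and MergeInto are names invented for this sketch; it assumes that entries from the source archive should overwrite same-named entries in the target, and that the project references the System.IO.Compression and System.IO.Compression.FileSystem assemblies.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipMerger
{
    // Copies every entry of sourceZip into targetZip, stream to stream.
    // An entry that already exists in the target is replaced.
    public static void MergeInto(string sourceZip, string targetZip)
    {
        using (ZipArchive source = ZipFile.OpenRead(sourceZip))
        using (ZipArchive target = ZipFile.Open(targetZip, ZipArchiveMode.Update))
        {
            foreach (ZipArchiveEntry sourceEntry in source.Entries)
            {
                ZipArchiveEntry existing = target.GetEntry(sourceEntry.FullName);
                if (existing != null)
                {
                    existing.Delete();
                }

                ZipArchiveEntry copy = target.CreateEntry(sourceEntry.FullName);
                using (Stream input = sourceEntry.Open())
                using (Stream output = copy.Open())
                {
                    input.CopyTo(output);
                }
            }
        }
    }
}
```

A call such as ZipMerger.MergeInto("minecraft_forge.zip", "minecraft.jar") would then add the Forge entries to the jar in place; whether the resulting jar actually runs is a separate question this sketch does not address.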

As you pointed out, "aux" is a reserved name in Windows, and it does not matter what the file suffix may be; Windows won't let you use it. Here are a couple of threads that discuss this in general.
Ref 1: Windows reserved words.
Ref 2: Windows reserved words.
If you are typing in commands to perform the copy or unzip, there is a chance you can get this to work by using a path prefix of the following \\.\ or \\?\. When I tested this, it worked with either a single or double back-slash following the period or question mark. Such that the following work:
\\.\c:\paths\etc
\\.\\c:\paths\etc
\\?\c:\path\etc
\\?\\c:\path\etc
I used the following command to test this. When trying to rename through windows explorer it gave a "The specified device name is invalid." error message. From the command line it worked just fine. I should point out, that once you create these files, you will have to manually delete them using the same technique. Windows Explorer reports that these text files which have a size of 0 bytes "is too large for the destination file system", ie... the recycle bin.
rename "\\.\c:\temp\New Text Document.txt" aux.txt
del "\\.\c:\temp\aux.txt"
As far as copying directly from zip or jar files, I tried this myself and it appeared to work. I used 7-zip and opened the jars directly using the "open archive..." windows explorer context menu. I then dragged-and-dropped the contents from forge.jar to the minecraft jar file. Since it is the minecraft jar file with the offending file name the chance of needing to create a temporary file on the filesystem is reduced. I did see someone mention that 7-zip may extract to a temporary file when copying between jars and zips.
7-zip reference on copying between archives
I should point out that my copy of the minecraft jar (minecraft_server.1.8.7.jar) did not contain a file named aux.class. I also did not try to use the jar after the copy/merge. Nor did I spend much time working out how well it merged the two sets of contents, since it appears that there may be a conflict in com\google\common\base\, where there are similarly named classes with different $ suffixes.
I hope these two possible suggestions could give you some room to work with to find a solution for your needs... if you're still looking.

Best practise for using Directory.GetFiles() or EnumerateFiles with a target directory that contains locked files?

I am currently trying to improve the design of two Windows services (C#).
Service A produces data exports (csv files) and writes them to a temporary directory.
So the file is written to a temporary directory that is a sub dir. of the main output directory.
Then the file is moved (via File.Move) to the output directory (after a successful write).
This export may be performed by multiple threads.
Another service B tries to fetch the files from this output directory in a defined interval.
How can I ensure that Directory.GetFiles() excludes locked files?
Should I check every file by creating a new FileStream (using (Stream stream = new FileStream("MyFilename.txt", FileMode.Open))), as described here?
Or should the producer service (A) use temporary file names (*.csv.tmp) that the consumer service (B) automatically excludes through an appropriate search pattern, and rename each file once the move has finished?
Are there better ways to handle such file-listing operations?

Don't bother checking!
Huh? How can that be?
If the files are on the same drive, a Move operation is atomic! The operation is effectively a rename, erasing the directory entry from the previous, and inserting it into the next directory, pointing to the same sectors (or whatevers) where the data really are, without rewriting it. The file system's internal locking mechanism has to lock & block directory reads during this process to prevent a directory scan from returning corrupt results.
That means, by the time it ever shows up in a directory, it won't be locked; in fact, the file won't have been opened/modified since the close operation that wrote it to the previous directory.
caveats - (1) definitely won't work between drives, partitions, or other media mounted as a subdirectory. The OS does a copy+delete behind the scenes instead of a directory entry edit. (2) this behaviour is a convention, not a rule. Though I've never seen it, file systems are free to break it, and even to break it inconsistently!
So this will probably work. If it doesn't, I'd recommend using your own idea of temp extensions (I've done it before for this exact purpose, but between a client and server that only could talk by communicating via a shared drive) and it's not that hard and worked flawlessly.
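The temp-extension idea could be sketched like this. TempExtensionExchange, Publish and ListFinished are hypothetical names, and File.WriteAllText stands in for the real export code; the ".csv.tmp" suffix is the one proposed in the question.

```csharp
using System;
using System.IO;
using System.Linq;

public static class TempExtensionExchange
{
    // Producer side: write under "<name>.tmp", then rename to the final name
    // once the file is complete and closed.
    public static string Publish(string outputDir, string fileName, string contents)
    {
        string finalPath = Path.Combine(outputDir, fileName);
        string tempPath = finalPath + ".tmp";
        File.WriteAllText(tempPath, contents);
        File.Move(tempPath, finalPath);
        return finalPath;
    }

    // Consumer side: list only finished exports. The explicit EndsWith check
    // guards against search-pattern quirks that could let other extensions through.
    public static string[] ListFinished(string outputDir)
    {
        return Directory.GetFiles(outputDir, "*.csv")
            .Where(p => p.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
            .ToArray();
    }
}
```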
If your own idea is too low-tech, and you're on the same machine (sounds like you are), you can set a mutex (google that), with the filename embedded, that lives while the file is being written, in the writer process; then do a blocking test on it when you open each file you are reading from the other process. If you want the second process to respond ASAP combine this with the filesystem watcher. Then pat yourself on the back for spending ten times the effort as the temp filename idea, with no extra gain >:-}
good luck!

One way would be to mark the files as temporary from the writing app whilst they're in use, and only unmark them once they are written to and closed, eg.
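FileStream f = File.Create(filename);
FileAttributes attr = File.GetAttributes(filename);
// Flag the file as temporary while it is being written.
File.SetAttributes(filename, attr | FileAttributes.Temporary);
// write to file.
f.Close();
// Restore the original attributes once the file is complete.
File.SetAttributes(filename, attr);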
From the consuming app, you just want to skip any temporary files.
foreach (var file in Directory.GetFiles(Path.GetDirectoryName(filename)))
{
    if ((File.GetAttributes(file) & FileAttributes.Temporary) != 0) continue;
    // do normal stuff.
}

Check sftp file transfer completed?

We have a Windows server here, and one day we will receive some text files via SFTP in a folder. I don't have more information, but maybe this is enough. Now I have to write a function that moves these files into another folder. That shouldn't be that hard, I thought... but then I realized that I'm able to move a file before it's finished. So I searched for solutions, and now I'm really confused.
My solution would be to check the file and the processes around it: if the file is not finished yet, there is a copy process, and I can check for that process. To keep this easy, I just try to lock the file, and if no other process holds it, then the file is ready to move?
using (File.Open("myFile", FileMode.Open, FileAccess.Read, FileShare.None))
{ /*rdy!*/ }
But now I see people writing about checksum tests, or about testing the file size and treating the file as ready once the size stops changing. Isn't that a little complicated? Please tell me that my solution could work as well... I'm not able to test it with any server-to-server SFTP setup. I just know that it works if I copy a file to another folder (via Explorer). Does it work for an SFTP transfer as well? Any ideas? Thank you

File-size checks are dangerous - what if the upload is suspended and later resumed? How much time should go by until you accept the current file size as the final file size? => Not a good solution.
I'd go for the locking, however, this only works if the process that writes the file also opens the file in a way so that it is locked exclusively. If the process doesn't do that, you'll be stuck with your problem again.
Another solution would be to upload the files with temporary names, like ".sftptmp". And to have the uploader rename it after it is done. That way you can be sure the file has been uploaded - just ignore all files that end with ".sftptmp". This, however, assumes that you actually have control over the process of uploading files.

Another option is to have the sender put a control file after the data file. For example, put uploadfile-20220714.txt, then put uploadfile-20220714.ctl. The control file can contain file information such as the name and size of the data file. This option requires the sender to modify their process, but it shouldn't require too much effort.
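On the receiving side, a control-file check might look like the sketch below. Everything about the format is an assumption made for illustration: the control file is taken to share the data file's base name, to use a .ctl extension next to a .txt data file, and to contain nothing but the expected size in bytes.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ControlFileCheck
{
    // Returns the data files that have a matching control file whose recorded
    // size (assumed format: a single integer, in bytes) equals the actual size.
    public static List<string> CompletedUploads(string directory)
    {
        var ready = new List<string>();
        foreach (string controlPath in Directory.GetFiles(directory, "*.ctl"))
        {
            string dataPath = Path.ChangeExtension(controlPath, ".txt");
            long expectedSize;
            if (File.Exists(dataPath)
                && long.TryParse(File.ReadAllText(controlPath).Trim(), out expectedSize)
                && new FileInfo(dataPath).Length == expectedSize)
            {
                ready.Add(dataPath);
            }
        }
        return ready;
    }
}
```

The control file itself could in principle still be in flight when it is read, so a real implementation would likely wrap the read in the same kind of IOException handling used for the data files.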

C#: new file() - where is the root save location?

If I make the call File myFile = new File('myfile.txt'); where does this get saved to?

It's relative to the process's current directory. What that is will depend on how your application has been started, and what else you've done - for example, some things like the choose file dialog can change the current working directory.
EDIT: If you're after a temporary file, Path.GetTempFileName() is probably what you're after. You can get the temp folder with Path.GetTempPath().

That won't compile.
EDIT: However, if you're asking where a text file ends up when you create it using StreamWriter or File.Create("text.txt"), etc., then I defer to the other answers above: it will be where the application is running from. Be aware, as others mentioned, that if you're working out of debug it will be in the debug folder, etc.

NORMALLY it gets saved to the same directory the executable is running in. I've seen exceptions. (I can't remember the specifics, but it threw me for a loop. I seem to recall it being when it's run as a scheduled task, or if it's run from a shortcut.)
The only reliable way to know where it is going to save if you are using the code snippet you provided is to check the value of System.Environment.CurrentDirectory. Better yet, explicitly state the path where you want it to save.
Edit - added
I know this is a bit late in modifying this question, but I found a better answer to the problem of ensuring that you always save the file to the correct location, relative to the executable. The accepted answer there is worth up-votes, and is probably relevant to your question.
See here: Should I use AppDomain.CurrentDomain.BaseDirectory or System.Environment.CurrentDirectory?
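To make the difference concrete, here is a small sketch that resolves the same relative name both ways. SaveLocations and its two methods are illustrative names only; "myfile.txt" is the file name from the question.

```csharp
using System;
using System.IO;

public static class SaveLocations
{
    // Where a relative name lands by default: resolved against the process's
    // current working directory, which can change at run time.
    public static string RelativeToCurrentDirectory(string fileName)
    {
        return Path.GetFullPath(fileName);
    }

    // A location that stays fixed for the lifetime of the application:
    // the application's base directory.
    public static string RelativeToExecutable(string fileName)
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, fileName);
    }
}
```

Passing the result of RelativeToExecutable("myfile.txt") to File.Create or StreamWriter makes the save location explicit instead of depending on how the process was started.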
