Difference between doing file copy/delete and move - C#

What is the difference between
Copying a file and deleting it using File.Copy() and File.Delete()
Moving the file using File.Move()
In terms of the permissions required to do these operations, is there any difference? Any help much appreciated.

The File.Move method can be used to move a file from one path to another. This method works across disk volumes, and it does not throw an exception if the source and destination are the same.
You cannot use the Move method to overwrite an existing file. If you attempt to replace a file by moving a file of the same name into that directory, you get an IOException. To overcome this you can use a combination of the Copy and Delete methods.
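A minimal sketch of the Copy + Delete combination as a helper (note that this is not atomic: if the delete fails, the file briefly exists in both places; newer .NET versions also offer a File.Move overload with an overwrite flag):

```csharp
using System;
using System.IO;

class MoveOverwriteDemo
{
    // Emulates an overwriting move: Copy with overwrite, then Delete the source.
    // Not atomic -- if Delete fails, the file exists in both locations.
    static void MoveOverwrite(string source, string destination)
    {
        File.Copy(source, destination, overwrite: true);
        File.Delete(source);
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string src = Path.Combine(dir, "a.txt");
        string dst = Path.Combine(dir, "b.txt");
        File.WriteAllText(src, "new");
        File.WriteAllText(dst, "old");   // destination already exists

        MoveOverwrite(src, dst);         // plain File.Move would throw IOException here

        Console.WriteLine(File.Exists(src));       // source is gone
        Console.WriteLine(File.ReadAllText(dst));  // destination has the new content
        Directory.Delete(dir, recursive: true);
    }
}
```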

Performance-wise, if the source and destination are on the same file system, moving a file is (in simplified terms) just adjusting some internal registers of the file system itself (possibly adjusting some nodes in a red/black tree), without actually moving any data.
Imagine you have 180MiB to move, and you can write onto your disk at roughly 30MiB/s. Then with copy/delete, it takes approximately 6 seconds to finish. With a simple move [same file system], it goes so fast you might not even realise it.
(I once wrote some transactional file-system helpers that would move or copy multiple files, all or none. To make the commit as fast as possible, I moved/copied everything into a temporary sub-folder first; the final commit then moved the existing data into another folder (to enable rollback) and moved the new data into the target.)

I don't think there is any difference permission-wise, but I would personally prefer File.Move(), since then both actions happen in the same "transaction": if any part of the move fails, the whole operation fails. If you break it up into two steps (copy + delete) and the copy succeeds but the delete fails, you have to reverse the "transaction" (delete the copy) manually.
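The manual rollback described above can be sketched as follows (a hypothetical helper, not a library API): if the delete step fails, the freshly made copy is removed before re-throwing, so the operation fails as a unit.

```csharp
using System;
using System.IO;

class TwoStepMove
{
    // Copy + delete with manual rollback: if the delete fails, remove the copy
    // so the overall operation fails as a unit, roughly like File.Move would.
    static void CopyThenDelete(string source, string destination)
    {
        File.Copy(source, destination);
        try
        {
            File.Delete(source);
        }
        catch
        {
            File.Delete(destination);   // undo the copy before re-throwing
            throw;
        }
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string src = Path.Combine(dir, "src.txt");
        string dst = Path.Combine(dir, "dst.txt");
        File.WriteAllText(src, "payload");

        CopyThenDelete(src, dst);
        Console.WriteLine(!File.Exists(src) && File.Exists(dst)); // move succeeded
        Directory.Delete(dir, recursive: true);
    }
}
```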

Permissions in a file transfer are checked at two points: the source and the destination. So if you don't have read permission on the source folder, or you don't have write permission on the destination, both methods throw an UnauthorizedAccessException. In other words, the permission checking is agnostic to the method in use.

Related

Does File.Move() delete the original on an IOException?

I have a webservice that is writing files that are being read by a different program.
To keep the reader program from reading them before they're done writing, I'm writing them with a .tmp extension, then using File.Move to rename them to a .xml extension.
My problem is when we are running at volume - thousands of files in just a couple of minutes.
I've successfully written file "12345.tmp", but when I try to rename it, File.Move() throws an IOException:
File.Move("12345.tmp", "12345.xml")
Exception: The process cannot access the file because it is being used by another process.
For my situation, I don't really care what the filenames are, so I retry:
File.Move("12345.tmp", "12346.xml")
Exception: Could not find file '12345.tmp'.
Is File.Move() deleting the source file, if it encounters an error in renaming the file?
Why?
Is there someway to ensure that the file either renames successfully or is left unchanged?
The answer is that it depends largely on how the file system itself is implemented. Also, if the Move() is between two file systems (possibly even between two machines, if the paths are network shares), then it also depends on the O/S implementation of Move(). Therefore, the guarantees depend less on what System.IO.File does and more on the underlying mechanisms: the O/S code, file-system drivers, file-system structure, etc.
Generally, in the vast majority of cases Move() will behave the way you expect it to: either the file is moved or it remains as it was. This is because a Move within a single file system is an act of removing the file reference from one directory (an on-disk data structure), and adding it to another. If something bad happens, then the operation is rolled back: the removal from the source directory is undone by an opposite insert operation. Most modern file systems have a built-in journaling mechanism, which ensures that the move operation is either carried out completely or rolled back completely, even in case the machine loses power in the midst of the operation.
All that being said, it still depends, and not all file systems provide these guarantees. See this study
If you are running on Windows and the file system is local (not a network share), then you can use the Transactional NTFS (TxF) feature of Windows to ensure the atomicity of your move operation (note, though, that Microsoft has since deprecated TxF and discourages its use in new development).

C# assert that all files can be renamed

I have groups of files that need to be batch renamed from time to time. I don't want to rename any of them unless it can be asserted that all can be renamed from one to the other. Is there some kind of assertion method for doing this, or will I have to write my own?
Transactional NTFS can do that. There are .NET wrappers for that.
If you don't want to use that, consider opening all the files in exclusive mode before you start renaming. That gives you assurance that there was a point in time when each of the files was unused. Of course, a file can be opened right after your check, so this is only a heuristic.
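The exclusive-open pre-check might look like this sketch (the helper name is made up). It holds a FileShare.None handle on every file at once, releases them, and reports whether the batch looked renamable; as noted, a file can still be grabbed after the handles are released.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class RenameCheck
{
    // Heuristic pre-check: try to open every file exclusively. If any open
    // fails, assume the batch rename would not succeed. The locks are released
    // on return, so this narrows the race window but does not eliminate it.
    static bool AllFilesRenamable(IEnumerable<string> paths)
    {
        var handles = new List<FileStream>();
        try
        {
            foreach (string path in paths)
                handles.Add(new FileStream(path, FileMode.Open,
                                           FileAccess.ReadWrite, FileShare.None));
            return true;
        }
        catch (IOException)
        {
            return false;   // some file is locked by another process
        }
        finally
        {
            foreach (FileStream fs in handles) fs.Dispose();
        }
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string a = Path.Combine(dir, "a.txt");
        string b = Path.Combine(dir, "b.txt");
        File.WriteAllText(a, "1");
        File.WriteAllText(b, "2");

        Console.WriteLine(AllFilesRenamable(new[] { a, b })); // nothing locked

        using (new FileStream(a, FileMode.Open, FileAccess.Read, FileShare.None))
            Console.WriteLine(AllFilesRenamable(new[] { a, b })); // a is locked

        Directory.Delete(dir, recursive: true);
    }
}
```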

Best file mutex in .NET 3.5

I want to use some kind of mutex on files, so that no process touches certain files before others have stopped using them. How can I do this in .NET 3.5? Here are some details:
I have a service which periodically checks whether there are any files/directories in a certain folder and, if there are, does something with them.
Another process of mine is responsible for moving files (and directories) into that folder, and everything works just fine.
But I'm worried that there can be a situation where my copying process copies the files into the folder and at the same time (in the same millisecond) my service checks for files and does something with them (but not with all of them, because it checked during the copying).
So my idea is to use some kind of mutex (maybe one extra file could serve as the mutex?), so the service won't check anything until the copying is done.
How can I achieve something like that in possibly easy way?
Thanks for any help.
The canonical way to achieve this is via the filename:
Process A copies the files to e.g. "somefile.ext.noprocess" (this is non-atomic)
Process B ignores all files with the ".noprocess" suffix
After Process A has finished copying, it renames the file to "somefile.ext"
Next time Process B checks, it sees the file and starts processing.
If you have more than one file that must be processed together (or not at all), you need to adapt this scheme with an additional transaction file containing the file names for the transaction: only when this file exists and has the correct name should Process B read it and process the files listed in it.
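The suffix handoff above can be sketched like this (suffix and helper names are illustrative). The write under the ignored suffix is the slow, non-atomic part; the final rename on the same volume is what makes the file appear to the consumer all at once.

```csharp
using System;
using System.IO;

class SuffixHandoff
{
    const string InProgressSuffix = ".noprocess";

    // Writer side: create the file under a suffix the consumer ignores,
    // then rename it into place once the content is complete.
    static void WriteForConsumer(string finalPath, string content)
    {
        string tempPath = finalPath + InProgressSuffix;
        File.WriteAllText(tempPath, content);   // non-atomic: may be seen half-written
        File.Move(tempPath, finalPath);         // atomic rename on the same volume
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string target = Path.Combine(dir, "somefile.ext");

        WriteForConsumer(target, "done");

        // Consumer side: only files without the suffix are eligible.
        foreach (string f in Directory.GetFiles(dir))
            if (!f.EndsWith(InProgressSuffix))
                Console.WriteLine(Path.GetFileName(f));

        Directory.Delete(dir, recursive: true);
    }
}
```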
Your problem really is not of mutual exclusion, but of atomicity. Copying multiple files is not an atomic operation, and so it is possible to observe the files in a half-copied state which you'd like to prevent.
To solve your problem, you could hinge your entire operation on a single atomic file system operation, for example renaming (or moving) of a folder. That way no one can observe an intermediate state. You can do it as follows:
Copy the files to a folder outside the monitored folder, but on the same drive.
When the copying operation is complete, move the folder inside the monitored folder. To any outside process, all the files would appear at once, and it would have no chance to see only part of the files.
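A sketch of the folder-move variant, with made-up folder names: everything is copied into a staging area on the same drive, then one Directory.Move publishes the whole batch atomically to the monitored folder.

```csharp
using System;
using System.IO;

class AtomicFolderDrop
{
    static void Main()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string staging = Path.Combine(root, "staging");   // outside the watched folder
        string watched = Path.Combine(root, "watched");   // the folder the service scans
        Directory.CreateDirectory(staging);
        Directory.CreateDirectory(watched);

        // 1. Copy everything into the staging folder (slow, observable half-done).
        string batch = Path.Combine(staging, "batch-001");
        Directory.CreateDirectory(batch);
        File.WriteAllText(Path.Combine(batch, "a.txt"), "a");
        File.WriteAllText(Path.Combine(batch, "b.txt"), "b");

        // 2. One atomic rename on the same volume: the service either sees the
        //    whole batch or none of it, never a partial copy.
        Directory.Move(batch, Path.Combine(watched, "batch-001"));

        Console.WriteLine(Directory.GetFiles(Path.Combine(watched, "batch-001")).Length);
        Directory.Delete(root, recursive: true);
    }
}
```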

Is there any alternative to "truncate" which sounds safer?

I have an application that reads a linked list1 from a file when it starts and writes it back to the file when it ends. I chose truncate as the file mode for writing back. However, truncate seems a little dangerous to me, as it clears the whole content first; if something goes wrong, I cannot get my old data back. Is there a better alternative?
1: I use a linked list because the order of items may change. Thus I later use truncate to rewrite the whole file.
The accepted-answer reputation goes to Hans, as he was the first to point out File.Replace(), though it is not available in Silverlight for now.
Write to a new temporary file. When finished and satisfied with the result, delete the old file and rename/copy the new temporary file into the original file's location. This way, should anything go wrong, you are not losing data.
As pointed out in Hans Passant's answer, you should use File.Replace for maximum robustness when replacing the original file.
This is covered well by the .NET framework. Use the File.Replace() method. It securely replaces the content of your original file with the content of another file, leaving the original intact if there's any problem with the file system. It is a better mousetrap than the upvoted answers, which will fail when there's a pending delete on the original file.
There's an overload that lets you control whether the original file is preserved as a backup file. It is best if you let the function create the backup, it significantly increases the odds that the function will succeed when another process has a lock on your file, the most typical failure mode. They'll get to keep the lock on the backup file. The method also works best when you create the intermediate file on the same drive as the original so you'll want to avoid GetTempFileName(). A good way to generate a filename is Guid.NewGuid().ToString().
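A minimal File.Replace sketch following the advice above: the replacement file is created on the same drive with a GUID name, and the overload with a backup path preserves the old content.

```csharp
using System;
using System.IO;

class ReplaceDemo
{
    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string original = Path.Combine(dir, "data.txt");
        string backup   = Path.Combine(dir, "data.bak");
        // Create the replacement on the same drive as the original,
        // using a GUID name rather than GetTempFileName().
        string staging  = Path.Combine(dir, Guid.NewGuid().ToString());

        File.WriteAllText(original, "old contents");
        File.WriteAllText(staging, "new contents");

        // Swap the new content in; the old content goes to the backup file.
        File.Replace(staging, original, backup);

        Console.WriteLine(File.ReadAllText(original));
        Console.WriteLine(File.ReadAllText(backup));
        Directory.Delete(dir, recursive: true);
    }
}
```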
The "best" alternative for robustness would be to do the following:
Create a new file for the data you're persisting to disk
Write the data out to the new file
Perform any necessary data verification
Delete the original file
Move the new file to the original file location
You can use System.IO.Path.GetTempFileName to provide you with a uniquely named temporary file to use for step 1.
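The five steps above can be sketched as a hypothetical Save helper. Note the small window between deleting the original and moving the new file into place where no file exists; File.Replace, as mentioned in the other answer, closes that window.

```csharp
using System;
using System.IO;

class SafeRewrite
{
    // Write to a new file, verify it, then swap it into place.
    static void Save(string path, string data)
    {
        string temp = path + ".new";                  // step 1: new file
        File.WriteAllText(temp, data);                // step 2: write the data
        if (File.ReadAllText(temp) != data)           // step 3: verify
            throw new IOException("verification failed");
        if (File.Exists(path))
            File.Delete(path);                        // step 4: delete original
        File.Move(temp, path);                        // step 5: move into place
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string path = Path.Combine(dir, "list.dat");
        File.WriteAllText(path, "old");

        Save(path, "new");
        Console.WriteLine(File.ReadAllText(path));
        Directory.Delete(dir, recursive: true);
    }
}
```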
You were planning to use truncate, so I assume your input data is always written anew; therefore:
Use try ... catch to rename your original file to something like 'originalname_day_month_year.bak'
Write your file from scratch with the new data.
This way you don't have to worry about losing anything, and as a side effect you have a backup copy of your previous data. If that backup is not needed, you can always delete it.

Detecting moved files using FileSystemWatcher

I realise that FileSystemWatcher does not provide a Move event; instead it generates separate Delete and Create events for the same file. (The FileSystemWatcher is watching both the source and destination folders.)
However, how do we differentiate between a true file move and some random creation of a file that happens to have the same name as a file that was recently deleted?
Some sort of property of the FileSystemEventArgs class such as "AssociatedDeleteFile" that is assigned the deleted file path if it is the result of a move, or NULL otherwise, would be great. But of course this doesn't exist.
I also understand that the FileSystemWatcher is operating at the basic Filesystem level and so the concept of a "Move" may be only meaningful to higher level applications. But if this is the case, what sort of algorithm would people recommend to handle this situation in my application?
Update based on feedback:
The FileSystemWatcher class seems to see moving a file as simply 2 distinct events, a Delete of the original file, followed by a Create at the new location.
Unfortunately, there is no "link" provided between these events, so it is not obvious how to differentiate between a file move and an ordinary Delete or Create. At the OS level, a move is treated specially: you can move, say, a 1 GB file almost instantaneously.
A couple of answers suggested using a hash on files to identify them reliably between events, and I will probably take this approach. But if anyone knows how to detect a move more simply, please leave an answer.
According to the docs:
Common file system operations might raise more than one event. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events.
So if you're trying to be very careful about detecting moves, and having the same path is not good enough, you will have to use some sort of heuristic. For example, create a "fingerprint" using file name, size, last modified time, etc for files in the source folder. When you see any event that may signal a move, check the "fingerprint" against the new file.
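The fingerprint heuristic above might be sketched like this (the tuple of name, size, and last-write time is one illustrative choice of fingerprint, not the only one). The source folder is snapshotted up front; when a Created event arrives elsewhere, a matching fingerprint suggests a move rather than a fresh file. The demo simulates the move directly instead of wiring up a FileSystemWatcher.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class MoveHeuristic
{
    // Fingerprint: name + length + last-write time (moves preserve all three).
    static (string Name, long Length, DateTime LastWrite) Fingerprint(string path)
    {
        var info = new FileInfo(path);
        return (info.Name, info.Length, info.LastWriteTimeUtc);
    }

    static void Main()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string source = Path.Combine(root, "source");
        string dest = Path.Combine(root, "dest");
        Directory.CreateDirectory(source);
        Directory.CreateDirectory(dest);

        string file = Path.Combine(source, "report.xml");
        File.WriteAllText(file, "<data/>");

        // Snapshot the source folder (in practice, refreshed periodically).
        var known = new HashSet<(string, long, DateTime)>();
        foreach (string f in Directory.GetFiles(source))
            known.Add(Fingerprint(f));

        // Simulate a move; the watcher would report Deleted + Created.
        string moved = Path.Combine(dest, "report.xml");
        File.Move(file, moved);

        // On the Created event: a matching fingerprint suggests a move.
        Console.WriteLine(known.Contains(Fingerprint(moved)));
        Directory.Delete(root, recursive: true);
    }
}
```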
As far as I understand it, the Renamed event is for files being moved...?
My mistake - the docs specifically say that only files inside a moved folder are considered "renamed" in a cut-and-paste operation:
The operating system and FileSystemWatcher object interpret a cut-and-paste action or a move action as a rename action for a folder and its contents. If you cut and paste a folder with files into a folder being watched, the FileSystemWatcher object reports only the folder as new, but not its contents because they are essentially only renamed.
It also says about moving files:
Common file system operations might raise more than one event. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events.
As you already mentioned, there is no reliable way to do this with the default FileSystemWatcher class provided by C#. You can apply certain heuristics like filenames, hashes, or unique file ids to map created and deleted events together, but none of these approaches will work reliably. In addition, you cannot easily get the hash or file id for the file associated with the deleted event, meaning that you have to maintain these values in some sort of database.
I think the only reliable approach for detecting file movements is to create your own file system watcher. There are different ways to do this. If you are only going to watch changes on NTFS file systems, one solution might be to read the NTFS change journal as described here. What's nice about this is that it even allows you to track changes that occurred while your app wasn't running.
Another approach is to create a minifilter driver that tracks file system operations and forwards them to your application. Using this you basically get all information about what is happening to your files and you'll be able to get information about moved files. A drawback of this approach is that you have to create a separate driver that needs to be installed on the target system. The good thing however is that you wouldn't need to start from scratch, because I already started to create something like this: https://github.com/CenterDevice/MiniFSWatcher
This allows you to simply track moved files like this:
var eventWatcher = new EventWatcher();
eventWatcher.OnRenameOrMove += (filename, oldFilename, process) =>
{
    Console.WriteLine("File " + oldFilename + " has been moved to " + filename + " by process " + process);
};
eventWatcher.Connect();
eventWatcher.WatchPath("C:\\Users\\MyUser\\*");
However, please be aware that this requires kernel code that needs to be signed in order to run on 64-bit versions of Windows (unless you disable signature checking for testing). At the time of writing, this code is also still at an early stage of development, so I would not use it on production systems yet. But even if you're not going to use it, it should still give you some idea of how file system events can be tracked on Windows.
I'll hazard a guess that 'move' indeed does not exist, so you're really just going to have to look for a 'delete', mark that file as possibly moved, and then, if you see a 'create' for it shortly after, assume the two together were a move.
Do you have a case of random file creations affecting your detection of moves?
Might want to try the OnChanged and/or OnRenamed events mentioned in the documentation.
The StorageLibrary class can track moves. The example from Microsoft:
StorageLibrary videosLib = await StorageLibrary.GetLibraryAsync(KnownLibraryId.Videos);
StorageLibraryChangeTracker videoTracker = videosLib.ChangeTracker;
videoTracker.Enable();
A complete example could be found here.
However, it looks like you can only track changes inside the Windows "known libraries".
You can also try to get a StorageLibraryChangeTracker using StorageFolder.TryGetChangeTracker(). But your folder must be under a sync root; you cannot use this method on an arbitrary folder in the file system.