I want to detect what sorting order I have chosen in a given path. My goal is to sort an array / list the same way.
For example: the folder C:\Test is sorted by file size, smallest first.
If your question is about the file enumeration methods in C#, then the order of the files is not guaranteed; it depends on the file system. The operating system doesn't remember what you chose in Windows Explorer when you call those methods. They are based on the Win32 APIs, whose documentation states the following:
The order in which this function returns the file names is dependent
on the file system type. With the NTFS file system and CDFS file
systems, the names are usually returned in alphabetical order. With
FAT file systems, the names are usually returned in the order the
files were written to the disk, which may or may not be in
alphabetical order. However, as stated previously, these behaviors are
not guaranteed.
However, if you were an enterprising-young-jedi-coder-with-time-to-burn, you would probably find that the sort order for Windows Explorer (as chosen by you in Windows Explorer) is stored in the registry. You could use a registry monitor to identify what is actually accessed and changed (if this is the case).
Personally, I think this would take a long time, would be OS dependent, and could break with any Windows update, for little to no benefit.
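Since the Explorer setting is not exposed, the practical route is to apply the sort you want yourself. A minimal sketch, assuming "smallest file first" is the desired order; the class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileSorting
{
    // Returns the files directly inside `directory`, smallest first.
    // Ties are broken by name so the result is deterministic.
    public static List<FileInfo> BySizeAscending(string directory)
    {
        return new DirectoryInfo(directory)
            .GetFiles()
            .OrderBy(f => f.Length)
            .ThenBy(f => f.Name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Calling FileSorting.BySizeAscending(@"C:\Test") then gives the same sequence regardless of what the file system or Explorer happens to do.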
Related
I am currently writing a compiler and I have a small class that loops through the input files and compares them to make sure no file is repeated. Of course, we can't compare the strings directly because the same file can be written as, say, main.c and ./main.c. Therefore, I am using System.IO.Path.GetFullPath() to compare the file paths. The problem is that on Windows the file systems aren't case sensitive, so "C:/main.c" == "C:/Main.c", for example, but on *NIX systems like Linux, Mac or Android these two could be different files. Also, *NIX supports file systems like FAT and FAT32, which behave like the Windows ones. How do I know when I should compare the two paths with or without case sensitivity, so that I can firmly determine whether the two file paths are equal or different?
You can pinvoke the Shell API function SHParseDisplayName, then call the CompareIDs method on the IShellFolder interface returned by SHGetDesktopFolder.
If you can drop XP support you can use Microsoft's Windows-API-Code-Pack. Microsoft.WindowsAPICodePack.Shell.ShellObject.Equals would do the comparison.
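If the Shell API is not an option (for example on *NIX), a different technique is to probe the directory itself. This is only a sketch, not what the answers above describe: it assumes you have write access to the directory, and the names PathComparison, IsCaseSensitive and SameFile are hypothetical.

```csharp
using System;
using System.IO;

public static class PathComparison
{
    // Probes whether `directory` is on a case-sensitive file system by creating
    // a lowercase-named temp file and looking it up under an uppercase name.
    public static bool IsCaseSensitive(string directory)
    {
        string name = "casetest_" + Guid.NewGuid().ToString("N").ToLowerInvariant();
        string lower = Path.Combine(directory, name);
        string upper = Path.Combine(directory, name.ToUpperInvariant());
        File.WriteAllBytes(lower, new byte[0]);
        try
        {
            // If the uppercase name resolves to the same file, case is ignored.
            return !File.Exists(upper);
        }
        finally
        {
            File.Delete(lower);
        }
    }

    // Compares two paths after normalising them with GetFullPath.
    public static bool SameFile(string a, string b, bool caseSensitive)
    {
        StringComparison comparison = caseSensitive
            ? StringComparison.Ordinal
            : StringComparison.OrdinalIgnoreCase;
        return string.Equals(Path.GetFullPath(a), Path.GetFullPath(b), comparison);
    }
}
```

The probe answers the question per directory rather than per operating system, which matters when a FAT volume is mounted on Linux. It does not resolve symbolic links or hard links, so two different paths to the same file still compare as different.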
I'm using FSCTL_ENUM_USN_DATA to enumerate over the NTFS MFT so that I may build a directory database based on USN_RECORD FileReferenceNumbers. I'm constructing this database so that I can monitor file changes on an NTFS drive by using the NTFS USN Change Journal and reading USN_RECORD's (using FileReferenceNumber and ParentFileReferenceNumber, which reference the directory database). See here for info on doing this.
My issue is with the USN Record versions. If you look, USN_RECORD_V2 supports a different datatype for FileReferenceNumbers (DWORDLONG) than USN_RECORD_V3 (FILE_ID_128). This would be fine, if FSCTL_ENUM_USN_DATA supported USN_RECORD_V3. The issue is USN_RECORD_V3 is used in Windows 10, while USN_RECORD_V2 is used in Windows 7.
FSCTL_ENUM_USN_DATA takes in a MFT_ENUM_DATA_V1 or MFT_ENUM_DATA_V0 as its input buffer. I assumed V1 supported FILE_ID_128 FileReferenceNumbers, but this assumption turned out to be incorrect. There seems to be no support for USN_RECORD_V3 and its associated FileReferenceNumber data type. Thus, monitoring changes on an NTFS drive using the NTFS Change Journal on versions of Windows that use USN_RECORD_V3 or later is a huge issue right now.
I have found a temporary solution! On Windows 10, when enumerating the MFT, FSCTL_ENUM_USN_DATA only returned USN_RECORD_V2s, giving FileReferenceNumbers of type DWORDLONG. In turn, I was forced to bitshift these DWORDLONG FileReferenceNumbers into a 128-bit buffer so that the directory cache would match the USN_RECORD_V3s returned from the FSCTL_READ_USN_JOURNAL call.
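The widening step can be sketched as follows. This assumes that, on NTFS, the 128-bit identifier is simply the 64-bit reference number stored in the low eight bytes with the high eight bytes zero; that layout is my assumption here, and the helper names are hypothetical.

```csharp
using System;

public static class FileReference
{
    // Widens a 64-bit file reference number (as in USN_RECORD_V2) to a
    // 16-byte buffer shaped like FILE_ID_128: the low 8 bytes hold the value
    // in little-endian order, the high 8 bytes stay zero (assumed layout).
    public static byte[] ToFileId128(ulong fileReferenceNumber)
    {
        var id = new byte[16];
        for (int i = 0; i < 8; i++)
        {
            id[i] = (byte)(fileReferenceNumber >> (8 * i));
        }
        return id;
    }

    // Hex string that can serve as a dictionary key for either record version.
    public static string ToKey(byte[] fileId128)
    {
        return BitConverter.ToString(fileId128);
    }
}
```

Keying the directory cache on the 16-byte form means V2 records from the enumeration and V3 records from the journal land in the same dictionary.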
However, I can't help but feel that I'm missing something. Does anyone have any other solutions to this problem, or any alternate approaches that can be taken? Keep in mind, monitoring changes that were made to the drive while the program wasn't running is paramount for my project.
I'm developing a file system manager module, and wondering what will be a more efficient approach.
This will be on a Windows machine with NTFS.
The module will need to notify a different module about new files created in a specific directory and also maintain some kind of state for these files, so that already processed files can be deleted and, in case of failure, the unprocessed files can be processed again.
I thought of either moving files between directories as their state changes, renaming files according to their state, or changing the files' attributes to indicate their state.
I'm wondering what would be the most efficient approach, considering the possibility of a large quantity of files being created over a short time span.
I can't fully answer your question, but I can give some general hints. Most important of all, the answer to your question might largely depend on the underlying file system (NTFS, FAT32, etc.).
Renaming or moving a file on the same partition generally means that directory entries are changed. The actual file contents need not be touched. Once you move a file to a different partition or hard disk drive, the actual file contents must be copied, too, which takes far more time.
That all being said, I would generally assume a rename to be slightly quicker than moving a file to another directory (on the same partition), since only one directory is affected instead of two. I'm also not quite sure what you mean by changing a file "attribute" -- however, if you're talking about e.g. setting the "archive" flag of a file, or making the file "read-only", that might again be slightly faster than a rename, if the directory entry can be changed in-place instead of being replaced with a new one of a different size.
Again: Do take my assumptions with caution, since this all depends on the particular file system. (For example, hiding a file on a UNIX file system usually means renaming it -- prefixing the name with a . --, but the same is not true for typical DOS/Windows file systems.)
Renaming took: 1498.8166
ApplyAttribute took: 340.5407
Transfer took: 2527.6837
Transfer took: 3933.4944
ApplyAttribute took: 419.635
Renaming took: 1384.0079
Tested with 1000 files.
Ran the tests twice in order to ensure no caching is in place.
EDITED: nasty bug was fixed, sorry.
Go with attributes.
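For what it's worth, here is one way the attribute approach could look. It is only a sketch: using the Archive bit as the "not yet processed" marker is my assumption (Windows sets it on new and modified files), and FileState is a made-up name.

```csharp
using System.IO;

public static class FileState
{
    // Treats the Archive bit as "not yet processed".
    public static bool IsPending(string path)
    {
        return (File.GetAttributes(path) & FileAttributes.Archive) != 0;
    }

    // Clears the Archive bit once a file has been handled.
    public static void MarkProcessed(string path)
    {
        FileAttributes remaining = File.GetAttributes(path) & ~FileAttributes.Archive;
        if (remaining == 0)
        {
            // No other attribute is left, so fall back to the explicit "Normal" value.
            remaining = FileAttributes.Normal;
        }
        File.SetAttributes(path, remaining);
    }
}
```

Keep in mind that backup tools also read and clear the Archive bit, so this only works if nothing else on the machine relies on it.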
Why do you want to store this information directly in the filesystem? I would recommend using a SQL database to keep track of the files. That way, you avoid modifying the filesystem, it's probably going to be faster, and you can easily have more information about the files if you need them.
Also, having one folder with a large number of files might be slow by itself, so you might consider spreading the files over more folders, if that makes sense for you.
I have a program that compares files in two folders. I want to detect if a file has been renamed, determine the newest file (most recently renamed), and update the name on the old file to match.
To accomplish this, I would check to see if the newest file is bit by bit identical to the old one, and if it is, simply rename the old file to match the new one.
The problem is, I have nothing to key on to tell me which file was most recently renamed.
I would love some property like FileInfo.LastModified, but for files that have been renamed.
I've already looked at solutions like FileSystemWatcher, and that is not really what I'm looking for. I would like to be able to run my synchronizer whenever I want, without having to worry about some dedicated process tracking a folder's state.
Any ideas?
A: At least on NTFS, you can attach alternate data streams to a file.
On your first sync, you can just attach a GUID in an ADS to the source files to tag them.
B: If you don't have write access to the source, store hashes of the files you synced in your target repository. When the source changes, you only have to hash the source files, and you only need to compare bit by bit if the hashes match. Depending on the quality and speed of your hash function, this will save you a lot of time.
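Option B could be sketched roughly like this. SHA-256 is my choice of hash, not something the question prescribes, the names are hypothetical, and the sketch assumes file contents are unique within each folder (duplicates would need extra handling).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class RenameDetector
{
    // Hashes the full content of a file and returns it as a hex string.
    public static string HashFile(string path)
    {
        using (SHA256 sha = SHA256.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Returns (target path, name used in source) pairs for target files whose
    // content matches a source file that carries a different name.
    public static List<KeyValuePair<string, string>> FindRenames(string sourceDir, string targetDir)
    {
        var sourceNameByHash = new Dictionary<string, string>();
        foreach (string file in Directory.GetFiles(sourceDir))
        {
            sourceNameByHash[HashFile(file)] = Path.GetFileName(file);
        }

        var renames = new List<KeyValuePair<string, string>>();
        foreach (string file in Directory.GetFiles(targetDir))
        {
            string sourceName;
            if (sourceNameByHash.TryGetValue(HashFile(file), out sourceName)
                && !string.Equals(sourceName, Path.GetFileName(file), StringComparison.Ordinal))
            {
                renames.Add(new KeyValuePair<string, string>(file, sourceName));
            }
        }
        return renames;
    }
}
```

A matching hash is a strong hint rather than proof, so a bit-by-bit comparison before the actual rename is still a sensible last step.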
If you are running on an NTFS drive you can enable the change journal which you can then query for things like rename events. However you need to be an admin to enable it to begin with and it will use disk space. Unfortunately I don't know of any specific C# implementations of reading the journal.
You could possibly create a config file that holds a list of all expected names within the folder, and then, if a file in the folder is not a member of the expected list of names, determine that the file has then been renamed. This would, however, add another layer of work considering you'd have to change the list every time you wish to add a new file to the folder.
Filesystems generally do not track this.
Since you seem to be on Windows, you can use GetFileInformationByHandle(). (Sorry, I don't know the C# equivalent.) You can use the "file index" fields in the struct returned to see if files have the same index as something you've seen before. Keep in mind that hardlinks will also have the same index.
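For C#, GetFileInformationByHandle can be reached through P/Invoke. The following is a Windows-only sketch; the wrapper class FileIdentity and its method names are made up, while the struct fields follow the Win32 BY_HANDLE_FILE_INFORMATION layout.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

public static class FileIdentity
{
    [StructLayout(LayoutKind.Sequential)]
    private struct BY_HANDLE_FILE_INFORMATION
    {
        public uint FileAttributes;
        public FILETIME CreationTime;
        public FILETIME LastAccessTime;
        public FILETIME LastWriteTime;
        public uint VolumeSerialNumber;
        public uint FileSizeHigh;
        public uint FileSizeLow;
        public uint NumberOfLinks;
        public uint FileIndexHigh;
        public uint FileIndexLow;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetFileInformationByHandle(
        SafeFileHandle hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

    // Joins the two 32-bit halves of the file index into one 64-bit value.
    public static ulong Combine(uint high, uint low)
    {
        return ((ulong)high << 32) | low;
    }

    // Returns the file index of `path`; it is only unique within one volume.
    public static ulong GetFileIndex(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.ReadWrite | FileShare.Delete))
        {
            BY_HANDLE_FILE_INFORMATION info;
            if (!GetFileInformationByHandle(stream.SafeFileHandle, out info))
            {
                throw new Win32Exception(Marshal.GetLastWin32Error());
            }
            return Combine(info.FileIndexHigh, info.FileIndexLow);
        }
    }
}
```

A file keeps its index across a rename on the same volume, so storing the index at sync time and looking it up later identifies renamed files. Because the index is per volume, store VolumeSerialNumber alongside it if several drives are involved.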
Alternatively you could hash file contents somehow.
I don't know precisely what you're trying to do, so I can't tell you whether either of these points makes sense. It could be that the most reasonable answer is, "no, you can't do that."
I would compute a CRC of (all?) the files in the two directories, storing the last update time along with the CRC value, file name, etc. After that, iterate through the lists, finding matches by CRC, and then use the date values to decide what to do.
I am using the Windows API function FindFirstFileEx because it provides the capability to return just the sub-directories of a given directory (ignoring files). However when I call this function with the required flag, I still receive both files and directories.
The MSDN documentation for the FindExSearchLimitToDirectories flag used by FindFirstFileEx says:
This is an advisory flag. If the file
system supports directory filtering,
the function searches for a file that
matches the specified name and is also
a directory. If the file system does
not support directory filtering, this
flag is silently ignored.
The lpSearchFilter parameter of the
FindFirstFileEx function must be NULL
when this search value is used.
If directory filtering is desired,
this flag can be used on all file
systems, but because it is an advisory
flag and only affects file systems
that support it, the application must
examine the file attribute data stored
in the lpFindFileData parameter of the
FindFirstFileEx function to determine
whether the function has returned a
handle to a directory.
So, what file systems actually support this flag? It would have been sensible to actually list these supported file systems on the same page, but I can't find it.
My development system is Windows XP SP3, NTFS, .NET 3.5.
I know I can check file attributes to determine if a file is a directory, however this means checking every file/directory. It also defeats the purpose of using FindFirstFileEx in the first place.
Of course there is still the chance I may be doing something incorrectly in my code. The only thing I can see is passing IntPtr.Zero to lpSearchFilter may not be the same as passing NULL (as mentioned in the quote).
Here's an example of the code I'm using:
m_searchDirHandle = WinAPI.FindFirstFileEx(@"C:\Temp\*",
    WinAPI.FINDEX_INFO_LEVELS.FindExInfoStandard,
    ref m_findDirData,
    WinAPI.FINDEX_SEARCH_OPS.FindExSearchLimitToDirectories,
    IntPtr.Zero, 0);
if (m_searchDirHandle != WinAPI.INVALID_HANDLE_VALUE)
{
    do
    {
        // m_findDirData holds the current entry here; the first one comes from FindFirstFileEx.
        foundNextDir = WinAPI.FindNextFile(m_searchDirHandle, ref m_findDirData);
    } while (foundNextDir);
}
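Because the flag is advisory, the attribute check the documentation asks for is needed either way. A minimal sketch of that check, using the managed DirectoryInfo API instead of the P/Invoke wrapper above (a substitution on my part; DirectoryListing is a hypothetical name):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DirectoryListing
{
    // Returns the names of the immediate subdirectories of `path` by
    // examining the Directory attribute of every entry.
    public static List<string> SubdirectoryNames(string path)
    {
        return new DirectoryInfo(path)
            .GetFileSystemInfos()
            .Where(entry => (entry.Attributes & FileAttributes.Directory) != 0)
            .Select(entry => entry.Name)
            .ToList();
    }
}
```

With the Win32 call, the same test is a bitwise AND of FILE_ATTRIBUTE_DIRECTORY against the dwFileAttributes field of WIN32_FIND_DATA inside the loop.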
The nearest link I could find was the list of system calls by Metasploit. I am taking a stab here, but I would imagine that FindFirstFileEx ends up as an indirect call to one of the NT system calls 'NtOpenDirectoryObject', 'NtQueryDirectoryFile' or 'NtQueryDirectoryObject'... I hope. If anyone thinks I'm wrong and downvotes to disagree, I will be corrected by whoever disagrees :)
However, I have hit on a few links here
CodeGuru forum on this issue about the flag
A Wine mailing list post suggesting the flag has no effect?
GenNT mentions that it is apparently limited to NTFS, (there's 3 replies to that posting)
Here on SO, a question on 'How to get list of folders in this folder'
Edit: After mentioning it in the comments, I thought it would be fitting to add a link to the Linux NTFS driver, which is able to read NTFS partitions; there are bound to be changes across source versions to accommodate the different NTFS versions going back to Win2000...
Hope this helps,
Best regards,
Tom.