I have a folder with files like this:
foo.map
foo.ind
foo.dat
bar.map
bar.ind
bar.dat
readme.txt
readme.html
I would like to get all distinct file names without their extensions, i.e. foo, bar, readme.
Of course I can use Directory.GetFiles() and make a loop with Path.GetFileNameWithoutExtension() but I wonder if there is a short way, maybe just one line, e.g. using LINQ?
How about:
directory
.GetFiles()
.Select(f => Path.GetFileNameWithoutExtension(f.Name))
.Distinct();
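For completeness, here is a self-contained sketch of the same idea. The helper name DistinctBaseNames is made up for illustration, and the case-insensitive comparison and the sorting are assumptions added on top of the one-liner above (they treat Foo.map and foo.dat as the same base name).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper (not from the answer above): distinct base names of the
// files directly inside 'folder', compared case-insensitively and sorted.
static List<string> DistinctBaseNames(string folder) =>
    Directory.EnumerateFiles(folder)
        .Select(path => Path.GetFileNameWithoutExtension(path))
        .Distinct(StringComparer.OrdinalIgnoreCase)
        .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
        .ToList();

// Usage: pass the folder as the first command-line argument.
if (args.Length == 1)
{
    foreach (string name in DistinctBaseNames(args[0]))
        Console.WriteLine(name);
}
```

Drop the OrderBy call if the order of the names does not matter.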
I have directories arranged as in the picture. I want to edit the files OneA1, OneA2, OneA3, ... and TwoA1, TwoA2, TwoA3, ... (they are XML files and I want to edit some tags). There are hundreds of files on the C drive. How do I filter the required files in C#? The aim is to select all XML files whose names contain the word OneA and OneB.
static void Main (string[] args)
{
DirectoryInfo directory = new DirectoryInfo(@"C:\Products\MetalicProducts");
}
You can use a search option to include all subdirectories:
var prefilteredFiles = Directory.EnumerateFiles(path, "???A*.xml",
SearchOption.AllDirectories);
var filtered = prefilteredFiles
.Select(f => (full: f, name: Path.GetFileNameWithoutExtension(f)))
.Where(t => t.name.StartsWith("OneA") || t.name.StartsWith("TwoA"));
The wildcard pattern ???A*.xml pre-filters the files but is not selective enough. Therefore we use LINQ to refine the search.
The Select creates a tuple containing the full file name (including the directory and the extension) and the bare file name.
Of course you could use Regex if the simple string operations are not precise enough:
var filtered = prefilteredFiles
.Where(f =>
Regex.IsMatch(Path.GetFileNameWithoutExtension(f), "(OneA|TwoA)[1-9]+")
);
This also has the advantage that only one test per file is required, which allows us to drop the Select.
You also might use a pre-compiled regex to speed up the search; however, file operations are usually very slow compared to any calculations.
Note that DirectoryInfo also has an EnumerateFiles method with a SearchOption parameter. It returns FileInfo objects instead of just file names.
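The last two remarks can be combined as in the following sketch. The helper name FindByName is hypothetical, and the anchored pattern with \d+ is an assumption about which names should count as a match; adjust it to your naming scheme.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: XML files below 'root' whose base name matches 'namePattern'.
// DirectoryInfo.EnumerateFiles yields FileInfo objects lazily.
static IEnumerable<FileInfo> FindByName(string root, Regex namePattern) =>
    new DirectoryInfo(root)
        .EnumerateFiles("*.xml", SearchOption.AllDirectories)
        .Where(file => namePattern.IsMatch(Path.GetFileNameWithoutExtension(file.Name)));

// Built once and reused for every file. The anchors (^ and $) are an assumption:
// they reject names such as "XOneA1" or "OneA1_old".
var productPattern = new Regex(@"^(OneA|TwoA)\d+$", RegexOptions.Compiled);

// Usage: pass the root directory as the first command-line argument.
if (args.Length == 1)
{
    foreach (FileInfo file in FindByName(args[0], productPattern))
        Console.WriteLine($"{file.FullName} ({file.Length} bytes)");
}
```

Because the result is a sequence of FileInfo objects, properties such as Length or LastWriteTime are available without creating extra objects per path.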
I have a directory with multiple sub directories that contain .doc files. Example:
C:\Users\user\Documents\testenviroment\Released\test0.doc
C:\Users\user\Documents\testenviroment\Debug\test1.doc
C:\Users\user\Documents1\testenviroment\Debug\test2.doc
C:\Users\user\Documents1\testenviroment\Released\test20.doc
I want to get all the test*.doc files under all Debug folders. I tried:
string[] files = Directory.GetFiles(@"C:\Users\user", "*Debug\\test*.doc",
SearchOption.AllDirectories);
And it gives me an "Illegal characters in path" error.
If I try:
string[] files = Directory.GetFiles(@"C:\Users\user", "\\Debug\\test*.doc",
SearchOption.AllDirectories);
I get a different error: "Could not find a part of the path C:\Users\user\Debug".
You are including a folder within the search pattern, which isn't expected. According to the docs:
searchPattern Type: System.String The search string to match against
the names of files in path. This parameter can contain a combination
of valid literal path and wildcard (* and ?) characters (see Remarks),
but doesn't support regular expressions.
With this in mind, try something like this:
String[] files = Directory.GetFiles(@"C:\Users\user", "test*.doc", SearchOption.AllDirectories)
.Where(file => file.Contains("\\Debug\\"))
.ToArray();
This will get ALL the test*.doc files under your specified directory and then return only the ones with \Debug\ in the path. With this in mind, keep the search directory narrowed down as much as possible.
Note:
My original answer included EnumerateFiles, which would work like this (making sure to pass the search option; thanks @CodeCaster):
String[] files = Directory.EnumerateFiles(@"C:\Users\user", "test*.doc", SearchOption.AllDirectories)
.Where(file => file.Contains("\\Debug\\"))
.ToArray();
I've just run a test and the second seems to be slower; however, it might be quicker on a larger folder. Worth keeping in mind.
Edit: Note from @pinkfloydx33:
I've actually had that practically take down a system that I had
inherited. It was taking so much time trying to return the array and
killing the memory footprint as well. Problem was diverted converting
over to the enumerable counterparts
So using the second option would be safer for larger directories.
The second parameter, the search pattern, works only for filenames. So you'll need to iterate the directories you want to search, then call Directory.GetFiles(directory, "test*.doc") on each directory.
How to write that code depends on how robust you want it to be and what assumptions you want to make (e.g. "all Debug directories are always two levels into the user's directory" versus "the Debug directory can be at any level into the user's directory").
See How to recursively list all the files in a directory in C#?.
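A minimal sketch of the "iterate the directories first" approach, assuming the Debug directory can sit at any level below the root. The helper name FilesInNamedFolders is made up for illustration, and it only returns files directly inside a matching folder, not in that folder's subfolders. Directories the process is not allowed to read may still raise an UnauthorizedAccessException, which this sketch does not handle.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: files matching 'filePattern' that sit directly inside
// any directory named 'folderName' somewhere below 'root'.
static IEnumerable<string> FilesInNamedFolders(string root, string folderName, string filePattern) =>
    Directory.EnumerateDirectories(root, folderName, SearchOption.AllDirectories)
        .SelectMany(dir => Directory.EnumerateFiles(dir, filePattern));

// Usage: pass the root directory as the first command-line argument.
if (args.Length == 1)
{
    foreach (string file in FilesInNamedFolders(args[0], "Debug", "test*.doc"))
        Console.WriteLine(file);
}
```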
Alternatively, if you want to search all subdirectories and then discard files that don't match your preferences, see Searching for file in directories recursively:
var files = Directory.GetFiles(@"C:\Users\user", "test*.doc", SearchOption.AllDirectories)
.Where(f => f.IndexOf(@"\debug", StringComparison.OrdinalIgnoreCase) >= 0);
But note that this may be bad for performance, as it'll scan irrelevant directories.
I am using the following line to return specific files...
FileInfo file in nodeDirInfo.GetFiles("*.sbs", option)
But there are other files in the directory with the extension .sbsar, and it is getting them, too. How can I differentiate between .sbs and .sbsar in the search pattern?
The issue you're experiencing is a limitation of search patterns in the Win32 API. From the documentation:
A searchPattern with a file extension (for example *.txt) of exactly
three characters returns files having an extension of three or more
characters, where the first three characters match the file extension
specified in the searchPattern.
My solution is to manually filter the results, using LINQ:
nodeDirInfo.GetFiles("*.sbs", option).Where(f => f.Name.EndsWith(".sbs",
StringComparison.InvariantCultureIgnoreCase));
Try this, which filters by file extension:
FileInfo[] files = nodeDirInfo.GetFiles("*", SearchOption.TopDirectoryOnly).
Where(f=>f.Extension==".sbs").ToArray<FileInfo>();
That's the behaviour of the Win32 API (FindFirstFile) that is underneath GetFiles() being reflected on to you.
You'll need to do your own filtering if you must use GetFiles(). For instance:
GetFiles("*", searchOption).Where(s => s.EndsWith(".sbs",
StringComparison.InvariantCultureIgnoreCase));
Or more efficiently:
EnumerateFiles("*", searchOption).Where(s => s.EndsWith(".sbs",
StringComparison.InvariantCultureIgnoreCase));
Note that I use StringComparison.InvariantCultureIgnoreCase to deal with the fact that Windows file names are case-insensitive.
If performance is an issue, that is if the search has to process directories with large numbers of files, then it is more efficient to perform the filtering twice: once in the call to GetFiles or EnumerateFiles, and once to clean up the unwanted file names. For example:
GetFiles("*.sbs", searchOption).Where(s => s.EndsWith(".sbs",
StringComparison.InvariantCultureIgnoreCase));
EnumerateFiles("*.sbs", searchOption).Where(s => s.EndsWith(".sbs",
StringComparison.InvariantCultureIgnoreCase));
It's mentioned in the docs:
When using the asterisk wildcard character in a searchPattern, a
searchPattern with a file extension of exactly three characters
returns files having an extension of three or more characters. When
using the question mark wildcard character, this method returns only
files that match the specified file extension.
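As an alternative to EndsWith, the clean-up step can compare the extension reported by Path.GetExtension. The following is a sketch only; the helper names HasExtension and FilesWithExtension are made up for illustration, and the case-insensitive comparison is an assumption based on Windows file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// True only if the extension of 'path' equals 'extension' (for example ".sbs"),
// ignoring case. ".sbsar" is therefore rejected.
static bool HasExtension(string path, string extension) =>
    string.Equals(Path.GetExtension(path), extension, StringComparison.OrdinalIgnoreCase);

// The search pattern pre-filters; HasExtension removes the unwanted extra matches.
static IEnumerable<string> FilesWithExtension(string root, string extension) =>
    Directory.EnumerateFiles(root, "*" + extension, SearchOption.AllDirectories)
        .Where(path => HasExtension(path, extension));

// Usage: pass the root directory as the first command-line argument.
if (args.Length == 1)
{
    foreach (string file in FilesWithExtension(args[0], ".sbs"))
        Console.WriteLine(file);
}
```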
I wish to get a list of all the files of a certain extension (recursive), but only the files ending with that extension.
For example, I wish to get all the files with the ".exe" extension. If I have the following files:
file1.exe , file2.txt.exe , file3.exe.txt , file4.txt.exe1 , file5.txt
I expect to get a list of 1 file, which is: file1.exe.
I'm trying to use the following line:
List<string> theList = Directory.GetFiles(@"C:\SearchDir", "*.exe", SearchOption.AllDirectories).ToList();
But what I get is a list of the following three files: file1.exe , file2.txt.exe , file4.txt.exe1
Any ideas?
Try this:
var exeFiles = Directory.EnumerateFiles(sourceDirectory,
"*", SearchOption.AllDirectories)
// Count the dots in the file name only, so dots in directory names are ignored.
.Where(s => s.EndsWith(".exe") && Path.GetFileName(s).Count(c => c == '.') == 1)
.ToList();
This is a common issue. Take note of the MSDN documentation:
When using the asterisk wildcard character in a searchPattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern.
You can't solve it by searching for the .exe extension; you'll need to filter your results one more time in the client code.
One more thing to note: the following examples would in fact be considered executable files:
file1.exe
file2.txt.exe
whereas this one wouldn't technically be considered an executable file.
file4.txt.exe1
So the question then becomes, what algorithm do you want? It appears to me you want the following:
Files that have an extension of exe.
Files that don't have multiple extensions.
Have a look at Ahmed's answer for a fantastic approach to getting the algorithm you want.
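The two criteria listed above can also be written as a predicate. This is a sketch under those stated criteria, not the code from the answer referred to; the name IsPlainExe is made up, and ignoring case is an assumption.

```csharp
using System;
using System.IO;
using System.Linq;

// True if the file name ends in ".exe" and has no further extension in front of it.
static bool IsPlainExe(string path)
{
    string name = Path.GetFileName(path);
    bool isExe = string.Equals(Path.GetExtension(name), ".exe", StringComparison.OrdinalIgnoreCase);
    bool singleExtension = !Path.GetFileNameWithoutExtension(name).Contains('.');
    return isExe && singleExtension;
}

// The five sample names from the question.
string[] samples = { "file1.exe", "file2.txt.exe", "file3.exe.txt", "file4.txt.exe1", "file5.txt" };
Console.WriteLine(string.Join(", ", samples.Where(IsPlainExe))); // prints: file1.exe
```

To apply it to a directory tree, pass the predicate to Where on the result of Directory.EnumerateFiles(..., "*.exe", SearchOption.AllDirectories).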
Is there any way I can create a list with all the folders and files that are in a directory? I will specify the path and I want to list all its child folders and files, and write them in a txt file, or maybe an xml file to make it easier to read.
The Directory.GetFiles method should give you a list of all files, along with their full paths:
string[] filePaths = Directory.GetFiles(@"c:\MyDir\", "*.*", SearchOption.AllDirectories);
The Directory.GetFiles and Directory.GetDirectories methods should help.
This is a good link to get all files:
http://www.csharp-examples.net/get-files-from-directory/
And to get all directories, use Directory.GetDirectories() instead.
Then you have to write a loop or something similar to traverse them.
Perhaps a recursive method would be a good selection. Something like
void PrintFilesAndFolders(string directory)...
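One possible shape for such a recursive method is sketched below. The output and depth parameters, the bracket notation for folders, and the two-space indentation are assumptions added for illustration; the suggestion above only names the method and its directory parameter.

```csharp
using System;
using System.IO;
using System.Linq;

// Writes the folders (in brackets) and files below 'directory' to 'output',
// indenting each level by two spaces. 'output' and 'depth' are additions
// made for this sketch.
static void PrintFilesAndFolders(string directory, TextWriter output, int depth = 0)
{
    string indent = new string(' ', depth * 2);

    foreach (string sub in Directory.GetDirectories(directory).OrderBy(d => d, StringComparer.Ordinal))
    {
        output.WriteLine($"{indent}[{Path.GetFileName(sub)}]");
        PrintFilesAndFolders(sub, output, depth + 1); // recurse into the subfolder
    }

    foreach (string file in Directory.GetFiles(directory).OrderBy(f => f, StringComparer.Ordinal))
        output.WriteLine($"{indent}{Path.GetFileName(file)}");
}

// Usage: first argument is the directory to list, second is the text file to write.
if (args.Length == 2)
{
    using var writer = new StreamWriter(args[1]);
    PrintFilesAndFolders(args[0], writer);
}
```

Passing Console.Out instead of a StreamWriter prints the same listing to the console.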