I am trying to use the Directory.GetFiles() method to retrieve a list of files of multiple types, such as mp3's and jpg's. I have tried both of the following with no luck:
Directory.GetFiles("C:\\path", "*.mp3|*.jpg", SearchOption.AllDirectories);
Directory.GetFiles("C:\\path", "*.mp3;*.jpg", SearchOption.AllDirectories);
Is there a way to do this in one call?
For .NET 4.0 and later,
var files = Directory.EnumerateFiles("C:\\path", "*.*", SearchOption.AllDirectories)
.Where(s => s.EndsWith(".mp3") || s.EndsWith(".jpg"));
For earlier versions of .NET,
var files = Directory.GetFiles("C:\\path", "*.*", SearchOption.AllDirectories)
.Where(s => s.EndsWith(".mp3") || s.EndsWith(".jpg"));
edit: Please read the comments. The improvement that Paul Farry suggests, and the memory/performance issue that Christian.K points out are both very important.
How about this:
private static string[] GetFiles(string sourceFolder, string filters, System.IO.SearchOption searchOption)
{
return filters.Split('|').SelectMany(filter => System.IO.Directory.GetFiles(sourceFolder, filter, searchOption)).ToArray();
}
I found it here (in the comments): http://msdn.microsoft.com/en-us/library/wz42302f.aspx
If you have a large list of extensions to check, you can use the following. I didn't want to create a lot of OR statements, so I modified what lette wrote.
string supportedExtensions = "*.jpg,*.gif,*.png,*.bmp,*.jpe,*.jpeg,*.wmf,*.emf,*.xbm,*.ico,*.eps,*.tif,*.tiff,*.g01,*.g02,*.g03,*.g04,*.g05,*.g06,*.g07,*.g08";
foreach (string imageFile in Directory.GetFiles(_tempDirectory, "*.*", SearchOption.AllDirectories).Where(s => supportedExtensions.Contains(Path.GetExtension(s).ToLower())))
{
//do work here
}
For
var exts = new[] { "mp3", "jpg" };
You could:
public IEnumerable<string> FilterFiles(string path, params string[] exts) {
return
Directory
.EnumerateFiles(path, "*.*")
.Where(file => exts.Any(x => file.EndsWith(x, StringComparison.OrdinalIgnoreCase)));
}
Don't forget the new .NET4 Directory.EnumerateFiles for a performance boost (What is the difference between Directory.EnumerateFiles vs Directory.GetFiles?)
"IgnoreCase" should be faster than "ToLower" (.EndsWith("aspx", StringComparison.OrdinalIgnoreCase) rather than .ToLower().EndsWith("aspx"))
But the real benefit of EnumerateFiles shows up when you split up the filters and merge the results:
public IEnumerable<string> FilterFiles(string path, params string[] exts) {
return
exts.Select(x => "*." + x) // turn into globs
.SelectMany(x =>
Directory.EnumerateFiles(path, x)
);
}
It gets a bit faster if you don't have to turn them into globs (i.e. exts = new[] {"*.mp3", "*.jpg"} already).
Performance evaluation based on the following LinqPad test (note: Perf just repeats the delegate 10000 times)
https://gist.github.com/zaus/7454021
( reposted and extended from 'duplicate' since that question specifically requested no LINQ: Multiple file-extensions searchPattern for System.IO.Directory.GetFiles )
I know it's an old question, but with LINQ (.NET 4.0+):
var files = Directory.GetFiles("path_to_files").Where(file => Regex.IsMatch(file, @"^.+\.(wav|mp3|txt)$"));
There is also a decent solution which seems not to have any memory or performance overhead and is quite elegant:
string[] filters = new[]{"*.jpg", "*.png", "*.gif"};
string[] filePaths = filters.SelectMany(f => Directory.GetFiles(basePath, f)).ToArray();
Another way to use Linq, but without having to return everything and filter on that in memory.
var files = Directory.GetFiles("C:\\path", "*.mp3", SearchOption.AllDirectories).Union(Directory.GetFiles("C:\\path", "*.jpg", SearchOption.AllDirectories));
It's actually 2 calls to GetFiles(), but I think it's consistent with the spirit of the question and returns them in one enumerable.
Let
var set = new HashSet<string>(
new[] { ".mp3", ".jpg" },
StringComparer.OrdinalIgnoreCase); // ignore case
var dir = new DirectoryInfo(path);
Then
dir.EnumerateFiles("*.*", SearchOption.AllDirectories)
.Where(f => set.Contains(f.Extension));
or
from file in dir.EnumerateFiles("*.*", SearchOption.AllDirectories)
from ext in set // makes sense only if it's just IEnumerable<string> or similar
where String.Equals(ext, file.Extension, StringComparison.OrdinalIgnoreCase)
select file;
Nope. Try the following:
List<string> _searchPatternList = new List<string>();
...
List<string> fileList = new List<string>();
foreach ( string ext in _searchPatternList )
{
foreach ( string subFile in Directory.GetFiles( folderName, ext ) )
{
fileList.Add( subFile );
}
}
// Sort alphabetically
fileList.Sort();
// Add files to the file browser control
foreach ( string fileName in fileList )
{
...;
}
Taken from: http://blogs.msdn.com/markda/archive/2006/04/20/580075.aspx
I can't use .Where method because I'm programming in .NET Framework 2.0 (Linq is only supported in .NET Framework 3.5+).
Code below is not case sensitive (so .CaB or .cab will be listed too).
string[] ext = new string[2] { "*.CAB", "*.MSU" };
foreach (string found in ext)
{
string[] extracted = Directory.GetFiles("C:\\test", found, System.IO.SearchOption.AllDirectories);
foreach (string file in extracted)
{
Console.WriteLine(file);
}
}
List<string> FileList = new List<string>();
DirectoryInfo di = new DirectoryInfo("C:\\DirName");
IEnumerable<FileInfo> fileList = di.GetFiles("*.*");
//Create the query
IEnumerable<FileInfo> fileQuery = from file in fileList
where (file.Extension.ToLower() == ".jpg" || file.Extension.ToLower() == ".png")
orderby file.LastWriteTime
select file;
foreach (System.IO.FileInfo fi in fileQuery)
{
fi.Attributes = FileAttributes.Normal;
FileList.Add(fi.FullName);
}
In .NET 2.0 (no LINQ):
public static List<string> GetFilez(string path, System.IO.SearchOption opt, params string[] patterns)
{
List<string> filez = new List<string>();
foreach (string pattern in patterns)
{
filez.AddRange(
System.IO.Directory.GetFiles(path, pattern, opt)
);
}
// filez.Sort(); // Optional
return filez; // Optional: .ToArray()
}
Then use it:
foreach (string fn in GetFilez(path
, System.IO.SearchOption.AllDirectories
, "*.xml", "*.xml.rels", "*.rels"))
{}
DirectoryInfo directory = new DirectoryInfo(Server.MapPath("~/Contents/"));
//Using Union
FileInfo[] files = directory.GetFiles("*.xlsx")
.Union(directory
.GetFiles("*.csv"))
.ToArray();
If you are using VB.NET (or have referenced Microsoft.VisualBasic in your C# project), there is a convenience method that filters for multiple extensions:
Microsoft.VisualBasic.FileIO.FileSystem.GetFiles("C:\\path", Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories, new string[] {"*.mp3", "*.jpg"});
In VB.NET this can be accessed through the My-namespace:
My.Computer.FileSystem.GetFiles("C:\path", FileIO.SearchOption.SearchAllSubDirectories, {"*.mp3", "*.jpg"})
Unfortunately, these convenience methods don't support a lazily evaluated variant like Directory.EnumerateFiles() does.
The following function searches on multiple patterns, separated by commas. You can also specify an exclusion, eg: "!web.config" will search for all files and exclude "web.config". Patterns can be mixed.
private string[] FindFiles(string directory, string filters, SearchOption searchOption)
{
if (!Directory.Exists(directory)) return new string[] { };
var include = (from filter in filters.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries) where !string.IsNullOrEmpty(filter.Trim()) select filter.Trim());
var exclude = (from filter in include where filter.Contains(@"!") select filter);
include = include.Except(exclude);
if (include.Count() == 0) include = new string[] { "*" };
var rxfilters = from filter in exclude select string.Format("^{0}$", filter.Replace("!", "").Replace(".", @"\.").Replace("*", ".*").Replace("?", "."));
Regex regex = new Regex(string.Join("|", rxfilters.ToArray()));
List<Thread> workers = new List<Thread>();
List<string> files = new List<string>();
foreach (string filter in include)
{
Thread worker = new Thread(
new ThreadStart(
delegate
{
string[] allfiles = Directory.GetFiles(directory, filter, searchOption);
if (exclude.Count() > 0)
{
lock (files)
files.AddRange(allfiles.Where(p => !regex.Match(p).Success));
}
else
{
lock (files)
files.AddRange(allfiles);
}
}
));
workers.Add(worker);
worker.Start();
}
foreach (Thread worker in workers)
{
worker.Join();
}
return files.ToArray();
}
Usage:
foreach (string file in FindFiles(@"D:\628.2.11", @"!*.config, !*.js", SearchOption.AllDirectories))
{
Console.WriteLine(file);
}
What about
string[] filesPNG = Directory.GetFiles(path, "*.png");
string[] filesJPG = Directory.GetFiles(path, "*.jpg");
string[] filesJPEG = Directory.GetFiles(path, "*.jpeg");
int totalArraySizeAll = filesPNG.Length + filesJPG.Length + filesJPEG.Length;
List<string> filesAll = new List<string>(totalArraySizeAll);
filesAll.AddRange(filesPNG);
filesAll.AddRange(filesJPG);
filesAll.AddRange(filesJPEG);
Just found another way to do it. Still not one operation, but throwing it out to see what other people think about it.
private void getFiles(string path)
{
foreach (string s in Array.FindAll(Directory.GetFiles(path, "*", SearchOption.AllDirectories), predicate_FileMatch))
{
Debug.Print(s);
}
}
private bool predicate_FileMatch(string fileName)
{
if (fileName.EndsWith(".mp3"))
return true;
if (fileName.EndsWith(".jpg"))
return true;
return false;
}
I wonder why there are so many "solutions" posted?
If my rookie understanding of how GetFiles works is right, there are only two options, and any of the solutions above can be reduced to one of them:
1) GetFiles, then filter: fast, but a memory killer, because the full file list is held in memory until the filters are applied.
2) Filter while calling GetFiles: slower the more filters are set, but low memory usage, because no unfiltered list is stored. This is explained in one of the above posts with an impressive benchmark: each filter causes a separate GetFiles operation, so the same part of the hard drive gets read several times.
In my opinion option 1) is better, but using SearchOption.AllDirectories on folders like C:\ would use huge amounts of memory.
Therefore I would just write a recursive sub-method that goes through all subfolders using option 1).
This should cause only one GetFiles operation per folder and therefore be fast (option 1), but use only a small amount of memory, because the filters are applied after each subfolder is read, so the unfiltered list is discarded after each subfolder.
Please correct me if I am wrong. As I said, I am quite new to programming, but I want to gain a deeper understanding of things to eventually become good at this :)
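The per-folder idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (PerFolderSearch, FindFiles) are made up for this example, extensions are expected with a leading dot, and no handling is included for folders that throw UnauthorizedAccessException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PerFolderSearch
{
    // Hypothetical helper: one GetFiles call per folder, filtered before
    // descending, so only one folder's unfiltered listing is held at a time.
    public static IEnumerable<string> FindFiles(string root, ISet<string> extensions)
    {
        // Files directly inside this folder (top level only)
        foreach (string file in Directory.GetFiles(root))
        {
            if (extensions.Contains(Path.GetExtension(file)))
                yield return file;
        }

        // Recurse into each subfolder
        foreach (string subDir in Directory.GetDirectories(root))
        {
            foreach (string file in FindFiles(subDir, extensions))
                yield return file;
        }
    }
}
```

It might be called with a case-insensitive set, for example new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".mp3", ".jpg" }, so that ".JPG" and ".jpg" are treated alike.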
Here is a simple and elegant way of getting filtered files
var allowedFileExtensions = ".csv,.txt";
var files = Directory.EnumerateFiles(@"C:\MyFolder", "*.*", SearchOption.TopDirectoryOnly)
.Where(s => allowedFileExtensions.IndexOf(Path.GetExtension(s)) > -1).ToArray();
Put the extensions you want into one string, e.g. ".mp3.jpg.wma.wmf", and then check whether that string contains each file's extension.
This works with .NET 2.0 as it does not use LINQ.
string myExtensions=".jpg.mp3";
string[] files=System.IO.Directory.GetFiles("C:\\myfolder");
foreach(string file in files)
{
if(myExtensions.ToLower().Contains(System.IO.Path.GetExtension(file).ToLower()))
{
//this file has passed, do something with this file
}
}
The advantage with this approach is you can add or remove extensions without editing the code i.e to add png images, just write myExtensions=".jpg.mp3.png".
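One caveat with a plain Contains check on a concatenated string: a file with no extension yields an empty string, which every string contains, and a partial extension such as ".jp" also passes. A possible variation that keeps the configurable string but matches whole extensions only is sketched below. The names (ExtensionList, IsAllowed) and the ';' delimiter are assumptions made for this example, and it avoids LINQ so it should still suit .NET 2.0.

```csharp
using System;
using System.IO;

public static class ExtensionList
{
    // Hypothetical helper: exact, case-insensitive match of a file's extension
    // against a ';'-delimited list such as ".jpg;.mp3".
    public static bool IsAllowed(string file, string allowedExtensions)
    {
        string ext = Path.GetExtension(file);
        if (string.IsNullOrEmpty(ext))
            return false; // files without an extension never match

        foreach (string allowed in allowedExtensions.Split(';'))
        {
            if (string.Equals(allowed.Trim(), ext, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

Adding a type is still a matter of editing the string, e.g. ".jpg;.mp3;.png".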
/// <summary>
/// Returns the names of files in the specified directories that match the specified patterns using LINQ
/// </summary>
/// <param name="srcDirs">The directories to search</param>
/// <param name="searchPatterns">the list of search patterns</param>
/// <param name="searchOption"></param>
/// <returns>The list of files that match the specified pattern</returns>
public static string[] GetFilesUsingLINQ(string[] srcDirs,
string[] searchPatterns,
SearchOption searchOption = SearchOption.AllDirectories)
{
var r = from dir in srcDirs
from searchPattern in searchPatterns
from f in Directory.GetFiles(dir, searchPattern, searchOption)
select f;
return r.ToArray();
}
Nope... I believe you have to make as many calls as the file types you want.
I would create a function myself that takes an array of strings with the extensions I need and then iterates over that array, making all the necessary calls. That function would return a generic list of the files matching the extensions I'd sent.
Hope it helps.
I had the same problem and couldn't find the right solution so I wrote a function called GetFiles:
/// <summary>
/// Get all files with a specific extension
/// </summary>
/// <param name="extensionsToCompare">string list of all the extensions</param>
/// <param name="Location">string of the location</param>
/// <returns>array of all the files with the specific extensions</returns>
public string[] GetFiles(List<string> extensionsToCompare, string Location)
{
List<string> files = new List<string>();
foreach (string file in Directory.GetFiles(Location))
{
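// Path.GetExtension looks at the last dot of the file name, so dots elsewhere in the path do not break the check
if (extensionsToCompare.Contains(Path.GetExtension(file).TrimStart('.').ToLower())) files.Add(file);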
}
files.Sort();
return files.ToArray();
}
This function will call Directory.GetFiles() only once.
For example call the function like this:
string[] images = GetFiles(new List<string>{"jpg", "png", "gif"}, "imageFolder");
EDIT: To get one file with multiple extensions use this one:
/// <summary>
/// Get the file with a specific name and extension
/// </summary>
/// <param name="filename">the name of the file to find</param>
/// <param name="extensionsToCompare">string list of all the extensions</param>
/// <param name="Location">string of the location</param>
/// <returns>file with the requested filename</returns>
public string GetFile( string filename, List<string> extensionsToCompare, string Location)
{
foreach (string file in Directory.GetFiles(Location))
{
// Path helpers avoid manual index arithmetic on the full path
if (extensionsToCompare.Contains(Path.GetExtension(file).TrimStart('.').ToLower()) && Path.GetFileNameWithoutExtension(file).ToLower() == filename)
return file;
}
return "";
}
For example call the function like this:
string image = GetFile("imagename", new List<string>{"jpg", "png", "gif"}, "imageFolder");
Using the GetFiles search pattern to filter by extension is not safe!
For instance, if you have two files, Test1.xls and Test2.xlsx, and you want only the .xls file using the search pattern *.xls, GetFiles returns both Test1.xls and Test2.xlsx.
I was not aware of this and got an error in a production environment when some temporary files were suddenly handled as regular files. The search pattern was *.txt and the temp files were named *.txt20181028_100753898.
So the search pattern cannot be trusted; you have to add an extra check on the file names as well.
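The extra check could look something like the sketch below, which still passes the pattern to the file system but then compares the real extension exactly. The names (ExactExtension, EnumerateByExtension) are invented for this example, and the extension is assumed to be given with its leading dot. As far as I know, the prefix matching described above applies to three-character extensions on .NET Framework, so the second filter is harmless where it is not needed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExactExtension
{
    // Hypothetical helper: the search pattern narrows the listing, and the
    // Where clause rejects near misses such as ".xlsx" when ".xls" was asked for.
    public static IEnumerable<string> EnumerateByExtension(
        string path, string extension, SearchOption option)
    {
        return Directory.EnumerateFiles(path, "*" + extension, option)
            .Where(f => string.Equals(
                Path.GetExtension(f), extension, StringComparison.OrdinalIgnoreCase));
    }
}
```

For example, EnumerateByExtension(folder, ".xls", SearchOption.TopDirectoryOnly) would skip Test2.xlsx even if the pattern alone had let it through.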
Or you can just convert the string of extensions to String^
vector <string> extensions = { "*.mp4", "*.avi", "*.flv" };
for (int i = 0; i < extensions.size(); ++i)
{
String^ ext = gcnew String(extensions[i].c_str());
String^ path = "C:\\Users\\Eric\\Videos";
array<String^>^files = Directory::GetFiles(path,ext);
Console::WriteLine(ext);
cout << " " << (files->Length) << endl;
}
I don't know which solution is better, but I use this:
String[] ext = "*.ext1|*.ext2".Split('|');
List<String> files = new List<String>();
foreach (String tmp in ext)
{
files.AddRange(Directory.GetFiles(dir, tmp, SearchOption.AllDirectories));
}
You can add this to your project:
public static class Collectables {
public static List<System.IO.FileInfo> FilesViaPattern(this System.IO.DirectoryInfo fldr, string pattern) {
var filter = pattern.Split(' '); // char overload, available on every .NET version
return fldr.GetFiles( "*.*", System.IO.SearchOption.AllDirectories)
.Where(l => filter.Any(k => l.Name.EndsWith(k))).ToList();
}
}
Then use it anywhere like this:
new System.IO.DirectoryInfo("c:\\test").FilesViaPattern("txt doc any.extension");