How to limit EnumerateFiles to only 3 subdirectory levels in C#? - c#

Currently I can only use the EnumerateFiles method to search either every subdirectory or just the source directory. I want it to go 3 subdirectories deep and check only there.
For example, the second snippet checks only K:\SourceFolder.
The first snippet checks all the way down to K:\SourceFolder\JobName\Batches\Folder1\Folder11\Images.
It checks every folder, which hurts the performance and efficiency of the application.
I only need it to check down to K:\SourceFolder\JobName\Batches.
This code goes too far:
List<string> validFiles = new List<string>();
List<string> files = Directory.EnumerateFiles(folderPath, "*.*", SearchOption.AllDirectories).ToList();
foreach (var file in files)
This code doesn't go far enough:
List<string> files = Directory.EnumerateFiles(directory).ToList();

There are a few different ways to tackle this task, so I'll list a few examples to help you get started.
Using a loop to iterate all the folders and return only those at target depth:
List<string> ListFilesAtSpecificDepth(string rootPath, int targetDepth)
{
// Start from the root folder itself (depth 0) before we start iterating through subfolders.
var foldersAtCurrentDepth = new List<string> { rootPath };
for (int currentDepth = 0; currentDepth < targetDepth; currentDepth++)
{
// Using select many is a clean way to select the subfolders for multiple root folders into a flat list.
foldersAtCurrentDepth = foldersAtCurrentDepth.SelectMany(x => Directory.EnumerateDirectories(x)).ToList();
}
// After the loop we have a list of folders for the targetDepth only.
// Select many again to get all the files for all the folders.
return foldersAtCurrentDepth.SelectMany(x => Directory.EnumerateFiles(x, "*.*", SearchOption.TopDirectoryOnly)).ToList();
}
Another option is to use recursion, calling the same method again until we reach the desired depth. This can return either only the results at the target depth or all the results along the way. Since the example above returns only the target depth, this one returns all results, using a custom object to track the depth:
class FileSearchResult
{
public string FilePath {get;set;}
public int FolderDepthFromRoot {get;set;}
}
List<FileSearchResult> ListFilesUntilSpecificDepth(string rootPath, int maxDepth, int currentDepth = 0)
{
// Add all the files at the current level along with extra details like the depth.
var iterationResult = Directory.EnumerateFiles(rootPath, "*.*", SearchOption.TopDirectoryOnly)
.Select(x => new FileSearchResult
{
FilePath = x,
FolderDepthFromRoot = currentDepth
}).ToList();
if (currentDepth < maxDepth) // we need to go deeper.
{
var foldersUnderMe = Directory.EnumerateDirectories(rootPath);
// Add all the results for subfolders recursively by calling the same method again.
foreach (var subFolder in foldersUnderMe)
{
iterationResult.AddRange(ListFilesUntilSpecificDepth(subFolder, maxDepth, currentDepth + 1));
}
}
return iterationResult;
}
With an example for both looping and recursion covered, the last approach I want to touch on uses the structure of the file system itself. Instead of managing our own loops or recursion, we can use the original method to get all files recursively and then work out the depth of each result from its path. We know that '\' is the character Windows uses to delimit folders and that it's illegal in a folder or file name, so it should be pretty safe to count it to determine how many folders deep a file is. In this example I'll use the same custom class to track the results, so it should effectively return the same results as the recursive method but with unlimited depth.
class FileSearchResult
{
public string FilePath { get; set; }
public int FolderDepthFromRoot { get; set; }
}
List<FileSearchResult> ListFiles(string rootPath)
{
var allFiles = Directory.EnumerateFiles(rootPath, "*.*", SearchOption.AllDirectories).ToList();
int numberOfFoldersInRootPath = rootPath.TrimEnd('\\').Count(c => c == '\\') + 1; // backslashes in the path of a file directly inside the root, so those files get depth 0 like in the recursive method.
return allFiles.Select(filePath => new FileSearchResult
{
FilePath = filePath,
FolderDepthFromRoot = filePath.Count(c => c == '\\') - numberOfFoldersInRootPath
}).ToList();
}
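On recent .NET versions (not the classic .NET Framework), there may be a simpler route: the EnumerationOptions class has a MaxRecursionDepth property that caps how deep the built-in enumeration descends. The following is only a sketch under that assumption; the class and method names (DepthLimitedSearch, EnumerateFilesUpToDepth) are made up for illustration, and it is worth checking that your target framework actually exposes the property.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DepthLimitedSearch
{
    // Lazily enumerates files in rootPath and in subfolders at most maxDepth levels below it.
    // maxDepth = 0 means only the files directly inside rootPath.
    // Hypothetical helper: assumes EnumerationOptions.MaxRecursionDepth is available.
    public static IEnumerable<string> EnumerateFilesUpToDepth(string rootPath, int maxDepth)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true,
            MaxRecursionDepth = maxDepth,
            // Skip folders we are not allowed to read instead of throwing.
            IgnoreInaccessible = true
        };
        // Note: by default EnumerationOptions also skips hidden and system entries,
        // which differs from the SearchOption overloads.
        return Directory.EnumerateFiles(rootPath, "*", options);
    }
}
```

Unlike the AllDirectories approach, this does not walk folders below the requested depth at all, so it should avoid the performance cost the question describes.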

Related

How to get list of Direct Directories named something-* using C# in D:\myfolder\

How can I get a list of the direct subdirectories named something-* in D:\myfolder using C#?
I tried
String root = @"E:\something-*";
var directories = Directory.GetDirectories(root);
but it throws an error. In fact, even listing E:\ leaves the variable directories null.
I also looked for possible solutions on Stack Overflow and other forums but did not find an appropriate answer to my query.
Here is an example that builds an array of DirectoryInfo objects (directories) under Path that match SearchPatn.
So, if Path equals "D:\myfolders\" and SearchPatn equals "something-*", you'll get results like something-abc and something-xyz as folders that you can manipulate.
Caution with the searchOption: AllDirectories will search through all the folders below your path and return anything it finds. If you only want folders from your root, use the TopDirectoryOnly searchOption.
// this returns an array of folders based on the SearchPatn (i.e., the folders you're looking for)
private DirectoryInfo[] getSourceFolders(string Path, string SearchPatn)
{
System.IO.DirectoryInfo[] f = new DirectoryInfo(Path).GetDirectories(SearchPatn, searchOption: SearchOption.AllDirectories);
return f;
}
You need the overload of GetDirectories that takes a search pattern, and you need to prepare the correct pattern to search for:
String root = @"E:\something-*";
string parent = Path.GetDirectoryName(root);
// A little trick, here GetFilename will return "something-*"
string search = Path.GetFileName(root);
var dirs = Directory.GetDirectories(parent, search);
There is also a third overload that allows you to search the pattern recursively under the parent folder
var dirs = Directory.GetDirectories(parent, search, SearchOption.AllDirectories);
The problem you are experiencing is caused by system directories, such as System Volume Information, whose contents you don't have permission to read. MSDN has an example of how to overcome this situation, and it can be adapted to your requirements with some minor changes, like the one below:
// Call the WalkDirectoryTree with the parameters below
// Notice that I have removed the * in the search pattern
var dirs = WalkDirectoryTree(@"E:\", @"something-");
List<string> WalkDirectoryTree(string root, string search)
{
try
{
var files = Directory.GetFiles(root, "*.*");
}
// This is thrown if even one of the files requires permissions greater
// than the application provides.
catch (UnauthorizedAccessException e)
{
// This code just writes out the message and continues to recurse.
// You may decide to do something different here. For example, you
// can try to elevate your privileges and access the file again.
Console.WriteLine(e.Message);
return new List<string>();
}
catch (System.IO.DirectoryNotFoundException e)
{
Console.WriteLine(e.Message);
return new List<string>();
}
// Now find all the subdirectories under this directory.
List<string> subDirs = new List<string>();
List<string> curDirs = Directory.GetDirectories(root).ToList();
foreach (string s in curDirs)
{
// s is a full path, so compare only the folder name against the prefix.
if (Path.GetFileName(s).StartsWith(search))
subDirs.Add(s);
var result = WalkDirectoryTree(s, search);
subDirs.AddRange(result);
}
return subDirs;
}

How to search a directory for files that begin with something then get the one that was modified most recently

What I want to do is scan a directory for multiple files beginning with certain prefixes, then get the file that was modified most recently. For example, I want to search the directory Prefetch for files that begin with "apple", "pear", and "orange". These files may not exist, but if they do, and say there are files that begin with apple and files that begin with pear, then out of all of those files I want the one that was modified most recently. The code below lets me do this, but it searches for only one prefix.
DirectoryInfo prefetch = new DirectoryInfo("c:\\Windows\\Prefetch");
FileInfo[] apple = prefetch.GetFiles("apple*");
if (apple.Length == 0)
{
// Do something
}
else
{
double lastused = DateTime.Now.Subtract(
apple.OrderByDescending(x => x.LastWriteTime)
.FirstOrDefault().LastWriteTime).TotalMinutes;
int final = Convert.ToInt32(lastused);
}
Basically, how can I make that code search 'apple', 'pear' etc. instead of just apple? I don't know if you can modify the code above to do that or if you have to change it completely. I've been trying to figure this out for hours and can't do it.
As explained in my comments, you can't use DirectoryInfo.GetFiles to return a list of FileInfo for several different patterns. Only one pattern per call is supported.
As others have already shown, you can prepare a list of patterns and then call GetFiles in a loop for each pattern.
However, I would like to show you the same approach done in a single LINQ statement.
List<string> patterns = new List<string> { "apple*", "pear*", "orange*" };
DirectoryInfo prefetch = new DirectoryInfo(@"c:\Windows\Prefetch");
var result = patterns.SelectMany(x => prefetch.GetFiles(x))
.OrderByDescending(k => k.LastWriteTime)
.FirstOrDefault();
Now result is the FileInfo with the most recent update. Of course, if no files match the three patterns, result will be null, so a check before using that variable is mandatory.
You could collect the files that match the prefixes and then check the dates of those files, something like this (not tested):
// GetFiles returns FileInfo objects, so the list has to hold FileInfo rather than string.
List<FileInfo> files = new List<FileInfo>();
foreach (var str in prefixes)
files.AddRange(dirInfo.GetFiles(str));
// Most recently modified file, or null if nothing matched.
return (from f in files orderby f.LastWriteTime descending select f).FirstOrDefault();
prefixes is the list of search patterns, and dirInfo is a DirectoryInfo object.
You can iterate over a list
List<string> patterns = new List<string> { "apple*", "pear*", "orange*" };
DirectoryInfo prefetch = new DirectoryInfo("c:\\Windows\\Prefetch");
foreach (var pattern in patterns) {
FileInfo[] files = prefetch.GetFiles(pattern);
var lastAccessed = files.OrderByDescending(x => x.LastAccessTime).FirstOrDefault();
if (lastAccessed != null) {
var minutes = DateTime.Now.Subtract(lastAccessed.LastAccessTime).TotalMinutes;
}
}

Method to get directory and subdirectories and count the number of files with a specific extension in each level

I currently have a method that lists all sub-directories and I think I need to supplement it for another method.
private static List<string> GetDirectories(string directory, string searchPattern)
{
try
{
return Directory.GetDirectories(directory, searchPattern).ToList();
}
catch (UnauthorizedAccessException)
{
return new List<string>();
}
}
I then call it like this:
var directories = GetDirectories(directory, fileExtension);
I can list all sub-directories on the next level but not the level inside of it. The catch is my code won't exit if there's a folder I don't have access to.
e.g. when I pass "C:\" and "*.*" I can get
C:\Folder1
C:\Folder2
C:\Folder3
but not the folders inside of it.
I am trying to build a List so that if I pass C:\ and *.xls, I get the result below:
Directory | File Count
C:\Folder1 | 3 (3 files under \Folder with an .xls extension)
C:\Folder\Sub | 2
C:\Folder2 | 5
and so on.
Thank you in advance.
You need a recursive function to run your search query against each child directory found.
void GetChildDirectories(string path, string pattern, Dictionary<string, int> stats)
{
try
{
var children = Directory.GetDirectories(path);
var count = Directory.GetFiles(path, pattern).Length;
stats[path] = count;
foreach (var child in children)
GetChildDirectories(child, pattern, stats);
}
catch (UnauthorizedAccessException e)
{
stats[path] = -1;
return;
}
}
Once this function returns, you can print results like this:
var stats = new Dictionary<string, int>();
string path = "C:\\", pattern = "*.txt";
GetChildDirectories(path, pattern, stats);
Console.WriteLine("Directory | Count");
foreach (var key in stats.Keys)
{
if (stats[key] == -1)
Console.WriteLine("Unable to access path: {0}", key);
else
Console.WriteLine("{0} | {1}", key, stats[key]);
}
As written in the example, the recursion is likely to blow up in your face, badly. These days there are thousands of directories in %systemdrive%. A more stable solution would be stack-based, non-recursive iteration.
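A stack-based version of the same count could look roughly like the sketch below. It keeps the behaviour of GetChildDirectories (a count per folder, -1 for folders that can't be read) but replaces the recursive call with an explicit Stack of pending paths. The names DirectoryStats and CountFilesPerDirectory are hypothetical and only used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DirectoryStats
{
    // Counts files matching pattern in path and in every reachable subdirectory,
    // using an explicit stack instead of recursion.
    // A value of -1 marks a folder that could not be accessed.
    public static Dictionary<string, int> CountFilesPerDirectory(string path, string pattern)
    {
        var stats = new Dictionary<string, int>();
        var pending = new Stack<string>();
        pending.Push(path);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            try
            {
                string[] children = Directory.GetDirectories(current);
                stats[current] = Directory.GetFiles(current, pattern).Length;

                // Queue the subfolders instead of recursing into them.
                foreach (string child in children)
                    pending.Push(child);
            }
            catch (UnauthorizedAccessException)
            {
                stats[current] = -1;
            }
        }
        return stats;
    }
}
```

The pending work lives on the heap rather than on the call stack, so the depth of the folder tree no longer affects stack usage. The results can be printed with the same loop as in the answer above.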

How to efficiently compare two lists with 500k objects and strings

So I have a main directory with subfolders and around 500k images. I know a lot of these images do not exist in my database, and I want to know which ones so that I can delete them.
This is the code i have so far:
var listOfAdPictureNames = ImageDB.GetAllAdPictureNames();
var listWithFilesFromImageFolder = ImageDirSearch(adPicturesPath);
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
var differenceList = result.ToList();
listOfAdPictureNames is of type List<string>.
Here is the model that I'm returning from ImageDirSearch:
public class CheckNotUsedAdImagesModel
{
public List<ImageDirModel> ListWithUnusedAdImages { get; set; }
}
public class ImageDirModel
{
public string FileName { get; set; }
public string Path { get; set; }
}
and here is the recursive method to get all images from my folder.
private List<ImageDirModel> ImageDirSearch(string path)
{
string adPicturesPath = ConfigurationManager.AppSettings["AdPicturesPath"];
List<ImageDirModel> files = new List<ImageDirModel>();
try
{
foreach (string f in Directory.GetFiles(path))
{
var model = new ImageDirModel();
model.Path = f.ToLower();
model.FileName = Path.GetFileName(f.ToLower());
files.Add(model);
}
foreach (string d in Directory.GetDirectories(path))
{
files.AddRange(ImageDirSearch(d));
}
}
catch (System.Exception excpt)
{
throw new Exception(excpt.Message);
}
return files;
}
The problem I have is that this row:
var result = listWithFilesFromImageFolder.Where(p => !listOfAdPictureNames.Any(q => p.FileName == q));
takes over an hour to complete. I want to know if there is a better way to check whether my images folder contains images that don't exist in my database.
Here is the method that gets all the image names from my database layer:
public static List<string> GetAllAdPictureNames()
{
List<string> ListWithAllAdFileNames = new List<string>();
using (var db = new DatabaseLayer.DBEntities())
{
ListWithAllAdFileNames = db.ad_pictures.Select(b => b.filename.ToLower()).ToList();
}
if (ListWithAllAdFileNames.Count < 1)
return new List<string>();
return ListWithAllAdFileNames;
}
Perhaps Except is what you're looking for. Something like this:
var filesInFolderNotInDb = listWithFilesFromImageFolder.Select(p => p.FileName).Except(listOfAdPictureNames).ToList();
Should give you the files that exist in the folder but not in the database.
Instead of repeating a linear search of the second list for every file, sort the second list, listOfAdPictureNames, once (any O(n log n) sort will do) and then check for existence with a binary search. That makes each lookup O(log n), whereas the current approach scans the whole list for every file and is quadratic overall.
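The same idea of building a lookup structure once can also be sketched with a HashSet, which gives roughly constant-time lookups and can ignore case without any ToLower calls. This is an illustrative sketch, not the code from the question: UnusedImageFinder and FindFilesNotInDatabase are hypothetical names, and it assumes both inputs are plain file names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class UnusedImageFinder
{
    // Returns the file names found on disk that have no match in the database list.
    // Names are compared case-insensitively.
    public static List<string> FindFilesNotInDatabase(
        IEnumerable<string> fileNamesOnDisk,
        IEnumerable<string> fileNamesInDatabase)
    {
        // Built once, so each later lookup is a hash probe instead of a list scan.
        var known = new HashSet<string>(fileNamesInDatabase, StringComparer.OrdinalIgnoreCase);

        return fileNamesOnDisk.Where(name => !known.Contains(name)).ToList();
    }
}
```

With 500k entries on each side this does on the order of 500k hash lookups, rather than up to 500k × 500k string comparisons.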
As I said in my comment, you seem to have recreated the FileInfo class, you don't need to do this, so your ImageDirSearch can become the following
private IEnumerable<string> ImageDirSearch(string path)
{
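// AllDirectories keeps the recursive behaviour of the original method; only the file names are returned.
return Directory.EnumerateFiles(path, "*.jpg", SearchOption.AllDirectories).Select(f => Path.GetFileName(f));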
}
There doesn't seem to be much gained by returning the whole file info when you only need the file name. This version also only finds .jpg files, but that can be changed.
The ToLower calls are quite expensive and a bit pointless, and so is the ToList when you are planning on querying again, so you can get rid of both and return an IEnumerable instead (this is in the GetAllAdPictureNames method).
Then your comparison can use equals and ignore case.
!listOfAdPictureNames.Any(q => p.Equals(q, StringComparison.InvariantCultureIgnoreCase));
One more thing that will probably help is removing items from the list of file names as they are found, this should make the searching of the list quicker every time one is removed since there is less to iterate through.

Get list of titles from xml files

I am trying to get the titles of XML files from a folder called "bugs".
My code:
public virtual List<IBug> FillBugs()
{
string folder = xmlStorageLocation + "bugs" + Path.DirectorySeparatorChar;
List<IBug> bugs = new List<IBug>();
foreach (string file in Directory.GetFiles(folder, "*.xml", SearchOption.TopDirectoryOnly))
{
var q = from b in bugs
select new IBug
{
Title = b.Title,
Id = b.Id,
};
return q.ToList();
}
return bugs;
}
But I'm not getting the titles from all the XML files in the folder "bugs".
The biggest problem is getting each file as a single string and not a string[].
Your code as written doesn't make any sense. Perhaps you meant something more like this:
public virtual List<IBug> FillBugs()
{
// is this actually correct or did you mix up the concatenation order?
// either way, I suggest Path.Combine() instead
string folder = xmlStorageLocation + "bugs" + Path.DirectorySeparatorChar;
List<IBug> bugs = new List<IBug>();
foreach (string file in Directory.GetFiles(folder, "*.xml",
SearchOption.TopDirectoryOnly))
{
// i guess IBug is not actually an interface even though it starts
// with "I" since you made one in your code
bugs.Add(new IBug {
Title = file, Id = 0 /* don't know where you get an ID */ });
}
return bugs;
}
"from b in bugs" selects from an empty list. You need to initialize bugs from the file at the start of your foreach loop.
Do you need a backslash (Path.DirectorySeparatorChar) between xmlStorageLocation and "bugs"?
You don't use file anywhere in your loop. Is that correct, or did you forget to push it into the collection?
