searching for multiple keywords using getfiles - c#

I'm using the code below to look for files whose names contain any of these keywords. For now I can only search for kword1 and, if no files are found, repeat the search for kword2. I'm wondering whether there is a more efficient way to look for kword1 or kword2 in the same search. I've looked at multiple sources and can't seem to find a way.
Here is what I've done so far:
string[] matches =
Directory
.GetFiles(
path1,
"*" + kword1 + "*.txt",SearchOption.AllDirectories
);

Here are a couple of tricks you can do using System.Linq. First set up the test run like this:
var directoryToSearch = AppDomain.CurrentDomain.BaseDirectory;
string
kword1 = "exe",
kword2 = "pdb";
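GetFiles retrieves all the files; the Where expression then filters them according to kword1 and kword2. Note that GetFiles returns full paths, so Contains also matches a keyword that appears in a directory name.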
var files =
Directory
.GetFiles(
directoryToSearch,
"*",
SearchOption.AllDirectories)
.Where(filename=> filename.Contains(kword1) || filename.Contains(kword2));
The Select expression selects only the file name portion of the full path.
Console.WriteLine(
string.Join(
Environment.NewLine,
files.Select(file=>Path.GetFileName(file))));
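If the number of keywords grows, the same filter can be driven by an array instead of a chain of || conditions. The following is only a minimal sketch and not part of the original answer: the keywords array, the scratch directory, and the sample file names are all made up for illustration. It tests the file name only (via Path.GetFileName) so that directory names do not produce hits, and it ignores case.

```csharp
using System;
using System.IO;
using System.Linq;

class KeywordSearchSketch
{
    static void Main()
    {
        // Hypothetical test data: a scratch directory with three empty files.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        foreach (string name in new[] { "report_alpha.txt", "beta_notes.txt", "other.txt" })
        {
            File.WriteAllText(Path.Combine(root, name), "");
        }

        // Hypothetical keywords; any number of entries works.
        string[] keywords = { "alpha", "beta" };

        // The wildcard narrows the search to .txt files; the predicate keeps
        // the files whose name contains at least one keyword.
        var matches = Directory
            .EnumerateFiles(root, "*.txt", SearchOption.AllDirectories)
            .Where(path => keywords.Any(k =>
                Path.GetFileName(path).IndexOf(k, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();

        foreach (string path in matches)
        {
            Console.WriteLine(Path.GetFileName(path));
        }

        // Clean up the scratch directory.
        Directory.Delete(root, recursive: true);
    }
}
```

The directory is still walked only once; the keyword test runs in memory on each path as it is enumerated.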

Related

LINQ Query Not Selecting Files

I am trying to LINQ query a set of files where I can find the file names with a specific string in them.
I was using:
var docs = Directory.EnumerateFiles(searchFolder, "* " + strNumber + "*", SearchOption.AllDirectories);
That was working fine, but some of my file searches were taking 30+ minutes due to the fact that one of the directories has 1+ million files. I was hoping to speed up the search process with a PLINQ query. However, while my syntax is good, I'm not getting the results I would expect. It looks like my problem may be in the Where statement. Any help would be helpful.
foreach (strNumber in strNumbers)
{
DirectoryInfo searchDirectory = new DirectoryInfo(searchFolder);
IEnumerable<System.IO.FileInfo> allDocs = searchDirectory.EnumerateFiles("*", SearchOption.AllDirectories);
IEnumerable<System.IO.FileInfo> docsToProcess = strNumbers
.SelectMany(strNumber => allDocs
.Where(file => file.Name.Contains(strNumber)))
.Distinct();
}
Any help would be much appreciated.
I would change the order of the problem:
1. Create a list of all files (in memory).
2. Perform the search over the in-memory list.
Then you can use Parallel.ForEach over the in-memory array, and your disk usage is limited to the initial enumeration.
var searchDirectory = new DirectoryInfo(searchFolder);
var allDocs = searchDirectory.EnumerateFiles("*", SearchOption.AllDirectories).ToArray();
// For extra points, use a Parallel.ForEach here for multi-threaded work
Parallel.ForEach(strNumbers, strNumber =>
{
    // Work on allDocs here, it should be in memory
});
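To make the body of that loop concrete, here is one possible way to fill it in. This is a sketch under assumptions rather than the answerer's code: the searchFolder and strNumbers values are placeholders, and the choice of a ConcurrentDictionary keyed by full path (to collect results safely from several threads and drop duplicates) is mine.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

class InMemorySearchSketch
{
    static void Main()
    {
        // Hypothetical inputs: replace with the real folder and search strings.
        string searchFolder = AppDomain.CurrentDomain.BaseDirectory;
        string[] strNumbers = { "exe", "dll" };

        // One pass over the disk; everything after this works on the array in memory.
        FileInfo[] allDocs = new DirectoryInfo(searchFolder)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .ToArray();

        // Thread-safe collection, because several iterations add to it concurrently.
        // Keying by full path keeps a file only once even if it matches two strings.
        var matches = new ConcurrentDictionary<string, FileInfo>();

        Parallel.ForEach(strNumbers, strNumber =>
        {
            foreach (FileInfo file in allDocs.Where(f => f.Name.Contains(strNumber)))
            {
                matches.TryAdd(file.FullName, file);
            }
        });

        Console.WriteLine(matches.Count + " matching file(s)");
    }
}
```

Whether the parallel loop helps depends on how many search strings there are; the enumeration itself is still a single sequential walk of the directory tree.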

How do I iterate through a directory stopping at a folder that excludes a specific character?

I would like to iterate through a directory and stop at the first folder that doesn't end in "#"
This is what I tried so far (based on another question from this site):
string rootPath = "D:\\Pending\\Engineering\\Parts\\3";
string targetPattern = "*#";
string fullPath = Directory
.EnumerateFiles(rootPath, targetPattern, SearchOption.AllDirectories)
.FirstOrDefault();
if (fullPath != null)
Console.WriteLine("Found " + fullPath);
else
Console.WriteLine("Not found");
I know *# isn't correct, no idea how to do that part.
I'm also having problems with SearchOption: Visual Studio says it's "an ambiguous reference."
Eventually I want the code to get the name of this folder and use it to rename a different folder.
FINAL SOLUTION
I ended up using a combination of the answers from dasblikenlight and user3601887:
string fullPath = Directory
.GetDirectories(rootPath, "*", System.IO.SearchOption.TopDirectoryOnly)
.FirstOrDefault(fn => !fn.EndsWith("#"));
Since the search pattern does not support regular expressions (only the * and ? wildcards), you need to get all directories and do the filtering on the C# side:
string fullPath = Directory
    .EnumerateDirectories(rootPath, "*", SearchOption.AllDirectories)
    .FirstOrDefault(fn => !fn.EndsWith("#"));
Or just replace EnumerateFiles with GetDirectories
string fullPath = Directory
.GetDirectories(rootPath, "*#", SearchOption.AllDirectories)
.FirstOrDefault();

find a file name knowing only the extension name

I am trying to verify whether a file exists in a C# console program. The catch is that the file can have any name.
All I know is the file extension, and there can be only one file with this extension. How can I verify that it exists and then use it, whatever its name is?
The problem with using Directory.GetFiles() is that it walks the entire directory tree first, then returns all matches as an array. Even if the very first file examined is the one and only match, it still walks the whole tree under the specified root before returning that match.
Instead, use EnumerateFiles() to do a lazy walk that stops when the first match is encountered:
DirectoryInfo root = new DirectoryInfo(@"C:\");
string pattern = "*.DesiredFileExtension";
FileInfo desiredFile = root.EnumerateFiles(pattern, SearchOption.AllDirectories)
    .First();
It will throw an exception if the file's not found. Use FirstOrDefault() to get a null value instead.
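The FirstOrDefault() variant might look like the sketch below. The root directory and the "*.config" pattern are placeholders chosen for illustration, and the search is limited to the top directory to keep the example small. Be aware that a recursive walk from a drive root can throw UnauthorizedAccessException when it reaches a folder the process is not allowed to read, so real code may need to handle that.

```csharp
using System;
using System.IO;
using System.Linq;

class FirstMatchSketch
{
    static void Main()
    {
        // Hypothetical root and extension; substitute your own values.
        var root = new DirectoryInfo(AppDomain.CurrentDomain.BaseDirectory);
        string pattern = "*.config";

        // FirstOrDefault stops the lazy enumeration at the first match
        // and returns null when nothing matches.
        FileInfo desiredFile = root
            .EnumerateFiles(pattern, SearchOption.TopDirectoryOnly)
            .FirstOrDefault();

        Console.WriteLine(desiredFile != null
            ? "Found " + desiredFile.FullName
            : "No file matches " + pattern);
    }
}
```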
Try the Directory.GetFiles static method:
var fileMatches = Directory.GetFiles("folder to start search in", "*.exe", SearchOption.AllDirectories);
if (fileMatches.Length == 1)
{
//my file was found
//fileMatches[0] contains the path to my file
}
Note that with the SearchOption enum you can specify just the current folder or to search recursively.
string extension = "txt";
string dir = @"C:\";
var file = Directory.GetFiles(dir, "*." + extension).FirstOrDefault();
if (file != null)
{
Console.WriteLine(file);
}
If the file does not exist directly under 'dir', you will need to use SearchOption.AllDirectories for Directory.GetFiles
Something like this may work
if (Directory.GetFiles(path, "*.ext").Any())
{
    var file = Directory.GetFiles(path, "*.ext").First();
}

How to check if filename contains substring in C#

I have a folder with files named
myfileone
myfiletwo
myfilethree
How can I check if file "myfilethree" is present.
Is there a method other than File.Exists() for this, for example one that checks whether a file name contains the substring "three"?
Substring:
bool contains = Directory.EnumerateFiles(path).Any(f => f.Contains("three"));
Case-insensitive substring:
bool contains = Directory.EnumerateFiles(path).Any(f => f.IndexOf("three", StringComparison.OrdinalIgnoreCase) >= 0);
Case-insensitive comparison:
bool contains = Directory.EnumerateFiles(path).Any(f => String.Equals(Path.GetFileName(f), "myfilethree", StringComparison.OrdinalIgnoreCase));
Get file names matching a wildcard criteria:
IEnumerable<string> files = Directory.EnumerateFiles(path, "three*.*"); // lazy file system lookup
string[] files = Directory.GetFiles(path, "three*.*"); // not lazy
If I understand your question correctly, you could do something like
Directory.GetFiles(directoryPath, "*three*")
or
Directory.GetFiles(directoryPath).Where(f => f.Contains("three"))
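Both of these will give you the full paths of all files with "three" in their names. Note that the second form tests the whole path, so it also matches when "three" appears in a directory name.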
I am not that familiar with IO but maybe this would work ? Requires using System.Linq
System.IO.Directory.GetFiles("PATH").Where(s => s.Contains("three"));
EDIT: Note that this returns an IEnumerable<string>; call ToArray() on it if you need an array of strings.

How to search for Files using GetFiles method (multiple criteria..)

The code below obviously searches a directory for Files that contain the word "FINAL" but what I'm wondering is can I add to its search criteria? I have a Well_Name and Actual_Date strings that I would like to search for in the File names in addition to the "FINAL" word. Thoughts? Thanks in advance.
DirectoryInfo myDir = new DirectoryInfo("C://DWGs");
var files = myDir.GetFiles("FINAL");
//Can I do something like this to add to my search criteria?
var files = myDir.GetFiles("FINAL" +
drow["Well_Name"].ToString() +
drow["Actual_Date"]);
var files = myDir.GetFiles()
    .Where(f => f.Name.Contains("FINAL") ||
                f.Name.Contains(drow["Well_Name"].ToString()) ||
                f.Name.Contains(drow["Actual_Date"].ToString()));
Since GetFiles() returns an array of FileInfo objects, you can use LINQ to check every file name against whatever criteria you want.
If you want to get really generic on this you could write a function that looks like this
public IEnumerable<FileInfo> addCriteria(IEnumerable<FileInfo> FileList,
                                         List<String> searchCriteria)
{
    var newFileList = FileList;
    foreach (String criteria in searchCriteria)
    {
        // Each pass narrows the list, so a file must match every criterion.
        newFileList = newFileList.Where(f => f.Name.Contains(criteria));
    }
    return newFileList;
}
The GetFiles method does not support multiple search patterns in one call, but there is a simple way around this limitation. Call GetFiles once for each file extension, then merge the returned arrays into a List<>. Finally, use the List's ToArray method to convert the List back to an array.
I used this approach, and it works for me
The code is below (do not forget to add "using System.Collections.Generic;" for List<>):
// Get the DirectoryInfo and FileInfo objects for aspx and html files.
FileInfo[] files_aspx = dir.GetFiles("*.aspx");
FileInfo[] files_html = dir.GetFiles("*.html");
List<FileInfo> files = new List<FileInfo>();
files.AddRange(files_aspx);
files.AddRange(files_html);
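// ToArray returns a new array; assign it, otherwise the result is discarded.
FileInfo[] allFiles = files.ToArray();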
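The same merge can be written more compactly with LINQ. The sketch below is an illustration under assumptions, not the answerer's code: the directory and the patterns array are placeholders, and the GroupBy step is an extra guard I added in case one file matches more than one pattern.

```csharp
using System;
using System.IO;
using System.Linq;

class MultiPatternSketch
{
    static void Main()
    {
        // Hypothetical directory and patterns.
        var dir = new DirectoryInfo(AppDomain.CurrentDomain.BaseDirectory);
        string[] patterns = { "*.aspx", "*.html" };

        // One GetFiles call per pattern, flattened into a single array.
        FileInfo[] files = patterns
            .SelectMany(p => dir.GetFiles(p))
            .GroupBy(f => f.FullName)   // guard against a file matching two patterns
            .Select(g => g.First())
            .ToArray();

        Console.WriteLine(files.Length + " file(s) found");
    }
}
```

This still scans the directory once per pattern. For a very large directory, a single enumeration with "*" followed by an in-memory filter on the extension may be cheaper.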
