Stop implicit wildcard in Directory.GetFiles() - c#

string[] fileEntries = Directory.GetFiles(pathName, "*.xml");
This also returns files like foo.xml_. Is there a way to force it not to do so, or will I have to write code to filter the returned results?
This is the same behavior as dir *.xml at the command prompt, but different from searching for *.xml in Windows Explorer.

This behavior is by design. From MSDN (look at the note section and examples given):
A searchPattern with a file extension
of exactly three characters returns
files having an extension of three or
more characters, where the first three
characters match the file extension
specified in the searchPattern.
You could limit it as follows:
C# 2.0:
string[] fileEntries = Array.FindAll(Directory.GetFiles(pathName, "*.xml"),
delegate(string file) {
return String.Compare(Path.GetExtension(file), ".xml", StringComparison.CurrentCultureIgnoreCase) == 0;
});
// or
string[] fileEntries = Array.FindAll(Directory.GetFiles(pathName, "*.xml"),
delegate(string file) {
return Path.GetExtension(file).Length == 4;
});
C# 3.0:
string[] fileEntries = Directory.GetFiles(pathName, "*.xml").Where(file =>
Path.GetExtension(file).Length == 4).ToArray();
// or
string[] fileEntries = Directory.GetFiles(pathName, "*.xml").Where(file =>
String.Compare(Path.GetExtension(file), ".xml",
StringComparison.CurrentCultureIgnoreCase) == 0).ToArray();

It's due to the 8.3 search method of Windows. If you try to search for "*.xm" you'll get 0 results.
You can use this in .NET 2.0:
string[] fileEntries =
Array.FindAll<string>(System.IO.Directory.GetFiles(pathName, "*.xml"),
new Predicate<string>(delegate(string s)
{
return System.IO.Path.GetExtension(s) == ".xml";
}));
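The filters above can be folded into one reusable helper. The following is only a minimal sketch, not a framework API: the class name `ExactExtension` and both of its methods are hypothetical, and it assumes an ordinal, case-insensitive comparison is acceptable for extensions.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ExactExtension
{
    // True only when the file's extension equals `extension` (e.g. ".xml"),
    // ignoring case. "foo.xml_" has the extension ".xml_" and is rejected.
    public static bool Matches(string file, string extension)
    {
        return string.Equals(Path.GetExtension(file), extension,
                             StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical wrapper: asks GetFiles for "*<extension>" and then
    // drops extra hits such as "foo.xml_".
    public static string[] GetFiles(string directory, string extension)
    {
        return Directory.GetFiles(directory, "*" + extension)
                        .Where(f => Matches(f, extension))
                        .ToArray();
    }
}
```

A call such as `ExactExtension.GetFiles(pathName, ".xml")` would then replace the original `Directory.GetFiles(pathName, "*.xml")`. Comparing the extension itself, rather than its length, also works for extensions that are not three characters long.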

Related

Path as part of pattern matching for method similar to Directory.GetFiles

Directory.GetFiles has an overload that takes a path and a search pattern:
var files = Directory.GetFiles(@"c:\path\to\folder", "*.txt");
to return files within a specified folder, which match the pattern. Is there a built-in .NET method that takes the path as part of the search pattern?
var files1 = Something.GetFiles(@"c:\path\to\folder\*.txt");
No, there isn't anything like that, but I have had this need countless times. Fortunately it's easy to write:
public string[] SearchFiles(string query)
{
return Directory.GetFiles(
Path.GetDirectoryName(query),
Path.GetFileName(query));
}
A less raw version may handle more special cases (if you need it):
public string[] SearchFiles(string query)
{
if (IsDirectory(query))
return Directory.GetFiles(query, "*.*");
return Directory.GetFiles(
Path.GetDirectoryName(query),
Path.GetFileName(query));
}
private static bool IsDirectory(string path)
{
if (String.IsNullOrWhiteSpace(path))
return false;
if (path[path.Length - 1] == Path.DirectorySeparatorChar)
return true;
if (path[path.Length - 1] == Path.AltDirectorySeparatorChar)
return true;
if (path.IndexOfAny(Path.GetInvalidPathChars()) != -1)
return false;
return Directory.Exists(path);
}
With this new version (see IsDirectory() code) you may use it like this:
SearchFiles(@"c:\windows\*.*");
SearchFiles(@"c:\windows\");
SearchFiles(@"c:\windows");

C# Directory.GetFiles with mask

In C#, I would like to get all files from a specific directory that match the following mask:
prefix is "myfile_"
suffix is a number
file extension is xml
e.g.
myfile_4.xml
myfile_24.xml
the following files should not match the mask:
_myfile_6.xml
myfile_6.xml_
The code should look something like this (maybe a LINQ query can help):
string[] files = Directory.GetFiles(folder, "???");
Thanks
I am not good with regular expressions, but this might help -
var myFiles = from file in System.IO.Directory.GetFiles(folder, "myfile_*.xml")
where Regex.IsMatch(System.IO.Path.GetFileName(file), @"^myfile_[0-9]+\.xml$", RegexOptions.IgnoreCase) // anchored pattern with the dot escaped, tested against the file name only
select file;
You can try it like:
string[] files = Directory.GetFiles("C:\\test", "myfile_*.xml");
//This will give you all the files with `xml` extension and starting with `myfile_`
//but this will also give you files like `myfile_ABC.xml`
//to filter them out
int temp;
List<string> selectedFiles = new List<string>();
foreach (string str in files)
{
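string fileName = Path.GetFileNameWithoutExtension(str);
string[] tempArray = fileName.Split('_');
// the extension check rejects names such as MyFile_6.xml_, which the "*.xml" pattern can also return
if (tempArray.Length == 2 && int.TryParse(tempArray[1], out temp)
&& Path.GetExtension(str).Equals(".xml", StringComparison.OrdinalIgnoreCase))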
{
selectedFiles.Add(str);
}
}
So if your Test folder has files:
myfile_24.xml
MyFile_6.xml
MyFile_6.xml_
myfile_ABC.xml
_MyFile_6.xml
Then you will get in selectedFiles
myfile_24.xml
MyFile_6.xml
You can do something like:
Regex reg = new Regex(@"^myfile_\d+\.xml$");
IEnumerable<string> files = Directory.GetFiles("C:\\").Where(fileName => reg.IsMatch(Path.GetFileName(fileName)));
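For the mask described in the question, the pattern has to be anchored at both ends and tested against the file name rather than the full path, otherwise names like _myfile_6.xml or myfile_6.xml_ slip through. A minimal sketch under those assumptions follows; the class name `NumberedFileMask` is hypothetical, and case-insensitive matching is an assumption based on the sample file names.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class NumberedFileMask
{
    // ^ and $ anchor the whole name, \d+ requires at least one digit,
    // and \. matches a literal dot rather than any character.
    private static readonly Regex Mask =
        new Regex(@"^myfile_\d+\.xml$", RegexOptions.IgnoreCase);

    public static bool Matches(string path)
    {
        // Test the file name only, so directory names cannot affect the match.
        return Mask.IsMatch(Path.GetFileName(path));
    }
}
```

It could be combined with a coarse search pattern, for example `Directory.GetFiles(folder, "myfile_*.xml").Where(f => NumberedFileMask.Matches(f)).ToArray()`, so that the file system does the first pass and the regular expression only does the strict check.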

Buffer file names in a given directory

I'm trying to find a way to buffer FileNames from a given directory in C#. By this I mean:
Given directory
C:/MyDir
Which contains files:
File1_orig.txt
File1_edited.txt
File2_orig.txt
File2_edited.txt
...
Filen_orig.txt
Filen_edited.txt
I want to store the file names (not the whole path, just the name, e.g. Filen_orig.txt) in temporary strings and run a simple comparison on them to see if they contain a target string.
I would like to pass the strings into:
while (STILL FILES IN DIRECTORY)
{
    string exFileName = {BUFFER FILENAME HERE};
    // drop the extension first, otherwise the last word would be "edited.txt"
    string[] words = Path.GetFileNameWithoutExtension(exFileName).Split('_');
    string toCompare = "edited";
    foreach (string word in words)
    {
        Console.WriteLine(word);
        bool result = toCompare.Equals(word, StringComparison.OrdinalIgnoreCase);
        if (result)
        {
            Console.WriteLine("success");
        }
    }
}
Console.ReadLine();
This is to check whether the file being examined is edited (*_edited.txt) or an original (*_orig.txt) and, if it is edited, to process it further.
Does anyone know how to automate reading the file names?
Thank you very much.
If you want to see whether any files contain the _edited bit, you can use:
bool success = Directory.GetFiles(@"c:\MyDir").Any(p => p.Contains("_edited"));
I'm making a bit of a guess that this is what you want, because your code isn't very clear (nor is your description).
Edit: to show all edited files:
foreach (var file in Directory.GetFiles(@"c:\MyDir").Where(p => p.Contains("_edited")))
{
Console.WriteLine(" {0}: edited", file);
}
Also, you must add using System.Linq; for this to compile.
How about DirectoryInfo.GetFiles?
DirectoryInfo di = new DirectoryInfo(@"c:\");
// Get only the files with a .txt extension.
FileInfo[] files = di.GetFiles("*.txt");
foreach (FileInfo fi in files)
{
string exFileName = fi.Name; // file name without the directory part
...
}
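Putting the pieces together, the question's loop can be reduced to a predicate on the bare file name plus a directory enumeration. This is a sketch under assumptions: the class name `EditedFiles` and its methods are hypothetical, and it assumes an edited file is one whose name, without extension, ends in "_edited".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class EditedFiles
{
    // "File1_edited.txt" -> true, "File1_orig.txt" -> false.
    public static bool IsEdited(string fileName)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName);
        return stem.EndsWith("_edited", StringComparison.OrdinalIgnoreCase);
    }

    // Returns the bare file names (no directory part) of the edited files.
    public static List<string> GetEditedNames(string directory)
    {
        return Directory.EnumerateFiles(directory)
                        .Select(path => Path.GetFileName(path))
                        .Where(name => IsEdited(name))
                        .ToList();
    }
}
```

Checking the end of the name rather than using `Contains("_edited")` on the full path avoids false positives when a parent folder, or the front of a file name, happens to contain the same text.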

Remove part of the full directory name?

I have a list of file names with full paths. From each one I need to remove the file name and part of the path, based on a filter list I have.
Path.GetDirectoryName(file)
does part of the job, but I was wondering if there is a simple way to filter the paths using .NET 2.0 to remove part of them.
For example:
If the path plus file name is C:\my documents\my folder\my other folder\filename.exe and all I need is what comes after my folder\, then I need to extract only my other folder from it.
UPDATE:
The filter list is a text box with folder names separated by commas, so I only have partial names in it; in the example above the filter would be my folder.
Current Solution based on Rob's code:
string relativeFolder = null;
string file = @"C:\foo\bar\magic\bar.txt";
string folder = Path.GetDirectoryName(file);
string[] paths = folder.Split(Path.DirectorySeparatorChar);
string[] filterArray = iFilter.Text.Split(',');
foreach (string filter in filterArray)
{
int startAfter = Array.IndexOf(paths, filter) + 1;
if (startAfter > 0)
{
relativeFolder = string.Join(Path.DirectorySeparatorChar.ToString(), paths, startAfter, paths.Length - startAfter);
break;
}
}
How about something like this:
private static string GetRightPartOfPath(string path, string startAfterPart)
{
// use the correct separator for the environment
var pathParts = path.Split(Path.DirectorySeparatorChar);
// this assumes a case-sensitive check. If you don't want this, you may want to loop through the pathParts looking
// for your "startAfterPart" with a StringComparison.OrdinalIgnoreCase check instead
int startAfter = Array.IndexOf(pathParts, startAfterPart);
if (startAfter == -1)
{
// path not found
return null;
}
// Split removes every separator, so this test is always false; the last part (the file name, or the empty string left by a trailing separator) is therefore always dropped below
var lastPartWasDirectory = pathParts[pathParts.Length - 1].EndsWith(Path.DirectorySeparatorChar.ToString());
return string.Join(
Path.DirectorySeparatorChar.ToString(),
pathParts, startAfter,
pathParts.Length - startAfter - (lastPartWasDirectory?0:1));
}
This method attempts to work out if the last part is a filename and drops it if it is.
Calling it with
GetRightPartOfPath(@"C:\my documents\my folder\my other folder\filename.exe", "my folder");
returns
my folder\my other folder
Calling it with
GetRightPartOfPath(@"C:\my documents\my folder\my other folder\", "my folder");
returns the same.
You could use this method to split the path by the "\" sign (or "/" in Unix environments). You then get an array of strings back and can pick what you need.
public static String[] SplitPath(string path)
{
String[] pathSeparators = new String[]
{
Path.DirectorySeparatorChar.ToString()
};
return path.Split(pathSeparators, StringSplitOptions.RemoveEmptyEntries);
}
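The question asks for only the folders that come after the filter folder, and one of the code comments above suggests a case-insensitive lookup. A sketch of both ideas follows. The class and method names are hypothetical, it assumes the last path part is always a file name (as in the question's list), and it splits on both separator characters so it behaves the same on any OS.

```csharp
using System;

public static class PathTail
{
    // Returns the folders that follow `filterFolder` in `filePath`, without the
    // trailing file name, or null when the folder is not part of the path.
    public static string FoldersAfter(string filePath, string filterFolder)
    {
        string[] parts = filePath.Split('\\', '/');

        // Case-insensitive lookup, unlike Array.IndexOf.
        int index = Array.FindIndex(parts,
            p => string.Equals(p, filterFolder, StringComparison.OrdinalIgnoreCase));
        if (index == -1)
            return null;

        int start = index + 1;                 // first folder after the filter
        int count = parts.Length - start - 1;  // leave out the last part (file name)
        return count <= 0 ? string.Empty : string.Join("\\", parts, start, count);
    }
}
```

For the question's example, `PathTail.FoldersAfter(@"C:\my documents\my folder\my other folder\filename.exe", "my folder")` yields `my other folder`. Looping over the comma-separated filter list and taking the first non-null result would mirror the asker's current solution.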

Filtering file names: getting *.abc without *.abcd, or *.abcde, and so on

Directory.GetFiles(LocalFilePath, searchPattern);
MSDN Notes:
When using the asterisk wildcard character in a searchPattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern. A searchPattern with a file extension of one, two, or more than three characters returns only files having extensions of exactly that length that match the file extension specified in the searchPattern. When using the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files, "file1.txt" and "file1.txtother", in a directory, a search pattern of "file?.txt" returns just the first file, while a search pattern of "file*.txt" returns both files.
The following list shows the behavior of different lengths for the searchPattern parameter:
*.abc returns files having an extension of .abc, .abcd, .abcde, .abcdef, and so on.
*.abcd returns only files having an extension of .abcd.
*.abcde returns only files having an extension of .abcde.
*.abcdef returns only files having an extension of .abcdef.
With the searchPattern parameter set to *.abc, how can I return files having an extension of .abc, not .abcd, .abcde and so on?
Maybe this function will work:
private static bool StriktMatch(string fileExtension, string searchPattern)
{
    int dot = searchPattern.LastIndexOf('.');
    // no extension in the pattern: nothing to compare, so accept the file
    if (dot == -1)
    {
        return true;
    }
    string extension = searchPattern.Substring(dot);
    // wildcards in the pattern's extension: leave the decision to GetFiles
    if (extension.IndexOfAny(new char[] { '*', '?' }) != -1)
    {
        return true;
    }
    // otherwise require an exact, case-insensitive extension match
    return String.Compare(fileExtension, extension, true) == 0;
}
Test Program:
class Program
{
static void Main(string[] args)
{
string[] fileNames = Directory.GetFiles("C:\\document", "*.abc");
ArrayList al = new ArrayList();
for (int i = 0; i < fileNames.Length; i++)
{
FileInfo file = new FileInfo(fileNames[i]);
if (StriktMatch(file.Extension, "*.abc"))
{
al.Add(fileNames[i]);
}
}
fileNames = (String[])al.ToArray(typeof(String));
foreach (string s in fileNames)
{
Console.WriteLine(s);
}
Console.Read();
}
}
Does anybody have a better solution?
The answer is that you must do post filtering. GetFiles alone cannot do it. Here's an example that will post process your results. With this you can use a search pattern with GetFiles or not - it will work either way.
List<string> fileNames = new List<string>();
// populate all filenames here with a Directory.GetFiles or whatever
string srcDir = "from"; // set this
string destDir = "to"; // set this too
// this filters the names in the list to just those that end with ".doc"
foreach (var f in fileNames.Where(f => f.ToLower().EndsWith(".doc")))
{
try
{
File.Copy(Path.Combine(srcDir, f), Path.Combine(destDir, f));
}
catch { ... }
}
Not a bug, perverse but well-documented behavior. *.doc matches *.docx based on 8.3 fallback lookup.
You will have to manually post-filter the results for ending in doc.
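Use LINQ:
string strSomePath = "c:\\SomeFolder";
string strSomeExtension = ".abc";
string[] filez = Directory.GetFiles(strSomePath, "*" + strSomeExtension);
// EndsWith needs the bare extension; the "*" of a search pattern would be compared literally
var filtrd = from f in filez
where f.EndsWith(strSomeExtension, StringComparison.OrdinalIgnoreCase)
select f;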
foreach (string strSomeFileName in filtrd)
{
Console.WriteLine( strSomeFileName );
}
This won't help in the short term, but voting on the MS Connect post for this issue may get things changed in the future.
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=95415
Since for "*.abc" GetFiles will return extensions of 3 or more, anything with a length of 3 after the "." is an exact match, and anything longer is not.
string[] fileList = Directory.GetFiles(path, "*.abc");
foreach (string file in fileList)
{
FileInfo fInfo = new FileInfo(file);
if (fInfo.Extension.Length == 4) // "." is counted in the length
{
// exact extension match - process the file...
}
}
Not sure of the performance of the above - while it uses simple length comparisons rather than string manipulations, new FileInfo() is called each time around the loop.
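One way to sidestep the FileInfo concern is to take the extension from the path string itself. The sketch below keeps the same length test but uses Path.GetExtension, which only inspects the string, together with Directory.EnumerateFiles (.NET 4.0 and later), which yields results lazily. The class and method names are hypothetical, and whether this is measurably faster has not been tested here.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ThreeCharExtension
{
    // Same test as above: "." plus exactly three characters.
    // No FileInfo object is created for the check.
    public static bool IsExact(string file)
    {
        return Path.GetExtension(file).Length == 4;
    }

    // Lazily yields the files whose extension is exactly the three-character
    // extension in `pattern` (e.g. "*.abc").
    public static IEnumerable<string> Enumerate(string path, string pattern)
    {
        foreach (string file in Directory.EnumerateFiles(path, pattern))
        {
            if (IsExact(file))
                yield return file;
        }
    }
}
```

A caller would iterate `ThreeCharExtension.Enumerate(path, "*.abc")` and process each file in the loop body, exactly where the original code has its "exact extension match" comment.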
