C#: Using Directory.GetFiles to get files with a fixed name length

The directory 'C:\temp' contains two files, 'GZ96A7005.tif' and 'GZ96A7005001.tif'. Their names have different lengths but the same extension. I run the code below:
string[] resultFileNames = Directory.GetFiles(@"C:\temp", "????????????.tif");
'resultFileNames' contains two items, 'c:\temp\GZ96A7005.tif' and 'c:\temp\GZ96A7005001.tif'.
Windows Search, however, handles the same pattern as I expect. Why is this, and how do I get the result I want?

For Directory.GetFiles, ? signifies "Exactly zero or one character." On the other hand, you could use DirectoryInfo.GetFiles, for which ? signifies "Exactly one character" (apparently what you want).
EDIT:
Full code:
string[] resultFileNames = (from fileInfo in new DirectoryInfo(@"C:\temp").GetFiles("????????????.tif") select fileInfo.Name).ToArray();
You can probably skip the ToArray and just let resultFileNames be an IEnumerable<string>.
People are reporting this doesn't work for them on MS .NET. The exact code below works for me on Mono on Ubuntu Hardy. I agree it doesn't really make sense to have two related classes use different conventions. However, that is what the documentation (linked above) says, and Mono complies with the docs. If Microsoft's implementation doesn't, they have a bug:
using System;
using System.IO;
using System.Linq;

public class GetFiles
{
    public static void Main()
    {
        string[] resultFileNames = (from fileInfo in new DirectoryInfo(@".").GetFiles("????????????.tif") select fileInfo.Name).ToArray();
        foreach (string fileName in resultFileNames)
        {
            Console.WriteLine(fileName);
        }
    }
}

I know I've read about this somewhere before, but the best I could find right now was this reference to it in Raymond Chen's blog post. The point is that Windows keeps a short (8.3) filename for every file with a long filename, for backward compatibility, and filename wildcards are matched against both the long and short filenames. You can see these short filenames by opening a command prompt and running "dir /x". Normally, getting a list of files which match ????????.tif (8) returns a list of files with 8 or fewer characters in their filename and a .tif extension. But every file with a long filename also has a short filename with 8.3 characters, so they all match this filter.
In your case both GZ96A7005.tif and GZ96A7005001.tif are long filenames, so they both have an 8.3 short filename which matches ????????.tif (or any pattern with 8 or more ?'s).
UPDATE... from MSDN:
Because this method checks against file names with both the 8.3 file name format and the long file name format, a search pattern similar to "*1*.txt" may return unexpected file names. For example, using a search pattern of "*1*.txt" returns "longfilename.txt" because the equivalent 8.3 file name format is "LONGFI~1.TXT".
UPDATE: The MSDN docs specify different behavior for the "?" wildcard in Directory.GetFiles() and DirectoryInfo.GetFiles(). The documentation seems to be wrong, however. See Matthew Flaschen's answer.

The ? character matches "zero or one" character... so from what you have I would imagine that your search pattern will match any file ending in ".tif" whose name is between zero and twelve characters long.
Try dropping in another file whose name is only three characters long with a ".tif" extension and see if the code picks that up as well. I have a sneaking suspicion that it will ;)
As far as Windows Search is concerned, it is most definitely not using the same algorithm under the hood. The ? character might have a very different meaning there than it does in the .NET search pattern specification for the Directory.GetFiles(string, string) method.

string path = "C:/";
var files = Directory.GetFiles(path)
.Where(f => f.Replace(path, "").Length == 8);
A little costly with the string replacement. You can add whatever extension you need.
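One way to sidestep the wildcard quirks discussed above is to list the directory and match the long file names yourself. The following is only a sketch: the class name StrictWildcard and its methods are hypothetical, and it assumes you want '?' to mean exactly one character and '*' to mean any run of characters. It reads every entry in the directory, so it may be slow on very large folders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the framework.
public static class StrictWildcard
{
    // Translate a wildcard pattern into a regex in which '?' is exactly one
    // character and '*' is any run of characters (the assumed semantics).
    public static Regex ToRegex(string pattern)
    {
        string body = Regex.Escape(pattern).Replace(@"\?", ".").Replace(@"\*", ".*");
        return new Regex("^" + body + "$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // Pure filter over file names, so it can be checked without touching the disk.
    public static IEnumerable<string> Filter(IEnumerable<string> fileNames, string pattern)
    {
        Regex regex = ToRegex(pattern);
        return fileNames.Where(name => regex.IsMatch(name));
    }

    // List every file in the directory, then keep those whose long name matches.
    public static string[] GetFiles(string directory, string pattern)
    {
        IEnumerable<string> names = Directory.EnumerateFiles(directory)
            .Select(path => Path.GetFileName(path));
        return Filter(names, pattern)
            .Select(name => Path.Combine(directory, name))
            .ToArray();
    }
}
```

With the two files from the question, StrictWildcard.GetFiles(@"C:\temp", "????????????.tif") should keep only the file whose name is twelve characters long before the extension, because short 8.3 names never enter the comparison.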

Related

Sony Vegas script: problem getting directory paths

I'd like to create a blind test generator with a script for Sony Vegas 14. For this I must write my script in C#.
I don't have much experience in C#, so my problem may be a very basic one.
My script must be built as a class library (.dll) and executed by Sony Vegas. To test my code easily, I created a console app where I can try the code and print what it does to the console.
My program needs to get the paths of all subdirectories of a directory as strings.
My problem is the following:
Directory.GetDirectories doesn't work.
When I use the following code to check what is in my array/list, I get a sensible result in the console app version of my script (the number of subdirectories in my directory):
string[] dirs = Directory.GetDirectories(myDirectorypath, "", SearchOption.TopDirectoryOnly);// get all directory path in dirs
Console.WriteLine("the number of element in your array is "+ dirs.Length);
List<string> listdedossier = new List<string>(dirs); // convert the array in a list
Console.WriteLine("the number of element in your list is " + listdedossier.Count);
But when I paste my code into my DLL project, nothing is written to my array or my list. I noticed this because printing the number of elements in the list/array gives 0.
Do you have any idea what is happening in my code?
Thanks
You should check the online Microsoft documentation for GetDirectories. The 2nd argument is supposed to be a search pattern that conforms to Windows file name patterns. Essentially, all or part of a file name is allowed, with * being a wildcard for any run of characters (like .* in a regex) and ? being a single-character wildcard (like . in a regex). You are providing an empty string, so you get nothing back. The pattern *.exe will match all executables in a folder (if you are using GetFiles), while the pattern pattern* matches any files/folders that start with pattern. If you want all directories, do this:
string[] dirs = Directory.GetDirectories(myDirectoryPath, "*", SearchOption.TopDirectoryOnly);
Next point: the path you provide can be a relative path (e.g., "relative\path\to\folder"), an absolute path (e.g., "D:\path\to\folder"), or a UNC path to a network share (e.g., "\\servername.gov.edu.com\drive$\path\to\folder"). If you supply a relative path, it is resolved against the process's current working directory, which may differ between your console app and the host application that loads your DLL. It is very easy with a relative path to search the wrong folder, or even a non-existent location (though you should get an exception in that case). Also remember: Windows path names are NOT case-sensitive.
Finally, when writing text with arguments, I HIGHLY recommend you use this format:
Console.WriteLine("The number of elements in your array is {0}", dirs.Length);
This uses a placeholder in the string itself which has a numeric value in it. The number indicates which argument after the format string to use (0 is the first argument after the format string). You can use as many placeholders as you want, and use the same placeholder in multiple locations. WriteLine calls the ToString method, which the Object class defines for all types, on each argument. Concatenating with + also calls ToString on non-string operands, so both forms work; placeholders keep the message text in one piece and accept format specifiers such as {0:N0}.
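Putting the points above together, here is a small sketch that resolves the path first, checks that it exists, and then asks for every subdirectory with the "*" pattern. The class name SubdirectoryLister is hypothetical, and returning an empty array for a missing folder is just a choice made for this example.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class SubdirectoryLister
{
    // Returns the immediate subdirectories of root, or an empty array when
    // root does not exist.
    public static string[] List(string root)
    {
        // Resolve relative paths up front so the message shows the folder
        // that is really being searched.
        string fullRoot = Path.GetFullPath(root);
        if (!Directory.Exists(fullRoot))
        {
            Console.WriteLine("Directory not found: {0}", fullRoot);
            return new string[0];
        }

        // "*" matches every name; an empty pattern matches nothing.
        return Directory.GetDirectories(fullRoot, "*", SearchOption.TopDirectoryOnly);
    }
}
```

Printing the resolved path is a cheap way to find out whether the DLL and the console app are really looking at the same folder.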

How to filter Directory.EnumerateFiles with specific extension

I want a list of all xml files in a folder like this:
foreach (var file in Directory.EnumerateFiles(folderPath, "*.xml"))
{
// add file to a collection
}
However, if for some reason I have any files in folderPath whose names end with .xmlXXX, where XXX represents any characters, they will be returned by the enumerator as well.
I can solve it easily by doing something like
foreach (var file in Directory.EnumerateFiles(folderPath, "*.xml").Where(x => x.EndsWith(".xml")))
But it seems a bit odd to me, as I basically have to search for the same thing two times. Is there any way to get the right files directly or am I doing something wrong?
This is the documented/default behaviour of wildcards in file searching.
Directory.EnumerateFiles Method (String, String)
If the specified extension is exactly three characters long, the
method returns files with extensions that begin with the specified
extension. For example, "*.xls" returns both "book.xls" and
"book.xlsx".
Your current approach of filtering twice is the right way.
The only improvement you can do is to ignore case in EndsWith like:
x.EndsWith(".xml", StringComparison.CurrentCultureIgnoreCase)
It seems like you can't do it using EnumerateFiles for 3-character extensions, according to MSDN.
Quote from the article above:
When you use the asterisk wildcard character in a searchPattern such as "*.txt", the number of characters in the specified extension affects the search as follows:
If the specified extension is exactly three characters long, the method returns files with extensions that begin with the specified extension. For example, "*.xls" returns both "book.xls" and "book.xlsx".
In all other cases, the method returns files that exactly match the specified extension. For example, "*.ai" returns "file.ai" but not "file.aif".
When you use the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files, "file1.txt" and "file1.txtother", in a directory, a search pattern of "file?.txt" returns just the first file, whereas a search pattern of "file*.txt" returns both files.
Therefore using the .Where extension seems like the best solution to your problem.
Yes, and this design is stupid, stupid, stupid! It shouldn't do that. And it's annoying too!
That said, it appears this is what is happening: It actually searches both the long and short filenames. So files with longer extensions will have a short filename with the extension truncated to three characters.
And on newer versions of Windows, the short filenames may be disabled. So the behavior on newer systems will actually be what you would expect, and what it should've been in the first place.
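Since the outcome can depend on whether short filenames exist on the volume, comparing the real extension yourself gives the same result everywhere. A minimal sketch, with a hypothetical ExtensionFilter class; the wildcard is kept only as a coarse pre-filter and the exact comparison is done with Path.GetExtension:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ExtensionFilter
{
    // True when the path's extension is exactly the given one (leading dot
    // included), ignoring case.
    public static bool HasExtension(string path, string extension)
    {
        return string.Equals(Path.GetExtension(path), extension, StringComparison.OrdinalIgnoreCase);
    }

    // The wildcard narrows the listing; the exact check drops names such as
    // "book.xmlXXX" that the wildcard may let through.
    public static IEnumerable<string> EnumerateFiles(string folderPath, string extension)
    {
        return Directory.EnumerateFiles(folderPath, "*" + extension)
            .Where(path => HasExtension(path, extension));
    }
}
```

Usage would look like ExtensionFilter.EnumerateFiles(folderPath, ".xml"). An ordinal comparison is used here rather than a culture-sensitive one, which is usually the safer default for file names.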

Get files of certain extension c#

I wish to get a list of all the files of a certain extension (recursive), but only the files ending with that extension.
For example, I wish to get all the files with the ".exe" extension, If I have the following files:
file1.exe , file2.txt.exe , file3.exe.txt , file4.txt.exe1 , file5.txt
I expect to get a list of 1 file, which is: file1.exe.
I'm trying to use the following line:
List<string> theList = Directory.GetFiles(@"C:\SearchDir", "*.exe", SearchOption.AllDirectories).ToList();
But what I get is a list of the following three files: file1.exe , file2.txt.exe , file4.txt.exe1
Any ideas?
Try this:
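var exeFiles = Directory.EnumerateFiles(sourceDirectory,
        "*", SearchOption.AllDirectories)
    // Check the file name only (folder names may contain dots); one dot means one extension.
    .Where(s => Path.GetFileName(s).EndsWith(".exe", StringComparison.OrdinalIgnoreCase)
             && Path.GetFileName(s).Count(c => c == '.') == 1)
    .ToList();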
This is a common issue. Take note of the MSDN documentation:
When using the asterisk wildcard character in a searchPattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern.
You can't solve it by searching for the .exe extension; you'll need to filter your results one more time in the client code.
Now, one thing to note also is this. The following examples would in fact be considered executable files:
file1.exe
file2.txt.exe
whereas this one wouldn't technically be considered an executable file.
file4.txt.exe1
So the question then becomes, what algorithm do you want? It appears to me you want the following:
Files that have an extension of exe.
Files that don't have multiple extensions.
Have a look at Ahmed's answer for a fantastic approach to getting the algorithm you want.

System.IO.Directory search pattern not working as expected

I am attempting to retrieve jpeg and jpg files using the following statement:
string[] files = Directory.GetFiles(someDirectoryPath, "*.jp?g");
MSDN's docs for System.IO.Directory.GetFiles(string, string) state that ? represents "Exactly zero or one character.", however the above block selects jpeg files but omits jpg files.
I am currently using the overly-permissive search pattern "*.jp*g" to achieve my results, but it wrinkles my brain because it should work.
From the docs you linked to:
A searchPattern with a file extension of one, two, or more than three characters returns only files having extensions of exactly that length that match the file extension specified in the searchPattern.
I suspect that's the problem. To be honest, I'd probably fetch all the files and then postprocess them in code - it'll make for code which is simpler to reason about than relying on the Windows path-handling oddities.
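The "fetch everything and postprocess" idea can be sketched as follows. The class name ImageFinder and the set of extensions are assumptions made for this example; the point is only that the matching rule lives in ordinary code instead of in the search pattern.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ImageFinder
{
    // Extensions treated as JPEG in this sketch, compared without regard to case.
    private static readonly HashSet<string> JpegExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".jpg", ".jpeg" };

    // True when the path ends in one of the accepted extensions.
    public static bool IsJpeg(string path) => JpegExtensions.Contains(Path.GetExtension(path));

    // List every file in the directory and keep only the JPEGs.
    public static string[] GetJpegFiles(string directory) =>
        Directory.EnumerateFiles(directory).Where(IsJpeg).ToArray();
}
```

Adding another extension then means adding one entry to the set rather than reasoning about wildcard rules.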
You could either use "*" as a pattern and process the result yourself OR use
string[] files = Directory.GetFiles(someDirectoryPath, "*.jpg").Union (Directory.GetFiles(someDirectoryPath, "*.jpeg")).ToArray();
According to the Docs the pattern you use would return only files with extensions which are 4 characters long.
MSDN reference on Union

C# Help needed changing code that deletes folders with long paths

A while back I asked a question on Stack Overflow about deleting folders that have long paths (>260 characters). The most popular solution was to move into each directory to reduce the length of the path. I've struggled with this and I'm no further on. Could someone please suggest how I would integrate the suggested code into my code?
A typical path is:
\\serverName\share\dave\Private\Careers\Careers Ed\Fun Careers Education\Chris's not used 2006 to07\old 4.Careers Area Activity Week 1 30.10.06 or 6.11.06 or 13.11.06 Introduction to job levels and careers resources\Occupational Areas & Job levels Tutor Help Sheet[1].doc
Many thanks
//Suggested code:
var curDir = Directory.GetCurrentDirectory();
Environment.CurrentDirectory = @"C:\Part\Of\The\Really\Long\Path";
Directory.Delete(@"Relative\Path\To\Directory");
Environment.CurrentDirectory = curDir;
//My code:
try
{
var dir = new DirectoryInfo(@FolderPath);
dir.Attributes = dir.Attributes & ~FileAttributes.ReadOnly;
dir.Delete();
}
catch (IOException ex)
{
MessageBox.Show(ex.Message,"Delete Error",MessageBoxButtons.OK,MessageBoxIcon.Error);
}
Before removing a directory we have to be sure that it is empty. You could consider using a reverse 'directory walk' approach.
This would entail dealing with each directory separately, in deep-to-shallow order.
Some pseudocode:
While (fullPath.Length > 0)
{
DirectoryToDelete = GetLastPartOfPath( fullPath );
CurrentDirectory = fullPath - DirectoryToDelete;
ChangeDirectory(CurrentDirectory);
DeleteDirectory(DirectoryToDelete);
fullPath = fullPath - DirectoryToDelete;
}
Hope this helps,
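The deep-to-shallow order can also be written recursively in C#. This is a rough sketch with a hypothetical DirectoryRemover class. Unlike the pseudocode it does not change the current directory, so on its own it does not get around the path length limit; it only shows emptying each folder (clearing the read-only attribute as it goes) before removing it.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class DirectoryRemover
{
    // Deletes the deepest directories first so that every directory is
    // empty by the time it is removed.
    public static void DeleteDeepestFirst(string path)
    {
        var dir = new DirectoryInfo(path);

        // Recurse into children before touching this directory.
        foreach (DirectoryInfo child in dir.GetDirectories())
        {
            DeleteDeepestFirst(child.FullName);
        }

        // Clear the read-only flag on each file, then delete it.
        foreach (FileInfo file in dir.GetFiles())
        {
            file.Attributes &= ~FileAttributes.ReadOnly;
            file.Delete();
        }

        // The directory is empty now and can be removed.
        dir.Attributes &= ~FileAttributes.ReadOnly;
        dir.Delete();
    }
}
```

Wrapping the call in the same try/catch for IOException as in the question keeps the existing error handling.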
Have you tried using the long path name syntax ?
From the CreateFile function in the platform SDK:
Maximum Path Length
In the Windows API (with some exceptions discussed later), the maximum length for a path is MAX_PATH, which is defined as 260 characters. A local path is structured in the following order: drive letter, colon, backslash, components separated by backslashes, and a terminating null character. For example, the maximum path on drive D is "D:\<256 chars>NUL".
The Windows API has many functions that also have Unicode versions to permit a maximum path length of approximately 32,000 characters composed of components up to 255 characters each in length. To specify that kind of extended length path, use the "\\?\" prefix. For example, "\\?\D:\".
Note: The maximum path of 32,000 characters is approximate, because the "\\?\" prefix can be expanded to a longer string, and the expansion applies to the total length.
To specify such a path using UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\\". These prefixes are not used as part of the path itself. They indicate that the path should be passed to the system with minimal modification, which means that you cannot use forward slashes to represent path separators, or a period to represent the current directory. Also, you cannot use the "\\?\" prefix with a relative path. Relative paths are limited to MAX_PATH characters.
The last paragraph is of course the one that is relevant to your case.
It is not certain that .NET supports this kind of path. You could use P/Invoke to call RemoveDirectory from the Win32 SDK.
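Building the prefixed form of a path is plain string work, sketched below with a hypothetical LongPath class. It assumes the input is already an absolute local or UNC path. Whether System.IO then accepts the result depends on the runtime: as far as I know, older .NET Framework versions reject such paths while newer runtimes accept them, so treat that part as something to verify on your target.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LongPath
{
    private const string ExtendedPrefix = @"\\?\";
    private const string ExtendedUncPrefix = @"\\?\UNC\";

    // Converts an absolute local or UNC path into its extended-length form.
    public static string ToExtendedLength(string absolutePath)
    {
        // Prefixed paths get minimal processing, so forward slashes must be
        // normalized to backslashes first.
        string path = absolutePath.Replace('/', '\\');

        if (path.StartsWith(ExtendedPrefix, StringComparison.Ordinal))
        {
            return path; // already in extended-length form
        }
        if (path.StartsWith(@"\\", StringComparison.Ordinal))
        {
            // \\server\share\... becomes \\?\UNC\server\share\...
            return ExtendedUncPrefix + path.Substring(2);
        }
        return ExtendedPrefix + path; // D:\... becomes \\?\D:\...
    }
}
```

For the share in the question this would turn a path starting with \\serverName\share into one starting with \\?\UNC\serverName\share, which could then be passed to RemoveDirectory via P/Invoke or to a runtime that supports it.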
Use ZetaLongPaths. It handles long paths. Google ZetaLongPaths
