Select Files From Folder Depending on Name Convention - c#

I receive a bunch of XML files in a folder.
I want to keep checking for files that have the following naming convention:
sr-{first_id}-{second_id}-matchresults.xml
To parse as soon as I receive one.
For example:
sr-40-24-standings.xml
sr-40-24-results.xml
sr-40-24-j7844-matchresults.xml
I should select that one : sr-40-24-j7844-matchresults.xml
What comes after this that helps me select files by their naming convention from an ASP web service?
Dim files As IO.FileInfo() = FolderName.GetFiles("*.xml")

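private bool IsValid(string fileName)
{
    // Pass the file name only (e.g. Path.GetFileName(path)), not the full path.
    // The dot is escaped and the pattern is anchored at both ends.
    string regexString = @"^sr-([a-z0-9]+)-([a-z0-9-]+)-matchresults\.xml$";
    return Regex.IsMatch(fileName, regexString);
}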
This method returns true for file names in the specified format (sr-{first_id}-{second_id}-matchresults.xml).
Note: with this pattern the second ID can contain alphanumeric characters and the "-" symbol. If you don't want that symbol in the ID, the pattern becomes:
string regexString = @"^sr-([a-z0-9]+)-([a-z0-9]+)-matchresults\.xml$";

You can use a regular expression:
var pattern = new Regex(@"^sr-.............$");
And then apply a "filter" on Directory.GetFiles to retrieve only the files matching this pattern. Directory.GetFiles returns full paths, so match against the file name only:
var files = Directory.GetFiles("path to files", "*.xml").Where(path => pattern.IsMatch(Path.GetFileName(path))).ToList();
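
As a minimal sketch of the two answers combined, the filter below matches the file name (not the full path) against an anchored pattern. The helper name SelectMatchResults, the "inbox/" folder and the case-insensitive option are assumptions made for illustration; the sample names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Example input, taken from the file names in the question.
var incoming = new[]
{
    "inbox/sr-40-24-standings.xml",
    "inbox/sr-40-24-results.xml",
    "inbox/sr-40-24-j7844-matchresults.xml",
};

foreach (var path in SelectMatchResults(incoming))
{
    Console.WriteLine(path); // prints inbox/sr-40-24-j7844-matchresults.xml
}

// Hypothetical helper: keeps only the paths whose file name follows
// sr-{first_id}-{second_id}-matchresults.xml.
static List<string> SelectMatchResults(IEnumerable<string> paths)
{
    // Assumption: IDs are alphanumeric, the second ID may contain "-",
    // and letter case does not matter.
    var pattern = new Regex(
        @"^sr-[a-z0-9]+-[a-z0-9-]+-matchresults\.xml$",
        RegexOptions.IgnoreCase);

    // Match against the file name only, because the pattern is anchored with ^.
    return paths.Where(p => pattern.IsMatch(Path.GetFileName(p))).ToList();
}
```

Against a real folder, the same helper could be fed with Directory.GetFiles(folder, "*.xml").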

Related

How to fetch a particular filename pattern from directory

I'm trying to fetch a particular filename from a directory. The code I've tried is as below
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles("^Monarch_[0-9]{2}$") orderby files.LastWriteTime descending select files).First();
//Output : Error
List of file names (Input)
Monarch_05bridge //Date modified 16-12-2021 20:41
Monarch_04bridge //Date modified 16-12-2021 06:49
Monarch_04 //Date modified 16-12-2021 05:39
Monarch_02 //Date modified 16-12-2021 05:49
Monarch_02bridge //Date modified 14-12-2021 19:34
Monarch_01 //Date modified 14-12-2021 09:08
Code should look for files whose filename starts with Monarch_ followed by 2 numeric digits and then filter out the recently modified file
So the output should be Monarch_02
I also tried doing
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles(Monarch_ + "*") orderby files.LastWriteTime descending select files).First();
//Output : Monarch_05bridge
Can someone help me resolve this issue?
string youngestFile = Directory.GetFiles(directoryPath)
.Where(o => Regexp.Contains(Path.GetFileNameWithoutExtension(o), "Monarch_\\d\\d"))
.OrderByDescending(o => File.GetLastWriteTime(o))
.FirstOrDefault();
This is a quick copy-and-paste from my project files. Regexp.Contains() is one of the simple methods I wrote to do regexp comparisons.
Note that the regular expression I used allows Monarch_02, Monarch_02Bridge and abcMonarch_09 as possible results. Use "^Monarch_\\d\\d$" if you want a strict rule.
Refer to Regular Expressions for details.
private static Match GetFirstMatch(string text, string pattern)
{
Match match = Regex.Match(text, pattern, RegexOptions.None);
return match;
}
public static Boolean Contains(string text, string pattern)
{
return GetFirstMatch(text, pattern).Value != String.Empty;
}
Basically, use Directory.GetFiles(path) to get all the files, then use LINQ to apply the conditions, order the results and fetch the first one.
The Path, Directory and File classes can help a lot when you are working with the file system.
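
The strict rule mentioned above can be sketched without the custom Regexp helper. This version works on (name, last write time) pairs so it can be run without touching the disk; the helper name NewestStrictMatch is hypothetical, and the sample data is the list from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// File names and modification times as listed in the question.
var listing = new List<(string Name, DateTime Modified)>
{
    ("Monarch_05bridge", new DateTime(2021, 12, 16, 20, 41, 0)),
    ("Monarch_04bridge", new DateTime(2021, 12, 16, 6, 49, 0)),
    ("Monarch_04",       new DateTime(2021, 12, 16, 5, 39, 0)),
    ("Monarch_02",       new DateTime(2021, 12, 16, 5, 49, 0)),
    ("Monarch_02bridge", new DateTime(2021, 12, 14, 19, 34, 0)),
    ("Monarch_01",       new DateTime(2021, 12, 14, 9, 8, 0)),
};

Console.WriteLine(NewestStrictMatch(listing)); // prints Monarch_02

// Hypothetical helper: returns the most recently modified name that is exactly
// "Monarch_" followed by two digits, or null when nothing matches.
static string NewestStrictMatch(IEnumerable<(string Name, DateTime Modified)> files)
{
    return files
        .Where(f => Regex.IsMatch(f.Name, @"^Monarch_\d{2}$"))
        .OrderByDescending(f => f.Modified)
        .Select(f => f.Name)
        .FirstOrDefault();
}
```

With real files, the pairs could be built from DirectoryInfo.GetFiles() using Path.GetFileNameWithoutExtension(f.Name) and f.LastWriteTime.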

How to Remove Directories From EnumerateFiles?

So I'm working on a program that will list all the files in a directory. Pretty simple. Basically, when I do this: List<string> dirs = new List<string>(Directory.EnumerateFiles(target));, I don't want it to include the directory and all. Just the file name. When I run my code;
List<string> dirs = new List<string>(Directory.EnumerateFiles(target));
Console.WriteLine($"Folders and files in this directory:\n");
foreach (string i in dirs) {
Console.WriteLine($"> {i}");
}
it gives me the following:
C:\Users\Camden\Desktop\Programming\Visual Studio\C#\DirectoryManager\DirectoryManager\bin\Debug\DirectoryManager.exe
I just want the DirectoryManager.exe part, so I looked it up and I found that you can replace strings inside of strings. Like so: i.Replace(target, "");. However, this isn't doing anything, and it's just running like normal. Why isn't it replacing, and how should I instead do this?
Use methods from the System.IO.Path class.
var fullfile = @"C:\Users\Camden\Desktop\Programming\Visual Studio\C#\DirectoryManager\DirectoryManager\bin\Debug\DirectoryManager.exe";
var fileName = Path.GetFileName(fullfile); // DirectoryManager.exe
var name = Path.GetFileNameWithoutExtension(fullfile); // DirectoryManager
The simplest way is to use the Select IEnumerable extension
(you need to have a using System.Linq; at the top of your source code file)
List<string> files = new List<string>(Directory.EnumerateFiles(target)
.Select(x => Path.GetFileName(x)));
In this way the sequence of files retrieved by Directory.EnumerateFiles is passed, one by one, to the Select method where each fullfile name (x) is passed to Path.GetFileName to produce a new sequence of just filenames.
This sequence is then returned as a parameter to the List constructor.
As for your question about the Replace method: remember that Replace doesn't change the string you call it on, but returns a new string with the replacement applied. In .NET, strings are immutable.
So if you want to see the result of the replacement, you need:
string justFileName = i.Replace(target, "");
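
A small sketch of that behavior follows. The folder and file values here are made up for illustration.

```csharp
using System;

string target = @"C:\Temp\";
string full = @"C:\Temp\DirectoryManager.exe";

// The return value is discarded here, so nothing observable changes.
full.Replace(target, "");
Console.WriteLine(full); // still C:\Temp\DirectoryManager.exe

// Keeping the returned string gives the shortened name.
string justFileName = full.Replace(target, "");
Console.WriteLine(justFileName); // DirectoryManager.exe
```

Path.GetFileName remains the more robust choice, since Replace depends on the prefix matching character for character.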
An alternative to using Directory.EnumerateFiles, would be DirectoryInfo.EnumerateFiles. This method returns an IEnumerable<FileInfo>. You can then make use of the FileInfo.Name property of each of the returned objects. Your code would then become:
var files = new DirectoryInfo(target).EnumerateFiles();
Console.WriteLine("Files in this directory:\n");
foreach (FileInfo i in files) {
Console.WriteLine($"> {i.Name}");
}
For just the list of file names:
List<string> fileNames = new DirectoryInfo(target).EnumerateFiles().Select(f => f.Name).ToList();
Alternatively, if you want both files and directories, you can use EnumerateFileSystemInfos. If you need to know if you have a file vs a directory you can query the Attributes property and compare it to the FileAttributes flags enumeration.
var dirsAndFiles = new DirectoryInfo(target).EnumerateFileSystemInfos();
Console.WriteLine("Folders and files in this directory:\n");
foreach (var i in dirsAndFiles) {
var type = (i.Attributes & FileAttributes.Directory) == FileAttributes.Directory ? "Directory" : "File";
Console.WriteLine($"{type} > {i.Name}");
}
The FileSystemInfo.Name property will return either the file's name (in case of a file) or the last directory in the hierarchy (for a directory)--so just the subdirectory name and not the full path ("sub" instead of "c:\sub").

folder name contain names c# Directory

I am working on Excel add-ins with intranet server.
I have the names of employees, and each one has a folder on the intranet. The folder may or may not contain a PowerPoint file, so I need to read the files for each name.
The problem is with the names.
Each folder name has this pattern:
surname, firstname
But the problem is with people who have multiple names as a first name or surname:
ex:
samy jack sammour.
the first name is: "samy jack" and the last name is "sammour"
so the folder would be : sammour, samy jack
But I only have the full name field; I don't know which part is the last name and which is the first name (it could be "jack sammour, samy" or "sammour, samy jack"). So I tried this code to fix it:
string[] dirs = System.IO.Directory.GetFiles(@"/samy*jack*sammour/","*file*.pptx");
if (dirs.Length > 0)
{
MessageBox.Show("true");
}
but it gave me an error:
file is not illegal
How can I fix this problem and search all the possibilities?
That should do the trick:
var path = @"C:\Users\";
var name = "samy jack sammour";
Func<IEnumerable<string>, IEnumerable<string>> permutate = null;
permutate = items =>
items.Count() > 1 ?
items.SelectMany(
(_, ndx1) => permutate(items.Where((__, ndx2) => ndx1 != ndx2)),
(item1, item2) => item1 + (item2.StartsWith(",") ? "" : " ") + item2) :
items;
var names = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Concat(new[] { "," }).ToArray();
var dirs = new HashSet<string>(permutate(names).Where(n => !n.StartsWith(",") && !n.EndsWith(",")), StringComparer.OrdinalIgnoreCase);
if (new DirectoryInfo(path).EnumerateDirectories().Any(dir => dirs.Contains(dir.Name) && dir.EnumerateFiles("*.pptx").Any()))
MessageBox.Show("true");
In my opinion, you shouldn't do this with a Regex because regexes can't match permutations very well.
Instead you can create a HashSet which contains all case-insensitive permutations that correlate to your pattern:
surname, firstname
(Case-sensitivity isn't required because the Windows file system doesn't care whether a directory or file name is upper or lower case.)
For the sake of simplicity I just add the comma to the permutation parts and filter the items that start or end with a comma in a next step.
If performance matters or if the names can consist of many parts I'm sure that there's a way to optimize these possibilities away sooner to prevent large parts of the unnecessary permutations.
In the last step you enumerate the directory names and check if there's a match in this HashSet of all possible names.
When you've found a matching directory you just need to search for all .pptx files in this directory.
If necessary just replace the "*.pptx" with your file name pattern.
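
If the full name is always written with the first names before the surname, a narrower alternative to full permutations is to try every split point. This is a different technique from the answer above and only a sketch under that assumption; the helper name FolderCandidates is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

foreach (var candidate in FolderCandidates("samy jack sammour"))
{
    Console.WriteLine(candidate);
}
// prints:
// jack sammour, samy
// sammour, samy jack

// Hypothetical helper: builds "surname, firstname" candidates by splitting the
// full name at every position. Assumes first names come before the surname.
static List<string> FolderCandidates(string fullName)
{
    var parts = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    var result = new List<string>();
    for (int i = 1; i < parts.Length; i++)
    {
        string first = string.Join(" ", parts.Take(i)); // first i words
        string last = string.Join(" ", parts.Skip(i));  // remaining words
        result.Add($"{last}, {first}");
    }
    return result;
}
```

The candidates can be loaded into a case-insensitive HashSet and checked against the directory names in the same way as in the answer above.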

Enumerate contents of specific folder DotNetZip, without child folders

Using Ionic.Zip
I wish to display the files or folders in a specific folder. I am using the SelectEntries method, but it unfortunately is filtering out the folders. Not what I was expecting using '*'.
ICollection<ZipEntry> selectEntries = _zipFile.SelectEntries("*",rootLocation)
If I follow an alternative approach:
IEnumerable<ZipEntry> selectEntries = _zipFile.Entries.Where(e => e.FileName.StartsWith(rootLocation))
I face two problems:
I have to switch '/' for '\' potentially.
I get all the subfolders.
Which is not desirable.
Anyone know why SelectEntries returns no folders, or am I misusing it?
I found a solution in my particular case. I think something about the way the Zipfile was constructed led to it appearing to have folders but none actually existed i.e. the following code yielded an empty list.
_zipFile.Entries.Where(e=>e.IsDirectory).AsList(); // always empty!
I used the following snippet to achieve what I needed. The regex is not as comprehensive as it should be but worked for all cases I needed.
var conformedRootLocation = rootLocation.Replace('\\','/').TrimEnd('/') + "/";
var pattern = string.Format(@"({0})([a-z|A-Z|.|_|0-9|\s]+)/?", conformedRootLocation);
var regex = new Regex(pattern);
return _zipFile.EntryFileNames.Select(e => regex.Match(e))
.Where(match => match.Success)
.Select(match => match.Groups[2].Value)
.Distinct()
.Select(f => new DirectoryResource
{
Name = f, IsDirectory = !Path.HasExtension(f)
})
.ToList();

How to check if filename contains substring in C#

I have a folder with files named
myfileone
myfiletwo
myfilethree
How can I check if file "myfilethree" is present.
I mean is there another method other than IsFileExist() method, i.e like filename contains substring "three"?
Substring:
bool contains = Directory.EnumerateFiles(path).Any(f => f.Contains("three"));
Case-insensitive substring:
bool contains = Directory.EnumerateFiles(path).Any(f => f.IndexOf("three", StringComparison.OrdinalIgnoreCase) >= 0);
Case-insensitive comparison (EnumerateFiles returns full paths, so compare the file name only):
bool contains = Directory.EnumerateFiles(path).Any(f => String.Equals(Path.GetFileName(f), "myfilethree", StringComparison.OrdinalIgnoreCase));
Get file names matching a wildcard criteria:
IEnumerable<string> files = Directory.EnumerateFiles(path, "three*.*"); // lazy file system lookup
string[] files = Directory.GetFiles(path, "three*.*"); // not lazy
If I understand your question correctly, you could do something like
Directory.GetFiles(directoryPath, "*three*")
or
Directory.GetFiles(directoryPath).Where(f => f.Contains("three"))
Both of these will give you all the names of all files with three in it.
I am not that familiar with IO but maybe this would work ? Requires using System.Linq
System.IO.Directory.GetFiles("PATH").Where(s => s.Contains("three"));
EDIT: Note that GetFiles returns an array of strings, but after Where the result is an IEnumerable<string>; call ToArray() if you need an array.
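
One caveat with the snippets above is that they test the full path, so a folder name containing "three" would also count as a match. A minimal sketch that checks only the file name, case-insensitively, is shown below; the helper name AnyNameContains and the sample paths are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sample paths: the first two live in a folder named "three" but their
// file names do not contain that word.
var paths = new[]
{
    "three/myfileone",
    "three/myfiletwo",
};

Console.WriteLine(AnyNameContains(paths, "three"));                             // False
Console.WriteLine(AnyNameContains(paths.Append("data/MyFileThree"), "three"));  // True

// Hypothetical helper: true when any file name (not the directory part)
// contains the fragment, ignoring case.
static bool AnyNameContains(IEnumerable<string> filePaths, string fragment)
{
    return filePaths.Any(p =>
        Path.GetFileName(p).IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0);
}
```

On disk, the input could come from Directory.EnumerateFiles(path).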
