Folder names built from person names: matching them with Directory in C#

I am working on an Excel add-in that talks to an intranet server.
I have the names of employees, and each employee has a folder on the intranet. That folder may or may not contain a PowerPoint file, so I need to look for the files for each name.
The problem is with the names.
Each folder name follows this pattern:
surname, firstname
The trouble comes with people who have several first names or several surnames, for example:
samy jack sammour
The first name is "samy jack" and the last name is "sammour",
so the folder would be: sammour, samy jack
But I only have the full name in a single field; I don't know which part is the last name and which is the first name (it could be "jack sammour, samy" or "sammour, samy jack"). So I tried this code:
string[] dirs = System.IO.Directory.GetFiles(@"/samy*jack*sammour/","*file*.pptx");
if (dirs.Length > 0)
{
MessageBox.Show("true");
}
But it gave me an error:
file is not illegal
How can I fix this problem and search all the possibilities?

That should do the trick:
var path = @"C:\Users\";
var name = "samy jack sammour";
// Recursively builds every ordering of the items, joined by spaces
// (no space is inserted directly before the comma).
Func<IEnumerable<string>, IEnumerable<string>> permutate = null;
permutate = items =>
    items.Count() > 1 ?
    items.SelectMany(
        (_, ndx1) => permutate(items.Where((__, ndx2) => ndx1 != ndx2)),
        (item1, item2) => item1 + (item2.StartsWith(",") ? "" : " ") + item2) :
    items;
// The comma is treated as one more part to permute.
var names = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Concat(new[] { "," }).ToArray();
// Drop the orderings that start or end with the comma.
var dirs = new HashSet<string>(permutate(names).Where(n => !n.StartsWith(",") && !n.EndsWith(",")), StringComparer.OrdinalIgnoreCase);
if (new DirectoryInfo(path).EnumerateDirectories().Any(dir => dirs.Contains(dir.Name) && dir.EnumerateFiles("*.pptx").Any()))
    MessageBox.Show("true");
In my opinion, you shouldn't do this with a Regex, because regexes can't match permutations very well.
Instead you can create a HashSet which contains all case-insensitive permutations that fit your pattern:
surname, firstname
(Case sensitivity isn't required because the Windows file system doesn't care whether a directory or file name is upper or lower case.)
For the sake of simplicity I just add the comma to the parts being permuted and, in a next step, filter out the items that start or end with a comma.
If performance matters, or if the names can consist of many parts, I'm sure there is a way to rule out these possibilities sooner, so that large parts of the unnecessary permutations are never generated.
In the last step you enumerate the directory names and check whether there is a match in this HashSet of all possible names.
Once you have found a matching directory, you just need to search for all .pptx files in that directory.
If necessary, just replace "*.pptx" with your file name pattern.
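One possible way to avoid generating the unwanted orderings in the first place is sketched below. It is only an illustration, not the answerer's code: the class name FolderNameCandidates and the method Build are hypothetical. The idea is to permute only the name parts and then place the comma at every inner position, so no candidate ever starts or ends with a comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class FolderNameCandidates
{
    // Yields every ordering of the given parts (simple recursive permutation).
    private static IEnumerable<string[]> Permutations(string[] parts)
    {
        if (parts.Length <= 1)
        {
            yield return parts;
            yield break;
        }
        for (int i = 0; i < parts.Length; i++)
        {
            int skip = i;
            string[] rest = parts.Where((_, ndx) => ndx != skip).ToArray();
            foreach (string[] tail in Permutations(rest))
                yield return new[] { parts[skip] }.Concat(tail).ToArray();
        }
    }

    // Builds every "surname, firstname" spelling: each ordering of the parts,
    // split into two non-empty halves at each inner position.
    public static HashSet<string> Build(string fullName)
    {
        string[] parts = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string[] ordering in Permutations(parts))
        {
            for (int cut = 1; cut < ordering.Length; cut++)
            {
                string surname = string.Join(" ", ordering.Take(cut));
                string firstName = string.Join(" ", ordering.Skip(cut));
                result.Add(surname + ", " + firstName);
            }
        }
        return result;
    }
}
```

Calling FolderNameCandidates.Build("samy jack sammour") gives a set that can be used in place of dirs in the directory check above. A name with a single part yields an empty set, since there is nothing to put on both sides of the comma.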

Related

How to fetch a particular filename pattern from directory

I'm trying to fetch a particular filename from a directory. The code I've tried is as below
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles("^Monarch_[0-9]{2}$") orderby files.LastWriteTime descending select files).First();
//Output : Error
List of file names (Input)
Monarch_05bridge //Date modified 16-12-2021 20:41
Monarch_04bridge //Date modified 16-12-2021 06:49
Monarch_04 //Date modified 16-12-2021 05:39
Monarch_02 //Date modified 16-12-2021 05:49
Monarch_02bridge //Date modified 14-12-2021 19:34
Monarch_01 //Date modified 14-12-2021 09:08
The code should look for files whose name is Monarch_ followed by 2 numeric digits, and then pick the most recently modified one.
So the output should be Monarch_02
I also tried doing
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles("Monarch_" + "*") orderby files.LastWriteTime descending select files).First();
//Output : Monarch_05bridge
Can someone help me resolve this issue?
string youngestFile = Directory.GetFiles(directoryPath)
    .Where(o => Regexp.Contains(Path.GetFileNameWithoutExtension(o), "Monarch_\\d\\d"))
    .OrderByDescending(o => File.GetLastWriteTime(o))
    .FirstOrDefault();
This is a quick copy-and-paste from my project files. Regexp.Contains() is one of the simple helper methods I wrote for regular expression comparisons (shown below).
Note that the regular expression I used allows Monarch_02, Monarch_02Bridge and abcMonarch_09 as possible results. Use "^Monarch_\\d\\d$" if you want a strict rule.
Refer to Regular Expressions for details.
private static Match GetFirstMatch(string text, string pattern)
{
    Match match = Regex.Match(text, pattern, RegexOptions.None);
    return match;
}
public static Boolean Contains(string text, string pattern)
{
    return GetFirstMatch(text, pattern).Value != String.Empty;
}
Basically, use Directory.GetFiles(path) to get all the files, then use LINQ to apply conditions, order-bys and fetch the first result.
The Path, Directory and File classes can help a lot when you are working with the file system.
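The same approach with the strict pattern, using only the built-in Regex class instead of a custom helper, might look like the sketch below. The class name LogFileFinder and its method names are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class LogFileFinder
{
    // Anchored pattern: "Monarch_" followed by exactly two digits and nothing else.
    private static readonly Regex NamePattern = new Regex(@"^Monarch_\d{2}$");

    // True only for names such as "Monarch_02"; "Monarch_02bridge" is rejected.
    public static bool IsMatch(string fileNameWithoutExtension)
    {
        return NamePattern.IsMatch(fileNameWithoutExtension);
    }

    // Returns the most recently written matching file, or null if none matches.
    public static FileInfo FindNewest(string directoryPath)
    {
        return new DirectoryInfo(directoryPath)
            .EnumerateFiles()
            .Where(f => IsMatch(Path.GetFileNameWithoutExtension(f.Name)))
            .OrderByDescending(f => f.LastWriteTime)
            .FirstOrDefault();
    }
}
```

FirstOrDefault is used rather than First so that an empty or non-matching folder gives null instead of an exception; the caller has to check for that.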

How can I get a list of DirectoryInfo where the name of the directories contains a string stored in a List<string>?

Suppose I have a list of strings. These strings will be a part of the directory name that I want to open.
var listOfStrings = new List<string>(){"Foo", "Bar", "LocalHost", "SomeIPAddress"};
If this is my list, my directories might look like this:
Foo_TodaysDate_ThisFileNameIsMostlyLongAndUnhelpful
Bar_TodaysDate_ThisFileNameIsMostlyLongAndUnhelpful
LocalHost_TodaysDate_ThisFileNameIsMostlyLongAndUnhelpful
SomeIPAddress_TodaysDate_ThisFileNameIsMostlyLongAndUnhelpful
So I have the code here to load the directory info into a list:
m_jsonDirectories = new DirectoryInfo(@"C:\ProgramData\SCLDDB\ReportLogs\")
    .GetDirectories()
    .OrderByDescending(p_f => p_f.LastWriteTime)
    .ToList();
Right now, I can load all the directories in the master directory into my variable, but I want to add something like:
.Where(x => x.Name.Contains(/*A string found in my List above*/))
Edit: in the above statement, the parameter x is of type DirectoryInfo. So x.Name should return the Name of the Directory.
I don't know how to search
List.Any(s => string.Contains(s))
when I don't have a string variable already set. And ideally I'd just want to search each element of my list for a match without individually setting some temporary string variable.
.Where(x => listOfStrings.Any(c => x.Name.Contains(c))) is what you are looking for.
So you have a sequence of DirectoryInfos, and a sequence of strings.
You want to filter the sequence of DirectoryInfos so that only those DirectoryInfos remain whose Name starts with at least one of the strings in your sequence of strings.
So if your sequence of strings contains "Foo", then your end result should at least contain all DirectoryInfos whose Name starts with Foo.
IEnumerable<string> strings = ...
IEnumerable<DirectoryInfo> directoryInfos = ...
var result = directoryInfos
    .Where(directoryInfo => strings
        .Any(str => directoryInfo.Name.StartsWith(str)));
In words:
From the sequence of all DirectoryInfos, keep only those DirectoryInfos whose Name starts with any of the strings in the sequence of strings.
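Put together with the sorting from the question, the filter could be wrapped up roughly as follows. This is a sketch under assumptions: the class name DirectoryFilter and its methods are hypothetical, and the case-insensitive comparison is a choice made here, not something either answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class DirectoryFilter
{
    // True when the directory name starts with at least one of the prefixes.
    // OrdinalIgnoreCase is an assumption; use Ordinal for a case-sensitive match.
    public static bool MatchesAny(string directoryName, IEnumerable<string> prefixes)
    {
        return prefixes.Any(p => directoryName.StartsWith(p, StringComparison.OrdinalIgnoreCase));
    }

    // Subdirectories of 'root' whose names match, newest first.
    public static List<DirectoryInfo> Find(string root, IEnumerable<string> prefixes)
    {
        return new DirectoryInfo(root)
            .GetDirectories()
            .Where(d => MatchesAny(d.Name, prefixes))
            .OrderByDescending(d => d.LastWriteTime)
            .ToList();
    }
}
```

With the list from the question, DirectoryFilter.Find(path, listOfStrings) would return only the Foo_, Bar_, LocalHost_ and SomeIPAddress_ directories. Swapping StartsWith for Contains gives the behaviour of the shorter answer.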

getting files with max dates

I have a list of files:
fileA_20180103110932
fileA_20180103111001
fileB_20180103110901
fileC_20180103110932
fileC_20180103111502
Per file name, I need to get the latest date. So the result set would be:
fileA_20180103111001
fileB_20180103110901
fileC_20180103111502
How would I do that with lambda expressions?
At a high level, I think I have to group by file name (so take the substring up to the underscore) and then get the max date for those file names that have a count > 2.
Something like this should work:
var files = new List<string>
{
    "fileA_20180103110932",
    "fileA_20180103111001",
    "fileB_20180103110901",
    "fileC_20180103110932",
    "fileC_20180103111502"
};
var results = files
    .Select(f => f.Split('_'))
    .GroupBy(p => p[0], p => p[1])
    .Select(g => g.Key + "_" + g.Max());
Apparently all your files have exactly one underscore in their file names. The fact that you define the part after the underscore as the "date of the file" is irrelevant to your problem. What is relevant is that your filenames have an underscore, a part before the underscore and a part after the underscore.
Besides, a filename is not a file; it is just a string with some limitations, especially your coded filenames.
So your problem would be like this:
Given a sequence of strings, where every string has exactly one underscore. The part before the underscore is called MainPart, the part after the underscore is called SortablePart (this is what you would call the "date of the file").
Your requirement would be:
I want a linq statement that has as input this sequence of strings and
as output a sequence of strings containing the MainPart of the input
strings, followed by an underscore, followed by the first value of all
SortableParts of strings with the same MainPart ordered in descending
order.
Having rephrased your problem your linq statement is fairly easy. You'll need a function to split your input strings into MainPart and SortablePart. I'll do this using String.Split
var result = fileNames
    .Select(inputString => inputString.Split(new char[] {'_'}))
    .Select(splitStringArray => new
    {
        MainPart = splitStringArray[0],
        SortablePart = splitStringArray[1],
    })
    // now easy to group by MainPart:
    .GroupBy(
        item => item.MainPart,      // make groups with same MainPart, will be the key
        item => item.SortablePart)  // the elements of the group
    // for every group, sort the elements descending and take only the first element
    .Select(group => new
    {
        MainPart = group.Key,
        NewestElement = group                                // take all group elements
            .OrderByDescending(groupElement => groupElement) // sort in descending order
            .First(),
    })
    // I know every group has at least one element, otherwise it wouldn't be a group
    // now form the file name:
    .Select(item => item.MainPart + '_' + item.NewestElement);
This is one horrible linq statement!
Besides it will crash if your file names have no underscore at all. It is very difficult to guarantee the filenames are all correctly coded.
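A variant that tolerates badly coded names could simply skip every string that does not split into exactly two parts. The sketch below assumes that skipping (rather than throwing) is acceptable and that the sortable parts are fixed-width digit strings, so an ordinal string comparison orders them by time; the names NewestPerName and Pick are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class NewestPerName
{
    public static List<string> Pick(IEnumerable<string> fileNames)
    {
        return fileNames
            .Select(name => name.Split('_'))
            // skip names that are not of the form "main_sortable"
            .Where(parts => parts.Length == 2)
            // key: part before the underscore, elements: part after it
            .GroupBy(parts => parts[0], parts => parts[1])
            // per group, take the largest sortable part (ordinal comparison)
            .Select(g => g.Key + "_" + g.OrderByDescending(s => s, StringComparer.Ordinal).First())
            .ToList();
    }
}
```

Names with no underscore, or with more than one, are silently dropped here; whether that is the right policy depends on how much the naming scheme can be trusted.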
If your coded filenames are something you use widely in your application, my advice would be to create a class for them, plus some functions that make conversion to a filename (string) and back easier. This would make your coded filenames easier for others to understand, easier to change if needed, and above all: you can be certain that the filenames are coded correctly.
class CodedFileName
{
    private const char separator = '_';
    public string MainPart {get; private set;}
    public string TimePart {get; private set;}
}
This makes it easier if you decide to change your separator, or to accept several separators (old filenames using an underscore, and new filenames using a minus sign).
You'd also need a proper constructor:
public CodedFileName(string mainPart, DateTime fileDate) {...}
And maybe constructors that take a filename and throw an exception if it is not properly coded, together with a static check:
public CodedFileName(string fileName) {...}
public CodedFileName(FileInfo fileInfo) {...}
public static bool IsProperlyCoded(string fileName) {...}
and of course a ToString():
public override string ToString()
{
    return this.MainPart + separator + this.TimePart;
}
TODO: if needed consider defining equality, IEquatable, IComparable, ICloneable, etc.
Having done this, you can be certain that your filenames will always be properly coded. They are much easier for others to understand, much easier to change and thus to maintain, and finally your linq query becomes much easier to understand, maintain and test:
As an extension function: see Extension methods demystified
static class CodedFileNameExtensions
{
    public static CodedFileName Newest(this IEnumerable<CodedFileName> source)
    {
        // TODO: exception if null or source empty
        return source.OrderByDescending(sourceElement => sourceElement.TimePart)
            .First();
    }
    public static CodedFileName NewestOrDefault(this IEnumerable<CodedFileName> source)
    {
        // TODO: exception if null source
        if (source.Any())
            return source.Newest();
        else
            return null;
    }
    public static IEnumerable<CodedFileName> ExtractNewest(this IEnumerable<CodedFileName> source)
    {
        // one group per MainPart; keep only the newest element of each group
        return source
            .GroupBy(sourceElement => sourceElement.MainPart)
            .Select(group => group.Newest());
    }
}
Usage will be:
IEnumerable<string> fileNames = ...
IEnumerable<string> correctlyCodedFileNames = fileNames
    .Where(fileName => CodedFileName.IsProperlyCoded(fileName));
IEnumerable<CodedFileName> codedFileNames = correctlyCodedFileNames
    .Select(correctlyCodedFileName => new CodedFileName(correctlyCodedFileName));
IEnumerable<CodedFileName> newestFiles = codedFileNames.ExtractNewest();
Or in one statement:
IEnumerable<CodedFileName> newestFiles = fileNames
    .Where(fileName => CodedFileName.IsProperlyCoded(fileName))
    .Select(fileName => new CodedFileName(fileName))
    .ExtractNewest();
Now isn't that much easier to understand? And all this with less than one page of code.
So if you use your coded file names all over your project, my advice would be to consider creating a class for them.

How do I store only the folder names as an array and not the whole path (C#)

I know how to store the full paths, but not just the final folder names. For example, I've already got an array, but is there a method to remove certain characters from all of its elements, or to get just the folder name from a path?
Edit: string[] allFolders = Directory.GetDirectories(directory);
That's what I use to get all the folders, but it gives me the whole path.
Edit: They need to be stored in an array.
Edit: Sorry, I need an array with values such as "mpbeach","blabla","keyboard" and not E:\Zmod\idk\DLC List Generator\DLC List Generator by Frazzlee\ , so basically not the full path.
This works.
string[] allFolders = Directory.EnumerateDirectories(directory)
.Select(d => new DirectoryInfo(d).Name).ToArray();
This also works. Difference is we are using List<string> instead of string[]
List<string> allFolders = Directory.EnumerateDirectories(directory)
.Select(d => new DirectoryInfo(d).Name).ToList();
(Screenshots omitted. Example 1: the test folder and the string[] allFolders result in the VS IDE in debug mode. Example 2: the same for List<string> allFolders.)
No need for string operations... Just use DirectoryInfo class
var allFolders = new DirectoryInfo(directory).GetDirectories()
.Select(x => x.Name)
.ToArray();
NOTE: I'm taking your question to mean how to extract the last folder name from a file path. Others are reading this as how to extract the names of the folders in a directory. If I'm wrong, it's because I'm misinterpreting your question.
Split by backslash to get the parts. The last value is the file name, so the next-to-last value is the last folder:
string folder = @"c:\mydrive\testfolder\hello.txt";
string[] parts = folder.Split('\\');
string lastFolder = parts[parts.Length - 2];
//Yields "testfolder";
Moving that forward to what you want:
private string[] foldersOnly(){
    List<string> folders = new List<string>();
    string[] allFolders = Directory.GetDirectories(directory);
    foreach(string folder in allFolders){
        string[] parts = folder.Split('\\');
        // GetDirectories returns directory paths, so the last part is the folder name
        folders.Add(parts[parts.Length-1]);
    }
    return folders.ToArray();
}
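As an alternative to splitting on a hard-coded backslash, the last segment can be taken with Path.GetFileName, which uses the separators of the platform the code runs on. A minimal sketch follows; the class name FolderNames and its methods are hypothetical.

```csharp
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class FolderNames
{
    // Last segment of a path, ignoring any trailing separator.
    public static string LastSegment(string path)
    {
        string trimmed = path.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return Path.GetFileName(trimmed);
    }

    // Names (not full paths) of the direct subdirectories of 'directory'.
    public static string[] NamesOf(string directory)
    {
        return Directory.EnumerateDirectories(directory)
            .Select(p => LastSegment(p))
            .ToArray();
    }
}
```

The TrimEnd call matters because Path.GetFileName returns an empty string for a path that ends in a separator.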

Select Files From Folder Depending on Name Convention

I receive a bunch of XML files in a folder.
I want to keep checking for files that have the following naming convention:
sr-{first_id}-{second_id}-matchresults.xml
To parse as soon as I receive one.
For example:
sr-40-24-standings.xml
sr-40-24-results.xml
sr-40-24-j7844-matchresults.xml
I should select this one: sr-40-24-j7844-matchresults.xml
What should come after the following line to select files by their naming convention from an ASP web service?
Dim files As IO.FileInfo() = FolderName.GetFiles("*.xml")
private bool IsValid(string value)
{
    // the dot is escaped and the pattern is anchored at both ends
    string regexString = @"^sr-([a-z0-9]+)-([a-z0-9-]+)-matchresults\.xml$";
    return Regex.IsMatch(value, regexString);
}
This method tells you whether a file name has the specified format (sr-{first_id}-{second_id}-matchresults.xml).
Note: with this pattern the ids can contain alphanumeric characters and, in the second id, the "-" symbol. If you don't want that symbol in an id, the pattern becomes:
string regexString = @"^sr-([a-z0-9]+)-([a-z0-9]+)-matchresults\.xml$";
You can use a regular expression:
var pattern = new Regex(@"^sr-.............$");
And then apply a "filter" on Directory.GetFiles to retrieve only the files matching this pattern. GetFiles returns full paths, so match against the file name only:
var files = Directory.GetFiles("path to files", "*.xml").Where(path => pattern.IsMatch(Path.GetFileName(path))).ToList();
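Combining the two answers, a complete filter might look like the sketch below. The class name MatchResultFiles is hypothetical, and the pattern assumes that ids consist of lowercase letters and digits, with hyphens allowed inside the second id so that sr-40-24-j7844-matchresults.xml is accepted.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class MatchResultFiles
{
    // Assumed id alphabet: lowercase letters and digits; the second id may contain "-".
    private static readonly Regex Pattern =
        new Regex(@"^sr-[a-z0-9]+-[a-z0-9-]+-matchresults\.xml$");

    // Checks a bare file name (no directory part).
    public static bool IsMatchResultFile(string fileName)
    {
        return Pattern.IsMatch(fileName);
    }

    // Full paths of the matching files in 'folder'.
    public static List<string> Find(string folder)
    {
        return Directory.GetFiles(folder, "*.xml")
            .Where(path => IsMatchResultFile(Path.GetFileName(path)))
            .ToList();
    }
}
```

To react as soon as a file arrives, Find could be called on a timer, or the same IsMatchResultFile check could be applied inside a FileSystemWatcher event handler.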
