Directory.GetFiles with date in name pattern - c#

I have an issue searching for files with the Directory class. I have a lot of files with names similar to this:
XXX_YYYYMMDD_HHMMSS.
I want to list only the files whose names contain a date in that format. Directory.GetFiles() supports search patterns, but I do not know whether there is a pattern that lets me filter on the name containing a date in that format. I thought about using the creation date or the modification date, but those are not the same as the date in the name, which is the one I need to use.
Does anyone know how to help me? Thanks!

What about using regex? You could probably use something like this:
private static readonly Regex DateFileRegex = new Regex(@".*_[0-9]{8}_[\d]{6}.*");
public IEnumerable<string> EnumerateDateFiles(string path)
{
return Directory.EnumerateFiles(path)
.Where(x => this.IsValidDateFile(x));
}
private bool IsValidDateFile(string filename)
{
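// Test the file name only, so that directory names in the path cannot produce a match.
return DateFileRegex.IsMatch(Path.GetFileName(filename));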
}
The pattern:
.*_[0-9]{8}_[\d]{6}.*
matches
AHJDJKA_20180417_113028sad.jpg
for example.
Note: I couldn't test the code right now, but you'll get the idea.

I suggest two-step filtering:
Wildcards (rough filtering): XXX_YYYYMMDD_HHMMSS, where XXX, YYYYMMDD and HHMMSS are arbitrary characters.
Fine filtering (LINQ), where we ensure that YYYYMMDD_HHMMSS is a proper date.
Something like this:
var files = Directory
.EnumerateFiles(@"c:\MyFiles", "*_????????_??????.*") // rough filtering
.Where(file => { // fine filtering
string name = Path.GetFileNameWithoutExtension(file);
// from the second-to-last underscore '_'
string at = name.Substring(name.LastIndexOf('_', name.LastIndexOf('_') - 1) + 1);
return DateTime.TryParseExact( // is it a proper date?
at,
"yyyyMMdd'_'HHmmss",
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeLocal,
out var _date); })
.ToArray(); // Finally, we want an array

Related

Enumerate Files

I want to get the files whose names contain 14 digits.
foreach (var file_path in Directory.EnumerateFiles(@"F:\apinvoice", "*.pdf"))
{
}
I need to get only the files whose names have 14 digits:
16032021133026
17032021120457
17032021120534
I would go with a regex, which lets you specify the pattern.
You said you want 14 digits, which means it should ignore names like
a1603202113302
because that name contains a letter.
The pattern is therefore
^[0-9]{14}$
and the full code is:
Regex rx = new Regex("^[0-9]{14}$");
Directory
.EnumerateFiles(@"F:\apinvoice", "*.pdf")
.Where(x => rx.IsMatch(Path.GetFileNameWithoutExtension(x)));
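Or assign the paths to a list and filter on the length of the name:
List<string> list = Directory.EnumerateFiles(@"F:\apinvoice", "*.pdf").ToList();
// Compare the length of the name, not of the full path. This checks only the length, not that every character is a digit.
List<string> whatyouwant = list.Where(l => Path.GetFileNameWithoutExtension(l).Length == 14).ToList();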
Since these seem to be timestamps, another thing you could do is this:
foreach (var file_path in Directory.EnumerateFiles(@"F:\apinvoice", "*.pdf"))
{
    // Parse the file name only; the full path (directory plus extension) would never match the format.
    string name = Path.GetFileNameWithoutExtension(file_path);
    DateTime dateTimeParsed;
    var dateTimeParsedSuccessfully = DateTime.TryParseExact(name, "ddMMyyyyHHmmss", CultureInfo.InvariantCulture, DateTimeStyles.None, out dateTimeParsed);
    if (dateTimeParsedSuccessfully)
    {
        // Got a valid file, add it to a list or something.
    }
}
Also see:
https://learn.microsoft.com/en-us/dotnet/api/system.datetime.tryparseexact?view=net-5.0
https://learn.microsoft.com/en-us/dotnet/api/system.datetime.parseexact?view=net-5.0
Of course, the timestamp will often be at the end of the file name, so if there are other characters in front of it, you may want to take the name without its extension (Path.GetFileNameWithoutExtension(file_path)) and pass only its last 14 characters, Substring(Length - 14), to TryParseExact().
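The trailing-timestamp idea could be sketched roughly as below. The class and method names (TimestampNames, TryGetTrailingTimestamp) are made up for illustration, and the sketch assumes the timestamp is always the last 14 characters of the name before the extension, in ddMMyyyyHHmmss order as in the example names.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class TimestampNames
{
    // Hypothetical helper: tries to read a ddMMyyyyHHmmss timestamp from the
    // last 14 characters of the file name (directory and extension are ignored).
    public static bool TryGetTrailingTimestamp(string filePath, out DateTime timestamp)
    {
        timestamp = default(DateTime);
        string name = Path.GetFileNameWithoutExtension(filePath);
        if (name == null || name.Length < 14)
        {
            return false; // too short to contain a 14-digit timestamp
        }

        string tail = name.Substring(name.Length - 14);
        return DateTime.TryParseExact(
            tail,
            "ddMMyyyyHHmmss",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out timestamp);
    }
}
```

Inside the foreach loop you would then call TimestampNames.TryGetTrailingTimestamp(file_path, out var parsed) and keep the file only when it returns true.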

Changing file name

Consider the following code snippet
public static string AppendDateTimeToFileName(this string fileName)
{
return string.Concat(
Path.GetFileNameWithoutExtension(fileName),
DateTime.Now.ToString("yyyyMMddHHmmssfff"),
Path.GetExtension(fileName));
}
This basically puts a date-time stamp on any file that is uploaded by the users. This works great if the file name is something like
MyFile.png
AnotherFile.png
Now I'm trying to change this method so if the file name is something like
MyFile - Copy(1).png
AnotherFile - Copy(1).png
I want the file name to become
MyFile-Copy-120170303131815555.png
AnotherFile-Copy-120170303131815555.png
Is there an easy solution for this with regex or similar, or do I have to rewrite the method and check each of those values one by one?
return string.Concat(
Regex.Replace(Path.GetFileNameWithoutExtension(fileName), @" - Copy\s*\(\d+\)", "-Copy-", RegexOptions.IgnoreCase),
DateTime.Now.ToString("yyyyMMddHHmmssfff"),
Path.GetExtension(fileName));
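This matches any number of digits inside the parentheses and replaces every occurrence. Note that it drops the copy number itself, so "MyFile - Copy(1).png" becomes "MyFile-Copy-" followed by the timestamp and ".png".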
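If the copy number should be kept, as in the desired MyFile-Copy-120170303131815555.png, a capture group can carry it into the replacement. This is only a sketch: the class name FileNameStamps is made up, and the timestamp is passed in as a parameter (instead of reading DateTime.Now inside the method) so the result is predictable.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

public static class FileNameStamps
{
    // Hypothetical variant of AppendDateTimeToFileName that keeps the copy number.
    public static string AppendDateTimeToFileName(string fileName, DateTime now)
    {
        // " - Copy(1)" becomes "-Copy-1"; the digits are captured as group 1.
        string baseName = Regex.Replace(
            Path.GetFileNameWithoutExtension(fileName),
            @"\s*-\s*Copy\s*\((\d+)\)",
            "-Copy-$1",
            RegexOptions.IgnoreCase);

        return string.Concat(
            baseName,
            now.ToString("yyyyMMddHHmmssfff", CultureInfo.InvariantCulture),
            Path.GetExtension(fileName));
    }
}
```

Names without a " - Copy(n)" part are left alone by the replace, so MyFile.png still just gets the timestamp appended.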

How to get the files in numeric order from the specified directory in c#?

I have to retrieve a list of file names from a specific directory in numeric order. The file names are a combination of strings and numeric values, but they end with numeric values.
For example: page_1.png, page_2.png, page_3.png, ..., page_10.png, page_11.png, page_12.png, ...
My C# code is below:
string filePath="D:\\vs-2010projects\\delete_sample\\delete_sample\\myimages\\";
string[] filePaths = Directory.GetFiles(filePath, "*.png");
It retrieves the files in the following order:
page_1.png
page_10.png
page_11.png
page_12.png
page_2.png...
I am expecting to retrieve the list ordered like this:
page_1.png
page_2.png
page_3.png
[...]
page_10.png
page_11.png
page_12.png
Ian Griffiths has a natural sort for C#. It makes no assumptions about where the numbers appear, and even correctly sorts filenames with multiple numeric components, such as app-1.0.2, app-1.0.11.
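To give an idea of what a natural sort does, here is a rough sketch of a comparer. It is not Ian Griffiths' implementation; the class name NaturalStringComparer and the chunking approach are my own assumptions. It splits each name into runs of digits and runs of non-digits and compares the digit runs by numeric value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical comparer for illustration only.
public sealed class NaturalStringComparer : IComparer<string>
{
    // Splits a string into alternating runs of ASCII digits and non-digits.
    private static readonly Regex Chunks = new Regex("[0-9]+|[^0-9]+");

    private static bool IsNumber(string chunk)
    {
        return chunk[0] >= '0' && chunk[0] <= '9';
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        MatchCollection a = Chunks.Matches(x);
        MatchCollection b = Chunks.Matches(y);
        int count = Math.Min(a.Count, b.Count);

        for (int i = 0; i < count; i++)
        {
            string left = a[i].Value;
            string right = b[i].Value;
            int result;

            if (IsNumber(left) && IsNumber(right))
            {
                // Numeric comparison without overflow: drop leading zeros,
                // then the shorter run is the smaller number.
                string l = left.TrimStart('0');
                string r = right.TrimStart('0');
                result = l.Length != r.Length
                    ? l.Length.CompareTo(r.Length)
                    : string.CompareOrdinal(l, r);
            }
            else
            {
                result = string.Compare(left, right, StringComparison.OrdinalIgnoreCase);
            }

            if (result != 0) return result;
        }

        // All shared chunks are equal: the name with fewer chunks sorts first.
        return a.Count.CompareTo(b.Count);
    }
}
```

With this, Array.Sort(filePaths, new NaturalStringComparer()) or filePaths.OrderBy(p => p, new NaturalStringComparer()) puts page_2.png before page_10.png.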
You can try the following code, which sorts your file names by their numeric values. Keep in mind that this logic relies on conventions such as the presence of '_'. You are free to make the code more defensive to cover other cases.
var vv = new DirectoryInfo(@"C:\Image").GetFileSystemInfos("*.bmp").OrderBy(fs=>int.Parse(fs.Name.Split('_')[1].Substring(0, fs.Name.Split('_')[1].Length - fs.Extension.Length)));
First you can extract the number:
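static int ExtractNumber(string text)
{
    Match match = Regex.Match(text, @"_(\d+)\.(png)");
    // Regex.Match never returns null; check Success instead.
    if (!match.Success)
    {
        return 0;
    }
    int value;
    // Parse the captured digits (group 1), not the whole match such as "_10.png".
    if (!int.TryParse(match.Groups[1].Value, out value))
    {
        return 0;
    }
    return value;
}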
Then you could sort your list using:
list.Sort((x, y) => ExtractNumber(x).CompareTo(ExtractNumber(y)));
Maybe this?
string[] filePaths = Directory.GetFiles(filePath, "*.png").OrderBy(n => n).ToArray();
EDIT: As Marcelo pointed out, I believe you can get all the file names, extract their numeric part with a regex, and then sort the file names by that number.
This code would do that:
var dir = @"C:\Pictures";
var sorted = (from fn in Directory.GetFiles(dir)
              // Match against the file name only, so digits in the directory path are ignored.
              let m = Regex.Match(Path.GetFileName(fn), @"(?<order>\d+)")
              where m.Success
              let n = int.Parse(m.Groups["order"].Value)
              orderby n
              select fn).ToList();
foreach (var fn in sorted) Console.WriteLine(fn);
It also filters out files that do not have a number in their names.
You may want to change the regex pattern to match more specific name structures for file names.

In C#, what is the best way to parse out this value from a string?

I have to parse out the system name from a larger string. The system name has a prefix of "ABC" and then a number. Some examples are:
ABC500
ABC1100
ABC1300
The full string that I need to parse the system name out of can look like any of the items below:
ABC1100 - 2ppl
ABC1300
ABC 1300
ABC-1300
Managers Associates Only (ABC1100 - 2ppl)
Before I saw the last one, I had this code that worked pretty well:
string[] trimmedStrings = jobTitle.Split(new char[] { '-', '–' },StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Trim())
.ToArray();
return trimmedStrings[0];
but it fails on the last example where there is a bunch of other text before the ABC.
Can anyone suggest a more elegant and future proof way of parsing out the system name here?
One way to do this:
string[] strings =
{
"ABC1100 - 2ppl",
"ABC1300",
"ABC 1300",
"ABC-1300",
"Managers Associates Only (ABC1100 - 2ppl)"
};
var reg = new Regex(@"ABC[\s,-]?[0-9]+");
var systemNames = strings.Select(line => reg.Match(line).Value);
systemNames.ToList().ForEach(Console.WriteLine);
prints:
ABC1100
ABC1300
ABC 1300
ABC-1300
ABC1100
You really could leverage a Regex and get better results. This one should do the trick: [A-Za-z]{3}\d+, and here is a Rubular to prove it. Then in the code use it like this:
var matches = Regex.Match(someInputString, @"[A-Za-z]{3}\d+");
if (matches.Success) {
var val = matches.Value;
}
You can use a regular expression to parse this. There may be better expressions, but this one works for your case:
using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string txt="ABC500";
string re1="((?:[a-z][a-z]+))";
string re2="(\\d+)";
Regex r = new Regex(re1+re2,RegexOptions.IgnoreCase|RegexOptions.Singleline);
Match m = r.Match(txt);
if (m.Success)
{
String word1=m.Groups[1].ToString();
String int1=m.Groups[2].ToString();
Console.Write("("+word1.ToString()+")"+"("+int1.ToString()+")"+"\n");
}
}
}
}
You should definitely use Regex for this. Depending on the exact nature of the system name, something like this could prove to be enough:
Regex systemNameRegex = new Regex(@"ABC[0-9]+");
If the ABC part of the name can change, you can modify the Regex to something like this:
Regex systemNameRegex = new Regex(@"[a-zA-Z]+[0-9]+");
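If "ABC 1300" and "ABC-1300" should come out in the same form as "ABC1300", one option is to capture only the digits and rebuild the name. This is a sketch under assumptions: the class and method names (SystemNames, Extract) are made up, the prefix is taken to be the literal "ABC", and at most one space or hyphen is allowed between the prefix and the number.

```csharp
using System.Text.RegularExpressions;

public static class SystemNames
{
    // Assumed format: "ABC", an optional single whitespace or hyphen, then digits.
    private static readonly Regex SystemNameRegex = new Regex(@"ABC[\s-]?([0-9]+)");

    // Returns the normalized system name (for example "ABC1300"),
    // or null when the text contains no system name.
    public static string Extract(string text)
    {
        Match m = SystemNameRegex.Match(text);
        return m.Success ? "ABC" + m.Groups[1].Value : null;
    }
}
```

Because the regex is not anchored, it also finds the name inside longer text such as "Managers Associates Only (ABC1100 - 2ppl)".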

Regex required for renaming file in C#

I need a regex for renaming files in C#. My file name is 22px-Flag_Of_Sweden.svg.png, and I want to rename it to sweden.png.
So for that I need a regex. Please help me.
I have more than 300 files like the ones below:
22px-Flag_Of_Sweden.svg.png - should become sweden.png
13px-Flag_Of_UnitedStates.svg.png - unitedstates.png
17px-Flag_Of_India.svg.png - india.png
22px-Flag_Of_Ghana.svg.png - ghana.png
These are actually country flags. I want to extract Countryname.Fileextension. That's all.
var fileNames = new [] {
"22px-Flag_Of_Sweden.svg.png"
,"13px-Flag_Of_UnitedStates.svg.png"
,"17px-Flag_Of_India.svg.png"
,"22px-Flag_Of_Ghana.svg.png"
,"asd.png"
};
var regEx = new Regex(@"^.+Flag_Of_(?<country>.+)\.svg\.png$");
foreach ( var fileName in fileNames )
{
if ( regEx.IsMatch(fileName))
{
var newFileName = regEx.Replace(fileName,"${country}.png").ToLower();
// File.Move(Path.Combine(root, fileName), Path.Combine(root, newFileName));
}
}
I am not exactly sure how this would look in C# (although the regex is what matters, not the language), but in Java it would look like this:
String input = "22px-Flag_Of_Sweden.svg.png";
Pattern p = Pattern.compile(".+_(.+?)\\..+?(\\..+?)$");
Matcher m = p.matcher(input);
System.out.println(m.matches());
System.out.println(m.group(1).toLowerCase() + m.group(2));
The relevant part for you is this:
".+_(.+?)\\..+?(\\..+?)$"
Just concat the two groups.
I wish I knew a bit of C# right now :)
Cheers Eugene.
This will return country in the first capture group: ([a-zA-Z]+)\.svg\.png$
I don't know c# but the regex could be:
^.+_(\pL+)\.svg\.png
and the replace part is : $1.png
