getting files with max dates - c#

I have a list of files:
fileA_20180103110932
fileA_20180103111001
fileB_20180103110901
fileC_20180103110932
fileC_20180103111502
Per file name, I need to get the latest date. So the result set would be:
fileA_20180103111001
fileB_20180103110901
fileC_20180103111502
How would I do that with lambda expressions?
At a high level, I think I have to group by file name (the substring up to the underscore) and then get the max date for those file names that have a count > 2.

Something like this should work:
var files = new List<string>
{
"fileA_20180103110932",
"fileA_20180103111001",
"fileB_20180103110901",
"fileC_20180103110932",
"fileC_20180103111502"
};
var results = files
.Select(f => f.Split('_'))
.GroupBy(p => p[0], p => p[1])
.Select(g => g.Key + "_" + g.Max());
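Note that g.Max() above compares the timestamp parts as strings, which only gives the right answer while every timestamp has the same fixed width. If you would rather compare real dates, a minimal sketch could look like the following. The class and method names (LatestFiles, LatestPerName) and the "yyyyMMddHHmmss" format are assumptions inferred from the sample names, and the code throws on names that have no underscore or an unparsable timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class LatestFiles
{
    // Assumed timestamp layout, inferred from the sample names in the question.
    private const string StampFormat = "yyyyMMddHHmmss";

    // Returns, for each name prefix, the entry with the most recent timestamp.
    public static List<string> LatestPerName(IEnumerable<string> files)
    {
        return files
            .Select(f =>
            {
                int cut = f.LastIndexOf('_'); // split at the last underscore
                return new
                {
                    Full = f,
                    Name = f.Substring(0, cut),
                    Stamp = DateTime.ParseExact(
                        f.Substring(cut + 1), StampFormat, CultureInfo.InvariantCulture)
                };
            })
            .GroupBy(x => x.Name)
            // Within each group, keep the original string of the newest entry.
            .Select(g => g.OrderByDescending(x => x.Stamp).First().Full)
            .ToList();
    }
}
```

Calling LatestFiles.LatestPerName(files) with the five sample names should yield one entry per prefix, in the order the prefixes first appear.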

Apparently all your files have exactly one underscore in their file names. The fact that you define the part after the underscore as the "date of the file" is irrelevant to your problem. What is relevant is that your filenames have an underscore, a part before the underscore and a part after the underscore.
Besides, a filename is not a file; it is just a string with some restrictions, especially your coded filenames.
So your problem would be like this:
Given a sequence of strings, where every string has exactly one underscore. The part before the underscore is called MainPart, the part after the underscore is called SortablePart (this is what you would call the "date of the file").
Your requirement would be:
I want a linq statement that has as input this sequence of strings and
as output a sequence of strings containing the MainPart of the input
strings, followed by an underscore, followed by the first value of all
SortableParts of strings with the same MainPart ordered in descending
order.
Having rephrased your problem your linq statement is fairly easy. You'll need a function to split your input strings into MainPart and SortablePart. I'll do this using String.Split
var result = fileNames
.Select(inputString => inputString.Split(new char[] {'_'}))
.Select(splitStringArray => new
{
MainPart = splitStringArray[0],
SortablePart = splitStringArray[1],
})
// now easy to group by MainPart:
.GroupBy(
item => item.MainPart, // make groups with same MainPart, will be the key
item => item.SortablePart) // the elements of the group
// for every group, sort the elements descending and take only the first element
.Select(group => new
{
MainPart = group.Key,
NewestElement = group // take all group elements
.OrderByDescending(groupElement => groupElement) // sort in descending order
.First(),
})
// I know every group has at least one element, otherwise it wouldn't be a group
// now form the file name:
.Select(item => item.MainPart + '_' + item.NewestElement);
This is one horrible linq statement!
Besides it will crash if your file names have no underscore at all. It is very difficult to guarantee the filenames are all correctly coded.
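One way to avoid that crash is to filter out malformed names before grouping. This is only a sketch: the names SafeLatest and LatestPerName are made up for illustration, and silently skipping bad names (rather than reporting them) is an assumption about what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SafeLatest
{
    // Keeps only names of the form "main_sortable" (exactly one underscore,
    // non-empty parts) and returns the greatest sortable part per main part.
    public static List<string> LatestPerName(IEnumerable<string> fileNames)
    {
        return fileNames
            .Select(name => name.Split('_'))
            .Where(parts => parts.Length == 2
                         && parts[0].Length > 0
                         && parts[1].Length > 0)
            .GroupBy(parts => parts[0], parts => parts[1])
            // Ordinal comparison: plain character order, independent of culture.
            .Select(g => g.Key + "_" +
                         g.OrderByDescending(s => s, StringComparer.Ordinal).First())
            .ToList();
    }
}
```

Names such as "broken" or "a_b_c" are dropped instead of causing an IndexOutOfRangeException.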
If your coded filenames are something you use widely in your application, my advice would be to create a class for them, plus some functions that make conversion to a filename (string) and back easier. This would make your coded filenames easier for others to understand, easier to change if needed, and above all: you can be certain that the filenames are coded correctly.
class CodedFileName
{
private const char separator = '_';
public string MainPart {get; private set;}
public string TimePart {get; private set;}
}
This makes it easier if you decide to change your separator, or accept several separators (old filenames using underscore, and new filenames using minus sign)
You'd also need a proper constructor:
public CodedFileName(string mainPart, DateTime fileDate) {...}
And maybe constructors that take a filename and throw an exception if it is not properly coded:
public CodedFileName(string fileName) {...}
public CodedFileName(FileInfo fileInfo) {...}
public static bool IsProperlyCoded(string fileName) {...}
and of course a ToString():
public override string ToString()
{
return this.MainPart + separator + this.TimePart;
}
TODO: if needed consider defining equality, IEquatable, IComparable, ICloneable, etc.
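Put together, the fragments above might look something like this. It is a sketch under assumptions, not a finished class: TryParse is a helper I added for illustration, the time part is kept as a plain string instead of the DateTime constructor mentioned above, and "properly coded" is assumed to mean exactly one underscore with text on both sides.

```csharp
using System;

public sealed class CodedFileName
{
    private const char Separator = '_';

    public string MainPart { get; }
    public string TimePart { get; }

    public CodedFileName(string mainPart, string timePart)
    {
        if (string.IsNullOrEmpty(mainPart))
            throw new ArgumentException("Main part is required.", nameof(mainPart));
        if (string.IsNullOrEmpty(timePart))
            throw new ArgumentException("Time part is required.", nameof(timePart));
        MainPart = mainPart;
        TimePart = timePart;
    }

    // Returns false instead of throwing when the name is not "main_time".
    public static bool TryParse(string fileName, out CodedFileName result)
    {
        result = null;
        if (fileName == null) return false;
        int cut = fileName.IndexOf(Separator);
        // Require exactly one separator, with text on both sides of it.
        if (cut <= 0 || cut == fileName.Length - 1
            || cut != fileName.LastIndexOf(Separator))
            return false;
        result = new CodedFileName(
            fileName.Substring(0, cut), fileName.Substring(cut + 1));
        return true;
    }

    public static bool IsProperlyCoded(string fileName)
    {
        CodedFileName ignored;
        return TryParse(fileName, out ignored);
    }

    public override string ToString()
    {
        return MainPart + Separator + TimePart;
    }
}
```

Keeping the parsing in one place means a change of separator only touches this class.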
Having done this, the advantages are that you are certain that your filenames will always be properly coded. Much easier to understand by others, much easier to change, and thus maintain, and finally your linq query will be much easier (to understand, maintain, test, etc):
As an extension function: see Extension methods demystified
static class CodedFileNameExtensions
{
public static CodedFileName Newest(this IEnumerable<CodedFileName> source)
{
// TODO: exception if null or source empty
return source.OrderByDescending(sourceElement => sourceElement.TimePart)
.First();
}
public static CodedFileName NewestOrDefault(this IEnumerable<CodedFileName> source)
{
// TODO: exception if null source
if (source.Any())
return source.Newest();
else
return null;
}
public static IEnumerable<CodedFileName> ExtractNewest(this IEnumerable<CodedFileName> source)
{
// group by MainPart, then keep only the newest element of every group
return source
.GroupBy(sourceElement => sourceElement.MainPart)
.Select(group => group.Newest());
}
}
Usage will be:
IEnumerable<string> fileNames = ...
IEnumerable<string> correctlyCodedFileNames = fileNames
.Where(fileName => CodedFileName.IsProperlyCoded(fileName));
IEnumerable<CodedFileName> codedFileNames = correctlyCodedFileNames
.Select(correctlyCodedFileName => new CodedFileName(correctlyCodedFileName));
IEnumerable<CodedFileName> newestFiles = codedFileNames.ExtractNewest();
Or in one statement:
IEnumerable<CodedFileName> newestFiles = fileNames
.Where(fileName => CodedFileName.IsProperlyCoded(fileName))
.Select(fileName => new CodedFileName(fileName))
.ExtractNewest();
Now isn't that much easier to understand? And all this with less than one page of code.
So if you use your coded file names all over your project, my advice would be to consider creating a class for them.

Related

How to fetch a particular filename pattern from directory

I'm trying to fetch a particular filename from a directory. The code I've tried is as below
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles("^Monarch_[0-9]{2}$") orderby files.LastWriteTime descending select files).First();
//Output : Error
List of file names (Input)
Monarch_05bridge //Date modified 16-12-2021 20:41
Monarch_04bridge //Date modified 16-12-2021 06:49
Monarch_04 //Date modified 16-12-2021 05:39
Monarch_02 //Date modified 16-12-2021 05:49
Monarch_02bridge //Date modified 14-12-2021 19:34
Monarch_01 //Date modified 14-12-2021 09:08
Code should look for files whose filename starts with Monarch_ followed by 2 numeric digits and then filter out the recently modified file
So the output should be Monarch_02
I also tried doing
DirectoryInfo dirInfo = new DirectoryInfo(directoryPath);
FileInfo recentlyModLogFile = (from files in dirInfo.GetFiles(Monarch_ + "*") orderby files.LastWriteTime descending select files).First();
//Output : Monarch_05bridge
Can someone help me to resolve this issue.
string youngestFile = Directory.GetFiles(directoryPath)
.Where(o => Regexp.Contains(Path.GetFileNameWithoutExtension(o), "Monarch_\\d\\d"))
.OrderByDescending(o => File.GetLastWriteTime(o))
.FirstOrDefault();
This is a quick copy-and-paste from my project files. Regexp.Contains() is one of the simple helper methods I wrote for regular-expression comparisons (shown below).
Note that the regular expression I used allows Monarch_02, Monarch_02Bridge and abcMonarch_09 as possible results. Use "^Monarch_\\d\\d$" if you want a strict rule.
Refer to Regular Expressions for details.
private static Match GetFirstMatch(string text, string pattern)
{
Match match = Regex.Match(text, pattern, RegexOptions.None);
return match;
}
public static Boolean Contains(string text, string pattern)
{
return GetFirstMatch(text, pattern).Value != String.Empty;
}
Basically, use Directory.GetFiles(path) to get all the files, then use LINQ to apply conditions, order-bys and fetch the first result.
The Path, Directory and File classes help a lot when you are working with the file system.
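If you don't want a custom Regexp helper, the same idea can be sketched with the built-in Regex class and the strict, anchored pattern. LogFinder, IsMatch and MostRecent are names chosen for this example only; the "Monarch_*" search pattern is just a cheap pre-filter before the regex check.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class LogFinder
{
    // Anchored pattern: "Monarch_" followed by exactly two digits, nothing else.
    private static readonly Regex NamePattern = new Regex(@"^Monarch_\d{2}$");

    // Tests the file name without its extension against the pattern.
    public static bool IsMatch(string fileName)
    {
        return NamePattern.IsMatch(Path.GetFileNameWithoutExtension(fileName));
    }

    // Returns the most recently written matching file, or null if none match.
    public static FileInfo MostRecent(string directoryPath)
    {
        return new DirectoryInfo(directoryPath)
            .EnumerateFiles("Monarch_*")
            .Where(f => IsMatch(f.Name))
            .OrderByDescending(f => f.LastWriteTime)
            .FirstOrDefault();
    }
}
```

With the file list from the question, only Monarch_04, Monarch_02 and Monarch_01 pass the filter, so the newest of those three is returned.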

Check whether a string is in a list at any order in C#

If we have a list of strings like in the following code:
List<string> XAll = new List<string>();
XAll.Add("#10#20");
XAll.Add("#20#30#40");
string S = "#30#20"; // same tokens as "#20#30", which are also contained in "#20#30#40", so S counts as present in the list
// check the unordered string S = "#30#20":
// if some list entry contains its tokens in any order (#30#20 or #20#30 ...), return true: it exists
if (XAll.Contains(S))
{
Console.WriteLine("Your String is exist");
}
I would prefer to use LINQ to check that S exists in this sense: regardless of order, some entry in the list XAll must contain at least both (#30) and (#20).
I am using
var c = item2.Intersect(item1);
if (c.Count() == item1.Length)
{
return true;
}
You should represent your data in a more meaningful way. Don't rely on strings.
For example I would suggest creating a type to represent a set of these numbers and write some code to populate it.
But there are already set types such as HashSet, which is possibly a good match and has built-in functions for testing for subsets.
This should get you started:
var input = "#20#30#40";
var hashSetOfNumbers = new HashSet<int>(input
.Split(new []{'#'}, StringSplitOptions.RemoveEmptyEntries)
.Select(s=>int.Parse(s)));
This works for me:
Func<string, string[]> split =
x => x.Split(new [] { '#' }, StringSplitOptions.RemoveEmptyEntries);
if (XAll.Any(x => split(x).Intersect(split(S)).Count() == split(S).Count()))
{
Console.WriteLine("Your String is exist");
}
Now, depending on how you want to handle duplicates, this might even be a better solution:
Func<string, HashSet<string>> split =
x => new HashSet<string>(x.Split(
new [] { '#' },
StringSplitOptions.RemoveEmptyEntries));
if (XAll.Any(x => split(S).IsSubsetOf(split(x))))
{
Console.WriteLine("Your String is exist");
}
This second approach uses pure set theory so it strips duplicates.
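The set-based check can also be wrapped in a small helper so the search string is split only once. This is a sketch; TokenSets and AnyContainsAll are hypothetical names, and note that an empty search string counts as contained in every entry, because the empty set is a subset of any set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenSets
{
    // Splits "#20#30" into the set { "20", "30" }; duplicates collapse.
    private static HashSet<string> Tokens(string s)
    {
        return new HashSet<string>(
            s.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries));
    }

    // True if at least one candidate contains every token of 'wanted',
    // in any order.
    public static bool AnyContainsAll(IEnumerable<string> candidates, string wanted)
    {
        HashSet<string> needed = Tokens(wanted); // parsed once, reused per candidate
        return candidates.Any(c => needed.IsSubsetOf(Tokens(c)));
    }
}
```

With the list from the question, TokenSets.AnyContainsAll(XAll, "#30#20") is true because "#20#30#40" contains both tokens.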

Process part of the regex match before replacing it

I'm writing a function that will parse a file similar to an XML file from a legacy system.
....
<prod pid="5" cat='gov'>bla bla</prod>
.....
<prod cat='chi'>etc etc</prod>
....
.....
I currently have this code:
buf = Regex.Replace(entry, "<prod(?:.*?)>(.*?)</prod>", "<span class='prod'>$1</span>");
Which was working fine until it was decided that we also wanted to show the categories.
The problem is, categories are optional and I need to run the category abbreviation through a SQL query to retrieve the category's full name.
eg:
SELECT * FROM cats WHERE abbr='gov'
The final output should be:
<span class='prod'>bla bla</span><span class='cat'>Government</span>
Any idea on how I could do this?
Note1: The function is done already (except this part) and working fine.
Note2: Cannot use XML libraries, regex has to be used
Regex.Replace has an overload that takes a MatchEvaluator, which is basically a Func<Match, string>. So, you can dynamically generate a replacement string.
buf = Regex.Replace(entry, @"<prod(?<attr>.*?)>(?<text>.*?)</prod>", match => {
var attrText = match.Groups["attr"].Value;
var text = match.Groups["text"].Value;
// Now, parse your attributes
var attributes = Regex.Matches(attrText, @"(?<name>\w+)\s*=\s*(['""])(?<value>.*?)\1")
.Cast<Match>()
.ToDictionary(
m => m.Groups["name"].Value,
m => m.Groups["value"].Value);
string category;
if (attributes.TryGetValue("cat", out category))
{
// Your SQL here etc...
var label = GetLabelForCategory(category);
return String.Format("<span class='prod'>{0}</span><span class='cat'>{1}</span>", WebUtility.HtmlEncode(text), WebUtility.HtmlEncode(label));
}
// Generate the result string
return String.Format("<span class='prod'>{0}</span>", WebUtility.HtmlEncode(text));
});
This should get you started.
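For reference, here is a self-contained sketch of the same MatchEvaluator approach. The Categories dictionary is a hypothetical stand-in for the SQL lookup, ProdRewriter and Rewrite are made-up names, and leaving out the category span when an abbreviation is unknown is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

public static class ProdRewriter
{
    // Hypothetical stand-in for "SELECT * FROM cats WHERE abbr=...".
    private static readonly Dictionary<string, string> Categories =
        new Dictionary<string, string> { { "gov", "Government" } };

    public static string Rewrite(string entry)
    {
        return Regex.Replace(
            entry,
            @"<prod(?<attr>[^>]*)>(?<text>.*?)</prod>",
            match =>
            {
                // Parse name='value' / name="value" pairs from the opening tag.
                // \1 refers to the unnamed quote group, which .NET numbers first.
                var attributes = Regex
                    .Matches(match.Groups["attr"].Value,
                             @"(?<name>\w+)\s*=\s*(['""])(?<value>.*?)\1")
                    .Cast<Match>()
                    .ToDictionary(m => m.Groups["name"].Value,
                                  m => m.Groups["value"].Value);

                string html = "<span class='prod'>"
                    + WebUtility.HtmlEncode(match.Groups["text"].Value)
                    + "</span>";

                string abbr, label;
                if (attributes.TryGetValue("cat", out abbr)
                    && Categories.TryGetValue(abbr, out label))
                {
                    html += "<span class='cat'>"
                        + WebUtility.HtmlEncode(label) + "</span>";
                }
                return html;
            },
            RegexOptions.Singleline);
    }
}
```

In real code you would probably load all categories once (or cache lookups) rather than hitting the database for every match.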

C# reading variables into static string from text file

I have seen several posts giving examples of how to read from text files, and examples on how to make a string 'public' (static or const), but I haven't been able to combine the two inside a 'function' in a way that is making sense to me.
I have a text file called 'MyConfig.txt'.
In that, I have 2 lines.
MyPathOne=C:\TestOne
MyPathTwo=C:\TestTwo
I want to be able to read that file when I start the form, making both MyPathOne and MyPathTwo accessible from anywhere inside the form, using something like this :
ReadConfig("MyConfig.txt");
the way I am trying to do that now, which is not working, is this :
public voice ReadConfig(string txtFile)
{
using (StreamReader sr = new StreamResder(txtFile))
{
string line;
while ((line = sr.ReadLine()) !=null)
{
var dict = File.ReadAllLines(txtFile)
.Select(l => l.Split(new[] { '=' }))
.ToDictionary( s => s[0].Trim(), s => s[1].Trim());
}
public const string MyPath1 = dic["MyPathOne"];
public const string MyPath2 = dic["MyPathTwo"];
}
}
The txt file will probably never grow over 5 or 6 lines, and I am not stuck on using StreamReader or dictionary.
As long as I can access the path variables by name from anywhere, and it doesn't add like 400 lines of code or something , then I am OK with doing whatever would be best, safest, fastest, easiest.
I have read many posts where people say the data should stored in XML, but I figure that part really doesn't matter so much because reading the file and getting the variables part would be almost the same either way. That aside, I would rather be able to use a plain txt file that somebody (end user) could edit without having to understand XML. (which means of course lots of checks for blank lines, does the path exist, etc...I am OK with doing that part, just wanna get this part working first).
I have read about different ways using ReadAllLines into an array, and some say to create a new separate 'class' file (which I don't really understand yet..but working on it). Mainly I want to find a 'stable' way to do this.
(project is using .Net4 and Linq by the way)
Thanks!!
The code you've provided doesn't even compile. Instead, you could try this:
public string MyPath1;
public string MyPath2;
public void ReadConfig(string txtFile)
{
using (StreamReader sr = new StreamReader(txtFile))
{
// Declare the dictionary outside the loop:
var dict = new Dictionary<string, string>();
// (This loop reads every line until EOF or the first blank line.)
string line;
while (!string.IsNullOrEmpty((line = sr.ReadLine())))
{
// Split each line around '=':
var tmp = line.Split(new[] { '=' },
StringSplitOptions.RemoveEmptyEntries);
// Add the key-value pair to the dictionary:
dict[tmp[0].Trim()] = tmp[1].Trim();
}
// Assign the values that you need:
MyPath1 = dict["MyPathOne"];
MyPath2 = dict["MyPathTwo"];
}
}
To take into account:
You can't declare public fields inside methods.
You can't initialize const fields at run-time. Instead you provide a constant value for them at compilation time.
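The two points above can be sketched as a small static class: the values live at class level and are assigned at run time rather than declared const. AppConfig, Parse and Load are names I made up for this example, and skipping lines without '=' and ignoring key case are assumptions about how forgiving you want the parser to be.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class AppConfig
{
    // Static properties: assigned at run time, readable from anywhere.
    public static string MyPathOne { get; private set; }
    public static string MyPathTwo { get; private set; }

    // Turns "key=value" lines into a dictionary; splits on the first '='
    // only, so values may themselves contain '='.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var dict = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            int eq = line.IndexOf('=');
            if (eq <= 0) continue; // skip blank lines and lines without a key
            dict[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
        }
        return dict;
    }

    public static void Load(string path)
    {
        Dictionary<string, string> dict = Parse(File.ReadAllLines(path));
        MyPathOne = dict["MyPathOne"]; // throws KeyNotFoundException if missing
        MyPathTwo = dict["MyPathTwo"];
    }
}
```

After a single AppConfig.Load("MyConfig.txt") at startup, any form can read AppConfig.MyPathOne.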
Got it. Thanks!
public static string Path1;
public static string Path2;
public static string Path3;
public void ReadConfig(string txtFile)
{
using (StreamReader sr = new StreamReader(txtFile))
{
var dict = new Dictionary<string, string>();
string line;
while (!string.IsNullOrEmpty((line = sr.ReadLine())))
{
dict = File.ReadAllLines(txtFile)
.Select(l => l.Split(new[] { '=' }))
.ToDictionary( s => s[0].Trim(), s => s[1].Trim());
}
Path1 = dict["PathOne"];
Path2 = dict["PathTwo"];
Path3 = Path1 + @"\Test";
}
}
You need to define the variables outside the function to make them accessible to other functions.
public string MyPath1; // (Put these at the top of the class.)
public string MyPath2;
public void ReadConfig(string txtFile)
{
var dict = File.ReadAllLines(txtFile)
.Select(l => l.Split(new[] { '=' }))
.ToDictionary( s => s[0].Trim(), s => s[1].Trim()); // read the entire file into a dictionary.
MyPath1 = dict["MyPathOne"];
MyPath2 = dict["MyPathTwo"];
}
This question is similar to Get parameters out of text file
(I put an answer there. I "can't" paste it here.)
(Unsure whether I should "flag" this question as duplicate. "Flagging" "closes".)
(Do duplicate questions ever get consolidated? Each can have virtues in the wording of the [often lame] question or the [underreaching and overreaching] answers. A consolidated version could have the best of all, but consolidation is rarely trivial.)

How to check if filename contains substring in C#

I have a folder with files named
myfileone
myfiletwo
myfilethree
How can I check if file "myfilethree" is present?
I mean, is there a method other than IsFileExist(), e.g. checking whether a filename contains the substring "three"?
Substring:
bool contains = Directory.EnumerateFiles(path).Any(f => Path.GetFileName(f).Contains("three"));
Case-insensitive substring:
bool contains = Directory.EnumerateFiles(path).Any(f => Path.GetFileName(f).IndexOf("three", StringComparison.OrdinalIgnoreCase) >= 0);
Case-insensitive comparison:
bool contains = Directory.EnumerateFiles(path).Any(f => String.Equals(Path.GetFileName(f), "myfilethree", StringComparison.OrdinalIgnoreCase));
Get file names matching a wildcard criteria:
IEnumerable<string> files = Directory.EnumerateFiles(path, "three*.*"); // lazy file system lookup
string[] files = Directory.GetFiles(path, "three*.*"); // not lazy
If I understand your question correctly, you could do something like
Directory.GetFiles(directoryPath, "*three*")
or
Directory.GetFiles(directoryPath).Where(f => f.Contains("three"))
Both of these will give you all the names of all files with three in it.
I am not that familiar with IO, but maybe this would work? It requires using System.Linq:
System.IO.Directory.GetFiles("PATH").Where(s => s.Contains("three"));
EDIT: Note that GetFiles returns an array of strings; after the Where filter the result is an IEnumerable<string>.
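One caveat with the path-based checks: Directory.GetFiles and Directory.EnumerateFiles return full paths, so a plain Contains("three") also matches when only a folder name contains "three". A minimal sketch that looks at the file name alone (FileNameSearch and its methods are hypothetical names):

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileNameSearch
{
    // Case-insensitive substring test against the file name only,
    // ignoring any directory part of the path.
    public static bool NameContains(string path, string fragment)
    {
        return Path.GetFileName(path)
            .IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // True if any file directly inside 'directory' has a matching name.
    public static bool AnyFileNameContains(string directory, string fragment)
    {
        return Directory.EnumerateFiles(directory)
            .Any(f => NameContains(f, fragment));
    }
}
```

The comparison uses >= 0 because IndexOf returns 0 when the name starts with the fragment.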
