GetFiles - Explanation - c#

Can anyone tell me what this code does? Thanks in advance!
string[] fileName = dirInfo.GetFiles("*.pdf")
.Select(fi => fi.Name)
.FirstOrDefault(name => name != "Thumbs.db")
.Split(new char[] { '-' }, StringSplitOptions.RemoveEmptyEntries);

Method Chaining
The code looks a little confusing possibly because it's using "Method Chaining", where the return value from one method is immediately acted upon without capturing the object into a named variable first.
For example, the string class has a ToLower() method which can be used to get the lowercase version of the string. If we have a method that returns a string (say, GetUserName()), then instead of doing this:
string userName = GetUserName();
string lowerCaseUserName = userName.ToLower();
We can just do this:
string lowerCaseUserName = GetUserName().ToLower();
If you understand that, then we can break the method chain down into individual lines and see what each one does.
Break the chain into individual lines
It's often helpful when debugging to break a method chain into individual lines, so you can examine each value along the way.
The first line gets an array of FileInfo objects from a directory (assuming dirInfo is an instance of the DirectoryInfo class), one for each file whose FullName ends with ".pdf"
FileInfo[] allPdfFiles = dirInfo.GetFiles("*.pdf");
The next line selects the Name property from each FileInfo object above (which is just the file name without the rest of the path) and returns them in an IEnumerable<string>.
IEnumerable<string> pdfFileNames = allPdfFiles.Select(fi => fi.Name);
Note: if the line above seems confusing, it's probably because of the lambda expression passed to the Select method. You can read this as "for each FileInfo object in allPdfFiles, which in this case is referred to as fi, select the Name property".
Next we take the first file name that doesn't equal "Thumbs.db" (or a default value of null if none meet this condition). The "Thumbs.db" check is redundant, since we already know all the file names end in ".pdf"; a plain FirstOrDefault() would do. Note that if the directory contains no PDF files this returns null, and the Split call in the next step then throws a NullReferenceException.
string firstPdfFileName = pdfFileNames.FirstOrDefault(name => name != "Thumbs.db");
And finally we split the file name on the '-' character, remove any empty entries and return the pieces as an array. So if the file name was "My-first-file.pdf", this would return an array of strings: {"My", "first", "file.pdf"}
string[] fileName = firstPdfFileName.Split(new[] {'-'},
StringSplitOptions.RemoveEmptyEntries);
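Putting the steps together, here is a minimal sketch that also guards against the case where no file name is found. The helper name SplitFirstName is hypothetical (it is not part of the original code), and it takes the file names as a parameter so it can be tried without a real directory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ChainDemo
{
    // Hypothetical helper: mirrors the original chain, but returns an empty
    // array instead of throwing when there is no matching file name.
    public static string[] SplitFirstName(IEnumerable<string> names)
    {
        string first = names.FirstOrDefault(name => name != "Thumbs.db");
        if (first == null)
        {
            return new string[0];
        }
        // Split on '-' and drop empty pieces, exactly as the original does.
        return first.Split(new[] { '-' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With a real directory you could call it as ChainDemo.SplitFirstName(dirInfo.GetFiles("*.pdf").Select(fi => fi.Name)).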

Related

Turn A Full Path Into A Path With Environment Variables

I want to turn a full path into an environment variable path using c#
Is this even possible?
i.e.
C:\Users\Username\Documents\Text.txt -> %USERPROFILE%\Documents\Text.txt
C:\Windows\System32\cmd.exe -> %WINDIR%\System32\cmd.exe
C:\Program Files\Program\Program.exe -> %PROGRAMFILES%\Program\Program.exe
It is possible by going over all environment variables and checking which variable's value is contained in the string, then replacing that part of the string with the corresponding variable name surrounded by %.
First naive attempt:
string Tokenify(string path)
{
foreach (DictionaryEntry e in Environment.GetEnvironmentVariables())
{
int index = path.IndexOf(e.Value.ToString());
if (index > -1)
{
//we need to make sure we're not already inside a tokenized part.
int numDelimiters = path.Take(index).Count(c => c == '%');
if (numDelimiters % 2 == 0)
{
path = path.Replace(e.Value.ToString(), $"%{e.Key.ToString()}%");
}
}
}
return path;
}
The code currently makes a faulty assumption that the environment variable's value appears only once in the path. This needs to be corrected, but let's put that aside for now.
Also note that not all environment variables represent directories. For example, if I run this method on the string "6", I get "%PROCESSOR_LEVEL%". This can be remedied by checking for Directory.Exists() on the environment variable value before using it. This will probably also invalidate the need for checking whether we are already in a tokenized part of the string.
You may also want to sort the environment variables by the length of their values, so that the most specific one is always used. Otherwise you can end up with:
%HOMEDRIVE%%HOMEPATH%\AppData\Local\Folder
instead of:
%LOCALAPPDATA%\Folder
Updated code that prefers the longest variable:
string Tokenify(string path)
{
//first find all the environment variables that represent paths.
var validEnvVars = new List<KeyValuePair<string, string>>();
foreach (DictionaryEntry e in Environment.GetEnvironmentVariables())
{
string envPath = e.Value.ToString();
if (System.IO.Directory.Exists(envPath))
{
//this would be the place to add any other filters.
validEnvVars.Add(new KeyValuePair<string, string>(e.Key.ToString(), envPath));
}
}
//sort them by length so we always get the most specific one.
//if you are dealing with a large number of strings then orderedVars can be generated just once and cached.
var orderedVars = validEnvVars.OrderByDescending(kv => kv.Value.Length);
foreach (var kv in orderedVars)
{
//using regex just for case insensitivity. Otherwise just use string.Replace.
path = Regex.Replace(path, Regex.Escape(kv.Value), $"%{kv.Key}%", RegexOptions.IgnoreCase);
}
return path;
}
You may still want to add checks to avoid double-tokenizing parts of the string, but that is much less likely to be an issue in this version.
You might also want to filter out some variables, such as drive roots (e.g. %HOMEDRIVE%), or filter by any other criteria.
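One way to sidestep double-tokenizing altogether is to replace only a leading match and stop after the first one. The sketch below is a variant of the idea, not the code above: the name TokenifyPrefix is hypothetical, and it assumes the caller passes in the already filtered variables (name to directory path), which keeps it independent of the machine's actual environment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PathTokens
{
    // Hypothetical variant: replaces only the longest variable value that the
    // path starts with, so no part of the path is ever tokenized twice.
    public static string TokenifyPrefix(string path, IDictionary<string, string> vars)
    {
        foreach (var kv in vars.OrderByDescending(v => v.Value.Length))
        {
            string value = kv.Value.TrimEnd('\\');
            if (value.Length == 0)
            {
                continue;
            }
            if (!path.StartsWith(value, StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }
            // Require a separator (or the end of the path) right after the
            // match, so that C:\Win is not treated as a prefix of C:\Windows.
            if (path.Length > value.Length && path[value.Length] != '\\')
            {
                continue;
            }
            return "%" + kv.Key + "%" + path.Substring(value.Length);
        }
        return path;
    }
}
```

The dictionary can be built from Environment.GetEnvironmentVariables() using the same Directory.Exists filter as in the answer.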

Something.Text containing two strings

I can choose multiple texts to SelectedFileText like
something\text.css
something\multiple.php
My code is
string NameParser = SelectedFileText.Text;
string[] words = NameParser.Split('\\');
string myLastWord = words[words.Length - 1];
And I can parse the text so it shows only multiple.php, but the problem is that I need to get both lines out, not only the last one. So it should work like this:
text.css
multiple.php
Sorry for being a bit unclear. I have already solved the problem of how to get the file name, with Path.GetFileName.
The problem is that the SelectedFileText.Text property contains two lines, each with a full file name and directory. I just used NameParser to parse SelectedFileText.Text, from which it gets its text.
If your SelectedFileText.Text property contains two (or more) lines of text, each consisting of a full file name, and you want to retrieve just the file name from every line, you can work with
var parts = NameParser.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
.Select(x => Path.GetFileName(x));
foreach(string s in parts)
Console.WriteLine(s);
I suggest Path.GetFileName e.g.
// Actually, myLastWord is a file name like "text.css" or "multiple.php"
string myLastWord = Path.GetFileName(SelectedFileText.Text);
If you want all the text, except the file name (i.e. directory name), Path class helps out again (via Path.GetDirectoryName)
string dirName = Path.GetDirectoryName(SelectedFileText.Text);
Edit: if you have multiline text (see comments below) and you have to extract the file names, you can try LINQ:
string[] fileNames = SelectedFileText.Text
.Split(new char[] { '\r', '\n' },
StringSplitOptions.RemoveEmptyEntries)
.Select(line => Path.GetFileName(line))
.ToArray();

C# Split File Name Beginner Exercise

I have a directory filled with multiple excel files that I would like to rename. The names all have leading integers and a '-'. For example: 0123456-Test_01. I would like to rename all of the files within this directory by removing this prefix. 0123456-Test_01 should just be Test_01. I can rename a hard coded instance of a string, but am having trouble getting the files and renaming all of them.
My code is below. Any help is appreciated, as I am clearly new to C#.
public static void Main()
{
//Successfully splits hardcoded string
var temp = "0005689-Test_01".Split('-');
Console.WriteLine(temp[1]);
Console.ReadLine();
//Unsuccessful renaming of all files within directory
List<string> files = System.IO.Directory.GetFiles(@"C:\Users\acars\Desktop\B", "*").ToList();
System.IO.File.Move(@"C:\Users\acars\Desktop\B\", @"C:\Users\acars\Desktop\B\".Split('-'));
foreach (string file in files)
{
var temp = files.Split('-');
return temp[1];
};
}
There are some errors to fix in your code.
The first one is the wrong usage of the variable files. This is the full list of files, not the single file that you want to split and move. As explained in the comments, you should use the loop variable file instead.
The most important problem is that the File.Move method throws an exception if the destination file exists. After removing the first part of your file name, you cannot be sure that the resulting name is unique in your directory.
So a check for the existence of the file before the Move is mandatory.
Finally, it is better to use Directory.EnumerateFiles, because this method lets your moving code start running without first loading all the file names into a list in memory. (In a folder full of files this could make a noticeable difference in speed.)
public static void Main()
{
string workPath = @"C:\Users\acars\Desktop\B";
foreach (string file in Directory.EnumerateFiles(workPath))
{
string[] temp = file.Split('-');
if(temp.Length > 1)
{
string newName = Path.Combine(workPath, temp[1]);
if(!File.Exists(newName))
File.Move(file, newName);
}
}
}
Also pay attention to the comment below from CodeNotFound. You are using a hard-coded path, so the problem doesn't actually arise here, but if the directory path itself contains a "-" then you should use something like this to get the last element of the split array
string newName = Path.Combine(workPath, temp[temp.Length-1]);
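An alternative that avoids the directory-name problem entirely is to work on the file name only and cut at the first '-'. This is a sketch, not the answer's code: the class and method names are hypothetical, and it uses Directory.GetFiles to take a snapshot of the folder so that files renamed during the loop are not visited a second time.

```csharp
using System;
using System.IO;

static class PrefixRenamer
{
    // Returns the part of the file name after the first '-'.
    // Names without a '-' (or ending in one) are returned unchanged.
    public static string StripPrefix(string fileName)
    {
        int dash = fileName.IndexOf('-');
        return dash >= 0 && dash < fileName.Length - 1
            ? fileName.Substring(dash + 1)
            : fileName;
    }

    // Renames every file in the directory, skipping names that would collide.
    public static void RenameAll(string directory)
    {
        foreach (string file in Directory.GetFiles(directory))
        {
            string oldName = Path.GetFileName(file);
            string newName = StripPrefix(oldName);
            string target = Path.Combine(directory, newName);
            if (newName != oldName && !File.Exists(target))
            {
                File.Move(file, target);
            }
        }
    }
}
```

Unlike taking the last element of the split array, this keeps any later dashes, so a name such as 0123456-Test-01 becomes Test-01.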

Replacing hardcoded strings with constants in C#

I am trying to take all the hardcoded strings in a .cs file and load it from a constant file.
For instance
string capital="Washington";
should be loaded as
string capital=Constants.capital;
and that will be added in Constants.cs
public final const capital="Washington";
I need a Java/C# snippet to do this. I can't use any third-party tools. Any help on this?
EDIT:
After reading the comments and answers I get the feeling I was not clear. I just want a way to find all the hard-coded string literals (anything enclosed in ""), replace each with a reference into Constants, and add the corresponding property to Constants.cs. This can be simple text processing as well.
A few hints that should get you started:
Assume that your string processor function is called ProcessStrings.
1) Include Constants.cs into the same project as the ProcessStrings function, so it gets compiled in with the refactoring code.
2) Reflect over your Constants class to build a Dictionary of language strings to constant names, something like:
Dictionary<String, String> constantList = new Dictionary<String, String>();
FieldInfo[] fields = typeof(Constants).GetFields(BindingFlags.Static | BindingFlags.Public);
String constantValue;
foreach (FieldInfo field in fields)
{
if (field.FieldType == typeof(String))
{
constantValue = (string)field.GetValue(null);
constantList.Add(constantValue, field.Name);
}
}
3) constantList should now contain the full list of Constant names, indexed by the string they represent.
4) Grab all the lines from the file (using File.ReadAllLines).
5) Now iterate over the lines. Something like the following should allow you to ignore lines that you shouldn't be processing.
//check if the line is a comment or xml comment
if (Regex.IsMatch(lines[idx], @"^\s*//"))
continue;
//check if the entry is an attribute
if (Regex.IsMatch(lines[idx], @"^\s*\["))
continue;
//check if the line is part of a block comment (assuming a * at the start of the line)
if (Regex.IsMatch(lines[idx], @"^\s*(/\*+|\*+)"))
continue;
//check if the line has been marked as ignored
//(this is something handy I use to mark a string to be ignored for any reason, just put //IgnoreString at the end of the line)
if (Regex.IsMatch(lines[idx], @"//\s*IgnoreString\s*$"))
continue;
6) Now, match any quoted strings on the line, then go through each match and check it for a few conditions. You can remove some of these conditions if needs be.
MatchCollection mC = Regex.Matches(lines[idx], "@?\"([^\"]+)\"");
foreach (Match m in mC)
{
if (
// Detect format insertion markers that are on their own and ignore them,
!Regex.IsMatch(m.Value, @"""\s*\{\d(:\d+)?\}\s*""") &&
//or check for strings of single character length that are not proper characters (-, /, etc)
!Regex.IsMatch(m.Value, @"""\s*\\?[^\w]\s*""") &&
//check for digit only strings, allowing for decimal places and an optional percentage or multiplier indicator
!Regex.IsMatch(m.Value, @"""[\d.]+[%|x]?""") &&
//check for array indexers (with bounds checks so a match at the start or end of the line doesn't throw)
!(m.Index > 0 && m.Index + m.Length < lines[idx].Length && lines[idx][m.Index - 1] == '[' && lines[idx][m.Index + m.Length] == ']')
)
{
String toCheck = m.Groups[1].Value;
//look up the string we found in our list of constants
if (constantList.ContainsKey(toCheck))
{
String replaceString;
replaceString = "Constants." + constantList[toCheck];
//replace the line in the file
lines[idx] = lines[idx].Replace("\"" + m.Groups[1].Value + "\"", replaceString);
}
else
{
//See Point 8....
}
}
}
7) Now join the array of lines back up, and write it back to the file. That should get you most of the way.
8) To get it to generate constants for strings you don't already have an entry for, in the else block for looking up the string,
generate a name for the constant from the string (I just removed all special characters and spaces from the string and limited it to 10 words). Then use that name and the original string (from the toCheck variable in point 6) to make a constant declaration and insert it into Constants.cs.
Then when you run the function again, those new constants will be used.
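Point 8 could be sketched roughly as follows. This is only one way to derive a name, under assumptions that are not in the answer: the class and method names are hypothetical, words are joined in PascalCase, and a leading underscore is added when the result would otherwise be empty or start with a digit.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class ConstantNamer
{
    // Builds an identifier from a string literal: strips special characters
    // and spaces, and keeps at most the first 10 words.
    public static string MakeName(string value)
    {
        string[] words = Regex.Split(value, "[^A-Za-z0-9]+")
            .Where(w => w.Length > 0)
            .Take(10)
            .ToArray();
        string name = string.Concat(
            words.Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1)));
        // Identifiers cannot be empty or start with a digit.
        if (name.Length == 0 || char.IsDigit(name[0]))
        {
            name = "_" + name;
        }
        return name;
    }

    // Builds the declaration line to insert into Constants.cs.
    public static string MakeDeclaration(string value)
    {
        string escaped = value.Replace("\\", "\\\\").Replace("\"", "\\\"");
        return "public const string " + MakeName(value) + " = \"" + escaped + "\";";
    }
}
```

Two different strings can map to the same name (for example "Hello world" and "hello, world!"), so a real tool would need to check for collisions before inserting the declaration.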
I don't know if there is any such code available, but I am providing some guidelines on how it can be implemented.
You can write a macro/standalone application (I think macro is a better option)
Parse current document or all the files in the project/solution
Write a regular expression for finding the strings (what about strings in XAML?). something like [string]([a-z A-Z0-9])["]([a-z A-Z0-9])["][;] -- this is not valid, I have just provide for discussion
Extract the constant from code.
Check if similar string is already there in your static class
If not found, insert new entry in static class
Replace string with the variable name
Goto step 2
Is there a reason why you can't put these into a static class or just in a file in your application? You can put constants anywhere and as long as they are scoped properly you can access them from everywhere.
public const string capital = "Washington";
const works fine in a static class, but if you would rather have a value that is assigned at run time, it would be
public static readonly string capital = "Washington";
If you really want to do it the way you describe, read the file with a StreamReader, split by \r\n, check if the first thing is "string", and then do all your replacements on that string element...
Make sure that every time you change such a string declaration, you add the necessary lines to the other file.
You can create a class project for your constants, or if you have a helper class project, you can add a new class for your constants (Constants.cs).
public static class Constants
{
public const string CAPITAL_Washington = "Washington";
}
You can now use this:
string capital = Constants.CAPITAL_Washington;
You might as well give your constants quite specific names.

Trim all chars off file name after first "_"

I'd like to trim these purchase order file names (a few examples below) so that everything after the first "_" is omitted.
INCOLOR_fc06_NEW.pdf
Keep: INCOLOR (write this to db as the VendorID) Remove: _fc06_NEW.pdf
NORTHSTAR_sc09.xls
Keep: NORTHSTAR (write this to db as the VendorID) Remove: _sc09.xls
Our scenario: The managers are uploading these files to our Intranet web server, to make them available to download/view etc. I'm using Brettle's NeatUpload, and for each file uploaded, am writing the file's attributes into the PO table (SQL 2000). The first part of the file name will be written to the DB as a VendorID.
The naming convention for these files is consistent in that the first part of the file name is always the vendor name (or Vendor ID), followed by an "_", then other unpredictable chars used to identify the type of Purchase Order, then the file extension, which is consistently either .xls, .XLS, .PDF, or .pdf.
I tried TrimEnd - but the array of chars that you have to provide ends up being long and can conflict with the part of the file name I want to keep. I have a feeling I'm not using TrimEnd properly.
What is the best way to use string.TrimEnd (or any other string manipulation in C#) that will strip off all chars after the first "_" ?
String s = "INCOLOR_fc06_NEW.pdf";
int index = s.IndexOf("_");
return index >= 0 ? s.Substring(0,index) : s;
I'll probably offend the anti-regex lobby, but here I go (ducking):
string stripped = Regex.Replace(filename, @"(?<=[^_]*)_.*",String.Empty);
This code will strip all extra characters after the first '_', unless there is no '_' in the string (then it will just return the original string).
It's one line of code. It's slower than the more elaborate IndexOf() algorithm, but when used in a non-performance-sensitive part of the code, it's a good solution.
Get your flame-throwers out...
TrimEnd only removes characters from the end of the string: white space when called with no arguments, or any of the characters you pass in. It cannot cut at the first "_", so it won't help you here. Read more about TrimEnd here:
http://msdn.microsoft.com/en-us/library/system.string.trimend.aspx
Bnaffas code (with a small tweak):
String fileName = "INCOLOR_fc06_NEW.pdf";
int index = fileName.IndexOf("_");
return index >= 0 ? fileName.Substring(0, index) : fileName;
If you want to do something with the other parts, you could use a Split
string fileName = "INCOLOR_fc06_NEW.pdf";
string[] parts = fileName.Split('_');
public string StripOffStuff(string sInput)
{
int iIndex = sInput.IndexOf("_");
return (iIndex > 0) ? sInput.Substring(0, iIndex) : sInput;
}
// Call it like:
string sNewString = StripOffStuff("INCOLOR_fc06_NEW.pdf");
I would go with the SubString approach but to round out the available solutions here's a LINQ approach just for fun:
string filename = "INCOLOR_fc06_NEW.pdf";
string result = new string(filename.TakeWhile(c => c != '_').ToArray());
It'll return the original string if no underscore is found.
To go with all the "alternative" solutions, here's the second one that I thought of (after substring):
string filename = "INCOLOR_fc06_NEW.pdf";
string stripped = filename.Split('_')[0];
