I am getting a list of file names using the following code:
//Set up Datatable
dtUpgradeFileInfo.Columns.Add("BaseFW");
dtUpgradeFileInfo.Columns.Add("ActiveFW");
dtUpgradeFileInfo.Columns.Add("UpgradeFW");
dtUpgradeFileInfo.Columns.Add("FileName");
//Gets Upgrade information and upgrade Files from Upgrade Folder
DirectoryInfo di = new DirectoryInfo(g_strAppPath + "\\Update Files");
FileInfo[] rgFiles = di.GetFiles("*.txt");
foreach (FileInfo fi in rgFiles)
{
test1 = fi.Name.ToString();
}
All file names will be in the form BXXXX_AXXXX_UXXXX, where each X represents a digit 0-9. I need those three groups of digits so that I can put each one into its respective column in the DataTable. I was initially planning to pick out the characters of each group and join them back together, but I'm wondering whether there is a better or quicker way than converting the name to a char array. Any suggestions?
Here is a relatively simple way to get the numbers out of test1 (without LINQ):
...
string test1 = Path.GetFileNameWithoutExtension(fi.Name); // drop ".txt" so the last group parses as a number
int baseFW=0;
int activeFW=0;
int upgradeFW=0;
// Break the file name into the three groups
string[] groups=test1.Split('_');
if (groups.Length==3)
{
// Create a numbers array to hold the numbers
int[] nums=new int[groups.Length];
// Parse the numbers out of the strings
int idx=0;
foreach (string s in groups)
nums[idx++]=int.Parse(s.Remove(0,1)); // Convert to num
baseFW=nums[0];
activeFW=nums[1];
upgradeFW=nums[2];
}
else
{
// Error handling...
}
If you want to do this using LINQ, it's even easier:
...
string test1 = Path.GetFileNameWithoutExtension(fi.Name); // drop ".txt" so the last group parses as a number
int baseFW=0;
int activeFW=0;
int upgradeFW=0;
// Extract all numbers
int[] nums=test1.Split('_') // Split on underscores
.Select(s => int.Parse(s.Remove(0,1))) // Convert to ints
.ToArray(); // For random access, below
if (nums.Length==3)
{
baseFW=nums[0];
activeFW=nums[1];
upgradeFW=nums[2];
}
else
{
// Error handling...
}
Using regular expressions allows you to easily parse out the values that you need, and has the added benefit of allowing you to skip over files that end up in the directory that don't match the expected filename format.
Your code would look something like this:
//Gets Upgrade information and upgrade Files from Upgrade Folder
string strRegex = @"^B(?<Base>[0-9]{4})_A(?<Active>[0-9]{4})_U(?<Upgrade>[0-9]{4})\.txt$";
RegexOptions myRegexOptions = RegexOptions.ExplicitCapture | RegexOptions.Compiled;
Regex myRegex = new Regex(strRegex, myRegexOptions);
DirectoryInfo di = new DirectoryInfo(g_strAppPath + "\\Update Files");
FileInfo[] rgFiles = di.GetFiles("*.txt");
foreach (FileInfo fi in rgFiles)
{
string name = fi.Name.ToString();
Match matched = myRegex.Match(name);
if (matched.Success)
{
//do the inserts into the data table here
string baseFw = matched.Groups["Base"].Value;
string activeFw = matched.Groups["Active"].Value;
string upgradeFw = matched.Groups["Upgrade"].Value;
}
}
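As a minimal sketch of the same regex idea, the matching can be wrapped in a small helper that returns the three numbers directly. FirmwareFileName and TryParse are hypothetical names introduced here for illustration, and converting the groups to int is an assumption (it drops leading zeros; keep the strings if those matter).

```csharp
using System;
using System.Text.RegularExpressions;

static class FirmwareFileName
{
    // Anchored pattern: exactly four digits per group and a literal ".txt" at the end.
    private static readonly Regex Pattern = new Regex(
        @"^B(?<Base>[0-9]{4})_A(?<Active>[0-9]{4})_U(?<Upgrade>[0-9]{4})\.txt$",
        RegexOptions.ExplicitCapture);

    // Returns true and fills the three outputs when the name fits the format;
    // returns false (outputs left at 0) for any other file name.
    public static bool TryParse(string fileName, out int baseFw, out int activeFw, out int upgradeFw)
    {
        baseFw = 0;
        activeFw = 0;
        upgradeFw = 0;

        Match m = Pattern.Match(fileName);
        if (!m.Success)
        {
            return false;
        }

        // Assumption: the values are wanted as integers, so leading zeros are lost.
        baseFw = int.Parse(m.Groups["Base"].Value);
        activeFw = int.Parse(m.Groups["Active"].Value);
        upgradeFw = int.Parse(m.Groups["Upgrade"].Value);
        return true;
    }
}
```

Inside the foreach loop, a successful TryParse(fi.Name, ...) could then be followed by dtUpgradeFileInfo.Rows.Add(baseFw, activeFw, upgradeFw, fi.Name) to fill the four columns.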
Related
The requirement I'm trying to achieve is quite complicated, and I can't think past a certain point.
1) I need to traverse a list of several thousand files and folders (typically complex XML) and find a particular string pattern like { DisplayKey.get(" } (ignore the braces) and replace it with { DisplayKey.get(&quot; }. That part is obvious and easy.
2) Now here is the tougher part.
The Ideal way the above said text should exist in the XML in any tag is like the pattern below:
DisplayKey.get("Web.Admin.MessageDestinationStatisticsDV.Failed")
The ideal pattern goes this way DisplayKey.get("xxx.xxx.xxx.xxx.xxx")
where x could be any string and the pattern should end with ").
My code should identify the sequences that starts with { DisplayKey.get(" } that does NOT end with { ") } and fix it.
Below is the approach I started:
static void WalkDirectoryTree(DirectoryInfo root)
{
FileInfo[] files = null;
DirectoryInfo[] subDirs = null;
files = root.GetFiles("*.*");
if (files != null)
{
try
{
foreach (FileInfo fi in files)
{
String errDSTR = "DisplayKey.get(\"";
string[] allLines = File.ReadAllLines(fi.FullName);
var writer = new StreamWriter(fi.FullName);
for (int i = 0; i < allLines.Length; i++)
{
string line = allLines[i];
// Find DisplayKey.get("
// Replace it with DisplayKey.get(&quot;
// LOGIC: HOW DO I APPROACH THIS?
foreach(char ch in line.ToCharArray())
{
// Sadly, .IndexOf() only finds the first occurrence and not the subsequent ones
}
}
}
}
catch(Exception e)
{
Console.WriteLine("Exception occurred: " + e.Message);
Console.ReadLine();
}
subDirs = root.GetDirectories();
foreach (System.IO.DirectoryInfo dirInfo in subDirs)
{
// Recursive call for each subdirectory.
WalkDirectoryTree(dirInfo);
}
}
}
I know File.WriteAllText(fi.FullName, File.ReadAllText(fi.FullName).Replace("some text", "some other text")); could handle a plain text replacement, but I'm wondering how to traverse the files and fix the pattern issue.
One approach is to use regex matching to make two checks:
1. Check whether the line contains DisplayKey.get(" by using the regex DisplayKey\.get\(" (note the escape characters).
2. Check whether the line contains a complete element of the form DisplayKey.get("....."), using the regex DisplayKey\.get\(".+"\). The .+ part of the regex matches one or more characters between the quotes.
For each line that matches check 1 but not check 2, append ") at the end.
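The two checks can be sketched in C# roughly as follows. DisplayKeyCheck and NeedsFix are hypothetical names. This is a small variation on the idea above rather than a literal transcription: it compares the number of openers with the number of complete calls, so that a line holding one complete and one incomplete call is still flagged, and it uses [^"]+ instead of .+ on the assumption that a key never contains a double quote.

```csharp
using System;
using System.Text.RegularExpressions;

static class DisplayKeyCheck
{
    // Check 1: the opening sequence DisplayKey.get(" (quotes are doubled in a verbatim string).
    private static readonly Regex Opener = new Regex(@"DisplayKey\.get\(""");

    // Check 2: a complete call, i.e. the opener, a key without quotes, then ").
    private static readonly Regex Complete = new Regex(@"DisplayKey\.get\(""[^""]+""\)");

    // True when the line has more openers than complete calls,
    // meaning at least one call is not terminated by ").
    public static bool NeedsFix(string line)
    {
        int openers = Opener.Matches(line).Count;
        int complete = Complete.Matches(line).Count;
        return openers > complete;
    }
}
```

How a flagged line should be repaired (for example by appending ") at the end) depends on what the broken lines actually look like, so the sketch only detects them.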
I am currently trying to use the below regular expression in C#
Regex reg = new Regex(@"-(FILENM01P\\.(\\d){3}\\.PGP)$");
var files = Directory.GetFiles(savePath, "*.PGP")
.Where(path => reg.IsMatch(path))
.ToList();
foreach (string file in files)
{
MessageBox.Show(file);
}
to match all files in a single directory that follow this naming convention:
FILENM01P.001.PGP
If I just load up all the files like this
var files = Directory.GetFiles(savePath, "*.PGP");
foreach (string file in files)
{
MessageBox.Show(file);
}
then I get strings like this:
C:\Users\User\PGP Files\FILENM01P.001.PGP
There could be many of these files for example
FILENM01P.001.PGP
FILENM01P.002.PGP
FILENM01P.003.PGP
FILENM01P.004.PGP
But there will never be
FILENM01P.000.PGP
FILENM01P.1000.PGP
To clarify, only the three digits change, and they can only range from 001 to 999 (with leading zeros). The rest of the text is static and will never change.
I'm a complete novice when it comes to regex, so any help would be greatly appreciated.
Essentially, my end goal is to find the next number and create the file. If there are no files, it should create one starting at 001; if it gets to 999, it should return 1000 so that I know I need to move to a new directory, as each directory is limited to 999 sequential files. (I'll deal with that part myself, though.)
Try this code.
var reg = new Regex(@"FILENM01P\.(\d{3})\.PGP");
var matches = files.Select(f => reg.Match(f)).Where(f => f.Success).Select(x=> Convert.ToInt32(x.Value.Split('.')[1])).ToList();
var nextNumber = (matches.Max() + 1).ToString("D3"); // 3 digit with leading zeros
You might also need an if check to see whether the next number is 1000 and, if so, return 0.
(matches.Max() + 1 > 999? 0:matches.Max() + 1).ToString("D3")
My test case.
List<string> files = new List<string>();
files.Add(@"C:\Users\User\PGP Files\FILENM01P.001.PGP");
files.Add(@"C:\Users\User\PGP Files\FILENM01P.002.PGP");
files.Add(@"C:\Users\User\PGP Files\FILENM01P.003.PGP");
files.Add(@"C:\Users\User\PGP Files\FILENM01P.004.PGP");
The output is
nextNumber = "005";
Regex regex = new Regex(@"FILENM01P\.(\d+)\.", RegexOptions.IgnoreCase);
var fnumbers = Directory.GetFiles(src, "*.PGP", SearchOption.TopDirectoryOnly)
.Select(f=>regex.Match(f))
.Where(m=>m.Success)
.Select(m=>int.Parse(m.Groups[1].Value));
int fileNum = 1 + (fnumbers.Any() ? fnumbers.Max() : 0);
You can do something like this:
var reg = new Regex(@"FILENM01P\.(\d{3})\.PGP");
var matches = files.Select(f => reg.Match(f)).Where(f => f.Success).ToList();
var nextNumber = matches.Any()
? matches.Max(f => int.Parse(f.Groups[1].Value)) + 1
: 1;
Where files is a list of the files to match.
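Putting the pieces from the answers above together, a minimal sketch might look like this. PgpSequence and NextNumber are hypothetical names. The pattern is anchored to the end of the path and to a directory separator (or the start of the string), which is an assumption about how strictly the names should be matched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class PgpSequence
{
    // Matches FILENM01P.nnn.PGP as the final path segment; group 1 holds the three digits.
    private static readonly Regex NamePattern = new Regex(
        @"(?:^|[\\/])FILENM01P\.(\d{3})\.PGP$",
        RegexOptions.IgnoreCase);

    // Returns 1 when no file matches, otherwise the highest number plus one.
    // A result of 1000 tells the caller that the directory is full.
    public static int NextNumber(IEnumerable<string> paths)
    {
        int max = paths
            .Select(p => NamePattern.Match(p))
            .Where(m => m.Success)
            .Select(m => int.Parse(m.Groups[1].Value))
            .DefaultIfEmpty(0)   // avoids the exception Max() throws on an empty sequence
            .Max();
        return max + 1;
    }
}
```

The result can be formatted with ToString("D3") when building the new file name; the caller is expected to check for 1000 before doing so.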
In C#, I would like to get all files from a specific directory that matches the following mask:
prefix is "myfile_"
suffix is a number
file extension is xml
e.g.
myfile_4.xml
myfile_24.xml
the following files should not match the mask:
_myfile_6.xml
myfile_6.xml_
The code should look something like this (maybe a LINQ query can help):
string[] files = Directory.GetFiles(folder, "???");
Thanks
I am not good with regular expressions, but this might help -
var myFiles = from file in System.IO.Directory.GetFiles(folder, "myfile_*.xml")
where Regex.IsMatch(Path.GetFileName(file), @"^myfile_[0-9]+\.xml$", RegexOptions.IgnoreCase) // anchored, with the dot escaped
select file;
You can try it like:
string[] files = Directory.GetFiles("C:\\test", "myfile_*.xml");
//This will give you all the files with `xml` extension and starting with `myfile_`
//but this will also give you files like `myfile_ABC.xml`
//to filter them out
int temp;
List<string> selectedFiles = new List<string>();
foreach (string str in files)
{
string fileName = Path.GetFileNameWithoutExtension(str);
string[] tempArray = fileName.Split('_');
if (tempArray.Length == 2 && int.TryParse(tempArray[1], out temp))
{
selectedFiles.Add(str);
}
}
So if your Test folder has files:
myfile_24.xml
MyFile_6.xml
MyFile_6.xml_
myfile_ABC.xml
_MyFile_6.xml
Then you will get in selectedFiles
myfile_24.xml
MyFile_6.xml
You can do something like:
Regex reg = new Regex(@"^myfile_\d+\.xml$");
IEnumerable<string> files = Directory.GetFiles("C:\\").Where(path => reg.IsMatch(Path.GetFileName(path)));
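For reference, here is a minimal sketch of a filter that enforces the whole mask from the question (prefix, digits only, exact extension). MyFileMask and Filter are hypothetical names, and matching case-insensitively is an assumption; drop RegexOptions.IgnoreCase if the prefix must be lowercase.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class MyFileMask
{
    // The name must be the last path segment, start with "myfile_",
    // continue with digits only, and end in exactly ".xml".
    private static readonly Regex Mask = new Regex(
        @"(?:^|[\\/])myfile_\d+\.xml$",
        RegexOptions.IgnoreCase);

    // Keeps only the paths whose file name fits the mask.
    public static IEnumerable<string> Filter(IEnumerable<string> paths)
    {
        return paths.Where(p => Mask.IsMatch(p));
    }
}
```

It could be applied to the result of Directory.GetFiles(folder, "myfile_*.xml"), so that the wildcard does the coarse selection and the regex does the strict check.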
I have a text file with a certain format. First comes an identifier followed by three spaces and a colon. Then comes the value for this identifier.
ID1 :Value1
ID2 :Value2
ID3 :Value3
What I need to do is search for, e.g., ID2 : and replace Value2 with a new value, NewValue2. What would be a good way to do this? The files I need to parse won't get very large; the largest will be around 150 lines.
If the file isn't that big you can do a File.ReadAllLines to get a collection of all the lines and then replace the line you're looking for like this
using System.IO;
using System.Linq;
using System.Collections.Generic;
List<string> lines = new List<string>(File.ReadAllLines("file"));
int lineIndex = lines.FindIndex(line => line.StartsWith("ID2 :"));
if (lineIndex != -1)
{
lines[lineIndex] = "ID2 :NewValue2";
File.WriteAllLines("file", lines);
}
Here's a simple solution which also creates a backup of the source file automatically.
The replacements are stored in a Dictionary object. They are keyed on the line's ID, e.g. 'ID2' and the value is the string replacement required. Just use Add() to add more as required.
StreamWriter writer = null;
Dictionary<string, string> replacements = new Dictionary<string, string>();
replacements.Add("ID2", "NewValue2");
// ... further replacement entries ...
using (writer = File.CreateText("output.txt"))
{
foreach (string line in File.ReadLines("input.txt"))
{
bool replacementMade = false;
foreach (var replacement in replacements)
{
if (line.StartsWith(replacement.Key))
{
writer.WriteLine(string.Format("{0} :{1}",
replacement.Key, replacement.Value));
replacementMade = true;
break;
}
}
if (!replacementMade)
{
writer.WriteLine(line);
}
}
}
File.Replace("output.txt", "input.txt", "input.bak");
You'll just have to replace input.txt, output.txt and input.bak with the paths to your source, destination and backup files.
Ordinarily, for any text searching and replacement, I'd suggest some sort of regular expression work, but if this is all you're doing, that's really overkill.
I would just open the original file and a temporary file; read the original a line at a time, and just check each line for "ID2 :"; if you find it, write your replacement string to the temporary file, otherwise, just write what you read. When you've run out of source, close both, delete the original, and rename the temporary file to that of the original.
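A minimal sketch of that temporary-file approach could look like the following. LineReplacer and ReplaceLine are hypothetical names, and the ".tmp" suffix for the temporary file is an assumption (it presumes no file with that name already exists next to the original).

```csharp
using System;
using System.IO;

static class LineReplacer
{
    // Copies the file line by line into a temporary file, swapping in
    // replacementLine for every line that starts with prefix, then
    // replaces the original with the temporary file.
    public static void ReplaceLine(string path, string prefix, string replacementLine)
    {
        string tempPath = path + ".tmp"; // hypothetical temp name, assumed to be unused

        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                bool isTarget = line.StartsWith(prefix, StringComparison.Ordinal);
                writer.WriteLine(isTarget ? replacementLine : line);
            }
        }

        // Both streams are closed here, so the files can be swapped.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

For the example file, the call would be ReplaceLine(path, "ID2 :", "ID2 :NewValue2"), with the prefix written exactly as it appears in the file.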
Something like this should work. It's very simple, not the most efficient thing, but for small files, it would be just fine:
private void setValue(string filePath, string key, string value)
{
string[] lines= File.ReadAllLines(filePath);
for(int x = 0; x < lines.Length; x++)
{
string[] fields = lines[x].Split(':');
if (fields[0].TrimEnd() == key)
{
lines[x] = fields[0] + ':' + value;
File.WriteAllLines(filePath, lines);
break;
}
}
}
You can use regex and do it in 3 lines of code
string text = File.ReadAllText("sourcefile.txt");
text = Regex.Replace(text, @"(?i)(?<=^id2\s*?:\s*?)\w*?(?=\s*?$)", "NewValue2",
RegexOptions.Multiline);
File.WriteAllText("outputfile.txt", text);
In the regex, (?i)(?<=^id2\s*?:\s*?)\w*?(?=\s*?$) means: find a line that starts with id2 (case-insensitive), allowing any amount of whitespace before and after the colon, and replace the value that follows (word characters only, no punctuation) up to the end of the line. To include punctuation, replace \w*? with .*?
You can use regexes to achieve this.
Regex re = new Regex(@"^ID\d+ :Value(\d+)\s*$", RegexOptions.IgnoreCase | RegexOptions.Compiled);
string[] lines = File.ReadAllLines("mytextfile");
foreach (string line in lines) {
string replaced = re.Replace(line, processMatch);
//Now do what you going to do with the value
}
string processMatch(Match m)
{
var number = m.Groups[1];
return String.Format("ID{0} :NewValue{0}", number);
}
Directory.GetFiles(LocalFilePath, searchPattern);
MSDN Notes:
When using the asterisk wildcard character in a searchPattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern. A searchPattern with a file extension of one, two, or more than three characters returns only files having extensions of exactly that length that match the file extension specified in the searchPattern. When using the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files, "file1.txt" and "file1.txtother", in a directory, a search pattern of "file?.txt" returns just the first file, while a search pattern of "file*.txt" returns both files.
The following list shows the behavior of different lengths for the searchPattern parameter:
*.abc returns files having an extension of .abc, .abcd, .abcde, .abcdef, and so on.
*.abcd returns only files having an extension of .abcd.
*.abcde returns only files having an extension of .abcde.
*.abcdef returns only files having an extension of .abcdef.
With the searchPattern parameter set to *.abc, how can I return files having an extension of .abc, not .abcd, .abcde and so on?
Maybe this function will work:
private static bool StriktMatch(string fileExtension, string searchPattern)
{
bool isStriktMatch = false;
string extension = searchPattern.Substring(searchPattern.LastIndexOf('.'));
if (String.IsNullOrEmpty(extension))
{
isStriktMatch = true;
}
else if (extension.IndexOfAny(new char[] { '*', '?' }) != -1)
{
isStriktMatch = true;
}
else if (String.Compare(fileExtension, extension, true) == 0)
{
isStriktMatch = true;
}
else
{
isStriktMatch = false;
}
return isStriktMatch;
}
Test Program:
class Program
{
static void Main(string[] args)
{
string[] fileNames = Directory.GetFiles("C:\\document", "*.abc");
ArrayList al = new ArrayList();
for (int i = 0; i < fileNames.Length; i++)
{
FileInfo file = new FileInfo(fileNames[i]);
if (StriktMatch(file.Extension, "*.abc"))
{
al.Add(fileNames[i]);
}
}
fileNames = (String[])al.ToArray(typeof(String));
foreach (string s in fileNames)
{
Console.WriteLine(s);
}
Console.Read();
}
}
Does anybody have a better solution?
The answer is that you must do post filtering. GetFiles alone cannot do it. Here's an example that will post process your results. With this you can use a search pattern with GetFiles or not - it will work either way.
List<string> fileNames = new List<string>();
// populate all filenames here with a Directory.GetFiles or whatever
string srcDir = "from"; // set this
string destDir = "to"; // set this too
// this filters the names in the list to just those that end with ".doc"
foreach (var f in fileNames.Where(f => f.ToLower().EndsWith(".doc")))
{
try
{
File.Copy(Path.Combine(srcDir, f), Path.Combine(destDir, f));
}
catch { ... }
}
Not a bug, perverse but well-documented behavior. *.doc matches *.docx based on 8.3 fallback lookup.
You will have to manually post-filter the results for ending in doc.
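A minimal sketch of such a post-filter, assuming the comparison should be case-insensitive, is shown below. ExactExtension and Filter are hypothetical names; Path.GetExtension returns the extension including the leading dot, so the extension argument is expected in the form ".doc".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class ExactExtension
{
    // Keeps only the paths whose extension equals the given one exactly
    // (for example ".doc" keeps report.doc but drops report.docx).
    public static IEnumerable<string> Filter(IEnumerable<string> paths, string extension)
    {
        return paths.Where(p => string.Equals(
            Path.GetExtension(p),
            extension,
            StringComparison.OrdinalIgnoreCase));
    }
}
```

A typical call would be ExactExtension.Filter(Directory.GetFiles(path, "*.doc"), ".doc"). Because it works on the path strings, no FileInfo object is created per file.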
Use LINQ:
string strSomePath = "c:\\SomeFolder";
string strSomePattern = "*.abc";
string[] filez = Directory.GetFiles(strSomePath, strSomePattern);
var filtrd = from f in filez
where Path.GetExtension(f).Equals(".abc", StringComparison.OrdinalIgnoreCase) // compare the extension, not the wildcard pattern
select f;
foreach (string strSomeFileName in filtrd)
{
Console.WriteLine( strSomeFileName );
}
This won't help in the short term, but voting on the MS Connect post for this issue may get things changed in the future.
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=95415
Since for "*.abc" GetFiles will return extensions of 3 or more, anything with a length of 3 after the "." is an exact match, and anything longer is not.
string[] fileList = Directory.GetFiles(path, "*.abc");
foreach (string file in fileList)
{
FileInfo fInfo = new FileInfo(file);
if (fInfo.Extension.Length == 4) // "." is counted in the length
{
// exact extension match - process the file...
}
}
Not sure of the performance of the above - while it uses simple length comparisons rather than string manipulations, new FileInfo() is called each time around the loop.