I have an app that replaces "invalid" characters (as defined by my regex) with a blank space. If that leaves two or more consecutive spaces in the filename, I want them collapsed to one. For example:
Deal A & B.txt would be renamed to Deal A   B.txt after my app runs (3 spaces between A and B). What I really want is Deal A B.txt (one space between A and B).
I'm trying to work out how to do this. I suppose my app will have to run through all filenames once to replace invalid characters and then again to get rid of the extraneous whitespace.
Can anybody help me with this?
Here is my code currently for replacing the invalid chars:
public partial class CleanNames : Form
{
public CleanNames()
{
InitializeComponent();
}
public void Sanitizer(List<string> paths)
{
string regPattern = @"[~#&$!%+{}]+";
string replacement = " ";
Regex regExPattern = new Regex(regPattern);
StreamWriter errors = new StreamWriter(@"S:\Testing\Errors.txt", true);
var filesCount = new Dictionary<string, int>();
dataGridView1.Rows.Clear();
try
{
foreach (string files2 in paths)
{
string filenameOnly = System.IO.Path.GetFileName(files2);
string pathOnly = System.IO.Path.GetDirectoryName(files2);
string sanitizedFileName = regExPattern.Replace(filenameOnly, replacement);
string sanitized = System.IO.Path.Combine(pathOnly, sanitizedFileName);
if (!System.IO.File.Exists(sanitized))
{
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = pathOnly;
clean.Cells[1].Value = filenameOnly;
clean.Cells[2].Value = sanitizedFileName;
dataGridView1.Rows.Add(clean);
System.IO.File.Move(files2, sanitized);
}
else
{
if (filesCount.ContainsKey(sanitized))
{
filesCount[sanitized]++;
}
else
{
filesCount.Add(sanitized, 1);
}
string newFileName = String.Format("{0}{1}{2}",
System.IO.Path.GetFileNameWithoutExtension(sanitized),
filesCount[sanitized].ToString(),
System.IO.Path.GetExtension(sanitized));
string newFilePath = System.IO.Path.Combine(System.IO.Path.GetDirectoryName(sanitized), newFileName);
System.IO.File.Move(files2, newFilePath);
sanitized = newFileName;
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = pathOnly;
clean.Cells[1].Value = filenameOnly;
clean.Cells[2].Value = newFileName;
dataGridView1.Rows.Add(clean);
}
}
}
catch (Exception e)
{
errors.Write(e);
}
}
private void SanitizeFileNames_Load(object sender, EventArgs e)
{ }
private void dataGridView1_CellContentClick(object sender, DataGridViewCellEventArgs e)
{
}
private void button1_Click(object sender, EventArgs e)
{
Application.Exit();
}
}
The problem is that not all files will have the same number of blank spaces after a rename. For instance, I could have Deal A&B.txt, which after a rename would become Deal A B.txt (1 space between A and B, which is fine). But I will also have files like Deal A & B & C.txt, which after a rename become Deal A   B   C.txt (3 spaces between A, B and C, which is not acceptable).
Does anybody have any ideas/code for how to accomplish this?
Do the local equivalent of:
s/\s+/ /g;
Just add a space to your regPattern. Any collection of invalid characters and spaces will be replaced with a single space. You may waste a little bit of time replacing a space with a space, but on the other hand you won't need a second string manipulation call.
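A minimal sketch of that suggestion, assuming the same invalid-character set as in the question. The names `NameCleaner` and `Clean` are made up for illustration. With a literal space inside the character class, any run of invalid characters and spaces is replaced by exactly one space in a single pass:

```csharp
using System.Text.RegularExpressions;

public static class NameCleaner
{
    // The question's character class with a literal space added before the closing bracket.
    private static readonly Regex InvalidOrSpace = new Regex(@"[~#&$!%+{} ]+");

    public static string Clean(string fileName)
    {
        // Every run of invalid characters and/or spaces becomes one space.
        return InvalidOrSpace.Replace(fileName, " ");
    }
}
```

Note that a name beginning with an invalid character would end up with a leading space, so a final `Trim()` may be worth considering.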
Does this help?
var regex = new System.Text.RegularExpressions.Regex("\\s{2,}");
var result = regex.Replace("Some text  with a   lot of    spaces, and 2\t\ttabs.", " ");
Console.WriteLine(result);
output is:
Some text with a lot of spaces, and 2 tabs.
It just replaces any sequence of 2 or more whitespace characters with a single space...
Edit:
To clarify, I would just perform this regex right after your existing one:
public void Sanitizer(List<string> paths)
{
string regPattern = @"[~#&$!%+{}]+";
string replacement = " ";
Regex regExPattern = new Regex(regPattern);
Regex regExPattern2 = new Regex(@"\s{2,}");
and:
foreach (string files2 in paths)
{
string filenameOnly = System.IO.Path.GetFileName(files2);
string pathOnly = System.IO.Path.GetDirectoryName(files2);
string sanitizedFileName = regExPattern.Replace(filenameOnly, replacement);
sanitizedFileName = regExPattern2.Replace(sanitizedFileName, replacement); // clean up whitespace
string sanitized = System.IO.Path.Combine(pathOnly, sanitizedFileName);
I hope that makes more sense.
You can perform another regex replace after your first one:
@" +" -> " "
As Fosco said, with formatting:
while (mystring.Contains("  ")) mystring = mystring.Replace("  ", " ");
// Contains and the first Replace argument hold two spaces; the replacement holds one.
After you're done sanitizing it your way, simply replace two spaces with one space for as long as two consecutive spaces exist in the string.
while (mystring.Contains("  ")) mystring = mystring.Replace("  ", " ");
I think that's the right syntax...
I have
string filepath = @"F:\first_folder\Node3_V_1.3";
I have a button named version_check.
On the click of this button I want to print the message:
upgrading node3 to version 1.3
The details are fetched from the string filepath. How do I code this in C#?
You can do it this way:
version_check.Click += version_check_Click; // subscribe to the event
public void version_check_Click(object sender, EventArgs e)
{
string filepath = @"F:\first_folder\Node3_V_1.3";
var name = Path.GetFileName(filepath).Split(new[] {'_', 'V'}, StringSplitOptions.RemoveEmptyEntries);
if (name.Length < 2) // both the name and the version are needed below
{
MessageBox.Show("Failed to update");
return;
}
MessageBox.Show(string.Format("Upgrading {0} to version {1} ...", name[0], name[1]));
}
This is really simple using the classes already available in the .NET Framework.
The Path class has a GetFileName method that returns the last part of a path, even if it is not a real file name but just a folder:
string filePath = @"F:\first_folder\Node3_V_1.3";
string lastFolder = Path.GetFileName(filePath);
Console.WriteLine(lastFolder);
Now, if your lastFolder is always composed of three parts (i.e. the software name, V for version, and finally the version number), you could use the Split method of the String class to divide lastFolder into three parts:
string[] parts = lastFolder.Split('_');
Console.WriteLine("Upgrading {0} to version {1}", parts[0], parts[2]);
If you work with C# 6.0 or later, you could also write the last statement using string interpolation:
Console.WriteLine($"Upgrading {parts[0]} to version {parts[2]}");
This will work for the sample folder path you gave:
private void version_check_Click(object sender, EventArgs e)
{
string version = "F:\\first_folder\\Node3_V_1.3";
var result = version.Substring(version.LastIndexOf('\\') + 1);
string[] splitString = result.Split('_'); // Split already returns an array
MessageBox.Show("upgrading " + splitString[0] + " to version " + splitString[2]);
}
Regex solution:
string filepath = @"F:\first_folder\Node3_V_1.3";
string filename = Path.GetFileName(filepath);
Match m = Regex.Match(filename, "^(?<name>.+)_V_(?<version>.+)$");
string output = string.Format("upgrading {0} to version {1}",
m.Groups["name"].Value,
m.Groups["version"].Value);
String.Split solution:
string filepath = @"F:\first_folder\Node3_V_1.3";
string filename = Path.GetFileName(filepath);
string[] parts = filename.Split(new[] { "_V_" }, 2, StringSplitOptions.None);
string output = string.Format("upgrading {0} to version {1}",
parts[0],
parts[1]);
When I copy a string from an Excel data set into the text box, the string has HUGE spaces between each item.
I currently have if (textBox1.Text.Contains(" ") == true) to detect the spaces in the string.
What would I use to delete those spaces?
Bonus question: I still need one space between each item in the string. How would I add that and still delete the massive spaces?
private void radioGenerateScript_CheckedChanged(object sender, EventArgs e)
{
hexData.Cells.Copy();
textBox1.Clear();
textBox1.Paste();
if (textBox1.Text.Contains(" ") == true)
{
}
}
private void radioWriteScript_CheckedChanged(object sender, EventArgs e)
{
string waveForm = textBox1.Text;
System.IO.File.WriteAllText("E:/Scripts/Test.us1", waveForm);
}
If you want to remove all kinds of whitespace, use:
textBox1.Text = Regex.Replace(textBox1.Text, @"\s+", "");
\s matches all whitespace characters (spaces, tabs and newlines).
textBox1.Text = Regex.Replace(textBox1.Text, " +", " ");
It seems that you have tabs as separators, so the following is better (as Alexei suggested):
textBox1.Text = Regex.Replace(textBox1.Text, @"\s+", " ");
textBox1.Text = textBox1.Text.Replace(" ", "");
If you want to keep some spaces then use Split and string.Join
var words = textBox1.Text.Split(new [] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
textBox1.Text = string.Join(" ", words);
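Since the pasted Excel data appears to be tab-separated, splitting on the space character alone may not be enough. Here is a small sketch, with hypothetical names `WhitespaceNormalizer` and `Normalize`, that splits on every whitespace character and rejoins the items with single spaces:

```csharp
using System;

public static class WhitespaceNormalizer
{
    public static string Normalize(string text)
    {
        // A null separator array makes Split treat every whitespace character
        // (spaces, tabs, line breaks) as a separator.
        string[] items = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", items);
    }
}
```

It could be used as `textBox1.Text = WhitespaceNormalizer.Normalize(textBox1.Text);`. Line breaks are treated as separators too, so multi-line input ends up on one line.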
I am a rookie in C#, but I need to solve one problem.
I have several text files in a folder, and each text file has this structure:
IdNr 000000100
Name Name
Lastname Lastname
Sex M
.... etc...
Loading all the files from the folder is no problem, but I need to delete the leading zeros in IdNr, so that 000000 is removed and 100 is left, and then save the file. Each file has a different IdNr, which makes it harder :(
Yes, it is possible to edit each file manually, but with 3000 files that is not practical :)
Is there an algorithm in C# that could delete this 000000 and leave only the number 100?
Thank you all.
Vaclav
So, thank you ALL !
But in the End I have this Code :-) :
using System;
using System.IO;
using System.Text;
using System.Windows.Forms;
namespace name
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Browse_Click(object sender, EventArgs e)
{
DialogResult dialog = folderBrowserDialog1.ShowDialog();
if (dialog == DialogResult.OK)
TP_zdroj.Text = folderBrowserDialog1.SelectedPath;
}
private void start_Click(object sender, EventArgs e)
{
try
{
foreach (string file in Directory.GetFiles(TP_zdroj.Text, "*.txt"))
{
string text = File.ReadAllText(file, Encoding.Default);
text = System.Text.RegularExpressions.Regex.Replace(text, "IdNr 0+(?=[0-9])", "IdNr "); // strip any run of leading zeros but keep the last digit
File.WriteAllText(file, text, Encoding.Default);
}
}
catch
{
MessageBox.Show("Warning...!");
return;
}
MessageBox.Show("Done");
}
}
}
Thank you ALL ! ;)
You can use int.Parse:
int number = int.Parse("000000100");
String withoutzeros = number.ToString();
Regarding your read/save file issue: do the files contain more than one record, is that a header, or is each record a list of key/value pairs like "IdNr 000000100"? It's difficult to answer without this information.
Edit: Here's a simple but efficient approach which should work if the format is strict:
var files = Directory.EnumerateFiles(path, "*.txt", SearchOption.TopDirectoryOnly);
foreach (var fPath in files)
{
String[] oldLines = File.ReadAllLines(fPath); // load into memory is faster when the files are not really huge
String key = "IdNr ";
if (oldLines.Length != 0)
{
IList<String> newLines = new List<String>();
foreach (String line in oldLines)
{
String newLine = line;
if (line.Contains(key))
{
int numberRangeStart = line.IndexOf(key) + key.Length;
int numberRangeEnd = line.IndexOf(" ", numberRangeStart);
if (numberRangeEnd == -1) numberRangeEnd = line.Length; // the number runs to the end of the line
String numberStr = line.Substring(numberRangeStart, numberRangeEnd - numberRangeStart);
int number = int.Parse(numberStr);
String withoutZeros = number.ToString();
newLine = line.Replace(key + numberStr, key + withoutZeros);
}
newLines.Add(newLine);
}
File.WriteAllLines(fPath, newLines);
}
}
Use TrimStart
var trimmedText = number.TrimStart('0');
This should do it. It assumes your files have a .txt extension and sit in the current directory, and it removes every occurrence of "000000" from each file, not only the one in the IdNr line.
foreach (string fileName in Directory.GetFiles(".", "*.txt"))
{
File.WriteAllText(fileName, File.ReadAllText(fileName).Replace("000000", ""));
}
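A more targeted variant is sketched below. It assumes the field is literally named IdNr and sits at the start of a line, as in the sample file. The names `IdNrRegex` and `Strip` are hypothetical. The pattern only touches zeros that directly follow IdNr, and the lookahead keeps at least one digit, so an all-zero ID becomes 0 rather than an empty value:

```csharp
using System.Text.RegularExpressions;

public static class IdNrRegex
{
    // (?m) makes ^ match at the start of every line.
    // Group 1 captures "IdNr" plus the spaces or tabs after it;
    // 0+(?=\d) matches leading zeros that are still followed by a digit.
    private static readonly Regex LeadingZeros =
        new Regex(@"(?m)^(IdNr[ \t]+)0+(?=\d)");

    public static string Strip(string text)
    {
        // Keep group 1, drop the matched zeros.
        return LeadingZeros.Replace(text, "$1");
    }
}
```

It could be dropped into the loop as `File.WriteAllText(fileName, IdNrRegex.Strip(File.ReadAllText(fileName)));`.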
These are the steps you would want to take:
Loop each file
Read file line by line
for each line split on " " and remove leading zeros from 2nd element
write the new line back to a temp file
after all lines processed, delete original file and rename temp file
do next file
(you can avoid the temp file part by reading each file in full into memory, but depending on your file sizes this may not be practical)
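The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are invented, the temp file is simply the original path plus ".tmp", and the files are read and written with the default StreamReader/StreamWriter encoding (UTF-8), which may not match the original files.

```csharp
using System;
using System.IO;

public static class IdNrFixer
{
    // Strips leading zeros from the second space-separated field of an "IdNr" line.
    // Other lines are returned unchanged.
    public static string FixLine(string line)
    {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2 || parts[0] != "IdNr")
            return line;

        string trimmed = parts[1].TrimStart('0');
        parts[1] = trimmed.Length == 0 ? "0" : trimmed; // keep one zero for an all-zero ID
        return string.Join(" ", parts);                 // fields are rejoined with single spaces
    }

    public static void FixFile(string path)
    {
        string tempPath = path + ".tmp"; // hypothetical temp-file naming

        // Read line by line and write each processed line to the temp file.
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                writer.WriteLine(FixLine(line));
        }

        // Replace the original with the processed copy.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

A caller would loop over the folder, for example `foreach (string file in Directory.GetFiles(folder, "*.txt")) IdNrFixer.FixFile(file);`, where `folder` is whatever directory holds the files.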
You can remove the leading zeros with something like this:
string s = "000000100";
s = s.TrimStart('0');
Simply, read every token from the file and use this method:
var token = "000000100";
var result = token.TrimStart('0');
You can write a function similar to this one:
static IEnumerable<string> ModifiedLines(string file) {
string line;
using(var reader = File.OpenText(file)) {
while((line = reader.ReadLine()) != null) {
string[] tokens = line.Split(new char[] { ' ' });
for (int i = 0; i < tokens.Length; i++)
{
tokens[i] = tokens[i].TrimStart('0'); // note: a token that is only zeros becomes empty
}
line = string.Join(" ", tokens); // no trailing space
yield return line;
}
}
}
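Usage (requires using System.Linq):
File.WriteAllLines(file, ModifiedLines(file).ToList()); // materialize first so the reader is closed before the file is rewritten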
I am working on an ASP.NET 4.0 web application. Its main job is to go to the URL in the MyURL variable, read the page from top to bottom, search for all lines that start with "description", and keep only those while removing all HTML tags. What I want to do next is remove the "description" text from the results afterwards so that I have just my device names left. How would I do this?
protected void parseButton_Click(object sender, EventArgs e)
{
MyURL = deviceCombo.Text;
WebRequest objRequest = HttpWebRequest.Create(MyURL);
objRequest.Credentials = CredentialCache.DefaultCredentials;
using (StreamReader objReader = new StreamReader(objRequest.GetResponse().GetResponseStream()))
{
originalText.Text = objReader.ReadToEnd();
}
//Read all lines of file
String[] crString = { "<BR> " };
String[] aLines = originalText.Text.Split(crString, StringSplitOptions.RemoveEmptyEntries);
String noHtml = String.Empty;
for (int x = 0; x < aLines.Length; x++)
{
if (aLines[x].Contains(filterCombo.SelectedValue))
{
noHtml += (RemoveHTML(aLines[x]) + "\r\n");
}
}
//Print results to textbox
resultsBox.Text = String.Join(Environment.NewLine, noHtml);
}
public static string RemoveHTML(string text)
{
text = text.Replace("&nbsp;", " ").Replace("<br>", "\n");
var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
return oRegEx.Replace(text, string.Empty);
}
Ok so I figured out how to remove the words through one of my existing functions:
public static string RemoveHTML(string text)
{
text = text.Replace("&nbsp;", " ").Replace("<br>", "\n").Replace("description", "").Replace("INFRA:CORE:", "")
.Replace("RESERVED", "")
.Replace(":", "")
.Replace(";", "")
.Replace("-0/3/0", "");
var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
return oRegEx.Replace(text, string.Empty);
}
public static void Main(String[] args)
{
string str = "He is driving a red car.";
Console.WriteLine(str.Replace("red", "").Replace("  ", " "));
}
Output:
He is driving a car.
Note: the first argument of the second Replace is a double space.
Link: https://i.stack.imgur.com/rbluf.png
Try this. It will remove all occurrences of the word you want to remove.
Try something like this, using LINQ:
List<string> lines = new List<string>{
"Hello world",
"Description: foo",
"Garbage:baz",
"description purple"};
// now add all your lines from your HTML doc (this goes inside your existing loop over aLines).
if (aLines[x].Contains(filterCombo.SelectedValue))
{
lines.Add(RemoveHTML(aLines[x]) + "\r\n");
}
var myDescriptions = lines.Where(x=>x.ToLower().StartsWith("description"))
.Select(x=> x.ToLower().Replace("description",string.Empty)
.Trim());
// you now have "foo" and "purple", and anything else.
You may have to adjust for colons, etc.
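One possible way to handle the colon, sketched with hypothetical names `DescriptionFilter` and `Extract`. It matches the prefix case-insensitively without lowercasing the value, then strips a colon and any spaces that follow the prefix:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DescriptionFilter
{
    private const string Prefix = "description";

    public static List<string> Extract(IEnumerable<string> lines)
    {
        return lines
            .Select(l => l.Trim())
            // Case-insensitive prefix test, so "Description" and "description" both match.
            .Where(l => l.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            // Drop the prefix, then any colon or spaces directly after it.
            .Select(l => l.Substring(Prefix.Length).TrimStart(':', ' ').Trim())
            .ToList();
    }
}
```

For the four sample lines above this yields "foo" and "purple", with the original casing of the values preserved.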
void Main()
{
string test = "<html>wowzers description: none <div>description:a1fj391</div></html>";
IEnumerable<string> results = getDescriptions(test);
foreach (string result in results)
{
Console.WriteLine(result);
}
//result: none
// a1fj391
}
static Regex MyRegex = new Regex(
"description:\\s*(?<value>[\\d\\w]+)",
RegexOptions.Compiled);
IEnumerable<string> getDescriptions(string html)
{
foreach(Match match in MyRegex.Matches(html))
{
yield return match.Groups["value"].Value;
}
}
Adapted From Code Project
string value = "ABC - UPDATED";
int index = value.IndexOf(" - UPDATED");
if (index != -1)
{
value = value.Remove(index);
}
After this, value is "ABC", without the " - UPDATED" suffix.
My app takes "unclean" file names and "cleans" them up. "Unclean" file names contain characters like #, #, ~, +, %, etc. The "cleaning" process replaces those chars with "". However, I found that if there are two files in the same folder that, after a cleaning, will have the same name, my app does not rename either file. (I.e. ##test.txt and ~test.txt will both be named test.txt after the cleaning).
Therefore, I put in a loop that basically checks to see if the file name my app is trying to rename already exists in the folder. However, I tried running this and it would not rename all the files. Am I doing something wrong?
Here's my code:
public void FileCleanup(List<string> paths)
{
string regPattern = @"[~#&!%+{}]+";
string replacement = "";
Regex regExPattern = new Regex(regPattern);
List<string> existingNames = new List<string>();
StreamWriter errors = new StreamWriter(@"C:\Documents and Settings\joe.schmoe\Desktop\SharePointTesting\Errors.txt");
StreamWriter resultsofRename = new StreamWriter(@"C:\Documents and Settings\joe.schmoe\Desktop\SharePointTesting\Results of File Rename.txt");
var filesCount = new Dictionary<string, int>();
string replaceSpecialCharsWith = "_";
foreach (string files2 in paths)
try
{
string filenameOnly = Path.GetFileName(files2);
string pathOnly = Path.GetDirectoryName(files2);
string sanitizedFileName = regExPattern.Replace(filenameOnly, replacement);
string sanitized = Path.Combine(pathOnly, sanitizedFileName);
if (!System.IO.File.Exists(sanitized))
{
System.IO.File.Move(files2, sanitized);
resultsofRename.Write("Path: " + pathOnly + " / " + "Old File Name: " + filenameOnly + "New File Name: " + sanitized + "\r\n" + "\r\n");
}
else
{
existingNames.Add(sanitized);
foreach (string names in existingNames)
{
string sanitizedPath = regExPattern.Replace(names, replaceSpecialCharsWith);
if (filesCount.ContainsKey(sanitizedPath))
{
filesCount[names]++;
}
else
{
filesCount.Add(sanitizedPath, 1);
}
string newFileName = String.Format("{0},{1}, {2}", Path.GetFileNameWithoutExtension(sanitizedPath),
filesCount[sanitizedPath] != 0
? filesCount[sanitizedPath].ToString()
: "",
Path.GetExtension(sanitizedPath));
string newFilePath = Path.Combine(Path.GetDirectoryName(sanitizedPath), newFileName);
System.IO.File.Move(names, newFileName);
}
}
}
catch (Exception e)
{
//write to streamwriter
}
}
}
Anybody have ANY idea why my code won't rename duplicate files uniquely?
You do foreach (string names in existingNames), but existingNames is empty.
You have your if (System.IO.File.Exists(sanitized)) backwards: it makes up a new name if the file doesn't exist, instead of when it exists.
You make a string newFileName, but still use sanitizedPath instead of newFileName to do the renaming.
The second parameter to filesCount.Add(sanitizedPath, 0) should be 1 or 2. After all, you have then encountered your second file with the same name.
If filesCount[sanitizedPath] equals 0, you don't change the filename at all, so you overwrite the existing file.
In addition to the problems pointed out by Sjoerd, it appears that you are checking whether the file exists and, if it does, moving it. Your if statement should be:
if (!System.IO.File.Exists(sanitized))
{
...
}
else
{
foreach (string names in existingNames)
{
...
}
}
}
Update:
I agree that you should split the code up into smaller methods. It will help you identify which pieces are working and which aren't. That being said, I would get rid of the existingNames list. It is not needed because you have the filesCount Dictionary. Your else clause would then look something like this:
if (filesCount.ContainsKey(sanitized))
{
filesCount[sanitized]++;
}
else
{
filesCount.Add(sanitized, 1);
}
string newFileName = String.Format("{0}{1}{2}",
Path.GetFileNameWithoutExtension(sanitized),
filesCount[sanitized].ToString(),
Path.GetExtension(sanitized)); // GetExtension already includes the leading dot
string newFilePath = Path.Combine(Path.GetDirectoryName(sanitized), newFileName);
System.IO.File.Move(files2, newFilePath);
Please note that I changed your String.Format call. You had some commas and spaces in there that looked incorrect for building a path, although I could be missing something in your implementation. Also, in the Move I changed the first argument from "names" to "files2", and the destination is the full newFilePath rather than the bare file name.
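As a self-contained illustration of the counter idea, here is a sketch that keeps trying test1.txt, test2.txt, and so on until a free name is found. The names `UniqueNamer` and `MakeUnique` are hypothetical, and the existence check is passed in as a delegate purely so the logic can be exercised without touching the disk. In the real app one would presumably pass `File.Exists`.

```csharp
using System;
using System.IO;

public static class UniqueNamer
{
    // Returns 'path' if it is free, otherwise the first free variant with a
    // counter inserted before the extension (test.txt -> test1.txt, test2.txt, ...).
    public static string MakeUnique(string path, Func<string, bool> exists)
    {
        if (!exists(path))
            return path;

        string dir = Path.GetDirectoryName(path) ?? "";
        string stem = Path.GetFileNameWithoutExtension(path);
        string ext = Path.GetExtension(path); // includes the leading dot

        for (int i = 1; ; i++)
        {
            string candidate = Path.Combine(dir, stem + i + ext);
            if (!exists(candidate))
                return candidate;
        }
    }
}
```

A call such as `File.Move(files2, UniqueNamer.MakeUnique(sanitized, File.Exists));` would then replace the dictionary bookkeeping, at the cost of a few extra existence checks per collision.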
A good way to make the code less messy would be to split it to methods as logical blocks.
FindUniqueName(string filePath, string fileName);
The method would prefix the fileName with a character until the fileName is unique within the filePath.
MoveFile(string filePath, string from, string to);
The method would use the FindUniqueName method if the file already exists.
It would be way easier to test the cleanup that way.
Also you should check if a file actually requires renaming:
if (String.Compare(sanitizedFileName, filenameOnly, true) != 0)
MoveFile(pathOnly, filenameOnly, sanitizedFileName);
private string FindUniqueName(string fileDirectory, string from, string to)
{
string fileName = to;
// There most likely won't be that many files with the same name to reach max filename length.
while (File.Exists(Path.Combine(fileDirectory, fileName)))
{
fileName = "_" + fileName;
}
return fileName;
}
private void MoveFile(string fileDirectory, string from, string to)
{
to = FindUniqueName(fileDirectory, from, to);
File.Move(Path.Combine(fileDirectory, from), Path.Combine(fileDirectory, to));
}