Using Regex Replace instead of String Replace - c#

I am not clued up on Regex as much as I should be, so this may seem like a silly question.
I am splitting a string into a string[] with .Split(' ').
The purpose is to check the words, or replace any.
The problem I'm having now, is that for the word to be replaces, it has to be an exact match, but with the way I'm splitting it, there might be a ( or [ with the split word.
So far, to counter that, I'm using something like this:
formattedText.Replace(">", "> ").Replace("<", " <").Split(' ').
This works fine for now, but I want to incorporate more special chars, such as [;\\/:*?\"<>|&'].
Is there a quicker way than the method of my replacing, such as Regex? I have a feeling my route is far from the best answer.
EDIT
This is an (example) string
would be replaced to
This is an ( example ) string

If you want to replace whole words, you can do that with a regular expression like this.
string text = "This is an example (example) noexample";
string newText = Regex.Replace(text, #"\bexample\b", "!foo!");
newText will contain "This an !foo! (!foo!) noexample"
The key here is that the \b is the word break metacharacter. So it will match at the beginning or end of a line, and the transitions between word characters (\w) and non-word characters (\W). The biggest difference between it and using \w or \W is that those won't match at the beginning or end of lines.

I thing this is the right thing you want
if you want these -> ;\/:*?"<>|&' symbols to replace
string input = "(exam;\\/:*?\"<>|&'ple)";
Regex reg = new Regex("[;\\/:*?\"<>|&']");
string result = reg.Replace(input, delegate(Match m)
{
return " " + m.Value + " ";
});
if you want to replace all characters except a-zA-Z0-9_
string input = "(example)";
Regex reg = new Regex(#"\W");
string result = reg.Replace(input, delegate(Match m)
{
return " " + m.Value + " ";
});

Related

Find and replace the string in paragraph

I want to empty the value between the hyphn for example need to clear the data in between the range of hyphen prefix and suffix then make it has empty string.
string templateContent = "Template content -macro- -UnitDetails- -testEmail- sending Successfully";
Output
templateContent = "Template content sending Successfully";
templateContent = Regex.Replace(templateContent, #"-\w*-\s?", string.Empty).TrimEnd(' ');
#"-\w*-\s" - is regex pattern for '-Word- '
- - pattern for -
\w - word character.
* - zero or any occurrences of \w
\s - pattern for whitespace character
? - marks \s as optional
TrimEnd(' ') - to remove trailing space if there was a pattern at end of the string
There are many ways to do this, however given your example the following should work
var split = templateContent
.Split(' ')
.Where(x => !x.StartsWith("-") && !x.EndsWith("-"));
var result = string.Join(" ",split);
Console.WriteLine(result);
Output
Template content sending Successfully
Full Demo Here
Note : I personally think regex is better suited to this
You can use regex for this
string regExp = "(-[a-zA-Z]*-)";
string tmp = Regex.Replace(templateContent , regExp, "");
string finalStr = Regex.Replace(tmp, " {2,}", " ");
var resultWithSpaces = Regex.Replace(templateContent, #"-\S+-", string.Empty);
This regular expression looks for two hyphens surrounding one or more characters that are not white space.
It will leave the spaces that were around the removed word. To get rid of those you can do another Regex to replace multiple spaces with a single space.
var result = Regex.Replace(resultWithSpaces, #"\s+", " ");

Replace text which contain line breaks without dropping them

We have a text which goes like this ..
This is text
i want
to keep
but
Replace this sentence
because i dont like it.
Now i want to replace this sentence Replace this sentence because i dont like it.
Of course going like this
text = text.Replace(#"Replace this sentence because i dont like it.", "");
Wont solve my problem. I can't drop line breaks and replace them with one line.
My output should be
This is text
i want
to keep
but
Please keep in mind there is a lot variations and line breaks for sentence i don't like.
I.E it may go like
Replace this
sentence
because i dont like it.
or
Some text before. Replace this
sentence
because i dont like it.
You can use Regex to find any kind of whitespace. This includes regular spaces but also carriage returns and linefeeds as well as tabulators or half-spaces and so on.
string input = #"This is text
i want
to keep
but
Replace this sentence
because i dont like it.";
string dontLike = #"Replace this sentence because i dont like it.";
string pattern = Regex.Escape(dontLike).Replace(#"\ ", #"\s+");
Console.WriteLine("Pattern:");
Console.WriteLine(pattern);
string clean = Regex.Replace(input, pattern, "");
Console.WriteLine();
Console.WriteLine("Result:");
Console.WriteLine(clean);
Console.ReadKey();
Output:
Pattern:
Replace\s+this\s+sentence\s+because\s+i\s+dont\s+like\s+it\.
Result:
This is text
i want
to keep
but
Regex.Escape escapes any character that would otherwise have a special meaning in Regex. E.g., the period "." means "any number of repetitions". It also replaces the spaces " " with #"\ ". We in turn replace #"\ " in the search pattern by #"\s+". \s+ in Regex means "one or more white spaces".
Use regex to match "any whitespace" instead of just space in your search string. Roughly
escape search string to be safe for regex -Escape Special Character in Regex
replace spaces with "\s+" (reference)
run regex matching multiple lines - Multiline regular expression in C#
Or, use LINQ to accomplish this:
var text = "Drones " + Environment.NewLine + "are great to fly, " + Environment.NewLine + "yes, very fun!";
var textToReplace = "Drones are great".Split(" ").ToList();
textToReplace.ForEach(f => text = text.Replace(f, ""));
Output:
to fly,
yes, very fun!
Whatever method you choose, you are going to deal with extra line breaks, too many spaces and other formatting issues... Good luck!
You can use something like this, if output format of string is optional here:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string textToReplace = #"Replace this sentence because i dont like it.";
string text = #"This is text
i want
to keep
but
Replace this sentence
because i dont like it.";
text = Regex.Replace(text, #"\s+", " ", RegexOptions.Multiline);
text = text.Replace(textToReplace, string.Empty);
Console.WriteLine(text);
}
}
Output:
"This is text i want to keep but"

Replace a part of string containing Password

Slightly similar to this question, I want to replace argv contents:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
to this:
"-help=none\n-URL=(default)\n-password=********\n-uname=Khanna\n-p=100"
I have tried very basic string find and search operations (using IndexOf, SubString etc.). I am looking for more elegant solution so as to replace this part of string:
-password=AnyPassword
to:
-password=*******
And keep other part of string intact. I am looking if String.Replace or Regex replace may help.
What I've tried (not much of error-checks):
var pwd_index = argv.IndexOf("--password=");
string converted;
if (pwd_index >= 0)
{
var leftPart = argv.Substring(0, pwd_index);
var pwdStr = argv.Substring(pwd_index);
var rightPart = pwdStr.Substring(pwdStr.IndexOf("\n") + 1);
converted = leftPart + "--password=********\n" + rightPart;
}
else
converted = argv;
Console.WriteLine(converted);
Solution
Similar to Rubens Farias' solution but a little bit more elegant:
string argv = "-help=none\n-URL=(default)\n-password=\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, #"(password=)[^\n]*", "$1********");
It matches password= literally, stores it in capture group $1 and the keeps matching until a \n is reached.
This yields a constant number of *'s, though. But telling how much characters a password has, might already convey too much information to hackers, anyway.
Working example: https://dotnetfiddle.net/xOFCyG
Regular expression breakdown
( // Store the following match in capture group $1.
password= // Match "password=" literally.
)
[ // Match one from a set of characters.
^ // Negate a set of characters (i.e., match anything not
// contained in the following set).
\n // The character set: consists only of the new line character.
]
* // Match the previously matched character 0 to n times.
This code replaces the password value by several "*" characters:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, #"(password=)([\s\S]*?\n)",
match => match.Groups[1].Value + new String('*', match.Groups[2].Value.Length - 1) + "\n");
You can also remove the new String() part and replace it by a string constant

Regex to remove specific string if exist

I wanna remove the -L from the end of my string if exists
So
ABCD => ABCD
ABCD-L => ABCD
at the moment I'm using something like the line below which uses the if/else type of arrangement in my Regex, however, I have a feeling that it should be way more easier than this.
var match = Regex.Match("...", #"(?(\S+-L$)\S+(?=-L)|\S+)");
How about just doing:
Regex rgx = new Regex("-L$");
string result = rgx.Replace("ABCD-L", "");
So basically: if the string ends with -L, replace that part with an empty string.
If you want to not only invoke the replacement at the end of the string, but also at the end of a word, you can add an additional switch to detect word boundaries (\b) in addition to the end of the string:
Regex rgx = new Regex("-L(\b|$)");
string result = rgx.Replace("ABCD-L ABCD ABCD-L", "");
Note that detecting word boundaries can be a little ambiguous. See here for a list of characters that are considered to be word characters in C#.
You also can use String.Replace() method to find a specific string inside a string and replace it with another string in this case with an empty string.
http://msdn.microsoft.com/en-us/library/fk49wtc1(v=vs.110).aspx
Use Regex.Replace function,
Regex.Replace(string, #"(\S+?)-L(?=\s|$)", "$1")
DEMO
Explanation:
( group and capture to \1:
\S+? non-whitespace (all but \n, \r, \t, \f,
and " ") (1 or more times)
) end of \1
-L '-L'
(?= look ahead to see if there is:
\s whitespace (\n, \r, \t, \f, and " ")
| OR
$ before an optional \n, and the end of
the string
) end of look-ahead
You certainly can use Regex for this, but why when using normal string functions is clearer?
Compare this:
text = text.EndsWith("-L")
? text.Substring(0, text.Length - "-L".Length)
: text;
to this:
text = Regex.Replace(text, #"(\S+?)-L(?=\s|$)", "$1");
Or better yet, define an extension method like this:
public static string RemoveIfEndsWith(this string text, string suffix)
{
return text.EndsWith(suffix)
? text.Substring(0, text.Length - suffix.Length)
: text;
}
Then your code can look like this:
text = text.RemoveIfEndsWith("-L");
Of course you can always define the extension method using the Regex. At least then your calling code looks a lot cleaner and is far more readable and maintainable.

Replace text in string with delimeters using Regex

I have a string something like,
string str = "(50%silicon +20%!(20%Gold + 80%Silver)| + 30%Alumnium)";
I need a Regular Expression which would Replace the contents in between ! and | with an empty string. The result should be (50%silicon +20% + 30%Alumnium).
If the string contains something like (with nested delimiters):
string str = "(50%silicon +20%!(80%Gold + 80%Silver + 20%!(20%Iron + 80%Silver)|)|
+ 30%Alumnium)";
The result should be (50%silicon +20% + 30%Alumnium) - ignoring the nested delimiters.
I've tried the following Regex, but it doesn't ignore the nesting:
Regex.Replace(str , #"!.+?\|", "", RegexOptions.IgnoreCase);
You are using the lazy quantifier +? which will look for the smallest possible substring that matches your regex. To get the result you are looking for, you want to use the greedy quantifier + which will match the largest substring possible.
The following regex (not tested in C# because I don't have it available, but this should work for any standard regex implementation) will do what you want:
'!.+\|'
using System.Text.RegularExpressions;
str = Regex.Replace(str , #"!.+?\|", "", RegexOptions.IgnoreCase);
Regex.Replace(str, #"!.+?\||\)\|", "", RegexOptions.IgnoreCase);
Works for both provided strings. I extended the regex with a 2nd check on ")/" to replace the leftover characters.

Categories