replacing whole word with .Replace() using \b not working

I have a string in which I want to replace a whole word. This is what I have:
var TheWord = "SomeWord";
TheWord = "\b" + TheWord + "\b";
TheText = TheText.replace(TheWord, "SomeOtherWord");
I'm using "\b" because I only want to replace "SomeWord", not "SomeWordDifferent". The text looks like this: var TheHTML = '<div class="SomeWord">'; However, the replacement doesn't take place. What do I need to change?

You need to escape the backslashes. Try either of these...
TheWord = @"\b" + TheWord + @"\b";
or
TheWord = "\\b" + TheWord + "\\b";

I assume you are trying to use Regex. The method for this is
string Regex.Replace(string input, string replacement)
So I think this is what you want:
string text = ...; // text comes from somewhere
string pattern = @"\bSomeWord\b"; // escape \b (word boundary regex anchor), or use verbatim string literal, like here
var regex = new Regex(pattern);
text = regex.Replace(text, "SomeOtherWord");
Or simply the static version of the Replace method, as Tim wrote:
text = Regex.Replace(text, pattern, "SomeOtherWord");
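As a minimal sketch of the whole-word replacement described above, the word can be wrapped in \b anchors and passed to the static Regex.Replace. The class name WholeWord and the sample HTML string are assumptions made for illustration; Regex.Escape is added so that a word containing regex metacharacters is treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWord
{
    // Replaces `word` only where it stands as a whole word.
    public static string Replace(string text, string word, string replacement)
    {
        // Verbatim strings keep \b as the regex word boundary
        // instead of the C# backspace escape.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Replace(text, pattern, replacement);
    }

    public static void Main()
    {
        // Hypothetical input: one exact class name, one longer name.
        string html = "<div class=\"SomeWord\"> <span class=\"SomeWordDifferent\">";
        Console.WriteLine(Replace(html, "SomeWord", "SomeOtherWord"));
        // prints: <div class="SomeOtherWord"> <span class="SomeWordDifferent">
    }
}
```

Note that string.Replace does no pattern matching at all, so \b only has an effect once the call goes through Regex.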

Related

C# Filter a word with an undefined number of spaces between characters

For example:
I can create a word with a space after each character like this:
string example = "**example**";
List<string> outputs = new List<string>();
string example_output = "";
foreach (char c in example)
{
    example_output += c + " ";
}
I could then loop over it, removing the spaces and adding each variant to the outputs list.
The problem is that I need this to work in scenarios where there are double spaces or more.
For example:
string text = "This is a piece of text for this **example**.";
I basically want to detect AND remove 'example'.
But I want to do that even when it is written as e xample, e x ample or example.
In my scenario, since it's a spam filter, I can't just strip the spaces from the whole sentence like below, because I'd need to .Replace() the word with exactly the same spacing the user typed.
.Replace(" ", "");
How would I achieve this?
TL;DR:
I want to filter out a word in any combination of spaces without altering any other part of the line.
So example, e xample, e x ample, e x a m ple
all become a filtered word.
I wouldn't mind a method that generates the word with all space combinations as plan B.

You can use this regex to achieve that:
(e[\s]*x[\s]*a[\s]*m[\s]*p[\s]*l[\s]*e)

You could use a regex for that: e\s*x\s*a\s*m\s*p\s*l\s*e
\s means any whitespace character and the * means 0-n count of that whitespace.
Small snippet:
const string myInput = "e x ample";
var regex = new Regex(@"e\s*x\s*a\s*m\s*p\s*l\s*e"); // verbatim string, so \s reaches the regex engine unchanged
var match = regex.Match(myInput);
if (match.Success)
{
// We have a match! Bad word
}
Here the link for the regex: https://regex101.com/r/VFjzTg/1

I see that the problem is to ignore the spaces in the matchstring, but not touch them anywhere else in the string.
You could create a regular expression out of your matchword, allowing arbitrary whitespace between each character.
// prepare regex. Need to do this only once for many applications.
string findword = "example";
// TODO: would need to escape special chars like * ( ) \ . + ? here.
string[] tmp = new string[findword.Length];
for(int i=0;i<tmp.Length;i++)tmp[i]=findword.Substring(i,1);
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(string.Join("\\s*",tmp));
// on each text to filter, do this:
string inp = "A text with the exa mple word in it.";
string outp;
outp = r.Replace(inp,"");
System.Console.WriteLine(outp);
Left out the escaping of regex-special-chars for brevity.

You can try regular expressions:
using System.Linq; // for toFind.Select(...)
using System.Text.RegularExpressions;
....
// Having a word to find
string toFind = "Example";
// we build the regular expression
Regex regex = new Regex(
@"\b" + string.Join(@"\s*", toFind.Select(c => Regex.Escape(c.ToString()))) + @"\b",
RegexOptions.IgnoreCase);
// Then we apply regex built for the required text:
string text = "This is a piece of text for this **example**. And more (e X amp le)";
string result = regex.Replace(text, "");
Console.Write(result);
Outcome:
This is a piece of text for this ****. And more ()
Edit: if you want to ignore diacritics, you should modify the regular expression so that each letter may be followed by combining marks (\p{Mn}):
string toFind = "Example";
Regex regex = new Regex(@"\b" + string.Join(@"\s*",
toFind.Select(c => Regex.Escape(c.ToString()) + @"\p{Mn}*")),
RegexOptions.IgnoreCase);
and Normalize text before matching:
string text = "This is a piece of text for this **examplé**. And more (e X amp le)";
string result = regex.Replace(text.Normalize(NormalizationForm.FormD), "");
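The pieces above can be combined into one helper. This is only a sketch: the class name SpacedWordFilter and its methods are hypothetical, it assumes the accents to ignore decompose into non-spacing marks (category Mn) under FormD, and it swaps \b for (?<!\w) and (?!\w) lookarounds so the filter word may start or end with a non-word character. The result is recomposed with FormC, which the original snippet does not do.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class SpacedWordFilter
{
    // Builds a regex matching `word` with any whitespace between its
    // characters and optional combining marks after each character.
    public static Regex Build(string word)
    {
        string body = string.Join(@"\s*",
            word.Select(c => Regex.Escape(c.ToString()) + @"\p{Mn}*"));
        return new Regex(@"(?<!\w)" + body + @"(?!\w)", RegexOptions.IgnoreCase);
    }

    // Removes every spaced-out occurrence of `word` from `text`.
    public static string Remove(string text, string word)
    {
        // FormD splits accented letters into base letter + combining mark.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        return Build(word).Replace(decomposed, "")
                          .Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(Remove("More (e X amp le) text", "example"));
        // prints: More () text
    }
}
```

When the same word is filtered from many texts, calling Build once and reusing the returned Regex avoids rebuilding the pattern each time.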

Use Regex in Dynamical Approach

Today the pattern is hard-coded, but later I would like the word inside @"\bfood\b" to come from a string variable, so that the pattern is built dynamically instead of being hard-coded.
In the future I would like to use the word "chicken" instead of "food".
I tried to replace @"\bfood\b" with @"\b" + pattern +"\b" but it doesn't work.
string inputText = "food ddd dd";
string dddd = "\bfood\b";
string pattern = "food";
Regex rx = new Regex(@"\bfood\b", RegexOptions.None);
MatchCollection mc = rx.Matches(inputText);
if (rx.Match(pattern).Success)
{
int dd = 3;
}

You should use
@"\b" + Regex.Escape(pattern) + @"\b"
Or a more generic:
@"(?<!\w)" + Regex.Escape(pattern) + @"(?!\w)"
Or with string.Format:
Regex rx = new Regex(string.Format(@"\b{0}\b", Regex.Escape(pattern)), RegexOptions.None);
Or with string interpolation:
Regex rx = new Regex($@"(?<!\w){Regex.Escape(pattern)}(?!\w)", RegexOptions.None);
Now, why do I suggest (?<!\w) and (?!\w) lookarounds? Because these are word boundaries that are not context dependent. What if you decide to pass a |border| pattern? The \b\|border\|\b will most probably fail to match most of the cases you intended to match because \b will require a word character to appear before the first | and after the last |. The lookarounds will match the |border| string only if not enclosed with word characters.
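The difference can be checked with a small sketch. The class BoundaryDemo, its two helper methods and the sample input are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Context-dependent: \b needs a word character on one side of the boundary.
    public static bool MatchesWithB(string input, string term) =>
        Regex.IsMatch(input, @"\b" + Regex.Escape(term) + @"\b");

    // Context-independent: only requires that no word character is adjacent.
    public static bool MatchesWithLookarounds(string input, string term) =>
        Regex.IsMatch(input, @"(?<!\w)" + Regex.Escape(term) + @"(?!\w)");

    public static void Main()
    {
        string input = "a |border| here";
        Console.WriteLine(MatchesWithB(input, "|border|"));           // False
        Console.WriteLine(MatchesWithLookarounds(input, "|border|")); // True
    }
}
```

For a plain word such as food, both variants behave the same; they only diverge when the term starts or ends with a non-word character.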

The reason your @"\b" + pattern +"\b" didn't work is that the verbatim string prefix @ wasn't applied to both pieces of the pattern, so the second "\b" is a backspace character rather than a regex word boundary.
Fix this with either
@"\b" + pattern + @"\b"
Or even better use String.Format()
String.Format(@"\b{0}\b", pattern);
Or use C# 6 string interpolation
$@"\b{pattern}\b";

Regex to insert space C#

I have some string. I need a regex that will replace each occurrence of symbol that is not space + '<' with the same symbol + space + '<'.
In other words if there is '<' without ' ' before it it must add the space.
I've tried something like :
string pattern = "[^ ]<";
string replacement = "$0" + "<";
string result = Regex.Replace(html, pattern, replacement);
Obviously not working as I want.

string pattern = "([^ ])<";
string replacement = "$1" + " <";
You can try something like this.
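A runnable sketch of that replacement follows. The class name SpaceBeforeTag and the sample inputs are assumptions. The second method is a swapped-in variant using a lookbehind, which also handles two '<' characters in a row because it does not consume the preceding character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceBeforeTag
{
    // Capture the non-space character and put it back, followed by " <".
    public static string WithCapture(string html) =>
        Regex.Replace(html, "([^ ])<", "$1 <");

    // Variant: only assert that a non-space character precedes '<'.
    public static string WithLookbehind(string html) =>
        Regex.Replace(html, "(?<=[^ ])<", " <");

    public static void Main()
    {
        Console.WriteLine(WithCapture("<a><b> <c>")); // <a> <b> <c>
        Console.WriteLine(WithCapture("a<<b"));       // a <<b
        Console.WriteLine(WithLookbehind("a<<b"));    // a < <b
    }
}
```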

Using Regex Replace instead of String Replace

I am not clued up on Regex as much as I should be, so this may seem like a silly question.
I am splitting a string into a string[] with .Split(' ').
The purpose is to check the words, or replace any.
The problem I'm having now is that for a word to be replaced, it has to be an exact match, but with the way I'm splitting, there might be a ( or [ attached to the split word.
So far, to counter that, I'm using something like this:
formattedText.Replace(">", "> ").Replace("<", " <").Split(' ').
This works fine for now, but I want to incorporate more special chars, such as [;\\/:*?\"<>|&'].
Is there a quicker way than the method of my replacing, such as Regex? I have a feeling my route is far from the best answer.
EDIT
This is an (example) string
would be replaced to
This is an ( example ) string

If you want to replace whole words, you can do that with a regular expression like this.
string text = "This is an example (example) noexample";
string newText = Regex.Replace(text, @"\bexample\b", "!foo!");
newText will contain "This is an !foo! (!foo!) noexample"
The key here is that \b is the word-boundary metacharacter. It is a zero-width match at the transition between a word character (\w) and a non-word character (\W), and also at the beginning or end of the string when the adjacent character is a word character. The biggest difference between it and using \W is that \W must consume a character, so it won't match at the beginning or end of the string.

I think this is what you want.
If you want to put spaces around these symbols -> ;\/:*?"<>|&'
string input = "(exam;\\/:*?\"<>|&'ple)";
// "\\\\" in a regular string literal becomes the regex \\, which matches one literal backslash
Regex reg = new Regex("[;\\\\/:*?\"<>|&']");
string result = reg.Replace(input, delegate(Match m)
{
return " " + m.Value + " ";
});
If you want to do the same for all characters except word characters (a-zA-Z0-9_):
string input = "(example)";
Regex reg = new Regex(@"\W");
string result = reg.Replace(input, delegate(Match m)
{
return " " + m.Value + " ";
});

Regex to remove string from string

Is there a regex pattern that can remove .zip.ytu from the string below?
werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222

Here is an answer using regex, as the OP asked.
To use regex, put the text to remove in a group ( ) and then replace the match with string.Empty:
string text = @"werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222";
string pattern = @"(\.zip\.ytu)";
Console.WriteLine( Regex.Replace(text, pattern, string.Empty ));
// Outputs
// werfds_tyer.abc_20111223170226_20111222.20111222

Just use String.Replace()
text = text.Replace(".zip.ytu", "");
You don't need regex for exact matches.

txt = txt.Replace(".zip.ytu", "");
Why don't you simply do above?

I don't really know what ".zip.ytu" is, but if you don't need an exact match, you might use something like this:
string txt = "werfds_tyer.abc.zip.ytu_20111223170226_20111222.20111222";
Regex mRegex = new Regex(@"^([^.]*\.[^.]*)\.[^.]*\.[^_]*(_.*)$");
Match mMatch = mRegex.Match(txt);
string new_txt = mRegex.Replace(txt, mMatch.Groups[1].ToString() + mMatch.Groups[2].ToString());

use string.Replace:
txt = txt.Replace(".zip.ytu", "");

Here is the method I use for more complex replaces. Check out the link: http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace(v=vs.110).aspx for a Regular Expression replace. I added the code below as well.
string input = "This is   text with   far  too   much   " +
"whitespace.";
string pattern = "\\s+";
string replacement = " ";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
Console.WriteLine("Original String: {0}", input);
Console.WriteLine("Replacement String: {0}", result);
