How to remove text between multiple pairs of brackets? - c#

I would like to remove text contained between each of multiple pairs of brackets. The code below works fine if there is only ONE pair of brackets within the string:
var text = "This (remove me) works fine!";
// Remove text between brackets.
text = Regex.Replace(text, @"\(.*\)", "");
// Remove extra spaces.
text = Regex.Replace(text, @"\s+", " ");
Console.WriteLine(text);
This works fine!
However, if the string contains MULTIPLE pairs of brackets, too much text is removed. The regular expression removes all text between the FIRST opening bracket and the LAST closing bracket.
var text = "This is (remove me) not (remove me) a problem!";
// Remove text between brackets.
text = Regex.Replace(text, @"\(.*\)", "");
// Remove extra spaces.
text = Regex.Replace(text, @"\s+", " ");
Console.WriteLine(text);
This is a problem!
I'm stumped - I'm sure there's a simple solution, but I'm out of ideas...
Help most welcome!

You have two main possibilities:
Change .* to .*?, i.e. match as few characters as possible, so that the match ends at the earliest possible ):
text = Regex.Replace(text, @"\(.*?\)", "");
text = Regex.Replace(text, @"\s{2,}", " "); // skip trivial single-space replacements
Change .* to [^)]*, i.e. match any character except ):
text = Regex.Replace(text, @"\([^)]*\)", "");
text = Regex.Replace(text, @"\s{2,}", " ");
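The two options above can be put side by side in a small runnable sketch. The class and method names (BracketRemover, RemoveLazy, RemoveNegated) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class BracketRemover
{
    // Lazy quantifier: each match stops at the first ')' after a '('.
    public static string RemoveLazy(string text)
    {
        string stripped = Regex.Replace(text, @"\(.*?\)", "");
        // Collapse the runs of whitespace left behind by the removal.
        return Regex.Replace(stripped, @"\s{2,}", " ");
    }

    // Negated character class: between the brackets, match anything except ')'.
    public static string RemoveNegated(string text)
    {
        string stripped = Regex.Replace(text, @"\([^)]*\)", "");
        return Regex.Replace(stripped, @"\s{2,}", " ");
    }

    public static void Main()
    {
        string text = "This is (remove me) not (remove me) a problem!";
        Console.WriteLine(RemoveLazy(text));    // This is not a problem!
        Console.WriteLine(RemoveNegated(text)); // This is not a problem!
    }
}
```

One difference that may matter: by default . does not match a newline, so the lazy version leaves a bracket pair that spans several lines untouched, while [^)]* also matches across line breaks.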

A working example in C# that handles double curly braces "{{ ... }}"; the match here is {{pc_mem_kc}}:
string str = "{{pc_mem_kc}} of members were health (test message)";
var pattern = @"\{.*?\}}";
var data11 = Regex.Matches(str, pattern, RegexOptions.IgnoreCase);

Related

Trim() and Replace(" ", "") not removing white space in C#

I am trying to write "text" into a file with
private void WriteToLogs(string text)
{
File.AppendAllText(todayMessageLog, $"({DateTime.Now}) Server Page: \"{text.Trim()}\"\n");
}
The text comes out as this:
"text (a bunch of white space)"
The text string is made up of these:
string username = e.NewClientUsername.Trim().Replace(" ", "");
string ip = e.NewClientIP.Trim().Replace(" ", "");
WriteToLogs($"{username.Trim().Replace(" ", "")} ({ip.Trim().Replace(" ", "")}) connected"); // NONE OF THESE WORKED FOR REMOVING THE WHITE SPACE
The "e" parameter comes from a custom EventArgs class in another namespace and NewClientIP and NewClientUsername are properties inside the class
As you can see, I tried with both Trim and Replace on both the strings themselves and the method but nothing removes the white space.
If the Trim() and Replace() methods do not work, the string is likely not padded with the usual white-space characters like SPACE or TAB, but something else. There are many other characters which can show up blank.
Try printing the result with something like BitConverter.ToString(System.Text.Encoding.UTF8.GetBytes(text)). Spaces would show up as 20-20-20-..., but you will probably get something else.
The white space shows up as 00, not 20, how can I remove it?
Good. Use the argument to the Trim() method, like so:
var text = "Blah\0\0\0\0";
text.Length → 8
text.Trim('\0').Length → 4
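As a minimal sketch of the diagnosis and the fix together: the helper names (NulTrimDemo, HexDump, Clean) are hypothetical, and the list of characters passed to Trim is an assumption about what should count as padding. Note that Trim returns a new string, so the result has to be assigned or used.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class NulTrimDemo
{
    // Hex dump of the UTF-8 bytes, so invisible characters become visible.
    public static string HexDump(string s) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(s));

    // Trim NUL characters and common whitespace from both ends.
    public static string Clean(string s) =>
        s.Trim('\0', ' ', '\t', '\r', '\n');

    public static void Main()
    {
        string text = "Blah\0\0\0\0";
        Console.WriteLine(HexDump(text));      // 42-6C-61-68-00-00-00-00
        Console.WriteLine(Clean(text).Length); // 4
    }
}
```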
I hope this works for you:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
//This is your text
string input = "This is text with far too much "+
"This is text with far too much "+
"This is text with far too much ";
//This is the Regex
string pattern = "\\s";
//Value with the replace
string replacement = "";
//Replace
string result = Regex.Replace(input, pattern, replacement);
//Result
Console.WriteLine("Original String: {0}", input);
Console.WriteLine("Replacement String: {0}", result);
}
}
If you want to trim white space (not only ' ', but also \t, \u00A0, etc.) as well as \0 (which is not white space), you can use regular expressions:
using System.Text.RegularExpressions;
...
// Trim whitespaces and \0
string result = Regex.Replace(input, @"(^[\s\0]+)|([\s\0]+$)", "");
For reference:
// Trim leading whitespace and \0
string resultStart = Regex.Replace(input, @"^[\s\0]+", "");
// Trim trailing whitespace and \0
string resultEnd = Regex.Replace(input, @"[\s\0]+$", "");
The same idea (regular expressions) with a different pattern, if you want to remove all white space and \0 rather than just trim them:
string result = Regex.Replace(input, @"[\s\0]+", "");

Find and replace the string in paragraph

I want to remove every value enclosed in hyphens: everything between a leading and a trailing hyphen (including the hyphens) should be replaced with an empty string.
string templateContent = "Template content -macro- -UnitDetails- -testEmail- sending Successfully";
Output
templateContent = "Template content sending Successfully";
templateContent = Regex.Replace(templateContent, @"-\w*-\s?", string.Empty).TrimEnd(' ');
@"-\w*-\s?" - is the regex pattern for '-Word- '
- - a literal hyphen
\w - a word character
* - zero or more occurrences of \w
\s - a whitespace character
? - marks \s as optional
TrimEnd(' ') - removes the trailing space left behind if the pattern was at the end of the string
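A runnable sketch of the pattern explained above, including the case where the placeholder sits at the end of the string. The class and method names (MacroStripper, StripMacros) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class MacroStripper
{
    // Remove every -word- placeholder plus one optional whitespace character
    // after it, then trim the space left when a placeholder ended the string.
    public static string StripMacros(string template) =>
        Regex.Replace(template, @"-\w*-\s?", string.Empty).TrimEnd(' ');

    public static void Main()
    {
        string templateContent =
            "Template content -macro- -UnitDetails- -testEmail- sending Successfully";
        Console.WriteLine(StripMacros(templateContent)); // Template content sending Successfully
        Console.WriteLine(StripMacros("Hello -macro-")); // Hello
    }
}
```

Because \w* allows zero characters, a bare "--" is also removed; use \w+ instead if that is not wanted.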
There are many ways to do this, however given your example the following should work
var split = templateContent
.Split(' ')
.Where(x => !x.StartsWith("-") && !x.EndsWith("-"));
var result = string.Join(" ",split);
Console.WriteLine(result);
Output
Template content sending Successfully
Note: I personally think regex is better suited to this.
You can use regex for this
string regExp = "(-[a-zA-Z]*-)";
string tmp = Regex.Replace(templateContent , regExp, "");
string finalStr = Regex.Replace(tmp, " {2,}", " ");
var resultWithSpaces = Regex.Replace(templateContent, @"-\S+-", string.Empty);
This regular expression looks for two hyphens surrounding one or more characters that are not white space.
It will leave the spaces that were around the removed word. To get rid of those you can do another Regex to replace multiple spaces with a single space.
var result = Regex.Replace(resultWithSpaces, @"\s+", " ");

Replace text which contain line breaks without dropping them

We have a text which goes like this ..
This is text
i want
to keep
but
Replace this sentence
because i dont like it.
Now I want to remove the sentence "Replace this sentence because i dont like it."
Of course, doing it like this
text = text.Replace(@"Replace this sentence because i dont like it.", "");
won't solve my problem. I can't drop the line breaks and collapse the text into one line.
My output should be
This is text
i want
to keep
but
Please keep in mind that the line breaks can fall in many different places within the sentence I don't like.
For example, it may look like
Replace this
sentence
because i dont like it.
or
Some text before. Replace this
sentence
because i dont like it.
You can use Regex to find any kind of whitespace. This includes regular spaces, but also carriage returns, linefeeds, tabs, and other Unicode space characters.
string input = @"This is text
i want
to keep
but
Replace this sentence
because i dont like it.";
string dontLike = @"Replace this sentence because i dont like it.";
string pattern = Regex.Escape(dontLike).Replace(@"\ ", @"\s+");
Console.WriteLine("Pattern:");
Console.WriteLine(pattern);
string clean = Regex.Replace(input, pattern, "");
Console.WriteLine();
Console.WriteLine("Result:");
Console.WriteLine(clean);
Console.ReadKey();
Output:
Pattern:
Replace\s+this\s+sentence\s+because\s+i\s+dont\s+like\s+it\.
Result:
This is text
i want
to keep
but
Regex.Escape escapes any character that would otherwise have a special meaning in Regex. E.g., the period "." means "any single character" (except a newline by default), so it is escaped to "\.". It also replaces each space " " with @"\ ". We in turn replace @"\ " in the search pattern with @"\s+". \s+ in Regex means "one or more white-space characters".
Use regex to match "any whitespace" instead of just a space in your search string. Roughly:
escape the search string so it is safe for regex - Escape Special Character in Regex
replace spaces with "\s+" (reference)
run the regex across multiple lines - Multiline regular expression in C#
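The steps above can be sketched as one small helper. The names (PhraseRemover, RemovePhrase) are hypothetical, and the sketch assumes that words in the search phrase are separated by single spaces.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class PhraseRemover
{
    public static string RemovePhrase(string input, string phrase)
    {
        // Step 1: escape regex metacharacters; Regex.Escape turns " " into "\ ".
        // Step 2: let every escaped space match any run of whitespace instead.
        string pattern = Regex.Escape(phrase).Replace(@"\ ", @"\s+");
        // Step 3: \s already matches line breaks, so no extra option is needed.
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        string input = "Some text before. Replace this\r\nsentence\r\nbecause i dont like it.";
        string cleaned = RemovePhrase(input, "Replace this sentence because i dont like it.");
        Console.WriteLine("[" + cleaned + "]"); // [Some text before. ]
    }
}
```

RegexOptions.Multiline only changes how ^ and $ behave, so it is not required here: the pattern uses neither anchor.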
Or, use LINQ to remove each word separately (note that this removes every occurrence of those words anywhere in the text):
var text = "Drones " + Environment.NewLine + "are great to fly, " + Environment.NewLine + "yes, very fun!";
var textToReplace = "Drones are great".Split(" ").ToList();
textToReplace.ForEach(f => text = text.Replace(f, ""));
Output:
to fly,
yes, very fun!
Whatever method you choose, you are going to deal with extra line breaks, too many spaces and other formatting issues... Good luck!
You can use something like this, if the output does not need to keep its line breaks:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string textToReplace = @"Replace this sentence because i dont like it.";
string text = @"This is text
i want
to keep
but
Replace this sentence
because i dont like it.";
text = Regex.Replace(text, @"\s+", " ", RegexOptions.Multiline);
text = text.Replace(textToReplace, string.Empty);
Console.WriteLine(text);
}
}
Output:
"This is text i want to keep but"

Find hashtags in string

I am working on a Xamarin.Forms PCL project in C# and would like to detect all the hashtags.
I tried splitting at spaces and checking whether each word begins with a #, but the problem is that if the post contains two spaces, like "Hello  #World Test", the double space would be lost.
string body = "Example string with a #hashtag in it";
string newbody = "";
foreach (var word in body.Split(' '))
{
if (word.StartsWith("#"))
newbody += "[" + word + "]";
newbody += word;
}
Goal output:
Example string with a [#hashtag] in it
I also only want it to have A-Z a-z 0-9 and _ stopping at any other character
Test #H3ll0_W0rld$%Test => Test [#H3ll0_W0rld]$%Test
Other Stack questions try to detect the string and extract it, I would like it work with it and put it back in the string without losing anything that methods such as splitting by certain characters would lose.
You can use Regex with #\w+ and $&
Explanation
# matches the character # literally (case sensitive)
\w matches any word character (roughly [a-zA-Z0-9_]; in .NET it also matches Unicode letters and digits unless RegexOptions.ECMAScript is used)
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
$& Includes a copy of the entire match in the replacement string.
Example
var input = "asdads sdfdsf #burgers, #rabbits dsfsdfds #sdf #dfgdfg";
var regex = new Regex(@"#\w+");
var matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match);
}
or
var result = regex.Replace(input, "[$&]" );
Console.WriteLine(result);
Output
#burgers
#rabbits
#sdf
#dfgdfg
asdads sdfdsf [#burgers], [#rabbits] dsfsdfds [#sdf] [#dfgdfg]
Another Example
Use a regular expression: \#\w*
string pattern = @"\#\w*";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(input);
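Since the question asks for only A-Z, a-z, 0-9 and _ to count as part of a hashtag, one possible variant spells the character class out instead of relying on \w. This is a sketch, and the names (HashtagMarker, Bracket) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class HashtagMarker
{
    // Wrap every hashtag in square brackets; $& inserts the whole match.
    // Replacing in place leaves all other characters, including runs of
    // spaces, exactly as they were.
    public static string Bracket(string body) =>
        Regex.Replace(body, @"#[A-Za-z0-9_]+", "[$&]");

    public static void Main()
    {
        Console.WriteLine(Bracket("Example string with a #hashtag in it"));
        // Example string with a [#hashtag] in it
        Console.WriteLine(Bracket("Test #H3ll0_W0rld$%Test"));
        // Test [#H3ll0_W0rld]$%Test
    }
}
```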

Error in using regex (verifying the code for regex)

I have this piece of code
string myText = new TextRange(mainWindow.richtextbox2.Document.ContentStart,
mainWindow.richtextbox2.Document.ContentEnd).Text;
//replace two or more consecutive spaces with a single space, and
//replace two or more consecutive newlines with a single newline.
var str = Regex.Replace(myText, @"( |\r?\n)\1+", "$1", RegexOptions.Multiline);
mainWindow.Dispatcher.Invoke(new Action(() =>
mainWindow.richtextbox2.Document.Blocks.Add(new Paragraph(new
Run("Hello")))));
This runs, but the extra spacing still remains between the pieces of text that are sent.
How can I fix it or update my RichTextBox? I am trying to eliminate the extra spacing when displaying text in a RichTextBox.
I want to show :
Hello
Hello
Hello
without the multiple newline or spacing.
Document is not of type string.
EDIT
string myText = new TextRange(richtextbox2.Document.ContentStart, richtextbox2.Document.ContentEnd).Text;
//replace two or more consecutive spaces with a single space, and
//replace two or more consecutive newlines with a single newline.
var str = Regex.Replace(myText, @"( |\r?\n)\1+", "$1", RegexOptions.Multiline);
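To check what the pattern does independently of the RichTextBox, it can be run on a plain string. This sketch only demonstrates the regex; writing the result back into the document is not shown, and the names (WhitespaceCollapser, CollapseRepeats) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class WhitespaceCollapser
{
    // Group 1 captures a single space or a single line break; \1+ then
    // requires one or more further copies of exactly the same text, and
    // the whole run is replaced by one copy ($1).
    public static string CollapseRepeats(string text) =>
        Regex.Replace(text, @"( |\r?\n)\1+", "$1");

    public static void Main()
    {
        string raw = "Hello\n\n\nHello  Hello";
        Console.WriteLine(CollapseRepeats(raw));
        // Hello
        // Hello Hello
    }
}
```

RegexOptions.Multiline is omitted here because the pattern contains neither ^ nor $, so the option has no effect. Mixed runs such as a newline, a space, and another newline are not collapsed, since the backreference only matches an exact repeat of what group 1 captured.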
