regex for removing characters at end of string - c#

I would like to match, recursively, all text that ends with : or / or ; or , and remove all of these characters, along with any spaces left behind, from the end of the text.
Example:
some text : ; , /
should become:
some text
What I have tried removes only the first occurrence of any of these special characters. How can I do this recursively, so that all matching characters are deleted?
regex i use:
find: [ ,;:/]*
replace with nothing

[ ,;:/]*$ should be what you need. This is the same as your current regex except with the $ on the end. The $ tells it that the match must happen at the end of the string.
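A minimal sketch of the anchored pattern, in case it helps. The variable names are made up for illustration, and + is used instead of * only so that the pattern never produces an empty match on an already clean string:

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative input taken from the question.
string input = "some text : ; , /";

// [ ,;:/]+ matches a run of spaces and separators; $ pins that run to the end.
string cleaned = Regex.Replace(input, @"[ ,;:/]+$", "");

Console.WriteLine("[" + cleaned + "]"); // prints "[some text]"
```

The space inside "some text" is left alone because the run starting there is not followed by the end of the string.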

You can use C#'s TrimEnd() like so
string line = "some text : ; , / ";
char[] charsToTrim = {',', ':', ';', ' ', '/'};
string trimmedLine = line.TrimEnd(charsToTrim);

Related

My Regex.Split with '\n' takes up two spaces instead of 1

I need to split my text into each word, space, and new line.
Although the words and spaces are properly working, the \n is taking up two spaces only if it's not after a word.
Example: "\nTest\nword", here, the first \n takes up two spaces while the second one takes up one.
How would I write the proper regex?
My code:
string delimiterChars = "([ \r\n])";
wordArray = Regex.Split(myTexy, delimiterChars);
For context, I am using Unity.
In the output (screenshots not preserved), the first element is empty and the second is \n. I don't want the empty element.
Regex.Split will always produce empty items where the matches are consecutive, or when they are at the start/end of string.
Instead, you can use a matching and extracting approach:
string delimiterChars = "[^ \r\n]+|[ \r\n]";
string[] wordArray = Regex.Matches(myTexy, delimiterChars)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();
The [^ \r\n]+|[ \r\n] regex matches one or more chars other than a space, CR and LF, or a space, CR or an LF char.
You can use regular expressions to remove leading delimiter characters.
var myTexy = "\nTest\nword";
string delimiterChars = "([ \r\n])";
myTexy = Regex.Replace(myTexy, "^" + delimiterChars, "");
var wordArray = Regex.Split(myTexy, delimiterChars);
The "^" anchor restricts the match to the beginning of the string, so only a leading delimiter character is removed.
Also, just so you are aware the behavior you are seeing is intended and is documented here:
If a match is found at the beginning or the end of the input string,
an empty string is included at the beginning or the end of the
returned array.
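The documented behaviour can be checked with a small sketch that runs both approaches on the input from the question. The variable names viaSplit and viaMatches are only for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string text = "\nTest\nword";

// Split keeps the captured delimiters and adds an empty string
// for the match found at index 0.
string[] viaSplit = Regex.Split(text, "([ \r\n])");

// Matching the tokens directly never yields an empty item.
string[] viaMatches = Regex.Matches(text, "[^ \r\n]+|[ \r\n]")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(viaSplit.Length);   // 5
Console.WriteLine(viaMatches.Length); // 4
```

The extra element in viaSplit is the empty string at position 0.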
Let me know if this is what you are looking for -
String text = "\nTest\nword";
string[] words = Regex.Split(text, @"(\n+)");
Try this :-
string myStr = "This is test text";
wordArray = myStr.Split(new char[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

Regex.IsMatch is not working when text including "$"

Regex.IsMatch method returns the wrong result while checking the following condition,
string text = "$0.00";
Regex compareValue = new Regex(text);
bool result = compareValue.IsMatch(text);
The above code returns false. Please let me know if I missed anything.
The Regex class has a special method for escaping characters in a pattern: Regex.Escape()
Change your code like this:
string text = "$0.00";
Regex compareValue = new Regex(Regex.Escape(text)); // Escape characters in text
bool result = compareValue.IsMatch(text);
"$" is a special character in C# regex. Escape it first.
Regex compareValue = new Regex(@"\$0\.00");
bool result = compareValue.IsMatch("$0.00");
Regex expressions: https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
Both '.' and '$' are special characters and thus you need to escape them if you want to match the character itself. '.' matches any character and '$' matches the end of a string
see: https://regex101.com/r/pK2uY6/1
You have to escape $ since it is a special (reserved) character which means "end of string". In case . means just dot (say, decimal separator) you have to escape it as well (when not escaped, . means "any symbol"):
string pattern = @"\$0\.00";
bool result = Regex.IsMatch(text, pattern);
As for your original pattern, it has no chance to match any string, since $0.00 means
$ end of string, followed by
0 zero
. any character
0 zero
0 zero
but end of string can't be followed by...
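To see the effect of each metacharacter, here is a short sketch. The "$0x00" input is an invented example used only to show what an unescaped dot lets through:

```csharp
using System;
using System.Text.RegularExpressions;

string text = "$0.00";

// Raw text as a pattern: $ is an end-of-string anchor, so nothing can match.
bool raw = Regex.IsMatch(text, text);

// Regex.Escape turns the text into a literal pattern.
string escapedPattern = Regex.Escape(text);
bool escaped = Regex.IsMatch(text, escapedPattern);

// Escaping only $ leaves . as "any character".
bool looseDot = Regex.IsMatch("$0x00", @"\$0.00");
bool strictDot = Regex.IsMatch("$0x00", escapedPattern);

Console.WriteLine(escapedPattern); // \$0\.00
Console.WriteLine(raw);            // False
Console.WriteLine(escaped);        // True
Console.WriteLine(looseDot);       // True
Console.WriteLine(strictDot);      // False
```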

Replacing a string with strings that include parenthesis issue

I am currently having a problem with Regex.Replace. I have an item in a CheckedListBox that contains a string with parentheses "()":
regx2[4] = new Regex( "->" + checkedListBox1.SelectedItem.ToString());
The example sentence inside the selected item is
hello how are you (today)
I use it in a regex like this:
if (e.NewValue == CheckState.Checked)
{
    // replaces the string without parentheses with the one with parentheses
    // ex: <reason1> ----> hello, how are you (today) (worked fine)
    richTextBox1.Text = regx2[selected].Replace(richTextBox1.Text, "->" + checkedListBox1.Items[selected].ToString());
}
else if (e.NewValue == CheckState.Unchecked)
{
    // replaces the string with parentheses with the one without parentheses
    // hello, how are you (today) ----> <reason1> (problem)
    richTextBox1.Text = regx2[4].Replace(richTextBox1.Text, "<reason" + (selected + 1).ToString() + ">");
}
It is able to replace the string in the first condition, but it is unable to replace the sentence back in the second because it contains parentheses "()". Do you know how to solve this problem?
thx for the response :)
Instead of:
regx2[4] = new Regex( "->" + checkedListBox1.SelectedItem.ToString());
Try:
regx2[4] = new Regex(Regex.Escape("->" + checkedListBox1.SelectedItem));
To use any of the special characters as a literal in a regex, you need to escape them with a backslash. If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has a special meaning.
http://www.regular-expressions.info/characters.html
special characters:
backslash \,
caret ^,
dollar sign $,
period or dot .,
vertical bar or pipe symbol |,
question mark ?,
asterisk or star *,
plus sign +,
opening parenthesis (,
closing parenthesis ),
opening square bracket [,
opening curly brace {
To fix it you could probably do this:
regx2[4] = new Regex("->" + checkedListBox1.SelectedItem.ToString().Replace("(", @"\(").Replace(")", @"\)"));
But I would just use string.Replace() since you aren't doing any parsing. I can't tell what you're transforming from/to, or why you use selected as the index into the regex array in the if and 4 as the index in the else.
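The three options discussed in this thread can be compared side by side in a rough sketch. The item and text variables are stand-ins for checkedListBox1.SelectedItem and richTextBox1.Text, which are not available outside the form:

```csharp
using System;
using System.Text.RegularExpressions;

string item = "hello how are you (today)"; // stand-in for the selected item
string text = "->" + item;                 // stand-in for richTextBox1.Text

// Unescaped: ( and ) form a group, so the literal parentheses never match.
string unescaped = new Regex("->" + item).Replace(text, "<reason1>");

// Escaped: every metacharacter in the item is treated literally.
string escaped = new Regex(Regex.Escape("->" + item)).Replace(text, "<reason1>");

// No regex at all: plain substring replacement.
string plain = text.Replace("->" + item, "<reason1>");

Console.WriteLine(unescaped); // ->hello how are you (today)
Console.WriteLine(escaped);   // <reason1>
Console.WriteLine(plain);     // <reason1>
```

Since the item is always meant literally here, the plain string.Replace variant is probably the simplest of the three.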

remove spaces and empty line in the end of a string vb.net

in vb.net i have a string that looks like this
"text text text
"
so in the end of it there are spaces and a new empty line
How can I make this look like
"text text text"
string.TrimEnd:
var s = @"text text text
";
Console.Write(s.TrimEnd() + "<-- End"); // Output: text text text<-- End
TrimEnd trims from the end of the string; Trim removes from both the beginning and the end of the string.
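The difference only shows when the string also has leading whitespace. A small sketch with an invented input:

```csharp
using System;

// Invented input: two leading spaces, two trailing spaces and a line break.
string s = "  text text text  \r\n";

string end = s.TrimEnd(); // leading spaces are kept
string both = s.Trim();   // both sides are stripped

Console.WriteLine("[" + end + "]");  // prints "[  text text text]"
Console.WriteLine("[" + both + "]"); // prints "[text text text]"
```

Both calls return a new string; s itself is not modified.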
You're looking for the Trim() method.
Dim value As String = "text text text
"
Dim trimmed As String = value.Trim()
Trim removes leading and trailing whitespace. String data often has
leading or trailing whitespace characters such as newlines, spaces or
tabs. These characters are usually not needed. With Trim we strip
these characters in a declarative way.
Reference: http://www.dotnetperls.com/trim-vbnet
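Trim() with no arguments does cut the new lines too, since they count as whitespace. If you only want to remove particular characters, you can pass them as parameters, e.g. Trim(ControlChars.Lf) for line feeds or Trim(ControlChars.Tab) for tabs, or specify a list of characters which you'd like to cut off.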
Use the string.TrimEnd(params char[] trimChars) method:
yourString.TrimEnd();

use c# split string with Multiple delimiters

I have this string
"abc,\u000Bdefgh,\u000Bjh,\u000Bkl"
I need to split the string in C#: every time ,\u000B appears, a new word should start.
I tried this:
string[] newString = myString.Split(",\u000B");
but it didn't work. How can I do this?
Change your split command to this:
string[] newString = myString.Split(new[]{",\u000B"}, StringSplitOptions.RemoveEmptyEntries);
Or use StringSplitOptions.None if you want to preserve empty entries while splitting.
string[] newString = myString.Split(new string[] { ",\u000B" }, StringSplitOptions.None);
Works on my machine
string myString = "abc,\u000Bdefgh,\u000Bjh,\u000Bkl";
string[] a = myString.Split(new string[] { ",\u000B" }, StringSplitOptions.RemoveEmptyEntries);
You could use the short character escape notation: ",\v" instead.
Short  UTF-16  Description
--------------------------------------------------------------------
\'     \u0027  allows entering a ' in a character literal, e.g. '\''
\"     \u0022  allows entering a " in a string literal, e.g. "this is the double quote (\") character"
\\     \u005c  allows entering a \ in a character or string literal, e.g. '\\' or "this is the backslash (\\) character"
\0     \u0000  the character with code 0
\a     \u0007  alarm (usually the HW beep)
\b     \u0008  back-space
\f     \u000c  form-feed (next page)
\n     \u000a  line-feed (next line)
\r     \u000d  carriage-return (move to the beginning of the line)
\t     \u0009  (horizontal-) tab
\v     \u000b  vertical-tab
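Putting the answers together, a short sketch that splits on the two-character delimiter and confirms that "\v" and "\u000B" are interchangeable (the same and parts names are just for illustration):

```csharp
using System;

string myString = "abc,\u000Bdefgh,\u000Bjh,\u000Bkl";

// "\v" and "\u000B" denote the same character (vertical tab).
bool same = ",\v" == ",\u000B";

// The string[] overload treats ",\v" as one multi-character delimiter.
string[] parts = myString.Split(new[] { ",\v" }, StringSplitOptions.None);

Console.WriteLine(same);                    // True
Console.WriteLine(string.Join("|", parts)); // abc|defgh|jh|kl
```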
