Remove whitespace before or after a character with regex - c#

I am new to regex and I want to find a good way to remove whitespace before or after the / char in my string.
I have a string like
"Path01 /Some folder/ folder (2)"
I tried the regex
@"\s?()\s?"
but this does not work for me. The output should be
Path01/Some folder/folder (2)
Can you help me?
Thanks!

You may use
@"\s*/\s*"
and replace with /.
See the regex demo
The pattern matches zero or more (*) whitespace chars (\s), then a / and then again 0+ whitespace chars.
C#:
var result = Regex.Replace(s, @"\s*/\s*", "/");
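As a quick check, the replacement can be run against the exact string from the question. This is only a minimal sketch: the class name SlashTrimDemo and the helper Normalize are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

class SlashTrimDemo
{
    // Hypothetical helper: collapses any whitespace around each '/' into a bare '/'.
    public static string Normalize(string s) =>
        Regex.Replace(s, @"\s*/\s*", "/");

    static void Main()
    {
        // Sample input taken from the question.
        string input = "Path01 /Some folder/ folder (2)";

        Console.WriteLine(Normalize(input)); // Path01/Some folder/folder (2)
    }
}
```

Whitespace that is not adjacent to a slash, such as the space in "Some folder", is left alone because the pattern always requires a / in the match.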

Related

c# Regex.Replace [^\w ] that also removes underscores?

So I have spent far too long on this and have tried tons of things with no luck. I think I am just bad at regex. I am trying to clean a string of ALL non-alphanumeric characters while leaving spaces. I DO NOT WANT TO USE [^A-Za-z0-9 ]+ due to language concerns.
Here are a few things I have tried:
cleaned_string = Regex.Replace(input_string, @"[^\w ]+[_]+);
cleaned_string = Regex.Replace(input_string, ([^\w ]+)([_]+));
cleaned_string = Regex.Replace(input_string, [^ \w?<!_]+);
Edit: Solved thanks to a very helpful person below.
My final product ended up being this: [_]+|[^\w\s]+
Thanks for all the help!
This should work for you
// Expression: _|[^\w\d ]
cleaned_string = Regex.Replace(input_string, @"_|[^\w\d ]", "");
You may use
var res = Regex.Replace(s, @"[\W_-[\s]]+", string.Empty);
See the regex demo.
Look at \W pattern: it matches any non-word chars. Now, you want to exclude a whitespace matching pattern from \W - use character class subtraction: [\W-[\s]]. This matches any char \W matches except what \s matches. And to also match a _, just add it to the character class. Add + quantifier to remove whole consecutive chunks of matching chars at one go.
Details
[ - start of a character class
\W_ - any non-word or _ chars
-[\s] - except for chars matched with \s (whitespace) pattern
] - end of the character class
+ - one or more times.
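The breakdown above can be tried on a small sample. The sketch below assumes a made-up input string and a hypothetical helper named Clean; it relies on .NET's character class subtraction, so the pattern is not portable to most other regex engines.

```csharp
using System;
using System.Text.RegularExpressions;

class CleanDemo
{
    // Hypothetical helper: removes every non-word char and every '_',
    // but keeps whitespace thanks to the -[\s] subtraction.
    public static string Clean(string input) =>
        Regex.Replace(input, @"[\W_-[\s]]+", string.Empty);

    static void Main()
    {
        // Non-ASCII letters count as word chars in .NET, so they survive.
        Console.WriteLine(Clean("Héllo, wörld_2024! (test)")); // Héllo wörld2024 test
    }
}
```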

How to remove string "[xyz]" from filename with Regex?

I have the filename 0154562A5BS16101[001] in which I would like to remove the "[001]" to just leave 0154562A5BS16101.
I have tried using the regex:
var output = Regex.Replace(filename, @"[]", string.Empty);
But it throws:
System.ArgumentException: 'parsing '[]' - Unterminated [] set.'
I feel like this is pretty easy for regex masters, but I don't have much experience with regex.
Since [ and ] are metacharacters in regex language, you need to escape them. You also need to tell regex that you want to match everything up to the closing square bracket:
var output = Regex.Replace(filename, @"\[[^\]]*\]", string.Empty);
\[ and \] at the ends are square brackets that you want to replace. The [^\]]* section in the middle matches any number of characters other than the closing square bracket.
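For reference, here is a small self-contained sketch of that pattern applied to the filename from the question. The class and method names (BracketDemo, StripBracketed) are placeholders chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class BracketDemo
{
    // Hypothetical helper: removes every "[...]" group, brackets included.
    public static string StripBracketed(string name) =>
        Regex.Replace(name, @"\[[^\]]*\]", string.Empty);

    static void Main()
    {
        Console.WriteLine(StripBracketed("0154562A5BS16101[001]")); // 0154562A5BS16101
    }
}
```

Note that Regex.Replace removes every bracketed group in the input, not only one at the end of the name.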
Demo.

Regex searching for string that contains 3 or more digits

I'm trying to find a way to extract a word from a string only if it contains 3 or more digits/numbers in that word. It would also need to return the entire text like
TX-23443 or FUX3329442 etc...
From what I found
\w*\d\w*
won't return any letters before the dash, as in the first example.
None of the examples I found online seem to work for me. Any help is appreciated!
If I understand your question correctly, you want to find every string that contains 3 or more consecutive digits, such as TX-23443 or FUX3329442, and extract it in full even if it contains a - inside. Here is a solution that might help:
string InpStr = "TX-23443 or FUX3329442";
MatchCollection ms = Regex.Matches(InpStr, @"[A-Za-z-]*\d{3,}");
foreach (Match m in ms)
{
    Console.WriteLine(m);
}
This one should do the trick assuming your "words" have only the standard latin word characters: A-Z, a-z, 0-9 and _.
Regex word_with_3_digits = new Regex(@"(?#!cs word_with_3_digits Rev:20161129_0600)
    # Match word having at least three digits.
    \b              # Anchor to word boundary.
    (?:             # Loop to find three digits.
      [A-Za-z_]*    # Zero or more non-digit word chars.
      \d            # Match one digit at a time.
    ){3}            # End loop to find three digits.
    \w*             # Match remainder of word.
    \b              # Anchor to word boundary.
    ", RegexOptions.IgnorePatternWhitespace);
In javascript I would write a regex like this:
\S*\d{3,}\S*
I've prepared an online test.
Try this:
string strToCount = "Asd343DSFg534434";
int count = Regex.Matches(strToCount,"[0-9]").Count;
This one seems to be working for me even if there is a dash at the end as well.
[-]\w[-]\d{3,}[-]\w*[-]\w

Regex that removes the 2 trailing letters from a string not preceded with other letters

This is in C#. I've been racking my brain, but no luck so far.
So for example
123456BVC --> 123456BVC (keep the same)
123456BV --> 123456 (remove trailing letters)
12345V --> 12345V (keep the same)
12345 --> 12345 (keep the same)
ABC123AB --> ABC123 (remove trailing letters)
It can start with anything.
I've tried @".*[a-zA-Z]{2}$" but no luck.
I need this in C#, and it should always return a string, with the two trailing letters removed if they exist and are not preceded by another letter.
Match result = Regex.Match(mystring, pattern);
return result.Value;
Your @".*[a-zA-Z]{2}$" regex matches any 0+ characters other than a newline (as many as possible) and 2 ASCII letters at the end of the string. You do not check the context, so the 2 letters are matched regardless of what comes before them.
You need a regex that will match the last two letters not preceded with a letter:
(?<!\p{L})\p{L}{2}$
See this regex demo.
Details:
(?<!\p{L}) - fails the match if a letter (\p{L}) is found before the current position (you may use [a-zA-Z] if you only want to deal with ASCII letters)
\p{L}{2} - 2 letters
$ - end of string.
In C#, use
var result = Regex.Replace(mystring, @"(?<!\p{L})\p{L}{2}$", string.Empty);
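To see how the lookbehind behaves, the pattern can be run over the five sample inputs from the question. This is a sketch only; TrailingLettersDemo and TrimTwoLetters are illustrative names, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

class TrailingLettersDemo
{
    // Hypothetical helper: drops exactly two trailing letters,
    // but only when no other letter comes right before them.
    public static string TrimTwoLetters(string s) =>
        Regex.Replace(s, @"(?<!\p{L})\p{L}{2}$", string.Empty);

    static void Main()
    {
        // Inputs copied from the question.
        string[] samples = { "123456BVC", "123456BV", "12345V", "12345", "ABC123AB" };

        foreach (string s in samples)
        {
            Console.WriteLine(s + " --> " + TrimTwoLetters(s));
        }
        // Only "123456BV" and "ABC123AB" change (to "123456" and "ABC123").
    }
}
```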
If you're looking to remove those last two letters, you can simply do this:
string result = Regex.Replace(originalString, @"[A-Za-z]{2}$", string.Empty);
Remember that in regex $ means the end of the input or the string before a newline.

Check regex for letters numbers and underscore characters

I tried to check if a string Name contains letters, numbers, and underscore character with
the following code without success, any idea of what I miss here?
var regex = new Regex(@"^[a-zA-Z0-9]+$^\w+$");
if (regex.IsMatch(Name) )
....
In addition, when I tried the following code, I got a parsing error "^[a-zA-Z0-9\_]+$" - Unrecognized escape sequence \_.
var regex = new Regex(@"^[a-zA-Z0-9\_]+$");
The regex should be:
@"^[a-zA-Z0-9_]+$"
You don't need to escape the underscore. You can also use the RegexOptions.IgnoreCase option, which would allow you to use @"^[a-z0-9_]+$" just as well.
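A minimal sketch of the check, assuming the goal is to accept only names made up entirely of ASCII letters, digits, and underscores. The class NameCheckDemo and the helper IsValidName are hypothetical names used for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class NameCheckDemo
{
    // ^ and $ anchor the pattern so the whole string must consist of
    // ASCII letters, digits, or underscores (at least one char).
    static readonly Regex NamePattern = new Regex(@"^[a-zA-Z0-9_]+$");

    // Hypothetical helper, not from the original post.
    public static bool IsValidName(string name) => NamePattern.IsMatch(name);

    static void Main()
    {
        Console.WriteLine(IsValidName("user_01")); // True
        Console.WriteLine(IsValidName("user-01")); // False
        Console.WriteLine(IsValidName(""));        // False
    }
}
```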
Try this regex
^[a-zA-Z0-9_-]$
You can also restrict the length of the name with this regex
^[a-zA-Z0-9_-]{m,n}$
Where
m is the minimum length
n is the maximum length
