Regex pattern to replace a string - c#

I have the following input:
Person 1kg
To get the expected output:
Person 1kEq
I am using the following pattern:
string.Format(@"(?<!\S){0}(?!\S)", Regex.Escape("kg"));
Regex.Replace(inputSentence, Pattern, "kEq");
The Regex.Replace does not replace kg with kEq.
If I edit the input sentence to Person 1 kg, the replacement happens.
Could someone help me with the pattern for this?

The (?<!\S) requires either a start of the string or a whitespace before the kg search term. The (?!\S) lookahead requires the end of string or a whitespace after the search term. That is why the replacement happens if you separate the number and the measurement unit with a space as in Person 1 kg.
It seems in this case, you want to replace a match if it is not enclosed with other letters. Use (?<!\p{L}) lookbehind at the start and (?!\p{L}) lookahead at the end:
string.Format(@"(?<!\p{{L}}){0}(?!\p{{L}})", Regex.Escape("kg"));
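A minimal sketch of that replacement, written as C# top-level statements for brevity. The variable names and the third input ("Person 1kgs") are illustrative assumptions, not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

// "kg" that is neither preceded nor followed by a Unicode letter.
// The braces of \p{L} are doubled so string.Format does not treat them as placeholders.
string pattern = string.Format(@"(?<!\p{{L}}){0}(?!\p{{L}})", Regex.Escape("kg"));

string attached = Regex.Replace("Person 1kg", pattern, "kEq");   // digit before "kg" is allowed
string spaced = Regex.Replace("Person 1 kg", pattern, "kEq");    // whitespace before "kg" is allowed
string inWord = Regex.Replace("Person 1kgs", pattern, "kEq");    // a letter after "kg" blocks the match

Console.WriteLine(attached); // Person 1kEq
Console.WriteLine(spaced);   // Person 1 kEq
Console.WriteLine(inWord);   // Person 1kgs
```

Because the lookarounds only exclude letters, digits and punctuation next to the unit do not prevent the replacement.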

URL match regex pattern

I have a bunch of URLs that I need to filter out, based on whether it contains the keyword 'staff'
1. /services
2. /services/EarNoseThroat
3. /services/EarNoseThroat/Audiology
4. /services/EarNoseThroat/Audiology/CochlearImplant
5. /services/BehavioralHealth/Clinic
6. /services/BehavioralHealth/Clinic/staff
7. /services/BehavioralHealth/Clinic/staff/Jamie-Hudgins
I want to create one regex pattern to match all the URLs that have /services after the host URL, but not 'staff' anywhere in the URL. Basically match URLS 1 to 5.
I also need a pattern that only matches URLs 6 and 7.
It seems like the negative lookahead will do the trick, except I don't know how to put it together. Can someone help me out?
Something like:
^\/services\/(?:[^\/]+\/?)*$
OR
^/services\/...any Depth here...\/(?!staff)
Regex to match the following:
/services
/services/EarNoseThroat
/services/EarNoseThroat/Audiology
/services/EarNoseThroat/Audiology/CochlearImplant
/services/BehavioralHealth/Clinic
Regex:
^\/services(?!.*\bstaff\b)(?:\/.*)?$
Explanation:
^ - asserts the start of the string
\/services - matches /services
(?!.*\bstaff\b) - negative lookahead to make sure that the word staff does not appear anywhere in the rest of the string
(?:\/.*)? - optionally matches a / followed by 0+ occurrences of any character except a newline character, so the bare /services URL matches too
$ - asserts the end of the string
Regex to match the following:
/services/BehavioralHealth/Clinic/staff
/services/BehavioralHealth/Clinic/staff/Jamie-Hudgins
Regex:
^\/services\/(?=.*\bstaff\b).*$
Explanation:
The key difference is the positive lookahead:
(?=.*\bstaff\b) - positive lookahead to make sure that the word staff appears somewhere in the string before the end of the string
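As a rough check, the sketch below runs both filters over the seven URLs from the question. It assumes the URLs are tested as path strings without a host. The first pattern makes everything after /services optional so that the bare /services URL passes too, which is an assumption about the intent. The / characters are left unescaped because .NET does not require escaping them.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] urls =
{
    "/services",
    "/services/EarNoseThroat",
    "/services/EarNoseThroat/Audiology",
    "/services/EarNoseThroat/Audiology/CochlearImplant",
    "/services/BehavioralHealth/Clinic",
    "/services/BehavioralHealth/Clinic/staff",
    "/services/BehavioralHealth/Clinic/staff/Jamie-Hudgins"
};

// URLs under /services that never contain the word "staff".
var withoutStaff = new Regex(@"^/services(?!.*\bstaff\b)(?:/.*)?$");
// URLs under /services/ that contain the word "staff" somewhere.
var withStaff = new Regex(@"^/services/(?=.*\bstaff\b).*$");

string[] plainPages = urls.Where(u => withoutStaff.IsMatch(u)).ToArray();
string[] staffPages = urls.Where(u => withStaff.IsMatch(u)).ToArray();

Console.WriteLine(string.Join(Environment.NewLine, plainPages)); // URLs 1 to 5
Console.WriteLine("---");
Console.WriteLine(string.Join(Environment.NewLine, staffPages)); // URLs 6 and 7
```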

Regex: Few Matches with * Quantifier

My regex ends with the * quantifier.
But I expect several matches in one string. How can I make it still find all of them? My regex:
((CMD1|CMD2)+(?::|;)+.*)
And the test string is "cmd1: test. test. test cmd2: test2. test2. test2"
So I need to get matches:
cmd1: test. test. test
cmd2: test2. test2. test2
Commands could be random words like "Look", "Take", "Go". There could be any number of occurrences of any command in one string.
Example:
Go: some sentences. and more. Take: other more sentences, and even more text here. Look: more and more. and more.
You could use a positive lookahead:
\w+:.*?(?= \w+:|$)
Match a word character one or more times \w+
Match a colon :
Match any character zero or more times .*
Make it non greedy ?
A positive lookahead which asserts that what follows is a space, a word character one or more times \w+ and a colon :, or | the end of the string (?= \w+:|$)
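The steps above can be tried out with Regex.Matches. This is a small sketch using the example sentence from the question; the variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "Go: some sentences. and more. Take: other more sentences, and even more text here. Look: more and more. and more.";

// Each match starts at "word:" and lazily extends until the next " word:" or the end of the string.
string[] parts = Regex.Matches(input, @"\w+:.*?(?= \w+:|$)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

foreach (string part in parts)
{
    Console.WriteLine(part);
}
// Go: some sentences. and more.
// Take: other more sentences, and even more text here.
// Look: more and more. and more.
```

Because the lookahead includes the leading space, that space is left out of every match and no trimming is needed.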
A general rule when writing regex is that when you want to find all occurrences of a pattern and put each pattern into its own match, you write a regex for that pattern, not that pattern quantified * times. Otherwise, you will end up putting the whole string into one single match.
I edited the regex for you:
CMD(?:1|2)(?::|;).*?(?=$|CMD)
The beginning is pretty much self-explanatory. Towards the end, I matched . with a lazy quantifier *?. This will stop matching as soon as the string after it matches the lookahead. The lookahead just matches another CMD or the end of the string.
Remember to turn on case insensitive option!
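A short sketch of this pattern against the test string from the question, assuming RegexOptions.IgnoreCase is how the case-insensitive option is switched on. The Trim call is an addition here, needed because the lazy part also picks up the space before the next command.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "cmd1: test. test. test cmd2: test2. test2. test2";

// One match per command: stop lazily right before the next CMD or at the end of the string.
string[] commands = Regex.Matches(input, @"CMD(?:1|2)(?::|;).*?(?=$|CMD)", RegexOptions.IgnoreCase)
    .Cast<Match>()
    .Select(m => m.Value.Trim())
    .ToArray();

foreach (string command in commands)
{
    Console.WriteLine(command);
}
// cmd1: test. test. test
// cmd2: test2. test2. test2
```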
string s = "Go: some sentences. and more. Take: other more sentences, and even more text here. Look: more and more. and more.";
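var matches = Regex.Matches(s, @"(?i)(go|take|look):.+?(?=\s+\w+:|$)");
The |$ alternative in the lookahead lets the last command, which has nothing after it, match as well. You can remove \s+, but in that case you should call Trim on each resulting string.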

Regex for fix length string which allow space(s) at end only. C#

I need a regex with these rules:
a-z, A-Z, 0-9 allowed.
Fixed width of 20 characters.
Space(s) at the beginning or in the middle not allowed.
Space(s) at the end allowed.
Example:
12345678901234567890 [Match]
1234567890 [Match]
abcde12345 [Match]
abcdefg [Not match]
ab cdefg [Not match]
I use this regex, it works fine, but it's really long and hard to maintain.
[a-zA-Z0-9]{20}|[a-zA-Z0-9]{19}\s{1}|[a-zA-Z0-9]{18}\s{2}
|[a-zA-Z0-9]{17}\s{3}|[a-zA-Z0-9]{16}\s{4}|[a-zA-Z0-9]{15}\s{5}
|[a-zA-Z0-9]{14}\s{6}|[a-zA-Z0-9]{13}\s{7}|[a-zA-Z0-9]{12}\s{8}
|[a-zA-Z0-9]{11}\s{9}|[a-zA-Z0-9]{10}\s{10}|[a-zA-Z0-9]{9}\s{11}
|[a-zA-Z0-9]{8}\s{12}|[a-zA-Z0-9]{7}\s{13}|[a-zA-Z0-9]{6}\s{14}
|[a-zA-Z0-9]{5}\s{15}|[a-zA-Z0-9]{4}\s{16}|[a-zA-Z0-9]{3}\s{17}
|[a-zA-Z0-9]{2}\s{18}|[a-zA-Z0-9]{1}\s{19}|\s{20}
Please help, thank you.
UPDATE.
In fact, I need to check a very long string.
At first, before asking this question, my regex looked like this (1st regex):
^[\s\d]{25}\d{6}[0-1]{1}\d{24}[\!\@\#\$\%\^\&\*\(\)\-\\_\=\+\/\,\?\<\>\;\:\"\'\w\s\.]{30}(\s{3}|\d{3})(\s{4}|\d{4})[\s|00|10|20|40]{2}[a-zA-Z0-9\s]{20}[\s\w\d]{0,32}$
After asking the question, my regex looks like this (2nd regex):
^[\s\d]{25}\d{6}[0-1]{1}\d{24}[\!\@\#\$\%\^\&\*\(\)\-\\_\=\+\/\,\?\<\>\;\:\"\'\w\s\.]{30}(\s{3}|\d{3})(\s{4}|\d{4})[\s|00|10|20|40]{2}(?=.{20})[a-zA-Z0-9]*\s*[\s]{0,32}$
Suppose I split this regex into 3 parts.
Part1:[\s\d]{25}\d{6}[0-1]{1}\d{24}[\!\@\#\$\%\^\&\*\(\)\-\\_\=\+\/\,\?\<\>\;\:\"\'\w\s\.]{30}(\s{3}|\d{3})(\s{4}|\d{4})[\s|00|10|20|40]{2}
Part2:[a-zA-Z0-9\s]{20} changed to (?=.{20})[a-zA-Z0-9]*\s*
Part3:[\s\w\d]{0,32}
Part1 works fine.
The Part2 requirement changed, so I changed it to "(?=.{20}$)[a-zA-Z0-9]*\s*".
Part3 is the problem once I change Part2.
Example
00202510027680 1901160000000000000000000007000Test Test 009 069 aaaaaaaaaaaaaaaaaaaa
The string ends with 32 spaces.
If I add 1 more space, the 1st regex does not match, but the 2nd regex does. The correct result is no match.
How can I modify Part2 (or Part3) to meet the requirement? Thank you.
You can use
^(?=.{20}$)[a-zA-Z0-9]*\s*$
In this expression, the length of 20 characters is enforced with the positive lookahead (?=.{20}$). Since the consuming part of the pattern only accepts A-Z, a-z, 0-9 and whitespace, it is safe to use . in the lookahead.
Regex explanation:
^ - start of string
(?=.{20}$) - The string must be 20 characters long
[a-zA-Z0-9]* - zero or more letters or digits (a + can be used instead of *)
\s* - zero or more whitespaces
$ - end of string
C#:
var reg = @"^(?=.{20}$)[a-zA-Z0-9]*\s*$";
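A quick sketch of how the pattern behaves on the question's examples. It assumes the shorter values are padded with trailing spaces to 20 characters, built here with PadRight; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

var fixedWidth = new Regex(@"^(?=.{20}$)[a-zA-Z0-9]*\s*$");

bool full = fixedWidth.IsMatch("12345678901234567890");          // 20 letters/digits
bool padded = fixedWidth.IsMatch("1234567890".PadRight(20));     // 10 digits + 10 trailing spaces
bool tooShort = fixedWidth.IsMatch("abcdefg");                   // only 7 characters
bool innerSpace = fixedWidth.IsMatch("ab cdefg".PadRight(20));   // space in the middle
bool leadingSpace = fixedWidth.IsMatch(" abcdefg".PadRight(20)); // space at the beginning

Console.WriteLine($"{full} {padded} {tooShort} {innerSpace} {leadingSpace}");
// True True False False False
```

The lookahead only checks the total length; the order "letters and digits first, then spaces" is enforced by the consuming part of the pattern.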

How to insert spaces between characters using Regex?

Trying to learn a little more about using Regex (Regular expressions). Using Microsoft's version of Regex in C# (VS 2010), how could I take a simple string like:
"Hello"
and change it to
"H e l l o"
This could be a string of any letter or symbol, capitals, lowercase, etc., and there are no other letters or symbols following or leading this word. (The string consists of only the one word).
(I have read the other posts, but I can't seem to grasp Regex. Please be kind :) ).
Thanks for any help with this. (an explanation would be most useful).
You could do this through regex only, no need for inbuilt c# functions.
Use the below regexes and then replace the matched boundaries with space.
(?<=.)(?!$)
string result = Regex.Replace(yourString, @"(?<=.)(?!$)", " ");
Explanation:
(?<=.) Positive lookbehind asserts that the match must be preceded by a character.
(?!$) Negative lookahead which asserts that the match won't be followed by an end of the line anchor. So the boundaries next to all the characters would be matched but not the one which was next to the last character.
OR
You could also use word boundaries.
(?<!^)(\B|\b)(?!$)
string result = Regex.Replace(yourString, @"(?<!^)(\B|\b)(?!$)", " ");
Explanation:
(?<!^) Negative lookbehind which asserts that the match won't be at the start.
(\B|\b) Matches the boundary which exists between two word characters or between two non-word characters (\B), or the boundary which exists between a word character and a non-word character (\b).
(?!$) Negative lookahead asserts that the match won't be followed by an end of the line anchor.
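To compare the two zero-width approaches side by side, here is a small sketch. The input "a-b!" is an added example to show that symbols are handled as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Every position that has a character before it and is not the end of the string.
string viaLookbehind = Regex.Replace("Hello", @"(?<=.)(?!$)", " ");

// Every boundary or non-boundary position except the start and the end of the string.
string viaBoundary = Regex.Replace("Hello", @"(?<!^)(\B|\b)(?!$)", " ");

// Symbols are spaced out the same way as letters.
string withSymbols = Regex.Replace("a-b!", @"(?<=.)(?!$)", " ");

Console.WriteLine(viaLookbehind); // H e l l o
Console.WriteLine(viaBoundary);   // H e l l o
Console.WriteLine(withSymbols);   // a - b !
```

Both patterns match only empty positions, so the replacement inserts a space without consuming any character.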
Regex.Replace("Hello", "(.)", "$1 ").TrimEnd();
Explanation
The dot character class matches every character of your string "Hello".
The parentheses around the dot are required so that we can refer to the captured character through the $n notation.
Each captured character is replaced by the replacement string. Our replacement string is "$1 " (notice the space at the end). Here $1 represents the first captured group in the input, therefore our replacement string will replace each character by that character plus one space.
This technique will add one space after the final character "o" as well, so we call TrimEnd() to remove that.
For the enthusiast, the same effect can be achieved through LINQ using this one-liner:
String.Join(" ", YourString.AsEnumerable())
or if you don't want to use the extension method:
String.Join(" ", YourString.ToCharArray())
It's very simple. To match any character, use the dot . and then replace each match with that character followed by one extra space.
Here the parentheses (...) are used for grouping, and the captured group can be accessed by $index.
Find what : "(.)"
Replace with "$1 "

regex to get substring before substring

I have a string like following,
hi,hello,-LSB-,ASPECT,-RSB-,you
I want to extract the substring that comes before -LSB-,ASPECT, back to the previous comma: hello in this case.
I have written a regular expression like
\b\w+[/-/,LSB/-/,ASPECT]
However, it extracts the entire substring from the start up to and including -LSB-,ASPECT, like this:
hi,hello,-LSB-,ASPECT
Any clue??
The regex for this (using a positive lookahead assertion) would be
[^,]*(?=,-LSB-,ASPECT,)
Explanation:
[^,]* # Match any number of characters except commas
(?= # until the following regex can be matched:
,-LSB-,ASPECT, # the literal text ",-LSB-,ASPECT,".
) # (End of lookahead assertion)
Careful, square brackets create a character class which you don't want in this case.
Try this (the wanted text ends up in capture group 1):
(\w+),-LSB-,ASPECT
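Both suggestions can be checked against the sample string. This is a minimal C# sketch (the question does not name a language, so the use of .NET here is an assumption), with illustrative variable names.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "hi,hello,-LSB-,ASPECT,-RSB-,you";

// Lookahead version: the match itself is the text before ",-LSB-,ASPECT,".
string viaLookahead = Regex.Match(input, @"[^,]*(?=,-LSB-,ASPECT,)").Value;

// Capture-group version: the whole match includes the marker, group 1 holds the word.
string viaGroup = Regex.Match(input, @"(\w+),-LSB-,ASPECT").Groups[1].Value;

Console.WriteLine(viaLookahead); // hello
Console.WriteLine(viaGroup);     // hello
```

The lookahead version also accepts non-word characters before the marker, while the capture-group version is limited to \w characters.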
