Sorry for such a direct question, but I've spent a little too long trying to find a suitable regex that can alter the following strings:
01.10
10.01
setting them as:
1.10
10.1
So basically, always remove the leading '0' from each digit sequence, both the ones before a period and the last one.
Is this possible with regex? Currently it doesn't seem so.
Try this:
find: (^|\.)0+
replace: $1
Note: if the number is not at the beginning of the string, you should not use ^ but the word boundary \b, like this:
(\b|\.)0+
If your language's string literals require it, double-escape the backslashes:
(\\b|\\.)0+
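A minimal Python sketch of the replacement above, assuming Python's re module (where the backreference is written \1 rather than $1). The helper name strip_leading_zeros and its anchored flag are made up for illustration.

```python
import re

def strip_leading_zeros(text: str, anchored: bool = True) -> str:
    """Remove zeros that directly follow the start/boundary or a dot."""
    # anchored=True uses ^ (number at the start of the string);
    # anchored=False uses the word boundary \b instead.
    pattern = r"(^|\.)0+" if anchored else r"(\b|\.)0+"
    # \1 puts back whatever group 1 matched: nothing, or the dot.
    return re.sub(pattern, r"\1", text)

print(strip_leading_zeros("01.10"))                    # 1.10
print(strip_leading_zeros("10.01"))                    # 10.1
print(strip_leading_zeros("v 01.10", anchored=False))  # v 1.10
```

Note that both patterns also strip a lone zero, so 0.5 becomes .5.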
Perhaps you could try it using this regex. This will not match the zero in 0.0 or 0.1 but only when there are digits after the leading zero(s).
\b0+(?=\d\.\d+\b)|(?<=\b\d+\.)0+(?=\d+\b)
\b Word boundary
0+(?=\d\.\d+\b) Match one or more zeros and use a positive lookahead to assert that they are followed by a digit, a dot, one or more digits and a word boundary
| Or
(?<=\b\d+\.)0+(?=\d+\b) Positive lookbehind that asserts that what is on the left is a word boundary, one or more digits and a dot. Then match one or more zeros and assert that what follows is one or more digits and a word boundary. Because the lookbehind contains \d+, this needs an engine that supports variable-length lookbehind, such as .NET.
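Python's re module rejects variable-length lookbehind, so the regex above cannot be used there as written. The following is only a sketch with a swapped-in, simpler pattern (not the answer's regex) that has the same goal: drop leading zeros only when more digits follow, so a lone 0 survives. The names LEADING_ZEROS and trim are made up for illustration.

```python
import re

# Assumed alternative pattern: one or more zeros that are not preceded
# by a digit and are followed by at least one more digit.
LEADING_ZEROS = re.compile(r"(?<!\d)0+(?=\d)")

def trim(text: str) -> str:
    """Drop leading zeros of each digit run, keeping a lone 0."""
    return LEADING_ZEROS.sub("", text)

for sample in ["01.10", "10.01", "0.0", "0.1"]:
    print(sample, "->", trim(sample))
```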
Related
I'm having some trouble capturing a specific string inside a sentence.
The regex I'm using is \b[0-9]{9,12}\b, to capture numbers that have between 9 and 12 digits. I was using the boundary to pin down the exact number, but the problem is that when a number matching this regex is followed by a dot, for example, the regex still matches, which is giving me a lot of trouble.
From what I've found, the problem is that \b treats some special characters as separators too, right? So is there a way to treat, for example, 123456789. as a whole string, so that the regex does not match it?
Thanks !
The word boundary \b matches between a word character and a non-word character (or the start/end of the string), and a digit is a word character. As dots and commas are non-word characters, \b matches right next to them. To make sure a digit sequence adjacent to a dot is not matched, you need to use lookarounds.
You can use
\b(?<!\.)[0-9]{9,12}(?!\.)\b
The additional subpatterns are the lookbehind (?<!\.) and the lookahead (?!\.), which make sure there is no . immediately before or after the digit sequence.
If you have . and , as decimal separators, you may want to adjust the pattern to
\b(?<![.,])[0-9]{9,12}(?![.,])\b
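Both patterns can be tried out with a short Python sketch (Python's re is assumed here; the lookarounds are fixed-width, so they behave the same way). The names NO_DOT, NO_DOT_OR_COMMA and find_numbers are made up for illustration.

```python
import re

NO_DOT = re.compile(r"\b(?<!\.)[0-9]{9,12}(?!\.)\b")
NO_DOT_OR_COMMA = re.compile(r"\b(?<![.,])[0-9]{9,12}(?![.,])\b")

def find_numbers(text: str, pattern: re.Pattern = NO_DOT) -> list:
    """Return every 9-12 digit run accepted by the given pattern."""
    return pattern.findall(text)

print(find_numbers("call 123456789 now"))            # ['123456789']
print(find_numbers("123456789."))                    # []
print(find_numbers("123456789,"))                    # ['123456789']
print(find_numbers("123456789,", NO_DOT_OR_COMMA))   # []
```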
I've run into a problem with what I guess is quite a simple thing: I want to replace each comma ',' in a string, except for the ones that are surrounded by digits.
Examples:
hey, world -> hey,\nworld
hey , world -> hey,\nworld
they are simple, but now also:
hey,world -> hey,\nworld
hey),world -> hey),\nworld
(1,2) -> (1,2) << no change :P
I tried it with different regexes and I can't really get it working as easily as I'd like. Matching the commas that I need is quite easy, but I thought I could do it this way:
Regex.Replace(input, @"[^\d]\s*,\s*[^\d]", ",\n");
it works cool but it changes my:
hey,world into: he,\norld
I'd be glad if you could help me figure that out :)
Regards,
Andrew
This uses negative lookbehind (?<!...) and negative lookahead (?!...) to check for the presence of digits.
(?<![0-9])\s*,\s*|\s*,\s*(?![0-9])
It means: not preceded by digits OR not followed by digits. So the only failure case is: preceded by digits AND followed by digits.
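Here is a small Python sketch of that OR pattern applied to the examples from the question. Python's re is assumed (the lookarounds are fixed-width, so it accepts them), and the names COMMA and break_after_commas are made up for illustration.

```python
import re

# Comma (with surrounding whitespace) that is not preceded by a digit,
# OR comma (with surrounding whitespace) that is not followed by a digit.
COMMA = re.compile(r"(?<![0-9])\s*,\s*|\s*,\s*(?![0-9])")

def break_after_commas(text: str) -> str:
    """Replace each qualifying comma with a comma plus a newline."""
    return COMMA.sub(",\n", text)

for sample in ["hey, world", "hey , world", "hey,world", "hey),world", "(1,2)"]:
    print(repr(sample), "->", repr(break_after_commas(sample)))
```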
Be aware that \d is different from [0-9]. In .NET, \d matches any Unicode decimal digit, so ०१२३४५६७८९0123456789 are all \d (the first ten are Devanagari digits, and there are many others), while only 0123456789 are [0-9].
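The difference is easy to check. This sketch assumes Python's re, whose \d on str patterns likewise matches any Unicode decimal digit (category Nd); the Devanagari sample string is just an illustration.

```python
import re

# The digits 0, 1, 2 written in Devanagari script (U+0966..U+0968).
devanagari = "\u0966\u0967\u0968"

print(bool(re.fullmatch(r"\d+", devanagari)))            # True
print(bool(re.fullmatch(r"[0-9]+", devanagari)))         # False
print(bool(re.fullmatch(r"\d+", "012")))                 # True
# With re.ASCII, \d is restricted to the ASCII digits 0-9.
print(bool(re.fullmatch(r"\d+", devanagari, re.ASCII)))  # False
```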
My original regex was TOTALLY WRONG, because it meant: not preceded by digits AND not followed by digits, while the request was: not preceded by digits OR not followed by digits.
You need to use lookarounds to only match the comma, not the characters before and after the comma:
(?<=[^\d]\s*),(?=\s*[^\d])
Adding the removal of spaces shown in the second example:
(?<=[^\d]\s*)[ ]*,[ ]*(?=\s*[^\d])
Your match contains the characters you don't want to replace; you should use a negative lookahead assertion and a negative lookbehind assertion.
@"(?<!\d)\s*,\s*(?!\d)"
The above regex will replace the comma and any spaces directly before or after it.
Try to replace with an empty string.
Regex.Replace(input, @"(?<![0-9])\s*,\s*(?![0-9])", "");
I know this stuff has been talked about a lot, but I'm having a problem trying to match the following...
Example input: "test test 310-315"
I need a regex that recognizes a number followed by a dash and returns 310. How do I include the dash in the regex, though? The final match result would be: "310".
Thanks a lot - kcross
EDIT: Also, how would I do the same thing with the dash preceding the number, taking into account that the number following the dash could be negative? I didn't think of this when I first wrote the question. For example: "test test 310--315" returns -315 and "test 310-315" returns 315.
Regex regex = new Regex(@"\d+(?=\-)");
\d+ - Looks for one or more digits
(?=\-) - Makes sure it is followed by a dash
The @ just eliminates the need to escape the backslashes to keep the compiler happy.
Also, you may want this instead:
\d+(?=\-\d+)
This will check for one or more digits, followed by a dash, followed by one or more digits, but only match the first set.
In response to your comment, here's a regex that will check for a number following a -, while accounting for potential negative (-) numbers:
Regex regex = new Regex(@"(?<=\-)\-?\d+");
(?<=\-) - Positive lookbehind which will check and make sure there is a preceding -
\-? - Checks for either zero or one dashes
\d+ - One or more digits
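Both lookaround patterns can be checked against the strings from the question with a short Python sketch. Python's re is assumed, and the dash is written unescaped since it has no special meaning outside a character class. The function names are made up for illustration.

```python
import re
from typing import Optional

def number_before_dash(text: str) -> Optional[str]:
    """First run of digits that is immediately followed by a dash."""
    m = re.search(r"\d+(?=-)", text)
    return m.group() if m else None

def number_after_dash(text: str) -> Optional[str]:
    """First (possibly negative) number immediately preceded by a dash."""
    m = re.search(r"(?<=-)-?\d+", text)
    return m.group() if m else None

print(number_before_dash("test test 310-315"))   # 310
print(number_after_dash("test test 310--315"))   # -315
print(number_after_dash("test 310-315"))         # 315
```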
(?'number'\d+)- will work (no need to escape the dash). In this example, the group containing the single number is the named group 'number'.
If you want to match both numbers, each with an optional sign, try:
@"(?'first'-?\d+)-(?'second'-?\d+)"
Just to describe it, nothing complicated: -? matches an optional -, \d+ matches one or more digits, and a literal - matches itself.
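The same named-group idea as a Python sketch. Note the assumption: Python's re spells named groups (?P<name>...) rather than the .NET (?'name'...) form used above. PAIR and parse_pair are made-up names.

```python
import re
from typing import Optional, Tuple

# Two numbers, each with an optional sign, separated by a literal dash.
PAIR = re.compile(r"(?P<first>-?\d+)-(?P<second>-?\d+)")

def parse_pair(text: str) -> Optional[Tuple[int, int]]:
    """Return (first, second) as integers, or None if nothing matches."""
    m = PAIR.search(text)
    if m is None:
        return None
    return int(m.group("first")), int(m.group("second"))

print(parse_pair("test test 310--315"))  # (310, -315)
print(parse_pair("test 310-315"))        # (310, 315)
```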
here's some documentation that I use:
http://www.mikesdotnetting.com/Article/46/CSharp-Regular-Expressions-Cheat-Sheet
In the comments section of that page, it suggests escaping the dash with '\-'.
Make sure you also escape your escape character \
You would escape the special meaning of - in regex language (it denotes a range inside a character class) using a backslash (\). Since the backslash also has a special meaning in regular C# string literals (escaping quotes or forming escape sequences), you need to escape it with another backslash (\). So essentially it would be \d+\\-.
\b\d*(?=\-) you will want to look ahead for the dash
\b = start at a word boundary
\d = match any decimal digit
* = match the previous token zero or more times (use + instead if at least one digit is required)
(?=\-) = look ahead for the dash
Edited for Formatting issue with the slash not showing after posting
I have trouble finding a regex matching this pattern:
A numeric (decimal separator can be . or ,), followed by
a dash -, followed by
a numeric (decimal separator can be . or ,), followed by
a semicolon or a space character
This pattern can be repeated one or more times.
The following examples should match the regex:
1-2;
1-2;3-4;5-6;
1,0-2;
1.0-2;
1,0-2.0;
1-2 3-4;
1-2 3,00-4;5.0-6;
The following examples should not match the regex:
1-2
1 2;
1_2;
1-2;3-4
Edit: updated after 1 2; was moved to the non-matching examples.
This should work:
@"^(\d+([,.]\d+)?-\d+([,.]\d+)?[ ;])+(?<=;)$"
Explanation
^ //Start of the string.
( //Start of group to be repeated. You can also use (?: to make it non-capturing
\d+ //One or more digits.
([,.]\d+)? //With an optional decimal
- //Separated by a dash
\d+([,.]\d+)? //Same as before.
[ ;] //Terminated by a semi-colon or a space
)+ //One or more of these groups.
(?<=;) //The last char before the end needs to be a semi-colon
$ //End of string.
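To check the pattern against the examples from the question, here is a short Python sketch. Python's re is assumed (the (?<=;) lookbehind is fixed-width, so it is accepted), and the names RANGE_LIST and is_valid are made up for illustration.

```python
import re

RANGE_LIST = re.compile(r"^(\d+([,.]\d+)?-\d+([,.]\d+)?[ ;])+(?<=;)$")

def is_valid(text: str) -> bool:
    """True if text is a list of ranges whose last one ends with ';'."""
    return RANGE_LIST.match(text) is not None

should_match = ["1-2;", "1-2;3-4;5-6;", "1,0-2;", "1.0-2;",
                "1,0-2.0;", "1-2 3-4;", "1-2 3,00-4;5.0-6;"]
should_fail = ["1-2", "1 2;", "1_2;", "1-2;3-4"]

print(all(is_valid(s) for s in should_match))  # True
print(any(is_valid(s) for s in should_fail))   # False
```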
Try this:
@"^([\d.,]+-[\d.,]+[ ;])*[\d.,]+-[\d.,]+;$"
Note that [\d.,]+ accepts some character sequences which wouldn't normally be considered valid "numeric" values such as 00..,.,. You might want to find a better regular expression to match numeric values and substitute it into the regular expression.
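A Python sketch of that caveat: the loose pattern accepts a junk "number", while substituting a stricter numeric subpattern rejects it. The subpattern \d+(?:[.,]\d+)? and the names NUM, LOOSE and STRICT are assumptions made for illustration, not part of the answer.

```python
import re

LOOSE = re.compile(r"^([\d.,]+-[\d.,]+[ ;])*[\d.,]+-[\d.,]+;$")

# Assumed stricter numeric value: digits with an optional decimal part.
NUM = r"\d+(?:[.,]\d+)?"
STRICT = re.compile(rf"^(?:{NUM}-{NUM}[ ;])*{NUM}-{NUM};$")

junk = "00..,.,.-1;"
print(bool(LOOSE.match(junk)))          # True: accepted by the loose pattern
print(bool(STRICT.match(junk)))         # False
print(bool(STRICT.match("1,0-2.0;")))   # True
```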
I am trying to create a regex that does not match a word (a-z only) if the word has a : on the end, but otherwise matches it. However, this word is in the middle of a larger regex, so I don't think I can use a negative lookbehind with the $ metacharacter.
I tried this negative lookahead instead:
([a-z]+)(?!:)
but this test case
example:
just matches to
exampl
instead of failing.
If you are using a negative lookahead, you could put it at the beginning:
(?![a-z]*:)[a-z]+
i.e.: "match at least one a-z character, unless the following characters are zero or more a-z letters followed by a ':'"
That would support a larger regex:
X(?![a-z]*:)[a-z]+Y
would match in the following string:
Xeee Xrrr:Y XzzzY XfffZ
only 'XzzzY'
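The leading negative lookahead can be verified with a short Python sketch (Python's re assumed; WORD and EMBEDDED are made-up names):

```python
import re

WORD = re.compile(r"(?![a-z]*:)[a-z]+")
EMBEDDED = re.compile(r"X(?![a-z]*:)[a-z]+Y")

print(WORD.findall("example:"))                     # []
print(WORD.findall("foo example: bar"))             # ['foo', 'bar']
print(EMBEDDED.findall("Xeee Xrrr:Y XzzzY XfffZ"))  # ['XzzzY']
```

Putting the lookahead first means every start position inside "example:" is rejected, so the engine cannot fall back to a shorter match such as "exampl".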
Try this:
[a-z]\s
([a-z]+\b)(?!:)
This asserts a word boundary at the end of the match and thus rejects the partial match "exampl".
[a-z]+(?![:a-z])
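A quick Python comparison of the question's pattern with the two suggestions above (Python's re assumed; first_match is a made-up helper):

```python
import re
from typing import Optional

def first_match(pattern: str, text: str) -> Optional[str]:
    """Return the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match(r"([a-z]+)(?!:)", "example:"))     # exampl (backtracks)
print(first_match(r"([a-z]+\b)(?!:)", "example:"))   # None
print(first_match(r"[a-z]+(?![:a-z])", "example:"))  # None
print(first_match(r"[a-z]+(?![:a-z])", "example"))   # example
```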