I am trying to replace the value between OFFSET and ROWS in a string.
I'm using the regex below, and it doesn't work.
string strValue = "OFFSET NUMBER ROWS";
string strIndex = "5";
strValue = Regex.Replace(strValue, @"(?<=OFFSET)(\w+?)(?=ROWS)", strIndex);
So my desired result will be like
OFFSET 5 ROWS
Can anyone help or suggest what's wrong with this regex? It doesn't replace the value.
The regex related problem here is that you have not accounted for whitespace chars on both ends of the NUMBER. Add \s* or \s+ to account for them.
Use
string strValue = "OFFSET NUMBER ROWS";
string strIndex = "5";
strValue = Regex.Replace(strValue, @"(?<=OFFSET\s*)\w+(?=\s*ROWS)", strIndex);
Console.Write(strValue); // => OFFSET 5 ROWS
Here,
(?<=OFFSET\s*) - a positive lookbehind requiring OFFSET and 0+ whitespace chars immediately to the left of the current location
\w+ - 1+ word chars
(?=\s*ROWS) - a positive lookahead requiring 0+ whitespace chars immediately to the right of the current location and then the ROWS substring.
Alternatively, use capturing groups with backreferences in the replacement pattern:
strValue = Regex.Replace(strValue, @"(OFFSET\s*)\w+(\s*ROWS)", $"${{1}}{strIndex}$2");
See the C# online demo.
The capturing-group variation is a bit tricky: the first backreference is followed by a digit, so the regular $1 syntax cannot be used ($15 would be parsed as a reference to group 15). You must use the unambiguous form, ${1}.
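A minimal sketch of both variants side by side, for comparison. The class and method names (OffsetDemo, ReplaceWithLookarounds, ReplaceWithGroups) are made up for illustration; only Regex.Replace and the patterns come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class OffsetDemo
{
    // Lookaround variant: only the word between OFFSET and ROWS is consumed,
    // so the replacement string is just the new value.
    public static string ReplaceWithLookarounds(string input, string value) =>
        Regex.Replace(input, @"(?<=OFFSET\s*)\w+(?=\s*ROWS)", value);

    // Capturing-group variant: the surrounding text is consumed and put back.
    // ${1} keeps a digit at the start of `value` from merging into the group number.
    public static string ReplaceWithGroups(string input, string value) =>
        Regex.Replace(input, @"(OFFSET\s*)\w+(\s*ROWS)", "${1}" + value + "$2");

    public static void Main()
    {
        Console.WriteLine(ReplaceWithLookarounds("OFFSET NUMBER ROWS", "5")); // OFFSET 5 ROWS
        Console.WriteLine(ReplaceWithGroups("OFFSET NUMBER ROWS", "5"));      // OFFSET 5 ROWS
    }
}
```

Note that in both variants a `$` inside the value would be interpreted as a substitution token, so this sketch assumes the value is a plain number.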
Related
I am trying to extract the fifth and sixth value present in the stream through regex.
The stream is
12,097.00 435.00 100.00 43,037.00 3,090.00 200.00 86.00 45,890.47 7,570.00 51,514.47
I want values 200.00 and 100.00.
I tried ^(?:\S+\s+\n?){3,3} but it's selecting the string from beginning.
Can anybody help me please in getting the values that are present in the middle?
Using a quantifier like {3,3} can be written as {3}, but note that in the example string the values 200.00 and 100.00 are not the 5th and the 6th value.
With your pattern you only get the values at the beginning as the anchor ^ asserts the start of the string.
To get the third and the sixth value, you could also use 2 capture groups by using a quantifier {2} for the parts in between.
^(?:\S+\s+){2}(\S+)(?:\s+\S+){2}\s+(\S+)
^ Start of string
(?:\S+\s+){2} Repeat 2 times matching non whitespace chars followed by whitespace char
(\S+) Capture group 1, match 1+ non whitespace chars
(?:\s+\S+){2}\s+ Repeat 2 times matching whitespace chars and non whitespace chars
(\S+) Capture group 2, match 1+ non whitespace chars
Regex demo
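In C#, reading the two capture groups could look roughly like this. ThirdAndSixthDemo and Extract are illustrative names, and returning an empty array on a failed match is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class ThirdAndSixthDemo
{
    // Returns { third, sixth } or an empty array when the input has fewer than six values.
    public static string[] Extract(string input)
    {
        Match m = Regex.Match(input, @"^(?:\S+\s+){2}(\S+)(?:\s+\S+){2}\s+(\S+)");
        return m.Success
            ? new[] { m.Groups[1].Value, m.Groups[2].Value }
            : Array.Empty<string>();
    }

    public static void Main()
    {
        string stream = "12,097.00 435.00 100.00 43,037.00 3,090.00 200.00 86.00 45,890.47 7,570.00 51,514.47";
        string[] values = Extract(stream);
        Console.WriteLine(string.Join(" ", values)); // 100.00 200.00
    }
}
```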
Certainly, if you have access to the code itself, it would be easier to split the string and get nth chunk by its index.
If you are limited to a regex, you can use
(?<=^(?:\S+\s+){2})\S+
(?<=^(?:\S+\s+){5})\S+
Or, if there can be leading whitespaces:
(?<=^\s*(?:\S+\s+){2})\S+
(?<=^\s*(?:\S+\s+){5})\S+
See a .NET regex demo.
Details:
(?<= - start of a positive lookbehind that requires the following sequence of patterns to appear immediately to the left of the current location:
^ - start of string
\s* - zero or more whitespaces
(?:\S+\s+){2} - two occurrences of 1+ non-whitespace chars followed with 1+ whitespace chars
) - end of the lookbehind
\S+ - one or more non-whitespace chars.
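To compare the two routes mentioned above, here is a small sketch that gets the nth value once with the lookbehind pattern and once by splitting. NthValueDemo, NthByRegex and NthBySplit are hypothetical names; building the quantifier from n is my own generalization of the {2} and {5} patterns.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class NthValueDemo
{
    // Regex route: the lookbehind requires n-1 values (plus optional leading
    // whitespace) between the start of the string and the match.
    public static string NthByRegex(string input, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        string pattern = @"(?<=^\s*(?:\S+\s+){" + (n - 1) + @"})\S+";
        return Regex.Match(input, pattern).Value; // empty string when there is no nth value
    }

    // Split route: an empty separator array makes Split use whitespace as the delimiter.
    public static string NthBySplit(string input, int n)
    {
        string[] parts = input.Split(Array.Empty<char>(), StringSplitOptions.RemoveEmptyEntries);
        return n >= 1 && n <= parts.Length ? parts[n - 1] : string.Empty;
    }

    public static void Main()
    {
        string stream = "12,097.00 435.00 100.00 43,037.00 3,090.00 200.00 86.00 45,890.47 7,570.00 51,514.47";
        Console.WriteLine(NthByRegex(stream, 3)); // 100.00
        Console.WriteLine(NthBySplit(stream, 6)); // 200.00
    }
}
```

The regex route relies on .NET supporting variable-length lookbehinds, so the same pattern may not work in other regex flavors.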
I want to check string which look like following
1st radius = 120
and
2nd radius = 'value'
Here is my code
v1 = new Regex(@"^[A-Za-z]+\s[=]\s[A-Za-z]+$");
if (v1.IsMatch(singleLine))
{
...
...
}
Using @"^[A-Za-z]+\s[=]\s[A-Za-z]+$", the second string is matched but not the first, and when I use @"^[A-Za-z]+\s[=]\s\d{0,3}$", only the first one is matched.
I also want to check for radius = 'val01'.
Based on your attempt, it looks as if you were trying to come up with
^[A-Za-z]+\s=\s(?:'[A-Za-z0-9]+'|\d{1,3})$
See the regex demo. Details:
^ - start of string
[A-Za-z]+ - one or more ASCII letters
\s=\s - a = char enclosed with single whitespace chars
(?:'[A-Za-z0-9]+'|\d{1,3}) - a non-capturing group matching either
'[A-Za-z0-9]+' - ', then one or more ASCII letters or digits and then a '
| - or
\d{1,3} - one, two or three digits
$ - end of string (actually, \z is safer when it comes to validating as there can be no final trailing newline after \z, and there can be such a newline after $, but it also depends on how you obtain the input).
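A quick sketch of the validation with \z in place of $, to show the trailing-newline difference mentioned in the last bullet. RadiusValidator and IsValid are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class RadiusValidator
{
    // Same pattern as above, but anchored with \z so a trailing newline is rejected.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z]+\s=\s(?:'[A-Za-z0-9]+'|\d{1,3})\z");

    public static bool IsValid(string line) => Pattern.IsMatch(line);

    public static void Main()
    {
        Console.WriteLine(IsValid("radius = 120"));     // True
        Console.WriteLine(IsValid("radius = 'val01'")); // True
        Console.WriteLine(IsValid("radius = 120\n"));   // False (would be True with $)
    }
}
```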
If the pattern you tried ^[A-Za-z]+\s[=]\s[A-Za-z]+$ matches the second string radius = 'value', that means that 'value' consists of only chars A-Za-z.
In that case, you could either add matching digits to the second character class:
^[A-Za-z]+\s=\s[A-Za-z0-9]+$
If you either want to match 1-3 digits or at least a single char A-Za-z followed by optional digits:
^[A-Za-z]+\s=\s(?:[0-9]{1,3}|[A-Za-z]+[0-9]*)$
The pattern matches:
^ Start of string
[A-Za-z]+\s=\s Match the first part with chars A-Za-z and the = sign (Note that = does not have to be between square brackets)
(?: Non capture group
[0-9]{1,3} Match 1-3 digits (You can use \d{0,3} but that will also match an empty string due to the 0)
| Or
[A-Za-z]+[0-9]* Match 1+ chars A-Za-z followed by optional digits
) Close non capture group
$ End of string
Regex demo
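To illustrate the remark about {0,3}, here is a short sketch comparing the pattern above with the \d{0,3} variant from the question on an input that has no value at all. QuantifierDemo and its method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class QuantifierDemo
{
    // Pattern from the answer: requires at least one digit or letter after "= ".
    public static bool MatchesSuggested(string line) =>
        Regex.IsMatch(line, @"^[A-Za-z]+\s=\s(?:[0-9]{1,3}|[A-Za-z]+[0-9]*)$");

    // Variant from the question: {0,3} also accepts zero digits.
    public static bool MatchesZeroToThree(string line) =>
        Regex.IsMatch(line, @"^[A-Za-z]+\s=\s\d{0,3}$");

    public static void Main()
    {
        Console.WriteLine(MatchesSuggested("radius = val01")); // True
        Console.WriteLine(MatchesSuggested("radius = "));      // False
        Console.WriteLine(MatchesZeroToThree("radius = "));    // True
    }
}
```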
I am having a hard time trying to figure out this regex pattern. I want to replace all non-numeric characters in a string except for certain alpha character patterns.
For example, I am trying:
string str = "The sky is blue 323.05 lnp days of the year";
str = Regex.Replace(str, "(^blue|lnp|days)[^.0-9]", "", RegexOptions.IgnoreCase);
I would like it to return:
"blue 323.05 lnp days"
but I can't figure out how to get it to match the entire character pattern in the expression.
I'd suggest capturing what you need to keep and just matching what you need to remove:
var result = Regex.Replace(text, @"(\s*\b(?:blue|lnp|days)\b\s*)|[^.0-9]", "$1").Trim();
See the regex demo. Note that the eventual leading/trailing spaces will be trimmed with .Trim().
The regex means:
(\s*\b(?:blue|lnp|days)\b\s*) - Group 1 ($1):
\s* - 0+ whitespaces
\b(?:blue|lnp|days)\b - one of the three words as whole words
\s* - 0+ whitespaces
| - or
[^.0-9] - any char but . and ASCII digit.
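As a runnable sketch of this keep-or-remove idea with the sample sentence from the question (KeepWordsDemo and Clean are made-up names):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class KeepWordsDemo
{
    // Group 1 captures what should stay; anything matched by the second
    // alternative leaves $1 empty and is therefore removed.
    public static string Clean(string text) =>
        Regex.Replace(text, @"(\s*\b(?:blue|lnp|days)\b\s*)|[^.0-9]", "$1").Trim();

    public static void Main()
    {
        Console.WriteLine(Clean("The sky is blue 323.05 lnp days of the year"));
        // blue 323.05 lnp days
    }
}
```

Digits and dots are matched by neither alternative, so they are simply left in place. The sketch is case-sensitive; passing RegexOptions.IgnoreCase, as in the question, would also keep words like "Blue".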
This is in C#. I've been racking my brain but have had no luck so far.
So for example
123456BVC --> 123456BVC (keep the same)
123456BV --> 123456 (remove trailing letters)
12345V --> 12345V (keep the same)
12345 --> 12345 (keep the same)
ABC123AB --> ABC123 (remove trailing letters)
It can start with anything.
I've tried @".*[a-zA-Z]{2}$" but no luck
This is in C# so that I always return a string removing the two trailing letters if they do exist and are not preceded with another letter.
Match result = Regex.Match(mystring, pattern);
return result.Value;
Your @".*[a-zA-Z]{2}$" regex matches any 0+ characters other than a newline (as many as possible) and 2 ASCII letters at the end of the string. You do not check the context, so the 2 letters are matched regardless of what comes before them.
You need a regex that will match the last two letters not preceded with a letter:
(?<!\p{L})\p{L}{2}$
See this regex demo.
Details:
(?<!\p{L}) - fails the match if a letter (\p{L}) is found before the current position (you may use [a-zA-Z] if you only want to deal with ASCII letters)
\p{L}{2} - 2 letters
$ - end of string.
In C#, use
var result = Regex.Replace(mystring, @"(?<!\p{L})\p{L}{2}$", string.Empty);
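Here is a small sketch that runs this replacement over the sample inputs from the question. TrailingLettersDemo and Strip are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class TrailingLettersDemo
{
    // Removes exactly two trailing letters, but only when no letter precedes them.
    public static string Strip(string value) =>
        Regex.Replace(value, @"(?<!\p{L})\p{L}{2}$", string.Empty);

    public static void Main()
    {
        foreach (string s in new[] { "123456BVC", "123456BV", "12345V", "12345", "ABC123AB" })
        {
            Console.WriteLine(s + " --> " + Strip(s));
        }
        // 123456BVC --> 123456BVC
        // 123456BV --> 123456
        // 12345V --> 12345V
        // 12345 --> 12345
        // ABC123AB --> ABC123
    }
}
```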
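If you're looking to remove those last two letters regardless of what precedes them, you can simply do this:
string result = Regex.Replace(originalString, @"[A-Za-z]{2}$", string.Empty);
Note that this pattern does not check the preceding character, so 123456BVC would become 123456B. Remember that in .NET regex $ matches at the end of the input or before a final trailing newline.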
I want to search for a string like "$12,56,450" using Regex in C#, but it doesn't match the string.
Here is my code:
string input="Total earn for the year $12,56,450";
string pattern = @"\b(?mi)($12,56,450)\b";
Regex regex = new Regex(pattern);
if (regex.Match(input).Success)
{
return true;
}
This Regex will do the job, (?mi)(\$\d{2},\d{2},\d{3}), and here's a Regex 101 to prove it.
Now let's break it down a little:
\$ matches the literal $ at the beginning of the amount
\d{2} matches any two digits
, matches the literal ,
\d{2} matches any two digits
, matches the literal ,
\d{3} matches any three digits
Now, for the purposes of the demonstration I removed the word boundaries, \b, but I'm also pretty confident you don't need them anyway. See, word boundaries aren't generally necessary for such a finite string match. Consider their definition:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
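To make the word-boundary point concrete, here is a small sketch against the input from the question. CurrencyDemo and its method names are made up. Regex.Escape is a real .NET method, but using it here is my own suggestion and not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class CurrencyDemo
{
    // Generic amount pattern from the answer, without word boundaries.
    public static bool ContainsAmount(string input) =>
        Regex.IsMatch(input, @"\$\d{2},\d{2},\d{3}");

    // Escaped literal with \b on both sides: the leading \b needs a word
    // character right before the $, which a preceding space does not provide.
    public static bool MatchesWithBoundaries(string input) =>
        Regex.IsMatch(input, @"\b\$12,56,450\b");

    // Regex.Escape takes care of $ and other metacharacters in a literal search string.
    public static bool ContainsLiteral(string input, string literal) =>
        Regex.IsMatch(input, Regex.Escape(literal));

    public static void Main()
    {
        string input = "Total earn for the year $12,56,450";
        Console.WriteLine(ContainsAmount(input));                // True
        Console.WriteLine(MatchesWithBoundaries(input));         // False
        Console.WriteLine(ContainsLiteral(input, "$12,56,450")); // True
    }
}
```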
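You need to escape $ and some other special regex characters.
Try this: @"(?mi)(\$12,56,450)\b";
Note that the leading \b is dropped here: in your input the $ is preceded by a space, and since both are non-word characters, there is no word boundary between them, so \b\$ could not match.
If you want, you can use \d to match a digit and \d{2,3} to match two or three digits.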