C# Regex Replace sequence of numbers preceded with a space - c#

I have this string:
Hello22, I'm 19 years old
I just want to replace the number with * if it's preceded by a space, so it would look like this:
Hello22, I'm ** years old
I've been trying a bunch of regexes but no luck. Hope someone can help out with the correct regex. Thank you.
Regexes which I tried:
Regex.Replace(input, @"([\d-])", "*");
Returns all numbers replaced with *
Regex.Replace(input, @"(\x20[\d-])", "*");
Does not work as expected

You can try (?<= )[0-9]+ pattern where
(?<= ) - look behind for a space
[0-9]+ - one or more digits.
Code:
string source = "Hello22, I'm 19 years old";
string result = Regex.Replace(
source,
"(?<= )[0-9]+",
m => new string('*', m.Value.Length));
Also have a look at \b[0-9]+\b (here \b stands for a word boundary). This pattern
will replace every 19 in "19, as I say, 19, I'm 19" (note that the first 19 has no space before it):
string source = "19, as I say, 19, I'm 19";
string result = Regex.Replace(
source,
@"\b[0-9]+\b",
m => new string('*', m.Value.Length));
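To make the difference between the two patterns above concrete, here is a minimal sketch that runs both against the same inputs. The class and method names (DigitMasker, MaskAfterSpace, MaskWholeNumbers) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitMasker
{
    // Masks digit runs that directly follow a space (lookbehind version).
    public static string MaskAfterSpace(string input) =>
        Regex.Replace(input, @"(?<= )[0-9]+", m => new string('*', m.Value.Length));

    // Masks digit runs delimited by word boundaries on both sides.
    public static string MaskWholeNumbers(string input) =>
        Regex.Replace(input, @"\b[0-9]+\b", m => new string('*', m.Value.Length));

    public static void Main()
    {
        Console.WriteLine(MaskAfterSpace("Hello22, I'm 19 years old"));   // Hello22, I'm ** years old
        Console.WriteLine(MaskAfterSpace("19, as I say, 19, I'm 19"));    // 19, as I say, **, I'm **
        Console.WriteLine(MaskWholeNumbers("19, as I say, 19, I'm 19"));  // **, as I say, **, I'm **
    }
}
```

The lookbehind version leaves a number at the very start of the string untouched, because nothing precedes it. The word-boundary version masks it, and still skips the 22 in Hello22 because there is no boundary between a letter and a digit.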

In C# you could also make use of a pattern with a lookbehind and an infinite quantifier.
(?<= [0-9]*)[0-9]
The pattern matches:
(?<= Positive lookbehind, assert that what is to the left of the current position is
 [0-9]* a space (note the leading space) followed by optional digits 0-9
) Close lookbehind
[0-9] Match a single digit 0-9
Example
string s = "Hello22, I'm 19 years old";
string result = Regex.Replace(s, "(?<= [0-9]*)[0-9]", "*");
Console.WriteLine(result);
Output
Hello22, I'm ** years old

Related

Regex - Get digits after a colon

I have a regex:
var topPayMatch = Regex.Match(result, @"(?<=Top Pay)(\D*)(\d+(?:\.\d+)?)", RegexOptions.IgnoreCase);
And I have to convert this to int which I did
topPayMatch = Convert.ToInt32(topPayMatchString.Groups[2].Value);
So now, if the input is Top Pay: 1,000,000, it currently grabs only the first digit, which is 1. I want all of it: 1000000.
If Top Pay: 888,888 then I want all 888888.
What should I add to my regex?
You can use something as simple as @"(?<=Top Pay: )([0-9,]+)". Note that decimals will be ignored with this regex.
This will match all digits, together with their commas, after Top Pay:, which you can then parse into an integer.
Example:
Regex rgx = new Regex(@"(?<=Top Pay: )([0-9,]+)");
string str = "Top Pay: 1,000,000";
Match match = rgx.Match(str);
if (match.Success)
{
string val = match.Value;
int num = int.Parse(val, System.Globalization.NumberStyles.AllowThousands);
Console.WriteLine(num);
}
Console.WriteLine("Ended");
Source:
Convert int from string with commas
If you use the lookbehind, you don't need the capture groups and you can move the \D* into the lookbehind.
To get the values, you can match 1+ digits followed by optional repetitions of , and 1+ digits.
Note that your example data contains commas and no dots, and that using ? as a quantifier means 0 or 1 time.
(?<=Top Pay\D*)\d+(?:,\d+)*
The pattern matches:
(?<=Top Pay\D*) Positive lookbehind, assert what is to the left is Top Pay and optional non digits
\d+ Match 1+ digits
(?:,\d+)* Optionally repeat a , and 1+ digits
Example in C#:
string pattern = @"(?<=Top Pay\D*)\d+(?:,\d+)*";
string input = @"Top Pay: 1,000,000
Top Pay: 888,888";
RegexOptions options = RegexOptions.IgnoreCase;
foreach (Match m in Regex.Matches(input, pattern, options))
{
var topPayMatch = int.Parse(m.Value, System.Globalization.NumberStyles.AllowThousands);
Console.WriteLine(topPayMatch);
}
Output
1000000
888888
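One detail the examples above leave implicit: int.Parse with NumberStyles.AllowThousands uses the current thread culture to decide what the group separator is, so on a machine whose culture does not use a comma for grouping the parse can fail. The sketch below pins the culture to CultureInfo.InvariantCulture and uses int.TryParse so that a missing or oversized value returns null instead of throwing. The names TopPayParser and ParseTopPay are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class TopPayParser
{
    // Returns the number after "Top Pay", or null when nothing parseable is found.
    public static int? ParseTopPay(string text)
    {
        Match m = Regex.Match(text, @"(?<=Top Pay\D*)\d+(?:,\d+)*", RegexOptions.IgnoreCase);

        // InvariantCulture uses "," as the group separator regardless of machine settings.
        if (m.Success &&
            int.TryParse(m.Value, NumberStyles.AllowThousands, CultureInfo.InvariantCulture, out int value))
        {
            return value;
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(ParseTopPay("Top Pay: 1,000,000"));      // 1000000
        Console.WriteLine(ParseTopPay("top pay 888,888"));         // 888888
        Console.WriteLine(ParseTopPay("no salary here") == null);  // True
    }
}
```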

Regex replace 'whole' decimal numbers not followed by a certain string

I want to replace "whole" decimal numbers not followed by pt with M.
For example, I need to replace 1, 12, and 36.7, but not 45.63 in the following.
string exp = "y=tan^-1(45.63pt)+12sin(-36.7)";
I have already tried
string newExp = Regex.Replace(exp, @"(\d+\.?\d*)(?!pt)", "M");
and it gives
"y=tan^-M(M3pt)+Msin(-M)"
It does make sense to me why it works like this, but I need to get
"y=tan^-M(45.63pt)+Msin(-M)"
The problem with the regex is that it is still matching a portion of the decimal value 45.63, up to the second-to-last decimal digit. One solution is to add a negative lookahead to the pattern to ensure that we only assert (?!pt) at the real end of every decimal value. This version is working:
string exp = "y=tan^-1(45.63pt)+12sin(-36.7)";
string newExp = Regex.Replace(exp, @"(\d+(?:\.\d+)?)(?![\d.])(?!pt)", "M");
Console.WriteLine(newExp);
This prints:
y=tan^-M(45.63pt)+Msin(-M)
Here is an explanation of the regex pattern used:
( match and capture:
\d+ one or more whole number digits
(?:\.\d+)? followed by an optional decimal component
) stop capturing
(?![\d.]) not being followed by another digit or dot
(?!pt) not followed by pt
If you need the output to be
"y=tan^-M(Mpt)+Msin(-M)"
then newExp should be
string newExp = Regex.Replace(exp, @"(\d+\.?\d*)", "M");
If the output should be
"y=tan^-M(45.63pt)+Msin(-M)"
then newExp should be
string newExp = Regex.Replace(exp, @"(\d+\.?\d*)(?![.\d]*pt)", "M");
I think you may assert the point in a string where there are no digits and dots directly followed by "pt":
\b(?![\d.]+pt)\d+(?:\.\d+)?
\b - Match a word-boundary.
(?![\d.]+pt) - Negative lookahead for 1+ digits and dots followed by "pt".
\d+ - 1+ digits.
(?: - Open non-capture group:
\.\d+ - A literal dot and 1+ digits.
)? - Close non-capture group and make it optional.
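The word-boundary pattern explained above comes without C# code, so here is a minimal sketch of how it could be applied to the expression from the question. The wrapper names (PtAwareReplacer, ReplaceNumbers) are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PtAwareReplacer
{
    // Replaces each whole decimal number with "M" unless the number is followed by "pt".
    public static string ReplaceNumbers(string expression) =>
        Regex.Replace(expression, @"\b(?![\d.]+pt)\d+(?:\.\d+)?", "M");

    public static void Main()
    {
        string exp = "y=tan^-1(45.63pt)+12sin(-36.7)";
        Console.WriteLine(ReplaceNumbers(exp)); // y=tan^-M(45.63pt)+Msin(-M)
    }
}
```

The \b matters here: without it the engine could start a match in the middle of 45.63. With it, the only candidate start positions inside that number are before 45 and before 63, and the lookahead sees "pt" ahead from both.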

Search for 2 specific letters followed by 4 numbers Regex

I need to check if a string begins with 2 specific letters and then is followed by any 4 numbers.
The 2 letters are "BR", so BR1234 would be valid, and so would BR7412, for example.
What bit of code do I need to check that the string matches the regex in C#?
The regex I have written is below; there is probably a more efficient way of writing this (I'm new to regex).
[B][R][0-9][0-9][0-9][0-9]
You can use this:
Regex regex = new Regex(@"^BR\d{4}");
^ defines the start of the string (so there should be no other characters before BR)
BR matches - well - BR
\d is a digit (0-9)
{4} says there must be exactly 4 of the previously mentioned group (\d)
You did not specify what is allowed to follow the four digits. If this should be the end of the string, add a $.
Usage in C#:
string matching = "BR1234";
string notMatching = "someOther";
Regex regex = new Regex(@"^BR\d{4}");
bool doesMatch = regex.IsMatch(matching); // true
doesMatch = regex.IsMatch(notMatching); // false;
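The remark about adding a $ can be illustrated with a short sketch that compares the prefix-only check with the whole-string check. The helper names (BrCodeCheck, StartsWithCode, IsExactCode) are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrCodeCheck
{
    // Prefix check: "BR" plus four digits at the start; anything may follow.
    public static bool StartsWithCode(string s) => Regex.IsMatch(s, @"^BR\d{4}");

    // Whole-string check: the $ anchor rejects trailing characters.
    public static bool IsExactCode(string s) => Regex.IsMatch(s, @"^BR\d{4}$");

    public static void Main()
    {
        Console.WriteLine(StartsWithCode("BR1234"));   // True
        Console.WriteLine(StartsWithCode("BR12345"));  // True
        Console.WriteLine(IsExactCode("BR12345"));     // False
        Console.WriteLine(IsExactCode("BR1234"));      // True
        Console.WriteLine(StartsWithCode("XBR1234"));  // False
    }
}
```

In .NET, $ also matches just before a final newline, so \z is the stricter choice if a trailing newline must be rejected as well. Likewise, \d matches any Unicode decimal digit unless RegexOptions.ECMAScript is set; [0-9] restricts the match to ASCII digits.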
BR\d{4}
Some text to make answer at least 30 characters long :)

Regex to extract substrings in C#

I have a string as:
string subjectString = @"(((43*('\\uth\Hgh.Green.two.190ITY.PCV')*9.8)/100000+('VBNJK.PVI.10JK.PCV'))*('ASFGED.Height Density.1JKHB01.PCV')/476)";
My expected output is:
Hgh.Green.two.190ITY.PCV
VBNJK.PVI.10JK.PCV
ASFGED.Height Density.1JKHB01.PCV
Here's what I have tried:
Regex regexObj = new Regex(@"'[^\\]*.PCV");
Match matchResults = regexObj.Match(subjectString);
string val = matchResults.Value;
This works when the input string is @"(((43*('\\uth\Hgh.Green.two.190ITY.PCV')*9.8)/100000+", but when the string grows and more than one substring has to be extracted, I get undesired results.
How do I extract three substrings from the original string?
It seems you want to match word and . chars before .PCV.
Use
[\w\s.]*\.PCV
To force at least 1 word char at the start use
\w[\w\s.]*\.PCV
Optionally, if needed, add a word boundary at the start: @"\b\w[\w\s.]*\.PCV".
To force \w match only ASCII letters and digits (and _) compile the regex object with RegexOptions.ECMAScript option.
Here,
\w - matches any letter, digit or _
[\w\s.]* - matches 0+ whitespace, word or/and . chars
\. - a literal .
PCV - a PCV substring.
Sample usage:
// Requires: using System.Linq; using System.Text.RegularExpressions;
var results = Regex.Matches(subjectString, @"\w[\w\s.]*\.PCV")
.Cast<Match>()
.Select(m=>m.Value)
.ToList();
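For completeness, here is a self-contained sketch that runs the suggested pattern against the full string from the question. The wrapper names (PcvExtractor, ExtractTags) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PcvExtractor
{
    // Collects every run of word/whitespace/dot characters that ends in ".PCV".
    public static List<string> ExtractTags(string input) =>
        Regex.Matches(input, @"\w[\w\s.]*\.PCV")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToList();

    public static void Main()
    {
        string subjectString = @"(((43*('\\uth\Hgh.Green.two.190ITY.PCV')*9.8)/100000+('VBNJK.PVI.10JK.PCV'))*('ASFGED.Height Density.1JKHB01.PCV')/476)";

        // Prints the three expected substrings, one per line.
        foreach (string tag in ExtractTags(subjectString))
        {
            Console.WriteLine(tag);
        }
    }
}
```

The backslash before Hgh is what stops the first match from swallowing the uth prefix, since a backslash is not in the [\w\s.] class.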

Basic regex for 16 digit numbers

I currently have a regex that pulls up a 16 digit number from a file e.g.:
Regex:
Regex.Match(l, @"\d{16}")
This would work well for a number as follows:
1234567891234567
Although how could I also include numbers in the regex such as:
1234 5678 9123 4567
and
1234-5678-9123-4567
If all groups are always 4 digits long:
\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b
to be sure the delimiter is the same between groups:
\b\d{4}(| |-)\d{4}\1\d{4}\1\d{4}\b
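As a rough illustration of the backreference variant above, the following sketch checks a few inputs. The group (| |-) captures an empty string, a space, or a dash, and each \1 then requires the same delimiter again. The names SixteenDigitNumber and HasConsistentGrouping are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SixteenDigitNumber
{
    // \1 must repeat whatever delimiter (none, space, or dash) followed the first group.
    private static readonly Regex SameDelimiter =
        new Regex(@"\b\d{4}(| |-)\d{4}\1\d{4}\1\d{4}\b");

    public static bool HasConsistentGrouping(string s) => SameDelimiter.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(HasConsistentGrouping("1234567891234567"));     // True
        Console.WriteLine(HasConsistentGrouping("1234 5678 9123 4567"));  // True
        Console.WriteLine(HasConsistentGrouping("1234-5678-9123-4567"));  // True
        Console.WriteLine(HasConsistentGrouping("1234-5678 9123 4567"));  // False
    }
}
```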
If it's always all together or groups of fours, then one way to do this with a single regex is something like:
Regex.Match(l, @"\d{16}|\d{4}[- ]\d{4}[- ]\d{4}[- ]\d{4}")
You could try something like:
^([0-9]{4}[\s-]?){3}([0-9]{4})$
That should do the trick.
Please note:
This also allows
1234-5678 9123 4567
It's not strict on only dashes or only spaces.
Another option is to just use the regex you currently have, and strip all offending characters out of the string before you run the regex:
var input = fileValue.Replace("-",string.Empty).Replace(" ",string.Empty);
Regex.Match(input, @"\d{16}");
Here is a pattern which will get all the digits and strip out the dashes or spaces. Note that it also validates that there are exactly 16 digits. The RegexOptions.IgnorePatternWhitespace option is there only so that the pattern can be commented; it doesn't affect the match processing.
string value = "1234-5678-9123-4567";
string pattern = @"
^ # Beginning of line
( # Place into capture groups for 1 match
(?<Number>\d{4}) # Place into named group capture
(?:[\s-]?) # Allow for a space or dash optional
){4} # Get 4 groups
(?!\d) # 17th number, do not match! abort
$ # End constraint to keep int in 16 digits
";
var result = Regex.Match(value, pattern, RegexOptions.IgnorePatternWhitespace)
.Groups["Number"].Captures
.OfType<Capture>()
.Aggregate (string.Empty, (seed, current) => seed + current);
Console.WriteLine ( result ); // 1234567891234567
// Shows False due to 17 numbers!
Console.WriteLine ( Regex.IsMatch("1234-5678-9123-45678", pattern, RegexOptions.IgnorePatternWhitespace));
