Regex for paragraph numbering in C#

I am looking for a regular expression that will match any of the following:
1.0
2.0
3.1
4.2.1
2.1.1.7
1.3.17.11
12.23.54.18
The nesting could go deeper than 4 levels, but the numbers between the dots are unlikely to exceed 2 digits (as in the last sample).
I tried @"\d.\d+" but in some cases it did not work.
I am also looking for an expression that will match ONLY this:
1.0
12.0
4.0
Here too, no more than 2 digits before the dot.

As usual, think about the structure of what you want to match:
A single digit:
\d
A single number of arbitrary length:
\d+
A single number, constrained to at most 2 digits:
\d{1,2}
A number, followed by a dot, followed by another number:
\d{1,2}\.\d{1,2}
A number, followed by a dot, followed by another number, followed by another dot, followed by yet another number:
\d{1,2}\.\d{1,2}\.\d{1,2}
Notice a pattern? Exactly, you can use grouping and repetition to match that pattern to an arbitrary length:
\d{1,2}(\.\d{1,2})+
Note that . is a meta-character in regular expressions, matching (almost) any character, so to match a literal dot, you need to escape it (as shown above).
To match just two levels of nesting you can constrain the + after the parentheses in a similar manner:
\d{1,2}(\.\d{1,2}){1}
This means it will have to match exactly once. However, in that case you can also simplify to a regex we've seen before:
\d{1,2}\.\d{1,2}
However, putting an exact number of repetitions at the end can be helpful, if you want to create regexes that match n levels of nesting, for arbitrary n.
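The patterns above can be tried from C# roughly as follows. This is a minimal sketch: the class and method names are made up for illustration, and the ^ and $ anchors are an added assumption so that the whole string has to be a paragraph number rather than merely contain one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParagraphNumbers
{
    // Any depth: a 1-2 digit number followed by one or more ".number" groups.
    // The ^ and $ anchors are an assumption; drop them to search inside longer text.
    private static readonly Regex AnyLevel = new Regex(@"^\d{1,2}(\.\d{1,2})+$");

    // Exactly two levels: one number, one dot, one number.
    private static readonly Regex TwoLevels = new Regex(@"^\d{1,2}\.\d{1,2}$");

    public static bool IsAnyLevel(string s) => AnyLevel.IsMatch(s);

    public static bool IsTwoLevels(string s) => TwoLevels.IsMatch(s);

    public static void Main()
    {
        foreach (var s in new[] { "1.0", "4.2.1", "12.23.54.18", "7", "1..2" })
        {
            Console.WriteLine($"{s}: any={IsAnyLevel(s)}, two={IsTwoLevels(s)}");
        }
    }
}
```

Without the anchors, Regex.IsMatch reports success as soon as any substring fits, which may be one reason an unanchored attempt appears to "work" on input it should reject.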

Try using this
(\d{1,2}[.])+\d{1,2}

Related

Using RegEx, what's the best way to capture groups of digits, ignoring any whitespace in them

Given the following string...
ABC DEF GHI: 319 022 6543 QRS : 531 450
I'm trying to extract all ranges that start/end with a digit, and which may contain whitespace, but I want that whitespace itself removed.
For instance, the above should yield two results (since there are two 'ranges' that match what I am looking for)...
3190226543
531450
My first thought was this, but this matches the spaces between the letters...
([\d\s])
Then I tried this, but it didn't seem to have any effect...
([\d+\s*])
This one comes close, but it's grabbing the trailing spaces too. Also, this grabs the whitespace, but doesn't remove it.
(\d[\d\s]+)
If it's impossible to remove the spaces in a single statement, I can always post-process the groups if I can properly extract them. That most recent statement comes close, but how do I say it doesn't end with whitespace, but only a digit?
So what's the missing expression? Also, since sometimes people just post an answer, it would be helpful to explain out the RegEx too to help others figure out how to do this. I for one would love not just the solution, but an explanation. :)
Note: I know there can be some variations between RegEx on different platforms so that's fine if those differences are left up to the reader. I'm more interested in understanding the basic mechanics of the regex itself more so than the syntax. That said, if it helps, I'm using both Swift and C#.

You cannot get rid of whitespace from inside the match value within a single match operation. You will need to remove spaces as a post-processing step.
To match a string that starts with a digit and then optionally contains any amount of digits or whitespaces and then a digit you can use
\d(?:[\d\s]*\d)?
Details:
\d - a digit
(?:[\d\s]*\d)? - an optional non-capturing group matching
[\d\s]* - zero or more whitespaces / digits
\d - a digit.
See the regex demo.
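The two steps described above (match first, then strip the whitespace) might look like this in C#. It is only a sketch; DigitRuns and Extract are hypothetical names chosen for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitRuns
{
    // Step 1: match each run that starts and ends with a digit.
    // Step 2: remove the whitespace from each matched value.
    public static string[] Extract(string input) =>
        Regex.Matches(input, @"\d(?:[\d\s]*\d)?")
             .Cast<Match>()
             .Select(m => Regex.Replace(m.Value, @"\s+", ""))
             .ToArray();

    public static void Main()
    {
        var runs = Extract("ABC DEF GHI: 319 022 6543 QRS : 531 450");
        Console.WriteLine(string.Join(", ", runs)); // 3190226543, 531450
    }
}
```

The greedy [\d\s]* first swallows the trailing space as well, then backtracks until the group can end on a digit, which is why the match never ends in whitespace.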

How to mask first 6 and last 4 digits for a credit card number in .net

I'm very new to regex, and I'm trying to use a regular expression to turn a credit card number, which will be part of a conversation, into something like 492900******2222
As it can come from any conversation it might contain string next to it or might have an inconsistent format, so essentially all of the below should be formatted to the example above:
hello my number is492900001111222
number is 4929000011112222ok?
4929 0000 1111 2222
4929-0000-1111-2222
It needs to be a regular expression which extracts the capture group of which I will then be able to use a MatchEvaluator to turn all digits (excluding non digits) which are not the first 6 and last 4 into a *
I've seen many examples here on stack overflow for PHP and JS but none which helps me resolve this issue.
Any guidance will be appreciated
UPDATE
I need to expand upon an existing implementation which uses MatchEvaluator to mask each character that is not one of the first 6 or last 4. Ideally I don't want to change the MatchEvaluator, just make the masking flexible based on the regular expression; see this for an example: https://dotnetfiddle.net/J2LCo0
UPDATE 2
@Matt.G's and @CAustin's answers do resolve what I asked for, but I am hitting another barrier where I can't have it be so strict. The final captured group needs to take only the digits into account, and as such maintain the format of the input text.
So for example:
If someone types in my card number is 99 9988 8877776666 the output from the evaluation should be 99 9988 ******6666
OR
my card number is 9999-8888-7777-6666 it should output 9999-88**-****-6666.
Is this possible?
Changed the list to include items that are in my unit tests https://dotnetfiddle.net/tU6mxQ

Try Regex: (?<=\d{4}\d{2})\d{2}\d{4}(?=\d{4})|(?<=\d{4}( |-)\d{2})\d{2}\1\d{4}(?=\1\d{4})
Explanation:
2 alternative regexes
(?<=\d{4}\d{2})\d{2}\d{4}(?=\d{4}) - to handle cardnumbers without any separators (- or <space>)
(?<=\d{4}( |-)\d{2})\d{2}\1\d{4}(?=\1\d{4}) - to handle cardnumbers with separator (- or <space>)
1st Alternative (?<=\d{4}\d{2})\d{2}\d{4}(?=\d{4})
Positive Lookbehind (?<=\d{4}\d{2}) - matches text that has 6 digits immediately behind it
\d{2} matches a digit (equal to [0-9])
{2} Quantifier — Matches exactly 2 times
\d{4} matches a digit (equal to [0-9])
{4} Quantifier — Matches exactly 4 times
Positive Lookahead (?=\d{4}) - matches text that is followed immediately by 4 digits
Assert that the Regex below matches
\d{4} matches a digit (equal to [0-9])
{4} Quantifier — Matches exactly 4 times
2nd Alternative (?<=\d{4}( |-)\d{2})\d{2}\1\d{4}(?=\1\d{4})
Positive Lookbehind (?<=\d{4}( |-)\d{2}) - matches text that has (4 digits followed by a separator followed by 2 digits) immediately behind it
1st Capturing Group ( |-) - get the separator as a capturing group, this is to check the next occurrence of the separator using \1
\1 matches the same text as most recently matched by the 1st capturing group (separator, in this case)
Positive Lookahead (?=\1\d{4}) - matches text that is followed by separator and 4 digits
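A rough C# sketch of how this pattern could be combined with a MatchEvaluator: the regex selects the middle section, and the evaluator turns only its digits into asterisks so that a separator inside the match survives. CardMaskStrict and Mask are names invented for the example, not taken from the linked fiddles.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CardMaskStrict
{
    // The two-alternative pattern from the answer: no separators at all,
    // or the same separator (space or hyphen) repeated via \1.
    private static readonly Regex Middle = new Regex(
        @"(?<=\d{4}\d{2})\d{2}\d{4}(?=\d{4})|(?<=\d{4}( |-)\d{2})\d{2}\1\d{4}(?=\1\d{4})");

    // The evaluator replaces digits only, leaving any separator untouched.
    public static string Mask(string text) =>
        Middle.Replace(text, m => Regex.Replace(m.Value, @"\d", "*"));

    public static void Main()
    {
        Console.WriteLine(Mask("4929000011112222"));    // 492900******2222
        Console.WriteLine(Mask("4929-0000-1111-2222")); // 4929-00**-****-2222
        Console.WriteLine(Mask("4929-0000 1111-2222")); // unchanged: separators differ
    }
}
```

Because of the \1 backreference, a number that mixes separators is left unmasked, which is the strictness the question's second update runs into.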

If performance is a concern, here's a pattern that only goes through 94 steps, instead of the other answer's 473, by avoiding lookaround and alternation:
\d{4}[ -]?\d{2}\K\d{2}[ -]?\d{4}
Demo: https://regex101.com/r/0XMluq/4
Edit: \K is not supported by .NET's regex flavor. In C#, the following pattern can be used instead, since .NET allows variable-length lookbehind.
(?<=\d{4}[ -]?\d{2})\d{2}[ -]?\d{4}
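For illustration, the lookbehind variant could be wired into Regex.Replace like this. This is a sketch under the assumption that masking only the digits of each match is what is wanted; CardMask and Mask are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CardMask
{
    // Variable-length lookbehind: 4 digits, an optional separator, 2 digits.
    private static readonly Regex Middle =
        new Regex(@"(?<=\d{4}[ -]?\d{2})\d{2}[ -]?\d{4}");

    // Mask digits only, so an optional separator inside the match is kept.
    public static string Mask(string text) =>
        Middle.Replace(text, m => Regex.Replace(m.Value, @"\d", "*"));

    public static void Main()
    {
        Console.WriteLine(Mask("number is 4929000011112222ok?")); // number is 492900******2222ok?
        Console.WriteLine(Mask("9999-8888-7777-6666"));           // 9999-88**-****-6666
    }
}
```

Unlike a pattern with a backreference, [ -]? is evaluated independently each time, so mixed separators are accepted here, and nothing checks that four digits follow the masked part.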

Repeating pattern matching with Regex

I am trying to validate an input with a regular expression. Up until now all my tests fail and as my experience with regex is limited I thought someone might be able to help me out.
Pattern: digit (possibly "," digit) (possibly ;)
A String may not begin with a ; and not end with a ;.
Digits are allowed to stand alone or with
My regEx (not working): ((\d)(,\d)?)(;?) the problem is it does not seem to check until the end of the string. Also the optional parts are giving me headaches.
Update: ^[0-9]+(,[0-9])?(;[0-9]+(,[0-9])?)+$ this seems to work better but it does not match the single digit.
OK:
2,3;4,4;3,2
2,3
2
2,3;3;4,3
NOK:
2,3,,,,
2,3asfafafa
;2,3
2,3;;3,4
2,3;3,4;

Your ^[0-9]+(,[0-9])?(;[0-9]+(,[0-9])?)+$ regex matches 1 or more digits, then an optional sequence of , and 1 digit, followed by one or more similar sequences, each preceded by a ;. That trailing + is why a single item such as 2 or 2,3 is rejected.
You need to match zero or more of those ;-separated sequences, so use * instead of + at the end:
^\d+(?:,\d+)?(?:;\d+(?:,\d+)?)*$
See the regex demo
Now, tweaking part:
If only single-digit numbers should be matched, use ^\d(?:,\d)?(?:;\d(?:,\d)?)*$
If the comma-separated number pairs can have the second element empty, add ? after each ,\d (if single digit numbers are to be matched) or * (if the numbers can have more than one digit): ^\d(?:,\d?)?(?:;\d(?:,\d?)?)*$ or ^\d+(?:,\d*)?(?:;\d+(?:,\d*)?)*$.
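As a quick check, the multi-digit pattern can be run against the OK and NOK samples from the question. A minimal sketch; PairListValidator and IsValid are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PairListValidator
{
    // number, optional ",number", then zero or more ";number[,number]" groups.
    private static readonly Regex Pattern =
        new Regex(@"^\d+(?:,\d+)?(?:;\d+(?:,\d+)?)*$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        string[] samples =
        {
            "2,3;4,4;3,2", "2,3", "2", "2,3;3;4,3",               // expected OK
            "2,3,,,,", "2,3asfafafa", ";2,3", "2,3;;3,4", "2,3;3,4;" // expected NOK
        };
        foreach (var s in samples)
        {
            Console.WriteLine($"{s} -> {IsValid(s)}");
        }
    }
}
```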

Regular Expression to match a group of alphanumerics followed by a group of spaces, making a fixed total of characters

I'm trying to write a regular expression using C#/.Net that matches 1-4 alphanumerics followed by spaces, followed by 10 digits. The catch is the number of spaces plus the number of alphanumerics must equal 4, and the spaces must follow the alphanumerics, not be interspersed.
I'm at a total loss as to how to do this. I can do ^[A-Za-z\d\s]{1,4}[\d]{10}$, but that lets the spaces fall anywhere in the first four characters. Or I could do ^[A-Za-z\d]{1,4}[\s]{0,3}[\d]{10}$ to keep the spaces together, but that would allow more than a total of four characters before the 10 digit number.
Valid:
A12B1234567890
AB1 1234567890
AB  1234567890
Invalid:
AB1  1234567890 (more than 4 characters before the numbers)
A1B1234567890 (less than 4 characters before the numbers)
A1 B1234567890 (space amidst the first 4 characters instead of at the end)

You can force the check with a look-behind (?<=^[\p{L}\d\s]{4}) that will ensure there are four allowed characters before the 10-digits number:
^[\p{L}\d]{1,4}\s{0,3}(?<=^[\p{L}\d\s]{4})\d{10}$
See demo
If you do not plan to support all Unicode letters, just replace \p{L} with [a-z] and use RegexOptions.IgnoreCase.
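A small C# sketch of the lookbehind pattern in use. FixedPrefixValidator and IsValid are made-up names, and the sample strings assume that the prefix is always padded with spaces to exactly four characters, as the question describes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedPrefixValidator
{
    // 1-4 letters/digits, up to 3 spaces, then a lookbehind that insists
    // exactly four allowed characters sit between the start and this point.
    private static readonly Regex Pattern =
        new Regex(@"^[\p{L}\d]{1,4}\s{0,3}(?<=^[\p{L}\d\s]{4})\d{10}$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsValid("A12B1234567890"));  // True
        Console.WriteLine(IsValid("AB  1234567890"));  // True  (two spaces)
        Console.WriteLine(IsValid("AB1  1234567890")); // False (five characters before the digits)
        Console.WriteLine(IsValid("A1 B1234567890"));  // False (space inside the prefix)
    }
}
```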

Here's the regex you need:
^(?=[A-Za-z0-9 ]{4}\d{10}$)[A-Za-z0-9]{1,4} *\d{10}$
It uses a lookahead (?= ) to test whether the string consists of 4 chars, each either alnum or space, followed by 10 digits, and then it goes back to where it was (the beginning of the string, not consuming any chars).
Once that condition is met, the rest is an expression quite similar to what you were trying ([A-Za-z0-9]{1,4} *\d{10}).
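The lookahead version can be exercised the same way. Again only a sketch, with hypothetical names (LookaheadPrefixValidator, IsValid) and sample strings that assume a space-padded four-character prefix.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadPrefixValidator
{
    // The lookahead fixes the overall shape (4 prefix chars + 10 digits);
    // the main pattern then forces the spaces to come after the alphanumerics.
    private static readonly Regex Pattern =
        new Regex(@"^(?=[A-Za-z0-9 ]{4}\d{10}$)[A-Za-z0-9]{1,4} *\d{10}$");

    public static bool IsValid(string s) => Pattern.IsMatch(s);

    public static void Main()
    {
        Console.WriteLine(IsValid("AB1 1234567890")); // True
        Console.WriteLine(IsValid("A1B1234567890"));  // False (only three characters before the digits)
        Console.WriteLine(IsValid("A1 B1234567890")); // False (space inside the prefix)
    }
}
```

The lookahead alone would accept "A1 B" as a prefix; it is the consuming part after it that rejects interspersed spaces.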

I know this is dumb, but must work exactly as required.
^[A-Za-z\d]([A-Za-z\d]{3}|[A-Za-z\d]{2}\s|[A-Za-z\d]\s{2}|\s{3})[\d]{10}$

Not sure what you are looking for, but perhaps:
^(?=.{14}$)[A-Za-z0-9]{1,4} *\d{10}$

Try this:
Doesn't allow char/space/char combination and starts with a char:
/\b(?!\w\s{1,2}\w+)\w(\w|\s){3}\d{10}/gm
https://regex101.com/r/fF2tR8/2

Regular Expression to deny input of repeated characters

I want a regular expression that allows the user to enter the following values: a minimum of four and a maximum of 30 characters, and the first character should be upper case.
Eg: John, Smith, Anderson, Emma
And I don't want the user to input the following types of values
Jooohnnnnnn, Smmmmith, Aaaanderson, Emmmmmmmmma
Can anyone provide me with a regular expression? I searched for quite some time but couldn't find a working RegEx.
I need it for my ASP.net MVC application Model validation.
Thanks
Edited: I don't know how to check for repeated characters. I just tried the following:
@"^[A-Z]{1}[a-zA-Z ]{2,29}$"
The rules that I would like to add are
1. First character Upper case
2. 4-30 characters
3. No character repeated more than 2 times in a row

To perform a check on your regex you can use a negative look ahead:
^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$
The look ahead (?!...) will fail the whole regex if what's inside it matches.
To look for repeated patterns, we use a capture group: (.)\1{2}. We capture the first character, then check if it is followed by (at least) two identical characters with the backreference \1.
See demo here.
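To see the negative lookahead at work, the pattern can be checked against the names from the question. A minimal sketch; NameValidator and IsValid are names invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidator
{
    // (?!.*(.)\1{2}) rejects any character that occurs three times in a row;
    // the rest enforces an upper-case first letter and 4-30 characters in total.
    private static readonly Regex Pattern =
        new Regex(@"^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$");

    public static bool IsValid(string name) => Pattern.IsMatch(name);

    public static void Main()
    {
        foreach (var name in new[] { "John", "Emma", "Anderson", "Jooohnnnnnn", "Smmmmith", "Jon" })
        {
            Console.WriteLine($"{name} -> {IsValid(name)}");
        }
    }
}
```

Note that "Emma" passes because a doubled letter is allowed; only a run of three or more identical characters fails.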

Here is what you are looking for:
^ (?# Starting of name)
(?=[A-Z]) (?# Ensure it starts with capital A-Z without consuming the text)
(?i:([a-z]) (?# Following letters ignoring case)
(?!\1{2,}) (?# Letter cant be followed by previous letter more than twice)
){4,30} (?# Allow condition to be repeated 4 to 30 times)
$
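In C#, the commented layout above would need RegexOptions.IgnorePatternWhitespace so that the line breaks and spaces are ignored. The sketch below instead uses a compact single-line form of the same idea; the lower bound of 4 is an assumption taken from the question's 4-character minimum, and RepeatGuard and IsValid are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatGuard
{
    // (?=[A-Z])     : first character must be upper case (checked, not consumed)
    // (?i:([a-z])   : consume one letter, case-insensitively, into group 1
    // (?!\1{2,}))   : that letter must not be followed by two more of itself
    // {4,30}        : assumed bounds, 4 to 30 letters in total
    private static readonly Regex Pattern =
        new Regex(@"^(?=[A-Z])(?i:([a-z])(?!\1{2,})){4,30}$");

    public static bool IsValid(string name) => Pattern.IsMatch(name);

    public static void Main()
    {
        foreach (var name in new[] { "John", "Emma", "Emmma", "Jooohn", "john" })
        {
            Console.WriteLine($"{name} -> {IsValid(name)}");
        }
    }
}
```

Unlike a character class that includes a space, this form accepts letters only, so multi-word names would be rejected.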
