C# Regex for a username with a few restrictions - c#

Similar to this topic.
I am trying to validate a username with the following restrictions:
Must start with a letter or number
Must be 3 to 15 characters in length
Symbols include: . - _ ( ) [ ]
Symbols cannot be adjacent, but letters and numbers can
Edit:
Letters and numbers are a-z A-Z 0-9
Been stumped for a while. I'm new to regex.

As an optimization to Mark's answer:
^(?=.{3,15}$)([A-Za-z0-9][._()\[\]-]?)*$
Explanation:
(?=.{3,15}$) Must be 3-15 characters in the string
([A-Za-z0-9][._()\[\]-]?)* The string is a sequence of alphanumerics,
each of which may be followed by a symbol
This one permits Unicode alphanumerics:
^(?=.{3,15}$)((\p{L}|\p{N})[._()\[\]-]?)*$
This one is the Unicode variant, plus uses non-capturing groups:
^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)*$
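As a rough sketch of how the ASCII pattern above might be wired into C#: the class and method names here (UsernameRegex, IsValid) are made up for illustration and are not part of the answer.

```csharp
using System.Text.RegularExpressions;

public static class UsernameRegex
{
    // Pattern taken from the answer above:
    //   (?=.{3,15}$)  -> the whole string is 3 to 15 characters long
    //   (...)*        -> a run of alphanumerics, each optionally followed by one symbol
    private static readonly Regex Pattern = new Regex(
        @"^(?=.{3,15}$)([A-Za-z0-9][._()\[\]-]?)*$");

    // Hypothetical helper: true when the username satisfies the pattern.
    public static bool IsValid(string username) => Pattern.IsMatch(username);
}
```

Because every symbol has to follow an alphanumeric, the same group enforces both "starts with a letter or number" and "no adjacent symbols".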

It is not so clean to express a set of unrelated rules in a single regular expression, but it can be done by using lookaround assertions (Rubular):
@"^(?=[A-Za-z0-9])(?!.*[._()\[\]-]{2})[A-Za-z0-9._()\[\]-]{3,15}$"
Explanation:
(?=[A-Za-z0-9]) Must start with a letter or number
(?!.*[._()\[\]-]{2}) Cannot contain two consecutive symbols
[A-Za-z0-9._()\[\]-]{3,15} Must consist of 3 to 15 allowed characters
You might want to consider whether this would be easier to read and more maintainable as a list of simpler regular expressions, all of which must validate successfully, or else write it in ordinary C# code.
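For comparison, here is one way the "ordinary C# code" option could look. This is only a sketch under the rules stated in the question; the names UsernameRules, Symbols and IsValid are hypothetical.

```csharp
public static class UsernameRules
{
    // The symbols allowed by the question: . - _ ( ) [ ]
    private const string Symbols = "._()[]-";

    private static bool IsAsciiLetterOrDigit(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');

    public static bool IsValid(string name)
    {
        // Rule: 3 to 15 characters in length.
        if (name == null || name.Length < 3 || name.Length > 15) return false;

        // Rule: must start with a letter or number.
        if (!IsAsciiLetterOrDigit(name[0])) return false;

        for (int i = 0; i < name.Length; i++)
        {
            bool isSymbol = Symbols.IndexOf(name[i]) >= 0;

            // Rule: only a-z, A-Z, 0-9 and the listed symbols are allowed.
            if (!isSymbol && !IsAsciiLetterOrDigit(name[i])) return false;

            // Rule: symbols cannot be adjacent.
            if (isSymbol && i > 0 && Symbols.IndexOf(name[i - 1]) >= 0) return false;
        }
        return true;
    }
}
```

Each rule sits on its own line, so changing one (say, the length limit) does not require re-reading the others.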

Related

Underscore in regex not validating

How do I add underscore as a part of my regex string.
Here is my string that checks for uppercase, lowercase, numbers and special characters. The rest of the special characters work. Validation isn't working for underscores.
@"^[^\s](?=(.*[A-Za-z]){1,})(?=(.*[\d]){1,})(?=(.*[\W]){1,})(?=(.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~]{1,})).*[^\s]$"
Any ideas?
Thanks
This is the regex that AWS Cognito uses; it should apply to your situation:
@"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[\^$*.\[\]{}\(\)?\-“!@#%&\/,><’:;|_~`])\S{8,99}$"
You can check regexes at http://regexstorm.net; it's faster than rebuilding your application every time.
I've approached it like this: I took your requirements and made them into separate positive lookaheads:
Check for:
uppercase (?=.*[A-Z])
lowercase (?=.*[a-z]) (note that I broke A-Z and a-z up into separate groups)
numbers (?=.*\d)
special characters (?=.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~])
You can then combine them in any order and I've combined them in the same order as I listed them above and anchored it with the beginning of the line using ^. Don't add any extra matches before, in-between or after the groups in your requirement that could cause the regex to enforce a certain ordering of the groups:
The lookahead for a non-word character, \W, makes it impossible to match Underscore1_: \W matches only characters other than letters, digits and underscores, and Underscore1_ consists entirely of letters, digits and an underscore.
The starting [^\s] (and ending [^\s]) that consumes one character is likely destroying a lot of good matches. Underscore1_ or _1scoreUnder shouldn't matter, but if you start with _ and consume it with [^\s] like you do, the later lookahead for a special character will fail (unless you have a second special character in the password).
@"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~])"
If you have a minimum length requirement of, say, 7 characters, you just have to add .{7,}$ to the end of the regex, making it:
@"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~]).{7,}$"
Without a minimum length, a password of one character from each group will be enough, and since there are 4 groups, a password with only 4 characters will pass the filter.
I see no point in putting an upper length limit into the regex. If the user interface has accepted a string that is thousands of characters long, then why reject it for being too long later? The length of what you store is probably going to be much smaller anyway since you'll be storing the bcrypt/scrypt/argon2/... encoded password.
Suggestion: Also add space (or even whitespaces) to the list of special characters.
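A small sketch of how the lookahead-based pattern with the 7-character minimum might be used from C#. The PasswordPolicy class and IsValid method are hypothetical names; the pattern is the one from the answer, written as a verbatim string (so the double quote inside the character class is doubled).

```csharp
using System.Text.RegularExpressions;

public static class PasswordPolicy
{
    // One independent lookahead per requirement, then a minimum length of 7.
    // The order of the lookaheads does not matter, because none of them consumes input.
    private static readonly Regex Pattern = new Regex(
        @"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~]).{7,}$");

    // Hypothetical helper: true when the password meets every requirement.
    public static bool IsValid(string password) => Pattern.IsMatch(password);
}
```

With this pattern both Underscore1_ and _1scoreUnder are accepted, since the position of the special character no longer matters.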
In your regex, add an underscore to the character class in the third lookahead (regex101):
@"^[^\s](?=(.*[A-Za-z]){1,})(?=(.*[\d]){1,})(?=(.*[\W_]){1,})(?=(.*[!@#$%^&*()-+=\[{\]};:<>|_.\\/?,\-`'""~]{1,})).*[^\s]$"

Match vocabulary words and phrases

I am writing an application that takes a vocabulary word or phrase as an input parameter. I am having trouble writing validation logic for this parameter's value!
These are the rules I've come up with:
can be up to 4 words (with hyphens or not)
one apostrophe is allowed
only regular letters are allowed (no special characters like !@#$%^&*()={}[]"";|/>/? ¶ © etc)
numbers are disallowed
case insensitive
multiple languages support (English, Russian, Norwegian, etc.) (so Unicode letters, including Cyrillic, must be supported)
either whole string matches or nothing
Few examples (in 3 languages):
// match:
one two three four
one-two-three-four
one-two-three four
vær så snill
тест регекс
re-read
under the hood
ONe
rabbit's lair
// not-match:
one two three four five
one two three four#
one-two-three-four five
rabbit"s lair
one' two's
one1
1900
Given the expected result provided above - could someone point me to right direction on how to create a validation rule like that? If that matters - I will be writing validation logic in C# so I have more tools than just Regex available at my disposal.
If that is going to be of any help - I have been testing several solutions, like these ^[\p{Ll}\p{Lt}]+$ and (?=\S*['-])([a-zA-Z'-]+)$. The first regex seems to be doing a great job allowing just the letters I need (En, No and Rus), whereas the second rule set is doing great in using the Lookahead concept.
\p{Ll} or \p{Lowercase_Letter}: a lowercase letter that has an uppercase variant.
\p{Lu} or \p{Uppercase_Letter}: an uppercase letter that has a lowercase variant.
\p{Lt} or \p{Titlecase_Letter}: a letter that appears at the start of a word when only the first letter of the word is capitalized.
\p{L&} or \p{Letter&}: a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).
\p{Lm} or \p{Modifier_Letter}: a special character that is used like a letter.
\p{Lo} or \p{Other_Letter}: a letter or ideograph that does not have lowercase and uppercase variants.
Needless to say, neither of the solutions I have been testing takes into account all the rules I defined above.
You can use
\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z
See the regex demo. Details:
\A - start of string
(?!(?:[^']*'){2}) - the string cannot contain two apostrophes
\p{L}+ - one or more Unicode letters
(?:[\s'-]\p{L}+){0,3} - zero to three occurrences of
[\s'-] - a whitespace, ' or - char
\p{L}+ - one or more Unicode letters
\z - the very end of string.
In C#, you can use it as
var IsValid = Regex.IsMatch(text, @"\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z");
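To check the pattern against the examples from the question, something along these lines could be used. It is a sketch only; PhraseValidator and IsValid are hypothetical names wrapped around the pattern from the answer.

```csharp
using System.Text.RegularExpressions;

public static class PhraseValidator
{
    // \A ... \z anchor the whole string; the negative lookahead rejects a second apostrophe;
    // \p{L}+ matches one word of Unicode letters; up to three further words may follow,
    // each introduced by a single whitespace, apostrophe or hyphen.
    private static readonly Regex Pattern = new Regex(
        @"\A(?!(?:[^']*'){2})\p{L}+(?:[\s'-]\p{L}+){0,3}\z");

    // Hypothetical helper: true when the text is an acceptable word or phrase.
    public static bool IsValid(string text) => Pattern.IsMatch(text);
}
```

Note that the apostrophe counts as one of the three separators here, so a four-word phrase that also contains an apostrophe would be rejected by this pattern.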

Regular Expression for given scenario

I already have an email address regular expression in RFC 2822 format
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
but want to modify it to include the following new conditions:
at least one full stop
at least one @ character
no consecutive full stops
must not start/end with special characters i.e. should only start/end with [0-9a-zA-Z]
should still follow RFC specification for regular expression rules.
Currently the above one allows the email to start with special characters. It also allows two consecutive full stops (except in the domain name, which is fine: test@test..com fails, and that is correct).
Thanks.
^[a-zA-Z0-9]+(?:\.?[\w!#$%&'*+/=?^`{|}~\-]+)*@[a-zA-Z0-9](?:\.?[\w\-]+)+\.[A-Za-z0-9]+$
No .. and at least 1 . and 1 @.
Also starts/ends with letters/numbers.
The ^ (start) and $ (end) were just added to match a whole string, not just a substring. But you could replace those by a word boundary \b.
An alternative where the special characters aren't hardcoded:
^(?!.*[.]{2})[a-zA-Z0-9][^@\s]*?@[a-zA-Z0-9][^@\s]*?\.[A-Za-z0-9]+$
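A quick way to try the alternative pattern from C# might look like the sketch below, assuming the separator between the local part and the domain is the @ character. EmailCheck and IsValid are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class EmailCheck
{
    // (?!.*[.]{2})  -> no two consecutive full stops anywhere
    // [a-zA-Z0-9]   -> local part and domain each start with a letter or digit
    // [^@\s]*?      -> anything except @ and whitespace
    // \.[A-Za-z0-9]+$ -> the address ends with a dot followed by letters or digits
    private static readonly Regex Pattern = new Regex(
        @"^(?!.*[.]{2})[a-zA-Z0-9][^@\s]*?@[a-zA-Z0-9][^@\s]*?\.[A-Za-z0-9]+$");

    // Hypothetical helper: true when the address satisfies the added conditions.
    public static bool IsValid(string address) => Pattern.IsMatch(address);
}
```

This variant deliberately does not enumerate the special characters, so it is looser than the RFC-style pattern about what may appear inside the local part.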

Regular Expression to deny input of repeated characters

I want a regular expression which allows the user to enter the following values: a minimum of four and a maximum of 30 characters, and the first character should be upper case.
Eg: John, Smith, Anderson, Emma
And I don't want the user to input the following types of values
Jooohnnnnnn, Smmmmith, Aaaanderson, Emmmmmmmmma
Can anyone provide me with a regular expression? I searched for quite some time but couldn't find a working regex.
I need it for my ASP.net MVC application Model validation.
Thanks
Edited: I don't know how to check for repeated characters I just tried the following
@"^[A-Z]{1}[a-zA-Z ]{2,29}$"
The rules that I would like to add are
1. First character Upper case
2. 4-30 characters
3. No repeats of characters. Not greater than 2
To perform a check on your regex you can use a negative look ahead:
^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$
The look ahead (?!...) will fail the whole regex if what's inside it matches.
To look for repeated patterns, we use a capture group: (.)\1{2}. We capture the first character, then check if it is followed by (at least) two identical characters with the backreference \1.
See demo here.
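The negative lookahead pattern above could be exercised from C# roughly as follows. This is a sketch; NameCheck and IsValid are hypothetical names, and in an ASP.NET MVC model the same pattern string could presumably be supplied to a validation attribute instead.

```csharp
using System.Text.RegularExpressions;

public static class NameCheck
{
    // (?!.*(.)\1{2})     -> fail if any character is followed by two more copies of itself
    // [A-Z]              -> first character is upper case
    // [a-zA-Z ]{3,29}    -> 3 to 29 further letters or spaces (4 to 30 characters in total)
    private static readonly Regex Pattern = new Regex(
        @"^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$");

    // Hypothetical helper: true when the name follows all three rules.
    public static bool IsValid(string name) => Pattern.IsMatch(name);
}
```

The backreference comparison is case-sensitive by default, so a capital letter followed by two lowercase copies of itself is not counted as a triple.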
Here is what you are looking for (written in free-spacing style, so in .NET it needs RegexOptions.IgnorePatternWhitespace):
^ (?# Start of name)
(?=[A-Z]) (?# Ensure it starts with a capital A-Z without consuming the text)
(?i:([a-z]) (?# Following letters, ignoring case)
(?!\1{2,}) (?# The letter can't be followed by two or more copies of itself)
){4,30} (?# Allow the condition to be repeated 4 to 30 times)
$

c# Regex question: only letters, numbers and a dot (2 to 20 chars) allowed

I am wrestling with my regex.
I want to allow only letters and numbers and a dot in a username, and 2 to 20 chars long
I thought of something like this
[0-9a-zA-Z]{2,20}
but then 21 chars is also ok, and that's not what I want
I suggest that you make two checks -- one for length and one for content based on the fact that you probably only want one dot in the name, rather than any number of dots. I'll assume that names like username and user.name are the only formats allowed.
This should check the content (but it allows underscores as well):
^\w+(\.\w+)?$
If you don't want underscores, then you would use [0-9a-zA-Z]+ in place of \w+. To explain, it will match any string that consists of one or more word characters, followed by exactly 0 or 1 of a dot followed by one or more word characters. It must also match the beginning and end of the string, i.e., no other characters are allowed in the string.
Then you only need to get the length with a simple length check.
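The two-check approach described above might be sketched like this, using the variant without underscores. DottedUsername and IsValid are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class DottedUsername
{
    // Content check: letters and digits, optionally followed by one dot and more letters/digits.
    private static readonly Regex Content = new Regex(
        @"^[0-9a-zA-Z]+(\.[0-9a-zA-Z]+)?$");

    public static bool IsValid(string name)
    {
        // Length check kept outside the regex: 2 to 20 characters.
        if (name == null || name.Length < 2 || name.Length > 20) return false;

        return Content.IsMatch(name);
    }
}
```

Keeping the length test in plain code means the regex only has to describe the shape of the name.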
^[0-9a-zA-Z\.]{2,20}$
Try ^[\w\.]{2,20}$ instead.
You need to use start and end of string (^ and $), and escape the .:
^[0-9a-zA-Z\.]{2,20}$
