Regular expression one letter followed by at least one numerical value - c#

I am trying to create a regular expression that matches the following:
The name starts with a letter and is followed by at least 1 numerical value. For example, this should be valid: "w1234.pdf", but not this: "ww1234.pdf". So far I only have this:
^[a-zA-Z][0-9]{1}?$

For the requirement "The name starts with a letter and is followed by at least 1 numerical value", you can try the pattern
^[a-zA-Z][0-9]
Explanation:
^ starting anchor
[a-zA-Z] single letter
[0-9] followed by a digit
Please note that this accepts a wider class of names than the examples in the question, e.g. a1x456.txt, which starts with a letter (a) followed by at least one numerical value (1).

Your regex matches one character in the range [a-zA-Z], followed by one digit. Note that {1}? is useless in this case, since [0-9] already matches one digit.
Change your regex to:
^[a-zA-Z][0-9].+$
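if you want to match the whole name. This requires a letter, then a digit, then at least one more character of any kind (for example the rest of "w1234.pdf"), so a two-character name such as "w1" does not match.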
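A minimal sketch for comparing the three patterns quoted above on the names from the question. It uses Python's re module as a stand-in for .NET's Regex (these particular patterns use only syntax common to both), and the names ASKER, PREFIX_ONLY, WITH_TAIL and accepts are hypothetical helpers introduced for illustration.

```python
import re

# The asker's pattern and the patterns from the two answers.
ASKER = re.compile(r"^[a-zA-Z][0-9]{1}?$")
PREFIX_ONLY = re.compile(r"^[a-zA-Z][0-9]")
WITH_TAIL = re.compile(r"^[a-zA-Z][0-9].+$")


def accepts(pattern, name):
    """Return True if the pattern matches somewhere in name (like Regex.IsMatch)."""
    return pattern.search(name) is not None


if __name__ == "__main__":
    for name in ["w1234.pdf", "ww1234.pdf", "w1", "a1x456.txt"]:
        print(name, accepts(ASKER, name), accepts(PREFIX_ONLY, name), accepts(WITH_TAIL, name))
```

The asker's pattern only accepts a name that is exactly one letter plus one digit, which is why "w1234.pdf" is rejected by it.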

Related

C# Regex specify allowed start and end conditions

I'm trying to create a regex expression with the following requirements:
The value:
Must start with a-z or _, numbers are OK after the first character
Can have parentheses if they are opened and closed with a number inside, at the end of the string, e.g. SomeVar(10) is OK, SomeVar(10 is not OK.
Can have a . but only one at a time, and only between letters or numbers. SomeVar.InnerVar is OK, SomeVar..Innevar is not OK.
My try at the regex:
[a-zA-Z_]
??
??
Assuming you want to match an entire string, you may use something like the following:
^[a-zA-Z_](?:\w|(?<=\w)\.(?=\w))*(?:\(\d+\))?$
Demo.
If you want to match partial strings, you'd need to decide what boundaries are allowed. Otherwise, "SomeVar(10" would produce a match (namely the part before the opening parenthesis, "SomeVar").
Notes:
\w matches a lowercase/uppercase letter, a digit, or an underscore. But it also matches Unicode letters and numbers. If you don't want that, you could use [a-zA-Z0-9_] instead.
Similarly, \d matches any Unicode digit. You either use it or use [0-9] depending on your requirements.
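As a rough check of the pattern above and of the note about Unicode, here is a small sketch in Python (used as a stand-in for .NET; in Python 3, \w and \d are also Unicode-aware by default). The ASCII_ONLY variant is an assumption: it is one way to apply the suggested [a-zA-Z0-9_] and [0-9] substitutions, not a pattern given in the answer. The helper name is_valid is hypothetical.

```python
import re

# Pattern from the answer, with Unicode-aware \w and \d.
UNICODE_AWARE = re.compile(r"^[a-zA-Z_](?:\w|(?<=\w)\.(?=\w))*(?:\(\d+\))?$")

# Assumed ASCII-only variant: \w -> [a-zA-Z0-9_], \d -> [0-9].
ASCII_ONLY = re.compile(
    r"^[a-zA-Z_](?:[a-zA-Z0-9_]|(?<=[a-zA-Z0-9_])\.(?=[a-zA-Z0-9_]))*(?:\([0-9]+\))?$"
)


def is_valid(pattern, value):
    """Return True if the whole value matches the anchored pattern."""
    return pattern.search(value) is not None


if __name__ == "__main__":
    for value in ["SomeVar(10)", "SomeVar(10", "SomeVar.InnerVar", "SomeVar..Innevar", "SomeV\u00e4r"]:
        print(value, is_valid(UNICODE_AWARE, value), is_valid(ASCII_ONLY, value))
```

The two variants differ only on input containing non-ASCII letters or digits.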
Use
^[a-zA-Z_][a-zA-Z0-9_]*(\.[a-zA-Z_][a-zA-Z0-9_]*)*(\([^()]*\))?$
See proof.
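[a-zA-Z_][a-zA-Z0-9_]* - a letter or underscore, then zero or more letters, digits, or underscores
(\.[a-zA-Z_][a-zA-Z0-9_]*)* - zero or more further segments, each a single dot followed by a letter or underscore and then any letters, digits, or underscores, so two dots in a row are rejected
(\([^()]*\))? - optional trailing group: an opening parenthesis, any characters other than parentheses, and a closing parenthesis. Note that this accepts any content between the parentheses, not only digits; use \(\d+\) instead to require a number.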
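The behaviour of this second pattern can be checked with a short sketch like the following, written in Python as a stand-in for .NET. SEGMENTS and is_valid are hypothetical names. The extra inputs "SomeVar(abc)" and "SomeVar.1x" are not from the question; they are added here to show where this pattern is looser or stricter than the stated requirements.

```python
import re

# Pattern from the answer: identifier segments separated by single dots,
# optionally followed by one parenthesised group at the end.
SEGMENTS = re.compile(
    r"^[a-zA-Z_][a-zA-Z0-9_]*(\.[a-zA-Z_][a-zA-Z0-9_]*)*(\([^()]*\))?$"
)


def is_valid(value):
    """Return True if the whole value matches SEGMENTS."""
    return SEGMENTS.search(value) is not None


if __name__ == "__main__":
    for value in ["SomeVar(10)", "SomeVar(10", "SomeVar.InnerVar",
                  "SomeVar..Innevar", "SomeVar(abc)", "SomeVar.1x"]:
        print(value, is_valid(value))
```

Because [^()]* accepts anything except parentheses, "SomeVar(abc)" is accepted, and because each segment after a dot must start with a letter or underscore, "SomeVar.1x" is rejected.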

Regex pattern for search first letters of the first and last name

I have a problem with regex pattern. Every day I get names and surnames. Example:
Darkholme Van Tadashi
Herrington Billy Aniki
Johny
Walker Sam Cooler
etc..
The fact is that they are specific and do not consist of just one last name and first name.
From this list, I need to select one person (whose last name and first name I know). To do this, I found pattern:
"Darkholme|\b[vt]"
As I said, I know the person's data in advance (before the list arrives). But I only know his last name. The second and third names (Van Tadashi) are unknown to me, I only know the first letters of these names ("V" and "T"). I ran into this problem: when regex analyzes incoming data (I use regex.ismatch), it returns true if the input string is "Van Dungeonmaster". How do I create a pattern that will only return true if the surname=Darkholme, first letters of the second and third names match (=V and T)?
Perhaps I'm not making myself clear.. But in the end, it should turn out that I passed only the last name and the first letters of the first name and patronymic to pattern, and regex gave a match for input string.
If there is a comma present and the names can start with either V or T where the third name can be optional, you could use an optional group matching any non whitespace char except a comma.
\bDarkholme\s+[VT][^\s,]+(?:\s+[VT][^\s,]+)?
\b Word boundary, to prevent Darkholme being part of a larger word
Darkholme Match literally
\s+[VT] Match 1+ whitespace chars followed by either V or T
[^\s,]+ Match 1+ times any char except a whitespace char or comma
(?: Non capture group
\s+[VT] Match 1+ whitespace chars followed by either V or T
[^\s,]+ Match 1+ times any char except a whitespace char or comma
)? Close the group to make the 3rd part optional
.NET regex demo
If you know that the name starts with V for the second and T for the third:
\bDarkholme\s+V[^\s,]+(?:\s+T[^\s,]+)?
.NET regex demo
If a name can also be a single V or T, change the quantifier to an asterisk: [^\s,]*
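A quick sketch for trying these variants on sample input, in Python as a stand-in for .NET's Regex.IsMatch. The constant names are hypothetical. The asker's pattern is compiled with ignore-case here, which is an assumption: the question reports a match on "Van Dungeonmaster" even though the class is [vt] in lower case.

```python
import re

# Asker's pattern; IGNORECASE is an assumption (see the note above).
ASKER = re.compile(r"Darkholme|\b[vt]", re.IGNORECASE)
# Second and third names may start with V or T, in either order.
EITHER = re.compile(r"\bDarkholme\s+[VT][^\s,]+(?:\s+[VT][^\s,]+)?")
# Second name starts with V, optional third name starts with T.
ORDERED = re.compile(r"\bDarkholme\s+V[^\s,]+(?:\s+T[^\s,]+)?")
# Same, but with * so that a bare initial such as "V" is accepted.
ORDERED_STAR = re.compile(r"\bDarkholme\s+V[^\s,]*(?:\s+T[^\s,]*)?")

if __name__ == "__main__":
    for line in ["Darkholme Van Tadashi", "Van Dungeonmaster",
                 "Darkholme V Tadashi", "Darkholme Van Dungeonmaster"]:
        found = ORDERED.search(line)
        print(repr(line), found.group(0) if found else None)
```

Since the third part is optional, "Darkholme Van Dungeonmaster" still produces a match (on "Darkholme Van"); if the third initial must be checked, that group would have to be made mandatory.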
Your pattern as is means "match any string that contains Darkholme, or any string in which some word starts with a v or a t", which isn't quite what you want.
Perhaps
Darkholme\s+V\S*\s+T
Would suit you better. It means "Darkholme, followed by at least one whitespace character, then V, followed by any number of non-whitespace characters, then at least one whitespace character, followed by T".
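For comparison, a small sketch of this last pattern in Python (a stand-in for .NET; REQUIRED_THIRD is a hypothetical name). Unlike an optional trailing group, it requires a third name starting with T, and the match ends at that T.

```python
import re

# Surname, a second name starting with V, and a third name starting with T.
REQUIRED_THIRD = re.compile(r"Darkholme\s+V\S*\s+T")

if __name__ == "__main__":
    for line in ["Darkholme Van Tadashi", "Darkholme Van",
                 "Van Dungeonmaster", "Darkholme Van Dungeonmaster"]:
        found = REQUIRED_THIRD.search(line)
        print(repr(line), found.group(0) if found else None)
```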

Regular expression to accept 5 characters with optional

Hi, I have to write a regular expression that matches a format like A12BC. The first 2 characters (A and 1) are mandatory and the next 3 characters (2, B and C) are optional. Currently my regex works if I give the string value A12BC.
When I give the input A1B it should not match, but my regular expression matches and reports success. Can anyone please help me modify my regex so that it behaves as described below:
Case "A1" : Should match
Case "A1B" : Should not match (this case is not working)
Case "A12B" : Should match
Case "A12BC" : Should match
Case "A12BCD" : Should not match
My regular expression is as below:
^[a-zA-Z][0-9][0-9]?[a-zA-Z]?[a-zA-Z]?$
To make sure that the third character, if present, is a digit, make the third character mandatory inside an optional group, like this:
^[a-zA-Z][0-9]([0-9][a-zA-Z]?[a-zA-Z]?)?$
This expression says that if the third character is present, it needs to match a digit. The two trailing letters are optional, too.
Note: you can shorten the expression by using the predefined character class \d for digits. \w is not a drop-in replacement for [a-zA-Z], because it also matches digits and the underscore. Remember that you need to double backslashes in "plain" string literals (as opposed to verbatim string literals, in which backslashes are not doubled).
You can use:
^[a-zA-Z][0-9](?:[0-9][a-zA-Z]{0,2})?$
In the 2BC pattern, you have to make the digit mandatory before allowing zero, one or two letters.
(?:[0-9][a-zA-Z]{0,2})? matches an empty string, or a digit, or a digit followed by a letter, or a digit followed by two letters, but not a single letter.
(?:...) is a non capturing group, see demo here.
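The five cases from the question can be run against the original pattern and both suggested fixes with a sketch like this one, in Python as a stand-in for .NET. The names ORIGINAL, NESTED, COMPACT, CASES and matches are hypothetical.

```python
import re

ORIGINAL = re.compile(r"^[a-zA-Z][0-9][0-9]?[a-zA-Z]?[a-zA-Z]?$")
NESTED = re.compile(r"^[a-zA-Z][0-9]([0-9][a-zA-Z]?[a-zA-Z]?)?$")
COMPACT = re.compile(r"^[a-zA-Z][0-9](?:[0-9][a-zA-Z]{0,2})?$")

# Expected results, taken from the cases listed in the question.
CASES = {"A1": True, "A1B": False, "A12B": True, "A12BC": True, "A12BCD": False}


def matches(pattern, value):
    """Return True if the anchored pattern matches the whole value."""
    return pattern.search(value) is not None


if __name__ == "__main__":
    for value, expected in CASES.items():
        print(value, expected, matches(ORIGINAL, value),
              matches(NESTED, value), matches(COMPACT, value))
```

Only the "A1B" row differs: the original pattern accepts it because the optional digit can be skipped while an optional letter still matches.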

Regular Expression to deny input of repeated characters

I want a regular expression which allows the users to enter the following values: a minimum of four and a maximum of 30 characters, and the first character should be upper case.
Eg: John, Smith, Anderson, Emma
And I don't want the user to input the following types of values
Jooohnnnnnn, Smmmmith, Aaaanderson, Emmmmmmmmma
Can anyone provide me with a regular expression? I searched for quite some time but couldn't find a working regex.
I need it for my ASP.net MVC application Model validation.
Thanks
Edited: I don't know how to check for repeated characters. I just tried the following:
@"^[A-Z]{1}[a-zA-Z ]{2,29}$"
The rules that I would like to add are
1. First character Upper case
2. 4-30 characters
3. No character repeated more than twice in a row
To perform a check on your regex you can use a negative look ahead:
^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$
The look ahead (?!...) will fail the whole regex if what's inside it matches.
To look for repeated patterns, we use a capture group: (.)\1{2}. We capture the first character, then check if it is followed by (at least) two identical characters with the backreference \1.
See demo here.
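A minimal sketch that runs the negative-lookahead pattern above against the names from the question, using Python as a stand-in for .NET. NAME and is_valid_name are hypothetical names.

```python
import re

# Reject any character repeated three times in a row, then require an
# upper-case first letter followed by 3 to 29 letters or spaces.
NAME = re.compile(r"^(?!.*(.)\1{2})[A-Z][a-zA-Z ]{3,29}$")


def is_valid_name(name):
    """Return True if name satisfies the length, case, and repeat rules."""
    return NAME.search(name) is not None


if __name__ == "__main__":
    for name in ["John", "Smith", "Anderson", "Emma",
                 "Jooohnnnnnn", "Smmmmith", "Aaaanderson", "Emmmmmmmmma"]:
        print(name, is_valid_name(name))
```

The backreference comparison is case-sensitive here, so an upper-case letter followed by two lower-case copies does not count as a triple.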
Here is what you are looking for (the pattern is laid out over several lines with inline comments, so it needs RegexOptions.IgnorePatternWhitespace):
^ (?# Start of name)
(?=[A-Z]) (?# Ensure it starts with a capital A-Z without consuming the text)
(?i:([a-z]) (?# Match one letter, ignoring case)
(?!\1{2,}) (?# That letter can't be followed by two or more copies of itself)
){4,30} (?# Repeat the letter check 4 to 30 times, once per character)
$

Explain the Regex mentioned

Can anyone please explain the regex below? It has been used in my application for a very long time, since before I joined, and I am very new to regexes.
/^.*(?=.{6,10})(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z])(?=.*\d.*\d).*$/
As far as I understand
this regex will validate
- for a minimum of 6 chars to a maximum of 10 characters
- will escape the characters like ^ and $
also, my basic need is that I want a regex for a minimum of 6 characters with 1 character being a digit and the other one being a special character.
^.*(?=.{6,10})(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z])(?=.*\d.*\d).*$
^ is called an "anchor". It basically means that any following text must be immediately after the "start of the input". So ^B would match "B" but not "AB", because in the second case "B" is not the first character.
.* matches 0 or more characters - any character except a newline (by default). This is what's known as a greedy quantifier - the regex engine will match ("consume") all of the characters to the end of the input (or the end of the line) and then work backwards for the rest of the expression (it "gives up" characters only when it must). In a regex, once a character is "matched" no other part of the expression can "match" it again (except for zero-width lookarounds, which are coming next).
(?=.{6,10}) is a lookahead anchor and it matches a position in the input. It finds a place in the input where there are 6 to 10 characters following, but it does not "consume" those characters, meaning that the following expressions are free to match them.
(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z]) is another lookahead anchor. It matches a position in the input where the following text contains four letters ([a-zA-Z] matches one lowercase or uppercase letter), but any number of other characters (including zero characters) may be between them. For example: "++a5b---C#D" would match. Again, being an anchor, it does not actually "consume" the matched characters - it only finds a position in the text where the following characters match the expression.
(?=.*\d.*\d) Another lookahead. This matches a position where two digits follow (with any number of other characters in between).
.* Already covered this one.
$ This is another kind of anchor that matches the end of the input (or the end of a line - the position just before a newline character). It says that the preceding expression must match characters at the end of the string. When ^ and $ are used together, it means that the entire input must be matched (not just part of it). So /bcd/ would match "abcde", but /^bcd$/ would not match "abcde" because "a" and "e" could not be included in the match.
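The anchor, greedy-quantifier, and lookahead behaviour described above can be observed directly with a few one-line searches. This is only an illustrative sketch in Python (the same constructs behave the same way in .NET for these inputs); the variable names are hypothetical, and the pattern ^(.*)(\d)$ is an extra example, not part of the regex being explained.

```python
import re

# ^ anchors the match at the start of the input.
starts_with_b = re.search(r"^B", "B") is not None
b_after_a = re.search(r"^B", "AB") is not None

# Unanchored versus fully anchored.
inside = re.search(r"bcd", "abcde") is not None
whole = re.search(r"^bcd$", "abcde") is not None

# Greedy .* first takes everything, then gives back one character
# so that the final \d can match.
greedy = re.search(r"^(.*)(\d)$", "ab12")

# A lookahead matches a position and consumes no characters.
peek = re.search(r"(?=.{6,10})", "abcdef")

if __name__ == "__main__":
    print(starts_with_b, b_after_a, inside, whole)
    print(greedy.groups(), peek.start(), repr(peek.group(0)))
```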
NOTE
This looks like a password validation regex. If it is, please note that it's broken. The .* at the beginning and end will allow the password to be arbitrarily longer than 10 characters. It could also be rewritten to be a bit shorter. I believe the following will be an acceptable (and slightly more readable) substitute:
^(?=(.*[a-zA-Z]){4})(?=(.*\d){2}).{6,10}$
Thanks to @nhahtdh for pointing out the correct way to implement the character length limit.
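To see the length problem concretely, the original regex and the proposed substitute can be compared on a few sample passwords. This is a sketch in Python as a stand-in for the original engine; the sample strings and the names ORIGINAL, SUBSTITUTE and accepts are made up for illustration.

```python
import re

ORIGINAL = re.compile(
    r"^.*(?=.{6,10})(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z])(?=.*\d.*\d).*$"
)
SUBSTITUTE = re.compile(r"^(?=(.*[a-zA-Z]){4})(?=(.*\d){2}).{6,10}$")


def accepts(pattern, password):
    """Return True if the pattern matches the password."""
    return pattern.search(password) is not None


if __name__ == "__main__":
    for password in ["abcd12", "abcdefgh12", "abcdefghi12", "abcdefghijklmnop12", "abc12"]:
        print(password, len(password), accepts(ORIGINAL, password), accepts(SUBSTITUTE, password))
```

The two patterns agree up to 10 characters and diverge from 11 characters on, where only the original still matches.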
Check Cyborgx37's answer for the syntax explanation. I'll do some explanation on the meaning of the regex.
^.*(?=.{6,10})(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z])(?=.*\d.*\d).*$
The first .* is redundant, since what follows it is a series of zero-width assertions, each beginning with the any-character ., and then the .* at the end.
The regex will match minimum 6 characters, due to the assertion (?=.{6,10}). However, there is no upper limit on the number of characters of the string that the regex can match. This is because of the .* at the end (the .* in the front also contributes).
This (?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z]) part asserts that there are at least 4 English alphabet characters (uppercase or lowercase). And (?=.*\d.*\d) asserts that there are at least 2 digits (0-9). Since [a-zA-Z] and \d are disjoint sets, these 2 conditions combined make the (?=.{6,10}) redundant.
The syntax of .*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z] is also needlessly verbose. It can be shortened with the use of repetition: (?:.*[a-zA-Z]){4}.
The following regex is equivalent to your original regex. However, I really doubt that your current one, or this equivalent rewrite of it, does what you want:
^(?=(?:.*[a-zA-Z]){4})(?=(?:.*\d){2}).*$
This version is more explicit about the length, since clarity is always better. The meaning stays the same:
^(?=(?:.*[a-zA-Z]){4})(?=(?:.*\d){2}).{6,}$
Recap:
Minimum length = 6
No limit on maximum length
At least 4 English letters, lowercase or uppercase
At least 2 digits 0-9
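The claimed equivalence can be spot-checked on sample input. The sketch below (Python as a stand-in; the samples and names are hypothetical) checks agreement on a handful of strings only, so it illustrates the recap rather than proving equivalence.

```python
import re

ORIGINAL = re.compile(
    r"^.*(?=.{6,10})(?=.*[a-zA-Z].*[a-zA-Z].*[a-zA-Z].*[a-zA-Z])(?=.*\d.*\d).*$"
)
REWRITE = re.compile(r"^(?=(?:.*[a-zA-Z]){4})(?=(?:.*\d){2}).*$")
EXPLICIT = re.compile(r"^(?=(?:.*[a-zA-Z]){4})(?=(?:.*\d){2}).{6,}$")

# Sample inputs with the result expected from the recap above.
SAMPLES = {
    "abcd12": True,               # 4 letters, 2 digits, length 6
    "a1b2c3d": True,              # letters and digits interleaved
    "abcdefghijklmnop12": True,   # no upper limit on length
    "abc12": False,               # only 3 letters
    "abcd1": False,               # only 1 digit
    "123456": False,              # no letters
    "": False,
}


def verdicts(text):
    """Return the three match results for text as a tuple of booleans."""
    return tuple(p.search(text) is not None for p in (ORIGINAL, REWRITE, EXPLICIT))


if __name__ == "__main__":
    for text, expected in SAMPLES.items():
        print(repr(text), expected, verdicts(text))
```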
REGEXPLANATION
/.../: slashes are often used to represent the area where the regex is defined
^: matches beginning of input string
.: this can match any character
*: matches the previous symbol 0 or more times
.{6,10}: matches .(any character) somewhere between 6 and 10 times
[a-zA-Z]: matches any single character between a and z or between A and Z
\d: matches a digit.
$: matches the end of input.
I think that just about does it for all the symbols in the regex you've posted
For your regex request, here is what you would use:
^(?=.{6,}$)(?=.*?\d)(?=.*?[!@#$%&*()+_=?\^-]).*
And here it is unrolled for you:
^ // Anchor the beginning of the string (password).
(?=.{6,}$) // Look ahead: Six or more characters, then the end of the string.
(?=.*?\d) // Look ahead: Anything, then a single digit.
(?=.*?[!@#$%&*()+_=?\^-]) // Look ahead: Anything, and a special character.
.* // Passes our look aheads, let's consume the entire string.
As you can see, the special characters have to be explicitly defined as there is not a reserved shorthand notation (like \w, \s, \d) for them. Here are the accepted ones (you can modify as you wish):
!, @, #, $, %, ^, &, *, (, ), -, +, _, =, ?
The key to understanding regex look aheads is to remember that they do not move the position of the parser. Meaning that (?=...) will start looking at the first character after the last pattern match, as will subsequent (?=...) look aheads.
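A short sketch of this pattern in use, in Python as a stand-in for the original engine. It assumes the character class is meant to start with !@#$% (that is, that it includes @); the sample passwords and the names PASSWORD and is_acceptable are hypothetical.

```python
import re

# At least 6 characters, at least one digit, at least one listed special
# character. Assumption: the class begins with "!@#$%".
PASSWORD = re.compile(r"^(?=.{6,}$)(?=.*?\d)(?=.*?[!@#$%&*()+_=?\^-]).*")


def is_acceptable(password):
    """Return True if password passes all three lookahead checks."""
    return PASSWORD.search(password) is not None


if __name__ == "__main__":
    for password in ["abc1!x", "abcde1", "abcde!", "a1!", "pass_word9", "abc^12", "abc-12"]:
        print(password, is_acceptable(password))
```

All three lookaheads start from the beginning of the string, so the digit and the special character may appear in any order.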
