Character 'e' is not recognized by simple regular expression - why?

I wrote a very simple regular expression that needs to match the following pattern:
word.otherWord
- A word must have at least 2 characters and must not start with a digit.
I wrote the following expression:
[a-zA-Z][a-zA-Z](.[a-zA-Z0-9])+
I tested it with a regex tester and it seems to work in most cases, but it fails for some inputs that end with 'e'.
For example:
Hardware.Make does not match but Hardware.Makee works fine. Why? How can I fix it?

That's because your regex only matches inputs whose length is even.
The first two characters are matched by [a-zA-Z][a-zA-Z], and then another two characters are matched by the group (.[a-zA-Z0-9]), which is repeated one or more times (because of the +). Note that the unescaped . matches any character, not just a literal dot.
You can see it here: http://regex101.com/r/fW2bC1
I think you need this:
[a-zA-Z]+(\.[a-zA-Z0-9]+)+
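To see the even-length effect concretely, here is a minimal C# sketch that runs both patterns against the two inputs from the question. The ^ and $ anchors are my own assumption (so that the whole input has to match); they are not part of the original patterns.

```csharp
using System;
using System.Text.RegularExpressions;

// Original pattern: two letters, then one or more pairs of
// (any character + one alphanumeric). Anchors added as an assumption.
var original = new Regex(@"^[a-zA-Z][a-zA-Z](.[a-zA-Z0-9])+$");

// Suggested pattern from the answer, anchored the same way.
var suggested = new Regex(@"^[a-zA-Z]+(\.[a-zA-Z0-9]+)+$");

foreach (var input in new[] { "Hardware.Make", "Hardware.Makee" })
{
    Console.WriteLine(
        $"{input} (length {input.Length}): original={original.IsMatch(input)}, suggested={suggested.IsMatch(input)}");
}
```

Hardware.Make has 13 characters, so the original pattern cannot consume it in pairs, while Hardware.Makee has 14 and passes.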

Actually, the dot is a regex metacharacter, which stands for "any character". You'll need to escape the dot.
For your situation, I'd do this:
[a-zA-Z]{2,}\.[a-zA-Z0-9]+
The {2,} means, at least 2 characters from the previous range.
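A small C# sketch of the difference between the escaped and the unescaped dot, using the {2,} pattern above. The anchors and the sample inputs are assumptions added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Anchors are an assumption here so that the whole input is validated.
var unescaped = new Regex(@"^[a-zA-Z]{2,}.[a-zA-Z0-9]+$");  // '.' matches any character
var escaped = new Regex(@"^[a-zA-Z]{2,}\.[a-zA-Z0-9]+$");   // '\.' matches only a literal dot

foreach (var input in new[] { "Hardware.Make", "a.Make", "HardwareXMake" })
{
    Console.WriteLine(
        $"{input}: unescaped={unescaped.IsMatch(input)}, escaped={escaped.IsMatch(input)}");
}
```

Only the escaped version rejects HardwareXMake, and both reject a.Make because the first word has fewer than 2 letters.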

In regex, the dot (period) is one of the most commonly used metacharacters and, unfortunately, also one of the most commonly misused. The dot matches a single character without caring what that character is...
So you could also rewrite it like this:
[a-zA-Z]+(\.[a-zA-Z0-9]+)+

Related

ASP.NET Core RegularExpression attribute - multiple conditions

I have two regexes that need to be checked:
"^[a-z0-9\\!#\\$\\^&\\-\\+%\\=_\\(\\)\\{\\}\\<\\>'\";\\:/\\.,~`\\|\\\\]+$"
and
".*(g[o0]+gle).*"
The first one accepts any alphanumeric character (plus a few extras), like helloworld123. The second one should reject any string that contains the word "google" in different forms, like gooo0gle.
Allowed:
hello
helloworld
helloworld123
Disallowed:
hellogoogle
google
...
I want to use the RegularExpression attribute to validate this string. I thought about something like:
[RegularExpression("^[a-z0-9\\!#\\$\\^&\\-\\+%\\=_\\(\\)\\{\\}\\<\\>'\";\\:/\\.,~`\\|\\\\]+$|.*(g[o0]+gle).*")]
But it's not working, since the second part (.*(g[o0]+gle).*) should be negated (NOT).
How to do it right?
Thanks.

You can put your second regex in a negative lookahead, use the first regex as the character set, and combine both into the following regex:
^(?!.*g[o0]+gle)[-a-z0-9!#$^&+%=_(){}<>'";:\/.,~`|]+$
Here, the negative lookahead (?!.*g[o0]+gle) rejects any string that contains google or any variation supported by your regex, and the character set [-a-z0-9!#$^&+%=_(){}<>'";:\/.,~`|]+ matches one or more of the allowed characters. Note that this set no longer contains the literal backslash that your original set allowed; add \\ inside the brackets if you need it.
Also, you don't need to escape most special characters inside a character set, so I have unescaped all of them except /. Always place the hyphen - as either the very first or the very last character in the character set; otherwise, depending on the regex dialect, you may see weird behavior.
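As a quick check of the lookahead approach, here is a hedged C# sketch that runs the combined pattern through Regex.IsMatch on the sample inputs from the question. I am assuming the same pattern string could be passed to the [RegularExpression] attribute; this sketch only exercises the regex itself.

```csharp
using System;
using System.Text.RegularExpressions;

// Combined pattern from the answer, written as a verbatim string
// (the double quote inside the character set is doubled: "").
var allowed = new Regex(@"^(?!.*g[o0]+gle)[-a-z0-9!#$^&+%=_(){}<>'"";:\/.,~`|]+$");

foreach (var input in new[] { "hello", "helloworld123", "hellogoogle", "google", "gooo0gle" })
{
    // Expected: True for the first two inputs, False for the rest.
    Console.WriteLine($"{input}: {allowed.IsMatch(input)}");
}
```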

Match text not surrounded by & and ;

I am currently using the following regular expression:
(?<!&)[^&;]*(?!;)
To match text like this:
match1&lt;match2&gt;
And extract:
match1
match2
However, this seems to match an extra five empty strings. See Regex Storm.
How can I only match the two listed above?
Note that the existing pattern ((?<=^|;)[^&]+) by @xanatos only finds matches 1 to 3 in the following string, and not match 4:
match1&lte;match2&lt;match;3+match&4

Try changing the * to a +:
(?<!&)[^&;]+(?!;)
More correct regex:
(?<=^|;)[^&]+
The basic idea here is that a "good" substring starts at the beginning of the string (^) or right after the ;, and ends when you encounter a & ([^&]+).
Third version... But here we are showing how if you have a problem, and you decide to use regexes, now you have two problems:
(?<=^|;)([^&]|&(?=[^&;]*(?:&|$)))+
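For illustration, a minimal C# sketch of the second pattern, (?<=^|;)[^&]+, collecting the matched fragments with Regex.Matches. The input string is an assumption: I am taking the text to contain HTML entities such as &lt; and &gt; between the fragments.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input: text fragments separated by HTML entities.
var input = "match1&lt;match2&gt;";

// A match starts at the beginning of the string or right after ';'
// and runs until the next '&' (or the end of the string).
var fragments = Regex.Matches(input, @"(?<=^|;)[^&]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", fragments)); // prints "match1, match2"
```

Because the pattern uses + rather than *, it cannot produce empty matches.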

I have managed it with:
(?<Text>.+?)(?:&[^&;]*?;|$)
This seems to match all of the corner cases but it might not work with a case I can't think of at the moment.
This won't work if the string starts with a &...; pattern or is only that.
See Regex Storm.

Regex lookbehind only when the input contains a colon

Today I use the C# Regex.IsMatch function to match a key:value format.
I have some code that checks whether a string has the format key:value (like H:15).
The regex pattern that I am using today is: [D,H,M,S]:[1-9]+\d?
I want to add support for a default key: when the input is 15, I would like to treat it as H:15.
So I need to improve my regex to support either key:value or only the value (without the colon). H:15 is good and 15 is also good.
I tried to use the regex alternation (|), something like: ([D,H,M,S]:[1-9]+\d?)|([1-9]+\d?)
But now it matches more things, like :1 and H:01, which are bad input for me.
I also tried to use a lookbehind, without success.
Any help would be greatly appreciated,
Nadav.

This should do the trick:
\b(?:[DHMS]:|(?<!:))[1-9][0-9]*\b
So, either match [DHMS]: or a word boundary not preceded by :.
Also, [1-9]+\d? looks very suspicious to me, so I replaced it with [1-9][0-9]*. Note that in .NET \d is not equivalent to [0-9] because it includes Unicode digits as well.
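Here is a short C# sketch that checks this pattern against the good and bad inputs mentioned in the question. It only uses Regex.IsMatch; nothing else is assumed about the surrounding code.

```csharp
using System;
using System.Text.RegularExpressions;

// Either a key followed by ':' or a position that is not preceded by ':',
// then a number without a leading zero, delimited by word boundaries.
var pattern = new Regex(@"\b(?:[DHMS]:|(?<!:))[1-9][0-9]*\b");

foreach (var input in new[] { "H:15", "15", ":1", "H:01" })
{
    // Expected: True, True, False, False.
    Console.WriteLine($"{input} -> {pattern.IsMatch(input)}");
}
```

Since the pattern is not anchored, it also finds a valid token inside a longer string; add ^ and $ if the whole input must be a single key:value pair.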

Looks like Avinash just beat me to it, but I added word boundaries with this expression, which works well in tests.
\b(?<=[DHMS]:)?[1-9]\d*\b

Seems like you want something like this:
@"^(?:[DHMS]:)?[1-9]\d*$"
[DHMS] matches a single character from the given list. ? after the non-capturing group will turn the key part to an optional one. \d* matches zero or more digit characters.
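Building on the anchored pattern, the default-key behaviour from the question could be sketched like this in C#. The named groups, the Normalize helper and the use of [0-9] instead of \d are my own additions for illustration, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Same idea as the anchored pattern above, with named groups (an assumption)
// so that the key and the value can be read back separately.
var pattern = new Regex(@"^(?:(?<key>[DHMS]):)?(?<value>[1-9][0-9]*)$");

// Hypothetical helper: returns "key:value" with the key defaulting to H,
// or null when the input is not valid.
string? Normalize(string input)
{
    var m = pattern.Match(input);
    if (!m.Success) return null;
    var key = m.Groups["key"].Success ? m.Groups["key"].Value : "H";
    return key + ":" + m.Groups["value"].Value;
}

foreach (var input in new[] { "15", "M:30", "H:01", ":1" })
{
    Console.WriteLine($"{input} -> {Normalize(input) ?? "(invalid)"}");
}
```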

Why is this regex not allowing this text?

I have a username validator IsValidUsername, and I am testing "baconman" but it is failing, could someone please help me out with this regex?
if(!Regex.IsMatch(str, @"^[a-zA-Z]\\w+|[0-9][0-9_]*[a-zA-Z]+\\w*$")) {
    isValid = false;
}
I want the restrictions to be: (It's very close)
Be between 5 & 17 characters long
contain at least one letter
no spaces
no special characters

You're escaping unnecessarily: if you write your regex as a verbatim string (with @ before the opening quote), you don't need to double the \ - just one is fine.
Either:
@"\w"
or
"\\w"
Edit: I didn't make this clear: right now, due to the double escaping, your regex is looking for a literal \ followed by a w. So your input would need [some letter]\w to match (for example, "a\w" or "a\wwwwww" would match).
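The escaping difference can be checked with a few lines of C#. The simplified pattern below (only the first alternative of the original regex) is an assumption made to keep the example short.

```csharp
using System;
using System.Text.RegularExpressions;

var verbatim = @"^[a-zA-Z]\w+$";   // verbatim string: the regex sees \w
var escaped = "^[a-zA-Z]\\w+$";    // regular string: \\ becomes \, so the regex also sees \w
var doubled = @"^[a-zA-Z]\\w+$";   // verbatim AND doubled: the regex sees a literal backslash, then w+

Console.WriteLine(verbatim == escaped);                 // True: both hold the same pattern text
Console.WriteLine(Regex.IsMatch("baconman", verbatim)); // True
Console.WriteLine(Regex.IsMatch("baconman", doubled));  // False
Console.WriteLine(Regex.IsMatch(@"b\w", doubled));      // True
```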

Your requirements are best taken care of in normal C#. They don't map well to a regular expression. Just code them up using LINQ, which works on a string as it would on an IEnumerable<char>.
Also, a LINQ query over a string is much easier to understand than a regex that covers the requirements you have.
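A possible LINQ version of the validator, as a rough sketch. I am assuming that the length limits are inclusive, that "no special characters" means letters, digits and underscore only, and that any Unicode letter counts as a letter (char.IsLetter); adjust these if your rules differ.

```csharp
using System;
using System.Linq;

// Hypothetical implementation of the IsValidUsername check from the question.
bool IsValidUsername(string name) =>
    name.Length >= 5 && name.Length <= 17                        // between 5 and 17 characters
    && name.All(c => char.IsLetterOrDigit(c) || c == '_')        // no spaces or special characters
    && name.Any(c => char.IsLetter(c));                          // at least one letter

foreach (var name in new[] { "baconman", "abc", "12345", "bacon man", "bacon_1" })
{
    Console.WriteLine($"{name}: {IsValidUsername(name)}");
}
```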

It is possible to do everything as part of a Regex, however it is not pretty :-)
^(\w(?=\w*[a-zA-Z])|[a-zA-Z]|\w(?<=[a-zA-Z]\w*)){5,17}$
It performs 3 checks, each of which always results in exactly 1 character being matched (so we can perform the length check at the end):
Either the character is any word character \w which comes before a [a-zA-Z]
Or it is a [a-zA-Z] itself
Or it is any word character \w which comes after a [a-zA-Z]
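To try this pattern out, here is a minimal C# sketch with a few sample usernames (the inputs are my own). It relies on the variable-length lookbehind that the .NET regex engine supports.

```csharp
using System;
using System.Text.RegularExpressions;

// Each repetition consumes exactly one character, so {5,17} acts as the length check.
var username = new Regex(@"^(\w(?=\w*[a-zA-Z])|[a-zA-Z]|\w(?<=[a-zA-Z]\w*)){5,17}$");

foreach (var input in new[] { "baconman", "1234a", "12345", "abcd", "bacon man" })
{
    // Expected: True, True, False (no letter), False (too short), False (space).
    Console.WriteLine($"{input}: {username.IsMatch(input)}");
}
```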

C# Regex Validation

Can someone please validate this for me (I'm new to regex matching)?
Rather than asking the question, I am writing this:
Regex rgx = new Regex(@"^{3}[a-zA-Z0-9](\d{5})|{3}[a-zA-Z0-9](\d{9})$");
Can someone tell me if it's OK...
The accounts I am trying to match are either of:
1. BAA89345 (8 chars)
2. 12345678 (8 chars)
3. 123456789112 (12 chars)
Thanks in advance.

You can use a Regex tester. Plenty of free ones online. My Regex Tester is my current favorite.

Does the value that starts with characters followed by digits always start with exactly three characters, or can it start with fewer or more than three? If it can, what are the minimum and maximum number of characters before the digits?

You need to place your quantifiers after the characters they are supposed to quantify. Also, character classes need to be wrapped in square brackets. This should work:
@"^(?:[a-zA-Z0-9]{3}|\d{3}\d{4})\d{5}$"
There are several good, automated regex testers out there. You may want to check out regexpal.

Although that may be a perfectly valid match, I would suggest rewriting it as:
^([a-zA-Z]{3}\d{5}|\d{8}|\d{12})$
which requires the string to match one of:
[a-zA-Z]{3}\d{5} three letters followed by five digits,
\d{8} eight digits, or
\d{12} twelve digits.
Makes it easier to read, too...
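A quick C# check of this pattern against the three account formats from the question, plus two inputs of my own that should be rejected. Keep in mind that in .NET \d also matches non-ASCII digits; swapping in [0-9] would be a stricter choice if that matters for your accounts.

```csharp
using System;
using System.Text.RegularExpressions;

// Three letters + five digits, or exactly 8 digits, or exactly 12 digits.
var account = new Regex(@"^([a-zA-Z]{3}\d{5}|\d{8}|\d{12})$");

foreach (var input in new[] { "BAA89345", "12345678", "123456789112", "BAA8934", "123456789" })
{
    // Expected: True for the first three inputs, False for the last two.
    Console.WriteLine($"{input}: {account.IsMatch(input)}");
}
```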

I'm not 100% on your objective, but there are a few problems I can see right off the bat.
When you list the acceptable characters to match, like with a-zA-Z0-9, you need to put them inside brackets, like [a-zA-Z0-9]. Using a ^ right after the opening bracket negates the contained characters, e.g. [^a-zA-Z0-9].
Word characters can be matched with \w, which is roughly equivalent to [a-zA-Z0-9_] (in .NET, \w also matches Unicode letters and digits unless RegexOptions.ECMAScript is used).
Quantifiers need to appear after the expression they apply to. So, instead of {3}[a-zA-Z0-9], you would need to write [a-zA-Z0-9]{3} (assuming you want to match three instances of a character that matches [a-zA-Z0-9]).
