C# regex match behaviour


I've got this line in my code:
Match match = Regex.Match(actualValue, regexValue, RegexOptions.None);
I have a simple question: why, when I check for success with the line
if(match.Success)
does the match succeed with the following values?
actualValue = "G:1"
regexValue = "A*"
The actual value does not seem to fit the pattern, at least to me, so I am probably missing something.
What I want to achieve is simply to take an actual value and a regular expression and check whether the actual value fits the regular expression. I thought that is what I did there, but apparently I didn't.
EDIT: Another question: is there a way to treat the * as an "any char" wildcard? That is, can A* be interpreted as an A followed by any characters?

Your code itself is correct; your regular expression isn't.
Based on your comments on other answers, you're after a regular expression which matches any string which starts with A, and you're assuming that '*' means "any characters". '*' in fact means "match the preceding element zero or more times", so the regular expression you've given means "match zero or more 'A' characters". Because zero occurrences are allowed and the pattern is not anchored, it matches the empty string at the start of any input, so it will match absolutely anything.
If you're looking for a regular expression that matches the whole string but only if it starts with 'A', the regular expression you're after is ^A.*. The '.' character in a regular expression means "match any character" (by default, any character except a newline). This regular expression thus means "match the start of the string, followed by an 'A', followed by zero or more other characters" and will thus match the entire string provided it starts with 'A'.
However, you already have the whole string, so this is a little unnecessary - all you really want to do is get an answer to the question "does the string start with an 'A'?". A regular expression that will achieve this is simply '^A'. If it matches, the string started with an 'A'.
Of course, it should be pointed out that you don't need a regular expression to confirm this anyway. If this is genuinely all you want to do (and it's possible you've just put together a simple example, and your real scenario is more complicated), why not just use the StartsWith method?:
bool match = actualValue.StartsWith("A");
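The difference between the three patterns discussed above can be seen in a small sketch. The class name StarDemo and the helper Matches are made up for illustration; only Regex.Match and string.StartsWith come from .NET itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarDemo
{
    // Hypothetical helper: true when the pattern matches anywhere in the input.
    public static bool Matches(string input, string pattern) =>
        Regex.Match(input, pattern, RegexOptions.None).Success;

    public static void Main()
    {
        // "A*" allows zero 'A's, so it matches the empty string at index 0.
        Match m = Regex.Match("G:1", "A*");
        Console.WriteLine($"{m.Success} index={m.Index} length={m.Length}"); // True index=0 length=0

        // "^A" requires an 'A' at the very start of the input.
        Console.WriteLine(Matches("G:1", "^A"));   // False
        Console.WriteLine(Matches("Apple", "^A")); // True

        // No regex needed for a simple prefix check.
        Console.WriteLine("Apple".StartsWith("A", StringComparison.Ordinal)); // True
    }
}
```

The zero-length match is the reason match.Success is true for "G:1": the engine found something, it just happens to be empty.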

The regex matches because A* means "look for 0 or more occurrences of 'A'". It will match any string.
If you meant to look for an arbitrary number of 'A', but at least one, try A+ instead.

Looking at the comments it looks like you're trying to match a lot of strings starting with A.
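If they're separated by white space you could check for them using the following:
bool matched = Regex.IsMatch(actualValue, @"\bA\w+");
This returns true for "Atest flkjs Apple Ascii cAse", where the pattern matches "Atest", "Apple" and "Ascii" but not the A inside "cAse". Use Regex.Matches with the same pattern to retrieve each of them.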
If there is only one string you're matching and it starts with A and has no spaces:
bool matched = Regex.IsMatch(actualValue, @"^A\w+$");
This matches "Apple", but not "Apple and orange" as the second string has spaces.
As Chris noted * is not a wildcard in the way you meant with regex searches. You can find some information to get you started with regexes at regex-info.
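If the goal really is to treat * as an "any characters" wildcard, as the question's edit asks, one possible approach is to translate the wildcard text into a regular expression first. The sketch below is only an illustration under that assumption: the Wildcard class and its ToRegex and IsMatch helpers are hypothetical names, while Regex.Escape and Regex.IsMatch are the real .NET methods.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Wildcard
{
    // Hypothetical helper: escapes every regex metacharacter, then turns the
    // escaped '*' back into ".*" and anchors the result to the whole string.
    public static string ToRegex(string wildcard) =>
        "^" + Regex.Escape(wildcard).Replace(@"\*", ".*") + "$";

    // Hypothetical helper: wildcard-style match of the entire input.
    public static bool IsMatch(string input, string wildcard) =>
        Regex.IsMatch(input, ToRegex(wildcard));

    public static void Main()
    {
        Console.WriteLine(ToRegex("A*"));          // ^A.*$
        Console.WriteLine(IsMatch("G:1", "A*"));   // False
        Console.WriteLine(IsMatch("A:1", "A*"));   // True
        Console.WriteLine(IsMatch("A.b", "A.*"));  // True: the dot is treated literally
        Console.WriteLine(IsMatch("Axb", "A.*"));  // False
    }
}
```

Note that '.' does not match a newline unless RegexOptions.Singleline is passed, so multi-line input would need that option.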

The Regex class takes the regular expression in its constructor.
An example in your case could be:
if (new Regex("A*").IsMatch(actualValue))
    // Do something
If you are unsure about the regex pattern, try it out here

Related

C# Regular Expression not matching

I have a regular expression
string dateformattwo = @"^(?:(?:31(\/|-|\.)(?:0?[13578]|1[02]|(?:Jan|Mar|May|Jul|Aug|Oct|Dec)))\1|(?:(?:29|30)(\/|-|\.)(?:0?[1,3-9]|1[0-2]|(?:Jan|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:29(\/|-|\.)(?:0?2|(?:Feb))\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:0?[1-9]|1\d|2[0-8])(\/|-|\.)(?:(?:0?[1-9]|(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep))|(?:1[0-2]|(?:Oct|Nov|Dec)))\4(?:(?:1[6-9]|[2-9]\d)?\d{2})";
and two strings
string value = "30.Jul.2019 This is the line that I want to match";
string value2 = "30.jul.2019";
The regex is correct; however, it does not match value, although it matches value2. Why is that happening?

I couldn't get your regex to match your strings, so it's hard to say exactly what's expected here, but I can take a guess as to why it's not working: nowhere in your regex are you looking for july. It looks to me like you're only matching JUL.
Edit: each of your regexes ends with $, which asserts the position at the end of the line. Your first string fails because there are characters after the date.
Updated regex here, which, despite being tested as a PHP regex as pointed out in the comments, still matches your desired text.
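The effect of the trailing $ can be reproduced with a much shorter pattern. The sketch below uses a deliberately simplified stand-in for the date regex (an assumption made for readability, not the asker's pattern), and the AnchorDemo class and its constants are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Simplified stand-in for the long date pattern (assumption): day.Mon.year.
    // Anchored at both ends, so nothing may follow the date.
    public const string Anchored = @"^\d{1,2}\.[A-Za-z]{3}\.\d{4}$";

    // Same pattern without the trailing $, so trailing text is allowed.
    public const string PrefixOnly = @"^\d{1,2}\.[A-Za-z]{3}\.\d{4}";

    public static void Main()
    {
        string value = "30.Jul.2019 This is the line that I want to match";

        Console.WriteLine(Regex.IsMatch(value, Anchored));       // False
        Console.WriteLine(Regex.Match(value, PrefixOnly).Value); // 30.Jul.2019
    }
}
```

Dropping the $ (or replacing it with something like \b) lets the date be found at the start of a longer line while the ^ still pins it to the beginning.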

Regular expression in RegularExpressionAttribute behavior

I am using this regular expression: @"[ \]\[;\/\\\?:*""<>|+=]|^[.]|[.]$"
First part [ \]\[;\/\\\?:*""<>|+=] should match any of the characters inside the brackets.
Next part ^[.] should match if the string starts with a 'dot'
Last part [.]$ should match if the string ends with a 'dot'
This works perfectly fine if I use Regex.IsMatch() function. However if I use RegularExpressionAttribute in ASP.NET MVC, I always get invalid model. Does anyone have any clue why this behavior occurs?
Examples:
"abcdefg" should not match
".abcdefg" should match
"abc.defg" should not match
"abcdefg." should match
"abc[defg" should match
Thanks in advance!
EDIT:
The RegularExpressionAttribute specifies that a data field value in ASP.NET Dynamic Data must match the specified regular expression.
This means I need "abcdefg" to match and ".abcdefg" to not match. Basically, I need to negate the whole expression I have above.

You need to make sure the pattern matches the entire string.
In a general case, you may append/prepend the pattern with .*.
Here, you may use
.*[ \][;/\\?:*"<>|+=].*|^[.].*|.*[.]$
Or, to make it a bit more efficient (that is, to reduce backtracking in the first branch) a negated character class will perform better:
[^ \][;/\\?:*"<>|+=]*[ \][;\/\\?:*"<>|+=].*|^[.].*|.*[.]$
But it is best to put the branches matching text at the start/end of the string as first branches:
^[.].*|.*[.]$|[^ \][;/\\?:*"<>|+=]*[ \][;/\\?:*"<>|+=].*
NOTE: You do not have to escape / and ? chars inside the .NET regex since you can't use regex delimiters there.
C# declaration of the last pattern will look like
@"^[.].*|.*[.]$|[^ \][;/\\?:*""<>|+=]*[ \][;/\\?:*""<>|+=].*"
See this .NET regex demo.
RegularExpressionAttribute:
[RegularExpression(
@"^[.].*|.*[.]$|[^ \][;/\\?:*""<>|+=]*[ \][;/\\?:*""<>|+=].*",
ErrorMessage = "Username cannot contain following characters: ] [ ; / \\ ? : * \" < > | + =")
]
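To see why the original pattern behaves differently under the attribute, it may help to emulate the whole-string requirement with plain Regex. The sketch below is an illustration only: IsFullMatch is a hypothetical helper that mimics "the match must cover the entire value", and ValidPattern is one possible way (an assumption, not taken from the answers) to express the negated rule the question's edit asks for.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UsernameRule
{
    // Assumed "valid username" pattern: no leading dot, no trailing dot,
    // and none of the forbidden characters anywhere in the string.
    public const string ValidPattern =
        @"^(?![.])(?!.*[.]$)[^ \][;/\\?:*""<>|+=]+$";

    // Hypothetical helper: succeeds only when the match starts at index 0
    // and spans the whole input, unlike Regex.IsMatch.
    public static bool IsFullMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    public static void Main()
    {
        // A partial match satisfies IsMatch but not the whole-string check.
        Console.WriteLine(Regex.IsMatch("abcdefg.", @"[.]$")); // True
        Console.WriteLine(IsFullMatch("abcdefg.", @"[.]$"));   // False

        foreach (string name in new[] { "abcdefg", ".abcdefg", "abc.defg", "abcdefg.", "abc[defg" })
        {
            Console.WriteLine($"{name}: valid={IsFullMatch(name, ValidPattern)}");
        }
    }
}
```

With this pattern "abcdefg" and "abc.defg" are reported as valid and the other three as invalid, which is the inverse of the "should match" list in the question.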

Your regex is an alternation of three branches that each match a single character: the first is a character class listing several characters, the second is a dot at the start of the string, and the third is a dot at the end of the string.
With Regex.IsMatch it works fine because one of the branches matches somewhere in the string; it just does not match the whole string, which is what the attribute requires.
You could use three branches instead: the first matches a dot followed by the repeated character class until the end of the string, and the second is the other way around, with the dot at the end of the string.
The third uses a positive lookahead asserting that the string contains at least one of the characters [\][;\/\\?:*"<>|+=]
^\.[a-z \][;\/\\?:*"<>|+=]+$|^[a-z \][;\/\\?:*"<>|+=]+\.$|^(?=.*[\][;\/\\?:*"<>|+=])[a-z \][;\/\\?:*"<>|+=]+$
Regex demo

C# Regular Expression for String matching

I am looking for a regular expression that returns success only if the input string contains only the following characters:
a-zA-Z0-9~!#$^ ()_-+’:.?
Is this regular expression correct?
^[a-zA-Z0-9~!#$^ ()_-+’:.?]+$
I have understood what ^ means here but not sure about +$. Also are there any alternatives to this? By the way the above regular expression also includes a space character between ^ and (

To check whether it only contains the characters listed above:
bool invalidCharsExist =
Regex.Replace(input, @"[a-zA-Z0-9~!#\$\^\ \(\)_\-\+’:\.\?]", "").Length != 0;
BTW: the following is not fully equivalent to your regex (it will also accept non-ASCII letters and digits), but I think it is a better way to check:
var specialChars = new HashSet<char>("~!#$^ ()_-+’:.?");
var allValid = input.All(c => char.IsLetterOrDigit(c) || specialChars.Contains(c));

Close, but get rid of that dash in the middle of your character class and put it at the beginning:
^[-a-zA-Z0-9~!#$^ ()_+’:.?]+$
And make sure when you put it in a string that you use a verbatim string literal (the @ prefix):
@"^[-a-zA-Z0-9~!#$^ ()_+’:.?]+$"
As to whether or not you can do it in other ways, sure, for example a negative look-ahead that doesn't actually match anything. I don't think a proper regex optimizer would leave one better than the other, it's just a matter of preference. Do you want something that looks to succeed (selects the entire string if valid), or something that looks to fail (negative look-ahead).
Honestly, if performance is at all important, you should write a good old for loop over the characters (or the equivalent LINQ implementation). Regex won't even be in the ballpark.

the regular expression would be: ^[a-zA-Z0-9~!#$^ ()_\-+’:.?]+$
I personally recommend using https://regex101.com to check regular expressions. Note that it doesn't have C# support, but in general JavaScript's RegExp has similar syntax to C#, and it gives you a particularly useful explanation of what your expression is doing. Here is this expression's explanation from there:
^ assert position at start of the string
[a-zA-Z0-9~!#$^ ()_\-+’:.?]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
a-z a single character in the range between a and z (case sensitive)
A-Z a single character in the range between A and Z (case sensitive)
0-9 a single character in the range between 0 and 9
~!#$^ ()_ a single character in the list ~!#$^ ()_ literally
\- matches the character - literally
+’:.? a single character in the list +’:.? literally
$ assert position at end of the string
The only issue with the pattern in your question was the unescaped -, which is reserved inside the [] notation to declare a character range such as a-z.
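In .NET the unescaped hyphen is more than a style issue here: _-+ is read as a range from '_' to '+', which runs backwards in character-code order, so constructing the Regex throws an ArgumentException. The sketch below illustrates this; the RangeDemo class and its Compiles helper are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Pattern from the question: "_-+" is parsed as a (reversed) range.
    public const string Unescaped = @"^[a-zA-Z0-9~!#$^ ()_-+’:.?]+$";

    // Same pattern with the hyphen escaped so it is a literal character.
    public const string Escaped = @"^[a-zA-Z0-9~!#$^ ()_\-+’:.?]+$";

    // Hypothetical helper: true when the pattern can be parsed by .NET.
    public static bool Compiles(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Invalid patterns surface as ArgumentException (or a subclass).
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Compiles(Unescaped));              // False
        Console.WriteLine(Compiles(Escaped));                // True
        Console.WriteLine(Regex.IsMatch("a-b+c", Escaped));  // True
        Console.WriteLine(Regex.IsMatch("a@b", Escaped));    // False
    }
}
```

Moving the hyphen to the start or end of the class, as another answer suggests, avoids the range interpretation just as well as escaping it.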

Character 'e' is not recognized by simple regular expression - why?

I wrote a very simple regular expression that needs to match the following pattern:
word.otherWord
- Word must have at least 2 characters and must not start with a digit.
I wrote the following expression:
[a-zA-Z][a-zA-Z](.[a-zA-Z0-9])+
I tested it using a regex tester and it seems to work in most cases, but when I try some inputs that end with 'e' it does not work.
For example:
Hardware.Make does not work but Hardware.Makee works fine. Why? How can I fix it?

That's because your regex looks for inputs whose length is even.
You have two characters matched by [a-zA-Z][a-zA-Z] and then another two characters matched by (.[a-zA-Z0-9]) as a group which is repeated one or more times (because of +).
You can see it here: http://regex101.com/r/fW2bC1
I think you need this:
[a-zA-Z]+(\.[a-zA-Z0-9]+)+

Actually, the dot is a regex metacharacter, which stands for "any character". You'll need to escape the dot.
For your situation, I'd do this:
[a-zA-Z]{2,}\.[a-zA-Z0-9]+
The {2,} means, at least 2 characters from the previous range.

In regex, the dot (period) is one of the most commonly used metacharacters and, unfortunately, also one of the most commonly misused. The dot matches a single character without caring what that character is.
So you could rewrite it like this:
[a-zA-Z]+(\.[a-zA-Z0-9]+)+
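The even-length effect described in the answers can be checked directly. In the sketch below both patterns are wrapped in ^ and $ so they must cover the whole input, which is an assumption about how the regex tester was applying them; the DotDemo class and its constants are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    // Pattern from the question, anchored: two letters, then pairs of
    // "any character + letter/digit", so only even lengths can match.
    public const string Original = @"^[a-zA-Z][a-zA-Z](.[a-zA-Z0-9])+$";

    // Corrected pattern from the answers, anchored, with the dot escaped.
    public const string Corrected = @"^[a-zA-Z]+(\.[a-zA-Z0-9]+)+$";

    public static void Main()
    {
        foreach (string input in new[] { "Hardware.Make", "Hardware.Makee" })
        {
            bool original = Regex.IsMatch(input, Original);
            bool corrected = Regex.IsMatch(input, Corrected);
            Console.WriteLine($"{input}: original={original}, corrected={corrected}");
        }
        // Hardware.Make: original=False, corrected=True
        // Hardware.Makee: original=True, corrected=True
    }
}
```

"Hardware.Make" has 13 characters, so after the first two letters an odd number remain and the two-character group can never consume them all; the letter 'e' itself has nothing to do with it.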

how to create regular expression based on some condition

I want to create a regular expression to find and replace uppercase characters based on some conditions.
Find the first uppercase character of a group of uppercase characters in a string, replace the group with lowercase, and put a * before the first uppercase character.
If a lowercase character follows the uppercase group, replace the uppercase characters with lowercase and put a * before the last uppercase character.
input string : stackOVERFlow
expected output : stack*over*flow
I tried but could not get it working perfectly.
Any idea how to create such a regular expression?
Thanks

Well the expected inputs and outputs are slightly illogical: you're lower-casing the "f" in "flow" but not including it in the asterisk.
Anyway, the regex you want is pretty simple: @"[A-Z]+". This matches a run of one or more uppercase letters. Keep the quantifier greedy: a lazy [A-Z]+? with nothing after it stops after a single letter, so every uppercase letter would get its own pair of asterisks.
Now, to do the find/replace, you would do something like the following:
Regex.Replace(inputString, @"([A-Z]+)", "*$1*").ToLower();
This simply finds all runs of one or more uppercase letters, and wherever it finds a match it replaces it with itself surrounded by asterisks. The replacement does the surrounding but not the lowercasing; a .NET replacement string cannot change case. However, since the end result of the operation should be a string with all lowercase chars, just do exactly that with a ToLower() and you'll get the expected result.

KeithS's solution can be simplified a bit
Regex.Replace("stackOVERFlow","[A-Z]+","*$0*").ToLower()
However, this will yield stack*overf*low including the f between the stars. If you want to exclude the last upper case letter, use the following expression
Regex.Replace("stackOVERFlow","[A-Z]+(?=[A-Z])","*$0*").ToLower()
It will yield stack*over*flow
This uses the pattern find(?=suffix), a positive lookahead: it matches find only when it is immediately followed by suffix, without including the suffix in the match.
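Both replacements from this answer can be compared side by side. The sketch below wraps them in two hypothetical helpers, MarkAll and MarkAllButLast; it uses ToLowerInvariant instead of ToLower so the result does not depend on the current culture, which is a small deviation from the answer's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarReplace
{
    // Wraps every run of uppercase letters in asterisks, then lowercases.
    public static string MarkAll(string s) =>
        Regex.Replace(s, "[A-Z]+", "*$0*").ToLowerInvariant();

    // The lookahead (?=[A-Z]) forces the run to stop one letter early,
    // so the last uppercase letter stays outside the asterisks.
    public static string MarkAllButLast(string s) =>
        Regex.Replace(s, "[A-Z]+(?=[A-Z])", "*$0*").ToLowerInvariant();

    public static void Main()
    {
        Console.WriteLine(MarkAll("stackOVERFlow"));        // stack*overf*low
        Console.WriteLine(MarkAllButLast("stackOVERFlow")); // stack*over*flow
        // A single uppercase letter has no uppercase successor, so nothing is marked.
        Console.WriteLine(MarkAllButLast("stackOverflow")); // stackoverflow
    }
}
```

The last call shows a limitation worth knowing: with the lookahead, an uppercase run of length one is never wrapped.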
