checking for alphanumeric and special characters in a string

checking for alphanumeric and special characters in a string - c#

The requirement in my project is to check a string for following conditions:
It must contain at least one letter (a-z or A-Z)
It must contain at least one number (0-9)
It must not contain any special characters.
Is there any regular expression that can match all these conditions ?
Here is the code I am using for this
private bool IsValidFormat(string str)
{
Regex rgx = new Regex(#"^[A-Za-z]+\d+.*$");
return rgx.IsMatch(str);
}
It is working for point# 1 and 2 above but it is allowing special characters.
Any help would be appreciated.

The following change allows at least one letter, at least one digit and no other characters. The order of letters and digits is not important, unlike the solution offered in the OP where it requires that it starts with a letters and ends with numbers.
private bool IsValidFormat(string str)
{
Regex rgx = new Regex(#"^(?=.*[0-9])(?=.*[a-zA-Z])([a-zA-Z0-9]+)$");
return rgx.IsMatch(str);
}

Related

REGEX Matching string nonconsecutively

I'm trying to understand how to match a specific string that's held within an array (This string will always be 3 characters long, ex: 123, 568, 458 etc) and I would match that string to a longer string of characters that could be in any order (9841273 for example). Is it possible to check that at least 2 of the 3 characters in the string match (in this example) strMoves? Please see my code below for clarification.
private readonly string[] strSolutions = new string[8] { "123", "159", "147", "258", "357", "369", "456", "789" };
Private Static string strMoves = "1823742"
foreach (string strResult in strSolutions)
{
Regex rgxMain = new Regex("[" + strMoves + "]{2}");
if (rgxMain.IsMatch(strResult))
{
MessageBox.Show(strResult);
}
}
The portion where I have designated "{2}" in Regex is where I expected the result to check for at least 2 matching characters, but my logic is definitely flawed. It will return true IF the two characters are in consecutive order as compared to the string in strResult. If it's not in the correct order it will return false. I'm going to continue to research on this but if anyone has ideas on where to look in Microsoft's documentation, that would be greatly appreciated!
Correct order where it would return true: "144257" when matched to "123"
incorrect order: "35718" when matched to "123"
The 3 is before the 1, so it won't match.

You can use the following solution if you need to find at least two different not necessarily consecutive chars from a specified set in a longer string:
new Regex($#"([{strMoves}]).*(?!\1)[{strMoves}]", RegexOptions.Singleline)
It will look like
([1823742]).*(?!\1)[1823742]
See the regex demo.
Pattern details:
([1823742]) - Capturing group 1: one of the chars in the character class
.* - any zero or more chars as many as possible (due to RegexOptions.Singleline, . matches any char including newline chars)
(?!\1) - a negative lookahead that fails the match if the next char is a starting point of the value stored in the Group 1 memory buffer (since it is a single char here, the next char should not equal the text in Group 1, one of the specified digits)
[1823742] - one of the chars in the character class.

Search for 2 specific letters followed by 4 numbers Regex

I need to check if a string begins with 2 specific letters and then is followed by any 4 numbers.
the 2 letters are "BR" so BR1234 would be valid so would BR7412 for example.
what bit of code do I need to check that the string is a match with the Regex in C#?
the regex I have written is below, there is probably a more efficient way of writing this (I'm new to RegEx)
[B][R][0-9][0-9][0-9][0-9]

You can use this:
Regex regex = new Regex(#"^BR\d{4}");
^ defines the start of the string (so there should be no other characters before BR)
BR matches - well - BR
\d is a digit (0-9)
{4} says there must be exactly 4 of the previously mentioned group (\d)
You did not specify what is allowed to follow the four digits. If this should be the end of the string, add a $.
Usage in C#:
string matching = "BR1234";
string notMatching = "someOther";
Regex regex = new Regex(#"^BR\d{4}");
bool doesMatch = regex.IsMatch(matching); // true
doesMatch = regex.IsMatch(notMatching); // false;

BR\d{4}
Some text to make answer at least 30 characters long :)

How to match camel case identifiers with a Regular Expression?

I have the need to match camel case variables. I am ignoring variables with numbers in the name.
private const String characters = #"\-:;*+=\[\{\(\/?\s^""'\<\]\}\.\)$\>";
private const String start = #"(?<=[" + characters +"])[_a-z]+";
private const String capsWord = "[_A-Z]{1}[_a-z]+";
private const String end = #"(?=[" + characters + "])";
var regex = new Regex($"{start}{capsWord}{end}",
RegexOptions.Compiled | RegexOptions.CultureInvariant) }
This is great for matching single hump variables! But not with multiple nor does the one that meets the end of the line. I thought $ or ^ in my characters would allow them to match.
abcDef // match
notToday<end of line> // no match
<start of line>intheBeginning // no match
whatIf // match
"howFar" // match
(whatsNext) // match
ohMyGod // two humps don't match
I have also tried wrapping my capsWord like this
"(capsWord)+" but it also doesn't work.
WARNING! Regex tester online matches using this "(capsWord)+" so don't verify and respond by testing from there.
It seems that my deployment wasn't getting the updates when I was making changes so there may not have been an issue after all.
This following almost works save for the start of line problem. Note, I notice I didn't need the suffix part because the match ends with [a-z] content.
private const String characters = #"\-:;*+=\[\{\(\/?\s^""'\<\]\}\.\)$\>";
private const String pattern = "(?<=[" + characters + "])[_a-z]+([A-Z][a-z]+)+";
abcDef // match
notToday<end of line> // match
<start of line>intheBeginning // no match
whatIf // match
"howFar" // match
(whatsNext) // match
ohMyGod // match
So, if anyone can solve it let me know.
I have also simplified the other characters to a simpler more concise expression but it still has a problem with matching from the beginning of the line.
private const String pattern = "(?<=[^a-zA-Z])[_a-z]+([A-Z][a-z]+)+";

You can match an empty position between a prefix and a suffix to split the camelCase identifiers
(?<=[_a-z])(?=[_A-Z])
The prefix contains the lower case letters, the suffix the upper case letters.
If you want to match camelCase identifiers, you can use
(?<=^|[^_a-zA-Z])_*[a-z]+[_a-zA-Z]*
How it works:
(?<= Match any position pos following a prefix exp (?<=exp)pos
^ Beginning of line
| OR
[^_a-zA-Z] Not an identifier character
)
_* Any number of underlines
[a-z]+ At least one lower case letter
[_a-zA-Z]* Any number of underlines and lower or upper case letters
So, it basically says: Match a sequence optionally starting with underlines, followed by at least one lower case letter, optionally followed by underlines and letters (upper and lower), and the whole thing must be preceded by either a beginning of line or a non-identifier character. This is necessary to make sure that we not only match the ending of a identifier starting with an upper case letter (or underscores and a upper case letter).
var camelCaseExpr = new Regex("(?<=^|[^_a-zA-Z])_*[a-z]+[_a-zA-Z]*");
MatchCollection matches = camelCaseExpr.Matches("whatIf _Abc _abc howFar");
foreach (Match m in matches) {
Console.WriteLine(m.Value);
}
prints
whatIf
_abc
howFar

Had the same problem today, what worked for me:
\b([a-z][a-z0-9]+[A-Z])+[a-z0-9]+\b
Note: this is for PCRE regexes
Explanation:
`(` group begin
`[a-z]` start with a lower-case letter
`[a-z0-9]+` match a string of all lowercase/numbers
`[A-Z]` an upper-case letter
`)+` group end; match one or more of such groups.
Ends with some more lower-case/numbers.
\b for word boundary.
In my case, the _camelCaseIdent_s had only one letter upper in between words.
So, this worked for me, but if you can have (or want to match) more than one
upper-case letter in between, you could do something like [A-Z]{1,2}

Merging 3 Regular Expressions to make a Slug/URL validation check

I am trying to merge a few working RegEx patterns together (AND them). I don't think I am doing this properly, further, the first RegEx might be getting in the way of the next two.
Slug example (no special characters except for - and _):
(^[a-z0-9-_]+$)
Then I would like to ensure the first character is NOT - or _:
(^[^-_])
Then I would like to ensure the last character is NOT - or _:
([^-_]$)
Match (good Alias):
my-new_page
pagename
Not-Match (bad Alias)
-my-new-page
my-new-page_
!##$%^&*()
If this RegExp can be simplified and I am more than happy to use it. I am trying to create validation on a page URL that the user can provide, I am looking for the user to:
Not start or and with a special character
Start and end with a number or letter
middle (not start and end) can include - and _
One I get that working, I can tweak if for other characters as needed.
In the end I am applying as an Annotation to my model like so:
[RegularExpression(
#"(^[a-z0-9-_]+$)?(^[^-_])?([^-_]$)",
ErrorMessage = "Alias is not valid")
]
Thank you, and let me know if I should provide more information.

See regex in use here
^[a-z\d](?:[a-z\d_-]*[a-z\d])?$
^ Assert position at the start of the line
[a-z\d] Match any lowercase ASCII letter or digit
(?:[a-z\d_-]*[a-z\d])? Optionally match the following
[a-z\d_-]* Match any character in the set any number of times
[a-z\d] Match any lowercase ASCII letter or digit
$ Assert position at the end of the line
See code in use here
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
Regex regex = new Regex(#"^[a-z\d](?:[a-z\d_-]*[a-z\d])?$");
string[] strings = {"my-new_page", "pagename", "-my-new-page", "my-new-page_", "!##$%^&*()"};
foreach(string s in strings) {
if (regex.IsMatch(s))
{
Console.WriteLine(s);
}
}
}
}
Result (only positive matches):
my-new_page
pagename

How to ignore regex matches in C#?

An input string:
string datar = "aag, afg, agg, arg";
I am trying to get matches: "aag" and "arg", but following won't work:
string regr = "a[a-z&&[^fg]]g";
string regr = "a[a-z[^fg]]g";
What is the correct way of ignoring regex matches in C#?

The obvious way is to use a[a-eh-z]g, but you could also try with a negative lookbehind like this :
string regr = "a[a-z](?<!f|g)g"
Explanation :
a Match the character "a"
[a-z] Match a single character in the range between "a" and "z"
(?<!XXX) Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
f|g Match the character "f" or match the character "g"
g Match the character "g"

Character classes aren't quite that fancy. The simple solution is:
a[a-eh-z]g
If you really want to explicitly list out the letters that don't belong, you could try something like:
a[^\W\d_A-Zfg]g
This character class matches everything except:
\W excludes non-word characters, i.e. punctuation, whitespace, and other special characters. What's left are letters, digits, and the underscore _.
\d removes digits so now we have letters and the underscore _.
_ removes the underscore so now we only match letters.
A-Z removes uppercase letters so now we only match lowercase letters.
Finally at this point we can list the individual lowercase letters we don't want to match.
All in all way more complicated than we'd likely ever want. That's regular expressions for ya!

What you're using is Java's set intersection syntax:
a[a-z&&[^fg]]g
..meaning the intersection of the two sets ('a' THROUGH 'z') and (ANYTHING EXCEPT 'f' OR 'g'). No other regex flavor that I know of uses that notation. The .NET flavor uses the simpler set subtraction syntax:
a[a-z-[fg]]g
...that is, the set ('a' THROUGH 'z') minus the set ('f', 'g').
Java demo:
String s = "aag, afg, agg, arg, a%g";
Matcher m = Pattern.compile("a[a-z&&[^fg]]g").matcher(s);
while (m.find())
{
System.out.println(m.group());
}
C# demo:
string s = #"aag, afg, agg, arg, a%g";
foreach (Match m in Regex.Matches(s, #"a[a-z-[fg]]g"))
{
Console.WriteLine(m.Value);
}
Output of both is
aag
arg

Try this if you want match arg and aag:
a[ar]g
If you want to match everything except afg and agg, you need this regex:
a[^fg]g

It seems like you're trying to match any three alphabetic characters, with the condition that the second character cannot be f or g. If this is the case, why not use the following regular expression:
string regr = "a[a-eh-z]g";

Regex: a[a-eh-z]g.
Then use Regex.Matches to get the matched substrings.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

checking for alphanumeric and special characters in a string - c#

Related

REGEX Matching string nonconsecutively

Search for 2 specific letters followed by 4 numbers Regex

How to match camel case identifiers with a Regular Expression?

Merging 3 Regular Expressions to make a Slug/URL validation check

How to ignore regex matches in C#?

Categories

Resources