How to make an email pattern that disallows Arabic characters - C#

I am not very experienced with regular expressions, which is why I am asking.
I use this pattern when I validate emails:
/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
What do I need to add to this pattern to disallow Arabic characters?

Regular expressions should not be used to validate emails.
The correct way to validate an email address is to let the MailAddress class (namespace System.Net.Mail) parse it, like this:
try
{
    // Throws FormatException if the input is not a well-formed address
    string parsed = new MailAddress(address).Address;
}
catch (FormatException)
{
    // address is invalid
}
Regarding the question itself: once you know that it is a valid email address, you can check it for Arabic characters.
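The two steps described above could be combined roughly as follows. This is only a sketch: the class and method names are made up for illustration, and the Arabic check uses the .NET named block \p{IsArabic} (U+0600 to U+06FF), which is an assumption about which characters should count as Arabic.

```csharp
using System;
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class EmailChecks
{
    // Hypothetical helper: true only for a parseable address with no Arabic characters.
    public static bool IsValidNonArabicEmail(string input)
    {
        if (string.IsNullOrWhiteSpace(input))
            return false;

        try
        {
            // MailAddress throws FormatException for malformed addresses.
            _ = new MailAddress(input);
        }
        catch (FormatException)
        {
            return false;
        }

        // Assumption: "Arabic" means the basic Arabic block only.
        return !Regex.IsMatch(input, @"\p{IsArabic}");
    }
}
```

Keep in mind that MailAddress also accepts forms with a display name, so a stricter check may be needed if only bare addresses are allowed.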

I bet you could do it with a bracket expression (also known as a character set or character class) and Unicode escapes (available in JavaScript and C#):
[^\u####-\u%%%%]
... where the hash signs (####) stand for the first Arabic character (i.e. the character with the lowest Unicode value), and the percent signs (%%%%) for the last Arabic character (i.e. the character with the highest Unicode value).
Wikipedia tells me that there are multiple ranges of Arabic characters, so you would need to list each range inside the brackets.
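As a rough illustration of the multi-range idea, the sketch below lists several Arabic Unicode blocks in one character class. The choice of blocks (Arabic, Arabic Supplement, Arabic Extended-A, Arabic Presentation Forms-A and Forms-B) is an assumption, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ArabicRanges
{
    // One character class with several ranges:
    // U+0600-06FF  Arabic
    // U+0750-077F  Arabic Supplement
    // U+08A0-08FF  Arabic Extended-A
    // U+FB50-FDFF  Arabic Presentation Forms-A
    // U+FE70-FEFF  Arabic Presentation Forms-B
    private static readonly Regex ArabicChar = new Regex(
        @"[\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB50-\uFDFF\uFE70-\uFEFF]");

    // True if the text contains at least one character from the ranges above.
    public static bool ContainsArabic(string text) => ArabicChar.IsMatch(text);
}
```

Putting ^ directly after the opening bracket turns the class into its negation, which is the form shown in the answer.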

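Use Unicode character properties:
/\p{sc=Arabic}/
matches any Arabic character. This script syntax works in JavaScript (with the u flag); the .NET regex engine does not support script properties, so in C# use the named block \p{IsArabic} (U+0600 to U+06FF) instead.
Then invert the set of characters that the expression matches:
/[^\p{sc=Arabic}]/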

Related

Not able to generate string from complex regular expression with Fare

I am using Fare library https://github.com/moodmosaic/Fare/ to generate a random string from regular expression. Up until now, it has been working properly.
What I want now is:
"The password must be exactly 8 characters long and include at least one special character, at least one digit and at least one capital letter."
The special characters allowed are !#$%^&*()=,.
For that, I have created the expression
^((?=.*\d)(?=.*[A-Z])(?=.*\W).{8,8})$
but it does not generate a valid string.
What is the problem?
I am generating the string with:
var secret = new Xeger(ConfigurationManager.AppSettings["expression"]).Generate();
Console.WriteLine(secret);
I have updated the pattern requirement
If Fare doesn't support lookaheads, you could instead use a regular expression that avoids them:
^([0-9])([A-Z])([!@#%*()$?+-])([a-zA-Z0-9]{5})$
This expression can also be used for validating passwords, but it restricts the user to entering a digit first, a capital letter second and a special character third, with the remaining five characters being alphanumeric.
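If the fixed ordering is a problem, one alternative is to skip the regex generator and build the password by hand: pick the required characters first, fill the rest, then shuffle. The sketch below does not use Fare at all; the names are hypothetical, the special-character set is copied from the question as shown, and RandomNumberGenerator.GetInt32 assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordSketch
{
    private const string Digits = "0123456789";
    private const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const string Special = "!#$%^&*()=,."; // as listed in the question
    private const string Alnum = "abcdefghijklmnopqrstuvwxyz" + Upper + Digits;

    // Picks one character from the pool using a cryptographic RNG.
    private static char Pick(string pool) =>
        pool[RandomNumberGenerator.GetInt32(pool.Length)];

    public static string Generate()
    {
        var chars = new char[8];
        chars[0] = Pick(Digits);   // at least one digit
        chars[1] = Pick(Upper);    // at least one capital letter
        chars[2] = Pick(Special);  // at least one special character
        for (int i = 3; i < chars.Length; i++)
            chars[i] = Pick(Alnum);

        // Fisher-Yates shuffle so the required characters are not in fixed positions.
        for (int i = chars.Length - 1; i > 0; i--)
        {
            int j = RandomNumberGenerator.GetInt32(i + 1);
            (chars[i], chars[j]) = (chars[j], chars[i]);
        }
        return new string(chars);
    }
}
```

The lookahead pattern from the question can still be used afterwards to validate the result, since validation goes through the regular .NET regex engine rather than Fare.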

Regex validation of comma-separated words - foreign characters

I am developing an application in Arabic and English, so I need a regex that validates a comma-separated list of words. Here is my regex:
^([a-zA-Z]+(,[a-zA-Z]+)*)?$
This works flawlessly for me, but as you can see, the characters specified are English letters; I want the same for Arabic.
Can this expression be altered to accept other characters, either Arabic or maybe some other language?
Instead of restricting to a set of alphabetical characters, exclude the character that marks the end of a word:
^([^,]+(,[^,]+)*)?$
If you really want to match Arabic characters, see: regular expression For Arabic Language
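A middle ground between "English letters only" and "anything except a comma" is to allow letters of any script through the Unicode categories \p{L} (letters) and \p{M} (combining marks, which Arabic diacritics use). The following is a sketch under that assumption; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordListCheck
{
    // Same shape as the original pattern, with [a-zA-Z] replaced by
    // "any letter or combining mark in any script".
    private static readonly Regex LetterWords =
        new Regex(@"^([\p{L}\p{M}]+(,[\p{L}\p{M}]+)*)?$");

    // True for an empty string or a comma-separated list of words.
    public static bool IsLetterWordList(string input) =>
        input != null && LetterWords.IsMatch(input);
}
```

Unlike the [^,]+ version, this still rejects digits, spaces and punctuation inside a word.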

Regular expressions with the Cyrillic alphabet?

I am currently writing some validation for user input. I am using regular expressions to do so, working in C#.
Password = @"(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{6,18})$"
Validate Alpha Numeric = [^a-zA-Z0-9ñÑáÁéÉíÍóÓúÚüÜ¡¿{0}]
The above work fine with the Latin alphabet, but how can I extend them to work with the Cyrillic alphabet?
The basic approach to covering ranges of characters using regular expressions is to construct an expression of the form [A-Za-z], where A is the first letter of the range, and Z is the last letter of the range.
The problem is, there is no such thing as "the" Cyrillic alphabet: the alphabet is slightly different depending on the language. If you would like to cover the Russian version of Cyrillic, use [А-Яа-я]. Note that this range does not include Ё and ё, which sit outside it in Unicode, so add them explicitly if you need them: [А-Яа-яЁё]. You would use a different range for, say, Serbian, because the last letter in their Cyrillic alphabet is Ш, not Я.
Another approach is to list all characters one-by-one. Simply find an authoritative reference for the alphabet that you want to put in a regexp, and put all characters for it into a pair of square brackets:
[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя]
You can use Unicode named blocks and categories if you need to allow characters of a particular language or a particular type:
@"\p{IsCyrillic}+" // characters in the Cyrillic block
@"[\p{Lu}\p{Ll}]+" // any upper/lower case letters in any language
In your case maybe "not a whitespace" would be enough: @"[^\s]+", or maybe "word character" (which includes digits and underscores): @"\w+".
Password = @"(?!^[0-9]*$)(?!^[А-Яа-я]*$)^([А-Яа-я0-9]{6,18})$"
Validate Alpha Numeric = [^а-яА-Я0-9ñÑáÁéÉíÍóÓúÚüÜ¡¿{0}]
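To see how the approaches above differ in practice, here is a small sketch comparing the literal range [А-Яа-я], the same range extended with Ё and ё, and the named block \p{IsCyrillic}. The class and method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CyrillicCheck
{
    // Whole string lies in the Unicode Cyrillic block (U+0400 to U+04FF).
    public static bool IsCyrillicBlock(string s) =>
        Regex.IsMatch(s, @"^\p{IsCyrillic}+$");

    // Whole string lies in the contiguous range А..я (does not cover Ё/ё).
    public static bool IsRussianRange(string s) =>
        Regex.IsMatch(s, "^[А-Яа-я]+$");

    // Same range with Ё and ё added explicitly.
    public static bool IsRussianWithYo(string s) =>
        Regex.IsMatch(s, "^[А-Яа-яЁё]+$");
}
```

The named block is the broadest of the three, since it also covers letters used by other Cyrillic-based alphabets.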

Regular Expression to validate email ending in .edu

I am trying to create a regex validation attribute in ASP.NET MVC to validate that an entered email has the .edu TLD.
I have tried the following, but the expression never validates to true:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+edu
and
\w.\w@{1,1}\w[.\w]?.edu
Can anyone provide some insight?
This should work for you:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$
Breakdown, since you said you were weak at regex:
^ Beginning of string
[a-zA-Z0-9._%+-]+ One or more letters, digits, dots, underscores, percent signs, plus signs or dashes
@ A literal @
[a-zA-Z0-9.+-]+ One or more letters, digits, dots, plus signs or dashes
\.edu A literal .edu
$ End of string
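The pattern broken down above can be tried out directly with the Regex class. This is a minimal sketch; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EduEmail
{
    // Local part, a literal @, a domain part, and a final ".edu".
    private static readonly Regex EduPattern = new Regex(
        @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$");

    // True if the whole address matches and ends in .edu.
    public static bool IsEduAddress(string email) =>
        email != null && EduPattern.IsMatch(email);
}
```

Because of the anchors, an address such as student@university.edu.cn is rejected even though it contains ".edu".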
If you're using ASP.NET MVC validation attributes, your regular expression has to be valid in JavaScript regex syntax as well as C# regex syntax, because when client-side validation is enabled the same pattern is also evaluated in the browser. Many constructs are the same, but you have to be wary of the differences.
You want your attribute to look like the following:
[RegularExpression(@"([0-9]|[a-z]|[A-Z])+@([0-9]|[a-z]|[A-Z])+\.edu$", ErrorMessage = "text to display to user")]
The reason you put the @ before the string is to make it a verbatim string literal; otherwise the C# compiler would try to interpret backslash sequences such as \. itself before the pattern reaches the regex engine.
(a|b|c) matches either 'a', 'b' or 'c'. [a-z] matches any character between a and z, and similarly for capital letters and digits, so ([0-9]|[a-z]|[A-Z]) matches any alphanumeric character.
([0-9]|[a-z]|[A-Z])+ matches one or more alphanumeric characters; + in a regular expression means one or more of the preceding element.
@ matches the '@' symbol in an email address. If it doesn't work, you might have to escape it, but I don't know of any special meaning for @ in a JavaScript regex.
Let's simplify it further:
[RegularExpression(@"\w+@\w+\.edu$", ErrorMessage = "text to display to user")]
\w stands for any alphanumeric character or underscore. Note that it does not match dots or hyphens, so this simplified pattern rejects addresses such as first.last@cs.example.edu.
Read some regex documentation at https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions for more information.
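To check how such an attribute behaves on the server side without running a whole MVC application, the model can be validated by hand with the Validator class from System.ComponentModel.DataAnnotations. The model, property and helper names below are hypothetical, and the pattern is the more permissive one with dots and dashes allowed.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical model used only for this illustration.
public class SignupModel
{
    [RegularExpression(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.edu$",
        ErrorMessage = "Please enter an .edu email address.")]
    public string Email { get; set; }
}

public static class SignupValidation
{
    // Runs all validation attributes on the model's properties.
    public static bool IsValid(SignupModel model)
    {
        var context = new ValidationContext(model);
        var results = new List<ValidationResult>();
        return Validator.TryValidateObject(model, context, results, validateAllProperties: true);
    }
}
```

Note that RegularExpressionAttribute treats a null or empty value as valid, so a [Required] attribute is needed as well if the field is mandatory.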
You may have different combinations, and maybe this very simple one will do:
\S+@\S+\.\S+\.edu
Try this:
Regex regex = new Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(edu)$", RegexOptions.IgnoreCase);

Regular expression for non-standard ascii characters

I need a regular expression that checks a string for any non-standard ASCII characters.
You can specify a character's Unicode code point in a C# string: "[\u0080-\uFFFF]" should find any character whose code point is 128 or higher.
Does this simple one suit your needs? It matches anything outside the printable ASCII range, including tabs and line breaks:
[^\x20-\x7E]
Put what you consider the standard characters in a set, then put the negation sign ^ at the start of the set. That will match the non-standard characters. For example, if I consider the standard to be the letters A-Z and a-z, my non-standard match pattern would be
[^A-Za-z]
If that matches, you have a non-standard character.
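The negated-set idea can be sketched in C# as follows, with two possible definitions of "standard": all of 7-bit ASCII, or only printable ASCII. Which one is wanted is an assumption left to the reader, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiCheck
{
    // True if the string contains any character outside 7-bit ASCII (U+0000 to U+007F).
    public static bool HasNonAscii(string s) =>
        Regex.IsMatch(s, @"[^\x00-\x7F]");

    // True if the string contains anything outside printable ASCII (space to tilde),
    // which also flags control characters such as tab and newline.
    public static bool HasNonPrintableAscii(string s) =>
        Regex.IsMatch(s, @"[^\x20-\x7E]");
}
```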
