While reading about the best way to validate an email address with regular expressions, I came across an attempt to validate with:
try
{
new MailAddress(input);
}
catch (Exception ex)
{
// invalid
}
What method does the MailAddress class use to ensure a mail address is valid?
You can see the source code without using Reflector using the new .NET Reference Source. Here's the link to the MailAddress class.
If by "validate" you mean checking whether the input is in a valid e-mail address format, it supports several standard formats:
The MailAddress class supports the following mail address formats:
A simple address format of user@host. If a DisplayName is not set, this is the mail address format generated.
A standard quoted display name format of "display name" <user@host>. If a DisplayName is set, this is the format generated.
Angle brackets are added around the User name, Host name for "display name" user@host if these are not included.
Quotes are added around the DisplayName for display name <user@host>, if these are not included.
Unicode characters are supported in the DisplayName property.
A User name with quotes. For example, "user name"@host.
Consecutive and trailing dots in user names. For example, user...name..@host.
Bracketed domain literals. For example, <user@[my domain]>.
Comments. For example, (comment)"display name"(comment)<(comment)user(comment)@(comment)domain(comment)>(comment). Comments are removed before transmission.
This is from MailAddress Class
As for what method it uses to validate the formats, I don't know. You could always try Reflector to see what it's doing internally. Is there a particular reason you want to know the internal details?
According to the documentation
The address parameter can contain a display name and the associated
e-mail address if you enclose the address in angle brackets. For
example:
"Tom Smith <tsmith@contoso.com>"
White space is permitted between the display name and the angle
brackets.
So a "naked" address such as tsmith@contoso.com, or one with a display name as mentioned in the documentation, is fine. It is impossible to tell how the validation is done internally without access to the code, but a regex doing that validation can of course be constructed.
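To see what the documented formats look like in practice, here is a minimal sketch. The class name MailAddressDemo and the helper Parse are made up for this illustration; only MailAddress and its DisplayName, User and Host properties come from System.Net.Mail. It assumes the constructor signals bad input with FormatException (or ArgumentException for null or empty strings), as the documentation describes.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper for illustration; not part of the framework.
public static class MailAddressDemo
{
    // Returns the parsed parts, or null when MailAddress rejects the input.
    public static (string DisplayName, string User, string Host)? Parse(string input)
    {
        try
        {
            var addr = new MailAddress(input);
            return (addr.DisplayName, addr.User, addr.Host);
        }
        catch (FormatException)
        {
            // The string is not in one of the supported address formats.
            return null;
        }
        catch (ArgumentException)
        {
            // Null or empty input (ArgumentNullException derives from ArgumentException).
            return null;
        }
    }
}
```

Calling MailAddressDemo.Parse("Tom Smith <tsmith@contoso.com>") should give back the display name, user and host as separate strings, while a naked tsmith@contoso.com yields an empty display name.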
Related
In validating email addresses I have tried using both the EmailAddressAttribute class from System.ComponentModel.DataAnnotations:
[EmailAddress(ErrorMessage = "Invalid Email Address")]
public string Email { get; set; }
and the MailAddress class from System.Net.Mail by doing:
bool IsValidEmail(string email)
{
try {
var addr = new System.Net.Mail.MailAddress(email);
return addr.Address == email;
}
catch {
return false;
}
}
as suggested in C# code to validate email address. Both methods work in principle: they catch invalid email addresses such as user@, which do not match the format user@host.
My problem is that neither method detects invalid characters in the user field, such as æ, ø, or å (e.g. åge@gmail.com). Is there a reason why such characters do not return a validation error? And does anybody have an elegant way to add validation for invalid characters in the user field?
Those characters are not invalid. Unusual, but not invalid. The question you linked even contains an explanation why you shouldn't care.
Full use of electronic mail throughout the world requires that
(subject to other constraints) people be able to use close variations
on their own names (written correctly in their own languages and
scripts) as mailbox names in email addresses.
- RFC 6530, 2012
The characters you mentioned (ø and å, as in åge@gmail.com) are not invalid. For example, when someone's address is written in another language (French, German, etc.), some Unicode characters are possible. Yet EmailAddressAttribute blocks some of the unusual characters.
You can use international characters above U+007F, encoded as UTF-8.
space and "(),:;<>@[] characters are allowed with restrictions (they are only allowed inside a quoted string, and a backslash or double-quote must be preceded by a backslash)
special characters !#$%&'*+-/=?^_`{|}~
Regex to validate this: Link
^(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))@(([^<>()\[\]\.,;:\s@\"]+\.)+[^<>()\[\]\.,;:\s@\"]{2,})$
I've got a regular expression that I am using to check whether a string is an email address:
@"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$"
This works fine for all the email addresses I've tested, provided the bit before '@' is at least four characters long.
Works:
web1@domain.co.uk
Doesn't work:
web@domain.co.uk
How can I change the regex to allow prefixes of fewer than 4 characters?
The 'standard' regex used in ASP.NET MVC account models for email validation is as follows:
@"^[\w-]+(\.[\w-]+)*@([a-z0-9-]+(\.[a-z0-9-]+)*?\.[a-z]{2,6}|(\d{1,3}\.){3}\d{1,3})(:\d{4})?$"
It allows 1+ characters before the @
I believe the best way to check a valid email address is to make the user type it twice and then send him an email and challenge the fact that he received it using a validation link.
Check your regex against a list of weird valid email addresses and you will see that regexes are not perfect for email validation tasks.
I recommend not using a regex to validate email (for reasons outlined here) http://davidcel.is/blog/2012/09/06/stop-validating-email-addresses-with-regex/
If you can't send a confirmation email, a good alternative in C# is to try creating a MailAddress and check whether it fails.
If you're using ASP.NET you can use a CustomValidator to call this validation method.
bool isValidEmail(string email)
{
try
{
MailAddress m = new MailAddress(email);
return true;
}
catch
{
return false;
}
}
You can use this regex as an alternative:
^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$
Its description can be found here.
About your regex, the starting part (([\w]+\.[\w]+)+) forces the email address to have four characters at the beginning. Emending this part
would do the work for you.
The little trick used in the accepted answer, i.e. catching exceptions on
new MailAddress(email);
doesn't seem very satisfying, as it considers "a@a" a valid address. In fact, it doesn't raise an exception for almost any string matching the regex ".*@.*", which is clearly too permissive. For example,
new MailAddress("¦#°§¬|¢#¢¬|")
doesn't raise an exception.
Thus I would clearly go for regex matching.
This example is quite satisfying
https://msdn.microsoft.com/en-us/library/01escwtf%28v=vs.110%29.aspx
You can also try this one
^[a-zA-Z0-9._-]*@[a-z0-9._-]{2,}\.[a-z]{2,4}$
Is there a way to validate and correct badly formatted email IDs in C#? I have a function that can only validate, not correct. Some email IDs, like "abc@def.com.", can be corrected. I'm fetching all email IDs from a database and sending each of them a mail. If I just remove the invalid email IDs, the person may lose information, so instead of removing them I thought of correcting the email ID and sending the mail.
Is there a way, or a function, to do this?
Thanks in advance.
If you have the email address as a string, then you can manipulate the string. In your example, that would be removal of the trailing period. Other than this simple example, I suggest that you think long and hard about how useful this will be. What is the context? Can you pass the mail address back to a user to get the correct address, as opposed to your best guess? Adding code will clarify your question. From your question, I don't know why you assume you can only validate, as opposed to correcting the mail address string.
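The trailing-period case from the question can be handled with plain string manipulation. This is only a sketch: the class EmailCleanup, the method TrimObviousJunk, and the choice of which trailing characters count as junk are assumptions made for this example, and anything beyond this kind of cleanup is guesswork about what the user meant.

```csharp
using System;

// Hypothetical helper for illustration.
public static class EmailCleanup
{
    // Assumed set of characters that sometimes trail a stored address
    // but cannot legitimately end a host name.
    private static readonly char[] TrailingJunk = { '.', ',', ';' };

    // Removes surrounding whitespace, then any trailing junk characters.
    // It does not attempt to fix anything else in the address.
    public static string TrimObviousJunk(string raw)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));
        return raw.Trim().TrimEnd(TrailingJunk);
    }
}
```

The cleaned string should still be passed through whatever validation you already have, since trimming alone does not make an address valid.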
You could check whether the mail domain exists, for example like this. You can also check whether the email ends with an invalid character like "." or "," and remove it if found. But you cannot really "correct" wrong emails by changing each character and checking whether the resulting address exists. That is not desirable either, since for almost every change you make you would probably find an existing email address that is not the one you actually want to reach.
No. There is no way to do this. You may have a built-in guess system that will take care of common mistakes though.
For instance, if I type my email ID as abc@gmali.com, you may change it to abc@gmail.com. This still does not guarantee that the email ID is now correct.
Assume my email ID is abc@gmail.com and I intentionally typed asd@gmail.com. Now there is no way you can correct it. With the same intention, if I type asd@gmial.com, your code might "correct" it to asd@gmail.com, which is still incorrect.
Essentially, what you are looking for is called client-side validation. Whatever front end you have, add validation that checks whether the email address is syntactically correct. To verify that the user has given their real email, send a mail to the given address with an activation link and ask them to click it if they want to use the application.
Edit:
If you need to just format the emails in database, you can check for common mistakes using queries/external executable. These will validate the data against a valid format which then, can be changed. What are the options you have, technology wise, for doing this?
I was wondering if anybody has found a solution that validates an email that includes unicode characters as in from a unicode domain? I have searched at length and have yet to find a solution that works.
Fully validating an email address through a regex is hard. Really hard. This is one that is fully compliant with RFC822. Even if you create a perfect regex that correctly validates all email addresses, that doesn't stop me from entering hi@hi.com (if you're trying to make sure that I enter a valid email address) or from accidentally misspelling my username (if you're trying to make sure that I enter my email address correctly).
Just send a link in an email saying, "click here to validate your email address."
I had the same issue and came up with an intelligent solution \p{L}.
Please check it out:
// Requires: using System.Text.RegularExpressions;
private static bool IsEmailValid(string email) {
    // \p{L} matches any Unicode letter, so non-ASCII user and domain names pass.
    Regex re = new Regex(@"^[\p{L}0-9!$'*+\-_]+(\.[\p{L}0-9!$'*+\-_]+)*@[\p{L}0-9]+(\.[\p{L}0-9]+)*(\.[\p{L}]{2,})$", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
    return re.IsMatch(email);
}
Ok, so the only email validation I ever found that was truly awesome (instead of just OK) is part of the Zend Framework. Of course that means PHP, hopefully though, you can look at how they do it and emulate some of their better ideas: http://pastebin.com/SvZPBp31 Or just look up Zend_Validate_EmailAddress sourcecode.
sorry that this isn't in C# syntax / language.
As has been pointed out, validating e-mail addresses through a regular expression is a hard problem. You can get close with a fairly simple one, but there are many, many cases that it will fail to catch. I'm all for sending an email to a supposed email address as @Nick ODell suggests (after doing some basic sanity checking: does it contain an @ sign, does the domain name portion exist and have one or more of MX/A/AAAA RRs, and the like) and including a verification link.
That said, if by Unicode domain you mean a Punycode-encoded host name label, those should be covered by any half-way competent validation regexp, as in encoded form those are just xn-- followed by the regular set [a-z0-9-] (case insensitive comparison).
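If the address arrives with the host in its Unicode form rather than already Punycode-encoded, one option is to convert the host first and then run whatever ASCII-only check you already have. The sketch below uses System.Globalization.IdnMapping for the conversion; the class IdnEmail and method ToAsciiHost are names invented for this example, and splitting on the last @ is a simplifying assumption that ignores quoted local parts and comments.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration.
public static class IdnEmail
{
    // Converts only the host part to its ASCII (Punycode) form.
    // The local part is returned unchanged.
    public static string ToAsciiHost(string email)
    {
        if (email == null) throw new ArgumentNullException(nameof(email));

        // Simplifying assumption: the last '@' separates user and host.
        int at = email.LastIndexOf('@');
        if (at <= 0 || at == email.Length - 1)
            throw new FormatException("Expected a non-empty user and host around '@'.");

        string local = email.Substring(0, at);
        string host = email.Substring(at + 1);

        // IdnMapping.GetAscii throws ArgumentException for host names it cannot encode.
        return local + "@" + new IdnMapping().GetAscii(host);
    }
}
```

For example, a host label such as bücher becomes an xn-- label, which the regular [a-z0-9-] character set then covers.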
I would like to use the masked edit box to let people type in a valid email address.
What is the best mask I should use in MaskedEditExtender.Mask?
There are two different things: the mask and the validation.
Obviously, you cannot use one of the predefined MaskTypes (number, date,...) and neither a mask with a fixed length (and a very long mask with enough underscores for every possible email address will be ugly and confuse the user). My answer would be: don't use a mask.
However, you can still validate the email address setting the ValidationExpression to your favorite email regex (like the one that comes with the Regex validator control). Note that no "perfect" email validation regex exists, but you can warn the user if there are apparent errors.