C# – Email Regular Expression
I wrote a regex for email that gets the best results of any I have found online. Along with getting better results, it is shorter too.
Download the C# project with unit tests here:
The pattern of an email is described as follows:
- It will always have a single @ sign
- 1 to 64 characters before the @ sign, called the local-part. It can contain the characters a-z, A-Z, 0-9, ! # $ % & ' * + - / = ? ^ _ ` { | } ~, and a period (.) as long as the period is not the first or last character of the local-part.
- Some characters after the @ sign, called the domain, that have the following pattern:
  - It will always have a period ".".
  - One or more characters before the period.
  - Two to four characters after the period.
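The rules in the list above can also be checked without a regex. The following is only a minimal sketch: `EmailShape` and `HasBasicShape` are hypothetical names that are not part of the attached project, and the method checks only the structure (one @, local-part length, period placement, length of the part after the last period), not which characters are allowed.

```csharp
public static class EmailShape
{
    // Hypothetical helper: checks only the structural rules from the list,
    // not the set of characters allowed in the local-part or the domain.
    public static bool HasBasicShape(string email)
    {
        if (string.IsNullOrEmpty(email)) return false;

        // Exactly one @ sign.
        int at = email.IndexOf('@');
        if (at < 0 || at != email.LastIndexOf('@')) return false;

        string local = email.Substring(0, at);
        string domain = email.Substring(at + 1);

        // Local-part: 1 to 64 characters, no period at either end.
        if (local.Length < 1 || local.Length > 64) return false;
        if (local[0] == '.' || local[local.Length - 1] == '.') return false;

        // Domain: at least one character before the last period
        // and two to four characters after it.
        int lastDot = domain.LastIndexOf('.');
        if (lastDot < 1) return false;
        int tldLength = domain.Length - lastDot - 1;
        return tldLength >= 2 && tldLength <= 4;
    }
}
```

For example, `EmailShape.HasBasicShape("joe@home.org")` returns true, while `"joe@home"`, `"a@b.c"` and `".joe@bob.com"` all return false.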
So a simple pattern for an email address could be something like one of these:
- This one just makes sure there are characters before and after the @:
.+@.+
- This one makes sure there are characters before and after the @ as well as a character before and after the . in the domain:
.+@.+\..+
- This one makes sure that there is only one @ symbol:
[^@]+@[^@]+\.
These are all quick and easy examples. They will not work in every instance, but they are usually accurate enough for casual programs.
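To see where the quick patterns stop being accurate, here is a small sketch that runs them through `Regex.IsMatch`. The class and constant names are made up for illustration, and the second pattern is written as `.+@.+\..+`. Because `Regex.IsMatch` looks for a match anywhere in the input, the unanchored third pattern still accepts an input with two @ signs. The anchored variant is my own assumption, not something from the article.

```csharp
using System.Text.RegularExpressions;

public static class SimplePatternDemo
{
    // The three quick patterns from the list above.
    public const string CharsAroundAt = @".+@.+";
    public const string DotInDomain = @".+@.+\..+";
    public const string NoAtInParts = @"[^@]+@[^@]+\.";

    // Assumed variant (not from the article): anchored so the whole string
    // must match, which is what actually limits the input to a single @.
    public const string AnchoredSingleAt = @"^[^@]+@[^@]+\.[^@]+\z";

    public static bool Matches(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }
}
```

With these definitions, `Matches("a@b@c.d", SimplePatternDemo.NoAtInParts)` is true because the substring `b@c.` matches, while `Matches("a@b@c.d", SimplePatternDemo.AnchoredSingleAt)` is false.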
But a comprehensive example is much more complex.
- I wrote one myself that is the shortest and gets the best results of any I have found:
^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*@((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z
- Here is another complex one I found: [reference]
^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
So let me explain the first one that I wrote as it passes my unit tests below:
| ^ | The start of the string. |
| [\w!#$%&'*+\-/=?\^_`{|}~]+ | At least one valid local-part character not including a period. |
| (\.[\w!#$%&'*+\-/=?\^_`{|}~]+)* | Any number (including zero) of a group that starts with a single period and has at least one valid local-part character after the period. |
| @ | The @ character |
| ( | Start group 1 |
| ( | Start group 2 |
| ([\-\w]+\.)+ | At least one group of at least one valid word character or hyphen followed by a period. The attached project has a more complex hostname regex option too. |
| [a-zA-Z]{2,4} | Any two to four letters for the top-level domain. |
| ) | End group 2 |
| | | an OR statement |
| ( | Start group 3 |
| ([0-9]{1,3}\.){3}[0-9]{1,3} | A regular expression for an IP Address. The attached project has a more complex IP regex example too. |
| ) | End group 3 |
| ) | End group 1 |
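| \z | The very end of the string. Unlike $, \z does not match before a trailing \n, so an address that ends with a line break is rejected. |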
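
The difference between `$` and `\z` at the end of the pattern is easy to miss, so here is a minimal sketch of it. The two patterns are deliberately tiny stand-ins (not the full email regex) that differ only in the end anchor. `EndAnchorDemo` and its members are names made up for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class EndAnchorDemo
{
    // Deliberately small patterns so that only the end anchor differs.
    public const string DollarPattern = @"^\w+@\w+\.[a-zA-Z]{2,4}$";
    public const string LowerZPattern = @"^\w+@\w+\.[a-zA-Z]{2,4}\z";

    // In .NET, $ (without RegexOptions.Multiline) matches at the end of the
    // string or just before a final \n.
    public static bool MatchesWithDollar(string s)
    {
        return Regex.IsMatch(s, DollarPattern);
    }

    // \z matches only at the very end of the string.
    public static bool MatchesWithLowerZ(string s)
    {
        return Regex.IsMatch(s, LowerZPattern);
    }
}
```

With these, `MatchesWithDollar("test@test.com\n")` is true but `MatchesWithLowerZ("test@test.com\n")` is false. A trailing `\r` or `\r\n` is rejected by both.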
Code for the Email Regular Expression
Here is code for both examples. My email regular expression is enabled and the one I found online is commented out. To see how they work differently, just comment out mine and uncomment the one I found online.
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace RegularExpressionsTest
{
    class Program
    {
        static void Main(string[] args)
        {
            String theEmailPattern = @"^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*"
                                   + "@"
                                   + @"((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

            // The string pattern from here does not work in all instances.
            // http://www.cambiaresearch.com/c4/bf974b23-484b-41c3-b331-0bd8121d5177/Parsing-Email-Addresses-with-Regular-Expressions.aspx
            //String theEmailPattern = @"^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))"
            //                       + "@"
            //                       + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])"
            //                       + "|"
            //                       + @"(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$";

            Console.WriteLine("Bad emails");
            foreach (String email in GetBadEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }

            Console.WriteLine("Good emails");
            foreach (String email in GetGoodEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }
        }

        // Prints whether the last checked address matched the pattern.
        private static void Log(bool inValue)
        {
            if (inValue)
            {
                Console.WriteLine("It matches the pattern");
            }
            else
            {
                Console.WriteLine("It doesn't match the pattern");
            }
        }

        private static List<String> GetBadEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe"); // should fail
            emails.Add("joe@home"); // should fail
            emails.Add("a@b.c"); // should fail because .c is only one character but must be 2-4 characters
            emails.Add("joe-bob[at]home.com"); // should fail because [at] is not valid
            emails.Add("joe@his.home.place"); // should fail because place is 5 characters but must be 2-4 characters
            emails.Add("joe.@bob.com"); // should fail because there is a dot at the end of the local-part
            emails.Add(".joe@bob.com"); // should fail because there is a dot at the beginning of the local-part
            emails.Add("john..doe@bob.com"); // should fail because there are two dots in the local-part
            emails.Add("john.doe@bob..com"); // should fail because there are two dots in the domain
            emails.Add("joe<>bob@bob.com"); // should fail because <> are not valid
            emails.Add("joe@his.home.com."); // should fail because it can't end with a period
            // Note: the simple [\-\w]+ label in the pattern above still matches the next two;
            // rejecting them needs a stricter hostname pattern.
            emails.Add("john.doe@bob-.com"); // should fail because there is a dash at the end of a domain part
            emails.Add("john.doe@-bob.com"); // should fail because there is a dash at the start of a domain part
            emails.Add("a@10.1.100.1a"); // should fail because of the extra character
            // The next two use an otherwise valid address so that only the line ending is tested.
            emails.Add("joe@bob.com\n"); // should fail because it ends with \n
            emails.Add("joe@bob.com\r"); // should fail because it ends with \r
            return emails;
        }

        private static List<String> GetGoodEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe@home.org");
            emails.Add("joe@joebob.name");
            emails.Add("joe&bob@bob.com");
            emails.Add("~joe@bob.com");
            emails.Add("joe$@bob.com");
            emails.Add("joe+bob@bob.com");
            emails.Add("o'reilly@there.com");
            emails.Add("joe@home.com");
            emails.Add("joe.bob@home.com");
            emails.Add("joe@his.home.com");
            emails.Add("a@abc.org");
            emails.Add("a@abc-xyz.org");
            emails.Add("a@192.168.0.1");
            emails.Add("a@10.1.100.1");
            return emails;
        }
    }
}
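The `[\-\w]+` hostname label in the pattern above accepts a hyphen at the start or end of a domain part, so "john.doe@bob-.com" and "john.doe@-bob.com" still match it. One possible way to tighten this is sketched below. It is not the hostname regex from the attached project. The `Label` rule and the `StrictHostEmail` class are assumptions made for illustration, and the label is restricted to ASCII letters, digits and inner hyphens, which is narrower than `\w`.

```csharp
using System.Text.RegularExpressions;

public static class StrictHostEmail
{
    // Same local-part character class as the pattern in the article.
    private const string LocalChars = @"[\w!#$%&'*+\-/=?\^_`{|}~]+";

    // Assumed label rule: starts and ends with a letter or digit,
    // hyphens are allowed only in between.
    private const string Label = @"[a-zA-Z0-9]([a-zA-Z0-9\-]*[a-zA-Z0-9])?";

    public const string Pattern =
        "^" + LocalChars + @"(\." + LocalChars + ")*"
        + "@"
        + @"(((" + Label + @"\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

    public static bool IsValid(string email)
    {
        return email != null && Regex.IsMatch(email, Pattern);
    }
}
```

With this sketch, `StrictHostEmail.IsValid("a@abc-xyz.org")` is true, while `"john.doe@bob-.com"` and `"john.doe@-bob.com"` are rejected.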
Well, now you have the best C# Email Regular Expression out there.
Update: My attached project has an even better and more accurate one now too.
(Reference: wikipedia)

If a newline character is at the end of the address, it passes as valid even against the most complex regex, when I guess it should not. For example: "test@test.com\n"
Interesting...
\r works
\r\n works
\n doesn't
I am reading about it here:
http://stackoverflow.com/questions/988951/net-regex-and-newline
Looks like if I replace the very last $ with \z (lowercase z) it works.
Thank you, the shortest and still accurate one I have come across so far.
I added a project with Unit tests and fixed a bug or two.
In the project, in the constructor of the EmailValidator object, just set the Pattern to the desired static email regular expression.
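For readers without the project at hand, the idea of picking a pattern in the constructor might look roughly like the sketch below. This is a hypothetical stand-in, not the actual `EmailValidator` from the attached project: the class name `EmailValidatorSketch`, the two static pattern fields and the `IsValid` method are all assumptions.

```csharp
using System.Text.RegularExpressions;

// Hypothetical stand-in; the real EmailValidator in the attached project may differ.
public class EmailValidatorSketch
{
    // Static patterns to choose from (both taken from the article text).
    public static readonly string SimplePattern = @".+@.+";
    public static readonly string ComprehensivePattern =
        @"^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*"
        + "@"
        + @"((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

    // The pattern this validator instance uses.
    public string Pattern { get; set; }

    public EmailValidatorSketch()
    {
        // Set Pattern to the desired static email regular expression here.
        Pattern = ComprehensivePattern;
    }

    public bool IsValid(string email)
    {
        return email != null && Regex.IsMatch(email, Pattern);
    }
}
```

Switching to another pattern is then a one-line change in the constructor, or `new EmailValidatorSketch { Pattern = EmailValidatorSketch.SimplePattern }` at the call site.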