C# – Email Regular Expression

I wrote a regex for email that is gets the best results of any I have found online. Along with getting better results, it is shorter too.

Download the C# project with unit tests here:

EmailRegEx.zip

The pattern of an email is described as follows:

  1. It will always have a single @ sign
  2. 1 to 64 characters before the @ sign called the local-part. Can contain characters a–z, A–Z, 0-9, ! # $ % & ‘ * + – / = ? ^ _ ` { | } ~, and . if it is not at the first or end of the local-part.
  3. Some characters after the @ sign that have a pattern as follows called the domain.
    1. It will always have a period “.”.
    2. One or more character before the period.
    3. Two to four characters after the period.

So a simple patterns of an email address should be something like these:

  1. This one just makes sure there are characters before and after the @
    .+@.+
  2. This one makes sure the are characters before and after the @ as well as a character before and after the . in the domain.
    .+@.*+\..+
  3. This one makes sure that there is only one @ symbol.
    [^@]+@[^@]+\.

These are all quick an easy examples and will not work in every instance but are usually accurate enough for casual programs.

But a comprehensive example is much more complex.

  1. I wrote one myself that is the shortest and gets the best results of any I have found:
    ^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*@((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z
    
  2. Here is another complex one I found: [reference]
    ^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
    

So let me explain the first one that I wrote as it passes my unit tests below:

The start
[\w!#$%&'*+\-/=?\^_`{|}~]+ At least one valid local-part character not including a period.
(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)* Any number (including zero) of a group that starts with a single period and has at least one valid local-part character after the period.
@ The @ character
( Start group 1
( Start group 2
([\-\w]+\.)+ At least one group of at least one valid word character or hyphen followed by a period. The attached project has a more complex hostname regex option too.
[\w]{2,4} Any two to four valid top level domain characters.
) End group 2
| an OR statement
( Start group 3
([0-9]{1,3}\.){3}[0-9]{1,3} A regular expression for an IP Address. The attached project has a more complex IP regex example too.
) End group 3
) End group 1
\z No end of line: \r or \n.

Code for the Email Regular Expression

Here is code for both examples. My email regular expression is enabled and the one I found on line is commented out. To see how they work differently, just comment out mine, and uncomment the one I found online.

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace RegularExpressionsTest
{
    class Program
    {
        static void Main(string[] args)
        {
            String theEmailPattern = @"^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*"
                                   + "@"
                                   + @"((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

            // The string pattern from here doesn't not work in all instances.
            // http://www.cambiaresearch.com/c4/bf974b23-484b-41c3-b331-0bd8121d5177/Parsing-Email-Addresses-with-Regular-Expressions.aspx
            //String theEmailPattern = @"^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))"
            //                       + "@"
            //                       + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])"
            //                       + "|"
            //                       + @"(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$";

            Console.WriteLine("Bad emails");
            foreach (String email in GetBadEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }

            Console.WriteLine("Good emails");
            foreach (String email in GetGoodEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }
        }

        private static void Log(bool inValue)
        {
            if (inValue)
            {
                Console.WriteLine("It matches the pattern");
            }
            else
            {
                Console.WriteLine("It doesn't match the pattern");
            }
        }

        private static List<String> GetBadEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe"); // should fail
            emails.Add("joe@home"); // should fail
            emails.Add("a@b.c"); // should fail because .c is only one character but must be 2-4 characters
            emails.Add("joe-bob[at]home.com"); // should fail because [at] is not valid
            emails.Add("joe@his.home.place"); // should fail because place is 5 characters but must be 2-4 characters
            emails.Add("joe.@bob.com"); // should fail because there is a dot at the end of the local-part
            emails.Add(".joe@bob.com"); // should fail because there is a dot at the beginning of the local-part
            emails.Add("john..doe@bob.com"); // should fail because there are two dots in the local-part
            emails.Add("john.doe@bob..com"); // should fail because there are two dots in the domain
            emails.Add("joe<>bob@bob.com"); // should fail because <> are not valid
            emails.Add("joe@his.home.com."); // should fail because it can't end with a period
            emails.Add("john.doe@bob-.com"); // should fail because there is a dash at the start of a domain part
            emails.Add("john.doe@-bob.com"); // should fail because there is a dash at the end of a domain part
            emails.Add("a@10.1.100.1a");  // Should fail because of the extra character
            emails.Add("joe<>bob@bob.com\n"); // should fail because it end with \n
            emails.Add("joe<>bob@bob.com\r"); // should fail because it ends with \r
            return emails;
        }

        private static List<String> GetGoodEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe@home.org");
            emails.Add("joe@joebob.name");
            emails.Add("joe&bob@bob.com");
            emails.Add("~joe@bob.com");
            emails.Add("joe$@bob.com");
            emails.Add("joe+bob@bob.com");
            emails.Add("o'reilly@there.com");
            emails.Add("joe@home.com");
            emails.Add("joe.bob@home.com");
            emails.Add("joe@his.home.com");
            emails.Add("a@abc.org");
            emails.Add("a@abc-xyz.org");
            emails.Add("a@192.168.0.1");
            emails.Add("a@10.1.100.1");
            return emails;
        }
    }
}

Well, now you have the best C# Email Regular Expression out there.

Update: My attached project has an even better and more accurate one now too.

(Reference: wikipedia)

7 Comments

  1. greatonkar says:

    tested...i works fine but only one issue
    regular expression not working for match test@-test.com or test@test-.com

    • Rhyous says:

      Are you sure? Neither test-.com or -test.com are valid. I have specific unit tests for both of those being invalid and the unit tests are passing.

      From RFC952
      The last character must not be a minus sign or period.
      The first character must be an alpha character.

      From RFC 1123
      One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit.

  2. MS says:

    If a newline character is at the end of the address it passes as valid even against the most complex regex when it should not I guess. Such as "test@test.com\n"

  3. Thank you, the shortest and still accurate one I have come across so far.

  4. Rhyous says:

    I added a project with Unit tests and fixed a bug or two.

    In the project, in the constructor of the EmailValidator object, just set the Pattern to the desired static email regular expression.

Leave a Reply

Powered by sweet Captcha