Coding Cookbook/Validate Email Address

From Wikibooks, open books for an open world

Regular Expressions?

Many regular expressions have been proposed for this problem, but the e-mail address format is so complicated that it is difficult to validate with a regular expression alone. On the other hand, the vast majority of email addresses that people actually use fall into a small subset of the standard, so usable regular expressions of varying degrees of accuracy can be found scattered widely across the internet. Some perform only a subset of the email validation process; others are quick, not-quite-perfect solutions.
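As an illustration of the quick, not-quite-perfect kind of check, here is a minimal Python sketch. The pattern and the names LOOSE_EMAIL_RE and looks_like_email are assumptions made for this example, not taken from any standard. It only tests the rough shape "something@something.something" and makes no attempt to implement the full address grammar.

```python
import re

# Deliberately loose pattern: one or more non-space, non-"@" characters,
# an "@", then a domain part that contains at least one dot.
# This is a rough shape check, NOT the full e-mail address grammar.
LOOSE_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address: str) -> bool:
    """Return True if the address passes the rough shape check."""
    return LOOSE_EMAIL_RE.fullmatch(address) is not None
```

A check like this will reject some addresses that the standard allows (for example, a domain without a dot) and accept many that do not exist, so it is best treated as a typo filter rather than as validation.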

Practical solution[edit | edit source]

Because parsing an email address is so difficult, it makes little sense to even attempt it, particularly as it would be trivial for someone to give you a correctly formatted email address that still doesn't work. The best way to detect whether an email address is valid is therefore simply to send it an email. If you can verify that the email was received, you know that the address is valid at that moment (though this is no guarantee for any length of time, as many services hand out temporary email addresses).

The most common way of doing this is to send an email containing a link to an HTTP service with a long random string attached to it. Only the email and your server contain the random key, so someone would have to read the email to find the correct link.
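The random-key approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the in-memory dictionary pending, the functions issue_link and confirm, the query parameter name key, and the placeholder URL are all hypothetical, and sending the email itself is left out.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical in-memory store mapping random key -> address.
# A real service would more likely use a database and expire old entries.
pending = {}

# Placeholder URL, assumed for this example only.
BASE_URL = "http://myserver.com/validateemail"


def issue_link(address: str) -> str:
    """Create a random key for the address and return the link to email."""
    token = secrets.token_urlsafe(32)  # 32 random bytes as URL-safe text
    pending[token] = address
    return BASE_URL + "?" + urlencode({"key": token})


def confirm(token: str):
    """Return the confirmed address, or None if the key is unknown.

    The key is removed on first use, so a link works only once.
    """
    return pending.pop(token, None)
```

The server would email the result of issue_link to the address and call confirm with the key parameter when the link is opened. Because the key is generated with the secrets module, it should not be practical to guess.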

The purpose of the random string is to reduce the possibility that someone can make the server believe that the email address exists, while in fact it does not. For example, suppose that the link sent in the email is of the form http://myserver.com/validateemail?address=example@yourserver.com. Someone could use an existing, valid email address to see the pattern, and subsequently call the server with any random email address, even without receiving the actual mail containing the link.

Some servers require authentication when opening the link. For example, the user claiming the email address must supply their user ID and password when validating it. While this is useful for the confidentiality of the data shown on the validation page (including the statement that the address has been validated as an existing address), it does not prevent the registration of non-existent addresses. Only a long random string used as described above can do that.

Note also that the string must be random only from the client's perspective, not necessarily from the server's perspective. For example, the server can digitally sign the URL and validate the signature when the link is opened. In this case, the server does not need to store the random key to be able to validate it.
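One possible way to realize the signed-link idea is sketched below in Python. It uses an HMAC (a keyed hash) in place of a public-key digital signature, which is a substitution made for this example. SECRET_KEY, make_link, verify_link, the parameter names address and sig, and the placeholder URL are all assumptions, not part of any particular framework.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"replace-with-a-long-random-server-secret"

# Placeholder URL, assumed for this example only.
BASE_URL = "http://myserver.com/validateemail"


def sign(address: str) -> str:
    """Return a hex HMAC-SHA256 tag over the address."""
    return hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256).hexdigest()


def make_link(address: str) -> str:
    """Build the link to email; nothing is stored on the server."""
    return BASE_URL + "?" + urlencode({"address": address, "sig": sign(address)})


def verify_link(url: str):
    """Return the address if its tag is valid, otherwise None."""
    params = parse_qs(urlsplit(url).query)
    address = params.get("address", [""])[0]
    sig = params.get("sig", [""])[0]
    expected = sign(address)
    # Compare as bytes, in constant time, to avoid leaking tag prefixes.
    if address and hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return address
    return None
```

To the client the tag looks like a random string, but the server can recompute it from the address and its secret, so no per-address key needs to be stored. In practice one would probably also include an expiry time in the signed data so that old links stop working; that is omitted here for brevity.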

For further reading