iGuest

How Do You Validate Email Addresses? Looking for the best method


I'm wondering how you would validate an email address; specifically, what regular expression would you use to check for a valid format? I've seen many different ways of doing it, and some of them definitely produce wrong results. I'm not saying mine is perfect either, so I'd like to improve it.

Here's what I have. I use PHP's eregi function for this, so case is not important.

^[a-z][_.a-z0-9-]+@[a-z0-9][a-z0-9-]+\.[a-z]+([.]?[a-z][a-z]+)*

What I wanted to achieve: the first character of any email address should be a letter, if I'm not mistaken. I should probably go back and read RFC 2822, which ought to have been my first stop for understanding how email addresses are handled and what is and isn't accepted. Next I allow an alphanumeric word that may also contain underscores, dots and hyphens, followed by the @ sign. After that I check that the first character of the domain is a letter or number, followed by another alphanumeric word that may contain hyphens (I don't think you can have an underscore in a domain name). Then comes a dot, then a word of letters only; I'm still wondering whether to include numbers there, but so far I've seen no reason to. The last bit was the trickiest to get working: it checks whether another dot follows, and if so, that a letter comes next and then a word of letters only, so there must be at least 2 letters after the dot or there is no match. It keeps repeating that check for each additional dot.

Which means something like a2z-abc_def@a2z-abc.com.net.co.uk is valid, even if it does not exist, but that's where checking the host comes into play.
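As a rough way to try the pattern above against a few addresses, here is a minimal sketch. It assumes preg_match() with the /i flag in place of eregi(), and the constant and function names are made up for illustration. It also includes a second copy of the pattern with a trailing $ (an assumption, not part of the original), because without an end anchor anything after the matched part is ignored.

```php
<?php
// The pattern from the post, wrapped in / delimiters with /i instead of eregi().
const ORIGINAL_PATTERN = '/^[a-z][_.a-z0-9-]+@[a-z0-9][a-z0-9-]+\.[a-z]+([.]?[a-z][a-z]+)*/i';

// Same pattern with an end anchor added (an assumed fix, not in the original).
const ANCHORED_PATTERN = '/^[a-z][_.a-z0-9-]+@[a-z0-9][a-z0-9-]+\.[a-z]+([.]?[a-z][a-z]+)*$/i';

// Hypothetical helper: true when $email matches $pattern.
function matches_pattern(string $pattern, string $email): bool
{
    return preg_match($pattern, $email) === 1;
}

$samples = [
    'a2z-abc_def@a2z-abc.com.net.co.uk', // the example from the post
    'abc@def.com!!!',                    // trailing junk after the address
    '1abc@def.com',                      // starts with a digit
];

foreach ($samples as $sample) {
    printf(
        "%-36s original: %s  anchored: %s\n",
        $sample,
        matches_pattern(ORIGINAL_PATTERN, $sample) ? 'match' : 'no match',
        matches_pattern(ANCHORED_PATTERN, $sample) ? 'match' : 'no match'
    );
}
```

The second sample is the interesting one: the unanchored pattern matches it because the match is allowed to stop before the exclamation marks.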

It's that end part that worries me a bit, as it can accept any number of dots and words over and over again. I may be able to find a limit in the RFCs and fix it accordingly; otherwise it may be a problem. Other than that, after verifying that the email format is valid, I would check DNS records or the hostname to see whether the email's domain exists. I could query the server to see whether the user exists, but most servers report that the user exists even when they don't, so I'd rather just check that the host exists.
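The host check described above could look something like the sketch below. It assumes PHP's built-in checkdnsrr() for the lookup; the function name email_domain_exists and the optional $lookup parameter are hypothetical, added only so the logic can be tried without network access.

```php
<?php
/**
 * Returns true when the domain part of $email has an MX or A record.
 * $lookup is a hypothetical injection point for testing; when omitted,
 * PHP's checkdnsrr() performs the real DNS query.
 */
function email_domain_exists(string $email, ?callable $lookup = null): bool
{
    $at = strrpos($email, '@');
    if ($at === false || $at === strlen($email) - 1) {
        return false; // no '@', or nothing after it
    }

    $domain = substr($email, $at + 1);
    $lookup = $lookup ?? 'checkdnsrr';

    // A host with no MX record can still receive mail via its A record,
    // so fall back to that before giving up.
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}
```

This only says that the domain resolves. It says nothing about whether the mailbox itself exists.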

So what can you suggest to improve it? I know about shortform ways of writing it, but this is how eregi accepts it. I will now go off and read the RFCs on it, might look at the earlier one first before the newest.

Cheers,


MC


You can send them a link to validate their address. Put the email address in your db in a table that has two fields, the address and a boolean value (default 0), and send them a link like:

yourhost.com/validate?mail=theiremail

and then you just set the boolean to 1.

For the reg expression, I can't really think of anything better than the one you have.

Oh yeah, by the way, thanks for the Linux boot thingy: I inserted my 1st CD, entered repair on boot and was given the option to repair my bootloader! Hurray! Thanks, mate.
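The confirmation-link idea can be sketched roughly as follows. This is only an illustration under a few assumptions: a plain PHP array stands in for the database table, the function names are hypothetical, and a random token is added to the link (not part of the suggestion above) so that the address cannot be confirmed just by guessing the URL.

```php
<?php
/**
 * Stores $email as unconfirmed (flag 0) and returns the link to send.
 * $table is an in-memory stand-in for the two-field table; the token
 * column is an addition made by this sketch.
 */
function register_address(array &$table, string $email): string
{
    $token = bin2hex(random_bytes(16)); // unguessable value for the link
    $table[$email] = ['confirmed' => 0, 'token' => $token];

    // yourhost.com is the placeholder host used in the post.
    return 'http://yourhost.com/validate?mail=' . urlencode($email) . '&token=' . $token;
}

/**
 * Sets the flag to 1 when the token from the link matches the stored one.
 */
function confirm_address(array &$table, string $email, string $token): bool
{
    if (!isset($table[$email]) || !hash_equals($table[$email]['token'], $token)) {
        return false;
    }
    $table[$email]['confirmed'] = 1;
    return true;
}
```

In a real form handler the two functions would read and write the database rather than an array.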


I understand the confirmation email side of things, but this is not a membership system, just an ordinary form where I want to eliminate errors. I could do multiple checks on it, splitting it and so on, but I wanted a surefire, one-liner way of checking it. Now that I've thought about multiple checks, though, maybe I can work out how many checks I would need to make sure it's valid, build from that, and then simplify it down to a one-liner.

 

I'm glad you got your linux sorted.

 

Cheers,

 

 

MC


mastercomputers:

So you want to validate email addresses? I would advise using client-side scripting for that, such as JavaScript or VBScript. That way the validation is done on the client, which avoids unnecessary load on the server and reduces your bandwidth use.

Where to find an example? Go to:
http://zudocube.com/

Search for the keyword "email validation" and select "JavaScript" as the category. Easy and simple.


Try PCRE (Perl Compatible Regular Expressions) (preg_*), it's much faster.

And, convert the email address to all lowercase first.

<^[a-z]{1}[a-z0-9_\.\-]+@[a-z0-9]{1}[a-z0-9\-]+\.[a-z0-9\-\.]+$>



I changed mine a bit and now also use preg_match.

It now looks like:

 

if(preg_match("/^[a-z][a-z0-9_.-]+@[a-z0-9][a-z0-9-]+\.[a-z]+(\.[a-z]+)*$/i", $email))

I think I will settle on this as the method, although I will work out the minimum values, as that's the last thing left to do for this. It's far from perfect, but it handles most of the problems I have seen. I will also see whether I can find any performance difference from converting to lowercase and dropping the case-insensitive flag, or from escaping the characters or enclosing them in braces, though that's a small thing for now.

 

If you look at cryptwizard's code, the only problem I can see is at the end: \.[a-z0-9\-\.]

It accepts blah_this@abc-def.com.au as valid, but it also accepts blah_this@abc-def.....com....au.. as valid.
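To see the difference between the two patterns on those addresses, a small comparison along these lines can help. It is a sketch only: the constant and function names are hypothetical, cryptwizard's pattern is wrapped in / delimiters here instead of the < > pair, and input is lowercased first as suggested in the quoted post.

```php
<?php
// cryptwizard's pattern, with / delimiters swapped in for < and >.
const CRYPTWIZARD_PATTERN = '/^[a-z]{1}[a-z0-9_\.\-]+@[a-z0-9]{1}[a-z0-9\-]+\.[a-z0-9\-\.]+$/';

// The preg_match pattern from this post.
const MC_PATTERN = '/^[a-z][a-z0-9_.-]+@[a-z0-9][a-z0-9-]+\.[a-z]+(\.[a-z]+)*$/i';

// Hypothetical helper: lowercases the address, then tests it against $pattern.
function pattern_accepts(string $pattern, string $email): bool
{
    return preg_match($pattern, strtolower($email)) === 1;
}

foreach (['blah_this@abc-def.com.au', 'blah_this@abc-def.....com....au..'] as $address) {
    printf(
        "%-36s cryptwizard: %s  MC: %s\n",
        $address,
        pattern_accepts(CRYPTWIZARD_PATTERN, $address) ? 'valid' : 'invalid',
        pattern_accepts(MC_PATTERN, $address) ? 'valid' : 'invalid'
    );
}
```

The final character class in cryptwizard's pattern includes the dot itself, which is why runs of dots get through, while the (\.[a-z]+)* group requires letters after every dot.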

 

After every dot, it's probably a good idea to check for a letter or number, but not if it's in the first part before the @ sign.

 

The last bit of an email usually falls along the lines of net, biz, com, org, etc., but you can also get com.au, co.uk, etc. I believe there are no dashes in this part, so I assume it's safe to leave them out. If a problem with this were ever brought to my attention I would fix it promptly, but for now I find my code fit for its purpose.

 

I have thought about client-side checking, but I would still be using the server to verify the rest of the data, so I decided to do the whole thing server side. It's all built into my validator class, so it's probably best to keep the code all together rather than split it between client and server. I also trust client-side checking less than server-side checking: if I did check on the client, I'd still verify on the server again, which means double the effort.

 

Thanks for the pointers though.

 

Cheers,

 

 

MC


Short answer: it's impossible. Even if you used the RFC's 2000+ character regular expression, it would still fail some perfectly valid, deliverable addresses. As for checking the host, it can still be thrown off by gobbleDgook@hotmail.com.

In response to the suggestion of using client-side scripts, lots of people (myself included) have them disabled most of the time.

So the only way you can truly validate an e-mail address is to actually send an e-mail to it.


You could always use PHP

<?php
// Note: eregi() was deprecated in PHP 5.3 and removed in PHP 7.0.
$email = "john@zend.com";
if (!eregi("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$", $email)) {
    echo "The e-mail was not valid";
} else {
    echo "The e-mail was valid";
}
?>
Yes, I know this is the Perl & CGI section, but PHP counts as CGI, and client-side checks such as JavaScript will not work when the user has scripting disabled, while the above will. It also takes care of two-character TLDs such as .de and .tk.
Edited by Houdini


The information here on email validation is very helpful and cleared up my doubts. For email validation I use this:

if (!preg_match("/^[^@]+@[^@]+\.[^@]/", $_POST['email'])) {
    $emailerror = "E-mail address is invalid";
}

I think it most probably fulfils the requirement, but this is still a matter for research, so please don't stop this topic until it reaches a final conclusion. It needs some more advanced code to validate an email perfectly.
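For something stricter than the loose pattern above without maintaining a regular expression by hand, one option that nobody in this thread has mentioned is PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL. The sketch below is only an illustration; the function name has_valid_email is hypothetical.

```php
<?php
// Hypothetical helper: true when the submitted form data carries an address
// that PHP's built-in email filter accepts.
function has_valid_email(array $post): bool
{
    $email = isset($post['email']) && is_string($post['email']) ? $post['email'] : '';

    // filter_var() returns the address on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

$emailerror = '';
if (!has_valid_email(['email' => 'john@@zend.com'])) {
    $emailerror = "E-mail address is invalid";
}
```

The built-in filter is still only a format check, so a confirmation email remains the way to prove the address is real.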

thanks.


To validate an email I was using the method below. Some years ago I did some research on the best method and found that this one was the "BEST". Maybe everything has changed today, but I don't much care, as the usual approach now is to check the address with preg_match and then send a confirmation email to prove that it really is a good address.

$email = $_POST['email'];
// More than 64 characters in total is rejected.
if (strlen($email) > 64) {
    echo 'Your Email address is over 64 characters long';
}
if (!preg_match("/^([a-z0-9._-](\+[a-z0-9])*)+@[a-z0-9.-]+\.[a-z]{2,6}$/i", $email)) {
    echo 'Please use a valid Email address';
}


I'm just revisiting this old thread of mine, a lot has changed but I never gave up on Email Validation.

People always ask why I validate this way when I could just send an email and expect confirmation. But if I were creating email addresses for people myself, how could I validate a username that doesn't exist yet?

I can't, so I must write my own rules as to what can or can't be used and I'm basing it on the RFCs.

Now the code:

<?php
function validate_email($email) {
    $err = '';
    if(preg_match('/^((?:[a-z0-9!#$%&\'*+\/=?^_`{|}~-]+(?:\.[^.][a-z0-9!#$%&\'*+\/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*"))@((?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.[^.])+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.[^.]){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]))$/', $email, $matches)) {
        // Local part: everything before the first '@', at most 64 characters.
        if(($pos = strpos($matches[0], '@')) > 64) {
            $err .= 'The username (local part) of your email address is too long.<br />' . "\n";
        }
        // Domain: at most 255 characters after the '@'.
        if(isset($matches[0][256 + $pos])) {
            $err .= 'The domain name for your email address is too long.<br />' . "\n";
        }
    } else {
        // Without this branch a non-matching address would be reported as correct.
        $err .= 'The email address is not in a valid format.<br />' . "\n";
    }
    if($err === '') {
        echo 'it\'s correct';
        return true;
    }
    echo $err;
    return false;
}
validate_email($email = 'test@home.com');
?>

This code has some things in it just to aid visual representation; in practice I only rely on whether true or false is returned, and I attach the error messages to the correct locations on the form.

I don't actually recommend using this. It's based on the RFC and allows far more than it should, but that's for my own personal reasons. It does help you understand what's valid and what's not, though, and as always, if there are any mistakes or problems with it, let me know and I'll try to sort them out.
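The two length checks in the function above can also be pulled out and tried on their own. The sketch below is a hypothetical standalone version using the same 64 and 255 limits; it splits on the last '@' with strrpos() rather than the first, on the assumption that a quoted local part might itself contain an '@'.

```php
<?php
// Returns a list of length problems; an empty array means both parts fit.
// The function name and the "no @ sign" message are made up for this sketch.
function email_length_errors(string $email): array
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return ['The address has no @ sign.'];
    }

    $errors = [];
    // $at is the offset of the '@', which equals the length of the local part.
    if ($at > 64) {
        $errors[] = 'The username (local part) of your email address is too long.';
    }
    // Everything after the '@' is the domain.
    if (strlen($email) - $at - 1 > 255) {
        $errors[] = 'The domain name for your email address is too long.';
    }
    return $errors;
}
```

Keeping the length rules separate from the pattern makes it easier to attach each message to the right place on a form.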



Cheers,


MC

How to correct this

I have this PHP code for my contact form page, but each time I test it on my website the error message "Invalid email address." comes up. How do I correct this so that people who send me a message do not see that error?

// Validate their e-mail address.  

 if (!preg_match('/^[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}$/I', $_POST['email'])) {  

     $errors[] = 'Invalid e-mail address.';  

 }  

Thank you in advance to anyone who can help.

-question by trish

On 11/19/2010 at 1:28 AM, '(G)trish' said:

How to correct this

I have this PHP code for my contact form page, but each time I test it on my website the error message "Invalid email address." comes up. How do I correct this so that people who send me a message do not see that error?

// Validate their e-mail address.

if (!preg_match('/^[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,4}$/I', $_POST['email'])) {

$errors[] = 'Invalid e-mail address.';

}

Thank you in advance to anyone who can help.

-question by trish

 


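This is an old post, but the error most likely comes from the uppercase I modifier at the end of the pattern. PCRE modifiers are case sensitive, so the case-insensitive flag has to be a lowercase i. With an unknown modifier, preg_match fails and returns false, so the negated test if(!preg_match always reports an invalid address. The dot before the final [A-Z]{2,4} should also be escaped as \. so that it only matches a literal dot. To clean up the code slightly, I would use lowercase characters in the pattern, skip the case-insensitive matching and actually convert the email address to lowercase; this will also let you check whether an email address already exists in your database or wherever you store it.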
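A minimal sketch of one way to get trish's check working, assuming the intent was a case-insensitive match. The likely cause of the constant error is the uppercase I modifier, which PHP does not recognise, so preg_match() returns false every time. The sketch uses a lowercase i and escapes the dot before the TLD; the function name contact_form_errors is hypothetical.

```php
<?php
// Returns the list of validation errors for the submitted contact form data.
function contact_form_errors(array $post): array
{
    $errors = [];
    $email = isset($post['email']) && is_string($post['email']) ? $post['email'] : '';

    // Validate their e-mail address: lowercase i modifier, and \. for a literal dot.
    if (!preg_match('/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i', $email)) {
        $errors[] = 'Invalid e-mail address.';
    }
    return $errors;
}

// In the contact form page this would be called as: $errors = contact_form_errors($_POST);
```

Note that {2,4} still rejects addresses whose last label is longer than four letters.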

 

On 6/1/2005 at 2:51 AM, 'geancanach' said:

Short answer, it's impossible. even if you used RFC's issued 2000+ character regular expression, it would still fail some perfectly valid, deliverable addresses.

Impossible? I would like to see my validation pattern fail, so if anyone can make it fail, that would be great; then I could rework it.

 

My email validation in post #13 is still my most accurate validation pattern based on RFC 2822, and I don't believe I've altered the pattern since. The only thing you would really need after this validation is to add your own rules after the '@' position check and the total length check. This lets you customise it further and set boundaries; e.g. if you only want emails to start with a letter or number, add a pattern for that. Keep adding any other rules in the same way.

 

This pattern can be used with any Perl Compatible Regular Expression (PCRE) engine.

 

Cheers,

 

MC
