I wrote a pattern which validates email addresses. It seems to work, but I wanted to see if anyone could come up with some variations and possibly some tips. Thanks.
Code:String email = new String("(?:\\w+?\\@{1}\\w+?)\\.{1}(?:com|net|org|edu)");
An email address typically consists of two major parts:
- Mailbox - this can contain letters, numbers, underscores and hyphens, and can also contain a + sign as well as dots, of course. So I would recommend you use the ( . ) to match this part.
- Fully qualified domain name / host name - If I were you I wouldn't check for valid TLDs. What about .org.uk and .sch.uk and .de and .ac.uk and .mil?
You can however check that it contains only letters, numbers and hyphens. But they cannot contain underscores, so you can't use \w.
Code:PCRE:
/^.+@((?i)[a-z0-9\-]+(\.(?i)[a-z0-9\-]+)?)+$/
Thanks for replying, visualAd. :thumb: Yeah, I didn't take into account that an email address might contain hyphens, underscores and dots. The + sign I've never seen used in an address, though. I've just been using the code below to test the patterns I create. I found the following expression, which might be better suited: "(\\w[\\-.\\w]*.*@\\w+\\.(?:com|net|org))". I didn't create it, so I'm a bit shaky on how it works. I guess it tests for a word character \\w (don't know why they didn't specify a quantifier), then [\\-.\\w] (I guess it's supposed to be read as "-" or "." or a word character), then more characters; I don't know why the @ isn't escaped, and then, well, we get the rest :lol:.
Code:
import java.util.regex.Pattern;

public class mailval {
    public static void main(String[] args) {
        // Pattern under test: word chars, one @, word chars, a dot, then a fixed TLD.
        String email = "(?:\\w+?\\@{1}\\w+?)\\.{1}(?:com|net|org|edu)";
        Pattern emp = Pattern.compile(email);
        String[] malto = new String[7];
        malto[0] = "[email protected]"; // should be true
        malto[1] = "whatever@@whatever.com"; // false
        malto[2] = "whatever@@whatever.biz"; // false
        malto[3] = "[email protected]"; // true
        malto[4] = "[email protected]"; // true
        malto[5] = "[email protected]"; // false
        malto[6] = "[email protected]"; // false
        for (int i = 0; i < malto.length; i++) {
            // matches() only succeeds if the whole string fits the pattern.
            System.out.println(emp.matcher(malto[i]).matches());
        }
    }
}
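On the questions about that found expression, a small sketch may help. It uses only java.util.regex; the class name FoundPatternCheck and the sample addresses are made up for illustration. It shows that @ is not a regex metacharacter, so escaping it is optional, that \\w without a quantifier matches exactly one character, and that the .* before the @ lets almost anything through, including a space or a second @.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name and sample strings are illustrative only.
class FoundPatternCheck {
    // The expression quoted in the post, unchanged.
    static final Pattern FOUND =
        Pattern.compile("(\\w[\\-.\\w]*.*@\\w+\\.(?:com|net|org))");

    static boolean matchesFound(String s) {
        // matches() succeeds only if the whole input fits the pattern.
        return FOUND.matcher(s).matches();
    }

    public static void main(String[] args) {
        // '@' has no special meaning in a regex, so both forms behave the same.
        System.out.println(Pattern.matches("a@b", "a@b"));     // true
        System.out.println(Pattern.matches("a\\@b", "a@b"));   // true

        // A bare \w matches exactly one word character.
        System.out.println(Pattern.matches("\\w", "a"));       // true
        System.out.println(Pattern.matches("\\w", "ab"));      // false

        // The ".*" before '@' accepts spaces and even an extra '@'.
        System.out.println(matchesFound("someone@example.com"));   // true
        System.out.println(matchesFound("some one@example.com"));  // true
        System.out.println(matchesFound("someone@@example.com"));  // true
    }
}
```

So the found expression is looser than the original one, not stricter: the .* undoes most of what the leading \\w[\\-.\\w]* was trying to enforce.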
I assumed that the regex I gave you didn't work. Actually I'm sure of it - the Java pattern syntax is a little different from the Perl Compatible syntax I gave you. :blush:
I still think you should not limit the top level domains to only that small subset. If you want to ensure that the email is correct, the only real measure you can take is to send the person an email and ask them to click a confirmation link. The regex can only really be used to detect mistakes made by people entering an address, and should be there for the convenience of the user more than as a method used by the developer to ensure a fake email hasn't been entered.
The first part of the email address should really be free form, and if I were you the only constraint I would put on it would be to ensure that there is at least one character before the @, which is why I suggest you use the \S class, which matches any non-whitespace character.
Then of course you have the literal @ character, which is a requirement; the omission of that will definitely mean that the email address is invalid.
The next part of the address is the host. On a local network this may not be a fully qualified domain, but if you want to ensure it is a fully qualified domain then you should match at least one group of letters, numbers and hyphens and at least one dot followed by another group of letters, numbers and hyphens.
So all said, give this one a go: :D
Code:\\S+?@(?:[\\w\\d\\-]+?\\.)(?:[\\w\\d\\-]+?\\.?)+
\\S+? : matches the first part of the email. This will match at least one non-whitespace character in a non-greedy way.
@ : matches the @ sign :D
(?:[\\w\\d\\-]+?\\.) : matches the first part of a fully qualified domain, i.e. the "vbforums." in vbforums.com. This can be any word character \\w, any numeric character \\d and any hyphen \\-, followed by a literal dot \\.
(?:[\\w\\d\\-]+?\\.?)+ : matches further parts of the domain name. As with the first part of the domain, word characters, numeric characters and hyphens are matched, but this time with an optional dot \\.? at the end, as it may be the last part of the domain, i.e. the "com" in vbforums.com. The entire subpattern must be matched at least once.
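As a rough check of the pattern above, here is a sketch that runs it against a few made-up addresses. The class name HostPatternCheck, the STRICTER variant and the sample strings are assumptions for illustration, not something from the thread. Two things are worth seeing: \\w already includes digits and the underscore, so the host part accepts underscores, and \\S also matches @, so a doubled @ slips through. The variant swaps in explicit character classes as one possible way to tighten both.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names and sample strings are illustrative only.
class HostPatternCheck {
    // The pattern as posted.
    static final Pattern POSTED = Pattern.compile(
        "\\S+?@(?:[\\w\\d\\-]+?\\.)(?:[\\w\\d\\-]+?\\.?)+");

    // Assumed variant: no '@' or whitespace in the mailbox,
    // and only letters, digits and hyphens in each host label.
    static final Pattern STRICTER = Pattern.compile(
        "[^\\s@]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)+");

    static boolean posted(String s) { return POSTED.matcher(s).matches(); }
    static boolean stricter(String s) { return STRICTER.matcher(s).matches(); }

    public static void main(String[] args) {
        String[] samples = {
            "someone@example.com",                 // posted=true  stricter=true
            "first.last+tag@mail.example.org.uk",  // posted=true  stricter=true
            "someone@localhost",                   // posted=false stricter=false
            "someone@exa_mple.com",                // posted=true  stricter=false
            "someone@@example.com"                 // posted=true  stricter=false
        };
        for (String s : samples) {
            System.out.println(s + " posted=" + posted(s) + " stricter=" + stricter(s));
        }
    }
}
```

Both patterns still only catch typing mistakes; neither says anything about whether the mailbox actually exists.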
I didn't get to try the pattern you posted: /^.+@((?i)[a-z0-9\-]+(\.(?i)[a-z0-9\-]+)?)+$/. It doesn't seem too far off from a regular expression written in Java, though.
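For what it's worth, the PCRE pattern does appear to carry over to Java almost unchanged: the / delimiters are dropped and each backslash is doubled inside the string literal. The sketch below assumes that translation; the class name PcreInJava and the sample addresses are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name and sample strings are illustrative only.
class PcreInJava {
    // PCRE original: /^.+@((?i)[a-z0-9\-]+(\.(?i)[a-z0-9\-]+)?)+$/
    static final Pattern HOST_CHECK = Pattern.compile(
        "^.+@((?i)[a-z0-9\\-]+(\\.(?i)[a-z0-9\\-]+)?)+$");

    static boolean looksValid(String s) {
        // matches() anchors at both ends, so ^ and $ are redundant but harmless.
        return HOST_CHECK.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksValid("Someone@Example.COM"));   // true: (?i) ignores case
        System.out.println(looksValid("someone@localhost"));     // true: the dot group is optional
        System.out.println(looksValid("someone@exa_mple.com"));  // false: underscore in host
        System.out.println(looksValid("@example.com"));          // false: nothing before the @
    }
}
```

One caution: the outer (...)+ wraps an inner [a-z0-9\-]+, and nested quantifiers like that may backtrack heavily on long inputs that fail to match, so it is probably worth limiting the input length before applying it.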
Here is a pattern that really allows ALL valid e-mail addresses and NO others.
^[A-Za-z0-9!#-'\*\+\-\/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-\/=\?\^_`\{-~]+)*@[A-Za-z0-9!#-'\*\+\-\/=\?\^_`\{-~]+(\.[A-Za-z0-9!#-'\*\+\-\/=\?\^_`\{-~]+)*$
It seems complicated, but really is very simple. The core part is this character class:
[A-Za-z0-9!#-'\*\+\-\/=\?\^_`\{-~]
This is the collection of all characters that are valid as normal parts of e-mail addresses. Substitute this by [[:mail:]] (a made-up shorthand for readability, not a real character class) and the whole expression becomes:
^[[:mail:]]+(\.[[:mail:]]+)*@[[:mail:]]+(\.[[:mail:]]+)*$
So we have the start of the string, followed by one or more mail characters. Then there are any number of groups that consist of a dot followed by one or more mail characters.
Then comes the @.
After that, the same pattern again.
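Since [[:mail:]] is only a placeholder, one way to keep the same readability in Java is to build the expression from a string constant. The sketch below assumes that approach; MailPattern, MAIL and ATOMS are made-up names, and the sample addresses are illustrative.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names and sample strings are illustrative only.
class MailPattern {
    // The core character class from the post, standing in for [[:mail:]].
    static final String MAIL = "[A-Za-z0-9!#-'\\*\\+\\-\\/=\\?\\^_`\\{-~]";

    // One or more mail characters, then any number of ".more" groups.
    static final String ATOMS = MAIL + "+(\\." + MAIL + "+)*";

    // Same shape on both sides of the '@'.
    static final Pattern EMAIL = Pattern.compile("^" + ATOMS + "@" + ATOMS + "$");

    static boolean isValid(String s) {
        return EMAIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("first.last+tag@example.co.uk")); // true
        System.out.println(isValid("first..last@example.com"));      // false: empty group between dots
        System.out.println(isValid(".first@example.com"));           // false: leading dot
        System.out.println(isValid("someone@@example.com"));         // false: '@' is not a mail character
        System.out.println(isValid("someone@localhost"));            // true: a dot is not required
    }
}
```

Note that the part after the @ reuses the mailbox character class, so this sketch also accepts characters such as ! in the host part.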