A plain email address on a web page is easily found by the email harvesters (spambots) used by spammers. To make it harder to find, split the address into pieces. Separate the pieces with HTML tags or spaces, insert the word “nospam”, replace the “@” with “at”, or put the pieces on separate lines or in separate table cells. The harvester tests reported in this article show that many of these methods work well to stop harvesters.
Table of Contents
- How to fragment an email address
- Split the address onto separate lines
- Add “nospam” within the address
- Spell out the punctuation
- Add spaces between the characters
- Embed an HTML comment
- Embed an HTML tag
- Distribute characters into HTML table cells
- Draw the address using a CSS “font”
- Draw the address using ASCII art
- Further reading
This article is part of a series on Effective methods to protect email addresses from spammers that compares and tests 50 ways to protect email addresses published on a web site.
How to fragment an email address
The email harvesters (“spam robots” or “spambots”) used by spammers scan the text of web pages looking for plain email addresses like “firstname.lastname@example.org”. Fragmenting an email address splits it into pieces by inserting extra text or by rearranging it a bit. The fragmented address is still readable by your site’s visitors, but many harvesters are confused. And if harvesters can’t find your protected email address, they can’t add it to mailing lists and you’ll get less spam.
Below I discuss each of the most common ways to fragment an email address. After this list, I report the results of running fragmented addresses past a collection of email harvesters to see which methods are effective at protecting your address, and which are not.
Split the address onto separate lines
To protect an email address, split it into two or more parts and place them on separate lines or in separate columns of a table.
This works well to protect a single email address, but beware of using it for a large table of addresses, such as a company contact list. Some harvesters can extract a table’s columns of user and domain names and re-assemble them into complete email addresses. This requires that a spammer spend a few minutes configuring the harvester. They are unlikely to take the trouble for only a few addresses, but they may do so for a large contact list.
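As a minimal sketch of this method, the Python below builds the HTML for an address split across two lines and checks it against a deliberately naive address pattern. The function name, the choice to keep the “@” with the domain, and the pattern itself are my assumptions for illustration; real harvesters use their own matching rules.

```python
import re

# Deliberately naive pattern for a plain address (an assumption for this
# sketch; real harvesters vary).
PLAIN_ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def split_onto_lines(address: str) -> str:
    """Return HTML that shows the user name and the domain on separate lines."""
    user, domain = address.split("@", 1)
    # The <br> and the newline both sit between the user name and the "@".
    return f"{user}<br>\n@{domain}"


markup = split_onto_lines("person@example.com")
print(markup)
print(PLAIN_ADDRESS.findall(markup))  # the naive pattern finds nothing
```

The same pattern does match the unsplit address, which is the point of the comparison.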
Add “nospam” within the address
Harvesters extract addresses and add them to a list that they later validate by probing email servers. Invalid email addresses are discarded. To protect an address, publish an invalid form of it. For example, add the word “nospam”, “removeme”, “ABC”, or whatever you like into the middle of the address. Add a comment near the address telling site visitors how to remove the extra word and get a valid address. Harvesters won’t read and understand your comment, so they won’t know how to make the invalid address valid.
|Result||person-nospam@example.com (remove "-nospam" before use)|
A U.S. Federal Trade Commission (FTC) web page, Don’t Want Your Email Address Harvested?, recommends adding “spamaway” to an address to protect it. The word you use doesn’t matter. It can be anything you like.
Many spambots have an optional email address filter that can watch for addresses to skip, such as “webmaster” or “support”. Some of these harvesters come pre-configured to skip email addresses containing the word “spam”. However, spammers can (and probably do) disable this email filtering.
If you use a “mailto” link to point to your email address, you’ll need to protect the visible address and the address in the “href” part of the link.
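Here is a rough sketch of how the marker could be applied to both parts of a “mailto” link. The helper names and the choice to append the marker to the user name are assumptions for illustration, not a prescribed format.

```python
def add_marker(address: str, marker: str = "-nospam") -> str:
    """Append a removable marker to the user name part of an address."""
    user, domain = address.split("@", 1)
    return f"{user}{marker}@{domain}"


def protected_mailto(address: str, marker: str = "-nospam") -> str:
    """Build a mailto link whose href and visible text both carry the marker."""
    invalid = add_marker(address, marker)
    note = f'(remove "{marker}" before use)'
    return f'<a href="mailto:{invalid}">{invalid}</a> {note}'


print(protected_mailto("person@example.com"))
```

Because the marker appears in the “href” as well as in the visible text, a visitor who clicks the link still has to edit the address by hand before sending.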
Spell out the punctuation
Email harvesters scan your web pages looking for an “@” character — the words on either side form an email address. Block harvesters by typing “at” instead of “@”, and “dot” instead of “.”. Site visitors will know what you mean.
|Result||person at example dot com|
Common variations on this write “(at)”, “[at]”, “-at-”, “(dot)”, and “[dot]” instead. This approach is very widely used to protect email addresses published in news groups and mailing list archives.
A 2005 U.S. Federal Trade Commission (FTC) report, Email Address Harvesting and the Effectiveness of Anti-Spam Filters (PDF), tested this approach and found that it stopped nearly all spam. But that was in 2005. Today, some spambots can recognize this trick.
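The substitution itself is a pair of string replacements. This sketch assumes you want every “.” spelled out, including any in the user name; the function name and defaults are hypothetical.

```python
def spell_out(address: str, at: str = " at ", dot: str = " dot ") -> str:
    """Replace "." and "@" in an address with spelled-out words."""
    # Replace the dots first so the inserted words are never rewritten.
    return address.replace(".", dot).replace("@", at)


print(spell_out("person@example.com"))
print(spell_out("person@example.com", at="(at)", dot="(dot)"))
```

Passing different values for `at` and `dot` produces the bracketed variations described above.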
Add spaces between the characters
Email addresses cannot contain spaces (technically, the user name part can, but only with special quoting that almost nobody uses). To make your address hard for a harvester to find, insert spaces between the characters. Site visitors will be able to read the protected email address, but harvesters won’t see it.
|Result||p e r s o n @ e x a m p l e . c o m|
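A small sketch of the spacing step, along with the reverse step a visitor has to perform by hand before using the address. Both function names are made up for this example.

```python
def add_spaces(address: str) -> str:
    """Insert one space between every pair of characters."""
    return " ".join(address)


def remove_spaces(spaced: str) -> str:
    """Undo add_spaces, as a visitor would after copying the text."""
    return spaced.replace(" ", "")


spaced = add_spaces("person@example.com")
print(spaced)
print(remove_spaces(spaced))
```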
Embed an HTML comment
Harvesters are looking for a complete email address, so break it up by inserting an HTML comment within it. The comment is ignored by browsers and invisible to site visitors.
Bontrager Connection’s Protecting Your Email Address article introduces this method to protect an email address, and then shoots it down by providing a free test page that shows how a harvester could extract the comments and reveal the protected address. Real harvesters, though, may not be as smart as the Bontragers.
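To make the weakness concrete, this sketch embeds a comment before the “@” and then removes it again with a single regular expression. The comment text, its position, and the function names are assumptions; this is not the Bontragers’ test page code.

```python
import re

# Matches an HTML comment, including one that spans several lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def embed_comment(address: str, comment: str = "no spam") -> str:
    """Insert an HTML comment between the user name and the "@"."""
    user, domain = address.split("@", 1)
    return f"{user}<!-- {comment} -->@{domain}"


def strip_comments(markup: str) -> str:
    """Remove every HTML comment, as a smarter harvester might."""
    return COMMENT.sub("", markup)


protected = embed_comment("person@example.com")
print(protected)
print(strip_comments(protected))  # the plain address is back
```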
Embed an HTML tag
The HTML comment used above might be removed by a clever harvester, revealing the email address. Instead of a comment, add something harder to remove: add an HTML tag such as a <span>. Unlike a comment, an HTML tag is meant to do something, such as format the text, making it harder for a harvester to safely remove it.
An added <span> tag could enclose nothing, enclose a portion of the email address (such as the “@”), or enclose junk text that you hide using CSS.
Harvesters do not understand page formatting and do not apply CSS styles. In the last example above, harvesters will not understand that the “hideme” text within the protected address is invisible. If they remove the HTML tags, the email address that’s left still includes “hideme”, making the address invalid.
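The difference between the three variations shows up when a program strips the tags. The sketch below assumes an inline `display:none` style for the hidden text (the CSS rule is my assumption) and uses a simple tag-stripping pattern that is only adequate for markup this small.

```python
import re

# Crude tag stripper; good enough for the short snippets built here.
TAG = re.compile(r"<[^>]+>")


def empty_span(address: str) -> str:
    """Insert a <span> that encloses nothing before the "@"."""
    user, domain = address.split("@", 1)
    return f"{user}<span></span>@{domain}"


def span_around_at(address: str) -> str:
    """Wrap only the "@" in a <span>."""
    user, domain = address.split("@", 1)
    return f"{user}<span>@</span>{domain}"


def hidden_span(address: str, junk: str = "hideme") -> str:
    """Insert junk text that CSS hides from visitors."""
    user, domain = address.split("@", 1)
    return f'{user}<span style="display:none">{junk}</span>@{domain}'


def strip_tags(markup: str) -> str:
    """Remove all tags but keep the text between them."""
    return TAG.sub("", markup)


for build in (empty_span, span_around_at, hidden_span):
    print(build.__name__, "->", strip_tags(build("person@example.com")))
```

Stripping tags recovers the real address from the first two variations, while the third leaves the junk text in place and yields an invalid address.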
Distribute characters into HTML table cells
As an extended example of fragmenting an email address, you can use HTML to place each character into its own table cell. Style the table without borders or cell spacing so that the characters are tight together and readable.
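Writing one cell per character by hand is tedious, so here is a sketch that generates the table. The inline styles and the one-cell-per-line layout are assumptions chosen to match the description above, not the markup used on the test page.

```python
from html import escape


def address_to_table(address: str) -> str:
    """Return an HTML table with one address character per cell."""
    cell_style = "padding:0;border:0"
    # One cell per line, so the characters stay apart in the page source.
    cells = "\n".join(
        f'<td style="{cell_style}">{escape(ch)}</td>' for ch in address
    )
    return (
        '<table style="border-collapse:collapse;border:0">\n'
        f"<tr>\n{cells}\n</tr>\n</table>"
    )


markup = address_to_table("person@example.com")
print(markup)
```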
Draw the address using a CSS “font”
Instead of filling table cells with characters in an address, leave them empty and instead draw selected cell borders to create the rectangular outlines of characters. Control those borders with CSS. Instead of a table, use nested <div> tags to get more flexibility. The result is a kind of extreme “font” reminiscent of digital displays on consumer electronics gear.
The HTML for this is too long to include here, so here is just the first character (a “p”) and the result for an entire address:
The implementation of the font comes from Stu Nicholls, who provides the CSS font at his CSSplay web site. To use his font, view his web page source and copy and paste the characters you need. Give him credit, of course.
Draw the address using ASCII art
Before images were so easy to create, email, and print, there was ASCII Art. In this digital-age artistic medium, a large table of characters is used to represent the dots (pixels) in a picture. To draw a picture with the table of characters, fill in dark parts of the picture with dark letters, like “M” or “#”, while light parts are filled in with light letters, like “:” or “.”. With the characters at a normal size, the table looks like a mess, but when the characters are shrunk down, a picture emerges.
Here’s a classic one of the Mona Lisa and a close up of the face. This image, and many more, are available at Christopher Johnson’s ASCII Art Collection. Some of them are pretty amazing.
You can do the same thing to protect an email address (how mundane compared to Mona). The HTML for this is way too long to show here, but here’s what it looks like. While this will work in all web browsers, many of today’s browsers have a minimum font size preference. This prevents this ASCII art address from being shrunk down as small as it should be to look right.
Mardeg.sitesled.com has a free web page that generates text like this when you enter an email address. The page is titled “The most bloated human-readable email hider in the world!” Yup. :-)
I tested 23 widely-available email harvesters to see how well these methods work to protect an email address. Each harvester was aimed at a test page containing plain and fragmented email addresses. In the table below, a harvester gets a check mark if it recognizes the protected address.
All of the harvesters were tested on Windows XP SP2. The names of the harvesters are intentionally left out so that this web page does not attract search engine traffic from spammers looking for the “best” harvester to download.
|Plain email address||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√||√|
|Split the address onto separate lines|
|Add “nospam” within the address – user name||1||1||1||1||3||1||1||3||1||1||1||1||1||1||1||1||1||1||1||1||1|
|Add “nospam” within the address – domain name||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2||2|
|Spell out the punctuation - “ at ”||√|
|Spell out the punctuation - “(at)”||√|
|Spell out the punctuation - “[at]”||√|
|Add spaces between the characters||5|
|Embed an HTML comment||4|
|Embed an HTML tag - empty||√|
|Embed an HTML tag - around “@”||√|
|Embed an HTML tag - hidden text|
|Distribute characters into HTML table cells|
|Draw the address using a CSS “font”|
|Draw the address using ASCII art|
Every harvester found the plain email address that was not protected.
Most of the harvesters (cells marked with 1 or 2) found the invalid addresses with “nospam” added. Two harvesters (cells marked with 3) incorrectly truncated the “nospam” address by dropping the text before “nospam”. None of these removed “nospam” to get a valid address.
One spam robot found the email address that replaced “@” with “ at ”, and one spambot found “(at)” and “[at]” addresses.
One harvester (cell marked 5) partially decoded the email address that added a space between each character. It correctly identified the domain name with embedded spaces, but not the user name. This left it with an incomplete and useless address.
One harvester partially removed the HTML comment embedded within an email address (cell marked 4).
One harvester recognized email addresses where an empty HTML tag or a tag surrounding the “@” was embedded within the address. None recognized the protected address with an HTML tag surrounding hidden text.
None of the harvesters recognized the protected address split across table cells or onto two lines. And none understood the CSS font or ASCII art forms of the email addresses.
While most of the harvesters did not find the fragmented addresses, a few harvesters did. Recently released harvesters did better. The more common these protection methods become, the more likely it is that newer harvesters will recognize them.
Spelled-out punctuation (e.g. “at”, “(at)”, or “[at]” for “@”) is very widely used to try to protect addresses in news groups and mailing list archives, but two of the tested harvesters recognized these addresses. One of these was released back in 2004, so this protection method has been bypassed for several years. Despite the FTC’s recommendations a year later in 2005, protecting an email address by replacing “@” with “at”, “(at)” or “[at]” is not an effective way to stop email harvesters.
One of the tested harvesters almost found the address with spaces added between the characters. Its authors may get it right in the next version of the program, particularly if this becomes a common method. Adding spaces between email address characters is probably not effective. Addresses with embedded spaces also copy and paste badly into email programs and cannot be read normally by screen readers for the visually impaired. Addresses with embedded spaces have poor usability and accessibility.
Most of the tested harvesters found the invalid addresses with “nospam” added. Once harvested, spammers pass these addresses through a separate email “verifier” that probes email servers to confirm that addresses are good. While I did not test email verifiers, it is simple to write a program to automatically strip off commonly-inserted words, such as “nospam” or the FTC’s recommended “spamaway”. Inserting “nospam”, “spamaway”, or any other common phrase into an email address is probably not effective at stopping spammers.
Email address protection methods that add an HTML tag within an address presume that harvesters can’t remove it. So far, only one harvester can. That harvester should also have recognized the address with an embedded HTML comment, but it did not. This could be an artifact of the particular test address. However, it is surprising that there weren’t more harvesters that could strip out HTML tags and comments. This is a simple thing to program and a feature that I expect more harvesters will have soon. Every search engine web spider already has this feature. Embedding an HTML comment or tag into an email address is not effective.
Embedding an HTML tag with text hidden by CSS will probably defeat harvesters for a while, as will distributing characters into HTML table cells. Even with HTML tag removal, the resulting text is hard to interpret. Harvesting these addresses will require more sophisticated HTML and CSS handling than spammers are likely to do. However, both methods copy and paste badly and cannot be read properly by screen readers for the visually impaired. Embedding hidden text or splitting an address into table cells is effective, but the results have poor usability and accessibility.
While effective, the CSS “font” and ASCII art methods have poor usability and accessibility. These methods are pretty clever, but they’re cumbersome and the protected email address they draw can’t be copied and pasted or read by a screen reader.
Splitting an email address onto multiple lines is a good solution: it is effective, usable, and accessible. Because it looks like any other multi-line web page text, addresses shown this way are unlikely to be recognized by spam robots. The protected addresses can be read naturally by screen readers and it isn’t difficult for visitors to copy and paste the address (in two steps) into an email program.
Recommendation: splitting an email address onto multiple lines is easy and it works well. There are other methods that also work. They are discussed in the other articles in this series.