Email Masking and Obfuscation

Getting useless emails might be close to the top of everyone’s list of most hated things on the internet (tied with Facebook game requests, probably). No one likes getting spammed, and there are tons of tiny little creatures known as spam bots that live to do exactly that. The same goes for users of any given app who accidentally get spammed with irrelevant correspondence right after sign-up (if there’s a better way to drive off users, we don’t know it). To avoid unwanted messages, lots of people use various email obfuscators and masking techniques, with varying levels of success. We’ll cover both aspects in detail below.

What is it all about?

The idea behind email obfuscation is simple. You want to make it as hard as possible for website crawlers to capture your email address and use it later on. If you don’t, your inbox is likely to get flooded with ever-so-interesting offers of sophisticated therapies and messages from the long-forgotten uncle that left you a small fortune. Obfuscated contacts should be hard to traverse for bots but easy to utilize for real users.

Email masking, on the other hand, is intended not so much to hide real addresses as to disguise them. This is done to protect users’ privacy and limit the damage of data breaches, and also to modify real addresses so that email workflows can be QA-tested safely. Whether it’s a good approach or not is something we’ll look at later on.

Let’s start off with email obfuscation.

Obfuscating Public Emails to Prevent Spam

Now, let’s explore the most common ways of obfuscating your emails.

Changing email format

By far the easiest way to hide your email address from crawlers is to remove or replace some characters. The most common method is to replace the ‘@’ character with [at]. It’s fairly obvious to just about anyone what the correct address is, and bots looking strictly for email addresses will get confused. It can also be implemented within seconds, without any code. By default, such an email address is not clickable, as adding a mailto link underneath would expose the actual address. But there’s a way to get around this with a bit of JavaScript. We cover it in more detail below.

Many variations of this approach can be seen around the web. The next step would be to write our address as “support at mailtrap dot io”. Is it clear? Yes. Will it mislead some bots? Likely. But it already forces users to make some extra effort to contact you. The same goes for all the “contact me at steve and the rest is my domain address” contact details. They’re very clear but will likely lower conversion.
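If you’d rather not rewrite every address by hand, the replacement is easy to automate. Here is a minimal sketch; the function name and the exact “[at]” and “[dot]” markers are our own choices for illustration, not a standard.

```javascript
// Hypothetical helper: rewrites an address in the human-readable "[at]" style.
function toReadableObfuscation(email) {
  const atIndex = email.lastIndexOf('@');
  if (atIndex < 1 || atIndex === email.length - 1) {
    throw new Error('Expected an address of the form local@domain');
  }
  const local = email.slice(0, atIndex);
  const domain = email.slice(atIndex + 1);
  // Only the dots in the domain part are replaced; the local part stays as it is.
  return local + ' [at] ' + domain.split('.').join(' [dot] ');
}

console.log(toReadableObfuscation('support@mailtrap.io'));
// prints "support [at] mailtrap [dot] io"
```

The output is plain text, so it can go straight into a page template.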

Another approach that can sometimes be seen on rather simple pages is hiding contact details in an image. So, instead of a page footer with the details in it, the webmaster uploads a picture of a footer with an email address. It’s almost impossible for spam bots to read, but it’s also quite a hassle for users, especially those with visual impairments, as the address can be neither copied nor read out by a screen reader. So please don’t take this approach.

Using contact forms

Another common way of hiding emails is by… removing them from a site. Such emails are replaced by contact forms of various shapes and sizes. Forms don’t expose an email address to bots, and they also let you gather additional data in an easy-to-absorb form.

Are they perfect? No. Many users prefer sending emails over filling in forms, especially if they want to email several companies at once (for example, to compare offers). Businesses tend to add multiple fields to a form, often with good intentions (e.g. when troubleshooting). But the more required fields there are, the harder it is to fill in the form and the fewer people will contact you.

What’s even worse, while bots can’t harvest an email from such sites, they can easily complete a form and submit it within milliseconds. To prevent that, many forms come with verification solutions that check if users are legit before a message is sent.

Using Google reCaptcha

A great approach to validating senders is Google reCaptcha. It has come a long way from forcing users to decipher sometimes ridiculously twisted characters to the really user-friendly tool it is these days. Most importantly, reCaptcha really works and is able to distinguish bots from humans with very high accuracy.

You’ve surely seen reCaptcha v2 a number of times. It’s the box that usually appears near the ‘submit’ button and asks you to tick a checkbox if you’re not a bot. A quick load and the message is sent. Later iterations also added the so-called Invisible reCaptcha, a badge stating that a form is “protected by reCaptcha”. When submitting a form, a user doesn’t need to perform any action as the check happens automatically within milliseconds. Only those with a low reCaptcha score (so likely the bots) will be subject to additional verification before they can proceed.

With the latest iteration, reCaptcha v3, the entire process happens in the background. Users don’t even know that any check is performed and yet nearly all spambots are easily discarded. Rather than challenging the user, v3 returns a score for each request and leaves it up to you to decide what happens when a low score is recorded: you can, for example, hide or fake contact details. Displaying only partial contact details and forcing users to click (and get verified) to get more might also be worth your attention.
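To give an idea of what acting on a score can look like on the server, here is a rough sketch. It assumes a runtime with a global fetch (for example Node.js 18 or newer); the function names and the 0.5 threshold are our own placeholders, and the secret key is the one issued for your site.

```javascript
// Sketch only: function names and the 0.5 threshold are illustrative assumptions.
const VERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

// Sends the token generated in the browser to Google's verification endpoint.
async function verifyToken(token, secretKey) {
  const body = new URLSearchParams({ secret: secretKey, response: token });
  const res = await fetch(VERIFY_URL, { method: 'POST', body: body });
  return res.json(); // e.g. { success: true, score: 0.9, action: 'contact', ... }
}

// Decides whether the real contact details should be shown for this request.
function shouldRevealContact(verification, threshold = 0.5) {
  return verification.success === true &&
    typeof verification.score === 'number' &&
    verification.score >= threshold;
}

console.log(shouldRevealContact({ success: true, score: 0.9 }));
// prints true
```

Keeping the decision in its own function makes it easy to tune the threshold once you see what scores your real traffic gets.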

We can safely recommend reCaptcha as a great way to secure contact forms.

Obfuscating emails with JavaScript

As promised earlier on, we’ll now show how to obfuscate an email address with JavaScript. We feel it’s the best way to tackle the problem for a few simple reasons:

  • Users can still click/tap on your email and be taken directly to their email client -> increased conversion
  • It’s neat, takes up almost no space and doesn’t slow down your page, unlike contact forms or images of contact details
  • Bots go crazy and look for a better target elsewhere

Obfuscating emails with JavaScript requires adding a simple piece of code to your website. The HTML code for a clickable email address is as follows:

<a href="mailto:name@domain.com">Your Name</a>

Since the address is exposed, it’s extremely easy for bots to find and save it. But with a bit of JavaScript, you can quite easily hide it.

<script>
  // Assemble the address at runtime so it never appears in the page source as one string.
  var user = 'name';
  var site = 'domain';
  document.write('<a href="mailto:' + user + '@' + site + '">' + user + '@' + site + '</a>');
</script>

Of course, ‘name’ and ‘domain’ are to be replaced with the components of your email address. In the case of our address (support@mailtrap.io), ‘support’ would be the ‘name’ while ‘mailtrap.io’ would be the ‘domain’.
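The same idea can also be written without document.write. The sketch below is one possible variant, not the only way to do it: the class name email-link and the data-user and data-site attributes are made up for this example, and the markup it expects is shown in the comment.

```javascript
// Joins the two halves at runtime so the full address is never in the page source.
function buildMailto(user, site) {
  return 'mailto:' + user + '@' + site;
}

// Hypothetical markup:
// <a class="email-link" data-user="name" data-site="domain.com">Email us</a>
function revealEmailLinks(root) {
  root.querySelectorAll('a.email-link').forEach(function (link) {
    link.href = buildMailto(link.dataset.user, link.dataset.site);
  });
}

// Only runs in a browser; elsewhere there is no document to update.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    revealEmailLinks(document);
  });
}
```

Visitors without JavaScript will see a link that does nothing, so it’s worth deciding what the fallback text should say.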

Bots are getting smarter and smarter, and some can already decipher even such code. That’s why developers keep looking for new ways to encode such addresses without affecting the user experience. Below you can see our email address encoded with one of the approaches:

<a href="mailto:support@mailtrap.io">Mailtrap Support</a>
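An encoded address only looks scrambled in the page source; the browser renders it as an ordinary link. One encoding that obfuscators commonly offer replaces every character with its numeric HTML character reference. We’re assuming this technique purely for illustration (it isn’t necessarily the one used for the link above), and the function name is our own.

```javascript
// Replaces each character with its decimal HTML character reference ("@" becomes "&#64;").
function toHtmlEntities(text) {
  return Array.from(text)
    .map(function (ch) { return '&#' + ch.codePointAt(0) + ';'; })
    .join('');
}

console.log(toHtmlEntities('a@b'));
// prints "&#97;&#64;&#98;"
```

The result can be pasted into both the href attribute and the link text, since browsers decode character references in either place.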

It’s really easy to find and use email obfuscators. These often free web tools let you encode your addresses in various possible ways. Try, for example, email-obfuscator.com or hcidata.info.

There are also various plugins that can automate the process in respective frameworks so that you don’t have to obfuscate each link manually. Here are some examples:

  • actionview-encoded_mail_to can be used to obfuscate emails in Ruby on Rails applications
  • react-obfuscate obfuscates not only emails but also phone numbers or Facetime links when developing with ReactJS
  • Email Obfuscator is a more generic plugin that will work whether users have JavaScript enabled or not
  • email-scramble is another generic JavaScript plugin that works with emails but also phone numbers

Does email obfuscation work in general?

It kind of does. If you did a simple test on two similar websites and put a plain email address on one and a JS-obfuscated address on the other, you would likely see the latter perform better. It likely wouldn’t be 100% effective, though. As we mentioned earlier, crawlers are getting better and better as they need to find ways to harvest as many addresses as their computing power allows. Many are already coded in such a way that they can decipher all those [at] addresses without any hassle (see how easy it is to decode them here). So if you’re putting some effort into obfuscation, do it with JavaScript or, even better, add a good-looking reCaptcha to your website.
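To see why the [at] style offers so little protection, consider how short a decoder can be. This is a deliberately naive sketch with a function name of our own; a real harvester would likely be more thorough, and this version would also mangle ordinary sentences containing the words “at” or “dot”.

```javascript
// Turns "[at]" / "(at)" / " at " and "[dot]" / "(dot)" / " dot " back into "@" and ".".
function decodeObfuscated(text) {
  return text
    .replace(/\s*(?:\[at\]|\(at\)|\bat\b)\s*/gi, '@')
    .replace(/\s*(?:\[dot\]|\(dot\)|\bdot\b)\s*/gi, '.');
}

console.log(decodeObfuscated('support [at] mailtrap [dot] io'));
// prints "support@mailtrap.io"
console.log(decodeObfuscated('support at mailtrap dot io'));
// prints "support@mailtrap.io"
```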

In all honesty, though, we wouldn’t recommend focusing on this for too long. You might spend days testing different solutions, coding them and analyzing results. And then, it could take a single person to find your email in some long-forgotten spreadsheet and sell it to a harvester to make all your efforts futile.

The crawlers are improving, but so are spam filters. The spam filters in Gmail or Thunderbird are these days able to stop almost every useless message sent your way. In 2015, Google claimed that it was able to stop 99.9% of spam messages while mistakenly classifying only 0.05% of incoming mail as spam. And that was four years ago! One would argue that looking into a reliable filter might be a better investment of your time than trying to outsmart the bots.

Masking Emails in Your Database

What is email masking? It’s the technique of altering email addresses with the use of relevant algorithms. The reasons for masking vary and so do the techniques. Let’s cover the two most common approaches.

Masking for security

Many data protection laws require businesses to protect their users’ data. Countless leaks have taught us all that even the best-protected databases can be vulnerable to attacks by sophisticated malware. That’s why lawmakers more and more often want businesses to safeguard not only their database but also each and every record.

GDPR, for example, names pseudonymization as one of the measures companies can use to protect users’ data. What does it mean? When a company obtains any data about its clients (for example, a user signs up to its platform and fills in a questionnaire), the data is split into two. The user’s profile (for example name, last name, age) is stored in one place and the rest of their data (e.g. email, physical address, education, social security number) in another, and the two parts are not directly connected in the database. They can still be matched with an additional piece of information that is kept separately, but if that piece isn’t part of a leak, the two parts can’t be connected. This approach is called pseudonymization.
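The split described above can be sketched in a few lines. This is only an illustration under our own assumptions: the field names are hypothetical, and in a real system the three parts would live in separate stores with different access rules rather than in one returned object.

```javascript
const { randomUUID } = require('crypto');

// Splits one sign-up record into two parts that share no common key.
function pseudonymize(user) {
  const profileId = randomUUID();
  const contactId = randomUUID();
  return {
    profile: { id: profileId, name: user.name, age: user.age },
    contact: { id: contactId, email: user.email },
    // The additional piece of information: the only way to rejoin the two parts.
    link: { profileId: profileId, contactId: contactId },
  };
}

const parts = pseudonymize({ name: 'Steve', age: 34, email: 'steve@example.com' });
console.log(Object.keys(parts.profile)); // prints [ 'id', 'name', 'age' ]
```

Whoever gets hold of the profile and contact parts alone has no way of telling which email belongs to which profile.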

Another approach to data masking is hashing. Hashing is basically a one-way transformation of data into a meaningless form (our DKIM article has some examples of hashed data). Don’t confuse it with the character encoding from the section about email obfuscation in JavaScript, which anyone can reverse. Hashed data cannot be turned back into the original, not even with a key, but an application can still hash an incoming value and compare the result with the stored one. There are many hashing algorithms in common use that work pretty well. Refer to this article for a lot more details.
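As a quick illustration, here is what hashing an address could look like with SHA-256 in Node.js. The choice of algorithm and the normalization step (trimming and lowercasing) are our assumptions for this sketch, not a recommendation for every use case.

```javascript
const { createHash } = require('crypto');

// One-way SHA-256 digest of a normalized address, returned as 64 hex characters.
function hashEmail(email) {
  const normalized = email.trim().toLowerCase();
  return createHash('sha256').update(normalized, 'utf8').digest('hex');
}

// The same input always gives the same digest, so records can still be matched.
console.log(hashEmail('Support@Mailtrap.io') === hashEmail('support@mailtrap.io'));
// prints true
```

Keep in mind that a plain hash of an email address can be guessed by hashing a list of likely addresses and comparing the results, so on its own it’s weaker protection than it may seem.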

Other approaches, such as Microsoft’s Dynamic Data Masking, also do the job quite nicely. We strongly recommend choosing at least one of the above-mentioned methods to secure the data in your database.

Email masking for testing

Before you send any production emails, you should definitely test them first in a staging environment. The last thing you need is users unable to reset their password or activate their accounts because the links provided don’t work as expected.

In the recent Data Governance Survey, 61% of respondents admitted to using production data for non-production purposes, such as email testing. If Mr. Murphy was right, then something must eventually go wrong. That’s why it’s critical to mask email addresses carefully and ensure no real users take part in your testing procedures.

Here are the most common techniques for email masking with the respective SQL queries:

Updating each record with a fake address (the same for all)

UPDATE dbo.CM_CUSTOMERS SET customer_email = 'test_email@emailtestingis.cool';

This method is good for two reasons – it lets you protect user data (the real addresses disappear) and test if the right emails are sent. The drawback is that everything ends up in one inbox so if you want to investigate what went wrong for a specific user, you’ll have a bit of digging to do.

Updating each record with a random address (different for each)

This method solves that problem without adding much complexity. All you need to do is generate a different email address for each record in your database. With the following query, you could generate thousands of fake addresses on your @emailtestingis.cool domain. Keep in mind that six random characters can occasionally repeat on large tables, so take more characters from NEWID() if every address must be unique.

UPDATE dbo.CM_CUSTOMERS SET customer_email = CONCAT(LEFT(NEWID(), 6), '@emailtestingis.cool');

Here are some of the examples of addresses it would return:

  • D482AA@emailtestingis.cool
  • 123ABC@emailtestingis.cool
  • 9F9F1A@emailtestingis.cool

If during the QA process you realize something is wrong, you’ll be able to easily trace back to the origin of the problem.
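If your test data is prepared by a script rather than directly in the database, the same masking can be done in application code. The sketch below is a variation on the query above, not an equivalent: instead of random characters it derives each address from the record id, which keeps addresses unique and traceable. The record shape and the function name are hypothetical.

```javascript
// Hypothetical record shape: { id: 42, customer_email: 'real.person@example.com' }
function maskEmails(customers, domain = 'emailtestingis.cool') {
  return customers.map(function (customer) {
    // A copy is returned, so the original records are left untouched.
    return Object.assign({}, customer, {
      customer_email: 'user_' + customer.id + '@' + domain,
    });
  });
}

console.log(maskEmails([{ id: 42, customer_email: 'real.person@example.com' }]));
// prints [ { id: 42, customer_email: 'user_42@emailtestingis.cool' } ]
```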

These are just two very basic methods, but if you need to further anonymize your addresses, there are at least a few third-party companies offering such services. Are all of these worth the trouble, though?

Do you really need these practices?

Truth be told, even if you use the most sophisticated tools for masking real email addresses and spend hours altering your data, something might eventually go wrong. You might accidentally skip some records in your DB or upload the wrong contacts to your next QA campaign. The more users you have, the higher the chance that something will go wrong and the worse the consequences.

For that reason, we always advise against testing email workflows with real databases. While there are plenty of tools for creating dummy email addresses, this solution is far from ideal too. The best alternative to email masking is probably setting up a fake SMTP server to test your emails.

Mailtrap.io is the biggest player in this field and is used by thousands of businesses and hundreds of thousands of developers and QA professionals. Its SMTP server can be connected to your application in minutes. Once configured, you can test your email workflows and each message will end up on Mailtrap servers, without any risk of reaching real users. These messages can then be previewed and, for example, analyzed against spam filters on the Mailtrap platform. Each email can also be manually or automatically forwarded to your team inboxes.
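Whichever fake SMTP server you pick, it helps to make it the default everywhere except production. Here is a small sketch of that idea; all the environment variable names are made up for this example, and the host and credentials are whatever your testing service gives you.

```javascript
// Picks SMTP settings from the environment. Variable names here are hypothetical.
function smtpOptions(env) {
  // Anything that is not explicitly production talks to the fake SMTP server.
  const prefix = env.NODE_ENV === 'production' ? 'SMTP' : 'TEST_SMTP';
  return {
    host: env[prefix + '_HOST'],
    port: Number(env[prefix + '_PORT']),
    auth: { user: env[prefix + '_USER'], pass: env[prefix + '_PASS'] },
  };
}

const options = smtpOptions({
  NODE_ENV: 'staging',
  TEST_SMTP_HOST: 'smtp.test.invalid',
  TEST_SMTP_PORT: '2525',
  TEST_SMTP_USER: 'user',
  TEST_SMTP_PASS: 'secret',
});
console.log(options.host); // prints "smtp.test.invalid"
```

The returned object uses the host, port and auth fields that SMTP libraries such as Nodemailer accept, so it can be handed to your mailer of choice.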

We feel this is by far the best available approach to email masking. Try it for free today and see if it makes a difference in your testing process.