Posts Tagged ‘recaptcha’

vBulletin spam signup email addresses, combatting the problem

Thursday, March 3rd, 2011

A while back I corresponded with Google’s spam team about a pattern I had discovered and sent it off to some people. It appears they used some of that to clean up the search results, removing this particular type of spam, but the source of the problem still exists. Over the past 60 days, a particular client’s vBulletin site has received 2670 signups, over half using gmail addresses. Three people have independently looked at every signup to verify that these do fit the spam pattern.

It appears an outsourcing company is hired to do the signups, and its workers are given a list of email addresses to choose from. Signup and verification always take place from radically different IPs, so we can assume that the people doing the actual signup have no idea that their verification email never goes out. This is confirmed by the fact that they use multiple periods in their gmail address to make the email address appear to be unique. We modified vBulletin to strip out the . and truncate at the + before comparing addresses; once an email address turns out to have been seen already, the account is instantly banned. We opted to let these signups register rather than reporting that the email was in use.

A little background on the issue. Google allows one to use a . or + in the email address, and the result resolves to the same destination address. While I like that feature and have used it in the past, vBulletin appears to ignore this fact. So, an address with periods inserted in the name goes to the same destination as the same address without them. Likewise, you can use the + in the email address to signify the source of the email: one suffix might signify that the email came from your twitter profile, another that it comes from your facebook profile. Since they all end up at the same place, Google is a perfect way to have hundreds of email addresses that appear to be unique but are delivered to the same destination. This means your validation script can check fewer mailboxes, decode the validation email looking for the link, and click it automatically.
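The cleanup itself is small. Here is a minimal sketch in Python of the idea (not the actual vBulletin plugin): lowercase the address and, for gmail/googlemail only, drop the periods and cut the name at the first +. The function names, the sample addresses, and the choice to fold googlemail.com into gmail.com are assumptions for illustration.

```python
# Domains that get the dot/plus cleanup. Other providers are left alone,
# because a "." or "+" may be significant there.
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def clean_email(address: str) -> str:
    """Return a canonical form of the address, used only for duplicate checks."""
    address = address.strip().lower()
    local, sep, domain = address.rpartition("@")
    if not sep or not local:
        # Not a usable address; return it unchanged and let normal
        # validation reject it elsewhere.
        return address
    if domain in GMAIL_DOMAINS:
        local = local.split("+", 1)[0]   # truncate at the +
        local = local.replace(".", "")   # strip out the .
        domain = "gmail.com"             # assumption: treat both domains as one
    return f"{local}@{domain}"


def seen_before(address: str, seen: set) -> bool:
    """Record the cleaned address and report whether it was already known."""
    key = clean_email(address)
    duplicate = key in seen
    seen.add(key)
    return duplicate


if __name__ == "__main__":
    seen = set()
    # Hypothetical addresses, not taken from the real signup list.
    for addr in ["john.doe@gmail.com", "j.o.h.n.doe+1@gmail.com", "johndoe+2@googlemail.com"]:
        print(addr, "duplicate" if seen_before(addr, seen) else "new")
```

The original address is still what gets stored for the account; the cleaned form is only the lookup key, so the signup can go through and be banned afterwards.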

Initially our client installed Recaptcha, which increases the chance that a human is filling out the form. Based on the number of resubmissions, I’m reasonably certain that a human is doing the data entry and that they aren’t cracking Recaptcha.

At one point I figured these were accounts created for the purpose, but some of the names are so specific that one has to assume there are some compromised email accounts in here as well. If you glance through the list, you’ll see judicious use of the . and + to try to create unique email addresses.

The first thing we did was write a plugin that hooked into the signup process and cleaned up the email addresses. The second thing we did was look for signups that took place in a country different from the verification click. Often they used proxy servers, so, using a few of the proxy DNS blacklists, we were also able to make an educated guess that the signups were probably going to post spam. The first post at the board is moderated using Akismet for any that slip through, but this method appears to be fairly good at hitting the right ones: out of 2691 signups, it detected 2670 spam signups with 1 false positive. The false positive was a tough one. Even looking at the signup data, the IP addresses used for the signup request and the validation were in separate countries according to MaxMind’s GeoIP database (the person signed up at work, drove home across a country border in Europe, and validated his email address from home). We also changed the registration form and put a second link above the first that said:

If you didn’t intend to sign up, click this link

For a few days, their spider was hitting the first link, banning the account for us. Often there was a delay of a few days between an account being validated and its first post.
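The country comparison and the proxy blacklist lookup described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not our plugin: `country_of` stands in for whatever GeoIP lookup you have, `dnsbl.example.org` is a placeholder zone, and the function names are made up. The only convention relied on is the usual DNS blacklist one, where the IPv4 octets are reversed, the zone is appended, and a name that resolves means the address is listed.

```python
import ipaddress
import socket
from typing import Callable, Iterable, Optional


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def dns_resolves(name: str) -> bool:
    """True if the name has an address record (i.e. the IP is listed)."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


def is_listed(ip: str, zones: Iterable[str],
              resolves: Callable[[str], bool] = dns_resolves) -> bool:
    """Check the IP against each blacklist zone until one lists it."""
    return any(resolves(dnsbl_query_name(ip, zone)) for zone in zones)


def looks_suspicious(signup_ip: str, validation_ip: str,
                     country_of: Callable[[str], Optional[str]],
                     zones: Iterable[str],
                     resolves: Callable[[str], bool] = dns_resolves) -> bool:
    """Flag a signup if the two countries differ or the signup IP is a listed proxy."""
    signup_country = country_of(signup_ip)
    validation_country = country_of(validation_ip)
    mismatch = (signup_country is not None
                and validation_country is not None
                and signup_country != validation_country)
    return mismatch or is_listed(signup_ip, zones, resolves)


if __name__ == "__main__":
    # Hypothetical lookup table standing in for a real GeoIP database.
    countries = {"192.0.2.10": "DE", "198.51.100.7": "NL"}
    print(looks_suspicious("192.0.2.10", "198.51.100.7",
                           countries.get, ["dnsbl.example.org"]))
```

As the one false positive shows, a country mismatch on its own can catch a legitimate user, so it is probably safer to treat the result as a reason to moderate than as proof.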

If you look at the list, you can see where they have attempted to obfuscate the email address and, in some cases, are using the + to insert a counter. The posts that were made suggest we might have more than one group actually spamming, all outsourcing the account creation to the same company.

Spammers are resourceful. It is a shame there isn’t a way to get these email addresses shut down to squelch some of the spam at the source.

Since starting this post, eight more signups came in, bringing the average to roughly 90 signups per day.

In short:

* Check a ‘cleaned’ email against the database, i.e. remove the . and truncate at the + for gmail/googlemail accounts
* Use Recaptcha
* Alter the signup form to include a link to decline the signup
* Compare the signup IP and the validation IP (for example, by country)
* Validate that the signup IP isn’t coming from a proxy

proves I am not human

Saturday, April 25th, 2009

I must not be human, as I failed a request. I like using technology for good purposes and I understand how recaptcha works. The folks behind it actually turned it into a service that does a great deal for the public, in that your recaptcha response is used to correct the Optical Character Recognition problems they had while scanning books to put them online.

So, any human readers out there that can solve this one?


nerlu emitted?

nehig emitted?

nerlD emitted?

Admit it, you’re not human either, and Carnegie Mellon has stolen the last bit of humanity you thought you had.
