Question: Preventing Apache referer spam?

Posted by Steve on Mon 29 Aug 2005 at 16:00

Until recently, referer spam mostly affected weblogs. It is now on the rise more generally, and many webservers are seeing incoming requests with spammed HTTP Referer headers.

Referer spam consists of incoming requests to your webserver which list the spammer's website in the "referer" field. The intention is that your logs, or statistics generated from them, will be published somewhere, and that search engines will spider them and increase the score of the spammed websites.

There are two popular approaches to dealing with Referer spam on Apache webservers - both of which require you to maintain a blacklist of referer strings, or IP addresses, you wish to ignore.

  • Using mod_rewrite to redirect bogus requests.
  • Using mod_security to deny incoming requests.

Each of these approaches suffers from the same problem: You must have a list of the invalid referers to block.

For example, with mod_security you can block referers which mention gambling terms such as "poker" with a rule like this:

SecFilterSelective "HTTP_REFERER" "(holdem|poker|casino)"

This will match on all incoming requests which have a referer string containing the words "poker", "holdem", or "casino".
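The SecFilterSelective directive above belongs to mod_security 1.x. For readers on ModSecurity 2.x, a rough equivalent might look like the sketch below. The rule id 100001 is an arbitrary placeholder, and the choice of a 403 status is an assumption rather than anything taken from the original rule.

```apache
# ModSecurity 2.x sketch; the rule id is an arbitrary placeholder.
SecRuleEngine On

# Lowercase the Referer header, then deny the request if it contains
# any of the three keywords. "nolog" keeps the match out of the
# error and audit logs.
SecRule REQUEST_HEADERS:Referer "@rx (holdem|poker|casino)" \
    "id:100001,phase:1,t:lowercase,deny,status:403,nolog"
```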

The mod_rewrite equivalent is:


  RewriteEngine   on
  RewriteCond %{HTTP_REFERER} poker  [OR]
  RewriteCond %{HTTP_REFERER} holdem [OR]
  RewriteCond %{HTTP_REFERER} casino 
  RewriteRule .* - [F,L]
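As a hedged variation on the rules above, the three conditions can be folded into a single case-insensitive pattern, which makes the blacklist a little easier to extend. This is a sketch, not a tested drop-in replacement:

```apache
RewriteEngine on

# One condition covers all three keywords; [NC] makes the match
# case-insensitive, so "Poker" and "CASINO" are caught as well.
RewriteCond %{HTTP_REFERER} (holdem|poker|casino) [NC]

# Answer matching requests with 403 Forbidden. [F] implies [L].
RewriteRule ^ - [F]
```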

Both of these solutions are simple to set up if you're using one of the modules already. (We've previously covered installing mod_security and enabling mod_rewrite for Apache/Apache2.)

The real problem is keeping the blacklists/rules current.

So, my question is how do you deal with this problem?

 

 


Re: Question: Preventing Apache referer spam?
Posted by fsateler (201.214.xx.xx) on Mon 29 Aug 2005 at 21:18
How is this kind of spam useful to the spammers? I thought search engines created their own referral databases by scanning the web pages directly.
--------
Felipe Sateler

Re: Question: Preventing Apache referer spam?
Posted by Steve (69.13.xx.xx) on Mon 29 Aug 2005 at 21:22
It seems to have started shortly after it became well-known that Google used inbound links as a measure of website popularity / importance.

The intention is that if the "fake referers" get archived publicly then search engine spiders will count those links when assessing the relevance of the target, and the site rank will be boosted artificially.

Steve
-- Steve.org.uk

Re: Question: Preventing Apache referer spam?
Posted by simonw (84.45.xx.xx) on Mon 29 Aug 2005 at 21:23
No referer spam problem here, but one client is having an issue with email forms being submitted automatically, so he gets blank emails. It looks like some sort of failed abuse attempt, but nothing "obvious", and we don't usually log that much detail on the server in question.

Some of the source IPs are "well known" open proxies.

I wonder, is there a simple way to use the DNS-accessible lists of open proxies in Apache2? I'm thinking the lookup would be too slow for HTTP.

Obviously we can spot all the "blank" messages for this specific hosting client, but I noticed the same thing happening to forms on sites owned by other clients.

Re: Question: Preventing Apache referer spam?
Posted by Steve (69.13.xx.xx) on Mon 29 Aug 2005 at 21:26
There's not any obvious way to do this, short of adding IPs to a blacklist / firewall manually.
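
For readers on ModSecurity 2.x, which is newer than this thread, the @rbl operator may cover the DNS blocklist case. The sketch below is an assumption, not something tried here: "dnsbl.example.org" is a placeholder zone name, and the rule id is arbitrary. Each new client address costs a DNS query, so the latency concern raised above still applies unless a local caching resolver is in use.

```apache
# ModSecurity 2.x sketch: look the client address up in a DNS blocklist.
# "dnsbl.example.org" and the rule id are placeholders, not real values.
SecRule REMOTE_ADDR "@rbl dnsbl.example.org" \
    "id:100002,phase:1,t:none,deny,status:403,log,msg:'Client listed in DNSBL'"
```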

Steve
-- Steve.org.uk

It worked, but...
Posted by SanctimoniousHypocrite (12.221.xx.xx) on Tue 30 Aug 2005 at 15:46
I implemented the mod_security filter. It worked but now there's an entry in access.log showing the spam url with a 500 error, and an entry in audit_log showing the spam url that was kept out. So by implementing this I now get two spam referrer entries. That's kind of amusing. I guess I should tell mod_security not to log those:

SecFilterSelective "HTTP_REFERER" "(holdem|poker|casino)" deny,nolog,status:500

I think this will stop the referrer from appearing in audit_log, but how do I stop the entry from appearing in access.log? Maybe if they keep getting an error they'll stop. I also wonder, is a 500 error the best one to have mod_security generate? Or is there another error that will more effectively discourage the spammers? Or should it just fail silently?

Re: It worked, but...
Posted by Steve (82.41.xx.xx) on Tue 30 Aug 2005 at 18:15
You can avoid logging particular status codes if you like - previously mentioned here briefly.

But that applies globally to a host's logging, and it might not make sense, because you probably want to be sure you see all legitimate 500s.
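
One alternative, sketched here under assumptions rather than tested in this thread, is to key the logging on the Referer itself instead of on the status code, so that genuine 500s still get logged. The variable name spam_referer and the log path are illustrative. The sketch also swaps the mod_security rule for a mod_rewrite one, because the marker is set by mod_setenvif and mod_rewrite is documented to see such variables. How this interacts with a mod_security deny is not verified here.

```apache
# Tag requests whose Referer contains a blacklisted keyword
# (SetEnvIfNoCase matches case-insensitively and sets the variable to 1).
SetEnvIfNoCase Referer "(holdem|poker|casino)" spam_referer

# Refuse tagged requests with 403 Forbidden.
RewriteEngine on
RewriteCond %{ENV:spam_referer} =1
RewriteRule ^ - [F]

# Write an access log entry only when the request was NOT tagged.
# The path is an assumption; this line replaces the existing CustomLog.
CustomLog /var/log/apache2/access.log combined env=!spam_referer
```

This keeps the spam URL out of access.log without hiding other errors.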

Steve
-- Steve.org.uk

Re: Question: Preventing Apache referer spam?
Posted by dopehouse (84.130.xx.xx) on Sat 3 Sep 2005 at 19:47
I think a good approach is to block your web statistics from being indexed by any search engines. I know that's a long-term effort, but it should be the right way.
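
A minimal sketch of one way to do that from the Apache side, assuming mod_headers is enabled and that the statistics pages live under a hypothetical /stats path:

```apache
# Ask search engines not to index or follow links from the statistics
# pages. "/stats" is a placeholder for wherever your reports are served.
<Location "/stats">
    Header set X-Robots-Tag "noindex, nofollow"
</Location>
```

Putting the reports behind authentication would be a stricter option, since this header only affects crawlers that choose to honour it.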

Re: Question: Preventing Apache referer spam?
Posted by Steve (82.41.xx.xx) on Sat 3 Sep 2005 at 20:33
I already do that ... but this doesn't prevent the malicious requests from coming in.

Steve
-- Steve.org.uk

Re: Question: Preventing Apache referer spam?
Posted by dopehouse (84.130.xx.xx) on Sat 3 Sep 2005 at 21:32
That's right. But if 90% of webmasters did so, then the spamming would be reduced. *I think*

Re: Question: Preventing Apache referer spam?
Posted by Anonymous (209.149.xx.xx) on Fri 9 Sep 2005 at 14:42
I've got an article that is a brief tour of what I've been able to do (and not been able to do), here.

Re: Question: Preventing Apache referer spam?
Posted by Anonymous (66.93.xx.xx) on Tue 18 Oct 2005 at 23:40
ReferrerCop is the answer to all your problems.

Re: Question: Preventing Apache referer spam?
Posted by Anonymous (24.143.xx.xx) on Sun 13 Nov 2011 at 16:51
ReferrerCop is not the answer. It cleans the logs, but does not stop the referrer spam, the server load, or the bandwidth usage.
