Apache .htaccess RewriteRule to send semalt referrer spambots back to source


Referrer spam is a huge problem in my analytics right now and I have been combating it for months.

I'm aware of the botnet discussions surrounding semalt.com (and other referral spammers). I'm also aware that some of the referral spam is likely triggered without ever visiting my site (which is why my .htaccess directives aren't catching all of it), and I have added filters to my analytics/tag manager accordingly.

I've researched extensively, including "How to Block Spam Referrers like darodar.com from Accessing Website?" and "Domain name in mod_rewrite RewriteRule".

I'm hoping to implement rules that, for any sites with actual crawlers, will send their bots back at them. I have over 100 referrers blacklisted in my .htaccess, but they all follow the same pattern. This is what I have now:

<IfModule mod_rewrite.c>
  RewriteEngine on
  Options +FollowSymlinks

  RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com.*? [NC]
  RewriteRule ^(.*)$ http://semalt.com/ [L]

  RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*simple-share-buttons\.com.*? [NC]
  RewriteRule ^(.*)$ http://simple-share-buttons.com/ [L]
</IfModule>

I'd like to simplify that (new domains sending referral spam pop up frequently) so I'm wondering if this would work:

<IfModule mod_rewrite.c>
  RewriteEngine on
  Options +FollowSymlinks

  RewriteCond %{HTTP_REFERER} (semalt\.com) [NC]
  RewriteRule ^(.*)$ %{HTTP_REFERER} [L]

  RewriteCond %{HTTP_REFERER} (simple-share-buttons\.com) [NC]
  RewriteRule ^(.*)$ %{HTTP_REFERER} [L]
</IfModule>

It seems like it should work, which makes me wonder if I can go a step further to this:

<IfModule mod_rewrite.c>
  RewriteEngine on
  Options +FollowSymlinks

  RewriteCond %{HTTP_REFERER} (semalt\.com|simple-share-buttons\.com) [NC]
  RewriteRule ^(.*)$ %{HTTP_REFERER} [L]
</IfModule>

I want to burden my own server as little as possible and I don't care about protocols, subdomains, or paths included.

Basically, if any part of the referrer matches that string, I want to block it and redirect it to itself.

Will the directives I have written work as I expect, and are the regex patterns reasonably efficient?

Is there a better way to do this that I am unaware of?

Note: Many, but not all, of these sites are on a VPS where I can edit httpd.conf, so .htaccess-specific answers, which I can adapt, are preferred.


There is 1 best solution below


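A small note on the first example you gave: you can escape the slashes (//) if you like, although mod_rewrite patterns have no delimiters, so the escaping is optional and both forms match the same referrers: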

 RewriteCond %{HTTP_REFERER} ^https?:\/\/([^.]+\.)*semalt\.com.*? [NC]

But for the purpose of this rule, you only need this:

RewriteCond %{HTTP_REFERER} ([^.]+\.)*semalt\.com.*? [NC]
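
Putting that together, here is a minimal sketch of a single combined rule, not a tested drop-in. It assumes the Referer header is an absolute http(s) URL. Anchoring the match to the host part is my own addition, so that a legitimate referrer that merely mentions one of these names in its path or query string is not bounced. The explicit R=302 flag is also my choice, to make the external redirect visible rather than implied:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # One condition for every blacklisted referrer; add new names to the alternation.
  # Matches the host with or without subdomains, over http or https.
  RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?(semalt|simple-share-buttons)\.com [NC]

  # Redirect the request back to the referring URL it claims to come from.
  RewriteRule ^ %{HTTP_REFERER} [R=302,L]
</IfModule>
```

Using ^ as the rule pattern matches every request without capturing anything, since the captured path is never used in the substitution.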

Any of the rules you propose will work, but they will only be effective against semalt. simple-share-buttons is not a crawler, so the rule will have no effect on it.
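
Since only real crawlers ever reach these rules, and you want to burden your server as little as possible, one alternative worth considering is to refuse them outright instead of redirecting. This is only a sketch under that assumption, and it gives up the "send them back" behaviour you asked for:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # Unanchored match: any referrer containing the domain, as in the question.
  RewriteCond %{HTTP_REFERER} semalt\.com [NC]

  # "-" leaves the URL untouched; [F] answers 403 Forbidden and implies [L].
  RewriteRule ^ - [F]
</IfModule>
```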

You can verify this by checking your access log: if you search for these two spam referrers, you will see records only from semalt and none from simple-share-buttons.
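
On the servers where you can edit httpd.conf, one way to make that check easier is to write hits with a blacklisted referrer to their own log file. This is a sketch with assumptions: the variable name spam_ref and the log path are hypothetical, the "combined" log format must already be defined by a LogFormat directive (as it is in the stock configuration), and CustomLog is not allowed in .htaccess:

```apache
# Server or virtual-host context only.
# Flag any request whose Referer header contains a blacklisted domain.
SetEnvIfNoCase Referer "(semalt\.com|simple-share-buttons\.com)" spam_ref

# Log only the flagged requests to a separate file (path relative to ServerRoot).
CustomLog "logs/referrer_spam.log" combined env=spam_ref
```

If a domain never shows up in that file while still appearing in your analytics, it is not actually visiting your site.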

The only way to stop ghost spam (hits that are recorded in your analytics without ever reaching your server) is by using filters in GA. You can find more information about this kind of referrer spam here: https://stackoverflow.com/a/29312117/3197362

For more general information about referrer spam, you can check this answer: https://stackoverflow.com/a/28354319/3197362

As for the regex, this is an excellent tool for testing patterns: https://regex101.com/