Referrer spam is a huge problem in my analytics right now and I have been combating it for months.
I'm aware of the botnet discussions surrounding semalt.com (and other referral spammers). I'm also aware that some of the referral spam is likely triggered without ever visiting my site (which is why my .htaccess directives aren't catching all of it), and I have added filters to my analytics/tag manager accordingly.
I've researched extensively, including: "How to Block Spam Referrers like darodar.com from Accessing Website?" and "Domain name in mod_rewrite RewriteRule".
I'm hoping to implement rules that, for any sites with actual crawlers, send their bots back at them. I have over 100 referrers blacklisted in my .htaccess, but they all follow the same pattern. This is what I have now:
<IfModule mod_rewrite.c>
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com.*? [NC]
RewriteRule ^(.*)$ http://semalt.com/ [L]
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*simple-share-buttons\.com.*? [NC]
RewriteRule ^(.*)$ http://simple-share-buttons.com/ [L]
</IfModule>
I'd like to simplify that (new domains sending referral spam pop up frequently) so I'm wondering if this would work:
<IfModule mod_rewrite.c>
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} (semalt\.com) [NC]
RewriteRule ^(.*)$ %{HTTP_REFERER} [L]
RewriteCond %{HTTP_REFERER} (simple-share-buttons\.com) [NC]
RewriteRule ^(.*)$ %{HTTP_REFERER} [L]
</IfModule>
It seems like it should work, which makes me wonder if I can go a step further to this:
<IfModule mod_rewrite.c>
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} (semalt\.com|simple-share-buttons\.com) [NC]
RewriteRule ^(.*)$ %{HTTP_REFERER} [L]
</IfModule>
I want to burden my own server as little as possible and I don't care about protocols, subdomains, or paths included.
Basically, if any part of the referrer matches that string, I want to block it and redirect it to itself.
Will the directives I have written work as I expect and are they reasonably efficient in the RegEx matching patterns?
Is there a better way to do this that I am unaware of?
Note: Many of these sites are on a VPS where I can edit httpd.conf, but not all, so .htaccess-specific answers, which I can adapt, are preferred.
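Just a small fix for the first example you gave: you can escape the slashes in // (as \/\/), although mod_rewrite patterns are not slash-delimited, so this is optional.
For the purposes of the rule, though, a much shorter pattern that matches only the domain is enough.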
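As a minimal sketch of what such a shortened rule might look like (this is an illustration, not necessarily the exact snippet originally posted here), the block below matches either domain anywhere in the referrer and refuses the request. Returning 403 with the F flag instead of redirecting is an assumption made for the example, and it also assumes FollowSymLinks (or SymLinksIfOwnerMatch) is already enabled for the directory:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Match either spam domain anywhere in the Referer header, ignoring case.
# Add further domains to the alternation, separated by "|".
RewriteCond %{HTTP_REFERER} (semalt\.com|simple-share-buttons\.com) [NC]
# "-" means no substitution; F answers with 403 Forbidden and stops processing.
RewriteRule ^ - [F]
</IfModule>
```

To send the request back to the referrer instead, the RewriteRule line can be swapped for the one from the question. Either way, a single alternation keeps the whole list in one condition, so new domains only need to be appended to it.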
Any of the rules you propose will work fine, but they will only be effective against semalt. Simple Share Buttons is not a crawler, so the rules won't have any effect on it.
You can verify this by checking your access log: if you look for these two referrers, you will only see records from semalt and none from simple-share-buttons.
The only way to stop ghost spam is by using filters in GA. You can find more information about this kind of referrer spam here: https://stackoverflow.com/a/29312117/3197362
For more general information about referrer spam, you can check this answer: https://stackoverflow.com/a/28354319/3197362
As for the regex, this is an excellent tool for testing patterns: https://regex101.com/