I have the following in my .htaccess file:
Options +FollowSymlinks
#+FollowSymLinks must be enabled for any rules to work, this is a security
#requirement of the rewrite engine. Normally it's enabled in the root and we
#shouldn't have to add it, but it doesn't hurt to do so.
RewriteEngine on
#Apache scans all incoming URL requests, checks for matches in our .htaccess file
#and rewrites those matching URLs to whatever we specify.
#allow blank referrers.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?site\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?site\.dev [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?dev\.site\.com [NC]
RewriteRule \.(jpg|jpeg|png|gif)$ - [NC,F,L]
# if a directory or a file exists, use it directly
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# otherwise forward it to index.php
RewriteRule . index.php
site.com is the production site.
site.dev is a localhost dev environment.
dev.site.com is a subdomain where we test live.
I'm aware that this will prevent the site from being indexed:
Header set X-Robots-Tag "noindex, nofollow"
cf. http://yoast.com/prevent-site-being-indexed/
My question, however, is perhaps fairly simple:
Is there a way to apply this line ONLY on dev.site.com, so that it doesn't get indexed?
Yes, you need to put the Header line in the vhost config for dev.site.com. There's no way you can make a host check tied to a Header set directive from within an htaccess file.

The other possibility, if you want to block bots via user agent, is to remove the Header set line and add some rewrite rules that reject requests from known crawlers.

Note that any list of user agents you use for this won't be complete. You can try to go through the massive list of User-Agents and look for all of the index robots, or at least the more popular ones.
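As a rough illustration of the first option, the Header line could sit inside the virtual host for the dev subdomain. This is only a sketch: the port, the DocumentRoot path, and the IfModule guard are assumptions, not details from the original setup.

```apache
# Sketch of a vhost for the dev subdomain (port and paths are assumed).
<VirtualHost *:80>
    ServerName dev.site.com
    # Hypothetical document root; use whatever the real vhost points at.
    DocumentRoot /var/www/dev.site.com

    # Only send the header if mod_headers is loaded.
    <IfModule mod_headers.c>
        Header set X-Robots-Tag "noindex, nofollow"
    </IfModule>
</VirtualHost>
```

For the second option, the rewrite rules might look something like the following. The host condition and the short crawler list are assumptions chosen for illustration; the list is deliberately incomplete and would need extending.

```apache
# Sketch: refuse known crawlers, but only on the dev subdomain.
# Place these before the "forward to index.php" rule.
RewriteCond %{HTTP_HOST} ^dev\.site\.com$ [NC]
# Illustrative, incomplete list of crawler user agents.
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|slurp|baiduspider|yandex) [NC]
# Answer matching requests with 403 Forbidden and stop processing.
RewriteRule ^ - [F,L]
```

Keep in mind that user-agent blocking only stops crawlers that identify themselves honestly, so the vhost header is the more dependable of the two.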