I'm using the HTML5 Boilerplate build script on a new project that I've just deployed to a staging environment. The script works like a charm; it's well documented, so it was easy to configure for use in my application.
After reading through the documentation I decided to use Paul Irish's approach for VCS-based deployment to point to the /publish directory, using this snippet from his documentation in my .htaccess file:
RewriteEngine On
RewriteCond $1 !^publish/
RewriteRule ^(.*)$ publish/$1 [L]
I have it configured like this for my particular setup, and everything points to the minified and concatenated files just like it should. This is great, but the /publish directory is also browsable directly by going to http://[mysite.com]/publish/
This seems like kind of a loose thread to leave dangling. I'm wondering if anyone here has run into this and come up with a good solution. I'm not expecting users to type in /publish/ after the URL, but I wouldn't want it to be crawlable for sure, and it just seems a little sloppy to leave it like that.
Any ideas?
Thanks in advance
Update: after much-appreciated help from Gerben, below, I ended up changing my thinking on this a bit. There is no need to redirect users from /publish to the root URL, because users won't be typing in /publish and there will never be any links to [site.com]/publish. Instead I've added the following rule to the .htaccess file within the /publish directory. It uses the F flag (http://httpd.apache.org/docs/current/rewrite/flags.html#flag_f) to produce a 403 (Forbidden) error for any request made directly to the publish subdirectory:
RewriteCond %{THE_REQUEST} publish
RewriteRule .? - [F]
In addition, I've added the publish directory to robots.txt just to be sure search bots aren't indexing two sets of files which contain the same data.
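One caveat with the rule above: the unanchored pattern publish matches anywhere in the request line, so a legitimate file such as /js/publisher.js (a made-up example name) would presumably get a 403 as well. A tighter variant might look like the following. This is an untested sketch, assuming it lives in the same /publish/.htaccess file:

```apache
# Harmless if the engine is already enabled for this directory.
RewriteEngine On

# THE_REQUEST is the raw request line, e.g. "GET /publish/css/style.css HTTP/1.1".
# Match only when the requested path itself begins with /publish, followed by
# a slash, a query string, or the space before the protocol version.
RewriteCond %{THE_REQUEST} \s/publish(/|\?|\s)
RewriteRule .? - [F]
```

Requests that arrive through the internal rewrite keep their original request line (for example GET /css/style.css HTTP/1.1), so they should not match the condition and are served normally.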
Seems I misread your question. I think the following would redirect anything back to the root folder.
To be safe, I would probably also add /publish to my robots.txt as disallowed, just in case you accidentally remove the .htaccess file or something.
Explanation: The RewriteRule checks that the requested URL starts with publish/___ and redirects those URLs to /___. But to distinguish between direct requests to /publish and URLs rewritten to /publish, you'll need to examine the originally requested URL. The only way to get at that is via the THE_REQUEST variable. For direct requests, that variable should contain something like GET /publish/___ HTTP/1.1. So the RewriteCond checks for the presence of <space>/publish/ in it.
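The rule described in this explanation can be sketched roughly as follows. This is a reconstruction from the prose, not necessarily the exact rule originally posted. It assumes everything lives in the root .htaccess file, that /publish has no rewrite rules of its own, and that a permanent (301) redirect is wanted:

```apache
RewriteEngine On

# Direct requests only: THE_REQUEST holds the original request line,
# e.g. "GET /publish/css/style.css HTTP/1.1", and is not altered by
# internal rewrites. \s matches the space before the path.
RewriteCond %{THE_REQUEST} \s/publish/
# Send the browser to the same path without the publish/ prefix.
RewriteRule ^publish/(.*)$ /$1 [R=301,L]

# Internal rewrite from the question: serve everything out of publish/.
RewriteCond $1 !^publish/
RewriteRule ^(.*)$ publish/$1 [L]
```

With this ordering, a direct request for /publish/css/style.css should be redirected to /css/style.css, which the second rule then maps back into publish/ internally. Because the request line of that second request no longer contains /publish/, the redirect should not fire again.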
EDIT: final attempt: