I'd like to use an HTTP proxy (such as nginx) to cache large/expensive requests. These resources are identical for any authorized user, but their authentication/authorization needs to be checked by the backend on each request.
It sounds like Cache-Control: public, max-age=0 combined with the nginx directive proxy_cache_revalidate on; is the way to do this. The proxy can cache the response, but every subsequent request must trigger a conditional GET to the backend to confirm the user is authorized before the cached resource is returned. The backend then sends a 403 if the user is unauthorized, a 304 if the user is authorized and the cached resource is still current, or a 200 with the new resource if it has changed.
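A minimal sketch of the nginx side of this setup, for reference. The cache path, the zone name authcache and the backend address are assumptions for illustration; the X-Cached header simply exposes nginx's $upstream_cache_status variable, which is what the test output further down greps for.

```nginx
# Goes in the http {} block. Path and zone name are hypothetical.
proxy_cache_path /var/cache/nginx/authcache keys_zone=authcache:10m;

server {
    listen 4000;

    location / {
        proxy_pass http://127.0.0.1:8080;    # assumed backend address
        proxy_cache authcache;

        # Refresh expired entries with a conditional GET
        # (If-Modified-Since / If-None-Match) instead of a full fetch.
        proxy_cache_revalidate on;

        # Report MISS / HIT / REVALIDATED etc. to the client.
        add_header X-Cached $upstream_cache_status;
    }
}
```

Note that proxy_cache_revalidate only applies to entries nginx considers expired; it does not by itself force a check on every request.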
In nginx, if max-age=0 is set, the response isn't cached at all. With max-age=1, nginx does perform the conditional GET if I wait 1 second after the initial request, but within that second it serves the response directly from cache, which is obviously very bad for a resource that needs to be authenticated.
Is there a way to get nginx to cache the response but require revalidation on every request?
Note this does work correctly in Apache. Here are examples for both nginx and Apache, the first 2 with max-age=5, the last 2 with max-age=0:
# Apache with `Cache-Control: public, max-age=5`
$ while true; do curl -v http://localhost:4001/ 2>&1 >/dev/null | grep X-Cache; sleep 1; done
< X-Cache: MISS from 172.x.x.x
< X-Cache: HIT from 172.x.x.x
< X-Cache: HIT from 172.x.x.x
< X-Cache: HIT from 172.x.x.x
< X-Cache: HIT from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
< X-Cache: HIT from 172.x.x.x
# nginx with `Cache-Control: public, max-age=5`
$ while true; do curl -v http://localhost:4000/ 2>&1 >/dev/null | grep X-Cache; sleep 1; done
< X-Cached: MISS
< X-Cached: HIT
< X-Cached: HIT
< X-Cached: HIT
< X-Cached: HIT
< X-Cached: HIT
< X-Cached: REVALIDATED
< X-Cached: HIT
< X-Cached: HIT
# Apache with `Cache-Control: public, max-age=0`
# THIS IS WHAT I WANT
$ while true; do curl -v http://localhost:4001/ 2>&1 >/dev/null | grep X-Cache; sleep 1; done
< X-Cache: MISS from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
< X-Cache: REVALIDATE from 172.x.x.x
# nginx with `Cache-Control: public, max-age=0`
$ while true; do curl -v http://localhost:4000/ 2>&1 >/dev/null | grep X-Cache; sleep 1; done
< X-Cached: MISS
< X-Cached: MISS
< X-Cached: MISS
< X-Cached: MISS
< X-Cached: MISS
< X-Cached: MISS
As you can see, in the first 2 examples the responses are cached by both Apache and nginx. Apache also handles max-age=0 correctly by caching the response and revalidating it on every request, but nginx does not cache it at all.
I think your best bet would be to modify your backend to support X-Accel-Redirect. Its functionality is enabled by default, and is described in the documentation for proxy_ignore_headers. The backend checks authorization on every request and, on success, answers with an X-Accel-Redirect header naming an internal location, which nginx then serves in place of the backend's response.
You would then cache said internal resource, and automatically return it for any user that has been authenticated.
As the redirect has to be internal, there would not be any other way for it to be accessed (e.g., without an internal redirect of some sort), so, as per your requirements, unauthorised users won't be able to access it, but it could still be cached just like any other location.
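A rough sketch of how this could be laid out, not a drop-in configuration. The /protected/ prefix, the zone name authcache, the cache path and the backend address are all hypothetical. It also assumes the backend answers authorized requests on the public location with a header such as X-Accel-Redirect: /protected/report, and serves the expensive resource itself under /protected/ without a further auth check.

```nginx
# Goes in the http {} block. Path and zone name are hypothetical.
proxy_cache_path /var/cache/nginx/authcache keys_zone=authcache:10m;

server {
    listen 4000;

    # Public entry point: every request reaches the backend, which checks
    # authorization and replies either 403 or with an X-Accel-Redirect header.
    # Nothing is cached here.
    location / {
        proxy_pass http://127.0.0.1:8080;    # assumed backend address
    }

    # Reachable only through an internal redirect, never directly by clients.
    location /protected/ {
        internal;
        proxy_pass http://127.0.0.1:8080;    # assumed backend address
        proxy_cache authcache;

        # Used when the backend sends no caching headers of its own;
        # the 10m lifetime is an arbitrary example value.
        proxy_cache_valid 200 10m;
        add_header X-Cached $upstream_cache_status;
    }
}
```

With this split the authorization check stays cheap and uncached, while the expensive response is cached under an ordinary positive lifetime, so max-age=0 is no longer needed. The backend should not be reachable directly by clients, since its /protected/ endpoint performs no check in this sketch.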