(I have completely rewritten my question because (1) I found errors in the function I posted earlier and (2) I was failing to click the Publish button anyway. There was one answer to my previous erroneous post from @Michael - sqlbot which I've commented on.)
I have two CloudFront distributions in front of S3 buckets. The first, the actual website, is served at "www.xyz.com" (not its real URL). The second is served at "xyz.com" and redirects to "www.xyz.com". This is the setup AWS recommends, and it works just fine.
I want references to folders to work without the user having to include the file name in the URL. That is, "xyz.com/folder" and "xyz.com/folder/" should both work and return the index document.
Here's my CloudFront function, which nearly works:
function basename(str, sep) {
    return str.substr(str.lastIndexOf(sep) + 1);
}
function handler(event) {
    let request = event.request;
    let start = 0;
    let pfx;
    let newuri;
    if (request.uri.startsWith('https://www.')) {
        start = 12;
    }
    else if (request.uri.startsWith('https://')) {
        start = 8;
    }
    if (start > 0)
        pfx = request.uri.substr(0, start);
    else
        pfx = '';
    let rest = request.uri.substr(start);
    let base = '';
    if (rest.includes('/')) {
        base = basename(rest, '/');
    }
    if (!base.includes('.')) {
        let slash = '';
        if (!rest.endsWith('/')) {
            slash = '/';
        }
        newuri = pfx + rest + slash + 'index.html';
    }
    else
        newuri = pfx + rest;
    request.uri = newuri;
    return request;
}
The problem is that "xyz.com/folder" and "xyz.com/folder/" work, but "www.xyz.com/folder" and "www.xyz.com/folder/" do not. This is OK, because if I give a URL to someone, I never include the "www." part anyway. Still, it would be better if URLs with "www." worked.
I think a clue is that "xyz.com" is redirected to "www.xyz.com", but "www.xyz.com" is not redirected. In the code, I tried rewriting URLs with "www." to not have the "www." part, but that was of no help.
(In his answer to my previous post, @Michael - sqlbot suggested that URLs of the form "xyz.com" didn't work because the base directory wasn't being set properly, but that turned out not to be the case. The function shown handles URLs with and without the trailing slash OK.)
(Note that my function is more complex than the example given by AWS:
async function handler(event) {
    const request = event.request;
    const uri = request.uri;
    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    }
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }
    return request;
}
The example is wrong because the call to "includes" will find a dot in the first part of the URL. It needs to look only in the basename.)
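A minimal sketch of a check restricted to the basename might look like the following. It assumes that request.uri carries only the path portion of the URL (for example "/v1.2/docs"), in which case the dot that could mislead the AWS example is one inside a directory name. The names hasFileExtension and addIndexDocument are hypothetical and not part of any AWS API.

```javascript
// Hypothetical helper: true when the last path segment looks like a file name.
function hasFileExtension(path) {
    const lastSegment = path.slice(path.lastIndexOf('/') + 1);
    return lastSegment.includes('.');
}

// Hypothetical rewrite step: append the index document when no file is named.
function addIndexDocument(path) {
    if (path.endsWith('/')) {
        return path + 'index.html';
    }
    if (!hasFileExtension(path)) {
        return path + '/index.html';
    }
    return path;
}
```

With this sketch, a path such as "/v1.2/docs" is still treated as a folder, because only "docs" is inspected for a dot.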
Ideas?
Examine the behavior of any common web server (Nginx, Apache, etc.) on index documents and you will find your oversight: when the browser requests "/aup", you need to return a 3xx redirect so that the browser subsequently requests "/aup/", and only then should you return the contents of the index document. This is what conventional web servers do.
The links to the images are interpreted by the browser relative to the directory of the current document URI, as you know... but what you are overlooking is which directory that actually would be in these two different cases.
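The two cases can be reproduced with the standard URL class, which resolves relative references the same way a browser does. This is only an illustration to run in a browser console or Node.js, not inside a CloudFront function; the host is the placeholder from the question and the paths are the ones used in the explanation that follows.

```javascript
// Resolve the relative reference "cat.jpg" against each form of the page URL.
const host = 'https://www.xyz.com';  // placeholder host, as in the question

const withSlash = new URL('cat.jpg', host + '/foo/bar/').pathname;
const withoutSlash = new URL('cat.jpg', host + '/foo/bar').pathname;

console.log(withSlash);     // /foo/bar/cat.jpg
console.log(withoutSlash);  // /foo/cat.jpg
```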
The file "cat.jpg" in the same directory as "/foo/bar/" is "/foo/bar/cat.jpg".
The file "cat.jpg" in the same directory as "/foo/bar" is "/foo/cat.jpg", because "/foo/bar" is interpreted by the browser as a file named "bar" in the directory at "/foo". This is the only reasonable conclusion for the browser to draw. The lack of a trailing slash changes the meaning entirely.
So what you are seeing isn't strange behavior; it's correct behavior on the part of the browser, unrelated to S3 or CloudFront. It's the browser that has to understand the base directory.
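Put together, the redirect-then-serve behavior described above could be sketched as a viewer-request CloudFront function along these lines. This is a sketch under assumptions rather than a drop-in fix: it assumes request.uri holds only the path, it picks 301 as the 3xx status, and it does not carry the query string over to the redirect.

```javascript
// Sketch of a viewer-request function. Assumed event shape:
// event.request.uri is the path only, without host or query string.
function handler(event) {
    const request = event.request;
    const uri = request.uri;

    // "/folder/" -> serve the index document from that directory.
    if (uri.endsWith('/')) {
        request.uri = uri + 'index.html';
        return request;
    }

    // "/folder" (no dot in the last segment) -> redirect to "/folder/".
    const lastSegment = uri.slice(uri.lastIndexOf('/') + 1);
    if (!lastSegment.includes('.')) {
        return {
            statusCode: 301,  // assumption: any suitable 3xx code would do
            statusDescription: 'Moved Permanently',
            headers: { location: { value: uri + '/' } }
        };
    }

    // Anything else looks like a file request; pass it through unchanged.
    return request;
}
```

Because the browser follows the redirect to the trailing-slash URL before it loads the page, relative links such as "cat.jpg" then resolve inside the folder rather than beside it.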