Should the filename and extension be included in a canonical tag?


I've been reading about canonical tags but can't find a definitive explanation as to whether the file extension should be included in the canonical tag.

I have three files in the root folder, and Google Search Console tells me that not all pages have been indexed. Google says:

Duplicate without user-selected canonical.

So how do I tell Google crawler that index.html is the master version?

In the examples I've seen, there has been no mention of the filename, just folder names.

In my example, I use the fictitious site below: https://portfolio-website.example/index.html

Should the canonical in the index.html header be: <link rel="canonical" href="https://portfolio-website.example/index.html" />

Is this the pattern to use for every .html file?


There is 1 best solution below


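index.html should never be part of your URLs, so omit it from your canonical URL. index.html is the default document that the server returns when the directory itself is requested, which means https://portfolio-website.example/ and https://portfolio-website.example/index.html serve the same content. That is why Google reports a duplicate. Users are never supposed to see index.html, because it is ugly and unnecessary to put in the URL.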

That means that when you choose the canonical for your home page, it should be: <link rel="canonical" href="https://portfolio-website.example/">

When you link to your home page, you should also omit the index.html. The easiest ways to link to your home page without it are <a href="/"> (root-relative link, works from within your site) or <a href="https://portfolio-website.example/"> (absolute link).

index.html is the only HTML document that should be treated this way. If you have another page (like foo.html) the document name and extension would go in the canonical URL (and in links).
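
As a rough sketch of how the two cases might look side by side, here are the head fragments for the home page and for one other page. foo.html is only the placeholder name used above, and the domain is the fictitious one from the question; each fragment belongs in its own file.

```html
<!-- Fragment for index.html: the canonical points at the directory URL, with no filename -->
<head>
  <link rel="canonical" href="https://portfolio-website.example/">
</head>

<!-- Fragment for foo.html (placeholder page name): filename and extension are kept -->
<head>
  <link rel="canonical" href="https://portfolio-website.example/foo.html">
</head>
```

With tags like these in place, a request for either / or /index.html declares the same canonical URL, which is the signal Google is asking for when it reports "Duplicate without user-selected canonical".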