Why does Google not index my "robots.txt"?

I am trying to allow the Googlebot webcrawler to index my site. My robots.txt initially looked like this:

User-agent: * 
Disallow: / 
Host: www.sitename.com 
Sitemap: https://www.sitename.com/sitemap.xml

And I changed it to:

User-agent: * 
Allow: / 
Host: www.sitename.com 
Sitemap: https://www.sitename.com/sitemap.xml 

But Google is still not indexing my links.

There are 2 best solutions below

Solution 1

"I am trying to allow the Googlebot webcrawler to index my site."

  1. Robots rules have nothing to do with indexing! They are ONLY about crawling. A page can be indexed even if it is forbidden to be crawled!

  2. The Host directive is supported only by Yandex.

  3. If you want all bots to be able to crawl your site, your robots.txt file should be located at https://www.sitename.com/robots.txt, be served with status code 200, and contain the following (a quick way to check this is sketched right after this list):

    User-agent: *
    Disallow:
    Sitemap: https://www.sitename.com/sitemap.xml
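
A quick way to verify point 3 (the file is reachable at the site root with status 200 and actually allows crawling) is a short script using Python's standard library. This is only a sketch: https://www.sitename.com stands in for your real domain, and urllib's robots parser is not identical to Googlebot's, so treat the result as a rough check rather than a guarantee.

from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

ROBOTS_URL = "https://www.sitename.com/robots.txt"  # swap in your real domain

# The file must live at the site root and answer with HTTP 200.
with urlopen(ROBOTS_URL) as response:
    print("HTTP status:", response.status)  # expect 200

# Robots rules govern crawling, so ask whether Googlebot may crawl the homepage.
parser = RobotFileParser(ROBOTS_URL)
parser.read()
print("Googlebot may crawl /:",
      parser.can_fetch("Googlebot", "https://www.sitename.com/"))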

Solution 2

From the docs:

Robots.txt syntax can be thought of as the “language” of robots.txt files. There are five common terms you’re likely to come across in a robots file. They include:

User-agent: The specific web crawler to which you’re giving crawl instructions (usually a search engine). A list of most user agents can be found here.

Disallow: The command used to tell a user-agent not to crawl a particular URL. Only one "Disallow:" line is allowed for each URL.

Allow (Only applicable for Googlebot): The command to tell Googlebot it can access a page or subfolder even though its parent page or subfolder may be disallowed.

Crawl-delay: How many seconds a crawler should wait before loading and crawling page content. Note that Googlebot does not acknowledge this command, but crawl rate can be set in Google Search Console.

Sitemap: Used to call out the location of any XML sitemap(s) associated with this URL. Note this command is only supported by Google, Ask, Bing, and Yahoo.
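
Putting those five terms together, a single file can carry several groups. The example below is purely illustrative: the /blog/ and /private/ paths are made up for the sake of the sketch, and the domain is the placeholder from the question.

User-agent: Bingbot
Crawl-delay: 10

User-agent: Googlebot
Disallow: /blog/
Allow: /blog/public/

User-agent: *
Disallow: /private/

Sitemap: https://www.sitename.com/sitemap.xml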

Try specifically mentioning Googlebot in your robots.txt directives, such as:

User-agent: Googlebot 
Allow: /

or allow all web crawlers access to all content

User-agent: * 
Disallow:
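
If you keep both variants in one file, note that a crawler follows only the most specific group that matches its user agent, so Googlebot would read the Googlebot group and ignore the * group. A combined file (again using the placeholder domain from the question) might look like:

User-agent: Googlebot
Allow: /

User-agent: *
Disallow:

Sitemap: https://www.sitename.com/sitemap.xml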