Disable crawling of unwanted subdomains

905 Views Asked by AliN11 At 02 October 2014 at 06:51

How can I stop subdomain.domain.com from being crawled and listed by Alexa and other crawlers? In particular, cpanel.domain.com and webmail.domain.com are listed on my Alexa information page, which is annoying.


Thomas Jensen On 02 October 2014 at 07:00 BEST ANSWER

From this article: https://alexa.zendesk.com/hc/en-us/articles/200450194-Alexa-s-Web-and-Site-Audit-Crawlers

The Alexa web crawler (robot) identifies itself as “ia_archiver” in the HTTP “User-agent” header field. The Alexa Internet ia_archiver crawler strictly adheres to robots.txt rules.

To prevent ia_archiver from visiting any part of your site, your robots.txt file should look like this:

User-agent: ia_archiver
Disallow: /

You can also restrict crawling of specific directories. For example, to prevent ia_archiver from visiting the images directory (and its subdirectories):

User-agent: ia_archiver
Disallow: /images/
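The two rule sets above can be checked locally before they are deployed. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed`, the constants, and the "Googlebot" comparison agent are illustrative assumptions, not part of Alexa's documentation. It only shows how a standards-following parser reads the rules, not how ia_archiver itself is implemented.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration: it parses the text in memory,
    so no network request is made.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two examples from the answer, as plain strings.
BLOCK_ALL = """User-agent: ia_archiver
Disallow: /
"""

BLOCK_IMAGES = """User-agent: ia_archiver
Disallow: /images/
"""

if __name__ == "__main__":
    # ia_archiver is blocked everywhere by the first rule set.
    print(is_allowed(BLOCK_ALL, "ia_archiver", "http://cpanel.domain.com/"))
    # A crawler that the rules do not name (assumed here: Googlebot) is unaffected.
    print(is_allowed(BLOCK_ALL, "Googlebot", "http://cpanel.domain.com/"))
    # The second rule set blocks only paths under /images/.
    print(is_allowed(BLOCK_IMAGES, "ia_archiver", "http://domain.com/images/logo.png"))
    print(is_allowed(BLOCK_IMAGES, "ia_archiver", "http://domain.com/index.html"))
```

Note that rules addressed to ia_archiver say nothing about other crawlers, which is why the second check still reports the page as fetchable.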

If you can, place a robots.txt file in the root of each subdomain you do not want crawled. If those subdomains are outside your control, the hosting service should (or at least could) have put these or similar restrictions in place.
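Crawlers look up robots.txt separately for each host, so cpanel.domain.com and webmail.domain.com each need their own file. The sketch below, under that assumption, generates a blocking robots.txt and the URL where each subdomain would have to serve it. The function names, the `UNWANTED_HOSTS` list, and the use of the `*` wildcard to cover crawlers other than ia_archiver are illustrative choices, not something the answer prescribes.

```python
from urllib.robotparser import RobotFileParser


def build_blocking_robots_txt(user_agents=("*",)):
    """Build robots.txt text that disallows everything for each listed agent.

    The default "*" applies to every crawler that honours robots.txt.
    """
    blocks = [f"User-agent: {agent}\nDisallow: /\n" for agent in user_agents]
    return "\n".join(blocks)


def robots_txt_url(host, scheme="http"):
    """Return the location where crawlers expect this host's robots.txt."""
    return f"{scheme}://{host}/robots.txt"


# Subdomains named in the question (placeholders for the real hosts).
UNWANTED_HOSTS = ["cpanel.domain.com", "webmail.domain.com"]

if __name__ == "__main__":
    content = build_blocking_robots_txt()
    for host in UNWANTED_HOSTS:
        print(f"Serve at {robots_txt_url(host)}:")
        print(content)
```

This only affects crawlers that respect robots.txt, and it does not by itself remove entries that are already listed.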
