How to prevent crawlers from following links?


I'm building a site that will allow sellers to:

  • list their products on my site
  • have each product link back to the seller's site
  • be charged for each link clicked

What I need to do now is make sure that I am only logging actual human users following the links to the sellers' sites. If it's a bot crawling the site, I shouldn't be charging the sellers for that.

Is there a way for me to tell bots not to follow a certain link? I don't think it's nofollow, as that is not intended to block access to content.

3

There are 3 best solutions below

2
On BEST ANSWER

The way to tell a bot not to follow a link is to add rel="nofollow" to your <a> tag, though only well-behaved crawlers will honor it. Assuming you also log each click locally before forwarding to the external URL, you can check the user agent string as well.

In fact, if you are going to charge people based on the number of referrals, it might be a good idea to log the IP address and user agent against each paid-for click in case your stats are ever questioned.
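
A rough sketch of that logging step, assuming each outbound click passes through your own handler before the redirect. The names here (ClickRecord, record_click, the CSV layout) are made up for illustration and are not from any particular framework:

```python
import csv
from dataclasses import dataclass, asdict, fields
from datetime import datetime, timezone


@dataclass
class ClickRecord:
    """One paid-for click, kept as evidence if the stats are questioned."""
    product_id: str
    target_url: str
    ip_address: str
    user_agent: str
    clicked_at: str  # ISO 8601 timestamp in UTC


def record_click(log_file, product_id, target_url, ip_address, user_agent, now=None):
    """Append one click to the audit log and return the URL to redirect to."""
    now = now or datetime.now(timezone.utc)
    record = ClickRecord(product_id, target_url, ip_address, user_agent, now.isoformat())
    writer = csv.DictWriter(log_file, fieldnames=[f.name for f in fields(ClickRecord)])
    writer.writerow(asdict(record))
    return target_url
```

In a real handler, log_file would be a file opened in append mode (or a database insert), and the returned URL would go into the redirect response.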

0
On

Typically you can identify them by the user agent string. You can find a list here; I can't say it's perfect, but it's a good base to extend: PHP/MySQL - an array filter for bots
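
As a minimal illustration of that kind of filter (written in Python rather than PHP), something like the following could work. The marker list is a small assumed sample, not the list linked above, and substring matching is only a heuristic: it will miss bots that pretend to be browsers and can misfire on unusual browser strings.

```python
# Lowercase substrings that commonly appear in crawler user agents.
# This is an assumed starter list; extend it from a maintained source.
BOT_MARKERS = ("bot", "crawl", "spider", "slurp")


def looks_like_bot(user_agent):
    """Return True if the user agent is missing or contains a known bot marker."""
    if not user_agent:
        # Real browsers always send a user agent, so treat an empty one as a bot.
        return True
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)
```

You would call looks_like_bot() with the request's User-Agent header and only count the click as billable when it returns False.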

robots.txt is another way; more about it here

3
On

You just add a robots.txt file, e.g. like this one.

You can find more info about robots.txt files on the net, e.g. in Wikipedia.
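
To make that concrete, here is a small sketch that assumes all billable outbound links go through a path like /out/ on your site (that path and the shop.example host are hypothetical). It uses Python's standard urllib.robotparser to show how a compliant crawler would interpret the rules; badly behaved bots ignore robots.txt entirely.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: ask every crawler to stay out of the redirect path.
# "/out/" is an assumed location for the click-tracking links.
ROBOTS_TXT = """\
User-agent: *
Disallow: /out/
"""


def crawler_may_fetch(user_agent, url):
    """Return True if a robots.txt-compliant crawler is allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, a compliant crawler would skip https://shop.example/out/123 but could still index https://shop.example/products/123.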