Nutch - Revisit a few pages repeatedly to find new links


I have set up Nutch 1.17 to crawl a few thousand domains, following inlinks only. One of my main requirements is that the home pages be revisited repeatedly (say, every 2 hours), and that if a new page has appeared, only that new page be crawled.

What is the best way to do this? I am thinking of running the injector job repeatedly so that the home pages are crawled again. Is that the right approach? Also, how can I make sure that the inlinks are still fetched over time?
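
To make the re-injection idea concrete: the Nutch injector accepts tab-separated per-URL metadata in the seed file, including `nutch.fetchInterval` (in seconds). Below is a minimal sketch, not a confirmed solution, that writes seed lines with a 2-hour interval for the home pages. The URLs and the helper names `seed_line` and `build_seed_file` are hypothetical, introduced only for illustration.

```python
# Sketch: build Nutch seed-file lines that carry a per-URL fetch interval.
# Assumption: the injector reads tab-separated metadata of the form
# nutch.fetchInterval=<seconds>. The URLs below are placeholders.

TWO_HOURS_SECONDS = 2 * 60 * 60  # 7200 seconds


def seed_line(url: str, interval_seconds: int = TWO_HOURS_SECONDS) -> str:
    """Return one seed-file line: the URL, a tab, then the interval metadata."""
    return f"{url}\tnutch.fetchInterval={interval_seconds}"


def build_seed_file(home_pages: list[str]) -> str:
    """Return the seed-file text, one newline-terminated line per home page."""
    return "".join(seed_line(url) + "\n" for url in home_pages)


if __name__ == "__main__":
    # Hypothetical home pages; replace with the real seed list.
    print(build_seed_file(["https://example.com/", "https://example.org/"]), end="")
```

Whether re-injecting a URL that is already in the CrawlDb actually applies the new interval likely depends on the injector settings (`db.injector.overwrite` and `db.injector.update` in `nutch-default.xml`), so that is worth verifying before relying on this.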
