Using crawler4j to crawl a list of URLs without crawling the entire website


I have a list of web URLs that need to be crawled. Is it possible to crawl only the listed pages, without going any deeper? If I add a URL as a seed, crawler4j crawls the whole website at full depth.

Best answer

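To crawl only the pages you added as seeds, set the maximum crawl depth to 0 with setMaxDepthOfCrawling(0). Seeds are at depth 0, so with this limit the links found on them are not followed.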

CrawlConfig config = new CrawlConfig();
// Depth 0: fetch the seed pages only; their outgoing links are not scheduled.
config.setMaxDepthOfCrawling(0);
PageFetcher pageFetcher = new PageFetcher(config);
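
To see why a depth limit of 0 stops at the seeds, here is a minimal sketch of a depth-limited crawl over an in-memory link graph. It is a simplified model and not crawler4j's actual code: the class DepthLimitSketch, its crawl method, and the links map are all hypothetical names made up for illustration. It assumes seeds start at depth 0 and that -1 means "no limit", which matches how the setting is described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of a depth-limited crawl (not crawler4j code).
final class DepthLimitSketch {

    // A URL paired with its distance from the seed it was reached from.
    private static final class Entry {
        final String url;
        final int depth;

        Entry(String url, int depth) {
            this.url = url;
            this.depth = depth;
        }
    }

    // Returns the URLs visited, in breadth-first order.
    // links maps a page URL to the URLs it links to; maxDepth == -1 means unlimited.
    static List<String> crawl(List<String> seeds,
                              Map<String, List<String>> links,
                              int maxDepth) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<Entry> frontier = new ArrayDeque<>();

        // Every seed enters the frontier at depth 0.
        for (String seed : seeds) {
            if (seen.add(seed)) {
                frontier.add(new Entry(seed, 0));
            }
        }

        while (!frontier.isEmpty()) {
            Entry current = frontier.poll();
            visited.add(current.url);

            // Outgoing links are scheduled only while the current page is
            // shallower than the limit. With maxDepth == 0 this is never
            // true for a seed, so nothing beyond the seeds is queued.
            if (maxDepth != -1 && current.depth >= maxDepth) {
                continue;
            }
            List<String> outgoing =
                    links.getOrDefault(current.url, Collections.<String>emptyList());
            for (String next : outgoing) {
                if (seen.add(next)) {
                    frontier.add(new Entry(next, current.depth + 1));
                }
            }
        }
        return visited;
    }
}
```

In this model, calling crawl with the seed list and maxDepth 0 returns exactly the seeds, while maxDepth 1 also returns the pages they link to directly.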