Crawler4j visits only seed URLs

I'm using crawler4j to crawl the Rotten Tomatoes website and extract structured data. I have set everything up, and with the default URLs given in the example on the project home page everything works, but when I put in my own seeds the application only visits the URLs I have given it. Did I miss something?
There is 1 solution below.
The most common cause is a shouldVisit method that always returns false (or that only matches the seed URLs themselves). The seeds are scheduled directly, but none of the outgoing links discovered on those pages pass the filter, so the crawler visits only the seeds. Make sure shouldVisit returns true for every URL you actually want the crawler to follow.
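For reference, a minimal sketch of a crawler class whose shouldVisit accepts outgoing links in addition to the seeds. This assumes crawler4j 4.x, where shouldVisit receives the referring Page; in older versions the signature is shouldVisit(WebURL url) without that argument. The rottentomatoes.com prefix and the file-extension filter are just example choices for the site mentioned in the question; adjust them to whatever you want the crawl restricted to.

```java
import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.url.WebURL;
import java.util.regex.Pattern;

public class MyCrawler extends WebCrawler {

    // Example filter: skip common binary/static resources.
    private static final Pattern FILTERS =
            Pattern.compile(".*(\\.(css|js|gif|jpe?g|png|pdf|zip))$");

    @Override
    public boolean shouldVisit(Page referringPage, WebURL url) {
        String href = url.getURL().toLowerCase();
        // Return true for every URL the crawler should follow,
        // not only the seed URLs, otherwise the crawl stops at the seeds.
        return !FILTERS.matcher(href).matches()
                && href.startsWith("http://www.rottentomatoes.com/");
    }

    @Override
    public void visit(Page page) {
        // Extraction logic goes here; this just logs the visited URL.
        System.out.println("Visited: " + page.getWebURL().getURL());
    }
}
```

If your shouldVisit only compares the URL against the exact seed strings, broaden it to a domain or path prefix as above; otherwise every discovered link is rejected and the frontier never grows beyond the seeds.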