Find the number of already existing documents in Solr with the solrindex job in Nutch

19 Views Asked by Naser Aslam At 28 July 2025 at 13:11

In Nutch, during the solrindex job, how can we calculate the number of documents that have been updated in Solr and the number of documents that have been indexed as new documents?


There is 1 best solution below

Quent On 08 November 2018 at 15:24

You can use the following command to see CrawlDb statistics, including the number of URLs per status (fetched, not_modified, gone, ...):

bin/nutch readdb crawl/crawldb/ -stats

Alternatively, you can dump the CrawlDb to see all URLs that have been crawled, together with their status:

bin/nutch readdb crawl/crawldb/ -dump whole_db
vi whole_db/part-r-00000
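
Instead of reading the dump by hand, the per-status totals can be tallied with standard shell tools. This is a minimal sketch, assuming each record in the dump contains a line of the form "Status: 2 (db_fetched)" and that the output files are named part-*; the function name count_statuses is hypothetical. Note that these counts describe the CrawlDb, not what Solr did with each document.

```shell
# Count CrawlDb records per status from a text dump.
# Assumption: each record has a line such as "Status: 2 (db_fetched)".
count_statuses() {
  # $1: directory written by `bin/nutch readdb ... -dump` (defaults to whole_db)
  dump_dir="${1:-whole_db}"
  # -h suppresses file names so identical status lines group together
  grep -h '^Status:' "$dump_dir"/part-* | sort | uniq -c | sort -rn
}
```

Calling `count_statuses whole_db` prints one line per status with its count, most frequent first.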
