$ bin/nutch inject crawl/crawldb urls
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/Gjergj%20Kadriu/Documents/apache-nutch-1.19/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/Gjergj%20Kadriu/Documents/apache-nutch-1.19/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
at [row,col,system-id]: [9,2,"file:/C:/Users/Gjergj%20Kadriu/Documents/apache-nutch-1.19/conf/nutch-site.xml"]
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3092)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:3041)
at org.apache.hadoop.conf.Configuration.loadProps(Configuration.java:2914)
at org.apache.nutch.crawl.Injector.main(Injector.java:533)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
at [row,col,system-id]: [9,2,"file:/C:/Users/Gjergj%20Kadriu/Documents/apache-nutch-1.19/conf/nutch-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:634)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:504)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:488)
... 13 more
I tried different configurations of nutch-site.xml, using the defaults from nutch-default.xml. I'm using Cygwin on Windows 10. I also tried troubleshooting environment variables, etc., but nothing works. Any ideas on how to approach this error?
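The file nutch-site.xml is required to be a valid XML document. The error message indicates that there are multiple root elements: the parser found a start tag at row 9, column 2 of conf/nutch-site.xml, after the root element had already been closed. For example, the error is reproducible with a nutch-site.xml that contains a second top-level element after the closing </configuration> tag.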
Once the XML syntax is fixed, Nutch should be able to read the configuration file.
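One way to check the file before running Nutch again is to parse it with any XML parser. The sketch below is only an illustration: it uses Python's standard-library parser rather than Woodstox (the parser in the stack trace), so the error wording will differ, and both file contents and the property values in them are hypothetical, not taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical nutch-site.xml: a second <configuration> element starts on
# line 9, after the first root element has already been closed.
BROKEN = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>MyCrawler</value>
  </property>
</configuration>

<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http</value>
  </property>
</configuration>
"""

# Same properties, merged under a single root element.
FIXED = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>MyCrawler</value>
  </property>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http</value>
  </property>
</configuration>
"""


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


for label, text in (("broken", BROKEN), ("fixed", FIXED)):
    ok, message = check_well_formed(text)
    print(label, "is well-formed" if ok else "failed: " + message)
```

To check the real file, read conf/nutch-site.xml into a string and pass it to check_well_formed; the reported line number should point at the extra top-level element.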