I have configured Apache Nutch 2.3.1 and crawled few websites. I have to index these documents to Solr (6.6.3) that is running in Cloud mode. When I execute solrindex command, I got following exception
2018-05-02 13:10:40,679 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@3bd3d05e
java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://10.11.22.156:8983/solr/collection2
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:103)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:114)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2019)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://10.11.22.156:8983/solr/collection2
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:559)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:97)
... 11 more
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://10.11.22.156:8983 refused
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:643)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
... 17 more
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
Where is the problem? If I repeat the same job with solr without cloud mode, it works fines.
The error straight forward showing that you have an Apache Nutch server, which is unable to access this particular node and port of Apache Solr http://10.11.22.156:8983/solr/collection2.
You need access between these two servers to make them communicate with each other :