I'm writing 20 million rows of data to Elasticsearch (Azure Cloud) using the spark-es connector. After writing 13 million rows successfully, I got the error below:
Caused by: EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[.......westeurope.azure.elastic-cloud.com:9243]]
My code for writing data from Spark to Elasticsearch:
data
  .write
  .format("org.elasticsearch.spark.sql")
  .option("es.nodes", node)                   // Elastic Cloud endpoint
  .option("es.port", port)                    // 9243 for Elastic Cloud
  .option("es.net.http.auth.user", username)
  .option("es.net.http.auth.pass", password)
  .option("es.net.ssl", "true")               // Elastic Cloud requires HTTPS
  .option("es.nodes.wan.only", "true")        // route all traffic through the declared node
  .option("es.mapping.id", "id")              // use the "id" column as the document _id
  .mode(writingMode)
  .save(index)
Any help or suggestion would be appreciated!
When you run spark-submit, try tuning the driver-memory and executor-memory parameters.
The following works for my setup; since I don't know your system's specifications, you can experiment with higher values. The issue is most likely the amount of load you are putting on the system, not your Spark-to-Elasticsearch connection.
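A minimal sketch of what that tuning could look like. The class name, jar, and memory values below are placeholders, not my actual settings; adjust them to what your cluster can provide:

```shell
# Hypothetical spark-submit invocation: give the driver and executors
# more memory for a heavy bulk-write job. All names and sizes are
# illustrative placeholders.
spark-submit \
  --class com.example.EsBulkWriter \
  --driver-memory 8g \
  --executor-memory 8g \
  app.jar
```

If memory alone does not help, elasticsearch-hadoop also exposes es.batch.size.entries and es.batch.size.bytes, which shrink each bulk request and so reduce the load placed on the Elasticsearch cluster per write.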