Code:
import pandas as pd
from pyspark.sql import SparkSession
from pysparkling import *
import h2o
from pysparkling.ml import H2OAutoML
spark = SparkSession.builder.appName('SparkApplication').getOrCreate()
hc = H2OContext.getOrCreate()
Spark-submit Command:
spark-submit --master spark://local:7077 --py-files sparkling-water-3.36.1.3-1-3.2/py/h2o_pysparkling_3.2-3.36.1.3-1-3.2.zip --conf "spark.ext.h2o.backend.cluster.mode=external" --conf spark.ext.h2o.external.start.mode="auto" --conf spark.ext.h2o.external.h2o.driver="/home/whiz/spark/h2odriver-3.36.1.3.jar" --conf spark.ext.h2o.external.cluster.size=2 spark_h20/h2o_script.py
Error Logs:
py4j.protocol.Py4JJavaError: An error occurred while calling o58.getOrCreate. : java.io.IOException: Cannot run program "hadoop": error=2, No such file or directory
Answer:
The automatic start of the Sparkling Water external backend is only supported in Hadoop or Kubernetes environments. With spark.ext.h2o.external.start.mode="auto", Sparkling Water tries to launch the external H2O cluster through the hadoop command, which is why the driver fails with 'Cannot run program "hadoop"' on a standalone Spark cluster. In a standalone deployment, you need to deploy the external backend manually, following the tutorial in the Sparkling Water documentation.
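A minimal sketch of what the manual setup could look like. The spark.ext.h2o.* configuration keys below (external.start.mode=manual, cloud.name, cloud.representative) are the documented options for the manual external backend; the extended H2O jar path, cluster name, and node IP are placeholders you would need to adapt to your own installation:

```shell
# 1) Start the external H2O cluster by hand on each H2O node, using the
#    extended H2O jar shipped with your Sparkling Water distribution
#    (jar path below is an assumption -- check your install):
java -jar sparkling-water-3.36.1.3-1-3.2/jars/h2o-extended.jar \
  -name external-cluster -port 54321

# 2) Submit the Spark job in manual mode, pointing Sparkling Water at
#    the already-running H2O cluster instead of letting it auto-start:
spark-submit --master spark://local:7077 \
  --py-files sparkling-water-3.36.1.3-1-3.2/py/h2o_pysparkling_3.2-3.36.1.3-1-3.2.zip \
  --conf spark.ext.h2o.backend.cluster.mode=external \
  --conf spark.ext.h2o.external.start.mode=manual \
  --conf spark.ext.h2o.cloud.name=external-cluster \
  --conf spark.ext.h2o.cloud.representative=<h2o-node-ip>:54321 \
  spark_h20/h2o_script.py
```

With this setup, H2OContext.getOrCreate() in the script connects to the named cluster rather than attempting to launch h2odriver via hadoop, so the original IOException should not occur.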