I am getting an error in PySpark while running an H2O model prediction:
File "/usr/spark/python/pyspark/cloudpickle.py", line 562, in subimport
ModuleNotFoundError: No module named 'h2o'
I created a pandas UDF:
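`def predict_h2o_model(*cols):
    # Combine the input columns (pandas Series) into a single pandas DataFrame
    x = pd.concat(cols, axis=1)
    h2odataframe = h2o.H2OFrame(x)
    scores = model.predict(h2odataframe)
    # predict() returns an H2OFrame; take its first column as a pandas Series
    return scores.as_data_frame().iloc[:, 0]`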
I am scoring a PySpark DataFrame with it:
`df_scores = spark_df.select(F.col("cust_id"), predict_h2o_model(*cols).alias("model_score"))`
I was expecting the H2O model scores alongside `cust_id` from the spark_df DataFrame.
Please add `import h2o` before you call anything from the h2o package.
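The traceback comes from cloudpickle's `subimport`, which runs when the pickled function is loaded again, most likely on a worker. So the import probably also has to succeed in the Python environment that the executors use, not only on the driver. Here is a minimal sketch for checking that, using only the standard library. The helper names `module_available` and `describe_environment` are made up for this illustration and are not part of PySpark or h2o.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found by this interpreter."""
    # find_spec returns None when a top-level module cannot be located
    return importlib.util.find_spec(name) is not None


def describe_environment(name: str = "h2o") -> dict:
    """Report which interpreter is running and whether `name` is importable in it."""
    return {
        "python": sys.executable,  # path of the Python binary executing this code
        "module": name,
        "available": module_available(name),
    }
```

Calling `describe_environment("h2o")` on the driver shows what the driver sees. To see what the workers see, you could map it over a small RDD, for example `spark.sparkContext.parallelize(range(4), 4).map(lambda _: describe_environment("h2o")).collect()`, assuming `spark` is your active SparkSession. If `available` comes back as `False` there, h2o is not installed in the Python environment the executors run.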