I have PySpark code that trains an H2O DRF model. I need to save this model to disk and then load it back.
from pysparkling.ml import H2ODRF

drf = H2ODRF(featuresCols=predictors,
             labelCol=response,
             columnsToCategorical=[response])
I cannot find any documentation on this, so I am asking here.
I think the section of the docs on deploying pipeline models might be relevant: https://docs.h2o.ai/sparkling-water/2.3/latest-stable/doc/deployment/pysparkling_pipeline.html
Depending on your use case, pipelines may not be what you're looking for.
Something like the following might work for your use case.
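A minimal sketch, assuming the fitted model exposes the standard Spark ML persistence interface (`write().overwrite().save(path)` on the model and a `load(path)` classmethod on its class), which is what the linked docs page relies on for pipelines. The helper names, `training_df`, and `model_dir` are hypothetical, and depending on your Sparkling Water version you may need a running H2OContext before calling `fit`. I have not tested this against your setup.

```python
def save_fitted_model(fitted_model, model_dir):
    """Persist a fitted Spark ML model, replacing anything already at model_dir."""
    # write() returns a writer; overwrite() allows replacing an existing directory.
    fitted_model.write().overwrite().save(model_dir)
    return model_dir


def load_fitted_model(model_class, model_dir):
    """Load a model back using the load() classmethod of the given class."""
    return model_class.load(model_dir)


def train_save_reload(drf, training_df, model_dir):
    """Fit drf inside a one-stage Pipeline, save the result, and load it back.

    drf is the H2ODRF estimator from the question; training_df is a
    hypothetical Spark DataFrame holding the predictor and response columns.
    """
    # Imported here so the two helpers above can be used without this wrapper.
    from pyspark.ml import Pipeline, PipelineModel

    fitted = Pipeline(stages=[drf]).fit(training_df)
    save_fitted_model(fitted, model_dir)
    return load_fitted_model(PipelineModel, model_dir)
```

You would then call something like `reloaded = train_save_reload(drf, training_df, "/tmp/drf_model")` and score with `reloaded.transform(test_df)`. If you do not want the Pipeline wrapper, the two small helpers can be tried directly on the model returned by `drf.fit(training_df)`, passing that model's class to `load_fitted_model`, provided your version supports loading it that way.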