How to specify the H2O version in Sparkling Water?

In a Databricks notebook, I am trying to load an H2O model that was trained for H2O version 3.30.1.3.

I have installed the version of Sparkling Water which corresponds to the Spark version used for the model training (3.0), h2o-pysparkling-3.0, which I pulled from PyPI.

The Sparkling Water server is using the latest version of H2O rather than the version I need. Maybe there is a way to specify the H2O version when I initiate the Sparkling Water context? Something like this:

import h2o
from pysparkling import H2OContext
from pysparkling.ml import H2OBinaryModel

hc = H2OContext.getOrCreate(h2o_version='3.30.1.3')
model = H2OBinaryModel.read('s3://bucket/model_file')

When I run the above code without an argument to H2OContext.getOrCreate(), I get this error:

IllegalArgumentException: 
 The binary model has been trained in H2O of version
 3.30.1.3 but you are currently running H2O version of 3.34.0.6.
 Please make sure that running Sparkling Water/H2O-3 cluster and the loaded binary
 model correspond to the same H2O-3 version.

Where is the Python API reference for Sparkling Water? If I could find it, I might be able to determine whether the context initializer takes an H2O version argument, but surprisingly I have not been able to find it so far, either with Google or by poking around in the docs.

Or is this something that's instead handled by installing an H2O version-specific build of Sparkling Water? Or perhaps there's another relevant configuration setting someplace?

There is 1 best solution below

Karthikeyan Rasipalay Durairaj

Did you try notebook-scoped libraries? Notebook-scoped libraries let you create, modify, save, reuse, and share custom Python environments that are specific to a notebook. When you install a notebook-scoped library, only the current notebook and any jobs associated with that notebook have access to that library. Other notebooks attached to the same cluster are not affected. For reference: link

Limitations: Notebook-scoped libraries do not persist across sessions. You must reinstall notebook-scoped libraries at the beginning of each session, or whenever the notebook is detached from a cluster.
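
As a rough sketch of how this could apply to the question: Sparkling Water release versions start with the H2O-3 version they bundle (for example 3.34.0.6-1 bundles H2O 3.34.0.6), so installing a pinned PySparkling release as a notebook-scoped library should bring the matching H2O along with it. The snippet below only checks which H2O version the installed package bundles. The helper names are made up for illustration, and the "-1" patch suffix in the commented %pip line is an assumption, so check PyPI for the release that actually exists for 3.30.1.3.

```python
import re
from importlib import metadata
from typing import Optional


def bundled_h2o_version(sw_version: str) -> str:
    """Return the H2O-3 version embedded in a Sparkling Water version string.

    Assumes the '<H2O version>-<patch>' naming, e.g. '3.30.1.3-1'
    (pip may report the same release as '3.30.1.3.post1').
    """
    match = re.match(r"\d+\.\d+\.\d+\.\d+", sw_version)
    if match is None:
        raise ValueError(f"unrecognised Sparkling Water version: {sw_version!r}")
    return match.group(0)


def installed_sparkling_water(package: str = "h2o-pysparkling-3.0") -> Optional[str]:
    """Version of the installed PySparkling distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


REQUIRED_H2O = "3.30.1.3"  # H2O version the binary model was trained with

# In a Databricks notebook cell, the pin itself would look something like
# this (the '-1' patch suffix is an assumption, not a confirmed release):
#   %pip install h2o-pysparkling-3.0==3.30.1.3-1

installed = installed_sparkling_water()
if installed is None:
    print("PySparkling is not installed in this environment")
elif bundled_h2o_version(installed) != REQUIRED_H2O:
    print(f"Installed {installed} bundles H2O {bundled_h2o_version(installed)}, "
          f"but the model needs {REQUIRED_H2O}")
else:
    print(f"Installed {installed} matches H2O {REQUIRED_H2O}")
```

Run the check after the %pip cell and before calling H2OContext.getOrCreate(), so a mismatch shows up before the model load fails. Because of the limitation above, the %pip cell has to be re-run in every new session.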