How to add to classpath of running PySpark session

I have a PySpark notebook running in AWS EMR. In my specific case, I want to use pyspark2pmml to create a PMML file for a model I just trained. However, I get the following error (it happens when running pyspark2pmml.PMMLBuilder, but I don't think the specific call matters).

JPMML-SparkML not found on classpath
Traceback (most recent call last):
  File "/tmp/1623111492721-0/lib/python3.7/site-packages/pyspark2pmml/__init__.py", line 14, in __init__
    raise RuntimeError("JPMML-SparkML not found on classpath")
RuntimeError: JPMML-SparkML not found on classpath
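
For context, the call that triggers this is roughly the standard PMMLBuilder usage; the DataFrame and fitted PipelineModel names below are just placeholders for my own objects:

from pyspark2pmml import PMMLBuilder

# training_df and pipeline_model stand in for my own training data and fitted PipelineModel
pmml_builder = PMMLBuilder(sc, training_df, pipeline_model)  # raises the RuntimeError above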

I know that this is caused by my Spark session not having a reference to the needed class on its classpath. What I don't know is how to start a Spark session with that class available. I found one other answer that uses %%configure -f, but it changed other settings, which in turn kept me from using sc.install_pypi_package, which I also needed.

Is there a way that I could have started the Spark session with that JPMML class available, but without changing any other settings?

1 Answer

So, here's an answer, but not the one I want.

To add that class to the classpath, I can start my work with this:

%%configure -f
{
    "jars": [
        "{some_path_to_s3}/jpmml-sparkml-executable-1.5.13.jar"
    ]
}

That creates the issue I referenced above, where I lose the ability to use sc.install_pypi_package. However, I can add that package in a more manual way. The first step was to create a zip file of just the needed modules from the project's GitHub repository (in this case, just the pyspark2pmml directory, rather than the whole repository zip); a sketch of building that zip is shown after the addPyFile call below. Then that module can be added using sc.addPyFile:

sc.addPyFile('{some_path_to_s3}/pyspark2pmml.zip')
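
For reference, here is a minimal sketch of how that zip can be built locally before uploading it to S3, assuming a local clone of the pyspark2pmml repository (the directory names are placeholders):

import shutil

# Archive only the pyspark2pmml package directory so it sits at the root of the zip,
# which is what sc.addPyFile expects; 'pyspark2pmml-master' is a placeholder for the local clone
shutil.make_archive('pyspark2pmml', 'zip', root_dir='pyspark2pmml-master', base_dir='pyspark2pmml')

The resulting pyspark2pmml.zip then gets uploaded to the S3 path passed to sc.addPyFile.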

After this, I can run the original commands exactly as I expected.
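
For completeness, those commands look roughly like the standard pyspark2pmml usage; training_df and pipeline_model are placeholders for my own DataFrame and fitted PipelineModel:

from pyspark2pmml import PMMLBuilder

# With the JPMML-SparkML jar on the classpath and pyspark2pmml.zip added via addPyFile,
# the builder now initializes and can write the model out as PMML
pmml_builder = PMMLBuilder(sc, training_df, pipeline_model)
pmml_builder.buildFile("model.pmml")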