How to run PySpark with installed packages?


Normally, when I run pyspark with graphframes I have to use this command:

pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12

The first time I run this, the graphframes package is downloaded and installed; on subsequent runs the cached package is reused. In my .bashrc file, I have already added:

export SPARK_OPTS="--packages graphframes:graphframes:0.8.1-spark3.0-s_2.12"

But I cannot import the package unless I add the --packages option.

How can I run pyspark with graphframes using just this simple command?

pyspark

1 Answer


You can make a wrapper script, e.g. myspark.sh, that runs pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 for you. That would be the simplest solution.
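A minimal sketch of such a wrapper (the name myspark.sh is just an example; it assumes pyspark is on your PATH, and the package coordinate is the one from the question):

```shell
# Create the wrapper script and make it executable
cat > myspark.sh <<'EOF'
#!/usr/bin/env bash
# Launch pyspark with the graphframes package, forwarding any extra arguments
exec pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 "$@"
EOF
chmod +x myspark.sh
```

After placing it somewhere on your PATH (or defining `alias pyspark=~/myspark.sh` in .bashrc), running `./myspark.sh` starts the shell with graphframes available, and any extra flags are passed through thanks to `"$@"`.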