Running PySpark statements in AWS SageMaker Studio Lab


Folks: Apologies if this is a very basic question. I am trying out SageMaker Studio Lab (https://studiolab.sagemaker.aws).

After creating a new notebook, I noticed there is a choice of two kernels:

  • default:Python
  • sagemaker-distribution:Python

However, neither of these seems to support PySpark.

When I attempt to set up a PySpark session, I get an error saying that JAVA_HOME is not set.

import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .master("local") \
    .appName("My ML Application") \
    .config("fs.s3a.endpoint", "s3.amazonaws.com") \
    .getOrCreate()


JAVA_HOME is not set

Is there any way to code in PySpark using SageMaker Studio Lab?
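
For anyone reproducing this, a minimal diagnostic sketch may help narrow it down. PySpark starts a JVM, so the error suggests that the kernel either has no JDK or cannot locate one. The helper name `find_java_home` is hypothetical (not part of PySpark), and the suggestion to install a JDK with conda is an assumption about what the Studio Lab environment allows, not something confirmed here.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_java_home(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a directory usable as JAVA_HOME, or None if no JDK is found.

    Hypothetical helper for illustration; assumes a Linux-style layout
    where the launcher lives at <JAVA_HOME>/bin/java.
    """
    # 1. Respect an existing JAVA_HOME if it really contains bin/java.
    candidate = env.get("JAVA_HOME")
    if candidate and (Path(candidate) / "bin" / "java").exists():
        return candidate
    # 2. Otherwise derive it from a `java` executable found on PATH.
    java = shutil.which("java", path=env.get("PATH"))
    if java:
        # <JAVA_HOME>/bin/java -> <JAVA_HOME>
        return str(Path(java).resolve().parent.parent)
    return None


java_home = find_java_home()
if java_home is None:
    # Assumption: a JDK can be installed into the active conda environment,
    # e.g. with `conda install -c conda-forge openjdk`.
    print("No JDK found; install one before creating a SparkSession.")
else:
    # Must be set before SparkSession.builder...getOrCreate() is called.
    os.environ["JAVA_HOME"] = java_home
    print(f"JAVA_HOME set to {java_home}")
```

If this reports that no JDK was found, the kernel itself is missing Java and the session builder above will keep failing until one is installed. If it does find one, running the same builder code afterwards in the same kernel should get past the JAVA_HOME check.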
