SPARK_HOME config error with bitnami/spark and Zeppelin on Docker


I'm running into an error while setting up Spark with Zeppelin in Docker and would appreciate help resolving it.

My docker-compose.yml:

version: "3.7"

services:
  spark-master:
    image: bitnami/spark:latest
    container_name: spark-master
    command: bin/spark-class org.apache.spark.deploy.master.Master
    depends_on:
      - kafka
    ports:
      - "8080:8080"
      - "7077:7077"

  spark-worker-1:
    image: bitnami/spark:latest
    container_name: spark-worker-1
    command: bin/spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077
    depends_on:
      - spark-master
    environment:
      SPARK_MODE: worker
      SPARK_WORKER_CORES: 2
      SPARK_WORKER_MEMORY: 2g
      SPARK_MASTER_URL: spark://spark-master:7077

  spark-worker-2:
    image: bitnami/spark:latest
    container_name: spark-worker-2
    command: bin/spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077
    depends_on:
      - spark-master
    environment:
      SPARK_MODE: worker
      SPARK_WORKER_CORES: 2
      SPARK_WORKER_MEMORY: 2g
      SPARK_MASTER_URL: spark://spark-master:7077

  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"

  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181

  cassandra:
    image: cassandra:3.11.3
    container_name: cassandra
    ports:
      - 9042:9042

  zeppelin:
    image: apache/zeppelin:0.10.1
    container_name: zeppelin
    depends_on:
      - spark-master
    ports:
      - "8082:8080"
    environment:
      - "SPARK_HOME=/opt/bitnami/spark"
      - "SPARK_MASTER=spark://spark-master:7077"

I checked for the spark-submit file in SPARK_HOME with these commands:

  • C:\Users\ACER>docker exec -it 0d671530e12c /bin/bash
  • I have no name!@0d671530e12c:/opt/bitnami/spark$ cd bin/
  • I have no name!@0d671530e12c:/opt/bitnami/spark/bin$ ls -la

The listing includes a spark-submit.cmd file.

But when I run a simple %spark.pyspark paragraph in a Zeppelin note, I get this error:

org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "/opt/bitnami/spark/bin/spark-submit": error=2, No such file or directory
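One thing that may be worth ruling out: the prompt in the commands above starts in /opt/bitnami/spark, which suggests the check ran in one of the Spark containers, whereas the error comes from the Zeppelin interpreter, which resolves SPARK_HOME inside the zeppelin container. A minimal sketch of the same check, meant to be run inside the zeppelin container (for example after `docker exec -it zeppelin /bin/bash`, assuming the container name from the compose file). The function name `check_spark_submit` is hypothetical and used only for illustration:

```shell
# Hypothetical helper: reports whether spark-submit exists and is executable
# under the given Spark home (defaults to $SPARK_HOME).
check_spark_submit() {
  spark_home="${1:-$SPARK_HOME}"
  if [ -x "$spark_home/bin/spark-submit" ]; then
    echo "found: $spark_home/bin/spark-submit"
  else
    echo "missing: $spark_home/bin/spark-submit"
    return 1
  fi
}
```

Calling `check_spark_submit` with no argument uses the SPARK_HOME value set in the compose file (/opt/bitnami/spark). If it prints "missing" inside the zeppelin container, the path exists only in the Spark images and not where Zeppelin looks for it.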

I'm new to Docker, but my project requires its use. Could you provide some solutions for this issue? Thank you very much!
