How do I create a multi-node Spark Connect server cluster?


I'm using Docker Compose. This is my spark-connect service:

  spark-connect:
    hostname: spark-connect
    container_name: spark-connect
    image: bitnami/spark:latest
    command: ["./sbin/start-connect-server.sh", "--packages", "org.apache.spark:spark-connect_2.12:3.5.1"]
    ports:
      - "15002:15002"

    #depends_on:
    #  - spark-master
    #environment:
    #  SPARK_MASTER_URL: spark://spark-master:7077

    networks:
      - wba-network

I tried to create a cluster with a master and two workers and connect my Spark Connect server to that cluster, but it didn't work.
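Roughly, the master and worker services I added looked like this (a sketch; the container names, the published ports, and the Bitnami image's SPARK_MODE/SPARK_MASTER_URL environment variables are the details I'm not sure I had right):

  spark-master:
    hostname: spark-master
    container_name: spark-master
    image: bitnami/spark:latest
    environment:
      SPARK_MODE: master
    ports:
      - "8080:8080"   # master web UI
      - "7077:7077"   # cluster RPC port workers connect to
    networks:
      - wba-network

  spark-worker-1:
    container_name: spark-worker-1
    image: bitnami/spark:latest
    environment:
      SPARK_MODE: worker
      SPARK_MASTER_URL: spark://spark-master:7077
    depends_on:
      - spark-master
    networks:
      - wba-network

  spark-worker-2:
    container_name: spark-worker-2
    image: bitnami/spark:latest
    environment:
      SPARK_MODE: worker
      SPARK_MASTER_URL: spark://spark-master:7077
    depends_on:
      - spark-master
    networks:
      - wba-network

I also tried pointing the connect server at the master by changing its command (start-connect-server.sh forwards options to spark-submit, so --master should be accepted):

    command: ["./sbin/start-connect-server.sh", "--master", "spark://spark-master:7077", "--packages", "org.apache.spark:spark-connect_2.12:3.5.1"]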

I'm using JupyterLab, and with this code I was able to successfully use the Spark Connect server:

from pyspark.sql import SparkSession

# Stop any existing local session first; a remote session can't be
# created while a classic (non-Connect) session is still active
SparkSession.builder.master("local[*]").getOrCreate().stop()

# Connect to the Spark Connect server over gRPC
spark = SparkSession.builder.remote("sc://spark-connect:15002").getOrCreate()

columns = ["id", "name"]
data = [(1, "Sarah"), (2, "Maria")]
df = spark.createDataFrame(data).toDF(*columns)
df.show()

So, how do I add more nodes to my Spark Connect server?
