Is writing to a database done by the driver or the executors in a Spark cluster?


I have a Spark cluster set up with 1 master node and 2 worker nodes. I am running a PySpark application in this Spark standalone cluster, where I have a job that writes the transformed data into a MySQL database.

So, my question is: is writing to the database done by the driver or the executors? I am asking because when writing to a text file, it seems to be done by the driver, since my output file gets created on the driver.

Updated

Adding below the code I used to write to a text file:

from pyspark import SparkConf, SparkContext

if __name__ == "__main__":
    sc = SparkContext(master="spark://IP:PORT", appName="word_count_application")
    words = sc.textFile("book_2.txt")
    # Split lines into words, pair each word with 1, and sum the counts per word
    word_count = words.flatMap(lambda a: a.split(" ")).map(lambda a: (a, 1)).reduceByKey(lambda a, b: a + b)
    # Writes one part file per partition into the book2_output.txt directory
    word_count.saveAsTextFile("book2_output.txt")

2 Answers


If the writing is done using the Dataset/DataFrame API, like this:

df.write.csv("...")

then it's done by the executors. That's why in Spark we get multiple files in the output: each executor writes the partitions assigned to it.

The driver is used for scheduling work across the executors, not for doing the actual work (reading, transforming, and writing), which is done by the executors.
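The same rule applies to the MySQL case in your question: a DataFrame JDBC write is executed by the executors, with each executor opening its own connection to write the partitions it holds. A minimal sketch (the host, database, table, and credentials below are placeholders, and the MySQL connector jar must be available on the cluster):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mysql_write_example").getOrCreate()
df = spark.createDataFrame([(1, "spark"), (2, "cluster")], ["id", "word"])

# The write runs on the executors; the driver only schedules the tasks.
(df.write
    .format("jdbc")
    .option("url", "jdbc:mysql://HOST:3306/DB")        # placeholder host/database
    .option("dbtable", "word_counts")                  # placeholder table name
    .option("user", "USER")                            # placeholder credentials
    .option("password", "PASSWORD")
    .option("driver", "com.mysql.cj.jdbc.Driver")
    .mode("append")
    .save())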


saveAsTextFile() is distributed as well: each executor writes its own files. Your driver will never write any files since, as @Abdennacer Lachiheb already mentioned, it is responsible for scheduling, the Spark UI, and more.

Your path refers to a local file system, so your files are not getting saved on your driver, but on the machine your driver runs on. The path could also point to an object store like S3 or a distributed file system like HDFS.
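To make that concrete, here is a sketch of the same save call against shared storage, so the output is reachable from any node rather than scattered across local disks (the namenode host and bucket name are assumptions):

from pyspark import SparkContext

sc = SparkContext(appName="save_to_shared_storage")
# Stand-in for the word_count RDD from the question
word_count = sc.parallelize([("spark", 3), ("cluster", 2)])

# Each executor writes its partitions directly to HDFS
word_count.saveAsTextFile("hdfs://NAMENODE:8020/output/book2_output")

# The same call works against S3, given the s3a connector on the classpath:
# word_count.saveAsTextFile("s3a://BUCKET/book2_output")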