How to free up Spark worker container memory after jobs finish


I am running Spark standalone in Kubernetes, and I have a PySpark application that connects to the master using SparkSession. The app loads around 4 GB of JSON files and runs some SQL queries.
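Roughly, each run of the job is structured like this (the master URL, paths, and query are placeholders, not my real values):

```python
from pyspark.sql import SparkSession

# Placeholder master URL and app name -- the real values differ.
spark = (
    SparkSession.builder
    .master("spark://spark-master:7077")
    .appName("hourly-json-job")
    .getOrCreate()
)

# Load roughly 4 GB of JSON (path is illustrative).
df = spark.read.json("/data/input/*.json")
df.createOrReplaceTempView("events")

# Run some SQL and write the result out (query is illustrative).
result = spark.sql("SELECT some_key, count(*) AS n FROM events GROUP BY some_key")
result.write.mode("overwrite").parquet("/data/output/hourly")

# The session is stopped at the end of every run.
spark.stop()
```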

If I restart the workers, they use around 400-500 MB of RAM in the worker container. When I start my application the memory climbs to around 4-5 GB, and after the application finishes it only drops by around 1 GB. How can I get the worker to release all of its memory?

My app does not cache or persist any DataFrames.
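To rule out anything lingering in the block manager, the only extra cleanup I could think of is clearing the cache defensively before stopping the session at the end of each run, something like the sketch below (I have not confirmed this changes anything):

```python
# Defensive cleanup at the end of a run (sketch only):
# drop any tables that might have been cached implicitly.
spark.catalog.clearCache()
spark.stop()
```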

My problem is that the app runs hourly; after some number of runs the worker pods restart, and the job loses its connection until it switches to a new worker.

You can see this in the Grafana graph below.

[Grafana graph of worker container memory usage over time]
