We (an engineering team) are running an EMR cluster with YARN and Spark. What is typically happening is that when one user submits a heavy memory intensive job, it grabs all the YARN available memory and then all the subsequent users submitted jobs have to wait for that memory to clear (I know that autoscaling will solve this problem to a certain extent and we are looking into that, but we would like to avoid a single user occupying all the memory even when the cluster is autoscaled to it's full limits).
Is there a way to configure YARN such that any application (Spark or otherwise) may not occupy more than, say 75% of available memory?
Thanks
According to the documentation, you can manage the amount of memory allocated to an executor using the parameter: spark.executor.memory