Exponential backoff policy for Spark on Azure Storage


I have Spark jobs on k8s which read and write Parquet files from Azure Storage (blobs). I recently learned that Azure Storage enforces limits on the number of transactions per second, and my pipeline is exceeding those limits.
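For context, the jobs follow roughly this pattern (a minimal sketch; the account, container, paths, and transformation below are placeholders, not my real names, and I use the ABFS scheme here though the storage is plain blobs):

```python
from pyspark.sql import SparkSession

# Minimal sketch of the jobs' I/O pattern; account, container, and
# paths are placeholders, and the filter is a stand-in transformation.
spark = SparkSession.builder.appName("parquet-job").getOrCreate()

df = spark.read.parquet("abfss://data@myaccount.dfs.core.windows.net/input/")
out = df.filter(df["value"] > 0)  # stand-in for the real transformations
out.write.mode("overwrite").parquet("abfss://data@myaccount.dfs.core.windows.net/output/")
```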

This results in throttling, and some tasks in my jobs take 8-10x the usual time (it isn't data skew). One of the recommendations was to apply an exponential backoff policy, but I have not found any such setting in Spark's own configuration.
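The closest lead I have is that the retry tuning may live in the Hadoop Azure connector rather than in Spark itself. Below is a sketch of what I would try, assuming the `fs.azure.io.retry.*` keys documented for the hadoop-azure ABFS connector apply to my setup (the values are just the documented defaults, shown for illustration; Spark forwards `spark.hadoop.*` properties to the underlying Hadoop configuration):

```python
from pyspark.sql import SparkSession

# Sketch only: the fs.azure.io.retry.* keys are retry/backoff settings
# documented for the hadoop-azure ABFS connector; I am assuming they
# apply here. Spark copies any "spark.hadoop.*" property into the
# Hadoop configuration used by the filesystem connector.
spark = (
    SparkSession.builder
    .appName("abfs-backoff-test")
    # Max retries before a throttled request fails for good.
    .config("spark.hadoop.fs.azure.io.retry.max.retries", "30")
    # Base, minimum, and maximum backoff intervals (milliseconds)
    # for the connector's exponential retry policy.
    .config("spark.hadoop.fs.azure.io.retry.backoff.interval", "3000")
    .config("spark.hadoop.fs.azure.io.retry.min.backoff.interval", "3000")
    .config("spark.hadoop.fs.azure.io.retry.max.backoff.interval", "30000")
    .getOrCreate()
)
```

I can't confirm whether these keys actually govern the throttling retries I'm hitting, which is partly what I'm asking.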

Has anyone faced a similar situation? Any help on this would truly be appreciated.

