Airflow container OOM-killed even though memory usage was below the requested amount


I am running Airflow 2.5.3 and I am tearing my hair out over this one:

A task in my DAG (a KubernetesPodOperator) was configured to request 3Gi of memory, and it runs a loop. At the bottom of the loop, I log the memory consumption up to that point.

Each iteration of the loop uses the same local variables, so memory allocated in the previous iteration should be garbage collected.

The log shows that, at the bottom of each iteration, the container's memory usage never exceeded 1Gi.

However, after running a number of iterations, the pod got OOM-killed.

What could be the reason?

For container memory usage, I looked at the cgroup memory data.
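
A minimal sketch of this kind of cgroup reading, not the exact code from the task. The helper names (`cgroup_memory`, `format_mib`) are hypothetical, and it assumes the cgroup files are visible under `/sys/fs/cgroup` inside the container. Which files exist depends on the cgroup version and kernel (the peak counter in particular may be missing), so every read is treated as optional.

```python
from pathlib import Path
from typing import Optional, Sequence

# Candidate file names, relative to the cgroup root.
# First entry is the cgroup v2 name, second the cgroup v1 name.
CURRENT_FILES = ("memory.current", "memory/memory.usage_in_bytes")
PEAK_FILES = ("memory.peak", "memory/memory.max_usage_in_bytes")


def _read_first_int(root: Path, names: Sequence[str]) -> Optional[int]:
    """Return the integer stored in the first readable file, or None."""
    for name in names:
        try:
            return int((root / name).read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a plain integer: try the next one.
            continue
    return None


def cgroup_memory(root: str = "/sys/fs/cgroup") -> dict:
    """Current and peak memory usage in bytes, as accounted by the cgroup."""
    base = Path(root)
    return {
        "current": _read_first_int(base, CURRENT_FILES),
        "peak": _read_first_int(base, PEAK_FILES),
    }


def format_mib(value: Optional[int]) -> str:
    """Render a byte count as MiB, or 'n/a' when the counter is unavailable."""
    return "n/a" if value is None else f"{value / 2**20:.1f} MiB"


if __name__ == "__main__":
    stats = cgroup_memory()
    print(f"current={format_mib(stats['current'])} peak={format_mib(stats['peak'])}")
```

Logging the peak counter next to the current value at the bottom of each iteration would show whether usage spikes in the middle of an iteration and drops again before the log line is written, which a single end-of-loop sample cannot reveal.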

How do I go about debugging this?
