K8s livenessProbe times out but not when run manually

I am running workloads on an EKS cluster (v1.25.16-eks), and some of my pods get restarted after their liveness probe times out, even though the service inside them is running fine:

Warning  Unhealthy  19m (x9 over 25h)     kubelet  Liveness probe failed: command "docker-healthcheck" timed out

timeoutSeconds is set to 1. This is the content of docker-healthcheck:

#!/bin/sh
set -e

if env -i REQUEST_METHOD=GET SCRIPT_NAME=/health SCRIPT_FILENAME=/health cgi-fcgi -bind -connect localhost:9000; then
    exit 0
fi

exit 1

My hot fix is to increase timeoutSeconds, which works well. But I cannot figure out why these probes time out, as they are simple exec probes that make a single FastCGI request (via cgi-fcgi) to localhost. They certainly take much less than 1 second when run manually from inside the container.
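
Since the failures are intermittent (9 in 25 hours), a single manual run may not be representative. A rough way to check the claim over many runs is sketched below. It is only a sketch: count_probe_failures is a hypothetical helper name, and it assumes the image provides a timeout command with coreutils-style syntax (timeout SECONDS COMMAND), which may not hold for every base image.

```shell
#!/bin/sh
# count_probe_failures RUNS LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD RUNS times, each under `timeout LIMIT`,
# and prints how many runs either failed or exceeded the limit.
# Assumes coreutils-style `timeout SECONDS COMMAND` is available.
count_probe_failures() {
    runs=$1
    limit=$2
    shift 2
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero status (command failure or timeout) counts as a failure.
        if ! timeout "$limit" "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, run from inside the container:
#   count_probe_failures 1000 1 docker-healthcheck
```

A non-zero count would suggest the script itself is occasionally slow or failing. A count of zero over a long run would point instead at something outside the script, such as how the probe is launched.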

I am running out of explanations and cannot find any EKS-specific reason. I looked for documented kubelet overhead when executing exec probes but could not find any. What could explain this behavior?

Thanks!
