Jenkins on Kubernetes: how to prevent ClosedChannelException and keep the pod agent alive


I am using Jenkins on Kubernetes, and sometimes the connection to the agent pod is lost with a ClosedChannelException. In our experience this happens for one of two reasons:

  • A random dropped connection to the pod, especially when the pod is running large, resource-hungry processes. The pod still exists, but the connection is lost
  • Pod eviction due to kubectl drain, run by the cluster owner to do maintenance on the node

I'd like to ask for some ideas on how to prevent this. I am not a Jenkins expert, so any pointers on what is possible would help.

  • For the first point, can I somehow configure the Jenkins controller to reconnect to an agent whose connection was lost?
  • For the second point, I know that Kubernetes has a preStop hook that runs a command before the pod is terminated. Is it possible (through this hook, or some other way) to tell the control plane not to evict the pod until it has finished its work?
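For point 2, the closest I have gotten is an (untested) preStop hook on the agent container that simply waits while the agent process is still running. As I understand it, this can only delay termination up to terminationGracePeriodSeconds, not block the eviction outright. The process name "jenkins-agent" and the availability of pgrep in the image are assumptions:

```yaml
# Sketch only: added to the jnlp container of the agent pod template.
# Assumes pgrep exists in the image and the agent entrypoint matches
# "jenkins-agent" -- adjust both to the image you actually use.
lifecycle:
  preStop:
    exec:
      command:
        - sh
        - -c
        - "while pgrep -f jenkins-agent >/dev/null; do sleep 5; done"
```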

For context, I am also deploying Jenkins using Jenkins Configuration as Code.
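For completeness, my cloud definition in JCasC looks roughly like the sketch below (URLs and names are placeholders, and the field names follow the Kubernetes plugin's JCasC schema as I understand it). I mention it because the WebSocket transport and the timeout/retention settings seem like the knobs most likely to matter here:

```yaml
jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"
        serverUrl: "https://kubernetes.default"                       # placeholder
        jenkinsUrl: "http://jenkins.jenkins.svc.cluster.local:8080"   # placeholder
        webSocket: true       # WebSocket transport is reportedly more tolerant of dropped TCP connections than plain JNLP
        connectTimeout: 10
        readTimeout: 15
        retentionTimeout: 5   # minutes before an unreachable agent pod is reaped
```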

1 Answer

Here is how I fixed the problem: by trapping SIGTERM in the agent pod spec. The trap keeps the agent process alive and connected to the Jenkins controller. Even if you drain the Kubernetes node, the Jenkins agent Java process stays alive until the end of the termination grace period, which is 120 seconds in the example below.

apiVersion: v1
kind: Pod
spec:
  terminationGracePeriodSeconds: 120
  containers:
  - name: jnlp
    command:
      - sh
      - -xc
      - |
        # Trap TERM so the shell keeps running (and the agent it launches keeps
        # its connection) until terminationGracePeriodSeconds expires.
        # POSIX sh expects the signal name without the SIG prefix.
        trap 'echo "Pod received SIGTERM; pipeline will continue until terminationGracePeriodSeconds ends: $(date +"%d-%m-%Y %T.%N %Z")"' TERM
        /usr/local/bin/jenkins-agent
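If you additionally want kubectl drain to wait for running builds instead of just delaying SIGTERM, note that drain goes through the eviction API, which honors PodDisruptionBudgets. A budget that allows zero disruptions over the agent pods makes the drain block and retry until those pods finish on their own. The label selector below is an assumption; match it to the labels your pod template actually sets:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: jenkins-agents
spec:
  maxUnavailable: 0      # no voluntary evictions allowed while a matching pod exists
  selector:
    matchLabels:
      jenkins: agent     # assumption: set this label in the agent pod template
```

Be aware this makes node maintenance wait indefinitely while agent pods exist, and it does not protect against involuntary terminations (node pressure, hardware failure).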