kubernetes: failed to load existing certificate apiserver-etcd-client:


My cluster certificates have expired, and now I cannot run any kubectl commands.

root@node1:~# kubectl get ns
Unable to connect to the server: x509: certificate has expired or is not yet valid
root@node1:~# 

I created this cluster using Kubespray. The kubeadm version is v1.16.3 and kubernetesVersion is v1.16.3.

root@node1:~# kubeadm alpha certs check-expiration
failed to load existing certificate apiserver-etcd-client: open /etc/kubernetes/pki/apiserver-etcd-client.crt: no such file or directory
To see the stack trace of this error execute with --v=5 or higher
root@node1:~# 
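
Since kubeadm itself fails here, the expiry dates can also be read straight from the certificate files with openssl. The following is a minimal sketch, assuming openssl is installed on the node; the helper name cert_enddate is made up for illustration and is not part of kubeadm or Kubespray.

```shell
#!/bin/sh
# cert_enddate FILE
# Prints "FILE: notAfter=<date>" for a PEM certificate.
# If FILE does not exist, prints "FILE: missing" and returns 1.
# (cert_enddate is a hypothetical helper name, not a kubeadm command.)
cert_enddate() {
    if [ ! -f "$1" ]; then
        echo "$1: missing"
        return 1
    fi
    printf '%s: ' "$1"
    # -enddate prints the notAfter field; -noout suppresses the PEM body.
    openssl x509 -noout -enddate -in "$1"
}
```

For example, running `for f in /etc/kubernetes/pki/*.crt; do cert_enddate "$f"; done` should print one notAfter line per certificate in that directory.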

It turns out that the apiserver-etcd-client.crt and apiserver-etcd-client.key files are missing from the /etc/kubernetes/pki directory.

root@node1:/etc/kubernetes/pki# ls -ltr
total 72
-rw------- 1 root root 1679 Jan 24 2020 ca.key
-rw-r--r-- 1 root root 1025 Jan 24 2020 ca.crt
-rw-r----- 1 root root 1679 Jan 24 2020 apiserver.key.old
-rw-r----- 1 root root 1513 Jan 24 2020 apiserver.crt.old
-rw------- 1 root root 1679 Jan 24 2020 apiserver.key
-rw-r--r-- 1 root root 1513 Jan 24 2020 apiserver.crt
-rw------- 1 root root 1675 Jan 24 2020 apiserver-kubelet-client.key
-rw-r--r-- 1 root root 1099 Jan 24 2020 apiserver-kubelet-client.crt
-rw-r----- 1 root root 1675 Jan 24 2020 apiserver-kubelet-client.key.old
-rw-r----- 1 root root 1099 Jan 24 2020 apiserver-kubelet-client.crt.old
-rw------- 1 root root 1679 Jan 24 2020 front-proxy-ca.key
-rw-r--r-- 1 root root 1038 Jan 24 2020 front-proxy-ca.crt
-rw-r----- 1 root root 1675 Jan 24 2020 front-proxy-client.key.old
-rw-r----- 1 root root 1058 Jan 24 2020 front-proxy-client.crt.old
-rw------- 1 root root 1675 Jan 24 2020 front-proxy-client.key
-rw-r--r-- 1 root root 1058 Jan 24 2020 front-proxy-client.crt
-rw------- 1 root root 451 Jan 24 2020 sa.pub
-rw------- 1 root root 1679 Jan 24 2020 sa.key
root@node1:/etc/kubernetes/pki#
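
A quick way to confirm which files are absent is a small existence check. This is only a sketch: missing_files is a hypothetical helper name, and the file names passed to it in the usage note are the two named in this question.

```shell
#!/bin/sh
# missing_files DIR NAME...
# Prints every NAME that is not a regular file inside DIR.
# Returns 0 when nothing is missing, 1 otherwise.
# (missing_files is a hypothetical helper name used for illustration.)
missing_files() {
    dir=$1
    shift
    status=0
    for name in "$@"; do
        if [ ! -f "$dir/$name" ]; then
            echo "$name"
            status=1
        fi
    done
    return $status
}
```

Usage: `missing_files /etc/kubernetes/pki apiserver-etcd-client.crt apiserver-etcd-client.key`. Because the kubeadm config below sets certificatesDir to /etc/kubernetes/ssl, it may be worth running the same check against that directory as well.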

I have tried the following commands, but none of them worked; each one fails with an error:

#sudo kubeadm alpha certs renew all
#kubeadm alpha phase certs apiserver-etcd-client
#kubeadm alpha certs apiserver-etcd-client --config /etc/kubernetes/kubeadm-config.yaml

Kubespray command:

#ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml

That command ended with the following error:

FAILED! => {"attempts": 5, "changed": true, "cmd": ["/usr/local/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"], "delta": "0:01:15.058756", "end": "2021-02-05 13:32:51.656901", "msg": "non-zero return code", "rc": 1, "start": "2021-02-05 13:31:36.598145", "stderr": "timed out waiting for the condition\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["timed out waiting for the condition", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "", "stdout_lines": []}

# cat /etc/kubernetes/kubeadm-config.yaml 
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: master1_IP
  bindPort: 6443
certificateKey: xxx
nodeRegistration:
  name: node1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  criSocket: /var/run/dockershim.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: cluster.local
etcd:
  external:
      endpoints:
      - https://master1:2379
      - https://master2:2379
      - https://master3:2379
      caFile: /etc/ssl/etcd/ssl/ca.pem
      certFile: /etc/ssl/etcd/ssl/node-node1.pem
      keyFile: /etc/ssl/etcd/ssl/node-node1-key.pem
dns:
  type: CoreDNS
  imageRepository: docker.io/coredns
  imageTag: 1.6.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: IP/18
  podSubnet: IP/18
kubernetesVersion: v1.16.3
controlPlaneEndpoint: master1_IP:6443
certificatesDir: /etc/kubernetes/ssl
imageRepository: gcr.io/google-containers
apiServer:

There are 2 best solutions below


First, renew the expired certificates with kubeadm:

kubeadm alpha certs renew apiserver
kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew front-proxy-client
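
The three renewals above can be wrapped in a loop that stops at the first failure, which makes it easier to see which certificate could not be renewed. This is a rough sketch rather than an official procedure: renew_certs and the KUBEADM override variable are hypothetical names introduced here, and any extra arguments are simply passed through to kubeadm.

```shell
#!/bin/sh
# KUBEADM can be overridden (for example KUBEADM=echo) to preview the
# commands without running them. The variable name is an assumption of
# this sketch, not something kubeadm reads.
KUBEADM=${KUBEADM:-kubeadm}

# renew_certs [EXTRA_ARGS...]
# Renews the three certificates named in this answer, one at a time, and
# returns 1 as soon as one renewal fails. EXTRA_ARGS are appended to every
# kubeadm call unchanged.
renew_certs() {
    for cert in apiserver apiserver-kubelet-client front-proxy-client; do
        $KUBEADM alpha certs renew "$cert" "$@" || {
            echo "renewal of $cert failed" >&2
            return 1
        }
    done
}
```

Running `KUBEADM=echo; renew_certs` only prints the commands. The config in the question sets certificatesDir to /etc/kubernetes/ssl, so it may be necessary to call `renew_certs --config /etc/kubernetes/kubeadm-config.yaml`, assuming this kubeadm version accepts --config for the renew subcommands (check `kubeadm alpha certs renew --help`).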

Next generate new kubeconfig files:

kubeadm alpha kubeconfig user --client-name kubernetes-admin --org system:masters > /etc/kubernetes/admin.conf
kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
# Instead of $(hostname), you may need to pass the master node name exactly as it appears in the /etc/kubernetes/kubelet.conf file.
kubeadm alpha kubeconfig user --client-name system:node:$(hostname) --org system:nodes > /etc/kubernetes/kubelet.conf
kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
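
One caveat with the redirections above: the shell truncates the target file before kubeadm starts, so a failed command leaves an empty kubeconfig behind. A possible safeguard, sketched here with a hypothetical helper named safe_write, is to write to a temporary file first and replace the target only on success.

```shell
#!/bin/sh
# safe_write TARGET COMMAND [ARGS...]
# Runs COMMAND with its stdout captured in a temporary file next to TARGET.
# TARGET is replaced only if COMMAND exits with status 0 and wrote something;
# otherwise TARGET is left untouched and the function returns 1.
# Note: mktemp creates the file with mode 0600, which the replaced TARGET
# then inherits.
# (safe_write is a hypothetical helper name used for illustration.)
safe_write() {
    target=$1
    shift
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        echo "command failed, $target left untouched" >&2
        return 1
    fi
}
```

Usage with the first command from this answer: `safe_write /etc/kubernetes/admin.conf kubeadm alpha kubeconfig user --client-name kubernetes-admin --org system:masters`.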

Copy the new kubernetes-admin kubeconfig file:

cp /etc/kubernetes/admin.conf ~/.kube/config

Finally, restart kube-apiserver, kube-controller-manager and kube-scheduler. You can use the commands below or simply reboot the master node:

sudo kill -s SIGHUP $(pidof kube-apiserver)
sudo kill -s SIGHUP $(pidof kube-controller-manager)
sudo kill -s SIGHUP $(pidof kube-scheduler)

You can find more information on GitHub, and this answer may also be helpful.


In my case I use AKS (Azure Kubernetes Service). To fix this error, I ran the command:

az aks rotate-certs -g $RESOURCE_GROUP_NAME -n $CLUSTER_NAME

See: https://learn.microsoft.com/en-us/azure/aks/certificate-rotation