etcd pod alternating between active and inactive


In our Kubernetes cluster we run etcd pods (image etcd:3.3.15). When we query cluster health, the cluster appears degraded: the peers keep switching between active and inactive, and I don't know why. The etcd log and the `etcdctl endpoint health` output below may help with diagnosis.

2021-03-17 13:39:32.775194 I | rafthttp: peer 68d1163017ce392 **became active**
2021-03-17 13:39:32.775221 I | rafthttp: established a TCP streaming connection with peer 68d1163017ce392 (stream Message writer)
2021-03-17 13:39:32.775437 I | rafthttp: established a TCP streaming connection with peer 68d1163017ce392 (stream MsgApp v2 writer)
2021-03-17 13:39:34.273399 E | rafthttp: failed to dial 68d1163017ce392 on stream MsgApp v2 (dial tcp: i/o timeout)
2021-03-17 13:39:34.273433 I | rafthttp: peer 68d1163017ce392 **became inactive** (message send to peer failed)
[root@k8s-master-0 ~]# kubectl -n mvnr-mtcil1-infra-mtcil-mtcil1 exec -it etcd-1 -- /bin/bash
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:34:30.858Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-bdb76ddc-ac56-4ba3-b8f0-9d15a9f3571f/etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 4.436385ms
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 4.21728ms
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 76.860651ms
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 3.96122ms
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 77.39875ms
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:34:52.874Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7186ba57-fc81-49be-889a-552b5aba0c8d/etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2021-03-17T15:34:52.874Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-867d9d6e-0925-491f-81e1-d92dc799c25a/etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 88.035144ms
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:35:22.927Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-bf153df7-fb4d-4ade-aafc-38003d4577f5/etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2021-03-17T15:35:22.960Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-6f107368-f895-4b2b-8cd0-f952f36359f4/etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 35.355891ms
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:35:51.991Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-a20e8fd4-a047-4ac3-a28c-d00ec5c63bbb/etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 68.568217ms
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 68.503005ms
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:36:01.490Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-cfcbf88f-7115-455b-ab5d-06a3c82c7a63/etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2021-03-17T15:36:01.557Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-b7144958-b2ee-4e2e-a709-c18c10e6ac7c/etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 71.045159ms
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
{"level":"warn","ts":"2021-03-17T15:36:11.667Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-24f1dd60-2da4-4c78-a3e4-be848fcc3bef/etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2021-03-17T15:36:11.668Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-ee401b13-7d39-4243-9de6-15701874dc36/etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
http://etcd-1.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is healthy: successfully committed proposal: took = 3.814891ms
http://etcd-0.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
http://etcd-2.etcd.mvnr-mtcil1-infra-mtcil-mtcil1.svc.cluster.local:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster

As far as I can tell, the configuration is correct.
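
Since the flapping is hard to read from the raw output, here is a minimal sketch that tallies healthy and unhealthy results per endpoint from pasted `etcdctl endpoint health --cluster` output. The helper name `tally_health` and the shortened hostnames in `SAMPLE` are my own assumptions for illustration; they are not part of etcdctl.

```python
import re
from collections import Counter

# Matches result lines such as
#   http://etcd-0...:2379 is unhealthy: failed to commit proposal: ...
# and ignores shell prompts, JSON warnings, and "Error: unhealthy cluster".
RESULT_RE = re.compile(r"^(\S+) is (healthy|unhealthy):")


def tally_health(output):
    """Return {endpoint: Counter({'healthy': n, 'unhealthy': m})}."""
    counts = {}
    for line in output.splitlines():
        match = RESULT_RE.match(line.strip())
        if match is None:
            continue
        endpoint, state = match.groups()
        counts.setdefault(endpoint, Counter())[state] += 1
    return counts


# Hostnames shortened for illustration; paste the real output instead.
SAMPLE = """\
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
http://etcd-1:2379 is healthy: successfully committed proposal: took = 4.21728ms
http://etcd-0:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
bash-5.0# ETCDCTL_API=3 etcdctl endpoint health --cluster
http://etcd-1:2379 is healthy: successfully committed proposal: took = 76.860651ms
http://etcd-0:2379 is healthy: successfully committed proposal: took = 77.39875ms
"""

if __name__ == "__main__":
    for endpoint, states in sorted(tally_health(SAMPLE).items()):
        print(endpoint, "healthy:", states["healthy"], "unhealthy:", states["unhealthy"])
```

Counting the seven runs pasted above by hand gives the same picture: etcd-1 (the pod the command was run from) is healthy in all seven, while etcd-0 and etcd-2 are each healthy in two runs and unhealthy in five.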
