k8s pod without CPU and memory information


I got empty values for CPU and memory when I used igztop to check running pods in the iguazio/mlrun solution. See the first line of the output, for the pod *m6vd9:

[ jist @ iguazio-system 07:41:43 ]->(0) ~ $ igztop -s cpu
+--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
| NAME                                                         | CPU(m) | MEMORY(Mi) | NODE      | STATUS  | MLRun Proj. | MLRun Owner |
+--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
| xxxxxxxxxxxxxxxx7445dfc774-m6vd9                             |        |            | k8s-node3 | Running |             |             |
| xxxxxx-jupyter-55b565cc78-7bjfn                              | 27     | 480        | k8s-node1 | Running |             |             |
| nuclio-xxxxxxxxxxxxxxxxxxxxxxxxxx-756fcb7f74-h6ttk           | 15     | 246        | k8s-node3 | Running |             |             |
| mlrun-db-7bc6bcf796-64nz7                                    | 13     | 717        | k8s-node2 | Running |             |             |
| xxxx-jupyter-c4cccdbd8-slhlx                                 | 10     | 79         | k8s-node1 | Running |             |             |
| v3io-webapi-scj4h                                            | 8      | 1817       | k8s-node2 | Running |             |             |
| v3io-webapi-56g4d                                            | 8      | 1827       | k8s-node1 | Running |             |             |
| spark-worker-8d877878c-ts2t7                                 | 8      | 431        | k8s-node1 | Running |             |             |
| provazio-controller-644f5784bf-htcdk                         | 8      | 34         | k8s-node1 | Running |             |             |

It was also not possible to see performance metrics (CPU, memory, I/O) for this pod in Grafana.

Do you know how I can resolve this issue without restarting the whole node (and what the root cause is)?


There are 2 best solutions below

0
On BEST ANSWER

This looks like an issue with the kubelet. The best approach is to follow the step-by-step scenario below (see the diagram in the PDF).

[Images: k8s diagram, first part; k8s diagram, second part]
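The diagrams are not reproduced here, so the following is only a minimal sketch of one way to check whether the kubelet on the affected node still reports container metrics for the pod. It assumes the kubelet's cAdvisor endpoint is reachable through the API server proxy and that the kubelet runs as a systemd service. The helper name `count_pod_metrics` is hypothetical, the node name `k8s-node3` comes from the question's output, and the pod name is left as a placeholder.

```shell
#!/bin/sh
# count_pod_metrics POD_NAME   (hypothetical helper, not part of any tool)
# Reads Prometheus-format cAdvisor metrics on stdin and prints how many
# container_cpu_usage_seconds_total samples carry the given pod label.
# A result of 0 suggests the kubelet is not reporting stats for that pod.
count_pod_metrics() {
    pod="$1"
    # -F treats the pod name literally; "|| true" keeps the exit status 0
    # when grep -c finds no match (it still prints 0).
    grep '^container_cpu_usage_seconds_total{' \
        | grep -cF "pod=\"${pod}\"" || true
}

# Usage against a live cluster (assumption: API server proxy access to the node):
#   kubectl get --raw "/api/v1/nodes/k8s-node3/proxy/metrics/cadvisor" \
#       | count_pod_metrics "<pod-name>"
#
# On the node itself (assumption: systemd-managed kubelet):
#   systemctl status kubelet
#   journalctl -u kubelet --since "1 hour ago"
```

If the count is 0 for the affected pod while other pods on the same node return samples, the kubelet logs on that node are the next place to look.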

6
On

The troubleshooting steps below will help you resolve the issue:

1. Check whether you can see the CPU and memory of the pod using the describe command (this shows the configured requests and limits, not live usage):

kubectl describe pods my-pod

2. Check whether you can view the CPU and memory usage of all pods and nodes using the commands below:

kubectl top pod --all-namespaces

kubectl top node

3. Check whether the metrics server is running using the commands below:

kubectl get apiservices v1beta1.metrics.k8s.io
kubectl get pod -n kube-system -l k8s-app=metrics-server

4. Check the CPU and memory of the pod using the Prometheus (PromQL) queries below, for example in Grafana:

CPU Utilisation Per Pod:

sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)

RAM Usage Per Pod:

sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)

5. Check the logs of the pod and the node. If you find any errors, attach those logs for further troubleshooting.
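Steps 2 and 3 can be combined into a quick check for running pods that have no row in the metrics output. The sketch below is not part of the original answer: `pods_without_metrics` is a hypothetical helper, and the file names and the `<namespace>` placeholder are assumptions for illustration.

```shell
#!/bin/sh
# pods_without_metrics RUNNING_FILE TOP_FILE   (hypothetical helper)
# RUNNING_FILE: one running pod name per line.
# TOP_FILE:     output of `kubectl top pod --no-headers` (pod name in column 1).
# Prints every running pod that has no row in the metrics output.
pods_without_metrics() {
    # The first awk argument is TOP_FILE: remember its pod names.
    # FILENAME == ARGV[1] is used instead of NR == FNR so that an empty
    # TOP_FILE (for example, metrics server down) still lists every pod.
    awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
         !($1 in seen)       { print $1 }' "$2" "$1"
}

# Usage against a live cluster (namespace is a placeholder):
#   kubectl get pods -n <namespace> --field-selector=status.phase=Running \
#       --no-headers -o custom-columns=NAME:.metadata.name > running.txt
#   kubectl top pod -n <namespace> --no-headers > top.txt
#   pods_without_metrics running.txt top.txt
```

A pod that appears in this list while its neighbours report metrics points at the node or kubelet serving that pod rather than at the metrics server as a whole.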