Why is KSM (kube-state-metrics) being scraped by only one Prometheus shard?

We are deploying Prometheus with sharding, using the Thanos sidecar.

Prometheus has the following recording rule:

sum by (cluster, namespace, pod, container) (
  irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}[5m])
)
* on (cluster, namespace, pod) group_left(node)
topk by (cluster, namespace, pod) (
  1, max by (cluster, namespace, pod, node) (kube_pod_info{node!=""})
)

The problem with this recording rule is that kube_pod_info{node!=""}, which is provided by kube-state-metrics, is scraped by only one Prometheus shard, and I don't know why.

Hence the recorded series only contains part of the metrics: the part coming from the Prometheus shard that has kube_pod_info.
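To illustrate why the result is partial, here is a minimal Python sketch of the join. It assumes each shard evaluates the rule only against the series it scraped itself. The shard names, pods, nodes, and values are made up for illustration, and the code models only the label matching, not Prometheus internals.

```python
# Hypothetical CPU rate series per shard, keyed by (namespace, pod).
cpu_by_shard = {
    "shard-0": {("default", "api-1"): 0.25, ("default", "api-2"): 0.5},
    "shard-1": {("default", "db-1"): 1.0},
}

# Assumption: kube_pod_info exists only on the shard that scrapes KSM.
# It maps (namespace, pod) to the node label; its sample value is 1.
pod_info_by_shard = {
    "shard-0": {
        ("default", "api-1"): "node-a",
        ("default", "api-2"): "node-b",
        ("default", "db-1"): "node-a",
    },
    "shard-1": {},
}


def evaluate_rule(cpu, pod_info):
    """Inner join on (namespace, pod) that copies the node label,
    roughly what `* on (...) group_left(node)` does. Multiplying by
    kube_pod_info (value 1) leaves the CPU value unchanged."""
    return {
        key + (pod_info[key],): value
        for key, value in cpu.items()
        if key in pod_info
    }


results = {
    shard: evaluate_rule(cpu_by_shard[shard], pod_info_by_shard[shard])
    for shard in cpu_by_shard
}

for shard, series in results.items():
    print(shard, series)
```

In this sketch shard-1 records nothing, because it has no kube_pod_info to match against, and shard-0 records only the pods whose cAdvisor series it scraped itself.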

I need to know why only one Prometheus shard is able to scrape kube-state-metrics (KSM), and how to make the other Prometheus shards scrape it as well.

Thanks

The only solution for now is to run the recording rule in Thanos Ruler, through Thanos Query.

There is 1 solution below

Shards are used to split the metrics between several Prometheus instances. The split is done per scrape target, so each target is scraped by exactly one shard. If you run one replica of KSM, it is a single target and will be scraped by only one Prometheus shard. If you want replication, you should increase the replica count.
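The per-target split can be sketched in Python as follows. This is a rough model, assuming the shards are assigned with a `hashmod` relabel rule on the target address (MD5 of the value, last 8 bytes read as a big-endian integer, modulo the shard count). The addresses and the shard count are hypothetical, and your setup may hash a different label.

```python
import hashlib


def shard_for(address: str, shards: int) -> int:
    """Return the shard index for a target address.

    Assumption: mirrors a hashmod-style rule, i.e. MD5 of the label
    value, last 8 bytes as a big-endian integer, modulo `shards`.
    """
    digest = hashlib.md5(address.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % shards


SHARDS = 3  # hypothetical number of Prometheus shards

# Hypothetical targets: one KSM pod and three kubelets.
targets = [
    "10.0.0.11:8080",   # kube-state-metrics (single replica)
    "10.0.1.1:10250",   # kubelet on node-a
    "10.0.1.2:10250",   # kubelet on node-b
    "10.0.1.3:10250",   # kubelet on node-c
]

assignment = {target: shard_for(target, SHARDS) for target in targets}

for target, shard in assignment.items():
    print(f"{target} -> shard {shard}")
```

Each address maps to exactly one shard index, so a single KSM pod always lands on one shard, while the kubelet targets are spread over the shards by their hashes.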