I want to use Prometheus to scrape metrics from my distributed web service. I have four kinds of services, set up with docker-compose or Kubernetes.
Flask: 5000
Redis-Queue: 6379
Prometheus
Workers: Horizontally scaled based on system load. They get their working instructions over the Redis queue.
It is straightforward to scrape metrics from Flask. However, what is the best practice for getting metrics from the workers? I cannot bind a port to them because I do not know how many of them exist.
I was thinking about using a Prometheus Pushgateway. However, as I found out, this is not recommended.
The answer depends on whether your workers are long-lived or short-lived.
If a worker lives to execute a single task and then quits, the Pushgateway is the correct way to send metrics from the worker to Prometheus.
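As a rough sketch of the short-lived case, a worker can push its final metrics in the text exposition format with an HTTP PUT to the Pushgateway's `/metrics/job/<job>` path just before it exits. The gateway address `http://pushgateway:9091`, the job name, and the metric names here are assumptions for illustration; the official `prometheus_client` library provides `push_to_gateway` for the same purpose and is usually the better choice in real code.

```python
import urllib.request

# Assumption: placeholder address; point this at your own Pushgateway.
PUSHGATEWAY_URL = "http://pushgateway:9091"


def build_push_request(job, metrics, gateway=PUSHGATEWAY_URL):
    """Build an HTTP PUT that replaces all metrics stored for `job`."""
    # Text exposition format: one "name value" pair per line,
    # and the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push_metrics(job, metrics, gateway=PUSHGATEWAY_URL):
    """Send the metrics; call this once, right before the worker exits."""
    request = build_push_request(job, metrics, gateway)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

A worker would call something like `push_metrics("rq_worker", {"tasks_processed_total": 1})` at the end of its single task (both names are hypothetical).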
If a worker lives for at least two Prometheus scrape periods (the scrape interval is configurable), you can definitely open a port on the worker and have Prometheus scrape metrics from a dedicated endpoint.
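For the long-lived case, here is a minimal sketch of a worker serving its own `/metrics` endpoint from a background thread, using only the standard library. The port `8000` and the metric name `worker_tasks_processed_total` are assumptions; in practice `start_http_server` from `prometheus_client` does this for you. Note that every worker can listen on the same port number, because each pod (or compose container) has its own IP address, so the number of workers does not need to be known up front.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-process counter, updated by the worker loop.
_lock = threading.Lock()
_tasks_processed = 0


def record_task_done():
    """Call this from the worker each time a task finishes."""
    global _tasks_processed
    with _lock:
        _tasks_processed += 1


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        with _lock:
            value = _tasks_processed
        # Render the counter in the Prometheus text exposition format.
        body = (
            "# TYPE worker_tasks_processed_total counter\n"
            f"worker_tasks_processed_total {value}\n"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep scrape requests out of the worker's log output


def start_metrics_server(port=8000):
    """Serve /metrics from a daemon thread and return the server object."""
    server = ThreadingHTTPServer(("0.0.0.0", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The worker would call `start_metrics_server()` once at startup and `record_task_done()` after each job it takes from the queue.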
Prometheus's default scrape configuration comes with a scrape job that will scrape any pod with the following annotation:
It also derives the scrape endpoint from the following annotations on the pod:
So you can easily annotate worker pods with the above annotations to direct Prometheus to scrape metrics from them.
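The annotation names themselves are not shown above. A widely used convention is `prometheus.io/scrape`, `prometheus.io/path`, and `prometheus.io/port`, but treat these names as an assumption and check which ones your own scrape configuration actually matches on. Under that assumption, the sketch below builds the annotations for a worker pod template and mimics how such a scrape job would turn them into a target URL; the helper functions and the pod IP are hypothetical.

```python
# Assumption: conventional annotation names; your Prometheus scrape job
# may be configured to look for different ones.
def scrape_annotations(port, path="/metrics"):
    """Annotations for a worker pod template (values must be strings)."""
    return {
        "prometheus.io/scrape": "true",
        "prometheus.io/path": path,
        "prometheus.io/port": str(port),
    }


def scrape_target(pod_ip, annotations):
    """Mimic how an annotation-based scrape job derives the target URL."""
    if annotations.get("prometheus.io/scrape") != "true":
        return None  # the pod has not opted in to scraping
    path = annotations.get("prometheus.io/path", "/metrics")
    # This sketch requires the port annotation to be present.
    port = annotations["prometheus.io/port"]
    return f"http://{pod_ip}:{port}{path}"
```

Because the annotations sit on the pod template, every replica the autoscaler creates carries them, and Prometheus discovers each new worker without any per-worker configuration.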