Sometimes some data is missing from Prometheus's query

I have 3 Ubuntu machines. Each of them has Prometheus installed with node_exporter. When I try to get network statistics, I query

  • node_network_receive_bytes_total
  • node_network_transmit_bytes_total
  • node_network_receive_drop_total
  • node_network_transmit_drop_total
  • node_network_receive_errs_total
  • node_network_transmit_errs_total
  • node_network_receive_packets_total
  • node_network_transmit_packets_total

one by one in a for loop through the HTTP API.
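For reference, a minimal sketch of such a loop. It assumes Prometheus is reachable at http://localhost:9090 and uses the instant-query endpoint /api/v1/query; the helper names (build_query_url, fetch_metric, fetch_all) are made up for illustration and are not from the original setup.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus listens here; adjust to your own setup.
PROMETHEUS_URL = "http://localhost:9090"

METRICS = [
    "node_network_receive_bytes_total",
    "node_network_transmit_bytes_total",
    "node_network_receive_drop_total",
    "node_network_transmit_drop_total",
    "node_network_receive_errs_total",
    "node_network_transmit_errs_total",
    "node_network_receive_packets_total",
    "node_network_transmit_packets_total",
]


def build_query_url(metric, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for one metric (no explicit timestamp)."""
    return base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": metric})


def fetch_metric(metric, base_url=PROMETHEUS_URL):
    """Run one instant query and return the list of series in the result."""
    with urllib.request.urlopen(build_query_url(metric, base_url), timeout=5) as resp:
        body = json.load(resp)
    return body["data"]["result"]


def fetch_all(metrics=METRICS, fetch=fetch_metric):
    """Query the metrics one by one and collect the results per metric name."""
    return {metric: fetch(metric) for metric in metrics}
```

Calling fetch_all() issues eight separate requests, so each metric is evaluated at a slightly different moment.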

Most of the time, the returned statistics are fine. However, in some cases, the data for one node is missing.

For example, in one round, the query for node_network_receive_bytes_total returns data for all nodes, but the data for the 3rd node is missing from the result of node_network_receive_packets_total.

How can I avoid this problem? Or should I simply resend the query when I find that some data is missing?

There is 1 answer below

I think this happens because the latest data has not yet been saved to Prometheus when I try to query it.

Consider this scenario: Prometheus pulls data from the node_exporter on each node, and these pulls do not happen at exactly the same time. When I send the query, Prometheus may have finished pulling the latest data from 2 nodes but not from the 3rd. In that case, Prometheus cannot calculate a value at the query timestamp for the 3rd node, so the 3rd node's data is not returned in the query result.

BTW, according to the Prometheus source code, even if you do not specify a timestamp in the query request, Prometheus generates one itself.

I found 2 workarounds for this problem:

  1. Set the time parameter of the query request to 1 second before now. This way you always query "old" data, so you do not need to worry about the data not existing yet. The drawback is that the data is slightly stale, since you always treat the value from 1 second ago as the current one.
  2. Simply send the query to Prometheus again.
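
Both workarounds can be combined. Below is a minimal sketch, not a definitive implementation. It assumes Prometheus is at http://localhost:9090, that you know the expected instance label values in advance, and that a 1 second lag with 3 attempts is acceptable. The helper names (query_at, instances_in, query_complete) are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: Prometheus listens here; adjust to your own setup.
PROMETHEUS_URL = "http://localhost:9090"


def query_at(metric, timestamp, base_url=PROMETHEUS_URL):
    """Instant query evaluated at an explicit Unix timestamp (the time parameter)."""
    params = urllib.parse.urlencode({"query": metric, "time": f"{timestamp:.3f}"})
    with urllib.request.urlopen(f"{base_url}/api/v1/query?{params}", timeout=5) as resp:
        return json.load(resp)["data"]["result"]


def instances_in(result):
    """Return the set of instance label values present in a query result."""
    return {series["metric"].get("instance") for series in result}


def query_complete(metric, expected, query=query_at, lag=1.0, retries=3,
                   wait=0.5, now=time.time, sleep=time.sleep):
    """Workaround 1: evaluate at now - lag seconds.
    Workaround 2: resend the query while an expected instance is missing.
    Returns the last result, even if it is still incomplete."""
    result = []
    for attempt in range(retries):
        result = query(metric, now() - lag)
        if set(expected) <= instances_in(result):
            break  # every expected node is present
        if attempt < retries - 1:
            sleep(wait)  # give Prometheus time to finish the pending scrape
    return result
```

For example, query_complete("node_network_receive_packets_total", ["node1:9100", "node2:9100", "node3:9100"]) would retry up to 3 times; the instance values here are placeholders for whatever your scrape config uses.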