Incorrect Google Cloud metrics? or what is going on?

My background is more from the Twitter side, where all stats are recorded per minute, so you might have 120 requests per minute. Inside Twitter, someone had the bright idea to divide by 60, and most graphs do this (except for some teams who realize that dividing by 60 is NOT the true rps at all, since the rate fluctuates within a minute). So instead of 120 requests per minute, many graphs report 2 requests per second. Google seems to be doing the same, EXCEPT the math does not show it. At Twitter, we could multiply by 60 and the answer was always a whole number: how many requests occurred in that minute.

In Google, however, we see 0.02 requests/second, which multiplied by 60 is 1.2 requests per minute. IF they are at minute granularity, they are definitely counting it wrong or something is wrong with their math.
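To make the arithmetic above concrete, here is a minimal sketch that converts a per-second rate back into a count for a given window and checks which window lengths would yield a whole number of requests. The function names and the list of candidate windows are hypothetical, chosen only for illustration, and the sketch assumes the displayed 0.02 is exact rather than a rounded value. It does not claim to describe how Google actually computes the chart.

```python
from fractions import Fraction


def requests_in_window(rate_per_second: Fraction, window_seconds: int) -> Fraction:
    """Convert an average per-second rate back into a request count for a window."""
    return rate_per_second * window_seconds


def whole_count_windows(rate_per_second: Fraction, candidate_windows: list[int]) -> list[int]:
    """Return the candidate window lengths (seconds) that give a whole request count."""
    return [
        w for w in candidate_windows
        if requests_in_window(rate_per_second, w).denominator == 1
    ]


# Twitter-style graph: 120 requests in a minute, divided by 60.
twitter_rate = Fraction(120, 60)
# Value read off the Cloud Run chart (assumed exact here, it may be rounded).
google_rate = Fraction("0.02")

print(requests_in_window(twitter_rate, 60))        # 120
print(float(requests_in_window(google_rate, 60)))  # 1.2
# Hypothetical window lengths, in seconds, purely for illustration.
print(whole_count_windows(google_rate, [60, 120, 300, 600, 3600]))  # [300, 600, 3600]
```

Under these assumptions a 60-second window cannot produce 0.02 requests/second from a whole number of requests, while some longer windows can. Whether the chart really averages over a longer window is something the chart configuration would have to confirm.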

This is from the Cloud Run metrics when we click into the instance itself

[Screenshot of the Cloud Run metrics chart]

What am I missing here? AND BETTER yet, can we please report requests per minute? Requests per second is really the average req/second for that minute, and it can be really confusing to people when we have these discussions about how you can get 0.5 requests/second.
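As a small illustration of why an averaged requests/second figure can mislead, the sketch below summarizes one minute of made-up per-second counts. The function name and both traffic patterns are hypothetical. Both patterns total 30 requests in the minute, so both average 0.5 requests/second, yet their peak per-second load differs widely.

```python
def summarize_minute(per_second_counts: list[int]) -> dict[str, float]:
    """Summarize one minute of per-second request counts."""
    if len(per_second_counts) != 60:
        raise ValueError("expected exactly 60 one-second buckets")
    total = sum(per_second_counts)
    return {
        "requests_per_minute": total,
        "average_rps": total / 60,              # what a divide-by-60 graph shows
        "peak_rps": max(per_second_counts),     # what actually hit in the busiest second
    }


# Hypothetical traffic: 30 requests all arriving in the first second.
burst = [30] + [0] * 59
# Hypothetical traffic: one request every other second.
steady = [1, 0] * 30

print(summarize_minute(burst))
print(summarize_minute(steady))
```

Reporting the per-minute total alongside (or instead of) the average keeps the number a whole count and avoids the "0.5 requests per second" confusion.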

I AM assuming that this is not requests per second 'at' the minute boundary, because that would be VERY hard to calculate BUT would also be a whole number, i.e. 0 requests or 1, not 0.2, and that would be quite useless to be honest.

EVERY Cloud Run instance creates this chart, so I assume it's the same for everyone, but if I click 'view in metrics explorer' it then gives this picture of how 'google configured it'....

[Screenshot of the chart configuration in Metrics Explorer]

There are 1 best solutions below

According to the Cloud Run metrics documentation, the Request Count metric is sampled every 60 seconds, and it excludes requests that do not reach your container instances. The examples given are unauthorized requests and requests sent after the maximum number of instances is reached, which obviously is not your case, but it is still something to consider.

Assuming that the calculation of the request count is wrong, I did some digging in Google's Issue Tracker for the Monitoring and Cloud Run components to check whether any open bugs relate to this, but could not find any. I would advise that you create a bug in their system so that Google can address it and you are notified once it is fixed.