I am testing a small Bigtable cluster (the minimum of 3 nodes). I can see in the Google Cloud console that as the Write QPS level approaches 10K, CPU utilization approaches the recommended maximum of ~80%.
From what I understand, the QPS metric is for the whole instance, not per node. If that's the case, why is the CPU threshold being reached when the instance's QPS load is only about a third of the 30K guidance maximum (10K per node × 3 nodes)? I'm just trying to understand whether something is off with my data upload program (done via Dataflow).
I'm also curious why I never manage to observe anything close to 30K writes/sec, but I suspect this is due to limitations on the Dataflow side, since I'm still restricted to the 8-CPU quota while on the trial...
The CPU graph is the definitive metric for telling whether Bigtable is overloaded. Unfortunately, QPS isn't the ideal metric for determining the root cause of the overload since we added the bulk write API. Bigtable/Dataflow loading uses the Cloud Bigtable bulk APIs, which send multiple requests in a single batch, so one query can now contain a few dozen update requests. Rows updated per second would be a better metric, but alas it does not exist yet on the Cloud Bigtable side. There is an equivalent metric in your Dataflow UI in the Cloud Bigtable step, and you can use that number to judge Cloud Bigtable performance.
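To illustrate why QPS undercounts the write load, here is a minimal sketch using the Cloud Bigtable Java client's bulk mutation API (essentially what the Dataflow connector does in batches under the hood). The project, instance, table, and column family names are placeholders. A single bulkMutateRows call like this registers as roughly one query on the QPS graph even though it updates many rows.

```java
import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.BulkMutation;
import com.google.cloud.bigtable.data.v2.models.Mutation;

public class BulkWriteSketch {
  public static void main(String[] args) throws Exception {
    // "my-project", "my-instance", "my-table", and "cf" are placeholder names.
    try (BigtableDataClient client =
        BigtableDataClient.create("my-project", "my-instance")) {
      BulkMutation batch = BulkMutation.create("my-table");

      // Pack a few dozen row mutations into one bulk request.
      for (int i = 0; i < 50; i++) {
        batch.add(
            "row-key-" + i,
            Mutation.create().setCell("cf", "col", "value-" + i));
      }

      // This single call shows up as ~1 query in the QPS metric,
      // even though it writes 50 rows.
      client.bulkMutateRows(batch);
    }
  }
}
```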
The rule of thumb I use is ~3 Dataflow worker CPUs per Cloud Bigtable node when doing writes. With 8 worker CPUs and 3 Bigtable nodes, your job is very likely configured correctly. Given your description, I think your system is working about as efficiently as possible.
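To make that concrete: at roughly 3 worker CPUs per node, your 8 Dataflow CPUs correspond to about 8 / 3 ≈ 2.7 nodes' worth of write throughput, which is a reasonable match for a 3-node cluster.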