How to calculate hinted_handoff_throttle_in_kb to improve hinted handoff performance?


Is there a formula for calculating hinted_handoff_throttle_in_kb to improve hinted handoff performance? I know this value is divided by the number of nodes in the cluster, but I don't think we can simply increase it without considering cluster metrics such as connectivity, among other things. Does the community have any guidance for tuning this property based on cluster information?

This is the guide I'm reading:

https://cassandra.apache.org/doc/latest/cassandra/managing/operating/hints.html#configuring-hints

Answer by Mário Tavares

Typically you wouldn't need to increase hinted_handoff_throttle_in_kb, provided dropped mutations or unavailable nodes are not a common issue in the cluster.

hinted_handoff_throttle_in_kb is divided by the number of nodes minus one to keep hinted handoff scalable: a node receiving hints from multiple peers processes them at an average of roughly hinted_handoff_throttle_in_kb * max_hints_delivery_threads, regardless of the number of nodes in the cluster.

With that said, hints only come into play when mutations are missed, as a mechanism to replay them. This is common when nodes are restarted or become unavailable for some period. If you see frequent hints in the Cassandra logs for unknown reasons, I advise looking into the root cause of the missed mutations rather than addressing the problem at the hint configuration level.

If that's not the case, and hint replay instead takes too long after maintenance or planned restarts, then the write throughput may be too high for the default value, as explained in the hint documentation.

In this case, increasing hinted_handoff_throttle_in_kb may make sense, but I would recommend testing it in a lower environment under similar service load, as hint load can be very impactful on database performance. Additionally, if the cluster has multiple logical datacenters, rather than increasing hinted_handoff_throttle_in_kb, I recommend starting by increasing max_hints_delivery_threads to a maximum of 2 times the number of datacenters, as a means to scale hint throughput with replication.
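Before changing anything, it can help to check whether the throttle is plausibly the bottleneck. The sketch below is a back-of-the-envelope estimate, not a formula from the Cassandra documentation. It assumes that one coordinator accumulates hints for the down node at a steady rate, that the downtime is shorter than the hint window so no hints are dropped, and that replay to that node runs at the per-thread rate (throttle divided by nodes minus one). All names and inputs are hypothetical.

```python
def estimate_replay_seconds(hinted_kb_per_s: float, downtime_s: float,
                            throttle_kb: float, node_count: int) -> float:
    """Rough time for one coordinator to replay its hints to one node.

    Assumptions (not from the Cassandra docs):
      - hints accumulate at a constant hinted_kb_per_s during the outage
      - the throttle is the only limit (no network or disk bottleneck)
      - the delivery rate is throttle_kb / (node_count - 1) KB/s
    """
    backlog_kb = hinted_kb_per_s * downtime_s
    delivery_rate_kb = throttle_kb / max(1, node_count - 1)
    return backlog_kb / delivery_rate_kb


if __name__ == "__main__":
    # Example: 200 KB/s of hinted writes, a 10 minute restart, 6 nodes.
    seconds = estimate_replay_seconds(200, 600, 1024, 6)
    print(f"estimated replay time: {seconds / 60:.1f} minutes")
```

If the estimate is far below the replay time you actually observe, the throttle is probably not what is slowing replay down, and raising it is unlikely to help.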