Why does heap usage increase when an Aeron cluster is idle?


We use the Aeron library. While monitoring the heap with the system completely idle, we saw heap usage increase. Checking the logs, we found the entries below. What is causing this increase? The monitoring result is attached.

{"log":"the prevVal of Failed attempts to free log buffers is: 0.0\n","stream":"stdout","time":"2022-12-19T04:00:29.992379754Z"}
{"log":"we try to get Sender flow control limits, i.e. back-pressure events value and put it as aeron_sender_flow_control_limits__i_e__back_pressure_events\n","stream":"stdout","time":"2022-12-19T04:00:29.992383872Z"}
{"log":"the currentValue of Sender flow control limits, i.e. back-pressure events is: 36\n","stream":"stdout","time":"2022-12-19T04:00:29.992387994Z"}
{"log":"the prevVal of Sender flow control limits, i.e. back-pressure events is: 36.0\n","stream":"stdout","time":"2022-12-19T04:00:29.99239165Z"}
{"log":"we try to get Unblocked Publications value and put it as aeron_unblocked_publications\n","stream":"stdout","time":"2022-12-19T04:00:29.992395649Z"}



There are 2 best solutions below

1
On

This is a hypothesis ...

To deliver its advertised high availability and low latency, Aeron needs to send periodic messages to check that the nodes in the cluster are all still alive. The sending, receiving and processing of these messages continues even when the cluster is idle. That is a possible (IMO likely) cause of the heap usage that you are observing.

I suspect that the log events are coincidental. They don't seem to signify anything important.

0
On

The log messages shown don't look like Aeron log messages. The text "we try to get" does not appear anywhere in the Aeron code base. This suggests that these log messages come from a third-party monitoring application scanning the Aeron Counters. Many monitoring clients are not built with zero-garbage operation in mind. I would recommend running an allocation profiler (e.g. async-profiler) against the application while it is idle to find where the allocations are coming from.
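To illustrate how a counter-polling monitor can generate garbage on an otherwise idle system, here is a rough Java sketch. It is not the poster's monitoring code and not part of Aeron: `AllocationProbe` and `describeCounter` are hypothetical names, and the message text only imitates the log lines in the question. The measurement relies on `com.sun.management.ThreadMXBean.getThreadAllocatedBytes`, which I am assuming is available (it is on HotSpot/OpenJDK JVMs).

```java
import java.lang.management.ManagementFactory;

public final class AllocationProbe {
    // HotSpot-specific extension of the standard ThreadMXBean.
    // Assumption: the JVM is OpenJDK/HotSpot, where this cast succeeds.
    private static final com.sun.management.ThreadMXBean THREADS =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    /** Returns the bytes allocated by the current thread while running the task. */
    public static long allocatedBytesDuring(Runnable task) {
        long id = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(id);
        task.run();
        return THREADS.getThreadAllocatedBytes(id) - before;
    }

    /** Hypothetical poller step that builds messages like those in the question. */
    public static String describeCounter(String label, long current, double previous) {
        // Derive a metric-style name from the counter label; every call
        // allocates new strings (and a compiled regex) on the heap.
        String metricName = "aeron_" + label.toLowerCase().replaceAll("[^a-z0-9]", "_");
        return "we try to get " + label + " value and put it as " + metricName
            + "; currentValue=" + current + ", prevVal=" + previous;
    }

    public static void main(String[] args) {
        // Simulate many polling rounds and report how much garbage they produced.
        long bytes = allocatedBytesDuring(() -> {
            for (int i = 0; i < 10_000; i++) {
                describeCounter("Unblocked Publications", i, i - 1.0);
            }
        });
        System.out.println("Allocated while polling: " + bytes + " bytes");
    }
}
```

If a loop like this runs on a timer, the heap graph will climb between collections even though no messages flow through the cluster. An allocation profiler run against the real process is still the way to confirm where the allocations actually come from.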