ClickHouse: n/n brokers are down


We have set up a few tables with ENGINE = Kafka() across our three-node ClickHouse staging cluster.

In the logs we can frequently see this error:

[rdk:ERROR] [thrd:GroupCoordinator]: 15/15 brokers are down

I am sure the brokers are not down (not a single one). I also checked the topics: none of the 16 topics currently being consumed has developed any lag.

Restarting the ClickHouse service with:

systemctl restart clickhouse-server

makes the errors go away, as does deleting and recreating the tables.

I was hoping that disabling the DNS cache would help, but it didn't. Is it possible that these errors occur when there is not much data?

Or any other ideas I could try?

SELECT version()

┌─version()─┐
│ 23.1.3.5  │
└───────────┘

user1874594 On
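  • Check the Kafka status. Make sure that all of the brokers are up and running, and reachable from every ClickHouse node.

  • Check the /etc/clickhouse-server/config.xml file and the Kafka table definitions for the following:

Check the broker list (the kafka_broker_list setting of each Kafka table) and confirm that it matches the addresses the brokers actually advertise.

Increase the reconnect backoff. ClickHouse passes librdkafka options through the <kafka> section of config.xml, so the relevant options are reconnect.backoff.ms and reconnect.backoff.max.ms, written with underscores instead of dots. They control how long the client waits before it tries to reconnect to a broker it considers down.

Increase the poll timeout (the kafka_poll_timeout_ms setting of the Kafka table engine). This setting controls how long ClickHouse waits for messages in a single poll from Kafka.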
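
As a rough illustration of the reconnect setting mentioned above, librdkafka options can be placed in a <kafka> section of the server configuration, with the dots in the option names replaced by underscores. This is only a sketch: the values are placeholders chosen for illustration, not recommendations, and the option names should be checked against the librdkafka version bundled with your ClickHouse release.

```xml
<clickhouse>
    <kafka>
        <!-- librdkafka reconnect.backoff.ms: initial wait before reconnecting
             to a broker (illustrative value, not a recommendation) -->
        <reconnect_backoff_ms>1000</reconnect_backoff_ms>

        <!-- librdkafka reconnect.backoff.max.ms: upper bound for the
             reconnect backoff (illustrative value, not a recommendation) -->
        <reconnect_backoff_max_ms>30000</reconnect_backoff_max_ms>

        <!-- Optional while troubleshooting: verbose librdkafka logging for
             broker connections and consumer group events -->
        <debug>broker,cgrp</debug>
    </kafka>
</clickhouse>
```

The server has to be restarted for changes in this section to reach the Kafka consumers, and the debug line is best removed again afterwards because it makes the log considerably noisier.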

If none of these work, check the ClickHouse logs and post the relevant events for further troubleshooting.