Create Producer when the first broker in the list of brokers is down


I have a multi-node Kafka cluster which I use for consuming and producing.

In my application, I use confluent-kafka-go (1.6.1) to create producers and consumers. Everything works great when I produce and consume messages. This is how I configure my bootstrap server list:

"bootstrap.servers":"localhost:9092,localhost:9093,localhost:9094"

But as soon as I put the brokers' IP addresses in bootstrap.servers and the first broker in the list is down, producer creation fails repeatedly with:

Failed to initialize Producer ID: Local: Timed out

If I remove the IP of the failed node, producing and consuming messages works. If a broker goes down after I create the producer/consumer, they remain usable by switching over to the other nodes.

How should I configure bootstrap.servers in such a way that the producer will be created using the available nodes?

1 Answer

You shouldn't really be running 3 brokers on the same machine anyway. That said, using multiple distinct servers works fine for me when the first one is down (the cluster elects a different leader if it needs to), so it sounds like you lost either the leader of your topic partitions or the Controller. Enabling retries on the producer should let it recover on its own, since a retry makes a new metadata request for the partition leaders.

Overall, bootstrap.servers is just a comma-separated list; there's no other way to configure that property itself. You could put a reverse proxy in front of the brokers that resolves only to healthy nodes, but then you'd be fighting any DNS caching on the client side.