I am using Apache Kafka 3.5 to build a cluster of three Kafka machines. The goal is to have three broker machines running without ZooKeeper (KRaft mode).
After we installed and configured the Kafka machines, the broker stops after a couple of minutes on every machine.
First, I want to share the configuration in server.properties.
After rendering from the template, server.properties looks like this (node.id differs across the three brokers):
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@<node-1-ip>:19092,2@120.201.245.16:19093,3@120.201.245.121:19094
listeners=PLAINTEXT://:9092,CONTROLLER://:19092
inter.broker.listener.name=PLAINTEXT
advertised.listeners=PLAINTEXT://:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kraft-combined-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
initial.broker.registration.timeout.ms=240000
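One thing that may be worth checking in a config like this: each entry in controller.quorum.voters names a port, and the node with that id has to listen on the same port through its CONTROLLER listener. The sketch below is only an illustration in Python, not part of Kafka. The function names and the sample hosts (host1, host2, host3) are made up for the example, and the sample assumes that only node.id differs between the three rendered files.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def controller_port_mismatch(props):
    """Return (voter_port, listener_port) if they differ for this node, else None."""
    node_id = props["node.id"]
    controller_name = props["controller.listener.names"].split(",")[0].strip()

    # controller.quorum.voters has the form id@host:port,id@host:port,...
    voter_port = None
    for voter in props["controller.quorum.voters"].split(","):
        voter_id, endpoint = voter.strip().split("@", 1)
        if voter_id == node_id:
            voter_port = int(endpoint.rsplit(":", 1)[1])

    # listeners has the form NAME://host:port,... (the host may be empty)
    listener_port = None
    for listener in props["listeners"].split(","):
        name, address = listener.strip().split("://", 1)
        if name == controller_name:
            listener_port = int(address.rsplit(":", 1)[1])

    if voter_port is None or listener_port is None:
        raise ValueError("node.id or controller listener not found in config")
    return None if voter_port == listener_port else (voter_port, listener_port)


# Hypothetical rendering for node 2: hosts are placeholders, ports as posted.
sample = """
node.id=2
controller.quorum.voters=1@host1:19092,2@host2:19093,3@host3:19094
listeners=PLAINTEXT://:9092,CONTROLLER://:19092
controller.listener.names=CONTROLLER
"""
print(controller_port_mismatch(parse_properties(sample)))  # (19093, 19092)
```

A result other than None means the other voters would try to reach this node on a port it is not listening on. Whether that applies here depends on what the template actually renders on nodes 2 and 3.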
The cluster fails to start, and the logs point to a connection problem:
[2023-06-23 10:45:57,443] WARN [RaftManager id=1] Connection to node 3 (/120.201.245.121:19094) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2023-06-23 10:45:57,468] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:57,769] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:57,869] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:57,900] INFO [RaftManager id=1] Node 3 disconnected. (org.apache.kafka.clients.NetworkClient)
[2023-06-23 10:45:57,900] WARN [RaftManager id=1] Connection to node 3 (/120.201.245.121:19094) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2023-06-23 10:45:57,914] INFO [RaftManager id=1] Node 2 disconnected. (org.apache.kafka.clients.NetworkClient)
[2023-06-23 10:45:57,914] WARN [RaftManager id=1] Connection to node 2 (/120.201.245.16:19093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2023-06-23 10:45:57,969] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:58,170] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:58,229] WARN [RaftManager id=1] Graceful shutdown timed out after 5000ms (org.apache.kafka.raft.KafkaRaftClient)
[2023-06-23 10:45:58,230] ERROR [kafka-1-raft-io-thread]: Graceful shutdown of RaftClient failed (kafka.raft.KafkaRaftManager$RaftIoThread)
java.util.concurrent.TimeoutException: Timeout expired before graceful shutdown completed
    at org.apache.kafka.raft.KafkaRaftClient$GracefulShutdown.failWithTimeout(KafkaRaftClient.java:2423)
    at org.apache.kafka.raft.KafkaRaftClient.maybeCompleteShutdown(KafkaRaftClient.java:2169)
    at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:2234)
    at kafka.raft.KafkaRaftManager$RaftIoThread.doWork(RaftManager.scala:64)
    at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:127)
[2023-06-23 10:45:58,230] INFO [kafka-1-raft-io-thread]: Shutdown completed (kafka.raft.KafkaRaftManager$RaftIoThread)
[2023-06-23 10:45:58,231] INFO [kafka-1-raft-io-thread]: Stopped (kafka.raft.KafkaRaftManager$RaftIoThread)
[2023-06-23 10:45:58,240] INFO [kafka-1-raft-outbound-request-thread]: Shutting down (kafka.raft.RaftSendThread)
[2023-06-23 10:45:58,240] INFO [kafka-1-raft-outbound-request-thread]: Stopped (kafka.raft.RaftSendThread)
[2023-06-23 10:45:58,241] INFO [kafka-1-raft-outbound-request-thread]: Shutdown completed (kafka.raft.RaftSendThread)
[2023-06-23 10:45:58,252] INFO [SocketServer listenerType=CONTROLLER, nodeId=1] Stopping socket server request processors (kafka.network.SocketServer)
[2023-06-23 10:45:58,257] INFO [SocketServer listenerType=CONTROLLER, nodeId=1] Stopped socket server request processors (kafka.network.SocketServer)
[2023-06-23 10:45:58,270] INFO [QuorumController id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,270] INFO [MetadataLoader id=1] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
[2023-06-23 10:45:58,273] INFO [SharedServer id=1] Stopping SharedServer (kafka.server.SharedServer)
[2023-06-23 10:45:58,273] INFO [MetadataLoader id=1] beginShutdown: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,273] INFO [SnapshotGenerator id=1] close: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,273] INFO [SnapshotGenerator id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,277] INFO [MetadataLoader id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,277] INFO [SnapshotGenerator id=1] closed event queue. (org.apache.kafka.queue.KafkaEventQueue)
[2023-06-23 10:45:58,279] INFO Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics)
[2023-06-23 10:45:58,279] INFO Closing reporter org.apache.kafka.common.metrics.JmxReporter (org.apache.kafka.common.metrics.Metrics)
[2023-06-23 10:45:58,279] INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics)
[2023-06-23 10:45:58,283] INFO App info kafka.server for 1 unregistered (org.apache.kafka.common.utils.AppInfoParser)
[2023-06-23 10:45:58,284] INFO App info kafka.server for 1 unregistered (org.apache.kafka.common.utils.AppInfoParser)
I've spent a few days on this puzzle and changed server.properties several times, but I am now out of ideas. I would appreciate any help on this issue. (All machines can reach each other: ping and ssh work across machines.)
The installation and configuration steps for this Kafka version are simple. The machines run RHEL 8.4.
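Working ping and ssh show that the hosts are reachable, but not that the controller ports accept TCP connections. A minimal sketch of a port check in Python follows. The function names are illustrative, and the two endpoints are the ones that appear in the WARN lines of the log above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def report(endpoints):
    """Print one line per (host, port) pair with its connection state."""
    for host, port in endpoints:
        state = "open" if port_open(host, port) else "unreachable"
        print(f"{host}:{port} {state}")


# Controller endpoints of nodes 2 and 3 as they appear in the log.
CONTROLLER_ENDPOINTS = [("120.201.245.16", 19093), ("120.201.245.121", 19094)]
```

Calling report(CONTROLLER_ENDPOINTS) on node 1 while the other brokers are starting would show whether anything is listening on those ports. An "unreachable" result can mean a firewall, a listener bound to a different interface, or a listener on a different port.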
tar xzf kafka_2.13-3.5.0.tgz
cd /home/kafka_2.13-3.5.0/config/kraft
Note: we changed node.id in server.properties on each broker machine.
[root@kafka01 kraft]# KAFKA_CLUSTER_ID="$(bash /home/kafka_2.13-3.5.0/bin/kafka-storage.sh random-uuid)"
[root@kafka01 kraft]# echo $KAFKA_CLUSTER_ID
kx9G7XV3TOiagO4Fsu1FYQ
[root@kafka01 kraft]# bash /home/kafka_2.13-3.5.0/bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c /home/kafka_2.13-3.5.0/config/kraft/server.properties
Formatting /tmp/kraft-combined-logs with metadata.version 3.5-IV2.
[root@kafka01 kraft]# bash /home/kafka_2.13-3.5.0/bin/kafka-server-start.sh -daemon /home/kafka_2.13-3.5.0/config/kraft/server.properties
[root@kafka01 kraft]# ls -ltr /tmp/kraft-combined-logs
total 8
-rw-r--r-- 1 root root 86 Jun 23 11:07 meta.properties
-rw-r--r-- 1 root root 249 Jun 23 11:07 bootstrap.checkpoint
-rw-r--r-- 1 root root 0 Jun 23 11:07 recovery-point-offset-checkpoint
-rw-r--r-- 1 root root 0 Jun 23 11:07 log-start-offset-checkpoint
-rw-r--r-- 1 root root 0 Jun 23 11:07 replication-offset-checkpoint
drwxr-xr-x 2 root root 187 Jun 23 11:07 __cluster_metadata-0
[root@kafka01 kraft]# more /tmp/kraft-combined-logs/meta.properties
#
#Fri Jun 23 11:07:34 UTC 2023
cluster.id=kx9G7XV3TOiagO4Fsu1FYQ
version=1
node.id=1
[root@kafka01 kraft]#
references:
https://kafka.apache.org/35/documentation
https://kafka.apache.org/downloads
https://hevodata.com/learn/kafka-without-zookeeper
https://adityasridhar.com/posts/how-to-easily-install-kafka-without-zookeeper
I had a similar issue. As @OneCricketeer mentioned, each of the quorum members needs to be available.
In my case, I had:
I could ping kafka2 and kafka3 from kafka1 and vice versa, but occasionally I'd get java.net.UnknownHostException thrown.
In my Docker Compose file I needed to also set:
However, in your case you have the IPs set up, but with listeners=PLAINTEXT://:9092,CONTROLLER://:19092 the host part is empty, so each node listens only on the default interface.
So it's not the hostname. Perhaps you could try binding the listener ports to all interfaces rather than the default, for example by using 0.0.0.0 as the host in each listener.
Alternatively, you could use each node's own IP so that the node listens only on that interface.
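As a rough sketch of both options in Python: the helper name with_bind_host and the address 192.0.2.10 are illustrative only and not taken from your setup. The listeners value is the one from your server.properties.

```python
def with_bind_host(listeners, host):
    """Return the listeners value with every entry bound to the given host."""
    rebuilt = []
    for entry in listeners.split(","):
        name, address = entry.strip().split("://", 1)
        port = address.rsplit(":", 1)[1]
        rebuilt.append(f"{name}://{host}:{port}")
    return ",".join(rebuilt)


posted = "PLAINTEXT://:9092,CONTROLLER://:19092"

# Option 1: bind to all interfaces.
print("listeners=" + with_bind_host(posted, "0.0.0.0"))

# Option 2: bind to one node's own address (hypothetical IP for illustration).
print("listeners=" + with_bind_host(posted, "192.0.2.10"))
```

The first line printed is listeners=PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:19092. This only changes where the broker binds; advertised.listeners is a separate setting and is left untouched here.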
Reference: https://kafka.apache.org/35/documentation/#brokerconfigs_listeners