I'm trying to create a new Kafka cluster using the new KRaft mode and using SSL certificates for authentication and authorization (along with the new StandardAuthorizer). I'm using the Bitnami Kafka image, version 3.3.1.
This is the relevant part of the error logs I'm getting:
[2022-12-02 14:18:44,159] ERROR [RaftManager nodeId=0] Unexpected error UNKNOWN_SERVER_ERROR in VOTE response: InboundResponse(correlationId=2596, data=VoteResponseData(errorCode=-1, topics=[]), sourceId=2) (org.apache.kafka.raft.KafkaRaftClient)
[2022-12-02 14:18:44,161] ERROR [RaftManager nodeId=0] Unexpected error UNKNOWN_SERVER_ERROR in VOTE response: InboundResponse(correlationId=2597, data=VoteResponseData(errorCode=-1, topics=[]), sourceId=1) (org.apache.kafka.raft.KafkaRaftClient)
[2022-12-02 14:18:44,162] ERROR [ControllerApis nodeId=0] Unexpected error handling request RequestHeader(apiKey=VOTE, apiVersion=0, clientId=raft-client-1, correlationId=2707) -- VoteRequestData(clusterId='sgasewhyweywqey4qeyefd', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=43088, candidateId=1, lastOffsetEpoch=5, lastOffset=139622)])]) with context RequestContext(header=RequestHeader(apiKey=VOTE, apiVersion=0, clientId=raft-client-1, correlationId=2707), connectionId='172.17.0.5:39093-10.82.8.117:33050-0', clientAddress=/10.82.8.117, principal=User:ANONYMOUS, listenerName=ListenerName(CONTROLLER), securityProtocol=SSL, clientInformation=ClientInformation(softwareName=apache-kafka-java, softwareVersion=3.3.1), fromPrivilegedListener=false, principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@7ccf149f]) (kafka.server.ControllerApis)
org.apache.kafka.common.errors.AuthorizerNotReadyException
According to the Bootstrapping and Early Start Listeners sections of KIP-801, the problem of bootstrapping the controller quorum should be solved by putting the broker and controller users in super.users, which I think I'm doing.
However, the logs suggest the broker/controller is not assuming the right identity, since they show principal=User:ANONYMOUS. It seems that controller connections are not presenting the right SSL certificates. Is this setup currently supported by KRaft? Is there some new configuration I'm missing for inter-controller communication?
This is what I think is the most relevant part of my config (partial config file):
broker.id=2
listeners=SSL://:39092,CONTROLLER://:39093
advertised.listeners=SSL://<broker-2>:39092
listener.security.protocol.map=SSL:SSL,CONTROLLER:SSL
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
controller.listener.names=CONTROLLER
controller.quorum.voters=0@<broker-0>:39093,1@<broker-1>:39093,2@<broker-2>:39093
early.start.listeners=CONTROLLER
node.id=2
process.roles=broker,controller
security.inter.broker.protocol=SSL
ssl.keystore.location=/bitnami/kafka/config/certs/kafka.keystore.jks
ssl.keystore.password=<password>
ssl.key.password=<password>
ssl.truststore.location=/bitnami/kafka/config/certs/kafka.truststore.jks
ssl.truststore.password=<password>
super.users=User:CN=<kafka-broker>,O=<org>,C=<country>
Additional notes:
- Before enabling authorization, it was working fine with KRaft, with SSL used just for encryption and plaintext communication between controllers.
- My cluster is composed of 3 Kafka brokers that also act as controllers. They run in containers, each container on a different host.
- The 3 brokers use the same server certificate, but each broker hostname is configured as an alt name in the certificate.
I also faced this issue, and in my case the problem came down to a missing configuration for making brokers and controllers communicate over TLS.
You need to set the config key
listener.name.<Controller ssl listener>.ssl.client.auth=required
which is notably missing from the shared config. So, with CONTROLLER being the listener above, you need to add
listener.name.controller.ssl.client.auth=required
With this set, the principal resolved correctly instead of showing as ANONYMOUS, and I was able to successfully stand up a KRaft cluster on 3.3.1 with TLS between brokers and controllers.
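For context, here is a rough sketch of how that line could sit alongside the settings from the question. As far as I understand it, ssl.client.auth defaults to none, so an SSL listener never asks the connecting node for a certificate and every connection shows up as User:ANONYMOUS, which would explain the log above. The listener names, ports and placeholders are copied from the question; only the listener.name.controller line is new, so treat the rest as an assumption about how the full file looks.

```properties
# Sketch only, based on the partial config in the question.
listeners=SSL://:39092,CONTROLLER://:39093
listener.security.protocol.map=SSL:SSL,CONTROLLER:SSL
controller.listener.names=CONTROLLER
early.start.listeners=CONTROLLER

# Require a client certificate on the CONTROLLER listener so that peers are
# identified by their certificate DN instead of User:ANONYMOUS.
# The listener name is written in lowercase in the listener.name.<name>. prefix.
listener.name.controller.ssl.client.auth=required

# Assumption: this must match the subject DN of the certificate the nodes
# present, since that DN becomes the principal the authorizer checks.
super.users=User:CN=<kafka-broker>,O=<org>,C=<country>
```

If client certificates should be required on every SSL listener rather than only the controller one, the unprefixed ssl.client.auth=required is probably the simpler option, but I have not tested that variant on this setup.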
Interestingly, this does not seem to be required on 3.2.0 for KRaft to function. Perhaps something changed in the StandardAuthorizer between the versions, but I have not checked yet.