We have an instance of Kafka Connect running on a self-hosted node that connects to AWS MSK over SASL_SSL. I am running into an issue where the SQL Server connector runs out of memory, and the underlying error appears to be an auth failure for the internal schema history producer/consumer; in another environment where Kafka is deployed with no auth, this works fine.
We have added the settings below to connect-distributed.properties, but I don't see them actually being picked up; the config values in the log still show PLAINTEXT. The only way to get the connector working is to pass these values in the connector create API call, after which the connector works fine, prints the correct values in the consumer and producer configs, and is able to establish the schema history connection.
Auth succeeds for the topic the request comes in for (verified in the config logs that the SASL-related properties are set properly) but fails for the internal schema history client.
# existing config
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';
producer.security.protocol=SASL_SSL
producer.sasl.mechanism=SCRAM-SHA-512
producer.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';
ssl.truststore.type=PEM
producer.ssl.truststore.type=PEM
ssl.truststore.location=/etc/pki/tls/certs/ca-bundle.crt
producer.ssl.truststore.location=/etc/pki/tls/certs/ca-bundle.crt
# Newly added config
schema.history.internal.producer.security.protocol=SASL_SSL
schema.history.internal.producer.sasl.mechanism=SCRAM-SHA-512
schema.history.internal.producer.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';
schema.history.internal.consumer.security.protocol=SASL_SSL
schema.history.internal.consumer.sasl.mechanism=SCRAM-SHA-512
schema.history.internal.consumer.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';
Error log
INFO [Producer clientId=xxxxx-config-schemahistory] Node -2 disconnected. (org.apache.kafka.clients.NetworkClient:977)
INFO [Producer clientId=xxxxx-config-schemahistory] Cancelled in-flight API_VERSIONS request with correlation id 0 due to node -2 being disconnected (elapsed time since creation: 296ms, elapsed time since send: 296ms, request timeout: 30000ms) (org.apache.kafka.clients.NetworkClient:344)
WARN [Producer clientId=xxxxx-config-schemahistory] Bootstrap broker bbbb.kafka.us-east-1.amazonaws.com:9096 (id: -2 rack: null) disconnected (org.apache.kafka.clients.NetworkClient:1105)
INFO App info kafka.consumer for xxxxx-config-schemahistory unregistered (org.apache.kafka.common.utils.AppInfoParser:83)
ERROR WorkerSourceTask{id=xxxxx-config-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:212)
Along with this, it prints out the ProducerConfig and ConsumerConfig values:
[2024-02-27 06:22:15,095] INFO ProducerConfig values:
acks = 1
auto.include.jmx.reporter = true
batch.size = 32768
bootstrap.servers = [aaaa.kafka.us-east-1.amazonaws.com:9096, bbbb.kafka.us-east-1.amazonaws.com:9096, cccc.kafka.us-east-1.amazonaws.com:9096]
buffer.memory = 1048576
client.dns.lookup = use_all_dns_ips
client.id = xxxxx-config-schemahistory
***removed***
sasl.jaas.config = null
***removed***
sasl.mechanism = GSSAPI
***removed***
security.protocol = PLAINTEXT
security.providers = null
***removed***
(org.apache.kafka.clients.producer.ProducerConfig:370)
[2024-02-27 06:22:15,100] INFO ConsumerConfig values:
***removed***
bootstrap.servers = [aaaa.kafka.us-east-1.amazonaws.com:9096, bbbb.kafka.us-east-1.amazonaws.com:9096, cccc.kafka.us-east-1.amazonaws.com:9096]
check.crcs = true
client.dns.lookup = use_all_dns_ips
client.id = xxxxx-config-schemahistory
***removed***
fetch.min.bytes = 1
group.id = xxxxx-config-schemahistory
***removed***
sasl.jaas.config = null
***removed***
sasl.mechanism = GSSAPI
***removed***
security.protocol = PLAINTEXT
security.providers = null
***removed***
(org.apache.kafka.clients.consumer.ConsumerConfig:370)
TLDR
- The schema history producer/consumer does not work against the SASL-secured managed Kafka instance.
- On passing the config in the connector create API call, everything works.
- Does not work when setting the same values in connect-distributed.properties.
- Is there no config file where these settings can be added? The apprehension towards adding them in the create call is to avoid persisting the Kafka credentials on the client.
- The connection to the connector's topic is fine; the only issue is with the internal schema history client, which does not pick up the right auth-related config.
Short answer: connector configs should be defined in the connector itself, and in your case should be provided as part of the POST REST API call that submits the connector to the Kafka Connect cluster. Defining schema.history.* at the worker level won't be picked up by a connector's producers/consumers. What you actually need is a way to hide the credentials.
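For reference, a rough sketch of what those overrides look like in the connector create call; the connector name and credentials are placeholders, and all non-auth settings (database connection, topics, truststore, etc.) are omitted:
POST /connectors (Kafka Connect REST API)
{
  "name": "sqlserver-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "schema.history.internal.kafka.bootstrap.servers": "aaaa.kafka.us-east-1.amazonaws.com:9096",
    "schema.history.internal.producer.security.protocol": "SASL_SSL",
    "schema.history.internal.producer.sasl.mechanism": "SCRAM-SHA-512",
    "schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';",
    "schema.history.internal.consumer.security.protocol": "SASL_SSL",
    "schema.history.internal.consumer.sasl.mechanism": "SCRAM-SHA-512",
    "schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username='user' password='pass';"
  }
}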
Solution
You will need to use config providers, which let you reference credentials through placeholder tokens that are resolved at runtime from a file, the environment, or an external service.
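Config providers are declared once at the worker level. A minimal sketch for the built-in file provider, where the alias file is arbitrary and only has to match the placeholders used later in the connector config:
# connect-distributed.properties (worker config)
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider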
Out of the box, Kafka (and Kafka Connect as well) supports the file config provider natively. If we assume that you can place a credentials file on the Kafka Connect node at /etc/connect-credentials.properties with the content shown in the first block below, then in your connector properties (via the POST API call) you reference its keys as shown in the second block:
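A minimal sketch, assuming the key names sasl_username and sasl_password (any names work as long as the placeholders match) and the file provider alias declared in the worker config above:
# /etc/connect-credentials.properties
sasl_username=user
sasl_password=pass
# connector config (inside the POST body), with the credentials replaced by placeholders
"schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username='${file:/etc/connect-credentials.properties:sasl_username}' password='${file:/etc/connect-credentials.properties:sasl_password}';",
"schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username='${file:/etc/connect-credentials.properties:sasl_username}' password='${file:/etc/connect-credentials.properties:sasl_password}';"
The placeholders are resolved on the worker at runtime, so the connector config you submit (and what Connect persists in its config topic) never contains the plain credentials.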
Alternatively, you can integrate your credentials with AWS Secrets Manager or AWS Systems Manager Parameter Store using custom config providers, as explained in the MSK Connect documentation or the GitHub project.
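Purely as a sketch: the aws-samples msk-config-providers project, for example, ships a Secrets Manager provider. The class name, alias, placeholder syntax, secret name, and secret keys below are assumptions and should be checked against that project's README:
# connect-distributed.properties (worker config) – provider class name as documented in msk-config-providers
config.providers=secretsmanager
config.providers.secretsmanager.class=com.amazonaws.kafka.config.providers.SecretsManagerConfigProvider
# connector config – assumes a secret named kafka/scram-credentials with keys username and password
"schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username='${secretsmanager:kafka/scram-credentials:username}' password='${secretsmanager:kafka/scram-credentials:password}';"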
Pay attention to what needs to be defined in the worker config (connect-distributed.properties) versus the connector config: the config.providers declaration belongs in the worker config, while the schema.history.internal.* overrides and the credential placeholders go in the connector config.