My ultimate goal is to have an AWS Lambda function triggered from Kafka topics, where Kafka is an MSK cluster running in another AWS account.
Setup. The Lambda function and the MSK cluster live in different AWS accounts. Each is connected to its own VPC, i.e. a private subnet (egress via NAT gateway), and is firewalled by a security group (I made them allow-all for the sake of the experiment).
Immediate problem. I cannot use the "MSK trigger" for the Lambda because it requires the ARN of the MSK cluster, and that is not possible here: the MSK cluster lives in another account and cannot be referenced in the context of a Lambda trigger.
Problem that I'm trying to solve. I am trying to use the "Kafka trigger" instead, which needs a bootstrap server (I have it), a topic name (I have it), a batch size, and a starting position (not a problem). The problem is with the second group of options, which handles authentication of the Lambda trigger against the MSK cluster. It can be either 1) a network-based setup in the form of a VPC/subnet/security-group combination of the MSK cluster, or 2) a secret-based setup in the form of a SASL/xxx configuration.
The former, i.e. the network-based setup, cannot be used because it requires the VPC and other parameters of the Kafka cluster, i.e. its subnet(s) and security group(s), which are not available in the account where the Lambda is being configured.
The latter requires one of the SASL methods, i.e. PLAIN, SCRAM512, or SCRAM256, paired with a reference to a Secrets Manager record storing a username/password combination. I chose this method as the only one (in my opinion) theoretically possible for cross-account communication.
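I configured this through the console, but for reference here is a minimal CLI sketch of that secret-based variant of the trigger; the function name, topic, bootstrap hostname, region, account ID and secret ARN are all placeholders, not values from my setup:

    # Self-managed Kafka trigger authenticated with SASL/SCRAM-512 credentials
    # stored in Secrets Manager (all identifiers below are placeholders).
    aws lambda create-event-source-mapping \
      --function-name my-consumer-fn \
      --topics my-topic \
      --batch-size 100 \
      --starting-position LATEST \
      --self-managed-event-source '{"Endpoints":{"KAFKA_BOOTSTRAP_SERVERS":["b-1.mycluster.abcd12.kafka.eu-west-1.amazonaws.com:9096"]}}' \
      --source-access-configurations '[{"Type":"SASL_SCRAM_512_AUTH","URI":"arn:aws:secretsmanager:eu-west-1:111111111111:secret:my-kafka-credentials"}]'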
What I did:
- On the Lambda account I created a Secrets Manager record with two keys, username and password, holding specific values (see the CLI sketch after this list).
- On the Kafka account I enabled SASL/SCRAM authentication for the MSK cluster. This required setting up a Secrets Manager record, which I did, using the same credentials as in the secret on the Lambda account. After that I can see three options for the bootstrap server: <hostname>:9092 (plaintext), <hostname>:9094 (TLS), and <hostname>:9096 (SASL/SCRAM).
- I used the bootstrap server spec with port 9096 in the Kafka trigger setup for my Lambda function.
- In order to grant cross-account connectivity between the subnets of Lambda and Kafka, I set up a VPC peering connection with corresponding rules in the routing tables (also sketched after this list).
- I tested connectivity between those subnets by spinning up an EC2 instance in the Lambda subnet and running

      nmap -Pn -p 9092,9094,9096 <bootstrap server hostname>

  which returned the following:

      PORT     STATE SERVICE
      9092/tcp open  unknown
      9094/tcp open  unknown
      9096/tcp open  unknown
      Nmap done: 1 IP address (1 host up) scanned in 0.03 seconds
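For completeness, this is roughly how the two secrets and the SASL/SCRAM association can be expressed as CLI calls (I did parts of this through the console); all names, ARNs, KMS key IDs and credentials below are placeholders:

    # On the Kafka account: the secret used for MSK SASL/SCRAM.
    # MSK expects this secret to use the AmazonMSK_ prefix and a customer-managed KMS key.
    aws secretsmanager create-secret \
      --name AmazonMSK_my-cluster-credentials \
      --kms-key-id arn:aws:kms:eu-west-1:222222222222:key/11111111-2222-3333-4444-555555555555 \
      --secret-string '{"username":"my-user","password":"my-password"}'

    # Associate the secret with the MSK cluster to enable SASL/SCRAM on port 9096.
    aws kafka batch-associate-scram-secret \
      --cluster-arn arn:aws:kafka:eu-west-1:222222222222:cluster/my-cluster/... \
      --secret-arn-list arn:aws:secretsmanager:eu-west-1:222222222222:secret:AmazonMSK_my-cluster-credentials

    # On the Lambda account: the plain username/password secret referenced by the trigger.
    aws secretsmanager create-secret \
      --name my-kafka-credentials \
      --secret-string '{"username":"my-user","password":"my-password"}'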
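And the peering and routing part, again as a rough sketch with placeholder VPC/route-table/peering IDs and CIDR:

    # On the Lambda account: request peering with the Kafka account's VPC.
    aws ec2 create-vpc-peering-connection \
      --vpc-id vpc-0aaa1111bbbb2222c \
      --peer-vpc-id vpc-0ccc3333dddd4444e \
      --peer-owner-id 222222222222

    # On the Kafka account: accept the peering request.
    aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0123456789abcdef0

    # On each account: route the other VPC's CIDR through the peering connection.
    aws ec2 create-route \
      --route-table-id rtb-0aaa1111bbbb2222c \
      --destination-cidr-block 10.1.0.0/16 \
      --vpc-peering-connection-id pcx-0123456789abcdef0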
Outcomes. No matter what I try in terms of different combinations of bootstrap server/port and SASL method, I get this error on the Lambda trigger side: PROBLEM: Connection error. Please check your event source connection configuration.
I cannot find any way to debug this error scenario, as no other details are provided. Enabling CloudWatch logging on the MSK cluster didn't help, as it doesn't collect any useful information.
I had totally abandoned this topic, but I actually have a solution. It is not as comprehensive as one might want, but it works in our production setup.
The trick was to make both accounts able to communicate via a corresponding VPC peering setup (on each account) and then use the network-based option for the Lambda trigger authentication, but instead of using the security groups and subnets of MSK, use those of the Lambda itself. This lets the "Kafka" trigger on the Lambda side reach the bootstrap servers through the Lambda's own VPC.
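In CLI terms, the working trigger looks roughly like the sketch below; the subnets and security group are the ones from the Lambda account's own VPC, and every ID, name and hostname is a placeholder:

    # Self-managed Kafka trigger, "network-based" authentication pointing at the
    # Lambda VPC's own subnets and security group (placeholder IDs and hostname).
    aws lambda create-event-source-mapping \
      --function-name my-consumer-fn \
      --topics my-topic \
      --batch-size 100 \
      --starting-position LATEST \
      --self-managed-event-source '{"Endpoints":{"KAFKA_BOOTSTRAP_SERVERS":["b-1.mycluster.abcd12.kafka.eu-west-1.amazonaws.com:9092"]}}' \
      --source-access-configurations '[
          {"Type":"VPC_SUBNET","URI":"subnet:subnet-0aaa1111bbbb2222c"},
          {"Type":"VPC_SUBNET","URI":"subnet:subnet-0bbb2222cccc3333d"},
          {"Type":"VPC_SECURITY_GROUP","URI":"security_group:sg-0ccc3333dddd4444e"}
        ]'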
The downside of this technique is that it works for plaintext connections only. Chances are it will also work with encrypted traffic, but I didn't try it, as we rely on the isolated nature of those VPCs.