I am working on an AWS MSK consumer project and want to develop and test my application locally before deploying it to EMR. However, I don't know how to configure my local machine to consume from my MSK Kafka cluster, which is in a VPC.
I tried reading by subscribing to the Kafka brokers and the particular topic.
streaming_query = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "<kafka-brokers>").option("subscribe", input_topic).load()
Your code is correct. MSK exposes several bootstrap broker addresses and ports, and which one you should use depends on where the code is run (inside the VPC or outside it) and on the authentication method.
https://docs.aws.amazon.com/msk/latest/developerguide/client-access.html
I suggest adding a CLI flag, enum, or boolean variable that indicates where the code is running (locally or within the VPC), and choosing the broker addresses based on it.
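As a minimal sketch of that suggestion, the snippet below picks the bootstrap servers from a command-line flag. The flag names (--run-env, --input-topic), the helper functions, and the placeholder broker strings are all hypothetical; replace the placeholders with the bootstrap broker strings listed for your own cluster. Any security settings your cluster requires (TLS, SASL, IAM) are not shown here.

```python
import argparse

# Hypothetical placeholders: substitute the bootstrap broker strings
# that apply to your cluster for each environment.
BOOTSTRAP_SERVERS = {
    "local": "<kafka-brokers-reachable-from-local>",
    "vpc": "<kafka-brokers-inside-vpc>",
}


def parse_args(argv=None):
    """Parse the (assumed) flags that say where the job is running."""
    parser = argparse.ArgumentParser(description="MSK consumer")
    parser.add_argument(
        "--run-env",
        choices=sorted(BOOTSTRAP_SERVERS),
        default="vpc",
        help="where the code runs: on a local machine or within the VPC",
    )
    parser.add_argument("--input-topic", required=True)
    return parser.parse_args(argv)


def kafka_options(run_env, input_topic):
    """Build the Kafka source options for the chosen environment."""
    return {
        "kafka.bootstrap.servers": BOOTSTRAP_SERVERS[run_env],
        "subscribe": input_topic,
    }


def build_stream(spark, run_env, input_topic):
    """Create the streaming DataFrame from an existing SparkSession."""
    options = kafka_options(run_env, input_topic)
    return spark.readStream.format("kafka").options(**options).load()
```

With an existing SparkSession named spark, the job could then call build_stream(spark, args.run_env, args.input_topic) after args = parse_args(), and be launched with --run-env local during development and --run-env vpc on EMR.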