I cannot connect my cloud Kafka to Databricks Community Edition's Spark cluster


1- I have a Spark cluster on Databricks Community Edition and a Kafka instance on GCP.

2- I just want to ingest the Kafka stream from Databricks Community Edition and analyze the data in Spark.

Kafka and NiFi's external IP

3- This is my connection code.

val UsYoutubeDf = 
       spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "XXX.XXX.115.52:9092")
      .option("subscribe", "usyoutube")
      .load()

As mentioned, my data is arriving in Kafka. I added the spark.driver.host address to the firewall settings; otherwise I cannot ping my Kafka machine from the Databricks cluster.

My temporary Spark machine's IP
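A successful ping only shows that ICMP traffic gets through; it does not show that the Kafka port itself is reachable. As a minimal sketch (the helper name `canConnect` and the 3-second timeout are my own choices for illustration, not part of any Spark or Kafka API), a plain TCP check like this could be run in a notebook cell on the cluster:

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper: returns true if a TCP connection to host:port
// can be opened within timeoutMs milliseconds, false otherwise.
def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
  val socket = new Socket()
  try {
    // connect() throws on refusal, timeout, or an unresolvable host;
    // Try turns any of those failures into `false`.
    Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
  } finally {
    socket.close()
  }
}
```

Calling `canConnect("XXX.XXX.115.52", 9092)` with the broker's real external IP in place of the masked one should return `true` if the firewall rule lets the cluster reach the broker port. If it returns `false`, the problem is probably at the network level rather than in the streaming code.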

import org.apache.spark.sql.streaming.Trigger.ProcessingTime
 
val sortedModelCountQuery = sortedyouTubeSchemaSumDf
                          .writeStream
                          .outputMode("complete")
                          .format("console")
                          .option("truncate","false")
                          .trigger(ProcessingTime("5 seconds"))
                          .start()

After this step, the data does not arrive in Spark on my cluster

import org.apache.spark.sql.streaming.Trigger.ProcessingTime
sortedModelCountQuery: org.apache.spark.sql.streaming.StreamingQuery = org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@3bd8a775

It stays like this. The data is actually arriving in Kafka, but the analysis code I wrote does not work here.
