How to set up jar configs in Databricks for Redis connections


I have installed the following jar in Databricks: "com.redislabs:spark-redis_2.12:2.5.0". I am trying to create a Spark session with the corresponding credentials.

Below is the code where I create the Spark session with the credentials:

redis= SparkSession.builder.appName("redis_connection").config("spark.redis.host", "hostname").config("spark.redis.port", "port").config("spark.redis.auth", "pass").getOrCreate()

But when I try to save a DataFrame using the following code:

df.write.format("org.apache.spark.sql.redis").option("table", "velocity").option("key.column", "name").option("ttl", 30).save()

it throws the following error:

Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Failed connecting to host localhost:6379

It is obviously trying to connect to localhost rather than the hostname I provided. How do I pass the jar configuration with the hostname and passphrase in Databricks so that the connection is validated?


There are 2 best solutions below


Most likely Databricks picks up the wrong Spark session, one that doesn't have the config parameters set. A Databricks notebook already has a running session, so getOrCreate() probably returns that one instead of building a new one from your settings. You may try two options:

  1. Set spark.redis.host, spark.redis.port and spark.redis.auth in the Databricks cluster configuration: go to Cluster -> Edit -> Advanced Options -> Spark -> Spark Config.
  2. Set the options on the implicitly created Spark session with spark.conf.set("spark.redis.host", "host"), and do the same for the other parameters.
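
For option 2, here is a rough Python sketch of what I mean. The helper names set_redis_conf and missing_redis_conf are made up for illustration, and `spark` is assumed to be the session Databricks already provides in the notebook. I have not verified that every spark-redis version reads values set at runtime this way; if it still connects to localhost, fall back to option 1.

```python
# Keys named in the question; the helpers below are hypothetical.
REDIS_CONF_KEYS = ("spark.redis.host", "spark.redis.port", "spark.redis.auth")


def set_redis_conf(spark, host, port, auth):
    """Set the Redis connection settings on an already existing session."""
    values = {
        "spark.redis.host": host,
        "spark.redis.port": str(port),  # conf values are stored as strings
        "spark.redis.auth": auth,
    }
    for key, value in values.items():
        spark.conf.set(key, value)
    return values


def missing_redis_conf(spark):
    """Return the Redis keys that are not set on the session."""
    return [key for key in REDIS_CONF_KEYS if spark.conf.get(key, None) is None]
```

In a notebook you would call set_redis_conf(spark, "hostname", 6379, "pass") before the write, and missing_redis_conf(spark) should then come back empty.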

I was getting the same error while ingesting data into Redis through Spark with a similar configuration. I passed host, port and auth as write options instead of the spark.redis.* session settings, and this worked for me:

import org.apache.spark.sql.SaveMode
import scala.collection.mutable.HashMap

// Connection settings passed as per-write options instead of spark.redis.* session config
def getRedisClusterProperties(): HashMap[String,String] = {
    val properties = new HashMap[String,String]
    properties.put("host","<host>")
    properties.put("port","6379")
    properties.put("auth","<auth>")
    properties
}

df.write.mode(SaveMode.Overwrite).format("org.apache.spark.sql.redis").options(getRedisClusterProperties()).option("table","<table_name>").option("key.column","<column_name>").save()
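
Since the question uses PySpark, the same idea might look roughly like this in Python. This is only a sketch: redis_write_options and write_to_redis are hypothetical helper names, the placeholders are kept as in the Scala version, and `df` is assumed to be the DataFrame you want to save.

```python
def redis_write_options(host, port, auth):
    """Build the per-write connection options (host, port, auth)."""
    return {"host": host, "port": str(port), "auth": auth}


def write_to_redis(df, table, key_column, conn_opts, mode="overwrite"):
    """Write df to Redis, passing the connection settings on the writer itself."""
    (
        df.write.format("org.apache.spark.sql.redis")
        .mode(mode)
        .options(**conn_opts)  # host/port/auth supplied per write
        .option("table", table)
        .option("key.column", key_column)
        .save()
    )
```

For example: write_to_redis(df, "<table_name>", "<column_name>", redis_write_options("<host>", 6379, "<auth>")). Because the settings travel with the write itself, it does not matter which session the notebook ends up using.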