Athena Connection with R

2.3k Views Asked by At

I am new to Athena. I want to connect this with R

Sys.getenv()
    URL <- 'https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC42_2.0.14.jar'
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)

drv <- JDBC(driverClass="com.simba.athena.jdbc.Driver", fil, identifier.quote="'")

This is the error message

Error in .jfindClass(as.character(driverClass)[1]) : 
  java.lang.ClassNotFoundException

Referred this article https://aws.amazon.com/blogs/big-data/running-r-on-amazon-athena/

   con <- jdbcConnection <- dbConnect(drv, 'jdbc:awsathena://athena.ap-south-1.amazonaws.com:443/',
                                   s3_staging_dir="s3://aws-athena-query-results-ap-south-1-region/",
                                   user=("xxx"),
                                   password=("xxx"))  

Need help really struggling from two days

Thanks in advance. I downloaded jar files and java.

2

There are 2 best solutions below

2
On BEST ANSWER

You are using a newer driver version and the driver is now developed by simba and therefore the driver class name has changed.

The driver class is now com.simba.athena.jdbc.Driver.

You may also want to check out AWR.Athena - A nice R package to interact with Athena.

0
On

If you are still having difficulty with the JDBC drivers for Athena you could always try: RAthena or noctua. These two packages opt in using AWS SDK to make the connection to AWS Athena.

RAthena uses Python boto3 sdk (similar to pyathena), through reticulate.

noctua uses R paws sdk.

Code Example:

library(DBI)

# connect to AWS
# using ~/.aws/credentials to store aws credentials
con <- dbConnect(RAthena::athena(),
                 s3_staging_dir = "s3://mybucket/")

# upload some data into aws athena
dbWriteTable(con, "iris", iris)

# query iris in aws athena
dbGetQuery(con, "select * from iris")

NOTE: noctua works extactly the same way as code example above but instead the driver is: noctua::athena()