How do I connect to AWS Athena in R using paws via the CLI v2 sso?

153 Views Asked by At

I am using AWS CLI v2 to create a configuration file and login to AWS via sso on my computer. I am then trying to access the data stored in our database on AWS Athena through R using the paws package.

I think I have the configuration file setup correctly, but when I try to list the data catalogs in Athena, I get an error message: "Error loading SSO Token: Token for myprof does not exist"

In AWS CLI v2 (on Windows), I run aws configure sso and setup my config file as described in the AWS documentation on configuring SSO.

After doing that, my config file looks like this:

[profile myprof]
sso_session = myprof
sso_account_id = 123456789012
sso_role_name = AWSAdministratorAccess
region = us-east-2
output = json
[sso-session myprof]
sso_start_url = https://mycompany.awsapps.com/start#/
sso_region = us-east-2
sso_registration_scopes = sso:account:access

In the process, my web browser opens up and I confirm the code that is shown in the command line.

In R, I have this code, as described in the paws documentation on Using AWS Single Sign On and connecting to Athena:

library(paws)
Sys.setenv(AWS_PROFILE = 'myprof',AWS_REGION = 'us-east-2')
svc = athena()
svc$list_data_catalogs(options("paws.log_level" = 3L))

When I run this, I get this result:

INFO [2023-11-10 08:20:45.468]: Unable to locate credentials file
Error: Error loading SSO Token: Token for myprof does not exist

From what I can understand from the paws documentation, I don't need a credentials file if I am using sso (unlike this question). Wondering if there is something I am doing wrong, or if there is a setting I need to check to get this to work? Not sure if the problem is in something I am doing, or with paws or the IAM settings.

2

There are 2 best solutions below

0
On BEST ANSWER

I found that the problem was in my system environment settings.

I used the traceback() function to see where the Error: Error loading SSO Token: Token for myprof does not exist was coming from, and then found that it was coming from the sso_credential_process function in paws. I was able to find the details of that function here.

That function references the system environment variables HOMEDRIVE and HOMEPATH. Mine were not set correctly to direct the function to the sso cache, hence it not finding the credentials. Once I updated those using Sys.setenv(HOMEDRIVE = 'C:',HOMEPATH = 'Users/MYPROF'), everything started working correctly.

1
On

When using SSO you need to get your temporary credentials using aws cli.

aws sso login --profile myprof

Note: this is similar to how boto3 works: How to use the AWS Python SDK while connecting via SSO credentials

library(paws)

client <- athena(config(credentials(profile = "myprof")))
client$list_data_catalogs()

Note: you don't need to set environmental variables as you have set up your .aws/config file. You can simply set the profile in the athena connection.

Alternatively you could use noctua which is built onto of paws to give a DBI interface.

library(DBI)
library(noctua)

con <- dbConnect(noctua::athena(), profile_name = "myprof", s3_staging_dir = "s3://mybucket/athena-query")

dbListTables(con)