I'm trying to use pyathena (which looks simpler than the native boto3 client) to run some queries. However, I wasn't able to find how to specify which data catalog to use. For example, here is a query execution using boto3:
import boto3

athena_client = boto3.client('athena')
start_execution_response = athena_client.start_query_execution(
    QueryString='SELECT * FROM test_table',
    QueryExecutionContext={
        'Database': 'default',
        'Catalog': 'AwsDataCatalog'
    },
    ResultConfiguration={
        'OutputLocation': 's3://some_bucket/query_results/'
    }
)
And using pyathena I can do something like this (and it worked):

from pyathena import connect

cursor = connect(s3_staging_dir='s3://some_bucket/query_results/',
                 schema_name="some_db",
                 ).cursor()
cursor.execute("SELECT * FROM table1")
But how does this work when I have multiple catalogs? Am I missing something?
Thanks, Nir.
I would assume you can also just write the query with a fully qualified table name, and reference different catalogs that way.
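Athena accepts fully qualified, double-quoted identifiers of the form "catalog"."database"."table", so the catalog can be selected per query rather than per connection. A minimal sketch of that idea, where some_catalog is a hypothetical second catalog name (the connect() call mirrors the one from the question and is left commented out, since it needs live AWS credentials):

from pyathena import connect

# Hypothetical names used only for illustration
catalog = "some_catalog"
database = "some_db"
table = "table1"

# Build a fully qualified table reference; double quotes let Athena
# resolve the catalog and database explicitly for this query.
query = f'SELECT * FROM "{catalog}"."{database}"."{table}"'

# cursor = connect(s3_staging_dir='s3://some_bucket/query_results/').cursor()
# cursor.execute(query)
print(query)

This way a single connection can query tables across catalogs just by changing the prefix in the SQL.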