In Pyathena can you insert multiple values(Dates) into a SQL parameter?

391 Views Asked by At

I create a dataframe in a Jupyter Notebook instance (AWS Sagemaker) by connecting to an AWS Athena table using a SQL connection like the example below:

I have made a paramter per the link (https://pypi.org/project/pyathena/#query-with-parameter). The parameter is allowing me to insert a single month into the query but fails if I try multiple months - can someone please assist me with how you can insert multiple months into the dictionary :

from pandasql import sqldf
from pyathena import connect
from pyathena.pandas.cursor import PandasCursor

curr_month = 202206
prev_month = 202203

conn = connect(work_group='prod_user',
               region_name= 'eu-west-1')

sql_query = ('''
select 
year_month,
id,
field_1,
field_2,
field_3
from
mart.table_xyz
WHERE
year_month = (%(param)s)
;
''')

cursor = connect(work_group='prod_user',
               region_name= 'eu-west-1',
                 cursor_class=PandasCursor).cursor()
user_df = cursor.execute(sql_query,{"param": curr_month}).as_pandas()

I have tried a list like below to return two months but that causes an error. Any help would be appreciated

cursor = connect(work_group='prod_user',
               region_name= 'eu-west-1',
                 cursor_class=PandasCursor).cursor()
user_df = cursor.execute(sql_query,{"param": [curr_month,prev_month]}).as_pandas()

1

There are 1 best solutions below

4
Guru Stron On

Try changing the query to use the in operator (not sure about actual python code, but the sql query should look something like where year_month in (202206, 202203)):

WHERE year_month in (%(param)s)

And param can be something like {"param": ', '.join([curr_month, prev_month])}