awswrangler.s3.to_parquet arguments question


If I have the following code:

import awswrangler as wr

# df = some DataFrame with year, date and other columns

wr.s3.to_parquet(
    df=df,
    path='s3://some/path/',
    index=False,
    dataset=True,
    mode="append",
    partition_cols=['year', 'date'],
    database='series_data',
    table='signal_data'
)

What exactly happens when database and table are specified? I know that the table will be created if it does not already exist, but does this run a Glue Crawler or some similar process?

Should I pass database and table only the first time I run this code, or can I leave them in on every run? Will that trigger any Crawlers or other processes that could incur additional AWS charges?

For example, if a new partition appears (a new date), how will the table pick up the new partition? Usually this happens when a Glue Crawler is run to discover new partitions.
