How to read all parquet files from S3 using awswrangler in python

I need to read all the files with the .parquet extension:

import awswrangler as wr

s3_path = "s3://buckte/table/files.parquet"

df = wr.s3.read_parquet(
    path=[s3_path]
)

but I still get an error:

Error occurred (404) when calling the HeadObject

There are 2 answers below.

Answer 1:

The trick is to pass the S3 prefix as a single string for path and to filter the files with path_suffix:

s3_path = "s3://buckte/table"

df = wr.s3.read_parquet(
    path=s3_path,
    path_suffix=".snappy.parquet",
    use_threads=True
)
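
A related option that this answer does not mention is awswrangler's dataset=True flag on read_parquet, which reads every parquet file under the prefix and resolves Hive-style partition folders into DataFrame columns. A minimal sketch, reusing the prefix from the question (adjust the bucket and path to your data):

import awswrangler as wr

# Read everything under the prefix as a single parquet dataset;
# dataset=True also turns Hive-style partitions (e.g. year=2024/) into columns.
df = wr.s3.read_parquet(
    path="s3://buckte/table/",
    dataset=True,
    use_threads=True
)
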
Answer 2:

You are getting this error because the file you are trying to read was not found, or because the location you are trying to read from doesn't exist.
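
A quick way to check what actually exists under a prefix before reading is to list the keys. A minimal sketch using wr.s3.list_objects (the bucket and prefix are placeholders):

import awswrangler as wr

# List the object keys under the prefix to confirm they exist
# and to see the exact file extensions (.parquet vs .snappy.parquet).
keys = wr.s3.list_objects("s3://bucket/table/")
print(keys)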

You can either specify the exact (and correct) location of the file you want to access, or, if you want to read all the parquet files from a folder, specify just the folder and filter by extension (e.g. ".parquet" or ".snappy.parquet") through the path_suffix argument.

The following code reads all the parquet files within the folder 'table':

df = wr.s3.read_parquet(
    path="s3://bucket/table/",
    path_suffix=".parquet"
)

If you want to read all the parquet files in the entire bucket, point path at the bucket root:

df = wr.s3.read_parquet(
    path="s3://bucket/",
    path_suffix=".parquet"
)
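
Reading a whole bucket can pull in a lot of data. A hedged sketch of two ways to limit the load, assuming the columns and chunked parameters of wr.s3.read_parquet (the column names are hypothetical):

import awswrangler as wr

# Load only the columns you need; "id" and "value" are hypothetical names.
df = wr.s3.read_parquet(
    path="s3://bucket/",
    path_suffix=".parquet",
    columns=["id", "value"]
)

# Or iterate over the data in chunks instead of one large DataFrame.
for chunk in wr.s3.read_parquet(
    path="s3://bucket/",
    path_suffix=".parquet",
    chunked=True
):
    print(len(chunk))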