I have an AWS Lambda function (Python 3.8) with pyarrow 9.0.0 and s3fs bundled together in a layer.
The function reads multiple JSON files one by one and writes them out as a Parquet dataset, partitioned by year, month and day, to an S3 location.
When it runs, AWS reports "Calling the invoke API action failed with this message: Network Error", or, if I retry soon after, "Calling the invoke API action failed with this message: Rate Exceeded.".
The function is as follows:
import pyarrow
import pyarrow.parquet as pq
from pyarrow import fs

# The bucket does exist, but the partitioning folders do not always.
s3_fs = fs.S3FileSystem(region='eu-west-1', allow_bucket_creation=True)

pq.write_to_dataset(
    ddf,
    s3uri_parquet_dataset,  # "s3://" prefix removed prior to this call
    use_legacy_dataset=False,
    filesystem=s3_fs,
    compression="gzip",
    partition_cols=["partitionDateYear", "partitionDateMonth", "partitionDateDay"],
    basename_template=get_unique_name(),
)
I have a faint suspicion that this might have to do with multiprocessing's Pool not being available on AWS Lambda (there is no shared memory), but I have nothing to prove it, so it might just as well be something else; other people don't seem to have this problem.
One hint: when I comment out the partition_cols parameter, the Lambda function does its job and succeeds, but the outcome is then an unpartitioned dataset, which is undesired.
Please do offer some suggestions to try; I am fairly new to AWS and this is quite an undertaking!