AWS Wrangler writing wrong values in Parquet


NaN values are not being written to Parquet files as expected. I'm not sure whether this is a bug in the awswrangler to_parquet function.

Example:

Creating the DataFrames:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'], 'B': ['B0', 'B1', 'B2', 'B3']})

df2 = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'], 'D': ['D0', 'D1', 'D2', 'D3']})

df = pd.concat([df1, df2]).reset_index(drop=True)

df
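Before writing anything, it may help to confirm where the missing values sit in the concatenated frame. The snippet below is only a sketch that rebuilds the same data and counts NaNs per column; the name nan_counts is made up for illustration and is not part of the original code.

```python
import pandas as pd

# Same frames as in the question: df1 has columns A/B, df2 has columns C/D.
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'], 'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'], 'D': ['D0', 'D1', 'D2', 'D3']})

# Row-wise concat of frames with disjoint columns fills the gaps with NaN.
df = pd.concat([df1, df2]).reset_index(drop=True)

# nan_counts is a hypothetical helper name: missing values per column.
nan_counts = df.isna().sum()

print(df.dtypes)
print(nan_counts)
```

Because the two frames share no columns, every column ends up with four missing values in an eight-row frame, so any difference seen after reading the Parquet file back can be compared against this baseline.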


Parquet file:

import awswrangler as wr

wr.s3.to_parquet(df=df, path='file_path', dataset=True, index=False, compression='gzip', boto3_session=session)
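One way to narrow this down might be a local round trip that leaves S3 and awswrangler out of the picture. The sketch below uses plain pandas with the pyarrow engine (which, as far as I know, awswrangler also relies on) and an in-memory buffer. It assumes pyarrow is installed, and roundtrip_nan_mask_matches is a hypothetical helper name, not an awswrangler or pandas function.

```python
import importlib.util
import io

import pandas as pd


def roundtrip_nan_mask_matches(frame):
    """Write frame to gzip-compressed Parquet in memory, read it back,
    and report whether the missing-value positions are unchanged.

    Hypothetical helper for illustration; returns None if pyarrow is absent.
    """
    if importlib.util.find_spec("pyarrow") is None:
        return None  # assumption: pyarrow is needed for this check
    buf = io.BytesIO()
    frame.to_parquet(buf, engine="pyarrow", compression="gzip", index=False)
    buf.seek(0)
    restored = pd.read_parquet(buf, engine="pyarrow")
    # isna() treats both NaN and None as missing, so the masks are comparable.
    return bool(frame.isna().equals(restored.isna()))


df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'], 'B': ['B0', 'B1', 'B2', 'B3']})
df2 = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'], 'D': ['D0', 'D1', 'D2', 'D3']})
df = pd.concat([df1, df2]).reset_index(drop=True)

result = roundtrip_nan_mask_matches(df)
print(result)
```

If the masks match locally but the file written to S3 still looks wrong, the difference is more likely to come from how the file is being read or displayed than from the missing values themselves.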

