I'm trying to load a dataset, write it as Parquet, and save it to my S3 bucket. When I do this, the save step automatically tries to convert my columns to int or double. For example, I have a column named ventas_deals.person_phone_value that stores values like '571234567890'. The error is:
ParquetDataSet(filepath=analytics-datalake-prod-primary-s3bucket/datasets_kedro/menu_property_matching_match_inventory/data.parquet, load_args={}, protocol=s3, save_args={}). ("Could not convert '573223131490' with type str: tried to convert to double", 'Conversion failed for column ventas_deals.person_phone_value with type object')
I tried controlling the types myself as follows:
datalake_out_priorization_deals['ventas_deals.person_phone_value'] = datalake_out_priorization_deals['ventas_deals.person_phone_value'].astype('str')
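In isolation the cast itself seems to do what I expect. Here is a minimal, self-contained sketch of what I mean (the values are invented; my guess is that the real column holds a mix of Python types, which is why pyarrow falls back to double):

```python
import pandas as pd

# Toy frame mimicking the real data; the values are invented.
df = pd.DataFrame(
    {"ventas_deals.person_phone_value": [571234567890, "573223131490", None]}
)

# Mixed int/str/None values leave the column as dtype 'object',
# so pyarrow has to guess a single type at save time.
print(df["ventas_deals.person_phone_value"].map(type).nunique())  # 3

# Force everything to str before the save, as in my pipeline.
df["ventas_deals.person_phone_value"] = (
    df["ventas_deals.person_phone_value"].astype("str")
)
print(df["ventas_deals.person_phone_value"].map(type).nunique())  # 1
```

After the cast every cell is a plain Python str, yet the save still fails with the error above.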
I do this just before the dataset is handed to the catalog and saved to the S3 bucket. The catalog entry looks like this:
datalake_out_priorization_deals:
<<: *datalake_out
filepath: ${datalake_target_location}/menu_property_matching_priorization_deals/data.parquet
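For context, the `datalake_out` anchor roughly expands to an entry like the following (a sketch: the `type` and `credentials` lines are assumptions on my part, since only the filepath override is shown above; the error message confirms it resolves to a ParquetDataSet over s3):

```yaml
_datalake_out: &datalake_out
  type: pandas.ParquetDataSet   # assumed from the error message
  credentials: datalake_s3      # hypothetical credentials key

datalake_out_priorization_deals:
  <<: *datalake_out
  filepath: ${datalake_target_location}/menu_property_matching_priorization_deals/data.parquet
```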
How can I solve this?