How can I deactivate the automatic type conversion in ParquetDataSet?


I'm trying to save a dataset as Parquet in my S3 bucket. When I do, the save automatically tries to convert my columns to int or double. For example, I have a column named ventas_deals.person_phone_value, and that column stores values like '571234567890'. The error is this:

ParquetDataSet(filepath=analytics-datalake-prod-primary-s3bucket/datasets_kedro/menu_property_matching_match_inventory/data.parquet, load_args={}, protocol=s3, save_args={}). ("Could not convert '573223131490' with type str: tried to convert to double", 'Conversion failed for column ventas_deals.person_phone_value with type object')

I tried controlling the types as follows:

datalake_out_priorization_deals['ventas_deals.person_phone_value'] = datalake_out_priorization_deals['ventas_deals.person_phone_value'].astype('str')
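For reference, here is a minimal sketch of how the column could be inspected before saving. It assumes the failure comes from an object column that mixes floats and strings, which is a common cause of this kind of conversion error but is not confirmed here. The helper name mixed_type_columns and the sample values are hypothetical. It may also be worth checking that the cast is applied to the same dataset that fails, since the filepath in the error ends in menu_property_matching_match_inventory while the cast above is on datalake_out_priorization_deals.

```python
import pandas as pd


def mixed_type_columns(df):
    """Hypothetical helper: map each object column that holds more than
    one Python type to the set of type names found in it."""
    report = {}
    for col in df.select_dtypes(include="object").columns:
        kinds = {type(value).__name__ for value in df[col]}
        if len(kinds) > 1:
            report[col] = kinds
    return report


# Made-up sample data: one phone number arrived as a float, one as a string.
df = pd.DataFrame(
    {
        "ventas_deals.person_phone_value": [571234567890.0, "573223131490"],
        "deal_id": [1, 2],
    }
)

before = mixed_type_columns(df)

# Cast every value to a Python string, as in the question.
# Note that floats keep their ".0" suffix, so they may need cleaning upstream.
df["ventas_deals.person_phone_value"] = (
    df["ventas_deals.person_phone_value"].astype(str)
)

after = mixed_type_columns(df)
```

If the report is empty for the frame that is actually being saved and the error persists, the mixed values are probably coming from a different node output than the one being cast.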

I do this just before the dataset is handed to the catalog and saved to the S3 bucket. The catalog entry looks like this:

datalake_out_priorization_deals:
  <<: *datalake_out
  filepath: ${datalake_target_location}/menu_property_matching_priorization_deals/data.parquet

How can I solve this?
