Loading around 50 GB of Parquet data into Redshift takes an indefinite amount of time


I am loading around 50 GB of Parquet data into a DataFrame using an AWS Glue ETL job and then trying to load it into a Redshift table; the write has been running for more than 6-7 hours without completing.

```python
datasink = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=<data_frame>,
    catalog_connection="redshift_connection",
    connection_options={
        "preactions": pre_actions,
        "dbtable": dest_table,
        "database": "<redshift_database>",
    },
    redshift_tmp_dir=args["TempDir"],
    transformation_ctx="datasink",
)
```
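For context, `from_jdbc_conf` stages the frame as files under `redshift_tmp_dir` in S3 and then issues a Redshift `COPY`. Below is a minimal sketch of one commonly suggested tweak, assuming the `"extracopyoptions"` connection option documented for Glue's Redshift connections is available on your Glue version: passing `COPY` options that skip compression analysis and statistics updates during the load.

```python
# Sketch only: the same call as above, with the COPY tuned via
# "extracopyoptions" (assumes your Glue version supports this option).
datasink = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=source_frame,  # hypothetical name for the DynamicFrame built earlier
    catalog_connection="redshift_connection",
    connection_options={
        "preactions": pre_actions,
        "dbtable": dest_table,
        "database": "my_redshift_db",  # hypothetical database name
        # Skip compression analysis and stats collection during the COPY;
        # run ANALYZE on the table separately after the load finishes.
        "extracopyoptions": "COMPUPDATE OFF STATUPDATE OFF",
    },
    redshift_tmp_dir=args["TempDir"],
    transformation_ctx="datasink",
)
```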

Are there any performance improvement techniques one should follow?

I have tried partitioning the data (see the sketch below) and made significant changes to the resource configuration; I am currently running with the G.2X worker type and 16 workers.
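One more angle worth checking, sketched below with hypothetical names (`df` for the 50 GB source as a Spark DataFrame, `NUM_OUTPUT_PARTITIONS` as an assumed tuning knob, not a measured value): the partition count at write time controls how many files Glue stages in the S3 temp dir, and thousands of tiny files or a handful of huge ones can both slow the `COPY`. Repartitioning to a moderate count, ideally a small multiple of the Redshift cluster's slice count, before converting back to a DynamicFrame is a common adjustment:

```python
from awsglue.dynamicframe import DynamicFrame

# Hypothetical: aim for a file count that is a small multiple of the
# Redshift cluster's slice count so each slice loads a similar amount.
NUM_OUTPUT_PARTITIONS = 64  # tune to your cluster; assumption, not a measured value

balanced_df = df.repartition(NUM_OUTPUT_PARTITIONS)
balanced_frame = DynamicFrame.fromDF(balanced_df, glueContext, "balanced_frame")
# ...then pass balanced_frame as `frame=` in the from_jdbc_conf call above.
```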
