The input files containing the sales data will be delivered to an S3 bucket in CSV format once per day, at a specific time, with a maximum of 5 files per day.
The CSV data is to be loaded into Redshift using Glue.
Transformations such as calculations and column mapping are needed on the CSV data before loading it into Redshift.
Please share your inputs/suggestions on how to build an efficient solution.
The following has to be taken care of:
1. The client may or may not send duplicate rows within the CSV files.
2. The same data (e.g. the previous day's data) may be sent again on the next day.
3. Updates and inserts (upserts) on the Redshift tables need to be handled (see the sketch after this list).
4. Once the CSV files are processed/loaded into Redshift, move them to another location in the S3 bucket.
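A staging-table load in Redshift, driven by the Glue job, is one common way to cover points 1-3. Below is a minimal sketch, assuming a hypothetical target table sales with columns order_id, sale_date and amount, a hypothetical incoming S3 prefix, and a hypothetical IAM role; the Glue job could issue these statements as pre/post actions or over a JDBC connection, and point 4 (archiving the processed files) would be a separate S3 copy/delete step in the job.

```sql
-- Stage the day's CSV files into an empty copy of the target table.
CREATE TEMP TABLE sales_stage (LIKE sales);

COPY sales_stage
FROM 's3://my-bucket/incoming/'                              -- hypothetical prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-load'      -- hypothetical role
FORMAT AS CSV
IGNOREHEADER 1;

BEGIN;

-- Points 1 and 2: drop staged rows that are identical to rows already loaded
-- (covers a resend of the previous day's data).
DELETE FROM sales_stage
USING sales
WHERE sales_stage.order_id  = sales.order_id
  AND sales_stage.sale_date = sales.sale_date
  AND sales_stage.amount    = sales.amount;

-- Point 3: upsert = delete target rows being replaced, then insert the staged rows.
DELETE FROM sales
USING sales_stage
WHERE sales.order_id  = sales_stage.order_id
  AND sales.sale_date = sales_stage.sale_date;

INSERT INTO sales
SELECT DISTINCT * FROM sales_stage;   -- DISTINCT removes exact duplicates within the files

COMMIT;
```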
When you say "without using copy command", are you referring to all sources for COPY? COPY can load from S3, EMR, DynamoDB, or through an SSH connection. If that is the case, I can only think of 2 ways to get data into Redshift w/o COPY:
#1 - INSERT statements with literal values, issued through the leader node.
#2 - INSERT INTO ... SELECT from an external (Spectrum) table defined over the data in S3.
#1 is very slow and limited in how much data you can put in a single statement (SQL command length limit of 64k characters). It also bogs down the leader node, which can impact overall cluster performance. This is fine for a few rows of data but not a good idea for large amounts of data.
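For reference, #1 is just a multi-row INSERT; every value travels through the leader node as SQL text (hypothetical sales table and values):

```sql
-- Each batch is limited by the maximum SQL statement length and is parsed on the leader node.
INSERT INTO sales (order_id, sale_date, amount)
VALUES
  (1001, '2023-06-01', 49.99),
  (1002, '2023-06-01', 15.00),
  (1003, '2023-06-02', 120.50);
```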
#2 is a fine way to go, but you need to define the external table, have the data in S3, and ensure that the query picks up the right set of S3 files. This is a good pattern if you are going to keep reusing the S3 files, but I'm not sure how it is better than COPY from S3 in the single-load case.
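A sketch of #2, assuming an external schema (here called spectrum) has already been created against the Glue Data Catalog, and using a hypothetical S3 prefix and column layout:

```sql
-- External (Spectrum) table over the raw CSV files; the data stays in S3.
CREATE EXTERNAL TABLE spectrum.sales_raw (
  order_id  BIGINT,
  sale_date DATE,
  amount    DECIMAL(10,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://my-bucket/incoming/'
TABLE PROPERTIES ('skip.header.line.count'='1');

-- Load into the local Redshift table without COPY.
INSERT INTO sales
SELECT order_id, sale_date, amount
FROM spectrum.sales_raw;
```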
If your intent is to find patterns that can load data w/o COPY FROM S3, then there are also the other 3 forms of COPY to think about.
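For example, COPY from DynamoDB follows the same shape as COPY from S3, just with a different source (hypothetical table and role):

```sql
COPY sales
FROM 'dynamodb://my-sales-table'                             -- hypothetical DynamoDB table
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-load'      -- hypothetical role
READRATIO 50;   -- share of the table's provisioned read capacity to use
```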