Load data from a Snowflake table to AWS S3 in batches for very large files


I have a task to load a large table from Snowflake to S3. I need to unload the first 1000 records of the Snowflake table into a CSV file in S3, then the next 1000 records into another CSV file, and so on.

For example, if I have a table of 15,000 records in Snowflake, I want to unload 15 separate files to S3, each containing 1000 records and named file_1.csv, file_2.csv, ..., file_15.csv, and then zip all these files and unload the single zip file to S3.

I have referred to https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html and many other sources but could not figure it out. I'm very new to Snowflake.

There is 1 answer below.

You are looking for the COPY INTO <location> command, see here: https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html#required-parameters

Here you have two relevant optional parameters (a short example follows the list):

SINGLE = TRUE | FALSE: This parameter determines whether your data is written as one file or split into several files. The default is FALSE, so your data is split into several files.

MAX_FILE_SIZE = num: With this parameter you can set the size of the unloaded files. The important point for you is that you cannot set the maximum file size by row count; instead, you have to set the upper size limit in bytes. (You can estimate the bytes for 1000 rows, but the files may then end up with 998, 1000, or 1002 rows.)
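
For illustration, a minimal sketch of such a COPY INTO statement, assuming a hypothetical external stage @my_s3_stage pointing at your bucket and a hypothetical table my_table (adjust names, file format, and the byte limit to your data):

    -- Unload the whole table, letting Snowflake split the output into files.
    -- SINGLE = FALSE is the default; MAX_FILE_SIZE is an upper limit in bytes,
    -- so estimate it as roughly 1000 rows * average row size in bytes.
    COPY INTO @my_s3_stage/unload/file_
      FROM my_table
      FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE)
      SINGLE = FALSE
      MAX_FILE_SIZE = 100000;  -- e.g. ~100 KB if one row is roughly 100 bytes

Note that Snowflake appends its own numeric suffix to each generated file when splitting, so the exact names file_1.csv ... file_15.csv are not directly achievable with this approach.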

Another option is to create a stored procedure and unload the result set of a query. The query is executed dynamically 15 times, and in every loop it limits the result set to the next slice of 1000 rows (e.g. via ROW_NUMBER() or LIMIT ... OFFSET ...): rows 1-1000, then rows 1001-2000, and so on.
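
A minimal sketch of that approach, again assuming the hypothetical stage @my_s3_stage, table my_table, an ordering column id, and 15 batches of 1000 rows; this is illustrative, not a production-ready procedure:

    CREATE OR REPLACE PROCEDURE unload_in_batches()
    RETURNS STRING
    LANGUAGE JAVASCRIPT
    AS
    $$
      // Loop over 15 batches of 1000 rows; each COPY INTO writes one named file.
      // A stable ORDER BY (here on a hypothetical id column) keeps the batches deterministic.
      for (var i = 0; i < 15; i++) {
        var sql = "COPY INTO @my_s3_stage/file_" + (i + 1) + ".csv " +
                  "FROM (SELECT * FROM my_table ORDER BY id " +
                  "      LIMIT 1000 OFFSET " + (i * 1000) + ") " +
                  "FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE) " +
                  "SINGLE = TRUE OVERWRITE = TRUE";
        snowflake.execute({sqlText: sql});
      }
      return "15 files unloaded";
    $$;

    CALL unload_in_batches();

Be aware that each COPY INTO in the loop runs its own query against the table, so for very large tables the MAX_FILE_SIZE approach above is usually cheaper; the loop is mainly useful when you really need exactly 1000 rows per file and fixed file names.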