Does anyone have an example of merging (concatenating) chunks of files stored in an AWS S3 bucket?

Currently, we merge some output files in C#, because the chunks used to live on a drive on a server. We are now going to move these files directly from Snowflake to an S3 bucket, so it would be better to merge them in the bucket itself. We know that AWS has a feature called Multipart Upload, but we don't know whether we can upload these files from Snowflake to S3 using it.

We are still exploring options. Most of what we found suggests creating a Lambda function to merge the files that are already in the S3 bucket, but the examples are mostly in Python and our app is on .NET. We also read about AWS Glue Crawler, but we are not sure about going with that option. Multipart Upload could be a good option, but we lack experience with this type of implementation, so any help or example is welcome.
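For reference, the Multipart Upload route can be sketched as follows. This is a minimal sketch, not a tested production implementation: it assumes the chunks are already in the bucket, that plain byte-wise concatenation is what is wanted (repeated CSV headers, for example, would not be removed), and that every chunk except the last is at least 5 MiB, which S3 requires for multipart parts. The function name `merge_chunks` and its parameters are hypothetical; the `s3` argument is assumed to be a boto3 S3 client. The AWS SDK for .NET exposes the same S3 operations, so the same sequence of calls should translate to C#.

```python
def merge_chunks(s3, bucket, chunk_keys, merged_key):
    """Concatenate existing S3 objects into merged_key without downloading them.

    s3 is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    chunk_keys must be listed in the order the chunks should appear.
    """
    # Start a multipart upload for the merged object.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=merged_key)["UploadId"]
    try:
        parts = []
        for number, key in enumerate(chunk_keys, start=1):
            # Copy each chunk server-side as one part of the merged object.
            result = s3.upload_part_copy(
                Bucket=bucket,
                Key=merged_key,
                UploadId=upload_id,
                PartNumber=number,
                CopySource={"Bucket": bucket, "Key": key},
            )
            parts.append(
                {"PartNumber": number, "ETag": result["CopyPartResult"]["ETag"]}
            )
        # Stitch the parts together in part-number order.
        s3.complete_multipart_upload(
            Bucket=bucket,
            Key=merged_key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so that the already-copied parts are not left behind.
        s3.abort_multipart_upload(Bucket=bucket, Key=merged_key, UploadId=upload_id)
        raise
    return merged_key
```

A call might look like `merge_chunks(boto3.client("s3"), "my-bucket", ["out/part_0.csv", "out/part_1.csv"], "out/merged.csv")`, where the bucket and key names are placeholders. Because the copy happens inside S3, the same function could run from a Lambda function or from the existing application.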

There is 1 best solution below

AWS Glue Crawler would be perfect in this situation.

  1. Use a crawler to get the schema
  2. Use a Glue ETL job to merge the files and write them back to S3
  3. Make sure to turn on job bookmarks (the job will then skip the previously merged files)
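As a rough illustration of step 3, job bookmarks are enabled through the `--job-bookmark-option` job argument. The sketch below assumes the job is created with a boto3 Glue client; the function name `create_merge_job`, its parameters, and the chosen Glue version are hypothetical and not part of the original answer. The ETL script itself still has to pass a `transformation_ctx` to its sources and call `job.commit()` for bookmarks to take effect.

```python
def create_merge_job(glue, name, role_arn, script_s3_path):
    """Create a Glue ETL job with job bookmarks enabled.

    glue is assumed to be a boto3 Glue client, e.g. boto3.client("glue").
    script_s3_path points at the ETL script that merges the files.
    """
    return glue.create_job(
        Name=name,
        Role=role_arn,
        Command={
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
        # Bookmarks let later runs skip input that was already processed.
        DefaultArguments={"--job-bookmark-option": "job-bookmark-enable"},
        GlueVersion="4.0",  # assumption: adjust to the version in use
    )
```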

Example: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-samples-legislators.html