I need to do a one-time copy of millions of S3 objects onto themselves (same bucket, same object path) so that I can modify their metadata.
There are several ways to do this, and I need to choose the most cost-effective one:
- AWS COPY requests
- AWS Batch Operations
- AWS DataSync
References: https://repost.aws/knowledge-center/s3-large-transfer-between-buckets
I've read the AWS documentation, but I could not work out which option is better in terms of cost.
To update metadata on an Amazon S3 object, you must COPY the object to itself while specifying the new metadata (see Copying objects - Amazon Simple Storage Service). However, you have a choice as to how to trigger the COPY operation.
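For a single object, the copy-to-self is a plain CopyObject call with `MetadataDirective='REPLACE'`. Here is a minimal boto3 sketch; the function name is mine, and bucket/key/metadata values are placeholders (the boto3 import is deferred so the sketch can be read and loaded without boto3 installed):

```python
def update_metadata_in_place(bucket: str, key: str, new_metadata: dict) -> None:
    """Copy an S3 object onto itself, replacing its user-defined metadata.

    Caveats:
    - MetadataDirective='REPLACE' discards ALL existing user metadata, so
      include every key/value you want to keep in `new_metadata`.
    - copy_object only handles objects up to 5 GB; larger objects need a
      multipart copy instead.
    """
    import boto3  # deferred import: the sketch stays loadable without boto3

    s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},  # source == destination
        Metadata=new_metadata,
        MetadataDirective="REPLACE",
    )
```

Looping this call yourself over millions of objects works, but you pay in wall-clock time and in the effort of tracking failures and retries; that bookkeeping is exactly what Batch Operations does for you.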
Given that you have millions of objects, I would recommend S3 Batch Operations, since it can perform the copy at massive scale and handles retries and completion reports for you.
I suggest that you first try the S3 Batch Operations step on a small subset of objects (e.g. 10 objects) to confirm that it behaves the way you expect. This will be relatively fast and will catch any errors before you run it across millions of objects.
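A Batch Operations job of this kind uses the S3PutObjectCopy operation with `MetadataDirective='REPLACE'` and reads the object list from a CSV manifest of `bucket,key` lines. A hedged boto3 sketch follows; the function name, manifest layout, report prefix, and priority are my own placeholder choices, and you still need an IAM role that Batch Operations can assume:

```python
def create_metadata_update_job(
    account_id: str,
    bucket: str,
    manifest_key: str,
    manifest_etag: str,
    role_arn: str,
    new_metadata: dict,
) -> str:
    """Create an S3 Batch Operations job that copies each manifest object
    onto itself with replaced user metadata. Returns the job ID."""
    import boto3  # deferred import: the sketch stays loadable without boto3

    s3control = boto3.client("s3control")
    resp = s3control.create_job(
        AccountId=account_id,
        ConfirmationRequired=True,  # job waits for you to confirm before running
        Operation={
            "S3PutObjectCopy": {
                "TargetResource": f"arn:aws:s3:::{bucket}",  # same bucket = in-place
                "MetadataDirective": "REPLACE",
                "NewObjectMetadata": {"UserMetadata": new_metadata},
            }
        },
        Manifest={
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],  # one "bucket,key" line per object
            },
            "Location": {
                "ObjectArn": f"arn:aws:s3:::{bucket}/{manifest_key}",
                "ETag": manifest_etag,
            },
        },
        Report={
            "Bucket": f"arn:aws:s3:::{bucket}",
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "Prefix": "batch-reports",
            "ReportScope": "FailedTasksOnly",
        },
        Priority=10,
        RoleArn=role_arn,
    )
    return resp["JobId"]
```

For the trial run, point the manifest at a CSV listing just your 10 test objects; the same code then scales to the full manifest (which you can also generate from an S3 Inventory report).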
Note that S3 Batch Operations charges $1.00 per million object operations performed, on top of the normal request charges for the underlying COPY operations.
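To make the comparison concrete, here is a back-of-the-envelope cost sketch. The rates are the published us-east-1 S3 Standard prices at the time of writing ($0.005 per 1,000 COPY requests, $1.00 per million Batch Operations, $0.25 per job) and may differ in your region, so treat them as assumptions:

```python
# Assumed us-east-1 rates; check the S3 pricing page for your region.
COPY_REQUEST_PER_1000 = 0.005   # COPY/PUT request charge (paid with any method)
BATCH_PER_MILLION_OPS = 1.00    # Batch Operations per-object charge
BATCH_PER_JOB = 0.25            # flat charge per Batch Operations job

def batch_copy_cost(num_objects: int, num_jobs: int = 1) -> float:
    """Estimated USD cost of an in-place metadata update via Batch Operations."""
    copy_requests = num_objects / 1_000 * COPY_REQUEST_PER_1000
    batch_ops = num_objects / 1_000_000 * BATCH_PER_MILLION_OPS
    return copy_requests + batch_ops + num_jobs * BATCH_PER_JOB

# 10 million objects in a single job:
print(round(batch_copy_cost(10_000_000), 2))  # -> 60.25
```

Note that the COPY request charge ($50 of the $60.25 above) is paid no matter how you trigger the copies; the Batch Operations premium is only about $1 per million objects plus $0.25 per job, which is usually cheap insurance for the retry handling and reporting at this scale.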