ℹ︎ This question is about the MinIO Client, not the MinIO Server object storage!
I have two S3-compatible buckets A and B, and I need to copy a list of objects from A to B. The catch is that
- I do not want to copy all objects from A, nor do my objects all have a shared prefix; instead, I have a list (multi-line text file) of objects that I need to copy. So I cannot just use mc cp --recursive or mc mirror (as far as I can see).
- I need to preserve the entire object keys on B, including the full prefixes.
For example, say I have the following list (in a file objects.txt):
a/b/1.txt
c/2.txt
d/e/3.txt
f/1.txt
And after the copy operation I expect all these objects, with their original names, to reside in B.
My initial attempt to achieve this was the following command:
mc cp $(sed 's,^,s3/A/,' objects.txt) s3/B
Unfortunately the resulting bucket B had the following listing:
B/1.txt
B/2.txt
B/3.txt
That is, mc treated the object keys as “file paths” and stripped the “directory names”, so a/b/1.txt and f/1.txt both ended up as 1.txt and one of the two objects was lost.
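To make the collision concrete, here is a minimal sketch in plain shell (no mc involved) that reduces each key from the example list to its base name and reports the names that occur more than once. The helper name colliding_basenames is made up for illustration, and I am assuming, based on the listing above, that mc keeps only the part of the key after the last slash.

```shell
# Keys from the example objects.txt in the question.
keys='a/b/1.txt
c/2.txt
d/e/3.txt
f/1.txt'

# Hypothetical helper: read one key per line on stdin and print every
# base name (the part after the last slash) shared by more than one key.
colliding_basenames() {
    sed 's,.*/,,' | sort | uniq -d
}

# For the example list this reports 1.txt as the only colliding name.
printf '%s\n' "$keys" | colliding_basenames
```

Running the same helper over the real list (colliding_basenames <objects.txt) shows how many objects a flattened copy would overwrite.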
My current workaround is to use an explicit loop, i.e.
while IFS= read -r object; do
    mc cp "s3/A/$object" "s3/B/$object"
done <objects.txt
… But this is a lot slower. Part of the issue is that my mc is containerised, so every invocation spawns a new Singularity container. Furthermore, the copy operation does not seem to take advantage of the fact that both buckets are in the same region. I am getting an average throughput of 10 MiB/s, with wide variation, which suggests that mc cp is moving all the data via my local compute node rather than directly between the buckets. Ideally a direct copy would be performed (for comparison, mc mirror achieves > 1 GiB/s throughput).