I'm trying to upload a file to s3a using distcp. distcp writes to a temporary file first and then renames it to the proper filename.
But the S3 user has no update/delete permissions, so the rename fails and I end up with a file of the proper size but the wrong name:
-rw-rw-rw- 1 3738 2021-05-24 12:04 s3a://testbucket/.distcp.tmp.attempt_1621587961870_0001_m_000000_0
on S3, and I receive an error:
Error: java.io.IOException: File copy failed: file:///testfile.json --> s3a://testbucket/testfile.json
Is it possible to omit renaming and write directly to final filename?
I've found it here: https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html
There is a parameter described as:
"Write directly to destination paths. Useful for avoiding potentially very expensive temporary file rename operations when the destination is an object store."
example
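For illustration, on a release that supports the option, the invocation would presumably look something like the sketch below. It assumes the option is the `-direct` flag listed in the DistCp guide's command-line options table, and it reuses the source and destination paths from the error message. The helper name `distcp_direct_cmd` is hypothetical, and the script only prints the command (a dry run) rather than executing it.

```shell
#!/bin/sh
# Sketch only: assumes a DistCp release that supports the -direct option.
# SRC and DST are the paths from the error message in the question.
SRC="file:///testfile.json"
DST="s3a://testbucket/testfile.json"

# Hypothetical helper: builds the command line for a direct-write copy.
# -direct asks DistCp to write straight to the destination path instead of
# writing a .distcp.tmp.* file and renaming it afterwards.
distcp_direct_cmd() {
    printf 'hadoop distcp -direct %s %s\n' "$1" "$2"
}

# Dry run: print the command instead of executing it.
distcp_direct_cmd "$SRC" "$DST"
```

To actually run the copy, the printed line can be pasted into a shell on a node with a suitable Hadoop client. Since no rename is involved, this should need only write permission on the bucket, though I have not been able to verify that on my version.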
Unfortunately, my distcp version is too old and does not have this feature.