How does speculative execution impact an s3-dist-cp job?


I have noticed that s3-dist-cp sometimes takes much longer than usual because of a "slow node" issue. For Spark I have enabled speculative execution, which works fine. However, before doing the same for s3-dist-cp, I would like to understand the possible impact first.

For regular DistCp, I found the following in the documentation (link: https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html#MapReduce_and_other_side-effects):

If mapreduce.map.speculative is set set-final and true, the result of the copy is undefined.
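
As far as I understand, "final and true" refers to a cluster-side setting along these lines. This is only a sketch of a `mapred-site.xml` fragment using the standard Hadoop property syntax; I am assuming this is the file where the property would be pinned, and it is not copied from my actual cluster:

```xml
<!-- Sketch only: speculative map execution enabled and marked final,
     so individual jobs cannot override it. -->
<property>
  <name>mapreduce.map.speculative</name>
  <value>true</value>
  <final>true</final>
</property>
```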

I'm aware that s3-dist-cp is a completely separate job, but I wonder if there are any caveats. I wasn't able to find any related documentation.

Thanks for any suggestions!
