In short: I get a "socket.error: [Errno 104] Connection reset by peer" exception while using MRJob. The script does have access to S3: it creates the bucket and uploads some small files (I've checked manually via the AWS console). But the largest file, INPUT, is not uploaded. Hey, it's just 7GB of test data!
I have tried 4 times and got an error every time.
mrjob==0.4.2
CONFIG
# cat /etc/mrjob.conf
runners:
  inline:
    base_tmp_dir: /home/tmp
  emr:
    base_tmp_dir: /home/tmp
    aws_access_key_id: [VALID KEY HERE]
    aws_secret_access_key: [VALID SECRET HERE]
    aws_region: us-east-1
    ec2_instance_type: m1.medium
    num_ec2_instances: 7
TRACEBACK
# python /home/bigdata/mr_job_1.py -r emr /home/filesystem/INPUT > /home/filesystem/OUTPUT
using configs in /etc/mrjob.conf
creating new scratch bucket mrjob-f02b7cd37b2bfffd
using s3://mrjob-f02b7cd37b2bfffd/tmp/ as our scratch dir on S3
creating tmp directory /home/tmp/mr_job_1.root.20131216.152251.298419
writing master bootstrap script to /home/tmp/mr_job_1.root.20131216.152251.298419/b.py
creating S3 bucket 'mrjob-f02b7cd37b2bfffd' to use as scratch space
Copying non-input files into s3://mrjob-f02b7cd37b2bfffd/tmp/mr_job_1.root.20131216.152251.298419/files/
Traceback (most recent call last):
  File "/home/bigdata/workers/process_data/mr_job_1.py", line 178, in <module>
    MRSwapData().run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 494, in run
    mr_job.execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/job.py", line 512, in execute
    super(MRJob, self).execute()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 147, in execute
    self.run_job()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/launch.py", line 208, in run_job
    runner.run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/runner.py", line 458, in run
    self._run()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 806, in _run
    self._prepare_for_launch()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 817, in _prepare_for_launch
    self._upload_local_files_to_s3()
  File "/usr/local/lib/python2.7/dist-packages/mrjob/emr.py", line 905, in _upload_local_files_to_s3
    s3_key.set_contents_from_filename(path)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1290, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 1221, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 713, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/key.py", line 889, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 547, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 947, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 908, in _mexe
    raise e
socket.error: [Errno 104] Connection reset by peer
I ran into this issue too. I poked around in the code and found that boto raises either a PleaseRetryException or one of the exception types in self.http_exceptions. Since more than one exception type is involved, and I don't want to import those into my code, I'm instead doing something like this:
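A minimal sketch of a catch-all retry along those lines. This is not the poster's exact code: the retry helper, its parameters, and the backoff values are hypothetical names chosen for illustration. It catches Exception broadly so that no boto exception types have to be imported, and re-raises the last error once the attempts are used up.

```python
import time


def retry(func, attempts=5, initial_delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Hypothetical helper: catches Exception broadly so the caller does not
    need to import boto's exception types. Narrow the except clause if you
    know which errors are transient.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: re-raise the last error
            sleep(delay)      # wait before the next try
            delay *= backoff  # exponential backoff
```

It could wrap the call that fails in the traceback, e.g. retry(lambda: s3_key.set_contents_from_filename(path)). Note that S3 caps a single PUT at 5 GB, so for a 7GB input retrying alone may not be enough and the file may have to be split or sent as a multipart upload.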