connecting to AWS S3 using boto3 or smart_open - debugging connection error


My goal is to upload objects to S3. I have been trying with both the smart_open and boto3 libraries, with no success.

I don't know much about configuring IAM policies or access points in S3, and I am finding it very hard to debug and to understand how to pass the configuration.

IAM

This is my bucket policy - it should be open and allow PUT. I don't have any access point set.

{
    "Version": "2012-10-17",
    "Id": "Policy1449009487903",
    "Statement": [
        {
            "Sid": "Stmt1449009478455",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::MY_BUCKET/*",
            "Condition": {
                "StringLike": {
                    "aws:Referer": [
                        "https://s3-us-west-2.amazonaws.com/MY_BUCKET/*"
                    ]
                }
            }
        }
    ]
}

boto3

With boto3, I try to open a session, and then upload a file from local disk:

import boto3

session = boto3.Session(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)

# the resource must be created from the session, otherwise the
# credentials above are never used
s3 = session.resource('s3')
s3.Bucket(S3_BUCKET).upload_file(path_to_my_file_on_disk, 'test.json')

But I get a (very long) error, which ends with:

EndpointConnectionError: Could not connect to the endpoint URL: "https://MY_BUCKET.s3.us-oregon.amazonaws.com/test.json"

Note that the URL is different from the URI of an object shared on S3, which should be:

s3://MY_BUCKET/test.json
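For what it's worth, boto3 builds that hostname from the configured region, and "us-oregon" is not a real AWS region (Oregon is us-west-2), so the DNS lookup fails before any request is made. A minimal sketch of how the virtual-hosted-style endpoint is formed (the helper name is mine, not boto3's):

```python
# Sketch: the endpoint boto3 targets is derived from the region it
# thinks it is in; an invalid region name such as "us-oregon"
# produces a hostname that does not resolve.
def s3_endpoint(bucket: str, region: str) -> str:
    return f"https://{bucket}.s3.{region}.amazonaws.com"

print(s3_endpoint("MY_BUCKET", "us-oregon"))
# https://MY_BUCKET.s3.us-oregon.amazonaws.com  <- matches the error above
```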

Looking at : https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html

I just tried:

import boto3
# Print out bucket names
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)

And it yields the same connection error:

EndpointConnectionError: Could not connect to the endpoint URL: "https://s3.us-oregon.amazonaws.com/"

Smart_open

I tried with smart_open like this:

with smart_open.open('s3://{}:{}@{}/{}'.format(ACCESS_KEY, SECRET_KEY, S3_BUCKET, filename), 'wb') as o:
    o.write(json.dumps(template).encode('utf8'))

But this also fails to connect, and it does not say why.

Reading on Stack Overflow, some threads reported that uploading with smart_open version >= 5.0.0 could be more complicated - see:

https://github.com/RaRe-Technologies/smart_open/blob/develop/howto.md

So I tried:

session = boto3.Session(
    aws_access_key_id= ACCESS_KEY,
    aws_secret_access_key= SECRET_KEY)

with smart_open.open(
    's3://' + S3_BUCKET + '/robots.txt',
    mode='w',
    transport_params={'client': session.client('s3')}) as o:
    o.write("nothing to see here\n")

No success.

with smart_open.open(
    's3://' + S3_BUCKET + '/robots.txt',
    'w',
    transport_params={
        'client_kwargs': {
            'S3.Client.create_multipart_upload': {
                'ServerSideEncryption': 'aws:kms'
            }
        },
        'client': boto3.client('s3')
    }
) as o:
    o.write("nothing to see here\n")

No success.

Can you help me debug this and point me in the right direction?

1 Answer

I found a solution for boto3: it turned out I had to specify the correct region when creating the client:

s3 = boto3.client('s3',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=MY_REGION  # e.g. 'us-west-2' for Oregon
)

s3.upload_file(path_to_filename, S3_BUCKET, 'test.json')

This worked.
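If you are unsure which region a bucket actually lives in, S3 can tell you. A sketch (bucket name and credentials assumed to be yours; `get_bucket_location` reports us-east-1 as a null `LocationConstraint`):

```python
def normalize_location(loc):
    # get_bucket_location returns None for buckets in us-east-1
    return loc or 'us-east-1'

def bucket_region(bucket: str) -> str:
    """Look up the region a bucket lives in (requires valid credentials)."""
    import boto3  # imported here so normalize_location stays testable offline
    client = boto3.client('s3', region_name='us-east-1')
    resp = client.get_bucket_location(Bucket=bucket)
    return normalize_location(resp.get('LocationConstraint'))
```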

However, with smart_open I could not find a solution:

Ref. How to use Python smart_open module to write to S3 with server-side encryption

I tried to specify the boto3 session as above, and then:

session = boto3.Session(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=MY_REGION)

client_kwargs = {'S3.Client.create_multipart_upload': {'ServerSideEncryption': 'AES256'}}

# note: the session above is never handed to smart_open here, so the
# region never reaches the client smart_open creates internally
with smart_open.open(
    's3://{}:{}@{}/{}'.format(ACCESS_KEY, SECRET_KEY, S3_BUCKET, filename),
    'wb',
    transport_params={'client_kwargs': client_kwargs}
) as o:
    o.write(json.dumps(myfile).encode('utf8'))

Can someone show the correct way for smart_open as well? I am using version 6.3.0.
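In case it helps: since the root cause for boto3 was the missing region, a plausible fix for smart_open >= 5 is the same - build the client from a region-aware session and pass it via `transport_params={'client': ...}` (per the smart_open howto), instead of embedding credentials in the URI. Untested against a live bucket; a sketch:

```python
import json

def s3_uri(bucket: str, key: str) -> str:
    # with smart_open >= 5, credentials no longer go in the URI
    return f's3://{bucket}/{key}'

def upload_json(obj, bucket, key, access_key, secret_key, region):
    """Write obj as JSON to s3://bucket/key (assumes smart_open >= 5)."""
    import boto3
    import smart_open  # lazy imports keep s3_uri testable offline
    session = boto3.Session(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name=region,  # the piece missing from the attempts above
    )
    with smart_open.open(
        s3_uri(bucket, key), 'wb',
        transport_params={'client': session.client('s3')},
    ) as fout:
        fout.write(json.dumps(obj).encode('utf8'))
```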

I post this partial answer in case someone finds it useful. Debugging was cumbersome for me, as I am not an expert in AWS IAM either.