Access s3n urls via hadoop and point to riak cs


I have code written for Amazon EMR that uses s3 and s3n URLs in Hadoop.

e.g. in Pig:

X = LOAD 's3n://testbucket/testfile.txt' USING PigStorage();

I'd like to keep using this code but switch from Amazon S3 to Riak CS.

That is, I'd like the s3n URL to resolve to my Riak CS cluster, where I will set up the bucket and file.

Is there an option in the Hadoop configuration to route s3n URLs through a proxy or to a specific hostname?


Set up jets3t.properties so it points at the Riak CS endpoint. An example for a local setup is here: http://qiita.com/kuenishi/items/71b3cda9bbd1a0bc4f9e#2-3

> cat conf/jets3t.properties
s3service.https-only=false
#s3service.s3-endpoint=localhost
#s3service.s3-endpoint-http-port=8080
#s3service.s3-endpoint-https-port=8080
#s3service.disable-dns-buckets=true

httpclient.proxy-autodetect=false
httpclient.proxy-host=localhost
httpclient.proxy-port=8080
httpclient.retry-max=11
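Beyond the JetS3t proxy settings, the s3n filesystem also needs credentials issued by Riak CS rather than AWS. These go in Hadoop's core-site.xml; a minimal sketch, assuming the standard Hadoop s3n property names and placeholder key values:

```xml
<!-- core-site.xml: credentials for the s3n filesystem.
     Use the access key / secret key issued by Riak CS, not by AWS. -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_RIAK_CS_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_RIAK_CS_SECRET_KEY</value>
</property>
```

Also make sure jets3t.properties is on Hadoop's classpath (e.g. in the conf/ directory), since JetS3t loads it from there; with the proxy settings above, s3n requests are routed to the Riak CS listener on localhost:8080 and the Pig script itself needs no changes.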