I have a Knox gateway set up for our Hadoop cluster, and I have gone through the Knox WebHDFS examples. So far, I know that the cURL commands below can be used to create a directory and upload a single file.
curl -k -u username:password -X PUT 'https://localhost:8443/gateway/default/webhdfs/v1/user/testuser?op=MKDIRS'
curl -i -k -u username:password -X PUT 'https://localhost:8443/gateway/default/webhdfs/v1/user/testuser/file1?op=CREATE'
curl -i -k -u username:password -T file1 -X PUT '{Value of Location header from command above}'
Now, if I want to upload three files, say file2, file3, and file4, to the HDFS location /user/testuser, I have to execute the last two commands above three times, once for each file.
I want to know whether there is a way to upload multiple files in a single go. Can I provide multiple files as input in one PUT request? If not, I'm even okay with moving the files to a folder and then uploading that folder with a single PUT request.
Knox proxies the WebHDFS APIs. I do not think WebHDFS can upload multiple files or a non-empty directory in one request (see WebHDFS File and Directory Operations), so most likely you won't be able to do that.
The other option is to use a script (a bash script) that uses multiple PUT requests.
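A minimal sketch of such a script, reusing the two-step CREATE flow from the question for each file. The variable names (KNOX_URL, HDFS_DIR, KNOX_USER, KNOX_PASS) and the function name upload_file are made up for illustration, and the defaults are just the placeholder values from the question. It assumes file names are URL-safe and that the gateway answers the first PUT with a Location header, as in the question.

```shell
#!/usr/bin/env bash
# Upload several local files to HDFS through Knox, one WebHDFS CREATE per file.
# Usage: ./knox_upload.sh file2 file3 file4
# All settings below are hypothetical placeholders; adjust them for your cluster.
KNOX_URL="${KNOX_URL:-https://localhost:8443/gateway/default/webhdfs/v1}"
HDFS_DIR="${HDFS_DIR:-/user/testuser}"
KNOX_USER="${KNOX_USER:-username}"
KNOX_PASS="${KNOX_PASS:-password}"

upload_file() {
  local file="$1" name location
  name="$(basename "$file")"

  # Step 1: request the CREATE redirect and extract the Location header.
  location="$(curl -s -i -k -u "$KNOX_USER:$KNOX_PASS" -X PUT \
      "$KNOX_URL$HDFS_DIR/$name?op=CREATE" \
    | tr -d '\r' \
    | awk 'tolower($1) == "location:" { print $2; exit }')"

  if [ -z "$location" ]; then
    echo "no Location header returned for $file" >&2
    return 1
  fi

  # Step 2: send the file contents to the URL from the Location header.
  curl -s -k -u "$KNOX_USER:$KNOX_PASS" -T "$file" -X PUT "$location"
}

# Upload every file given on the command line; report failures and keep going.
for f in "$@"; do
  upload_file "$f" || echo "upload failed: $f" >&2
done
```

This still sends two requests per file, so it only automates the repetition; it does not reduce the number of PUT requests. To upload a whole folder, you could call it as ./knox_upload.sh myfolder/* (non-recursive).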