Uploading png images to GCS in R - can upload individually, but cannot upload entire directory


We have a local directory with 20000 images that need to go to GCS. Using R (and googleCloudStorageR), we can loop over each of the images and upload to GCS as follows:

# setup
library(googleCloudStorageR)
gcs_auth(json_file = 'my/gcs/admin/creds.json')
base_url <- '../path/to/local-images/directory/'
all_png_images <- list.files(base_url, pattern = '\\.png$')

# and loop
for (i in seq_along(all_png_images)) {
    googleCloudStorageR::gcs_upload(
      file = paste0(base_url, all_png_images[i]),
      bucket = 'my-bucket',
      name = paste0('images-folder/', all_png_images[i]),
      predefinedAcl = 'default'
    )
}

and this works perfectly... however, it would be much better to simply point at the directory and upload everything at once, rather than loop over each file. I have tried the gcs_save_all function, without success:

googleCloudStorageR::gcs_save_all(
  directory = 'path-to-all-images',
  bucket = 'my-bucket'
)

throws the following error:

2020-10-01 16:23:47 -- File size detected as 377.1 Kb
2020-10-01 16:23:47> Request Status Code: 400
Error: API returned: Cannot insert legacy ACL for an object when uniform bucket-level access is enabled. Read more at https://cloud.google.com/storage/docs/uniform-bucket-level-access

I am trying to find out why gcs_save_all is not working, or if there is another way I can do this in R.

There are 2 answers below.

BEST ANSWER

The function in the library needs to be updated to support bucket-level ACLs. For now, you can replicate the coming fix yourself by specifying the "bucketLevel" predefined ACL, which gcs_upload already supports, e.g.

gcs_save_all <- function(directory = getwd(),
                         bucket = gcs_get_global_bucket(),
                         pattern = ""){

  tmp <- tempfile(fileext = ".zip")
  on.exit(unlink(tmp))

  bucket <- as.bucket_name(bucket)

  the_files <- list.files(path = directory,
                          all.files = TRUE,
                          recursive = TRUE,
                          pattern = pattern)
  
  withCallingHandlers(
    zip::zip(tmp, files = the_files),
    deprecated = function(e) NULL)

  # modified to accept ACL on bucket
  gcs_upload(tmp, bucket = bucket, name = directory, predefinedAcl = "bucketLevel")

}
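
With this patched version, a call like the one below should work against a bucket with uniform bucket-level access enabled. Note that, as written above, the function zips the directory and uploads it as a single object named after the directory (the paths and bucket name here are the placeholders from the question):

```r
library(googleCloudStorageR)
gcs_auth(json_file = 'my/gcs/admin/creds.json')

# uploads path-to-all-images as one zip object in my-bucket,
# using the bucket-level ACL so no legacy per-object ACL is sent
gcs_save_all(directory = 'path-to-all-images', bucket = 'my-bucket')
```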
Second answer:

Edit:

I opened an issue on googleCloudStorageR and they've fixed the problem in the latest version on GitHub, so updating to that will allow you to do this:

googleCloudStorageR::gcs_save_all(
  directory = 'path-to-all-images',
  bucket = 'my-bucket',
  predefinedAcl = "bucketLevel"
)

Original answer:

GCS has two ways of handling authorization for objects in a bucket: ACLs and IAM. In order to simplify and allow users to only worry about one authorization scheme, buckets can enable "uniform bucket-level access," which prevents the use of ACLs. Presumably, you have this feature enabled on your bucket.

When you set predefinedAcl = 'default', you specified that GCS shouldn't do anything special with ACLs, which is the correct setting when uploading to a bucket with uniform bucket-level access.

It appears gcs_save_all does not expose such a parameter, and its default appears to be the predefined ACL "private", which is not a valid choice when uniform bucket-level access is enabled.
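
You can confirm whether uniform bucket-level access is enabled on your bucket by inspecting its metadata from R. This is a sketch, assuming the bucket resource returned by gcs_get_bucket() follows the GCS JSON API shape (where the setting lives under iamConfiguration.uniformBucketLevelAccess):

```r
library(googleCloudStorageR)
gcs_auth(json_file = 'my/gcs/admin/creds.json')

# fetch the bucket resource from the GCS JSON API
b <- gcs_get_bucket('my-bucket')

# TRUE if uniform bucket-level access is enabled,
# in which case legacy per-object ACLs are rejected
b$iamConfiguration$uniformBucketLevelAccess$enabled
```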