How to zip all files in a Google cloud storage bucket?

I need to zip all files in a given Cloud Storage bucket. I've already tried several alternatives, but nothing works so far.

At the moment, I'm only able to retrieve the blobs in the bucket, but I also need to generate a zip file grouping all these files and write it to the bucket.

Could you please help me with this issue?

Here is the code so far:

from google.cloud import storage
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="file.json"

def zip_in_bucket(bucketname):
    client = storage.Client()
    bucket = client.get_bucket(bucketname)
    object_generator = bucket.list_blobs()
    for blob in object_generator:
        print(blob.name)

Output:

file-1.csv
file-2.csv
file-3.csv
file-4.csv
file-5.csv

1 Answer

There are two ways to solve this problem. Either you use the Google Cloud Shell:

# Download every object from the bucket into the current directory
gsutil cp -r gs://<<bucketname>> .
# Zip the downloaded files matching your pattern
zip <<NameOfZip>>.zip ./<<*_Filepattern*>>
# Upload the archive back to the bucket
gsutil cp <<NameOfZip>>.zip gs://<<bucketname>>

Or you can do it from your Python code. The catch is that you have to stream the files through your Cloud Function:

from google.cloud import storage
from zipfile import ZipFile

source_bucket_name = "<<bucketname>>"
storage_client = storage.Client()
source_bucket = storage_client.get_bucket(source_bucket_name)

# With a delimiter, list_blobs exposes the sub-"folders" under the prefix
# via iterator.prefixes (populated once the iterator has been consumed)
iterator = source_bucket.list_blobs(prefix='20210702/', delimiter='/')
list(iterator)
for folder in iterator.prefixes:
    print(folder)
    # folder looks like '20210702/<name>/'
    parts = folder.rstrip('/').split('/')
    zip_name = 'LOG' + parts[1] + '_' + parts[0] + 'XXXX.zip'
    with ZipFile(zip_name, 'w') as archive:
        # Stream every blob under this sub-prefix into the local archive
        for blob in source_bucket.list_blobs(prefix=folder):
            archive.writestr(blob.name.split('/')[-1], blob.download_as_bytes())
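
The snippet above writes the archive to local disk, but the question also asks to write the zip back into the bucket. As a minimal sketch (the function name and the target blob name archive.zip are my own assumptions, not from the original answer), you can build the archive entirely in memory with io.BytesIO and upload it straight back, which avoids local disk and fits the Cloud Function constraint mentioned above:

import io
from zipfile import ZipFile
from google.cloud import storage

def zip_bucket_to_blob(bucket_name, prefix="", zip_blob_name="archive.zip"):
    # Hypothetical helper: zips every blob under `prefix` in memory and
    # uploads the archive back to the same bucket as `zip_blob_name`
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)

    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as archive:
        for blob in bucket.list_blobs(prefix=prefix):
            if blob.name == zip_blob_name:
                continue  # don't zip a previously uploaded archive into itself
            archive.writestr(blob.name, blob.download_as_bytes())

    # Rewind the buffer and stream the finished archive into the bucket
    buffer.seek(0)
    bucket.blob(zip_blob_name).upload_from_file(buffer, content_type="application/zip")

zip_bucket_to_blob("<<bucketname>>")

Note that the whole archive is held in memory, so this only works while the bucket contents fit within the Cloud Function's available RAM.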