What is the Google Cloud Storage (GCS) equivalent of accessing AWS S3 with s3fs?


I want to access Google Cloud Storage the same way I access S3 in the code below.

# Amazon S3 connection
import s3fs
from PIL import Image

fs = s3fs.S3FileSystem()  # open() is a method of the filesystem object

with fs.open("s3://mybucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")


# Is there equivalent code on the GCP side, something like this?
with cloudstorage.open("gs://my_bucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")

There are 2 answers below.

Best answer

You're looking for gcsfs. Both s3fs and gcsfs are part of the fsspec project and have very similar APIs.

import gcsfs
from PIL import Image

fs = gcsfs.GCSFileSystem()

with fs.open("gs://my_bucket/image1.jpg") as f:
    image = Image.open(f).convert("RGB")

Both can also be accessed through the generic fsspec interface, as long as the underlying drivers (s3fs, gcsfs) are installed, e.g.:

import fsspec
from PIL import Image

with fsspec.open('s3://my_s3_bucket/image1.jpg') as f:
    image1 = Image.open(f).convert("RGB")

with fsspec.open('gs://my_gs_bucket/image1.jpg') as f:
    image2 = Image.open(f).convert("RGB")

# fsspec handles local paths too!
with fsspec.open('/Users/myname/Downloads/image1.jpg') as f:
    image3 = Image.open(f).convert("RGB")
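
If a driver is missing, the fsspec.open call fails at runtime, so it can help to check up front. The following is a minimal sketch using only the standard library; the DRIVERS table and the helper names required_driver and driver_missing are hypothetical and cover only the schemes used above.

```python
from importlib.util import find_spec

# Driver package needed for each URL scheme.
# Illustrative subset only: just the schemes discussed in this answer.
DRIVERS = {"s3": "s3fs", "gs": "gcsfs", "gcs": "gcsfs"}


def required_driver(url):
    """Return the driver package a URL needs, or None if none is known."""
    scheme, sep, _ = url.partition("://")
    if not sep:
        return None  # no scheme: treat as a plain local path
    return DRIVERS.get(scheme)  # None for schemes outside this small table


def driver_missing(url):
    """Return True if the URL needs a driver that is not importable."""
    package = required_driver(url)
    return package is not None and find_spec(package) is None


for url in ("s3://my_s3_bucket/image1.jpg",
            "gs://my_gs_bucket/image1.jpg",
            "/Users/myname/Downloads/image1.jpg"):
    print(url, "->", required_driver(url))
```

A missing driver is normally fixed with pip install s3fs or pip install gcsfs.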

fsspec is the file system layer that pandas and other libraries use to handle cloud URLs. The following "just works" because fsspec provides the cloud URI handling:

import pandas as pd

pd.read_csv("s3://path/to/my/aws.csv")
pd.read_csv("gs://path/to/my/google.csv")
pd.read_csv("my/local.csv")
Second answer

You can also use the Cloud Storage client libraries. Their documentation includes an example.

Client libraries are available for Python. If there is no client library for your language, you have to call the API directly.
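
As a rough sketch of the client-library route in Python, assuming the google-cloud-storage package is installed and Application Default Credentials are configured: the helper names split_gs_url and open_gcs_object are hypothetical, while Client, bucket, blob and download_as_bytes come from the client library.

```python
import io


def split_gs_url(url):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError(f"not a gs:// URL: {url!r}")
    bucket, _, name = url[len(prefix):].partition("/")
    if not bucket or not name:
        raise ValueError(f"missing bucket or object name: {url!r}")
    return bucket, name


def open_gcs_object(url, client=None):
    """Download a GCS object and return it as an in-memory binary file."""
    if client is None:
        # Imported lazily so the helper can be tried with a stand-in client.
        from google.cloud import storage
        client = storage.Client()  # assumes default credentials are set up
    bucket_name, blob_name = split_gs_url(url)
    blob = client.bucket(bucket_name).blob(blob_name)
    # The whole object is read into memory; fine for images, not huge files.
    return io.BytesIO(blob.download_as_bytes())
```

The returned object can be passed to Image.open(...) just like the file handles in the question. Unlike gcsfs, this downloads the full object before returning.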