Unable to access mounted GCS bucket from within Vertex AI Jupyter Notebook


I've mounted a GCS bucket within a Vertex AI notebook using the following commands:


MY_BUCKET=cloud-ai-platform-a013866a-a18a-470f-9d35-f485abb17e82

cd ~/

mkdir -p gcs

gcsfuse --implicit-dirs --rename-dir-limit=100 --disable-http2 --max-conns-per-host=100 $MY_BUCKET "/home/jupyter/gcs"

Within the terminal I can run ls gcs/ and get a list of the directories within the mounted bucket (test and uncorrupted_split_heightmaps), but when I try to access these directories from within a Jupyter Notebook, they cannot be found.

Running the following code within a Jupyter Notebook:

import os
print(os.listdir('../gcs'))

gives the output:

[]

instead of the expected output:

['test', 'uncorrupted_split_heightmaps']

And

from tensorflow.keras.preprocessing.image import ImageDataGenerator
idg = ImageDataGenerator()
heightmap_iterator = idg.flow_from_directory('../gcs/test', 
                                             target_size = (256, 256), 
                                             batch_size = 8,
                                             color_mode = 'grayscale',
                                             classes = [''])

gives the output:

Found 0 images belonging to 1 classes.

instead of the expected output:

Found 732458 images belonging to 1 classes.

How can I access the mounted GCS bucket from within a Jupyter Notebook?


There are 2 best solutions below

0
On

Alishan's answer is right. I just wanted to add that if you are using managed notebooks, you might no longer need gcsfuse, since you can already view and access Cloud Storage buckets using the "Browse GCS" button on the far left, as mentioned in the documentation.

You can simply point to the objects in those buckets (e.g. images) from a Jupyter notebook using Cloud Storage paths (gs://{name of bucket}/{directory}). In your example, gcs is the local mount point rather than a directory inside the bucket, so the path would probably be 'gs://cloud-ai-platform-a013866a-a18a-470f-9d35-f485abb17e82/test'
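As a rough sketch of that mapping, the helper below turns a path under the gcsfuse mount point into the matching gs:// path. The function names to_gcs_uri and list_gcs_dir are made up for illustration, and the default mount point is taken from the question. list_gcs_dir assumes TensorFlow is installed and the notebook has credentials for the bucket. It uses tf.io.gfile, which accepts gs:// paths. Whether flow_from_directory itself accepts them is not something this sketch verifies.

```python
import posixpath


def to_gcs_uri(bucket: str, local_path: str, mount_point: str = "/home/jupyter/gcs") -> str:
    """Map a path under the gcsfuse mount point to the matching gs:// path."""
    local = posixpath.normpath(local_path)
    mount = posixpath.normpath(mount_point)
    # Reject paths that are not inside the mount point.
    if local != mount and not local.startswith(mount + "/"):
        raise ValueError(f"{local_path!r} is not under {mount_point!r}")
    relative = local[len(mount):].lstrip("/")
    return f"gs://{bucket}/{relative}" if relative else f"gs://{bucket}"


def list_gcs_dir(uri: str) -> list:
    """List entries under a gs:// path (assumes TensorFlow and bucket access)."""
    import tensorflow as tf  # imported lazily so to_gcs_uri works without TensorFlow

    return tf.io.gfile.listdir(uri)


BUCKET = "cloud-ai-platform-a013866a-a18a-470f-9d35-f485abb17e82"
print(to_gcs_uri(BUCKET, "/home/jupyter/gcs/test"))
```

Calling list_gcs_dir on the printed path from inside the notebook is one way to confirm the kernel can reach the bucket without going through the mount at all.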

1
On

Try using a user-managed notebook instead of a managed notebook. That resolved the issue for me.
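If switching notebook types is not an option, it may help to first check what the notebook kernel actually sees. A relative path like '../gcs' resolves against the kernel's working directory, and the kernel may not share the terminal's view of the filesystem. That is an assumption about the cause, not something confirmed in this thread. The sketch below uses only the standard library. inspect_mount is a hypothetical helper name, and the path is the mount point from the question.

```python
import os


def inspect_mount(path: str) -> dict:
    """Report how a path resolves from the current (kernel) process."""
    absolute = os.path.abspath(path)
    exists = os.path.isdir(absolute)
    return {
        "cwd": os.getcwd(),                       # where relative paths start from
        "absolute": absolute,                     # what the path actually resolves to
        "exists": exists,
        "is_mount": os.path.ismount(absolute),    # False suggests the mount is not visible here
        "entries": sorted(os.listdir(absolute)) if exists else [],
    }


# Run this in a notebook cell and compare it with `ls /home/jupyter/gcs` in the terminal.
print(inspect_mount("/home/jupyter/gcs"))
```

If is_mount is False or entries is empty in the notebook while the terminal lists the directories, the kernel is probably not seeing the gcsfuse mount, and an absolute path alone will not fix it.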