I am trying to create an automated pipeline that gets files from the FiftyOne API and loads them into S3. From what I have seen, the fiftyone package can only download them locally.
import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="validation",
    classes=["Cat", "Dog"],
    max_samples=100,
    label_types=["detections"],
    seed=51,
    dataset_name="open-images-pets",
)
That's the code I use to download the files; the problem is that they download locally. Does anyone have experience with this and know how it could be done?
Thank you!
You're right that the code snippet you shared will download the files from Open Images to whatever local machine you are working on. From there, you can use something like boto3 to upload the files to S3. Then, you may want to check out the examples for using s3fs-fuse and FiftyOne to see how you can mount those cloud files and use them in FiftyOne. Directly using FiftyOne inside of a SageMaker notebook is in development.
Note that FiftyOne Teams has more support for cloud data, with methods to upload/download to the cloud and use cloud objects directly rather than with s3fs-fuse.
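As a rough illustration of the download-then-upload approach described above, here is a minimal sketch of the boto3 step. The helper names (`build_s3_key`, `upload_files`), the bucket name, and the key prefix are hypothetical and not part of FiftyOne or boto3; the only boto3 call used is the S3 client's `upload_file(Filename, Bucket, Key)`. It assumes AWS credentials are already configured in the environment.

```python
import os


def build_s3_key(filepath, local_root, prefix=""):
    """Map a local file path to an S3 key that mirrors its location under local_root."""
    # Path of the file relative to the download directory, with forward slashes
    relative = os.path.relpath(filepath, local_root).replace(os.sep, "/")
    prefix = prefix.strip("/")
    return f"{prefix}/{relative}" if prefix else relative


def upload_files(filepaths, bucket, local_root, prefix="", client=None):
    """Upload each local file to the bucket and return the S3 keys that were used."""
    if client is None:
        # Imported lazily so the helpers can be exercised without boto3 installed
        import boto3

        client = boto3.client("s3")

    uploaded_keys = []
    for filepath in filepaths:
        key = build_s3_key(filepath, local_root, prefix)
        # upload_file(Filename, Bucket, Key) is the standard boto3 S3 client call
        client.upload_file(str(filepath), bucket, key)
        uploaded_keys.append(key)
    return uploaded_keys
```

With the dataset from the question, the local paths could be collected with `dataset.values("filepath")` and passed in as `filepaths`, with `local_root` set to wherever the zoo download landed (by default the directory in `fo.config.dataset_zoo_dir`). Mirroring the local layout in the keys is just one possible design choice; it keeps the images easy to locate again if the bucket is later mounted with s3fs-fuse.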