I'm trying to read a file geodatabase into a GeoDataFrame using the geopandas Python library. The geodatabase is on S3, so I'm using fsspec
to read it in, but I'm getting an error:
import geopandas as gpd
import fsspec
fs = fsspec.filesystem('s3', profile='my-profile', anon=False)
Reading a GeoJSON file this way works:
# this runs w/o error
g_file = fs.open("my-bucket/my-file.geojson")
gdf = gpd.read_file(g_file)
But reading the geodatabase the same way causes an error:
gdb_file = fs.open("my-bucket/my-file.gdb/")
gdf = gpd.read_file(gdb_file, driver="FileGDB")
Here's the error traceback:
---------------------------------------------------------------------------
CPLE_OpenFailedError Traceback (most recent call last)
fiona/_shim.pyx in fiona._shim.gdal_open_vector()
fiona/_err.pyx in fiona._err.exc_wrap_pointer()
CPLE_OpenFailedError: '/vsimem/83f6a4d8051c449c86c4c608520eb998' not recognized as a supported file format.
During handling of the above exception, another exception occurred:
DriverError Traceback (most recent call last)
<ipython-input-33-7245da312526> in <module>
----> 1 gdf = gpd.read_file(file, driver='FileGDB')
~/my-conda-envs/nwm/lib/python3.7/site-packages/geopandas/io/file.py in _read_file(filename, bbox, mask, rows, **kwargs)
158
159 with fiona_env():
--> 160 with reader(path_or_bytes, **kwargs) as features:
161
162 # In a future Fiona release the crs attribute of features will
~/my-conda-envs/nwm/lib/python3.7/site-packages/fiona/collection.py in __init__(self, bytesbuf, **kwds)
554 # Instantiate the parent class.
555 super(BytesCollection, self).__init__(self.virtual_file, vsi=filetype,
--> 556 encoding='utf-8', **kwds)
557
558 def close(self):
~/my-conda-envs/nwm/lib/python3.7/site-packages/fiona/collection.py in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, ignore_fields, ignore_geometry, **kwargs)
160 if self.mode == 'r':
161 self.session = Session()
--> 162 self.session.start(self, **kwargs)
163 elif self.mode in ('a', 'w'):
164 self.session = WritingSession()
fiona/ogrext.pyx in fiona.ogrext.Session.start()
fiona/_shim.pyx in fiona._shim.gdal_open_vector()
DriverError: '/vsimem/83f6a4d8051c449c86c4c608520eb998' not recognized as a supported file format.
One other potential clue: I can get it to work by doing simply:
gdf = gpd.read_file("s3://my-bucket/my-file.gdb/", driver="FileGDB")
But that only works on a machine that is part of the bucket access policy. What I want is to access the data from any machine using the AWS credentials stored in the my-profile
profile.
Unfortunately, I can't provide a way to reproduce the error since I'm doing everything on the cloud. It works fine locally...
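For the goal of using the my-profile credentials with the direct `s3://` read, one thing that may be worth trying is to make the named profile visible through environment variables before calling geopandas. The sketch below is an untested assumption for this setup: `aws_profile` and `read_gdb_with_profile` are hypothetical helper names, and whether `AWS_PROFILE` or `AWS_DEFAULT_PROFILE` is honored depends on the GDAL and boto versions installed, so both are set.

```python
import os
from contextlib import contextmanager


@contextmanager
def aws_profile(name):
    """Temporarily point AWS credential lookup at a named profile."""
    # Both variable names are set because which one is consulted depends on
    # the library and version in use (an assumption to verify locally).
    keys = ("AWS_PROFILE", "AWS_DEFAULT_PROFILE")
    saved = {key: os.environ.get(key) for key in keys}
    os.environ.update({key: name for key in keys})
    try:
        yield
    finally:
        # Restore whatever was set before entering the block.
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


def read_gdb_with_profile(url, profile, **kwargs):
    """Read a dataset by URL while the named profile is active."""
    import geopandas as gpd  # imported lazily; the helper above is stdlib-only

    with aws_profile(profile):
        return gpd.read_file(url, **kwargs)


# Hypothetical usage, mirroring the call from the question:
# gdf = read_gdb_with_profile("s3://my-bucket/my-file.gdb/", "my-profile",
#                             driver="FileGDB")
```

If this has no effect, the credentials may already be resolved elsewhere in the process, in which case setting the variable before starting Python would be the next thing to check.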
We are seeing similar issues using read-only keys for S3 locations and shapefiles (and possibly even NAS folders with read-only permissions).
Can you try both with keys that have read-write permissions and with read-only keys? My guess is that the GDAL drivers on the back end need write access even though only reading is desired.
The driver issue is hinted at in the last part of the error trace (the DriverError).
If anyone can confirm the specific permissions the GDAL drivers need, that would be great!
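Until the permissions question is settled, one possible workaround is to copy the geodatabase to local disk first, since reading reportedly works fine locally. A file geodatabase is a directory of files rather than a single file, so the copy has to be recursive. This is a minimal sketch, not a confirmed fix: `fetch_gdb` is a hypothetical helper, `fs` is assumed to be the fsspec filesystem from the question, and the resulting directory layout can differ between fsspec versions.

```python
import os
import tempfile


def fetch_gdb(fs, remote_path, local_root=None):
    """Copy a remote .gdb directory to local disk and return its local path.

    `fs` is any fsspec-style filesystem exposing
    `get(rpath, lpath, recursive=True)`.
    """
    remote_path = remote_path.rstrip("/")
    name = os.path.basename(remote_path)
    local_root = local_root or tempfile.mkdtemp(prefix="gdb-")
    local_path = os.path.join(local_root, name)

    # A file geodatabase is a directory, so copy it recursively.
    fs.get(remote_path, local_path, recursive=True)

    # Some fsspec versions may nest the source directory inside the target;
    # return whichever directory actually holds the copy.
    nested = os.path.join(local_path, name)
    return nested if os.path.isdir(nested) else local_path


# Hypothetical usage with the names from the question:
# fs = fsspec.filesystem("s3", profile="my-profile", anon=False)
# local_gdb = fetch_gdb(fs, "my-bucket/my-file.gdb/")
# gdf = gpd.read_file(local_gdb, driver="FileGDB")
```

The copy only needs read access to the bucket, so it should also help separate a credentials problem from a driver problem.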