I am trying to load netCDF files from a THREDDS server, but I hit an error I don't recognize after a certain timestep.
import datetime as dt

import pandas as pd
import xarray as xr


def list_dates(start, end):
    # Dates from start up to, but not including, end.
    num_days = (end - start).days
    return [start + dt.timedelta(days=x) for x in range(num_days)]


start_date = dt.date(2017, 3, 1)
end_date = dt.date(2017, 3, 31)
date_list = list_dates(start_date, end_date)
window = dt.timedelta(days=5)
url = 'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/{0:%Y%m}/avhrr-only-v2.{0:%Y%m%d}.nc'
#url2= 'https://www.ncei.noaa.gov/thredds/dodsC/Datasets/noaa.oisst.v2.highres/icec.day.mean.{0:%Y}.v2.nc'
data = []
for cur_date in date_list:
    print(cur_date)
    # Open every daily file in the window around cur_date and average SST over time.
    date_window = list_dates(cur_date - window, cur_date + window)
    url_list = [url.format(x) for x in date_window]
    window_data = xr.open_mfdataset(url_list).sst
    data.append(window_data.mean('time'))
    print(data[-1])
# Stack the per-day window means along a new time dimension.
dataf = xr.concat(data, dim=pd.DatetimeIndex(date_list, name='time'))
print(dataf)
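To see exactly which files each iteration requests, the date arithmetic can be checked without touching the server. This is a minimal sketch using only the standard library, with the same `list_dates` logic and URL template as the script; `urls_for` is a hypothetical helper name introduced here for illustration.

```python
import datetime as dt

URL = ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/'
       '{0:%Y%m}/avhrr-only-v2.{0:%Y%m%d}.nc')


def list_dates(start, end):
    """Dates from start up to, but not including, end."""
    num_days = (end - start).days
    return [start + dt.timedelta(days=x) for x in range(num_days)]


def urls_for(cur_date, window=dt.timedelta(days=5)):
    """Hypothetical helper: the URLs one loop iteration would open."""
    dates = list_dates(cur_date - window, cur_date + window)
    return [URL.format(d) for d in dates]


urls = urls_for(dt.date(2017, 3, 22))
print(len(urls))  # number of files requested for this date
print(urls[0])    # first file in the window
print(urls[-1])   # last file in the window
```

Because `range(num_days)` excludes the end date, each window spans 10 files (five days before `cur_date` through four days after), and `date_list` itself stops at March 30 rather than March 31.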
Loading goes smoothly until March 22, where it fails. I have tried changing months and years, and the script fails at the 22nd timestep every time. Two errors appear, which I provide below. Any information on what is going on here would be greatly appreciated. For reference, I am running the latest versions of Python, netCDF4, and xarray.
Errors:
Error 1: KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/201703/avhrr-only-v2.20170322.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]
Error 2: OSError: [Errno -37] NetCDF: Write to read only: b'https://www.ncei.noaa.gov/thredds/dodsC/OisstBase/NetCDF/V2.0/AVHRR/201703/avhrr-only-v2.20170322.nc'
I'm not entirely sure what the underlying issue is, but something seems to be going sideways between the xarray netCDF4 cache and the netCDF-C library. If you add the following somewhere before your `for` loop:

that will help make sure any files (or `netCDF4.Dataset` objects, in this case) in the cache get cleaned up when you are done with them, and the underlying issue is avoided (not sure for how long, but I ran the code from 2017-03-01 to 2017-05-31 without an error).

Again, I'm not sure exactly what is happening, but it is probably worth opening a GitHub issue with xarray. It might not be an issue at the `xarray` level, but it will get to the right place from there.
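One setting that fits the description of cleaning up cached files, offered as an assumption rather than as the exact line intended here, is xarray's `file_cache_maxsize` option, which caps how many open file handles xarray keeps in its global cache. The sketch below sets it to 10 (an illustrative value) and also shows a hypothetical `window_mean` helper that closes each window's files explicitly once the mean has been loaded into memory.

```python
import xarray as xr

# Assumption: limit xarray's global cache of open file handles so that
# older netCDF4.Dataset objects are closed instead of piling up.
# The value 10 is illustrative, not a recommendation from the answer.
xr.set_options(file_cache_maxsize=10)


def window_mean(url_list):
    """Hypothetical helper: mean SST over the files in url_list."""
    # Leaving the with-block closes the files behind the dataset.
    with xr.open_mfdataset(url_list) as ds:
        # .load() pulls the result into memory before the files close.
        return ds.sst.mean('time').load()
```

Inside the loop, `data.append(window_mean(url_list))` would then replace the two lines that open the dataset and append its mean. Whether either change avoids the error on a given setup is untested here.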