Python: save OPeNDAP data to a NetCDF file


I am opening NetCDF data from an OPeNDAP server (a subset of the data) using a URL. As far as I can see, the data is not actually loaded when I open it, only when a variable is requested. I would like to save the data to a file on disk. How would I do this?

I currently have:

import numpy as np
import netCDF4 as NC

url = u'http://etc/etc/hourly?varname[0:1:10][0:1:30]'
dataset = NC.Dataset(url)  # I think only the "layout" (metadata) is loaded here, not the data
varData = dataset.variables['varname'][:, :]  # I think the data is loaded here

# Now I want to save this data to a file (for example test.nc); dataset.close() obviously won't work

Hope someone can help, thanks!


There are 2 best solutions below


It's quite simple: create a new NetCDF file and copy whatever you want into it. Much of this can be automated by copying the dimensions, NetCDF attributes, and so on from the input file. I quickly coded the example below. Here the input is a local file, but if reading over OPeNDAP already works for you, passing the URL to Dataset should work the same way.

import netCDF4 as nc4

# Open input file in read (r), and output file in write (w) mode:
nc_in = nc4.Dataset('drycblles.default.0000000.nc', 'r')
nc_out = nc4.Dataset('local_copy.nc', 'w')

# For simplicity, copy all dimensions to the output file, keeping
# unlimited dimensions unlimited (size None) and the rest at their size:
for name, dim in nc_in.dimensions.items():
    nc_out.createDimension(name, None if dim.isunlimited() else dim.size)

# List of variables to copy (they have to exist in nc_in).
# If you want all variables, this list could be replaced with nc_in.variables:
vars_out = ['z', 'zh', 't', 'th', 'thgrad']

for var in vars_out:
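    # Create variable in new file; _FillValue (if present) can only be set
    # at creation time, so pass it to createVariable:
    var_in  = nc_in.variables[var]
    fill_value = getattr(var_in, '_FillValue', None)
    var_out = nc_out.createVariable(var, datatype=var_in.dtype, dimensions=var_in.dimensions, fill_value=fill_value)

    # Copy the remaining NetCDF attributes:
    for attr in var_in.ncattrs():
        if attr != '_FillValue':
            var_out.setncattr(attr, var_in.getncattr(attr))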

    # Copy data:
    var_out[:] = var_in[:]

nc_out.close()
nc_in.close()

Hope it helps, if not let me know.


If you can use xarray, this should work:

import xarray as xr

url = u'http://etc/etc/hourly?varname[0:1:10][0:1:30]'
ds = xr.open_dataset(url, engine='netcdf4')  # or engine='pydap'
ds.to_netcdf('test.nc')

The xarray documentation has another example of how you could do this.
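
If only some variables or a slice of the data should end up on disk, the selection can be made before writing, since xarray also defers reading the values until they are needed. The following is a minimal sketch under that assumption; save_subset and its parameters are hypothetical names chosen for illustration, and source can be a local path or an OPeNDAP URL.

```python
import xarray as xr


def save_subset(source, out_path, var_names, indexers=None):
    """Open `source` lazily, keep `var_names`, optionally slice, and write to disk.

    Hypothetical helper for illustration; `indexers` maps dimension names
    to integer indices or slices, as accepted by Dataset.isel.
    """
    with xr.open_dataset(source) as ds:
        subset = ds[list(var_names)]        # still lazy: no values read yet
        if indexers:
            subset = subset.isel(indexers)  # positional selection per dimension
        subset.to_netcdf(out_path)          # values are read and written here
```

For example, save_subset(url, 'test.nc', ['varname'], {'time': slice(0, 10)}) would write the first ten steps along a dimension called time, assuming the dataset has a dimension with that name.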