Read multiple .nc files into a 3D pandas DataFrame in Python

I would like to read in multiple SST netCDF files, extract from each file the SST data within a selected lat/lon range, and store the result in a three-dimensional pandas DataFrame, closing each netCDF file after it has been read to save memory.

I would like to end up with one DataFrame holding a year's worth of daily data.

I have read one file with netCDF4 and stored each variable, but that is as far as I have got.

import netCDF4

my_file = 'C:/Users/lisa/Desktop/Sean/20160719000127-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc'
fh = netCDF4.Dataset(my_file, mode='r')
lon = fh.variables['lon'][:]
lat = fh.variables['lat'][:]
time = fh.variables['time'][:]
sst = fh.variables['sea_surface_temperature'][:]
fh.close()  # the arrays read above are already in memory

The data is from OPeNDAP for 2016 from the following address.

http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/

Any help would be much appreciated!!

There are 2 best solutions below

A pandas.DataFrame is two-dimensional, so it does not support 3-dimensional data in this way. This use case is exactly why xarray was developed.

To do what you're trying to do in xarray:

import xarray as xr

ds = xr.open_mfdataset(['file1.nc', 'file2.nc', 'file3.nc'])

This will concatenate your files and put everything in one xarray.Dataset. Getting 1D or 2D data into pandas is then pretty easy:

ds.sel(lat=36.0, lon=42.5).to_dataframe()
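
To pull out a lat/lon range rather than a single point, label-based slices should work on the same kind of Dataset. Below is a minimal sketch that uses a small synthetic dataset in place of the real files; the variable name sst, the grid values and the box limits are made up for illustration, and the slices assume the coordinates are stored in ascending order.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the concatenated files: 3 days on a 4 x 5 grid.
times = pd.date_range("2016-01-01", periods=3, freq="D")
lats = np.array([34.0, 35.0, 36.0, 37.0])
lons = np.array([40.0, 41.0, 42.0, 43.0, 44.0])
values = np.arange(3 * 4 * 5, dtype=float).reshape(3, 4, 5)

ds = xr.Dataset(
    {"sst": (("time", "lat", "lon"), values)},
    coords={"time": times, "lat": lats, "lon": lons},
)

# Label-based slices include both end points (ascending coordinates assumed).
box = ds.sel(lat=slice(35.0, 36.0), lon=slice(41.0, 43.0))

# Flatten the box into a DataFrame with a MultiIndex over time, lat and lon.
df = box.to_dataframe()
print(df.head())
```

The result is a long-format table with one row per (time, lat, lon) combination, which is the usual way to hold 3D data in pandas.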

I would suggest preprocessing with CDO, e.g.

cdo mergetime 2016*-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc merged.nc
cdo sellonlatbox,lon1,lon2,lat1,lat2 merged.nc box_2016.nc

You may have an open-file limit (e.g. 256) on your system, in which case you will need to split the mergetime command into a loop over months, extract the area from each monthly file, and then do a final mergetime on the 12 monthly files at the end.
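
The monthly loop described above could be sketched as follows. This only builds the command lines as strings so they can be inspected or pasted into a shell; the helper name monthly_cdo_commands, the intermediate file names and the box limits are hypothetical, and the wildcard assumes the file names start with a YYYYMM date stamp, as in the question.

```python
def monthly_cdo_commands(year, suffix, box):
    """Build CDO commands: merge and crop each month, then merge the 12 results."""
    lon1, lon2, lat1, lat2 = box
    commands = []
    monthly_files = []
    for month in range(1, 13):
        stamp = f"{year}{month:02d}"
        merged = f"merged_{stamp}.nc"
        boxed = f"box_{stamp}.nc"
        # Merge one month of files (the shell expands the wildcard).
        commands.append(f"cdo mergetime {stamp}*{suffix} {merged}")
        # Cut out the region of interest from the monthly file.
        commands.append(
            f"cdo sellonlatbox,{lon1},{lon2},{lat1},{lat2} {merged} {boxed}"
        )
        monthly_files.append(boxed)
    # Final merge of the 12 cropped monthly files into one yearly file.
    commands.append(f"cdo mergetime {' '.join(monthly_files)} box_{year}.nc")
    return commands


commands = monthly_cdo_commands(
    2016,
    "-UoS-L2i-SSTskin-ISAR_002-D054_PtA-v01.0-fv01.5.nc",
    (40, 44, 34, 37),  # hypothetical lon1, lon2, lat1, lat2
)
for line in commands:
    print(line)
```

Each mergetime call then opens at most one month of files, which should stay well below a limit of 256 for daily data.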