Grib2 data extraction with xarray and cfgrib very slow, how to improve the code?


The code takes about 20 minutes to load one month of data for each variable, with 168 time steps, for the 00 and 12 UTC cycles of each day. Saving to CSV takes even longer: it has been running for almost a day and still hasn't saved a single station. How can I improve the code below?

[Image: screenshot of the code; not available]

Jeff Coldplume On

Reading .grib files using xr.open_mfdataset() and cfgrib:

I can speak to the slowness of reading grib files with xr.open_mfdataset(). I had a similar task where I was reading many grib files into xarray and it was taking forever. Other people have reported similar issues as well (see here).

According to the issue raised here, "cfgrib is not optimized to handle files with a huge number of fields even if they are small."

One thing that worked for me was converting as many of the individual grib files as I could into one (or several) netcdf files and then reading the newly created netcdf file(s) into xarray instead. Here is a link showing several different ways to do this. I went with the grib_to_netcdf command from the ecCodes tools.
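To make the conversion step concrete, here is a minimal sketch of how it could be scripted from Python. It is not the exact code I used: the function names, the "*.grib2" pattern and the paths are made up for illustration, and it assumes the ecCodes command-line tools are installed and on your PATH. As far as I know, grib_to_netcdf only handles regular lat/lon and regular Gaussian grids, so check that your files qualify first.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(grib_paths, netcdf_path):
    """Build the argv list for: grib_to_netcdf -o <netcdf_path> <grib files...>"""
    # Sort the inputs so the files are always passed in the same order.
    inputs = sorted(str(p) for p in grib_paths)
    return ["grib_to_netcdf", "-o", str(netcdf_path), *inputs]


def convert_gribs(grib_dir, netcdf_path, pattern="*.grib2"):
    """Convert every grib file in grib_dir matching pattern into one netcdf file.

    The pattern and directory layout are assumptions; adjust them to your data.
    """
    grib_paths = sorted(Path(grib_dir).glob(pattern))
    if not grib_paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {grib_dir}")
    if shutil.which("grib_to_netcdf") is None:
        raise RuntimeError("grib_to_netcdf not found on PATH (is ecCodes installed?)")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(grib_paths, netcdf_path), check=True)
    return Path(netcdf_path)


def open_converted(netcdf_paths):
    """Open the converted netcdf file(s) lazily with xarray (requires dask)."""
    import xarray as xr  # imported here so the conversion step works without xarray

    return xr.open_mfdataset(netcdf_paths, combine="by_coords")
```

You would then call something like convert_gribs("data/2023-01", "2023-01.nc") once per month (hypothetical paths) and pass the resulting file(s) to open_converted(). The conversion is a one-time cost, so every later run reads the netcdf files directly instead of going through cfgrib again.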

In summary, I would start by converting your grib files to netcdf, since xarray should be able to read the netcdf data much faster than it reads grib through cfgrib. Then you can focus on other optimizations further down in your code.

I hope this helps!