How can I plot from hdf4 file using Python

61 Views Asked by At

I have downloaded MODIS land use land cover data.The key name of the data product is MCD12C1.Now I want to plot/visualize different land use land cover data on map using python.`

import numpy as np
import pandas as pd
import rioxarray
import rasterio as rio
from osgeo import gdal
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep 

title='c:/MCD12C1.A2022001.061.2023244164746'
all_bands=[]
with rio.open(f'{title}.HDF') as dataset:
    hdf4_meta = dataset.meta 
    crs = dataset.read_crs()

    for layer_name in [name for name in dataset.subdatasets]:
        print(layer_name)
        with rio.open(layer_name) as subdataset:
            modis_meta = subdataset.profile
            all_bands.append(subdataset.read(1))

I have written the above code.After this how to proceed ? `

1

There are 1 best solutions below

8
raphael On

I guess your problem is that the data is quite large and your script will take ages to finish reading... what you want is to open the file lazily and then only select what you really need!

I'd suggest that you have a look at rioxarray (the rasterio-extension for xarray) and load the data with chunks=True so that the file is loaded as a lazy dask dataset.

Then you can easily pre-select the data and only load what you really need.

Something like this should do the job:

import rioxarray as rxr

file = rxr.open_rasterio(
    "MCD12C1.A2022001.061.2023244164746.hdf", 
    chunks=True # use dask to avoid loading the data immediately
)
df = file.Majority_Land_Cover_Type_1.sel(band=1)

# trigger loading the data
# coordinates
x = df.x.to_numpy()
y = df.y.to_numpy()

# data-values (transposed as required by eomaps)
data = df.to_numpy().T

file.close()

Once you have your dataset loaded, you can plot it with EOmaps like this:

I use m.set_shape.raster(maxsize=1e5) to visualize the data as a 2D raster, pre-aggregated to ~1e5 datapoints (in total it's ~26 million) prior to plotting.

from eomaps import Maps

m = Maps()
m.set_data(data, x, y, crs = df.spatial_ref.crs_wkt)
m.set_shape.raster(maxsize=1e5, aggregator="median")
m.plot_map(cmap="viridis")

enter image description here