Spatiotemporal time series cluster analysis in Python


I want to cluster patterns based on three meteorological fields. Each field has the shape (31, 137, 181) (31 timesteps, 137 latitudes, 181 longitudes), and I want a spatial map of clusters for each timestep. Because the labels from a generic k-means clustering approach have no inherent meaning and do not allow 'tracking' of clusters over time, I tried using tslearn.clustering in Python, but I'm not sure how to structure my data. This is the example that runs for me:

import numpy as np
import xarray as xr
from tslearn.clustering import TimeSeriesKMeans

# Load the data
ds_var1 = xr.open_dataset(var1).var1
ds_var2 = xr.open_dataset(var2).var2
ds_var3 = xr.open_dataset(var3).var3

# Stack the three flattened fields as columns: shape (31*137*181, 3)
X = np.stack((ds_var1.values.flatten(),
              ds_var2.values.flatten(),
              ds_var3.values.flatten()),
              axis=1)

X = X.reshape(-1, 3, 1) ###??

# Cluster the data using time-series k-means
n_clusters = 8
km = TimeSeriesKMeans(n_clusters=n_clusters, 
                      metric="euclidean", 
                      verbose=0, 
                      max_iter=5, 
                      n_init=5, 
                      random_state=0)

labels = km.fit_predict(X)

# Reshape the cluster labels to match the original data shape
labels = labels.reshape(ds_var1.shape)

As far as I understand, X is meant to have the shape (number_of_samples, number_of_frames, features), but it is not obvious to me how this applies to my problem. X = X.reshape(-1, 3, 1) (giving X.shape = (768707, 3, 1)) produces output, but I don't know whether this is actually the correct structure for the data, or whether I need a completely different shape (e.g. X = X.reshape(-1, 3), which also runs). I also could not find this in the documentation. Can anyone help me figure this out?
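To make the two candidate layouts concrete, here is a minimal NumPy-only sketch. It assumes synthetic random arrays in place of the real fields (the names `field1`, `field2`, `field3` are placeholders, not the actual datasets) and assumes the (number_of_samples, number_of_frames, features) convention described above. It only shows what each reshape means and does not settle which one suits the problem.

```python
import numpy as np

# Synthetic stand-ins for the three fields (assumed layout: time, lat, lon).
n_time, n_lat, n_lon = 31, 137, 181
rng = np.random.default_rng(0)
field1 = rng.normal(size=(n_time, n_lat, n_lon))
field2 = rng.normal(size=(n_time, n_lat, n_lon))
field3 = rng.normal(size=(n_time, n_lat, n_lon))

# Layout A: one multivariate time series per grid point.
# Put the variables on a trailing axis, merge lat/lon into one "sample" axis,
# then move that axis to the front: (n_lat * n_lon, n_time, 3).
fields = np.stack((field1, field2, field3), axis=-1)        # (31, 137, 181, 3)
X_series = fields.reshape(n_time, n_lat * n_lon, 3).transpose(1, 0, 2)

# Layout B: the reshape from the question.
# Every (time, lat, lon) cell becomes its own "series" of length 3 with a
# single feature, so the axis treated as time actually runs over the variables.
X_flat = np.stack((field1.ravel(), field2.ravel(), field3.ravel()), axis=1)
X_flat = X_flat.reshape(-1, 3, 1)

print(X_series.shape)  # (24797, 31, 3)
print(X_flat.shape)    # (768707, 3, 1)
```

If the convention assumed here holds, clustering `X_series` would give one label per grid point, i.e. a single (137, 181) map. Clustering `X_flat` would give one label per grid point and timestep, which reshapes to (31, 137, 181) but compares the three variables as if they were consecutive timesteps.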
