How can I combine Nystroem approximation with SpectralClustering in scikit-learn?


I have a collection of very large images. I tessellated these images into patches and computed a feature embedding for each patch with a pretrained ResNet. On average, each image yields about 15000 patches, each with a 1024-dimensional feature vector.

I would like to apply spectral clustering to each image individually, where each image is now a feature matrix of shape [15000, 1024].

This is computationally very expensive, taking roughly 15 minutes per image. I have read about combining the Nystroem approximation with (or within?) spectral clustering, but I cannot figure out how to combine them in scikit-learn.

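from sklearn.cluster import SpectralClustering
from sklearn.kernel_approximation import Nystroem

nystrom = Nystroem(n_components=300)
spc = SpectralClustering(n_clusters=k_clusters, affinity='precomputed', assign_labels='cluster_qr')
image_features_approx = nystrom.fit_transform(image_features)  # shape [n_samples, 300]
cluster_labels = spc.fit_predict(image_features_approx)  # does not work: not a square affinity matrix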

This does not work: with affinity='precomputed', SpectralClustering expects a square affinity matrix of shape [n_samples, n_samples], whereas the Nystroem kernel approximation returns a feature map of shape [n_samples, n_components]. Any idea how to solve this?

Setting affinity to 'precomputed' fails as described above. Setting it to 'nearest_neighbors' or 'rbf' does not help either, because Nystroem only reduces the feature dimension, not the number of samples. There are multiple papers on Nystroem spectral clustering, such as http://www1.cs.columbia.edu/~jebara/papers/ALT2013FSCVTNM.pdf, but my math skills are not sufficient to understand them or to implement the method from scratch.
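
For reference, here is a minimal sketch of one way the two pieces might be connected without ever building the [n_samples, n_samples] matrix. It is not the method of the linked paper and not a built-in scikit-learn feature. It relies on the fact that the Nystroem feature map Z satisfies K ≈ Z Zᵀ, so the leading eigenvectors of the normalized affinity can be read off a thin SVD of the degree-normalized Z. The function name `nystroem_spectral_labels`, the RBF kernel, the k-means label assignment (instead of 'cluster_qr'), and the synthetic blobs standing in for `image_features` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.kernel_approximation import Nystroem


def nystroem_spectral_labels(X, n_clusters, n_components=300, gamma=None,
                             random_state=0):
    """Hypothetical helper: approximate spectral clustering from a Nystroem map."""
    # Z has shape [n_samples, n_components]; Z @ Z.T approximates the RBF kernel.
    Z = Nystroem(kernel="rbf", gamma=gamma, n_components=n_components,
                 random_state=random_state).fit_transform(X)

    # Row sums (degrees) of the approximate affinity, computed as Z @ (Z.T @ 1)
    # so that the full n x n matrix is never materialized.
    degrees = Z @ Z.sum(axis=0)
    degrees = np.maximum(degrees, 1e-12)  # guard against approximation artifacts

    # Left singular vectors of D^{-1/2} Z are eigenvectors of
    # D^{-1/2} (Z Z^T) D^{-1/2}, the normalized approximate affinity.
    Z_norm = Z / np.sqrt(degrees)[:, None]
    U, _, _ = np.linalg.svd(Z_norm, full_matrices=False)
    embedding = U[:, :n_clusters]

    # Normalize each row to unit length, then cluster the embedded points.
    row_norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.maximum(row_norms, 1e-12)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    return km.fit_predict(embedding)


# Synthetic stand-in for one image's [n_patches, n_features] matrix.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X_demo, y_demo = make_blobs(n_samples=300, centers=centers, cluster_std=0.5,
                            random_state=0)
labels_demo = nystroem_spectral_labels(X_demo, n_clusters=3, n_components=50)
```

With this formulation the cost is dominated by the thin SVD of an [n_samples, n_components] matrix rather than an eigendecomposition of the full affinity matrix. Whether the approximation is good enough for the real ResNet features would need to be checked, and `gamma` and `n_components` would likely need tuning.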
