I'm using ELKI's SimplifiedHierarchyExtraction with AnderbergHierarchicalClustering, LatLngDistanceFunction and minClSize = 100.
Besides the "clu_" clusters, I also see 2-3 "mrg_" clusters that contain some DBIDs, but fewer than minClSize.
My question is: what is the best way to handle these "mrg_" clusters?
- passing its DBIDs to one of its "clu_" children?
- taking them as a cluster although they are under the minClSize?
- simply ignoring them?
This is a hierarchical result.
To get the full membership of a cluster, you need to include the objects of all its child clusters.
So the "mrg_" cluster has some (potentially 0) new objects, plus all the objects in its child clusters. In particular, it can have more than one child cluster (that is why it is called a merge).
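
As a rough illustration of that idea, here is a minimal sketch in plain Java. It deliberately does not use ELKI's classes: `Node`, `ownIds`, `children` and `allMembers` are hypothetical names made up for this example, and the integer IDs stand in for DBIDs. It only shows that the full membership of a merge node is its own objects plus everything in its descendants.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeClusterSketch {
  /** Hypothetical stand-in for one cluster in a hierarchical result (not an ELKI class). */
  static class Node {
    final String name;
    final List<Integer> ownIds = new ArrayList<>();   // objects stored directly on this node
    final List<Node> children = new ArrayList<>();    // child clusters below this node

    Node(String name, int... ids) {
      this.name = name;
      for (int id : ids) {
        ownIds.add(id);
      }
    }

    Node addChild(Node child) {
      children.add(child);
      return this;
    }
  }

  /** Collects the node's own objects plus the objects of all descendants. */
  static List<Integer> allMembers(Node node) {
    List<Integer> result = new ArrayList<>(node.ownIds);
    for (Node child : node.children) {
      result.addAll(allMembers(child));
    }
    return result;
  }

  public static void main(String[] args) {
    Node a = new Node("clu_a", 1, 2, 3);
    Node b = new Node("clu_b", 4, 5);
    // The merge node holds only one "new" object itself, but has two children.
    Node merge = new Node("mrg_ab", 6).addChild(a).addChild(b);

    System.out.println("own objects:  " + merge.ownIds.size());
    System.out.println("full members: " + allMembers(merge).size());
  }
}
```

Seen this way, a "mrg_" node whose own object count is below minClSize is not a too-small cluster; its size as a cluster is the total including its children.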