I'm using ELKI's SimplifiedHierarchyExtraction with AnderbergHierarchicalClustering, LatLngDistanceFunction, and minClSize = 100.
I saw that besides the "clu_" clusters there are also 2-3 "mrg_" clusters that contain some DBIDs, but fewer than minClSize.
My question is: what is the best way to handle these "mrg_" clusters?
- passing their DBIDs to one of their "clu_" children?
- treating them as a cluster although they are under the minClSize?
- simply ignoring them?

This is a hierarchical result. You need to include all child clusters in a cluster. So the mrg_ cluster has some (potentially 0) new objects, plus all the objects in its child clusters. In particular, it can have more than one child cluster (that is why it is called a merge).
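
To illustrate the point about child clusters, here is a minimal sketch of collecting the full membership of a merge cluster. It deliberately does not use the ELKI API: the Node class, its fields, and the allMembers helper are hypothetical stand-ins for a cluster in a hierarchy, and the IDs are made-up sample values. With ELKI you would walk the cluster hierarchy of the clustering result in the same recursive way.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeClusterDemo {
  /** Hypothetical stand-in for one cluster in a hierarchy; not an ELKI class. */
  static final class Node {
    final String name;
    final List<Integer> ownIds = new ArrayList<>(); // objects added at this level only
    final List<Node> children = new ArrayList<>();  // child clusters below this one

    Node(String name, int... ids) {
      this.name = name;
      for (int id : ids) {
        ownIds.add(id);
      }
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  /** Full membership: the cluster's own objects plus those of all descendants. */
  static List<Integer> allMembers(Node node) {
    List<Integer> out = new ArrayList<>(node.ownIds);
    for (Node child : node.children) {
      out.addAll(allMembers(child));
    }
    return out;
  }

  public static void main(String[] args) {
    Node a = new Node("clu_a", 1, 2, 3);
    Node b = new Node("clu_b", 4, 5);
    // The merge cluster contributes only one new object of its own,
    // but as a cluster it covers everything beneath it.
    Node mrg = new Node("mrg_1", 6).add(a).add(b);
    System.out.println(mrg.name + ": own=" + mrg.ownIds.size()
        + ", total=" + allMembers(mrg).size());
  }
}
```

In this sketch the merge cluster looks tiny if you count only its own objects, yet its total size is the sum over the whole subtree, which is the number to compare against minClSize.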