Recreating Density Cluster MDS with known labels

I have this lovely density plot. [image: MDS plot]

This shows what I would like, the blue cluster cleanly falling out from the rest of the data. I created this plot with the following code:

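library(densityClust)
# Euclidean distances between observations, using columns 6 to 10
nbhf_dist <- dist(StrongGMM[, 6:10])
# Compute rho and delta with a Gaussian kernel
NB_den_clus <- densityClust(nbhf_dist, gaussian = TRUE)
# Decision plot, used to choose the rho and delta thresholds
plot(NB_den_clus)
NBClust <- findClusters(NB_den_clus, rho = 5, delta = 5)
plotMDS(NBClust)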

I chose rho and delta from the decision plot, plot(NB_den_clus). All of this is perfect.

My issue is that I would like to recreate this density plot with different labels. I am trying to see if the location of my data collection is heavily impacting these clusters.

For the final cluster finding, this is the output:

> str(NBClust)
List of 8
 $ rho      : num [1:1064] 6.46 2 4.12 5.97 14.47 ...
 $ delta    : num [1:1064] 0.771 0.478 1.178 0.953 2.292 ...
 $ distance :Class 'dist'  atomic [1:565516] 4.2 2.61 25.07 25.48 20.04 ...
  .. ..- attr(*, "Size")= int 1064
  .. ..- attr(*, "Diag")= logi FALSE
  .. ..- attr(*, "Upper")= logi FALSE
  .. ..- attr(*, "method")= chr "euclidean"
  .. ..- attr(*, "call")= language dist(x = StrongGMM[, 6:10])
 $ dc       : num 1.9
 $ threshold: Named logi [1:2] 5 5
  ..- attr(*, "names")= chr [1:2] "rho" "delta"
 $ peaks    : int [1:3] 441 416 1021
 $ clusters : int [1:1064] 3 3 3 1 1 2 2 2 2 2 ...
 $ halo     : logi [1:1064] FALSE FALSE FALSE TRUE TRUE TRUE ...
 - attr(*, "class")= chr "densityCluster"

Is there any way to apply known labels to my original density cluster distance matrix, rather than the clusters the density cluster function generates, and still get the same MDS plot?

Please let me know if I need to clarify anything. I understand that I am not providing data to reproduce this, but right now I'm not even sure where to start. I have tried replacing the NBClust$clusters vector with the labels I would like to use, but that produces a blank MDS plot (just the points, no colored labels). I believe this fails because the peaks no longer match, and I have no way of knowing what the peaks would be for my known clusters.

I think the answer is earlier in the code.

There is 1 best solution below

The densityClust package has an additional function called plotDensityClust. It is an add-on that was not written by the original authors, so you may need to reinstall the package from GitHub to access it. This function draws three diagnostic plots for your density cluster object, and it has additional options that let you control how the MDS plot looks.

Starting from the code above, I took the NBClust object and replaced its $clusters vector with a vector corresponding to the labels I wanted. I then assigned colors.

library(plyr)  # provides mapvalues()

# Copy the existing densityCluster object so the original clustering is kept.
StationClust <- NBClust
# Take the known labels from the previous data frame.
facstations <- as.factor(alles$juststa)
# Relabel the factor levels with station numbers.
intstations <- mapvalues(facstations, from = c("Station-1", "Station-2", "Station-20", "Station-21", "Station-22", "Station-24", "Station-27-30", "Station-3", "Station-5", "Station-6", "Station-9", "Station-EK"), to = c(1, 2, 20, 21, 22, 24, 27, 3, 5, 6, 9, 31))
# Create the color list, one color per station.
col <- c("red", "blue", "green", "purple", "orange", "cyan3", "gray", "yellow", "chocolate4", "darkred", "darkgreen", "hotpink")
# Apply the new clusters. Note that as.integer() on a factor returns the
# level codes (1 up to the number of levels), not the station numbers.
StationClust$clusters <- as.integer(intstations)
plotDensityClust(StationClust, col = col)

That gave me the following plot, which was exactly what I wanted. [image]

Looks a mess! But that was the aim.
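If you only need the MDS panel colored by known labels, a rough alternative is to project the distance matrix yourself with base R's cmdscale() and color the points by the label factor, which avoids touching the peaks at all. The sketch below is only an illustration under assumptions: the data, the three station names and the palette_known vector are made up, and I am assuming that classical MDS on the same distance object gives a layout comparable to the one plotMDS draws.

```r
# Hypothetical stand-in data: 60 points, 5 numeric features, 3 known stations.
set.seed(1)
features <- matrix(rnorm(60 * 5), ncol = 5)
stations <- factor(rep(c("Station-1", "Station-2", "Station-EK"), each = 20))

# Same kind of distance object that was passed to densityClust().
d <- dist(features)

# The labels must line up one-to-one with the rows behind the distance matrix.
stopifnot(length(stations) == attr(d, "Size"))

# as.integer() on a factor returns level codes 1..k, which index the palette.
codes <- as.integer(stations)
palette_known <- c("red", "blue", "hotpink")  # hypothetical colors
stopifnot(nlevels(stations) <= length(palette_known))

# Classical MDS to two dimensions, then color each point by its known label.
mds <- cmdscale(d, k = 2)
plot(mds, col = palette_known[codes], pch = 19,
     xlab = "MDS 1", ylab = "MDS 2", main = "MDS colored by known labels")
legend("topright", legend = levels(stations), col = palette_known, pch = 19)
```

With the real objects, the equivalent would presumably be cmdscale(NBClust$distance) with as.factor(alles$juststa) as the labels, provided the rows of alles are in the same order as the rows used to build the distance matrix.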