tSNE clustering forms spaghetti-like shapes


I am clustering a set of mRNAs with Partitioning Around Medoids (PAM), based on previously calculated mRNA structures, and using tSNE to visualize the clusters. The data have 6 variables (1 with gene values, 5 with mRNA characteristics) and 10,500 genes.

The code used is:

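library(cluster)  # daisy(), pam()
library(Rtsne)    # Rtsne()
library(dplyr)    # %>%, mutate()

# Gower dissimilarity; column 2 is log-transformed first
gower_df <- daisy(PV.shock60,
                  metric = "gower",
                  type = list(logratio=2))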

#use average silhouette width to define the right number of clusters
silhouette <- c(NA)  # placeholder for k = 1, which has no silhouette width
for (i in 2:10) {
  # pam() accepts the dissimilarity object directly, so as.matrix() is not needed
  pam_clusters <- pam(gower_df, diss = TRUE, k = i)
  silhouette <- c(silhouette, pam_clusters$silinfo$avg.width)
}

pam.df = pam(gower_df, diss = TRUE, k = 8)

#to visualize the clustering
tsne_object <- Rtsne(gower_df, is_distance = TRUE, perplexity = 150)
tsne_df <- tsne_object$Y %>%
  data.frame() %>%
  setNames(c("X", "Y")) %>%
  mutate(cluster = factor(pam.df$clustering))

And here is the result:

enter image description here
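
On the choice of k: the silhouette loop in the code above collects one average width per candidate k, but the step that turns those widths into k = 8 is not shown. A minimal sketch of that selection step follows, assuming the k with the largest average silhouette width is the one wanted. The data frame `toy` is a made-up stand-in for `PV.shock60`, and the names `ks`, `avg_sil` and `best_k` are only illustrative.

```r
library(cluster)

set.seed(1)
# Hypothetical toy data standing in for PV.shock60 (two shifted groups)
toy <- data.frame(
  a = c(rnorm(30, mean = 0), rnorm(30, mean = 5)),
  b = c(rnorm(30, mean = 0), rnorm(30, mean = 5))
)
toy_dist <- daisy(toy, metric = "gower")

# Average silhouette width for each candidate number of clusters
ks <- 2:6
avg_sil <- sapply(ks, function(k) {
  pam(toy_dist, diss = TRUE, k = k)$silinfo$avg.width
})

# Candidate k with the largest average silhouette width
best_k <- ks[which.max(avg_sil)]
print(data.frame(k = ks, avg_sil = avg_sil))
print(best_k)
```

Printing the whole table rather than only `best_k` helps show whether the maximum is clear-cut or whether several values of k score almost the same.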

I am wondering what causes these spaghetti-like structures. I fixed some of it by increasing the perplexity (preserving local structure over global structure isn't necessarily a goal).

Would you have any advice on what could be the cause for these structures?

I tried increasing values of perplexity, which gave me better-defined clusters. 150 was the lowest value beyond which the cluster definition stayed similar.
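
For reference, a perplexity sweep like the one described can be sketched as below. This is only an illustration on hypothetical random data (`toy`, `toy_dist`), not the real Gower dissimilarities, and the perplexity values are chosen to fit the small toy sample rather than the 150 used in the question.

```r
library(Rtsne)

set.seed(42)
# Hypothetical toy data standing in for the real dissimilarities
toy <- matrix(rnorm(120 * 5), nrow = 120)
toy_dist <- dist(toy)

# Rtsne rejects perplexities larger than (n - 1) / 3,
# so these values are sized for n = 120
perplexities <- c(5, 15, 30)

# One 2-D embedding per perplexity value
embeddings <- lapply(perplexities, function(p) {
  Rtsne(toy_dist, is_distance = TRUE, perplexity = p)$Y
})
names(embeddings) <- paste0("perplexity_", perplexities)
```

Each element of `embeddings` is an n-by-2 matrix that can be plotted side by side, coloured by the PAM cluster labels, to compare how the layout changes with perplexity.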

I was expecting clusters with smoother edges, instead of the spaghetti-like structure.
