Missing categorical annotations in R pheatmap() despite no missing values in data

I am trying to make a clustered heatmap with pheatmap() in R for cytokine values. I also want to add row annotations based on a categorical variable with 4 levels. Although I made sure there is no missing data, many rows in the annotation strip have no color, as seen in the attached picture.

My original data set had 380 samples, but I cleaned the data to keep only rows with no missing values, which brought me down to 154 samples with 41 variables. Here are the names of the variables:

[1] "tbm_category" "tnfa"         "il6"          "il33"         "il3"          "il8"          "il7"          "ip10"         "il10"  
[10] "egf"          "vegf"         "grob"         "il1b"         "ifng"         "il1ra"        "mip3a"        "il12"         "mip1a"  
[19] "il31"         "mip1b"        "il1a"         "il4"          "mip3b"        "il2"          "groa"         "fractalkine"  "fgfbasic"  
[28] "eotaxin"      "il15"         "il5"          "gcsf"         "pdgfaa"       "mcp1"         "ifna"         "il21"         "trail"  
[37] "tnfsf5"       "il23"         "flt3ligand"   "il18"         "granzymeb"

The "tbm_category" is a categorical variable with 4 different groups and these are the options and their counts:

Definite TBM      Not TBM
          26           57
Possible TBM Probable TBM
          52           19

The data is called "cleaned_cyto_ordered" and I ordered it based on "tbm_category".

I also grouped the cytokines together for ease of coding:

numeric_variables <- names(cleaned_cyto_ordered)[names(cleaned_cyto_ordered) != "tbm_category"]

Here is my check to ensure no missing values

if (any_missing) {print("There are missing values in the dataset.")} else {print("There are no missing values in the dataset.")}
[1] "There are no missing values in the dataset."
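
The definition of any_missing is not shown above. As a minimal sketch of how such a check might be written, assuming a toy data frame (toy_cyto is hypothetical and stands in for cleaned_cyto_ordered), the snippet below tests for NA values and also confirms that the log10-transformed values are finite, since a zero in the raw data would pass the NA check but become -Inf after the transform:

```r
# Toy stand-in for cleaned_cyto_ordered (hypothetical values, not the real data)
toy_cyto <- data.frame(
  tbm_category = c("Definite TBM", "Not TBM", "Possible TBM", "Probable TBM"),
  tnfa = c(0.012, 0.300, 0.045, 0.007),
  il6  = c(1.500, 0.020, 0.110, 0.090)
)

# TRUE if any cell in the data frame is NA
any_missing <- any(is.na(toy_cyto))

# log10() of zero gives -Inf, which an NA check on the raw data will not flag,
# so check the transformed values as well
numeric_variables <- names(toy_cyto)[names(toy_cyto) != "tbm_category"]
all_finite <- all(is.finite(as.matrix(log10(toy_cyto[, numeric_variables]))))

print(any_missing)
print(all_finite)
```

If all_finite came back FALSE on the real data, non-finite values in the matrix could cause trouble in scaling or clustering even though no NA was present beforehand.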

My goal is to cluster the cytokine data and then add colored annotations based on tbm_category. This is the code I used to create the attached heatmap (I log10-transformed the cytokine values because they tend to be extremely small, and a log10 transform is needed to get any meaningful analysis):

pheatmap(log10(cleaned_cyto_ordered[, numeric_variables]),
         annotation_row = data.frame(Category= cleaned_cyto_ordered$tbm_category),
         scale = "row",
         cluster_rows = FALSE,
         show_rownames = FALSE, clustering_distance_cols = "correlation")
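
One possible cause, offered as an assumption rather than a confirmed diagnosis: pheatmap() matches annotation_row to the heatmap rows by row name. If the cleaned data kept its original row names after the 380 rows were filtered down to 154, then data.frame(Category = ...) would get fresh row names 1 to 154, and only rows whose names happen to coincide would be colored. The sketch below reproduces that situation on toy data (raw, toy_cyto, mismatched and annotation_df are hypothetical names) and gives the annotation the same row names as the plotted data:

```r
# Toy stand-in for the unfiltered data (hypothetical values)
raw <- data.frame(
  tbm_category = rep(c("Definite TBM", "Not TBM"), each = 4),
  tnfa = c(0.01, NA, 0.30, 0.05, NA, 0.02, 0.70, 0.09),
  il6  = c(1.50, 0.20, 0.11, NA, 0.40, 0.03, 0.25, 0.60)
)

# Dropping incomplete rows keeps the ORIGINAL row names ("1", "3", "6", ...)
toy_cyto <- raw[complete.cases(raw), ]
numeric_variables <- names(toy_cyto)[names(toy_cyto) != "tbm_category"]

# Built as in the question: row names restart at "1", so they no longer match
mismatched <- data.frame(Category = toy_cyto$tbm_category)
print(rownames(toy_cyto))
print(rownames(mismatched))

# Giving the annotation the same row names as the plotted data keeps them aligned
annotation_df <- data.frame(
  Category  = toy_cyto$tbm_category,
  row.names = rownames(toy_cyto)
)

# Only draw the heatmap if pheatmap is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(
    log10(toy_cyto[, numeric_variables]),
    annotation_row = annotation_df,
    cluster_rows   = FALSE,
    show_rownames  = FALSE
  )
}
```

On the real data, comparing rownames(cleaned_cyto_ordered) with the row names of the annotation data frame would show whether this mismatch is actually what is happening.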

I have been playing around with this all day and have given up hope of fixing the missing categorical annotations. Any help is greatly appreciated.
