Phyloseq sample_data() and sample_names() not working to incorporate samples into phyloseq object


My question is about using phyloseq() to build a phyloseq object. I followed https://joey711.github.io/phyloseq/import-data.html and https://vaulot.github.io/tutorials/Phyloseq_tutorial.html, and the same code has worked before.

Specifically, why is sample_names() returning a vector of NULL values when my samples are the same in both otu_mat and samples_df?

Here are the starting files:

> head(otu_mat)
                      Species  1  2  3   4  5   6  7  8  9 10 11 12  13 14 15 16 17 18 19 20 21 22 23 24
1           Absiella dolichum 44 14  3 152 50 106  4  4 14 41  2  0   1 21  1  2  4  2  2  4  0  4  7  0
2  Acetivibrio ethanolgignens 50  4 10  65 30  96 71 88 35 65 16 21 108 44 12 40  8 64 42 77 29 24 21 31
3 Acetoanaerobium sticklandii  0  0  0   0  0   0  0  0  0  0  0  0   0  0  0  0  0  0  0  0  0  0  0  0
4        Acetobacterium bakii  0  0  0   0  0   0  0  0  0  0  0  0   0  0  0  0  0  0  0  0  0  0  0  0
5       Acholeplasma axanthum 53 26  2  45  9  19  8  7  4 36  3  4  75 16  2 18  6 22  4  2 25 13  3 13
6      Acholeplasma brassicae  0  0  0   0  1   0  0  0  0  1  0  0   2  0  0  2  1  1  1  1  0  3  0  3
> head(tax_mat)
 Kingdom  Phylum     Class     Order      Family             Genus                    Species            
  <chr>    <chr>      <chr>     <chr>      <chr>              <chr>                    <chr>              
1 Bacteria Firmicutes Clostrid… Clostridi… Lachnospiraceae    unclassified Lachnospir… Lachnospiraceae ba…
2 Bacteria Bacteroid… Bacteroi… Bacteroid… Tannerellaceae     Parabacteroides          unclassified Parab…
3 Bacteria Firmicutes Clostrid… Clostridi… Lachnospiraceae    Blautia                  Ruminococcus gnavus
4 Bacteria Firmicutes Clostrid… Clostridi… Ruminococcaceae    Ruminococcus             Ruminococcus bromii
5 Bacteria Firmicutes Clostrid… Clostridi… Clostridiales inc… Clostridiales Family XI… Emergencia         
6 Bacteria Firmicutes Clostrid… Clostridi… Ruminococcaceae    Anaerotruncus            unclassified Anaer…
> head(samples_df)
 OG_Sample Sample  Activity    Diet    Sex   Time Weight                Treatment
1       101      1 Sedentary Control Female Week_3     21 Sedentary_Control_Female
2       102      2 Sedentary Control Female Week_3     20 Sedentary_Control_Female
3       103      3 Sedentary Control Female Week_3     21 Sedentary_Control_Female
4       104      4 Sedentary Control Female Week_3     22 Sedentary_Control_Female
5       105      5 Sedentary Control Female Week_3     19 Sedentary_Control_Female
6       106      6 Sedentary Control Female Week_3     19 Sedentary_Control_Female
# Here is the code:
otu_mat <- w3.long %>% rownames_to_column("Species")
# Make the tax_otu file (OTUs as rows, taxonomy as columns)
tax_mat1 <- l %>% 
  select(Kingdom, Phylum, Class, Order, Family, Genus, Species) %>% 
  group_by(Species)
# This filters tax_mat1 by Species in otu_mat
tax_mat <- tax_mat1 %>% 
  filter(Species %in% otu_mat$Species) %>%
  distinct(.keep_all = FALSE)
# Subset of Metadata file for conformable arrays
w3.env <- as.data.frame(GS2.env[1:56,])
samples_df <- w3.env %>% select(Sample, Activity, Diet, Sex, Time, Weight, Treatment)
rownames(samples_df) <- samples_df$Sample

# --------------------- Preprocessing for Phyloseq ------------------------------------------
row.names(otu_mat) <- otu_mat$Species
otu_mat <- otu_mat %>% select(-Species)

row.names(tax_mat) <- tax_mat$Species

otu_mat <- as.matrix(otu_mat)
tax_mat <- as.matrix(tax_mat)


OTU = otu_table(otu_mat, taxa_are_rows = TRUE)
TAX = tax_table(tax_mat)
samples = sample_data(samples_df)

# Phyloseq uses the phyloseq() command to rapidly combine the otu_table, 
# tax_table and samples
carbom <- phyloseq(OTU, TAX, samples)
carbom

This is the error:

Error in validObject(.Object) : invalid class “phyloseq” object: Component sample names do not match. Try sample_names()

I then tried sample_names(), and it returns a vector of NULL values:

samples = sample_names(samples_df)

Here is my check that the sample names match:

> # find the overlapping samples
> common.ids <- intersect(rownames(samples), colnames(otu_mat))
> dim(common.ids)
NULL
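
One caveat about the check above: dim() returns NULL for any plain vector, including the character vector that intersect() produces, so a NULL result does not show whether any sample IDs overlap. length() and setdiff() are more informative. A minimal sketch in base R, using small toy objects (otu_toy and meta_toy are hypothetical stand-ins, not the real otu_mat and samples_df):

```r
# Toy stand-ins for the real objects (hypothetical data, for illustration only)
otu_toy <- matrix(1:6, nrow = 2,
                  dimnames = list(c("taxA", "taxB"), c("1", "2", "3")))
meta_toy <- data.frame(Sample = 1:3,
                       Diet = c("Control", "Control", "HighFat"))
rownames(meta_toy) <- meta_toy$Sample  # row names are stored as character

# Sample IDs present in both the metadata rows and the OTU columns
common_ids <- intersect(rownames(meta_toy), colnames(otu_toy))

dim(common_ids)     # NULL for any plain vector, so not a useful test
length(common_ids)  # number of shared sample IDs

# IDs found on one side only; both should be empty if the names agree
only_in_otu  <- setdiff(colnames(otu_toy), rownames(meta_toy))
only_in_meta <- setdiff(rownames(meta_toy), colnames(otu_toy))
```

Running the same comparison on the real objects, with rownames(samples_df) against colnames(otu_mat), should show whether the two name sets really agree before phyloseq() is called.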

Any help is greatly appreciated!
