My question is about using phyloseq() to build a phyloseq object. I've followed https://joey711.github.io/phyloseq/import-data.html and https://vaulot.github.io/tutorials/Phyloseq_tutorial.html, and this code has worked for me before.
Specifically, why does sample_names() return NULL values when my sample names are the same in both otu_mat and samples_df?
Here are the starting objects:
> head(otu_mat)
Species 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 Absiella dolichum 44 14 3 152 50 106 4 4 14 41 2 0 1 21 1 2 4 2 2 4 0 4 7 0
2 Acetivibrio ethanolgignens 50 4 10 65 30 96 71 88 35 65 16 21 108 44 12 40 8 64 42 77 29 24 21 31
3 Acetoanaerobium sticklandii 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 Acetobacterium bakii 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 Acholeplasma axanthum 53 26 2 45 9 19 8 7 4 36 3 4 75 16 2 18 6 22 4 2 25 13 3 13
6 Acholeplasma brassicae 0 0 0 0 1 0 0 0 0 1 0 0 2 0 0 2 1 1 1 1 0 3 0 3
> head(tax_mat)
Kingdom Phylum Class Order Family Genus Species
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Bacteria Firmicutes Clostrid… Clostridi… Lachnospiraceae unclassified Lachnospir… Lachnospiraceae ba…
2 Bacteria Bacteroid… Bacteroi… Bacteroid… Tannerellaceae Parabacteroides unclassified Parab…
3 Bacteria Firmicutes Clostrid… Clostridi… Lachnospiraceae Blautia Ruminococcus gnavus
4 Bacteria Firmicutes Clostrid… Clostridi… Ruminococcaceae Ruminococcus Ruminococcus bromii
5 Bacteria Firmicutes Clostrid… Clostridi… Clostridiales inc… Clostridiales Family XI… Emergencia
6 Bacteria Firmicutes Clostrid… Clostridi… Ruminococcaceae Anaerotruncus unclassified Anaer…
> head(samples_df)
OG_Sample Sample Activity Diet Sex Time Weight Treatment
1 101 1 Sedentary Control Female Week_3 21 Sedentary_Control_Female
2 102 2 Sedentary Control Female Week_3 20 Sedentary_Control_Female
3 103 3 Sedentary Control Female Week_3 21 Sedentary_Control_Female
4 104 4 Sedentary Control Female Week_3 22 Sedentary_Control_Female
5 105 5 Sedentary Control Female Week_3 19 Sedentary_Control_Female
6 106 6 Sedentary Control Female Week_3 19 Sedentary_Control_Female
# Here is the code:
otu_mat <- w3.long %>% rownames_to_column("Species")
# Make the taxonomy table (OTUs as rows, taxonomy ranks as columns)
tax_mat1 <- l %>%
  select(Kingdom, Phylum, Class, Order, Family, Genus, Species) %>%
  group_by(Species)
# This filters tax_mat1 by Species in otu_mat
tax_mat <- tax_mat1 %>%
  filter(Species %in% otu_mat$Species) %>%
  distinct(.keep_all = FALSE)
# Subset of Metadata file for conformable arrays
w3.env <- as.data.frame(GS2.env[1:56,])
samples_df <- w3.env %>% select(Sample, Activity, Diet, Sex, Time, Weight, Treatment)
rownames(samples_df) <- samples_df$Sample
# --------------------- Preprocessing for Phyloseq ------------------------------------------
row.names(otu_mat) <- otu_mat$Species
otu_mat <- otu_mat %>% select(-Species)
row.names(tax_mat) <- tax_mat$Species
otu_mat <- as.matrix(otu_mat)
tax_mat <- as.matrix(tax_mat)
OTU = otu_table(otu_mat, taxa_are_rows = TRUE)
TAX = tax_table(tax_mat)
samples = sample_data(samples_df)
# Phyloseq uses the phyloseq() command to rapidly combine the otu_table,
# tax_table and samples
carbom <- phyloseq(OTU, TAX, samples)
carbom
This is the error:
Error in validObject(.Object) : invalid class “phyloseq” object: Component sample names do not match. Try sample_names()
When I retry with sample_names(), it returns NULL values:
samples = sample_names(samples_df)
Here is my check that the sample names match:
> # find the overlapping samples
> common.ids <- intersect(rownames(samples), colnames(otu_mat))
> dim(common.ids)
NULL
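A note on this check: dim() returns NULL for any plain vector, so a NULL here does not show whether the names overlap. Below is a minimal base R sketch of how the overlap could be inspected with length() and setdiff() instead. The objects otu_toy, samples_toy and common_toy are hypothetical stand-ins made up for illustration, not my real data.

```r
# Hypothetical toy stand-ins for otu_mat and samples_df (not the real data)
otu_toy <- matrix(1:6, nrow = 2,
                  dimnames = list(c("taxon_a", "taxon_b"), c("1", "2", "3")))
samples_toy <- data.frame(Diet = c("Control", "Control", "Control"),
                          row.names = c("1", "2", "3"))

# Sample names shared by the metadata rows and the OTU matrix columns
common_toy <- intersect(rownames(samples_toy), colnames(otu_toy))

dim(common_toy)     # NULL for any plain vector, so not informative here
length(common_toy)  # number of shared sample names

# Names present in one object but missing from the other
setdiff(colnames(otu_toy), rownames(samples_toy))
setdiff(rownames(samples_toy), colnames(otu_toy))
```

Running the same length() and setdiff() calls on the real otu_mat and samples_df should show whether any sample names are shared, and which ones differ.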
Any help is greatly appreciated!