Is it possible to specify the number and names of taxonomic ranks when reading in data to a phyloseq object? When creating a phyloseq object from qiime output, e.g.,
ps=qza_to_phyloseq(features="features.qza", taxonomy="taxonomy.qza", metadata="metadata.qza")
the object defaults to 7 taxonomic levels, i.e., Kingdom, Phylum, Class, Order, Family, Genus, and Species.
I am using the PR2 database, which currently uses 9 taxonomic levels, i.e., Domain (replacing Kingdom), Supergroup, Division, Subdivision (new taxonomic rank added), Class, Order, Family, Genus, and Species. (See here for details: https://pr2-database.org/documentation/pr2-taxonomy-9-levels/). This means that the taxonomy does not match the column names, i.e., the Kingdom column actually holds the Domain, every later rank is shifted as well, and the genus and species are missing entirely.
Phyloseq is a bit of a black box to me, so I am not sure where I can override the default. Any help would be greatly appreciated.
I figured out where this step is occurring: parse_taxonomy is a function of qiime2R that hard-codes the 7 taxonomic levels we all learned in grade school. Simply edit the function and save it. For the current 9-level PR2 database, the rank names in the function become Domain, Supergroup, Division, Subdivision, Class, Order, Family, Genus, and Species.
Save and close! Note that this does not change the function permanently; the edit only lasts for your current R session.
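If you would rather not edit the package function at all, the same splitting can be done by hand. Below is a minimal base-R sketch of that idea, not the qiime2R implementation: split_pr2_taxonomy is a hypothetical helper, the feature IDs and taxon strings are made-up placeholders, and it assumes the ranks in each taxon string are separated by semicolons.

```r
# Rank names for the 9-level PR2 taxonomy, as listed in the question.
pr2_ranks <- c("Domain", "Supergroup", "Division", "Subdivision",
               "Class", "Order", "Family", "Genus", "Species")

# Hypothetical helper (not part of qiime2R or phyloseq): split each
# semicolon-delimited taxon string into one column per rank.
split_pr2_taxonomy <- function(feature_ids, taxon, ranks = pr2_ranks) {
  parts <- strsplit(taxon, ";\\s*")
  tax <- t(vapply(parts, function(p) {
    p <- p[seq_along(ranks)]                 # pad short lineages with NA, drop extras
    p[!is.na(p) & p == ""] <- NA_character_  # treat empty ranks as missing
    p
  }, character(length(ranks))))
  dimnames(tax) <- list(feature_ids, ranks)
  tax
}

# Placeholder input: one full lineage and one that stops at Division.
tax <- split_pr2_taxonomy(
  c("asv1", "asv2"),
  c("Eukaryota;SupergroupA;DivisionA;SubdivisionA;ClassA;OrderA;FamilyA;GenusA;SpeciesA",
    "Eukaryota;SupergroupB;DivisionB")
)
print(tax)
```

The result is a character matrix with one row per feature and one column per PR2 rank. Provided the row names match taxa_names(ps), it should be possible to attach it with tax_table(ps) <- phyloseq::tax_table(tax), which replaces the 7-column table for that object only and leaves qiime2R untouched.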