Is there a standard process for 16S rRNA analysis?

I'm trying to reproduce the dendrogram results of this paper, which concern a specific 16S rRNA analysis.

But I don't know whether there is a standard method for data management or data analysis, so I've been trying on my own. Below is a summary.

The methods section says: "The resulting FASTQ files were deposited at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA386442. MiSeq paired-end raw sequence forward and reverse reads were subsequently merged using ea-utils v1.1.2 with standard settings, followed by a split library step from QIIME v1.9.1 and removal sequence reads shorter than 200 nucleotides, reads that contained ambiguous bases, or reads with an average quality score of less than 30."

So I downloaded the SRA files with the SRA Toolkit, running this in the terminal:

for n in {141..188}; do prefetch "SRR5577$n"; done

Then I converted them to FASTQ files using:

for n in {141..188}; do fastq-dump "SRR5577$n"; done

But for the merge step, I can't get fastq-join, or any other tool in the ea-utils package from GitHub, to work. It seems the data are not in the correct format.

Did I do this correctly? Where can I learn more about this kind of analysis?

There is 1 solution below.

I would suggest using --split-files with fastq-dump, e.g.:

for n in {141..188}; do fastq-dump --split-files "SRR5577$n"; done

The data appear to be paired-end; otherwise you wouldn't need to merge them. With --split-files, fastq-dump writes separate forward and reverse read files, which are presumably what ea-utils expects as input.
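To make the whole sequence concrete, here is a rough sketch that prints the split and merge commands for each accession without running them, so you can check them first. It rests on a few assumptions that are not from the paper: that fastq-dump --split-files names its outputs ACCESSION_1.fastq and ACCESSION_2.fastq, that fastq-join takes the two read files followed by an -o template in which % is replaced by un1, un2, and join, and that both tools are on your PATH. The helper names accessions and plan_for are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the per-accession commands instead of executing them.

# List the accessions SRR5577141 .. SRR5577188, the same range as the loops above.
accessions() {
  n=141
  while [ "$n" -le 188 ]; do
    echo "SRR5577$n"
    n=$((n + 1))
  done
}

# Print the two commands for one accession (passed as $1).
# Assumption: --split-files writes <acc>_1.fastq and <acc>_2.fastq.
# Assumption: fastq-join replaces % in the -o template with un1, un2 and join.
plan_for() {
  echo "fastq-dump --split-files $1"
  echo "fastq-join ${1}_1.fastq ${1}_2.fastq -o ${1}.%.fastq"
}

# Print the full plan for review.
accessions | while read -r acc; do
  plan_for "$acc"
done
```

If the printed commands look right, you could save the script (say, as plan.sh, a hypothetical name) and pipe its output to sh to run them. If an accession yields only a _1.fastq file, that run was probably not deposited as paired-end, and there is nothing to merge for it.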