Converting .fasta files to .gff3 files


I'm trying to use Roary to do phylogenetic analysis.

It says "Alternatively you can use ncbi-genome-download to pull down the FASTA files and convert them to GFF3 with Prokka." in https://sanger-pathogens.github.io/Roary/

I already have all the .fasta files I need.

How am I supposed to convert them to .gff3 files?


Fasta files contain nucleotide or peptide sequences (nucleotides in the case of bacterial/archaeal genomes). Files in GFF3 format, on the other hand, contain annotations: a list of intervals corresponding to genes or other genomic features. Optionally, Fasta sequences can be appended to the end of a GFF3 file (separated by a ##FASTA directive). I personally find it abominable to combine GFF3 and Fasta in the same file, but apparently some software packages require it.
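To make the layout of such a combined file concrete, here is a minimal Python sketch that splits GFF3 text into its annotation lines and its trailing Fasta lines at the ##FASTA directive. The function name `split_gff3` and the one-gene example record (contig name, source, coordinates) are made up for illustration; they don't come from any real genome or tool.

```python
def split_gff3(text):
    """Split combined GFF3 text into (annotation_lines, fasta_lines).

    Everything before the ##FASTA directive is treated as annotation;
    everything after it is treated as Fasta sequence data.
    """
    annotation_lines, fasta_lines = [], []
    in_fasta = False
    for line in text.splitlines():
        if line.strip() == "##FASTA":
            in_fasta = True  # the directive itself belongs to neither part
            continue
        (fasta_lines if in_fasta else annotation_lines).append(line)
    return annotation_lines, fasta_lines


# Hypothetical example: one CDS feature on a 9-bp contig.
# GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
example = "\n".join([
    "##gff-version 3",
    "contig1\texample\tCDS\t1\t9\t.\t+\t0\tID=gene1",
    "##FASTA",
    ">contig1",
    "ATGAAATAA",
])

annotations, sequences = split_gff3(example)
```

Here `annotations` holds the version header and the feature line, and `sequences` holds the Fasta record. A plain Fasta file only ever has the second part, which is why there is nothing to "convert" until the annotations have been produced.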

You mentioned the following excerpt from the Roary documentation.

"Alternatively you can use ncbi-genome-download to pull down the FASTA files and convert them to GFF3 with Prokka."

Unless I'm mistaken, "convert" is the wrong word to use here. Prokka doesn't convert Fasta files to GFF3 files; it takes bacterial/archaeal genome sequences as input and annotates them, and the GFF3 file is the result of that annotation. How to do that? Which parameters should you use? Well, @heathobrien is right: you'll need to read the Prokka documentation (and maybe the paper as well).
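As a starting point, here is a rough Python sketch that runs Prokka once per Fasta file, using only the basic invocation shown in the Prokka README (`prokka --outdir mydir --prefix mygenome contigs.fa`). The helper names, the `prokka_out` directory, and the assumption that your genomes end in `.fasta` are mine; whether the default parameters are appropriate for your organisms is exactly what the documentation should tell you.

```python
from pathlib import Path
import subprocess


def prokka_command(fasta_path, outdir_root="prokka_out"):
    """Build the Prokka command line for one genome.

    The file name without its extension is used both as the output
    prefix and as the name of a per-genome output directory
    (an arbitrary convention chosen for this sketch).
    """
    fasta = Path(fasta_path)
    prefix = fasta.stem
    return [
        "prokka",
        "--outdir", str(Path(outdir_root) / prefix),
        "--prefix", prefix,
        str(fasta),
    ]


def annotate_all(fasta_dir):
    """Annotate every *.fasta file in fasta_dir (requires Prokka on PATH)."""
    for fasta in sorted(Path(fasta_dir).glob("*.fasta")):
        # check=True raises CalledProcessError if Prokka exits with an error
        subprocess.run(prokka_command(fasta), check=True)
```

Calling `annotate_all("genomes")`, with a hypothetical `genomes` directory, should leave one output directory per genome. According to the Prokka README, each contains a `.gff` file holding the annotations followed by the sequences, which is the combined GFF3 layout Roary expects.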