Installing and querying a local NCBI nucleotide database: file nt.ndb not found

To install and query a local NCBI nucleotide database, I downloaded the latest version with:

mkdir NCBI_nt_DB
cd NCBI_nt_DB/
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.119.tar.gz"
tar -zxvpf nt.119.tar.gz

Then I ran blastn:

blastn -db ./nt.119 -query ../temp/query.fasta

and got the following error:

BLAST Database error: File /gpfs/.../NCBI_nt_DB/nt.ndb not found. If you renamed any BLAST database files, please use original file names, and makeblastdb to rename the database. If you deleted any BLAST database files, you need to recreate the database.

I am using BLAST 2.15.0+.

Any suggestions on how to resolve the error? I have tried several of the downloads from /blast/db/, and none of them contains a .ndb file.

There is 1 best solution below

Oh, that's an easy mistake to make.

nt.119 isn't the most recent version. It's just chunk 119 of the current version.

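To BLAST against nt, you must download and decompress all of the chunks, from nt.000 through the highest-numbered one, into the same directory. Together, they make up the nt database, which you then refer to by its base name rather than by a chunk name: blastn -db nt -query ../temp/query.fasta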
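
The download step described above can be sketched roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes the chunks are numbered consecutively from 000, that you supply the highest chunk number yourself (119 is taken from the question), and the function names are made up for illustration.

```shell
# Sketch only: function names are hypothetical, not part of BLAST+.
# Base URL taken from the wget command in the question.
BASE_URL="ftp://ftp.ncbi.nlm.nih.gov/blast/db"

# Print the archive URL for chunk number $1, zero-padded to three digits.
nt_volume_url() {
    printf '%s/nt.%03d.tar.gz\n' "$BASE_URL" "$1"
}

# Download and extract chunks 0..$1 into the current directory.
# Assumes consecutive numbering; $1 is the highest chunk number.
fetch_nt_volumes() {
    last="$1"
    i=0
    while [ "$i" -le "$last" ]; do
        url="$(nt_volume_url "$i")"
        wget -c "$url" || return 1           # -c resumes a partial download
        tar -zxvpf "${url##*/}" || return 1  # same tar flags as in the question
        i=$((i + 1))
    done
}

# Usage, run inside NCBI_nt_DB/:
#   fetch_nt_volumes 119
#   blastn -db ./nt -query ../temp/query.fasta
```

BLAST+ also ships a script, update_blastdb.pl, that can fetch and unpack every chunk of a database in one go (update_blastdb.pl --decompress nt), which may be simpler than looping by hand.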

The first chunk, nt.000.tar.gz, contains .nal and .ndb and a few other "metadata" files that help BLAST coordinate across the >100 database chunks. That is why extracting only nt.119 leaves BLAST unable to find nt.ndb.
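
Before running blastn, it can help to confirm that these shared files actually landed in the database directory. The helper below is a hypothetical sketch: it checks only for the two extensions named above (.nal and .ndb), not for the full set of metadata files.

```shell
# Hypothetical helper: list the database-wide files missing from
# directory $1 for the database named $2 (for example: . nt).
# Returns 0 if both files exist, 1 otherwise.
missing_db_files() {
    dir="$1"
    db="$2"
    rc=0
    for ext in nal ndb; do
        if [ ! -f "$dir/$db.$ext" ]; then
            echo "missing: $dir/$db.$ext"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, run inside NCBI_nt_DB/:
#   missing_db_files . nt && blastn -db ./nt -query ../temp/query.fasta
```

If the check reports nt.ndb as missing, the first chunk has most likely not been downloaded or extracted yet.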

The overall storage requirements are substantial, which is one of the reasons why many people prefer doing this on a computing cluster with ample storage or using a cloud BLAST service (such as ours).