I keep getting the error that the database isn't detected when I run BLAST on Nextflow.
I used the following code, but I cannot run the second process (extractTopHits) because I keep getting an error that says "No such variable: db":
#!/usr/bin/env nextflow
nextflow.enable.dsl=2

params.query = "/home/galaxy/Vivian/16s.fasta"
params.db = "/home/galaxy/Vivian/blastdb"

process blastSearch {
    input:
    path query
    path db

    output:
    path "top_hits.txt"

    """
    /home/galaxy/Vivian/ncbi-blast-2.13.0+/bin/blastn -db $db/16S_ribosomal_RNA -query $query -outfmt 6> cat blast_result | head -n 10 | cut -f 2 > top_hits.txt
    """
}

process extractTopHits {
    input:
    path "top_hits.txt"

    output:
    path "sequences.txt"

    """
    /home/galaxy/Vivian/ncbi-blast-2.13.0+/bin/blastdbcmd -db $db -entry_batch top_hits.txt > sequence>
    """
}

workflow {
    def query_ch = Channel.fromPath(params.query)
    blastSearch(query_ch, params.db) | extractTopHits | view
}
The example in the docs unfortunately is not correct and will not work as intended. The problem is that the extractTopHits input declaration does not specify a database directory, so nothing is staged into the process working directory and $db is never defined inside that process, hence the "No such variable: db" error. You might prefer instead the following approach:

Note that the above uses the conda directive so that we can avoid specifying absolute paths to the executables. Enabling Conda is as easy as adding the following to your nextflow.config:
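A minimal sketch of that setting, assuming a Nextflow release recent enough to support the conda.enabled option:

```groovy
// nextflow.config
// Assumption: your Nextflow version supports this option.
conda.enabled = true
```

For the approach itself, something along these lines should work. It is a sketch under a few assumptions rather than a definitive implementation: the params.db_name parameter is a hypothetical addition for illustration, the database inside params.db is assumed to be named 16S_ribosomal_RNA (as in the question's blastn call), and the Bioconda package spec is an assumption you may need to adjust. The key change is that the database directory is declared as a path input of both processes, so it is staged into each working directory.

```groovy
#!/usr/bin/env nextflow
nextflow.enable.dsl=2

params.query = "/home/galaxy/Vivian/16s.fasta"
params.db = "/home/galaxy/Vivian/blastdb"
// Hypothetical parameter: name of the BLAST database inside params.db
params.db_name = "16S_ribosomal_RNA"

process blastSearch {
    // Assumption: BLAST installed from Bioconda; pin the version you need
    conda 'bioconda::blast=2.13.0'

    input:
    path query
    path db

    output:
    path "top_hits.txt"

    script:
    """
    blastn -db "${db}/${params.db_name}" -query "${query}" -outfmt 6 | head -n 10 | cut -f 2 > top_hits.txt
    """
}

process extractTopHits {
    conda 'bioconda::blast=2.13.0'

    input:
    path top_hits
    path db        // declared here too, so \$db is defined and the directory is staged

    output:
    path "sequences.txt"

    script:
    """
    blastdbcmd -db "${db}/${params.db_name}" -entry_batch "${top_hits}" > sequences.txt
    """
}

workflow {
    query_ch = Channel.fromPath(params.query)
    db_dir = file(params.db)   // a plain file value can be passed to both processes

    blastSearch(query_ch, db_dir)
    extractTopHits(blastSearch.out, db_dir)
    extractTopHits.out.view()
}
```

Because db_dir is a plain value rather than a queue channel, it can be reused by both process calls without being consumed.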