Error in the process of converting Whole Transcriptome Sequence data into Variant Calling Format Files

35 Views Asked by At

I'm trying to convert RNA-seq data into a variant calling format (VCF) file and running into an error when I try to run the mpileup command of bcftools.

Here's the error:

[E::hts_hopen] Failed to open file 107_2.bam
[E::hts_open_format] Failed to open file "107_2.bam" : Exec format error

Here's everything I did to get there: Initial files

  1. Started with indexing the reference genome:
bwa index transcripts.fasta transcripts_BWA
  1. Mapped reads to my reference
bwa mem /N/project/tomWGS/OVULERNASEQ/data/transcripts_and_genome/transcripts.fasta $R1 $R2 > /N/project/tomWGS/OVULERNASEQ/data/BWA_Mem_Out/$R1.sam
  1. Cleaned up the .sam files
samtools fixmate -O bam "${file}" /N/project/tomWGS/OVULERNASEQ/data/fixmate_BWA_Out/"${species}"_"${individual}"_"${tissue}".bam
  1. Sorted into coordinate order
samtools sort -O bam -o /N/project/tomWGS/OVULERNASEQ/data/sort_BWA_Out/"${file}".bam -T /N/project/tomWGS/OVULERNASEQ/data/sort_BWA_Out/tmp/ "${file}"
  1. Merged Bam files from different tissues
samtools merge /N/project/tomWGS/OVULERNASEQ/data/BWA_Merged/"${species}"_"${individual}".bam "${fname}" "${lname}"
  1. Indexed .bam files
samtools index "${file}" /N/project/tomWGS/OVULERNASEQ/data/index_BWA_Out/"${species}"_"${individual}".bam
  1. Tried using mpileup to start getting a VCF file. This is where the problem starts
bcftools mpileup -Ob -o /N/project/tomWGS/OVULERNASEQ/data/mpileup_Out_Test/107.bcf -f /N/project/tomWGS/OVULERNASEQ/data/transcripts_and_genome/transcripts.fasta 107_2.bam 107_4.bam 107_6.bam 107_7.bam

Any help with improving my pipeline and working out my error would be appreciated.

Thank you for your time.

0

There are 0 best solutions below