I want to do alignment using star and I use proxy file for star the alignment. Without a proxy file star-align run also without reference. So if I gave as input constrain of the alignment process the presence of database.done the alignment process can start. How can manage this situation?
rule star_index:
input:
config['references']['transcriptome_fasta']
output:
genome=config['references']['starindex_dir'],
tp=touch("database.done")
shell:
'STAR --limitGenomeGenerateRAM 54760833024 --runMode genomeGenerate --genomeDir {output.genome} --genomeFastaFiles {input}'
rule star_map:
input:
dt="trim/{sample}/",
forward_paired="trim/{sample}/{sample}_forward_paired.fq.gz",
reverse_paired="trim/{sample}/{sample}_reverse_paired.fq.gz",
forward_unpaired="trim/{sample}/{sample}_forward_unpaired.fq.gz",
reverse_unpaired="trim/{sample}/{sample}_reverse_unpaired.fq.gz",
t1p="database.done",
output:
out1="ALIGN/{sample}/Aligned.sortedByCoord.out.bam",
out2="ALIGN/{sample}/",
# out2=touch("Star.align.done")
params:
genomedir = config['references']['basepath'],
sample="mitico",
platform_unit=config['platform'],
cente=config['center']
threads: 12
log: "ALIGN/log/{params.sample}_star.log"
shell:
'mkdir -p ALIGN/;STAR --runMode alignReads --genomeDir {params.genomedir} '
r' --outSAMattrRGline ID:{params.sample} SM:{params.sample} PL:{config[platform]} PU:{params.platform_unit} CN:{params.cente} '
'--readFilesIn {input.forward_paired} {input.reverse_paired} \
--readFilesCommand zcat
--outWigType wiggle \
--outWigStrand Stranded --runThreadN {threads} --outFileNamePrefix {output.out2} 2> {log} '
How can start a module only after all the previous function have finished. I mean.Here i create the index then I trim ll my data and then I staart the alignment. I want after finishis all this sstep for all the sample start a new function like run fastqc. How can decode this in snakemake? thanks so much for patience help
Without any mention of the genome as a required input for "star_map", I believe the rule is starting too early.
Try moving the genome reference from being a "Parameter" to being an "Input" requirement for star_map. Snakemake doesn't wait for parameters, only inputs. All reference genomes should be listed as inputs. In fact, all required files should be listed as input requirements. Param's are just for mostly convenience; ad-hoc strings and things on the fly.
I'm not entirely sure as to the connectivity across your files, some of these references are to a YAML file you have not provided, so I cannot guarantee the code will work.
Snakemake doesn't check what files your shell commands are touching and modifying. Snakemake only knows to coordinate the files described in the "input" and "output" directives.