When using tools like picard or fgbio through snakemake wrappers, I keep running into out-of-memory issues. At the moment I resort to direct shell calls, which allow me to set the VM's memory. I would prefer to pass these parameters to the wrapped tools. Is there a way, maybe through the resources directive, passing something like mem_mb=10000? I tried, but have not gotten it to work yet.
Is there a way to set parameters for the Java VM when using a Snakemake wrapper?
579 Views Asked by Krischan At

According to the wrapper source (https://bitbucket.org/snakemake/snakemake-wrappers/src/bd3178f4b82b1856370bb48c8bdbb1932ace6a19/bio/picard/markduplicates/wrapper.py?at=master&fileviewer=file-view-default), the wrapper builds its command line like this:
from snakemake.shell import shell
shell("picard MarkDuplicates {snakemake.params} INPUT={snakemake.input} "
"OUTPUT={snakemake.output.bam} METRICS_FILE={snakemake.output.metrics} "
"&> {snakemake.log}")
So you can pass any options through the params: section of the rule (e.g. params: "smth").
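To make the substitution concrete, here is a minimal sketch of how that command template expands. It uses plain str.format and a stand-in object instead of the real snakemake object, and the file names and log path are made up for illustration. Snakemake's own shell() formatting handles lists and quoting differently, so treat this only as an approximation.

```python
from types import SimpleNamespace

# Stand-in for the object Snakemake injects into wrapper scripts.
# All values here are hypothetical and chosen only for illustration.
snakemake = SimpleNamespace(
    params="-Xmx10000m",
    input="in.bam",
    output=SimpleNamespace(bam="out.bam", metrics="metrics.tmp"),
    log="markdups.log",
)

# Same template string as in the wrapper source quoted above.
template = (
    "picard MarkDuplicates {snakemake.params} INPUT={snakemake.input} "
    "OUTPUT={snakemake.output.bam} METRICS_FILE={snakemake.output.metrics} "
    "&> {snakemake.log}"
)

# Approximation: plain str.format instead of Snakemake's own formatter.
command = template.format(snakemake=snakemake)
print(command)
```

Whatever ends up in params is placed directly after the MarkDuplicates subcommand, so a JVM flag given there reaches the picard launcher script as an ordinary argument.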
If you check the source of the picard executable script:
cat `which picard`
You will find:
...
pass_args=""
for arg in "$@"; do
    case $arg in
        '-D'*)
            jvm_prop_opts="$jvm_prop_opts $arg"
            ;;
        '-XX'*)
            jvm_prop_opts="$jvm_prop_opts $arg"
            ;;
        '-Xm'*)
            jvm_mem_opts="$jvm_mem_opts $arg"
            ;;
        *)
            if [[ ${pass_args} == '' ]] #needed to avoid preceeding space on first arg e.g. ' MarkDuplicates'
            then
                pass_args="$arg"
            else
                pass_args="$pass_args \"$arg\"" #quotes later arguments to avoid problem with ()s in MarkDuplicates regex arg
            fi
            ;;
    esac
done
...
So I assume this should work:
rule markdups:
    input:
        "in.bam",
    output:
        bam = "out.bam",
        metrics = "metrics.tmp",
    params:
        "-Xmx10000m"
    wrapper:
        "0.31.0/bio/picard/markduplicates"
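For illustration, the argument sorting done by the launcher loop quoted above can be sketched in Python. This is a simplified re-implementation for reasoning about where -Xmx10000m ends up, not the launcher itself. The function name split_picard_args is made up, and the sketch skips the extra quoting the script applies to later arguments.

```python
def split_picard_args(args):
    """Sort arguments into JVM property options, JVM memory options and
    tool arguments, mirroring the case statement in the launcher script."""
    jvm_prop_opts, jvm_mem_opts, pass_args = [], [], []
    for arg in args:
        if arg.startswith("-D") or arg.startswith("-XX"):
            # '-D'* and '-XX'* both go to the JVM property options
            jvm_prop_opts.append(arg)
        elif arg.startswith("-Xm"):
            # '-Xm'* covers -Xmx, -Xms and similar memory flags
            jvm_mem_opts.append(arg)
        else:
            # everything else is handed on to picard itself
            pass_args.append(arg)
    return jvm_prop_opts, jvm_mem_opts, pass_args


props, mem, rest = split_picard_args(
    ["MarkDuplicates", "-Xmx10000m", "INPUT=in.bam", "OUTPUT=out.bam"]
)
print(mem)   # ['-Xmx10000m']
print(rest)  # ['MarkDuplicates', 'INPUT=in.bam', 'OUTPUT=out.bam']
```

The memory flag is picked out regardless of its position among the arguments, which is why it can sit after the MarkDuplicates subcommand where the wrapper places the params.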
I have never used the wrapper directive, but looking at markduplicates/wrapper.py, for example, the shell command is picard MarkDuplicates {snakemake.params} ... . So maybe using the params slot works? picard should understand that -Xmx... is a Java parameter.