Running multiple config files with Bppancestor


I need to run Bppancestor with multiple config files. I have tried different approaches, but none of them worked. I have around 150 files, so running them one by one is not practical.

The syntax to run bppancestor is the following one:

bppancestor params=config_file

I tried doing:

bppancestor params=directory_of_config_files/*

and using a Snakefile to try to automate the workflow:

ARCHIVE_FILE = 'bpp_output.tar.gz'

# a single config file
CONFIG_FILE = 'config_files/{sims}.conf'

# Build the list of input files.
CONF = glob_wildcards(CONFIG_FILE).sims

# pseudo-rule that tries to build everything.
# Just add all the final outputs that you want built.
rule all:
    input: ARCHIVE_FILE

# run bppancestor
rule bpp:
    input:
        CONF,
    shell: 
        'bppancestor params={input}'

# create an archive with all results
rule create_archive:
    input: CONF, 
    output: ARCHIVE_FILE
    shell: 'tar -czvf {output} {input}'

Could someone give me advice on this?

There is 1 best solution below

You're very close. Rule bpp should take one specific config file as input and declare concrete outputs (I'm not sure whether the output is a file or a folder). If I understand the syntax correctly, this link suggests that the output files can be set with output.sites.file and output.nodes.file:

rule bpp:
    input:
        CONFIG_FILE,
    output:
        sites='sites.{sims}',
        nodes='nodes.{sims}',
    shell: 
        'bppancestor params={input} output.sites.file={output.sites} output.nodes.file={output.nodes}'
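
To make the wildcard mechanics a bit more concrete, here is a rough plain-Python sketch of how a requested output such as sites.sim1 is matched against the output pattern and then used to fill in the input. It only imitates the idea and is not Snakemake's actual implementation; resolve_bpp_job and the name sim1 are made up for illustration.

```python
import re

CONFIG_FILE = "config_files/{sims}.conf"  # same pattern as in the Snakefile


def resolve_bpp_job(target, output_pattern="sites.{sims}"):
    """Match a requested output against the pattern and fill in the input.

    Hypothetical helper that only imitates Snakemake's wildcard matching.
    """
    # Turn the pattern into a regex in which {sims} captures the variable part.
    regex = re.escape(output_pattern).replace(re.escape("{sims}"), "(?P<sims>.+)")
    match = re.fullmatch(regex, target)
    if match is None:
        raise ValueError(f"{target!r} does not match {output_pattern!r}")
    sims = match.group("sims")
    return {
        "input": CONFIG_FILE.format(sims=sims),
        "sites": f"sites.{sims}",
        "nodes": f"nodes.{sims}",
    }


print(resolve_bpp_job("sites.sim1"))
# {'input': 'config_files/sim1.conf', 'sites': 'sites.sim1', 'nodes': 'nodes.sim1'}
```

This is why the rule needs the {sims} wildcard in both input and output: one job is created per value, and each job sees exactly one config file.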

Rule create_archive will collect all the outputs and archive them:

rule create_archive:
    input: expand('sites.{sims}', sims=CONF), expand('nodes.{sims}', sims=CONF)
    output: ARCHIVE_FILE
    shell: 'tar -czvf {output} {input}'
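
If Snakemake turns out to be more machinery than you need, a plain loop might also do the job. The following is only a sketch: run_all and its dry_run flag are names I made up, and it assumes (as above) that bppancestor accepts output.sites.file and output.nodes.file on the command line and that the config files end in .conf.

```python
import subprocess
from pathlib import Path


def run_all(config_dir="config_files", dry_run=False):
    """Run bppancestor once per *.conf file and return the commands used.

    Hypothetical helper; the output.* arguments are an assumption taken
    from the rule above, not something verified against bppancestor.
    """
    commands = []
    for conf in sorted(Path(config_dir).glob("*.conf")):
        sims = conf.stem  # file name without the .conf extension
        cmd = [
            "bppancestor",
            f"params={conf}",
            f"output.sites.file=sites.{sims}",
            f"output.nodes.file=nodes.{sims}",
        ]
        commands.append(cmd)
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_all(dry_run=True) first lets you inspect the commands before anything is executed. Unlike Snakemake, this reruns every file each time and does nothing in parallel.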