ruby rake in SLURM: running elements of an each loop in parallel


I am trying to write a Rakefile in which the first few tasks each create a single file, but a later task needs to create its output files in parallel. Rake should wait for all of them to be ready before continuing with the next tasks.

It would look something like this. The first few tasks are:

file "file.out" => [dependencies] do
   sh "echo aaa"
end

desc "task description"
task :task_name => [dependencies] do
   puts "bbb"
end

The parallelized task would be:

[X, Y, Z].transpose.each do |x, y, z|
   file x => [dependencies] do
      sh "echo ccc"
   end
end

desc "parallelized task description"
task :parallelized_task_name => [dependencies] do
   puts "ddd"
end

where each iteration of this x, y, z loop runs in parallel. I then need to make sure all the parallelized tasks have finished before I do anything else.

An important thing to note is that I am running this rake file through SLURM. My command would be something like:

sbatch -p queue --mem 80000 --wrap "source ruby-2.3.1; rake -f rakefile --trace"

For now, I am running the parallelized task using the peach Ruby gem:

[X, Y, Z].transpose.peach do |x, y, z|
   file x => [dependencies] do
      sh "echo ccc"
   end
end

desc "parallelized task description"
task :parallelized_task_name => [dependencies] do
   puts "ddd"
end

And submitting to SLURM like this:

sbatch -p queue --ntasks=7 -c 1 --mem-per-cpu=80000 --wrap "source ruby-2.3.1; rake -f rakefile --trace"

Unfortunately, my so-called parallelized task does not seem to run in parallel: my outfiles are generated one after the other. Is there anything I am missing?

I realize I am a bit confused about the notions of cores, tasks, nodes, CPU... That is why it is a bit difficult for me to find out what I am doing wrong.

Any help appreciated!

Thanks!

1 Answer

Just for info, the answer to running tasks in parallel with rake is to add the -j option to the rake command. In my case, the command becomes:

sbatch -p queue --ntasks=7 -c 1 --mem-per-cpu=80000 --wrap "source ruby-2.3.1; rake -f rakefile -j 35 --trace"

With this, I could run up to 35 tasks in parallel.
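
As a complement, here is a minimal sketch of the same idea done inside the Rakefile with rake's multitask, which runs its prerequisites in concurrent threads and runs its own body only after all of them have finished. It may also help explain why peach made no difference: the loop only defines the file tasks, and their bodies run later, when rake invokes them. The output file names, the temporary directory and the sleep call are assumptions made up for illustration, standing in for the real X list and the real sh command. The script is written so it can be run directly with ruby; in an actual Rakefile, the last three lines would be dropped and the task invoked from the rake command line.

```ruby
require "rake"
require "tmpdir"

extend Rake::DSL # makes file/multitask available when run as a plain script

# Hypothetical output files, standing in for the X list in the question.
out_dir  = Dir.mktmpdir("rake_parallel")
outfiles = %w[a.out b.out c.out].map { |name| File.join(out_dir, name) }

# The loop only *defines* one file task per output; nothing runs yet.
outfiles.each do |path|
  file path do
    sleep 0.5                 # placeholder for the real, slow sh command
    File.write(path, "ccc\n")
  end
end

# multitask invokes its prerequisites in parallel threads and runs its
# own body only once every prerequisite has completed.
multitask :parallelized_task_name => outfiles do
  puts "ddd"
end

started = Time.now
Rake::Task[:parallelized_task_name].invoke
elapsed = Time.now - started
```

On the command line, -j (--jobs) caps the number of threads rake uses, and -m (--multitask) asks rake to treat every task as a multitask. Since all of these threads live in one rake process, requesting a single SLURM task with several CPUs (for example --ntasks=1 -c 7) is probably a closer match than seven one-CPU tasks, though that depends on the cluster configuration.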

A.