How to print to a common file with qsub arrays?

I am running a bunch of Java tasks on a cluster with many nodes. I use qsub job arrays with a script file like:

#PBS ...
#PBS -t 1-100
java myJavaProgram

Now, my Java program prints results to a common file. The file is sometimes corrupted: for example, a line is cut off abruptly by the start of the next line, or a line is only partially printed. The problem never occurs when the tasks run sequentially, so my guess is that it is caused by different Java processes trying to print at the same time from different compute nodes. However, I do not understand whether the problem is on the Java side or in the way I run the code on the cluster.

So my question is: What is a safe way of having java programs update the same file from different nodes?

NOTE: The java code that actually prints looks like this:

try (FileWriter fw = new FileWriter(output_file, true);
     PrintWriter Printer = new PrintWriter(fw, true)) {
    Printer.println(String.format());
}

Answer by joanis:

When you have multiple processes writing output to the same file, you're always going to get conflicts between the outputs.

To use job arrays, or multiple parallel jobs, effectively, you should have each job write to its own output file - maybe add the job array index to the file name in each job - and then possibly concatenate the files from each array job into a joined output file when all the jobs have completed.
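The concatenation step could also be done from Java. The following is a minimal sketch, not a definitive implementation: the class name MergeOutputs, the file prefix output-file- and the target file name are assumptions chosen for illustration. It collects every file in a directory whose name is the prefix followed by digits and copies them, in increasing index order, into one merged file. It should only be run after all array jobs have finished.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergeOutputs {

    // True if the file name is the prefix followed only by digits,
    // e.g. "output-file-12" for the (assumed) prefix "output-file-".
    static boolean isPart(Path file, String prefix) {
        String name = file.getFileName().toString();
        return name.startsWith(prefix)
                && name.substring(prefix.length()).matches("\\d+");
    }

    // Numeric array index encoded in the file name.
    static int indexOf(Path file, String prefix) {
        String name = file.getFileName().toString();
        return Integer.parseInt(name.substring(prefix.length()));
    }

    // Concatenates all part files found in dir into target,
    // ordered by array index (so 2 comes before 10).
    static void merge(Path dir, String prefix, Path target) throws IOException {
        List<Path> parts;
        try (Stream<Path> files = Files.list(dir)) {
            parts = files
                    .filter(p -> isPart(p, prefix))
                    .sorted(Comparator.comparingInt((Path p) -> indexOf(p, prefix)))
                    .collect(Collectors.toList());
        }
        // newOutputStream creates the target or truncates an existing one.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java MergeOutputs <dir> <prefix> <target>
        merge(Paths.get(args[0]), args[1], Paths.get(args[2]));
    }
}
```

For example, java MergeOutputs . output-file- merged-output.txt would join output-file-1 through output-file-100 in the current directory. A plain cat over the part files in a final shell step would do the same job; the Java version is only convenient if you want the files ordered by numeric index.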

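This is not a Java problem. As you have written it, separate processes on multiple machines are all writing to the same file, and nothing coordinates them. Even though your FileWriter is opened in append mode, appends from different nodes to a file on a shared network filesystem (NFS, for example) are generally not atomic, so two processes can end up writing to the same position, and the bytes that remain in the file at the end are the last bytes that were written at any given position by any of the processes.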

PBS gives each job in the array an environment variable holding its index within the array. Pass that value to your Java program, or use it to build the name of the output/log file the program is supposed to use. The variable name depends on the PBS flavor: with TORQUE, which uses the -t option shown in your script, it is PBS_ARRAYID; with PBS Pro, which uses -J instead, it is PBS_ARRAY_INDEX. Something like java myJavaProgram --output output-file-${PBS_ARRAYID} or java myJavaProgram > output-file-${PBS_ARRAYID}.
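If you would rather not change the job script, the program can read the array index from its environment itself. The sketch below is illustrative only: the class name PerTaskOutput, the prefix output-file- and the placeholder result line are assumptions, not part of the original program. It checks PBS_ARRAYID (the name TORQUE uses) and then PBS_ARRAY_INDEX (the name PBS Pro uses), and falls back to index 0 when neither is set, for instance when the program is run by hand outside the scheduler.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class PerTaskOutput {

    // Builds a per-task file name such as "output-file-7".
    // Uses "0" when no array index is available.
    static String outputFileFor(String prefix, String arrayIndex) {
        String index = (arrayIndex == null || arrayIndex.isEmpty()) ? "0" : arrayIndex;
        return prefix + index;
    }

    // Reads the array index from the environment. Which variable is set
    // depends on the scheduler: PBS_ARRAYID (TORQUE) or PBS_ARRAY_INDEX (PBS Pro).
    static String arrayIndexFromEnv() {
        String id = System.getenv("PBS_ARRAYID");
        return id != null ? id : System.getenv("PBS_ARRAY_INDEX");
    }

    public static void main(String[] args) throws IOException {
        // Each array task gets its own file, so no two processes share one.
        String outputFile = outputFileFor("output-file-", arrayIndexFromEnv());
        try (FileWriter fw = new FileWriter(outputFile, true);
             PrintWriter printer = new PrintWriter(fw, true)) {
            printer.println("result line"); // placeholder for the real output
        }
    }
}
```

Because each task now appends only to its own file, the append-mode FileWriter from the question can stay as it is; only the file name changes.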