How do I run array jobs on the JHPCE Cluster?

Array jobs allow multiple instances of a program to be run via a single qsub command. This is often more convenient than submitting the same program repeatedly with separate qsub commands. The individual instances of the job are known as “tasks”. Each task is identified by a numeric task ID. The range of IDs is specified with the "-t START-END" option to qsub, and within the qsub script the ID of the current task is available in the $SGE_TASK_ID environment variable.

As an example, suppose you have 3 data files you want to run your program against:

$ ls data*
data1    data2    data3

In this simple example, the SGE script just runs "cat" on the data file whose number matches its task ID.

$ more script1.sh
#$ -cwd

FILENAME="data$SGE_TASK_ID"
cat "$FILENAME"

exit 0
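
Because the script depends only on $SGE_TASK_ID, its logic can be tried by hand before anything is submitted. The following is a minimal sketch, not part of the cluster tooling: it sets SGE_TASK_ID in a loop and runs the same two lines as script1.sh. The scratch directory, the stand-in file contents, and the run_task function name are assumptions made for illustration.

```shell
#!/bin/bash
# Local dry run: mimic the task IDs 1, 2, 3 by setting SGE_TASK_ID by hand.
# Works in a scratch directory so no real data files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in data files (hypothetical contents, for illustration only).
for i in 1 2 3; do
    echo "contents of data$i" > "data$i"
done

# Same logic as script1.sh, wrapped in a function so it can be called
# once per task ID.
run_task() {
    FILENAME="data$SGE_TASK_ID"
    cat "$FILENAME"
}

# Each pass through the loop plays the role of one task.
for SGE_TASK_ID in 1 2 3; do
    run_task
done
```

On the cluster the loop is not needed, since the scheduler starts one copy of the script per task ID and sets the variable itself.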

When the job is submitted, the "-t" option is used to specify the range of tasks to be run, so in our example, the command to submit 3 tasks, numbered 1, 2, and 3, would be "qsub -t 1-3 script1.sh". Within the script, the $SGE_TASK_ID variable will be set to 1, 2, and 3, respectively, in the 3 instances of the script that get run.

$ qsub -t 1-3 script1.sh
Your job-array 5204694.1-3:1 ("script1.sh") has been submitted
$ qstat
job-ID  prior   name        user    state submit/start at     queue            slots ja-task-ID  
----------------------------------------------------------------------------------------------
5204694 0.00000 script1.sh mmill116 qw    06/27/2018 18:12:56                      1 1-3:1
$ qstat
job-ID  prior   name        user    state submit/start at     queue            slots ja-task-ID  
----------------------------------------------------------------------------------------------
5204694 0.59661 script1.sh mmill116 r     06/27/2018 18:12:59 shared.q@compute-087 1 1
5204694 0.54831 script1.sh mmill116 r     06/27/2018 18:12:59 shared.q@compute-086 1 2
5204694 0.53220 script1.sh mmill116 r     06/27/2018 18:12:59 shared.q@compute-054 1 3
$ qstat
$ ls
data1   data3       script1.sh.e5204694.1  script1.sh.e5204694.3  script1.sh.o5204694.2
data2   script1.sh  script1.sh.e5204694.2  script1.sh.o5204694.1  script1.sh.o5204694.3

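Once all tasks have finished, qstat shows no remaining jobs, and the directory contains one standard output file (".o") and one standard error file (".e") per task. Each file name has the job ID and the task ID appended to it, and each ".o" file holds the contents of the corresponding data file.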
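
With many tasks it can be tedious to open each output file in turn. The sketch below loops over a range of task IDs and prints each task's standard output file, following the naming pattern visible in the listing above (script name, ".o", job ID, ".", task ID). The function name show_task_outputs and its argument order are assumptions for illustration.

```shell
#!/bin/bash
# Print the stdout file of each task in an array job and flag any that
# are missing. Expected file name: <script name>.o<job ID>.<task ID>
show_task_outputs() {
    script_name=$1   # e.g. script1.sh
    job_id=$2        # e.g. 5204694
    first=$3         # first task ID
    last=$4          # last task ID

    task=$first
    while [ "$task" -le "$last" ]; do
        out_file="$script_name.o$job_id.$task"
        if [ -f "$out_file" ]; then
            echo "== task $task ($out_file) =="
            cat "$out_file"
        else
            # A missing file usually means the task has not finished yet.
            echo "== task $task: $out_file not found ==" >&2
        fi
        task=$((task + 1))
    done
}
```

For the job shown above, this would be called from the submission directory as "show_task_outputs script1.sh 5204694 1 3".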

Now consider a more complicated scenario where the file names are not neatly numbered. One way to handle this situation is to create a file that contains a list of the files, and then use the $SGE_TASK_ID number to refer to the line number of the entry in that file to get to the file name. For this example, let’s say we have 3 files:

$ ls
first   second   third      

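We could create a file list using the “ls” command. Because the shell creates file-list before ls runs, the list would otherwise include file-list itself, so we ask ls (GNU version) to ignore it:

$ ls --ignore=file-list > file-list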
$ cat file-list
first
second
third

We can now create an SGE script that uses the awk command to pull out the line of file-list whose line number matches the value of $SGE_TASK_ID (there are of course numerous other options to use in Unix instead of awk).

$ cat script2.sh
#$ -cwd

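# Pass the task ID to awk with -v and single-quote the awk program,
# so that the shell does not expand $1 before awk sees it.
FILENAME=$(awk -v n="$SGE_TASK_ID" 'NR == n {print $1}' file-list)
cat "$FILENAME"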

exit 0
$ qsub -t 1-3 script2.sh

Submitting this array job runs 3 instances of script2.sh. Each instance reads from file-list the line whose line number matches its value of $SGE_TASK_ID and uses that line as its filename. As in our previous example, the 3 tasks create 3 output files, and each output file contains the contents of the respective input file.
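
When the list is long, the task range can be derived from the list itself rather than typed by hand. The sketch below shows one way to do that, along with sed as one of the alternatives to awk for picking a line by number. The function names nth_line and array_job_command are assumptions for illustration, and the second function only prints the qsub command so that it can be inspected before it is run.

```shell
#!/bin/bash
# Print line number $1 of file $2 (a sed alternative to the awk call).
nth_line() {
    sed -n "${1}p" "$2"
}

# Print the qsub command whose task range matches the number of lines
# in the list file. Assumes every line, including the last, ends with a
# newline, since "wc -l" counts newline characters.
array_job_command() {
    list_file=$1
    script=$2
    # tr removes the padding that some versions of wc add to the count.
    n_tasks=$(wc -l < "$list_file" | tr -d ' ')
    echo "qsub -t 1-$n_tasks $script"
}
```

Inside a job script, the lookup would read FILENAME=$(nth_line "$SGE_TASK_ID" file-list). For the three-line file-list above, "array_job_command file-list script2.sh" prints "qsub -t 1-3 script2.sh".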
