Why job arrays?
Suppose you wish to run a large number of largely identical jobs: you may wish to run the same program many times with different arguments or parameters, or perhaps process a thousand different input files. One might write a Perl script to generate all the required qsub files and a Bash script to submit them all. However, this is not a good use of your time, and it will do horrible things to the submit (login) node on a cluster. Much better to use an SGE array job!
What is a job array?
An SGE array job might be described as a job with a for-loop built in. Here is a simple example:

    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -t 1-1000
    #   ...tell SGE that this is an array job, with "tasks"
    #   numbered from 1 to 1000...
    ./myprog < data.$SGE_TASK_ID > results.$SGE_TASK_ID

Computationally, this is equivalent to 1000 individual queue submissions in which SGE_TASK_ID takes the values 1, 2, 3, ..., 1000, and where input and output files are indexed by the ID. However:
- only one qsub command is issued (and only one qdel command would be required to delete all jobs);
- only one entry appears in qstat output;
- the load on the SGE submit node (i.e., the cluster login node) is vastly less than that of submitting 1000 separate jobs!
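
Before submitting, it can help to emulate a few tasks locally. The following is only a sketch and not something SGE provides: the function run_task is a hypothetical stand-in for the body of the job script, and the loop sets SGE_TASK_ID by hand, as SGE would for each task.

```shell
#!/bin/bash
# Hypothetical stand-in for the body of the job script: it only reports
# which files the task with the current SGE_TASK_ID would read and write.
run_task() {
    echo "task $SGE_TASK_ID: data.$SGE_TASK_ID -> results.$SGE_TASK_ID"
}

# Emulate "#$ -t 1-3" locally by setting SGE_TASK_ID by hand.
for id in 1 2 3; do
    SGE_TASK_ID=$id
    run_task
done
```

Replacing the echo with the real command line gives a quick check that the file naming works before a thousand tasks are queued.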

Each task can also work in its own directory, for example:

    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -t 1-1000
    mkdir myjob-$SGE_TASK_ID
    cd myjob-$SGE_TASK_ID
    ../myprog-one > one.output
    ../myprog-two < one.output > two.output

A more general for loop
It is not necessary that SGE_TASK_ID starts at 1; nor must the increment be 1. For example:

    #$ -t 100-995:5

so that SGE_TASK_ID takes the values 100, 105, 110, 115, ..., 995. Incidentally, in the case in which the upper bound is not equal to the lower bound plus an integer multiple of the increment, for example:

    #$ -t 1-42:6

SGE automatically changes the upper bound, viz.

    prompt> qsub array.qsub
    Your job-array 2642.1-42:6 ("array.qsub") has been submitted
    prompt> qstat
    job-ID  prior    name        user    state  submit/start at      queue  slots  ja-task-ID
    ------------------------------------------------------------------------------------------
       2642 0.00000  array.qsub  simonh  qw     04/24/2009 12:29:29         1      1-37:6

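
The adjusted upper bound can be worked out by hand with integer arithmetic. The sketch below assumes nothing beyond Bash itself; the function name effective_last is made up for illustration and is not an SGE command.

```shell
#!/bin/bash
# Largest task ID actually used for "-t first-last:step":
# first plus the largest whole multiple of step that does not pass last.
effective_last() {
    local first=$1 last=$2 step=$3
    echo $(( first + (last - first) / step * step ))
}

effective_last 1 42 6       # prints 37, matching the qstat output above
effective_last 100 995 5    # prints 995, no adjustment needed
```

Bash integer division truncates, so (42 - 1) / 6 gives 6, and 1 + 6 * 6 is 37.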
Related environment variables
There are three more automatically created environment variables one can use, as illustrated by this simple qsub script:

    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -t 1-37:6
    echo "The ID increment is: $SGE_TASK_STEPSIZE"
    if [[ $SGE_TASK_ID == $SGE_TASK_FIRST ]]; then
        echo "first"
    elif [[ $SGE_TASK_ID == $SGE_TASK_LAST ]]; then
        echo "last"
    else
        echo "neither"
    fi

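
One possible use of SGE_TASK_STEPSIZE is to let each task handle a block of consecutive items rather than a single one. This is a sketch under assumptions: the function chunk_bounds and the total of 42 items are hypothetical, and the defaults on the first line exist only so the script also runs outside SGE.

```shell
#!/bin/bash
# Defaults for running outside SGE; inside an array job SGE sets both.
: "${SGE_TASK_ID:=1}" "${SGE_TASK_STEPSIZE:=6}"

# Print the first and last item a task would handle: from its own ID up to
# the item just before the next task's ID, capped at the total item count.
chunk_bounds() {
    local id=$1 step=$2 total=$3
    local end=$(( id + step - 1 ))
    if (( end > total )); then
        end=$total
    fi
    echo "$id $end"
}

# 42 is an assumed total number of items, chosen to match "-t 1-42:6".
chunk_bounds "$SGE_TASK_ID" "$SGE_TASK_STEPSIZE" 42
```

With `-t 1-42:6`, the task with ID 37 would cover items 37 to 42, so no item is lost even though 42 is not itself a task ID.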
A list of input files
One can be sneaky. Suppose we have a list of input files, rather than input files explicitly indexed by suffix:

    #!/bin/bash
    #$ -cwd
    #$ -S /bin/bash
    #$ -t 1-42
    # Pick the line of the list whose number matches this task's ID.
    INFILE=$(awk "NR==$SGE_TASK_ID" my_file_list.text)
    #
    # ...or use sed:
    #   sed -n "${SGE_TASK_ID}p" my_file_list.text
    #
    ./myprog < "$INFILE"

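
If the list has fewer lines than there are tasks, the lookup returns an empty string and the redirection fails. A guarded lookup might look like the sketch below; the function name nth_line and the file names in the demonstration are made up for illustration.

```shell
#!/bin/bash
# Print line number $1 of the list file $2; fail if that line is
# missing or empty, so the task can stop instead of running with no input.
nth_line() {
    local n=$1 list=$2 line
    line=$(sed -n "${n}p" "$list")
    if [[ -z $line ]]; then
        echo "no entry $n in $list" >&2
        return 1
    fi
    echo "$line"
}

# Demonstration with a throwaway list of three made-up file names.
list=$(mktemp)
printf '%s\n' alpha.dat beta.dat gamma.dat > "$list"
nth_line 2 "$list"                                   # prints beta.dat
nth_line 5 "$list" || echo "task 5 has no input"
rm -f "$list"
```

In a job script this would be called as `nth_line "$SGE_TASK_ID" my_file_list.text`. The upper bound given to `-t` should match the number of lines in the list, which `wc -l < my_file_list.text` reports.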
More about SGE job arrays
More on SGE job arrays can be found at: Wiki GridEngine page.