Good way to run a list of commands as a slurm array job

So the way SLURM handles array jobs seems useful if:

  • you want to run the exact same command over and over
  • you just customize the input and output names for each task
  • you have commands spread out over a bunch of different files

But what if you have ONE file where each line is a command, and you want to run them all in an array job?

Right now, I've been doing that using a script like this:

#!/usr/bin/env bash
#SBATCH -a 1-1936

# Pull out line $SLURM_ARRAY_TASK_ID of the job list and hand it to parallel
sed -n "${SLURM_ARRAY_TASK_ID}p" joblist_file |
parallel --halt soon,fail=1 --retries 3 --joblog "job_${SLURM_ARRAY_TASK_ID}.log"
exit $?
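
To avoid hard-coding the array size (1-1936 here), the range can also be derived from the job list at submission time, since sbatch options given on the command line take precedence over the #SBATCH directives in the script (run_jobs.sh is a hypothetical name for the script above):

sbatch --array=1-$(wc -l < joblist_file) run_jobs.sh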

Or this if I want to run, say, 25 jobs at a time:

#!/usr/bin/env bash
#SBATCH -a 1-1936:25
#SBATCH -N 1-1
#SBATCH -n 8
#SBATCH --mem-per-cpu 2G

# QUIT_ID is the first line of the next chunk, END_ID the last line of this one
let QUIT_ID=${SLURM_ARRAY_TASK_ID}+25
let END_ID=${QUIT_ID}-1

# Print 25 lines starting at $SLURM_ARRAY_TASK_ID, then quit so sed
# doesn't scan the rest of the file
sed -n "${SLURM_ARRAY_TASK_ID},${END_ID}p;${QUIT_ID}q" joblist_file |
parallel -j 8 --halt soon,fail=1 --retries 3 --joblog "job_${SLURM_ARRAY_TASK_ID}.log"

exit $?
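
As a sanity check of the chunk arithmetic, here is what one task sees when the ID is set by hand (hypothetically to 26); note that since 1936 isn't a multiple of 25, the last task's range simply runs past the end of the file and sed just stops at EOF:

SLURM_ARRAY_TASK_ID=26                  # simulate the second array task locally
let QUIT_ID=${SLURM_ARRAY_TASK_ID}+25   # 51: the line at which sed quits
let END_ID=${QUIT_ID}-1                 # 50: the last line printed
# Prints lines 26-50 of joblist_file, then stops reading the file
sed -n "${SLURM_ARRAY_TASK_ID},${END_ID}p;${QUIT_ID}q" joblist_file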

This works, but I feel like there's probably a better way?

  • Is parallel doing anything here? – KamilCuk Commented Mar 13 at 9:05
  • @KamilCuk parallel retries the task 3 times if it fails and also creates a joblog file that's pretty useful (or it was when I ran a similar script on UGE.) The joblog prints a line for each command with the exit code and command, so if half the commands fail or something I can combine, filter, and cut the joblogs to get the commands that still need to be run. – sjenkins Commented Mar 14 at 11:50
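
For reference, a minimal sketch of that joblog filtering (GNU parallel's joblog is tab-separated with Exitval in column 7 and Command in column 9; this assumes the commands themselves contain no tabs):

# Combine the per-task joblogs, skip the repeated header lines, and
# print the command column for every entry that exited nonzero
cat job_*.log |
awk -F'\t' '$1 != "Seq" && $7 != 0 { print $9 }' > commands_to_rerun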

1 Answer

This works, but I feel like there's probably a better way?

No.

If you do not need parallel, just:

eval "$(sed -n ${SLURM_ARRAY_TASK_ID}p joblist_file)"

Also, a script already exits with the exit status of its last command, so the exit $? is not needed.
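
Put together, a minimal version of the batch script without parallel might look like this (a sketch, assuming joblist_file holds one shell command per line):

#!/usr/bin/env bash
#SBATCH -a 1-1936

# Run the one command found on line $SLURM_ARRAY_TASK_ID of the job list
eval "$(sed -n "${SLURM_ARRAY_TASK_ID}p" joblist_file)"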
