To match your job requirements with the hardware, you choose among the available partitions.
The commands normally used for job control and management are:

- `sbatch <jobscript>`: submit a job script
- `srun <arguments> <command>`: run a command on allocated resources
- `squeue -j jobID`: for queues/running jobs
- `scontrol show job jobID`: for full job information (even after the job finished)
- `scancel jobID`: cancel a job
- `scancel -i -u $USER`: cancel all your jobs (`-u $USER`) but ask for every job (`-i`)
- `scancel -9`: send kill signal SIGKILL instead of SIGTERM
- `squeue -l --me`: list your own jobs
- `squeue --start -j jobID`: show the estimated start time of a pending job
- `sinfo` (esp. `sinfo --format="%25C %A"`) and `squeue -l`: overview of the workload of the whole system
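As a sketch of how these fit together, a typical submit-and-monitor sequence could look like the following (the job script name myjob.slurm and the job ID 123456 are placeholders):

```bash
# Submit a job script; Slurm prints the assigned job ID
$ sbatch myjob.slurm
Submitted batch job 123456

# Check the state and the estimated start time of the pending job
$ squeue -j 123456
$ squeue --start -j 123456

# Full job information, also available after the job finished
$ scontrol show job 123456

# Cancel the job if it is no longer needed
$ scancel 123456
```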
A job script can be any script that contains special instructions for Slurm. The most commonly used form is a shell script, such as bash or plain sh, but other scripting languages (e.g. Python, Perl, R) are also possible.
```bash
#!/bin/bash
#SBATCH -p cpu-clx:test
#SBATCH -N 16
#SBATCH -t 06:00:00

module load impi
srun mybinary
```
A job script has to start with a shebang line, followed by the #SBATCH options. These #SBATCH comments have to come first, as Slurm stops scanning for them after the first non-comment, non-whitespace line (e.g. an echo or a variable declaration).
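As a minimal illustration of this rule (a hypothetical script, not a complete example): every #SBATCH line that appears after the first regular command is silently ignored.

```bash
#!/bin/bash
#SBATCH -p cpu-clx:test    # parsed by Slurm
#SBATCH -N 1               # parsed by Slurm

echo "starting job"        # first non-comment line: Slurm stops scanning here
#SBATCH -t 01:00:00        # ignored, this time limit is NOT applied
```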
More examples can be found at Examples and Recipes.
Parameter | SBATCH flag | Comment |
---|---|---|
# nodes | -N <#> | |
# tasks | -n <#> | |
# tasks per node | --tasks-per-node <#> | Different defaults between mpirun and srun |
partition | -p <name> | e.g. cpu-clx, overview: Slurm partition CPU CLX |
# CPUs per task | -c <#> | interesting for OpenMP/hybrid jobs |
Wall time limit | -t hh:mm:ss | |
Email notification | --mail-type=ALL | See sbatch manpage for different types |
Project/Account | -A <project> | Specify project for core hour accounting |
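Putting several of these flags together, a hybrid MPI/OpenMP job script could look like the following sketch (the partition, node/task counts, project name and the binary mybinary are placeholders to adapt to your case):

```bash
#!/bin/bash
#SBATCH -p cpu-clx                # partition
#SBATCH -N 4                      # number of nodes
#SBATCH --tasks-per-node 12       # MPI tasks per node
#SBATCH -c 8                      # CPUs per task (OpenMP threads)
#SBATCH -t 12:00:00               # wall time limit
#SBATCH -A myproject              # project for core hour accounting
#SBATCH --mail-type=ALL           # email notifications

# Use the CPUs-per-task setting for the OpenMP threads of each MPI rank
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
srun mybinary
```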
The maximum runtime is set per partition and can be viewed either on the system with sinfo or here. There is no minimum walltime (we cannot stop your jobs from finishing, obviously), but a walltime of at least 1 hour is encouraged. A large number of small, short jobs can cause problems with our accounting system. The occasional short job is fine, but if you submit large numbers of jobs that finish (or crash) quickly, we might have to intervene and temporarily suspend your account. If you have lots of small workloads, please consider combining them into a single job that runs for at least 1 hour, as in the sketch below.
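One way to do this is to loop over the short workloads inside a single job script, for example along these lines (the task.sh script and its input files are placeholders):

```bash
#!/bin/bash
#SBATCH -p cpu-clx
#SBATCH -N 1
#SBATCH -t 02:00:00

# Run many short tasks back to back inside one allocation
# instead of submitting each of them as a separate job.
for input in input_*.dat; do
    ./task.sh "$input"
done
```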
Batch jobs are submitted by a user account to the compute system.
For the user account, the default project for computing time can be changed under the link User Data on the portal NHR@ZIB.
To charge the project myaccount, add the following line to the job script:

```bash
#SBATCH --account=myaccount
```

After job script submission, the batch system checks the project for sufficient account coverage and authorizes the job for scheduling. Otherwise the job is rejected; please note the error message. You can check the account of a job that is out of core hours:

```
> squeue
... myaccount ... AccountOutOfNPL ...
```
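If you want to verify afterwards which account a job was charged to, one option is sacct, for example (the job ID is a placeholder):

```bash
# Show the account (project) and final state of a job
$ sacct -X -j 123456 --format=JobID,Account,State
```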
To use compute resources interactively, e.g. to follow the execution of MPI programs, the following steps are required. Note that non-interactive batch jobs via job scripts (see above) are the primary way of using the compute resources.
- Start with the `salloc --interactive` command, which should also include your resource requirements.
- Once salloc has successfully allocated the requested resources, you have to issue an additional srun command to get a shell on one of the allocated nodes (see the example below) if you want to work on a compute node.
- srun or MPI launch commands, like mpirun or mpiexec, can be used to start parallel programs (see the according user guides).

```
blogin1 ~ $ salloc -t 00:10:00 -p cpu-clx:test -N2 --tasks-per-node 24
salloc: Granted job allocation [...]
salloc: Waiting for resource configuration
salloc: Nodes bcn[1001,1003] are ready for job

# To get a shell on one of the allocated nodes
blogin1 ~ $ srun --pty --interactive --preserve-env ${SHELL}
bcn1001 ~ $ srun hostname | sort | uniq -c
     24 bcn1001
     24 bcn1003
bcn1001 ~ $ exit

# Exit a second time for Berlin/Lise
blogin1:~ > exit
salloc: Relinquishing job allocation [...]
```
We provide a varying number of nodes from the large40 and large96 partitions as post-processing nodes in a shared mode, so that multiple jobs can run at once on a single node. You can request CPUs and memory and should take care that you do not exceed your limits. For each CPU/hyperthread, there is about 9.6 GB of memory on the large40:shared partition and about 4 GB on the large96:shared partition.
The maximum walltime on the shared partitions is 2 days.
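A job script for one of the shared partitions could therefore request only part of a node, along the lines of this sketch (the CPU and memory numbers and the postprocess.sh script are placeholders; stay within the per-CPU memory limits above):

```bash
#!/bin/bash
#SBATCH -p large96:shared      # shared post-processing partition
#SBATCH --cpus-per-task=8      # request 8 CPUs/hyperthreads
#SBATCH --mem=32G              # 8 CPUs x ~4 GB per CPU on large96:shared
#SBATCH -t 1-00:00:00          # 1 day; the maximum on shared partitions is 2 days

./postprocess.sh
```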