==== Slurm Commands ====

=== Start Jobs ===

**srun**

This is the simplest way to run a job on a cluster.
Initiate parallel job steps within a job or start an interactive job (with ''--pty'').
**salloc**

Request interactive jobs/allocations.
When the job starts, a shell (or another program specified on the command line) is started on the submission host (frontend).
From this shell you should use srun to interactively start parallel applications.
The allocation is released when the user exits the shell.
**sbatch**

Submit a batch script.
The script will be executed on the first node of the allocation.
The working directory coincides with the directory from which sbatch was submitted.
Within the script, one or multiple srun commands can be used to create job steps and execute parallel applications.
**Examples**

<code>
# General:
sbatch --job-name=<name of job shown in squeue> -N <num nodes> --ntasks-per-node=<num tasks per node> <jobscript>

# A start date/time can be set via the --begin option:
--begin=16:00
--begin=now+1hour
--begin=now+60 (seconds by default)
--begin=2010-01-20T12:34:00
</code>
An sbatch script for the command above would look like:

<code>
#!/bin/bash
#SBATCH --job-name=<name of job shown in squeue>
#SBATCH -N <num nodes>
#SBATCH --ntasks-per-node=<num tasks per node>
#SBATCH --begin=2010-01-20T12:34:00

/path/to/executable
</code>
For more information see ''man sbatch''.
All parameters used there can also be specified in the job script itself using ''#SBATCH''.

Also, more examples can be found [[hpc:...]].
=== Check the status of your job submissions ===

<code>
squeue --me
</code>
=== Check the status of nodes ===

<code>
sinfo
</code>
=== Canceling jobs ===

On allocation, you will be notified of the job ID.
Also, within your scripts and shells (if allocated via salloc), you can get the ID via the ''SLURM_JOB_ID'' environment variable.
You can use this ID to cancel your submission:

<code>
scancel <jobid>
</code>
=== Slurm Cheat Sheet ===

A summary of the most common commands can be found [[https://...]].