==== Slurm Commands ====
  
=== Start Jobs ===

**srun**

This is the simplest way to run a job on the cluster.
It initiates parallel job steps within a job, or starts an interactive job (with ''--pty'').
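For illustration, an interactive session could be requested like this (a minimal sketch; the resource values are placeholders, not cluster defaults):

<code>
# Start an interactive shell on one node (placeholder values):
srun -N 1 --ntasks-per-node=1 --pty bash -i
</code>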

**salloc**

Requests an interactive job/allocation.
When the job starts, a shell (or another program specified on the command line) is started on the submission host (frontend).
From this shell, you should use srun to start parallel applications interactively.
The allocation is released when you exit the shell.
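A typical salloc workflow might look like this (a sketch; the node count and ''./my_application'' are placeholders):

<code>
# Request an allocation of two nodes (placeholder values):
salloc -N 2
# Inside the spawned shell, launch parallel applications via srun:
srun ./my_application
# Release the allocation:
exit
</code>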

**sbatch**

Submits a batch script.
The script is executed on the first node of the allocation.
The working directory is the directory from which sbatch was invoked.
Within the script, one or more srun commands can be used to create job steps and execute parallel applications.

**Examples**
<code>
# General:
sbatch --job-name=<name of job shown in squeue> -N <num nodes> --ntasks-per-node=<spawned processes per node> /path/to/sbatch.script.sh

# A start date/time can be set via the --begin parameter:
--begin=16:00
--begin=now+1hour
--begin=now+60 (seconds by default)
--begin=2010-01-20T12:34:00
</code>

An sbatch script for the command above would look like this:
<code>
#!/bin/bash
#SBATCH --job-name=<name of job shown in squeue>
#SBATCH -N <num nodes>
#SBATCH --ntasks-per-node=<spawned processes per node>
#SBATCH --begin=2010-01-20T12:34:00

# commands to run within the allocation, e.g.:
srun <application>
</code>

For more information, see ''man sbatch''.
All parameters used there can also be specified in the job script itself using #SBATCH.
 + 
 +Also, more examples can be found [[hpc:tutorials:sbatch_examples|here]]. 
 + 
 +=== Check the status of your job submissions === 
 +<code> 
 +squeue --me 
 +</code> 

=== Check the status of nodes ===
<code>
sinfo
</code>
-=== Jobs löschen ===+ 
 +=== Canceling jobs === 
 +On allocation, you will be notified of the job ID.  
 +Also, within your scripts and shells (if allocated via salloc) you can get the ID via the ''$SLURM_JOBID'' environment variable. 
 +You can use this ID to cancel your submission: 
<code>
scancel <jobid>
</code>
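As a sketch of the environment-variable route (assuming it runs inside a Slurm job, where ''$SLURM_JOBID'' is set), a job script could cancel its own allocation, e.g. after a fatal error:

<code>
# Inside a job script or a salloc shell, the ID is available directly:
echo "Running as job $SLURM_JOBID"
# Cancel the current job from within:
scancel $SLURM_JOBID
</code>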
  
=== Slurm Cheat Sheet ===

A summary of the most common commands can be found [[https://slurm.schedmd.com/pdfs/summary.pdf|here]].
  • Last modified: 2024/06/13 18:16
  • by