en:services:application_services:high_performance_computing:running_jobs_slurm — last modified 2024/01/23 15:07 by nboelte
{{ :en:services:application_services:high_performance_computing:partitions.png?1000 |}}
  
This scheme shows the basic cluster setup at GWDG. The shared /scratch file system is usually the best choice for temporary data in your jobs, but it is only available at the "modular data center" (mdc) resources (select it with ''-C scratch''). The scheme also shows the queues and resources by which nodes are selected using the ''-p'' (partition) and ''-C'' (constraint) options of ''sbatch''.
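For example, a minimal jobscript header combining both options might look like the following sketch (the time limit, task count, and executable name are illustrative, not prescribed by the scheme):

<code>
#!/bin/bash
#SBATCH -p medium         # partition (queue) to run in
#SBATCH -C scratch        # only nodes that mount the shared /scratch
#SBATCH -t 01:00:00       # wall-clock time limit (example value)
#SBATCH -n 1              # number of tasks

./myexecutable
</code>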
  
====  ''sbatch'': Specifying node properties with ''-C''  ====
  
Please check the software's manual for how to set the directory for temporary files. Many programs have a flag for it, or read an environment variable to determine the location, which you can set in your jobscript.
====  Using ''/scratch''  ====
If you use scratch space only for storing temporary data, and do not need to access data stored previously, you can request /scratch:
<code>
#SBATCH -C "scratch"
</code>
You can just use ''/scratch/users/${USERID}'' for the temporary data.
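As a sketch, a jobscript could create that directory and point the program's temporary files there. The use of the ''TMPDIR'' environment variable is an assumption — many, but not all, programs honor it, so check your software's manual for the flag or variable it actually reads:

<code>
#SBATCH -C "scratch"

# Create a personal directory on the shared scratch file system
mkdir -p "/scratch/users/${USERID}"

# TMPDIR is an assumption; substitute the variable or flag your
# software actually uses for its temporary files
export TMPDIR="/scratch/users/${USERID}"
</code>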
  
==== Interactive session on the nodes ====
''<nowiki>--pty</nowiki>'' requests support for an interactive shell, and ''-p medium'' the corresponding partition. ''-c 16'' ensures that you get 16 CPUs on the node. You will get a shell prompt as soon as a suitable node becomes available. Single thread, non-interactive jobs can be run with
<code>srun -p medium ./myexecutable</code>
If there is a waiting time for running jobs on the ''medium'' partition, you can use the ''int'' partition for an interactive job that uses CPU resources, and ''gpu-int'' for an interactive job that also uses GPU resources. The interactive partitions have no waiting time, but you do not get any dedicated resources such as CPU/GPU cores or memory; you share them with all other users, so you should not use these partitions for non-interactive jobs.
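For example, an interactive shell on these partitions could be requested like this sketch (the ''-G 1'' GPU request on ''gpu-int'' is an assumption; adjust it to your needs):

<code>
# Interactive shell on the shared CPU interactive partition
srun --pty -p int bash

# Interactive shell with access to one GPU (assumed -G request)
srun --pty -p gpu-int -G 1 bash
</code>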
  
==== GPU selection ====
  
<code>
v100    : Nvidia Tesla V100
rtx5000 : Nvidia Quadro RTX5000
gtx1080 : GeForce GTX 1080
gtx980  : GeForce GTX 980 (only in the interactive partition gpu-int)
</code>
  
Most GPUs are optimized for single precision (or lower) calculations. If you need double precision performance, tensor units, or error correcting memory (ECC RAM), you can select the data center (Tesla) GPUs with
<code>
#SBATCH -p gpu
#SBATCH -G v100:1
</code>
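Putting it together, a complete GPU jobscript might look like the following sketch (the time limit and program name are placeholders):

<code>
#!/bin/bash
#SBATCH -p gpu
#SBATCH -G v100:1
#SBATCH -t 02:00:00       # example time limit

# Show which GPU was allocated to this job
nvidia-smi
./my_gpu_program
</code>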
  
<code> sinfo -p gpu --format=%N,%G </code> shows a list of hosts with GPUs, as well as their type and count.