High Performance Computing
A more detailed and up-to-date version is available here.
Welcome to the official documentation of the Scientific Compute Cluster (SCC). It is the high performance computing system operated by the GWDG for both the Max Planck Society and the University of Göttingen.
This documentation will give you the necessary information to get access to the system, find the right software or compile your own, and run calculations.
Latest News
- [hpc-announce] You are invited to our GöHPCoffee talk on upcoming wednesday (2024/11/06 06:57)
- [hpc-announce] Emmy/Grete: Filesystem scratch-emmy is running again (2024/07/19 14:06)
- [hpc-announce] You are invited to our GöHPCoffee talk on upcoming wednesday (2024/06/10 11:13)
- [hpc-announce] HPC system back online (2024/05/27 17:37)
- [hpc-announce] You are invited to our GöHPCoffee talk on upcoming wednesday (2024/05/27 10:05)
An archive of all news items can be found in the HPC-announce mailing list.
Accessing the system
To use the compute cluster, you need to have an HPC-enabled user account.
Once your account is activated, you can use ssh to connect to login-mdc.hpc.gwdg.de. These nodes are only accessible from the GÖNET. If you are connecting from the internet, you need to either use a VPN or use the (non-HPC) login.gwdg.de as a jump host. You can find detailed instructions here.
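For example, assuming your GWDG user name is <username> (a placeholder to replace with your own), a connection could look like this, using ssh's -J option for the jump host:

$ ssh <username>@login-mdc.hpc.gwdg.de                               # from within the GÖNET
$ ssh -J <username>@login.gwdg.de <username>@login-mdc.hpc.gwdg.de   # from the internet, via the jump host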
Submitting jobs
Our compute cluster is divided into frontends and compute nodes. The frontends are meant for editing, compiling, and interacting with the batch system. Please do not use them for intensive testing, i.e. calculations longer than a few minutes. All users share resources on the frontends and will be impaired in their daily work if you overuse them.
To run a program on one (or more) of the compute nodes, you need to interact with our batch system, or scheduler, Slurm. You can do this with several different commands, such as srun, sbatch, and squeue. A very simple example of such an interaction would be this:
$ srun hostname
dmp023
This runs the program hostname on one of our compute nodes. However, the program only gets access to a single core and very little memory. That is not a problem for the hostname program, but if you want to calculate something more serious, you will need access to more resources. You can find out how to do that in our Slurm documentation.
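As a rough sketch of what requesting more resources looks like, the following job script asks Slurm for four cores, 8 GB of memory, and 30 minutes of run time. The values are arbitrary examples, <partition> is a placeholder, and ./my_program stands for your own executable; see the Slurm documentation for the options and partitions actually available.

#!/bin/bash
#SBATCH --job-name=example        # a descriptive name for the job
#SBATCH --partition=<partition>   # placeholder: choose a partition from the Slurm documentation
#SBATCH --ntasks=1                # number of tasks (processes)
#SBATCH --cpus-per-task=4         # cores per task
#SBATCH --mem=8G                  # memory for the job
#SBATCH --time=00:30:00           # wall-clock time limit (hh:mm:ss)

srun ./my_program                 # launch the program on the allocated resources

You would save this as, for example, job.sh, submit it with sbatch job.sh, and check its status with squeue -u $USER.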
Software
We provide a growing number of programs, libraries, and tools on our system. These are available as modules. You can find a list with the module avail command and load them via module load. For example, if you want to run GROMACS, you simply use module load gromacs to get the most recent version. Additionally, we use the package management tool Spack to install software. A guide on how to use modules and Spack is available here.
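For example, a typical module session could look like this (the module names and versions shown on the system may differ):

$ module avail gromacs   # list the available GROMACS modules
$ module load gromacs    # load the most recent GROMACS version
$ module list            # show which modules are currently loaded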
We also provide different compilers and libraries if you want to compile your own software. As with the rest of the software, these are available as modules. They include gcc, intel, and nvhpc as compilers, openmpi and intel-oneapi-mpi as MPI libraries, and others such as mpi4py, fftw, and hdf5. You can find more specific instructions on code compilation on our dedicated page.
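As a sketch of a typical compilation workflow, assuming a small MPI program in a file hello_mpi.c (a hypothetical name), you could load a compiler and an MPI library and build it with the MPI compiler wrapper:

$ module load gcc openmpi          # compiler and MPI library modules
$ mpicc -O2 -o hello hello_mpi.c   # compile with the Open MPI wrapper around gcc
$ srun --ntasks=4 ./hello          # run the result on a compute node with 4 MPI ranks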
Performance Engineering and Analysis
Performance engineering, analysis, and optimization are imperative for HPC applications, especially considering the huge amount of resources spent on assembling and operating large computing systems with complex microprocessors (X-PUs) and memory architectures.
Performance analysis and optimization of HPC applications involve three main steps: application instrumentation, run-time measurement of key events, and visual analysis of profiles and event traces.
The performance tools currently available on our clusters are LIKWID, Score-P, Vampir, and Scalasca for CPUs, and the Nsight tool suite (Systems, Compute, and Graphics) for NVIDIA GPUs. The tools and how to use them on the cluster are documented on the Performance Tools page.
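As a rough sketch of the three steps above using Score-P and Scalasca (the module names are assumptions; the exact workflow is described on the Performance Tools page):

$ module load scorep scalasca               # assumed module names
$ scorep mpicc -O2 -o app app.c             # 1. instrument the application at compile time
$ scalasca -analyze srun --ntasks=4 ./app   # 2. run it and record run-time measurements
$ scalasca -examine <experiment_dir>        # 3. visually analyse the profile written in step 2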
A short note on naming
The frontends and transfer nodes also have descriptive names of the form $func-$site.hpc.gwdg.de, based on their primary function and site, where $func is either login or transfer, while $site is mdc (modular data center, access to scratch). For example, to reach any login node at the MDC site, you would connect to login-mdc.hpc.gwdg.de.