High Performance Computing
Welcome to the official documentation of the Scientific Compute Cluster (SCC). It is the high performance computing system operated by the GWDG for both the Max Planck Society and the University of Göttingen.
This documentation will give you the necessary information to get access to the system, find the right software or compile your own, and run calculations.
- [hpc-announce] You are invited to our GöHPCoffee on upcoming wednesday (2024/02/16 11:31)
- [hpc-announce] You are invited to our GöHPCoffee on upcoming wednesday (2024/02/05 09:04)
- [hpc-announce] EnginFrame on gwdu108 temporarily down (2024/02/01 17:33)
- [hpc-announce] finished migrating LDAP server of the SCC: downtime now over (2024/02/01 12:18)
An archive of all news items can be found on the HPC-announce mailing list.
Accessing the system
To use the compute cluster, you need a full GWDG account. Most employees of the University of Göttingen and the Max Planck Institutes already have such an account. However, it is not activated for the use of the compute resources by default. More information on how to get your account activated, or how to get an account in the first place, can be found here.
Once your account is activated, you can log in to the frontend nodes, e.g. login-mdc.hpc.gwdg.de. These nodes are only accessible via SSH from the GÖNET. If you are connecting from the internet, you need to either use a VPN or go through our login server. You can find detailed instructions here.
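As a minimal sketch of the login step, assuming you are inside the GÖNET or connected via VPN: the user name `jdoe` below is a hypothetical placeholder, and only the host name is taken from the text above.

```shell
# Assumption: "jdoe" is a placeholder; replace it with your own GWDG user name.
GWDG_USER="jdoe"
LOGIN_HOST="login-mdc.hpc.gwdg.de"   # frontend named in the text above

# Print the command first so it can be checked before connecting.
echo "ssh ${GWDG_USER}@${LOGIN_HOST}"

# Uncomment to actually connect (reachable only from the GÖNET or via VPN):
# ssh "${GWDG_USER}@${LOGIN_HOST}"
```

Access from outside the GÖNET needs the additional steps described in the detailed instructions and is not covered by this sketch.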
Our compute cluster is divided into frontends and compute nodes. The frontends are meant for editing, compiling, and interacting with the batch system. Please do not use them for intensive testing, i.e. calculations longer than a few minutes. All users share resources on the frontends and will be impaired in their daily work if you overuse them.
To run a program on one (or more) of the compute nodes, you need to interact with our batch system, or scheduler, Slurm. You can do this with several different commands, such as squeue. A very simple example of such an interaction would be this:
$ srun hostname
dmp023
This runs the program hostname on one of our compute nodes. However, the program only gets access to a single core and very little memory. That is not a problem for the hostname program, but if you want to calculate something more serious, you will need access to more resources. You can find out how to request them in our Slurm documentation.
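Requesting more resources could look roughly like the following sketch. The options used are standard srun options, but the concrete values are illustrative assumptions rather than site recommendations; suitable limits and partitions are described in the Slurm documentation.

```shell
# Illustrative resource requests (assumed values, adjust to your job).
CPUS=4                # cores for the single task
MEM="8G"              # memory per node
WALLTIME="00:10:00"   # run-time limit, hh:mm:ss

if command -v srun >/dev/null 2>&1; then
    # nproc prints the number of cores visible to the job step.
    srun --ntasks=1 --cpus-per-task="$CPUS" --mem="$MEM" --time="$WALLTIME" nproc
else
    echo "srun not found; run this on a cluster frontend" >&2
fi
```

Replacing nproc with your own program runs it under the same resource request.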
We provide a growing number of programs, libraries, and software on our system. These are available as modules. You can find a list with the module avail command and load them via module load. For example, if you want to run GROMACS, you simply use module load gromacs to get the most recent version. Additionally, we use a package management tool called Spack to install software. A guide on how to use modules and Spack is available here.
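A typical module session might look like the sketch below. It uses only the commands named above plus module list, which shows the currently loaded modules; the guard is an addition so that the snippet also behaves sensibly in a shell where the module system is not set up.

```shell
MODULE_NAME="gromacs"   # module name taken from the example in the text

# 'module' is provided by the cluster's module system and is normally
# only defined in a shell on the frontends or compute nodes.
if type module >/dev/null 2>&1; then
    module avail "$MODULE_NAME"   # list the installed versions
    module load "$MODULE_NAME"    # no version given: most recent one, per the text
    module list                   # show what is loaded now
else
    echo "module command not available in this shell" >&2
fi
```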
We provide different compilers and libraries if you want to compile your software on your own. As with the rest of the software, these are available as modules. These include compilers such as nvhpc, MPI libraries such as intel-oneapi-mpi, and other libraries such as hdf5. You can find more specific instructions on code compilation on our dedicated page.
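As a rough sketch of compiling your own code, the following writes a small MPI test program and builds it with the mpicc wrapper. It assumes that loading intel-oneapi-mpi puts mpicc on the PATH and that a suitable C compiler is available behind it; both depend on the site setup, so check the dedicated compilation page. The file name hello_mpi.c is hypothetical.

```shell
# Write a minimal MPI program (hypothetical file name).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Load the MPI module if the module system is present (assumed setup).
if type module >/dev/null 2>&1; then
    module load intel-oneapi-mpi
fi

# Compile only if an MPI compiler wrapper is actually available.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -O2 -o hello_mpi hello_mpi.c
else
    echo "mpicc not found; load an MPI module first" >&2
fi
```

The resulting binary should be started on the compute nodes through Slurm rather than on a frontend.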
Performance Engineering and Analysis
Performance engineering, analysis, and optimization are imperative for HPC applications, especially considering the huge amount of resources spent on assembling and operating large computing systems with complex microprocessors (X-PUs) and memory architectures.
Performance analysis and optimization of HPC applications mainly involves three steps: application instrumentation, run-time measurement of key events, and visual analysis of profiles and event traces.
The performance tools currently available on our clusters are LIKWID, Score-P, Vampir, and Scalasca for CPUs, and the Nsight tool set (Systems, Compute, and Graphics) for Nvidia GPUs. The tools and how to use them on the cluster are documented on the Performance Tools page.
A short note on naming
The frontends and transfer nodes also have descriptive names of the form $func-$site.hpc.gwdg.de, based on their primary function and site, where
$func is either
mdc (modular data center, access to
scratch). For example, to reach any login node at the MDC site, you would connect to