This documentation describes how multiple users can share data within the SCC. We currently offer four options, described below.
Using a hidden directory
In this case the data will be readable by all users, or by a specific POSIX group, but only if they know the path to it. Send the path to the users you want to grant access. To make this safer, it is better to share the data via a directory with a random name.
First, create a directory with a random name:
SHAREDIR=$(mktemp -p /scratch/users/$USER -d share.XXXXXXXX)
This will create a directory with a random name in /scratch/users/$USER and save the path in the variable SHAREDIR.
Now you can copy or move the files you want to share into that directory:
cp /PATH/TO/MY/FILES $SHAREDIR/.  # if you want to copy
mv /PATH/TO/MY/FILES $SHAREDIR/.  # if you want to move
Now you need to set permissions on the directories:
chmod go=x /scratch/users/$USER  # set execute permission on the parent directory
chmod -R go+rX $SHAREDIR         # make the files readable for other users and the group
After that, send the path to the users you want to share the data with. To print the path of the shared directory, run:
echo $SHAREDIR
Users who know the path can cd into it and copy the files it contains.
After sharing is done, don't forget to unset the execute permission on the parent directory to restrict access to the files again:
chmod go= /scratch/users/$USER  # remove all group and other permissions on the parent directory
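The whole workflow above can be sketched end to end. The sketch below uses a temporary directory as a stand-in for /scratch/users/$USER so it can be tried anywhere; on the cluster you would use the real scratch path:

```shell
# Stand-in for /scratch/users/$USER (assumption: real use targets the scratch path)
PARENT=$(mktemp -d)

# Create the hidden share directory with a random name
SHAREDIR=$(mktemp -p "$PARENT" -d share.XXXXXXXX)

# Put a file in it and open up the permissions
echo "hello" > "$SHAREDIR/data.txt"
chmod go=x "$PARENT"        # others may traverse the parent, but not list it
chmod -R go+rX "$SHAREDIR"  # others may read the shared files

# Print the path to send to your collaborators
echo "$SHAREDIR"

# When sharing is done, lock the parent down again
chmod go= "$PARENT"
stat -c '%a' "$PARENT"      # 700: only the owner has access
```

Note that `chmod go=x` on the parent lets others reach the shared directory if and only if they already know its random name, which is what makes the hidden-directory scheme work.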
Project directory in scratch filesystem
In this case we will create a shared project directory in the scratch filesystem (/scratch/projects/PROJECTNAME) and give full access to it to the members of a POSIX group of your choice. This option is well suited for collaborations that involve large amounts of data. Be aware that the scratch filesystem has no backup; if you need a backup, consider the next option.
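The typical permission setup for such a project directory can be sketched as follows. The sketch uses a temporary directory and the caller's own primary group so it runs anywhere; on the cluster the directory would be /scratch/projects/PROJECTNAME and GROUP would be the POSIX group created for your project:

```shell
# Stand-ins (assumptions): the real directory is /scratch/projects/PROJECTNAME
# and GROUP is the POSIX group created for your project
PROJECTDIR=$(mktemp -d)
GROUP=$(id -gn)                 # the caller's primary group, so the sketch runs anywhere

chgrp -R "$GROUP" "$PROJECTDIR" # hand the directory over to the project group
chmod -R g+rwX "$PROJECTDIR"    # group members may read, write, and traverse
chmod g+s "$PROJECTDIR"         # setgid: new files inherit the project group

stat -c '%a' "$PROJECTDIR"      # 2770: group rwx plus the setgid bit
```

The setgid bit is the important detail: without it, files created by different group members would carry their personal primary groups instead of the project group.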
Applying for a group
In order to share the directory with a group of users you need to apply for a POSIX group by contacting us. Please give us a unique group name and usernames of people you want to be in the group.
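Once the group has been created, you can check that it exists and that you are a member. YOURGROUP below is a placeholder for the group name you requested; the sketch queries the caller's own primary group so it runs anywhere:

```shell
# List the groups the current user belongs to
id -Gn

# Look up a group entry; prints GROUPNAME:x:GID:member1,member2 if the group exists.
# Replace "$(id -gn)" with YOURGROUP on the cluster -- the primary group is used
# here only so the sketch works on any machine.
getent group "$(id -gn)"
```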
Using a functional account
In this case you will have the HOME directory of a functional account as a shared space for collaboration, as in the previous option. It comes with backup and archiving possibilities; however, the I/O performance of large computations will suffer from the slow HOME filesystem. If you need to process a large amount of data, consider the previous option, or both options simultaneously: use the functional account for storing data and the scratch filesystem for processing.
First you need to apply for the functional account, which will take a few days, since it has to be approved by the head of your institute. Then apply for the POSIX group as described in the previous section. When you have access to your functional account, contact us so we can add the functional account to the POSIX group.
After the POSIX group is ready, change the permissions of the HOME directory on the login node (login.gwdg.de):
ssh firstname.lastname@example.org
chmod g+rwxs .     # everyone in the group will have full access to the directory
chgrp YOURGROUP .  # change the group of the HOME directory
If you don't want members of the group to be able to delete or rename files that don't belong to them, you can additionally set the sticky bit on the directory:
chmod +t .
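You can verify the resulting permissions with ls. The sketch below reproduces the two chmod calls on a temporary directory (a stand-in for the functional account's HOME) so the effect can be inspected on any machine:

```shell
# Stand-in for the functional account's HOME directory (assumption)
DIR=$(mktemp -d)

chmod g+rwxs "$DIR"  # group gets rwx; setgid keeps the group on new files
chmod +t "$DIR"      # sticky bit: only a file's owner may delete or rename it

ls -ld "$DIR"        # mode shows as drwxrws--T (capital T: sticky bit without
                     # "other" execute permission)
```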
Using an S3 bucket
In this case you first need to get an S3 bucket from us: simply write a mail to email@example.com and ask for one that is accessible from the HPC system. You can then share your access key and secret key within your group to give everyone access. In this scenario your data is accessed via HTTP, so it is reachable not only from the HPC system but also from the cloud and the Internet (if needed).
You can access your S3 bucket from a compute node using http://172.19.1.26:8090 as the endpoint.
In order to work with your S3 bucket, you could for instance use rclone:
module load rclone

# List the contents of your bucket
rclone ls <config-name>:<bucket-name>/<prefix>

# Or sync the contents of your $HOME with the bucket
rclone sync -i $HOME/some/folder <config-name>:<bucket-name>/<prefix>

# Or sync the contents of the bucket with your $HOME
rclone sync -i <config-name>:<bucket-name>/<prefix> $HOME/some/folder
This requires a config file in /usr/users/$USER/.config/rclone/rclone.conf with the following content:
[<config-name>]
type = s3
provider = Ceph
env_auth = false
access_key_id = <AccessKey>
secret_access_key = <SecretKey>
region =
endpoint = http://172.19.1.26:8090
location_constraint =
acl =
server_side_encryption =
storage_class =
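The config file can also be written in one step from the shell. The sketch below writes to a temporary file so it can be tried anywhere; on the SCC the real target is /usr/users/$USER/.config/rclone/rclone.conf, and "myscc", <AccessKey>, and <SecretKey> are placeholders you must replace with your own values:

```shell
# Sketch target: a temporary file; on the SCC the real path is
# /usr/users/$USER/.config/rclone/rclone.conf
CONF=$(mktemp)

# [myscc] is a placeholder config name; <AccessKey>/<SecretKey> must be
# replaced with the credentials you received for your bucket
cat > "$CONF" <<'EOF'
[myscc]
type = s3
provider = Ceph
env_auth = false
access_key_id = <AccessKey>
secret_access_key = <SecretKey>
endpoint = http://172.19.1.26:8090
EOF

grep '^endpoint' "$CONF"  # quick sanity check of the written file
```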