Data Sharing

This article describes how to share data in your KISSKI projects with others.

There are different tools that you can use to share your data with other people. The tool rsync requires little setup, rclone is well suited for sharing data with other services such as Owncloud, and tmux keeps long-running copy processes alive:

  1. Rsync: Basic Copying to Local Machine
  2. Rclone: Copying to Owncloud
  3. Tmux: Copying with Long Duration

💎 Additional help sections in case you need to debug a particular step are marked with a diamond.

Rsync: Basic Copying to Local Machine

From Cluster to Remote Machine

The tool rsync is a great way to copy files between two systems. To copy files from Emmy to your local machine, run the following in a shell on your local machine:

rsync [FROM WHERE] [TO WHERE]

rsync -rpt nibsommr@glogin.hlrn.de:/home/nibsommr/deep-learning-with-gpu-cores/ .

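with the useful flags:

- -r: copy directories recursively
- -p: preserve file permissions
- -t: preserve modification times

Replace nibsommr with your NHR username. This will copy the repository deep-learning-with-gpu-cores to your current directory in the shell, indicated by the dot (.) at the end of the command.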

💎 If you run into problems, add the verbose flag -vv to get more output on what went wrong.

💎 You might need to pass your ssh key file explicitly. Because rsync connects through ssh, do this with -e "ssh -i ~/.ssh/hlrn-key", replacing the path with the path to your key file.
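As a sketch of how the pieces above fit together, the following script assembles the rsync call from a few variables and wraps it in a small function. The variable and function names are made up for this illustration, and the key path ~/.ssh/hlrn-key is only a placeholder. The copy itself is left commented out so that the script can be reviewed safely first.

```shell
#!/bin/sh
# Placeholder values for illustration; replace them with your own.
cluster_user="nibsommr"
cluster_host="glogin.hlrn.de"
key_file="$HOME/.ssh/hlrn-key"
remote_dir="/home/${cluster_user}/deep-learning-with-gpu-cores/"

# [FROM WHERE]: the directory on the cluster.
source_spec="${cluster_user}@${cluster_host}:${remote_dir}"

# Hypothetical helper: copies the remote directory into the given destination.
# -rpt : recurse, keep permissions, keep modification times
# -vv  : extra output for debugging
# -e   : remote shell command to use, here ssh with an explicit key file
pull_from_cluster() {
    destination="$1"
    rsync -rpt -vv -e "ssh -i ${key_file}" "${source_spec}" "${destination}"
}

echo "Source: ${source_spec}"
# Uncomment the next line to start the copy into the current directory.
# pull_from_cluster .
```

Keeping the username and host in variables means only the first few lines need to change when someone else reuses the script.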

From Remote Machine to Cluster

To copy a file from your local machine to the cluster, run the following on the local machine:

rsync [FROM WHERE] [TO WHERE]

rsync -rpt example-file.png nibsommr@glogin.hlrn.de:/home/nibsommr/

Use the syntax from before, swapping source and destination.
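Since the two directions differ only in the order of the arguments, the behaviour can be rehearsed without touching the cluster. The following sketch uses two throwaway local directories in place of the two machines. The directory names and the use of mktemp are assumptions for this illustration, and the copy is skipped if rsync is not installed.

```shell
#!/bin/sh
# Throwaway directories standing in for the two machines (illustration only).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/machine_a/project" "$sandbox/machine_b"
echo "example" > "$sandbox/machine_a/project/example-file.txt"

if command -v rsync >/dev/null 2>&1; then
    # rsync [FROM WHERE] [TO WHERE]: copy from machine_a to machine_b.
    # The trailing slash on the source copies the directory contents,
    # not the directory itself.
    rsync -rpt "$sandbox/machine_a/project/" "$sandbox/machine_b/project/"

    # Swapping the two arguments copies in the other direction.
    echo "changed on b" > "$sandbox/machine_b/project/from-b.txt"
    rsync -rpt "$sandbox/machine_b/project/" "$sandbox/machine_a/project/"

    ls "$sandbox/machine_a/project"
else
    echo "rsync not found; nothing copied"
fi
```

With a real cluster, one of the two paths is simply replaced by the username@host:path form shown above.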

Rclone: Copying to Owncloud

To copy data to and from Owncloud, you can use Rclone as a client.

Setup

On the cluster, run rclone config and configure as follows (setup options):

- new remote
- name: owncloud
- 40 (Webdav)
- url: https://owncloud.gwdg.de/remote.php/nonshib-webdav
- 2: owncloud
- user name - the one you use for owncloud (e.g., single sign-on)
- password - the one used for owncloud
- no bearer token
- no editing advanced config necessary
- quit config (q)

💎 To test your configuration, run rclone lsd owncloud: and you should see all directories of your owncloud.
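The configuration check can also be scripted. The sketch below relies on rclone listremotes, which prints the names of all configured remotes, and looks for the remote name chosen in the setup above. The variable names are assumptions for this illustration.

```shell
#!/bin/sh
# Name of the remote as chosen during rclone config.
remote_name="owncloud"

if ! command -v rclone >/dev/null 2>&1; then
    remote_status="rclone-missing"
elif rclone listremotes 2>/dev/null | grep -qx "${remote_name}:"; then
    # listremotes prints one remote per line, each followed by a colon.
    remote_status="configured"
else
    remote_status="not-configured"
fi

echo "Remote ${remote_name}: ${remote_status}"
```

If the status is configured, rclone lsd owncloud: should list your Owncloud directories. Otherwise, run rclone config again.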

Copy Files from Cluster to Owncloud

Let’s say you want to copy the folder on the cluster /home/nibsommr/deep-learning-with-gpu-cores to the folder test in Owncloud:

rclone copy [FROM WHERE] [TO WHERE]

rclone copy -v /home/nibsommr/deep-learning-with-gpu-cores/ owncloud:test

Similarly, to copy a folder from Owncloud to the cluster, swap the order of the arguments. For example, to copy the folder Hackathon from Owncloud to the current directory on the cluster:

rclone copy [FROM WHERE] [TO WHERE]

rclone copy -v owncloud:Hackathon .
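Before transferring a large folder, it can be useful to preview what rclone would do. The sketch below uses the --dry-run flag of rclone, which is not part of the commands above, on two throwaway local directories that stand in for the cluster folder and the Owncloud folder. The directory names are assumptions for this illustration.

```shell
#!/bin/sh
# Local stand-ins for the cluster folder and the Owncloud folder (illustration only).
stage=$(mktemp -d)
mkdir -p "$stage/cluster-folder" "$stage/owncloud-test"
echo "data" > "$stage/cluster-folder/results.txt"

if command -v rclone >/dev/null 2>&1; then
    # --dry-run reports what would be transferred without copying anything.
    rclone copy -v --dry-run "$stage/cluster-folder" "$stage/owncloud-test"

    # Without --dry-run the files are actually copied.
    rclone copy -v "$stage/cluster-folder" "$stage/owncloud-test"
else
    echo "rclone not found; nothing copied"
fi
```

For a real transfer, replace the second path with a remote path such as owncloud:test.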

Share Owncloud Folder

Once you have uploaded the data from the cluster to Owncloud, you can share it with colleagues who might not have cluster access. To do so, go to https://owncloud.gwdg.de and click on the share symbol next to the folder you want to share. You can share with users and groups (if the other users also have access to Owncloud) or use a public link to share with people who do not have access to Owncloud.

Tmux: Copying with Long Duration

Copying can take a long time if you have a lot of files. This is a problem because the copy process stops as soon as you close the shell from which you started it. You can use tmux to avoid this.

Setup

The tool tmux is already installed on the cluster. For your local machine, you might need to install it.

Starting a Session

A tmux session is a group of shell windows. When you leave the session, programs that are running in it continue to run. You can therefore start the copying process in a tmux session and close the shell, and the copying process will continue. To start a new session called copying, run

tmux new -s copying

💎 To see your active sessions, run tmux ls and you will see your sessions with their creation date.
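A session can also be created in the background, without attaching to it, which is convenient in scripts. The sketch below uses the -d option of tmux new-session and tmux has-session to avoid creating the same session twice. The command sleep 30 is only a placeholder for your rsync or rclone call, and the variable names are assumptions for this illustration.

```shell
#!/bin/sh
# Session name as used in this article.
session="copying"

if ! command -v tmux >/dev/null 2>&1; then
    session_state="tmux-missing"
elif tmux has-session -t "$session" 2>/dev/null; then
    # A session with this name already exists; do not create a second one.
    session_state="already-running"
elif tmux new-session -d -s "$session" 'sleep 30' 2>/dev/null; then
    # -d starts the session detached; the quoted command runs inside it.
    session_state="started"
else
    session_state="failed"
fi

echo "Session ${session}: ${session_state}"
```

A session started this way ends on its own when the command inside it finishes.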

Attaching Sessions

Attaching a session means connecting your currently open terminal to an existing tmux session.

To attach to a session, run

tmux attach -t copying

where copying is the name of your session.

💎 If you get the message "sessions should be nested with care, unset $TMUX to force", you are already inside a session. In most cases you will want to detach from the current session first and then attach to the session of your choice.

Running the Copying

Inside the attached session, you can now start the copying process (using rsync or rclone as described in the chapters above).

Detaching and Closing Sessions

To detach your current terminal from the tmux session, press Ctrl + b, then d. The copy process that you started within the session will continue.

You can reattach the session later with tmux attach -t copying.

To delete a session, run

tmux ls

to list all your sessions and

tmux kill-session -t copying

with copying as the name of the session that you want to kill.

By wrapping the copying process in a tmux session, you ensure that it continues after you close your terminal. For a full overview of tmux commands, see the tmux manual page (man tmux).