en:services:application_services:jupyter:start [2021/12/16 11:13] – [Transfer data to the Unix / Linux home directory] bbrauns | en:services:application_services:jupyter:start [2024/08/23 11:36] (current) – [Creating a new environment] bwegman1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Jupyter / JupyterHub ====== | ||
+ | |||
+ | GWDG offers [[https:// | ||
+ | |||
+ | ===== Preconditions For Use ===== | ||
+ | A valid [[de: | ||
+ | ===== Who is the service intended for? ===== | ||
+ | |||
+ | The service is intended for use in education, for example in seminars and lectures, and for individual users who want to learn and try out the respective programming languages and their tools. | ||
+ | |||
+ | For larger use cases and when working with large datasets, complex models or parallel computations it is recommended to use the [[https:// | ||
+ | |||
+ | ===== What is Jupyter / JupyterHub? ===== | ||
+ | |||
+ | Jupyter makes it possible to work interactively with Python, Julia or R using only a browser. Source code is written, executed and edited directly in the user's browser. This happens in a so-called " | ||
+ | |||
+ | JupyterHub is the portal where users log in to start and manage their notebooks and associated files. | ||
+ | |||
+ | Please note that you can also use [[https:// | ||
+ | |||
+ | ===== How to use Jupyter / JupyterHub? ===== | ||
+ | |||
+ | ==== Prerequisites ==== | ||
+ | |||
+ | Notebooks are stored and executed on the server, so the client does not need to install any software or meet any prerequisites other than a reasonably modern browser. To log into the service a [[en: | ||
+ | |||
+ | ==== Selecting a notebook image ==== | ||
+ | |||
+ | After successfully logging in, a selection screen appears with a choice of notebook images to start the notebook server with: | ||
+ | * GWDG default image (based on jupyter/ | ||
+ | * This was also the default image in the past. | ||
+ | * Python Stack w/ TensorFlow (jupyter/ | ||
+ | * Python and R Spark Jupyter Notebook (jupyter/ | ||
+ | * Data Science Jupyter Notebook (jupyter/ | ||
+ | |||
+ | The notebook image provides the environment for the notebook server, in particular the pre-installed software. | ||
+ | While the GWDG default image is a heavily extended version of the regular data science notebook, the regular notebooks from the Jupyter project may be preferable in some cases or provide a more specialized environment for a specific software set. | ||
+ | |||
+ | Irrespective of the selected image, the user's home directory and data remain the same. | ||
+ | |||
+ | The notebook image can only be changed by stopping and restarting the current notebook server. The server does not stop automatically when you log off or close the browser, although it will then time out after a while and stop. The server can be stopped explicitly via the menu File -> Hub Control Panel -> "Stop my server" | ||
+ | |||
+ | === Changelog of notebook images === | ||
+ | |||
+ | A simple changelog with [[en: | ||
+ | ==== Starting a notebook ==== | ||
+ | |||
+ | After successful login to JupyterHub there is a drop-down menu in the top left corner. Under "File - New" a new notebook can be created. Previously used notebooks and their files are listed on the left-hand side. | ||
+ | |||
+ | Detailed information about the user interface: https:// | ||
+ | |||
+ | ==== Managing notebooks ==== | ||
+ | |||
+ | An active notebook can be closed through the "File - Close and shutdown notebook" | ||
+ | |||
+ | Deleting a notebook is done through the directory listing - left click - delete. | ||
+ | ===== Usage ===== | ||
+ | * ⚠ **Please note** that all directories except the home directory are volatile and will be lost when the notebook server is closed. | ||
+ | * A maximum of 50 GB disk space and 10 GB RAM can be used. | ||
+ | * jupyter-cloud.gwdg.de is not suitable for continuous computations over multiple days. | ||
+ | |||
+ | ==== Notebook does not start / Kernel cannot connect ==== | ||
+ | If, after upgrading or installing packages or installing a new kernel, a notebook no longer starts or the kernel cannot connect, it can help to rename the ' | ||
+ | |||
+ | * File - New - Terminal | ||
+ | * <code bash> mv -v .local/ .local.gwdg-disable </ | ||
+ | * Restart notebook server: File - Hub Control Panel - Stop My Server | ||
+ | |||
+ | ==== Installing additional Python modules ==== | ||
+ | Additional Python modules can be installed via the terminal and the Python package manager " | ||
+ | |||
+ | === Installing large Python modules and disk space === | ||
+ | |||
+ | The installation of large Python modules like " | ||
+ | |||
+ | <code bash> | ||
+ | mkdir -v ~/ | ||
+ | TEMP=~/ | ||
+ | </ | ||
+ | |||
+ | Prefixing the installation with the TEMP variable makes pip use that location for this one installation. | ||
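The effect of the TEMP prefix can be illustrated from Python itself. pip is a Python program, and Python's standard `tempfile` module consults the environment variables `TMPDIR`, `TEMP` and `TMP` (in that order) when choosing a temporary directory. The following is a minimal sketch of that lookup, not a description of pip's internals; the helper name `tempdir_seen_by_child` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def tempdir_seen_by_child(temp_value):
    """Return the temp directory a fresh Python process selects when TEMP is set."""
    env = dict(os.environ)
    # TMPDIR is checked before TEMP, so drop it for this demonstration.
    env.pop("TMPDIR", None)
    env["TEMP"] = temp_value
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Use a throwaway directory as a stand-in for a temp directory in the home directory.
with tempfile.TemporaryDirectory() as scratch:
    print(tempdir_seen_by_child(scratch))
```

Because the variable is set only for the child process, the rest of the session keeps its default temporary directory, which matches the "this one installation" behaviour described above.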
+ | |||
+ | === Notebook fails to start after package installation or kernel connection failure === | ||
+ | |||
+ | If every notebook fails to start after a package installation or upgrade, the issue can be resolved by renaming the folder '' | ||
+ | * File - New - Terminal | ||
+ | * <code bash> mv -v .local/ .local.gwdg-disable </ | ||
+ | * Afterwards the notebook server should be restarted: File - Hub Control Panel - Stop My Server | ||
+ | |||
+ | <WRAP tip> | ||
+ | **mamba** is an alternative implementation of the **conda** package manager. They are interchangeable, | ||
+ | https:// | ||
+ | |||
+ | There are a few steps below where **conda** is still used instead of **mamba** because in tests this appeared to be necessary. The **mamba** documentation may provide alternative and better solutions; the examples below are provided as working examples and are likely not the best solutions. | ||
+ | </ | ||
+ | |||
+ | ==== Installation of additional packages and environments via Conda/Mamba ==== | ||
+ | |||
+ | Management of software packages and environments with Conda/Mamba requires a terminal session started from the notebook server. The terminal is available after login via '' | ||
+ | |||
+ | Before working with '' | ||
+ | <code bash> | ||
+ | . / | ||
+ | </ | ||
+ | |||
+ | === Creating a new environment === | ||
+ | |||
+ | The following describes the creation of a new, simple environment '' | ||
+ | |||
+ | Creating and activating the environment: | ||
+ | <code bash> | ||
+ | mamba create -y --prefix ./wikidoku | ||
+ | conda activate ./wikidoku | ||
+ | </ | ||
+ | |||
+ | As an example the package '' | ||
+ | <code bash> | ||
+ | mamba install -y jinja2 | ||
+ | </ | ||
+ | |||
+ | Next the new environment will be registered with the notebook. Terms for '' | ||
+ | <code bash> | ||
+ | python3 -m ipykernel install --user --name wikidoku --display-name " | ||
+ | jupyter kernelspec list | ||
+ | mamba deactivate | ||
+ | </ | ||
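To check which kernels have been registered, the kernel specifications can also be inspected from Python. The sketch below assumes the default per-user location on Linux, `~/.local/share/jupyter/kernels`, where each kernel has a directory containing a `kernel.json` file; the path and the helper name `list_kernels` are assumptions and not taken from this page. `jupyter kernelspec list` remains the authoritative command.

```python
import json
from pathlib import Path

# Assumption: default per-user kernel directory on Linux.
USER_KERNELS_DIR = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def list_kernels(kernels_dir):
    """Map kernel name -> display name for every kernel.json below kernels_dir."""
    kernels = {}
    kernels_dir = Path(kernels_dir)
    if not kernels_dir.is_dir():
        return kernels
    for spec_file in sorted(kernels_dir.glob("*/kernel.json")):
        with spec_file.open(encoding="utf-8") as handle:
            spec = json.load(handle)
        name = spec_file.parent.name
        # Fall back to the directory name if no display name is set.
        kernels[name] = spec.get("display_name", name)
    return kernels


for name, display_name in list_kernels(USER_KERNELS_DIR).items():
    print("{}: {}".format(name, display_name))
```

If the assumed directory does not exist, the function returns an empty mapping and nothing is printed.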
+ | |||
+ | If installation of the kernel fails with the message ''/ | ||
+ | <code bash> | ||
+ | python3 -m pip install jupyter | ||
+ | </ | ||
+ | |||
+ | === Selecting the new environment === | ||
+ | |||
+ | == Restarting of the notebook server == | ||
+ | |||
+ | After installing a new environment it is recommended to restart the notebook server. Exit all open terminals and close all open notebooks. On the Jupyter overview page click on '' | ||
+ | |||
+ | Via '' | ||
+ | |||
+ | === Installing additional kernels in a Conda/Mamba environment === | ||
+ | |||
+ | It is possible to install a new, independent Python kernel for the current environment. As an example, an older Python 2.7 kernel will be installed next. | ||
+ | |||
+ | A new environment needs to be created and activated as per the steps above. | ||
+ | Next follows the installation of the kernel, the '' | ||
+ | <code bash> | ||
+ | mamba install -y python=2.7 | ||
+ | python3 -m pip install jupyter | ||
+ | python3 -m ipykernel install --user --name oldpython --display-name " | ||
+ | </ | ||
+ | |||
+ | The new kernel is now available for new and existing notebooks after restarting the notebook server. | ||
+ | The current kernel version can be queried from within Python: | ||
+ | <code python> | ||
+ | import sys | ||
+ | print(sys.version) | ||
+ | </ | ||
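When several environments are registered, it can also help to see which interpreter and environment directory the running kernel actually uses. The following sketch relies only on the standard `sys` module and is written to run on both Python 2.7 and Python 3 kernels; the helper name `interpreter_info` is hypothetical.

```python
import sys


def interpreter_info():
    """Collect version, interpreter path and environment prefix of the running kernel."""
    return {
        "version": sys.version.split()[0],
        "executable": sys.executable,
        "prefix": sys.prefix,
    }


for key, value in sorted(interpreter_info().items()):
    print("{}: {}".format(key, value))
```

For a kernel from a prefix-based environment, the `prefix` entry would be expected to point at that environment's directory rather than at the image's default installation.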
+ | |||
+ | === Removing an environment === | ||
+ | |||
+ | In order to remove an environment it has to be de-registered from the notebook server and then its files removed (optional but recommended). We list the installed kernels, de-register and remove the environment' | ||
+ | |||
+ | <code bash> | ||
+ | jupyter kernelspec list | ||
+ | jupyter kernelspec remove wikidoku | ||
+ | rm -rf ./wikidoku | ||
+ | </ | ||
+ | |||
+ | ==== Installing additional R packages ==== | ||
+ | < | ||
+ | 1) create a file "/ | ||
+ | " | ||
+ | |||
+ | execute the following in a terminal (" | ||
+ | |||
+ | 2) mkdir -p ~/ | ||
+ | 3) R | ||
+ | 4) source(" | ||
+ | 5) biocLite() | ||
+ | |||
+ | This is because R downloads and installs packages to and from the default tmp directory, | ||
+ | from which it cannot execute files. Using a tmp directory inside the home directory solves | ||
+ | this problem. | ||
+ | |||
+ | How to install packages from Github (in R): | ||
+ | |||
+ | 1) library(devtools) | ||
+ | 2) options(unzip = " | ||
+ | 3) install_github(" | ||
+ | |||
+ | </ | ||
+ | |||
+ | ==== Transfer data to the Unix / Linux home directory ==== | ||
+ | In order to facilitate access to larger amounts of data on jupyter-cloud.gwdg.de, | ||
+ | |||
+ | Open a new Jupyter terminal via the menu “New” → “Terminal” | ||
+ | |||
+ | < | ||
+ | jovyan@0d5793127e96: | ||
+ | myfile.txt | ||
+ | jovyan@0d5793127e96: | ||
+ | sending incremental file list | ||
+ | ./ | ||
+ | myfile.txt | ||
+ | |||
+ | sent 145 bytes received 44 bytes 75.60 bytes/sec | ||
+ | total size is 12 speedup is 0.06 | ||
+ | jovyan@0d5793127e96: | ||
+ | </ | ||
+ | |||
+ | If necessary, the respective SSH private key must be stored in the .ssh/ directory on jupyter-cloud. The associated SSH public key must also be available on login.gwdg.de. | ||
+ | |||
+ | For accessing the data in the Unix / Linux home directory from a Windows machine, see: [[de: | ||
+ | |||
+ | ==== Install additional kernel with pipenv ==== | ||
+ | |||
+ | Open a new Jupyter terminal via the menu “New” → “Terminal” | ||
+ | |||
+ | < | ||
+ | pip install pipenv --user | ||
+ | mkdir myproject | ||
+ | cd myproject | ||
+ | export PATH=~/ | ||
+ | pipenv --python / | ||
+ | pipenv install ipykernel networkx | ||
+ | pipenv shell | ||
+ | ipython kernel install --user --name=projectname | ||
+ | </ | ||
+ | |||
+ | * Stop and restart server via control panel | ||
+ | * Afterwards " | ||
+ | |||
+ | ==== Install additional Julia packages with an extra kernel ==== | ||
+ | ** !experimental! ** | ||
+ | |||
+ | The Jupyter Docker Stacks image sets the variable JULIA_DEPOT_PATH to the path /opt/julia. However, this location is volatile, since only the home directory is persistent. The following describes the installation of a new Julia kernel whose package directory points to the home directory: | ||
+ | |||
+ | < | ||
+ | Start terminal | ||
+ | Temporarily change julia package directories: | ||
+ | |||
+ | export JULIA_DEPOT_PATH=/ | ||
+ | export JULIA_PKGDIR=/ | ||
+ | |||
+ | Create directory for custom packages and new julia kernel: | ||
+ | |||
+ | > mkdir / | ||
+ | > julia | ||
+ | julia > # switch to pkg with ' | ||
+ | pkg > add IJulia # switch back to julia with CTRL+C | ||
+ | julia > using IJulia | ||
+ | | ||
+ | Restart notebook server | ||
+ | Create new notebook with " | ||
+ | Add package example: | ||
+ | |||
+ | using Pkg | ||
+ | Pkg.add(" | ||
+ | </ | ||
+ | |||
+ | ==== As a tutor, how can I share larger datasets with others? ==== | ||
+ | See: [[public-folder|public folder]]. | ||
+ | |||
+ | |||
+ | |||