==== Using Docker Images ====

== HOW THIS TUTORIAL WORKS ==

This tutorial gives a short introduction to using Docker images with Singularity. It covers connecting to the cluster, pulling an image, and starting a database for a small test script.

**It is itself written as a shell script and can simply be executed on the cluster!** You may also copy and paste the commands one by one for a better understanding.

Please check [[hpc:tutorials:singularity:create_image_files|this tutorial]] if you are interested in creating your own image file.

Thanks for reading \\ Jan Eberhardt

#!/bin/bash

# 0. Log in on the frontend (you probably already did that)
# Use your TUB account and the host gateway.hpc.tu-berlin.de
# ssh "@gateway.hpc.tu-berlin.de"

# 1. Get the Docker image
# Go to your home directory and download the image via Singularity.
# You must load the singularity module beforehand.
module load singularity/3.1.0
# Docker images are pulled with Singularity's pull command. The source will be something like "docker://[package name]".
# Singularity automatically downloads the latest version of the image and converts it to a Singularity image file (sif) named "[package name]_latest.sif".
# You therefore need write permission in your current working directory (which is why we change into home first).
cd
singularity pull "docker://mongo"

# 2. Create a Python environment [if using Python]
# Most Python projects use open source libraries installed by pip. Since normal users are not allowed to install packages system-wide,
# it is recommended to install pip packages in user space or in a virtual Python environment. We discourage installing into user space:
# many packages are only needed for a single project, so it is cleaner to keep a separate environment for each project.
# a) Load the python module.
module load python/3.7.1
# b) Create the environment (use ${HOME} here, because "~" is not expanded inside quotes)
py="${HOME}/mongodb_venv"
python3 -m venv "${py}"
# c) Install required pip packages and updates and create start script.
# You may adapt the next steps to your project.
# EXAMPLE PIP PACKAGE LIST
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cat << EOL > "${py}/pip-packages"
pymongo>=3.8.0
EOL
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
# EXAMPLE PYTHON SCRIPT
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cat << 'EOS' > ~/mongodb_run.py
#!/usr/bin/env python3
from pymongo import MongoClient
from pprint import pprint
from sys import argv, executable
from datetime import datetime

print("Starting {:s}".format(argv[0]))
print("Using environment {:s}".format(executable))
print("Connecting to localhost")
db_name = "test_database"
col_name = "test_collection"
client = MongoClient("localhost", 27017)
print("Open collection '{:s}' on database '{:s}'".format(col_name, db_name))
db = client[db_name]
col = db[col_name]
post = {
    "author": "HPC User",
    "text": "This is a test record!",
    "tags": ["test", "mongodb", "pymongo"],
    "date": datetime.utcnow(),
}
print("Inserting single record, resulting:")
post_record = col.insert_one(post)
pprint(post_record)
EOS
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
"${py}/bin/pip3" install --upgrade pip
"${py}/bin/pip3" install -r "${py}/pip-packages"

# 3. Create the DB directory (use ${HOME} here, because "~" is not expanded inside quotes)
dd="${HOME}/mongo"
mkdir -p "${dd}"

# 4. Start the server and run
# The batch script mongodb_start.sbatch allocates resources and starts the MongoDB server.
# It opens a server on a node and shuts it down after the Python script finishes.
#
# We use the --exclusive switch of SBATCH to ensure that port 27017 (MongoDB default) is not in use.
# If you do not want to use an exclusive node, you either have to accept the risk that the command fails or
# build a Singularity image of your own.
#>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cat << EOF > ./mongodb_start.sbatch
#!/bin/bash
#
# Start MongoDB docker image
#
#SBATCH --job-name=MongoDBStart
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --exclusive
#
#1 prepare
module load singularity/3.1.0
#- start the instance (not the server)
#- That way we can use the instance command to stop the database when the script finishes.
singularity instance start --bind "${dd}:/data/db" ./mongo_latest.sif mongodb
#- start the server (via the runscript)
#- It generates a lot of output, so we discard it (1>/dev/null).
#- The call would also block the shell, so we run it in the background by ending the command with "&".
singularity run instance://mongodb 1>/dev/null &
#2 run program
#- wait for the database server to come up
sleep 5
#- run script
"${py}/bin/python3" ~/mongodb_run.py
#3 stop the database after the script finishes
singularity instance stop mongodb
EOF
#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
sbatch ./mongodb_start.sbatch
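
The batch script waits a fixed five seconds for the database server to come up. On a busy node the server may need longer, so a polling loop can be more robust than a fixed sleep. The following is only a minimal sketch under stated assumptions: the helper name ''wait_for_port'' and its arguments are hypothetical and not part of the original tutorial, and it relies on bash's built-in ''/dev/tcp'' redirection, so it has to run under bash.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original tutorial):
# wait until a TCP port accepts connections.
# Usage: wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 once a connection succeeds, 1 if the timeout expires first.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-30}"
    # SECONDS is a bash variable counting seconds since the shell started.
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        # Redirecting to /dev/tcp/HOST/PORT makes bash open a TCP connection.
        # The subshell ensures the file descriptor is closed again right away.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

In the batch script, the line ''sleep 5'' could then be replaced by something like ''wait_for_port localhost 27017 60 || { singularity instance stop mongodb; exit 1; }'', with the function defined earlier in the same script. Note that the batch script is written through an unquoted here-document, so any ''$'' belonging to the helper would have to be escaped there (or the helper kept in a separate file and sourced) to avoid being expanded when the file is written.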