Changing the $HOME folder in a Singularity container - docker

Please would you advise: given a Singularity container, how can I copy files from the local drive into the Singularity container?
I am using a Singularity container that is described at:
https://hub.docker.com/r/tobneu/slamdunk
(from the Docker image I have built a Singularity image for a SLURM cluster)
I have searched Stack Overflow for answers, but I have only found the answer to the reverse question, i.e. copying files from the Singularity container to the local drive:
https://stackoverflow.com/questions/59736299/transferring-files-from-the-singularity-container-into-the-local-directory
Thanks a lot,
Bogdan

As mentioned in your previous question, Singularity has the -H/--home command-line parameter:
singularity exec -H /labs/zzz/data my_image.sif bash -c 'echo "HOME=$HOME";echo "PWD=$PWD"'
# HOME=/labs/zzz/data
# PWD=...

I shall rephrase the question, as I have noticed that the Singularity container sees the $HOME folder that I have on the SLURM cluster (i.e. /home/btanasa).
If I may re-phrase the question: how shall I change the $HOME folder that the Singularity container sees, for example to /labs/zzz/data? Thanks a million!

Finally, I have got it to work. Should anyone need to know the answer to the question in the title, it is outlined below, using --bind and --home:
singularity exec \
--bind /local/scratch/btanasa:/output8 \
--home /labs/jlgoldbe/MASSY_data_SLAMseq/the_SAMPLES_MAY2021:/home \
/labs/jlgoldbe/MASSY_data_SLAMseq/the_SAMPLES_MAY2021/SLAMDUNK_SINGULARITY/slamdunk_latest.sif slamdunk all \
-r GRCm38.primary_assembly.genome.fa \
-b 3UTRs_vM14_github_repository.27aug2020.sortdesc.LONG.with.SYMBOLS.to.use.bed \
-o /output8 \
-t 4 \
./8_R1_001.fastq.gz
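As a minimal sketch of the same pattern (the paths here are hypothetical): --home rebases $HOME onto a host directory, --bind adds extra mounts, and anything written under $HOME inside the container lands in the host directory:
singularity exec \
  --home /data/project:/home/user \
  --bind /scratch:/scratch \
  my_image.sif bash -c 'echo "HOME=$HOME"; touch "$HOME/marker"'
# HOME=/home/user
# /data/project/marker now exists on the host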

Related

Singularity arguments conflict with my bioinformatics tool arguments

EDIT: the documentation given by the IT administration described an old version of Singularity; in the current version the order of arguments is different, and the problem is solved.
To make my tool more portable, and because I have to use it on a cluster, I want to make my bioinformatics tool available as a Docker image. The tool is located here. The Docker Hub image is 007ptar007/metadbgwas, if you want to experiment with it. The Dockerfile is in the repo; to make it easier for everyone:
FROM ubuntu:latest
ENV DEBIAN_FRONTEND=noninteractive
USER root
COPY ./install_docker.sh ./
RUN chmod +x ./install_docker.sh && sh ./install_docker.sh
ENTRYPOINT ["/MetaDBGWAS/metadbgwas.sh"]
ENV PATH="/MetaDBGWAS/:${PATH}"
And the install_docker.sh script contains:
apt-get update
apt install -y libgatbcore-dev libhdf5-dev libboost-all-dev libpstreams-dev zlib1g-dev g++ cmake git r-base-core
Rscript -e "install.packages(c('ape', 'phangorn'))"
Rscript -e "install.packages('https://raw.githubusercontent.com/sgearle/bugwas/master/build/bugwas_1.0.tar.gz', repos=NULL, type='source')"
git clone --recursive https://github.com/Louis-MG/MetaDBGWAS.git
cd MetaDBGWAS
sed -i "51i#include <limits>" ./REINDEER/blight/robin_hood.h #temporary fix for REINDEER compilation
sh install.sh
The problem:
My tool parses the command line and needs a verbose (-v or --verbose) argument. It also rejects unknown arguments; anything the tool does not recognize causes the help message to be printed to standard output and the program to exit. To use the tool, I need to mount volumes where the data is, using the -v /path/to/files:/input option:
singularity run docker://007ptar007/metadbgwas --volumes '/path/to/data:/input' --files /input --strains /input/strains --threads 8 --output ~/output
But my tool sees this as a bad -v option value, or sees --volumes as an unknown option. I can't change this in my tool. How do I solve this conflict?
You need to put any arguments intended for Singularity itself, such as the volume mounting, before the name of the image you want to run (here, the Docker image you specify in your command). Note that Singularity's bind flag is -B/--bind, which also sidesteps the clash with your tool's -v:
singularity run -B '/path/to/data:/input' docker://007ptar007/metadbgwas --files /input --strains /input/strains --threads 8 --output ~/output

Avoiding duplicated arguments when running a Docker container

I have a TensorFlow training script which I want to run using a Docker container (based on the official TF GPU image). Although everything works just fine, running the container with the script is horribly verbose and ugly. The main problem is that my training script allows the user to specify various directories used during training: for input data, logging, generated output, etc. I don't want to change what my users are used to, so the container needs to be informed of the locations of these user-defined directories so it can mount them. So I end up with something like this:
docker run \
-it --rm --gpus all -d \
--mount type=bind,source=/home/guest/datasets/my-dataset,target=/datasets/my-dataset \
--mount type=bind,source=/home/guest/my-scripts/config.json,target=/config.json \
-v /home/guest/my-scripts/logdir:/logdir \
-v /home/guest/my-scripts/generated:/generated \
train-image \
python train.py \
--data_dir /datasets/my-dataset \
--gpu 0 \
--logdir ./logdir \
--output ./generated \
--config_file ./config.json \
--num_epochs 250 \
--batch_size 128 \
--checkpoint_every 5 \
--generate True \
--resume False
In the above I am mounting a dataset from the host into the container, and also mounting a single config file config.json (which configures the TF model). I specify a logging directory logdir and an output directory generated as volumes. Each of these resources is also passed as a parameter to the train.py script.
This is all very ugly, but I can't see another way of doing it. Of course I could put all of this in a shell script and provide command-line arguments which set these duplicated values from the outside. But this doesn't seem like a nice solution, because if I wanted to do anything else with the container, for example check the logs, I would have to fall back to the raw docker command.
I suspect this question will likely be tagged as opinion-based, but I've not found a good solution for this that I can recommend to my users.
As user Ron van der Heijden points out, one solution is to use docker-compose in combination with environment variables defined in a .env file. Nice answer.
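A minimal sketch of that idea, with hypothetical file contents: the paths live once in a .env file next to the docker-compose.yml, and Compose substitutes them into both the mounts and the training command.
# .env
DATA_DIR=/home/guest/datasets/my-dataset
LOG_DIR=/home/guest/my-scripts/logdir
# docker-compose.yml
services:
  train:
    image: train-image
    volumes:
      - ${DATA_DIR}:/datasets/my-dataset
      - ${LOG_DIR}:/logdir
    command: >
      python train.py
      --data_dir /datasets/my-dataset
      --logdir /logdir
      --num_epochs 250
Users then only run docker-compose up train, and checking the logs stays a plain docker-compose logs train.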

Is it possible to add an installer, run it and delete it during one build step in Docker?

I'm trying to create a Docker image from a pretty large installer binary (300+ MB). I want to add the installer to the image, install it, and delete the installer. This doesn't seem to be possible:
COPY huge-installer.bin /tmp
RUN /tmp/huge-installer.bin
RUN rm /tmp/huge-installer.bin # <- has no effect on the image size
Using multiple build stages doesn't seem to solve this, since I need to run the installer in the final image. If I could execute the installer directly from a previous build stage, without copying it, that would solve my problem, but as far as I know that's not possible.
Is there any way to avoid including the full weight of the installer in the final image?
I ended up solving this by using the built-in HTTP server in Python to make the project directory available to the image over HTTP.
Inside the Dockerfile, I can run commands like this, piping scripts directly to bash using curl:
RUN curl "http://127.0.0.1:${SERVER_PORT}/installer-${INSTALLER_VERSION}.bin" | bash
Or save binaries, run them and delete them in one step:
RUN curl -O "http://127.0.0.1:${SERVER_PORT}/binary-${INSTALLER_VERSION}.bin" && \
./binary-${INSTALLER_VERSION}.bin && \
rm binary-${INSTALLER_VERSION}.bin
I use a Makefile to start the server and stop it after the build, but you can use a build script instead.
Here's a Makefile example:
SHELL := bash
IMAGE_NAME := app-test
VERSION := 1.0.0
SERVER_PORT := 8580

.ONESHELL:
.PHONY: build
build:
	# Kill the HTTP server when the build is done
	function cleanup {
		pkill -f "python3 -m http.server.*${SERVER_PORT}"
	}
	trap cleanup EXIT
	# Start an HTTP server that makes the contents of the project
	# directory available to the image
	python3 -m http.server -b 127.0.0.1 ${SERVER_PORT} &>/dev/null &
	sleep 1
	EXTRA_ARGS=""
	# Allow skipping the build cache by setting NO_CACHE=1
	if [[ -n $$NO_CACHE ]]; then
		EXTRA_ARGS="--no-cache"
	fi
	docker build $$EXTRA_ARGS \
		--network host \
		--build-arg SERVER_PORT=${SERVER_PORT} \
		-t ${IMAGE_NAME}:latest \
		.
	docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${VERSION}
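Usage then looks like this:
# normal build
make build
# bypass the Docker build cache
NO_CACHE=1 make build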
I think the best way is to download the binary from a website, then run it and delete it within the same RUN step:
RUN wget -P /tmp http://myweb/huge-installer.bin \
 && chmod +x /tmp/huge-installer.bin \
 && /tmp/huge-installer.bin \
 && rm /tmp/huge-installer.bin
This way the image layers will not contain the binary you downloaded, because the download, execution, and removal all happen inside a single layer.
I didn't test it thoroughly, but wouldn't an approach like the following be viable? (Besides LinPy's answer, which is much easier if you are able to do it that way.)
Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /tmp/entrypoint.sh
RUN \
echo "I am an image that can run your huge installer binary!" \
&& echo "I will only function when you give it to me as a volume mount."
ENTRYPOINT [ "/tmp/entrypoint.sh" ]
entrypoint.sh:
#!/bin/sh
/tmp/your-installer # install your stuff here
while true; do
echo "installer finished, commit me now!"
sleep 5
done
Then run:
$ docker build -t foo-1 .
$ docker run --rm --name foo-1 -d -v $(pwd)/your-installer:/tmp/your-installer foo-1
$ docker logs -f foo-1
# once it echoes "commit me now!", run the next command
$ docker commit foo-1 foo-2
$ docker stop foo-1
Since the installer was only mounted as a volume, the image foo-2 should not contain it anymore. You could also build another Dockerfile based on foo-2 to change the entrypoint, for example.
Cf. docker commit
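One more note for newer Docker versions: with BuildKit, a file from an earlier build stage can be bind-mounted into a single RUN step without ever being copied into a layer of the final image, which addresses the original question directly. A minimal sketch (the installer name is hypothetical, and the installer is assumed to be runnable via bash):
# syntax=docker/dockerfile:1
FROM busybox AS installer
COPY huge-installer.bin /huge-installer.bin

FROM ubuntu
# The mount exists only for the duration of this RUN step,
# so the binary never enters the final image's layers.
RUN --mount=type=bind,from=installer,source=/huge-installer.bin,target=/tmp/huge-installer.bin \
    bash /tmp/huge-installer.bin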

Manually installing SonarQube plugins on Docker image

I want to create my custom SonarQube Docker image with some plugins already installed, but every time I run my container the plugins are not there. It's like something removes the plugins from /opt/sonarqube/extensions/plugins and copies the bundled plugins there.
My Dockerfile:
FROM sonarqube
ENV SONARQUBE_HOME /opt/sonarqube
RUN wget "http://downloads.sonarsource.com/plugins/org/codehaus/sonar-plugins/sonar-scm-git-plugin/1.1/sonar-scm-git-plugin-1.1.jar" \
&& wget "https://github.com/SonarSource/sonar-java/releases/download/3.12-RC2/sonar-java-plugin-3.12-build4634.jar" \
&& wget "https://github.com/SonarSource/sonar-github/releases/download/1.1-M9/sonar-github-plugin-1.1-SNAPSHOT.jar" \
&& wget "https://github.com/SonarSource/sonar-auth-github/releases/download/1.0-RC1/sonar-auth-github-plugin-1.0-SNAPSHOT.jar" \
&& wget "https://github.com/QualInsight/qualinsight-plugins-sonarqube-badges/releases/download/qualinsight-plugins-sonarqube-badges-1.2.1/qualinsight-sonarqube-badges-1.2.1.jar" \
&& mv *.jar $SONARQUBE_HOME/extensions/plugins \
&& ls -lah $SONARQUBE_HOME/extensions/plugins
I tried listing the folder, and it lists my desired plugins. But if I list the same folder after I started the container, they are gone.
I've also tried removing the bundled-plugins with no luck.
Any ideas?
The SonarQube image uses a volume for the /extensions/ directory, which means files in that directory are not stored in the image's filesystem; see the Dockerfile.
To package those extensions in your image, you need to place them outside that directory and copy them into /extensions/ in an entrypoint script, or store your plugins in a separate image and mount them as a volume when running the image; you can find an example of the latter here: https://github.com/SonarSource/docker-sonarqube/blob/master/recipes.md
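A minimal sketch of the entrypoint-script variant (the paths, file names, and the original start command are assumptions that vary by image version):
# Dockerfile: stage the plugins outside the volume-backed directory
FROM sonarqube
COPY plugins/*.jar /opt/sonarqube/staged-plugins/
COPY custom-entrypoint.sh /custom-entrypoint.sh
ENTRYPOINT ["/custom-entrypoint.sh"]
# custom-entrypoint.sh
#!/bin/bash
# Copy the staged plugins into the volume at container start,
# then hand off to the image's original start command.
cp /opt/sonarqube/staged-plugins/*.jar /opt/sonarqube/extensions/plugins/
exec ./bin/run.sh "$@"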
Note that the accepted answer is no longer true: it should work if you directly use a recent parent sonarqube image. So if the Dockerfile mentioned in the question does not work for you, you have a different problem.
See commit 80366e3419d698b4bba4447f153418ef64b3b705 for more info.
Remove volume for "conf", "logs" and "extensions" directories
Explicit declaration of volume is appropriate only for data stored by
application, volume for configurable things is almost never
appropriate (see docker-library/official-images#2437).
This reverts commit 69fca2e. And additionally removes declaration of
volume for "extensions" directory.
Before SonarQube 5.6, plugins were stored in a volume, so the appropriate commands are (after starting the sonarqube container):
wget https://sonarsource.bintray.com/Distribution/sonar-auth-github-plugin/sonar-auth-github-plugin-1.2.jar -P `docker inspect -f '{{ (index .Mounts 1).Source }}' sonarqube`/plugins
docker restart sonarqube

How do I edit a file after I shell to a Docker container?

I successfully shelled into a Docker container using:
docker exec -i -t 69f1711a205e bash
Now I need to edit a file, and I don't have any editors inside:
root@69f1711a205e:/# nano
bash: nano: command not found
root@69f1711a205e:/# pico
bash: pico: command not found
root@69f1711a205e:/# vi
bash: vi: command not found
root@69f1711a205e:/# vim
bash: vim: command not found
root@69f1711a205e:/# emacs
bash: emacs: command not found
root@69f1711a205e:/#
How do I edit files?
As noted in the comments, there's no default editor set - strangely, the $EDITOR environment variable is empty. You can log in to a container with:
docker exec -it <container> bash
And run:
apt-get update
apt-get install vim
Or use the following Dockerfile:
FROM confluent/postgres-bw:0.1
RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "vim"]
Docker images are delivered trimmed to the bare minimum, so no editor is installed in the shipped container. That's why you need to install it manually.
EDIT: I also encourage you to read my post on the topic.
If you don't want to add an editor just to make a few small changes (e.g., change the Tomcat configuration), you can just use:
docker cp <container>:/path/to/file.ext .
which copies it to your local machine (to your current directory). Then edit the file locally using your favorite editor, and then do a
docker cp file.ext <container>:/path/to/file.ext
to replace the old file.
You can use cat if it's installed, which will most likely be the case unless it's a really bare container. It works in a pinch, and is OK when copy-pasting into a proper editor locally.
cat > file
# 1. type in your content
# 2. leave a newline at the end of the file
# 3. press Ctrl-D (or Ctrl-C) to finish
cat file
cat writes out each line as soon as it receives the newline, so make sure the last line ends with one. Ctrl-D denotes end-of-file ("no more input coming") and lets cat exit cleanly; Ctrl-C sends a SIGINT, which also stops cat, and everything typed so far has already been written.
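A variant of the same trick that avoids the interactive keystrokes is a here-document, which feeds cat a fixed block of text (the file name here is hypothetical):
cat > /tmp/notes.txt <<'EOF'
first line
second line
EOF
cat /tmp/notes.txt  # verify the contents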
Another option is something like infilter, which injects a process into the container namespace with some ptrace magic: https://github.com/yadutaf/infilter
To keep your Docker images small, don't install unnecessary editors. You can edit the files over SSH, from the Docker host into the container (this assumes an SSH server is running in the container):
vim scp://remoteuser@containerip//path/to/document
You can use cat, if it is installed, with the > character. Here is the procedure:
cat > file_to_edit
# 1. write or paste your text
# 2. don't forget to leave a blank line at the end of the file
# 3. press Ctrl-C (or Ctrl-D) to finish
Now you can see the result with the command
cat file_to_edit
For common edit operations I prefer to install vi (vim-tiny), which uses only 1491 kB, or nano, which uses 1707 kB. Full vim, on the other hand, uses 28.9 MB.
Remember that for apt-get install to work, you have to run the update first:
apt-get update
apt-get install vim-tiny
To start the editor, run vi.
You can open an existing file with
cat filename.extension
and copy all the existing text to the clipboard.
Then delete the old file with
rm filename.extension
or rename it with
mv old-filename.extension new-filename.extension
Create the new file with
cat > new-file.extension
Then paste all the text from the clipboard, press Enter, and save and exit with Ctrl+D (note that Ctrl+Z would only suspend the process). And voilà, no need to install any editor.
Sometimes you must first enter the container as root:
docker exec -ti --user root <container-id> /bin/bash
Then in the container, to install Vim or something else:
apt-get install vim
I use "docker run" (not "docker exec"), and I'm in a restricted zone where we cannot install an editor. But I have an editor on the Docker host.
My workaround is: Bind mount a volume from the Docker host to the container (https://docs.docker.com/engine/reference/run/#/volume-shared-filesystems), and edit the file outside the container. It looks like this:
docker run -v /outside/dir:/container/dir
This is mostly for experimenting, and later I'd change the file when building the image.
After you shell into the Docker container, just type:
apt-get update
apt-get install nano
You can just edit your file on the host, quickly copy it into the container, and run it there. Here is my one-line shortcut to copy and run a Python file:
docker cp main.py my-container:/data/scripts/ ; docker exec -it my-container python /data/scripts/main.py
If you use a Windows container and want to change a file, you can get and use Vim in the PowerShell console easily.
To shell into the Windows Docker container with PowerShell:
docker exec -it <name> powershell
First install the Chocolatey package manager:
Invoke-WebRequest https://chocolatey.org/install.ps1 -UseBasicParsing | Invoke-Expression;
Install Vim:
choco install vim
Refresh the environment variables (run refreshenv, or simply exit and shell back into the container).
Go to the file's location and edit it: vim file.txt
sed can edit a file in place. It is a good option here when:
the file is too large to retype with cat, and
installing Vim is not allowed or would take too long.
My situation: using the MySQL 5.7 image, I wanted to change the my.cnf file; there is no vim or vi, and installing Vim takes too long (the Great Firewall of China). sed is provided in the image, and the usage is quite simple:
sed -i 's/testtobechanged/textwanted/g' filename
Use man sed or look for other tutorials for more complex usage.
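For instance, a sketch of the my.cnf case (the option names and values here are hypothetical):
# change an existing setting in place
sed -i 's/^max_connections.*/max_connections = 500/' /etc/mysql/my.cnf
# append a new line right after the [mysqld] section header
sed -i '/\[mysqld\]/a skip-name-resolve' /etc/mysql/my.cnf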
It is kind of screwy, but in a pinch you can use sed or awk to make small edits or remove text. Be careful with your regex targets, of course, and be aware that you're likely root in the container and might have to readjust file permissions.
For example, removing a full line that contains text matching a regex:
awk '!/targetText/' file.txt > temp && mv temp file.txt
If you can only shell into the container with /bin/sh (in case /bin/bash doesn't work), and apt or apt-get doesn't work in the container, check whether apk is available by entering apk at the command prompt inside the container.
If it is, you can install nano as follows:
apk add nano
Then nano will work as usual.
An easy way to edit a few lines would be:
echo "deb http://deb.debian.org/debian stretch main" > sources.list
You can install nano
yum install nano
You can also use a special container which contains only the command you need: Vim. I chose python-vim. It assumes that the data you want to edit is in a data container built with the following Dockerfile:
FROM debian:jessie
ENV MY_USER_PASS my_user_pass
RUN groupadd --gid 1001 my_user
RUN useradd -ms /bin/bash --home /home/my_user \
-p $(echo "print crypt("${MY_USER_PASS:-password}", "salt")" | perl) \
--uid 1001 --gid 1001 my_user
ADD src /home/my_user/src
RUN chown -R my_user:my_user /home/my_user/src
RUN chmod u+x /home/my_user/src
CMD ["true"]
You will be able to edit your data by mounting a Docker volume (src_volume) which is shared by your data container (src_data_1) and the python-vim container.
docker volume create --name src_volume
docker build -t src_data .
docker run -d -v src_volume:/home/my_user/src --name src_data_1 src_data
docker run --rm -it -v src_volume:/src fedeg/python-vim:latest
That way, you do not change your containers. You just use a special container for this work.
First, log in as root:
docker run -u root -ti <image-name> bash
Type the following commands:
apt-get update &&
apt-get install nano
Docker images come with no editors, so simply install vim; 36 MB of space won't kill your container!
Make sure to update the package lists before trying to install the editor:
apt-get update
apt-get install nano vim
