How do I add an additional command line tool to an already existing Docker/Singularity image?

I work in neuroscience, and I use a cloud platform called Brainlife to upload and download data (linked here, but I don't think knowledge of Brainlife is relevant to this question). I use Brainlife's command line interface to upload and download data on my university's server. In order to use their CLI, I run Singularity with a Docker image created by Brainlife (found here). I run this using the following code:
singularity shell docker://brainlife/cli -B
I also have the file saved on my server account, and can run it like this:
singularity shell brainlifeimage.sif -B
After running one of those commands, I am able to download and upload data, usually successfully. Currently I'm following Brainlife's tutorial to bulk download data. The tutorial uses the command line tool "jq" (link), which isn't on their docker image. I tried installing it within the Singularity shell like this:
apt-get install jq
And it returned:
Reading package lists... Done
Building dependency tree
Reading state information... Done
W: Not using locking for read only lock file /var/lib/dpkg/lock
E: Unable to locate package jq
Is there an easy way to add this one tool to the image? I've been reading over the Singularity and Docker documentations, but Docker is all new to me and I'm really lost.
If relevant, my university server runs on Ubuntu 16.04.7 LTS, and I am using Terminal on a Mac laptop running macOS 11.3. This is my first Stack Overflow question - please let me know if I can provide any additional info! Thanks so much.

The short, specific answer: jq is portable, so you can just mount it into the image and use it normally. e.g.,
singularity shell -B /path/to/jq:/usr/bin/jq brainlifeimage.sif
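jq ships as a statically linked binary, so if it isn't already on the host you can fetch it from the jq releases page and bind that in; the download URL below is for jq 1.6 and is illustrative:
# fetch a static jq binary onto the host
wget -O ~/jq https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
chmod +x ~/jq
# bind it over /usr/bin/jq inside the container
singularity shell -B ~/jq:/usr/bin/jq brainlifeimage.sif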
The short, general answer: you can't modify the read-only image and need to build a new one.
Long answer with several options and specific examples:
Since singularity images are read-only, they cannot have persistent changes made to them. This is great for reproducibility, but a bit inconvenient if your tools are likely to change often. You can rebuild the image in several ways, though all will require sudo permissions.
Write a new Singularity definition based on the docker image
Create a new definition file (generally called Singularity or something.def), use the current container as a base and add the desired software in the %post section. Then build the new image with: sudo singularity build brainy_jq.sif Singularity
The definition file docs are quite good and highly recommended.
Bootstrap: docker
From: brainlife/cli:latest
%post
apt-get update && apt-get install -y jq
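Once built, a quick sanity check that jq made it into the new image:
singularity exec brainy_jq.sif jq --version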
Create a sandbox of the current singularity image, make your changes, and convert back to a read-only image. See the singularity docs on writable sandbox directories and converting images between formats.
# use --sandbox to create a writable singularity image
sudo singularity build --sandbox writable_brain/ brainlifeimage.sif
# --writable must still be used to make changes, and sudo for correct permissions
sudo singularity exec --writable writable_brain/ bash -c 'apt-get update && apt-get install -y jq'
# convert back to read-only image for normal usage
sudo singularity build brainlifeimage_jq.sif writable_brain/
Modify the source docker image locally and build from that. One of the more... creative options. Almost sudo-free, except singularity pull doesn't accept docker-daemon so a sudo singularity build is necessary.
# add jq to a new docker container. the value for --name doesn't matter, but we use it
# in later steps. The entrypoint needs to be overridden in this case as well.
docker run -it --name brainlife-jq --entrypoint=/bin/bash \
brainlife/cli:1.5.25 -c 'apt-get update && apt-get install -y jq'
# use docker commit to create an image from the container so it can be reused
# note that we're using the name of the container set in the previous step
# the output of docker commit is the hash for the newly created image, so we grab that
IMAGE_ID=$(docker commit brainlife-jq)
# tag the newly created image with a more useful name
docker tag $IMAGE_ID brainlife/cli:1.5.25-jq
# here we use docker-daemon instead of docker to build from a locally cached docker image
# instead of looking at docker hub
sudo singularity build brainlife_jq.sif docker-daemon://brainlife/cli:1.5.25-jq
# now check that it all worked as planned
singularity exec brainlife_jq.sif which jq
# /usr/bin/jq
ref: docker commit, using locally cached docker images

Related

How can I use a multi-line command in a Dockerfile to create a file within the resulting image

I'm following installation instructions for RedhawkSDR, which rely on having a CentOS 7 OS. Since my machine uses Ubuntu 22.04, I'm creating a Docker container to run CentOS 7 and then installing RedhawkSDR in that.
One of the RedhawkSDR installation instructions is to create a file with the following command:
cat<<EOF|sed 's#LDIR#'`pwd`'#g'|sudo tee /etc/yum.repos.d/redhawk.repo
[redhawk]
name=REDHAWK Repository
baseurl=file://LDIR/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk
EOF
How do I get a Dockerfile to execute this command when creating an image?
(Also, although I can see that this command creates the file /etc/yum.repos.d/redhawk.repo, which consists of the lines from [redhawk] to gpgkey=...., I have no idea how to parse this command and understand exactly why it does that...)
Using the text editor of your choice, create the file on your local system. Remove the word sudo from it; give it an additional first line #!/bin/sh. Make it executable using chmod +x create-redhawk-repo.
Now it is an ordinary shell script, and in your Dockerfile you can just RUN it.
COPY create-redhawk-repo ./
RUN ./create-redhawk-repo
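For reference, here is what create-redhawk-repo looks like after those edits; it is just the question's command with sudo removed and the shebang added:
#!/bin/sh
cat <<EOF | sed 's#LDIR#'`pwd`'#g' | tee /etc/yum.repos.d/redhawk.repo
[redhawk]
name=REDHAWK Repository
baseurl=file://LDIR/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk
EOF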
But! If you look at what the script actually does, the mechanics are simple: the here-document feeds the template to sed, which substitutes the output of pwd for the LDIR placeholder, and tee writes the result to /etc/yum.repos.d/redhawk.repo. In other words, it just writes a file into /etc/yum.repos.d with a placeholder replaced by some directory path. The filesystem layout inside a Docker image is fixed, and there's no particular reason to use environment variables or build arguments to hold filesystem paths most of the time. You could use a fixed path in the file
[redhawk]
name=REDHAWK Repository
baseurl=file:///redhawk-yum/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk
and in your Dockerfile, just COPY that file in as-is, and make sure the downloaded package archive is in that directory. Adapting the installation instructions:
ARG redhawk_version=3.0.1
RUN wget https://github.com/RedhawkSDR/redhawk/releases/download/$redhawk_version/\
redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& tar xzf redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& rm redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& mv redhawk-yum-$redhawk_version-el7-x86_64 redhawk-yum \
&& rpm -i redhawk-yum/redhawk-release*.rpm
COPY redhawk.repo /etc/yum.repos.d/
Remember that, in a Dockerfile, you are root unless you've switched to another USER (and in that case you can use USER root to switch back); you generally do not need sudo in Docker at all, and can just delete sudo wherever it appears in these instructions.
How do I get a Dockerfile to execute this command when creating an image?
Just use printf and run the command as a single line:
FROM image_name:image_tag
ARG LDIR="/default/folder/if/argument/not/set"
# if container has sudo command and default user is not root
# you should choose this variant
RUN printf '[redhawk]\nname=REDHAWK Repository\nbaseurl=file://%s/\nenabled=1\ngpgcheck=1\ngpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk\n' "$LDIR" | sudo tee /etc/yum.repos.d/redhawk.repo
# if default container user is root this command without piping may be used
RUN printf '[redhawk]\nname=REDHAWK Repository\nbaseurl=file://%s/\nenabled=1\ngpgcheck=1\ngpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk\n' "$LDIR" > /etc/yum.repos.d/redhawk.repo
where LDIR is a build argument, and the docker build process should be run like:
docker build ./ --build-arg LDIR=`pwd`
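For reference, building with --build-arg LDIR=/redhawk-yum makes the printf above write exactly the fixed-path file shown in the previous answer:
[redhawk]
name=REDHAWK Repository
baseurl=file:///redhawk-yum/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk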

How do I add a root certificate and keep it only at Docker build time?

This question has two parts. The first part, how to add a root certificate, is simple; there are existing references such as How do I add a CA root certificate inside a docker image?
The second part, which is what I actually want to ask, is: how do I keep the root certificate only at Docker build time?
Maybe we can use buildctl and RUN --mount=type=secret, but that cannot cover all cases.
Say I would like to access sites with a self-signed certificate, like:
RUN curl https://x01.self-signed-site/obj01
RUN npm install --registry https://x02.self-signed-site/npm
RUN pip install -i https://x03.self-signed-site/pypi/simple
RUN mvn install
...
Thus, we need to configure the certificate for each tool (prepare the certificate, plus .npmrc, .curlrc, ...).
For curl, npm, and pip we can use environment variables, but we cannot guarantee that approach works for every other tool.
Therefore, we need to download the self-signed certificate into the image and also modify some files to apply the certificate config. How do we keep those changes only at build time, with no persistent layer in the final image?
We resolved this problem using docker save and docker load, though currently docker load does not work as we expect (see also: how to keep layers when doing `docker load`).
Anyway, below is our solution in pseudo-code:
docker save -o out.tar <image>
mkdir contents && cd contents
tar xf ../out.tar
open manifest.json, get config <hash>.json as config.json
remove target layers in:
- config.json[history]
- config.json[rootfs][diff_ids]
- manifest.json[0][Layers]
remove layer tarballs (get layer hashes from manifest.json[0][Layers]):
- <layer_hash>/*
fill gap between missing layers:
- <layer_hash_next>/json[parent] = <layer_hash_prev>
tar cf ../new.tar *
docker rmi <image>
docker load -i ../new.tar
ref: https://github.com/stallpool/track-network-traffic/blob/main/bin/docker_image_cleanup.py
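For the subset of tools that accept an explicit CA file (curl, npm, and pip all do), the RUN --mount=type=secret approach mentioned above avoids this save/load surgery entirely, because the certificate only exists on a tmpfs mount during the RUN step and never enters a layer. A minimal sketch, assuming BuildKit is enabled; selfsigned_ca is an id chosen for this example:
# syntax=docker/dockerfile:1
FROM buildpack-deps:bullseye-curl
# the cert is visible at /run/secrets/selfsigned_ca during this RUN only
RUN --mount=type=secret,id=selfsigned_ca \
    curl --cacert /run/secrets/selfsigned_ca https://x01.self-signed-site/obj01 -o /obj01
Build it with the secret supplied from a host file:
DOCKER_BUILDKIT=1 docker build --secret id=selfsigned_ca,src=ca.crt .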

How to serve a tensorflow model using docker image tensorflow/serving when there are custom ops?

I'm trying to use the tf-sentencepiece operation in my model found here https://github.com/google/sentencepiece/tree/master/tensorflow
There is no issue building the model and getting a saved_model.pb file with variables and assets. However, if I try to use the docker image for tensorflow/serving, it says
Loading servable: {name: model version: 1} failed:
Not found: Op type not registered 'SentencepieceEncodeSparse' in binary running on 0ccbcd3998d1.
Make sure the Op and Kernel are registered in the binary running in this process.
Note that if you are loading a saved graph which used ops from tf.contrib, accessing
(e.g.) `tf.contrib.resampler` should be done before importing the graph,
as contrib ops are lazily registered when the module is first accessed.
I am unfamiliar with how to build anything manually, and was hoping that I could do this without many changes.
One approach would be to:
Pull a docker development image
$ docker pull tensorflow/serving:latest-devel
In the container, make your code changes
$ docker run -it tensorflow/serving:latest-devel
Modify the code to add the op dependency here.
In the container, build TensorFlow Serving
container:$ bazel build -c opt //tensorflow_serving/model_servers:tensorflow_model_server && cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/
Use the exit command to exit the container
Look up the container ID:
$ docker ps
Use that container ID to commit the development image:
$ docker commit <container id> $USER/tf-serving-devel-custom-op
Now build a serving container using the development container as the source
$ mkdir /tmp/tfserving
$ cd /tmp/tfserving
$ git clone https://github.com/tensorflow/serving .
$ docker build -t $USER/tensorflow-serving --build-arg TF_SERVING_BUILD_IMAGE=$USER/tf-serving-devel-custom-op -f tensorflow_serving/tools/docker/Dockerfile .
You can now use $USER/tensorflow-serving to serve your model following the Docker instructions
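Once the new image is built, serving follows the standard TensorFlow Serving Docker pattern; the model path and name below are placeholders:
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/saved_model_dir,target=/models/my_model \
  -e MODEL_NAME=my_model -t $USER/tensorflow-serving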

Docker TICK Sandbox does not provide UDF Python functionality

I'm running this docker image to use the TICK Kapacitor locally.
The problem I face is that when I try to use User Defined Functions, e.g. any of these examples, I get the error message that /usr/bin/python2 does not exist.
I add the following to the kapacitor.conf:
[udf.functions]
[udf.functions.tTest]
prog = "/usr/bin/python2"
args = ["-u", "/tmp/kapacitor_udf/mirror.py"]
timeout = "10s"
[udf.functions.tTest.env]
PYTHONPATH = "/tmp/kapacitor_udf/kapacitor/udf/agent/py"
Further attempts from my side, including altering the image used to build Kapacitor so that it installs Python, work, but the agent seems to fail to compile anyway.
Is there anyone who managed to get UDFs running using the Kapacitor Docker image?
Thanks
The Docker image from the official repository (docker pull kapacitor) does not have Python installed inside. You can verify this by running a shell in the container:
PS> docker exec -it kapacitor bash
and executing one of these commands:
$ python --version
python: command not found
or
$ readlink -f $(which python) | xargs -I% sh -c 'echo -n "%:"; % -V'
readlink: missing operand
or
$ find / -type f -executable -iname 'python*'
which returns nothing. Conversely, if Python is available, the commands return its version and a list of executable files.
Note: here and below, all host command snippets are given for PowerShell on Windows, and all command snippets inside the Docker container are given for bash, since Linux images are used.
Basically, there are two options to get a kapacitor image with Python inside to execute UDFs:
Install Python in the kapacitor image, i.e. build a new Docker image from the kapacitor image itself. An example can be found here:
Build a new version of the kapacitor image from one of the official Python images
The second option is more natural, as you get a consistent Python installation and reuse the work of installing Python that has already been done by the Docker community.
So, following option 2, we'll:
Examine the Dockerfile of the official kapacitor image
Choose an appropriate Python image
Create a new project and Dockerfile for kapacitor
Build and check the kapacitor image
Examine the Dockerfile of the official kapacitor image
General note:
For any image, the original Dockerfiles can be obtained in this way:
https://hub.docker.com/
-> Description Tab
-> Supported tags and respective Dockerfile links section
-> each of the tags is a link that leads to the Dockerfile
So for kapacitor everything is in the influxdata-docker git repository
Then in the Dockerfile we find that the image is created based on
FROM buildpack-deps:stretch-curl
here:
buildpack-deps
the image provided by the project of the same name https://hub.docker.com/_/buildpack-deps
curl
This variant includes just the curl, wget, and ca-certificates packages. This is perfect for cases like the Java JRE, where downloading JARs is very common and necessary, but checking out code isn't.
stretch
short version name of the OS, in this case Debian 9 stretch https://www.debian.org/News/2017/20170617
Buildpack-deps images are in turn built based on
FROM debian:stretch
and Debian images from the minimal Docker image
FROM scratch
Choose an appropriate Python image
Among the Python images, for example 3.7, you can find versions inheriting from buildpack-deps:
FROM buildpack-deps:stretch
Following the inheritance chain, we'll see:
FROM buildpack-deps:stretch
FROM buildpack-deps:stretch-scm
FROM buildpack-deps:stretch-curl
FROM debian:stretch
In other words, the python:3.7-stretch image only adds functionality on top of the same Debian base the kapacitor image uses.
This means we can rebuild the kapacitor image on top of python:3.7-stretch with no risk of incompatibility.
Docker context folder preparation
Clone the repository
https://github.com/influxdata/influxdata-docker.git
Create the folder influxdata-docker/kapacitor/1.5/udf_python/python3.7
Copy the following three files into it from influxdata-docker/kapacitor/1.5/:
Dockerfile
entrypoint.sh
kapacitor.conf
In the copied Dockerfile, replace FROM buildpack-deps:stretch-curl with FROM python:3.7-stretch
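After that substitution, the first line of the copied Dockerfile should read as follows (the rest of the file stays unchanged):
FROM python:3.7-stretch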
Be careful! If you work on Windows and, out of scientific curiosity, open the entrypoint.sh file in the project folder, be sure to check that your editor does not change the line endings from the Linux style (LF) to the Windows style (CRLF).
Otherwise, when you start the container later, you get an error in the container log:
exec: bad interpreter: No such file or directory
or, if you start debugging and run the container with bash, you will see:
root@d4022ac550d4:/# exec /entrypoint.sh
bash: /entrypoint.sh: /bin/bash^M: bad interpreter: No such file or directory
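If an editor has already converted the file, the Linux line endings can be restored with a sed one-liner (dos2unix also works, if installed):
sed -i 's/\r$//' entrypoint.sh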
Building
Run PS> docker build -f .\Dockerfile -t kapacitor_python_udf .
Again, in the case of a Windows environment:
If during the build execution an error occurs of the form:
E: Release file for http://security.ubuntu.com/ubuntu/dists/bionic-security/InRelease is not valid yet (invalid for another 9h 14min 10s). Updates for this repository will not be applied.
then your computer clock probably went out of sync and/or Docker Desktop incorrectly initialized the time after the system returned from sleep (see the issue).
To fix it, restart Docker Desktop and/or go to Windows settings -> Date and time settings -> Clock synchronization -> perform Sync.
You can also read more here
Launch and check
Launch the container with the same actions as for the standard image. Example:
PS> docker run --name=kapacitor -d `
--net=influxdb-network `
-h kapacitor `
-p 9092:9092 `
-e KAPACITOR_INFLUXDB_0_URLS_0=http://influxdb:8086 `
-v ${PWD}:/var/lib/kapacitor `
-v ${PWD}/kapacitor.conf:/etc/kapacitor/kapacitor.conf:ro `
kapacitor
Check:
PS> docker exec -it kapacitor bash
$ python --version
Python 3.7.7
$ readlink -f $(which python) | xargs -I% sh -c 'echo -n "%:"; % -V'
/usr/local/bin/python3.7: Python 3.7.7

Yum update fails - CentOS 7 - docker build

I have frequently built Docker containers using CentOS 7 as the base image. But now I am getting an error when I run:
RUN yum update add \
bash \
&& rm -rfv /var/cache/apk/*
ERROR:
Loaded plugins: fastestmirror, ovl
One of the configured repositories failed (Unknown),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
Contact the upstream for the repository and get them to fix the problem.
Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
Run the command with the repository temporarily disabled
yum --disablerepo=<repoid> ...
Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:
yum-config-manager --disable <repoid>
or
`subscription-manager repos --disable=<repoid>`
Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
Cannot find a valid baseurl for repo: base/7/x86_64 Could not retrieve
mirrorlist
http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container
error was 14: curl#6 - "Could not resolve host: mirrorlist.centos.org;
Name or service not known" The command '/bin/sh -c yum update add
bash && rm -rfv /var/cache/apk/*' returned a non-zero code: 1
I also saw a few resolutions suggesting "dhclient", but this error happens when I do docker-compose build.
I ran into this problem attempting to run the same Dockerfile, which fetched several software packages using yum, on two different platforms; one macOS, the other an Ubuntu 16.04-based Linux OS (elementaryOS Loki), both using the official packages from docker.com.
My theory is that the Linux package is just more restrictive out of the box, security-wise, than the macOS one. Maybe this is configurable with some kind of /etc/something config file, but I don't have the expertise with Docker to say for sure. EDIT: See my comment below.
What I can say is there was no additional configuration required for me on macOS (10.11 El Capitan); just docker build . worked fine, and yum processes from the Dockerfile were able to reach all the remote repositories.
In the Ubuntu-derived Linux distro, however, it was necessary to use
docker build --network host .
followed by
docker run -it --network host <image> <command>
when I wanted to run a process inside that image which required internet access.
This may be the case for other Debian-derived systems as well.
There are, of course, security considerations which need to be taken into account when allowing a long-running Docker container to communicate through the host network adapter, unrestricted, and one would do well to review the appropriate documentation in that regard.
My assumption is that for some reason network behavior in docker varies based on distribution.
Try to use:
docker run -d --net mybridge centos
or
docker network create -d bridge mybridge
docker run -d --net mybridge centos
It should start working. Or just edit /etc/hosts and add the mirror's address:
Name: mirrorlist.centos.org
Address: 67.219.148.138
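In /etc/hosts format that is a single line (the IP shown is the one from above; mirror IPs can change over time):
67.219.148.138 mirrorlist.centos.org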
The root cause of the issue was that the container's proxy settings were wrong. Correcting the proxy settings in the file below fixed it:
/root/.docker/config.json
