how to create parent docker image with running host operating system? - docker

I want to create a parent image with running host.
Over docker website it is written that we can do it, but I wanted to know the steps which are missing from this page.
https://docs.docker.com/develop/develop-images/baseimages/
Even it is not correct for docker environment still I need it for one POC.
I can see option to create image from tar file like:
FROM scratch
MAINTAINER \
[Adam Miller <maxamillion#fedoraproject.org>] \
[Patrick Uiterwijk <patrick#puiterwijk.org>]
ENV DISTTAG=f30container FGC=f30 FBR=f30
ADD fedora-30-x86_64-20180906.tar.xz /
Can I create a tar file of root file system with normal tar command and use to create parent image ?

Related

How can I use a several line command in a Dockerfile in order to create a file within the resulting Image

I'm following installation instructions for RedhawkSDR, which rely on having a Centos7 OS. Since my machine uses Ubuntu 22.04, I'm creating a Docker container to run Centos7 then installing RedhawkSDR in that.
One of the RedhawkSDR installation instructions is to create a file with the following command:
cat<<EOF|sed 's#LDIR#'`pwd`'#g'|sudo tee /etc/yum.repos.d/redhawk.repo
[redhawk]
name=REDHAWK Repository
baseurl=file://LDIR/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk
EOF
How do I get a Dockerfile to execute this command when creating an image?
(Also, although I can see that this command creates the file /etc/yum.repos.d/redhawk.repo, which consists of the lines from [redhawk] to gpgkey=...., I have no idea how to parse this command and understand exactly why it does that...)
Using the text editor of your choice, create the file on your local system. Remove the word sudo from it; give it an additional first line #!/bin/sh. Make it executable using chmod +x create-redhawk-repo.
Now it is an ordinary shell script, and in your Dockerfile you can just RUN it.
COPY create-redhawk-repo ./
RUN ./create-redhawk-repo
But! If you look at what the script actually does, it just writes a file into /etc/yum.repos.d with a LDIR placeholder replaced with some other directory. The filesystem layout inside a Docker image is fixed, and there's no particular reason to use environment variables or build arguments to hold filesystem paths most of the time. You could use a fixed path in the file
[redhawk]
name=REDHAWK Repository
baseurl=file:///redhawk-yum/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk
and in your Dockerfile, just COPY that file in as-is, and make sure the downloaded package archive is in that directory. Adapting the installation instructions:
ARG redhawk_version=3.0.1
RUN wget https://github.com/RedhawkSDR/redhawk/releases/download/$redhawk_version/\
redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& tar xzf redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& rm redhawk-yum-$redhawk_version-el7-x86_64.tar.gz \
&& mv redhawk-yum-$redhawk_version-el7-x86_64 redhawk-yum \
&& rpm -i redhawk-yum/redhawk-release*.rpm
COPY redhawk.repo /etc/yum.repos.d/
Remember that, in a Dockerfile, you are root unless you've switched to another USER (and in that case you can use USER root to switch back); you do not need generally sudo in Docker at all, and can just delete sudo where it appears in these instructions.
How do I get a Dockerfile to execute this command when creating an image?
Just use printf and run this command as single line:
FROM image_name:image_tag
ARG LDIR="/default/folder/if/argument/not/set"
# if container has sudo command and default user is not root
# you should choose this variant
RUN printf '[redhawk]\nname=REDHAWK Repository\nbaseurl=file://%s/\nenabled=1\ngpgcheck=1\ngpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk\n' "$LDIR" | sudo tee /etc/yum.repos.d/redhawk.repo
# if default container user is root this command without piping may be used
RUN printf '[redhawk]\nname=REDHAWK Repository\nbaseurl=file://%s/\nenabled=1\ngpgcheck=1\ngpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhawk\n' "$LDIR" > /etc/yum.repos.d/redhawk.repo
Where LDIR is an argument and docker build process should be run like:
docker build ./ --build-arg LDIR=`pwd`

Building image with Dockerfile is not using existing layers?

I'm quite new to Dockerfiles and not sure whats going on here. I'm using Podman and building a new image with a Dockerfile to set some additional permissions using a base image from Docker Hub. According to the layers (#8) in this base image it seems that the variabele SDKMAN_DIR is being set to /home/javaUser/.sdkman.
/bin/sh -c export SDKMAN_DIR="/home/javaUser/.sdkman" && curl -s "https://get.sdkman.io" | bash
However, when I build a new image based on the newly created image with the Dockerfile, start a new container, exec into container and execute the command echo $SDKMAN_DIR, the result is empty. I was expecting that my variant of the image had the same variabels set.
Can anyone tell me what I'm missing here?
bash-5.1$ echo $SDKMAN_DIR
Below you'll find the Dockerfile I'm using.
FROM docker.io/itext/dito-sdk:2.4.5
# Make available to rootless users
USER root
RUN chmod g+x /home/javaUser/.sdkman/bin/sdkman-init.sh
# Become rootless user
USER 2000
EXPOSE 8080
ENTRYPOINT /bin/bash -c "source /home/javaUser/.sdkman/bin/sdkman-init.sh && kotlin /opt/dito/startup-prepare.main.kts && java -jar /opt/dito/dito-sdk-docker.jar server /etc/opt/dito/config.yml"
Thanks in advance.
Edit
Somehow I got it to work by passing in an ENV. Still I'm not exactly sure why the variable is not available without explicitly defining an ENV variable. Does this has to do with not being the root but 2000 user after an exec into the container?
Edit
The variable is not available at runtime because of the fact that the variabele is being declared in the RUN step, and therefore only limited to that scope. Thanks to #DavidMaze for clarifying.

how do I add root certificate and keep only in docker build time?

the question has 2 parts, the 1st part: how to add root certificate? is simple and we can have reference from like How do I add a CA root certificate inside a docker image?
the 2nd part, which is what I actually want to ask, is: how to keep the root certificate only in docker build time?
maybe we can use buildctl and RUN --mount=type=secret; but it cannot cover all cases.
say I would like to pass sites with self-signed certificate like:
RUN curl https://x01.self-signed-site/obj01
RUN npm install --registry https://x02.self-signed-site/npm
RUN pip install -i https://x03.self-signed-site/pypi/simple
RUN mvn install
...
thus, we need to config certificate for each tool:
(prepare certificate and prepare .npmrc, .curlrc, ...)
(for, curl, npm, pip, we can use env vars; but we cannot guarantee we can use this way for other tools)
therefore, we need to download self-signed certificate into image and also modify some files to apply the cert config. how to keep the change only in build time (no persistent layer in final image)?
we resolved this problem by using docker save and docker load; but currently, docker load does not work as we expect (see also how to keep layers when do `docker load`)
anyway, below is our solution in pseudo-code:
docker save -o out.tar <image>
mkdir contents && cd contents
tar xf ../out.tar
open manifest.json, get config <hash>.json as config.json
remove target layers in:
- config.json[history]
- config.json[rootfs][diff_ids]
- manifest.json[0][Layers]
remove layer tarballs (get layer_hashes from maniefst.josn[0][Layers]):
- <layer_hash>/*
fill gap between missing layers:
- <layer_hash_next>/json[parent] = <layer_hash_prev>
tar cf ../new.tar *
docker rmi <image>
docker load -i ../new.tar
ref: https://github.com/stallpool/track-network-traffic/blob/main/bin/docker_image_cleanup.py

How do I add an additional command line tool to an already existing Docker/Singularity image?

I work in neuroscience, and I use a cloud platform called Brainlife to upload and download data (linked here, but I don't think knowledge of Brainlife is relevant to this question). I use Brainlife's command line interface to upload and download data on my university's server. In order to use their CLI, I run Singularity with a Docker image created by Brainlife (found here). I run this using the following code:
singularity shell docker://brainlife/cli -B
I also have the file saved on my server account, and can run it like this:
singularity shell brainlifeimage.sif -B
After running one of those commands, I am able to download and upload data, usually successfully. Currently I'm following Brainlife's tutorial to bulk download data. The tutorial uses the command line tool "jq" (link), which isn't on their docker image. I tried installing it within the Singularity shell like this:
apt-get install jq
And it returned:
Reading package lists... Done
Building dependency tree
Reading state information... Done
W: Not using locking for read only lock file /var/lib/dpkg/lock
E: Unable to locate package jq
Is there an easy way to add this one tool to the image? I've been reading over the Singularity and Docker documentations, but Docker is all new to me and I'm really lost.
If relevant, my university server runs on Ubuntu 16.04.7 LTS, and I am using terminal on a Mac laptop running MacOS 11.3. This is my first stack overflow question - please let me know if i can provide any additional info! Thanks so much.
The short, specific answer: jq is portable, so you can just mount it into the image and use it normally. e.g.,
singularity shell -B /path/to/jq:/usr/bin/jq brainlifeimage.sif
The short, general answer: you can't modify the read only image and need to build a new one.
Long answer with several options and specific examples:
Since singularity images are read only, they cannot have persistent changes made to them. This is great for reproducibility, a bit inconvenient if your tools are likely to change often. You can rebuild the image in several ways, though all will require sudo permissions.
Write a new Singularity definition based on the docker image
Create a new definition file (generally called Singularity or something.def), use the current container as a base and add the desired software in the %post section. Then build the new image with: sudo singularity build brainy_jq.sif Singularity
The definition file docs are quite good and highly recommended.
Bootstrap: docker
From: brainlife/cli:latest
%post
apt-get update && apt-get install -y jq
Create a sandbox of the current singularity image, make your changes, and convert back to a read-only image. See the singularity docs on writable sandbox directories and converting images between formats.
# use --sandbox to create a writable singularity image
sudo singularity build --sandbox writable_brain/ brainlifeimage.sif
# --writable must still be used to make changes, and sudo for correct permissions
sudo singularity exec writable_brain/ bash -c 'apt-get update && apt-get install -y jq'
# convert back to read-only image for normal usage
sudo singularity build brainlifeimage_jq.sif writable_brain/
Modify the source docker image locally and build from that. One of the more... creative options. Almost sudo-free, except singularity pull doesn't accept docker-daemon so a sudo singularity build is necessary.
# add jq to a new docker container. the value for --name doesn't matter, but we use it
# in later steps. The entrypoint needs to be overridden in this case as well.
docker run -it --name brainlife-jq --entrypoint=/bin/bash \
brainlife/cli:1.5.25 -c 'apt-get update && apt-get install -y jq'
# use docker commit to create an image from the container so it can be reused
# note that we're using the name of the image set in the previous step
# the output of docker commit is the hash for the newly created image, so we grab that
IMAGE_ID=$(docker commit brainlife-jq)
# tag the newly created image with a more useful name
docker tag $IMAGE_ID brainlife/cli:1.5.25-jq
# here we use docker-daemon instead of docker to build from a locally cached docker image
# instead of looking at docker hub
sudo singularity build brainlife_jq.sif docker-daemon://brainlife/cli:1.5.25-jq
# now check that it all worked as planned
singularity exec brainlife_jq.sif which jq
# /usr/bin/jq
ref: docker commit, using locally cached docker images

How to verify if the content of two Docker images is exactly the same?

How can we determine that two Docker images have exactly the same file system structure, and that the content of corresponding files is the same, irrespective of file timestamps?
I tried the image IDs but they differ when building from the same Dockerfile and a clean local repository. I did this test by building one image, cleaning the local repository, then touching one of the files to change its modification date, then building the second image, and their image IDs do not match. I used Docker 17.06 (the latest version I believe).
If you want to compare content of images you can use docker inspect <imageName> command and you can look at section RootFS
docker inspect redis
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:eda7136a91b7b4ba57aee64509b42bda59e630afcb2b63482d1b3341bf6e2bbb",
"sha256:c4c228cb4e20c84a0e268dda4ba36eea3c3b1e34c239126b6ee63de430720635",
"sha256:e7ec07c2297f9507eeaccc02b0148dae0a3a473adec4ab8ec1cbaacde62928d9",
"sha256:38e87cc81b6bed0c57f650d88ed8939aa71140b289a183ae158f1fa8e0de3ca8",
"sha256:d0f537e75fa6bdad0df5f844c7854dc8f6631ff292eb53dc41e897bc453c3f11",
"sha256:28caa9731d5da4265bad76fc67e6be12dfb2f5598c95a0c0d284a9a2443932bc"
]
}
if all layers are identical then images contains identical content
After some research I came up with a solution which is fast and clean per my tests.
The overall solution is this:
Create a container for your image via docker create ...
Export its entire file system to a tar archive via docker export ...
Pipe the archive directory names, symlink names, symlink contents, file names, and file contents, to an hash function (e.g., MD5)
Compare the hashes of different images to verify if their contents are equal or not
And that's it.
Technically, this can be done as follows:
1) Create file md5docker, and give it execution rights, e.g., chmod +x md5docker:
#!/bin/sh
dir=$(dirname "$0")
docker create $1 | { read cid; docker export $cid | $dir/tarcat | md5; docker rm $cid > /dev/null; }
2) Create file tarcat, and give it execution rights, e.g., chmod +x tarcat:
#!/usr/bin/env python3
# coding=utf-8
if __name__ == '__main__':
import sys
import tarfile
with tarfile.open(fileobj=sys.stdin.buffer, mode="r|*") as tar:
for tarinfo in tar:
if tarinfo.isfile():
print(tarinfo.name, flush=True)
with tar.extractfile(tarinfo) as file:
sys.stdout.buffer.write(file.read())
elif tarinfo.isdir():
print(tarinfo.name, flush=True)
elif tarinfo.issym() or tarinfo.islnk():
print(tarinfo.name, flush=True)
print(tarinfo.linkname, flush=True)
else:
print("\33[0;31mIGNORING:\33[0m ", tarinfo.name, file=sys.stderr)
3) Now invoke ./md5docker <image>, where <image> is your image name or id, to compute an MD5 hash of the entire file system of your image.
To verify if two images have the same contents just check that their hashes are equal as computed in step 3).
Note that this solution only considers as content directory structure, regular file contents, and symlinks (soft and hard). If you need more just change the tarcat script by adding more elif clauses testing for the content you wish to include (see Python's tarfile, and look for methods TarInfo.isXXX() corresponding to the needed content).
The only limitation I see in this solution is its dependency on Python (I am using Python3, but it should be very easy to adapt to Python2). A better solution without any dependency, and probably faster (hey, this is already very fast), is to write the tarcat script in a language supporting static linking so that a standalone executable file was enough (i.e., one not requiring any external dependencies, but the sole OS). I leave this as a future exercise in C, Rust, OCaml, Haskell, you choose.
Note, if MD5 does not suit your needs, just replace md5 inside the first script with your hash utility.
Hope this helps anyone reading.
Amazes me that docker doesn't do this sort of thing out of the box. Here's a variant on #mljrg's technique:
#!/bin/sh
docker create $1 | {
read cid
docker export $cid | tar Oxv 2>&1 | shasum -a 256
docker rm $cid > /dev/null
}
It's shorter, doesn't need a python dependency or a second script at all, I'm sure there are downsides but it seems to work for me with the few tests I've done.
There doesn't seem to be a standard way for doing this. The best way that I can think of is using the Docker multistage build feature.
For example, here I am comparing the apline and debian images. In yourm case set the image names to the ones you want to compare
I basically copy all the file from each image into a git repository and commit after each copy.
FROM alpine as image1
FROM debian as image2
FROM ubuntu
RUN apt-get update && apt-get install -y git
RUN git config --global user.email "you#example.com" &&\
git config --global user.name "Your Name"
RUN mkdir images
WORKDIR images
RUN git init
COPY --from=image1 / .
RUN git add . && git commit -m "image1"
COPY --from=image2 / .
RUN git add . && git commit -m "image2"
CMD tail > /dev/null
This will give you an image with a git repository that records the differences between the two images.
docker build -t compare .
docker run -it compare bash
Now if you do a git log you can see the logs and you can compare the two commits using git diff <commit1> <commit2>
Note: If the image building fails at the second commit, this means that the images are identical, since a git commit will fail if there are no changes to commit.
If we rebuild the Dockerfile it is almost certainly going to produce a new hash.
The only way to create an image with the same hash is to use docker save and docker load. See https://docs.docker.com/engine/reference/commandline/save/
We could then use Bukharov Sergey's answer (i.e. docker inspect) to inspect the layers, looking at the section with key 'RootFS'.

Resources