I'm rather new to Docker and I'm trying to make a simple Dockerfile that combines an Alpine image with a Python one.
This is what the Dockerfile looks like:
FROM alpine
RUN apk update &&\
apk add -q --progress \
bash \
bats \
curl \
figlet \
findutils \
git \
make \
mc \
nodejs \
openssh \
sed \
wget \
vim
ADD ./src/ /home/src/
WORKDIR /home/src/
FROM python:3.7.4-slim
When running:
docker build -t alp-py .
the image builds as normal.
When I run
docker run -it alp-py bash
I can access bash, but when I cd to /home/ and ls, it shows an empty directory:
root@5fb77bbc81a1:/# cd home
root@5fb77bbc81a1:/home# ls
root@5fb77bbc81a1:/home#
I've already tried changing ADD to COPY, and also tried:
COPY . /home/src/
but nothing works.
What am I doing wrong? Am I missing something?
Thanks!
There is no such thing as "combining 2 images". Think of images roughly as separate virtual machines (only as a mental model for understanding the concept; they are really more than that). You cannot combine them.
In your example you can start directly from the Python image and install the tools you need on top of it:
FROM python:3.7.4-slim
RUN apt-get update &&\
apt-get install -y \
bash \
bats \
curl \
figlet \
findutils \
git \
make \
mc \
nodejs \
openssh \
sed \
wget \
vim
ADD ./src/ /home/src/
WORKDIR /home/src/
I didn't test whether all the packages are available, so you might want to do a bit of research to get them all in case you get errors.
When you use 2 FROM statements in your Dockerfile you are creating a multi-stage build. That is useful if you want a final image that doesn't contain your source code, only the binaries of your product (the first stage builds the source and the second only copies the binaries from the first one).
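For illustration, a minimal multi-stage sketch (the build step and paths here are made up, not taken from your project):
# stage 1: do the heavy work here
FROM python:3.7.4-slim AS builder
WORKDIR /build
COPY ./src/ .
RUN python -m compileall .        # stand-in for a real build step
# stage 2: the final image only receives the output of stage 1
FROM python:3.7.4-slim
COPY --from=builder /build/ /home/src/
WORKDIR /home/src/
Only the last FROM determines what the final image is based on; everything before it exists just to produce files you copy out with COPY --from.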
Related
When I use curl --head to test my website, it returns the server information.
I followed this tutorial to hide the nginx server header.
But when I run the command yum install nginx-module-security-headers, it returns yum: not found.
I also tried apk add nginx-module-security-headers, and it shows that the package is missing.
I have used nginx:1.17.6-alpine as my base Docker image. Does anyone know how to hide the server header with this Alpine image?
I think I have an easier solution here: https://gist.github.com/hermanbanken/96f0ff298c162a522ddbba44cad31081. Big thanks to hermanbanken on GitHub for sharing this gist.
The idea is to create a multi-stage build that uses the nginx alpine image as the base for compiling the module. This turns into the following Dockerfile:
ARG VERSION=alpine
FROM nginx:${VERSION} as builder
ENV MORE_HEADERS_VERSION=0.33
ENV MORE_HEADERS_GITREPO=openresty/headers-more-nginx-module
# Download sources
RUN wget "http://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz" -O nginx.tar.gz && \
wget "https://github.com/${MORE_HEADERS_GITREPO}/archive/v${MORE_HEADERS_VERSION}.tar.gz" -O extra_module.tar.gz
# For latest build deps, see https://github.com/nginxinc/docker-nginx/blob/master/mainline/alpine/Dockerfile
RUN apk add --no-cache --virtual .build-deps \
gcc \
libc-dev \
make \
openssl-dev \
pcre-dev \
zlib-dev \
linux-headers \
libxslt-dev \
gd-dev \
geoip-dev \
perl-dev \
libedit-dev \
mercurial \
bash \
alpine-sdk \
findutils
SHELL ["/bin/ash", "-eo", "pipefail", "-c"]
RUN rm -rf /usr/src/nginx /usr/src/extra_module && mkdir -p /usr/src/nginx /usr/src/extra_module && \
tar -zxC /usr/src/nginx -f nginx.tar.gz && \
tar -xzC /usr/src/extra_module -f extra_module.tar.gz
WORKDIR /usr/src/nginx/nginx-${NGINX_VERSION}
# Reuse same cli arguments as the nginx:alpine image used to build
RUN CONFARGS=$(nginx -V 2>&1 | sed -n -e 's/^.*arguments: //p') && \
sh -c "./configure --with-compat $CONFARGS --add-dynamic-module=/usr/src/extra_module/*" && make modules
# Production container starts here
FROM nginx:${VERSION}
COPY --from=builder /usr/src/nginx/nginx-${NGINX_VERSION}/objs/*_module.so /etc/nginx/modules/
# ... skipped inserting config files and stuff ...
# Validate the config
RUN nginx -t
The Alpine repo probably doesn't have the ngx_security_headers module, but the mentioned tutorial also provides the option of using the Headers More module. You should be able to install this module in your Alpine distro using the command:
apk add nginx-mod-http-headers-more
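Once the module is installed, the actual hiding happens in nginx.conf. A rough sketch of the relevant bits (the directive names come from the headers-more module; the exact module path, and whether you need the load_module line at all, depends on how the package installs it):
load_module /etc/nginx/modules/ngx_http_headers_more_filter_module.so;
http {
    server_tokens off;           # hides only the version number
    more_clear_headers Server;   # drops the Server header entirely
}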
Hope it helps.
Source
I found an alternate solution. The reason it says the binary is not compatible is that I have an nginx pre-installed under the target path, and it is not compatible with the headers-more module I am using. That means I cannot simply install the third-party library from the Alpine package.
So I prepared a clean Alpine OS and followed the GitHub repository to build Nginx from source with the additional module. The build result ends up under the prefix path you specified.
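For reference, the manual build boils down to something like the following (versions and the install prefix are only examples, and the usual build dependencies such as gcc, make, pcre-dev, zlib-dev and openssl-dev are assumed to be installed):
wget https://nginx.org/download/nginx-1.17.6.tar.gz && tar xzf nginx-1.17.6.tar.gz
wget https://github.com/openresty/headers-more-nginx-module/archive/v0.33.tar.gz && tar xzf v0.33.tar.gz
cd nginx-1.17.6
./configure --prefix=/opt/nginx --add-module=../headers-more-nginx-module-0.33
make && make install   # the result lands under the --prefix path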
I'm just testing out Docker so this might be a pretty simple question but I cannot seem to find out why it's not doing what I expect.
I created a pretty simple Dockerfile for testing, just to build a simple image that installs some packages, clones a git repo and installs its requirements:
FROM ubuntu:18.04
ENV PYTHONEXEC=python3 \
PIPEXEC=pip \
VIRTUALENVEXEC=virtualenv \
GITREPO=https://github.com/test/test.git \
REPODIR=test
RUN apt-get update && apt-get install -y git \
python3 \
python3-dev \
python3-virtualenv \
python-virtualenv \
qt5-default \
libcurl4-openssl-dev \
libxml2 \
libxml2-dev \
libxslt1-dev \
libssl-dev \
virt-viewer
RUN mkdir -p /app
WORKDIR /app
RUN git clone $GITREPO $REPODIR \
&& $VIRTUALENVEXEC -p $PYTHONEXEC venv \
&& . venv/bin/activate \
&& cd $REPODIR \
&& $PIPEXEC install -r requirements.txt
CMD ["sleep", "1000000"]
Then I build the image with:
docker build -t gitapp:latest .
This works so far. However, if I specify a -e parameter on the docker container run command, it doesn't seem to be picked up by the last RUN command.
So if I run docker container run -d -e "REPODIR=blah" gitapp, I expect it to be cloned in /app/blah, but it's still cloned in the /app/test directory.
When I run a docker container exec -it -e "REPODIR=blah" <container-id> env I get:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=2f6ba38341d6
TERM=xterm
REPODIR=blah
PYTHONEXEC=python3
PIPEXEC=pip
VIRTUALENVEXEC=virtualenv
GITREPO=https://github.com/test/test.git
HOME=/root
So it seems that the variable is indeed passed to the container. Then why isn't it passed to the last RUN command so that the repo is cloned into the right directory? Am I missing something basic here?
When you execute docker run you are instructing the container to execute the Dockerfile's CMD or ENTRYPOINT command. Dockerfile commands above the entrypoint have already been executed during the build and are not executed again at runtime.
That's exactly the reason your GitHub repo is cloned into the directory defined initially in the Dockerfile and not into the one passed to the run command with the -e flag.
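If what you actually want is to choose the clone directory per build rather than per run, a build argument would do that. A small sketch, not tested against your exact Dockerfile: declare REPODIR as a build argument instead of a plain ENV entry,
ARG REPODIR=test
ENV REPODIR=$REPODIR
and then build with:
docker build --build-arg REPODIR=blah -t gitapp:latest .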
A workaround would be to alter your image's entrypoint. You may transfer this part
RUN git clone $GITREPO $REPODIR \
&& $VIRTUALENVEXEC -p $PYTHONEXEC venv \
&& . venv/bin/activate \
&& cd $REPODIR \
&& $PIPEXEC install -r requirements.txt
into a bash script (let's call it myscript.sh) that will be executed as the image's entrypoint. Copy this file during the build into a preferred location, make sure it has the executable flag set, and edit your Dockerfile's entrypoint accordingly:
CMD ["/path_to_script/myscript.sh" ]
This, however, has the caveat that the script will be executed each time the container is started, in contrast with your current setup, possibly leading to a delay depending on what myscript.sh does.
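A minimal sketch of what that could look like (the script name and location are just examples):
myscript.sh:
#!/bin/bash
set -e
git clone "$GITREPO" "$REPODIR"
$VIRTUALENVEXEC -p $PYTHONEXEC venv
. venv/bin/activate
cd "$REPODIR"
$PIPEXEC install -r requirements.txt
exec sleep 1000000
and in the Dockerfile:
COPY myscript.sh /path_to_script/myscript.sh
RUN chmod +x /path_to_script/myscript.sh
CMD ["/path_to_script/myscript.sh"]
Because the script only runs when the container starts, a runtime -e "REPODIR=blah" is now visible to it.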
I have been trying for a while to copy files via ssh from a remote server (not GitHub) into the Docker image I want to build, but I can't connect to the host. Here is the Dockerfile up until the critical point:
FROM r-base:latest
### Install libs
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
openssh-server \
openssh-client \
libcurl4-gnutls-dev \
libcairo2-dev \
libxt-dev \
xtail \
wget \
libssl-dev \
libxml2 \
libxml2-dev \
libv8-dev \
curl \
gnupg \
git
COPY ./setup setup
RUN mv setup/.ssh ~/.ssh
RUN touch ~/.ssh/known_hosts
RUN chmod -R 400 ~/.ssh
RUN ssh-agent sh -c 'ssh-add /root/.ssh/id_rsa'
#RUN eval "$(ssh-agent -s)"
#RUN ssh-add -K ~/.ssh/id_rsa #This is commented out as it causes an error
RUN ssh-keyscan hostname > ~/.ssh/known_host
RUN ssh-keygen -R hostname
## THIS IS THE COMMAND WE NEED TO RUN...
RUN scp -r user@hostname:/path/to/folder ./
The owner of the folder is user. The id_rsa.pub was added to the authorized_keys file of the user user on the host, and ssh was restarted there. However, I get a failed authentication error. I tried to use my personal id_rsa, which works from the command line, but it also fails inside Docker. Is this a Docker issue, or is it solvable?
I finally managed to do it by generating a key with the command suggested in this post.
So to reproduce my case, locally:
cd setup/.ssh/
ssh-keygen -q -t rsa -N '' -f id_rsa
Then, on the server, add the id_rsa.pub contents to the authorized_keys of the user user. (You can copy the contents to the clipboard using xclip: xclip -sel clip < setup/.ssh/id_rsa.pub)
Dockerfile:
FROM r-base:latest
### Install libs
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
openssh-server \
openssh-client \
libcurl4-gnutls-dev \
libcairo2-dev \
libxt-dev \
xtail \
wget \
libssl-dev \
libxml2 \
libxml2-dev \
libv8-dev \
curl \
gnupg \
git
COPY ./setup setup
RUN chmod -R 600 ~/.ssh
RUN echo "IdentityFile /root/.ssh/id_rsa" >> /etc/ssh/ssh_config
RUN echo "StrictHostKeyChecking no" >> /etc/ssh/ssh_config
## THIS IS THE COMMAND WE NEED TO RUN...
RUN scp -r user@hostname:/path/to/folder ./
There’s no specific requirement that you must do everything inside your Dockerfile. Especially things that require remote ssh access are better done outside Docker: consider that anyone who gets your image later on can docker cp a valid ssh key out of it and potentially get access to your internal systems.
For Docker caching reasons, it’s also not a good idea to git clone or otherwise try to remotely retrieve your application from inside the Dockerfile. If you re-run docker build, and nothing else in your Dockerfile has changed, then Docker will skip over the scp step too, even if the remote content has changed.
My general recommendation would be to copy this content from outside the Dockerfile, then build it:
# Using whatever credentials are in your local ssh-agent
scp -r user#hostname:/path/to/stuff dist/
# Then your Dockerfile doesn’t need scp or credentials
docker build .
Your Dockerfile then doesn’t need a bunch of extra packages that are only relevant to this path: you should be able to remove sudo openssh-server openssh-client xtail curl gnupg git without actually affecting the single main process you’re trying to run inside your container.
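For illustration, the Dockerfile could then shrink to something like this (package list trimmed, and the COPY target is only an example):
FROM r-base:latest
RUN apt-get update && apt-get install -y \
    gdebi-core \
    pandoc \
    pandoc-citeproc \
    libcurl4-gnutls-dev \
    libcairo2-dev \
    libxt-dev \
    libssl-dev \
    libxml2-dev \
    libv8-dev \
 && rm -rf /var/lib/apt/lists/*
COPY dist/ /stuff/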
docker image inspect <name>
gives me 16GB and about 20 layers.
When I am logged in as root, this
du -hs /
shows me just 2GB.
FYI, there are already multi-line RUN commands in the Dockerfile.
Can I squash all layers into one layer without touching the Dockerfile, rebuilding, etc.?
Or possibly by adding an extra action to the Dockerfile which clears/improves caching?
The Dockerfile is:
FROM heroku/heroku:18
ENV PYENV_ROOT="/pyenv"
ENV PATH="/pyenv/shims:/pyenv/bin:$PATH"
ENV PYTHON_VERSION 3.5.6
ENV GPG_KEY <value>
ENV PYTHONUNBUFFERED 1
ENV TERM xterm
ENV EDITOR vim
RUN apt-get update && apt-get install -y \
build-essential \
gdal-bin \
binutils \
iputils-ping \
libjpeg8 \
libproj-dev \
libjpeg8-dev \
libtiff-dev \
zlib1g-dev \
libfreetype6-dev \
liblcms2-dev \
libxml2-dev \
libxslt1-dev \
libssl-dev \
libncurses5-dev \
virtualenv \
python-pip \
python3-pip \
python-dev \
libmysqlclient-dev \
mysql-client-5.7 \
libpq-dev \
libcurl4-gnutls-dev \
libgnutls28-dev \
libbz2-dev \
tig \
git \
vim \
nano \
tmux \
tmuxinator \
fish \
sudo \
libnet-ifconfig-wrapper-perl \
ruby \
libssl-dev \
nodejs \
strace \
tcpdump \
# npm & grunt
&& curl -L https://npmjs.com/install.sh | sh \
&& npm install -g grunt-cli grunt \
# ruby & foreman
&& gem install foreman \
# installing pyenv
&& curl https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash
COPY . /app
COPY ./requirements /requirements
COPY ./requirements.txt /requirements.txt
COPY ./docker/docker_compose/django/foreman.sh /foreman.sh
COPY ./docker/docker_compose/django/Procfile /Procfile
COPY ./docker/docker_compose/django/entrypoint.sh /entrypoint.sh
# ADD sudoer user django with password django
RUN groupadd -r django -g 1000 && \
useradd -ms /usr/bin/fish -p $(openssl passwd -1 django) --uid 1000 --gid 1000 -r -g django django && \
usermod -a -G sudo django && \
chown -R django:django /app
COPY --chown=django:django ./docker/docker_compose/django/fish /home/django/.config/fish
COPY --chown=django:django ./docker/docker_compose/django/tmuxinator /home/django/.tmuxinator
COPY ./docker/docker_compose/django/fish /root/.config/fish
WORKDIR /app
RUN sed -i 's/\r//' /entrypoint.sh \
&& sed -i 's/\r//' /foreman.sh \
&& chmod +x /entrypoint.sh \
&& chown django /entrypoint.sh \
&& chmod +x /foreman.sh \
&& chown django /foreman.sh \
&& chown -R django:django /home/django/ \
&& pyenv install ${PYTHON_VERSION%%} \
&& mkdir -p /app/log \
&& pyenv global ${PYTHON_VERSION%%} \
&& pyenv rehash \
&& ${PYENV_ROOT%%}/versions/${PYTHON_VERSION%%}/bin/pip install -U pip \
&& ${PYENV_ROOT%%}/versions/${PYTHON_VERSION%%}/bin/pip install -r /requirements.txt \
&& chown -R django:django /pyenv/ \
&& ${PYENV_ROOT%%}/versions/${PYTHON_VERSION%%}/bin/pip install -r /requirements/dev_requirements.txt
# this user receives ENVs from the top
USER django
ENTRYPOINT ["/entrypoint.sh"]
What I've tried so far:
The --squash option from the experimental mode of docker build is not really an option for me. That Dockerfile is one of several Dockerfiles used inside docker-compose.
I've also checked this:
https://github.com/jwilder/docker-squash
but it seems docker load cannot load a squashed image.
Also, that squash gives me 8GB (still far from the expected ~2GB):
docker save <image_id> | docker-squash -t latest_tiny | docker load
Update after answers:
When I added this:
&& apt-get autoremove \ # ? to consider
&& apt-get clean \ # ? to consider
&& rm -rf /var/lib/apt/lists/*
to the apt-get command, and --no-cache-dir to each pip call, the result was 72GB (yes, even much more: docker images shows 36GB before the pip command, and 72GB as the final size).
My working directory is clean (regarding COPY). du -hs / (as root) still shows 2GB. And all images were removed before rebuilding.
Following @Mihai's approach, I was able to slim down the image from 16GB to 9GB.
There is a simple trick to get rid of the intermediate layers. It will bring down the size as well, but how much depends on how the image was built.
Create a Dockerfile like this:
FROM your_image as initial
FROM your_image_base
COPY --from=initial / /
your_image_base should be something like 'alpine', i.e. the smallest image from which your image and its parents descend.
Now build the image and check the history and size:
docker build -t your-image:2.0 .
docker image history your-image:2.0
docker image ls
This way you do create a new Dockerfile (if that is acceptable for your process) without touching the initial Dockerfile.
Let me know if this solves your issue.
UPDATE AFTER SEEING THE Dockerfile:
Maybe I missed it, but I don't see you cleaning up the apt-get cache after you perform the installations. Your big RUN command should end with "&& rm -rf /var/lib/apt/lists/*" in the same RUN so that it doesn't store the whole cache in the layer.
Definitely add && rm -rf /var/lib/apt/lists/* at the end of your main RUN command, like Mihai said. Another thing that may help (depending on how big your dependencies are) is installing with pip using the --no-cache-dir option. Also, make sure you understand the build context and consider using either a .dockerignore or sending the context from another directory (it totally depends on how your directory is set up).
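As a sketch (package list shortened to three entries), the pattern is: the cleanup must happen in the same RUN that does the install, otherwise the cache is already baked into an earlier layer, and pip gets --no-cache-dir (with whichever pip binary you actually use):
RUN apt-get update && apt-get install -y \
        build-essential \
        git \
        vim \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir -r /requirements.txt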
I've also had luck exploring an image using dive. Honestly, this looks like a pretty big image, so I'm not sure how much you're going to be able to get it down.
To squash a (Docker) container image without re-building the image or manipulating the original Dockerfile, you can extend from your image and squash it:
docker build --squash -t your_image_squashed - <<< "FROM your_image"
It's very easy, just use:
docker commit YOUR_CONTAINER_ID NEW_IMAGE_ID
Docker will throw away the intermediate layers; you lose the history, but the size is smaller.
I'm not sure if I'm using Docker wrong, but I have a base image called repo/base and it looks like this:
# Pull base image.
FROM centos:centos7
# add yum repos
ADD yum-repos/* /etc/yum.repos.d/
ADD certs/RPM-GPG-KEY-EPEL-7 /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
# Upgrading system
RUN rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch && \
yum -y install epel-release \
wget \
git \
tar \
nodejs \
npm \
libicu-devel \
logstash-forwarder \
rhnpush \
monit \
java-1.8.0-openjdk-devel
ADD runner/* /
RUN chmod +x /runner.sh && chmod +x /service-wrapper.sh
ENTRYPOINT ["/runner.sh"]
Really not that big of a deal. I push this to an Artifactory. Then I create a test image from it:
FROM repo/base
RUN echo "foo"
I build it and push it to the repo. Here it looks like the same hashes are being pushed out AGAIN, so it's as if Docker isn't registering that the layers already exist remotely.
Is this normal, or is it somehow related to my remote Artifactory?