COPY or ADD command not executed on Docker Hub - docker

This Dockerfile works as expected on my laptop, but it fails when I use automated builds on Docker Hub.
FROM ubuntu
# Install required software via apt and pip
RUN apt-get -y update && \
apt-get install -y \
awscli \
python \
python-pip \
software-properties-common \
&& add-apt-repository ppa:ubuntugis/ppa \
&& apt-get -y update \
&& apt-get install -y \
gdal-bin \
&& pip install boto3
# Copy Build Thumbnail script to Docker image and add execute permissions
COPY build-thumbnails.py build-thumbnails.py
RUN chmod +x build-thumbnails.py
The error is:
Step 6/7 : COPY build-thumbnails.py build-thumbnails.py
COPY failed: stat /var/lib/docker/tmp/docker-builder259560514/build-thumbnails.py: no such file or directory
The repo is here...
https://github.com/shantanuo/docker/blob/master/batch/Dockerfile
Why would the COPY or ADD command not work for automated builds?

Seems like other people have had the same issue; see here:
https://forums.docker.com/t/docker-build-failing-on-docker-hub/76191/2
The solution is to set the build context appropriately so that the relative path in the Dockerfile COPY is correct.
In your Docker Hub repository go to “Builds” and click on “Configure Automated Builds”. There you can set the “Build Context” for each build rule.
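For this repo, where the Dockerfile lives in the batch/ subdirectory, the rule might look roughly like this (the values below are illustrative, not copied from your Hub settings); the local equivalent is to build with batch/ as the context:
# Hypothetical Docker Hub build rule (illustrative values):
#   Source branch:   master
#   Dockerfile path: Dockerfile
#   Build context:   /batch
# Local equivalent: pass batch/ as the build context so COPY can find the script
docker build -f batch/Dockerfile batch/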
Check the last answer on this page too:
https://github.com/docker/hub-feedback/issues/811
Let me know if that helps!

Related

How to reduce multistage build duplicate steps time cost issue?

I have a Go application which depends on cgo. When building, it needs libsodium-dev, libzmq3-dev and libczmq-dev, and when running it also needs the same three packages.
Currently I use the following multi-stage build: a golang build environment as the first stage and a debian slim as the second stage. But as you can see, the 3 packages are installed twice, which wastes time (later I may have more such packages added).
FROM golang:1.12.9-buster AS builder
WORKDIR /src/pigeon
COPY . .
RUN apt-get update && \
apt-get install -y --no-install-recommends libsodium-dev && \
apt-get install -y --no-install-recommends libzmq3-dev && \
apt-get install -y --no-install-recommends libczmq-dev && \
go build cmd/main/pgd.go
FROM debian:buster-slim
RUN apt-get update && \
apt-get install -y --no-install-recommends libsodium-dev && \
apt-get install -y --no-install-recommends libzmq3-dev && \
apt-get install -y --no-install-recommends libczmq-dev && \
apt-get install -y --no-install-recommends python3 && \
apt-get install -y --no-install-recommends python3-pip && \
pip3 install jinja2
WORKDIR /root/
RUN mkdir logger
COPY --from=builder /src/pigeon/pgd .
COPY --from=builder /src/pigeon/logger logger
CMD ["./pgd"]
Of course, I could give up the multi-stage build and just use golang:1.12.9-buster both for building and for running, but that would make the final run image bigger (avoiding which is the whole advantage of a multi-stage build).
Am I missing something, or do I have to choose between the two?
This is my take on your question:
FROM debian:buster-slim as base
RUN mkdir /debs /debs_tmp \
&& chmod 777 /debs /debs_tmp
WORKDIR /debs
RUN apt-get update \
&& apt-get install -y -d \
--no-install-recommends \
-o dir::cache::archives="/debs_tmp/" \
libsodium-dev \
libzmq3-dev \
libczmq-dev \
&& mv /debs_tmp/*.deb /debs \
&& rm -rf /debs_tmp \
&& apt-get install -y --no-install-recommends \
python3 \
python3-pip \
&& pip3 install jinja2 \
&& rm -rf /var/lib/apt/lists/*
##################
FROM golang:1.12.9-buster AS builder
COPY --from=base /debs /debs
WORKDIR /debs
RUN dpkg -i *.deb
WORKDIR /src/pigeon
COPY . .
RUN go build cmd/main/pgd.go
##################
FROM base
RUN rm -rf /debs
WORKDIR /root/
RUN mkdir logger
COPY --from=builder /src/pigeon/pgd .
COPY --from=builder /src/pigeon/logger logger
CMD ["./pgd"]
You can download the required packages into a temporary folder, move the .deb files to a new location and finally COPY them into the next stage. At the end you simply reuse the first image you created.
BTW the containers will run as root. This might be an issue depending on what the software does; you might want to consider using a user without "powers".
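If you go that route, a minimal sketch for the final stage (the user and group names here are made up for illustration):
# Create an unprivileged user and switch to it before CMD
RUN groupadd -r app && useradd -r -g app -m -d /home/app app
USER app
WORKDIR /home/app
# and copy the binary owned by that user, e.g.:
# COPY --chown=app:app --from=builder /src/pigeon/pgd .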
EDIT: sorry for the edits, but I ran a couple of examples locally and didn't have a Go script ready.
At the COPY . . step, any time your source changes, the cache will bust and you will run all later steps again. You can reorder the steps to allow docker to cache the install of your dependencies. You can also join the apt-get install commands into one to reduce overhead of processing the package manager db.
FROM golang:1.12.9-buster AS builder
WORKDIR /src/pigeon
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libsodium-dev \
libzmq3-dev \
libczmq-dev
COPY . .
RUN go build cmd/main/pgd.go
FROM debian:buster-slim
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libsodium-dev \
libzmq3-dev \
libczmq-dev \
python3 \
python3-pip \
&& pip3 install jinja2
WORKDIR /root/
RUN mkdir logger
COPY --from=builder /src/pigeon/pgd .
COPY --from=builder /src/pigeon/logger logger
CMD ["./pgd"]
You will still install the packages twice, but now those installs are cached for future builds. The way to reuse the library install is to reorder the steps, installing the libraries in a common base image and then installing the Go compiler in your build stage, but that will almost certainly be more overhead than installing the libraries twice.
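For reference, a rough sketch of that "install once in a common base" layout (the golang-go package here is Debian's own toolchain, likely older than golang:1.12.9, and the python3/jinja2 setup and logger directory are omitted for brevity):
FROM debian:buster-slim AS base
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libsodium-dev \
libzmq3-dev \
libczmq-dev
# Build stage: add a Go toolchain (and C compiler for cgo) on top of the shared base
FROM base AS builder
RUN apt-get install -y --no-install-recommends golang-go gcc libc6-dev pkg-config
WORKDIR /src/pigeon
COPY . .
RUN go build cmd/main/pgd.go
# Runtime stage: starts from the same base, so the C libraries are installed only once
FROM base
WORKDIR /root/
COPY --from=builder /src/pigeon/pgd .
CMD ["./pgd"]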
With BuildKit, you could share the apt cache between builds using an experimental syntax, but this requires that all builds use BuildKit (the syntax is not backwards compatible) and that you modify Docker's Debian image to preserve the apt package cache. From the BuildKit experimental documentation, there's the following example for apt:
# syntax = docker/dockerfile:experimental
FROM ubuntu
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
apt update && apt install -y gcc
https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md
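Applied to the builder stage from the question, a hedged sketch might look like this (it assumes the build runs with BuildKit enabled; paths and package names are taken from the Dockerfiles above):
# syntax = docker/dockerfile:experimental
FROM golang:1.12.9-buster AS builder
WORKDIR /src/pigeon
# Keep downloaded packages so the cache mounts can reuse them across builds
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
apt-get update && apt-get install -y --no-install-recommends \
libsodium-dev \
libzmq3-dev \
libczmq-dev
COPY . .
RUN go build cmd/main/pgd.go
Built, for example, with:
DOCKER_BUILDKIT=1 docker build .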

Set of artifacts that needs to be copied for Docker multistage builds with yum

I'm trying to build a multi-stage Docker image from CentOS:
FROM centos as python-base
RUN yum install -y wget \
tar \
make \
gcc \
gcc-c++ \
zlib \
zlib-devel \
libffi-devel \
openssl-devel \
&& yum clean all
WORKDIR /usr/src/
RUN wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tgz
RUN tar xzf Python-3.7.0.tgz
WORKDIR /usr/src/Python-3.7.0
RUN ./configure --enable-optimizations
RUN make altinstall
RUN python3.7 -V
#=====================================================================================
FROM centos:cs as python37
COPY --from=python-base /usr/local/lib/python3.7 /usr/local/lib/python3.7
COPY --from=python-base /usr/local/bin/pip3.7 /usr/local/bin/pip3.7
COPY --from=python-base /usr/local/bin/python3.7 /usr/local/bin/python3.7
RUN ln -s /usr/local/bin/pip3.7 /usr/local/bin/pip
RUN ln -s /usr/local/bin/python3.7 /usr/local/bin/python
As shown above, I've built python37 from the python-base stage. Here, I've copied the required artifacts from the python-base stage to the python37 stage.
FROM centos as httpd-base
RUN yum -y groupinstall "Development tools"\
httpd-2.4.6-88.el7.centos.x86_64 \
&& yum clean all
So my question is: what is the set of artifacts that needs to be copied from the httpd-base stage to build an image that has httpd, but not all the development tools that are required only during installation?
Any best practices in this regard are also appreciated.
Thanks in advance.
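One hedged way to explore this is to ask rpm, inside the httpd-base stage, which files the package owns; but since httpd pulls in shared libraries (apr, apr-util and friends), copying files by hand is fragile, and installing just httpd (without the "Development tools" group) in the final stage is often the simpler route. A sketch (the stage name is illustrative):
# In the httpd-base stage: list the files owned by the httpd package
RUN rpm -ql httpd > /tmp/httpd-files.txt
# Often simpler: a final stage that installs only httpd, no build tools
FROM centos as httpd
RUN yum install -y httpd-2.4.6-88.el7.centos.x86_64 \
&& yum clean all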

Prevent docker from building the image from scratch after making changes to the code

A Docker newbie here, trying to develop in a Docker container. My problem is that every time I make a single-line change to the code and try to rerun the container, Docker rebuilds the image from scratch, which takes a very long time. How should I set up the project so it makes the best use of the cache? I'm pretty sure it doesn't have to reinstall all the apt-get and pip packages (btw, I am developing in Python) whenever I make some changes to the source code. Does anyone have any idea what I am missing? Appreciate any help.
My current docker file:
FROM tiangolo/uwsgi-nginx-flask:python3.6
# Copy the current directory contents into the container at /app
ADD ./app /app
# Run python's package manager and install the flask package
RUN apt-get update -y \
&& apt-get -y install default-jre \
&& apt-get install -y \
build-essential \
gfortran \
libblas-dev \
liblapack-dev \
libxft-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
ADD ./requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
Once the cache breaks in a Dockerfile, all of the following lines will need to be rebuilt since they no longer have a cache hit. The cache search looks for an existing previous layer and an identical command (or contents of something like a COPY) to reuse the cache. If either does not match, then you have a cache miss and it performs the build step. For your scenario, you simply need to reorder your lines to make sure the frequently changing part is at the end rather than the beginning of the file:
FROM tiangolo/uwsgi-nginx-flask:python3.6
# Run python's package manager and install the flask package
RUN apt-get update -y \
&& apt-get -y install default-jre \
&& apt-get install -y \
build-essential \
gfortran \
libblas-dev \
liblapack-dev \
libxft-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt
# Copy the current directory contents into the container at /app
COPY app /app
I've also modified your ADD lines to COPY because you don't need the extra features provided by ADD.
During development, I'd recommend mounting app as a volume in your container so you don't need to rebuild the image for every code change. You can leave the COPY app /app inside your Dockerfile and the volume mount will simply overlay the directory, hiding anything in your image at that location. You only need to restart your container to pick up your modifications. Once finished, a build will create an image that looks identical to your development environment.
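A minimal sketch of that development setup (the image tag is made up, the host port is arbitrary, and the base image is assumed to serve on its default port 80):
docker build -t my-flask-app .
docker run --rm -p 8080:80 -v "$(pwd)/app:/app" my-flask-app
After editing code under ./app, restarting the container is enough; no rebuild is needed until you want a final image.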

docker-compose update from S3 bucket

Our Dockerfile invokes a Python script which copies a binary from S3 to /usr/bin. This works fine the first time, but from then on "docker-compose build" does nothing because everything is cached. This is a problem if the binary has changed.
Short of building with --no-cache, what is the best way to make sure "docker-compose build" will always pick up the new binary if there is one? We don't mind if it unnecessarily downloads the binary even when it is unchanged, as long as it does pick it up when the binary has changed.
Seems like we want a Dockerfile step that always executes?
FROM ubuntu:trusty
RUN apt-get update
RUN apt-get -y install software-properties-common
RUN apt-get -y install --reinstall ca-certificates
RUN add-apt-repository ppa:fkrull/deadsnakes
RUN apt-get update && apt-get install -y \
curl \
wget \
vim \
git \
python3.5 \
python3-pip \
python3-setuptools \
libpcap0.8-dev
RUN ln -sf /usr/bin/python3.5 /usr/bin/python3
ADD . /app
WORKDIR /app
# Install Python Requirements
RUN pip3 install -r etc/python/requirements.txt
# Download/Install processor and associated libs
RUN python3 setup_processor.py
RUN mkdir -p /logs
ENTRYPOINT ["/app/entrypoint.sh"]
Where setup_processor.py downloads directly from S3 to /usr/bin.
So as of now there is no direct feature like this, but there is a workaround.
Add a build argument before your download step:
ARG BUILD_ON=now
# Download/Install processor and associated libs
RUN python3 setup_processor.py
While building the image, use:
docker build --build-arg BUILD_ON="$(date)" ....
This will always make sure that you get a change at the ARG step, and the cache for all steps after it will be invalidated.
A feature like this has already been requested and is being worked on in the thread below:
https://github.com/moby/moby/issues/1996

rebuild docker image from specific step

I have the below Dockerfile.
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <samuel@alexander.com>
RUN apt-get -y install software-properties-common
RUN apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections
RUN add-apt-repository -y ppa:webupd8team/java
RUN apt-get -y update
RUN apt-get install -y oracle-java8-installer
RUN rm -rf /var/lib/apt/lists/*
RUN rm -rf /var/cache/oracle-jdk8-installer
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update
RUN apt-get -y install maven
# Install Open SSH and git
RUN apt-get -y install openssh-server
RUN apt-get -y install git
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Install Supervisord
RUN apt-get -y install supervisor
RUN mkdir -p var/log/supervisor
# Configure Supervisord
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]
While building the image, it failed at step 23, i.e.
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
Now when I rebuild, it starts building from step 23 as Docker is using the cache.
But if I want to rebuild the image from step 21, i.e.
RUN git clone https://github.com/apache/incubator-zeppelin.git
How can I do that?
Is deleting the cached image the only option?
Is there any additional parameter to do that?
You can rebuild the entire thing without using the cache by doing a
docker build --no-cache -t user/image-name .
To force a rerun starting at a specific line, you can pass an arg that is otherwise unused. Docker passes ARG values as environment variables to your RUN command, so changing an ARG is a change to the command which breaks the cache. It's not even necessary to define it yourself on the RUN line.
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <samuel@alexander.com>
RUN apt-get -y install software-properties-common
RUN apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections
RUN add-apt-repository -y ppa:webupd8team/java
RUN apt-get -y update
RUN apt-get install -y oracle-java8-installer
RUN rm -rf /var/lib/apt/lists/*
RUN rm -rf /var/cache/oracle-jdk8-installer
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update
RUN apt-get -y install maven
# Install Open SSH and git
RUN apt-get -y install openssh-server
RUN apt-get -y install git
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork, changing INCUBATOR_VER will break the cache here
ARG INCUBATOR_VER=unknown
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Install Supervisord
RUN apt-get -y install supervisor
RUN mkdir -p var/log/supervisor
# Configure Supervisord
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]
And then just run it with a unique arg:
docker build --build-arg INCUBATOR_VER=20160613.2 -t user/image-name .
To change the argument with every build, you can pass a timestamp as the arg:
docker build --build-arg INCUBATOR_VER=$(date +%Y%m%d-%H%M%S) -t user/image-name .
or:
docker build --build-arg INCUBATOR_VER=$(date +%s) -t user/image-name .
As an aside, I'd recommend the following changes to keep your layers smaller. The more you can merge the cleanup and delete steps into the same RUN command as the download and install, the smaller your final image will be; otherwise your layers will include all the intermediate files between the download and the cleanup:
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <samuel@alexander.com>
RUN DEBIAN_FRONTEND=noninteractive \
apt-get -y install software-properties-common && \
apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
add-apt-repository -y ppa:webupd8team/java && \
apt-get -y update && \
DEBIAN_FRONTEND=noninteractive \
apt-get install -y oracle-java8-installer && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /var/cache/oracle-jdk8-installer
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update && \
DEBIAN_FRONTEND=noninteractive \
apt-get -y install \
maven \
openssh-server \
git \
supervisor && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork
ARG INCUBATOR_VER=unknown
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Configure Supervisord
RUN mkdir -p var/log/supervisor
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]
One workaround:
Locate the step you want to execute from.
Before that step, put a simple dummy operation like "RUN pwd".
Then just build your Dockerfile. It will take everything up to that step from the cache and then execute the lines after the dummy command.
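In the Dockerfile above, that would look roughly like this (any cheap command works; add or change the dummy line whenever you want the cache skipped from that point on):
...
RUN mvn -DskipTests clean package
# dummy step: adding or changing this line busts the cache from here down
RUN pwd
# clone and build zeppelin fork
RUN git clone https://github.com/apache/incubator-zeppelin.git
...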
To complete Dmitry's answer, you can use a unique arg value such as date +%s so you always keep the same command line:
docker build --build-arg DUMMY=`date +%s` -t me/myapp:1.0.0 .
Dockerfile:
...
ARG DUMMY=unknown
RUN DUMMY=${DUMMY} git clone xxx
...
A simpler technique.
In your Dockerfile, add this line where you want the caching to start being skipped:
COPY marker /dev/null
Then build using
date > marker && docker build .
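In context, that looks something like this (the marker file name is arbitrary, and the file must exist in the build context):
...
RUN mvn -DskipTests clean package
# caching is skipped from here on whenever the marker file's contents change
COPY marker /dev/null
RUN git clone https://github.com/apache/incubator-zeppelin.git
...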
Another option is to delete the cached intermediate image you want to rebuild.
Find the hash of the intermediate image you wish to rebuild in your build output:
Step 27/42 : RUN lolarun.sh
---> Using cache
---> 328dfe03e436
Then delete that image:
$ docker image rm 328dfe03e436
Or if it gives you an error and you're okay with forcing it:
$ docker image rm -f 328dfe03e436
Finally, rerun your build command, and it will need to restart from that point because it's not in the cache.
If you place ARG INCUBATOR_VER=unknown at the top, then the cache is not used at all when INCUBATOR_VER is changed from the command line (I just tested the build).
What worked for me:
# The rebuild starts from here
ARG INCUBATOR_VER=unknown
RUN INCUBATOR_VER=${INCUBATOR_VER} git clone https://github.com/apache/incubator-zeppelin.git
As there is no official way to do this, the simplest way is to temporarily change the specific RUN command to include something harmless like an echo:
RUN echo && apt-get -qq update && \
apt-get install nano
After building, remove it.
