I am running my monolith application in a Docker container on Kubernetes (GKE).
The application has Python and Node dependencies, plus webpack for the front-end bundle.
We have implemented CI/CD, which takes around 5-6 minutes to build and deploy a new version to the k8s cluster.
The main goal is to reduce the build time as much as possible. The Dockerfile is multi-stage.
Webpack takes the most time, generating the bundle. I am already using a high-spec worker to build the Docker image.
To reduce the time, I tried using the Kaniko builder.
Issue:
Docker layer caching works perfectly for the Python code. But when there are changes in a JS or CSS file, we have to generate a new bundle.
When a JS or CSS file changes, the build uses the cached layer instead of generating a new bundle.
Is there any way to choose between building a new bundle and using the cache by passing some value to the Dockerfile?
Here is my Dockerfile:
FROM python:3.5 AS python-build
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt &&\
pip3 install Flask-JWT-Extended==3.20.0
ADD . /app
FROM node:10-alpine AS node-build
WORKDIR /app
COPY --from=python-build ./app/app/static/package.json app/static/
COPY --from=python-build ./app ./
WORKDIR /app/app/static
RUN npm cache verify && npm install && npm install -g --unsafe-perm node-sass && npm run sass && npm run build
FROM python:3.5-slim
COPY --from=python-build /root/.cache /root/.cache
WORKDIR /app
COPY --from=node-build ./app ./
RUN apt-get update -yq \
&& apt-get install curl -yq \
&& pip install -r requirements.txt
EXPOSE 9595
CMD python3 run.py
I would suggest creating separate build pipelines for your dependency images, since the npm and pip requirements change infrequently.
This will greatly improve build speed by cutting the time spent accessing the npm and pip registries.
Use a private Docker registry (the official one, or something like VMware Harbor or Sonatype Nexus OSS).
Store those build images in your registry and reuse them whenever something in the project changes.
Something like this:
First Docker builder: python-builder:YOUR_TAG (gitrev, date, etc.)
docker build --no-cache -t python-builder:YOUR_TAG -f Dockerfile.python.build .
FROM python:3.5
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt &&\
pip3 install Flask-JWT-Extended==3.20.0
Second Docker builder: js-builder:YOUR_TAG (gitrev, date, etc.)
docker build --no-cache -t js-builder:YOUR_TAG -f Dockerfile.js.build .
FROM node:10-alpine
WORKDIR /app
COPY app/static/package.json /app/app/static/
WORKDIR /app/app/static
RUN npm cache verify && npm install && npm install -g --unsafe-perm node-sass
Your Application Multi-stage build:
docker build --no-cache -t app_delivery:YOUR_TAG -f Dockerfile.app .
FROM python-builder:YOUR_TAG as python-build
# Nothing else needed: the dependencies are already baked into this image by the other build process
FROM js-builder:YOUR_TAG AS node-build
ADD ##### YOUR JS/CSS files only here, required from npm! ###
RUN npm run sass && npm run build
FROM python:3.5-slim
# your original clean app
COPY . /app
COPY --from=python-build #### only the files installed with the pip command
WORKDIR /app
COPY --from=node-build ##### Only the generated files from npm here! ###
RUN apt-get update -yq \
&& apt-get install curl -yq \
&& pip install -r requirements.txt
EXPOSE 9595
CMD python3 run.py
One question: why do you install curl and run pip install -r requirements.txt again in the final Docker image?
Running apt-get update and install on every build without cleaning the apt cache (/var/cache/apt) produces a bigger image.
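As a minimal sketch of the apt point, the update, install, and cleanup can share a single RUN so that the package index never persists in a layer. The base image is taken from the question; the --no-install-recommends flag and the cleanup of /var/lib/apt/lists are additions not present in the original Dockerfile.

```dockerfile
FROM python:3.5-slim
# Update, install, and clean up in one layer, so the downloaded
# package lists are removed before the layer is committed.
RUN apt-get update -yq \
 && apt-get install -yq --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

If curl is not actually needed at runtime, dropping the whole RUN is simpler still.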
As a suggestion, use the docker build command with the --no-cache option to avoid reusing cached results:
docker build --no-cache -t your_image:your_tag -f your_dockerfile .
Remarks:
You'll have 3 separate Dockerfiles, as I listed above.
Build Docker images 1 and 2 only when your pip or npm requirements change; otherwise keep them fixed for your project.
If any dependency requirement changes, rebuild the affected image and then update the multi-stage Dockerfile to point to the latest built image.
Routine builds should then only build the source code of your project (CSS, JS, Python). This also makes your builds more reproducible.
To simplify copying files between the multi-stage builders, try using a virtualenv for the Python build.
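A rough sketch of the virtualenv idea follows, under some assumptions: the path /opt/venv is a hypothetical choice, and both stages use the official python:3.5 images so the interpreter sits at the same location in each. Packages with compiled extensions may still need system libraries that the slim image lacks.

```dockerfile
# Builder stage: all pip packages are installed into one directory.
FROM python:3.5 AS python-build
RUN python -m venv /opt/venv
# Put the virtualenv first on PATH so pip installs into it.
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Final stage: a single COPY brings over every installed package,
# so pip does not have to run again here.
FROM python:3.5-slim
COPY --from=python-build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . /app
EXPOSE 9595
CMD ["python3", "run.py"]
```

The point of the design is that the virtualenv is a self-contained directory, which answers the "only the files installed with the pip command" placeholder with one known path.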
I have a Dockerfile in which I am using python:3.9.2-slim-buster as the base image, and I am doing the following:
FROM lab.com:5000/python:3.9.2-slim-buster
ENV PYTHONPATH=base_platform_update
RUN apt-get update && apt-get install -y curl && apt-get clean
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
RUN chmod +x ./kubectl
RUN mv ./kubectl /usr/local/bin
WORKDIR /script
RUN pip install SomePackage
COPY base_platform_update ./base_platform_update
ENTRYPOINT ["python3", "base_platform_update/core/main.py"]
I want to convert this to use a distroless image. I tried, but it's not working. I found these resources:
https://github.com/GoogleContainerTools/distroless/blob/main/examples/python3/Dockerfile
https://www.abhaybhargav.com/stories-of-my-experiments-with-distroless-containers/
I know this is not correct, but this is what I came up with after following these resources:
# first stage
FROM lab.com:5000/python:3.9.2-slim-buster AS build-env
WORKDIR /script
COPY base_platform_update ./base_platform_update
RUN apt-get update && apt-get install -y curl && apt-get clean
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
RUN mv ./kubectl /usr/local/bin
# second stage
FROM gcr.io/distroless/python3
WORKDIR /script
COPY --from=build-env /script/base_platform_update ./base_platform_update
COPY --from=build-env /usr/local/bin/kubectl /usr/local/bin/kubectl
COPY --from=build-env /bin/chmod /bin/chmod
COPY --from=build-env /usr/local/bin/pip /usr/local/bin/pip
RUN chmod +x /usr/local/bin/kubectl
ENV PYTHONPATH=base_platform_update
RUN pip install SomePackage
ENTRYPOINT ["python3", "base_platform_update/core/main.py"]
It gives the following error:
/bin/sh: 1: pip: not found
The command '/bin/sh -c pip install SomePackage' returned a non-zero code: 127
I also thought of moving RUN pip install SomePackage to the first stage, but couldn't figure out how to do that.
Any help would be appreciated. Thanks
EDIT:
docker images output
gcr.io/distroless/python3 latest 7f711ebcfe29 51 years ago 52.2MB
gcr.io/distroless/python3 debug 7c587fbe3d02 51 years ago 53.3MB
It could be that you need to add that dir to the PATH.
ENV PATH="/usr/local/bin:$PATH"
Consider, though, the final image size after adding all those dependencies; it might not be worth the hassle.
The latest image tagged python:3.8.5-alpine is 42.7MB, while gcr.io/distroless/python3 is 52.2MB as of this writing. After adding the binaries, the script, and the package you want to install, you may well surpass that figure. If pull time matters and network bandwidth is expensive, that is worth considering; otherwise, for the current use case, it seems like too much.
Distroless images are meant only for runtime. As a result, you can't (by default) use the Python package manager to install packages; see the Google GitHub project readme:
"Distroless" images contain only your application and its runtime
dependencies. They do not contain package managers, shells or any
other programs you would expect to find in a standard Linux
distribution.
You could install the packages in a second, new stage and copy the installed packages from it to the third, but that's not guaranteed to work because of the target OS the package was built for, incompatibilities between the second and third stages, etc.
Here's an example Dockerfile for that:
# first stage
FROM python:3.8 AS builder
COPY requirements.txt .
# install dependencies to the local user directory (eg. /root/.local)
RUN pip install --user -r requirements.txt
# second unnamed stage
FROM python:3.8-slim
WORKDIR /code
# copy only the dependencies installation from the 1st stage image
COPY --from=builder /root/.local /root/.local
COPY ./src .
# update PATH environment variable
ENV PATH=/root/.local/bin:$PATH
CMD [ "python", "./server.py" ]
Dockerfile credits
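Applied to the distroless case in the question, a minimal sketch could look like the following. Several things here are assumptions rather than anything confirmed by the question: the python:3.11-slim-bookworm builder and the gcr.io/distroless/python3-debian12 runtime are picked so that both ship Python 3.11, and /script/deps is a hypothetical directory name. All the work that needs a shell, curl, chmod, or pip happens in the first stage, and the second stage only copies results.

```dockerfile
# Build stage: has a shell, a package manager, and pip.
FROM python:3.11-slim-bookworm AS build-env
WORKDIR /script
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
# Download kubectl and mark it executable here, where chmod exists.
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl \
 && chmod +x ./kubectl
# Install into a plain directory that can be copied as a unit.
RUN pip install --no-cache-dir --target=/script/deps SomePackage
COPY base_platform_update ./base_platform_update

# Runtime stage: no RUN instructions, only COPY.
FROM gcr.io/distroless/python3-debian12
WORKDIR /script
COPY --from=build-env /script/kubectl /usr/local/bin/kubectl
COPY --from=build-env /script/deps /script/deps
COPY --from=build-env /script/base_platform_update ./base_platform_update
# Make both the copied packages and the project importable.
ENV PYTHONPATH=/script/deps:base_platform_update
ENTRYPOINT ["python3", "base_platform_update/core/main.py"]
```

COPY keeps the file mode, so the executable bit set in the first stage survives. Pure-Python packages usually copy over cleanly this way; packages with compiled extensions need the builder's Python version and C library to match the runtime image.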
You could also package your application into a binary using any of several Python libraries, but that depends on how much you need it. Options include pyinstaller (though it mainly bundles the project rather than turning it into a single binary), nuitka (a rising and very popular option), and cx_Freeze.
Here's a relevant thread on the topic if you're interested.
There's also this article.
I am trying to update a dependency in docker using poetry, I have added
RUN poetry update
RUN poetry install -n
to the Dockerfile, but it doesn't update the package. There is an import error with an older version of Tortoise ORM, which requires an upgrade (verified by running the project without Docker, in a virtualenv with the newer package). The error persists with these changes to the Dockerfile.
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
RUN pip install poetry
# RUN poetry config virtualenvs.create false
COPY poetry.lock pyproject.toml ./
# for poetry
RUN mkdir -p /app/app/
RUN touch /app/app/__init__.py
RUN poetry update
RUN poetry install -n
COPY ./app /app/app
EXPOSE 8000
my Dockerfile for reference.
Right now my DOCKERFILE builds a dotnet image that is installed/updated and run inside its own pod in a Kubernetes cluster.
FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS base
ARG DOTNET_SKIP_FIRST_TIME_EXPERIENCE=true
ARG DOTNET_CLI_TELEMETRY_OPTOUT=1
WORKDIR /app
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS build
ARG DOTNET_SKIP_FIRST_TIME_EXPERIENCE=true
ARG DOTNET_CLI_TELEMETRY_OPTOUT=1
ARG ArtifactPAT
WORKDIR /src
RUN apt-get update && apt-get install -y wget && rm -rf /var/lib/apt/lists/*
COPY /src .
RUN dotnet restore "./sourceCode.csproj" -s "https://api.nuget.org/v3/index.json"
RUN dotnet build "./sourceCode.csproj" -c Release -o /app
FROM build AS publish
RUN dotnet publish "./sourceCode.csproj" -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "SourceCode.dll"]
EXPOSE 80
The cluster is very bare-bones and includes neither curl nor wget. So I need to get wget or curl installed in the pod/cluster to execute scripted commands that are set to run automatically after deployment and startup complete. The install command:
RUN apt-get update && apt-get install -y wget && rm -rf /var/lib/apt/lists/*
within the DOCKERFILE seems to do nothing to install it in the Kubernetes cluster. After the build runs and deploys, if I exec into the pod and try to run
wget --help
I get an error that wget doesn't exist. I do not have a lot of experience building DOCKERFILEs, so I am truly stumped. I want this automated in the DOCKERFILE, as I will not be able to log into environments above our Test environment to perform the install manually.
This is not related to Kubernetes or pods. You can't actually install anything into a Kubernetes pod itself; you install packages into the container images that run in the pod.
Your problem is that you install wget in your build image. When you switch to a different image further down, you lose all the installed packages, because they belong to the build image. build, base, and final are different images. You need to copy files over explicitly, as you did in the final image, like this:
COPY --from=publish /app .
So add the command below to your final image and you can use wget without any problem.
RUN apt-get update && apt-get install -y wget && rm -rf /var/lib/apt/lists/*
See this link for more info and best practices:
https://www.docker.com/blog/intro-guide-to-dockerfile-best-practices/
Everything between:
FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS base
ARG DOTNET_SKIP_FIRST_TIME_EXPERIENCE=true
ARG DOTNET_CLI_TELEMETRY_OPTOUT=1
WORKDIR /app
and:
FROM base AS final
is irrelevant to what is installed in the image you actually run. With that line, you start constructing a new image from base, which was defined in the first block.
(Incidentally, on the next line, you repeat the WORKDIR statement needlessly, since base already sets it. Also, final only names this last stage so that later FROM or COPY --from instructions could refer to it; nothing follows it, so the Dockerfile itself never uses the name, and you wouldn't want to do e.g. COPY --from=final.)
You need to install wget in either the base image, or in the last defined image which you'll actually be running, at the end.
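As a sketch of the first option, the install can move into the base stage, so every stage built FROM base inherits it. This reuses the image names and project file from the question; the ARG lines are left out for brevity, the --no-install-recommends flag is an addition, and the separate build and publish stages are collapsed into one, which is an assumption about what the project needs.

```dockerfile
FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS base
WORKDIR /app
# Installed in base, so the final stage (FROM base) contains wget.
RUN apt-get update \
 && apt-get install -y --no-install-recommends wget \
 && rm -rf /var/lib/apt/lists/*

FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS build
WORKDIR /src
COPY /src .
RUN dotnet restore "./sourceCode.csproj" -s "https://api.nuget.org/v3/index.json"
RUN dotnet publish "./sourceCode.csproj" -c Release -o /app

# Only the published output crosses over from the SDK image.
FROM base AS final
COPY --from=build /app .
EXPOSE 80
ENTRYPOINT ["dotnet", "SourceCode.dll"]
```

After deploying an image built this way, running wget --help inside the pod should find the binary, because it lives in the same image that the container runs.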
I'm trying to dockerise my pelican site project. I've created a docker-compose.yml file and a Dockerfile.
However, every time I try to build my project (docker-compose up) I get the following errors for both pip install and npm install:
npm WARN saveError ENOENT: no such file or directory, open '/src/package.json'
...
Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
The directory structure of the project is as follows:
- **Dockerfile**
- **docker-compose.yml**
- content/
- pelican-plugins/
- src/
- Themes/
- Pelican config files
- requirements.txt
- gulpfile.js
- package.js
All the pelican makefiles etc. are in the src directory.
I'm trying to load the content, src, and pelican-plugins directories as volumes so I can modify them on my local machine for the docker container to use.
Here is my Dockerfile:
FROM python:3
WORKDIR /src
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev build-essential
# Install Node.js 8 and npm 5
RUN apt-get update
RUN apt-get -qq update
RUN apt-get install -y build-essential
RUN apt-get install -y curl
RUN curl -sL https://deb.nodesource.com/setup_8.x | bash
RUN apt-get install -y nodejs
# Set the locale
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
RUN npm install
RUN python -m pip install --upgrade pip
RUN pip install -r requirements.txt
ENV SRV_DIR=/src
RUN chmod +x $SRV_DIR
RUN make clean
VOLUME /src/output
RUN make devserver
RUN gulp
And here is my docker-compose.yml file:
version: '3'
services:
web:
build: .
ports:
- "80:80"
volumes:
- ./content:/content
- ./src:/src
- ./pelican-plugins:/pelican-plugins
volumes:
logvolume01: {}
It definitely looks like I have set up my volumes directories properly in dockerfiles...
Thanks in advance!
Your Dockerfile doesn't COPY (or ADD) any files at all, so the /src directory is empty.
You can verify this yourself. When you run docker build it will print out output like:
Step 13/22 : ENV LC_ALL en_US.UTF-8
---> Running in 3ab80c3741f8
Removing intermediate container 3ab80c3741f8
---> d240226b6600
Step 14/22 : RUN npm install
---> Running in 1d31955d5b28
npm WARN saveError ENOENT: no such file or directory, open '/src/package.json'
The last line in each step with just a hex number is actually a valid image ID that's the final result of running each step, and you can then:
% docker run --rm -it d240226b6600 sh
# pwd
/src
# ls
To fix this you need a line in the Dockerfile like
COPY . .
Given the directory layout you've shown, you probably also need to change into the src subdirectory to run npm install and the like. This can look like:
WORKDIR /src
COPY . .
# Either put "cd" into the command itself
# (Each RUN command starts a fresh container at the current WORKDIR)
RUN cd src && npm install
# Or change WORKDIRs
WORKDIR /src/src
RUN pip install -r requirements.txt
WORKDIR /src
Remember that everything in the Dockerfile happens before any setting in docker-compose.yml outside the build: block is even considered. Environment variables, volume mounts, and networking options for a container have no effect on the image build sequence.
In terms of Dockerfile style, your VOLUME declaration will have some tricky unexpected side effects and probably is unnecessary; I'd remove it. Your Dockerfile is also missing the CMD that the container should run. You should also combine RUN apt-get update && apt-get install into single commands; the way Docker layer caching works and the way the Debian repositories work, it's very easy to wind up with a cached package index that names files from a week ago that don't exist any more.
While the setup you're describing is fairly popular, it also essentially hides everything the Dockerfile does with your local source tree. The npm install you're describing here, for example, will be a no-op because the volume mount will hide /src/src/node_modules. I generally find it easier to just run python, npm, etc. locally while I'm developing, rather than write and debug this 50-line YAML file and run sudo docker-compose up.
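Putting the style points together, a trimmed-down sketch might look like this. It covers only the Python side and leaves Node.js out, and it makes two assumptions: that the Makefile and requirements.txt live in the src subdirectory as described in the question, and that make devserver runs in the foreground in this project (some Pelican Makefiles start it in the background instead, in which case a different CMD is needed).

```dockerfile
FROM python:3
# One RUN for update and install, so a stale cached package index
# is never paired with a newer install step.
RUN apt-get update \
 && apt-get install -y --no-install-recommends build-essential curl \
 && rm -rf /var/lib/apt/lists/*
WORKDIR /src
# Copy the build context into the image; without this /src is empty.
COPY . .
# requirements.txt sits in the src subdirectory of the project.
WORKDIR /src/src
RUN pip install --no-cache-dir -r requirements.txt
# Runs when the container starts, not while the image is built.
CMD ["make", "devserver"]
```

Long-running commands such as the dev server belong in CMD; as a RUN step they would either block the build or exit before the image is ever used.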
FROM golang:1.8
RUN apt-get -y update && apt-get install -y curl
RUN go get -u github.com/gorilla/mux
RUN go get github.com/mattn/go-sqlite3
RUN curl -sL https://deb.nodesource.com/setup_6.x | bash - && \
apt-get install -y nodejs
COPY . /go/src/beginnerapp
WORKDIR ./src/beginnerapp/beginner-app-react
RUN npm run build
RUN go install beginnerapp/
WORKDIR /go/src/beginnerapp/beginner-app-react
VOLUME /go/src/beginnerapp/local-db
WORKDIR /go/src/beginnerapp
ENTRYPOINT /go/bin/beginnerapp
EXPOSE 8080
At the start, the golang project as well as the reactjs code don't exist on the image and need to be copied over before being able to build (js) / install (golang). Is there a way I can do that build/install process before copying files over to the image? Ideally I'd only need to copy over the golang executable and reactjs production build.
Yes, this is now possible using multi-stage builds. The idea is that you can have multiple FROM instructions in your Dockerfile, and your main image is built from the last FROM. Below is a sample pseudo structure:
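# Stage 1: build the front-end bundle (assumes package.json has a "build" script that writes to dist/)
FROM node:latest AS reactbuild
WORKDIR /app
COPY . .
RUN npm install && npm run build
# Stage 2: compile the Go binary (assumes a Go module at the context root; myapp is a placeholder name)
FROM golang:latest AS gobuild
WORKDIR /app
COPY . .
# CGO_ENABLED=0 gives a static binary that runs on alpine; if the app needs cgo,
# drop it and use a glibc-based final image instead
RUN CGO_ENABLED=0 go build -o myapp
# Final image: only the two build results are copied in
FROM alpine
WORKDIR /app
COPY --from=gobuild /app/myapp /app/myapp
COPY --from=reactbuild /app/dist /app/dist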
Please read the article below for more details:
https://docs.docker.com/engine/userguide/eng-image/multistage-build/
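Adapted to the Dockerfile in the question, a sketch might look like the following. It rests on several assumptions: that beginner-app-react sits at the top of the build context, that its build script writes to a build/ directory, and that the Go program looks for the bundle under /app/beginner-app-react/build at runtime. Because go-sqlite3 is a cgo package, the binary links against glibc, so a Debian-based final image is used here rather than alpine.

```dockerfile
# Stage 1: build the React bundle.
FROM node:6 AS reactbuild
WORKDIR /app
COPY beginner-app-react/ ./
RUN npm install && npm run build

# Stage 2: compile the Go binary in the GOPATH layout from the question.
FROM golang:1.8 AS gobuild
RUN go get github.com/gorilla/mux github.com/mattn/go-sqlite3
COPY . /go/src/beginnerapp
RUN go install beginnerapp

# Stage 3: runtime image holding only the executable and the bundle.
FROM debian:stable-slim
WORKDIR /app
COPY --from=gobuild /go/bin/beginnerapp ./beginnerapp
COPY --from=reactbuild /app/build ./beginner-app-react/build
EXPOSE 8080
ENTRYPOINT ["/app/beginnerapp"]
```

Neither the Node.js toolchain nor the Go compiler ends up in the final image; only the files named in the two COPY --from lines do.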