For pip caching I put in my .travis.yml:
cache: pip
For directory caching I put in my .travis.yml:
cache:
  directories:
    - data/
What can I do to enable both types of caching simultaneously?
You can turn cache into a mapping and set pip: true alongside the directories key:
cache:
  pip: true
  directories:
    - data/
(reference: Travis documentation)
My Docker+NestJS+Webpack development environment is not running as efficiently as I would like.
Current behavior
Using Webpack with Hot Module Replacement Plugin is slower than using Nest-CLI with watch:
nest build --webpack --webpackPath webpack-hmr.config.js --watch is slower than nest start --debug --watch.
It keeps rebuilding my source files; I assume it rebuilds the entire source instead of hot-module-replacing only what changed. With a new project containing just a couple of files, it is much faster and replaces only what is needed.
Input Code
Dockerfile:
# Get current Node Alpine Linux image.
FROM node:erbium-alpine AS base
# Install essential packages to prepare for bcrypt.
# https://github.com/kelektiv/node.bcrypt.js/wiki/Installation-Instructions#alpine-linux-based-images
RUN apk --no-cache add --virtual builds-deps build-base gcc wget python
# Install is-docker to make sure the installations are only done within the container.
RUN yarn global add is-docker
# Set our node environment, either development or production;
# defaults to production, docker-compose overrides this to development on build and run.
ARG NODE_ENV=production
ENV NODE_ENV $NODE_ENV
# Expose port 3000 for node as default, unless otherwise stated.
ARG PORT=3000
ENV PORT $PORT
EXPOSE $PORT
# Install dependencies first, in a different location for easier app bind mounting for local development;
# we need to keep node_modules and its affiliated packages in
# due to default /opt permissions we have to create the dir with root and change perms
WORKDIR /home/node/
# Copy files to prepare for installation.
COPY --chown=node:node package.json yarn.lock tsconfig.json tsconfig.build.json nest-cli.json ./
FROM base AS debug
# Expose port 9229 for debugging.
EXPOSE 9229
# Install git for husky hooks.
RUN apk --no-cache add git
# The official node image provides an unprivileged user as a security best practice
# but we have to manually enable it. We put it here so npm installs dependencies as the same
# user who runs the app.
# https://github.com/nodejs/docker-node/blob/master/docs/BestPractices.md#non-root-user
USER node
# Continue copy files to prepare for installation.
COPY --chown=node:node ./webpack/webpack-hmr-debug.config.js .run-if-changedrc ./
ADD --chown=node:node .husky .husky/
ADD --chown=node:node .git .git/
# Install dependencies.
RUN yarn
# Copy source files last.
ADD --chown=node:node src src/
# Start debugging!
CMD [ "yarn", "debug" ]
docker-compose.yml:
version: "3.7"
services:
  backend_debug:
    build:
      context: ..
      target: debug
      dockerfile: ./docker/Dockerfile
      args:
        - NODE_ENV=development
    profiles:
      - debug
    stop_signal: SIGINT
    container_name: project_backend_debug
    restart: unless-stopped
    ports:
      - "3000:3000"
      - "9229:9229"
    environment:
      - MONGO_URI
    stdin_open: true # Keep stdin open regardless.
    tty: true # Show output with syntax highlighting support.
    command: ['ash']
    volumes:
      - "../webpack/webpack-hmr-debug.config.js:/home/node/webpack-hmr-debug.config.js"
      - "../tsconfig.build.json:/home/node/tsconfig.build.json"
      - "../.run-if-changedrc:/home/node/.run-if-changedrc"
      - "../nest-cli.json:/home/node/nest-cli.json"
      - "../tsconfig.json:/home/node/tsconfig.json"
      - "../node_modules:/home/node/node_modules/"
      - "../package.json:/home/node/package.json"
      - "../yarn.lock:/home/node/yarn.lock"
      - "../.husky/:/home/node/.husky/"
      - "../.git/:/home/node/.git/"
      - "../dist:/home/node/dist/"
      - "../src/:/home/node/src/"
webpack-hmr.config.js
const { RunScriptWebpackPlugin } = require('run-script-webpack-plugin'),
  nodeExternals = require('webpack-node-externals');

module.exports = function (options, webpack) {
  return {
    ...options,
    entry: ['webpack/hot/poll?100', options.entry],
    devtool: 'inline-source-map',
    externals: [
      nodeExternals({
        allowlist: ['webpack/hot/poll?100'],
      }),
    ],
    plugins: [
      ...options.plugins,
      new webpack.HotModuleReplacementPlugin(),
      new webpack.WatchIgnorePlugin({
        paths: [/\.js$/, /\.d\.ts$/],
      }),
      new RunScriptWebpackPlugin({
        name: options.output.filename,
        nodeArgs: ['--inspect=0.0.0.0:9229'], // Added to enable debugging.
      }),
    ],
  };
};
Expected behavior
HMR should speed up development, and the rebuild output should show that only the specific files that changed are replaced.
Environment
Docker-Compose: 1.29.2
Webpack: 5.28.0
Docker: 20.10.8
Nest: 7.5.5
For Tooling issues:
Node: 12.22.6
Platform: MacOS Big Sur 11.5.2
I'm trying to speed up my Google Cloud Build for a React application (github repo). Therefore I started using Kaniko Cache as suggested in the official Cloud Build docs.
It seems the npm install part of my build process is now indeed cached. However, I would have expected that npm run build would also be cached when source files haven't changed.
My Dockerfile:
# Base image has ubuntu, curl, git, openjdk, node & firebase-tools installed
FROM gcr.io/team-timesheets/builder as BUILDER
## Install dependencies for functions first
WORKDIR /functions
COPY functions/package*.json ./
RUN npm ci
## Install app dependencies next
WORKDIR /
COPY package*.json ./
RUN npm ci
# Copy all app source files
COPY . .
# THIS SEEMS TO BE NEVER CACHED, EVEN WHEN SOURCE FILES HAVEN'T CHANGED
RUN npm run build:refs \
&& npm run build:production
ARG VCS_COMMIT_ID
ARG VCS_BRANCH_NAME
ARG VCS_PULL_REQUEST
ARG CI_BUILD_ID
ARG CODECOV_TOKEN
ENV VCS_COMMIT_ID=$VCS_COMMIT_ID
ENV VCS_BRANCH_NAME=$VCS_BRANCH_NAME
ENV VCS_PULL_REQUEST=$VCS_PULL_REQUEST
ENV CI_BUILD_ID=$CI_BUILD_ID
ENV CODECOV_TOKEN=$CODECOV_TOKEN
RUN npm run test:cloudbuild \
&& if [ "$CODECOV_TOKEN" != "" ]; \
then curl -s https://codecov.io/bash | bash -s - -X gcov -X coveragepy -X fix -s coverage; \
fi
WORKDIR /functions
RUN npm run build
WORKDIR /
ARG FIREBASE_PROJECT_ID
ARG FIREBASE_TOKEN
RUN if [ "$FIREBASE_TOKEN" != "" ]; \
then firebase deploy --project $FIREBASE_PROJECT_ID --token $FIREBASE_TOKEN; \
fi
Build output:
BUILD
Pulling image: gcr.io/kaniko-project/executor:latest
latest: Pulling from kaniko-project/executor
Digest: sha256:b9eec410fa32cd77cdb7685c70f86a96debb8b087e77e63d7fe37eaadb178709
Status: Downloaded newer image for gcr.io/kaniko-project/executor:latest
gcr.io/kaniko-project/executor:latest
INFO[0000] Resolved base name gcr.io/team-timesheets/builder to builder
INFO[0000] Using dockerignore file: /workspace/.dockerignore
INFO[0000] Retrieving image manifest gcr.io/team-timesheets/builder
INFO[0000] Retrieving image gcr.io/team-timesheets/builder
INFO[0000] Retrieving image manifest gcr.io/team-timesheets/builder
INFO[0000] Retrieving image gcr.io/team-timesheets/builder
INFO[0000] Built cross stage deps: map[]
INFO[0000] Retrieving image manifest gcr.io/team-timesheets/builder
INFO[0000] Retrieving image gcr.io/team-timesheets/builder
INFO[0000] Retrieving image manifest gcr.io/team-timesheets/builder
INFO[0000] Retrieving image gcr.io/team-timesheets/builder
INFO[0001] Executing 0 build triggers
INFO[0001] Resolving srcs [functions/package*.json]...
INFO[0001] Checking for cached layer gcr.io/team-timesheets/app/cache:9307850446a7754b17d62c95be0c1580672377c1231ae34b1e16fc284d43833a...
INFO[0001] Using caching version of cmd: RUN npm ci
INFO[0001] Resolving srcs [package*.json]...
INFO[0001] Checking for cached layer gcr.io/team-timesheets/app/cache:7ca523b620323d7fb89afdd0784f1169c915edb933e1d6df493f446547c30e74...
INFO[0001] Using caching version of cmd: RUN npm ci
INFO[0001] Checking for cached layer gcr.io/team-timesheets/app/cache:1fd7153f10fb5ed1de3032f00b9fb904195d4de9dec77b5bae1a3cb0409e4530...
INFO[0001] No cached layer found for cmd RUN npm run build:refs && npm run build:production
INFO[0001] Unpacking rootfs as cmd COPY functions/package*.json ./ requires it.
INFO[0026] WORKDIR /functions
INFO[0026] cmd: workdir
INFO[0026] Changed working directory to /functions
INFO[0026] Creating directory /functions
INFO[0026] Taking snapshot of files...
INFO[0026] Resolving srcs [functions/package*.json]...
INFO[0026] COPY functions/package*.json ./
INFO[0026] Resolving srcs [functions/package*.json]...
INFO[0026] Taking snapshot of files...
INFO[0026] RUN npm ci
INFO[0026] Found cached layer, extracting to filesystem
INFO[0029] WORKDIR /
INFO[0029] cmd: workdir
INFO[0029] Changed working directory to /
INFO[0029] No files changed in this command, skipping snapshotting.
INFO[0029] Resolving srcs [package*.json]...
INFO[0029] COPY package*.json ./
INFO[0029] Resolving srcs [package*.json]...
INFO[0029] Taking snapshot of files...
INFO[0029] RUN npm ci
INFO[0029] Found cached layer, extracting to filesystem
INFO[0042] COPY . .
INFO[0043] Taking snapshot of files...
INFO[0043] RUN npm run build:refs && npm run build:production
INFO[0043] Taking snapshot of full filesystem...
INFO[0061] cmd: /bin/sh
INFO[0061] args: [-c npm run build:refs && npm run build:production]
INFO[0061] Running: [/bin/sh -c npm run build:refs && npm run build:production]
> thdk-timesheets-app@1.2.16 build:refs /
> tsc -p common
> thdk-timesheets-app@1.2.16 build:production /
> webpack --env=prod
Hash: e33e0aec56687788a186
Version: webpack 4.43.0
Time: 81408ms
Built at: 12/04/2020 6:57:57 AM
....
Now, with the overhead of the cache system, there doesn't even seem to be a speed benefit.
I'm relatively new to Dockerfiles, so hopefully I'm just missing a simple line here.
Short answer: Cache invalidation is hard.
In a RUN section of a Dockerfile, any command can be run. In general, Docker (when using local caching) or Kaniko has to decide whether such a step can be cached or not. This is usually determined by checking whether the output is deterministic - in other words: if the same command is run again, does it produce the same file changes (relative to the last image) as before?
Now, this simplistic view is not enough to identify a cacheable command, because any command can have side effects that do not touch the local filesystem - network traffic, for example. If you run curl -XPOST https://notify.example.com/build/XYZ to post a successful or failed build to some notification API, that should not be cached. Maybe your command generates a random password for an admin user and saves it to an external database - that step should never be cached either.
On the other hand, a completely reproducible npm run build could still result in two different bundled packages because of the way minifiers and bundlers work - e.g. minified and uglified builds may end up with different short variable names. Although the resulting builds are semantically the same, they are not identical at the byte level - so although this step could be cached, Docker or Kaniko have no way of recognizing that.
Distinguishing between cacheable and non-cacheable behavior automatically is therefore basically impossible, and you will keep running into caching false positives and false negatives.
When I consult clients on build pipelines, I usually split the Dockerfile into stages or put the cache-hit-or-miss logic into a script whenever Docker decides wrongly for a certain step.
When you split Dockerfiles, you have a base image (which contains all dependencies and other preparation steps) and split off the custom-cacheable part into its own Dockerfile - the latter then references the former base image. This usually means you need some form of templating in place (e.g. a FROM ${BASE_IMAGE} at the start, which is then rendered via envsubst or a more complex system like Helm).
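A minimal sketch of that templating approach (the file and image names here are made up for illustration):
# Dockerfile.app.tpl begins with:  FROM ${BASE_IMAGE}
export BASE_IMAGE=gcr.io/my-project/my-base:latest   # hypothetical base image
envsubst '${BASE_IMAGE}' < Dockerfile.app.tpl > Dockerfile.app
docker build -f Dockerfile.app -t my-app .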
If that is not suitable for your use case, you can implement the logic yourself in a script. To find out which files changed, you can use git diff --name-only HEAD HEAD~1. By combining this with some more logic, you can make the script perform certain steps only if a certain set of files changed:
#!/usr/bin/env bash
# only rebuild if something changed in 'app/' or the package files
if [[ -n "$(git diff --name-only HEAD HEAD~1 | grep -E '^(app/|package.*)')" ]]; then
  npm run build:refs
  curl -XPOST https://notify.api/deploy/$(git rev-parse --short HEAD)
  # ... further steps ...
fi
You can extend this logic to your exact needs and take full control of the caching behavior yourself - but you should only do this for steps where Docker or Kaniko produce false positives or false negatives, since all following steps will no longer be cached due to the nondeterministic behavior.
Docker doesn't use the build cache when something in package.json or package-lock.json changes, even if it is only the version number in the file and no dependencies have changed.
How can I make Docker use the old build cache and skip npm install (npm ci) every time?
I know that Docker looks at whether files were modified, but nothing in package.json has changed apart from the version number.
Below is my Dockerfile
FROM node:10 as builder
ARG REACT_APP_BUILD_NUMBER=X
ENV REACT_APP_BUILD_NUMBER="${REACT_APP_BUILD_NUMBER}"
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY .npmrc ./
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM nginx:alpine
COPY nginx/nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /usr/src/app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Here are some solutions that should help mitigate this problem. There are trade-offs with each, but they are not necessarily mutually exclusive - they can be mixed together for better overall build performance.
Solution I: Docker BuildKit cache mounts
Docker BuildKit enables partial mitigation of this problem using the experimental RUN --mount=type=cache flag. It provides a reusable cache mount during the image build process.
An important caveat here is that support for Docker BuildKit may vary significantly between CI/development environments. Check the documentation and the build environment to ensure it will have proper support (otherwise, it will error). Here are some requirements (but not necessarily an exhaustive list):
The Docker daemon needs to support BuildKit (requires Docker 18.09+).
Docker BuildKit needs to be explicitly enabled with DOCKER_BUILDKIT=1 or by default from a daemon/cli configuration.
A comment is needed at the start of the Dockerfile to enable experimental support: # syntax=docker/dockerfile:experimental
Here is a sample Dockerfile that makes use of this feature, caching npm dependencies locally to /usr/src/app/.npm for reuse in subsequent builds:
# syntax=docker/dockerfile:experimental
FROM node
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY package.json package-lock.json /usr/src/app/
RUN --mount=type=cache,target=/usr/src/app/.npm \
    npm set cache /usr/src/app/.npm && \
    npm ci
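With a Dockerfile like the one above, the build then has to be invoked with BuildKit enabled, for example (the image tag is only illustrative):
DOCKER_BUILDKIT=1 docker build -t my-app .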
Notes:
This will cache fetched dependencies locally, but npm will still need to install them into the node_modules directory. Testing with a medium-sized project indicates that this does shave off some build time, but populating node_modules can still take non-negligible time.
/usr/src/app/.npm will not be included in the final build, and is only available during build time (however, a lingering .npm directory will exist).
The build cache can be cleared if needed, see this Docker forum post; a typical command is shown after these notes.
Caching node_modules itself is not recommended, as removal of dependencies in package.json might not be properly propagated. Your mileage may vary, if attempted.
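For reference, BuildKit's build cache (including cache mounts) can typically be cleared with the following; exact flags may vary by Docker version:
docker builder prune   # add --all to remove all build cache, not just dangling entries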
Solution II: Install dependencies prior to copying package.json
On the host machine, a script extracts only the dependencies and devDependencies tags from package.json and copies them to a new file, such as package-dependencies.json.
E.g. package-dependencies.json:
{
  "dependencies": {
    "react": "^16.13.1"
  },
  "devDependencies": {
    "gulp": "^4.0.2"
  }
}
In the Dockerfile, COPY the package-dependencies.json and package-lock.json and install dependencies. Then, copy the original package.json. Unless package-lock.json or package.json's dependencies/devDependencies tags change, the layers will be cached and reused from a previous build, meaning minor changes to package.json will not trigger npm ci/npm install.
Here is an example:
FROM node
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# copy dependency list and locked dependencies
COPY package-dependencies.json package-lock.json /usr/src/app/
# install dependencies
RUN npm ci
# copy over the full package configuration
COPY package.json /usr/src/app/
# ...
RUN npm run build
# ...
Notes:
Used on its own, this solution will be faster than the first solution for small changes (such as a version bump), as it does not need to rerun npm ci at all.
package-dependencies.json will be in the layer history. While this file would be negligible/insignificant in size, it is still "wasted space" since it is not needed in the final image.
A quick script will be needed to generate package-dependencies.json. Depending on the build environment, this may be annoying to implement. Here is an example using the CLI utility jq:
cat package.json | jq -S '. | with_entries(select (.key as $k | ["dependencies", "devDependencies"] | index($k)))' > package-dependencies.json
Solution III: All of the above
Solution I enables caching npm dependencies locally for faster dependency fetching. Solution II only triggers npm ci/npm install when a dependency or development dependency is actually updated. These solutions can be used together to further accelerate build times.
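As a rough sketch of combining them (the image tag is illustrative), the host-side build could regenerate the trimmed dependency file from Solution II and then run a BuildKit-enabled build whose Dockerfile keeps the RUN --mount=type=cache line from Solution I:
# regenerate package-dependencies.json (Solution II)
cat package.json | jq -S '. | with_entries(select (.key as $k | ["dependencies", "devDependencies"] | index($k)))' > package-dependencies.json
# build with BuildKit so the npm cache mount is available (Solution I)
DOCKER_BUILDKIT=1 docker build -t my-app .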
I am working on a CloudBuild script that builds a multistage Docker image for integration testing. To optimize the build script I opted to use Kaniko. The relevant portions of the Dockerfile and cloudbuild.yaml files are available below.
cloudbuild.yaml
steps:
  # Build BASE image
  - name: gcr.io/kaniko-project/executor:v0.17.1
    id: buildinstaller
    args:
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>-installer:$BRANCH_NAME
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>-installer:$SHORT_SHA
      - --cache=true
      - --cache-ttl=24h
      - --cache-repo=gcr.io/$PROJECT_ID/<MY_REPO>/cache
      - --target=installer
  # Build TEST image
  - name: gcr.io/kaniko-project/executor:v0.17.1
    id: buildtest
    args:
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>-test:$BRANCH_NAME
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>-test:$SHORT_SHA
      - --cache=true
      - --cache-ttl=24h
      - --cache-repo=gcr.io/$PROJECT_ID/<MY_REPO>/cache
      - --target=test-image
    waitFor:
      - buildinstaller
  # --- REMOVED SOME CODE FOR BREVITY ---
  # Build PRODUCTION image
  - name: gcr.io/kaniko-project/executor:v0.17.1
    id: build
    args:
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>:$BRANCH_NAME
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>:$SHORT_SHA
      - --destination=gcr.io/$PROJECT_ID/<MY_REPO>:latest
      - --cache=true
      - --cache-ttl=24h
      - --cache-dir=/cache
      - --target=production-image
    waitFor:
      - test # TODO: This will run after tests which were not included here for brevity
images:
  - gcr.io/$PROJECT_ID/<MY_REPO>
Dockerfile
FROM ruby:2.5-alpine AS installer
# Expose port
EXPOSE 3000
# Set desired port
ENV PORT 3000
# set the app directory var
ENV APP_HOME /app
RUN mkdir -p ${APP_HOME}
WORKDIR ${APP_HOME}
# Install necessary packages
RUN apk add --update --no-cache \
build-base curl less libressl-dev zlib-dev git \
mariadb-dev tzdata imagemagick libxslt-dev \
bash nodejs
# Copy gemfiles to be able to bundle install
COPY Gemfile* ./
#############################
# STAGE 1.5: Test build #
#############################
FROM installer AS test-image
# Set environment
ENV RAILS_ENV test
# Install gems to /bundle
RUN bundle install --deployment --jobs $(nproc) --without development local_gems
# Add app files
ADD . .
RUN bundle install --with local_gems
#############################
# STAGE 2: Production build #
#############################
FROM installer AS production-image
# Set environment
ENV RAILS_ENV production
# Install gems to /bundle
RUN bundle install --deployment --jobs $(nproc) --without development test local_gems
# Add app files
ADD . .
RUN bundle install --with local_gems
# Precompile assets
RUN DB_ADAPTER=nulldb bundle exec rake assets:precompile assets:clean
# Puma start command
CMD ["bundle", "exec", "puma", "-C", "config/puma.rb"]
Since my Docker image is a multi-stage build with 2 separate end stages that share a common base build, I want to share the cache between the common portion and the other two. To accomplish this, I set all builds to share the same cache repository - --cache-repo=gcr.io/$PROJECT_ID/<MY_REPO>/cache. It has worked in all my tests thus far. However, I have been unable to ascertain if this is best practice or if another manner of caching a base image would be recommended. Is this an acceptable implementation?
I have come across Kaniko-warmer but I have been unable to use it for my situation.
Before mentioning any best practices for caching your base image specifically, there are some general best practices for optimizing the performance of your build. Since you already use Kaniko and cache layers to your repository, I believe your implementation follows those best practices.
The only suggestion I would make is to use Google Cloud Storage to reuse results from your previous builds. If your build takes a long time, and the files it produces are not large and don't take long to copy to and from Cloud Storage, this could speed up your build further.
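As a rough sketch of that idea (the bucket name and paths are made up; in Cloud Build these would typically be separate gcr.io/cloud-builders/gsutil steps in cloudbuild.yaml):
# before building: pull results from the previous build, if any
gsutil -m cp -r gs://my-project-build-cache/artifacts ./artifacts || true
# ... run the build ...
# after building: push the fresh results back for the next build
gsutil -m cp -r ./artifacts gs://my-project-build-cache/artifacts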
Furthermore, there are some best practices regarding optimization of the build cache, stated in the following article. I believe the most important of them is to:
"position the build steps that change often at the bottom of the Dockerfile. If you put them at the top, Docker cannot use its build cache for the other build steps that are changing less often. Because a new Docker image is usually built for each new version of your source code, add the source code to the image as late as possible in the Dockerfile".
Finally, another thing I would take into consideration is the cache expiration time.
Keep in mind it must be configured appropriately so that you don't miss any updates to your dependencies, but also don't end up running builds that get no benefit from the cache.
More links you may consider useful (bear in mind that these are not Google sources):
Docker documentation about Multi-stage Builds
Using Multi-Stage Builds to Simplify And Standardize Build Processes
I'm running into the errors:
ERROR in ../~/babel-polyfill/lib/index.js
Couldn't find preset "es2015-loose" relative to directory "/app"
among a few other "preset not found" errors when building a ReactJS project. It runs fine on webpack-dev-server in development.
COPY in Docker doesn't copy over dot files by default. Should I be copying .babelrc over to avoid this breaking? If so, how? If not, what am I missing or ordering wrongly in this build?
Dockerfile
FROM alpine:3.5
RUN apk update && apk add nodejs
RUN npm i -g webpack \
babel-cli \
node-gyp
ADD package.json /tmp/package.json
RUN cd /tmp && npm install
RUN mkdir -p /app && cp -a /tmp/node_modules /app/
WORKDIR /app
COPY . /app
docker-compose
version: '2.1'
services:
  webpack:
    build:
      context: .
      dockerfile: Docker.doc
    volumes:
      - .:/app
      - /app/node_modules
COPY in Docker doesn't copy over dot files by default.
This is not true. COPY in the Dockerfile copies dot files by default. I came across this question as I had faced this issue earlier. For anyone else who may encounter this issue, troubleshoot with the following:
Check your host/local directory if the dotfiles exists. If you are copying the files over from your OS's GUI, there's a chance that the dotfiles will not be ported over simply because they are hidden.
Check if you have a .dockerignore file that may be excluding these dotfiles. More info in the .dockerignore docs. A quick way to verify both points is shown below.
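A minimal shell check covering both points (the image tag and paths are illustrative):
ls -la .            # is .babelrc present on the host?
cat .dockerignore   # is .babelrc (or a pattern like .*) listed here?
docker build -t app . && docker run --rm app ls -la /app   # did .babelrc end up in the image?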