I am working on creating a Docker image with the following Dockerfile:
FROM node:lts-alpine
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./
RUN npm install --only=production
# Bundle app source
COPY . .
EXPOSE 8080
CMD [ "npm", "start" ]
I am confused about the following line
# Bundle app source
COPY . .
What exactly is meant here by bundling? Copying everything? If that is the case, why is it copying the package.json file beforehand?
I was confused about the exact same thing.
The solution is the distinction between your application and its dependencies: Running npm install after copying package.json only installs the dependencies (creating the node_modules folder), but your application code is still not inside the container. That's what COPY . . does. I don't think the word "bundle" has any special meaning here, since it is just a normal file copy operation.
Separating those steps means the result of npm install (i.e. the state of the image after executing the command) can be cached by Docker, so the install does not have to be re-executed every time part of the application code changes. This makes builds, and thus deploys, faster.
PS: When talking about making deploys faster, have a look at npm ci: https://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable
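As an illustration of those two points together, here is a minimal sketch of the same Dockerfile using npm ci instead (assuming a package-lock.json is present; the --only=production flag skips devDependencies, matching the original install):
FROM node:lts-alpine
WORKDIR /usr/src/app
# copying only the manifests first keeps the install layer cacheable
COPY package*.json ./
# npm ci installs exactly what package-lock.json specifies
RUN npm ci --only=production
# changes to application code only invalidate the layers from here on
COPY . .
EXPOSE 8080
CMD [ "npm", "start" ]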
Related
Docker doesn't use the build cache when something in package.json or package-lock.json changes, even if it is only the version number in the file and no dependencies have changed.
How can I make Docker use the old build cache and skip npm install (npm ci) every time?
I know that Docker checks whether the copied files have changed. But the dependencies in package.json have not changed at all, only the version number.
Below is my Dockerfile
FROM node:10 as builder
ARG REACT_APP_BUILD_NUMBER=X
ENV REACT_APP_BUILD_NUMBER="${REACT_APP_BUILD_NUMBER}"
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY .npmrc ./
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM nginx:alpine
COPY nginx/nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /usr/src/app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Here are some solutions that should help mitigate this problem. There are trade-offs with each, but they are not mutually exclusive; they can be mixed together for better overall build performance.
Solution I: Docker BuildKit cache mounts
Docker BuildKit enables partial mitigation of this problem using the experimental RUN --mount=type=cache flag. It supports a reusable cache mount during the image build process.
An important caveat here is that support for Docker BuildKit may vary significantly between CI/development environments. Check the documentation and the build environment to ensure it will have proper support (otherwise, it will error). Here are some requirements (but not necessarily an exhaustive list):
The Docker daemon needs to support BuildKit (requires Docker 18.09+).
Docker BuildKit needs to be explicitly enabled with DOCKER_BUILDKIT=1 or by default from a daemon/cli configuration.
A comment is needed at the start of the Dockerfile to enable experimental support: # syntax=docker/dockerfile:experimental
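For example, to enable BuildKit by default through the daemon configuration (on Linux the file is typically /etc/docker/daemon.json):
{
  "features": { "buildkit": true }
}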
Here is a sample Dockerfile that makes use of this feature, caching npm dependencies locally to /usr/src/app/.npm for reuse in subsequent builds:
# syntax=docker/dockerfile:experimental
FROM node
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY package.json package-lock.json /usr/src/app/
RUN --mount=type=cache,target=/usr/src/app/.npm \
npm set cache /usr/src/app/.npm && \
npm ci
Notes:
This will cache fetched dependencies locally, but npm will still need to install them into the node_modules directory. Testing with a medium-sized project indicates that this does shave off some build time, but populating node_modules can still take non-negligible time.
/usr/src/app/.npm will not be included in the final build, and is only available during build time (however, a lingering .npm directory will exist).
The build cache can be cleared if needed; see this Docker forum post.
Caching node_modules itself is not recommended, since removal of dependencies from package.json might not be properly propagated. Your mileage may vary, if attempted.
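As a usage sketch (myapp is a placeholder image name), a one-off build with BuildKit enabled via the environment variable looks like:
# enable BuildKit for this invocation only, then build as usual
DOCKER_BUILDKIT=1 docker build -t myapp .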
Solution II: Install dependencies prior to copying package.json
On the host machine, a script extracts only the dependencies and devDependencies fields from package.json and copies them to a new file, such as package-dependencies.json.
E.g. package-dependencies.json:
{
"dependencies": {
"react": "^16.13.1"
},
"devDependencies": {
"gulp": "^4.0.2",
}
}
In the Dockerfile, COPY the package-dependencies.json and package-lock.json and install dependencies. Then, copy the original package.json. Unless changes occur to package-lock.json or to package.json's dependencies/devDependencies fields, the layers will be cached and reused from a previous build, meaning minor changes to package.json (such as a version bump) will not trigger npm ci/npm install.
Here is an example:
FROM node
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# copy dependency list and locked dependencies
COPY package-dependencies.json package-lock.json /usr/src/app/
# install dependencies (npm ci reads package.json, so temporarily use the stripped-down file under that name)
RUN mv package-dependencies.json package.json && npm ci
# copy over the full package configuration
COPY package.json /usr/src/app/
# ...
RUN npm run build
# ...
Notes:
If used on its own (without Solution I), this solution will be faster for small changes such as a version bump, as it will not need to rerun npm ci.
package-dependencies.json will be in the layer history. While this file would be negligible/insignificant in size, it is still "wasted space" since it is not needed in the final image.
A quick script will be needed to generate package-dependencies.json. Depending on the build environments, this may be annoying to implement. Here is an example using the cli utility jq:
cat package.json | jq -S '. | with_entries(select(.key as $k | ["dependencies", "devDependencies"] | index($k)))' > package-dependencies.json
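A roughly equivalent but simpler filter, wrapped in a hypothetical pre-build step (note it emits null for a missing field, unlike the with_entries version; myapp is a placeholder):
# regenerate the stripped-down manifest, then build
jq -S '{dependencies, devDependencies}' package.json > package-dependencies.json
docker build -t myapp .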
Solution III: All of the above
Solution I will enable caching npm dependencies locally for faster dependency fetching. Solution II will only ever trigger npm ci/npm install if a dependency or development dependency is updated. These solutions can be used together to further accelerate build times; a combined sketch follows.
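Here is an untested sketch combining both, under the same assumptions and BuildKit caveats as above (and assuming a build script exists):
# syntax=docker/dockerfile:experimental
FROM node
WORKDIR /usr/src/app
# Solution II: copy only the dependency fields, so a version bump does not bust this layer
COPY package-dependencies.json package-lock.json ./
# Solution I: keep a persistent npm cache across builds while installing
RUN --mount=type=cache,target=/usr/src/app/.npm \
    npm set cache /usr/src/app/.npm && \
    mv package-dependencies.json package.json && \
    npm ci
# bring in the real package.json and the rest of the source
COPY package.json ./
COPY . .
RUN npm run build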
I expected it to work and tried to include it in the Dockerfile directly. Here is my whole Dockerfile:
FROM node
# make the 'app' folder the current working directory
WORKDIR /app
# copy both 'package.json' and 'package-lock.json' (if available)
COPY package*.json ./
# install project dependencies
RUN npm install
RUN npm i --save @koumoul/vuetify-jsonschema-form
RUN npm install --save axios vue-axios
RUN npm install vuetify@1.5.8
# copy project files and folders to the current working directory (i.e. 'app' folder)
COPY . .
But got
Module not found: Error: Can't resolve 'vuetify' in '/app/src/views'
It is not good practice to install packages separately from package.json; you should just include them in your package.json. But I am going to show you a technique for testing cases like this.
You can first run the image on its own with docker run -it node bash and then try there what you want to run. You can also apply a bind mount so the files you need are included, e.g. docker run -it -v "$(pwd)":/usr/src/app node bash. With this you can try out everything from your Dockerfile more directly.
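For example (a sketch; the scripts you run inside depend on your project):
# start a throwaway container with the project bind-mounted
docker run -it --rm -v "$(pwd)":/usr/src/app -w /usr/src/app node bash
# inside the container, replay the Dockerfile steps by hand, e.g.:
npm install
npm run build   # or whichever script fails during the image build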
The project version is located in package.json.
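For illustration, only the version field changes between builds; everything else in package.json stays fixed (names here are placeholders):
{
  "name": "my-app",
  "version": "1.0.1",
  "dependencies": {
    "express": "^4.16.0"
  }
}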
I have a Dockerfile, simplified as follows:
FROM node:carbon
COPY ./package.json ${DIR}/
RUN npm install
COPY . ${DIR}
RUN npm run build
Correct my understanding,
If ./package.json changes, is it true that the image layers for instructions 2 through 5 are all rebuilt?
Assuming that I do not have any changes to npm package dependencies:
How can I change the project version without Docker rebuilding the image layer for RUN npm install?
To sum up, the behavior you describe using Docker is fairly standard (as soon as package.json has changed and has a different hash, COPY package.json ./ will be triggered again, as well as each subsequent Dockerfile command).
Thus, the Docker setup outlined in the official Node.js docs does not alleviate this; it proposes the following Dockerfile:
FROM node:carbon
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./
RUN npm install
# If you are building your code for production
# RUN npm install --only=production
# Bundle app source
COPY . .
EXPOSE 8080
CMD [ "npm", "start" ]
But if you really want to avoid rerunning npm install from scratch most of the time, you could maintain two different files: package.json and, say, package-deps.json; import (and rename) package-deps.json, run npm install, then use the proper package.json afterwards.
And if you want more checks to be sure that the dependencies of both files are not out of sync, the package-lock.json file may be of some help, if you use the new npm ci feature that comes with npm 5.8 (cf. the corresponding changelog) instead of npm install.
In this case, as the latest version of npm available in Docker Hub is npm 5.6, you'll need to upgrade it beforehand.
All things put together, here is a possible Dockerfile for this use case:
FROM node:carbon
# Create app directory
WORKDIR /usr/src/app
# Upgrade npm to have "npm ci" available
RUN npm install -g npm@5.8.0
# Import conf files for dependencies
COPY package-lock.json package-deps.json ./
# Note that this REQUIRES to run the command "npm install --package-lock-only"
# before running "docker build …" and also REQUIRES a "package-deps.json" file
# that is in sync w.r.t. package.json's dependencies
# Install app dependencies
RUN mv package-deps.json package.json && npm ci
# Note that "npm ci" checks that package.json and package-lock.json are in sync
# COPY package.json ./ # subsumed by the following command
COPY . .
EXPOSE 8080
CMD [ "npm", "start" ]
Disclaimer: I did not try the solution above on a realistic example as I'm not a regular node user, but you may view this as a useful workaround for the dev phase…
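For completeness, the pre-build step required by this workaround could look like the following sketch (myapp is a placeholder, and package-deps.json can be generated however you prefer, e.g. with jq):
# refresh package-lock.json without installing anything
npm install --package-lock-only
# regenerate the dependencies-only manifest from package.json
jq -S '{dependencies, devDependencies}' package.json > package-deps.json
docker build -t myapp .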
I've got a node_modules folder which is 120MB+ and I'm wondering if we can somehow only push the node_modules folder if it has changed?
This is what my docker file looks like at the moment:
FROM node:6.2.0
# Create app directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# Install app dependencies
COPY package.json /usr/src/app/
RUN npm install
# Bundle app source
COPY . /usr/src/app
ENV NODE_ENV=production
EXPOSE 7000
CMD [ "npm", "start" ]
So what I want to do is only push the node_modules folder if it has changed! I don't mind manually specifying when the node_modules folder has changed, e.g. by passing a flag and using an if statement.
Use cases:
I only made changes to my application code and didn't add any new packages.
I added some packages and require the node_modules folder to be pushed.
Edit:
So I tried the following docker file which brought in some logic from
http://bitjudo.com/blog/2014/03/13/building-efficient-dockerfiles-node-dot-js/
When I run docker build -t <name> . with the below Dockerfile and then gcloud docker -- push <url>, it will still try to push my whole directory to the registry?!
FROM node:6.2.0
ADD package.json /tmp/package.json
RUN cd /tmp && npm install
# Create app directory
RUN mkdir -p /usr/src/app && cp -a /tmp/node_modules /usr/src/app/
WORKDIR /usr/src/app
# Install app dependencies
# COPY package.json /usr/src/app/
# RUN npm install
# Bundle app source
ADD . /usr/src/app
ENV NODE_ENV=production
EXPOSE 7000
CMD [ "npm", "start" ]
Output from running gcloud docker -- push etc...:
f614bb7269f3: Pushed
658140f06d81: Layer already exists
be42b5584cbf: Layer already exists
d70c0d3ee1a2: Layer already exists
5f70bf18a086: Layer already exists
d0b030d94fc0: Layer already exists
42d0ce0ecf27: Layer already exists
6ec10d9b4afb: Layer already exists
a80b5871b282: Layer already exists
d2c5e3a8d3d3: Layer already exists
4dcab49015d4: Layer already exists
f614bb7269f3 is always being pushed and I can't figure out why (I'm new to Docker). It's trying to push the layer containing the whole directory my app is in!?
Any ideas?
This blog post explains how to cache your dependencies in subsequent builds of your image by creating a layer that can be cached as long as the package.json file hasn't changed - http://bitjudo.com/blog/2014/03/13/building-efficient-dockerfiles-node-dot-js/
This is a link to the gist code snippet - https://gist.github.com/dweinstein/9468644
Worked wonders for our node app in my organization.
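One thing worth adding: a .dockerignore file keeps the local node_modules (and other bulk) out of the build context, so the ADD . layer only ever contains application code. A minimal sketch:
# .dockerignore
node_modules
.git
*.log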
I am trying to run webpack inside a docker container for a node app. I get the following error.
sh: 1: webpack: Permission denied
The Dockerfile works fine on a normal build.
FROM node
# Create app directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY package.json /usr/src/app/
RUN npm install
# Bundle app source
COPY . /usr/src/app
EXPOSE 3001
# This launches webpack, which fails.
CMD [ "npm", "start" ]
I had the same issue when migrating an existing project to Docker. I resolved it by not copying the entire project contents (COPY . /usr/src/app in your Dockerfile) and instead copying only the files and directories actually required.
In my case, the unnecessary directories added when copying the whole project were, among other things, node_modules, the build directory and the entire .git repo directory.
I still don't know why copying the entire directory doesn't work (something conflicts with something? something has incorrect permissions?), but copying only what you need is better for image size anyway.
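As a sketch of that approach (paths are hypothetical; adjust them to your project layout):
FROM node
WORKDIR /usr/src/app
COPY package.json /usr/src/app/
RUN npm install
# copy only what the build actually needs, instead of COPY . /usr/src/app
COPY src/ /usr/src/app/src/
COPY webpack.config.js /usr/src/app/
EXPOSE 3001
CMD [ "npm", "start" ]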