I see that bundle install and yarn install are usually done in the Dockerfile as:
RUN bundle install && yarn install
Which means that if I modify Gemfile or yarn.lock, I need to re-build the image again. I know that there is layer caching and the docker build will not rebuild other layers except bundle install && yarn install layer. But it means I have to do docker-compose up -d --build
But I was wondering if it is ok to put these commands inside an entry script of docker-compose or in command as:
command: bundle install && yarn install && rails s
In this way, I believe, whenever I do docker-compose up -d, bundle install and yarn install will be executed without having to build the image.
I'm not sure it has any advantage over the conventional bundle install in the Dockerfile, other than not having to append --build to docker-compose up. I realize that if I do this, bundle install and yarn install will be executed even when there are no changes to the Gemfile or Yarn files. I guess this is one of the downsides.
Please correct me if it is not the ideal way to go.
New to docker world.
Running the installs at container startup wastes several minutes of time and uses up network bandwidth every time you start your application. When you're doing local development, it'd be the equivalent of doing this, every time you run the application:
rm -rf vendor node_modules
bundle install # from scratch
yarn install # from scratch
bundle exec rails s
A core part of Docker is rebuilding images (in the same way that languages like Go, Java, Typescript, etc. have a "build" phase). Trying to avoid image rebuilds isn't usually advisable. With a well-written Dockerfile, and particularly for an interpreted language, running docker build should be fairly efficient.
The one important trick is to separately copy the files that specify dependencies, and the rest of your application. As soon as a Dockerfile COPY instruction encounters a file that's changed, it invalidates the layer cache for every instruction after it. Since dependencies change relatively infrequently, a sequence that first copies the dependency file, then installs the dependencies, then copies the application can jump straight to the last step if the dependency file hasn't changed.
COPY Gemfile Gemfile.lock ./
RUN bundle install
COPY package.json yarn.lock ./
RUN yarn install
COPY . ./
(Make sure to include the Bundler vendor directory and the node_modules directory in a .dockerignore file so the last COPY step doesn't overwrite what previously got installed.)
This question is opinion based. As you already found out yourself, it is a common practice to install dependencies (bundle, yarn, others) during the image build process, and not image run process.
The rationale is that you run more times than you build, and you want the run operation to start quickly.
In the same way that you do apt install... or yum install... in the build stage, you should normally do bundle install in the build stage as well.
That said, if it makes sense to you to bundle install as a part of the entrypoint, that is your choice. I suspect that after you do it, you will see that it is less common for a reason.
Another note about Docker layers: if the Gemfile changes, not only the layer that refers to it is rebuilt, but all subsequent layers as well. For that reason, it is common to separate the copying of the dependency manifest (Gemfile.*) from the copying of the app, like this:
# Pre-install gems
COPY Gemfile* ./
RUN gem install bundler && \
bundle install --jobs=3 --retry=3
# Copy the rest of the app
COPY . .
So this way, if your app files change, but not the dependencies, the build will be faster.
Related
I'm using poetry in my project and am now working on a feature that will allow running the app inside a docker container. Now, my Dockerfile looks like this:
COPY pyproject.toml /
...
RUN poetry install
The last command takes around 4 minutes, which is quite a lot, so I thought of somehow caching these dependencies. I'm trying to convert my pyproject.toml to requirements.txt so I could feed it to Docker and it would cache the install if the file hasn't changed since the last run.
Now I'm trying:
poetry export -f requirements.txt --output requirements.txt
And it only writes dependencies from the [tool.poetry.dependencies] section, but the problem is that I have other sections and would like to see dependencies from those in my requirements.txt file. How should I modify the command above so it takes dependencies from other sections as well?
P.S. Maybe you might know other ways of how to cache poetry install in docker, I'd really appreciate that!
I think you can do a two-step dependency install with Poetry to cache dependencies, as in the example at https://pythonspeed.com/articles/poetry-vs-docker-caching/, with no need to migrate to requirements.txt. The idea is to copy only the toml file, install dependencies (this way the dependencies are cached and only need updating if the toml changes), then copy your source files (which change more often than the toml) and run install again. The linked article has a more detailed explanation.
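The two-step install described above could look roughly like the following. This is a minimal sketch, not the linked article's exact Dockerfile: the base image tag, the /app path, and the presence of a committed poetry.lock next to pyproject.toml are all assumptions.

```dockerfile
# Sketch only: base image tag and /app path are assumptions.
FROM python:3.11-slim

# Install Poetry itself (unpinned here; pin a version for reproducible builds).
RUN pip install --no-cache-dir poetry

WORKDIR /app

# Step 1: copy only the dependency manifests. This layer and the install
# below are reused from cache until one of these files changes.
# Assumes a poetry.lock is committed next to pyproject.toml.
COPY pyproject.toml poetry.lock ./
RUN poetry install --no-root

# Step 2: copy the source, which changes more often, and install the
# project itself. The dependencies are already present at this point.
COPY . .
RUN poetry install
```

The --no-root flag tells Poetry to install the dependencies without installing the project package, which is what allows the first install to run before the source code is copied.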
Problem
Most of my Jenkins builds get stuck at npm install. The issue is not reproducible locally, which makes it hard to narrow down. The build server just hangs endlessly at a "random" package until you manually stop it.
16:33:55 npm http fetch GET 200 https://registry.npmjs.org/ws/-/ws-6.2.1.tgz 737ms
Analysis
The frontend is developed with Aurelia and is part of a monorepo that is managed by Docker. This is my only project that uses Aurelia CLI so I thought I could find the problem there - but without any results.
I've already tried to analyze the issue by executing npm install --verbose but didn't gain any additional valuable information. It wasn't a specific package that led to the problem, nor was there a noticeable timeout.
# Dockerfile
FROM node:12.13.0 as builder
WORKDIR /web
COPY web .
RUN pwd
RUN npm install --verbose
RUN npm run build
FROM nginx:mainline-alpine
COPY --from=builder /web/dist /usr/share/nginx/html
COPY html/index.html /usr/share/nginx/html/index2.html
COPY nginx.conf /etc/nginx/nginx.conf
After investigating the problem for a long time, I discovered the newly introduced npm ci command and used it instead of npm install, which solved the problem. It turns out that installing a project from a clean state is a good idea ;-)
This command is similar to npm-install, except it’s meant to be used
in automated environments such as test platforms, continuous
integration, and deployment – or any situation where you want to make
sure you’re doing a clean install of your dependencies. It can be
significantly faster than a regular npm install by skipping certain
user-oriented features. It is also more strict than a regular install,
which can help catch errors or inconsistencies caused by the
incrementally-installed local environments of most npm users.
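Applied to the Dockerfile from the question, the change might look like the sketch below. It assumes web/package-lock.json exists, since npm ci requires a lockfile, and it also copies the manifests before the rest of the frontend so the install layer stays cached. That reordering is an addition for illustration, not part of the original fix.

```dockerfile
# Sketch only: assumes web/package-lock.json exists (npm ci needs a lockfile).
FROM node:12.13.0 as builder
WORKDIR /web

# Copy only the manifests first so the install layer stays cached
# until the dependencies actually change.
COPY web/package.json web/package-lock.json ./
RUN npm ci

# Copy the rest of the frontend and build it.
COPY web .
RUN npm run build

FROM nginx:mainline-alpine
COPY --from=builder /web/dist /usr/share/nginx/html
COPY html/index.html /usr/share/nginx/html/index2.html
COPY nginx.conf /etc/nginx/nginx.conf
```

If web/node_modules exists on the build machine, listing it in .dockerignore keeps the second COPY from overwriting what npm ci installed.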
When I was building and trying to run my Docker image with a Ruby on Rails app, I was getting this error:
warning Integrity check: System parameters don't match
error Integrity check failed
error Found 1 errors.
I tried changing my Dockerfile with
RUN yarn install --check-files
But that didn't do anything.
I then just deleted the yarn.lock file and my container now runs.
I am guessing the issue is that rails was run locally on my laptop, and now it is trying to run the same yarn.lock file on another computer and the integrity check is failing? Is this correct?
What should my dockerfile be doing? Should I exclude the yarn.lock file from getting into my docker container in the first place?
First of all, you will need to remove the node_modules folder and run the yarn install again. In your command line, follow these instructions:
Remove node_modules folder by typing rm -rf node_modules
Run yarn install
Run rails webpacker:install
Restart your command-line editor.
Be careful about the Node.js version on your machine. It must be the same as the version with which the Rails project was initialized. You can use nvm to manage the Node version.
Adding "config.webpacker.check_yarn_integrity = false" to "development.rb" will solve the problem.
I would suggest not copying yarn.lock into the container. You can add yarn.lock to .dockerignore, so that Docker ignores it while building the image.
I had faced similar issues because, locally I run on macOS and containers are alpine-based, so we end up ignoring the yarn.lock
I am trying to use yarn workspaces and then put my application into a Docker
image.
The folder structure looks like this:
root
  Dockerfile
  node_modules/
    libA --> ../libA
  libA/
    ...
  app/
    ...
Unfortunately Docker doesn't support symbolic links - therefore it is not possible to copy the node_modules-folder in the root directory into a Docker image, even if the Dockerfile is in the root as in my case.
One thing I could do would be to exclude the symlinks with .dockerignore and then copy the real directory to the image.
Another idea - which I would prefer - would be to have a tool that replaces the symlinks with the actual contents of the symlink. Do you know if there is such a tool (preferably a Javascript package)?
Thanks
Yarn is used for dependency management, and should be configured to run within the Docker container to install the necessary dependencies, rather than copying them from your local machine.
The major benefit of Docker is that it allows you to recreate your development environment without worrying about the machine that it is running on. The same thing applies to Yarn: running yarn install inside the image installs the right versions for the architecture of the machine your Docker image is built upon.
In your Dockerfile include the following after configuring your work directory:
RUN yarn install
Then you should be all sorted!
Another thing you should do is include the node_modules directory in your .gitignore and .dockerignore files so it is never included when distributing your code.
TL;DR: Don't copy node_modules directory from local machine, include RUN yarn install in Dockerfile
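For the workspace layout in the question, that advice could be sketched as follows. The base image tag and the /workspace path are assumptions, as is the use of Yarn 1 (classic) with a root package.json that declares libA and app as workspaces.

```dockerfile
# Sketch only: base image tag and /workspace path are assumptions;
# libA/ and app/ are the workspace folders from the question.
FROM node:18
WORKDIR /workspace

# Copy the root manifest, the lockfile, and each workspace's manifest.
COPY package.json yarn.lock ./
COPY libA/package.json ./libA/
COPY app/package.json ./app/

# Yarn creates the node_modules/libA symlink inside the image,
# so nothing needs to be copied from the host's node_modules.
RUN yarn install --frozen-lockfile

# Copy the sources. node_modules must be listed in .dockerignore
# so the host copy never reaches the image.
COPY . .
```

Because the symlink is created by yarn install inside the image, there is no need for a tool that replaces symlinks with their contents.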
I am trying to set up a docker-compose architecture for local development and production, and I can't figure out when in the container's life is the best time to install library dependencies. At the same time, I am not sure if these should be placed in the container or in an external volume.
All my code is mounted in external volumes, so that changes are immediately picked up without rebuilding the containers, but I am not sure about libraries that need to be installed by pip (I am running a Python backend) and npm/yarn (for the webpack front-end).
Placing requirements.txt and package.json into the containers and running pip install and yarn install in the container build process means that I have to rebuild the container any time dependencies change, and that is too much overhead.
Putting them in an external volume and running pip install and yarn install as part of the command of each container when it is started seems to solve the issue.
The build process of each container then contains only platform dependencies (e.g. installing python, webpack or other platform tools), but libraries are installed after the container starts (with the CMD directive).
Is this the correct approach? I have seen a lot of examples doing exactly the opposite and running npm install in the build process of the container, but I don't see any advantage in that. Am I missing something?
Installing dependencies is usually part of the build process. Mounting code is a good trick when developing in order to get changes directly reflected.
Concerning adding requirements.txt or package.json: installing dependencies takes time, and for that you need to take advantage of Docker layer caching. In particular, you want to avoid cache invalidation.
For pip I suggest the following during development: install the dependencies that you are unlikely to change in a separate RUN instruction. Your Dockerfile will look something like this:
FROM ..
RUN pip install package1 package2 package3 ...
ADD requirements.txt requirements.txt
RUN pip install -r requirements.txt
...
Keep only dependencies that might be changed in requirements.txt. Once you are done developing, add the packages back to the requirements.txt and build using the requirements file.
A similar approach would be adding two requirements files, and at the end combining them.
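The two-file variant might be sketched like this. The base image tag and the file name requirements-base.txt are hypothetical and chosen only for illustration.

```dockerfile
# Sketch only: base image tag and file names are assumptions.
FROM python:3.11-slim
WORKDIR /app

# Stable dependencies: this layer is rebuilt only when
# requirements-base.txt changes.
COPY requirements-base.txt ./
RUN pip install --no-cache-dir -r requirements-base.txt

# Dependencies still in flux: editing requirements.txt invalidates
# only the layers from here on.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Application code comes last because it changes most often.
COPY . .
```

With this ordering, a change to requirements.txt reuses the cached base install and only reruns the second pip install.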