Docker-compose for local development, installing dependencies - docker

I am trying to setup docker-compose architecture for local development and production and I can't figure when in the containers life it's the best time to install library dependencies. In the same time I am not sure if these should be placed in the container or in external volume.
All my code is mounted in external volumes, so that changes are immidiately taken into without rebuilding the containers, but I am not sure about libraries that need to be installed by pip (I am running python backend) and npm/yarn (for webpack front-end).
Placing requirments.txt and package.json into the containers and running pip install and yarn install in the container build process means that I have to rebuild the container any time dependecies change - that is too much overhead.
Putting them in an external volume and running pip install and yarn install as part of the command of each container when it is started seems to solve the issue.
The build process of each container then contains only platform dependencies (eg. installing python, webpack or other platform tools), but libraries are installed after started (with CMD directive).
Is this the correct approach? I have seen lot of examples doing exactly the oposite and running npm install in the build process of the container - but I don't see any advantage for that, am I missing something?

Installing dependecies is usually part of the build process. Mounting code is a good trick when developing in order to get changes directly reflected.
Concerning adding requirements.txt or package.json. Installing dependecies takes time, and for that you need to take advantage of docker layer caching. In particular, you want to avoid cache invalidation.
For pip I suggest the following in development phase: For dependencies that you are unlikely to change, install these in separate RUN instuction. Your Docker file will look something like.
FROM ..
RUN pip install package1 package2 package3 ...
ADD requirements.txt requirements.txt
RUN RUN pip install -r requirements.txt
...
Keep only dependencies that might be changed in requirements.txt. Once you are done developing, add the packages back to the requirements.txt and build using the requirements file.
A similar approach would be adding two requirements files, and at the end combining them.

Related

Generate requirements.txt from pyproject.toml

I'm using poetry in my project and now working on a feature that will allow to run the app inside of a docker container. Now, my Dockerfile looks like this:
COPY pyproject.toml /
...
RUN poetry install
The last command takes around 4 minutes which is quite a lot so I thought of caching somehow this dependencies. I'm trying to convert my pyproject.toml to requirements.txt so I could feed it to Docker and it would cache it if the file hasn't been changed since the last run.
Now I'm trying:
poetry export -f requirements.txt --output requirements.txt
And it only writes dependencies from [tool.poetry.dependencies] section, but the problem is that I have other sections and would like to see dependencies from those in my requirements.txt file. How should I modify the command above so it would take dependencies from other sections as well.
P.S. Maybe you might know other ways of how to cache poetry install in docker, I'd really appreciate that!
I think you can do 2-step dependencies install with poetry to cache dependensies like in example here - https://pythonspeed.com/articles/poetry-vs-docker-caching/, no need to migrate to requirements.txt. The idea is to copy only toml, install dependencies (this way dependencies will cache and need to update only if toml changes), then copy you source files (which change more often, than toml) and do install again. More detailed explanation in the link above (https://pythonspeed.com/articles/poetry-vs-docker-caching/)

Upgrade a package in containerised application

I I would like to update a Yarn package inside package.json (Next.js project) within a docker container. I saw that inside the docker file we run yarn install --frozen-lockfile
For this project there is also a docker compose with other containers.
How would you do that? My first try was to run the docker compose up then yarn upgrade 'package' but I got errors not related to the package like I am running a new yarn install on my environment.
When you are upgrading anything it is always recommended NOT to do it on the live/running container. Instead, it is recommended you update what you want to update in your source code and Dockerfile and then create a NEW version of the image and deploy the new image over the old one with docker-compose in your case.
That's what best practice is striving towards. If this is possible it is recommended you go this route.

How to modify 'docker-compose-local.yml' for it to install all requirements (needed to run Amazon MWAA environment locally)

I am running aws-mwaa-local-runner in order to run a local Apache Airflow environment (in Docker for Windows).
However, after creating the container using ./mwaa-local-env start, I repeatedly get the Broken DAG ModuleNotFoundError' However, when I check my /docker/config/requirements.txt file (see here, although my file has a few more requirements that I need in it). When I compare my /docker/config/requirements.txt file with the output of pip freeze command run in Airflow container, I can see those requirements I need for my DAGs are missing.
I tried to pip install my other requirements in Airflow container but to no avail.
Is there a way to modify docker-compose-local.yml file so it installs all of my requirements.txt when creating the container (i.e. running the Airflow)?
Is there maybe something I might be missing? Any help or suggestion would be greatly appreciated.
Look at this: https://github.com/aws/aws-mwaa-local-runner . You should run the requirements file located in dags, locally.
pip install -r requirements.txt
Add your extra requirements to dags/requirements.txt, not docker/config/requirements.txt. The former is installed every time you start the service, but the latter is only installed when you build or rebuild the image.
Additionally, keeping your added requirements separate is important because you will need to upload the list to your MWAA environment.

Npm install of aurelia project in docker container gets stuck/hangs on jenkins build server

Problem
Most of my Jenkins builds get stuck at npm install. The issue is not reproducible locally what makes it hard to narrow down. The build server would just endlessly hang at a "random" package while until you'd manually stop it.
16:33:55 [0m[91mnpm http fetch GET 200 https://registry.npmjs.org/ws/-/ws-6.2.1.tgz 737ms
Analysis
The frontend is developed with Aurelia and is part of a monorepo that is managed by Docker. This is my only project that uses Aurelia CLI so I thought I could find the problem there - but without any results.
I've already tried to analyze the issue by executing npm install --verbose but didn't gain any additional valuable information. It wasn't a specific package that lead to the problem nor was it a noticeable timeout.
# Dockerfile
FROM node:12.13.0 as builder
WORKDIR /web
COPY web .
RUN pwd
RUN npm install --verbose
RUN npm run build
FROM nginx:mainline-alpine
COPY --from=builder /web/dist /usr/share/nginx/html
COPY html/index.html /usr/share/nginx/html/index2.html
COPY nginx.conf /etc/nginx/nginx.conf
After investigating the problem for a long time, I discovered the newly introduced npm ci command and used it instead of npm install which solved the problem. Unfortunately installing a project with a clean state is a good idea ;-)
This command is similar to npm-install, except it’s meant to be used
in automated environments such as test platforms, continuous
integration, and deployment – or any situation where you want to make
sure you’re doing a clean install of your dependencies. It can be
significantly faster than a regular npm install by skipping certain
user-oriented features. It is also more strict than a regular install,
which can help catch errors or inconsistencies caused by the
incrementally-installed local environments of most npm users.

Bundle install and yarn install inside entry script - Docker

I see that bundle install and yarn install are usually done in Dockerfile as:
RUN bundle install && yarn install
Which means that if I modify Gemfile or yarn.lock, I need to re-build the image again. I know that there is layer caching and the docker build will not rebuild other layers except bundle install && yarn install layer. But it means I have to do docker-compose up -d --build
But I was wondering if it is ok to put these commands inside an entry script of docker-compose or in command as:
command: bundle install && yarn install && rails s
In this way, I believe, whenever I do docker-compose up -d, bundle install and yarn install will be executed without having to build the image.
Not sure if it has any advantages over conventional bundle install in Dockerfile except not having to append --build in docker-compose up. Correct that if I do this, bundle install and yarn install will get executed even when there are no changes to Gemfile or Yarn files. I guess this is one of the bad sides.
Please correct me if it is not the ideal way to go.
New to docker world.
It wastes several minutes of time and uses up network bandwidth every time you start your application. When you're doing local development, it'd be the equivalent of doing this, every time you run the application:
rm -rf vendor node_modules
bundle install # from scratch
yarn install # from scratch
bundle exec rails s
A core part of Docker is rebuilding images (in the same way that languages like Go, Java, Typescript, etc. have a "build" phase). Trying to avoid image rebuilds isn't usually advisable. With a well-written Dockerfile, and particularly for an interpreted language, running docker build should be fairly efficient.
The one important trick is to separately copy the files that specify dependencies, and the rest of your application. As soon as a Dockerfile COPY instruction encounters a file that's changed it will disable layer caching for the rest of the application. Since dependencies change relatively infrequently, a sequence that first copies the dependency file, then installs the dependencies, then copies the application can jump straight to the last step if the dependency file hasn't changed.
COPY Gemfile Gemfile.lock ./
RUN bundle install
COPY package.json yarn.lock ./
RUN yarn install
COPY . ./
(Make sure to include the Bundler vendor directory and the node_modules directory in a .dockerignore file so the last COPY step doesn't overwrite what previously got installed.)
This question is opinion based. As you already found out yourself, it is a common practice to install dependencies (bundle, yarn, others) during the image build process, and not image run process.
The rationale is that you run more times than you build, and you want the run operation to start quickly.
In the same way that you do apt install... or yum install... in the build stage, you should normally do bundle install in the build stage as well.
That said, if it makes sense to you to bundle install as a part of the entrypoint, that is your choice. I suspect that after you do it, you will see that it is less common for a reason.
Another note about docker layers: If the Gemfile change, not only the layer that refers to it will change, but all subsequent layers as well. For that reason, it is often common to separate the copy of the dependencies manifest (Gemfile.*) from the copying of the app, like this:
# Pre-install gems
COPY Gemfile* ./
RUN gem install bundler && \
bundle install --jobs=3 --retry=3
# Copy the rest of the app
COPY . .
So this way, if your app files change, but not the dependencies, the build will be faster.

Resources