How can I use Erlang with Docker to run a Phoenix application? - docker

I want to use a Docker image in production to run a Phoenix container. However, since Elixir is just a layer on top of Erlang, it feels like it might be a waste of space to have Elixir running in my production environment.
Ideally, I would be able to compile an entire Phoenix application into Erlang, and then use an image from erlang:alpine to actually run the app in production. Something like this...
FROM elixir:alpine as builder
(install dependencies and copy files)
RUN mix compile_app_to_erlang
FROM erlang:alpine
COPY --from=builder /path/to/compiled/erlang /some/other/path
CMD ["erlang", "run"]
Note: compile_app_to_erlang is not a real command, but I'm looking for something like it. Also, I have no idea how Erlang runs, so all the code in there is completely made up.
Also, from what I know, there is a project called distillery that kind of does this, but this seems like the type of thing that shouldn't be too complicated (if I knew how Erlang worked), and I'd rather not rely on another dependency if I don't have to. Plus, it looks like if you use distillery you also have to use custom-made Docker images to run the code, which is something I try to avoid.
Is something like this even possible?
If so, anyone know a DIY solution?

Elixir 1.9 added the concept of a "release" to Mix. (This was released about 11 months after the question was initially asked.) Running mix release will generate a tree containing the BEAM runtime, your compiled application, and all of its dependencies. There is extensive documentation for the mix release task on hexdocs.pm.
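Outside of Docker, that boils down to something like the following (my_app here is a placeholder for whatever :app is named in your mix.exs):
MIX_ENV=prod mix release
# the generated launcher script lives under _build
_build/prod/rel/my_app/bin/my_app start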
In a Docker context, you can combine this with a multi-stage build to do exactly what you're requesting: start from the complete elixir image, create a tree containing the minimum required to run the image, and COPY it into a runtime image. I've been working with a Dockerfile like:
FROM elixir:1.13 AS build
WORKDIR /build
ENV MIX_ENV=prod
# Install two tools needed to build other dependencies.
RUN mix do local.hex --force, local.rebar --force
# Download dependencies.
COPY mix.exs mix.lock ./
RUN mix deps.get --only prod
# Compile dependencies (could depend on config/config.exs)
COPY config/ config/
RUN mix deps.compile
# Build the rest of the application.
COPY lib/ lib/
COPY priv/ priv/
RUN mix release --path /app
FROM ubuntu:20.04
# Get the OpenSSL runtime library
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive \
apt-get install --no-install-recommends --assume-yes \
libssl1.1
# Get the compiled application.
COPY --from=build /app /app
ENV PATH=/app/bin:$PATH
# Set ordinary metadata to run the container.
EXPOSE 4000
CMD ["myapp", "start"]
If you're using Phoenix, the Phoenix documentation has a much longer example. On the one hand that covers some details like asset compilation; on the other, its runtime image seems to have a bit more in it than may be necessary. That page also has some useful discussion on running Ecto migrations; with the Elixir fragment described there you could docker run a temporary container to do migrations, run them in an entrypoint wrapper script, or use any other ordinary Docker technique.
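For example, the throwaway-container approach might look like this, assuming you've added the MyApp.Release.migrate/0 helper that the Phoenix release docs describe (the image and env-file names are placeholders):
# run migrations in a one-off container, then exit
docker run --rm --env-file prod.env my-phoenix-app \
  myapp eval "MyApp.Release.migrate()"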

I suggest you use distillery to build a release.
Then just run an alpine container, mount the distillery release into it, and run the binary. You can even use a supervisor to run it.
You can use distillery's remote_console to attach to the console of the running release.
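Roughly, that workflow looks like this (the Mix task was renamed to distillery.release in Distillery 2.1, and my_app is a placeholder for your application name):
MIX_ENV=prod mix distillery.release
# run the release in the foreground (e.g. under a process supervisor)
_build/prod/rel/my_app/bin/my_app foreground
# from another shell, attach an IEx session to the running node
_build/prod/rel/my_app/bin/my_app remote_console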

Related

Understanding workflow of multi-stage Dockerfile

There are a few processes I'm struggling to wrap my brain around when it comes to multi-stage Dockerfiles.
Using this as an example, I have a couple of questions below it:
# Dockerfile
# Uses multi-stage builds requiring Docker 17.05 or higher
# See https://docs.docker.com/develop/develop-images/multistage-build/
# Creating a python base with shared environment variables
FROM python:3.8.1-slim as python-base
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_HOME="/opt/poetry" \
POETRY_VIRTUALENVS_IN_PROJECT=true \
POETRY_NO_INTERACTION=1 \
PYSETUP_PATH="/opt/pysetup" \
VENV_PATH="/opt/pysetup/.venv"
ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"
# builder-base is used to build dependencies
FROM python-base as builder-base
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
curl \
build-essential
# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME
ENV POETRY_VERSION=1.0.5
RUN curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python
# We copy our Python requirements here to cache them
# and install only runtime deps using poetry
WORKDIR $PYSETUP_PATH
COPY ./poetry.lock ./pyproject.toml ./
RUN poetry install --no-dev # respects
# 'development' stage installs all dev deps and can be used to develop code.
# For example using docker-compose to mount local volume under /app
FROM python-base as development
ENV FASTAPI_ENV=development
# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
# Copying in our entrypoint
COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
# venv already has runtime deps installed we get a quicker install
WORKDIR $PYSETUP_PATH
RUN poetry install
WORKDIR /app
COPY . .
EXPOSE 8000
ENTRYPOINT /docker-entrypoint.sh $0 $@
CMD ["uvicorn", "--reload", "--host=0.0.0.0", "--port=8000", "main:app"]
# 'lint' stage runs black and isort
# running in check mode means build will fail if any linting errors occur
FROM development AS lint
RUN black --config ./pyproject.toml --check app tests
RUN isort --settings-path ./pyproject.toml --recursive --check-only
CMD ["tail", "-f", "/dev/null"]
# 'test' stage runs our unit tests with pytest and
# coverage. Build will fail if test coverage is under 95%
FROM development AS test
RUN coverage run --rcfile ./pyproject.toml -m pytest ./tests
RUN coverage report --fail-under 95
# 'production' stage uses the clean 'python-base' stage and copies
# in only our runtime deps that were installed in the 'builder-base'
FROM python-base as production
ENV FASTAPI_ENV=production
COPY --from=builder-base $VENV_PATH $VENV_PATH
COPY ./docker/gunicorn_conf.py /gunicorn_conf.py
COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
COPY ./app /app
WORKDIR /app
ENTRYPOINT /docker-entrypoint.sh $0 $@
CMD [ "gunicorn", "--worker-class uvicorn.workers.UvicornWorker", "--config /gunicorn_conf.py", "main:app"]
The questions I have:
Are you docker build ... this entire image and then just docker run ... --target=<stage> to run a specific stage (development, test, lint, production, etc.) or are you only building and running the specific stages you need (e.g. docker build ... -t test --target=test && docker run test ...)?
I want to say it isn't the former because you end up with a bloated image with build kits and what not... correct?
When it comes to local Kubernetes development (minikube, skaffold, devspace, etc.) and running unit tests, are you supposed to refer to these stages in the Dockerfile (devspace Hooks or something) or use native test tools in the container (e.g. npm test, ./manage.py test, etc.)?
Thanks for clearing these questions up.
To answer from a less DevSpace-y perspective and a more general Docker-y one (with no disrespect to Lukas!):
Question 1
Breakdown
❌ Are you docker build ... this entire image and then just docker run ... --target= to run a specific stage
You're close in your understanding, and you managed to outline the approach in the second part of your query:
✅ or are you only building and running the specific stages you need (e.g. docker build ... -t test --target=test && docker run test ...)?
The --target option is not present in the docker run command, which can be seen when calling docker run --help.
I want to say it isn't the former because you end up with a bloated image with build kits and what not... correct?
Yes, the first way isn't possible: when --target is not specified, only the final stage is incorporated into your image. This is a great benefit, as it cuts down the final size of your image while still letting you split the build into multiple stages.
Details and Examples
--target is a flag you can pass in at build time so that you can choose which stage to build specifically. It's a pretty helpful option that can be used in a few different ways. There's a decent blog post here talking about the new features that came out with multi-stage builds (--target is one of them).
For example, I've had a decent amount of success building projects in CI utilising different stages and targets. The following is pseudo-code, but hopefully it gets the idea across:
# Dockerfile
FROM python as base
FROM base as dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt
FROM dependencies as test
COPY src/ src/
COPY test/ test/
FROM dependencies as publish
COPY src/ src/
CMD ...
A Dockerfile like this would enable you to do something like the following in your CI workflow (once again, pseudo-code-esque):
docker build . -t my-app:unit-test --target test
docker run my-app:unit-test pyunit ...
docker build . -t my-app:latest
docker push ...
In some scenarios it can be quite advantageous to have this fine-grained control over what gets built when, and it's quite a boon to be able to run images that comprise only a few stages without having built the entire app.
The key here is that there's no expectation that you need to use --target, but it can be used to solve particular problems.
Question 2
When it comes to local Kubernetes development (minikube, skaffold, devspace, etc.) and running unit tests, are you supposed to refer to these stages in the Dockerfile (devspace Hooks or something) or use native test tools in the container (e.g. npm test, ./manage.py test, etc.)?
Lukas covers a devspace-specific approach very well, but ultimately you can test however you like. Using devspace to make it easier to run (and remember to run) tests certainly sounds like a good idea. Whatever tool you use to enable an easier workflow will likely still use npm test etc. under the hood.
If you wish to call npm test outside of a container, that's fine; if you wish to call it in a container, that's also fine. The solution to your problem will always depend on your landscape. CI/CD helps to standardise on external factors and provides a uniform means to ensure testing is performed and deployments are auditable.
Hope that helps in any way shape or form 👍
Copying my response to this from Reddit to help others who may look for this on StackOverflow:
DevSpace maintainer here. For my workflow (and the default DevSpace behavior if you set it up with devspace init), image building is skipped during development because it tends to be the most annoying and time-consuming part of the workflow. Instead, most teams that use DevSpace have a dev image pushed to a registry and built by CI/CD, which is then referenced in devspace.yaml using replacePods.replaceImage as shown here: https://devspace.sh/cli/docs/configuration/development/replace-pods
This means that your manifests or helm charts are deployed referencing the prod images (as they should be), and then devspace will (after deployment) replace the images of your pods with dev-optimized images that ship all your tooling. Inside these pods, you can then use the terminal to build your application, run tests alongside the other dependencies running in your cluster, etc.
However, teams typically also start using DevSpace in CI/CD after a while, and then they add profiles (e.g. a prod profile or an integration-testing profile; more on https://devspace.sh/cli/docs/configuration/profiles/basics) to their devspace.yaml, where they add image building again because they want to build the images in their pipelines using kaniko or docker. For this, you would specify the build target in devspace.yaml as well: https://devspace.sh/cli/docs/configuration/images/docker#target
FWIW regarding 1: I never use docker run --target but I also always use Kubernetes directly over manual docker commands to run any workloads.

In docker with buildkit and run --mount, why is cabal install Downloading cached packages?

I am in the process of creating a Dockerfile that can build a Haskell program. The Dockerfile uses Ubuntu Focal as a base image, installs ghcup, and then builds the program. There are multiple reasons why I am doing this: it can support a low-configuration CI environment, and it can help new developers who are trying to build a complicated project.
In order to speed up build times, I am using Docker v20 with BuildKit. I have a sequence of events like this (it's quite a long file, but this excerpt is the relevant part):
# installs haskell
WORKDIR $HOME
RUN git clone https://github.com/haskell/ghcup-hs.git
WORKDIR ghcup-hs
RUN BOOTSTRAP_HASKELL_NONINTERACTIVE=NO ./bootstrap-haskell
#RUN source ~/.ghcup/env # Uh-oh: can't do this.
# We recreate the contents of ~/.ghcup/env
ENV PATH=$HOME/.cabal/bin:$HOME/.ghcup/bin:$PATH
# builds application
COPY application $HOME/application
WORKDIR $HOME/application
RUN mkdir -p logs
RUN --mount=type=cache,target=$HOME/.cabal \
--mount=type=cache,target=$HOME/.ghcup \
--mount=type=cache,target=$HOME/application/dist-newstyle \
cabal build |& tee logs/configure.log
But when I change some non-code files (README.md, for example) in application and build my Docker image ...
DOCKER_BUILDKIT=1 docker build -t application/application:1.0 .
... it takes quite a bit of time and the output from cabal build includes a lot of Downloading [blah] followed by Building/Installing/Completed messages from cabal install.
However, when I go into my container and type cabal build, it is much faster (it is already built):
host$ docker run -it application/application:1.0
container$ cabal build # this is fast
I would expect it to be just as fast in the prior case as well, since I have not really changed the code files, the dependencies are all downloaded, and I am using RUN --mount.
Are there files somewhere that my --mount=type=cache entries are not covering? Is there a package registry file somewhere that I need to include in its own --mount=type=cache line? As far as I can tell, my builds ought to be nearly instant instead of taking several minutes to complete.

Confused by Dockerfile

I feel confused by the Dockerfile and build process. Specifically, I am working my way through the book Docker on AWS, and I feel stuck until I can work through a few more of the details. The book had me write the following Dockerfile.
#Test stage
FROM alpine as test
LABEL application=todobackend
#Install basic utilities
RUN apk add --no-cache bash git
#Install build dependencies
RUN apk add --no-cache gcc python3-dev libffi-dev musl-dev linux-headers mariadb-dev py3-pip
RUN ../../usr/bin/pip3 install wheel
#Copy requirements
COPY /src/requirements* /build/
WORKDIR /build
#Build and install requirements
RUN pip3 wheel -r requirements_test.txt --no-cache-dir --no-input
RUN pip3 install -r requirements_test.txt -f /build --no-index --no-cache-dir
# Copy source code
COPY /src /app
WORKDIR /app
# Test entrypoint
CMD ["python3","manage.py","test","--noinput","--settings=todobackend.settings_test"]
The following is a list of the things I understand versus don't understand.
I understand this.
#Test stage
FROM alpine as test
LABEL application=todobackend
It is defining a 'test' stage so I can run commands like docker build --target test, which will execute all of the following commands until the next FROM ... as command indicates a different stage. LABEL is labeling the specific Docker image that is built and from which containers will be 'born' (not sure if that is the right word to use). I don't feel any confusion about that EXCEPT whether that label carries over to containers spawned from that image.
So NOW I start to feel confused.
I PARTLY understand this
#Install basic utilities
RUN apk add --no-cache bash git
I understand that apk is an overloaded term that represents both the package manager on Alpine Linux and a file type. In this context, it is a package manager command to install (or upgrade) a package on the running system. HOWEVER, I am supposed to be building / packaging up an application and all of its dependencies into an enclosed 'environment'. Sooo... where / when does this 'environment' come in? That is where I feel confused.
When the Dockerfile is running apk, is it just saying "locally, on your current machine, please install these the normal way" (i.e. the equivalent of a bash script where apk installs to its working directory)? When I run docker build --target test -t todobackend-test on my previously pasted Dockerfile, is the docker command doing both a native command execution AND a Docker Engine call to create an isolated environment for my Docker image? I feel like what must be happening is that when the docker command is run, it acts like a wrapper around the built-in package manager / bash / pip functionality AND the Docker Engine, and is doing both, but I don't know.
Anyway, I hope that this made sense. I just want some implementation details. Feel free to link documentation, but it can feel super tedious and unnecessarily detailed OR obfuscated sometimes.
I DO want to point out that if I run an apk command in my Dockerfile with a bad dependency name (e.g. python3-pip instead of py3-pip), I get a very interesting error:
/bin/sh: pip3: not found
Notice the command path. I am assuming anyone reading this will understand why that feels hella confusing.

Docker: Best practices for installing dependencies - Dockerfile or ENTRYPOINT?

Being relatively new to Docker development, I've seen a few different ways that apps and dependencies are installed.
For example, in the official Wordpress image, the WP source is downloaded in the Dockerfile and extracted into /usr/src and then this is installed to /var/www/html in the entrypoint script.
Other images download and install the source in the Dockerfile, meaning the entrypoint just deals with config issues.
Either way, the scripts have to be updated if a new version of the source is available, so one approach versus the other doesn't seem to make updating for a new version any more efficient.
What are the pros and cons of each approach? Is one recommended over the other for any specific sorts of setup?
Generally you should install application code and dependencies exclusively in the Dockerfile. The image entrypoint should never download or install anything.
This approach is simpler (you often don't need an ENTRYPOINT line at all) and more reproducible. You might run across some setups that run commands like npm install in their entrypoint script; this work will be repeated every time the container runs, and the container won't start up if the network is unreachable. Installing dependencies in the Dockerfile only happens once (and generally can be cached across image rebuilds) and makes the image self-contained.
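As an illustration of the pattern to avoid, a hypothetical entrypoint like this would repeat the install on every container start:
#!/bin/sh
# anti-pattern: re-runs on every start and requires network access
npm install
exec "$@"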
The Docker Hub wordpress image is unusual in that the underlying WordPress libraries, the custom PHP application, and the application data are all stored in the same directory tree, and it's typical to use a volume mount for that application tree. Its entrypoint script looks for a wp-includes/index.php file inside the application source tree, and if it's not there, it copies it in. That's a particularly complex entrypoint script.
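In outline (this is a simplified sketch, not the actual script), that entrypoint does something like:
#!/bin/sh
# seed the (possibly volume-mounted) application tree only if it's empty
if [ ! -e /var/www/html/wp-includes/index.php ]; then
  cp -a /usr/src/wordpress/. /var/www/html/
fi
# then hand off to the main container process (Apache, php-fpm, ...)
exec "$@"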
A generally useful pattern is to keep an application's data somewhere separate from the application source tree. If you're installing a framework, install it as a library using the host application's ordinary dependency system (for example, list it in a Node package.json file rather than trying to include it in a base image). This is good practice in general; in Docker it specifically lets you mount a volume on the data directory and not disturb the application.
For a typical Node application, for example, you might install the application and its dependencies in a Dockerfile, and not have an ENTRYPOINT declared at all:
FROM node:14
WORKDIR /app
# Install the dependencies
COPY package.json yarn.lock ./
RUN yarn install
# Install everything else
COPY . ./
# Point at some other data directory
RUN mkdir /data
ENV DATA_DIR=/data
# Application code can look at process.env.DATA_DIR
# Usual application metadata
EXPOSE 3000
CMD yarn start
...and then run this with a volume mounted for the data directory, leaving the application code intact:
docker build -t my-image .
docker volume create my-data
docker run -p 3000:3000 -d -v my-data:/data my-image

Docker: disable caching for a specific stage

I have a multi-stage Dockerfile. In stage one, I git clone from a github repo. In later stage, I do other stuff like pip etc and use a file from stage 1. I'd like to only disable caching for the first stage.
It looks like docker build --target stage1 --no-cache doesn't do what I want.
Is there a way to disable only a certain stage?
My Dockerfile looks like this:
FROM yijian/git-alpine
WORKDIR /tmp
RUN git clone https://github.com/abc/abc.git
FROM python:3.5.3-slim
RUN mkdir /app
ADD requirements.txt /app
ADD pip/pip.conf /root/.pip/pip.conf
WORKDIR /app
RUN pip3 install --upgrade pip && \
pip3 install pbr && \
pip3 install -r requirements.txt
ADD server.py /app
ADD docker/start.sh /app
RUN chmod a+x /app/start.sh
COPY --from=0 /tmp/abc/directory /usr/local/lib/python3.5/site-packages/abc/directory
EXPOSE 9092
ENTRYPOINT ["./start.sh"]
I don't believe that a single Dockerfile can have caching disabled for a specific stage. That might make a nice feature request, though I would rather see it as a declarative statement in the file than on the command line.
According to Docker's reference site:
https://docs.docker.com/engine/reference/commandline/build/#usage
The "--target" flag allows you to select a target stage from a Dockerfile, meaning that it would only run that part of the Dockerfile. I would expect that the --no-cache flag would work in conjunction with this flag, however I wouldn't expect the other sections of the Dockerfile to run.
I believe that what you want to occur would take multiple commands, which may defeat the purpose of having a multi-stage Dockerfile.
It would take more work, but depending on what you want to cache, you could possibly include a script, such as bash or PowerShell, that can accomplish this goal.
Another option (depending on your needs) may be to use a separate Docker image that caches just what you need. For instance, I created a CI build which uses a Dockerfile that only imports dependencies, and then my main build happens in a container that references that first image. I have done this with "dotnet restore" commands so that the dependencies are preloaded, and I have also done it using "npm install". This method works with any package management tool that allows you to specify a source. So where you have a project.json, you can extract the common dependencies into something like cache.package.json, then build a base image that has already done the heavy downloading for you; ideally, your more frequent builds then need to pull much less. Take advantage of the layered approach Docker offers!
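Applied to this question's Python setup, that idea might look something like the following (the image name is made up for illustration):
# deps.Dockerfile -- rebuilt only when requirements.txt changes,
# tagged and pushed as e.g. myorg/abc-deps
FROM python:3.5.3-slim
ADD requirements.txt /app/requirements.txt
ADD pip/pip.conf /root/.pip/pip.conf
WORKDIR /app
RUN pip3 install --upgrade pip && \
    pip3 install pbr && \
    pip3 install -r requirements.txt
The frequently run Dockerfile can then start with FROM myorg/abc-deps (or whatever you tag the image above), so the heavy pip work is already baked into its base image.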
If your earlier stages change more often than the later ones, you might want to consider reversing the order of the stages.
Stage 1: Set up your environment with pip (possibly in a virtual env).
Stage 2: Copy the virtual env files from the previous stage, then do the git clone.
As long as your stage 1 doesn't need to change, the cache can be used there and only the git clone part will be updated.
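A rough sketch of that reordering (the stage name and venv layout are illustrative; git now has to be installed in the final stage, or the clone could keep its own small stage as before):
FROM python:3.5.3-slim AS environment
ADD requirements.txt /app/requirements.txt
RUN python3 -m venv /venv && \
    /venv/bin/pip install --upgrade pip && \
    /venv/bin/pip install pbr && \
    /venv/bin/pip install -r /app/requirements.txt
FROM python:3.5.3-slim
RUN apt-get update && apt-get install -y --no-install-recommends git
COPY --from=environment /venv /venv
WORKDIR /tmp
RUN git clone https://github.com/abc/abc.git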
