Is there a way to avoid rebuilding my Docker image each time I make a change in my source code?
I think I have already optimized my Dockerfile enough to decrease the build time, but it's still 2 commands and some waiting time for what is sometimes just one added line of code. That's longer than a simple CTRL + S and checking the results.
The commands I have to do for each little update in my code:
docker-compose down
docker-compose build
docker-compose up
Here's my Dockerfile:
FROM python:3-slim as development
ENV PYTHONUNBUFFERED=1
COPY ./requirements.txt /requirements.txt
COPY ./scripts /scripts
EXPOSE 80
RUN apt-get update && \
apt-get install -y \
bash \
build-essential \
gcc \
libffi-dev \
musl-dev \
openssl \
wget \
postgresql \
postgresql-client \
libglib2.0-0 \
libnss3 \
libgconf-2-4 \
libfontconfig1 \
libpq-dev && \
pip install -r /requirements.txt && \
mkdir -p /vol/web/static && \
chmod -R 755 /vol && \
chmod -R +x /scripts
COPY ./files /files
WORKDIR /files
ENV PATH="/scripts:/py/bin:$PATH"
CMD ["run.sh"]
Here's my docker-compose.yml file:
version: '3.9'
x-database-variables: &database-variables
POSTGRES_DB: ${POSTGRES_DB}
POSTGRES_USER: ${POSTGRES_USER}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
ALLOWED_HOSTS: ${ALLOWED_HOSTS}
x-app-variables: &app-variables
<<: *database-variables
POSTGRES_HOST: ${POSTGRES_HOST}
SPOTIPY_CLIENT_ID: ${SPOTIPY_CLIENT_ID}
SPOTIPY_CLIENT_SECRET: ${SPOTIPY_CLIENT_SECRET}
SECRET_KEY: ${SECRET_KEY}
CLUSTER_HOST: ${CLUSTER_HOST}
DEBUG: 0
services:
website:
build:
context: .
restart: always
volumes:
- static-data:/vol/web
environment: *app-variables
depends_on:
- postgres
postgres:
image: postgres
restart: always
environment: *database-variables
volumes:
- db-data:/var/lib/postgresql/data
proxy:
build:
context: ./proxy
restart: always
depends_on:
- website
ports:
- 80:80
- 443:443
volumes:
- static-data:/vol/static
- ./files/templates:/var/www/html
- ./proxy/default.conf:/etc/nginx/conf.d/default.conf
- ./etc/letsencrypt:/etc/letsencrypt
volumes:
static-data:
db-data:
Mount your script files directly in the container via docker-compose.yml:
volumes:
- ./scripts:/scripts
- ./files:/files
Keep in mind you have to use a prefix if you use a WORKDIR in your Dockerfile.
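For instance, a minimal sketch of how this could look on the website service from the compose file above (the paths come from the Dockerfile; adjust them to your layout):
services:
  website:
    build:
      context: .
    volumes:
      - static-data:/vol/web
      - ./scripts:/scripts
      - ./files:/files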
Quick answer
Is there a way to avoid rebuilding my Docker image each time I make a change in my source code?
If your app needs a build step, you cannot skip it.
In your case, you can install the requirements before the Python app, so on each source code modification you only need to restart your Python app, not the entire stack: postgres, proxy, etc.
Docker purpose
The main goal of Docker is to enable developers to package applications into containers that are easy to deploy anywhere, simplifying your infrastructure.
So, in this sense, Docker is not strictly for the development stage. During development, the programmer should use a specialized IDE (Eclipse, IntelliJ, Visual Studio, etc.) to create and update the source code. Also, some languages like Java and C# and frameworks like React/Angular need a build stage.
These IDEs have features like hot reload (automatic application updates when the source code changes), auto-completion of variables and methods, etc. These features reduce development time.
Docker for source code changes by developer
It is not the main goal, but if you don't have a specialized IDE or you are in a very limited development workspace (no admin permissions, network restrictions, Windows, blocked ports, etc.), Docker can rescue you.
If you are a Java developer (for instance), you need to install Java on your machine and some IDE like Eclipse, configure Maven, etc. With Docker, you could create an image with all the required technologies and then establish a kind of connection between your source code and the Docker container. This connection in Docker is called a volume:
docker run --name my_job -p 9000:8080 \
-v /my/python/microservice:/src \
python-workspace-all-in-one
In the previous example, you could code directly in /my/python/microservice and you only need to enter the my_job container and run python /src/main.py. It will work without Python or any other requirement on your host machine. Everything is inside python-workspace-all-in-one.
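For example, once the container started with the docker run command above is running, you edit the code on your host and execute it inside the container:
# edit the code on the host, then run it inside the container
docker exec -it my_job python /src/main.py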
In the case of technologies that need a build process, such as Java and C#, there is a time penalty because the developer must perform a build on every source code change. This is not required when using a specialized IDE, as I explained.
In the case of technologies that do not require a build process, like PHP, where only the libraries/dependencies need to be installed, Docker will work almost the same as a specialized IDE.
Docker for local development with hot-reload
In your case, your app is based on Python. Python doesn't require a build process, just the installation of libraries. So if you want to develop with Python using Docker instead of the classic way (install Python, execute python app.py, etc.), you should follow these steps:
Don't copy your source code to the container
Just pass the requirements.txt to the container
Execute the pip install inside the container
Run your app inside the container
Create a Docker volume: your source code -> an internal folder in the container
Here is an example with a Python tool (mkdocs) that supports hot-reload:
FROM python:3
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app
RUN pip install -r requirements.txt
CMD [ "mkdocs", "serve", "--dev-addr=0.0.0.0:8000" ]
and how to build it as a dev version:
docker build -t myapp-dev .
and how to run it with volumes to sync your development changes with the container:
docker run --name myapp-dev -it --rm -p 8000:8000 -v $(pwd):/usr/src/app myapp-dev
In summary, this would be the flow to run your apps with Docker during the development stage:
start the requirements before the app (database, APIs, etc.)
create a special Dockerfile for the development stage
build the Docker image for development purposes
run the app, syncing the source code with the container (-v)
the developer modifies the source code
if possible, use some kind of hot-reload library for Python (see the sketch after this list)
the app is ready to be opened from a browser
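Applied to the Django-style project from the question, such a development setup could look roughly like this (a sketch only; the manage.py location and the runserver command are assumptions about the project layout):
# docker-compose.override.yml -- development only (hypothetical)
services:
  website:
    volumes:
      - ./files:/files                               # sync source code with the container
    environment:
      DEBUG: 1
    command: python manage.py runserver 0.0.0.0:80   # Django's dev server reloads on code changes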
Docker for local development without hot-reload
If you cannot use a hot-reload library, you will need to build and run the image whenever you want to test your source code modifications. In this case, you should copy the source code into the container instead of synchronizing it with volumes as in the previous approach:
FROM python:3
RUN mkdir -p /usr/src/app
COPY . /usr/src/app
WORKDIR /usr/src/app
RUN pip install -r requirements.txt
RUN mkdocs build
WORKDIR /usr/src/app/site
CMD ["python", "-m", "http.server", "8000" ]
Steps should be:
start the requirements before the app (database, APIs, etc.)
create a special Dockerfile for the development stage
the developer modifies the source code
build
docker build -t myapp-dev .
run
docker run --name myapp-dev -it --rm -p 8000:8000 myapp-dev
Related
I'm trying to connect a JSON file, which resides in a Docker volume of the following container, to my main Docker container, which is running a Django project.
Since I am using CapRover, my Docker Compose options are very limited.
So Docker Compose is not really an option. I instead want to just expose the JSON file over the web with a link.
Something like domain.com/folder/jsonfile.json
Can somebody tell me if this is possible inside this Dockerfile?
The image I am using is crucial to the container, so can I just add an nginx image, or do I need any other changes to make this work?
Or is nginx not even necessary?
FROM ubuntu:devel
ENV TZ=Etc/UTC
ARG APP_HOME=/app
WORKDIR ${APP_HOME}
ENV DEBIAN_FRONTEND=noninteractive
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime
RUN echo $TZ > /etc/timezone
RUN apt-get update && apt-get upgrade -y
RUN apt-get install gnumeric -y
RUN mkdir -p /etc/importer/data
RUN mkdir /voldata
COPY config.toml /etc/importer/
COPY datasets/* /etc/importer/data/
VOLUME /voldata
COPY importer /usr/bin/
RUN chmod +x /usr/bin/importer
COPY . ${APP_HOME}
CMD sleep 999d
Using the same volume in 2 containers
docker-compose:
volumes:
shared_vol:
services:
service1:
volumes:
- 'shared_vol:/path/to/file'
service2:
volumes:
- 'shared_vol:/path/to/file'
The mechanism above replaces volumes_from since v3, but it works for v2 as well; with v2 you could alternatively use volumes_from:
volumes:
shared_vol:
services:
service1:
volumes:
- 'shared_vol:/path/to/file'
service2:
volumes_from:
- service1
If you want to avoid unintentional modification, add :ro (read-only) for the target service:
service1:
volumes:
- 'shared_vol:/path/to/file'
service2:
volumes:
- 'shared_vol:/path/to/file:ro'
http-server
You can certainly provide the file via HTTP (or another protocol). There are two options:
Include an HTTP service in your container (quite easy, depending on what is already available in the container); e.g. with Node.js you can use https://www.npmjs.com/package/http-server very easily. Size doesn't matter? Then just install it:
RUN apt-get install -y nodejs npm
RUN npm install -g http-server
EXPOSE 8080
CMD ["http-server", "--cors", "-p8080", "/path/to/your/json"]
docker-compose (it runs on 8080 by default, so expose that port):
existing_service:
ports:
- '8080:8080'
Run a standalone HTTP server (nginx, Apache httpd, ...) in another container, but then you again depend on using the same volume for two services, so for local solutions this is rather overkill.
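As a rough sketch of that second option (the extra service and volume names are made up), an additional nginx container could share the volume and serve it read-only:
services:
  existing_service:
    volumes:
      - shared_vol:/path/to/file
  json-server:
    image: nginx:alpine
    ports:
      - '8081:80'
    volumes:
      - shared_vol:/usr/share/nginx/html:ro   # nginx serves this directory by default
volumes:
  shared_vol: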
Base image
Unless you have good reasons, I would never use something like :devel, :rolling or :latest as a base image. Stick to an LTS version instead, like ubuntu:22.04.
Testing for http-server
Dockerfile
FROM ubuntu:20.04
ENV TZ=Etc/UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update
RUN apt-get install -y nodejs npm
RUN npm install -g http-server@13.1.0 # Issue with JSON files in v14: https://github.com/http-party/http-server/issues/634
COPY ./test.json ./usr/wwwhttp/test.json
EXPOSE 8080
CMD ["http-server", "--cors", "-p8080", "/usr/wwwhttp/"]
# docker build -t test/httpserver:latest .
# docker run -p 8080:8080 test/httpserver:latest
Disclaimer:
I am not that familiar with Node Docker images; this is just to give a quick working solution to go on from there. I'm not using Node.js in production, but I'm sure it can be optimized from being fat to... well... being rather fat. But for quick prototyping, size doesn't matter.
If you just want two containers to access the same file, use a volume with --mount.
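A minimal sketch of that with plain docker run (the image names are placeholders):
docker volume create shared_vol
docker run -d --name writer --mount type=volume,source=shared_vol,target=/data some-writer-image
docker run -d --name reader --mount type=volume,source=shared_vol,target=/data,readonly nginx:alpine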
I am writing this request today because I would like to create my first Docker container. I watched a lot of tutorials, and there I came across a problem that I cannot solve; I must have missed a piece of information.
My program is quite basic: I would like to create a volume so as not to lose the retrieved information each time the container is launched.
Here is my docker-compose
version: '3.3'
services:
homework-logger:
build: .
ports:
- '54321:1235'
volumes:
- ./app:/app
image: 'cinabre/homework-logger:latest'
networks:
- homeworks
networks:
homeworks:
name: homeworks-logger
and here is my Dockerfile
FROM debian:9
WORKDIR /app
RUN apt-get update -yq && apt-get install wget curl gnupg git apt-utils -yq && apt-get clean -y
RUN apt-get install python3 python3-pip -y
RUN git clone http://192.168.5.137:3300/Cinabre/Homework-Logger /app
VOLUME /app
RUN ls /app
RUN python3 -m pip install bottle beaker bottle-cork requests
CMD ["python3", "main.py"]
I did an "LS" in the container to see if the / app folder was empty: it is not
Any ideas?
thanks in advance !
Volumes are there to hold your application data, not its code. You don't usually need the Dockerfile VOLUME directive and you should generally avoid it unless you understand exactly what it does.
In terms of workflow, it's commonplace to include the Dockerfile and similar Docker-related files in the source repository yourself. Don't run git clone in the Dockerfile. (Credential management is hard; building a non-default branch can be tricky; layer caching means Docker won't re-pull the branch if it's changed.)
For a straightforward application, you should be able to use a near-boilerplate Dockerfile:
# Use the official Python image unless you have a strong need to hand-install it
FROM python:3.9
WORKDIR /app
# Install packages first. Unless requirements.txt changes, Docker
# layer caching won't repeat this step. Do not list out individual
# packages in the Dockerfile; list them in Python-standard setup.py
# or Pipfile.
COPY requirements.txt .
# ...in the "system" Python space, not a virtual environment.
RUN pip3 install -r requirements.txt
# Copy the rest of the application in.
COPY . .
# Set the default command to run the container, and other metadata.
EXPOSE 1235
CMD ["python3", "main.py"]
In your application code you need to know where to store the data. You might put this in an environment variable:
import os
DATA_DIR = os.environ.get('DATA_DIR', '.')
with open(f"${DATA_DIR}/output.txt", "w") as f:
...
Then in your docker-compose.yml file, you can specify an alternate data directory and mount that into your container. Do not mount a volume over the /app directory containing your application's source code.
version: '3.8'
services:
homework-logger:
build: .
image: 'cinabre/homework-logger:latest' # names the built image
ports:
- '54321:1235'
environment:
- DATA_DIR=/data # (consider putting this in the Dockerfile)
volumes:
- homework-data:/data # (could bind-mount `./data:/data` instead)
# Use the automatic `networks: [default]`
volumes:
homework-data:
I have a simple Dockerfile
FROM python:3.8-slim-buster
RUN apt-get update && apt-get install
RUN apt-get install -y \
curl \
gcc \
make \
python3-psycopg2 \
postgresql-client \
libpq-dev
RUN mkdir -p /var/www/myapp
WORKDIR /var/www/myapp
COPY . /var/www/myapp
RUN chmod 700 ./scripts/*.sh
And an associated docker-compose file
version: "3"
volumes:
postgresdata:
services:
myapp:
image: ralston3/myapp_api:prod-latest
tty: true
command: /bin/bash -c "/var/www/myapp/scripts/myscript.sh && echo 'hello world'"
ports:
- 8000:8000
volumes:
- .:/var/www/myapp
environment:
SOME_ENV_VARS=SOME_VARIABLE
# ... more here
depends_on:
- redis
- postgresql
# ... other docker services defined below
When I run docker-compose up via:
docker-compose -f /path/to/docker-compose.yml up
My myapp container/service fails with myapp_myapp_1 exited with code 127 with another error mentioning myapp_1 | /bin/sh: 1: /var/www/myapp/scripts/myscript.sh: not found
Further, if I exec into the myapp container via docker exec -it {CONTAINER_ID} /bin/bash I can clearly see that all of my files are there. I can literally run the /var/www/myapp/scripts/myscript.sh and it works fine.
However, there seems to be some issue with docker-compose (which could totally be my mistake). I'm just confused as to how I can exec into the container and clearly see the files there, yet docker-compose exits with 127 saying "No such file or directory".
You are bind mounting the current directory into "/var/www/myapp", so it may be that your local directory is "hiding/overwriting" the container directory. Try removing the volumes declaration for your myapp service, and if that works then you know the bind mount is causing the issue.
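For example (a sketch based on the compose file from the question), keep everything else and drop only the bind mount:
services:
  myapp:
    image: ralston3/myapp_api:prod-latest
    tty: true
    command: /bin/bash -c "/var/www/myapp/scripts/myscript.sh && echo 'hello world'"
    # volumes:
    #   - .:/var/www/myapp   # removed: this bind mount can hide the files baked into the image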
Unrelated to your question, but a problem you will also encounter: you're installing Python a second time, above and beyond the version pre-installed in the python Docker image.
Either switch to debian:buster as the base image, or don't bother installing anything with apt-get and instead just pip install your dependencies like psycopg2.
See https://pythonspeed.com/articles/official-python-docker-image/ for explanation why you don't need to do this.
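A rough sketch of that simplification, based on the original Dockerfile (whether you need any apt packages at all depends on which psycopg2 variant you install):
FROM python:3.8-slim-buster
WORKDIR /var/www/myapp
COPY . /var/www/myapp
# psycopg2-binary ships its own libpq, so gcc/libpq-dev are not needed;
# plain psycopg2 would still require: apt-get install -y gcc libpq-dev
RUN pip install psycopg2-binary
RUN chmod 700 ./scripts/*.sh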
In my case there were 2 stages: builder and runner.
I was building an executable in the builder stage and running that executable on an Alpine image in the runner stage.
My mistake here was that I didn't use the Alpine version for the builder. For example, I used golang:1.20, but when I used golang:1.20-alpine the problem went away.
Make sure you use the correct version and tag!
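A sketch of the matching builder/runner pair described above (paths and the runner tag are assumptions); keeping both stages on the same libc (musl, in Alpine's case) is what makes the binary runnable:
FROM golang:1.20-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /app .

FROM alpine:3.18
COPY --from=builder /app /app
CMD ["/app"]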
I want to start a Python container that depends on a database container, but I would like the Python container to start only after the SQL Server container has fully executed. I built this docker-compose.yml file ...
version: "3.2"
services:
sql-server-db:
restart: always
build: ./
container_name: sql-server-db
image: microsoft/mssql-server-linux:2017-latest
env_file: /Users/davea/my_project/api/tests/.test_env
ports:
- 3900:1433
environment:
- ACCEPT_EULA=Y
- SA_PASSWORD=password
- DB_HOST=0.0.0.0
- DB_NAME=my_db
- DB_USER=SA
- DB_PASS=password
volumes:
- ../../CloudDB/CloudDB:/sqlscripts
python:
restart: always
build: ../
environment:
DEBUG: 'true'
volumes:
- /Users/davea/my_project/api:/my-app
depends_on:
- sql-server-db
Below is my Dockerfile for the sql server container ...
FROM microsoft/mssql-server-linux:latest
RUN apt-get update
RUN apt-get install unzip -y
RUN apt-get install tzdata
ENV TZ=America/New_York
RUN ln -fs /usr/share/zoneinfo/$TZ /etc/localtime && dpkg-reconfigure -f noninteractive tzdata
RUN date
RUN echo "========="
# Install sqlpackage, needed for deplying dacpac file
RUN wget -progress=bar:force -q -O sqlpackage.zip https://go.microsoft.com/fwlink/?linkid=873926 \
&& unzip -qq sqlpackage.zip -d /opt/sqlpackage \
&& chmod +x /opt/sqlpackage/sqlpackage
# Create work directory
RUN mkdir -p /usr/work
WORKDIR /usr/work
# Copy all SQL scripts into working directory
COPY . /usr/work/
# Grant permissions for the import-data script to be executable
RUN chmod +x /usr/work/import-data.sh
RUN pwd
CMD /bin/bash ./entrypoint.sh
But I'm noticing something strange: the SQL Server container does not seem to be fully executing all the commands in the entrypoint.sh file. I see this output ...
...
Removing intermediate container 72550d896ede
---> ae6b93ca884e
Step 14/15 : RUN pwd
---> Running in f229ef6fec4c
/usr/work
Removing intermediate container f229ef6fec4c
---> 7758242bbd95
Step 15/15 : CMD /bin/bash ./entrypoint.sh
---> Running in 76fa5c8308e3
Removing intermediate container 76fa5c8308e3
---> 567633ad757f
Successfully built 567633ad757f
Successfully tagged microsoft/mssql-server-linux:2017-latest
WARNING: Image for service sql-server-db was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Building python
Step 1/17 : FROM python:3.8-slim
Below are the contents of the entrypoint.sh file. Is there another way I can structure things so that the commands are executed? I'm noticing the Python container doesn't seem to recognize the SQL Server container either.
#!/bin/bash -l
/usr/work/import-data.sh & /opt/mssql/bin/sqlservr
Is there something else I need to do to get the shell script in my SQL Server container to fully execute?
Your usage of the depends_on option is incorrect, or perhaps it is not working the way you intend it to.
See the documentation of depends_on. It clearly states that it does not wait for the database to be ready in the case of SQL servers.
depends_on only waits until the dependency service has been started, not until it is ready.
depends_on does not wait for db and redis to be “ready” before starting web - only until they have been started. If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.
You would benefit from creating some sort of manual "wait-for-it" script (as seen in this docker-compose example) before starting the Python container.
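One simple variant of such a wait, sketched directly in the compose file (this assumes a netcat binary exists in the Python image and that the app's entry point is main.py; otherwise bake a proper wait-for-it.sh into the image):
services:
  python:
    depends_on:
      - sql-server-db
    command: >
      sh -c "until nc -z sql-server-db 1433; do
               echo 'waiting for SQL Server...'; sleep 2;
             done; python main.py"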
While dockerizing MLflow, only .trash is getting created.
Because of that, in the MLflow UI I am getting the error "no experiments exist".
Dockerfile
FROM python:3.7.0
RUN pip install mlflow==1.0.0
WORKDIR /data
EXPOSE 5000
CMD mlflow server \
--backend-store-uri /data/ \
--default-artifact-root /data/ \
--host 0.0.0.0
docker-compose:
mlflow:
# builds track_ml Dockerfile
build:
context: ./mlflow_dockerfile
expose:
- "5000"
ports:
- "5000:5000"
volumes:
- ./data:/data
You can use this Dockerfile, taken from mlflow-workshop, which is more generic and supports different environment variables for debugging and working with different versions.
By default it will store the artifacts and files inside /opt/mlflow. It's possible to define the following variables (defaults in parentheses):
MLFLOW_HOME (/opt/mlflow)
MLFLOW_VERSION (0.7.0)
SERVER_PORT (5000)
SERVER_HOST (0.0.0.0)
FILE_STORE (${MLFLOW_HOME}/fileStore)
ARTIFACT_STORE (${MLFLOW_HOME}/artifactStore)
Dockerfile
FROM python:3.7.0
LABEL maintainer="Albert Franzi"
ENV MLFLOW_HOME /opt/mlflow
ENV MLFLOW_VERSION 0.7.0
ENV SERVER_PORT 5000
ENV SERVER_HOST 0.0.0.0
ENV FILE_STORE ${MLFLOW_HOME}/fileStore
ENV ARTIFACT_STORE ${MLFLOW_HOME}/artifactStore
RUN pip install mlflow==${MLFLOW_VERSION} && \
mkdir -p ${MLFLOW_HOME}/scripts && \
mkdir -p ${FILE_STORE} && \
mkdir -p ${ARTIFACT_STORE}
COPY scripts/run.sh ${MLFLOW_HOME}/scripts/run.sh
RUN chmod +x ${MLFLOW_HOME}/scripts/run.sh
EXPOSE ${SERVER_PORT}/tcp
VOLUME ["${MLFLOW_HOME}/scripts/", "${FILE_STORE}", "${ARTIFACT_STORE}"]
WORKDIR ${MLFLOW_HOME}
ENTRYPOINT ["./scripts/run.sh"]
scripts/run.sh
#!/bin/sh
mlflow server \
--file-store $FILE_STORE \
--default-artifact-root $ARTIFACT_STORE \
--host $SERVER_HOST \
--port $SERVER_PORT
Launch MLFlow Tracking Docker
docker build -t my_mflow_image .
docker run -d -p 5000:5000 --name mlflow-tracking my_mflow_image
Run trainings
Since we have our MLflow tracking container exposed on port 5000, we can log executions by setting the environment variable MLFLOW_TRACKING_URI.
MLFLOW_TRACKING_URI=http://localhost:5000 python example.py
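For example, a sketch of what example.py could contain (the parameter and metric names are made up):
# example.py -- hypothetical training script logging to the tracking server
import mlflow

# picked up from the MLFLOW_TRACKING_URI environment variable, or set explicitly:
mlflow.set_tracking_uri("http://localhost:5000")

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.72)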
Also, it is better to remove - ./data:/data on the first run and debug without the mount; with the suggested Dockerfile you might need to mount a different path, matching the ENV variables mentioned above, depending on your needs.
Here is a link to GitHub where I put MLflow in a Docker setup that uses Azurite in the background, so the models can also be pulled from it later.
As a short note: however you execute your script, you need to give it the address where it should save the artifacts. You can do this with .env files or set these things manually.
set MLFLOW_TRACKING_URI=http://localhost:5000
It is important to give this information not only to your Docker container but also to the script for the model training ;)
Here you can find a complete tutorial on how to use MLflow and scikit-learn together in different scenarios, since it also gets a bit tricky later on.
I hope this gives you enough inspiration on how to use it.