Best practice working with AI models from inside docker containers - docker

I’m using TensorFlow docker images for the first time. Before I get going with big time investments, I want to make sure I understand where files should be. Should I store, run, create, save all files inside the container and remove what I want to later? Should any files remain on the host?

Always edit your files outside the container. I recommend using Docker Compose to set up your Docker environment. Here's an example:
# Use version 2.3 of the Compose file format to access the GPU with NVIDIA-Docker
# (the 2.x file format is the one that supports the runtime: option)
version: '2.3'
services:
  ai_container:
    image: ai_container
    container_name: ai_container
    working_dir: /ai_container/scripts
    build:
      context: .
      dockerfile: Dockerfile
    # You may want to expose port 6006 to use TensorBoard
    ports:
      - "6006:6006"
    # Mount your scripts, logs, results and datasets (the datasets read-only)
    volumes:
      - ./scripts:/ai_container/scripts
      - ./logs:/ai_container/logs
      - ./results:/ai_container/results
      - /hdd/my_heavy_dataset_folder/:/datasets:ro
    # Depending on the task you may need extra shared memory
    shm_size: '8gb'
    # This enables the GPU (requires NVIDIA-Docker)
    runtime: nvidia
    # Start TensorBoard to keep the container alive
    command: tensorboard --host 0.0.0.0 --logdir /ai_container/logs
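With a layout like that, a typical day-to-day flow might look roughly like this (train.py is just a placeholder for one of your scripts):

docker-compose up -d                                  # builds the image and starts TensorBoard
docker-compose exec ai_container python train.py      # runs a script that lives in ./scripts on the host
docker-compose down                                   # container goes away; logs and results stay in ./logs and ./results

Since everything you care about lives in bind-mounted host directories, the container itself stays disposable.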

Related

How to use other docker image inside my docker container

I have an application that requires the PyTorch library to function, which I've installed on my computer. However, I want to deploy my application and use PyTorch inside a container. To achieve this, I plan to use a pre-built PyTorch Docker image and link it to my container in my Docker Compose file.
In my Docker Compose file, I have included the PyTorch Docker image as a separate service that my application container depends on. I have also included an environment variable specifying the path where PyTorch is installed inside the PyTorch container. Finally, I have mounted the source code and data directories as volumes in the application container, so that the application code can access them:
version: '3'
services:
  myapp:
    build: .
    image: myapp:latest
    environment:
      - PYTHONPATH=/usr/local/lib/python3.9/site-packages
    volumes:
      - ./src:/app/src
      - ./data:/app/data
    ports:
      - "8000:8000"
    depends_on:
      - pytorch
  pytorch:
    image: pytorch/pytorch:latest
I am getting this error:
ModuleNotFoundError: No module named 'torch'
As I understand it, depends_on will not make my app use the PyTorch Docker image.
Important: I don't want to build a Docker container or image on top of the PyTorch image. I need to make my Docker container use the PyTorch image.

Conditionalizing bind mounted volumes for Docker Compose

Please note: my question mentions MySQL, but it is a Docker/Docker Compose volume management question at heart, and as such, should be answerable by anyone with decent experience in that area, regardless of their familiarity with MySQL.
My understanding is that Dockerized MySQL containers, when defined from inside a Docker Compose file like below, will be ephemeral, meaning they store all data on the container itself (no bind mounts, etc.) and so when the container dies, the data is gone as well:
version: "3.7"
services:
my-service-db:
image: mysql:8
container_name: $MY_SERVICE_DB_HOST
command: --default-authentication-plugin=mysql_native_password
restart: always
ports:
- $MY_SERVICE_DB_PORT:$MY_SERVICE_DB_PORT
environment:
MYSQL_ROOT_PASSWORD: $MY_SERVICE_DB_ROOT_PASSWORD
MYSQL_DATABASE: my_service_db_$MY_ENV
MYSQL_USER: $MY_SERVICE_DB_APP_USER
MYSQL_PASSWORD: $MY_SERVICE_DB_APP_PASSWORD
other-service-definitions-omitted-for-brevity:
- etc.
To begin with, if that understanding is incorrect, please begin by correcting me! Assuming it's more or less correct...
Let's call this Ephemeral Mode.
But by providing a bind mount volume to that service definition, we can specify an external location for where data should be stored, and so the data will persist across service runs (compose ups/downs):
version: "3.7"
services:
my-service-db:
image: mysql:8
container_name: $MY_SERVICE_DB_HOST
command: --default-authentication-plugin=mysql_native_password
restart: always
ports:
- $MY_SERVICE_DB_PORT:$MY_SERVICE_DB_PORT
environment:
MYSQL_ROOT_PASSWORD: $MY_SERVICE_DB_ROOT_PASSWORD
MYSQL_DATABASE: my_service_db_$MY_ENV
MYSQL_USER: $MY_SERVICE_DB_APP_USER
MYSQL_PASSWORD: $MY_SERVICE_DB_APP_PASSWORD
volumes:
- ./my-service-db-data:/var/lib/mysql
other-service-definitions-omitted-for-brevity:
- etc.
Let's call this Persistent Mode.
There are times when I will want to run my Docker Compose file in Ephemeral Mode, and other times, run it in Persistent Mode.
Is it possible to make the volumes definition (inside the Docker Compose file) conditional somehow? So that sometimes I can run docker-compose up -d <SPECIFY_EPHEMERAL_MODE_SOMEHOW>, and other times I can run docker-compose up -d <SPECIFY_PERSISTENT_MODE_SOMEHOW>?
You can have multiple Compose files that work together, where you have some base file and then other files that extend the definitions in the base file.
Without extra setup, Compose looks for docker-compose.override.yml alongside the main docker-compose.yml. Since the only difference between the "ephemeral" and "persistent" mode is the volumes: declaration, you can have an override file that only contains that:
# docker-compose.override.yml
version: '3.8'
services:
  my-service-db:        # matches main docker-compose.yml
    volumes:            # added to the base definition
      - ./my-service-db-data:/var/lib/mysql
You could also use this technique to move the actual database credentials and port publishing out of the main file into deploy-specific configuration. It's also somewhat common to use it for setups that need to run a known Docker image in production but build it in development, and for setups that overwrite the container's contents with a host directory.
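For example, a development-only override that builds the image locally and bind-mounts the source could look roughly like this (the app service name and paths are illustrative, not taken from the question):

# docker-compose.override.yml (development)
version: '3.8'
services:
  app:
    build: .              # build locally instead of pulling the production image
    volumes:
      - ./src:/app/src    # overwrite the image's code with the host checkout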
If you want the file to be named something else, you can, but you need to consistently provide a docker-compose -f option or set the COMPOSE_FILE environment variable every time you run Compose.
docker-compose -f docker-compose.yml -f docker-compose.persistence.yml up -d
docker-compose -f docker-compose.yml -f docker-compose.persistence.yml ps
docker-compose -f docker-compose.yml -f docker-compose.persistence.yml logs app
# Slightly easier (Linux syntax):
export COMPOSE_FILE=docker-compose.yml:docker-compose.persistence.yml
docker-compose up -d
Philosophically, your application's data needs to be persisted somewhere. For application containers, a good practice is for them to be totally stateless (they do not mount volumes:) and push all of their data into a database. That means the database needs to persist data, or else it will get lost when the database restarts.
IME it's a little bit unusual to actively want the database to lose data. This would be more interesting if it were straightforward to create a database image with seeded data, but the standard images are built in a way that makes this difficult. Still, in a test environment I could see wanting it.
It's actually possible, and reasonable, to build an application that runs in Docker but uses an external database. Perhaps you're running in a cloud environment, and your cloud provider has a slightly pricey managed database service that provides automatic snapshots and failover, for example; you could configure your production application to use this managed database and keep no data in containers at all.
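A sketch of that pattern, with made-up hostnames and variable names, might be:

version: '3.8'
services:
  app:
    image: myapp:latest
    environment:
      # point the application at a managed database instead of a container
      DATABASE_HOST: db.example-managed-service.com
      DATABASE_USER: ${DB_USER}
      DATABASE_PASSWORD: ${DB_PASSWORD}
    # note: no volumes: and no database service at all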

Best practice to separate OS environment docker image and application image

This is my second day working with Docker. Can you help me with a solution for this typical case?
Currently, our application is a combination of a Java Netty server, Tomcat, a Python Flask app, and MariaDB.
Now we want to use Docker to make deployment easier.
My first idea is to create one Docker image for the environment (CentOS + Java 8 + Python 3), another image for MariaDB, and one image for the application.
So the docker-compose.yml should be like this:
version: '2'
services:
  centos7:
    build:
      context: ./
      dockerfile: centos7_env
    image: centos7_env
    container_name: centos7_env
    tty: true
  mariadb:
    image: mariadb/server:10.3
    container_name: mariadb10.3
    ports:
      - "3306:3306"
    tty: true
  app:
    build:
      context: ./
      dockerfile: app_docker
    image: app:1.0
    container_name: app1.0
    depends_on:
      - centos7
      - mariadb
    ports:
      - "8081:8080"
    volumes:
      - /home/app:/home/app
    tty: true
The app_dockerfile will be like this:
FROM centos7_env
WORKDIR /home/app
COPY docker_entrypoint.sh ./docker_entrypoint.sh
ENTRYPOINT ["./docker_entrypoint.sh"]
In the docker_entrypoint.sh there should be a couple of commands like:
#!/bin/bash
sh /home/app/server/Server.sh start
sh /home/app/web/Web.sh start
python /home/app/analyze/server.py
I have some questions:
1- Is this design good? Any better ideas for it?
2- Should we use a separate image for the database like this, or could we install the database on the OS image and then commit it?
3- If I run docker-compose up, will Docker create two containers - one for the OS image and one for the app image based on it? Is there any way to create a container only for the app (which already runs on CentOS)?
4- If the app Dockerfile is not based on the OS image but uses FROM scratch, can it still run as expected?
Sorry for the long question. Thank you all in advance!!!
One thing to understand is that a Docker container is not a VM - containers are much more lightweight, so you can run many of them on a single machine.
What I usually do is run each service in its own container. This allows me to package only stuff related to that particular service and update each container individually when needed.
With your example I would run the following containers:
MariaDB
Container running /home/app/server/Server.sh start
Container running /home/app/web/Web.sh start
Python container running python /home/app/analyze/server.py
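A rough docker-compose sketch of that split (service names, build contexts, and the port mappings are illustrative, not taken from the question):

version: '2'
services:
  mariadb:
    image: mariadb/server:10.3
    ports:
      - "3306:3306"
  server:
    build: ./server       # hypothetical Dockerfile based on a JDK image, runs the Netty server
    depends_on:
      - mariadb
  web:
    build: ./web          # hypothetical Tomcat-based Dockerfile
    ports:
      - "8081:8080"
    depends_on:
      - mariadb
  analyze:
    build: ./analyze      # hypothetical python:3-alpine based Dockerfile for the Flask service
    depends_on:
      - mariadb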
You don't really need to run a centos7 container - it is just a base image which you used to build another image on top of. You would have to build it manually first so that the other image can be built from it - I guess this is what you are trying to achieve here, but it makes docker-compose.yml a bit confusing.
There's really no need to create a huge base container which contains everything. A better practice in my opinion is to use more specialized containers. For example, in your case you could have a container which contains only Python, and for Java your preferred JDK.
My personal preference is Alpine-based images, and you can find many official images based on it: python:<version>-alpine, node:<version>-alpine, openjdk:<version>-alpine (though I'm not quite sure about all versions), postgres:<version>-alpine, etc.
Hope this helps. Let me know if you have other questions and I will try to address them here.

docker service with compose file single node and local image

So I need rolling-updates with docker on my single node server. Until now, I was using docker-compose but unfortunately, I can't achieve what I need with it. Reading the web, docker-swarm seems to be the way to go.
I have found how to run an app with multiple replicas on a single node using swarm:
docker service create --replicas 3 --name myapp-staging myapp_app:latest
myapp:latest being built from my docker-compose.yml:
version: "3.6"
services:
postgres:
env_file:
- ".env"
image: "postgres:11.0-alpine"
volumes:
- "/var/run/postgresql:/var/run/postgresql"
app:
build: "."
working_dir: /app
depends_on:
- "postgres"
env_file:
- ".env"
command: iex -S mix phx.server
volumes:
- ".:/app"
volumes:
postgres: {}
static:
driver_opts:
device: "tmpfs"
type: "tmpfs"
Unfortunately, this doesn't work since it doesn't get the config from the docker-compose.yml file: .env file, command entry etc.
Searching deeper, I find that using
docker stack deploy -c docker-compose.yml <name>
will create a service using my docker-compose.yml config.
But then I get the following error message:
failed to update service myapp-staging_postgres: Error response from daemon: rpc error: code = InvalidArgument desc = ContainerSpec: image reference must be provided
So it seems I have to use a registry and push my image there so that it works. I understand this need in the case of a multi-node architecture, but in my case I don't want to do that. (Images are heavy to carry around, I don't want my image to be public, and after all, the image is already here, so why should I move it to the internet?)
How can I set up my docker service using local image and config written in docker-compose.yml?
I could probably manage my way using docker service create options, but that wouldn't use my docker-compose.yml file so it would not be DRY nor maintainable, which is important to me.
docker-compose is a great tool for developers, it is sad that we have to dive into DevOps tools to achieve such common features as rolling updates. This whole swarm architecture seems too complicated for my needs at this stage.
You don't have to use registries in your single-node setup. You can build your "app" image on your node from a local Dockerfile using this command (cd to the directory of your Dockerfile first):
docker build . -t my-app:latest
This will create a local Docker image on your node. This image is only visible to your single node, which is beneficial in your use case, but I wouldn't recommend this in a production setup.
You can now edit the compose file to be:
version: "3.6"
services:
postgres:
env_file:
- ".env"
image: "postgres:11.0-alpine"
volumes:
- "/var/run/postgresql:/var/run/postgresql"
app:
image: "my-app:latest"
depends_on:
- "postgres"
env_file:
- ".env"
volumes:
- ".:/app"
volumes:
postgres: {}
static:
driver_opts:
device: "tmpfs"
type: "tmpfs"
And now you can run your stack from this node, and it will use your local app image and benefit from using an image (updates, rollbacks, etc.).
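For example, the full flow on the single node could look roughly like this (the stack name myapp-staging comes from your question; as the next answer explains, swarm won't see a change when the tag stays the same, so --force is used to redeploy):

docker build . -t my-app:latest
docker stack deploy -c docker-compose.yml myapp-staging
# after rebuilding my-app:latest, trigger a rolling restart of the service:
docker service update --force myapp-staging_app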
I do have a side note on your stack file, though. You are using the same env file for both services; please mind that swarm will look for the ".env" file relative to (next to) the ".yml" file, so if this is not intentional please revise the location of your env files.
Also, as a side note, this solution is only feasible on a single-node cluster. If you scale your cluster you will have to use a registry, and registries don't have to be public: you can deploy a private registry on your cluster and let only your nodes access it - or you can make it public - the accessibility of your registry is your choice.
Hope this will help with your issue.
Instead of a pre-built Docker image, you can directly use a Dockerfile there. Please check the example below.
version: "3.7"
services:
webapp:
build: ./dir
The error is because Compose is unable to find the image on the Docker public registry.
The above method should solve your issue.
Basically, you need to use Docker images in order to make rolling updates work in Docker swarm. I would also like to clarify that you can host a private registry and use it instead of the public one.
Detailed Explanation:
When you try out a rolling update, the way Docker swarm works is that it checks whether there is a change in the image used by the service; if so, swarm schedules the service update based on the update criteria that have been set up and works through it.
Let us say there is no change to the image - then what happens? Simply, Docker will not apply the rolling update. Technically you can specify the --force flag to force an update of the service, but it will just redeploy the service.
Hence, create a local registry, store the images in it, and use that image name in the docker-compose file used for swarm. You can secure the registry with SSL, user credentials, or firewall restrictions - that is up to you. Refer to the Docker documentation for more details on deploying a Docker registry server.
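For example, a minimal private registry on the same node could be run roughly like this (port 5000 and the image name are illustrative):

docker run -d -p 5000:5000 --name registry registry:2
docker tag my-app:latest localhost:5000/my-app:latest
docker push localhost:5000/my-app:latest
# then use image: "localhost:5000/my-app:latest" in the compose file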
Corrections in your compose file:
Since docker stack uses the image to create the service, you need to specify image: "<image name>" in the app service, like it is done in the postgres service. As you have a build instruction, the image name is mandatory, because docker-compose otherwise doesn't know what to name the image.
A registry server is needed if you are going to deploy the application across multiple servers. Since you have mentioned it's a single-node deployment, just having the image pulled/built on the server is enough. But the private registry approach is the recommended one.
My recommendation is: don't club all the services into a single docker-compose file. The reason is that when you deploy/destroy using that docker-compose file, all the services will be taken down together. This is a kind of tight coupling. Of course, I understand that all the other services depend on the DB; in such cases make sure the DB service is brought up before the other services.
Instead of specifying the env file, make it part of the Dockerfile instructions: either copy the env file and source it in the entrypoint, or use ENV variables to define it.
Also just an update:
Stack is just to group the services in swarm.
So your compose file should be:
version: "3.6"
services:
postgres:
env_file:
- ".env"
image: "postgres:11.0-alpine"
volumes:
- "/var/run/postgresql:/var/run/postgresql"
app:
build: "."
image: "image-name:tag" #the image built will be tagged as image-name:tag
working_dir: /app # note here I've removed .env file
depends_on:
- "postgres"
command: iex -S mix phx.server
volumes:
- ".:/app"
volumes:
postgres: {}
static:
driver_opts:
device: "tmpfs"
type: "tmpfs"
Dockerfile:
FROM baseimage:tag
COPY .env /somelocation
# your further instructions go here
RUN ... && \
    ... && \
    ... && chmod a+x /somelocation/.env
ENTRYPOINT source /somelocation/.env && ./file-to-run
Alternative Dockerfile:
FROM baseimage:tag
# a, b and c have to be passed as build arguments when building the image,
# e.g. docker build --build-arg a=... --build-arg b=... --build-arg c=... .
ARG a
ARG b
ARG c
ENV a=$a b=$b c=$c
ENTRYPOINT ./file-to-run
And you may need to run
docker-compose build
docker-compose push   # optional, needed to push the image into a registry in case a registry is used
docker stack deploy -c docker-compose.yml <stackname>
NOTE:
Even though you can create the services as mentioned above by @M.Hassan, I've explained the ideal recommended way.

docker rabbitmq how to expose port and reuse container with a docker file

Hi, I am finding it very confusing how I can create a Dockerfile that would run a RabbitMQ container, where I can expose the port so I can navigate to the management console via localhost and a port number.
I see someone has provided this dockerfile example, but I'm unsure how to run it:
version: "3"
services:
rabbitmq:
image: "rabbitmq:3-management"
ports:
- "5672:5672"
- "15672:15672"
volumes:
- "rabbitmq_data:/data"
volumes:
rabbitmq_data:
I have got RabbitMQ working locally fine, but everyone tells me Docker is the future; at this rate I don't get it.
Does the above look like a valid way to run a rabbitmq container? where can I find a full understandable example?
Do I need a docker file or am I misunderstanding it?
How can I specify the port? In the example above, what are the first numbers 5672:5672 and what are the last ones?
How can I be sure that when I run the container again, say after a machine restart, I get the same container?
Many thanks
Andrew
Docker-compose
What you posted is not a Dockerfile. It is a docker-compose file.
To run that, you need to
1) Create a file called docker-compose.yml and paste the following inside:
version: "3"
services:
rabbitmq:
image: "rabbitmq:3-management"
ports:
- "5672:5672"
- "15672:15672"
volumes:
- "rabbitmq_data:/data"
volumes:
rabbitmq_data:
2) Download docker-compose (https://docs.docker.com/compose/install/)
3) (Re-)start Docker.
4) On a console run:
cd <location of docker-compose.yml>
docker-compose up
Do I need a docker file or am I misunderstanding it?
You have a docker-compose file. rabbitmq:3-management is the Docker image built using the RabbitMQ Dockerfile (which you don't need). The image will be downloaded the first time you run docker-compose up.
How can I specify the port? In the example above what are the first numbers 5672:5672 and what are the last ones?
"5672:5672" specifies the port of the queue.
"15672:15672" specifies the port of the management plugin.
The numbers on the left-hand side are the ports you can access from outside of the container. So, if you want to work with different ports, change the ones on the left. The right ones are defined internally.
This means you can access the management plugin at http://localhost:15672 (or, more generically, http://<host-ip>:<host port mapped to 15672>).
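For example, if you wanted the management UI on host port 8080 instead (an arbitrary choice), only the left-hand number changes:

ports:
  - "8080:15672"    # management UI now reachable at http://localhost:8080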
You can see more info on the RabbitMQ Image on the Docker Hub.
How can I be sure that when I rerun the container, say after a machine restart that I get the same container?
I assume you want the same container because you want to persist the data. You can run docker-compose stop, restart your machine, then run docker-compose start. Then the same container is used. However, if the container is ever deleted you lose the data inside it.
That is why you are using volumes. The data collected in your container also gets stored on your host machine. So, if you remove your container and start a new one, the data is still there because it was stored on the host machine.
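One way to convince yourself of this (commands assume the compose file above):

docker-compose down      # removes the container but keeps the named volume
docker volume ls         # the rabbitmq_data volume (prefixed with the project name) is still listed
docker-compose up -d     # the new container reattaches to the same volume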
