Docker editing entrypoint of existing container - docker

I have a Docker container built from the debian:latest image.
I need to execute a bash script that starts several services.
My host machine is Windows 10 and I'm using Docker Desktop. I found the configuration files on the
docker-desktop-data WSL2 drive in data\docker\containers\<container_name>
There are 2 config files there:
config.v2.json and hostconfig.json
I edited the first of them and replaced:
"Entrypoint":null with "Entrypoint":["/bin/bash", "/opt/startup.sh"]
I did this while the container was down, but when I restarted it the script was not executed. When I opened config.v2.json again, the Entrypoint was set back to null.
I need to run this script at every container start.
Another strange thing is that this container doesn't have any volumes showing in Docker Desktop. I could check out this container and start another one, but I need to preserve the current state of this container (installed packages, files, DB content). How can I change the entrypoint, or run the script some other way?
Is there any way to export the container to an image along with its configuration? I need to expose several ports and run the startup script. Is there any way to make every new container created from the exported image expose the same ports and run the same startup script?

Docker's typical workflow involves containers that run only a single process and are intrinsically temporary. You'd almost never create a container, manually set it up, and try to persist it; instead, you'd write a script called a Dockerfile that describes how to create a reusable image, and then launch some number of containers from that.
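For the setup in the question, a minimal sketch of such a Dockerfile could look like the following (it assumes your script really is /opt/startup.sh, that it keeps a foreground process running, and that startup.sh sits next to the Dockerfile; the package name and port are purely illustrative):
# Sketch: rebuild the hand-configured container as a reusable image.
FROM debian:latest
# Reinstall whatever packages you previously installed by hand (example only).
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
# Bake the startup script into the image.
COPY startup.sh /opt/startup.sh
# Document the port(s) your services listen on (illustrative).
EXPOSE 8080
# Every container created from this image runs the script on start.
ENTRYPOINT ["/bin/bash", "/opt/startup.sh"]
You could then build it once with docker build -t myapp . and start as many containers as you like with docker run -d -p 8080:8080 myapp; the entrypoint and exposed ports travel with the image rather than with any particular container.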
It's almost always preferable to launch multiple single-process containers rather than to try to run multiple processes in a single container. You can use a tool like Docker Compose to describe the multiple containers and record the various options you'd need to start them:
# docker-compose.yml
# Describe the file version. Required with the stable Python implementation
# of Compose. Most recent stable version of the file format.
version: '3.8'

# Persistent storage managed by Docker; will not be accessible on the host.
volumes:
  dbdata:

# Actual containers.
services:
  # The database.
  db:
    # Use a stock Docker Hub image.
    image: postgres:15
    # Persist its data.
    volumes:
      - dbdata:/var/lib/postgresql/data
    # Describe how to set up the initial database.
    environment:
      POSTGRES_PASSWORD: passw0rd
    # Make the container accessible from outside Docker (optional).
    ports:
      - '5432:5432'
        # first port: any available host port
        # second port: MUST be the standard PostgreSQL port 5432

  # Reverse proxy / static asset server
  nginx:
    image: nginx:1.23
    # Get static assets from the host system.
    volumes:
      - ./static:/usr/share/nginx/html
    # Make the container externally accessible.
    ports:
      - '8000:80'
You can check this file into source control with your application. Also consider adding a third service with a build: key that builds an image containing the actual application code; that service probably will not have volumes:.
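That third service might look roughly like the sketch below (the ./app build context, environment variables, and port are illustrative assumptions; note that it reaches the database simply by using the Compose service name db as a hostname):
  # Hypothetical application service built from a Dockerfile in ./app.
  app:
    build: ./app
    environment:
      # Compose's internal DNS resolves the service name "db".
      PGHOST: db
      PGPASSWORD: passw0rd
    ports:
      - '8080:8080'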
docker-compose up -d will start this stack of containers (without -d, in the foreground). If you make a change to the docker-compose.yml file, re-running the same command will delete and recreate containers as required. Note that you are never running an unmodified debian image, nor are you manually running commands inside a container; the docker-compose.yml file completely describes the containers, their startup sequences (if not already built into the images), and any required runtime options.
Also see Networking in Compose for some details about how to make connections between containers: localhost from within a container will call out to that same container and not one of the other containers or the host system.

Related

Putting file into HDFS using docker-compose

Is there a way to put some file, let's say data.json, into HDFS automatically right from Docker-compose/Dockerfile?
When I start the namenode and datanode I can enter the containers with
docker exec -it namenode [datanode] bash, and use
hdfs dfs -put data.json hdfs:/ (once safe mode is finished),
and that works, but I need a way to run this automatically. When I try to build the containers from a Dockerfile and put in the commands:
FROM bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
WORKDIR /data
ADD hdfs_writer/data.json /data
# ADD python_script.py /data
CMD ["hdfs dfsadmin -safemode wait && hdfs dfs -put ./data.json hdfs:/"]
# CMD ["python python_script.py"]
the namenode container immediately terminates. I also tried with the python script below, which I add to the container and run with CMD.
python_script
import time
import os
os.system("hdfs dfsadmin -safemode wait")
os.system("hdfs dfs -put -f data.json hdfs:/")
while True:
    time.sleep(5)
In that case the container is running, but if I check the logs and try to list HDFS with hdfs dfs -ls hdfs:/, there is the following error:
safemode: Call From 662aae005e8b/172.20.0.5 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
19/04/18 14:36:36 WARN ipc.Client: Failed to connect to server: namenode/172.20.0.5:8020: try once and fail.
I read the recommended link from the error log, and to be honest, I am not sure I understand what I should do.
Any suggestions or ideas about a possible solution are highly valuable to me, as I am new to this field and don't have much experience.
If you need more info, I will be happy to provide it.
docker-compose.yml (just part of it)
namenode:
  # docker-compose.yml and Dockerfile are in the same directory
  build: .
  volumes:
    - ./data/namenode:/hadoop/dfs/name
  environment:
    - CLUSTER_NAME=cluster
  env_file:
    - ./hadoop.env
  ports:
    - 50070:50070
datanode:
  image: bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
  depends_on:
    - namenode
  volumes:
    - ./data/datanode:/hadoop/dfs/data
  env_file:
    - ./hadoop.env
hadoop.env
CORE_CONF_fs_defaultFS=hdfs://namenode:8020
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
HDFS_CONF_dfs_blocksize=1m
YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
You can't write to networked services in a Dockerfile. Imagine running docker build, running your combined application, tearing it down, and running it again. You'll reuse the same built image without re-running the Dockerfile steps; only the content in the image itself is kept. In most cases you need some minor amount of setup to communicate between services (Docker Compose can do this for you) but that is not set up during a build sequence. This is the same answer as "you can't run database migrations from a Dockerfile", but it applies equally to Hadoop.
A container only does one thing. Your sample Dockerfile sets a different CMD that waits for the namenode to be running and sets it up. This happens instead of starting the namenode process. A Docker container runs one main command and one main command only; there is not a way to run a main command and also a side support script of some form. The container you show would probably work, but you'd need to run it as a separate container alongside the namenode container.
You don't need to be "in Docker" to access Docker-hosted services. You can use a Docker Compose ports: directive to make services visible to the host, at which point you can use ordinary clients to interact with them. The docker exec path is the equivalent of "I ssh to my server as root, and then...", which isn't how you normally deal with any service at all.
Your server containers should only run servers. In your example you're both trying to launch an HDFS namenode and also populate the server from the same container; you'd be better off having the namenode container only be the namenode and running the setup job from another container or from the host. (See the standard postgres image's entrypoint script for some idea of the gyrations required otherwise.)
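As a sketch of that last point, the loader could be one more service in the same docker-compose.yml, sitting next to namenode and datanode (this assumes the bde2020 image's entrypoint applies the hadoop.env settings before running whatever command it is given, and that the hdfs client is on its PATH; the service name hdfs-init is made up):
hdfs-init:
  image: bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
  depends_on:
    - namenode
  env_file:
    - ./hadoop.env
  volumes:
    - ./hdfs_writer/data.json:/data/data.json
  # Wait for the namenode to leave safe mode, upload the file, then exit.
  command: bash -c "hdfs dfsadmin -safemode wait && hdfs dfs -put -f /data/data.json hdfs:/"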
Docker Compose isn't great for one-off jobs. Every time you run docker-compose up it will discover that your setup container isn't running and try to start it again. Other more powerful orchestrators could be a better fit; for example, a Kubernetes Job is a reasonable fit for what you're describing.

Docker - Mount a volume from a container to an other (equivalent volumes_from) in docker-compose 3

I have two containers: nginx & angular. The angular container contains the code and is automatically pulled from the registry when there is a new version (with watchtower).
I set up a shared volume between angular & nginx to share the code from angular to nginx.
### Angular #########################################
angular:
  image: registry.gitlab.com/***/***:staging
  networks:
    - frontend
    - backend
  volumes:
    - client:/var/www/client

### NGINX Server #########################################
nginx:
  image: registry.gitlab.com/***/***/***:staging
  volumes:
    - client:/var/www/client
  depends_on:
    - angular
  networks:
    - frontend
    - backend

volumes:
  client:

networks:
  backend:
  frontend:
When I build & run the environment for the first time, everything works.
The problem is that when there is a new version of the client, the image is pulled and the container is re-built, and the new code version is inside the angular container, but the nginx container still has the old version of the client.
The shared volume does not let me do what I want because we cannot specify which container is the source. Is it possible to mount a volume from one container to another?
Thanks in advance.
EDIT
The angular container is only there to serve the files. We could rsync the built application to the server on the host machine and then mount the volume into the container (host -> guest), but that would go against our CI process: build app -> build image -> push to registry -> watchtower pulls the new image
Docker volumes are not intended to share code, and I'd suggest reconsidering this workflow.
The first time you launch a container with an empty volume, and only the first time and only if the volume is already empty, Docker will populate it with contents from the container. Since volumes are intended to hold data, and the application is likely to change the data that will be persisted, Docker doesn't overwrite the application data if the container is restarted; whatever was in the volume directory remains unchanged.
In your setup that means this happens:
You start the angular container the first time, and since the client named volume is empty, Docker copies content into it.
You start the nginx container.
You delete and restart the angular container; but since the client named volume is no longer empty, Docker leaves the old content there.
The nginx container still sees the old content.
For a typical browser application, you don't actually need a "program" running: once you've run through a Typescript/Webpack/... sequence, the output is a collection of totally static files. In the case of Angular, there is an Ahead-of-Time compiler that produces these static files. The sequence I'd recommend here is:
Check out your application source tree locally.
Develop your browser application in isolation, using developer-oriented tools like ng serve or npm start. Since this is all running locally, you don't need to fight with anything Docker-specific (filesystem mappings, permissions, port mappings, ...); it is a totally normal Javascript development sequence. The system components you need for this are just Node; it is strictly easier than installing and configuring Docker.
Compile your application to static files with the Angular AOT compiler, Webpack, or npm run build (see the sketch after this list).
Publish those static files to a CDN; or bind-mount them into an nginx container; or maybe build them into a custom image.
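A sketch of steps 3 and 4, assuming a standard Angular CLI project and that the compiled output ends up in the ./client directory used by the compose file below (exact commands and flags depend on your project and CLI version):
npm ci                                   # install dependencies
npx ng build --output-path dist/client   # produce the static files
cp -r dist/client/. ./client/            # put them where nginx will read them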
In the last case you wouldn't use a named Docker volume. Instead you'd mount the local filesystem into the container. A complete docker-compose.yml file for this case could look like:
version: '3'
services:
  nginx:
    image: registry.gitlab.com/***/***/***:staging
    volumes:
      - ./client:/var/www/client
    ports:
      - '8000:80'
From your comment:
There is no program running for the client; the CI compiles the app and builds the custom image, which COPYs the application files into /var/www/client. Then watchtower pulls this new image and restarts the container. The container only runs as a daemon with (tail -f /dev/null & wait).
Looking at this from a high level, I don't see any need to have two containers or volumes at all. Simply build your application with a multi-stage build that generates an nginx image with the needed content:
FROM your_angular_base AS build
COPY src /src
RUN steps to compile your code
FROM nginx_base as release
...
COPY --from=build /var/www/client/ /var/www/client/
...
Then your compose file is stripped down to just:
...
### NGINX Server #########################################
nginx:
  image: registry.gitlab.com/***/***/***:staging
  networks:
    - frontend
    - backend

networks:
  backend:
  frontend:
If you do find yourself in a situation where a volume needs to be shared between two running containers, and the volume needs to be updated with each deploy of one of the images, then the best place for that is an entrypoint script that copies files from one location into the volume. I have an example of this in my docker-base with the save-volume and load-volume scripts.
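A minimal sketch of such an entrypoint script, assuming the image keeps a pristine copy of the built files under /app/dist and the shared volume is mounted at /var/www/client (both paths are made up for illustration):
#!/bin/sh
# Refresh the shared volume from the copy baked into the image,
# then hand control to whatever command the container was given.
set -e
rm -rf /var/www/client/*
cp -a /app/dist/. /var/www/client/
exec "$@"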

What is the difference between docker and docker-compose

docker and docker-compose seem to be interacting with the same Dockerfile; what is the difference between the two tools?
The docker cli is used when managing individual containers on a docker engine. It is the client command line to access the docker daemon api.
The docker-compose cli can be used to manage a multi-container application. It also moves many of the options you would enter on the docker run cli into the docker-compose.yml file for easier reuse. It works as a front end "script" on top of the same docker api used by docker, so you can do everything docker-compose does with docker commands and a lot of shell scripting. See this documentation on docker-compose for more details.
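As a small illustration of the difference, a single container started by hand:
docker run -d --name web -p 8080:80 -v "$PWD/html:/usr/share/nginx/html" nginx
could instead be recorded in a docker-compose.yml and started with docker-compose up -d (the service name, image, and ports here are arbitrary examples):
services:
  web:
    image: nginx
    ports:
      - "8080:80"
    volumes:
      - ./html:/usr/share/nginx/html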
Update for Swarm Mode
Since this answer was posted, docker has added a second use for docker-compose.yml files. Starting with the version 3 yml format and docker 1.13, you can use the yml with docker-compose and also to define a stack in docker's swarm mode. To do the latter you need to use docker stack deploy -c docker-compose.yml $stack_name instead of docker-compose up, and then manage the stack with docker commands instead of docker-compose commands. The mapping is one-to-one between the two uses:
Compose Project -> Swarm Stack: A group of services for a specific purpose
Compose Service -> Swarm Service: One image and its configuration, possibly scaled up.
Compose Container -> Swarm Task: A single container in a service
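For example, a typical round trip with a version 3 file looks like this (the stack name mystack is arbitrary):
docker swarm init                                  # once, to enable swarm mode
docker stack deploy -c docker-compose.yml mystack  # create or update the stack
docker stack services mystack                      # list the stack's services
docker stack rm mystack                            # tear the stack down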
For more details on swarm mode, see docker's swarm mode documentation.
docker manages single containers
docker-compose manages multiple container applications
Usage of docker-compose requires 3 steps:
Define the app environment with a Dockerfile
Define the app services in docker-compose.yml
Run docker-compose up to start and run the app
Below is a docker-compose.yml example taken from the docker docs:
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/code
      - logvolume01:/var/log
    links:
      - redis
  redis:
    image: redis
volumes:
  logvolume01: {}
A Dockerfile is a text document that contains all the commands/instructions a user could call on the command line to assemble an image.
Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration. By default, docker-compose expects the Compose file to be named docker-compose.yml or docker-compose.yaml. If the compose file has a different name, we can specify it with the -f flag.
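For example (the file name is arbitrary):
docker-compose -f my-compose.yml up -d   # use a compose file with a non-default name
docker-compose -f my-compose.yml down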
Check here for more details
docker, or more specifically the docker engine, is used when we want to handle only one container, whereas docker-compose is used when we have multiple containers to handle. We need multiple containers when we have more than one service to take care of, for example an application with a client-server model: one container for the server and another for the client. Docker Compose usually requires each container to have its own Dockerfile, plus a yml file that brings all the containers together.

How to link multiple Docker containers and encapsulate the result?

I have a Node.js web-application that connects to a Neo4j database. I would like to encapsulate these in a single Docker image (using also a Neo4j Docker container), but I'm a docker novice and can't seem to figure this out. What's the recommended way to do it in the latest Docker versions?
My intuition would be to run the Neo4j container nested inside the app container. But from what I've read, I think the supported / recommended approach is to link the containers together. What I need is pretty well illustrated in this image. But the article the image comes from isn't clear to me. Anyway, it uses the soon-to-be-deprecated legacy container linking, while networking is recommended these days. A tutorial or explanation would be much appreciated.
Also, how does docker-compose fit into all this?
Running a container within another container would mean running a Docker engine within a Docker container. This is referred to as dind, for Docker-in-Docker, and I would strongly advise against it. You can search for 'dind' online and discover why in most cases it is a bad idea, but as it is not the main subject of your question I won't go into it any further.
Running both a node.js process and a neo4j process in the same container
While most people will tell you to refrain from running more than one process within a Docker container, nothing prevents you from doing so. If you want to follow this path, take a look at the Using Supervisor with Docker page from the Docker documentation website, or at the Phusion baseimage Docker image.
Just be aware that this way of doing things will make your Docker image more and more difficult to maintain over time.
Linking containers
As you found out, keeping Docker images as simple as you can (i.e. running one and only one app within a Docker container) will make your life easier in the long term.
Linking containers together is trivial when both containers run on the same Docker engine. It is just a matter of:
having your neo4j container expose the port its service listens on
running your node.js container with the --link <neo4j container name>:<alias> option
within the node.js application configuration, set the neo4j host to the <alias> hostname, docker will take care of forwarding that connection to the IP it assigned to the neo4j container
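As a command-line sketch (the image name mynodejs-image and the alias db are made-up examples):
# the neo4j image already exposes its service ports
docker run -d --name myneo4j neo4j
docker run -d --name mynodejs --link myneo4j:db mynodejs-image
# inside mynodejs, the hostname "db" now resolves to the neo4j container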
When you want to run those two containers on different hosts, things get more difficult.
With Docker Compose, you have to use the links: key to define your links
The new Docker network feature
You also discovered that linking containers won't be supported in the future and that the new way of making multiple Docker containers communicate is to create a virtual network and attach those 2 containers to that network.
Here's how to proceed:
docker network create mynet
docker run --detach --name myneo4j --net mynet neo4j
docker run --detach --name mynodejs --net mynet <your nodejs image>
Your node application configuration should then use myneo4j as the host to connect to.
To tell Docker Compose to use the new network feature, you would have to use the --x-networking option. Also you would not use the links: key.
Using the new networking feature also means that you won't be able to define any alias for the db. As a result you have to use the container name. Beware that unless you use the container_name: key in your docker-compose.yml file, Compose will create container names based on the directory which contains your docker-compose.yml file, the service name as found in the yml file and a number.
For instance, the following docker-compose.yml file, if within a directory named "foo" would create two containers named foo_web_1 and foo_db_1:
web:
  build: .
  ports:
    - "8000:8000"
db:
  image: postgres
when started with docker-compose --x-networking up, the web app configuration should then use foo_db_1 as the db hostname.
While if you use container_name:
web:
  build: .
  ports:
    - "8000:8000"
db:
  image: postgres
  container_name: mydb
when started with docker-compose --x-networking up, the web app configuration should then use mydb as the db hostname.
Example of using Docker Compose to run a web app using nodeJS and neo4j
In this example, I will show how to dockerize the example app from github project aseemk/node-neo4j-template which uses nodejs and neo4j.
I assume you already have Docker 1.9.0+ and Docker Compose 1.5+ installed.
This project will use 2 docker containers, one to run the neo4j database and one to run the nodeJS web app.
Dockerizing the web app
We need to build a Docker image from which Docker compose will run a container. For that, we will write a Dockerfile.
Create a file named Dockerfile (mind the capital D) with the following content:
FROM node
RUN git clone https://github.com/aseemk/node-neo4j-template.git
WORKDIR /node-neo4j-template
RUN npm install
# ugly 20s sleep to wait for neo4j to initialize
CMD sleep 20s && node app.js
This Dockerfile describes the steps the Docker engine will have to follow to build a docker image for our web app. This docker image will:
be based on the official node docker image
clone the nodeJS example project from Github
change the working directory to the directory containing the git clone
run the npm install command to download and install the nodeJS app dependencies
instruct docker which command to use when running a container of that image
A quick review of the nodeJS code reveals that the author allows us to configure the URL to use to connect to the neo4j database using the NEO4J_URL environment variable.
Dockerizing the neo4j database
Well people took care of that for us already. We will use the official Docker image for neo4j which can be found on the Docker Hub.
A quick review of the readme tells us to use the NEO4J_AUTH environment variable to change the neo4j password. Setting this variable to none will disable authentication altogether.
Setting up Docker Compose
In the same directory as the one containing our Dockerfile, create a docker-compose.yml file with the following content:
db:
  container_name: my-neo4j-db
  image: neo4j
  environment:
    NEO4J_AUTH: none
web:
  build: .
  environment:
    NEO4J_URL: http://my-neo4j-db:7474
  ports:
    - 80:3000
This Compose configuration file describes 2 services: db and web.
The db service will produce a container named my-neo4j-db from the official neo4j docker image and will start that container with the NEO4J_AUTH environment variable set to none.
The web service will produce a container named at docker compose's discretion, using a docker image built from the Dockerfile found in the current directory (build: .). It will start that container with the environment variable NEO4J_URL set to http://my-neo4j-db:7474 (note how we use the name of the neo4j container, my-neo4j-db). Furthermore, docker compose will instruct the Docker engine to expose the web container's port 3000 on docker host port 80.
Firing it up
Make sure you are in the directory that contains the docker-compose.yml file and type: docker-compose --x-networking up.
Docker compose will read the docker-compose.yml file, figure out it has to first build a docker image for the web service, then create and start both containers and finally will provide you with the logs from both containers.
Once the log shows web_1 | Express server listening at: http://localhost:3000/, everything is up and you can point your web browser to http://<ip of the docker host>/.
To stop the application, hit Ctrl+C.
If you want to start the app in the background, use docker-compose --x-networking up -d instead. Then in order to display the logs, run docker-compose logs.
To stop the service: docker-compose stop
To delete the containers: docker-compose rm
Making neo4j storage persistent
The official neo4j docker image readme says the container persists its data on a volume at /data. We then need to instruct Docker Compose to mount that volume to a directory on the docker host.
Change the docker-compose.yml file with the following content:
db:
  container_name: my-neo4j-db
  image: neo4j
  environment:
    NEO4J_AUTH: none
  volumes:
    - ./neo4j-data:/data
web:
  build: .
  environment:
    NEO4J_URL: http://my-neo4j-db:7474
  ports:
    - 80:3000
With that config file, when you run docker-compose --x-networking up, docker compose will create a neo4j-data directory and mount it into the container at location /data.
Starting a 2nd instance of the application
Create a new directory and copy over the Dockerfile and docker-compose.yml files.
We then need to edit the docker-compose.yml file to avoid name conflict for the neo4j container and the port conflict on the docker host.
Change its content to:
db:
  container_name: my-neo4j-db2
  image: neo4j
  environment:
    NEO4J_AUTH: none
  volumes:
    - ./neo4j-data:/data
web:
  build: .
  environment:
    NEO4J_URL: http://my-neo4j-db2:7474
  ports:
    - 81:3000
Now it is ready for the docker-compose --x-networking up command. Note that you must be in the directory with that new docker-compose.yml file to start the 2nd instance up.

how to ignore some container when i run `docker-compose rm`

I have four containers: node, redis, mysql, and data. When I run docker-compose rm, it removes all of my containers, including the data container. My MySQL data lives in that container and I don't want to remove it.
Why must I remove those containers at all?
Sometimes I have to change some configuration files for node and mysql and rebuild, so I must remove the containers and start again.
I have searched with Google over and over and got nothing.
As things stand, you need to keep your data containers outside of Docker Compose for this reason. A data container shouldn't be running anyway, so this makes sense.
So, to create your data-container do something like:
docker run --name data mysql echo "App Data Container"
The echo command will complete and the container will exit immediately, but as long as you don't docker rm the container you will still be able to use it in --volumes-from commands, so you can do the following in Compose:
db:
  image: mysql
  volumes_from:
    - data
And just remove any code in docker-compose.yml to start up the data container.
An alternative to docker-compose, written in Go (https://github.com/michaelsauter/crane), lets you create container groups -- including overriding the default group so that you can ignore your data containers when rebuilding your app.
Given you have a "crane.yaml" with the following containers and groups:
containers:
  my-app:
    ...
  my-data1:
    ...
  my-data2:
    ...
groups:
  default:
    - "my-app"
  data:
    - "my-data1"
    - "my-data2"
You can build your data containers once:
# create your data-only containers (safe to run several times)
crane provision data # needed when building from Dockerfile
crane create data
# build/start your app.
crane lift -r # similar to docker-compose build && docker compose up
# Force re-create of your data-only containers...
crane create --recreate data
PS! Unlike docker-compose, even if building from a Dockerfile, you MUST specify an "image" -- when not pulling, this is the name docker will give the image locally! Also note that the container names are global, and not prefixed by the folder name the way they are in docker-compose.
Note that there is at least one major pitfall with crane: it simply ignores misplaced or wrongly spelled fields! This makes it harder to debug than docker-compose YAML.
@AdrianMouat Now I can specify a *.yml file when starting all containers with the new version 1.2rc of docker-compose (https://github.com/docker/compose/releases), like the following:
file: data.yml
data:
  image: ubuntu
  volumes:
    - "/var/lib/mysql"
Thanks for your very useful answer.
