Docker - docker-compose - postgres image

I'm using the postgres:latest image, and creating backups using the following command
pg_dump postgres -U postgres > /docker-entrypoint-initdb.d/backups/redmine-$(date +%Y-%m-%d-%H-%M).sql
and it's running periodically using crontab
*/30 * * * * /docker-entrypoint-initdb.d/backup.sh
However, on occasion I might need to run
docker-compose down/up
for whatever reason
The problem
I always need to manually run /etc/init.d/cron start whenever I restart the container. This is a bit of a problem because it's easy to forget, and if I (or anyone else) forget it, backups won't be made
According to the documentation, scripts ending in *.sql and *.sh inside /docker-entrypoint-initdb.d/ are run on container startup (and they are)
However, if I put /etc/init.d/cron start inside an executable .sh file there, the other commands in that file are executed (I've verified that), but the cron service does not start, probably because the /etc/init.d/cron start line does not execute successfully
I would appreciate any suggestion for a solution

You will want to keep your Docker containers as independent of other services as possible. Instead of running the cron job in the container, I would recommend running it on the host; that way it will run even if the container is restarted (whether automatically or manually).
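For example, a minimal host-side setup could look like this (the container name app_postgres_1 and the backup path are assumptions; note that % signs have to be escaped in a crontab entry):
# host crontab: run pg_dump inside the running container every 30 minutes, writing the dump to the host
*/30 * * * * docker exec app_postgres_1 pg_dump postgres -U postgres > /var/backups/redmine-$(date +\%Y-\%m-\%d-\%H-\%M).sql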
If you really feel the need to keep it inside the container, I would build a new image with the postgres image as base and add cron there, so that it is baked into the container from the start without any extra scripts. Or even create another image whose only job is to invoke the cron job and connect over the Docker network.
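If you go that route, a rough sketch of such an image might look like this (the file names, schedule and wrapper script are assumptions, not a tested setup):
# Dockerfile
FROM postgres:latest
RUN apt-get update && apt-get install -y cron && rm -rf /var/lib/apt/lists/*
# backup-cron contains e.g.: */30 * * * * root /usr/local/bin/backup.sh
COPY backup-cron /etc/cron.d/backup-cron
COPY backup.sh /usr/local/bin/backup.sh
COPY start-with-cron.sh /usr/local/bin/start-with-cron.sh
RUN chmod +x /usr/local/bin/backup.sh /usr/local/bin/start-with-cron.sh
ENTRYPOINT ["/usr/local/bin/start-with-cron.sh"]
CMD ["postgres"]
The wrapper simply starts cron and then hands over to the stock postgres entrypoint:
#!/bin/bash
# start-with-cron.sh
service cron start   # or just: cron
exec docker-entrypoint.sh "$@"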

Expanding on Jite's answer, you could run pg_dump remotely from a different container using the --host option.
This image, for example, provides a minimal environment with psql client and dump/restore utilities
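A hedged sketch of that approach (the network name, DB host name and the simplified password handling are assumptions):
# run pg_dump from a throwaway client container on the same Docker network
docker run --rm --network my_app_network -v "$(pwd)/backups:/backups" -e PGPASSWORD=postgres postgres:latest \
  pg_dump -h db -U postgres -f /backups/redmine-$(date +%Y-%m-%d-%H-%M).sql postgres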


Unable to resolve docker container during another container's build process

I have two Dockerfiles, one for a database and one for a web server. The web server's Dockerfile has a RUN statement which requires a connection to the database container. The web server is unable to resolve the database's IP and then errors out. But if I comment out the RUN line and manually run it inside the container, it successfully resolves the database. Should the web server be able to resolve the database during its build process?
# Web server
FROM tomcat:9.0.26-jdk13-openjdk-oracle
# The database container cannot be resolved when myscript runs. "Unable to connect to the database." is thrown.
RUN myscript
CMD catalina.sh run
# But if I comment out the RUN line then connect to web server container and run myscript, the database container is resolved
docker exec ... bash
# This works
./myscript
I ran into the same problem with database migrations and NuGet pushes. You may want to run something similar against your DB, like migrations, initial/test data and so on. It can be solved in two ways:
Move your DB operations to the ENTRYPOINT so that they're executed at runtime (where the DB container is up and reachable).
Build your image using docker build instead of something like docker-compose up --build, because docker build has a switch called --network. So you could create a network in your compose file, bring the DB up with docker-compose up -d db-container, and then access it during the build with docker build --network db-container-network -t your-image . (see the command sketch below)
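A rough command sketch of option #2, echoing the steps named above (the compose service and network names are assumptions, and compose may prefix the network name with the project name depending on your setup):
# the compose file defines a network, e.g. db-container-network, and the service db-container
docker-compose up -d db-container
# build while attached to that network so the RUN steps can reach the DB
docker build --network db-container-network -t your-image .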
I'd prefer #1 over #2 if possible, because:
it's simpler: the network is only present in the docker-compose file, not in multiple places
you can specify relations using depends_on and make sure that they're respected properly, without having to take care of it manually
But depending on the action you want to execute, you need to take care that it isn't a problem if it runs multiple times, because it now runs on every start and not just during the build (where the cache is only purged by file changes).
However, I'd consider this best practice anyway when running such automated DB operations: expect that they may be executed more than once and should still produce the expected result (e.g. by checking whether the migration version or change is already present).
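A minimal entrypoint sketch for option #1 (the script name, DB host/port, the nc dependency and the migration helper are all assumptions):
#!/bin/sh
# entrypoint.sh -- run DB work at container start, then hand over to the real command
set -e
# wait until the database answers (host and port are assumptions)
until nc -z db-container 5432; do
  echo "waiting for the database..."
  sleep 2
done
# idempotent DB step: a hypothetical helper that only applies missing migrations
/usr/local/bin/run-migrations.sh
# finally start the actual service (whatever was passed as CMD)
exec "$@"
In the Dockerfile this would be wired up with ENTRYPOINT ["/entrypoint.sh"] and CMD ["catalina.sh", "run"].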

Is there an easy way to automatically run a script whenever I (re)start a container?

I have built a Docker image, copied a script into the image, and automatically execute it when I run the image, thanks to this Dockerfile command:
ENTRYPOINT ["/path/to/script/my_script.sh"]
(I had to make it executable with chmod in a RUN command to actually make it run)
Now, I'm quite new to Docker, so I'm not sure if what I want to do is even good practice:
My basic idea is that I would rather not always have to create a new container whenever I want to run this script, but to instead find a way to re-execute this script whenever I (re)start the same container.
So, instead of having to type docker run my_image, accomplishing the same via docker (re)start container_from_image.
Is there an easy way to do this, and does it even make sense from a resource parsimony perspective?
docker run is fairly cheap, and the typical Docker model is generally that you always start from a "clean slate" and set things up from there. A Docker container doesn't have the same set of pre-start/post-start/... hooks that, for instance, a systemd job does; there is only the ENTRYPOINT/CMD mechanism. The way you have things now is normal.
Also remember that you need to delete and recreate containers for a variety of routine changes, with the most important long-term being that you have to delete a container to change the underlying image (because the installed software or the base Linux distribution has a critical bug you need a fix for). I feel like a workflow built around docker build/run/stop/rm is the "most Dockery" and fits well with the immutable-infrastructure pattern. Repeated docker stop/start as a workflow feels like you're trying to keep this specific container alive, and in most cases that shouldn't matter.
From a technical point of view you can think of the container environment and its filesystem, and the main process inside the container. docker run is actually docker create plus docker start. I've never noticed the "create" half of this taking substantial time, but if you're doing something like starting a JVM or loading a large dataset on startup, the "start" half will be slow whether or not it's coupled with creating a new container.
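For illustration, docker run is roughly this pair of commands (the image name is a placeholder):
# rough equivalent of `docker run my_image`
cid=$(docker create my_image)
docker start -a "$cid"   # -a attaches to the container's output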
For the chmod issue you can do something like this
COPY my_script.sh /path/to/script/my_script.sh
RUN chmod 777 /path/to/script/my_script.sh
For the rerun-script issue:
The ENTRYPOINT specifies a command that will always be executed when the container starts.
That applies both to
docker run my_image
and to
docker start container_from_image
So whenever your container starts, your ENTRYPOINT command will be executed.
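As a quick illustration (the image and container names are placeholders):
docker build -t my_image .
docker run --name my_container my_image   # the ENTRYPOINT script runs
docker start -a my_container              # it runs again on this and every later start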
You can refer to this for more detail

What is the best way to do periodical cleanups inside a docker container?

I have a docker container that runs a simple custom download server using uwsgi on debian and a python script. The files are generated and saved inside the container for each request. Now, periodically I want to delete old files that the server generated for past requests.
So far, I achieved the cleanup via a cronjob on the host, that looks something like this:
*/30 * * * * docker exec mycontainer /path/on/container/delete_old_files.sh
But that has a few drawbacks:
Cron needs to be installed and running on the docker host
The user manually has to add a cronjob for each container they start
There is an extra cleanup script in the source
The fact that the cron job is needed needs to be documented
I would much prefer a solution that rolls out with the docker container and is also suitable for more general periodical tasks in the background of a docker container.
Any best practices on this?
Does python or uwsgi have an easy mechanism for periodical background tasks?
I'm aware that I could install cron inside the container and do something like CMD ['sh', '-c', 'cron; uwsgi <uwsgi-options>... --wsgi-file server.py'], but that seems a bit clunky and against the Docker philosophy.
A solution like this in server.py:
import threading

def cleanup():
    # ... delete old files here ...
    threading.Timer(30 * 60, cleanup).start()  # re-schedule itself; the argument is in seconds

cleanup()
# ... rest of the code here ...
Seems good, but I'm not sure how it interferes with uwsgi's own threading and processing.
It seems like a simple problem but isn't.
You should not store live data in containers. Containers can be a little bit fragile and need to be deleted and restarted routinely (because you forgot an option; because the underlying image has a critical security fix) and when this happens you will lose all of the data that's in the container.
What you can do instead is use a docker run -v option to cause the data to be stored in a path on the host. If they're all in the same place then you can have one cron job that cleans them all up. Running cron on the host is probably the right solution here, though in principle you could have a separate dedicated cron container that did the cleanup.
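A hedged sketch of that approach (the host path, container name, image name and retention period are assumptions):
# run the container with the generated files stored on the host
docker run -d --name downloader -v /srv/downloads:/path/on/container/files my-download-server
# a single host-side cron entry then cleans up files older than an hour
*/30 * * * * find /srv/downloads -type f -mmin +60 -delete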

What is the '--rm' flag doing?

I am trying Docker for the first time and do not yet have a "mental model". Total beginner.
All the examples that I am looking at include the --rm flag with run, such as
docker run -it --rm ...
docker container run -it --rm ...
Question:
Why do these commands include the --rm flag? If I were to go through the trouble of setting up or downloading a container with the good stuff in it, why would I remove it? I want to keep it to use again.
So, I know I have the wrong idea of Docker.
Containers are merely an instance of the image you use to run them.
The state of mind when creating a containerized app is not to take a fresh, clean ubuntu container, for instance, download the apps and configuration you wish to have in it, and then let it run.
You should treat the container as an instance of your application, while the application itself is embedded in an image.
The proper usage is to create a custom image in which you embed all your files, configuration, environment variables etc. Read more about Dockerfiles and how this is done here
Once you have done that, you have an image that contains everything, and in order to use your application you just run the image with the proper port settings or other dynamic variables, using docker run <your-image>
Running containers with the --rm flag is good for containers that you use for only a short while, just to accomplish something, e.g. compiling your application inside a container or testing that something works. Since you know it's a short-lived container, you tell your Docker daemon that once it's done running it should erase everything related to it and reclaim the disk space.
The --rm flag is used when you need the container to be deleted after its task is complete.
This is suitable for small tests or POC purposes and saves the headache of housekeeping.
From https://docs.docker.com/engine/reference/run/#clean-up---rm
By default a container’s file system persists even after the container exits. This makes debugging a lot easier (since you can inspect the final state) and you retain all your data by default. But if you are running short-term foreground processes, these container file systems can really pile up. If instead you’d like Docker to automatically clean up the container and remove the file system when the container exits, you can add the --rm flag
In short, it's useful to keep the host clean from stopped and unused containers.
When you run a container from an image using a simple command like docker run -it ubuntu, it spins up a container. You attach to it using docker attach container-name (or use exec for a separate session).
So, when you're inside your container working on it and you type exit, or otherwise come out of the container (other than detaching with ctrl+p+q), your container stops. It is still available on your disk, though, and you can start it again with docker start container-name/ID.
But when you run the container with the --rm flag, the container is deleted permanently on exit.
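A quick way to see the difference (the image and container names are just examples):
docker run --name keepme alpine echo hello
docker ps -a        # "keepme" is still listed as an exited container
docker run --rm --name gone alpine echo hello
docker ps -a        # "gone" no longer appears; it was removed on exit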
I use --rm when connecting to running containers to perform some actions such as database backup or file copy. Here is an example:
docker run -v $(pwd):/mnt --link app_postgres_1:pg --rm postgres:9.5 pg_dump -U postgres -h pg -f /mnt/docker_pg.dump1 app_db
The above will connect to a running container named 'app_postgres_1' and create a backup. Once the backup command completes, the container is deleted entirely.
The "docker run rm " command makes us run a new container and later when our work is completed then it is deleted by saving the disk space.
The important thing to note is, the container is just like a class instance and not for data storage. We better delete them once the work is complete. When we start again, it starts fresh.
The question comes then If the container is deleted then what about the data in a container? The data is actually saved in the local system and get linked to it when the container is started. The concept is named as "Volume or shared volume".
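For example (the paths and image name are placeholders):
# the host directory ./data survives even though the container is removed on exit
docker run --rm -v "$(pwd)/data:/var/lib/app-data" my_image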

How to execute docker commands after a process has started

I wrote a Dockerfile for a service (I have a CMD pointing to a script that starts the process), but I cannot run any other commands after the process has started. I tried using '&' to run the process in the background so that the other commands would run after the process has started, but that isn't working. Any idea on how to achieve this?
For example, consider I started a database server and wanted to run some scripts only after the database process has started, how do I do that?
Edit 1:
My specific use case is that I am running a RabbitMQ server as a service, and I want to create a new user, make them an administrator and delete the default guest user once the service starts in a container. I can do it manually by logging into the Docker container, but I wanted to automate it by appending these steps to the shell script that starts the rabbitmq service, and that isn't working.
Any help is appreciated!
Regards
Specifically around your problem with RabbitMQ: you can create a rabbitmq.config file and copy it over when creating the Docker image.
In that file you can specify both a default_user and a default_pass that will be created when the database is set up from scratch, see https://www.rabbitmq.com/configure.html
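A hedged sketch using the newer sysctl-style rabbitmq.conf (the file name, base image tag and credentials are examples, not a tested setup):
# rabbitmq.conf -- seeds the default account instead of guest on first start
default_user = admin
default_pass = s3cret
# Dockerfile
FROM rabbitmq:3-management
COPY rabbitmq.conf /etc/rabbitmq/rabbitmq.conf
The official rabbitmq image has also traditionally honored the RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS environment variables, which achieve the same result without a config file.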
As for the general problem: you can change the entry point to a script that runs whatever you need and then the service you want, instead of the service's own run script.
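A minimal wrapper sketch along those lines (the script name and credentials are assumptions, and rabbitmqctl await_startup requires a reasonably recent RabbitMQ):
#!/bin/bash
# custom-entrypoint.sh -- start the broker, then do one-time user setup
docker-entrypoint.sh rabbitmq-server &          # reuse the image's stock entrypoint
until rabbitmqctl await_startup >/dev/null 2>&1; do sleep 2; done
rabbitmqctl add_user admin s3cret || true       # ignore "user already exists" on restarts
rabbitmqctl set_user_tags admin administrator
rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
rabbitmqctl delete_user guest || true
wait                                            # keep the broker in the foreground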
I only partially understood your question. Based on what I perceived, I would recommend adding a COPY command to the Dockerfile to copy the script you want to run into the image. Once you build the image and run the container, start the db service, then exec into the container and run the script manually.
If you have a CMD command in the Dockerfile, it will be overwritten by the command you specify at execution time. So I don't think you have any other option to run the script automatically unless you drop the CMD from the Dockerfile.
