Docker Process Management - docker

I have a deployed application running inside a Docker container, which is, in effect, a websocket client that runs forever. On every deploy I rebuild the container and start it with docker run, using the command set in the Dockerfile.
Now, I've noticed a few times that the process occasionally dies without restarting. When running docker ps, I can see that the container is up, and has been up for 2 weeks; however, the process running inside of it has died without the host being any the wiser.
Do I need to go so far as to have a process manager inside of the docker container to manage the containerized process?
EDIT:
Dockerfile: https://github.com/DVG/catpen-edi/blob/master/Dockerfile

We've developed a process-manager tailor-made for Docker containers and have been using it with quite a bit of success to solve exactly the problem you describe. The best starting point is to take a look at chaperone-docker on github. The readme on the first page contains a quick link to a minimal base image as well as a fully configured LAMP stack so you can try it out and see what a fully-configured image would look like. It's open-source and fully documented.

This is an interesting problem related to PID 1: Docker runs the command specified in CMD or ENTRYPOINT as PID 1 inside the container. What's happening is that if a parent process dies, its child becomes an orphan and isn't automagically adopted and cleaned up by anything, since there is no PID 1 in the sense of a traditional init system like you're used to. Here is some excellent reading to give you a few ideas. You may get some mileage out of Phusion's baseimage-docker image, which comes with their simplified init system ("my_init") and will solve some of this problem for you. However, I would strongly caution you against automatically adopting the Phusion mindset for all of your containers, as there exists some ideological friction in that space. I can't recall any discussion on Docker's GitHub about a potential minimal init system to solve this problem, but I can't imagine it will be a problem forever. Good luck!

If you have two ruby processes it sounds like the child hasn't exited, the application has just stopped working. It's likely the EventMachine reactor is sitting in the background.
Does the EDI app really need to spawn the additional Ruby process? This only adds another layer between Docker and your app. Run the server directly with CMD [ "ruby", "boot.rb" ]. If you find the problem still occurs with a single process then you will need to find what is causing your app to hang.
When a process is running as PID 1 in Docker, it will need to handle the SIGINT and SIGTERM signals too.
# Trap ^C (SIGINT)
Signal.trap("INT") {
  shut_down
  exit
}
# Trap `kill` (SIGTERM)
Signal.trap("TERM") {
  shut_down
  exit
}
Docker also has restart policies for when the container does actually die.
docker run --restart=always
no: Do not automatically restart the container when it exits. This is the default.
on-failure[:max-retries]: Restart only if the container exits with a non-zero exit status. Optionally, limit the number of restart retries the Docker daemon attempts.
always: Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely. The container will also always start on daemon startup, regardless of the current state of the container.
unless-stopped: Always restart the container regardless of the exit status, but do not start it on daemon startup if the container has been put to a stopped state before.
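To make the policy descriptions above concrete, here is a small Python model of the restart decision. It is only a sketch of the documented behaviour, not Docker's actual implementation; the function name should_restart and its parameters are made up for illustration. The point it illustrates is that every policy reacts to the container exiting, so none of them helps when the main process hangs without exiting.

```python
def should_restart(policy, exit_code, restarts_so_far=0, max_retries=None):
    """Hypothetical model: would the daemon restart a container whose
    main process just exited with exit_code under the given policy?"""
    if policy == "no":
        return False
    if policy == "on-failure":
        if exit_code == 0:
            return False  # a clean exit is not a failure
        # Without a retry limit, keep restarting on every failure.
        return max_retries is None or restarts_so_far < max_retries
    if policy in ("always", "unless-stopped"):
        # The two differ only in what happens on daemon startup.
        return True
    raise ValueError(f"unknown restart policy: {policy!r}")


print(should_restart("on-failure", 1, restarts_so_far=2, max_retries=3))
print(should_restart("on-failure", 0))
print(should_restart("always", 0))
```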

Related

docker-compose autorestart and supervisord autorestart : which to use?

I've seen some builds use supervisor to run the docker-compose up -d command, with the option to autostart and/or autorestart.
I'm wondering whether this cohabitation of supervisor and docker-compose works well. Aren't the two autorestart options interfering with each other? Also, what is the benefit of using supervisor instead of a plain docker-compose, other than running at startup if the server is shut down?
Please share your experience if you have used these two tools.
Thank you
Running multiple single-process containers is almost always better than running a single multiple-process container; avoid supervisord when possible.
Mechanically, the combination should work fine. Supervisord will capture logs and take responsibility for restarting the process in the container. That means docker logs will have no interesting output, and you need to get the file content out of the container. If one of the managed processes fails then supervisord will restart it. The container itself will probably never be restarted, unless supervisord manages to crash somehow.
There are a couple of notable disadvantages to using supervisord:
As noted, it swallows logs, so you need a complex file-oriented approach to read them out.
If one of the processes fails then you'll have difficulty seeing that from outside the container.
If you have a code update you have to delete and recreate the container with the new image, which with supervisord means restarting every process.
In a clustered environment like Kubernetes, every process in the supervisord container runs on the same node, and you can't scale individual processes up to handle additional load.
Given the choice of tools you suggest, I'd pretty much always use Compose with its restart: policies, and not use supervisord at all.

Docker container restarting in a loop on GCE

I have correctly deployed a Docker container which runs a Python script that grabs some data from the internet and slaps it in BigQuery. The container works well on my machine and on a GCE instance that I've provisioned.
Now, everything works well for the most part, but I am failing to understand why the Docker container always restarts after exiting (apparently correctly). Logs, in this case, seem to be fairly useless, as there is no error whatsoever. My current hunch is that something is failing silently, forcing the instance to restart.
Is there any way to find out the reboot reason for a given Docker container?
Things tried so far
I've tried to print the exit code of the container in the following way. The result is always 0, regardless of the restart cycles.
while true
do
  docker inspect my_container --format='{{.State.ExitCode}}'
  sleep 1
done
The Google Cloud documentation describes different ways to review your container-related logs, including container starts and stops.
In any case, I think there is no problem with your container: by default Compute Engine will restart a container on exit, although you can specify a different restart policy if you need to. Please see the relevant documentation.
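Beyond the exit code, the JSON printed by docker inspect also carries the restart count and the configured restart policy, which can help tell "Docker restarted it" apart from "something outside recreated it". The following is a hedged Python sketch that extracts those fields; the function name summarize_inspect and the sample JSON are invented for illustration, and on a GCE instance the restart may be driven by tooling outside Docker's own restart policy.

```python
import json

# Made-up, heavily trimmed example of `docker inspect my_container` output.
SAMPLE = """
[
  {
    "RestartCount": 4,
    "State": {"Status": "running", "ExitCode": 0, "OOMKilled": false},
    "HostConfig": {"RestartPolicy": {"Name": "always", "MaximumRetryCount": 0}}
  }
]
"""


def summarize_inspect(inspect_output):
    """Pull the restart-related fields out of `docker inspect` JSON."""
    container = json.loads(inspect_output)[0]  # inspect prints a JSON array
    state = container.get("State", {})
    policy = container.get("HostConfig", {}).get("RestartPolicy", {})
    return {
        "status": state.get("Status"),
        "exit_code": state.get("ExitCode"),
        "oom_killed": state.get("OOMKilled"),
        "restart_count": container.get("RestartCount"),
        "restart_policy": policy.get("Name"),
    }


print(summarize_inspect(SAMPLE))
```

On a real host the input could come from subprocess.run(["docker", "inspect", "my_container"], capture_output=True, text=True).stdout instead of the sample string.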

docker container lifecycle confusion

I am new to Docker, and I find the definitions of containers' lifecycle differ a lot.
here is what "Manning.Docker.in.Action.2016.3" shows:
here is what google gives me:
https://medium.com/@nagarwal/lifecycle-of-docker-container-d2da9f85959
here is what the official document says:
status: One of created, restarting, running, removing, paused, exited, or dead
https://docs.docker.com/engine/reference/commandline/ps/
So what's going on here? I guess some new states (and renamings) were introduced in newer versions of Docker?
Thanks in advance
Your linked diagram separates docker create from docker start, it includes "die" as a state transition, and it shows how to get to the "restarting" state. That's all valid, though it leads to a more complicated state machine.
(docker create wasn't in the very first versions of Docker but it appeared in Docker 1.3.0 in 2014, which should predate your diagram.)
Practically I might suggest an even simpler state machine:
-------> running -+------> stopped ------>
  run             |  stop            rm
                  \------> exited ------->
                     process exits   rm
That is, never try to restart a container or make changes inside a running container; if you need to tweak anything, delete the existing container and create a new one. This gives you a consistent environment (when the main container process starts you always know what's in its filesystem, up to mounted data). It also matches what happens in cluster environments like Kubernetes, where the cluster manager will routinely create and delete containers for you.
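The simplified state machine above can be written down as a small transition table. This is just an illustrative Python sketch; the "absent" state (no container exists yet, or it has been removed) and the function name next_state are names introduced here, not Docker terminology.

```python
# (current state, event) -> next state, following the simplified diagram.
TRANSITIONS = {
    ("absent", "run"): "running",
    ("running", "stop"): "stopped",
    ("running", "process exits"): "exited",
    ("stopped", "rm"): "absent",
    ("exited", "rm"): "absent",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


state = "absent"
for event in ["run", "process exits", "rm", "run", "stop", "rm"]:
    state = next_state(state, event)
    print(f"{event} -> {state}")
```

Note that there is deliberately no transition back from stopped or exited to running: in this model a container is always replaced rather than restarted.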
When you get into a situation where the internet gives you different answers, you should consider trying it yourself, especially with technologies like Docker, where it is pretty simple to run tests. For example:
I want to run a container (I will use nginx):
docker run -d nginx
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
258cd2edbed8 nginx "nginx -g 'daemon of…" 3 seconds ago Up 2 seconds 80/tcp jolly_golick
Note: Docker will keep a container running only as long as there is a process running in it.
If you start a debian container (for example), you will see that it stops immediately, as there is nothing running in it. So you could do
docker run -d debian sleep 10
and see that the container is up for 10 seconds.
When a container is running, there are some things you can do to it and others you can't, like removing it. To remove a container, you need to stop it first (or kill it), or force the removal.
Note: You would get all this information from Docker itself if you played around with it, because it reports it back to you. For example, if you try to remove a running container, you get this error:
Error response from daemon: You cannot remove a running container 258cd2edbed85bed23ab543312968bd893c1fbd9ba81de40366337f434daedff. Stop the container before attempting removal or force remove
I can't go through all possible combinations here. You would get a similar error if you tried to remove a paused container. Just play with it, and you will get a clear picture of how it works.

How to turn a docker container to a zombie

A few years ago, when I had just started playing with Docker, I remember some blog posts mentioning that if you don't handle your PID 1 process well, you will create a zombie Docker container. At that time, I simply followed the suggestion and started using an init tool called dumb-init, and I never actually saw a zombie container being created.
But I am still curious why it's a problem. If I remember correctly, docker stop xxx by default sends SIGTERM to the container's PID 1 process, and if the process cannot stop gracefully within 10 seconds (the default), Docker force-kills it by sending SIGKILL to the PID 1 process. I also know that the PID 1 process is special in a Linux system: it can ignore the SIGKILL signal. But I think that even if the process's PID inside the Docker container is 1, that is only because namespaces are used to scope its processes. On the host machine, you should see the process with a different PID, which can be killed by the kernel.
So my questions are:
Why can't the Docker engine just kill the container at the host kernel level, so that no matter what, the user can be sure the container is killed properly?
How can I create a zombie process in a Docker container? (If someone can share a Gist, that would be great!)
Not zombie containers, but zombie processes. Write this zombie.py:
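#!/usr/bin/env python3
import subprocess
import time
# Start a short-lived child and never call wait() on it
p = subprocess.Popen(['/bin/sleep', '1'])
# By now the child has exited, but nobody has reaped it
time.sleep(2)
# List all processes; the sleep should show up as a zombie
subprocess.run(['/bin/ps', '-ewl'])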
Write this Dockerfile:
FROM python:3
COPY zombie.py /
CMD ["/zombie.py"]
Build and run it:
chmod +x zombie.py
docker build -t zombie .
docker run --rm zombie
What happens here is that the /bin/sleep command runs to completion. The parent process needs to use the wait call to clean up after it, but it doesn't, so when it runs ps, you'll see a "Z" zombie process.
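For comparison, here is a hedged sketch of the same experiment with the missing wait call added. It reads the process state from /proc instead of running ps, so it assumes a Linux system; the helper names process_state and demo are made up for this example, and the sleep durations are shortened.

```python
import subprocess
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the
    process no longer exists (Linux only)."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # Format is "pid (comm) state ..."; comm may contain spaces,
            # so split after the last closing parenthesis.
            return f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return None


def demo():
    p = subprocess.Popen(["sleep", "0.2"])
    time.sleep(1)                   # child has exited but is not yet reaped
    before = process_state(p.pid)   # typically "Z" at this point
    p.wait()                        # reap the child
    after = process_state(p.pid)    # the process table entry is gone
    return before, after


print(demo())
```

Once wait() has collected the exit status, the kernel can drop the process table entry, which is why the zombie disappears.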
But wait, there's more! Say your process does carefully clean up after itself. In this specific example, subprocess.run() includes the required wait call, for instance, and you might change the Popen call to run. If that subprocess launches another subprocess, and it exits (or crashes) without waiting for it, the init process with pid 1 becomes the new parent process of the zombie. (It's worked this way for 40 years.) In a Docker container, though, the main container process runs with pid 1, and if it's not expecting "extra" child processes, you could wind up with stale zombie processes for the life of the container.
This leads to the occasional suggestion that a Docker container should always run some sort of "real" init process, maybe something as minimal as tini, so that something picks up after zombie processes and your actual container job doesn't need to worry about it.
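To give an idea of what such a minimal init does, here is a rough Python sketch: it starts the real command as a child, forwards SIGTERM and SIGINT to it, and reaps every child that exits, including orphans that get re-parented to PID 1. This is an illustration of the idea only, not how tini itself is implemented; the function name run_as_init is hypothetical, and the orphan adoption only applies when the script actually runs as PID 1 in the container.

```python
import os
import signal


def run_as_init(argv):
    """Run argv as a child process, forward termination signals to it,
    reap all exited children, and return the main child's exit code."""
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)  # replaces the child; no return on success
        except OSError:
            os._exit(127)             # command not found or not executable

    def forward(signum, _frame):
        os.kill(child, signum)        # pass the signal on to the real process

    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)

    while True:
        pid, status = os.wait()       # blocks until any child exits
        if pid != child:
            continue                  # an adopted orphan was reaped
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        return 128 + os.WTERMSIG(status)  # shell convention for signal deaths


print(run_as_init(["sh", "-c", "exit 3"]))
```

In a container this would be wired up as the entrypoint, with the real command passed as arguments and the return value handed to sys.exit so the container's exit status matches the application's.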

update running docker container

I have a running docker container with a base image fedora:latest.
I would like to preserve the state of my running applications, but still update a few packages which got security fixes (i.e. gnutls, openssl and friends) since I first deployed the container.
How can I do that without interrupting service or losing the current state?
So optimally I would like to get a bash/csh/dash/sh on the running container, or any fleet magic?
It's important to note that you may run into some issues with the container shutting down.
For example, imagine that you have a Dockerfile for an Apache container which runs Apache in the foreground. Imagine that you attach a shell to your container (via docker exec) and you start updating. You have to apply a fix to Apache and, in the process of updating, Apache restarts. The instant that Apache shuts down, the container will stop. You're going to lose the current state of the applications. This is going to require extremely careful planning and some luck, and some updates will probably not be possible.
The better way to do it is rebuild the image upon which the container is based with all the appropriate updates, then re-run the container. There will be a (brief) interruption in service. However, in order for you to be able to save the state of your applications, you would need to design the images in such a way that any state information that needs to be preserved is stored in a persistent manner - either in the host file system by mounting a directory or in a data container.
In short, if you're going to lose important information when your container shuts down, then your system is fragile & you're going to run into problems sooner or later. Better to redesign it so that everything that needs to be persistent is saved outside the container.
If the Docker container has a running bash, attach to it:
docker attach <containerIdOrName>
Otherwise, execute a new program in the same container (here: bash):
docker exec -it <containerIdOrName> bash
