Best way to detect Docker container exit - docker

I have some remote servers running Docker images, and want them to "phone home" when the container stops, to alert me to check out the recent logs and figure out what caused the main process in the container to exit. What's the best way to do this? I could just drop the container's CMD/ENTRYPOINT into a wrapper script that does something like ./my_process || phone_home.sh, but I imagine that won't handle actual Docker engine issues, like OOM or someone explicitly doing docker stop my-container.
Is there anything built in to Docker for this or am I better off with some custom sidecar "watcher" service that just constantly checks docker ps output?

Related

A windows container cannot be stopped succesfully

I use the dotnet3.5 image to run containers on win10 with docker desktop 2.1.0.1(37199). Sadly, I found that after I had created a container, did nothing to it, and left it alone for 4 days, the container automotically became unstoppable. The snapshot tells the story.
The container seemed existing there when docker ps -a, but I cannot get into the container by docker exec. And for I cannot stop it--the docker stop process hangs there after I use docker stop container2--I cannot rm the container.
The only way to resolve this issue is to restore docker desktop's factory setting.
By the way, although in the snapshot the running image is aspnet:3.5-windowsservercore-10.0.14393.953, this issue also happens when the aspnet:3.5
Does anyone have good ideas to the unstoppable container? Any suggestions are welcome.
The command used above is incorrect. There is a difference between the commands and options we use. "# docker ps" or "# docker container ls" will give you the list of currently running processes or active containers.
Whereas "-a" will give you all the list of all those which are used to date which contains the list of active and deleted containers.
In your case, the container was is not there and you are trying to access the one which is non-existing, which is why it is stuck.

Is there a way to hibernate a docker container

I like to use Jupyter Notebook. If I run it in a VM in virtualbox, I can save the state of the VM, and then pick up right where I left off the next day. Can I do something similar if I were to run it in a docker container? i.e. dump the "state" of the container to disk, then crank it back up and reload the "state"?
It looks like docker checkpoint may be the thing I'm attempting to accomplish here. There's not much in the docs that describes it as such. In fact, the docs for docker checkpoint say "Manage checkpoints" which is massively unhelpful.
UPDATE: This IS, in fact, what docker checkpoint is supposed to accomplish. When I checkpoint my jupyter notebook container, it saves it, I can start it back up with docker start --checkpoint [my_checkpoint] jupyter_notebook, and it shows the things I had running as being in a Running state. However, attempts to then use the Running notebooks fail. I'm not sure if this is a CRIU issue or a Jupyter issue, but I'll bring it up in the appropriate git issue tracker.
Anyhoo docker checkpoint is the thing that is supposed to provide VM-save-state/hibernate style functionality.
The closest approach I can see is docker pause <container-id>
https://docs.docker.com/engine/reference/commandline/pause/
The docker pause command suspends all processes in the specified containers. On Linux, this uses the cgroups freezer. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the cgroups freezer the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed.
Take into account as an important difference against VirtualBox hibernation, that there is no disk persistence of the memory state of the containerized process.
If you just stop the container, it hibernates:
docker stop myjupyter
(hours pass)
docker start myjupyter
docker attach myjupyter
I do this all the time, especially with docker containers which have web browers in them.

docker-compose where can I get a detail log (info) about what happened

I'm trying to pull and set a container with an database (mysql) with the following command:
docker-compose up
But for some reason is failing to start the container is showing me this:
some_db exited with code 0
But that doesn't give me any information of why that happened so I can fix it, so how can I see the details of that error?.
Instead of looking for how to find an error in your composition, you should change your process. With docker it is easy to fall into using all of the lovely tools to make things work seamlessly together, but this sounds like an individual container issue, not an issue with docker compse. You should try building an empty container with the baseimage you want then using docker exec -it <somecontainername> sh to go into it and run the commands you're trying to execute in your entrypoint manually to find the specific breakpoint.
Try at least docker logs <exited_container_name/id> (or more recently: docker container logs <exited_container_name/id>)
That way, you can check if there was any error message.
An exited container means the main process stopped: you need to make sure that main process remains up in order for the container to not exit immediately.
Try also docker inspect <exited_container_name/id> (or, again, docker container inspect <exited_container_name/id>)
If you see for instance "OOMKilled": true, that would mean the container memory was exceeded. You can also look for the "ExitCode" for clues.
See more at "Ten tips for debugging Docker containers" from Mark Betz.

How can I know why a Docker container stopped?

I have a Docker container that contains a JVM process. When the process ends the container completes and stops.
While thankfully rare, my JVM can go legs-up suddenly with a hard-failure, e.g. OutOfMemoryError. When this happens my container just stops, like a normal JVM exit.
I can have distributed logging, etc., for normal JVM logging, but in this hard-fail case I want to know the JVM's dying words, which are typically uttered on stderr.
Is there a way to know why my container stopped, look around in logs, stderr, or something along these lines?
You can run the docker logs [ContainerName] or docker logs [ContainerID] command even on stopped containers. You can see them with docker ps -a.
Source: the comment of Usman Ismail above: this is just a way to convert his comment into a proper answer.

Strategies for deciding when to use 'docker run' vs 'docker start' and using the latest version of a given image

I'm dockerizing some of our services. For our dev environment, I'd like to make things as easy as possible for our developers and so I'm writing some scripts to manage the dockerized components. I want developers to be able to start and stop these services just as if they were non-dockerized. I don't want them to have to worry about creating and running the container vs stopping and starting and already-created container. I was thinking that this could be handled using Fig. To create the container (if it doesn't already exist) and start the service, I'd use fig up --no-recreate. To stop the service, I'd use fig stop.
I'd also like to ensure that developers are running containers built using the latest images. In other words, something would check to see if there was a later version of the image in our Docker registry. If so, this image would be downloaded and run to create a new container from that image. At the moment it seems like I'd have to use docker commands to list the contents of the registry (docker search) and compare that to existing local containers (docker ps -a) with the addition of some greping and awking or use the Docker API to achieve the same thing.
Any persistent data will be written to mounted volumes so the data should survive the creation of a new container.
This seems like it might be a common pattern so I'm wondering whether anyone else has given these sorts of scenarios any thought.
This is what I've decided to do for now for our Neo4j Docker image:
I've written a shell script around docker run that accepts command-line arguments for the port, database persistence directory on the host, log file persistence directory on the host. It executes a docker run command that looks like:
docker run --rm -it -p ${port}:7474 -v ${graphdir}:/var/lib/neo4j/data/graph.db -v ${logdir}:/var/log/neo4j my/neo4j
By default port is 7474, graphdir is $PWD/graph.db and logdir is $PWD/log.
--rm removes the container on exit, however the database and logs are maintained on the host's file system. So no containers are left around.
-it allows the container and the Neo4j service running within it to receive signals so that the service can be gracefully shut down (the Neo4j server gracefully shuts down on SIGINT) and the container exited by hitting ^C or sending it a SIGINT if the developer puts this in the background. No need for separate start/stop commands.
Although I certainly wouldn't do this in production, I think this fine for a dev environment.
I am not familiar with fig but your scenario seems good.
Usually, I prefer to kill/delete + run my container instead of playing with start/stop though. That way, if there is a new image available, Docker will use it. This work only for stateless services. As you are using Volumes for persistent data, you could do something like this.
Regarding the image update, what about running docker pull <image> every N minutes and checking the "Status" that the command returns? If it is up to date, then do nothing, otherwise, kill/rerun the container.

Resources