Should we use supervisors to keep processes running in Docker containers? - docker

I'm using Docker to run a Java REST service in a container. If I were outside of a container, then I might use a process manager/supervisor to ensure that the Java service restarts if it encounters a strange one-off error. I see some posts about using supervisord inside of containers, but it seems like they're focused mostly on running multiple services, rather than just keeping one up.
What is the common way of managing services that run in containers? Should I just be using some built-in Docker stuff on the container itself rather than trying to include a process manager?

You should not run a process supervisor inside a single-service Docker container. A process supervisor effectively hides the health of your service, making it more difficult to detect when you have a problem.
You should rely on your container orchestration layer (which may be Docker itself, or a higher level tool like Docker Swarm or Kubernetes) to restart the container if the service fails.
With Docker (or Docker Swarm), this means setting a restart policy on the container.
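For example (the image name below is just a placeholder), a restart policy can be set when the container is started:

    # Restart automatically when the process exits with a non-zero status
    docker run -d --restart on-failure my-java-rest-service

    # Or restart in every case except an explicit "docker stop"
    docker run -d --restart unless-stopped my-java-rest-service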

Related

nested docker setup: child exposes parent

I have the following docker setup:
Ubuntu server
-> running Jenkins (started with docker-compose)
-> running a pipeline which starts a node-alpine image
-> which then calls a new docker-compose up (needed for tests)
If I call docker ps from the node-alpine container, I see all the containers from the ubuntu server. I would have expected to only see the newly started containers.
Is this an indication that my setup is flawed? Or just the way docker works?
That's just the way Docker works. There's no such thing as a hierarchy of containers.
Typically with setups like this you give an orchestrator (like Jenkins) access to the host's Docker socket. That means containers launched from Jenkins are indistinguishable from containers launched directly from the host, and it means that Jenkins can do anything with Docker that you could have done from the host.
Remember that access to the Docker socket means reading and modifying arbitrary files as root on the host, along with starting, stopping, deleting, and replacing other containers. In this setup you might re-evaluate how much you really need that lowest level container to start further containers, since it is a significant security exposure.
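For reference, that access is typically granted by bind-mounting the host's Docker socket into the Jenkins container; a minimal sketch, with the volume name and image tag as assumptions:

    # Jenkins can now start "sibling" containers directly on the host daemon
    docker run -d --name jenkins \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v jenkins_home:/var/jenkins_home \
      jenkins/jenkins:lts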

Why does Docker have a daemon?

I recently discovered rkt, a competitor container runtime to Docker. It seems like rkt does not need a daemon. For me, rkt is like running any other command and it works easily with systemd (or other init systems).
This makes me wonder about the utility of Docker's daemon.
Why does Docker need a daemon? What does the daemon provide that would not be possible without it? Is its only goal to remove the need for an init system like systemd (as can be seen in RancherOS)?
Docker was designed as a client/server application, which allows you to have remote access to the Docker API. This enabled tools like the classic container-based Swarm, which was effectively a reverse proxy in front of a cluster of Docker hosts.
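As an illustration of that remote access (the host name is a placeholder, and an exposed TCP socket should be protected with TLS):

    # Point the CLI at a remote daemon over TCP
    DOCKER_HOST=tcp://remote-host:2375 docker ps

    # Or tunnel the API over SSH (Docker 18.09 and later)
    docker -H ssh://user@remote-host ps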
The daemon also provides a place for shared state: it restarts containers according to their restart policy, and it manages networks and volumes that may be shared between multiple containers.
Lastly, with the introduction of swarm mode, the daemon is also the central location for orchestration features that would otherwise run as their own daemons in tools like Kubernetes.
If you need a daemon-less solution but otherwise like Docker, then consider using runc, which is the runtime that Docker uses for each container by default.
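A minimal sketch of running a container directly with runc, building an OCI bundle from an existing image (the busybox image and paths are just for illustration):

    # An OCI bundle is a directory with a rootfs/ plus a config.json
    mkdir -p mycontainer/rootfs && cd mycontainer
    docker export "$(docker create busybox)" | tar -C rootfs -xf -
    runc spec                  # generate a default config.json
    sudo runc run mycontainer  # no daemon involved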
Using runc directly doesn't address init inside the container. If you need that, Docker now includes an optional init that you can enable per container. And you've always had the option to include your own init, like tini, if you need something to clean up zombie processes.
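Enabling that optional init is a single flag (the image name is a placeholder):

    # Run docker-init (tini) as PID 1 to reap zombies and forward signals
    docker run -d --init my-java-rest-service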

Does the process inside of the docker image still need to be managed?

Say I am running a Java web application inside of my Docker container that runs on Elastic Beanstalk (or any other framework for that matter).
Am I still responsible for making sure my process has some kind of process management to make sure it is running correctly, e.g. supervisord or runit?
Or is this something that EB will somehow manage?
When the process inside the container stops, so too does the container (which is designed to run that single process). So you don't have to manage the process inside your container; instead, rely on the system managing your containers to restart them. For example, "services" in Docker Swarm and Replication Controllers in Kubernetes are designed to keep a desired number of containers running. When one dies, a new one takes its place.
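For example, a Swarm service keeps its declared replica count running (service and image names are placeholders):

    # Swarm restarts or reschedules tasks so that 3 replicas stay running
    docker service create --name rest-api --replicas 3 my-java-rest-service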

How could one use Docker Compose to synchronize container execution?

The problem I'm trying to solve is similar to Docker Compose wait for container X before starting Y. I use Docker Compose to launch several containers, all running on the same host, three of which are PostgreSQL, Liquibase, and a Web application servlet running in Tomcat. The PostgreSQL and Web application containers are both long running while the Liquibase container is ephemeral. The containers must not only start in order, but each container must also wait for the preceding container to be available or complete. In particular, the PostgreSQL server must be ready to process SQL commands before the Liquibase container runs, and the Liquibase schema migration task must complete before the Web application starts to ensure that the database schema is in a valid state.
I understand that I can achieve this synchronization using two wrapper "wait-for" scripts that poll for certain conditions (and this may be the only available option), the first of which would poll the availability of the PostgreSQL server to process commands while the second, which would run just prior to the Web application, could poll for the presence of a particular database object. However, like process synchronization, I think container synchronization is a common problem that can be addressed with more general inter-process communication and synchronization primitives like semaphores. Docker Compose would likely benefit the most from such synchronization mechanisms, but Docker containers might find them useful, too, for example, to establish multiple synchronization points within a container.
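Such a wrapper might look like the following sketch (the pg_isready check and the environment variables are assumptions):

    #!/bin/sh
    # wait-for-postgres.sh: block until the PostgreSQL server accepts
    # commands, then exec whatever command was passed as arguments.
    until pg_isready -h "$POSTGRES_HOST" -U "$POSTGRES_USER"; do
      echo "Waiting for PostgreSQL..."
      sleep 2
    done
    exec "$@"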
Until Docker Compose or Docker supports container synchronization primitives (similar to process synchronization primitives, but accessible from the shell), "Dependencies for docker-compose with inotify" is one of the better solutions that I've found to the Docker Compose container synchronization problem.
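Newer versions of Docker Compose also support health-check-based startup ordering that covers this case; a sketch, with image names, credentials, and the health check as assumptions:

    services:
      postgres:
        image: postgres:15
        environment:
          POSTGRES_PASSWORD: example          # placeholder
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -U postgres"]
          interval: 5s
          retries: 10
      liquibase:
        image: liquibase/liquibase            # placeholder migration image
        depends_on:
          postgres:
            condition: service_healthy
      webapp:
        image: tomcat:9                       # placeholder servlet container
        depends_on:
          liquibase:
            condition: service_completed_successfully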
In addition to Consul, etcd, and ZooKeeper, MQTT retained messages are another simple mechanism that Docker containers might use to coordinate activities. Mosquitto is a lightweight, open-source implementation of MQTT.
I've come to the conclusion that Docker Compose is not the most appropriate tool for container synchronization. Tools like Kubernetes or Marathon facilitate more sophisticated container synchronization. "What is the best Docker Linux Container orchestration tool?" compares available container synchronization tools.

Is it wrong to run a single process in docker without providing basic system services?

After reading the introduction of the phusion/baseimage I feel like creating containers from the Ubuntu image or any other official distro image and running a single application process inside the container is wrong.
The main reasons in short:
No proper init process (that handles zombie and orphaned processes)
No syslog service
Based on these facts, most of the official Docker images available on Docker Hub seem to do things wrong. As an example, the MySQL image runs mysqld as the only process and does not provide any logging facilities other than messages written by mysqld to STDOUT and STDERR, accessible via docker logs.
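For what it's worth, those STDOUT/STDERR messages are read with (the container name is a placeholder):

    docker logs some-mysql
    docker logs -f some-mysql   # follow the log stream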
Now the question arises: what is the appropriate way to run a service inside a Docker container?
Is it wrong to run only a single application process inside a docker container and not provide basic Linux system services like syslog?
Does it depend on the type of service running inside the container?
Check this discussion for a good read on this issue. Basically, the official party line from Solomon Hykes and Docker is that Docker containers should be as close to single-process micro-servers as possible. There may be many such servers on a single 'real' server. If a process fails, you should just launch a new Docker container rather than try to set up initialization etc. inside the containers. So if you are looking for the canonical best practice, the answer is yes: no basic Linux services. It also makes sense when you think in terms of many Docker containers running on a single node: do you really want them all to run their own copies of these services?
That being said, the state of logging in the Docker service is famously broken. Even Solomon Hykes, the creator of Docker, admits it's a work in progress. In addition, you normally need a little more flexibility for a real-world deployment. I normally mount my logs onto the host system using volumes and have a log-rotate daemon etc. running in the host VM. Similarly, I either install sshd or leave an interactive shell open in the container so I can issue minor commands without relaunching, at least until I am really sure my containers are air-tight and no more debugging will be needed.
Edit:
With Docker 1.3 and the exec command, it's no longer necessary to "leave an interactive shell open."
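For example (the container name is a placeholder):

    # Open a shell in an already-running container, no sshd required
    docker exec -it my-running-container /bin/sh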
It depends on the type of service you are running.
Docker allows you to "build, ship, and run any app, anywhere" (from the website). That tells me that if an "app" consists of or requires multiple services/processes, then those should be run in a single Docker container. It would be a pain for a user to have to download and then run multiple Docker images just to run one application.
As a side note, breaking up your application into multiple images is subject to configuration drift.
I can see why you would want to limit a Docker container to one process. One reason is startup time: when creating a Docker provisioning system, it's essential to keep the time it takes to bring a container up to a minimum so that scaling sideways is fast. This means that if I can get away with running a single process per Docker container, then I should go for it. But that's not always possible.
To answer your question directly: no, it's not wrong to run a single process in Docker.
HTH
