Alternative to supervisord for docker - docker

Supervisord is really great tool even for docker environment. It helps a lot with stderr redirection and signals forwarding. But it has a couple of disadvantages:
It doesn't support delayed startup. It could be useful to delay some agent startup until main app is initializing. Priority doesn't solve this issue.
If some app enters FATAL state supervisord just logs it but continue to work. So you can't see it until look at logs of container. It could much more friendly if supervisord just stops because in that case you see the problem with docker ps -a
So what is the best alternative to supervisord?

In response to the "PID1 zombie reaping" issue, I recommended before (in "Use of Supervisor in docker") to use runit instead of supervisord
Runit uses less memory than Supervisord because Runit is written in C and Supervisord in Python.
And in some use cases, process restarts in the container are preferable over whole-container restarts.
See the phusion/baseimage-docker image for more.
As noted by Torsten Bronger in the comments:
Runit is not there to solve the reaping problem.
Rather, it's to support multiple processes. Multiple processes are encouraged for security (through process and user isolation).
Since 2015, you now can Specify an init process that should be used as the PID 1 in the container, with docker run --init
The default init process used is the first docker-init executable found in the system path of the Docker daemon process.
This docker-init binary, included in the default installation, is backed by tini.

Any new visitor:
If you are a beginner to containerization it is highly recommended to use one process per container it makes everyone's life easy.
Without any hack, all other tools will work perfectly, it will improve Observability as well.
Don't treat containers like VMs.

Not sure why you'd need supervisor if you have docker. Docker does everything supervisor does, start/stop/manage services, keep the stdout/stderr stored so you can read it later. Restart everything after reboot.

Related

Question about safely starting and stopping docker container

I'm learning to use RabbitMQ and I am hosting it in a Docker container on my laptop. I went this route instead of running it directly on my laptop because I also wanted to gain some experience with Docker. Is it best to start and stop the container on shutdown/startup or does Docker automatically start and stop the container for me? I'm having a hard time finding clear instructions on making sure the state of my container is maintained and ensuring the work I do in it isn't lost due to simple mistakes. Thanks in advance!!
Is it best to start and stop the container on shutdown/startup or does Docker automatically start and stop the container for me?
Docker will automatically start containers for you if you set an appropriate restart policy on the container. In general, this means using the unless-stopped policy, as in:
docker run --restart=unless-stopped ...
Your container will but "shut down" when your system shuts down the same way as any other process -- your service will probably first receive a SIGTERM, and may later receive a SIGKILL. This is sufficient for most applications.
I'm having a hard time finding clear instructions on making sure the state of my container is maintained and ensuring the work I do in it isn't lost due to simple mistakes.
Any data in your container will persist between a stop and a start as long as you don't remove the container. However, if you are generating data that you care about, your best choice is to mount a volume in the container and store the data there. This makes the life cycle of your data independent of the container, which means you can replace/upgrade your container while still preserving your data.
You can read more about volumes here.
Using volumes effectively requires you to know where your application stores data. This will often be documented in the README for the image, but not always.
Do you have any resources you recommend for learning how to use and develop in Docker?
The Docker documentation itself is a great place to start. Looking at the design of some of the official images can also be educational.

What's the purpose of the node-modules container in wolkenkit?

That container is built when deploying the application.
Looks like its purpose is to share dependencies across modules.
It looks like it is started as a container but nothing is apparently running, a bit like an init container.
Console says it starts/stops that component when using respective wolkenkit start and wolkenkit stop command.
On startup:
On shutdown:
When you docker ps, that container cannot be found:
Can someone explain these components?
When starting a wolkenkit application, the application is boxed in a number of Docker containers, and these containers are then started along with a few other containers that provide the infrastructure, such as databases, a message queue, ...
The reason why the application is split into several Docker containers is because wolkenkit builds upon the CQRS pattern, which suggests separating the read side of an application from the application's write side, and hence there is one container for the read side, and one for the write side (actually there are a few more, but you get the picture).
Now, since you may develop on an operating system other than Linux, the wolkenkit application may run under a different operating system than when you develop it, as within Docker it's always Linux. This means that the start command can not simply copy over the node_modules folder into the containers, as they may contain binary modules, which are then not compatible (imagine installing on Windows on the host, but running on Linux within Docker).
To avoid issues here, wolkenkit runs an npm install when starting the application inside of the containers. The problem now is that if wolkenkit did this in every single container, the start would be super slow (it's not the fastest thing on earth anyway, due to all the Docker building and starting that's happening under the hood). So wolkenkit tries to optimize this as much as possible.
One concept here is to run npm install only once, inside of a container of its own. This is the node-modules container you encountered. This container is then linked as a volume to all the containers that contain the application's code. This way you only have to run npm install once, but multiple containers can use the outcome of this command.
Since this container now contains data, but no code, it only has to be there, it doesn't actually do anything. This is why it gets created, but is not run.
I hope this makes it a little bit clearer, and I was able to answer your question :-)
PS: Please note that I am one of the core developers of wolkenkit, so take my answer with a grain of salt.

How to interact with already running instance via terminal in Mongooseim?

I am using Mongooseim 3.2.0 from the source code on the ubuntu server. Below are concern:
What is the best way to run mongooseim as a service so that it automatically restarts if mongooseim crashes or system restarts?
How to interact via terminal with already running mongooseim instance on the ubuntu server like "mongooseimctl live". My guess is running "mongooseimctl live" will try to create another instance. I just want to see the live logs and interaction and don't want to keep scrolling the long log files for this purpose.
I apologize if the answer to above is obvious but just want to follow the best guidance.
mongooseimctl live or mongooseimctl foreground is mostly useful for development or smoke testing a deployment (unless you're running inside a container). For real world use cases you should start the server in the background with mongooseimctl start.
Back to the container - the best approach for containerised applications is to run them in the foreground, therefore in a container startup script use mongooseimctl foreground.
Once the server is running (no matter how it was started) attaching a shell to troubleshoot issues can be done with mongooseimctl debug. This is the command to use when you get the Protocol 'inet_tcp': the name mongooseim#localhost seems to be in use by another Erlang node error. Be careful if it's a production environment - you can easily take the server down with access to this shell.
If you're just interested in watching logs, with no interactive access to the server internals that the shell offers, a simple tail -f /your-configured-mongooseim-log-dir/* should be enough.
Ubuntu nowadays uses systemd for managing its services' lifetimes. A systemd .service file can be found at https://github.com/esl/MongooseIM/blob/master/tools/pkg/platforms/debian_stretch/files/build/mongooseim.service - we use it for packaging into Debian/Ubuntu .deb packages.

Is it recommended to run systemd inside docker container?

I am planning to use 'systemd' inside the container. Based on the articles I have read, it is preferable to limit only one process per container.
But if I configure 'systemd' inside the container, I will end up running many processes.
It would be great to understand the pros and cons of using systemd inside the container before I take any decision.
I'd advise you to avoid systemd in a container if at all possible.
Systemd mounts filesystems, controls several kernel parameters, has its own internal system for capturing process output, configures system swap space, configures huge pages and POSIX message queues, starts an inter-process message bus, starts per-terminal login prompts, and manages a swath of system services. Many of these are things Docker does for you; others are system-level controls that Docker by default prevents (for good reason).
Usually you want a container to do one thing, which occasionally requires multiple coordinating processes, but you usually don't want it to do any of the things systemd does beyond provide the process manager. Since systemd changes so many host-level parameters you often need to run it as --privileged which breaks the Docker isolation, which is usually a bad idea.
As you say in the question, running one "piece" per container is usually considered best. If you can't do this then a light-weight process manager like supervisord that does the very minimum an init process is required to is a better match, both for the Docker and Unix philosophies.
s6 became a somewhat popular init for containers when you need more than one process. And yes, it's not "one process per container", it's "one thing per container". Running a website e.g. is still one thing but it's usually more than one process.
You should think it more to be a question which init system you like to use.
One may use the old /sbin/init or the systemd-daemon running as PID-1 in a container. Any command like "docker stop" will talk to PID-1 only. If you do only have one java application in a container then it is recommended to run that process directly as PID-1 of the container.
Running systemd is mostly not required - if you have multiple services in a container or if some wrapper script uses 'systemctl' then you may still want to use activate it. But the latter use cases would also be covered by docker-systemctl-replacement.

Is it best practice to daemonize a process within docker?

Many best practice guides emphasize making your process a daemon and having something watch it to restart in case of failure. This made sense for a while. A specific example can be sidekiq.
bundle exec sidekiq -d
However, with Docker as I build I've found myself simply executing the command, if the process stops or exits abruptly the entire docker container poofs and a new one is automatically spun up - basically the entire point of daemonizing a process and having something watch it (All STDOUT is sent to CloudWatch / Elasticsearch for monitoring).
I feel like this also tends to re-enforce the idea of a single process in a docker container, which if you daemonize would tend to in my opinion encourage a violation of that general standard.
Is there any best practice documentation on this even if you're running only a single process within the container?
You don't daemonize a process inside a container.
The -d is usually seen in the docker run -d command, using a detached (not daemonized) mode, where the the docker container would run in the background completely detached from your current shell.
For running multiple processes in a container, the background one would be a supervisor.
See "Use of Supervisor in docker" (or the more recent docker --init).
Some relevent 12 Factor app recommendations
An app is executed in the execution environment as one or more processes
Concurrency is implemented by running additional processes (rather than threads)
Website:
https://12factor.net/
Docker was open sourced by a PAAS operator (dotCloud) so it's entirely possible the authors were influenced by this architectural recommendation. Would explain why Docker is designed to normally run a single process.
The thing to remember here is that a Docker container is not a virtual machine, although it's entirely possible to make it quack like one. In practice a docker container is a jailed process running on the host server. Container orchestration engines like Kubernetes (Mesos, Docker Swarm mode) have features that will ensure containers stay running, replacing them should the need arise.
Remember my mention of duck vocalization? :-) If you want your container to run multiple processes then it's possible to run a supervisor process that keeps everything healthy and running inside (A container dies when all processes stop)
https://docs.docker.com/engine/admin/using_supervisord/
The ultimate expression of this VM envy would be LXD from Ubuntu, here an entire set of VM services get bootstrapped within LXC containers
https://www.ubuntu.com/cloud/lxd
In conclusion is it a best practice? I think there is no clear answer. Personally I'd say no for two reasons:
I'm fixated on deploying 12 factor compliant applications, so married to the single process model
If I need to run two processes on the same set of data, then in Kubernetes I can run containers within the same POD... Means Kubernetes manages the processes (running as separate containers with a common data volume).
Clearly my reasons are implementation specific.
There are multiple run supervisors that can help you take a foreground process (or multiple ones) run them monitored and restart them on failure (or exit the container).
one is runit (http://smarden.org/runit/), which I have not used myself.
my choice is S6 (http://skarnet.org/software/s6/). someone already built a container envelope for it, named S6-overlay (https://github.com/just-containers/s6-overlay) which is what I usually use if/when I need to have a user-space process run as daemon. it also has facets to do prep work on container start, change permissions and more, in runtime.
tl;dr: I can't find a best practices document that relates directly to this for docker, but I agree with you.
The only best "Best Practices" for docker I could find was at dockers own site, which states that containers should be one process. In my mind, that means foregrounded processes as well. So basically, I've drawn the same conclusion as you. (You've probably read that too, but this is for anyone else reading this).
Honestly, I think we are still in (relatively) new territory with best practices for docker. Anecdotally, it has been a best practice in the organizations I've worked with. The number of times I've felt more satisfied with a foregrounded process has been significantly greater then the times I've said to myself "Boy, I sure wish I backgrounded that one." In fact, I don't think I've ever said that.
The only exception I can think of is when you are trying to evaluate software and need a quick and dirty way to ship infrastructure off to someone. EG: "Hey, there is this new thing called LAMP stacks I just heard of, here is a docker container that has all the components for you to play around with". Again, though, that's an outlier and I would shudder if something like that ever made it to production or even any sort of serious development environment.
Additionally, it certainly forces a micro-architecture style, which I think is ultimately a good thing.

Resources