How critical is dumb-init for Docker?

I hope that this question will not be marked as primarily opinion-based, but that there is an objective answer to it.
I have read Introducing dumb-init, an init system for Docker containers, which extensively describes why and how to use dumb-init. To be honest, for someone not too experienced with how the Linux process structure works, this sounds pretty dramatic - and it feels as if you are doing things entirely wrong if you don't use dumb-init.
This is why I'm thinking about using it in my own Docker images… What keeps me from doing so is that I have not yet found an official Docker image that uses it.
Take mongo as an example: They call mongod directly.
Take postgres as an example: They call postgres directly.
Take node as an example: They call node directly.
…
If dumb-init is so important - why is apparently nobody using it? What am I missing here?

Something like dumb-init or tini is useful if you have a process that spawns child processes but doesn't implement proper signal handling, i.e. it doesn't reap its children or stop them when the process itself is asked to stop.
If your process doesn't spawn child processes (e.g. a typical Node.js server), this may not be necessary.
I assume that MongoDB, PostgreSQL, ... which may run child processes, have proper signal handlers implemented; otherwise zombie processes would pile up and someone would have filed an issue to get it fixed.
The only problem may be the official language images, like node, ruby or golang. They don't ship dumb-init/tini because you normally don't need it. If a developer writes code that spawns children badly, it's up to them to either fix the signal handlers or use such a helper as PID 1.
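To make the "helper as PID 1" point concrete, here is a rough sketch, in Python and purely for illustration, of what dumb-init/tini roughly do: run a single child, forward termination signals to it, and reap orphaned processes so they don't linger as zombies. The child command is a placeholder; the real tools are small, battle-tested C programs and should be preferred.

    import os
    import signal
    import subprocess
    import sys

    # Placeholder child command; in a real image this would be your entrypoint.
    CHILD_CMD = ["sleep", "3600"]

    def main():
        child = subprocess.Popen(CHILD_CMD)

        def forward(signum, _frame):
            # Forward the signal to the direct child instead of swallowing it,
            # which is effectively what a naive PID 1 does.
            child.send_signal(signum)

        signal.signal(signal.SIGTERM, forward)
        signal.signal(signal.SIGINT, forward)

        # Reap every child that exits, so orphaned grandchildren that get
        # re-parented to PID 1 don't accumulate as zombies.
        while True:
            try:
                pid, status = os.wait()
            except ChildProcessError:
                break
            if pid == child.pid:
                if os.WIFEXITED(status):
                    sys.exit(os.WEXITSTATUS(status))
                sys.exit(128 + os.WTERMSIG(status))

    if __name__ == "__main__":
        main()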

Related

Nix and services to run and stop

My main goal is not to have a reproducible environment but rather an independent one.
I can achieve this by using docker, and specifically docker-compose, where I can describe the services I need and start/stop them with ease. That way I can have two versions of a database used in two different projects without polluting my global space with them. That all sounds nice and shiny, but I am on macOS and all of this is particularly slow, to the extent that it is sometimes unusable.
As an alternative to docker/containers, people often propose nix. I like the idea of it: you get reproducible and isolated environments without any virtualization/containerization on top. Cool! Or it would be, if there were any information on services and how to use them with nix. The only thing I found is that there is such a thing as a shellHook, which allows you to do things like starting a db when you enter a nix shell. But you can't automatically stop it when you leave the shell or simply close the terminal.
Is there something in the nix world that helps manage services with the same ease with which it helps manage libraries/languages/frameworks?
Sounds like you need a process manager.
You might be able to use nix-processmgmt to write service configurations that you can run with supervisord on macOS. Launchd is also supported, but project configuration shouldn't be mixed with system configuration, if avoidable.
I haven't tried this yet, because I can do my backend work on NixOS with arion, which uses docker compose as a backend. I'll be interested to know what you think of nix-processmgmt.

How to use Dask on Databricks

I want to use Dask on Databricks. It should be possible (I cannot see why not), but when I import it, one of two things happens: either I get an ImportError, or, when I install distributed to solve that, Databricks just says "Cancelled" without throwing any errors.
For anyone looking for an answer: check this Medium blog post. To prevent people from missing it in the comments, I'm posting it as an answer.
I don't think we have heard of anyone using Dask under Databricks, but as long as it's just Python, it may well be possible.
The default scheduler for Dask is threads, and this is the most likely thing to work. In that case you don't even need to install distributed.
For the Cancelled error, it sounds like you are using distributed and, at a guess, the system is not allowing you to start extra processes (you could test this with the subprocess module). To work around it, you could do
import dask.distributed
client = dask.distributed.Client(processes=False)
Of course, if it is indeed the processes that you need, this would not be great. Also, I have no idea how you might expose the dashboard's port.
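For what it's worth, here is a minimal sketch of the threads-only route described above; I haven't verified this on Databricks itself, and the DataFrame is made up, but nothing in it needs distributed or extra processes:

    import dask.dataframe as dd
    import pandas as pd

    # Made-up data; in practice you would read from DBFS or a mounted path.
    pdf = pd.DataFrame({"x": range(1000), "y": range(1000)})
    ddf = dd.from_pandas(pdf, npartitions=8)

    # The threaded scheduler is the default for dask.dataframe; it is spelled
    # out here to emphasise that no worker processes are started.
    result = ddf.x.mean().compute(scheduler="threads")
    print(result)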

Turning on dbg tracing on a remote node?

When running our Erlang application in our system tests, I sometimes want to turn on and capture a debug trace.
The Erlang node is started using a relx start script (called as _rel/bin/foo foreground), so I don't have any control over the startup options. The system test runner (written in Python) is capturing stdout from the node.
How do I connect to an Erlang node, using -remsh, turn on dbg-tracing, and have that output written to stdout on the original node? And how do I do this all in a Python-friendly way (though I'm happy to write an escript if that'll make it easier).
To complicate this further, the relx-generated release doesn't include the runtime_tools library, so dbg: isn't actually available; hence this additional question.
There are quite a few ways you could do that. It all depends on what you are familiar with and what your use case is.
I would start by doing everything by hand. That way you have the greatest control over what is going on and can see what the effects look like (whether you are turning on too much tracing or not enough). That's what I'm most familiar with, and in my experience you will almost always end up connecting to a remote shell and doing something by hand anyway.
One feature of dbg that not too many people talk about is the ability to save/load trace patterns to/from files. I find that the easiest way to store and share debugging information between sessions, but the lack of readability might be too big a trade-off.
You don't have to use dbg if you don't want to interfere with your live system too much. You could use erlang:trace, which is available by default, but you must be careful about the state you leave your VM in (dbg should turn off all tracing upon exit; with erlang:trace that's your responsibility).
If your debug session is part of a Python script, writing an escript and calling it from Python would be my way to go, as sketched below. Just remember that an escript runs in a new VM, and -remsh will not let you simply run your code on the other VM; you will have to use the rpc module for that.
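As a sketch of that escript-from-Python idea: the Python side can stay trivial and just capture whatever the escript prints. trace_on.escript and the node name are hypothetical; the escript itself (written separately) would connect to the node and enable tracing there via the rpc module.

    import subprocess

    # Hypothetical escript that connects to the running node and uses rpc
    # to enable tracing there; adjust the node name to your release.
    cmd = ["escript", "trace_on.escript", "foo@127.0.0.1"]

    # Capture whatever the escript prints so the Python test runner can log
    # it alongside the stdout it already collects from the node itself.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    print(result.stdout)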
Since your application is released, you might also look into logging. One might assume that there is already some logging in place, quite possibly lager, which is somewhat standard in Erlang and which lets you change the logging level at runtime.
Personally I would try some mix of the first and last options, and just experiment.

Erlang: Who supervises the supervisor?

In all the Erlang supervisor examples I have seen so far, there is usually a "master" supervisor that supervises the whole tree (or at least is the root node of the supervisor tree). What if the "master" supervisor breaks? How should the "master" supervisor be supervised? Is there any typical pattern?
The top supervisor is started in your application's start/2 callback using start_link, which means it is linked to the application process. If the application process receives an exit signal because the top supervisor died, it does one of two things:
If the application is started as a permanent application, the entire node is terminated (and possibly restarted using HEART).
If the application is started as temporary, the application stops running and no restart attempts are made.
Typically a supervisor is set up to "only" supervise other processes, which means no user-written code is executed by the supervisor, so it is very unlikely to crash.
Of course, this cannot be enforced... so the typical pattern is to not have any application-specific logic in a supervisor: it should only supervise, and do nothing else.
Good question. I have to concur that the examples and tutorials mostly ignore this issue, even if occasionally someone mentions it (without providing an example solution):
If you want reliability, use at least two computers and make them supervise each other. How to actually implement that with OTP, however, appears (given the current state of documentation and tutorials) to be somewhere between well hidden and secret.

Windows srvany.exe and service STOP

I've read the many answers online on how to use SRVANY.exe to create a Windows service out of anything. My service is a batch file that sets up the environment (I need to set env vars and map drives) and then spawns my C++ app. But when I do a NET STOP, the srvany.exe process goes away and my C++ app stays alive. Is there any way to have it killed when the service receives the stop command? I need to be able to bounce it in case of any config file changes.
The reason I picked a cmd shell is the easy drive mapping. In theory I can wrap it with either Perl or Python, whichever makes it easier to get this behavior, but then I'd need to shell out anyway to map the drives. Does this make sense?
AlwaysUp is a commercial alternative to SrvAny which covers shortcomings like this one in addition to adding more useful features.
NSSM is an open source alternative with slightly fewer features than AlwaysUp, but it can still kill the underlying process when you stop the service.
No, srvany was not designed to stop your applications. Its main purpose was to make it possible to run applications as a service that were not designed to run as one.
As a clumsy workaround, you can run a scheduled task that monitors whether srvany is running and, if not, terminates your application.
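Since the question mentions possibly wrapping the app in Python: below is a rough sketch of that route using pywin32 instead of srvany, so the child actually gets killed on NET STOP. Every path, environment variable and drive mapping here is a made-up placeholder, and it's only an outline of the approach, not a drop-in service.

    import os
    import subprocess

    import win32event
    import win32service
    import win32serviceutil

    class AppWrapperService(win32serviceutil.ServiceFramework):
        _svc_name_ = "MyAppWrapper"            # placeholder service name
        _svc_display_name_ = "My App Wrapper"  # placeholder display name

        def __init__(self, args):
            win32serviceutil.ServiceFramework.__init__(self, args)
            self.stop_event = win32event.CreateEvent(None, 0, 0, None)
            self.child = None

        def SvcDoRun(self):
            # Recreate what the batch file did: map a drive and set env vars.
            subprocess.call(["net", "use", "X:", r"\\server\share"])  # placeholder
            env = os.environ.copy()
            env["MY_ENV_VAR"] = "value"  # placeholder variable

            # Start the real application as a child we keep a handle to.
            self.child = subprocess.Popen([r"C:\myapp\app.exe"], env=env)

            # Block here until SvcStop signals the event.
            win32event.WaitForSingleObject(self.stop_event, win32event.INFINITE)

        def SvcStop(self):
            self.ReportServiceStatus(win32service.SERVICE_STOP_PENDING)
            # This is the part srvany doesn't do: kill the child on stop.
            if self.child and self.child.poll() is None:
                self.child.terminate()
            win32event.SetEvent(self.stop_event)

    if __name__ == "__main__":
        win32serviceutil.HandleCommandLine(AppWrapperService)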
