Get the PID on the host from inside the container? - docker

Before Linux Kernel v4, I was able to obtain the host PID from inside the docker container from the process scheduling information.
For instance, if I run sleep command inside the container and my local PID is 37, then I can check the actual PID on the host via:
root#helloworld-595777cb8b-gjg4j:/# head /proc/37/sched
sleep (27062, #threads: 1)
I can verify on the host that the PID 27062 corresponds to the process within the container.
root 27062 0.0 0.0 4312 352 pts/0 S 16:29 0:00 sleep 3000
I have tried this with on RHEL7 (Kernel: Linux 3.10) with Docker version: 17.09.0-ce.
I am not able to reproduce the same result on RHEL8 (Kernel: Linux 4.18) with Docker version: 20.10. In fact, I always get the local PID from the scheduling information.
/ # head /proc/8/sched
sleep (8, #threads: 1)
I might be wrong but my assumption is that something is changed within the Kernel which forbids to obtain the host PID?
So the question is how to obtain the host PID from within the container?

The bug (or "feature" if you prefer) that allowed the host PID to be discovered from /proc/PID/sched in the container was fixed (or "broken" if you prefer) in Linux kernel 4.14 by commit 74dc3384fc79 ("sched/debug: Use task_pid_nr_ns in /proc/$pid/sched").
As a result of the change, the container cannot get the host PID of a process (at least via /proc/PID/sched) if PID namespaces are in use.

Related

Logspout container in Docker

I am trying to deploy logspout container in docker, but keep running into an issue which I have searched in this website and github but to no avail, so hoping someone knows.
I followed the following commands as per the Readme here: https://github.com/gliderlabs/logspout
(1) docker pull gliderlabs/logspout:latest (also tried with logspout:master, same results)
(2) docker run -d --name="logspout" --volume=/var/run/docker.sock:/var/run/docker.sock --publish=127.0.0.1:8000:80 gliderlabs/logspout (also tried with -v /var/run/docker.sock:/var/run/docker.sock, same results)
The container gets created but stops immediately. When I check the container logs (docker container logs logspout), I only see the following entries:
2021/12/19 06:37:12 # logspout v3.2.14 by gliderlabs
2021/12/19 06:37:12 # adapters: raw syslog tcp tls udp multiline
2021/12/19 06:37:12 # options :
2021/12/19 06:37:12 persist:/mnt/routes
2021/12/19 06:37:12 # jobs : pump routes http[health,logs,routes]:80
2021/12/19 06:37:12 # routes : none
2021/12/19 06:37:12 pump ended: Get http://unix.sock/containers/json?: dial unix /var/run/docker.sock: connect: no such file or directory
I checked docker.sock as ls -la /var/run/docker.sock results in srw-rw---- 1 root docker 0 Dec 12 09:49 /var/run/docker.sock. So docker.sock does exist, which adds to the confusion as to why the container can't find it.
I am new to linux/docker, but my understanding is that using -v or --version would automatically mount the location to the container, but does not seem to be happening here. So I am wondering if anyone has any suggestion on what needs to be done so that the logspout container can find the docker.sock.
System Info: Docker version 20.10.11, build dea9396; Raspberry Pi 4 ARM 64, OS: Debian GNU/Linux 11 (bullseye)
EDIT: added comment about -v tag in step (2) above
The container must be able to access the Docker Unix socket to mount it. This is typically a problem when namespace remapping is enabled. To disable remapping for the logspout container, pass the --userns=host flag to docker run, .. create, etc.

Pulumi does not perform graceful shutdown of kubernetes pods

I'm using pulumi to manage kubernetes deployments. One of the deployments runs an image which intercepts SIGINT and SIGTERM signals to perform a graceful shutdown like so (this example is running in my IDE):
{"level":"info","msg":"trying to activate gcloud service account","time":"2021-06-17T12:19:25-05:00"}
{"level":"info","msg":"env var not found","time":"2021-06-17T12:19:25-05:00"}
{"Namespace":"default","TaskQueue":"main-task-queue","WorkerID":"37574#Paymahns-Air#","level":"error","msg":"Started Worker","time":"2021-06-17T12:19:25-05:00"}
{"Namespace":"default","Signal":"interrupt","TaskQueue":"main-task-queue","WorkerID":"37574#Paymahns-Air#","level":"error","msg":"Worker has been stopped.","time":"2021-06-17T12:19:27-05:00"}
{"Namespace":"default","TaskQueue":"main-task-queue","WorkerID":"37574#Paymahns-Air#","level":"error","msg":"Stopped Worker","time":"2021-06-17T12:19:27-05:00"}
Notice the "Signal":"interrupt" with a message of Worker has been stopped.
I find that when I alter the source code (which alters the docker image) and run pulumi up the pod doesn't gracefully terminate based on what's described in this blog post. Here's a screenshot of logs from GCP:
The highlighted log line in the image above is the first log line emitted by the app. Note that the shutdown messages aren't logged above the highlighted line which suggests to me that the pod isn't given a chance to perform a graceful shutdown.
Why might the pod not go through the graceful shutdown mechanisms that kubernetes offers? Could this be a bug with how pulumi performs updates to deployments?
EDIT: after doing more investigation I found that this problem is happening because starting a docker container with go run /path/to/main.go actually ends up created two processes like so (after execing into the container):
root#worker-ffzpxpdm-78b9797dcd-xsfwr:/gadic# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.3 0.3 2046200 30828 ? Ssl 18:04 0:12 go run src/cmd/worker/main.go --temporal-host temporal-server.temporal.svc.cluster.local --temporal-port 7233 --grpc-port 6789 --grpc-hos
root 3782 0.0 0.5 1640772 43232 ? Sl 18:06 0:00 /tmp/go-build2661472711/b001/exe/main --temporal-host temporal-server.temporal.svc.cluster.local --temporal-port 7233 --grpc-port 6789 --
root 3808 0.1 0.0 4244 3468 pts/0 Ss 19:07 0:00 /bin/bash
root 3817 0.0 0.0 5900 2792 pts/0 R+ 19:07 0:00 ps aux
If run kill -TERM 1 then the signal isn't forwarded to the underlying binary, /tmp/go-build2661472711/b001/exe/main, which means the graceful shutdown of the application isn't executed. However, if I run kill -TERM 3782 then the graceful shutdown logic is executed.
It seems the go run spawns a subprocess and this blog post suggests the signals are only forwarded to PID 1. On top of that, it's unfortunate that go run doesn't forward signals to the subprocess it spawns.
The solution I found is to add RUN go build -o worker /path/to/main.go in my dockerfile and then to start the docker container with ./worker --arg1 --arg2 instead of go run /path/to/main.go --arg1 --arg2.
Doing it this way ensures there aren't any subprocess spawns by go and that ensures signals are handled properly within the docker container.

who and w commands in CentOS 8 Docker container

While playing with CentOs 8 on Docker container I found out, that outputs of who and w commands are always empty.
[root#9e24376316f1 ~]# who
[root#9e24376316f1 ~]# w
01:01:50 up 7:38, 0 users, load average: 0.00, 0.04, 0.00
USER TTY FROM LOGIN# IDLE JCPU PCPU WHAT
Even when I'm logged in as a different user in second terminal.
When I want to write to this user it shows
[root#9e24376316f1 ~]# write test
write: test is not logged in
Is this because of Docker? Maybe it works in some way that disallow sessions to see each other?
Or maybe that's some other issue. I would really appreciate some explanation.
These utilities obtain the information about current logins from the utmp file (/var/run/utmp). You can easily check that in ordinary circumstances (e.g. on the desktop system) this file contains something like the following string (here qazer is my login and tty7 is a TTY where my desktop environment runs):
$ cat /var/run/utmp
tty7:0qazer:0�o^�
while in the container this file is (usually) empty:
$ docker run -it centos
[root#5e91e9e1a28e /]# cat /var/run/utmp
[root#5e91e9e1a28e /]#
Why?
The utmp file is usually modified by programs which authenticate the user and start the session: login(1), sshd(8), lightdm(1). However, the container engine cannot rely on them, as they may be absent in the container file system, so "logging in" and "executing on behalf of" is implemented in the most primitive and straightforward manner, avoiding relying on anything inside the container.
When any container is started or any command is execd inside it, the container engine just spawns the new process, arranges some security settings, calls setgid(2)/setuid(2) to forcibly (without any authentication) alter the process' UID/GID and then executes the required binary (the entry point, the command, and so on) within this process.
Say, I start the CentOS container running its main process on behalf of UID 42:
docker run -it --user 42 centos
and then try to execute sleep 1000 inside it:
docker exec -it $CONTAINER_ID sleep 1000
The container engine will perform something like this:
[pid 10170] setgid(0) = 0
[pid 10170] setuid(42) = 0
...
[pid 10170] execve("/usr/bin/sleep", ["sleep", "1000"], 0xc000159740 /* 4 vars */) = 0
There will be no writes to /var/run/utmp, thus it will remain empty, and who(1)/w(1) will not find any logins inside the container.

Issue accessing vespa outside docker container

Installed Docker on Mac and trying to run Vespa on Docker following steps specified in following link
https://docs.vespa.ai/documentation/vespa-quick-start.html
I did n't had any issues till step 4. I see vespa container running after step 2 and step 3 returned 200 OK response.
But Step 5 failed to return 200 OK response. Below is the command I ran on my terminal
curl -s --head http://localhost:8080/ApplicationStatus
I keep getting
curl: (52) Empty reply from server whenever I run without -s option.
So I tried to see listening ports inside my vespa container and don't see anything for 8080 but can see for 19071(used in step 3)
➜ ~ docker exec vespa bash -c 'netstat -vatn| grep 8080'
➜ ~ docker exec vespa bash -c 'netstat -vatn| grep 19071'
tcp 0 0 0.0.0.0:19071 0.0.0.0:* LISTEN
Below doc has info related to vespa ports
https://docs.vespa.ai/documentation/reference/files-processes-and-ports.html
I'm assuming port 8080 should be active after docker run(step 2 of quick start link) and can be accessed outside container as port mapping is done.
But I don't see 8080 port active inside container in first place.
A'm I missing something. Do I need to perform any additional step than mentioned in quick start? FYI I installed Jenkins inside my docker and was able to access outside container via port mapping. But not sure why it's not working with vespa.I have been trying from quiet sometime but no progress. Please advice me if I'm missing something here.
You have too low memory for your docker container, "Minimum 6GB memory dedicated to Docker (the default is 2GB on Macs).". See https://docs.vespa.ai/documentation/vespa-quick-start.html
The deadlock detector warnings and failure to get configuration from configuration server (which is likely oom killed) indicates that you are too low on memory.
My guess is that your jdisc container had not finished initialize or did not initialize properly? Did you try to check the log?
docker exec vespa bash -c '/opt/vespa/bin/vespa-logfmt /opt/vespa/logs/vespa/vespa.log'
This should tell you if there was something wrong. When it is ready to receive requests you would see something like this:
[2018-12-10 06:30:37.854] INFO : container Container.org.eclipse.jetty.server.AbstractConnector Started SearchServer#79afa369{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
[2018-12-10 06:30:37.857] INFO : container Container.org.eclipse.jetty.server.Server Started #10280ms
[2018-12-10 06:30:37.857] INFO : container Container.com.yahoo.container.jdisc.ConfiguredApplication Switching to the latest deployed set of configurations and components. Application switch number: 0
[2018-12-10 06:30:37.859] INFO : container Container.com.yahoo.container.jdisc.ConfiguredApplication Initializing new set of configurations and components. Application switch number: 1

Map process id of application on docker container with process id on host

i am running application in docker container only and not on host machine.. Application has some process ID on docker container. That application also has process id on host . Process Id on host and process ID on container are differerent. How can I see process ID of application running on docker container from host ? How can I map the process ID of application running on container only (and not on host ) with process ID of this application on host ? I searched on internet , but could not find correct set of commands
Running a command like this should get you the PID of the container's main process (ID 1) on the host.
docker container top
$ docker container top cf1b
UID PID PPID C STIME TTY TIME CMD
root 3289 3264 0 Aug24 pts/0 00:00:00 bash
root 9989 9963 99 Aug24 ? 6-07:24:43 java -javaagent:/apps/docker-custom/newrelic/newrelic.jar -Xmx4096m -Xms4096m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:-TieredCompilation -XX:+ParallelRefProcEnabled -jar /apps/service/app.jar
So in this case PID 1 in my container maps to ID 9989 on the host.
If a process is indeed ONLY in your container, that becomes more chellenging. It You can use tools like nsenter to peek into the name spaces but if you have exec privelages to your container then that would achieve the same thing, but the docker container top command on the host combined with the ps command within the container can give you an idea of what is happening.
If you can clarify what your end goal is, we might be able to provide more clear guidance.
In order to get the mapping between container process ID and host process ID, one could run ps -ef on container and docker top <container> on the host. The CMD column present in both of these outputs will help in the decision. Below is the sample output in my environment:
container1:/$ ps -ef
UID PID PPID C STIME TTY TIME CMD
2033 10 0 0 11:08 pts/0 00:00:00 postgres -c config_file=/etc/postgresql/postgresql_primary.conf
host1# docker top warehouse_db
UID PID PPID C STIME TTY TIME CMD
bbharati 11677 11660 0 11:08 pts/0 00:00:00 postgres -c config_file=/etc/postgresql/postgresql_primary.conf
As we can see, the container process with PID=10 maps to the host process with PID=11677

Resources