Make docker-compose logs -f only show /dev/stderr

I've got a container which runs Apache2 and whose log files are symlinked to /dev/stdout and /dev/stderr. When running docker-compose logs -f, it prints stdout and stderr mixed together. Is there any way to only show one of the two?
lrwxrwxrwx 1 root root 11 Oct 10 01:22 access.log -> /dev/stdout
lrwxrwxrwx 1 root root 11 Oct 10 01:22 error.log -> /dev/stderr
lrwxrwxrwx 1 root root 11 Oct 10 01:22 other_vhosts_access.log -> /dev/stdout
Ideally, I'd like to optionally switch between the two outputs; I could imagine --only-stderr and --only-stdout flags. I'm aware of possible workarounds, but I'm interested in whether this is natively possible.

I see your point and I would need the same thing, but apparently it's not supported by docker-compose at the moment.
See this feature request: https://github.com/docker/compose/issues/6078
Although you cannot do that with docker-compose, you can still achieve something similar by following the logs of a particular Docker container.
You wrote "I'm aware of possible workarounds for this" so you might know it already, but it will hopefully help other people visiting this page :)
After executing docker-compose up, list your running containers:
docker ps
Copy the NAME of the given container and read its logs:
docker logs NAME_OF_THE_CONTAINER -f
To only read the error logs (error.log goes to stderr), discard stdout:
docker logs NAME_OF_THE_CONTAINER -f 1>/dev/null
To only read the access logs (they go to stdout), discard stderr:
docker logs NAME_OF_THE_CONTAINER -f 2>/dev/null
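If you don't want to copy the container name by hand, you can resolve it from the Compose service name. A small sketch, assuming the service is called apache in your docker-compose.yml (substitute your own service name):
# follow only the error logs (stderr) of the "apache" service
docker logs -f "$(docker-compose ps -q apache)" 1>/dev/null
# follow only the access logs (stdout) of the "apache" service
docker logs -f "$(docker-compose ps -q apache)" 2>/dev/null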

Related

docker command not available in custom pipe for BitBucket Pipeline

I'm working on a build step that handles common deployment tasks in a Docker Swarm Mode cluster. As this is a common problem for us and for others, we've shared this build step as a BitBucket pipe: https://bitbucket.org/matchory/swarm-secret-pipe/
The pipe needs to use the docker command to work with a remote Docker installation. This doesn't work, however, because the docker executable cannot be found when the pipe runs.
The following holds true for our test repository pipeline:
The docker option is set to true:
options:
  docker: true
The docker service is enabled for the build step:
main:
  - step:
      services:
        - docker: true
Docker works fine in the repository pipeline itself, but not within the pipe.
Pipeline log shows the docker path being mounted into the pipe container:
docker container run \
--volume=/opt/atlassian/pipelines/agent/build:/opt/atlassian/pipelines/agent/build \
--volume=/opt/atlassian/pipelines/agent/ssh:/opt/atlassian/pipelines/agent/ssh:ro \
--volume=/usr/local/bin/docker:/usr/local/bin/docker:ro \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe \
--workdir=$(pwd) \
--label=org.bitbucket.pipelines.system=true \
radiergummi/swarm-secret-pipe:1.3.7@sha256:baf05b25b38f2a59b044e07f4ad07065de90257a000137a0e1eb71cbe1a438e5
The pipe is pretty standard and uses a recent Alpine image; nothing special in that regard. The PATH is never overwritten. Now for the fun part: If I do ls /usr/local/bin/docker inside the pipe, it shows an empty directory:
ls /usr/local/bin
total 16K
drwxr-xr-x 1 root root 4.0K May 13 13:06 .
drwxr-xr-x 1 root root 4.0K Apr 4 16:06 ..
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 docker
ls /usr/local/bin/docker
total 8K
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 .
drwxr-xr-x 1 root root 4.0K May 13 13:06 ..
ls: /usr/local/bin/docker/docker: No such file or directory
As far as I understand pipelines and Docker, /usr/local/bin/docker should be the docker binary file. Instead, it appears to be an empty directory for some reason.
What is going on here?
I've also looked at other, official pipes. They don't do anything differently, but they seem to be able to use the docker command just fine (e.g. the Azure pipe).
After talking to BitBucket support, I solved the issue. As it turns out, if the docker context is changed, any docker command is sent straight to the remote docker binary, which (on our servers) lives at a different path than it does in BitBucket Pipelines!
Because we had changed the docker context before using the pipe, the docker setup mounted into the pipe still had the remote context set, but the pipe looked for the docker binary in a different place, hence the No such file or directory error.
TL;DR: Always restore the default docker host/context before passing control to a pipe, e.g.:
script:
  - export DEFAULT_DOCKER_HOST=$DOCKER_HOST
  - unset DOCKER_HOST
  - docker context create remote --docker "host=ssh://${DEPLOY_SSH_USER}@${DEPLOY_SSH_HOST}"
  - docker context use remote
  # do your thing
  - export DOCKER_HOST=$DEFAULT_DOCKER_HOST # <------ restore the default host
  - pipe: matchory/swarm-secret-pipe:1.3.16
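If you want to double-check what the pipe is about to inherit, a few plain shell/docker commands in the step's script (right before the pipe:) make the state visible; nothing here is pipe-specific:
  - docker context ls                            # the active context is marked with an asterisk
  - echo "DOCKER_HOST=${DOCKER_HOST:-<unset>}"   # any leftover remote host would show up here
  - which docker                                 # confirms where the mounted docker binary lives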

jmxterm: "Unable to create a system terminal" inside Docker container

I have a Docker image which contains a JRE, some Java web application, and jmxterm. The latter is used for running some ad-hoc administrative tasks. The image is used on a CentOS 7 server with Docker 1.13 (which is pretty old, but it is the latest version supplied by the distro's repository) to run the web application itself.
All works well, but after updating jmxterm from 1.0.0 to the latest version (1.0.2), I get the following warning when entering the running container and starting jmxterm:
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
After this, jmxterm does not react to arrow keys (when trying to navigate through the command history), nor does it provide autocompletion.
Some quick investigation shows that the problem can be reproduced in a clean CentOS 7 environment. This is how I bootstrap the system and the container with all the stuff I need:
$ vagrant init centos/7
$ vagrant up
$ vagrant ssh
[vagrant@localhost ~]$ sudo yum install docker
[vagrant@localhost ~]$ sudo systemctl start docker
[vagrant@localhost ~]$ sudo docker run -it --entrypoint bash openjdk:11
root@0c4c614de0ee:/# wget https://github.com/jiaqi/jmxterm/releases/download/v1.0.2/jmxterm-1.0.2-uber.jar
And this is how I enter the container and run jmxterm:
[vagrant@localhost ~]$ sudo docker exec -it 0c4c614de0ee sh
root@0c4c614de0ee:/# java -jar jmxterm-1.0.2-uber.jar
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
root@0c4c614de0ee:/# bea<TAB>
<Nothing happens, but autocompletion should have appeared>
A few observations:
the problem does not appear with the older jmxterm, no matter which image I use;
the problem arises with the new jmxterm, no matter which image I use;
the problem is not reproducible on my laptop (which has a newer kernel and Docker);
the problem is not reproducible if I use the latest Docker (from the external repo) on the CentOS 7 server instead of CentOS 7's native version 1.13.
What is happening, and why is the error reproducible only in specific environments? Is there any workaround?
TL;DR: running new jmxterm versions as java -jar jmxterm-1.0.2-uber.jar < /dev/tty is a quick, dirty, and hacky workaround for getting autocompletion and the other interactive features to work inside an interactive container session.
A quick check shows that jmxterm tries to determine the terminal device used by the process — probably to obtain the terminal capabilities later — by running the tty utility:
root@0c4c614de0ee:/# strace -f -e 'trace=execve,wait4' java -jar jmxterm-1.0.2-uber.jar
execve("/opt/java/openjdk/bin/java", ["java", "-jar", "jmxterm-1.0.2-uber.jar"], 0x7ffed3a53210 /* 36 vars */) = 0
...
[pid 432] execve("/usr/bin/tty", ["tty"], 0x7fff8ea39608 /* 36 vars */) = 0
[pid 433] wait4(432, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 432
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
The utility fails with the status of 1, which is likely the reason for the error message. Why?
root@0c4c614de0ee:/# strace -y tty
...
readlink("/proc/self/fd/0", "/dev/pts/3", 4095) = 10
stat("/dev/pts/3", 0x7ffe966f2160) = -1 ENOENT (No such file or directory)
...
write(1</dev/pts/3>, "not a tty\n", 10not a tty
) = 10
The utility says "not a tty" while we definitely have one. A quick check shows that... there really is no PTY device in the container, even though the standard streams of the shell are connected to one!
root@0c4c614de0ee:/# ls -l /proc/self/fd
total 0
lrwx------. 1 root root 64 Jun 3 21:26 0 -> /dev/pts/3
lrwx------. 1 root root 64 Jun 3 21:26 1 -> /dev/pts/3
lrwx------. 1 root root 64 Jun 3 21:26 2 -> /dev/pts/3
lr-x------. 1 root root 64 Jun 3 21:26 3 -> /proc/61/fd
root@0c4c614de0ee:/# ls -l /dev/pts
total 0
crw-rw-rw-. 1 root root 5, 2 Jun 3 21:26 ptmx
What if we check the same with the latest Docker?
root@c0ebd608f79a:/# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 Jun 3 21:45 0 -> /dev/pts/1
lrwx------ 1 root root 64 Jun 3 21:45 1 -> /dev/pts/1
lrwx------ 1 root root 64 Jun 3 21:45 2 -> /dev/pts/1
lr-x------ 1 root root 64 Jun 3 21:45 3 -> /proc/16/fd
root@c0ebd608f79a:/# ls -l /dev/pts
total 0
crw--w---- 1 root tty 136, 0 Jun 3 21:44 0
crw--w---- 1 root tty 136, 1 Jun 3 21:45 1
crw-rw-rw- 1 root root 5, 2 Jun 3 21:45 ptmx
Bingo! Now we have our PTYs where they should be, so jmxterm works well with the latest Docker.
It seems pretty weird that with older Docker the processes are connected to PTYs while there are no devices for them in /dev/pts, but tracing the Docker process explains why this happens. Older Docker allocates the PTY for the container before setting other things up (including entering the new mount namespace and mounting devpts into it, or just entering the mount namespace in the case of docker exec -it):
[vagrant@localhost ~]$ sudo strace -p $(pidof docker-containerd-current) -f -e trace='execve,mount,unshare,openat,ioctl'
...
[pid 3885] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 9
[pid 3885] ioctl(9, TIOCGPTN, [1]) = 0
[pid 3885] ioctl(9, TIOCSPTLCK, [0]) = 0
...
[pid 3898] unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET|CLONE_NEWPID) = 0
...
[pid 3899] mount("devpts", "/var/lib/docker/overlay2/3af250a9f118d637bfba5701f5b0dfc09ed154c6f9d0240ae12523bf252e350c/merged/dev/pts", "devpts", MS_NOSUID|MS_NOEXEC, "newinstance,ptmxmode=0666,mode=0"...) = 0
...
[pid 3899] execve("/bin/bash", ["bash"], 0xc4201626c0 /* 7 vars */ <unfinished ...>
Note the newinstance mount option, which ensures that the devpts mount owns its PTYs exclusively and does not share them with other mounts. This leads to an interesting effect: the PTY device for the container stays on the host and belongs to the host's devpts mount, while the containerized process still has access to it, as it obtained the already-open file descriptors at the very beginning of its life!
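You can also confirm this from the host side on the old-Docker box. A quick sketch (the container ID is the one from above; docker inspect's .State.Pid is just a convenient way to find the containerized shell's PID as seen from the host):
[vagrant@localhost ~]$ CONTAINER_PID=$(sudo docker inspect -f '{{.State.Pid}}' 0c4c614de0ee)
[vagrant@localhost ~]$ sudo readlink /proc/$CONTAINER_PID/fd/0   # resolves to some /dev/pts/N...
[vagrant@localhost ~]$ ls -l /dev/pts                            # ...and that N shows up in the host's devpts, not the container's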
The latest Docker first mounts devpts for the container and then allocates the PTY, so the PTY belongs to the container's devpts mount and is visible inside the container's filesystem:
$ sudo strace -p $(pidof containerd) -f -e trace='execve,mount,unshare,openat,ioctl'
...
[pid 14043] unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET) = 0
...
[pid 14044] mount("devpts", "/var/lib/docker/overlay2/b743cf16ab954b9a4b4005bca0aeaa019c4836c7d58d6073044e5b48446c3d62/merged/dev/pts", "devpts", MS_NOSUID|MS_NOEXEC, "newinstance,ptmxmode=0666,mode=0"...) = 0
...
[pid 14044] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 7
[pid 14044] ioctl(7, TIOCGPTN, [0]) = 0
[pid 14044] ioctl(7, TIOCSPTLCK, [0]) = 0
...
[pid 14044] execve("/bin/bash", ["/bin/bash"], 0xc000203530 /* 4 vars */ <unfinished ...>
Well, the problem is caused by inappropriate Docker behavior, but how come the older jmxterm worked well in the same environment? Let's check (note that a Java 8 image is used here, as the older jmxterm does not play well with Java 11):
root@504a7757e310:/# wget https://github.com/jiaqi/jmxterm/releases/download/v1.0.0/jmxterm-1.0.0-uber.jar
root@504a7757e310:/# strace -f -e 'trace=execve,wait4' java -jar jmxterm-1.0.0-uber.jar
execve("/usr/local/openjdk-8/bin/java", ["java", "-jar", "jmxterm-1.0.0-uber.jar"], 0x7fffdcaebdd0 /* 10 vars */) = 0
...
[pid 310] execve("/bin/sh", ["sh", "-c", "stty -a < /dev/tty"], 0x7fff1f2a1cc8 /* 10 vars */) = 0
So, the older jmxterm just uses /dev/tty instead of asking tty for the device name, and this works, as this device is present inside the container:
root@504a7757e310:/# ls -l /dev/tty
crw-rw-rw-. 1 root root 5, 0 Jun 3 21:36 /dev/tty
The big difference between these jmxterm versions is that the newer one uses a higher major version of jline, the library responsible for interacting with the terminal (akin to readline in the C world). The difference between major jline versions leads to the difference in jmxterm's behavior: current versions just rely on tty.
This observation leads us to a quick and dirty workaround which requires neither updating Docker nor patching the jline/jmxterm tandem: we can just forcibly attach jmxterm's stdin to /dev/tty and thus make jline use this device (which is now referenced by /proc/self/fd/0) instead of the /dev/pts entry (which, formally, is not always correct, but is good enough for ad-hoc use):
root@0c4c614de0ee:/# java -jar jmxterm-1.0.2-uber.jar < /dev/tty
Welcome to JMX terminal. Type "help" for available commands.
$>bea<TAB>
bean beans
Now we have the autocompletion, history and other cool things we need to have.
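If you don't want to type the redirection every time, you can hide it behind a tiny wrapper script (a sketch, assuming the jar sits in / as downloaded above; you could bake the same thing into the image instead):
root@0c4c614de0ee:/# cat > /usr/local/bin/jmxterm <<'EOF'
#!/bin/sh
exec java -jar /jmxterm-1.0.2-uber.jar "$@" < /dev/tty
EOF
root@0c4c614de0ee:/# chmod +x /usr/local/bin/jmxterm
root@0c4c614de0ee:/# jmxterm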
If you are trying to run an interactive application (one that needs a TTY) inside a Docker container or a pod in Kubernetes, then the following should work.
For docker-compose use:
image: image-name:2.0
container_name: container-name
restart: always
stdin_open: true
tty: true
For kubernetes use:
spec:
  containers:
    - name: web
      image: web:latest
      tty: true
      stdin: true
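For a plain docker run, the equivalent of stdin_open/tty (and stdin/tty in Kubernetes) is the -i/-t flag pair, using the placeholder names from the compose snippet above:
docker run -it --name container-name image-name:2.0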

Why does Loki's Docker Driver Client stop logging after some time?

I want to send the logs of my Docker containers to Grafana Loki, so I installed Loki's Docker Driver Client and started my containers with it. At first I can see logs, but after some time no more logs show up.
Installation
I installed Loki's Docker Driver Client as a Docker plugin on my Docker Engine (version 20.10.2):
$ docker plugin install grafana/loki-docker-driver:master-54d1d3b --alias loki --grant-all-permissions
I didn't use the tag latest because of the bug Unable to connect to logging plugin in Swarm.
Configuration
I started my Docker containers with Loki's Docker Driver Client as log driver:
$ docker container run \
    --log-driver=loki \
    --log-opt loki-url="$LOKI_URL" \
    --log-opt loki-retries=5 \
    --log-opt loki-batch-size=400 \
    --log-opt max-size="10m" \
    --log-opt max-file=5 \
    --detach \
    --name $CONTAINER_NAME \
    --restart unless-stopped \
    $IMAGE:$TAG
I also added the max-size and max-file options known from the json-file driver to limit disk space; see Configuring the Docker Driver.
Problem
At first I could see logs in Grafana and on the command line with docker container logs, but after some time no more logs were shown. When I tried to look into the logs on the Docker host, I saw an error:
$ docker container logs 75d4b13eb3e8
error from daemon in stream: Error grabbing logs: error getting log reader: LogDriver.ReadLogs: logger does not exist for 75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
Research
I looked into the directories of the containers (see Where is a log file with logs from a container?), but I couldn't see any log files:
$ sudo ls /var/lib/docker/containers/75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
checkpoints config.v2.json hostconfig.json hostname hosts mounts resolv.conf resolv.conf.hash
I also checked the log path (see Get an instance’s log path), but it was empty:
$ docker inspect --format='{{.LogPath}}' 75d4b13eb3e8
I found the container's logs in the plugin's directory (see Loki log driver not storing logs as files on disk, even with keep-file: true), but the log file doesn't change anymore:
$ sudo ls -la /var/lib/docker/plugins/eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288/rootfs/var/log/docker/75d4b13eb3e8203b9247ecdeb41fdf495cc8fea7dcfc4775fd8261263b1dcd32
total 912
drwxr-xr-x 2 root root 4096 Jan 22 12:59 .
drwxr-xr-x 17 root root 4096 Jan 22 15:46 ..
-rw-r----- 1 root root 923177 Jan 22 13:34 json.log
I looked into the Docker daemon's logs (see Read the logs) and found errors and a warning, logged at the same time that logging stopped:
$ sudo journalctl -u docker.service | grep eac33cc9913c
[...]
[...]level=error msg="panic: send on closed channel" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="goroutine 153 [running]:" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="main.(*loki).Log(0xc0000c5e00, 0xc0001d81c0, 0xc0000c5e80, 0x0)" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/loki.go:69 +0x2fb" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="main.consumeLog(0xc0002c0480)" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/driver.go:165 +0x4c2" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="created by main.(*driver).StartLogging" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=error msg="\t/src/loki/cmd/docker-driver/driver.go:116 +0xa75" plugin=eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288
[...]level=warning msg="Unable to connect to plugin: /run/docker/plugins/eac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288/loki.sock/LogDriver.StopLogging: Post http://%2Frun%2Fdocker%2Fplugins%2Feac33cc9913ca962a189904392e516dd495d6fd52391fb5af4a34af46b281288%2Floki.sock/LogDriver.StopLogging: EOF, retrying in 1s"
[...]
What did I do wrong?
I was experiencing the same issue.
The only differences in my configuration are that I'm trialling the latest Enterprise Edition (19.03), as it brings the dual logging capability (although this is also supported in the latest CE versions), and that I'm using the latest Loki Docker driver client, now that the GitHub issue mentioned above has been resolved.
I ended up setting the log-opts properties no-file and keep-file in docker-compose.yml:
logging:
  driver: "loki"
  options:
    loki-url: "http://${LOKI_URL}:3100/loki/api/v1/push"
    loki-batch-size: "400"
    no-file: "false"
    keep-file: "true"
    max-size: "5m"
    max-file: "3"
Since making this change I am receiving logs in Loki and can still use docker container logs and docker service logs on my Docker hosts.
no-file: "false" tells the driver to continue creating logs on disk and keep-file: "true" tells the driver to keep json logs if the container is stopped (by default files are removed).
Note: Originally I was adding these settings to /etc/docker/daemon.json on the host but would still see the error getting log reader issue; I had to switch to specifying the log driver per container / Swarm service.
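For reference, the same options translated to a plain docker container run invocation, in the style used in the question. This is a sketch based on the compose snippet above, not a tested command:
$ docker container run \
    --log-driver=loki \
    --log-opt loki-url="http://${LOKI_URL}:3100/loki/api/v1/push" \
    --log-opt loki-batch-size=400 \
    --log-opt no-file=false \
    --log-opt keep-file=true \
    --log-opt max-size="5m" \
    --log-opt max-file=3 \
    --detach \
    --name $CONTAINER_NAME \
    $IMAGE:$TAG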
Regarding this issue:
First I could see logs in Grafana and in command line with docker container logs, but after some time no more logs were shown.
In Grafana, select Query type: Range rather than Instant, and you will see all the logs for the selected period of time, if they exist in Loki.

Riak container does not start when its data volume is mounted

The following command works perfectly and the riak service starts as expected:
docker run --name=riak -d -p 8087:8087 -p 8098:8098 -v $(pwd)/schemas:/etc/riak/schema basho/riak-ts
The local schemas directory is mounted successfully and the SQL file in it is read by riak. However, if I try to mount riak's data or log directories, the riak service does not start and times out after 15 seconds:
docker run --name=riak -d -p 8087:8087 -p 8098:8098 -v $(pwd)/logs:/var/log/riak -v $(pwd)/schemas:/etc/riak/schema basho/riak-ts
Output of docker logs riak:
+ /usr/sbin/riak start
riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to wait longer, set the environment variable
WAIT_FOR_ERLANG to the number of seconds to wait.
Why does riak not start when its log or data directories are mounted to local directories?
The issue is with the ownership of the mounted log folder. The folder's group and user are expected to be riak, as follows:
root@20e489124b9a:/var/log# ls -l
drwxr-xr-x 2 riak riak 4096 Jul 19 10:00 riak
but with volumes you are getting:
root@3546d261a465:/var/log# ls -l
drwxr-xr-x 2 root root 4096 Jul 19 09:58 riak
One way to solve this is to set the directory ownership to the riak user and group on the host before starting the container. I looked up the UID/GID in /etc/passwd inside the Docker image, and they were:
riak:x:102:105:Riak user,,,:/var/lib/riak:/bin/bash
Now change the ownership of the host directories before starting the container:
sudo chown 102:105 logs/
sudo chown 102:105 data/
This should solve it. At least for now. Details here.
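Putting it together with the command from the question, and assuming the data directory should end up under riak's home /var/lib/riak (taken from the /etc/passwd entry above), something like this should work:
sudo chown 102:105 logs/ data/
docker run --name=riak -d -p 8087:8087 -p 8098:8098 \
    -v $(pwd)/schemas:/etc/riak/schema \
    -v $(pwd)/logs:/var/log/riak \
    -v $(pwd)/data:/var/lib/riak \
    basho/riak-ts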

Docker tomcat8-jre8 hacked?

I hosted a web app on Jelastic (dogado) as a Docker container (the official Docker container link). After 2 weeks I got an email:
Dear Jelastic customer, there was a process of the command
"/usr/local/tomcat/3333" which was sending massive packets to
different targets this morning. The symptoms look like the docker
instance has a security hole and was used in an DDoS attack or part of
a botnet.
The top command showed this process:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
334 root 20 0 104900 968 456 S 99.2 0.1 280:51.95 3333
root@node0815-somename:/# ls -al /proc/334
...
lrwxrwxrwx 1 root root 0 Jul 26 08:16 cwd -> /usr/local/tomcat
lrwxrwxrwx 1 root root 0 Jul 26 08:16 exe -> /usr/local/tomcat/3333
We have killed the process and changed the permissions of the file:
root@node0815-somename:/# kill 334
root@node0815-somename:/# chmod 000 /usr/local/tomcat/3333
Please investigate or use a more security-hardened docker template.
Has anyone encountered the same or a similar problem before? Is it possible that the container was hacked?
The guys who provide the container gave me a hint...
Originally I removed only the ROOT webapp:
RUN rm -rf /usr/local/tomcat/webapps/ROOT
I completely forgot that Tomcat ships with example apps, so I have to delete those security holes as well:
RUN rm -rf /usr/local/tomcat/webapps/
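For completeness, a minimal Dockerfile along those lines could look like this; myapp.war is a placeholder for your own artifact, and removing webapps/* instead of the webapps/ directory itself keeps the deployment directory in place:
FROM tomcat:8-jre8
# drop everything Tomcat ships by default: ROOT, docs, examples, manager, host-manager
RUN rm -rf /usr/local/tomcat/webapps/*
# deploy the application as ROOT
COPY myapp.war /usr/local/tomcat/webapps/ROOT.war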
Do you use any protection tools? We can't rule out the scenario where your container gets hacked if there is no protection in place.
We strongly recommend using iptables and Fail2Ban to protect your containers from attacks (you have root access to your Docker container via SSH, so you are able to install and configure these packages), especially if you have attached a public IP to your containers.
Also, you have access to all container logs (via the Dashboard or SSH), so you are able to analyze the logs and take preventive actions.
Have a nice day.
