I hosted a web-app on jelastic (dogado) as a docker container (the official docker container link). After 2 weeks I get an email:
Dear Jelastic customer, there was a process of the command
"/usr/local/tomcat/3333" which was sending massive packets to
different targets this morning. The symptoms look like the docker
instance has a security hole and was used in an DDoS attack or part of
a botnet.
The top command showed this process:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
334 root 20 0 104900 968 456 S 99.2 0.1 280:51.95 3333
root#node0815-somename:/# ls -al /proc/334
...
lrwxrwxrwx 1 root root 0 Jul 26 08:16 cwd -> /usr/local/tomcat
lrwxrwxrwx 1 root root 0 Jul 26 08:16 exe -> /usr/local/tomcat/3333
We have killed the process and changed the permissions of the file:
root#node0815-somename:/# kill 334
root#node0815-somename:/# chmod 000 /usr/local/tomcat/3333
Please investigate or use a more security hardenend docker template.
Has anyone encountered the same or a similar problem before? Is it possible that the container was hacked?
The guys which provide the container gave me a hint...
I remove only the ROOT war.
RUN rm -rf /usr/local/tomcat/webapps/ROOT
I forget completely that the tomcat delivers example apps. So I have to delete the security holes:
RUN rm -rf /usr/local/tomcat/webapps/
Do you use any protection tools? We don't except the scenario when your container can be hacked if there are no protection.
We strongly recommend using IPtables and Fail2Ban to protect your containers from hack attacks (You have root access to your Docker container using SSH, so you are able to install and configure these packages), especially if you have attached public IP to your containers.
Also, you have access to all container logs (via Dashboard or SSH), so you are able to analyze logs and take preventive actions.
Have a nice day.
Related
I'm working on a build step that handles common deployment tasks in a Docker Swarm Mode cluster. As this is a common problem for us and for others, we've shared this build step as a BitBucket pipe: https://bitbucket.org/matchory/swarm-secret-pipe/
The pipe needs to use the docker command to work with a remote Docker installation. This doesn't work, however, because the docker executable cannot be found when the pipe runs.
The following holds true for our test repository pipeline:
The docker option is set to true:
options:
docker: true
The docker service is enabled for the build step:
main:
- step:
services:
- docker: true
Docker works fine in the repository pipeline itself, but not within the pipe.
Pipeline log shows the docker path being mounted into the pipe container:
docker container run \
--volume=/opt/atlassian/pipelines/agent/build:/opt/atlassian/pipelines/agent/build \
--volume=/opt/atlassian/pipelines/agent/ssh:/opt/atlassian/pipelines/agent/ssh:ro \
--volume=/usr/local/bin/docker:/usr/local/bin/docker:ro \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes \
--volume=/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe:/opt/atlassian/pipelines/agent/build/.bitbucket/pipelines/generated/pipeline/pipes/matchory/swarm-secret-pipe \
--workdir=$(pwd) \
--label=org.bitbucket.pipelines.system=true \
radiergummi/swarm-secret-pipe:1.3.7#sha256:baf05b25b38f2a59b044e07f4ad07065de90257a000137a0e1eb71cbe1a438e5
The pipe is pretty standard and uses a recent Alpine image; nothing special in that regard. The PATH is never overwritten. Now for the fun part: If I do ls /usr/local/bin/docker inside the pipe, it shows an empty directory:
ls /usr/local/bin
total 16K
drwxr-xr-x 1 root root 4.0K May 13 13:06 .
drwxr-xr-x 1 root root 4.0K Apr 4 16:06 ..
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 docker
ls /usr/local/bin/docker
total 8K
drwxr-xr-x 2 root root 4.0K Apr 29 09:30 .
drwxr-xr-x 1 root root 4.0K May 13 13:06 ..
ls: /usr/local/bin/docker/docker: No such file or directory
As far as I understand pipelines and Docker, /usr/local/bin/docker should be the docker binary file. Instead, it appears to be an empty directory for some reason.
What is going on here?
I've also looked at other, official, pipes. They don't do anything differently, but seem to be using the docker command just fine (eg. the Azure pipe).
After talking to BitBucket support, I solved the issue. As it turns out, if the docker context is changed, any docker command is sent straight to the remote docker binary, which (on our services) lives at a different path than in BitBucket Pipelines!
As we changed the docker context before using the pipe, and the docker instance mounted into the pipe still has the remote context set, but the pipe searches for the docker binary at another place, the No such file or directory error is thrown.
TL;DR: Always restore the default docker host/context before passing control to a pipe, e.g.:
script:
- export DEFAULT_DOCKER_HOST=$DOCKER_HOST
- unset DOCKER_HOST
- docker context create remote --docker "host=ssh://${DEPLOY_SSH_USER}#${DEPLOY_SSH_HOST}"
- docker context use remote
# do your thing
- export DOCKER_HOST=$DEFAULT_DOCKER_HOST # <------ restore the default host
- pipe: matchory/swarm-secret-pipe:1.3.16
I have a Docker image which contains JRE, some Java web application and jmxterm. The latter is used for running some ad-hoc administrative tasks. The image is used on the CentOS 7 server with Docker 1.13 (which is pretty old but is the latest version which is supplied via the distro's repository) to run the web application itself.
All works well, but after updating jmxterm from 1.0.0 to the latest version (1.0.2), I get the following warning when entering the running container and starting jmxterm:
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
After this, jmxterm does not react to arrow keys (when trying to navigate through the command history), nor does it provide autocompletion.
Some quick investigation shows that the problem may be reproduced in the clean environment with CentOS 7. Say, this is how I could bootstrap the system and the container with all stuff I need:
$ vagrant init centos/7
$ vagrant up
$ vagrant ssh
[vagrant#localhost ~]$ sudo yum install docker
[vagrant#localhost ~]$ sudo systemctl start docker
[vagrant#localhost ~]$ sudo docker run -it --entrypoint bash openjdk:11
root#0c4c614de0ee:/# wget https://github.com/jiaqi/jmxterm/releases/download/v1.0.2/jmxterm-1.0.2-uber.jar
And this is how I enter the container and run jmxterm:
[vagrant#localhost ~]$ sudo docker exec -it 0c4c614de0ee sh
root#0c4c614de0ee:/# java -jar jmxterm-1.0.2-uber.jar
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
root#0c4c614de0ee:/# bea<TAB>
<Nothing happens, but autocompletion had to appear>
Few observations:
the problem does not appear with older jmxterm no matter which image do I use;
the problem arises with new jmxterm no matter which image do I use;
the problem is not reproducible on my laptop (which has newer kernel and Docker);
the problem is not reproducible if I use latest Docker (from the external repo) on the CentOS 7 server instead of CentOS 7's native version 1.13.
What happens, and why the error is reproducible only in specific environments? Is there any workaround for this?
TLDR: running new jmxterm versions as java -jar jmxterm-1.0.2-uber.jar < /dev/tty is a quick, dirty and hacky workaround for having the autocompletion and other stuff work inside the interactive container session.
A quick check shows that jmxterm tries to determine the terminal device used by the process — probably to obtain the terminal capabilities later — by running the tty utility:
root#0c4c614de0ee:/# strace -f -e 'trace=execve,wait4' java -jar jmxterm-1.0.2-uber.jar
execve("/opt/java/openjdk/bin/java", ["java", "-jar", "jmxterm-1.0.2-uber.jar"], 0x7ffed3a53210 /* 36 vars */) = 0
...
[pid 432] execve("/usr/bin/tty", ["tty"], 0x7fff8ea39608 /* 36 vars */) = 0
[pid 433] wait4(432, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 432
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
The utility fails with the status of 1, which is likely the reason for the error message. Why?
root#0c4c614de0ee:/# strace -y tty
...
readlink("/proc/self/fd/0", "/dev/pts/3", 4095) = 10
stat("/dev/pts/3", 0x7ffe966f2160) = -1 ENOENT (No such file or directory)
...
write(1</dev/pts/3>, "not a tty\n", 10not a tty
) = 10
The utility says "not a tty" while we definitely have one. A quick check shows that... There is really no PTY device in the container though the standard streams of the shell are connected to one!
root#0c4c614de0ee:/# ls -l /proc/self/fd
total 0
lrwx------. 1 root root 64 Jun 3 21:26 0 -> /dev/pts/3
lrwx------. 1 root root 64 Jun 3 21:26 1 -> /dev/pts/3
lrwx------. 1 root root 64 Jun 3 21:26 2 -> /dev/pts/3
lr-x------. 1 root root 64 Jun 3 21:26 3 -> /proc/61/fd
root#0c4c614de0ee:/# ls -l /dev/pts
total 0
crw-rw-rw-. 1 root root 5, 2 Jun 3 21:26 ptmx
What if we check the same with latest Docker?
root#c0ebd608f79a:/# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 Jun 3 21:45 0 -> /dev/pts/1
lrwx------ 1 root root 64 Jun 3 21:45 1 -> /dev/pts/1
lrwx------ 1 root root 64 Jun 3 21:45 2 -> /dev/pts/1
lr-x------ 1 root root 64 Jun 3 21:45 3 -> /proc/16/fd
root#c0ebd608f79a:/# ls -l /dev/pts
total 0
crw--w---- 1 root tty 136, 0 Jun 3 21:44 0
crw--w---- 1 root tty 136, 1 Jun 3 21:45 1
crw-rw-rw- 1 root root 5, 2 Jun 3 21:45 ptmx
Bingo! Now we have our PTYs where they should be, so jmxterm works well with latest Docker.
It seems pretty weird that with older Docker the processes are connected to some PTYs while there are no devices for them in /dev/pts, but tracing the Docker process explains why this happens. Older Docker allocates the PTY for the container before setting other things up (including entering the new mount namespace and mounting devpts into it or just entering the mount namespace in case of docker exec -it):
[vagrant#localhost ~]$ sudo strace -p $(pidof docker-containerd-current) -f -e trace='execve,mount,unshare,openat,ioctl'
...
[pid 3885] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 9
[pid 3885] ioctl(9, TIOCGPTN, [1]) = 0
[pid 3885] ioctl(9, TIOCSPTLCK, [0]) = 0
...
[pid 3898] unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET|CLONE_NEWPID) = 0
...
[pid 3899] mount("devpts", "/var/lib/docker/overlay2/3af250a9f118d637bfba5701f5b0dfc09ed154c6f9d0240ae12523bf252e350c/merged/dev/pts", "devpts", MS_NOSUID|MS_NOEXEC, "newinstance,ptmxmode=0666,mode=0"...) = 0
...
[pid 3899] execve("/bin/bash", ["bash"], 0xc4201626c0 /* 7 vars */ <unfinished ...>
Note the newinstance mount option which ensures that the devpts mount owns its PTYs exclusively and does not share them with other mounts. This leads to the interesting effect: the PTY device for the container stays on the host and belongs to the host's devpts mount, while the containerized process still has access to it, as it obtained the already-open file descriptors just in the beginning of its life!
The latest Docker first mounts devpts for the container and then allocates the PTY, so the PTY belongs to container's devpts mount and is visible inside the container's filesystem:
$ sudo strace -p $(pidof containerd) -f -e trace='execve,mount,unshare,openat,ioctl'
...
[pid 14043] unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET) = 0
...
[pid 14044] mount("devpts", "/var/lib/docker/overlay2/b743cf16ab954b9a4b4005bca0aeaa019c4836c7d58d6073044e5b48446c3d62/merged/dev/pts", "devpts",
MS_NOSUID|MS_NOEXEC, "newinstance,ptmxmode=0666,mode=0"...) = 0
...
[pid 14044] openat(AT_FDCWD, "/dev/ptmx", O_RDWR|O_NOCTTY|O_CLOEXEC) = 7
[pid 14044] ioctl(7, TIOCGPTN, [0]) = 0
[pid 14044] ioctl(7, TIOCSPTLCK, [0]) = 0
...
[pid 14044] execve("/bin/bash", ["/bin/bash"], 0xc000203530 /* 4 vars */ <unfinished ...>
Well, the problem is caused by inappropriate Docker behavior, but how comes that older jmxterm worked well in the same environment? Let's check (note, that Java 8 image is used here, as older jmxterm does not play well with Java 11):
root#504a7757e310:/# wget https://github.com/jiaqi/jmxterm/releases/download/v1.0.0/jmxterm-1.0.0-uber.jar
root#504a7757e310:/# strace -f -e 'trace=execve,wait4' java -jar jmxterm-1.0.0-uber.jar
execve("/usr/local/openjdk-8/bin/java", ["java", "-jar", "jmxterm-1.0.0-uber.jar"], 0x7fffdcaebdd0 /* 10 vars */) = 0
...
[pid 310] execve("/bin/sh", ["sh", "-c", "stty -a < /dev/tty"], 0x7fff1f2a1cc8 /* 10 vars */) = 0
So, older jmxterm just uses /dev/tty instead of asking tty for the device name, and this works, as this device is present inside the container:
root#504a7757e310:/# ls -l /dev/tty
crw-rw-rw-. 1 root root 5, 0 Jun 3 21:36 /dev/tty
The huge difference between these versions of jmxterm is that newer tool version uses higher major version of jline, which is the library responsible for interaction with the terminal (akin to the readline in the C world). The difference between major jline versions leads to the difference in jmxterm's behavior, and current versions just rely on tty.
This observation leads us to the quick and dirty workaround which does not require neither updating Docker nor patching the jline/jmxterm tandem: we may just attach jmxterm's stdin to /dev/tty forcibly and thus make jline use this device (which is now referenced by /proc/self/fd/0) instead of the /dev/pts entry (which, formally, is not always correct, but is still enough for ad-hoc use):
root#0c4c614de0ee:/# java -jar jmxterm-1.0.2-uber.jar < /dev/tty
Welcome to JMX terminal. Type "help" for available commands.
$>bea<TAB>
bean beans
Now we have the autocompletion, history and other cool things we need to have.
If you are trying to run an interactive application (that needs tty) inside a docker container or a pod in kubernetes, then the following should work.
For docker-compose use:
image: image-name:2.0
container_name: container-name
restart: always
stdin_open: true
tty: true
For kubernetes use:
spec:
containers:
- name: web
image: web:latest
tty: true
stdin: true
I have a swarm of few services, and in the compose file there are few volumes created with the vieux/sshfs driver, which are used by the services.
The containers spawned by the services execute a single script, after which the container finishes/exits and a new one is created on its place - basically the services are spawning new containers all the time.
All works smooth, except that there is exceptionally large amount of zombie processes accumulated in the host machine. The zombies go away when the docker daemon is re stared - it must be docker who makes the zombies.
"ps aux | grep 'Z'" is
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3040 0.0 0.0 0 0 ? Zs 14:13 0:00 [ssh] <defunct>
root 3042 0.0 0.0 0 0 ? Zs 14:13 0:00 [sshfs] <defunct>
root 3052 0.0 0.0 0 0 ? Zs 14:13 0:00 [ssh] <defunct>
root 3055 0.0 0.0 0 0 ? Zs 14:13 0:00 [sshfs] <defunct>
...
As far as I understand, the volumes are created only once, and the services are just using the local copy of the volume - not creating a new ssh connection and reading straight form the remote machine - and this should not be creating another ssh connection process that will become zombie.
I have trouble finding info on the topic, which makes me think that I am doing something fundamentally wrong. Please help.
I have just resolved the issue by enabling Tini for the services in the docker-compose file as follows -
init: true
Few zombies (<10) pop up, but then they get killed in a second - no accumulation.
I still don't get what the zombies had to do with the ssh. If anyone can answer that I would be grateful.
PS: I have checked few days after I have enabled Tini. There are some accumulated zombies (~300, before there ware ~2000). Problem seems mitigated, but it is still there.
I recently read an article about it.
It just said than if you declare your volume directly into the docker-compose.yml in may lead to some issue with zombie sshfs process.
To avoid this, I try to declare the volume as external and run manually the creation of the docker volume.
docker volume create -d vieux/sshfs -o sshcmd="$USER_SSH#$IP:/mysupervolume" -o IdentityFile="/root/.ssh/$SSH_KEY" nameofmyvolume
Hope it will help someone.
Kr,
I am trying to set up a multi-container service with docker-compose.
Some of the containers need to be restarted from a fresh container (eg. the file system should be like in the image) when they restart.
How can I achieve this?
I've found the restart: always option I can put on my service in the docker-compose.yml file, but that doesn't give me a fresh file system as it uses the same container.
I've also seen the --force-recreate option of docker-compose up, but that doesn't apply as that only recreates the containers when the command is runned.
EDIT:
This is probably not a docker-compose issue, but more of a general docker question: What is the best way to make sure a container is in a fresh state when it is restarted? With fresh state, I mean a state identical to that of a brand new container from the same image. Restarted is the docker command docker restart or docker stop and docker start.
In docker, immutability typically refers to the image layers. They are immutable, and any changes are pushed to a container specific copy-on-write layer of the filesystem. That container specific layer will last for the lifetime of the container. So to have those files not-persist, you have two options:
Recreate the container instead of just restart it
Don't write the changes to the container filesystem, and don't write them to any persistent volumes.
You cannot do #1 with a restart policy by it's very definition. A restart policy gives you the same container filesystem, with the application restarted. But if you use docker's swarm mode, it will recreate containers when they exit, so if you can migrate to swarm mode, you can achieve this result.
Option #2 looks more difficult than it is. If you aren't writing to the container filesystem, or to a volume, then where? The answer is a tmpfs volume that is only stored in memory and is lost as soon as the container exits. In compose, this is a tmpfs: /data/dir/to/not/persist line. Here's an example on the docker command line.
First, let's create a container with a tmpfs mounted at /data, add some content, and exit the container:
$ docker run -it --tmpfs /data --name no-persist busybox /bin/sh
/ # ls -al /data
total 4
drwxrwxrwt 2 root root 40 Apr 7 21:50 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
/ # echo 'do not save' >>/data/tmp-data.txt
/ # cat /data/tmp-data.txt
do not save
/ # ls -al /data
total 8
drwxrwxrwt 2 root root 60 Apr 7 21:51 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
-rw-r--r-- 1 root root 12 Apr 7 21:51 tmp-data.txt
/ # exit
Easy enough, it behaves as a normal container, let's restart it and check the directory contents:
$ docker restart no-persist
no-persist
$ docker attach no-persist
/ # ls -al /data
total 4
drwxr-xr-x 2 root root 40 Apr 7 21:51 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
/ # echo 'still do not save' >>/data/do-not-save.txt
/ # ls -al /data
total 8
drwxr-xr-x 2 root root 60 Apr 7 21:52 .
drwxr-xr-x 1 root root 4096 Apr 7 21:50 ..
-rw-r--r-- 1 root root 18 Apr 7 21:52 do-not-save.txt
/ # exit
As you can see, the directory returned empty, and we can add data as needed back to the directory. The only downside of this is the directory will be empty even if you have content in the image at that location. I've tried combinations of named volumes, or using the mount syntax and passing the volume-nocopy option to 0, without luck. So if you need the directory to be initialized, you'll need to do that as part of your container entrypoint/cmd by copying from another location.
In order to not persist any changes to your containers it is enough that you don't map any directory from host to the container.
In this way, every time the containers runs (with docker run or docker-compose up ), it starts with a fresh file system.
docker-compose down also removes the containers, deleting any data.
The best solution I have found so far, is for the container itself to make sure to clean up when starting or stopping. I solve this by cleaning up when starting.
I copy my app files to /srv/template with the docker COPY directive in my Dockerfile, and have something like this in my ENTRYPOINT script:
rm -rf /srv/server/
cp -r /srv/template /srv/server
cd /srv/server
I've got a container which runs Apache2 and has some log files that print their input into /dev/stdout and /dev/stderr. When running docker-compose logs -f it blindly prints both stdout and stderr mixed. Is there any way to only show one of the two?
lrwxrwxrwx 1 root root 11 Oct 10 01:22 access.log -> /dev/stdout
lrwxrwxrwx 1 root root 11 Oct 10 01:22 error.log -> /dev/stderr
lrwxrwxrwx 1 root root 11 Oct 10 01:22 other_vhosts_access.log -> /dev/stdout
Ideally, I'd like to optionally switch between outputs. I could imagine --only-stderr and --only-stdout flags. I'm aware of possible workarounds for this, but I'm interested if this is natively possible.
I see your point and I would need the same thing, but apparently it's not supported by docker-compose at the moment.
See this feature request: https://github.com/docker/compose/issues/6078
Although you cannot do that with docker-compose, you can still achieve something similar by following the logs of a particular Docker container.
You wrote "I'm aware of possible workarounds for this" so you might know it already, but it will hopefully help other people visiting this page :)
After executing docker-compose up, list your running containers:
docker ps
Copy the NAME of the given container and read its logs:
docker logs NAME_OF_THE_CONTAINER -f
To only read the error logs:
docker logs NAME_OF_THE_CONTAINER -f 1>/dev/null
To only read the access logs:
docker logs NAME_OF_THE_CONTAINER -f 2>/dev/null