I would like to run a docker or LXC container but restrict access to the container itself. Specifically, is it possible to prevent even the root (root on the host) from accessing the container?
By access, I mean SSHing into the container, running tcpdump on the traffic to/from the container, profiling the application, etc.
Thanks!
It is not possible to effectively restrict a privileged user on the host from inspecting or accessing the container. If that were possible, it's hard to imagine how the root user could even start the container in the first place.
In general, it's useful to remember that containerization is used to confine processes to a restricted space: it's used to keep a process from getting out to the host, not to prevent other processes from getting in.
Related
There are many sites which preach that we should not run docker containers as root users.
For example:
By default, Docker gives root permission to the processes within your
containers, which means they have full administrative access to your
container and host environments.
I do not understand how a container can access the host environment and cause security vulnerabilities if I do not do any volume/port mapping.
Can someone give an example of such security risk?
By default, Docker tries to enforce strong isolation between containers and the host. If you need a root user inside the container (you can't always avoid it), Docker offers a security mechanism (user-namespace remapping) which maps the container's root user to an unprivileged high UID on the host, which is nothing if someone manages to escape.
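A minimal sketch of that mechanism (user-namespace remapping; the exact UID range comes from /etc/subuid and is host-specific):
# enable user-namespace remapping for the Docker daemon
# /etc/docker/daemon.json:
#   { "userns-remap": "default" }
sudo systemctl restart docker
# root (UID 0) inside the container ...
docker run --rm alpine id -u
# ... is mapped to an unprivileged high UID range on the host
grep dockremap /etc/subuid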
Leaving root inside the container gives an "attacker" the option to install whatever additional packages they wish and to explore the other containers/resources the container has access to (for instance, they can try to run nmap from inside the container) - after all, they are root inside the container.
As an example of such a security risk, there was a well-known one called Dirty COW (CVE-2016-5195).
Hope I pushed you in the right direction for further research.
Docker and other containerization technologies build on the namespaces feature of the Linux kernel to confine and isolate processes, limiting their view of available resources such as the filesystem, other processes, or the network.
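You can see namespaces in action even without Docker - a quick sketch using util-linux's unshare (needs root):
# start a shell in new PID and mount namespaces, with /proc remounted inside
sudo unshare --pid --fork --mount-proc sh -c 'ps aux'
# only the new shell and ps are visible; the host's other processes are hidden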
By default, Docker applies very strong isolation to processes, limiting their access to a minimum. This leads many people to believe that running any untrusted process/Docker image within a container is perfectly safe - it is not!
Because despite the strong isolation, such processes still run directly on the kernel of the host system. And when they run as root within the container (and no user namespace is used), they are actually root on the host, too. The only thing then preventing a malicious container from completely taking over the host system is that it has no direct/straightforward access to critical resources. If, however, it somehow gets hold of a handle to a resource outside of its namespaces, that handle may be used to break out of the isolation.
It is easy for an incautious user to unintentionally give a container such a handle to outside resources. For example:
# DON'T DO THIS
# the user intends to share the mapping of user names to ids with the container
docker run -v /etc/passwd:/etc/passwd untrusted/image
With the process within the container running as root, the container would not only be able to read all users in /etc/passwd but also to edit that file, since it effectively has root access on the host as well. In this case - should you really need something like this - best practice would be to bind-mount /etc/passwd read-only.
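For example, the same run with the mount marked read-only:
# share /etc/passwd with the container, but read-only
docker run -v /etc/passwd:/etc/passwd:ro untrusted/image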
Another example: some applications require extended access to system capabilities, which requires loosening some of the strict isolation Docker applies by default, e.g.:
# DON'T DO THIS
docker run --privileged --cap-add=ALL untrusted/image
This would remove most of the limitations and most notably would allow the container to load new kernel modules, i.e., inject and execute arbitrary code into the kernel, which is obviously bad.
But besides giving access to external resources by mistake, there is also the possibility of bugs in the Linux kernel that could be exploited to escape the isolation of the container - and these are much easier to exploit when the process already has root privileges inside the container.
Therefore, best practice with Docker is to limit the access of containers as much as possible:
drop/do not add any capabilities/privileges that are not required
bind-mount only files/directories that are really required
use a non-root user when possible
Although starting containers as root is the default in Docker, most applications do not actually require being started as root. So why give root privileges when they are not really required?
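A sketch of a more locked-down invocation along these lines (image name and UID are placeholders):
# drop all capabilities, run as an unprivileged user, keep the root filesystem read-only
docker run --cap-drop=ALL --user 1000:1000 --read-only some/image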
I've just recently started exploring rootless Docker and there are some things that I don't fully grasp. Below is my understanding of the concept and some questions. Please correct me if something's wrong.
With rootless Docker, the daemon and containers run as non-root users, which mitigates potential vulnerabilities: if someone were to gain access to a container running as root under the regular daemon and then break out of the container into the host system, they would have root on the host as well. If someone breaks out of a rootless container, they can only act as the non-root user running the container.
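For illustration, here is roughly how that mapping can be observed (a sketch assuming a rootless Docker setup and the alpine image):
# inside the rootless container the process claims to be root ...
docker run -d --rm --name naptest alpine sleep 300
docker exec naptest id -u
# ... but on the host it is owned by the invoking user (or one of their subuids)
ps -o user,pid,cmd -C sleep
docker rm -f naptest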
I want to run multiple containers that don't need any network between them, so I'm thinking it would probably make sense not to run them all as the same user - is that correct? Also, in such a scenario, would I need to install and run the daemon separately for each user?
What about the user inside the container? I've tried out pihole/pihole, and the default user inside the container is root (id: 0). Is that now OK, since the container is otherwise rootless? I've tried setting it to a different user with user: "1005:1005" (in docker-compose.yml), but then the container is not able to start, as it's missing permissions to do some tasks.
I would like to read the host's ifconfig output while the Docker container is running, so I can parse it, get the OpenVPN interface (tap0) IP address, and process it within my application.
Unfortunately, propagating this value via the environment is not an option in my case, because the IP address could change while the container is running, and I don't want to restart my application container each time just to see the new value.
My current working solution is a cron job on the host which writes the IP into a file on a shared volume, and the container reads it from there - but I am looking for a better solution, as it feels like a workaround. There was also a plan to create a new container with network: host which would see the host's interfaces - it works, but it also looks like a workaround, as it involves many steps and probably has security issues.
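Roughly, the current cron-based setup looks like this (the shared directory path is just an example):
# host crontab: write tap0's address to the shared directory every minute
* * * * * ip -4 -o addr show tap0 | awk '{print $4}' > /srv/shared/tap0.addr

# the container bind-mounts that directory read-only and polls the file
docker run -v /srv/shared:/shared:ro my-app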
So my question: is there any valid and cleaner way to achieve my goal - reading the host's ifconfig inside a Docker container in real time?
A specific design goal of Docker is that containers can’t directly access the host’s network configuration. The workarounds you’ve identified are pretty much the only ways to do this.
If you’re trying to modify the host’s network configuration in some way (you’re trying to actually run a VPN, for example) you’re probably better off running it outside of Docker. You’ll still need root permission either way, but you won’t need to disable a bunch of standard restrictions to do what you need.
If you’re trying to provide some address where the service can be reached, using configuration like an environment variable is required. Even if you could access the host’s configuration, this might not be the address you need: consider a cloud environment where you’re running on a cloud instance behind a load balancer, and external clients need the load balancer; that’s not something you can directly know given only the host’s network configuration.
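For example, passing the externally reachable address as plain configuration (variable name and value are placeholders):
# tell the application which address clients should use
docker run -e PUBLIC_ADDR=vpn.example.com my-app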
If a user has access to a root Ubuntu terminal in a Docker container, can they do anything to destroy the hard drive or SSD it is on?
Link: gitlab.com/pwnsquad/term
Docker by default gives root access to containers.
A container can damage your host system only if you bypass Docker's container isolation mechanisms; otherwise the only damage can be done to the container itself, not the host.
The simplest ways to break the isolation mechanisms are the following:
using Docker's bind mounts, where you map a host path into a container path. In this case that path may be completely wiped from inside the container. Avoid bind mounts (use volumes) or mount in ro (read-only) mode to prevent that - see the sketch after this list
using networking, especially network=host, which gives the container access to all of the host's active network services and thus potentially makes the host vulnerable to attacks on them. In this case the container can connect to services which are bound locally (to 127.0.0.1/localhost) and thus not expecting remote connections, and which as a result could be less protected.
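A sketch of the safer variants mentioned in the first point (paths and image name are placeholders):
# read-only bind mount: the container can read /srv/data but cannot wipe it
docker run -v /srv/data:/data:ro some/image

# named volume: managed by Docker, no raw host path is handed to the container
docker run -v mydata:/data some/image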
Is the docker host completely protected from anything the docker instance can do?
As long as you don't expose a volume to the docker instance, are there other ways it can actually connect into the host and 'hack' it?
For example, say I allow customers to run code inside of a server that I run. I want to understand the potential security implications of allowing a customer to run arbitrary code inside of a Docker instance.
All processes inside Docker are isolated from the host machine. They cannot, by default, see or interfere with other processes. This is guaranteed by the process namespaces used by Docker.
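You can observe that isolation directly (a small sketch using the alpine image):
# inside its own PID namespace the container sees only its own processes
docker run --rm alpine ps aux
# only when the host's PID namespace is explicitly shared do host processes appear
docker run --rm --pid=host alpine ps aux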
As long as you don't mount crucial things (for example, docker.sock) into the container, there are no security risks associated with running a container, even when allowing code execution inside the container.
For a list of security features in docker, check Docker security.
The kernel is shared between the host and the Docker container. This is less separation than, let's say, a VM has.
Running any untrusted container is NOT SECURE. There are kernel vulnerabilities which can be abused, and ways to break out of containers.
That's why it's a best practice, for example, to either not use the root user in containers or to have a separate user namespace for containers.
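A minimal sketch of the non-root-user practice in a Dockerfile (user name is a placeholder):
# Dockerfile: create and switch to an unprivileged user
FROM alpine
RUN adduser -D appuser
USER appuser
CMD ["sleep", "infinity"]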