changing transparent_hugepage in docker - docker

I have a container which requires /sys/kernel/mm/transparent_hugepage/enabled set to "never". The host has this set to a different value, which I cannot change due to other applications running on the host. Is it impossible to run a container with a different transparent_hugepage value from the host? Both the host and the container are using CentOS 6.6.

I imagine you're referring to Redis, but unfortunately it is impossible. Even if you give the container access to change kernel parameters (via --privileged or --cap-add), the change would apply to that container, the host, and all other containers.
The kernel is shared between the host and all containers, so they all need to agree on the same kernel parameters. The only exceptions to this rule are the settings that the kernel isolates per namespace:
PID: Process IDs
UTS: Hostnames
Network: Networking params like TCP backlog, etc
Mount: Mounted filesystems
User: UID/GIDs
IPC: Inter-process communication is isolated
(Control groups, or cgroups, limit resource usage rather than isolating parameters; more on cgroups: http://en.wikipedia.org/wiki/Cgroups)
Your specific request is related to a kernel memory-management parameter that applies globally.
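A process can at least check which value is currently active. The kernel lists all options on one line and wraps the active one in square brackets, for example "always madvise [never]". Below is a minimal sketch of reading it; the helper names are made up for this example. Because the kernel is shared, running it on the host and inside a container should report the same value.

```python
from pathlib import Path

# Path taken from the question; some distributions may expose it elsewhere.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_setting(text):
    """Return the bracketed (active) option from a line such as
    'always madvise [never]', or None if no option is bracketed."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_setting(path=THP_ENABLED):
    """Read the active setting, or return None if the file is unreadable here."""
    try:
        return parse_thp_setting(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("transparent_hugepage/enabled:", current_thp_setting())
```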

Related

Is a Docker container completely isolated from everything outside the container?

I wonder whether things like shell script execution can affect anything outside the container. For example, let's say I want to save a file on the host machine from inside the container, without using Docker volumes or mounts. Can that be done? Or let's say I want to kill a process running on the host machine with shell commands from inside the container. Can that be done?
No, that is not possible by default. A Docker container's environment is isolated from the host, and the only way to change files on the host is to mount a path from the host into the container. Killing a host process is possible only if the container shares the host's PID namespace, which is not common practice.
Docker takes advantage of Linux namespaces to provide the isolated workspace we call a container. When a container is deployed, Docker creates a set of namespaces for that specific container, isolating it from all the other running containers. The various namespaces created for a container include:
PID Namespace: Whenever a program starts inside a container, it is assigned a process ID within the container's PID namespace, which differs from its process ID on the host system. Each container has its own PID namespace for its processes.
MNT Namespace: Each container is provided its own namespace for mount directory paths.
NET Namespace: Each container is provided its own view of the network stack, avoiding privileged access to the sockets or interfaces of another container.
UTS Namespace: This provides isolation between the system identifiers: the hostname and the NIS domain name.
IPC Namespace: The inter-process communication (IPC) namespace creates a grouping where containers can only see and communicate with other processes in the same IPC namespace.
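On Linux, the namespaces a process belongs to are visible as symbolic links under /proc/<pid>/ns, with targets such as "pid:[4026531836]". Two processes are in the same namespace when the number in brackets matches. A minimal sketch of listing them for the current process follows; the function names are illustrative, and the code returns an empty result where /proc is not available.

```python
import os
import re

NS_DIR = "/proc/self/ns"
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def own_namespaces(ns_dir=NS_DIR):
    """Map each entry under ns_dir (pid, mnt, net, uts, ipc, ...) to its inode."""
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, or /proc is not mounted
    for name in names:
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not recognise
        result[name] = inode
    return result


if __name__ == "__main__":
    for name, inode in sorted(own_namespaces().items()):
        print(f"{name}: {inode}")
```

Running this on the host and again inside a container started with default options should print different numbers for pid, mnt, net, uts and ipc.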
Containers allow developers to package large or small amounts of code and their dependencies together into an isolated package. This model then allows multiple isolated containers to run on the same host, resulting in better usage of hardware resources, and decreasing the impact of misbehaving applications on each other and their host system.
I hope it may help you.
You cannot modify host files without mounting them inside the container, though you can mount the entire root filesystem inside (e.g. -v /:/host). As for killing host processes, it is possible if you run the container in host PID mode: docker run --pid=host ....

What is the practical use case for --net=host argument in docker?

When running a container we can specify --net=host to enable host networking, which makes the container share the host’s networking namespace. But what is the practical use case for this?
I've found it useful in two situations:
You have a server process that listens on a very large number of ports, or does not use a consistent port, so the docker run -p option is impractical or impossible.
You have a process that needs to examine or manage the host network environment. (Its wire protocol somehow depends on sending the host's IP address; it's a service-discovery system and you want it to advertise both Docker and non-Docker services running on the host.)
Host networking disables one of Docker's important isolation systems. If you run a container with host networking, you can't use features like port remapping and you can't accept inbound connections from other containers using the container name as a host name. In both of these cases, running the server outside Docker might be more appropriate.
In SO questions I frequently see --net host suggested as a hack to get around programs that have 127.0.0.1 hard-coded as the location of a database or another external resource. This isn't usually necessary, and adding a layer of configuration (environment variables work well) and the standard Docker networking setup is better practice.
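As a rough illustration of that layer of configuration, the sketch below reads the database location from environment variables and falls back to the old hard-coded address. DB_HOST, DB_PORT and the default port are hypothetical names and values chosen for this example, not something the question prescribes.

```python
import os


def database_address(env=os.environ):
    """Return (host, port) for the database.

    DB_HOST and DB_PORT are hypothetical variable names for this sketch.
    The defaults keep the previous hard-coded behaviour when nothing is set.
    """
    host = env.get("DB_HOST", "127.0.0.1")
    port = int(env.get("DB_PORT", "5432"))
    return host, port


if __name__ == "__main__":
    print("connecting to %s:%d" % database_address())
```

The container can then be started with something like docker run -e DB_HOST=db ... on a normal Docker network, instead of --net host.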

If a user had access to the terminal in a docker container can they do anything to destroy the hard drive its on?

If a user had access to a root Ubuntu terminal in a docker container, can they do anything to destroy the hard drive or SSD it is on?
Link: gitlab.com/pwnsquad/term
Docker by default gives root access to containers.
A container can damage your host system only if you bypass Docker's container isolation mechanisms; otherwise the damage is limited to the container itself, not the host.
The simplest ways to break the isolation mechanisms are the following:
Using Docker's bind mounts, which map a host path to a path in the container. In this case everything under that path can be wiped from inside the container. To avoid that, use volumes instead of bind mounts, or mount in ro mode.
Using networking, especially network=host, which gives the container access to all of the host's active network services and so may leave the host vulnerable to attacks on them. In this mode the container can connect to services that are bound locally (to 127.0.0.1/localhost), which do not expect remote connections and may therefore be less protected.
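For the ro option mentioned above, a process inside the container can check whether a given path sits on a read-only mount. This is a small sketch using os.statvfs; the function name and the /data path in the usage note are made up for illustration.

```python
import os


def is_read_only_mount(path):
    """Return True if the filesystem mounted at path is read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)


if __name__ == "__main__":
    print("/ is read-only:", is_read_only_mount("/"))
```

With a bind mount declared as -v /some/host/path:/data:ro, calling is_read_only_mount("/data") inside the container would be expected to return True.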

Docker Namespace in kernel level

How can PIDs 1, 17, etc. inside Docker containers be distinguished from PIDs 1, 17, etc. on the host, and what kernel changes happen when a new process is created inside a Docker container?
How can a process inside a container be seen from the host?
How can PIDs 1, 17, etc. inside Docker containers be distinguished from PIDs 1, 17, etc. on the host?
By default, those PIDs are in different namespaces.
Since issue 10080 and --pid host, a container's PIDs can stay in the host's PID namespace.
There is also issue 10163, "Allow shared PID namespaces", requesting a --pid=container:id option.
Note and update May 2016: issue 10163 and --pid=container:id are now resolved by PR 22481 for Docker 1.12, which allows joining another container's PID namespace.
What kernel changes happen when a new process is created inside a Docker container?
No changes at the kernel level, only the use of:
cgroups or control groups. A key to running applications in isolation is to have them only use the resources you want.
union file systems to provide the building blocks for containers
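To tell a container's PID 17 apart from the host's PID 17, kernels from 4.1 onward add an NSpid line to /proc/<pid>/status. Read from the host, it lists the process's PID in each nested PID namespace, starting with the host's view. The sketch below parses that line; the sample values in the comments are invented, and on older kernels (such as the one shipped with CentOS 6) the line is absent and the result is an empty list.

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line, outermost namespace first.

    Returns an empty list when the line is missing (kernels before 4.1).
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def nspid_of(pid):
    """Read NSpid for a given PID, or [] if the status file is unreadable."""
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_nspid(handle.read())
    except OSError:
        return []


if __name__ == "__main__":
    # Invented example: host PID 12345 that is PID 17 inside its container
    # would appear as "NSpid:  12345  17" and parse to [12345, 17].
    print(nspid_of("self"))
```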

Why does Docker claim its container being portable?

Docker claims that containers built with it are more portable than pure LXC containers. I think I understand that there are some conventions and automation of the LXC configuration like for hostname and network configuration. But is there more than that?
If you take an LXC container (and its configuration file), it will be portable only if you run it on a host with the same network configuration, i.e. a bridge with the same name, the same network range, the same router address, and the same DNS server.
Moreover, if the container exposes services, you will have to set up network rules (or something similar) to reach those services. With Docker, there is a coherent syntax to express "yup, I want to expose port 8000 of that container" and then "hey, which public port was allocated for that container's port 8000?"
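In current Docker CLI terms, those two steps correspond roughly to docker run -p 8000 ... and docker port <container> 8000, where the second command prints a binding such as 0.0.0.0:49153. A small sketch of turning that line into a host and port follows; the function name and the sample port are invented for the example.

```python
def parse_port_binding(line):
    """Split a binding such as '0.0.0.0:49153' into (host, port).

    Splits on the last colon so that IPv6-style hosts keep their colons.
    """
    host, _, port = line.strip().rpartition(":")
    return host, int(port)


if __name__ == "__main__":
    print(parse_port_binding("0.0.0.0:49153"))
```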
Docker also adapts the LXC configuration file depending on the local capabilities (for instance, an upcoming patch will enable apparmor containment iff it's available).
