runc and ctr commands do not show docker images and containers - docker

I have multiple Docker images and containers running on a VM, but commands like "runc list" don't show any of them.
How can I make runc/containerd aware of my existing docker images?

The runtime (runc) uses a so-called runtime root directory to store and obtain information about containers. Under this root directory, runc creates one sub-directory per container, each containing a state.json file with the container's state description.
The default location for runtime root directory is either /run/runc (for non-rootless containers) or $XDG_RUNTIME_DIR/runc (for rootless containers) - the latter also usually points to somewhere under /run (e.g. /run/user/$UID/runc).
When a container engine invokes runc, it may override the default runtime root directory and specify a custom one (the --root option of runc). Docker makes use of this; e.g. on my box, it specifies /run/docker/runtime-runc/moby as the runtime root.
So, to make runc list see your Docker containers, you have to point it at Docker's runtime root directory via the --root option. Also, given that Docker containers are not rootless by default, you will need the appropriate privileges to access the runtime root (e.g. via sudo).
Here is how this works:
$ docker run -d alpine sleep 1000
4acd4af5ba8da324b7a902618aeb3fd0b8fce39db5285546e1f80169f157fc69
$ sudo runc --root /run/docker/runtime-runc/moby/ list
ID PID STATUS BUNDLE CREATED OWNER
4acd4af5ba8da324b7a902618aeb3fd0b8fce39db5285546e1f80169f157fc69 18372 running /run/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/4acd4af5ba8da324b7a902618aeb3fd0b8fce39db5285546e1f80169f157fc69 2019-07-12T17:33:23.401746168Z root
As for images, you cannot make runc see them, as it has no notion of an image at all; instead, it operates on bundles. Creating the bundle (e.g. from an image) is the responsibility of the caller (in your case, containerd).
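If you want to see what containerd itself knows about, ctr is the right tool, but note that Docker keeps its containers in a dedicated containerd namespace. A minimal sketch, assuming a current Docker install that talks to the system containerd on its default socket (older setups may need --address to point at Docker's own containerd socket):
$ sudo ctr namespaces list
$ sudo ctr --namespace moby containers list
Even then, ctr will typically only show the containers: with the classic storage drivers, Docker manages images itself rather than through containerd's image store, so ctr --namespace moby images list usually comes up empty.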

Related

Install Docker binary on a server without root access

I have a server from a provider without any root access. It is not possible to write to /etc/ or /var/lib/docker. Docker is not installed. My idea is to install and run the docker binary in a directory. I will install docker with a shell script, and the script should be able to be started from any directory without root access.
When the script starts ./docker/dockerd --data-root=docker/var/lib/docker I get this error message:
WARN[2018-11-17T18:26:19.492488618+01:00] Error while setting daemon root propagation, this is not generally critical but may cause some functionality to not work or fallback to less desirable behavior dir=docker/var/lib/docker error="error getting daemon root's parent mount: open /proc/self/mountinfo: permission denied"
Error starting daemon: open /var/run/docker.pid: permission denied
dockerd has many parameters. Here is the one for the pidfile: -p, --pidfile[=/var/run/docker.pid]
http://manpages.ubuntu.com/manpages/cosmic/man8/dockerd.8.html
Thank you for the help
#!/bin/bash
DOCKER_RELEASE='docker-18.06.1-ce.tgz'
wget https://download.docker.com/linux/static/stable/x86_64/$DOCKER_RELEASE
tar xzvf $DOCKER_RELEASE
rm $DOCKER_RELEASE
./docker/dockerd --data-root=docker/var/lib/docker
As announced (Feb. 4th, 2019) by Akihiro Suda:
Finally, it is now possible to run upstream dockerd as an unprivileged user!
See moby/moby PR 38050:
Allow running dockerd in an unprivileged user namespace (rootless mode).
Close #37375 "Proposal: allow running dockerd as an unprivileged user (aka rootless mode)", opened in June 2018
No SETUID/SETCAP binary is required, except newuidmap and newgidmap.
How I did it:
By using user_namespaces(7), mount_namespaces(7), network_namespaces(7), and slirp4netns.
Warning, there are restrictions:
Only vfs graphdriver is supported.
However, on Ubuntu and a few distros, overlay2 and overlay are also supported.
Starting with Linux 4.18, we will also be able to implement FUSE snapshotters.
(See Graphdriver plugins, where Docker graph driver plugins enable admins to use an external/out-of-process graph driver for use with Docker engine.
This is an alternative to using the built-in storage drivers, such as aufs/overlay/devicemapper/btrfs.)
Cgroups (including docker top) and AppArmor are disabled at the moment.
In the future, cgroups will be optionally available when delegation permission is configured on the host.
Checkpoint is not supported at the moment.
Running rootless dockerd in rootless/rootful dockerd is also possible, but not fully tested.
The documentation is now in docs/rootless.md:
Note the following requirements:
newuidmap and newgidmap need to be installed on the host.
These commands are provided by the uidmap package on most distros.
/etc/subuid and /etc/subgid should contain >= 65536 sub-IDs.
e.g. penguin:231072:65536.
That is:
$ id -u
1001
$ whoami
penguin
$ grep ^$(whoami): /etc/subuid
penguin:231072:65536
$ grep ^$(whoami): /etc/subgid
penguin:231072:65536
Either slirp4netns (v0.3+) or VPNKit needs to be installed.
slirp4netns is preferred for the best performance.
You will have to modify your script:
You need to run dockerd-rootless.sh instead of dockerd.
$ dockerd-rootless.sh --experimental
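Once the rootless daemon is running, the Docker client has to be pointed at the user-level socket. A minimal sketch, assuming the socket location documented in docs/rootless.md ($XDG_RUNTIME_DIR/docker.sock):
$ dockerd-rootless.sh --experimental &
$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
$ docker run -d nginx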
Update May 2019: Tõnis Tiigi explores this rootless option in "Experimenting with Rootless Docker":
User namespaces map a range of user ID-s so that the root user in the inner namespace maps to an unprivileged range in the parent namespace.
A fresh process in a user namespace also picks up a full set of process capabilities.
The rootless mode works in a similar way, except we create a user namespace first and start the daemon already in the remapped namespace. The daemon and the containers will both use the same user namespace that is different from the host one.
Although Linux allows creating user namespaces without extended privileges, these namespaces only map a single user and therefore do not work with many existing containers.
To overcome that, rootless mode has a dependency on the uidmap package that can do the remapping of users for us. The binaries in uidmap package use setuid bit (or file capabilities) and therefore always run as root internally.
To make the launching of different namespaces and integration with uidmap simpler Akihiro created a project called rootlesskit.
Rootlesskit also takes care of setting up networking for rootless containers. By default rootless docker uses networking based on moby/vpnkit project that is also used for networking in the Docker Desktop products.
Alternatively, users can install slirp4netns and use that instead.
Again:
Caveats:
Some examples of things that do not work on rootless mode are cgroups resource controls, apparmor security profiles, checkpoint/restore, overlay networks etc.
Exposing ports from containers currently requires a manual socat helper process.
Only Ubuntu-based distros support overlay filesystems in rootless mode.
For other systems, rootless mode uses the vfs storage driver, which is suboptimal on many filesystems and not recommended for production workloads.
I appreciate the OP has moved on, but here's a short answer for others. If the files /etc/subuid and /etc/subgid do not fulfill the prerequisite settings (see code below), then you will be forced to involve someone with root access.
# the rightmost values returned by these commands should be >= 65536;
# if not, you're out of luck if the admin doesn't like you.
grep "^$(whoami):" /etc/subgid
grep "^$(whoami):" /etc/subuid
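If the prerequisites are in place, note that recent Docker releases also ship a convenience install script for rootless mode; a sketch, assuming the script is still hosted at this URL:
$ curl -fsSL https://get.docker.com/rootless | sh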

What Is The Difference Between Bind Mounts And Volumes While Handling Persistent Data In Docker Containers?

I want to know why we have two different options to do the same thing. What are the differences between the two?
We basically have 3 types of volumes or mounts for persistent data:
Bind mounts
Named volumes
Volumes in dockerfiles
Bind mounts are basically just binding a certain directory or file from the host inside the container (docker run -v /hostdir:/containerdir IMAGE_NAME)
Named volumes are volumes which you create manually with docker volume create VOLUME_NAME. They are created in /var/lib/docker/volumes and can be referenced by their name alone. Let's say you create a volume called "mysql_data"; you can then reference it like this: docker run -v mysql_data:/containerdir IMAGE_NAME.
And then there are volumes in Dockerfiles, which are created by the VOLUME instruction. These volumes are also created under /var/lib/docker/volumes but don't have a fixed name; their "name" is just a hash. The volume gets created when running the container and is handy for saving persistent data, whether or not you start the container with -v. The developer gets to say where the important data is and what should be persistent.
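To illustrate the VOLUME instruction, a minimal sketch (image contents and paths are made up):
FROM alpine
VOLUME /containerdir
CMD ["sh", "-c", "date >> /containerdir/started.log && sleep 1000"]
Running a container from this image without any -v flag still creates an anonymous, hash-named volume under /var/lib/docker/volumes, which you can see with docker volume ls.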
What should I use?
What you want to use comes mostly down to either preference or your management. If you want to keep everything in the "docker area" (/var/lib/docker) you can use volumes. If you want to keep your own directory-structure, you can use binds.
Docker recommends the use of volumes over binds, as volumes are created and managed by Docker and binds have a lot more potential for failure (also due to layer-8 problems).
If you use binds and want to transfer your containers/applications to another host, you have to rebuild your directory structure, whereas volumes are more uniform on every host.
Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts are dependent on the directory structure of the host machine, volumes are completely managed by Docker. Volumes are often a better choice than persisting data in a container's writable layer, because a volume does not increase the size of the containers using it, and the volume's contents exist outside the lifecycle of a given container.
Differences between -v and --mount behavior
Because the -v and --volume flags have been a part of Docker for a long time, their behavior cannot be changed. This means that there is one behavior that is different between -v and --mount.
If you use -v or --volume to bind-mount a file or directory that does not yet exist on the Docker host, -v creates the endpoint for you. It is always created as a directory.
If you use --mount to bind-mount a file or directory that does not yet exist on the Docker host, Docker does not automatically create it for you, but generates an error instead.
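You can see this difference with a host path that does not exist (paths here are hypothetical):
$ docker run --rm -v /tmp/not-there:/data alpine ls /data
# works: /tmp/not-there is created on the host as an empty directory
$ docker run --rm --mount type=bind,source=/tmp/not-there-2,target=/data alpine ls /data
# fails with an error along the lines of "bind source path does not exist"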
Docker for Windows shared folders limitation
Docker for Windows does make much of the VM transparent to the Windows host, but it is still a virtual machine. For instance, when using -v with a Mongo container, MongoDB needs filesystem features that the shared folder does not provide. There is also this issue about volume mounts being extremely slow.
Bind mounts are like a superset of Volumes (named or unnamed).
Bind mounts are created by binding an existing folder in the host system (the host system being a native Linux machine, or a VM on Windows or Mac) to a path in the container.
The VOLUME command results in a new folder being created on the host system under /var/lib/docker.
Volumes are recommended because they are managed by docker engine (prune, rm, etc).
A good use case for bind mount is linking development folders to a path in the container. Any change in host folder will be reflected in the container.
Another use case for bind mounts is keeping application logs, which are not as crucial as, say, a database.
Command syntax is almost the same for both cases:
bind mount:
note that the host path should start with '/'. Use $(pwd) for convenience.
docker container run -v /host-path:/container-path image-name
unnamed volume:
creates a folder in the host with an arbitrary name
docker container run -v /container-path image-name
named volume:
should not start with '/', as that is reserved for bind mounts.
'volume-name' is not a full path here; the command will cause a folder with the path /var/lib/docker/volumes/volume-name to be created on the host.
docker container run -v volume-name:/container-path image-name
A named volume can also be created before a container is run (docker volume create), but this is almost never needed.
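For the rare case where you do want to pre-create one, a quick sketch using the mysql_data name from above:
$ docker volume create mysql_data
$ docker volume inspect --format '{{ .Mountpoint }}' mysql_data
/var/lib/docker/volumes/mysql_data/_data
$ docker run --rm -v mysql_data:/containerdir alpine touch /containerdir/hello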
As developers, we always need to compare the options provided by tools or technologies. For volumes vs. bind mounts, I would suggest listing what kind of application you are trying to containerize.
Following are the parameters that I would consider before choosing Volume over Bind Mounts:
Docker provides various CLI commands to manage volumes easily from outside containers.
For backup & restore, volumes are far easier than bind mounts, which depend on the underlying host OS (see the sketch after this list).
Volumes are platform-agnostic, so they can work on Linux as well as on Windows containers.
With bind mounts, you have two things to take care of: your host machine's directory structure as well as Docker.
Migration of volumes is easier, not only between local machines but between cloud machines as well.
Volumes can easily be shared among multiple containers.
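On the backup point above, a common pattern (also shown in the Docker docs) is a throwaway container that mounts both the volume and a host directory and tars the contents; the volume name here is hypothetical:
# back up the volume into a tarball in the current directory
$ docker run --rm -v mysql_data:/volume -v "$(pwd)":/backup alpine \
    tar czf /backup/mysql_data.tgz -C /volume .
# restore it into a (possibly new) volume
$ docker run --rm -v mysql_data:/volume -v "$(pwd)":/backup alpine \
    tar xzf /backup/mysql_data.tgz -C /volume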

Change file permissions in mounted folder inside docker container on Windows Host

Disclaimer/Edit 2
Some years later, for everyone reading this question: if you are on Windows and want to use Docker with Linux containers, I highly recommend not using Docker for Windows at all and instead starting the entire Docker environment inside a VM altogether. This ext3/NTFS issue will break your neck on so many different levels that installing docker-machine might not even be worth the effort.
Edit:
I am using docker-machine which starts a boot2docker instance inside a Virtualbox VM with a shared folder on /c/Users from which you can mount volumes into your containers. The permissions of said volumes are the ones the question is about. The VMs are stored under /c/Users/tom/.docker/
I chose the docker-machine Virtualbox workflow over Hyper-V because I need VBox in my daily workflow, and running Hyper-V and Virtualbox together on one system is not possible due to incompatibilities between the two hypervisors.
Original question
I am currently trying to set up PHPMyAdmin in a container on Windows, but I can't change the permissions of the config.inc.php file.
I found: Cannot call chown inside Docker container (Docker for Windows) and thought this might be somewhat related but it appears to apply only to MongoDB.
This is my docker-compose.yml
version: "3"
services:
pma:
image: (secrect company registry)/phpmyadmin
ports:
- 9090:80
volumes:
- /c/Users/tom/projects/myproject/data/var/www/public/config.inc.php:/var/www/public/config.inc.php
Now, when I docker exec -it [container] bash and cd into the mounted directory, I try to run chmod on config.inc.php, but for some reason it fails silently.
root@22a4bag43245:/# ls -la config.inc.php
-rw------- 1 root root 0 Aug 11 15:11 config.inc.php
root@22a4bag43245:/# chmod 655 config.inc.php
root@22a4bag43245:/# ls -la config.inc.php
-rw------- 1 root root 0 Aug 11 15:11 config.inc.php
Considering the linked answer, I thought I could just move the volume out of my Userhome but then vbox doesn't mount the folder at all.
How do I change the file permissions of /var/www/public/config.inc.php persistently?
I had the same problem of not being able to change ownership even after using chown. As I researched, it was because of NTFS volumes being mounted inside an ext filesystem, so I used another approach.
Volumes internal to Docker are free from these problems, so you can mount your file on an internal Docker volume and then create a hard link to that file inside your local folder wherever you want:
sudo ln $(docker volume inspect --format '{{ .Mountpoint }}' <project_name>_<volume_name>)/<my_file> <absolute_path_of_destination>
This way you can have your files in the desired place, inside Docker, without any permission issues, and you will be able to modify the contents of the file just as with a normal volume mount, thanks to the hard link.
Here is a working implementation of this process which mounts and links a directory. In case you want to know the details, see the 'possible fix' section in the issue.
EDIT
Steps to implement this approach:
1. Mount the concerned file in an internal Docker volume (also known as a named volume).
2. Before making the hard link, make sure the volume and the concerned file are present. To ensure this, you should have run your container at least once before; or, if you want to automate the file creation, you can include a docker run which creates the required file and exits:
docker run --rm -itd \
  -v "<project_name>_<volume_name>:/absolute/path" \
  <image> bash -c "touch /absolute/path/<my_file>"
This docker run will create the volume and the required file. Here, <project_name> is, by default, the name of the folder in which the project is present, and <volume_name> is the same as the one we want to use in our original container. <image> can be the same one already used in your original containers.
3. Create a hard link in your OS to the actual file location on your system. You can find the file location by running docker volume inspect --format '{{ .Mountpoint }}' <project_name>_<volume_name> and appending /<my_file> to the result. Linux users can use ln in the terminal and Windows users can use mklink in the command prompt.
In step 3 we have not used /absolute/path, since the <volume_name> refers to that location already and we just need to refer to the file.
Try one of the following:
If you can rebuild the image (secret company registry)/docker-stretchimal-apache2-php7-pma, then inside the Dockerfile, add the following:
USER root
RUN chmod 655 config.inc.php
Then you can rebuild the image and push it to the registry, and what you were doing should work. This should be your preferred solution, as you don't want to be manually changing the permissions every time you start a new container.
Alternatively, try to exec using the user root explicitly:
docker exec -it -u root [container] bash

Docker root access to host system

When I run a container as a normal user I can map and modify directories owned by root on my host filesystem. This seems to be a big security hole. For example I can do the following:
$ docker run -it --rm -v /bin:/tmp/a debian
root@14da9657acc7:/# cd /tmp/a
root@f2547c755c14:/tmp/a# mv df df.orig
root@f2547c755c14:/tmp/a# cp ls df
root@f2547c755c14:/tmp/a# exit
Now my host filesystem will execute the ls command when df is typed (mostly harmless example). I cannot believe that this is the desired behavior, but it is happening in my system (debian stretch). The docker command has normal permissions (755, not setuid).
What am I missing?
Maybe it is good to clarify a bit more. I am not at the moment interested in what the container itself does or can do, nor am I concerned with the root access inside the container.
Rather I notice that anyone on my system that can run a docker container can use it to gain root access to my host system and read/write as root whatever they want: effectively giving all users root access. That is obviously not what I want. How to prevent this?
There are many Docker security features available to help with Docker security issues. The specific one that will help you is User Namespaces.
Basically you need to enable User Namespaces on the host machine with the Docker daemon stopped beforehand:
dockerd --userns-remap=default &
Note that this will forbid the container from running in privileged mode (a good thing from a security standpoint), and that the command above restarts the Docker daemon, which should have been stopped beforehand. When you run a Docker container, you can additionally restrict it to the current non-privileged user:
docker run -it --rm -v /bin:/tmp/a --user UID:GID debian
Regardless, try to enter the Docker container afterwards with your default command of
docker run -it --rm -v /bin:/tmp/a debian
If you attempt to manipulate the host filesystem that was mapped into a Docker volume (in this case /bin) where files and directories are owned by root, then you will receive a Permission denied error. This proves that User Namespaces provide the security functionality you are looking for.
I recommend going through the Docker lab on this security feature at https://github.com/docker/labs/tree/master/security/userns. I have done all of the labs, opened issues and PRs there to help ensure their integrity, and can vouch for them.
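If you want the remapping to survive reboots, the flag can instead go into the daemon configuration file; a sketch, assuming the standard Linux location /etc/docker/daemon.json:
{
  "userns-remap": "default"
}
Then restart the daemon (e.g. sudo systemctl restart docker).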
Access to run docker commands on a host is access to root on that host. This is the design of the tool since the functionality to mount filesystems and isolate an application requires root capabilities on linux. The security vulnerability here is any sysadmin that grants access to users to run docker commands that they wouldn't otherwise trust with root access on that host. Adding users to the docker group should therefore be done with care.
I still see Docker as a security improvement when used correctly, since applications run inside a container are restricted from what they can do to the host. The ability to cause damage is given with explicit options to running the container, like mounting the root filesystem as a rw volume, direct access to devices, or adding capabilities to root that permit escaping the namespace. Barring the explicit creation of those security holes, an application run inside a container has much less access than it would if it was run outside of the container.
If you still want to try locking down users with access to docker, there are some additional security features. User namespacing is one of those which prevents root inside of the container from having root access on the host. There's also interlock which allows you to limit the commands available per user.
You're missing that containers run as uid 0 internally by default, so this is expected. If you want to restrict permissions further inside the container, build the image with a USER statement in the Dockerfile. This switches to the named user at runtime, instead of running as root.
Note that the uid of this user is not necessarily predictable, as it is assigned inside the image you build, and it won't necessarily map to anything on the outside system. However, the point is, it won't be root.
Refer to Dockerfile reference for more information.
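A minimal sketch of such a Dockerfile (the user name is arbitrary):
FROM debian
# uid is assigned at build time and need not map to anything on the host
RUN useradd --system appuser
USER appuser
CMD ["id", "-u"]
A container run from this image prints a non-zero uid instead of 0.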

How to list Docker mounted volumes from within the container

I want to list all container directories that are mounted volumes.
I.e. to be able to get similar info I get from
docker inspect --format "{{ .Volumes }}" <self>
But from within the container and without having docker installed in there.
I tried cat /proc/mounts, but I couldn't find a proper filter for it.
(EDIT - this may no longer work on Mac) If your Docker host is OS X, the mounted volumes will be type osxfs (or fuse.osxfs). You can run a
mount | grep osxfs | awk '{print $3}'
and get a list of all the mounted volumes.
If your Docker host is Linux (at least Ubuntu 14+, maybe others), the volumes appear to all be on /dev, but not on a device that is in your container's /dev filesystem. The volumes will be alongside /etc/resolv.conf, /etc/hostname, and /etc/hosts. If you do a mount | grep ^/dev to start, then filter out any of the files in ls /dev/*, then filter out the three files listed above, you should be left with host volumes.
mount | grep ^/dev/ | grep -v /etc | awk '{print $3}'
My guess is the specifics may vary from Linux to Linux. Not ideal, but at least possible to figure out.
Assuming you want to check what volumes are mounted from inside a Linux-based container, you can look up entries beginning with "/dev" in /etc/mtab, removing the /etc entries:
$ grep "^/dev" /etc/mtab | grep -v " /etc/"
/dev/nvme0n1p1 /var/www/site1 ext4 rw,relatime,discard,data=ordered 0 0
/dev/nvme0n1p1 /var/www/site2 ext4 rw,relatime,discard,data=ordered 0 0
As you can read in many of the comments you got, a container is initially nothing but a restricted, reserved set of resources that is totally cut off from the rest of your machine. It is not aware of being a Docker container, and inside the container everything behaves as if it were a separate machine. Sort of like the Matrix, I guess ;)
You get access to the host machine's kernel and its resources, but again restricted to a filtered-out set. This is done with the awesome cgroups and namespaces functionality of the Linux kernel.
Now the good news: there are multiple ways for you to provide this information to your container, but it is something that you are going to have to provide and build yourself.
The easiest and most powerful way is to mount the Unix socket located on your host at /var/run/docker.sock into your container at the same location. That way, when you use the Docker client inside your container, you are talking directly to the Docker engine on your host.
However, with great power comes great responsibility. This is a nice setup, but it is not very secure. Once someone manages to get into your container, they have root access to your host system this way.
A better way would be to provide a list of mounts through environment settings, or to rely on some made-up conventions so that the mounts are predictable.
(Did you know there is a parameter for mounting that gives a mount an alias to use inside your container?)
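For completeness, a sketch of the socket approach described above, using the official docker CLI image; inside a container, the default hostname is the container's own ID, which docker inspect accepts:
$ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp:/data docker:cli \
    sh -c 'docker inspect --format "{{ json .Mounts }}" "$(hostname)"'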
The docker exec command is probably what you are looking for.
This will let you run arbitrary commands inside an existing container.
For example:
docker exec -it <mycontainer> bash
Of course, whatever command you are running must exist in the container filesystem.
docker cp: copy files/folders between a container and the local filesystem
docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH
docker cp [OPTIONS] SRC_PATH CONTAINER:DEST_PATH
To copy a full folder:
docker cp ./src/build b081dbbb679b:/usr/share/nginx/html
Note: this will copy the build directory itself into the container's …/nginx/html/ directory. To copy only the files present in the folder:
docker cp ./src/build/. b081dbbb679b:/usr/share/nginx/html
Note: this will copy the contents of the build directory into the container's …/nginx/html/ directory.
Docker Storage options:
Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine.
A given volume can be mounted into multiple containers simultaneously. When no running container is using a volume, the volume is still available to Docker and is not removed automatically. You can remove unused volumes using docker volume prune.
When you mount a volume, it may be named or anonymous. Anonymous volumes are not given an explicit name when they are first mounted into a container, so Docker gives them a random name that is guaranteed to be unique within a given Docker host. Besides the name, named and anonymous volumes behave in the same ways.
Volumes also support the use of volume drivers, which allow you to store your data on remote hosts or cloud providers, among other possibilities.
Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.
Available since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full path on the host machine. The file or directory does not need to exist on the Docker host already. It is created on demand if it does not yet exist. Bind mounts are very performant, but they rely on the host machine’s filesystem having a specific directory structure available. If you are developing new Docker applications, consider using named volumes instead. You can’t use Docker CLI commands to directly manage bind mounts.
One side effect of using bind mounts, for better or for worse, is that you can change the host filesystem via processes running in a container, including creating, modifying, or deleting important system files or directories. This is a powerful ability which can have security implications, including impacting non-Docker processes on the host system.
tmpfs mounts are stored in the host system’s memory only, and are never written to the host system’s filesystem.
A tmpfs mount is not persisted on disk, either on the Docker host or within a container. It can be used by a container during the lifetime of the container, to store non-persistent state or sensitive information. For instance, internally, swarm services use tmpfs mounts to mount secrets into a service’s containers.
If you need to specify volume driver options, you must use --mount.
-v or --volume: consists of three fields, separated by colon characters (:). The fields must be in the correct order, and the meaning of each field is not immediately obvious.
- In the case of named volumes, the first field is the name of the volume, and is unique on a given host machine. For anonymous volumes, the first field is omitted.
- The second field is the path where the file or directory will be mounted in the container.
- The third field is optional, and is a comma-separated list of options, such as ro.
--mount: consists of multiple key-value pairs, separated by commas, each a <key>=<value> tuple. The --mount syntax is more verbose than -v or --volume, but the order of the keys is not significant, and the value of the flag is easier to understand.
- The type of the mount, which can be bind, volume, or tmpfs. This topic discusses volumes, so the type will always be volume.
- The source of the mount. For named volumes, this is the name of the volume. For anonymous volumes, this field is omitted. May be specified as source or src.
- The destination takes as its value the path where the file or directory will be mounted in the container. May be specified as destination, dst, or target.
- The readonly option, if present, causes the volume to be mounted into the container as read-only.
- The volume-opt option, which can be specified more than once, takes a key-value pair consisting of the option name and its value.
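As a quick comparison of the two syntaxes, the following two commands are equivalent (the volume name is hypothetical):
$ docker run --rm -v mydata:/app/data:ro alpine ls /app/data
$ docker run --rm --mount type=volume,source=mydata,target=/app/data,readonly alpine ls /app/data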
