How to get host's udev events from a Docker container?

In a Docker container, I am looking for a way to receive the host's udev events.
Running udevadm monitor inside the container only shows the host's kernel events, not the udev events.
The question is whether there is a way to detect the host's udev events, or to forward them into containers.

This is how I made my container receive host events by udev:
docker run --net=host -v /run/udev/control:/run/udev/control
--net=host lets the container and the host communicate over PF_NETLINK sockets, which udevadm monitor uses to receive kernel events.
/run/udev/control is a file that udevadm monitor uses to check whether udevd is running; if it doesn't exist, udev-event monitoring is disabled.
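Putting it together, a complete invocation might look like this (my-udev-image is a placeholder for any image that has the udev package, and therefore udevadm, installed):
docker run --rm -it --net=host -v /run/udev/control:/run/udev/control my-udev-image udevadm monitor --udev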

As the answer above points out, --net=host works, but host networking is generally discouraged for several well-known reasons.
The issue exists because udev relies on NETLINK to communicate between kernel and user space. Without host networking, the host and the container sit in different network namespaces; starting udev inside the container means the monitor and udevd share the container's network namespace, so host networking is no longer needed (a quick verification is sketched after the service startup output below).
When we ran into this issue, we did the following:
# apt-get install udev
# vim /etc/init.d/udev to comment out a few checks:
1) Comment out:
#if [ ! -e "/run/udev/" ]; then
# warn_if_interactive
#fi
2) Comment out:
#if ! ps --no-headers --format args ax | egrep -q '^\['; then
# log_warning_msg "udev does not support containers, not started"
# exit 0
#fi
root@e751e437a8ba:~# service udev start
[ ok ] Starting hotplug events dispatcher: systemd-udevd.
[ ok ] Synthesizing the initial hotplug events (subsystems)...done.
[ ok ] Synthesizing the initial hotplug events (devices)...done.
[ ok ] Waiting for /dev to be fully populated...done.
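With udevd running inside the container, a quick way to verify is to watch for an event while, for example, plugging a USB device into the host (assuming USB is the subsystem you care about):
udevadm monitor --udev --subsystem-match=usb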

Related

Logspout container in Docker

I am trying to deploy a logspout container in Docker, but I keep running into an issue; I have searched this website and GitHub to no avail, so I'm hoping someone knows.
I followed the following commands as per the Readme here: https://github.com/gliderlabs/logspout
(1) docker pull gliderlabs/logspout:latest (also tried with logspout:master, same results)
(2) docker run -d --name="logspout" --volume=/var/run/docker.sock:/var/run/docker.sock --publish=127.0.0.1:8000:80 gliderlabs/logspout (also tried with -v /var/run/docker.sock:/var/run/docker.sock, same results)
The container gets created but stops immediately. When I check the container logs (docker container logs logspout), I only see the following entries:
2021/12/19 06:37:12 # logspout v3.2.14 by gliderlabs
2021/12/19 06:37:12 # adapters: raw syslog tcp tls udp multiline
2021/12/19 06:37:12 # options :
2021/12/19 06:37:12 persist:/mnt/routes
2021/12/19 06:37:12 # jobs : pump routes http[health,logs,routes]:80
2021/12/19 06:37:12 # routes : none
2021/12/19 06:37:12 pump ended: Get http://unix.sock/containers/json?: dial unix /var/run/docker.sock: connect: no such file or directory
I checked docker.sock as ls -la /var/run/docker.sock results in srw-rw---- 1 root docker 0 Dec 12 09:49 /var/run/docker.sock. So docker.sock does exist, which adds to the confusion as to why the container can't find it.
I am new to Linux/Docker, but my understanding is that using -v or --volume would automatically mount the location into the container, which does not seem to be happening here. So I am wondering if anyone has any suggestion on what needs to be done so that the logspout container can find the docker.sock.
System Info: Docker version 20.10.11, build dea9396; Raspberry Pi 4 ARM 64, OS: Debian GNU/Linux 11 (bullseye)
EDIT: added a comment about the -v flag in step (2) above
The container must be able to access the Docker Unix socket in order to mount it. This is typically a problem when user namespace remapping is enabled. To disable remapping for the logspout container, pass the --userns=host flag to docker run, docker create, etc.
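Applied to the run command from the question, that would look something like this (untested sketch):
docker run -d --name="logspout" --userns=host \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --publish=127.0.0.1:8000:80 \
  gliderlabs/logspout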

Cannot open vfio device in docker container as non-root user

I have enabled virtualization in the BIOS and enabled the IOMMU on kernel command line (intel_iommu=on).
I bound a Solarflare NIC to the vfio-pci driver and added a udev rule to ensure the vfio device is accessible by my non-root user (e.g., /etc/udev/rules.d/10-vfio-docker-users.rules):
SUBSYSTEM=="vfio", OWNER="myuser", GROUP=="myuser"
I've launched my container with -u 1000 and mapped /dev (-v /dev:/dev). Running in an interactive shell in the container, I am able to verify that the device is there with the permissions set by my udev rule:
bash-4.2$ whoami
whoami: unknown uid 1000
bash-4.2$ ls -al /dev/vfio/35
crw-rw---- 1 1000 1000 236, 0 Jan 25 00:23 /dev/vfio/35
However, if I try to open it (e.g., python -c "open('/dev/vfio/35', 'rb')") I get IOError: [Errno 1] Operation not permitted: '/dev/vfio/35'. Yet the same command works outside the container as the normal non-root user with user ID 1000!
It seems that there are additional security measures that are not allowing me to access the vfio device within the container. What am I missing?
Docker drops a number of privileges by default, including the ability to access most devices. You can explicitly grant access to a device using the --device flag, which would look something like:
docker run --device /dev/vfio/35 ...
Alternately, you can ask Docker not to drop any privileges:
docker run --privileged ...
You'll note that in both of the above examples it was not necessary to explicitly bind-mount /dev; in the first case, the device(s) you have exposed with --device will show up, and in the second case you see the host's /dev by default.
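For example, a run that keeps the non-root user but grants only the one device might look roughly like this (my-image is a placeholder for an image with python installed):
docker run -u 1000 --device /dev/vfio/35 my-image \
  python -c "open('/dev/vfio/35', 'rb')"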

Docker Swarm with GlusterFS as the external volume storage and VIP

I was wondering whether Docker Swarm could act as a load balancer with GlusterFS as the local filesystem, and whether Pacemaker could hold the VIP (since, as I understand it, Docker cannot create a VIP itself).
My idea, which I'm hoping someone can verify or improve on:
System:
2x CentOS 8 servers
- 192.168.0.1
---- /dev/sda (OS)
---- /dev/sdb (data)
- 192.168.0.2
---- /dev/sda (OS)
---- /dev/sdb (data)
Install Pacemaker, Corosync
dnf --enablerepo=HighAvailability -y install pacemaker pcs psmisc policycoreutils-python-utils
systemctl start pcsd
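(Assumption not spelled out above: before the VIP resource can be created, the two-node cluster itself has to be authenticated and started. A minimal sketch on CentOS 8, with the cluster name and hacluster credentials as placeholders:)
passwd hacluster                            # set the same password on both nodes
pcs host auth node01 node02 -u hacluster
pcs cluster setup my_cluster node01 node02
pcs cluster start --all
pcs cluster enable --all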
Add a VIP to both servers
pcs resource create vip IPaddr2 ip=192.168.0.100 cidr_netmask=24 op monitor interval=30s
Set up storage on both nodes
mkfs.xfs /dev/sdb
Make the directory and add to startup
mkdir -p /my-data/
echo "/dev/sdb /my-data xfs defaults 0 0" >> /etc/fstab
Install GlusterFS on both nodes
dnf install -y glusterfs-server
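(Another step that is usually needed before creating the volume: start glusterd and probe the peer from one node, e.g. from node01 - a sketch:)
systemctl enable --now glusterd             # on both nodes
gluster peer probe node02
gluster peer status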
Setup Gluster for the volume
gluster volume create gfs replica 2 transport tcp node01:/my-data node02:/my-data force
gluster volume start gfs
Make it accessible for the replication
echo 'node01:/my-data /mnt glusterfs defaults,_netdev 0 0' >> /etc/fstab
echo 'node02:/my-data /mnt glusterfs defaults,_netdev 0 0' >> /etc/fstab
Install Docker and Docker-Compose
Initialise Swarm
- on node01 use IP 192.168.0.1 -> manager
- on node02 use IP 192.168.0.2 -> manager
Create the directories
mkdir /mnt/html
mkdir /mnt/mysql
In the docker-compose.yml file:
volumes:
  - "/mnt/html:/var/www/html/wp-content"
volumes:
  - "/mnt/mysql:/var/lib/mysql"
As part of the docker-compose.yml - apache:
Use IP 192.168.0.100 as the access point on port 80
My thinking is that since 192.168.0.100 is only active on one of the Pacemaker nodes, the secondary manager wouldn't be hit on the front end. If the node holding .100 went down, the other node (node02) would take over that IP and the Swarm would still be active.
Is this something that would work? I can't find anything about having a VIP on the Swarm - at least no working solutions.
I have both of them as managers because I assume that if the manager goes down, the whole thing stops working. If I then added a 3rd, 4th, etc. node, I'd add them as workers.

Allow outbound container networking through vpnkit

I have a linuxkit-built VM with a custom container service that I am trying to run.
services:
  ...
  - name: net-manager
    image: aemengo/net-manager:6bcc223a83e8a303a004bc6f6e383a54a3d19c55-amd64
    net: host
    capabilities:
      - all
    binds:
      - /usr/bin/vpnkit-expose-port:/usr/bin/vpnkit-expose-port # userland proxy
      - /usr/bin/vpnkit-iptables-wrapper:/usr/bin/iptables # iptables wrapper
      - /var/vpnkit:/port # vpnkit control 9p mount
      - /var/run:/var/run
    command:
      - sleep
      - 1d
With a base image of Alpine, the point of the net-manager service is to provide public internet connectivity to virtual ethernet adapters that I am spinning up in the host network namespace (net: host). My current attempt is the following (inside the container):
$ sysctl net.ipv4.conf.all.forwarding=1
$ /usr/bin/iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
This is just like what you would do with a VM that wasn't using vpnkit, but it doesn't seem to have any noticeable effect. For example, nc -v google.com still fails. What am I missing? vpnkit is mounted and forwarded as the example here instructs:
https://github.com/linuxkit/linuxkit/blob/master/examples/docker-for-mac.yml
It turns out that the problem was this line here:
binds:
  ...
  - /usr/bin/vpnkit-iptables-wrapper:/usr/bin/iptables
By overriding the iptables executable with the one provided by Docker, things misbehaved even though the commands reported no errors. That wrapper must be intended for something Swarm-specific, as mentioned in their docs.
The fix was to remove that binding and use the iptables provided inside the container.
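So the binds section ends up looking roughly like this, with the wrapper line dropped (a sketch based on the config above):
binds:
  - /usr/bin/vpnkit-expose-port:/usr/bin/vpnkit-expose-port # userland proxy
  - /var/vpnkit:/port # vpnkit control 9p mount
  - /var/run:/var/run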

Marathon won't launch docker container

I have a 1/1 master/slave setup, with the slave having 8 GB RAM and 8 CPUs. I am trying to use Marathon to deploy a Docker container with 1 GB mem and 1 CPU, but it just hangs in "waiting".
I believe this is usually caused by Marathon not getting the resources it wants for the task.
When I look at my logs I see:
Sending 1 offers to framework
8bb1a298-cc23-426e-ad43-d440a2a560c4-0000 (marathon) at
scheduler-d4a993b4-69ea-4ac3-9e98-b54afe1e790b#127.0.0.1:52016 I0127
23:07:37.396546 2471 master.cpp:3297] Processing DECLINE call for
offers: [ 5271fcb3-4d77-4b12-af85-d94fd9172514-O127 ] for framework
8bb1a298-cc23-426e-ad43-d440a2a560c4-0000 (marathon) at
scheduler-d4a993b4-69ea-4ac3-9e98-b54afe1e790b#127.0.0.1:52016 I0127
23:07:37.396917 2466 hierarchical.cpp:744] Recovered cpus(*):6;
mem(*):5968; disk(*):156020; ports(*):[31000-31056, 31058-32000]
(total: cpus(*):8; mem(*):6992; disk(*):156020;
ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024;
ports(*):[31057-31057]) on slave
8bb1a298-cc23-426e-ad43-d440a2a560c4-S0 from framework
8bb1a298-cc23-426e-ad43-d440a2a560c4-0000
So it looks like Marathon is declining the offer it gets? The next line in the logs says that Mesos is reclaiming the offered resources, and what it's reclaiming looks like plenty for my task.
Any ideas on how to troubleshoot this further?
Edit: I got to dig into this a bit further and found the Marathon logs.
Basically the deployment works if we do not enter any information for port mapping in the marathon docker section. The docker container deploys successfully and I can ping it successfully from its host but I cannot access it from elsewhere.
If we set the container port to 8081 (which is what the Docker container exposes and what its application listens on), we get further in the deployment process, but the app within the container then fails with the error:
Error: listen EADDRINUSE :::8081
at Object.exports._errnoException (util.js:856:11)
at exports._exceptionWithHostPort (util.js:879:20)
at Server._listen2 (net.js:1234:14)
at listen (net.js:1270:10)
at Server.listen (net.js:1366:5)
at EventEmitter.listen (/usr/src/app/node_modules/express/lib/application.js:617:24)
at Object.&lt;anonymous&gt; (/usr/src/app/index.js:16:18)
at Module._compile (module.js:425:26)
at Object.Module._extensions..js (module.js:432:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:313:12)
at Function.Module.runMain (module.js:457:10)
at startup (node.js:138:18)
at node.js:974:3
So I think we are further along than we were, but we are still having some port issues. I don't know why the container builds successfully on its own, and with Marathon when no port settings are given, but not with Marathon when port settings are set.
There are a few things to check:
On your slave, ps aux | grep sbin/mesos-slave should contain something like:
--containerizers=docker,mesos --executor_registration_timeout=5mins
Again on the slave, check that there's a Docker daemon running:
ps aux | grep "docker daemon"
Make sure you've configured the Docker network (in Marathon) as BRIDGE. With HOST mode you might run into collisions with ports already used on the host. BRIDGE mode allows mapping e.g. slave:32001 -> docker:8080.
...
"network": "BRIDGE",
"portMappings": [
{
"containerPort": 8080,
"hostPort": $PORT0,
"protocol": "tcp"
}
],
...
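For context, here is a minimal sketch of where that fragment lives in a full Marathon app definition (the app ID and image are placeholders; "hostPort": 0 asks Mesos to pick a port from the slave's port range):
{
  "id": "myapp",
  "cpus": 1,
  "mem": 1024,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "myrepo/myimage",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  }
}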
When the task starts in Marathon you'll see the app ID like myapp.a72db5b0-ca16-11e5-ba5f-fea9945fabaf. Use Mesos CLI (pip install mesos.cli mesos.interface) to fetch the logs. There's a command similar to Unix's tail for fetching stdout logs (-f follow logs):
mesos tail -f -i myapp.a72db5b0-ca16-11e5-ba5f-fea9945fabaf
and stderr:
mesos tail -f -i myapp.a72db5b0-ca16-11e5-ba5f-fea9945fabaf stderr
-i allows you to get logs from inactive tasks (in case that the task is crashing quickly). If you don't catch the ID in Marathon, use mesos ps -i.
In case the task is not starting, there's either not enough resources or some problem with Marathon. Point your browser to http://{marathon URI}:8080/logging and increase verbosity for task allocation, then check the Marathon logs.
