Can I run k8s master INSIDE a docker container? Getting errors about k8s looking for host's kernel details - docker

In a docker container I want to run k8s.
When I run kubeadm join ... or kubeadm init commands I sometimes see errors like:
modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/3.10.0-1062.1.2.el7.x86_64/modules.dep.bin'
modprobe: FATAL: Module configs not found in directory /lib/modules/3.10.0-1062.1.2.el7.x86_64
err: exit status 1
because (I think) my container does not have the expected kernel header files.
I realise that the container reports its kernel based on the host that is running the container; and looking at k8s code I see
// getKernelConfigReader search kernel config file in a predefined list. Once the kernel config
// file is found it will read the configurations into a byte buffer and return. If the kernel
// config file is not found, it will try to load kernel config module and retry again.
func (k *KernelValidator) getKernelConfigReader() (io.Reader, error) {
    possibePaths := []string{
        "/proc/config.gz",
        "/boot/config-" + k.kernelRelease,
        "/usr/src/linux-" + k.kernelRelease + "/.config",
        "/usr/src/linux/.config",
    }
so I am a bit confused about the simplest way to run k8s inside a container such that it consistently gets past this kernel check.
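One hedged workaround I have considered (an assumption on my part, not something the kubeadm docs prescribe) is to bind-mount the host's kernel module tree and /boot into the container so that modprobe and the kernel-config check can find the files the validator looks for, roughly like this (the image name is just the one from my test below):

# Sketch only: make /lib/modules/$(uname -r) and /boot/config-$(uname -r)
# visible inside the container, read-only.
docker run -it --privileged \
  -v /lib/modules:/lib/modules:ro \
  -v /boot:/boot:ro \
  solita/centos-systemd:7 /bin/bash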
I note that running docker run -it solita/centos-systemd:7 /bin/bash on a macOS host I see:
# uname -r
4.9.184-linuxkit
# ls -l /proc/config.gz
-r--r--r-- 1 root root 23834 Nov 20 16:40 /proc/config.gz
but running exactly the same on an Ubuntu VM I see:
# uname -r
4.4.0-142-generic
# ls -l /proc/config.gz
ls: cannot access /proc/config.gz
[Weirdly I don't see this FATAL: Module configs not found in directory error every time, but I guess that is a separate question!]
UPDATE 22/November/2019: I see now that k8s DOES run okay in a container. The real problem was weird/misleading logs. I have added an answer to clarify.

I do not believe that is possible given the nature of containers.
You should instead test your app in a docker container, then deploy that image to k8s either in the cloud or locally using minikube.
Another solution is to run it under kind, which uses the docker driver instead of VirtualBox:
https://kind.sigs.k8s.io/docs/user/quick-start/
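For reference, a minimal kind session looks roughly like this (the cluster name is arbitrary; see the quick-start guide above for details):

kind create cluster --name dev            # each "node" runs as a docker container
kubectl cluster-info --context kind-dev   # talk to the new cluster
kind delete cluster --name dev            # tear it down again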

It seems the FATAL error part was a bit misleading.
It was badly formatted by my test environment (all on one line).
When k8s was failing I saw the FATAL and assumed (incorrectly) that it was the root cause.
When I format the logs nicely I see ...
kubeadm join 172.17.0.2:6443 --token 21e8ab.1e1666a25fd37338 --discovery-token-unsafe-skip-ca-verification --experimental-control-plane --ignore-preflight-errors=all --node-name 172.17.0.3
[preflight] Running pre-flight checks
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.4.0-142-generic
DOCKER_VERSION: 18.09.3
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.3. Latest validated version: 18.06
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-142-generic/modules.dep.bin'\nmodprobe: FATAL: Module configs not found in directory /lib/modules/4.4.0-142-generic\n", err: exit status 1
[discovery] Trying to connect to API Server "172.17.0.2:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.17.0.2:6443"
[discovery] Failed to request cluster info, will try again: [the server was unable to return a response in the time allotted, but may still be processing the request (get configmaps cluster-info)]
There are other errors later, which I originally thought were a side-effect of the nasty-looking FATAL error, e.g. "[util/etcd] Attempt timed out", but I now think the root cause is that the etcd step sometimes times out.
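As a hedged sanity check (the API server address is the one from the log above), probing the control plane directly helps separate the harmless kernel-config warning from the real connectivity/etcd problem:

# expect "ok" (or an auth error) if the apiserver is reachable; a hang points at the etcd/apiserver side
curl -k https://172.17.0.2:6443/healthz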
Adding this answer in case someone else is puzzled like I was.

Related

How to use vpnkit with minikube on mac

There are many questions around this topic, but not the specific info I am after.
The host OS is macOS, and we recently had to uninstall Docker Desktop due to their licensing change. So instead we have moved to minikube, and it is all working great with the VirtualBox driver.
But ideally we would like to use the hyperkit driver, as it requires fewer resources than VirtualBox and is (anecdotally) faster. This also all works great until we connect to our VPN (using Cisco AnyConnect), and then all outbound networking from within the minikube VM stops working, e.g.
k8> minikube ssh "traceroute 8.8.8.8"
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 46 byte packets
1 host.minikube.internal (192.168.64.1) 0.154 ms 0.181 ms 0.151 ms
2 * * *
Everything else is fine: inbound networking via ingress is all good, and maven-docker-plugin is happily creating images with the minikube docker daemon. Just nothing outbound.
So I figured I'd try to work with VPNKit, as I have read it is meant to address this issue. But I cannot find a lot of detailed documentation, and so am struggling.
We have tried starting VPNKit with minimal config:
vpnkit --ethernet /tmp/vpkit-ethernet.socket --debug
And then attempt to start minikube, but it fails:
k8> minikube delete
🔥 Deleting "minikube" in hyperkit ...
💀 Removed all traces of the "minikube" cluster.
k8> minikube start --driver=hyperkit --hyperkit-vpnkit-sock=/tmp/vpnkit-ethernet.socket
😄 minikube v1.25.1 on Darwin 10.15.7
✨ Using the hyperkit driver based on user configuration
👍 Starting control plane node minikube in cluster minikube
🔥 Creating hyperkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
🔥 Deleting "minikube" in hyperkit ...
🤦 StartHost failed, but will try again: creating host: create: Error creating machine: Error in driver during machine creation: hyperkit crashed! command line:
hyperkit loglevel=3 console=ttyS0 console=tty0 noembed nomodeset norestore waitusb=10 systemd.legacy_systemd_cgroup_controller=yes random.trust_cpu=on hw_rng_model=virtio base host=minikube
🔥 Creating hyperkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
😿 Failed to start hyperkit VM. Running "minikube delete" may fix it: creating host: create: Error creating machine: Error in driver during machine creation: hyperkit crashed! command line:
hyperkit loglevel=3 console=ttyS0 console=tty0 noembed nomodeset norestore waitusb=10 systemd.legacy_systemd_cgroup_controller=yes random.trust_cpu=on hw_rng_model=virtio base host=minikube
❌ Exiting due to PR_HYPERKIT_CRASHED: Failed to start host: creating host: create: Error creating machine: Error in driver during machine creation: hyperkit crashed! command line:
hyperkit loglevel=3 console=ttyS0 console=tty0 noembed nomodeset norestore waitusb=10 systemd.legacy_systemd_cgroup_controller=yes random.trust_cpu=on hw_rng_model=virtio base host=minikube
💡 Suggestion: Hyperkit is broken. Upgrade to the latest hyperkit version and/or Docker for Desktop. Alternatively, you may choose an alternate --driver
🍿 Related issues:
▪ https://github.com/kubernetes/minikube/issues/6079
▪ https://github.com/kubernetes/minikube/issues/5780
And in the vpnkit log we see:
time="2022-02-14T06:07:57Z" level=debug msg="usernet: accepted vmnet connection"
time="2022-02-14T06:07:57Z" level=warning msg="Uwt: Pipe.listen: rejected ethernet connection: EOF"
time="2022-02-14T06:08:07Z" level=debug msg="usernet: accepted vmnet connection"
time="2022-02-14T06:08:07Z" level=warning msg="Uwt: Pipe.listen: rejected ethernet connection: EOF"
So this kind of implies something is not right with how I started vpnkit. I have played with the IP args to ensure it all matches, but it does not help.
My guess is that the --ethernet=path arg is not the right type of socket. I have seen there is also --vsock-path=path but specifying this does not appear to create the socket file like --ethernet=path does. Do I have to create this some other way?
Or are there other config options I need to mess with? E.g. I thought --gateway-forwards=path could help, but I can find no documentation on the file format or contents.
So, I guess two main questions:
Is what we are trying even possible? Is it the right way to go about it? Or is it much more complicated than simply running the vpnkit command?
If we are on the right track, does anyone have experience with this, and know how to set up the socket for minikube+vpnkit+hyperkit? What args, config, or other setup is required?
And just to note: --hyperkit-vpnkit-sock=auto is not an option for us, as we do not have docker installed, and so the docker socket file does not exist.
And just in case it's a version issue:
k8> minikube version
minikube version: v1.25.1
commit: 3e64b11ed75e56e4898ea85f96b2e4af0301f43d
k8> vpnkit --version
854498c13b1884d4a48d84f3569eb34681af2126
k8> hyperkit -v
hyperkit: 0.20200908
Homepage: https://github.com/docker/hyperkit
License: BSD

Logspout container in Docker

I am trying to deploy the logspout container in Docker, but keep running into an issue which I have searched for on this website and GitHub, but to no avail, so I am hoping someone knows.
I followed the commands in the README here: https://github.com/gliderlabs/logspout
(1) docker pull gliderlabs/logspout:latest (also tried with logspout:master, same results)
(2) docker run -d --name="logspout" --volume=/var/run/docker.sock:/var/run/docker.sock --publish=127.0.0.1:8000:80 gliderlabs/logspout (also tried with -v /var/run/docker.sock:/var/run/docker.sock, same results)
The container gets created but stops immediately. When I check the container logs (docker container logs logspout), I only see the following entries:
2021/12/19 06:37:12 # logspout v3.2.14 by gliderlabs
2021/12/19 06:37:12 # adapters: raw syslog tcp tls udp multiline
2021/12/19 06:37:12 # options :
2021/12/19 06:37:12 persist:/mnt/routes
2021/12/19 06:37:12 # jobs : pump routes http[health,logs,routes]:80
2021/12/19 06:37:12 # routes : none
2021/12/19 06:37:12 pump ended: Get http://unix.sock/containers/json?: dial unix /var/run/docker.sock: connect: no such file or directory
I checked docker.sock as ls -la /var/run/docker.sock results in srw-rw---- 1 root docker 0 Dec 12 09:49 /var/run/docker.sock. So docker.sock does exist, which adds to the confusion as to why the container can't find it.
I am new to Linux/Docker, but my understanding is that using -v or --volume would automatically mount the location into the container, which does not seem to be happening here. So I am wondering if anyone has any suggestion on what needs to be done so that the logspout container can find docker.sock.
System Info: Docker version 20.10.11, build dea9396; Raspberry Pi 4 ARM 64, OS: Debian GNU/Linux 11 (bullseye)
EDIT: added comment about -v tag in step (2) above
The container must be able to access the Docker Unix socket in order to mount it. This is typically a problem when user namespace remapping is enabled. To disable remapping for the logspout container, pass the --userns=host flag to docker run, docker create, etc.
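A hedged sketch combining the run command from the question with that flag:

docker run -d --name="logspout" \
  --userns=host \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --publish=127.0.0.1:8000:80 \
  gliderlabs/logspout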

docker run hello-world results in "Incorrect Usage" error: "flag provided but not defined: -console"

When running docker run hello-world I get an "Incorrect Usage" error (full output pasted below). I'm running the following:
Docker 17.05.0-ce, build 89658be
docker-containerd 0.2.3 (commit 9048e5e)
runc v1.0.0-rc4
Linux kernel 4.1.15
Using buildroot 2017.11 (commit 1f1a242) to generate custom toolchain/rootfs
systemd 234
Seems as though I can pull the hello-world image down properly, as it is included in docker images output. Wondering if there is an incompatibility between docker/containerd/runc? Or maybe something obvious? First time working with docker.
Additionally, I've run a docker check-config.sh script I found that states the only kernel configuration features I'm missing are optional. They are CONFIG_CGROUP_PIDS, CONFIG_CGROUP_HUGETLB, CONFIG_AUFS_FS, /dev/zfs, zfs command, and zpool command. Everything else, including all required, are enabled.
Output:
# docker run hello-world
[ 429.332968] device vethc0d83d1 entered promiscuous mode
[ 429.359681] IPv6: ADDRCONF(NETDEV_UP): vethc0d83d1: link is not ready
Incorrect Usage.
NAME:
docker-runc create - create a container
USAGE:
docker-runc create [command options] <container-id>
Where "<container-id>" is your name for the instance of the container that you
are starting. The name you provide for the container instance must be unique on
your host.
DESCRIPTION:
The create command creates an instance of a container for a bundle. The bundle
is a directory with a specification file named "config.json" and a root
filesystem.
The specification file includes an args parameter. The args parameter is used
to specify command(s) that get run when the container is started. To change the
command(s) that get executed on start, edit the args parameter of the spec. See
"runc spec --help" for more explanation.
OPTIONS:
--bundle value, -b value path to the root of the bundle directory, defaults to the current directory
--console-socket value path to an AF_UNIX socket which will receive a file descriptor referencing the master end of the console's pseudoterminal
--pid-file value specify the file to write the process id to
--no-pivot do not use pivot root to jail process inside rootfs. This should be used whenever the rootfs is on top of a ramdisk
--no-new-keyring do not create a new session keyring for the container. This will cause the container to inherit the calling processes session key
--preserve-fds value Pass N additional file descriptors to the container (stdio + $LISTEN_FDS + N in total) (default: 0)
flag provided but not defined: -console
[ 429.832198] docker0: port 1(vethc0d83d1) entered disabled state
[ 429.849301] device vethc0d83d1 left promiscuous mode
[ 429.859317] docker0: port 1(vethc0d83d1) entered disabled state
docker: Error response from daemon: oci runtime error: flag provided but not defined: -console.
The -console option was replaced with --console-socket in runc Dec 2016 for v1.0.0-rc4.
So I would guess you need an older version of runc or a newer version of Docker.
If you are building Docker yourself, use Docker 17.09.0-ce or an older release of runc. I'm not sure if that's v0.1.1 or just an earlier 1.0 release like v1.0.0-rc2.
If you were upgrading packages, something has gone wrong with the install. Probably purge everything and reinstall Docker.
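A hedged way to check what you are actually running before swapping binaries (the docker-runc name assumes the wrapper shipped with Docker CE of that era):

docker-runc --version                           # the runc build the daemon shells out to
docker version --format '{{.Server.Version}}'   # the daemon version, for picking a matching runc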

dashDB local MPP deployment issue - cannot connect to database

I am facing a huge problem deploying a dashDB local cluster. After a seemingly successful deployment, the following error appears whenever I try to create a single table or launch a query. Furthermore, the web server is not working properly, unlike in the previous SMP deployment.
Cannot connect to database "BLUDB" on node "20" because the difference
between the system time on the catalog node and the virtual timestamp
on this node is greater than the max_time_diff database manager
configuration parameter.. SQLCODE=-1472, SQLSTATE=08004,
DRIVER=4.18.60
I followed the official deployment guide, so the following were double-checked (quick verification commands are sketched after this list):
each physical machine's and docker container's /etc/hosts file contains all IPs, fully qualified and simple hostnames
an NFS share is preconfigured and mounted to /mnt/clusterfs on every single server
none of the servers showed an error during the "docker logs --follow dashDB" phase
the nodes config file is located in the /mnt/clusterfs directory
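For reference, the kind of quick checks that cover the list above (the hostnames are the ones that appear in the status output further down; adjust for your cluster):

mount | grep /mnt/clusterfs               # NFS share is mounted
getent hosts dashdb01 dashdb02 dashdb03   # names resolve on hosts and inside containers
date -u                                   # compare across nodes to spot clock skew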
After starting dashDB with the following command:
docker exec -it dashDB start
It looks as it should (see below), but the error can be found in /opt/ibm/dsserver/logs/dsserver.0.log.
#
--- dashDB stack service status summary ---
##################################################################### Redirecting to /bin/systemctl status slapd.service
SUMMARY
LDAPrunning: SUCCESS
dashDBtablesOnline: SUCCESS
WebConsole : SUCCESS
dashDBconnectivity : SUCCESS
dashDBrunning : SUCCESS
#
--- dashDB high availability status ---
#
Configuring dashDB high availability ... Stopping the system Stopping
datanode dashdb02 Stopping datanode dashdb01 Stopping headnode
dashdb03 Running sm on head node dashdb03 .. Running sm on data node
dashdb02 .. Running sm on data node dashdb01 .. Attempting to activate
previously failed nodes, if any ... SM is RUNNING on headnode dashdb03
(ACTIVE) SM is RUNNING on datanode dashdb02 (ACTIVE) SM is RUNNING on
datanode dashdb01 (ACTIVE) Overall status : RUNNING
After several redeployments nothing has changed. Please help me figure out what I am doing wrong.
Many Thanks, Daniel
Always make sure the NTP service is started on every single cluster node before starting the docker container; otherwise it will have no effect.
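A hedged sketch for RHEL/CentOS 7 hosts running ntpd (adjust for chrony or another init system), to be run on every node before docker exec -it dashDB start:

systemctl enable ntpd
systemctl start ntpd
ntpstat          # or: ntpq -p, to confirm the clock is actually synchronised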

Docker on RHEL 6 Cgroup mounting failing

I'm trying to get my head around something that's been working on CentOS+Vagrant, but not on our provider's RHEL (Red Hat Enterprise Linux Server release 6.5 (Santiago)). A sudo service docker restart gives this:
Stopping docker: [ OK ]
Starting cgconfig service: Error: cannot mount cpuset to /cgroup/cpuset: Device or resource busy
/sbin/cgconfigparser; error loading /etc/cgconfig.conf: Cgroup mounting failed
Failed to parse /etc/cgconfig.conf [FAILED]
Starting docker: [ OK ]
The service starts okay enough, but images cannot run; a mounting-failed error is shown when I try. The startup log also gives a warning or two. Regarding the kernel warning, CentOS gives the same one and has no problems, as EPEL should resolve this:
WARNING: You are running linux kernel version 2.6.32-431.17.1.el6.x86_64, which might be unstable running docker. Please upgrade your kernel to 3.8.0.
2014/08/07 08:58:29 docker daemon: 1.1.2 d84a070; execdriver: native; graphdriver:
[1233d0af] +job serveapi(unix:///var/run/docker.sock)
[1233d0af] +job initserver()
[1233d0af.initserver()] Creating server
2014/08/07 08:58:29 Listening for HTTP on unix (/var/run/docker.sock)
[1233d0af] +job init_networkdriver()
[1233d0af] -job init_networkdriver() = OK (0)
2014/08/07 08:58:29 WARNING: mountpoint not found
Has anyone had any success overcoming this problem, or should I throw in the towel and wait for the provider to update to RHEL 7?
I have the same issue.
(1) check cgconfig status
# /etc/init.d/cgconfig status
if it stopped, restart it
# /etc/init.d/cgconfig restart
check cgconfig is running
(2) check cgconfig is on
# chkconfig --list cgconfig
cgconfig 0:off 1:off 2:off 3:off 4:off 5:off 6:off
if cgconfig is off, turn it on (see the sketch after these steps)
(3) if it still does not work, maybe some cgroup modules are missing. Run make menuconfig against the kernel .config, add those modules to the kernel, then recompile and reboot
after that, it should be OK
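For steps (1) and (2), a minimal sketch on RHEL 6 (SysV init):

/etc/init.d/cgconfig restart   # step (1): make sure it is running now
chkconfig cgconfig on          # step (2): start it on boot (runlevels 2-5)
chkconfig --list cgconfig      # verify the runlevels now show on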
I ended up asking the same question at Google Groups and in the end finding a solution with some help. What worked for me was this:
umount cgroup
sudo service cgconfig start
The project of making Docker work was put on hold all the same. Later we hit a problem with network connectivity for the containers. That took too much time to solve and we had to give up.
So I spent the whole day trying to get docker to work on my VPS. I was running into this same error. Basically what it came down to was the fact that OpenVZ didn't support docker containers until a couple of months ago. Specifically, this RHEL update:
https://openvz.org/Download/kernel/rhel6/042stab105.14
Assuming this is your problem, or some variation of it, the burden of solving it is on your host. They will need to follow these steps:
https://openvz.org/Docker_inside_CT
In my case
/etc/rc.d/rc.cgconfig start
was generating
Starting cgconfig service: Error: cannot mount cpu,cpuacct,memory to
/cgroup/cpu_and_mem: Device or resource busy /usr/sbin/cgconfigparser;
error loading /etc/cgconfig.conf: Cgroup mounting failed Failed to
parse /etc/cgconfig.conf
I had to use:
/etc/rc.d/rc.cgconfig restart
and it automagically unmounted and remounted the cgroups:
Stopping cgconfig service: Starting cgconfig service:
It seems like the cgconfig service is not running, so check it:
# /etc/init.d/cgconfig status
# mkdir -p /cgroup/cpuacct /cgroup/memory /cgroup/devices /cgroup/freezer /cgroup/net_cls /cgroup/blkio
# cat /etc/cgconfig.conf |tail|grep "="|awk '{print "mount -t cgroup -o",$1,$1,$NF}'>cgroup_mount.sh
# sh ./cgroup_mount.sh
# /etc/init.d/cgconfig restart
# /etc/init.d/docker restart
This situation occurs when the kernel is booted with cgroup_disable=memory and /etc/cgconfig.conf contains memory = /cgroup/memory;
This causes only /cgroup/cpuset to be mounted instead of the full set.
Solution: either remove cgroup_disable=memory from your kernel boot options or comment out memory = /cgroup/memory; from cgconfig.conf.
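A hedged sketch of both checks (the sed pattern assumes the memory line looks exactly like the one quoted above; GNU sed):

grep -o 'cgroup_disable=[^ ]*' /proc/cmdline    # was the kernel booted with it?
sed -i 's|^\(\s*memory\s*=\s*/cgroup/memory;\)|# \1|' /etc/cgconfig.conf    # comment the mount out
service cgconfig restart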
The cgconfig service startup uses mount and umount which requires an extra privilege bump from docker.
See the --privileged=true flag here for more info.
I was able to overcome this issue by starting my container with:
docker run -it --privileged=true my-image
Tested on CentOS 6 and CentOS 6.5.
