Kubernetes Kube-Proxy Server: Can it run without iptables' nat module? - docker

Background:
In our environment, iptables' nat module is disabled. So I must use '-b=none --iptables=false' to start docker daemon and always add '--net host' when using 'docker run' command.
The same problem arising when using kubernetes.
When I try to start the 'kube-proxy' service, I got an error:
> F0822 14:32:49.065506 29630 server.go:101] Unable to create proxer:
> failed to initialize iptables: error creating chain
> "KUBE-PORTALS-CONTAINER": exit status 3: iptables v1.4.21: can't
> initialize iptables table `nat': Table does not exist (do you need to
> insmod?) Perhaps iptables or your kernel needs to be upgraded.
Is there a way to bypass this?

Kube-proxy makes heavy use of IPtables, even in userspace mode. I'm afraid you won't be able to run a Kubernetes node on a machine where IPtables is disabled completely.

Related

Can I run k8s master INSIDE a docker container? Getting errors about k8s looking for host's kernel details

In a docker container I want to run k8s.
When I run kubeadm join ... or kubeadm init commands I see sometimes errors like
\"modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could
not open moddep file
'/lib/modules/3.10.0-1062.1.2.el7.x86_64/modules.dep.bin'.
nmodprobe:
FATAL: Module configs not found in directory
/lib/modules/3.10.0-1062.1.2.el7.x86_64",
err: exit status 1
because (I think) my container does not have the expected kernel header files.
I realise that the container reports its kernel based on the host that is running the container; and looking at k8s code I see
// getKernelConfigReader search kernel config file in a predefined list. Once the kernel config
// file is found it will read the configurations into a byte buffer and return. If the kernel
// config file is not found, it will try to load kernel config module and retry again.
func (k *KernelValidator) getKernelConfigReader() (io.Reader, error) {
possibePaths := []string{
"/proc/config.gz",
"/boot/config-" + k.kernelRelease,
"/usr/src/linux-" + k.kernelRelease + "/.config",
"/usr/src/linux/.config",
}
so I am bit confused what is simplest way to run k8s inside a container such that it consistently past this getting the kernel info.
I note that running docker run -it solita/centos-systemd:7 /bin/bash on a macOS host I see :
# uname -r
4.9.184-linuxkit
# ls -l /proc/config.gz
-r--r--r-- 1 root root 23834 Nov 20 16:40 /proc/config.gz
but running exact same on a Ubuntu VM I see :
# uname -r
4.4.0-142-generic
# ls -l /proc/config.gz
ls: cannot access /proc/config.gz
[Weirdly I don't see this FATAL: Module configs not found in directory error every time, but I guess that is a separate question!]
UPDATE 22/November/2019. I see now that k8s DOES run okay in a container. Real problem was weird/misleading logs. I have added an answer to clarify.
I do not believe that is possible given the nature of containers.
You should instead test your app in a docker container then deploy that image to k8s either in the cloud or locally using minikube.
Another solution is to run it under kind which uses docker driver instead of VirtualBox
https://kind.sigs.k8s.io/docs/user/quick-start/
It seems the FATAL error part was a bit misleading.
It was badly formatted by my test environment (all on one line.
When k8s was failing I saw the FATAL and assumed (incorrectly) that was root cause.
When I format the logs nicely I see ...
kubeadm join 172.17.0.2:6443 --token 21e8ab.1e1666a25fd37338 --discovery-token-unsafe-skip-ca-verification --experimental-control-plane --ignore-preflight-errors=all --node-name 172.17.0.3
[preflight] Running pre-flight checks
[WARNING FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] The system verification failed. Printing the output from the verification:
KERNEL_VERSION: 4.4.0-142-generic
DOCKER_VERSION: 18.09.3
OS: Linux
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.3. Latest validated version: 18.06
[WARNING SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs", output: "modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/4.4.0-142-generic/modules.dep.bin'\nmodprobe: FATAL: Module configs not found in directory /lib/modules/4.4.0-142-generic\n", err: exit status 1
[discovery] Trying to connect to API Server "172.17.0.2:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.17.0.2:6443"
[discovery] Failed to request cluster info, will try again: [the server was unable to return a response in the time allotted, but may still be processing the request (get configmaps cluster-info)]
There are other errors later, which I originally though were a side-effect of the nasty looking FATAL error e.g. .... "[util/etcd] Attempt timed out"]} but I now think root cause is Etcd part times out sometimes.
Adding this answer in case someone else puzzled like I was.

Allow outbound container networking through vpnkit

I have a linuxkit built VM here with a custom container service that I am trying to run.
services:
...
- name: net-manager
image: aemengo/net-manager:6bcc223a83e8a303a004bc6f6e383a54a3d19c55-amd64
net: host
capabilities:
- all
binds:
- /usr/bin/vpnkit-expose-port:/usr/bin/vpnkit-expose-port # userland proxy
- /usr/bin/vpnkit-iptables-wrapper:/usr/bin/iptables # iptables wrapper
- /var/vpnkit:/port # vpnkit control 9p mount
- /var/run:/var/run
command:
- sleep
- 1d
With a base image of Alpine, the point of the net-manager service is to allow public internet connectivity to virtual ethernet adapters that I am spinning up on the host: net namespace. My current attempt is the following (inside the container):
$ sysctl net.ipv4.conf.all.forwarding=1
$ /usr/bin/iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
Just like you would do with a VM that wasn't utilizing vpnkit, but there doesn't seem to be any noticeable effect from doing this. For example, nc -v google.com is still failing. What am I missing? vpnkit is mounted and forwarded as the example here instructs to do:
https://github.com/linuxkit/linuxkit/blob/master/examples/docker-for-mac.yml
It turns out that the problem was this line here:
binds:
...
/usr/bin/vpnkit-iptables-wrapper:/usr/bin/iptables
By overriding what the iptables executable was to the one provided by docker, things were misbehaving even though the commands reported no issue. It must be used for something swarm specific, as was mentioned in their docs.
The fix was to remove that binding and run the iptables that was provided in the container

Can't install Kubernetes on Vagrant

Use this guide to install Kubernetes on Vagrant cluster:
https://kubernetes.io/docs/getting-started-guides/kubeadm/
At (2/4) Initializing your master, there came some errors:
[root#localhost ~]# kubeadm init
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.4
[init] Using Authorization mode: RBAC
[preflight] Running pre-flight checks
[preflight] Some fatal errors occurred:
/proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`
I checked the /proc/sys/net/bridge/bridge-nf-call-iptables file content, there is only one 0 in it.
At (3/4) Installing a pod network, I downloaded kube-flannel file:
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
And run kubectl apply -f kube-flannel.yml, got error:
[root#localhost ~]# kubectl apply -f kube-flannel.yml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Until here, I don't know how to goon.
My Vagrantfile:
# Master Server
config.vm.define "master", primary: true do |master|
master.vm.network :private_network, ip: "192.168.33.200"
master.vm.network :forwarded_port, guest: 22, host: 1234, id: 'ssh'
end
In order to set /proc/sys/net/bridge/bridge-nf-call-iptables by editing /etc/sysctl.conf. There you can add [1]
net.bridge.bridge-nf-call-iptables = 1
Then execute
sudo sysctl -p
And the changes will be applied. With this the pre-flight check should pass.
[1] http://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf
Update #2019/09/02
Sometimes modprobe br_netfilter is unreliable, you may need to redo it after relogin, so use the following instead when on a systemd sytem:
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
systemctl restart systemd-modules-load.service
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
YES, the accepted answer is right, but I faced with
cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
So I did
modprobe br_netfilter
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
sudo sysctl -p
Then solved.
On Ubuntu 16.04 I just had to:
modprobe br_netfilter
Default value in /proc/sys/net/bridge/bridge-nf-call-iptables is already 1.
Then I added br_netfilter to /etc/modules to load the module automatically on next boot.
As mentioned in K8s docs - Installing kubeadm under the Letting iptables see bridged traffic section:
Make sure that the br_netfilter module is loaded. This can be done
by running lsmod | grep br_netfilter. To load it explicitly call
sudo modprobe br_netfilter.
As a requirement for your Linux Node's iptables to correctly see
bridged traffic, you should ensure
net.bridge.bridge-nf-call-iptables is set to 1 in your sysctl
config, e.g.
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
Regardng the preflight erros - you can see in Kubeadm Implementation details under the preflight-checks:
Kubeadm executes a set of preflight checks before starting the init,
with the aim to verify preconditions and avoid common cluster startup
problems..
The following missing configurations will produce errors:
.
.
if /proc/sys/net/bridge/bridge-nf-call-iptables file does not exist/does not contain 1
if advertise address is ipv6 and /proc/sys/net/bridge/bridge-nf-call-ip6tables does not exist/does not contain 1.
if swap is on
.
.
The one-liner way:
sysctl net.bridge.bridge-nf-call-iptables=1

Running docker using linux kernel 4.3.0 got iptables nat error

I upgrade my debian kernel into 4.3.0
root#qa-control-nce-yuztest1:/usr/src/kernels/linux-4.3# uname -a
Linux qa-control-nce-yuztest1 4.3.0 #1 SMP Thu Dec 10 00:47:22 CST 2015 x86_64 GNU/Linux
bug found docker daemon ha
root#qa-control-nce-yuztest1:/usr/src/kernels/linux-4.3# docker -d
Warning: '-d' is deprecated, it will be removed soon. See usage.
WARN[0000] please use 'docker daemon' instead.
WARN[0000] Udev sync is not supported. This will lead to unexpected behavior, data loss and errors. For more information, see https://docs.docker.com/reference/commandline/daemon/#daemon-storage-driver-option
INFO[0000] API listen on /var/run/docker.sock
WARN[0000] Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section.
INFO[0000] [graphdriver] using prior storage driver "devicemapper"
FATA[0000] Error starting daemon: Error initializing network controller: error obtaining controller instance: Failed to create NAT chain: iptables failed: iptables -t nat -N DOCKER: iptables v1.4.14: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
(exit status 3)
Seems iptables nat table does not exist, but i don't know how to deal with that.
Need your help, thanks in advance!
You need a kernel with iptables nat configured. I suspect because you upgraded the kernel yourself, that means you are not using one provided by a distro? In which case maybe you configured it from scratch and did not enable iptables nat.
When running the config ('make menuconfig'), search for '_nat', and find the iptables nat configuration through that, and enable it.

Docker on RHEL 6 Cgroup mounting failing

I'm trying to get my head around something that's been working on a Centos+Vagrant, but not on our providers RHEL (Red Hat Enterprise Linux Server release 6.5 (Santiago)). A sudo service docker restart hands this:
Stopping docker: [ OK ]
Starting cgconfig service: Error: cannot mount cpuset to /cgroup/cpuset: Device or resource busy
/sbin/cgconfigparser; error loading /etc/cgconfig.conf: Cgroup mounting failed
Failed to parse /etc/cgconfig.conf [FAILED]
Starting docker: [ OK ]
The service starts okey enough, but images cannot run. A mounting failed error is shown when I try. And the startup-log also gives a warning or two. Regarding the kernelwarning, centos gives the same and has no problems as Epel should resolve this:
WARNING: You are running linux kernel version 2.6.32-431.17.1.el6.x86_64, which might be unstable running docker. Please upgrade your kernel to 3.8.0.
2014/08/07 08:58:29 docker daemon: 1.1.2 d84a070; execdriver: native; graphdriver:
[1233d0af] +job serveapi(unix:///var/run/docker.sock)
[1233d0af] +job initserver()
[1233d0af.initserver()] Creating server
2014/08/07 08:58:29 Listening for HTTP on unix (/var/run/docker.sock)
[1233d0af] +job init_networkdriver()
[1233d0af] -job init_networkdriver() = OK (0)
2014/08/07 08:58:29 WARNING: mountpoint not found
Anyone had any success overcoming this problem or should I throw in the towel and wait for the provider to update to RHEL 7?
I have the same issue.
(1) check cgconfig status
# /etc/init.d/cgconfig status
if it stopped, restart it
# /etc/init.d/cgconfig restart
check cgconfig is running
(2) check cgconfig is on
# chkconfig --list cgconfig
cgconfig 0:off 1:off 2:off 3:off 4:off 5:off 6:off
if cgconfig is off, turn it on
(3) if still does not work, may be some cgroups modules is missing. In the kernel .config file, make menuconfig, add those modules into kernel and recompile and reboot
after that, it should be OK
I ended up asking the same question at Google Groups and in the end finding a solution with some help. What worked for me was this:
umount cgroup
sudo service cgconfig start
The project of making Docker work was put on halt all the same. Later a problem of network connection for the containers. This took to much time to solve and had to give up.
So I spent the whole day trying to rig docker to work on my vps. I was running into this same error. Basically what it came down to was the fact that OpenVZ didn't support docker containers up until a couple months ago. Specifically this RHEL update:
https://openvz.org/Download/kernel/rhel6/042stab105.14
Assuming this is your problem, or some variation of it, the burden of solving it is on your host. They will need to follow these steps:
https://openvz.org/Docker_inside_CT
In my case
/etc/rc.d/rc.cgconfig start
was generating
Starting cgconfig service: Error: cannot mount cpu,cpuacct,memory to
/cgroup/cpu_and_mem: Device or resource busy /usr/sbin/cgconfigparser;
error loading /etc/cgconfig.conf: Cgroup mounting failed Failed to
parse /etc/cgconfig.conf
i had to use:
/etc/rc.d/rc.cgconfig restart
and it automagicly umouted and mounted groups
Stopping cgconfig service: Starting cgconfig service:
it seems like the cgconfig service not running,so check it!
# /etc/init.d/cgconfig status
# mkdir -p /cgroup/cpuacct /cgroup/memory /cgroup/devices /cgroup/freezer net_cls /cgroup/blkio
# cat /etc/cgconfig.conf |tail|grep "="|awk '{print "mount -t cgroup -o",$1,$1,$NF}'>cgroup_mount.sh
# sh ./cgroup_mount.sh
# /etc/init.d/cgconfig restart
# /etc/init.d/docker restart
This situation occurs when the kernel is booted with cgroup_disable=memory and /etc/cgconfig.conf contains memory = /cgroup/memory;
This causes only /cgroup/cpuset to be mounted instead of the full set.
Solution: either remove cgroup_disable=memory from your kernel boot options or comment out memory = /cgroup/memory; from cgconfig.conf.
The cgconfig service startup uses mount and umount which requires an extra privilege bump from docker.
See the --privileged=true flag here for more info.
I was able to overcome this issue by starting my container with:
docker run -it --privileged=true my-image.
Tested in Centos6, Centos6.5.

Resources