Kubernetes master not starting up in OpenStack heat - docker

I have been trying to set up a Kubernetes cluster in OpenStack for the last week or so using this guide. I have faced a few issues in the process, one of which is described in this question: kube-up.sh fails in OpenStack
When I run the ./cluster/kube-up.sh script, it tries to bring up the cluster using the openstack stack create step (Log). For some reason the Kubernetes master does not come up properly, and this is where the installation fails. I was able to SSH into the master node and found this in /var/log/cloud-init-output.log:
[..]
Complete!
* INFO: Running install_centos_stable_post()
* INFO: Running install_centos_check_services()
* INFO: Running install_centos_restart_daemons()
* INFO: Running daemons_running()
* INFO: Salt installed!
2017-01-02 12:57:31,574 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2017-01-02 12:57:31,576 - util.py[WARNING]: Running scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Cloud-init v. 0.7.5 finished at Mon, 02 Jan 2017 12:57:31 +0000. Datasource DataSourceOpenStack [net,ver=2]. Up 211.20 seconds
On digging further I found this snippet in the /var/log/messages file -> https://paste.ubuntu.com/23733430/
So I would assume that the Docker daemon is not starting up. There is also something wrong with my etcd configuration, because of which the flanneld service is not starting up either. Here is the output of service flanneld status:
● flanneld.service - Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)
Active: activating (start) since Tue 2017-01-03 13:32:37 UTC; 48s ago
Main PID: 15666 (flanneld)
CGroup: /system.slice/flanneld.service
└─15666 /usr/bin/flanneld -etcd-endpoints= -etcd-prefix= -iface=eth0 --ip-masq
Jan 03 13:33:16 kubernetesstack-master flanneld[15666]: E0103 13:33:16.229827 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:17 kubernetesstack-master flanneld[15666]: E0103 13:33:17.230082 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:18 kubernetesstack-master flanneld[15666]: E0103 13:33:18.230326 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:19 kubernetesstack-master flanneld[15666]: E0103 13:33:19.230560 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:20 kubernetesstack-master flanneld[15666]: E0103 13:33:20.230822 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:21 kubernetesstack-master flanneld[15666]: E0103 13:33:21.231325 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:22 kubernetesstack-master flanneld[15666]: E0103 13:33:22.231581 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:23 kubernetesstack-master flanneld[15666]: E0103 13:33:23.232140 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:24 kubernetesstack-master flanneld[15666]: E0103 13:33:24.234041 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Jan 03 13:33:25 kubernetesstack-master flanneld[15666]: E0103 13:33:25.234277 15666 network.go:53] Failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
My etcd daemon is running:
[root@kubernetesstack-master salt]# netstat -tanlp | grep etcd
tcp 0 0 192.168.173.3:4379 0.0.0.0:* LISTEN 20338/etcd
tcp 0 0 192.168.173.3:4380 0.0.0.0:* LISTEN 20338/etcd
Although it is running on non-standard ports.
I'm also on a corporate network behind a proxy. Any pointers on how to debug this further are appreciated. As of now I have reached a dead end on how to proceed. Asking in the Kubernetes Slack channels has also produced zero results!

/usr/bin/flanneld -etcd-endpoints=
That line is the source of your troubles, assuming you didn't elide the output before posting it. Your situation is made worse by etcd running on non-standard ports, but thankfully I think the solution to both of those is actually the same fix.
I would expect systemctl cat flanneld.service (you may need sudo, depending on the strictness of your systemd setup) to output the unified systemd descriptor for flanneld, including any "drop-ins", overrides, etc. If my theory is correct, one of them will contain either Environment= or EnvironmentFile=, and that is where I bet flanneld.service expects to find ETCD_ENDPOINTS= or FLANNELD_ETCD_ENDPOINTS= (as seen here) for the Exec.
So hopefully that file is either missing or actually blank; in either case you are one swift vi away from teaching flanneld about your etcd endpoints, and all being well in the world again.
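A minimal sketch of that check, under stated assumptions: the environment file path (often /etc/sysconfig/flanneld on CentOS) and the variable name FLANNELD_ETCD_ENDPOINTS are guesses until systemctl cat flanneld.service confirms them, and the helper name endpoints_from_env_file is made up for illustration. It prints the value of the endpoint variable from an environment file, or nothing if the variable is missing or empty. The demo runs against a scratch file, with one of the ports from the netstat output assumed to be the etcd client port.

```shell
#!/bin/sh
# Print the value of an endpoint variable from a systemd-style environment file.
# Prints nothing if the variable is missing, commented out, or empty.
# Usage: endpoints_from_env_file FILE [VARNAME]
endpoints_from_env_file() {
    file=$1
    var=${2:-FLANNELD_ETCD_ENDPOINTS}   # assumed name; use what your unit file shows
    [ -r "$file" ] || return 1
    # Take the last uncommented assignment and strip surrounding double quotes.
    sed -n "s/^[[:space:]]*${var}=//p" "$file" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Demo on a scratch file rather than the real environment file.
demo=$(mktemp)
printf '%s\n' \
    '# Flanneld configuration options' \
    'FLANNELD_ETCD_ENDPOINTS="http://192.168.173.3:4379"' > "$demo"
endpoints_from_env_file "$demo"
rm -f "$demo"
```

If the function prints nothing for the real file, that would match the empty -etcd-endpoints= seen in the process listing. After editing the file, systemctl daemon-reload followed by systemctl restart flanneld should pick up the new value.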

Run Docker on Raspberry Pi4 with overlay fs

I want to run Docker on a Raspberry Pi 4 while the SD card is read-only, using overlay fs.
A database runs in the Docker container, and its data is written to a USB stick (volume mapping).
When overlayfs is activated (after reboot, enabled via "sudo raspi-config"), Docker no longer starts.
The steps on https://docs.docker.com/storage/storagedriver/overlayfs-driver/
System information:
Linux raspberrypi 5.10.63-v8+ #1488 SMP PREEMPT Thu Nov 18 16:16:16 GMT 2021 aarch64 GNU/Linux
Docker information:
pi@raspberrypi:~ $ docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.6.3-docker)
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 20.10.11
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
………
Docker status after restart:
pi@raspberrypi:~ $ sudo systemctl status docker.*
Warning: The unit file, source configuration file or drop-ins of docker.service changed on disk. Run 'systemctl daemon-reload' to reload units.
● docker.socket - Docker Socket for the API
Loaded: loaded (/lib/systemd/system/docker.socket; enabled; vendor preset: enabled)
Active: failed (Result: service-start-limit-hit) since Thu 2021-12-09 14:30:43 GMT; 1h 13min ago
Triggers: ● docker.service
Listen: /run/docker.sock (Stream)
CPU: 2ms
Dec 09 14:30:36 raspberrypi systemd[1]: Starting Docker Socket for the API.
Dec 09 14:30:36 raspberrypi systemd[1]: Listening on Docker Socket for the API.
Dec 09 14:30:43 raspberrypi systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2021-12-09 14:30:43 GMT; 1h 13min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 992 (code=exited, status=1/FAILURE)
CPU: 162ms
Dec 09 14:30:43 raspberrypi systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Dec 09 14:30:43 raspberrypi systemd[1]: Stopped Docker Application Container Engine.
Dec 09 14:30:43 raspberrypi systemd[1]: docker.service: Start request repeated too quickly.
Dec 09 14:30:43 raspberrypi systemd[1]: docker.service: Failed with result 'exit-code'.
Dec 09 14:30:43 raspberrypi systemd[1]: Failed to start Docker Application Container Engine.
Running the command given in docker.service with an additional overlay flag:
pi@raspberrypi:~ $ sudo /usr/bin/dockerd --storage-driver=overlay -H fd:// --containerd=/run/containerd/containerd.sock
unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: storage-driver: (from flag: overlay, from file: overlay2)
pi@raspberrypi:~ $ sudo /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
INFO[2021-12-09T14:34:31.667296985Z] Starting up
failed to load listeners: no sockets found via socket activation: make sure the service was started by systemd
Which steps am I missing to be able to run Docker with overlay fs, such that the SD-card in the Raspberry is read only?
Without the overlay fs active it all works as expected.
I ran into this issue as well and found a way around it. In summary, you can't run the default Docker FS driver (overlay2) on overlayfs. Fortunately, Docker supports other storage drivers, including fuse-overlayfs. Switching to this driver resolves the issue but there's one final catch. When Docker starts, it attempts to rename /var/lib/docker/runtimes and since overlayfs doesn't support renames of directories already in lower layers, it fails. If you simply rm -rf this directory while Docker is stopped and before you enable RPi's overlayfs, everything should work.
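The steps above could be sketched roughly as follows. This is only an outline under assumptions: the helper name write_storage_driver_config is made up, the fuse-overlayfs package is assumed to be installed already, and the function overwrites daemon.json, so merge by hand if the file holds other settings. The demo writes to a scratch directory instead of /etc/docker.

```shell
#!/bin/sh
# Write a minimal daemon.json that selects the fuse-overlayfs storage driver.
# Overwrites any existing daemon.json in the given directory.
write_storage_driver_config() {
    conf_dir=$1
    mkdir -p "$conf_dir" || return 1
    printf '{\n  "storage-driver": "fuse-overlayfs"\n}\n' > "$conf_dir/daemon.json"
}

# Dry run against a scratch directory instead of /etc/docker.
scratch=$(mktemp -d)
write_storage_driver_config "$scratch"
cat "$scratch/daemon.json"
rm -rf "$scratch"

# On the Pi itself, the order described above would be roughly:
#   sudo systemctl stop docker docker.socket
#   sudo rm -rf /var/lib/docker/runtimes
#   write_storage_driver_config /etc/docker     (run as root)
#   then enable the overlay file system via raspi-config and reboot
```

Everything has to be in place before the overlay file system is enabled, because changes made afterwards are lost on reboot.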

Docker change IP from Bridge

I have tried all of the solutions here to change the IP address of my bridge:
Change default docker0 bridge ip address
But I always get this error:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2021-07-10 20:02:29 CEST; 9min ago
Docs: https://docs.docker.com
Process: 3130 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --bip 192.168.198.128/25 (code=exited, status=1/FAILURE)
Main PID: 3130 (code=exited, status=1/FAILURE)
Jul 10 20:02:26 Server systemd[1]: Failed to start Docker Application Container Engine.
Jul 10 20:02:29 Server systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jul 10 20:02:29 Server systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Jul 10 20:02:29 Server systemd[1]: Stopped Docker Application Container Engine.
Jul 10 20:02:29 Server systemd[1]: docker.service: Start request repeated too quickly.
Jul 10 20:02:29 Server systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 10 20:02:29 Server systemd[1]: Failed to start Docker Application Container Engine.
Jul 10 20:03:09 Server systemd[1]: docker.service: Start request repeated too quickly.
Jul 10 20:03:09 Server systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 10 20:03:09 Server systemd[1]: Failed to start Docker Application Container Engine.
When I try to change it via
dockerd --bip=192.168.198.128/25
then I receive this:
INFO[2021-07-10T20:30:14.220551981+02:00] Starting up
failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid
root@Server:/etc/docker# sudo service docker stop
root@Server:/etc/docker# dockerd --bip=192.168.198.128/25
INFO[2021-07-10T20:30:27.749647728+02:00] Starting up
INFO[2021-07-10T20:30:27.751145974+02:00] detected 127.0.0.53 nameserver, assuming systemd-resolved, so using resolv.conf: /run/systemd/resolve/resolv.conf
INFO[2021-07-10T20:30:27.752521878+02:00] parsed scheme: "unix" module=grpc
INFO[2021-07-10T20:30:27.752770076+02:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2021-07-10T20:30:27.752906825+02:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>} module=grpc
INFO[2021-07-10T20:30:27.753028929+02:00] ClientConn switching balancer to "pick_first" module=grpc
INFO[2021-07-10T20:30:27.756109499+02:00] parsed scheme: "unix" module=grpc
INFO[2021-07-10T20:30:27.756411626+02:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2021-07-10T20:30:27.756544438+02:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>} module=grpc
INFO[2021-07-10T20:30:27.756653280+02:00] ClientConn switching balancer to "pick_first" module=grpc
INFO[2021-07-10T20:30:27.761446487+02:00] [graphdriver] using prior storage driver: overlay2
WARN[2021-07-10T20:30:27.807419442+02:00] Your kernel does not support swap memory limit
WARN[2021-07-10T20:30:27.807711781+02:00] Your kernel does not support cgroup rt period
WARN[2021-07-10T20:30:27.807832417+02:00] Your kernel does not support cgroup rt runtime
INFO[2021-07-10T20:30:27.808384049+02:00] Loading containers: start.
ERRO[2021-07-10T20:30:28.211882902+02:00] failed to get event error="rpc error: code = Unavailable desc = transport is closing" module=libcontainerd namespace=moby
failed to start daemon: Error initializing network controller: Error creating default "bridge" network: failed to allocate gateway (192.168.198.128): Address already in use
But I don't have it in use, neither in my network nor as an IP in the config of any network connection.
Does anybody have an idea how to change it?
Thanks in advance.
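One thing worth checking, offered as a guess rather than a confirmed diagnosis: --bip takes the bridge's own IP address together with a prefix length, and 192.168.198.128/25 is the network address of the 192.168.198.128 to 192.168.198.255 block, not a host address inside it. I have not confirmed that this is what Docker objects to here, but trying a host address such as 192.168.198.129/25 is cheap. The sketch below only does the arithmetic; the helper name is_network_address is made up.

```shell
#!/bin/sh
# Return 0 if ADDR/PREFIX is the network address of its own subnet
# (all host bits zero), 1 otherwise. IPv4 only.
is_network_address() {
    addr=${1%/*}      # part before the slash
    prefix=${1#*/}    # part after the slash
    old_ifs=$IFS
    IFS=.
    set -- $addr      # split the dotted quad into $1..$4
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    host_mask=$(( (1 << (32 - prefix)) - 1 ))
    [ $(( ip & host_mask )) -eq 0 ]
}

for bip in 192.168.198.128/25 192.168.198.129/25; do
    if is_network_address "$bip"; then
        echo "$bip is the network address of its subnet"
    else
        echo "$bip is a host address inside its subnet"
    fi
done
```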

Kubernetes installation conflicts with docker-ce-17.03.0

I can't install kubernetes in CentOS following this installation guide (link).
1: The flannel and docker services can't start after the default installation
By default, the installation above installs Docker 1.12, but the flannel and docker services can't start.
● flanneld.service - Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)
Active: activating (start) since Mon 2017-03-20 11:24:45 EDT; 27s ago
Main PID: 31572 (flanneld)
CGroup: /system.slice/flanneld.service
└─31572 /usr/bin/flanneld -etcd-endpoints=http://127.0.0.1:2379 -etcd-prefix=/atomic.io/network
Mar 20 11:25:00 JackKubeNode1 flanneld-start[31572]: E0320 11:25:00.259468 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:01 JackKubeNode1 flanneld-start[31572]: E0320 11:25:01.265559 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:02 JackKubeNode1 flanneld-start[31572]: E0320 11:25:02.592586 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:03 JackKubeNode1 flanneld-start[31572]: E0320 11:25:03.677965 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:04 JackKubeNode1 flanneld-start[31572]: E0320 11:25:04.719815 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:05 JackKubeNode1 flanneld-start[31572]: E0320 11:25:05.820301 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:09 JackKubeNode1 flanneld-start[31572]: E0320 11:25:09.016167 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:10 JackKubeNode1 flanneld-start[31572]: E0320 11:25:10.021494 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:11 JackKubeNode1 flanneld-start[31572]: E0320 11:25:11.022784 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:12 JackKubeNode1 flanneld-start[31572]: E0320 11:25:12.238389 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
Mar 20 11:25:13 JackKubeNode1 flanneld-start[31572]: E0320 11:25:13.513397 31572 network.go:102] failed to retrieve network config: client: etcd cluster is unavailable or misconfigured
A dependency job for docker.service failed. See 'journalctl -xe' for details.
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/docker.service.d
└─flannel.conf
Active: inactive (dead) since Mon 2017-03-20 11:25:16 EDT; 1min 29s ago
Docs: http://docs.docker.com
Main PID: 30412 (code=exited, status=0/SUCCESS)
Mar 20 11:18:32 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:18:32.059329808-04:00" level=info msg="Daemon has completed initialization"
Mar 20 11:18:32 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:18:32.059499814-04:00" level=info msg="Docker daemon" commit="96d83a5/1.12.6" graphdriver=devicemapper version=1.12.6
Mar 20 11:18:33 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:18:33.169919003-04:00" level=info msg="API listen on /var/run/docker.sock"
Mar 20 11:18:33 JackKubeNode1 systemd[1]: Started Docker Application Container Engine.
Mar 20 11:25:15 JackKubeNode1 systemd[1]: Stopping Docker Application Container Engine...
Mar 20 11:25:15 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:25:15.912002109-04:00" level=info msg="Processing signal 'terminated'"
Mar 20 11:25:16 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:25:15.982882827-04:00" level=info msg="stopping containerd after receiving terminated"
Mar 20 11:25:16 JackKubeNode1 dockerd-current[30412]: time="2017-03-20T11:25:16.352579523-04:00" level=error msg="libcontainerd: failed to receive event from containerd: rpc error: code = 13 desc = transport is closing"
Mar 20 11:26:42 JackKubeNode1 systemd[1]: Dependency failed for Docker Application Container Engine.
Mar 20 11:26:42 JackKubeNode1 systemd[1]: Job docker.service/start failed with result 'dependency'.
2: The link above says the issue is fixed in Docker 1.13, so I installed Docker manually first and then installed Kubernetes. But docker-ce-17.03 was installed, and it then conflicted with Kubernetes during Kubernetes dependency resolution. How can I work around this?
Processing Conflict: docker-ce-17.03.0.ce-1.el7.centos.x86_64 conflicts docker
Processing Conflict: docker-ce-17.03.0.ce-1.el7.centos.x86_64 conflicts docker
Processing Conflict: docker-ce-17.03.0.ce-1.el7.centos.x86_64 conflicts docker-io
Processing Conflict: docker-ce-17.03.0.ce-1.el7.centos.x86_64 conflicts docker-io
Processing Conflict: docker-ce-selinux-17.03.0.ce-1.el7.centos.noarch conflicts docker-selinux
Processing Conflict: docker-ce-selinux-17.03.0.ce-1.el7.centos.noarch conflicts docker-selinux
3: Docker recently renamed docker-VERSION to docker-ce-VERSION, and it looks like Kubernetes doesn't accept the new name docker-ce-VERSION. I think the issue could be worked around if I manually installed docker-1.13. But how do I install docker-1.13? I always get docker-ce-17.03 when running "yum install docker".
Docker CE 17.03 is basically Docker 1.13, which isn't supported yet by the stable Kubernetes release. See this Kubernetes GitHub issue.

Can't start docker service after updating to 1.12.6-176.1 in OpenSUSE

I have updated docker in my OpenSUSE 13.2.
After some tests I see that the -H flag in /etc/sysconfig/docker is causing dockerd not to start, but I need it to enable port 2375 or 2376 (it had been working fine for months). With it, TLS or not, on any port, docker will not start. I have tried binding to 0.0.0.0, localhost, ...
-- Logs begin at Tue 2016-10-25 12:48:00 CEST, end at Thu 2017-02-02 23:02:35 CET. --
Feb 02 23:01:35 ezequiel dockerd[22661]: time="2017-02-02T23:01:35.134216922+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Feb 02 23:01:35 ezequiel dockerd[22661]: time="2017-02-02T23:01:35.247510727+01:00" level=info msg="Loading containers: done."
Feb 02 23:01:35 ezequiel dockerd[22661]: time="2017-02-02T23:01:35.247659069+01:00" level=info msg="Daemon has completed initialization"
Feb 02 23:01:35 ezequiel dockerd[22661]: time="2017-02-02T23:01:35.247709386+01:00" level=info msg="Docker daemon" commit=78d1802 graphdriver=btrfs version=1.12.6
Feb 02 23:01:35 ezequiel dockerd[22661]: time="2017-02-02T23:01:35.267370317+01:00" level=info msg="API listen on 192.168.100.1:2375"
Feb 02 23:02:35 ezequiel docker_service_helper.sh[22662]: Docker is dead
Feb 02 23:02:35 ezequiel systemd[1]: docker.service: control process exited, code=exited status=1
Feb 02 23:02:35 ezequiel dockerd[22661]: time="2017-02-02T23:02:35.810756005+01:00" level=info msg="Processing signal 'terminated'"
Feb 02 23:02:35 ezequiel systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has failed.
--
-- The result is failed.
Feb 02 23:02:35 ezequiel systemd[1]: Unit docker.service entered failed state.
If I remove it, docker starts, but I can't access it from outside the host (I used to use TLS through port 2376).
I have tried dockerd directly and it binds to the TCP port:
# /usr/bin/dockerd --containerd /run/containerd/containerd.sock --add-runtime oci=/usr/bin/docker-runc --label provider=generic -g /optLVM/varLibDocker -H tcp://127.0.0.1:2375
WARN[0000] [!] DON'T BIND ON ANY IP ADDRESS WITHOUT setting -tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING [!]
INFO[0000] [graphdriver] using prior storage driver "btrfs"
INFO[0000] Graph migration to content-addressability took 0.00 seconds
WARN[0000] Your kernel does not support swap memory limit.
WARN[0000] Your kernel does not support kernel memory limit.
WARN[0000] mountpoint for pids not found
INFO[0000] Loading containers: start.
.................INFO[0000] Firewalld running: false
INFO[0000] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0000] Loading containers: done.
INFO[0000] Daemon has completed initialization
INFO[0000] Docker daemon commit=78d1802 graphdriver=btrfs version=1.12.6
INFO[0000] API listen on 127.0.0.1:2375
So it seems something has changed in the configuration.
My old version was:
docker-1.12.1-152.3.x86_64
And new one:
docker-1.12.6-176.1.x86_64
Thanks for any help... I do need TCP, with or without TLS, to access docker remotely.
I got the same problem after updating.
From 1.12.1 to 1.12.6 they changed something with "fd://". It did not work for me anymore. I'm using TCP with TLS.
In my config file (/etc/docker/daemon.json):
{
"tls" : true,
"tlsverify": true,
"tlscacert": "/etc/docker/ca.pem",
"tlscert" : "/etc/docker/server/server-cert.pem",
"tlskey" : "/etc/docker/server/server-key.pem",
"hosts" : ["unix:///var/run/docker.sock", "tcp://10.10.1.1:2376"]
}
I added "unix:///var/run/docker.sock" to "hosts".
I think local communication is handled via the unix socket and remote connections run over TCP.
You can find more information here ...
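One caveat when moving the listeners into daemon.json: dockerd refuses to start when the same directive is given both as a flag and in the configuration file (the Raspberry Pi question above shows that error for storage-driver). The same should hold for hosts, so any -H left in /etc/sysconfig/docker or in the unit file would need to be removed. A rough check, with the made-up helper name has_hosts_conflict and a deliberately simple text match instead of real JSON parsing:

```shell
#!/bin/sh
# Return 0 if daemon.json defines "hosts" AND the option string also passes
# -H/--host, which dockerd would reject as a duplicate directive.
# $1 = path to daemon.json, $2 = dockerd option string
has_hosts_conflict() {
    grep -q '"hosts"' "$1" || return 1
    case " $2 " in
        *" -H "*|*" -H="*|*" --host "*|*" --host="*) return 0 ;;
    esac
    return 1
}

# Demo on a scratch copy of a config like the one above.
conf=$(mktemp)
printf '{ "hosts": ["unix:///var/run/docker.sock", "tcp://10.10.1.1:2376"] }\n' > "$conf"
if has_hosts_conflict "$conf" "-H tcp://127.0.0.1:2375"; then
    echo "conflict: remove -H from the daemon options"
fi
rm -f "$conf"
```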

Flannel and docker don't start

I'm trying to set up a Kubernetes cluster on 2 nodes running CentOS 7.1 using this guide. However, when I attempt to start the services on the minion like so:
for SERVICES in kube-proxy kubelet docker flanneld; do
    systemctl restart $SERVICES
    systemctl enable $SERVICES
    systemctl status $SERVICES
done
I get the following error:
-- Logs begin at Wed 2015-12-23 13:00:41 UTC, end at Wed 2015-12-23 16:03:54 UTC. --
Dec 23 16:03:47 sc-test2 systemd[1]: docker-storage-setup.service: main process exited, code=exited, status=1/FAILURE
Dec 23 16:03:47 sc-test2 systemd[1]: Failed to start Docker Storage Setup.
-- Subject: Unit docker-storage-setup.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker-storage-setup.service has failed.
--
-- The result is failed.
Dec 23 16:03:47 sc-test2 systemd[1]: Unit docker-storage-setup.service entered failed state.
Dec 23 16:03:48 sc-test2 flanneld[36477]: E1223 16:03:48.187350 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:49 sc-test2 flanneld[36477]: E1223 16:03:49.189860 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:50 sc-test2 flanneld[36477]: E1223 16:03:50.192894 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:51 sc-test2 flanneld[36477]: E1223 16:03:51.194940 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:52 sc-test2 flanneld[36477]: E1223 16:03:52.197222 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:53 sc-test2 flanneld[36477]: E1223 16:03:53.199248 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
Dec 23 16:03:54 sc-test2 flanneld[36477]: E1223 16:03:54.201160 36477 network.go:53] Failed to retrieve network config: 100: Key not found (/atomic.io)
I'm sure I set the key on the master with:
etcdctl mk /coreos.com/network/config '{"Network":"172.17.0.0/16"}'
By far, installation seems to be the hardest part of using Kubernetes :(
Today is Christmas, but I spent the whole day trying to get this to work :) This is what I did:
#1 FLANNEL
As mentioned I'd set the flannel etcd key on the master with:
etcdctl mk /coreos.com/network/config '{"Network":"172.17.0.0/16"}'
but I got this error when trying to start flannel on the minion:
Failed to retrieve network config: 100: Key not found (/atomic.io)
So I edited the /etc/sysconfig/flanneld file on the minion from:
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://master:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/coreos.com/network"
# Any additional options that you want to pass
#FLANNEL_OPTIONS=""
to:
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://master:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic.io/network"
# Any additional options that you want to pass
#FLANNEL_OPTIONS=""
i.e. I changed the FLANNEL_ETCD_KEY value.
After this, systemctl start flanneld worked.
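The underlying rule, as far as I understand flannel, is that it reads its network config from the key formed by the configured prefix plus /config, so the key written with etcdctl and the prefix in /etc/sysconfig/flanneld have to agree. A small sketch of that comparison, with made-up helper names flannel_config_key and key_matches_prefix:

```shell
#!/bin/sh
# Print the etcd key flannel would read for a given prefix.
flannel_config_key() {
    printf '%s/config\n' "${1%/}"   # tolerate a trailing slash on the prefix
}

# Return 0 if the key that was written matches what flannel will look up.
# $1 = configured prefix (FLANNEL_ETCD_KEY), $2 = key passed to etcdctl
key_matches_prefix() {
    [ "$(flannel_config_key "$1")" = "$2" ]
}

written_key=/coreos.com/network/config
for prefix in /coreos.com/network /atomic.io/network; do
    if key_matches_prefix "$prefix" "$written_key"; then
        echo "prefix $prefix matches $written_key"
    else
        echo "prefix $prefix expects $(flannel_config_key "$prefix")"
    fi
done
```

Either side can be changed to make them agree: write the key under the prefix flanneld uses, or point FLANNEL_ETCD_KEY at the prefix where the key already lives.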
#2 DOCKER
I didn't find a way to make the version installed as a dependency by Kubernetes work, so I uninstalled it and, following the Docker docs for CentOS, installed docker-engine and manually created a docker.service file for systemd.
cd /usr/lib/systemd/system
and the contents of the docker.service:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket
Requires=docker.socket
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/sysconfig/flanneld
ExecStart=/usr/bin/docker daemon -H fd:// --bip=${FLANNEL_SUBNET}
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
Then start and enable the daemon with systemctl and query its status:
systemctl restart docker
systemctl enable docker
systemctl status docker
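A note on ${FLANNEL_SUBNET} in the unit above: the /etc/sysconfig/flanneld file shown earlier does not define it. As far as I know, flanneld writes that variable to /run/flannel/subnet.env by default once it has obtained a lease, so that file may be the one to reference in EnvironmentFile=. The sketch below builds the --bip argument from such a file; the helper name bip_flag_from_subnet_env is made up, and the sample values are invented for the demo.

```shell
#!/bin/sh
# Print the --bip flag derived from a flannel subnet file.
# Fails if the file is unreadable or FLANNEL_SUBNET is empty.
bip_flag_from_subnet_env() {
    env_file=$1
    [ -r "$env_file" ] || return 1
    # Source in a subshell so the variables do not leak into the caller.
    (
        FLANNEL_SUBNET=
        . "$env_file"
        [ -n "$FLANNEL_SUBNET" ] || exit 1
        printf '%s\n' "--bip=$FLANNEL_SUBNET"
    )
}

# Demo with invented values in a scratch file.
sample=$(mktemp)
printf 'FLANNEL_NETWORK=172.17.0.0/16\nFLANNEL_SUBNET=172.17.42.1/24\nFLANNEL_MTU=1472\n' > "$sample"
bip_flag_from_subnet_env "$sample"
rm -f "$sample"
```

If the function fails on the real file, flanneld has probably not obtained a subnet yet, and docker started with an empty --bip= would be expected to fail as well.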