Docker swarm: load-balancer doesn't cycle through all tasks

I have 2 nodes in my swarm cluster: a manager and a worker. I deployed a stack with 5 replicas distributed across those nodes. The YAML file defines a network called webnet for the service web. After the stack is deployed I try to access the service, but when I use the IP address of the manager node it only load-balances between 2 replicas, and if I use the IP address of the worker it only load-balances among the other 3. So, using only Docker, how can I load-balance among all 5 replicas?
My nodes:
root@debiancli:~/docker# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
5y256zrqwalq1hcxmqnnqc177 centostraining Ready Active 18.06.0-ce
mkg6ecl3x28uyyqx7gvzz0ja3 * debiancli Ready Active Leader 18.06.0-ce
Tasks in manager (self) and worker (centostraining):
root@debiancli:~/docker# docker node ps self
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
stbe721sstq7 getstartedlab_web.3 get-started:part2 debiancli Running Running 2 hours ago
6syjojjmyh0y getstartedlab_web.5 get-started:part2 debiancli Running Running 2 hours ago
root@debiancli:~/docker# docker node ps centostraining
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
wpvsd98vfwd1 getstartedlab_web.1 get-started:part2 centostraining Running Running less than a second ago
e3z8xybuv53l getstartedlab_web.2 get-started:part2 centostraining Running Running less than a second ago
sd0oi675c2am getstartedlab_web.4 get-started:part2 centostraining Running Running less than a second ago
The stack and its tasks:
root@debiancli:~/docker# docker stack ls
NAME SERVICES ORCHESTRATOR
getstartedlab 1 Swarm
root@debiancli:~/docker# docker stack ps getstartedlab
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
wpvsd98vfwd1 getstartedlab_web.1 get-started:part2 centostraining Running Running less than a second ago
e3z8xybuv53l getstartedlab_web.2 get-started:part2 centostraining Running Running less than a second ago
stbe721sstq7 getstartedlab_web.3 get-started:part2 debiancli Running Running 2 hours ago
sd0oi675c2am getstartedlab_web.4 get-started:part2 centostraining Running Running less than a second ago
6syjojjmyh0y getstartedlab_web.5 get-started:part2 debiancli Running Running 2 hours ago
The networks (getstartedlab_webnet is used by my tasks):
root@debiancli:~/docker# docker network ls
NETWORK ID NAME DRIVER SCOPE
b95dd9ee2ae6 bridge bridge local
63578897e920 docker_gwbridge bridge local
x47kwsfa23oo getstartedlab_webnet overlay swarm
7f77ad495edd host host local
ip8czm66ofng ingress overlay swarm
f2cc6118fde7 none null local
docker-compose.yml used to deploy the stack:
root@debiancli:~/docker# cat docker-compose.yml
version: "3"
services:
  web:
    image: get-started:part2
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - "4000:80"
    networks:
      - webnet
networks:
  webnet:
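For reference, with compose file format 3.2 or later the publish mode can be spelled out with the long port syntax; ingress is the mode that routes requests through the swarm routing mesh, while host binds only on the node running each task. A sketch (not part of the original file) would look like:

ports:
  - target: 80
    published: 4000
    protocol: tcp
    mode: ingress    # "host" here would bypass the routing mesh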
Accessing the service from a third machine (the curl and grep below pull the container hostname from the response):
[Ubuntu:~]$ debiancli=192.168.182.129
[Ubuntu:~]$ centostraining=192.168.182.133
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
f4c1e3ff53f2
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
de2110bee2f7
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
f4c1e3ff53f2
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
de2110bee2f7
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
f4c1e3ff53f2
[Ubuntu:~]$ curl -s $debiancli:4000 | grep -oP "(?<=</b> )[^<].{11}"
de2110bee2f7
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
72b757f92983
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
d2e824865436
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
b53c3fd0cfbb
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
72b757f92983
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
d2e824865436
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
b53c3fd0cfbb
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
72b757f92983
[Ubuntu:~]$ curl -s $centostraining:4000 | grep -oP "(?<=</b> )[^<].{11}"
d2e824865436
Notice that when I probe debiancli (the swarm manager) it cycles through containers f4c1e3ff53f2 and de2110bee2f7 only, i.e. the 2 replicas running on the manager, and the same happens for the 3 replicas on centostraining (the swarm worker). So, what am I missing?
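For reference, with the default ingress publish mode the routing mesh should spread requests arriving at any node across all replicas of the service; only ever reaching the local tasks usually points at host-mode publishing or at the swarm's overlay traffic being blocked between the nodes. A rough check, assuming the service is named getstartedlab_web as the task names suggest:

root@debiancli:~/docker# docker service inspect getstartedlab_web --format '{{json .Endpoint.Ports}}'
# PublishMode should read "ingress"; also verify that 7946/tcp+udp, 4789/udp and 2377/tcp
# are open between the manager and the worker, otherwise the mesh cannot reach remote tasks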

Related

Docker | Bind for 0.0.0.0:80 failed | Port is already allocated

I've been trying all the existing commands for several hours and could not fix this problem.
I used everything covered in this article: Docker - Bind for 0.0.0.0:4000 failed: port is already allocated.
I currently have one container (docker ps -a; meanwhile docker ps is empty):
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5ebb9289dfd1 dockware/dev:latest "/bin/bash /entrypoi…" 2 minutes ago Created TheGoodPartDocker
When I try docker-compose up -d I get the error:
ERROR: for TheGoodPartDocker Cannot start service shop: driver failed programming external connectivity on endpoint TheGoodPartDocker (3b59ebe9366bf1c4a848670c0812935def49656a88fa95be5c4a4be0d7d6f5e6): Bind for 0.0.0.0:80 failed: port is already allocated
I've tried removing everything with: docker ps -aq | xargs docker stop | xargs docker rm
or freeing the port: fuser -k 80/tcp
even deleting networks:
sudo service docker stop
sudo rm -f /var/lib/docker/network/files/local-kv.db
or just manually bringing the stack down, then stopping and removing the container:
docker-compose down
docker stop 5ebb9289dfd1
docker rm 5ebb9289dfd1
Here is also my netstat output (netstat | grep 80):
unix 3 [ ] STREAM CONNECTED 20680 /mnt/wslg/PulseAudioRDPSink
unix 3 [ ] STREAM CONNECTED 18044
unix 3 [ ] STREAM CONNECTED 32780
unix 3 [ ] STREAM CONNECTED 17805 /run/guest-services/procd.sock
And docker port TheGoodPartDocker gives me no result.
I also restarted my computer, but nothing works :(.
Thanks for helping
Obviously, port 80 is already occupied by some other process. You need to stop that process before you start the container. To find out which process it is, use ss (the example below shows port 22):
$ ss -tulpn | grep 22
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1187,fd=3))
tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=1187,fd=4))
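Applied to this case the filter would be port 80 rather than 22, something along the lines of:

$ sudo ss -tulpn | grep ':80 '
# run as root (or with sudo) so the process name/pid column is populated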

docker redis container startup issue

I have a question about a docker redis-sentinel:5.0.10 startup issue.
I am running docker on CentOS 7 Linux.
Before version 5.0.10 I used 4.0.9 and the image was taken from our own repository; now I have switched to the bitnami repo.
The main problem is that when I try to use redis-sentinel:5.0.10 (or redis-sentinel:5.0.7) it falls into a restart loop and cannot start properly.
I run the containers like this:
[root@XXX opt]# docker run -d -p 26380:26379 -v /opt/app/redis:/data --name redis-sentinel -e REDIS_MASTER_HOST=XXX.XXX.XXX -e REDIS_MASTER_SET=XXX-XXX -e REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=30000 -e REDIS_SENTINEL_QUORUM=2 -e REDIS_SENTINEL_FAILOVER_TIMEOUT=180000 --net=host --restart=always bitnami/redis-sentinel:5.0.10
[root@XXX opt]# docker run -d --net=host -v /opt/app/redis:/data -v /opt/app/redis.conf:/usr/local/etc/redis/redis.conf --name redis-client --restart=always redis:5.0.10-alpine redis-server --slaveof XXX.XXX.XXX 6379
In the log there are messages like:
redis-sentinel 10:16:50.45 Welcome to the Bitnami redis-sentinel container
redis-sentinel 10:16:50.45 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis-sentinel
redis-sentinel 10:16:50.45 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis-sentinel/issues
redis-sentinel 10:16:50.46
redis-sentinel 10:16:50.46 INFO ==> ** Starting Redis sentinel setup **
redis-sentinel 10:16:50.47 ERROR ==> The configuration file /opt/bitnami/redis-sentinel/etc/sentinel.conf is not writable
Why is it not writable?
[root@dam31 ~]# docker run --rm -it bitnami/redis-sentinel:5.0.10 sh
redis-sentinel 14:03:36.16
redis-sentinel 14:03:36.16 Welcome to the Bitnami redis-sentinel container
redis-sentinel 14:03:36.17 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis-sentinel
redis-sentinel 14:03:36.17 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis-sentinel/issues
redis-sentinel 14:03:36.17
$ cd /opt/bitnami/redis-sentinel/etc/
$ ls -l
total 12
-rw-rw-r-- 1 root root 9797 Dec 8 13:25 sentinel.conf
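Note that the shell inside the container is not running as root (hence the $ prompt above rather than #); from that same shell one can check the effective user and whether it can actually write the file, e.g.:

$ id
$ touch sentinel.conf && echo writable || echo not writable
# purely a diagnostic sketch run from /opt/bitnami/redis-sentinel/etc/, to see
# which uid/gid the container uses and whether that user has write access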
docker ps command says:
[root@XXX app]# docker ps --no-trunc
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
45988acb45407dfc7a19f4a2a08da7c8f7a99381a97bc17a3ae985d377605462 redis:5.0.10-alpine "docker-entrypoint.sh redis-server --slaveof XXX.XXX.XXX 6379" 4 minutes ago Up 3 minutes redis-client
94b9b8e712e2219bc3f9a18aba349985968e3410c5336905282fb43b38e89e8e bitnami/redis-sentinel:5.0.10 "/opt/bitnami/scripts/redis-sentinel/entrypoint.sh /opt/bitnami/scripts/redis-sentinel/run.sh" 4 minutes ago Restarting (1) 19 seconds ago redis-sentinel
On other machines I have upgraded redis to version 5.0.7 successfully and it runs properly; nothing else was changed:
[root@XXX app]# docker ps --no-trunc
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
07e535350c02a67478dd07708a06798981fe7e99ae448567837dacdee198ec1e redis:5.0.7-alpine "docker-entrypoint.sh redis-server /usr/local/etc/redis/redis.conf --slaveof XXX.XXX.XXX 6379" 8 weeks ago Up 8 weeks redis-client
fea1bff3b2c1fbc9c7cea2becad64b7e2727dfc1f73f1d541e08b9b75143b3a9 bitnami/redis-sentinel:5.0.7 "/entrypoint.sh /run.sh" 8 weeks ago Up 8 weeks redis-sentinel
If I run redis:5.0.7 on the same machine (where I tried to run redis:5.0.10), the same error occurs:
[root@XXX ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
15cdd46fa325 redis:5.0.7-alpine "docker-entrypoint.s…" 15 minutes ago Up 15 minutes redis-client
b0b02a36b68c bitnami/redis-sentinel:5.0.7 "/entrypoint.sh /run…" 16 minutes ago Restarting (1) 50 seconds ago redis-sentinel
redis-sentinel 14:16:23.86 Welcome to the Bitnami redis-sentinel container
redis-sentinel 14:16:23.87 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis-sentinel
redis-sentinel 14:16:23.87 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis-sentinel/issues
redis-sentinel 14:16:23.87 Send us your feedback at containers@bitnami.com
redis-sentinel 14:16:23.87
redis-sentinel 14:16:23.87 INFO ==> ** Starting Redis sentinel setup **
redis-sentinel 14:16:23.88 ERROR ==> The configuration file /opt/bitnami/redis-sentinel/etc/sentinel.conf is not writable
What am I doing wrong? Any thoughts? (NB: SELinux is disabled)

Can a mounted volume in Kubernetes be accessed from the host OS filesystem

My real question is: if secrets are mounted as volumes in pods, can they be read if someone gains root access to the host OS?
For example, by accessing /var/lib/docker and drilling down to the volume.
If someone has root access to your host with containers, they can do pretty much whatever they want... Don't forget that pods are just a bunch of containers, which in fact are processes with PIDs. So, for example, if I have a pod called sleeper:
kubectl get pods sleeper-546494588f-tx6pp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
sleeper-546494588f-tx6pp 1/1 Running 1 21h 10.200.1.14 k8s-node-2 <none>
running on the node k8s-node-2. With root access to this node, I can check which PID this pod and its containers have (I am using containerd as the container engine, but the points below are very similar for docker or any other container engine):
[root@k8s-node-2 /]# crictl -r unix:///var/run/containerd/containerd.sock pods -name sleeper-546494588f-tx6pp -q
ec27f502f4edd42b85a93503ea77b6062a3504cbb7ac6d696f44e2849135c24e
[root@k8s-node-2 /]# crictl -r unix:///var/run/containerd/containerd.sock ps -p ec27f502f4edd42b85a93503ea77b6062a3504cbb7ac6d696f44e2849135c24e
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID
70ca6950de10b 8ac48589692a5 2 hours ago Running sleeper 1 ec27f502f4edd
[root@k8s-node-2 /]# crictl -r unix:///var/run/containerd/containerd.sock inspect 70ca6950de10b | grep pid | head -n 1
"pid": 24180,
And then finally, with that information (the PID), I can access the "/" mountpoint of this process and check its content, including secrets:
[root@k8s-node-2 /]# ll /proc/24180/root/var/run/secrets/kubernetes.io/serviceaccount/
total 0
lrwxrwxrwx. 1 root root 13 Nov 14 13:57 ca.crt -> ..data/ca.crt
lrwxrwxrwx. 1 root root 16 Nov 14 13:57 namespace -> ..data/namespace
lrwxrwxrwx. 1 root root 12 Nov 14 13:57 token -> ..data/token
[root@k8s-node-2 serviceaccount]# cat /proc/24180/root/var/run/secrets/kubernetes.io/serviceaccount/namespace ; echo
default
[root@k8s-node-2 serviceaccount]# cat /proc/24180/root/var/run/secrets/kubernetes.io/serviceaccount/token | cut -d'.' -f 1 | base64 -d ; echo
{"alg":"RS256","kid":""}
[root@k8s-node-2 serviceaccount]# cat /proc/24180/root/var/run/secrets/kubernetes.io/serviceaccount/token | cut -d'.' -f 2 | base64 -d 2>/dev/null ; echo
{"iss":"kubernetes/serviceaccount","kubernetes.io/serviceaccount/namespace":"default","kubernetes.io/serviceaccount/secret.name":"default-token-6sbz9","kubernetes.io/serviceaccount/service-account.name":"default","kubernetes.io/serviceaccount/service-account.uid":"42e7f596-e74e-11e8-af81-525400e6d25d","sub":"system:serviceaccount:default:default"}
This is one of the reasons why it is so important to properly secure access to your Kubernetes infrastructure.

Docker container only on network

I use Docker and I have multiple webapps, each needing a MySQL server. Currently each webapp uses its own bridge network to communicate with its MySQL server, but each MySQL server uses a different port (3306, 3307, 3308, ...).
I can't run them all on port 3306 because that port is already used by the first webapp's MySQL.
Is it possible to run all the MySQL servers on 3306?
What I have:
| Net1 (bridge) | Net2 (bridge) | Net3 (bridge) | ... |
|---------------|---------------|---------------|-----|
| Webapp1:80    | Webapp2:8080  | Webapp3:8081  | ... |
| Mysql:3306    | Mysql:3307    | Mysql:3308    | ... |
What I would like:
| Net1 (bridge) | Net2 (bridge) | Net3 (bridge) | ... |
|---------------|---------------|---------------|-----|
| Webapp1:80    | Webapp2:8080  | Webapp3:8081  | ... |
| Mysql:3306    | Mysql:3306    | Mysql:3306    | ... |
How I run my containers:
docker network create --driver bridge webapp1net

docker run -d -p 3306:3306 \
  --net=webapp1net \
  --net-alias=[webapp1net] \
  -h webapp1-mysql \
  --name webapp1-mysql mysql

docker run -d -p 127.0.0.1:80:80 \
  --net=webapp1net \
  --net-alias=[webapp1net] \
  -h webapp1 \
  --name webapp1 webapp1
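For what it's worth, the host-port clash only comes from the -p 3306:3306 publish flag; a container that is not published is still reachable by the other containers on the same bridge network, on its normal port. A sketch of the MySQL container kept internal to webapp1net (the webapp would then connect to webapp1-mysql:3306 by name):

docker run -d \
  --net=webapp1net \
  -h webapp1-mysql \
  --name webapp1-mysql mysql
# no -p/--publish: nothing is bound on the host, so every network can have its own MySQL on 3306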
Thanks
Old post:
With Docker, I would like to know if it's possible to expose a container only on the network and not on the host.
Example:
I have 3 services, each on its own network and each using MySQL, but I don't want to change MySQL's port.
Net 1 : myapp:80 (accessible by the localhost) & MySQL:3306 (only on the network)
Net 2 : myapp:8080 (accessible by the localhost) & MySQL:3306 (only on the network)
etc.
Is it possible to do something by running MySQL on 0.0.0.0?
Thanks

Can't Ping a Pod after Ubuntu cluster setup

I have followed the most recent instructions (updated 7th May '15) to set up a cluster on Ubuntu** with etcd and flanneld. But I'm having trouble with the network... it seems to be in some kind of broken state.
**Note: I updated the config script so that it installs 0.16.2. Also, kubectl get minions returned nothing at first, but after a sudo service kube-controller-manager restart they appeared.
This is my setup:
| ServerName | Public IP | Private IP |
------------------------------------------
| KubeMaster | 107.x.x.32 | 10.x.x.54 |
| KubeNode1 | 104.x.x.49 | 10.x.x.55 |
| KubeNode2 | 198.x.x.39 | 10.x.x.241 |
| KubeNode3 | 104.x.x.52 | 10.x.x.190 |
| MongoDev1 | 162.x.x.132 | 10.x.x.59 |
| MongoDev2 | 104.x.x.103 | 10.x.x.60 |
From any machine I can ping any other machine... it's when I create pods and services that I start getting issues.
Pod
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
auth-dev-ctl-6xah8 172.16.37.7 sis-auth leportlabs/sisauth:latestdev 104.x.x.52/104.x.x.52 environment=dev,name=sis-auth Running 3 hours
So this pod has been spun up on KubeNode3... if I try to ping it from any machine other than KubeNode3 I get a Destination Net Unreachable error. E.g.
# ping 172.16.37.7
PING 172.16.37.7 (172.16.37.7) 56(84) bytes of data.
From 129.250.204.117 icmp_seq=1 Destination Net Unreachable
I can call etcdctl get /coreos.com/network/config on all four and get back {"Network":"172.16.0.0/16"}.
I'm not sure where to look from there. Can anyone help me out here?
Supporting Info
On the master node:
# ps -ef | grep kube
root 4729 1 0 May07 ? 00:06:29 /opt/bin/kube-scheduler --logtostderr=true --master=127.0.0.1:8080
root 4730 1 1 May07 ? 00:21:24 /opt/bin/kube-apiserver --address=0.0.0.0 --port=8080 --etcd_servers=http://127.0.0.1:4001 --logtostderr=true --portal_net=192.168.3.0/24
root 5724 1 0 May07 ? 00:10:25 /opt/bin/kube-controller-manager --master=127.0.0.1:8080 --machines=104.x.x.49,198.x.x.39,104.x.x.52 --logtostderr=true
# ps -ef | grep etcd
root 4723 1 2 May07 ? 00:32:46 /opt/bin/etcd -name infra0 -initial-advertise-peer-urls http://107.x.x.32:2380 -listen-peer-urls http://107.x.x.32:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster infra0=http://107.x.x.32:2380,infra1=http://104.x.x.49:2380,infra2=http://198.x.x.39:2380,infra3=http://104.x.x.52:2380 -initial-cluster-state new
On a node:
# ps -ef | grep kube
root 10878 1 1 May07 ? 00:16:22 /opt/bin/kubelet --address=0.0.0.0 --port=10250 --hostname_override=104.x.x.49 --api_servers=http://107.x.x.32:8080 --logtostderr=true --cluster_dns=192.168.3.10 --cluster_domain=kubernetes.local
root 10882 1 0 May07 ? 00:05:23 /opt/bin/kube-proxy --master=http://107.x.x.32:8080 --logtostderr=true
# ps -ef | grep etcd
root 10873 1 1 May07 ? 00:14:09 /opt/bin/etcd -name infra1 -initial-advertise-peer-urls http://104.x.x.49:2380 -listen-peer-urls http://104.x.x.49:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster infra0=http://107.x.x.32:2380,infra1=http://104.x.x.49:2380,infra2=http://198.x.x.39:2380,infra3=http://104.x.x.52:2380 -initial-cluster-state new
# ps -ef | grep flanneld
root 19560 1 0 May07 ? 00:00:01 /opt/bin/flanneld
So I noticed that the flannel configuration (/run/flannel/subnet.env) was different from what docker was starting up with (I have no clue how they got out of sync).
# ps -ef | grep docker
root 19663 1 0 May07 ? 00:09:20 /usr/bin/docker -d -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.85.1/24 --mtu=1472
# cat /run/flannel/subnet.env
FLANNEL_SUBNET=172.16.60.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false
Note that the docker --bip=172.16.85.1/24 was different to the flannel subnet FLANNEL_SUBNET=172.16.60.1/24.
So naturally I changed /etc/default/docker to reflect the new value.
DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.60.1/24 --mtu=1472"
But now a sudo service docker restart still wasn't working (the command itself didn't error out), so looking at /var/log/upstart/docker.log I could see the following:
FATA[0000] Shutting down daemon due to errors: Bridge ip (172.16.85.1) does not match existing bridge configuration 172.16.60.1
So the final piece to the puzzle was deleting the old bridge and restarting docker...
# sudo brctl delbr docker0
# sudo service docker start
If sudo brctl delbr docker0 returns "bridge docker0 is still up; can't delete it", run ifconfig docker0 down and try again.
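To keep the docker and flannel settings from drifting apart again, one option (a sketch, assuming /etc/default/docker is sourced as a shell fragment by the Docker upstart job) is to derive the values from flannel's file instead of hard-coding them:

# /etc/default/docker
. /run/flannel/subnet.env   # provides FLANNEL_SUBNET and FLANNEL_MTU
DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}"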
Please try this:
ip link del docker0
systemctl restart flanneld