Could not successfully bind to port 2181 - docker

I'm following https://github.com/PacktPublishing/Apache-Kafka-Series---Kafka-Connect-Hands-on-Learning and I have the docker-compose file below, running on a Mac.
version: '2'
services:
  # this is our kafka cluster.
  kafka-cluster:
    image: landoop/fast-data-dev:cp3.3.0
    environment:
      ADV_HOST: localhost       # Change to 192.168.99.100 if using Docker Toolbox
      RUNTESTS: 0               # Disable running tests so the cluster starts faster
    ports:
      - 2181:2181               # Zookeeper
      - 3030:3030               # Landoop UI
      - 8081-8083:8081-8083     # REST Proxy, Schema Registry, Kafka Connect ports
      - 9581-9585:9581-9585     # JMX ports
      - 9092:9092               # Kafka broker
and when I run
docker-compose up kafka-cluster
[+] Running 1/0
⠿ Container code-kafka-cluster-1 Created 0.0s
Attaching to code-kafka-cluster-1
code-kafka-cluster-1 | Setting advertised host to 127.0.0.1.
code-kafka-cluster-1 | runtime: failed to create new OS thread (have 2 already; errno=22)
code-kafka-cluster-1 | fatal error: newosproc
code-kafka-cluster-1 |
code-kafka-cluster-1 | runtime stack:
code-kafka-cluster-1 | runtime.throw(0x512269, 0x9)
code-kafka-cluster-1 | /usr/lib/go/src/runtime/panic.go:566 +0x95
code-kafka-cluster-1 | runtime.newosproc(0xc420026000, 0xc420035fc0)
code-kafka-cluster-1 | /usr/lib/go/src/runtime/os_linux.go:160 +0x194
code-kafka-cluster-1 | runtime.newm(0x5203a0, 0x0)
code-kafka-cluster-1 | /usr/lib/go/src/runtime/proc.go:1572 +0x132
code-kafka-cluster-1 | runtime.main.func1()
code-kafka-cluster-1 | /usr/lib/go/src/runtime/proc.go:126 +0x36
code-kafka-cluster-1 | runtime.systemstack(0x593600)
code-kafka-cluster-1 | /usr/lib/go/src/runtime/asm_amd64.s:298 +0x79
code-kafka-cluster-1 | runtime.mstart()
code-kafka-cluster-1 | /usr/lib/go/src/runtime/proc.go:1079
code-kafka-cluster-1 |
code-kafka-cluster-1 | goroutine 1 [running]:
code-kafka-cluster-1 | runtime.systemstack_switch()
code-kafka-cluster-1 | /usr/lib/go/src/runtime/asm_amd64.s:252 fp=0xc420020768 sp=0xc420020760
code-kafka-cluster-1 | runtime.main()
code-kafka-cluster-1 | /usr/lib/go/src/runtime/proc.go:127 +0x6c fp=0xc4200207c0 sp=0xc420020768
code-kafka-cluster-1 | runtime.goexit()
code-kafka-cluster-1 | /usr/lib/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4200207c8 sp=0xc4200207c0
code-kafka-cluster-1 | Could not successfully bind to port 2181. Maybe some other service
code-kafka-cluster-1 | in your system is using it? Please free the port and try again.
code-kafka-cluster-1 | Exiting.
code-kafka-cluster-1 exited with code 1
Note: running sudo lsof -i :2181 shows no output, so nothing appears to be listening on that port.

The landoop/fast-data-dev image does not work on the arm64 Apple M1 chip. You can fix the problem by updating the Dockerfile, as described here:
https://github.com/lensesio/fast-data-dev/issues/175#issuecomment-947001807

Change the Zookeeper port mapping as below:
ports:
  - 2182:2181 # Zookeeper

You can build a new Docker image and run it with the following commands:
git clone https://github.com/faberchri/fast-data-dev.git
cd fast-data-dev
docker build -t faberchri/fast-data-dev .
docker run --rm -p 3030:3030 faberchri/fast-data-dev

Building on Namig Aliyev's answer, here is what worked for me.
Let's say your working directory is kafka, and inside it you have your docker-compose.yml file. Follow these steps to reproduce the same results:
1. git clone https://github.com/faberchri/fast-data-dev.git
2. Update docker-compose.yml: in the kafka-cluster service, replace the image line with build: ./fast-data-dev/
3. docker-compose run kafka-cluster
Wait a couple of minutes and the cluster should be up and accessible at:
http://localhost:3030/
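For reference, the resulting compose service might look like this (a sketch; it assumes the fork was cloned into a fast-data-dev directory next to docker-compose.yml, and keeps only the essential ports):

```yaml
version: '2'
services:
  kafka-cluster:
    # Build the ARM-compatible image from the cloned fork instead of
    # pulling landoop/fast-data-dev from Docker Hub.
    build: ./fast-data-dev/
    environment:
      ADV_HOST: localhost
      RUNTESTS: 0
    ports:
      - 2181:2181   # Zookeeper
      - 3030:3030   # Landoop UI
      - 9092:9092   # Kafka broker
```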

The error suggests something else on your system is already bound to port 2181. Either stop that service, or remove the port mapping entirely, since you shouldn't need to connect to Zookeeper directly to use Kafka. In recent Kafka versions (which I doubt the linked course uses), the --zookeeper flags have been removed from the Kafka CLI tools.
Another option is to not use the Landoop container; plenty of other Docker Compose files for Kafka exist on the web.
Overall, I'd suggest not using Docker at all for developing a Kafka Connector.
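If lsof shows nothing but the bind still fails, a quick cross-platform way to double-check a port is a small Python sketch (the helper name is my own, not part of any library):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        # connect_ex returns 0 on a successful connection, an errno otherwise
        return s.connect_ex((host, port)) == 0

print(port_in_use(2181))  # True if Zookeeper (or anything else) holds the port
```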


How do I run Locust in a distributed Docker configuration?

I'm working on running Locust with multiple workers in a Fargate environment, but I wanted to see what it looked like in a simple distributed Docker setup. I took the below docker-compose.yml from the website and modified it so that everything would run on localhost. I can start Locust just fine with docker-compose up --scale worker=4, and four workers and master come up, but when I try to run a test via the Web UI, I get:
Attaching to locust-distributed-docker-master-1, locust-distributed-docker-worker-1, locust-distributed-docker-worker-2, locust-distributed-docker-worker-3, locust-distributed-docker-worker-4
locust-distributed-docker-worker-1 | [2021-11-17 19:01:19,719] be1b465ae5c7/INFO/locust.main: Starting Locust 2.5.0
locust-distributed-docker-master-1 | [2021-11-17 19:01:19,956] 8769b6dcd3ed/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
locust-distributed-docker-master-1 | [2021-11-17 19:01:20,016] 8769b6dcd3ed/INFO/locust.main: Starting Locust 2.5.0
locust-distributed-docker-worker-4 | [2021-11-17 19:01:20,144] bd481d228ef6/INFO/locust.main: Starting Locust 2.5.0
locust-distributed-docker-worker-3 | [2021-11-17 19:01:20,716] 26af3d44e1c9/INFO/locust.main: Starting Locust 2.5.0
locust-distributed-docker-worker-2 | [2021-11-17 19:01:21,122] d536c752bdee/INFO/locust.main: Starting Locust 2.5.0
locust-distributed-docker-master-1 | [2021-11-17 19:01:42,998] 8769b6dcd3ed/WARNING/locust.runners: You are running in distributed mode but have no worker servers connected. Please connect workers prior to swarming.
The whole point of this exercise is to watch the console to see how the workers interact with the master, nothing else.
docker-compose.yml:
version: '3'
services:
  master:
    image: locustio/locust
    ports:
      - "8089:8089"
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --master
  worker:
    image: locustio/locust
    volumes:
      - ./:/mnt/locust
    command: -f /mnt/locust/locustfile.py --worker --master-host 127.0.0.1
To run the workers on the same machine as the master, it's easiest to follow the example exactly and simply not specify the master's IP address. The trouble is that networking between local Docker containers doesn't work like regular networking between hosts: each container gets its own network namespace, so 127.0.0.1 inside a worker container refers to the worker itself, not the master.
In the example docker-compose.yml, --master-host master refers to the master service by name. Compose creates a bridge network shared by the containers, so the workers can resolve the master service name and connect to it automatically. When you actually deploy the workers on separate hosts, you may need a different setup that specifies the master's IP address explicitly.
So just follow the example directly and have your workers command like this:
worker:
  image: locustio/locust
  volumes:
    - ./:/mnt/locust
  command: -f /mnt/locust/locustfile.py --worker --master-host master
That should result in output like this:
% docker compose up --scale worker=4
[+] Running 5/0
⠿ Container docker-compose_master_1 Created 0.0s
⠿ Container docker-compose_worker_2 Created 0.0s
⠿ Container docker-compose_worker_3 Created 0.0s
⠿ Container docker-compose_worker_1 Created 0.0s
⠿ Container docker-compose_worker_4 Created 0.0s
Attaching to master_1, worker_1, worker_2, worker_3, worker_4
worker_3 | [2021-11-18 16:32:49,911] d90df67c6a69/INFO/locust.main: Starting Locust 2.5.0
worker_4 | [2021-11-18 16:32:50,062] 112a60412b1e/INFO/locust.main: Starting Locust 2.5.0
master_1 | [2021-11-18 16:32:50,224] 859d07f8570b/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
worker_2 | [2021-11-18 16:32:50,233] 56ffce9d4448/INFO/locust.main: Starting Locust 2.5.0
master_1 | [2021-11-18 16:32:50,238] 859d07f8570b/INFO/locust.main: Starting Locust 2.5.0
master_1 | [2021-11-18 16:32:50,239] 859d07f8570b/INFO/locust.runners: Client '56ffce9d4448_dfda9f3bcff742909af80b63d7866714' reported as ready. Currently 1 clients ready to swarm.
master_1 | [2021-11-18 16:32:50,249] 859d07f8570b/INFO/locust.runners: Client '112a60412b1e_49c9e2df265d4fd7bc0f6554a76e66c9' reported as ready. Currently 2 clients ready to swarm.
worker_1 | [2021-11-18 16:32:50,256] 988707b23133/INFO/locust.main: Starting Locust 2.5.0
master_1 | [2021-11-18 16:32:50,259] 859d07f8570b/INFO/locust.runners: Client '988707b23133_88ac7446afd843a5ae7a20dceaed9ea4' reported as ready. Currently 3 clients ready to swarm.
master_1 | [2021-11-18 16:32:50,336] 859d07f8570b/INFO/locust.runners: Client 'd90df67c6a69_e432779d02f94947abb992ff1043eb0e' reported as ready. Currently 4 clients ready to swarm.

Docker on Windows: Error starting protocol stack: listen unix /root/.ethereum/geth.ipc: bind: operation not permitted

On a Windows 10 system, I am trying to run a Docker container running geth, which listens on port 8545. This docker-compose.yml has been tested to run perfectly on both Ubuntu and Mac OS X.
docker-compose version 1.21.1, build 7641a569 is being used on the Windows 10 system.
Problem: Docker throws an error after executing docker-compose up.
Fatal: Error starting protocol stack: listen unix /root/.ethereum/geth.ipc: bind: operation not permitted
What might be causing this error, and how can we solve it?
docker-compose.yml
version: '3'
services:
  geth:
    image: ethereum/client-go:latest
    volumes:
      - ./nodedata:/root/.ethereum
      - ./files/genesis.json:/root/genesis.json:ro
    ports:
      - 30303:30303
      - "30303:30303/udp"
      - 8545:8545
      - 8546:8546
    command: --networkid 1337 --cache 512 --port 30303 --maxpeers 50 --rpc --rpcaddr "0.0.0.0" --rpcapi "eth,personal,web3,net" --bootnodes enode://0b37f58139bef9fef04ff50c1d2d95acade0b6989433ed2148683f294a12e8ca7eb17915864a0dd61d5533e898b7040b75df1a17cca27e90d106f95dea255b45@167.99.55.99:30303
    container_name: geth-nosw
Output after running docker-compose up
Starting geth-node ... done
Attaching to geth-node
geth-node | INFO [07-22|20:43:11.482] Maximum peer count ETH=50 LES=0 total=50
geth-node | INFO [07-22|20:43:11.488] Starting peer-to-peer node instance=Geth/v1.8.13-unstable-526abe27/linux-amd64/go1.10.3
geth-node | INFO [07-22|20:43:11.488] Allocated cache and file handles database=/root/.ethereum/geth/chaindata cache=384 handles=1024
geth-node | INFO [07-22|20:43:11.521] Initialised chain configuration config="{ChainID: 1337 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: <nil> Engine: clique}"
geth-node | INFO [07-22|20:43:11.521] Initialising Ethereum protocol versions="[63 62]" network=1366
geth-node | INFO [07-22|20:43:11.524] Loaded most recent local header number=0 hash=b85de5…3971b4 td=1
geth-node | INFO [07-22|20:43:11.524] Loaded most recent local full block number=0 hash=b85de5…3971b4 td=1
geth-node | INFO [07-22|20:43:11.524] Loaded most recent local fast block number=0 hash=b85de5…3971b4 td=1
geth-node | INFO [07-22|20:43:11.525] Loaded local transaction journal transactions=0 dropped=0
geth-node | INFO [07-22|20:43:11.530] Regenerated local transaction journal transactions=0 accounts=0
geth-node | INFO [07-22|20:43:11.530] Starting P2P networking
geth-node | INFO [07-22|20:43:13.670] UDP listener up self=enode://3e0e8e9a886a347fffb0150e670b45c8ae19f0f87ebb6d3fa0f7f312f17220b426913ac96df9527ae0ca00138c9e50ffe646255d5655e6023c47ef10aabf0224#[::]:30303
geth-node | INFO [07-22|20:43:13.672] Stats daemon started
geth-node | INFO [07-22|20:43:13.674] RLPx listener up self=enode://3e0e8e9a886a347fffb0150e670b45c8ae19f0f87ebb6d3fa0f7f312f17220b426913ac96df9527ae0ca00138c9e50ffe646255d5655e6023c47ef10aabf0224#[::]:30303
geth-node | INFO [07-22|20:43:13.676] Blockchain manager stopped
geth-node | INFO [07-22|20:43:13.677] Stopping Ethereum protocol
geth-node | INFO [07-22|20:43:13.677] Ethereum protocol stopped
geth-node | INFO [07-22|20:43:13.677] Transaction pool stopped
geth-node | INFO [07-22|20:43:13.681] Database closed database=/root/.ethereum/geth/chaindata
geth-node | INFO [07-22|20:43:13.681] Stats daemon stopped
geth-node | Fatal: Error starting protocol stack: listen unix /root/.ethereum/geth.ipc: bind: operation not permitted
geth-node | Fatal: Error starting protocol stack: listen unix /root/.ethereum/geth.ipc: bind: operation not permitted
geth-node exited with code 1
The problem is that you cannot create a unix socket on a volume that is bind-mounted from the Windows file system.
Here's a link on how to work around that.
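One common workaround (a sketch, not tested on every Docker-for-Windows version) is to keep the data directory on a named volume, which lives inside Docker's Linux VM where unix sockets are allowed, instead of bind-mounting a Windows folder:

```yaml
version: '3'
services:
  geth:
    image: ethereum/client-go:latest
    volumes:
      # Named volume: stored in the Linux VM, so geth.ipc can be created.
      - geth-data:/root/.ethereum
      - ./files/genesis.json:/root/genesis.json:ro
volumes:
  geth-data:
```

Alternatively, if you only need the RPC endpoint, geth's --ipcdisable flag skips creating the unix socket entirely.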

How to add advanced Kafka configurations to the Landoop `fast-data-dev` image

Does anyone know how to pass custom Kafka configuration options to the landoop/fast-data-dev docker image?
Since there is no way to pass a custom config file and/or config params, what I've tried so far was to mount my own server.properties config file into /opt/confluent/etc/kafka by adding the following to my docker-compose file:
landoop:
  hostname: 'landoop'
  image: 'landoop/fast-data-dev:latest'
  expose:
    - '3030'
  ports:
    - '3030:3030'
  environment:
    - RUNTESTS=0
    - RUN_AS_ROOT=1
  volumes:
    - ./docker/landoop/tmp:/tmp
    - ./docker/landoop/opt/confluent/etc/kafka:/opt/confluent/etc/kafka
however, this causes Kafka to throw the following logs:
landoop_1 | 2017-09-28 11:53:03,886 INFO exited: broker (exit status 1; not expected)
landoop_1 | 2017-09-28 11:53:04,749 INFO spawned: 'broker' with pid 281
landoop_1 | 2017-09-28 11:53:05,851 INFO success: broker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
landoop_1 | 2017-09-28 11:53:11,867 INFO exited: rest-proxy (exit status 1; not expected)
landoop_1 | 2017-09-28 11:53:12,604 INFO spawned: 'rest-proxy' with pid 314
landoop_1 | 2017-09-28 11:53:13,024 INFO exited: schema-registry (exit status 1; not expected)
landoop_1 | 2017-09-28 11:53:13,735 INFO spawned: 'schema-registry' with pid 341
landoop_1 | 2017-09-28 11:53:13,739 INFO success: rest-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
in addition, when I go to http://localhost:3030/kafka-topics-ui/, I see the following:
KAFKA REST
/api/kafka-rest-proxy
CONNECTIVITY ERROR
Any suggestions? Thank you.
There are a couple of things I did that simplified the whole process. This is valid only for a dev environment:
1. Start the container with an interactive shell as the entry point.
2. Run the container on the host network.
3. Make the necessary changes to the server.properties file, including the IP of the host the container is running on (as mentioned in step 2, the container uses the host network).
4. If you want to make any other advanced configuration changes, do so now.
5. Run the actual entry point, /usr/local/bin/setup-and-run.sh.
Actual commands used:
Start the container:
sudo docker run -it --entrypoint /bin/bash --net=host --rm -e ADV_HOST=HOSTIP landoop/fast-data-dev:latest
Add the following to /run/broker/server.properties:
advertised.host.name = HOST-IP
advertised.port = 9092
Run /usr/local/bin/setup-and-run.sh
As of today, with the latest version of landoop/fast-data-dev, it is possible to specify custom Kafka configuration options by converting the option name to uppercase, replacing dots with underscores, and prepending KAFKA_.
For example, if you want to set specific values for log.retention.bytes and log.retention.hours, add the following to the environment section of your docker-compose file:
environment:
  KAFKA_LOG_RETENTION_BYTES: 1073741824
  KAFKA_LOG_RETENTION_HOURS: 48
  ADV_HOST: 127.0.0.1
  RUNTESTS: 0
  BROWSECONFIGS: 1
You can also specify configuration options for the other services (Schema Registry, Connect, REST Proxy) this way. Check the docs for details: https://hub.docker.com/r/landoop/fast-data-dev/.
Once the container is up you can confirm this by looking at configuration file at the following path inside the container:
/run/broker/server.properties
or, if you've set BROWSECONFIGS to 1 in the environment section, through the Landoop UI at the following URL:
http://127.0.0.1:3030/config/broker/server.properties
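The naming rule can be sketched as a one-liner (the function name is my own, purely for illustration):

```python
def to_env_var(prop, prefix="KAFKA_"):
    """Convert a Kafka property name to fast-data-dev's env var form:
    uppercase, dots replaced by underscores, prefixed with KAFKA_."""
    return prefix + prop.upper().replace(".", "_")

print(to_env_var("log.retention.bytes"))  # KAFKA_LOG_RETENTION_BYTES
print(to_env_var("log.retention.hours"))  # KAFKA_LOG_RETENTION_HOURS
```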

Docker two tier application issue: failed to connect to mongo container

I have a simple nodeJS application consisting of the frontend and a mongo database. I want to deploy it via Docker.
In my docker-compose file I have the following:
version: '2'
services:
  express-container:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      - mongo-container
  mongo-container:
    image: mongo:3.0
When I run docker-compose up, I have the following error:
Creating todoangularv2_mongo-container_1 ...
Creating todoangularv2_mongo-container_1 ... done
Creating todoangularv2_express-container_1 ...
Creating todoangularv2_express-container_1 ... done
Attaching to todoangularv2_mongo-container_1, todoangularv2_express-container_1
mongo-container_1 | 2017-07-25T15:26:09.863+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=25f03f51322b
mongo-container_1 | 2017-07-25T15:26:09.864+0000 I CONTROL [initandlisten] db version v3.0.15
mongo-container_1 | 2017-07-25T15:26:09.864+0000 I CONTROL [initandlisten] git version: b8ff507269c382bc100fc52f75f48d54cd42ec3b
mongo-container_1 | 2017-07-25T15:26:09.864+0000 I CONTROL [initandlisten] build info: Linux ip-10-166-66-3 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 BOOST_LIB_VERSION=1_49
mongo-container_1 | 2017-07-25T15:26:09.864+0000 I CONTROL [initandlisten] allocator: tcmalloc
mongo-container_1 | 2017-07-25T15:26:09.864+0000 I CONTROL [initandlisten] options: {}
mongo-container_1 | 2017-07-25T15:26:09.923+0000 I JOURNAL [initandlisten] journal dir=/data/db/journal
mongo-container_1 | 2017-07-25T15:26:09.924+0000 I JOURNAL [initandlisten] recover : no journal files present, no recovery needed
express-container_1 | Listening on port 3000
express-container_1 |
express-container_1 | events.js:72
express-container_1 | throw er; // Unhandled 'error' event
express-container_1 | ^
express-container_1 | Error: failed to connect to [mongo-container:27017]
So my frontend cannot reach the mongo container called 'mongo-container' in the docker-compose file. In the application itself I'm giving the URL for the mongo database as follows:
module.exports = {
  url: 'mongodb://mongo-container:27017/todo'
}
Any idea how I can change my application so that when it is run on Docker, I don't have this connectivity issue?
EDIT: the mongo container gives the following output:
WAUTERW-M-T3ZT:vagrant wim$ docker logs f63
2017-07-26T09:15:02.824+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=f637f963c87f
2017-07-26T09:15:02.825+0000 I CONTROL [initandlisten] db version v3.0.15
2017-07-26T09:15:02.825+0000 I CONTROL [initandlisten] git version: b8ff507269c382bc100fc52f75f48d54cd42ec3b
...
2017-07-26T09:15:21.461+0000 I STORAGE [FileAllocator] done allocating datafile /data/db/local.0, size: 64MB, took 0.024 secs
2017-07-26T09:15:21.476+0000 I NETWORK [initandlisten] waiting for connections on port 27017
The express container gives the following output:
WAUTERW-M-T3ZT:vagrant wim$ docker logs 25a
Listening on port 3000
events.js:72
throw er; // Unhandled 'error' event
^
Error: failed to connect to [mongo-container:27017]
at null.<anonymous> (/usr/src/app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/connection/server.js:555:74)
at EventEmitter.emit (events.js:106:17)
at null.<anonymous> (/usr/src/app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/connection/connection_pool.js:156:15)
at EventEmitter.emit (events.js:98:17)
at Socket.<anonymous> (/usr/src/app/node_modules/mongoose/node_modules/mongodb/lib/mongodb/connection/connection.js:534:10)
at Socket.EventEmitter.emit (events.js:95:17)
at net.js:441:14
at process._tickCallback (node.js:415:13)
EDIT: the issue appeared in the Dockerfile. Here is a corrected one (simplified a bit as I started from a node image rather than an Ubuntu image):
FROM node:0.10.40
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD ["node", "/usr/src/app/bin/www"]
You could replace depends_on with a links section, which expresses a dependency between services just as depends_on does and, according to the documentation, makes containers for the linked service reachable at a hostname identical to the alias, or to the service name if no alias was specified.
version: '2'
services:
  express-container:
    build: .
    ports:
      - "3000:3000"
    links:
      - "mongo-container"
  mongo-container:
    image: mongo:3.0
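Note that neither depends_on nor links waits for Mongo to be ready to accept connections; they only control start order, which is likely why the app's first connection attempt fails. If your Compose version supports healthcheck conditions (compose file format 2.1 and later), a sketch like this delays the app until Mongo answers a ping:

```yaml
version: '2.1'
services:
  express-container:
    build: .
    ports:
      - "3000:3000"
    depends_on:
      mongo-container:
        condition: service_healthy   # wait for the healthcheck, not just start
  mongo-container:
    image: mongo:3.0
    healthcheck:
      test: ["CMD", "mongo", "--eval", "db.adminCommand('ping')"]
      interval: 5s
      timeout: 5s
      retries: 10
```

Adding connection retry logic in the application itself is the more robust fix, since the database can also go away at runtime.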
The issue appeared in the Dockerfile. Here is a corrected one (simplified a bit as I started from a node image rather than an Ubuntu image):
FROM node:0.10.40
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN npm install
CMD ["node", "/usr/src/app/bin/www"]

Can't Ping a Pod after Ubuntu cluster setup

I have followed the most recent instructions (updated 7th May '15) to set up a cluster on Ubuntu** with etcd and flanneld, but I'm having trouble with the network... it seems to be in some kind of broken state.
**Note: I updated the config script so that it installs 0.16.2. Also, kubectl get minions initially returned nothing, but the nodes appeared after a sudo service kube-controller-manager restart.
This is my setup:
| ServerName | Public IP | Private IP |
------------------------------------------
| KubeMaster | 107.x.x.32 | 10.x.x.54 |
| KubeNode1 | 104.x.x.49 | 10.x.x.55 |
| KubeNode2 | 198.x.x.39 | 10.x.x.241 |
| KubeNode3 | 104.x.x.52 | 10.x.x.190 |
| MongoDev1 | 162.x.x.132 | 10.x.x.59 |
| MongoDev2 | 104.x.x.103 | 10.x.x.60 |
From any machine I can ping any other machine... it's when I create pods and services that I start getting issues.
Pod
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
auth-dev-ctl-6xah8 172.16.37.7 sis-auth leportlabs/sisauth:latestdev 104.x.x.52/104.x.x.52 environment=dev,name=sis-auth Running 3 hours
So this pod has been spun up on KubeNode3... if I try to ping it from any machine other than KubeNode3, I get a Destination Net Unreachable error. E.g.
# ping 172.16.37.7
PING 172.16.37.7 (172.16.37.7) 56(84) bytes of data.
From 129.250.204.117 icmp_seq=1 Destination Net Unreachable
I can call etcdctl get /coreos.com/network/config on all four and get back {"Network":"172.16.0.0/16"}.
I'm not sure where to look from there. Can anyone help me out here?
Supporting Info
On the master node:
# ps -ef | grep kube
root 4729 1 0 May07 ? 00:06:29 /opt/bin/kube-scheduler --logtostderr=true --master=127.0.0.1:8080
root 4730 1 1 May07 ? 00:21:24 /opt/bin/kube-apiserver --address=0.0.0.0 --port=8080 --etcd_servers=http://127.0.0.1:4001 --logtostderr=true --portal_net=192.168.3.0/24
root 5724 1 0 May07 ? 00:10:25 /opt/bin/kube-controller-manager --master=127.0.0.1:8080 --machines=104.x.x.49,198.x.x.39,104.x.x.52 --logtostderr=true
# ps -ef | grep etcd
root 4723 1 2 May07 ? 00:32:46 /opt/bin/etcd -name infra0 -initial-advertise-peer-urls http://107.x.x.32:2380 -listen-peer-urls http://107.x.x.32:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster infra0=http://107.x.x.32:2380,infra1=http://104.x.x.49:2380,infra2=http://198.x.x.39:2380,infra3=http://104.x.x.52:2380 -initial-cluster-state new
On a node:
# ps -ef | grep kube
root 10878 1 1 May07 ? 00:16:22 /opt/bin/kubelet --address=0.0.0.0 --port=10250 --hostname_override=104.x.x.49 --api_servers=http://107.x.x.32:8080 --logtostderr=true --cluster_dns=192.168.3.10 --cluster_domain=kubernetes.local
root 10882 1 0 May07 ? 00:05:23 /opt/bin/kube-proxy --master=http://107.x.x.32:8080 --logtostderr=true
# ps -ef | grep etcd
root 10873 1 1 May07 ? 00:14:09 /opt/bin/etcd -name infra1 -initial-advertise-peer-urls http://104.x.x.49:2380 -listen-peer-urls http://104.x.x.49:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster infra0=http://107.x.x.32:2380,infra1=http://104.x.x.49:2380,infra2=http://198.x.x.39:2380,infra3=http://104.x.x.52:2380 -initial-cluster-state new
#ps -ef | grep flanneld
root 19560 1 0 May07 ? 00:00:01 /opt/bin/flanneld
So I noticed that the flannel configuration (/run/flannel/subnet.env) was different from what docker was starting up with (I have no idea how they got out of sync).
# ps -ef | grep docker
root 19663 1 0 May07 ? 00:09:20 /usr/bin/docker -d -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.85.1/24 --mtu=1472
# cat /run/flannel/subnet.env
FLANNEL_SUBNET=172.16.60.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false
Note that the docker --bip=172.16.85.1/24 was different to the flannel subnet FLANNEL_SUBNET=172.16.60.1/24.
So naturally I changed /etc/default/docker to reflect the new value.
DOCKER_OPTS="-H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock --bip=172.16.60.1/24 --mtu=1472"
But now sudo service docker restart wasn't reporting an error, yet docker wasn't running, so looking at /var/log/upstart/docker.log I could see the following:
FATA[0000] Shutting down daemon due to errors: Bridge ip (172.16.85.1) does not match existing bridge configuration 172.16.60.1
So the final piece to the puzzle was deleting the old bridge and restarting docker...
# sudo brctl delbr docker0
# sudo service docker start
If sudo brctl delbr docker0 returns bridge docker0 is still up; can't delete it run ifconfig docker0 down and try again.
Please try this:
ip link del docker0
systemctl restart flanneld