I have three nodes, and each one runs Docker with an Ubuntu image on it. I want to build a Kafka cluster with these three nodes. I configured "zookeeper.properties" in the Docker environment of node "150.20.11.157" like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=0.0.0.0:2888:3888
server.2=150.20.11.134:2888:3888
server.3=150.20.11.137:2888:3888
clientPort=2186
For node 150.20.11.134, the "zookeeper.properties" file in the Docker environment is like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=150.20.11.157:2888:3888
server.2=0.0.0.0:2888:3888
server.3=150.20.11.137:2888:3888
clientPort=2186
For node 150.20.11.137, the "zookeeper.properties" file in the Docker environment is like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=150.20.11.157:2888:3888
server.2=150.20.11.134:2888:3888
server.3=0.0.0.0:2888:3888
clientPort=2186
Also, I set up "server.properties" like this for node 150.20.11.157:
broker.id=0
port=9092
listeners = PLAINTEXT://150.20.11.157:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
"server.properties" for node 150.20.11.134 is:
broker.id=1
port=9092
listeners = PLAINTEXT://150.20.11.134:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
"server.properties" for node 150.20.11.137 is:
broker.id=2
port=9092
listeners = PLAINTEXT://150.20.11.137:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
Moreover, every node has a "myid" file in "/tmp/zookeeper/data" of the Docker environment containing its server ID.
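For example, the myid files were created roughly like this (IDs matching the server.N lines above):

# on 150.20.11.157 (server.1)
mkdir -p /tmp/zookeeper/data && echo 1 > /tmp/zookeeper/data/myid
# on 150.20.11.134 (server.2)
mkdir -p /tmp/zookeeper/data && echo 2 > /tmp/zookeeper/data/myid
# on 150.20.11.137 (server.3)
mkdir -p /tmp/zookeeper/data && echo 3 > /tmp/zookeeper/data/myid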
To make a three-node Kafka cluster as shown in the picture, I wrote a "docker-compose.yaml" file and a Dockerfile for it.
This is my docker-compose file:
version: '3.7'
services:
  zookeeper:
    build: .
    command: /root/kafka_2.11-2.0.1/bin/zookeeper-server-start.sh /root/kafka_2.11-2.0.1/config/zookeeper.properties
    ports:
      - 2186:2186
  kafka1:
    build:
      context: .
      args:
        brokerId: 0
    command: /root/kafka_2.11-2.0.1/bin/kafka-server-start.sh /root/kafka_2.11-2.0.1/config/server.properties
    depends_on:
      - zookeeper
  kafka2:
    build:
      context: .
      args:
        brokerId: 1
    command: /root/kafka_2.11-2.0.1/bin/kafka-server-start.sh /root/kafka_2.11-2.0.1/config/server.properties
    depends_on:
      - zookeeper
  kafka3:
    build:
      context: .
      args:
        brokerId: 2
    command: /root/kafka_2.11-2.0.1/bin/kafka-server-start.sh /root/kafka_2.11-2.0.1/config/server.properties
    depends_on:
      - zookeeper
  producer:
    build: .
    command: bash -c "sleep 4 && /root/kafka_2.11-2.0.1/bin/kafka-topics.sh --create --zookeeper zookeeper:2186 --replication-factor 2 --partitions 3 --topic dates && while true; do date | /kafka_2.11-2.0.1/bin/kafka-console-producer.sh --broker-list kafka1:9092,kafka2:9092,kafka3:9092 --topic dates; sleep 1; done "
    depends_on:
      - zookeeper
      - kafka1
      - kafka2
      - kafka3
  consumer:
    build: .
    command: bash -c "sleep 6 && /root/kafka_2.11-2.0.1/bin/kafka-console-consumer.sh localhost:9092 --topic dates --bootstrap-server kafka1:9092,kafka2:9092,kafka3:9092"
    depends_on:
      - zookeeper
      - kafka1
      - kafka2
      - kafka3
The problem is that after building the image with "docker build .", when I run "sudo docker-compose up" on each node, it does not run completely. Part of my log follows:
zookeeper_1 | [2019-01-17 16:09:27,197] INFO Reading configuration from: /root/kafka_2.11-2.0.1/config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
kafka3_1 | [2019-01-17 16:09:29,426] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
kafka3_1 | [2019-01-17 16:09:29,702] INFO starting (kafka.server.KafkaServer)
kafka3_1 | [2019-01-17 16:09:29,702] INFO Connecting to zookeeper on 150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186 (kafka.server.KafkaServer)
kafka1_1 | [2019-01-17 16:09:30,012] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
zookeeper_1 | [2019-01-17 16:09:27,240] INFO Resolved hostname: 150.20.11.137 to address: /150.20.11.137 (org.apache.zookeeper.server.quorum.QuorumPeer)
kafka1_1 | [2019-01-17 16:09:30,486] INFO starting (kafka.server.KafkaServer)
kafka3_1 | [2019-01-17 16:09:29,715] INFO [ZooKeeperClient] Initializing a new session to 150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186. (kafka.zookeeper.ZooKeeperClient)
zookeeper_1 | [2019-01-17 16:09:27,241] INFO Resolved hostname: 150.20.11.134 to address: /150.20.11.134 (org.apache.zookeeper.server.quorum.QuorumPeer)
zookeeper_1 | [2019-01-17 16:09:27,241] INFO Resolved hostname: 0.0.0.0 to address: /0.0.0.0 (org.apache.zookeeper.server.quorum.QuorumPeer)
kafka3_1 | [2019-01-17 16:09:29,720] INFO Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 GMT (org.apache.zookeeper.ZooKeeper)
zookeeper_1 | [2019-01-17 16:09:27,241] INFO Defaulting to majority quorums (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
kafka3_1 | [2019-01-17 16:09:29,721] INFO Client environment:host.name=be08b050be4c (org.apache.zookeeper.ZooKeeper)
zookeeper_1 | [2019-01-17 16:09:27,242] ERROR Invalid config, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain)
zookeeper_1 | org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing /root/kafka_2.11-2.0.1/config/zookeeper.properties
zookeeper_1 | at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:156)
zookeeper_1 | at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:104)
zookeeper_1 | at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
zookeeper_1 | Caused by: java.lang.IllegalArgumentException: /tmp/zookeeper/data/myid file is missing
zookeeper_1 | at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:408)
zookeeper_1 | at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:152)
zookeeper_1 | ... 2 more
kafka1_1 | [2019-01-17 16:09:30,487] INFO Connecting to zookeeper on 150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186 (kafka.server.KafkaServer)
zookeeper_1 | Invalid config, exiting abnormall
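The exception says the myid file is missing at /tmp/zookeeper/data inside the container, not on the host. A quick way to see what the container's filesystem actually contains (a sketch, using the service name from my compose file, assuming the image has no special entrypoint) would be:

sudo docker-compose run --rm zookeeper ls -l /tmp/zookeeper/data
sudo docker-compose run --rm zookeeper cat /tmp/zookeeper/data/myid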
In fact, I had configured this Kafka cluster without Docker on every node, and I could run ZooKeeper and the Kafka server without any problem; that cluster was set up as shown in the picture.
Would you please tell me what I am doing wrong in configuring this cluster?
Any help would be appreciated.
I changed the docker-compose file and the problem was solved. ZooKeeper and the Kafka server run without any problem, the topic is created, and the consumer and producer work with the topic on all three nodes. My docker-compose file for one node is like this:
version: '3.7'
services:
  zookeeper:
    image: ubuntu_mesos
    command: /root/kafka_2.11-2.0.1/bin/zookeeper-server-start.sh /root/kafka_2.11-2.0.1/config/zookeeper.properties
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 2186
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 0.0.0.0:2888:3888;150.20.11.134:2888:3888;150.20.11.137:2888:3888
    network_mode: host
    expose:
      - 2186
      - 2888
      - 3888
    ports:
      - 2186:2186
      - 2888:2888
      - 3888:3888
  kafka:
    image: ubuntu_mesos
    command: bash -c "sleep 20; /root/kafka_2.11-2.0.1/bin/kafka-server-start.sh /root/kafka_2.11-2.0.1/config/server.properties"
    network_mode: host
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 0
      KAFKA_ZOOKEEPER_CONNECT: 150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://150.20.11.157:9092
    expose:
      - 9092
    ports:
      - 9092:9092
  producer:
    image: ubuntu_mesos
    command: bash -c "sleep 40; /root/kafka_2.11-2.0.1/bin/kafka-topics.sh --create --zookeeper 150.20.11.157:2186 --replication-factor 2 --partitions 3 --topic testFlink -- /root/kafka_2.11-2.0.1/bin/kafka-console-producer.sh --broker-list 150.20.11.157:9092 --topic testFlink"
    depends_on:
      - zookeeper
      - kafka
  consumer:
    image: ubuntu_mesos
    command: bash -c "sleep 44; /root/kafka_2.11-2.0.1/bin/kafka-console-consumer.sh --bootstrap-server 150.20.11.157:9092 --topic testFlink --from-beginning"
    depends_on:
      - zookeeper
      - kafka
The two other nodes have docker-compose files like the one above; only the per-node values change, roughly as sketched below.
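A sketch of the values that would differ on the other two nodes (assuming the same numbering as in the original configuration, i.e. 150.20.11.134 is server 2 / broker 1 and 150.20.11.137 is server 3 / broker 2):

# on 150.20.11.134
zookeeper:
  environment:
    ZOOKEEPER_SERVER_ID: 2
    ZOOKEEPER_SERVERS: 150.20.11.157:2888:3888;0.0.0.0:2888:3888;150.20.11.137:2888:3888
kafka:
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://150.20.11.134:9092

# on 150.20.11.137
zookeeper:
  environment:
    ZOOKEEPER_SERVER_ID: 3
    ZOOKEEPER_SERVERS: 150.20.11.157:2888:3888;150.20.11.134:2888:3888;0.0.0.0:2888:3888
kafka:
  environment:
    KAFKA_BROKER_ID: 2
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://150.20.11.137:9092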
Hope it was helpful for others.
Related
We are using Curator service discovery in Docker and Kubernetes environments. We set up the connection string using the DNS names of the containers/pods. The problem I am seeing is that it seems to resolve these down to IP addresses. The container or pod can change IP addresses, and Curator does not seem to pick up the change.
The behavior I see: I stand up a 3-node ZooKeeper cluster and stand up 1 or more agents. I then roll the ZooKeeper nodes one at a time and they each change their IP address; when I bounce the third ZooKeeper instance, all the clients lose their connection.
Is there a way to force it to always use the DNS names for connection?
Here is my compose example
version: '2.4'
x-zookeeper:
  &zookeeper-env
  JVMFLAGS: -Dzookeeper.4lw.commands.whitelist=ruok
  ZOO_ADMINSERVER_ENABLED: 'true'
  ZOO_STANDALONE_ENABLED: 'false'
  ZOO_SERVERS: server.1=zookeeper1:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=zookeeper3:2888:3888;2181
x-agent:
  &agent-env
  ZK_CONNECTION: zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
  SERVICE_NAME: myservice
services:
  zookeeper1:
    image: artifactory.rd2.thingworx.io/zookeeper:${ZOOKEEPER_IMAGE_VERSION}
    restart: always
    ports:
      - 2181
      - 8080
    healthcheck:
      test: echo ruok | nc localhost 2181 | grep imok
      interval: 15s
    environment:
      <<: *zookeeper-env
      ZOO_MY_ID: 1
  zookeeper2:
    image: artifactory.rd2.thingworx.io/zookeeper:${ZOOKEEPER_IMAGE_VERSION}
    restart: always
    ports:
      - 2181
      - 8080
    healthcheck:
      test: echo ruok | nc localhost 2181 | grep imok
      interval: 15s
    environment:
      <<: *zookeeper-env
      ZOO_MY_ID: 2
  zookeeper3:
    image: artifactory.rd2.thingworx.io/zookeeper:${ZOOKEEPER_IMAGE_VERSION}
    restart: always
    ports:
      - 2181
      - 8080
    healthcheck:
      test: echo ruok | nc localhost 2181 | grep imok
      interval: 15s
    environment:
      <<: *zookeeper-env
      ZOO_MY_ID: 3
  agent1:
    image: artifactory.rd2.thingworx.io/twxdevops/discovery-tool:latest
    environment:
      <<: *agent-env
      GLOBAL_ID: AGENT1
  agent2:
    image: artifactory.rd2.thingworx.io/twxdevops/discovery-tool:latest
    environment:
      <<: *agent-env
      GLOBAL_ID: AGENT2
  agent3:
    image: artifactory.rd2.thingworx.io/twxdevops/discovery-tool:latest
    environment:
      <<: *agent-env
      GLOBAL_ID: AGENT3
  agent4:
    image: artifactory.rd2.thingworx.io/twxdevops/discovery-tool:latest
    environment:
      <<: *agent-env
      GLOBAL_ID: AGENT4
  agent5:
    image: artifactory.rd2.thingworx.io/twxdevops/discovery-tool:latest
    environment:
      <<: *agent-env
      GLOBAL_ID: AGENT5
The run steps are
docker-compose up -d zookeeper1 zookeeper2 zookeeper3 agent1
docker-compose rm -sf zookeeper3
docker-compose up -d agent2
docker-compose up -d zookeeper3
docker-compose rm -sf zookeeper2
docker-compose up -d agent3
docker-compose up -d zookeeper2
docker-compose rm -sf zookeeper1
docker-compose up -d agent5
docker-compose up -d zookeeper1
After I kill the last ZooKeeper node, the agent gets the following error and does not recover. You can see it is referencing an IP address:
Path:null finished:false header:: 5923,4 replyHeader:: 5923,8589934594,0 request:: '/services/myservice/cc1996fb-cca5-4108-bd06-567b45f594d7,F response:: #7b226e616d65223a226d7973657276696365222c226964223a2263633139393666622d636361352d343130382d626430362d353637623435663539346437222c2261646472657373223a223137322e32312e302e33222c22706f7274223a383038302c2273736c506f7274223a6e756c6c2c227061796c6f6164223a7b2240636c617373223a22636f6d2e7468696e67776f72782e646973636f766572792e7a6b2e53657276696365496e7374616e636544657461696c73222c2261747472696275746573223a7b22474c4f42414c4944223a224147454e5433227d7d2c22726567697374726174696f6e54696d65555443223a313634393739313735353936322c227365727669636554797065223a2244594e414d4943222c2275726953706563223a7b227061727473223a5b7b2276616c7565223a2261646472657373222c227661726961626c65223a747275657d2c7b2276616c7565223a223a222c227661726961626c65223a66616c73657d2c7b2276616c7565223a22706f7274222c227661726961626c65223a747275657d5d7d7d,s{4294967301,4294967301,1649791757073,1649791757073,0,0,0,144117976615550976,404,0,4294967301}
agent1_1 | 19:48:46.438 [ServiceEventWatcher-myservice] DEBUG com.thingworx.discovery.zk.ZookeeperProvider - ZooKeeper resolved addresses for service myservice: [ServiceDefinition [serviceName=myservice, host=172.21.0.7, port=8080, tags={GLOBALID=AGENT2}], ServiceDefinition [serviceName=myservice, host=172.21.0.4, port=8080, tags={GLOBALID=AGENT1}], ServiceDefinition [serviceName=myservice, host=172.21.0.3, port=8080, tags={GLOBALID=AGENT3}]]
agent1_1 | 19:48:47.070 [main-SendThread(172.21.0.5:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x200028941eb0001 for sever service-discovery-docker-tests_zookeeper2_1.service-discovery-docker-tests_default/172.21.0.5:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
agent1_1 | org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x200028941eb0001, likely server has closed socket
agent1_1 | at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
agent1_1 | at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
agent1_1 | at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1275)
agent1_1 | 19:48:47.171 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
agent1_1 | 19:48:47.363 [main-SendThread(172.21.0.9:2181)] DEBUG org.apache.zookeeper.SaslServerPrincipal - Canonicalized address to 172.21.0.9
agent1_1 | 19:48:47.363 [main-SendThread(172.21.0.9:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 172.21.0.9/172.21.0.9:2181.
agent1_1 | 19:48:47.363 [main-SendThread(172.21.0.9:2181)] INFO org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
agent1_1 | 19:48:47.430 [ServiceEventWatcher-myservice] DEBUG com.thingworx.discovery.zk.ZookeeperProvider - Getting registered addresses from ZooKeeper for service myservice
The ZooKeeper cluster itself is happy and fine. So the main question is: is there a way to have it use the DNS names instead of the IP addresses? I should also mention that service discovery uses ephemeral nodes, so disconnecting and reconnecting is bad.
I have a local application with Kafka and ZooKeeper, but when I run docker compose up on my Arch Linux desktop, Kafka shows this error:
container log:
kafka_1 | java.net.NoRouteToHostException: No route to host
kafka_1 | at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
kafka_1 | at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
kafka_1 | at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:344)
kafka_1 | at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290)
kafka_1 | [main-SendThread(zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server zookeeper/172.18.0.2:2181.
kafka_1 | [main-SendThread(zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
kafka_1 | [main-SendThread(zookeeper:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for sever zookeeper/172.18.0.2:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
docker compose file:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    ports:
      - '2181:2181'
    networks:
      - domper-network
  kafka:
    image: confluentinc/cp-kafka:latest
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - '9092:9092'
      - '9094:9094'
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_LISTENERS: INTERNAL://:9092,OUTSIDE://:9094
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,OUTSIDE://host.docker.internal:9094
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
    extra_hosts:
      - 'host.docker.internal:172.17.0.1' # Docker gateway
    networks:
      - domper-network
  postgres:
    image: postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: Admin#2021!
      POSTGRES_DB: domper
      POSTGRES_HOST_AUTH_METHOD: password
    ports:
      - 5432:5432
    volumes:
      - postgres-domper-data:/var/lib/postgresql/data
    networks:
      - domper-network
  pgadmin:
    image: dpage/pgadmin4
    restart: unless-stopped
    environment:
      PGADMIN_DEFAULT_EMAIL: 'admin#admin.com.br'
      PGADMIN_DEFAULT_PASSWORD: 'Admin#2021!'
    ports:
      - 16543:80
    depends_on:
      - postgres
    networks:
      - domper-network
  api:
    build: ./
    restart: 'no'
    command: bash -c "npm i && npm run migration:run && npm run seed:run && npm run start:dev"
    ports:
      - 8888:8888
    env_file:
      - dev.env
    volumes:
      - ./:/var/www/api
      - /var/www/api/node_modules/
    depends_on:
      - postgres
      - kafka
    networks:
      - domper-network
    # healthcheck:
    #   test: ["CMD", "curl", "-f", "http://localhost:8888/healthcheck"]
    #   interval: 60s
    #   timeout: 5s
    #   retries: 5
  notification-service:
    build: ../repoDomperNotification/
    restart: 'no'
    command: npm run start
    ports:
      - 8889:8889
    env_file:
      - dev.env
    volumes:
      - ../repoDomperNotification/:/var/www/notification
      - /var/www/notification/node_modules/
    depends_on:
      - kafka
    networks:
      - domper-network
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost:8889/healthcheck']
      interval: 60s
      timeout: 5s
      retries: 5
volumes:
  postgres-domper-data:
    driver: local
networks:
  domper-network:
But on my Windows desktop it works, and I don't know what this means; I think it's some host configuration issue.
I've tried all the things I found on the internet and none of them worked.
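For reference, a bare connectivity check from a throwaway broker container to the zookeeper service would look roughly like this (a sketch; it assumes the Confluent image has bash available and uses the service names from the compose file above):

docker compose run --rm --entrypoint bash kafka -c '(echo > /dev/tcp/zookeeper/2181) && echo "zookeeper reachable" || echo "zookeeper not reachable"'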
Log after trying @OneCricketeer's suggestion (remove host.docker.internal and the extra_hosts):
repodompercore-kafka-1 | ===> Running preflight checks ...
repodompercore-kafka-1 | ===> Check if /var/lib/kafka/data is writable ...
repodompercore-kafka-1 | ===> Check if Zookeeper is healthy ...
repodompercore-kafka-1 | SLF4J: Class path contains multiple SLF4J bindings.
repodompercore-kafka-1 | SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
repodompercore-kafka-1 | SLF4J: Found binding in [jar:file:/usr/share/java/cp-base-new/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
repodompercore-kafka-1 | SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
repodompercore-kafka-1 | SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
repodompercore-kafka-1 | log4j:WARN No appenders could be found for logger (io.confluent.admin.utils.cli.ZookeeperReadyCommand).
repodompercore-kafka-1 | log4j:WARN Please initialize the log4j system properly.
repodompercore-kafka-1 | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
repodompercore-kafka-1 exited with code 1
I am trying to use Kafka on Docker, but when running docker compose, Kafka just gets stuck in configuring mode and never indicates that it is running. Listing the Kafka topics does not work and gives no response. Here's my docker-compose.yml:
version: '3'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    hostname: zookeeper
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
    networks:
      - kafka_net
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - 9092:9092
    expose:
      - 9092
    depends_on:
      - zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_LISTENERS: INSIDE://0.0.0.0:9092,OUTSIDE://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9092,OUTSIDE://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_BROKER_ID: 1
    restart: always
    networks:
      - kafka_net
networks:
  kafka_net:
    driver: "bridge"
Here's the ZooKeeper output in the docker-compose log:
2021-04-19 01:07:56,385 [myid:] - INFO [main:Environment#100] - Server environment:user.dir=/opt/zookeeper-3.4.13
2021-04-19 01:07:56,393 [myid:] - INFO [main:ZooKeeperServer#836] - tickTime set to 2000
2021-04-19 01:07:56,394 [myid:] - INFO [main:ZooKeeperServer#845] - minSessionTimeout set to -1
2021-04-19 01:07:56,394 [myid:] - INFO [main:ZooKeeperServer#854] - maxSessionTimeout set to -1
2021-04-19 01:07:56,418 [myid:] - INFO [main:ServerCnxnFactory#117] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2021-04-19 01:07:56,430 [myid:] - INFO [main:NIOServerCnxnFactory#89] - binding to port 0.0.0.0/0.0.0.0:2181
Here's the only Kafka output in the docker-compose log:
kafka | [Configuring] 'log.dirs' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'zookeeper.connect' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'listeners' in '/opt/kafka/config/server.properties'
kafka | Excluding KAFKA_VERSION from broker config
kafka | [Configuring] 'broker.id' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'listener.security.protocol.map' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'advertised.listeners' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'port' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'advertised.host.name' in '/opt/kafka/config/server.properties'
kafka | Excluding KAFKA_HOME from broker config
kafka | [Configuring] 'advertised.port' in '/opt/kafka/config/server.properties'
kafka | [Configuring] 'inter.broker.listener.name' in '/opt/kafka/config/server.properties'
When I run the following commands, the topic listing gives no response:
docker exec -it kafka /bin/sh
/ # cd /opt/kafka_2.13-2.7.0
/opt/kafka_2.13-2.7.0 # bin/kafka-topics.sh --list --zookeeper zookeeper:2181
What should I do about this? I am still trying to make it run. I already tried Kafka natively (without Docker) and it runs perfectly; there I can create a topic.
Try using another Kafka image, for example Bitnami. I had the same problem as you on an Apple M1 running Big Sur; I used Bitnami and it's working. This is my docker-compose.yml:
version: "2"
services:
zookeeper:
image: docker.io/bitnami/zookeeper:3
ports:
- "2181:2181"
volumes:
- "zookeeper_data:/bitnami"
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
kafka:
image: docker.io/bitnami/kafka:2
ports:
- "9092:9092"
volumes:
- "kafka_data:/bitnami"
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
- KAFKA_LISTENERS=PLAINTEXT://:9092
- KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:9092
depends_on:
- zookeeper
volumes:
zookeeper_data:
driver: local
kafka_data:
driver: local
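A quick smoke test against this stack might look like the following (a sketch; it assumes the Bitnami image keeps the Kafka scripts under /opt/bitnami/kafka/bin and that the advertised 127.0.0.1:9092 listener is reachable from inside the container):

docker-compose exec kafka /opt/bitnami/kafka/bin/kafka-topics.sh --create --topic smoke-test --partitions 1 --replication-factor 1 --bootstrap-server 127.0.0.1:9092
docker-compose exec kafka /opt/bitnami/kafka/bin/kafka-topics.sh --list --bootstrap-server 127.0.0.1:9092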
It seems like you missed an important part of your docker-compose log. I ran your docker-compose file locally and would like to highlight this part of the logs:
kafka | [2021-04-19 09:20:19,209] INFO Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation (org.apache.zookeeper.common.X509Util)
kafka | [2021-04-19 09:20:19,253] ERROR Exiting Kafka due to fatal exception (kafka.Kafka$)
kafka | java.lang.IllegalArgumentException: requirement failed: Each listener must have a different port, listeners: INSIDE://0.0.0.0:9092,OUTSIDE://0.0.0.0:9092
kafka | at kafka.utils.CoreUtils$.validate$1(CoreUtils.scala:265)
kafka | at kafka.utils.CoreUtils$.listenerListToEndPoints(CoreUtils.scala:276)
kafka | at kafka.server.KafkaConfig.$anonfun$listeners$1(KafkaConfig.scala:1680)
kafka | at kafka.server.KafkaConfig.listeners(KafkaConfig.scala:1679)
kafka | at kafka.server.KafkaConfig.validateValues(KafkaConfig.scala:1779)
kafka | at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1756)
kafka | at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1312)
kafka | at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:34)
kafka | at kafka.Kafka$.main(Kafka.scala:68)
kafka | at kafka.Kafka.main(Kafka.scala)
kafka exited with code 1
So you were using the same port for accessing Kafka from inside and outside the Docker network. I changed the setting below (the INSIDE port) and it worked for me. I was able to create and list topics, and also to produce and consume simple messages from the Kafka container.
KAFKA_LISTENERS: INSIDE://0.0.0.0:29092,OUTSIDE://0.0.0.0:9092
KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:29092,OUTSIDE://localhost:9092
This is some of the testing I did from the Kafka container:
docker exec -it kafka bash
bash-4.4# cd /opt/kafka_2.13-2.7.0/
bash-4.4# bin/kafka-topics.sh --create --topic test-topic --partitions 16 --replication-factor 1 --zookeeper zookeeper:2181
Created topic test-topic.
bash-4.4# bin/kafka-topics.sh --list --zookeeper zookeeper:2181
test-topic
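Producing and consuming a quick test message from the same container would look roughly like this (a sketch using the INSIDE listener from the changed settings above):

bash-4.4# echo "hello" | bin/kafka-console-producer.sh --broker-list kafka:29092 --topic test-topic
bash-4.4# bin/kafka-console-consumer.sh --bootstrap-server kafka:29092 --topic test-topic --from-beginning --max-messages 1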
This is an excellent post that I would like to refer you to. Please let me know whether or not this works for you.
When the Spark job is run locally without Docker via spark-submit everything works fine.
However, running it in a Docker container results in no output being generated.
To see if Kafka itself was working, I extracted Kafka onto the Spark worker container and made a console consumer listen to the same host, port, and topic (kafka:9092, crypto_topic), which worked correctly and showed output. (There's a producer constantly pushing data to the topic in another container.)
Expected -
20/09/11 17:35:27 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.29.10:42565 with 366.3 MB RAM, BlockManagerId(driver, 192.168.29.10, 42565, None)
20/09/11 17:35:27 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.29.10, 42565, None)
20/09/11 17:35:27 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.29.10, 42565, None)
-------------------------------------------
Batch: 0
-------------------------------------------
+---------+-----------+-----------------+------+----------+------------+-----+-------------------+---------+
|name_coin|symbol_coin|number_of_markets|volume|market_cap|total_supply|price|percent_change_24hr|timestamp|
+---------+-----------+-----------------+------+----------+------------+-----+-------------------+---------+
+---------+-----------+-----------------+------+----------+------------+-----+-------------------+---------+
...
...
...
followed by more output
Actual
20/09/11 14:49:44 INFO BlockManagerMasterEndpoint: Registering block manager d7443d94165c:46203 with 366.3 MB RAM, BlockManagerId(driver, d7443d94165c, 46203, None)
20/09/11 14:49:44 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, d7443d94165c, 46203, None)
20/09/11 14:49:44 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, d7443d94165c, 46203, None)
20/09/11 14:49:44 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
no more output, stuck here
docker-compose.yml file
version: "3"
services:
zookeeper:
image: zookeeper:3.6.1
container_name: zookeeper
hostname: zookeeper
ports:
- "2181:2181"
networks:
- crypto-network
kafka:
image: wurstmeister/kafka:2.13-2.6.0
container_name: kafka
hostname: kafka
ports:
- "9092:9092"
environment:
- KAFKA_ADVERTISED_HOST_NAME=kafka
- KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
- KAFKA_ADVERTISED_PORT=9092
# topic-name:partitions:in-sync-replicas:cleanup-policy
- KAFKA_CREATE_TOPICS="crypto_topic:1:1:compact"
networks:
- crypto-network
kafka-producer:
image: python:3-alpine
container_name: kafka-producer
command: >
sh -c "pip install -r /usr/src/producer/requirements.txt
&& python3 /usr/src/producer/kafkaProducerService.py"
volumes:
- ./kafkaProducer:/usr/src/producer
networks:
- crypto-network
cassandra:
image: cassandra:3.11.8
container_name: cassandra
hostname: cassandra
ports:
- "9042:9042"
#command:
# cqlsh -f /var/lib/cassandra/cql-queries.cql
volumes:
- ./cassandraData:/var/lib/cassandra
networks:
- crypto-network
spark-master:
image: bde2020/spark-master:2.4.5-hadoop2.7
container_name: spark-master
hostname: spark-master
ports:
- "8080:8080"
- "7077:7077"
- "6066:6066"
networks:
- crypto-network
spark-consumer-worker:
image: bde2020/spark-worker:2.4.5-hadoop2.7
container_name: spark-consumer-worker
environment:
- SPARK_MASTER=spark://spark-master:7077
ports:
- "8081:8081"
volumes:
- ./sparkConsumer:/sparkConsumer
networks:
- crypto-network
networks:
crypto-network:
driver: bridge
spark-submit is run with:
docker exec -it spark-consumer-worker bash
/spark/bin/spark-submit --master $SPARK_MASTER --class processing.SparkRealTimePriceUpdates \
--packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.3,org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5 \
/sparkConsumer/sparkconsumer_2.11-1.0-RELEASE.jar
Relevant parts of the Spark code:
val inputDF: DataFrame = spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", "kafka:9092")
.option("subscribe", "crypto_topic")
.load()
...
...
...
val queryPrice: StreamingQuery = castedDF
.writeStream
.outputMode("update")
.format("console")
.option("truncate", "false")
.start()
queryPrice.awaitTermination()
val inputDF: DataFrame = spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", "kafka:9092")
.option("subscribe", "crypto_topic")
.load()
This part of the code was actually
val inputDF: DataFrame = spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", KAFKA_BOOTSTRAP_SERVERS)
.option("subscribe", KAFKA_TOPIC)
.load()
Where KAFKA_BOOTSTRAP_SERVERS and KAFKA_TOPIC are read in from a config file while packaging the jar locally.
The best way to debug for me was to set the logs to be more verbose.
Locally, the value of KAFKA_BOOTSTRAP_SERVERS was localhost:9092, but in the Docker container it was changed to kafka:9092 in the config file there. This, however, didn't take effect because the JAR was already packaged, so changing the value to kafka:9092 while packaging locally fixed it.
I would still appreciate any help on how to have a JAR pick up configuration dynamically, though; I don't want to package via SBT on the Docker container.
I have a Docker image called ubuntu_mesos_spark. I installed ZooKeeper on it and changed the “zoo.cfg” file on each node as follows.
This is “zoo.cfg” on node 1 (150.20.11.157):
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2187
dataDir=/var/lib/zookeeper
server.1=0.0.0.0:2888:3888
server.2=150.20.11.157:2888:3888
server.3=150.20.11.137:2888:3888
This is “zoo.cfg” on node 2 (150.20.11.134):
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2187
dataDir=/var/lib/zookeeper
server.1=150.20.11.157:2888:3888
server.2=0.0.0.0:2888:3888
server.3=150.20.11.137:2888:3888
This is “zoo.cfg” on node 3 (150.20.11.137):
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2187
dataDir=/var/lib/zookeeper
server.1=150.20.11.157:2888:3888
server.2=150.20.11.134:2888:3888
server.3=0.0.0.0:2888:3888
Also, I made a “myid” file in “/var/lib/zookeeper” on each node. For example, for “150.20.11.157” the ID in the myid file is “1”.
I installed Mesos and Spark in the Docker image too, and I have a Mesos cluster of these three nodes. I defined the IP addresses of the slave nodes in the file “spark/conf/slaves”:
150.20.11.134
150.20.11.137
I added these lines in “spark/conf/spark-env.sh”:
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=/home/spark/program_file/spark-2.3.2-bin-hadoop2.7.tgz
Moreover, I added these lines in my “~/.bashrc” file:
export SPARK_HOME="/home/spark"
PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHO$
export PYSPARK_HOME=/usr/bin/python3.6
export PYSPARK_DRIVER_PYTHON=python3.6
export ZOO_LOG_DIR=/var/log/zookeeper
I want to run the master on “150.20.11.157”. My docker-compose file is:
version: '3.7'
services:
  zookeeper:
    image: ubuntu_mesos_spark
    command: /zookeeper-3.4.12/bin/zkServer.sh start
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 2187
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 0.0.0.0:2888:3888;150.20.11.134:2888:3888;150.20.11.137:2888:3888
    network_mode: host
    expose:
      - 2187
      - 2888
      - 3888
    ports:
      - 2187:2187
      - 2888:2888
      - 3888:3888
  master:
    image: ubuntu_mesos_spark
    command: bash -c "sleep 20; /home/mesos-1.7.0/build/bin/mesos-master.sh --ip=150.20.11.157 --work_dir=/var/run/mesos"
    restart: always
    depends_on:
      - zookeeper
    environment:
      - MESOS_HOSTNAME="150.20.11.157,150.20.11.134,150.20.11.137"
      - MESOS_QUORUM=1
      - MESOS_LOG_DIR=/var/log/mesos
    expose:
      - 5050
      - 4040
      - 7077
      - 8080
    ports:
      - 5050:5050
      - 4040:4040
      - 7077:7077
      - 8080:8080
Also, I run this compose file on the slave nodes “150.20.11.134” and “150.20.11.137”:
version: '3.7'
services:
  zookeeper:
    image: ubuntu_mesos_spark
    command: /zookeeper-3.4.12/bin/zkServer.sh start
    environment:
      ZOOKEEPER_SERVER_ID: 2
      ZOOKEEPER_CLIENT_PORT: 2187
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 0.0.0.0:2888:3888;150.20.11.134:2888:3888;150.20.11.137:2888:3888
    network_mode: host
    expose:
      - 2187
      - 2888
      - 3888
    ports:
      - 2187:2187
      - 2888:2888
      - 3888:3888
  slave:
    image: ubuntu_mesos_spark
    command: bash -c "/home/mesos-1.7.0/build/bin/mesos-slave.sh --master=150.20.11.157:5050 --work_dir=/var/run/mesos --systemd_enable_support=false"
    restart: always
    privileged: true
    network_mode: host
    depends_on:
      - zookeeper
    environment:
      - MESOS_HOSTNAME="150.20.11.157,150.20.11.134,150.20.11.137"
      - MESOS_MASTER=150.20.11.157
      - MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins # also in Dockerfile
      - MESOS_CONTAINERIZERS=docker,mesos
      - MESOS_LOG_DIR=/var/log/mesos
      - MESOS_LOGGING_LEVEL=INFO
    expose:
      - 5051
    ports:
      - 5051:5051
First I run "sudo docker-compose up" on the master node, then I run it on the slave nodes, but I get errors.
On the master node, the error is:
Starting marzieh-compose_zookeeper_1 ... done
Recreating marzieh-compose_master_1 ... done
Attaching to marzieh-compose_zookeeper_1, marzieh-compose_master_1
zookeeper_1 | ZooKeeper JMX enabled by default
zookeeper_1 | Using config: /zookeeper-3.4.12/bin/../conf/zoo.cfg
zookeeper_1 | Starting zookeeper ... STARTED
marzieh-compose_zookeeper_1 exited with code 0
master_1 | I0123 11:46:59.585522 7 logging.cpp:201] INFO level logging started!
master_1 | I0123 11:46:59.586066 7 main.cpp:242] Build: 2019-01-21 05:16:39 by
master_1 | I0123 11:46:59.586097 7 main.cpp:243] Version: 1.7.0
master_1 | F0123 11:46:59.587368 7 process.cpp:1115] Failed to initialize: Failed to bind on 150.20.11.157:5050: Cannot assign requested address
master_1 | * Check failure stack trace: *
master_1 | # 0x7f505ce54b9c google::LogMessage::Fail()
master_1 | # 0x7f505ce54ae0 google::LogMessage::SendToLog()
master_1 | # 0x7f505ce544b2 google::LogMessage::Flush()
master_1 | # 0x7f505ce57770
google::LogMessageFatal::~LogMessageFatal()
master_1 | # 0x7f505cd19ed1 process::initialize()
master_1 | # 0x55fb7b12981a main
master_1 | # 0x7f504f0d0830 (unknown)
master_1 | # 0x55fb7b1288b9 _start
master_1 | bash: line 1: 7 Aborted (core dumped) /home/mesos-1.7.0/build/bin/mesos-master.sh --ip=150.20.11.157 --work_dir=/var/run/mesos
Moreover, when I run "sudo docker-compose up" on the slave nodes, I get this error:
slave_1 | F0123 11:40:06.878793 1 process.cpp:1115] Failed to initialize: Failed to bind on 0.0.0.0:5051: Address already in use
slave_1 | * Check failure stack trace: *
slave_1 | # 0x7fee9d319b9c google::LogMessage::Fail()
slave_1 | # 0x7fee9d319ae0 google::LogMessage::SendToLog()
slave_1 | # 0x7fee9d3194b2 google::LogMessage::Flush()
slave_1 | # 0x7fee9d31c770
google::LogMessageFatal::~LogMessageFatal()
slave_1 | # 0x7fee9d1deed1 process::initialize()
slave_1 | # 0x55e99f661784 main
slave_1 | # 0x7fee8f595830 (unknown)
slave_1 | # 0x55e99f65f139 _start
slave_1 | * Aborted at 1548243606 (unix time) try "date -d #1548243606" if you are using GNU date *
slave_1 | PC: # 0x7fee8f5ac196 (unknown)
slave_1 | * SIGSEGV (#0x0) received by PID 1 (TID 0x7fee9f9f38c0) from PID 0; stack trace: *
slave_1 | # 0x7fee8fee8390 (unknown)
slave_1 | # 0x7fee8f5ac196 (unknown)
slave_1 | # 0x7fee9d32055b google::DumpStackTraceAndExit()
slave_1 | # 0x7fee9d319b9c google::LogMessage::Fail()
slave_1 | # 0x7fee9d319ae0 google::LogMessage::SendToLog()
slave_1 | # 0x7fee9d3194b2 google::LogMessage::Flush()
slave_1 | # 0x7fee9d31c770 google::LogMessageFatal::~LogMessageFatal()
slave_1 | # 0x7fee9d1deed1 process::initialize()
slave_1 | # 0x55e99f661784 main
slave_1 | # 0x7fee8f595830 (unknown)
slave_1 | # 0x55e99f65f139 _start
slave_1 | I0123 11:41:07.818897 1 logging.cpp:201] INFO level logging started!
slave_1 | I0123 11:41:07.819437 1 main.cpp:349] Build: 2019-01-21 05:16:39 by
slave_1 | I0123 11:41:07.819470 1 main.cpp:350] Version: 1.7.0
slave_1 | I0123 11:41:07.823354 1 resolver.cpp:69] Creating default secret resolver
slave_1 | E0123 11:41:07.927773 1 main.cpp:483] EXIT with status 1: Failed to create a containerizer: Could not create
DockerContainerizer: Failed to create docker: Failed to get docker version: Failed to execute 'docker -H unix:///var/run/docker.sock --
version': exited with status 127
I have searched a lot about this and could not figure it out. Would you please tell me the right way to write docker-compose files for running a Mesos and Spark cluster on Docker?
Any help would be appreciated.
Thanks in advance.
Problem solved. I changed the docker-compose files like this, and the master and slaves run without problems.
The "docker-compose.yaml" on the master node is the following:
version: '3.7'
services:
  zookeeper:
    image: ubuntu_mesos_spark_python3.6_client
    command: /home/zookeeper-3.4.12/bin/zkServer.sh start
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 2188
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 0.0.0.0:2888:3888;150.20.11.157:2888:3888
    network_mode: host
    expose:
      - 2188
      - 2888
      - 3888
    ports:
      - 2188:2188
      - 2888:2888
      - 3888:3888
  master:
    image: ubuntu_mesos_spark_python3.6_client
    command: bash -c "sleep 30; /home/mesos-1.7.0/build/bin/mesos-master.sh --ip=150.20.10.136 --work_dir=/var/run/mesos --hostname=x.x.x.x" ## hostname: IP of the master node
    restart: always
    network_mode: host
    depends_on:
      - zookeeper
    environment:
      - MESOS_HOSTNAME="150.20.11.136"
      - MESOS_QUORUM=1
      - MESOS_LOG_DIR=/var/log/mesos
    expose:
      - 5050
      - 4040
      - 7077
      - 8080
    ports:
      - 5050:5050
      - 4040:4040
      - 7077:7077
      - 8080:8080
Also,"docker-compose.yaml" file in slave node is like this:
version: '3.7'
services:
  zookeeper:
    image: ubuntu_mesos_spark_python3.6_client
    command: /home/zookeeper-3.4.12/bin/zkServer.sh start
    environment:
      ZOOKEEPER_SERVER_ID: 2
      ZOOKEEPER_CLIENT_PORT: 2188
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 10
      ZOOKEEPER_SYNC_LIMIT: 5
      ZOOKEEPER_SERVERS: 150.20.11.136:2888:3888;0.0.0.0:2888:3888
    network_mode: host
    expose:
      - 2188
      - 2888
      - 3888
    ports:
      - 2188:2188
      - 2888:2888
      - 3888:3888
  slave:
    image: ubuntu_mesos_spark_python3.6_client
    command: bash -c "sleep 30; /home/mesos-1.7.0/build/bin/mesos-slave.sh --master=150.20.11.136:5050 --work_dir=/var/run/mesos --systemd_enable_support=false"
    restart: always
    privileged: true
    network_mode: host
    depends_on:
      - zookeeper
    environment:
      - MESOS_HOSTNAME="150.20.11.157"
      #- MESOS_MASTER=172.28.10.136
      #- MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins # also in Dockerfile
      #- MESOS_CONTAINERIZERS=docker,mesos
      - MESOS_LOG_DIR=/var/log/mesos
      - MESOS_LOGGING_LEVEL=INFO
    expose:
      - 5051
    ports:
      - 5051:5051
Then I run "docker-compose up" on each node, and everything runs without any problems.
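To sanity-check the result, something along these lines can be run from the master host (a sketch; the client port and master address are taken from the compose files above, nc is assumed to be available, and the Mesos master's /master/state endpoint is assumed to be reachable on 5050):

echo srvr | nc localhost 2188                                  # ZooKeeper should answer with its mode (leader/follower)
curl -s http://150.20.11.136:5050/master/state | head -c 300   # beginning of the Mesos master state JSON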