SOLVED: I really cannot explain what was wrong. It suddenly started working with the same configuration; maybe the connection was unstable, I really can't say. I'm just happy that my headaches over this issue are gone :)
So, I have a problem deploying Kafka Connect on an external machine. I get this error:
connect | Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
connect | [main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
connect | [main] ERROR io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Brokers found [].
I have spent a lot of time figuring out the issue and found some useful things, like advertised.listeners and how they work, so I have added INTERNAL and EXTERNAL listeners. I now know it is a connection problem somewhere, but I cannot figure out where.
I have tried using kafkacat on the external machine (both Windows and Ubuntu):
kafkacat -b something.b:9092 -L
and I do get a response with a list of brokers, topics, etc. Some of the output:
Metadata for all topics (from broker 3: something.a:9092/3):
3 brokers:
broker 2 at something.b:9092 (controller)
broker 3 at something.c:9092
broker 1 at something.a:9092
It of course gives the same output for all 3 brokers, but when I try to spin up Kafka Connect I get the error mentioned above. I am really out of ideas. Here is the docker-compose code for my Connect service:
connect:
  image: confluentinc/cp-kafka-connect-base:7.0.0
  hostname: connect
  container_name: connect
  ports:
    - 8083:8083
  environment:
    CONNECT_BOOTSTRAP_SERVERS: 'something.b:9092'
    CONNECT_REST_PORT: 8083
    CONNECT_REST_ADVERTISED_HOST_NAME: "connect"
    CONNECT_GROUP_ID: group
    CONNECT_CONFIG_STORAGE_TOPIC: connect-configs
    CONNECT_OFFSET_STORAGE_TOPIC: connect-offsets
    CONNECT_STATUS_STORAGE_TOPIC: connect-status
    CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.converters.ByteArrayConverter"
    CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.converters.ByteArrayConverter"
    CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
    CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
    CONNECT_PLUGIN_PATH: '/usr/share/java'
I am using Kafka version 2.4; could that be the issue? What am I missing?
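For reference, an internal/external listener split like the one described above is usually expressed along these lines in a broker's compose environment (a sketch with placeholder listener names, hostnames, and ports, not the asker's actual broker config):

```yaml
environment:
  # Map each listener name to a security protocol.
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  # Where the broker binds its sockets.
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
  # What the broker tells clients to connect back to; the EXTERNAL host
  # must be resolvable and reachable from the client's machine.
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://something.b:9092
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
```

The key point is that after the bootstrap connection succeeds (as the kafkacat -L output shows it does here), clients switch to the advertised addresses returned in the metadata, so every advertised host must also be reachable from the Connect container.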
This question already has answers here:
Connect to Kafka running in Docker
I'm trying to get a simple Debezium stack (with Docker Compose) running but the connection to the Kafka broker fails.
Here is my simplified docker-compose.yml:
version: "3"
services:
  zookeeper:
    image: debezium/zookeeper:1.6
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
  kafka:
    image: debezium/kafka:1.6
    links:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      BROKER_ID: 1
      KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
      KAFKA_ADVERTISED_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_BOB:PLAINTEXT,LISTENER_FRED:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_BOB
      ZOOKEEPER_CONNECT: zookeeper:2181
As you may notice, I added the listeners according to Kafka Listeners - Explained. But when I use kafkacat -b localhost:9092 -L to retrieve the broker's metadata, the following error appears:
%6|1627623916.693|FAIL|rdkafka#producer-1| [thrd:localhost:9092/bootstrap]: localhost:9092/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 2ms in state APIVERSION_QUERY)
%6|1627623916.961|FAIL|rdkafka#producer-1| [thrd:localhost:9092/bootstrap]: localhost:9092/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 3ms in state APIVERSION_QUERY, 1 identical error(s) suppressed)
% ERROR: Failed to acquire metadata: Local: Broker transport failure
Kafka is not even able to start when I configure it as Debezium suggests here.
2021-07-30 05:48:38,591 - WARN [Controller-1-to-broker-1-send-thread:NetworkClient#780] - [Controller id=1, targetBrokerId=1] Connection to node 1 (/localhost:9092) could not be established. Broker may not be available.
What am I doing wrong?
Thank you for any help in advance!
You've set the listener to be local-only, so it is not reachable from outside the container. Change it to:
KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://0.0.0.0:9092
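With that change, the listener section of the kafka service would read as follows (a sketch; hostnames and listener names are kept from the question):

```yaml
environment:
  BROKER_ID: 1
  # Bind FRED on all interfaces inside the container so the published
  # 9092 port actually reaches a listening socket...
  KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://0.0.0.0:9092
  # ...while still advertising localhost:9092 to host-side clients
  # such as kafkacat.
  KAFKA_ADVERTISED_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_BOB:PLAINTEXT,LISTENER_FRED:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_BOB
  ZOOKEEPER_CONNECT: zookeeper:2181
```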
Trying to run Apache Kafka in docker containers, using the following docker-compose.yml:
version: '3.1'
services:
  zookeeper:
    container_name: zookeeper
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - /home/vagrant/myapp/runtime.local/volumes/zookeeper/01/data:/var/lib/zookeeper/data
      - /home/vagrant/myapp/runtime.local/volumes/zookeeper/01/log:/var/lib/zookeeper/log
    ports:
      - 2181:2181
  kafka:
    container_name: kafka
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - 29092:29092
      - 9092:9092
    volumes:
      - /home/vagrant/myapp/runtime.local/volumes/kafka/01/data:/var/lib/kafka/data
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_NUM_PARTITIONS: 3
      KAFKA_HEAP_OPTS: -Xmx512M -Xms512M
It seems that I can create a topic:
docker exec -it kafka kafka-topics --create --topic test --partitions 1 --replication-factor 1 --if-not-exists --zookeeper zookeeper:2181
Created topic test.
But when I try to send a message via kafka-console-producer, I get the following error:
docker exec -it kafka kafka-console-producer --topic test --bootstrap-server kafka:9092
>My first message
>[2021-07-14 15:39:21,010] WARN [Producer clientId=console-producer] Got error produce response with correlation id 4 on topic-partition test-2, retrying (2 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,011] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition test-2 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,112] WARN [Producer clientId=console-producer] Got error produce response with correlation id 6 on topic-partition test-2, retrying (1 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,112] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition test-2 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,643] WARN [Producer clientId=console-producer] Got error produce response with correlation id 8 on topic-partition test-2, retrying (0 attempts left). Error: NOT_LEADER_OR_FOLLOWER (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,643] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition test-2 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2021-07-14 15:39:21,751] ERROR Error when sending message to topic test with key: null, value: 3 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.
[2021-07-14 15:39:21,753] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition test-2 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
The following is a partial stack trace from the kafka container:
[2021-07-14 15:39:12,232] ERROR [Broker id=1] Error while processing LeaderAndIsr request correlationId 1 received from controller 1 epoch 1 for partition test-1 (state.change.logger)
java.io.IOException: Invalid argument
at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1002)
at kafka.log.AbstractIndex.<init>(AbstractIndex.scala:124)
at kafka.log.OffsetIndex.<init>(OffsetIndex.scala:54)
at kafka.log.LazyIndex$.$anonfun$forOffset$1(LazyIndex.scala:106)
at kafka.log.LazyIndex.$anonfun$get$1(LazyIndex.scala:63)
at kafka.log.LazyIndex.get(LazyIndex.scala:60)
at kafka.log.LogSegment.offsetIndex(LogSegment.scala:64)
at kafka.log.LogSegment.readNextOffset(LogSegment.scala:456)
at kafka.log.Log.$anonfun$recoverLog$6(Log.scala:921)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.scala:17)
at scala.Option.getOrElse(Option.scala:201)
at kafka.log.Log.recoverLog(Log.scala:921)
at kafka.log.Log.$anonfun$loadSegments$3(Log.scala:801)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.scala:17)
at kafka.log.Log.retryOnOffsetOverflow(Log.scala:2465)
at kafka.log.Log.loadSegments(Log.scala:801)
at kafka.log.Log.<init>(Log.scala:328)
at kafka.log.Log$.apply(Log.scala:2601)
at kafka.log.LogManager.$anonfun$getOrCreateLog$1(LogManager.scala:830)
at scala.Option.getOrElse(Option.scala:201)
at kafka.log.LogManager.getOrCreateLog(LogManager.scala:783)
at kafka.cluster.Partition.createLog(Partition.scala:344)
at kafka.cluster.Partition.createLogIfNotExists(Partition.scala:324)
at kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:564)
at kafka.cluster.Partition.makeLeader(Partition.scala:548)
at kafka.server.ReplicaManager.$anonfun$makeLeaders$5(ReplicaManager.scala:1568)
at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
at scala.collection.mutable.HashMap$Node.foreachEntry(HashMap.scala:633)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:499)
at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:1566)
at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:1411)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:258)
at kafka.server.KafkaApis.handle(KafkaApis.scala:171)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:74)
at java.base/java.lang.Thread.run(Thread.java:829)
Am I doing something wrong?
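One thing worth noting about the compose file above, independent of the stack trace: it publishes both 29092 and 9092, but defines only a single advertised listener. A common two-listener sketch for cp-kafka (listener names and hosts here are illustrative, and this may be unrelated to the IOException itself) looks like:

```yaml
environment:
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  # One listener for other containers, one for host-side clients.
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:29092
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL://localhost:29092
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
```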
I've set up a Zookeeper-ensemble (version 3.4.9) with 3 instances. This works like a charm on the test-system, but doesn't come up on the live-system at all. The error message is the following:
2020-08-28 06:26:24,643 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager#400] - Cannot open channel to 2 at election address /10.3.1.173:3888
java.net.NoRouteToHostException: Host is unreachable (Host unreachable)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
at java.lang.Thread.run(Thread.java:745)
I've searched here and in other places, but the only accepted solution to the problem is to set each node's own server address to 0.0.0.0, which doesn't work here. My setup is fully dockerized and deployed with Ansible, so it might look a bit different from what people normally do. But the connection string, e.g. for server.1, is this:
"server.1=0.0.0.0:2888:3888 server.2=10.3.1.173:2888:3888 server.3=10.3.1.175:2888:3888"
which is also applied to ZooKeeper's internal configuration, as the logs show (again for server.1):
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2020-08-28 06:26:23,549 [myid:] - INFO [main:QuorumPeerConfig#124] - Reading configuration from: /conf/zoo.cfg
2020-08-28 06:26:23,559 [myid:] - INFO [main:QuorumPeer$QuorumServer#149] - Resolved hostname: 10.3.1.175 to address: /10.3.1.175
2020-08-28 06:26:23,559 [myid:] - INFO [main:QuorumPeer$QuorumServer#149] - Resolved hostname: 10.3.1.173 to address: /10.3.1.173
2020-08-28 06:26:23,560 [myid:] - INFO [main:QuorumPeer$QuorumServer#149] - Resolved hostname: 0.0.0.0 to address: /0.0.0.0
2020-08-28 06:26:23,560 [myid:] - INFO [main:QuorumPeerConfig#352] - Defaulting to majority quorums
(...)
2020-08-28 06:26:23,570 [myid:1] - INFO [main:QuorumPeerMain#127] - Starting quorum peer
2020-08-28 06:26:23,577 [myid:1] - INFO [main:Login#294] - successfully logged in.
2020-08-28 06:26:23,579 [myid:1] - INFO [main:NIOServerCnxnFactory#89] - binding to port 0.0.0.0/0.0.0.0:2181
This is applied to all 3 instances of ZooKeeper, but none of them can talk to the others.
Additional information:
Apart from IP-addresses for the servers, the configuration is identical to the test-system. The Ansible Docker module is configured the same, the JAAS-Config (with DigestLoginModule) is the same, and the environment variables inside of all docker containers are the same, too.
Each server inside the live system can ping the other servers. I can also ping these servers from inside each Zookeeper container. In addition, I can curl each Zookeeper container on the JMX-port from inside any other container of the live-system. So they definitely can connect over the network.
Please help, thanks :D
Edit: @Stefano asked how I start the docker containers, so I'll try to provide some insight. As mentioned, it's an Ansible setup: a task using the "docker_container" plugin, which is used in a playbook to install the 3 instances across machines:
---
- name: Install Zookeeper
  docker_container:
    name: zookeeper
    image: zookeeper:3.4.9
    state: started
    ports:
      - "2181:2181" # Zookeeper port
      - "2888:2888"
      - "3888:3888" # election ports
      - "9998:8080" # JMX metrics
    env:
      ZOO_MY_ID: "{{ ID }}" # this is 1 for server.1, etc.
      ZOO_PORT: "2181"
      ZOO_SERVERS: "{{ ZOO_SERVERS }}" # provided in host vars
      SERVER_JVMFLAGS: "-Djava.security.auth.login.config=/etc/kafka/zookeeper_jaas.conf -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent-0.12.0.jar=8080:/opt/jmx-exporter/zookeeper.yml"
    volumes:
      - /home/ansible/volumes/zoo1/data:/data
      - /home/ansible/volumes/zoo1/datalog:/datalog
      - /home/ansible/jmx-exporter:/opt/jmx-exporter
      - /home/ansible/zookeeper_jaas.conf:/etc/kafka/zookeeper_jaas.conf
The ZOO_SERVERS are taken from the hosts file:
all:
  (...)
  children:
    zookeeper:
      hosts:
        zoo1:
          ID: "1"
          ZOO_SERVERS: "server.1=0.0.0.0:2888:3888 server.2=10.3.1.173:2888:3888 server.3=10.3.1.175:2888:3888"
          ansible_host: 10.3.1.171
        zoo2:
          ID: "2"
          ZOO_SERVERS: "server.1=10.3.1.171:2888:3888 server.2=0.0.0.0:2888:3888 server.3=10.3.1.175:2888:3888"
          ansible_host: 10.3.1.173
        zoo3:
          ID: "3"
          ZOO_SERVERS: "server.1=10.3.1.171:2888:3888 server.2=10.3.1.173:2888:3888 server.3=0.0.0.0:2888:3888"
          ansible_host: 10.3.1.175
So when I read back what I commented above, I noticed that I was not actually using the "confluentinc/cp-zookeeper" Docker image, but the "zookeeper" Docker image.
Once I changed from "zookeeper:3.4.9" to "confluentinc/cp-zookeeper:5.4.0" and renamed the ZOO_PORT env var to ZOOKEEPER_CLIENT_PORT, it somehow worked.
This doesn't answer the "why", but maybe this workaround helps someone else. I'll mark this as the accepted answer for now, but please feel free to provide additional insight.
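Spelled out, the changes described above amount to roughly this in the Ansible task (a sketch of what the answer reports worked, with other keys unchanged; note that the cp-zookeeper image normally expects its own ZOOKEEPER_*-prefixed variables, so verify against that image's documentation):

```yaml
docker_container:
  name: zookeeper
  image: confluentinc/cp-zookeeper:5.4.0   # was: zookeeper:3.4.9
  env:
    ZOO_MY_ID: "{{ ID }}"
    ZOOKEEPER_CLIENT_PORT: "2181"          # was: ZOO_PORT
    ZOO_SERVERS: "{{ ZOO_SERVERS }}"
```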
I have some organizations with more than 2 peers. While editing docker-compose-base.yaml, I am not sure how to define CORE_PEER_GOSSIP_BOOTSTRAP. Below is what I did, but the log shows that the peer fails to connect to the gossip peers. What is the correct way to do this? Thank you in advance!
docker-compose-base.yaml
peer0.caseManager.snts.com:
  container_name: peer0.caseManager.snts.com
  extends:
    file: peer-base.yaml
    service: peer-base
  environment:
    - CORE_PEER_ID=peer0.caseManager.snts.com
    - CORE_PEER_ADDRESS=peer0.caseManager.snts.com:7051
    - CORE_PEER_GOSSIP_BOOTSTRAP=[peer1.caseManager.snts.com:7051 peer2.caseManager.snts.com:7051]
    - CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer0.caseManager.snts.com:7051
    - CORE_PEER_LOCALMSPID=CaseManagerMSP
  volumes:
    - /var/run/:/host/var/run/
    - ../crypto-config/peerOrganizations/caseManager.snts.com/peers/peer0.caseManager.snts.com/msp:/etc/hyperledger/fabric/msp
    - ../crypto-config/peerOrganizations/caseManager.snts.com/peers/peer0.caseManager.snts.com/tls:/etc/hyperledger/fabric/tls
    - peer0.caseManager.snts.com:/var/hyperledger/production
  ports:
    - 9051:7051
    - 9053:7053
log of "docker-compose -p docker-compose.yaml up"
peer0.caseManager.snts.com | 2018-11-15 16:21:18.420 UTC [gossip/discovery] func1 -> WARN 023 Could not connect to {peer2.caseManager.snts.com:7051] [] [] peer2.caseManager.snts.com:7051] <nil> <nil>} : context deadline exceeded
peer0.caseManager.snts.com | 2018-11-15 16:21:18.420 UTC [gossip/discovery] func1 -> WARN 024 Could not connect to {[peer1.caseManager.snts.com:7051 [] [] [peer1.caseManager.snts.com:7051 <nil> <nil>} : context deadline exceeded
From a peer's perspective, the bootstrap peer is another peer from the same organization that it can reach out to during startup to get the information needed to get communication going (see here).
Your setup is conceptually correct, and it's perfectly plausible that your peer0 started up earlier than peer1 and peer2 and was unable to find them during startup; that's not out of the ordinary. Did you end up getting any errors beyond these warnings? If not, this looks like normal operation.
That said, CORE_PEER_GOSSIP_BOOTSTRAP should be a plain space-separated list, without the brackets:
- CORE_PEER_GOSSIP_BOOTSTRAP=peer1.caseManager.snts.com:7051 peer2.caseManager.snts.com:7051
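In context, the peer's environment section would then read as follows (only the bootstrap line differs from the question's file):

```yaml
environment:
  - CORE_PEER_ID=peer0.caseManager.snts.com
  - CORE_PEER_ADDRESS=peer0.caseManager.snts.com:7051
  # Space-separated list of endpoints, with no surrounding brackets:
  - CORE_PEER_GOSSIP_BOOTSTRAP=peer1.caseManager.snts.com:7051 peer2.caseManager.snts.com:7051
  - CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer0.caseManager.snts.com:7051
  - CORE_PEER_LOCALMSPID=CaseManagerMSP
```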
I'm testing a sample Spring Cloud Stream application (running on an Ubuntu Linux machine) with one source and one sink service. All my services are docker-containerized and I would like to use Kafka as the message broker.
Below the relevant parts of the docker-compose.yml:
zookeeper:
  image: confluent/zookeeper
  container_name: zookeeper
  ports:
    - "2181:2181"
kafka:
  image: wurstmeister/kafka:0.9.0.0-1
  container_name: kafka
  ports:
    - "9092:9092"
  links:
    - zookeeper:zk
  environment:
    - KAFKA_ADVERTISED_HOST_NAME=192.168.33.101
    - KAFKA_ADVERTISED_PORT=9092
    - KAFKA_DELETE_TOPIC_ENABLE=true
    - KAFKA_LOG_RETENTION_HOURS=1
    - KAFKA_MESSAGE_MAX_BYTES=10000000
    - KAFKA_REPLICA_FETCH_MAX_BYTES=10000000
    - KAFKA_GROUP_MAX_SESSION_TIMEOUT_MS=60000
    - KAFKA_NUM_PARTITIONS=2
    - KAFKA_DELETE_RETENTION_MS=1000
.
.
.
# not shown: eureka service registry, spring cloud config service, etc.
myapp-service-test-source:
  container_name: myapp-service-test-source
  image: myapp-h2020/myapp-service-test-source:0.0.1
  environment:
    SERVICE_REGISTRY_HOST: 192.168.33.101
    SERVICE_REGISTRY_PORT: 8761
  ports:
    - 8081:8080
.
.
.
Here the relevant part of application.yml for my service-test-source service:
spring:
  cloud:
    stream:
      defaultBinder: kafka
      bindings:
        output:
          destination: messages
          content-type: application/json
      kafka:
        binder:
          brokers: ${SERVICE_REGISTRY_HOST:192.168.33.101}
          zkNodes: ${SERVICE_REGISTRY_HOST:192.168.33.101}
          defaultZkPort: 2181
          defaultBrokerPort: 9092
The problem is the following: if I launch the docker-compose above, the test-source container log shows that the service fails to connect to ZooKeeper, with a repeated series of Connection refused errors, finishing with a ZkTimeoutException which makes the service terminate (see below).
The strange fact is that if, instead of running my source (and sink) test services as Docker containers, I run them as jar files via mvn spring-boot:run <etc...>, the services work fine and are able to exchange messages via Kafka (note that Kafka, ZooKeeper, etc. are still running as Docker containers).
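One way to rule out property-resolution issues inside the container is to override the binder settings through the service's environment; Spring Boot's relaxed binding maps uppercase, underscore-separated variable names onto the same properties as application.yml. A sketch (the SPRING_CLOUD_STREAM_* names are illustrative additions, not taken from the original compose file):

```yaml
myapp-service-test-source:
  container_name: myapp-service-test-source
  image: myapp-h2020/myapp-service-test-source:0.0.1
  environment:
    SERVICE_REGISTRY_HOST: 192.168.33.101
    SERVICE_REGISTRY_PORT: 8761
    # Relaxed binding: these override spring.cloud.stream.kafka.binder.*
    SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS: 192.168.33.101:9092
    SPRING_CLOUD_STREAM_KAFKA_BINDER_ZKNODES: 192.168.33.101:2181
```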
.
.
.
*** THE FOLLOWING REPEATED n TIMES ***
2017-02-14 14:40:09.164 INFO 1 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-14 14:40:09.166 WARN 1 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_111]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_111]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar!/:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[zookeeper-3.4.6.jar!/:3.4.6-1569965]
.
.
.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:53)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.context.ApplicationContextException: Failed to start bean 'outputBindingLifecycle'; nested exception is org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
Any idea what the problem might be?
Edit:
I discovered that in the "jar" execution logs the test-source service tries to connect to ZooKeeper through the IP 127.0.0.1, as can be seen from the log snippet below:
2017-02-15 14:24:04.159 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:04.159 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:04.178 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-02-15 14:24:04.201 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15a421fd9ec000a, negotiated timeout = 10000
2017-02-15 14:24:05.870 INFO 10348 --- [ main] org.apache.zookeeper.ZooKeeper : Initiating client connection, connectString=localhost:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient#72ba68e3
2017-02-15 14:24:05.882 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:05.883 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
This explains why everything works in the jar execution but not in the Docker one (the ZooKeeper container exports its 2181 port to the host machine, so it's visible as localhost to a service process running directly on the host), but it doesn't solve the problem: apparently the Spring Cloud Stream Kafka configuration is ignoring the property spring.cloud.stream.kafka.binder.zkNodes as set in application.yml. (Note that if I log the value of that environment variable from the service, I see the correct value of 192.168.33.101 that I hardcoded there for debugging purposes.)
You have set the defaultBinder to be rabbit while trying to use the Kafka binder configuration. Do you have both rabbit and kafka binders in the classpath of your application? In that case, you can enable here
zookeeper:
  image: wurstmeister/zookeeper
  container_name: 'zookeeper'
  ports:
    - 2181:2181
# --------------------- kafka --------------------------------
kafka:
  image: wurstmeister/kafka
  container_name: 'kafka'
  environment:
    - KAFKA_ADVERTISED_HOST_NAME=kafka
    - KAFKA_ADVERTISED_PORT=9092
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_CREATE_TOPICS=kafka_docker_topic:1:1
  ports:
    - 9092:9092
  depends_on:
    - zookeeper

spring:
  profiles: dev
  cloud:
    stream:
      defaultBinder: kafka
      kafka:
        binder:
          brokers: kafka # I added the brokers and zkNodes properties
          zkNodes: zookeeper
      bindings:
        input:
          destination: message
          content-type: application/json