I've set up a ZooKeeper ensemble (version 3.4.9) with 3 instances. It works like a charm on the test system, but doesn't come up on the live system at all. The error message is the following:
2020-08-28 06:26:24,643 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@400] - Cannot open channel to 2 at election address /10.3.1.173:3888
java.net.NoRouteToHostException: Host is unreachable (Host unreachable)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
at java.lang.Thread.run(Thread.java:745)
I've searched here and in other places, but the only accepted solution to the problem is to set each node's own server address to 0.0.0.0, which doesn't work here. My setup is fully dockerized and deployed with Ansible, so it might look a bit different from what people normally do. The connection string, e.g. for server.1, is this:
"server.1=0.0.0.0:2888:3888 server.2=10.3.1.173:2888:3888 server.3=10.3.1.175:2888:3888"
which is also applied to ZooKeeper's internal configuration, as the logs show (again for server.1):
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2020-08-28 06:26:23,549 [myid:] - INFO [main:QuorumPeerConfig@124] - Reading configuration from: /conf/zoo.cfg
2020-08-28 06:26:23,559 [myid:] - INFO [main:QuorumPeer$QuorumServer@149] - Resolved hostname: 10.3.1.175 to address: /10.3.1.175
2020-08-28 06:26:23,559 [myid:] - INFO [main:QuorumPeer$QuorumServer@149] - Resolved hostname: 10.3.1.173 to address: /10.3.1.173
2020-08-28 06:26:23,560 [myid:] - INFO [main:QuorumPeer$QuorumServer@149] - Resolved hostname: 0.0.0.0 to address: /0.0.0.0
2020-08-28 06:26:23,560 [myid:] - INFO [main:QuorumPeerConfig@352] - Defaulting to majority quorums
(...)
2020-08-28 06:26:23,570 [myid:1] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
2020-08-28 06:26:23,577 [myid:1] - INFO [main:Login@294] - successfully logged in.
2020-08-28 06:26:23,579 [myid:1] - INFO [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181
This is applied to all 3 instances of ZooKeeper, but none of them can talk to the others.
Additional information:
Apart from the IP addresses of the servers, the configuration is identical to the test system. The Ansible Docker module is configured the same, the JAAS config (with DigestLoginModule) is the same, and the environment variables inside all Docker containers are the same, too.
Each server inside the live system can ping the other servers. I can also ping these servers from inside each ZooKeeper container. In addition, I can curl each ZooKeeper container on the JMX port from inside any other container of the live system. So they can definitely connect over the network.
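For completeness, the quorum ports themselves can be probed directly (a quick sketch using netcat, assuming nc is available inside the containers; IPs as in the config above):

# from inside the server.1 container
nc -zv 10.3.1.173 2888   # quorum port of server.2
nc -zv 10.3.1.173 3888   # election port of server.2
nc -zv 10.3.1.175 2888   # quorum port of server.3
nc -zv 10.3.1.175 3888   # election port of server.3

Ping and the JMX port being reachable don't prove that 2888/3888 are open; a firewall rejecting those ports with an ICMP "host unreachable" would produce exactly the NoRouteToHostException above.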
Please help, thanks :D
Edit: @Stefano was asking how I start the Docker containers, so I'll try to provide some insight. As mentioned, it's an Ansible setup; a task using the "docker_container" module is used in a playbook to install the 3 instances across machines:
---
- name: Install Zookeeper
  docker_container:
    name: zookeeper
    image: zookeeper:3.4.9
    state: started
    ports:
      - "2181:2181" # Zookeeper port
      - "2888:2888"
      - "3888:3888" # election ports
      - "9998:8080" # JMX metrics
    env:
      ZOO_MY_ID: "{{ ID }}" # this is 1 for server.1, etc.
      ZOO_PORT: "2181"
      ZOO_SERVERS: "{{ ZOO_SERVERS }}" # provided in host vars
      SERVER_JVMFLAGS: "-Djava.security.auth.login.config=/etc/kafka/zookeeper_jaas.conf -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent-0.12.0.jar=8080:/opt/jmx-exporter/zookeeper.yml"
    volumes:
      - /home/ansible/volumes/zoo1/data:/data
      - /home/ansible/volumes/zoo1/datalog:/datalog
      - /home/ansible/jmx-exporter:/opt/jmx-exporter
      - /home/ansible/zookeeper_jaas.conf:/etc/kafka/zookeeper_jaas.conf
The ZOO_SERVERS are taken from the hosts file:
all:
  (...)
  children:
    zookeeper:
      hosts:
        zoo1:
          ID: "1"
          ZOO_SERVERS: "server.1=0.0.0.0:2888:3888 server.2=10.3.1.173:2888:3888 server.3=10.3.1.175:2888:3888"
          ansible_host: 10.3.1.171
        zoo2:
          ID: "2"
          ZOO_SERVERS: "server.1=10.3.1.171:2888:3888 server.2=0.0.0.0:2888:3888 server.3=10.3.1.175:2888:3888"
          ansible_host: 10.3.1.173
        zoo3:
          ID: "3"
          ZOO_SERVERS: "server.1=10.3.1.171:2888:3888 server.2=10.3.1.173:2888:3888 server.3=0.0.0.0:2888:3888"
          ansible_host: 10.3.1.175
So when I read back what I commented above, I noticed that I am not actually using the "confluentinc/cp-zookeeper" Docker image, but the "zookeeper" Docker image.
Once I changed from "zookeeper:3.4.9" to "confluentinc/cp-zookeeper:5.4.0" and renamed the ZOO_PORT env var to ZOOKEEPER_CLIENT_PORT, it somehow worked.
This doesn't answer the "why", but maybe this workaround helps someone else. I'll mark this as the accepted answer for now, but please feel free to provide additional insight.
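For reference, a rough sketch of the adjusted task. Only the image tag and the renamed env var are intended changes; note that the Confluent image officially documents ZOOKEEPER_SERVER_ID and ZOOKEEPER_SERVERS, so keeping the ZOO_* names here rests solely on the observation above that it worked:

- name: Install Zookeeper
  docker_container:
    name: zookeeper
    image: confluentinc/cp-zookeeper:5.4.0
    state: started
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    env:
      ZOO_MY_ID: "{{ ID }}"
      ZOOKEEPER_CLIENT_PORT: "2181" # renamed from ZOO_PORT
      ZOO_SERVERS: "{{ ZOO_SERVERS }}"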
Related
I need help! (who would have thought, right? lol)
I have a job interview in a few days and it would mean the world to me to be well prepared for it and have some working examples.
I am trying to set up an ELK pipeline to stream data from Kafka through Logstash into Elasticsearch and finally read it from Kibana. The usual.
I am making use of containers, but the Logstash - Elasticsearch duo is giving me an aneurysm.
Everything else works perfectly fine. I've checked the logs from Kafka and it is working just fine. Kibana is connected to Elasticsearch just fine as well. But Logstash and ES really don't want to match.
Here is the setup
docker-compose.yml
version: '3.6'
services:
  elasticsearch:
    image: elasticsearch:8.6.0
    container_name: elasticsearch
    #restart: always
    volumes:
      - elastic_data:/usr/share/elasticsearch/data/
    environment:
      cluster.name: elf-kafka-cluster
      ES_JAVA_OPTS: "-Xmx256m -Xms256m"
      discovery.type: single-node
      xpack.security.enabled: false
    ports:
      - '9200:9200'
      - '9300:9300'
    networks:
      - elk
  kibana:
    image: kibana:8.6.0
    container_name: kibana
    #restart: always
    ports:
      - '5601:5601'
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch
    networks:
      - elk
  logstash:
    image: logstash:8.6.0
    container_name: logstash
    #restart: always
    volumes:
      - type: bind
        source: ./logstash_pipeline/
        target: /usr/share/logstash/pipeline
        read_only: true
    command: logstash -f /home/ettore/Documenti/Portfolio/ELK/logstash/logstash.conf
    depends_on:
      - elasticsearch
    ports:
      - '9600:9600'
    environment:
      xpack.monitoring.enabled: true
      # LS_JAVA_OPTS: "-Xmx256m -Xms256m"
    links:
      - elasticsearch
    networks:
      - elk
volumes:
  elastic_data: {}
networks:
  elk:
    driver: bridge
logstash.conf
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["topic"]
  }
}
output {
  elasitcsearch {
    hosts => ["http://localhost:9200"]
    index => "topic"
    workers => 1
  }
}
These are the Logstash error logs when I compose up:
logstash | [2023-01-17T13:59:02,680][WARN ][deprecation.logstash.monitoringextension.pipelineregisterhook] Internal collectors option for Logstash monitoring is deprecated and targeted for removal in the next major version.
logstash | Please configure Metricbeat to monitor Logstash. Documentation can be found at:
logstash | https://www.elastic.co/guide/en/logstash/current/monitoring-with-metricbeat.html
logstash | [2023-01-17T13:59:04,711][INFO ][logstash.licensechecker.licensereader] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
logstash | [2023-01-17T13:59:05,373][INFO ][logstash.licensechecker.licensereader] Failed to perform request {:message=>"Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused", :exception=>Manticore::SocketException, :cause=>#<Java::OrgApacheHttpConn::HttpHostConnectException: Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused>}
logstash | [2023-01-17T13:59:05,379][WARN ][logstash.licensechecker.licensereader] Attempted to resurrect connection to dead ES instance, but got an error {:url=>"http://elasticsearch:9200/", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :message=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::SocketException] Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused"}
logstash | [2023-01-17T13:59:05,436][INFO ][logstash.licensechecker.licensereader] Failed to perform request {:message=>"Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused", :exception=>Manticore::SocketException, :cause=>#<Java::OrgApacheHttpConn::HttpHostConnectException: Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused>}
logstash | [2023-01-17T13:59:05,444][WARN ][logstash.licensechecker.licensereader] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://elasticsearch:9200/_xpack][Manticore::SocketException] Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused {:url=>http://elasticsearch:9200/, :error_message=>"Elasticsearch Unreachable: [http://elasticsearch:9200/_xpack][Manticore::SocketException] Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
logstash | [2023-01-17T13:59:05,449][WARN ][logstash.licensechecker.licensereader] Attempt to validate Elasticsearch license failed. Sleeping for 0.02 {:fail_count=>1, :exception=>"Elasticsearch Unreachable: [http://elasticsearch:9200/_xpack][Manticore::SocketException] Connect to elasticsearch:9200 [elasticsearch/172.20.0.2] failed: Connection refused"}
logstash | [2023-01-17T13:59:05,477][ERROR][logstash.licensechecker.licensereader] Unable to retrieve license information from license server {:message=>"No Available connections"}
logstash | [2023-01-17T13:59:05,567][ERROR][logstash.monitoring.internalpipelinesource] Failed to fetch X-Pack information from Elasticsearch. This is likely due to failure to reach a live Elasticsearch cluster.
logstash | [2023-01-17T13:59:05,661][INFO ][logstash.config.source.local.configpathloader] No config files found in path {:path=>"/home/ettore/Documenti/Portfolio/ELK/logstash/logstash.conf"}
logstash | [2023-01-17T13:59:05,664][ERROR][logstash.config.sourceloader] No configuration found in the configured sources.
logstash | [2023-01-17T13:59:06,333][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
logstash | [2023-01-17T13:59:06,411][INFO ][logstash.runner ] Logstash shut down.
logstash | [2023-01-17T13:59:06,419][FATAL][org.logstash.Logstash ] Logstash stopped processing because of an error: (SystemExit) exit
logstash | org.jruby.exceptions.SystemExit: (SystemExit) exit
logstash | at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:790) ~[jruby.jar:?]
logstash | at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:753) ~[jruby.jar:?]
logstash | at usr.share.logstash.lib.bootstrap.environment.<main>(/usr/share/logstash/lib/bootstrap/environment.rb:91) ~[?:?]
and this is to prove that everything is working as intended with ES (or so it seems):
netstat -an | grep 9200
tcp 0 0 0.0.0.0:9200 0.0.0.0:* LISTEN
tcp6 0 0 :::9200 :::* LISTEN
unix 3 [ ] STREAM CONNECTED 49200
I've looked through everything, and this is 100% not a duplicate because I have tried it all. I really can't figure it out. Hope anyone can help.
Thank you for your time.
You should set up a logstash.yml.
Create a logstash.yml with the values below (inside the Compose network, Elasticsearch is reachable under its service name, not localhost):
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
In your docker-compose.yml, add another volume to the Logstash container as shown below:
./logstash.yml:/usr/share/logstash/config/logstash.yml
Additionally, it's good to run with a restart condition.
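Put together, the Logstash service block would look roughly like this (a sketch: the pipeline bind mount is kept from the question, and the command override is dropped so Logstash falls back to the mounted /usr/share/logstash/pipeline directory, since the original command pointed at a host path that doesn't exist inside the container):

logstash:
  image: logstash:8.6.0
  container_name: logstash
  restart: on-failure
  volumes:
    - ./logstash.yml:/usr/share/logstash/config/logstash.yml
    - type: bind
      source: ./logstash_pipeline/
      target: /usr/share/logstash/pipeline
      read_only: true
  ports:
    - '9600:9600'
  depends_on:
    - elasticsearch
  networks:
    - elk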
SOLVED: and I really can't explain what was wrong... it suddenly works with the same configuration. Maybe the connection was unstable, I really can't say. Just happy that my headaches over this issue are gone :)
So, I have this problem with deploying Kafka Connect on an external machine.
I get this error:
connect | Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
connect | [main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
connect | [main] ERROR io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Brokers found [].
I have spent a lot of time figuring out the issue and did find some useful things, like advertised.listeners and how they work, so I have added INTERNAL and EXTERNAL listeners. I now know it is a connection problem somewhere, but cannot figure out where.
I have tried using kafkacat on the external machine (tried both Windows and Ubuntu):
kafkacat -b something.b:9092 -L
and I do get a response with a list of brokers, topics, etc.
Some of the output:
Metadata for all topics (from broker 3: something.a:9092/3):
3 brokers:
broker 2 at something.b:9092 (controller)
broker 3 at something.c:9092
broker 1 at something.a:9092
and it of course gives the same output with all 3 of the brokers.
But when I try to spin up Kafka Connect, I get the above-mentioned error.
I am really out of ideas..... Here is the docker-compose code for my Connect service:
connect:
  image: confluentinc/cp-kafka-connect-base:7.0.0
  hostname: connect
  container_name: connect
  ports:
    - 8083:8083
  environment:
    CONNECT_BOOTSTRAP_SERVERS: 'something.b:9092'
    CONNECT_REST_PORT: 8083
    CONNECT_REST_ADVERTISED_HOST_NAME: "connect"
    CONNECT_GROUP_ID: group
    CONNECT_CONFIG_STORAGE_TOPIC: connect-configs
    CONNECT_OFFSET_STORAGE_TOPIC: connect-offsets
    CONNECT_STATUS_STORAGE_TOPIC: connect-status
    CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
    CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.converters.ByteArrayConverter"
    CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.converters.ByteArrayConverter"
    CONNECT_LOG4J_ROOT_LOGLEVEL: "INFO"
    CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
    CONNECT_PLUGIN_PATH: '/usr/share/java'
I am using Kafka version 2.4, maybe an issue there? Or what am I missing.....
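For context, the INTERNAL/EXTERNAL listener split mentioned above usually looks something like the sketch below on the broker side. This is an assumption for illustration: the question doesn't show the broker config, so the internal port 29092 and the hostname kafka are made up, while something.b is the external hostname from the kafkacat output.

KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://something.b:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL

When Connect runs on the external machine, its CONNECT_BOOTSTRAP_SERVERS has to point at an address the advertised EXTERNAL listener actually returns, which is exactly what kafkacat -L verifies.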
This question already has answers here:
Connect to Kafka running in Docker
(5 answers)
Closed 1 year ago.
I'm trying to get a simple Debezium stack (with Docker Compose) running but the connection to the Kafka broker fails.
Here is my simplified docker-compose.yml:
version: "3"
services:
zookeeper:
image: debezium/zookeeper:1.6
ports:
- "2181:2181"
- "2888:2888"
- "3888:3888"
kafka:
image: debezium/kafka:1.6
links:
- zookeeper
ports:
- "9092:9092"
environment:
BROKER_ID: 1
KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
KAFKA_ADVERTISED_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_BOB:PLAINTEXT,LISTENER_FRED:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_BOB
ZOOKEEPER_CONNECT: zookeeper:2181
As you may notice, I added the listeners according to Kafka Listeners - Explained. But when I use kafkacat -b localhost:9092 -L to retrieve the broker's metadata, the following error appears:
%6|1627623916.693|FAIL|rdkafka#producer-1| [thrd:localhost:9092/bootstrap]: localhost:9092/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 2ms in state APIVERSION_QUERY)
%6|1627623916.961|FAIL|rdkafka#producer-1| [thrd:localhost:9092/bootstrap]: localhost:9092/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 3ms in state APIVERSION_QUERY, 1 identical error(s) suppressed)
% ERROR: Failed to acquire metadata: Local: Broker transport failure
Kafka is not even able to start when I configure it as Debezium suggests here.
2021-07-30 05:48:38,591 - WARN [Controller-1-to-broker-1-send-thread:NetworkClient@780] - [Controller id=1, targetBrokerId=1] Connection to node 1 (/localhost:9092) could not be established. Broker may not be available.
What am I doing wrong?
Thank you for any help in advance!
You've set the LISTENER_FRED listener to bind only to localhost inside the container, so connections arriving through the published port never reach it. KAFKA_LISTENERS controls which interfaces the broker binds to, while KAFKA_ADVERTISED_LISTENERS is what gets returned to clients.
Change to:
KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://0.0.0.0:9092
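The full environment block would then look like this (same values as in the question, with only the bind address of LISTENER_FRED changed):

KAFKA_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://0.0.0.0:9092
KAFKA_ADVERTISED_LISTENERS: LISTENER_BOB://kafka:29092,LISTENER_FRED://localhost:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_BOB:PLAINTEXT,LISTENER_FRED:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_BOB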
I am trying to capture syslog messages sent over the network using rsyslog, and then have rsyslog capture, transform and send these messages to elasticsearch.
I found a nice article on the configuration at https://www.reddit.com/r/devops/comments/9g1nts/rsyslog_elasticsearch_logging/
The problem is that rsyslog keeps reporting an error at startup saying it cannot connect to Elasticsearch on the same machine on port 9200. The error I get is:
Failed to connect to localhost port 9200: Connection refused
2020-03-20T12:57:51.610444+00:00 53fd9e2560d9 rsyslogd: [origin software="rsyslogd" swVersion="8.36.0" x-pid="1" x-info="http://www.rsyslog.com"] start
rsyslogd: omelasticsearch: we are suspending ourselfs due to server failure 7: Failed to connect to localhost port 9200: Connection refused [v8.36.0 try http://www.rsyslog.com/e/2007 ]
Can anyone help with this?
Everything is running in Docker on a single machine. I use the docker-compose file below to start the stack.
version: "3"
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
environment:
- discovery.type=single-node
- xpack.security.enabled=false
ports:
- 9200:9200
networks:
- logging-network
kibana:
image: docker.elastic.co/kibana/kibana:7.6.1
depends_on:
- logstash
ports:
- 5601:5601
networks:
- logging-network
rsyslog:
image: rsyslog/syslog_appliance_alpine:8.36.0-3.7
environment:
- TZ=UTC
- xpack.security.enabled=false
ports:
- 514:514/tcp
- 514:514/udp
volumes:
- ./rsyslog.conf:/etc/rsyslog.conf:ro
- rsyslog-work:/work
- rsyslog-logs:/logs
volumes:
rsyslog-work:
rsyslog-logs:
networks:
logging-network:
driver: bridge
rsyslog.conf file below:
global(processInternalMessages="on")
#module(load="imtcp" StreamDriver.AuthMode="anon" StreamDriver.Mode="1")
module(load="impstats") # config.enabled=`echo $ENABLE_STATISTICS`)
module(load="imrelp")
module(load="imptcp")
module(load="imudp" TimeRequery="500")
module(load="omstdout")
module(load="omelasticsearch")
module(load="mmjsonparse")
module(load="mmutf8fix")
input(type="imptcp" port="514")
input(type="imudp" port="514")
input(type="imrelp" port="1601")
# includes done explicitly
include(file="/etc/rsyslog.conf.d/log_to_logsene.conf" config.enabled=`echo $ENABLE_LOGSENE`)
include(file="/etc/rsyslog.conf.d/log_to_files.conf" config.enabled=`echo $ENABLE_LOGFILES`)
#try to parse a structured log
action(type="mmjsonparse")
# this is for index names to be like: rsyslog-YYYY.MM.DD
template(name="rsyslog-index" type="string" string="rsyslog-%$YEAR%.%$MONTH%.%$DAY%")
# this is for formatting our syslog in JSON with #timestamp
template(name="json-syslog" type="list") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"host\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"program\":\"") property(name="programname")
constant(value="\",\"tag\":\"") property(name="syslogtag" format="json")
constant(value="\",") property(name="$!all-json" position.from="2")
# closing brace is in all-json
}
# this is where we actually send the logs to Elasticsearch (localhost:9200 by default)
action(type="omelasticsearch" template="json-syslog" searchIndex="rsyslog-index" dynSearchIndex="on")
#################### default ruleset begins ####################
# we emit our own messages to docker console:
syslog.* :omstdout:
include(file="/config/droprules.conf" mode="optional") # this permits the user to easily drop unwanted messages
action(name="main_utf8fix" type="mmutf8fix" replacementChar="?")
include(text=`echo $CNF_CALL_LOG_TO_LOGFILES`)
include(text=`echo $CNF_CALL_LOG_TO_LOGSENE`)
First of all, you need to run all the containers on the same Docker network, which in this case they are not: the rsyslog service in the compose file above has no networks entry, so it lands on the default network while Elasticsearch is on logging-network. Second, after running the containers on the same network, log in to the rsyslog container and check whether port 9200 is reachable.
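A sketch of both steps (assumptions: curl is present in the rsyslog image, and the service name elasticsearch comes from the compose file above; omelasticsearch defaults to localhost:9200, so its server parameter also needs to point at the service name once the network is shared):

# docker-compose.yml: attach rsyslog to the shared network
rsyslog:
  # ... existing image/ports/volumes ...
  networks:
    - logging-network

# verify reachability from inside the rsyslog container:
docker-compose exec rsyslog curl http://elasticsearch:9200

# rsyslog.conf: point the output at the service name instead of the default localhost
action(type="omelasticsearch" server="elasticsearch" serverport="9200"
       template="json-syslog" searchIndex="rsyslog-index" dynSearchIndex="on")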
I'm testing a sample Spring Cloud Stream application (running on an Ubuntu Linux machine) with one source and one sink service. All my services are docker-containerized and I would like to use Kafka as the message broker.
Below are the relevant parts of the docker-compose.yml:
zookeeper:
  image: confluent/zookeeper
  container_name: zookeeper
  ports:
    - "2181:2181"
kafka:
  image: wurstmeister/kafka:0.9.0.0-1
  container_name: kafka
  ports:
    - "9092:9092"
  links:
    - zookeeper:zk
  environment:
    - KAFKA_ADVERTISED_HOST_NAME=192.168.33.101
    - KAFKA_ADVERTISED_PORT=9092
    - KAFKA_DELETE_TOPIC_ENABLE=true
    - KAFKA_LOG_RETENTION_HOURS=1
    - KAFKA_MESSAGE_MAX_BYTES=10000000
    - KAFKA_REPLICA_FETCH_MAX_BYTES=10000000
    - KAFKA_GROUP_MAX_SESSION_TIMEOUT_MS=60000
    - KAFKA_NUM_PARTITIONS=2
    - KAFKA_DELETE_RETENTION_MS=1000
(...)
# not shown: eureka service registry, spring cloud config service, etc.
myapp-service-test-source:
  container_name: myapp-service-test-source
  image: myapp-h2020/myapp-service-test-source:0.0.1
  environment:
    SERVICE_REGISTRY_HOST: 192.168.33.101
    SERVICE_REGISTRY_PORT: 8761
  ports:
    - 8081:8080
(...)
Here is the relevant part of the application.yml for my service-test-source service:
spring:
  cloud:
    stream:
      defaultBinder: kafka
      bindings:
        output:
          destination: messages
          content-type: application/json
      kafka:
        binder:
          brokers: ${SERVICE_REGISTRY_HOST:192.168.33.101}
          zkNodes: ${SERVICE_REGISTRY_HOST:192.168.33.101}
          defaultZkPort: 2181
          defaultBrokerPort: 9092
The problem is the following: if I launch the docker-compose above, in the test-source container log I notice that the service fails to connect to ZooKeeper, giving a repeated series of Connection refused errors and finishing with a ZkTimeoutException, which makes the service terminate (see below).
The strange fact is that if, instead of running my source (and sink) test services as Docker containers, I run them as jar files via Maven (mvn spring-boot:run <etc...>), the services work fine and are able to exchange messages via Kafka. (Note that Kafka, ZooKeeper, etc. are still running as Docker containers.)
.
.
.
*** THE FOLLOWING REPEATED n TIMES ***
2017-02-14 14:40:09.164 INFO 1 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-14 14:40:09.166 WARN 1 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_111]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_111]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar!/:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[zookeeper-3.4.6.jar!/:3.4.6-1569965]
.
.
.
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:53)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.context.ApplicationContextException: Failed to start bean 'outputBindingLifecycle'; nested exception is org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
Any idea what the problem might be?
Edit:
I discovered that in the "jar" execution logs the test-source service tries to connect to ZooKeeper through the IP 127.0.0.1, as can be seen from the log snippet below:
2017-02-15 14:24:04.159 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:04.159 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:04.178 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-02-15 14:24:04.201 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15a421fd9ec000a, negotiated timeout = 10000
2017-02-15 14:24:05.870 INFO 10348 --- [ main] org.apache.zookeeper.ZooKeeper : Initiating client connection, connectString=localhost:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@72ba68e3
2017-02-15 14:24:05.882 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-02-15 14:24:05.883 INFO 10348 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
This explains why everything works in the jar execution but not in the Docker one (the ZooKeeper container exports its 2181 port to the host machine, so it's visible as localhost to a service process running directly on the host), but it doesn't solve the problem: apparently the Spring Cloud Stream Kafka configuration is ignoring the property spring.cloud.stream.kafka.binder.zkNodes as set in the application.yml. (Note that if I log the value of that variable from the service, I see the correct value of 192.168.33.101 that I hardcoded there for debugging purposes.)
You have set the defaultBinder to be rabbit while trying to use the Kafka binder configuration. Do you have both the rabbit and kafka binders on the classpath of your application? In that case, you can configure the binder explicitly.
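For instance, the binder can be pinned per binding instead of relying on defaultBinder (a sketch; the output binding name and messages destination are taken from the question's application.yml):

spring:
  cloud:
    stream:
      bindings:
        output:
          destination: messages
          binder: kafka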
zookeeper:
  image: wurstmeister/zookeeper
  container_name: 'zookeeper'
  ports:
    - 2181:2181
# --------------------- kafka --------------------------------
kafka:
  image: wurstmeister/kafka
  container_name: 'kafka'
  environment:
    - KAFKA_ADVERTISED_HOST_NAME=kafka
    - KAFKA_ADVERTISED_PORT=9092
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_CREATE_TOPICS=kafka_docker_topic:1:1
  ports:
    - 9092:9092
  depends_on:
    - zookeeper
spring:
  profiles: dev
  cloud:
    stream:
      defaultBinder: kafka
      kafka:
        binder:
          brokers: kafka # I added the brokers and zkNodes properties
          zkNodes: zookeeper
      bindings:
        input:
          destination: message
          content-type: application/json
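For what it's worth, this works because Compose attaches the services to a shared network where Docker's embedded DNS resolves service names, so the application container reaches the broker and ZooKeeper under the hostnames kafka and zookeeper, whereas localhost inside a container points back at the container itself.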