I have three physical nodes, each running Docker with an Ubuntu container. I can run a ZooKeeper server in a container locally (that is, without any cluster configuration), but I want to build a Kafka cluster across these three nodes. I configured "zookeeper.properties" inside the container on 150.20.11.157 like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=0.0.0.0:2888:3888
server.2=150.20.11.134:2888:3888
server.3=150.20.11.137:2888:3888
clientPort=2186
For node 150.20.11.134, the "zookeeper.properties" file inside the container looks like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=150.20.11.157:2888:3888
server.2=0.0.0.0:2888:3888
server.3=150.20.11.137:2888:3888
clientPort=2186
For node 150.20.11.137, the "zookeeper.properties" file inside the container looks like this:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
server.1=150.20.11.157:2888:3888
server.2=150.20.11.134:2888:3888
server.3=0.0.0.0:2888:3888
clientPort=2186
I also set up "server.properties" like this, for node 150.20.11.157:
broker.id=0
port=9092
listeners = PLAINTEXT://150.20.11.157:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
"server.properties" for node 150.20.11.134 is:
broker.id=1
port=9092
listeners = PLAINTEXT://150.20.11.134:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
"server.properties" for node 150.20.11.137 is:
broker.id=2
port=9092
listeners = PLAINTEXT://150.20.11.137:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=150.20.11.157:2186,150.20.11.134:2186,150.20.11.137:2186
The problem is that when I run the ZooKeeper server in the container on each node, I get this error:
[2019-01-16 12:45:54,588] INFO Reading configuration from: ./config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2019-01-16 12:45:54,601] INFO Resolved hostname: 172.28.10.137 to address: /172.28.10.137 (org.apache.zookeeper.server.quorum.QuorumPeer)
[2019-01-16 12:45:54,603] INFO Resolved hostname: 0.0.0.0 to address: /0.0.0.0 (org.apache.zookeeper.server.quorum.QuorumPeer)
[2019-01-16 12:45:54,603] INFO Resolved hostname: 172.28.10.157 to address: /172.28.10.157 (org.apache.zookeeper.server.quorum.QuorumPeer)
[2019-01-16 12:45:54,603] INFO Defaulting to majority quorums (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2019-01-16 12:45:54,604] ERROR Invalid config, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain)
org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing ./config/zookeeper.properties
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:156)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:104)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.lang.IllegalArgumentException: /tmp/zookeeper/data/myid file is missing
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:408)
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:152)
... 2 more
Invalid config, exiting abnormally
Would you please tell me how to set up a Kafka cluster with three Docker containers, each running on its own physical node?
Thank you in advance.
I've run into this too.
I think the clue for this one is in the logs, i.e.:
Caused by: java.lang.IllegalArgumentException: /tmp/zookeeper/data/myid file is missing
at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:408)
The following lines in your zookeeper.properties file provide the details of the zookeeper ensemble:
server.1=0.0.0.0:2888:3888
server.2=150.20.11.134:2888:3888
server.3=150.20.11.137:2888:3888
When one of the ZooKeeper servers starts, it knows which server it is by looking at the myid file in its own data directory, which in this case is /tmp/zookeeper/data.
So all you need to do is create a file named myid in that directory on each server, containing just the number 1, 2, or 3 (corresponding to the server.x entries in the zookeeper.properties file).
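For example, inside each node's container, using the dataDir from the question:

```shell
# server.1 is 150.20.11.157 in this ensemble, so on that node:
mkdir -p /tmp/zookeeper/data
echo 1 > /tmp/zookeeper/data/myid
# on 150.20.11.134 (server.2): echo 2 > /tmp/zookeeper/data/myid
# on 150.20.11.137 (server.3): echo 3 > /tmp/zookeeper/data/myid
```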
Reference link - Apache Zookeeper.
Hope this helps!
I was hoping this would be an easy one, just adding the snippet below to the second instance's docker-compose.yml file:
- DOCKER_VERNEMQ_DISCOVERY_NODE=<ip address of the first instance>
but that doesn't seem to work.
Log of the second instance confirms it's attempting to cluster:
13:56:09.795 [info] Sent join request to: 'VerneMQ#<ip address of the first instance>'
13:56:16.800 [info] Unable to connect to 'VerneMQ#<ip address of the first instance>'
While the log of the first instance does not show anything at all.
From within the second instance I can confirm that the endpoint is accessible:
$ docker exec -it vernemq /bin/sh
$ curl <ip address of the first instance>:44053
curl: (56) Recv failure: Connection reset by peer
Then in the log of the first instance I see an error, which is totally expected and confirms that I've reached the first instance:
13:58:33.572 [error] CRASH REPORT Process <0.3050.0> with 0 neighbours crashed with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
13:58:33.572 [error] Ranch listener {{172,19,0,2},44053} terminated with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
It might have to do with the fact that the IP address as seen from within the Docker container is 172.19.0.2, while the external one is 10. ....
I also tried adding the hostname of the first instance to known_hosts, to no avail.
Please advise.
I'm using erlio/docker-vernemq:1.10.0
$ docker --version
Docker version 19.03.13, build 4484c46d9d
$ docker-compose --version
docker-compose version 1.27.2, build 18f557f9
I managed to get this sorted by creating a Docker overlay network:
on machine1: docker swarm init
on machine2: docker swarm join --token ...
on machine1: docker network create --driver=overlay --attachable vernemq-overlay-net
The relevant bits of my docker-compose.yml are:
version: '3.6'
services:
  vernemq:
    container_name: ${NODE_NAME:?Node name not specified}
    image: vernemq/vernemq:1.10.4.1
    environment:
      - DOCKER_VERNEMQ_NODENAME=${NODE_NAME:?Node name not specified}
      - DOCKER_VERNEMQ_DISCOVERY_NODE=${DISCOVERY_NODE:-}
networks:
  default:
    external:
      name: vernemq-overlay-net
with the following env vars:
machine1:
NODE_NAME=vernemq1.example.com
DISCOVERY_NODE=
machine2:
NODE_NAME=vernemq2.example.com
DISCOVERY_NODE=vernemq1.example.com
Note:
Chances are machine2 won't find vernemq-overlay-net, due to a docker-compose bug as far as I remember.
In that case, start a container directly with Docker (docker run -dit --name alpine --net=vernemq-overlay-net alpine), which makes the network available to docker-compose.
I have a HBase + HDFS setup, in which each of the HBase master, regionservers, HDFS namenode and datanodes are containerized.
When running all of these containers on a single host VM, things work fine as I can use the docker container names directly, and set configuration variables as:
CORE_CONF_fs_defaultFS: hdfs://namenode:9000
for both the regionserver and datanode. The system works as expected in this configuration.
When attempting to distribute these across multiple host VMs, however, I run into issues.
I updated the config variables above to look like:
CORE_CONF_fs_defaultFS: hdfs://hostname:9000
and made sure the namenode container exposes port 9000 and maps it to the host machine's port 9000.
It looks like the names are not resolving correctly when I use the hostname, and the error I see in the datanode logs looks like:
2019-08-24 05:46:08,630 INFO impl.FsDatasetAsyncDiskService: Deleted BP-1682518946-<ip1>-1566622307307 blk_1073743161_2337 URI file:/hadoop/dfs/data/current/BP-1682518946-<ip1>-1566622307307/current/rbw/blk_1073743161
2019-08-24 05:47:36,895 INFO datanode.DataNode: Receiving BP-1682518946-<ip1>-1566622307307:blk_1073743166_2342 src: /<ip3>:48396 dest: /<ip2>:9866
2019-08-24 05:47:36,897 ERROR datanode.DataNode: <hostname>-datanode:9866:DataXceiver error processing WRITE_BLOCK operation src: /<ip3>:48396 dst: /<ip2>:9866
java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:786)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
at java.lang.Thread.run(Thread.java:748)
Where <hostname>-datanode is the name of the datanode container, and the IPs are various container IPs.
I'm wondering if I'm missing some configuration variable that would let containers on other VMs connect to the namenode, or some other change that would allow this system to be distributed correctly. For example, does the system expect the containers to be named a certain way?
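One setting worth checking here (my suggestion, not something from the original post) is whether HDFS is advertising container-internal IP addresses instead of hostnames. Hadoop has properties that force hostname-based connections, which can help when container IPs are not routable across VMs. A sketch for hdfs-site.xml:

```xml
<!-- Sketch: force clients and datanodes to connect by hostname
     rather than by (container-internal) IP address -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

This only helps if the datanode hostnames are themselves resolvable from every VM (e.g. via DNS or /etc/hosts entries).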
I have three physical nodes with Docker installed on each of them. I have configured Marathon, Flink, Mesos, Zookeeper and Hadoop in a container on each node. They work really well. I have to distribute data to the Flink cluster, so I need Kafka.
ZooKeeper is already running, so Kafka starts without error. The problem is that in this situation, when I try to create a Kafka topic, I see the error below, which I think is because I am not running the ZooKeeper instance that ships in the Kafka folder:
Exception in thread "main" kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:230)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient.&lt;init&gt;(ZooKeeperClient.scala:95)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1580)
at kafka.admin.TopicCommand$.main(TopicCommand.scala:57)
at kafka.admin.TopicCommand.main(TopicCommand.scala)
So I changed my plan and decided to use the ZooKeeper in the Kafka folder. To do that, I configured it with new ports: 2186, 2889 and 3889. But when I run ZooKeeper with this command:
/home/kafka_2.11-2.0.0/bin/zookeeper-server-start.sh /home/kafka_2.11-2.0.0/config/zookeeper.properties
I receive this error:
WARN Cannot open channel to 2 at election address /10.32.0.3:3889 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
at java.lang.Thread.run(Thread.java:748)
The configuration of the standalone ZooKeeper, in "/home/zookeeper-3.4.14/conf/zoo.cfg" on the first node:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper/data
clientPort=2181
server.1=0.0.0.0:2888:3888
server.2=10.32.0.3:2888:3888
server.3=10.32.0.4:2888:3888
The ZooKeeper configuration in the Kafka folder looks like this on the first node:
dataDir=/tmp/zookeeper/data
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2186
server.1=0.0.0.0:2889:3889
server.2=10.32.0.3:2889:3889
server.3=10.32.0.4:2889:3889
Would you please guide me on how to run two ZooKeeper instances in one Docker container? By the way, I cannot use another container for the Kafka cluster, since I need both services in one container with a common IP address.
Any help would be really appreciated.
Problem solved. I used the configuration above, but with dataDir=/var/lib/zookeeper/data for both ZooKeeper instances. Also, I first ran Hadoop and then started Kafka with ZooKeeper.
I use the following Kafka Docker image: https://hub.docker.com/r/wurstmeister/kafka/
I'm able to start Apache Kafka with the following properties:
<KAFKA_ADVERTISED_HOST_NAME>${local.ip}</KAFKA_ADVERTISED_HOST_NAME>
<KAFKA_ADVERTISED_PORT>${kafka.port}</KAFKA_ADVERTISED_PORT>
<KAFKA_ZOOKEEPER_CONNECT>zookeeper:2181</KAFKA_ZOOKEEPER_CONNECT>
<KAFKA_MESSAGE_MAX_BYTES>15000000</KAFKA_MESSAGE_MAX_BYTES>
but I see the following warning when trying to send the message into the topic:
WARN 9248 --- [ad | producer-1] org.apache.kafka.clients.NetworkClient : [Producer clientId=producer-1] Error while fetching metadata with correlation id 4 : {post.sent=LEADER_NOT_AVAILABLE}
I saw a few articles online saying that this issue can be related to the old properties KAFKA_ADVERTISED_HOST_NAME and KAFKA_ADVERTISED_PORT, and that I should reconfigure to KAFKA_ADVERTISED_LISTENERS and KAFKA_LISTENERS. But when I start the Kafka container with the following properties:
<KAFKA_ADVERTISED_LISTENERS>PLAINTEXT://${local.ip}:${kafka.port}</KAFKA_ADVERTISED_LISTENERS>
<KAFKA_LISTENERS>PLAINTEXT://${local.ip}:${kafka.port}</KAFKA_LISTENERS>
<KAFKA_ZOOKEEPER_CONNECT>zookeeper:2181</KAFKA_ZOOKEEPER_CONNECT>
<KAFKA_MESSAGE_MAX_BYTES>15000000</KAFKA_MESSAGE_MAX_BYTES>
my application is unable to connect to Kafka:
2018-08-25 16:20:57.407 INFO 17440 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka version : 1.1.0
2018-08-25 16:20:57.408 INFO 17440 --- [ main] o.a.kafka.common.utils.AppInfoParser : Kafka commitId : fdcf75ea326b8e07
2018-08-25 16:20:58.513 WARN 17440 --- [| adminclient-1] org.apache.kafka.clients.NetworkClient : [AdminClient clientId=adminclient-1] Connection to node -1 could not be established. Broker may not be available.
2018-08-25 16:20:59.567 WARN 17440 --- [| adminclient-1] org.apache.kafka.clients.NetworkClient : [AdminClient clientId=adminclient-1] Connection to node -1 could not be established. Broker may not be available.
How to properly reconfigure the Docker Kafka in order to be able to use KAFKA_ADVERTISED_LISTENERS and KAFKA_LISTENERS?
From this awesome post, here's a good explanation of these properties:
LISTENERS are what interfaces Kafka binds to. ADVERTISED_LISTENERS are how clients can connect.
When your application connects to one of the addresses from LISTENERS, Kafka returns the corresponding KAFKA_ADVERTISED_LISTENER for the LISTENER you chose. That advertised listener is the address your application will actually use to communicate with Kafka.
So, in your application you have to use what you set in Kafka's LISTENERS property for PLAINTEXT.
Using the configuration as you posted it:
<KAFKA_ADVERTISED_LISTENERS>PLAINTEXT://${local.ip}:${kafka.port}</KAFKA_ADVERTISED_LISTENERS>
<KAFKA_LISTENERS>PLAINTEXT://${local.ip}:${kafka.port}</KAFKA_LISTENERS>
In your application, since you used ${local.ip}:${kafka.port} on the Kafka container, you have to find the IP address assigned to the Kafka container and use that.
To fill in the variables for this scenario, suppose your Kafka container's IP is 192.250.0.1 and the Kafka port is 9092; your application's bootstrap.servers property would then be 192.250.0.1:9092.
Here's a command to see what Kafka returns when you try to connect using one of its listeners:
$ kafkacat -b 192.250.0.1:9092 -L
kafkacat is a very useful tool for testing and debugging Kafka.
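When the application runs outside Docker while Kafka runs inside, a common pattern (my suggestion, not part of the question) is to declare two listeners: an internal one that other containers reach via the service name, and an external one advertised with the host's address. A sketch of the environment section for wurstmeister/kafka, reusing the hypothetical values kafka (service name) and 192.250.0.1:9092 from above:

```yaml
# Sketch: separate internal and external listeners.
# "kafka" is the assumed container/service name; 192.250.0.1 and 9092
# reuse the example values from the answer above.
environment:
  KAFKA_LISTENERS: INSIDE://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9093,OUTSIDE://192.250.0.1:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
```

With this layout, other containers use kafka:9093 as bootstrap server while the host application uses 192.250.0.1:9092.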
I have 2 docker containers, 1 running Logstash and the other running Zookeeper and Kafka. I am trying to send data from Logstash to Kafka but can't seem to get data across to my topic in Kafka.
I can log into the Docker Kafka container and produce a message to my topic from the terminal and then also consume it.
I am using the output kafka plugin:
output {
  kafka {
    topic_id => "MyTopicName"
    broker_list => "kafkaIPAddress:9092"
  }
}
I got the IP address from running docker inspect kafka2.
When I run ./bin/logstash agent --config /etc/logstash/conf.d/01-input.conf I get this error.
Settings: Default pipeline workers: 4
Unknown setting 'broker_list' for kafka {:level=>:error}
Pipeline aborted due to error {:exception=>#<LogStash::ConfigurationError: Something is wrong with your configuration.>, :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/config/mixin.rb:134:in `config_init'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/outputs/base.rb:63:in `initialize'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/output_delegator.rb:74:in `register'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:181:in `start_workers'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:181:in `start_workers'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:136:in `run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/agent.rb:473:in `start_pipeline'"], :level=>:error}
stopping pipeline {:id=>"main"}
I have checked the configuration file by running the following command, which returned OK:
./bin/logstash agent --configtest --config /etc/logstash/conf.d/01-input.conf
Configuration OK
Has anyone ever come across this? Is it that I haven't opened the ports on the Kafka container, and if so, how can I do that while keeping Kafka running?
The error is here: broker_list => "kafkaIPAddress:9092".
Your version of the Kafka output plugin does not recognize the broker_list setting; try bootstrap_servers => "KafkaIPAddress:9092" instead.
If the containers are on separate machines, map Kafka to port 9092 on the host and use the host's address:port; if they are on the same host, use the internal Docker IP:port.
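For reference, here is the output block from the question with the unrecognized broker_list option replaced by bootstrap_servers (the address stays whatever docker inspect reports for your setup):

```conf
output {
  kafka {
    topic_id => "MyTopicName"
    bootstrap_servers => "kafkaIPAddress:9092"
  }
}
```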