Apache NiFi Cluster in Docker over 3 VMs

I want to set up a NiFi cluster in Docker across 3 VMs.
I found a docker-compose file that creates a cluster on one node and tried to edit it.
I found out that I need ZooKeeper, but do I need one ZooKeeper per instance? And which ports should I open or map in Docker?
The docker-compose file I found:
version: "3"
services:
zookeeper:
hostname: zookeeper
container_name: zookeeper
image: zookeeper:3.6.1
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
nifi:
image: apache/nifi:1.11.4
ports:
- 8080 # Unsecured HTTP Web Port
environment:
- NIFI_WEB_HTTP_PORT=8080
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
- NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ELECTION_MAX_WAIT=1 min
I changed the file like this (on each VM the IP is correct):
version: "3"
services:
zookeeper:
hostname: zookeeper
container_name: zookeeper
image: 'zookeeper:3.6.1'
ports:
- 2181
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
nifi:
image: apache/nifi:1.11.4
ports:
- 8080 # Unsecured HTTP Web Port
- 8082
- 9001
environment:
- NIFI_WEB_HTTP_PORT=8080
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
# - NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ZK_CONNECT_STRING=192.168.2.10:2181,192.168.2.20:2181,192.168.2.30:2181
- NIFI_ELECTION_MAX_WAIT=1 min
- NIFI_CLUSTER_ADDRESS=192.168.2.XX
In the logs I found this message but can't find any solution:
ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:972)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

I found out that I need ZooKeeper, but do I need one ZooKeeper per instance?
No, you can use a single ZooKeeper (although for production you would typically run a three-node ensemble so it keeps a quorum).
And which ports should I open or map in Docker?
As far as I know you need 2888, 3888, and 2181 to be open, but only 2181 is needed to communicate with NiFi;
2888 and 3888 are for communication within a ZooKeeper ensemble.
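Note also that a bare "- 2181" under ports publishes the container port on a random host port, so the other VMs cannot reach ZooKeeper at 192.168.2.x:2181. A minimal per-VM sketch with fixed host mappings (reusing the 192.168.2.x addresses from the question; NIFI_CLUSTER_ADDRESS must be changed on each host) could look like:

version: "3"
services:
  zookeeper:
    image: zookeeper:3.6.1
    ports:
      - "2181:2181" # client port NiFi connects to
      - "2888:2888" # follower traffic, only needed for a ZooKeeper ensemble
      - "3888:3888" # leader election, only needed for a ZooKeeper ensemble
  nifi:
    image: apache/nifi:1.11.4
    ports:
      - "8080:8080" # unsecured HTTP UI
      - "8082:8082" # cluster node protocol port
    environment:
      - NIFI_WEB_HTTP_PORT=8080
      - NIFI_CLUSTER_IS_NODE=true
      - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
      - NIFI_ZK_CONNECT_STRING=192.168.2.10:2181,192.168.2.20:2181,192.168.2.30:2181
      - NIFI_ELECTION_MAX_WAIT=1 min
      - NIFI_CLUSTER_ADDRESS=192.168.2.10 # this VM's own IP, different on each host

With the host ports pinned like this, every node can reach the others' 2181 and 8082 at the VM addresses, which is what the Curator ConnectionLoss error suggests was failing.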

Related

Unable to run Apache nifi docker container in swarm. Empty response

I am unable to run the official NiFi image in Docker swarm.
When I start the container in regular mode:
docker run --name nifi -p 8080:8080 -d apache/nifi:latest
everything works fine and I can access the application at http://localhost:8080/nifi
However, when I try to run the application in Docker swarm:
docker swarm init
docker stack deploy -c docker-compose.yml nifi
with the following docker-compose.yml:
version: "3"
services:
zookeeper:
hostname: zookeeper
container_name: zookeeper
image: 'bitnami/zookeeper:latest'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
nifi:
image: apache/nifi:latest
ports:
- "8080:8080"
expose:
- "8080"
environment:
- NIFI_WEB_HTTP_PORT=8080
- NIFI_WEB_HTTP_HOST=localhost
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
- NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ELECTION_MAX_WAIT=1 min
The application starts (zookeeper and nifi) but is inaccessible at http://localhost:8080/nifi:
curl http://localhost:8080
curl: (52) Empty reply from server
However, running the following command:
docker exec -it 629ecd6949d9 curl -v http://localhost:8080
shows that NiFi is up and running, but for some reason it does not work from outside the container.
I am close to hitting the wall with my head. How can I fix this?
Best
Paweł
I refactored your compose file. Try using it:
version: "3.3"
services:
zookeeper:
hostname: zookeeper
image: 'bitnami/zookeeper:latest'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
nifi:
image: apache/nifi:latest
ports:
- target: 8080
published: 8080
protocol: tcp
mode: host
environment:
- NIFI_WEB_HTTP_PORT=8080
- NIFI_WEB_HTTP_HOST=0.0.0.0
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
- NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ELECTION_MAX_WAIT=1 min
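The two changes that matter here are the long-form port definition with mode: host, which publishes the port directly on the node and bypasses swarm's ingress routing mesh, and NIFI_WEB_HTTP_HOST=0.0.0.0, which makes the embedded web server listen on all interfaces rather than only on localhost inside the container. A quick check from the host (assuming the stack runs on the local node):

curl -v http://localhost:8080/nifi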
In order to make the NiFi image run in Docker swarm mode you need to add NIFI_WEB_HTTP_HOST=0.0.0.0 to the environment section of the docker-compose file:
version: "3"
services:
zookeeper:
hostname: zookeeper
container_name: zookeeper
image: 'bitnami/zookeeper:latest'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
nifi:
image: apache/nifi:latest
ports:
- "8080:8080"
expose:
- "8080"
environment:
- NIFI_WEB_HTTP_HOST=0.0.0.0 # This line right here
- NIFI_WEB_HTTP_PORT=8080
- NIFI_WEB_HTTP_HOST=localhost
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
- NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ELECTION_MAX_WAIT=1 min
Sorry, but if NIFI_WEB_HTTP_HOST=0.0.0.0 is set, it will cause problems when the NiFi containers try to communicate with each other:
2020-02-20 03:20:13,509 WARN [Replicate Request Thread-5] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
java.net.SocketTimeoutException: timeout
    at okio.Okio$4.newTimeoutException(Okio.java:232)
    at okio.AsyncTimeout.exit(AsyncTimeout.java:285)
    at okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
    at okio.RealBufferedSource.indexOf(RealBufferedSource.java:355)
    at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:227)
    at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:215)
    at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
    at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
    at okhttp3.RealCall.execute(RealCall.java:77)
    at org.apache.nifi.cluster.coordination.http.replication.okhttp.OkHttpReplicationClient.replicate(OkHttpReplicationClient.java:138)
    at org.apache.nifi.cluster.coordination.http.replication.okhttp.OkHttpReplicationClient.replicate(OkHttpReplicationClient.java:132)
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:647)
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:839)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Socket closed
    at java.net.SocketInputStream.read(SocketInputStream.java:204)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at okio.Okio$2.read(Okio.java:140)
    at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
    ... 28 common frames omitted
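One possible middle ground (a sketch, not something suggested in this thread) is to run each NiFi node as its own service with a fixed hostname and bind the UI to that hostname rather than 0.0.0.0 or localhost, so the address each node listens on is the same address the other nodes use to reach it over the overlay network:

  nifi0:
    image: apache/nifi:latest
    hostname: nifi0
    environment:
      - NIFI_WEB_HTTP_HOST=nifi0 # hypothetical per-node hostname; one such service per node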

Unable to access topic from scaled Kafka cluster in Docker

I want to create a scalable Kafka cluster in Docker. I am using the Docker image https://hub.docker.com/r/wurstmeister/kafka/ for creating the Kafka server. For creating a single-node Kafka and ZooKeeper I have used the following docker-compose:
version: '3.1'
services:
  kafka:
    image: "wurstmeister/kafka"
    ports:
      - "9095:9092"
    hostname: kafka
    depends_on:
      - zookeeper
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=kafka
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_CREATE_TOPICS=check:3:1
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2185:2181"
With these settings I am able to access the Kafka broker and ZooKeeper from my application code as kafka:9092 and zookeeper:2181 respectively in the Kafka consumer settings.
But now I want a scalable cluster, so I have modified the docker-compose as:
version: '3.1'
services:
  kafka:
    image: "wurstmeister/kafka"
    ports:
      - "9095:9092"
    hostname: kafka
    depends_on:
      - zookeeper
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=kafka
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_CREATE_TOPICS=check:3:1
    deploy:
      replicas: 3
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2185:2181"
In this case 3 Kafka brokers are created, but I am unable to access my topic "check" using kafka:9092 as the broker and zookeeper:2181 as ZooKeeper in my application's consumer settings. How should I modify my docker-compose.yml or my application's Kafka consumer settings so that I can read from these multiple brokers?
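Scaling with deploy: replicas gives every broker the same advertised name (kafka) and the same host port mapping, so clients cannot address the brokers individually. A common alternative is one service per broker with a distinct broker ID and advertised address; a minimal sketch, with the service names and host ports chosen only for illustration:

version: '3.1'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2185:2181"
  kafka1:
    image: wurstmeister/kafka
    hostname: kafka1
    depends_on:
      - zookeeper
    ports:
      - "9095:9092"
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_ADVERTISED_HOST_NAME=kafka1
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
  kafka2:
    image: wurstmeister/kafka
    hostname: kafka2
    depends_on:
      - zookeeper
    ports:
      - "9096:9092"
    environment:
      - KAFKA_BROKER_ID=2
      - KAFKA_ADVERTISED_HOST_NAME=kafka2
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181

The consumer would then use kafka1:9092,kafka2:9092 as its bootstrap servers from inside the compose network.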

Can't connect Kafka to Zookeeper

For docker-compose I've got this yml:
version: '2'
services:
  zookeeper:
    container_name: zookeeper
    image: confluentinc/cp-zookeeper:3.1.1
    ports:
      - "2080:2080"
    environment:
      - ZOOKEEPER_CLIENT_PORT=2080
      - ZOOKEEPER_TICK_TIME=2000
  kafka:
    container_name: kafka
    image: confluentinc/cp-kafka:3.1.1
    ports:
      - "9092:9092"
    environment:
      - KAFKA_CREATE_TOPICS=Topic1:1
      - KAFKA_ZOOKEEPER_CONNECT=192.168.99.100:2080
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.99.100:9092
    depends_on:
      - zookeeper
  schema-registry:
    container_name: schema-registry
    image: confluentinc/cp-schema-registry:3.1.1
    ports:
      - "8081:8081"
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=192.168.99.100:2080
      - SCHEMA_REGISTRY_HOST_NAME=localhost
    depends_on:
      - zookeeper
      - kafka
When I bring this up with docker-compose, the console output ends with:
schema-registry | Error while running kafka-ready.
schema-registry | org.apache.kafka.common.errors.TimeoutException: Timed out waiting for Kafka to create /brokers/ids in Zookeeper. timeout (ms) = 40000
schema-registry exited with code 1
It seems like Kafka never connects to ZooKeeper or something like that. Does anyone know why this is happening?
Does changing
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=192.168.99.100:2080
into
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2080
help?
Additionally, KAFKA_ZOOKEEPER_CONNECT=192.168.99.100:2080 should mention zookeeper as well, instead of an IP address. How can you be sure of that IP address?
KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.99.100:9092 also mentions an IP address you might not be able to guarantee; that one could be changed to kafka.
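Putting those suggestions together, the relevant environment entries would become:

  kafka:
    environment:
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2080
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
  schema-registry:
    environment:
      - SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper:2080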
I also had challenges in getting Kafka and Zookeeper to work in Docker (via Docker Compose). In the end, https://github.com/confluentinc/cp-docker-images/blob/5.0.0-post/examples/kafka-single-node/docker-compose.yml worked for me. You could use that as a source of inspiration.
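For reference, the shape of that single-node example is roughly the following (a from-memory sketch; check the linked file for the exact image tags and settings):

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.0.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:5.0.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1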

Schema Registry container: Server died unexpectedly when launching using docker-compose

I have written a docker-compose.yml file to create the following containers:
Confluent-Zookeeper
Confluent-Kafka
Confluent-Schema Registry
I want a single docker-compose file to spin up the necessary containers, expose the required ports, and interconnect the dependent containers.
I am using the official Confluent images from Docker Hub.
My docker-compose file looks like this:
zookeeper:
  image: confluent/zookeeper
  container_name: confluent-zookeeper
  hostname: zookeeper
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
  ports:
    - "2181:2181"
kafka:
  environment:
    KAFKA_ZOOKEEPER_CONNECTION_STRING: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
  image: confluent/kafka
  container_name: confluent-kafka
  hostname: kafka
  links:
    - zookeeper
  ports:
    - "9092:9092"
schema-registry:
  image: confluent/schema-registry
  container_name: confluent-schema_registry
  environment:
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: zookeeper:2181
    SCHEMA_REGISTRY_HOSTNAME: schema-registry
    SCHEMA_REGISTRY_LISTENERS: http://schema-registry:8081
    SCHEMA_REGISTRY_DEBUG: 'true'
    SCHEMA_REGISTRY_KAFKASTORE_TOPIC_REPLICATION_FACTOR: '1'
  links:
    - kafka
    - zookeeper
  ports:
    - "8081:8081"
Now when I run docker-compose up, all these containers are created and launched, but the Schema Registry container exits immediately. docker logs gives the following output:
(io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig:135)
[2017-05-17 06:06:33,415] ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain:51)
org.apache.kafka.common.config.ConfigException: Only plaintext and SSL Kafka endpoints are supported and none are configured.
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.getBrokerEndpoints(KafkaStore.java:254)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.<init>(KafkaStore.java:111)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:136)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:53)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:37)
    at io.confluent.rest.Application.createServer(Application.java:117)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
I searched for this issue but nothing helped. I tried various other configurations like providing KAFKA_ADVERTISED_HOSTNAME, changing the SCHEMA_REGISTRY_LISTENERS value, etc., but no luck.
Can anybody point out the exact configuration issue causing the Schema Registry container to fail?
Those are old and deprecated Docker images. Use the latest supported Docker images from confluentinc: https://hub.docker.com/u/confluentinc/
You can find a full compose file here: confluentinc/cp-docker-images
You're missing the hostname entry (hostname: schema-registry) in the failing container. By default Docker populates a container's /etc/hosts with the linked containers' aliases and names, plus the container's own hostname.
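In compose terms, that is a single added line in the schema-registry service from the question:

schema-registry:
  image: confluent/schema-registry
  hostname: schema-registry # the missing entry
  container_name: confluent-schema_registry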
The question is old, though it might be helpful to leave a solution that worked for me. I am using docker-compose:
version: '3.3'
services:
  zookeeper:
    image: confluent/zookeeper:3.4.6-cp1
    hostname: "zookeeper"
    networks:
      - test-net
    ports:
      - 2181:2181
    environment:
      zk_id: "1"
  kafka:
    image: confluent/kafka:0.10.0.0-cp1
    hostname: "kafka"
    depends_on:
      - zookeeper
    networks:
      - test-net
    ports:
      - 9092:9092
    environment:
      KAFKA_ADVERTISED_HOST_NAME: "kafka"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: "0"
      KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
  schema-registry:
    image: confluent/schema-registry:3.0.0
    hostname: "schema-registry"
    depends_on:
      - kafka
      - zookeeper
    networks:
      - test-net
    ports:
      - 8081:8081
    environment:
      SR_HOSTNAME: schema-registry
      SR_LISTENERS: http://schema-registry:8081
      SR_DEBUG: 'true'
      SR_KAFKASTORE_TOPIC_REPLICATION_FACTOR: '1'
      SR_KAFKASTORE_TOPIC_SERVERS: PLAINTEXT://kafka:9092
networks:
  test-net:
    driver: bridge
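To sanity-check the stack once it is up (assuming the ports published above), you can ask the Schema Registry for its subjects; an empty list means it came up cleanly:

docker-compose up -d
curl http://localhost:8081/subjects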

Kafka log directories in Docker

When I was running Kafka and ZooKeeper without Docker, I could see the topic partition log files in the /tmp/kafka-logs directory. Now with Docker, even though I specify the log directory in the volumes section of docker-compose.yml, I can't see files like "TOPICNAME-PARTITIONNUMBER" in the Docker VM. Is there anything I'm missing here? Any idea where I could find these directories in Docker VMs?
zookeeper:
  image: confluent/zookeeper
  container_name: zookeeper
  ports:
    - "2181:2181"
    - "15001:15000"
  environment:
    ZK_SERVER_ID: 1
  volumes:
    - /tmp/docker/zk1/logs:/logs
    - /tmp/docker/zk1/data:/data
kafka1:
  image: confluent/kafka
  container_name: kafka1
  ports:
    - "9092:9092"
    - "15002:15000"
  links:
    - zookeeper
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_OFFSETS_STORAGE: kafka
    # This is Container IP
    KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100
  volumes:
    - /tmp/docker/kafka1/logs:/logs
    - /tmp/docker/kafka1/data:/data
This is how we configured logs in our compose file, and it has the log files in it. You should jump onto the container to see the /var/lib/kafka/data directory and the data inside it:
volumes:
  - kb1_data:/var/lib/kafka/data
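For example, to look inside the container (assuming the container name kafka1 from the question):

docker exec -it kafka1 ls /var/lib/kafka/data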
Remember that the first element in the volumes, ports, and other resource-sharing lists in docker-compose refers to the host, and the second to the container.
So you should change the order of your volumes values.
zookeeper:
  image: confluent/zookeeper
  container_name: zookeeper
  ports:
    - "2181:2181"
    - "15001:15000"
  environment:
    ZK_SERVER_ID: 1
  volumes:
    # - ./host/folder:/container/folder
    - ./logs:/tmp/docker/zk1/logs
    - ./data:/tmp/docker/zk1/data
kafka1:
  image: confluent/kafka
  container_name: kafka1
  ports:
    # - "host-port:container-port"
    - "9092:9092"
    - "15002:15000"
  links:
    - zookeeper
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_OFFSETS_STORAGE: kafka
    # This is Container IP
    KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100
  volumes:
    # - ./host/folder:/container/folder
    - ./logs:/tmp/docker/kafka1/logs
    - ./data:/tmp/docker/kafka1/data
