I need to create a kafka image with topics already created - docker

I have a requirement to set up Kafka locally with the topics already present in the container. I am using landoop/fast-data-dev for this.
Here is how I do it manually:
docker run -d --name landoopkafka -p 2181:2181 -p 3030:3030 -p 8081:8081 -p 8082:8082 -p 8083:8083 -p 9092:9092 -e ADV_HOST=localhost landoop/fast-data-dev
After running this command my container is up and running.
Now I open a shell inside the container with docker exec -it landoopkafka bash
and create the topic with this command:
kafka-topics --zookeeper localhost:2181 --create --topic hello_topic --partitions 1 --replication-factor 1
My topic is created.
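For reference, the manual steps above combined into one script (a sketch: same image, ports, and topic name as above; the 30-second wait for the broker is an arbitrary guess, not a guarantee):

#!/bin/bash
# Sketch of the manual steps above as a single script.
set -e

docker run -d --name landoopkafka \
  -p 2181:2181 -p 3030:3030 -p 8081:8081 -p 8082:8082 -p 8083:8083 -p 9092:9092 \
  -e ADV_HOST=localhost \
  landoop/fast-data-dev

# Give the broker time to start; 30 seconds is an assumption.
sleep 30

docker exec landoopkafka \
  kafka-topics --zookeeper localhost:2181 --create \
  --topic hello_topic --partitions 1 --replication-factor 1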
But my requirement is a Dockerfile which already has the topic created, so that I just need to run it.
OR
A docker-compose file which I just need to run.
I need help on this, as I am absolutely new to Docker and Kafka.

I had to do it too! But I did not want to use the wurstmeister images, so I decided to make a custom script which does the job, and to run this script in a separate container.
Repository
https://github.com/yan-khonski-it/kafka-compose
Note, it will work with kafka versions that use zookeeper.
Is Zookeeper a must for Kafka?
To start Kafka with Zookeeper and all your topics, run docker-compose up -d.
Implementation details.
docker-compose.yml
# These services are kafka related. This docker-compose allows to start kafka locally quickly.
version: '2.1'

networks:
  demo-network:
    name: demo-network
    driver: bridge

services:
  zookeeper:
    image: "confluentinc/cp-zookeeper:${CONFLUENT_PLATFORM_VERSION}"
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 32181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 32181:32181
    hostname: zookeeper
    networks:
      - demo-network

  kafka:
    image: "confluentinc/cp-kafka:${CONFLUENT_PLATFORM_VERSION}"
    container_name: kafka
    hostname: kafka
    ports:
      - 9092:9092
      - 29092:29092
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_BROKER_ID: 1
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_HOST://kafka:29092
      # cp-kafka only picks up KAFKA_-prefixed variables, so this must be KAFKA_LISTENERS,
      # and it needs a binding for each listener named above.
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - "zookeeper"
    networks:
      - demo-network

  # Automatically creates required kafka topics if they were not created.
  kafka-topics-creator:
    build:
      context: kafka-topics-creator
      dockerfile: Dockerfile
    container_name: kafka-topics-creator
    depends_on:
      - zookeeper
      - kafka
    environment:
      ZOOKEEPER_HOSTS: "zookeeper:32181"
      KAFKA_TOPICS: "topic_v1 topic_v2"
    networks:
      - demo-network
Then I have a directory kafka-topics-creator.
Here, I have three files
create-kafka-topics.sh, Dockerfile, README.md.
Dockerfile
# It is recommended to use the same version as the kafka broker uses,
# so no additional images are pulled.
FROM confluentinc/cp-kafka:4.1.2
WORKDIR /usr/bin
# Once it is executed, this container is not needed.
COPY create-kafka-topics.sh create-kafka-topics.sh
ENTRYPOINT ["./create-kafka-topics.sh"]
create-kafka-topics.sh
#!/bin/bash
# Simply wait until the original kafka container and zookeeper are started.
sleep 15
# Parse the string of kafka topics into an array
# https://stackoverflow.com/a/10586169/4587961
kafkaTopicsArrayString="$KAFKA_TOPICS"
IFS=' ' read -r -a kafkaTopicsArray <<< "$kafkaTopicsArrayString"
# A separate variable for zookeeper hosts.
zookeeperHostsValue=$ZOOKEEPER_HOSTS
# Create a kafka topic for each item of the split topics array.
for newTopic in "${kafkaTopicsArray[@]}"; do
  # https://kafka.apache.org/quickstart
  kafka-topics --create --topic "$newTopic" --partitions 1 --replication-factor 1 --if-not-exists --zookeeper "$zookeeperHostsValue"
done
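A fixed sleep is fragile if the broker takes longer to start. A variant of the wait (a sketch, assuming kafka-topics exits non-zero while Zookeeper is not yet reachable) polls instead:

#!/bin/bash
# Sketch: poll until Zookeeper answers instead of sleeping a fixed 15 seconds.
# Assumes kafka-topics returns a non-zero exit code while the cluster is
# unreachable; the 30-attempt cap is an arbitrary choice.
attempts=0
until kafka-topics --zookeeper "$ZOOKEEPER_HOSTS" --list > /dev/null 2>&1; do
  attempts=$((attempts + 1))
  if [ "$attempts" -ge 30 ]; then
    echo "Zookeeper at $ZOOKEEPER_HOSTS did not become reachable" >&2
    exit 1
  fi
  sleep 1
done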
README.md - so other people know how to use it. Always document your stuff - good advice.
# Creates kafka topics automatically.
## Parameters
`ZOOKEEPER_HOSTS` - zookeeper hosts, I used value `"zookeeper:32181"` to run it locally.
`KAFKA_TOPICS` - space-separated list of kafka topics. Example: `topic_1 topic_2 topic_3`.
Note, this container should run only **after** your original kafka broker and zookeeper are running.
After this container creates topics, it is not needed anymore.
How to check that the topics were created:
One solution is to check the logs of the kafka-topics-creator container.
docker logs kafka-topics-creator should print
$ docker logs kafka-topics-creator
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic "topic_v1".
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic "topic_v2".
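Alternatively, list the topics directly from the broker container (service and port names as in the compose file above):

docker exec kafka kafka-topics --zookeeper zookeeper:32181 --list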

You can create a docker-compose file like this...
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper:latest
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:0.10.2.1
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      KAFKA_CREATE_TOPICS: "MY_TOPIC_ONE:1:1,MY_TOPIC_TWO:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
Put your topics there and run docker-compose up
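Once it is up, you can verify the topics were created (a sketch; <kafka-container-name> stands for whatever name Compose assigned, and kafka-topics.sh should be on the PATH in wurstmeister images):

docker exec -it <kafka-container-name> kafka-topics.sh --zookeeper zookeeper:2181 --list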

You should instead try the wurstmeister/kafka image, which supports an environment variable to create topics during container startup.
Sure, the Landoop container has a bunch of other useful things, but it sounds like you only want Kafka and don't want to mess with editing any Dockerfiles.
The other solution is to start a second container after Kafka which runs the create scripts and then stops itself; see the sketch below.
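A sketch of that second-container pattern, reusing the network, Zookeeper port, and topic name from the first answer (the container exits once the topic exists):

docker run --rm --network demo-network confluentinc/cp-kafka:4.1.2 \
  kafka-topics --create --topic hello_topic --partitions 1 --replication-factor 1 \
  --if-not-exists --zookeeper zookeeper:32181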

Related

Kafka communication inside docker network fails, outside of docker works [duplicate]

This question already has answers here: Connect to Kafka running in Docker.
I have set up a Kafka-Zookeeper compose in Docker with three listeners:
INTERNAL on port 9092
EXTERNAL_SAME_HOST on port 29092
EXTERNAL_DIFFERENT_HOST on port 29093
The server running Docker has the IP 192.168.66.66.
I tested it: listeners 2 and 3 are reachable via my kafka-python test scripts.
Listener 1, however (tested from inside a container), fails with "NoBrokersAvailable". What did I do wrong with INTERNAL?
My docker-compose.yml:
version: "3.8"
services:
#########
# Kafka #
#########
zookeeper:
container_name: zookeeper
image: wurstmeister/zookeeper
networks:
- kafka_network
ports:
- "2181:2181"
kafka:
container_name: kafka
image: wurstmeister/kafka
networks:
- kafka_network
ports:
- "29092:29092"
- "29093:29093"
expose:
- "9092"
environment:
# Using three ways to reach kafka: From INSIDE docker (Other containers), from outside docker but running on the same server as docker (EXTERNAL_SAME_HOST), and from another computer.
KAFKA_LISTENERS: EXTERNAL_SAME_HOST://:29092,EXTERNAL_DIFFERENT_HOST://:29093,INTERNAL://:9092
# Publishing the above ports
KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL_SAME_HOST://localhost:29092,EXTERNAL_DIFFERENT_HOST://192.168.66.66:29093
# Setting the security protocol for all three listeners to PLAINTEXT
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL_SAME_HOST:PLAINTEXT,EXTERNAL_DIFFERENT_HOST:PLAINTEXT
# Settings for zookeeper-kafka-communication
KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
# Create Kafka topics "NAME:PARTITION:REPLICAS:RETENTION_POLICY,..."
KAFKA_CREATE_TOPICS: "debug:1:1:delete,debug2:1:1:delete" # Some test topics for debugging purposes
# To encourage correct API use, forbid automatic topic creation (Only topics created by an explicit command or the topic creation above can be used. Prevents typos)
KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
depends_on:
- zookeeper
volumes:
- /var/run/docker.sock:/var/run/docker.sock
networks:
kafka_network:
name: kafka
driver: bridge
My kafka-python test producer.py:
from kafka import KafkaProducer
import json
import time

kafka_ip = 'kafka:9092'
topic = 'debug'

producer = KafkaProducer(bootstrap_servers=[kafka_ip],
                         value_serializer=lambda x:
                         json.dumps(x).encode('utf-8'))

while True:
    data = 'INTERNAL listener test'
    producer.send(topic, value=data)
    time.sleep(1)
And the dockerfile I used for the producer:
FROM python:3.8-alpine
COPY requirements.txt /
RUN pip install -r requirements.txt
COPY src/ /
CMD [ "python", "./producer.py" ]
I created the container via
docker build --no-cache -t kafka_internal_test_producer .
and ran it with
docker run -d --name kafka_internal_test_producer kafka_internal_test_producer
See the comments under the question: the producer was running on the wrong network. Setting it to "kafka" via docker network connect / the Portainer UI worked. Thanks @Robin Moffatt for the quick solution!
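In other words, the fix is to attach the producer container to the compose network (named kafka in the networks section above), either at run time or for an already-running container:

# attach at run time
docker run -d --name kafka_internal_test_producer --network kafka kafka_internal_test_producer
# or attach a container that is already running
docker network connect kafka kafka_internal_test_producer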

Kafka in a Docker Container - External and Internal connections

I have a situation where Kafka is running in a docker container using a specific IP address within a network. The network is created using the following command:
sudo docker network create --subnet=172.19.0.0/16 --gateway 172.19.0.1 --ip-range=172.19.0.1/24 my_net
Kafka container is started using the following
docker run -d --name kafkanode --net my_net --hostname=kafkahost01 -p 2181:2181 -p 9092:9092 kafka_zook:212-358 tail -f /dev/null
I have producers on the same host, running in a different container.
In Kafka's server.properties, a simple configuration like the one below works for a producer on the same host in a different container:
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://kafkahost01:9092
However, in our case we will also have producers sending messages from outside that machine.
Unfortunately, I am not able to connect from outside the docker host machine. Can someone please help me with the configuration?
We are using Kafka 2.12-2.6.0
Zookeeper -- 3.5.8
Server properties edited with the following values
listeners=INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=INTERNAL://kafkahost01:29092,EXTERNAL://10.20.30.40:9092
inter.broker.listener.name=INTERNAL
Thanks
Balaji
Here you have a docker-compose example with inside and outside listeners configured. Try it out.
(Replace localhost with your desired IP or DNS name.)
version: '3.7'
services:
  zookeeper:
    image: zookeeper:3.5.8
    hostname: zookeeper
    volumes:
      - zookeeper-data:/data
      - zookeeper-datalog:/datalog
  kafka:
    image: wurstmeister/kafka:2.13-2.6.0
    hostname: kafka
    depends_on:
      - zookeeper
    ports:
      - 9093:9093
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://localhost:9093
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka:/kafka
volumes:
  zookeeper-data:
  zookeeper-datalog:
  kafka:
Running a producer within the same network:
# note: I just placed my docker-compose.yml in an example dir, that's the reason for the example_default network
$ docker run -it --rm \
    --name producer \
    --network example_default \
    wurstmeister/kafka:2.13-2.6.0 bash
bash-4.4# /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka:9092 --topic example
>some
>test
And consuming from outside docker using kaf:
$ cat ~/.kaf/config
current-cluster: single
clusteroverride: ""
clusters:
- name: single
  version: 2.7.0
  brokers:
  - localhost:9093
  SASL: null
  TLS: null
  security-protocol: PLAINTEXT
  schema-registry-url: ""
$ kaf nodes
ID ADDRESS
1 localhost:9093
$ kaf consume example -f --raw
some
test
Hope this example can help you define your own setup.

Setting up sunbird-telemetry Kafka DRUID and superset

I am trying to create an analytics dashboard based on mobile events. I want to dockerize all the components, deploy them in Docker on localhost, and build the analytics dashboard on top.
Sunbird telemetry https://github.com/project-sunbird/sunbird-telemetry-service
Kafka https://github.com/wurstmeister/kafka-docker
Druid https://github.com/apache/incubator-druid/tree/master/distribution/docker
Superset https://github.com/apache/incubator-superset
What I did
Druid
I executed the command docker build -t apache/incubator-druid:tag -f distribution/docker/Dockerfile .
I executed the command docker-compose -f distribution/docker/docker-compose.yml up
After everything is executed, open http://localhost:4008/ and see Druid running.
It takes 3.5 hours to complete both the build and the run.
Kafka
Navigate to the kafka folder.
Execute docker-compose up -d.
Issue
When I run Druid, a Zookeeper starts; and when I then start Kafka, its compose file starts another Zookeeper, and I cannot establish a connection between Kafka and Zookeeper.
After I start sunbird-telemetry and try to create a topic and connect to Kafka from sunbird, it does not connect.
I don't understand what I am doing wrong.
Can we tell Kafka to share the Zookeeper started by Druid? I am completely new to this environment and these stacks.
I am still studying these stacks. Am I doing something wrong? Can anybody point out how to properly connect Kafka and Druid over Docker?
Note: I am running all of this on my Mac.
My kafka compose file
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: **localhost ip**
      KAFKA_ZOOKEEPER_CONNECT: **localhost ip**:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
Can we tell kafka to share the zookeeper started by DRUID
You would put all services in the same compose file.
Druid's Kafka connection is listed here:
https://github.com/apache/incubator-druid/blob/master/distribution/docker/environment#L31
You can set KAFKA_ZOOKEEPER_CONNECT to the same address, yes
For example, downloading the file above and adding Kafka to the Druid Compose file...
version: "2.2"
volumes:
metadata_data: {}
middle_var: {}
historical_var: {}
broker_var: {}
coordinator_var: {}
overlord_var: {}
router_var: {}
services:
# TODO: Add sunbird
postgres:
container_name: postgres
image: postgres:latest
volumes:
- metadata_data:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=FoolishPassword
- POSTGRES_USER=druid
- POSTGRES_DB=druid
# Need 3.5 or later for container nodes
zookeeper:
container_name: zookeeper
image: zookeeper:3.5
environment:
- ZOO_MY_ID=1
druid-coordinator:
image: apache/incubator-druid
container_name: druid-coordinator
volumes:
- coordinator_var:/opt/druid/var
depends_on:
- zookeeper
- postgres
ports:
- "3001:8081"
command:
- coordinator
env_file:
- environment
# renamed to druid-broker
druid-broker:
image: apache/incubator-druid
container_name: druid-broker
volumes:
- broker_var:/opt/druid/var
depends_on:
- zookeeper
- postgres
- druid-coordinator
ports:
- "3002:8082"
command:
- broker
env_file:
- environment
# TODO: add other Druid services
kafka:
image: wurstmeister/kafka
ports:
- "9092"
depends_on:
- zookeeper
environment:
KAFKA_ADVERTISED_HOST_NAME: kafka
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181/kafka # This is the same service that Druid is using
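Once the combined file is up, a quick way to confirm that Kafka registered against Druid's Zookeeper (a sketch; the /kafka chroot matches the setting above, and wurstmeister images have kafka-topics.sh on the PATH):

docker-compose exec kafka kafka-topics.sh --zookeeper zookeeper:2181/kafka --list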
Can we tell kafka to share the zookeeper started by DRUID
Yes, as there's a zookeeper.connect setting for the Kafka broker that specifies the Zookeeper address to which Kafka will try to connect. How to do it depends entirely on the docker image you're using. For example, one of the popular images, wurstmeister/kafka-docker, does this by mapping all environment variables starting with KAFKA_ to broker settings and adding them to server.properties, so that KAFKA_ZOOKEEPER_CONNECT becomes the zookeeper.connect setting. I suggest taking a look at the official documentation to see what else you can configure.
and when we start kafka the docker file starts another zookeeper
This is your issue. It's the docker-compose file that starts Zookeeper, Kafka, and configures Kafka to use the bundled Zookeeper. You need to modify it, by removing the bundled Zookeeper and configuring Kafka to use a different one. Ideally, you should have a single docker-compose file that starts the whole setup.
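To verify that the override took effect inside the broker container (a sketch; the server.properties path assumes the wurstmeister image layout, and <kafka-container> stands for your container name):

docker exec <kafka-container> grep zookeeper.connect /opt/kafka/config/server.properties
# expected output: zookeeper.connect=zookeeper:2181/kafka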

Cannot test Apache Kafka official site commands using docker: no such file or directory

I am running Apache Kafka and Prometheus using Docker. The docker-compose file and other configurations are attached at the bottom of this post!
Introduction: First, I should explain that each Kafka metric works well in Prometheus, so there is no problem with the implementation and running of the images.
Problem: The only problem is when I want to test the stream (producer, broker, and consumer) following the tutorial on the official Apache Kafka site. Whenever I execute the commands found on the site, I get a "not found" error, because I don't know where the files actually are! For example, whenever I execute the bin/zookeeper-server-start.sh config/zookeeper.properties command, I get the following error:
no such file or directory: bin/zookeeper-server-start.sh
Attachments:
docker-compose.yml:
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    links:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_OPTS: -javaagent:/usr/app/jmx_prometheus_javaagent.jar=7071:/usr/app/prom-jmx-agent-config.yml
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  prometheus:
    image: prom/prometheus
    ports:
      - 9090:9090/tcp
    volumes:
      - ./mount/prometheus:/etc/prometheus
    links:
      - kafka
Dockerfile:
FROM wurstmeister/kafka
ADD prom-jmx-agent-config.yml /usr/app/prom-jmx-agent-config.yml
ADD jmx_prometheus_javaagent-0.10.jar /usr/app/jmx_prometheus_javaagent.jar
Question: Is there any way to find where the original files are located in the created container and execute them?
The quickstart on the Apache site never references Docker. Those scripts need to be downloaded (as part of Kafka), or you need to docker exec into the container to run them.
However, Docker already starts Kafka and Zookeeper, so you would not need to run those commands anyway. You could therefore skip straight to writing your own producers/consumers without using any of the provided scripts.
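For example, to run the quickstart console scripts from inside the broker container instead (a sketch; wurstmeister images keep them under /opt/kafka/bin, and the "test" topic here is assumed to be auto-created):

# open a shell in the kafka service started by docker-compose
docker-compose exec kafka bash
# the quickstart scripts live here rather than in ./bin
ls /opt/kafka/bin
/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test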

Starting Kafka topics using Docker Compose with spotify/kafka?

I am attempting to connect Kafka topics to my front-end Java Spring application. I am using Docker Compose and have tried to connect using two different Kafka images.
With wurstmeister/kafka I have been able to get the Kafka topics up and running with this service in my docker-compose.yml file, but I have not been able to connect the created topics to my front-end Java Spring application.
kafka:
  image: wurstmeister/kafka:0.10.2.0
  ports:
    - "9092:9092"
  expose:
    - "9092"
    - "2181"
  environment:
    KAFKA_ADVERTISED_HOST_NAME: localhost
    KAFKA_CREATE_TOPICS: "test-topic1:1:1, test-topic2:1:1"
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  depends_on:
    - zookeeper
Secondly, with spotify/kafka, I am having difficulty actually creating the topics. According to the documentation, the image looks for the topics in an environment variable, but the following docker-compose.yml service does not create a topic. I have also tried putting quotes around test-topic, but that did not work either.
kafka:
  image: spotify/kafka
  ports:
    - "9092:9092"
    - "2181:2181"
  hostname: kafka
  expose:
    - "9092"
    - "2181"
  environment:
    TOPICS: test-topic
I do not know if this is necessary, but my entire docker-compose.yml file is as follows; note that the zookeeper service is only required if you use wurstmeister/kafka.
docker-compose.yml
version: '2'
services:
  # zookeeper:
  #   image: wurstmeister/zookeeper
  #   ports:
  #     - "2181:2181"
  kafka:
    image: spotify/kafka
    ports:
      - "9092:9092"
      - "2181:2181"
    hostname: kafka
    expose:
      - "9092"
      - "2181"
    environment:
      TOPICS: test-topic
  redis:
    image: redis
    ports:
      - "6379"
    restart: always
  kafka-websocket-connector:
    build: ./kafka-websocket-connector
    image: andrewterra/kafka-websocket-connector
    ports:
      - "8077:8077"
      # - "9092:9092"
    depends_on:
      - kafka
      - redis
      # - zookeeper
    links:
      - kafka
      - redis
Rather late, but you could use something like the following to use a shell script to create your topic:
command: >
  bash -c
  "(sleep 15s &&
  /opt/kafka_2.11-0.10.1.0/bin/kafka-topics.sh
  --create
  --zookeeper localhost:2181 --replication-factor 1 --partitions 1
  --topic my_topic &) && (supervisord -n)"
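The (… &) && (supervisord -n) shape matters: topic creation is backgrounded after a grace period, while supervisord, which the spotify/kafka image uses to run Zookeeper and Kafka, stays in the foreground as the container's main process. To check that it worked (a sketch; <kafka-container> stands for your container name):

docker exec <kafka-container> /opt/kafka_2.11-0.10.1.0/bin/kafka-topics.sh --zookeeper localhost:2181 --list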
Container run command to create topic
Use the container command docker run --net=host --rm. In the following examples, Zookeeper is running on port 22181; substitute your own topic name and port as appropriate.
Create
docker run --net=host --rm confluentinc/cp-kafka:4.0.0 kafka-topics --create --topic customer --partitions 1 --replication-factor 1 --if-not-exists --zookeeper localhost:22181
Describe
docker run --net=host --rm confluentinc/cp-kafka:4.0.0 kafka-topics --zookeeper localhost:22181 --topic customer --describe
List
docker run --net=host --rm confluentinc/cp-kafka:4.0.0 kafka-topics --list --zookeeper localhost:22181
Delete
docker run --net=host --rm confluentinc/cp-kafka:4.0.0 kafka-topics --delete --topic customer --zookeeper localhost:22181
The TOPICS environment variable is only used for the kafkaproxy image: https://github.com/spotify/docker-kafka#running-the-proxy
For the kafka image, you will need to create the topics with a client.
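One way to do that is with a throwaway CLI container (a sketch; spotify/kafka publishes Zookeeper on host port 2181 in the compose file above, and the Confluent image is used here only for its kafka-topics tool):

docker run --rm --net=host confluentinc/cp-kafka:4.0.0 \
  kafka-topics --create --topic test-topic --partitions 1 --replication-factor 1 \
  --if-not-exists --zookeeper localhost:2181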
