I'm trying to run a docker-compose setup with Kafka and SonarQube so I can run my tests and collect coverage, but the two cannot run in parallel, and I can't find the reason.
If I split them into separate docker-compose files, start the SonarQube one, wait for it to come up, and then start the Kafka one, I see this in the SonarQube log:
sonarqube_1 | 2021.03.07 12:06:43 WARN app[][o.s.a.p.AbstractManagedProcess] Process exited with exit value [es]: 137
sonarqube_1 | 2021.03.07 12:06:43 INFO app[][o.s.a.SchedulerImpl] Process[es] is stopped
sonarqube_1 | 2021.03.07 12:06:43 INFO ce[][o.s.p.ProcessEntryPoint] Hard stopping process
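For what it's worth, exit value 137 is 128 + 9 (SIGKILL), which in Docker almost always means the process was killed for running out of memory; here it is SonarQube's embedded Elasticsearch ([es]) that dies. One way to check that suspicion (a sketch; substitute the actual container name from docker ps):

# Shows true if Docker's OOM killer terminated the container
docker inspect --format '{{.State.OOMKilled}} (exit {{.State.ExitCode}})' <sonarqube-container>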
Here is my docker-compose:
version: '2.1'
services:
zookeeper:
image: confluentinc/cp-zookeeper:5.5.1
hostname: zookeeper
ports:
- "2181:2181"
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
broker:
image: confluentinc/cp-server:5.5.1
hostname: broker
depends_on:
- zookeeper
ports:
- "9092:9092"
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_LOG_RETENTION_MS: 1000
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:29092
CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
CONFLUENT_METRICS_ENABLE: 'true'
CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
sonarqube:
image: sonarqube
expose:
- 9000
ports:
- "127.0.0.1:9000:9000"
networks:
- sonarnet
environment:
- SONARQUBE_JDBC_URL=jdbc:postgresql://db:5432/sonar
- SONARQUBE_JDBC_USERNAME=sonar
- SONARQUBE_JDBC_PASSWORD=sonar
volumes:
- sonarqube_conf:/opt/sonarqube/conf
- sonarqube_data:/opt/sonarqube/data
- sonarqube_extensions:/opt/sonarqube/extensions
- sonarqube_bundled-plugins:/opt/sonarqube/lib/bundled-plugins
db:
image: postgres
networks:
- sonarnet
environment:
- POSTGRES_USER=sonar
- POSTGRES_PASSWORD=sonar
volumes:
- postgresql:/var/lib/postgresql
- postgresql_data:/var/lib/postgresql/data
networks:
sonarnet:
driver: bridge
volumes:
sonarqube_conf:
sonarqube_data:
sonarqube_extensions:
sonarqube_bundled-plugins:
postgresql:
postgresql_data:
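If memory is indeed the bottleneck, the two stacks cannot run in parallel simply because Kafka, ZooKeeper, Postgres, and SonarQube's Elasticsearch together exceed what the Docker host (or Docker Desktop VM) is allowed to use. A hedged sketch of a mitigation: raise the Docker memory allocation, and optionally cap individual services so one stack cannot starve the other (mem_limit is honored by the version 2.x compose format used here; 4g is an arbitrary example):

  sonarqube:
    image: sonarqube
    mem_limit: 4g   # example value; size it to what your host can spare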
Related
I have a docker-compose file that starts 3 zookeeper containers and 3 kafka containers. In essence, I have a cluster with three brokers.
When running the docker-compose file, I get the following error:
mcflow_kafka_1 | [2022-09-11 03:50:52,600] INFO [feature-zk-node-event-process-thread]: Starting (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
mcflow_kafka_1 | [2022-09-11 03:50:52,747] INFO Updated cache from existing <empty> to latest FinalizedFeaturesAndEpoch(features=Features{}, epoch=0). (kafka.server.FinalizedFeatureCache)
mcflow_kafka_1 | [2022-09-11 03:50:52,750] INFO Cluster ID = 8Z5JcWoMRUC_qnTWg9q1wQ (kafka.server.KafkaServer)
mcflow_kafka_1 | [2022-09-11 03:50:52,755] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
mcflow_kafka_1 | [2022-09-11 03:50:52,755] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
mcflow_kafka_1 | kafka.common.InconsistentClusterIdException: The Cluster ID 8Z5JcWoMRUC_qnTWg9q1wQ doesn't match stored clusterId Some(maywBQemSJuPZzgIh3BE5A) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
mcflow_kafka_1 | at kafka.server.KafkaServer.startup(KafkaServer.scala:252)
mcflow_kafka_1 | at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
mcflow_kafka_1 | at kafka.Kafka$.main(Kafka.scala:82)
mcflow_kafka_1 | at kafka.Kafka.main(Kafka.scala)
This occurs because a meta.properties file stores a different clusterId than the one my broker generated. The problem is, I'm not sure how to locate or correct this meta.properties file, since I'm running Kafka in a Docker container.
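For reference, a sketch of how one might locate and clear it (assuming /var/lib/kafka/data, the default data directory in Confluent's cp-kafka images; verify against your broker's log.dirs):

# Inspect the stored cluster ID inside the broker container
docker exec mcflow_kafka_1 cat /var/lib/kafka/data/meta.properties

# Deleting the stale file, or simply recreating the containers with
# docker-compose down (no volume persists the data dir here),
# lets the broker rejoin with the current cluster ID
docker exec mcflow_kafka_1 rm /var/lib/kafka/data/meta.properties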
Below is the docker-compose.yml file I'm using:
---
version: "3.0"
services:
zookeeper1:
image: confluentinc/cp-zookeeper:6.1.0
restart: always
environment:
ZOOKEEPER_SERVER_ID: 1
ZOOKEEPER_CLIENT_PORT: "2181"
ZOOKEEPER_TICK_TIME: "2000"
ZOOKEEPER_SERVERS: "zookeeper1:22888:23888"
ports:
- "2181:2181"
zookeeper2:
image: confluentinc/cp-zookeeper:6.1.0
restart: always
environment:
ZOOKEEPER_SERVER_ID: 2
ZOOKEEPER_CLIENT_PORT: "2182"
ZOOKEEPER_TICK_TIME: "2000"
ZOOKEEPER_SERVERS: "zookeeper2:22888:23888"
ports:
- "2182:2181"
zookeeper3:
image: confluentinc/cp-zookeeper:6.1.0
restart: always
environment:
ZOOKEEPER_SERVER_ID: 3
ZOOKEEPER_CLIENT_PORT: "2183"
ZOOKEEPER_TICK_TIME: "2000"
ZOOKEEPER_SERVERS: "zookeeper2:22888:23888"
ports:
- "2183:2181"
kafka1:
container_name: mcflow_kafka_1
image: confluentinc/cp-kafka:6.1.0
depends_on:
- zookeeper1
- zookeeper2
- zookeeper3
ports:
# Exposes 29092 for external connections to the broker
# Use kafka1:9092 for connections internal on the docker network
# See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
#- "29092:29092"
- "9092:9092"
environment:
KAFKA_ZOOKEEPER_CONNECT: "zookeeper1:2181,zookeeper2:2182,zookeeper3:2183"
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: >-
PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: >-
PLAINTEXT://kafka1:9092,PLAINTEXT_HOST://localhost:29092
KAFKA_BROKER_ID: 1
KAFKA_BROKER_RACK: "r1"
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_DELETE_TOPIC_ENABLE: "true"
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
KAFKA_SCHEMA_REGISTRY_URL: "schemaregistry1:8085"
kafka2:
container_name: mcflow_kafka_2
image: confluentinc/cp-kafka:6.1.0
depends_on:
- zookeeper1
- zookeeper2
- zookeeper3
ports:
# Exposes 29092 for external connections to the broker
# Use kafka1:9092 for connections internal on the docker network
# See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
#- "29092:29092"
- "9093:9092"
environment:
KAFKA_ZOOKEEPER_CONNECT: "zookeeper1:2181,zookeeper2:2182,zookeeper3:2183"
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: >-
PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: >-
PLAINTEXT://kafka2:9093,PLAINTEXT_HOST://localhost:29092
KAFKA_BROKER_ID: 2
KAFKA_BROKER_RACK: "r1"
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_DELETE_TOPIC_ENABLE: "true"
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
KAFKA_SCHEMA_REGISTRY_URL: "schemaregistry1:8085"
kafka3:
container_name: mcflow_kafka_3
image: confluentinc/cp-kafka:6.1.0
depends_on:
- zookeeper1
- zookeeper2
- zookeeper3
ports:
# Exposes 29092 for external connections to the broker
# Use kafka1:9092 for connections internal on the docker network
# See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
#- "29092:29092"
- "9094:9092"
environment:
KAFKA_ZOOKEEPER_CONNECT: "zookeeper1:2181,zookeeper2:2182,zookeeper3:2183"
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: >-
PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: >-
PLAINTEXT://kafka3:9094,PLAINTEXT_HOST://localhost:29092
KAFKA_BROKER_ID: 3
KAFKA_BROKER_RACK: "r1"
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_DELETE_TOPIC_ENABLE: "true"
KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
KAFKA_SCHEMA_REGISTRY_URL: "schemaregistry1:8085"
How can I resolve this issue?
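One likely root cause, judging from the file itself rather than anything confirmed in the thread: each ZooKeeper service's ZOOKEEPER_SERVERS lists only a single node, and zookeeper3's even points at zookeeper2, so the three containers never form one ensemble. Each standalone ZooKeeper then stores its own cluster ID, and a broker that reconnects through a different one hits the InconsistentClusterIdException. In the cp-zookeeper image, every node should list the full ensemble, semicolon-separated, roughly like this:

      # same value on all three nodes; only ZOOKEEPER_SERVER_ID differs
      ZOOKEEPER_SERVERS: "zookeeper1:22888:23888;zookeeper2:22888:23888;zookeeper3:22888:23888"

After correcting it, recreate the containers (docker-compose down && docker-compose up) so any stale meta.properties is discarded rather than reused.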
I'm learning Docker and trying to set up a Kafka cluster with 3 ZooKeeper instances and 3 brokers.
Sometimes the brokers run fine, but sometimes they exit with status 1 and the logs show the following message:
kafka.common.InconsistentClusterIdException: The Cluster ID p2ouSB1rTKyUddeL26jpxg doesn't match stored clusterId Some(4QuK9rOmTleiRryAOA5zwA) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
I searched a lot before posting this question, and nothing I found resolved the issue.
At first I thought it was the volumes I was using to persist ZooKeeper data and logs on the host, but now that I've gotten rid of them I'm still getting the same error, and I have no idea how to set up a cluster ID through docker-compose and share it among all brokers. Is there a fix for this?
I'm running Docker version 20.10.10, build b485636 and Docker Compose version v2.1.1 on Windows.
This is my docker-compose.yml:
version: '3.9'
services:
zookeeper-1:
hostname: zookeeper-1
container_name: zookeeper-1
extends:
file: ./docker/services.yml
service: zookeeper
ports:
- '2181:2181'
environment:
ZOOKEEPER_SERVER_ID: 1
ZOOKEEPER_CLIENT_PORT: 2181
zookeeper-2:
hostname: zookeeper-2
container_name: zookeeper-2
extends:
file: ./docker/services.yml
service: zookeeper
ports:
- '2182:2182'
environment:
ZOOKEEPER_SERVER_ID: 2
ZOOKEEPER_CLIENT_PORT: 2182
zookeeper-3:
hostname: zookeeper-3
container_name: zookeeper-3
extends:
file: ./docker/services.yml
service: zookeeper
ports:
- '2183:2183'
environment:
ZOOKEEPER_SERVER_ID: 3
ZOOKEEPER_CLIENT_PORT: 2183
broker-1:
container_name: broker-1
extends:
file: ./docker/services.yml
service: broker
ports:
- '19092:9092'
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper-1:2181,zookeeper-2:2182,zookeeper-3:2183'
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker-1:9092,PLAINTEXT_HOST://localhost:19092
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker-1:9092
depends_on:
- zookeeper-1
- zookeeper-2
- zookeeper-3
broker-2:
container_name: broker-2
extends:
file: ./docker/services.yml
service: broker
ports:
- '29092:9092'
environment:
KAFKA_BROKER_ID: 2
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper-1:2181,zookeeper-2:2182,zookeeper-3:2183'
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker-2:9092,PLAINTEXT_HOST://localhost:29092
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker-2:9092
depends_on:
- zookeeper-1
- zookeeper-2
- zookeeper-3
broker-3:
container_name: broker-3
extends:
file: ./docker/services.yml
service: broker
ports:
- '39092:9092'
environment:
KAFKA_BROKER_ID: 3
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper-1:2181,zookeeper-2:2182,zookeeper-3:2183'
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker-3:9092,PLAINTEXT_HOST://localhost:39092
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker-3:9092
depends_on:
- zookeeper-1
- zookeeper-2
- zookeeper-3
connect:
image: confluentinc/cp-kafka-connect:latest
hostname: connect
container_name: connect
depends_on:
- broker-1
- broker-2
- broker-3
ports:
- '8083:8083'
environment:
CONNECT_BOOTSTRAP_SERVERS: 'broker-1:9092,broker-2:9092,broker-3:9092'
CONNECT_REST_ADVERTISED_HOST_NAME: connect
CONNECT_REST_PORT: 8083
CONNECT_GROUP_ID: compose-connect-group
CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
# CLASSPATH required due to CC-2422
CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-6.2.1.jar
CONNECT_PRODUCER_INTERCEPTOR_CLASSES: 'io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor'
CONNECT_CONSUMER_INTERCEPTOR_CLASSES: 'io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor'
CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components,/connect-plugins'
CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
control-center:
image: confluentinc/cp-enterprise-control-center:6.2.1
hostname: control-center
container_name: control-center
depends_on:
- broker-1
- broker-2
- broker-3
- connect
ports:
- '9021:9021'
environment:
CONTROL_CENTER_BOOTSTRAP_SERVERS: 'broker-1:9092,broker-2:9092,broker-3:9092'
CONTROL_CENTER_CONNECT_CONNECT-DEFAULT_CLUSTER: 'connect:8083'
#CONTROL_CENTER_KSQL_KSQLDB1_URL: 'http://ksqldb-server:8088'
#CONTROL_CENTER_KSQL_KSQLDB1_ADVERTISED_URL: 'http://localhost:8088'
#CONTROL_CENTER_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
CONTROL_CENTER_REPLICATION_FACTOR: 1
CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
CONFLUENT_METRICS_TOPIC_REPLICATION: 1
PORT: 9021
This is my services.yml file:
services:
zookeeper:
image: confluentinc/cp-zookeeper:6.2.1
environment:
ZOOKEEPER_TICK_TIME: 2000
ZOOKEEPER_PEER_PORT: 2888
ZOOKEEPER_LEADER_PORT: 3888
ZOOKEEPER_INIT_LIMIT: 5
ZOOKEEPER_SYNC_LIMIT: 2
broker:
image: confluentinc/cp-server:6.2.1
environment:
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
KAFKA_CONFLUENT_BALANCER_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_JMX_PORT: 9101
KAFKA_JMX_HOSTNAME: localhost
# KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL: http://schema-registry:8081
CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
CONFLUENT_METRICS_ENABLE: 'true'
CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
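A possible explanation for the intermittent failures (an inference from the files, not confirmed in the thread): the shared zookeeper service in services.yml never sets ZOOKEEPER_SERVERS, so the three instances run standalone rather than as one ensemble, and each stores its own cluster ID. Which ID a broker sees then depends on which entry of KAFKA_ZOOKEEPER_CONNECT it happens to reach, which would make the failures appear random. A sketch of the missing setting for services.yml:

  zookeeper:
    image: confluentinc/cp-zookeeper:6.2.1
    environment:
      # join the three extending instances into a single ensemble
      ZOOKEEPER_SERVERS: "zookeeper-1:2888:3888;zookeeper-2:2888:3888;zookeeper-3:2888:3888"
      # ...existing settings unchanged...

As with the previous question, recreate the containers after the change so no stale meta.properties survives.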
I have a Kafka cluster that I'm managing with Docker.
I have one container running the broker and another running the PySpark program that is supposed to connect to the Kafka topic inside the broker container.
If I run the PySpark script on my local laptop everything works perfectly, but if I run the same code from inside the pyspark container I get the following error:
AnalysisException: Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide"
Does anybody know how I can access a Kafka topic running in a Docker container from another container?
This is the PySpark code
df = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "localhost:9092") \
.option("failOnDataLoss", "false") \
.option("subscribe", "tweets") \
.option("startingOffsets", "earliest") \
.load()
This is my docker-compose.yml
version: "2"
services:
zookeeper:
image: confluentinc/cp-zookeeper:6.1.0
hostname: zookeeper
container_name: zookeeper
ports:
- "2181:2181"
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
broker:
image: confluentinc/cp-server:6.1.0
hostname: broker
container_name: broker
depends_on:
- zookeeper
ports:
- "9092:9092"
- "9101:9101"
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: "zookeeper:2181"
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
KAFKA_CONFLUENT_BALANCER_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_JMX_PORT: 9101
KAFKA_JMX_HOSTNAME: localhost
KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL: http://schema-registry:8081
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:29092
CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
CONFLUENT_METRICS_ENABLE: "true"
CONFLUENT_SUPPORT_CUSTOMER_ID: "anonymous"
schema-registry:
image: confluentinc/cp-schema-registry:6.1.0
hostname: schema-registry
container_name: schema-registry
depends_on:
- broker
ports:
- "8081:8081"
environment:
SCHEMA_REGISTRY_HOST_NAME: schema-registry
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: "broker:29092"
SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
connect:
image: cnfldemos/cp-server-connect-datagen:latest
hostname: connect
container_name: connect
build:
context: .
dockerfile: Dockerfile_connect
depends_on:
- broker
- schema-registry
ports:
- "8083:8083"
environment:
CONNECT_BOOTSTRAP_SERVERS: "broker:29092"
CONNECT_REST_ADVERTISED_HOST_NAME: connect
CONNECT_REST_PORT: 8083
CONNECT_GROUP_ID: compose-connect-group
CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
# CLASSPATH required due to CC-2422
CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-6.1.0.jar
CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
control-center:
image: confluentinc/cp-enterprise-control-center:6.1.0
hostname: control-center
container_name: control-center
depends_on:
- broker
- schema-registry
- connect
- ksqldb-server
ports:
- "9021:9021"
environment:
CONTROL_CENTER_BOOTSTRAP_SERVERS: "broker:29092"
CONTROL_CENTER_CONNECT_CLUSTER: "connect:8083"
CONTROL_CENTER_KSQL_KSQLDB1_URL: "http://ksqldb-server:8088"
CONTROL_CENTER_KSQL_KSQLDB1_ADVERTISED_URL: "http://localhost:8088"
CONTROL_CENTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
CONTROL_CENTER_REPLICATION_FACTOR: 1
CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
CONFLUENT_METRICS_TOPIC_REPLICATION: 1
PORT: 9021
ksqldb-server:
image: confluentinc/cp-ksqldb-server:6.1.0
hostname: ksqldb-server
container_name: ksqldb-server
depends_on:
- broker
- connect
ports:
- "8088:8088"
environment:
KSQL_CONFIG_DIR: "/etc/ksql"
KSQL_BOOTSTRAP_SERVERS: "broker:29092"
KSQL_HOST_NAME: ksqldb-server
KSQL_LISTENERS: "http://0.0.0.0:8088"
KSQL_CACHE_MAX_BYTES_BUFFERING: 0
KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
KSQL_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
KSQL_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
KSQL_KSQL_CONNECT_URL: "http://connect:8083"
KSQL_KSQL_LOGGING_PROCESSING_TOPIC_REPLICATION_FACTOR: 1
KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
ksqldb-cli:
image: confluentinc/cp-ksqldb-cli:6.1.0
container_name: ksqldb-cli
depends_on:
- broker
- connect
- ksqldb-server
entrypoint: /bin/sh
tty: true
ksql-datagen:
image: confluentinc/ksqldb-examples:6.1.0
hostname: ksql-datagen
container_name: ksql-datagen
depends_on:
- ksqldb-server
- broker
- schema-registry
- connect
command: "bash -c 'echo Waiting for Kafka to be ready... && \
cub kafka-ready -b broker:29092 1 40 && \
echo Waiting for Confluent Schema Registry to be ready... && \
cub sr-ready schema-registry 8081 40 && \
echo Waiting a few seconds for topic creation to finish... && \
sleep 11 && \
tail -f /dev/null'"
environment:
KSQL_CONFIG_DIR: "/etc/ksql"
STREAMS_BOOTSTRAP_SERVERS: broker:29092
STREAMS_SCHEMA_REGISTRY_HOST: schema-registry
STREAMS_SCHEMA_REGISTRY_PORT: 8081
rest-proxy:
image: confluentinc/cp-kafka-rest:6.1.0
depends_on:
- broker
- schema-registry
ports:
- 8082:8082
hostname: rest-proxy
container_name: rest-proxy
environment:
KAFKA_REST_HOST_NAME: rest-proxy
KAFKA_REST_BOOTSTRAP_SERVERS: "broker:29092"
KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
KAFKA_REST_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.11.2
container_name: elasticsearch
environment:
- xpack.security.enabled=false
- discovery.type=single-node
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
cap_add:
- IPC_LOCK
volumes:
- elasticsearch-data:/usr/share/elasticsearch/data
ports:
- 9200:9200
- 9300:9300
kibana:
container_name: kibana
image: docker.elastic.co/kibana/kibana:7.11.2
environment:
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
ports:
- 5601:5601
depends_on:
- elasticsearch
pyspark:
image: jupyter/pyspark-notebook:latest
hostname: pyspark
container_name: pyspark
ports:
- 8888:8888
#volumes:
# - pySpark: pySpark/
depends_on:
- broker
volumes:
elasticsearch-data:
driver: local
pySpark:
driver: local
There are several problems in your setup:
You don't add the package for Kafka support, as described in the docs. It needs to be added either when starting pyspark or when initializing the session, something like this (change 3.0.1 to the Spark version used in your Jupyter container):
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('my_app')\
    .config('spark.jars.packages', 'org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1')\
    .getOrCreate()
You're connecting to "localhost:9092", but that's incorrect inside the container: the Jupyter container has its own localhost, which is different from the Kafka container's localhost. You need to use the service name broker instead, as declared in the docker-compose.yml, together with the port of the PLAINTEXT listener (29092).
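Putting the two fixes together, the question's readStream block would look roughly like this (a sketch; broker:29092 comes from the KAFKA_ADVERTISED_LISTENERS entry in the compose file above):

df = spark.readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "broker:29092") \
    .option("failOnDataLoss", "false") \
    .option("subscribe", "tweets") \
    .option("startingOffsets", "earliest") \
    .load()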
I am using a docker-compose.yml file for my Kafka setup, and it works as expected. Since I am connecting to an Oracle database, I need to install the ojdbc driver as well. So I modified my compose file to download the ojdbc jar directly, but after adding this code I am no longer able to start Kafka Connect.
Additional code added in the file:
command:
- /bin/bash
- -c
- /
cd /usr/share/java/kafka-connect-jdbc/
curl https://maven.xwiki.org/externals/com/oracle/jdbc/ojdbc8/12.2.0.1/ojdbc8-12.2.0.1.jar -o ojdbc8-12.2.0.1.jar
/etc/confluent/docker/run
I had also tried to start the Kafka Connect container manually, but that doesn't work either. As the (omitted) screenshot showed, all services were created successfully.
Full docker-compose.yml file:
---
version: '2'
services:
zookeeper:
image: confluentinc/cp-zookeeper:5.4.0
hostname: zookeeper
container_name: zookeeper
ports:
- "2181:2181"
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
broker:
image: confluentinc/cp-kafka:5.4.0
hostname: broker
container_name: broker
depends_on:
- zookeeper
ports:
- "29092:29092"
- "9092:9092"
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
schema-registry:
image: confluentinc/cp-schema-registry:5.4.0
hostname: schema-registry
container_name: schema-registry
depends_on:
- zookeeper
- broker
ports:
- "8081:8081"
environment:
SCHEMA_REGISTRY_HOST_NAME: schema-registry
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
kafka-connect:
image: confluentinc/cp-kafka-connect:5.4.0
hostname: kafka-connect
container_name: kafka-connect
depends_on:
- broker
- schema-registry
ports:
- "8083:8083"
environment:
CONNECT_BOOTSTRAP_SERVERS: "broker:29092"
CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
CONNECT_REST_PORT: 8083
CONNECT_GROUP_ID: kafka-connect
CONNECT_CONFIG_STORAGE_TOPIC: kafka-connect-configs
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
CONNECT_OFFSET_STORAGE_TOPIC: kafka-connect-offsets
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
CONNECT_STATUS_STORAGE_TOPIC: kafka-connect-status
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECT_ZOOKEEPER_CONNECT: 'zookeeper:2181'
CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
command:
- /bin/bash
- -c
- /
cd /usr/share/java/kafka-connect-jdbc/
curl https://maven.xwiki.org/externals/com/oracle/jdbc/ojdbc8/12.2.0.1/ojdbc8-12.2.0.1.jar -o ojdbc8-12.2.0.1.jar
/etc/confluent/docker/run
ksqldb-server:
image: confluentinc/ksqldb-server:0.9.0
hostname: ksqldb-server
container_name: ksqldb-server
depends_on:
- broker
- kafka-connect
ports:
- "8088:8088"
environment:
KSQL_CONFIG_DIR: "/etc/ksql"
KSQL_BOOTSTRAP_SERVERS: "broker:29092"
KSQL_HOST_NAME: ksqldb-server
KSQL_LISTENERS: "http://0.0.0.0:8088"
KSQL_CACHE_MAX_BYTES_BUFFERING: 0
KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
KSQL_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
KSQL_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
KSQL_KSQL_CONNECT_URL: "http://connect:8083"
ksqldb-cli:
image: confluentinc/ksqldb-cli:0.9.0
container_name: ksqldb-cli
depends_on:
- broker
- kafka-connect
- ksqldb-server
entrypoint: /bin/sh
tty: true
rest-proxy:
image: confluentinc/cp-kafka-rest:5.4.0
depends_on:
- zookeeper
- broker
- schema-registry
ports:
- "8082:8082"
hostname: rest-proxy
container_name: rest-proxy
environment:
KAFKA_REST_HOST_NAME: rest-proxy
KAFKA_REST_BOOTSTRAP_SERVERS: 'broker:29092'
KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
OK, so this got resolved after adding this line at the bottom of the command section:
sleep infinity
sleep infinity is necessary, because we've sent the /etc/confluent/docker/run process to a background thread (&) and so the container will exit if the main command finishes.
More details here: https://rmoff.net/2018/12/15/docker-tips-and-tricks-with-ksql-and-kafka/
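For completeness, the working command section presumably ends up looking like this (a sketch following that post: note the | literal block so the lines stay separate shell commands, the & that backgrounds the main process, and the final sleep infinity):

command:
  - /bin/bash
  - -c
  - |
    cd /usr/share/java/kafka-connect-jdbc/
    curl https://maven.xwiki.org/externals/com/oracle/jdbc/ojdbc8/12.2.0.1/ojdbc8-12.2.0.1.jar -o ojdbc8-12.2.0.1.jar
    /etc/confluent/docker/run &
    sleep infinity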
I have the following docker-compose.yml file
version: '3'
services:
zookeeper:
ports:
- "2181:2181"
hostname: zookeeper
image: confluentinc/cp-zookeeper
environment:
ZOOKEEPER_CLIENT_PORT: 2181
ZOOKEEPER_TICK_TIME: 2000
broker:
image: confluentinc/cp-enterprise-kafka
hostname: broker
ports:
- "9092:9092"
depends_on:
- zookeeper
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://broker:9092'
KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:9092
CONFLUENT_METRICS_REPORTER_ZOOKEEPER_CONNECT: zookeeper:2181
CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
CONFLUENT_METRICS_ENABLE: 'true'
CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
test:
image: java:8
volumes:
- .:/src
- ${GRADLE_USER_HOME}:/gradle_mount
environment:
- GRADLE_USER_HOME=/gradle_mount
- SPRING_KAFKA_BOOTSTRAP-SERVERS=broker:9092
depends_on:
- broker
- zookeeper
working_dir: /src
privileged: true
command: "./gradlew --no-daemon clean test --info"
The tests pass locally but fail in Jenkins and I suspect it is because the tests are firing off before the Kafka service is running.
Is there any way to make the test container wait for the service on the Kafka container (broker) to start?
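One common approach (a sketch, not something from the thread): give the broker a healthcheck and gate the test service on it. Note this relies on depends_on conditions, which the 2.1 compose format and the newer Compose spec support but classic version '3' files do not, so the version line may need changing; the probe also assumes a broker image whose kafka-topics CLI accepts --bootstrap-server (Kafka 2.2+):

  broker:
    # ...existing broker config...
    healthcheck:
      test: ["CMD-SHELL", "kafka-topics --bootstrap-server localhost:9092 --list"]
      interval: 10s
      timeout: 10s
      retries: 12
  test:
    # ...existing test config...
    depends_on:
      broker:
        condition: service_healthy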