Logstash Kafka input cannot connect - Docker

I am trying to build a pipeline based on this tutorial, where Kafka reads from a file with a File Source connector. Using these Docker images for the Elastic Stack, I want to register Logstash as a consumer for the "quickstart-data" topic, but so far I have failed.
Here is my logstash.conf file:
input {
  kafka {
    bootstrap_servers => 'localhost:9092'
    topics => 'quickstart-data'
  }
}
output {
  elasticsearch {
    hosts => [ 'elasticsearch' ]
    user => 'elastic'
    password => 'changeme'
  }
  stdout {}
}
The connection to Elasticsearch works because I tested it with a heartbeat input.
The error message I get is the following:
Connection to node -1 could not be established. Broker may not be available.
Give up sending metadata request since no node is available
Any ideas?

I would recommend you keep things simple and use Kafka Connect for landing the data to Elasticsearch too: https://docs.confluent.io/current/connect/connect-elasticsearch/docs/elasticsearch_connector.html#quick-start

There may be a better way to do it, but here is how I corrected the issue:
I changed my Zookeeper and Kafka images to the Confluent images:
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  ports:
    - "2181:2181"
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
  networks:
    - stack
kafka:
  image: confluentinc/cp-kafka:latest
  ports:
    - "9092:9092"
  environment:
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  depends_on:
    - zookeeper
  networks:
    - stack
Logstash configuration (please note that the port is 29092, the listener advertised to containers on the Docker network):
input {
  stdin {}
  kafka {
    id => "my_kafka_1"
    bootstrap_servers => "kafka:29092"
    topics => "test"
  }
}
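The kafka:29092 address works because Logstash runs inside the same Compose network, so it must use the internally advertised listener rather than localhost:9092. For completeness, here is a sketch of how the Logstash service could be attached to the same stack network; the image tag and pipeline path are my assumptions, so adapt them to your stack version:

logstash:
  image: docker.elastic.co/logstash/logstash:6.2.4   # assumption: use your Elastic Stack tag
  volumes:
    - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
  depends_on:
    - kafka
  networks:
    - stack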

Related

Why can't I access Kafka from an external machine?

I am new to Kafka. I tried to set up my first cluster on a VPS with docker-compose, but I still cannot access it from my local PC (outside the host).
Here is my docker-compose file:
version: '2'
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
  zookeeper-2:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 32181:2181
  kafka-1:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
      - zookeeper-2
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181
      KAFKA_LISTENERS: EXTERNAL_SAME_HOST://:29092,EXTERNAL_DIFFERENT_HOST://:29093,INTERNAL://:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL_SAME_HOST://localhost:29092,EXTERNAL_DIFFERENT_HOST://XXX.XXX.XXX.XXX:29093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL_SAME_HOST:PLAINTEXT,EXTERNAL_DIFFERENT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  kafka-2:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
      - zookeeper-2
    ports:
      - 39092:39092
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181,zookeeper-2:2181
      KAFKA_LISTENERS: EXTERNAL_SAME_HOST://:39092,EXTERNAL_DIFFERENT_HOST://:39093,INTERNAL://:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9093,EXTERNAL_SAME_HOST://localhost:39092,EXTERNAL_DIFFERENT_HOST://XXX.XXX.XXX.XXX:39093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL_SAME_HOST:PLAINTEXT,EXTERNAL_DIFFERENT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
I searched the logs and found that there are never any available brokers (always 0), because the server couldn't connect to "kafka:9092" and Zookeeper keeps failing to connect to the brokers:
[2022-04-13 14:56:59,422] WARN Session 0x0 for server My-vps-URL/XXX.XXX.XXX.XXX:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. (org.apache.zookeeper.ClientCnxn)
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30006ms for session id 0x0
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1258)
KeeperErrorCode = ConnectionLoss for /brokers/ids
How can I fix this?
Note that I tried a similar config with a different Docker image (Bitnami's) and with a different cluster config (1 Zookeeper, 1 broker), and it still doesn't work.
You have a Zookeeper error because you are running an even number of them. The number of Zookeepers doesn't need to match the number of brokers; it just needs to be odd, up to a maximum of 7. You also shouldn't need to map ports on the Zookeeper servers.
For your Kafka connection:
KAFKA_LISTENERS needs to bind to 0.0.0.0 so the server listens on all interfaces.
You need to map ports 29093 and 39093, since those are your "different host" listeners; you currently only map the ports for connecting from the same machine.
Your clients need to connect to the EXTERNAL_DIFFERENT_HOST address you've set, not kafka:9092.
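Putting those three fixes together, a sketch of what kafka-1 could look like; the internal hostname kafka-1 (matching the service name) is my assumption, and XXX.XXX.XXX.XXX is still your VPS's public IP:

kafka-1:
  image: confluentinc/cp-kafka:latest
  depends_on:
    - zookeeper-1
  ports:
    - 29092:29092   # same-host clients
    - 29093:29093   # different-host clients; this mapping was missing
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181
    # Bind every listener to all interfaces so external traffic is accepted
    KAFKA_LISTENERS: INTERNAL://0.0.0.0:9092,EXTERNAL_SAME_HOST://0.0.0.0:29092,EXTERNAL_DIFFERENT_HOST://0.0.0.0:29093
    # Advertise an address that each kind of client can actually reach
    KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-1:9092,EXTERNAL_SAME_HOST://localhost:29092,EXTERNAL_DIFFERENT_HOST://XXX.XXX.XXX.XXX:29093
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL_SAME_HOST:PLAINTEXT,EXTERNAL_DIFFERENT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

Clients on another machine would then bootstrap with XXX.XXX.XXX.XXX:29093.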
Further reading - Connect to Kafka running in Docker
tried a similar config with a different docker image ( bitnami's )
That image has different variables, but the same basic answer as above applies.
different cluster config ( 1 zookeeper 1 broker )
There is little benefit to running multiple Zookeepers or brokers on the same machine, so I suggest getting that configuration working first, as sketched below.
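For reference, a minimal single-Zookeeper, single-broker compose file might look like this; it is only a sketch, under the same listener assumptions as above, with XXX.XXX.XXX.XXX standing in for your VPS IP:

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - 29093:29093   # the only port external clients need
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:29093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL://XXX.XXX.XXX.XXX:29093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1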

Can't access endpoint on exposed port in docker-compose but can ping that container

My docker-compose file looks like this:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    expose:
      - "2181"
  kafka:
    build: .
    depends_on:
      - zookeeper
    expose:
      - "8778"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:8778
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_OPTS: '-javaagent:/usr/jolokia/agents/jolokia-jvm.jar'
  telegraf:
    image: telegraf:latest
    links:
      - "kafka"
      - "zookeeper"
    environment:
      JOLOKIA_AGENT_URL: http://kafka:8778/jolokia/
      ZOOKEEPER_CONNECTION_STRING: http://zookeeper:2181
    volumes:
      - ./telegraf.conf:/etc/telegraf/telegraf.conf
Example: I can ping kafka successfully from telegraf, and I can hit the endpoint I want from inside the kafka container (curl against localhost works there). I cannot, however, reach the /jolokia/read endpoint exposed by the kafka container on port 8778 from the telegraf container.
What am I missing?
I suggest you remove the links section; it has been deprecated by Compose for years.
Compose starts its own bridge network, so if you exec into the telegraf or zookeeper containers, ping kafka should work; DNS is therefore working, and curl should be as well.
Note: adding Jolokia to Zookeeper should also be done
I'll also point out that the Confluent Helm Charts already provide Prometheus and Jolokia integration
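One more thing worth checking, though this is an assumption on my part rather than something verified in this setup: the Jolokia JVM agent binds to localhost by default, which would match the symptom of curl working inside the kafka container but not from telegraf. Passing host=0.0.0.0 to the agent makes it listen on all interfaces:

kafka:
  environment:
    # host=0.0.0.0 makes the Jolokia agent reachable from other containers
    KAFKA_OPTS: '-javaagent:/usr/jolokia/agents/jolokia-jvm.jar=host=0.0.0.0'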

docker-compose kafka - local machine client cannot produce message to kafka

I've read a lot of similar questions, but they don't answer my problem here.
I'm trying to run some short integration tests using docker-compose 3 and a single-node Kafka. On the client side I'm using Go (Shopify/sarama) to consume and produce.
zookeeper:
  image: confluentinc/cp-zookeeper:5.2.2
  hostname: zookeeper
  container_name: zookeeper
  ports:
    - "2181:2181"
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
kafka:
  image: confluentinc/cp-enterprise-kafka:5.2.2
  hostname: kafka
  container_name: kafka
  depends_on:
    - zookeeper
  ports:
    - "29092:29092"
  expose:
    - 9092
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
I have another container in the same docker-compose file that connects using:
- "BROKERS_URL=kafka:9092"
the consumer is working just fine:
Sarama consumer up and running. {"brokers": ["kafka:9092"], "topics": ["validated"], "group": "event-service"}
But on the producer part, running directly from my machine:
kafka: client has run out of available brokers to talk to (Is your cluster reachable?)
producer, err := sarama.NewSyncProducer([]string{"http://localhost:29092"}, nil)
...
msg := &sarama.ProducerMessage{
    Topic: "validated",
    Key:   sarama.StringEncoder(""),
    Value: sarama.ByteEncoder(payload),
}
partition, offset, err := producer.SendMessage(msg)
...
Nothing weird or extravagant here, but it's not working and I'm confused.
also:
nc -vz localhost 29092
Connection to localhost port 29092 [tcp/*] succeeded!
Instead of
KAFKA_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
you need
KAFKA_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://0.0.0.0:29092
Testing connectivity from my host machine using kafkacat shows that this works:
➜ kafkacat -b localhost:29092 -L
Metadata for all topics (from broker 1: localhost:29092/1):
1 brokers:
broker 1 at localhost:29092 (controller)
0 topics:
The difference is that the listener now binds to all available interfaces (0.0.0.0). With your original configuration, it bound to the loopback interface (lo) for localhost, and so only accepted traffic locally, not externally.
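In short, KAFKA_LISTENERS is what the broker process binds to, while KAFKA_ADVERTISED_LISTENERS is what it hands back to clients in metadata responses. A minimal sketch of the working pair:

environment:
  # What the broker binds to inside the container:
  KAFKA_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://0.0.0.0:29092
  # What the broker tells clients to connect back to:
  KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092

One aside on the Go snippet in the question: Kafka bootstrap addresses are plain host:port pairs, so "localhost:29092" rather than "http://localhost:29092" is the safer form to pass to sarama.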

Docker alias incorrectly resolved by application when container started

I am creating two docker-compose files (mainly because I don't want to keep restarting my infrastructure while developing my application) that need to reside on the same Docker network so they can use alias names to connect.
The files look similar to the following:
APP:
version: '3.5'
networks:
  default:
    name: kafka_network
    driver: bridge
services:
  client:
    build:
      context: .
      dockerfile: ./Dockerfile
    working_dir: /app/
    command: ./client
    environment:
      BADDR: kafka:9092
      CGROUP: test_group
      TOPICS: my-topic
INFRASTRUCTURE:
version: '3.5'
networks:
  default:
    name: kafka_network
    driver: bridge
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    ports:
      - 2181:2181
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
My issue is that the client doesn't resolve kafka:9092 correctly; it always resolves to 127.0.0.1:9092.
ERROR:
Broker: kafka:9092
Consumer_Group: my_group
Topics: [my-topic]
Created Consumer rdkafka#consumer-1
% Error: GroupCoordinator: Connect to ipv4#127.0.0.1:9092 failed: Connection refused (after 0ms in state CONNECT)
When run locally, it appears to run fine, so I am really confused as to what the issue might be. If anyone knows anything about this I would be very grateful!
LOCAL:
[procyclinsur#P-428 client]$ ./client
Broker: localhost:9092
Consumer_Group: my-group
Topics: [my-topic]
Created Consumer rdkafka#consumer-1
% AssignedPartitions: [my-topic[0]#unset]
% Message on my-topic[0]#0:
hello mate
That's a problem related to your Kafka config, not to Docker at all.
Look at:
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
This sets up two listeners whose addresses your clients receive over Kafka's protocol when they connect: a client connecting on port 29092 is told to reach Kafka at the "kafka" DNS name, and a client connecting on port 9092 is told to reach Kafka at "localhost".
It works locally for you because your Kafka container exposes port 9092 on localhost via the Docker ports section.
Here is an article that describes this topic well: https://rmoff.net/2018/08/02/kafka-listeners-explained/
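Given that, the likely fix on the APP side is to point the client at the listener advertised for in-network clients; a sketch, assuming BADDR is read as the bootstrap address (which the error output suggests):

services:
  client:
    environment:
      # Use the listener advertised as kafka:29092 inside the Compose network
      BADDR: kafka:29092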

Kafka Client Timeout of 60000ms expired before the position for partition could be determined

I'm trying to connect Flink to Kafka as a consumer.
I'm using Docker Compose to build 4 containers: zookeeper, kafka, Flink JobManager, and Flink TaskManager.
For zookeeper and Kafka I'm using wurstmeister images, and for Flink the official image.
docker-compose.yml
version: '3.1'
services:
  zookeeper:
    image: wurstmeister/zookeeper:3.4.6
    hostname: zookeeper
    expose:
      - "2181"
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:2.11-2.0.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    hostname: kafka
    links:
      - zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_PORT: 9092
      KAFKA_CREATE_TOPICS: 'pipeline:1:1:compact'
  jobmanager:
    build: ./flink_pipeline
    depends_on:
      - kafka
    links:
      - zookeeper
      - kafka
    expose:
      - "6123"
    ports:
      - "8081:8081"
    command: jobmanager
    environment:
      JOB_MANAGER_RPC_ADDRESS: jobmanager
      BOOTSTRAP_SERVER: kafka:9092
      ZOOKEEPER: zookeeper:2181
  taskmanager:
    image: flink
    expose:
      - "6121"
      - "6122"
    links:
      - jobmanager
      - zookeeper
      - kafka
    depends_on:
      - jobmanager
    command: taskmanager
    # links:
    #   - "jobmanager:jobmanager"
    environment:
      JOB_MANAGER_RPC_ADDRESS: jobmanager
And when I submit a simple job to the Dispatcher, the job fails with the following error:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before the position for partition pipeline-0 could be determined
My Job code is:
public class Main {
    public static void main(String[] args) throws Exception {
        // get the execution environment
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // get input data by connecting to the socket
        Properties properties = new Properties();
        String bootstrapServer = System.getenv("BOOTSTRAP_SERVER");
        String zookeeperServer = System.getenv("ZOOKEEPER");

        if (bootstrapServer == null) {
            System.exit(1);
        }

        properties.setProperty("zookeeper", zookeeperServer);
        properties.setProperty("bootstrap.servers", bootstrapServer);
        properties.setProperty("group.id", "pipeline-analysis");

        FlinkKafkaConsumer kafkaConsumer = new FlinkKafkaConsumer<String>("pipeline", new SimpleStringSchema(), properties);
        // kafkaConsumer.setStartFromGroupOffsets();
        kafkaConsumer.setStartFromLatest();

        DataStream<String> stream = env.addSource(kafkaConsumer);

        // Defining Pipeline here

        // Printing Outputs
        stream.print();

        env.execute("Stream Pipeline");
    }
}
I know I'm late to the party, but I had the exact same error. In my case I was not setting up the TopicPartitions correctly. My topic had 2 partitions and my producer was producing messages just fine, but it was my Spark streaming application, as the consumer, that wasn't really starting, giving up after 60 seconds with the same error.
Wrong code that I had:
List<TopicPartition> topicPartitionList = Arrays.asList(new TopicPartition(topicName, Integer.parseInt(numPartitions)));
Correct code:
List<TopicPartition> topicPartitionList = new ArrayList<TopicPartition>();
for (int i = 0; i < Integer.parseInt(numPartitions); i++) {
    topicPartitionList.add(new TopicPartition(topicName, i));
}
I had an error that looks the same.
17:34:37.668 [org.springframework.kafka.KafkaListenerEndpointContainer#1-0-C-1] ERROR o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-3, groupId=api.dev] User provided listener org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer$ListenerConsumerRebalanceListener failed on partition assignment
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before the position for partition aaa-1 could be determined
Turns out my hosts file had been changed, so the broker address was wrong.
Try these logger settings to debug in more detail:
<logger name="org.apache.kafka.clients.consumer.internals.Fetcher" level="info" />
I was having issues with this error in a vSphere Integrated Containers environment. For me the problem was that I was advertising the hostname and not the IP, so I had to set the hostname and container name in my compose file.
Here are my settings that finally worked:
kafka:
  depends_on:
    - zookeeper
  image: wurstmeister/kafka
  ports:
    - "9092:9092"
  mem_limit: 10g
  container_name: kafka
  hostname: kafka
  environment:
    KAFKA_ADVERTISED_LISTENERS: OUTSIDE://kafka:9092
    KAFKA_LISTENERS: OUTSIDE://0.0.0.0:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: OUTSIDE:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: OUTSIDE
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: <REPLACE_WITH_IP>:2181
I had the same problem; the issue was a wrong host entry for the Kafka node in my /etc/hosts file!
