Docker, Kafka - replication doesn't work between remote brokers - docker

I have Docker images of Kafka brokers and ZooKeeper; call them z1, b1 and b2 for now.
They are deployed on two physical servers, s1 and s2, as follows:
s1 contains z1 and b1
s2 contains b2
In their own docker-compose.yml files, ZooKeeper has its ports mapped as follows:
- 2181:2181
- 2888:2888
- 3888:3888
and the brokers as follows:
- 9092:9092
A topic with --replication-factor 2 and --partitions 4 can be created.
No data is ever pushed to the topic, but the following problem still occurs.
If kafka-topics --describe --topic <name_of_topic> --zookeeper <zookeeperIP:port> is run shortly after topic creation, everything is in sync and looks good. On a second run (after a short delay), b1 has removed b2's partition replicas from its in-sync replicas, but b2 has not removed b1's partition replicas from its in-sync replicas.
The server.log on b1 shows many of these exceptions:
WARN [ReplicaFetcherThread-0-1], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@42746de3 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to ef447651b07a:9092 (id: 1 rack: null) failed
    at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:83)
    at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:93)
    at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:248)
    at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
    at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Swapping leadership between b1 and b2 works when they are shut down and started again, but then only the last one online is in full control of the topic: it is the leader for all partitions and the only in-sync replica, even after the other broker comes back online.
I tried cleaning all data and resetting both brokers and ZooKeeper, but the problem persists.
Why aren't the partitions properly replicated?

It looks like the brokers b1 and b2 can't talk to each other, which indicates a Docker-related networking issue (and such Docker networking issues are quite common in general).
You'd need to share more information for further help, e.g. the contents of the docker-compose.yml file(s) as well as the Dockerfile you use to create your images. I also wonder why you have created different images for the two brokers. Typically you only need a single Kafka broker image, and then simply launch multiple containers (one per desired broker) off of that image.

I figured it out. There was a problem with the network, as Michael G. Noll said.
Firstly, I don't map ports manually anymore and use the host network instead. It's easier to manage.
Secondly, b1 and b2 had listeners set like so:
listeners=PLAINTEXT://:9092
Neither had an IP specified, so 0.0.0.0 was used by default and there was a collision, as they both listened there and pushed the same connection information to ZooKeeper.
Final configuration:
b1 and b2 docker-compose.yml use host network:
network_mode: "host"
b1 server.properties - listeners:
listeners=PLAINTEXT://<s1_IP>:9092
b2 server.properties - listeners:
listeners=PLAINTEXT://<s2_IP>:9092
Everything works fine now, replication is working, even on broker restarts.
Data can be produced and consumed correctly.
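A quick check for this kind of misconfiguration can be scripted. The following is a minimal sketch, not part of Kafka: `parse_listeners` and `check_listeners` are hypothetical helpers that parse a `listeners`-style value and flag entries whose host is empty or a wildcard such as 0.0.0.0, since such an entry names no address that a broker on another server could connect to. Note that the log in the question shows b1 trying to reach `ef447651b07a:9092`, which looks like a container hostname rather than an address reachable from the other server. The IP address used below is an assumed example standing in for `<s1_IP>`.

```python
def parse_listeners(value):
    """Split a Kafka listeners string into (name, host, port) tuples."""
    entries = []
    for item in value.split(","):
        name, _, address = item.strip().partition("://")
        host, _, port = address.rpartition(":")
        entries.append((name, host, int(port)))
    return entries


def check_listeners(value):
    """Return the listener names whose host is empty or a wildcard address."""
    wildcard_hosts = {"", "0.0.0.0"}
    return [name for name, host, _ in parse_listeners(value) if host in wildcard_hosts]


# Original setting from the question: no host, so the entry is flagged.
print(check_listeners("PLAINTEXT://:9092"))
# Final setting with an explicit address (10.0.0.11 is an assumed example IP).
print(check_listeners("PLAINTEXT://10.0.0.11:9092"))
```

An empty result means every listener names a concrete host; it does not prove that the host is actually reachable from the other broker.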

Related

VertX-Hazelcast on non orchestrated docker

I'm trying to figure out how to configure the VertX/Hazelcast cluster with multiple containers on two nodes:
   +-->Primary-Gateway (Node: 192.168.1.12, Docker-Network Primary)
   |     + Service A
   |     + Service B
   |     + Service C
---|
   |
   +-->Secondary-Gateway (Node: 192.168.1.13, Docker-Network Secondary)
         + Service A
         + Service B
         + Service C
The gateway receives the request and forwards it via ServiceDiscovery to each of the service containers.
The current configuration is as described in Hazelcast Non Orchestrated Docker, and it works:
each container exposes its Hazelcast port and has its public address set,
e.g.
hazelcast:
  network:
    public-address: 192.168.1.12:10001
The Hazelcast member list contains the two gateway public addresses and, additionally, the two Docker nodes:
hazelcast:
  network:
    join:
      multicast:
        enabled: false
      tcp-ip:
        enabled: true
        member-list:
          - 192.168.1.12:10001  # both gateways
          - 192.168.1.13:10001
          - 192.168.1.12        # Docker nodes, to avoid rejections from the services
          - 192.168.1.13
Is there a better way to configure the Hazelcast cluster, so that I
only have to configure the two gateway processes, and
avoid exposing each and every Hazelcast port of the service processes, since they could connect over the internal container network?
The idea is that I can start an additional (duplicate) container easily, without additional configuration. MiniCube may solve the issue, but it would be overkill for this.
What I have tried so far:
Multicast => Didn't work, since the containers weren't started with "--host".
Automatically adding the container network => The other node is rejected, since its network is unknown.
thx

HBase + TestContainers - Port Remapping

I am trying to use Test Containers to run an integration test against HBase launched in a Docker container. The problem I am running into may be a bit unique to how a client interacts with HBase.
When the HBase Master starts in the container, it stores its hostname:port in Zookeeper so that clients can find it. In this case, it stores "localhost:16000".
In my test case running outside the container, the client retrieves "localhost:16000" from Zookeeper and cannot connect. The connection fails because the port has been remapped by TestContainers to some other random port, other than 16000.
Any ideas how to overcome this?
(1) One idea is to find a way to tell the HBase Client to use the remapped port, ignoring the value it retrieved from Zookeeper, but I have yet to find a way to do this.
(2) If I could get the HBase Master to write the externally accessible host:port in Zookeeper that would also fix the problem. But I do not believe the container itself has any knowledge about how Test Containers is doing the port remapping.
(3) Perhaps there is a different solution that Test Containers provides for this sort of situation?

You can take a look at KafkaContainer's implementation, where we first start a Socat (fast TCP proxy) container to acquire a semi-random port and use it later to configure the target container.
The algorithm is:
(1) In doStart, first start Socat targeting the original container's network alias and port, e.g. 12345.
(2) Get the mapped port (it will be something like 32109 pointing to 12345).
(3) Make the original container (e.g. with environment variables) use the mapped port in addition to the original one or, if only one port can be configured, see CouchbaseContainer for the more advanced option.
(4) Return Socat's host and port to the client.

We built a new HBase image that is compatible with Testcontainers.
Use this image:
docker run --env HBASE_MASTER_PORT=16000 --env HBASE_REGION_PORT=16020 jcjabouille/hbase-standalone:2.4.9
Then create this container (in Scala here):
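import java.net.InetAddress
import java.time.Duration

import scala.collection.JavaConverters._

import com.github.dockerjava.api.command.CreateContainerCmd
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration
import org.testcontainers.containers.GenericContainer
import org.testcontainers.containers.wait.strategy.Wait
import org.testcontainers.utility.DockerImageName
// FreePortFinder must also be imported from whichever library provides it in your build.

private[test] class GenericHbase2Container
    extends GenericContainer[GenericHbase2Container](
      DockerImageName.parse("jcjabouille/hbase-standalone:2.4.9")
    ) {

  // Pick free host ports; HBase is told to listen on the same numbers inside the container.
  private val randomMasterPort: Int = FreePortFinder.findFreeLocalPort(18000)
  private val randomRegionPort: Int = FreePortFinder.findFreeLocalPort(20000)
  private val hostName: String = InetAddress.getLocalHost.getHostName

  val hbase2Configuration: Configuration = HBaseConfiguration.create

  addExposedPort(randomMasterPort)
  addExposedPort(randomRegionPort)
  addExposedPort(2181)

  // Reuse the host's hostname inside the container, presumably so that the
  // address HBase registers in ZooKeeper also resolves for the test client.
  withCreateContainerCmdModifier { cmd: CreateContainerCmd =>
    cmd.withHostName(hostName)
    ()
  }
  waitingFor(Wait.forLogMessage(".*0 row.*", 1))
  withStartupTimeout(Duration.ofMinutes(10))
  withEnv("HBASE_MASTER_PORT", randomMasterPort.toString)
  withEnv("HBASE_REGION_PORT", randomRegionPort.toString)
  // Fixed bindings (host port == container port) instead of random remapping.
  setPortBindings(Seq(s"$randomMasterPort:$randomMasterPort", s"$randomRegionPort:$randomRegionPort").asJava)

  override protected def doStart(): Unit = {
    super.doStart()
    hbase2Configuration.set("hbase.client.pause", "200")
    hbase2Configuration.set("hbase.client.retries.number", "10")
    hbase2Configuration.set("hbase.rpc.timeout", "3000")
    hbase2Configuration.set("hbase.client.operation.timeout", "3000")
    hbase2Configuration.set("hbase.client.scanner.timeout.period", "10000")
    hbase2Configuration.set("zookeeper.session.timeout", "10000")
    hbase2Configuration.set("hbase.zookeeper.quorum", "localhost")
    // ZooKeeper is the only port that is still remapped by Testcontainers.
    hbase2Configuration.set("hbase.zookeeper.property.clientPort", getMappedPort(2181).toString)
  }
}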
More details here: https://hub.docker.com/r/jcjabouille/hbase-standalone
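
The container above relies on using the same free port number on the host and inside the container, so the address HBase registers in ZooKeeper is also valid for the test client. What `FreePortFinder.findFreeLocalPort` does can be approximated as follows. This is a hedged Python sketch: `find_free_local_port` is a hypothetical stand-in written for illustration, not the library's implementation.

```python
import socket


def find_free_local_port(start, end=65535):
    """Return the first port in [start, end] that can be bound on localhost."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port is in use, try the next one
            return port
    raise RuntimeError("no free port found")


master_port = find_free_local_port(18000)
# Use the same number on both sides, as in setPortBindings above.
port_binding = f"{master_port}:{master_port}"
print(port_binding)
```

The probe socket is closed before the port is handed out, so another process could in principle grab the port before the container binds it. For a test setup that race is usually acceptable.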

Starting Redis cluster hangs when calling redis-trib

I have tried to set up a Redis cluster running in Docker, but it hangs when I try to join the nodes. My docker ps gives me this:
Notice the port mapping.
All containers have this basic redis.conf file
port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
cluster-announce-ip 127.0.0.1
cluster-announce-port [7001, 7002, 7003, 7004, 7005 or 7006]
cluster-announce-bus-port [7101, 7102, 7103, 7104, 7105 or 7106]
Where the only change is the cluster-announce-port and cluster-announce-bus-port for each docker container. I hope you get the point.
I try to join the nodes with ./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006
It discovers them perfectly and asks whether the config should be accepted:
But then redis-trib hangs indefinitely with "Waiting for the cluster to join". I can see through docker logs r_1 to r_6, that the epoch is getting set:
1:M 15 Jul 10:38:08.493 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
So redis-trib does call the different nodes.
I can't really find anything about the cluster-announce variables anywhere. Does anyone here know how to do this? I think my problem lies in this part.
The redis version I am using is 4.0.10.

OK, so I figured it out. I needed to:
set my cluster-announce-ip to the IP of the Ethernet adapter that was created when installing Docker (open up a terminal and run ipconfig)
update the redis-trib.rb call to reflect this IP
map the cluster bus port (16379) when the Docker container is created
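
The per-container settings from the question, combined with this fix, can be generated rather than edited by hand. This is a sketch under assumptions: `render_announce_conf` is a hypothetical helper, and 10.0.75.1 stands in for whatever address the Docker Ethernet adapter has on your machine.

```python
def render_announce_conf(announce_ip, node_index):
    """Render the cluster-announce lines for node 1..6 as used in the question."""
    client_port = 7000 + node_index  # 7001..7006 in the question
    bus_port = 7100 + node_index     # 7101..7106 in the question
    return "\n".join([
        f"cluster-announce-ip {announce_ip}",
        f"cluster-announce-port {client_port}",
        f"cluster-announce-bus-port {bus_port}",
    ])


# 10.0.75.1 is an assumed example address for the Docker adapter.
for index in range(1, 7):
    print(render_announce_conf("10.0.75.1", index))
    print()
```

Inside each container Redis still listens on 6379, and its cluster bus on 16379 (the data port plus 10000), so both ports have to be published. Presumably that means publishing 7001 to 6379 and 7101 to 16379 for the first node, and so on for the others.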

How to run a redis cluster on a docker cluster?

Context
I am trying to set up a Redis cluster that runs on top of a Docker cluster, to achieve maximum auto-healing.
More precisely, I have a Docker Compose file which defines a service that has 3 replicas. Each service replica runs a redis-server.
Then I have a program inside each replica that listens for changes on the Docker cluster and starts the Redis cluster when the conditions are met (all 3 redis-servers know each other).
Setting up the Redis cluster works as expected: the cluster is formed and all the redis-servers communicate well, but the communication between redis-servers stays inside the Docker cluster.
The Problem
When I try to communicate from outside the Docker cluster, I am able to talk to a redis-server thanks to the ingress mode. However, when I try to add data (e.g. set foo bar) and the client is redirected to another redis-server, the communication hangs and eventually times out.
Code
This is the docker-compose file.
version: "3.3"
services:
  redis-cluster:
    image: redis-srv-instance
    volumes:
      - /var/run/:/var/run
    deploy:
      mode: replicated
      #endpoint_mode: dnsrr
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    ports:
      - target: 6379
        published: 30000
        protocol: tcp
        mode: ingress
The sequence of commands that shows the problem:
Client
~ ./redis-cli -c -p 30000
127.0.0.1:30000>
Redis-server
OK
1506533095.032738 [0 10.255.0.2:59700] "COMMAND"
1506533098.335858 [0 10.255.0.2:59700] "info"
Client
127.0.0.1:30000> set ghb fki
OK
Redis-server
1506533566.481334 [0 10.255.0.2:59718] "COMMAND"
1506533571.315238 [0 10.255.0.2:59718] "set" "ghb" "fki"
Client
127.0.0.1:30000> set rte fgh
-> Redirected to slot [3830] located at 10.0.0.3:6379
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
(150.31s)
not connected>
Any ideas? I have also tried making my own proxy/load balancer, but it didn't work.
Thank you! Have a nice day.

For this use case, Sentinel might help. Redis on its own is not capable of high availability. Sentinel, on the other hand, is a distributed system which can do the following for you:
Route the ingress traffic to the current Redis master.
Elect a new Redis master should the current one fail.
While I have previously done research on this topic, I have not yet managed to pull together a working example.

redis-cli gets the Redis server's IP inside the ingress network and tries to access that remote Redis server by that IP directly. That is why redis-cli shows "Redirected to slot [3830] located at 10.0.0.3:6379". But this internal address, 10.0.0.3, is not accessible to redis-cli.
One solution is to run another proxy service attached to the same network as the Redis cluster. The application sends all requests to that proxy service, and the proxy service talks to the Redis cluster.
Alternatively, you could create 3 swarm services that use the bridge network and expose the Redis port on the node. Your internal program would need to change accordingly.
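
The redirect problem can be illustrated with a small sketch. Redis Cluster answers a request for a key owned by another node with a `MOVED <slot> <host>:<port>` error, and the client then connects to that address directly. A proxy, or a client with an address-remapping hook, would have to translate the internal address into one that is reachable from outside. The helper names, the mapping table and the external addresses below are all hypothetical.

```python
def parse_moved(error_message):
    """Parse a Redis Cluster 'MOVED <slot> <host>:<port>' error."""
    keyword, slot, address = error_message.split()
    if keyword != "MOVED":
        raise ValueError(f"not a MOVED redirect: {error_message!r}")
    host, _, port = address.rpartition(":")
    return int(slot), host, int(port)


def remap(host, port, address_map):
    """Translate an internal address to an external one, if a mapping exists."""
    return address_map.get((host, port), (host, port))


# Hypothetical mapping from overlay-network addresses to published endpoints.
ADDRESS_MAP = {
    ("10.0.0.3", 6379): ("192.0.2.10", 30001),
    ("10.0.0.4", 6379): ("192.0.2.10", 30002),
}

slot, host, port = parse_moved("MOVED 3830 10.0.0.3:6379")
print(slot, remap(host, port, ADDRESS_MAP))
```

A translation like this only helps if each Redis server is published on its own port. With a single ingress port (30000 in the question), the swarm load balancer chooses the replica, so the redirect target cannot be addressed individually.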

Containerized Kafka client errors when producing messages to the host Kafka server

There are a number of similar queries on Stack Overflow, but none quite matches the problem that I am seeing.
I have a ZooKeeper/Kafka setup on my server which works perfectly. One can produce
bin/kafka-console-producer.sh --broker-list 192.168.2.80:9092 --topic test
and consume
bin/kafka-console-consumer.sh --bootstrap-server 192.168.2.80:9092 --topic test --from-beginning
locally on the Linux Ubuntu 16.04 server.
From a Docker container - also running Ubuntu 16.04 - I want to produce and consume. The container's Kafka code was copied from that on the server.
Firstly I can create a new topic
bin/kafka-topics.sh --create --zookeeper 192.168.2.80:2181 --replication-factor 1 --partitions 1 --topic test2
from the container and then list it again
bin/kafka-topics.sh --list --zookeeper 192.168.2.80:2181
However when I try to produce new messages, using the above (kafka-console-producer.sh) command it fails with the following message:
[2017-06-05 13:59:05,317] ERROR Error when sending message to topic test2 with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for test2-0: 1526 ms has passed since batch creation plus linger time
immediately after entering the text of the message and pressing enter.
It may seem strange running a Docker container on the same host, but once this works I will move the container to a separate host for production.
My kafka server.properties file:
listeners=PLAINTEXT://0.0.0.0:9092
Kafka version:
2.12-0.10.2.1
Docker version:
Docker version 1.12.6, build 78d1802

The problem is (slightly simplified) caused by how Kafka's protocol works. Given a list of "bootstrap servers" (e.g. localhost:9092), a Kafka client will contact those bootstrap servers, but then use the hostnames of the actual Kafka brokers as returned by the bootstrap servers (the broker's advertised.listeners config, depending on your Kafka/Docker setup, might be set to e.g. kafka:9092). So here, the client would talk to localhost:9092 for bootstrapping (which will work), but then switch to kafka:9092 (which will not work, "thanks" to the networking setup).
Fortunately there is a way to configure Kafka + Docker in a way that "just works", and it doesn't require shenanigans such as fiddling with your host's /etc/hosts file. As part of this you need to set a few (new) Kafka settings though, which were added in Kafka's KIP-103: Separation of Internal and External traffic.
Here's a snippet for Docker Compose (docker-compose.yml) that demonstrates how to do this:
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:3.2.1
    hostname: zookeeper
    ports:
      - '32181:32181'
    environment:
      ZOOKEEPER_CLIENT_PORT: 32181
  kafka:
    image: confluentinc/cp-kafka:3.2.1
    hostname: kafka
    ports:
      - '9092:9092'
      - '29092:29092'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      # Following line is needed for Kafka versions 0.11+
      # in case you run less than 3 Kafka brokers in your
      # cluster because the broker config
      # `offsets.topic.replication.factor` (default: 3)
      # is now enforced upon topic creation
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
Here, the key settings are:
listener.security.protocol.map (which is being set via KAFKA_LISTENER_SECURITY_PROTOCOL_MAP)
inter.broker.listener.name
advertised.listeners
In the setup above, the containerized Kafka broker listens on localhost:9092 for access from your host machine (e.g. your Mac laptop) and on kafka:29092 for access from other containers.
A full end-to-end example is available at:
https://github.com/confluentinc/cp-docker-images/blob/v3.2.1/examples/kafka-streams-examples/docker-compose.yml (documentation at http://docs.confluent.io/3.2.1/cp-docker-images/docs/tutorials/kafka-streams-examples.html).
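
To make the two-listener setup concrete, the following sketch mimics, in simplified form, how a broker answers a metadata request: it returns the advertised address of the listener the client connected on. This illustrates the idea and is not Kafka's code; the function names are hypothetical.

```python
def parse_advertised_listeners(value):
    """Map listener name -> (host, port) from an advertised.listeners string."""
    listeners = {}
    for item in value.split(","):
        name, _, address = item.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners[name] = (host, int(port))
    return listeners


def advertised_address(listeners, listener_name):
    """Return the address a client is told to use after bootstrapping."""
    return listeners[listener_name]


ADVERTISED = parse_advertised_listeners(
    "PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092"
)

# A client inside the Docker network bootstraps via kafka:29092 (PLAINTEXT).
print(advertised_address(ADVERTISED, "PLAINTEXT"))
# A client on the host machine bootstraps via localhost:9092 (PLAINTEXT_HOST).
print(advertised_address(ADVERTISED, "PLAINTEXT_HOST"))
```

Each kind of client is therefore handed an address it can actually resolve, which is what a single advertised listener cannot provide for both at once.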

Your producer (in the container) can't resolve the host name of your Linux server, which is returned in response to the Kafka producer's initial metadata request to the bootstrap server. You can add it manually to the /etc/hosts file inside the container, or pass the "--add-host" parameter to the docker run command that launches the image running your producer.

Aha!
After further reading and the answers given above the solution came. As is often the case it is an easy one.
A simple edit of the kafka server.properties file:
advertised.listeners=PLAINTEXT://192.168.2.80:9092
Also note, the parameter 'listeners' is not set in this file.

Resources