Vert.x-Hazelcast on non-orchestrated Docker

I'm trying to figure out how to configure a Vert.x/Hazelcast cluster with multiple containers on two nodes:
+-> Primary-Gateway (Node: 192.168.1.12, Docker network "primary")
|     + Service A
|     + Service B
|     + Service C
|
+-> Secondary-Gateway (Node: 192.168.1.13, Docker network "secondary")
      + Service A
      + Service B
      + Service C
Each gateway receives requests and forwards them, via the ServiceDiscovery, to the service containers.
The current configuration is as described at Hazelcast Non-Orchestrated Docker and it works:
each container exposes its Hazelcast port and has its public address set, e.g.:
hazelcast:
  network:
    public-address: 192.168.1.12:10001
The Hazelcast member list contains the two gateway public addresses and, additionally, the two Docker nodes:
hazelcast:
  network:
    join:
      multicast:
        enabled: false
      tcp-ip:
        enabled: true
        member-list:
          - 192.168.1.12:10001   # <= both gateways
          - 192.168.1.13:10001
          - 192.168.1.12         # <= Docker nodes, to avoid rejects from the services
          - 192.168.1.13
Is there a better way to configure the Hazelcast cluster, so that I
- configure only the two gateway processes, and
- avoid exposing each and every Hazelcast port of the service processes, since they could connect over the internal container network?
The idea is that I can easily start an additional/duplicate container without extra configuration. Minikube might solve the issue, but it would be overkill here.
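For reference, a minimal sketch of the configuration this is aiming for: Hazelcast's TCP/IP join only needs a subset of the cluster as seed members, so every member (gateway and service alike) could list just the two gateways. Whether the services' internal container addresses are then accepted across nodes is exactly the open question:

hazelcast:
  network:
    join:
      multicast:
        enabled: false
      tcp-ip:
        enabled: true
        member-list:
          - 192.168.1.12:10001   # Primary gateway only
          - 192.168.1.13:10001   # Secondary gateway only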
What I have tried so far:
- Multicast: didn't work, since the containers are not started with host networking (`--net=host`).
- Adding the container network automatically: the other node is rejected, since its network is unknown.
Thanks!

Related

HBase + TestContainers - Port Remapping

I am trying to use Test Containers to run an integration test against HBase launched in a Docker container. The problem I am running into may be a bit unique to how a client interacts with HBase.
When the HBase Master starts in the container, it stores its hostname:port in Zookeeper so that clients can find it. In this case, it stores "localhost:16000".
In my test case, running outside the container, the client retrieves "localhost:16000" from Zookeeper and cannot connect. The connection fails because TestContainers has remapped port 16000 to some other random port.
Any ideas how to overcome this?
(1) One idea is to find a way to tell the HBase Client to use the remapped port, ignoring the value it retrieved from Zookeeper, but I have yet to find a way to do this.
(2) If I could get the HBase Master to write the externally accessible host:port in Zookeeper that would also fix the problem. But I do not believe the container itself has any knowledge about how Test Containers is doing the port remapping.
(3) Perhaps there is a different solution that Test Containers provides for this sort of situation?
You can take a look at KafkaContainer's implementation, where we start a Socat (a fast TCP proxy) container first to acquire a semi-random port and use it later to configure the target container.
The algorithm is (a code sketch follows the list):
- In doStart, first start Socat targeting the original container's network alias & port, e.g. 12345
- Get the mapped port (it will be something like 32109, pointing to 12345)
- Make the original container (e.g. with environment variables) use the mapped port in addition to the original one, or, if only one port can be configured, see CouchbaseContainer for the more advanced option
- Return Socat's host & port to the client
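A minimal sketch of that first step in Scala, assuming Testcontainers' SocatContainer; the network alias "hbase-master" and port 12345 are illustrative, not from the original answer:

import org.testcontainers.containers.{Network, SocatContainer}

val network = Network.newNetwork()

// Start the socat proxy on the shared network before the target container;
// it exposes port 12345 and forwards it to the alias "hbase-master".
val socat = new SocatContainer()
  .withNetwork(network)
  .withTarget(12345, "hbase-master")
socat.start()

// The host-visible coordinates to hand to the client (and, via environment
// variables, to the target container so it advertises a reachable address):
val proxyHost: String = socat.getHost
val proxyPort: Int = socat.getMappedPort(12345)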
We built a new image of HBase to be compatible with Testcontainers.
Use this image:
docker run --env HBASE_MASTER_PORT=16000 --env HBASE_REGION_PORT=16020 jcjabouille/hbase-standalone:2.4.9
Then create this container (in Scala here):
import java.net.InetAddress
import java.time.Duration

import scala.collection.JavaConverters._ // on Scala 2.13+: scala.jdk.CollectionConverters._

import com.github.dockerjava.api.command.CreateContainerCmd
import me.alexpanov.net.FreePortFinder // assumed: the "free-port-finder" helper library
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration
import org.testcontainers.containers.GenericContainer
import org.testcontainers.containers.wait.strategy.Wait
import org.testcontainers.utility.DockerImageName

private[test] class GenericHbase2Container
    extends GenericContainer[GenericHbase2Container](
      DockerImageName.parse("jcjabouille/hbase-standalone:2.4.9")
    ) {

  // Pick free host ports up front so they can be bound 1:1 (no remapping).
  private val randomMasterPort: Int = FreePortFinder.findFreeLocalPort(18000)
  private val randomRegionPort: Int = FreePortFinder.findFreeLocalPort(20000)
  private val hostName: String = InetAddress.getLocalHost.getHostName

  val hbase2Configuration: Configuration = HBaseConfiguration.create

  addExposedPort(randomMasterPort)
  addExposedPort(randomRegionPort)
  addExposedPort(2181)

  // Make the container report the host's hostname to ZooKeeper.
  withCreateContainerCmdModifier { cmd: CreateContainerCmd =>
    cmd.withHostName(hostName)
    ()
  }

  waitingFor(Wait.forLogMessage(".*0 row.*", 1))
  withStartupTimeout(Duration.ofMinutes(10))

  // Tell HBase inside the container to listen on the pre-picked ports.
  withEnv("HBASE_MASTER_PORT", randomMasterPort.toString)
  withEnv("HBASE_REGION_PORT", randomRegionPort.toString)

  // Bind the container ports to identical host ports, so the host:port that
  // the master stores in ZooKeeper is reachable from the test as-is.
  setPortBindings(Seq(s"$randomMasterPort:$randomMasterPort", s"$randomRegionPort:$randomRegionPort").asJava)

  override protected def doStart(): Unit = {
    super.doStart()
    hbase2Configuration.set("hbase.client.pause", "200")
    hbase2Configuration.set("hbase.client.retries.number", "10")
    hbase2Configuration.set("hbase.rpc.timeout", "3000")
    hbase2Configuration.set("hbase.client.operation.timeout", "3000")
    hbase2Configuration.set("hbase.client.scanner.timeout.period", "10000")
    hbase2Configuration.set("zookeeper.session.timeout", "10000")
    hbase2Configuration.set("hbase.zookeeper.quorum", "localhost")
    hbase2Configuration.set("hbase.zookeeper.property.clientPort", getMappedPort(2181).toString)
  }
}
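A hypothetical usage from a test might then look like this (this wiring is mine, not part of the original answer):

import org.apache.hadoop.hbase.client.ConnectionFactory

val hbase = new GenericHbase2Container
hbase.start()

// hbase2Configuration already points at the mapped ZooKeeper port.
val connection = ConnectionFactory.createConnection(hbase.hbase2Configuration)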
More details here: https://hub.docker.com/r/jcjabouille/hbase-standalone

Docker Swarm: re-apply placement preferences of a service after a node comes back alive

Docker applies constraints strictly, while placement preferences are not strictly enforced.
Here is the strategy I want to apply for my service:
- 2 replicas
- when possible, only one instance per node (spread across nodes)
Here is an extract of my docker-compose file:
deploy:
  placement:
    constraints:
      - node.role == worker
    preferences:
      - spread: node.id
  replicas: 2
Now a simple scenario:
- 2 worker nodes running
- I deploy the service: each node has 1 instance
- 1 node goes offline: the remaining node has 2 instances: OK
- the node comes back online: one node now has 2 instances while the other has none
Is it possible to tell Docker to re-apply placement preferences automatically?
I faced this issue a long time back, and it seems it is still not fixed, or a fix is deemed unnecessary since it would interfere with pre-existing logic in some way.
Open case - https://github.com/moby/moby/issues/24103
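Until that issue is resolved, the usual manual workaround is to force Swarm to reschedule the service once the node is back, which re-evaluates the placement preferences (there is no automatic equivalent):

docker service update --force <service-name>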

How to run a redis cluster on a docker cluster?

Context
I am trying to set up a redis cluster so that it runs on top of a docker cluster, to achieve maximum auto-healing.
More precisely, I have a docker compose file which defines a service that has 3 replicas. Each service replica has a redis-server running on it.
Then I have a program inside each replica that listens to changes on the docker cluster and starts the redis cluster when the conditions are met (all 3 redis-servers know each other).
Setting up the redis cluster works as expected: the cluster is formed and all the redis-servers communicate well, but that communication stays inside the docker cluster.
The Problem
When I try to communicate from outside the docker cluster, I am able to talk to a redis-server because of the ingress mode; however, when I try to add data (e.g. set foo bar) and the client is redirected to another redis-server, the communication hangs and eventually times out.
Code
This is the docker-compose file.
version: "3.3"
services:
  redis-cluster:
    image: redis-srv-instance
    volumes:
      - /var/run/:/var/run
    deploy:
      mode: replicated
      #endpoint_mode: dnsrr
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    ports:
      - target: 6379
        published: 30000
        protocol: tcp
        mode: ingress
The sequence of commands that shows the problem:
Client
~ ./redis-cli -c -p 30000
127.0.0.1:30000>
Redis-server
OK
1506533095.032738 [0 10.255.0.2:59700] "COMMAND"
1506533098.335858 [0 10.255.0.2:59700] "info"
Client
127.0.0.1:30000> set ghb fki
OK
Redis-server
1506533566.481334 [0 10.255.0.2:59718] "COMMAND"
1506533571.315238 [0 10.255.0.2:59718] "set" "ghb" "fki"
Client
127.0.0.1:30000> set rte fgh
-> Redirected to slot [3830] located at 10.0.0.3:6379
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
(150.31s)
not connected>
Any ideas? I have also tried building my own proxy/load balancer, but that didn't work.
Thank you! Have a nice day.
For this use case, Sentinel might help. Redis on its own is not capable of high availability. Sentinel, on the other hand, is a distributed system which can do the following for you:
- Route the ingress traffic to the current Redis master.
- Elect a new Redis master should the current one fail.
While I have previously done research on this topic, I have not yet managed to put together a working example.
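For orientation, the core of a minimal sentinel.conf would look like the following (the master name and address are illustrative):

# Monitor a master called "mymaster"; 2 sentinels must agree it is down
sentinel monitor mymaster 10.0.0.2 6379 2
# Consider the master down after 5 s without a valid reply
sentinel down-after-milliseconds mymaster 5000
# Abort a failover attempt that takes longer than 60 s
sentinel failover-timeout mymaster 60000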
redis-cli gets the redis server IP inside the ingress network and tries to access that remote redis server by its IP directly. That is why redis-cli shows Redirected to slot [3830] located at 10.0.0.3:6379. But this internal 10.0.0.3 is not reachable by redis-cli.
One solution is to run another proxy service which attaches to the same network as the redis cluster. The application sends all requests to that proxy service, and the proxy service talks with the redis cluster.
Or you could create 3 swarm services that use the bridge network and expose the redis port on the node; a sketch follows. Your internal program needs to change accordingly.
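A sketch of one of those three per-node services, assuming Redis 4.0+ (which added the cluster-announce options precisely for NAT/Docker setups); the service name, node hostname, announce IP and published port are illustrative:

version: "3.3"
services:
  redis-1:
    image: redis-srv-instance
    command: >
      redis-server
      --cluster-enabled yes
      --cluster-announce-ip 192.168.1.12
      --cluster-announce-port 30001
    deploy:
      placement:
        constraints:
          - node.hostname == node1
    ports:
      - target: 6379
        published: 30001
        protocol: tcp
        mode: host     # host-mode publishing: the announced address is really reachable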

Docker, Kafka - replication doesn't work between remote brokers

I have Docker images of Kafka brokers and ZooKeeper; call them z1, b1, and b2 for now.
They are deployed on two physical servers, s1 and s2, like so:
- s1 contains z1 and b1
- s2 contains b2
In their respective docker-compose.yml files, ZooKeeper has its ports set as follows:
- 2181:2181
- 2888:2888
- 3888:3888
and the brokers as follows:
- 9092:9092
A topic with --replication-factor 2 and --partitions 4 can be created.
No data is pushed to the topic at any time, but the following problem still occurs.
If kafka-topics --describe --topic <name_of_topic> --zookeeper <zookeeperIP:port> is run shortly after topic creation, everything is in sync and looks good. On a second run (after a short delay), b1 removes b2's partition replicas from its in-sync set, but b2 doesn't remove b1's partition replicas from its own.
In b1's server.log, many of these exceptions show up:
WARN [ReplicaFetcherThread-0-1], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest#42746de3 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to ef447651b07a:9092 (id: 1 rack: null) failed
at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:83)
at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:93)
at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:248)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Swapping leadership between brokers b1 and b2 works as they are shut down and started again, but then only the broker that came online last is in full control of the topic: it is the leader for all partitions and the only in-sync replica, even if the other broker comes back online.
I tried cleaning all data and resetting both brokers and ZooKeeper, but the problem persists.
Why aren't the partitions properly replicated?
It looks like the brokers b1 and b2 can't talk to each other, which indicates a Docker-related networking issue (and such Docker networking issues are quite common in general).
You'd need to share more information for further help, e.g. the contents of the docker-compose.yml file(s) as well as the Dockerfile you use to create your images. I also wonder why you have created different images for the two brokers; typically you only need a single Kafka broker image and then simply launch multiple containers (one per desired broker) off of that image.
I figured it out. There was a problem with the network, as Michael G. Noll said.
First, I don't map ports manually anymore and use the host network instead. It's easier to manage.
Second, b1 and b2 had their listeners set like so:
listeners=PLAINTEXT://:9092
Neither had an IP specified, so 0.0.0.0 was used by default and there was a collision: both listened there and pushed the same connection information to ZooKeeper.
Final configuration:
b1 and b2 docker-compose.yml use host network:
network_mode: "host"
b1 server.properties - listeners:
listeners=PLAINTEXT://<s1_IP>:9092
b2 server.properties - listeners:
listeners=PLAINTEXT://<s2_IP>:9092
Everything works fine now, replication is working, even on broker restarts.
Data can be produced and consumed correctly.
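As an aside: if host networking had not been an option, the standard Kafka alternative is to keep listeners bound to a wildcard address and advertise the externally reachable address separately via advertised.listeners (sketch for b1; placeholder IP as above):

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<s1_IP>:9092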

How to create --link for 2 containers to link to each other?

If I have 2 containers, "app_server" and "varnish_server", how can I create --link so that app_server will have a record in the "hosts" file that links to the varnish server, and the varnish_server will have a record in the "hosts" file that links to the app_server?
This is currently not supported by Docker directly. You need a third party that the two containers announce their existence to and can ask for the other:
    [service discovery/name service]
          ^               ^
          |               |
          v               v
  [app_server] <===> [varnish_server]
You start the service discovery container first, and link the app_server and varnish_server to that.
Example using etcd on linuxfiddle: http://linuxfiddle.net/f/e124aeeb-2c39-472d-932e-971f092bb6db
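As an aside for newer Docker versions: containers attached to the same user-defined network can resolve each other by container name, which achieves the mutual linking without --link or a discovery service (network and image names are illustrative):

docker network create appnet
docker run -d --name app_server --network appnet my-app-image
docker run -d --name varnish_server --network appnet my-varnish-image
# "varnish_server" now resolves inside app_server, and vice versa.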
