Spark app socket communication between containers on a Docker Spark cluster - docker

So I have a Spark cluster running in Docker using Docker Compose; I'm using the docker-spark images.
Then I added 2 more containers: one acting as a server (plain Python) and one as a client (Spark streaming app). They both run on the same network.
For the server (plain Python) I have something like:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # plain TCP socket
s.bind(('', 9009))
s.listen(1)
print("Waiting for TCP connection...")
conn, addr = s.accept()
while True:
    # Do and send stuff over conn
    pass
And for my client (Spark app) I have something like:
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

conf = SparkConf()
conf.setAppName("MyApp")
sc = SparkContext(conf=conf)
sc.setLogLevel("ERROR")
ssc = StreamingContext(sc, 2)
ssc.checkpoint("my_checkpoint")
# read data from port 9009; socketTextStream's first argument is the hostname
dataStream = ssc.socketTextStream(PORT, 9009)
# What's PORT's value?
So what is PORT's value? Is it the IP address shown by docker inspect for the container?

Okay, so I found that I can use the IP of the container, as long as all my containers are on the same network.
So I check the IP by running
docker inspect <container_id>
and use that IP as the host for my socket.
Edit:
I know it's kind of late, but I just found out that I can actually use the container's name as the host, as long as the containers are on the same network.
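A quick way to see this in action from inside the client container (purely illustrative; "server-container" is a placeholder for the other container's actual name): Docker's embedded DNS on a user-defined network resolves container names to their current IPs.
import socket

# Resolve the other container by name via Docker's embedded DNS.
# "server-container" is a placeholder for the real container name.
print(socket.gethostbyname("server-container"))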
More edit:
I made changes in docker-compose like:
container-1:
  image: image-1
  container_name: container-1
  networks:
    - network-1
container-2:
  image: image-2
  container_name: container-2
  ports:
    - "8000:8000"
  networks:
    - network-1
and then in my script (container 2):
conf = SparkConf()
conf.setAppName("MyApp")
sc = SparkContext(conf=conf)
sc.setLogLevel("ERROR")
ssc = StreamingContext(sc, 2)
ssc.checkpoint("my_checkpoint")
# read data from port 9009
dataStream = ssc.socketTextStream("container-1", 9009)  # put the container's name here
I also EXPOSE the socket port in the Dockerfile; I don't know whether that has any effect or not.
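For completeness, one detail worth knowing on the server side: socketTextStream treats the incoming stream as UTF-8 text split on newlines, so the plain-Python server has to accept the connection and terminate every record with \n. A minimal sketch (my own illustration, not the original server code):
import socket
import time

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("", 9009))           # listen on all interfaces so other containers can reach it
s.listen(1)
print("Waiting for TCP connection...")
conn, addr = s.accept()      # the Spark receiver in the client container connects here
while True:
    conn.send("some event\n".encode("utf-8"))   # one record per line
    time.sleep(1)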

Related

How to establish zmq connections between docker containers using docker compose option network_mode: host?

I want to establish a connection using the draft server/client pattern of ZeroMQ across different docker containers. For that, I have a zmq.server that is running in docker_container_server. In addition, I have a zmq.client which runs in docker_container_client. Both containers are based on different docker images. Despite using the network_mode: host option with docker compose, this does not work. If I do the same within one container, it just works fine.
The docker-compose files are the following.
docker-compose-client.yaml:
services:
  client:
    privileged: true
    image: client-image
    container_name: client
    command: tail -F anything
    network_mode: host
    environment:
      - "ROS_MASTER_URI=http://127.0.0.1:11311"
      - "ROS_HOSTNAME=127.0.0.1"
    build:
      context: ./client
      network: host
      dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
docker-compose-server.yaml:
services:
  server:
    image: server-image
    privileged: true
    container_name: server
    network_mode: host
    command: tail -F anything
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    build:
      context: .
      network: host
      dockerfile: Dockerfile.server
The server application uses pyzmq as follows:
import zmq

class ServerExample():
    def __init__(self):
        self.stop = False
        self.ctx = zmq.Context.instance()
        self.url = 'tcp://127.0.0.1:5555'
        self.server = self.ctx.socket(zmq.SERVER)
        self.server.bind(self.url)
        try:
            while not self.stop:
                msg = self.server.recv(copy=False)  # gets stuck on this line
                print(msg.routing_id)
        except KeyboardInterrupt:
            self.stop = True

if __name__ == '__main__':
    server = ServerExample()
The client uses cppzmq as follows: (I also tried a pyzmq-client. This also works within a container but not across different containers.)
#include <zmq.hpp>

class ClientExample
{
private:
    zmq::context_t ctx;
    zmq::socket_t socket_;

public:
    ClientExample()
    {
        socket_ = zmq::socket_t(ctx, zmq::socket_type::client);  // draft socket type
        socket_.connect("tcp://127.0.0.1:5555");
        const char some_data[] = "some data";
        socket_.send(zmq::buffer(some_data)); // not received by server
    }
};
The containers are started by docker compose up server and docker compose up client.
When I do everything in one container and look at netstat -t | grep 5555, I get the same result in the container and on the host (in this case, there are two clients and one server):
tcp 0 2856 localhost:56674 localhost:5555 ESTABLISHED
tcp 0 2856 localhost:5555 localhost:47666 ESTABLISHED
tcp 0 0 localhost:5555 localhost:56674 ESTABLISHED
That is why I think that the docker compose option network_mode: host is properly working in my case.
When I start docker_container_server and docker_container_client, the execution gets stuck in the line where the server should receive a message. netstat -t | grep 5555 outputs nothing, both on the host and in the containers.
The versions I use:
Ubuntu 20.04
Docker Compose version v2.12.2
Docker Engine Version 20.10.21
libzmq v4.3.4
cppzmq v4.9.0
pyzmq v25.0.0b1
I searched a lot on the internet and read the documentation of Docker, Docker Compose, ZeroMQ, etc. I came to the conclusion that it should work with the network_mode: host option, and I think I confirmed that network_mode: host works as it should. The code itself should not contain any mistakes, as it works fine within one container. But I couldn't figure out why the communication doesn't work across containers.
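For reference, a pyzmq counterpart of the cppzmq client above might look roughly like the sketch below. This is my own illustration rather than the asker's code, and it assumes libzmq and pyzmq builds with the draft API enabled, since zmq.CLIENT (like zmq.SERVER) is a draft socket type:
import zmq

ctx = zmq.Context.instance()
client = ctx.socket(zmq.CLIENT)          # draft CLIENT socket type
client.connect("tcp://127.0.0.1:5555")
client.send(b"hello from the client")    # arrives at the SERVER socket as a frame with a routing_id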

Request between Docker containers failing dial tcp 172.18.0.6:3050: connect: connection refused

I am struggling with requests between containers in Go.
The issue is that the rest of my containers can send requests to the node container and get a response, but when I send a request from my Go application to node I get the refusal error "dial tcp 172.18.0.6:3050: connect: connection refused".
So my whole Docker setup is:
version: "3.3"
services:
  ##########################
  ### SETUP SERVER CONTAINER
  ##########################
  node:
    # Tell docker what file to build the server from
    image: myUserName/mernjs:node-dev
    build:
      context: ./nodeMyApp
      dockerfile: Dockerfile.dev
    # The ports to expose
    expose:
      - 3050
    # Port mapping
    ports:
      - 3050:3050
    # Volumes to mount
    volumes:
      - ./nodeMyApp/src:/app/server/src
    # Run command
    # Nodemon for hot reloading (-L flag required for polling in Docker)
    command: nodemon -L src/app.js
    # Connect to other containers
    links:
      - mongo
    # Restart action
    restart: always
  react:
    ports:
      - 8000:8000
    build:
      context: ../reactMyApp
      dockerfile: Dockerfile.dev
    volumes:
      - ../reactMyApp:/usr/src/app
      - /usr/src/app/node_modules
      - /usr/src/app/.next
    restart: always
    environment:
      - NODE_ENV=development
  golang:
    build:
      context: ../goMyApp
    environment:
      - MONGO_URI=mongodb://mongo:27017
    # Volumes to mount
    volumes:
      - ../goMyApp:/app/server
    links:
      - mongo
      - node
    restart: always
So my React app can send a request to "http://node:3050/api/greeting/name" and gets a response, even though the React app is not linked to the node app. But when the Go app sends a request to the node container, it gets a connection-refused message: GetJson err: Get "http://node:3050/api/greeting/name": dial tcp 172.18.0.6:3050: connect: connection refused
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "time"
)

// Shared HTTP client with a timeout.
var myClient = &http.Client{Timeout: 10 * time.Second}

func GetJson(url string, target interface{}) error {
    r, err := myClient.Get(url)
    if err != nil {
        fmt.Println("GetJson err: ", err)
        return err
    }
    defer r.Body.Close()
    return json.NewDecoder(r.Body).Decode(target)
}

type ResultsDetails struct {
    Greeting string `bson:"greatingMessage" json:"greatingMessage"`
    Message  string `bson:"message" json:"message"`
}

func GetGreetingDetails(name string) ResultsDetails {
    var resp ResultsDetails
    GetJson("http://node:3050/api/greeting/"+name, &resp)
    return resp
}
So how do I fix the Go request to the other Docker container, when Docker doesn't seem to resolve the host by the name of my container 'node'?
Update:
One thing I left out about the Go app's port: it doesn't run on any port, since it is an application that checks on database records. It has no API, therefore it is not listening on any port.
Could that be the reason why my Go application cannot communicate with the other containers?
I also have another Go application which is an API application; it runs on port 5000 and communicates with my node application just fine.
Network info:
I checked whether node and golang share the same network, and the answer is yes: all containers share the same network.
(Unrelated to my issue) To anyone who has a "dial tcp connection refused" issue, I suggest going through this guide: https://maximorlov.com/4-reasons-why-your-docker-containers-cant-talk-to-each-other/. Really helpful. If the guide doesn't help, read below; maybe you are trying to call the container's API right after the containers were built :D
For those who were interested in what was wrong:
Technically, the reason I was getting this error is that the request I was trying to run happened right when all the containers had just been built.
I believe there is some delay in the network after containers are built. That's why the host was throwing "dial tcp 172.18.0.6:3050: connect: connection refused". I ran the same test from other containers that could send requests to the node container, and they all failed right after the build. But when re-requesting after a few seconds, everything worked out.
Sorry to bother you guys. I really spent 3 days on this issue, and I was looking in completely the wrong direction. Never thought the issue was that silly :D
Thanks for your time.
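The "wait a few seconds and retry" idea described above can also be made explicit in code. A rough sketch in Python (purely illustrative; the host and port are placeholders for whatever service the caller depends on):
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    """Retry a TCP connect until the service accepts connections or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)   # not listening yet; back off and retry
    return False

if wait_for_port("node", 3050):
    print("node is accepting connections; safe to start sending requests")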
I've met the same error in my Harbor registry service.
I used docker exec -it to get into the container and check whether the service was available, and finally found that http_proxy had been set.
Remove the http_proxy settings for the Docker service, and it works like a charm.
Failed on load rest config err:Get "http://core:8080/api/internal/configurations": dial tcp 172.22.0.8:8080: connect: connection refused
$ docker exec -it harbor-jobservice /bin/bash
$ echo $http_proxy $https_proxy
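As a side note on why a stale proxy variable causes exactly this symptom: most HTTP clients (Go's net/http via ProxyFromEnvironment, Python's requests, curl) honor http_proxy/https_proxy, so container-to-container calls get silently routed through an unreachable proxy. A quick in-container check, sketched in Python for illustration:
import os

# Print the proxy-related environment variables an HTTP client would pick up.
for var in ("http_proxy", "https_proxy", "no_proxy", "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
    print(var, "=", os.environ.get(var))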

HBase + TestContainers - Port Remapping

I am trying to use Test Containers to run an integration test against HBase launched in a Docker container. The problem I am running into may be a bit unique to how a client interacts with HBase.
When the HBase Master starts in the container, it stores its hostname:port in Zookeeper so that clients can find it. In this case, it stores "localhost:16000".
In my test case running outside the container, the client retrieves "localhost:16000" from Zookeeper and cannot connect. The connection fails because the port has been remapped by TestContainers to some other random port, other than 16000.
Any ideas how to overcome this?
(1) One idea is to find a way to tell the HBase Client to use the remapped port, ignoring the value it retrieved from Zookeeper, but I have yet to find a way to do this.
(2) If I could get the HBase Master to write the externally accessible host:port in Zookeeper that would also fix the problem. But I do not believe the container itself has any knowledge about how Test Containers is doing the port remapping.
(3) Perhaps there is a different solution that Test Containers provides for this sort of situation?
You can take a look at KafkaContainer's implementation, where we start a Socat (fast TCP proxy) container first to acquire a semi-random port and use it later to configure the target container.
The algorithm is:
In doStart, first start Socat targeting the original container's network alias & port, e.g. 12345
Get the mapped port (it will be something like 32109, pointing to 12345)
Make the original container (e.g. with environment variables) use the mapped port in addition to the original one, or, if only one port can be configured, see CouchbaseContainer for the more advanced option
Return Socat's host & port to the client
We built a new image of HBase to be compatible with Testcontainers.
Use this image:
docker run --env HBASE_MASTER_PORT=16000 --env HBASE_REGION_PORT=16020 jcjabouille/hbase-standalone:2.4.9
Then create this container (in Scala here):
private[test] class GenericHbase2Container
    extends GenericContainer[GenericHbase2Container](
      DockerImageName.parse("jcjabouille/hbase-standalone:2.4.9")
    ) {

  private val randomMasterPort: Int = FreePortFinder.findFreeLocalPort(18000)
  private val randomRegionPort: Int = FreePortFinder.findFreeLocalPort(20000)
  private val hostName: String = InetAddress.getLocalHost.getHostName

  val hbase2Configuration: Configuration = HBaseConfiguration.create

  addExposedPort(randomMasterPort)
  addExposedPort(randomRegionPort)
  addExposedPort(2181)

  withCreateContainerCmdModifier { cmd: CreateContainerCmd =>
    cmd.withHostName(hostName)
    ()
  }

  waitingFor(Wait.forLogMessage(".*0 row.*", 1))
  withStartupTimeout(Duration.ofMinutes(10))

  withEnv("HBASE_MASTER_PORT", randomMasterPort.toString)
  withEnv("HBASE_REGION_PORT", randomRegionPort.toString)

  setPortBindings(Seq(s"$randomMasterPort:$randomMasterPort", s"$randomRegionPort:$randomRegionPort").asJava)

  override protected def doStart(): Unit = {
    super.doStart()
    hbase2Configuration.set("hbase.client.pause", "200")
    hbase2Configuration.set("hbase.client.retries.number", "10")
    hbase2Configuration.set("hbase.rpc.timeout", "3000")
    hbase2Configuration.set("hbase.client.operation.timeout", "3000")
    hbase2Configuration.set("hbase.client.scanner.timeout.period", "10000")
    hbase2Configuration.set("zookeeper.session.timeout", "10000")
    hbase2Configuration.set("hbase.zookeeper.quorum", "localhost")
    hbase2Configuration.set("hbase.zookeeper.property.clientPort", getMappedPort(2181).toString)
  }
}
More details here: https://hub.docker.com/r/jcjabouille/hbase-standalone

How to run a redis cluster on a docker cluster?

Context
I am trying to set up a Redis cluster so that it runs on top of a Docker cluster, to achieve maximum auto-healing.
More precisely, I have a docker compose file which defines a service that has 3 replicas. Each service replica has a redis-server running on it.
Then I have a program inside each replica that listens to changes on the Docker cluster and that starts the Redis cluster when the conditions are met (all 3 redis-servers know each other).
Setting up the Redis cluster works as expected: the cluster is formed and all the redis-servers communicate well, but that communication stays inside the Docker cluster.
The Problem
When I try to communicate from outside the Docker cluster, because of the ingress mode I am able to talk to a redis-server; however, when I try to add data (e.g. set foo bar) and the client is redirected to another redis-server, the communication hangs and eventually times out.
Code
This is the docker-compose file.
version: "3.3"
services:
  redis-cluster:
    image: redis-srv-instance
    volumes:
      - /var/run/:/var/run
    deploy:
      mode: replicated
      #endpoint_mode: dnsrr
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    ports:
      - target: 6379
        published: 30000
        protocol: tcp
        mode: ingress
The sequence of commands that shows the problem:
Client
~ ./redis-cli -c -p 30000
127.0.0.1:30000>
Redis-server
OK
1506533095.032738 [0 10.255.0.2:59700] "COMMAND"
1506533098.335858 [0 10.255.0.2:59700] "info"
Client
127.0.0.1:30000> set ghb fki
OK
Redis-server
1506533566.481334 [0 10.255.0.2:59718] "COMMAND"
1506533571.315238 [0 10.255.0.2:59718] "set" "ghb" "fki"
Client
127.0.0.1:30000> set rte fgh
-> Redirected to slot [3830] located at 10.0.0.3:6379
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
Could not connect to Redis at 10.0.0.3:6379: Operation timed out
(150.31s)
not connected>
Any ideas? I have also tried making my own proxy/load balancer, but it didn't work.
Thank you! Have a nice day.
For this use case, Sentinel might help. Redis on its own is not capable of high availability. Sentinel, on the other hand, is a distributed system which can do the following for you:
Route the ingress traffic to the current Redis master.
Elect a new Redis master should the current one fail.
While I have previously done research on this topic, I have not yet managed to pull together a working example.
redis-cli gets the Redis server IP inside the ingress network and tries to access that remote Redis server by its IP directly. That is why redis-cli shows Redirected to slot [3830] located at 10.0.0.3:6379. But this internal 10.0.0.3 is not accessible to redis-cli.
One solution is to run another proxy service which attaches to the same network as the Redis cluster. The application sends all requests to that proxy service, and the proxy service talks to the Redis cluster.
Or you could create 3 swarm services that use the bridge network and expose the Redis port on the node. Your internal program would need to change accordingly.

Containerized Kafka client errors when producing messages to the host Kafka server

There are a number of similar types of queries on stackoverflow, but none quite match the problem that I am seeing.
I have a zookeeper/kafka setup on my server which works perfectly. One can produce
bin/kafka-console-producer.sh --broker-list 192.168.2.80:9092 --topic test
and consume
bin/kafka-console-consumer.sh --bootstrap-server 192.168.2.80:9092 --topic test --from-beginning
locally on the Linux Ubuntu 16.04 server.
From a Docker container - also running Ubuntu 16.04 - I want to produce and consume. The container's Kafka code was copied from that on the server.
Firstly I can create a new topic
bin/kafka-topics.sh --create --zookeeper 192.168.2.80:2181 --replication-factor 1 --partitions 1 --topic test2
from the container and then list it again
bin/kafka-topics.sh --list --zookeeper 192.168.2.80:2181
However, when I try to produce new messages using the above kafka-console-producer.sh command, it fails with the following message:
[2017-06-05 13:59:05,317] ERROR Error when sending message to topic test2 with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for test2-0: 1526 ms has passed since batch creation plus linger time
immediately after entering the text of the message and pressing enter.
It may seem strange running a Docker container on the same host, but once this works I will move the container to a separate host for production.
My kafka server.properties file:
listeners=PLAINTEXT://0.0.0.0:9092
Kafka version:
2.12-0.10.2.1
Docker version:
Docker version 1.12.6, build 78d1802
The problem is (slightly simplified) caused by how Kafka's protocol works. Given a list of "bootstrap servers" (e.g. localhost:9092), a Kafka client will contact those bootstrap servers, but then use the hostnames of the actual Kafka brokers as returned by the bootstrap servers (the broker's advertised.listeners config, depending on your Kafka/Docker setup, might be set to e.g. kafka:9092). So here, the client would talk to localhost:9092 for bootstrapping (which will work), but then switch to kafka:9092 (which will not work, "thanks" to the networking setup).
Fortunately there is a way to configure Kafka + Docker in a way that "just works", and it doesn't require shenanigans such as fiddling with your host's /etc/hosts file and such. As part of this you need to set a few (new) Kafka settings though, which were added in kafka's KIP-103: Separation of Internal and External traffic.
Here's a snippet for Docker Compose (docker-compose.yml) that demonstrates how to do this:
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:3.2.1
    hostname: zookeeper
    ports:
      - '32181:32181'
    environment:
      ZOOKEEPER_CLIENT_PORT: 32181
  kafka:
    image: confluentinc/cp-kafka:3.2.1
    hostname: kafka
    ports:
      - '9092:9092'
      - '29092:29092'
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      # Following line is needed for Kafka versions 0.11+
      # in case you run less than 3 Kafka brokers in your
      # cluster because the broker config
      # `offsets.topic.replication.factor` (default: 3)
      # is now enforced upon topic creation
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
Here, the key settings are:
listener.security.protocol.map (which is being set via KAFKA_LISTENER_SECURITY_PROTOCOL_MAP)
inter.broker.listener.name
advertised.listeners
In the setup above, the containerized Kafka broker listens on localhost:9092 for access from your host machine (e.g. your Mac laptop) and on kafka:29092 for access from other containers.
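To make the distinction concrete, a client on the host bootstraps against the PLAINTEXT_HOST listener, while a client in another container on the same Docker network bootstraps against the internal listener. A quick sketch using the kafka-python package (my choice purely for illustration; any client library behaves the same way):
from kafka import KafkaProducer

# From the host machine: use the listener advertised as localhost:9092.
producer = KafkaProducer(bootstrap_servers="localhost:9092")

# From another container on the same Docker network you would instead use:
# producer = KafkaProducer(bootstrap_servers="kafka:29092")

producer.send("test", b"hello")
producer.flush()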
A full end-to-end example is available at:
https://github.com/confluentinc/cp-docker-images/blob/v3.2.1/examples/kafka-streams-examples/docker-compose.yml (documentation at http://docs.confluent.io/3.2.1/cp-docker-images/docs/tutorials/kafka-streams-examples.html).
Your producer (in the container) can't resolve the host name of your Linux server, which is returned in the Kafka producer's initial metadata request to the bootstrap server. You can add it manually to the /etc/hosts file inside the container, or add the "--add-host" parameter to the docker run command that launches the image running your producer.
Aha!
After further reading and the answers given above, the solution came. As is often the case, it is an easy one.
A simple edit of the kafka server.properties file:
advertised.listeners=PLAINTEXT://192.168.2.80:9092
Also note, the parameter 'listeners' is not set in this file.
