Choosing range of ports in spark - docker

From the Spark documentation I know that the ports executors (i.e. workers, since by default there is just one executor per worker) use to connect to the master are chosen randomly. How can I set their range so that I can publish those ports in Docker? Also, if a worker establishes a connection with another container (one that is not part of the distributed system), do I need to publish the port on which the worker receives the data returned by that container (e.g. in response to an HTTPS request)?
Just to note, I do not use docker-compose.yml because I do not need the containers to be set up as services, and I want to add or remove containers as the number of customers increases or decreases.

You should use the same Docker network for all containers that communicate with each other. Containers on that network can reach one another by container name (on all ports), just as if they were different hosts on a network.
Create a network (needed only once):
docker network create <network_name>
When you launch a container, use --network to connect it to the network:
docker run --network=<network_name> --name <container_name> <image>
You can also connect existing containers to a network:
docker network connect <network_name> <container_name>
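As a rough sketch of the steps above, the commands can be wrapped in two small shell helpers. The function names and the spark-net, spark-master and spark-worker-1 values in the usage comment are made up for illustration; only the docker network and docker run calls come from the answer.

```shell
# Hypothetical helper: create the network only if it does not exist yet.
ensure_network() {
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}

# Hypothetical helper: start a detached container attached to that network.
# Usage: start_on_network <network_name> <container_name> <image>
start_on_network() {
  ensure_network "$1" && docker run -d --network="$1" --name "$2" "$3"
}

# Example (placeholder names, assumed for illustration):
#   start_on_network spark-net spark-master <image>
#   start_on_network spark-net spark-worker-1 <image>
```

Because both containers end up on the same user-defined network, the worker could reach the master by its container name without any published ports.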
Reference:
https://docs.docker.com/engine/reference/commandline/network_create/
https://docs.docker.com/engine/reference/run/

Related

How do I combine docker-compose scaling and load balanced port exposure?

If you tell docker-compose to scale a service, and do NOT expose its ports,
docker-compose scale dataservice=2
There will be two IPs in the network that the DNS name dataservice resolves to, so services that reach it by hostname will be load balanced.
I would like to do the same for the edge proxy. The point would be that
docker-compose scale edgeproxy=2
would cause edgeproxy to resolve to one of 2 possible IP addresses.
But the semantics of exposing ports are wrong for this. If I expose:
8443:8443
then it will try to bind each edgeproxy instance to host port 8443. What I want is more like:
0.0.0.0:8443:edgeproxy:8443
where, when you come into the Docker network via host port 8443, it randomly selects an edgeproxy:8443 IP to hand the incoming TCP connection to.
Is there an alternative to a plain port-forward? I want a port that gets me in to talk to any IP that edgeproxy resolves to.
This is provided by swarm mode. You can enable a single node swarm cluster with:
docker swarm init
And then deploy your compose file as a stack with:
docker stack deploy -c docker-compose.yml $stack_name
There are quite a few differences from docker compose including:
Swarm doesn't build images
You manage the target state with docker service commands. Trying to stop a container with docker stop won't work, since swarm will restart it
The compose file needs to use the v3 syntax
Networks will be overlay networks and, by default, not attachable by containers outside of swarm
One of the main changes is that exposed ports are published on an ingress network managed by swarm mode, and connections are round robin load balanced to your containers. You can also define a replica count inside the compose file, eliminating the need to run a scale command.
See more at: https://docs.docker.com/engine/swarm/
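For the edge proxy case in the question, a sketch of the swarm-mode equivalent without a compose file might look like this. It assumes docker swarm init has already been run; the helper name is hypothetical, and the image is left as a parameter.

```shell
# Hypothetical helper: run N replicas of a service behind one published port.
# Swarm's routing mesh accepts connections on the published port and spreads
# them across the replicas.
# Usage: publish_replicated <service_name> <replicas> <port> <image>
publish_replicated() {
  docker service create \
    --name "$1" \
    --replicas "$2" \
    --publish "published=$3,target=$3" \
    "$4"
}

# Example: publish_replicated edgeproxy 2 8443 <image>
```

With this, host port 8443 would be bound once by swarm rather than once per container, which is the behaviour the question asks for.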

Keycloak docker containers are unable to discover each other

I have two instances of Keycloak, each running in a container on its own node.
The nodes are bare-metal nodes inside my company network.
Keycloak uses TCPPING as its discovery protocol.
Since the two containers are running on different nodes, and each instance is pinging inside the default Docker network, they are not able to find each other.
I say the default Docker network because I didn't specify a particular network for the two containers.
Any idea how I can make the two instances discover each other in this architecture?
I was thinking about Docker swarm as a solution.
Assuming the two nodes are on the same network and are able to connect to each other, you can get the two containers to discover each other using Docker host networking.
It would be as easy as docker run --net=host
Docker host networking makes the container share the network stack of the host node, so it uses the host's IP address and ports directly and, for all practical purposes, looks like any other process on that host.
This allows the two containers to discover each other using TCPPING.
Docker swarm would also enable this. Docker swarm basically abstracts multiple host nodes so that you can run containers on them as if you were running Docker on a single host. But that requires joining the nodes into a swarm, which is a whole new setup.

Can we have two or more containers running on docker at the same time

I have no practical experience with Docker and containers, so this is based only on what I have read.
In the documentation available online I could not find details about running two or more containers at the same time.
Docker allows a container's ports to be mapped to ports on the host machine.
Now, the question is: can we run multiple containers at the same time on Docker? If yes, and two containers are mapped to the same port number, how is the port handled?
Also, out of curiosity, can two containers on Docker communicate with each other?
Yes you can run multiple containers on a single host; docker is designed for exactly that.
You cannot map two containers to the same host port; you get an error response if you try. However, if your containers run the same image (e.g. 2 instances of a webapp) you could run them as a service and have them exposed on the same port. Docker will load-balance the requests. You can read more about services in the Docker documentation, or follow the Get Started guide (Part 3, services).
Yes, the containers on a single host can communicate with each other, by container name. For example if you have one container running MongoDB called mongo, and another one running Node.js called webserver, the webserver container can connect to the database by using the name mongo e.g. db.Connect("mongodb://mongo:27017/testdb").
We can run more than one container at a time on a host, but we will hit the limitation that the same host port cannot be bound to more than one container. To resolve this, bind a different host port for each container. For example, MongoDB's default port is 27017, so we can run three MongoDB containers with -p 27017:27017 for container D1, -p 27018:27017 for container D2 and -p 5000:27017 for container D3. In this way you can map different host ports to the MongoDB port 27017. As for how to manage these ports from the host, I would recommend using nginx on the host machine.
Coming to your next question: all containers are connected to the default docker0 bridge network, so we can connect to any container attached to it. If I am right, it uses addresses in the 172.x.x.x range. Get inside a container and run 'ip addr' to see the IP address assigned to it; you can then test the connection with the ping command.
Yes, two containers can run at the same time, and they can also communicate with each other; you can define your own network for them to communicate over. The ports a container listens on are internal to that container, so one container's ports do not collide with another's. If you want to expose a port to the host, you have to publish it.
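The host-port scheme described in the answers above (several containers of the same image, each published on its own host port) could be scripted roughly as follows. This is only a sketch: the helper name and the instance-N container names are invented for illustration.

```shell
# Hypothetical helper: start COUNT containers of IMAGE, mapping host ports
# BASE, BASE+1, ... to the same container port.
# Usage: run_instances <image> <base_host_port> <count> <container_port>
run_instances() {
  i=0
  while [ "$i" -lt "$3" ]; do
    docker run -d --name "instance-$i" -p "$(($2 + i)):$4" "$1"
    i=$((i + 1))
  done
}

# Example: run_instances mongo 27017 3 27017
#   publishes 27017:27017, 27018:27017 and 27019:27017
```

Each container still listens on 27017 internally; only the host side of the mapping has to be unique.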

Inter-container connection via localhost

I would like to set up my containers so that they connect to each other via localhost.
My setup is a main application container and two other containers that it needs to connect to (ActiveMQ and Wiremock).
I already run ActiveMQ and Wiremock in containers with the relevant ports exposed, and the main application runs through IntelliJ and connects to these. However, when I am not developing the main application, I would like to run it in a container for simplicity, but it cannot connect to the ports exposed by the others.
Setting --net=host doesn't seem to work, nor does creating a network docker network create <NAME> and assigning it in the docker run with --net=<NAME>.
The application already runs in a container in other environments on the host network.
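Inside a container, localhost refers to that container itself, so the other containers are not reachable there. Docker creates a default network in which all containers run, but name resolution by container name is only provided on user-defined networks (or via --link on the default bridge).
If you have a container named mq for your ActiveMQ, and both containers are attached to the same user-defined network, then you would use something like tcp://mq:61616 (or whatever protocol / port you have configured) from your other containers to connect to it.
So create one network, attach all three containers to it with --net=<NAME>, and configure the application to use the container names instead of localhost.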

Adding new containers to existing cluster (swarm)

I am having a problem working out the best way to add a new container to an existing cluster when all containers run in Docker.
Assume I have a Docker swarm, and whenever a container stops or fails for some reason, the swarm brings up a new container and expects it to add itself to the cluster.
How can I make any container able to add itself to a cluster?
For example, if I want to create a RabbitMQ HA cluster, I need to create a master and then create slaves. Assuming every instance of RabbitMQ (master or slave) is a container, let's now assume that one of them fails. We have 2 options:
1) slave container has failed.
2) master container has failed.
Usually, a service that can run as a cluster can also elect a new leader to be the master. So, assuming this scenario works seamlessly without any intervention, how would a new container added to the swarm (using docker swarm) be able to add itself to the cluster?
The problem here is that the new container is not created with new arguments every time. The container is always created as it was deployed the first time, which means I can't just change its command line arguments, and this is a cloud, so I can't hard-code an IP to use.
Something here is missing.
Maybe declaring a "Service" at the Docker swarm level will actually give the new container the ability to add itself to the cluster without really knowing anything about the other machines in the cluster...
There are quite a few options for scaling out containers with Swarm. It can range from being as simple as passing in the information via a container environment variable to something as extensive as service discovery.
Here are a few options:
Pass in IP as container environment variable. e.g. docker run -td -e HOST_IP=$(ifconfig wlan0 | awk '/t addr:/{gsub(/.*:/,"",$2);print$2}') somecontainer:latest
This would set the container environment variable HOST_IP to the IP of the machine it was started on.
Service Discovery. Querying a known point of entry to determine information about any required services, such as IP, port, etc.
This is the most common type of scale-out option. You can read more about it in the official Docker docs. The high level overview is that you set up a service like Consul on the masters, which your services query to find information about other relevant services. Example: a web server requires a DB. The DB would add itself to Consul, and the web server would start up and query Consul for the database's IP and port.
Network Overlay. Creating a network in swarm for your services to communicate with each other.
Example:
$ docker network create -d overlay mynet
$ docker service create --name frontend --replicas 5 -p 80:80/tcp --network mynet mywebapp
$ docker service create --name redis --network mynet redis:latest
This allows the web app to communicate with redis by placing them on the same network.
Lastly, in your example above it would be best to deploy it as 2 separate containers which you scale individually. e.g. Deploy one MASTER and one SLAVE container. Then you would scale each dependent on the number you needed. e.g. to scale to 3 slaves you would go docker service scale <SERVICE-ID>=<NUMBER-OF-TASKS> which would start the additional slaves. In this scenario if one of the scaled slaves fails swarm would start a new one to bring the number of tasks back to 3.
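The ifconfig/awk one-liner in the first option depends on the older "inet addr:" output format. A possible alternative, sketched here under the assumption that the ip tool from iproute2 is installed and that the interface name is known, parses the one-line output of ip -4 -o addr show. The function names are made up for illustration.

```shell
# Hypothetical helper: read `ip -4 -o addr show` output on stdin and print
# the first IPv4 address without its prefix length.
first_ipv4() {
  awk '{ split($4, a, "/"); print a[1]; exit }'
}

# Hypothetical helper: print the first IPv4 address of an interface.
host_ip() {
  ip -4 -o addr show dev "$1" | first_ipv4
}

# Example: docker run -td -e HOST_IP="$(host_ip wlan0)" somecontainer:latest
```

Splitting the parsing into its own function keeps it easy to check against sample output before relying on it in a deployment script.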
https://docs.docker.com/engine/reference/builder/#healthcheck
Docker images can declare a health check with the HEALTHCHECK instruction.
Use a health check in your images, for example:
RUN ./anyscript.sh
HEALTHCHECK CMD <any command you want to run> || exit 1
HEALTHCHECK looks at the exit status of the command (0 or 1) and reports the container status as:
1. healthy
2. unhealthy
3. starting
Docker swarm automatically replaces unhealthy containers in a swarm cluster.
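A health check command only needs to exit with 0 (healthy) or 1 (unhealthy). As a minimal sketch, a wrapper like the following could serve as the body of a health check script; the function name and the example probe are assumptions, not part of the answer.

```shell
# Hypothetical helper: run a probe command quietly and map its result to the
# exit codes HEALTHCHECK expects (0 = healthy, 1 = unhealthy).
probe() {
  if "$@" >/dev/null 2>&1; then
    return 0
  fi
  return 1
}

# Example use inside a script referenced by HEALTHCHECK CMD
# (the URL is a placeholder):
#   probe curl -f http://localhost/ || exit 1
```

Normalising every failure to 1 avoids accidentally returning other exit codes from the probed command.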

Resources