Docker container cannot reach other services for a few seconds - docker

I have a Docker swarm node running a set of services connected by an overlay network. When needed, I dynamically add another Docker node via Terraform. It is a separate EC2 instance that is set up and joined to the existing swarm as a worker node.
From my manager I then run a container on the new node, and that container needs to talk to the existing services on the manager node, for example connecting to the postgres service and running a few queries.
docker -H <node ip> run --network <overlay network where services are running> <some image> <command>
The script running in the container fails with a "Name or service not known" error. When I open a shell in the container and ping manually, the ping succeeds after 4 or 5 seconds. I have tried this hundreds of times and always hit the same issue. It also doesn't matter when the node joined the swarm: every time I run the above command, I face the same issue.
Also, I don't have control over what script is run in the container, so I cannot add retries.
One more thing: sometimes some services can be reached immediately. For example, Postgres will fail while another service exposing REST endpoints can be reached, but that is not always the case.
I was able to reproduce this issue with a bunch of test services:
Steps to reproduce the issue:
1. Create a docker swarm and add another machine as a worker node to the swarm.
2. Create an overlay network on node 1:
docker network create -d overlay --attachable platform
3. Create services on node 1:
for i in {1..25}; do docker service create --network platform -p :80 --name "service-${i}" dockerbogo/docker-nginx-hello-world; done
4. Create a task from node 1 to be run on node 2:
docker -H 10.128.0.3:2376 run --rm --network platform centos ping service-1
Docker daemon logs: https://pastebin.com/65Lihp8v
Any help?
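One possible workaround, sketched under assumptions: if the command passed to docker run can be wrapped (even when the script itself cannot be changed), a small shell helper could wait until a service name resolves before the real script starts. The helper name retry_until and its arguments are hypothetical, and it assumes the image ships a POSIX sh.

```shell
# Hypothetical helper: run a probe command until it succeeds or the
# attempts run out.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Stop as soon as the probe command returns 0.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

A wrapper could then call something like retry_until 10 1 getent hosts service-1 before handing over to the real script, assuming getent is available in the image. This only hides the delay; it does not explain why name resolution is slow on the new node.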

Related

How to use Docker Compose with legacy Docker Swarm

I am trying to deploy my app with Compose and Swarm. Currently I don't want to upgrade my docker-compose.yaml from v2 to v3, so I can only do that with standalone (legacy) Swarm rather than Docker swarm mode, based on Stoneman's answer and the official Swarm documents.
Following the official instructions, I successfully set up a swarm cluster. I ran docker -H :4000 info on the swarm manager node to check the swarm cluster status, as shown below. There are two other worker nodes in this cluster. Next, I want to create an overlay network with this cluster and refer to this network in the docker-compose.yaml. But when I ran docker -H :4000 network create -d overlay test on the swarm manager node to create the network, it reported an error: Error response from daemon: Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.
So, how can I create a network with a swarm cluster (without docker-machine and VirtualBox)? Currently, the swarm manager and worker nodes are running as docker containers.
Did you set up overlay networking with its own etcd backend first? https://docs.docker.com/network/overlay-standalone.swarm/
Swarm "classic" is deprecated and replaced by docker swarm mode. Everything is harder in classic, including setting up overlay. I wouldn't recommend using it for anything new unless you had a hard requirement.
In swarm mode you run all commands at the swarm manager host. Same with creating networks, secrets, etc.
You can find out the docker manager machine by:
$ docker node ls
Manager host is marked with MANAGER STATUS:Leader.
After creating the network on the manager all nodes on that swarm should see the network.
"I ran docker -H :4000 network create -d overlay test"
It is better to declare the network inside the stack yml file, for faster and easier deployment. You can create the network and expose your ports in the yml file, so there is no need to create them manually every time you run the stack.
Add the following block under the docker service:
services:
  ...
    #Network
    networks:
      - network-name-here
    ...
    #Exposed ports:
    ports:
      - target: 4000
        published: 4000
At the end of the yml file, add the following block to declare the network, so it's created every time you run docker stack deploy:
networks:
  network-name-here:
    driver: overlay

Docker - Have new container communicate with pre-running container

I'm trying to setup some very simple networking between a pair of Docker containers and so far all the documentation I've seen is far more complex than for what I am trying to do.
My use case is simple:
Container 1 is already running and is listening on port 28016
Container 2 will start after container 1 and needs to connect to container 1 on port 28016.
I am aware I can set this up via Docker-Compose with ease, however Container 1 is long-lived and for this use case, I do not want to shut it down. Container 2 needs to start and automatically connect to container 1 via port 28016. Also, both containers are running on the same machine. I cannot figure out how to do this.
I've exposed 28016 in Container 1's dockerfile, and I'm running it with -p 28016:28016. What do I need to do for Container 2 to connect to Container 1?
There are a few ways of solving this. Most don't require you to publish the ports.
Using a user defined network
Start your long-running container in a user-defined network, because Docker then handles name resolution between the containers on that network:
docker network create service-network
docker run --net=service-network --name Container1 service-image
If you then start your ephemeral container in the same network, it will be able to refer to the long-running container by name. E.g:
docker run --name Container2 --net=service-network ephemeral-image
Using the existing container network namespace
You can just run the ephemeral container inside the network namespace of the long running container:
docker run --name Container2 --net=container:Container1 ephemeral-image
In this case, the service would be available via localhost:28016.
Accessing the service on the host
Since you've published the service on the host with -p 28016:28016, you can reach it using the address of the host, which from inside the container is going to be the default gateway. You can get that with something like:
address=$(ip route | awk '$1 == "default" {print $3}')
And your service would be available on ${address}:28016.
Here are the steps to perform:
1. Create a network: docker network create my-net
2. Connect the already running container to the network: docker network connect my-net <container-name>
3. Start the new container with --network my-net, or with docker-compose add a networks property:
...
    networks:
      - my-net
networks:
  my-net:
    external: true
The container should now be able to communicate using the container name as a DNS host name.
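The steps above can be sketched as a small shell helper. This is only an illustration: the function name connect_running_container is hypothetical, and it assumes the docker CLI is on the PATH and the container is already running.

```shell
# Hypothetical helper: make sure a user-defined network exists, then
# connect an already running container to it.
# Usage: connect_running_container <network> <container>
connect_running_container() {
    network=$1
    container=$2
    # "docker network inspect" fails if the network does not exist yet.
    if ! docker network inspect "$network" >/dev/null 2>&1; then
        docker network create "$network" >/dev/null
    fi
    docker network connect "$network" "$container"
}
```

After something like connect_running_container my-net Container1, a second container started with --network my-net should be able to reach the first one by its container name.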

Docker: Difference between `docker run` and `docker service`

I am very new to Docker and have just started venturing into it. Reading online, I came across the following commands: docker run and docker service. As I understand it, docker run spins up a new container. However, I am not clear what docker service does. Does it spin up containers in a swarm?
Can anyone help me understand this in simple terms?
The docker run command creates and starts a container on the local docker host.
A docker "service" is one or more containers with the same configuration running under docker's swarm mode. It's similar to docker run in that you spin up a container. The difference is that you now have orchestration. That orchestration restarts your container if it stops, finds the appropriate node to run the container on based on your constraints, scales your service up or down, allows you to use the mesh networking and a VIP to discover your service, and performs rolling updates to minimize the risk of an outage during a change to your running application.
Docker Run vs Docker service
docker run:
We can create a number of containers from different images.
docker service:
We can create a number of containers from the same image with a single command line.
SYNTAX:
docker service create --name service-name --network network-name --replicas number-of-containers image-name
EXAMPLE:
docker service create --name service1 --network swarm-net --replicas 5 redis

run docker exec from swarm manager

I have two worker nodes, worker1 and worker2, and one swarm manager. I'm running all the services on the worker nodes only. I need to run docker exec from the manager to access some of the containers created on the worker nodes, but I keep getting an error that the service is not recognized. I know I can run docker exec on any of the worker nodes and it works fine, but I don't want to have to find which node the service is running on and then ssh to that node to run the docker exec command. Is there a way to do so in swarm or not?
Swarm mode does not currently have a way to run an exec on a running task. You need to find the container and run the exec on the host. You can configure the workers to have a TLS protected port they listen on, which would give you remote access (see docker's guide). And you can lookup the node for each task in a service by checking the output of a docker service ps $service_name.
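As a rough sketch of the lookup step, the node of a running task can be read from docker service ps with a format template. The helper name task_node is hypothetical, and the sketch assumes it is run on a manager node.

```shell
# Hypothetical helper: print the node name of the first running task
# of a service.
# Usage: task_node <service_name>
task_node() {
    docker service ps \
        --filter desired-state=running \
        --format '{{.Node}}' \
        "$1" | head -n 1
}
```

With the node name in hand, and assuming the worker exposes a TLS protected daemon port as described above, the exec itself would still be run against that node's daemon with docker -H.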
If this helps, nowadays you can create the overlay network with the --attachable flag to enable any container to join the network. This is a great feature as it allows a lot of flexibility.
E.g.
$ docker network create --attachable --driver overlay my-network
$ docker service create --network my-network --name web --publish 80:80 nginx
$ docker run --network=my-network -ti alpine sh
$ wget -qO- web
<!DOCTYPE html>
<html>
<head>
....

What is the difference between Docker Service and Docker Container?

When do we use a docker service create command and when do we use a docker run command?
In short: docker service is mostly used once you have configured the manager node with Docker swarm, so that containers run in a distributed environment and can be managed easily.
Docker run: The docker run command first creates a writeable container layer over the specified image, and then starts it using the specified command.
That is, docker run is equivalent to the API /containers/create then /containers/(id)/start
source: https://docs.docker.com/engine/reference/commandline/run/#parent-command
Docker service:
Docker service will be the image for a microservice within the context of some larger application. Examples of services might include an HTTP server, a database, or any other type of executable program that you wish to run in a distributed environment.
When you create a service, you specify which container image to use and which commands to execute inside running containers. You also define options for the service including:
the port where the swarm will make the service available outside the swarm
an overlay network for the service to connect to other services in the swarm
CPU and memory limits and reservations
a rolling update policy
the number of replicas of the image to run in the swarm
source: https://docs.docker.com/engine/swarm/how-swarm-mode-works/services/#services-tasks-and-containers
docker run command is used to create a standalone container
docker service create command is used to create instances (called tasks) of that service running in a cluster (called swarm) of computers (called nodes). Those tasks are containers of course, but not standalone containers. In a sense a service acts as a template when instantiating tasks.
For example
docker service create --name MY_SERVICE_NAME --replicas 3 IMAGE:TAG
creates 3 tasks of the MY_SERVICE_NAME service, which is based on the IMAGE:TAG image.
More information can be found here
Docker run will start a single container.
With docker service you manage a group of containers (from the same image). You can scale them (start multiple containers) or update them.
You may want to read "docker service is the new docker run"
According to these slides, "docker service create" is like an "evolved" docker run. You need to create a "service" if you want to deploy a container to Docker Swarm
Docker services are like "blueprints" for containers. You can e.g. define a simple worker as a service, and then scale that service to 20 containers to go through a queue really quickly. Afterwards you scale that service down to 3 containers again. Also, via Swarm these containers could be deployed to different nodes of your swarm.
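The scale up and scale down pattern described above could look roughly like this in shell. The helper name with_burst and the service name worker are hypothetical; docker service scale is the underlying command.

```shell
# Hypothetical helper: scale a service up for a burst of work, run the
# given command, then scale the service back down.
# Usage: with_burst <service> <high_replicas> <low_replicas> <command> [args...]
with_burst() {
    service=$1
    high=$2
    low=$3
    shift 3
    docker service scale "$service=$high"
    # Run the caller's command while the service is scaled up.
    "$@"
    status=$?
    # Scale back down even if the command failed.
    docker service scale "$service=$low"
    return "$status"
}
```

For example, with_burst worker 20 3 some-command would scale the worker service to 20 replicas while some-command runs and back to 3 afterwards.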
But yeah, I also recommend reading the documentation, just like @Tristan suggested.
You can use Docker in two ways.
Standalone mode
In standalone mode the Docker daemon is installed on only one machine. Here you can create, destroy, and run a single container or multiple containers on that single machine.
So when you run docker run, the docker CLI sends an API request to the dockerd daemon to run the specified container.
What you do with the docker run command therefore only affects the single node/machine/host where you run it. If you add a volume or network to the container, those resources are only available on the node where you ran the docker run command.
Swarm mode (or cluster mode)
When you want or need the advantages of cluster computing, such as high availability, fault tolerance, and horizontal scalability, you can use swarm mode. With swarm mode, you can have multiple nodes/machines/hosts in your cluster and distribute your workload across the cluster. You can even initiate swarm mode in a single-node cluster and add more nodes later.
Example
You can recreate the scenario for free here.
Suppose at this moment we have only one node called node-01.dc.local, where we have run the following commands:
####### Initiating swarm mode ########
$ docker swarm init --advertise-addr eth0
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-21mxdqipe5lvzyiunpbrjk1mnzaxrlksnu0scw7l5xvri4rtjn-590dyij6z342uyxthletg7fu6 192.168.0.8:2377
####### create a standalone container #######
[node1] (local) root@192.168.0.8 ~
$ docker run -d --name app1 nginx
####### creating a service #######
[node1] (local) root@192.168.0.8 ~
$ docker service create --name app2 nginx
After a while, when you feel that you need to scale your workload, you add another machine named node-02.dc.local, and you want to scale and distribute your service to the newly created node.
So we run the following command on the node-02.dc.local node:
####### Join the second machine/node/host in the cluster #######
[node2] (local) root@192.168.0.7 ~
$ docker swarm join --token SWMTKN-1-21mxdqipe5lvzyiunpbrjk1mnzaxrlksnu0scw7l5xvri4rtjn-590dyij6z342uyxthletg7fu6 192.168.0.8:2377
This node joined a swarm as a worker.
Now from the first node I run the following to scale up the service.
####### Listing services #######
[node1] (local) root@192.168.0.8 ~
$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
syn9jo2t4jcn app2 replicated 1/1 nginx:latest
####### Scaling app2 from a single container to 10 containers #######
[node1] (local) root@192.168.0.8 ~
$ docker service update --replicas 10 app2
app2
overall progress: 10 out of 10 tasks
1/10: running [==================================================>]
2/10: running [==================================================>]
3/10: running [==================================================>]
4/10: running [==================================================>]
5/10: running [==================================================>]
6/10: running [==================================================>]
7/10: running [==================================================>]
8/10: running [==================================================>]
9/10: running [==================================================>]
10/10: running [==================================================>]
verify: Service converged
####### Verifying that app2's workload is distributed to both of the nodes #######
[node1] (local) root@192.168.0.8 ~
$ docker service ps app2
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
z12bzz5sop6i app2.1 nginx:latest node1 Running Running 15 minutes ago
8a78pqxg38cb app2.2 nginx:latest node2 Running Running 15 seconds ago
rcc0l0x09li0 app2.3 nginx:latest node2 Running Running 15 seconds ago
os19nddrn05m app2.4 nginx:latest node1 Running Running 22 seconds ago
d30cyg5vznhz app2.5 nginx:latest node1 Running Running 22 seconds ago
o7sb1v63pny6 app2.6 nginx:latest node2 Running Running 15 seconds ago
iblxdrleaxry app2.7 nginx:latest node1 Running Running 22 seconds ago
7kg6esguyt4h app2.8 nginx:latest node2 Running Running 15 seconds ago
k2fbxhh4wwym app2.9 nginx:latest node1 Running Running 22 seconds ago
2dncdz2fypgz app2.10 nginx:latest node2 Running Running 15 seconds ago
But if you need to scale app1, you can't, because you created that container in standalone mode.
