Docker stack: communicating between containers

I'm trying to set up a swarm using Docker, but I'm having issues with communication between containers.
I have a cluster with 5 nodes: 1 manager and 4 workers.
There are 3 apps: redis, splash, myapp.
myapp has to run on the 4 workers; redis and splash only on the manager.
myapp has to be able to communicate with redis and splash.
I tried using the container name, but it's not working: it resolves the container name to a different IP.
ping splash # returns a different IP than the container actually has
I am deploying the stack using docker stack:
docker stack deploy -c docker-stack.yml myapp
Linking the containers also doesn't work.
Any ideas? Am I missing something?
root@swarm-manager:~# docker version
Client:
Version: 17.09.0-ce
API version: 1.32
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:42:18 2017
OS/Arch: linux/amd64
Server:
Version: 17.09.0-ce
API version: 1.32 (minimum version 1.12)
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:40:56 2017
OS/Arch: linux/amd64
Experimental: false
docker-stack.yml contains:
version: "3"
services:
splash:
container_name: splash
image: scrapinghub/splash
ports:
- 8050:8050
- 5023:5023
deploy:
mode: global
placement:
constraints:
- node.role == manager
redis:
container_name: redis
image: redis
ports:
- 6379:6379
deploy:
mode: global
placement:
constraints:
- node.role == manager
myapp:
container_name: myapp
image: myapp_image:latest
environment:
REDIS_ENDPOINT: redis:6379
SPLASH_ENDPOINT: splash:8050
deploy:
mode: global
placement:
constraints:
- node.role == worker
entrypoint:
- ping google.com
---- EDIT ----
I tried with curl also. Didn't work.
docker stack deploy -c docker-stack.yml myapp
Creating network myapp_default
Creating service myapp_splash
Creating service myapp_redis
Creating service myapp_myapp
curl http://myapp_splash:8050
curl: (7) Failed to connect to myapp_splash port 8050: No route to host
curl http://splash:8050
curl: (7) Failed to connect to splash port 8050: No route to host
What worked was using the actual task name of splash, which includes randomly generated IDs:
curl http://myapp_splash.d7bn0dpei9ijpba4q41vpl4zz.tuk1cimht99at9g0au8vj9lkz:8050
But this doesn't really help me.

Ping is not the proper tool to test connectivity between services. On an overlay network, the service name resolves to a virtual IP (VIP) that load-balances across the service's tasks, and the VIP does not answer ICMP, so ping appears to fail. Try curl http://serviceName instead.
Other than that: containers can't be named when using stack deploy; instead, your service name is used (which coincidentally is the same) to reach another service.
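Not something this answer calls out, but a common way to make the cross-node DNS path explicit is to declare a shared overlay network and attach the services to it. A minimal sketch, assuming the service and image names from the question:

version: "3"
services:
  splash:
    image: scrapinghub/splash
    networks:
      - appnet
  myapp:
    image: myapp_image:latest
    networks:
      - appnet   # on this network, curl http://splash:8050 resolves via the service name
networks:
  appnet:
    driver: overlay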

I managed to get it working using curl http://tasks.splash:8050 or http://tasks.myapp_splash:8050.
I don't know what is causing this issue though. Feel free to comment with an answer.

It seems that containers in a stack are also reachable as tasks.<service name>, so the command ping tasks.myservice works for me!
An interesting point to note: names like <stackname>_<service name> will also resolve and are pingable, but the IP address is not that of any single container (it is the service's virtual IP). This is frustrating.
(For example, if you do docker stack deploy -c my.yml AA, you'll get a name like AA_myservice which resolves to an unexpected address.)
To add to the above answer: from a network point of view, curl and ping do the same thing. Both resolve the name passed to them; then curl tries to connect using the specified protocol (HTTP in the example above) and ping sends ICMP echo requests.
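A quick way to see the difference from inside any container attached to the stack network (a sketch, assuming the stack name myapp from the question and an image that ships nslookup):

nslookup myapp_splash        # returns a single virtual IP (VIP); it load-balances and does not answer ICMP
nslookup tasks.myapp_splash  # returns one A record per running task, i.e. the real container IPs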

Related

Unable to loopback to host IP using docker-compose.yml file on Windows Server 2019

I am using Windows Server 2019 with the Containers and Hyper-V features enabled. I also made sure that Windows support for Linux containers is installed on the machine.
I need to use a docker-compose.yml file to bring up the Docker containers (web APIs), but I want the port exposed from the container to be accessible only on the host machine.
Below is the sample docker-compose.yml that I am using, binding to the loopback address 127.0.0.1.
webapi:
  image: webapi
  build:
    context: .
    dockerfile: webapi/Dockerfile
  container_name: webapi
  restart: always
  environment:
    - ASPNETCORE_ENVIRONMENT=Development
    - ASPNETCORE_URLS=https://+:443
  ports:
    # Allow access to web APIs only on this local machine and on secure port.
    - "127.0.0.1:5443:443"
This solution works fine on a Windows 10 machine with Docker Desktop installed, but doesn't work on Windows Server 2019 with Docker EE installed. I get the below error (webapi is a Linux image):
ERROR: for webapi Cannot start service webapi: failed to create endpoint webapi on network containers_default: Windows does not support host IP addresses in NAT settings
ERROR: Encountered errors while bringing up the project.
My Windows Server 2019 docker configuration looks like this:
PS C:\Users\xyz> docker version
Client: Mirantis Container Runtime
Version: 20.10.5
API version: 1.41
Go version: go1.13.15
Git commit: 105e9a6
Built: 05/17/2021 16:36:02
OS/Arch: windows/amd64
Context: default
Experimental: true
Server: Mirantis Container Runtime
Engine:
Version: 20.10.5
API version: 1.41 (minimum version 1.24)
Go version: go1.13.15
Git commit: 1a7d997053
Built: 05/17/2021 16:34:40
OS/Arch: windows/amd64
Experimental: true
PS C:\Users\xyz> docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker Application (Docker Inc., v0.8.0)
cluster: Manage Mirantis Container Cloud clusters (Mirantis Inc., v1.9.0)
registry: Manage Docker registries (Docker Inc., 0.1.0)
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 5
Server Version: 20.10.5
Storage Driver: windowsfilter (windows) lcow (linux)
Windows:
LCOW:
Logging Driver: json-file
Plugins:
Volume: local
Network: ics internal l2bridge l2tunnel nat null overlay private transparent
Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
Swarm: inactive
Default Isolation: process
Kernel Version: 10.0 17763 (17763.1.amd64fre.rs5_release.180914-1434)
Operating System: Windows Server 2019 Datacenter Version 1809 (OS Build 17763.1999)
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 16GiB
Any help would be greatly appreciated.
After spending a fair amount of time, I didn't find a way to fix the "Windows does not support host IP addresses in NAT settings" error, nor did any other network driver (bridge, host, etc.) help. However, I found a workaround that makes the port effectively accessible only on the local machine, by configuring the Kestrel web server of the web app (container) with the "AllowedHosts" parameter in appsettings.json. I set the parameter value like below:
{
  // allow the web apis to be accessible only on host machine as a security measure.
  // allow access only if the host in the URL matches the values mentioned in the list.
  "AllowedHosts": "localhost;127.0.0.1",
  "Serilog": {
    "Using": [],
    "MinimumLevel": {
      "Default": "Debug",
      "Override": {
        "Microsoft": "Warning",
        "System": "Warning"
      }
    },
    ...
}
All this does is check whether the host in the URL is either localhost or 127.0.0.1, which approximates a loopback-only restriction.
You can also override this parameter by passing "AllowedHosts" as an environment variable in the yml file, like below:
webapi:
  image: webapi
  build:
    context: .
    dockerfile: webapi/Dockerfile
  container_name: webapi
  restart: always
  environment:
    - ASPNETCORE_ENVIRONMENT=Development
    - ASPNETCORE_URLS=https://+:443
    - AllowedHosts=*
  ports:
    # Allow access to web APIs only on this local machine and on secure port.
    - "5443:443"
Please note that AllowedHosts is not a safe-list of source IPs to accept connections from; it filters on the target host mentioned in the URL (the request's Host header).
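A quick sanity check (a sketch, assuming the 127.0.0.1:5443 mapping from the first yml): ASP.NET Core's host-filtering middleware rejects requests whose Host header is not in the list with 400 Bad Request.

# Host header is 127.0.0.1, so the request passes the filter (-k skips TLS verification for a dev certificate)
curl -k https://127.0.0.1:5443/
# A forged Host header outside AllowedHosts is rejected with 400 Bad Request
curl -k -H "Host: example.com" https://127.0.0.1:5443/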

docker swarm: run a service only once per node

I use this Docker version:
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:24:51 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:23:15 2018
OS/Arch: linux/amd64
Experimental: false
I use docker-compose to configure my services. I want to have 3 replicas of a server that run on 3 specially labeled nodes. For this I use a YAML configuration like:
version: '3.7'
services:
  ser:
    deploy:
      placement:
        constraints:
          - "node.labels.cloud.type == nodesforservice"
      replicas: 3
      restart_policy:
        condition: any
      rollback_config:
        delay: 0s
        parallelism: 0
      update_config:
        delay: 6s
        parallelism: 1
    environment:
      - affinity:service!=stackname_servicename
    image: service:latest
and deploy this configuration via
docker stack deploy --compose-file docker-stack.yml stackname
But I have found out that affinity:service!=stackname_servicename does not work properly (or does not work at all); it works only in the deprecated standalone mode. If only 2 nodes are currently available, the service will be deployed to one of them twice, which is exactly what I am trying to avoid.
Is there any possibility in Docker swarm to say explicitly that 2 containers of the same service are not allowed on one node? The only possibility I have found is to create a global service with --mode global, but I need exactly 3 instances and not more.
If you are using docker service create, you can use --replicas-max-per-node 1 to enforce a 1:1 relationship between container and node.
If you are using a compose file (format version 3.8 or later), you can declare max_replicas_per_node under:
deploy:
  placement:
    max_replicas_per_node: 1
If you need further control over which nodes can run the container, add label matching in a constraints block under placement.
More details here: Compose and Docker compatibility matrix
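Putting the pieces together, a sketch of the service from the question (names taken from the question; the 3.8 file format is required for max_replicas_per_node):

version: '3.8'
services:
  ser:
    image: service:latest
    deploy:
      replicas: 3
      placement:
        max_replicas_per_node: 1   # never co-locate two replicas on one node
        constraints:
          - "node.labels.cloud.type == nodesforservice"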
This is a rather old thread, but using a node label as a placement constraint in combination with global mode deployment does the trick.
deploy:
  mode: global
  placement:
    constraints:
      - node.labels.cloud.type == nodesforservice
Of course the node label "cloud.type=nodesforservice" needs to be applied to the desired number of nodes.
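For reference, node labels are added with the standard CLI, run on a manager node (<node-name> is a placeholder):

docker node update --label-add cloud.type=nodesforservice <node-name>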
For Docker swarm there is no such thing as affinity.

{{node.hostname}} usage in docker-compose.yml in Docker swarm mode

Hi, I am looking at the above example and trying to run a Docker swarm stack, but I'm getting the below error. Not sure what I am missing here.
docker-compose.yml:
services:
  nginx:
    image: nginx
    hostname: '{{.Node.Hostname}}'
version: '3.3'
docker stack deploy test -c docker-compose.yml
but I'm getting the below output/error:
Error response from daemon: rpc error: code = InvalidArgument desc = expanding hostname failed: template: expansion:1:7: executing "expansion" at <.Node.Hostname>: can't evaluate field Hostname in type struct { ID string }
Here is my docker info output:
docker info
Containers: 12
 Running: 0
 Paused: 0
 Stopped: 12
Images: 41
Server Version: 18.03.1-ce
Storage Driver: devicemapper
 Pool Name: docker-253:1-2490377-pool
 Pool Blocksize: 65.54kB
 Base Device Size: 10.74GB
 Backing Filesystem:
Thanks in advance.
I tried your setup with both version: '3.3' and version: '3.4' of compose.
According to https://docs.docker.com/engine/reference/commandline/service_create/#create-services-using-templates, hostname is one of the fields you can use template strings on, so this should work fine.
After creating the stack I verified the hostname with
$ docker inspect test_nginx | grep name
"com.docker.stack.namespace": "test"
"com.docker.stack.namespace": "test"
"Hostname": "{{.Node.Hostname}}",
So I think either this has been fixed in a more recent version of docker or something is odd with your host setup.
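The same template also works from the CLI (documented at the link above), which is a quick way to test it outside a stack file:

docker service create --name nginx-test --hostname "{{.Node.Hostname}}" nginx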

RabbitMQ Docker Container Error: Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces

I am having a problem with running rabbitmq from within Docker on Windows Server 1709 (Windows Server core edition).
I am using docker-compose to create the rabbitmq service. If I run docker-compose on my local computer, everything works fine. When I run it on the Windows server (where Docker has been set up with LCOW support), the above-mentioned error occurs multiple times in the logs. Namely, this error is:
Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces
It is worth noting that I receive this error even if I just do a manual pull of rabbitmq and a manual run with docker run -itd --rm --name rabbitmq rabbitmq:3-management
I am able to bash into the container for a short while before it crashes and exits and I see the following:
root@localhost:~# ls -la
---------- 2 root root 20 Jan 5 12:18 .erlang.cookie
On my localhost, the permissions look like this (which is correct):
root@localhost:~# ls -la
-r-------- 1 rabbitmq rabbitmq 20 Dec 28 00:00 .erlang.cookie
I can't understand why the permission structure is broken on the server.
Is it possible that this is an issue with LCOW support on Windows Server 1709 with Docker for Windows? Or is the problem with rabbitmq?
For reference here is the docker compose file used:
version: "3.3"
services:
rabbitmq:
image: rabbitmq:3-management
container_name: rabbitmq
hostname: localhost
ports:
- "1001:5672"
- "1002:15672"
environment:
- "RABBITMQ_DEFAULT_USER=user"
- "RABBITMQ_DEFAULT_PASS=password"
volumes:
- d:/docker_data/rabbitmq:/var/lib/rabbitmq/mnesia
restart: always
For reference, here is the docker info output from where the error is happening.
docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 17.10.0-ee-preview-3
Storage Driver: windowsfilter (windows) lcow (linux)
LCOW:
Logging Driver: json-file
Plugins:
Volume: local
Network: ics l2bridge l2tunnel nat null overlay transparent
Log: awslogs etwlogs fluentd json-file logentries splunk syslog
Swarm: inactive
Default Isolation: process
Kernel Version: 10.0 16299 (16299.15.amd64fre.rs3_release.170928-1534)
Operating System: Windows Server Datacenter
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 7.905GiB
Name: ServerName
Docker Root Dir: D:\docker-root
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
docker version
Client:
Version: 17.10.0-ee-preview-3
API version: 1.33
Go version: go1.8.4
Git commit: 1649af8
Built: Fri Oct 6 17:52:28 2017
OS/Arch: windows/amd64
Server:
Version: 17.10.0-ee-preview-3
API version: 1.34 (minimum version 1.24)
Go version: go1.8.4
Git commit: b8571fd
Built: Fri Oct 6 18:01:48 2017
OS/Arch: windows/amd64
Experimental: true
I struggled with the same problem when running RabbitMQ inside an AWS ECS container.
Disclaimer: I didn't check this behavior in detail and this is only my assumption, so the stated cause may be wrong, but at least the solution works.
It seems that RabbitMQ creates the .erlang.cookie file on container start if it doesn't exist. And if the in-container user is root:
...
rabbitmq:
  image: rabbitmq:3-management
  # set container user to root
  user: 0:0
...
then .erlang.cookie will be created with root permissions. But RabbitMQ starts its child processes with rabbitmq user permissions, and .erlang.cookie is not writable (editable) by them in this case.
To avoid this problem, I created a custom image with the .erlang.cookie file already in place, using this Dockerfile:
# Build arg with a default; override with --build-arg COOKIE_VALUE=...
ARG COOKIE_VALUE=SomeDefaultRandomString01
FROM rabbitmq:3.11-alpine
# Re-declare without a value to bring the pre-FROM build arg into this stage
ARG COOKIE_VALUE
RUN printf 'log.console = true\nlog.console.level = warning\nlog.default.level = warning\nlog.connection.level = warning\nlog.channel.level = warning\nlog.file.level = warning\n' > /etc/rabbitmq/conf.d/10-logs_to_stdout.conf && \
    printf 'loopback_users.guest = false\n' > /etc/rabbitmq/conf.d/20-allow_remote_guest_users.conf && \
    printf 'management_agent.disable_metrics_collector = true' > /etc/rabbitmq/conf.d/30-disable_metrics_data.conf && \
    chown rabbitmq:rabbitmq /etc/rabbitmq/conf.d/* && mkdir -p /var/lib/rabbitmq/ && \
    # pre-create the cookie so RabbitMQ never creates it as root
    echo "$COOKIE_VALUE" > /var/lib/rabbitmq/.erlang.cookie && chmod 400 /var/lib/rabbitmq/.erlang.cookie && \
    chown -R rabbitmq:rabbitmq /var/lib/rabbitmq
where the .erlang.cookie value may be any random string, but it should be the same for all nodes in a RabbitMQ cluster (extra information here).
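To use it, pass the cookie at build time (a sketch; the tag and cookie value are placeholders):

docker build --build-arg COOKIE_VALUE=MySharedClusterCookie01 -t rabbitmq-custom:3.11 .
docker run -d --name rabbitmq rabbitmq-custom:3.11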

Mesos Slave - Docker compose

I am using Mesos version 1.0.3, installed via:
docker pull mesosphere/mesos-master:1.0.3
docker pull mesosphere/mesos-slave:1.0.3
I use docker-compose to start mesos-master and mesos-slave. The docker-compose file:
services:
  #
  # Zookeeper must be provided externally
  #
  #
  # Mesos
  #
  mesos-master:
    image: mesosphere/mesos-master:1.0.3
    restart: always
    privileged: true
    network_mode: host
    volumes:
      - ~/mesos-data/master:/tmp/mesos
    environment:
      MESOS_CLUSTER: "mesos-cluster"
      MESOS_QUORUM: "1"
      MESOS_ZK: "zk://localhost:2181/mesos"
      MESOS_PORT: 5000
      MESOS_REGISTRY_FETCH_TIMEOUT: "2mins"
      MESOS_EXECUTOR_REGISTRATION_TIMEOUT: "2mins"
      MESOS_LOGGING_LEVEL: INFO
      MESOS_INITIALIZE_DRIVER_LOGGING: "false"
  mesos-slave1:
    image: mesosphere/mesos-slave:1.0.3
    depends_on: [ mesos-master ]
    restart: always
    privileged: true
    network_mode: host
    volumes:
      - ~/mesos-data/slave-1:/tmp/mesos
      - /sys/fs/cgroup:/sys/fs/cgroup
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      MESOS_CONTAINERIZERS: docker
      MESOS_MASTER: "zk://localhost:2181/mesos"
      MESOS_PORT: 5051
      MESOS_WORK_DIR: "/var/lib/mesos/slave-1"
      MESOS_LOGGING_LEVEL: WARNING
      MESOS_INITIALIZE_DRIVER_LOGGING: "false"
The Mesos master runs fine without any issues, but the slave does not start and fails with the below error. Not sure what else is missing here.
I0811 21:38:28.952507 1 main.cpp:243] Build: 2017-02-13 08:10:42 by ubuntu
I0811 21:38:28.952599 1 main.cpp:244] Version: 1.0.3
I0811 21:38:28.952601 1 main.cpp:247] Git tag: 1.0.3
I0811 21:38:28.952603 1 main.cpp:251] Git SHA: c673fdd00e7f93ab7844965435d57fd691fb4d8d
SELinux: Could not open policy file <= /etc/selinux/targeted/policy/policy.29: No such file or directory
2017-08-11 21:38:29,062:1(0x7f4f78d0d700):ZOO_INFO@log_env@726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2017-08-11 21:38:29,062:1(0x7f4f78d0d700):ZOO_INFO@log_env@730: Client environment:host.name=<HOST_NAME>
2017-08-11 21:38:29,062:1(0x7f4f78d0d700):ZOO_INFO@log_env@737: Client environment:os.name=Linux
2017-08-11 21:38:29,062:1(0x7f4f78d0d700):ZOO_INFO@log_env@738: Client environment:os.arch=3.8.13-98.7.1.el7uek.x86_64
2017-08-11 21:38:29,062:1(0x7f4f78d0d700):ZOO_INFO@log_env@739: Client environment:os.version=#2 SMP Wed Nov 25 13:51:41 PST 2015
2017-08-11 21:38:29,063:1(0x7f4f78d0d700):ZOO_INFO@log_env@747: Client environment:user.name=(null)
2017-08-11 21:38:29,063:1(0x7f4f78d0d700):ZOO_INFO@log_env@755: Client environment:user.home=/root
2017-08-11 21:38:29,063:1(0x7f4f78d0d700):ZOO_INFO@log_env@767: Client environment:user.dir=/
2017-08-11 21:38:29,063:1(0x7f4f78d0d700):ZOO_INFO@zookeeper_init@800: Initiating client connection, host=localhost:2181 sessionTimeout=10000 watcher=0x7f4f82265e50 sessionId=0 sessionPasswd=<null> context=0x7f4f5c000930 flags=0
2017-08-11 21:38:29,064:1(0x7f4f74ccb700):ZOO_INFO@check_events@1728: initiated connection to server [127.0.0.1:2181]
2017-08-11 21:38:29,067:1(0x7f4f74ccb700):ZOO_INFO@check_events@1775: session establishment complete on server [127.0.0.1:2181], sessionId=0x15dc8b48c6d0155, negotiated timeout=10000
Failed to perform recovery: Failed to run 'docker -H unix:///var/run/docker.sock ps -a': exited with status 1; stderr='Error response from daemon: client is newer than server (client API version: 1.24, server API version: 1.22)
'
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/slave-1/meta/slaves/latest
This ensures agent doesn't recover old live executors.
The below command returns the same API version for the Docker client and server, so I am not sure what is wrong with the setup.
docker -H unix:///var/run/docker.sock version
Client:
Version: 1.10.1
API version: 1.22
Go version: go1.5.3
Git commit: 9e83765
Built: Thu Feb 11 19:18:46 2016
OS/Arch: linux/amd64
Server:
Version: 1.10.1
API version: 1.22
Go version: go1.5.3
Git commit: 9e83765
Built: Thu Feb 11 19:18:46 2016
OS/Arch: linux/amd64
The Mesos slave was using client API version 1.24.
It works after setting this environment variable for the mesos-slave service:
DOCKER_API_VERSION=1.22
The mapping between Docker release versions and API versions is documented here:
https://docs.docker.com/engine/api/v1.26/#section/Versioning
The other option is to upgrade Docker on the host.
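In the compose file from the question, that would look like this (a sketch; only the relevant part of the mesos-slave1 service is shown):

mesos-slave1:
  image: mesosphere/mesos-slave:1.0.3
  environment:
    # pin the API version the agent's Docker client uses against the host daemon
    DOCKER_API_VERSION: "1.22"
    MESOS_CONTAINERIZERS: docker
    MESOS_MASTER: "zk://localhost:2181/mesos"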
