Docker restartmanager prevents restart despite restart policy

I have a Docker container that tends to shut down without restarting, despite having a restart=unless-stopped policy set.
Other containers with similar startup configuration parameters are running on the same host without any problems. The host is a node in a swarm on a somewhat unstable network, and the container is a frequent user of the node network (talking to the master node), so I'm not surprised that it fails regularly, but I expect it to restart itself.
This is due to the restartmanager. The docker inspect State.Error shows a message which clearly came from Docker and not from my container. The logs show:
... time="2019-09-21T02:06:31.969473802Z" level=error msg="restartmanger wait error: Could not attach to network cqr3v2jode1boqh2yofqrh7bx: context deadline exceeded"
So it appears that -- occasionally -- when the container gets restarted the network is down, and the restart manager decides to stop restarting. The question becomes how to override this behavior.
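I'm not aware of a daemon flag that changes this give-up behavior, so one possible workaround (a sketch, not a built-in override) is a cron-driven watchdog that restarts any exited container whose recorded State.Error matches the network-attach failure:

#!/bin/sh
# Restart exited containers whose last error was the swarm network-attach
# timeout. Run from cron, e.g. once a minute.
for c in $(docker ps -aq --filter status=exited); do
  err=$(docker inspect --format '{{.State.Error}}' "$c")
  case "$err" in
    *"Could not attach to network"*) docker start "$c" ;;
  esac
done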
docker info:
Client:
Debug Mode: false
Server:
Containers: 4
Running: 2
Paused: 0
Stopped: 2
Images: 43
Server Version: 19.03.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: wgn64s7lx9jvgw36gtlu0dsou
Is Manager: false
Node Address: 10.0.0.2
Manager Addresses:
10.0.0.1:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.19.66-v7+
Operating System: Raspbian GNU/Linux 9 (stretch)
OSType: linux
Architecture: armv7l
CPUs: 4
Total Memory: 874.5MiB
Name: sensors-2
ID: NTRC:WPLS:GH2P:ZTLM:EDAN:H7HB:HGP6:6G6A:3YVW:T2I7:TVJU:XV3N
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Here are the relevant bits from docker inspect on the non-restarting container. Note that it has restarted a few times, that it exited due to a network error, and that MaximumRetryCount is 0 (which I assume means unlimited). Most recently it wasn't up for long... but my understanding of unless-stopped is that Docker would continue restarting the container, though it would increase the delay between restarts.
[
{
"Id": "fa7c59dfa38f25c70d4c1293db27965c2e76af950fa19a2097b4ce63e1af2be4",
"Created": "2019-06-24T05:25:10.792698029Z",
"Path": "/srv/bin/weather_collector_server",
"Args": [
"/etc/config.ini"
],
"State": {
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 1,
"Error": "Could not attach to network cqr3v2jode1boqh2yofqrh7bx: context deadline exceeded",
"StartedAt": "2019-09-21T03:56:40.911764904Z",
"FinishedAt": "2019-09-21T03:58:07.234852939Z"
},
"Image": "sha256:ee0e5023f37917f074dd0bf03dca328833eafd117fe69041203533768a196789",
"ResolvConfPath": "/var/lib/docker/containers/fa7c59dfa38f25c70d4c1293db27965c2e76af950fa19a2097b4ce63e1af2be4/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/fa7c59dfa38f25c70d4c1293db27965c2e76af950fa19a2097b4ce63e1af2be4/hostname",
"HostsPath": "/var/lib/docker/containers/fa7c59dfa38f25c70d4c1293db27965c2e76af950fa19a2097b4ce63e1af2be4/hosts",
"LogPath": "",
"Name": "/weather_collector_server",
"RestartCount": 3,
"Driver": "overlay2",
"Platform": "linux",
...
"HostConfig": {
...
"RestartPolicy": {
"Name": "unless-stopped",
"MaximumRetryCount": 0
},
...
],
"NetworkSettings": {
"Bridge": "",
"SandboxID": "0e901219511bb618d66943a12af1e09d8bbcb78ca4caa0bad88880f21d843c55",
...
"Networks": {
"hostnet": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"fa7c59dfa38f"
],
"NetworkID": "cqr3v2jode1boqh2yofqrh7bx",
"EndpointID": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "",
"DriverOpts": {}
}
}
}
}
]
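Since the host is already a swarm node, another option (a sketch, assuming the workload can run as a swarm service) is to let the orchestrator own restarts instead of the local restart manager, since it reschedules failed tasks even after transient network errors:

docker service create \
  --name weather_collector_server \
  --network hostnet \
  --restart-condition any \
  <your-image> /srv/bin/weather_collector_server /etc/config.ini

Here <your-image> is a placeholder, and --restart-condition any is the service default, shown explicitly for clarity.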

Related

Can't access docker-swarm container by service name

I can't ping a service by its service name from another container on the same overlay network in Docker swarm. My steps are:
# docker swarm init
# docker network create -d overlay --attachable net1
# docker service create --name dns1 --network net1 tutum/dnsutils sleep 3000
# docker service create --name dns2 --network net1 tutum/dnsutils sleep 3000
This creates a one-node swarm, a user-defined overlay network, and 2 services. I should be able to exec into one container and ping the other via its service name, but it does not work:
# docker exec -it dns1.1.6rned8409m9jkqoxgutzjz4y4 /bin/bash
root@05cba6fd8a0b:/# ping dns2
PING dns2 (10.0.5.5) 56(84) bytes of data.
From 05cba6fd8a0b (10.0.5.3) icmp_seq=1 Destination Host Unreachable
From 05cba6fd8a0b (10.0.5.3) icmp_seq=2 Destination Host Unreachable
From 05cba6fd8a0b (10.0.5.3) icmp_seq=3 Destination Host Unreachable
^C
--- dns2 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3062ms
I can ping the container directly either via the full hostname (dns2.1.idkledfjgd5dwknv6pirywpfk) or IP (10.0.5.6).
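One way to separate service discovery from VIP routing (a diagnostic sketch; tutum/dnsutils ships with nslookup) is to compare the service name against the tasks.<service> name, which bypasses the VIP:

docker exec -it dns1.1.6rned8409m9jkqoxgutzjz4y4 nslookup dns2        # should return the VIP (10.0.5.5)
docker exec -it dns1.1.6rned8409m9jkqoxgutzjz4y4 nslookup tasks.dns2  # returns task IPs (e.g. 10.0.5.6)

If both resolve but only the VIP is unreachable, the problem is in the IPVS load-balancing layer rather than DNS.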
Environment Info:
# docker network inspect -v net1
[
{
"Name": "net1",
"Id": "ngzwl7l7m0zb5brvee21mvfcz",
"Created": "2020-12-14T22:05:25.962132239Z",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.5.0/24",
"Gateway": "10.0.5.1"
}
]
},
"Internal": false,
"Attachable": true,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"05cba6fd8a0bc4e480b50f91fb395d27ee4998277d480109cb95249c38852909": {
"Name": "dns1.1.6rned8409m9jkqoxgutzjz4y4",
"EndpointID": "6bcc76c8688527fcf26d2ed313e351a54b8de69d28cde4388032849a2ff91a3e",
"MacAddress": "02:42:0a:00:05:03",
"IPv4Address": "10.0.5.3/24",
"IPv6Address": ""
},
"c1d9252f528b177ac397b7b9bf627996993ddc0f54aad3ee3862d93dcac407a3": {
"Name": "dns2.1.idkledfjgd5dwknv6pirywpfk",
"EndpointID": "fafd8335715737c26c83ff8a3e7c52a302eb48cbb6b7bb75e396ed6a483bfd31",
"MacAddress": "02:42:0a:00:05:06",
"IPv4Address": "10.0.5.6/24",
"IPv6Address": ""
},
"lb-net1": {
"Name": "net1-endpoint",
"EndpointID": "09e3b875528a05dc39a910b8cfe5cfd57756681c4aeffd56a0c9fb41d6bffd23",
"MacAddress": "02:42:0a:00:05:04",
"IPv4Address": "10.0.5.4/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4101"
},
"Labels": {},
"Peers": [
{
"Name": "4dc98c7e5f08",
"IP": "192.168.1.26"
}
],
"Services": {
"dns1": {
"VIP": "10.0.5.2",
"Ports": [],
"LocalLBIndex": 269,
"Tasks": [
{
"Name": "dns1.1.6rned8409m9jkqoxgutzjz4y4",
"EndpointID": "6bcc76c8688527fcf26d2ed313e351a54b8de69d28cde4388032849a2ff91a3e",
"EndpointIP": "10.0.5.3",
"Info": {
"Host IP": "192.168.1.26"
}
}
]
},
"dns2": {
"VIP": "10.0.5.5",
"Ports": [],
"LocalLBIndex": 270,
"Tasks": [
{
"Name": "dns2.1.idkledfjgd5dwknv6pirywpfk",
"EndpointID": "fafd8335715737c26c83ff8a3e7c52a302eb48cbb6b7bb75e396ed6a483bfd31",
"EndpointIP": "10.0.5.6",
"Info": {
"Host IP": "192.168.1.26"
}
}
]
}
}
}
]
and
# docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.4.2-docker)
Server:
Containers: 3
Running: 2
Paused: 0
Stopped: 1
Images: 7
Server Version: 20.10.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: x2o135d3kkfxw6lb6mfyx8s3h
Is Manager: true
ClusterID: v5x80quwm3vwsubwdd6pclj4r
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.1.26
Manager Addresses:
192.168.1.26:2377
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.73-1-pve
Operating System: Ubuntu 20.10
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 15.62GiB
Name: dockerHost
ID: CCGD:MQRE:PGJJ:YRU5:M4IM:5INT:EGA5:IER3:22UL:7CI3:PZOU:EZZ2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No blkio weight support
WARNING: No blkio weight_device support
For anyone looking at this in the future: the issue for me was that I was running Docker in an LXC container on Proxmox (Ubuntu 20.04 template). I tested this in an Ubuntu 20.04 VM and it works exactly as expected. I don't know exactly what the issue is or whether it can be fixed, but essentially running this in an LXC container will not work.
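If staying on LXC is a requirement, one workaround sometimes suggested (hedged: it sidesteps the VIP/IPVS layer, which depends on kernel modules LXC guests often lack) is to create the services with DNS round-robin endpoint mode:

docker service create --name dns2 --network net1 --endpoint-mode dnsrr tutum/dnsutils sleep 3000

With dnsrr the service name resolves directly to the task IPs instead of a virtual IP.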

Unable to connect to URL within docker container (tomcat) - Socket exception is thrown

I am facing a problem with Tomcat in my Docker container: it throws a socket exception while connecting to a URL. The same setup was working fine until a few days ago, and the same URL is reachable from the host server of the Docker service.
[localhost-startStop-1] Error getting Properties from Config URL :http://config.server.com/config/
org.springframework.web.client.ResourceAccessException: I/O error on POST request for "http://config.server.com/config/public/rest-less-api/query-configurations": Connection reset; nested exception is java.net.SocketException: Connection reset
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:666) ~[spring-web-4.3.9.RELEASE.jar:4.3.9.RELEASE]
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:613) ~[spring-web-4.3.9.RELEASE.jar:4.3.9.RELEASE]
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:531) ~[spring-web-4.3.9.RELEASE.jar:4.3.9.RELEASE]
Unfortunately, our organization-level base Docker image doesn't include ping or ssh tools, so I am a bit clueless about how to troubleshoot this.
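Even without ping or ssh in the image, connectivity can be probed with tools that are usually present (a sketch; it assumes bash and curl exist in the container, and the container name is taken from the inspect output below):

docker exec -it GRK-BRK-REST bash -c 'curl -sv http://config.server.com/config/ > /dev/null'
# bash's /dev/tcp pseudo-device tests raw TCP reachability with no extra tools:
docker exec -it GRK-BRK-REST bash -c 'timeout 3 bash -c "</dev/tcp/config.server.com/80" && echo open || echo closed'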
[root@mylin]# docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 2
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Init Binary: /usr/libexec/docker/docker-init-current
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: 5eda6f6fd0c2884c2c8e78a6e7119e8d0ecedb77 (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: fec3683b971d9c3ef73f284f176672c44b448662 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
WARNING: You're not using the default seccomp profile
Profile: /etc/docker/seccomp.json
selinux
Kernel Version: 3.10.0-862.14.4.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 4
Total Memory: 7.638 GiB
Name: vc2crtp1287181n.fmr.com
ID: B4VP:4BCJ:476O:RUWA:IT3G:O7NO:DZOQ:RR6Z:QMBG:FPB5:DMSE:G5HG
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://registry.access.redhat.com/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Registries: registry.access.redhat.com (secure), docker.io (secure)
Edited:
I did some more analysis and found that the same Docker image works on another host server. When I ran the curl command inside the Docker container on the server where I had the problem, I got the following error message:
sh-4.2# curl --header "Content-Type: application/json" --request POST --data '{"search-query" :"q21321", "structure-format":"FLAT"}' http://config.server.com/config/public/rest-less-api/query-configurations
curl: (56) Recv failure: Connection reset by peer
On the other host server, where the image works fine, the same curl command returns values.
Any direction to resolve this problem would be of great help.
Additional Info:
Below is the network inspect output from the host where the container works:
docker network inspect bridge
[
{
"Name": "bridge",
"Id": "4b8207ce56b3741b7bd864f7adffdc324ba2e9db9e07ae031e10c90f351be158",
"Created": "2018-12-06T04:29:23.258033812-05:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {
"6bcf920c0dc86d60dd288fd086f4d971aee217cf2ee49d71fd47dc1570460504": {
"Name": "GRK-BRK-EVENT",
"EndpointID": "b875dcdf4db8832fe518620801ae87137c6df44697ae7035148921f6a179b64a",
"MacAddress": "02:42:ac:11:00:03",
"IPv4Address": "172.17.0.3/16",
"IPv6Address": ""
},
"de1dc8c4a9e09b2612d2d4e0ede5b875b42c4a819f27fe32ed9728d3cc4d756b": {
"Name": "GRK-BRK-REST",
"EndpointID": "d0149fb42645e63c0d8e9c8ad1c605f9ddcb3afa4c41e52c10a554cd31452727",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
and below is the inspect output from the host where the container is not working:
linux-x86-64]# docker inspect bridge
[
{
"Name": "bridge",
"Id": "c245b3b5c4cedca3b9fa5370b464e0e9c2aef0dc2c520daeedf3e726e8b153e4",
"Created": "2018-12-18T11:14:10.806753755-05:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Containers": {
"e5bfa7fefa002b50f7e763ea30e2e602b4b577b1b558000725453773a4f10903": {
"Name": "GRK-BRK-REST",
"EndpointID": "64ff097ad0c72e107845c00aac2708ced6c9e896f37c317a247be7d3f482fcc0",
"MacAddress": "02:42:ac:11:00:02",
"IPv4Address": "172.17.0.2/16",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
The only difference I can find is that a Gateway is added to the IPAM segment on the host where the container is not working.
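If the goal is to make the two hosts' bridge configuration match, one way (a sketch; verify against your environment before applying) is to pin the default bridge address in /etc/docker/daemon.json, which also fixes the gateway Docker reports in IPAM:

{
  "bip": "172.17.0.1/16"
}

followed by systemctl restart docker. That said, a "Connection reset by peer" on only one host is also worth checking against host-level firewall or proxy rules.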

Docker Swarm does not create container

I'm trying to create 3 zookeeper services in my Docker swarm, but only 2 of the 3 containers were created:
docker ps -a returns:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2c883f9148ff hyperledger/fabric-zookeeper:latest "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 2181/tcp, 2888/tcp, 3888/tcp fabric_zookeeper1.1.td4wpq2t9uj5yjnw0q76gsqi0
068ef5d9075b hyperledger/fabric-zookeeper:latest "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 2181/tcp, 2888/tcp, 3888/tcp fabric_zookeeper2.1.u3zr2o8lifcncjo6g2u2yqhwu
docker network ls returns:
NETWORK ID NAME DRIVER SCOPE
0e17f2cd7e8d bridge bridge local
4f78c376719f docker_gwbridge bridge local
djds6rgg0pqc fabric overlay swarm
o1es27fz05i1 fabric_net overlay swarm
2f99d3b30b86 host host local
ls05jfjuekg0 ingress overlay swarm
e7d8a3ff8bb2 net_blockcord bridge local
42ec3d9a4f1b none null local
docker network inspect fabric_net returns:
[
{
"Name": "fabric_net",
"Id": "o1es27fz05i1g9cjrq5nvv0ok",
"Created": "2018-10-26T07:41:49.436040523Z",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "10.0.6.0/24",
"Gateway": "10.0.6.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"068ef5d9075bc9c61b313b97cfbb36189401bc4eb72258b4346f659add5b3a0a": {
"Name": "fabric_zookeeper2.1.u3zr2o8lifcncjo6g2u2yqhwu",
"EndpointID": "3274a8bc693c742a0acedd786174a1c7ed4c2843cd28a6ff9140a2e977059657",
"MacAddress": "02:42:0a:00:06:11",
"IPv4Address": "10.0.6.17/24",
"IPv6Address": ""
},
"2c883f9148ff3b53228e8d02a8bd60db754cd2677155307e5db31f426e356223": {
"Name": "fabric_zookeeper1.1.td4wpq2t9uj5yjnw0q76gsqi0",
"EndpointID": "f58c3c303a6f2fe22ba410e0881f67ce002cbfc5e0afe9cd1104f7f11e2c6ecf",
"MacAddress": "02:42:0a:00:06:15",
"IPv4Address": "10.0.6.21/24",
"IPv6Address": ""
},
"lb-fabric_net": {
"Name": "fabric_net-endpoint",
"EndpointID": "d70a81ad2631c3b76feac7484599e0715c9b901d2ed72153a38105b236b4c882",
"MacAddress": "02:42:0a:00:06:02",
"IPv4Address": "10.0.6.2/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "4103"
},
"Labels": {
"com.docker.stack.namespace": "fabric"
},
"Peers": [
{
"Name": "a2beaca62ca3",
"IP": "10.0.0.5"
},
{
"Name": "fa12393e1d65",
"IP": "137.116.149.79"
}
]
}
]
The container list shows only 2 of my 3 zookeepers.
I first created an overlay network:
docker network create --attachable --driver overlay fabric
and then deployed the compose file below with:
docker stack deploy -c docker-compose-zookeeper.yaml fabric
docker-compose-zookeeper.yaml
# Copyright IBM Corp. All Rights Reserved.
#
# SPDX-License-Identifier: Apache-2.0
#
version: '3'
networks:
  net:
services:
  zookeeper0:
    hostname: zookeeper0.example.com
    image: hyperledger/fabric-zookeeper
    ports:
      - 2181
      - 2888
      - 3888
    environment:
      - ZOO_MY_ID=1
      - ZOO_SERVERS=server.1=0.0.0.0:2888:3888 server.2=zookeeper1:2888:3888 server.3=zookeeper2:2888:3888
    networks:
      - net
  zookeeper1:
    hostname: zookeeper1.example.com
    image: hyperledger/fabric-zookeeper
    ports:
      - 2181
      - 2888
      - 3888
    environment:
      - ZOO_MY_ID=2
      - ZOO_SERVERS=server.1=zookeeper0:2888:3888 server.2=0.0.0.0:2888:3888 server.3=zookeeper2:2888:3888
    networks:
      - net
  zookeeper2:
    hostname: zookeeper2.example.com
    image: hyperledger/fabric-zookeeper
    ports:
      - 2181
      - 2888
      - 3888
    environment:
      - ZOO_MY_ID=3
      - ZOO_SERVERS=server.1=zookeeper0:2888:3888 server.2=zookeeper1:2888:3888 server.3=0.0.0.0:2888:3888
    networks:
      - net
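Note that because the compose file declares its own net network, docker stack deploy creates it as fabric_net; the manually created attachable fabric network is not the one the stack's tasks join. One way to confirm which containers are on the stack network (a sketch using Go template syntax):

docker network inspect fabric_net --format '{{range .Containers}}{{.Name}} {{end}}'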
docker info:
Containers: 2
Running: 2
Paused: 0
Stopped: 0
Images: 15
Server Version: 18.06.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
NodeID: x8mooygnt8mzruof5c5d3p0vp
Is Manager: true
ClusterID: vmqqjuwztz3sraag3e8dgpqbl
Managers: 2
Nodes: 2
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 10.0.0.5
Manager Addresses:
137.116.149.79:2377
168.63.239.163:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-1023-azure
Operating System: Ubuntu 16.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.853GiB
Name: blockcord-staging2
ID: UT5F:4ZFW:4PRT:LGFS:JIV4:3YAD:DK5I:BIYL:FU6P:ZFEB:3OD3:U5EX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
It turned out that the container was created on my other node, but it wasn't able to resolve the address of the service.
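To see where each task landed and why one is failing, the service-level views are usually the quickest route (a diagnostic sketch; fabric_zookeeper0 is the stack-qualified name of the missing service):

docker service ls
docker service ps fabric_zookeeper0 --no-trunc   # shows each task's node and last error
docker service logs fabric_zookeeper0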

docker-compose not setting gateway and IP address

I have a problem where docker-compose containers aren't able to reach the internet. Containers created manually via the docker CLI or by the kubelet work just fine.
This is on an AWS EC2 node created using Kops with a Calico overlay (though I think that may be unrelated).
Here's the docker-compose:
version: '2.1'
services:
  app:
    container_name: app
    image: "debian:jessie"
    command: ["sleep", "99999999"]
  app2:
    container_name: app2
    image: "debian:jessie"
    command: ["sleep", "99999999"]
This fails:
# docker exec -it app ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
docker-compose container<->container works (as expected):
# docker exec -it app ping app2
PING app2 (172.19.0.2): 56 data bytes
64 bytes from 172.19.0.2: icmp_seq=0 ttl=64 time=0.098 ms
Manually created container works fine:
# docker run -it -d --name app3 debian:jessie sh -c "sleep 99999999"
# docker exec -it app3 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=37 time=9.972 ms
So it seems like docker-compose containers can't reach the internet.
Here's the NetworkSettings from app3, which works:
"NetworkSettings": {
"Bridge": "",
"SandboxID": "54168ea912b9caa842b208f36dac80a588ebdc63501a700379fb1b732a41d3ac",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/54168ea912b9",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "cdddee0f3e25e7861a98ba6aff33652619a3970c061d0ed2a5dc5bd2b075b30d",
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:02",
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "46e8bc586d48c9a57e2886f7f35f7c2c8396f8084650fcc2bf1e74788df09e3f",
"EndpointID": "cdddee0f3e25e7861a98ba6aff33652619a3970c061d0ed2a5dc5bd2b075b30d",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
}
From one of the docker-compose containers (fails):
"NetworkSettings": {
"Bridge": "",
"SandboxID": "6b79a6b45f099c65f89adf59eb50eadff2362942f316b05cf20ae1959ca9b88b",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/6b79a6b45f09",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"root_default": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"app2",
"4f48647ba5bb"
],
"NetworkID": "ffb540b2b9e2945908477a755a43d3505aea6ed94ef5fd944909a91fb104ce8e",
"EndpointID": "48aff2f00bb4bd670b5178b459a353ac45f7d3efbfb013c1026064022e7c4e59",
"Gateway": "172.19.0.1",
"IPAddress": "172.19.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:13:00:02"
}
}
}
So it seems the major difference is that the docker-compose containers have no top-level IPAddress or Gateway in their NetworkSettings.
Some background info:
# docker version
Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:17:57 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:17:57 2017
OS/Arch: linux/amd64
# docker-compose version
docker-compose version 1.15.0, build e12f3b9
docker-py version: 2.4.2
CPython version: 2.7.13
OpenSSL version: OpenSSL 1.0.1t 3 May 2016
# ip route
default via 10.20.128.1 dev eth0
10.20.128.0/20 dev eth0 proto kernel scope link src 10.20.140.184
100.104.10.64/26 via 10.20.136.0 dev eth0 proto bird
100.109.150.192/26 via 10.20.152.115 dev tunl0 proto bird onlink
100.111.225.192 dev calic6f21d462fc scope link
blackhole 100.111.225.192/26 proto bird
100.111.225.193 dev calief8dddb6a0d scope link
100.111.225.195 dev cali8ca1dd867c3 scope link
100.111.225.196 dev cali34426885f86 scope link
100.111.225.197 dev cali6cae60de42a scope link
100.111.225.231 dev calibd569acd2f3 scope link
100.115.17.64/26 via 10.20.148.89 dev tunl0 proto bird onlink
100.115.237.64/26 via 10.20.167.9 dev tunl0 proto bird onlink
100.117.246.128/26 via 10.20.150.249 dev tunl0 proto bird onlink
100.118.80.0/26 via 10.20.162.215 dev tunl0 proto bird onlink
100.119.204.0/26 via 10.20.135.183 dev eth0 proto bird
100.123.178.128/26 via 10.20.170.43 dev tunl0 proto bird onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev br-bd6445b00ccf proto kernel scope link src 172.18.0.1
172.19.0.0/16 dev br-ffb540b2b9e2 proto kernel scope link src 172.19.0.1
The iptables rules are a bit long, so I'm not posting them for now (if they were the problem, I'd expect them to interfere with the non-docker-compose containers as well, so I think iptables are unrelated).
Anyone know what's going on?
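One place worth looking anyway: each compose project gets its own bridge (br-ffb540b2b9e2 in the route table above), and outbound traffic only works if a MASQUERADE rule exists for its subnet and forwarding is enabled (a diagnostic sketch):

iptables -t nat -L POSTROUTING -n -v | grep 172.19   # expect a MASQUERADE rule for 172.19.0.0/16
sysctl net.ipv4.ip_forward                           # should print 1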

Can't resolve hostnames between docker containers

I have two containers created in separate compose files (done for application isolation -- each application may have multiple containers defined in its compose file, such as a backing database).
These containers are linked via an external network named "common".
An example compose file would be:
version: '2'
services:
  rabbitmq:
    image: "rabbitmq:3-management"
    hostname: "rabbitmq"
    container_name: "rabbitmq"
    environment:
      RABBITMQ_ERLANG_COOKIE: "SWQOKODSQALRPCLNMEQG"
      RABBITMQ_DEFAULT_USER: "rabbitmq"
      RABBITMQ_DEFAULT_PASS: "rabbitmq"
      RABBITMQ_DEFAULT_VHOST: "/"
    ports:
      - "15672:15672"
      - "5672:5672"
networks:
  default:
    external:
      name: common
Docker versions:
root@server:~/# docker version
Client:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built: Mon, 10 Oct 2016 21:38:17 +1300
OS/Arch: linux/amd64
Server:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built: Mon, 10 Oct 2016 21:38:17 +1300
OS/Arch: linux/amd64
root@server:~/# docker-compose version
docker-compose version 1.8.1, build 878cff1
docker-py version: 1.10.3
CPython version: 2.7.9
OpenSSL version: OpenSSL 1.0.1e 11 Feb 2013
The common network was created using:
docker network create common
Then I bring containers up using:
docker-compose up -d
Inspecting the network I get:
root@server:~# docker network inspect b5e8f81a8ea0
[
{
"Name": "common",
"Id": "b5e8f81a8ea063149298d2023be5740c8d971e0329a741abdafbac59fd882684",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "192.168.0.0/16"
}
]
},
"Internal": false,
"Containers": {
"6137864e71161b417f0659b88b3e17538fb60277ca818e58b255d5cc17932c3c": {
"Name": "db",
"EndpointID": "5d9800dedcc22bdb8e08a1d04712df4e2f2846f5448e4ce888f164f7877ce5a4",
"MacAddress": "02:42:c0:a8:00:03",
"IPv4Address": "192.168.0.3/16",
"IPv6Address": ""
},
"61b1288bf250196d02f3933c8bbde732fbcd24bdf3949c4f53b3b05aa87f3c7f": {
"Name": "rabbitmq",
"EndpointID": "484e3cc05e5a852150e6a7c429dc73d9ce4b097d4f53b07800c9998ef977565c",
"MacAddress": "02:42:c0:a8:00:02",
"IPv4Address": "192.168.0.2/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
Other containers have crashed because they can't resolve the 'rabbitmq' hostname. Inspecting the crashed containers shows they are on the same network:
root@server:~# docker inspect bae5214bf619
"NetworkSettings": {
"Bridge": "",
"SandboxID": "54a8b0833204522a75cf2e4195c5838b336ab82438aa53d9f1f38276eb9f6061",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": null,
"SandboxKey": "/var/run/docker/netns/54a8b0833204",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"common": {
"IPAMConfig": null,
"Links": [
"db"
],
"Aliases": [
"rulesengine",
"bae5214bf619"
],
"NetworkID": "b5e8f81a8ea063149298d2023be5740c8d971e0329a741abdafbac59fd882684",
"EndpointID": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": ""
}
}
}
If I log in to a container that is not trying to access the rabbitmq container and hasn't crashed, the hostname does not resolve, but the IP address of the rabbitmq container is reachable.
root@server:~/David.Deployments# docker exec -it 6137864e7116 bash
root@db:/# ping rabbitmq
ping: unknown host
root@db:/# ping 192.168.0.2
PING 192.168.0.2 (192.168.0.2): 56 data bytes
64 bytes from 192.168.0.2: icmp_seq=0 ttl=64 time=0.149 ms
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=0.043 ms
When signed into the rabbitmq container:
root@server:/# docker exec -it rabbitmq bash
root@rabbitmq:/# uname -n
rabbitmq
root@rabbitmq:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
445: eth0@if446: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:c0:a8:00:02 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/16 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:c0ff:fea8:2/64 scope link
valid_lft forever preferred_lft forever
root@rabbitmq:/# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.0.2 rabbitmq
Are there any suggestions for how I can resolve this issue or things I should be checking?
Update 1
I have tried adding external_links, but the problem persists and no entry is added to /etc/hosts for the external hostname 'rabbitmq'.
Per the discussion here: https://github.com/docker/docker/issues/13381
poga's suggestion to restart docker worked for me: systemctl restart docker
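For reference, name resolution on a user-defined network goes through Docker's embedded DNS server at 127.0.0.11 rather than /etc/hosts, so a quick sanity check from the failing container looks like this (a sketch; getent ships in Debian-based images):

docker exec -it db cat /etc/resolv.conf    # nameserver should be 127.0.0.11
docker exec -it db getent hosts rabbitmq   # resolves via the embedded DNS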
In my case the problem was that the container was crashing the first time it was accessed after starting, because a command in its entrypoint script was failing.
