Docker swarm service can be accessed only on the node where the container is running - docker

I'm currently running Docker swarm on 3 nodes. First I created a network with
docker network create -d overlay xx_net
After that, a service with
docker service create --network xxx_net --replicas 1 -p 12345:12345 --name nameofservice nameofimage:1
If I read correctly, this uses the routing mesh (which is fine for me). But I can only access the service on the IP of the node where the container is running, even though it should be available on every node's IP.
If I drain that node, the container starts up on a different node and is then available only on the new node's IP.
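To help rule out a service misconfiguration, here is a quick check (a sketch, using the service name from the commands above) that the published port really uses ingress mode and which node the single replica runs on:
# show the published port and its mode; PublishMode should be "ingress" for the routing mesh
docker service inspect --format '{{json .Endpoint.Ports}}' nameofservice
# show which node the single replica currently runs on
docker service ps nameofservice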
**More information added below:
I rebooted all servers: 3 workers, one of which is the manager.
After boot, everything seems to work OK!
I'm using the rabbitmq image from Docker Hub. The Dockerfile is quite small: FROM rabbitmq:3-management. The container has been started on worker 2.
I can connect to rabbitmq's management page from all workers: worker1-ip:15672, worker2-ip:15672, worker3-ip:15672, so I think all the needed ports are open.
After about 1 hour, the rabbitmq container was moved from worker 2 to worker 3 - I do not know the reason.
After that I can no longer connect via worker1-ip:15672 or worker2-ip:15672, but via worker3-ip:15672 everything still works!
I drained worker3 with docker node update --availability drain worker3
The container started on worker1.
After that I can only connect via worker1-ip:15672, no longer via worker2 or worker3.
One more test:
The docker service was restarted on all workers, and everything works again?!
- let's wait a few hours...
Today's status:
2 of 3 nodes are working OK. From the manager's service log:
Jul 12 07:53:32 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:32.787953754Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:53:39 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:39.787783458Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:27 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:27.790564790Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:41 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:41.787974530Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027525926Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027668473Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:13:22 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:13:22.787796692Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:21:37 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:21:37.788694522Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525570127Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525713893Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
And from the problematic worker's docker log:
Jul 12 08:20:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:20:47.486202716Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:21:38 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:38.288117026Z" level=warning msg="memberlist: Refuting a dead message (from: h999-99-999-185.scenegroup.fi-891b24339f8a)"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404554761Z" level=warning msg="Neighbor entry already present for IP 10.255.0.3, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404588738Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404609273Z" level=warning msg="Neighbor entry already present for IP 10.255.0.6, mac 02:42:0a:ff:00:06"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404622776Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:06"
Jul 12 08:21:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:47.486007317Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:22:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:22:47.485821037Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:23:17 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:23:17.630602898Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
And this from the working worker:
Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.219973777Z" level=warning msg="Neighbor entry already present for IP 10.0.0.3, mac xxxxx"
Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.220539013Z" level=warning msg="Neighbor entry already present for IP "managers ip here", mac xxxxxx"
I restarted docker on the problematic worker and it started to work again.
I'll be following...
** Today's results:
2 of the workers are available, one is not.
I didn't do a thing.
After 4 hours of the swarm being left alone, everything seems to work again?!
Services have been moved from one worker to another for no apparent reason; all the symptoms point to a communication problem.
Quite confusing.

Upgrade to Docker 17.06.
Ingress overlay networking was broken for a long time, until about 17.06-rc3.
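It may also be worth confirming the engine version on every node and that the swarm/overlay ports are open between all nodes (a sketch, assuming ufw; adjust for your firewall):
# print the engine version on each node
docker version --format '{{.Server.Version}}'
# ports required by swarm's routing mesh and overlay networking
sudo ufw allow 2377/tcp    # cluster management traffic
sudo ufw allow 7946/tcp    # node gossip / network discovery
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp    # overlay (VXLAN) data plane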

Related

Docker Swarm: Service keep on getting Ready & Shutdown

I have a couple of docker swarm nodes. I tried to create the service on the leader with the command below. The service creation process is still going on; it has been more than 40 minutes now.
docker service create \
--mode global \
--mount type=bind,src=/project/m32/,dst=/root/m32/ \
--publish mode=host,target=310,published=310 \
--publish mode=host,target=311,published=311 \
--publish mode=host,target=312,published=312 \
--publish mode=host,target=313,published=313 \
--constraint "node.labels.m32 == true" \
--name m32 \
local-registry/ubuntu:07
overall progress: 1 out of 2 tasks
ew0edluvz39p: ready [======================================> ]
kzc7jf7irsrh: running [==================================================>]
In the service's task list, it keeps showing tasks as Ready and Shutdown:
$ docker service ps m32
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
s4q0rqrqbpdn m32.ew0edluvz39pazold0wnv2ean local-registry/ubuntu:07 sl-089 Ready Ready 1 second ago
r6vibgptm5oc \_ m32.ew0edluvz39pazold0wnv2ean local-registry/ubuntu:07 sl-089 Shutdown Complete 1 second ago
joq2p6c9jpnx \_ m32.ew0edluvz39pazold0wnv2ean local-registry/ubuntu:07 sl-089 Shutdown Complete 7 seconds ago
a5h8gac02vfx \_ m32.ew0edluvz39pazold0wnv2ean local-registry/ubuntu:07 sl-089 Shutdown Complete 13 seconds ago
f51stfsdlhvp \_ m32.ew0edluvz39pazold0wnv2ean local-registry/ubuntu:07 sl-089 Shutdown Complete 19 seconds ago
zqcbxkm4fwhr m32.kzc7jf7irsrhnx3kurcwqjb2j local-registry/ubuntu:07 sl-090 Ready Ready less than a second ago
za8efvi9x4yw \_ m32.kzc7jf7irsrhnx3kurcwqjb2j local-registry/ubuntu:07 sl-090 Shutdown Complete less than a second ago
$ sudo systemctl status docker.service
Nov 24 19:58:48 svr2 dockerd[2797]: time="2021-11-24T19:58:48.200421563+05:30" level=info msg="ignoring event" container=ea8b76fedb18159ba0cd8f279a9ca4264399c>
Nov 24 20:01:39 svr2 dockerd[2797]: time="2021-11-24T20:01:39.602028420+05:30" level=info msg="NetworkDB stats svr2(00bbf0799aa6) - netID:ubuzyty9mq4tb7xyb>
Nov 24 20:06:39 svr2 dockerd[2797]: time="2021-11-24T20:06:39.802013427+05:30" level=info msg="NetworkDB stats svr2(00bbf0799aa6) - netID:ubuzyty9mq4tb7xyb>
Nov 24 20:11:40 svr2 dockerd[2797]: time="2021-11-24T20:11:40.001992437+05:30" level=info msg="NetworkDB stats svr2(00bbf0799aa6) - netID:ubuzyty9mq4tb7xyb>
Nov 24 20:14:17 svr2 dockerd[2797]: time="2021-11-24T20:14:17.871605342+05:30" level=error msg="Error getting service xkauq9a599iv: service xkauq9a599iv not f>
Nov 24 20:14:52 svr2 dockerd[2797]: time="2021-11-24T20:14:52.833890158+05:30" level=error msg="Error getting service xkauq9a599iv: service xkauq9a599iv not f>
Nov 24 20:15:12 svr2 dockerd[2797]: time="2021-11-24T20:15:12.395692837+05:30" level=error msg="Error getting service pwaa8cvdd683: service pwaa8cvdd683 not f>
Nov 24 20:15:17 svr2 dockerd[2797]: time="2021-11-24T20:15:17.773200054+05:30" level=error msg="Error getting service xk0v0g2roypx: service xk0v0g2roypx not f>
Nov 24 20:16:18 svr2 dockerd[2797]: time="2021-11-24T20:16:18.529344060+05:30" level=error msg="Error getting service xk0v0g2roypx: service xk0v0g2roypx not f>
Nov 24 20:16:40 svr2 dockerd[2797]: time="2021-11-24T20:16:40.201888504+05:30" level=info msg="NetworkDB stats svr2(00bbf0799aa6) - netID:ubuzyty9mq4tb7xyb>
It looks like the process keeps creating containers in a loop. What am I doing wrong? Any help to fix this problem will be highly appreciated. Thanks.
You really need to pass --restart-max-attempts 5 to your docker service create to ensure that services don't restart too many times in a loop. It's bad for the stability of docker, and hard to debug. Rather have a task just give up and stop so you can see something is wrong and diagnose it.
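For example (a sketch, reusing the service name m32 from the question):
# limit restart attempts on an existing service
docker service update --restart-max-attempts 5 m32
# or at creation time: docker service create --restart-max-attempts 5 ...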
To see specifically what is wrong you would want to look at the logs of each task. You use the individual task id's to see why each one failed:
# The logs for a task
docker service logs s4q0rqrqbpdn
# A general breakdown of a task
docker inspect s4q0rqrqbpdn
Sometimes you need to track down the actual container for the task and inspect that. docker container is not swarm aware, so
# list the service showing the full task id.
docker service ps <service> --no-trunc
# then docker context use <node> / ssh <node> to switch to a node of interest.
# Then, the container name is the "ID"."NAME" from the PS list. For example:
docker context use sl-089
docker container inspect m32.ew0edluvz39pazold0wnv2ean.s4q0rqrqbpdnABCDEFGABCDEFG
Inspecting the container can show if it was killed because of an OOM or certain other reasons that don't otherwise show up.
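For example, a sketch using the placeholder container name from above:
# check for an OOM kill or an unusual exit code on the node that ran the task
docker container inspect \
  --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}} Error={{.State.Error}}' \
  m32.ew0edluvz39pazold0wnv2ean.s4q0rqrqbpdnABCDEFGABCDEFG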

Docker change IP from Bridge

I have tried all of these solutions to change the IP address of my bridge:
Change default docker0 bridge ip address
But I always got this error:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2021-07-10 20:02:29 CEST; 9min ago
Docs: https://docs.docker.com
Process: 3130 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --bip 192.168.198.128/25 (code=exited, status=1/FAILURE)
Main PID: 3130 (code=exited, status=1/FAILURE)
Jul 10 20:02:26 Server systemd[1]: Failed to start Docker Application Container Engine.
Jul 10 20:02:29 Server systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jul 10 20:02:29 Server systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Jul 10 20:02:29 Server systemd[1]: Stopped Docker Application Container Engine.
Jul 10 20:02:29 Server systemd[1]: docker.service: Start request repeated too quickly.
Jul 10 20:02:29 Server systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 10 20:02:29 Server systemd[1]: Failed to start Docker Application Container Engine.
Jul 10 20:03:09 Server systemd[1]: docker.service: Start request repeated too quickly.
Jul 10 20:03:09 Server systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 10 20:03:09 Server systemd[1]: Failed to start Docker Application Container Engine.
When I try to change it via
dockerd --bip=192.168.198.128/25
Then I received this:
INFO[2021-07-10T20:30:14.220551981+02:00] Starting up
failed to start daemon: pid file found, ensure docker is not running or delete /var/run/docker.pid
root#Server:/etc/docker# sudo service docker stop
root#Server:/etc/docker# dockerd --bip=192.168.198.128/25
INFO[2021-07-10T20:30:27.749647728+02:00] Starting up
INFO[2021-07-10T20:30:27.751145974+02:00] detected 127.0.0.53 nameserver, assuming systemd-resolved, so using resolv.conf: /run/systemd/resolve/resolv.conf
INFO[2021-07-10T20:30:27.752521878+02:00] parsed scheme: "unix" module=grpc
INFO[2021-07-10T20:30:27.752770076+02:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2021-07-10T20:30:27.752906825+02:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>} module=grpc
INFO[2021-07-10T20:30:27.753028929+02:00] ClientConn switching balancer to "pick_first" module=grpc
INFO[2021-07-10T20:30:27.756109499+02:00] parsed scheme: "unix" module=grpc
INFO[2021-07-10T20:30:27.756411626+02:00] scheme "unix" not registered, fallback to default scheme module=grpc
INFO[2021-07-10T20:30:27.756544438+02:00] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock 0 <nil>}] <nil>} module=grpc
INFO[2021-07-10T20:30:27.756653280+02:00] ClientConn switching balancer to "pick_first" module=grpc
INFO[2021-07-10T20:30:27.761446487+02:00] [graphdriver] using prior storage driver: overlay2
WARN[2021-07-10T20:30:27.807419442+02:00] Your kernel does not support swap memory limit
WARN[2021-07-10T20:30:27.807711781+02:00] Your kernel does not support cgroup rt period
WARN[2021-07-10T20:30:27.807832417+02:00] Your kernel does not support cgroup rt runtime
INFO[2021-07-10T20:30:27.808384049+02:00] Loading containers: start.
ERRO[2021-07-10T20:30:28.211882902+02:00] failed to get event error="rpc error: code = Unavailable desc = transport is closing" module=libcontainerd namespace=moby
failed to start daemon: Error initializing network controller: Error creating default "bridge" network: failed to allocate gateway (192.168.198.128): Address already in use
But I don't have it in use, not in my network and not as an IP in the config of any network connection.
Does anybody have an idea how to change it?
Thanks in advance.
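One thing to note: 192.168.198.128 is the network address of that /25, which may be why gateway allocation fails; a host address such as .129 should work. A minimal sketch of setting it via /etc/docker/daemon.json instead of the unit file (an assumption, not something the original post tried):
# write the bridge IP into the daemon config (remove --bip from ExecStart first to avoid a conflict)
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "bip": "192.168.198.129/25"
}
EOF
sudo systemctl restart docker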

Docker container is automatically paused

Hi all, I have found a strange issue in my docker swarm standalone-mode infrastructure: on one server, and only one, a service is automatically paused 5 seconds after starting. docker unpause does not help either.
020450186dc1 myservice:latest "/bin/sh -c '/app/os…" 13 minutes ago Up 13 minutes (Paused) server-15.myservice_2
After some time this service is paused again. I have no idea what could be pausing it or where I can see some logging info about it. I know that a service can be killed, but I have never heard of one being paused. The other instances of the same service are running stably.
UPD:
I don't see any special information on this issue, even in docker debug mode. I just see that something pauses my container:
Apr 29 11:50:01 dockerd[1069]: time="2019-04-29T11:50:01.539113504+02:00" level=debug msg="Calling GET /_ping"
Apr 29 11:50:01 dockerd[1069]: time="2019-04-29T11:50:01.539687070+02:00" level=debug msg="Calling POST /v1.38/containers/debee0c7acb9aa0db2021719ee6ee3d084b88e47d9f2250b1af920f5271bf353/pause"
Apr 29 11:50:01 dockerd[1069]: time="2019-04-29T11:50:01+02:00" level=debug msg="event published" ns=moby topic="/tasks/paused" type=containerd.events.TaskPaused
Apr 29 11:50:01 dockerd[1069]: time="2019-04-29T11:50:01.570125095+02:00" level=debug msg=event module=libcontainerd namespace=moby topic=/tasks/paused
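To catch the culprit in the act, a couple of things that might help (a sketch): stream the daemon's pause events live, and check scheduled jobs, since the pause above lands exactly on a minute boundary (11:50:01):
# stream pause/unpause events as they happen
docker events --filter 'event=pause' --filter 'event=unpause'
# look for anything docker-related in cron on that host
sudo grep -r docker /etc/cron* /var/spool/cron 2>/dev/null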

Docker overrides the IP address of my own manually created bridge

I am trying to set docker up to connect all containers to my own manually created bridge (br0). I don't want docker to create or edit anything in my bridge, because I have other services which use and depend on it (like OpenVPN); therefore I prefer to create the bridge using my own bash script.
The problem comes when I start the docker service: docker changes my bridge IP address from what I want (192.168.1.10) to another address (169.254.x.x)!!!
My Docker version is 1.12.1, build 23cf638.
The steps I took:
Bridge creation:
sudo brctl addbr br0
sudo brctl addif br0 eth0
sudo ip addr del 192.168.1.10/24 dev eth0
sudo ip addr add 192.168.1.10/24 dev br0
sudo ip route add default via 192.168.1.1 dev br0
I also deleted the default docker0 bridge.
Tell docker to use my br0 instead of the default docker0:
Passing the -b br0 parameter to the dockerd.service start script to tell docker that I want it to use my br0:
sudo vi /etc/systemd/system/docker.service.d/overlay.conf
I edited ExecStart to be like this:
ExecStart=/usr/bin/dockerd --storage-driver=overlay -H fd:// -b=br0
and then:
sudo systemctl daemon-reload
sudo systemctl restart docker
And now when I check my br0 IP, it is NOT 192.168.1.10 any more, it is back to 172.17.x.x, and when I try to change it manually back to 192.168.1.10, the interfaces in the containers keep using 169.254.x.x instead of the IP I want.
P.S. When I check where the interfaces of my containers are (brctl show), they are really in my br0 (which means docker accepted the -b br0 parameter, but it just ignores or overrides my intended IP address).
Could someone please help me to overcome this problem? It looks to me like it might be a bug. I just want docker to use my br0 with the intended IP address 192.168.1.10.
What I need is for all my containers to get an IP address in the range I want.
Thanks in advance.
Edited:
My /var/log/daemon.log
Oct 10 20:41:12 raspberrypi systemd[1]: Stopping Docker Application Container Engine...
Oct 10 20:41:12 raspberrypi dockerd[976]: time="2016-10-10T20:41:12.067551389Z" level=info msg="Processing signal 'terminated'"
Oct 10 20:41:12 raspberrypi dockerd[976]: time="2016-10-10T20:41:12.128388194Z" level=info msg="stopping containerd after receiving terminated"
Oct 10 20:41:13 raspberrypi systemd[1]: Stopped Docker Application Container Engine.
Oct 10 20:41:13 raspberrypi systemd[1]: Stopping Docker Socket for the API.
Oct 10 20:41:13 raspberrypi systemd[1]: Closed Docker Socket for the API.
Oct 10 20:41:13 raspberrypi systemd[1]: Stopped Docker Application Container Engine.
Oct 10 20:41:50 raspberrypi avahi-daemon[440]: Withdrawing address record for 169.254.124.135 on br0.
Oct 10 20:41:50 raspberrypi dhcpcd[698]: br0: removing IP address 169.254.124.135/16
Oct 10 20:41:50 raspberrypi avahi-daemon[440]: Leaving mDNS multicast group on interface br0.IPv4 with address 169.254.124.135.
Oct 10 20:41:50 raspberrypi avahi-daemon[440]: Interface br0.IPv4 no longer relevant for mDNS.
Oct 10 20:41:50 raspberrypi dhcpcd[698]: br0: deleting route to 169.254.0.0/16
Oct 10 20:41:52 raspberrypi ntpd[723]: Deleting interface #7 br0, 169.254.124.135#123, interface stats: received=0, sent=0, dropped=0, active_time=516 secs
Oct 10 20:41:52 raspberrypi ntpd[723]: peers refreshed
Oct 10 20:42:58 raspberrypi avahi-daemon[440]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.1.19.
Oct 10 20:42:58 raspberrypi avahi-daemon[440]: New relevant interface br0.IPv4 for mDNS.
Oct 10 20:42:58 raspberrypi avahi-daemon[440]: Registering new address record for 192.168.1.19 on br0.IPv4.
Oct 10 20:43:00 raspberrypi ntpd[723]: Listen normally on 8 br0 192.168.1.19 UDP 123
Oct 10 20:43:00 raspberrypi ntpd[723]: peers refreshed
Oct 10 20:43:15 raspberrypi systemd[1]: getty#tty1.service has no holdoff time, scheduling restart.
Oct 10 20:43:15 raspberrypi systemd[1]: Stopping Getty on tty1...
Oct 10 20:43:15 raspberrypi systemd[1]: Starting Getty on tty1...
Oct 10 20:43:15 raspberrypi systemd[1]: Started Getty on tty1.
Oct 10 20:43:21 raspberrypi systemd[1]: getty#tty1.service has no holdoff time, scheduling restart.
Oct 10 20:43:21 raspberrypi systemd[1]: Stopping Getty on tty1...
Oct 10 20:43:21 raspberrypi systemd[1]: Starting Getty on tty1...
Oct 10 20:43:21 raspberrypi systemd[1]: Started Getty on tty1.
Oct 10 20:44:31 raspberrypi systemd[1]: Starting Docker Socket for the API.
Oct 10 20:44:31 raspberrypi systemd[1]: Listening on Docker Socket for the API.
Oct 10 20:44:31 raspberrypi systemd[1]: Starting Docker Application Container Engine...
Oct 10 20:44:31 raspberrypi dockerd[1536]: time="2016-10-10T20:44:31.887581128Z" level=info msg="libcontainerd: new containerd process, pid: 1543"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.903109872Z" level=info msg="[graphdriver] using prior storage driver \"overlay\""
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.950908429Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.951611338Z" level=warning msg="Your kernel does not support swap memory limit."
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.951800086Z" level=warning msg="Your kernel does not support kernel memory limit."
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.951906179Z" level=warning msg="Your kernel does not support cgroup cfs period"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.951993522Z" level=warning msg="Your kernel does not support cgroup cfs quotas"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.952173520Z" level=warning msg="Unable to find cpuset cgroup in mounts"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.952372059Z" level=warning msg="mountpoint for pids not found"
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.953406319Z" level=info msg="Loading containers: start."
Oct 10 20:44:32 raspberrypi dockerd[1536]: time="2016-10-10T20:44:32.970612440Z" level=info msg="Firewalld running: false"
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Withdrawing address record for 192.168.1.19 on br0.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Leaving mDNS multicast group on interface br0.IPv4 with address 192.168.1.19.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Interface br0.IPv4 no longer relevant for mDNS.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Joining mDNS multicast group on interface br0.IPv4 with address 169.254.124.135.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: New relevant interface br0.IPv4 for mDNS.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Registering new address record for 169.254.124.135 on br0.IPv4.
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715576231Z" level=info msg="Loading containers: done."
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715837582Z" level=info msg="Daemon has completed initialization"
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715921435Z" level=info msg="Docker daemon" commit=23cf638 graphdriver=overlay version=1.12.1
Oct 10 20:44:33 raspberrypi systemd[1]: Started Docker Application Container Engine.
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.754984356Z" level=info msg="API listen on /var/run/docker.sock"
Oct 10 20:44:34 raspberrypi ntpd[723]: Listen normally on 9 br0 169.254.124.135 UDP 123
Oct 10 20:44:34 raspberrypi ntpd[723]: Deleting interface #8 br0, 192.168.1.19#123, interface stats: received=0, sent=0, dropped=0, active_time=94 secs
Oct 10 20:44:34 raspberrypi ntpd[723]: peers refreshed
The interesting part is the last part (I recopied it here below):
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Withdrawing address record for 192.168.1.19 on br0.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Leaving mDNS multicast group on interface br0.IPv4 with address 192.168.1.19.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Interface br0.IPv4 no longer relevant for mDNS.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Joining mDNS multicast group on interface br0.IPv4 with address 169.254.124.135.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: New relevant interface br0.IPv4 for mDNS.
Oct 10 20:44:33 raspberrypi avahi-daemon[440]: Registering new address record for 169.254.124.135 on br0.IPv4.
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715576231Z" level=info msg="Loading containers: done."
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715837582Z" level=info msg="Daemon has completed initialization"
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.715921435Z" level=info msg="Docker daemon" commit=23cf638 graphdriver=overlay version=1.12.1
Oct 10 20:44:33 raspberrypi systemd[1]: Started Docker Application Container Engine.
Oct 10 20:44:33 raspberrypi dockerd[1536]: time="2016-10-10T20:44:33.754984356Z" level=info msg="API listen on /var/run/docker.sock"
Oct 10 20:44:34 raspberrypi ntpd[723]: Listen normally on 9 br0 169.254.124.135 UDP 123
Oct 10 20:44:34 raspberrypi ntpd[723]: Deleting interface #8 br0, 192.168.1.19#123, interface stats: received=0, sent=0, dropped=0, active_time=94
Once the docker container is running, the network configuration is not editable. Try running your docker daemon with --bip=CIDR to set your bridge IP manually. For more info follow here.
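Based on the dhcpcd lines in the daemon.log above (dhcpcd keeps assigning 169.254.124.135 to br0), another avenue worth trying (a sketch, an assumption rather than part of this answer) is to exclude br0 from dhcpcd and re-apply the static address:
# stop dhcpcd from managing br0
echo 'denyinterfaces br0' | sudo tee -a /etc/dhcpcd.conf
sudo systemctl restart dhcpcd
# re-apply the intended static address and route
sudo ip addr flush dev br0
sudo ip addr add 192.168.1.10/24 dev br0
sudo ip route add default via 192.168.1.1 dev br0
sudo systemctl restart docker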

Docker container fails to start after the docker daemon has been restarted

I am using Ubuntu 16.04 with docker 1.11.2. I have configured systemd to automatically restart the docker daemon. When I kill the docker daemon, it restarts, but the container does not, even though it has RestartPolicy set to always. From the logs I can read that it failed to create a directory because it already exists. I personally think that it is related to containerd being stopped.
Any help would be appreciated.
Aug 25 19:20:19 api-31 systemd[1]: docker.service: Main process exited, code=killed, status=9/KILL
Aug 25 19:20:19 api-31 docker[17617]: time="2016-08-25T19:20:19Z" level=info msg="stopping containerd after receiving terminated"
Aug 25 19:21:49 api-31 systemd[1]: docker.service: State 'stop-sigterm' timed out. Killing.
Aug 25 19:21:49 api-31 systemd[1]: docker.service: Unit entered failed state.
Aug 25 19:21:49 api-31 systemd[1]: docker.service: Failed with result 'timeout'.
Aug 25 19:21:49 api-31 systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Aug 25 19:21:49 api-31 systemd[1]: Stopped Docker Application Container Engine.
Aug 25 19:21:49 api-31 systemd[1]: Closed Docker Socket for the API.
Aug 25 19:21:49 api-31 systemd[1]: Stopping Docker Socket for the API.
Aug 25 19:21:49 api-31 systemd[1]: Starting Docker Socket for the API.
Aug 25 19:21:49 api-31 systemd[1]: Listening on Docker Socket for the API.
Aug 25 19:21:49 api-31 systemd[1]: Starting Docker Application Container Engine...
Aug 25 19:21:49 api-31 docker[19023]: time="2016-08-25T19:21:49.913162167Z" level=info msg="New containerd process, pid: 19029\n"
Aug 25 19:21:50 api-31 kernel: [87066.742831] audit: type=1400 audit(1472152910.946:23): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="docker-default" pid=19043 comm="apparmor_parser"
Aug 25 19:21:50 api-31 docker[19023]: time="2016-08-25T19:21:50.952073973Z" level=info msg="[graphdriver] using prior storage driver \"overlay\""
Aug 25 19:21:50 api-31 docker[19023]: time="2016-08-25T19:21:50.956693893Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Aug 25 19:21:50 api-31 docker[19023]: time="2016-08-25T19:21:50.961641996Z" level=info msg="Firewalld running: false"
Aug 25 19:21:51 api-31 docker[19023]: time="2016-08-25T19:21:51.016582850Z" level=info msg="Removing stale sandbox 66ef9e1af997a1090fac0c89bf96c2631bea32fbe3c238c4349472987957c596 (547bceaad5d121444ddc6effbac3f472d0c232d693d8cc076027e238cf253613)"
Aug 25 19:21:51 api-31 docker[19023]: time="2016-08-25T19:21:51.046227326Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Aug 25 19:21:51 api-31 docker[19023]: time="2016-08-25T19:21:51.081106790Z" level=warning msg="Your kernel does not support swap memory limit."
Aug 25 19:21:51 api-31 docker[19023]: time="2016-08-25T19:21:51.081650610Z" level=info msg="Loading containers: start."
Aug 25 19:22:01 api-31 kernel: [87076.922492] docker0: port 1(vethbbc1192) entered disabled state
Aug 25 19:22:01 api-31 kernel: [87076.927128] device vethbbc1192 left promiscuous mode
Aug 25 19:22:01 api-31 kernel: [87076.927131] docker0: port 1(vethbbc1192) entered disabled state
Aug 25 19:22:03 api-31 docker[19023]: .time="2016-08-25T19:22:03.085800458Z" level=warning msg="error locating sandbox id 66ef9e1af997a1090fac0c89bf96c2631bea32fbe3c238c4349472987957c596: sandbox 66ef9e1af997a1090fac0c89bf96c2631bea32fbe3c238c4349472987957c596 not found"
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.085907328Z" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/547bceaad5d121444ddc6effbac3f472d0c232d693d8cc076027e238cf253613/shm: invalid argument"
Aug 25 19:22:03 api-31 kernel: [87078.882836] device veth5c6999c entered promiscuous mode
Aug 25 19:22:03 api-31 kernel: [87078.882984] IPv6: ADDRCONF(NETDEV_UP): veth5c6999c: link is not ready
Aug 25 19:22:03 api-31 systemd-udevd[19128]: Could not generate persistent MAC address for veth5c6999c: No such file or directory
Aug 25 19:22:03 api-31 systemd-udevd[19127]: Could not generate persistent MAC address for veth39fb4d3: No such file or directory
Aug 25 19:22:03 api-31 kernel: [87078.944218] docker0: port 1(veth5c6999c) entered disabled state
Aug 25 19:22:03 api-31 kernel: [87078.948636] device veth5c6999c left promiscuous mode
Aug 25 19:22:03 api-31 kernel: [87078.948640] docker0: port 1(veth5c6999c) entered disabled state
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.219677059Z" level=error msg="Failed to start container 547bceaad5d121444ddc6effbac3f472d0c232d693d8cc076027e238cf253613: rpc error: code = 6 desc = \"mkdir /run/containerd/547bceaad5d121444ddc6effbac3f472d0c232d693d8cc076027e238cf253613: file exists\""
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.219750430Z" level=info msg="Loading containers: done."
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.219776593Z" level=info msg="Daemon has completed initialization"
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.219847738Z" level=info msg="Docker daemon" commit=b9f10c9 graphdriver=overlay version=1.11.2
Aug 25 19:22:03 api-31 systemd[1]: Started Docker Application Container Engine.
Aug 25 19:22:03 api-31 docker[19023]: time="2016-08-25T19:22:03.226116336Z" level=info msg="API listen on /var/run/docker.sock"
@VonC - Thank you for pointing me in the right direction. I researched the thread, but in my case AppArmor is not the issue. There are some other issues mentioned in the thread, so I followed them and found the solution.
SOLUTION:
On Ubuntu 16.04 the problem is that systemd kills the containerd process along with the docker daemon process. In order to prevent that, you need to add
KillMode=process
to /lib/systemd/system/docker.service and that fixes the issue.
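A minimal sketch of applying that via a systemd drop-in instead of editing the unit file directly (the drop-in file name killmode.conf is my own choice):
sudo mkdir -p /etc/systemd/system/docker.service.d
printf '[Service]\nKillMode=process\n' | sudo tee /etc/systemd/system/docker.service.d/killmode.conf
sudo systemctl daemon-reload
sudo systemctl restart docker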
Here are the sources I used:
https://github.com/docker/docker/issues/25246
https://github.com/docker/docker/blob/master/contrib/init/systemd/docker.service#L25
That seems to be followed by issue 25487 (August 2016), and was reported even before (April 2016) in issue 22195.
Check if you are in the situation mentioned in issue 21702 by Tõnis Tiigi:
This seems to be caused by the apparmor profile for docker daemon we have in docker/contrib/apparmor.
If this profile is applied in v1.11 (at least ubuntu wily) then container starting does not work.
I'm not sure if users have just manually enforced this profile or apparently we also accidentally installed this profile in 1.10.0-rc1 (#19707).
So the workaround, until we figure out how to deal with this, is to unload the profile with something like apparmor_parser -R /etc/apparmor.d/docker-engine, delete it and restart the daemon.
/etc/apparmor.d/docker is the profile for the containers and does not need to be changed.
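A sketch of that workaround as commands (paths as given in the quote above):
# unload the daemon's AppArmor profile, remove it, then restart docker
sudo apparmor_parser -R /etc/apparmor.d/docker-engine
sudo rm /etc/apparmor.d/docker-engine
sudo systemctl restart docker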
