I have managed to manually set up a Docker swarm (i.e. without using docker-machine) by following the official tutorial.
I am able to run containers on the swarm successfully using docker engine:
docker -H :4000 run redis
I would like to use docker-compose to run containers on the swarm, however I cannot seem to get this right.
The first thing I had to work out was how to get compose to talk on port :4000. I achieved this by specifying: export DOCKER_HOST=":4000".
However, now, when I run docker-compose I get the following error:
$ docker-compose up
Creating network "root_default" with the default driver
ERROR: Error response from daemon: failed to parse pool request for address space "GlobalDefault" pool "" subpool "": cannot find address space GlobalDefault (most likely the backing datastore is not configured)
It feels like this issue has to do with either TLS or network, but I'm pretty stumped as to how to fix it, or even how to go about investigating it further.
I'm using Docker Engine 1.10, Compose 1.6, and Swarm:latest.
In case it's useful, here is my docker info:
$ docker -H :4000 info
Containers: 7
Running: 5
Paused: 0
Stopped: 2
Images: 7
Server Version: swarm/1.2.0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
node02: 10.129.5.211:2375
└ Status: Healthy
└ Containers: 3
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-04-15T08:28:20Z
└ ServerVersion: 1.11.0
node03: 10.129.6.21:2375
└ Status: Healthy
└ Containers: 4
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-04-15T08:28:43Z
└ ServerVersion: 1.11.0
Plugins:
Volume:
Network:
Kernel Version: 3.13.0-79-generic
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 4.105 GiB
Name: b156985db557
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
I am using Docker 1.10, Docker Swarm, and Docker Compose in production.
I ran into the same issue in the past and solved it with the steps below:
Step 1: export DOCKER_HOST=tcp://localhost:4000
Step 2: Verify the swarm with docker info (without -H). If it is not OK, make sure the Swarm manager is running on your host.
Step 3: If Step 2 is OK, run your application with docker-compose up.
I ran into the same problem and found the answer here: https://groups.google.com/a/weave.works/forum/m/#!topic/weave-users/Mf6fv9OEd-E
What fixed it was:
To make it work with docker-compose file format version 2, you should add:
network_mode: "bridge"
to all service definitions.
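A minimal compose file with that change applied might look like this (the service names here are illustrative, not from the question):

```yaml
# docker-compose.yml, file format version 2.
# network_mode: "bridge" on every service stops Compose from creating an
# overlay network that needs the missing key-value store.
version: "2"
services:
  redis:
    image: redis
    network_mode: "bridge"
  web:
    image: nginx
    network_mode: "bridge"
```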
To remove the above error, you need to run the Swarm manager on every manager node, backed by a key-value store (Consul here), like this:
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise 192.168.56.103:4000 consul://192.168.56.101:8500
I'm planning to move away from Docker to Podman.
I use docker-compose a lot, so I am planning to switch to podman-compose as well.
However, I'm stuck at the simplest of Podman examples: I can't seem to mount a volume into my container. Obviously I'm doing something wrong, but I can't figure out what it is.
My source file definitely exists on my (hardware) host (so not on the Podman machine), but I keep getting the error 'no such file or directory'.
The funny thing is that if I manually create the same file locally on the Podman machine (podman machine ssh, then touch /tmp/test.txt), it works perfectly fine.
My questions are:
Should I (manually?) mount all my local files onto the Fedora VM (the Podman machine) so that this Fedora mount can in turn be used in my actual container? And if so, how do I do this?
Or should the podman run command below just work, and is there something else I'm doing wrong?
$ ls -al /tmp/test.txt
-rw-r--r-- 1 <username> <group> 10 Dec 8 13:33 /tmp/test.txt
$ podman run -it -v /tmp/test.txt:/tmp/test.txt docker.io/library/busybox
Error: statfs /tmp/test.txt: no such file or directory
$ podman run -it -v /tmp/test.txt:/tmp/test.txt:Z docker.io/library/busybox
Error: statfs /tmp/test.txt: no such file or directory
Additional information:
$ podman info --debug
host:
arch: amd64
buildahVersion: 1.23.1
cgroupControllers:
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.0.30-2.fc35.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.30, commit: '
cpus: 10
distribution:
distribution: fedora
variant: coreos
version: "35"
eventLogger: journald
hostname: localhost.localdomain
idMappings:
gidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
uidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 65536
kernel: 5.15.6-200.fc35.x86_64
linkmode: dynamic
logDriver: journald
memFree: 11733594112
memTotal: 12538863616
ociRuntime:
name: crun
package: crun-1.3-1.fc35.x86_64
path: /usr/bin/crun
version: |-
crun version 1.3
commit: 8e5757a4e68590326dafe8a8b1b4a584b10a1370
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
remoteSocket:
exists: true
path: /run/user/1000/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: true
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.12-2.fc35.x86_64
version: |-
slirp4netns version 1.1.12
commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
libslirp: 4.6.1
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.3
swapFree: 0
swapTotal: 0
uptime: 7h 9m 29.12s (Approximately 0.29 days)
plugins:
log:
- k8s-file
- none
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
search:
- docker.io
store:
configFile: /var/home/core/.config/containers/storage.conf
containerStore:
number: 4
paused: 0
running: 0
stopped: 4
graphDriverName: overlay
graphOptions: {}
graphRoot: /var/home/core/.local/share/containers/storage
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "true"
Supports d_type: "true"
Using metacopy: "false"
imageStore:
number: 8
runRoot: /run/user/1000/containers
volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
APIVersion: 3.4.2
Built: 1636748737
BuiltTime: Fri Nov 12 20:25:37 2021
GitCommit: ""
GoVersion: go1.16.8
OsArch: linux/amd64
Version: 3.4.2
As mentioned by @ErikSjölund, there has been an active thread on https://github.com/containers/podman. Apparently CoreOS (the Podman machine OS) does not (yet) support these types of volume mounts onto the machine.
It's not Podman per se that lacks this feature; it's waiting for CoreOS to support it as well.
However, should you want to mount a local directory onto the machine, I recommend having a look at https://github.com/containers/podman/issues/8016#issuecomment-995242552. It describes how to do a read-only mount on CoreOS (or break compatibility with your local version).
Info:
https://github.com/containers/podman/pull/11454
https://github.com/containers/podman/pull/12584
There was a crash, and now I have this issue where the swarm status says pending and the node status is Unknown. This is my docker info result:
swarm@swarm-manager-1:~$ docker info
Containers: 270
Running: 0
Paused: 0
Stopped: 270
Images: 160
Server Version: 1.12.2
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 1211
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: pending
NodeID: d9hq8wzz6skh9pzrxzhbckm97
Is Manager: true
ClusterID: 5zgab5w50qgvvep35eqcbote2
Managers: 1
Nodes: 2
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: HIDDEN
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-91-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.804 GiB
Name: swarm-manager-1
ID: AXPO:VFSV:TDT3:6X7Y:QNAO:OZJN:U23R:V5S2:FU33:WUNI:CRPK:2E2C
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
This is my docker node ls result:
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
9tlo3rg7tuc23xzc3am28lak1 swarm-worker-1 Unknown Active
d9hq8wzz6skh9pzrxzhbckm97 * swarm-manager-1 Unknown Active Leader
I've tried restarting the Docker engine and the VM, but it doesn't help in any way. The system is actually running: when I run docker ps on the worker it shows all the containers, but docker ps on the manager shows nothing.
Any ideas?
In my experience with Swarm, the only solution to similar trouble was to destroy the swarm. When you do this you should probably also run docker system prune (only if there's nothing valuable that could be deleted) and service docker restart, and then set up a new swarm.
It sucks, I know.
Instead of just rebuilding the whole swarm all at once, you can attempt to remove and re-add each node one at a time - the advantage of this is that the swarm state is not destroyed and, on larger swarms, services can continue while you fix it. This process is considerably more complicated when you don't have a quorum of managers, though.
First, note the node IDs (I'll refer to them here as $WORKER_ID and $MANAGER_ID).
On manager node:
docker node update --availability drain $WORKER_ID
^ This is optional, but it's a good habit when working with live services on a swarm.
docker swarm join-token manager
^ This command will give you the join command to run on each node after it's removed. I'll refer to it as $JOIN_COMMAND below. We will demote the worker once the manager re-joins.
On worker:
docker swarm leave
$JOIN_COMMAND
This node is now re-joined as a manager, but I'll continue calling it the 'worker' to avoid confusion.
On manager:
docker node rm $WORKER_ID
docker node update --availability drain $MANAGER_ID
docker swarm leave -f
$JOIN_COMMAND
docker node rm $MANAGER_ID
docker node ls
Find the worker's new id (pay attention to the hostname, not the role) -> $NEW_WORKER_ID
docker node demote $NEW_WORKER_ID
Your swarm should be refreshed - if there were more nodes, the services running on each would have migrated across the swarm when you drained each node.
If it still doesn't work (and regardless), you really should consider upgrading to docker v17.06 or newer. Swarm networking was very unstable before that, causing a lot of issues stemming from race conditions.
I'm trying to run a clustered application on different virtual machines using standalone Swarm and docker-compose file format version 2. The overlay network is set up, but I want to force certain containers to run on specific hosts.
The documentation gives the following advice, but with this parameter I was not able to start any container at all:
environment:
- "constraint:node==node-1"
ERROR: for elasticsearch1 Cannot create container for service elasticsearch1: Unable to find a node that satisfies the following conditions
[available container slots]
[node==node-1]
Should we register the hosts as node-1, node-2, and so on, or is that done by default?
[root@ux-test14 ~]# docker node ls
Error response from daemon: 404 page not found
[root@ux-test14 ~]# docker run swarm list
[root@ux-test14 ~]#
[root@ux-test14 ~]# docker info
Containers: 8
Running: 6
Paused: 0
Stopped: 2
Images: 8
Server Version: swarm/1.2.5
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 2
ux-test16.rs: 10.212.212.2:2375
└ ID: JQPG:GKFF:KJZJ:AY3N:NHPZ:HD6J:SH36:KEZR:2SSH:XF65:YW3N:W4DG
└ Status: Healthy
└ Containers: 4 (4 Running, 0 Paused, 0 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 3.888 GiB
└ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-09-05T11:11:31Z
└ ServerVersion: 1.12.1
ux-test17.rs: 10.212.212.3:2375
└ ID: Z27V:T5NU:QKSH:DLNK:JA4M:V7UX:XYGH:UIL6:WFQU:FB5U:J426:7XIR
└ Status: Healthy
└ Containers: 4 (2 Running, 0 Paused, 2 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 3.888 GiB
└ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-09-05T11:11:17Z
└ ServerVersion: 1.12.1
Plugins:
Volume:
Network:
Swarm:
NodeID:
Is Manager: false
Node Address:
Security Options:
Kernel Version: 3.10.0-327.28.3.el7.x86_64
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 7.775 GiB
Name: 858ac2fdd225
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
WARNING: No kernel memory limit support
My first answer was about "swarm mode". You've since clarified that you're using legacy standalone Swarm and added more info, so here:
The constraint you list assumes that you have a host named node-1. Your hosts are named ux-test16.rs and ux-test17.rs. Just use those names instead of node-1 in your constraint, e.g.:
environment:
- "constraint:node==ux-test16.rs"
The environment variable constraint is only valid for the legacy (stand alone) version of Swarm. The newer "Swarm Mode" uses either mode or constraints options (not environment variables).
To enforce one and only one task (container) per node, use mode=global.
docker service create --name proxy --mode global nginx
The default mode is replicated which means that the swarm manager will create tasks (containers) across all available nodes to meet the number specified in the --replicas option. Eg:
docker service create --name proxy --replicas 5 nginx
To enforce other constraints based on hostname (node), label, role, or ID, use the --constraint option, e.g.:
docker service create --name proxy --constraint "node.hostname!=node01" nginx
See https://docs.docker.com/engine/reference/commandline/service_create/#/specify-service-constraints
EDIT Sept 2016:
Something else: docker-compose is not currently supported in "swarm mode". Swarm mode understands the new DAB format instead. There is a way to convert docker-compose files to DAB, but it's experimental and not to be relied on at this point. It's better to create a bash script that calls all the docker service create ... commands directly.
EDIT March 2017:
As of docker 1.13 (17.03), docker-compose can now be used to provision swarm environments directly without having to deal with the dab step.
Related issue - I had a recent Swarm project with a mixture of worker nodes (3 x Linux + 4 x Windows). My containers needed to run on a specific OS, but not on any specific node. Swarm mode now supports specifying an OS under "constraints" in docker-compose files. No need to create labels for each node:
version: '3'
services:
service_1:
restart: on-failure
image: 'service_1'
deploy:
placement:
constraints:
- node.platform.os == windows
junittestsuite:
restart: on-failure
image: 'junit_test_suite:1.0'
command: ant test ...
deploy:
placement:
constraints:
- node.platform.os == linux
I have provisioned a swarm master and swarm nodes with Docker Machine (as described here). Everything is working fine; all the machines are created and running, they have all been discovered and accepts containers.
The output from 'docker-machine ls' is:
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Stopped Unknown
local - virtualbox Stopped Unknown
my-swarm * (swarm) digitalocean Running tcp://104.131.161.197:2376 my-swarm (master) v1.11.1
node0 - digitalocean Running tcp://104.236.29.169:2376 my-swarm v1.11.1
node1 - digitalocean Running tcp://104.236.216.164:2376 my-swarm v1.11.1
The problem I'm having is with the distribution of containers. No matter which strategy I set for the swarm, it only seems to distribute containers to one of the nodes at a time. I.e. I run a bunch of containers, and they are all started on the same node, as shown below (with strategy Spread):
$ docker ps
5c075d7ccddc stress "/bin/sh -c /stress.s" 32 seconds ago Up 31 seconds node0/elated_goldstine
5bae22a15829 stress "/bin/sh -c /stress.s" 46 seconds ago Up 44 seconds node0/cocky_booth
dc52b3dfa0e6 stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/goofy_kalam
3b9e69c694da stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/focused_fermat
ef0e006ff3e0 stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/stoic_engelbart
53e46b19ab33 stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/condescending_rosalind
e9e126c7f4c6 stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/sleepy_jang
f9c0003d509d stress "/bin/sh -c /stress.s" About a minute ago Up About a minute node0/amazing_bhaskara
What I would expect here is for the containers to be distributed roughly evenly on the 3 nodes, especially as the script I'm running in the containers is designed to take as much CPU as possible. But instead all of them are on node0 (which I would only expect with Binpack). The Random strategy has the exact same behaviour.
The output from 'docker info' with the swarm master set as active seems correct:
$ docker info
Containers: 15
Running: 4
Paused: 0
Stopped: 11
Images: 5
Server Version: swarm/1.2.1
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 3
my-swarm: 104.131.161.197:2376
└ ID: L2HK:F6S3:WWIM:BHNI:M4XL:KLEA:4U22:J6CE:ZHZI:OGGT:76KF:MTQU
└ Status: Healthy
└ Containers: 2
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 513.4 MiB
└ Labels: executiondriver=, kernelversion=4.2.0-27-generic, operatingsystem=Ubuntu 15.10, provider=digitalocean, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-05-09T10:25:24Z
└ ServerVersion: 1.11.1
node0: 104.236.29.169:2376
└ ID: I3TQ:5BMS:TM2P:GLL4:64OH:BDMY:SWBU:3QG4:TOZ2:LEDW:A6SQ:X34H
└ Status: Healthy
└ Containers: 12
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 513.4 MiB
└ Labels: executiondriver=, kernelversion=4.2.0-27-generic, operatingsystem=Ubuntu 15.10, provider=digitalocean, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-05-09T10:25:02Z
└ ServerVersion: 1.11.1
node1: 104.236.216.164:2376
└ ID: OTQH:UBSV:2HKE:ZVHL:2K7Z:BYGC:ZX25:Y6BQ:BN5J:UWEB:65KE:DABM
└ Status: Healthy
└ Containers: 1
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 513.4 MiB
└ Labels: executiondriver=, kernelversion=4.2.0-27-generic, operatingsystem=Ubuntu 15.10, provider=digitalocean, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-05-09T10:25:10Z
└ ServerVersion: 1.11.1
Plugins:
Volume:
Network:
Kernel Version: 4.2.0-27-generic
Operating System: linux
Architecture: amd64
CPUs: 3
Total Memory: 1.504 GiB
Name: my-swarm
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
Is there some piece I have missed that is necessary for this type of automatic distribution to work the way I'm expecting?
Issue
It is most likely that your Docker agents do not have the image you are trying to run on them.
Note that building or pulling an image through Swarm will not build or pull it on each Docker agent.
Solution A
Run the pull manually on each Docker Agent
Solution B (not tested)
Run a local Docker registry from which the Docker agents can pull the images.
Have fun with Swarm! Also, if I may, I suggest you read this answer, which details a whole step-by-step tutorial on Swarm.
I set up a swarm cluster with two machines. It works as desired: I'm able to launch a container on the desired node based on a constraint filter. However, when I try to ping a container on one node from a container on another node, it fails; it does not recognize it. Is this expected, or did I do something wrong in setting up the swarm cluster?
Other Details:
machine 1(10.0.0.4) as both host/node
machine 2(10.0.0.21) as node
The Swarm agent is at 10.0.0.4:2374 (ip:port).
The output of the info command is:
docker -H tcp://10.0.0.4:2374 info
Containers: 11
strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 2
machine1: 10.0.0.4:2375
└ Containers: 6
└ Reserved CPUs: 0 / 25
└ Reserved Memory: 0 B / 24.76 GiB
machine2: 10.0.0.21:2375
└ Containers: 5
└ Reserved CPUs: 0 / 25
└ Reserved Memory: 0 B / 24.76 GiB
Overlay networking was introduced in Docker 1.9 (Nov 2015). It allows containers on different nodes (hosts) to be part of the same network and communicate.
Yes, from the docs "Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual host." https://docs.docker.com/swarm/
It looks like Docker Swarm is more of a management tool and scheduler. I had to use some other tool, like Weave or an ambassador container, to connect two containers on different hosts. Anyhow, Docker Swarm is a good clustering tool and helped me set things up as I desired.