I set up a Swarm cluster with two machines, and it works as desired: I'm able to launch containers on a desired node based on a constraint filter. However, when I try to ping a container on one node from a container on another node, it fails; the name is not recognized. Is this expected, or did I do something wrong in setting up the Swarm cluster?
Other Details:
machine 1 (10.0.0.4) as both host/node
machine 2 (10.0.0.21) as node
The Swarm manager endpoint is 10.0.0.4:2374 (ip:port)
The output of the info command is:
docker -H tcp://10.0.0.4:2374 info
Containers: 11
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 2
machine1: 10.0.0.4:2375
└ Containers: 6
└ Reserved CPUs: 0 / 25
└ Reserved Memory: 0 B / 24.76 GiB
machine2: 10.0.0.21:2375
└ Containers: 5
└ Reserved CPUs: 0 / 25
└ Reserved Memory: 0 B / 24.76 GiB
Overlay networking was introduced in Docker 1.9 (Nov 2015). It allows containers on different nodes (hosts) to be part of the same network and communicate.
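For example (a minimal sketch, assuming each daemon was started with --cluster-store and --cluster-advertise pointing at a key-value store such as Consul, which multi-host networking requires; the network and container names and the busybox image are illustrative):
docker -H tcp://10.0.0.4:2374 network create -d overlay mynet
docker -H tcp://10.0.0.4:2374 run -d --name c1 --net=mynet -e constraint:node==machine1 busybox sleep 3600
docker -H tcp://10.0.0.4:2374 run -d --name c2 --net=mynet -e constraint:node==machine2 busybox sleep 3600
docker -H tcp://10.0.0.4:2374 exec c1 ping -c 3 c2
Containers attached to the same overlay network can then reach each other by name, regardless of which node they run on.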
Yes, from the docs: "Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual host." https://docs.docker.com/swarm/
It looks like Docker Swarm is more of a management tool and scheduler. I'll have to use some other tool like Weave or the ambassador pattern to connect two containers on different hosts. Anyhow, Docker Swarm is a good clustering tool and helps me set things up as I desire.
I am using Kubernetes with Docker. #kubectl, #kubevirt
When I create a VMI using a containerDisk from the docker.io registry, I find that two containers are created inside a single pod (one is compute and the other is the volume container, volumecontainervolume).
cat centos.yaml | grep -ia3 centos-kubevirt-image
volumes:
- name: containervolume
  containerDisk:
    image: munnaeeebd/centos-kubevirt-image:latest
- name: cloudinitvolume
  cloudInitNoCloud:
    userData: |-
kubectl get pod | grep centos
virt-launcher-centos-5kfvw 2/2 Running 0 21h
But only a single container is created when using a PVC, where disk.img is uploaded via CDI:
cat cirros-with-cirros-pvc.yaml | grep -ia3 cirros-pvc
volumes:
- name: containervolume
  persistentVolumeClaim:
    claimName: cirros-pvc
- name: cloudinitvolume
  cloudInitNoCloud:
    userData: |-
kubectl get pod | grep cirros
virt-launcher-cirros-57x2r 1/1 Running 0 78m
My question is: is it normal that containerDisk creates one additional container compared to PVC?
As explained in this GitHub post, it is the designed behavior.
The compute container contains the QEMU process with the running Virtual Machine. The Virtual Machine reads the machine image from a mounted volume served from the other volume container; that second container only provides the data.
With this design you get the benefit of packaging a Virtual Machine in a container image and uploading it to any container registry. Once it is uploaded, it can be used by KubeVirt as a container disk.
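For illustration, such a container disk image is typically built by adding the VM disk to a minimal image (a sketch; the qcow2 filename is a placeholder, and /disk/ is the path KubeVirt expects):
FROM scratch
ADD --chown=107:107 my-vm-disk.qcow2 /disk/
Pushing that image to a registry then lets you reference it from a containerDisk volume, as in the question's centos.yaml.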
There was a crash, and now I have an issue where the Docker swarm status is pending and the node status is UNKNOWN. This is my docker info result:
swarm#swarm-manager-1:~$ docker info
Containers: 270
Running: 0
Paused: 0
Stopped: 270
Images: 160
Server Version: 1.12.2
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 1211
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge null overlay
Swarm: pending
NodeID: d9hq8wzz6skh9pzrxzhbckm97
Is Manager: true
ClusterID: 5zgab5w50qgvvep35eqcbote2
Managers: 1
Nodes: 2
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: HIDDEN
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-91-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.804 GiB
Name: swarm-manager-1
ID: AXPO:VFSV:TDT3:6X7Y:QNAO:OZJN:U23R:V5S2:FU33:WUNI:CRPK:2E2C
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
This is my docker node ls result:
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
9tlo3rg7tuc23xzc3am28lak1 swarm-worker-1 Unknown Active
d9hq8wzz6skh9pzrxzhbckm97 * swarm-manager-1 Unknown Active Leader
I've tried restarting the Docker engine and the VM, but it doesn't help in any way. The system is actually running: when I run docker ps on the worker it shows all the containers, but on the manager docker ps shows nothing.
Any idea?
In my experience with Swarm, the only solution to similar trouble was to destroy the swarm. When you do this you should probably also run a docker system prune (only if there's nothing valuable that could be deleted) and a service docker restart, and then set up a new swarm.
It sucks, I know.
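Roughly, the sequence looks like this (a sketch; run the first three steps on every node, and only prune if nothing valuable would be deleted; <MANAGER_IP> is a placeholder):
docker swarm leave --force    # --force is required on manager nodes
docker system prune           # CAUTION: removes stopped containers and unused images/networks
service docker restart
docker swarm init --advertise-addr <MANAGER_IP>    # on the new manager, then re-join the workers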
Instead of just rebuilding the whole swarm all at once, you can attempt to remove and re-add each node one at a time - the advantage of this is that the swarm state is not destroyed and, on larger swarms, services can continue while you fix it. This process is considerably more complicated when you don't have a quorum of managers, though.
First, note the node IDs (I'll refer to here as $WORKER_ID and $MANAGER_ID).
On manager node:
docker node update --availability drain $WORKER_ID
^ This is optional, but it's a good habit when working with live services on a swarm.
docker swarm join-token manager
^ This command will give you the join command to run on each node after it's removed. I'll refer to it as $JOIN_COMMAND below. We will demote the worker once the manager re-joins.
On worker:
docker swarm leave
$JOIN_COMMAND
This node is now re-joined as a manager, but I'll continue calling it the 'worker' to avoid confusion.
On manager:
docker node rm $WORKER_ID
docker node update --availability drain $MANAGER_ID
docker swarm leave -f
$JOIN_COMMAND
docker node rm $MANAGER_ID
docker node ls
Find the worker's new id (pay attention to the hostname, not the role) -> $NEW_WORKER_ID
docker node demote $NEW_WORKER_ID
Your swarm should be refreshed - if there were more nodes, the services running on each would have migrated across the swarm when you drained each node.
If it still doesn't work (and regardless), you really should consider upgrading to docker v17.06 or newer. Swarm networking was very unstable before that, causing a lot of issues stemming from race conditions.
I use Windows containers and am trying to create a Docker swarm. I created three virtual machines using Hyper-V, and each OS is Windows Server 2016. The machines' IPs are:
windocker211 192.168.1.211
windocker212 192.168.1.212
windocker219 192.168.1.219
The docker swarm node is :
PS C:\ConsoleZ> docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
4c0g0o0uognheugw4do1a1h7y windocker212 Ready Active
bbxot0c8zijq7xw4lm86svgwp * windocker219 Ready Active Leader
wftwpiqpqpbqfdvgenn787psj windocker211 Ready Active
I create the service using this command:
docker service create --name=demo5 -p 5005:5005 --replicas 6 192.168.1.245/cqgis/wintestcore:0.6
The Docker image is an ASP.NET Core app; the Dockerfile is:
FROM 192.168.1.245/win/aspnetcore-runtime:1.1.2
COPY . /app
WORKDIR /app
ENV ASPNETCORE_URLS http://*:5005
EXPOSE 5005/tcp
ENTRYPOINT ["dotnet", "dotnetcore.dll"]
Then it is created successfully:
PS C:\ConsoleZ> docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
omhu7e0vo96s demo5 replicated 6/6 192.168.1.245/cqgis/wintestcore:0.6 *:5005->5005/tcp
PS C:\ConsoleZ> docker service ps demo5
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
8pihnak9a2ei demo5.1 192.168.1.245/cqgis/wintestcore:0.6 windocker212 Running Running 59 seconds ago
ut3f3b9giu4w demo5.2 192.168.1.245/cqgis/wintestcore:0.6 windocker219 Running Running 47 seconds ago
iy1xjevt67yl demo5.3 192.168.1.245/cqgis/wintestcore:0.6 windocker211 Running Running about a minute ago
q7f1gnbwslr3 demo5.4 192.168.1.245/cqgis/wintestcore:0.6 windocker212 Running Running about a minute ago
8zewaktcu32h demo5.5 192.168.1.245/cqgis/wintestcore:0.6 windocker219 Running Running about a minute ago
xq820kqwf3v9 demo5.6 192.168.1.245/cqgis/wintestcore:0.6 windocker211 Running Running 55 seconds ago
But my question is that I can't access the site via any of:
http://192.168.1.219:5005/
http://192.168.1.211:5005/
http://192.168.1.212:5005/
When I use the command
docker run -it -p 5010:5005 192.168.1.245/cqgis/wintestcore:0.6
I can use http://192.168.1.219:5010/ and get the right result.
My docker info is:
PS C:\ConsoleZ> docker info
Containers: 4
Running: 3
Paused: 0
Stopped: 1
Images: 5
Server Version: 17.06.0-ce-rc1
Storage Driver: windowsfilter
Windows:
Logging Driver: json-file
Plugins:
Volume: local
Network: l2bridge l2tunnel nat null overlay transparent
Log: awslogs etwlogs fluentd json-file logentries splunk syslog
Swarm: active
NodeID: bbxot0c8zijq7xw4lm86svgwp
Is Manager: true
ClusterID: 32vsgwrbn6ihvpevly71gkgxk
Managers: 1
Nodes: 3
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Root Rotation In Progress: false
Node Address: 192.168.1.219
Manager Addresses:
192.168.1.219:2377
Default Isolation: process
Kernel Version: 10.0 14393 (14393.1198.amd64fre.rs1_release_sec.170427-1353)
Operating System: Windows Server 2016 Datacenter
OSType: windows
Architecture: x86_64
CPUs: 8
Total Memory: 2.89GiB
Name: windock219
ID: 7AOY:OT6V:BTJV:NCHA:3OF5:5WR5:K2YR:CFG3:VXLD:QTMD:GA3D:ZFJ2
Docker Root Dir: C:\ProgramData\docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: -1
Goroutines: 297
System Time: 2017-06-04T19:58:20.7582294+08:00
EventsListeners: 2
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
192.168.1.245
127.0.0.0/8
Live Restore Enabled: false
I believe you need to publish the port in "host" mode (learn.microsoft.com/en-us/virtualization/windowscontainers/…). Also, it will be a one-to-one port mapping between the running container and the host, so you will not be able to run several containers on the same port. The routing mesh is not working on Windows yet.
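A host-mode publish would look roughly like this (a sketch; --mode global is used so that at most one task per node claims the port):
docker service create --name demo5 --mode global --publish mode=host,target=5005,published=5005 192.168.1.245/cqgis/wintestcore:0.6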
There are some differences in networking between Docker for Windows containers and Docker for Linux. Windows containers use the Hyper-V networking technologies to provide the virtual networking features that Docker uses. From that come some restrictions that don't behave as you would expect or as described in the standard Docker documentation.
First, you cannot access the web site running inside your container by using the loopback address (127.0.0.1) or the host address (192.168.1.xxx). You always have to call it from a remote machine.
I saw you are using the EXPOSE command in your Dockerfile. It is not self-explanatory, but EXPOSE exposes a port on any network other than the host or ingress network. It's not a problem if you do that in a non-swarm configuration, but it does not work in a swarm. I suggest removing the EXPOSE command.
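That is, the Dockerfile from the question with the EXPOSE line dropped:
FROM 192.168.1.245/win/aspnetcore-runtime:1.1.2
COPY . /app
WORKDIR /app
ENV ASPNETCORE_URLS http://*:5005
ENTRYPOINT ["dotnet", "dotnetcore.dll"]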
There are some unsolved problems with Windows networking. Sometimes the port stays in use after the container gets restarted, for example after a reboot of the host system: https://github.com/moby/moby/issues/21558
With this script you can remove all the virtual network settings:
Stop-Service docker
Get-ContainerNetwork | Remove-ContainerNetwork
Get-NetNat | Remove-NetNat
Get-VMSwitch | Remove-VMSwitch
Start-Service docker
You cannot reach a container's published port from the same machine because of a limitation of WinNAT networking, but you can reach the required port with an external request.
In your example, accessing the URL http://192.168.1.219:5005/ from a machine other than 192.168.1.219 will succeed. The URLs http://192.168.1.211:5005/ and http://192.168.1.212:5005/ will also succeed, provided the requests originate from outside those machines.
Using 'host' mode will work; however, you then lose the advantage of the 'routing mesh' feature, which allows the service to be reached via any of the swarm's nodes rather than only via that one single node.
I'm trying to run a clustered application on different virtual machines using standalone Swarm and docker-compose file version '2'. The overlay network is set up, but I want to force certain containers to run on specific hosts.
In the documentation there is the following advice, but with this parameter I was not able to start any container at all:
environment:
  - "constraint:node==node-1"
ERROR: for elasticsearch1 Cannot create container for service elasticsearch1: Unable to find a node that satisfies the following conditions
[available container slots]
[node==node-1]
Should we register the hosts as node-1, node-2, ..., or is that done by default?
[root@ux-test14 ~]# docker node ls
Error response from daemon: 404 page not found
[root@ux-test14 ~]# docker run swarm list
[root@ux-test14 ~]#
[root@ux-test14 ~]# docker info
Containers: 8
Running: 6
Paused: 0
Stopped: 2
Images: 8
Server Version: swarm/1.2.5
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 2
ux-test16.rs: 10.212.212.2:2375
└ ID: JQPG:GKFF:KJZJ:AY3N:NHPZ:HD6J:SH36:KEZR:2SSH:XF65:YW3N:W4DG
└ Status: Healthy
└ Containers: 4 (4 Running, 0 Paused, 0 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 3.888 GiB
└ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-09-05T11:11:31Z
└ ServerVersion: 1.12.1
ux-test17.rs: 10.212.212.3:2375
└ ID: Z27V:T5NU:QKSH:DLNK:JA4M:V7UX:XYGH:UIL6:WFQU:FB5U:J426:7XIR
└ Status: Healthy
└ Containers: 4 (2 Running, 0 Paused, 2 Stopped)
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 3.888 GiB
└ Labels: kernelversion=3.10.0-327.28.3.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper
└ UpdatedAt: 2016-09-05T11:11:17Z
└ ServerVersion: 1.12.1
Plugins:
Volume:
Network:
Swarm:
NodeID:
Is Manager: false
Node Address:
Security Options:
Kernel Version: 3.10.0-327.28.3.el7.x86_64
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 7.775 GiB
Name: 858ac2fdd225
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
WARNING: No kernel memory limit support
My first answer was about "swarm mode". You've since clarified that you're using legacy Swarm and added more info, so here:
The constraint you list assumes that you have a host named node-1. Your hosts are named ux-test16.rs and ux-test17.rs. Just use those names instead of node-1 in your constraint, e.g.:
environment:
  - "constraint:node==ux-test16.rs"
The environment-variable constraint is only valid for the legacy (standalone) version of Swarm. The newer "Swarm Mode" uses either the mode or the --constraint option (not environment variables).
To enforce one and only one task (container) per node, use mode=global.
docker service create --name proxy --mode global nginx
The default mode is replicated, which means that the swarm manager will create tasks (containers) across all available nodes to meet the number specified in the --replicas option, e.g.:
docker service create --name proxy --replicas 5 nginx
To enforce other constraints based on hostname (node), label, role, or ID, use the --constraint option, e.g.:
docker service create --name proxy --constraint "node.hostname!=node01" nginx
See https://docs.docker.com/engine/reference/commandline/service_create/#/specify-service-constraints
EDIT Sept 2016:
Something else: docker-compose is not currently supported in "swarm mode"; swarm mode understands the new DAB format instead. There is a way to convert docker-compose files to DAB, but it's experimental and not to be relied on at this point. It's better to create a bash script that calls all the docker service create ... commands directly.
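For example (a hypothetical sketch; the service names, image, and constraints are illustrative, loosely based on the question's setup):
#!/bin/bash
docker service create --name elasticsearch1 --constraint "node.hostname==ux-test16.rs" elasticsearch
docker service create --name elasticsearch2 --constraint "node.hostname==ux-test17.rs" elasticsearch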
EDIT March 2017:
As of Docker 1.13 (17.03), docker-compose files can now be used to provision swarm environments directly, without having to deal with the DAB step.
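For example, with a version '3' compose file (the stack name mystack is a placeholder):
docker stack deploy --compose-file docker-compose.yml mystack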
Related issue - I had a recent Swarm project with a mixture of worker nodes (3 x Linux + 4 x Windows). My containers needed to run on a specific OS, but not on any specific node. Swarm mode now supports specifying an OS under "constraints" in docker-compose files, so there is no need to create labels for each node:
version: '3'
services:
  service_1:
    restart: on-failure
    image: 'service_1'
    deploy:
      placement:
        constraints:
          - node.platform.os == windows
  junittestsuite:
    restart: on-failure
    image: 'junit_test_suite:1.0'
    command: ant test ...
    deploy:
      placement:
        constraints:
          - node.platform.os == linux
I have managed to manually set up a Docker swarm (i.e., without using docker-machine), following the official tutorial.
I am able to run containers on the swarm successfully using docker engine:
docker -H :4000 run redis
I would like to use docker-compose to run containers on the swarm; however, I cannot seem to get this right.
The first thing I had to work out was how to get Compose to talk to port :4000. I achieved this by specifying export DOCKER_HOST=":4000".
However, now, when I run docker-compose I get the following error:
$docker-compose up
Creating network "root_default" with the default driver
ERROR: Error response from daemon: failed to parse pool request for address space "GlobalDefault" pool "" subpool "": cannot find address space GlobalDefault (most likely the backing datastore is not configured)
It feels like this issue has to do with either TLS or network, but I'm pretty stumped as to how to fix it, or even how to go about investigating it further.
I'm using Docker Engine 1.10, Compose 1.6, and Swarm:latest.
In case it's useful, here is my docker info:
$docker -H :4000 info
Containers: 7
Running: 5
Paused: 0
Stopped: 2
Images: 7
Server Version: swarm/1.2.0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
node02: 10.129.5.211:2375
└ Status: Healthy
└ Containers: 3
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-04-15T08:28:20Z
└ ServerVersion: 1.11.0
node03: 10.129.6.21:2375
└ Status: Healthy
└ Containers: 4
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-04-15T08:28:43Z
└ ServerVersion: 1.11.0
Plugins:
Volume:
Network:
Kernel Version: 3.13.0-79-generic
Operating System: linux
Architecture: amd64
CPUs: 4
Total Memory: 4.105 GiB
Name: b156985db557
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
WARNING: No kernel memory limit support
I am using Docker 1.10, Docker Swarm, and Docker Compose in production.
I met the same issue in the past, and I solved it with the steps below:
Step 1: export DOCKER_HOST=tcp://localhost:4000
Step 2: Verify Docker Swarm with the command docker info (without -H). If it is not OK, make sure the Swarm manager is working on your host.
Step 3: If Step 2 is OK, run your application with Docker Compose: docker-compose up
I ran into the same problem and found the answer here: https://groups.google.com/a/weave.works/forum/m/#!topic/weave-users/Mf6fv9OEd-E
What fixed it was:
To make it work with docker-compose 2, you should add:
network_mode: "bridge"
to all service definitions.
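For example (a minimal sketch; the service name web and image nginx are placeholders):
version: '2'
services:
  web:
    image: nginx
    network_mode: "bridge"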
To remove the above error, you need to run the Swarm manager container on every node, like this:
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise 192.168.56.103:4000 consul://192.168.56.101:8500