How are volumes shared across Docker Swarm nodes?

So I'm starting to play with Docker and Docker Swarm, and now I wonder how volumes are shared across Docker Swarm nodes.
Let's pick a Postgres database, for instance, with a volume declared for the data:
docker-compose.yml:
...
db:
  image: postgres:9.4-alpine
  volumes:
    - db-data:/var/lib/postgresql/data
  deploy:
    placement:
      constraints: [node.role == manager]
...
What if I have managers that are very far away, in different regions of the world, and they need to share data between them?
How does Docker handle that?

First of all, creating a Swarm cluster composed of nodes in different regions is unusual, and at some point it may not even be possible, since the TCP ports required for setting up a Swarm may not be reachable between different regions (or cloud vendors) over the Internet.
As far as sharing data is concerned, you have several options to choose from. For example, you can use Azure File Storage or AWS EFS/EBS in your Docker containers via volume plugin mounts, so that storage is eventually shared across managers and workers in the cluster (distributed file systems). Alternatively, you can use a volume orchestrator like the Flocker volume plugin, or set up your own NFS.
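For example, an NFS-backed named volume can be declared with Docker's built-in local driver, so each node mounts the same export when a task lands on it. A minimal sketch, where the server address 10.0.0.10 and the export path /exports/db-data are placeholder assumptions:

services:
  db:
    image: postgres:9.4-alpine
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
    driver: local                        # built-in driver, no plugin needed
    driver_opts:
      type: nfs                          # mount type passed to the kernel
      o: "addr=10.0.0.10,rw,nfsvers=4"   # NFS server address and mount options
      device: ":/exports/db-data"        # exported path on the server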

Related

Sharing a volume in Docker Swarm across many nodes

I'm facing a big challenge: I'm trying to run my app on 2 VPSs in Docker Swarm, and containers that use volumes should share those volumes between nodes.
The solutions I am considering are:
Use the GlusterFS plugin and mount the volume on every node using NFS. But NFS introduces a single point of failure, so when something goes wrong my data is gone (it doesn't look good; maybe I'm wrong).
Use Azure Storage and store data as blobs (Azure Data Lake Storage Gen2). But my main problem is: how can I connect to Azure Storage using docker-compose.yaml? I would have to declare the volume in every service that uses it and declare it in the volumes section, and I have no idea how to do that.
The Docker documentation about this is gone; it should be at https://docs.docker.com/docker-for-azure/persistent-data-volumes/.
Another option is https://hub.docker.com/r/docker4x/cloudstor/tags?page=1&ordering=last_updated, but its last update was 2 years ago, so it's probably not supported anymore.
Do I have any other options, and which way of sharing volumes between nodes is the best solution?
There are a number of ways of dealing with persistent volumes in Docker Swarm, none of them particularly satisfactory:
First up, a simple way is to use NFS, GlusterFS, iSCSI, or VMware to multi-mount the same SAN storage volume onto each Docker Swarm node. Services then just bind-mount paths such as /mnt/volumes/my-sql-workload.
On the one hand it's really simple; on the other hand there is literally no access control, and you can easily end up with services accidentally pointing at each other's data.
Next, there are commercial Docker volume plugins for SANs. If you are lucky enough to possess a Pure Storage, NetApp, or similar SAN array, some vendors still offer Docker volume plugins: Trident, for example, if you have a NetApp.
Third, if you are in the cloud, the legacy Swarm offerings on Azure and AWS included a built-in "cloudstor" volume driver, but you need to dig really deep to find it in their legacy documentation.
Fourth, there are a number of open-source or free volume plugins that will mount volumes from NFS, GlusterFS, or other sources, but most are abandoned or very quiet. The most active one I know of is marcelo-ochoa/docker-volume-plugins.
I wasn't particularly happy with how those plugins handled pre-existing volumes and made operations like docker volume create hard, so I made my own. But really:
Swarm cluster volume support via CSI plugins is hopefully going to land in 2021¹, which would be a solid answer to all the problems above.
¹It's now 2022 and the next version of Docker has not yet gone live with CSI support. Still we wait.
In my opinion, a good solution is to create a GlusterFS cluster, configure a single volume, and mount it on every Docker Swarm node (e.g. at /mnt/swarm-storage).
Then, for every container that needs persistent storage, bind-mount a subdirectory of the GlusterFS volume inside the container.
Example:
services:
  my-container:
    ...
    volumes:
      - type: bind
        source: /mnt/swarm-storage/my-container
        target: /a/path/inside/the/container
This way, every node shares the same storage, so a given container can be scheduled interchangeably on any cluster node.
You don't need any Docker plugin for a particular storage driver, because the distributed storage is transparent to the Swarm cluster.
Lastly, GlusterFS is a distributed filesystem designed to have no single point of failure, and you can cluster it across as many nodes as you like (contrary to NFS).
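For reference, the host-side mount on each node might look like this (the peer hostname gluster1 and the volume name swarm-vol are placeholders):

# on every Swarm node: create the mountpoint and mount the GlusterFS volume
mkdir -p /mnt/swarm-storage
mount -t glusterfs gluster1:/swarm-vol /mnt/swarm-storage

# or make it persistent across reboots via /etc/fstab:
# gluster1:/swarm-vol  /mnt/swarm-storage  glusterfs  defaults,_netdev  0 0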

Do replicated Docker containers in swarm mode contain multiple copies of data?

I have recently started learning Docker. However, when studying swarm mode I see that containers can be scaled up. What I would like to know is: once you scale a container in replicated mode, will the data within the container be replicated too, or will just fresh containers be spawned?
For example, let's say I created a MySQL service initially with only 1 replica, and I create and update tables in that MySQL container. If I later scale it to 3, will the newly spawned containers contain the same table data? And will the data be continuously replicated across the 3 Docker instances?
A replicated service will use fresh container instances per replica. Swarm does not take care of replicating persistent data stored in volumes.
Depending on the volume plugin (e.g. the local driver with remote NFS shares) you are limited to read-write-once or read-write-many. And even if your volume allows read-write-many, the replicated service might not support it; for instance, MySQL will not work if you point n replicas at the same volume. You can leverage swarm service template variables, for instance, to point your volumes to different target folders of the same NFS share.
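As a sketch of that template-variable approach (the service name, volume prefix, and password are illustrative): docker service create expands placeholders such as {{.Task.Slot}} in the --mount flag, so each replica gets its own volume:

# each replica mounts its own volume: mysql-data-1, mysql-data-2, mysql-data-3
docker service create --name mysql --replicas 3 \
  --env MYSQL_ROOT_PASSWORD=secret \
  --mount type=volume,source="mysql-data-{{.Task.Slot}}",target=/var/lib/mysql \
  mysql:8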
Also, with swarm you will want storage that is reachable from all nodes, as a container can die and be re-spawned on a different node. So you will either need a remote share based on NFS or CIFS (see the example usages for nfs and cifs), a storage cluster like Ceph or GlusterFS, or a cloud-native storage solution like Portworx. While you have to take care of HA yourself for remote-share solutions, data replication is built in for storage clusters and cloud-native storage.
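A CIFS share can be declared much like an NFS one via the built-in local driver. A sketch, where the server name, share path, and credentials are placeholder assumptions:

volumes:
  app-data:
    driver: local
    driver_opts:
      type: cifs
      o: "addr=fileserver,username=svc_docker,password=secret,vers=3.0"
      device: "//fileserver/docker-data"   # UNC path of the share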
In case a containerized service is itself cluster/replica-aware, it is usually better not to use the swarm replica mechanism, unless all instances can be started with the same set of parameters.

Mount rexray/ceph volume in multiple containers on Docker swarm

What I have done
I have built a Docker Swarm cluster where I am running containers that have persistent data. To allow a container to move to another host in the event of failure, I need resilient shared storage across the swarm. After looking into the various options, I have implemented the following:
Installed a Ceph storage cluster across all nodes of the swarm and created a RADOS Block Device (RBD): http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
Installed REX-Ray on each node and configured it to use the RBD created above: https://rexray.readthedocs.io/en/latest/user-guide/storage-providers/ceph/
Deployed a Docker stack that mounts a volume using the rexray driver, e.g.:
version: '3'
services:
  test-volume:
    image: ubuntu
    volumes:
      - test-volume:/test
volumes:
  test-volume:
    driver: rexray
This solution works, in that I can deploy a stack, simulate a failure on the node that is running it, and then observe the stack restarted on another node with no loss of persistent data.
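(For the record, one way to simulate such a failure is to drain the node running the task and watch it reschedule; the stack and node names below are illustrative:)

docker stack deploy -c docker-compose.yml test
docker node update --availability drain node2   # simulate losing node2
docker service ps test_test-volume              # task is restarted on another node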
However, I cannot mount a rexray volume in more than one container. My reason for doing so is to use a short-lived "backup container" that simply tars the volume to a snapshot backup while the first container is still running.
My Question
Can I mount my rexray volumes into a second container?
The second container only needs read access so it can tar the volume to a snapshot backup while keeping the first container running.
Unfortunately the answer is no: in this use case, rexray volumes cannot be mounted into a second container. Some information below will hopefully assist anyone heading down a similar path:
Rexray does not support multiple mounts:
Today REX-Ray was designed to actually ensure safety among many hosts that could potentially have access to the same host. This means that it forcefully restricts a single volume to only be available to one host at a time. (https://github.com/rexray/rexray/issues/343#issuecomment-198568291)
But Rexray does support a feature called pre-emption where:
..if a second host does request the volume that he is able to forcefully detach it from the original host first, and then bring it to himself. This would simulate a power-off operation of a host attached to a volume where all bits in memory on original host that have not been flushed down is lost. This would support the Swarm use case with a host that fails, and a container trying to be re-scheduled.
(https://github.com/rexray/rexray/issues/343#issuecomment-198568291)
However, pre-emption is not supported by the Ceph RBD.
(https://rexray.readthedocs.io/en/stable/user-guide/servers/libstorage/#preemption)
You could, of course, have a container that attaches the volume and then exports it via NFS on a dedicated swarm network; the client containers could then access it via NFS.
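A rough sketch of that workaround, not a tested recipe: the NFS server image, export options, and network name below are assumptions, and cap_add in stack files requires a recent Docker (20.10+):

services:
  nfs-export:
    image: erichough/nfs-server    # illustrative in-container NFS server image
    cap_add:
      - SYS_ADMIN                  # exporting NFS needs elevated privileges
    environment:
      NFS_EXPORT_0: "/data *(ro,no_subtree_check)"   # read-only export for backups
    volumes:
      - test-volume:/data          # the rexray volume, attached once, here
    networks:
      - backup-net                 # dedicated overlay network for NFS clients

networks:
  backup-net:
    driver: overlay

volumes:
  test-volume:
    driver: rexray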

Can a Docker Swarm stack distribute services evenly across all nodes?

I have:
three nodes: 1 swarm manager and 2 swarm worker nodes
an application cluster whose services are connected to each other
docker-compose.yml
services:
  service1:
    ports:
      - 8888:8888
    environment:
      - ADDITIONAL_NODES=service2:8889,service3:8890
  service2:
    ports:
      - 8889:8889
    environment:
      - ADDITIONAL_NODES=service1:8888,service3:8890
  service3:
    ports:
      - 8890:8890
    environment:
      - ADDITIONAL_NODES=service1:8888,service2:8889
If I run docker stack deploy -c docker-compose.yml server, which of the following will be the result?
1. swarm manager (service1), swarm node1 (service2), swarm node2 (service3)
2. swarm manager (service1, service2, service3), swarm node1 (service1, service2, service3), swarm node2 (service1, service2, service3)
If it is 2, how can I deploy like option 1 using Docker Swarm? I need to use Docker Swarm because I'm also using a Docker overlay network.
If it is 1, how are my services distributed? Is the distribution "even", and if so, in what sense?
Docker Swarm has its own logic for deciding which services run on which nodes. It might not be 100% what you expect, but there are smart people working on this, and the scheduler may consider things that you don't (such as CPU load, available RAM, ...).
The goal is to spread the load evenly, so it's like your example 1. If some services cannot start on a given node for some reason (for example, you use a private registry but didn't specify --with-registry-auth in stack deploy), then the services will all start on the nodes that can run them, after failing on the others.
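For reference, that flag is passed at deploy time (stack name taken from the question):

docker stack deploy --with-registry-auth -c docker-compose.yml server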
From personal experience I can tell you that it spreads tasks nicely across the swarm, but there's no guarantee of where each service ends up.
If you want to force where services run, use placement constraints.
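For example, a compose-file sketch that pins each service to a specific node by hostname (the hostnames are placeholders):

services:
  service1:
    deploy:
      placement:
        constraints:
          - node.hostname == manager1
  service2:
    deploy:
      placement:
        constraints:
          - node.hostname == worker1
  service3:
    deploy:
      placement:
        constraints:
          - node.hostname == worker2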

docker - in production - HA

How do you run Docker in production with an active/active or active/standby HA system? Are there any guides or best practices?
I am thinking of 3 scenarios:
1) NFS for two servers, which are prepped with docker-machine, mounting a shared NFS export at /var/lib/docker/ so both Docker nodes see the same files (using some sort of filer, like VNX, EFS, and so on).
2) Using DRBD to replicate a disk and mount it at /var/lib/docker/, so data is on both nodes; the active node mounts it and runs containers, and in case of failover the other node mounts it and starts the containers.
3) Using DRBD as above, but exporting an NFS server and mounting the NFS export on both nodes at /var/lib/docker/, so, as above, both nodes can mount and run containers, using Heartbeat/Pacemaker to move the virtual IP and handle DRBD switching.
What is the best practice for running Docker containers in production and making them highly available?
Regards
Persistent storage is still somewhat the elephant in the room in the container/docker world.
I wouldn't recommend any of the approaches you're suggesting. The only exception would be putting some particular data onto a shared volume (using a volume mount), but never the entire /var/lib/docker.
There is a lot going on in the container space, and there are volume plugins that integrate directly into Docker. One of the volume plugins/solutions gaining the most momentum is Flocker, which is worth looking into.
Once you've moved your data out of your containers, setting up an HA system becomes a lot easier, as the containers become more or less ephemeral.
You can then use something like Kubernetes, Docker Swarm, or Docker Datacenter to manage/monitor these containers.
