I am trying to make a centralized location for postgresql data, and use that for multiple containers on the same network of docker. For this i have to use the shared location in the volume, for example something like
docker -v 0.0.0.0\data:/var/lib/postgrsql/data
How can i specify the shared location as volume's host path and make a linkage with the container's binded folder.
Environment details:
Ubuntu 17.10
Docker 17
Any help or guidance to achieve this would be appreciated.
In Swarm you want to use a volume driver. This way 1. you don't have manual nfs mounts on each node host OS, and 2. you can ensure that the volume you want for a specific service is connected to the host it plans to run on.
The Docker Store has a list of volume plugin drivers for various storage solutions.
If you're using cloud storage or simple NFS, REX-Ray is likely the plugin you want.
Related
I have seen the terms "bind mount" and "host volume" being used in various articles but none of them mention whether they are the same thing or not. But looking at their function, it looks like they are pretty much the same thing. Can anyone answer whether it is the same thing or not? If not, what is the difference?
Ref:
Docker Docs - Use bind mounts
https://blog.logrocket.com/docker-volumes-vs-bind-mounts/
They are different concepts.
As mentioned in bind mounts:
Bind mounts have been around since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its absolute path on the host machine. By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
And as mentioned in volumes:
Volumes are the preferred mechanism for persisting data generated by
and used by Docker containers. While bind mounts are dependent on the
directory structure and OS of the host machine, volumes are completely
managed by Docker. Volumes have several advantages over bind mounts:
Volumes are easier to back up or migrate than bind mounts.
You can manage volumes using Docker CLI commands or the Docker API.
Volumes work on both Linux and Windows containers.
Volumes can be more safely shared among multiple containers.
Volume drivers let you store volumes on remote hosts or cloud providers, to encrypt the contents of volumes, or to add other functionality.
New volumes can have their content pre-populated by a container.
Volumes on Docker Desktop have much higher performance than bind mounts from Mac and Windows hosts.
A "bind mount" is when you let your container see and use a normal directory in a normal filesystem on your host. Changes made by programs running in the container will be visible in your host's filesystem.
A "volume" is a single file on your host that acts like a whole filesystem visible to the container. You can't normally see what's inside it from the host.
I was able to figure it out.
There are 3 types of storage in Docker.
1. Bind mounts-also known as host volumes.
2. Anonymous volumes.
3. Named volumes.
So bind mount = host volume. They are the same thing. "Host volume" must be a deprecating term though, as I cannot see it in Docker docs. But it can be seen in various articles published 1-2 years ago.
Examples for where it is referred to as "host volume":
https://docs.drone.io/pipeline/docker/syntax/volumes/host/
https://spin.atomicobject.com/2019/07/11/docker-volumes-explained/
This docs page here Manage data in Docker is quite helpful
Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.
Looking at multiple options to implement a shared storage for a Docker Swarm, I can see most of them require a special Docker plugin:
sshFs
CephFS
glusterFS
S3
and others
... but one thing that is not mentioned anywhere is just mounting a typical block storage to all VPS nodes running the Docker Swarm. Is this option impractical and thus not mentioned on the Internet? Am I missing something?
My idea is as follows:
Create a typical Block Storage (like e.g. one offered by DigitalOcean or Vultr).
Mount it to your VPS filesystem.
Mount a folder from that Block Storage as a volume in the Docker Container / Docker Worker with using a "local" driver.
Sounds the simplest and most obvious to me. Why people are using more complicated setups like sshFs, CephFS etc? And most importantly, is the implementation I described viable, and if so, what are the drawbacks of doing it this way?
The principal advantage of using a volume plugin over mounted storage comes down to the ability to create storage volumes dynamically, and associate them with namespaces.
i.e. with docker managing storage for a volume via a "volumes: data:" directive in a compose file, a volume will be created for each named stack that is deployed.
Using the local driver and mounts, you the swarm admin now need to ensure that no two stacks are trying to use /mnt/data.
Once you pass that hurdle, some platforms have limitations to the number of hosts a block storage can be mounted on to.
Theres also the security angle to consider - with the volume mapped like that a compromose to any service on any host can potentially expose all your data to an attacker, where a volume plugin will expose just exactly the data mounted on that container.
All that said - docker swarm is awesome and the current plugin space is lacking - if mounting block storage is what it takes to get a workable storage solution I say do it. Hopefully the CSI support will be ready before year end however.
I have a service or three that needs access to the same SMB share. The service(s) is running inside a Docker container. I think my choices are:
Have the Docker container(s) where the service(s) is running mount the SMB share itself
Have the host of the Docker container(s) mount the SMB share and then share it with the Docker container(s) where the service(s) is running
Which is better from a best practices perspective (which should probably include security as a dimension)?
Am I missing an option?
Thanks!
In standard Docker, you should almost always mount the filesystem on the host and then use a bind mount to attach the mounted filesystem to a container.
Containers can't usually call mount(2), a typical image won't contain smbclient(1) or mount.cifs(8), and safely passing credentials into a container is tricky. It will be much easier to do the mount on the host, and you can use standard images against the mounted filesystem without having to customize them to add site-specific tools.
One way is to mount the SMB shares on the host system as normal, for example if you are on Linux using mount and fstab. Afterwards you can use docker volumes to add the SMB shares, on your host system to your containers as volumes.
Advantages of using docker volumes are explained in the docker documentation.
More information about docker volumes in the docker documentation,
https://docs.docker.com/storage/volumes/
I would like to know if it is logical to use a redundant NFS/GFS share for webcontent instead of using docker volumes?
I'm trying to build a HA docker environment with the least amount of additional tooling. I would like to stick to 3 servers, each a docker swarm node.
Currently I'm looking into storage: an NFS/GFS filesystem cluster would require additional tooling for a small environment (100gb max storage). I would like to only use native docker supported configurations. So I would prefer to use volumes and share those across containers. However, those volumes are, for as far as I know, not synchronized to other swarm nodes by default.. so if the swarm node that hosts the data volume goes down it will be unavailable for each container across the swarm..
A few things, that together, should answer your question:
Volumes use a driver, and the default driver in Docker run and Swarm services is the built-in "local" driver which only supports file paths that are mounted on that host. For using shared storage with Swarm services, you'll want a 3rd party plugin driver, like REX-Ray. An official list is here: store.docker.com
What you want to look for in a volume driver is one that's "docker swarm aware" that will re-attach volumes to a new task created if old Swarm service task is killed/updated. Tools like REX-Ray are almost like a "persistent data orchestrator" that ensures volumes are attached to the proper node where they are needed.
I'm not sure what web content you're talking about, but if it's code or templates, it should be built into the image. If you're talking about user uploaded content that needs to be backed up, then yep a volume sounds like the right way.
On the host side, should all the mount points be located in the same location? Or should they reflect the locations which are inside the containers?
For example, what is the best place to mount /var/jenkins_home on the host side in order to be consistent with its Unix filesystem?
/var/jenkins_home
/srv/jenkins_home
/opt/docker-volumes/jenkins/var/jenkins_home
Other location ?
It absolutely depends on you where you want to mount the volume on the host. Just don't map it to any system file locations.
In my opinion the volumes reflecting the locations inside the container is not a great idea since you will have many containers, and all will have similar file system structure, so you will never be able to isolate container writes.
With jenkins, since the official Jenkins docker image runs with user "jenkins", it will be not a bad idea for you to create jenkins user on the host and map /home/jenkins on the host to /var/jenkins_home on the container.
Rather than using explicit host:container mounts, consider using named volumes. This has several benefits:
They can be shared easily into other containers
They are host-agnostic (if you don't have the specific mount on that machine, it will fail)
They can be managed as first-class citizens in the Docker world (docker volume)
You don't have to worry about where to put them on your host ;)