Is there any performance difference between the following Docker named volumes and bind-mounted volumes? If so, what kind of numbers are we talking about?
Docker volume example:
docker run -v mysql:/var/lib/mysql mysql:tag
Docker bind mount example:
docker run -v /path/to/mysql-data:/var/lib/mysql mysql:tag
These containers are mostly used for databases like Elasticsearch, MySQL and MongoDB. Which one should I prefer?
On a couple of platforms (macOS, Windows with WSL 2), bind mounts are known to be especially slow.
Beyond that, you shouldn't see a perceptible performance difference between named volumes, the container filesystem, files in the image (regardless of the number of layers), or bind mounts (particularly on native Linux).
A good general rule might be to use bind mounts for config files and log files, where I/O is relatively rare but you as a human need to access the files directly; named volumes for database storage and other content where I/O is relatively frequent but as a human you can't directly read the files; and the image itself for your application code.
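To make that rule concrete, a database container might combine all three, roughly like this (the host path, volume name and image tag are only placeholders; the password variable is there only so the image starts):
# named volume for the frequently written database files,
# bind mount for the config files you edit by hand,
# application code comes from the image itself
docker run -d \
  -e MYSQL_ROOT_PASSWORD=change-me \
  -v mysql_data:/var/lib/mysql \
  -v /srv/mysql/conf.d:/etc/mysql/conf.d \
  mysql:8.0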
Related
We use containers to provision storage on our storage nodes, but I can't for the life of me figure out a way to mount a device to the bare metal OS from a container. Both bare metal and containers are running Oracle Linux 7.5.
We cannot use SSH in any form for this. This is an isolated compute environment and the only access is through the orchestration we use to manage containers.
I'm mainly a Solaris guy, so I'm wondering if there is any Linux magic I can work here.
I can mount any bare metal devices or filesystems into the container, and I can run the container in privileged mode.
Thx for any help
* clarification *
This is not about mounting a volume into a container.
This container is a temporary provisioning container, i.e. it does stuff like mount iSCSI volumes, create volume groups, create logical volumes and make filesystems.
This part is all working fine.
The last step this container needs to do is somehow tell the BARE METAL OPERATING SYSTEM TO MOUNT A DEVICE INTO ITS FILESYSTEM. NOT IN THE CONTAINER.
Simplistic example: I need this container to somehow tell the OS to "mount /dev/sdg /data".
This mount does not need to be available to the container. The container is destroyed once it allocates the storage and mounts it.
And we can't use SSH for this.
There are several problems you need to overcome.
1. By default, Docker does not have access to block devices on the host.
2. A Docker container is unable to modify its own mount namespace.
3. A Docker container runs in a private mount namespace, so even after solving (1) and (2), any mounts you make inside the container will not be visible from the host.
Fortunately, there are solutions to all of the above!
We can solve (1) and (2) by passing the --privileged flag to docker run. This removes all the restrictions that Docker normally places on a container.
For solving (3), we need to use the --mount option instead of the -v option, since we need to modify the style of mount propagation used. Reading through the documentation on bind mounts, we see that the --mount option supports the following options:
The type of the mount, which can be bind, volume, or tmpfs. This topic discusses bind mounts, so the type will always be bind.
The source of the mount. For bind mounts, this is the path to the file or directory on the Docker daemon host. May be specified as source or src.
The destination takes as its value the path where the file or directory will be mounted in the container. May be specified as destination, dst, or target.
The readonly option, if present, causes the bind mount to be mounted into the container as read-only.
The bind-propagation option, if present, changes the bind propagation. May be one of rprivate, private, rshared, shared, rslave, slave.
The consistency option, if present, may be one of consistent, delegated, or cached. This setting only applies to Docker for Mac, and is ignored on all other platforms.
The one we care about is the bind-propagation option. The possible values are described later on in the same document. Reading through them, we probably want rshared.
Armed with this knowledge, I can run:
docker run -it \
--mount type=bind,source=/,dst=/host,bind-propagation=rshared \
--privileged alpine sh
Then inside the container I can run, for example:
mount /dev/sdd1 /host/mnt
And on the host I see the contents of /dev/sdd1 mounted on /mnt. The mount will persist after the container exits.
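For completeness, you can verify and later undo that mount from the host side with ordinary tools (same /dev/sdd1 example as above):
# on the host: confirm the mount made from inside the container is visible
findmnt /mnt
# unmount it again once the storage is no longer needed
umount /mnt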
I created two Docker containers with Compose on Docker for Windows, using WordPress and MariaDB. I've created a volume for WordPress that points to my PC's normal filesystem, but MariaDB's is still contained within Hyper-V's virtual hard disk.
The mount point is at /var/lib/docker/volumes/1995...ca3/_data
I've tried looking at previous answers, but the link that would explain how to back up, copy, or restore volumes redirects to a general volume explanation. Most plugins or scripts I've seen for Docker typically refer to a *nix environment.
Would anyone know of a modern method to export and import volumes mounted to Linux containers in Docker for Windows?
The way I normally do this is to start a container that mounts two volumes, the source volume and the destination volume, and run a command in that container that copies the contents of one volume to the other. I don't have a copy of Windows at hand to find out how to copy all files recursively, but I'm sure it can do it quite easily.
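Since the volumes belong to the Linux containers, the copy itself can be done by a small Linux helper container rather than from the Windows side. A rough sketch, with source-vol and dest-vol standing in for your real volume names:
docker volume create dest-vol
# throwaway container that mounts both volumes and copies everything across
docker run --rm -v source-vol:/from -v dest-vol:/to alpine sh -c "cp -a /from/. /to/"
The resulting dest-vol can then be attached to a new container, or tarred up the same way for export.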
I want to know why we have two different options to do the same thing. What are the differences between the two?
We basically have 3 types of volumes or mounts for persistent data:
Bind mounts
Named volumes
Volumes in Dockerfiles
Bind mounts are basically just binding a certain directory or file from the host inside the container (docker run -v /hostdir:/containerdir IMAGE_NAME)
Named volumes are volumes which you create manually with docker volume create VOLUME_NAME. They are created in /var/lib/docker/volumes and can be referenced by just their name. Let's say you create a volume called "mysql_data"; you can then reference it like this: docker run -v mysql_data:/containerdir IMAGE_NAME.
And then there are volumes in Dockerfiles, which are created by the VOLUME instruction. These volumes are also created under /var/lib/docker/volumes but don't have a particular name; their "name" is just some kind of hash. The volume gets created when the container is run, and it is handy for saving persistent data whether or not you start the container with -v. The developer gets to say where the important data is and what should be persistent.
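As a small illustration of that third kind, a couple of lines in a Dockerfile are enough; the image and path here are made up for the example:
FROM alpine
# anything written under /appdata ends up in an automatically created (anonymous) volume,
# even if the container is started without any -v flag
VOLUME /appdata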
What should I use?
What you want to use comes mostly down to either preference or your management. If you want to keep everything in the "docker area" (/var/lib/docker) you can use volumes. If you want to keep your own directory-structure, you can use binds.
Docker recommends the use of volumes over the use of binds, as volumes are created and managed by Docker, and binds have a lot more potential for failure (also due to layer 8 problems).
If you use binds and want to transfer your containers/applications to another host, you have to rebuild your directory structure, whereas volumes are more uniform on every host.
Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. While bind mounts are dependent on the directory structure of the host machine, volumes are completely managed by Docker. Volumes are often a better choice than persisting data in a container’s writable layer, because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.
Differences between -v and --mount behavior
Because the -v and --volume flags have been a part of Docker for a long time, their behavior cannot be changed. This means that there is one behavior that is different between -v and --mount.
If you use -v or --volume to bind-mount a file or directory that does not yet exist on the Docker host, -v creates the endpoint for you. It is always created as a directory.
If you use --mount to bind-mount a file or directory that does not yet exist on the Docker host, Docker does not automatically create it for you, but generates an error.
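A quick way to see that difference for yourself, assuming the /tmp paths below do not yet exist on the host:
# -v silently creates the missing host directory
docker run --rm -v /tmp/not-there-yet:/data alpine ls /data
# --mount refuses and prints an error instead of creating it
docker run --rm --mount type=bind,source=/tmp/not-there-either,target=/data alpine ls /data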
Docker for Windows shared folders limitation
Docker for Windows does make much of the VM transparent to the Windows host, but it is still a virtual machine. For instance, when using -v with a mongo container, MongoDB relies on filesystem features that the Windows shared-folder implementation does not support. There is also this issue about volume mounts being extremely slow.
Bind mounts are like a superset of Volumes (named or unnamed).
Bind mounts are created by binding an existing folder on the host system (the host system being a native Linux machine, or a VM on Windows or Mac) to a path in the container.
The volume command results in a new folder being created on the host system under /var/lib/docker.
Volumes are recommended because they are managed by the Docker engine (prune, rm, etc.).
A good use case for a bind mount is linking development folders to a path in the container; any change in the host folder will be reflected in the container.
Another use case for a bind mount is keeping application logs, which are not as crucial as a database.
Command syntax is almost the same for both cases:
bind mount:
Note that the host path should start with '/'. Use $(pwd) for convenience.
docker container run -v /host-path:/container-path image-name
unnamed volume:
Creates a folder on the host with an arbitrary name.
docker container run -v /container-path image-name
named volume:
Note that the name should not start with '/', as that is reserved for bind mounts.
'volume-name' is not a full path here; the command will cause a folder to be created at "/var/lib/docker/volumes/volume-name" on the host.
docker container run -v volume-name:/container-path image-name
A named volume can also be created before a container is run (docker volume create), but this is rarely needed.
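For completeness, pre-creating it looks like this (volume-name, the container path and image-name are placeholders, as above):
docker volume create volume-name
docker container run -v volume-name:/container-path image-name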
As developers, we always need to compare the options provided by a tool or technology. For volumes and bind mounts, I would suggest first listing what kind of application you are trying to containerize.
The following are the factors I would consider before choosing volumes over bind mounts:
Docker provides various CLI commands to manage volumes easily from outside containers (a few are sketched after this list).
For backup and restore, volumes are far easier to work with than bind mounts, which depend on the underlying host OS.
Volumes are platform-agnostic, so they work on Linux as well as Windows containers.
With bind mounts, you have two technologies to take care of: your host machine's directory structure as well as Docker.
Migration of volumes is easier, not only on local machines but on cloud machines as well.
Volumes can be easily shared among multiple containers.
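A few of the volume-management commands mentioned above, sketched with placeholder names:
docker volume create db-data       # create a managed volume
docker volume ls                   # list all volumes
docker volume inspect db-data      # show driver, mount point and labels
# share the same volume between two containers, the second one read-only
docker run -d --name writer -e MYSQL_ROOT_PASSWORD=change-me -v db-data:/var/lib/mysql mysql:8.0
docker run --rm -v db-data:/data:ro alpine ls /data
docker volume prune                # remove volumes no longer used by any container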
Let's say you are trying to dockerise a database (couchdb for example).
Then there are at least two assets you consider volumes for:
database files
log files
Let's further say you want to keep the db-files private but want to expose the log-files for later processing.
As far as I understand the documentation, you have two options:
First option
define managed volumes for both log files and db files within the db image
import these into a second container (you will get both) and work with the logs
Second option
create a data container with a managed volume for the logs
create the db image with a managed volume for the db files only
import the logs volume from the data container when running the db image
Two questions:
Are both options really valid/possible?
What is the better way to do it?
br volker
The answer to question 1 is that yes, both are valid and possible.
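For reference, the second option from the question would look roughly like this; the image names are made up, and my-couchdb-image is assumed to declare a managed volume for its db files in its Dockerfile (note that named volumes, discussed below, have largely replaced this data-container pattern):
# data container that only holds the managed volume for the logs
docker create -v /logs --name couch-logs alpine /bin/true
# database container: its own managed volume for the db files, plus the logs volume imported from the data container
docker run -d --name couch --volumes-from couch-logs my-couchdb-image
# any other container can import the same logs volume for later processing
docker run --rm --volumes-from couch-logs alpine ls /logs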
My answer to question 2 is that I would consider a different approach entirely; which one to choose depends on whether or not this is a mission-critical system where data loss must be avoided.
Mission critical
If you absolutely cannot lose your data, then I would recommend that you bind mount a reliable disk into your database container. Bind mounting is essentially mounting a part of the Docker Host filesystem into the container.
So taking the database files as an example, you could imagine these steps:
Create a reliable disk, e.g. an NFS share, that is backed up on a regular basis
Attach this disk to your Docker host
Bind mount this disk into your database container, which then writes database files to this disk.
So following the above example, let's say I have created a reliable disk that is shared over NFS and mounted on my Docker host at /reliable/disk. To use that with my database I would run the following Docker command:
docker run -d -v /reliable/disk:/data/db my-database-image
This way I know that the database files are written to reliable storage. Even if I lose my Docker Host, I will still have the database files and can easily recover by running my database container on another host that can access the NFS share.
You can do exactly the same thing for the database logs:
docker run -d -v /reliable/disk/data/db:/data/db -v /reliable/disk/logs/db:/logs/db my-database-image
Additionally you can easily bind mount these volumes into other containers for separate tasks. You may want to consider bind mounting them as read-only into other containers to protect your data:
docker run -d -v /reliable/disk/logs/db:/logs/db:ro my-log-processor
This would be my recommended approach if this is a mission critical system.
Not mission critical
If the system is not mission critical and you can tolerate a higher potential for data loss, then I would look at the Docker volume API, which is intended precisely for what you want to do: creating and managing volumes for data that should live beyond the lifecycle of a container.
The nice thing about the docker volume command is that it lets you create named volumes, and if you name them well it can be quite obvious to people what they are used for:
docker volume create db-data
docker volume create db-logs
You can then mount these volumes into your container from the command line:
docker run -d -v db-data:/db/data -v db-logs:/logs/db my-database-image
These volumes will survive beyond the lifecycle of your container and are stored on the filesystem of your Docker host. You can use:
docker volume inspect db-data
to find out where the data is being stored, and back up that location if you want to.
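You can also back up a named volume without touching that storage location directly, by letting a throwaway container tar its contents into a bind-mounted backup directory; a rough sketch with made-up paths:
docker run --rm -v db-data:/volume:ro -v /backups:/backup alpine tar czf /backup/db-data.tar.gz -C /volume .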
You may also want to look at something like Docker Compose which will allow you to declare all of this in one file and then create your entire environment through a single command.
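A minimal sketch of what that might look like in a docker-compose.yml, reusing the names above (the image name is a placeholder):
version: "3.8"
services:
  db:
    image: my-database-image
    volumes:
      - db-data:/db/data
      - db-logs:/logs/db
# named volumes managed by Docker, matching the docker volume create commands above
volumes:
  db-data:
  db-logs: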
How do you guys deal with the files of web applications in your Docker containers? We are using the same application for >400 customers; it's the same application with modules enabled or disabled (there are extra files).
I am currently using this approach: build the images, e.g. for MySQL and nginx+PHP, and then start the container with a specific prepared application folder:
docker create -v /dbdata --name dbstore x/mysql /bin/true
docker run -d --volumes-from dbstore --name db1 x/mysql
docker run -d -P --name web --link db1:db1 -v /webapp:/opt/webapp x/webapp php-start index.php
IMHO, it's overusing space.
I think it's a bit too complex to create >100 tags (revisions) of a webapp Docker data container.
Please advise how to manage this problem.
First, recent versions of Docker let you create and use named volumes. This means that "data-only containers" are antiquated and no longer necessary, and in fact are considered an anti-pattern these days. It's pretty straightforward to create and use a named volume:
docker volume create --name=foo
docker run -d -v "foo:/dbdata" --name "db1" x/mysql
You can view your volumes with:
docker volume ls
As for your main question, you could take advantage of Docker's union filesystem design (which could also more simply be called "shared layers"). What this means is that if you create two containers from the ubuntu image (e.g. docker run -d --name=one ubuntu and docker run -d --name=two ubuntu), both of those containers are going to use the same filesystem objects in the base ubuntu image. So, for example, the /etc/passwd file in both of those containers points to the same /etc/passwd data stored on disk. This is part of what is meant by the term "union filesystem" in the context of Docker.
So just take this knowledge a step further and "bake" those modules into your base image for use by all of the containers for your different customers. That just means creating your own image from a Dockerfile which uses FROM wordpress:latest at the top. Continuing with the WordPress example, if you wanted to make a bunch of WP plugins available, you could just store them in /var/www/html/wp-plugins (or whatever) and only enable certain ones in your configuration. Since they're baked into the image you have created (and you used the same image to create all of your different containers), all of those module files point to the exact same data stored on disk, via the union filesystem. Of course, if someone changes the code in one of their modules, the individual container will store those changes in its own writable layer, but the base files all come from the same data and take up no extra space. And of course, you can substitute in whichever CMS you're using.
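A sketch of what such a Dockerfile might look like; the source directory is made up for the example, and /var/www/html/wp-content/plugins is where the official WordPress image keeps plugins:
FROM wordpress:latest
# bake the shared plugin files into the image so that every customer container
# references the same layer on disk
COPY ./shared-plugins/ /var/www/html/wp-content/plugins/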
Now, where I work, I recently created a Docker-based hosting system for people to use. The issue is that we wanted each and every customer to have their own copy of the CMS filesystem. Even though the union filesystem means that changes to the base image's files would only be stored in each container's own layer, that wasn't good enough for the guy who signs my paycheck. They wanted each customer to have their own EBS volume with their own copy of the CMS filesystem on it. So in that situation, where you want each and every customer to have their own volume (for example in order to transport it for backup, or move it to a new host, etc.), you won't be able to get around the issue of using extra storage for those files.
It depends:
If the files are static and you want to be able to move the container around easily, then I keep the files in the container by just copying them into the web location as a single directory.
If you have a reliable external location and you change the files more regularly (for example by using some kind of CMS), you could just run an Apache or nginx container and mount the volume.
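For that second case, a one-liner along these lines is usually all it takes (the host path and port are placeholders; /usr/share/nginx/html is the default docroot of the official nginx image):
docker run -d -p 8080:80 -v /srv/site:/usr/share/nginx/html:ro nginx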