I am playing around with ScyllaDB in Docker. To operate most efficiently in a production Docker setup, ScyllaDB needs an XFS-formatted disk.
Do you know how to create an XFS-backed container volume, from a file or a disk, on Linux and macOS?
Thanks
The best way to do that is to create a partition or LVM volume, format it as XFS in the usual way with a tool such as mkfs.xfs, and mount it on the host.
Once that is done, you can pass the mounted filesystem to your container with Docker's -v flag.
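For local experiments where no spare partition is available, a file-backed loop device is one way to try the same idea. The following is only a sketch under several assumptions: the image path, size and mount point are hypothetical arguments, it needs root and xfsprogs on a Linux host, and the scylladb/scylla image with its /var/lib/scylla data directory is assumed rather than taken from the question. On macOS it would have to run inside the Linux VM that Docker uses.

```shell
# Create a sparse image file, format it as XFS, and loop-mount it.
# Arguments: image path, size (e.g. 10G), mount point (all hypothetical).
make_xfs_mount() {
    img=$1
    size=$2
    mnt=$3
    truncate -s "$size" "$img"          # allocate a sparse backing file
    sudo mkfs.xfs -q "$img"             # format the file as XFS (needs xfsprogs)
    sudo mkdir -p "$mnt"
    sudo mount -o loop "$img" "$mnt"    # attach via a loop device and mount
}

# Start ScyllaDB with its data directory on the XFS mount.
# The image name and container path are assumptions, not from the question.
run_scylla_on() {
    mnt=$1
    docker run -d --name scylla -v "$mnt":/var/lib/scylla scylladb/scylla
}
```

For example: make_xfs_mount /srv/scylla.img 10G /mnt/scylla && run_scylla_on /mnt/scylla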
I'm using named volumes to persist data on a host machine in the cloud.
I want to back up these Docker volumes so that I can restore them after a critical incident.
I have almost decided to write a Python script that compresses the relevant directory on the host machine and pushes it to AWS S3.
But I would like to know whether there are any other approaches to this problem.
docker-volume-backup may be helpful. It allows you to back up your Docker volumes to an external location or to S3 storage.
Why use a Docker container to back up a Docker volume instead of writing your own Python script? Ideally you don't want to make backups while the volume is in use. A backup container declared in your docker-compose file can stop your application container before taking the backup, so the data is copied without hurting application performance or backup integrity.
There's also this alternative: volume-backup
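Independently of those tools, the stop, archive, restart pattern described above can be sketched with stock Docker commands. This is a minimal illustration, not how either tool is implemented: the container name, volume name and output directory are hypothetical arguments, the output directory must be an absolute host path, and the alpine image is just a convenient carrier for tar.

```shell
# Archive a named volume to a tarball while the container using it is stopped.
# Arguments: container name, volume name, absolute output directory (all hypothetical).
backup_volume() {
    container=$1
    volume=$2
    outdir=$3
    archive="$volume-$(date +%Y%m%d-%H%M%S).tar.gz"

    docker stop "$container" >/dev/null
    # Mount the volume read-only in a throwaway container and tar its contents.
    docker run --rm \
        -v "$volume":/data:ro \
        -v "$outdir":/backup \
        alpine tar czf "/backup/$archive" -C /data .
    status=$?
    # Restart the application even if the archive step failed.
    docker start "$container" >/dev/null

    # Print the archive path only on success so callers can chain on it.
    [ "$status" -eq 0 ] && echo "$outdir/$archive"
    return "$status"
}
```

Assuming the AWS CLI is configured and my-backup-bucket is a bucket you own (a hypothetical name), the result could be uploaded with: archive=$(backup_volume my-db db-data /var/backups) && aws s3 cp "$archive" s3://my-backup-bucket/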
Is there any performance difference between Docker named volumes and bind-mounted volumes? If so, how large is it?
Docker volume example:
docker run -v mysql:/var/lib/mysql mysql:tag
Docker bind mount example:
docker run -v /path/to/mysql-data:/var/lib/mysql mysql:tag
These containers are mostly used for databases such as Elasticsearch, MySQL and MongoDB. Which one should I prefer?
On a couple of platforms (macOS, Windows with WSL 2) bind mounts are known to be especially slow.
Beyond that, you shouldn't see a perceptible performance difference between named volumes, the container filesystem, files in the image (regardless of the number of layers), or bind mounts (particularly on native Linux).
A good general rule might be to use bind mounts for config files and log files, where I/O is relatively rare but you as a human need to access the files directly; named volumes for database storage and other content where I/O is relatively frequent but as a human you can't directly read the files; and the image itself for your application code.
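Since no figures are given here, one way to get numbers for your own host is to time the same write against each mount type. The sketch below is a crude sequential-write check, not a database benchmark; the function name is hypothetical, and it assumes the alpine image, whose busybox dd accepts conv=fsync and reports throughput when it finishes.

```shell
# Write 256 MiB into /data inside a throwaway container; dd reports the throughput.
# Argument: anything accepted on the left of -v (a volume name or an absolute host path).
bench_mount() {
    src=$1
    docker run --rm -v "$src":/data alpine \
        sh -c 'dd if=/dev/zero of=/data/bench.tmp bs=1M count=256 conv=fsync; rm -f /data/bench.tmp'
}
```

Running bench_mount mysql and then bench_mount /path/to/mysql-data on the same machine gives a rough side-by-side comparison of the two examples from the question.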
Let's say you are trying to dockerise a database (couchdb for example).
Then there are at least two assets you consider volumes for:
database files
log files
Let's further say you want to keep the db-files private but want to expose the log-files for later processing.
As far as I understand the documentation, you have two options:
First option
define managed volumes for both, log- and db-files within the db-image
import these in a second container (you will get both) and work with the logs
Second option
create data container with a managed volume for the logs
create the db-image with a managed volume for the db-files only
import logs-volume from data container when running db-image
Two questions:
Are both options really valid and possible?
What is the better way to do it?
br volker
The answer to question 1 is that, yes both are valid and possible.
My answer to question 2 is that I would consider a different approach entirely. Which one to choose depends on whether this is a mission-critical system where data loss must be avoided.
Mission critical
If you absolutely cannot lose your data, then I would recommend that you bind mount a reliable disk into your database container. Bind mounting is essentially mounting a part of the Docker Host filesystem into the container.
Taking the database files as an example, you could imagine these steps:
Create a reliable disk e.g. NFS that is backed-up on a regular basis
Attach this disk to your Docker host
Bind mount this disk into your database container, which then writes its database files to this disk.
Following the above example, let's say I have created a reliable disk that is shared over NFS and mounted on my Docker host at /reliable/disk. To use that with my database I would run the following Docker command:
docker run -d -v /reliable/disk:/data/db my-database-image
This way I know that the database files are written to reliable storage. Even if I lose my Docker Host, I will still have the database files and can easily recover by running my database container on another host that can access the NFS share.
You can do exactly the same thing for the database logs:
docker run -d -v /reliable/disk/data/db:/data/db -v /reliable/disk/logs/db:/logs/db my-database-image
Additionally you can easily bind mount these volumes into other containers for separate tasks. You may want to consider bind mounting them as read-only into other containers to protect your data:
docker run -d -v /reliable/disk/logs/db:/logs/db:ro my-log-processor
This would be my recommended approach if this is a mission critical system.
Not mission critical
If the system is not mission critical and you can tolerate a higher potential for data loss, then I would look at the Docker volume API, which is designed for precisely this: creating and managing volumes for data that should outlive a container.
The nice thing about the docker volume command is that it lets you create named volumes, and if you name them well it is obvious to other people what they are used for:
docker volume create db-data
docker volume create db-logs
You can then mount these volumes into your container from the command line:
docker run -d -v db-data:/db/data -v db-logs:/logs/db my-database-image
These volumes will survive beyond the lifecycle of your container and are stored on the filesystem of your Docker host. You can use:
docker volume inspect db-data
to find out where the data is being stored, and back up that location if you want to.
You may also want to look at something like Docker Compose which will allow you to declare all of this in one file and then create your entire environment through a single command.
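To illustrate the Compose suggestion, the sketch below writes a minimal docker-compose.yml that declares the same two named volumes. The image name and mount paths come from the commands above; the service name db, the function name and the target directory are hypothetical, and the file omits a version key on the assumption that Compose v2 is in use.

```shell
# Write a minimal Compose file declaring the db-data and db-logs volumes.
# Argument: target directory (hypothetical; defaults to the current directory).
write_compose_file() {
    dir=${1:-.}
    cat > "$dir/docker-compose.yml" <<'EOF'
services:
  db:
    image: my-database-image
    volumes:
      - db-data:/db/data
      - db-logs:/logs/db

volumes:
  db-data:
  db-logs:
EOF
}
```

After write_compose_file . the whole environment could then be started with docker compose up -d from the same directory.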
Normally we would run container by using the following command:
docker run -it ubuntu /bin/bash
Is there any option to specify where to run the container (like on which disk or partition)?
Do you mean where the container data/layers will be stored?
The layers are all inside /var/lib/docker/(aufs)
You can mount a different, larger or faster partition onto this folder, but that applies to the entire Docker platform. If you are careful, you can also mount a partition for one particular container.
It would be better to use the -v flag (docker run -v host-folder:mount-point), since it lets you mount specific host folders as external volumes inside the container.
Both these can help you spread data over different partitions/disks.
I am not aware of a container specific option.
However, you can bind-mount (or symlink) a particular disk or partition to '/var/lib/docker'. This puts all container storage on that partition.
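The bind-mount variant could look roughly like the sketch below. It assumes a systemd-managed Docker service and root access; the function name and the target directory are hypothetical.

```shell
# Relocate Docker's storage onto another disk by bind-mounting it over /var/lib/docker.
# Argument: directory on the target partition (hypothetical, e.g. /mnt/bigdisk/docker).
relocate_docker_storage() {
    target=$1
    sudo systemctl stop docker
    sudo mkdir -p "$target"
    sudo cp -a /var/lib/docker/. "$target"/    # keep existing images and containers
    sudo mount --bind "$target" /var/lib/docker
    sudo systemctl start docker
}
```

A bind mount made this way does not survive a reboot unless a matching entry is added to /etc/fstab.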
If you want the container storage to be on multiple partitions, LVM is an option.
You can set up a volume group that spans multiple partitions. You can then ask the Docker daemon to create a thinly provisioned logical volume in that volume group to be used as storage.
The following link provides more information : https://access.redhat.com/documentation/en/red-hat-enterprise-linux-atomic-host/7/getting-started-with-containers/chapter-7-managing-storage-with-docker-formatted-containers
Also, using a union mount like OverlayFS could be another solution : https://askubuntu.com/questions/109413/how-do-i-use-overlayfs
I would like to run a Docker instance in RAM... totally inside RAM... using tmpfs
Can it be done?
I'm not sure how docker uses filesystems as I'm too used to using kvm and xen, they both need to set up a default size before it can be used.
So how does "docker fs" work?
This can be done. If you mount /var/lib/docker on a tmpfs, Docker can use other storage backends, like OverlayFS, on top of it.
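As a rough sketch of that approach, assuming a systemd-managed Docker service and root access (the function name and the 4g default size are hypothetical): whether the chosen storage driver actually works on top of tmpfs depends on the kernel and Docker version, and everything under /var/lib/docker is lost on unmount or reboot.

```shell
# Back /var/lib/docker with RAM by mounting a tmpfs over it.
# Argument: tmpfs size (hypothetical default of 4g).
docker_in_ram() {
    size=${1:-4g}
    sudo systemctl stop docker
    sudo mount -t tmpfs -o "size=$size" tmpfs /var/lib/docker
    sudo systemctl start docker
}
```

After docker_in_ram 8g, running docker info shows which storage driver the daemon selected.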
Docker uses what it calls a "Union File System", made up of multiple read-only layers with a copy-on-write layer on top (see http://docs.docker.com/terms/layer/). It can use one of several storage drivers for this (in order of preference): AUFS, BTRFS, devicemapper, overlayfs or VFS. Because of this, no, I don't think you will be able to use tmpfs.
More information at https://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/