Volume and data persistence - docker

What is the best way to persist container data with docker? I would like to be able to retain some data and get it back when restarting my container. I have read this interesting post but it does not exactly answer my question.
As far as I understand, I only have one option:
docker run -v /home/host/app:/home/container/app
This will mount the host folder /home/host/app into the container at /home/container/app.
Is there any other option? FYI, I don't use linking containers (--link).

Using volumes is the best way of handling data which you want to keep from a container. Using the -v flag works well and you shouldn't run into issues with this.
You can also use the VOLUME instruction in the Dockerfile, which means you will not have to add any options at run time. However, such volumes are tightly coupled to the specific container: you'd need to use docker start rather than docker run to get the data back (or of course pass -v pointing at the volume that was created previously, likely somewhere under /var/lib/docker/volumes).
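For example, a minimal sketch of the VOLUME instruction (the path is taken from the question, the base image is a placeholder):
FROM ubuntu
# declares an anonymous volume; Docker creates and manages it at run time
VOLUME /home/container/app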
A common way of handling volumes is to create a data volume container with volumes defined by -v. Then when you create your app container, use the --volumes-from flag. This will make your new container use the same volumes as the container you used -v on (your data volume container). Of course this may seem like you're just shifting the issue somewhere else.
This makes it quite simple to share volumes over multiple containers. Perhaps you have a container for your application, and another for logstash.

Create a volume container: this form of -v creates a volume, a directory on the host such as /var/lib/docker/volumes/d3b0d5b781b7f92771b7342824c9f136c883af321a6e9fbe9740e18b93f29b69,
which is bind mounted into the container at /container/path/vol:
docker run -v /container/path/vol --name volbox ubuntu
I can now use this container as my volume:
docker run --volumes-from volbox --name foobox ubuntu /bin/bash
root@foobox:/# ls /container/path/vol
Now, if I distribute these two containers, they will just work. The volume will always be available to foobox, regardless of which host it is deployed to.
The snag of course comes if you don't want your storage to be in /var/lib/docker/volumes...
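In that case you can make the volume container's volume a bind mount instead, so the data lives under a host path you choose. A sketch, with /srv/docker-data/vol as an assumed host path:
docker run -v /srv/docker-data/vol:/container/path/vol --name volbox ubuntu
Containers started with --volumes-from volbox then see the same /container/path/vol, backed by /srv/docker-data/vol on the host.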
I suggest you take a look at some of the excellent posts by Michael Crosby, and the docker docs:
https://docs.docker.com/userguide/dockervolumes/

Related

Access files on host server from the Meteor App deployed with Meteor Up

I have a Meteor App deployed with Meteor UP to Ubuntu.
From this App I need to read a file which is located outside App container on the host server.
How can I do that?
I've tried to set up volumes in the mup.js but no luck. It seems that I'm missing how to correctly provide /host/path and /container/path
volumes: {
// passed as '-v /host/path:/container/path' to the docker run command
'/host/path': '/container/path',
'/second/host/path': '/second/container/path'
},
I've read the docs for Docker volume mounting but obviously can't make sense of it.
Let's say file is in /home/dirname/filename.csv.
How to correctly mount it into App to be able to access it from the Application?
Or maybe there are other possibilities to access it?
Welcome to Stack Overflow. Let me suggest another way of thinking about this...
In a scalable cluster, docker instances can be spun up and down as the load on the app changes. These may or may not be on the same host computer, so building a dependency on the file system of the host isn't a great idea.
You might be better to think of using a file storage mechanism such as S3, which will scale on its own, and disk storage limits won't apply.
Another option is to determine if the files could be stored in the database.
I hope that helps.
Let's try to narrow the problem down.
Meteor UP is passing the configuration parameter volumes directly on to docker, as they also mention in the comment you included. It therefore might be easier to test it against docker directly - narrowing the components involved down as much as possible:
sudo docker run \
-it \
--rm \
-v "/host/path:/container/path" \
-v "/second/host/path:/second/container/path" \
busybox \
/bin/sh
Let me explain this:
sudo because Meteor UP uses sudo to start the container. See: https://github.com/zodern/meteor-up/blob/3c7120a75c12ea12fdd5688e33574c12e158fd07/src/plugins/meteor/assets/templates/start.sh#L63
docker run we want to start a container.
-it to access the container (think of it like SSH'ing into the container).
--rm to automatically clean up - remove the container - after we're done.
-v - here we pass the volumes exactly as you defined them (I took the two-directory example you provided).
busybox - an image with some useful tools.
/bin/sh - the application to start the container with.
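Once inside the busybox shell, you can check whether the host files actually show up, e.g. (assuming the mounted host directory contains your filename.csv):
ls -la /container/path
cat /container/path/filename.csv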
I'd expect that you cannot access the files here either. In that case, dig deeper into why you can't make a folder accessible in Docker.
If you can access them (which would surprise me), start your actual container and get a shell inside it by running the following command:
docker exec -it my-mup-container /bin/sh
You can think of this command like SSH'ing into a running container. Now you can check around if it really isn't there, if the credentials inside the container are correct, etc.
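If the plain docker test works, the equivalent mup.js entry should behave the same way, since Meteor UP passes it straight through. A minimal sketch for the file from the question (the container path /data is only an illustration):
volumes: {
// host directory containing filename.csv, mapped into the container
'/home/dirname': '/data'
},
The app would then read the file from /data/filename.csv.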
Lastly, I have to agree with @mikkel that it's not a good option to mount a local directory, but you can now start looking into how to use docker volume to mount remote storage. He mentioned S3 on AWS; I've worked with AzureFiles on Azure; there are plenty of possibilities.

docker volume container strategy

Let's say you are trying to dockerise a database (couchdb for example).
Then there are at least two assets you consider volumes for:
database files
log files
Let's further say you want to keep the db-files private but want to expose the log-files for later processing.
As far as I understand the documentation, you have two options:
First option
define managed volumes for both, log- and db-files within the db-image
import these in a second container (you will get both) and work with the logs
Second option
create data container with a managed volume for the logs
create the db-image with a managed volume for the db-files only
import logs-volume from data container when running db-image
Two questions:
Are both options really valid/possible?
What is the better way to do it?
br volker
The answer to question 1 is that, yes both are valid and possible.
My answer to question 2 is that I would consider a different approach entirely; which one to choose depends on whether or not this is a mission-critical system where data loss must be avoided.
Mission critical
If you absolutely cannot lose your data, then I would recommend that you bind mount a reliable disk into your database container. Bind mounting is essentially mounting a part of the Docker Host filesystem into the container.
So taking the database files as an example, you could imagine these steps:
Create a reliable disk, e.g. an NFS share that is backed up on a regular basis
Attach this disk to your Docker host (see the mount sketch after this list)
Bind mount this disk into your database container, which then writes its database files to this disk.
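Step 2 might look like this on the Docker host; the NFS server name and export path are assumptions for illustration:
sudo mount -t nfs nfs-server:/export/reliable /reliable/disk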
So following the above example, let's say I have created a reliable disk that is shared over NFS and mounted on my Docker Host at /reliable/disk. To use that with my database I would run the following Docker command:
docker run -d -v /reliable/disk:/data/db my-database-image
This way I know that the database files are written to reliable storage. Even if I lose my Docker Host, I will still have the database files and can easily recover by running my database container on another host that can access the NFS share.
You can do exactly the same thing for the database logs:
docker run -d -v /reliable/disk/data/db:/data/db -v /reliable/disk/logs/db:/logs/db my-database-image
Additionally you can easily bind mount these volumes into other containers for separate tasks. You may want to consider bind mounting them as read-only into other containers to protect your data:
docker run -d -v /reliable/disk/logs/db:/logs/db:ro my-log-processor
This would be my recommended approach if this is a mission critical system.
Not mission critical
If the system is not mission critical and you can tolerate a higher potential for data loss, then I would look at the Docker volume API, which exists precisely for what you want to do: managing and creating volumes for data that should live beyond the lifecycle of a container.
The nice thing about the docker volume command is that it lets you create named volumes, and if you name them well it can be quite obvious to people what they are used for:
docker volume create db-data
docker volume create db-logs
You can then mount these volumes into your container from the command line:
docker run -d -v db-data:/db/data -v db-logs:/logs/db my-database-image
These volumes will survive beyond the lifecycle of your container and are stored on the filesystem of your Docker host. You can use:
docker volume inspect db-data
to find out where the data is being stored, and back up that location if you want to.
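The output is JSON and includes the mount point; with the default local driver it looks roughly like this (exact fields vary by Docker version):
[
    {
        "Name": "db-data",
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/db-data/_data"
    }
]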
You may also want to look at something like Docker Compose which will allow you to declare all of this in one file and then create your entire environment through a single command.
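A rough sketch of what that could look like in a docker-compose.yml, reusing the image name from the examples above (everything else illustrative):
version: '2'
services:
  db:
    image: my-database-image
    volumes:
      - db-data:/db/data
      - db-logs:/logs/db
volumes:
  db-data:
  db-logs:
With this in place, docker-compose up creates the named volumes and starts the container in one step.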

Docker: data volume container is not instantiated with `docker create` command

So, I'm trying to package my WordPress image in a way that all files except the uploads are persisted. In order to do so, I have created my Dockerfile which uses the official WordPress image as its base, and adds the files from an archive (containing all the WordPress files, themes, plugins, etc.), like so:
FROM wordpress
ADD archive.tar.gz /var/www/html/
Since I want the uploads to be persisted, I have created a separate data volume container, e.g. test2.com-wp-data:
docker create -v /var/www/html/wp-content/uploads --name test2.com-wp-data wordpress
Then I simply mount it via the --volumes-from flag:
docker run --name test2.com --volumes-from test2.com-wp-data -d --link test2.com-mysql:mysql myimage
However, when I inspect my newly created container, I cannot find /var/www/html/wp-content/uploads:
# docker inspect -f '{{.HostConfig.VolumesFrom}}' test2.com
[test2.com-wp-data]
# docker inspect -f '{{.Volumes}}' test2.com
map[/var/www/html:/var/lib/docker/vfs/dir/4fff1d36d5aacd0b2c73977acf8fe680bda6fd891f2c4410a90f6c2dca4aaedf]
I can see that both /var/www/html and /var/www/html/wp-content/uploads are set up as volumes in my test2.com-wp-data data container:
# docker inspect -f '{{.Config.Volumes}}' test2.com-wp-data
map[/var/www/html:map[] /var/www/html/wp-content/uploads:map[]]
I know that the wordpress image by default creates a /var/www/html volume, which I don't really mind, but does that mean that anything below that folder is ignored if mounted separately? Will I need to build my own WordPress image in order to have /var/www/html/wp-content/uploads set as a volume in my WordPress container?
Thank you very much for your time!
EDIT: I've tested a different setup with a folder that has nothing to do with /var/www/html, and the result is the same: --volumes-from is ignored.
Version 1.4+ of Docker should be what you need to get this working. Older versions of Docker don't seem to play nicely with data-only containers instantiated with "create" rather than "run".
Well, after some further testing I've realised that despite what the documentation indicates, docker create alone is not enough to get a working data volume. I've only managed to get working data volumes by instantiating them with the docker run command, as follows:
docker run --name data -v /var/www/html/wp-content/uploads mysql true
This way the container exits immediately, but if I use it to attach the data volume to another container it works as expected.
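For illustration, the app container from the question would then attach it the same way as before:
docker run --name test2.com --volumes-from data -d --link test2.com-mysql:mysql myimage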
If anyone knows any specific reason behind this behaviour, I'd be glad to learn more, especially since the documentation seems to be misleading.
Thanks!
EDIT: It turns out I was using Docker 1.3.x, which hadn't implemented this feature yet, hence why the documentation was misleading for me!

Docker data container, boot2docker, and the local file system

I'm slowly working my way through understanding current Docker practices. I'm on a Mac, and I'm using boot2docker.
I've been able to use the docker run -v local/directory:container/directory method to link a container directory to my local file system. Great, now I can easily edit things like site code in my local Mac file system and have the changes immediately available to my container (e.g. /var/www/html).
I'm now trying to separate my containers into discrete concerns. For example, a Web, Database, and File (e.g. busybox) container would be useful for a WordPress site. Thing is, I don't know how to make my file container define volumes that I can then link to my local OS (similar to the -v local/directory:container/directory method used with boot2docker).
This is probably not the most eloquent question, as I'm still fumbling through learning Docker, but if you can understand what I'm trying to achieve, I'd really appreciate any guidance provided.
Thanks!
Docker Volumes User Guide
I will use two docker containers for my simple example
marginalized_liskov and plagiarized_engelbart
Mount a Host Directory as a Data Volume (at runtime)
docker run -d -P --name marginalized_liskov -v /host/directory/context:/container/directory/context poop python server.py
marginalized_liskov is the name of the container.
poop is not only my favorite palindrome, but also the name of the image we're running in this example.
"/host/directory/context" is the location on the host that you want to mount.
"/container/directory/context" is the location in your container where the volume will be mounted.
python is of course the application to run
server.py is the argument provided to "python" for this sample.
Create a Named Volume in a container and mount another container to that volume
docker create -v /poop --name marginalized_liskov training/postgres
docker run -d --volumes-from marginalized_liskov --name plagiarized_engelbart ubuntu
This creates two containers.
marginalized_liskov gets an anonymous volume created at /poop. I built it from the postgres training image because that's what was used in the User Guide. Since we're just setting up a container to contain a data volume and not host applications, using the training/postgres image provides our functionality while remaining lean.
plagiarized_engelbart mounts the volumes from marginalized_liskov with the --volumes-from flag.
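To verify that the volume is actually shared, something like this should work (busybox used purely as a throwaway image):
docker run --rm --volumes-from marginalized_liskov busybox ls /poop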

Dynamic mount point for Shared Volume Containers

Is there any way currently to create a dynamically named volume during Docker's build process? I'd like to see something like:
sudo docker run -e MOUNT_POINT="/path/to/mount" module/sub-module
and then in the Dockerfile have something like:
ln -s /internal/path/to/storage $MOUNT_POINT
VOLUME [$MOUNT_POINT]
This would allow the highly valuable volumes-from directive to be used, while each storage container built could have a different mount point (avoiding collisions when a consumer wants to consume more than one data-volume-container).
Any ideas would be VERY welcome.
Here is how one should use volumes.
You have one container, say your application container, e.g. a database.
You have another container, say your volumes container actually holding your data.
You start your volumes container with the volumes parameter -v. Here you can name your volume dynamically.
You start your application container with the option --volumes-from using your volumes container.
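A minimal sketch of those two steps, reusing the placeholders from the question (module/sub-module and /path/to/mount are the asker's names, data-container is mine):
docker run -v /path/to/mount --name data-container module/sub-module true
docker run --volumes-from data-container --name consumer module/sub-module
The first container exits immediately (it only holds the volume); the second sees the same data at /path/to/mount.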
See the docs for detailed information: https://docs.docker.com/userguide/dockervolumes/
