How to deal with files of web applications in Docker?

How do you guys deal with the files of web applications in your Docker containers? We are using the same application for >400 customers: it's the same application with modules enabled or disabled per customer (the modules are extra files).
I am currently using this approach: build the images, e.g. for MySQL and nginx+PHP, and then start the container with a specifically prepared application folder:
docker create -v /dbdata --name dbstore x/mysql /bin/true
docker run -d --volumes-from dbstore --name db1 x/mysql
docker run -d -P --name web --link db1:db1 -v /webapp:/opt/webapp x/webapp php-start index.php
IMHO, this overuses disk space.
It also seems rather complex to create >100 tags (revisions) of a webapp Docker data container.
Please advise: how should I manage this problem?

First, recent versions of Docker let you create and use named volumes. This means that "data-only containers" are antiquated and no longer necessary, and in fact are considered an anti-pattern these days. It's pretty straightforward to create and use a named volume:
docker volume create --name=foo
docker run -d -v "foo:/dbdata" --name "db1" x/mysql
You can view your volumes with:
docker volume ls
As far as your main question, you could take advantage of Docker's union filesystem (which could also more simply be called a "shared layer") design. What this means is that if you create two containers from the ubuntu image (e.g. docker run -d --name=one ubuntu and docker run -d --name=two ubuntu), both of those containers are going to use the same filesystem objects in the base ubuntu image. So, for example, the /etc/passwd file in both of those containers points to the same /etc/passwd data stored on disk. This is part of what is meant by the term "union filesystem" in the context of Docker.
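You can see this copy-on-write behavior for yourself with docker diff, which lists only the files a container has changed relative to its image (a minimal sketch; sleep infinity is just there to keep the containers running):
docker run -d --name=one ubuntu sleep infinity
docker run -d --name=two ubuntu sleep infinity
docker diff one   # empty: nothing diverges from the shared image layers yet
docker exec one sh -c 'echo demo >> /etc/passwd'
docker diff one   # now shows C /etc/passwd: the change lives only in one's writable layer
docker diff two   # still empty: the second container is unaffected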
So just take this knowledge a step further and "bake" those modules into your base image for use by all of the containers for your different customers. That just means creating your own image from a Dockerfile which uses FROM wordpress:latest at the top. Continuing with the WordPress example, if you wanted to make a bunch of WP plugins available, you could store them in /var/www/html/wp-plugins (or wherever) and only enable certain ones in your configuration. Since they're baked into the image you have created (and you used the same image to create all of your different customers' containers), all of those module files point to the same data stored on disk, via the union filesystem. If someone changes the code in one of their modules, the individual container will store the changes in its own writable layer, but the unchanged base files will all come from the same data, not taking up any extra space. Of course, you can substitute in whichever CMS you're using.
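For example, a minimal sketch of such a Dockerfile (the plugins/ directory and its contents are assumptions for illustration):
FROM wordpress:latest
# Bake the shared modules/plugins into the image so that every customer
# container reuses the same on-disk data via the union filesystem.
COPY plugins/ /var/www/html/wp-plugins/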
Now, where I work, I've recently created a Docker-based hosting system for people to use. The issue is that we wanted each and every customer to have their own copy of the CMS filesystem. Even though the union filesystem means changes to the base image's files are stored in each container's own layer, that wasn't good enough for the guy that signs my paycheck. They wanted each customer to have their own EBS volume with their own copy of the CMS filesystem on it. So in that situation, where you want each and every customer to have their own volume (for example in order to transport them for backup, or move them to a new host, etc.), you won't be able to get around the issue of using extra storage for those files.

It depends:
If the files are static and you want to be able to move the container around easily, keep the files in the container by copying them into the web location as a single directory at build time.
If you have a reliable external location, and you change the files more regularly (for example via some kind of CMS), you can just run an apache or nginx container and mount the files as a volume; see the sketch below.
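A minimal sketch of both approaches (the image, paths, and port are assumptions for illustration):
# Option 1: bake static files into the image (Dockerfile)
FROM nginx:alpine
COPY ./site/ /usr/share/nginx/html/
# Option 2: mount the files from the external location at run time
docker run -d -p 8080:80 -v /srv/site:/usr/share/nginx/html:ro nginx:alpine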

Related

Can I access files created in a Docker container from my local machine (e.g. the C drive or the Desktop folder)?

I have Windows 10 and an SSD (e.g. a Samsung 256 GB SSD).
If I create a Docker Ubuntu container, go somewhere inside it (e.g. /home/myname) and create a test.txt containing "hello world", it will be at /home/myname/test.txt.
Since test.txt has its own size (say 8 KB), I think it must take up room on the Samsung SSD.
I can access test.txt using docker attach, and I also know how to mount with the -v option so that I can change or update that file (I know it is just duplicated from the container).
But I want to see or access test.txt from my Windows 10 C drive or desktop, or find it with Windows 10's search function, to understand where test.txt lives on my Samsung SSD.
Sorry for my limited English and basic computing knowledge.
The following comes from https://docs.docker.com/storage/, but it is not enough for me:
By default all files created inside a container are stored on a writable container layer. This means that:
The data doesn’t persist when that container no longer exists, and it can be difficult to get the data out of the container if another process needs it.
A container’s writable layer is tightly coupled to the host machine where the container is running. You can’t easily move the data somewhere else.
Writing into a container’s writable layer requires a storage driver to manage the filesystem. The storage driver provides a union filesystem, using the Linux kernel. This extra abstraction reduces performance as compared to using data volumes, which write directly to the host filesystem.
Docker has two options for containers to store files in the host machine, so that the files are persisted even after the container stops: volumes, and bind mounts. If you’re running Docker on Linux you can also use a tmpfs mount. If you’re running Docker on Windows you can also use a named pipe.
Keep reading for more information about these two ways of persisting data.
Try the suggestions here:
https://stackoverflow.com/a/27320731/13064727
I think this is a two-step process; maybe you are missing the first step. It seems you don't understand how -v works:
$ docker run -ti --rm -v "<your_windows_path>:/apps" -w /apps ubuntu bash
root@b2fd40f5f423:/apps# echo "helloworld" > test.txt
-w /apps (WORKDIR) makes sure that the file you create inside the container ends up under the path that is mapped to your Windows path.
From your Windows system, you should then be able to find this file under <your_windows_path> on your local disk or SSD.
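For example, if your files lived in a hypothetical folder C:\Users\me\apps, the same command would be:
docker run -ti --rm -v "C:\Users\me\apps:/apps" -w /apps ubuntu bash
Anything written to /apps inside the container then shows up in C:\Users\me\apps on the Windows side, where Explorer and Windows search can see it.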

How to restore docker postgres container?

I am a total newbie with Docker. Unfortunately, I made a change - I set a new environment variable from the GUI - and it astonishingly caused the container to be re-created! All PostgreSQL DBs have been lost.
So, two questions:
Why did it happen?
Is there a way to roll back? (There were no backups or anything like that.)
There are a fairly broad set of changes that require deleting and recreating containers. As you've discovered, this includes changing environment variables; it also includes published ports, host-mapped directories, and changing the image underneath the container. In turn, the image will change if there's ever any sort of security update, software patch release, or just a new application build.
In short: deleting Docker containers is very common and you need to make sure the data gets preserved.
The standard way to do this is to mount some additional storage into the container. Docker provides a named volume system, but named volumes can be opaque and hard to manage; it's often easier to bind mount a host directory. (N.B.: the linked documentation advocates for named volumes; IME host directories are easier to inspect and manage with readily-available non-Docker tools.) You need to look at each image's documentation to know where to attach the storage, but for the standard postgres image it is /var/lib/postgresql/data (see "Where To Store Data" at the end of the linked page). In plain Docker you could run
docker run \
-d \
-p 5432:5432 \
-v "$PWD/postgres:/var/lib/postgresql/data" \
postgres:11
but there's presumably a setting for that in your GUI tool.
Your previous data is probably lost. Docker doesn't keep snapshots of containers, and deleting a container actually deletes it and its underlying data. You still need to do things like take backups of your data in case Docker or some other part of your system fails.
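As a minimal sketch of such a backup (the container name db1 and database name mydb are assumptions for illustration), you could periodically dump the database from outside the container:
docker exec db1 pg_dump -U postgres mydb > backup_mydb.sql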

Image of a data volume using docker

I am very interested in reproducible data science work. To that end, I am now exploring Docker as a platform which enables bundling code, data, and environment settings. My first simple attempt is a Docker image which contains the data it needs (link).
However, this is only the first step; in this example, the data is part of the image, and thus when the image is loaded into a container, the data is already there. My next objective is to decouple the code of the analysis from the data. As far as I understand, that would mean having two containers, one with the code (code) and one with the data (data).
For the code I use a simple Dockerfile:
FROM continuumio/miniconda3
RUN conda install ipython
and for the data:
FROM atlassian/ubuntu-minimal
COPY data.csv /tmp
where data.csv is a data file I'm copying to the image.
After building these two images I can run them as described in this solution:
docker run -i -t --name code --net=data-testing --net-alias=code drorata/minimal-python /bin/bash
docker run -i -t --name data --net=data-testing --net-alias=data drorata/data-image /bin/bash
after first creating a network: docker network create data-testing
After these steps I can ping one container from the other, and probably also access data.csv from code. But I have the feeling this is a suboptimal solution and cannot be considered good practice.
What is considered a good practice to have a container that can access data? I read a little about data volumes but I don't understand how to utilize them and how to turn them into images.
The use of a container as data storage is largely considered outdated and deprecated at this point; you should be using data volumes instead.
But a data volume is not something that you can turn into an image, and really, there is no need for this.
If you want to deliver a .csv file to someone and let them use it in their Docker container, just give them the .csv file.
The easiest way to get the file into the container and be able to use it is with a host-mounted volume.
Using the -v flag on docker run, you can specify a local folder or file to be mounted into the Docker container.
Say, for example, your Docker image expects to find a file at /data/input.csv. When you call docker run and want to provide your own input.csv file, you would do something like:
docker run -v /my/file/path/input.csv:/data/input.csv my-image
I'm not providing all of the options from your example, just illustrating the -v flag. This will take your local filesystem's input.csv and mount it into the Docker container; now your container will be able to use your copy of that data.
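If the container only needs to read the data, you can also append :ro to make the mount read-only:
docker run -v /my/file/path/input.csv:/data/input.csv:ro my-image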

docker volume container strategy

Let's say you are trying to dockerise a database (CouchDB, for example).
Then there are at least two assets you consider volumes for:
database files
log files
Let's further say you want to keep the db-files private but want to expose the log-files for later processing.
As far as I understand the documentation, you have two options:
First option
define managed volumes for both the log files and the db files within the db-image
import these in a second container (you will get both) and work with the logs
Second option
create data container with a managed volume for the logs
create the db-image with a managed volume for the db-files only
import the logs volume from the data container when running the db-image (see the sketch below)
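A sketch of that second option (the image names here are placeholders, not from the original setup):
docker create -v /logs/db --name logstore my-base-image /bin/true
docker run -d --volumes-from logstore my-db-image
docker run --volumes-from logstore my-log-processor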
Two questions:
Are both options really valid/possible?
What is the better way to do it?
br volker
The answer to question 1 is that, yes both are valid and possible.
My answer to question 2 is that I would consider a different approach entirely; which one to choose depends on whether this is a mission-critical system where data loss must be avoided.
Mission critical
If you absolutely cannot lose your data, then I would recommend that you bind mount a reliable disk into your database container. Bind mounting is essentially mounting a part of the Docker Host filesystem into the container.
So taking the database files as an example, you could imagine these steps:
Create a reliable disk e.g. NFS that is backed-up on a regular basis
Attach this disk to your Docker host
Bind mount this disk into your database container, which then writes database files to this disk.
So following the above example, let's say I have created a reliable disk that is shared over NFS and mounted on my Docker host at /reliable/disk. To use that with my database I would run the following Docker command:
docker run -d -v /reliable/disk:/data/db my-database-image
This way I know that the database files are written to reliable storage. Even if I lose my Docker Host, I will still have the database files and can easily recover by running my database container on another host that can access the NFS share.
You can do exactly the same thing for the database logs:
docker run -d -v /reliable/disk/data/db:/data/db -v /reliable/disk/logs/db:/logs/db my-database-image
Additionally you can easily bind mount these volumes into other containers for separate tasks. You may want to consider bind mounting them as read-only into other containers to protect your data:
docker run -d -v /reliable/disk/logs/db:/logs/db:ro my-log-processor
This would be my recommended approach if this is a mission critical system.
Not mission critical
If the system is not mission critical and you can tolerate a higher potential for data loss, then I would look at the Docker volume API, which is designed precisely for what you want to do: managing and creating volumes for data that should live beyond the lifecycle of a container.
The nice thing about the docker volume command is that it lets you create named volumes, and if you name them well it can be quite obvious to people what they are used for:
docker volume create db-data
docker volume create db-logs
You can then mount these volumes into your container from the command line:
docker run -d -v db-data:/db/data -v db-logs:/logs/db my-database-image
These volumes will survive beyond the lifecycle of your container and are stored on the filesystem of your Docker host. You can use:
docker volume inspect db-data
to find out where the data is being stored, and back up that location if you want to.
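For scripting, you can pull out just the host location with the --format flag:
docker volume inspect --format '{{ .Mountpoint }}' db-data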
You may also want to look at something like Docker Compose which will allow you to declare all of this in one file and then create your entire environment through a single command.
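For example, a minimal docker-compose.yml sketch of the named-volume setup above (the image name is a placeholder as before):
version: "3"
services:
  db:
    image: my-database-image
    volumes:
      - db-data:/db/data
      - db-logs:/logs/db
volumes:
  db-data:
  db-logs: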

Volume and data persistence

What is the best way to persist containers data with docker? I would like to be able to retain some data and be able to get them back when restarting my container. I have read this interesting post but it does not exactly answer my question.
As far as I understand, I only have one option:
docker run -v /home/host/app:/home/container/app
This will mount the host folder into the container.
Is there any other option? FYI, I don't use container linking (--link).
Using volumes is the best way of handling data which you want to keep from a container. Using the -v flag works well and you shouldn't run into issues with this.
You can also use the VOLUME instruction in the Dockerfile, which means you will not have to add any more options at run time. However, such volumes are quite tightly coupled to the specific container: you'd need to use docker start, rather than docker run, to get the data back (or, of course, pass -v pointing at the volume which was created in the past, likely somewhere under /var/lib/docker).
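A minimal sketch of the VOLUME instruction (the base image and path are placeholders):
FROM ubuntu
VOLUME /container/path/vol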
A common way of handling volumes is to create a data volume container with volumes defined by -v Then when you create your app container, use the --volumes-from flag. This will make your new container use the same volumes as the container you used the -v on (your data volume container). Of course this may seem like you're shifting the issue somewhere else.
This makes it quite simple to share volumes over multiple containers. Perhaps you have a container for your application, and another for logstash.
Create a volume container: this format of -v creates a volume directory on the host, e.g. /var/lib/docker/volumes/d3b0d5b781b7f92771b7342824c9f136c883af321a6e9fbe9740e18b93f29b69,
which is still bind mounted at /container/path/vol inside the container:
docker run -v /container/path/vol --name volbox ubuntu
I can now use this container as my volume.
docker run --volumes-from volbox --name foobox ubuntu /bin/bash
root@foobox:/# ls /container/path/vol
Now, if I distribute these two containers, they will just work: the volume will always be available to foobox, regardless of which host it is deployed to.
The snag of course comes if you don't want your storage to be in /var/lib/docker/volumes...
I suggest you take a look at some of the excellent posts by Michael Crosby, and the Docker docs:
https://docs.docker.com/userguide/dockervolumes/
