How to move a local volume onto a remote docker machine - docker

I have my local docker machine and a remote docker machine, on the cloud. My docker-compose app has a webcontainer with this config:
web:
container_name: web
restart: always
build: ./web
expose:
- "8000"
links:
- postgres:postgres
volumes:
- /usr/src/app/static
- ./data:/usr/src/app/data
env_file: .env
command: /usr/local/bin/gunicorn --workers 4 --timeout 120 --bind :8000 app:app
The important part is that second volume. I have this local folder called data with some 10GB of data in it. I made it a volume in the first place because otherwise building the container takes forever. Now that the app is production-ready, I'd like to deploy it. One problem: now my remote web container has an empty data folder mounted in it. So how do I move data from my local machine into a container on a remote docker machine? Where do I even move it to?
It seems like there are two tools for this:
docker cp which doesn't seem like it will work for remote docker machines
docker-machine scp which seems made for this, right?
I'm almost positive I need to use the second of these, but since I don't quite understand how docker machine works or where it keeps its data, I'm not sure what destination path to use:
$ dm scp -r /Users/alex/Documents/Project/data remote-machine:/usr/src/app/data
fails with error message:
scp: /usr/src/app/data: No such file or directory
Where should I be scp'ing this data in order to have it mount properly on my remote web container?

Local path vs. in-container path
Assuming you will use the same model remotely that you used locally, keep in mind that the path /usr/src/app/data is the path inside the container. When you are copying the files from one system to another, you just need to copy them from the current system to the remote system, then put them in a path where docker-compose knows how to find them, to mount into a new container.
So all you have to do is copy them from here to there, and use the same path relative to docker-compose.yml. It only knows your external volume as ./data, so if you put the directory in the same place (from docker-compose's perspective), everything should work the same.
How to copy the files
As for how to do the copy, these are just files, so it doesn't matter. scp -r should work, or make a zipfile, copy that, unzip into the correct place, etc. There are a ton of ways to copy files, so pick whatever is simplest for your case.
What exactly needs to be copied?
In the comments you expressed confusion about local vs. remote operations in docker-machine, and what else you needed to copy. Here's a bit more full of an explanation:
On your local system (which I'm assuming is your own PC or laptop), you have docker-machine installed, and you've been using that for all of this development. Completely separate from that is your new cloud instance where you would like to deploy.
To run what you have locally already, up on your cloud instance, the cloud instance will need to have the following.
The docker-compose.yml file.
As long as you plan to use docker-compose to run this, that must be available.
Your .env file.
Since you are using an environment file in this setup, it must be available or docker-compose can't make use of it.
Your web image.
You have a build parameter for this container, but not an image parameter. So currently the only thing you can do is docker-compose build web which will locally generate an image, which docker-compose then knows how to run.
Another option is to add an image parameter, with a repository:tag, such as myuser/myapp_web:1.0, and push that up to Docker Hub. Then, on your cloud instance, the image can be retrieved from Docker Hub instead of building it locally.
In that case, you can add an image parameter to the web container in docker-compose.yml, then build it and push it up.
docker-compose build web
docker-compose push web
Then on the cloud instance, you can fetch it:
docker-compose pull web
docker-compose will know to use that image because of the image parameter in docker-compose.yml (which is also present on the cloud server).
Ref: Creating a new repository on Docker Hub
Which of these options is preferable depends on how you want to manage things. Either one would work, but the "local build" option would require you copy any required source files to your cloud instance too (anything that is used during the build process).
I don't see in your question where the postgres container comes from. If you are also custom-building this one, then the same goes as for web. If you are using a public image for this, then you shouldn't need to copy anything; docker-compose will know how to fetch it, i.e. you can do this:
docker-compose pull postgres
What about docker cp and docker-machine scp?
You mentioned docker cp and docker-machine scp in your question.
As you already determined, docker cp is not a solution here. That command is for copying files between a container and the host filesystem. It has nothing to do with copying over a network.
As far as I know, docker-machine scp is to copy files between your local host and a docker-machine-managed VM. To copy files to your cloud instance you can likely use a more generic tool like scp or sftp more easily.

Not sure as of which docker version, but contrary to the statements in the question and the #Dan_Lowe answer this works fine:
docker cp ./data container:/usr/src/app/
docker cp is a normal part of the API, so it works like any other command, even remotely.

Related

Windows Container with Sidecar for data

I am trying to setup a windows nanoserver container as a sidecar container holding the certs that I use for SSL. Because the SSL cert that I need changes in each environment, I need to be able to change the sidecar container (i.e. dev-cert container, prod-cert container, etc) at startup time. I have worked out the configuration problems, but am having trouble using the same pattern that I use for Linux containers.
On linux containers, I simply copy my files into a container and use the VOLUMES step to export my volume. Then, on my main application container, I can use volumes_from to import the volume from the sidecar.
I have tried to follow that same pattern with nanoserver and cannot get working. Here is my dockerfile:
# Building stage
FROM microsoft/nanoserver
RUN mkdir c:\\certs
COPY . .
VOLUME c:/certs
The container builds just fine, but I get the following error when I try and run it. The dockerfile documentation says the following:
Volumes on Windows-based containers: When using Windows-based
containers, the destination of a volume inside the container must be
one of:
a non-existing or empty directory
a drive other than C:
so I thought, easy, I will just switch to the D drive (because I don't want to export an empty directory like #1 requires). I made the following changes:
# Building stage
FROM microsoft/windowservercore as build
VOLUME ["d:"]
WORKDIR c:/certs
COPY . .
RUN copy c:\certs d:
and this container actually started properly. However, I missed in the docs where is says:
Changing the volume from within the Dockerfile: If any build steps
change the data within the volume after it has been declared, those
changes will be discarded.
so, when I checked, I didn't have any files in the d:\certs directory.
So how can you mount a drive for external use in a windows container if, #1 the directory must be empty to make a VOLUME on the c drive in the container, and use must use VOLUME to create a d drive, which is pointless because anything put in there will not be in the final container?
Unfortunately you cannot use Windows containers volumes in this way. Also this limitation is the reason why using database containers (like microsoft/mssql-server-windows-developer) is a real pain. You cannot create volume on non-empty database folder and as a result you cannot restore databases after container re-creation.
As for your use case, I would suggest you to utilize a reverse proxy (like Nginx for example).
You create another container with Nginx server and certificates inside. Then you let it handle all incoming HTTPS requests, terminate SSL/TLS and then pass request to inner application container using plain HTTP protocol.
With such deployment you don't have to copy and install HTTPS certificates to all application containers. There is only one place where you store certificates and you can change dev/test/etc certificates just by using different Nginx image versions (or by binding certificate folder using volume).
UPDATE:
Also if you still want to use sidecar container you can try one small hack. So basically you will move this operation
COPY . .
from build time to runtime (after container starts).
Something like this:
FROM microsoft/nanoserver
RUN mkdir c:\\certs_in
RUN mkdir c:\\certs_out
VOLUME c:/certs_out
CMD copy "C:\certs_in" *.* "D:\certs_out"

How to keep changes inside a container on the host after a docker build?

I have a docker-compose dev stack. When I run, docker-compose up --build, the container will be built and it will execute
Dockerfile:
RUN composer install --quiet
That command will write a bunch of files inside the ./vendor/ directory, which is then only available inside the container, as expected. The also existing vendor/ on the host is not touched and, hence, out of date.
Since I use that container for development and want my changes to be available, I mount the current directory inside the container as a volume:
docker-compose.yml:
my-app:
volumes:
- ./:/var/www/myapp/
This loads an outdated vendor directory into my container; forcing me to rerun composer install either on the host or inside the container in order to have the up to date version.
I wonder how I could manage my docker-compose stack differently, so that the changes during the docker build on the current folder are also persisted on the host directory and I don't have to run the command twice.
I do want to keep the vendor folder mounted, as some vendors are my own and I like being able to modifiy them in my current project. So only mounting the folders I need to run my application would not be the best solution.
I am looking for a way to tell docker-compose: Write all the stuff inside the container back to the host before adding the volume.
You can run a short side container after docker-compose build:
docker run --rm -v /vendor:/target my-app cp -a vendor/. /target/.
The cp could also be something more efficient like an rsync. Then after that container exits, you do your docker-compose up which mounts /vendor from the host.
Write all the stuff inside the container back to the host before adding the volume.
There isn't any way to do this directly, but there are a few options to do it as a second command.
as already suggested you can run a container and copy or rsync the files
use docker cp to copy the files out of a container (without using a volume)
use a tool like dobi (disclaimer: dobi is my own project) to automate these tasks. You can use one image to update vendor, and another image to run the application. That way updates are done on the host, but can be built into the final image. dobi takes care of skipping unnecessary operations when the artifact is still fresh (based on modified time of files or resources), so you never run unnecessary operations.

Appropriate use of Volumes - to push files into container?

I was reading Project Atomic's guidance for images which states that the 2 main use cases for using a volume are:-
sharing data between containers
when writing large files to disk
I have neither of these use cases in my example using an Nginx image. I intended to mount a host directory as a volume in the path of the Nginx docroot in the container. This is so that I can push changes to a website's contents into the host rather then addressing the container. I feel it is easier to use this approach since I can - for example - just add my ssh key once to the host.
My question is, is this an appropriate use of a data volume and if not can anyone suggest an alternative approach to updating data inside a container?
One of the primary reasons for using Docker is to isolate your app from the server. This means you can run your container anywhere and get the same result. This is my main use case for it.
If you look at it from that point of view, having your container depend on files on the host machine for a deployed environment is counterproductive- running the same container on a different machine may result in different output.
If you do NOT care about that, and are just using docker to simplify the installation of nginx, then yes you can just use a volume from the host system.
Think about this though...
#Dockerfile
FROM nginx
ADD . /myfiles
#docker-compose.yml
web:
build: .
You could then use docker-machine to connect to your remote server and deploy a new version of your software with easy commands
docker-compose build
docker-compose up -d
even better, you could do
docker build -t me/myapp .
docker push me/myapp
and then deploy with
docker pull
docker run
There's a number of ways to achieve updating data in containers. Host volumes are a valid approach and probably the simplest way to achieve making your data available.
You can also copy files into and out of a container from the host. You may need to commit afterwards if you are stopping and removing the running web host container at all.
docker cp /src/www webserver:/www
You can copy files into a docker image build from your Dockerfile, which is the same process as above (copy and commit). Then restart the webserver container from the new image.
COPY /src/www /www
But I think the host volume is a good choice.
docker run -v /src/www:/www webserver command
Docker data containers are also an option for mounted volumes but they don't solve your immediate problem of copying data into your data container.
If you ever find yourself thinking "I need to ssh into this container", you are probably doing it wrong.
Not sure if I fully understand your request. But why you need do that to push files into Nginx container.
Manage volume in separate docker container, that's my suggestion and recommend by Docker.io
Data volumes
A data volume is a specially-designated directory within one or more containers that bypasses the Union File System. Data volumes provide several useful features for persistent or shared data:
Volumes are initialized when a container is created. If the container’s base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization.
Data volumes can be shared and reused among containers.
Changes to a data volume are made directly.
Changes to a data volume will not be included when you update an image.
Data volumes persist even if the container itself is deleted.
refer: Manage data in containers
As said, one of the main reasons to use docker is to achieve always the same result. A best practice is to use a data only container.
With docker inspect <container_name> you can know the path of the volume on the host and update data manually, but this is not recommended;
or you can retrieve data from an external source, like a git repository

Is there a way to replicate pwd in a volume mount for docker in a boot2docker context?

So currently I can do: docker -v .:/usr/src/app or even specify it in my docker-compose.yml:
web:
volumes:
- .:/usr/src/app
But when I attempt to define this in my Dockerfile:
VOLUME .:/usr/src/app
It doesn't mount anything.
Now I understand the complexities in that I'm using OSX and so I have to virtualize the environment to run Docker via boot2docker, and that boot2docker solves the copy issue by mounting /User to the linux machine running Docker.
The documentation wants me to be explicit, but since my explicitness would require me to name my user (in this case /User/krainboltgreene/code/krainboltgreene/blankrails) it seems non-idiomatic, as that obviously doesn't work on other people's environments.
What's the solution for this? I mean, I can technically get this all working without (as noted above the CLI and compose works fine), but it means not being able to do project specific provisioning (bower install, npm install, vulcanize, etc).
You can't specify a host directory for a volume inside a Dockerfile, because of the portability reasons you mention (not everyone will have the same directories and there are security issues regarding mounting sensitive files).
If you instead do:
VOLUME /usr/src/app
Docker will automatically set up a volume at run-time for the folder, which will be mapped to a directory under /var/lib/docker/volumes.
If you want to be able to quickly make changes during development, I would suggest using COPY in the Dockerfile, but mounting local changes over the top with a volume at run-time. This has the disadvantage that if you volume mount a folder, all the contents of that directory in the container will be hidden (rather than merged).
The docker run -v .:/usr/src/app ... command as well as the docker-compose definitions are executing during runtime. Whereas the Dockerfile instructions are executed during build time.
By the way the instruction in your Dockerfile is syntactically incorrect. It should be VOLUME /usr/src/app instead.
That VOLUME keyword only defines that later during runtime this location will be stored on a volume. So all files that you add by further Dockerfile instructions or manual commits to that location are ignored and not added to the resulting image.
Now during runtime when you did not specify a volume it Docker will generate a volume for you which is empty by default.
To have your docker-compose setup working for other colleagues you could simply make the docker-compose configuration file being part of your blankrails project folder. Everybody then runs docker-compose from within that directory and your provided configuration will work.
EDIT:
I do not know exactly what you mean with project specific provisioning. But if your aim is to provide default contents for the defined volume you could do something like the following:
Add all required project files during the Dockerfile build to a /bootstrap folder on the image.
Instead of executing your app directly use a start shell script for CMD.
In that start script you can check whether the volume mounted to /usr/src/app is empty or not. When it is empty copy all the /bootstrap contents into it.
Afterwards start your app from within that script in foreground.
With that approach you can easily provide a default file set for mounted volumes. And when you re-use that volume e.g. after a container restart the container just works with the files that are on the volume without touching them again during startup. So modified files will be persisted.

How can I mount a file in a container, that isn't available before first run?

I'm trying to build a Dockerfile for a webapp that uses a file-based database. I would like to be able to mount the file from the host*
The file is in the root of the complete software install, so it's not really ideal to mount that complete dir.
Another problem is that before the first use, the database-file isn't created yet. A first time user won't have a database, but another user might. I can't 'mount' anything during a build** I believe.
It could probably work like this:
First/new database start:
Start the container (without mount).
The webapp creates a database.
Stop the container
subsequent starts:
Start the container using a -v to mount the file
It would be better if that extra start/stop isn't needed for a user. Even if it is, I'm still looking for a way to do this userfriendly, possibly having 2 'methods' of starting it (maybe I can define a first-boot thing in docker-compose as well as a 'normal' method?).
How can I do this in a simpel way, so that it's clear for any first time users?
* The reason is that you can copy your Dockerfile and the database file as a backup, and be up and running with just those 2 elements.
** How to mount host volumes into docker containers in Dockerfile during build
One approach that may work is:
Start the database in the build file in such a way that it has time to create the default file before exiting.
Declare a VOLUME in the Dockerfile for the file after the above instruction. This will cause the file to be copied into the volume when a container is started, assuming you don't explicitly provide a host path
Use data-containers rather than volumes. So the normal usage would be:
docker run --name data_con my_db echo "my_db data container"
docker run -d --volumes-from data_con my_db
...
The first container should exit immediately but set up the volume that is used in the second container.
I was trying to achieve something similar and managed to do it by mounting a folder, instead of the file, and creating a symlink in the Dockerfile, initially pointing to a non-existing file:
docker-compose.yml
version: '3.0'
services:
bash:
build: .
volumes:
- ./data:/data
command: ['bash']
Dockerfile
FROM bash:latest
RUN ln -s /data/.bash_history /root/.bash_history
Then you can run the container with:
docker-compose run --rm bash
With this setup, you can push an empty "data" folder into the repository for example (and exclude its content with .gitignore). In the first run, inside the container /root/.bash_history will be a "broken" symlink, pointing to a file that does not exist. When you exit the shell, bash will write the history to /root/.bash_history, which will end up in /data/.bash_history.
This is probably not the correct approach.
If you have multiple containers that are trying to share some information through the file-system, you should probably let them share some directory.
That way, the flow is simple and very hard to get wrong.
You simply mount the same directory, say /data (from the host's perspective) into all the containers that are trying to use it.
When an application starts and it can't find anything inside that directory, it can gracefully stop and exit with a code that says: "Cannot start, DB not initialized yet".
You can then configure some mechanism with a growing timeout to try and restart that container until you're successful.
On the other hand, the app that creates the DB can start and create it inside the directory or find an existing file to use.

Resources