How to place files on shared volume from within Dockerfile? - docker

I have a Dockerfile which builds an image that provides a complicated tool-chain environment for compiling a project on a volume mounted from the host machine's file system. Another reason for this setup is that I don't have a lot of space in the image.
The Dockerfile builds my tool-chain in the OS image and then prepares the source by downloading packages to be placed on the host's shared volume. Normally I'd then log into the container and execute the commands to build. And this is the problem: I can download the source in the Dockerfile, but how would I then get it onto the shared volume?
Basically I have ...
ADD http://.../file mydir
VOLUME /home/me/mydir
But then of course, I get the error "cannot mount volume over existing file ..."
Am I going about this wrong?

You're going about it wrong, but you already suspected that.
If you want the source files to reside on the host filesystem, get rid of the VOLUME directive in your Dockerfile, and don't try to download the source files at build time. This is something you want to do at run time. You probably want to provision your image with a pair of scripts:
One that downloads the files to a specific location, say /build.
Another that actually runs the build process.
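For illustration, a minimal Dockerfile along these lines might look like the following; the base image, packages, and script contents are just placeholders for whatever your tool-chain actually needs:
FROM debian:stable
RUN apt-get update && apt-get install -y build-essential curl
# the two helper scripts mentioned above, placed on the PATH
COPY fetch-sources build-sources /usr/local/bin/
RUN chmod +x /usr/local/bin/fetch-sources /usr/local/bin/build-sources
WORKDIR /build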
With these in place, you could first download the source files to a location on the host filesystem, as in:
docker run -v /path/on/my/host:/build myimage fetch-sources
And then you can build them by running:
docker run -v /path/on/my/host:/build myimage build-sources
With this model:
You're no longer trying to muck about with volumes during the image build process. That is almost never what you want, since data stored in a volume is explicitly excluded from the image, and the build process doesn't permit you to conveniently mount host directories inside the container.
You are able to download the files into a persistent location on the host, where they will be available to you for editing, or re-building, or whatever.
You can run the build process multiple times without needing to re-download the source files every time.
I think this will do pretty much what you want, but if it doesn't meet your needs, or if something is unclear, let me know.

Related

Docker container bind host directory

I wish I could update the settings of the application by changing the local profile.
I use "volume" to bind a local directory, for example:
docker run -v D:\test:/app
But when the container is running, all files in /app are emptied, because D:\test does not have any files.
Is there any way I can achieve my goal?
Your question is a bit unclear. I guess your problem is the following: you want to bind mount your app directory, but it is initially empty and will stay empty, since the bind mount overwrites everything put into /app during the build.
I usually use two different ways:
Put your profile into the host directory D:\test (if applicable). This is also a viable strategy for e.g. the source code of nodejs apps.
During build, put your profile into /app_temp. Then create an entry point which moves /app_temp into /app. If you want to persist the profile through multiple build/run phases, it has to be inside the build context (which is likely not D:\test) on your host.
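A rough sketch of that second option, assuming the profile lives in a profile/ directory inside the build context and the entrypoint simply copies it into the bind-mounted /app before starting the real command. In the Dockerfile (only the relevant lines):
COPY profile/ /app_temp/
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
And entrypoint.sh:
#!/bin/sh
# copy the baked-in defaults into the (possibly empty) bind mount, then run whatever was passed as CMD
cp -r /app_temp/. /app/
exec "$@"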
You need to change a bit the way your application is organized. Put all the settings in their own directory and have the application read them from there. Then you can map only the settings folder to the host one.
Another option is to map the host folder to a temporary folder inside the container and have the ENTRYPOINT script update your files (by copying them over) and then run your application.
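For the first suggestion, the mapping then only covers the settings directory; the paths and image name below are made up:
docker run -v D:\test\settings:/app/settings myimage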
Docker was not meant to be used for the workflow you are trying to set up, and for this reason you need to do some extra work.
because D:\test does not have any files.
That's the way it works. The volume type you are using is a bind mount, i.e. you mount a file system, using a mount point mapped to a host directory.
According to documentation:
With bind mounts, we control the exact mountpoint on the host. We can use this to persist data, but it’s often used to provide additional data into containers.
You have two options here (both imply host data should exist in advance):
Bind to a folder containing the configuration data, as you showed.
It is also possible to bind only a file:
docker run -v D:\test\config.json:/app/config.json
When binding to a file that does not exist beforehand, the Docker daemon would think it is a directory and will create a directory, both in the container and on the host.
you mount a file system, using a mount point mapped to a host directory
Hence, if the host directory is empty, the mounted file system will also be empty.

Why shouldn't our work inside the container modify the content of the container itself?

I am reading an article related to docker images and containers.
It says that a container is an instance of an image. Fair enough. It also says that whenever you make some changes to a container, you should create an image of it which can be used later.
But at the same time it says:
Your work inside a container shouldn’t modify the container. Like previously mentioned, files that you need to save past the end of a container’s life should be kept in a shared folder. Modifying the contents of a running container eliminates the benefits Docker provides. Because one container might be different from another, suddenly your guarantee that every container will work in every situation is gone.
What I want to know is: what is the problem with modifying a container's contents? Isn't this what containers are for, where we make our own changes and then create an image which will work every time? Even if we are talking about modifying the container's content itself, and not just adding additional packages, how will it harm anything, since the image created from this container will also have these changes, and other containers created from that image will inherit those changes too?
Treat the container filesystem as ephemeral. You can modify it all you want, but when you delete it, the changes you have made are gone.
This is based on a union filesystem, the most popular/recommended being overlay2 in current releases. The overlay filesystem merges together multiple lower layers of the image with an upper layer of the container. Reads will be performed through those layers until a match is found, either in the container or in the image filesystem. Writes and deletes are only performed in the container layer.
So if you install packages, and make other changes, when the container is deleted and recreated from the same image, you are back to the original image state without any of your changes, including a new/empty container layer in the overlay filesystem.
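You can see this for yourself with a quick experiment (container and file names here are arbitrary):
docker run -d --name demo alpine sleep 300
docker exec demo touch /i-was-here
docker rm -f demo
docker run --rm alpine ls /i-was-here    # fails: the file only ever existed in the deleted container's layer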
From a software development workflow, you want to package and release your changes to the application binaries and dependencies as new images, and those images should be created with a Dockerfile. Persistent data should be stored in a volume. Configuration should be injected as either a file, environment variable, or CLI parameter. And temp files should ideally be written to a tmpfs unless those files are large. When done this way, it's even possible to make the root FS of a container read-only, eliminating a large portion of attacks that rely on injecting code to run inside of the container filesystem.
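As a sketch of that last point, a run command combining those pieces might look like this (the names are illustrative, not a recommendation for any particular image):
docker run --read-only --tmpfs /tmp -v appdata:/var/lib/myapp -e APP_ENV=prod myimage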
The standard Docker workflow has two parts.
First you build an image:
Check out the relevant source tree from your source control system of choice.
If necessary, run some sort of ahead-of-time build process (compile static assets, build a Java .jar file, run Webpack, ...).
Run docker build, which uses the instructions in a Dockerfile and the content of the local source tree to produce an image.
Optionally docker push the resulting image to a Docker repository (Docker Hub, something cloud-hosted, something privately-run).
Then you run a container based off that image:
docker run the image name from the build phase. If it's not already on the local system, Docker will pull it from the repository for you.
Note that you don't need the local source tree just to run the image; having the image (or its name in a repository you can reach) is enough. Similarly, there's no "get a shell" or "start the service" in this workflow, just docker run on its own should bring everything up.
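Concretely, the whole cycle is just a handful of commands; the registry, image name, tag, and port below are placeholders:
docker build -t registry.example.com/myapp:1.0 .
docker push registry.example.com/myapp:1.0
docker run -d -p 8080:8080 registry.example.com/myapp:1.0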
(It's helpful in this sense to think of an image the same way you think of a Web browser. You don't download the Chrome source to run it, and you never "get a shell in" your Web browser; it's almost always precompiled and you don't need access to its source, or if you do, you have a real development environment to work on it.)
Now: imagine there's some critical widespread security vulnerability in some core piece of software that your application is using (OpenSSL has had a couple, for example). It's prominent enough that all of the Docker base images have already updated. If you're using this workflow, updating your application is very easy: check out the source tree, update the FROM line in the Dockerfile to something newer, rebuild, and you're done.
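In Dockerfile terms that fix is usually a one-line change followed by a rebuild; the tags here are invented purely for illustration, e.g. changing
FROM somebase:1.2.3
to
FROM somebase:1.2.4
and then re-running the build and push steps above.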
Note that none of this workflow is "make arbitrary changes in a container and commit it". When you're forced to rebuild the image on a new base, you really don't want to be in a position where the binary you're running in production is something somebody produced by manually editing a container, but they've since left the company and there's no record of what they actually did.
In short: never run docker commit. While docker exec is a useful debugging tool it shouldn't be part of your core Docker workflow, and if you're routinely running it to set up containers or are thinking of scripting it, it's better to try to move that setup into the ordinary container startup instead.

Override a volume when Building docker image from another docker image

Sorry if the question is basic, but would it be possible to build a Docker image from another one, with a different volume in the new image? My use case is the following:
Start from the image library/odoo (cfr. https://hub.docker.com/_/odoo/)
Upload folders into the volume "/mnt/extra-addons"
Build a new image, tag it, then put it in our internal image repo
How can we achieve that? I would like to avoid putting extra folders into the host filesystem.
Thanks a lot
This approach seems to work best until the Docker development team adds the capability you are looking for.
Dockerfile
# First stage: the image whose contents (including its VOLUME paths) we want to reuse
FROM percona:5.7.24 AS dbdata
MAINTAINER monkey#blackmirror.org
# Second stage: the image we actually want to build on
FROM centos:7
USER root
# Copy the entire filesystem of the first stage into this one
COPY --from=dbdata / /
Do whatever you want. This eliminates the VOLUME issue. Heck, maybe I'll write a tool to automatically do this :)
You have a few options, without involving the host OS that runs the container.
Make your own Dockerfile, inherit from the library/odoo Docker image using a FROM instruction, and COPY files into the /mnt/extra-addons directory. This still involves your host OS somewhat, but may be acceptable since you wouldn't necessarily be building the Docker image on the same host you were running it.
Make your own Dockerfile, as in (1), but use an entrypoint script to download the contents of /mnt/extra-addons at runtime. This would increase your container startup time, since the download would need to take place before running your service, but no host directories would need to be involved.
Personally I would opt for (1) if your build pipeline supports it. That would bake the addons right into the image, so the image itself would be a complete, ready-to-go build artifact.
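A sketch of option (1), assuming the add-on folders sit next to the Dockerfile in your build context:
FROM odoo
COPY ./extra-addons /mnt/extra-addons
Build and tag that image, then push it to your internal repo as usual.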

Dockerized executable read/write on host filesystem

I just dockerized an executable that reads from a file and creates a new file in the very directory that file came from.
I want to use Docker in that setup, so that I avoid installing numerous third-party libraries in the production environment.
My problem now: I have file /this/is/a.file on my underlying (host) file system and my executable is supposed to create /this/is/b.file.
As far as I see it, the only chance to get this done is by mapping a volume that points to /this/is and then letting the executable know where I mounted it inside the Docker container.
Am I right? Or is there a way that I just pass docker run mydockerizedstuff /this/is/a.file without using Docker volumes?
You're correct, you need to pass in /this/is as a volume and the executable will write to that location.
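For example, with /this/is mounted at /data inside the container (the mount point name is your choice):
docker run -v /this/is:/data mydockerizedstuff /data/a.file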
If you want to constrain the thing even more, you can pass /this/is/b.file as a volume. You need to create it (simply via touch) beforehand, otherwise Docker will consider it a directory and create it as such for you, but that way you know the executable won't be able to create /this/is/c.file or anything else.

Is it a bad idea to use docker to run a front end build process during development?

I have an angular project I'm working on containerizing. It currently has enough build tooling that a front-end developer could (and this is how it currently works) just run gulp in the project root, edit source files in src/, and the build tooling handles running traceur, templates and libsass and so forth, spitting content into a build directory. That build directory is served with a minimal server in development, and handled by nginx in production.
My goal is to build a docker-based environment that replicates this workflow. My users are developers who are on different kinds of boxes, so having build dependencies frozen in a dockerfile seems to make sense.
I've gotten close enough to this: Docker mounts the host volume, the developer edits the file on the local disk, and the gulp build watcher running in the Docker container instance rebuilds the site (and triggers livereload, etc.).
The issue I have is with wrapping my head around how filesystem layers work. Is this process of rebuilding files in the container's build/frontend directory generating a ton of extraneous saved layers? That's not something I'd really like, because I'm not wild about monotonically growing this instance as developers tweak and rebuild and tweak and rebuild. It'd only grow locally, but having to go through the "okay, time to clean up and start over" process seems tedious.
Is this process of rebuilding files in the container's build/frontend directory generating a ton of extraneous saved layers?
Nope, the only way to stack up extra layers is to commit a container with changes to a new image then use that new image to create the next container. Rinse, repeat.
Filesystem layers are saved when a container is committed to a new image (docker commit ...). When a container is running there will be a single read/write layer on top that contains all of the changes made to the container since it was created.
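If you're curious what has ended up in that read/write layer, docker diff will list it for you:
docker diff mycontainer    # A = added, C = changed, D = deleted relative to the image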
having to go through the "okay, time to clean up and start over" process seems tedious.
If you run the build container with docker run --rm ... then you'll get the cleanup for free. The build container will be created fresh from the image each time.
Also, data volumes bypass the union filesystem so there's a good chance you won't write to the container's filesystem at all.
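Putting that together, the watcher could be started along these lines; the image name is a stand-in for whatever you bake your build tooling into:
docker run --rm -v "$(pwd)":/src -w /src my-frontend-toolchain gulp watch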
