Keep configuration inside Dockerfile? And is this image production safe? - docker

I've been browsing Docker Hub and I'm trying to determine the quality of builds.
I've got 2 questions:
Question 1
I came across this image: https://hub.docker.com/r/perfectweb/production/~/dockerfile/
It uses a lot of configuration rewriting inside the image. Wouldn't it be better to just copy external configuration files to the container, as described here: Separate specific configuration in Dockerfile?
Question 2
One of the most-starred images for LEMP is this one: https://hub.docker.com/r/stenote/docker-lemp/
It has a warning not to use it for production (because of the empty root password for MySQL), but I'm wondering: are there other reasons why this image is not production safe?

wouldn't it be better to just copy external configuration files to the container?
If you copied an already-modified php.ini from disk, that file could overwrite changes introduced in php.ini by a newer version of PHP.
So the current process (rewrites) allows php.ini to evolve (when a new version of PHP is installed), while keeping the rewrites visible in the Dockerfile.
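To illustrate the rewrite approach, a sketch along these lines (the package name and php.ini path are assumptions for illustration, not taken from the linked image):

    FROM debian:bookworm-slim
    RUN apt-get update && apt-get install -y --no-install-recommends php-fpm \
     && rm -rf /var/lib/apt/lists/*
    # Rewrite individual settings in the php.ini the package ships, instead of
    # copying a full php.ini over it, so new defaults from future PHP versions survive.
    RUN sed -ri 's/^;?memory_limit\s*=.*/memory_limit = 256M/' /etc/php/*/fpm/php.ini \
     && sed -ri 's/^;?upload_max_filesize\s*=.*/upload_max_filesize = 32M/' /etc/php/*/fpm/php.ini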
are there other reasons why this image is not production safe?
Another reason might be that, by default, those services are accessible over HTTP, not HTTPS.
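If you did want HTTPS with such an image, the Docker side is only part of it; something like the following would be needed (the paths and the certificate location are assumptions), plus TLS configuration inside the web server itself:

    services:
      web:
        image: stenote/docker-lemp
        ports:
          - "80:80"
          - "443:443"                   # publish the TLS port in addition to 80
        volumes:
          - ./certs:/etc/nginx/certs:ro # mount certificates read-only (path is an assumption)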

Related

Best practice for handling service configuration in docker

I want to deploy a docker application in a production environment (single host) using a docker-compose file provided by the application creator. The Docker-based solution is being used as a drop-in replacement for a monolithic binary installer.
The application ships with a default configuration but with an expectation that the administrator will want to apply moderate configuration changes.
There appear to be a few ways to apply custom configuration to the services defined in the docker-compose.yml file, but I am not sure which is considered best practice. The two I am considering at the moment are:
1. Bake the configuration into a new image. Here, I would add a build step for each service defined in the docker-compose file and create a minimal Dockerfile which uses COPY to replace the existing configuration files in the image with my custom config files (sketched below). sed and echo in CMD statements could also be used to change configuration inline without replacing the files wholesale.
2. Use a bind mount with configuration stored on the host. In this case, I would store all custom configuration files in a directory on the host machine and define bind mounts in the volumes parameter for each service in the docker-compose file.
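A minimal sketch of the first option (the image name, config path, and file names are placeholders, not taken from any actual application):

    # docker-compose.yml: build a thin derived image instead of using the vendor image directly
    services:
      app:
        build: ./app

    # app/Dockerfile
    FROM vendor/app:latest                   # the image shipped by the application creator (placeholder)
    COPY my-custom.conf /etc/app/app.conf    # overwrite the default config baked into the image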
The first option seems the cleanest to me, as the application is completely self-contained; however, I would need to rebuild the image for any further configuration change. The second option seems the easiest, as I can make configuration changes on the fly (restarting services inside the container as required).
Is there a recommended method for injecting custom configuration into Docker services?
Given your context, I think using a bind mount would be better.
A Docker image is supposed to be reusable in different contexts, and building an entire image solely for a specific configuration (i.e. environment) would defeat that purpose:
- instead of the generic configuration provided by the base image, you create an environment-specific image
- every time you need to change the configuration you'll have to rebuild the entire image, whereas with a bind mount a simple restart, or a re-read of the configuration file by the application, is sufficient
The Docker documentation recommends as much:
Dockerfile best practices:
You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.
Good use cases for bind mounts
Sharing configuration files from the host machine to containers.
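Concretely, the bind-mount variant (option 2) would look something like this in the compose file (the paths here are placeholders):

    services:
      app:
        image: vendor/app:latest
        volumes:
          # Bind-mount a host config file over the default one inside the container;
          # edit it on the host and restart the service to pick up changes.
          - ./config/app.conf:/etc/app/app.conf:ro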

Docker, update image or just use bind-mounts for website code?

I'm using Django but I guess the question is applicable to any web project.
In our case there are two types of code: Python code (run by Django) and static files (HTML/JS/CSS).
I could publish a new image whenever there is a change in any of the code.
Or I could use bind mounts for the code. (For Django, we could bind-mount the project root and the static directory.)
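For concreteness, that bind-mount variant might look like this in docker-compose.yml (the service and path names are invented for illustration):

    services:
      django:
        image: mysite/django:latest
        volumes:
          - ./src:/app               # project root, updated on the host with git pull
          - ./static:/app/static     # collected static files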
If I use bind mounts for the code, I could just update the production machine (probably with git pull) whenever the code changes.
The Docker image would then only handle updates that are not strictly our own code changes (such as library updates, or new infrastructure such as setting up Elasticsearch).
Does this approach imply any obvious drawback?
For security reasons it is advised to keep an operating system up to date with the latest security patches, but Docker images are meant to be released in an immutable fashion so that we can always reproduce production issues outside production. The OS inside the image will therefore not pick up security patches by itself, which means we need to rebuild and deploy our Docker image frequently in order to stay on the safe side.
So I would prefer to release a new Docker image containing my code and static files: they are bound to change more often, so releases are frequent anyway, and each rebuild also pulls in the latest OS security patches without having to rebuild images in production just for that purpose.
Note that I assume here that you release new code or static files at least on a weekly basis; otherwise I would still recommend rebuilding the Docker images at least once a week in order to get the latest security patches for all the software being used.
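In practice that can be a scheduled rebuild that refreshes the base layers; for example (the tag scheme is arbitrary):

    # --pull re-downloads the base image and --no-cache rebuilds every layer,
    # so OS security patches published upstream actually end up in the new image.
    docker build --pull --no-cache -t myapp:$(date +%Y-%m-%d) .
    docker push myapp:$(date +%Y-%m-%d)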
Generally, the more Docker-oriented solutions I've seen to this problem lean towards packaging the entire application in the Docker image. That especially includes application code.
I'd suggest three good reasons to do it this way:
1. If you have a reproducible path to docker build a self-contained image, anyone can build and reproduce it. That includes your developers, who can test a near-exact copy of the production system before it actually goes to production. If it's a Docker image, plus this code from this place, plus these static files from this other place, it's harder to be sure you've got a perfect setup matching what goes to production.
2. Some of the more advanced Docker-oriented tools (Kubernetes, Amazon ECS, Docker Swarm, Hashicorp Nomad, ...) make it fairly straightforward to deal with containers and images as first-class objects, but trickier to say "this image plus this glop of additional files".
3. If you're using a server automation tool (Ansible, Salt Stack, Chef, ...) to push your code out, then it's straightforward to also use those to push out the correct runtime environment. Using Docker to just package the runtime environment doesn't really give you much beyond a layer of complexity and some security risks. (You could use Packer or Vagrant with this tool set to simulate the deploy sequence in a VM for pre-production testing.)
You'll also see a sequence in many SO questions where a Dockerfile COPYs application code to some directory, and then a docker-compose.yml bind-mounts the current host directory over that same directory. In this setup the container environment reflects the developer's desktop environment and doesn't really test what's getting built into the Docker image.
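That pattern looks roughly like this (the names are illustrative), and the comment marks where the image content gets hidden:

    # Dockerfile
    FROM python:3.12
    WORKDIR /app
    COPY . /app          # the code is baked into the image...

    # docker-compose.yml
    services:
      web:
        build: .
        volumes:
          - .:/app       # ...but this bind mount hides that copy at runtime, so the
                         # container runs the host's code, not what was built into the image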
("Static files" wind up in a gray zone between "is it the application or is it data?" Within the context of this question I'd lean towards packaging them into the image, especially if they come out of your normal build process. That especially includes the primary UI to the application you're running. If it's things like large image or video assets that you could reasonably host on a totally separate server, it may make more sense to serve those separately.)

Where are you supposed to store your docker config files?

I'm new to docker so I have a very simple question: Where do you put your config files?
Say you want to install MongoDB. You install it, but then you need to create or edit a configuration file. I don't think such files belong on GitHub, since they're used for deployment, though it's not a bad place to store them.
I was just wondering whether Docker has any support for storing such config files so you can add them as part of running an image.
Do you have to use swarms?
Typically you'll store the configuration files on the Docker host and then use volumes to bind mount your configuration files in the container. This allows you to separately manage the configuration file from the running containers. When you make a change to the configuration, you can just restart the container.
You can then use a configuration management tool like Salt, Puppet, or Chef to manage copying/storing the configuration file onto the Docker host. Things like passwords can be managed by the secrets capabilities of the tool. When set up this way, changing a configuration file just means you need to restart your container and not build a new image.
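For the MongoDB example from the question, that might look like this (the config path inside the container is an assumption; the official mongo image passes extra arguments through to mongod, but check the image's documentation):

    # keep the config on the host and mount it read-only into the container
    docker run -d --name mongo \
      -v /srv/docker/mongo/mongod.conf:/etc/mongo/mongod.conf:ro \
      mongo --config /etc/mongo/mongod.conf

    # after editing /srv/docker/mongo/mongod.conf on the host:
    docker restart mongo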
Yes, in most cases you definitely want to keep your Dockerfiles in version control. If your org (or you personally) uses GitHub for this, that's fine, but stick them wherever your other repos are. One of the main ideas in DevOps is to treat infrastructure as code. In fact, one of the main benefits of something like a Dockerfile (or a Chef cookbook, or a Puppet manifest, etc.) is that it is "used for deployment" but can also be version-controlled, meaningfully diffed, and so on.

Is it a docker best practice to use volume for the code?

The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container. You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.
Would you store your code in a volume?
Your jar files, for example. It could be convenient to deploy the application without rebuilding the image.
Are there any considerations when storing code in a volume, such as performance, security, or anything else?
I don't recommend using a VOLUME statement inside the Dockerfile for anything with current versions of docker (current being any version since the introduction of named volumes). A VOLUME command has multiple downsides, including:
- the possible inability to change the contents at that location of the image with any later steps or child images (this behavior appears to differ between scenarios and docker versions)
- the potential to create volumes with just a hash for the name, which clutter up the docker volume ls output and are very difficult to find and reuse later if you need the data inside
- for your changing code: if you place it in a volume and recreate your container from a new version of the image, the volume will still hold the old copy of your code unless you update that volume yourself (the key feature of volumes is persistent data that you want to keep between image versions)
I do recommend putting your data in a volume that you define on the docker run command line or inside a docker-compose.yml. Volumes defined there can have a name or map back to a path on the docker host. And you can make any folder or file a volume without needing to define it in the Dockerfile. Volumes defined at this step don't impact the image, allowing you to extend an image without being locked out of making changes to a directory.
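In other words, prefer declaring the volume at deploy time rather than via VOLUME in the Dockerfile; a sketch with arbitrary names:

    # a named volume for mutable data, declared on the command line
    docker run -d --name cache -v appdata:/data redis:7

    # or the docker-compose.yml equivalent
    services:
      cache:
        image: redis:7
        volumes:
          - appdata:/data
    volumes:
      appdata: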
For your code, it is a common best practice to inject code with a volume if it is interpreted (e.g. javascript) or already compiled (e.g. a jar file) during application development. You would define the volume on the container (not the Dockerfile), and overlay the code or binaries that were also copied into the image using the same filenames. This allows you to rapidly iterate in development without frequently rebuilding the image. Depending on the application, you may be able to live reload the code, otherwise, a container restart should be all that's needed to see the latest change. And once development is finished, you rebuild the image with your current code and ship that to someone that can use it without needing the volume mount for the code.
I've also blogged about my concerns with volumes inside of Dockerfiles if you'd like to see more details on this.
You say:
It could be a little convenient to deploy the application without rebuilding the image.
Instead, encapsulating your application version inside an image build has a lot of advantages. You can easily deploy your app just by deploying the image, whereas using a volume for application code forces you to orchestrate some other deployment method to update that volume as well.
And you would (eventually) have to match the jar version with the proper image version yourself.
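By contrast, when the jar is baked in, the application version just travels with the image tag; a sketch with placeholder names:

    # build and publish an image whose tag matches the application version
    docker build -t registry.example.com/myapp:1.4.2 .
    docker push registry.example.com/myapp:1.4.2

    # on the production host, deploying that version (after stopping the old container) is just:
    docker pull registry.example.com/myapp:1.4.2
    docker run -d --name myapp registry.example.com/myapp:1.4.2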
Regarding security or performance, I don't think that there are special considerations.
Anyway, it is not a common approach to use volumes for that. And as @BMitch says, using VOLUME inside a Dockerfile is somewhat tricky.

Containerizing a web application using Docker: copy source code to multiple images?

I have a web application XY which consists of nginx, php-fpm and mariadb. I successfully split everything up into its own container using docker-compose, and it's running like a charm. For development purposes I just mounted a local directory that contains the actual source/php code. When deploying this to a staging or production environment, the Docker docs told me to bake the source code into the actual image. In this case I have to copy the source code into both the nginx and the php-fpm image when building them, because both of them need it.
When the application itself gets bigger (more assets and libraries), the nginx and php-fpm images both grow. In my opinion this somehow violates the "keep the image as small as possible" rule, and it feels deeply wrong to me. I've always learned not to repeat myself, to store logic in one place, to decouple things, and so on.
Is this the right way to do it, or am I missing something?
In this case I'd probably create a new container which contains the source code. This container can export the source code's directory in a volume that both the nginx and the php-fpm containers can mount.
There is an interesting writeup on Dockerise your PHP application with Nginx and PHP7-FPM. This example uses volumes to share code between PHP and Nginx.
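A compose sketch of that shared-volume layout (image names and paths are placeholders; note that a named volume is only seeded from the image's content the first time it is created, so it has to be refreshed or recreated on redeploy, and the nginx fastcgi configuration is omitted here):

    services:
      php:
        image: myapp/php-fpm:latest    # image that contains the source at /var/www/html
        volumes:
          - code:/var/www/html         # seeds the named volume from the image on first use
      nginx:
        image: nginx:alpine
        volumes:
          - code:/var/www/html:ro      # nginx serves the same files read-only
        ports:
          - "80:80"
    volumes:
      code: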
Your point about not repeating yourself isn't a bad one, but consider you may not always want the same number of Nginx containers and PHP containers. Maybe the PHP part of your application will be under more load than the part that serves static assets and you'll want to scale that up independently. If you use something like Docker Swarm, you aren't even guaranteed all of your PHP containers will be on the same host.
Your images are deployment artifacts, there isn't anything wrong with having the same static content baked into multiple images.
