Build multiple images vs start multiple containers - docker

What's the best practices?
Build different docker images for each application instance. For example, each application instance has its own code directory. Use ADD to build different images
Build a base image. Start a new container for each application instance. Use option -v to bind specific volume for each application instance.
Reasons go for multiple containers:
Using ADD in Dockerfile means you need rebuild your image once any changes in that directory.

not sure if best practice but i'd go for the -v option just for the numbers: you have to create a container for every code directory anyway, i would avoid having to build the same number of images too.
Moreover disk space may be an issue but im not sure
one image and one container for every code directory < one image and multiple associated container with different mount point for every code directory

Related

how remove all docker containers and images associated with a particular Dockerfile + docker-compose.yml file?

We frequently spin up some quick exploratory docker-based projects for a day or two that we'd like to quickly and easily discard when done, without disturbing our primary ongoing containers and images.
At any point in time we have quite a few 'stable' docker images and containers that we do NOT want to rebuild all the time.
How can one remove all the images and containers associated with the current directory's Dockerfile and docker-compose.yml file, without disturbing other projects' images and containers?
(All the Docker documentation I see shows how to discard them ALL, or requires finding and listing a bunch of IDs manually and discarding them manually. This seems primitive and time-wasting... In a project folder, the Dockefile and docker-compose.yml file have all the info needed to identify "all images and images that were created when building and running THIS project" so it seems there would quick command to remove that project's docker dregs when done.)
As an example, right now I have rarely revised Docker images and containers for several production Rails 5 apps I'd like to keep untouched, but a half-dozen folders with short-term Rails 6 experiments that represent dozens of images and containers I'd like to discard.
Is there any way to tell Docker... "here's a Dockerfile and a docker-compose,yml file, stop and remove any/all containers and images that were created by them?"

Why should our work inside the container shouldn't modify the content of the container itself?

I am reading an article related to docker images and containers.
It says that a container is an instance of an image. Fair enough. It also says that whenever you make some changes to a container, you should create an image of it which can be used later.
But at the same time it says:
Your work inside a container shouldn’t modify the container. Like
previously mentioned, files that you need to save past the end of a
container’s life should be kept in a shared folder. Modifying the
contents of a running container eliminates the benefits Docker
provides. Because one container might be different from another,
suddenly your guarantee that every container will work in every
situation is gone.
What I want to know is that, what is the problem with modifying container's contents? Isn't this what containers are for? where we make our own changes and then create an image which will work every time. Even if we are talking about modifying container's content itself and not just adding any additional packages, how will it harm anything since the image created from this container will also have these changes and other containers created from that image will inherit those changes too.
Treat the container filesystem as ephemeral. You can modify it all you want, but when you delete it, the changes you have made are gone.
This is based on a union filesystem, the most popular/recommended being overlay2 in current releases. The overlay filesystem merges together multiple lower layers of the image with an upper layer of the container. Reads will be performed through those layers until a match is found, either in the container or in the image filesystem. Writes and deletes are only performed in the container layer.
So if you install packages, and make other changes, when the container is deleted and recreated from the same image, you are back to the original image state without any of your changes, including a new/empty container layer in the overlay filesystem.
From a software development workflow, you want to package and release your changes to the application binaries and dependencies as new images, and those images should be created with a Dockerfile. Persistent data should be stored in a volume. Configuration should be injected as either a file, environment variable, or CLI parameter. And temp files should ideally be written to a tmpfs unless those files are large. When done this way, it's even possible to make the root FS of a container read-only, eliminating a large portion of attacks that rely on injecting code to run inside of the container filesystem.
The standard Docker workflow has two parts.
First you build an image:
Check out the relevant source tree from your source control system of choice.
If necessary, run some sort of ahead-of-time build process (compile static assets, build a Java .jar file, run Webpack, ...).
Run docker build, which uses the instructions in a Dockerfile and the content of the local source tree to produce an image.
Optionally docker push the resulting image to a Docker repository (Docker Hub, something cloud-hosted, something privately-run).
Then you run a container based off that image:
docker run the image name from the build phase. If it's not already on the local system, Docker will pull it from the repository for you.
Note that you don't need the local source tree just to run the image; having the image (or its name in a repository you can reach) is enough. Similarly, there's no "get a shell" or "start the service" in this workflow, just docker run on its own should bring everything up.
(It's helpful in this sense to think of an image the same way you think of a Web browser. You don't download the Chrome source to run it, and you never "get a shell in" your Web browser; it's almost always precompiled and you don't need access to its source, or if you do, you have a real development environment to work on it.)
Now: imagine there's some critical widespread security vulnerability in some core piece of software that your application is using (OpenSSL has had a couple, for example). It's prominent enough that all of the Docker base images have already updated. If you're using this workflow, updating your application is very easy: check out the source tree, update the FROM line in the Dockerfile to something newer, rebuild, and you're done.
Note that none of this workflow is "make arbitrary changes in a container and commit it". When you're forced to rebuild the image on a new base, you really don't want to be in a position where the binary you're running in production is something somebody produced by manually editing a container, but they've since left the company and there's no record of what they actually did.
In short: never run docker commit. While docker exec is a useful debugging tool it shouldn't be part of your core Docker workflow, and if you're routinely running it to set up containers or are thinking of scripting it, it's better to try to move that setup into the ordinary container startup instead.

Sharing bind volume in Docker swarm

We use open-jdk image to deploy our jars. since we have multiple jars we simply attach them using bind mode and run them. I don't want to build separate images since our deployment will be in air gaped environments and each time I can't rebuild images as only the jars will be changing.
Now we are trying to move towards swarm. Since it is a bind mount, I'm unable to spread the replicas to other nodes.
If I use volumes how can I put these jars into that volume? One possibility is that I can run a dummy alpine image and mount the volume to host and then I can share it with other containers. But it possible to share that volume between the nodes? and is it an optimum solution? Also if I need to update the jars how can that be done?
I can create NFS drive but I'm trying to figure out a way of implementing without it. Since it is an isolated environment and may contain crucial data I can't use 3rd party plugins to finish the job as well.
So how docker swarm can be implemented in this scenario?
Use docker build. Really.
An image is supposed to be a static copy of your application and its runtime, and not the associated data. The statement "only the jars changed" means "we rebuilt the application". While you can use bind mounts to inject an application into a runtime-only container, I don't feel like it's really a best practice, and that's doubly true in a language where there's already a significant compile-time step.
If you're in an air-gapped environment, you need to figure out how you're going to provide application updates (regardless of the deployment framework). The best solution, if you can manage it, is to set up a private Docker registry on the isolated network, docker save your images (with the tars embedded), then docker load, docker tag, and docker push them into the registry. Then you can use the registry-tagged image name everywhere and not need to worry about manually pushing the images and/or jar files across.
Otherwise you need to manually distribute the image tar and docker load it, or manually push your updated jars on to each of the target systems. An automation system like Ansible works well for this; I'm partial to Ansible because it doesn't require a central server.

Is it a docker best practice to use volume for the code?

The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container. You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.
will you store your code in volume?
Such as your jar files. It could be a little convenient to deploy the application without rebuilding the image.
Are there any considerations if storing the code in volume? like performance, security or others.
I don't recommend using a VOLUME statement inside the Dockerfile for anything with current versions of docker (current being any version of docker since the introduction of named volumes). Including a VOLUME command has multiple downsides, including:
possible inability to change contents at that location of the image with any later steps or child images (this behavior appears to be different with different scenarios and different versions of docker)
potential to create volumes with just a hash for the name that clutter up the docker volume ls output and are very difficult to find and reuse later if you needed the data inside
for your changing code, if you place it in a volume and recreate your container from a new version of the image, the volume will still have the old copy of your code unless you update that volume yourself (the key feature of volumes is persistent data that you want to keep between image versions)
I do recommend putting your data in a volume that you define on the docker run command line or inside a docker-compose.yml. Volumes defined there can have a name or map back to a path on the docker host. And you can make any folder or file a volume without needing to define it in the Dockerfile. Volumes defined at this step doesn't impact the image, allowing you to extend an image without being locked out of making changes to a directory.
For your code, it is a common best practice to inject code with a volume if it is interpreted (e.g. javascript) or already compiled (e.g. a jar file) during application development. You would define the volume on the container (not the Dockerfile), and overlay the code or binaries that were also copied into the image using the same filenames. This allows you to rapidly iterate in development without frequently rebuilding the image. Depending on the application, you may be able to live reload the code, otherwise, a container restart should be all that's needed to see the latest change. And once development is finished, you rebuild the image with your current code and ship that to someone that can use it without needing the volume mount for the code.
I've also blogged about my concerns with volumes inside of Dockerfiles if you'd like to see more details on this.
You say:
It could be a little convenient to deploy the application without rebuilding the image.
Instead of that, it has a lot of advantages to encapsulate your application version inside an image build. You can easily deploy your app only deploying the image, so the fact that you use a volume for app code leads you to orchestrate some other deployment method to update that volume too.
And you have to (eventually) match the jar version with the proper image version.
Regarding security or performance, I don't think that there are special considerations.
Anyway, it is not a common approach to use volumes for that. And as #BMitch say, using VOLUME inside Dockerfile is some tricky.

Recreating Docker images instead of reusing - for microservices

One microservice stays in one docker container. Now, let's say that I want to upgrade the microservice - for example, some configuration is changed, and I need to re-run it.
I have two options:
I can try to re-use existing image, by having a script that runs on containers startup and that updates the microservice by reading new config (if there is) from some shared volume. After the update, script runs the microservice.
I can simply drop the existing image and container and create the new image (with new name) and new container with updated configuration/code.
Solution #2 seems more robust to me. There is no 'update' procedure, just single container creation.
However, what bothers me is if this re-creation of the image has some bad side-effects? Like a lot of dangling images or something similar. Imagine that this may happens very often during the time user plays with the app - for example, if developer is trying out something, he wants to play with different configurations of microservice, and he will re-start it often. But once it is configured, this will not change. Also, when I say configuration I dont mean just config files, but also user code etc.
For production changes you'll want to deploy a new image for changes to the file. This ensures your process is repeatable.
However, developing by making a new image every time you write a new line of code would be a nightmare. The best option is to run your docker container and mount the source directory of the container to your file system. That way, when you make changes in your editor, the code in the container updates too.
You can achieve this like so:
docker run -v /Users/me/myapp:/src myapp_image
That way you only have to build myapp_image once and can easily make changes thereafter.
Now, if you had a running container that was not mounted and you wanted to make changes to the file, you can do that too. It's not recommended, but it's easy to see why you might want to.
If you run:
docker exec -it <my-container-id> bash
This will put you into the container and you can make changes in vim/nano/editor of your choice while you're inside.
Your option #2 is definitely preferable for a production environment. Ideally you should have some automation around this process, typically to perform something like a blue-green deploy where you replace containers based on the old image one by one with those from the new, testing as you go and then only when you are satisfied with the new deployment do you clean up the containers from the previous version and remove the image. That way you can quickly roll-back to the previous version if needed.
In a development environment you may want to take a different approach where you bind mount the application in to the container at runtime allowing you to make updates dynamically without having to rebuild the image. There is a nice example in the Docker Compose docs that illustrates how you can have a common base compose YML and then extend it so that you get different behavior in development and production scenarios.

Resources