Recreating Docker images instead of reusing - for microservices - docker

One microservice stays in one docker container. Now, let's say that I want to upgrade the microservice - for example, some configuration is changed, and I need to re-run it.
I have two options:
I can try to re-use existing image, by having a script that runs on containers startup and that updates the microservice by reading new config (if there is) from some shared volume. After the update, script runs the microservice.
I can simply drop the existing image and container and create the new image (with new name) and new container with updated configuration/code.
Solution #2 seems more robust to me. There is no 'update' procedure, just single container creation.
However, what bothers me is if this re-creation of the image has some bad side-effects? Like a lot of dangling images or something similar. Imagine that this may happens very often during the time user plays with the app - for example, if developer is trying out something, he wants to play with different configurations of microservice, and he will re-start it often. But once it is configured, this will not change. Also, when I say configuration I dont mean just config files, but also user code etc.

For production changes you'll want to deploy a new image for changes to the file. This ensures your process is repeatable.
However, developing by making a new image every time you write a new line of code would be a nightmare. The best option is to run your docker container and mount the source directory of the container to your file system. That way, when you make changes in your editor, the code in the container updates too.
You can achieve this like so:
docker run -v /Users/me/myapp:/src myapp_image
That way you only have to build myapp_image once and can easily make changes thereafter.
Now, if you had a running container that was not mounted and you wanted to make changes to the file, you can do that too. It's not recommended, but it's easy to see why you might want to.
If you run:
docker exec -it <my-container-id> bash
This will put you into the container and you can make changes in vim/nano/editor of your choice while you're inside.

Your option #2 is definitely preferable for a production environment. Ideally you should have some automation around this process, typically to perform something like a blue-green deploy where you replace containers based on the old image one by one with those from the new, testing as you go and then only when you are satisfied with the new deployment do you clean up the containers from the previous version and remove the image. That way you can quickly roll-back to the previous version if needed.
In a development environment you may want to take a different approach where you bind mount the application in to the container at runtime allowing you to make updates dynamically without having to rebuild the image. There is a nice example in the Docker Compose docs that illustrates how you can have a common base compose YML and then extend it so that you get different behavior in development and production scenarios.

Related

Is it better to start an existing Docker container or always run a new one (and manually remove the old ones)?

What would be the most friendly solution for CI/CD and automated workflows? Always forcing a new run would mean we always use the latest image, but it creates additional work in terms of having to remove the old, inactive containers (especially if they're named the same).
Would always running a new container and not naming it, and then periodically cleaning everything with docker container prune be the way to go? I'm looking for what's best practice.
For a CI workflow especially:
Every build creates a new image with a unique tag.
Integration tests always create new containers with that new image. (There isn't an existing container you could reuse with the right image; you can generate a unique name within the context of a particular build.)
When you deploy, create new containers with the new image. (Or update a Compose setup or Kubernetes objects to have the new tag, which will cause new containers to be created.)
Deleting and recreating containers is extremely routine. The docker build; docker run sequence also means your container's startup is always starting from a known place, and you won't usually have old temporary files or stale pid files that need cleanup. You also can't change the image underneath an existing container, so in the context of a CI system you must create a new container with the newly-built image.
IME there's not a lot of downside to always using docker run. docker exec is a debugging tool, and while it's extremely useful, it doesn't have a place here. You shouldn't need docker start at all.

Why should our work inside the container shouldn't modify the content of the container itself?

I am reading an article related to docker images and containers.
It says that a container is an instance of an image. Fair enough. It also says that whenever you make some changes to a container, you should create an image of it which can be used later.
But at the same time it says:
Your work inside a container shouldn’t modify the container. Like
previously mentioned, files that you need to save past the end of a
container’s life should be kept in a shared folder. Modifying the
contents of a running container eliminates the benefits Docker
provides. Because one container might be different from another,
suddenly your guarantee that every container will work in every
situation is gone.
What I want to know is that, what is the problem with modifying container's contents? Isn't this what containers are for? where we make our own changes and then create an image which will work every time. Even if we are talking about modifying container's content itself and not just adding any additional packages, how will it harm anything since the image created from this container will also have these changes and other containers created from that image will inherit those changes too.
Treat the container filesystem as ephemeral. You can modify it all you want, but when you delete it, the changes you have made are gone.
This is based on a union filesystem, the most popular/recommended being overlay2 in current releases. The overlay filesystem merges together multiple lower layers of the image with an upper layer of the container. Reads will be performed through those layers until a match is found, either in the container or in the image filesystem. Writes and deletes are only performed in the container layer.
So if you install packages, and make other changes, when the container is deleted and recreated from the same image, you are back to the original image state without any of your changes, including a new/empty container layer in the overlay filesystem.
From a software development workflow, you want to package and release your changes to the application binaries and dependencies as new images, and those images should be created with a Dockerfile. Persistent data should be stored in a volume. Configuration should be injected as either a file, environment variable, or CLI parameter. And temp files should ideally be written to a tmpfs unless those files are large. When done this way, it's even possible to make the root FS of a container read-only, eliminating a large portion of attacks that rely on injecting code to run inside of the container filesystem.
The standard Docker workflow has two parts.
First you build an image:
Check out the relevant source tree from your source control system of choice.
If necessary, run some sort of ahead-of-time build process (compile static assets, build a Java .jar file, run Webpack, ...).
Run docker build, which uses the instructions in a Dockerfile and the content of the local source tree to produce an image.
Optionally docker push the resulting image to a Docker repository (Docker Hub, something cloud-hosted, something privately-run).
Then you run a container based off that image:
docker run the image name from the build phase. If it's not already on the local system, Docker will pull it from the repository for you.
Note that you don't need the local source tree just to run the image; having the image (or its name in a repository you can reach) is enough. Similarly, there's no "get a shell" or "start the service" in this workflow, just docker run on its own should bring everything up.
(It's helpful in this sense to think of an image the same way you think of a Web browser. You don't download the Chrome source to run it, and you never "get a shell in" your Web browser; it's almost always precompiled and you don't need access to its source, or if you do, you have a real development environment to work on it.)
Now: imagine there's some critical widespread security vulnerability in some core piece of software that your application is using (OpenSSL has had a couple, for example). It's prominent enough that all of the Docker base images have already updated. If you're using this workflow, updating your application is very easy: check out the source tree, update the FROM line in the Dockerfile to something newer, rebuild, and you're done.
Note that none of this workflow is "make arbitrary changes in a container and commit it". When you're forced to rebuild the image on a new base, you really don't want to be in a position where the binary you're running in production is something somebody produced by manually editing a container, but they've since left the company and there's no record of what they actually did.
In short: never run docker commit. While docker exec is a useful debugging tool it shouldn't be part of your core Docker workflow, and if you're routinely running it to set up containers or are thinking of scripting it, it's better to try to move that setup into the ordinary container startup instead.

Can I migrate processes between fat containers?

I recently began getting into Docker and containers. Up to now, I understand that the philosophy behind containers is to run one process per container so we end up with applications which can be run easily and consistently regardless of the environment. Also, that containers are intrinsically connected to it's image, so if you want to save the changes to a container you need to commit and create a new image.
But let's say I want to run multiple processes inside a single container, AKA a fat container. I know it can be done and things like "Supervisord" and "Baseimage-docker" can help manage processes within fat containers.
Now we get to my question: Is there a way to have a fat container running, save the run state of a single process and migrate said process to another container?
I've looked online but I haven't really found anyone that has said that this is possible. So I'm turning to you guys in case one of you have thought about this problem or maybe I've missed something along the way.
I am not so sure if the question might be opinion based. but here is what i think you might be able to do, lets say you have a web application like a Django application that uses Redis within the same container, which will be considered as a fat container and you need to migrate redis to be a standalone service within its own container then you have to do the following:
1- Prepare a docker image that has Redis installed you might go with your own image or use the official redis docker image.
2- Copy the configuration that is being used with redis from the fat container so you can mount it later inside the new redis container
3- Change the Django application settings and make it point to that new redis container
4- Remove redis service and its configuration from the fat container or maybe build a new image.
and thats it, now you should start the redis container and restart the django application container to take effect or start a new one if you modified the fat image.
There's the famous Quake demo and the ability to migrate the state of an entire container with CRIU. That's probably the closest I've seen to what you're talking about. More here: https://criu.org/Docker
As far as a "single" process inside a container, maybe just migrate the entire container and kill the processes you want moved?
I would say the more common pattern in the Docker community is single process containers that are freely killed/updated/etc.

How to deploy an application running in docker - best practice?

We are discussing how we should deploy our application running in a docker container. At the moment, we build our application image in the pipeline containing the application code. Which means we have to build the docker image every time the application updates.
Another approach we consider is putting the application code in a volume on the server. We then pull the latest release with git on the server. So the image has not to be rebuilt.
So our discussed options are:
Build the image containing the application code
Use a volume and store the application code on the server
What is best practice to do and why?
While the other answers here have explained the point of building code into your image, I'd like to go one step further and show you how to get the benefits of both worlds while following this best practice.
Docker best practices call for building source code into your image before deployment, rather than deploying an image with dependencies installed and then source code mounted in as a volume.
This gives you a self-contained, portable container that is straightforward to test, deploy, or rollback.
May I take a stab at why you are considering hot-mounting code?
Hot-mounting code is appealing for several reasons — and they're all easy to achieve without sacrificing this best practice of building a self-contained image:
Building Docker images can be slow, so why rebuild for a minor change when you can just hot-mount the code?
A complementary best practice is to use a "base image" that installs all dependencies -- usually the slow part of building a docker image. The key insight is that this base image won't change often!
But the image that derives from it -- your application image, which installs source code -- will change with every commit you want to deploy. That derived image will be very fast to build. The Dockerfile could be as simple as:
FROM myapp/base . # all dependencies installed in base image
ADD code.tar.gz /src # automatic untaring!
CMD [...] # whatever it takes to run your app
Hot-mounting enables faster development cycles, because a developer won't need to flush their docker container, rebuild, and run a new container just to see a change.
This is a fair point. I recommend making a "dev" image (which will also derive from your base image) that enables code mounting via a volume rather than the source code installation steps you'd have in your testing and deployment images.
When you build image every time with new application you have easy way to deploy it later on to the customer or on your production server. When the docker image is ready you can keep it in the repository. Additionally you have full control on that that your docker is working with current application.
In case of keeping the application in mounted volume you have to keep in mind following problems:
life cycle of application - what to do with container when you have to update the application - gently stop, overwrite and run again
how do you deploy your application - you have to do it manually over SSH, or you want just to run simple command docker run, and it runs your latest version from your repository
The mounted volumes are rather for following casses:
you want to have externally exposed settings for container - what is also not a good idea
you want to have externally access to the data produced by the application like logs, db, etc
To automate it totally, you can:
build image for each application and push to the repository
use for example watchtower to automatic update of the system on your production servers
I believe you should follow the first approach i.e. rebuilding the docker image every time there are changes in code. Reasons are-
Firstly, if you are using volume, every time you have to manage the clean closing and removing of the previous version of the application and check whether the new version of the application is running correctly. Your new application might get affected dependencies of your previous version of the application. That need to be taken care too.
Secondly, there might be some version updates of the frameworks installed and some new frameworks are to be installed with the current application. In this case, the first approach seems to be the only option.
Thirdly, As when you are using docker volume you will be removing the most important feature of docker i.e. abstraction from outside environment. Also, the image might become machine dependent because of it, which might affect if you want to publish the app in multiple environments.
My suggestion would be creating a pipeline using some continuous integration tool and fully automate the process starting from code building, building of docker image and deploying it to your environment.

Can you share Docker containers?

I have been trying to figure out why one might choose adding every "step" of their setup to a Dockerfile which will create your container in a certain state.
The alternative in my mind is to just create a container from a simple base image like ubuntu and then (via shell input) configure your container the way you'd like.
But can you share containers? If you can only share images with Docker then I'd understand why one would want every step of their container setup listed in a Dockerfile.
The reason I ask is because I imagine there is some amount of headache involved with porting shell commands, file changes for configs, etc. to correct Dockerfile syntax and have them work correctly? But as a novice with Docker I could be overestimating the difficulty of that task.
EDIT: I suppose another valid reason for having the Dockerfile with each setup step is for documentation as to the initial state of the container. As opposed to being given a container in a certain state, but not necessarily having a way to know what all was done from the container's image base state.
But can you share containers? If you can only share images with Docker then I'd understand why one would want every step of their container setup listed in a Dockerfile.
Strictly speaking, no. However, you can create a new image from an existing container using the docker commit command:
$ docker commit <container-name> <image-name>
This command will create a new image from the existing container that you can push and pull from/to registries, export and import and create new containers from.
The reason I ask is because I imagine there is some amount of headache involved with porting shell commands, file changes for configs, etc. to correct Dockerfile syntax and have them work correctly? But as a novice with Docker I could be overestimating the difficulty of that task.
If you're already using some other mechanism for automated configuration, you can simply integrate your existing automation into the Docker build. For instance, if you are already configuring your images using shell scripts, simply add a build step in your Dockerfile in which to add your install scripts to the container and execute it. In theory, this can also work with configuration management utilities like Puppet, Salt and others.
EDIT: I suppose another valid reason for having the Dockerfile with each setup step is for documentation as to the initial state of the container. As opposed to being given a container in a certain state, but not necessarily having a way to know what all was done from the container's image base state.
True. As mentioned in comments, there are clear advantages to have an automated and reproducible build of your image. If you build your containers manually and then create an image with docker commit, you don't necessarily know how to re-build this image at a later point in time (which may become necessary when you want to release a new version of your application or re-build the image on top of an updated base image).

Resources