Is it a bad idea to use docker to run a front end build process during development? - docker

I have an angular project I'm working on containerizing. It currently has enough build tooling that a front-end developer could (and this is how it currently works) just run gulp in the project root, edit source files in src/, and the build tooling handles running traceur, templates and libsass and so forth, spitting content into a build directory. That build directory is served with a minimal server in development, and handled by nginx in production.
My goal is to build a docker-based environment that replicates this workflow. My users are developers who are on different kinds of boxes, so having build dependencies frozen in a dockerfile seems to make sense.
I've gotten close enough to this- docker mounts the host volume, the developer edits the file on the local disk, and the gulp build watcher is running in the docker container instance and rebuilds the site (and triggers livereload, etc etc).
The issue I have is with wrapping my head around how filesystem layers work. Is this process of rebuilding files in the container's build/frontend directory generating a ton of extraneous saved layers? That's not something I'd really like, because I'm not wild about monotonically growing this instance as developers tweak and rebuild and tweak and rebuild. It'd only grow locally, but having to go through the "okay, time to clean up and start over" process seems tedious.

Is this process of rebuilding files in the container's build/frontend directory generating a ton of extraneous saved layers?
Nope, the only way to stack up extra layers is to commit a container with changes to a new image then use that new image to create the next container. Rinse, repeat.
Filesystem layers are saved when a container is committed to a new image (docker commit ...). When a container is running there will be a single read/write layer on top that contains all of the changes made to the container since it was created.
having to go through the "okay, time to clean up and start over" process seems tedious.
If you run the build container with docker run --rm ... then you'll get the cleanup for free. The build container will be created fresh from the image each time.
Also, data volumes bypass the union filesystem so there's a good chance you won't write to the container's filesystem at all.

Related

How can squid be used in a Dockerfile to cache downloads in a host directory

I am running a docker build command with a Dockerfile, but this is being held up by a slow, and sometimes aborted, download of a certain package (google's boringssl as it happens).
I would like to install squid near the start of the Dockerfile, so that subsequent git clones, apt gets, etc, i.e. every kind of download is cached to a directory outside the docker image, by defining a volume in the docker build command.
I'm fairly familiar with Docker, and understand the concept of layers. So a reply dealing solely with those will not be useful to me (although possibly to others). But sometimes one has to make a Dockerfile change that disrupts subsequent layers, and also a build error of a subsystem within a layer will mean all the web fetches for that layer will need repeating on the next build. So layers are not the answer to everything cache-related.
Thanks in anticipation!

Why should our work inside the container shouldn't modify the content of the container itself?

I am reading an article related to docker images and containers.
It says that a container is an instance of an image. Fair enough. It also says that whenever you make some changes to a container, you should create an image of it which can be used later.
But at the same time it says:
Your work inside a container shouldn’t modify the container. Like
previously mentioned, files that you need to save past the end of a
container’s life should be kept in a shared folder. Modifying the
contents of a running container eliminates the benefits Docker
provides. Because one container might be different from another,
suddenly your guarantee that every container will work in every
situation is gone.
What I want to know is that, what is the problem with modifying container's contents? Isn't this what containers are for? where we make our own changes and then create an image which will work every time. Even if we are talking about modifying container's content itself and not just adding any additional packages, how will it harm anything since the image created from this container will also have these changes and other containers created from that image will inherit those changes too.
Treat the container filesystem as ephemeral. You can modify it all you want, but when you delete it, the changes you have made are gone.
This is based on a union filesystem, the most popular/recommended being overlay2 in current releases. The overlay filesystem merges together multiple lower layers of the image with an upper layer of the container. Reads will be performed through those layers until a match is found, either in the container or in the image filesystem. Writes and deletes are only performed in the container layer.
So if you install packages, and make other changes, when the container is deleted and recreated from the same image, you are back to the original image state without any of your changes, including a new/empty container layer in the overlay filesystem.
From a software development workflow, you want to package and release your changes to the application binaries and dependencies as new images, and those images should be created with a Dockerfile. Persistent data should be stored in a volume. Configuration should be injected as either a file, environment variable, or CLI parameter. And temp files should ideally be written to a tmpfs unless those files are large. When done this way, it's even possible to make the root FS of a container read-only, eliminating a large portion of attacks that rely on injecting code to run inside of the container filesystem.
The standard Docker workflow has two parts.
First you build an image:
Check out the relevant source tree from your source control system of choice.
If necessary, run some sort of ahead-of-time build process (compile static assets, build a Java .jar file, run Webpack, ...).
Run docker build, which uses the instructions in a Dockerfile and the content of the local source tree to produce an image.
Optionally docker push the resulting image to a Docker repository (Docker Hub, something cloud-hosted, something privately-run).
Then you run a container based off that image:
docker run the image name from the build phase. If it's not already on the local system, Docker will pull it from the repository for you.
Note that you don't need the local source tree just to run the image; having the image (or its name in a repository you can reach) is enough. Similarly, there's no "get a shell" or "start the service" in this workflow, just docker run on its own should bring everything up.
(It's helpful in this sense to think of an image the same way you think of a Web browser. You don't download the Chrome source to run it, and you never "get a shell in" your Web browser; it's almost always precompiled and you don't need access to its source, or if you do, you have a real development environment to work on it.)
Now: imagine there's some critical widespread security vulnerability in some core piece of software that your application is using (OpenSSL has had a couple, for example). It's prominent enough that all of the Docker base images have already updated. If you're using this workflow, updating your application is very easy: check out the source tree, update the FROM line in the Dockerfile to something newer, rebuild, and you're done.
Note that none of this workflow is "make arbitrary changes in a container and commit it". When you're forced to rebuild the image on a new base, you really don't want to be in a position where the binary you're running in production is something somebody produced by manually editing a container, but they've since left the company and there's no record of what they actually did.
In short: never run docker commit. While docker exec is a useful debugging tool it shouldn't be part of your core Docker workflow, and if you're routinely running it to set up containers or are thinking of scripting it, it's better to try to move that setup into the ordinary container startup instead.

Persisting changes to Windows Registry between restarts of a Windows Container

Given a Windows application running in a Docker Windows Container, and while running changes are made to the Windows registry by the running applications, is there a docker switch/command that allows changes to the Windows Registry to be persisted, so that when the container is restarted the changed values are retained.
As a comparison, file changes can be persisted between container restarts by exposing mount points e.g.
docker volume create externalstore
docker run -v externalstore:\data microsoft/windowsservercore
What is the equivalent feature for Windows Registry?
I think you're after dynamic changes (each start and stop of the container contains different user keys you want to save for the next run), like a roaming profile, rather than a static set of registry settings but I'm writing for static as it's an easier and more likely answer.
It's worth noting the distinction between a container and an image.
Images are static templates.
Containers are started from images and while they can be stopped and restarted, you usually throw them entirely away after each execution with most enterprise designs such as with Kubernetes.
If you wish to run a docker container like a VM (not generally recommended), stopping and starting it, your registry settings should persist between runs.
It's possible to convert a container to an image by using the docker commit command. In this method, you would start the container, make the needed changes, then commit the container to an image. New containers would be started from the new image. While this is possible, it's not really recommended for the same reason that cloning a machine or upgrading an OS is not. You will get extra artifacts (files, settings, logs) that you don't really want in the image. If this is done repeatedly, it'll end up like a bad photocopy.
A better way to make a static change is to build a new image using a dockerfile. You'll need to read up on that (beyond the scope of this answer) but essentially you're writing a docker script that will make a change to an existing docker image and save it to a new image (done with docker build). The advantage of this is that it's cleaner, more repeatable, and each step of the build process is layered. Layers are advantageous for space savings. An image made with a windowsservercore base and application layer, then copied to another machine which already had a copy of the windowsservercore base, would only take up the additional space of the application layer.
If you want to repeatedly create containers and apply consistent settings to them but without building a new image, you could do a couple things:
Mount a volume with a script and set the execution point of the container/image to run that script. The script could import the registry settings and then kick off whatever application you were originally using as the execution point, note that the script would need to be a continuous loop. The MS SQL Developer image is a good example, https://github.com/Microsoft/mssql-docker/tree/master/windows/mssql-server-windows-developer. The script could export the settings you want. Not sure if there's an easy way to detect "shutdown" and have it run at that point, but you could easily set it to run in a loop writing continuously to the mounted volume.
Leverage a control system such as Docker Compose or Kubernetes to handle the setting for you (not sure offhand how practical this is for registry settings)
Have the application set the registry settings
Open ports to the container which allow remote management of the container (not recommended for security reasons)
Mount a volume where the registry files are located in the container (I'm not certain where these are or if this will work correctly)
TL;DR: You should make a new image using a dockerfile for static changes. For dynamic changes, you will probably need to use some clever scripting.

How to deploy an application running in docker - best practice?

We are discussing how we should deploy our application running in a docker container. At the moment, we build our application image in the pipeline containing the application code. Which means we have to build the docker image every time the application updates.
Another approach we consider is putting the application code in a volume on the server. We then pull the latest release with git on the server. So the image has not to be rebuilt.
So our discussed options are:
Build the image containing the application code
Use a volume and store the application code on the server
What is best practice to do and why?
While the other answers here have explained the point of building code into your image, I'd like to go one step further and show you how to get the benefits of both worlds while following this best practice.
Docker best practices call for building source code into your image before deployment, rather than deploying an image with dependencies installed and then source code mounted in as a volume.
This gives you a self-contained, portable container that is straightforward to test, deploy, or rollback.
May I take a stab at why you are considering hot-mounting code?
Hot-mounting code is appealing for several reasons — and they're all easy to achieve without sacrificing this best practice of building a self-contained image:
Building Docker images can be slow, so why rebuild for a minor change when you can just hot-mount the code?
A complementary best practice is to use a "base image" that installs all dependencies -- usually the slow part of building a docker image. The key insight is that this base image won't change often!
But the image that derives from it -- your application image, which installs source code -- will change with every commit you want to deploy. That derived image will be very fast to build. The Dockerfile could be as simple as:
FROM myapp/base . # all dependencies installed in base image
ADD code.tar.gz /src # automatic untaring!
CMD [...] # whatever it takes to run your app
Hot-mounting enables faster development cycles, because a developer won't need to flush their docker container, rebuild, and run a new container just to see a change.
This is a fair point. I recommend making a "dev" image (which will also derive from your base image) that enables code mounting via a volume rather than the source code installation steps you'd have in your testing and deployment images.
When you build image every time with new application you have easy way to deploy it later on to the customer or on your production server. When the docker image is ready you can keep it in the repository. Additionally you have full control on that that your docker is working with current application.
In case of keeping the application in mounted volume you have to keep in mind following problems:
life cycle of application - what to do with container when you have to update the application - gently stop, overwrite and run again
how do you deploy your application - you have to do it manually over SSH, or you want just to run simple command docker run, and it runs your latest version from your repository
The mounted volumes are rather for following casses:
you want to have externally exposed settings for container - what is also not a good idea
you want to have externally access to the data produced by the application like logs, db, etc
To automate it totally, you can:
build image for each application and push to the repository
use for example watchtower to automatic update of the system on your production servers
I believe you should follow the first approach i.e. rebuilding the docker image every time there are changes in code. Reasons are-
Firstly, if you are using volume, every time you have to manage the clean closing and removing of the previous version of the application and check whether the new version of the application is running correctly. Your new application might get affected dependencies of your previous version of the application. That need to be taken care too.
Secondly, there might be some version updates of the frameworks installed and some new frameworks are to be installed with the current application. In this case, the first approach seems to be the only option.
Thirdly, As when you are using docker volume you will be removing the most important feature of docker i.e. abstraction from outside environment. Also, the image might become machine dependent because of it, which might affect if you want to publish the app in multiple environments.
My suggestion would be creating a pipeline using some continuous integration tool and fully automate the process starting from code building, building of docker image and deploying it to your environment.

How exactly does docker work? (Theory)

I am venturing into using docker and trying to get a firm grasp of the product.
While I love everything it promises it is a big change from doing things manually.
Right now I understand how to build a container, attach your code, commit and push it to your repo.
But what I am really wondering is how do I update my code once deployed, for example, I have some minor bug fixes but no change to dependencies but I also run a database in the same container.
Container:
Node & NPM
Nginx
Mysql
php
Right now the only way I understand you can do it is to close thje container re pull the new container and run, but I am thinking you will lose database data.
I have been reading into https://docs.docker.com/engine/tutorials/dockervolumes/
and thinking maybe the container mounts a data file that persists between containers.
What I am trying to do is run a web app/website with the above container layout and just change code with latest bugfixes/features.
You're quite correct. Docker images are something you should be rebuilding and discarding with each update - avoid commit wherever possible (outside your build scripts anyway).
Persistent state should be managed via data containers that you then mount with your image. Thus your "data" is decoupled from that specific version and instance of the application.

Resources