I'm building my first Dockerfile for a Go application and I don't understand why go build or go install are considered a necessary part of the Docker container.
I know this can be avoided using multi-stage builds, but I don't know why the build was ever put in the container image in the first place.
What I would expect:
I have a go application 'go-awesome'
I can build it locally from cmd/go-awesome
My Dockerfile contains not much more than
COPY go-awesome .
CMD ["go-awesome"]
What is the downside of this configuration? What do I gain by instead doing
COPY . .
RUN go get ./...
RUN go install ./...
Links to posts showing Go applications being built as part of the Dockerfile:
https://www.callicoder.com/docker-golang-image-container-example/
https://blog.codeship.com/building-minimal-docker-containers-for-go-applications/
https://www.cloudreach.com/blog/containerize-this-golang-dockerfiles/
https://medium.com/travis-on-docker/how-to-dockerize-your-go-golang-app-542af15c27a2
You are correct that you can compile your application locally and simply copy the executable into a suitable docker image.
However, there are benefits to compiling the application inside the docker build, particularly for larger projects with multiple collaborators. Specifically the following reasons come to mind:
There are no local dependencies (aside from Docker) required to build the application source; someone wouldn't even need to have Go installed. This is especially valuable for projects in which multiple languages are in use. Consider someone who wants to edit an HTML template inside a Go project and see how it looks in the container runtime.
The build environment (version of Go, dependency management, file paths, etc.) is constant. Any external dependencies can be safely managed and maintained via the Dockerfile.
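To get these benefits without shipping the Go toolchain in the final image, a multi-stage build is the usual answer. A minimal sketch, assuming the binary builds from cmd/go-awesome as in your example (image tags and paths are illustrative):

# build stage: the Go toolchain lives only here
# assumes go.mod (or vendored deps) sits in the repo root
FROM golang:alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/go-awesome ./cmd/go-awesome

# runtime stage: ships only the compiled binary
FROM alpine
COPY --from=build /out/go-awesome /usr/local/bin/go-awesome
CMD ["go-awesome"]

Anyone with only Docker installed can then produce the exact same binary by running docker build, which is the point made above.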
Related
We are discussing how we should deploy our application running in a Docker container. At the moment, we build our application image in the pipeline, with the application code baked into it, which means we have to rebuild the Docker image every time the application updates.
Another approach we are considering is putting the application code in a volume on the server. We would then pull the latest release with git on the server, so the image does not have to be rebuilt.
So our discussed options are:
Build the image containing the application code
Use a volume and store the application code on the server
What is best practice to do and why?
While the other answers here have explained the point of building code into your image, I'd like to go one step further and show you how to get the benefits of both worlds while following this best practice.
Docker best practices call for building source code into your image before deployment, rather than deploying an image with dependencies installed and then source code mounted in as a volume.
This gives you a self-contained, portable container that is straightforward to test, deploy, or rollback.
May I take a stab at why you are considering hot-mounting code?
Hot-mounting code is appealing for several reasons — and they're all easy to achieve without sacrificing this best practice of building a self-contained image:
Building Docker images can be slow, so why rebuild for a minor change when you can just hot-mount the code?
A complementary best practice is to use a "base image" that installs all dependencies -- usually the slow part of building a docker image. The key insight is that this base image won't change often!
But the image that derives from it -- your application image, which installs source code -- will change with every commit you want to deploy. That derived image will be very fast to build. The Dockerfile could be as simple as:
# all dependencies are installed in the base image
FROM myapp/base
# automatic untarring!
ADD code.tar.gz /src
# whatever it takes to run your app
CMD [...]
Hot-mounting enables faster development cycles, because a developer won't need to flush their docker container, rebuild, and run a new container just to see a change.
This is a fair point. I recommend making a "dev" image (which will also derive from your base image) that enables code mounting via a volume rather than the source code installation steps you'd have in your testing and deployment images.
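As a rough sketch of what that could look like (the image names and mount path here are placeholders): the dev Dockerfile stays on the same base image and simply skips the source installation step, because the code is mounted at run time.

# Dockerfile.dev - derives from the same base image
FROM myapp/base
# no ADD of source here; the code is volume-mounted at run time
CMD [...]

A developer would then run something like

docker run -v "$(pwd)":/src myapp/dev

so edits on the host show up immediately inside the container.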
When you build an image for every new version of the application, you have an easy way to deploy it later to a customer or to your production server. Once the Docker image is ready, you can keep it in a registry. Additionally, you have full control over exactly which version of the application each image runs.
In the case of keeping the application in a mounted volume, you have to keep the following problems in mind:
the life cycle of the application - what to do with the container when you have to update the application (gracefully stop it, overwrite the code, and start it again)
how you deploy your application - do you do it manually over SSH, or do you just want to run a simple docker run command that starts the latest version from your registry?
Mounted volumes are rather meant for the following cases:
you want to expose settings to the container externally (which is also not a good idea)
you want external access to the data produced by the application, such as logs or a database
To automate it fully, you can:
build an image for each application version and push it to the registry
use, for example, Watchtower to automatically update the containers on your production servers
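For example, running Watchtower alongside your containers is roughly this (a sketch; check the current Watchtower image name and options for your setup):

# Watchtower watches the Docker daemon and restarts containers
# whose images have a newer version in the registry
docker run -d --name watchtower \
  -v /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower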
I believe you should follow the first approach, i.e. rebuilding the Docker image every time there are changes in the code. The reasons are:
Firstly, if you are using a volume, you have to manage the clean shutdown and removal of the previous version of the application every time, and check whether the new version is running correctly. Your new application might also be affected by leftover dependencies of the previous version; that needs to be taken care of too.
Secondly, there might be version updates of the installed frameworks, or new frameworks that have to be installed alongside the current application. In that case, the first approach is really the only option.
Thirdly, when you use a Docker volume you give up one of Docker's most important features: abstraction from the outside environment. The image can also become machine-dependent because of it, which is a problem if you want to publish the app to multiple environments.
My suggestion would be to create a pipeline using a continuous integration tool and fully automate the process, from building the code, to building the Docker image, to deploying it to your environment.
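A sketch of what such a pipeline typically runs on every commit (the registry address, image name, and tag variable are placeholders):

# build, tag and publish the image
docker build -t registry.example.com/myapp:$GIT_COMMIT .
docker push registry.example.com/myapp:$GIT_COMMIT

# deploy on the target environment
docker pull registry.example.com/myapp:$GIT_COMMIT
docker run -d --name myapp registry.example.com/myapp:$GIT_COMMIT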
There is an option to use FROM scratch, which to me looks like a really attractive way of building my Go containers.
My question is: what does scratch still provide natively for running binaries, and do I need to add anything in order to reliably run Go binaries? A compiled Go binary seems to run, at least on my laptop.
My goal is to keep the image size to a minimum, both for security and for infrastructure management reasons. In an optimal situation, my container would not be able to execute binaries or shell commands outside of the build phase.
The scratch image contains nothing. No files. But actually, that can work to your advantage. It turns out, Go binaries built with CGO_ENABLED=0 require absolutely nothing, other than what they use. There are a couple things to keep in mind:
With CGO_ENABLED=0, you can't use any C code. In practice this is usually not too hard.
With CGO_ENABLED=0, your app will not use the system DNS resolver. I don't think it does by default anyway, because it's blocking and Go's native DNS resolver is non-blocking.
Your app may depend on some things that are not present:
Apps that make HTTPS calls (as in, to other services, e.g. Amazon S3 or the Stripe API) will need CA certificates in order to verify the authenticity of HTTPS certificates. These also have to be updated over time. They are not needed for serving HTTPS content.
Apps that need timezone awareness will need the timezone info files.
A nice alternative to FROM scratch is FROM alpine, which gives you a base Alpine image that is very tiny (about 5 MiB, I believe) and includes musl libc, which is compatible with Go and allows you to link to C libraries as well as compile without setting CGO_ENABLED=0. You can also leverage the fact that Alpine is regularly updated, and use its tzdata and ca-certificates packages.
(It's worth noting that the overhead of Docker layers is amortized a bit because of Docker's deduplication, though of course that is negated by how often your base image is updated. Still, it helps sell the idea of using the quite small Alpine image.)
You may not need tzdata or ca-certificates now, but it's better to be safe than sorry; you can accidentally add a dependency on them without realizing it until something breaks at runtime. So I recommend using alpine as your base; alpine:latest should be fine.
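For reference, a small sketch of what that Alpine-based image could look like (the binary name is a placeholder; the package names are the ones Alpine's apk uses):

FROM alpine:latest
# ca-certificates for outgoing HTTPS, tzdata for timezone awareness
RUN apk add --no-cache ca-certificates tzdata
COPY myapp /usr/local/bin/myapp
ENTRYPOINT ["/usr/local/bin/myapp"]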
Bonus: If you want the advantages of reproducible builds inside Docker, but with small image sizes, you can use the new Docker multi-stage builds available in Docker 17.06+.
It works a bit like this:
FROM golang:alpine
# may need some go getting if you don't vendor
ADD . /go/src/github.com/some/gorepo
# CGO_ENABLED=0 so the binary also runs on scratch (see above)
RUN CGO_ENABLED=0 go build -o /app github.com/some/gorepo

# second stage: scratch (or alpine)
FROM scratch
COPY --from=0 /app /app
ENTRYPOINT ["/app"]
(I apologize if I've made any mistakes, I'm typing that from memory.)
Note that when using FROM scratch you must use the exec form of ENTRYPOINT, because the shell form depends on the image having /bin/sh, which scratch doesn't. The shell form works fine in Alpine.
I have a Docker image which is a server for a web IDE (Jupyter notebook) for Haskell.
Each time I want to allow the usage of a library in the IDE, I have to go to the Dockerfile and add the install command into it, then rebuild the image.
Another drawback of this is that I have to fork the original image on GitHub, which keeps me from contributing back to it.
I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command.
TL;DR: I want to run stack install <library> (stack is like npm or pip, but for Haskell) from the Dockerfile, but I don't want to maintain a fork of the base image.
How could I solve this problem?
I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command.
This is indeed the correct way to do this, and it should work. I'm not sure I understand the "layers" problem here - the commands executed by RUN should be running in an intermediate container that contains all of the layers from the base image and the previous RUN commands. (Ignoring the possibility of multi-stage builds, but these were added in 17.05 and did not exist when this question was posted.)
The only scenario I can see where stack might work in the running container but not in a Dockerfile RUN command would be if the $PATH variable isn't set correctly at that point. Check that variable, and make sure RUN is executing as the correct user.
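For reference, the child Dockerfile really only needs the two steps below (a sketch; the base image name and library names are placeholders for whatever you are actually using):

# placeholder for the upstream IHaskell/Jupyter image you already run
FROM some/ihaskell-notebook
# install the extra libraries on top of the base image's layers
RUN stack install lens aeson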
I've been experimenting with docker recently but can't get my head around what I think is a fairly important/useful requirement:
The ability to download a NEW copy of a web site for running, when a container is run. NOT at build time, but at run time.
I have seen countless examples of Dockerfiles where Java, Tomcat, and a copy of a WAR are installed and added to an image at build time, but none where that WAR is downloaded fresh each time docker run -d me/myimage is executed on the command line.
I think it might involve adding a CMD statement at the end of the Dockerfile, but I wonder if people out there more experienced with Docker than me have some advice. Perhaps I shouldn't even be attempting this and should rebuild my images each time my web app has a new release? But that would mean I would have to distribute my new image via a private Docker Hub or something, right? I am not willing to stick my source in a public GitHub repo and have the Dockerfile pull it and build it during an image build.
Thanks.
As Mark O'Connor said in his comment, it's certainly possible. A Docker container is just a process tree running on your Linux host, and with a few exceptions (generally involving privileged access to the kernel) can do anything you can do outside of a container.
So sure, you could put together an image that, when run, would download the most recent version of an application and run it.
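For illustration only (the base image choice, download URL, and paths are made up), such an image might do little more than:

FROM tomcat
RUN apt-get update && apt-get install -y --no-install-recommends curl
# fetch the newest WAR when the container starts, not at build time
CMD curl -fsSL https://example.com/releases/myapp-latest.war \
      -o /usr/local/tomcat/webapps/myapp.war \
    && catalina.sh run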
The reason this is considered a bad idea is that it suddenly becomes difficult to run an older version of the application (or, more generally, a specific version). What if you redeploy your container and end up with a new version of the application that requires manual database schema upgrades before it will operate? Now instead of an application you have a brick.
Similarly, what if the newest version of the application is simply buggy? If you were performing the download and install at build time, you would simply deploy an image with an older version of the application.
Performing the application download and install at run time makes the container unpredictable and less manageable.
The Docker documentation suggests to use the ONBUILD instruction if you have the following scenario:
For example, if your image is a reusable python application builder, it will require application source code to be added in a particular directory, and it might require a build script to be called after that. You can't just call ADD and RUN now, because you don't yet have access to the application source code, and it will be different for each application build. You could simply provide application developers with a boilerplate Dockerfile to copy-paste into their application, but that is inefficient, error-prone and difficult to update because it mixes with application-specific code.
Basically, this all sounds nice and good, but that does mean that I have to re-create the app container every single time I change something, even if it's only a typo.
This doesn't seem to be very efficient, e.g. when creating web applications where you are used to change something, save, and hit refresh in the browser.
How do you deal with this?
does mean that I have to re-create the app container every single time I change something, even if it's only a typo
Not necessarily: you could use the -v option of the docker run command to inject your project files into a container, so you would not have to rebuild the Docker image.
Note that the ONBUILD instruction is meant for cases where a Dockerfile inherits FROM a parent Dockerfile. The ONBUILD instructions found in the parent Dockerfile would be run when Docker builds an image of the child Dockerfile.
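As a small sketch of how ONBUILD is used (image names and paths here are illustrative), the parent image declares the deferred steps and the child image triggers them just by inheriting from it:

# parent Dockerfile, published e.g. as myorg/python-app-builder
FROM python:3
# these run later, during the build of any child image
ONBUILD COPY . /app/src
ONBUILD RUN pip install -r /app/src/requirements.txt

# child Dockerfile - the ONBUILD steps above run during this build
FROM myorg/python-app-builder
CMD ["python", "/app/src/main.py"]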
This doesn't seem to be very efficient, e.g. when creating web applications where you are used to change something, save, and hit refresh in the browser.
If you are using a Docker container to serve a web application while you are iterating on that application's code, then I suggest you make a special Docker image which contains everything needed to run your app except the app code itself.
Then share the directory on your host machine that contains your app code with the directory from which the application files are served inside the Docker container.
For instance, if I'm developing a static web site and my workspace is at /home/thomas/workspace/project1/, then I would start a container running nginx with:
docker run -d -p 80:80 -v /home/thomas/workspace/project1/:/usr/local/nginx/html:ro nginx
That way I can change files in /home/thomas/workspace/project1/ and the changes are reflected live without having to rebuild the docker image or even restart the docker container.