JIB - Is it possible to see the dockerfile jib creates behind the scenes? - docker

Is it possible to see the Dockerfile Jib creates behind the scenes? If yes, where and how can I locate it?
Context - I am a bit familiar with Dockerfiles and want to make sure the Dockerfile that gets generated has everything required for my app to run successfully.

Jib does not generate a Dockerfile or make any use of Docker during image building. You don't need to install Docker to use Jib.
For a normal project, if a Dockerfile existed, some part of it would roughly look like this. However, do note that the Dockerfile in the link is mostly for informational purposes; almost all the time, there cannot be a Dockerfile that can accurately reproduce the image generated by Jib.
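As a rough, hedged illustration only (the base image, paths, and main class below are assumptions, not what Jib actually emits), such an informational Dockerfile would contain something along these lines:
FROM eclipse-temurin:11-jre
COPY dependencies /app/libs
COPY resources /app/resources
COPY classes /app/classes
ENTRYPOINT ["java", "-cp", "/app/resources:/app/classes:/app/libs/*", "com.example.Main"]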
Related, note that the way Jib works is fundamentally different from Docker's:
the way Jib builds an image is fundamentally different from how the Docker CLI builds an image using Dockerfile (reproducible vs. non-reproducible, declarative vs. imperative, Docker and Dockerfile-less build vs. requiring Docker daemon and client, requiring root-privilege vs. not). Jib is in a very different realm ...
UPDATE: if you want to examine the built image, check out dive.
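For example (image name assumed):
dive registry.example.com/myapp:latest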

Related

jib-maven-plugin - How to set folder permission

I'm trying to build a Docker image using jib-maven-plugin, and I want to set permissions for a specific folder.
If I were using a Dockerfile, the configuration would look like this:
FROM xxxxxxxx.com/sandbox/gui-server:1.0.0-SNAPSHOT
USER root
RUN chmod 755 /home/www
USER www
Now, how do I implement this using jib-maven-plugin? I believe it goes somewhere in the jib-maven-plugin configuration in pom.xml:
<container>
<mainClass>${mainClass}</mainClass>
...
...
<user>www</user>
</container>
The first question you need to think about is why you have to change the permissions of a base image directory (/home/www in your case) to 755. It might be that the base image is specifically designed to be run as root and /home/www should only be readable by root for some reason I don't know. Or, if it doesn't make sense that the directory is not readable by a non-root user, it may be a bug that should be fixed in the base image.
If you still want to change the permissions of an arbitrary directory of a base image, I can think of an abuse of the <extraDirectories> feature as demonstrated here, although I am a bit reluctant to suggest this hack as a good workaround. In many cases (although not yours), the root of the issue may not be about permissions but about file/directory ownership, or about fixing an app so that it does not mutate files in the base image. If the files/directories were not from a base image but were put into the image by Jib, the Jib Ownership Extension (Maven / Gradle) might resolve some seemingly permission-related issues.
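For illustration, a minimal sketch of that <extraDirectories> hack in pom.xml, assuming an empty placeholder directory src/main/jib-extra/home/www exists in the project; treat this as a sketch rather than a recommended configuration:
<configuration>
  <extraDirectories>
    <paths>
      <path>
        <from>src/main/jib-extra</from>
        <into>/</into>
      </path>
    </paths>
    <permissions>
      <permission>
        <file>/home/www</file>
        <mode>755</mode>
      </permission>
    </permissions>
  </extraDirectories>
  <container>
    <user>www</user>
  </container>
</configuration>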
Also check out this Stack Overflow question.
For those who wonder about the possibility of RUN-like support in Jib (i.e., actually executing a command inside a container using some container runtime), I'll quote these comments:
the way Jib builds an image is fundamentally different from how the Docker CLI builds an image using Dockerfile (reproducible vs. non-reproducible, declarative vs. imperative, Docker and Dockerfile-less build vs. requiring Docker daemon and client, requiring root-privilege vs. not). Jib is in a very different realm, so unfortunately, it is very difficult to support ONBUILD unless we radically change our opinionated philosophy in how an image should be built. Basically, we don't "run" Dockerfile directives, particularly the ones like RUN that executes something. Jib doesn't provide/include a Docker runtime (that is one of the points of Jib).
And as for running arbitrary commands, unfortunately this is largely incompatible with the mode Jib operates in, because the way Jib builds an image is fundamentally different from how Docker does: https://github.com/GoogleContainerTools/jib/issues/1806#issuecomment-505526975 We build images in a declarative and reproducible way without actually requiring to have a runtime component to be able to run an image at image build-time; running an image basically destroys reproducibility. So unfortunately it is very difficult for Jib to support "running" arbitrary commands at image building-time.

How to instruct docker or docker-compose to automatically build image specified in FROM

When processing a Dockerfile, how do I instruct docker build to build the image specified in FROM locally using another Dockerfile if it is not already available?
Here's the context. I have a large Dockerfile that starts from the base Ubuntu image, installs Apache, then PHP, then some custom configuration on top of that. Whether this is a good idea is another point; let's assume the build steps cannot be changed. The problem is, every time I change anything in the config, everything has to be rebuilt from scratch, and this takes a while.
I would like to have a hierarchy of Dockerfiles instead:
my-apache: based on stock Ubuntu
my-apache-php: based on my-apache
final: based on my-apache-php
The first two images would be relatively static and could be uploaded to Docker Hub, but I would like to retain the option to build them locally as part of the same build process. Only one container will exist, based on the final image. Thus, putting all three as "services" in docker-compose.yml is not a good idea.
The only solution I can think of is a manual build script that, for each image, checks whether it is available on Docker Hub or locally and, if not, invokes docker build.
Are there better solutions?
I have found this article on automatically detecting dependencies between docker files and building them in proper order:
https://philpep.org/blog/a-makefile-for-your-dockerfiles
The actual Makefile from Philippe's Git repo provides even more functionality:
https://github.com/philpep/dockerfiles/blob/master/Makefile
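If you prefer something explicit, here is a minimal sketch of the same idea, assuming each image lives in a directory of the same name containing its Dockerfile (the stamp-file names are my own; recipe lines must be tab-indented):
all: .final.built

.my-apache.built: my-apache/Dockerfile
	docker build -t my-apache my-apache
	touch $@

.my-apache-php.built: my-apache-php/Dockerfile .my-apache.built
	docker build -t my-apache-php my-apache-php
	touch $@

.final.built: final/Dockerfile .my-apache-php.built
	docker build -t final final
	touch $@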

Should `docker-compose.yml` be in its own repository?

I'm building a small web app with Vue.js and an Express API, each with their own Dockerfile. I currently am able to build those images and publish them to a private Docker repository, then pull them onto a virtual machine and run them. I want to add Docker Compose, and I've often seen it kept together with the code for the services, such as
..
|__ api/
|__ client/
|__ docker-compose.yml
but then it seems like you can't publish the images to a repository, since Docker Compose builds the images and runs the containers, and so my VM would need to pull all the code, when to my thinking it should just need the images and then know how to run them.
So am I thinking about Docker Compose wrong? I have very little experience with it; I'm just trying to figure out the best way to be able to run the containers and it seems like I should be able to do that on a VM without having to download all the source code to that VM.
You can use docker-compose and still publish the individual images.
I guess that the API and the client each have their own Dockerfile.
So basically you have three options:
1. Let docker-compose build the images via the build option.
2. Just reference the images with the image option and make sure they are built beforehand.
3. Do both, so docker-compose will build those images and give them the name and the tag that you put under the image option.
They are all valid options as far as I am concerned. If you go with option two, I would write a little Makefile or script that makes sure the images are in place, for convenience.
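To make option three concrete, here is a minimal docker-compose.yml sketch (registry and image names are assumed): docker-compose build on the development machine builds and tags the images, docker-compose push publishes them, and the VM only needs the compose file plus docker-compose pull / docker-compose up, with no source code present.
version: "3.8"
services:
  api:
    build: ./api
    image: registry.example.com/myapp/api:latest
  client:
    build: ./client
    image: registry.example.com/myapp/client:latest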

How to idiomatically access sensitive data when building a Docker image?

Sometimes there is a need to use sensitive data when building a Docker image. For example, an API token or SSH key to download a remote file or to install dependencies from a private repository. It may be desirable to distribute the resulting image and leave out the sensitive credentials that were used to build it. How can this be done?
I have seen docker-squash, which can squash multiple layers into one, removing any deleted files from the final image. But is there a more idiomatic approach?
Regarding an idiomatic approach, I'm not sure, although Docker is still quite young to have many established idioms.
We have had this same issue at our company, however. We have come to the following conclusions, although these are our best efforts rather than established docker best practices.
1) If you need the values at build time: supply a properties file in the build context with the values that can be read at build time; the properties file can then be deleted after the build. This isn't as portable, but it will do the job.
2) If you need the values at run time: pass the values as environment variables. They will be visible to anyone who has access to ps on the box, but this can be restricted via SELinux or other methods (honestly, I don't know this process; I'm a developer and the operations teams will deal with that part).
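As a rough sketch of those two approaches (file and variable names are assumptions):
# 1) build time: properties file lives in the build context only for the build
cp secrets.properties docker-context/
docker build -t myapp docker-context/
rm docker-context/secrets.properties
# 2) run time: pass the value as an environment variable
docker run -e API_TOKEN="$API_TOKEN" myapp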
Sadly, there is still no proper solution for handling sensitive data while building a docker image.
This bug has a good summary of what is wrong with every hack that people suggest:
https://github.com/moby/moby/issues/13490
And most advice seems to confuse secrets that need to go INTO the container with secrets that are used to build the container, like several of the answers here.
The current solutions that actually seem to be secure all center around writing the secret out to disk or memory, starting a silly little HTTP server, and then having the build process pull the secret in from that server, use it, and not store it in the image.
The best I've found, without going to that level of complexity, is to (mis)use the built-in predefined-args feature of Docker Compose files, as specified in this comment:
https://github.com/moby/moby/issues/13490#issuecomment-403612834
That does seem to keep the secrets out of the image build history.
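Roughly, the trick in that comment leans on Docker's predefined proxy build args (HTTP_PROXY, FTP_PROXY, and friends), which are not recorded in docker history; a hedged sketch, where the service name and the fetch script are assumptions:
# docker-compose.yml
services:
  app:
    build:
      context: .
      args:
        - FTP_PROXY   # the secret rides in a predefined proxy arg, taken from the host environment

# Dockerfile
FROM alpine
COPY fetch-private-dependency.sh /usr/local/bin/
RUN fetch-private-dependency.sh "${FTP_PROXY}"   # hypothetical script that consumes the token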
Matthew Close talks about this in this blog article.
Summarized: You should use docker-compose to mount sensitive information into the container.
2019, and I'm not sure there is an idiomatic approach or best practices regarding secrets when using Docker: https://github.com/moby/moby/issues/13490 remains open so far.
Secrets at runtime:
So far, the best approach I could find was using environment variables in a container:
with the docker run -e option... but then your secrets are available in the command-line history
with the docker run --env-file option or the docker-compose env_file option. At least secrets are not passed on the command line
Problem: in either case, secrets are now available to anyone able to run docker commands on your Docker host (using the docker inspect command)
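For illustration, the env_file variants look roughly like this (file and image names are assumptions):
# docker CLI
docker run --env-file ./app.env myimage

# docker-compose.yml
services:
  app:
    image: myimage
    env_file:
      - ./app.env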
Secrets at build time (your question):
I can see 2 additional (partial?) solutions to this problem:
Multi-stage build:
Use a multi-stage Docker build: basically, your Dockerfile will define 2 images:
A first, intermediate image (the "build image") in which:
you add your secrets to this image: either use build args or copy secret files (be careful with build args: they have to be passed on the docker build command line)
you build your artefact (you now have access to your private repository)
A second image (the "distribution image") in which:
you copy the built artefact from the "build image"
you distribute this image on a Docker registry
This approach is explained by several comments in the quoted github thread:
https://github.com/moby/moby/issues/13490#issuecomment-408316448
https://github.com/moby/moby/issues/13490#issuecomment-437676553
Caution
This multi-stage build approach is far from ideal: the "build image" is still lying around on your host after the build command (and contains your sensitive information), so there are precautions to take.
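A hedged sketch of that multi-stage layout (image names, paths, the REPO_TOKEN arg, and the Maven property are all assumptions); the token only exists in the "build" stage, but that intermediate stage still sits in the local cache and the arg value shows up in its history:
FROM maven:3.8-openjdk-11 AS build
ARG REPO_TOKEN
WORKDIR /src
COPY . .
RUN mvn -Dprivate.repo.token=${REPO_TOKEN} package   # hypothetical property consumed by settings.xml

FROM eclipse-temurin:11-jre
COPY --from=build /src/target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]

# Built with: docker build --build-arg REPO_TOKEN=... -t myapp .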
A new --secret build option:
I discovered this option today, and therefore have not experimented with it yet... What I know so far:
it was announced in a comment from the same thread on github
this comment leads to a detailed article about this new option
the Docker documentation (Docker v19.03 at the time of writing) is not verbose about this option: it is listed with the description below, but there is no detailed section about it:
--secret
API 1.39+
Secret file to expose to the build (only if BuildKit enabled): id=mysecret,src=/local/secret
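For reference, a hedged sketch of how this option is typically used with BuildKit (file and id names are assumptions); the secret is mounted at /run/secrets/<id> during the RUN step and is never written into a layer:
# syntax=docker/dockerfile:1
FROM alpine
RUN --mount=type=secret,id=mysecret \
    cat /run/secrets/mysecret > /dev/null   # any command here can read the secret; it is not stored in the image

# Build with:
# DOCKER_BUILDKIT=1 docker build --secret id=mysecret,src=./mysecret.txt .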
The way we solve this issue is that we have a tool written on top of docker build. Once you initiate a build using the tool, it downloads a Dockerfile and alters it. It changes all instructions which require "the secret" to something like:
RUN printf "secret: asd123poi54mnb" > /somewhere && tool-which-uses-the-secret run && rm /somewhere
However, this leaves the secret data available to anyone with access to the image unless the layer itself is removed with a tool like docker-squash. The command used to generate each intermediate layer can be found using the history command.
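For example, something like this shows the command recorded for every layer of an image (image name assumed):
docker history --no-trunc myimage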

Why doesn't Docker Hub cache Automated Build Repositories as the images are being built?

Note: It appears the premise of my question is no longer valid since the new Docker Hub appears to support caching. I haven't personally tested this. See the new answer below.
Docker Hub's Automated Build Repositories don't seem to cache images. As it is building, it removes all intermediate containers. Is this the way it was intended to work or am I doing something wrong? It would be really nice to not have to rebuild everything for every small change. I thought that was supposed to be one of the best advantages of docker and it seems weird that their builder doesn't use it. So why doesn't it cache images?
UPDATE:
I've started using Codeship to build my app and then run remote commands on my DigitalOcean server to copy the built files and run the docker build command. I'm still not sure why Docker Hub doesn't cache.
Disclaimer: I am a lead software engineer at Quay.io, a private Docker container registry, so this is an educated guess based on the same problem we faced in our own build system implementation.
Given my experience with Dockerfile build systems, I would suspect that the Docker Hub does not support caching because of the way caching is implemented in the Docker Engine. Caching for Docker builds operates by comparing the commands to be run against the existing layers found in memory.
For example, if the Dockerfile has the form:
FROM somebaseimage
RUN somecommand
ADD somefile somefile
Then the Docker build code will:
Check to see if an image matching somebaseimage exists
Check if there is a local image with the command RUN somecommand whose parent is the previous image
Check if there is a local image with the command ADD somefile somefile + a hashing of the contents of somefile (to make sure it is invalidated when somefile changes), whose parent is the previous image
If any of the above steps match, then that command will be skipped in the Dockerfile build process, with the cached image itself being used instead. However, the one key issue with this process is that it requires the cached images to be present on the build machine, in order to find and verify the matches. Having all of everyone's images on build nodes would be highly inefficient, making this a harder problem to solve.
At Quay.io, we solved the caching problem by creating a variation of the Docker caching code that could precompute these commands/hashes and then ask our registry for the cached layers, downloading them to the machine only after we had found the most efficient caching set. This required significant data model changes in our registry code.
If you'd like more information, we gave a technical overview of how we do this in this talk: https://youtu.be/anfmeB_JzB0?list=PLlh6TqkU8kg8Ld0Zu1aRWATiqBkxseZ9g
The new Docker Hub came out with a new Automated Build system that supports Build Caching.
https://blog.docker.com/2018/12/the-new-docker-hub/
