How can I see what's in the Docker layers? - docker

I am running a docker build and one of the layers is big and always takes a long time to download. Is there a way to see where the layer is coming from and what it does? I would like to check it while it's downloading but it would still be useful to examine it after download. Is either possible?
This is the command I am running:
docker build \
  -t registry.gear.ge.com/predix_edge/edge-agent-i386 \
  -f docker-runners/.dockerfile-build-i386 docker-runners
Sending build context to Docker daemon 8.027MB
Step 1/13 : FROM registry.gear.ge.com/predix_edge/edge-agent-build-i386:20180920
20180920: Pulling from predix_edge/edge-agent-build-i386
10c05d2b2fbf: Pull complete
3f9f2d6d7ae5: Pull complete
a2f288eed9a5: Pull complete
8fadaaf1d0d3: Pull complete
5c746e81cede: Pull complete
20d91e41d92e: Downloading [===============> ] 113.4MB/366.3MB
c0701269de1c: Download complete
e6a6642f6692: Download complete
ccac838d533e: Download complete
0e3809b7d911: Download complete
e0b7e3addbed: Verifying Checksum
I would like to see what's in layer 20d91e41d92e.

docker history will give you a listing of all of the layers in an image, each layer's size, and the command that was run to create that layer. That could be a shell command or a Dockerfile directive.
In practice, a very large layer will probably be either COPYing some artifact into Docker land, or a software installation of some sort. Depending on what it is that’s being installed, working around this may or may not be tricky. I see a lot of Dockerfiles go by on SO that install a full C toolchain and library header files just to produce a runnable Python library, for instance; that can be split into a multi-stage build that will have a much smaller runtime artifact, but that involves reengineering the build process.
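To make this concrete, here is a minimal shell sketch. The function names show_layers and unpack_image are hypothetical; the docker and tar invocations themselves are standard. One caveat, as far as I can tell: the short IDs printed during a pull are digests of the compressed layers in the registry, so 20d91e41d92e will probably not appear in the docker history output, which usually shows <missing> in the ID column for layers that were pulled rather than built locally. Matching by position tends to work instead: the pull lists layers base-first, while docker history lists them newest-first and also includes 0B metadata-only entries.

```shell
# show_layers IMAGE
# Print one line per layer, newest first: the layer's size followed by
# the full (untruncated) command that created it.
show_layers() {
    docker history --no-trunc --format '{{.Size}}  {{.CreatedBy}}' "$1"
}

# unpack_image IMAGE DIR
# Export the image with docker save and unpack the archive into DIR.
# DIR/manifest.json then lists the layer archives in order, and each
# one can be listed with: tar -tvf DIR/<path taken from manifest.json>
unpack_image() {
    mkdir -p "$2" || return 1
    docker save "$1" | tar -xf - -C "$2"
}
```

For the image in the question this would be called as show_layers registry.gear.ge.com/predix_edge/edge-agent-build-i386:20180920. Both helpers only work once the image is present locally, so they cover the "after download" case rather than the "while downloading" one.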

Related

Copy error log on a failed Docker image build

I run a Docker image build task in a CI environment. The Docker image build failed, and the detailed error information is stored in a file inside the temporary container. I wonder if there is a good way to copy the error log out of the container so that I can view it as a CI build artifact.
I have seen approaches like [1], which temporarily comment out the Dockerfile contents from the point of the build failure and then run the image build again manually to get the log. However, this is very time-consuming when the image is built on a CI server rather than locally. I am looking for a better approach to this issue. Thanks.
[1] https://pythonspeed.com/articles/debugging-docker-build/

How to use docker images when building artefacts in Actions?

TL;DR: I would like to use on a self-hosted Actions runner (itself a docker container on my docker engine) specific docker images to build artefacts that I would move between the build phases, and end with a standalone executable (not a docker container to be deployed). I do not know how to use docker containers as "building engines" in Actions.
Details: I have a home project consisting of a backend in Go (cross compiled to a standalone binary) and a frontend in Javascript (actually a framework: Quasar).
I develop on my laptop in Windows and use GitHub as the SCM.
The manual steps I do are:
build a static version of the frontend which lands in a directory spa
copy that directory to the backend directory
compile the executable that embeds the spa directory
copy (scp) this executable to the final destination
For development purposes this works fine.
I now would like to use Actions to automate the whole thing. I use docker based self-hosted runners (tcardonne/github-runner).
My problem: the containers do a great job isolating the build environment from the server they run on. They are however reused across build jobs, and this may create conflicts. More importantly, the default versions of the software provided by these containers are not the right ones (usually: latest).
The solution would be to run the build phases in disposable docker containers (that would base on the right image, shortening the build time as a collateral nice to have). Unfortunately, I do not know how to set this up.
Note: I do not want to ultimately create docker containers, I just want to use them as "building engines" and extract the artefacts from them, and share between the jobs (in my specific case - one job would be to build the front with quasar and generate a directory, the other one would be a compilation ending up with a standalone executable copied elsewhere)
Interesting premise, you can certainly do this!
I think you may be slightly mistaken with regards to:
"They are however reused across build jobs and this may create conflicts"
If you run a new container from an image, you start with a fresh instance of that container: files, software, and so on all match the original image definition. That is good, as it certainly aids your efforts. Let me know if I have the wrong end of the stick on the above, though.
Base Image
You can define your own image for building, in order to mitigate the shortfalls of public images that may not be up to date or may not suit your requirements. In fact, this is a common pattern for CI, and Google does something similar with their cloud build configuration. For either approach below, you will likely want to do something like the following to ensure you have all the build tools you may need.
As a rough example:
FROM golang:1.16.7-buster
RUN apt-get update && apt-get install -y \
    git \
    make \
    ... \
    && useradd myuser \
    && mkdir /dist
USER myuser
You could build this with the following tag, and then publish it with docker push:
docker build . -t <containerregistry>/buildr/golang
It is also advisable to maintain separate builder images for other types of projects, such as node, python, etc.
Approaches
Building with layers
If you're looking to leverage build caching for your applications, this will be the better option for you. Caching is only effective if nothing has changed, and since the projects will be built in isolation, it makes it relatively safe.
Building your app may look something like the following:
FROM <containerregistry>/buildr/golang AS builder
COPY src/ .
RUN make dependencies
RUN make
RUN mv /path/to/compiled/app /dist
FROM scratch
COPY --from=builder /dist /dist
The gist of this is that you would start building your app within the builder image, so that it includes all the build deps you require, and then use a multi-stage Dockerfile to publish a final static image that contains only your compiled code, with no dependencies (using the scratch image as the smallest image possible).
Getting the final files out of your image is a bit harder with this approach: you would have to either run an instance of the container once published, with a volume mounted, in order to persist the files to disk, or use docker cp to retrieve the files from a container (not an image) to your disk. The container does not have to be running for docker cp to work.
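As a sketch of the docker cp route: docker cp also works on a container that has been created but never started, so nothing from the image has to run. The helper name extract_from_image is hypothetical. An image built FROM scratch with no CMD needs some command argument before docker create will accept it, so the /bin/true below is only a placeholder that is never executed.

```shell
# extract_from_image IMAGE SRC DEST
# Copy SRC out of IMAGE to DEST on the host via a throwaway container.
extract_from_image() {
    # docker create prints the ID of the new container; it is not started.
    cid=$(docker create "$1" /bin/true) || return 1
    docker cp "$cid:$2" "$3"
    rc=$?
    # Remove the temporary container whether or not the copy worked.
    docker rm "$cid" >/dev/null
    return "$rc"
}
```

With the example image from this answer, that would be extract_from_image user/app:latest /dist ./dist.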
In GitHub Actions, this would look like running a step that builds a Docker image, where the step can run anywhere that has access to Docker.
For example:
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      ...
      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          tags: user/app:latest
Building as a process
This one cannot leverage build caching as well, but you may be able to do clever things like mounting a host npm cache into your container to speed up steps like npm install.
This approach differs from the former in that the way you build your app is defined via CI or a purpose-built script, as opposed to the Dockerfile.
In this scenario, it would make more sense to define the CMD in the parent image and mount your source code in, so that you do not have to maintain an image per project you are building.
This would shift the responsibility of building your application from the buildtime of the image, to the runtime. Retrieving your code from the container would be doable through volume mounting for example:
docker run -v /path/to/src:/src -v /path/to/dist:/dist <containerregistry>/buildr/golang
If the CMD was defined in the builder, that single script would execute and build the mounted in source code, and subsequently publish to /dist in the container, which would then be persisted to your host via that volume mapping.
Of course, this applies if you're building locally. It actually becomes a bit nicer in a GitHub Actions context if you wish to keep your build instructions there. You can choose to run steps within your builder container using something like the following:
jobs:
  ...
  container:
    runs-on: ubuntu-latest
    container: <containerregistry>/buildr/golang
    steps:
      - run: |
          echo This job does specify a container.
          echo It runs in the container instead of the VM.
        name: Run in container
Within that run: spec, you could choose to call a build script, or enter the commands that might be present in the script yourself.
What you do with the compiled output once you have it is very much up to you 👍
Chaining (Frontend / Backend)
You mentioned that you build static assets for your site and then embed them into your golang binary to be served.
Something like that introduces complications of course, but nothing untoward. If you do not need to retrieve your web files until you build your golang container, then you may consider taking the first approach, and copying the content from the published image as part of a Docker directive. This makes more sense if you have two separate projects, one for frontend and backend.
If everything is in one folder, then it sounds like you may just want to extend your build image to support both Go and JS, and then take the latter approach and define those build instructions in a script, a makefile, or the run: config in your actions file.
Conclusion
This is a lot of info. I hope it's digestible for you, and more importantly, I hope it gives you some ideas as to how you can tackle your current issue. Let me know in the comments if you would like any clarification.

Docker multi-stage build caching

IIUC, docker doesn't natively support caching multi-stage build layers in a repository. I had expected the results of the multi-stage build to be cached, but is that not the case?
For example, take a toy build file:
FROM centos:7 as stage_0
RUN sleep 60
RUN echo "Hello world" > /root/x.txt
FROM centos:7 as output
COPY --from=stage_0 /root/x.txt /root/x.txt
It seems like when using --cache-from, stage_0 needs to rerun. Is there a way of caching the results of stage_0 in the image repository such that it doesn't need to be built each time, assuming the inputs don't change?
I've been using multi-stage builds for operations that take lots of time — e.g. compiling a third-party tool — so these would be the most helpful items to cache between builds.

gcloud rebuilds complete container but Dockerfile is the same, only the script has changed

I am building Docker containers using gcloud:
gcloud builds submit --timeout 1000 --tag eu.gcr.io/$PROJECT_ID/dockername Dockerfiles/folder_with_dockerfile
The last 2 steps of the Dockerfile contain this:
COPY script.sh .
CMD bash script.sh
Many of the changes I want to test are in the script, so the Dockerfile stays intact. Building those Dockerfiles on Linux with docker-compose results in a very quick build, because it detects that nothing has changed. However, doing this on gcloud, I notice the complete image being rebuilt even though only a minor change has been made to script.sh.
Any way to prevent this behavior?
Your local build is fast because you already have all remote resources cached locally.
It looks like using the Kaniko cache would speed up your build a lot (see https://cloud.google.com/cloud-build/docs/kaniko-cache#kaniko-build).
To enable the cache on your project, run:
gcloud config set builds/use_kaniko True
The first time you build the container, it will populate the cache (entries are kept for 6h by default), and subsequent builds will be faster since the dependencies will be cached.
If you need to further speed up your build, I would use two containers and have both in my local GCP container registry:
The first one as a cache with all remote dependencies (OS / language / framework / etc).
The second one is the one you need with just the COPY and CMD using the cache container as base.
Actually, gcloud has a lot to do:
The gcloud builds submit command:
compresses your application code, Dockerfile, and any other assets in the current directory as indicated by .;
uploads the files to a storage bucket;
initiates a build using the uploaded files as input;
tags the image using the provided name;
pushes the built image to Container Registry.
Therefore the complete build process could be time-consuming.
There are recommended practices for speeding up builds such as:
building leaner containers;
using caching features;
using a custom high-CPU VM;
excluding unnecessary files from upload.
Those could optimize the overall build process.

How do I pass a docker image from one TeamCity build to another?

I need to split a teamcity build that builds and pushes a docker image into a docker registry into two separate builds.
a) The one that builds the docker image and publishes it as an artifact
b) The one that accepts the docker artifact from the first build and pushes it into a registry
The log says that these three commands are run:
docker build -t thingy -f /opt/teamcity-agent/work/55abcd6/docker/thingy/Dockerfile /opt/teamcity-agent/work/55abcd6
docker tag thingy docker.thingy.net/thingy/thingy:latest
docker push docker.thingy.net/thingy/thingy:latest
There's plenty of other stuff going on, but I figured that this is the important part.
So I have copied the initial build two times, with the first command in the first build, and the next two in the second build.
I have set the first build as a snapshot dependency for the second build, and run it. And what I get is:
FileNotFoundError: [Errno 2] No such file or directory: 'docker': 'docker'
Which probably is because some of the files are missing.
Now, I did want to publish the docker image as an artifact and make the first build an artifact dependency, but I can't find where docker puts its files, and all of the searches containing "docker" and "file" just lead to a bunch of articles about what a Dockerfile is.
So what can I do so that the second build can use the resulting image and/or environment from the first build?
In all honesty, I didn't understand exactly what you are trying to do here.
However, this might help you:
You can save the image as a tar file:
docker save -o <image_file_name>.tar <image_tag>
This archive can then be moved and imported somewhere else.
You can get a lot of information about an image or a container with "docker inspect":
docker inspect <image_tag>
Hope this helps.
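For the two-build split in the question, here is a sketch of how docker save and docker load could be paired. The function names and the thingy.tar file name are assumptions of mine; the tar file would have to be published as an artifact by the first build and declared as an artifact dependency of the second. The image name and registry are the ones from the question.

```shell
# Build A: build the image from the checkout directory given as $1 and
# write it to a tar file that the CI server can publish as an artifact.
build_and_export() {
    docker build -t thingy -f "$1/docker/thingy/Dockerfile" "$1" || return 1
    docker save -o thingy.tar thingy
}

# Build B: load the image from the downloaded artifact (this restores
# the "thingy" tag it was saved with), then tag and push it.
load_and_push() {
    docker load -i thingy.tar || return 1
    docker tag thingy docker.thingy.net/thingy/thingy:latest || return 1
    docker push docker.thingy.net/thingy/thingy:latest
}
```

Separately, the FileNotFoundError for 'docker' in the question suggests that the docker executable itself was not found on the PATH of the agent running the second build, so that may be worth checking before anything else.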

Resources