I have tried docker history and dfimage for getting the dockerfile from a docker image.
From what I can see, any information about the multistage dockerfiles is not there. As I think about it, it makes sense. The final docker image just knows that files were copied in. It probably does not keep a reference to the layer that was used to construct it.
But I thought I would ask just to be sure. (It would be really helpful)
For example: I have a multi-stage Dockerfile that builds a .NET Core application in the first stage, then in the second stage copies the files from that build into an Nginx container.
Is there any way, given the final image, to get the dockerfile used to do the build?
Unfortunately this won't be possible, since your final Docker image won't contain anything from the "builder" stage beyond the files that were copied out of it. Basically, the builder stage is a completely different image: it was built, the files were copied from it during the build of the final image, and then it was discarded.
The layers from the builder stage will live on in your build cache, and you could even tag them to run some kind of Docker image analyzer against them. However, this does not help if you only have access to the final image.
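A minimal sketch of that idea, assuming a two-stage Dockerfile like the one described in the question. The stage name build, the image tags, the paths and the myapp names are all assumptions for illustration, not details of the original build.

```dockerfile
# Stage 1: build the .NET application (the SDK image tag is an assumption)
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /out

# Stage 2: only the published files are carried into the final image
FROM nginx:alpine
COPY --from=build /out /usr/share/nginx/html

# With the Dockerfile and build context at hand, the builder stage can be
# built and tagged on its own, then inspected like any other image:
#   docker build --target build -t myapp:build .
#   docker history myapp:build
# docker history on the final image lists only the nginx layers plus the COPY.
```

This only works on the machine that has the Dockerfile and build context; nothing in the final image points back to the build stage.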
No, it is not possible. A Docker image only has its own history, not the history of any earlier stages that were used to build it.
I need to build a custom image which contains both terraform and the gcloud CLI. I'm very new to Docker, so I'm struggling with this even though it seems very straightforward. I need to make a multi-stage image from the following two images:
google/cloud-sdk:slim
hashicorp/terraform:light
How can I copy the terraform binary from the hashicorp/terraform:light image to the google/cloud-sdk:slim image? Any fumbling I've done so far has given me countless errors. Just hoping somebody could give me an example of what this should look like because this is clearly not it:
FROM hashicorp/terraform:light AS builder
FROM google/cloud-sdk:slim
COPY --from=builder /usr/bin/env/terraform ./
Thanks!
That's not really the purpose of multi-stage builds. For your case, you would want to pick either image and install the other tool in it, instead of copying from one to another.
Multi-stage builds are meant for when you want to build an app but don't want the build dependencies in the final image, which reduces the image size and the attack surface.
So, for example, you could have a Go app and you would have two stages:
The first stage would build the binary, downloading all the required dependencies.
The second stage would copy the binary from the first stage, and that's it.
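The two stages above could be sketched roughly as follows. The image tags, the /app output path and the assumption that the build context is a Go module with a main package at its root are all illustrative.

```dockerfile
# Stage 1: compile the binary; the Go toolchain and downloaded modules stay here
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# go build fetches module dependencies as needed;
# CGO_ENABLED=0 produces a static binary that runs on a minimal base image
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: copy only the binary into a small runtime image
FROM alpine:3.20
COPY --from=builder /app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

The final image contains the Alpine base plus one binary; the compiler, sources and module cache are left behind in the builder stage.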
I have a Dockerfile that produces an image as a result of a multi-stage build. One of the steps produces a file (an sql migration script) that I would like to export and store somewhere outside of the build process, while I still want the build to produce the final image.
I was looking at the approach explained in "How to copy files from host to Docker container?". It works well, but there are a couple of problems with it:
It either produces the image or outputs the files, not both.
It only exports the files from the last stage. To limit the number of exported files, I can use the scratch image, but that is not the final image I want to produce.
FROM scratch AS export
COPY --from=build /script.sql /
If I just copy the SQL script into the final (production) image, the export will output every file in that image. And I also don't really want the script to be in the final image, as it has no purpose there.
Is there any way to do this? It feels silly to run two separate Dockerfiles to do the same build, one to generate the script and another to produce the image.
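One possible arrangement, sketched under the assumption that BuildKit is in use: keep the export stage and the production stage in the same Dockerfile and select each with --target. The stage names, base images, paths and the myapp tag below are hypothetical placeholders.

```dockerfile
# Shared build stage; the RUN line stands in for the real build
FROM alpine AS build
RUN mkdir /out \
    && echo "app" > /out/app \
    && echo "-- migration" > /script.sql

# Export-only stage: contains nothing but the script
FROM scratch AS export
COPY --from=build /script.sql /

# Production stage: takes the build output, but not the script
FROM alpine AS final
COPY --from=build /out/app /app

# Two invocations against the same Dockerfile:
#   docker build --target export --output type=local,dest=./out .
#   docker build --target final -t myapp .
```

The build stage should normally be served from the build cache on the second invocation, so the expensive work is not repeated, although that depends on the cache still being available.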
I found out that if I use multiple Docker images in a single Dockerfile, the second one always overrides or deletes the data installed by the former one, for example:
FROM nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04
FROM python:3.8.10
CMD ["/bin/bash"]
The cuda-10.2 installed by the former one is gone. But what I want is to have the data installed from both Docker images I use. Is there any way to achieve this? Thanks
This concept is called "multi-stage builds" and it works in a different way from what you expect in your Dockerfile. It allows you to build multiple things in a single Dockerfile, and then hand-pick the parts you need in a single final image:
With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.
To achieve what you want, you might try using multi-stage builds and COPY --from statements, but it probably won't work (for example if the base images use different OS distributions, or if you accidentally miss some files while copying).
What would work is writing a new Dockerfile using the instructions from both other Dockerfiles (python and cuda) and building an image from it. Note that you might need to adapt the commands executed in every one of the base files if they don't work as expected out of the box.
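A rough sketch of that approach: start from the CUDA image and install Python on top of it, rather than adding a second FROM. It assumes the python3.8 package is available from the apt repositories configured in the base image, and the patch version installed this way may not be exactly 3.8.10.

```dockerfile
# Single stage: CUDA base image with Python added on top
FROM nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04

# Install Python from the distribution repositories (package availability
# is an assumption), then clean the apt lists to keep the layer small
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3.8 python3-pip \
    && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]
```

If a specific Python build is required, the relevant instructions from the official python image's Dockerfile would have to be adapted instead, as noted above.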
You can use multi-stage Docker builds. You need to copy the data between stages, but it is possible. Read more about it:
https://docs.docker.com/develop/develop-images/multistage-build/#use-multi-stage-builds
In Docker Hub one can configure Automated Builds by clicking on the corresponding button in the upper-right corner of the Builds tab. Apart from configuring a rebuild on pushing to the source-code repository containing the Dockerfile, one can also set "Repository Links" to "Enable for Base Image". This is intended to "Trigger a build in this repository whenever the base image is updated on Docker Hub".
I got this to work in some simple toy-example cases. But it fails to trigger on a more complex example. My Dockerfile looks something like this:
FROM mediawiki AS orig
FROM alpine AS build
COPY --from=orig <file> /
RUN <patch-command of file>
FROM mediawiki
COPY --from=build <file> /
Why does the rebuild not trigger if (either of) the base-images gets updated? Is this because I have more than one FROM line in the Dockerfile? Or did the warning "Only works for non-official images" apply to the base image instead of the destination image?
If the answer to my last question above is "yes", is there some way to still get the desired effect of rebuilding on base image updates?
"Only works for non-official images"
I'm fairly sure it doesn't work for any official images like alpine, golang, etc. The reason is that so many images depend on those base images that a single update would be a huge burden on their infrastructure to rebuild everyone's images.
My guess is that the logic to determine whether an image uses an official image or not is very basic and if it detects FROM <some-official-image> anywhere in your Dockerfile then it probably won't get automatically rebuilt.
I'm very new to Docker, so I wonder if I can change the official and public source images from Docker Hub (which I use in the FROM directive) on the fly while using them in my own container builds, kind of like Chef's chef-rewind does?
For example, say I need to pass build-args to openresty/latest-centos to build it without modules I won't use. I need to put this
FROM openresty/latest-centos
in my Dockerfile, but what else should I do so that openresty is built only with the modules I need?
When you use the FROM directive in a Dockerfile, you are simply instructing Docker to use the named image as the base for the image that will be built with your Dockerfile. This does not cause the base image to be rebuilt, so there is no way to "pass parameters" to the build process.
If the openresty image does not meet your needs, you could:
Clone the openresty git repository,
Modify the Dockerfile,
Run docker build ... to build your own image
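To illustrate why the steps above are needed: build arguments only take effect in a Dockerfile that you build yourself and that declares them. The sketch below is a generic, hypothetical example; CONFIGURE_OPTIONS, its values and the my-openresty tag are made up and are not taken from the openresty repository.

```dockerfile
FROM alpine

# Hypothetical build argument; it exists only because this Dockerfile declares it
ARG CONFIGURE_OPTIONS="--with-feature-a --with-feature-b"

# Placeholder for the real compile step, which would consume the argument
RUN echo "configure options: ${CONFIGURE_OPTIONS}" > /build-options.txt

# Override the default when building your own copy:
#   docker build --build-arg CONFIGURE_OPTIONS="--with-feature-a" -t my-openresty .
```

A plain FROM line pulling a prebuilt image never runs that image's build steps, so there is nothing for --build-arg to act on.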
Alternatively, you can save yourself that work and just use the existing image and live with a few unused modules hanging around. If the modules are separate components, you could also issue the necessary commands in your Dockerfile to remove them.