Compiling and running in different containers - docker

I have a project which compiles to a binary file, and running that binary file exposes some REST APIs.
To compile the project I need docker image A which has the compiler and and all the libraries required to produce the executable. To run the executable (ie. host the service) I can get away with a much smaller image B (just basic linux distro, no need for the compiler).
How does one use docker is such a situation?

My thinking for this scenario is that you can prepare two base images:
The 1st one, which includes compiler and all libs for building your executable, call it base-image:build
The 2nd one, as the base image to build your final image to delivery, call it base-image:runtime
And then break your build process into two steps:
Step 1: build your executable inside base-image:build, and then put your executable to some place, like NFS or any registry from where you can fetch it for later use;
Step 2: write your Dockerfile which FROM base-image:runtime, fetch your artifact/executable from wherever generated by Step 1, docker build your delivery image, and then docker push to your registry for release.
Hope this could be helpful :-)

mkdir local_dir
docker run -dv $PWD/local_dir:/mnt BUILD_CONTAINER
compile code and save it to /mnt in the container. It'll be written to local_dir on your host filesystem and persist after the container is destroyed.
You Should now write a Dockerfile and add a step to copy in the new binary, then build. But for example's sake...
docker run -dv $PWD/local_dir:/mnt PROD_CONTAINER
Your bin, and everything else in local_dir, will reside in the container at /mnt/

Related

How does docker create a cached layer for RUN command?

I'm new to docker, II read from a book which says:
Dockerfile instruction results in an image layer, but if the
instruction doesn’t change between builds, and the content going into
the instruction is the same, Docker knows it can use the previous
layer in the cache. An example forCOPY instuction, Docker
calculates whether the input has a match in the cache by generating a
hash, which is like a digital fingerprint representing the input. The
hash is made from the Dockerfile instruction and the contents of any
files being copied. If there’s no match for the hash in the existing
image layers, Docker executes the instruction, and that breaks the
cache. As soon as the cache is broken, Docker executes all the
instructions that follow, even if they haven’t changed.
I can understand it, but is every instruction eligible for docker to create a cache layer?
Let's take RUN instruction for example
RUN dotnet build "WebApplication.csproj" -c Release -o /app/build
WebApplication.csproj is most likely the same (we are not adding any third party package, only modify source code). since the content of "WebApplication.csproj" is the same, should Docker use the cache layer generated before? if docker does use cache layer, which can cause problem because we modified the source code, I don't think Docker is smart enough to check every source files in our project?

Can I run scripts from a docker build context without a copy?

I want to build on top of a windows docker container by installing a couple programs. The files total .5 GB and I want to keep the layers as small as possible. I am hoping I can run the setup files from the build-context, and then have the build-context swept away at the end so I don't have a needless copy of the source files for the setup.exe embedded in my container layers. However, I have not found an example where this is the case. Instead I mostly see people run a COPY command to a temporary build folder, run their setup, then remove the folder. Won't those files still be in the container layers because the COPY command creates a new layer when it's done?
I don't know if the container can see the build-context directly. I was hoping for some magical folder filled with the build-context files so I could run a script using it, but haven't found anything.
It seems like the alternative is to create a private file-server and perform a RUN that can download them from that private server and unpack them, run the install, and remove them (all as 1 docker step). I understand this would make it more available to others who need to rerun the build, but I'm not convinced we'll need to rerun it. It's not likely to change as the container will build patches for a legacy application. Just seems like a lot to host files on a private, public-facing server for something that will get called once every couple years if ever.
So are these my two options?
Make a container with needless copies of source files embedded within
Host the files on a private file server and download/install/remove them
Or am I missing another option or point about how the containers work?
It's a long shot as Windows is a tricky thing with file system, but you could do this way:
In your Dockerfile use a COPY command, install then RUN del ... to remove the installation files
Build your image docker build -t my-large-image:latest .
Run your image docker run --name my-large-container my-large-image:latest
Stop the container
Export your container filesystem docker export my-large-container > my-large-container.tar
Import the filesystem to a new image cat my-large-container.tar | docker import - my-small-image
Caveat is you need to run the container once which might not be what you want. And also I haven't tested with windows container, sorry.
I usually do the download or copy in one step, then in the next step I do the silent installation and remove the installer.
# escape=`
FROM mcr.microsoft.com/dotnet/framework/wcf:4.8-windowsservercore-ltsc2016
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
ADD https://download.visualstudio.microsoft.com/download/pr/6afa582f-fa26-4a73-8cb9-194321e85f8d/ecea51ead62beb7acc73ad9799511ffdb3083ad384fe04ec50e2cbecfb426482/VS_RemoteTools.exe VS_RemoteTools_x64.exe
RUN Start-Process .\\VS_RemoteTools_x64.exe -ArgumentList #('/install','/quiet','/norestart') -NoNewWindow -Wait; `
Remove-Item -Path C:/VS_RemoteTools_x64.exe -Force;
But otherwise, I don't think you can mount a custom volume while it's being built.
I didn't find a satisfactory answer to this. Docker seems designed for only the modern era and assumes you'll be able to download what you need via scripts and tools hitting APIs and file servers. The easiest option I found that I eventually went with was to host the files on a private file server or service (in my case, AWS S3).
I really wish there was a way to have files hosted by the docker daemon in some way, eg. if it acted like a temporary server that you could get data from via http instead of needing to COPY the files and create a layer. Alas, I found no such feature.
Taking this route made my container about a GB smaller.

How to use big file only to build the container without adding it?

I have a big tar/executable (over 30GB) I COPY/ADD it but this is used only for the installation. Once the application is installed I don't need it anymore.
How can I do? I am trying to use it but:
Everytime I run a build, it takes minutes to define the build context.
I'd like to share this image, if I create a tar with docker save, Is the final version or each layer included in it?
I found some solutions that said I can use RUN wget tar ... && rm tar but I don't want to create webserver for that.
Why isn't possible to mount a volume during build process?! It would be very useful.
Use Docker's multi-stage builds. This mechanism allows you to drop intermediate artifacts and therefore achieve a lightweight image.
Example:
FROM alpine:latest as build
# copy large file
# build
FROM alpine:latest as output
# copy necessary files built in the previous stage
COPY --from=build app /app
Anything built in the build stage will not be included in the final image, unless you explicitly COPY them.
Docs: https://docs.docker.com/develop/develop-images/multistage-build/
This is solvable using 2 different context.
Please follow these steps as mentioned below.
Objective is to create a
docker image that will have you large-build file.
docker image that will have you real codebase/executables.
For this you have to create 2 folders (Build & CodeBase) as follow.
Application<br/>
|---> BUILD <br/>
|======|--->Large-File<br/>
|======|--->Dockerfile<br/>
|--->CodeBase<br/>
|======|--->SRC+Other stuff<br/>
|======|--->Dockerfile<br/>
Build & Codebase both folders will have individual Dockerfile and arrange files accordingly.
Dockerfile(Build)
FROM **Base-Image**
COPY Large-File /tmp/Large-File
Build this and tag it with some name like (base-build-app-image)
#>cd Application <==Application root folder as mentioned above==>
#>docker build -t base-build-app-image BUILD <==path of your build-folder==>
Dockerfile(Codebase)
FROM base-build-app-image
RUN *****
CMD *****
RUN rm -f **/tmp/Large-File**
RUN rm -f **Remove installation files that is not required**
ENTRYPOINT *****
Build this-code-base and base-build-app-image is already in your local docker-repository and your large iso file is not in the current-buid-context
#>cd Application <==Application root folder as mentioned above==>
#>docker build CodeBase <==path of your code-base==>
This time since the context size is only your code base and since this doesn't include that Large file - it will definitely reduce your build time.
You can also take an advance of using docker-compose to do both operations together so you will not have to execute 2 separate commands.
If you need help on preparing this docker-compose file then do let me know in comments.
If anything is not clear then leave a comment or come over chat to fix this issue.

What does "From image" do in dockerfiles

I see that dockerfiles usually have a line beginning with "from" keywork, for example:
FROM composer/composer:1.1-alpine AS composer
As far as I know, dockerfiles are a set of commands that help to build and run many containers in docker.
As the example above, docker uses a image named composer/composer:1.1-alpine from docker hub. The As composer just make an alias, so we can use it more convenient.
When I looked for the image, I found the link enter link description here and then enter link description here.
The thing I dont really understand is that:
I guess docker will use the image to build something, but how exactly does it use the image? Does docker run the image or just prepare to use it when in need. Sometimes I dont see the dockerfiles use the image in following line (like this example, there are no lines using the keyword "composer" except the first line). It makes me confused.
Any help would be appreciated.
Thanks.
DockerFiles describes layers: Each command creates it's own layer. For example:
RUN touch test.txt
RUN cp test.txt foo.txt
would create two layers - the first one with the file test.txt, the second one without test.txt but with foo.txt
Each layer adds something to a container. When we walk the layers "up" we find that the very first layer is the empty layer, e.g. it contains only the linux (or windows) kernel itself - nothing else. But that's not really useful - we need a lot of tools (e.g. bash) to be able to run an app. So common base images like alpine add suc tools and core os functions.
It would be annoying as hell if we had to do this setup in every container so there a lots of base images, which do exactly this kind of setup.
If you want to see what a base image does, just search the name on hub.docker.com - there you will find the Dockerfile, describing the build process.
Aditionally, containers can be extenend, e.g. you use the elasticsearch container as a base image, and add your own functionality - that's the second use case for base images.
For your second question: You have to decide if you have to replicate the steps in your base image or not. If you inherit from a general OS image like apline - probably not, since linux normally ships with these tools. If you inherit from a more specialized container, it depends - if your machine matches the environment in the container, you don't need to, but if not you will have to apply these steps to your machine, too. E.g. if you don't have elasticsearch installed, you have to install it.
As for multiple froms in one Dockerfile: Please look up the documentation for Multi Stage builds. Essentially, they encapsulate multiple containers in a single dockerfile. Which can be very useful if you need a different set to build an app and to run the app. The first Container is responsible to build your app, while the second one takes the compiled source code and just runs it.
Watch for COPY --from= lines, these are copying files from one container to another.
The FROM instruction initializes a new build stage and sets the Base Image for subsequent instructions. As such, a valid Dockerfile must start with a FROM instruction. The image can be any valid image – it is especially easy to start by pulling an image from the Public Repositories.
FROM can appear multiple times within a single Dockerfile to create multiple images or use one build stage as a dependency for another. Simply make a note of the last image ID output by the commit before each new FROM instruction. Each FROM instruction clears any state created by previous instructions.
Optionally a name can be given to a new build stage by adding AS name to the FROM instruction. The name can be used in subsequent FROM and COPY --from= instructions to refer to the image built in this stage.
The tag or digest values are optional. If you omit either of them, the builder assumes a latest tag by default. The builder returns an error if it cannot find the tag value.
Taken from : https://docs.docker.com/engine/reference/builder/#from

How to specify different .dockerignore files for different builds in the same project?

I used to list the tests directory in .dockerignore so that it wouldn't get included in the image, which I used to run a web service.
Now I'm trying to use Docker to run my unit tests, and in this case I want the tests directory included.
I've checked docker build -h and found no option related.
How can I do this?
Docker 19.03 shipped a solution for this.
The Docker client tries to load <dockerfile-name>.dockerignore first and then falls back to .dockerignore if it can't be found. So docker build -f Dockerfile.foo . first tries to load Dockerfile.foo.dockerignore.
Setting the DOCKER_BUILDKIT=1 environment variable is currently required to use this feature. This flag can be used with docker compose since 1.25.0-rc3 by also specifying COMPOSE_DOCKER_CLI_BUILD=1.
See also comment0, comment1, comment2
from Mugen comment, please note
the custom dockerignore should be in the same directory as the Dockerfile and not in root context directory like the original .dockerignore
i.e.
when calling
DOCKER_BUILDKIT=1
docker build -f /path/to/custom.Dockerfile ...
your .dockerignore file should be at
/path/to/custom.Dockerfile.dockerignore
At the moment, there is no way to do this. There is a lengthy discussion about adding an --ignore flag to Docker to provide the ignore file to use - please see here.
The options you have at the moment are mostly ugly:
Split your project into subdirectories that each have their own Dockerfile and .dockerignore, which might not work in your case.
Create a script that copies the relevant files into a temporary directory and run the Docker build there.
Adding the cleaned tests as a volume mount to the container could be an option here. After you build the image, if running it for testing, mount the source code containing the tests on top of the cleaned up code.
services:
tests:
image: my-clean-image
volumes:
- '../app:/opt/app' # Add removed tests
I've tried activating the DOCKER_BUILDKIT as suggested by #thisismydesign, but I ran into other problems (outside the scope of this question).
As an alternative, I'm creating an intermediary tar by using the -T flag which takes a txt file containing the files to be included in my tar, so it's not so different than a whitelist .dockerignore.
I export this tar and pipe it to the docker build command, and specify my docker file, which can live anywhere in my file hierarchy. In the end it looks like this:
tar -czh -T files-to-include.txt | docker build -f path/to/Dockerfile -
Another option is to have a further build process that includes the tests. The way I do it is this:
If the tests are unit tests then I create a new Docker image that is derived from the main project image; I just stick a FROM at the top, and then ADD the tests, plus any required tools (in my case, mocha, chai and so on). This new 'testing' image now contains both the tests and the original source to be tested. It can then simply be run as is or it can be run in 'watch mode' with volumes mapped to your source and test directories on the host.
If the tests are integration tests--for example the primary image might be a GraphQL server--then the image I create is self-contained, i.e., is not derived from the primary image (it still contains the tests and tools, of course). My tests use environment variables to tell them where to find the endpoint that needs testing, and it's easy enough to get Docker Compose to bring up both a container using the primary image, and another container using the integration testing image, and set the environment variables so that the test suite knows what to test.
Sadly it isn't currently possible to point to a specific file to use for .dockerignore, so we generate it in our build script based on the target/platform/image. As a docker enthusiast it's a sad and embarrassing workaround.

Resources