Is it possible to run an x64 executable in a Linux ARM Docker container that is emulated on an x64 machine?
I would like to use this to achieve fast cross compilations without changing the architecture of the build system or docker image. I currently compile C++ and C source code in the arm docker but all executables are emulated via QEMU which results in very slow compile times. If the compiler and linker executable were instead x64 executables the whole process would be accelerated.
I know that there is a working alternative to this approach, which I'd like to avoid:
Extract the whole docker filesystem on the host system
Use clang or gcc with the --sysroot argument to cross compile using this extracted filesystem
From the docker side, this is fairly straightforward using buildkit. You would create a Dockerfile that contains something like:
FROM --platform=$BUILDPLATFORM your_build_image as build
WORKDIR /src/
COPY . /src/
ARG TARGETOS
ARG TARGETARCH
RUN make cross-compile-${TARGETOS}-${TARGETARCH}
FROM your_target_image
COPY --from=build /src/app /usr/local/bin/app
The key parts in there:
--platform=$BUILDPLATFORM: this uses an image that matches your builder OS/architecture, rather than the target you are building
${TARGETOS} and ${TARGETARCH}: these are injected automatically by buildx and correspond to Go's GOOS and GOARCH values (since Docker is written in Go).
The RUN make ... depends on how you build your app for different platforms, change that as appropriate along with adjusting the paths in the COPY.
With that, you can run:
# if you have buildkit enabled and want a single platform image
docker build --platform=linux/amd64 .
# or use buildx for creating multiplatform images, this requires pushing to a registry
docker buildx build --platform=linux/amd64,linux/arm64 --push -t $REGISTRY/$IMAGE:$TAG .
Run that from a linux/arm64 system with linux/amd64 as the target, and BUILDPLATFORM will be set to linux/arm64, TARGETOS to linux, and TARGETARCH to amd64.
You can see more about the automatic build args in the Dockerfile docs.
Beyond that, if you need help doing the actual C/C++ cross compile, that will likely need someone else to help, maybe a separate question including an example of errors you're encountering.
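To make the mapping concrete, here is a minimal shell sketch of how a --platform value splits into the TARGETOS and TARGETARCH parts used by the RUN line in the Dockerfile above. BuildKit does this split itself, so this is only an illustration. The helper name make_target is hypothetical, and the cross-compile-* naming is just the placeholder from the example Dockerfile.

```shell
#!/bin/sh
# Illustration only: BuildKit sets TARGETOS and TARGETARCH by itself.
# make_target is a hypothetical helper name.
make_target() {
    platform=$1                 # e.g. linux/amd64 or linux/arm/v7
    target_os=${platform%%/*}   # text before the first slash
    rest=${platform#*/}         # text after the first slash
    target_arch=${rest%%/*}     # drop an optional /variant suffix
    echo "cross-compile-${target_os}-${target_arch}"
}

make_target linux/amd64
make_target linux/arm64
```

Each platform listed in --platform gets its own TARGETOS/TARGETARCH pair, so a two-platform build runs the RUN step once per target.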
Related
Requirement: An application has to be containerised as a docker image and needs to support arm64 and amd64 architectures.
Codebase: It is a golang application that needs to make use of git2go library and must have CGO_ENABLED=1 to build the project. The minimum reproducible example can be found here on github.
Host machine: I am using arm64 M1 mac and docker desktop to build the app but the results are similar on our amd64 Jenkins CI build system.
Dockerfile:
FROM golang:1.17.6-alpine3.15 as builder
WORKDIR /workspace
COPY go.mod go.mod
COPY go.sum go.sum
RUN apk add --no-cache libgit2 libgit2-dev git gcc g++ pkgconfig
RUN go mod download
COPY main.go main.go
ARG TARGETARCH TARGETOS
RUN CGO_ENABLED=1 GO111MODULE=on GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -tags static,system_libgit2 -a -o gitoperations main.go
FROM alpine:3.15 as runner
WORKDIR /
COPY --from=builder /workspace/gitoperations .
ENTRYPOINT ["/gitoperations"]
Build steps:
docker buildx create --name gitops --use
docker buildx build --platform=linux/amd64,linux/arm64 --pull .
This setup works, but the build takes far too long when building for a different architecture. This specific build step:
RUN CGO_ENABLED=1 GO111MODULE=on GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -tags static,system_libgit2 -a -o gitoperations main.go
always takes about 10x longer when building for a different arch:
example:
On arm64 M1 mac (without Rosetta): building the arm64 executable takes ~30s and the amd64 one takes ~300s.
On our amd64 Jenkins CI system: Building arm64 executable takes 10x longer than building amd64 executable.
These build times can be seen in the docker buildx build command output.
I believe (and I can most certainly be wrong) it's happening because Docker uses QEMU emulation when building for a CPU architecture that's not the same as the host machine's. So I want to make use of Go's cross-compilation capabilities to speed up the build.
What I have tried: I thought of having a single builder stage in this dockerfile for arm and amd arch by trying this syntax:
FROM --platform=$BUILDPLATFORM golang:1.17.6-alpine3.15 as builder
But using the same docker build commands after making this change to dockerfile gives build errors, this is what I get when running on arm64 M1 mac:
> [linux/arm64->amd64 builder 9/9] RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=amd64 go build -tags static,system_libgit2 -a -o gitoperations main.go:
#0 1.219 # runtime/cgo
#0 1.219 gcc: error: unrecognized command-line option '-m64'
After reading through golang CGO documentation I think this error is happening because go is not selecting the correct c compiler that is able to build for both architectures and I need to set the CC env variable to instruct go which c compiler to use.
Question: Am I right in assuming that qemu is causing the build time difference and it can be reduced by using golang's native cross-compilation functionality?
How can I make go build compile for amd64 and arm64 from any host machine using Docker Desktop? I don't have any experience working with C code and gcc, and I am not sure what value I should set for CC in the go build command if I need to support both linux/amd64 and linux/arm64.
By the way, to share a bit of my experience: I also tried building a Go application's Docker image using GitHub Actions, with go build running inside Docker. Although the application was not very big, the process felt quite long. When I built the binary outside Docker instead, it was much faster, especially when the cache from the previous build was kept.
To compile C code with cgo, you need to set the CC variable to a cross compiler for the target architecture. You can see the current value of CC with go env. The error you get comes from the native compiler in the build image, which cannot produce code for the other architecture. On a Debian-based image you can install a cross compiler in your Dockerfile with apt-get, for example apt-get install -y gcc-aarch64-linux-gnu for arm64 (gcc-arm-linux-gnueabi targets 32-bit ARM); the Alpine image in the question has no apt-get. Once the cross-compilation tools are installed, point CC at the cross compiler, and you should be able to compile your application for arm64.
Could you also share your go env output? You might have to edit the GOGCCFLAGS variable too.
Yes, you are correct in assuming that qemu is causing the build time difference and using golang's native cross-compilation functionality could help reduce the build time.
To compile for both amd64 and arm64 architectures from any host machine using Docker Desktop, you can set the CC environment variable in the go build command to specify the compiler that Go should use.
For example, to cross-compile for arm64, you can set CC=aarch64-linux-gnu-gcc. For amd64 on an amd64 builder, you can set CC=gcc. A value passed with --build-arg applies to every platform in a docker buildx build invocation, so for a multi-platform build the choice has to be made inside the Dockerfile, based on TARGETARCH.
Note that the appropriate cross-compilation tools, such as aarch64-linux-gnu-gcc for arm64, need to be installed in the builder image for this to work.
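As a rough sketch of how CC could be chosen per target, the shell function below picks the native gcc when the builder and target architectures match, and a cross compiler otherwise. The function name choose_cc is hypothetical. The compiler names are the Debian-style ones (from packages such as gcc-aarch64-linux-gnu and gcc-x86-64-linux-gnu), so this assumes a Debian-based builder image rather than the Alpine image from the question. It also assumes the C libraries being linked (libgit2 here) are available for the target architecture.

```shell
#!/bin/sh
# choose_cc BUILDARCH TARGETARCH
# Hypothetical helper: prints the C compiler to use for cgo.
# Compiler names assume Debian-style cross toolchains are installed.
choose_cc() {
    build_arch=$1
    target_arch=$2
    if [ "$build_arch" = "$target_arch" ]; then
        echo gcc                      # native build, no cross compiler needed
        return 0
    fi
    case $target_arch in
        amd64) echo x86_64-linux-gnu-gcc ;;
        arm64) echo aarch64-linux-gnu-gcc ;;
        *) echo "unsupported TARGETARCH: $target_arch" >&2; return 1 ;;
    esac
}

choose_cc arm64 amd64   # cross build on an arm64 builder
choose_cc amd64 amd64   # native build
```

Inside a RUN step, with ARG BUILDARCH and ARG TARGETARCH declared, the result could be passed as CC="$(choose_cc "$BUILDARCH" "$TARGETARCH")" in front of the existing go build command.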
I have a few Dockerfiles right now.
One is for Cassandra 3.5, and it is FROM cassandra:3.5
I also have a Dockerfile for Kafka, but it is quite a bit more complex. It is FROM java:openjdk-8-jre and it runs a long command to install Kafka and Zookeeper.
Finally, I have an application written in Scala that uses SBT.
For that Dockerfile, it is FROM broadinstitute/scala-baseimage, which gets me Java 8, Scala 2.11.7, and SBT 0.13.9, which are what I need.
Perhaps, I don't understand how Docker works, but my Scala program has Cassandra and Kafka as dependencies and for development purposes, I want others to be able to simply clone my repo with the Dockerfile and then be able to build it with Cassandra, Kafka, Scala, Java and SBT all baked in so that they can just compile the source. I'm having a lot of issues with this though.
How do I combine these Dockerfiles? How do I simply make an environment with those things baked in?
You can, with the multi-stage builds feature introduced in Docker 17.05.
Take a look at this:
FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
Then build the image normally:
docker build -t alexellis2/href-counter:latest .
From: https://docs.docker.com/develop/develop-images/multistage-build/
The end result is the same tiny production image as before, with a significant reduction in complexity. You don’t need to create any intermediate images and you don’t need to extract any artifacts to your local system at all.
How does it work? The second FROM instruction starts a new build stage with the alpine:latest image as its base. The COPY --from=0 line copies just the built artifact from the previous stage into this new stage. The Go SDK and any intermediate artifacts are left behind, and not saved in the final image.
You can't combine dockerfiles as conflicts may occur. What you want to do is to create a new dockerfile or build a custom image.
TL;DR;
If your current development container contains all the tools you need and works, then save it as an image, upload it to a repo, and create a dockerfile that pulls that image from the repo.
Details:
Building a custom image is by far easier than creating a dockerfile using a public image, as you can store whatever hacks and mods you like in the image. To do so, start a blank container from a basic Linux image (or broadinstitute/scala-baseimage), install whatever tools you need and configure them until everything works correctly, then save it (the container) as an image. Create a new container from this image and test whether you can build your code on top of it via docker-compose (or however you want to build it). If it works, then you have a working base image that you can upload to a repo so others can pull it.
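The save-and-upload step described here maps onto docker commit and docker push. A minimal sketch follows; the function name, the container name and the repository name are placeholders, and the DOCKER variable is only there so the commands can be previewed with DOCKER=echo before running them for real.

```shell
#!/bin/sh
# Hypothetical helper: save a configured container as an image and push it.
# Set DOCKER=echo to print the commands instead of running them.
save_and_upload() {
    container=$1    # name or ID of the container you configured by hand
    image=$2        # placeholder, e.g. myuser/scala-dev:latest
    "${DOCKER:-docker}" commit "$container" "$image" &&
        "${DOCKER:-docker}" push "$image"
}
```

For example, DOCKER=echo save_and_upload devbox myuser/scala-dev:latest prints the two commands without touching Docker. A docker login is still needed before a real push.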
To build a dockerfile with a public image, you will need to put all hacks, mods and setup on the dockerfile itself. That is, you will need to place every command line that you used into a text file and reduce whatever hacks, mods and setup into command lines. At the end, your dockerfile will create an image automatically and you don't need to store this image into a repo and all you need to do is to give others the dockerfile and they can spin the image up at their own docker.
Note that once you have a working dockerfile, you can tweak it easily, as it creates a new image every time you build from it. With a custom image, you may run into issues where you need to rebuild the image due to conflicts. For example, all of your tools work with openjdk until you install one that doesn't. The fix may involve uninstalling openjdk and using the Oracle one, but then all the configuration you did for the tools you had already installed is broken.
The following answer applies to Docker 17.05 and above:
I prefer to use COPY --from=NAME together with FROM image AS NAME.
Why?
You can use --from=0 and higher indices, but this gets hard to manage when a dockerfile has many stages.
sample example:
FROM golang:1.7.3 as backend
WORKDIR /backend
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN #install some stuff, compile assets....
FROM golang:1.7.3 as assets
WORKDIR /assets
RUN ./getassets.sh
FROM node:latest as frontend
RUN npm install
WORKDIR /assets
COPY --from=assets /assets .
CMD ["./app"]
FROM alpine:latest as mergedassets
WORKDIR /root/
COPY --from=frontend . /
COPY --from=backend ./backend .
CMD ["./app"]
Note: Managing the dockerfile properly helps Docker build the image much faster. Internally, Docker uses layer caching to speed this up in case the image has to be rebuilt.
Yes, you can roll a whole lot of software into a single Docker image (GitLab does this, with one image that includes Postgres and everything else), but generalhenry is right - that's not the typical way to use Docker.
As you say, Cassandra and Kafka are dependencies for your Scala app, they're not part of the app, so they don't all belong in the same image.
Having to orchestrate many containers with Docker Compose adds an extra admin layer, but it gives you much more flexibility:
your containers can have different lifespans, so when you have a new version of your app to deploy, you only need to run a new app container, you can leave the dependencies running;
you can use the same app image in any environment, using different configurations for your dependencies - e.g. in dev you can run a basic Kafka container and in prod have it clustered on many nodes, your app container is the same;
your dependencies can be used by other apps too - so multiple consumers can run in different containers and all work with the same Kafka and Cassandra containers;
plus all the scalability, logging etc. already mentioned.
When might you want to "combine" Docker images?
As others are pointing out here, you typically don't want to put your database and your application into the same Docker image. Ideally you want a Docker image to wrap a "single process"/"runtime". This allows each process to be scaled up/down and restarted individually.
Let's say you want to use some shared C-libraries/executables that are not available in the package manager of the image you are using, but someone else has created an image where they are precompiled - and you might not want to recompile these binaries as part of your build (depending on how long this takes). Is there a way to quickly create a POC-Docker image containing all of these executables/libraries based on the existing images?
Docker and Composition
Relevant discussion: https://github.com/moby/moby/issues/3378
What Docker lacks is a good way of composing images. You can copy individual files or entire file systems from other images into your own using COPY --from=<image> <from-path> <to-path>. There is no builtin way of copying the environment variables from another image into your own.
That said, I have personally created a custom frontend/parser for Dockerfiles that adds an INCLUDE <image>-keyword. This copies the entire filesystem, along with the environment variables into your image:
DOCKER_BUILDKIT=1 docker build -t myimage .
#syntax=bergkvist/includeimage
FROM alpine:3.12.0
INCLUDE rust:1.44-alpine3.12
INCLUDE python:3.8.3-alpine3.12
nixpkgs.dockerTools
If you want truly composable Docker builds, I recommend checking out dockerTools in nixpkgs. This will also result in more reproducible (and typically very small) images. See https://nix.dev/tutorials/building-and-running-docker-images
docker load < $(nix-build docker-image.nix)
# docker-image.nix
let
pkgs = import <nixpkgs> {};
python = pkgs.python38;
rustc = pkgs.rustc;
in pkgs.dockerTools.buildImage {
name = "myimage";
tag = "latest";
contents = [ python rustc ];
}
Docker doesn't merge images, but there isn't anything stopping you from combining the dockerfiles, if available, and rolling them into a fat image which you'd need to build. There are times where this makes sense; however, as for running multiple processes in a container, most Docker dogma will point to this as less desirable, especially with a microservice architecture (however, rules are there to be broken, right?).
You cannot combine Docker images into one container. See the detailed discussion in the Moby issue, How do I combine several images into one via Dockerfile.
For your case, it is better to not include the whole Cassandra and Kafka images. The application would only need the Cassandra Scala driver and Kafka Scala driver. The container should include the drivers only.
I needed docker:latest and python:latest images for Gitlab CI. Here is what I came up with:
FROM ubuntu:latest
RUN apt update
RUN apt install -y sudo
RUN sudo apt install -y docker.io
RUN sudo apt install -y python3-pip
RUN sudo apt install -y python3
RUN docker --version
RUN pip3 --version
RUN python3 --version
After that I built it and pushed it to my Docker Hub repo:
docker build -t docker-hub-repo/image-name:latest -f path/to/Dockerfile .
docker push docker-hub-repo/image-name:latest
Don't forget to docker login before push
Hope it helps
So, I'm trying to get into embedded rust, for which I had to use the nightly version of rust, and modify my .cargo/config.toml to change the target device, and stuff. I decided to use docker, as I didn't want this interfering with my main installation. I don't know much about docker, but I'm assuming, it's quite similar to pipenv, where what I do with the docker image, doesn't affect anything outside it. (Unless I run the code)
So, this is how my Dockerfile looks
FROM jdrouet/rust-nightly:buster-slim AS builder
WORKDIR /usr/source/myapp
COPY . .
RUN cargo build --release
CMD cargo run
When I run sudo docker build . -t name, it gives me the error I used to get before modifying my .cargo/config.toml file, which is a good thing, I'm guessing, because now I can revert to my original configuration and make the changes to this image's config files. But I'm not able to find the configuration files for this Docker image. I don't know what WORKDIR does, but there is no folder called /source in my /usr directory.
So, I'm trying to get into embedded rust, for which I had to use the nightly version of rust, and modify my .cargo/config.toml to change the target device, and stuff
You can put a file in the folder wherever/your/project/is/.cargo/config.toml, and it will only impact the project(s) in that directory.
source: Cargo Book
I don't know much about docker, but I'm assuming, it's quite similar to pipenv
Docker is actually quite different to Pipenv. Cargo is similar to Pipenv in that it manages your dependencies for you (Cargo.toml vs Pipfile), distinguishes between regular dependencies vs dev dependencies vs build-time dependencies, etc. Docker is a level of isolation beyond this -- a Docker container is a completely different filesystem from your actual computer. The Dockerfile is a recipe that tells Docker how to build an image of your container, which Docker can run.
Basically, WORKDIR /usr/source/myapp creates a folder /usr/source/myapp in the Docker container's file system, and cd's into it for the rest of the Dockerfile. This means that the following line, COPY . ., will copy everything in the same folder as the Dockerfile into the folder /usr/source/myapp in the container.
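In plain shell terms, WORKDIR behaves roughly like the small function below: create the directory if it is missing, then change into it. This is only an analogy run in a throwaway directory on your own machine; the function name workdir is made up for the illustration, and nothing here touches Docker.

```shell
#!/bin/sh
# Rough shell equivalent of the Dockerfile instruction WORKDIR <dir>:
# create the directory if it is missing, then make it the current directory.
workdir() {
    mkdir -p "$1" && cd "$1"
}

# Demonstrate in a temporary directory instead of the real /usr.
root=$(mktemp -d)
workdir "$root/usr/source/myapp"
pwd   # the printed path ends in /usr/source/myapp
```

The difference is that in a Dockerfile the directory is created inside the image's filesystem, which is why no such folder shows up under /usr on the host.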
I bet if you open a shell into the Docker container like so:
# Build the docker container
docker build . -t my-cool-project:latest
# Run it
docker run -it my-cool-project:latest bash
you should be able to cd /usr/source/myapp and see all your stuff.
I'm having a few issues trying to build a Docker container that runs one Haskell application indefinitely. For starters, I'd like to use a base image that provides a program I need to use from my code. It is based on scratch linux. However, when I build my Haskell program and copy it to that container, I get an error:
standard_init_linux.go:211: exec user process caused "no such file or directory"
Next, I would like to keep my build process and file structure very simple if possible. I have just one script in Haskell in Main.hs, and it has one dependency, on process. If it's possible and reasonable to avoid both a stack and a cabal file as well as subdirectories and all that, it'd be nice if the build directives were just in the Dockerfile or in the Haskell file.
However I have an issue with the build in that the stack ghc line takes several minutes to download ghc and process and build everything and that line reexecutes whenever I make a small code change. This makes development very difficult.
What's a better process for running a simple Haskell script in a Docker image?
Here is my simplified Docker image:
# Pretty standard just using the latest stack-build
FROM fpco/stack-build:lts-15.4 as haskell
# Setup a build dir and copy code to it
WORKDIR /opt/build
COPY Main.hs /opt/build
# This step takes forever and reruns every time I make a code change.
RUN stack ghc --package process -- Main.hs
# Alpine failed here for file not found.
FROM ubuntu:latest
COPY --from=haskell /opt/build/Main /Main
ENTRYPOINT ["/Main"]
A simplified version of the Haskell program.
import System.Process (readProcess)
import Control.Monad (forever)
main = forever $ do
output <- readProcess "/bin/ls" [] ""
print output
That image is intended to be used with Haskell Stack's Docker integration. One very reasonable path is just to use that path to build a binary in a host-system directory, and then use the second half of that Dockerfile to package the binary into a Docker image.
If you look at what gets built, it's a dynamically linked binary that has a non-default dependency. If I change ubuntu to alpine (temporarily) and change ENTRYPOINT to CMD then I can run
$ docker run --rm 101681db8d96 ldd /Main
Error loading shared library libgmp.so.10: No such file or directory (needed by /Main)
This also will not start with the musl libc that's packaged in the Alpine image (it's not obvious why), so you need to install the GNU libc compatibility package as well as the libgmp package.
(Since it's a dynamically linked binary, you also can't run it in a FROM scratch image, unless you're willing to hand-install GNU libc and the other supporting libraries you need.)
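To list which shared libraries are missing, the ldd output can be filtered with a small helper like the one below. It is a sketch with a hypothetical function name, and it recognizes two message formats: the "=> not found" lines printed by glibc's ldd and the "Error loading shared library" lines shown above from the Alpine image.

```shell
#!/bin/sh
# missing_libs reads ldd output on stdin and prints the names of the
# shared libraries that could not be found (hypothetical helper).
missing_libs() {
    sed -n \
        -e 's/^[[:space:]]*\([^[:space:]]*\) => not found.*$/\1/p' \
        -e 's/^Error loading shared library \([^:]*\):.*$/\1/p'
}

# Sample line taken from the error shown above.
echo 'Error loading shared library libgmp.so.10: No such file or directory (needed by /Main)' | missing_libs
```

Against a real binary this would be used as ldd /Main 2>&1 | missing_libs, run inside the target image.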
For the build stage, as the image name suggests, it includes a complete copy of LTS Haskell 15.4, but it takes some poking around in the image to find it.
$ docker run --rm -it fpco/stack-build:lts-15.4 sh
In this shell, you can find the Stack installation in /home/stackage/.stack; pointing the STACK_ROOT environment variable at that directory will make the stack command find it. That avoids the need to download ghc and the rest of the LTS Haskell environment again on rebuild. Once you've done that, the rest of your Dockerfile works pretty much as you've shown.
That leaves us with the final Dockerfile:
FROM fpco/stack-build:lts-15.4 as haskell
# Tell `stack` where to find its content (not in $HOME)
ENV STACK_ROOT /home/stackage/.stack
WORKDIR /opt/build
COPY Main.hs .
RUN stack ghc --package process -- Main.hs
# Switch Ubuntu back to Alpine
FROM alpine:latest
# Add the libraries we need to run the application
RUN apk add libc6-compat gmp
COPY --from=haskell /opt/build/Main /Main
CMD ["/Main"]
New to docker here. I followed the instructions here to make a slim & trim container for my Go Project. I do not fully understand what it's doing though, hopefully someone can enlighten me.
Specifically there are two steps to generating this docker container.
docker run --rm -it -v "$GOPATH":/gopath -v "$(pwd)":/app -e "GOPATH=/gopath" -w /app golang:1.8 sh -c 'CGO_ENABLED=0 go build -a --installsuffix cgo --ldflags="-s" -o hello'
docker build -t myDockerImageName .
The DockerFile itself just contains
FROM iron/base
WORKDIR /app
COPY hello /app/
ENTRYPOINT ["./hello"]
I understand (in a broad sense) that the 1st step is compiling the go program and statically linking the C-dependencies (and doing all this inside an unnamed docker container). The 2nd step just generates the docker image according to the instructions in the DockerFile.
What I don't understand is why the first command starts with docker run (why does it need to be run inside a docker container? Why are we not just generating the Go binary outside of it, and then copying it in?)
And if it's being run inside a docker container, how is the binary generated in the docker container being dropped on my local machine's file system? (e.g. why do I need to copy the binary back into the image - as it seems to be doing on line 3 of the DockerFile?)
You're actually using 2 different docker containers, each with a different image. The first container is only around during the compilation; it uses the image golang:1.8. What you're doing is mounting your current working directory into that container and compiling it with the version of Go contained in the image.
The second command builds your custom image that uses the iron/base image as its base. You then copy your built application into that image and run it.
Using a golang container to build the binary is usually done for reproducibility of the build process, i.e.:
it ensures that the same Go version is always used,
compilation takes place in an always clean environment,
and the build host needs no Go installed at all, or can have a different version.
This way, all parts needed to build the "hello" image can be tracked in a version control system.
However, this example mounts the whole local GOPATH, defeating the above purpose. Dependencies must be available to the build container, e.g. by vendoring them. Maybe the author considered vendoring out of scope for his example.
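A variant that mounts only the project directory could look like the sketch below. It assumes the dependencies are vendored under ./vendor and that the project is mounted at its import path under /go/src, since GOPATH is /go in the official golang image. The function name and the import path example.com/hello are placeholders, and DOCKER=echo can be used to preview the command.

```shell
#!/bin/sh
# Hypothetical helper: build inside golang:1.8 with only the project mounted.
# Assumes vendored dependencies; the import path is a placeholder.
# Set DOCKER=echo to print the command instead of running it.
build_in_container() {
    import_path=$1
    src="/go/src/$import_path"   # GOPATH is /go in the official golang image
    "${DOCKER:-docker}" run --rm \
        -v "$(pwd)":"$src" \
        -w "$src" \
        golang:1.8 \
        sh -c 'CGO_ENABLED=0 go build -a -installsuffix cgo -ldflags="-s" -o hello'
}
```

Because the project directory is a bind mount, the hello binary written inside the container appears in the same directory on the host, where the later docker build can COPY it into the image.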