Very slow docker build with go-sqlite3 CGO enabled package

Very slow docker build with go-sqlite3 CGO enabled package - docker

Since I've installed go-sqlite3 as dependency in my go project my docker build time started oscillating around 1 min.
I tried to optimize the build by using go mod download to cache dependencies
But it didn't reduce overall build time.
Then I found out that
go-sqlite3 is a CGO enabled package you are required to
set the environment variable CGO_ENABLED=1 and have a gcc compile
present within your path.
So I run go install github.com/mattn/go-sqlite3 as an extra step and it reduced build time to 17s~
I also tried vendoring, but it didn't help with reducing the build time, installing library explicitly was always necessary to achieve that.
## Build
FROM golang:1.16-buster AS build
WORKDIR /app
# Download dependencies
COPY go.mod .
COPY go.sum .
RUN go mod download
RUN go install github.com/mattn/go-sqlite3 //this reduced build time to around 17s~
COPY . .
RUN go build -o /myapp
But somehow I am still not happy with this solution.
I don't get why adding this package makes my build so long and why I need to explicitly install it in order to avoid such long build times.
Also, wouldn't it be better to install all packages after downloading them?
Do you see any obvious way of improving my current docker build?

The fact of the matter is that the C-based SQLite package just takes a long time to build. I use it myself currently, and yes it's painful every time. I have also been unhappy with it, and have been looking for alternatives. I have been busy with other projects, but I did find this package QL [1], which you can build without C [2]:
go build -tags purego
or if you just need read only, you can try SQLittle [3].
https://pkg.go.dev/modernc.org/ql
https://pkg.go.dev/modernc.org/ql#hdr-Building_non_CGO_QL
https://github.com/alicebob/sqlittle

Related

Cache Rust dependencies with Docker build and lib.rs

I've been trying to create a Dockerfile for my rust build that would allow me to build the application separately from the dependencies as demonstrated here:
Cache Rust dependencies with Docker build
However this doesn't not seem to be working for me, having a slightly different working tree with the lib.rs file. My Dockerfile is laid out like so:
FROM rust:1.60 as build
# create a new empty shell project
RUN USER=root cargo new --bin rocket-example-pro
WORKDIR /rocket-example-pro
# create dummy lib.rs file to build dependencies separately from changes to src
RUN touch src/lib.rs
# copy over your manifests
COPY ./Cargo.lock ./Cargo.lock
COPY ./Cargo.toml ./Cargo.toml
RUN cargo build --release --locked
RUN rm src/*.rs
# copy your source tree
COPY ./src ./src
# build for release
RUN rm ./target/release/deps/rocket_example_pro*
RUN cargo build --release --locked ## <-- fails
# our final base
FROM rust:1.60
# copy the build artifact from the build stage
COPY --from=build /rocket-example-pro/target/release/rocket_example_pro .
# set the startup command to run your binary
CMD ["./rocket_example_pro"]
As you can see initially I copy over the toml files and perform a build, similarly to the previously demonstrated. However with my project structure being slightly different I seem to be having an issues, as my main.rs pretty much only has one line that calls the main method in my lib.rs. lib.rs is also defined in my toml file that gets copied before building dependencies and requires me to touch the lib.rs file for it to not fail building here with it otherwise being missing.
Its at the second build step that I can't seem to resolve, after I've copied over the actual source files to build the application, I am getting the error message
Compiling rocket_example_pro v0.1.0 (/rocket-example-pro)
error[E0425]: cannot find function `run` in crate `rocket_example_pro`
--> src/main.rs:3:22
|
3 | rocket_example_pro::run().unwrap();
| ^^^ not found in `rocket_example_pro`
When performing these steps myself in a empty directory I don't seem to encounter the same errors myself, instead the last step succeeds, but the produced rocket-example-pro executable file still seems to be the shell example project only printing 'Hello world' and not the rocket application i copy over before the second build.
As far as i can figure it seems that the first build is affecting the second, perhaps when I touch the lib.rs file in the dummy shell project, it builds it without the run() method? so when the second one starts, it doesn't see the run method because its empty? but this doesn't make much sense to me as I have copied over the lib.rs file with the run() method inside it.
here's what the toml file looks like if it helps:
[package]
name = "rocket_example_pro"
version = "0.1.0"
edition = "2021"
[[bin]]
name = "rocket_example_pro"
path = "src/main.rs"
[lib]
name = "rocket_example_pro"
path = "src/lib.rs"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
...

(I couldn't reproduce this at first. Then I noticed that having at least one dependency seems to be a necessary condition.)
With the line
RUN rm ./target/release/deps/rocket_example_pro*
you're forcing rebuild of the rocket_example_pro binary. But the library will remain as built from the first empty file. Try changing to
RUN rm ./target/release/deps/librocket_example_pro*
Though personally, I think deleting random files from the target directory is a terribly hacky solution. I'd prefer to trigger the rebuild of the lib by adjusting the timestamp:
RUN touch src/lib.rs && cargo build --release --locked ## Doesn't fail anymore
For a clean solution, have a look at cargo-chef.
[Edit:] So what's happening here?
To decide whether to rebuild, cargo seems to compare the mtime of the target/…/*.d to the mtime of the files listed in the content of the *.d files.
Probably, src/lib.rs was created first, and then docker build was run. So src/lib.rs was older than target/release/librocket_example_pro.d, leading to target/release/librocket_example_pro.rlib not being rebuilt after copying in src/lib.rs.
You can partially verify that that's what's happening.
With the original Dockerfile, run cargo build, see it fail
Run echo >> src/lib.rs outside of docker to update its mtime and hash
Run cargo build, it succeeds
Note that for step 2, updating mtime with touch src/lib.rs is not sufficient because docker will set the mtime when COPYing a file, but it will ignore mtime when deciding whether to use a cached step.

How can I cache packages build artefacts for docker build? [duplicate]

This question already has answers here:
How to pre-build all required modules and cache them
(2 answers)
Closed 1 year ago.
A typical go docker pattern is this:
# cache modules
COPY go.mod .
COPY go.sum .
RUN go mod download
COPY . .
RUN make
This will create a (cached) layer for downloaded packages before compiling the actual sources. It would be great to not only download but also compile the packages before adding the application to further speed up repeated builds.
How would one force-compile all downloaded packages irrespective of the parent application?

The goal of using go mod download in docker build, is to avoid waiting for the dependencies' sources to be compiled. Since you want to avoid that beahviour, you can replace that line with
RUN go get -d -v

How can I cache a nix derivations's dependencies when built via Docker?

FROM nixos/nix#sha256:af330838e838cedea2355e7ca267280fc9dd68615888f4e20972ec51beb101d8
# FROM nixos/nix:2.3
ADD . /build
WORKDIR /build
RUN nix-build
ENTRYPOINT /build/result/bin/app
I have the very simple Dockerfile above that can succesfully build my application. However each time I modify any of the files within the application directory (.), it'll have to rebuild from scratch + download all the nix store dependencies.
Can I somehow grab a "list" of store dependencies downloaded and then add them in on the beginning of the Dockerfile for the purpose of caching them independently (for the ultimate goal of saving time + bandwidth)?
I'm aware I could build this docker image using nix natively which has it's own caching functionality (well the nix store), but I'm trying to have this buildable in a non nix environment (hence using docker).

I can suggest split source in two parts. The idea is to create a separate Docker layer with dependencies only, which changes rarely:
FROM nixos/nix:2.3
ADD ./default.nix /build
# if you have any other Nix files, put them to ./nix subdirectory
ADD ./nix /build/nix
# now let's download all the dependencies
RUN nix-shell --run exit
# At this point, Docker has cached all the dependencies. We can perform the build
ADD . /build
WORKDIR /build
RUN nix-build
ENTRYPOINT /build/result/bin/app

Build multiple docker images without building binaries in each Dockerfile

I have a .NET Core solution with 6 runnable applications (APIs) and multiple netstandard projects. In a build pipeline on Azure DevOps I need to create 6 Docker images and push them to the Azure Registry.
Right now what I do is I build image by image and every one of these 6 Dockerfiles builds the solution from scratch (restores, builds, publishes). This takes a few minutes and the whole pipeline goes almost to 30 minutes.
My goal is to optimize the time of the build. I figured two possible, parallel, ways of doing that:
remove restore and build, run just publish (because it restores references and does the same thing as build)
publish the code once (for all runnable applications) and in Dockerfiles just copy binaries, without building again
Are both ways doable? I can't figure out how to make the second one work - should I just run dotnet publish for each runnable application and then gather all Dockerfiles within the folder with binaries and run docker build? My concern is - I will need to copy required .dll files to the image but how do I choose which ones, without explicitly specifying them?
EDIT:
I'm using Linux containers. I don't write my Dockerfiles - they are autogenerated by Visual Studio. I'll show you one example:
FROM mcr.microsoft.com/dotnet/core/aspnet:2.2-stretch-slim AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
FROM mcr.microsoft.com/dotnet/core/sdk:2.2-stretch AS build
WORKDIR /src
COPY ["Application.WebAPI/Application.WebAPI.csproj", "Application.WebAPI/"]
COPY ["Processing.Dependency/Processing.Dependency.csproj", "Processing.Dependency/"]
COPY ["Processing.QueryHandling/Processing.QueryHandling.csproj", "Processing.QueryHandling/"]
COPY ["Model.ViewModels/Model.ViewModels.csproj", "Model.ViewModels/"]
COPY ["Core.Infrastructure/Core.Infrastructure.csproj", "Core.Infrastructure/"]
COPY ["Model.Values/Model.Values.csproj", "Model.Values/"]
COPY ["Sql.Business/Sql.Business.csproj", "Sql.Business/"]
COPY ["Model.Events/Model.Events.csproj", "Model.Events/"]
COPY ["Model.Messages/Model.Messages.csproj", "Model.Messages/"]
COPY ["Model.Commands/Model.Commands.csproj", "Model.Commands/"]
COPY ["Sql.Common/Sql.Common.csproj", "Sql.Common/"]
COPY ["Model.Business/Model.Business.csproj", "Model.Business/"]
COPY ["Processing.MessageBus/Processing.MessageBus.csproj", "Processing.MessageBus/"]
COPY ["Processing.CommandHandling/Processing.CommandHandling.csproj", "Processing.CommandHandling/"]
COPY ["Processing.EventHandling/Processing.EventHandling.csproj", "Processing.EventHandling/"]
COPY ["Sql.System/Sql.System.csproj", "Sql.System/"]
COPY ["Application.Common/Application.Common.csproj", "Application.Common/"]
RUN dotnet restore "Application.WebAPI/Application.WebAPI.csproj"
COPY . .
WORKDIR "/src/Application.WebAPI"
RUN dotnet build "Application.WebAPI.csproj" -c Release -o /app
FROM build AS publish
RUN dotnet publish "Application.WebAPI.csproj" -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "Application.WebApi.dll"]
One more thing - The problem is that azure devops has this job which builds an image and I just copied this job 6 times, pointing every copy to other Dockerfile. That's why they don't reuse the code - I would love to change that so they base on the same binaries. Here are steps in Azure DevOps:
Get sources
Build and push image no. 1
Build and push image no. 2
Build and push image no. 3
Build and push image no. 4
Build and push image no. 5
Build and push image no. 6
Every single 'Build and push image' does:
dotnet restore
dotnet build
dotnet publish
I want to get rid of this overhead - is it possible?

It's hard to say without seeing your Dockerfiles, but you probably are making some mistakes that are adding time to the image build. For example, each command in a Dockerfile results in a layer. Docker caches these layers and only rebuilds the layer if it or previous layers have changed.
A very common mistake people make is to copy their entire project with all the files within first, and then run dotnet restore. When you do that, any change to any file invalidates that copy layer and thus also the dotnet restore layer, meaning that you have to restore packages every single build. The only thing necessary for the dotnet restore is the project file(s), so if you copy just those, run dotnet restore, and then copy all the files, those layers will be cached, unless the project file itself changes. Since that normally only happens when you change packages (add, update, remove, etc.), most of the time, you will not have to repeat the restore step, and the build will go much quicker.
Another issue can occur when you're using npm and Linux images on Windows. This one bit me personally. In order to support Linux images, Docker uses a Linux VM (MobyLinux). At the start of a build, Docker lifts the entire filesystem context (i.e. where you run the docker command) into the MobyLinux VM, first, as all the Dockerfile commands will be run actually in the VM, and thus the files will need to reside there. If you have a node_modules directory, it can take a significant amount of time to move all that over. You can solve this by adding node_modules to your .dockerignore file.
There's other similar types of mistakes you might be making. We'd really need to see your Dockerfiles to help you further. Regardless, you should not go with either of your proposed approaches. Just running publish will suffer from the same issues described above, and gives you no recourse to alleviate the problem at that point. Publishing outside of the image can lead to platform inconsistencies and other problems unless you're very careful. It also adds a bunch of manual steps to the image building process, which defeats a lot of the benefit Docker provides. Your images will be larger as well, unless you just happen to publish on exactly the same architecture as what the image will use. If you're developing on Windows, but using Linux images, for example, you'll have to include the full ASP.NET Core runtime. If you build and publish within the image, you can include the SDK only in a stage to build and publish, and then target something like alpine linux, with a self-contained architecture-specific publish.

How to ask sbt to only fetch dependencies, without compiling?

Is there a way to only download the dependencies but do not compile source.
I am asking because I am trying to build a Docker build environment for my bigger project.
The Idear is that during docker build I clone the project, download all dependencies and then delete the code.
Then use docker run -v to mount the frequently changing code into the docker container and start compiling the project.
Currently I just compile the code during build and then compile it again on run. The problem ist that when a dependencie changes I have to build from scratch and that takes a long time.

Run sbt's update command. Dependencies will be resolved and retrieved.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart