Bazel Monorepo - How to Rebuild and Publish only Changed Docker Images? - docker

Objective
I have a monorepo setup with a growing number of services. When I deploy the application I run a single command, and every service is rebuilt and its final Docker image is published.
But as the number of services grows, the time it takes to rebuild all of them gets longer and longer, even though changes were made to only a few of them.
Why does my setup rebuild all Docker images when only a few have changed? My goal is to rebuild and publish only the images that have actually changed.
Details
I am using Bazel to build my Docker images, thus in the root of my project there is one BUILD file which contains the target I run when I want to deploy. It is just a collection of k8s_objects, where every service is included:
load("@io_bazel_rules_k8s//k8s:objects.bzl", "k8s_objects")
k8s_objects(
    name = "kubernetes_deployment",
    objects = [
        "//services/service1",
        "//services/service2",
        "//services/service3",
        "//services/service4",
        # ...
    ],
)
Likewise, there is one BUILD file for every service, which first creates a TypeScript library from all the source files, then creates the Node.js image and finally passes the image to the Kubernetes object:
load("@npm_bazel_typescript//:index.bzl", "ts_library")
ts_library(
    name = "lib",
    srcs = glob(
        include = ["**/*.ts"],
        exclude = ["**/*.spec.ts"],
    ),
    deps = [
        "//packages/package1",
        "//packages/package2",
        "//packages/package3",
    ],
)
load("@io_bazel_rules_docker//nodejs:image.bzl", "nodejs_image")
nodejs_image(
    name = "image",
    data = [":lib", "//:package.json"],
    entry_point = ":index.ts",
)
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")
k8s_object(
    name = "service",
    template = ":service.yaml",
    kind = "deployment",
    cluster = "my-cluster",
    images = {
        "gcr.io/project/service:latest": ":image",
    },
)
Note that the TypeScript lib also depends on some packages, which should also be accounted for when redeploying!
To deploy I run bazel run :kubernetes_deployment.apply
One reason I initially chose Bazel was that I thought it would build only the changed services by itself. But evidently either this is not the case or my setup is faulty in some way.
If you need more detailed insight into the project you can check it out here: https://github.com/flolude/cents-ideas

It looks like the Bazel repo itself does something similar:
https://github.com/bazelbuild/bazel/blob/ef0f8e61b5d3a139016c53bf04361a8e9a09e9ab/scripts/ci/ci.sh
The rough steps are:
1. Calculate the list of files that have changed
2. Use the file list to find their dependents (e.g. bazel query 'kind(.*_binary, rdeps(//..., set(file1.txt file2.txt)))' will find all binary targets that depend on either file1.txt or file2.txt)
3. Build/test the resulting list of targets
You will need to adapt this script to your needs (e.g. make sure it finds Docker image targets)
To find out the kind of a target, you can use bazel query //... --output label_kind
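The steps above can be sketched in Python as follows. This is only a minimal sketch under assumptions, not the script from the Bazel repo: the function names are hypothetical, the changed files are taken from git diff against a base ref of your choosing, and the default kind pattern .*_binary is the one from the example query, so you would have to substitute whatever kind bazel query //... --output label_kind reports for your image targets. It also assumes the file paths contain no characters that need quoting inside a query expression.

```python
import subprocess
from typing import List


def build_rdeps_query(changed_files: List[str],
                      kind_pattern: str = ".*_binary",
                      universe: str = "//...") -> str:
    """Build the query expression that finds targets depending on the files."""
    if not changed_files:
        raise ValueError("no changed files, nothing to query")
    file_set = " ".join(changed_files)
    return f"kind({kind_pattern}, rdeps({universe}, set({file_set})))"


def changed_files_since(base_ref: str) -> List[str]:
    """Step 1: list files that differ from base_ref (assumes a git checkout)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base_ref],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def affected_targets(query: str) -> List[str]:
    """Step 2: ask Bazel which targets match the query expression."""
    result = subprocess.run(
        ["bazel", "query", query],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

A caller would chain the three functions, for example affected_targets(build_rdeps_query(changed_files_since("origin/master"))), and pass the resulting labels to bazel build or bazel run. The ref origin/master is just an example.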
EDIT:
A bit of warning for anyone who wants to go down this rabbit hole (especially if you absolutely do not want to miss tests in CI):
You need to think about:
Deleted files / BUILD files (what depended on them?)
Note that a moved file counts as a deletion plus an addition as well
Also, you cannot query the reverse deps of files or BUILD files that no longer exist!
Modified BUILD files (to be safe, make sure all reverse deps of all targets in the BUILD file are built)
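One way to handle these cases is to classify the changes before querying. The sketch below is an assumption-laden illustration, not part of the Bazel CI script: it parses the output of git diff --name-status, treats renames as a deletion plus an addition, and conservatively falls back to a full build whenever a file was deleted or a BUILD file was touched. That fallback policy and all function names are my own choices for illustration.

```python
import posixpath
from typing import Dict, List

BUILD_FILE_NAMES = ("BUILD", "BUILD.bazel")


def classify_changes(name_status_output: str) -> Dict[str, List[str]]:
    """Split `git diff --name-status` output into queryable and deleted paths."""
    changes: Dict[str, List[str]] = {"queryable": [], "deleted": []}
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0]
        if status.startswith("D"):
            changes["deleted"].append(fields[1])
        elif status.startswith("R"):
            # A rename is the old path deleted plus the new path added.
            changes["deleted"].append(fields[1])
            changes["queryable"].append(fields[2])
        elif status.startswith("C"):
            # A copy only adds the new path.
            changes["queryable"].append(fields[2])
        else:
            changes["queryable"].append(fields[1])
    return changes


def needs_full_build(changes: Dict[str, List[str]]) -> bool:
    """Conservative policy: deleted files or touched BUILD files mean build all."""
    touched = changes["queryable"] + changes["deleted"]
    touches_build_file = any(
        posixpath.basename(path) in BUILD_FILE_NAMES for path in touched
    )
    return bool(changes["deleted"]) or touches_build_file
```

When needs_full_build returns False, the paths under "queryable" still exist and can be fed into an rdeps query.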
I think there is a ton of complexity in going down this route (if it is even possible). It might be less error-prone to rely on Bazel itself to figure out what changed, using remote caches and --subcommands to calculate which side effects need to be performed.

Related

Bazel clean only a subset of the cached rules

I am currently developing in a monorepo that has a pretty large workspace file.
Right now, I am noticing that one of my testing rules is not getting its dependency rules rebuilt when I update one of my tests. Here is an example of this:
load("@npm//@grafana/toolkit:index.bzl", "grafana_toolkit")
load("@build_bazel_rules_nodejs//:index.bzl", "copy_to_bin")
APPLICATION_DEPS = glob(
    [
        # My updated test file is included in this glob
        "src/**/*",
    ],
) + [
    "my-config-files.json",
]
RULE_DEPS = [
    "@npm//@grafana/data",
    "@npm//@grafana/ui",
    "@npm//emotion",
    "@npm//fs-extra",
]
copy_to_bin(
    name = "bin_files",
    srcs = APPLICATION_DEPS,
)
grafana_toolkit(
    name = "test",
    args = [
        "plugin:test",
    ],
    chdir = package_name(),
    data = RULE_DEPS + [
        ":bin_files",
    ],
)
I then have a test file called, say, something.test.ts. I run bazel run :test, the test fails, I see the problem and fix it. The problem is that the next time I run my test, I see from the output that it's still failing because it's running the old test instead of the new one.
The Problem
The way that I normally fix this sort of issue with stale files not updating, is by running bazel clean. The problem is that doing bazel clean means I clean EVERYTHING. And that makes re-running all the build steps take pretty damn long. I'm wondering if there is a way I can specify that I only clean a subset of the cache (maybe only the output of my bin_files rule, for example). That way, rather than starting all over again, I only rebuild what I want to rebuild.
I've actually found a pretty quick and easy way to do what I was originally asking: just go to the bazel-bin directory and delete the output of whichever rule you want to re-run. So in this case, I could delete the bin_files output in my bazel-bin directory and then run my bin_files rule again.
With that being said, I think @ahumsky might be right in that if you need to do this, it's more likely a bug with something else. In my case, I was running a build version of my rule instead of a test version of my rule. So, cleaning a subset of my cache didn't really have anything to do with my original problem.
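If you do want to script the "delete selected outputs" workaround, a minimal sketch could look like the following. The function name and the idea of passing output paths relative to bazel-bin are assumptions for illustration; which files a given rule writes under bazel-bin depends on the rule, so you would need to look them up yourself. The check against escaping the directory is there only to make the helper harder to misuse.

```python
from pathlib import Path
from typing import Iterable, List


def remove_outputs(bazel_bin: Path, relative_outputs: Iterable[str]) -> List[Path]:
    """Delete the given output files under bazel_bin and return what was removed."""
    root = bazel_bin.resolve()  # bazel-bin is normally a symlink, so resolve it
    removed: List[Path] = []
    for relative in relative_outputs:
        target = (root / relative).resolve()
        if root not in target.parents:
            raise ValueError(f"{relative} is outside {root}")
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed
```

After removing the files, building the rule again should regenerate the missing outputs, which is the behavior the workaround above relies on.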

Don't discard analysis cache when --action_env changes

I have an --action_env variable I'm passing into Bazel sometimes, but each time I remove it or add it back, it discards the analysis cache, which triggers a re-analysis that takes several minutes because I'm working in a large repo. Is there a way to prevent this? I'm already using --trim_test_configuration
Answer
Not really. You can think of the set of environment variables as a file that almost every Bazel action depends on. As a simple example, let's say you have a build file that looks something like this:
genrule(
    name = "foo_header",
    outs = ["foo.h"],
    # "$$" passes a literal "$" to the shell; the quotes stop "#" from starting a shell comment.
    cmd = "echo \"#define FOO $$FOO_FROM_ENV\" > $@",
)
cc_library(
    name = "my_library_that_everything_depends_on",
    hdrs = [":foo_header"],
)
In this simple case it's not hard to see that if you change --action_env=FOO_FROM_ENV=7 to another value, everything that depends on foo.h has to be completely rebuilt and re-analysed. So while frustrating, it's probably a good thing that Bazel does this; otherwise you'd end up with an inconsistent build.
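To make the "environment as an input file" idea concrete, here is a toy model in Python. It is explicitly not Bazel's real key computation; toy_action_key is a hypothetical function that simply hashes a command together with its environment, which is enough to show why a changed variable invalidates everything keyed on it.

```python
import hashlib
import json
from typing import Dict


def toy_action_key(command: str, env: Dict[str, str]) -> str:
    """Hash a command together with its environment (toy model only)."""
    payload = json.dumps({"command": command, "env": env}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


command = "generate foo.h"
key_with_7 = toy_action_key(command, {"FOO_FROM_ENV": "7"})
key_with_8 = toy_action_key(command, {"FOO_FROM_ENV": "8"})
key_with_7_again = toy_action_key(command, {"FOO_FROM_ENV": "7"})
```

In this model key_with_7 and key_with_8 differ, so nothing cached under one can be reused for the other, while key_with_7_again matches key_with_7, which is why switching back to a previously used value can still hit a persistent cache.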
Partial workarounds
Remove usage of use_default_shell_env from your rules/actions (this is far from trivial as most of the standard rules do this)
Use a centralised cache to prevent Bazel from deleting the artifact cache. For example, I add the lines below to ~/.bazelrc so that the action and repository caches are shared between all my projects. This only helps if you switch between the same env values on a regular basis rather than using new values each time. Also, be careful with this, as Bazel doesn't presently do any garbage collection, so the cache directories can end up very large.
# ~/.bazelrc
common --disk_cache=~/.cache/shared_bazel_action_cache
common --repository_cache=~/.cache/shared_bazel_repository_cache
Add --incompatible_strict_action_env to your project bazelrc; this prevents changes in your user shell from triggering Bazel to discard the analysis cache.

Are absolute paths safe to use in Bazel?

I am experimenting with adding Bazel alongside an old, make/shell-based build system. I can easily write shell commands that return an absolute path to some tool or library built by the old build system as early prerequisites. I can use these commands in a genrule(), which copies the needed files (like headers and libs) into Bazel proper to be exposed in the form of a cc_library().
I found out that genrule() does not detect a dependency if the command uses a file with an absolute path - it is not caught by the sandbox. In a way I am (ab)using that behavior.
Is it safe? Will some future update of Bazel refuse access to files referenced by absolute path in a genrule command in this way?
Most of Bazel's sandboxes allow access to most paths outside of the source tree by default. Details depend on which sandbox implementation you're using. The docker sandbox, for example, allows access to all those paths inside of a docker image. It's kind of hard to make promises about future Bazel versions, but I think it's unlikely that a sandbox will prevent accessing /bin/bash (for example), which means other absolute paths will probably continue to work too.
--sandbox_block_path can be used to explicitly block a path if you want.
If you always have the files available on every machine you build on, your setup should work. Keep in mind that Bazel will not recognize when the contents of those files change, so you can easily get stale results in various caches. You can avoid that by ensuring the external paths change whenever their contents do.
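One way to make an external path change whenever its contents do is to embed a content digest in the file name when the old build system publishes the artifact. The Python sketch below only illustrates that idea; versioned_name is a hypothetical helper, and how you wire the resulting name into your make/shell build is up to you.

```python
import hashlib
from pathlib import Path


def versioned_name(path: Path, digest_chars: int = 12) -> str:
    """Return the file name with a short SHA-256 content digest before the suffix."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()[:digest_chars]
    return f"{path.stem}-{digest}{path.suffix}"
```

For example, a library libfoo.a would be published under a name of the form libfoo-<digest>.a, so any command that references the absolute path changes as soon as the library's contents change.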
new_local_repository might be a better fit to avoid those problems, if you know the paths ahead of time.
If you don't know the paths ahead of time, you can write a custom repository rule which runs arbitrary commands via repository_ctx.execute to retrieve the paths and then symlinks them in with repository_ctx.symlink.
TensorFlow's third_party/sycl/sycl_configure.bzl has an example of doing something similar (you would do something other than looking at environment variables like find_computecpp_root does, and you might symlink entire directories instead of all the files in them):
def _symlink_dir(repository_ctx, src_dir, dest_dir):
    """Symlinks all the files in a directory.
    Args:
      repository_ctx: The repository context.
      src_dir: The source directory.
      dest_dir: The destination directory to create the symlinks in.
    """
    files = repository_ctx.path(src_dir).readdir()
    for src_file in files:
        repository_ctx.symlink(src_file, dest_dir + "/" + src_file.basename)
def find_computecpp_root(repository_ctx):
    """Find ComputeCpp compiler."""
    sycl_name = ""
    if _COMPUTECPP_TOOLKIT_PATH in repository_ctx.os.environ:
        sycl_name = repository_ctx.os.environ[_COMPUTECPP_TOOLKIT_PATH].strip()
    if sycl_name.startswith("/"):
        return sycl_name
    fail("Cannot find SYCL compiler, please correct your path")
def _sycl_autoconf_imp(repository_ctx):
    <snip>
    computecpp_root = find_computecpp_root(repository_ctx)
    <snip>
    _symlink_dir(repository_ctx, computecpp_root + "/lib", "sycl/lib")
    _symlink_dir(repository_ctx, computecpp_root + "/include", "sycl/include")
    _symlink_dir(repository_ctx, computecpp_root + "/bin", "sycl/bin")

Where to get list of known repositories like @bazel_tools, @rules_jvm_external etc?

Sometimes I see extensions loading from the internet or built-in ones.
Canonical example:
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
However, I cannot distinguish a local repo from a known repo just by looking at the load expression.
How can I check the source (location) of any repo which I see in my WORKSPACE/BUILD files?
If the Bazel label is sufficient as a source, you might try fetching repo roots with BUILD files with bazel query 'buildfiles(//...)'.
Otherwise, you could run bazel clean --expunge and run a build with --experimental_execution_log_file=<FILENAME>. This creates a protobuf based log of the actions by Bazel. In there, all internet repos are downloaded anew because of clean --expunge.
Check https://github.com/bazelbuild/bazel/tree/master/src/tools/execlog for a parser.
It is super inconvenient that this information is not available another way - afaik. I really hope someone swings by and corrects me, but this way you at least know the available sources you can correlate.
I'm new to Bazel, but as far as I understand:
Copy the name of the repo. E.g. io_bazel_rules_docker
Search it through the codebase
Look at how it's being loaded
E.g. if you see
http_archive(
    name = "io_bazel_rules_docker",
    ...
)
or
http_file(
    name = "io_bazel_rules_docker",
    ...
)
then you can conclude where it's coming from.
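The search step can be automated with a small script. The sketch below is an assumption-laden illustration: find_repo_declarations is a hypothetical helper that scans the text of a WORKSPACE or .bzl file with a regular expression and reports which rule declares a repository with the given name. It only recognises calls where name is the first argument, so treat a miss as inconclusive.

```python
import re
from typing import List


def find_repo_declarations(starlark_text: str, repo_name: str) -> List[str]:
    """Return the rule names of calls whose first argument is name = "<repo_name>"."""
    pattern = re.compile(
        r'(\w+)\(\s*name\s*=\s*"' + re.escape(repo_name) + r'"'
    )
    return [match.group(1) for match in pattern.finditer(starlark_text)]
```

You would read each candidate file with open(...).read() and pass its contents in; a result such as ["http_archive"] means the repository is downloaded from the URL given in that call.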
bazel query --output=build //external:repo_name works just fine.

Skylark - How to execute a jar from a repository rule

Context
I am writing a repository rule that invokes another Bazel project. My current approach is to build the additional project as a deploy jar. I would like a user to be able to instantiate the rule like:
jar_path = "some/relative/path"
my_rule(name = "something", p_arg = "m_arg", binary = jar_path)
and then given the jar_path and the arguments, I would like the repository rule to execute the following command in the shell:
java -jar $(SOME_JAR) $(ARGUMENTS_PROVIDED_BY_RULE)
Problem
First, it's unclear how best to accomplish the deploy jar approach. So far, I have attempted two different approaches, with varying levels of success. For examples, I have skimmed through the scala_rules, the maven_rules, and the skylark cookbook.
Second, and more importantly, I am not sure whether the deploy jar is the best route to accomplishing my goals. Again, my interest is to invoke a target from an external Bazel project, that is currently hosted on github. (So feasibly, I could try to fetch the project using the http_archive rule).
Below, I describe the attempts I have made.
Approach 1
My first approach involved trying to execute the command using the command field in ctx.action. I tried various variations of
java -jar {computed_absolute_path_of_deploy_jar} {args_passed_from_instantiation}.
My biggest issue here was determining the absolute path of the deploy jar. The file's root path would contain some additional information. For example, it would look something like this.
/abs/olute/path[ something ]/rela/tive/path
As a side note, I'm not sure if this is a bug/nit, but File.root.path evaluated to None, despite File.root not being None.
My first approach was to try to use Skylark's ctx.binary.
Approach 2
Next thing I tried was to mimic the input binary example from the docs. This was also unsuccessful. The issue was that the actual binary could not be found. Here is how I configured it.
First, I relaxed the repository rule into a regular skylark rule.
def _test_binary(ctx):
    ctx.action(
        ....
        arguments = [ctx.attr.p_arg],
        executable = ctx.executable.binary,
    )
test_binary = rule(
    ...
    attrs = {
        "binary": attr.label(mandatory = True, cfg = "host", allow_files = True, executable = True),
        ...
    },
)
Then, in my external project, I loaded the skylark rule into the WORKSPACE file. Finally, I called the macro from one of my BUILD files as follows:
load("@something_rule//:something_rule.bzl", "test_binary")
test_binary(name = "hello", p_arg = "hello", binary = "script.sh")
The script is a one line java -jar something_deploy.jar -- -arg:$1, and is in the same directory as the BUILD file.
Bazel complains that src/script.sh does not exist. I presume because it is looking for the file in /private/var/tmp/-bazel_username/somehash/relative_path. In response, I tried to pass the absolute path, which is not allowed.
Cheers.
It looks like you're mixing up repository rules with build extensions ("normal" rules). A good rule of thumb is:
Repository rules are for getting sources onto your system or symlinking them to a place Bazel can see them.
Build extensions are for everything else: compiling, copying files, running binaries, etc.
I don't actually think you need to use either, for this. You say that the other project is on GitHub, so you can add the following to your WORKSPACE file:
http_archive(
    name = "other_project",
    ...
)
Then, in your BUILD file:
genrule(
    name = "run-a-jar",
    srcs = ["@other_project//some/relative:path"],
    cmd = "java -jar $(location @other_project//some/relative:path) -- arg1 arg2 > $@",
    outs = ["jar-output"],
)
You shouldn't need to use the _deploy.jar target, since you're not moving the jar out of its project (_deploy.jar is useful when you need to relocate it).
Other things from your question:
I'm not sure if this is a bug/nit, but the File.root.path, evaluated to None,
Are you sure it didn't evaluate to ""? The path is relative to the execution root, so for sources, it will always be "" (for outputs, it'll be bazel-out/local-fastbuild/bin or similar).
Bazel complains that src/script.sh does not exist.
Passing -s to Bazel can really help debugging Skylark rules. You can see exactly where it is looking.
