Installing different pre-requisites in bazel build depending on architecture? - bazel

I want to make a build for a C++ project so that developers can seamlessly build both in MacOS and Linux, but they need different pre-requisites.
Can I configure Bazel to run different commands depending on the architecture, as a pre-requisite to compiling C++ files?

You certainly can; there are two approaches I will suggest.
Approach 1 (recommended)
The best way to do this is to include all of your deps for each arch/OS in your WORKSPACE file, e.g.:
# WORKSPACE
http_archive(
    name = "foo_repo_mac",
    # ...
)

http_archive(
    name = "foo_repo_linux",
    # ...
)
NOTE: While the example here uses http_archive, this approach works with other repository rules as well, e.g. local_repository (for system deps), git_repository (for git deps), etc.
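For illustration, the same pattern with other repository rules might look roughly like this (the repository names, path, and URL below are placeholders invented for this sketch, not from the original answer):

# WORKSPACE (hypothetical sketch of the same pattern with other repository rules)
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

local_repository(
    name = "bar_repo_mac",
    path = "third_party/bar_macos",  # placeholder path to a local checkout
)

git_repository(
    name = "bar_repo_linux",
    remote = "https://example.com/bar.git",  # placeholder URL
    tag = "v1.0",                            # placeholder version
)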
Then depend on the macOS/Linux version of these libs using a select statement, e.g.:
# //:BUILD.bazel
cc_library(
    name = "foo_cross_platform",
    deps = select({
        "@platforms//os:macos": ["@foo_repo_mac//:foo"],
        "@platforms//os:linux": ["@foo_repo_linux//:foo"],
    }) + ["//other:deps"],
)
Now what happens here is that when you build the target //:foo_cross_platform, Bazel will first evaluate the deps in the select statement. Say we are building on Linux: it will select "@foo_repo_linux//:foo" as a dep. Now that Bazel knows there is a dependency on an external repository, it will go ahead and download the @foo_repo_linux repository. But because there is no dependency on @foo_repo_mac, it won't download/configure that repository. So in the case of a build on Linux these will work:
bazel build //...
bazel build //:foo_cross_platform
However, it is unlikely that running the following on Linux would work:
bazel build @foo_repo_mac//:foo
By default you can select based on CPU/OS/distro. A full list of the constraints included with Bazel by default is listed in the platforms documentation. For more complex configurable builds, take a look at the Bazel docs on configuration.
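If the built-in OS/CPU constraints aren't specific enough, a custom config_setting can be combined with select in the same way. A minimal sketch, assuming a made-up --define flag (the flag name, setting name, and repositories here are placeholders for illustration):

# //:BUILD.bazel (hypothetical sketch of a custom config_setting)
config_setting(
    name = "use_system_foo",
    values = {"define": "foo_impl=system"},  # matches builds run with --define=foo_impl=system
)

cc_library(
    name = "foo_configurable",
    deps = select({
        ":use_system_foo": ["@foo_repo_system//:foo"],      # placeholder repo
        "//conditions:default": ["@foo_repo_linux//:foo"],
    }),
)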
Approach 2 (not recommended)
If approach 1 does not meet your needs, there is a second approach you could take: write your own [repository_rule](https://docs.bazel.build/versions/main/skylark/repository_rules.html) that uses the repository_ctx.os field with a Starlark if statement, e.g.:
# my_custom_repository.bzl
def _my_custom_repository_rule_impl(rctx):
    if rctx.os.name == "linux":
        rctx.download_and_extract(
            url = "http://www.my_linux_dep.com/foo",
            # ...
        )
    elif rctx.os.name.startswith("mac"):  # os.name is typically "mac os x" on macOS
        rctx.download_and_extract(
            url = "http://www.my_macos_dep.com/foo",
            # ...
        )
    else:
        fail("No dependency for this os")
    # ... Generate a BUILD.bazel file etc.

my_custom_repository = repository_rule(implementation = _my_custom_repository_rule_impl)
Then add the following to your WORKSPACE:
# WORKSPACE
load("//:my_custom_repository.bzl", "my_custom_repository")

my_custom_repository(
    name = "foo",
)
Then depend on it directly in your BUILD file (exactly how depends on how you generate the repository's BUILD.bazel file).
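For example, assuming the generated BUILD.bazel exposes a cc_library target named foo (that target name is an assumption for this sketch; it depends on what your rule writes), the dependency might look like:

# //:BUILD.bazel (sketch; assumes the generated BUILD.bazel exposes a target named "foo")
cc_library(
    name = "uses_foo",
    srcs = ["uses_foo.cc"],
    deps = ["@foo//:foo"],
)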
The reason I don't recommend this approach is that repository_ctx.os.name is not particularly stable, and in general configuration using select is far more expressive.

Related

How do you enumerate and copy multiple files to the source folder in Bazel?
I'm new to Bazel and I am trying to replace a non-Bazel build step that is effectively cp -R with an idiomatic Bazel solution. Concrete use cases are:
copying .proto files to a sub-project where they will be picked up by a non-Bazel build system. There are N .proto files in N Bazel packages, all in one protos/ directory of the repository.
copying numerous .gotmpl template files to a different folder where they can be picked up in a docker volume for a local docker-compose development environment. There are M template files in one Bazel package in a small folder hierarchy. Example code below.
copying those same .gotmpl files to a gitops-type repo for a remote terraform to send to prod.
All sources are regular, checked in files in places where Bazel can enumerate them. All target directories are also Bazel packages. I want to write to the source folder, not just to bazel-bin, so other non-Bazel tools can see the output files.
Currently when adding a template file or a proto package, a script must be run outside of bazel to pick up that new file and add it to a generated .bzl file, or perform operations completely outside of Bazel. I would like to eliminate this step to move closer to having one true build command.
I could accomplish this with symlinks but it still has an error-prone manual step for the .proto files and it would be nice to gain the option to manipulate the files programmatically in Bazel in the build.
Some solutions I've looked into and hit dead ends:
glob seems to be relative to the current package and I don't see how it can be exported, since it needs to be called from a BUILD file. A filegroup solves the export issue but doesn't seem to allow enumeration of the underlying files in a way that a bazel run target can take as input.
Rules like cc_library that happily take globs as srcs are built into the Bazel source code, not written in Starlark.
genquery and aspects seem to have powerful meta-capabilities but I can't see how to actually accomplish this task with them.
The "bazel can write to the source folder" pattern and write_source_files from aspect-build/bazel-lib might be great if I could programmatically generate the files parameter.
Here is the template example which is the simpler case. This was my latest experiment to bazel-ify cp -R. I want to express src/templates/update_templates_bzl.py in Bazel.
src/templates/BUILD:
# [...]
exports_files(glob(["**/*.gotmpl"]))
# [...]
src/templates/update_templates_bzl.py:
#!/usr/bin/env python
from pathlib import Path
parent = Path(__file__).parent
template_files = [str(f.relative_to(parent)) for f in list(parent.glob('**/*.gotmpl'))]
as_python = repr(template_files).replace(",", ",\n ")
target_bzl = Path(__file__).parent / "templates.bzl"
target_bzl.write_text(f""""Generated template list from {Path(__file__).relative_to(parent)}"
TEMPLATES = {as_python}""")
src/templates/copy_templates.bzl:
"""Utility for working with this list of template files"""

load("@aspect_bazel_lib//lib:write_source_files.bzl", "write_source_files")
load(":templates.bzl", "TEMPLATES")

def copy_templates(name, prefix):
    files = {
        "%s/%s" % (prefix, f): "//src/templates:%s" % f
        for f in TEMPLATES
    }
    write_source_files(
        name = name,
        files = files,
        visibility = ["//visibility:public"],
    )
other/module:
load("//src/templates:copy_templates.bzl", "copy_templates")
copy_templates(
name = "write_template_files",
prefix = "path/to/gitops/repo/templates",
)
One possible method to do this would be to use google/bazel_rules_install.
As mentioned in the project README.md, you need to add the following to your WORKSPACE file:
# file: WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_github_google_rules_install",
    urls = ["https://github.com/google/bazel_rules_install/releases/download/0.3/bazel_rules_install-0.3.tar.gz"],
    sha256 = "ea2a9f94fed090859589ac851af3a1c6034c5f333804f044f8f094257c33bdb3",
    strip_prefix = "bazel_rules_install-0.3",
)

load("@com_github_google_rules_install//:deps.bzl", "install_rules_dependencies")

install_rules_dependencies()

load("@com_github_google_rules_install//:setup.bzl", "install_rules_setup")

install_rules_setup()
Then in your src/templates directory you can add the following to bundle all your templates into one target.
# file: src/templates/BUILD.bazel
load("@com_github_google_rules_install//installer:def.bzl", "installer")

installer(
    name = "install_templates",
    data = glob(["**/*.gotmpl"]),
)
Then you can use the installer to install into your chosen directory like so.
bazel run //src/templates:install_templates -- path/to/gitops/repo/templates
It's also worth checking out bazelbuild/rules_docker for building your development environments using only Bazel.

Configure bazel toolchain without modifying project/workspace

I have a Bazel project (the new tcmalloc) I'm trying to integrate into a typical GNU Make project that uses its own build of compiler/libc++. The goal is to not fork the upstream project.
If I pass all the C++ options correctly to Bazel (one set of which is -nostdinc++ -I<path to libc++>), Bazel is unhappy: The include path '/home/vlovich/myproject/deps/toolchain/libc++/trunk/include' references a path outside of the execution root. (tcmalloc is a git submodule sibling at deps/tcmalloc). It's possible to get this "working" by giving Bazel a custom script to invoke as the compiler that injects those flags so that Bazel never sees them. However, I'd like to just define a toolchain that works properly.
I've read all the documentation I could find on this topic but it's not clear to me how to glue all these docs together.
Specifically, it's not really clear where I should place the toolchain definition files or how to tell Bazel to find those definitions. Is there a way to give Bazel a directory that it uses to find toolchain definitions? Am I expected to create a top-level WORKSPACE at /home/vlovich/myproject & register tcmalloc and my toolchain there, & then invoke bazel from /home/vlovich/myproject instead of /home/vlovich/myproject/deps/tcmalloc?
Toolchain support is rather complicated and hard to understand if you are not a Bazel maintainer.
You can use the CC and CXX environment variables to set a different compiler, like: CC=your_c_compiler CXX=your_c++_compiler bazel build .... You can also write your own custom script wrapper which will act as a normal C++ compiler.
That -I<path to libc++> does not work because all normal include paths have to come from files listed in the srcs attribute or from dependencies indicated by the deps attribute. For system-wide dependencies use -isystem. Read more about it here: https://stackoverflow.com/a/44061589/4638604
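A minimal sketch of what passing -isystem through copts can look like (the target name and the include path below are placeholders invented for this sketch, not from the original answer):

# BUILD (sketch; assumes a system-wide header location on the build machine)
cc_library(
    name = "uses_system_dep",
    srcs = ["lib.cc"],
    copts = [
        "-isystem", "/usr/local/include/mylib",  # placeholder system-wide header directory
    ],
)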

How can I pass a specific macro to each compile in bazel?

Here's an easy version of the BUILD file:
cc_library(
    name = "ab",
    srcs = ['a.c', 'b.c', 'logger.h'],
)
logger.h contains the implementation of a logging function that uses the macro XOC_FILE_ID. XOC_FILE_ID has to contain the name of the source file.
Using __FILE__ instead would not help because __FILE__ expands to the string "logger.h" inside the file logger.h.
That's why I need to compile these files with different defines:
gcc -c [...] -DXOC_FILE_ID="a.c" a.c
gcc -c [...] -DXOC_FILE_ID="b.c" b.c
My failed approaches:
set the attribute local_defines using the value {source_file}: local_defines = ['XOC_FILE_ID="{source_file}"'], which does not get replaced
set the attribute local_defines using the make variable $<: local_defines = ['XOC_FILE_ID="$<"'], where Bazel aborts, telling me that $(<) is not defined
same approach for attribute copts
Of course, I could try to make Bazel call a compiler wrapper script. However, this would mean that I have to explicitly set PATH to my wrapper script(s) before each call to Bazel. Isn't there a better solution?
You have access to {source_file} in a toolchain definition.
This means you have to write your own toolchain definition.
I tried two ways of writing a toolchain:
Use the Bazel tutorial on toolchains. Afterwards my build was broken: The default compile options of Bazel were missing. cc_library did not create shared libraries any more.
Use a hint pointing to a post in bazel-discuss and use the toolchain that Bazel itself creates using your environment. That's what I'm going to describe now (for Bazel 3.5.1)
If you want to use a compiler that is not in $PATH, do bazel clean and update $PATH to make the compiler available. Bazel will pick it up.
create a toolchain directory (maybe my-toolchain/) in your workspace
bazel build @bazel_tools//tools/cpp:toolchain
copy BUILD, all *.bzl files, cc_wrapper.sh and builtin_include_directory_paths from $(bazel info output_base)/external/local_config_cc/ to your toolchain directory; copy the files the symbolic links are pointing to instead of copying the symbolic links
Adapt the BUILD file in my-toolchain/ to your needs, like adding '-DXOC_FILE_ID=\\"%{source_file}\\"' to compile_flags of cc_toolchain_config (a sketch of this change follows after these steps).
add these lines to your .bazelrc to make Bazel use your new toolchain by default:
build:my-toolchain --crosstool_top=//my-toolchain:toolchain
build --config=my-toolchain
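The compile_flags change mentioned above might look roughly like this. This is only a sketch: the surrounding attributes come from the files you copied out of local_config_cc, and the exact attribute set varies between Bazel versions.

# my-toolchain/BUILD (excerpt, sketch; most attributes omitted)
load(":cc_toolchain_config.bzl", "cc_toolchain_config")

cc_toolchain_config(
    name = "local",
    # ... attributes copied from the generated local_config_cc BUILD ...
    compile_flags = [
        # ... the default flags that were already here ...
        '-DXOC_FILE_ID=\\"%{source_file}\\"',
    ],
)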

Clean up unreachable generated files in Bazel

Suppose I have a very minimal project with an empty WORKSPACE and a single package defined at the project root that simply uses touch to create a file called a, as follows:
genrule(
    name = "target",
    cmd = "touch $@",
    outs = ["a"],
)
If I now run
bazel build //:target
the package will be "built" and the a file will be available under bazel-genfiles.
Suppose I now change the BUILD to write the output to a different file, as follows:
genrule(
    name = "target",
    cmd = "touch $@",
    outs = ["b"],
)
Building the same target will result in the file b being available under bazel-genfiles. a will still be there though, even though at this point it's "unreachable" from within the context of the build definition.
Is there a way to ask Bazel to perform some sort of "garbage collection" and remove files (and possibly other content) generated by previous builds that are no longer reachable as-per the current build definition, without getting rid of the entire directory? The bazel clean command seems to adopt the latter behavior.
There seems to be a feature in the works, but apparently it cannot be performed on demand; rather, it executes automatically as soon as a certain threshold has been reached.
Note that running bazel clean will not actually delete the external directory. To remove all external artifacts, use bazel clean --expunge
bazel clean is the way to remove these.
The stale outputs aren't visible to actions, provided you build with sandboxing. (Not yet available on Windows, only on Linux and macOS.)
What trouble do these files make?

Local cache for bazel remote repos

We are using codeship to run CI for a C++ project. Our CI build consists of a Docker image into which we install system dependencies, then a bazel build step that builds our tests.
Our bazel WORKSPACE file pulls in various external dependencies, such as gtest:
new_http_archive(
    name = "gtest",
    url = "https://github.com/google/googletest/archive/release-1.7.0.zip",
    build_file = "thirdparty/gtest.BUILD",
    strip_prefix = "googletest-release-1.7.0",
    sha256 = "b58cb7547a28b2c718d1e38aee18a3659c9e3ff52440297e965f5edffe34b6d0",
)
During CI builds, a lot of time is spent downloading these files. Is it possible to set up Bazel to use a local cache for these archives?
I think Bazel already caches external repositories in the output_base (it should; if not, it's a bug worth reporting). Is it an option for you to keep the cache hot in the Docker container, e.g. by fetching the code and running bazel fetch //... or some more specific target? Note you can also specify where Bazel's output_base is by using bazel --output_base=/foo build //.... You might find this doc section relevant.
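For example, a CI image could pre-warm that cache along these lines (the output_base path here is a placeholder, not from the original answer):
# Run once while building the Docker image so later CI builds reuse the fetched repos
bazel --output_base=/var/cache/bazel-output fetch //...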
[EDIT: Our awesome Kristina comes to save the day]:
You can use --experimental_repository_cache=/path/to/some/dir
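For example (the cache directory is a placeholder; it just needs to persist between CI runs):
bazel build --experimental_repository_cache=/path/to/some/dir //...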
Does this help?
