bazel WORKSPACE file not behaving as documented? - bazel

Consider the following hierarchy:
WORKSPACE
foo/
    BUILD
    foo.sh
bar/
    BUILD
    bar.sh
Where, e.g., foo/BUILD contains
sh_binary(
    name = "foo",
    srcs = ["foo.sh"],
)
and similarly for bar/BUILD. As expected, bazel cquery //... prints:
INFO: Analyzed 2 targets (0 packages loaded, 0 targets configured).
INFO: Found 2 targets...
//bar:bar (a5d130b)
//foo:foo (a5d130b)
According to https://docs.bazel.build/versions/master/build-ref.html, "Bazel ignores any directory trees in a workspace rooted at a subdirectory containing a WORKSPACE file (as they form another workspace)."
Therefore, if I touch bar/WORKSPACE, I should expect bar to no longer be part of my workspace, and its contents should be ignored by bazel. Why, then, do I still get the same query results?
$ ls bar
BUILD WORKSPACE bar.sh
$ bazel cquery //...
INFO: Analyzed 2 targets (0 packages loaded, 0 targets configured).
INFO: Found 2 targets...
//bar:bar (a5d130b)
//foo:foo (a5d130b)
Bazel version is 3.7.0.

I answered this in the Slack thread the author created, but for posterity: the confusion here came from the author's interpretation of the documentation, which is arguably poorly worded.
A WORKSPACE file alone doesn't "instantiate" or "define" a wholly separate workspace, as the author of this question assumed -- as such, Bazel won't treat that subdirectory any differently from any other package. Instead, the workspace needs to be defined as an "external repository" -- that is to say, "external" to the parent workspace, not necessarily a repository whose source code lives somewhere else.
You can do this using local_repository like so:
WORKSPACE
local_repository(
    name = "bar_workspace_name",
    path = "bar",
)

I may be projecting Bazel's actual behavior onto the intended meaning of that sentence, but I suspect the "ignoring" it describes works in the other direction: not when descending into nested directories, but when crawling up towards the root. Essentially...
If you have that bar/WORKSPACE file, it does not change the behavior when looking at it from bar/'s parent, and //bar still appears as a package in that workspace.
However, it does affect the behavior when you run bazel with bar/ as its working directory. While searching upward for the workspace root, Bazel stops at the first WORKSPACE file it finds, so bar/ becomes its own // in that case (and, by extension, its parent is no longer accessible as part of the same workspace).
Or in practical terms... try the same thing:
bazel cquery //...
but in bar:
(cd bar/ && bazel cquery //... )
Once with and once without bar/WORKSPACE.
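The upward search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Bazel's implementation: the function name find_workspace_root is made up for this example, and checking WORKSPACE.bazel in addition to WORKSPACE is an assumption that goes beyond what this answer states.

```python
from pathlib import Path
from typing import Optional

# Marker files that identify a workspace root. WORKSPACE comes from the
# answer above; WORKSPACE.bazel is an assumption added for this sketch.
WORKSPACE_MARKERS = ("WORKSPACE", "WORKSPACE.bazel")


def find_workspace_root(start: Path) -> Optional[Path]:
    """Walk from `start` up towards the filesystem root.

    Returns the first directory containing a marker file, or None if no
    directory on the way up contains one.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        if any((candidate / marker).is_file() for marker in WORKSPACE_MARKERS):
            return candidate
    return None
```

With bar/WORKSPACE present, a search starting in bar/ stops at bar/ itself; without it, the same search continues up to the parent workspace. A search starting in the parent never looks inside bar/, which matches the query results in the question.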

Related

How do you enumerate and copy multiple files to the source folder in Bazel?

I'm new to Bazel and I am trying to replace a non-Bazel build step that is effectively cp -R with an idiomatic Bazel solution. Concrete use cases are:
copying .proto files to a sub-project where they will be picked up by a non-Bazel build system. There are N .proto files in N Bazel packages, all in one protos/ directory of the repository.
copying numerous .gotmpl template files to a different folder where they can be picked up in a docker volume for a local docker-compose development environment. There are M template files in one Bazel package in a small folder hierarchy. Example code below.
Copy those same .gotmpl files to a gitops-type repo for a remote terraform to send to prod.
All sources are regular, checked in files in places where Bazel can enumerate them. All target directories are also Bazel packages. I want to write to the source folder, not just to bazel-bin, so other non-Bazel tools can see the output files.
Currently when adding a template file or a proto package, a script must be run outside of bazel to pick up that new file and add it to a generated .bzl file, or perform operations completely outside of Bazel. I would like to eliminate this step to move closer to having one true build command.
I could accomplish this with symlinks but it still has an error-prone manual step for the .proto files and it would be nice to gain the option to manipulate the files programmatically in Bazel in the build.
Some solutions I've looked into and hit dead ends:
glob seems to be relative to current package and I don't see how it can be exported since it needs to be called from BUILD. A filegroup solves the export issue but doesn't seem to allow enumeration of the underlying files in a way that a bazel run target can take as input.
Rules like cc_library that happily input globs as srcs are built into the Bazel source code, not written in Starlark
genquery and aspects seem to have powerful meta-capabilities but I can't see how to actually accomplish this task with them.
The "bazel can write to the source folder" pattern and write_source_files from aspect-build/bazel-lib might be great if I could programmatically generate the files parameter.
Here is the template example which is the simpler case. This was my latest experiment to bazel-ify cp -R. I want to express src/templates/update_templates_bzl.py in Bazel.
src/templates/BUILD:
# [...]
exports_files(glob(["**/*.gotmpl"]))
# [...]
src/templates/update_templates_bzl.py:
#!/usr/bin/env python
from pathlib import Path
parent = Path(__file__).parent
template_files = [str(f.relative_to(parent)) for f in list(parent.glob('**/*.gotmpl'))]
as_python = repr(template_files).replace(",", ",\n ")
target_bzl = Path(__file__).parent / "templates.bzl"
target_bzl.write_text(f""""Generated template list from {Path(__file__).relative_to(parent)}"
TEMPLATES = {as_python}""")
src/templates/copy_templates.bzl
"""Utility for working with this list of template files"""
load("@aspect_bazel_lib//lib:write_source_files.bzl", "write_source_files")
load("templates.bzl", "TEMPLATES")

def copy_templates(name, prefix):
    files = {
        "%s/%s" % (prefix, f): "//src/templates:%s" % f
        for f in TEMPLATES
    }
    write_source_files(
        name = name,
        files = files,
        visibility = ["//visibility:public"],
    )
other/module:
load("//src/templates:copy_templates.bzl", "copy_templates")
copy_templates(
name = "write_template_files",
prefix = "path/to/gitops/repo/templates",
)
One possible method to do this would be to use google/bazel_rules_install.
As mentioned in the project's README.md, you need to add the following to your WORKSPACE file:
# file: WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "com_github_google_rules_install",
    urls = ["https://github.com/google/bazel_rules_install/releases/download/0.3/bazel_rules_install-0.3.tar.gz"],
    sha256 = "ea2a9f94fed090859589ac851af3a1c6034c5f333804f044f8f094257c33bdb3",
    strip_prefix = "bazel_rules_install-0.3",
)
load("@com_github_google_rules_install//:deps.bzl", "install_rules_dependencies")
install_rules_dependencies()
load("@com_github_google_rules_install//:setup.bzl", "install_rules_setup")
install_rules_setup()
Then in your src/templates directory you can add the following to bundle all your templates into one target.
# file: src/templates/BUILD.bazel
load("@com_github_google_rules_install//installer:def.bzl", "installer")
installer(
    name = "install_templates",
    data = glob(["**/*.gotmpl"]),
)
Then you can use the installer to install into your chosen directory like so.
bazel run //src/templates:install_templates -- path/to/gitops/repo/templates
It's also worth checking out bazelbuild/rules_docker for building your development environments using only Bazel.

Clean up unreachable generated files in Bazel

Suppose I have a very minimal project with an empty WORKSPACE and a single package defined at the project root that simply uses touch to create a file called a, as follows:
genrule(
    name = "target",
    cmd = "touch $@",
    outs = ["a"],
)
If I now run
bazel build //:target
the package will be "built" and the a file will be available under bazel-genfiles.
Suppose I now change the BUILD to write the output to a different file, as follows:
genrule(
    name = "target",
    cmd = "touch $@",
    outs = ["b"],
)
Building the same target will result in the file b being available under bazel-genfiles. a will still be there though, even though at this point it's "unreachable" from within the context of the build definition.
Is there a way to ask Bazel to perform some sort of "garbage collection" and remove files (and possibly other content) generated by previous builds that are no longer reachable as-per the current build definition, without getting rid of the entire directory? The bazel clean command seems to adopt the latter behavior.
There seems to be a feature in the works, but apparently it cannot be performed on demand, but rather it executes automatically as soon as a certain threshold has been reached.
Note that running bazel clean will not actually delete the external directory. To remove all external artifacts, use bazel clean --expunge
bazel clean is the way to remove these.
The stale outputs aren't visible to actions, provided you build with sandboxing. (Not yet available on Windows, only on Linux and macOS.)
What trouble do these files make?

Can I ignore some folder (containing bazel configuration) while building the project recursively?

For some reason, practical or not, the rxjs npm package ships a BUILD.bazel configuration inside the package, so when I try to build my project (which has a node_modules folder), bazel automatically tries to build something that it's not supposed to build at all.
My question is: what is the canonical way of ignoring a specific folder while building a bazel project recursively?
The only way to achieve what I'm looking for that I know of is to point to it explicitly in the command line
bazel build //... --deleted_packages=node_modules/rxjs/src (see user manual)
But I don't want to type this every time.
Bazel recently added a feature for ignoring folders (similar to gitignore).
Simply add node_modules to the .bazelignore file in the root of your project.
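If you want to script that step, for example as part of project setup, a minimal Python sketch could look like the following. The helper name add_to_bazelignore is hypothetical. The only thing it relies on is that .bazelignore lives in the workspace root and lists one workspace-relative directory per line.

```python
from pathlib import Path


def add_to_bazelignore(workspace_root: Path, directory: str) -> bool:
    """Append `directory` to <workspace_root>/.bazelignore if it is missing.

    Returns True if the file was changed, False if the entry already existed.
    """
    ignore_file = workspace_root / ".bazelignore"
    lines = ignore_file.read_text().splitlines() if ignore_file.exists() else []
    if directory in (line.strip() for line in lines):
        return False
    lines.append(directory)
    # One workspace-relative path per line, with a trailing newline.
    ignore_file.write_text("\n".join(lines) + "\n")
    return True
```

Calling add_to_bazelignore(Path("."), "node_modules") from the project root would create the file on the first run and leave it untouched afterwards.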
Yes, this is expressible as a bazel target pattern:
bazel build -- //... -//node_modules/rxjs/src/...
Full documentation is available at https://docs.bazel.build/versions/master/user-manual.html#target-patterns

Bazel ignore subdirectory on full build

In my repository I have some files with the name "build" (automatically generated and/or imported, spread around elsewhere from where I have my bazel build files). These seem to be interpreted by Bazel as its BUILD files, and fail the full build I try to run with bazel build //...
Is there some way I can tell Bazel in a settings configuration file to ignore certain directories altogether? Or perhaps specify the build file names as something other than BUILD, like BUILD.bazel?
Or are my options:
To banish the name build from the entire repository.
To add a gigantic --deleted_packages=<...> to every run of build.
To not use full builds, instead specifying explicit targets.
I think this is a duplicate of the two questions you linked, but to expand on what you asked about in your comment:
You don't have to rename them BUILD.bazel, my suggestion is to add an empty BUILD.bazel to those directories. So you'd end up with:
my-project/
    BUILD
    src/
        build/
            stuff-bazel-shouldn't-mess-with
        BUILD.bazel # Empty
Then Bazel will check for targets in BUILD.bazel, see that there are none, and won't try to parse the build/ directory.
And there is a distressing lack of documentation about BUILD vs. BUILD.bazel, at least that I could find.

Bazel- How to recursively glob deleted_packages to ignore maven outputs?

I have a multi-module project which I'm migrating from Maven to Bazel. During this migration people will need to be able to work on both build systems.
After an mvn clean install Maven copies some of the BUILD files into the target folder.
When I later try to run bazel build //... it thinks the BUILD files under the various target folders are valid packages and fails due to some mismatch.
I've seen deleted_packages but AFAICT it requires I specify the list of folders to "delete" while I can't do that for 200+ modules.
I'm looking for the ability to say bazel build //... --deleted_packages=**/target.
Is this supported? (my experimentation says it's not but I might be wrong). If it's not supported is there an existing hack for it?
Can you use your shell to find the list of packages to ignore?
deleted=$(find . -name target -type d)
bazel build //... --deleted_packages="$deleted"
@Laurent's answer gave me the lead, but Bazel didn't accept relative paths and required that I add both the classes and test-classes folders under target to delete the package, so I decided to answer with the complete solution:
#!/bin/bash
#find all the target folders under the current working dir
target_folders=$(find . -name target -type d)
#find the repo root (currently assuming it's git based)
repo_root=$(git rev-parse --show-toplevel)
repo_root_length=${#repo_root}
#the current bazel package prefix is the PWD minus the repo root and adding a slash
current_bazel_package="/${PWD:repo_root_length}"
deleted_packages=""
for target in $target_folders
do
    #canonicalize the package path
    full_package_path="$current_bazel_package${target:1}"
    classes_full="${full_package_path}/classes"
    test_classes_full="${full_package_path}/test-classes"
    deleted_packages="$deleted_packages,$classes_full,$test_classes_full"
done
#remove the leading comma and call bazel-real with the other args
bazel-real "$@" --deleted_packages="${deleted_packages:1}"
This script was checked in under tools/bazel which is why it calls bazel-real at the end.
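The same idea can be written in Python. The sketch below is a variation, not a translation of the script above: instead of hard-coding classes and test-classes, it lists every directory under a target folder that actually contains a BUILD or BUILD.bazel file. The function name maven_deleted_packages is made up for this example, and the repo root is passed in rather than taken from git.

```python
from pathlib import Path

# File names Bazel treats as package markers.
BUILD_FILE_NAMES = ("BUILD", "BUILD.bazel")


def maven_deleted_packages(repo_root: Path) -> str:
    """Build a comma-separated value for --deleted_packages.

    Collects every directory below a Maven `target` folder that contains a
    BUILD file, written as //path/from/repo/root like the shell script does.
    """
    packages = set()
    for target_dir in repo_root.rglob("target"):
        if not target_dir.is_dir():
            continue
        for name in BUILD_FILE_NAMES:
            for build_file in target_dir.rglob(name):
                if build_file.is_file():
                    rel = build_file.parent.relative_to(repo_root).as_posix()
                    packages.add("//" + rel)
    return ",".join(sorted(packages))
```

A wrapper would then pass the result along, for example as --deleted_packages=<value>, in the same place where the shell script calls bazel-real.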
I'm sorry I don't think this is supported. Some brainstorming:
Is it an option to point maven outputs somewhere else?
Is it an option not to use //... but explicit target(s)?
Maybe just remove the bad BUILD files before running bazel?
