Bazel: Referencing a file inside an output directory in a genrule

Bazel: Referencing a file inside an output directory in a genrule - bazel

I am trying to reference an output of a rule that is nested inside an output directory of another rule in a genrule.
For example, I use rules_foreign_cc to build boost:
boost_build(
name = "boost",
lib_source = "#boost//:all",
linkopts = [
"-lpthread",
],
shared_libraries = [
"libboost_chrono.so.1.72.0",
"libboost_program_options.so.1.72.0",
"libboost_filesystem.so.1.72.0",
"libboost_system.so.1.72.0",
"libboost_thread.so.1.72.0",
"libboost_timer.so.1.72.0",
],
user_options = [
"cxxstd=17",
"--with-chrono",
"--with-filesystem",
"--with-program_options",
"--with-system",
"--with-thread",
"--with-timer",
"-j4",
],
)
And when I build it, I see the outputs:
bazel build //:boost
INFO: Invocation ID: 36440de3-15f2-4ca0-8802-0a95f75ed926
INFO: Analyzed target //:boost (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:boost up-to-date:
bazel-bin/boost/include
bazel-bin/boost/lib/libboost_chrono.so.1.72.0
bazel-bin/boost/lib/libboost_program_options.so.1.72.0
bazel-bin/boost/lib/libboost_filesystem.so.1.72.0
bazel-bin/boost/lib/libboost_system.so.1.72.0
bazel-bin/boost/lib/libboost_thread.so.1.72.0
bazel-bin/boost/lib/libboost_timer.so.1.72.0
bazel-bin/copy_boost/boost
bazel-bin/boost/logs/BuildBoost_script.sh
bazel-bin/boost/logs/BuildBoost.log
bazel-bin/boost/logs/wrapper_script.sh
INFO: Elapsed time: 0.758s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
Boost works properly, I can reference it in cc_library targets and binaries run fine.
Now, I would like to reference one of the outputs in a genrule. The file I want to reference is nested inside the boost/lib/ directory. I would expect something like: $(location :boost/lib/libboost_program_options.so.1.72.0), but that doesn't work.
What's the proper way to reference the outputs in the directory?

This is what I did, hope it helps.
Inside your genrule, create a variable that hold the output array of your boost rule, use $(locations) instead of $(location). Then loop through the array and find the path you want to use later.
BOOST_OUTPUTS=( $(locations boost) )
LIBBOOST_CHRONO_SO=
for i in "$${{BOOST_OUTPUTS[#]}}"
do
if [[ "$$i" == *"libboost_chrono.so.1.72.0"* ]]; then
LIBBOOST_CHRONO_SO="$$i"
fi
done
# Do something with LIBBOOST_CHRONO_SO
Just note that $$ and {{ and }} are for escaping special characters.

Related

BAZEL: How to exclude certain targets if a configuration setting is true

Let's say I have 3 libraries in different folders:
a_library(
name = "a",
)
b_library(
name = "b",
)
c_library(
name = "c",
)
I want to bazel build ... to build all libraries, while bazel build ... -c dbg only builds a_library and b_library. I'm wondering what I should change?
What i've tried:
I tried following https://bazel.build/docs/configurable-attributes, where I made a config setting for dbg and selected tag=['default'] when you passed in -c dbg, but apparently tags aren't configurable.
I've also tried modifying my .bazelrc file to add a tag to c_library. Eg: build --build_tag_filters="disable_for_dbg", but I'm not sure how to get this to only run when I pass -c dbg
Any help is appreciated, thanks!

Incompatible target skipping is the feature for this. It'll exclude the libraries from wildcards, and anything which (transitively) depends on them. If you explicitly list one of them on the command line (bazel build -c dbg c), you'll get an error. Like this:
config_setting(
name = "debug_build",
values = {
"compilation_mode": "dbg",
},
)
c_library(
name = "c",
target_compatible_with = select({
":debug_build": ["#platforms//:incompatible"],
"//conditions:default": [],
}),
)
#platforms//:incompatible is not part of any platform, so any target with target_compatible_with that includes it will always be skipped. target_compatible_with is configurable, so you can add that platform constraint conditionally via select.

Bazel: why lint failed for variable reference?

I had this genrule in BUILD file, but bazel build failed for the error of:
in cmd attribute of genrule rule //example:create_version_pom: $(BUILD_TAG) not defined
genrule(
name = "create_version_pom",
srcs = ["pom_template.xml"],
outs = ["version_pom.xml"],
cmd = "sed 's/BUILD_TAG/$(BUILD_TAG)/g' $< > $#",
)
What's the reason, and how to fix it please?

The cmd attribute of genrule will do variable expansion for Bazel build variables before the command is executed. The $< and $# variables for the input file and the output file are some of the pre-defined variables. Variables can be defined with --define, e.g.:
$ cat BUILD
genrule(
name = "genfoo",
outs = ["foo"],
cmd = "echo $(bar) > $#",
)
$ bazel build foo --define=bar=123
INFO: Analyzed target //:foo (5 packages loaded, 8 targets configured).
INFO: Found 1 target...
Target //:foo up-to-date:
bazel-bin/foo
INFO: Elapsed time: 0.310s, Critical Path: 0.01s
INFO: 2 processes: 1 internal, 1 linux-sandbox.
INFO: Build completed successfully, 2 total actions
$ cat bazel-bin/foo
123
So to have $(BUILD_TAG) work in the genrule, you'll want to pass
--define=BUILD_TAG=the_build_tag
Unless it's that you want BUILD_TAG replaced with literally $(BUILD_TAG), in which case the $ needs to be escaped with another $: $$(BUILD_TAG).
See
https://docs.bazel.build/versions/main/be/general.html#genrule.cmd
https://docs.bazel.build/versions/main/be/make-variables.html
Note that Bazel also has a mechanism for "build stamping" for bringing information like build time and version numbers into the build:
https://docs.bazel.build/versions/main/user-manual.html#workspace_status
https://docs.bazel.build/versions/main/command-line-reference.html#flag--embed_label
Using --workspace_status_command and --embed_label are a little more complicated though.

Bazel: map set(files) -> set(targets)

According to https://docs.bazel.build/versions/master/query-how-to.html#What_build_rule_contains_file_ja,
fullname=$(bazel query path/to/file/bar.java)
bazel query "attr('srcs', $fullname, ${fullname//:*/}:*)"
will tell me which target bar.java belongs to.
How can I get the set of targets that multiple files belong to? I.e. map set(files) -> set(targets). I could do this in serial, but each bazel call is fairly expensive and slow---I want to get it done in one call.
Context: I'd like to do this (build targets pertaining to a few files):
git diff HEAD~ | xargs bazel query "get targets for set(files)" | xargs bazel build
I feel like this capability must already exist, but I haven't been able to find it.

Using query will work, but you can also use aquery for a more direct approach.
https://docs.bazel.build/versions/master/aquery.html
BUILD:
genrule(
name = "gen1",
srcs = ["a"],
outs = ["gen1.out"],
cmd = "echo foo > $#",
)
pkg/BUILD:
genrule(
name = "gen2",
srcs = ["b"],
outs = ["gen2.out"],
cmd = "echo foo > $#",
)
$ bazel aquery "inputs('a|pkg/b', ...)" --include_artifacts=false --include_commandline=false
INFO: Analyzed 2 targets (6 packages loaded, 10 targets configured).
INFO: Found 2 targets...
action 'Executing genrule //pkg:gen2'
Mnemonic: Genrule
Target: //pkg:gen2
Configuration: k8-fastbuild
ActionKey: 8d7d05620bfd8303aa66488e0cd6586d8e978197126cdb41c5fc8c49c81988ef
action 'Executing genrule //:gen1'
Mnemonic: Genrule
Target: //:gen1
Configuration: k8-fastbuild
ActionKey: d4c76a6b6913ce5d887829dbc00d101c1cf5b0ff5240ed13ea328c26e4b41960
INFO: Elapsed time: 0.198s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions
inputs (like attr) accepts a regular expression, so you can "or" the files together with |. Then just filter for Target:, or use one of the other outputs (--output=(proto|textproto|jsonproto))
This has the advantages that:
you don't need to figure out the labels of the files first (since attr in query operates on labels).
aquery queries after the analysis phase, so your results are more accurate because it accounts for configuration (flags, etc).
this will work for any attribute, since it's querying the inputs of all actions, which is at a lower level than rules
On the other hand, since aquery runs loading + analysis, it might take longer than query, since query just runs the loading phase.

How do I make a bazel `sh_binary` target depend on other binary targets?

I have set up bazel to build a number of CLI tools that perform various database maintenance tasks. Each one is a py_binary or cc_binary target that is called from the command line with the path to some data file: it processes that file and stores the results in a database.
Now, I need to create a dependent package that contains data files and shell scripts that call these CLI tools to perform application-specific database operations.
However, there doesn't seem to be a way to depend on the existing py_binary or cc_binary targets from a new package that only contains sh_binary targets and data files. Trying to do so results in an error like:
ERROR: /workspace/shbin/BUILD.bazel:5:12: in deps attribute of sh_binary rule //shbin:run: py_binary rule '//pybin:counter' is misplaced here (expected sh_library)
Is there a way to call/depend on an existing bazel binary target from a shell script using sh_binary?
I have implemented a full example here:
https://github.com/psigen/bazel-mixed-binaries
Notes:
I cannot use py_library and cc_library instead of py_binary and cc_binary. This is because (a) I need to call mixes of the two languages to process my data files and (b) these tools are from an upstream repository where they are already designed as CLI tools.
I also cannot put all the data files into the CLI tool packages -- there are multiple application-specific packages and they cannot be mixed.

You can either create a genrule to run these tools as part of the build, or create a sh_binary that depends on the tools via the data attribute and runs them them.
The genrule approach
This is the easier way and lets you run the tools as part of the build.
genrule(
name = "foo",
tools = [
"//tool_a:py",
"//tool_b:cc",
],
srcs = [
"//source:file1",
":file2",
],
outs = [
"output_file1",
"output_file2",
],
cmd = "$(location //tool_a:py) --input=$(location //source:file1) --output=$(location output_file1) && $(location //tool_b:cc) < $(location :file2) > $(location output_file2)",
)
The sh_binary approach
This is more complicated, but lets you run the sh_binary either as part of the build (if it is in a genrule.tools, similar to the previous approach) or after the build (from under bazel-bin).
In the sh_binary you have to data-depend on the tools:
sh_binary(
name = "foo",
srcs = ["my_shbin.sh"],
data = [
"//tool_a:py",
"//tool_b:cc",
],
)
Then, in the sh_binary you have to use the so-called "Bash runfiles library" built into Bazel to look up the runtime-path of the binaries. This library's documentation is in its source file.
The idea is:
the sh_binary has to depend on a specific target
you have to copy-paste some boilerplate code to the top of the sh_binary (reason is described here)
then you can use the rlocation function to look up the runtime-path of the binaries
For example your my_shbin.sh may look like this:
#!/bin/bash
# --- begin runfiles.bash initialization ---
...
# --- end runfiles.bash initialization ---
path=$(rlocation "__main__/tool_a/py")
if [[ ! -f "${path:-}" ]]; then
echo >&2 "ERROR: could not look up the Python tool path"
exit 1
fi
$path --input=$1 --output=$2
The __main__ in the rlocation path argument is the name of the workspace. Since your WORKSPACE file does not have a "workspace" rule in, which would define the workspace's name, Bazel will use the default workspace name, which is __main__.

An easier approach for me is to add the cc_binary as a dependency in the data section. In prefix/BUILD
cc_binary(name = "foo", ...)
sh_test(name = "foo_test", srcs = ["foo_test.sh"], data = [":foo"])
Inside foo_test.sh, the working directory is different, so you need to find the right prefix for the binary
#! /usr/bin/env bash
executable=prefix/foo
$executable ...

A clean way to do this is to use args and $(location):
Contents of BUILD:
py_binary(
name = "counter",
srcs = ["counter.py"],
main = "counter.py",
)
sh_binary(
name = "run",
srcs = ["run.sh"],
data = [":counter"],
args = ["$(location :counter)"],
)
Contents of counter.py (your tool):
print("This is the counter tool.")
Contents of run.sh (your bash script):
#!/bin/bash
set -eEuo pipefail
counter="$1"
shift
echo "This is the bash script, about to call the counter tool."
"$counter"
And here's a demo showing the bash script calling the Python tool:
$ bazel run //example:run 2>/dev/null
This is the bash script, about to call the counter tool.
This is the counter tool.
It's also worth mentioning this note (from the docs):
The arguments are not passed when you run the target outside of bazel (for example, by manually executing the binary in bazel-bin/).

Bazel: copy multiple files to binary directory

I need to copy some files to binary directory while preserving their names. What I've got so far:
filegroup(
name = "resources",
srcs = glob(["resources/*.*"]),
)
genrule(
name = "copy_resources",
srcs = ["//some/package:resources"],
outs = [ ],
cmd = "cp $(SRCS) $(#D)",
local = 1,
output_to_bindir = 1,
)
Now I have to specify file names in outs but I can't seem to figure out how to resolve the labels to obtain the actual file names.

To make a filegroup available to a binary (executed using bazel run) or to a test (when executed using bazel test) then one usually lists the filegroup as part of the data of the binary, like so:
cc_binary(
name = "hello-world",
srcs = ["hello-world.cc"],
data = [
"//your_project/other/deeply/nested/resources:other_test_files",
],
)
# known to work at least as of bazel version 0.22.0
Usually the above is sufficient.
However, the executable must then recurse through the directory structure "other/deeply/nested/resources/" in order to find the files from the indicated filegroup.
In other words, when populating the runfiles of an executable, bazel preserves the directory nesting that spans from the WORKSPACE root to all the packages enclosing the given filegroup.
Sometimes, this preserved directory nesting is undesirable.
THE CHALLENGE:
In my case, I had several filegroups located at various points in my project directory tree, and I wanted all the individual files of those groups to end up side-by-side in the runfiles collection of the test binary that would consume them.
My attempts to do this with a genrule were unsuccessful.
In order to copy individual files from multiple filegroups, preserving the basename of each file but flattening the output directory, it was necessary to create a custom rule in a bzl bazel extension.
Thankfully, the custom rule is fairly straightforward.
It uses cp in a shell command much like the unfinished genrule listed in the original question.
The extension file:
# contents of a file you create named: copy_filegroups.bzl
# known to work in bazel version 0.22.0
def _copy_filegroup_impl(ctx):
all_input_files = [
f for t in ctx.attr.targeted_filegroups for f in t.files
]
all_outputs = []
for f in all_input_files:
out = ctx.actions.declare_file(f.basename)
all_outputs += [out]
ctx.actions.run_shell(
outputs=[out],
inputs=depset([f]),
arguments=[f.path, out.path],
# This is what we're all about here. Just a simple 'cp' command.
# Copy the input to CWD/f.basename, where CWD is the package where
# the copy_filegroups_to_this_package rule is invoked.
# (To be clear, the files aren't copied right to where your BUILD
# file sits in source control. They are copied to the 'shadow tree'
# parallel location under `bazel info bazel-bin`)
command="cp $1 $2")
# Small sanity check
if len(all_input_files) != len(all_outputs):
fail("Output count should be 1-to-1 with input count.")
return [
DefaultInfo(
files=depset(all_outputs),
runfiles=ctx.runfiles(files=all_outputs))
]
copy_filegroups_to_this_package = rule(
implementation=_copy_filegroup_impl,
attrs={
"targeted_filegroups": attr.label_list(),
},
)
Using it:
# inside the BUILD file of your exe
load(
"//your_project:copy_filegroups.bzl",
"copy_filegroups_to_this_package",
)
copy_filegroups_to_this_package(
name = "other_files_unnested",
# you can list more than one filegroup:
targeted_filegroups = ["//your_project/other/deeply/nested/library:other_test_files"],
)
cc_binary(
name = "hello-world",
srcs = ["hello-world.cc"],
data = [
":other_files_unnested",
],
)
You can clone a complete working example here.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Bazel: Referencing a file inside an output directory in a genrule - bazel

Related

BAZEL: How to exclude certain targets if a configuration setting is true

Bazel: why lint failed for variable reference?

Bazel: map set(files) -> set(targets)

How do I make a bazel `sh_binary` target depend on other binary targets?

Bazel: copy multiple files to binary directory

Categories

Resources