How to define string based on host os in bazel rule definition? - bazel

I have the following rule definition:
helm_action = rule(
    attrs = {
        …
        "cluster_aliases": attr.string_dict(
            doc = "key value pair matching for creating a cluster alias where the name used to evoke a cluster alias is different than the actual cluster's name",
            default = DEFAULT_CLUSTER_ALIASES,
        ),
        …
    },
    …
)
I'd like for DEFAULT_CLUSTER_ALIASES value to be based on the host os but
DEFAULT_CLUSTER_ALIASES = {
    "local": select({
        "@platforms//os:osx": "docker-desktop",
        "@platforms//os:linux": "minikube",
    })
}
errors with:
Error in string_dict: expected value of type 'string' for dict value element, but got select({"@platforms//os:osx": "docker-desktop", "@platforms//os:linux": "minikube"}) (select)
How do I go about defining DEFAULT_CLUSTER_ALIASES based on the host os?

Judging from https://github.com/bazelbuild/bazel/issues/2045, selecting based on host os is not possible.

When you create a rule or macro, it is evaluated during the loading phase, before command-line flags are evaluated. Bazel needs to know the default value in your build rule helm_action during the loading phase but can't because it hasn't parsed the command line and analysed the build graph.
The command line is parsed and select statements are evaluated during the analysis phase. As a broad rule, if your select statement doesn't end up in a BUILD.bazel file, it's not going to work. So the easiest way to achieve what you are after is to create a macro that wraps your rule and injects the default, e.g.:
# helm_action.bzl
# Add an '_' prefix to your rule to make the rule private.
_helm_action = rule(
    attrs = {
        …
        "cluster_aliases": attr.string_dict(
            doc = "key value pair matching for creating a cluster alias where the name used to evoke a cluster alias is different than the actual cluster's name",
            # Remove default attribute.
        ),
        …
    },
    …
)
# Wrap your rule in a publicly exported macro.
# 'name' is taken as an explicit parameter so it is not passed twice via **kwargs.
def helm_action(name, **kwargs):
    _helm_action(
        name = name,
        # Instantiate your rule with a select. Note that select() has to wrap
        # the whole attribute value (the dict), not a single value inside it.
        cluster_aliases = DEFAULT_CLUSTER_ALIASES,
        **kwargs
    )
It's important to note the difference between a macro and a rule. A macro is a way of generating a set of targets using other build rules, and it expands to roughly the equivalent of its contents when used in a BUILD file. You can check this by querying a target with the --output build flag, e.g.
load(":helm_action.bzl", "helm_action")
helm_action(
    name = "foo",
    # ...
)
You can query the output using the command;
bazel query //:foo --output build
This will demonstrate that the select statement is being copied into the BUILD file.
A good example of this approach is in the rules_docker repository.
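To make the loading/analysis split described above more concrete, here is a toy model in Python. It is not Bazel code: the Select class and check_string_dict_default function are hypothetical stand-ins, and the matching is simplified (real select() also supports //conditions:default and more specific matches). It sketches why a select nested inside a dict default is rejected up front, while a select wrapping the whole value can be resolved later, once the active conditions are known.

```python
class Select:
    """Toy stand-in for select(): stores branches, resolves them later."""

    def __init__(self, branches):
        self.branches = dict(branches)

    def resolve(self, active_conditions):
        """Return the value of the single branch whose condition is active."""
        matches = [value for condition, value in self.branches.items()
                   if condition in active_conditions]
        if len(matches) != 1:
            raise ValueError("expected exactly one matching condition")
        return matches[0]


def check_string_dict_default(value):
    """Loading-phase style check: every dict value must already be a string."""
    for key, item in value.items():
        if not isinstance(item, str):
            raise TypeError(
                "expected value of type 'string' for dict value element, "
                "but got %s" % type(item).__name__)
    return value


# A select wrapping the whole dict stays unresolved until conditions are known.
CLUSTER_ALIASES = Select({
    "@platforms//os:osx": {"local": "docker-desktop"},
    "@platforms//os:linux": {"local": "minikube"},
})

# "Analysis phase": the configuration is known, so the select can be resolved.
resolved = CLUSTER_ALIASES.resolve({"@platforms//os:linux"})
print(resolved)
```

Passing a dict whose value is still a Select to check_string_dict_default raises a TypeError, which mirrors the error message in the question.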
EDIT: The question was clarified, so I've got an updated answer below but will keep the above answer in case it is useful to others.
A simple way of achieving what you are after is to use Bazel's toolchain API. This is a very flexible API and is what most language rulesets in Bazel use, e.g.
Create a build file with your toolchains;
# //helm:BUILD.bazel
load(":helm_toolchains.bzl", "helm_toolchain")
toolchain_type(name = "toolchain_type")
helm_toolchain(
    name = "osx",
    cluster_aliases = {
        "local": "docker-desktop",
    },
)
toolchain(
    name = "osx_toolchain",
    toolchain = ":osx",
    toolchain_type = ":toolchain_type",
    exec_compatible_with = ["@platforms//os:macos"],
    # Optionally use to restrict target platforms too.
    # target_compatible_with = []
)
helm_toolchain(
    name = "linux",
    cluster_aliases = {
        "local": "minikube",
    },
)
toolchain(
    name = "linux_toolchain",
    toolchain = ":linux",
    toolchain_type = ":toolchain_type",
    exec_compatible_with = ["@platforms//os:linux"],
)
Register your toolchains so that Bazel knows what to look for;
# //:WORKSPACE
# the rest of your workspace...
register_toolchains("//helm:all")
# You may need to register your execution platforms too...
# register_execution_platforms("//your_platforms/...")
Implement the toolchain backend;
# //helm:helm_toolchains.bzl
HelmToolchainInfo = provider(fields = ["cluster_aliases"])
def _helm_toolchain_impl(ctx):
    toolchain_info = platform_common.ToolchainInfo(
        helm_toolchain_info = HelmToolchainInfo(
            cluster_aliases = ctx.attr.cluster_aliases,
        ),
    )
    return [toolchain_info]
helm_toolchain = rule(
    implementation = _helm_toolchain_impl,
    attrs = {
        "cluster_aliases": attr.string_dict(),
    },
)
Update helm_action to use toolchains. e.g.
def _helm_action_impl(ctx):
    cluster_aliases = ctx.toolchains["@your_repo//helm:toolchain_type"].helm_toolchain_info.cluster_aliases
    #...
helm_action = rule(
    implementation = _helm_action_impl,
    attrs = {
        #…
    },
    toolchains = ["@your_repo//helm:toolchain_type"],
)
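As a rough mental model of what the registration above buys you, the following Python sketch picks a toolchain for a given execution platform. It is a simplification and an assumption on my part, not Bazel's actual resolution algorithm: the resolve_toolchain function and the dictionaries are hypothetical, and the model only checks exec_compatible_with, taking the first registered toolchain whose constraints are all satisfied.

```python
def resolve_toolchain(registered_toolchains, exec_platform_constraints):
    """Return the first registered toolchain whose exec constraints all hold."""
    available = set(exec_platform_constraints)
    for toolchain in registered_toolchains:
        if set(toolchain["exec_compatible_with"]) <= available:
            return toolchain
    raise LookupError("no matching toolchain for this execution platform")


# Mirrors the two toolchain() targets from //helm:BUILD.bazel.
REGISTERED = [
    {
        "name": "osx_toolchain",
        "exec_compatible_with": ["@platforms//os:macos"],
        "cluster_aliases": {"local": "docker-desktop"},
    },
    {
        "name": "linux_toolchain",
        "exec_compatible_with": ["@platforms//os:linux"],
        "cluster_aliases": {"local": "minikube"},
    },
]

# Hypothetical constraint set describing a Linux execution platform.
linux_host = ["@platforms//os:linux", "@platforms//cpu:x86_64"]
chosen = resolve_toolchain(REGISTERED, linux_host)
print(chosen["cluster_aliases"])
```

The rule implementation never mentions the OS itself; it only reads whatever the selected toolchain provides, which is what makes this approach depend on the host (execution) platform rather than on command-line flags.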

Related

Bazel options in BUILD file

I need to set some build options every time I invoke bazel for a specific target. For example, bazel build --collect_code_coverage //:target. How can I avoid providing the build options at the command line explicitly, so that bazel build //:target implicitly has the build option --collect_code_coverage applied?
The closest solution I found was using the bazelrc file, but it does not allow me to configure build options at a target level.
This is doable with user-defined transitions. There are some performance considerations that are worth reading over first. It's also relatively complicated.
# transition.bzl
# The function name must match the one passed to transition() below.
def _coverage_transition_impl(settings, attr):
    _ignore = settings
    return {"//command_line_option:collect_code_coverage": True}
coverage_transition = transition(
    implementation = _coverage_transition_impl,
    inputs = [],
    outputs = ["//command_line_option:collect_code_coverage"],
)
def _coverage_impl(ctx):
    executable = ctx.actions.declare_file(ctx.attr.name)
    ctx.actions.symlink(output = executable, target_file = ctx.file.target)
    return [DefaultInfo(executable = executable)]
build_with_collect_coverage = rule(
    implementation = _coverage_impl,
    attrs = {
        "target": attr.label(mandatory = True, allow_single_file = True, cfg = coverage_transition),
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
Then add the transition rule to BUILD.bazel e.g.
load(":transition.bzl", "build_with_collect_coverage")
cc_binary(
    name = "target",
    #...
)
build_with_collect_coverage(
    name = "target_with_coverage_collection",
    target = ":target",
)
Note: While this is a generic solution that is incredibly powerful and supports any language, you might simply be able to get away with using the features attribute, e.g.
cc_binary(
    name = "target",
    #...
    features = ["coverage"],
)

Propagating copts/defines to all of a target's dependencies

I have a project that involves multiple BUILD files in a single WORKSPACE, within a fairly complex build system. My goal in short: for some specific target, I want all of its recursive dependencies to be built with an extra set of attributes (copts/defines) compared to when those dependency targets are built in any other way. I have not yet found a way to do this cleanly.
For example, target G is normally built with copts = []. If target P depends on target G, and I run bazel build :P, I want both targets to be built with copts = ["-DMY_DEFINE"], along with all dependencies of target G, etc.
The cc_binary.defines argument propagates in the opposite direction: all targets that depend on some target A will receive all of target A's defines.
Limitations:
prefer to avoid custom command line flags, I don't control how people call bazel {build,test}
duplicating the entire tree of dependency targets is not practical
It doesn't appear possible to set the value of a config_setting from within a BUILD file or a target, so it seems a select-based solution couldn't work.
Previous work:
https://groups.google.com/g/bazel-discuss/c/rZps4nqYqt8/m/YS_pZD6oAQAJ - 2017, recommends "parallel trees" or custom macros (of which we already have many, it would be challenging to wrap them in another)
Propagate copts to all dependencies in Bazel - I believe these all depend on custom command line flags as well
Creating a user-defined build setting doesn't require command-line flags. If you set flag = False, then it actually can't be set on the command line. You can use a user-defined transition to set it instead.
I think something like this will do what you're looking for (save it in extra_copts.bzl):
def _extra_copts_impl(ctx):
    context = cc_common.create_compilation_context(
        defines = depset(ctx.build_setting_value)
    )
    return [CcInfo(compilation_context = context)]
extra_copts = rule(
    implementation = _extra_copts_impl,
    build_setting = config.string_list(flag = False),
)
def _use_extra_copts_implementation(ctx):
    return [ctx.attr._copts[CcInfo]]
use_extra_copts = rule(
    implementation = _use_extra_copts_implementation,
    attrs = {"_copts": attr.label(default = "//:extra_copts")},
)
def _add_copts_impl(settings, attr):
    return {"//:extra_copts": ["MY_DEFINE"]}
_add_copts = transition(
    implementation = _add_copts_impl,
    inputs = [],
    outputs = ["//:extra_copts"],
)
def _with_extra_copts_implementation(ctx):
    infos = [d[CcInfo] for d in ctx.attr.deps]
    return [cc_common.merge_cc_infos(cc_infos = infos)]
with_extra_copts = rule(
    implementation = _with_extra_copts_implementation,
    attrs = {
        "deps": attr.label_list(cfg = _add_copts),
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist"
        )
    },
)
and then in the BUILD file:
load("//:extra_copts.bzl", "extra_copts", "use_extra_copts", "with_extra_copts")
extra_copts(name = "extra_copts", build_setting_default = [])
use_extra_copts(name = "use_extra_copts")
cc_library(
    name = "G",
    deps = [":use_extra_copts"],
)
with_extra_copts(
    name = "P_deps",
    deps = [":G"],
)
cc_library(
    name = "P",
    deps = [":P_deps"],
)
extra_copts is the build setting. It returns a CcInfo directly, which means it's straightforward to do any other C++ library swapping with the same approach. Its default is effectively an "empty" CcInfo which won't do anything to libraries that depend on it.
with_extra_copts wraps a set of dependencies, configured to use a different CcInfo. This is the rule that actually changes the value, to create the second version of G with different flags.
_add_copts is the transition which with_extra_copts uses to change the value of the extra_copts build setting. It could examine attr to do something more sophisticated than adding a hard-coded list.
use_extra_copts pulls the CcInfo out of extra_copts so a cc_library can use them.
To avoid rewriting the builtin C++ rules, this uses wrapper rules to pull the copts out and do the transition. You might want to create macros to bundle the wrapper rules along with the corresponding cc_library. Alternatively, you could use rules_cc's my_c_archive as a starting point to create custom rules that reuse the core implementation of the builtin C++ rules while integrating the transition and use of the build setting into a single rule.
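The idea of a "second version of G" can be pictured with a small Python model. This is only an illustrative sketch under my own assumptions, not how Bazel is implemented: add_copts_transition and configure are hypothetical names, and a configuration is reduced to a plain dict holding the one build setting from the example.

```python
def add_copts_transition(settings):
    """Toy transition: return a new configuration with the setting changed."""
    new_settings = dict(settings)
    new_settings["//:extra_copts"] = ["MY_DEFINE"]
    return new_settings


def configure(label, settings):
    """A configured target is identified by its label plus its configuration."""
    frozen = tuple(sorted((key, tuple(value)) for key, value in settings.items()))
    defines = list(settings["//:extra_copts"])
    return (label, frozen), defines


default_settings = {"//:extra_copts": []}

# G built directly: default configuration, no extra defines.
plain_key, plain_defines = configure("//:G", default_settings)

# G reached through with_extra_copts: the transition changes the setting first.
extra_key, extra_defines = configure("//:G", add_copts_transition(default_settings))

print(plain_defines, extra_defines)
```

Because the two keys differ, the same label ends up with two independent results, which is why building G on its own is unaffected by the defines that P requests.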

Conditionally create a Bazel rule based on --config

I'm working on a problem in which I only want to create a particular rule if a certain Bazel config has been specified (via '--config'). We have been using Bazel since 0.11 and have a bunch of build infrastructure that works around former limitations in Bazel. I am incrementally porting us up to newer versions. One of the features that was missing was compiler transitions, and so we rolled our own using configs and some external scripts.
My first attempt at solving my problem looks like this:
load("@rules_cc//cc:defs.bzl", "cc_library")
# use this with a select to pick targets to include/exclude based on config
# see __build_if_role for an example
def noop_impl(ctx):
    pass
noop = rule(
    implementation = noop_impl,
    attrs = {
        "deps": attr.label_list(),
    },
)
def __sanitize(config):
    if len(config) > 2 and config[:2] == "//":
        config = config[2:]
    return config.replace(":", "_").replace("/", "_")
def build_if_config(**kwargs):
    config = kwargs['config']
    kwargs.pop('config')
    name = kwargs['name'] + '_' + __sanitize(config)
    binary_target_name = kwargs['name']
    kwargs['name'] = binary_target_name
    cc_library(**kwargs)
    noop(
        name = name,
        deps = select({
            config: [ binary_target_name ],
            "//conditions:default": [],
        })
    )
This almost gets me there, but the problem is that if I want to build a library as an output, then it becomes an intermediate dependency, and therefore gets deleted or never built.
For example, if I do this:
build_if_config(
    name="some_lib",
    srcs=[ "foo.c" ],
    config="//:my_config",
)
and then I run
bazel build --config my_config //:some_lib
Then libsome_lib.a does not make it to bazel-out, although if I define it using cc_library, then it does.
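For reference, the name handling in the macro is plain string manipulation, so it can be checked outside Bazel. The following is a Python rendition of the helper (the string methods used behave the same way in Python and Starlark); sanitize and noop_target_name are names chosen for this sketch. It shows that the call above produces two targets: the cc_library called some_lib and a noop target with a derived name.

```python
def sanitize(config):
    """Python rendition of the macro's __sanitize helper."""
    if len(config) > 2 and config[:2] == "//":
        config = config[2:]
    return config.replace(":", "_").replace("/", "_")


def noop_target_name(name, config):
    """Name the macro gives to the noop target that wraps the library."""
    return name + "_" + sanitize(config)


print(noop_target_name("some_lib", "//:my_config"))  # some_lib__my_config
```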
Is there a way that I can just create the appropriate rule directly in the macro instead of creating a noop rule and using a select? Or another mechanism?
Thanks in advance for your help!
As I noted in my comment, I was misunderstanding how Bazel figures out its dependencies. The create a file section of The Rules Tutorial explains some of the details, and I followed along here for some of my solution.
Basically, the problem was not that the built files were not sticking around; it was that they were never getting built. Bazel did not know to look in the deps variable and build those things: it seems I had to create an action which uses the deps, and then register that action's output by returning a (list of) DefaultInfo.
Below is my new noop_impl function:
def noop_impl(ctx):
    if len(ctx.attr.deps) == 0:
        return None
    # ctx.attr has the attributes of this rule
    dep = ctx.attr.deps[0]
    # DefaultInfo is apparently some sort of globally available
    # class that can be used to index Target objects
    infile = dep[DefaultInfo].files.to_list()[0]
    outfile = ctx.actions.declare_file('lib' + ctx.label.name + '.a')
    ctx.actions.run_shell(
        inputs = [infile],
        outputs = [outfile],
        command = "cp %s %s" % (infile.path, outfile.path),
    )
    # we can also instantiate a DefaultInfo to indicate what output
    # we provide
    return [DefaultInfo(files = depset([outfile]))]

How to write a Bazel test rule using a provided tool rather than a rule-built one?

I have a test tool (roughly, a diffing tool) that takes two inputs, and returns both an output (the difference between the two inputs), and a return code (0 if the two inputs are matching, 1 otherwise). It's built in Kotlin, and available at //java/fr/enoent/phosphorus in my repo.
I want to write a rule that tests that a file generated by something is identical to the reference file already present in the repository. I tried something with ctx.actions.run, the problem being that my rule, having test = True set, needs to return an executable built by that rule (so not a tool provided to the rule). I then tried to wrap it in a shell script following the example, like this:
def _phosphorus_test_impl(ctx):
    output = ctx.actions.declare_file("{name}.phs".format(name = ctx.label.name))
    script = phosphorus_compare(
        ctx,
        reference = ctx.file.reference,
        comparison = ctx.file.comparison,
        out = output,
    )
    ctx.actions.write(
        output = ctx.outputs.executable,
        content = script,
    )
    runfiles = ctx.runfiles(files = [ctx.executable._phosphorus_tool, ctx.file.reference, ctx.file.comparison])
    return [DefaultInfo(runfiles = runfiles)]
phosphorus_test = rule(
    _phosphorus_test_impl,
    attrs = {
        "comparison": attr.label(
            allow_single_file = [".phs"],
            doc = "File to compare to the reference",
            mandatory = True,
        ),
        "reference": attr.label(
            allow_single_file = [".phs"],
            doc = "Reference file",
            mandatory = True,
        ),
        "_phosphorus_tool": attr.label(
            default = "//java/fr/enoent/phosphorus",
            executable = True,
            cfg = "host",
        ),
    },
    doc = "Compares two files, and fails if they are different.",
    test = True,
)
(phosphorus_compare is just a macro generating the actual command.)
However, this approach has two issues:
The output can't be declared this way. It's not linked to any action (and Bazel is complaining about it). Maybe I don't really need to declare an output for a test? Does Bazel make anything in the test folder available when the test fails?
The runfiles necessary to run the tool don't seem to be available when the test runs:
java/fr/enoent/phosphorus/phosphorus: line 359: /home/kernald/.cache/bazel/_bazel_kernald/58c025fbb926eac6827117ef80f7d2fa/sandbox/linux-sandbox/1979/execroot/fr_enoent/bazel-out/k8-fastbuild/bin/tools/phosphorus/tests/should_pass.runfiles/remotejdk11_linux/bin/java: No such file or directory
Overall I feel like using a shell script is just adding an unnecessary indirection, and losing some context (e.g. tools' runfiles). Ideally, I would just use ctx.actions.run and rely on its return code, but it doesn't seem to be an option as a test apparently needs to generate an executable. What would be the correct approach to write such a rule?
Turns out, generating a script is the correct approach. As far as I understand, it's impossible to return some kind of pointer to a ctx.actions.run: a test rule needs to have an executable output.
Regarding the output file that the tool is generating: there's no need to declare it, at all. I just need to make sure that it's generated in $TEST_UNDECLARED_OUTPUTS_DIR. Every single file in this directory will be added to an archive called output.zip by Bazel. This is (partly) documented here.
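As an illustration of the undeclared-outputs mechanism, here is a minimal Python sketch of a diff-style test tool. It is a hypothetical stand-in, not the Kotlin tool from the question: compare_files and its parameters are names invented for this example, and the temp-directory fallback is my own addition so the sketch also runs outside of bazel test. The only Bazel-specific piece is the TEST_UNDECLARED_OUTPUTS_DIR environment variable mentioned above.

```python
import difflib
import os
import tempfile


def compare_files(reference_path, comparison_path, out_name, env=None):
    """Diff two text files, save the diff, and return (exit_code, diff_path)."""
    env = os.environ if env is None else env
    # Bazel sets TEST_UNDECLARED_OUTPUTS_DIR for tests; fall back to a temp
    # directory (an assumption for this sketch) when running outside Bazel.
    out_dir = env.get("TEST_UNDECLARED_OUTPUTS_DIR") or tempfile.mkdtemp()
    with open(reference_path) as ref, open(comparison_path) as cmp_file:
        diff = list(difflib.unified_diff(
            ref.readlines(),
            cmp_file.readlines(),
            fromfile=reference_path,
            tofile=comparison_path,
        ))
    diff_path = os.path.join(out_dir, out_name)
    with open(diff_path, "w") as out:
        out.writelines(diff)
    # 0 when the inputs match, 1 otherwise, like the tool described above.
    return (1 if diff else 0), diff_path
```

A test script would call this with the reference and comparison files and exit with the returned code; the diff file needs no declaration in the rule because it lands in the undeclared outputs directory.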
Concerning the runfiles, well, I had the tool's binary, but not its own runfiles. Here is the fixed rule:
def _phosphorus_test_impl(ctx):
    script = phosphorus_compare(
        ctx,
        reference = ctx.file.reference,
        comparison = ctx.file.comparison,
        out = "%s.phs" % ctx.label.name,
    )
    ctx.actions.write(
        output = ctx.outputs.executable,
        content = script,
    )
    return [
        DefaultInfo(
            runfiles = ctx.runfiles(
                files = [
                    ctx.executable._phosphorus_tool,
                    ctx.file.reference,
                    ctx.file.comparison,
                ],
            ).merge(ctx.attr._phosphorus_tool[DefaultInfo].default_runfiles),
            executable = ctx.outputs.executable,
        ),
    ]
def phosphorus_test(size = "small", **kwargs):
    _phosphorus_test(size = size, **kwargs)
_phosphorus_test = rule(
    _phosphorus_test_impl,
    attrs = {
        "comparison": attr.label(
            allow_single_file = [".phs"],
            doc = "File to compare to the reference",
            mandatory = True,
        ),
        "reference": attr.label(
            allow_single_file = [".phs"],
            doc = "Reference file",
            mandatory = True,
        ),
        "_phosphorus_tool": attr.label(
            default = "//java/fr/enoent/phosphorus",
            executable = True,
            cfg = "target",
        ),
    },
    doc = "Compares two files, and fails if they are different.",
    test = True,
)
The key part being .merge(ctx.attr._phosphorus_tool[DefaultInfo].default_runfiles) in the returned DefaultInfo.
I also made a small mistake with the configuration: this test is intended to run in the target configuration, not the host one, so cfg has been changed accordingly.

How can I build custom rules using the output of workspace_status_command?

The bazel build flag --workspace_status_command supports calling a script to retrieve, e.g., repository metadata. This is also known as build stamping and is available in rules like java_binary.
I'd like to create a custom rule using this metadata.
I want to use this for a common support function. It should receive the git version and some other attributes and create a version.go output file usable as a dependency.
So I started a journey looking at rules in various bazel repositories.
Rules like rules_docker support stamping with stamp in container_image and let you reference the status output in attributes.
rules_go supports it in the x_defs attribute of go_binary.
This would be ideal for my purpose and I dug in...
It looks like I can get what I want with ctx.actions.expand_template using the entries in ctx.info_file or ctx.version_file as a dictionary for substitutions. But I didn't figure out how to get a dictionary of those files. And those two files seem to be "unofficial", they are not part of the ctx documentation.
Building on what I found out already: How do I get a dict based on the status command output?
If that's not possible, what is the shortest/simplest way to access workspace_status_command output from custom rules?
I've been exactly where you are and I ended up following the path you've started exploring. I generate a JSON description that also includes information collected from git to package with the result and I ended up doing something like this:
def _build_mft_impl(ctx):
    args = ctx.actions.args()
    args.add('-f')
    args.add(ctx.info_file)
    args.add('-i')
    args.add(ctx.files.src)
    args.add('-o')
    args.add(ctx.outputs.out)
    ctx.actions.run(
        outputs = [ctx.outputs.out],
        inputs = ctx.files.src + [ctx.info_file],
        arguments = [args],
        progress_message = "Generating manifest: " + ctx.label.name,
        executable = ctx.executable._expand_template,
    )
def _get_mft_outputs(src):
    return {"out": src.name[:-len(".tmpl")]}
build_manifest = rule(
    implementation = _build_mft_impl,
    attrs = {
        "src": attr.label(mandatory=True,
                          allow_single_file=[".json.tmpl", ".json_tmpl"]),
        "_expand_template": attr.label(default=Label("//:expand_template"),
                                       executable=True,
                                       cfg="host"),
    },
    outputs = _get_mft_outputs,
)
//:expand_template is a label in my case pointing to a py_binary performing the transformation itself. I'd be happy to learn about a better (more native, fewer hops) way of doing this, but (for now) I went with: it works. A few comments on the approach and your concerns:
AFAIK you cannot read the file in (and perform operations on it) in Skylark itself...
...speaking of which, it's probably not a bad thing to keep the transformation (tool) and build description (bazel) separate anyway.
It could be debated what constitutes the official documentation, but while ctx.info_file may not appear in the reference manual, it is documented in the source tree. :) That is the case for other areas as well (and I hope that is not because those interfaces are considered not yet committed to).
For the sake of completeness, in src/main/java/com/google/devtools/build/lib/skylarkbuildapi/SkylarkRuleContextApi.java there is:
@SkylarkCallable(
    name = "info_file",
    structField = true,
    documented = false,
    doc =
        "Returns the file that is used to hold the non-volatile workspace status for the "
            + "current build request."
)
public FileApi getStableWorkspaceStatus() throws InterruptedException, EvalException;
EDIT: few extra details as asked in the comment.
In my workspace_status.sh I would have for instance the following line:
echo STABLE_GIT_REF $(git log -1 --pretty=format:%H)
In my .json.tmpl file I would then have:
"ref": "${STABLE_GIT_REF}",
I've opted for shell-like notation for the text to be replaced, since it's intuitive for many users as well as easy to match.
As for the replacement, the relevant portion of the actual code (CLI kept out of this) would be:
import re
def get_map(val_file):
    """
    Return dictionary of key/value pairs from ``val_file``.
    """
    value_map = {}
    for line in val_file:
        # Each status line is "KEY value"; split on the first space only.
        (key, value) = line.split(' ', 1)
        value_map.update(((key, value.rstrip('\n')),))
    return value_map
def expand_template(val_file, in_file, out_file):
    """
    Read each line from ``in_file`` and write it to ``out_file`` replacing all
    ${KEY} references with values from ``val_file``.
    """
    def _substitute_variable(mobj):
        return value_map[mobj.group('var')]
    re_pat = re.compile(r'\${(?P<var>[^} ]+)}')
    value_map = get_map(val_file)
    for line in in_file:
        out_file.write(re_pat.subn(_substitute_variable, line)[0])
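To see the substitution end to end without running Bazel, here is a compact, self-contained sketch of the same idea fed with in-memory data. The helper names parse_status and render, and the sample keys and values, are made up for illustration; in a real build the status lines would come from the file behind ctx.info_file.

```python
import io
import re

# Matches ${KEY} references, as in the template notation described above.
PATTERN = re.compile(r"\$\{(?P<var>[^} ]+)\}")


def parse_status(lines):
    """Turn 'KEY value' lines into a dict, splitting on the first space only."""
    values = {}
    for line in lines:
        key, _, value = line.rstrip("\n").partition(" ")
        values[key] = value
    return values


def render(template, values):
    """Replace every ${KEY} in the template with its value from the dict."""
    return PATTERN.sub(lambda match: values[match.group("var")], template)


# Hypothetical status file content and template line.
status = io.StringIO("STABLE_GIT_REF 0123abc\nSTABLE_NOTE two words\n")
values = parse_status(status)
rendered = render('"ref": "${STABLE_GIT_REF}",\n', values)
print(rendered, end="")
```

A key missing from the status file raises a KeyError here, which seems preferable to silently emitting an unexpanded placeholder into the manifest.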
EDIT2: This is how I expose the Python script to the rest of Bazel.
py_binary(
    name = "expand_template",
    main = "expand_template.py",
    srcs = ["expand_template.py"],
    visibility = ["//visibility:public"],
)
Building on Ondrej's answer, I now use something like this (adapted in the SO editor, so it might contain small errors):
tools/bazel.rc:
build --workspace_status_command=tools/workspace_status.sh
tools/workspace_status.sh:
echo STABLE_GIT_REV $(git rev-parse HEAD)
version.bzl:
_VERSION_TEMPLATE_SH = """
set -e -u -o pipefail
# Export each "KEY value" status line; %% keeps only the text before the first space.
while read line; do
  export "${line%% *}"="${line#* }"
done <"$INFILE" \
&& cat <<EOF >"$OUTFILE"
{ "ref": "${STABLE_GIT_REV}"
, "service": "${SERVICE_NAME}"
}
EOF
"""
def _commit_info_impl(ctx):
    ctx.actions.run_shell(
        outputs = [ctx.outputs.outfile],
        inputs = [ctx.info_file],
        progress_message = "Generating version file: " + ctx.label.name,
        command = _VERSION_TEMPLATE_SH,
        env = {
            'INFILE': ctx.info_file.path,
            # Must match the key declared in the rule's outputs dict below.
            'OUTFILE': ctx.outputs.outfile.path,
            'SERVICE_NAME': ctx.attr.service,
        },
    )
commit_info = rule(
    implementation = _commit_info_impl,
    attrs = {
        'service': attr.string(
            mandatory = True,
            doc = 'name of versioned service',
        ),
    },
    outputs = {
        'outfile': 'manifest.json',
    },
)

Resources