How to create a rule from within another rule in Bazel - bazel

Situation
I have two Skylark extension rules: blah_library and blah_binary. All of a blah_library's transitive dependencies are propagated by returning a provider(transitive_deps=...), and are handled appropriately by any ultimate dependent blah_binary target.
What I want to do
I want each blah_library to also create a filegroup with all the transitive dependencies mentioned above, so that I can access them separately. E.g., I'd like to be able to pass them in as data dependencies to a cc_binary. In other words:
# Somehow have this automatically create a target named `foo__trans_deps`?
blah_library(
name = "foo",
srcs = [...],
deps = [...],
)
cc_binary(
...,
data = [":foo__trans_deps"],
)
How should I do this? Any help would be appreciated!
What I've tried
Make a macro
I tried making a macro like so:
_real_blah_library = rule(...)
def blah_library(name, *args, **kwargs):
native.filegroup(
name = name + "__trans_deps",
srcs = ???,
)
_real_blah_library(name=name, *args, **kwargs)
But I'm not sure how to access the provider provided by _real_blah_library from within the macro, so I don't know how to populate the filegroup's srcs field...
Modify the blah_library rule's implementation
Right now I have something like:
_blah_provider = provider(fields=['transitive_deps'])
def _blah_library_impl(ctx):
...
trans_deps = []
for dep in ctx.attr.deps:
trans_deps += dep[_blah_provider].trans_deps
return _blah_provider(trans_deps=trans_deps)
blah_library = rule(impl=_blah_library_impl, ...)
I tried adding the following to _blah_library_impl, but it didn't work because apparently native.filegroup can't be called within a rule's implementation ("filegroup() cannot be called during the analysis phase"):
def _blah_library_impl(ctx):
...
trans_deps = []
for dep in ctx.attr.deps:
trans_deps += dep[_blah_provider].trans_deps
native.filegroup(
name = ctx.attr.name + "__trans_deps",
srcs = trans_deps,
)
return _blah_provider(trans_deps=trans_deps)

You can't easily create a filegroup like that, but you can still achieve what you want.
If you want to use the rule in genrule.srcs, filegroup.srcs, cc_binary.data, etc., then return a DefaultInfo provider (along with _blah_provider) and set the files field to the transitive closure of files.
You can refine the solution if you want a different set of files when the rule is in a data attribute vs. when in any other (e.g. srcs): just also set the runfiles-related members in DefaultInfo. (Frankly I don't know the difference between them, I'd just set all runfiles-fields to the same value.)

I ended up making my own special filegroup-like rule, as discussed in the comments under #Laszlo's answer. Here's the raw code in case it's a useful starting point for anyone:
def _whl_deps_filegroup_impl(ctx):
input_wheels = ctx.attr.src[_PyZProvider].transitive_wheels
output_wheels = []
for wheel in input_wheels:
file_name = wheel.basename
output_wheel = ctx.actions.declare_file(file_name)
# TODO(josh): Use symlinks instead of copying. Couldn't figure out how
# to do this due to issues with constructing absolute paths...
ctx.actions.run(
outputs=[output_wheel],
inputs=[wheel],
arguments=[wheel.path, output_wheel.path],
executable="cp",
mnemonic="CopyWheel")
output_wheels.append(output_wheel)
return [DefaultInfo(files=depset(output_wheels))]
whl_deps_filegroup = rule(
_whl_deps_filegroup_impl,
attrs = {
"src": attr.label(),
},
)

Related

How to define string based on host os in bazel rule definition?

I have the following rule definition:
helm_action = rule(
attrs = {
…
"cluster_aliases": attr.string_dict(
doc = "key value pair matching for creating a cluster alias where the name used to evoke a cluster alias is different than the actual cluster's name",
default = DEFAULT_CLUSTER_ALIASES,
),
…
},
…
)
I'd like for DEFAULT_CLUSTER_ALIASES value to be based on the host os but
DEFAULT_CLUSTER_ALIASES = {
"local": select({
"#platforms//os:osx": "docker-desktop",
"#platforms//os:linux": "minikube",
})
}
errors with:
Error in string_dict: expected value of type 'string' for dict value element, but got select({"#platforms//os:osx": "docker-desktop", "#platforms//os:linux": "minikube"}) (select)
How do I go about defining DEFAULT_CLUSTER_ALIASES based on the host os?
Judging from https://github.com/bazelbuild/bazel/issues/2045, selecting based on host os is not possible.
When you create a rule or macro, it is evaluated during the loading phase, before command-line flags are evaluated. Bazel needs to know the default value in your build rule helm_action during the loading phase but can't because it hasn't parsed the command line and analysed the build graph.
The command line is parsed and select statements are evaluated during the analysis phase. As a broad rule, if your select statement isn't in a BUILD.bazel then it's not going to work. So the easiest way to achieve what you are after is to create a macro that uses your rule injecting the default. e.g.
# helm_action.bzl
# Add an '_' prefix to your rule to make the rule private.
_helm_action = rule(
attrs = {
…
"cluster_aliases": attr.string_dict(
doc = "key value pair matching for creating a cluster alias where the name used to evoke a cluster alias is different than the actual cluster's name",
# Remove default attribute.
),
…
},
…
)
# Wrap your rule in a publicly exported macro.
def helm_action(**kwargs):
_helm_action(
name = kwargs["name"],
# Instantiate your rule with a select.
cluster_aliases = DEFAULT_CLUSTER_ALIASES,
**kwargs,
)
It's important to note the difference between a macro and a rule. A macro is a way of generating a set of targets using other build rules, and actually expands out roughly equivalent to it's contents when used in a BUILD file. You can check this by querying a target with the --output build flag. e.g.
load(":helm_action.bzl", "helm_action")
helm_action(
name = "foo",
# ...
)
You can query the output using the command;
bazel query //:foo --output build
This will demonstrate that the select statement is being copied into the BUILD file.
A good example of this approach is in the rules_docker repository.
EDIT: The question was clarified, so I've got an updated answer below but will keep the above answer in case it is useful to others.
A simple way of achieving what you are after is to use Bazels toolchain api. This is a very flexible API and is what most language rulesets use in Bazel. e.g.
Create a build file with your toolchains;
# //helm:BUILD.bazel
load(":helm_toolchains.bzl", "helm_toolchain")
toolchain_type(name = "toolchain_type")
helm_toolchain(
name = "osx",
cluster_aliases = {
"local": "docker-desktop",
},
)
toolchain(
name = "osx_toolchain",
toolchain = ":osx",
toolchain_type = ":toolchain_type",
exec_compatible_with = ["#platforms//os:macos"],
# Optionally use to restrict target platforms too.
# target_compatible_with = []
)
helm_toolchain(
name = "linux",
cluster_aliases = {
"local": "minikube",
},
)
toolchain(
name = "linux_toolchain",
toolchain = ":linux",
toolchain_type = ":toolchain_type",
exec_compatible_with = ["#platforms//os:linux"],
)
Register your toolchains so that Bazel knows what to look for;
# //:WORKSPACE
# the rest of your workspace...
register_toolchains("//helm:all")
# You may need to register your execution platforms too...
# register_execution_platforms("//your_platforms/...")
Implement the toolchain backend;
# //helm:helm_toolchains.bzl
HelmToolchainInfo = provider(fields = ["cluster_aliases"])
def _helm_toolchain_impl(ctx):
toolchain_info = platform_common.ToolchainInfo(
helm_toolchain_info = HelmToolchainInfo(
cluster_aliases = ctx.attr.cluster_aliases,
),
)
return [toolchain_info]
helm_toolchain = rule(
implementation = _helm_toolchain_impl,
attrs = {
"cluster_aliases": attr.string_dict(),
},
)
Update helm_action to use toolchains. e.g.
def _helm_action_impl(ctx):
cluster_aliases = ctx.toolchains["#your_repo//helm:toolchain_type"].helm_toolchain_info.cluster_aliases
#...
helm_action = rule(
_helm_action_impl,
attrs = {
#…
},
toolchains = ["#your_repo//helm:toolchain_type"]
)

Apply cpu transition to cc_binary rule in Bazel

I've a c target that always must be compiled for darwin_x86_64, no matter the --cpu set when calling bazel build. It's the only target that always must be compiled for a specific cpu in a big project.
cc_binary(
name = "hello-world",
srcs = ["hello-world.cc"],
)
In the bazel documentation it seems to be possible to do this using transitions. Maybe something like:
def _force_x86_64_impl(settings, attr):
return {"//command_line_option:cpu": "darwin_x86_64"}
force_x86_64 = transition(
implementation = _force_x86_64_impl,
inputs = [],
outputs = ["//command_line_option:cpu"]
)
But how do I tie these two together? I'm probably missing something obvious, but I can't seem to find the relevant documentation over at bazel.build.
You've got the transition defined, now you need to attach it to a rule.
r = rule(
implementation = _r_impl,
attrs = {
"_allowlist_function_transition": attr.label(
default = "#bazel_tools//tools/allowlists/function_transition_allowlist",
),
"srcs": attr.label(cfg = force_x86_64),
},
)
def _r_impl(ctx):
return [DefaultInfo(files = ctx.attr.srcs[0][DefaultInfo].files)]
This defines a rule that attaches the appropriate transition to the srcs you pass it, and then simply forwards the list of files from DefaultInfo. Depending on what you're doing, this might be sufficient, or you might also want to forward the runfiles contained in DefaultInfo and/or other providers.
Use target_compatible_with
Example (only compiled on Windows):
cc_library(
name = "win_driver_lib",
srcs = ["win_driver_lib.cc"],
target_compatible_with = [
"#platforms//cpu:x86_64",
"#platforms//os:windows",
],
)

Propagating copts/defines to all of a target's dependencies

I have a project that involves multiple BUILD files in a single WORKSPACE, within a fairly complex build system. My goal in short: for some specific target, I want all of its recursive dependencies to be built with an extra set of attributes (copts/defines) compared to when those dependency targets are built in any other way. I have not yet found a way to do this cleanly.
For example, target G is normally built with copts = []. If target P depends on target G, and I run bazel build :P, I want both targets to be built with copts = ["-DMY_DEFINE"], along with all dependencies of target G, etc.
The cc_binary.defines argument propagates in the opposite direction: all targets that depend on some target A will receive all of target A's defines.
Limitations:
prefer to avoid custom command line flags, I don't control how people call bazel {build,test}
duplicating the entire tree of dependency targets is not practical
It doesn't appear possible to set the value of a config_setting from within a BUILD file or a target, so it seems a select-based solution couldn't work.
Previous work:
https://groups.google.com/g/bazel-discuss/c/rZps4nqYqt8/m/YS_pZD6oAQAJ - 2017, recommends "parallel trees" or custom macros (of which we already have many, it would be challenging to wrap them in another)
Propagate copts to all dependencies in Bazel - I believe these all depend on custom command line flags as well
Creating a user-defined build setting doesn't require command-line flags. If you set flag = False, then it actually can't be set on the command line. You can use a user-defined transition to set it instead.
I think something like this will do what you're looking for (save it in extra_copts.bzl):
def _extra_copts_impl(ctx):
context = cc_common.create_compilation_context(
defines = depset(ctx.build_setting_value)
)
return [CcInfo(compilation_context = context)]
extra_copts = rule(
implementation = _extra_copts_impl,
build_setting = config.string_list(flag = False),
)
def _use_extra_copts_implementation(ctx):
return [ctx.attr._copts[CcInfo]]
use_extra_copts = rule(
implementation = _use_extra_copts_implementation,
attrs = "_copts": attr.label(default = "//:extra_copts")},
)
def _add_copts_impl(settings, attr):
return {"//:extra_copts": ["MY_DEFINE"]}
_add_copts = transition(
implementation = _add_copts_impl,
inputs = [],
outputs = ["//:extra_copts"],
)
def _with_extra_copts_implementation(ctx):
infos = [d[CcInfo] for d in ctx.attr.deps]
return [cc_common.merge_cc_infos(cc_infos = infos)]
with_extra_copts = rule(
implementation = _with_extra_copts_implementation,
attrs = {
"deps": attr.label_list(cfg = _add_copts),
"_allowlist_function_transition": attr.label(
default = "#bazel_tools//tools/allowlists/function_transition_allowlist"
)
},
)
and then in the BUILD file:
load("//:extra_copts.bzl", "extra_copts", "use_extra_copts", "with_extra_copts")
extra_copts(name = "extra_copts", build_setting_default = [])
use_extra_copts(name = "use_extra_copts")
cc_library(
name = "G",
deps = [":use_extra_copts"],
)
with_extra_copts(
name = "P_deps",
deps = [":G"],
)
cc_library(
name = "P",
deps = [":P_deps"],
)
extra_copts is the build setting. It returns a CcInfo directly, which means it's straightforward to do any other C++ library swapping with the same approach. Its default is effectively an "empty" CcInfo which won't do anything to libraries that depend on it.
with_extra_copts wraps a set of dependencies, configured to use a different CcInfo. This is the rule that actually changes the value, to create the second version of G with different flags.
_add_copts is the transition which with_extra_copts uses to change the value of the extra_copts build setting. It could examine attr to do something more sophisticated than adding a hard-coded list.
use_extra_copts pulls the CcInfo out of extra_copts so a cc_library can use them.
To avoid rewriting the builtin C++ rules, this uses wrapper rules to pull the copts out and do the transition. You might want to create macros to bundle the wrapper rules along with the corresponding cc_library. Alternatively, you could use rules_cc's my_c_archive as a starting point to create custom rules that reuse the core implementation of the builtin C++ rules while integrating the transition and use of the build setting into a single rule.

Conditionally create a Bazel rule based on --config

I'm working on a problem in which I only want to create a particular rule if a certain Bazel config has been specified (via '--config'). We have been using Bazel since 0.11 and have a bunch of build infrastructure that works around former limitations in Bazel. I am incrementally porting us up to newer versions. One of the features that was missing was compiler transitions, and so we rolled our own using configs and some external scripts.
My first attempt at solving my problem looks like this:
load("#rules_cc//cc:defs.bzl", "cc_library")
# use this with a select to pick targets to include/exclude based on config
# see __build_if_role for an example
def noop_impl(ctx):
pass
noop = rule(
implementation = noop_impl,
attrs = {
"deps": attr.label_list(),
},
)
def __sanitize(config):
if len(config) > 2 and config[:2] == "//":
config = config[2:]
return config.replace(":", "_").replace("/", "_")
def build_if_config(**kwargs):
config = kwargs['config']
kwargs.pop('config')
name = kwargs['name'] + '_' + __sanitize(config)
binary_target_name = kwargs['name']
kwargs['name'] = binary_target_name
cc_library(**kwargs)
noop(
name = name,
deps = select({
config: [ binary_target_name ],
"//conditions:default": [],
})
)
This almost gets me there, but the problem is that if I want to build a library as an output, then it becomes an intermediate dependency, and therefore gets deleted or never built.
For example, if I do this:
build_if_config(
name="some_lib",
srcs=[ "foo.c" ],
config="//:my_config",
)
and then I run
bazel build --config my_config //:some_lib
Then libsome_lib.a does not make it to bazel-out, although if I define it using cc_library, then it does.
Is there a way that I can just create the appropriate rule directly in the macro instead of creating a noop rule and using a select? Or another mechanism?
Thanks in advance for your help!
As I noted in my comment, I was misunderstanding how Bazel figures out its dependencies. The create a file section of The Rules Tutorial explains some of the details, and I followed along here for some of my solution.
Basically, the problem was not that the built files were not sticking around, it was that they were never getting built. Bazel did not know to look in the deps variable and build those things: it seems I had to create an action which uses the deps, and then register an action by returning a (list of) DefaultInfo
Below is my new noop_impl function
def noop_impl(ctx):
if len(ctx.attr.deps) == 0:
return None
# ctx.attr has the attributes of this rule
dep = ctx.attr.deps[0]
# DefaultInfo is apparently some sort of globally available
# class that can be used to index Target objects
infile = dep[DefaultInfo].files.to_list()[0]
outfile = ctx.actions.declare_file('lib' + ctx.label.name + '.a')
ctx.actions.run_shell(
inputs = [infile],
outputs = [outfile],
command = "cp %s %s" % (infile.path, outfile.path),
)
# we can also instantiate a DefaultInfo to indicate what output
# we provide
return [DefaultInfo(files = depset([outfile]))]

Passing output of custom bazel rule to a *_binary rule as runtime data

I have a rule that creates a folder (untars a tar.gz) and I would like to use this folder directly as data for a cc_binary. The only way I could find to so is to create an intermediate filegroup with the created folder as source.
Though it works it requires the users of the rule to create an intermediate filegroup and introduces some naming issues as the name of the filegroup cannot be the same as the created folder name.
This is what I have
# rules.bzl
def _untar_impl(ctx):
tree = ctx.actions.declare_directory(ctx.attr.out)
ctx.actions.run_shell(
inputs = [ctx.file.src],
outputs = [tree],
command = "tar -C '{out}' -xf '{src}'".format(src=ctx.file.src.path, out=tree.path),
)
return [DefaultInfo(files = depset([tree]))]
untar = rule(
implementation = _untar_impl,
attrs = {
"src": attr.label(mandatory=True, allow_single_file = True),
"out": attr.string(mandatory=True),
},
)
# BUILD
untar(
name = "media",
src = "media.tar.gz",
out = "media",
)
filegroup(
name = "mediafiles",
srcs = ["media"],
data = [":media"],
)
cc_binary(
name = "main",
srcs = ["main.cpp"],
data = [":mediafiles"],
)
Is there any way to avoid having the intermediate filegroup?
Based on the discussion that ensued, I see I've misunderstood the problem statement a bit. You probably still want to look into handling that tarball (and its entire processing) as an external dependency and a repository_rule, but for your immediate problem of need of intermediate filegroup.
If you noticed, you've defined both srcs and data to point to your media label, and that is exactly the missing bit to have data available for execution of your *_binary rule. Because the untar rule returned depset of files, but those when used data directly would resolve to being empty.
If you replace this line in your rule definition:
return [DefaultInfo(files = depset([tree]))]
with:
return [DefaultInfo(runfiles = ctx.runfiles([tree]))]
You can then in your BUILD file say:
cc_binary(
name = "main",
srcs = ["main.cpp"],
data = [":media"],
)
Because untar rule now provides runfiles of DefaultInfo. That filegroup wrapping and adding media through its data property has done that.

Resources