How to get a Bazel genrule to access transitive dependencies?

I have the following in a BUILD file:
proto_library(
    name = "proto_default_library",
    srcs = glob(["*.proto"]),
    visibility = ["//visibility:public"],
    deps = [
        "@go_googleapis//google/api:annotations_proto",
        "@grpc_ecosystem_grpc_gateway//protoc-gen-openapiv2/options:options_proto",
    ],
)
genrule(
    name = "generate-buf-image",
    srcs = [
        ":buf_yaml",
        ":buf_breaking_image_json",
        ":protos",
    ],
    exec_tools = [
        ":proto_default_library",
        "//buf:generate-buf-image-sh",
        "//buf:generate-buf-image",
    ],
    outs = ["buf-image.json"],
    cmd = "$(location //buf:generate-buf-image-sh) --buf-breaking-image-json=$(location :buf_breaking_image_json) $(location :protos) >$@",
)
While executing $(location //buf:generate-buf-image-sh), the files matched by glob(["*.proto"]) in proto_default_library can be seen in the sandbox, but the proto files of @go_googleapis//google/api:annotations_proto and @grpc_ecosystem_grpc_gateway//protoc-gen-openapiv2/options:options_proto cannot. The same goes for the dependencies of //buf:generate-buf-image-sh.
Do I need to explicitly list out all transitive dependencies so they can be processed by generate-buf-image? Is there a programmatic way to do that?

Since genrules are pretty generic, a genrule sees only the default provider of a target, which usually just has the main outputs of that target (e.g., for java_library, a jar of the classes of that library; for proto_library, the proto files of that library). So to get more detailed information, you would write a Starlark rule to access more specific providers. For example:
WORKSPACE:
load("#bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
name = "rules_proto",
sha256 = "66bfdf8782796239d3875d37e7de19b1d94301e8972b3cbd2446b332429b4df1",
strip_prefix = "rules_proto-4.0.0",
urls = [
"https://mirror.bazel.build/github.com/bazelbuild/rules_proto/archive/refs/tags/4.0.0.tar.gz",
"https://github.com/bazelbuild/rules_proto/archive/refs/tags/4.0.0.tar.gz",
],
)
load("#rules_proto//proto:repositories.bzl", "rules_proto_dependencies", "rules_proto_toolchains")
rules_proto_dependencies()
rules_proto_toolchains()
defs.bzl:
def _my_rule_impl(ctx):
    output = ctx.actions.declare_file(ctx.attr.name + ".txt")

    args = ctx.actions.args()
    args.add(output)

    inputs = []
    for src in ctx.attr.srcs:
        proto_files = src[ProtoInfo].transitive_sources
        args.add_all(proto_files)
        inputs.append(proto_files)

    ctx.actions.run(
        inputs = depset(transitive = inputs),
        executable = ctx.attr._tool.files_to_run,
        arguments = [args],
        outputs = [output],
    )

    return DefaultInfo(files = depset([output]))

my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "srcs": attr.label_list(providers = [ProtoInfo]),
        "_tool": attr.label(default = "//:tool"),
    },
)
ProtoInfo is here: https://bazel.build/rules/lib/ProtoInfo
BUILD:
load(":defs.bzl", "my_rule")
proto_library(
name = "proto_a",
srcs = ["proto_a.proto"],
deps = [":proto_b"],
)
proto_library(
name = "proto_b",
srcs = ["proto_b.proto"],
deps = [":proto_c"],
)
proto_library(
name = "proto_c",
srcs = ["proto_c.proto"],
)
my_rule(
name = "foo",
srcs = [":proto_a"],
)
sh_binary(
name = "tool",
srcs = ["tool.sh"],
)
proto_a.proto:
package my_protos_a;
message ProtoA {
  optional int32 a = 1;
}
proto_b.proto:
package my_protos_b;
message ProtoB {
  optional int32 b = 1;
}
proto_c.proto:
package my_protos_c;
message ProtoC {
  optional int32 c = 1;
}
tool.sh:
output=$1
shift
echo input protos: $# > $output
$ bazel build foo
INFO: Analyzed target //:foo (40 packages loaded, 172 targets configured).
INFO: Found 1 target...
Target //:foo up-to-date:
bazel-bin/foo.txt
INFO: Elapsed time: 0.832s, Critical Path: 0.02s
INFO: 5 processes: 4 internal, 1 linux-sandbox.
INFO: Build completed successfully, 5 total actions
$ cat bazel-bin/foo.txt
input protos: proto_a.proto proto_b.proto proto_c.proto
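Applied to the question above, a rough sketch of a rule that hands the transitive .proto files to the buf script could look like the following. The attribute layout and the --output flag are assumptions; only the //buf labels and the buf-image.json output come from the question, so treat this as a starting point rather than a drop-in replacement for the genrule:
def _buf_image_impl(ctx):
    out = ctx.actions.declare_file("buf-image.json")

    # Gather the .proto files of every proto_library dep, including
    # transitive ones, via ProtoInfo.
    protos = depset(transitive = [d[ProtoInfo].transitive_sources for d in ctx.attr.deps])

    args = ctx.actions.args()
    args.add("--output", out)  # hypothetical flag of the buf script
    args.add_all(protos)

    ctx.actions.run(
        inputs = depset(ctx.files.srcs, transitive = [protos]),
        outputs = [out],
        executable = ctx.executable._tool,
        arguments = [args],
    )
    return [DefaultInfo(files = depset([out]))]

buf_image = rule(
    implementation = _buf_image_impl,
    attrs = {
        # buf.yaml, the breaking-image JSON, etc.
        "srcs": attr.label_list(allow_files = True),
        "deps": attr.label_list(providers = [ProtoInfo]),
        "_tool": attr.label(
            default = "//buf:generate-buf-image-sh",
            executable = True,
            cfg = "exec",
        ),
    },
)
With something like this, proto_default_library goes in deps and the genrule can either consume the buf_image target or be dropped entirely.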

Related

Using data dependency for bazel ctx.action.run_shell custom rule

I am looking at emit_rule example in bazel source tree:
https://github.com/bazelbuild/examples/blob/5a8696429e36090a75eb6fee4ef4e91a3413ef13/rules/shell_command/rules.bzl
I want to add a data dependency to the custom rule. My understanding of the dependency attributes documentation is that a data attribute of type label_list should be used, but it does not appear to work.
# This example copied from docs
def _emit_size_impl(ctx):
    in_file = ctx.file.file
    out_file = ctx.actions.declare_file("%s.pylint" % ctx.attr.name)
    ctx.actions.run_shell(
        inputs = [in_file],
        outputs = [out_file],
        command = "wc -c '%s' > '%s'" % (in_file.path, out_file.path),
    )
    return [DefaultInfo(files = depset([out_file]))]

emit_size = rule(
    implementation = _emit_size_impl,
    attrs = {
        "file": attr.label(mandatory = True, allow_single_file = True),
        "data": attr.label_list(allow_files = True),
        # ^^^^^^^ Above does not appear to be sufficient to copy data dependency into sandbox
    },
)
With this rule emit_size(name = "my_name", file = "my_file", data = ["my_data"]) I want to see my_data copied to bazel-out/ before running the command. How do I go about doing this?
The data files should be added as inputs to the actions that need those files, e.g. something like this:
def _emit_size_impl(ctx):
    in_file = ctx.file.file
    out_file = ctx.actions.declare_file("%s.pylint" % ctx.attr.name)
    ctx.actions.run_shell(
        inputs = [in_file] + ctx.files.data,
        outputs = [out_file],
        # For production rules, probably should use ctx.actions.run() and
        # ctx.actions.args():
        # https://bazel.build/rules/lib/Args
        command = "echo data is: ; %s ; wc -c '%s' > '%s'" % (
            "cat " + " ".join([d.path for d in ctx.files.data]),
            in_file.path,
            out_file.path,
        ),
    )
    return [DefaultInfo(files = depset([out_file]))]

emit_size = rule(
    implementation = _emit_size_impl,
    attrs = {
        "file": attr.label(mandatory = True, allow_single_file = True),
        "data": attr.label_list(allow_files = True),
    },
)
BUILD:
load(":defs.bzl", "emit_size")
emit_size(
name = "size",
file = "file.txt",
data = ["data1.txt", "data2.txt"],
)
$ bazel build size
INFO: Analyzed target //:size (4 packages loaded, 9 targets configured).
INFO: Found 1 target...
INFO: From Action size.pylint:
data is:
this is data
this is other data
Target //:size up-to-date:
bazel-bin/size.pylint
INFO: Elapsed time: 0.323s, Critical Path: 0.02s
INFO: 2 processes: 1 internal, 1 linux-sandbox.
INFO: Build completed successfully, 2 total actions
$ cat bazel-bin/size.pylint
22 file.txt
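As the comment in the action above suggests, a production rule would usually invoke a dedicated tool with ctx.actions.run() and ctx.actions.args() instead of assembling a shell string. A rough sketch of that variant; the //:emit_size_tool sh_binary and its argument order are assumptions, not part of the original answer:
def _emit_size_impl(ctx):
    in_file = ctx.file.file
    out_file = ctx.actions.declare_file("%s.pylint" % ctx.attr.name)

    args = ctx.actions.args()
    args.add(out_file)            # $1: output file
    args.add(in_file)             # $2: file to measure
    args.add_all(ctx.files.data)  # remaining arguments: data files

    ctx.actions.run(
        inputs = [in_file] + ctx.files.data,
        outputs = [out_file],
        executable = ctx.executable._tool,
        arguments = [args],
    )
    return [DefaultInfo(files = depset([out_file]))]

emit_size = rule(
    implementation = _emit_size_impl,
    attrs = {
        "file": attr.label(mandatory = True, allow_single_file = True),
        "data": attr.label_list(allow_files = True),
        # Hypothetical sh_binary that reads: <out> <in_file> <data...>
        "_tool": attr.label(default = "//:emit_size_tool", executable = True, cfg = "exec"),
    },
)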

How do I get the files in the build directory in another bazel rule

I use a Python tool to generate .cpp/.hpp code, similar to the protobuf compiler, but I don't know in advance how many files will be generated, so it's not quite the same as the protobuf tool.
In one rule:
def __generate_core_ifce_impl(ctx):
    ...
    output_file = ctx.actions.declare_directory(out)
    cmd = """
        mkdir -p {path};
    """.format(path = output_file.path)
    cmd += """
        {tools} -i {src} -o {output_dir}
    """.format(tools = tools, src = ctx.files.srcs, output_dir = output_file.path)
    ctx.actions.run_shell(
        command = cmd,
        inputs = ctx.files.srcs,
        outputs = [output_file],
    )
    return [DefaultInfo(files = depset([output_file]))]

_generate_core_ifce = rule(
    implementation = __generate_core_ifce_impl,
    attrs = {
        "srcs": attr.label_list(mandatory = False, allow_files = True),
        "tools": attr.label_list(mandatory = True, allow_files = True),
        "out": attr.string(mandatory = True),
    },
)
Some *.cpp and *.hpp files will be generated in the output_file directory, but I can't know their names in advance.
Then, in another rule, a cc_library should use the *.cpp and *.hpp files from the output_file directory.
The question is: how do I write that rule?
I can't enumerate the files in the output_file directory, so I can't write the cc_library.
You should be able to use the name of the target, and the cc_library will use the files that are given in the DefaultInfo, e.g.:
_generate_core_ifce(
    name = "my_generate_core_ifce_target",
    ...
)

cc_library(
    name = "my_cc_library_target",
    srcs = [":my_generate_core_ifce_target"],
    ...
)
edit: adding an example:
BUILD:
load(":defs.bzl", "my_rule")
my_rule(
name = "my_target",
)
cc_binary(
name = "cc",
srcs = [":my_target"],
)
defs.bzl:
def _impl(ctx):
    output_dir = ctx.actions.declare_directory("my_outputs")

    command = """
mkdir -p {output_dir}

cat > {output_dir}/main.c <<EOF
#include "stdio.h"
#include "mylib.h"

int main() {
  printf("hello world %d\\n", get_num());
  return 0;
}
EOF

cat > {output_dir}/mylib.c <<EOF
int get_num() {
  return 42;
}
EOF

cat > {output_dir}/mylib.h <<EOF
int get_num();
EOF
""".replace("{output_dir}", output_dir.path)

    ctx.actions.run_shell(
        command = command,
        outputs = [output_dir],
    )

    return [DefaultInfo(files = depset([output_dir]))]

my_rule = rule(
    implementation = _impl,
)
usage:
$ bazel run cc
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:cc (15 packages loaded, 57 targets configured).
INFO: Found 1 target...
Target //:cc up-to-date:
bazel-bin/cc
INFO: Elapsed time: 3.626s, Critical Path: 0.06s
INFO: 8 processes: 4 internal, 4 linux-sandbox.
INFO: Build completed successfully, 8 total actions
INFO: Build completed successfully, 8 total actions
hello world 42
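Tying this back to the rule in the question: the generator itself can be invoked with ctx.actions.run(), writing into the declared directory, and the cc_library then lists the rule's target in srcs exactly as in the snippet above. A sketch under the assumption that the tool is a single executable target taking -i and -o as in the question's command string:
def _generate_core_ifce_impl(ctx):
    output_dir = ctx.actions.declare_directory(ctx.attr.out)

    args = ctx.actions.args()
    args.add_all("-i", ctx.files.srcs)
    args.add("-o", output_dir.path)

    ctx.actions.run(
        inputs = ctx.files.srcs,
        outputs = [output_dir],
        executable = ctx.executable.tool,
        arguments = [args],
    )
    return [DefaultInfo(files = depset([output_dir]))]

_generate_core_ifce = rule(
    implementation = _generate_core_ifce_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        "out": attr.string(mandatory = True),
        # A single executable label (e.g. a py_binary) rather than the
        # question's label_list, so ctx.executable.tool can be used.
        "tool": attr.label(mandatory = True, executable = True, cfg = "exec"),
    },
)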

Accessing runfiles during build in genrule

I have a cc_binary target that uses runfiles. I would like to zip the executable and all of the runfiles into a single archive using a genrule.
Something like:
genrule(
    name = "zip_binary",
    srcs = [
        ":binary",
    ],
    outs = [
        "binary.zip",
    ],
    cmd = "zip -r $(OUTS) $(locations :binary)",
)
However, this only includes the binary and not the binary.runfiles dir.
How can I get bazel to include the runfiles in the srcs?
Genrules don't have access to enough information to do that. With a full custom rule it's pretty easy though. Something like this:
def _zipper_impl(ctx):
    inputs = ctx.runfiles(files = ctx.files.srcs)
    inputs = inputs.merge_all([a[DefaultInfo].default_runfiles for a in ctx.attr.srcs])
    ctx.actions.run_shell(
        outputs = [ctx.outputs.out],
        inputs = inputs.files,
        command = " ".join(["zip", "-r", ctx.outputs.out.path] +
                           [i.path for i in inputs.files.to_list()]),
    )
    return [DefaultInfo(files = depset([ctx.outputs.out]))]

zipper = rule(
    implementation = _zipper_impl,
    attrs = {
        "out": attr.output(mandatory = True),
        "srcs": attr.label_list(allow_files = True),
    },
)
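A possible BUILD usage, with the target names taken from the question (main.cc and runfile.txt are placeholders):
load(":defs.bzl", "zipper")

cc_binary(
    name = "binary",
    srcs = ["main.cc"],
    data = ["runfile.txt"],  # anything here ends up in binary.runfiles
)

zipper(
    name = "zip_binary",
    srcs = [":binary"],
    out = "binary.zip",
)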

Unable to find package for @bazel_skylib//:bzl_library.bzl

Here's my WORKSPACE:
load("#bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
RULES_JVM_EXTERNAL_TAG = "4.0"
RULES_JVM_EXTERNAL_SHA = "31701ad93dbfe544d597dbe62c9a1fdd76d81d8a9150c2bf1ecf928ecdf97169"
http_archive(
name = "maven",
strip_prefix = "rules_jvm_external-%s" % RULES_JVM_EXTERNAL_TAG,
sha256 = RULES_JVM_EXTERNAL_SHA,
url = "https://github.com/bazelbuild/rules_jvm_external/archive/%s.zip" % RULES_JVM_EXTERNAL_TAG,
)
load("#maven//:defs.bzl", "maven_install")
maven_install(
artifacts = [
"com.fasterxml.jackson.core:jackson-databind:2.12.1",
"org.apache.commons:commons-lang3:3.11"
],
repositories = [
"https://repo1.maven.org/maven2",
"https://jcenter.bintray.com/"
],
);
Here's my Second/BUILD:
java_binary(
    name = "main",
    srcs = glob(["src/main/java/**/*.java"]),
    deps = [
        "//First:first",
    ],
    main_class = "com.test.MyMain",
)
Here's my First/BUILD:
java_library(
    name = "first",
    srcs = glob(["src/main/java/**/*.java"]),
    deps = [
        "@maven//:com_fasterxml_jackson_core_jackson_databind",
    ],
    visibility = ["//Second:__pkg__"],
)
When doing
bazel build //Second:main
I get
ERROR: /Users/foobar/Documents/Main/First/BUILD:1:13: error loading package '@maven//': Unable to find package for @bazel_skylib//:bzl_library.bzl: The repository '@bazel_skylib' could not be resolved. and referenced by '//First:first'
ERROR: Analysis of target '//Second:main' failed; build aborted: Analysis failed
INFO: Elapsed time: 0.078s
INFO: 0 processes.
You need to add Bazel Skylib to your workspace. Follow the "workspace setup" instructions here: https://github.com/bazelbuild/bazel-skylib/releases
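In practice that means adding an http_archive for bazel_skylib to the WORKSPACE. A sketch, where VERSION and the sha256 are placeholders to copy from the release you choose:
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazel_skylib",
    # Replace VERSION and the sha256 with the values from the release page.
    urls = [
        "https://github.com/bazelbuild/bazel-skylib/releases/download/VERSION/bazel-skylib-VERSION.tar.gz",
        "https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/VERSION/bazel-skylib-VERSION.tar.gz",
    ],
    sha256 = "<sha256 from the release page>",
)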

How to use genrule output as string to expand_template substitutions in Bazel?

It seems that a genrule can only output a Target, and the expand_template substitutions accept only a string_dict. How can I use the genrule output as a substitution value for expand_template?
gen.bzl
def _expand_impl(ctx):
    ctx.actions.expand_template(
        template = ctx.file._template,
        output = ctx.outputs.source_file,
        substitutions = {
            "{version}": ctx.attr.version,
        },
    )

expand = rule(
    implementation = _expand_impl,
    attrs = {
        "version": attr.string(mandatory = True),
        "_template": attr.label(
            default = Label("//version:local.go.in"),
            allow_single_file = True,
        ),
    },
    outputs = {"source_file": "local.go"},
)
BUILD
load("#io_bazel_rules_go//go:def.bzl", "go_library")
filegroup(
name = "templates",
srcs = ["local.go.in"],
)
genrule(
name = "inject",
outs = ["VERSION"],
local = 1,
cmd = "git rev-parse HEAD",
)
load(":gen.bzl", "expand")
expand(
name = "expand",
version = ":inject",
)
go_library(
name = "go_default_library",
srcs = [
"default.go",
":expand", # Keep
],
importpath = "go.megvii-inc.com/brain/data/version",
visibility = ["//visibility:public"],
)
and the local.go.in
package version

func init() {
	V = "{version}"
}
I expect the {version} in local.go.in to be replaced by the output of git rev-parse HEAD.
The problem here is that the substitutions argument of ctx.actions.expand_template() must be known during the analysis phase (i.e., when _expand_impl is run), which happens before the git rev-parse HEAD command of the genrule would be run (i.e., during the execution phase).
There are a few ways to do this. The simplest is to do everything in the genrule:
genrule(
    name = "gen_local_go",
    srcs = ["local.go.in"],
    outs = ["local.go"],
    local = 1,
    cmd = 'sed "s/{version}/$$(git rev-parse HEAD)/" "$<" > "$@"',
)
That relies on sed being available on the host machine, but any other sort of program that can input one file, modify the text, and output it to another file will work.
Another option is to use --workspace_status_command in combination with a rule that reads the resulting status files (ctx.info_file / ctx.version_file in Starlark).
There are more details here:
How to run a shell command at analysis time in bazel?
The advantage of this approach is that it avoids local genrules.
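A rough sketch of that approach, assuming a workspace status script that emits a STABLE_GIT_COMMIT key (the rule, target, and file names below are illustrative, not part of the original answer):
# Invoked as: bazel build --workspace_status_command=tools/status.sh //version:local_go
# where tools/status.sh contains:
#   echo "STABLE_GIT_COMMIT $(git rev-parse HEAD)"

def _stamp_version_impl(ctx):
    out = ctx.actions.declare_file("local.go")
    ctx.actions.run_shell(
        # ctx.info_file is stable-status.txt, written by the status command.
        inputs = [ctx.file.template, ctx.info_file],
        outputs = [out],
        command = (
            "commit=$(grep '^STABLE_GIT_COMMIT ' {status} | cut -d' ' -f2); " +
            'sed "s/{{version}}/$commit/" {template} > {out}'
        ).format(
            status = ctx.info_file.path,
            template = ctx.file.template.path,
            out = out.path,
        ),
    )
    return [DefaultInfo(files = depset([out]))]

stamp_version = rule(
    implementation = _stamp_version_impl,
    attrs = {
        "template": attr.label(
            allow_single_file = True,
            default = "//version:local.go.in",
        ),
    },
)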
