Accessing runfiles during build in genrule - bazel

I have a cc_binary target that uses runfiles. I would like to zip the executable and all of the runfiles into a single archive using a genrule.
Something like:
genrule(
    name = "zip_binary",
    srcs = [
        ":binary",
    ],
    outs = [
        "binary.zip",
    ],
    cmd = "zip -r $(OUTS) $(locations :binary)",
)
However, this only includes the binary and not the binary.runfiles dir.
How can I get bazel to include the runfiles in the srcs?

Genrules don't have access to enough information to do that. With a full custom rule it's pretty easy though. Something like this:
def _zipper_impl(ctx):
    # Collect the srcs themselves plus every runfile they need at run time.
    inputs = ctx.runfiles(files = ctx.files.srcs)
    inputs = inputs.merge_all([a[DefaultInfo].default_runfiles for a in ctx.attr.srcs])
    ctx.actions.run_shell(
        outputs = [ctx.outputs.out],
        inputs = inputs.files,
        command = " ".join(["zip", "-r", ctx.outputs.out.path] +
                           [i.path for i in inputs.files.to_list()]),
    )
    return [DefaultInfo(files = depset([ctx.outputs.out]))]
zipper = rule(
    implementation = _zipper_impl,
    attrs = {
        "out": attr.output(mandatory = True),
        "srcs": attr.label_list(allow_files = True),
    },
)
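A minimal usage sketch for the rule above, assuming it is saved as zipper.bzl next to the BUILD file (the file name is an assumption, not part of the answer):

```starlark
load(":zipper.bzl", "zipper")

# Packs :binary together with its runfiles into binary.zip.
zipper(
    name = "zip_binary",
    out = "binary.zip",
    srcs = [":binary"],
)
```

Because the command passes execution paths to zip, entries in the archive are stored under those paths (for example bazel-out/...) rather than in a binary.runfiles layout.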

Related

Chain expand_template and run in one bazel rule

I am trying to write a custom rule that first generates a file from a template and then passes this file to a script, which generates the C++ headers that are the output of my rule.
def _msg_library_impl(ctx):
    # For each target in deps, print its label and files.
    for source in enumerate(ctx.attr.srcs):
        print("File = " + str(source))
    out_header = ctx.actions.declare_file("some_header.hpp")
    out_arguments = ctx.actions.declare_file("arguments.json")
    ctx.actions.expand_template(
        template = ctx.file._arguments_file,
        output = out_arguments,
        substitutions = {
            "{output_dir}": out_header.dirname,
            "{idl_tuples}": out_header.path,
        },
    )
    args = ctx.actions.args()
    args.add("--arguments-file")
    args.add(out_arguments)
    ctx.actions.run(
        outputs = [out_header],
        progress_message = "Generating headers '{}'".format(out_header.short_path),
        executable = ctx.executable._generator,
        arguments = [args],
    )
    return [
        CcInfo(compilation_context = cc_common.create_compilation_context(
            includes = depset([out_header.dirname]),
            headers = depset([out_header]),
        )),
    ]
msg_library = rule(
    implementation = _msg_library_impl,
    output_to_genfiles = True,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        "outs": attr.output_list(),
        "_arguments_file": attr.label(
            allow_single_file = [".json"],
            default = Label("//examples/generation_rule:arguments_template.json"),
        ),
        "_generator": attr.label(
            default = Label("//examples/generation_rule:generator"),
            executable = True,
            cfg = "exec",
        ),
    },
)
Here, generator is a Python library that, given an input file provided to srcs and an arguments file, generates headers.
The issue that I am facing is that it seems that the expand_template doesn't actually run before run is called, so the generated file is nowhere to be found. What am I doing wrong here? Did I misunderstand how things work?
You need to indicate the file is an input to the action, in addition to passing its path in the arguments. Change the ctx.actions.run to:
ctx.actions.run(
    outputs = [out_header],
    inputs = [out_arguments],
    progress_message = "Generating headers '{}'".format(out_header.short_path),
    executable = ctx.executable._generator,
    arguments = [args],
)
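The same rule applies to anything else the generator reads: Bazel only schedules the action that produces a file, and only makes that file visible to the tool, when it is declared as an input. A sketch, assuming the generator also opens the files listed in srcs (which the question suggests but does not show):

```starlark
# Inside _msg_library_impl, in place of the original ctx.actions.run call.
ctx.actions.run(
    outputs = [out_header],
    # Declare every file the tool reads: the expanded template and,
    # by assumption, the source files passed to the rule.
    inputs = [out_arguments] + ctx.files.srcs,
    progress_message = "Generating headers '{}'".format(out_header.short_path),
    executable = ctx.executable._generator,
    arguments = [args],
)
```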

How to use genrule output as string to expand_template substitutions in Bazel?

It seems that a genrule can only produce a target, while the substitutions argument of expand_template accepts only a dict of strings. How can I use the genrule output in expand_template?
gen.bzl
def _expand_impl(ctx):
    ctx.actions.expand_template(
        template = ctx.file._template,
        output = ctx.outputs.source_file,
        substitutions = {
            "{version}": ctx.attr.version,
        },
    )
expand = rule(
    implementation = _expand_impl,
    attrs = {
        "version": attr.string(mandatory = True),
        "_template": attr.label(
            default = Label("//version:local.go.in"),
            allow_single_file = True,
        ),
    },
    outputs = {"source_file": "local.go"},
)
BUILD
load("@io_bazel_rules_go//go:def.bzl", "go_library")
filegroup(
    name = "templates",
    srcs = ["local.go.in"],
)
genrule(
    name = "inject",
    outs = ["VERSION"],
    local = 1,
    cmd = "git rev-parse HEAD",
)
load(":gen.bzl", "expand")
expand(
    name = "expand",
    version = ":inject",
)
go_library(
    name = "go_default_library",
    srcs = [
        "default.go",
        ":expand",  # Keep
    ],
    importpath = "go.megvii-inc.com/brain/data/version",
    visibility = ["//visibility:public"],
)
and the local.go.in
package version
func init() {
    V = "{version}"
}
I expect the {version} in local.go.in to be replaced by the output of git rev-parse HEAD.
The problem here is that the substitutions argument of ctx.actions.expand_template() must be known during the analysis phase (i.e., when _expand_impl is run), which happens before the git rev-parse HEAD command of the genrule would be run (i.e., during the execution phase).
There are a few ways to do this. The simplest is to do everything in the genrule:
genrule(
    name = "gen_local_go",
    srcs = ["local.go.in"],
    outs = ["local.go"],
    local = 1,
    cmd = 'sed "s/{version}/$$(git rev-parse HEAD)/" "$<" > "$@"',
)
That relies on sed being available on the host machine, but any other sort of program that can input one file, modify the text, and output it to another file will work.
Another option is to combine --workspace_status_command with a rule that reads the resulting status files.
There are more details here:
How to run a shell command at analysis time in bazel?
The advantage of this approach is that it avoids local genrules.
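A rough sketch of that second approach, under stated assumptions: the build is invoked with --workspace_status_command pointing at a script (here the hypothetical tools/status.sh) that prints a line of the form `STABLE_GIT_COMMIT <hash>`, and the key name STABLE_GIT_COMMIT is likewise made up for illustration.

```starlark
# BUILD, assuming the build is invoked as:
#   bazel build --workspace_status_command=tools/status.sh //version:gen_local_go
# where tools/status.sh (hypothetical) prints:
#   STABLE_GIT_COMMIT <output of git rev-parse HEAD>
genrule(
    name = "gen_local_go",
    srcs = ["local.go.in"],
    outs = ["local.go"],
    # stamp = 1 gives the command access to bazel-out/stable-status.txt.
    stamp = 1,
    cmd = "sed \"s/{version}/$$(grep '^STABLE_GIT_COMMIT ' bazel-out/stable-status.txt | cut -d' ' -f2)/\" $< > $@",
)
```

The git command now runs in the status script rather than in the genrule, so the genrule no longer needs local = 1.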

Idiomatic retrieval of the Bazel execution path

I'm working on my first custom Bazel rules. The rules allow the running of bats command line tests.
I've included the rule definition below verbatim. I'm pretty happy with it so far, but one part feels ugly and non-standard. If the rule user adds a binary dependency to the rule, I make sure that the binary appears on the PATH so that it can be tested. At the moment I do this by building a list of the binaries' directories and prefixing each with $PWD, which is expanded inside the script to the complete execution path. This feels hacky and error-prone.
Is there a more idiomatic way to do this? I don't believe I can access the execution path in the rule due to it not being created until the execution phase.
Thanks for your help!
BATS_REPOSITORY_BUILD_FILE = """
package(default_visibility = [ "//visibility:public" ])
sh_binary(
    name = "bats",
    srcs = ["libexec/bats"],
    data = [
        "libexec/bats-exec-suite",
        "libexec/bats-exec-test",
        "libexec/bats-format-tap-stream",
        "libexec/bats-preprocess",
    ],
)
"""
def bats_repositories(version="v0.4.0"):
    native.new_git_repository(
        name = "bats",
        remote = "https://github.com/sstephenson/bats",
        tag = version,
        build_file_content = BATS_REPOSITORY_BUILD_FILE
    )
BASH_TEMPLATE = """
#!/usr/bin/env bash
set -e
export TMPDIR="$TEST_TMPDIR"
export PATH="{bats_bins_path}":$PATH
"{bats}" "{test_paths}"
"""
def _dirname(path):
    prefix, _, _ = path.rpartition("/")
    return prefix.rstrip("/")
def _bats_test_impl(ctx):
    runfiles = ctx.runfiles(
        files = ctx.files.srcs,
        collect_data = True,
    )
    tests = [f.short_path for f in ctx.files.srcs]
    path = ["$PWD/" + _dirname(b.short_path) for b in ctx.files.deps]
    sep = ctx.configuration.host_path_separator
    ctx.file_action(
        output = ctx.outputs.executable,
        executable = True,
        content = BASH_TEMPLATE.format(
            bats = ctx.executable._bats.short_path,
            test_paths = " ".join(tests),
            bats_bins_path = sep.join(path),
        ),
    )
    runfiles = runfiles.merge(ctx.attr._bats.default_runfiles)
    return DefaultInfo(
        runfiles = runfiles,
    )
bats_test = rule(
    attrs = {
        "srcs": attr.label_list(
            allow_files = True,
        ),
        "deps": attr.label_list(),
        "_bats": attr.label(
            default = Label("@bats//:bats"),
            executable = True,
            cfg = "host",
        ),
    },
    test = True,
    implementation = _bats_test_impl,
)
This should be easy to support from Bazel 0.8.0 which will be released in ~2 weeks.
In your Skylark implementation, call ctx.expand_location(binary), where binary is a string such as $(execpath :some-label). You can simply wrap the label you got from the user in $(execpath ...), and Bazel will substitute the execution location of that label.
Some relevant resources:
$location expansion in Bazel
https://github.com/bazelbuild/bazel/issues/2475
https://github.com/bazelbuild/bazel/commit/cff0dc94f6a8e16492adf54c88d0b26abe903d4c
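A rough sketch of how the suggestion above might look inside the rule, assuming each entry in deps provides exactly one file (which $(execpath) requires); the helper name _bin_dirs is hypothetical:

```starlark
def _bin_dirs(ctx):
    """Returns the directory of each binary in deps, for use on PATH."""
    dirs = []
    for dep in ctx.attr.deps:
        # Let Bazel resolve the label to a path instead of assembling it by hand.
        path = ctx.expand_location(
            "$(execpath {})".format(dep.label),
            ctx.attr.deps,
        )
        dirs.append(path.rpartition("/")[0])
    return dirs
```

If the path is needed relative to the runfiles tree rather than the execution root, $(rootpath ...) is the variant to try in Bazel versions that support it.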

Bazel rules that use different subsets of genrule outputs

I have a code generator that produces three output files:
client.cpp
server.cpp
data.h
The genrule looks like this:
genrule(
    name = 'code_gen',
    tools = [ '//tools:code_gen.sh' ],
    outs = [ 'client.cpp', 'server.cpp', 'data.h' ],
    local = True,
    cmd = '$(location //tools:code_gen.sh) $(@D)',
)
The 'client.cpp' and 'server.cpp' each have their own cc_library rule.
My question is how to depend on the genrule but only use a specific output file.
What I did was create a macro that defines the genrule with outs set to just the required file, but this results in the genrule being executed multiple times:
gen.bzl:
def code_generator(name, out):
    native.genrule(
        name = name,
        tools = [ '//bazel:gen.sh' ],
        outs = [ out ],
        local = True,
        cmd = '$(location //bazel:gen.sh) $(@D)',
    )
BUILD
load(':gen.bzl', 'code_generator')
code_generator('client_cpp', 'client.cpp')
code_generator('server_cpp', 'server.cpp')
code_generator('data_h', 'data.h')
cc_library(
    name = 'client',
    srcs = [ ':client_cpp' ],
    hdrs = [ ':data_h' ],
)
cc_library(
    name = 'server',
    srcs = [ ':server_cpp' ],
    hdrs = [ ':data_h' ],
)
Is there a way to depend on a genrule making it run once and then use only selected outputs from it?
You should be able to just use the filename (e.g. :server.cpp) to depend on a specific output of a rule.
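Applied to the original single code_gen genrule from the question, that would look roughly like this:

```starlark
# code_gen runs once; each library picks the outputs it needs by file label.
cc_library(
    name = "client",
    srcs = [":client.cpp"],
    hdrs = [":data.h"],
)

cc_library(
    name = "server",
    srcs = [":server.cpp"],
    hdrs = [":data.h"],
)
```

Each declared output of a genrule is itself a file target in the package, which is why the macro with one genrule per file is not needed.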

Dispatching C++ generated files into srcs and hdrs

The official Bazel documentation has an example explaining how to create a Java library built from regular Java files and files generated by a :gen_java_srcs rule. I reproduce the code here for ease of reading:
java_library(
    name = "mylib",
    srcs = glob(["*.java"]) + [":gen_java_srcs"],
    deps = "...",
)
genrule(
    name = "gen_java_srcs",
    outs = [
        "Foo.java",
        "Bar.java",
    ],
    ...
)
Now, from a C++ perspective, I have a genrule that generates two kinds of files, .hpp and .cpp:
genrule(
    name = "gen_cpp_srcs",
    outs = [
        "myFile_1.hpp","myFile_2.hpp",...,"myFile_N.hpp",
        "myFile_1.cpp","myFile_2.cpp",...,"myFile_N.cpp",
    ],
    ...
)
where N is a few dozen.
My question is: how can I write the cc_library rule so that the .hpp and .cpp files are dispatched automatically into the hdrs and srcs fields?
I want something like:
cc_library(
    name = "mylib",
    srcs = glob(["*.cpp"]) + (howto: .cpp files of [":gen_cpp_srcs"]),
    hdrs = glob(["*.hpp"]) + (howto: .hpp files of [":gen_cpp_srcs"]),
    ...
)
Some magic like:
output_filter(":gen_cpp_srcs","*.cpp")
would be perfect, but I do not know enough of Bazel to make it real.
Globs only get expanded when they're passed into rules, so you'll need to write a simple rule. I would package it like this (in a file named filter.bzl):
# The actual rule which does the filtering.
def _do_filter_impl(ctx):
    return [DefaultInfo(
        files = depset([f for f in ctx.files.srcs if f.path.endswith(ctx.attr.suffix)]),
    )]
_do_filter = rule(
    implementation = _do_filter_impl,
    attrs = {
        "srcs": attr.label_list(
            mandatory = True,
            allow_files = True,
        ),
        "suffix": attr.string(
            mandatory = True,
        ),
    },
)
# A convenient macro to wrap the custom rule and cc_library.
def filtered_cc_library(name, srcs, hdrs, **kwargs):
    _do_filter(
        name = "%s_hdrs" % name,
        visibility = ["//visibility:private"],
        srcs = hdrs,
        suffix = ".hpp",
    )
    _do_filter(
        name = "%s_srcs" % name,
        visibility = ["//visibility:private"],
        srcs = srcs,
        suffix = ".cpp",
    )
    native.cc_library(
        name = name,
        srcs = [ ":%s_srcs" % name ],
        hdrs = [ ":%s_hdrs" % name ],
        **kwargs
    )
This is what my demo BUILD file looks like (I changed the globs so they both include *.cpp and *.hpp files; using the label of a genrule will work the same way):
load("//:filter.bzl", "filtered_cc_library")
filtered_cc_library(
    name = "mylib",
    srcs = glob(["*.*pp"]),
    hdrs = glob(["*.*pp"]),
)
This is easy to extend to more sophisticated filtering by changing _do_filter_impl. In particular, changing suffix to an attr.string_list so you can accept multiple C/C++ source/header extensions seems like a good idea.
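One way that extension could look, as a sketch rather than a tested drop-in; the attribute name suffixes is an assumption introduced here:

```starlark
def _do_filter_impl(ctx):
    # Keep a file if its path ends with any of the requested suffixes.
    kept = [
        f
        for f in ctx.files.srcs
        if any([f.path.endswith(s) for s in ctx.attr.suffixes])
    ]
    return [DefaultInfo(files = depset(kept))]

_do_filter = rule(
    implementation = _do_filter_impl,
    attrs = {
        "srcs": attr.label_list(mandatory = True, allow_files = True),
        "suffixes": attr.string_list(mandatory = True),
    },
)
```

The macro would then pass lists such as suffixes = [".hpp", ".h"] for headers and suffixes = [".cpp", ".cc"] for sources.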
Depending on the genrule by name (:gen_cpp_srcs) will give you all of the outputs of the genrule, as you have noted. Instead, you can depend on the individual outputs of the genrule (e.g. hdrs = [":myFile.hpp"] and srcs = [":myFile.cpp"]).
See also the answer to Bazel & automatically generated cpp / hpp files.
It looks like you know the full list of files that will be generated. You can put them in their own variables and reuse them in both targets, something like this in your BUILD file:
output_cpp_files = [
    "myFile_1.cpp",
    "myFile_2.cpp",
    "myFile_3.cpp"
]
output_hpp_files = [
    "myFile_1.hpp",
    "myFile_2.hpp",
    "myFile_3.hpp"
]
genrule(
    name = "gen_cpp_srcs",
    outs = output_cpp_files + output_hpp_files,
    cmd = """
    touch $(OUTS)
    """
)
cc_library(
    name = "mylib",
    srcs = output_cpp_files,
    hdrs = output_hpp_files
)

Resources