Bazel: how to get access to srcs of a filegroup?

I have some html files in "root/html_files/*.html" directory. I want to iterate on these html files and run some bazel rules on them, from "root/tests/" directory.
So I made a filegroup of those html files and tried to access them from the "root/tests/" directory, but that isn't working.
Is this possible?
My BUILD file in "root/" directory:
HTMLS_LIST = glob(["html_files/*.html"])
filegroup(
    name = "html_files",
    srcs = HTMLS_LIST,
    visibility = ["//visibility:public"],
)
My BUILD file in "root/tests/" directory:
load("//tests:automation_test.bzl", "make_and_run_html_tests")
make_and_run_html_tests(
    name = 'test_all_htmls',
    srcs = ['test/automation_test.py'],
    html_files = '//:html_files',
)
My bzl file in "root/tests/" directory:
def make_and_run_html_tests(name, html_files, srcs):
    tests = []
    for html_file in html_files:  # I want to iterate on sources of html filegroup here
        folders = html_file.split("/")
        filename = folders[-1].split(".")[0]
        test_rule_name = 'test_' + filename + '_file'
        native.py_test(
            name = test_rule_name,
            srcs = srcs,
            main = srcs[0],
            data = [
                html_file,
            ],
            args = [html_file],
        )
        testname = ":" + test_rule_name
        tests.append(testname)
    native.test_suite(
        name = name,
        tests = tests,
    )
And my python unittest file in "root/tests/" directory:
import sys
import codecs
import unittest

class TestHtmlDocuments(unittest.TestCase):
    def setUp(self):
        self.html_file_path = sys.argv[1]

    def test_html_file(self):
        fid = codecs.open(self.html_file_path, 'r')
        print(fid.read())
        self.assertTrue(fid)

if __name__ == '__main__':
    unittest.main(argv=[sys.argv[1]])

You can't access the files inside another filegroup or rule from within a macro like that, because the macro only receives the label string, not the files behind it. You'd need to write a rule and access them through ctx.files.
However, you can iterate over them if you were to remove the filegroup, and pass the glob directly to the macro:
HTMLS_LIST = glob(["html_files/*.html"])
make_and_run_html_tests(
    name = 'test_all_htmls',
    srcs = ['test/automation_test.py'],
    html_files = HTMLS_LIST,
)
The glob is resolved to a list before the macro is expanded.

Related

Why can't `ctx.actions.run` refer to generated files in its `inputs` attribute even though source files can be referred to?

I'm creating a rules file that generates some scripts with ctx.actions.expand_template and runs these scripts with ctx.actions.run.
ctx.actions.run takes as inputs the script generated by ctx.actions.expand_template and a generated file (a filelist containing several file names and paths) produced by another rule that this rule depends on.
When the script is executed by ctx.actions.run, the generated filelist mentioned above is not found.
When I check the sandbox path where the actual build takes place, the filelist does not exist there.
What should I do?
This is a part of my rule file
def _my_rule_impl(ctx):
    ...
    my_script = ctx.actions.declare_file("my_script.sh")
    ctx.actions.expand_template(
        output = compile_script,
        template = ctx.file._my_template,
        substitutions = {
            "{TOP}": "{}".format(top_name),
            "{FLISTS}": " ".join(["-f {}".format(f.short_path) for f in flists_list]),
            ...
        },
    )
    compile_srcs = flists_list + srcs_list + [my_script]
    outputs = ctx.outputs.executable
    executable = compile_script.path
    ctx.actions.run(
        inputs = depset(compile_srcs),
        outputs = [outputs],
        executable = executable,
        env = {
            "HOME": "/home/grrrr",
        },
    )
    allfiles = depset(compile_srcs)
    runfiles = ctx.runfiles(files = compile_srcs)
    return [DefaultInfo(
        files = allfiles,
        runfiles = runfiles,
    )]

my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "deps": attr.label_list(
            mandatory = True,
        ),
        "_my_template": attr.label(
            allow_single_file = True,
            default = Label("@my_rules//my_test:my_script.sh.template"),
        ),
        ...
    },
    executable = True,
)
Checking with print shows that the script is executed from this location:
/home/grrrr/.cache/bazel/_bazel_grrrr/.../sandbox/processwrapper-sandbox/.../execroot/my_rules/
The script refers to its sources, including the filelist, under the following path. However, that directory contains only source files; the filelist is not there.
/home/grrrr/.cache/bazel/_bazel_grrrr/.../sandbox/processwrapper-sandbox/.../execroot/my_rules/my_test
The filelist does exist under this path, and I'm wondering why it is not in the directory above:
/home/grrrr/.cache/bazel/_bazel_grrrr/.../sandbox/processwrapper-sandbox/.../execroot/my_rules/bazel-out/k8-fastbuild/bin/my_test
This was resolved by using sandboxfs instead of the default sandbox.
Here is the useful page regarding sandboxfs.
https://bazel.build/docs/sandboxing#sandboxfs
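The two directories printed in the question differ only by the output-root prefix `bazel-out/k8-fastbuild/bin`. That is consistent with the template substituting `f.short_path`, which omits this prefix, whereas `f.path` is relative to the execution root and includes it for generated files. The sketch below, in plain Python rather than Starlark, only illustrates how the two paths relate for a file in the main repository. The function name `exec_path` and the file names in the usage note are hypothetical, and the prefix is copied from the question's listing (it varies with platform and compilation mode).

```python
import posixpath

# Prefix copied from the sandbox listing in the question; treat it as an
# example value only, since it depends on platform and compilation mode.
BIN_DIR = "bazel-out/k8-fastbuild/bin"


def exec_path(short_path, is_generated, bin_dir=BIN_DIR):
    """Return the execroot-relative path for a file in the main repository."""
    if is_generated:
        # Generated files live under the output root, not next to the sources.
        return posixpath.join(bin_dir, short_path)
    # Source files sit directly under the execution root.
    return short_path
```

For example, a hypothetical generated filelist `my_test/files.f` maps to `bazel-out/k8-fastbuild/bin/my_test/files.f`, while a hypothetical source file `my_test/top.v` keeps its path.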

How to create a py_library from a generated source tree

I am trying to consume some Python code + C extensions that are produced by CMake.
I'm using rules_foreign_cc to build the code, and it puts the code into an output directory, but I'm stumped as to how I can turn that into a py_library. I tried the below:
Depend on the cmake rule as srcs
Set imports to try to point to the directory containing the Python packages
But it doesn't work. When I run the py_test, the constructed PYTHONPATH doesn't point to the directory containing the python package from the cmake rule.
Thanks for any help!
Here's what I've got in my third_party/BUILD.bazel:
load("@rules_foreign_cc//foreign_cc:defs.bzl", "cmake")
load("@rules_python//python:defs.bzl", "py_library", "py_test")

filegroup(
    name = "torch_mlir_src",
    srcs = glob(["torch-mlir/**"]),
)

cmake(
    name = "torch_mlir",
    ...
    install = False,
    lib_source = ":torch_mlir_src",
    out_data_dirs = ["python_packages"],
    postfix_script = " && ".join([
        "cp -r --dereference tools/torch-mlir/python_packages/torch_mlir $$INSTALLDIR$$/python_packages",
    ]),
    targets = ["tools/torch-mlir/all"],
    working_directory = "external/llvm-project/llvm",
)

# This doesn't seem to work.
py_library(
    name = "torch_mlir_py",
    srcs = [":torch_mlir"],
    imports = ["$(BINDIR)/third_party/torch_mlir/python_packages"],
    srcs_version = "PY3",
)

py_test(
    name = "torch_mlir_annotations_sugar_test",
    srcs = ["torch-mlir/python/test/annotations-sugar.py"],
    main = "torch-mlir/python/test/annotations-sugar.py",
    srcs_version = "PY3",
    deps = [":torch_mlir_py"],
)

How do I get my custom header template rule to pass its output to a downstream cc_binary/cc_library dependency?

I'm trying to build a rule for bazel which emulates the CMake *.in template system.
This has three challenges. The first is generating the template output. The second is making the output available to genrules, filegroups, and cc_* rules. The third is getting that dependency passed transitively to further downstream rules.
I have it generating the output file version.hpp in genfiles (or bazel-bin), and I can get the initial library rule to include it, but I can't seem to figure out how to make my cc_binary rule, which depends on the cc_library and transitively the header_template rule to find the header file.
I have the following .bzl rule:
def _header_template_impl(ctx):
    # this generates the output from the template
    ctx.actions.expand_template(
        template = ctx.file.template,
        output = ctx.outputs.out,
        substitutions = ctx.attr.vars,
    )
    return [
        # create a provider which says that this
        # out file should be made available as a header
        CcInfo(compilation_context=cc_common.create_compilation_context(
            headers=depset([ctx.outputs.out])
        )),
        # Also create a provider referencing this header ???
        DefaultInfo(files=depset(
            [ctx.outputs.out]
        ))
    ]

header_template = rule(
    implementation = _header_template_impl,
    attrs = {
        "vars": attr.string_dict(
            mandatory = True
        ),
        "extension": attr.string(default=".hpp"),
        "template": attr.label(
            mandatory = True,
            allow_single_file = True,
        ),
    },
    outputs = {
        "out": "%{name}%{extension}",
    },
    output_to_genfiles = True,
)
Elsewhere I have a cc_library rule:
load("//:tools/header_template.bzl", "header_template")

# version control
BONSAI_MAJOR_VERSION = '2'
BONSAI_MINOR_VERSION = '0'
BONSAI_PATCH_VERSION = '9'
BONSAI_VERSION = \
    BONSAI_MAJOR_VERSION + '.' + \
    BONSAI_MINOR_VERSION + '.' + \
    BONSAI_PATCH_VERSION

header_template(
    name = "bonsai_version",
    extension = ".hpp",
    template = "version.hpp.in",
    vars = {
        "@BONSAI_MAJOR_VERSION@": BONSAI_MAJOR_VERSION,
        "@BONSAI_MINOR_VERSION@": BONSAI_MINOR_VERSION,
        "@BONSAI_PATCH_VERSION@": BONSAI_PATCH_VERSION,
        "@BONSAI_VERSION@": BONSAI_VERSION,
    },
)

# ...
private = glob([
    "src/**/*.hpp",
    "src/**/*.cpp",
    "proto/**/*.hpp",
])
public = glob([
    "include/*.hpp",
    ":bonsai_version",
])

cc_library(
    # target name matches directory name so you can call:
    # bazel build .
    name = "bonsai",
    srcs = private,
    hdrs = public,
    # public headers
    includes = [
        "include",
    ],
    # ...
    deps = [
        ":bonsai_version",
        # ...
    ],
    # ...
)
When I build, my source files need to be able to:
#include "bonsai_version.hpp"
I think the answer involves CcInfo but I'm grasping in the dark as to how it should be constructed.
I've already tried adding "-I$(GENDIR)/" + package_name() to the copts, to no avail. The generated header still isn't available.
My expectation is that I should be able to return some kind of Info object that would allow me to add the dependency in srcs. Maybe it should be a DefaultInfo.
I've dug through the bazel rules examples and the source, but I'm missing something fundamental, and I can't find documentation that discusses this in particular.
I'd like to be able to do the following:
header_template(
    name = "some_header",
    extension = ".hpp",
    template = "some_header.hpp.in",
    vars = {
        "@SOMEVAR@": "value",
        "{ANOTHERVAR}": "another_value",
    },
)

cc_library(
    name = "foo",
    srcs = ["foo.src", ":some_header"],
    ...
)

cc_binary(
    name = "bar",
    srcs = ["bar.cpp"],
    deps = [":foo"],
)
and include the generated header like so:
#include "some_header.hpp"
void bar(){
}
The answer looks like it is:
def _header_template_impl(ctx):
    # this generates the output from the template
    ctx.actions.expand_template(
        template = ctx.file.template,
        output = ctx.outputs.out,
        substitutions = ctx.attr.vars,
    )
    return [
        # create a provider which says that this
        # out file should be made available as a header
        CcInfo(compilation_context=cc_common.create_compilation_context(
            # pass out the include path for finding this header
            includes=depset([ctx.outputs.out.dirname]),
            # and the actual header here.
            headers=depset([ctx.outputs.out])
        ))
    ]
elsewhere:
header_template(
    name = "some_header",
    extension = ".hpp",
    template = "some_header.hpp.in",
    vars = {
        "@SOMEVAR@": "value",
        "{ANOTHERVAR}": "another_value",
    },
)

cc_library(
    name = "foo",
    srcs = ["foo.cpp"],
    deps = [":some_header"],
    ...
)

cc_binary(
    name = "bar",
    srcs = ["bar.cpp"],
    deps = [":foo"],
)
If your header has a generic name (e.g. config.h) and you want it to be private (i.e. in srcs instead of hdrs), you might need a different approach. I've seen this problem with gflags, which "leaked" config.h and affected libraries that depended on it (issue).
Of course, in both cases, the easiest solution is to generate and commit header files for the platforms you target.
Alternatively, you can set copts for the cc_library rule that uses the generated private header:
cc_library(
    name = "foo",
    srcs = ["foo.cpp", "some_header.hpp"],
    copts = ["-I$(GENDIR)/my/package/name"],
    ...
)
If you want this to work when your repository is included as an external repository, you have a bit more work cut out for you due to bazel issue #4463.
PS. You might want to see if cc_fix_config from https://github.com/antonovvk/bazel_rules works for you. It's just a wrapper around perl but I found it useful.

Filter source files for custom rule

I used a bazel macro to run a python test on a subset of source files. Similar to this:
def report(name, srcs):
    source_labels = [file for file in srcs if file.startswith("a")]
    if len(source_labels) == 0:
        return
    source_filenames = ["$(location %s)" % x for x in source_labels]
    native.py_test(
        name = name + "_report",
        srcs = ["report_tool"],
        data = source_labels,
        main = "report_tool.py",
        args = source_filenames,
    )

report("foo", ["foo.hpp", "afoo.hpp"])
This worked fine until one of my source files started using a select and now I get the error:
File "/home/david/foo/report.bzl", line 47, in report
[file for file in srcs if file.startswith("a")]
type 'select' is not iterable
I tried to move the code to a bazel rule, but then I get a different error that py_test can not be used in the analysis phase.
The select causes the error because macros are evaluated during the loading phase, whereas selects are not evaluated until the analysis phase (see Extension Overview).
Similarly, py_test can't be used in a rule implementation because the rule implementation is evaluated in the analysis phase, whereas the py_test would need to have been loaded in the loading phase.
One way past this is to create a separate Starlark rule that takes a list of labels and just creates a file with each filename from the label. Then the py_test takes that file as data and loads the other files from there. Something like this:
def report(name, srcs):
    file_locations_label = "_" + name + "_file_locations"
    _generate_file_locations(
        name = file_locations_label,
        labels = srcs
    )
    native.py_test(
        name = name + "_report",
        srcs = ["report_tool.py"],
        data = srcs + [file_locations_label],
        main = "report_tool.py",
        args = ["$(location %s)" % file_locations_label]
    )

def _generate_file_locations_impl(ctx):
    paths = []
    for l in ctx.attr.labels:
        f = l.files.to_list()[0]
        if f.basename.startswith("a"):
            paths.append(f.short_path)
    ctx.actions.write(ctx.outputs.file_paths, "\n".join(paths))
    return DefaultInfo(runfiles = ctx.runfiles(files = [ctx.outputs.file_paths]))

_generate_file_locations = rule(
    implementation = _generate_file_locations_impl,
    attrs = { "labels": attr.label_list(allow_files = True) },
    outputs = { "file_paths": "%{name}_files" },
)
This has one disadvantage: Because the py_test has to depend on all the sources, the py_test will get rerun even if the only files that have changed are the ignored files. (If this is a significant drawback, then there is at least one way around this, which is to have _generate_file_locations filter the files too, and have the py_test depend on only _generate_file_locations. This could maybe be accomplished through runfiles symlinks)
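On the Python side, the report tool would then read the paths from the generated file instead of receiving them as individual arguments. A minimal sketch of that loading step, assuming the file contains one path per line as written by the rule above; `load_paths` is an illustrative name, not part of the original tool.

```python
def load_paths(params_file):
    """Read one path per line from the params file, ignoring blank lines."""
    with open(params_file, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

The tool would call `load_paths(sys.argv[1])`, since the macro passes the location of the generated file as the only argument.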
Update:
Since the test report tool comes from an external repository and can't be easily modified, here's another approach that might work better. Rather than create a rule that creates a params file (a file containing the paths to process) as above, the Starlark rule can itself be a test rule that uses the report tool as the test executable:
def _report_test_impl(ctx):
    filtered_srcs = []
    for f in ctx.attr.srcs:
        f = f.files.to_list()[0]
        if f.basename.startswith("a"):
            filtered_srcs.append(f)
    report_tool = ctx.attr._report_test_tool
    ctx.actions.write(
        output = ctx.outputs.executable,
        content = "{report_tool} {paths}".format(
            report_tool = report_tool.files_to_run.executable.short_path,
            paths = " ".join([f.short_path for f in filtered_srcs])),
        is_executable = True,  # the generated test script must be executable
    )
    runfiles = ctx.runfiles(files = filtered_srcs).merge(
        report_tool.default_runfiles)
    return DefaultInfo(runfiles = runfiles)

report_test = rule(
    implementation = _report_test_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        "_report_test_tool": attr.label(default="//:report_test_tool"),
    },
    test = True,
)
This requires that the test report tool be a py_binary somewhere so that the test rule above can depend on it:
py_binary(
    name = "report_test_tool",
    srcs = ["report_tool.py"],
    main = "report_tool.py",
)

How to write an extended rule for Java tests in Bazel?

What I have figured out so far is to create an AllTests class and run it with JUnit, but I am not satisfied with it. I want this rule to create one test for each Java test file in the codebase.
def junit_suite_test(name, srcs, deps, size="small", resources=[], classpath_resources=[], jvm_flags=[], tags=[], data=[]):
    tests = []
    package = PACKAGE_NAME.replace("src/test/java/", "").replace("/", ".")
    for src in srcs:
        if src.endswith("Test.java"):
            if "/" in src:
                src = package + "." + src.replace("/", ".")
            tests += [src.replace(".java", ".class")]
    native.genrule(
        name = name + "-AllTests-gen",
        outs = ["AllTests.java"],
        cmd = """
cat <<EOF >> $@
package %s;
import org.junit.runner.RunWith;
import org.junit.runners.Suite;
@RunWith(Suite.class)
@Suite.SuiteClasses({%s})
public class AllTests {}
EOF
""" % (package, ",".join(tests))
    )
    native.java_test(
        name = name,
        srcs = srcs + ["AllTests.java"],
        test_class = package + ".AllTests",
        resources = resources,
        classpath_resources = classpath_resources,
        data = data,
        size = size,
        tags = tags,
        jvm_flags = jvm_flags,
        deps = deps + [
        ],
    )
You can do something like this:
[java_test(name = s[:-5], srcs = [s]) for s in glob(["*.java"])]
That will create one test target per Java file.
With that method, your macro would look like:
def junit_suite_test(name, srcs, deps, size="small", resources=[], classpath_resources=[], jvm_flags=[], tags=[], data=[]):
    [native.java_test(
        # each target needs its own name, so derive it from the source file
        name = name + "_" + src[:-5].replace("/", "_"),
        srcs = [src],
        resources = resources,
        classpath_resources = classpath_resources,
        data = data,
        size = size,
        tags = tags,
        jvm_flags = jvm_flags,
        deps = deps,
    ) for src in srcs if src.endswith("Test.java")]
Of course, you will probably need to adapt this so that each test gets the right sources.
However, I would recommend keeping your own solution rather than doing this, since too much parallelism can actually end up being slower. The test log and the XML file will report the actual failing test case, and you can use shard_count to increase parallelism if it is really needed.
