I have automatically-generated .cc sources and a Starlark rule running the .cc generator:
BUILD file:
generate_cc(
    name = "foo_generated",
)  # runs an executable that generates foo.h, foo.cc
I'd like the above foo_generated to act also as a cc_library, so that it can be a valid dependency of a subsequent cc_library:
cc_library(
    name = "bar",
    deps = [":foo_generated"],  # foo_generated used like a cc_library()
)
Can generate_cc be implemented as a single rule, without macros, so that a generate_cc target works wherever a cc_library is expected in deps?
(I realize that generate_cc could be a macro that calls the actual rule and then calls a cc_library rule, thereby creating two separate targets / labels - this is what I'd like to avoid).
If a rule implementation could call another rule, then generate_cc's implementation could:
wrap the sources it generates in a cc_library, and
return the CcInfo provider returned by cc_library,
as in (hypothetical .bzl file):
def generate_cc_impl(ctx):
    # generate .h, .cc files
    # ...
    cc_info = native.cc_library(...)  # wrap .h, .cc files
    return cc_info
But I suppose calling one rule from another is not possible?
Rules cannot call other rules. However, support was added fairly recently for rules to reuse most of the native C++ functionality, which covers this use case. There's a section of the documentation about implementing Starlark rules that depend on C++ rules and/or that C++ rules can depend on.
The my_c_archive example shows a lot of the boilerplate needed to use this functionality (finding the cc_toolchain and feature_configuration, in particular). cc_common.compile is the function that creates actions to compile your source files, and cc_common.create_linking_context_from_compilation_outputs converts the CcCompilationOutputs from compile into a CcLinkingContext for creating the CcInfo to return.
You can choose to pull some or all of the files out of the CcCompilationOutputs and CcLinkingOutputs to return as your rule's DefaultInfo, depending on your use case.
For reference, create_linking_context_from_compilation_outputs returns (CcLinkingContext, CcLinkingOutputs). I created bazel#10253 just now to add that to the docs.
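Putting those pieces together, here's a minimal sketch of what generate_cc's implementation could look like. The _generator attribute and its target are made up for illustration; the cc_common boilerplate follows the my_c_archive pattern:

load("@bazel_tools//tools/cpp:toolchain_utils.bzl", "find_cpp_toolchain")

def _generate_cc_impl(ctx):
    # Generate foo.h / foo.cc with the generator executable.
    hdr = ctx.actions.declare_file("foo.h")
    src = ctx.actions.declare_file("foo.cc")
    ctx.actions.run(
        outputs = [hdr, src],
        executable = ctx.executable._generator,  # hypothetical attribute
    )

    # Standard C++ boilerplate: toolchain + feature configuration.
    cc_toolchain = find_cpp_toolchain(ctx)
    feature_configuration = cc_common.configure_features(
        ctx = ctx,
        cc_toolchain = cc_toolchain,
        requested_features = ctx.features,
        unsupported_features = ctx.disabled_features,
    )

    # Compile the generated files, then build a linking context from the
    # compilation outputs, as described above.
    compilation_context, compilation_outputs = cc_common.compile(
        name = ctx.label.name,
        actions = ctx.actions,
        feature_configuration = feature_configuration,
        cc_toolchain = cc_toolchain,
        srcs = [src],
        public_hdrs = [hdr],
    )
    linking_context, _linking_outputs = cc_common.create_linking_context_from_compilation_outputs(
        name = ctx.label.name,
        actions = ctx.actions,
        feature_configuration = feature_configuration,
        cc_toolchain = cc_toolchain,
        compilation_outputs = compilation_outputs,
    )

    return [
        DefaultInfo(files = depset([hdr, src])),
        CcInfo(
            compilation_context = compilation_context,
            linking_context = linking_context,
        ),
    ]

generate_cc = rule(
    implementation = _generate_cc_impl,
    attrs = {
        "_generator": attr.label(
            default = "//tools:cc_generator",  # hypothetical generator target
            executable = True,
            cfg = "exec",
        ),
        "_cc_toolchain": attr.label(
            default = Label("@bazel_tools//tools/cpp:current_cc_toolchain"),
        ),
    },
    fragments = ["cpp"],
    toolchains = ["@bazel_tools//tools/cpp:toolchain_type"],
)

With that, cc_library(name = "bar", deps = [":foo_generated"]) consumes the returned CcInfo just as it would from another cc_library.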
Related
I use the Bazel configure_make rule to build a third-party library. This library requires specifying paths to the compiler in its configure options (otherwise it uses a default compiler, like /usr/bin/gcc, which is definitely wrong for cross-compilation).
I want to keep my BUILD file free of configurable toolchain paths, and I see that I can get a toolchain from ctx in a rule implementation. The idea is to get the compiler/linker/etc. paths from ctx and add them to the configure options of the configure_make rule, so the BUILD file carries no information about the toolchain.
I made a proof of concept by copying the original configure_make rule and changing its implementation, and it works as I want. But I don't want to keep a copy of configure_make if there is a way to write some wrapper for this rule.
Generally, what I want:
def _new_impl(ctx):
    find_cpp_toolchain(ctx)
    attrs = ctx.attr
    # add new configure options somehow
    # pass them to configure_make ???

new_rule = rule(
    # all configure_make attrs
    impl = _new_impl,
    ...
)
For now, after reading the Bazel docs it seems impossible, but I know that I'm not an expert in Bazel, so I could have missed something.
It is impossible in the current version: a rule implementation cannot wrap or call another rule.
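That said, the toolchain-discovery half of the idea works fine inside a single (copied) rule implementation. A sketch of pulling compiler paths from the C++ toolchain, assuming the rest of the forked configure_make implementation consumes configure_options (names here are illustrative):

load("@bazel_tools//tools/cpp:toolchain_utils.bzl", "find_cpp_toolchain")

def _new_impl(ctx):
    cc_toolchain = find_cpp_toolchain(ctx)
    # Paths come from the CcToolchainInfo provider; a forked configure_make
    # implementation could append these to its configure options so the
    # BUILD file never mentions the toolchain.
    configure_options = [
        "CC=" + cc_toolchain.compiler_executable,
        "AR=" + cc_toolchain.ar_executable,
        "LD=" + cc_toolchain.ld_executable,
    ]
    # ... rest of the copied configure_make implementation ...

new_rule = rule(
    implementation = _new_impl,
    attrs = {
        "_cc_toolchain": attr.label(
            default = Label("@bazel_tools//tools/cpp:current_cc_toolchain"),
        ),
    },
    fragments = ["cpp"],
    toolchains = ["@bazel_tools//tools/cpp:toolchain_type"],
)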
I have rule A implemented with a macro that uses declare_directory to produce a set of files:
output = ctx.actions.declare_directory("selected")
The names of those files are not known in advance. The implementation returns the directory created by declare_directory with the following:
return DefaultInfo(
    files = depset([output]),
)
Rule A is included in the "srcs" attribute of rule B, which is also implemented with a macro. Unfortunately, the list of files passed to B's implementation through the "srcs" attribute contains only the "selected" directory created by rule A, not the files residing in that directory.
I know that the Args class supports expansion of directories, so I could pass the names of all files in the "selected" directory to a single action. What I need, however, is a separate action for every individual file, for parallelism and caching. What is the best way to achieve that?
This is one of the intended use cases of directory outputs (called TreeArtifacts in the implementation), and it's implemented using ActionTemplate:
https://github.com/bazelbuild/bazel/blob/c2100ad420618bb53754508da806b5624209d9be/src/main/java/com/google/devtools/build/lib/actions/ActionTemplate.java#L24-L57
However, this is not exposed to Starlark, and it currently has only a couple of usages: the Android rules (AndroidBinary.java) and the C++ rules (CcCompilationHelper.java). The Android and C++ rules are going to be migrated to Starlark, so this functionality might eventually be made available in Starlark, but I'm not sure of any concrete timelines. It would probably be good to file a feature request on GitHub.
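Until then, the single-action fallback mentioned in the question looks roughly like this (a sketch; the _tool attribute and its target are made up):

def _b_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".out")
    args = ctx.actions.args()
    # add_all expands directories (TreeArtifacts) to the files inside them
    # at execution time, so a single action sees every generated file.
    args.add_all(ctx.files.srcs)
    args.add(out)
    ctx.actions.run(
        outputs = [out],
        inputs = ctx.files.srcs,
        executable = ctx.executable._tool,  # hypothetical processing tool
        arguments = [args],
    )
    return [DefaultInfo(files = depset([out]))]

b = rule(
    implementation = _b_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        "_tool": attr.label(
            default = "//tools:process",  # hypothetical
            executable = True,
            cfg = "exec",
        ),
    },
)

This gives correct incremental rebuilds but a single coarse action, so it sacrifices the per-file parallelism and caching the question asks for.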
I'm having trouble understanding how to construct proper label forms when dealing with external repositories (directories with their own WORKSPACE).
What is the semantic meaning of characters like /, :, // or @?
For example:
@foo/bar
@foo:bar
//foo
foo
Do they preserve their meaning when used in an external repository? Also, is //external special in any way?
/ is a separator for package and target names.
relative/package/to/my:target
//absolute/package/to:my/file/target.java
A package is defined as a directory containing a BUILD or BUILD.bazel file.
: is the lexeme for selecting a rule or file target in a package.
//my/package:my_java_binary
Selects the target my_java_binary defined in <workspace root>/my/package/BUILD
//my/package:file.go
Selects the file <workspace root>/my/package/file.go if <workspace root>/my/package/BUILD exists, and if there's a rule in that BUILD file that references it.
//:my/nested/file.txt
Selects the file <workspace root>/my/nested/file.txt if <workspace root>/BUILD exists and no BUILD file exists in the my and my/nested subdirectories (a nested BUILD file would make them separate packages).
// is the location of the current or closest parent directory containing a WORKSPACE file.
Otherwise known as workspace root.
@ is used for referencing a repository by its name when used to the left of //
@io_bazel_rules_scala//scala:scala.bzl: look into your WORKSPACE file for a repository named io_bazel_rules_scala. Usually defined using http_archive or git_repository.
@//my/package:target: @ alone refers to the current workspace.
As of Bazel 0.16.0, @ can be used in package names.
Do they preserve their meaning when used in an external repository?
Yes, think of the @<repository> syntax as a namespace mechanism.
Also, is //external special in any way?
Yes, it's used for the bind function, which is not recommended anymore. bind lets you give a target an alias in //external.
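To tie these forms together, here's a small hypothetical example (the archive URL is made up):

# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "io_bazel_rules_scala",  # referenceable as @io_bazel_rules_scala
    urls = ["https://example.com/rules_scala.zip"],  # hypothetical URL
)

# my/package/BUILD
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library")

scala_library(
    name = "my_lib",  # full label: @//my/package:my_lib (or //my/package:my_lib)
    srcs = ["Lib.scala"],
)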
Problem
I wonder how to inform Bazel about dependencies that are unknown at declaration time but known at build time (a.k.a. implicit dependencies, dynamic dependencies, ...). For instance, when compiling C++ sources, the .cpp source file depends on some header files, and this information is not available when writing the BUILD file; it needs to be retrieved at build time. Whatever the solution used to get the information (dry run, generating a depfile, parsing stdout), it needs to run at build time, and the information needs to make it back into Bazel's build graph.
Since Skylark does not allow I/O, for instance reading a generated depfile or parsing a stdout result containing a dependency list, I have no clue how to deal with this.
Behind implicit dependencies, I am looking for correct incremental builds.
Example
To experiment with this problem I created a simple tool, just_a_tool.exe, which takes an input file, reads a list of files from it, and concatenates the content of all those files into an output file.
Example command line:
just_a_tool.exe --input input.txt --depfile dep.d output.txt
dep.d contains the list of all the files that were read.
Issue
If I change the content of test1.txt, test2.txt, or test3.txt, Bazel does not rebuild the output.txt file. Of course: it does not know about those dependencies.
Example files
just_a_tool.bzl
def _impl(ctx):
    exec_path = "C:/Code/JustATool/just_a_tool.exe"
    for f in ctx.attr.source.files:
        source_path = f.path
    output_path = ctx.outputs.out.path
    dep_file = ctx.actions.declare_file("dep.d")
    args = ["--input", source_path, "--depfile", dep_file.path, output_path]
    ctx.actions.run(
        outputs = [ctx.outputs.out, dep_file],
        executable = exec_path,
        inputs = ctx.attr.source.files,
        arguments = args,
    )

jat_convert = rule(
    implementation = _impl,
    attrs = {
        "source": attr.label(mandatory = True, allow_files = True, single_file = True),
    },
    outputs = {"out": "%{name}.txt"},
)
BUILD
load("//tool:just_a_tool.bzl", "jat_convert")
jat_convert(
name="my_output",
source=":input.txt"
)
input.txt
test1.txt
test2.txt
test3.txt
Goal
I want correct and fast incremental builds for the following situations:
Generating reflection data from C++ sources; this custom tool's execution depends on the header files included by my source files.
Using an internal tool to build asset files which can include other files.
Running a custom preprocessor on my shaders that allows a #include feature.
Thanks!
Bazel's extension language doesn't support creating actions with a dynamic set of inputs, where the set depends on the output of a previous action. In other words, a custom rule cannot run an action, read the action's output, and then create actions with those inputs or update (or prune) the inputs of already-created actions.
Instead, I suggest adding attribute(s) to your rule where the user can declare the set of files that the sources may include; I call this "the universe of headers". The actions you create depend on this user-defined universe, so the set of action inputs is completely defined. Of course, this means these actions potentially depend on more files than the .cpp files they process actually include.
This approach is analogous to how the cc_* rules work: a file in a cc_* rule's srcs can include other files from the srcs of the same rule and from the hdrs of its dependencies, but nothing else. Thus the union of srcs + hdrs of (direct and transitive) dependencies defines the universe of header files that a .cpp file may include.
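Applied to the jat_convert example, a sketch of this approach might look like the following; the universe attribute name is mine, and the _tool target is hypothetical:

def _jat_convert_impl(ctx):
    dep_file = ctx.actions.declare_file("dep.d")
    ctx.actions.run(
        outputs = [ctx.outputs.out, dep_file],
        # The action depends on the input file plus the whole declared
        # universe, so editing any file in the universe re-runs the action.
        inputs = ctx.files.source + ctx.files.universe,
        executable = ctx.executable._tool,
        arguments = [
            "--input", ctx.files.source[0].path,
            "--depfile", dep_file.path,
            ctx.outputs.out.path,
        ],
    )

jat_convert = rule(
    implementation = _jat_convert_impl,
    attrs = {
        "source": attr.label(mandatory = True, allow_files = True),
        "universe": attr.label_list(allow_files = True),
        "_tool": attr.label(
            default = "//tool:just_a_tool",  # hypothetical wrapper target
            executable = True,
            cfg = "exec",
        ),
    },
    outputs = {"out": "%{name}.txt"},
)

A BUILD target would then declare universe = ["test1.txt", "test2.txt", "test3.txt"], over-approximating the true dependency set but keeping incremental builds correct: touching any file in the universe re-runs the action.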
I have the following maven_jar in my workspace:
maven_jar(
    name = "com_google_code_findbugs_jsr305",
    artifact = "com.google.code.findbugs:jsr305:3.0.1",
    sha1 = "f7be08ec23c21485b9b5a1cf1654c2ec8c58168d",
)
In my project I reference it through @com_google_code_findbugs_jsr305//jar. However, I now want to depend on a third-party library that references @com_google_code_findbugs_jsr305 without the jar target.
I tried looking into both bind and alias; however, alias cannot be applied inside the WORKSPACE file, and bind doesn't seem to allow you to define targets inside external repositories.
I could rename the version I use so it doesn't conflict, but that feels like the wrong solution.
IIUC, your code needs to depend on both @com_google_code_findbugs_jsr305//jar and @com_google_code_findbugs_jsr305//:com_google_code_findbugs_jsr305. Unfortunately, there isn't any pre-built rule that generates BUILD files for both of those targets, so you basically have to define the BUILD files yourself. Fortunately, @jart has written most of it for you in the closure rule you linked to. You just need to add //jar:jar by appending a couple of lines; after line 69 add something like:
repository_ctx.file(
    'jar/BUILD',
    "\n".join([
        "package(default_visibility = '//visibility:public')",
    ] + _make_java_import('jar', '//:com_google_code_findbugs_jsr305.jar')),
)
This creates a //jar:jar (or equivalently, //jar) target in the repository.
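For reference, the generated jar/BUILD would then contain something along these lines, assuming _make_java_import emits a standard java_import (the exact output depends on that helper):

package(default_visibility = '//visibility:public')

java_import(
    name = 'jar',
    jars = ['//:com_google_code_findbugs_jsr305.jar'],
)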