Are the bazel buildtools primarily focused on single starlark files? - bazel

I'm taking a glance at the buildtools repo (https://github.com/bazelbuild/buildtools) and trying to understand the scope of its responsibilities as they relate to the three phases of a Bazel build (loading, analysis, execution).
The repo's description states that it is "A bazel BUILD file formatter and editor." I find a lot of Go code in the repo providing an AST parser, Starlark syntax handling, and reformatting and rewriting of BUILD files. Basically, the logic is designed to operate on a single Starlark file at a time. Rereading the repo description in this light leads me to conclude that buildtools is really a single-file-scoped effort, and that its tools only overlap (perhaps partially) with the loading operations Bazel performs while building.
Question: Is it accurate that the focus of buildtools is on the single Starlark file?
If that's true, then all the multi-file Starlark analysis logic seems to actually be maintained over at https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib, and I should not expect to find any tools for the analysis phase and beyond in the buildtools repo. Is that right?

I don't work on buildtools, but I agree with your reading: these tools focus on BUILD / .bzl files in isolation. They let you process many such files in parallel, applying similar operations to each.
If you wonder whether these tools understand the relations between those files, the answer seems to be no.
If you further wonder which tools do understand those relations, the answer is Bazel's query, cquery, and aquery. I'm not aware of a programmable API for these queries, though; you have to run Bazel to perform them.
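For example, one common pattern is to run Bazel from a script and request a machine-readable output format that other tools can parse (the target pattern below is just a placeholder):

bazel query 'deps(//my/package:target)' --output=proto > deps.pb
bazel cquery 'deps(//my/package:target)' --output=jsonproto > deps.json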

buildtools has tools that work at the syntactic level (they look at the syntax tree). These tools live outside of Bazel and have no knowledge of Bazel's build phases. In the future, we may expand the code to work on multiple files (for static analysis), but it will still be independent of Bazel's phases.
https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib/ is the source code of Bazel. The syntax/ directory contains the code for reading and evaluating Starlark files. That code is called by Skyframe, which invokes the interpreter many times in parallel during both the loading and analysis phases.
If you have a more specific question (what are you trying to do?), I can help more. :)

Related

Can bazel package depend on a source file in another package

A few years ago I wrote a set of wrappers for Bazel that enabled me to use it to build FPGA code. The FPGA bit is only relevant because the full clean build takes many CPU days so I really care about caching and minimizing rebuilds.
Using Bazel v0.28 I never found a way to have my Bazel package depend on a single source file from somewhere else in the git repo. It felt like this wasn't something Bazel was designed for.
We want to do this because we have a library of VHDL source files that are parameterized and the parameters are set in the instantiating VHDL source. (VHDL generics). If we declare this library as a Bazel package in its own right then a change to one library file would rebuild everything (at huge time cost) when in practice only a couple of steps might need to be rebuilt.
I worked around this with a python script to copy all the individual source files into a subdirectory and then generate the BUILD file to reference these copies. The resulting build process is:
call python preparation script
bazel build //:allfpgas
call python result extractor
This is clearly quite ugly but the benefits were huge so we live with it.
Now we want to leverage Bazel to build our Java, C++, etc., so I wanted to revisit this and try to make everything work with Bazel alone.
In the latest Bazel, is there a way to have a BUILD package depend on individual source files outside the package directory? If Bazel can't do it, would Buck, Pants, or Please (please.build) work better for our use case?
The Bazel rules for most languages support doing something like this already. For example, the Python rules bundle source files from multiple packages together, and the C++ rules manage include files from other packages. Somehow the rule has to pass the source files around in providers, so that another rule can generate actions which use them. Hard to be more specific without knowing which rules you're using.
If you just want to copy the files, you can do that in bazel with a genrule. In the package with the source file:
exports_files(["templated1.vhd", "templated2.vhd"])
In the package that uses it:
genrule(
    name = "copy_templates",
    srcs = ["//somewhere:templated1.vhd", "//somewhere:templated2.vhd"],
    outs = ["templated1.vhd", "templated2.vhd"],
    cmd = "cp $(SRCS) $(RULEDIR)",
)

some_library(
    srcs = ["templated1.vhd", "templated2.vhd", "other.vhd"],
)
If you want to deduplicate that across multiple packages that use it, put the filenames in a list and write a macro to create the genrule.
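A minimal sketch of such a macro, assuming it lives in a hypothetical copy_templates.bzl file at the repository root (the file and macro names are made up):

# copy_templates.bzl (hypothetical file name)
def copy_templates(name, templates):
    """Copies template sources from another package into the calling package."""
    native.genrule(
        name = name,
        srcs = templates,
        # Derive local file names from the labels,
        # e.g. "//somewhere:templated1.vhd" -> "templated1.vhd".
        outs = [t.rsplit(":", 1)[-1] for t in templates],
        cmd = "cp $(SRCS) $(RULEDIR)",
    )

Each consuming package's BUILD file then shrinks to:

load("//:copy_templates.bzl", "copy_templates")

copy_templates(
    name = "copy_templates",
    templates = ["//somewhere:templated1.vhd", "//somewhere:templated2.vhd"],
)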

Running bazel query without downloading external dependencies

Given a rather big repository built with bazel and tons of third party dependencies in multiple languages (including heavy docker containers), I have the following problem:
running Bazel queries triggers the downloading of many of these dependencies, resulting in slow query performance. Hence, the question:
Is there a way to run bazel query without having to download the dependencies?
Typical query: bazel query 'kind("source file", deps(//...) except deps(//3rdparty/...))'
I'm aware of the caching options, which I mostly use, but depending on the languages, things can still be slow.
After asking on Bazel's Slack channel, the response (from Sahin Yort) is not encouraging:
I don't believe that's possible due to the nature of workspace files. A load() from a workspace leads to a fetch of that workspace, because Bazel has to expand the workspace in order to know its targets. At that point, it is up to the repository rule to fetch whatever it needs, eagerly or lazily. Workspace rules usually expand BUILD files using various patterns, e.g. running an executable or using expand_template. I have little faith that it is possible to get what you want.
I'll be looking into other ways to speed things up: a likely culprit for the slowness is probably the action/analysis cache being invalidated due to some flags changing.
If you only need to parse the AST without resolving dependencies, you can use buildozer instead:
buildozer "print srcs" "//some:target"
It also supports -output_json for machine-readable output.
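For example, to print attributes from every rule under a directory as JSON (the path and attribute choice are just placeholders), something like:

buildozer -output_json "print name srcs" "//some/path/...:*"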

Bazel build file introspection

Are there any tools out there for introspecting a collection of Bazel BUILD files to run queries against a codebase? I'm thinking of a simple case: collecting all tags used in a codebase. Some sort of Bazel "metaquery" capability that would let me scope out the conventions and usages across a repo with a substantial number of BUILD files.
It would even be nice to be able to do a cross-tabulation of cc_test and py_test rules against their collective tags. Ideally there'd be a Python client to introspect the Bazel files.
bazel query provides information about your target dependency graph, with a highly expressive query language. It can output into various formats like DOT, XML, Protobuf, and the text representation of the expanded BUILD files themselves (if there are macros) for post-processing. See: Bazel query how-to, Bazel query reference.
bazel cquery does the same as query, but also performs the analysis phase, which computes information about configurations (e.g. CPU, API levels) over the target dependency graph. This takes slightly longer, but gives you a more accurate representation of the graph that Bazel brings to the execution phase. See: Bazel cquery reference.
bazel aquery is not directly related to BUILD file introspection, in that it presents information about executable actions, which is a few layers of computation after BUILD file parsing and analysis. See: Bazel aquery reference.
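For the tag-collection example in the question, one approach is to dump the relevant rules as XML and post-process the tags attribute; something like the following (the //... pattern and output file are illustrative):

bazel query 'kind("cc_test|py_test", //...)' --output=xml > tests.xml

The XML output includes each rule's tags attribute, so a small script can then cross-tabulate rule kinds against tags.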
query, cquery and aquery don't operate on the syntax of the BUILD files. If you want to work with the Starlark syntax / AST, check out the buildozer and buildifier tooling in the bazelbuild/buildtools repository.
If there is information about your build graph that cannot be retrieved using these mechanisms, please file a feature request on the Bazel GitHub project.

why is bazel faster than gradle

Originally I used Gradle to build my Android project, but recently I migrated it to Bazel, and I find that Bazel is truly faster than Gradle. I want to know why, but the Bazel documentation doesn't say much about this. Can anyone help me?
Thanks very much!
Full disclosure: I work on Bazel.
That's not an easy question to answer for two reasons. First, performance is highly dependent on the scenario. For example, we'd generally expect a clean build to be slower than a build where only a single file has changed. Second, I don't know how Gradle works internally, and they've done a lot of work recently to improve Gradle performance.
But I can talk about Bazel and what we're doing to make it fast. We've been working on build performance for ~10 years, starting long before we made it public.
The key feature is that we require all dependencies to be declared, and we track them explicitly. If you use a header file in C++, or depend on a Java library, you must declare this dependency in your BUILD file (and we enforce that these are declared by sandboxing individual actions). There are three effects from this:
First, we can heavily parallelize the build, because we know which things depend on which other things.
Second, we can make incremental builds very fast, because we can tell what parts of the build have to be re-done when you change a specific file (BUILD file, header file, source file, ...).
Third, we almost never have to do clean builds. Other build tools often require 'make clean' to get into a predictable state - since Bazel knows all the dependencies, it can get to a predictable state on every single build.
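For example, a C++ library must name both the headers it exposes and the other targets it depends on; a minimal sketch (the target and file names here are made up):

cc_library(
    name = "geometry",
    srcs = ["geometry.cc"],
    hdrs = ["geometry.h"],
    # Every library whose headers geometry.cc includes must be listed here.
    deps = ["//base:math"],
)

Because that dependency list is complete, Bazel knows exactly which actions to re-run when geometry.h or //base:math changes, and which results it can safely reuse.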
Another effect is that we can cache remotely (i.e., across users), and even execute on another machine, although neither of these is fully supported at the time of this writing.

Ant: Is it possible to create a dynamic ant script?

So, at work, I frequently have to create virtually identical ant scripts. Basically the application we provide to our clients is designed to be easily extensible, and we offer a service of designing and creating custom modules for it. Because of the complexity of our application, with lots of cross dependencies, I tend to develop the module within our core dev environment, compile it using IntelliJ, and then run a basic ant script that does the following tasks:
1) Clean build directory
2) Create build directory and directory hierarchy based on package paths.
3) Copy class files (and source files to a separate sources directory).
4) Jar it up.
The thing is, to do this I need to go through the script line by line and change a bunch of property names, so it works for the new use case. I also save all the scripts in case I need to go back to them.
This isn't the worst thing in the world, but I'm always looking for a better way to do things. Hence my idea:
For each specific implementation I would provide an ant script (or other file) of just properties. Key-value pairs, which would have specific prefixes for each key based on what it's used for. I would then want my ant script to run the various tasks, executing each one for the key-value pairs that are appropriate.
For example, copying the class files. I would have a property with a name like "classFile.filePath". I would want the script to call the task for every property it detects that starts with "classFile...".
Honestly, from my current research so far, I'm not confident that this is possible. But... I'm super stubborn, and always looking for new creative options. So, what options do I have? Or are there none?
It's possible to dynamically generate Ant scripts; for example, the following question shows how to do it using an XML input file:
Use pure Ant to search if list of files exists and take action based on condition
Personally, I would always try to avoid this level of complexity. Ant is not a programming language.
Looking at what you're trying to achieve, it does appear you could benefit from packaging your dependencies as jars and using a Maven repository manager like Nexus or Artifactory for storage. This would simplify each sub-project's build. When building projects that depend on these published libraries, you can use a dependency management tool like Apache Ivy to download them.
Hope that helps; your question is fairly broad.
