What's the difference between invocation_id and build_id in Bazel?

Looking at the outputs from Bazel's build event protocol, I can see that the BEP output only has one unique invocation_id (which makes sense) and also only one unique build_id (even when building multiple targets at once).
In that case, what's the point of build_id?

A higher-level system that uses Bazel (e.g., a CI system) may have a concept of a build that is broader than one Bazel invocation. For instance, the higher-level system may retry entire Bazel invocations under certain circumstances or allow a "build" to contain multiple Bazel build or test steps. The build id allows multiple Bazel invocations composing a high-level build to be correlated in Bazel's metadata emissions (most notably, the Build Event Protocol).
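As a rough sketch of that relationship, a CI wrapper could generate one build id per high-level build and a fresh invocation id per Bazel step. This assumes the `--build_request_id` and `--invocation_id` flags are what populate the two ids in the BEP stream; check that against your Bazel version. The helper name `bazel_commands` and the targets are made up for illustration.

```python
import uuid


def bazel_commands(steps, build_id=None):
    """Return (build_id, argv lists) for one high-level build.

    Every step shares the same build id but gets its own invocation id.
    Assumption: --build_request_id and --invocation_id accept UUID strings.
    """
    build_id = build_id or str(uuid.uuid4())
    commands = []
    for step in steps:
        command, targets = step[0], step[1:]
        invocation_id = str(uuid.uuid4())  # unique per Bazel invocation
        commands.append([
            "bazel", command,
            f"--build_request_id={build_id}",
            f"--invocation_id={invocation_id}",
            *targets,
        ])
    return build_id, commands


if __name__ == "__main__":
    # Hypothetical CI "build" made of a build step and a test step.
    _, cmds = bazel_commands([["build", "//app:all"], ["test", "//app:tests"]])
    for argv in cmds:
        print(" ".join(argv))
```

With a single `bazel build` run from the command line there is only one invocation, which is why only one value of each id shows up in the output.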

Related

Bazel + Fastlane?

Is it possible to integrate Fastlane with Bazel (or vice versa)? The non-mobile part of our org uses Bazel for builds, and I'd like to be consistent on mobile. However, Fastlane provides a lot of mobile-specific functionality that Bazel does not. Bazel is for build + test, whereas Fastlane also provides solutions for release/deployment.
Is it possible (or advisable) to call Bazel build from within Fastlane? Or perhaps call Fastlane from within Bazel for deployment?
Bazel is like an interpreter for a language, which allows you to define rules - functions which may have a set of inputs, an action, and a set of outputs.
I am not familiar with Fastlane, but it is surely possible to write a rule that produces an artifact for you. The only requirement is that your set of outputs must be clearly defined (hardcoded in a rule). In other words, you cannot write a rule which will "unzip whatever is in this archive to this folder", because you have to define the set of outputs up front.
The Rules doc page is the best place to start.

Bazel build file introspection

Are there any tools out there for introspecting a collection of Bazel build files to run queries against a codebase with? I'm thinking of a simple case of collecting all defined tags used in a codebase. Some sort of bazel metaquery sort of capability that will let me scope out the conventions and usages across a repo with a substantial amount of build files.
It would even be nice to be able to do a cross tabulation of cc_test and py_test rules against their collective tags. Ideally there'd be a python client to introspect the bazel files.
bazel query provides information about your target dependency graph, with a highly expressive query language. It can output into various formats like DOT, XML, Protobuf, and the text representation of the expanded BUILD files themselves (if there are macros) for post-processing. See: Bazel query how-to, Bazel query reference.
bazel cquery does the same as query, but also performs the analysis phase, which computes information about configurations (e.g. CPU, API levels) over the target dependency graph. This takes slightly longer, but gives you a more accurate representation of the graph that Bazel brings to the execution phase. See: Bazel cquery reference.
bazel aquery is not directly related to BUILD file introspections in that it presents information about executable actions, which is a few layers of computation after BUILD file parsing and analysis. See: Bazel aquery reference
query, cquery and aquery don't operate on the syntax of the BUILD files. If you want to work with the Starlark syntax / AST, check out the buildozer and buildifier tooling in the bazelbuild/buildtools repository.
If there is information about your build graph that cannot be retrieved using these mechanisms, please file a feature request on the Bazel GitHub project.
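For the cross tabulation asked about in the question, one option is to post-process the XML output of something like `bazel query 'kind("cc_test|py_test", //...)' --output=xml`. The sketch below assumes the output contains `<rule class="...">` elements with a `<list name="tags">` of `<string value="..."/>` children. The sample is a hand-written fragment in that shape (without an XML declaration), not real Bazel output, so verify the element names against your own repo.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written fragment shaped like `bazel query --output=xml` output (assumed).
SAMPLE_XML = """<query version="2">
  <rule class="cc_test" name="//a:a_test">
    <list name="tags"><string value="small"/><string value="manual"/></list>
  </rule>
  <rule class="py_test" name="//b:b_test">
    <list name="tags"><string value="small"/></list>
  </rule>
  <rule class="cc_test" name="//c:c_test">
    <list name="tags"><string value="small"/></list>
  </rule>
</query>"""


def tags_by_rule_class(query_xml):
    """Count (rule class, tag) pairs found in query XML text."""
    counts = Counter()
    root = ET.fromstring(query_xml)
    for rule in root.iter("rule"):
        rule_class = rule.get("class")
        # Only the explicit "tags" attribute list is inspected here.
        for tag_list in rule.findall("list[@name='tags']"):
            for item in tag_list.findall("string"):
                counts[(rule_class, item.get("value"))] += 1
    return counts


if __name__ == "__main__":
    for (rule_class, tag), n in sorted(tags_by_rule_class(SAMPLE_XML).items()):
        print(rule_class, tag, n)
```

Because this reads query output rather than BUILD file text, tags added by macros are included, which a purely syntactic tool would miss.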

Are the bazel buildtools primarily focused on single starlark files?

I'm taking a glance over at the buildtools repo (https://github.com/bazelbuild/buildtools) and trying to understand the scope of its responsibilities as it relates to the three phases of a bazel build (loading, analysis, execution)
The repo's description states that it is A bazel BUILD file formatter and editor. I find much logic in the repo written in go-lang that lends complete support for an AST parser, starlark syntax interpreting capabilities, reformatting and rewriting of BUILD files and what not. Basically there's logic designed to operate upon a single starlark file at a time. Rereading that repo description in this light leads me to conclude that buildtools is really a single file scoped effort and presents tools that only intersect functionality wise (perhaps only partially) to those loading operations bazel conducts while building.
Question: Is it accurate that the focus of buildtools is upon the single starlark file?
If that's true then all the multiple starlark file analysis logic and so forth seems to actually be maintained over at https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib and I should not expect to find any tools for the analysis phase and beyond in the buildtools repo. Is that right?
I don't work on Buildtools, but I agree: these tools seem to focus on BUILD / .bzl files in isolation. They let you process these files in parallel and apply similar operations to each of them.
If you wonder whether these tools understand relations between these files, the answer seems to be no.
If you further wonder which tools do, the answer is Bazel's query, cquery, and aquery. I'm not aware of a programmable API for these queries though; you have to run Bazel to perform them.
buildtools has tools working on a syntactic level (it looks at the syntax tree). These tools are outside of Bazel and have no knowledge of Bazel build phases. In the future, we may expand the code to work on multiple files (for the static analysis), but it will still be independent from Bazel phases.
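To illustrate what "syntactic level" means, here is a toy single-file pass that collects literal `tags` lists from rule calls. It uses Python's `ast` module as a stand-in for the buildtools parser (it is not buildtools' API), and it assumes the BUILD file happens to be valid Python syntax, which is common but not guaranteed. The sample file contents and the `my_test_macro` name are invented.

```python
import ast

# Invented BUILD file contents, used only for illustration.
SAMPLE_BUILD = '''
load(":defs.bzl", "my_test_macro")

cc_test(
    name = "unit_test",
    srcs = ["unit_test.cc"],
    tags = ["small", "manual"],
)

my_test_macro(name = "generated")
'''


def literal_tags(build_file_text):
    """Map target name -> tags for calls that spell both out literally."""
    found = {}
    for node in ast.walk(ast.parse(build_file_text)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords}
        name, tags = kwargs.get("name"), kwargs.get("tags")
        if not (isinstance(name, ast.Constant) and isinstance(tags, ast.List)):
            continue
        found[name.value] = [
            elt.value for elt in tags.elts if isinstance(elt, ast.Constant)
        ]
    return found


if __name__ == "__main__":
    print(literal_tags(SAMPLE_BUILD))
```

The macro call is invisible to this pass: whatever targets and tags it would generate only exist after Bazel's loading phase, which is the gap that query fills.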
https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib/ is the source code of Bazel. The syntax/ directory includes the code for reading and evaluating the Starlark files. The code there is called by Skyframe. The interpreter is called by Skyframe many times in parallel, both during the loading and the analysis phases.
If you have a more specific question (what are you trying to do?), I can help more. :)

How do I get workspace status in bazel

I would like to version build artefacts with a build number from CI, passed to Bazel via workspace_status_command. Sometimes I would like to include the build number in the name of the artefact.
Is there a way to access ctx when writing a macro (I was trying to use ctx.info_file)? So far it seems I can only access such info when creating a new rule, which in this case is a bit awkward.
I guess that having a build number or similar info is a pretty common use case, so I wonder if there is a simpler way to access it.
No, you really need to define a custom rule to consume information passed from workspace_status_command through info_file and version_file. Even then you cannot access the values from Starlark; you can only pass the files to your tooling (a wrapper) and process them there. After all, (build) rules do not execute anything; they emit actions to be executed at a later phase.
Be careful though, because if you depend on info_file (STABLE_* entries), changes to the file invalidate targets depending on it. For something like CI build number, it's usually not what you want and version_file is more likely what you are after. You may want to record the id, but you usually do not want to rebuild stuff just because the build ID has changed (it's a new CI run). However, even simple inclusion of IDs could be considered problematic, if you want your results to be completely reproducible.
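The wrapper side of this can be quite small. The sketch below assumes the status files contain one `KEY value` pair per line, with the key separated from the value by the first space. `BUILD_NUMBER` is a hypothetical key that your own workspace_status_command would have to emit, and the function names are illustrative.

```python
def parse_status(text):
    """Parse workspace status text into a dict.

    Assumption: each non-empty line is "KEY value", split at the first space.
    """
    status = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        status[key] = value
    return status


def build_number(status, key="BUILD_NUMBER", default="dev"):
    """Return the CI build number, or a default for local builds."""
    return status.get(key, default)


if __name__ == "__main__":
    # Example contents; BUILD_NUMBER is a hypothetical CI-provided key.
    sample = "BUILD_TIMESTAMP 1700000000\nBUILD_NUMBER 1234\n"
    print(build_number(parse_status(sample)))
```

A rule would pass the status file to a tool like this as an action input, and the tool would stamp the number into the contents of an output whose name was fixed at analysis time.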
Having variable artifact names is a whole new problem, and there are good reasons not to do it. As proposed, the name would be decided while actions execute (when your tool reads version_file), which is after the analysis phase, where an action's outputs are fixed. The only way I am currently aware of to handle that is to use tree artifacts (declare_directory in your rule). That applies to variable input from outside the source tree; you can of course always define a Starlark variable and load it from your BUILD file.

bazel: why do targets keep increasing if build graph is known

Per this doc, the analysis and execution phases handle building out the dependency tree (among other things) and going and doing the work if needed, respectively. If that's true, I'm curious why the total number of targets keeps increasing as the build progresses (i.e., when I start a large build, bazel may report that it's built 5 out of 100 targets, but later will say it's built 20 out of 300 targets, and so forth, with the denominator increasing for a while until it levels off).
I've heard the loading and analysis phases can be intermixed. My likely incomplete or incorrect understanding is that when bazel parses a BUILD file, analysis is invoked to determine what dependencies are needed for the requested targets on the command line, and then I guess this is somehow communicated back to the loader to pull in any other BUILD files referenced by these dependencies, which could cause the loader to go out and fetch a remote repo if the dependency (and thus BUILD file) is not in the local repo.
However, my understanding was also that while dynamic build-out of the dependency graph was a potential future direction for bazel, that currently, execution does not intermix with analysis, and thus when execution begins, the full dependency tree should be available to bazel (and thus the total number of targets known)? Does bazel have the full tree, but just not want to traverse the tree to get a count in case it's big, or is something else going on here?
Note: I found a brief mention of this phenomenon here, but without an explanation as to why it happens.
The number you're seeing in the progress bar refers to actions (command-lines..ish) and not targets (e.g. //my:target). I wrote a blog post about the action graph, and here's the relevant description about it:
The action graph contains a different set of information: file-level
dependencies, full command lines, and other information Bazel needs to
execute the build. If you are familiar with Bazel’s build phases, the
action graph is the output of the loading and analysis phase and used
during the execution phase.
However, Bazel does not necessarily execute every action in the graph.
It only executes if it has to, that is, the action graph is the super
set of what is actually executed.
As to why the denominator is ever-increasing, it's because the actions-to-execute discovery within the action graph is lazy. Here's a better explanation from the Bazel TL, Ulf Adams:
The problem is that Skyframe does not eagerly walk the action graph,
but it does it lazily. The reason for that is performance, since the
action graph can be rather large and this was previously a blocking
operation (where Bazel would just hang for some time). The downside is
that all threads that walk the action graph block on actions that they
execute, which delays discovery of remaining actions. That's why the
number keeps going up during the build.
Source: https://github.com/bazelbuild/bazel/issues/3582#issuecomment-329405311
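A toy model makes the growing denominator easier to see. This is not Skyframe, just a single-threaded sketch under simple assumptions: an action's dependencies only become known when that action is first visited, and the displayed total is the number of actions discovered so far. The graph and action names are invented.

```python
def lazy_progress(deps, root):
    """Return (completed, discovered) pairs, one per finished action.

    deps maps an action to the actions it depends on (assumed acyclic).
    """
    discovered = {root}
    completed = set()
    progress = []

    def build(action):
        children = deps.get(action, [])
        discovered.update(children)  # deps become known only on this visit
        for dep in children:
            if dep not in completed:
                build(dep)  # the walker blocks here until the dep is done
        completed.add(action)
        progress.append((len(completed), len(discovered)))

    build(root)
    return progress


if __name__ == "__main__":
    graph = {
        "app": ["lib_a", "lib_b"],
        "lib_a": ["gen_1", "gen_2"],
        "lib_b": ["gen_2", "gen_3"],
    }
    for done, total in lazy_progress(graph, "app"):
        print(f"{done} / {total}")
```

In this run the total stays at 5 while the walker is busy under `lib_a`, and only grows to 6 once it reaches `lib_b` and discovers `gen_3`, which mirrors the blocking behaviour described in the quote.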
