Running bazel query without downloading external dependencies

Given a rather big repository built with Bazel, with tons of third-party dependencies in multiple languages (including heavy Docker containers), I have the following problem:
running Bazel queries triggers the download of many of these dependencies, resulting in slow query performance. Hence, the question:
Is there a way to run bazel query without having to download the dependencies?
Typical query: bazel query 'kind("source file", deps(//...) except deps(//3rdparty/...))'
I'm aware of the caching options, which I mostly use, but depending on the languages involved, things can still be slow.

After asking on Bazel's Slack channel, the response (from Sahin Yort) is not encouraging:
I don't believe that's possible, due to the nature of workspace files. A load from a workspace leads to a fetch of that workspace, because Bazel has to expand the workspace in order to know its targets. At that point, it is up to the repository rule to fetch whatever it needs, eagerly or lazily. Workspace rules usually expand BUILD files using various patterns, e.g. running an executable or using expand_template. I have little faith that it is possible to get what you want.
I'll be looking into other ways to speed things up: a likely culprit for the slowness is the action/analysis cache being invalidated by changing flags.

If you only need to parse the AST without resolving dependencies, you can use buildozer instead:
buildozer "print srcs" "//some:target"
It also supports -output_json for machine-readable output.
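For example, with the flag placed before the command per buildozer's usual CLI ordering (the target is hypothetical, as above):

    buildozer -output_json "print srcs" //some:target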

Related

Are the bazel buildtools primarily focused on single starlark files?

I'm taking a glance at the buildtools repo (https://github.com/bazelbuild/buildtools) and trying to understand the scope of its responsibilities as it relates to the three phases of a bazel build (loading, analysis, execution).
The repo's description states that it is "A bazel BUILD file formatter and editor". I find much logic in the repo, written in Go, that provides an AST parser, Starlark syntax interpretation, and reformatting and rewriting of BUILD files. Basically, the logic is designed to operate on a single Starlark file at a time. Rereading the repo description in this light leads me to conclude that buildtools is really a single-file-scoped effort, and that its tools only intersect (perhaps only partially) with the loading operations Bazel performs while building.
Question: Is it accurate that the focus of buildtools is on single Starlark files?
If that's true, then all the multi-file analysis logic seems to actually be maintained over at https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib, and I should not expect to find any tools for the analysis phase and beyond in the buildtools repo. Is that right?
I don't work on buildtools, but we agree: these tools seem to focus on BUILD / .bzl files in isolation. They let you process these files in parallel, to do similar operations on each of them.
If you wonder whether these tools understand the relations between these files, the answer seems to be no.
If you further wonder which tools do understand those relations, the answer is Bazel's query, cquery, and aquery. I'm not aware of a programmable API for these queries, though; you have to run Bazel to perform them.
buildtools has tools working at a syntactic level (they look at the syntax tree). These tools live outside of Bazel and have no knowledge of Bazel's build phases. In the future, we may expand the code to work on multiple files (for static analysis), but it will still be independent of Bazel's phases.
https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib/ is the source code of Bazel. The syntax/ directory contains the code for reading and evaluating Starlark files; that interpreter is called by Skyframe, many times in parallel, during both the loading and the analysis phases.
If you have a more specific question (what are you trying to do?), I can help more. :)

How do I get workspace status in Bazel?

I would like to version build artefacts with a CI build number passed to Bazel via workspace_status_command. Sometimes I would also like to include the build number in the name of the artefact.
Is there a way to access ctx when writing a macro (I was trying to use ctx.info_file)? So far it seems that I can only access such info from a new rule, which in this case is a bit awkward.
I guess that having a build number or similar info is a pretty common use case, so I wonder if there is a simpler way to access it.
No, you really need to define a custom rule to consume the information passed from workspace_status_command through info_file and version_file, and even then you cannot access their values directly from Starlark; you can pass the file to your tooling (a wrapper) and process the inputs there. After all, (build) rules do not execute anything; they emit actions to be executed at a later phase.
Be careful though: if you depend on info_file (the STABLE_* entries), changes to the file invalidate targets that depend on it. For something like a CI build number, that is usually not what you want, and version_file is more likely what you are after. You may want to record the ID, but you usually do not want to rebuild stuff just because the build ID has changed (it's simply a new CI run). However, even the simple inclusion of IDs could be considered problematic if you want your results to be completely reproducible.
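To illustrate the split, a workspace_status_command script might emit both kinds of keys (BUILD_NUMBER and the CI_BUILD_NUMBER variable are hypothetical names; only the STABLE_ prefix is meaningful to Bazel):

    #!/bin/bash
    # Keys prefixed with STABLE_ go to stable-status.txt (info_file);
    # all other keys go to volatile-status.txt (version_file).
    echo "STABLE_GIT_COMMIT $(git rev-parse HEAD)"
    echo "BUILD_NUMBER ${CI_BUILD_NUMBER:-0}"  # hypothetical CI-provided variable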
Having variable artifact names is a whole new problem, and there are good reasons not to go there. Generally, since (as proposed) the name would be decided during the execution of actions (by reading version_file in your tool), you are already past the analysis phase, which is where what comes out of an action is decided. The only way I am currently aware of to do that (that is, for an out-of-tree source of variable input; you can of course always define a Starlark variable and load it from your BUILD file) is to use tree artifacts (using declare_directory in your rule).
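For illustration, a minimal sketch of such a custom rule, assuming a workspace_status_command that emits a BUILD_NUMBER key (all names here are hypothetical):

    def _stamped_impl(ctx):
        out = ctx.actions.declare_file(ctx.label.name + ".build_number")
        # ctx.version_file is volatile-status.txt; as noted above, Bazel does
        # not re-run this action merely because its contents changed.
        ctx.actions.run_shell(
            inputs = [ctx.version_file],
            outputs = [out],
            command = "grep '^BUILD_NUMBER ' {src} > {dst} || true".format(
                src = ctx.version_file.path,
                dst = out.path,
            ),
        )
        return [DefaultInfo(files = depset([out]))]

    stamped_build_number = rule(implementation = _stamped_impl)

The output file can then feed a downstream tool or packaging step; the artifact name itself stays fixed at analysis time.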

What tests should be run in preparation for making contributions to Bazel?

I am preparing to make a minor bug fix to Bazel's Java code. I am working on a Linux distribution.
Following the instructions at https://bazel.build/contributing.html, I encounter problems with two of the test instructions:
In the section about "Compiling Bazel", the third paragraph states: "In addition to the Bazel binary, you might want to build the various tools Bazel uses. They are located in //src/java_tools/..., //src/objc_tools/... and //src/tools/... and their directories contain README files describing their respective utility." If I follow this, building //src/tools/... fails because there is no xcrun command in the Linux environment I am using. I suppose these are macOS platform-specific tests?
The next paragraph instructs you to build a distribution package, unpack it in a new directory, and then run: "bazel test //src/... //third_party/ijar/...". I now get an error that windows.h is missing, which I suppose comes from Windows platform-specific tests.
Some questions:
So is there an easy way to run tests only for the current platform?
Are the instructions good enough?
If the instructions should be updated, what is the best way to notify the ones managing that documentation page?
Thanks for your interest in contributing to Bazel! The bazel-dev mailing list is a better avenue for these questions.
The tests that you want to run largely depend on the changes you make, but when you make a pull request, the Bazel CI will run all of Bazel's tests to make sure that nothing breaks.
So is there an easy way to run tests only for the current platform?
It depends, and this is still a work in progress where we want to make Bazel more aware of platforms and toolchains without specifying additional flags.
In general, you don't need to modify or worry about the //src/*_tools packages unless you're making direct changes to them.
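One manual workaround, if you know which packages fail on your platform (the excluded pattern here is illustrative), is a negative target pattern; the "--" is needed so that the minus-prefixed pattern is not parsed as a flag:

    bazel test -- //src/... //third_party/ijar/... -//src/tools/...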
Are the instructions good enough?
The instructions will never be perfect, and we're always looking for ways to make them clearer and more concise.
If the instructions should be updated, what is the best way to notify the ones managing that documentation page?
Please file an issue on the GitHub repository or email the bazel-dev mailing list for further discussion.

Why is Bazel faster than Gradle?

Originally I used Gradle to build my Android project, but recently I migrated it to Bazel, and I found that Bazel is truly faster than Gradle. I want to know why, but the Bazel docs don't say much about this. Can anyone help me?
Thanks very much!
Full disclosure: I work on Bazel.
That's not an easy question to answer for two reasons. First, performance is highly dependent on the scenario. For example, we'd generally expect a clean build to be slower than a build where only a single file has changed. Second, I don't know how Gradle works internally, and they've done a lot of work recently to improve Gradle performance.
But I can talk about Bazel and what we're doing to make it fast. We've been working on build performance for ~10 years, starting long before we made it public.
The key feature is that we require all dependencies to be declared, and we track them explicitly. If you use a header file in C++, or depend on a Java library, you must declare this dependency in your BUILD file (and we enforce that these are declared by sandboxing individual actions). There are three effects from this:
First, we can heavily parallelize the build, because we know which things depend on which other things.
Second, we can make incremental builds very fast, because we can tell what parts of the build have to be re-done when you change a specific file (BUILD file, header file, source file, ...).
Third, we almost never have to do clean builds. Other build tools often require 'make clean' to get into a predictable state - since Bazel knows all the dependencies, it can get to a predictable state on every single build.
Another effect is that we can cache remotely (i.e., across users), and even execute actions on another machine, although neither of these is fully supported at the time of this writing.
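To make the explicit-dependency point concrete, a Java target's BUILD entry might look like this (all target names are hypothetical):

    java_library(
        name = "app",
        srcs = ["App.java"],
        deps = [
            ":util",                 # every compile-time dependency must be declared
            "//third_party:guava",   # or the sandboxed compilation will fail
        ],
    )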

Is there a way for one ant script to check if another is already running?

We have several automated build scripts, some of which are run automatically every 2 hours, and some of which are only ever run manually.
If a build script is started manually while another is already running, it can cause problems, such as merging untested branches into the production branch.
I'd like to prevent this happening again, and the simplest solution in my mind is to have each build script start by checking that another is not currently running.
Is there a way in ant to directly check if another ant instance/script is currently running?
If not, what's the simplest way to add such a check? My first thought is a file created at the beginning and deleted at the end of a build. I'd prefer a way that handles user-cancelled builds nicely, but it's not necessary. It needs to work if a build succeeded and if a build failed (but was not killed by the user).
If these are separate Ant processes, then I think the only solution is to define a lockfile of some sort that each Ant process needs to acquire before it can continue.
Perhaps the tempfile task could be used for this?
Actually, a sort-of semaphore based on a directory might be better, because the tempfile task really produces a unique temp file rather than a shared, well-known name. The first thing your script does is use mkdir to create a shared lock directory, but only if the directory does not exist.
Upon exit it invokes delete on this shared resource name.
The idea is that the content and name of the directory are meaningless -- it only serves as an "IPC" cooperative locking mechanism.
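A minimal sketch of that idea in Ant (the property and target names are hypothetical; note that the check-then-create step is not atomic, so this is only a cooperative lock):

    <property name="lock.dir" value="${java.io.tmpdir}/mybuild.lock"/>

    <target name="acquire-lock">
        <!-- Fail fast if another build already created the lock directory. -->
        <available file="${lock.dir}" type="dir" property="lock.present"/>
        <fail if="lock.present" message="Another build appears to be running (${lock.dir} exists)."/>
        <mkdir dir="${lock.dir}"/>
    </target>

    <target name="release-lock">
        <delete dir="${lock.dir}"/>
    </target>

Calling release-lock at the end of both successful and failed builds covers the cases in the question; a user-cancelled build would still leave the directory behind, to be removed by hand.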
This isn't particularly elegant, but I think your only other option is to set up a build server that handles scheduled and continuous builds based on various triggers. One that many people use is Jenkins (the renamed Hudson).
[Update]
Perhaps "Do I have a way to check the existence of a directory in Ant (not a file)?" would do the trick?
To be honest, this approach may work in the short term, but it just moves the problem around: instead of resetting unit test results, you'll be removing lockfiles by hand to get builds working again. My advice is to set up a CI build system, but I recognize this is a fair amount of work (and introduces a whole different set of future problems).
