How to determine absolute path of target using bazel query? - bazel

Question
Is there any way I could use bazel query or aspects to identify where on the package path bazel is picking up a package? Something similar to the which command.
The documentation suggests using --show_package_location. However, that flag is deprecated and no longer supported, see #5592. Additionally, my attempts at using it have not uncovered much useful information. I have tried bazel query //some/target/... --output label_kind --show_package_location as well as other permutations with bazel build, and it doesn't add anything to the console output.
Motivation
I have two different directories on my package path for fetch, query and build.
--package_path=%workspace%:%workspace%/__fuse__
This configuration supports a workflow where users perform sparse checkouts of our large repository, while still being able to build code that has not been checked out locally. When building targets, Bazel checks for the locally checked-out version of a package, and if that doesn't exist, it searches a read-only fuse mount.
Sometimes it's unclear to users where a package is getting picked up from, i.e. whether it's the locally checked out version or the one served from fuse. This becomes problematic when they delete or move a Bazel package, and Bazel picks up the version on the fuse mount.
It'd be nice if I could point them to a command that maps each package to the location it's being picked up from. For example, if I ran the command on ...
//some/package/foo --> package_path/some/package/foo
//some/package/bar --> other_package_path/some/package/bar

I completely missed this in the bazel query documentation.
With bazel query, I simply needed to add --output location. Given a query like:
bazel query //some/package/... --output location
Then bazel query will output
/absolute/path/some/package/BUILD:lineno:colno: target_kind label
for each target in //some/package/...
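To get the package-to-directory mapping asked for in the question, the location output can be post-processed. The following is a minimal sketch in Python, assuming each line has the shape `<BUILD file>:<line>:<col>: <kind> <label>`; the function name `packages_to_dirs` and the sample paths are made up for illustration.

```python
import os
import re

# Assumed line shape: "<abs path to BUILD file>:<line>:<col>: <kind> <label>"
LOCATION_LINE = re.compile(
    r"^(?P<build_file>.+?):(?P<line>\d+):(?P<col>\d+):?\s+(?P<rest>.+)$"
)


def packages_to_dirs(query_output):
    """Map each package (e.g. //some/package/foo) to the directory of its BUILD file."""
    mapping = {}
    for raw in query_output.splitlines():
        match = LOCATION_LINE.match(raw.strip())
        if not match:
            continue  # skip blank lines and anything that is not a location line
        label = match.group("rest").split()[-1]  # label is the last field
        package = label.split(":", 1)[0]         # drop the ":target" suffix
        mapping.setdefault(package, os.path.dirname(match.group("build_file")))
    return mapping


# Hypothetical sample output; real paths depend on your package path.
SAMPLE = """\
/home/me/repo/some/package/foo/BUILD:3:11: cc_library rule //some/package/foo:foo
/home/me/repo/__fuse__/some/package/bar/BUILD:7:10: cc_binary rule //some/package/bar:bar
"""

if __name__ == "__main__":
    for package, directory in packages_to_dirs(SAMPLE).items():
        print(f"{package} --> {directory}")
```

In practice the input would come from running bazel query //some/package/... --output location, for example via subprocess.run(..., capture_output=True, text=True), instead of the hard-coded sample.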

Related

Bazel rules with unknown output filenames

I have a command that compiles and runs a program, but the intermediate files are randomly named (but contained within a directory). E.g.
build foo.src bar.src -o output_dir
run output_dir
Bazel requires me to pre-declare all of the outputs of my rule, but I can't do that because they're randomly named. Can I somehow name an entire directory instead?
The only alternative I can think of is having the rule zip/unzip the directory before/after it runs the commands, which is a pretty awful solution.
Edit: I found an issue exactly describing the "just zip/unzip everything" solution here. The closing comment says to just use the rules from rules_pkg to zip/unzip stuff. Unfortunately it requires Python too.
Some of the comments in that thread suggest you can use declare_directory() but I don't think that really works.
There are tree artifacts. An example of how to use a tree artifact can be found here.
Tree artifacts are problematic for caching, since Bazel is not aware of the content of the corresponding directory. If for some reason the content of a tree artifact differs between two machines that use the same Bazel cache and the same Bazel configuration, you are in trouble.

Bazel builds from scratch ignoring cache

I observe that my Bazel build agent frequently builds the project from scratch (including compiling grpc, which remains unchanged) instead of taking results from cache. Is there a way, like query or cquery (pardon my ignorance), to determine why the cache is considered invalid for a particular target? Or are there any techniques to tackle the cache invalidation problem?
This is how a Bazel build works. When running a build or a test, Bazel does the following:
Loads the BUILD files relevant to the target.
Analyzes the inputs and their dependencies, applies the specified build rules, and produces an action graph.
Executes the build actions on the inputs until the final build outputs are produced.
If you have any specific suspicions about the cause, can you please share the complete details?
This is most likely due to the rebuild sensitivity to particular environment variables. Many build actions will read from environment variables and use them to change the outputs. Bazel keeps track of this and will rebuild seemingly unchanged remote targets when your env changes.
To demonstrate this:
Build grpc twice and confirm that the second build is served from cache
Change the PATH environment variable (your IDE may do this without you knowing)
mkdir ~/bin && export PATH=$PATH:~/bin
Rebuild grpc (This should trigger a complete rebuild)
There are a couple of helpful flags to combat this rebuild sensitivity, and I'd recommend adding them to your bazelrc.
--incompatible_strict_action_env: uses a static value for PATH instead of inheriting it from your shell.
--action_env: sets specific environment variables for actions, as needed for your build.
# file //.bazelrc
# Don't source environment from shell
build --incompatible_strict_action_env
# Use action_env as needed for your project
build --action_env=CC=clang
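Why an unrelated PATH change causes a rebuild can be illustrated with a toy model. The sketch below is not Bazel's implementation; it only assumes that a cache key is derived from both the command line and the environment an action runs with. The function `toy_action_key` and all values are hypothetical.

```python
import hashlib


def toy_action_key(command, env):
    """Digest of a command line plus its environment (toy model, not Bazel's code)."""
    digest = hashlib.sha256()
    for arg in command:
        digest.update(arg.encode() + b"\0")
    for name in sorted(env):  # sort so the key does not depend on dict order
        digest.update(f"{name}={env[name]}".encode() + b"\0")
    return digest.hexdigest()


command = ["clang", "-c", "grpc.cc", "-o", "grpc.o"]

# Environment inherited from the shell, before and after export PATH=$PATH:~/bin
shell_env = {"PATH": "/usr/bin:/bin"}
changed_env = {"PATH": "/usr/bin:/bin:/home/me/bin"}

key_before = toy_action_key(command, shell_env)
key_after = toy_action_key(command, changed_env)

if __name__ == "__main__":
    # Same command, different PATH: the keys differ, so a cached result is not reused.
    print("cache hit" if key_before == key_after else "cache miss")
```

With a fixed environment, the key no longer depends on whatever PATH the invoking shell or IDE happens to have, which is the effect the flags above aim for.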

How do I debug an annotation processor in a bazel java_library rule?

I have added an annotation processor as a java_plugin and have added this into the plugins section of my java_library rule. I was wondering what are the bazel options to step through the annotation processor code and the javac compiler's code?
One way to do this is to run bazel build with --subcommands. Bazel will then print out all the commands it executes during a build. You can then find the javac invocation you're interested in, copy the command line (including the cd part so you're in the correct directory), modify the command line to include the debugging options, and run it manually. Then you can debug it like you would any Java program.
One thing to note is that bazel will print only the commands that it actually runs in that build, so if the action you're interested in is already up-to-date, you may have to delete one of its outputs (e.g. the jar output of that library) to get bazel to re-run the action.
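The "modify the command line to include the debugging options" step can be sketched as follows. This is only an illustration, assuming the copied command starts with a java or javac launcher and that you want the JVM to wait for a debugger on a port of your choice; the helper `with_debugger` and the example command are hypothetical, while the JDWP agent option and javac's -J prefix are standard JDK options.

```python
import os
import shlex


def with_debugger(command_line, port=5005):
    """Return the command line with a JDWP agent option inserted after the launcher."""
    argv = shlex.split(command_line)
    # suspend=y makes the JVM wait until a debugger attaches.
    flag = f"-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address={port}"
    if os.path.basename(argv[0]) == "javac":
        flag = "-J" + flag  # javac forwards -J options to the underlying JVM
    return shlex.join([argv[0], flag] + argv[1:])


if __name__ == "__main__":
    # Hypothetical command; copy the real one from the --subcommands output.
    print(with_debugger("path/to/bin/java -jar SomeCompiler.jar @params"))
```

After running the modified command from the directory given in the cd part, attach your IDE's remote debugger to the chosen port.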

Bazel list of tools in-scope for targets running shell commands

Is there a list of tools that are assumed to be always in the PATH when a Bazel target runs a shell command?
This is relevant for creating isolated build environments. AFAIU (see https://github.com/NixOS/nixpkgs/pull/50765#issuecomment-440009735) by default Bazel picks up tools from /bin and /usr/bin when in strict mode.
But what can ultimately be assumed about the minimal content of those? For example, I saw awk used liberally. But then git as well, which sounds borderline.
I imagine the exact set might correspond to whatever Google-internal Bazel expects to find in Google's build images bin directories. At least for BUILD rules open-sourced by Google.
Is there such a definitive list? Thank you.
As far as I can tell, your assessment of the tool usage is correct, and unfortunately I'm not aware of such a list.
There should be one, and Bazel should treat the shell as a toolchain. Alas, nobody is working on that at the moment. See https://github.com/bazelbuild/bazel/issues/5265.

Can I ignore some folder (containing bazel configuration) while building the project recursively?

For some reason, practical or not, the rxjs npm package ships BUILD.bazel files inside the package, so when I try to build my project (which has a node_modules folder), Bazel automatically tries to build something that it's not supposed to build at all.
My question is: what is the canonical way of ignoring a specific folder while building a Bazel project recursively?
The only way to achieve what I'm looking for that I know of is to point to it explicitly in the command line
bazel build //... --deleted_packages=node_modules/rxjs/src (see user manual)
But I don't want to type this every time.
Bazel recently added a feature for ignoring folders (similar to gitignore).
Simply add a line containing node_modules to the .bazelignore file in the root of your project.
Yes, this is expressible as a bazel target pattern:
bazel build -- //... -//node_modules/rxjs/src/...
Full documentation is available at https://docs.bazel.build/versions/master/user-manual.html#target-patterns
