How to avoid deleting cached files after build in Bazel - bazel

I have a genrule in Bazel that is supposed to manipulate some files. I think I'm not accessing these files by the correct path, so I want to look at the directory structure that Bazel is creating so I can debug.
I added some echo statements to my genrule and I can see that Bazel is working in the directory /home/lyft/.cache/bazel/_bazel_lyft/8de0a1069de8d166c668173ca21c04ae/sandbox/linux-sandbox/1/execroot/. However, after Bazel finishes running, this directory is gone, so I can't look at the directory structure.
How can I prevent Bazel from deleting its temporary files so that I can debug what's happening?

Since this question is a top result for "keep sandbox files after build bazel" Google search and it wasn't obvious for me from the accepted answer, I feel the need to write this answer.
Short answer
Use --sandbox_debug. If this flag is passed, Bazel will not delete the files inside the sandbox folder after the build finishes.
Longer answer
Run bazel build with --sandbox_debug option:
$ bazel build mypackage:mytarget --sandbox_debug
Then you can inspect the contents of the sandbox folder for the project.
To get the location of the sandbox folder for current project, navigate to project and then run:
$ bazel info output_base
/home/johnsmith/.cache/bazel/_bazel_johnsmith/d949417420413f64a0b619cb69f1db69 # output will be something like this
Inside that directory there will be sandbox folder.
Possible caveat: (I'm NOT sure about this but) It's possible that some of the files are missing in sandbox folder, if you previously ran a build without --sandbox_debug flag and it partially succeeded. The reason is Bazel won't rerun parts of the build that already succeeded, and consequently the files corresponding to the successful build parts might not end up in the sandbox.
If you want to make sure all the sandbox files are there, clean the project first using either bazel clean or bazel clean --expunge.

You can use --spawn_strategy=standalone.
You can also use --sandbox_debug to see which directories are mounted to the sandbox.
You can also set the genrule's cmd to find . > $# to debug what's available to the genrule.
Important: declare all srcs/outs/tools that the genrule will read/write/use, and use $(location //label/of:target) to look up their path. Example:
genrule(
name = "x1",
srcs = ["//foo:input1.txt", "//bar:generated_file"],
outs = ["x1out.txt", "x1err.txt"],
tools = ["//util:bin1"],
cmd = "$(location //util:bin1) --input1=$(location //foo:input1.txt) --input2=$(location //bar:generated_file) --some_flag --other_flag >$(location x1out.txt) 2>$(location x1err.txt)",
)

Related

Clean up unreachable generated files in Bazel

Suppose I have a very minimal project with an empty WORKSPACE and a single package defined at the project root that simply uses touch to create a file called a, as follows:
genrule(
name = "target",
cmd = "touch $#",
outs = ["a"],
)
If I now run
bazel build //:target
the package will be "built" and the a file will be available under bazel-genfiles.
Suppose I now change the BUILD to write the output to a different file, as follows:
genrule(
name = "target",
cmd = "touch $#",
outs = ["b"],
)
Building the same target will result in the file b being available under bazel-genfiles. a will still be there though, even though at this point it's "unreachable" from within the context of the build definition.
Is there a way to ask Bazel to perform some sort of "garbage collection" and remove files (and possibly other content) generated by previous builds that are no longer reachable as-per the current build definition, without getting rid of the entire directory? The bazel clean command seems to adopt the latter behavior.
There seems to be a feature in the works, but apparently it cannot be performed on demand, but rather it executes automatically as soon as a certain threshold has been reached.
Note that running bazel clean will not actually delete the external directory. To remove all external artifacts, use bazel clean --expunge
bazel clean is the way to remove these.
The stale outputs aren't visible to actions, provided you build with sandboxing. (Not yet available on Windows, only on Linux and macOS.)
What trouble do these files make?

Can I ignore some folder (containing bazel configuration) while building the project recursively?

For some reasons, practical or not, rxjs npm package stores BAZEL.build configuration in the package, so when I'm trying to build my project (which has node_modules folder) bazel tries automatically to build something that it's not supposed to build at all.
My question would be - what is canonical way of ignoring some specific folder while building bazel project recursively?
The only way to achieve what I'm looking for that I know of is to point to it explicitly in the command line
bazel build //... --deleted_packages=node_modules/rxjs/src (see user manual)
But I don't want to type this every time.
Bazel recently added a feature for ignoring folders (similar to gitignore).
Simply add node_modules to the .bazelignore file in the root of your project.
Yes, this is expressible as a bazel target pattern:
bazel build -- //... -//node_modules/rxjs/src/...
Full documentation is available at https://docs.bazel.build/versions/master/user-manual.html#target-patterns

Bazel ignore subdirectory on full build

In my repository I have some files with the name "build" (automatically generated and/or imported, spread around elsewhere from where I have my bazel build files). These seem to be interpreted by Bazel as its BUILD files, and fail the full build I try to run with bazel build //...
Is there some way I can tell Bazel in a settings configuration file to ignore certain directories altogether? Or perhaps specify the build file names as something other than BUILD, like BUILD.bazel?
Or are my options:
To banish the name build from the entire repository.
To add a gigantic --deleted_packages=<...> to every run of build.
To not use full builds, instead specifying explicit targets.
I think this is a duplicate of the two questions you linked, but to expand on what you asked about in your comment:
You don't have to rename them BUILD.bazel, my suggestion is to add an empty BUILD.bazel to those directories. So you'd end up with:
my-project/
BUILD
src/
build/
stuff-bazel-shouldn't-mess-with
BUILD.bazel # Empty
Then Bazel will check for targets in BUILD.bazel, see that there are none, and won't try to parse the build/ directory.
And there is a distressing lack of documentation about BUILD vs. BUILD.bazel, at least that I could find.

Bazel- How to recursively glob deleted_packages to ignore maven outputs?

I have a mutli-module project which I'm migrating from Maven to Bazel. During this migration people will need to be able to work on both build systems.
After an mvn clean install Maven copies some of the BUILD files into the target folder.
When I later try to run bazel build //... it thinks the BUILD files under the various target folders are valid packages and fails due to some mismatch.
I've seen deleted_packages but AFAICT it requires I specify the list of folders to "delete" while I can't do that for 200+ modules.
I'm looking for the ability to say bazel build //... --deleted_packages=**/target.
Is this supported? (my experimentation says it's not but I might be wrong). If it's not supported is there an existing hack for it?
Can you use your shell to find the list of packages to ignore?
deleted=$(find . -name target -type d)
bazel build //... --deleted_packages="$deleted"
#Laurent's answer gave me the lead but Bazel didn't accept relative paths and required I add both classes and test-classes folders under target to delete the package so I decided to answer with the complete solution:
#!/bin/bash
#find all the target folders under the current working dir
target_folders=$(find . -name target -type d)
#find the repo root (currently assuming it's git based)
repo_root=$(git rev-parse --show-toplevel)
repo_root_length=${#repo_root}
#the current bazel package prefix is the PWD minus the repo root and adding a slash
current_bazel_package="/${PWD:repo_root_length}"
deleted_packages=""
for target in $target_folders
do
#cannonicalize the package path
full_package_path="$current_bazel_package${target:1}"
classes_full="${full_package_path}/classes"
test_classes_full="${full_package_path}/test-classes"
deleted_packages="$deleted_packages,$classes_full,$test_classes_full"
done
#remove the leading comma and call bazel-real with the other args
bazel-real "$#" --deleted_packages=${deleted_packages:1}
This script was checked in under tools/bazel which is why it calls bazel-real at the end.
I'm sorry I don't think this is supported. Some brainstorming:
Is it an option to point maven outputs somewhere else?
Is is an option not to use //... but explicit target(s)?
Maybe just remove the bad BUILD files before running bazel?

Prevent Jenkins Job from escaping the workspace

A developer committed a makefile with an output path of ....\SomeDir. When the project was built the output was dumped into Jenkins_Home. Is there any way to fail a build if it attempts to escape the workspace directory? Thanks,
-Michael
Worth putting some strong write permissions in place. Make sure that the user that Jenkins runs as only has permission to write to Jenkins_Home and its subdirectories.
You could also make sure that directories under Jenkins_Home only have write permission if it is necessary (workspace, for logging, etc.)
If a makefile attempts to use an absolute path then the build should fail with a permissions error.
I think the solution is that, before running the command, to do a change directory to workspace. If you have the workspace set up, then you can write something like cd $WORKSPACE or cd %WORKSPACE% ( for batch commands).

Resources