I want to create some "build traceability" functionality, and include the actual bazel command that was run to produce one of my build artifacts. So if the user did this:
bazel run //foo/bar:baz --config=blah
I want to actually get the string "bazel run //foo/bar:baz --config=blah" and write that to a file during the build. Is this possible?
Stamping is the "correct" way to get information like that into a Bazel build. Note the implications around caching though. You could also just write a wrapper script that puts the command line into a file or environment variable. Details of each approach below.
I can think of three ways to get the information you want via stamping, each with differing tradeoffs.
First way: Hijack --embed_label. This shows up in the BUILD_EMBED_LABEL stamping key. You'd add a line like build:blah --embed_label blah in your .bazelrc. This is easy, but that label is often used for things like release_50, which you might want to preserve.
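As a concrete sketch of that first approach, the .bazelrc line and invocation could look like this (the config name blah is just a placeholder):

build:blah --embed_label=blah

bazel build --config=blah //foo/bar:baz

The value then shows up under the BUILD_EMBED_LABEL stamping key, which the genrule further down shows how to read out.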
Second way: hijack the hostname or username. These show up in the BUILD_HOST and BUILD_USER stamping keys. On Linux, you can write a shell script at tools/bazel which will automatically be used to wrap bazel invocations. In that shell script, you can use unshare --uts --map-root-user, which will work if the machine is set up to enable bazel's sandboxing. Inside that new namespace, you can easily change the hostname and then exec the real bazel binary, like the default /usr/bin/bazel shell script does. That shell script has full access to the command line, so it can encode any information you want.
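A rough sketch of such a wrapper, assuming unshare is available and that the outer /usr/bin/bazel script exports BAZEL_REAL (adjust the fallback path if yours doesn't):

#!/bin/bash
# tools/bazel: re-exec inside a new UTS namespace, encode the command line
# into the hostname, then hand off to the real bazel binary.
set -euo pipefail
if [[ -z "${IN_UTS_NS:-}" ]]; then
  exec env IN_UTS_NS=1 unshare --uts --map-root-user "$0" "$@"
fi
# Hostnames only allow letters, digits, and hyphens, and are limited in length.
hostname "bazel-$(printf '%s' "$*" | tr -c 'A-Za-z0-9' '-' | cut -c1-60)"
exec "${BAZEL_REAL:-/usr/bin/bazel-real}" "$@"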
Third way: put it into an environment variable and have a custom --workspace_status_command that extracts it into a stamping key. Add a line like build:blah --action_env=MY_BUILD_STYLE=blah to your .bazelrc, and then do echo STABLE_MY_BUILD_STYLE ${MY_BUILD_STYLE} in your workspace status script. If you want the full command line, you could have a tools/bazel wrapper script put that into an environment variable, and then use build --action_env=MY_BUILD_STYLE to preserve the value and pass it to all the actions.
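Put together, that third approach might look like the following sketch (the script path and variable name are placeholders):

# .bazelrc
build:blah --action_env=MY_BUILD_STYLE=blah
build --workspace_status_command=tools/workspace_status.sh

# tools/workspace_status.sh
#!/bin/bash
echo "STABLE_MY_BUILD_STYLE ${MY_BUILD_STYLE:-unset}"

Whether the status script sees a variable exported by a tools/bazel wrapper depends on how Bazel forwards the client environment, so verify that on your setup before relying on it.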
Once you pick a stamping key to use, src/test/shell/integration/stamping_test.sh in the Bazel source tree is a good example of writing stamp information to a file. Something like this:
genrule(
    name = "stamped",
    outs = ["stamped.txt"],
    cmd = "grep BUILD_EMBED_LABEL bazel-out/volatile-status.txt | cut -d ' ' -f 2 > $@",
    stamp = True,
)
If you want to do it without stamping, just write the information to a file in the source tree from a tools/bazel wrapper. You'd want to put that file in your .gitignore, of course. echo "$@" > cli_args is all it takes to dump the arguments to a file, and then you can use that file as a source file like normal in your build. This approach is the simplest, but it interacts the worst with Bazel's caching, because everything that depends on that file will be rebuilt every time, with no way to control it.
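For example, a minimal wrapper along those lines (assuming BAZEL_REAL is exported by the outer /usr/bin/bazel script, as above; note it writes the file relative to wherever bazel was invoked):

#!/bin/bash
# tools/bazel: dump the command line to a file in the source tree, then delegate.
echo "bazel $*" > cli_args
exec "${BAZEL_REAL:-/usr/bin/bazel-real}" "$@"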
Related
I am trying to build a Docker image with this code:
container_image(
    name = "docker_image",
    base = "@java_base//image",
    files = [":executable_deploy.jar"],
    cmd = ["java", "-jar", "executable_deploy.jar"],
    env = {"VERSION": "$(VERSION)"},
)
I want to pass a variable to the target built so it can be replaced in $(VERSION). Is this possible?
I have tried with VERSION=1.0.0 bazel build :docker_image, but I get an error:
$(VERSION) not defined.
How can I pass that variable?
According to the docs:
The values of this field (env) support make variables (e.g., $(FOO)) and
stamp variables; keys support make variables as well.
But I don't understand exactly what that means.
Those variables can be set via the --define flag.
There is a section on the rules_docker page about stamping which covers this.
Essentially you can do something like:
bazel build --define=VERSION=1.0.0 //:docker_image
It is also possible to source these key / value pairs from the stable-status.txt and volatile-status.txt files. The user manual page for bazel shows how to use these files, and the use of the --workspace_status_command to populate them.
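As a sketch, the status-command side of that could look like this (the script path and key name are placeholders; git describe is just one way to produce a value):

# .bazelrc
build --workspace_status_command=tools/workspace_status.sh

# tools/workspace_status.sh
#!/bin/bash
echo "STABLE_VERSION $(git describe --tags --always)"

Keys prefixed with STABLE_ end up in stable-status.txt; other keys end up in volatile-status.txt.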
For setting defaults, you could use a .bazelrc file, with something like the following as the contents:
build --define=VERSION=0.0.0-PLACEHOLDER
The flags passed on the command line will take precedence over those in the .bazelrc file.
It's worth mentioning that changing --define values will cause Bazel to analyze everything again, which, depending on the size of the graph, may take some time; only the affected actions will be re-executed, though.
I was wondering if it's possible to set platform-specific default Bazel build flags.
For example, we want to use --workspace_status_command but this must be a shell script on Linux and must point towards a batch script for Windows.
Is there a way we can write in the tools/bazel.rc file something like...
if platform=WINDOWS build: --workspace_status_command=status_command.bat
if platform=LINUX build: --workspace_status_command=status_command.sh
We could generate a .bazelrc file by having the users run a script before building, but it would be cleaner/nicer if this was not necessary.
Yes, kind of. You can specify config-specific bazelrc entries, which you can select by passing --config=<configname>.
For example your bazelrc could look like:
build:linux --cpu=k8
build:linux --workspace_status_command=/path/to/command.sh
build:windows --cpu=x64_windows
build:windows --workspace_status_command=c:/path/to/command.bat
And you'd build like so:
bazel build --config=linux //path/to:target
or:
bazel build --config=windows //path/to:target
You have to be careful not to mix semantically conflicting --config flags (Bazel doesn't prevent you from doing so). It will work, but the results may be unpredictable when the configs tinker with the same flags.
Passing --config to all commands is tricky, it depends on developers remembering to do this, or controlling the places where Bazel is called.
I think a better answer would be to teach the version control system how to produce the values, like by putting a git-bazel-stamp script on the $PATH/%PATH% so that git bazel-stamp works.
Then we need workspace_status_command to allow commands from the PATH rather than a path on disk.
The proper way to do this is to wrap your cc_library with a custom macro and pass hardcoded flags to copts. For a full reference, look at envoy_library.bzl.
In short, your steps:
Define a macro to wrap cc_library:
def my_cc_library(name, copts = [], **kwargs):
    # Native rules must be called with keyword arguments from a .bzl macro.
    native.cc_library(name = name, copts = copts + my_flags(), **kwargs)
Define the my_flags() macro as follows:
config_setting(
    name = "windows_x86_64",
    values = {"cpu": "x64_windows"},
)

config_setting(
    name = "linux_k8",
    values = {"cpu": "k8"},
)

def my_flags():
    x64_windows_options = ["/W4"]
    k8_options = ["-Wall"]
    return select({
        ":windows_x86_64": x64_windows_options,
        ":linux_k8": k8_options,
        "//conditions:default": [],
    })
How it works:
Depending on the --cpu flag value, my_flags() will return different flags.
This value is resolved automatically based on the platform: on Windows it's x64_windows, and on Linux it's k8.
Then your macro my_cc_library will supply these flags to every target in the project.
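In a BUILD file, using the macro would then look something like this (a sketch; //tools:my_rules.bzl is a placeholder for wherever you put the macro):

load("//tools:my_rules.bzl", "my_cc_library")

my_cc_library(
    name = "mylib",
    srcs = ["mylib.cc"],
    hdrs = ["mylib.h"],
)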
A better way of doing this has been added since you asked, sometime in 2019.
If you add common --enable_platform_specific_config to your .bazelrc, then --config=windows will automatically apply on Windows hosts, --config=macos on macOS, --config=linux on Linux, and so on.
You can then add lines to your .bazelrc like:
build:windows --windows-flags
build:linux --linux-flags
There is one downside, though: this works based on the host platform rather than the target platform. So if you're cross-compiling, e.g. to mobile, and want different flags there, you'll have to go with a solution like envoy's (see the other answer), or (probably better) add transitions into your build graph. (See the discussions here and here; "flagless builds" are still under development, but there are usable hacks in the meantime.) You could also use the temporary platform_mappings API.
References:
Commit that added this functionality.
Where it appears in the Bazel docs.
I would like to add this to my .bazelrc, but the $(whoami) doesn't expand as it would in a shell.
startup --output_user_root=/tmp/bazel/out/$(whoami)
It produces the literal result:
/tmp/bazel/out/$(whoami)/faedb999bdce730c9c495251de1ca1a4/execroot/__main__/bazel-out/
Is there any way to do what I want: adding a name/hash to the option in the .bazelrc file?
Edit: what I really want is to set the outputRoot to /tmp/bazel/out without using an environment variable, and to let Bazel create its user and workspace hash directories there.
You can run Bazel from a wrapper script. In fact, that's exactly what the bazel binary is (at least on Linux): it's a wrapper script that calls bazel-real. You can edit this wrapper script if you like, or rename it to bazel.sh and write your own wrapper.
/usr/bin/bazel is a script which looks for //tools/bazel, and if it exists, calls it. Otherwise, it calls bazel-real. This lets you check Bazel into your repo, or otherwise modify how it gets called. We use that to download a specific version of bazel, extract it, and then call it.
I would recommend creating //tools/bazel, and having that do your modification. It can then either call a versioned version of bazel, or call bazel-real. That keeps your modifications local to your repo rather than global.
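For the question above, such a //tools/bazel wrapper could be a short shell script along these lines (a sketch; it assumes the calling script exports BAZEL_REAL, which the stock /usr/bin/bazel wrapper does, so adjust the fallback path if needed):

#!/bin/bash
# //tools/bazel: inject a per-user output root as a startup option,
# then pass the rest of the command line to the real bazel binary.
exec "${BAZEL_REAL:-/usr/bin/bazel-real}" \
  --output_user_root="/tmp/bazel/out/$(whoami)" "$@"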
I am trying to get a value of the full current directory path from within .bzl rule. I have tried following:
ctx.host_configuration.default_shell_env.PATH returns "/Users/[user_name]/.rbenv/shims:/usr/local/bin:/usr/bin:/bin:...
ctx.bin_dir.path returns bazel-out/local-fastbuild/bin
pwd = ctx.expand_make_variables("cmd", "$$PWD", {}) returns the string $PWD. I don't think this method is helpful for me, but I may just be using it wrong.
What I need is the directory under which the cmd that runs Bazel .bzl rule is running. For example: /Users/[user_name]/git/workspace/path/to/bazel/rule.bzl or at least first part of the path prior to the WORKSPACE directory.
I can't use pwd because I need this value before I call ctx.actions.run_shell()
Are there no attributes in Bazel configurations that hold this value?
The goal is to have hermetic builds, so you shouldn't depend on the absolute path.
Feel free to use pwd inside the command of ctx.actions.run_shell() (for reproducible builds, be careful, avoid putting the absolute path in the generated files).
Edit.
Technically, there are some workarounds. For example, you can pass the path via the --define flag:
bazel build :all --define=path=$(pwd)
Then the value will be available using ctx.var["path"].
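For illustration, reading that value back inside a rule implementation is just a dictionary lookup (a minimal sketch; the rule name and output file are made up):

def _impl(ctx):
    # Values passed with --define=key=value are exposed through ctx.var.
    path = ctx.var.get("path", "")
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.write(out, path)
    return [DefaultInfo(files = depset([out]))]

path_echo = rule(implementation = _impl)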
Based on your comment below, you want the path to declare an output. Let me repeat: You shouldn't use an absolute path to declare the output file. Declare an output in your package. Then ask the tool you call to use that output.
For example, when you call gcc, you can use -o to specify the output. When a tool writes to stdout, use the shell to redirect it. If the tool is really not flexible, you may want to wrap it with your own script (e.g. call the tool and copy the output file).
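For instance, redirecting a stdout-only tool into a declared output in a genrule could look like this (a sketch; //tools:mytool is a placeholder):

genrule(
    name = "tool_output",
    outs = ["tool_output.txt"],
    cmd = "$(location //tools:mytool) > $@",
    tools = ["//tools:mytool"],
)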
Using an absolute path here is not the right solution. For example, it should be possible to execute the action on a remote machine (where your absolute path won't make sense).
Zip may be a reasonable solution. It's useful when you cannot know in advance the number or the names of the output files.
Context
I am writing a repository rule that invokes another Bazel project. My current approach is to build the additional project as a deploy jar. I would like a user to be able to instantiate the rule like:
jar_path = "some/relative/path"
my_rule(name = "something", p_arg = "m_arg", binary = jar_path)
and then given the jar_path and the arguments, I would like the repository rule to execute the following command in the shell:
java -jar $(SOME_JAR) $(ARGUMENTS_PROVIDED_BY_RULE)
Problem
First, it's unclear how best to accomplish the deploy jar approach. So far, I have attempted two different approaches, with varying levels of success. For reference, I have skimmed through the scala_rules, the maven_rules, and the skylark cookbook.
Second, and more importantly, I am not sure whether the deploy jar is the best route to accomplishing my goals. Again, my interest is to invoke a target from an external Bazel project, that is currently hosted on github. (So feasibly, I could try to fetch the project using the http_archive rule).
Below, I describe the attempts I have made.
Approach 1
My first approach involved trying to execute the command using the command field in ctx.action. I tried various enumerations of
java -jar {computed_absolute_path_of_deploy_jar} {args_passed_from_instantiation}.
My biggest issue here was with determining the absolute path of the deploy jar. The file's root path would contain some additional information. For example, it would look something like this:
/abs/olute/path[ something ]/rela/tive/path
As a side note, I'm not sure if this is a bug/nit, but File.root.path evaluated to None, despite File.none not being None.
My first approach here was to try to use skylark [ctx.binary].
Approach 2
Next thing I tried was to mimic the input binary example from the docs. This was also unsuccessful. The issue was that the actual binary could not be found. Here is how I configured it.
First, I relaxed the repository rule into a regular skylark rule.
def _test_binary(ctx):
    ctx.action(
        ...,
        arguments = [ctx.attr.p_arg],
        executable = ctx.executable.binary,
    )

test_binary = rule(
    ...,
    attrs = {
        "binary": attr.label(mandatory = True, cfg = "host", allow_files = True, executable = True),
        ...
    },
)
Then, in my external project, I loaded the skylark rule into the WORKSPACE file. Finally, I called the macro from one of my BUILD files as follows:
load("#something_rule//:something_rule.bzl", "test_binary")
test_binary(name = "hello", p_arg = "hello", binary = "script.sh")
The script is a one-liner, java -jar something_deploy.jar -- -arg:$1, and is in the same directory as the BUILD file.
Bazel complains that src/script.sh does not exist. I presume this is because it is looking for the file in /private/var/tmp/_bazel_username/somehash/relative_path. In response, I tried to pass the absolute path, which is not allowed.
Cheers.
It looks like you're mixing up repository rules with build extensions ("normal" rules). A good rule of thumb is:
Repository rules are for getting sources onto your system or symlinking them to a place Bazel can see them.
Build extensions are for everything else: compiling, copying files, running binaries, etc.
I don't actually think you need to use either, for this. You say that the other project is on GitHub, so you can add the following to your WORKSPACE file:
http_archive(
    name = "other_project",
    ...
)
Then, in your BUILD file:
genrule(
    name = "run-a-jar",
    srcs = ["@other_project//some/relative:path"],
    cmd = "java -jar $(location @other_project//some/relative:path) -- arg1 arg2 > $@",
    outs = ["jar-output"],
)
You shouldn't need to use the _deploy.jar target, since you're not moving the jar out of its project (_deploy.jar is useful when you need to relocate it).
Other things from your question:
I'm not sure if this is a bug/nit, but the File.root.path, evaluated to None,
Are you sure it didn't evaluate to ""? The path is relative to the execution root, so for sources, it will always be "" (for outputs, it'll be bazel-out/local-fastbuild/bin or similar).
Bazel complains that src/script.sh does not exist.
Passing -s to Bazel can really help debugging Skylark rules. You can see exactly where it is looking.