Bazel WORKSPACE conditionally define exactly one of two `git_repository`s - bazel

I maintain two Python libraries, A and B, each of which partially uses Bazel to build its non-Python code. B depends on A at the Bazel level, so B needs a remote repository for A.
For the released version of B, I'd like to reference A in canonical (pinned) form, for example a git_repository with a commit hash.
git_repository(
    name = "A",
    commit = "...",
    remote = "https://github.com/foo/A",
)
During development, I'd like to reference A in symbolic form, for example a git_repository that tracks the master branch.
git_repository(
    name = "A",
    branch = "master",
    remote = "https://github.com/foo/A",
)
And I'd like to use exactly one of them. After some research I found that there is no "conditional branch" mechanism (driven by command-line flags or environment variables) that I can use at the WORKSPACE level. I'm asking for any options I may have missed.
The following are the alternatives I've looked into, none of which I'm fully happy with.
Using local_repository during development is not attractive: in reality there are 8+ libraries with chained dependencies, and I don't think it is realistic to manually clone them and keep them up to date.
Using alias() with select() at the BUILD level is not very attractive either, because B uses tens of A's Bazel targets. Defining an alias for each of them is not maintainable at scale. (Or is there a way to define an alias at the package level?)
# WORKSPACE
git_repository(name = "A", ...)
git_repository(name = "A_master", ...)
# BUILD
config_setting(name = "use_master", ...)
alias(
    name = "A_pkg_label",  # There are too many targets to declare
    actual = select({
        ":use_master": "@A_master//pkg:label",
        "//conditions:default": "@A//pkg:label",
    }),
)
Using two WORKSPACE files seems feasible, but I couldn't find a clean way to select between them other than manually renaming them.
Defining a custom repository_rule that branches on a repository_ctx.os.environ value seemed promising, until I figured out that I cannot reuse another repository rule inside its implementation.

While you can't reuse other repository rules in general, in practice many of them are written in Starlark and are easy to reuse. For example, git_repository's implementation looks like this:
def _git_repository_implementation(ctx):
    update = _clone_or_update(ctx)
    patch(ctx)
    ctx.delete(ctx.path(".git"))
    return _update_git_attrs(ctx.attr, _common_attrs.keys(), update)
Most of those utility functions are either no-ops if you're only using the basic features, or can be loaded from your own Starlark code. You could write a bare-bones replacement with just this:
load("@bazel_tools//tools/build_defs/repo:git_worker.bzl", "git_repo")
def _my_git_repository_implementation(ctx):
    directory = str(ctx.path("."))
    git_repo(ctx, directory)
    ctx.delete(ctx.path(".git"))
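One way to get the switch the question asks for is a small custom repository rule that picks the ref from an environment variable. The sketch below is Starlark (and also valid Python syntax). It does not reuse git_repo; it shells out to git directly through ctx.execute. The variable name A_USE_BRANCH, the function names, and the attribute names remote, commit and branch are assumptions made for illustration, not part of any existing rule.

```python
def pick_checkout(environ, pinned_commit, dev_branch = "master"):
    """Returns the git ref to check out.

    environ is a dict such as repository_ctx.os.environ. A_USE_BRANCH is a
    hypothetical variable name chosen for this sketch.
    """
    if environ.get("A_USE_BRANCH", "") == "1":
        return dev_branch
    return pinned_commit

def _switchable_git_repository_impl(ctx):
    # Assumes the rule declares string attributes remote, commit and branch.
    ref = pick_checkout(ctx.os.environ, ctx.attr.commit, ctx.attr.branch)
    for args in [
        ["git", "init", "."],
        ["git", "fetch", "--depth=1", ctx.attr.remote, ref],
        ["git", "checkout", "FETCH_HEAD"],
    ]:
        # ctx.execute runs in the repository directory by default.
        result = ctx.execute(args)
        if result.return_code != 0:
            fail("%s failed: %s" % (" ".join(args), result.stderr))
    ctx.delete(ctx.path(".git"))
```

In a .bzl file the implementation would be registered with repository_rule, passing the three string attributes and environ = ["A_USE_BRANCH"] so that Bazel refetches the repository when the variable changes. Fetching a bare commit with --depth=1 only works if the server allows it, so treat that flag as optional.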

Related

How to `bazel build` all targets that use a specific rule?

We're starting to use gRPC and are currently using bazel as our build tool. After an engineer pulls in updates to proto definitions, they'll need to proto compile. Due to the structure of our repository, the proto compile targets will be scattered in the repo.
The only option I'm seeing is to use a target naming convention so engineers just need to do something like bazel build //...:compile-proto. Are there other ways to make it easy for engineers to proto compile all updated proto definitions?
If you add a specific tag to each of them, you can use --build_tag_filters.
For example:
a_proto_library(
    name = "compile-proto",
    tags = ["a_proto"],
    [...]
)
and then bazel build --build_tag_filters=a_proto //....
You can also wrap the rule in a macro to add the tag automatically.
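A minimal sketch of such a macro, written in Starlark (and also valid Python syntax). The macro name tagged_proto_target is hypothetical, and the wrapped rule is passed in as rule_fn so the sketch does not assume any particular proto rule.

```python
def tagged_proto_target(rule_fn, name, tags = [], **kwargs):
    """Instantiates rule_fn with the "a_proto" tag added to its tags."""
    if "a_proto" not in tags:
        # Build a new list so the caller's list is left untouched.
        tags = tags + ["a_proto"]
    rule_fn(name = name, tags = tags, **kwargs)
```

In a BUILD file this would be called as tagged_proto_target(a_proto_library, name = "compile-proto", ...), after which bazel build --build_tag_filters=a_proto //... picks the target up without anyone having to remember the tag.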
I don't think //...:compile-proto is a valid target pattern, so unfortunately I'm not sure that that would work (not that you necessarily really want to rely on naming conventions anyway). See https://docs.bazel.build/versions/main/guide.html#specifying-targets-to-build
One option is to let bazel do all the updating for you. If you're already doing builds like bazel build //... to build everything, then once you pull in updates to proto definitions, another bazel build //... should rebuild only what has changed.
Another option is to find all rules using bazel query:
https://docs.bazel.build/versions/main/query.html
https://docs.bazel.build/versions/main/query-how-to.html
https://docs.bazel.build/versions/main/query.html#kind
Something like:
targets=$(bazel query "kind('java_proto_library', //...)")
bazel build $targets
Note that query with //... will load every build file in the workspace, but not build anything.

Default, platform specific, Bazel flags in bazel.rc

I was wondering whether it's possible to have platform-specific default Bazel build flags.
For example, we want to use --workspace_status_command but this must be a shell script on Linux and must point towards a batch script for Windows.
Is there a way we can write in the tools/bazel.rc file something like...
if platform=WINDOWS build: --workspace_status_command=status_command.bat
if platform=LINUX build: --workspace_status_command=status_command.sh
We could generate a .bazelrc file by having the users run a script before building, but it would be cleaner/nicer if this was not necessary.
Yes, kind of. You can specify config-specific bazelrc entries, which you can select by passing --config=<configname>.
For example your bazelrc could look like:
build:linux --cpu=k8
build:linux --workspace_status_command=/path/to/command.sh
build:windows --cpu=x64_windows
build:windows --workspace_status_command=c:/path/to/command.bat
And you'd build like so:
bazel build --config=linux //path/to:target
or:
bazel build --config=windows //path/to:target
You have to be careful not to mix semantically conflicting --config flags (Bazel doesn't prevent you from that). Though it will work, the results may be unpredictable when the configs tinker with the same flags.
Passing --config to all commands is tricky: it depends on developers remembering to do this, or on controlling every place where Bazel is called.
I think a better answer would be to teach the version control system how to produce the values, like by putting a git-bazel-stamp script on the $PATH/%PATH% so that git bazel-stamp works.
Then we need workspace_status_command to allow commands from the PATH rather than a path on disk.
The proper way to do this is to wrap your cc_library in a custom macro and pass hardcoded flags to copts. For a full reference, look at envoy_library.bzl.
In short, the steps are:
Define a macro to wrap cc_library:
def my_cc_library(
        name,
        copts = [],
        **kwargs):
    # Rule attributes must be passed as keyword arguments.
    cc_library(name = name, copts = copts + my_flags(), **kwargs)
Define the config_settings in a BUILD file and the my_flags() macro in the .bzl file as follows:
# BUILD
config_setting(
    name = "windows_x86_64",
    values = {"cpu": "x64_windows"},
)
config_setting(
    name = "linux_k8",
    values = {"cpu": "k8"},
)
# .bzl file
def my_flags():
    x64_windows_options = ["/W4"]
    k8_options = ["-Wall"]
    # These labels resolve relative to the package that calls the macro;
    # use absolute labels if the config_settings live in another package.
    return select({
        ":windows_x86_64": x64_windows_options,
        ":linux_k8": k8_options,
        "//conditions:default": [],
    })
How it works:
Depending on the --cpu flag value, my_flags() returns different flags.
This value is resolved automatically based on the platform: on Windows it's x64_windows, and on Linux it's k8.
Your my_cc_library macro then supplies these flags to every target in the project that uses it.
A better way of doing this has been added since you asked--sometime in 2019.
If you add the line
common --enable_platform_specific_config
to your .bazelrc, then --config=windows will automatically apply on Windows hosts, --config=macos on Mac, --config=linux on Linux, etc.
You can then add lines to your .bazelrc like:
build:windows --windows-flags
build:linux --linux-flags
There is one downside, though. This works based on the host rather than the target. So if you're cross-compiling, e.g. to mobile, and want different flags there, you'll have to go with a solution like envoy's (see other answer), or (probably better) add transitions into your graph targets. (See discussion here and here. "Flagless builds" are still under development, but there are usable hacks in the meantime.) You could also use the temporary platform_mappings API.
References:
Commit that added this functionality.
Where it appears in the Bazel docs.

Disagreeing names of Bazel external dependencies in different projects

Say there are two Bazel projects, they both depend on the Python package six.
Project A adds six with the name six_1_10_0:
new_http_archive(
    name = "six_1_10_0",
    ...
)
py_binary(
    name = "lib_a",
    deps = ["@six_1_10_0//:six"],
)
Project B adds six with the name six_archive.
new_http_archive(
    name = "six_archive",
    ...
)
py_binary(
    name = "lib_b",
    deps = ["@six_archive//:six"],
)
In my project, I depend on both A and B. Is there a way to let them use the same six?
To change the BUILD file contents of a dependency, the simplest way I can think of is to use one of the new_* repository rules (e.g. new_git_repository). Use the build_file or build_file_content attribute to supply a new BUILD file containing a py_binary rule whose deps point at your canonical @six repository, keeping all other attributes the same.
There isn't a straightforward way of doing this because Bazel makes no assumption on why Project A uses a different version of six compared to Project B.
The only way that Bazel knows that they're using the same version is if both new_http_archive rules specify the same SHA checksum. If they are the same checksum, you can use --experimental_repository_cache=/some/path to avoid downloading the same archive twice.

Skylark - How to execute a jar from a repository rule

Context
I am writing a repository rule that invokes another Bazel project. My current approach is to build the additional project as a deploy jar. I would like a user to be able to instantiate the rule like:
jar_path = "some/relative/path"
my_rule(name = "something", p_arg = "m_arg", binary = jar_path)
and then given the jar_path and the arguments, I would like the repository rule to execute the following command in the shell:
java -jar $(SOME_JAR) $(ARGUMENTS_PROVIDED_BY_RULE)
Problem
First, it's unclear how best to accomplish the deploy jar approach. So far, I have attempted two different approaches, with varying levels of success. For examples, I have skimmed through the scala_rules, the maven_rules, and the Skylark cookbook.
Second, and more importantly, I am not sure whether the deploy jar is the best route to accomplishing my goals. Again, my interest is to invoke a target from an external Bazel project, that is currently hosted on github. (So feasibly, I could try to fetch the project using the http_archive rule).
Below, I describe the attempts I have made.
Approach 1
My first approach involved trying to execute the command using the command field in ctx.action. I tried various enumerations of
java -jar {computed_absolute_path_of_deploy_jar} {args_passed_from_instantiation}.
My biggest issue here was determining the absolute path of the deploy jar. The file's root path would contain some additional information. For example, it would look something like this.
/abs/olute/path[ something ]/rela/tive/path
As a side note, I'm not sure whether this is a bug or a nit, but File.root.path evaluated to None, even though the File itself was not None.
My first attempt was to use Skylark's [ctx.binary].
Approach 2
Next thing I tried was to mimic the input binary example from the docs. This was also unsuccessful. The issue was that the actual binary could not be found. Here is how I configured it.
First, I relaxed the repository rule into a regular skylark rule.
def _test_binary(ctx):
    ctx.action(
        ....
        arguments = [ctx.attr.p_arg],
        executable = ctx.executable.binary)
test_binary = rule(
    ...
    attrs = {
        "binary": attr.label(mandatory = True, cfg = "host", allow_files = True, executable = True),
        ...
    },
)
Then, in my external project, I loaded the skylark rule into the WORKSPACE file. Finally, I called the macro from one of my BUILD files as follows:
load("@something_rule//:something_rule.bzl", "test_binary")
test_binary(name = "hello", p_arg = "hello", binary = "script.sh")
The script is a one line java -jar something_deploy.jar -- -arg:$1, and is in the same directory as the BUILD file.
Bazel complains that src/script.sh does not exist. I presume because it is looking for the file in /private/var/tmp/-bazel_username/somehash/relative_path. In response, I tried to pass the absolute path, which is not allowed.
Cheers.
It looks like you're mixing up repository rules with build extensions ("normal" rules). A good rule of thumb is:
Repository rules are for getting sources onto your system or symlinking them to a place Bazel can see them.
Build extensions are for everything else: compiling, copying files, running binaries, etc.
I don't actually think you need to use either, for this. You say that the other project is on GitHub, so you can add the following to your WORKSPACE file:
http_archive(
    name = "other_project",
    ...
)
Then, in your BUILD file:
genrule(
    name = "run-a-jar",
    srcs = ["@other_project//some/relative:path"],
    # $@ expands to the single output file, jar-output.
    cmd = "java -jar $(location @other_project//some/relative:path) -- arg1 arg2 > $@",
    outs = ["jar-output"],
)
You shouldn't need to use the _deploy.jar target, since you're not moving the jar out of its project (_deploy.jar is useful when you need to relocate it).
Other things from your question:
I'm not sure if this is a bug/nit, but the File.root.path, evaluated to None,
Are you sure it didn't evaluate to ""? The path is relative to the execution root, so for sources, it will always be "" (for outputs, it'll be bazel-out/local-fastbuild/bin or similar).
Bazel complains that src/script.sh does not exist.
Passing -s to Bazel can really help debugging Skylark rules. You can see exactly where it is looking.

Aliasing jar target of maven_jar rule

I have the following maven_jar in my workspace:
maven_jar(
name = "com_google_code_findbugs_jsr305",
artifact = "com.google.code.findbugs:jsr305:3.0.1",
sha1 = "f7be08ec23c21485b9b5a1cf1654c2ec8c58168d",
)
In my project I reference it through @com_google_code_findbugs_jsr305//jar. However, I now want to depend on a third party library that references @com_google_code_findbugs_jsr305 without the jar target.
I tried looking into both bind and alias, however alias cannot be applied inside the WORKSPACE and bind doesn't seem to allow you to define targets as external repositories.
I could rename the version I use so it doesn't conflict, but that feels like the wrong solution.
IIUC, your code needs to depend on both @com_google_code_findbugs_jsr305//jar and @com_google_code_findbugs_jsr305//:com_google_code_findbugs_jsr305. Unfortunately, there isn't any pre-built rule that generates BUILD files for both of those targets, so you basically have to define the BUILD files yourself. Fortunately, @jart has written most of it for you in the closure rule you linked to. You just need to add //jar:jar by appending a couple of lines. After line 69, add something like:
repository_ctx.file(
    'jar/BUILD',
    "\n".join([
        "package(default_visibility = ['//visibility:public'])",
    ] + _make_java_import('jar', '//:com_google_code_findbugs_jsr305.jar')),
)
This creates a //jar:jar (or equivalently, //jar) target in the repository.

Resources