Where to get a list of known repositories like @bazel_tools, @rules_jvm_external, etc.? - bazel

Sometimes I see extensions loaded from the internet, and sometimes built-in ones.
Canonical example:
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
However, I cannot distinguish a local repo from a known repo just by looking at the load expression.
How can I check the source (location) of any repo that I see in my WORKSPACE/BUILD files?

If the Bazel label is sufficient as a source, you might try listing the BUILD and .bzl files known to the workspace with bazel query 'buildfiles(//...)'; the repository part of each label in the output tells you where it comes from.
Otherwise, you could run bazel clean --expunge and then run a build with --experimental_execution_log_file=<FILENAME>. This creates a protobuf-based log of the actions executed by Bazel. Because of clean --expunge, all internet repos are downloaded anew, so they show up in the log.
Check https://github.com/bazelbuild/bazel/tree/master/src/tools/execlog for a parser.
It is super inconvenient that this information is not available another way - afaik. I really hope someone swings by and corrects me, but this way you at least know the available sources you can correlate.
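A minimal sketch of that flow (the log filename is arbitrary, and the parser is built from a checkout of the bazelbuild/bazel source tree):
bazel clean --expunge
bazel build //... --experimental_execution_log_file=exec.log
# in a bazelbuild/bazel checkout: build the parser, then point it at the log
bazel build src/tools/execlog:parser
bazel-bin/src/tools/execlog/parser --log_path=/path/to/exec.log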

I'm new to Bazel, but as far as I understand:
Copy the name of the repo, e.g. io_bazel_rules_docker
Search for it in the codebase
Look at how it's being loaded
E.g. if you see
http_archive(
    name = "io_bazel_rules_docker",
    ...
)
or
http_file(
    name = "io_bazel_rules_docker",
    ...
)
you can conclude where it's coming from.

bazel query --output=build //external:repo_name works just fine.
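For example, assuming a repo named io_bazel_rules_docker that was declared with http_archive, the output points back at the WORKSPACE stanza that defined it (the location comment and URL below are made up):
$ bazel query --output=build //external:io_bazel_rules_docker
# /home/user/project/WORKSPACE:12:1
http_archive(
  name = "io_bazel_rules_docker",
  urls = ["https://github.com/bazelbuild/rules_docker/..."],
)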

Related

How to `bazel build` all targets that use a specific rule?

We're starting to use gRPC and are currently using bazel as our build tool. After an engineer pulls in updates to proto definitions, they'll need to proto compile. Due to the structure of our repository, the proto compile targets will be scattered in the repo.
The only option I'm seeing is to use a target naming convention so engineers just need to do something like bazel build //...:compile-proto. Are there other ways to make it easy for engineers to proto compile all updated proto definitions?
If you add a specific tag to each of them, you can use --build_tag_filters.
For example:
a_proto_library(
    name = "compile-proto",
    tags = ["a_proto"],
    [...]
)
and then bazel build --build_tag_filters=a_proto //....
You can also wrap the rule in a macro to add the tag automatically.
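A minimal sketch of such a macro (assuming the underlying rule is proto_library from rules_proto; substitute whatever *_proto_library rule you actually use):
# protos.bzl - a hypothetical wrapper that always adds the "a_proto" tag
load("@rules_proto//proto:defs.bzl", "proto_library")

def a_proto_library(name, tags = [], **kwargs):
    proto_library(
        name = name,
        tags = tags + ["a_proto"],
        **kwargs
    )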
I don't think //...:compile-proto is a valid target pattern, so unfortunately I'm not sure that that would work (not that you necessarily really want to rely on naming conventions anyway). See https://docs.bazel.build/versions/main/guide.html#specifying-targets-to-build
One option is to let bazel do all the updating for you. If you're already doing builds like bazel build //... to build everything, then once you pull in updates to proto definitions, another bazel build //... should rebuild only what has changed.
Another option is to find all rules using bazel query:
https://docs.bazel.build/versions/main/query.html
https://docs.bazel.build/versions/main/query-how-to.html
https://docs.bazel.build/versions/main/query.html#kind
Something like:
targets=$(bazel query "kind('java_proto_library', //...)")
bazel build $targets
Note that query with //... will load every build file in the workspace, but not build anything.
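Equivalently, as a single pipeline (assuming the query matches at least one target):
bazel query "kind('java_proto_library', //...)" | xargs bazel build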

Bazel Monorepo - How Rebuild and Publish only Changed Docker Images?

Objective
I have a monorepo setup with a growing number of services. When I deploy the application I run a command and every service will be rebuilt and the final Docker images will be published.
But as the number of services grows, the time it takes to rebuild all of them gets longer and longer, although changes were made to only a few of them.
Why does my setup rebuild all Docker images although only a few have changed? My goal is to rebuild and publish only the images that have actually changed.
Details
I am using Bazel to build my Docker images, thus in the root of my project there is one BUILD file which contains the target I run when I want to deploy. It is just a collection of k8s_objects, where every service is included:
load("#io_bazel_rules_k8s//k8s:objects.bzl", "k8s_objects")
k8s_objects(
name = "kubernetes_deployment",
objects = [
"//services/service1",
"//services/service2",
"//services/service3",
"//services/service4",
# ...
]
)
Likewise there is one BUILD file for every service, which first creates a TypeScript library from all the source files, then creates the Node.js image and finally passes the image to the Kubernetes object:
load("#npm_bazel_typescript//:index.bzl", "ts_library")
ts_library(
name = "lib",
srcs = glob(
include = ["**/*.ts"],
exclude = ["**/*.spec.ts"]
),
deps = [
"//packages/package1",
"//packages/package2",
"//packages/package3",
],
)
load("#io_bazel_rules_docker//nodejs:image.bzl", "nodejs_image")
nodejs_image(
name = "image",
data = [":lib", "//:package.json"],
entry_point = ":index.ts",
)
load("#k8s_deploy//:defaults.bzl", "k8s_deploy")
k8s_object(
name = "service",
template = ":service.yaml",
kind = "deployment",
cluster = "my-cluster"
images = {
"gcr.io/project/service:latest": ":image"
},
)
Note that the TypeScript lib also depends on some packages, which should also be accounted for when redeploying!
To deploy I run bazel run :kubernetes_deployment.apply
Initially one reason I decided to choose Bazel is because I thought it would handle building only changed services itself. But obviously this is either not the case or my setup is faulty in some way.
If you need more detailed insight into the project you can check it out here: https://github.com/flolude/cents-ideas
Looks like the Bazel repo itself does something similar:
https://github.com/bazelbuild/bazel/blob/ef0f8e61b5d3a139016c53bf04361a8e9a09e9ab/scripts/ci/ci.sh
The rough steps are:
Calculate the list of files that have changed
Use the file list and find their dependents (e.g. bazel query 'kind(".*_binary", rdeps(//..., set(file1.txt file2.txt)))' will find all binary targets that are dependents of either file1.txt or file2.txt)
Build/test the list of targets
You will need to adapt this script to your needs (e.g. make sure it finds the Docker image targets)
To find out the kind of a target, you can use bazel query //... --output=label_kind
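A minimal sketch of those steps (assuming origin/master is the base you diff against; the '.*_image' kind filter is only a guess and must be adapted to the image rules you actually use):
# 1. Files changed relative to the base branch
files=$(git diff --name-only origin/master...HEAD | tr '\n' ' ')
# 2. Image targets that transitively depend on any of those files
targets=$(bazel query "kind('.*_image', rdeps(//..., set($files)))")
# 3. Rebuild (and from there, publish) only those targets
bazel build $targets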
EDIT:
A bit of warning for anyone who wants to go down this rabbit hole (especially if you absolutely do not want to miss tests in CI):
You need to think about:
Deleted files / BUILD files (who depended on them)
Note that moved files == Deleted + Added as well
Also you cannot query reverse deps of files/BUILD that doesn't exist anymore!
Modified BUILD files (To be safe, make sure all reverse deps of all targets in the BUILD are built)
I think there is a ton of complexity here going down this route (if even possible). It might be less error-prone to rely on Bazel itself to figure out what changed, using remote caches & --subcommands to calculate which side-effects need to be performed.

Determine if files are part of any package

Given I have a list of files, e.g. foo/src/main.cpp, foo/src/bar.cpp, foo/README.md, is it possible to determine which of those files are part of a Bazel package?
In my example, the output would e.g. be foo/src/main.cpp, foo/src/bar.cpp, since the README.md is not part of the build.
One way to do this would be to call bazel query on each file and see if it produces output, but that is quite inefficient, so I was wondering if there is an easier way.
Background: I am trying to determine if changes in a set of files have an impact on a target, and I want to use bazel query somepath(//some/target, set($FILES)) for that, but this will fail if any of the files in $FILES is not part of a BUILD file.
How about flipping it around and querying for all the source files of the target with:
bazel query 'kind("source file", deps(//some:target))'
and then checking if the result has any of the files in the set
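A minimal sketch of that intersection (assuming the files are workspace-relative paths; bazel query prints source files as labels like //foo/src:main.cpp, and the naive sed below assumes the package boundary sits at the last '/'):
# convert the candidate paths to label form, then intersect with the target's sources
printf '%s\n' foo/src/main.cpp foo/src/bar.cpp foo/README.md \
  | sed 's|\(.*\)/|//\1:|' | sort > candidates.txt
bazel query 'kind("source file", deps(//some:target))' | sort > srcs.txt
comm -12 candidates.txt srcs.txt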

Skylark - How to execute a jar from a repository rule

Context
I am writing a repository rule that invokes another Bazel project. My current approach is to build the additional project as a deploy jar. I would like a user to be able to instantiate the rule like:
jar_path = "some/relative/path"
my_rule(name = "something", p_arg = "m_arg", binary = jar_path)
and then given the jar_path and the arguments, I would like the repository rule to execute the following command in the shell:
java -jar $(SOME_JAR) $(ARGUMENTS_PROVIDED_BY_RULE)
Problem
First, it's unclear how best to accomplish the deploy jar approach. So far, I have attempted two different approaches, with varying levels of success. For examples, I have skimmed through the scala_rules, the maven_rules, and the Skylark cookbook.
Second, and more importantly, I am not sure whether the deploy jar is the best route to accomplishing my goals. Again, my interest is to invoke a target from an external Bazel project that is currently hosted on GitHub. (So feasibly, I could try to fetch the project using the http_archive rule.)
Below, I describe the attempts I have made.
Approach 1
My first approach involved trying to execute the command using the command field in ctx.action. I tried various enumerations of
java -jar {computed_absolute_path_of_deploy_jar} {args_passed_from_instantiation}
My biggest issue here was with determining the absolute path of the deploy jar. The file's root path would contain some additional information. For example, it would look something like this:
/abs/olute/path[ something ]/rela/tive/path
As a side note, I'm not sure if this is a bug/nit, but File.root.path evaluated to None, despite File.none not being None.
I also tried to use skylark [ctx.binary].
Approach 2
Next thing I tried was to mimic the input binary example from the docs. This was also unsuccessful. The issue was that the actual binary could not be found. Here is how I configured it.
First, I relaxed the repository rule into a regular skylark rule.
def _test_binary(ctx):
    ctx.action(
        ....
        arguments = [ctx.attr.p_arg],
        executable = ctx.executable.binary,
    )

test_binary = rule(
    ...
    attrs = {
        "binary": attr.label(mandatory = True, cfg = "host", allow_files = True, executable = True),
        ...
    },
)
Then, in my external project, I loaded the skylark rule into the WORKSPACE file. Finally, I called the macro from one of my BUILD files as follows:
load("#something_rule//:something_rule.bzl", "test_binary")
test_binary(name = "hello", p_arg = "hello", binary = "script.sh")
The script is a one line java -jar something_deploy.jar -- -arg:$1, and is in the same directory as the BUILD file.
Bazel complains that src/script.sh does not exist. I presume because it is looking for the file in /private/var/tmp/-bazel_username/somehash/relative_path. In response, I tried to pass the absolute path, which is not allowed.
Cheers.
It looks like you're mixing up repository rules with build extensions ("normal" rules). A good rule of thumb is:
Repository rules are for getting sources onto your system or symlinking them to a place Bazel can see them.
Build extensions are for everything else: compiling, copying files, running binaries, etc.
I don't actually think you need to use either, for this. You say that the other project is on GitHub, so you can add the following to your WORKSPACE file:
http_archive(
    name = "other_project",
    ...
)
Then, in your BUILD file:
genrule(
    name = "run-a-jar",
    srcs = ["@other_project//some/relative:path"],
    outs = ["jar-output"],
    cmd = "java -jar $(location @other_project//some/relative:path) -- arg1 arg2 > $@",
)
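Running bazel build //:run-a-jar then executes the jar as a build action and captures its stdout in the jar-output file declared in outs.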
You shouldn't need to use the _deploy.jar target, since you're not moving the jar out of its project (_deploy.jar is useful when you need to relocate it).
Other things from your question:
I'm not sure if this is a bug/nit, but the File.root.path, evaluated to None,
Are you sure it didn't evaluate to ""? The path is relative to the execution root, so for sources, it will always be "" (for outputs, it'll be bazel-out/local-fastbuild/bin or similar).
Bazel complains that src/script.sh does not exist.
Passing -s to Bazel can really help debugging Skylark rules. You can see exactly where it is looking.

Equivalent to git difftool -y with libgit2sharp?

I am planning to replace the usage of git.exe from the Windows path with libgit2sharp for my plugin GitDiffMargin, a Visual Studio 2012 extension to display the Git diff in the margin of the current file - https://github.com/laurentkempe/GitDiffMargin
I would like to know if there is an equivalent in libgit2sharp to starting the external difftool with git difftool -y filename?
I don't think it should be the responsibility of LibGit2Sharp to launch an external process. LibGit2Sharp's goal is to provide a way to manipulate a git repository easily.
It means you can use it to:
Get the diff between files in the workdir and the previous version (in the index). To do that you can use the Repository.Diff.Compare(IEnumerable<string> paths, bool includeUntracked, ExplicitPathsOptions explicitPathsOptions) overload, which returns a TreeChanges object. From there, you can get a TreeEntryChanges object through the indexer of the treeChanges, which corresponds to the changes of a given file (use the .Patch property to get the actual content of the patch).
Get the configured diff tool by using the Repository.Config namespace (e.g.: repo.Config.Get<string>("diff.tool").Value, although you should also check if the value returned by the Get() method is null in case no diff tool has been configured by the user). That way, you can launch the diff tool by yourself.
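A minimal sketch of both steps, based on the API described above (LibGit2Sharp ~v0.11; the repository path and file name are placeholders):
using LibGit2Sharp;

using (var repo = new Repository(@"C:\path\to\repo"))
{
    // Diff a single workdir file against the index
    TreeChanges changes = repo.Diff.Compare(new[] { "file.cs" });
    TreeEntryChanges fileChanges = changes["file.cs"];
    string patch = fileChanges.Patch; // textual content of the patch

    // Retrieve the configured external diff tool, if any
    var diffToolEntry = repo.Config.Get<string>("diff.tool");
    string diffTool = diffToolEntry == null ? null : diffToolEntry.Value;
}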
Additional resources (v0.11.0):
Diff workdir to index tests
An example showing how to use the TreeChanges object
Configuration fixture, pretty self-explanatory
Note: it seems that at some point, you will need to know if a line has changed or not. I don't think there is an easy way to do that right now (apart from parsing the patch content manually). However, opening an issue on the LibGit2Sharp issue tracker might trigger some discussion around that (feel free to weigh in what kind of API you would like to have to do that!).
Edit: Before launching the external diff tool, you will need to copy the content of the file which is in the index in a temporary folder. You can lookup the blob of a file in the index by doing something like:
var indexEntry = repo.Index[fileName];
var blob = repo.Lookup(indexEntry.Id);
However... no filters are currently applied when you get the blob content, so the comparison is likely to produce false positives due to crlf differences.
There is currently an issue opened on libgit2 in order to propose an API to allow applying filters.
