Is there a way to add dependency restrictions to packages in Bazel?

It would be useful to add restrictions between packages on a Bazel workspace. Something like "packages tagged as 'library' may not depend on packages tagged as 'app'"
I remember hearing that this was a feature supported by Blaze originally, but looking at the documentation (https://docs.bazel.build/versions/master/build-ref.html) nothing like it seems to be mentioned anywhere.
Is it possible to do something like this with Bazel?

Bazel's visibility attribute is used to restrict which packages may depend on a rule. Of particular interest to you might be the 'package_group' flavor.
Taken from https://docs.bazel.build/versions/master/be/common-definitions.html:
The visibility attribute on a rule controls whether the rule can be used by other packages. Rules are always visible to other rules declared in the same package.
There are five forms (and one temporary form) a visibility label can take:
["//visibility:public"]: Anyone can use this rule.
["//visibility:private"]: Only rules in this package can use this rule. Rules in javatests/foo/bar can always use rules in java/foo/bar.
["//some/package:__pkg__", "//other/package:__pkg__"]: Only rules in some/package and other/package (defined in some/package/BUILD and other/package/BUILD) have access to this rule. Note that sub-packages do not have access to the rule; for example, //some/package/foo:bar or //other/package/testing:bla wouldn't have access. __pkg__ is a special target and must be used verbatim. It represents all of the rules in the package.
["//project:__subpackages__", "//other:__subpackages__"]: Only rules in packages project or other or in one of their sub-packages have access to this rule. For example, //project:rule, //project/library:lib or //other/testing/internal:munge are allowed to depend on this rule (but not //independent:evil)
["//some/package:my_package_group"]: A package group is a named set of package names. Package groups can also grant access rights to entire subtrees, e.g.//myproj/....
The visibility specifications of //visibility:public and //visibility:private can not be combined with any other visibility specifications. A visibility specification may contain a combination of package labels (i.e. //foo:__pkg__) and package_groups.
If a rule does specify the visibility attribute, that specification overrides any default_visibility attribute of the package statement in the BUILD file containing the rule.
Otherwise, if a rule does not specify the visibility attribute, the default_visibility of the package is used (except for exports_files).
Otherwise, if the default_visibility for the package is not specified, //visibility:private is used.
Example:
File //frobber/bin/BUILD:
# This rule is visible to everyone
cc_binary(
    name = "executable",
    visibility = ["//visibility:public"],
    deps = [":library"],
)
# This rule is visible only to rules declared in the same package
cc_library(
    name = "library",
    visibility = ["//visibility:private"],
)
# This rule is visible to rules in package //object and //noun
cc_library(
    name = "subject",
    visibility = [
        "//noun:__pkg__",
        "//object:__pkg__",
    ],
)
# See package group "//frobber:friends" (below) for who can
# access this rule.
cc_library(
    name = "thingy",
    visibility = ["//frobber:friends"],
)
File //frobber/BUILD:
# This is the package group declaration to which rule
# //frobber/bin:thingy refers.
#
# Our friends are packages //frobber, //fribber and any
# subpackage of //fribber.
package_group(
    name = "friends",
    packages = [
        "//fribber/...",
        "//frobber",
    ],
)
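Applied to your "library may not depend on app" example, one rough sketch (the //apps/... and //libs/... package paths and the app_packages group name are made up here) is to restrict the visibility of everything in the app packages to a package_group that contains only app packages, so targets under //libs/... simply cannot depend on them:
# //apps/BUILD (hypothetical layout)

# Only targets in app packages may depend on targets declared here,
# so anything under //libs/... is rejected at analysis time.
package(default_visibility = ["//apps:app_packages"])

package_group(
    name = "app_packages",
    packages = ["//apps/..."],
)

cc_binary(
    name = "frontend",
    srcs = ["frontend.cc"],
)
Visibility is expressed from the side of the target being depended on, so the restriction reads as "only app packages may use app targets" rather than as a tag on the library packages.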

Related

How to properly handle cyclic dependencies in Bazel without changing the source code?

I have read that question, but unfortunately the proposed solutions involve changing where the header files live in the source code, which in this case is not possible.
Let's imagine I have the following structure of system includes from an installed package:
/usr/local/bar/bar1.h
/usr/local/bar/bar2.h
And
/usr/include/foo.h
Important detail:
bar1.h includes foo.h
foo.h includes bar2.h
From Bazel, I would right away create a new_git_repository() from /usr/local/bar and then create a library containing the header files from this folder as:
cc_library(
    name = "libBar",
    hdrs = ["bar1.h", "bar2.h"],
    deps = ["@foo//:libFoo"],
)
I would do another new_git_repository() from /usr/include/ and create a library:
cc_library(
    name = "libFoo",
    hdrs = ["foo.h"],
    deps = ["@bar//:libBar"],
)
As seen, this will end up in an error from Bazel about cyclic dependencies:
libBar depends on libFoo, and libFoo depends on libBar.
Of course, a solution would be to split libBar into some libBar1 and libBar2, but I would like to avoid that: the folder /usr/local/bar contains too many header files, and separating them would be a nightmare.
Is there a proper solution for this case?

How to integrate C/C++ analysis tooling in Bazel?

I have a code analysis tool that I'd like to run for each cc_library (and cc_binary, silently implied for the rest of the question). The tool has a CLI interface taking:
A tool project file
Compiler specifics, such as type sizes, built-ins, macros etc.
Files to analyze
File path, includes, defines
Rules to (not) apply
Files to add to the project
Options for synchronizing files with build data
JSON compilation database
Parse build log
Analyze and generate analysis report
I've been looking at how to integrate this in Bazel so that the files to analyze AND the associated includes and defines are updated automatically, and so that any analysis result is properly cached. Generating a JSON compilation database (using a third-party lib) or parsing the build log both require separate runs and updating the source tree; for this question I consider those workarounds I'm trying to remove.
What I've tried so far is using aspects, adding an analysis aspect to any library. The general idea is having a base project file holding library-invariant configuration, appending the cc_library files to analyze, and finally triggering an analysis that generates the report. But I'm having trouble executing this, and I'm not sure it's even possible.
This is my aspect implementation so far, trying to iterate through cc_library attributes and target compilation context:
def _print_aspect_impl(target, ctx):
    # Make sure the rule has a srcs attribute
    if hasattr(ctx.rule.attr, 'srcs'):
        # Iterate through the files
        for src in ctx.rule.attr.srcs:
            for f in src.files.to_list():
                if f.path.endswith(".c"):
                    print("file: ")
                    print(f.path)
                    print("includes: ")
                    print(target[CcInfo].compilation_context.includes)
                    print("quote_includes: ")
                    print(target[CcInfo].compilation_context.quote_includes)
                    print("system_includes: ")
                    print(target[CcInfo].compilation_context.system_includes)
                    print("defines: ")
                    print(ctx.rule.attr.defines)
                    print("local_defines: ")
                    print(ctx.rule.attr.local_defines)
                    print("")  # empty line to separate file prints
    return []
What I cannot figure out is how to get ALL includes and defines used when compiling the library:
From libraries depended upon, recursively
copts, defines, includes
From the toolchain
features, cxx_builtin_include_directories
Questions:
How do I get the missing flags, continuing with the presented technique?
Can I somehow retrieve the compile action command string?
Appended to analysis project using the build log API
Some other solution entirely?
Perhaps there is something one can do with cc_toolchain instead of aspects...
Aspects are the right tool to do that. The information you're looking for is contained in the providers, fragments, and toolchains of the cc_* rules the aspect has access to. Specifically, CcInfo has the target-specific pieces, the cpp fragment has the pieces configured from the command-line flag, and CcToolchainInfo has the parts from the toolchain.
CcInfo in target tells you if the current target has that provider, and target[CcInfo] accesses it.
The rules_cc my_c_compile example is where I usually look for pulling out a complete compiler command based on a CcInfo. Something like this should work from the aspect:
load("#rules_cc//cc:action_names.bzl", "C_COMPILE_ACTION_NAME")
load("#rules_cc//cc:toolchain_utils.bzl", "find_cpp_toolchain")
[in the impl]:
cc_toolchain = find_cpp_toolchain(ctx)
feature_configuration = cc_common.configure_features(
ctx = ctx,
cc_toolchain = cc_toolchain,
requested_features = ctx.features,
unsupported_features = ctx.disabled_features,
)
c_compiler_path = cc_common.get_tool_for_action(
feature_configuration = feature_configuration,
action_name = C_COMPILE_ACTION_NAME,
)
[in the loop]
c_compile_variables = cc_common.create_compile_variables(
feature_configuration = feature_configuration,
cc_toolchain = cc_toolchain,
user_compile_flags = ctx.fragments.cpp.copts + ctx.fragments.cpp.conlyopts,
source_file = src.path,
)
command_line = cc_common.get_memory_inefficient_command_line(
feature_configuration = feature_configuration,
action_name = C_COMPILE_ACTION_NAME,
variables = c_compile_variables,
)
env = cc_common.get_environment_variables(
feature_configuration = feature_configuration,
action_name = C_COMPILE_ACTION_NAME,
variables = c_compile_variables,
)
That example only handles C files (not C++); you'll have to change the action names and which parts of the fragment it uses accordingly.
You have to add toolchains = ["@bazel_tools//tools/cpp:toolchain_type"] and fragments = ["cpp"] to the aspect invocation to use those. Also see the note in find_cc_toolchain.bzl about the _cc_toolchain attr if you're using legacy toolchain resolution.
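As a rough sketch, the aspect declaration with those two pieces added would look something like this (the implementation function name and attr_aspects are placeholders for whatever you already have):
analysis_aspect = aspect(
    implementation = _analysis_aspect_impl,
    attr_aspects = ["deps"],
    # Needed so cc_common.configure_features() and the cpp fragment
    # are available in the implementation function.
    fragments = ["cpp"],
    toolchains = ["@bazel_tools//tools/cpp:toolchain_type"],
)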
The information coming from the rules and the toolchain is already structured. Depending on what your analysis tool wants, it might make more sense to extract it directly instead of generating a full command line. Most of the provider, fragment, and toolchain is well-documented if you want to look at those directly.
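For example, a small sketch of reading some of those structured fields directly (using CcInfo.compilation_context and the cc_toolchain from above; how you feed them to your tool is up to you):
# Inside the aspect implementation, with cc_toolchain = find_cpp_toolchain(ctx) as above.
compilation_context = target[CcInfo].compilation_context

# Defines and include paths contributed by this target and its transitive deps.
defines = compilation_context.defines.to_list()
include_dirs = (
    compilation_context.includes.to_list() +
    compilation_context.quote_includes.to_list() +
    compilation_context.system_includes.to_list()
)

# Built-in include directories come from the toolchain rather than the target.
include_dirs += cc_toolchain.built_in_include_directories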
You might pass required_providers = [CcInfo] to aspect to limit propagation to rules which include it, depending on how you want to manage propagation of your aspect.
The Integrating with C++ Rules documentation page also has some more info.

Have all Bazel packages expose their documentation files (or any file with a given extension)

Bazel has been working great for me recently, but I've stumbled upon a question for which I have yet to find a satisfactory answer:
How can one collect all files bearing a certain extension from the workspace?
Another way of phrasing the question: how could one obtain the functional equivalent of doing a glob() across a complete Bazel workspace?
Background
The goal in this particular case is to collect all markdown files to run some checks and generate a static site from them.
At first glance, glob() sounds like a good idea, but will stop as soon as it runs into a BUILD file.
Current Approaches
The current approach is to run the collection/generation logic outside of the sandbox, but this is a bit dirty, and I'm wondering if there is a way that is both "proper" and easy (i.e., not requiring that each BUILD file explicitly expose its markdown files).
Is there any way to specify, in the workspace, some default rules that will be added to all BUILD files?
You could write an aspect for this to aggregate markdown files in a bottom-up manner and create actions on those files. There is an example of a file_collector aspect here. I modified the aspect's extensions for your use case. This aspect aggregates all .md and .markdown files across targets on the deps attribute edges.
FileCollector = provider(
    fields = {"files": "collected files"},
)

def _file_collector_aspect_impl(target, ctx):
    # This function is executed for each dependency the aspect visits.

    # Collect files from the srcs
    direct = [
        f
        for f in ctx.rule.files.srcs
        if ctx.attr.extension == f.extension
    ]

    # Combine direct files with the files from the dependencies.
    files = depset(
        direct = direct,
        transitive = [dep[FileCollector].files for dep in ctx.rule.attr.deps],
    )

    return [FileCollector(files = files)]

markdown_file_collector_aspect = aspect(
    implementation = _file_collector_aspect_impl,
    attr_aspects = ["deps"],
    attrs = {
        "extension": attr.string(values = ["md", "markdown"]),
    },
)
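Note that something still has to request the aspect and turn the collected depset into an output. A minimal sketch (the rule name and the manifest format below are made up for illustration) is a small rule that applies the aspect to its deps and writes out the collected paths:
def _collect_markdown_impl(ctx):
    # Merge the FileCollector depsets from all deps the aspect visited.
    files = depset(transitive = [dep[FileCollector].files for dep in ctx.attr.deps])
    manifest = ctx.actions.declare_file(ctx.label.name + ".manifest")
    ctx.actions.write(
        output = manifest,
        content = "\n".join([f.path for f in files.to_list()]),
    )
    return [DefaultInfo(files = depset([manifest]))]

collect_markdown = rule(
    implementation = _collect_markdown_impl,
    attrs = {
        "deps": attr.label_list(aspects = [markdown_file_collector_aspect]),
        # The aspect's "extension" parameter is filled in from this attribute.
        "extension": attr.string(default = "md", values = ["md", "markdown"]),
    },
)
A BUILD file would then declare something like collect_markdown(name = "docs", deps = [...]) over the top-level targets whose dependency trees you want to scan.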
Another way is to do a query on file targets (input and output files known to the Bazel action graph), and process these files separately. Here's an example querying for .bzl files in the rules_jvm_external repo:
$ bazel query '//...:*' | grep -e "\.bzl$"
//migration:maven_jar_migrator_deps.bzl
//third_party/bazel_json/lib:json_parser.bzl
//settings:stamp_manifest.bzl
//private/rules:jvm_import.bzl
//private/rules:jetifier_maven_map.bzl
//private/rules:jetifier.bzl
//:specs.bzl
//:private/versions.bzl
//:private/proxy.bzl
//:private/dependency_tree_parser.bzl
//:private/coursier_utilities.bzl
//:coursier.bzl
//:defs.bzl

Bazel: share macro between multiple http_archive BUILD files

My project depends on some external libraries which I have to bazelfy myself. Thus, my WORKSPACE:
http_archive(
    name = "external_lib_component1",
    build_file = "//third_party:external_lib_component1.BUILD",
    sha256 = "xxx",
    urls = ["https://example.org/external_lib_component1.tar.gz"],
)

http_archive(
    name = "external_lib_component2",
    build_file = "//third_party:external_lib_component2.BUILD",
    sha256 = "yyy",
    urls = ["https://example.org/external_lib_component2.tar.gz"],
)
...
The two entries above are similar, and external_lib_component{1, 2}.BUILD share a lot of code.
What is the best way to share code (macros) between them?
Just putting a shared_macros.bzl file into third_party/ won't work, because it will not be copied into
the archive location on build (only the build_file is copied).
You can place a .bzl file such as ./third_party/shared_macros.bzl into your tree, as you've mentioned.
Then in the //third_party:external_lib_component1.BUILD and //third_party:external_lib_component2.BUILD you provide for your external dependencies, you can load symbols from that shared file using:
load("#//third_party:shared_macros.bzl", ...)
Labels starting with @// refer to the main repository, even when used in an external dependency (labels starting with plain // would otherwise be rooted in the repository they are used from). You can check the docs on labels, in particular the last paragraph.
Alternatively, you can refer to the "parent" project by its name. If your WORKSPACE file has:
workspace(name = "parent")
You could say:
load("#parent//third_party:shared_macros.bzl", ...)
Note: in versions prior to 2.0.0 you might want to add --incompatible_remap_main_repo if you mix both of the above approaches in your project.
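As a concrete sketch of what the shared file could contain (the macro name, glob patterns, and target names below are invented for illustration), third_party/shared_macros.bzl might define a macro that both BUILD files reuse:
# third_party/shared_macros.bzl (hypothetical contents)
def external_lib(name, srcs):
    """Common boilerplate shared by the external_lib_component*.BUILD files."""
    native.cc_library(
        name = name,
        srcs = srcs,
        hdrs = native.glob(["include/**/*.h"]),
        visibility = ["//visibility:public"],
    )
and then in //third_party:external_lib_component1.BUILD:
load("@//third_party:shared_macros.bzl", "external_lib")

external_lib(
    name = "external_lib_component1",
    srcs = glob(["src/**/*.cc"]),
)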

What is the relationship between DefaultInfo and PyInfo

It's not clear to me what the difference is between DefaultInfo's runfiles transitive_files and PyInfo's transitive_sources. Are they redundant, or is there an important difference?
For example, I have a custom Starlark rule which I want to conform to the PyInfo provider, but I also want to add an additional provider, so I can't use the native py_library rule.
# `sources`, `outs`, and `_path_join` are defined earlier in the rule implementation.
transitive_sources = [dep[PyInfo].transitive_sources for dep in ctx.attr.deps]
return struct(providers = [
    DefaultInfo(
        files = depset(sources + outs),
        runfiles = ctx.runfiles(
            files = sources + outs,
            transitive_files = depset(transitive = transitive_sources),
        ),
    ),
    PyInfo(
        transitive_sources = depset(direct = sources + outs, transitive = transitive_sources),
        imports = depset(
            direct = [_path_join(ctx.workspace_name, ctx.label.package, im) for im in ctx.attr.imports],
            transitive = [dep[PyInfo].imports for dep in ctx.attr.deps],
        ),
    ),
    _EggLibraryInfo(additional_info = "other stuff"),
])
I'm creating redundant depsets to satisfy these providers, which makes me think maybe I'm doing it wrong.
I have also tried another method of looping over all the default_runfiles of the deps, and using runfiles.merge for DefaultInfo. For simple cases, these methods appear equivalent, but I don't know if there are other scenarios where the approaches would diverge.
The PyInfo documentation could use a section on how transitive_sources fits into DefaultInfo, and why additional mechanisms outside of runfiles need to be provided: https://docs.bazel.build/versions/master/skylark/lib/PyInfo.html
DefaultInfo is a known type to Bazel:
files controls which files are built when you bazel build the target,
runfiles defines which files need to be present in the sandbox when executing the target.
PyInfo is exclusively used by Python rules and is used to propagate metadata to consuming targets.
My guess is that the duplication is necessary because the values may differ, so removing the duplication would either mean Bazel doesn't build/include the right files, or that consuming Python rules would be missing information.
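If the two sets really are the same in your rule, you can at least build the depset once and hand it to both providers. A minimal sketch under that assumption (reusing sources and outs from your snippet, and leaving out imports and the extra provider):
# Build the transitive set of Python sources once.
transitive_sources = depset(
    direct = sources + outs,
    transitive = [dep[PyInfo].transitive_sources for dep in ctx.attr.deps],
)

return [
    DefaultInfo(
        files = depset(sources + outs),
        # The same depset feeds the runfiles tree.
        runfiles = ctx.runfiles(transitive_files = transitive_sources),
    ),
    PyInfo(transitive_sources = transitive_sources),
]
Whether that is enough depends on whether your runfiles ever need to include files (for example data dependencies) that should not be propagated as Python sources.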
