Controlling the use of --whole-archive with Bazel cc_binary() rule - bazel

I would like to control the use of -whole-archive when linking a shared library (.so) using the cc_binary() rule.
The reason I'm using a cc_binary() rule to create a shared library is related to this thread: https://groups.google.com/forum/#!topic/bazel-discuss/NG4N84ar3BY
I have a liba.a that contains two function implementations: a(), a1() which are implemented in separate object files and archived into one .a file.
The code is as follows:
a.c
void a() {
puts("a");
}
a1.c
void a1() {
d();
}
BUILD file
cc_library(
name = 'a',
srcs = [ 'liba.a' ],
hdrs = [ 'a.h' ],
linkstatic = True,
)
I would like to build a shared library that depends (links) with the above library:
b.c
void b() {
a();
puts("b");
}
BUILD file
cc_binary(
name = 'libb.so',
srcs = [ 'b.c' ],
deps = [ ':a' ],
linkshared = True,
)
What I would like to achieve is to link libb.so so that it takes only the required symbols from liba.a. In this case it should pull in only the a.o object and link that into libb.so.
I could not make this happen. When building, Bazel uses -whole-archive for liba.a, so libb.so also contains the implementation of a1() even though it is not required at all.
If -whole-archive were not used, the resulting libb.so would be built correctly and would contain no a1() symbol.
This matters because, with -whole-archive, libb.so now depends on d() for no reason.
This is the snippet output of the linkage command from running bazel build libb.so -s:
>>>>> # //:libb.so [action 'Linking libb.so']
(cd /bazel/jbasila/_bazel_jbasila/9ad84409935838f6b01d4c9936deda53/execroot/__main__ && \
exec env - \
PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/home/jbasila/tools/bin:/home/jbasila/tools/git-tools:/home/jbasila/.local/bin:/home/jbasila/bin:/home/jbasila/tools/bin:/home/jbasila/tools/git-tools:/home/jbasila/tools/bin:/home/jbasila/tools/git-tools:/home/jbasila/.local/bin:/home/jbasila/bin \
PWD=/proc/self/cwd \
/usr/bin/gcc -shared -o bazel-out/local-fastbuild/bin/libb.so '-fuse-ld=gold' -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -B/usr/bin -pass-exit-codes -Wl,-S -Wl,@bazel-out/local-fastbuild/bin/libb.so-2.params)
The content of the file bazel-out/local-fastbuild/bin/libb.so-2.params:
-whole-archive
bazel-out/local-fastbuild/bin/_objs/libb.so/b.pic.o
-no-whole-archive
-whole-archive
liba.a
-no-whole-archive
-lstdc++
-lm
So to the question again, is there a way to make Bazel ditch the use of -whole-archive for liba.a?

You can use --nolegacy_whole_archive to disable setting the whole-archive for dependencies of a shared library. There is a short explanation in https://docs.bazel.build/versions/master/command-line-reference.html.
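To confirm that the flag took effect, one option is to inspect the generated params file again. The following is a minimal sketch, not part of Bazel: the helper name `whole_archived_libs` is hypothetical, and it assumes the params file keeps the one-argument-per-line layout shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the .a files that sit between a
# -whole-archive / -no-whole-archive pair in a linker params file.
# Assumes one argument per line, as in the params file quoted in the question.
whole_archived_libs() {
  awk '
    $0 == "-whole-archive"    { inside = 1; next }
    $0 == "-no-whole-archive" { inside = 0; next }
    inside && /\.a$/          { print }
  ' "$1"
}

# Example invocation (path taken from the question):
#   whole_archived_libs bazel-out/local-fastbuild/bin/libb.so-2.params
```

For the params file quoted in the question this lists liba.a. After rebuilding with `bazel build libb.so --nolegacy_whole_archive`, the list should come back empty if the flag behaves as described.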


Bazel - What is the relation between hdrs vs includes on a cc_library to generate gcc command with -I instead of -isystem or -iquote?

Looks like there is a relation between creating a cc_library on a BUILD file using hdrs and includes to generate the gcc command with -isystem, -iquote and -I.
Official Bazel doc
To explain my problem better, I have this example.
I have a Bazel project with the following files:
main.cpp
WORKSPACE
BUILD
The dependencies are:
1. main.cpp needs foo.hpp
2. foo.hpp needs access to #include "bar.h", so a library hdrs-bar is provided
3. bar.h needs access to #include <system_bar.h>, so a library hdrs-system-bar is provided. (Note that <> on the include is needed.)
4. A binary called demo is created with all these dependencies
As described on (1), the main.cpp contains:
#include "foo.hpp"
int main()
{
// ...something which uses foo.hpp
return 0;
}
As described in (2), to be able to access foo.hpp from main.cpp I need to create a Bazel project out of a git repository and create a BUILD file for it using build_file_content, which contains a library that exposes all the *.h files I need (including bar.h).
Therefore the WORKSPACE file contains:
load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
new_git_repository(
name = "bar-utils",
remote = "https://github.com/someGitRepo/bar-utils.git",
branch = "master",
build_file_content = """
package(default_visibility = ["//visibility:public"])
cc_library(
name = "hdrs-bar",
hdrs = glob(["bar/*.h"], allow_empty=False),
strip_include_prefix = "bar",
)
""",
)
As described on (3), the WORKSPACE file also contains a creation of a library called hdrs-system-bar which contains the system headers needed from bar.h:
new_local_repository(
name = "bar-system",
path = "/usr/local/bar/include",
build_file_content = """
cc_library(
name = "hdrs-system-bar",
hdrs = glob(["*.h"], allow_empty=False),
visibility = ["//visibility:public"],
)
""",
)
As described in (4), the BUILD file contains a binary which gets all the dependencies in deps:
cc_binary(
name = "demo",
srcs = ["main.cpp"],
deps = ["@bar-utils//:hdrs-bar", "@bar-system//:hdrs-system-bar"],
)
If I compile:
$ bazel build //... --sandbox_debug
I get the following build ERROR:
bazel-out/aarch64-fastbuild/bin/external/bar-utils/_virtual_includes/hdrs-bar/barUtils.h:7:10: fatal error: bar_runtime.h: No such file or directory #include <bar_runtime.h>
The file bar_runtime.h should be inside the library hdrs-system-bar.
If I look at my sandbox folder at .../sandbox/linux-sandbox/4/execroot/__main__/, I see the following directories:
main.cpp
bazel-out
external
Under external, I have these two folders, each of which contains all the header files I need:
bar-utils
bar-system
On the bazel-out, I have the following:
aarch64-fastbuild/bin/external/bar-utils/_virtual_includes/hdrs-bar/bar.h
I was expecting inside this bazel-out external to have also the bar-system, but there is only bar-utils. Why is that?
To add more info the gcc command generated was like this (removed all the non-necessary arguments for sake of simplicity):
/usr/bin/gcc -iquote external/bar-utils -iquote external/bar-system -c main.cpp
If I simply use includes instead of hdrs on all the libraries creation, I get a lot of -isystem instead of the -iquote. But what I need is actually the -I<folder where the headers are>.
If I manually change my gcc command to:
/usr/bin/gcc -iquote external/bar-utils -Iexternal/bar-system -c main.cpp
Then it works.
How can I force it to use -I? Changing the cc_libraries to srcs or includes instead of hdrs is not working.
The solution is to also set the includes attribute, in addition to hdrs. This way the headers show up in the sandbox and Bazel passes the directory to gcc with -isystem, which also applies to #include <...>.
cc_library(
name = "hdrs-system-bar",
hdrs = glob(["*.h"], allow_empty=False),
includes = ["."],
visibility = ["//visibility:public"],
)
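The reason -iquote alone was not enough: GCC consults -iquote directories only for `#include "..."`, while -I and -isystem directories are also searched for `#include <...>`. A small sketch to see this outside Bazel, assuming a C compiler is available as `gcc` (or through `$CC`); the helper name `demo_include_flag` and the temporary file layout are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compile a file that uses #include <bar_runtime.h>
# with the header directory passed through the given flag
# (-iquote, -I or -isystem). Returns the compiler's exit status.
demo_include_flag() {
  local flag="$1" tmp status=0
  tmp="$(mktemp -d)"
  mkdir "$tmp/sys"
  echo '#define BAR_RUNTIME 1' > "$tmp/sys/bar_runtime.h"
  printf '#include <bar_runtime.h>\nint main(void) { return 0; }\n' > "$tmp/main.c"
  # -fsyntax-only: preprocess and parse only, no object file is written.
  "${CC:-gcc}" "$flag" "$tmp/sys" -fsyntax-only "$tmp/main.c" 2>/dev/null || status=$?
  rm -rf "$tmp"
  return "$status"
}
```

`demo_include_flag -iquote` fails with the same kind of "No such file or directory" error as in the question, whereas `demo_include_flag -I` and `demo_include_flag -isystem` succeed.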

How to specify package/derivation runtime dependencies with Nix?

I'm making a haskell program and I'm setting buildInput like this to include pkgs.ffmpeg-full:
(myHaskellPackages.callCabal2nix "App" (./.) {}).overrideAttrs (oldAttrs: {
buildInputs = (oldAttrs.buildInputs or []) ++ [ pkgs.ffmpeg-full ];
})
However this seems to make the ffmpeg package accessible during build time only rather than runtime of the application.
What attribute do I need to set for ffmpeg-full to be available during runtime - being able to invoke the ffmpeg executable?
There is a section about runtime dependencies in Nix Pills, but I don't understand it: how can Nix always determine runtime dependencies from hashes alone? If I reference an executable in a shell script, surely Nix does not parse the shell script to find the executable I reference. https://nixos.org/guides/nix-pills/automatic-runtime-dependencies.html#idm140737320205792
Something is different for runtime dependencies however. Build
dependencies are automatically recognized by Nix once they are used in
any derivation call, but we never specify what are the runtime
dependencies for a derivation.
There's really black magic involved. It's something that at first
glance makes you think "no, this can't work in the long term", but at
the same time it works so well that a whole operating system is built
on top of this magic.
In other words, Nix automatically computes all the runtime
dependencies of a derivation, and it's possible thanks to the hash of
the store paths.
default.nix:
{
ghc ? "ghc8106",
pkgs ? import <nixpkgs> {}
}:
with pkgs.haskell.lib;
let
haskellPkgs = pkgs.haskell.packages.${ghc};
inherit (pkgs) lib;
mySourceRegexes = [
"^app.*$"
"^.*\\.cabal$"
"package.yaml"
];
myApp = (haskellPkgs.callCabal2nix "my-hello"
(lib.sourceByRegex ./. mySourceRegexes) { });
in myApp
.overrideAttrs(
oa: {
nativeBuildInputs = oa.nativeBuildInputs ++ [pkgs.hello pkgs.makeWrapper];
installPhase = oa.installPhase + ''
ln -s ${pkgs.hello.out}/bin/hello $out/bin/hello
'';
postFixup = ''
wrapProgram $out/bin/x-exe --prefix PATH : ${pkgs.lib.makeBinPath [ pkgs.hello ]}
'';
})
src/Main.hs:
module Main where
import System.Process (callCommand)
main :: IO ()
main = do
putStrLn "HELLO"
callCommand "hello"
putStrLn "BYE"
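For orientation, `wrapProgram ... --prefix PATH : <dir>` replaces the executable with a small script that prepends `<dir>` to PATH and then runs the real binary, which is why `callCommand "hello"` can find `hello` at runtime. The effect is roughly the following. This is a simplified sketch, not the script that makeWrapper actually generates, and `run_with_prefixed_path` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the effect of `--prefix PATH : <dir>`:
# run a command with <dir> placed in front of the existing PATH.
run_with_prefixed_path() {
  local extra_dir="$1"
  shift
  (
    # Prepend, keeping any existing PATH entries after the new directory.
    export PATH="${extra_dir}${PATH:+:${PATH}}"
    "$@"
  )
}

# Example: run_with_prefixed_path /some/tool/bin my-program --flag
```

The subshell keeps the modified PATH local to the wrapped command, so the caller's environment is left unchanged.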
Seems this is not directly supported with an explicitly stated list of dependencies. However we can indirectly achieve this with "wrapping".
I found more information about wrapping here: https://nixos.wiki/wiki/Nix_Cookbook#Wrapping_packages
So I can do a ls that references the package.
...
appPkg = (myHaskellPackages.callCabal2nix "HaskellNixCabalStarter" (./.) {}).overrideAttrs (oldAttrs: {
buildInputs = (oldAttrs.buildInputs or []) ++ [ pkgs.ffmpeg-full ];
});
in
(pkgs.writeScriptBin "finderapp" ''
#!${pkgs.stdenv.shell}
ls ${pkgs.ffmpeg-full}/bin/ffmpeg
exec ${appPkg}/bin/app
''
)
We can verify that the built package correctly depends on the appropriate store paths with:
nix-store -q --references result
/nix/store/0cq84xic2absp75ciajv4lfx5ah1fb59-ffmpeg-full-4.2.2
/nix/store/rm1hz1lybxangc8sdl7xvzs5dcvigvf7-bash-4.4-p23
/nix/store/wlvnjx53xfangaa4m5rmabknjbgpvq3d-HaskellNixCabalStarter-0.1.0.0
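This also addresses the "black magic" part of the question: Nix does not parse the shell script. After the build it scans the output for the hash parts of store paths that were available at build time, and every hash it finds becomes a runtime reference. Because `${pkgs.ffmpeg-full}` is interpolated into the script text, the store path is there to be found. A rough imitation of that scan, as a sketch only: the function name `find_store_refs` is hypothetical, and real Nix searches for the hashes of known inputs rather than using a regular expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every distinct /nix/store/<hash>-<name>
# path mentioned anywhere in the files under the given directory.
# Store path hashes are 32 characters long.
find_store_refs() {
  grep -rhoE '/nix/store/[0-9a-z]{32}-[A-Za-z0-9+._?=-]+' "$1" | sort -u
}

# Example: find_store_refs ./result
```

Run against a wrapper script like the one above, this would list the ffmpeg-full, bash and application paths, matching what `nix-store -q --references` reports.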

How do I make a bazel `sh_binary` target depend on other binary targets?

I have set up bazel to build a number of CLI tools that perform various database maintenance tasks. Each one is a py_binary or cc_binary target that is called from the command line with the path to some data file: it processes that file and stores the results in a database.
Now, I need to create a dependent package that contains data files and shell scripts that call these CLI tools to perform application-specific database operations.
However, there doesn't seem to be a way to depend on the existing py_binary or cc_binary targets from a new package that only contains sh_binary targets and data files. Trying to do so results in an error like:
ERROR: /workspace/shbin/BUILD.bazel:5:12: in deps attribute of sh_binary rule //shbin:run: py_binary rule '//pybin:counter' is misplaced here (expected sh_library)
Is there a way to call/depend on an existing bazel binary target from a shell script using sh_binary?
I have implemented a full example here:
https://github.com/psigen/bazel-mixed-binaries
Notes:
I cannot use py_library and cc_library instead of py_binary and cc_binary. This is because (a) I need to call mixes of the two languages to process my data files and (b) these tools are from an upstream repository where they are already designed as CLI tools.
I also cannot put all the data files into the CLI tool packages -- there are multiple application-specific packages and they cannot be mixed.
You can either create a genrule to run these tools as part of the build, or create a sh_binary that depends on the tools via the data attribute and runs them.
The genrule approach
This is the easier way and lets you run the tools as part of the build.
genrule(
name = "foo",
tools = [
"//tool_a:py",
"//tool_b:cc",
],
srcs = [
"//source:file1",
":file2",
],
outs = [
"output_file1",
"output_file2",
],
cmd = "$(location //tool_a:py) --input=$(location //source:file1) --output=$(location output_file1) && $(location //tool_b:cc) < $(location :file2) > $(location output_file2)",
)
The sh_binary approach
This is more complicated, but lets you run the sh_binary either as part of the build (if it is in a genrule.tools, similar to the previous approach) or after the build (from under bazel-bin).
In the sh_binary you have to data-depend on the tools:
sh_binary(
name = "foo",
srcs = ["my_shbin.sh"],
data = [
"//tool_a:py",
"//tool_b:cc",
],
)
Then, in the sh_binary you have to use the so-called "Bash runfiles library" built into Bazel to look up the runtime-path of the binaries. This library's documentation is in its source file.
The idea is:
the sh_binary has to depend on a specific target
you have to copy-paste some boilerplate code to the top of the sh_binary (reason is described here)
then you can use the rlocation function to look up the runtime-path of the binaries
For example your my_shbin.sh may look like this:
#!/bin/bash
# --- begin runfiles.bash initialization ---
...
# --- end runfiles.bash initialization ---
path=$(rlocation "__main__/tool_a/py")
if [[ ! -f "${path:-}" ]]; then
echo >&2 "ERROR: could not look up the Python tool path"
exit 1
fi
$path --input=$1 --output=$2
The __main__ in the rlocation path argument is the name of the workspace. Since your WORKSPACE file does not contain a workspace() rule, which would define the workspace's name, Bazel uses the default workspace name, __main__.
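To make the idea behind rlocation concrete without reproducing the elided boilerplate, here is a simplified stand-in. It is not the Bazel runfiles library: `resolve_runfile` is a hypothetical name, and the sketch assumes that either `RUNFILES_DIR` points at a runfiles tree or `RUNFILES_MANIFEST_FILE` points at a manifest whose lines have the form `<runfiles path> <real path>`, with no spaces in either path.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified lookup in the spirit of rlocation.
# Prints the real path for a runfiles key such as "__main__/tool_a/py",
# or returns non-zero when the key cannot be resolved.
resolve_runfile() {
  local key="$1"
  if [[ -n "${RUNFILES_DIR:-}" && -e "${RUNFILES_DIR}/${key}" ]]; then
    # Directory-based runfiles: the key is a path below the runfiles root.
    echo "${RUNFILES_DIR}/${key}"
  elif [[ -f "${RUNFILES_MANIFEST_FILE:-}" ]]; then
    # Manifest-based runfiles: find the line whose first field is the key.
    awk -v key="$key" '
      $1 == key { print $2; found = 1; exit }
      END       { exit !found }
    ' "${RUNFILES_MANIFEST_FILE}"
  else
    return 1
  fi
}
```

The two branches show why a fixed relative path is not enough: depending on the platform and configuration, the tool may live in a runfiles directory or be reachable only through the manifest.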
An easier approach for me is to add the cc_binary as a dependency in the data section. In prefix/BUILD
cc_binary(name = "foo", ...)
sh_test(name = "foo_test", srcs = ["foo_test.sh"], data = [":foo"])
Inside foo_test.sh, the working directory is different, so you need to find the right prefix for the binary
#! /usr/bin/env bash
executable=prefix/foo
$executable ...
A clean way to do this is to use args and $(location):
Contents of BUILD:
py_binary(
name = "counter",
srcs = ["counter.py"],
main = "counter.py",
)
sh_binary(
name = "run",
srcs = ["run.sh"],
data = [":counter"],
args = ["$(location :counter)"],
)
Contents of counter.py (your tool):
print("This is the counter tool.")
Contents of run.sh (your bash script):
#!/bin/bash
set -eEuo pipefail
counter="$1"
shift
echo "This is the bash script, about to call the counter tool."
"$counter"
And here's a demo showing the bash script calling the Python tool:
$ bazel run //example:run 2>/dev/null
This is the bash script, about to call the counter tool.
This is the counter tool.
It's also worth mentioning this note (from the docs):
The arguments are not passed when you run the target outside of bazel (for example, by manually executing the binary in bazel-bin/).
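Because of that note, a script meant to be run both ways may want a fallback for the case where `$1` is missing. One possible sketch, under assumptions that are not in the answer: the helper name `pick_counter` is made up, and the fallback location assumes a `<binary>.runfiles/<workspace>/<package>/<target>` layout with the default workspace name `__main__` and the `example` package from the demo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose the counter tool path.
#   $1 = path of the running script (normally "$0")
#   $2 = tool path passed by Bazel through `args`, possibly empty
pick_counter() {
  local script="$1" arg="${2:-}"
  local fallback="${script}.runfiles/__main__/example/counter"
  if [[ -n "$arg" ]]; then
    # Started through `bazel run`: Bazel supplied the path.
    echo "$arg"
  elif [[ -x "$fallback" ]]; then
    # Started by hand from bazel-bin: look in the assumed runfiles tree.
    echo "$fallback"
  else
    echo >&2 "ERROR: counter tool not found"
    return 1
  fi
}
```

In run.sh this could replace the plain `counter="$1"` line with `counter="$(pick_counter "$0" "${1:-}")"`.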

Binary shell rule dependency

We're using Bazel at work for wrapping and organizing some build systems that we use for our projects. Since each project has its own build system (bitbake, for example), we can't use Bazel for the actual building (e.g. creating cc_binary and cc_library rules that contain all the source code). But we do use it for wrapping and for a uniform API.
I want to make a shell script (A) depend on another shell script (B), such that B runs before A. Since we have some configuration inside the BUILD files, I can't simply run B and then A from the command line. I need Bazel to do so, and to inject the configuration into both scripts. Something like:
In the BUILD file of one component:
sh_binary(
name = "bazel_build_multivisor",
srcs = ["bazel-build-multivisor.sh"],
data = ["wrkspceinfo"],
deps = ["//core-build/components/bazel-pull:bazel_pull_multivisor"])
In the BUILD file of a second component which is in charge of pulling from git:
sh_binary(
name = "bazel_pull_multivisor",
srcs = ["pull_repo_repository.sh"],
data = ["wrkspceinfo", "pull_repo_repository.sh"],
args = [ARGS, "NAME=multivisor_repo;MODULE_NAME=*;GIT_BRANCH=*;MANIFEST=*.xml;PULL_PARAMS=8,8,8"],)
Meaning, I want a sh_binary rule to depend on a different sh_binary rule.
Is this possible? Is there a better way doing so?
Thanks.
Consider using a genrule.
You can specify both sh_binary targets in the genrule.tools attribute, and then in the genrule.cmd attribute, specify the shell command that calls them in order.
For example,
genrule(
name = "pull_and_build",
srcs = ["some_config_file.txt"],
outs = ["built_project.tar.gz"],
tools = [":pull_sh", ":build_sh"],
cmd = """
$(location :pull_sh) $< | \
xargs -1 $(location :build_sh) | \
xargs -1 tar -czvf $@"""
)
Another possible way is to wrap the dependency sh_binary in a sh_library instead, and aggregate them at the top level sh_binary.

In Bazel, how can I make a C++ library depend on a general rule?

I have a library that depends on graphics files that are generated by a shell script.
I would like the library, when it is compiled, to use the shell script to generate the graphics files, which should be copied as if it were a 'data' statement, but whenever I try to make the library depend on the genrule, I get
in deps attribute of cc_library rule //graphics_assets
genrule rule '//graphics_assets:assets_gen_rule' is misplaced here
(expected cc_inc_library, cc_library, objc_library or
cc_proto_library)
# This is the correct format.
# Here we want to run all the shader .glsl files through the program
# file_utils:archive_tool (which we also build elsewhere) and copy the
# output .arc file to the data.
# 1. List the source files
filegroup(
name = "shader_source",
srcs = glob([
"shaders/*.glsl",
]),
)
# 2. invoke file_utils::archive_tool on the shaders
genrule(
name = "shaders_gen_rule",
srcs = [":shader_source"],
outs = ["shaders.arc"],
cmd = "echo $(locations shader_source) > temp.txt ; \
$(location //common/file_utils:archive_tool) \
--create_from_list=temp.txt \
--archive $(OUTS) ; \
$(location //common/file_utils:archive_tool) \
--ls --archive $(OUTS) ",
tools = ["//common/file_utils:archive_tool"],
)
# 3. When a binary depends on this library, the .arc file will be copied.
# This is the thing I had trouble with
cc_library(
name = "shaders",
srcs = [], # Something
data = [":shaders_gen_rule"],
linkstatic = 1,
)
