How do I tell Bazel where Python.h lives?

I'm building a C++ executable that needs to #include "Python.h" from the user's Python installation.
To express Python.h (and the various header files it includes) in Bazel, I need to know where the Python include directory is. This location will be different on Windows and Linux, and I'd like a single Bazel configuration to build them both.
What's the best Bazel practice for referencing software that exists outside of the WORKSPACE root?

To tell Bazel about external dependencies, you use one of the workspace rules to specify the location of the external dependency, as well as the BUILD file Bazel should use for it.
To make the build work cross-platform, you use the select() function so that Bazel picks the proper library to build against for the host operating system.
Here's a stab at accomplishing it:
First we have the WORKSPACE file in your project's root that defines the two libraries and the BUILD file content to use for them. Here I'm just using build_file_content, but if that becomes too complex you can put it in its own file and reference it instead. The BUILD file here exposes the prebuilt library shipped with Python along with the header files needed. It also adds an include path for any targets that depend on these libraries, so you can do #include "Python.h".
new_local_repository(
    name = "python_linux",
    path = "/usr",
    build_file_content = """
cc_library(
    name = "python35-lib",
    srcs = ["lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5.so"],
    hdrs = glob(["include/python3.5/*.h"]),
    includes = ["include/python3.5"],
    visibility = ["//visibility:public"],
)
""",
)
new_local_repository(
    name = "python_win",
    path = "C:/Python35",
    build_file_content = """
cc_library(
    name = "python35-lib",
    srcs = ["libs/python35.lib"],
    hdrs = glob(["include/*.h"]),
    includes = ["include/"],
    visibility = ["//visibility:public"],
)
""",
)
Next is the BUILD file for your application. Here you need to define some config_setting targets; these allow us to define platform-dependent settings for the build, using the cpu value to determine the host OS.
In the cc_binary rule we use the select() function to choose the correct host library to link against based on the configuration.
config_setting(
    name = "linux_x86_64",
    values = {"cpu": "k8"},
    visibility = ["//visibility:public"],
)

config_setting(
    name = "windows",
    values = {"cpu": "x64_windows"},
    visibility = ["//visibility:public"],
)
cc_binary(
    name = "python-test",
    srcs = [
        "main.c",
    ],
    deps = select({
        "//:linux_x86_64": [
            "@python_linux//:python35-lib",
        ],
        "//:windows": [
            "@python_win//:python35-lib",
        ],
    }),
)
FWIW here's the main.c I was playing around with to get this working.
#include "Python.h"

int main(int argc, char *argv[])
{
    /* Py_SetProgramName() takes a wchar_t* in Python 3. */
    wchar_t *program = Py_DecodeLocale(argv[0], NULL);
    Py_SetProgramName(program);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("from time import time,ctime\n"
                       "print('Today is', ctime(time()))\n");
    Py_Finalize();
    PyMem_RawFree(program);
    return 0;
}
Another (and perhaps simpler) way is to check the Python headers and libraries into your repository. You will still need select() to choose the correct library to link against, but at least you won't need to add anything to your WORKSPACE file and can just rely on another BUILD file in your repository. If you look at the Bazel repo, lots of external dependencies are checked into the third_party directory, so it's a common practice.
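If you go the checked-in route, a hypothetical third_party/python/BUILD could look like this (all paths and target names are illustrative assumptions about how the files were vendored):

```python
# third_party/python/BUILD -- hypothetical layout of vendored files.
cc_library(
    name = "python35-linux",
    srcs = ["linux/lib/libpython3.5.so"],
    hdrs = glob(["linux/include/*.h"]),
    strip_include_prefix = "linux/include",
    visibility = ["//visibility:public"],
)

cc_library(
    name = "python35-windows",
    srcs = ["windows/libs/python35.lib"],
    hdrs = glob(["windows/include/*.h"]),
    strip_include_prefix = "windows/include",
    visibility = ["//visibility:public"],
)
```

Your cc_binary would then select() between //third_party/python:python35-linux and //third_party/python:python35-windows instead of the external repositories.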

Related

Bazel: How to make new_local_repository target depend on target generated by http_archive?

I have several third-party libraries that depend on OpenSSL, so I fetch and build OpenSSL via the repository mechanism (http_archive()). I also have gRPC, which fetches BoringSSL; BoringSSL defines the same symbols as OpenSSL, so after linking I get errors due to the collision.
I want to redefine boringssl using new_local_repository(), but I don't know how to pass a generated path to the path argument, or how to make the new_local_repository() call depend on the openssl target.
The code I want to get looks like this:
new_local_repository(
    name = "boringssl",
    ??? path = "bazel-out/k8-fastbuild/bin/external/openssl/openssl/",  <-- generated path with openssl
    build_file_content = """
cc_library(
    name = "ssl",
    deps = ["@openssl"],
    srcs = ["lib/libssl.a"],
    hdrs = glob(["include/openssl/*.h"]),
    strip_include_prefix = "/include/openssl",
    visibility = ["//visibility:public"],
)

cc_library(
    name = "crypto",
    deps = ["@openssl"],
    srcs = ["lib/libcrypto.a"],
    hdrs = glob(["include/openssl/*.h"]),
    strip_include_prefix = "/include/openssl",
    visibility = ["//visibility:public"],
)
""",
)
I would write a replacement @boringssl with aliases for each rule. Something like this as third_party/boringssl/BUILD.bazel:
alias(
    name = "crypto",
    actual = "@openssl//:libcrypto",
    visibility = ["//visibility:public"],
)
Then you can add it to your WORKSPACE with a relative path:
local_repository(
    name = "boringssl",
    path = "third_party/boringssl",
)
This also lets you freely map target names if they differ. Alternatively, you could write cc_library wrappers (without srcs, just deps) to do things like combine multiple @openssl targets into a single one for the @boringssl equivalent.
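For instance, a wrapper cc_library that presents two hypothetical @openssl targets under a single label (the names //:libssl and //:libcrypto are assumptions; use whatever targets the @openssl BUILD file actually exposes):

```python
# third_party/boringssl/BUILD.bazel -- illustrative wrapper.
cc_library(
    name = "ssl",
    # No srcs: this target purely aggregates existing libraries.
    deps = [
        "@openssl//:libssl",
        "@openssl//:libcrypto",
    ],
    visibility = ["//visibility:public"],
)
```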
With your approach, even if you use some hacks to reuse the repository directory or copy the source files, you will still have linking problems. Bazel will build all the @openssl source files twice and then try to link both copies in. The linker might just pick one and make everything work, or it might refuse to link, or you might get subtle problems at runtime from duplicated global state.

bazel: how to create a rule that strips relative path of all files in subfolders

I'm trying to write a Bazel BUILD file for GSL.
The problem is that it has various gsl_*.h header files in subfolders, but they are always included as #include <gsl/gsl_somename.h>. So, for example, the header gsl_errno.h that lives in gsl/err/gsl_errno.h is included as #include <gsl/gsl_errno.h>, and gsl_math.h that lives in gsl/gsl_math.h is also included as #include <gsl/gsl_math.h>.
I tried to create a separate cc_library for each folder and use strip_include_prefix and include_prefix like so:
cc_library(
    name = "gsl_sys",
    visibility = ["//visibility:public"],
    srcs = [
        "sys/minmax.c",
        "sys/prec.c",
        "sys/hypot.c",
        "sys/log1p.c",
        "sys/expm1.c",
        "sys/coerce.c",
        "sys/invhyp.c",
        "sys/pow_int.c",
        "sys/infnan.c",
        "sys/fdiv.c",
        "sys/fcmp.c",
        "sys/ldfrexp.c",
    ],
    hdrs = glob(
        include = ["sys/*.h"],
    ),
    strip_include_prefix = "sys/",
    include_prefix = "gsl/",
)
but the problem is that if I go by folder, there are circular dependencies (for example, gsl/gsl_math.h includes gsl/sys/gsl_sys.h, but some files in gsl/sys include gsl_*.h files that live in the gsl/ root folder).
I think optimally I'd have one cc_library with all the gsl_*.h files, but such that they are all accessible as #include <gsl/gsl_*.h> regardless of which subfolder they are in.
How can I achieve that?
I would copy them all to a new folder, and then use those new copied versions for your cc_library. A genrule is the simplest way to do this. Something like this in a BUILD file at the top level (don't put BUILD files in any of the subfolders; you want it all in the same package so one rule can handle all the files):
# You could list all the headers instead of the glob, or something
# similar, if you only want a subset of them.
all_headers = glob(["gsl/*/*.h"])
# This is the logic for actually remapping the paths. Consider a macro
# instead of writing it inline like this if it gets more complicated.
# The extra gsl/ segment means that after strip_include_prefix below,
# the headers are included as <gsl/...>.
unified_header_paths = ["unified_gsl/gsl/" + p.split("/")[-1] for p in all_headers]

genrule(
    name = "unified_gsl",
    srcs = all_headers,
    outs = unified_header_paths,
    cmd = "\n".join(["cp $(location %s) $(location %s)" %
                     (src, dest) for src, dest in zip(all_headers, unified_header_paths)]),
)
The files would end up like this after copying:
unified_gsl/gsl/gsl_math.h
unified_gsl/gsl/gsl_sys.h
And then you can write a cc_library like:
cc_library(
    name = "gsl_headers",
    hdrs = [":unified_gsl"],
    strip_include_prefix = "unified_gsl/",
)
cc_library.hdrs is looking for files, so it will grab all the outputs from the genrule.
If you want to do more complicated things with the files than just moving them around, consider a full custom rule. If you include all the copied headers in your DefaultInfo.files, then just passing the target's label to cc_library.hdrs will work like it does with the genrule.
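As a rough sketch of that custom-rule approach (untested; the rule and attribute names are invented for illustration), a rule that copies each header under a unified prefix and returns the copies in DefaultInfo.files might look like:

```python
# unify_headers.bzl -- hypothetical sketch of a header-remapping rule.
def _unify_headers_impl(ctx):
    outs = []
    for hdr in ctx.files.hdrs:
        # Copy each header to unified_gsl/gsl/<basename>.
        out = ctx.actions.declare_file("unified_gsl/gsl/" + hdr.basename)
        ctx.actions.run_shell(
            inputs = [hdr],
            outputs = [out],
            command = "cp '{}' '{}'".format(hdr.path, out.path),
        )
        outs.append(out)

    # Returning the copies in DefaultInfo.files means cc_library.hdrs
    # can consume this target's label directly, as with the genrule.
    return [DefaultInfo(files = depset(outs))]

unify_headers = rule(
    implementation = _unify_headers_impl,
    attrs = {"hdrs": attr.label_list(allow_files = [".h"])},
)
```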

Bazel: share macro between multiple http_archive BUILD files

My project depends on some external libraries which I have to bazelfy myself. Thus, my WORKSPACE:
http_archive(
    name = "external_lib_component1",
    build_file = "//third_party:external_lib_component1.BUILD",
    sha256 = "xxx",
    urls = ["https://example.org/external_lib_component1.tar.gz"],
)

http_archive(
    name = "external_lib_component2",
    build_file = "//third_party:external_lib_component2.BUILD",
    sha256 = "yyy",
    urls = ["https://example.org/external_lib_component2.tar.gz"],
)
...
The two entries above are similar, and external_lib_component{1, 2}.BUILD share a lot of code.
What is the best way to share code (macros) between them?
Just putting a shared_macros.bzl file into third_party/ won't work, because it will not be copied into
the archive location on build (only the build_file is copied).
Place a .bzl file such as ./third_party/shared_macros.bzl into your tree, as you've mentioned.
Then in the //third_party:external_lib_component1.BUILD and //third_party:external_lib_component2.BUILD you provide for your external dependencies, you can load symbols from that shared file using:
load("@//third_party:shared_macros.bzl", ...)
Labels starting with @// refer to packages from the main repository, even when used in an external dependency (labels starting with just // would otherwise be resolved relative to the external repository). You can check the docs on labels, in particular the last paragraph.
Alternatively, you can refer to the "parent" project by its name. If your WORKSPACE file has:
workspace(name = "parent")
You could say:
load("@parent//third_party:shared_macros.bzl", ...)
Note: in Bazel versions prior to 2.0.0 you might want to add --incompatible_remap_main_repo if you mix both of the above approaches in your project.

How to generate cc_library from an output directory from a genrule?

I have a binary that takes as input a single file and produces an unknown number of header and source C++ files into a single directory. I would like to be able to write a target like:
x_library(
name = "my_x_library",
src = "source.x",
)
where x_library is a macro that ultimately produces the cc_library from the output files. However, I can't bundle all the output files inside the rule implementation or inside the macro. I tried this answer but it doesn't seem to work anymore.
What's the common solution to this problem? Is it possible at all?
Small example of a macro using a genrule (not a huge fan) to get one C file and one header and provide them as a cc_library:
def x_library(name, src):
    srcfile = "{}.c".format(name)
    hdrfile = "{}.h".format(name)
    native.genrule(
        name = "files_{}".format(name),
        srcs = [src],
        outs = [srcfile, hdrfile],
        cmd = "$(location generator.sh) $< $(OUTS)",
        tools = ["generator.sh"],
    )
    native.cc_library(
        name = name,
        srcs = [srcfile],
        hdrs = [hdrfile],
    )
Used it like this then:
load(":myfile.bzl", "x_library")

x_library(
    name = "my_x_library",
    src = "source.x",
)

cc_binary(
    name = "tgt",
    srcs = ["mysrc.c"],
    deps = [":my_x_library"],
)
You should be able to extend that to any number of files (and to C++ content; IIRC the suffixes are used to decide automatically how to invoke the tools), as long as the mapping from generator input to generated content is known and stable (generally a good thing for a build). Otherwise you can no longer use genrule: you need a custom rule (probably a good thing anyway) that uses a TreeArtifact, as described in the linked answer, or two TreeArtifacts, one with a .cc suffix and one with .hh, so that you can pass them to cc_library.
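For reference, a rough sketch of such a TreeArtifact-based rule (an illustration, not a drop-in solution; the attribute names are invented, it assumes the generator takes an input path and an output directory, and whether cc_library accepts a directory output in srcs/hdrs depends on your Bazel version):

```python
# x_generate.bzl -- hypothetical TreeArtifact sketch.
def _x_generate_impl(ctx):
    # A TreeArtifact: a directory whose contents are only known
    # at execution time.
    out_dir = ctx.actions.declare_directory(ctx.label.name + "_gen")
    ctx.actions.run(
        inputs = [ctx.file.src],
        outputs = [out_dir],
        executable = ctx.executable.generator,
        arguments = [ctx.file.src.path, out_dir.path],
    )
    return [DefaultInfo(files = depset([out_dir]))]

x_generate = rule(
    implementation = _x_generate_impl,
    attrs = {
        "src": attr.label(allow_single_file = True),
        "generator": attr.label(executable = True, cfg = "exec"),
    },
)
```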

Using bazel macros across repositories with labels

I've got two repositories, Client and Library.
Inside of Client's WORKSPACE file Client imports Library as a http_archive with the name "foo".
Inside of Client, I want to use Library macros that reference targets inside Library. My problem is that the Library macros don't know that they were imported as "foo", so when the macro is expanded the targets are not found.
library/WORKSPACE:
workspace(name = "library")
library/some.bzl:
def my_macro():
    native.java_library(
        name = "my_macro_lib",
        deps = ["@library//:my_macro_lib_dependnecy"],
    )
library/BUILD.bazel:
java_library(
    name = "my_macro_lib_dependnecy",
    ...
)
client/WORKSPACE:
workspace(name = "client")
http_archive(
    name = "library",
    urls = [...],
    strip_prefix = ...,
    sha256 = ...,
)
Because both workspaces use the same name for the library workspace (workspace(name = "library")), and because the macro refers to the workspace name in its dependencies (@library//:my_macro_lib_dependnecy), this works.
Note: this works but has some quirks, which will be resolved in Bazel 0.17.0.
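One way to avoid relying on the two workspace names matching is to let the macro take the repository prefix as a parameter. A hypothetical variant of the macro above:

```python
# library/some.bzl -- illustrative variant with a repo parameter.
def my_macro(repo = "@library"):
    native.java_library(
        name = "my_macro_lib",
        deps = [repo + "//:my_macro_lib_dependnecy"],
    )
```

A client that imported the library under a different name could then call my_macro(repo = "@foo").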