Bazel, python: moving tests breaks imports in a pip_requirement

I am trying to move a large project to using Bazel, and I am starting small. I have found a small wrapper around pydantic in our project, and I am trying to "bazelify" that first.
The initial structure of the package was something like this:
pydantic_utils
+- __init__.py
+- some_module.py
+- some_other_module.py
+- subdir
|  +- one.py
+- tests
   +- __init__.py
   +- test_some_module.py
I added BUILD.bazel files to this structure and everything worked fine; my tests ran and passed. But then I thought I'd put the tests into the root dir of our wrapper lib. So I moved test_some_module.py one level up, moved the py_test target to the BUILD.bazel file in the root dir, and modified the srcs of both targets not to include the other target's files:
load("@rules_python//python:defs.bzl", "py_library", "py_test")
load("@my_pip_install//:requirements.bzl", "requirement")

py_library(
    name = "pydantic_utils",
    srcs = glob(
        ["*.py"],
        exclude = ["test_*.py"],
    ),
    visibility = ["//visibility:public"],
    deps = [
        "//pydantic_utils/subdir",
        requirement("pydantic"),
    ],
)

py_test(
    name = "test_pydantic_utils_dp",
    srcs = glob(["test_*.py"]),
    main = "test_some_module.py",
    deps = [
        "//pydantic_utils",
        requirement("pytest"),
    ],
)
But now I get an error that pydantic cannot import TYPE_CHECKING for some reason:
...pydantic_utils/test_some_module.py", line 2, in <module>
    from typing import Any
...
File "pydantic/__init__.py", line 2, in init pydantic.__init__
    from .models import (
File "pydantic/dataclasses.py", line 1, in init pydantic.dataclasses
    import re
ImportError: cannot import name TYPE_CHECKING
I know it is a very vague question, but I have no idea how to begin diagnosing it. Can anyone help me?
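One way to begin diagnosing this kind of ImportError is to check which file each suspect module is actually loaded from once the test runs under Bazel's runfiles tree. The sketch below is a generic diagnostic, not code from the post; a stale `typing` backport package shadowing the stdlib module is one known cause of "cannot import name TYPE_CHECKING":

```python
# diagnose_imports.py (hypothetical helper): print where 'typing' resolves
# from and what sys.path looks like inside the test environment.
import sys
import typing

# If this prints a path inside a runfiles/site-packages copy of a 'typing'
# backport instead of the stdlib, that shadowing explains the ImportError.
print("typing loaded from:", getattr(typing, "__file__", "<frozen/built-in>"))
print("has TYPE_CHECKING:", hasattr(typing, "TYPE_CHECKING"))

# sys.path ordering decides which copy of a module wins under Bazel,
# so printing it shows exactly what the test interpreter sees.
for entry in sys.path:
    print(entry)
```

Dropping a few lines like these at the top of the failing test (or running them as a one-off py_test) narrows the problem to either a shadowed module or a broken runfiles layout.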

Please upgrade your Python to the latest version if you are using an old one. It has support for 3.10 and above. That is a general error log from the Python implementation. Reach out if you are still seeing the error after the upgrade.
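If the interpreter version really is the suspect, a minimal py_test can confirm which Python Bazel actually runs before doing a full upgrade. This is a sketch; the file name and version floor are made up for illustration:

```python
# test_python_version.py (hypothetical): fail loudly when Bazel picked up an
# unexpectedly old interpreter.
import sys

def test_interpreter_is_recent():
    # Adjust the floor to whatever your dependencies require.
    assert sys.version_info >= (3, 7), "interpreter too old: %s" % (sys.version,)

if __name__ == "__main__":
    test_interpreter_is_recent()
    # Print the full version string so the test log shows what actually ran.
    print(sys.version)
```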

Related

Unable to use a hermetic python in a Bazel Workspace as an external dependency

I've been working on a small project trying to teach myself how to use Bazel. The goal is to download an http_archive that contains a Python interpreter. Once the interpreter has been added to the environment, I want to run the command `python.exe --version` and write its output into a file.
The issues I have the most difficulty with at the moment are the following:
I am not confident that I am correctly injecting the hermetic python BUILD file into the hermetic python package (I keep getting the message "BUILD file not found in any of the following directories. Add a BUILD file to a directory to mark it as a package").
I'm pretty sure that when I pass in python_compiler = ["@hermetic_python"] in the BUILD file I'm just getting a string and not a reference to the files in the package
Here is an overview of my project and the code files. Any help would be appreciated! :D
Project structure:
|-- WORKSPACE
|-- BUILD
|-- custom_rules.bzl
|-- main.py
|-- custom-rules/
    |-- BUILD.custom_python
    |-- custom_python_rules.bzl
WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "hermetic_python",
    urls = ["https://www.python.org/ftp/python/3.9.10/python-3.9.10-embed-amd64.zip"],
    sha256 = "67161cab713a52f6658b76274f8dbe0cd2f256aab1e84f94cd557d4a906fa5d2",
    build_file = "@//:custom-rules/BUILD.custom_python",
)
BUILD File:
load("//:custom_rules.bzl", "build_with_custom_python")

build_with_custom_python(
    name = "write-to-file",
    python_compiler = ["@hermetic_python"],
)
custom_rules.bzl
def _build_with_custom_python_impl(ctx):
    out_file = ctx.actions.declare_file("file_with_python_version.txt")
    ctx.actions.run(
        outputs = [out_file],
        executable = ctx.attr.python_compiler,
        arguments = ["--version"],
    )
    return DefaultInfo(files = [out_file])

build_with_custom_python = rule(
    implementation = _build_with_custom_python_impl,
    attrs = {
        "python_compiler": attr.label_list(allow_files = True),
    },
)
BUILD.custom_python
load("//:custom_python_rules.bzl", "run_me")

run_me(
    name = "my_py_run",
    python_files = glob(["**"]),
    visibility = ["//visibility:public"],
)
custom_python_rules.bzl
def _run_me_impl(ctx):
    pass

run_me = rule(
    implementation = _run_me_impl,
    attrs = {
        "python_files": attr.label_list(allow_files = True),
    },
)
I've spent some more time on this and have managed to mostly do what I intended. Here is an overview of what I've learned that fixes the problems in the original question.
1 - Mark the custom build file / custom rule folder as a package
Just by adding an empty BUILD file to the custom-rules folder, I had marked it as a Bazel package. That allowed me to reference the files:
|-- WORKSPACE
|-- BUILD
|-- custom_rules.bzl
|-- main.py
|-- custom-rules/
    |-- BUILD  # NEW
    |-- BUILD.custom_python
    |-- custom_python_rules.bzl
2 - The workspace file references a BUILD file. That build file needs to reference a custom rules file
The WORKSPACE should just reference the build file directly: // means the workspace root, and then it's just the path to the BUILD file.
http_archive(
    # build_file = "@//:custom-rules/BUILD.custom_python"  # WRONG
    build_file = "//custom-rules/BUILD.custom_python",  # RIGHT
)
3 - BUILD file can not reference the custom rules (bzl file)
This is the one that took the longest time for me to figure out! The BUILD file that is loaded into the package (hermetic_python from the WORKSPACE file) NEEDS TO reference the current workspace in order to load the custom rules file:
# BUILD.custom_python
load("@root_workspace//custom-rules:custom_python_rules.bzl", "execute_python_file")
Notice that this starts with @root_workspace, which tells the BUILD file to look in that workspace. The WORKSPACE file in the project now contains this line:
# WORKSPACE
workspace(name = "root_workspace")
4 - Passing in a build step as the dependency
build_with_custom_python(
    name = "write-to-file",
    # python_compiler = ["@hermetic_python"]  # WRONG
    python_compiler = ["@hermetic_python:my_py_run"],  # RIGHT
)
The key here is that I need to reference a build target, which then makes the files of that target available through the dependency. In this case @hermetic_python is the package and :my_py_run is the target.
I still need to figure out how to properly use the files in the dependency but that's outside of the scope of this question
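To sketch that last step anyway: consuming the files of the dependency inside the rule implementation could look roughly like this. This is an untested Starlark sketch, not from the post, and it assumes the embedded distribution ships a python.exe; ctx.files.python_compiler flattens the label_list into File objects:

```starlark
# custom_rules.bzl (sketch)
def _build_with_custom_python_impl(ctx):
    out_file = ctx.actions.declare_file("file_with_python_version.txt")

    # Pick the interpreter out of the files provided by @hermetic_python:my_py_run.
    exes = [f for f in ctx.files.python_compiler if f.basename == "python.exe"]

    ctx.actions.run_shell(
        outputs = [out_file],
        inputs = ctx.files.python_compiler,
        command = "{py} --version > {out}".format(
            py = exes[0].path,
            out = out_file.path,
        ),
    )
    return [DefaultInfo(files = depset([out_file]))]
```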

Up-level references ("..") in Bazel

In my bazel BUILD file, I have a line:
srcs = glob([<pattern1>, <pattern2>, ...])
I tried to have one of my patterns be "../dir/*.cc" but I get an error that I'm not allowed to use the .. sequence here.
Checking the documentation, I have found that it's not permitted, but I'm not sure what the expected substitute is.
Similarly, up-level references (..) and current-directory references (./) are forbidden.
How can I include these other source files in my srcs list given my current file structure? If I can't reference the up-level directory, is there a way to use the package name of the other directory instead?
Going "up" from your BUILD file would violate package boundaries. If you really need that structure and cannot or don't want to change it, you have to make files from one package available to the other by declaring the corresponding target(s), or at least by exporting the files and making them visible. For instance, assuming the following structure:
.
├── BUILD
├── WORKSPACE
├── hello.c
└── tgt
└── BUILD
In the // (top-level) package BUILD I could say:
filegroup(
    name = "hello",
    srcs = ["hello.c"],
    visibility = ["//tgt:__pkg__"],
)
(It could also be exports_files(["hello.c"], visibility = ["//tgt:__pkg__"]) instead, in which case I would refer to the file by its name, //:hello.c, from tgt.)
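Spelled out, the exports_files variant from that parenthetical could look like this (same files, sketch only):

```
# top-level BUILD
exports_files(
    ["hello.c"],
    visibility = ["//tgt:__pkg__"],
)

# tgt/BUILD would then reference the file label directly:
#   srcs = ["//:hello.c"]
```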
And inside //tgt (tgt/BUILD) it can then read:
cc_binary(
    name = "tgt",
    srcs = ["//:hello"],
)
Which would give me:
$ bazel run //tgt
WARNING: /tmp/bzl1/tgt/BUILD:3:10: in srcs attribute of cc_binary rule //tgt:tgt: please do not import '//:hello.c' directly. You should either move the file to this package or depend on an appropriate rule there
INFO: Analyzed target //tgt:tgt (11 packages loaded, 68 targets configured).
INFO: Found 1 target...
Target //tgt:tgt up-to-date:
bazel-bin/tgt/tgt
INFO: Elapsed time: 0.247s, Critical Path: 0.09s
INFO: 2 processes: 2 linux-sandbox.
INFO: Build completed successfully, 6 total actions
INFO: Build completed successfully, 6 total actions
Hello World!
Note: bazel still flags this as something weird and noteworthy going on. I have to say I do not disagree with it. The tree structure does not seem to correspond to the content very well.
Perhaps in this example the tgt package boundary is artificial and not actually useful? Or hello.c is in the wrong place.

Building a simple library with bazel, fixing include path

I have a very simple directory structure:
.
├── libs
│   └── foo
│       ├── BUILD
│       ├── include
│       │   └── foo
│       │       └── func.h
│       └── src
│           └── func.cxx
└── WORKSPACE
With func.h:
#pragma once
int square(int );
And func.cxx:
#include <foo/func.h>
int square(int i) { return i * i; }
And BUILD:
cc_library(
    name = "foo",
    srcs = ["src/func.cxx"],
    hdrs = ["include/foo/func.h"],
    visibility = ["//visibility:public"],
)
This fails to build:
$ bazel build //libs/foo
INFO: Analysed target //libs/foo:foo (0 packages loaded).
INFO: Found 1 target...
ERROR: /home/brevzin/sandbox/bazel/libs/foo/BUILD:1:1: C++ compilation of rule '//libs/foo:foo' failed (Exit 1)
libs/foo/src/func.cxx:1:22: fatal error: foo/func.h: No such file or directory
#include <foo/func.h>
^
compilation terminated.
Target //libs/foo:foo failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.299s, Critical Path: 0.02s
FAILED: Build did NOT complete successfully
How do I set the include path properly? I tried using include_prefix (whether include or include/foo) but that didn't change the behavior.
Hmm, the tricky part about including headers from other places is that you have to specify the header file by its location relative to the workspace root (where the WORKSPACE file resides).
Moreover, you should not use the angle-bracket include style (#include <a/b.h>) unless you are including system header files.
The related specifications for #include can be found here: https://docs.bazel.build/versions/master/bazel-and-cpp.html#include-paths
TL;DR: the only change you need to make to your func.cxx file is to change the first line to #include "libs/foo/include/foo/func.h".
Then, when you run bazel build //... (build all targets in this workspace, similar to how make makes all) from the root of the workspace, you will encounter no error.
However, this is not the only way that can solve your problem.
Another way to resolve this issue, which does not involve changing the include statement in the source code, is to specify the include path in an attribute of the cc_library rule.
That means you can change your BUILD file under libs/foo to look like this:
cc_library(
    name = "foo",
    srcs = ["src/func.cxx"],
    hdrs = ["include/foo/func.h"],
    copts = ["-Ilibs/foo/include"],  # This is the key part
    visibility = ["//visibility:public"],
)
After this change, the compiler will be able to figure out where to find the header file(s), and you don't have to change your source code, yay.
Related information can be found here: https://docs.bazel.build/versions/master/cpp-use-cases.html#adding-include-paths
Nevertheless, there is another, hackier way to solve your problem; however, it involves more changes to your code.
It makes use of the rule cc_inc_library.
The cc_inc_library rule strips the prefix attribute passed to it from the relative path of the header files specified in the hdrs attribute.
The example on the website is a little confusing; your code and directory structure make for a much better demonstration.
In your case, you have to modify your BUILD file under libs/foo to something that looks like this:
cc_library(
    name = "foo",
    srcs = ["src/func.cxx"],
    deps = [":lib"],
    copts = ["-Ilibs/foo/include"],
    visibility = ["//visibility:public"],
)

cc_inc_library(
    name = "lib",
    hdrs = ["include/foo/func.h"],
    prefix = "include/foo",
)
In your case, the header file func.h has a path relative to the package libs/foo of include/foo/func.h, which is what is specified in the hdrs attribute.
Its path relative to the workspace root is libs/foo/include/foo/func.h, and since the prefix attribute of cc_inc_library is include/foo, that value is stripped from libs/foo/include/foo/func.h, making it libs/foo/func.h.
So, now you can include this header file in your func.cxx as #include "libs/foo/func.h".
And now, bazel will not report error saying that it was not able to find the header file.
You can find the information about this rule at: https://docs.bazel.build/versions/master/be/c-cpp.html#cc_inc_library.
However, as stated above, the explanation is confusing at best, possibly because the documentation for it is out of date.
I was puzzled by the official explanation on bazel.build for quite some time, until I read the source code for this rule at: https://github.com/bazelbuild/bazel/blob/f20ae6b20816df6a393c6e8352befba9b158fdf4/src/main/java/com/google/devtools/build/lib/rules/cpp/CcIncLibrary.java#L36-L50
The comment for the actual code that implements the function does a much, much better job at explaining what this rule actually does.
The cc_inc_library rule has been deprecated since Bazel release 0.12.
Use the cc_library approach instead.
See: https://blog.bazel.build/2018/04/11/bazel-0.12.html
What you really want here is strip_include_prefix:
cc_library(
    name = "foo",
    srcs = ["src/func.cxx"],
    hdrs = ["include/foo/func.h"],
    # Here!
    strip_include_prefix = "include",
    visibility = ["//visibility:public"],
)
This will make the headers accessible via:
#include "foo/func.h"
This attribute has been available since at least Bazel 0.17.

Bazel Maven migration Transitive Dependencies Scope

I am trying to use generate_workspace on a project which has deps and transitive dependencies. Once generate_workspace.bzl had been generated, I copied it to the WORKSPACE and followed the instructions in the Bazel docs. Though I see the deps and their transitive deps listed in generate_workspace.bzl, my project is not able to resolve the transitive deps during the java_library phase. When I import the same project in IDEA, I don't see the jars correctly loaded.
My doubt is that for each dep, generate_workspace.bzl adds its transitive deps as runtime_deps, which means they are available only at runtime.
I have created gist of all the files here
https://gist.github.com/kameshsampath/8a4bdc8b22d85bbe3f243fa1b816e464
Ideally in my Maven project I just need https://gist.github.com/kameshsampath/8a4bdc8b22d85bbe3f243fa1b816e464#file-src_main_build-L8-L9. Since generate_workspace.bzl has resolved everything correctly, I thought it would be enough if my src/main/BUILD looked like this:
java_binary(
    name = "main",
    srcs = glob(["java/**/*.java"]),
    resources = glob(["resources/**"]),
    main_class = "com.redhat.developers.DemoApplication",
    # FIXME why I should import all the jars when they are transitive to spring boot starter
    deps = [
        "//third_party:org_springframework_boot_spring_boot_starter_actuator",
        "//third_party:org_springframework_boot_spring_boot_starter_web",
    ],
)
But sadly that gives lots of compilation errors, as the transitive deps are not loaded as part of the above declaration. Eventually I had to define them the way I did in https://gist.github.com/kameshsampath/8a4bdc8b22d85bbe3f243fa1b816e464#file-src_main_build
src_main_build is the BUILD file under the package src/main; third_party_BUILD is the BUILD file under the package third_party.
Bazel expects you to declare all your direct dependencies. I.e. if you directly use a class from jar A, you need to have it in your direct dependencies.
What you are looking for is a deploy jar. This is a bit hacky, but you can actually do it that way (in third_party/BUILD):
java_binary(
    name = "org_springframework_boot_spring_boot_starter_actuator_bin",
    main_class = "not.important",
    runtime_deps = [":org_springframework_boot_spring_boot_starter_actuator"],
)

java_import(
    name = "springframework_actuator",
    jars = [":org_springframework_boot_spring_boot_starter_actuator_bin_deploy.jar"],
)
This will bundle all dependencies except the neverlink ones into a single jar (the _deploy.jar) and re-expose it.
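The consuming BUILD file could then depend on the re-exposed bundle instead of listing every transitive jar. A sketch, reusing the target names from the snippet above:

```
# src/main/BUILD (sketch)
java_binary(
    name = "main",
    srcs = glob(["java/**/*.java"]),
    main_class = "com.redhat.developers.DemoApplication",
    # One dep pulls in everything bundled into the _deploy.jar.
    deps = ["//third_party:springframework_actuator"],
)
```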
An update: rules_jvm_external is the officially maintained ruleset by the Bazel team to fetch and resolve artifacts transitively.
You can find the example for Spring Boot here. The declaration in the WORKSPACE file looks something like this:
load("@rules_jvm_external//:defs.bzl", "maven_install")

maven_install(
    artifacts = [
        "org.hamcrest:hamcrest-library:1.3",
        "org.springframework.boot:spring-boot-autoconfigure:2.1.3.RELEASE",
        "org.springframework.boot:spring-boot-test-autoconfigure:2.1.3.RELEASE",
        "org.springframework.boot:spring-boot-test:2.1.3.RELEASE",
        "org.springframework.boot:spring-boot:2.1.3.RELEASE",
        "org.springframework.boot:spring-boot-starter-web:2.1.3.RELEASE",
        "org.springframework:spring-beans:5.1.5.RELEASE",
        "org.springframework:spring-context:5.1.5.RELEASE",
        "org.springframework:spring-test:5.1.5.RELEASE",
        "org.springframework:spring-web:5.1.5.RELEASE",
    ],
    repositories = [
        "https://jcenter.bintray.com",
    ],
)

Failure compiling glog with gflags support using Bazel

I'm getting a failure when I try to compile glog with gflags support using Bazel. A github repo reproducing this problem and showing the compilation error message is here: https://github.com/dionescu/bazeltrunk.git
I suspect that the problem occurs because glog is finding and using the "config.h" file published by gflags. However, I do not understand why this happens and why the current structure of the build files results in such errors. One solution I found was to provide my own BUILD file for gflags where the config was in a separate dependency (just how glog does it in my example).
I would appreciate any help with understanding the issue in this example.
The problem is that gflags' BUILD file is including its own config. Adding -H to glog.BUILD's copts yields:
. external/glog_archive/src/utilities.h
.. external/glog_archive/src/base/mutex.h
... bazel-out/local-fastbuild/genfiles/external/com_github_gflags_gflags/config.h
In file included from external/glog_archive/src/utilities.h:73:0,
from external/glog_archive/src/utilities.cc:32:
external/glog_archive/src/base/mutex.h:147:3: error: #error Need to implement mutex.h for your architecture, or #define NO_THREADS
# error Need to implement mutex.h for your architecture, or #define NO_THREADS
^
If you take a look at gflags' config.h, it went with a not-very-helpful approach of commenting out most of the config:
// ---------------------------------------------------------------------------
// System checks
// Define if you build this library for a MS Windows OS.
//cmakedefine OS_WINDOWS
// Define if you have the <stdint.h> header file.
//cmakedefine HAVE_STDINT_H
// Define if you have the <sys/types.h> header file.
//cmakedefine HAVE_SYS_TYPES_H
...
So nothing is defined.
Options:
The easiest way is probably to generate the config.h in your glog.BUILD:
genrule(
    name = "config",
    outs = ["config.h"],
    cmd = "cd external/glog_archive; ./configure; cd ../..; cp external/glog_archive/src/config.h $@",
    srcs = glob(["**"]),
)
# Then add the generated config to your glog target.
cc_library(
    name = "glog",
    srcs = [...],
    hdrs = [
        ":config.h",
        ...
This puts the .h file at a higher-precedence location than the gflags version.
Alternatively, you could do something like this in the genrule if you want to use your //third_party/glog/config.h (@// is shorthand for your project's repository):
genrule(
    name = "config",
    outs = ["config.h"],
    cmd = "cp $(location @//third_party/glog:config.h) $@",
    srcs = ["@//third_party/glog:config.h"],
)
You'll have to add exports_files(['config.h']) to the third_party/glog/BUILD file, too.
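For completeness, that addition to third_party/glog/BUILD is a one-liner:

```
# third_party/glog/BUILD
exports_files(["config.h"])
```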
