A bazel rule "template" to generate many rules - bazel

I have a number of input files called src0, src1, src2, ... srcN, and they need to be compiled into dst0, dst1, dst2, ... dstN.
I have a simple my_convert rule, and macros to generate full filename paths:
def src_filename(id):
    return ":some/path/to/src{}".format(id)
def dest_filename(id):
    return ":some/path/to/dest{}".format(id)
my_convert(
    src=src_filename(0),
    dest=dest_filename(0)
)
my_convert(
    src=src_filename(1),
    dest=dest_filename(1)
)
...
Now, I could copy & paste my_convert N times, but N is sometimes in the hundreds, and the number of files depends on some configuration... so I'd really like to have a dynamic rule of sorts where I can pass in 'N' from command-line and my_convert gets called for all ids in 0..N range.
What's the best way of doing this within Bazel? Is there some way to describe rules in a for loop? (I'll fall back to writing a script to generate a BUILD file with all the rules, but I'm hoping that I don't have to do that)
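For reference, the fallback mentioned above (a script that writes out the BUILD file) could look roughly like the following Python sketch. It is only an illustration: the load line pointing at ":defs.bzl" and the helper name render_build_file are assumptions, the generated calls mirror the src/dest attributes from the snippet above, and ids are taken to run from 0 to n - 1.

```python
def src_filename(file_id):
    return ":some/path/to/src{}".format(file_id)

def dest_filename(file_id):
    return ":some/path/to/dest{}".format(file_id)

def render_build_file(n):
    """Return BUILD file text with one my_convert call per id in 0..n-1."""
    # Assumption: my_convert is loadable from a defs.bzl in the same package.
    lines = ['load(":defs.bzl", "my_convert")', ""]
    for file_id in range(n):
        lines += [
            "my_convert(",
            '    src = "{}",'.format(src_filename(file_id)),
            '    dest = "{}",'.format(dest_filename(file_id)),
            ")",
            "",
        ]
    return "\n".join(lines)

# Print the generated text for a small N; redirect it into a BUILD file to use it.
print(render_build_file(2))
```

The drawback is the one already noted in the question: the generated file has to be regenerated whenever N changes.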

Does it need to be from some configuration? Or can it be something like "all the files in a directory"? Because then you could do something like this:
my_convert(
    name = "conversions",
    srcs = glob(["some/path/to/src*"]),
)
def _my_convert_impl(ctx):
    outputs = []
    for src in ctx.files.srcs:
        output = ctx.actions.declare_file(src.basename + ".out")
        outputs.append(output)
        ctx.actions.run_shell(
            inputs = [src],
            outputs = [output],
            command = "wc -l '%s' > '%s'" % (src.path, output.path),
            mnemonic = "Convert",
            progress_message = "Convert %{input}",
        )
    return [DefaultInfo(files = depset(outputs))]
my_convert = rule(
    implementation = _my_convert_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
    },
)
Then bazel build //package:conversions.
If it can't be based on a directory, another approach is to generate targets for each of the files, and then explicitly build the subset you want. Something like this:
generate_my_conversions(glob(["some/path/to/src*"]))
def generate_my_conversions(files):
    for file in files:
        my_convert(
            name = "convert_" + file.replace("/", "_"),  # this might need adjusting
            src = file,
        )
Then build the targets you want (maybe generate the list):
bazel build //package:convert_some_path_to_src0 //package:convert_some_path_to_src1 //package:convert_some_path_to_src2 //package:convert_some_path_to_srcN
There's a similar way to do this using output files: https://docs.bazel.build/versions/main/skylark/rules.html#declaring-outputs
which avoids instantiating a rule per file (the distinction/advantage may or may not be important until there are many more files).
If it really needs to be a flag for this, it's possible though a bit awkward, because it's a bit uncommon to have a flag to configure a specific individual target (flags are usually for things that apply to the entire build). Something like this:
$ tree
.
├── package
│   ├── BUILD
│   ├── defs.bzl
│   └── some
│       └── path
│           └── to
│               ├── src0
│               ├── src1
│               ├── src10
│               ├── src11
│               ├── src12
│               ├── src2
│               ├── src3
│               ├── src4
│               ├── src5
│               ├── src6
│               ├── src7
│               ├── src8
│               └── src9
└── WORKSPACE
package/defs.bzl:
def _my_convert_impl(ctx):
    file_count = ctx.attr.file_count[FileCountToConvertProvider].count
    if file_count < 0:
        fail("Must set flag --%s" % ctx.attr.file_count.label)
    if file_count > len(ctx.files.srcs):
        fail("More files requested than sources")
    # srcs could be in any order, so extract the source number from the file name
    srcs = {}
    for src in ctx.files.srcs:
        file_number_char_count = 0
        # reversed() doesn't work with string in Starlark...
        for i in range(len(src.basename) - 1, 0, -1):
            if src.basename[i].isdigit():
                file_number_char_count += 1
            else:
                break
        file_number = int(src.basename[-file_number_char_count:])
        srcs[file_number] = src
    outputs = []
    for i in range(file_count):
        src = srcs[i]
        if src.path.startswith(ctx.label.package + "/"):
            package_relative_path = src.path[len(ctx.label.package) + 1:]
        else:
            package_relative_path = src.path
        output = ctx.actions.declare_file(package_relative_path.replace("src", "dst"))
        outputs.append(output)
        ctx.actions.run_shell(
            inputs = [src],
            outputs = [output],
            command = "wc -l '%s' > '%s'" % (src.path, output.path),
            mnemonic = "Convert",
            progress_message = "Convert %{input}",
        )
    return [DefaultInfo(files = depset(outputs))]
_my_convert = rule(
    implementation = _my_convert_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True, mandatory = True),
        "file_count": attr.label(),
    },
)
FileCountToConvertProvider = provider(fields = ["count"])
def _file_count_to_convert_flag_impl(ctx):
    return FileCountToConvertProvider(count = ctx.build_setting_value)
_file_count_to_convert_flag = rule(
    implementation = _file_count_to_convert_flag_impl,
    build_setting = config.int(flag = True),
)
# Use a macro to create the my_convert target + associated flag
def my_convert(name, srcs):
    flag_name = name + "_file_count"
    _file_count_to_convert_flag(
        name = flag_name,
        build_setting_default = -1,
    )
    _my_convert(
        name = name,
        srcs = srcs,
        file_count = flag_name,
    )
package/BUILD:
load(":defs.bzl", "my_convert")
my_convert(
    name = "conversions",
    srcs = glob(["some/path/to/src*"]),
)
Note that the glob takes in every file that might be needed, because there's no way in Starlark to connect the value of the flag to generate a list of files at loading time (i.e., at the time the build file is evaluated).
$ bazel build //package:conversions --//package:conversions_file_count=5
INFO: Analyzed target //package:conversions (4 packages loaded, 20 targets configured).
INFO: Found 1 target...
Target //package:conversions up-to-date:
bazel-bin/package/some/path/to/dst0
bazel-bin/package/some/path/to/dst1
bazel-bin/package/some/path/to/dst2
bazel-bin/package/some/path/to/dst3
bazel-bin/package/some/path/to/dst4
INFO: Elapsed time: 0.302s, Critical Path: 0.03s
INFO: 6 processes: 1 internal, 5 linux-sandbox.
INFO: Build completed successfully, 6 total actions
$ bazel build //package:conversions --//package:conversions_file_count=11
INFO: Build option --//package:conversions_file_count has changed, discarding analysis cache.
INFO: Analyzed target //package:conversions (0 packages loaded, 20 targets configured).
INFO: Found 1 target...
Target //package:conversions up-to-date:
bazel-bin/package/some/path/to/dst0
bazel-bin/package/some/path/to/dst1
bazel-bin/package/some/path/to/dst2
bazel-bin/package/some/path/to/dst3
bazel-bin/package/some/path/to/dst4
bazel-bin/package/some/path/to/dst5
bazel-bin/package/some/path/to/dst6
bazel-bin/package/some/path/to/dst7
bazel-bin/package/some/path/to/dst8
bazel-bin/package/some/path/to/dst9
bazel-bin/package/some/path/to/dst10
INFO: Elapsed time: 0.114s, Critical Path: 0.02s
INFO: 7 processes: 1 internal, 6 linux-sandbox.
INFO: Build completed successfully, 7 total actions
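The number extraction in _my_convert_impl matters because lexical order puts src10 before src2, as the tree listing above shows. The same idea can be tried outside Bazel with this minimal Python sketch. The function name trailing_number and the explicit error for names without digits are illustrative assumptions and not part of the rule above; plain Python can also use reversed() on a string, which Starlark cannot.

```python
def trailing_number(basename):
    """Return the integer formed by the digits at the end of basename."""
    digit_count = 0
    for ch in reversed(basename):
        if not ch.isdigit():
            break
        digit_count += 1
    if digit_count == 0:
        # Assumption: treat a name without a numeric suffix as an error.
        raise ValueError("no trailing digits in %r" % basename)
    return int(basename[-digit_count:])

# Lexically sorted names, as a glob might return them.
names = ["src0", "src1", "src10", "src11", "src2"]

# Key each file by its number, then walk the keys in numeric order.
by_number = {trailing_number(name): name for name in names}
ordered = [by_number[i] for i in sorted(by_number)]
print(ordered)  # ['src0', 'src1', 'src2', 'src10', 'src11']
```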

Related

Clarify file/dir list to include in duply profile

I have defined /tmp/ as my source directory. I want to backup only in1/ and in2/ subfolders from it. What lines do I need in profile's exclude file?
/tmp/a
├── in1
│   └── in.txt
├── in2
│   └── in.txt
└── out.txt
According to duplicity man page's dir/foo example, I tried:
+ in1/
+ in2/
- **
But that did not work, and I got this error:
Reading globbing filelist /path/to/duply_profile/exclude
Fatal Error: The file specification
in1/
cannot match any files in the base directory
/tmp
Useful file specifications begin with the base directory or some
pattern (such as '**') which matches the base directory.
It is better to use the up-to-date man page from duplicity's website: https://duplicity.us/stable/duplicity.1.html#file-selection
I'm not sure why the example with relative paths is in there, but as the error states, you will need something along these lines:
+ /tmp/in1/
+ /tmp/in2/
- **
Feel free to post a bug ticket on https://gitlab.com/duplicity/duplicity/-/issues so that maybe someday some kind soul will make it work with relative paths.
I found that the following specification works:
+ **in1/
+ **in2/
- **

module not found in a custom python package

I am trying to make a package in Python. I have the following file and directory structure:
.
├── ds
│   ├── __init__.py
│   ├── __main__.py
│   ├── package_a
│   │   ├── __init__.py
│   │   └── package_a_b
│   │       └── __init__.py
│   └── settings.py
├── install.sh
├── LICENSE
├── Manifest.in
├── README.md
└── setup.py
The code is the following:
ds/__init__.py
Empty file
ds/__main__.py
from package_a import name_a
from package_a.package_a_b import name_a_b
from settings import config
def main():
    print(name_a)
    print(config)
ds/package_a/__init__.py
name_a = 'name_a'
ds/package_a/package_a_b/__init__.py
name_a_b = 'name_a_b'
setup.py
import pathlib
from setuptools import setup
HERE = pathlib.Path(__file__).parent
README = (HERE / "README.md").read_text()
setup(
    name="ds",
    version="2.0.0",
    long_description=README,
    long_description_content_type="text/markdown",
    license="MIT",
    classifiers=[
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
    ],
    packages=["ds"],
    #packages=find_packages(exclude=("tests",)),
    include_package_data=True,
    entry_points={
        "console_scripts": [
            "ds=ds.__main__:main",
        ]
    },
)
To install it, I follow these steps:
rm -rf dist
pip uninstall ds
python setup.py bdist_wheel
pip install dist/ds-2.0.0-py3-none-any.whl --force-reinstall
ds
The problem is that when I execute ds, I get the following message:
ModuleNotFoundError: No module named 'settings'
If I comment out everything related to settings, I get this error instead:
ModuleNotFoundError: No module named 'package_a'
So Python is not finding the packages.
How can I solve this?
Once installed, your code is importable only under the package name ds, so the imports inside __main__.py must be absolute:
from ds.package_a import name_a
from ds.package_a.package_a_b import name_a_b
from ds.settings import config
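The difference between the two import styles can be reproduced without installing anything. The sketch below is only an illustration under stated assumptions: it builds a throwaway ds package in a temporary directory (with a made-up config value) and puts only the parent directory on sys.path, which is roughly the situation after pip install.

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway copy of the layout: <tmp>/ds/__init__.py and <tmp>/ds/settings.py
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "ds"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "settings.py").write_text("config = 'demo-config'\n")  # hypothetical value

# Only the parent directory of ds is on the import path, not ds itself.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The bare name is not importable, because ds/ is not a search-path entry.
try:
    importlib.import_module("settings")
    bare_import_ok = True
except ModuleNotFoundError:
    bare_import_ok = False

# The package-qualified name resolves as expected.
config = importlib.import_module("ds.settings").config
print(bare_import_ok, config)
```

Separately, packages=["ds"] in setup.py lists only the top-level package, so ds.package_a and ds.package_a.package_a_b would likely also need to be listed, or discovered with find_packages as in the commented-out line (which additionally requires importing find_packages from setuptools).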

list all bazel targets in the directory

How can I put all targets from a package into one variable in a higher-level package?
I'm trying to create a simulation framework with the following architecture:
.
├── run_simulation.py
├── data
│   ├── folder1
│   │   ├── file_1.csv
│   │   ...
│   │   └── file_k.csv
│   ├── folder2
│   │   ├── file_1.csv
│   │   ...
│   │   └── file_k.csv
│   └── BUILD
└── BUILD
The data/BUILD file consists of:
data_dirs = [
    "folder1",
    "folder2"
]
[
    filegroup(
        name = "my_cool_data_" + data_dir,
        srcs = glob(["{}/*.csv".format(data_dir)]),
        ...
    ) for data_dir in data_dirs
]
which produces one target with experimental data per folder.
I'd like to run the script run_simulation.py with inputs from folders in the data directory with a command like:
$ bazel run run_simulation_<data target name>
so I need to add something like this to the ./BUILD file:
data_targets=[*do smth to get all targets from data/BUILD file*]
[
    py_binary(
        name = "run_simulation_" + data_target,
        ...
        data = ["//data:{}".format(data_target)],
        ...
    ) for data_target in data_targets
]
The problem is how can I put all targets from the data package into a variable in the ./BUILD file? I couldn't find a way to do this in the official documentation.

Bazel. Is there any way of running script under the same directory without `--run_under`?

Summary
I'd like to reduce the amount of typing needed to run my formatter.
Status quo
I am using Bazel to manage a C++ project. Below is the simplified structure of the project.
❯ tree
.
├── bin
│ ├── BUILD.bazel
│ └── format.sh
├── README.md
├── src
└── WORKSPACE
Now, I'd like to format all files in src (of course, I have tests in my real project) with bin/format.sh.
However, it really bothers me to type the long command below. Do you know how to make it easier? (If it is possible to shorten the command to bazel run bin:format, that would be perfect.)
I think adding some code to bin/BUILD.bazel would help, but I don't have any idea what to add.
bazel run --run_under="cd $PWD &&" bin:format # format source codes
Contents of the files:
bin/BUILD.bazel:
sh_binary(
    name = "format",
    srcs = ["format.sh"],
)
bin/format.sh:
#!/usr/bin/env sh
buildifier -r .
find . -iname '*.h' -o -iname '*.cc' | xargs clang-format -i -style=Google
I think what you are doing is fine. I would just define an alias, for example:
alias clang-fmt='bazel run --run_under="cd ${PWD} &&" //bin:format'
Alternatively, you could skip the --run_under option and pass the directory to the program:
alias clang-fmt='bazel run //bin:format -- "${PWD}"'
and update the script to use that argument:
find "$1" -iname '*.h' -o -iname '*.cc' | xargs clang-format -i -style=Google

waf cross-project dependencies

I have a trivial waf project:
$ root
|-- a
| `-- wscript
|-- b
| `-- wscript
`-- wscript
Root wscript is
def configure(conf):
    pass
def build(bld):
    bld.recurse('a b')
a wscript is
def build(bld):
    bld (rule = 'touch ${TGT}', target = 'a.target' )
b wscript is
def build(bld):
    bld (rule = 'cp ${SRC} ${TGT}', source='a.target', target='b.target')
What I'm trying to achieve is have a build system that first touches a.target, then copies it to b.target. I want to have rules for a.target and b.target to stay in their respective wscript files.
When I'm trying to launch it, I get the following error instead:
source not found: 'a.target' in bld(target=['b.target'], idx=1, tg_idx_count=2, meths=['process_rule', 'process_source'], rule='cp ${SRC} ${TGT}', _name='b.target', source='a.target', path=/<skip>/waf_playground/b, posted=True, features=[]) in /<skip>/waf_playground/b
If I put both rules into a single wscript file, everything works like a charm.
Is there a way for a target to depend on another target defined in a different wscript?
When you specify source/target, the path is interpreted relative to the directory of the current wscript file. That is why b/wscript cannot find a.target, which is created under a/:
$ waf configure build
...
source not found: 'a.target' in bld(source='a.target', ...)
...
$ tree build
build/
├── a
│ └── a.target
...
Knowing that, the fix is to refer to the a.target source file correctly in b/wscript:
def build(bld):
    bld (rule = 'cp ${SRC} ${TGT}', source='../a/a.target', target='b.target')
The task now correctly finds the source file:
$ waf build
Waf: Entering directory `.../build'
[1/2] Creating build/a/a.target
[2/2] Compiling build/a/a.target
Waf: Leaving directory `.../build'
'build' finished successfully (0.055s)
$ tree build
build/
├── a
│   └── a.target
├── b
│   └── b.target
...
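The path arithmetic behind the fix can be checked with a few lines of plain Python. This is only an illustration of how a relative name resolves against the directory of the wscript that mentions it; it does not use waf's own node lookup, and the helper name resolve_from is hypothetical.

```python
import posixpath

def resolve_from(wscript_dir, relative_name):
    """Resolve relative_name against the directory of the wscript that uses it."""
    return posixpath.normpath(posixpath.join(wscript_dir, relative_name))

# 'a.target' written in b/wscript points inside b/, where no such file exists.
print(resolve_from("b", "a.target"))       # b/a.target

# '../a/a.target' written in b/wscript reaches the file produced by a/wscript.
print(resolve_from("b", "../a/a.target"))  # a/a.target
```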
