New to Bazel, so please bear with me :) I have a genrule which basically downloads and unpacks a package:
genrule(
    name = "extract_pkg",
    srcs = ["@deb_pkg//file:pkg.deb"],
    outs = ["pkg_dir"],
    cmd = "dpkg-deb --extract $< $(@D)/pkg_dir",
)
Naturally, pkg_dir here is a directory. There is another rule which uses this rule as input to create an executable, but the main point is that I now need to add a rule (or something) which will allow me to use some headers from that package. This rule is used as an input to a cc_library, which is then used in other parts of the repository to get access to the headers. I tried it like this:
genrule(
    name = "pkg_headers",
    srcs = [":extract_pkg"],
    outs = [
        "pkg_dir/usr/include/pkg/h1.h",
        "pkg_dir/usr/include/pkg/h2.h",
    ],
)
But it seems Bazel doesn't like the fact that both rules use the same directory as output, even though the second one doesn't do anything (?):
output file 'pkg_dir' of rule 'extract_pkg' conflicts with output file 'pkg_dir/usr/include/pkg/h1.h' of rule 'pkg_headers'
It works fine if I use a different "root" directory for each rule, but I think there must be some better way to do this.
EDIT
I tried to use declare_directory as follows (compiled from different sources):
unpack_deb.bzl:
def _unpack_deb_impl(ctx):
    input_deb_file = ctx.file.deb
    output_dir = ctx.actions.declare_directory(ctx.attr.name + ".cc")
    print(input_deb_file.path)
    print(output_dir.path)
    ctx.actions.run_shell(
        inputs = [input_deb_file],
        outputs = [output_dir],
        arguments = [input_deb_file.path, output_dir.path],
        progress_message = "Unpacking %s to %s" % (input_deb_file.path, output_dir.path),
        command = "dpkg-deb --extract \"$1\" \"$2\"",
    )
    return [DefaultInfo(files = depset([output_dir]))]

unpack_deb = rule(
    implementation = _unpack_deb_impl,
    attrs = {
        "deb": attr.label(
            mandatory = True,
            allow_single_file = True,
            doc = "The .deb file to be unpacked",
        ),
    },
    doc = """
Unpacks a .deb file and returns a directory.
""",
)
BUILD.bazel:
load(":unpack_deb.bzl", "unpack_deb")
unpack_deb(
name = "pkg_dir",
deb = "#deb_pkg//file:pkg.deb"
)
cc_library(
name = "headers",
linkstatic = True,
srcs = [ "pkg_dir" ],
hdrs = ["pkg_dir.cc/usr/include/pkg/h1.h",
"pkg_dir.cc/usr/include/pkg/h2.h"],
strip_include_prefix = "pkg_dir.cc/usr/include",
)
The trick of adding .cc so that the input can be accepted by cc_library was stolen from this answer. However, the command fails on
ERROR: missing input file 'blah/blah/pkg_dir.cc/usr/include/pkg/h1.h'
This error comes from the cc_library rule.
When I run with debug, I can see the command being "executed" (strange thing is that I don't always see this printout):
SUBCOMMAND: # //blah/pkg:pkg_dir [action 'Unpacking tmp/deb_pkg/file/pkg.deb to blah/pkg/pkg_dir.cc', configuration: xxxx]
(cd /home/user/.../execroot/src && \
exec env - \
/bin/bash -c 'dpkg-deb --extract "$1" "$2"' '' tmp/deb_pkg/file/pkg.deb bazel-out/.../pkg/pkg_dir.cc)
After execution, bazel-out/.../pkg/pkg_dir.cc exists but is empty. If I run the command manually, it extracts the files correctly. What might be the reason? Also, is it correct that there's an empty string directly after the bash command string?
Bazel's genrule doesn't work very well with directory outputs. See https://docs.bazel.build/versions/master/be/general.html#general-advice
Bazel mostly works with individual files, although there's some support for working with directories in Starlark rules with https://docs.bazel.build/versions/master/skylark/lib/actions.html#declare_directory
Your best bet is probably to extract all the files you're interested in within the genrule, then create filegroups for the different groups of files:
genrule(
    name = "extract_pkg",
    srcs = ["@deb_pkg//file:pkg.deb"],
    outs = [
        "pkg_dir/usr/include/pkg/h1.h",
        "pkg_dir/usr/include/pkg/h2.h",
        "pkg_dir/other_files/file1",
        "pkg_dir/other_files/file2",
    ],
    cmd = "dpkg-deb --extract $< $(@D)/pkg_dir",
)

filegroup(
    name = "pkg_headers",
    srcs = [
        ":pkg_dir/usr/include/pkg/h1.h",
        ":pkg_dir/usr/include/pkg/h2.h",
    ],
)

filegroup(
    name = "pkg_other_files",
    srcs = [
        ":pkg_dir/other_files/file1",
        ":pkg_dir/other_files/file2",
    ],
)
If you've seen glob, you might be tempted to use glob(["pkg_dir/usr/include/pkg/*.h"]) or similar for the srcs of the filegroup, but note that glob works only with "source files", which means files already on disk, not with the outputs of other rules.
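For illustration, a filegroup like the sketch below (reusing the names from the genrule above) would silently match nothing, or fail outright on newer Bazel versions that reject empty globs:
filegroup(
    name = "pkg_headers_broken",
    # glob() only sees files in the source tree, so it cannot match the
    # headers generated by :extract_pkg.
    srcs = glob(["pkg_dir/usr/include/pkg/*.h"]),
)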
There are rules for creating debs, but I'm not aware of rules for importing them. It's possible to write such rules using Starlark:
https://docs.bazel.build/versions/master/skylark/repository_rules.html
With repository rules, it's possible to avoid having to explicitly write out all the files you want to extract, among other things. Might be more work than you want to do though.
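As a rough sketch, a repository rule could unpack the .deb at fetch time and expose its headers through a generated BUILD file. The deb_archive name and its deb attribute below are assumptions for illustration, not an existing API, and dpkg-deb is assumed to be on the host PATH:
# deb_archive.bzl (hypothetical)
def _deb_archive_impl(repository_ctx):
    repository_ctx.execute([
        "dpkg-deb",
        "--extract",
        repository_ctx.path(repository_ctx.attr.deb),
        repository_ctx.path(""),
    ])
    # The files now exist on disk inside the external repository, so
    # glob() works here, unlike with genrule outputs.
    repository_ctx.file("BUILD.bazel", """
cc_library(
    name = "headers",
    hdrs = glob(["usr/include/**/*.h"]),
    strip_include_prefix = "usr/include",
    visibility = ["//visibility:public"],
)
""")

deb_archive = repository_rule(
    implementation = _deb_archive_impl,
    attrs = {
        "deb": attr.label(mandatory = True, allow_single_file = True),
    },
)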
Related
Suppose I am writing a custom Bazel rule for foo-compiler.
The user provides a list of source-files to the rule:
foo_library(
    name = "hello",
    srcs = ["A.foo", "B.foo"],
)
To build this without Bazel, the steps would be:
Create a config file config.json that lists the sources:
{
    "srcs": ["./A.foo", "./B.foo"]
}
Place the config alongside the sources:
$ ls .
A.foo
B.foo
config.json
Call foo-compiler in that directory:
$ foo-compiler .
Now, in my Bazel rule implementation I can declare a file like this:
config_file = ctx.actions.declare_file("config.json")

ctx.actions.write(
    output = config_file,
    content = json_for_srcs(ctx.files.srcs),
)
The file is created and it has the right content.
However, Bazel does not place config.json alongside the srcs.
Is there a way to tell Bazel where to place the file?
Or perhaps I need to copy each source-file alongside the config?
You can do this with ctx.actions.symlink, e.g.:
srcs = []

# Declare a symlink for each src file in the same directory as the declared
# config file, then write that symlink.
for f in ctx.files.srcs:
    src = ctx.actions.declare_file(f.basename)
    srcs.append(src)
    ctx.actions.symlink(
        output = src,
        target_file = f,
    )

config_file = ctx.actions.declare_file("config.json")
ctx.actions.write(
    output = config_file,
    content = json_for_srcs(ctx.files.srcs),
)

# Run the compiler in the directory containing the symlinks and config.
ctx.actions.run(
    inputs = srcs + [config_file],
    outputs = [],  # TODO: Up to you
    tools = [ctx.file.__compiler],  # TODO: Update this to match your rule.
    executable = ctx.file.__compiler.path,
    arguments = ["."],
    # ...
)
Note that when you return your provider, you should only return the result of your compilation, not the srcs. Otherwise, you'll likely run into problems with duplicate outputs.
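For instance, the return could look like this minimal sketch, where out_file is a placeholder for whatever File your compile action declares:
# Propagate only the compiled artifact, not the symlinked srcs;
# "out_file" is an assumed File produced by the compile action above.
return [DefaultInfo(files = depset([out_file]))]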
I want to create the following structure in bazel.
dir1
|_ file1
|_ file2
|_ dir2
   |_ file3
Creating a specific structure doesn't seem trivial.
I'm hoping there's a simple and reusable rule.
Something like:
makedir(
    name = "dir1",
    path = "dir1",
)

makedir(
    name = "dir2",
    path = "dir1/dir2",
    deps = [":dir1"],
)
What I've tried:
I could create a macro with a Python script, but I want something cleaner.
I tried creating a genrule with mkdir -p path/to/directory, which didn't work.
The use case is that I want to create a squashfs using bazel.
It's important to note that Bazel provides some packaging functions.
To create a squashfs, the command requires a directory structure populated with artifacts.
In my case, I want to create a directory structure and run mksquashfs to produce a squashfs file.
To accomplish this, I ended up modifying the basic example from bazel's docs on packaging.
load("#bazel_tools//tools/build_defs/pkg:pkg.bzl", "pkg_tar")
genrule(
name = "file1",
outs = ["file1.txt"],
cmd = "echo exampleText > $#",
)
pkg_tar(
name = "dir1",
strip_prefix = ".",
package_dir = "/usr/bin",
srcs = [":file1"],
mode = "0755",
)
pkg_tar(
name = "dir2",
strip_prefix = ".",
package_dir = "/usr/share",
srcs = ["//main:file2.txt", "//main:file3.txt"],
mode = "0644",
)
pkg_tar(
name = "pkg",
extension = "tar.gz",
deps = [
":dir1",
":dir2",
],
)
If there's an easier way to create a tar or directory structure without the need for intermediate tars, I'll make that top answer.
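For completeness, one hedged way to get from the tar above to a squashfs is a follow-up genrule that unpacks the tar and runs mksquashfs (assuming mksquashfs is installed on the host):
genrule(
    name = "squashfs",
    srcs = [":pkg"],
    outs = ["image.squashfs"],
    cmd = " && ".join([
        "mkdir -p $(@D)/root",
        "tar -xzf $(location :pkg) -C $(@D)/root",
        "mksquashfs $(@D)/root $@ -noappend",
    ]),
)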
You could create such a Bazel macro that uses a genrule:
def mkdir(name, out_dir, marker_file = "marker"):
    """Create an empty directory that you can use as an input in another rule.

    This will technically create an empty marker file in that directory to
    avoid Bazel warnings. You should depend on this marker file.
    """
    path = "%s/%s" % (out_dir, marker_file)
    native.genrule(
        name = name,
        outs = [path],
        cmd = """mkdir -p $$(dirname $(location :%s)) && touch $(location :%s)""" % (path, path),
    )
Then you can use the outputs generated by this macro in a pkg_tar definition:
mkdir(
    name = "generate_a_dir",
    out_dir = "my_dir",
)

pkg_tar(
    name = "package",
    srcs = [
        # ...
        ":generate_a_dir",
    ],
    # ...
)
You can always create a genrule target or an sh_binary target that will execute a bash command or a shell script (respectively) that creates these directories.
With genrule you can use Bazel's $(location), which will make sure that the directory structure you create will be under an output path inside Bazel's sandbox environment.
The genrule example shows how to use it exactly.
Here you can find more details on predefined output paths.
I'm just getting started working with Bazel. So, I apologize in advance that I haven't been able to figure this out.
I'm trying to run a command that outputs a bunch of files to a directory and make this directory available for subsequent targets. I have two different attempts:
Use genrule
Write my own rule
I was naively hoping to just do this with a genrule. But it doesn't seem you can say "I don't know exactly what this command is going to output" and put a directory in outs. Now I'm trying to write a rule that can use ctx.actions.declare_directory, but I haven't gotten it quite right. I can't seem to get the tools over from my workspace and into my rule.
My genrule attempt looks something like this:
genrule(
    name = "doit",
    srcs = [
        "doitConfigA",
        "doitConfigB",
    ],
    cmd = 'HOME=. ./$(location path/to/doit) install',
    # Neither of the below outs works - seems like bazel wants to know
    # exactly this list of files. I don't know the files that
    # will be output ahead of time.
    # This one looks at the `out_dir` that I already have and
    # expects the files to be the same, which they might not be:
    outs = glob(["out_dir/**/*.*"]),
    # This one fails with:
    # "declared output 'out_dir' was not
    # created by genrule. This is probably because the genrule actually
    # didn't create this output, or because the output was a directory
    # and the genrule was run remotely (note that only the contents of
    # declared file outputs are copied from genrules run remotely)"
    outs = ['out_dir'],
    tools = ['path/to/doit'],
)
My custom rule attempt looks something like this:
def _impl(ctx):
    dir = ctx.actions.declare_directory("out_dir")
    ctx.actions.run_shell(
        outputs = [dir],
        progress_message = "Running doit install ...",
        command = "HOME=. ./path/to/doit install",
        tools = [ctx.attr.tools],
    )

doit = rule(
    implementation = _impl,
    attrs = {
        "tools": attr.label_list(allow_files = True),
    },
    outputs = {"out": "out_dir"},
)
Then, to run my doit rule, my BUILD file looks like this:
doit(
    name = 'doit',
    tools = ['path/to/doit'],
)
In my genrule, the command runs, but it doesn't like my trying to use a directory in outs, it seems. In my custom rule, I can't seem to tell Bazel that I want to use ./path/to/doit as a tool from my workspace, e.g. expected type 'File' for 'tools' element but got type 'list' instead ...
Seems like I must be missing something basic because surely this is a common situation to run a command and output a bunch of unknown stuff to a directory?
The output of a genrule must be a fixed list of files. As a work-around, you can create a zip from the output directory.
I used this approach to manipulate the output of yarn install where the usual method was not viable:
genrule(
    name = "node_modules",
    srcs = [
        "package.json",
        "yarn.lock",
    ],
    cmd = " && ".join([
        "yarn install --pure-lockfile",
        "zip -r $@ node_modules",
    ]),
    outs = [
        "node_modules.zip",
    ],
)
Then a rule that consumes the zip:
# Rule that generates a list of the folders in node_modules
genrule(
    name = "node_modules_ls",
    srcs = [
        ":node_modules",
    ],
    cmd = " && ".join([
        "unzip $(location :node_modules) -d .",
        "ls > $@",
    ]),
    outs = [
        "out.txt",
    ],
)
A while ago I created this example showing how to use directories with skylark action: How to build static library from the Generated source files using Bazel Build. Maybe it still works :)
Genrule won't work; this is too advanced a use case.
https://github.com/aspect-build/bazel-lib/blob/main/docs/run_binary.md has a similar API to genrule, and it supports directory outputs.
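For instance, a hedged sketch with run_binary (check the linked docs for the current API; the tool target and output directory names here are assumptions):
load("@aspect_bazel_lib//lib:run_binary.bzl", "run_binary")

# Run the tool and capture everything it writes into out_dir as a
# directory (tree artifact) output.
run_binary(
    name = "doit",
    tool = "//path/to:doit",
    args = ["install"],
    out_dirs = ["out_dir"],
)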
I have a library that depends on graphics files that are generated by a shell script.
I would like the library, when it is compiled, to use the shell script to generate the graphics files, which should be copied as if it were a 'data' statement, but whenever I try to make the library depend on the genrule, I get
in deps attribute of cc_library rule //graphics_assets:
genrule rule '//graphics_assets:assets_gen_rule' is misplaced here
(expected cc_inc_library, cc_library, objc_library or cc_proto_library)
# This is the correct format.
# Here we want to run all the shader .glsl files through the program
# file_utils:archive_tool (which we also build elsewhere) and copy the
# output .arc file to the data.

# 1. List the source files
filegroup(
    name = "shader_source",
    srcs = glob([
        "shaders/*.glsl",
    ]),
)

# 2. Invoke file_utils::archive_tool on the shaders
genrule(
    name = "shaders_gen_rule",
    srcs = [":shader_source"],
    outs = ["shaders.arc"],
    cmd = "echo $(locations shader_source) > temp.txt ; \
        $(location //common/file_utils:archive_tool) \
        --create_from_list=temp.txt \
        --archive $(OUTS) ; \
        $(location //common/file_utils:archive_tool) \
        --ls --archive $(OUTS)",
    tools = ["//common/file_utils:archive_tool"],
)
# 3. When a binary depends on this rule, the .arc file will be copied.
# This is the thing I had trouble with.
cc_library(
    name = "shaders",
    srcs = [],  # Something
    data = [":shaders_gen_rule"],
    linkstatic = 1,
)
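For illustration, a consumer might look like this hedged sketch (the target and source names are assumptions); the data files propagate into the binary's runfiles:
cc_binary(
    name = "game",
    srcs = ["main.cc"],
    deps = [":shaders"],  # shaders.arc rides along as runfiles data
)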
I need to copy some files to binary directory while preserving their names. What I've got so far:
filegroup(
    name = "resources",
    srcs = glob(["resources/*.*"]),
)

genrule(
    name = "copy_resources",
    srcs = ["//some/package:resources"],
    outs = [],
    cmd = "cp $(SRCS) $(@D)",
    local = 1,
    output_to_bindir = 1,
)
Now I have to specify file names in outs but I can't seem to figure out how to resolve the labels to obtain the actual file names.
To make a filegroup available to a binary (executed using bazel run) or to a test (executed using bazel test), one usually lists the filegroup as part of the data of the binary, like so:
cc_binary(
    name = "hello-world",
    srcs = ["hello-world.cc"],
    data = [
        "//your_project/other/deeply/nested/resources:other_test_files",
    ],
)
# known to work at least as of bazel version 0.22.0
Usually the above is sufficient.
However, the executable must then recurse through the directory structure "other/deeply/nested/resources/" in order to find the files from the indicated filegroup.
In other words, when populating the runfiles of an executable, bazel preserves the directory nesting that spans from the WORKSPACE root to all the packages enclosing the given filegroup.
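Concretely, the runfiles tree for the hello-world binary above would look roughly like this (the workspace name my_workspace is an assumption for illustration):
bazel-bin/hello-world.runfiles/my_workspace/your_project/other/deeply/nested/resources/...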
Sometimes, this preserved directory nesting is undesirable.
THE CHALLENGE:
In my case, I had several filegroups located at various points in my project directory tree, and I wanted all the individual files of those groups to end up side-by-side in the runfiles collection of the test binary that would consume them.
My attempts to do this with a genrule were unsuccessful.
In order to copy individual files from multiple filegroups, preserving the basename of each file but flattening the output directory, it was necessary to create a custom rule in a .bzl Bazel extension.
Thankfully, the custom rule is fairly straightforward.
It uses cp in a shell command much like the unfinished genrule listed in the original question.
The extension file:
# contents of a file you create named: copy_filegroups.bzl
# known to work in bazel version 0.22.0
def _copy_filegroup_impl(ctx):
    all_input_files = [
        f
        for t in ctx.attr.targeted_filegroups
        for f in t.files.to_list()
    ]
    all_outputs = []
    for f in all_input_files:
        out = ctx.actions.declare_file(f.basename)
        all_outputs += [out]
        ctx.actions.run_shell(
            outputs = [out],
            inputs = depset([f]),
            arguments = [f.path, out.path],
            # This is what we're all about here. Just a simple 'cp' command.
            # Copy the input to CWD/f.basename, where CWD is the package where
            # the copy_filegroups_to_this_package rule is invoked.
            # (To be clear, the files aren't copied right to where your BUILD
            # file sits in source control. They are copied to the 'shadow tree'
            # parallel location under `bazel info bazel-bin`)
            command = "cp $1 $2",
        )

    # Small sanity check
    if len(all_input_files) != len(all_outputs):
        fail("Output count should be 1-to-1 with input count.")

    return [
        DefaultInfo(
            files = depset(all_outputs),
            runfiles = ctx.runfiles(files = all_outputs),
        ),
    ]

copy_filegroups_to_this_package = rule(
    implementation = _copy_filegroup_impl,
    attrs = {
        "targeted_filegroups": attr.label_list(),
    },
)
Using it:
# inside the BUILD file of your exe
load(
    "//your_project:copy_filegroups.bzl",
    "copy_filegroups_to_this_package",
)

copy_filegroups_to_this_package(
    name = "other_files_unnested",
    # you can list more than one filegroup:
    targeted_filegroups = ["//your_project/other/deeply/nested/library:other_test_files"],
)

cc_binary(
    name = "hello-world",
    srcs = ["hello-world.cc"],
    data = [
        ":other_files_unnested",
    ],
)
You can clone a complete working example here.