Can a Bazel package depend on a source file in another package - bazel

A few years ago I wrote a set of wrappers for Bazel that enabled me to use it to build FPGA code. The FPGA bit is only relevant because the full clean build takes many CPU days, so I really care about caching and minimizing rebuilds.
Using Bazel v0.28, I never found a way to have my Bazel package depend on a single source file from somewhere else in the git repo. It felt like this wasn't something Bazel was designed for.
We want to do this because we have a library of VHDL source files that are parameterized, and the parameters are set in the instantiating VHDL source (VHDL generics). If we declare this library as a Bazel package in its own right, then a change to one library file would rebuild everything (at huge time cost) when in practice only a couple of steps might need to be rebuilt.
I worked around this with a Python script that copies all the individual source files into a subdirectory and then generates the BUILD file to reference these copies. The resulting build process is:
call python preparation script
bazel build //:allfpgas
call python result extractor
This is clearly quite ugly but the benefits were huge so we live with it.
Now we want to leverage Bazel to build our Java, C++, etc., so I wanted to revisit this and try to make everything work with Bazel alone.
In the latest Bazel, is there a way to have a BUILD package depend on individual source files outside of the package directory? If Bazel can't do this, would Buck, Pants, or please.build work better for our use case?

The Bazel rules for most languages support doing something like this already. For example, the Python rules bundle source files from multiple packages together, and the C++ rules manage include files from other packages. One way or another, the rule has to pass the source files around in providers so that another rule can generate actions that use them. It's hard to be more specific without knowing which rules you're using.
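As a rough sketch of what that provider plumbing can look like in a custom rule (the VhdlInfo provider and vhdl_library rule are made-up names for illustration, not an existing ruleset):

VhdlInfo = provider(fields = ["sources"])

def _vhdl_library_impl(ctx):
    # Bundle this target's sources together with the sources of its deps,
    # so a downstream rule can register actions over exactly this set of files.
    sources = depset(
        ctx.files.srcs,
        transitive = [dep[VhdlInfo].sources for dep in ctx.attr.deps],
    )
    return [VhdlInfo(sources = sources)]

vhdl_library = rule(
    implementation = _vhdl_library_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = [".vhd"]),
        "deps": attr.label_list(providers = [VhdlInfo]),
    },
)

A downstream rule that depends on a vhdl_library target can then read dep[VhdlInfo].sources and feed only those files into its actions, so only the steps whose inputs actually changed get rebuilt.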
If you just want to copy the files, you can do that in Bazel with a genrule. In the package with the source file:
exports_files(["templated1.vhd", "templated2.vhd"])
In the package that uses it:
genrule(
    name = "copy_templates",
    srcs = ["//somewhere:templated1.vhd", "//somewhere:templated2.vhd"],
    outs = ["templated1.vhd", "templated2.vhd"],
    cmd = "cp $(SRCS) $(RULEDIR)",
)

some_library(
    srcs = ["templated1.vhd", "templated2.vhd", "other.vhd"],
)
If you want to deduplicate that across multiple packages that use it, put the filenames in a list and write a macro to create the genrule.
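A minimal sketch of such a macro, in a shared .bzl file (the file list and the //somewhere package label are illustrative):

# templates.bzl
TEMPLATE_FILES = ["templated1.vhd", "templated2.vhd"]

def copy_templates(name = "copy_templates"):
    # Copies the shared template sources into the calling package.
    native.genrule(
        name = name,
        srcs = ["//somewhere:" + f for f in TEMPLATE_FILES],
        outs = TEMPLATE_FILES,
        cmd = "cp $(SRCS) $(RULEDIR)",
    )

Each consuming package then just does load("//:templates.bzl", "copy_templates") and calls copy_templates() in its BUILD file.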

Related

Bazel Starlark: how can I generate a BUILD file procedurally?

After downloading an archive through http_archive I'd like to run a script to generate a BUILD file from the folder structure and CMake files in it (I currently do that by hand and it is easy enough that it could be scripted). I don't find anything on how to open, read, and write files in the Starlark documentation, but since http_archive itself is loaded from a .bzl file (I haven't found the source of that file yet though...) and generates BUILD files (by unpacking them from archives), I guess it must be possible to write a wrapper for http_archive that also generates the BUILD file?
This is a perfect use case for a custom repository rule. That lets you run arbitrary commands to generate the files for the repository, along with some helpers for common operations like downloading a file over HTTP using the repository cache (if configured). A repository rule is conceptually similar to a normal rule, but with much less infrastructure, because it runs during the loading phase, when most of the Bazel infrastructure doesn't apply yet.
The Starlark implementation of http_archive is in http.bzl. The core of it is a single call to ctx.download_and_extract; your custom rule should do that too. http_archive then calls workspace_and_buildfile and patch from util.bzl, which do what they sound like. Instead of workspace_and_buildfile, you should call ctx.execute to run your command to generate the BUILD file. You could call patch if you want, or skip that functionality if you're not going to use it.
The repository_ctx page in the documentation is the top-level reference for everything your repository rule's implementation function can do, if you want to extend it further.
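A bare-bones sketch of such a repository rule might look like this (the rule name, the attribute names, and the build_file_generator script are assumptions for illustration, not an existing API; the script is assumed to be executable and to write a BUILD file into the repository root):

def _generated_build_archive_impl(ctx):
    # Download and unpack the archive, like http_archive does.
    ctx.download_and_extract(
        url = ctx.attr.urls,
        sha256 = ctx.attr.sha256,
        stripPrefix = ctx.attr.strip_prefix,
    )
    # Run an arbitrary command to generate the BUILD file from the extracted tree.
    result = ctx.execute([ctx.path(ctx.attr.build_file_generator)])
    if result.return_code != 0:
        fail("BUILD file generation failed: " + result.stderr)

generated_build_archive = repository_rule(
    implementation = _generated_build_archive_impl,
    attrs = {
        "urls": attr.string_list(),
        "sha256": attr.string(),
        "strip_prefix": attr.string(),
        "build_file_generator": attr.label(allow_single_file = True),
    },
)

You would then load and call generated_build_archive(...) from your WORKSPACE file, just like http_archive.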
When using http_archive, you can use the build_file argument to create a BUILD file. To generate it dynamically, I think you can use the patch_cmds argument to run external commands.
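For example, something along these lines (the name, URL, checksum, and file paths are placeholders):

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "somelib",
    urls = ["https://example.com/somelib-1.0.tar.gz"],
    sha256 = "<archive checksum>",
    # A BUILD file checked into your own repo:
    build_file = "//third_party:somelib.BUILD",
    # ...or, to generate one dynamically, drop build_file and run a command instead:
    # patch_cmds = ["python3 generate_build.py > BUILD.bazel"],
)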

Are the bazel buildtools primarily focused on single starlark files?

I'm taking a glance over at the buildtools repo (https://github.com/bazelbuild/buildtools) and trying to understand the scope of its responsibilities as it relates to the three phases of a Bazel build (loading, analysis, execution).
The repo's description states that it is "A bazel BUILD file formatter and editor". I find a lot of logic in the repo, written in Go, that provides an AST parser, Starlark syntax interpretation, and reformatting and rewriting of BUILD files. Basically, there's logic designed to operate on a single Starlark file at a time. Rereading the repo description in this light leads me to conclude that buildtools is really a single-file-scoped effort, and that its tools only intersect, functionality-wise (perhaps only partially), with the loading operations Bazel conducts while building.
Question: Is it accurate that the focus of buildtools is upon the single starlark file?
If that's true, then all the multi-file Starlark analysis logic and so forth seems to actually be maintained over at https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib, and I should not expect to find any tools for the analysis phase and beyond in the buildtools repo. Is that right?
I don't work on Buildtools, but we agree: these tools seem to focus on BUILD / .bzl files in isolation. They let you process these files in parallel, to do similar operations on them.
If you wonder whether these tools understand relations between these files, the answer seems to be no.
If you further wonder which tools do understand the relations between these files, the answer is Bazel's query, cquery, and aquery. I'm not aware of a programmable API for these queries, though; you have to run Bazel to perform them.
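For example (the target labels here are just illustrative), you run them as ordinary Bazel commands:

bazel query 'somepath(//app:main, //lib:core)'
bazel cquery 'deps(//app:main)'
bazel aquery 'deps(//app:main)'

query works on the loaded target graph, cquery on the configured graph after analysis, and aquery on the actions Bazel would execute.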
buildtools has tools working on a syntactic level (it looks at the syntax tree). These tools are outside of Bazel and have no knowledge of Bazel build phases. In the future, we may expand the code to work on multiple files (for the static analysis), but it will still be independent from Bazel phases.
https://github.com/bazelbuild/bazel/tree/master/src/main/java/com/google/devtools/build/lib/ is the source code of Bazel. The syntax/ directory includes the code for reading and evaluating the Starlark files. The code there is called by Skyframe. The interpreter is called by Skyframe many times in parallel, both during the loading and the analysis phases.
If you have a more specific question (what are you trying to do?), I can help more. :)

Saving external dependencies to projects repository

"(new_)git_repository" and "(new_)http_archive" workspace rules deal with external projects in such way that any external dependency is copied to temporary directory linked to workspace as ${WORKSPACE}/bazel-workspace/external/${EXTERNAL_DEP_NAME} on build or prefetch.
I'd like to save external dependencies locally in my repo, so that if a remote repository vanishes I'd still have a copy of the dependency, even on a new machine where it wasn't cached.
Can I somehow change the default behaviour without writing a custom workspace rule?
Bazel does have a flag you could use for this: --experimental_repository_cache. It is designed to be a system-wide cache so that multiple projects on one machine don't have to re-download dependencies, but you could use it per-repository. Basically you'd say:
bazel build --experimental_repository_cache=$PWD/my_cache //foo
Then all external repositories would be downloaded to the my_cache directory in your project.
This is a cache keyed by the hash of your external dependencies' content, so it's not going to be very human-readable, but it would let you keep your external dependencies in your VCS fairly easily.
(Theoretically you could even check in a .bazelrc file to specify this option by default, but --experimental_repository_cache only takes an absolute path right now, so it's a bit impractical. I filed a bug to handle the relative path use case.)
I might be wrong, but it sounds like you want to just check it into the VCS. If we're talking about an HTTP archive, then download it manually, stick it under the relevant "third_party" subfolder with the BUILD file you craft for it, and you're done.
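For instance, if the dependency happened to be a C++ library, the hand-crafted BUILD file under third_party/somelib/ might look roughly like this (the name and glob patterns are illustrative):

cc_library(
    name = "somelib",
    srcs = glob(["src/**/*.cc"]),
    hdrs = glob(["include/**/*.h"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)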
If you want to use Bazel mechanisms to download and check in the external dependencies, then this isn't currently supported.
Maybe you should open an issue.

Ant: Is it possible to create a dynamic Ant script?

So, at work, I frequently have to create virtually identical Ant scripts. Basically the application we provide to our clients is designed to be easily extensible, and we offer a service of designing and creating custom modules for it. Because of the complexity of our application, with lots of cross-dependencies, I tend to develop the module within our core dev environment, compile it using IntelliJ, and then run a basic Ant script that does the following tasks:
1) Clean build directory
2) Create build directory and directory hierarchy based on package paths.
3) Copy class files (and source files to a separate sources directory).
4) Jar it up.
The thing is, to do this I need to go through the script line by line and change a bunch of property names, so it works for the new use case. I also save all the scripts in case I need to go back to them.
This isn't the worst thing in the world, but I'm always looking for a better way to do things. Hence my idea:
For each specific implementation I would provide an Ant script (or other file) of just properties: key-value pairs, which would have specific prefixes for each key based on what it's used for. I would then want my Ant script to run the various tasks, executing each one for the key-value pairs that are appropriate.
For example, copying the class files. I would have a property with a name like "classFile.filePath". I would want the script to call the task for every property it detects that starts with "classFile...".
Honestly, from my current research so far, I'm not confident that this is possible. But... I'm super stubborn, and always looking for new creative options. So, what options do I have? Or are there none?
It's possible to dynamically generate Ant scripts; for example, the following does this using an XML input file:
Use pure Ant to search if list of files exists and take action based on condition
Personally I would always try and avoid this level of complexity. Ant is not a programming language.
Looking at what you're trying to achieve it does appear you could benefit from packaging your dependencies as jars and using a Maven repository manager like Nexus or Artifactory for storage. This would simplify each sub-project build. When building projects that depend on these published libraries you can use a dependency management tool like Apache ivy to download them.
Hope that helps; your question is fairly broad.

Multiple classifiers in Maven

Being a Maven newbie, I want to know if it's possible to use multiple classifiers at once; in my case it would be for generating different jars in a single run. I use this command to build my project:
mvn -Dclassifier=bootstrap package
Logically I would think that this is possible:
mvn -Dclassifier=bootstrap,api package
I am using Maven 3.0.4
Your project seems like a candidate for refactoring into a couple of what Maven calls "modules". This involves splitting the code into separate projects within a single directory tree, where the topmost level is normally a parent or aggregator POM with <packaging>pom</packaging> and a <modules/> list containing the sub-project directory names.
Then, I'd advise putting the API interfaces/exceptions/whatnot into an api/ subdirectory with its own pom.xml, and putting the bootstrap classes into a bootstrap/ subdirectory with its own pom.xml. The top-level pom.xml would then list the modules like this:
<modules>
  <module>api</module>
  <module>bootstrap</module>
</modules>
Once you've refactored the project, you will probably want to add a dependency from the bootstrap module to the api module, since I'm guessing the bootstrap will depend on interfaces/etc. from the api.
Now, you should be able to go into the top level of the directory structure and simply call:
mvn clean install
This approach is good because it forces you to think about how different use cases are supported in your code, and it makes dependency cycles between classes harder to miss.
If you want an example to follow, have a look at one of my github projects: Aprox.
NOTE: If you have many modules dependent on the api module, you might want to list it in the top-level pom.xml in the <dependencyManagement/> section, so you can leave off the version in submodule dependency declarations (see Introduction to the Dependency Mechanism).
UPDATE: Legacy Considerations
If you can't refactor the codebase for legacy reasons, etc. then you basically have two options:
1) Construct a series of pom.xml files in an empty multimodule structure, and use the build-helper-maven-plugin along with source includes/excludes to fragment the codebase and allocate the classes to different modules out of a single source tree.
2) Maybe use a plugin like the assembly plugin to carve up the target/classes directory (${project.build.outputDirectory}) and allocate classes to the different jars. In this scenario, each assembly descriptor requires an <id/> and by default this value becomes the classifier for that assembly jar. Under this plan, the "main" jar output will still be the monolithic one created by the Maven build. If you don't want this, you can use a separate execution of the assembly plugin, and in the configuration use <appendAssemblyId>false</appendAssemblyId>. If the output of that assembly is a jar, then it will effectively replace the old output from the jar plugin. If you decide to pursue this approach, you might want to read the assembly plugin documents to get as much exposure to different examples as you can.
Also, I should note that in both cases you would be stuck with manipulating the list of things produced by using a set of profiles in the pom in order to turn on/off different parts of the build. I'd highly recommend making the default, un-qualified build one that produces everything. This makes it more likely for things like the release plugin to catch everything you want to release, and up-rev versions, etc. appropriately.
These solutions are usually what I promote as migration steps when you can't refactor the codebase all at once. They are especially useful when migrating from something like an Ant build that produces multiple jars out of a single source tree.
