Bazel: Separate output directories for different build options

Bazel has different output directories for different compilation modes (fastbuild, opt, dbg), which allows it to keep the release build cache intact after you compile the app in debug mode. Which is great.
Is it possible to do the same for different compilation options?
My use-case: I have a custom-built C++ symbolic computation system. Each run of the program is one computation. Most computations take a few seconds, but some take minutes. To speed up the latter I unrolled several low-level functions, and now thousands of lines of code are copied into every compilation unit (because the functions are templated). This had a decent effect on computation speed but also slowed down compilation significantly. It really only makes sense to use these optimizations for a small fraction of runs.
So, I put them under a define which I can toggle via --cxxopt=-DUNROLL_ALL_THE_THINGS. But whenever I switch from the unrolled version to the simple one and back, Bazel drops the compilation cache. In essence, I've split “opt” mode into two (“opt” and “super-opt”), but I can't make Bazel see it this way.

One can use the --platform_suffix option to manually add a suffix to the output directory name. So, you could pass --platform_suffix=super whenever you use --cxxopt=-DUNROLL_ALL_THE_THINGS.

For your different commands, pass --platform_suffix=your_configuration_name.
So, if you have a custom dbg configuration, it will be
--platform_suffix=dbg-myconfig
Similarly, for the opt build, pass this flag:
--platform_suffix=opt-myotherconfig
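If you'd rather not remember to pass both flags by hand, you can pair them in a .bazelrc config (a sketch; the config name superopt is just an illustrative choice):

build:superopt --cxxopt=-DUNROLL_ALL_THE_THINGS
build:superopt --platform_suffix=super

Then bazel build --config=superopt //your:target writes to its own output directory, so the plain opt cache survives.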

Related

how to minimize edit-build-test cycle for developing a single clang tool?

I am trying to build a clang static checker that will help me identify path(s) to a line in code where variables of type "struct in_addr" (on Linux) are being modified in C/C++ programs. So I started to build a tool that will find lines where "struct in_addr" variables are being modified (and will then attempt to trace paths to such places). For my first step I felt I only needed to work with the AST; I would work with paths as step 2.
I started with the "LoopConvert" example. I understand it and am making some progress, but I can't find how to build only the "LoopConvert" example in the cmake build ecosystem. I had used
cmake -DCMAKE_BUILD_TYPE=Debug -G "Unix Makefiles"
when I started. I find that when I edit my example and rebuild (by typing "make" in the build directory), the build system checks everything and seems to rebuild quite a bit, though nothing has changed but one line in LoopConvert.cpp, and it takes forever.
How can I rebuild only the one tool I am working on? If I can shorten my edit-compile-test cycle, I feel I can learn more quickly.
In a comment you say that switching to Ninja helped. I'll add my advice when using make.
First, as explained in an answer to Building clang taking forever, when invoking cmake, pass -DCMAKE_BUILD_TYPE=Release instead of -DCMAKE_BUILD_TYPE=Debug. The "debug" build type produces an enormous clang executable (2.2 GB), whereas "release" produces a more reasonable size (150 MB). The large size causes the linker to take a long time and consume a lot of memory (on the order of 10 GB?), possibly leading to swapping (it did on my 16 GB VM).
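Concretely, that means re-running the earlier configure step along these lines (the trailing source directory is whatever path you used when you first ran cmake):

cmake -DCMAKE_BUILD_TYPE=Release -G "Unix Makefiles" path/to/llvm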
Second, and this more directly addresses the excessive recompilation, what I do is save the original output of make VERBOSE=1 to a file, then copy the command lines that are actually relevant to what I am changing (compile, archive, link) into a little shell script that I run thereafter. This is of course fragile, but IMO the substantially increased speed is worth it during my edit-compile-debug inner loop.
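A sketch of such a script (the two command lines are placeholders; the real ones are whatever your saved VERBOSE output shows for the compile and link steps):

#!/bin/sh
# Replay only the commands relevant to the tool being edited, captured
# once from `make VERBOSE=1`. Replace both lines with the real ones.
set -e
g++ -O2 -Ipath/to/llvm/include -c LoopConvert.cpp -o LoopConvert.o
g++ LoopConvert.o -o loop-convert -Lpath/to/llvm/lib -lclangTooling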
I have not tried Ninja yet, but have no reason to doubt your claim that it works much better. Although, again, I think you want build type to be release for faster link time.

Using large non-bazel dependencies in a bazel project

I would like to use a very large non-bazel system in a bazel project. Specifically, ROS2. This dependency provides a large number of python, C, and C++ libraries which are built using its own hand-rolled buildsystem. Obviously, I would like to avoid having to translate the entire buildsystem over to bazel.
Broadly, what's the best way of me doing this? My instinct was to use a custom repository rule to download the source (since it's split across many repositories), then use a genrule to call the ROS2 build system, and then write simple cc_import and py_library rules for each of the individual components that I need.
However, I'm having trouble with the bit where I need to call the foreign build system. It seems that genrules require a list of output files to be specified, while I would like it to make an entire build directory available.
Before I spent any more time on this, I thought I'd ask whether I'm on the right lines since I'm new to bazel. Is this a good strategy? How would you approach this problem? Are there any other projects that mainly use bazel, but call other build systems in this way that I can look at?
More recently, you can use rules_foreign_cc to build native CMake or configure/make-style projects.
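A minimal sketch of wrapping a CMake project that way (the target names and the expected output library are invented here, and the load path and attribute names have shifted between releases of the rules, so check the version you pin):

load("@rules_foreign_cc//foreign_cc:defs.bzl", "cmake")

cmake(
    name = "mylib",
    lib_source = "@mylib_src//:all_srcs",  # filegroup over the fetched sources
    out_static_libs = ["libmylib.a"],      # what the foreign build is expected to produce
)

Downstream cc_* targets can then depend on :mylib like any ordinary Bazel library.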

Handling complex and large dependencies

Problem
I've been developing a game in C++ in my spare time and I've opted to use Bazel as my build tool since I have never had a ton of luck (or fun) working with make or cmake. I also have dependencies in other languages (python for some of the high level scripting). I'm using glfw for basic window handling and high level graphics support and that works well enough but now comes the problem. I'm uncertain on how I should handle dependencies like glfw in a Bazel world.
For some of my dependencies (like gtest and fruit) I can just reference them in my WORKSPACE file and Bazel handles them automagically but glfw hasn't adopted Bazel. So all of this leads me to ask, what should I do about dependencies that don't use Bazel inside a Bazel project?
Current approach
For many of the simpler dependencies I have, I simply created a new_git_repository entry in my WORKSPACE file and created a BUILD file for the library. This works great until you get to really complicated libraries like glfw that have a number of dependencies on their own.
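For reference, a new_git_repository entry of that shape looks roughly like this (the tag and the path to the hand-written BUILD file are illustrative):

load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")

new_git_repository(
    name = "glfw",
    remote = "https://github.com/glfw/glfw.git",
    tag = "3.3.8",                            # pin whatever version you target
    build_file = "//third_party:glfw.BUILD",  # the BUILD file you wrote for it
)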
When building glfw for a Linux machine running X11, you now have a dependency on X11, which would mean adding X11 to my Bazel setup. X11 comes with its own set of dependencies (the X11 libraries like Xcursor) and so on.
glfw also tries to provide basic joystick support, which is provided by default on Linux, which is great! Except that this support comes from the kernel, which means that the kernel is also a dependency of my project. Now, I shouldn't need anything more than the kernel headers, but this still seems like a lot to bring in.
Alternative Options
The reason I took the approach I've taken so far is to keep the dependencies required to spin up a machine that can successfully build my game very minimal. In theory they just need a C/C++ compiler, Java 8, and Bazel, and they're off to the races. This is great since it also means I can create a Docker container that has Bazel installed and do CI/CD really easily.
I could sacrifice this ease and just say that you need libraries like glfw installed before attempting to compile the game, but that brings back the whole which-version-is-installed-and-how-is-it-configured problem that Bazel is supposed to help solve.
Surely there is a simpler solution and I'm overthinking this?
If the glfw project has no BUILD files, then you have the following options:
Build glfw inside a genrule.
If glfw supports some other build system like make, you could create a genrule that runs the tool. This approach has obvious drawbacks, like the not-to-be-underestimated impracticality of having to declare all inputs of that genrule, but it'd be the simplest way of Bazel'izing glfw.
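A rough sketch of that genrule (the file layout, output name, and cmake invocation are assumptions, and it relies on cmake being available on the host, which is exactly the hermeticity drawback mentioned above):

genrule(
    name = "glfw_build",
    srcs = glob(["glfw/**"]),  # every input of the foreign build must be declared
    outs = ["libglfw3.a"],
    cmd = "cmake -S $$(dirname $(location glfw/CMakeLists.txt)) -B $(RULEDIR)/build " +
          "&& cmake --build $(RULEDIR)/build " +
          "&& cp $(RULEDIR)/build/src/libglfw3.a $(location libglfw3.a)",
)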
Pre-build glfw.o and check it into your source tree.
You can create a cc_library rule for it and put the .o file in the srcs. Even though this solution is the least flexible of all, because you not only restrict the target platform to whatever the .o was built for but also make it harder to reproduce the whole build, the benefits are sometimes worth the costs.
I view this approach as a last resort. Even in Bazel's own source code there's one cc_library.srcs that includes a raw object file, because it was worth it, as the commit message of 92caf38 explains.
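In BUILD-file terms the rule is tiny (file and target names here are invented):

cc_library(
    name = "glfw",
    srcs = ["libglfw_prebuilt.o"],      # the checked-in, pre-built object file
    hdrs = glob(["include/GLFW/*.h"]),
    includes = ["include"],
)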
Require that glfw be installed.
You already considered this option. Some people may prefer this to the other approaches.

Evaluating Rascal's Performance?

I want to evaluate the performance of Rascal for a given rewrite system that I've written. I'm wondering if there's a good way of doing it?
Ideally, I'd generate some compiled Java classes from the system and then run them manually against my inputs. Is there an easy or recommended way to do it?
Cheers,
One way to do this is to use the functions in the library util::Benchmark. Typically, you could write something like
cpuTime( (){ call_the_function_I_want_to_observe(); } ). This will execute your function and return the cpu time used.
Note that Rascal can be executed in two ways: interpreted and compiled, which makes a big difference when measuring performance. We are working hard at the moment to fully integrate the compiler in the Eclipse IDE, but a stand-alone version is available as well. It can be called as java -Xss8m -jar rascal-0.8.4-SNAPSHOT.jar --compiledREPL, followed by at least values for the source directories (--src) and binaries (--bin). Here rascal-0.8.4-SNAPSHOT.jar (most likely named differently by now) is the jar downloaded from https://update.rascal-mpl.org/console/rascal-shell-unstable.jar.
If you need more information, don't hesitate to ask for more details: this part of our tool chain is unfortunately still undocumented.

Is exec a good programming solution to ant OutOfMemory issues?

This question requires a bit of backstory... At my company, we produce a set of PDF and HTML files. A very large set. The current build process (which I designed, in haste) is a Perl script that reads a set of files, where each file contains a new ant command to execute.
It is designed terribly.
Now, I'm trying to shift the entire project over to using ant for the majority of the tasks. Within a target, I can construct a list of files that need to be built, as either PDF or HTML. However, when I call the ant command to build each file, after about three builds (of, say, five), the entire process crashes with an OutOfMemory error. Furthermore, my buildlog.xml ends up being something like 20 megs--it concatenates every ant command's output into one giant log, since they are being called from a single target. With the earlier Perl solution, I was able to get a buildlog.xml for each ant command--simply save and rename the buildlog to something else.
Even if I set ant or java heap sizes in my user.properties, I still fail with an OOM eventually. I wonder if an appropriate solution is to call <exec> to launch a script that does what I described above: namely, call ant, rename the buildlog, and die--theoretically allocating and freeing up space better than one "giant" ant call. I am worried that I am heading toward another "hacky" solution to a problem that's well-defined and can be entirely confined to ant. Then again, <exec> does exist for a reason, so should I not feel bad for using it?
As with most corporate software (at least software with deadlines and, if yours doesn't have any, please let me know where you work so I can try to get a job there), the first step is to get it working.
Then, worry about getting it working well.
For that first step, you can use any tool at your disposal, no matter how ugly you think it looks.
But you might want to make sure that the powers-that-be know that you've had to do all sorts of kludgy things to get it working for them, so that they allow you to hopefully fix it up before maintenance has to start on it. You probably don't want to be maintaining a hideously ugly code base or design.
We've unleashed such wonders on the world as applications that shut themselves down nightly to avoid memory leaks (leaving the OS to restart them), "questionable" code placed at the other end of a TCP socket so its crashing doesn't bring down the main application and, I'm sure, many other horrors that my brain has decided to remove all trace of.
