How to understand Bazel's output time?

Every time a build finishes, I see something like:
Elapsed time: 1034.748s, Critical Path: 257.54s
What is the difference between Elapsed Time and Critical Path? What could cause the gap between the two?
Forwarded from: https://github.com/bazelbuild/bazel/issues/3164

"Elapsed time" shows the wall time of the build, from the moment Bazel started running the first build action until the last action finished.
"Critical path" shows the wall time spent building the longest chain of actions, where each subsequent action depends on the output(s) of the previous one, so they must be run sequentially. The critical path is a lower limit on the clean build time of this build; even if the CPU had more cores than the number of actions Bazel ever runs in parallel, the build could still not complete any faster.
The time difference is caused by Bazel executing other actions too. There were presumably more actions to run than just those on the critical path.
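To make the distinction concrete, here is a minimal sketch in Python. The action names, durations and the greedy scheduler are all made up for illustration; this is not how Bazel schedules work internally. It only shows that the critical path is the longest dependency chain, while elapsed time also depends on how many actions can run at once.

```python
import heapq
from functools import lru_cache

# Hypothetical action graph: action -> (duration in seconds, dependencies).
ACTIONS = {
    "compile_a": (4.0, []),
    "compile_b": (3.0, []),
    "compile_c": (5.0, []),
    "link": (2.0, ["compile_a", "compile_b"]),
    "test": (6.0, ["link"]),
}


def critical_path(actions):
    """Return the length of the longest dependency chain, in seconds."""

    @lru_cache(maxsize=None)
    def finish(name):
        duration, deps = actions[name]
        # An action can only start after its slowest dependency has finished.
        return duration + max((finish(dep) for dep in deps), default=0.0)

    return max(finish(name) for name in actions)


def elapsed_time(actions, workers):
    """Return the wall time of a simple greedy schedule with `workers` slots.

    Assumes the graph is acyclic.
    """
    done, started = set(), set()
    running = []  # min-heap of (finish_time, action)
    now = 0.0
    while len(done) < len(actions):
        # Start every ready action while a worker slot is free.
        for name, (duration, deps) in actions.items():
            if len(running) >= workers:
                break
            if name not in started and all(dep in done for dep in deps):
                started.add(name)
                heapq.heappush(running, (now + duration, name))
        # Advance the clock to the next action that finishes.
        now, finished = heapq.heappop(running)
        done.add(finished)
    return now


print(critical_path(ACTIONS))    # longest chain: compile_a -> link -> test
print(elapsed_time(ACTIONS, 1))  # one worker: every action runs back to back
print(elapsed_time(ACTIONS, 8))  # plenty of workers: bounded by the critical path
```

With a single worker the elapsed time is the sum of all durations; with enough workers it drops to the critical path and no further.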

Related

bazel: why do targets keep increasing if build graph is known

Per this doc, the analysis and execution phases handle building out the dependency tree (among other things) and going and doing the work if needed, respectively. If that's true, I'm curious why the total number of targets keeps increasing as the build progresses (i.e., when I start a large build, bazel may report that it's built 5 out of 100 targets, but later will say it's built 20 out of 300 targets, and so forth, with the denominator increasing for a while until it levels off).
I've heard the loading and analysis phases can be intermixed. My likely incomplete or incorrect understanding is that when bazel parses a BUILD file, analysis is invoked to determine what dependencies are needed for the requested targets on the command line, and then I guess this is somehow communicated back to the loader to pull in any other BUILD files referenced by these dependencies, which could cause the loader to go out and fetch a remote repo if the dependency (and thus BUILD file) is not in the local repo.
However, my understanding was also that, while dynamic build-out of the dependency graph is a potential future direction for bazel, execution currently does not intermix with analysis. So when execution begins, the full dependency tree should be available to bazel (and thus the total number of targets known)? Does bazel have the full tree but just not want to traverse it to get a count in case it's big, or is something else going on here?
Note: I found a brief mention of this phenomenon here, but without an explanation as to why it happens.
The number you're seeing in the progress bar refers to actions (command-lines..ish) and not targets (e.g. //my:target). I wrote a blog post about the action graph, and here's the relevant description about it:
The action graph contains a different set of information: file-level
dependencies, full command lines, and other information Bazel needs to
execute the build. If you are familiar with Bazel’s build phases, the
action graph is the output of the loading and analysis phase and used
during the execution phase.
However, Bazel does not necessarily execute every action in the graph.
It only executes if it has to, that is, the action graph is the super
set of what is actually executed.
As to why the denominator is ever-increasing, it's because the actions-to-execute discovery within the action graph is lazy. Here's a better explanation from the Bazel TL, Ulf Adams:
The problem is that Skyframe does not eagerly walk the action graph,
but it does it lazily. The reason for that is performance, since the
action graph can be rather large and this was previously a blocking
operation (where Bazel would just hang for some time). The downside is
that all threads that walk the action graph block on actions that they
execute, which delays discovery of remaining actions. That's why the
number keeps going up during the build.
Source: https://github.com/bazelbuild/bazel/issues/3582#issuecomment-329405311
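A toy model of that lazy walk, in Python, shows why the denominator grows. This is only an illustration under simplifying assumptions (a single thread, a made-up graph in `DEPS`, and a hypothetical `lazy_build` function); it is not Skyframe's actual algorithm.

```python
# Hypothetical action graph: action -> the actions it depends on.
DEPS = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["gen_a"],
    "lib_b": ["gen_b1", "gen_b2"],
    "gen_a": [],
    "gen_b1": [],
    "gen_b2": [],
}


def lazy_build(root, deps):
    """Depth-first build that counts an action only once the walk reaches it.

    Returns the (completed, discovered) pairs a progress bar would display.
    """
    discovered = set()
    completed = []
    progress = []

    def visit(action):
        if action in discovered:
            return
        discovered.add(action)
        # The walk is stuck in this subtree until it is built, so sibling
        # subtrees stay undiscovered in the meantime.
        for dep in deps[action]:
            visit(dep)
        completed.append(action)  # "execute" the action
        progress.append((len(completed), len(discovered)))

    visit(root)
    return progress


for done, total in lazy_build("app", DEPS):
    print(f"{done} / {total} actions")
```

The total starts at 3 and only reaches 6 after the first subtree has been built, which mirrors the growing denominator described above.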

Jenkins job that calls cyclically a script

I'm having some difficulties in getting this one working.
How would I configure a Jenkins job so that it calls a script on a node, lets the node execute it, and, when the script exits, does not declare the build finished but instead waits for a certain amount of time and calls the script again? The waiting period would have to be calculated dynamically at runtime based on a target start time. Such a build would have to be stopped by some user input, not by aborting the build.
I know a pipeline might be needed in this case, but I'm not sure what the build history would look like, as I intend to have only one build appearing there rather than a bunch of builds spawned by the main one. Hopefully I was able to make myself understood.
Thank you very much.

Can I create a unit test that fails if compile time is above an acceptable level?

Sometimes code finds its way onto my team's dev branch that compiles very slowly. When compile times reach several minutes, we have no choice but to drop our tasks and search for the cause; otherwise we'd lose a lot of time until it is resolved.
For our app's performance we have unit tests that stop our users from experiencing slowdowns. I'm wondering if it's possible to devise a test that fails on slow compile times, so the changes that cause them can be identified and removed immediately, before they waste the entire team's time.
You can add the following flag in your project's Build Settings -> Other Swift Flags: -Xfrontend -warn-long-function-bodies=<time>, where <time> is the threshold in milliseconds. You'll then see a warning for any function that takes longer than that to compile, and you can fix it.
It's not going to fail your tests, but the whole team will be aware when they write something that takes too long to compile.
Maybe it needs a little extra effort, but a possible solution:
I use 'xcodebuild clean build OTHER_SWIFT_FLAGS="-Xfrontend -debug-time-function-bodies" | grep "[0-9][0-9].[0-9]*ms" | sort -nr > culprits.txt' to get a text file with the time it took to compile individual methods. Then you know where the compiler gets stuck and can optimize until the method that takes the longest to compile is <100ms.
source
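To turn that timing output into something that can fail a CI step, a small script along these lines might work. It is a sketch in Python and assumes each timing line starts with a duration such as `12.3ms` followed by the location and function name; the helper name `slow_functions`, the threshold and the sample log are made up for illustration.

```python
import re

# Assumed line shape: "<duration>ms<whitespace><location and function name>".
TIMING_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)ms\s+(.*)$")


def slow_functions(log_text, threshold_ms):
    """Return (milliseconds, description) pairs above the threshold, slowest first."""
    slow = []
    for line in log_text.splitlines():
        match = TIMING_LINE.match(line)
        if match and float(match.group(1)) > threshold_ms:
            slow.append((float(match.group(1)), match.group(2)))
    return sorted(slow, reverse=True)


# Made-up sample in the assumed format.
SAMPLE_LOG = """\
0.4ms\tApp/Model.swift:10:5\tfunc load()
250.1ms\tApp/View.swift:42:9\tfunc layout()
99.0ms\tApp/Util.swift:7:1\tfunc parse()
"""

for millis, description in slow_functions(SAMPLE_LOG, threshold_ms=100):
    print(f"{millis}ms {description}")
```

A CI step could read the real build log instead of `SAMPLE_LOG` and exit with a non-zero status whenever the returned list is not empty.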

TFS Build takes long time

In our company we use Gated Checkin to make sure committed code doesn't have any problems, and we also run all of our unit tests there. We have around 450 unit tests.
Building the entire solution takes 15-20 seconds on my local computer, and the tests take maybe 3 minutes. On the server, the build takes 10 minutes. Why is that? Are there extra steps being run that I don't know of?
Be aware that there are additional overheads (clean/get workspace is the main culprit most of the time) in the workflow prior to the actual build and then test cycle. I have seen the same behaviour myself and never really got to a point where the performance was that close to what it would be locally.
Once the build is running, you can view the progress and see where the time is being spent; this will also be in the logs.
In the build process parameters you can skip some extra steps if you just want to build the checked in code.
Set all these to False: Clean Workspace, Label sources, Clean build, Update work items with build number.
You could also avoid publishing (if you're doing it) or copying binaries to a drop folder (also, if you're doing it).
As others have suggested, take a look at the build log, it'll tell you what's consuming the time.

How To Abort Another Jenkins Job?

I have two Jenkins jobs, COMPILE and TEST. COMPILE triggers TEST; COMPILE is quick, TEST is slow. COMPILE re-creates data which is used by TEST, so if COMPILE runs while TEST is running, TEST might fail due to the necessary data being temporarily unavailable or incomplete.
Sometimes, COMPILE gets triggered a lot (via CMS, busy development). The standard way would be to synchronize COMPILE and TEST via a lock, so that both never run at the same time but instead wait for the other to finish before starting. This waiting does not really suit me as it delays the COMPILE jobs too much.
An alternative might be to turn TEST to concurrent running, but in my case TEST requires too many resources to be able to run concurrently.
So my approach now is to configure COMPILE so that it first aborts a running TEST job (in case one is running) and then starts its work (eventually triggering TEST again in the end). Several quickly performed COMPILE builds will then each start a TEST build which will all be aborted (gray bubble). Only the last TEST build will be completed and show a decent red or green (or yellow) bubble. But that will be enough in my case (and I accept the drawback that this way I cannot detect exactly which commit broke the build).
Now the question is: How can I make COMPILE abort a TEST build? (Preferably in an elegant way.)
I only found a way to generally abort a job from the outside using Jenkins's REST interface, but that doesn't seem to be very elegant. Is there a more elegant way, maybe using a Jenkins-internal feature I don't know or maybe by using a suitable plugin?
I would suggest consolidating the two jobs into a single job. That would allow for multiple simultaneous executions and will still fail if the compile fails.
You can set the number of executors for the node to "1". This makes sure only one job runs at a time on the node: even if a second job is triggered in parallel, it will wait for the first job to complete before it starts.
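For reference, the REST-based abort mentioned in the question could look roughly like the Python sketch below. It assumes the Jenkins instance exposes the `stop` endpoint on a build URL and accepts HTTP basic authentication with a user name and API token; the server URL, job name and credentials are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def stop_url(jenkins_url, job_name):
    """Build the URL that asks Jenkins to stop the latest build of a job."""
    job = urllib.parse.quote(job_name, safe="")
    return f"{jenkins_url.rstrip('/')}/job/{job}/lastBuild/stop"


def abort_last_build(jenkins_url, job_name, user, api_token):
    """POST to the stop endpoint; returns the HTTP status code."""
    request = urllib.request.Request(
        stop_url(jenkins_url, job_name), data=b"", method="POST"
    )
    # Assumption: basic auth with an API token is enabled on the server.
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    request.add_header("Authorization", f"Basic {credentials}")
    with urllib.request.urlopen(request) as response:
        return response.status


# Placeholder server and job name, for illustration only.
print(stop_url("http://jenkins.example/", "TEST"))
```

COMPILE could call `abort_last_build` as its first build step, before re-creating the data that TEST depends on.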
