Bazel TestNG XML output

I have a small Java project: one package with dependencies on Google Truth, Google Guava, the JSR305 annotations, and TestNG for unit tests. I've been having some trouble running the tests with Bazel. I can create a java_test rule and run it with bazel test, but Bazel's XML output gives me a single pass/fail for the entire test suite, with no information on individual failures. The XML from TestNG gets cleaned up along with the sandbox.
To get around this, I've created a genrule for TestNG's XML, but the documentation explicitly says "don't use genrules for testing" so I'm wondering if there's a better approach.
My BUILD file looks like this:
java_library(
    name='myproject',
    srcs=glob(['src/main/java/**/*.java']),
    deps=[
        '@com_google_code_findbugs_jsr305//jar',
        '@com_google_guava_guava//jar',
    ],
)

java_library(
    name='myproject-test-lib',
    srcs=glob(['src/test/java/**/*.java']),
    deps=[
        ':myproject',
        '@com_google_code_findbugs_jsr305//jar',
        '@com_google_guava_guava//jar',
        '@com_google_truth_truth//jar',
        '@org_testng_testng//jar',
    ],
)

java_test(
    name='myproject-test',
    size='small',
    runtime_deps=[
        ':myproject',
        ':myproject-test-lib',
        '@org_testng_testng//jar',
        '@com_beust_jcommander//jar',  # Used by TestNG CLI
        '@org_yaml_snakeyaml//jar',  # Used by TestNG to parse YAML
        '@junit_junit//jar',  # Dependency of Truth
    ],
    data=['testng.yaml'],
    use_testrunner=False,
    main_class='org.testng.TestNG',
    args=['testng.yaml'],
)

genrule(
    name='myproject-test-report',
    srcs=['testng.yaml'],
    tools=[
        ':myproject',
        ':myproject-test-lib',
        '@com_google_code_findbugs_jsr305//jar',
        '@com_google_guava_guava//jar',
        '@com_google_truth_truth//jar',
        '@org_testng_testng//jar',
        '@com_beust_jcommander//jar',  # Used by TestNG CLI
        '@org_yaml_snakeyaml//jar',  # Used by TestNG to parse YAML
        '@junit_junit//jar',  # Dependency of Truth
    ],
    outs=['testng_report'],
    cmd='$(JAVA) -cp $(location :myproject):$(location :myproject-test-lib):$(location @com_google_code_findbugs_jsr305//jar):$(location @com_google_guava_guava//jar):$(location @com_google_truth_truth//jar):$(location @org_testng_testng//jar):$(location @com_beust_jcommander//jar):$(location @org_yaml_snakeyaml//jar):$(location @junit_junit//jar) org.testng.TestNG -d $(OUTS) -usedefaultlisteners false testng.yaml',
)
...I suspect there's also a better way to deal with the classpath. My WORKSPACE file, for completeness:
workspace(name='com_example_myproject')
maven_jar(
    name='com_google_code_findbugs_jsr305',
    artifact='com.google.code.findbugs:jsr305:3.0.1',
    sha1='f7be08ec23c21485b9b5a1cf1654c2ec8c58168d',
)

maven_jar(
    name='com_google_guava_guava',
    artifact='com.google.guava:guava:21.0',
    sha1='3a3d111be1be1b745edfa7d91678a12d7ed38709',
)

maven_jar(
    name='com_google_truth_truth',
    artifact='com.google.truth:truth:0.32',
    sha1='e996fb4b41dad04365112786796c945f909cfdf7',
)

maven_jar(
    name='org_testng_testng',
    artifact='org.testng:testng:6.11',
    sha1='1fdd5e22f50b14f6d846163456e8c9a7657626fb',
)

maven_jar(
    name='com_beust_jcommander',
    artifact='com.beust:jcommander:1.64',
    sha1='456a985ac9b12d34820e4d5de063b2c2fc43ed5a',
)

maven_jar(
    name='org_yaml_snakeyaml',
    artifact='org.yaml:snakeyaml:1.17',
    sha1='7a27ea250c5130b2922b86dea63cbb1cc10a660c',
)

maven_jar(
    name='junit_junit',
    artifact='junit:junit:4.10',
    sha1='e4f1766ce7404a08f45d859fb9c226fc9e41a861',
)

By default, bazel test outputs just a summary of the test results. To see a more detailed report you can use --test_output=all. You could also set --test_summary=detailed.
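For example (the target label assumes the BUILD file shown in the question sits at the workspace root):
bazel test //:myproject-test --test_output=all --test_summary=detailed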
If this won't give you the desired output and you would prefer the TestNG log, I can think of 2 alternatives (sketched below):
1. Disable sandboxing.
2. Declare testng_report as an input file (using the data attribute of java_test). Bazel needs to know the set of input/output files and will remove everything not declared beforehand. Since for java_test there is no way to declare additional output files, try to declare it as an input, if having it at all times in the package is not an inconvenience. This is a bit hackish and I wouldn't prefer it.
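A minimal sketch of the first alternative (one way to disable sandboxing; again using the //:myproject-test label from above):
bazel test //:myproject-test --spawn_strategy=standalone
For the second alternative, listing 'testng_report' next to 'testng.yaml' in the java_test's data attribute declares it as an input, at the cost of keeping that file in the package.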
Hope this helps.

I think this is a new option, but I was able to use --sandbox_writable_path to make a directory in CI writable, and then specify my test output to go to that directory.
--sandbox_writable_path=<a string> multiple uses are accumulated
For sandboxed actions, make an existing directory writable in the sandbox (if supported by the sandboxing implementation, ignored otherwise).
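For example (a sketch: the /var/ci/testng-reports path is made up, any pre-existing directory your CI can write to works):
bazel test //:myproject-test --sandbox_writable_path=/var/ci/testng-reports
and then point TestNG's -d output directory at that same path.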

Related

Bazel fetch remote file not as a WORKSPACE rule?

In Bazel, how do I fetch a remote file as a build rule not as a WORKSPACE rule?
I want to use a build rule because WORKSPACE rules are not loaded transitively.
E.g. this fails:
load("#bazel_tools//tools/build_defs/repo:http.bzl", "http_file")
http_file(
name = "foo",
urls = [ "https://example.com" ],
sha256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
executable = True,
)
Error in repository_rule: 'repository rule http_file' can only be called during workspace loading
If you really want to do that, you have to implement your own rule; a naïve, trivial example relying on curl to fetch could be:
def _impl(ctx):
    args = ctx.actions.args()
    args.add("-o", ctx.outputs.out)
    args.add(ctx.attr.url)
    ctx.actions.run(
        outputs = [ctx.outputs.out],
        executable = "curl",
        arguments = [args],
    )

get_stuff = rule(
    _impl,
    attrs = {
        "url": attr.string(
            mandatory = True,
        ),
    },
    outputs = {"out": "%{name}.out"},
)
But (especially in such a trivial form) it comes with problems. Apart from anything else: do you want to step out of the sandbox during the build? Do you want to talk to someone across the network during the build (outside of the sandbox)? It bypasses the repository_cache, and possibly gets remote_cache involved (networked caching of networked fetching). Specifically in this example, if the content of the file pointed to by url changes, the build has no idea and only fetches it again when it either hasn't done so yet or the url itself has changed. I.e. the implementation would need to be more robust (mimic that of http_file, for instance).
But it actually sounds like you're trying to address a different problem (transitive external dependencies), for which there could be another solution. One trick used for that is to define a macro in your first-level dependency that declares the next hop; after declaring that first level as an external dependency in your parent project, load that macro and call it from the parent project's WORKSPACE. This too has a price, though: the first-level dependency always has to be present (fetched or already cached), even if the build target asked for does not actually need it (as that load and macro call will always pull it in).
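A minimal sketch of that trick, with all names invented for illustration: the first-level dependency ships a deps.bzl exporting a macro, and the parent WORKSPACE loads and calls it.
# deps.bzl, inside the first-level dependency (@first_level_dep)
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def first_level_deps():
    # Declares the "next hop" of external repositories on behalf of the parent.
    http_archive(
        name = "second_level_dep",
        urls = ["https://example.com/second_level_dep.tar.gz"],
        sha256 = "0000000000000000000000000000000000000000000000000000000000000000",  # placeholder
    )

# Parent project's WORKSPACE
load("@first_level_dep//:deps.bzl", "first_level_deps")
first_level_deps()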

Bazel external dependency rebuilt unnecessarily in rule used as a tool from within another rule

I am working on a set of Bazel rules where one test rule is also executed as a tool from within another executable rule. The test rule depends on an external tool which is built by rules_foreign_cc.
symbiyosys_test = rule(
    implementation = _symbiyosys_test_impl,
    doc = "Formal verification of (System) Verilog.",
    attrs = {
        ...,
        "_yosys_toolchain": attr.label(
            doc = "Yosys toolchain.",
            default = Label("@rules_symbiyosys//symbiyosys/tools:yosys"),
        ),
        "_yices_toolchain": attr.label(
            doc = "Yices toolchain.",
            default = Label("@rules_symbiyosys//symbiyosys/tools:yices"),
        ),
    },
    test = True,
)

symbiyosys_trace = rule(
    implementation = _symbiyosys_trace_impl,
    doc = "View VCD trace from Symbiyosys.",
    attrs = {
        "test": attr.label(
            doc = "Symbiyosys test target to produce VCD file.",
            mandatory = True,
            executable = True,
            cfg = "exec",
        ),
        ...,
    },
    executable = True,
)
With a virgin Bazel cache, when an instance of the test rule is run with bazel test //examples:counter_fail the external tool is built. The external tool is also built when an instance of the executable rule (which utilizes the test rule) is run with bazel run //examples:counter_fail_trace. Once the external tool has been built in these two contexts, subsequent tests or runs use the cached outputs.
Building the external tool twice seems unnecessary as both the test and executable rule have the same configuration ("exec"). I have a hunch that this may have to do with bazel test and bazel run invoking different command line options causing the cache to miss on the external dependency.
My question is primarily what is causing this rebuild and how do I get rid of it? And short of answering that, what are some techniques to dig into what is causing this rebuild? I have tried some basic Bazel queries, but haven't had much luck.
EDIT
I still haven't cracked this one. I do suspect a subtle difference between bazel test and bazel run but unfortunately there is limited information about how specifically the two differ in the documentation.
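One thing I haven't tried yet, but which might narrow it down (a sketch, not a verified fix), is dumping the execution log of both invocations and diffing them:
bazel test //examples:counter_fail --execution_log_json_file=/tmp/test_log.json
bazel run //examples:counter_fail_trace --execution_log_json_file=/tmp/run_log.json
# a diff of the two logs should show which action inputs, environment
# variables, or arguments differ between the test and run invocations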

Jenkins PyLint Warnings tool parses log files but reports 'found 0 issues'

I have setup Jenkins to run pylint on all python source files and all the log files are generated (apparently correctly) into a sub-directory as follows:
Source\pylint_logs\pylint1.log, pylint2.log, ..., pylint75.log
I have included a --msg-template definition based on the instructions on my Jenkins Configure page: Post-build Actions->Record compiler warnings and static analysis results->Static Analysis Tools. The template is shown as:
msg-template={path}:{line}: [{msg_id}, {obj}] {msg} ({symbol})
An example of one of the log files being generated by Jenkins/pylint is as follows:
************* Module FigureView
myapp\Views\FigureView.py:1: [C0103, ] Module name "FigureView" doesn't conform to snake_case naming style (invalid-name)
myapp\Views\FigureView.py:30: [C0103, FigureView.__init__] Attribute name "ax" doesn't conform to snake_case naming style (invalid-name)
------------------------------------------------------------------
Your code has been rated at 8.57/10 (previous run: 8.57/10, +0.00)
For the PyLint Report File Pattern, I have: Source/pylint_logs/pylint*.log
It appears that PyLint Warnings is parsing the files because the console output looks like this:
[PyLint] Searching for all files in 'D:\Jenkins\workspace\PROJECT' that match the pattern 'Source/pylint_logs/pylint*.log'
[PyLint] -> found 75 files
[PyLint] Successfully parsed file D:\Jenkins\workspace\PROJECT\Source\pylint_logs\pylint1.log
[PyLint] -> found 0 issues (skipped 0 duplicates)
[PyLint] Successfully parsed file D:\Jenkins\workspace\PROJECT\Source\pylint_logs\pylint10.log
[PyLint] -> found 0 issues (skipped 0 duplicates)
This repeats for all 75 files, even though there are plenty of issues in the log files.
What is odd, is that when I was first prototyping the use of Jenkins on this project, I set it up to just run pylint on a single file. I ran across another StackOverflow post that showed a msg-template that allowed me to get it working (unable to get pylint output to populate the violations graph). I even got the graph to show up for the PyLint Warnings Trend. I used the following definition per the post:
msg-template={path}:{line}: [{msg_id}({symbol}), {obj}] {msg}
Note that this format is slightly different from the one recommended by my Jenkins page (shown earlier). Even though this worked for a single file, neither template now seems to work for multiple files, or else there is something other than the template causing the problem. My graph has flat-lined, and I always get 0 issues reported.
I have had trouble finding useful documentation on the Jenkins PyLint Warnings tool. Does anyone have any ideas or pointers to documentation I can research further? Thanks much!
Ensure you pass the output-format parameter in the pylint command. Example:
pylint --exit-zero --output-format=parseable module1 module2 > pylint.report
You have to set Pylint's msg-template option in .pylintrc as:
msg-template={path}: {line}: [{msg_id} ({symbol}), {obj}] {msg}
output-format=text
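If you prefer passing the same options on the pylint command line instead of .pylintrc (the module and log path below are just the ones from the question), the equivalent invocation would look like:
pylint --exit-zero \
    --output-format=text \
    --msg-template="{path}: {line}: [{msg_id} ({symbol}), {obj}] {msg}" \
    myapp > Source/pylint_logs/pylint1.log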

How do I get output files for a given Bazel target?

Ideally, I'd like a list of output files for a target without building it. I imagine this should be possible using cquery, which runs post-analysis, but I can't figure out how.
Here's my output.cquery:
def format(target):
    outputs = target.files.to_list()
    return outputs[0].path if len(outputs) > 0 else "(missing)"
You can run this as follows:
bazel cquery //a/b:bundle --output starlark \
--starlark:file=output.cquery 2>/dev/null
bazel-out/darwin-fastbuild/bin/a/b/something-bundle.zip
For more information, see the documentation on cquery.
What exactly do you mean by "output files" here? Do you mean that you'd like to know the files generated if you build the target on the command line?
At what point would you like to have this information? Do you really want to invoke a bazel query command to acquire this information, or would you like it during analysis? I don't think there's a way, using bazel query, to get the exact expected absolute path of output files (or even the workspace-relative path, for example bazel-out/foo/bar/baz.txt).
It may be a bit more involved than you want, but Requesting Output Files has some information about specifying output files in Starlark, with a brief bit about acquiring information about your dependencies' output files (see DefaultInfo).
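As a rough illustration of that latter point (the rule and attribute names here are invented, not from the linked docs), a rule can read a dependency's default outputs during analysis via DefaultInfo:
def _list_outputs_impl(ctx):
    # Files the dependency would produce when built on the command line.
    dep_files = ctx.attr.dep[DefaultInfo].files.to_list()
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.write(out, "\n".join([f.short_path for f in dep_files]))
    return [DefaultInfo(files = depset([out]))]

list_outputs = rule(
    implementation = _list_outputs_impl,
    attrs = {"dep": attr.label(mandatory = True)},
)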
I made a slight improvement to Engene's answer, since a target might have multiple outputs:
bazel cquery --output=starlark \
--starlark:expr="'\n'.join([f.path for f in target.files.to_list()])" \
//foo:bar

Different checksum results for jar files compiled on subsequent build?

I am verifying that the jar files present on remote Unix boxes match those built on my local machine (Windows & Cygwin) with the same JVM.
As a POC, I am trying to verify whether the same checksum is produced for jar files generated on my machine by consecutive builds. I tried the following:
1. Generated the jar file the first time using an ant script
2. Calculated the checksum (e.g. "xyz abc")
3. Generated the jar file again with the same ant script, without changing anything
4. Got a different checksum but the same byte count (e.g. "xvw abc")
I am not sure how Java internally produces the class files and then the jar files. Can someone please help me understand the points below?
1. Does the cksum utility of unix/cygwin consider the timestamp of the file while coming up with the value?
2. Will the checksum be different for the compiled class files/jar file produced if we keep everything else the same [compiler version + source code + machine + environment]?
Answer to question 1: cksum doesn't look at the timestamp of the archive (e.g. the jar file) itself, but the timestamps of the files inside the jar are part of the archive's bytes, so they do affect its checksum.
Answer to question 2: The checksums of the individual class files will be the same with all other things the same (source code, compiler, etc.). The checksums of the jar files will be different. Causes of differences can be the timestamps of the files inside the jar file, or files being put into the archive in a different order (e.g. caused by parallel builds).
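A quick way to check this yourself (the jar path is illustrative): checksum the jar from two consecutive builds, then checksum the extracted class files, which should stay identical across builds.
cksum build/myproject.jar                         # differs between consecutive builds
unzip -q build/myproject.jar -d extracted
find extracted -name '*.class' -exec cksum {} \;  # stable across builds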
If you want to create a reproducible build with gradle you can do so with the config below:
tasks.withType(AbstractArchiveTask) {
preserveFileTimestamps = false
reproducibleFileOrder = true
}
Maven allows something similar; sorry, I don't know how to do this with Ant.
More info here:
https://dzone.com/articles/reproducible-builds-in-java
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74682318
