**INVALID** Test fails but ball is still green - jenkins

Edit: Please see my own answer
With the xUnit plugin in Jenkins I have configured Post-build Actions > Publish xUnit test result report > Failed Tests to set the build status to red, with a threshold of 1 (set in the Total field). The other fields are left empty.
But although there is a test that fails, the status ball stays green. Worth mentioning is that the xUnit plugin detects the failure: it is listed under Latest Test Result and also in the Test Result Trend graph.
Along with the configuration form there is a description stating: "..A build is considered as unstable or failure if the new or total number of failed tests exceeds the specified thresholds." But that seems strange: 1 will never exceed 1. Are you not allowed to mark the build as failed for a single failed test case? I cannot recall seeing this before.

OK, sometimes one's brain certainly malfunctions.
1 may not exceed 1 but it exceeds 0 for sure. I think I need to go on vacation.
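To make the "exceeds" semantics quoted from the configuration help concrete, here is a minimal sketch (plain Java, not the plugin's actual code) of the comparison it describes; the variable names are mine:

public class XUnitThresholdSketch {
    public static void main(String[] args) {
        int failedTests = 1;

        // Failed Tests threshold set to 1, as in the question:
        // 1 does not exceed 1, so the ball stays green.
        System.out.println(failedTests > 1); // false

        // Threshold left at 0: 1 exceeds 0, so the build is marked red.
        System.out.println(failedTests > 0); // true
    }
}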

Related

Jenkins plugin with viewing/aggregating possibilities depending on one of the parameters

I'm looking for a plugin that gives me aggregation of settings and a view across many cases, the same way a multi-branch pipeline does. But instead of being based on various branches, I want it based on one branch with varying parameters. The picture below is from the mentioned multi-branch pipeline; instead of "Branches" I'm looking for "Cases", and instead of the "Name" column I need a configurable parameter.
In addition, I need to have various periodic build triggers of the form:
H 22 * * 5 %param1=value1 %param2=value3
H 22 * * 5 %param1=value2 %param2=value3
The second requirement could be done in a standard job, but there will be many such cases launched periodically (every week, every two weeks, or every month), and the difference in param1 is crucial: it is important to have it readable and easily visible, so I can quickly distinguish which case has failed.
I was looking for such a plugin but couldn't find anything like this. Maybe someone knows such a plugin, or another way to solve it.
I have the alternative of creating a "super" job which, in its build steps, would launch my current job with specific parameters. But then the readability would change from many rows to many columns, and since the number of cases is over 20, that would IMHO significantly decrease readability (in the column solution). Additionally, not all cases would be launched with the same periodicity, so there would have to be some ready-made sets selected by a parameter, and most of the time the super-build runs would consist mostly of skips. As a result, one might not see the last result for one of the cases.
Note that param2 always has the same value for periodic launches; other values are used only with a manual trigger. Param2 can, but doesn't have to, be visible in the "multi-branch pipeline"-like solution.
I hope my explanation of the issue is clear. Looking forward to answers/suggestions, etc. :)

Count distinct values in a stream pipeline

I have a pipeline that looks like
pipeline.apply(PubsubIO.Read.subscription("some subscription"))
    .apply(Window.<String>into(
                SlidingWindows.of(Duration.standardMinutes(10))
                              .every(Duration.standardSeconds(20)))
        .triggering(AfterProcessingTime.pastFirstElementInPane()
            .plusDelayOf(Duration.standardSeconds(20)))
        .withAllowedLateness(Duration.ZERO)
        .accumulatingFiredPanes())
    .apply(RemoveDuplicates.create())
    .apply(Window.discardingFiredPanes()) // this is suggested in the warnings under https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes
    .apply(Count.<String>globally().withoutDefaults())
This pipeline overcounts distinct values significantly (20x the normal value). Initially I suspected that the default trigger may have caused this issue. I have tweaked it to use triggers that allow no lateness/discard fired panes/use processing time, all of which have similar overcount issues.
I've also tried ApproximateUnique.globally: it failed during pipeline construction because of an exception that looks like
Default values are not supported in Combine.globally() if the output PCollection is not windowed by GlobalWindows.
There seems to be no way to add withoutDefaults to it (like we did with Count.globally).
Is there a recommended way to do COUNT(DISTINCT) in dataflow/beam streaming pipeline with reasonable precision?
P.S. I'm using Java Dataflow SDK 1.9.0.
Your code looks OK; it shouldn't overcount. Note that you are placing each element into 30 windows, so if you have a window-unaware sink (equivalent to collapsing all the sliding windows) you would expect precisely 30 times as many elements. If you could show a bit more of the pipeline or how you are observing the counts, that might help.
Beyond that, I have a few suggestions for the pipeline:
I suggest changing your trigger for RemoveDuplicates to AfterPane.elementCountAtLeast(1); this will get you the same result at lower latency, since later elements arriving will have no impact. This trigger, and your current trigger, will never fire repeatedly. So it does not actually matter whether you set accumulatingFiredPanes() or discardingFiredPanes(). This is good, because neither one would work with the rest of your pipeline.
I'd install a new trigger prior to the Count. The reason is a bit technical, but I'll try to describe it:
In your current pipeline, the trigger installed there (the "continuation trigger" of the trigger for RemoveDuplicates) notes the arrival time of the first element and waits until it has received all elements that were produced at or before that processing time, as measured by the upstream worker. There is some nondeterminism because it conflates the local processing time with the processing time of other workers.
If you take my advice and switch the trigger for RemoveDuplicates, then the continuation trigger will be AfterPane.elementCountAtLeast(1) so it will always emit a count as soon as possible and then discard further data, which is very wrong.
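To make both suggestions concrete, here is a rough sketch of how the pipeline from the question might look with them applied (Dataflow SDK 1.x style, assuming the static Window.triggering(...) form works the same way as the Window.discardingFiredPanes() already used above). The trigger installed before the Count is only an assumed example; the right choice depends on how the counts are consumed downstream:

pipeline.apply(PubsubIO.Read.subscription("some subscription"))
    .apply(Window.<String>into(
                SlidingWindows.of(Duration.standardMinutes(10))
                              .every(Duration.standardSeconds(20)))
        // Suggestion 1: fire as soon as the first element of the pane arrives;
        // later duplicates cannot change the RemoveDuplicates result anyway.
        .triggering(AfterPane.elementCountAtLeast(1))
        .withAllowedLateness(Duration.ZERO)
        .discardingFiredPanes())
    .apply(RemoveDuplicates.create())
    // Suggestion 2: install an explicit trigger before the Count instead of
    // relying on the continuation trigger. Periodic processing-time firings
    // with accumulating panes are one possible (assumed) choice here.
    .apply(Window.<String>triggering(
                Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardSeconds(20))))
            .accumulatingFiredPanes())
    .apply(Count.<String>globally().withoutDefaults());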

Jenkins BUILD_NUMBER limit - maximum build number

Curious to know: what's the maximum number a Jenkins build number (BUILD_NUMBER) can reach? I tried to find this online but couldn't find the info.
Is it infinite, or does it have some limit (being INT or INT64 or some other type)?
PS:
I'm NOT looking for how to reset it back to #1 or #N using the following plugin, or how to set its value to a given name (using the Set Build Name plugin).
https://wiki.jenkins-ci.org/display/JENKINS/Next+Build+Number+Plugin
To find its limit, I used the Next Build Number plugin: when I set the next build number to 65,535, it still worked, getting me 65536 successfully, and I kept increasing this value up to 999999999 (nine 9s); it still worked, i.e. the next build ran as 1000000000 and produced a valid Jenkins build# for a few more runs/builds.
When I tried to set the next build number to 9999999999 (ten 9s), the Jenkins plugin failed with an error message (which shows that the next build number I set was not an INTEGER, i.e. not in the valid range):
A problem occurred while processing the request. Please check our bug tracker to see if a similar problem has already been reported. If it is already reported, please vote and put a comment on it to let us gauge the impact of the problem. If you think this is a new issue, please file a new issue. When you file an issue, make sure to add the entire stack trace, along with the version of Jenkins and relevant plugins. The users list might be also useful in understanding what has happened.
Stack trace
javax.servlet.ServletException: Build number must be an integer
at org.jvnet.hudson.plugins.nextbuildnumber.NextBuildNumberAction.doSubmit(NextBuildNumberAction.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
I found that Jenkins BUILD_NUMBER is a SIGNED integer data type. Signed int limits are:
int: 2 or 4 bytes, -32,768 to 32,767 or -2,147,483,648 to 2,147,483,647
Thus, the last value that a Jenkins build number can be set to or built is 2147483647 (the 4-byte case applies here, since Jenkins uses Java's 32-bit int). Setting anything above this number will generate the expected INT limit error.
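A quick way to see this limit from Java itself (Jenkins is written in Java, and the behaviour above matches the build number being kept in a 32-bit signed int):

public class BuildNumberLimit {
    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE); // 2147483647, the last possible build number
        System.out.println(Integer.MIN_VALUE); // -2147483648

        // 9999999999 does not fit into an int at all, so parsing it fails,
        // matching the "Build number must be an integer" error above
        // (assuming the plugin parses the submitted field as an int):
        // Integer.parseInt("9999999999"); // throws NumberFormatException
    }
}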
I noticed the following issue as well with the Set Next Build Number plugin:
If I set the next build number to anything between -2147483648 and 2147483647, the plugin accepts it in both cases, i.e. giving a negative or a positive number doesn't error out. BUT if you have already reached build# 2147483647 (in the job history), then even setting the build# to 1 did NOT generate the next build as 1, or 2, or 3, and so on.
As BUILD_NUMBER is SIGNED (-N to +N in the above range), setting for example -22 as the next build# using the plugin is accepted, but it didn't give me build# -21 or build# 21 in my case.
What's happening is: when you provide a negative value, or a value lower than the last/previous build#, the plugin takes that value (it's within the limit) but doesn't actually do anything, i.e. it doesn't give the expected BUILD#. Every time I hit Build Now and then clicked "Set Next Build Number" to check the value, I noticed that the number listed in the box was auto-decrementing by the number of times I had hit "Build Now" (this happens only when you have reached the limit), and the negative/lower value has no effect on the next build number (as per the plugin docs).
I tested this by creating a new test_job (discarded old builds by retaining only 1 build):
clicked on Build Now, got Build# 1.
Set next build number to 100.
Clicked Build Now, got Build# 100, Clicked Build now, Got build# 101.
Set Build number to 11 this time.
Clicked Build Now, but instead of giving me build# 12, Jenkins gave me build# 102. (PS: that's the intended behavior of this plugin as per its documentation; setting the next build number to a lower number N will not do anything.)

how to use startPhase in Mahout

I am running a RecommenderJob-based (org.apache.mahout.cf.taste.hadoop.item.RecommenderJob) job from Mahout 0.7 and noticed that there are options like startPhase and endPhase. I am guessing these are for running only portions of the pipeline, assuming you have the necessary input data from prior run(s). But I am having a hard time understanding what kinds of phases there are in RecommenderJob. I am in the middle of reading the source code, but it looks like it will take a while. In the meantime, I am wondering if anybody can shed light on how to use these options (startPhase in particular) with the RecommenderJob class?
Here is what I found:
phase 0 is about PreparePreferenceMatrixJob and it has 3 hadoop jobs:
PreparePreferenceMatrixJob-ItemIDIndexMapper-Reducer
PreparePreferenceMatrixJob-ToItemPrefsMapper-Reducer
PreparePreferenceMatrixJob-ToItemVectorsMapper-Reducer
phase 1 is about RowSimilarityJob and it has 3 jobs:
RowSimilarityJob-VectorNormMapper-Reducer
RowSimilarityJob-CooccurrencesMapper-Reducer
RowSimilarityJob-UnsymmetrifyMapper-Reducer
phase 2 is about RecommenderJob and it has 3 jobs:
RecommenderJob-SimilarityMatrixRowWrapperMapper-Reducer
RecommenderJob-UserVectorSplitterMapper-Reducer
RecommenderJob-Mapper-Reducer
phase 3 is the last one and it has only one job:
RecommenderJob-PartialMultiplyMapper-Reducer
Also, the output from phase 1 here in the RecommenderJob class is exactly the same as the output from phases 0 and 1 of ItemSimilarityJob (but the temp directory names are different).
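Putting this together, here is a hedged sketch of how startPhase/endPhase might be used to skip phase 0 (PreparePreferenceMatrixJob) and resume from phase 1, assuming its output from a previous run is still present in the same --tempDir. The paths and the similarity metric are illustrative only:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

public class ResumeRecommenderJob {
    public static void main(String[] args) throws Exception {
        ToolRunner.run(new Configuration(), new RecommenderJob(), new String[] {
            "--input", "/data/prefs.csv",          // illustrative path
            "--output", "/data/recommendations",   // illustrative path
            "--similarityClassname", "SIMILARITY_COOCCURRENCE",
            "--tempDir", "/data/temp",             // must already hold the phase-0 output
            "--startPhase", "1",                   // skip phase 0 (PreparePreferenceMatrixJob)
            "--endPhase", "3"                      // run through the final phase
        });
    }
}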
Yes, that's correct. It's a fairly crude mechanism. Really it controls which of a series of MapReduce jobs are run. You have to read the code to know what they are, yes. They vary by job.
If I'd done it over again I would have just made it detect the presence of output to know to skip the jobs. (That's what I've done in my next-gen recommender project.)

What's the point of basis path coverage?

The article at onjava seems to imply that basis path coverage is a sufficient substitute for full path coverage, due to some linear-independence/cyclomatic-complexity magic.
Using an example similar to the article:
public int returnInput(int x, boolean one, boolean two)
{
    int y = x;
    if (one)
    {
        y = x - 1;
    }
    if (two)
    {
        x = y;
    }
    return x;
}
with the basis set {FF,TF,FT}, the bug is not exposed. Only the untested TT path would expose it.
So, how is basis path coverage useful? It doesn't seem much better than branch coverage.
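To see this concretely, here is a small plain-Java check (with returnInput copied from above) showing that the basis set {FF,TF,FT} never executes the one path that returns the wrong value:

public class BasisSetMissesBug {
    static int returnInput(int x, boolean one, boolean two) {
        int y = x;
        if (one) { y = x - 1; }
        if (two) { x = y; }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(returnInput(5, false, false)); // 5 -- looks correct
        System.out.println(returnInput(5, true, false));  // 5 -- looks correct
        System.out.println(returnInput(5, false, true));  // 5 -- looks correct
        // The untested TT path is the only one that would expose the bug:
        System.out.println(returnInput(5, true, true));   // 4 -- wrong, but outside the basis set
    }
}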
[Disclaimer: I've never heard of this technique before, it just looks interesting so I've done a few searches and here's what I think I've found out. Hopefully someone who knows what they're talking about will contribute too...]
I think it's supposed to be a better way of generating branch coverage tests, not a complete substitute for path coverage. There's a far longer document here which restates the goals a bit: http://www.westfallteam.com/sites/default/files/papers/Basis_Path_Testing_Paper.pdf
The onjava article says "the goal of basis path testing is to test all decision outcomes independently of one another. Testing the four basis paths achieves this goal, making the other paths extraneous"
I think "extraneous" here means, "unnecessary to the goal of basis path testing", not as one might assume, "a complete waste of everyone's time".
I think the point of testing branches independently, is to break those accidental correlations between the paths which work, and the paths you test, that occur with terrifying frequency when I write both the code and an arbitrary set of branch coverage tests myself. There's no magic in the linear independence, it's just a systematic way of generating branch coverage, which discourages the tester from making the same assumptions as the programmer about correlation between branch choices.
So you're right, basis path testing misses your bug, and in general can miss bugs on the 2^(N-1)-N untested paths, where N is the cyclomatic complexity (for a chain of N-1 independent ifs, as here). It just aims not to miss the N paths most likely to be buggy, as letting the coder choose N paths to test typically does ;-)
Path coverage is no better than any other coverage metric; it is just a metric that shows how much of the "code" has been tested. The fact that you can achieve 100% branch coverage with the (TF,FT) set of test cases as well as with (TT,FF) means it is all down to luck if your exit criteria tell you to stop once 100% coverage is reached.
Coverage should not be the focus for the tester; finding bugs should be. A test case is just a way to demonstrate a bug, just as coverage is a proxy showing how much of this bug-demonstrating activity has been done. As with all other white-box methods, striving for maximum coverage at minimum cost requires actually understanding the code, so that you could even report a defect without a test case; the test case is just good for regression testing and as documentation of the defect.
For a tester, coverage is just a hint at how much has been done; only experience can really tell you how much is enough. As that is difficult to present in numerical values, we fall back on other methods, i.e. coverage statistics.
Not sure if this makes sense to you; judging by the date, you have probably long moved on since you published your question...
My recollection from McCabe's work on this exact subject is: you generate the basis paths systematically, changing one condition at a time, and only changing the last condition, until you can't change any new conditions.
Suppose we start with FF, which is the shortest path. Following the algorithm, we change the last if in the chain, yielding FT. We've covered the second if now, meaning: if there was a bug in the second if, surely our two tests were paying attention to what happened when the second if statement suddenly started executing, otherwise our tests aren't working or our code isn't verifiable. Both possibilities suggest our code needs reworking.
Having covered FT, we go back up one node in the path and change the first F to T. When building basis paths, we only change one condition at a time, so we are forced to leave the second if the same, yielding... TT!
We are left with these basis paths: {FF, FT, TT}. Which address the issue you raised.
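Reusing the returnInput helper from the earlier sketch, the systematically generated basis {FF, FT, TT} does hit the faulty path:

System.out.println(returnInput(5, false, false)); // 5
System.out.println(returnInput(5, false, true));  // 5
System.out.println(returnInput(5, true, true));   // 4 -- the bug is exposed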
But wait, you say, what if the bug occurs in the TF case?? The answer is: we should have already noticed it between two of the other three tests. Think about it:
The second if already had its chance to demonstrate its effect on the code, independently of any other changes to the execution of the program, through the FF and FT tests.
The first if had its chance to demonstrate its independent effect going from FT to TT.
We could have started with the TT case (the longest path). We would have arrived at slightly different basis paths, but they would still exercise each if statement independently.
Notice in your simple example, there is no co-linearity in the conditions of the if statements. Co-linearity cripples basis path generation.
In short: basis path testing, done systematically, avoids the problems you think it has. Basis path testing doesn't tell you how to write verifiable code. (TDD does that.) More to the point, path testing doesn't tell you which assertions you need to make. That's your job as the human.
Source: this is my research area, but I read McCabe's paper on this exact subject a few years back: http://mccabe.com/pdf/mccabe-nist235r.pdf
