Jenkins: avoid build due to commit message

Is it possible to cancel or skip a job in Jenkins based on special commit-message patterns? I thought the "Excluded Commit comments" option in the job configuration would do this for me out of the box, as mentioned here. But no matter which regular expression I write in this field, the build is performed anyway.
For example:
I want to run the build job only if the commit message includes the expression "release". So I write the regular expression [^(?:release)] in the Excluded Commit comments field. I thought that if I commit a revision with a message such as "test commit", the build job would not run, right? Is this the right approach when not using a post-commit hook?

The Jenkins Git plugin exposes the environment variable GIT_COMMIT, which contains the current Git commit hash.
Use [Jenkins Conditional Step] and add a step that executes the following bash script:
echo "==========================="
# Print only the commit message (-s suppresses the diff) and search it quietly.
if ! git show -s --format=%B "$GIT_COMMIT" | grep -q "your-pattern-here"; then
echo "pattern failed";
exit 1
else
echo "ok"
fi
echo "==========================="
Then configure the step so that the build fails if the shell script fails.

A late reply, but it may help someone in the future.
There is a plugin that skips the build depending on the Git commit message: just include [ci-skip] in the commit message and Jenkins will skip the build.
jenkins-ci-skip-plugin

TL;DR
To trigger builds only for commits whose message contains the word "release" (case-insensitive), set this in the "Excluded Commit comments" field in the job configuration:
(?i)(?s)(?!.*\brelease\b.*)^.*$
Better still, use a trigger phrase which is unlikely to be added to a commit message accidentally. For example, use "[ci build]":
(?i)(?s)(?!.*\[ci build\].*)^.*$
How does this work?
(?i) tells the regex engine to do a case-insensitive match. This is optional, but useful if you want to match "Release" and "RELEASE" as well as "release".
(?s) makes the dot match line ends (the "dotall" option), so that we look for matches within the entire commit message. By default the dot doesn't match line ends, so if there is no "release" keyword on one of the lines in the commit message, the pattern would match on that line, and the commit would be incorrectly ignored by Jenkins. With dotall, we look at the entire commit message, ignoring any line ends.
(?!.*\brelease\b.*) - negative look-ahead pattern. Any match is discarded if this pattern is found within it. In this pattern:
.* matches anything before our trigger phrase and after it. We need this because of the way java regex matching works (quote from the tutorial):
myString.matches("regex") returns true or false depending whether the string can be matched entirely by the regular expression. It is important to remember that String.matches() only returns true if the entire string can be matched. In other words: "regex" is applied as if you had written "^regex$" with start and end of string anchors. This is different from most other regex libraries, where the "quick match test" method returns true if the regex can be matched anywhere in the string. If myString is abc then myString.matches("bc") returns false. bc matches abc, but ^bc$ (which is really being used here) does not.
\b makes sure that there is a word boundary before and after the keyword, as you probably don't want to match "unreleased" etc.
^.*$ is the actual matching pattern we are looking for. Note that ^ and $ match start of the string and end of the string, not the start/end of lines within that string. This is default behavior for java regex, unless multi-line mode is enabled. In other words, this pattern matches the entire commit message, because dotall mode was enabled by (?s) and dot matches newlines.
So the matching algorithm matches the entire commit message, and then discards that match or keeps it depending on whether the negative look-ahead pattern is found anywhere in it.
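The behaviour described above can be checked outside Jenkins with a small Java sketch. It assumes, as this answer does, that the pattern is applied to the whole commit message with Matcher.matches() semantics; the class name ExcludedCommentDemo and the helper isExcluded are hypothetical and are not part of Jenkins or the Git plugin.

```java
import java.util.regex.Pattern;

final class ExcludedCommentDemo {
    // The exclusion pattern from the answer: it matches (and therefore
    // excludes) every message that does NOT contain the whole word "release".
    static final Pattern EXCLUDE =
            Pattern.compile("(?i)(?s)(?!.*\\brelease\\b.*)^.*$");

    // Hypothetical helper: true means the commit would be ignored.
    static boolean isExcluded(String commitMessage) {
        // matches() requires the whole message to match, like String.matches().
        return EXCLUDE.matcher(commitMessage).matches();
    }

    public static void main(String[] args) {
        System.out.println(isExcluded("test commit"));                 // true
        System.out.println(isExcluded("Release 1.2"));                 // false
        System.out.println(isExcluded("fix bug\n\nprepare release"));  // false
        System.out.println(isExcluded("unreleased change"));           // true
    }
}
```

The last case shows the effect of \b: "unreleased" does not count as the trigger word, so that commit is still excluded.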
Why didn't your expression work?
There were two problems with your suggested regular expression. First, you used incorrect regex syntax for excluding a pattern. Second, you didn't say what your pattern should include, only what it should exclude. Therefore it would never match anything even if you used correct syntax. And because it doesn't match anything, nothing is excluded from triggering jobs, i.e. any commit would trigger a build.
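Both problems can be seen in a short Java sketch, again assuming whole-string matching as with String.matches(). The class name BrokenPatternDemo and its constant names are made up for this illustration. Inside square brackets the characters (, ?, : and ) are ordinary literals, so the pattern from the question is a negated character class that matches exactly one character.

```java
final class BrokenPatternDemo {
    // The pattern from the question: one character that is none of
    // ( ? : r e l a s )
    static final String QUESTION_PATTERN = "[^(?:release)]";

    // A syntactically valid negative look-ahead, but with nothing after it
    // to consume the commit message.
    static final String LOOKAHEAD_ONLY = "(?!.*release)";

    public static void main(String[] args) {
        // A realistic message is longer than one character, so no match.
        System.out.println("test commit".matches(QUESTION_PATTERN)); // false
        System.out.println("x".matches(QUESTION_PATTERN));           // true
        System.out.println("r".matches(QUESTION_PATTERN));           // false

        // The look-ahead alone consumes nothing, so only "" can match fully.
        System.out.println("test commit".matches(LOOKAHEAD_ONLY));   // false
        System.out.println("".matches(LOOKAHEAD_ONLY));              // true
    }
}
```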
References
If you need more information, look at the java.util.regex package, which Jenkins uses for matching. I used this online Java regex tester to test my expressions. I've also found a nice tutorial, where I learned about (?m), (?s) and (?i).

Related

End of line lex

I am writing an interpreter for assembly using lex and yacc. The problem is that I need to parse a word that will strictly be at the end of the file. I've read that there is an anchor $, which can help. However, it doesn't work as I expected. I wrote this in my lex file:
ABC$ {printf("QWERTY\n");}
The input file is:
ABC
without spaces or any other invisible symbols. So I expect the output to be QWERTY, however what I get is:
ABC
which I guess means that the program couldn't parse it. Then I thought, that $ might be a regular symbol in lex, so I changed the input file into this:
ABC$
So, if $ isn't a special symbol, then it will be parsed as a normal symbol, and the output will be QWERTY. This doesn't happen, the output is:
ABC$
The question is whether $ in lex is a normal symbol or special one.
In (f)lex, $ matches zero characters followed by a newline character.
That's different from many regex libraries where $ will match at the end of input. So if your file does not have a newline at the end, as your question indicates (assuming you consider newline to be an invisible character), it won't be matched.
As @sepp2k suggests in a comment, the pattern also won't be matched if the input file happens to use Windows line endings (which consist of the sequence \r\n), unless the generated flex file was compiled for Windows. So if you created the file on Windows and run the flex-generated scanner in a Unix environment, the \r will also cause the pattern to fail to match. In that case, you can use (f)lex's trailing context operator:
ABC/\r?\n { puts("Matched ABC at the end of a line"); }
See the flex documentation for patterns for a full description of the trailing context operator. (Search for "trailing context" on that page; it's roughly halfway down.) $ is exactly equivalent to /\n.
That still won't match ABC at the very end of the file. Matching strings at the very end of the file is a bit tricky, but it can be done with two patterns if it's ok to recognise the string other than at the end of the file, triggering a different action:
ABC/. { /* Do nothing. This ABC is not at the end of a line or the file */ }
ABC { puts("ABC recognised at the end of a line or of the file"); }
That works because the first pattern will match as long as there is some non-newline character following ABC. (. matches any character other than a newline. See the above link for details.) If you also need to work with Windows line endings, you'll need to modify the trailing context in the first pattern.
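For contrast with the "many regex libraries" mentioned above, here is a small Java sketch of how $ behaves in java.util.regex, next to look-ahead patterns that only roughly imitate flex's trailing context. The class and constant names are hypothetical, and a look-ahead is an analogy to ABC/\n rather than a reimplementation of flex's matching.

```java
import java.util.regex.Pattern;

final class DollarAnchorDemo {
    // In Java, $ also matches at the very end of the input.
    static final Pattern JAVA_DOLLAR = Pattern.compile("ABC$");

    // Rough analogue of flex's ABC$ (i.e. ABC/\n): a newline must follow.
    static final Pattern FLEX_LIKE_DOLLAR = Pattern.compile("ABC(?=\\n)");

    // Rough analogue of ABC/\r?\n, which also accepts Windows line endings.
    static final Pattern FLEX_LIKE_CRLF = Pattern.compile("ABC(?=\\r?\\n)");

    public static void main(String[] args) {
        System.out.println(JAVA_DOLLAR.matcher("ABC").find());          // true
        System.out.println(FLEX_LIKE_DOLLAR.matcher("ABC").find());     // false
        System.out.println(FLEX_LIKE_DOLLAR.matcher("ABC\n").find());   // true
        System.out.println(FLEX_LIKE_DOLLAR.matcher("ABC\r\n").find()); // false
        System.out.println(FLEX_LIKE_CRLF.matcher("ABC\r\n").find());   // true
    }
}
```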

whitespace in flex patterns leads to "unrecognized rule"

The flex info manual allows whitespace in regular expressions using the "x" modifier in the (?r-s:pattern) form. It specifically offers a simple example (without whitespace):
(?:foo) same as (foo)
but the following program fails to compile with the error "unrecognized rule":
BAD (?:foo)
%%
{BAD} {}
I cannot find any form of (? that is acceptable as a rule pattern. Is the manual in error, or do I misunderstand?
The example in your question does not seem to reflect the question itself, since it shows neither the use of whitespace nor an x flag. So I'm going to assume that the pattern which is failing for you is something like
BAD (?x:two | lines |
of | words)
%%
{BAD} { }
And, indeed, that will not work. Although you can use extended format in a pattern, you can only use it in a definition if it doesn't contain a newline. The definition terminates at the last non-whitespace character on the definition line.
Anyway, definitions are overused. You could write the above as
%%
(?x:two | lines |
of | words ) { }
Which saves anyone reading your code from having to search for a definition.
I do understand that you might want to use a very long pattern in a rule, which is awkward, particularly if you want to use it twice. Regardless of the issue with newlines, this tends to run into problems with Flex's definition length limit (2047 characters). My approach has been to break the very long pattern into a series of definitions, and then define another symbol which concatenates the pieces.
Before v2.6, Flex did not chop whitespace off the end of the definition line, which also leads to mysterious "unrecognized rule" errors. The manual seems to still reflect the v2.5 behaviour:
The definition is taken to begin at the first non-whitespace character following the name and continuing to the end of the line.

flex default rule can be matched

I am working on a flex parser using flex 2.6.4 with the -s option specified. A particular start condition has the following patterns (I am trying to read everything up to the next unescaped newline):
\\(.|\n)
[^\\\n]+
\n
Yet I get the warning: "-s option given but default rule can be matched"
I don't see any holes in the above pattern set. Am I missing something, or is this a flex error?
Your set of rules does not match a backslash at the end of the file.
Your first rule requires the backslash to be followed by something and the other ones don't match backslashes at all.
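The gap can be reproduced with a small Java sketch that approximates the three rules with java.util.regex and reports the first offset where none of them applies. The class name RuleCoverageDemo and the helper firstUnmatched are hypothetical, and (?s:.) stands in for flex's (.|\n). This only imitates the rule set and does not model flex's longest-match logic, which does not matter here because the three rules start with disjoint characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleCoverageDemo {
    // Java approximations of the three rules:
    //   \\(.|\n)    [^\\\n]+    \n
    static final Pattern RULES =
            Pattern.compile("\\\\(?s:.)|[^\\\\\\n]+|\\n");

    // Returns the offset of the first character that no rule can match
    // (where flex would fall back to its default rule), or -1 if none.
    static int firstUnmatched(String input) {
        Matcher m = RULES.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                return pos;
            }
            pos = m.end(); // every alternative consumes at least one character
        }
        return -1;
    }

    public static void main(String[] args) {
        // a, backslash, newline, b, newline: fully covered
        System.out.println(firstUnmatched("a\\\nb\n")); // -1
        // abc followed by a lone backslash at end of input: not covered
        System.out.println(firstUnmatched("abc\\"));    // 3
    }
}
```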

Remove "[string]" from BUILD_LOG_REGEX extracted lines

Here is my sample string.
[echo] The SampleProject solution currently has 85% code coverage.
My desired output should be.
The SampleProject solution currently has 85% code coverage.
By the way, I need this because I'm processing the logs in my CI setup, which uses Jenkins.
Any help? Thanks.
You can try the substText parameter of the BUILD_LOG_REGEX token to substitute the text matching your regex:
New optional arg: ${BUILD_LOG_REGEX, regex, linesBefore, linesAfter, maxMatches, showTruncatedLines, substText} which allows substituting text for the matched regex. This is particularly useful when the text contains references to capture groups (i.e. $1, $2, etc.)
Using the token below will remove the [echo] prefix from all of the matched log lines:
${BUILD_LOG_REGEX, regex="^\[echo] (.*)$", maxMatches=0, showTruncatedLines=false, substText="$1"}
\[[^\]]*\] will match the bit you want to remove. Just use a string replace function to replace that bit with an empty string.
Andrew has the right idea, but with Perl-style regex syntaxes (which includes Java's built-in regex engine), you can do even better:
str.replaceAll("\\[.*?\\]", "");
(i.e., use the matching expression \[.*?\]. The ? specifies minimal match: so it will finish matching upon the first ] found.)
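A minimal Java sketch of the replaceAll suggestion, applied to the sample line from the question. The class name StripEchoDemo and both helper names are hypothetical, and the second helper is a variation rather than something proposed in the answers: it removes only a leading bracketed tag together with the whitespace around it, because the plain replacement leaves the space that followed "[echo]".

```java
final class StripEchoDemo {
    // The expression from the answer: removes every [...] group,
    // stopping at the first closing bracket thanks to the lazy .*?
    static String stripBracketed(String line) {
        return line.replaceAll("\\[.*?\\]", "");
    }

    // Variation (an assumption about what is wanted): drop only a leading
    // tag and the surrounding whitespace.
    static String stripLeadingTag(String line) {
        return line.replaceFirst("^\\s*\\[[^\\]]*\\]\\s*", "");
    }

    public static void main(String[] args) {
        String line =
            "[echo] The SampleProject solution currently has 85% code coverage.";
        // Note the leading space left behind by the first variant.
        System.out.println("'" + stripBracketed(line) + "'");
        System.out.println("'" + stripLeadingTag(line) + "'");
    }
}
```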

In the PowerShell grammar, what is the `lvalueExpression` rule saying?

I was reviewing the PowerShell grammar posted here: http://www.manning.com/payette/AppCexcerpt.pdf
(I don't think it has been updated since PowerShell v1, and there are some typos. So, it's clearly not the true PowerShell Grammar, but a human-oriented document.)
In section C.2.1, it says:
<lvalueExpression> = <lvalue> [? |? <lvalue>]*
What is the meaning of the question marks? I can't tell if it means "match any character" or "match a question mark" or it's a typo.
I'm not sure what inputs this is intended to match, but maybe it's this:
$a,$b = 1, 2
in which case maybe the question mark is supposed to be a comma?
Based on its use in the preceding rule (<assignmentStatementRule> = <lvalueExpression> <AssignmentOperatorToken> <pipelineRule>), it appears that lvalueExpression in Appendix C of Windows PowerShell in Action corresponds to expression in section B.2.3 of The PowerShell Language Specification that Joey linked to. Matching it further than this is difficult, but I'll add some speculation anyway :)
The ? characters in [? |? <lvalue>]* are very likely erroneous. If it had been used to represent "the previous token is optional", then:
the [ and | tokens it was applied to should have been quoted
only [ makes sense as part of a value expression, but indexing is already covered later by the propertyOrArrayReferenceOperator rule
? is not used anywhere else in the grammar, but {0|1} is used multiple times to indicate "can appear zero or one times"
Given its similarity to [ '|' <cmdletCall> ]* at the end of the first rule in the section, it may have been a copy-and-paste error, compounded by a ‘smart quote’ round-trip encoding error. Assuming this was copied with the intent of editing later, then ?|? may have become '.' to represent multiple property accesses (but again, this is covered by the propertyOrArrayReferenceOperator rule).
However, based on the statement at the end of section C.2.1 that "[the pipeline rule] also handles parsing assignment expressions", lvalueExpression was probably intended to list all the assignable expressions besides simpleLvalue (e.g. cast-expression for [int]$x = 1, array-literal-expression for $a,$b,$c = 1,2,3, etc.).

Resources