I'm trying to convert my large multi-config Jenkins job over to pipeline syntax so I can, among other things, split it across multiple nodes and combine my multiple stages into one job. Here's the part where I'm seeing trouble:
def build_test_configs = [:]
def compilers = ['gnu', 'icc']
def configs = ['debug', 'default', 'opt']
for (int i = 0; i < configs.size(); i++) {
for (int j = 0; j < compilers.size(); j++) {
def node_name = ""
if ("${compilers[j]}" == "gnu") {
node_name = "node001"
} else {
node_name = "node002"
}
build_test_configs["${node_name} ${configs[i]}"] = {
node ("${node_name}") {
stage("Build Test ${node_name} ${compilers[j]} ${configs[i]}") {
unstash "${node_name}-tarball"
sh "$HOME/software/jenkins_scripts/nightly.sh ${configs[i]} ${compilers[j]} yes $WORKSPACE"
}
}
}
}
}
parallel build_test_configs
My problem is that ${compilers[j] and $configs[i] are undefined when I get to the part where I'm trying to build up a dictionary of build_test_configs on line 13. It would appear that the check on line 8 is working just fine.
Update
I don't have an error message per se. The script doesn't produce any runtime errors. The unexpected output is that the names of the stages are:
Build Test node001 null null
Build Test node001 null null
Build Test node002 null null
And the nightly.sh script is getting passed null parameters as well.
I think this is the expected behavior: Jenkins Pipeline scripts are written in Groovy but what is actually executed is a transformation of that (the term they use is "continuation-passing style transformation"). For example, some parts will run on the master, some on the slave nodes.
This involves a lot of magic that flies way above my head, but at our level it means we have to work with constraints in the syntax & constructs we use.
See the "fundamentals" paragraph of this article:
To understand Pipeline behavior you must understand a few points about how it executes.
Except for the steps themselves, all of the Pipeline logic, the
Groovy conditionals, loops, etc execute on the master. Whether
simple or complex! Even inside a node block!
Steps may use executors to do work where appropriate, but each step has a small on-master overhead too.
Pipeline code is written as Groovy but the execution
model is radically transformed at compile-time to Continuation
Passing Style (CPS).
This transformation provides valuable safety
and durability guarantees for Pipelines, but it comes with
trade-offs: Steps can invoke Java and execute fast and efficiently,
but Groovy is much slower to run than normal. Groovy logic requires
far more memory, because an object-based syntax/block tree is kept
in memory.
Pipelines persist the program and its state frequently to
be able to survive failure of the master.
Also see JENKINS-41335 discussing support of variables across the script.
Edit: ah, yes, as pointed in the comments, the new declarative model allows to define an environment with variables that would be passed the way you need... Don't know how to do that in scripted pipeline without JENKINS-41335 but it seems further evolutions will now happen in declarative land :/
Related
I am developing software for an embedded device. The steps involved in building and verifying it all are complicated: creating the build environment (via containers), building the actual SD card image, running unit tests, automated tests on target hardware, license compliance checks and so on - details aren't important here.
Currently I have this in one long declarative Jenkinsfile as a multibranch-pipeline (for all intents and purpose here, we're doing gitflow). In doing this I've hit a limit on the size of a Jenkinsfile (https://issues.jenkins.io/browse/JENKINS-37984) and can't actually get all the stages in that I want to.
It's too big so i need to cut this massive pipeline up. I broke this all up in little pipeline jobs with parameters to pass data/context between each part of the pipeline and came up with something like this:
I've colour-coded the A and B artifacts as they're used a lot and the lines would make things messy. What this tries to show is an order of running things, where things in a column depend on artifacts created in column to the left.
I'm struggling to discover how to do the "waiting" for multiple upstream jobs (for instance in Job Foxtrot in the diagram) before starting another downstream job that depends on them.
I specifically do not want to turn each column in the diagram into a parallel group of things, because for instance Job Delta might take 2 minutes but Job Charlie take 20 minutes. The exact duration of each job is variable and unpredictable as for some parameter combinations will mean building from scratch and others will cause an existing artifact to be output.
I think I need something like the join plugin (https://plugins.jenkins.io/join/), but for pipeline jobs (join only works on freestyle jobs and is quite aged).
The one approach I've explored is to have a "controller" job (maybe job Alpha in the diagram?) that uses the build step (https://www.jenkins.io/doc/pipeline/steps/pipeline-build-step/_) with the wait parameter set to false to trigger the downstream jobs in correct order, with the correct parameters. It would involve searching Jenkins.instance.getItems() to locate the Runs for the downstream projects, which have an upstream cause that matches the currently executing "controller" job. This involves polling waiting for the job to appear and then polling for the job to complete. This feels like I'm "doing it wrong". Below is the source for this polling approach - be gentle, i'm new to groovy!
Is this polling approach a good way? What problems could I encounter with this approach? Should I be using the ItemListener Jenkins ExtensionPoint and writing a plugin to do this sort of thing in a generic way? Is there another way I've not found?
I feel like I'm not "holding it right" when it comes to the overall pipeline design/architecture here.
Finally after writing this I notice that Jobs India, Juliet and Kilo could be collapsed into a single Job, but I don't think that solve much.
#NonCPS
Integer getTriggeredBuildNumber(String project, String causeJobName, Integer causeBuildNumber) {
//find the job/project first
def job = Jenkins.instance.getAllItems(org.jenkinsci.plugins.workflow.job.WorkflowJob.class).find { job -> job.getFullName() == project }
//find a build for this job that was caused by the current build
def build = job.getBuilds().find { build ->
build.getCauses().findAll{ it.class == hudson.model.Cause.UpstreamCause.class }.find { cause ->
cause.getUpstreamProject() == causeJobName && cause.getUpstreamBuild() == causeBuildNumber
} != null
}
if(build != null) {
return build.getNumber()
} else {
return -1
}
}
#NonCPS
Boolean isBuildComplete(String jobName, Integer buildNumber) {
def job = Jenkins.instance.getAllItems(org.jenkinsci.plugins.workflow.job.WorkflowJob.class).find { job -> job.getFullName() == jobName }
if(job) {
def build = job.getBuildByNumber(buildNumber)
return build.isBuilding() == false && build.getResult() != null
} else {
println "WARNING: job '" + jobName + "' not found."
return false
}
}
We've hit the "Code too large" too many times, but the way to cope with it is to refactor your pipeline to remain deep under the limit. The following may be used:
You can run a combination of scripted and declarative pipeline. So some stages in the beginning and/or in the end may be refactored out.
You can build some of the parallel stages dynamically. This code would not be counted towards the limited code size.
Lastly, the issue mentions transformation variables, and that can help too.
We used the combination of the above and have expanded our pipeline well beyond what it was when we first encountered the issue you're facing.
I'm trying to figure out what options do I have,
when trying to build a good pipeline for CICD for a monorepo,
I'm trying to have something like this (this is only a pseudo pipeline)
and not really what I'm using ATM in my monorepo (or what I will have).
Explanation:
Pre: understand what I should build, test, etc..
Build dynamically a parallel step which will give me the later explained capabilities.
Foo: run the parallel and comfortably wait:)
This is the only way I thought of getting this features:
* Build process among the P’s can be shared and I can generate some waitUntil statements
to make this works, I guess...
* Every P’s is independent from the other, if one Ut of P2 fails f.e, it doesn't affect the other progress
of the pipeline, or if I want, it's only a failFast configuration
* Every step within the way is again not related to the progress of other P’s,
so when Ut finishes in any of the P's it starts immediately it's St.
(thought this might changed according to some configuration I'll probably need)
The main problems with that is:
1. I'm losing the control the Restart single steps (since I can only restart Top level steps)
2. It requires me to do a lot more with Scripted Pipeline, which looks like the support of BlueOcean
(which is kind of critical to me), is questionable...
seems that BlueOcean is more supported within the scope of the Declarative Pipeline.
NOTE: It probably looks like I can split every P’s to a another jenkins job
but, this will require me to wait a lot of time in checkout workspace+preparation of the monorepo,
and like I said the "build" step may have shared between the P’s and it's more efficient to do this like that
I will appreciate every feedback or any suggestion:)
There's no problem whatsoever with doing what you want with a Declarative pipeline, since stage can have a stages child. So:
pipeline {
stages {
stage("Pre") { }
stage("Foo") {
parallel {
stage ("P1") {
stages {
stage("P1-Build") {}
stage("P1-Ut") {}
stage("P1-St") {}
}
}
stage ("P2") {
stages {
stage("P2-Build") {}
stage("P2-Ut") {}
}
}
// etc..
Stages P1..P4 will run in parallel but within each their Build-unittest-test stages will run sequentially.
You won't be able to restart separate stages but it's not a good feature anyway.
I have a Jenkinsfile multibranch pipeline script, which runs on two different Jenkins systems. Jenkinsfile relies on a specific label name. In one of the systems, the label based agent is available and in another not (intentionally). In the former it runs fine. In the Jenkins system without the matching label, the job just hangs because it cant find a matching agent.
Is there a way to specify an option to abort (or not start) a build if a label is not found?
Some discussion here:
https://issues.jenkins-ci.org/browse/JENKINS-35905
Might not be possible anytime soon
If they are calling in to a shared library then you can check for label being online/available and then fail the build
def computers = Jenkins.instance.computers
for(computer in computers){
if(computer.isOnline()){
labelStr = computer.node.getLabelString()
}
if labelStr ~= /user input/
break;
}
System.exit(1); // no label
For a declarative pipeline it may be possible to use when{beforeAgent} to test whether a label exists.
This would only be useful where the agent is specified for a stage rather than the whole pipeline.
...and caveat that this is an as yet untested hypothesis.
Just a workaround, but in order to avoid dependency on shared lib I run the below every X minutes to clean-up culprits from queue:
import hudson.model.*
def q = Jenkins.instance.queue
q.items.each {
if (it =~ /someregex or match all/) {
why = it.getWhy()
if (why =~ /.*There are no nodes with the label.*/) {
println "No node found for $it.task.runId. It's stuck in damn jenkins queue forever and ever. Killing it"
q.cancel(it.task)
}
}
}
The Jenkins Pipeline plugin (aka Workflow) can be extended with other Multibranch plugins to build branches and pull requests automatically.
What would be the preferred way to run multiple configurations? For example, building with Java 7 and Java 8. This is often called matrix configuration (because of the multiple combinations such as language version, framework version, ...) or build variants.
I tried:
executing them serially as separate stage steps. Good, but takes more time than necessary.
executing them inside a parallel step, with or without nodes allocated inside them. Works but I cannot use the stage step inside parallel for known limitations on how it would be visualized.
Is there a recommended way to do this?
TLDR: Jenkins.io wants you to use nodes for each build.
Jenkins.io: In pipeline coding contexts, a "node" is a step that does two things, typically by enlisting help from available executors on agents:
Schedules the steps contained within it to run by adding them to the Jenkins build queue (so that as soon as an executor slot is free on a node, the appropriate steps run)
It is a best practice to do all material work, such as building or running shell scripts, within nodes, because node blocks in a stage tell Jenkins that the steps within them are resource-intensive enough to be scheduled, request help from the agent pool, and lock a workspace only as long as they need it.
Vanilla Jenkins Node blocks within a stage would look like:
stage 'build' {
node('java7-build'){ ... }
node('java8-build'){ ... }
}
Further extending this notion Cloudbees writes about parallelism and distributed builds with Jenkins. Cloudbees workflow for you might look like:
stage 'build' {
parallel 'java7-build':{
node('mvn-java7'){ ... }
}, 'java8-build':{
node('mvn-java8'){ ... }
}
}
Your requirements of visualizing the different builds in the pipeline would could be satisfied with either workflow, but I trust the Jenkins documentation for best practice.
EDIT
To address the visualization #Stephen would like to see, He's right - it doesn't work! The issue has been raised with Jenkins and is documented here, the resolution of involving the use of 'labelled blocks' is still in progress :-(
Q: Is there documentation letting pipeline users not to put stages inside of parallel steps?
A: No, and this is considered to be an incorrect usage if it is done; stages are only valid as top-level constructs in the pipeline, which is why the notion of labelled blocks as a separate construct has come to be ... And by that, I mean remove stages from parallel steps within my pipeline.
If you try to use a stage in a parallel job, you're going to have a bad time.
ERROR: The ‘stage’ step must not be used inside a ‘parallel’ block.
I would suggest Declarative Matrix as a preferred way to run multiple configurations in Jenkins. It allows you to execute the defined stages for every configuration without code duplication.
Example:
pipeline {
agent none
stages {
stage('Test') {
matrix {
agent {
label "${NODENAME}"
}
axes {
axis {
name 'NODENAME'
values 'java7node', 'java8node'
}
}
stages {
stage('Test') {
steps {
echo "Do Test for ${NODENAME}"
}
}
}
}
}
}
}
Note that declarative Matrix is a native declarative Pipeline feature, so no additional Plugin installation needed.
Jenkins blog post about the matrix directive.
As noted by #StephenKing, Blue Ocean will show parallel branches better than the current stage view. A planned upcoming version of the stage view will be able to show all the branches, though it will not visually indicate any nesting structure (would look the same as if you ran the configurations serially).
In any event, the deeper issue is that you will essentially only get a pass/fail status for the build overall, pending a resolution to JENKINS-27395 and related requests.
In order to test each commit on several platforms, I've used this base Jenkinsfile skeleton:
def test_platform(label, with_stages = false)
{
node(label)
{
// Checkout
if (with_stages) stage label + ' Checkout'
...
// Build
if (with_stages) stage label + ' Build'
...
// Tests
if (with_stages) stage label + ' Tests'
...
}
}
/*
parallel ( failFast: false,
Windows: { test_platform("Windows") },
Linux: { test_platform("Linux") },
Mac: { test_platform("Mac") },
)
*/
test_platform("Windows", true)
test_platform("Mac", true)
test_platform("Linux", true)
With this it's relatively easy to switch from a sequential to a parallel execution, each of them having their pros and cons:
Parallel execution runs much faster, but it doesn't contain the stages labelling
Sequential execution is much slower, but you get a detailed report thanks to stages, labelled as "Windows Checkout", "Windows Build", "Windows Tests", "Mac Checkout", etc.)
I'm using the sequential execution for the time being, until I find a better solution.
It seems like there is relief coming at least with the BlueOcean UI. Here is what I got (the tk-* nodes are the parallel steps):
When you set up a Jenkins job various test result plugins will show regressions if the latest build is worse than the previous one.
We have many jobs for many projects on our Jenkins and we wanted to avoid having a 'job per branch' set up. So currently we are using a parameterized build to build eg different development branches using a single job.
But that means when I build a new branch any regressions are measured against the previous build, which may be for a different branch. What I really want is to measure regressions in a feature branch against the latest build of the master branch.
I thought we should probably set up a separate 'master' build alongside the parameterized 'branches' build. But I still can't see how I would compare results between jobs. Is there any plugin that can help?
UPDATE
I have started experimenting in the Script Console to see if I could write a post-build script... I have managed to get the latest build of master branch in my parameterized job... I can't work out how to get to the test results from the build object though.
The data I need is available in JSON at
http://<jenkins server>/job/<job name>/<build number>/testReport/api/json?pretty=true
...if I could just get at this data structure it would be great!
I tried using JsonSlurper to load the json via HTTP but I get 403, I guess because my script has no auth session.
I guess I could load the xml test results from disk and parse them in my script, it just seems a bit stupid when Jenkins has already done this.
I eventually managed to achieve everything I wanted, using a Groovy script in the Groovy Postbuild Plugin
I did a lot of exploring using the script console http://<jenkins>/script and also the Jenkins API class docs are handy.
Everyone's use is going to be a bit different as you have to dig down into the build plugins to get the info you need, but here's some bits of my code which may help.
First get the build you want:
def getProject(projectName) {
// in a postbuild action use `manager.hudson`
// in the script web console use `Jenkins.instance`
def project = manager.hudson.getItemByFullName(projectName)
if (!project) {
throw new RuntimeException("Project not found: $projectName")
}
project
}
// CloudBees folder plugin is supported, you can use natural paths:
project = getProject('MyFolder/TestJob')
build = project.getLastCompletedBuild()
The main test results (jUnit etc) seem to be available directly on the build as:
result = build.getTestResultAction()
// eg
failedTestNames = result.getFailedTests().collect{ test ->
test.getFullName()
}
To get the more specialised results from eg Violations plugin or Cobertura code coverage you have to look for a specific build action.
// have a look what's available:
build.getActions()
You'll see a list of stuff like:
[hudson.plugins.git.GitTagAction#2b4b8a1c,
hudson.scm.SCMRevisionState$None#40d6dce2,
hudson.tasks.junit.TestResultAction#39c99826,
jenkins.plugins.show_build_parameters.ShowParametersBuildAction#4291d1a5]
These are instances, the part in front of the # sign is the class name so I used that to make this method for getting a specific action:
def final VIOLATIONS_ACTION = hudson.plugins.violations.ViolationsBuildAction
def final COVERAGE_ACTION = hudson.plugins.cobertura.CoberturaBuildAction
def getAction(build, actionCls) {
def action = build.getActions().findResult { act ->
actionCls.isInstance(act) ? act : null
}
if (!action) {
throw new RuntimeException("Action not found in ${build.getFullDisplayName()}: ${actionCls.getSimpleName()}")
}
action
}
violations = getAction(build, VIOLATIONS_ACTION)
// you have to explore a bit more to find what you're interested in:
pylint_count = violations?.getReport()?.getViolations()?."pylint"
coverage = getAction(build, COVERAGE_ACTION)?.getResults()
// if you println it looks like a map but it's really an Enum of Ratio objects
// convert to something nicer to work with:
coverage_map = coverage.collectEntries { key, val -> [key.name(), val.getPercentageFloat()] }
With these building blocks I was able to put together a post-build script which compared the results for two 'unrelated' build jobs, then using the Groovy Postbuild plugin's helper methods to set the build status.
Hope this helps someone else.