What is the most straightforward way to restrict pipeline stages to a specific shared resource? - jenkins

We have an existing Jenkins install that is testing firmware running on an embedded target. The multi-stage pipeline looks something like: Checkout -> Build -> Download -> Smoke tests -> Unit tests. This is working great, except it takes 9 hours to run the pipeline. To speed things up and also to test different target variants we have added 3 more targets to the system (UUT#1, #2, and so on).
My question is: what is the most straightforward way to allow the parallelization to happen while also restricting the suites to UUTs with specific properties? For example, our Unit tests contain about 10 different suites (suite1, suite2, and so on), and what I’d like to do is spread those out amongst the 4 UUTs (thus having 4 suites running at a time) but restrict the execution this way:
Suite1 can only run on a UUT that has ‘USB’
Suite2 can only run on a UUT that has ‘LCD-display’
Suite3 can run anywhere
.. and so on, then my UUTs might have properties like:
UUT#1 ‘USB LCD-display’
UUT#2 ‘Ethernet’
UUT#3 ‘RS-232 USB’
Etc.
Reading about agents, it seems that a label on an agent may allow this, but agents seem to carry a lot of overhead and I’m not sure if they’re appropriate.
Long-time Jenkins user, but this is the first time I’ve ever attempted anything this complicated and pipelines are a new concept for me.

A straightforward way is to use the Lockable Resources plugin.
This can be used as a step as well as a stage option (undocumented). The latter comes in handy if you have nested stages which all depend on the resource to be locked.
Stage option in declarative pipeline
pipeline {
    agent any
    stages {
        stage('Test') {
            options {
                // Lock a single resource from all resources labeled 'mylabel'
                lock( label: 'mylabel',
                      quantity: 1,
                      variable: 'MyResourceName' )
            }
            steps { // or 'parallel' or 'stages'
                echo "Locked resource $MyResourceName"
                sleep 10
                echo "Resource will be unlocked after this stage"
            }
        }
    }
}
Step in scripted pipeline
node {
    stage('Test') {
        lock( label: 'mylabel',
              quantity: 1,
              variable: 'MyResourceName' ) {
            echo "Locked resource $MyResourceName"
            sleep 10
            echo "Resource will be unlocked after this stage"
        }
    }
}
Caveats
If lock is used as a step in declarative pipeline, you may get an error:
Missing required parameter: "resource"
This seems to be a little bug in argument checking. According to the documentation, you only need to specify either the resource or the label parameter. As a workaround, simply pass null as the value of the resource parameter.
If parameter quantity is not specified, all resources that match the given label will be locked.
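Applied to the question, the idea could be sketched roughly as follows. This is only a sketch under assumptions: each UUT is defined as a lockable resource whose labels are its properties (for example 'USB LCD-display', which I understand the plugin treats as separate space-separated labels), every UUT additionally carries a common label 'UUT' for suites that can run anywhere, and the suite-to-label map and the variable name UUT_NAME are made up for illustration.

```groovy
// Hypothetical mapping: suite name -> label a UUT must carry to run it.
// 'UUT' is an assumed label given to every target, i.e. "can run anywhere".
def suiteLabels = [
    suite1: 'USB',
    suite2: 'LCD-display',
    suite3: 'UUT'
]

// Build one parallel branch per suite; each branch waits for a matching UUT.
def branches = [:]
suiteLabels.each { suite, label ->
    branches[suite] = {
        // resource: null works around the "Missing required parameter" check
        lock(resource: null, label: label, quantity: 1, variable: 'UUT_NAME') {
            echo "Running ${suite} on ${env.UUT_NAME}"
            // the real test invocation for this suite would go here
        }
    }
}

node {
    stage('Unit tests') {
        parallel branches
    }
}
```

With four UUTs defined, at most four branches hold a lock at any time; the remaining suites queue until a UUT with a suitable label is released.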

Related

Jenkins: coordinating multiple pipelines

I am developing software for an embedded device. The steps involved in building and verifying it all are complicated: creating the build environment (via containers), building the actual SD card image, running unit tests, automated tests on target hardware, license compliance checks and so on - details aren't important here.
Currently I have this in one long declarative Jenkinsfile as a multibranch-pipeline (for all intents and purpose here, we're doing gitflow). In doing this I've hit a limit on the size of a Jenkinsfile (https://issues.jenkins.io/browse/JENKINS-37984) and can't actually get all the stages in that I want to.
It's too big, so I need to cut this massive pipeline up. I broke it all up into little pipeline jobs with parameters to pass data/context between each part of the pipeline and came up with something like this:
I've colour-coded the A and B artifacts as they're used a lot and the lines would make things messy. What this tries to show is an order of running things, where things in a column depend on artifacts created in column to the left.
I'm struggling to discover how to do the "waiting" for multiple upstream jobs (for instance in Job Foxtrot in the diagram) before starting another downstream job that depends on them.
I specifically do not want to turn each column in the diagram into a parallel group of things, because for instance Job Delta might take 2 minutes but Job Charlie might take 20 minutes. The exact duration of each job is variable and unpredictable, as some parameter combinations mean building from scratch while others cause an existing artifact to be output.
I think I need something like the join plugin (https://plugins.jenkins.io/join/), but for pipeline jobs (join only works on freestyle jobs and is quite aged).
The one approach I've explored is to have a "controller" job (maybe job Alpha in the diagram?) that uses the build step (https://www.jenkins.io/doc/pipeline/steps/pipeline-build-step/_) with the wait parameter set to false to trigger the downstream jobs in the correct order, with the correct parameters. It would involve searching Jenkins.instance.getItems() to locate the Runs for the downstream projects which have an upstream cause that matches the currently executing "controller" job. This involves polling while waiting for the job to appear and then polling for the job to complete. This feels like I'm "doing it wrong". Below is the source for this polling approach - be gentle, I'm new to Groovy!
Is this polling approach a good way? What problems could I encounter with this approach? Should I be using the ItemListener Jenkins ExtensionPoint and writing a plugin to do this sort of thing in a generic way? Is there another way I've not found?
I feel like I'm not "holding it right" when it comes to the overall pipeline design/architecture here.
Finally, after writing this I notice that Jobs India, Juliet and Kilo could be collapsed into a single Job, but I don't think that solves much.
@NonCPS
Integer getTriggeredBuildNumber(String project, String causeJobName, Integer causeBuildNumber) {
    // Find the job/project first
    def job = Jenkins.instance.getAllItems(org.jenkinsci.plugins.workflow.job.WorkflowJob.class).find { it.getFullName() == project }
    if (job == null) {
        return -1
    }
    // Find a build of this job that was caused by the current build
    def build = job.getBuilds().find { candidate ->
        candidate.getCauses().findAll { it.class == hudson.model.Cause.UpstreamCause.class }.find { cause ->
            cause.getUpstreamProject() == causeJobName && cause.getUpstreamBuild() == causeBuildNumber
        } != null
    }
    // -1 signals that no matching build exists (yet)
    if (build != null) {
        return build.getNumber()
    } else {
        return -1
    }
}

@NonCPS
Boolean isBuildComplete(String jobName, Integer buildNumber) {
    def job = Jenkins.instance.getAllItems(org.jenkinsci.plugins.workflow.job.WorkflowJob.class).find { it.getFullName() == jobName }
    if (job) {
        def build = job.getBuildByNumber(buildNumber)
        // A build that cannot be found is treated as "not complete"
        return build != null && build.isBuilding() == false && build.getResult() != null
    } else {
        println "WARNING: job '" + jobName + "' not found."
        return false
    }
}
We've hit the "Code too large" error many times, and the way to cope with it is to refactor your pipeline so that it stays well under the limit. The following may be used:
You can run a combination of scripted and declarative pipeline, so some stages at the beginning and/or at the end may be refactored out.
You can build some of the parallel stages dynamically. This code would not be counted towards the limited code size.
Lastly, the issue mentions transformation variables, and that can help too.
We used the combination of the above and have expanded our pipeline well beyond what it was when we first encountered the issue you're facing.
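As an illustration of building parallel stages dynamically, here is a minimal sketch. The component names and the makeBranches helper are hypothetical, and whether this avoids the size limit in a given setup is something to verify rather than assume.

```groovy
// Hypothetical list of work items; in practice it could be computed at run time.
def components = ['alpha', 'bravo', 'charlie']

// Helper defined outside the pipeline block: returns a map of
// branch name -> closure, which is the shape the parallel step expects.
def makeBranches(List names) {
    def branches = [:]
    for (name in names) {
        def component = name  // capture the per-iteration value for the closure
        branches[component] = {
            stage("Build ${component}") {
                echo "Building ${component}"
            }
        }
    }
    return branches
}

pipeline {
    agent any
    stages {
        stage('Dynamic builds') {
            steps {
                script {
                    parallel makeBranches(components)
                }
            }
        }
    }
}
```

The declarative part stays small because the per-component stages are generated by ordinary Groovy code instead of being written out one by one.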

CI for a monorepo with Jenkins and BlueOcean

I'm trying to figure out what options I have
when trying to build a good CI/CD pipeline for a monorepo.
I'm trying to have something like this (this is only a pseudo-pipeline
and not really what I'm using at the moment in my monorepo, or what I will have).
Explanation:
Pre: understand what I should build, test, etc.
Dynamically build a parallel step which will give me the capabilities explained below.
Foo: run the parallel step and comfortably wait :)
This is the only way I could think of to get these features:
* The build process can be shared among the P’s, and I can generate some waitUntil statements
to make this work, I guess...
* Every P is independent of the others: if, for example, the Ut of P2 fails, it doesn't affect the progress of the rest
of the pipeline, or, if I want it to, that's only a failFast configuration.
* Every step along the way is again unrelated to the progress of the other P’s,
so when Ut finishes in any of the P’s, that P immediately starts its St
(though this might change according to some configuration I'll probably need).
The main problems with that are:
1. I lose the ability to restart single steps (since I can only restart top-level steps).
2. It requires me to do a lot more with Scripted Pipeline, for which BlueOcean support
(which is kind of critical to me) looks questionable...
It seems that BlueOcean is better supported within the scope of Declarative Pipeline.
NOTE: It probably looks like I could split every P into a separate Jenkins job,
but this would require me to wait a long time for the workspace checkout and preparation of the monorepo,
and like I said, the "build" step may be shared between the P’s, so it's more efficient to do it this way.
I will appreciate any feedback or suggestions :)
There's no problem whatsoever with doing what you want with a Declarative pipeline, since stage can have a stages child. So:
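pipeline {
    agent any
    stages {
        stage("Pre") {
            steps { echo "Pre" } // placeholder step
        }
        stage("Foo") {
            parallel {
                stage("P1") {
                    stages {
                        stage("P1-Build") { steps { echo "P1-Build" } }
                        stage("P1-Ut") { steps { echo "P1-Ut" } }
                        stage("P1-St") { steps { echo "P1-St" } }
                    }
                }
                stage("P2") {
                    stages {
                        stage("P2-Build") { steps { echo "P2-Build" } }
                        stage("P2-Ut") { steps { echo "P2-Ut" } }
                    }
                }
                // etc..
            }
        }
    }
}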
Stages P1..P4 will run in parallel, but within each of them the Build, Ut and St stages will run sequentially.
You won't be able to restart the nested stages individually, but it's not a good feature anyway.
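The failFast behaviour and the "work out what to build" part mentioned in the question could be sketched in the same declarative style. This is a rough sketch, not a tested setup: the directory globs p1/** and p2/** are hypothetical, and the changeset condition depends on the changelog Jenkins has collected for the build, so it may skip everything on a first build.

```groovy
pipeline {
    agent any
    stages {
        stage('Foo') {
            // Abort the remaining parallel branches as soon as one of them fails.
            // Remove this line to let every P run to completion independently.
            failFast true
            parallel {
                stage('P1') {
                    // Only run this branch when files under p1/ changed (assumed layout)
                    when { changeset 'p1/**' }
                    stages {
                        stage('P1-Build') { steps { echo 'build P1' } }
                        stage('P1-Ut') { steps { echo 'unit-test P1' } }
                    }
                }
                stage('P2') {
                    when { changeset 'p2/**' }
                    stages {
                        stage('P2-Build') { steps { echo 'build P2' } }
                        stage('P2-Ut') { steps { echo 'unit-test P2' } }
                    }
                }
            }
        }
    }
}
```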

How to throttle an entire pipeline in Jenkins

I'm new to Jenkins pipelines and am trying to understand how I can throttle an "entire" pipeline, which basically means that the following will take place:
1) I will be able to run the same pipeline up to a maximum number of concurrent runs, say MAX_CONCURRENT_RUNS = 2.
2) Each run (essentially a build) can have its own parameters, with the "extra requirement" that two (or more) different builds CAN be sent the same parameters (if required).
3) If at a particular point in time there are already MAX_CONCURRENT_RUNS builds (runs) of the pipeline, then run number MAX_CONCURRENT_RUNS + 1 will "hold" until the first of the currently running builds terminates, and only then start to execute.
I have looked at this SO question and also this SO question, but neither of them is "exactly" applicable to my situation (requirements).
I'm using Jenkins server version 2.176.1
After some research, mainly in these two links:
The throttle plugin official GitHub page and the JENKINS-45140 issue (where some of the comments were very useful), I have composed this solution:
1) The first thing is to install the required plugin, which can be found in the Manage Jenkins --> Manage Plugins "search tab" by typing throttle-concurrents (the official plugin page can be found here).
2) A "simple" throttle category needs to be added to the "Throttle Concurrent Builds" plugin settings within Jenkins' global configuration. This can be done by going to Manage Jenkins --> Configure system. There, under the "Throttle Concurrent Builds" section, the "new" category needs to be added. In the example below, I have set the name of the category to simpleThrottleCatagory, with the following parameters:
This way, the pipeline is able to run several builds at the same time, with an "upper limit" on how many builds, which is essentially MAX_CONCURRENT_RUNS (in this case 2).
3) In this example I will keep the pipeline "itself" implementation "as simple as possible" in order to focus on the "throttling" considerations and not the "common pipeline stuff".
3.1) The "simple concurrent pipeline" will simply receive two parameters from the user:
Number of seconds to sleep: NumSecToSleep.
Some sample choice parameter named BocaOrRiver with two possible values: boca or river.
3.2) The entire pipeline implementation in this case is as follows. Note that some extra script "approvals" need to take place so that the Calendar.getInstance().getTime().format('yyyy/MM/dd-HH:mm:ss', TimeZone.getTimeZone('CST')) call will work. If you are unable to make these changes, replace the two lines that make this call with any other implementation that gets the current time stamp:
// Do NOT place within the pipeline block
properties([ [ $class: 'ThrottleJobProperty',
               categories: ['simpleThrottleCatagory'],
               limitOneJobWithMatchingParams: false,
               maxConcurrentPerNode: 2,
               maxConcurrentTotal: 2,
               paramsToUseForLimit: '',
               throttleEnabled: true,
               throttleOption: 'category' ] ])
pipeline {
    agent any
    parameters {
        string(name: "NumSecToSleep", description: "Number of seconds to sleep in the Sleep stage")
        choice(name: "BocaOrRiver", choices: "boca\nriver", description: "Which Team in Buenos Aires do you prefer?")
    }
    stages {
        stage("First stage") {
            steps {
                echo "WORKSPACE is:${WORKSPACE}"
                echo "Build number is:${env.BUILD_NUMBER}"
            }
        }
        stage("Sleep stage") {
            steps {
                script {
                    def time = params.NumSecToSleep
                    echo "Sleeping for ${params.NumSecToSleep} seconds"
                    // 'yyyy' is the calendar year ('YYYY' would be the week-based year); 'HH' is the 24-hour clock
                    def timeStamp = Calendar.getInstance().getTime().format('yyyy/MM/dd-HH:mm:ss', TimeZone.getTimeZone('CST'))
                    println("Before sleeping current time is:" + timeStamp)
                    sleep time.toInteger() // seconds
                    timeStamp = Calendar.getInstance().getTime().format('yyyy/MM/dd-HH:mm:ss', TimeZone.getTimeZone('CST'))
                    println("After sleeping current time is:" + timeStamp)
                    echo "Done sleeping for ${params.NumSecToSleep} seconds"
                }
            }
        }
    }
}
3.3) NOTES:
3.3.1) The code within the actual pipeline block is essentially straightforward. It displays some "build specific" parameters to confirm that each build of the job gets its own user-defined parameters, and it sleeps for some number of seconds so that two (in this case) or more builds can indeed run concurrently and we can see "with our own eyes" (at run time) that the two builds run together (in parallel).
3.3.2) The more interesting part of the pipeline is the properties block (at the top):
3.3.2.1) Note that it needs to be defined OUTSIDE of the pipeline block section.
3.3.2.2) I think that most of the settings defined within this properties block are very "self explanatory" YET the two that should be mentioned are:
$class: 'ThrottleJobProperty': This is a "predefined" value of Jenkins to indicate that this "job" (can be also pipeline) can be throttled.
categories: ['simpleThrottleCatagory']: This is the "global throttle category" defined in the previous step.
4) Basic illustration:
In the figure below there is a screenshot of a situation where three builds were started one after the other, with "enough" time to sleep in each of them so that the first two (builds 17 & 18, marked as points 2 & 3 respectively) won't "finish too soon", so that the "third" build (build 19) indeed "has to wait" for an available executor (marked as point 4):
5) Here I have described a very simple and minimal yet (IMO) representative implementation, along with the "global configuration", of an "entire" concurrent pipeline. Of course this topic can be discussed MUCH further; for example, it is also possible to throttle only a single step within a pipeline.
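For the single-step case, the plugin also offers a throttle step for Scripted Pipeline that wraps a node block. The following is a minimal sketch assuming the same simpleThrottleCatagory category from the global configuration; the stage names and the sleep duration are made up for illustration.

```groovy
// Scripted Pipeline sketch: only the wrapped node block counts against the category limit.
stage('Unthrottled setup') {
    echo 'Any number of builds may run this part at the same time'
}

stage('Throttled work') {
    // Category name as defined under "Throttle Concurrent Builds" in the global configuration
    throttle(['simpleThrottleCatagory']) {
        node {
            echo "Build ${env.BUILD_NUMBER} is inside the throttled block"
            sleep 30 // seconds; stands in for the real work
        }
    }
}
```

Here builds beyond the category limit would wait at the node allocation inside the throttle block rather than before the build starts.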

Jenkins is re-using a pipeline workspace and I wish for each build to have a unique workspace

So, most of the questions and answers I've found on this subject are for people who want to use the SAME workspace for different runs. (Which baffles me, but then I require a clean slate each time I start a job. Leftover stuff will only break things.)
My issue is the EXACT opposite - I MUST have a separate workspace for each run (or I need to know how to create files with the same name in different runs that stay with that run only, and which are easily reachable from bash scripts started by the pipeline!)
So, my question is - how do I either force Jenkins to NOT use the same workspace for two concurrently-running jobs on different hosts, OR what variable can I use in the 'custom workspace' field to accomplish this?
After I responded to the question by @Joerg S I realized that the thing Joerg S says CAN'T happen is EXACTLY what I'm observing! Jenkins is using the SAME workspace for 2 different, concurrent jobs on 2 different hosts. Is this a Jenkins pipeline bug?
See below for a bewildering amount of information.
Given the way I have to go onto and off of nodes during the run, I've found that I can start 2 different builds on different hosts of the same job, and they SHARE the workspace dir! Since each job has shell scripts which are busy writing files into that directory, this is extremely bad.
In Custom workspace in jenkins we are told to use custom workspace, and I'm set up just like that
In Jenkins: how to run builds in unique directories we are told to use ${BUILD_NUMBER} in the above custom workspace field, so what I tried was:
${JENKINS_HOME}/workspace/${ITEM_FULLNAME}/${BUILD_NUMBER}
All that happens to me when I use that is that the workspace name is, you guessed it, "${BUILD_NUMBER}" (and I even got a "${BUILD_NUMBER}@2" just for good measure!)
I tried {$BUILD_ID}, same thing (uses that literally, does not substitute the number).
I have the 'allow concurrent builds' turned on.
I'm using pipelines exclusively.
All jobs here, as part of normal execution, cause the slave, non-master host to reboot into an OS that does not have the capability to run slave.jar (indeed, it has no network access at all), so I cannot run the entire pipeline on that host.
All jobs use the following construct somewhere inside them:
tests=Arrays.asList(tests.split("\\r?\n"))
shellerror=231
for( line in tests){
So let's call an example job 'foo' that loops through a list, as above, that I want to run on 2 different hosts. The pipeline for that job starts running on master (since the above for (line in tests) is REQUIRED to run on a node!). It then goes back and forth between master and slave, often multiple times.
If I start this job on host A and host B at about the same time, they will BOTH use the workspace ${JENKINS_HOME}/workspace/${JOB_NAME}, or in my case /var/lib/jenkins/jenkins/workspace/job
Since they write different data to files with the same name in that directory, I'm clearly totally broken immediately.
So, how do I force Jenkins to use a unique workspace EVERY SINGLE JOB?
Or, what???
Other things: pipeline build step version 2.5.1, Jenkins 2.46.2
I've been trying to get the workspace statement ('ws') to work, but that doesn't quite work as I expected either - some files are in the workspace I explicitly name, and some are still in the 'built-in' workspace (workspace/).
I was asked to provide code. The 'standard' pipeline I use is about 26K bytes, composing about 590 lines. So, I'm going to GREATLY reduce. That being said:
node("master") { // 1
    ..... lots of stuff....
} // this matches the "node('master')" above
node(HOST) {
    echo "on $HOST, check what os"
    if (isUnix())
    ...some more stuff...
} // end of 'node(HOST)' above
if (isok == 0 ) {
    node("master") {
        echo "----------------- Running on MASTER 19 $shellerror waiting on boot out of windows ------------"
        sleep 120
        echo "----------------- Leaving MASTER ------------"
    }
}
... lots 'o code ...
node(HOST) {
    ... etc
} // matches the latest 'node HOST' above
node("master") { // 120
    .... code ...
    for( line in tests) {
        ...code...
    }
}
... and on and on and on, switching back and forth from one to the other
FWIW, when I tried to make the above use 'ws' so that I could make certain the ws name was unique, I simply added a 'ws wsname' block directly under (almost) every 'node' opening so it was
node(name) { ws (wsname) { ..stuff that was in node block before... } }
But then I've got two directories to worry about checking - both the 'default' workspace/jobname dir AND the new wsname one.
Try using customWorkspace node common option:
pipeline {
    agent {
        node {
            label 'node(s)-defined-label'
            customWorkspace "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
        }
    }
    stages {
        // Your pipeline logic here
    }
}
customWorkspace
A string. Run the Pipeline or individual stage this
agent is applied to within this custom workspace, rather than the
default. It can be either a relative path, in which case the custom
workspace will be under the workspace root on the node, or an absolute
path.
Edit
Since this doesn't work for your complex pipeline, maybe try this silly solution:
def WORKSPACE = "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
node(HOST) {
    // Each sh step starts a fresh shell, so mkdir, cd and the actual work must share one script
    sh(script: """
        mkdir -p ${WORKSPACE}
        cd ${WORKSPACE}
        # Do stuff here
    """)
}
or if dir() is accessible:
def WORKSPACE = "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
node(HOST) {
    sh(script: "mkdir -p ${WORKSPACE}")
    dir(WORKSPACE) {
        //Do stuff here
    }
}
customWorkspace didn't work for me.
What worked:
stages {
    stage("SCM (For commit trigger)") {
        steps {
            ws('custom-workspace') { // Because we don't want to switch from the pipeline checkout
                // Generated from http://lstool01:8080/job/Permanent%20Build/pipeline-syntax/
                checkout(xxx)
            }
        }
    }
'${SOMEVAR}'
will not get substituted, whereas
"${SOMEVAR}"
will. This is how Groovy handles strings: only double-quoted strings are interpolated
(see Groovy string handling).
So if you have a
ws("/some/path/somewhere/${BUILD_ID}") {
    //something
}
on your node in your pipeline Jenkinsfile, it should do the trick in this regard.
The problem with "@2" workspaces can occur when you allow concurrent builds of the project. I had the exact same problem, with a custom ws() getting an @2 suffix. Simply disallow concurrent builds or work around that.
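For a pipeline that hops between nodes many times, the ws() wrapping could be factored into a small helper so that every node block lands in a build-specific directory. This is only a sketch: the onNode helper and the HOST value are hypothetical, and the directory naming (the default workspace path plus the build number) is just one possible convention.

```groovy
// Hypothetical agent label; in the question this comes from the job itself.
def HOST = 'target-host-label'

// Hypothetical helper: allocate a node, then switch to a workspace unique to this build.
def onNode(String label, Closure body) {
    node(label) {
        // pwd() is the workspace the node step allocated; append the build number to it
        ws("${pwd()}_build${env.BUILD_NUMBER}") {
            body()
        }
    }
}

onNode('master') {
    echo "Working in ${pwd()}"
}

onNode(HOST) {
    echo "On ${HOST}, working in ${pwd()}"
}
```

The default workspace is still allocated by each node step, but all files written by the body end up in the per-build directory, so two concurrent builds no longer share files.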

Matrix configuration with Jenkins pipelines

The Jenkins Pipeline plugin (aka Workflow) can be extended with other Multibranch plugins to build branches and pull requests automatically.
What would be the preferred way to run multiple configurations? For example, building with Java 7 and Java 8. This is often called matrix configuration (because of the multiple combinations such as language version, framework version, ...) or build variants.
I tried:
executing them serially as separate stage steps. Good, but takes more time than necessary.
executing them inside a parallel step, with or without nodes allocated inside them. Works but I cannot use the stage step inside parallel for known limitations on how it would be visualized.
Is there a recommended way to do this?
TLDR: Jenkins.io wants you to use nodes for each build.
Jenkins.io: In pipeline coding contexts, a "node" is a step that does two things, typically by enlisting help from available executors on agents:
Schedules the steps contained within it to run by adding them to the Jenkins build queue (so that as soon as an executor slot is free on a node, the appropriate steps run)
It is a best practice to do all material work, such as building or running shell scripts, within nodes, because node blocks in a stage tell Jenkins that the steps within them are resource-intensive enough to be scheduled, request help from the agent pool, and lock a workspace only as long as they need it.
Vanilla Jenkins node blocks within a stage would look like:
stage('build') {
    node('java7-build') { ... }
    node('java8-build') { ... }
}
Extending this notion further, CloudBees writes about parallelism and distributed builds with Jenkins. A CloudBees-style workflow for you might look like:
stage('build') {
    parallel 'java7-build': {
        node('mvn-java7') { ... }
    }, 'java8-build': {
        node('mvn-java8') { ... }
    }
}
Your requirement of visualizing the different builds in the pipeline could be satisfied with either workflow, but I trust the Jenkins documentation for best practice.
EDIT
To address the visualization @Stephen would like to see: he's right, it doesn't work! The issue has been raised with Jenkins and is documented here; the resolution, which involves the use of 'labelled blocks', is still in progress :-(
Q: Is there documentation telling pipeline users not to put stages inside of parallel steps?
A: No, and this is considered to be an incorrect usage if it is done; stages are only valid as top-level constructs in the pipeline, which is why the notion of labelled blocks as a separate construct has come to be ... And by that, I mean remove stages from parallel steps within my pipeline.
If you try to use a stage in a parallel job, you're going to have a bad time.
ERROR: The ‘stage’ step must not be used inside a ‘parallel’ block.
I would suggest Declarative Matrix as a preferred way to run multiple configurations in Jenkins. It allows you to execute the defined stages for every configuration without code duplication.
Example:
pipeline {
    agent none
    stages {
        stage('Test') {
            matrix {
                agent {
                    label "${NODENAME}"
                }
                axes {
                    axis {
                        name 'NODENAME'
                        values 'java7node', 'java8node'
                    }
                }
                stages {
                    stage('Test') {
                        steps {
                            echo "Do Test for ${NODENAME}"
                        }
                    }
                }
            }
        }
    }
}
Note that declarative Matrix is a native Declarative Pipeline feature, so no additional plugin installation is needed.
Jenkins blog post about the matrix directive.
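For combinations of more than one dimension (language version, platform, and so on), the matrix directive accepts several axes plus an excludes block. The sketch below uses hypothetical axis names and values, and it assumes agents labelled 'linux' and 'windows' exist.

```groovy
pipeline {
    agent none
    stages {
        stage('BuildAndTest') {
            matrix {
                axes {
                    axis {
                        name 'JAVA_VERSION'
                        values '7', '8'
                    }
                    axis {
                        name 'PLATFORM'
                        values 'linux', 'windows'
                    }
                }
                // Skip one combination: Java 7 on windows (purely illustrative)
                excludes {
                    exclude {
                        axis {
                            name 'JAVA_VERSION'
                            values '7'
                        }
                        axis {
                            name 'PLATFORM'
                            values 'windows'
                        }
                    }
                }
                // Assumes agent labels match the PLATFORM values
                agent { label "${PLATFORM}" }
                stages {
                    stage('Build') {
                        steps { echo "Build with Java ${JAVA_VERSION} on ${PLATFORM}" }
                    }
                    stage('Test') {
                        steps { echo "Test with Java ${JAVA_VERSION} on ${PLATFORM}" }
                    }
                }
            }
        }
    }
}
```

With the exclusion this yields three cells instead of four, each running its Build and Test stages on its own agent.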
As noted by @StephenKing, Blue Ocean will show parallel branches better than the current stage view. A planned upcoming version of the stage view will be able to show all the branches, though it will not visually indicate any nesting structure (would look the same as if you ran the configurations serially).
In any event, the deeper issue is that you will essentially only get a pass/fail status for the build overall, pending a resolution to JENKINS-27395 and related requests.
In order to test each commit on several platforms, I've used this base Jenkinsfile skeleton:
def test_platform(label, with_stages = false)
{
    node(label)
    {
        // Checkout
        if (with_stages) stage label + ' Checkout'
        ...
        // Build
        if (with_stages) stage label + ' Build'
        ...
        // Tests
        if (with_stages) stage label + ' Tests'
        ...
    }
}
/*
parallel ( failFast: false,
    Windows: { test_platform("Windows") },
    Linux: { test_platform("Linux") },
    Mac: { test_platform("Mac") },
)
*/
test_platform("Windows", true)
test_platform("Mac", true)
test_platform("Linux", true)
With this it's relatively easy to switch from sequential to parallel execution, each of them having its pros and cons:
Parallel execution runs much faster, but it doesn't contain the stage labelling.
Sequential execution is much slower, but you get a detailed report thanks to stages, labelled as "Windows Checkout", "Windows Build", "Windows Tests", "Mac Checkout", etc.
I'm using the sequential execution for the time being, until I find a better solution.
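On Jenkins versions that support the block-scoped form of the stage step, the same skeleton can, as far as I know, keep its stage labels inside parallel branches. The sketch below assumes such a version and agents labelled Windows, Linux and Mac; the testPlatform name and the echo steps are placeholders.

```groovy
// Variant of the skeleton above using block-scoped stages inside each parallel branch.
def testPlatform(String label) {
    node(label) {
        stage("${label} Checkout") {
            echo "Checkout on ${label}"   // placeholder for the real checkout
        }
        stage("${label} Build") {
            echo "Build on ${label}"      // placeholder for the real build
        }
        stage("${label} Tests") {
            echo "Tests on ${label}"      // placeholder for the real tests
        }
    }
}

parallel(
    failFast: false,
    Windows: { testPlatform('Windows') },
    Linux: { testPlatform('Linux') },
    Mac: { testPlatform('Mac') }
)
```

How the nested stages are displayed still depends on the visualization in use, so this addresses the labelling rather than the rendering.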
It seems like there is relief coming at least with the BlueOcean UI. Here is what I got (the tk-* nodes are the parallel steps):

Resources