Jenkins Pipeline "node inside stage" vs "stage inside node"

As both node step and stage step provide scoped {} syntax, what is the best practice for defining their topology inside groovy code?
Exhibit A
node ("NodeName") {
stage ("a stage inside node"){
// do stuff here
}
}
Exhibit B
stage ("a stage holding a node") {
node ("NodeName"){
// do stuff here
}
}

This depends on your actual needs.
As long as you can run your complete pipeline on a single node, I would wrap the stages in a node so that the pipeline is not blocked by busy executors.
As soon as you use the parallel step, then you don't really have a choice besides having stage around node allocations.
There are (at least for me) no issues around mixing that, i.e., have the first 2-3 stages executed on the same node and then one stage that executes on multiple nodes within parallel.
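A minimal scripted sketch of that mix, assuming a job with an SCM configured (so `checkout scm` works) and agents labelled 'linux' and 'windows'; the labels, the stash name and the make commands are all hypothetical placeholders:

```groovy
// First stages share one node (and therefore one workspace).
node('NodeName') {
    stage('Checkout') {
        checkout scm
    }
    stage('Build') {
        sh 'make build'                              // hypothetical build command
        stash name: 'sources', includes: '**'        // hand files over to the other nodes
    }
}

// Last stage fans out: the stage wraps the node allocations.
stage('Test') {
    parallel(
        linux: {
            node('linux') {                          // hypothetical label
                unstash 'sources'
                sh 'make test'                       // hypothetical test command
            }
        },
        windows: {
            node('windows') {                        // hypothetical label
                unstash 'sources'
                bat 'make test'                      // hypothetical test command
            }
        }
    )
}
```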

With node { stage { ... } } each stage will share the same working folder and all the files from the previous stage will be there for the next stage.
With stage { node { ... } } you need to stash/unstash files between each stage. If you have a large repository, and especially if you have a large folder of dependencies like node_modules, this repeated stash/unstash could end up being a significant part, or even the majority, of your build time.
IMO I would generally start with the first syntax, node { stage { ... } } as preferred. If you have individual build stages that take time and can benefit from parallelism, then switching to stage { node { ... } } might be better, as long as the time gained in parallelization is not lost in stashing.
Update:
I tested the exact effect of swapping the nesting on one of our builds. With a bunch of stages inside a node, the total build time is just over one minute. With a node inside each stage, total build time is almost five minutes. Big difference.
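To make the stash/unstash cost concrete, here is a rough sketch of what the stage { node { ... } } layout forces you to write. The stash name, the 'build/**' pattern and the make commands are assumptions for illustration only:

```groovy
stage('Build') {
    node('NodeName') {
        checkout scm
        sh 'make build'                                    // hypothetical build command
        // Files must be saved explicitly, because the next node block
        // may get a different agent or a different workspace.
        stash name: 'build-output', includes: 'build/**'   // hypothetical pattern
    }
}
stage('Test') {
    node('NodeName') {
        // Copies the stashed files into this workspace; with large trees
        // (node_modules and the like) this is where the time goes.
        unstash 'build-output'
        sh 'make test'                                     // hypothetical test command
    }
}
```

With node { stage { ... } } both stash and unstash disappear, since the second stage simply sees the files the first one left behind.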

Related

CI for a monorepo with Jenkins and BlueOcean

I'm trying to figure out what options do I have,
when trying to build a good pipeline for CICD for a monorepo,
I'm trying to have something like this (this is only a pseudo pipeline)
and not really what I'm using ATM in my monorepo (or what I will have).
Explanation:
Pre: understand what I should build, test, etc..
Build dynamically a parallel step which will give me the later explained capabilities.
Foo: run the parallel and comfortably wait:)
This is the only way I could think of to get these features:
* Build process among the P’s can be shared and I can generate some waitUntil statements
to make this works, I guess...
* Every P’s is independent from the other, if one Ut of P2 fails f.e, it doesn't affect the other progress
of the pipeline, or if I want, it's only a failFast configuration
* Every step within the way is again not related to the progress of other P’s,
so when Ut finishes in any of the P's it starts immediately it's St.
(thought this might changed according to some configuration I'll probably need)
The main problems with that are:
1. I lose the ability to restart single steps (since I can only restart top-level stages)
2. It requires me to do a lot more with Scripted Pipeline, and BlueOcean support for that
(which is kind of critical to me) looks questionable...
BlueOcean seems to be better supported within the scope of Declarative Pipeline.
NOTE: It probably looks like I can split every P’s to a another jenkins job
but, this will require me to wait a lot of time in checkout workspace+preparation of the monorepo,
and like I said the "build" step may have shared between the P’s and it's more efficient to do this like that
I will appreciate every feedback or any suggestion:)

There's no problem whatsoever with doing what you want with a Declarative pipeline, since stage can have a stages child. So:
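pipeline {
    agent any
    stages {
        // The echo steps are placeholders: a declarative stage needs
        // a steps, stages, parallel or matrix section to be valid.
        stage("Pre") { steps { echo "Pre" } }
        stage("Foo") {
            parallel {
                stage("P1") {
                    stages {
                        stage("P1-Build") { steps { echo "P1-Build" } }
                        stage("P1-Ut") { steps { echo "P1-Ut" } }
                        stage("P1-St") { steps { echo "P1-St" } }
                    }
                }
                stage("P2") {
                    stages {
                        stage("P2-Build") { steps { echo "P2-Build" } }
                        stage("P2-Ut") { steps { echo "P2-Ut" } }
                    }
                }
                // etc..
            }
        }
    }
}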
Stages P1..P4 will run in parallel but within each their Build-unittest-test stages will run sequentially.
You won't be able to restart separate stages but it's not a good feature anyway.

What is the most straightforward way to restrict pipeline stages to a specific shared resource?

We have an existing Jenkins install that is testing firmware running on an embedded target. The multi-stage pipeline looks something like: Checkout -> Build -> Download -> Smoke tests -> Unit tests. This is working great, except it takes 9 hours to run the pipeline. To speed things up and also to test different target variants we have added 3 more targets to the system (UUT#1, #2, and so on).
My question is, what is the most straightforward way to allow the parallelization to happen while also restricting the suites to UUTs with specific properties? For example, our Unit tests contain about 10 different suites (suite1, suite2 and so on), and what I’d like to do is spread those out amongst the 4 UUTs (thus having 4 suites running at a time) but restrict the execution this way:
Suite1 can only run on a UUT that has ‘USB’
Suite2 can only run on a UUT that has ‘LCD-display’
Suite3 can run anywhere
.. and so on, then my UUTs might have properties like:
UUT#1 ‘USB LCD-display’
UUT#2 ‘Ethernet’
UUT#3 ‘RS-232 USB’
Etc.
Reading about agents, it seems that a label on an agent may allow this, but agents seem to carry a lot of overhead and I’m not sure if they’re appropriate.
Long-time Jenkins user, but this is the first time I’ve ever attempted anything this complicated and pipelines are a new concept for me.

A straightforward way is to use the Lockable Resources plugin.
This can be used as a step as well as a stage option (undocumented). The latter comes in handy if you have nested stages which all depend on the resource to be locked.
Stage option in declarative pipeline
pipeline {
agent any
stages {
stage('Test') {
options {
// Lock a single resource from all resources labeled 'mylabel'
lock( label: 'mylabel',
quantity: 1,
variable: 'MyResourceName' )
}
steps { // or 'parallel' or 'stages'
echo "Locked resource $MyResourceName"
sleep 10
echo "Resource will be unlocked after this stage"
}
}
}
}
Step in scripted pipeline
node {
stage('Test') {
lock( label: 'mylabel',
quantity: 1,
variable: 'MyResourceName' ) {
echo "Locked resource $MyResourceName"
sleep 10
echo "Resource will be unlocked after this stage"
}
}
}
Caveats
If lock is used as a step in declarative pipeline, you may get an error:
Missing required parameter: "resource"
This seems to be a little bug in argument checking. According to the documentation, you only need to specify either resource or label parameter. Simply pass null as the value for this parameter.
If parameter quantity is not specified, all resources that match the given label will be locked.

Jenkins is re-using a pipeline workspace and I wish for each build to have a unique workspace

So, most of the questions and answers I've found on this subject is for people who want to use the SAME workspace for different runs. (Which baffles me, but then I require a clean slate each time I start a job. Leftover stuff will only break things)
My issue is the EXACT opposite - I MUST have a separate workspace for each run (or I need to know how to create files with the same name in different runs that stay with that run only, and which are easily reachable from bash scripts started by the pipeline!)
So, my question is - how do I either force Jenkins to NOT use the same workspace for two concurrently-running jobs on different hosts, OR what variable can I use in the 'custom workspace' field to accomplish this?
After I responded to the question by @Joerg S I realized that I'm saying the thing that Joerg S says CAN'T happen is EXACTLY what I'm observing! Jenkins is using the SAME workspace for 2 different, concurrent, jobs on 2 different hosts. Is this a Jenkins pipeline bug?
See below for a bewildering amount of information.
Given the way I have to go onto and off of nodes during the run, I've found that I can start 2 different builds on different hosts of the same job, and they SHARE the workspace dir! Since each job has shell scripts which are busy writing files into that directory, this is extremely bad.
In Custom workspace in jenkins we are told to use custom workspace, and I'm set up just like that
In Jenkins: how to run builds in unique directories we are told to use ${BUILD_NUMBER} in the above custom workspace field, so what I tried was:
${JENKINS_HOME}/workspace/${ITEM_FULLNAME}/${BUILD_NUMBER}
All that happens to me when I use that is that the workspace name is, you guessed it, "${BUILD_NUMBER}" (and I even got a "${BUILD_NUMBER}@2" just for good measure!)
I tried {$BUILD_ID}, same thing (uses that literally, does not substitute the number).
I have the 'allow concurrent builds' turned on.
I'm using pipelines exclusively.
All jobs here, as part of normal execution, cause the slave, non-master host to reboot into an OS that does not have the capability to run slave.jar (indeed, it has no network access at all), so I cannot run the entire pipeline on that host.
All jobs use the following construct somewhere inside them:
tests=Arrays.asList(tests.split("\\r?\n"))
shellerror=231
for( line in tests){
So let's call an example job 'foo' that loops through a list, as above, that I want to run on 2 different hosts. The pipeline for that job starts running on master (since the above for (line in tests) is REQUIRED to run on a node!)). Then goes back and forth between master and slave, often multiple times.
If I start this job on host A and host B at about the same time, they will BOTH use the workspace ${JENKINS_HOME}/workspace/${JOB_NAME}, or in my case /var/lib/jenkins/jenkins/workspace/job
Since they write different data to files with the same name in that directory, I'm clearly totally broken immediately.
So, how do I force Jenkins to use a unique workspace EVERY SINGLE JOB?
Or, what???
Other things: pipeline build step version 2.5.1, Jenkins 2.46.2
I've been trying to get the workspace statement ('ws') to work, but that doesn't quite work as I expected either - some files are in the workspace I explicitly name, and some are still in the 'built-in' workspace (workspace/).
I was asked to provide code. The 'standard' pipeline I use is about 26K bytes, composing about 590 lines. So, I'm going to GREATLY reduce. That being said:
node("master") { // 1
..... lots of stuff....
} // this matches the "node('master')" above
node(HOST) {
echo "on $HOST, check what os"
if (isUnix())
...some more stuff...
} // end of 'node(HOST)' above
if (isok == 0 ) {
node("master") {
echo "----------------- Running on MASTER 19 $shellerror waiting on boot out of windows ------------"
sleep 120
echo "----------------- Leaving MASTER ------------"
}
}
... lots 'o code ...
node(HOST) {
... etc
} // matches the latest 'node HOST' above
node("master") { // 120
.... code ...
for( line in tests) {
...code...
}
}
... and on and on and on, switching back and forth from one to the other
FWIW, when I tried to make the above use 'ws' so that I could make certain the ws name was unique, I simply added a 'ws wsname' block directly under (almost) every 'node' opening so it was
node(name) { ws (wsname) { ..stuff that was in node block before... } }
But then I've got two directories to worry about checking - both the 'default' workspace/jobname dir AND the new wsname one.

Try using customWorkspace node common option:
pipeline {
agent {
node {
label 'node(s)-defined-label'
customWorkspace "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
}
}
stages {
// Your pipeline logic here
}
}
customWorkspace
A string. Run the Pipeline or individual stage this
agent is applied to within this custom workspace, rather than the
default. It can be either a relative path, in which case the custom
workspace will be under the workspace root on the node, or an absolute
path.
Edit
Since this doesn't work for your complex pipeline. Maybe try this silly solution:
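def WORKSPACE = "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
node(HOST) {
    sh(script: "mkdir -p ${WORKSPACE}")
    // A 'cd' in its own sh step does not carry over to later steps,
    // so change directory in the same script that does the work.
    sh(script: "cd ${WORKSPACE} && pwd") // replace 'pwd' with the real commands
    //Do stuff here
}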
or if dir() is accessible:
def WORKSPACE = "${JENKINS_HOME}/workspace/${JOB_NAME}/${BUILD_NUMBER}"
node(HOST) {
sh(script: "mkdir -p ${WORKSPACE}")
dir(WORKSPACE) {
//Do stuff here
}
}

customWorkspace didn't work for me.
What worked:
stages {
stage("SCM (For commit trigger)"){
steps {
ws('custom-workspace') { // Because we don't want to switch from the pipeline checkout
// Generated from http://lstool01:8080/job/Permanent%20Build/pipeline-syntax/
checkout(xxx)
}
}
}

'${SOMEVAR}'
will not get substituted
"${SOMEVAR}"
will - this is how groovy strings are being handled
see groovy string handling
so if you have a
ws("/some/path/somewhere/${BUILD_ID}")
{
//something
}
on your node in your pipeline Jenkinsfile it should do the trick in this regard
The problem with @2 workspaces can occur when you allow concurrent builds of the project. I had the exact same problem with a custom ws() ending up with an @2 suffix; simply disallow concurrent builds or work around that.
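A small plain-Groovy illustration of the quoting difference; the value assigned to BUILD_ID here is made up, Jenkins supplies the real one:

```groovy
def BUILD_ID = '17'   // hypothetical value, only for this demonstration

def literal = '/some/path/somewhere/${BUILD_ID}'        // single quotes: no substitution
def interpolated = "/some/path/somewhere/${BUILD_ID}"   // double quotes: GString, substituted

assert literal.contains('${BUILD_ID}')
assert interpolated == '/some/path/somewhere/17'
```

This also explains the workspace that was literally named ${BUILD_NUMBER} in the question: the text reached Jenkins without ever being evaluated as a double-quoted Groovy string.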

Sequential processing n times in a Jenkins pipeline

I have a jenkins pipeline to do parallel processing like below
def buildNumber = [:]
buildIterations.each {
    buildNumber[it] = createExecution(it)
}
node('MyJenkins') {
    stage('Prepare database') {
        --------
    }
}
parallel buildNumber
def createExecution(String number) {
    def cmd = {
        node('MyJenkins') {
            stage('Build') {
                ---------------------
            }
            stage('Test') {----------}
            stage('package') {--------}
        }
    }
    return cmd
}
But now i want to change this script to have sequential execution as this will run many builds in one job and have load on database at same time.
//should be executed once
node('MyJenkins') {
    stage('Prepare database') {
        --------
    }
}
//should be executed one after the other, but below code isn't even considered for job. It just stops after prepare database
buildIterations.each{
number=it
node('MyJenkins'){
stage('Build'){
---------------------
}
stage('Test'){----------}
stage('package'){--------}
}
}
I am new to scripting, please help me know what mistake i am doing

Which versions of the pipeline plugins are you using? Earlier versions didn't support iterating over a collection using .each{}, which sometimes resulted in the behavior you described. An update of the groovy-cps plugin will most probably fix it. Version 2.33 is the absolute minimum; I'd go for the latest if possible.
See groovy-cps plug-in
See also: JENKINS-26481
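If updating is not an option, a common workaround is to replace the .each closure with a plain indexed loop, which older pipeline versions handled fine. A sketch, assuming buildIterations is a list of strings (the values below are invented) and with echo steps standing in for the real work:

```groovy
def buildIterations = ['1', '2', '3']   // hypothetical values

// Executed once.
node('MyJenkins') {
    stage('Prepare database') {
        echo 'preparing database'       // placeholder for the real step
    }
}

// Executed one after the other: each pass finishes before the next starts.
for (int i = 0; i < buildIterations.size(); i++) {
    def number = buildIterations[i]
    node('MyJenkins') {
        stage("Build ${number}") {
            echo "building ${number}"   // placeholder
        }
        stage("Test ${number}") {
            echo "testing ${number}"    // placeholder
        }
        stage("Package ${number}") {
            echo "packaging ${number}"  // placeholder
        }
    }
}
```

The stage names include the iteration value only so that repeated stages can be told apart in the UI; that is a design choice, not a requirement.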

Matrix configuration with Jenkins pipelines

The Jenkins Pipeline plugin (aka Workflow) can be extended with other Multibranch plugins to build branches and pull requests automatically.
What would be the preferred way to run multiple configurations? For example, building with Java 7 and Java 8. This is often called matrix configuration (because of the multiple combinations such as language version, framework version, ...) or build variants.
I tried:
executing them serially as separate stage steps. Good, but takes more time than necessary.
executing them inside a parallel step, with or without nodes allocated inside them. Works but I cannot use the stage step inside parallel for known limitations on how it would be visualized.
Is there a recommended way to do this?

TLDR: Jenkins.io wants you to use nodes for each build.
Jenkins.io: In pipeline coding contexts, a "node" is a step that does two things, typically by enlisting help from available executors on agents:
Schedules the steps contained within it to run by adding them to the Jenkins build queue (so that as soon as an executor slot is free on a node, the appropriate steps run)
It is a best practice to do all material work, such as building or running shell scripts, within nodes, because node blocks in a stage tell Jenkins that the steps within them are resource-intensive enough to be scheduled, request help from the agent pool, and lock a workspace only as long as they need it.
Vanilla Jenkins Node blocks within a stage would look like:
stage('build') {
node('java7-build'){ ... }
node('java8-build'){ ... }
}
Further extending this notion Cloudbees writes about parallelism and distributed builds with Jenkins. Cloudbees workflow for you might look like:
stage('build') {
parallel 'java7-build':{
node('mvn-java7'){ ... }
}, 'java8-build':{
node('mvn-java8'){ ... }
}
}
Your requirement of visualizing the different builds in the pipeline could be satisfied with either workflow, but I trust the Jenkins documentation for best practice.
EDIT
To address the visualization @Stephen would like to see: he's right, it doesn't work! The issue has been raised with Jenkins and is documented here; the resolution, which involves the use of 'labelled blocks', is still in progress :-(
Q: Is there documentation telling pipeline users not to put stages inside of parallel steps?
A: No, and this is considered to be an incorrect usage if it is done; stages are only valid as top-level constructs in the pipeline, which is why the notion of labelled blocks as a separate construct has come to be ... And by that, I mean remove stages from parallel steps within my pipeline.
If you try to use a stage in a parallel job, you're going to have a bad time.
ERROR: The ‘stage’ step must not be used inside a ‘parallel’ block.

I would suggest Declarative Matrix as a preferred way to run multiple configurations in Jenkins. It allows you to execute the defined stages for every configuration without code duplication.
Example:
pipeline {
agent none
stages {
stage('Test') {
matrix {
agent {
label "${NODENAME}"
}
axes {
axis {
name 'NODENAME'
values 'java7node', 'java8node'
}
}
stages {
stage('Test') {
steps {
echo "Do Test for ${NODENAME}"
}
}
}
}
}
}
}
Note that declarative Matrix is a native declarative Pipeline feature, so no additional Plugin installation needed.
Jenkins blog post about the matrix directive.

As noted by @StephenKing, Blue Ocean will show parallel branches better than the current stage view. A planned upcoming version of the stage view will be able to show all the branches, though it will not visually indicate any nesting structure (would look the same as if you ran the configurations serially).
In any event, the deeper issue is that you will essentially only get a pass/fail status for the build overall, pending a resolution to JENKINS-27395 and related requests.

In order to test each commit on several platforms, I've used this base Jenkinsfile skeleton:
def test_platform(label, with_stages = false)
{
node(label)
{
// Checkout
if (with_stages) stage label + ' Checkout'
...
// Build
if (with_stages) stage label + ' Build'
...
// Tests
if (with_stages) stage label + ' Tests'
...
}
}
/*
parallel ( failFast: false,
Windows: { test_platform("Windows") },
Linux: { test_platform("Linux") },
Mac: { test_platform("Mac") },
)
*/
test_platform("Windows", true)
test_platform("Mac", true)
test_platform("Linux", true)
With this it's relatively easy to switch from a sequential to a parallel execution, each of them having their pros and cons:
Parallel execution runs much faster, but it doesn't contain the stages labelling
Sequential execution is much slower, but you get a detailed report thanks to stages, labelled as "Windows Checkout", "Windows Build", "Windows Tests", "Mac Checkout", etc.)
I'm using the sequential execution for the time being, until I find a better solution.
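For the parallel variant, the branch map can also be built from a list instead of being written out by hand. This is only a sketch on top of the skeleton above: it assumes test_platform is defined as shown there, and the platforms list is simply the three labels already used:

```groovy
def platforms = ['Windows', 'Linux', 'Mac']

def branches = [:]
for (int i = 0; i < platforms.size(); i++) {
    def label = platforms[i]                  // fresh variable per iteration, captured by the closure
    branches[label] = { test_platform(label) }
}
branches['failFast'] = false                  // same option as in the commented-out parallel call

parallel branches
```

Adding a platform then means adding one entry to the list, whichever execution mode ends up being used.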

It seems like there is relief coming at least with the BlueOcean UI. Here is what I got (the tk-* nodes are the parallel steps):
