Background
Let's say I have two jobs: a 'Pipeline Job' and a 'Build Job'. The 'Pipeline Job' runs on master and is, of course, a pipeline (written in Groovy). For one build step in the pipeline I use a slave running on Windows, the 'Build Job', which is responsible for building something I can't build on the master. The master also runs on Windows but lacks some software needed for this specific build.
The Question
I have a groovy script that looks something like this:
#!groovy
node {
    stage('Environment Preparation') {
        // fetches stuff and sets up the environment on master
    }
    stage('Unit Testing') {
        // some testing
    }
    stage('Build on Slave') {
        def slaveJob = build job: 'BuildJob'
    }
}
It works fine, where 'BuildJob' is configured with "Restrict where this project can be run", i.e., restricted to the slave.
My issue is that I want the output from 'BuildJob' to be printed in the pipeline logs. Do you have some clever ways of how this could be done? I'm open to anything, so if you know of more clever ways to start the 'BuildJob', etc., I'm eager to hear it.
Thanks!
EDITED
You have to approve the things you want to access under script approval. I'm not sure if you really need getRawBuild, but it worked.
Search through console output of a Jenkins job
#!groovy
node {
    stage('Environment Preparation') {
        // fetches stuff and sets up the environment on master
    }
    stage('Unit Testing') {
        // some testing
    }
    stage('Build on Slave') {
        def slaveJob = build job: 'BuildJob'
        println slaveJob.rawBuild.log
    }
}
At jenkinsurl/scriptApproval/ you approve the following:
method hudson.model.Run getLog
method org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper getRawBuild
Well, sometimes stating a question makes you think from another perspective.
Hopefully someone will benefit from this.
I stumbled upon a Pipeline-Plugin tutorial where they showed how you could use node to label where script code should run. The resulting Groovy file looks something like this:
#!groovy
stage('Environment Preparation') {
    node('master') {
        // fetches stuff and sets up the environment on master
    }
}
stage('Unit Testing') {
    node('master') {
        // some testing
    }
}
stage('Build on Slave') {
    node('remote') {
        def out = bat script: 'C:\\Build\\build.bat', returnStdout: true
    }
}
As you can see, the tutorial made me refactor the script a bit. The node('remote') part is what specifies that the enclosed steps should run on the slave machine.
I had to make some customizations in the batch script so that everything important was printed to stdout.
You have to let Jenkins know which node is 'remote' by going to Manage Jenkins > Manage Nodes, choosing the slave agent in question, selecting Configure Node and adding 'remote' (or whatever suits you) to the Labels field.
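One detail worth noting about the snippet above: with returnStdout: true the batch output is captured into the variable rather than written to the log, so to actually see it in the pipeline output you still have to echo it yourself. A minimal sketch of that last stage, reusing the example path from above:
stage('Build on Slave') {
    node('remote') {
        // capture the batch script's stdout and echo it so it shows up in the pipeline log
        def out = bat script: 'C:\\Build\\build.bat', returnStdout: true
        echo out
    }
}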
Related
I have parallel stages set up in my Jenkins pipeline script which execute tests on separate nodes (AWS EC2 instances), e.g.:
stage('E2E-PR') {
    parallel {
        stage('Cypress Tests 1') {
            agent { label 'aws_slave_cypress' }
            steps {
                runE2eTests()
            }
        }
        stage('Cypress Tests 2') {
            agent { label 'aws_slave_cypress' }
            steps {
                runE2eTests()
            }
        }
    }
}
I would like to re-use the checked-out repo/workspace generated on the parent node at the start of my pipeline, rather than have each parallel stage check out its own copy.
I came across an approach using sequential stages (nesting stages within stages) to share a workspace across multiple stages, which I tried as below:
parallel {
    stage('Cypress Tests') {
        agent { label 'aws_slave_cypress' }
        stages {
            stage('Cypress 1') {
                steps {
                    runE2eTests()
                }
            }
            stage('Cypress 2') {
                steps {
                    runE2eTests()
                }
            }
        }
    }
}
But I can see from my Jenkins build output that only one AWS instance gets spun up and used for both of my nested stages, which doesn't give me the benefit of parallelism.
I have also come across the stash/unstash commands, but I have read that these should be used for small files and not for large directories or entire repositories.
What's the right approach here to allow me to have parallel stages across multiple nodes use the same originally generated workspace? Is this possible?
Thanks
I can share a bit of information about a similar situation we have at our company:
We have an automation regression suite for UI testing that needs to be executed on several different machines (Win32, Win64, Mac, and more). The suite configuration is controlled by a config file that contains all the relevant environment parameters and URLs.
The Job allows a user to select the suites that will be executed (and the git branch), select labels (agents) that the suites will be executed on, and select the environment configuration that those suites will use.
Here is what the flow looks like:
Generate the config file according to the user input and save (stash) it.
In parallel for all given labels: clone the suites repository using a shallow clone -> copy the configuration file (unstash) -> run the tests -> save (stash) the output of the tests.
Collect the outputs of all tests (unstash), combine them into a single report, publish the report and notify users.
The pipeline itself (simplified) will look like:
pipeline {
    agent any
    parameters {
        extendedChoice(name: 'Agents', ...)
    }
    stages {
        stage('Create Config') {
            steps {
                // create the config file called config.yaml
                ...
                stash name: 'config', includes: 'config.yaml'
            }
        }
        stage('Run Tests') {
            steps {
                script {
                    parallel params.Agents.collectEntries { agent ->
                        ["Running on ${agent}": {
                            stage(agent) {
                                node(agent) {
                                    println "Executing regression tests on ${agent}"
                                    stage('Clone Tests') {
                                        // Git - shallow clone the automation tests
                                    }
                                    stage('Update Config') {
                                        unstash 'config'
                                    }
                                    stage('Run Suite') {
                                        catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') {
                                            // Run the tests
                                        }
                                    }
                                    stage('Stash Results') {
                                        stash name: agent, includes: 'results.json'
                                    }
                                }
                            }
                        }]
                    }
                }
            }
        }
        stage('Combine and Publish Report') {
            steps {
                script {
                    // Unstash all results from each agent into the same folder
                    params.Agents.each { agent ->
                        dir(agent) {
                            unstash agent
                        }
                    }
                    // Run a command to combine the results and generate the HTML report
                    // Use publishHTML to publish the HTML report to Jenkins
                }
            }
        }
    }
    post {
        always {
            // Notify users
        }
    }
}
So, as you can see, for handling relatively small files like the config file and the test result files (which are less than 1 MB), stash and unstash are quick, very easy to use, and offer great flexibility; but when you use them for bigger files, they start to overload the system.
Regarding the tests themselves: eventually you must copy the files to each workspace on the different agents, and shallow-cloning them from the repository is probably the most efficient and easiest method if your tests are not packaged.
If your tests can easily be packaged, like a Python package for example, then you can create a separate CI pipeline that archives them as a package on each code change; your automation pipeline can then fetch them with a tool like pip and avoid the need to clone the repository.
That is also true when you run your tests in a container, as you can create a CI pipeline that prepares an image that already contains the tests, and then just use that image in the pipeline without the need to clone the repository again.
If your Git server is very slow, you can always package the tests in your CI using zip or a similar archiver, upload them to S3 (using aws-steps) or an equivalent, and download them during the automation pipeline execution; but in most cases this will not offer any performance benefit.
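As a rough illustration of the packaged-tests idea, assuming the suite is published as a pip-installable package, the consuming stage could look something like this (the package name automation-tests and the TESTS_VERSION parameter are hypothetical placeholders):
stage('Install Test Suite') {
    steps {
        // install the pre-built test package instead of cloning the test repository
        sh "pip install automation-tests==${params.TESTS_VERSION}"
    }
}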
I suppose you could send the generated files to S3 in a post step, for example, and download them in the first step of your pipeline.
Think about storage that you can share between Jenkins agents.
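Assuming the AWS Steps (pipeline-aws) plugin is available, a rough sketch of that idea could look like the following; the region, bucket name, credentials ID, paths and agent label are all placeholders:
stage('Upload Generated Files') {
    steps {
        withAWS(region: 'eu-west-1', credentials: 'aws-credentials-id') {
            // push the generated files to a shared bucket so other agents can fetch them
            s3Upload(bucket: 'my-shared-bucket', path: "builds/${env.BUILD_NUMBER}/", includePathPattern: '**/*', workingDir: 'output')
        }
    }
}
stage('Download On Another Agent') {
    agent { label 'another-agent' }
    steps {
        withAWS(region: 'eu-west-1', credentials: 'aws-credentials-id') {
            // pull the same files into this agent's workspace
            s3Download(bucket: 'my-shared-bucket', path: "builds/${env.BUILD_NUMBER}/", file: 'output/', force: true)
        }
    }
}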
I coded a generic pipeline which accepts several parameters in order to deploy releases from a pre-defined GitHub repository to specific nodes. I wanted to host this pipeline in a Jenkinsfile on GitHub, so I configured the job to work with a "Pipeline script from SCM". The fact is, when I try to build the job, the Jenkinsfile gets checked out on every node. Is it possible to check out and execute the Jenkinsfile only on, say, the master node and run the pipeline as intended?
EDIT: As I stated before, the pipeline works just fine and as intended when the job is set up with a plain pipeline script. The thing is, when I change it to "Pipeline script from SCM", the Jenkinsfile gets checked out on every agent, which is a problem since I don't have Git installed on any agent other than master. I want the Jenkinsfile to be checked out only on the master agent and executed as intended. FYI, the pipeline is below:
def agents = "$AGENTS".toString()
def agentLabel = "${ println 'Agents: ' + agents; return agents; }"

pipeline {
    agent none
    stages {
        stage('Prep') {
            steps {
                script {
                    if (agents == null || agents == "") {
                        println "Skipping build"
                        skipBuild = true
                    }
                    if (!skipBuild) {
                        println "Agents set for this build: " + agents
                    }
                }
            }
        }
        stage('Powershell deploy script checkout') {
            agent { label 'master' }
            when {
                expression {
                    !skipBuild
                }
            }
            steps {
                git url: 'https://github.com/owner/repo.git', credentialsId: 'git-credentials', branch: 'main'
                stash includes: 'deploy-script.ps1', name: 'deploy-script'
            }
        }
        stage('Deploy') {
            agent { label agentLabel }
            when {
                expression {
                    !skipBuild
                }
            }
            steps {
                unstash 'deploy-script'
                script {
                    println "Execute powershell deploy script on agents set for deploy"
                }
            }
        }
    }
}
I think that skipDefaultCheckout is what you are looking for:
pipeline {
    options {
        skipDefaultCheckout true
    }
    stages {
        stage('Prep') {
            steps {
                script {
                    ........................
                }
            }
        }
    }
}
Take a look at the documentation:
skipDefaultCheckout
Skip checking out code from source control by default in the agent directive.
https://www.jenkins.io/doc/book/pipeline/syntax/
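For context, here is a rough sketch of how skipDefaultCheckout might be combined with the checkout-and-stash approach from the question, so that only master touches Git; the deploy label is a placeholder:
pipeline {
    agent none
    options {
        // no stage agent performs the implicit SCM checkout any more
        skipDefaultCheckout true
    }
    stages {
        stage('Checkout on master') {
            agent { label 'master' }
            steps {
                checkout scm    // only master needs Git installed
                stash includes: 'deploy-script.ps1', name: 'deploy-script'
            }
        }
        stage('Deploy') {
            agent { label 'deploy-agent-label' }
            steps {
                unstash 'deploy-script'    // the deploy agent receives only the stashed script
            }
        }
    }
}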
I think you are requesting the impossible.
Now:
Your Jenkinsfile is inside your Jenkins configuration and is sent as such to each of your agents. No need for Git on your agents.
Pipeline script from SCM:
Since you use Git, SCM = Git. So you are saying: my pipeline needs to be fetched from a Git repository. You are declaring the Deploy stage to run on agent { label agentLabel }, so that stage is supposed to run on an agent other than master.
How would you imagine that agent could get the content of the Jenkinsfile to know what to do, but not use Git?
What happens in Jenkins?
Your master agent gets triggered that it needs to build.
The master agent checks out the Jenkinsfile using Git (since it is a Pipeline script from SCM).
Jenkins reads the Jenkinsfile and sees what has to be done.
For the Prep stage, I'm not quite sure what happens without an agent; I guess it runs on the master agent.
The Powershell deploy script checkout stage is marked to run on the master agent, so it runs on the master agent (note that the Jenkinsfile will get checked out with Git two more times:
before starting the stage, because Jenkins needs to know what to execute,
one more checkout because you specify git url: 'https://github.com/owner/repo.git'...).
The Deploy stage is marked to run on agentLabel, so Jenkins tries to check out your Jenkinsfile on that agent (using Git)...
You can use a Scripted Pipeline to do this; it should basically look like this:
node('master') {
    checkout scm
    stash includes: 'deploy-script.ps1', name: 'deploy-script'
}

def stepsForParallel = [:]
env.AGENTS.split(' ').each { agent ->
    stepsForParallel["deploy ${agent}"] = { ->
        node(agent) {
            unstash 'deploy-script'
        }
    }
}
parallel stepsForParallel
You can find all the info about the Jenkins agent section here.
In short: you can call any agent by name or label.
pipeline {
    agent {
        label 'master'
    }
}
If that does not work for you, then you will need to set a label on the master node and call it by that label:
pipeline {
    agent {
        label 'master_label_here'
    }
}
I am trying to have a Jenkins pipeline job run only on specific agents. For reasons I cannot use any, and I would prefer not to restrict it to a single agent via a label. The agents are named like: abc-a, abc-b, abc-c, xyz-a, xyz-b, xyz-c.
I would like to run the job only on agents that contain abc in their name. I was looking for something like:
pipeline {
    agent {
        contains: 'abc'
    }
    stages {
        stage ("Example") {
            steps {
                sh 'really cool bash'
            }
        }
    }
}
I read through the documentation here, Jenkins Pipeline, but the best thing I can find is using a label, although I could be overlooking something. Any ideas?
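For what it's worth, a node's name also acts as an implicit label in Jenkins, and label expressions can be OR-ed together, so one possible (if verbose) workaround would be a sketch like the following, using the agent names from the question:
pipeline {
    // node names double as labels, so they can be combined with '||'
    agent { label 'abc-a || abc-b || abc-c' }
    stages {
        stage('Example') {
            steps {
                sh 'really cool bash'
            }
        }
    }
}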
I'm trying to get the following features to work in Jenkins' Declarative Pipeline syntax:
Conditional execution of certain stages only on the master branch
input to ask for user confirmation to deploy to a staging environment
While waiting for confirmation, it doesn't block an executor
Here's what I've ended up with:
pipeline {
    agent none
    stages {
        stage('1. Compile') {
            agent any
            steps {
                echo 'compile'
            }
        }
        stage('2. Build & push Docker image') {
            agent any
            when {
                branch 'master'
            }
            steps {
                echo "build & push docker image"
            }
        }
        stage('3. Deploy to stage') {
            when {
                branch 'master'
            }
            input {
                message "Deploy to stage?"
                ok "Deploy"
            }
            agent any
            steps {
                echo 'Deploy to stage'
            }
        }
    }
}
The problem is that stage 2 needs the output from stage 1, but this is not available when it runs. If I replace the various agent directives with a global agent any, then the output is available, but the executor is blocked waiting for user input at stage 3. And if I try to combine 1 & 2 into a single stage, then I lose the ability to conditionally run some steps only on master.
Is there any way to achieve all the behaviour I'm looking for?
You need to use the stash command at the end of your first stage and then unstash when you need the files.
I think these are available in the Snippet Generator.
As per the documentation:
Saves a set of files for use later in the same build, generally on another node/workspace. Stashed files are not otherwise available and are generally discarded at the end of the build. Note that the stash and unstash steps are designed for use with small files. For large data transfers, use the External Workspace Manager plugin, or use an external repository manager such as Nexus or Artifactory.
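A rough sketch of how that could look in the pipeline from the question, assuming the compile stage writes its output into a build/ directory (the stash name and include pattern are placeholders):
stage('1. Compile') {
    agent any
    steps {
        echo 'compile'
        // keep the compile output for later stages that may run on a different node
        stash name: 'compile-output', includes: 'build/**'
    }
}
stage('2. Build & push Docker image') {
    agent any
    when {
        branch 'master'
    }
    steps {
        unstash 'compile-output'    // restore the compile output on whichever node runs this stage
        echo "build & push docker image"
    }
}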
I was reading about the best practices of a Jenkins pipeline.
I have created a declarative pipeline that does not execute parallel jobs, and I want to run everything on the same slave.
I use:
agent {
    label 'xxx'
}
The rest of my pipeline looks like:
pipeline {
    agent {
        label 'xxx'
    }
    triggers {
        pollSCM pipelineParams.polling
    }
    options {
        buildDiscarder(logRotator(numToKeepStr: '3'))
    }
    stages {
        stage('stage1') {
            steps {
                xxx
            }
        }
        stage('stage2') {
            steps {
                xxx
            }
        }
    }
    post {
        always {
            cleanWs()
        }
        failure {
            xxx
        }
        success {
            xxx
        }
    }
}
Now I read the best practices here.
Point 4 says:
Do: All Material Work Within a Node
Any material work within a pipeline should occur within a node block. Why? By default, the Jenkinsfile script itself runs on the Jenkins master, using a lightweight executor expected to use very few resources. Any material work, like cloning code from a Git server or compiling a Java application, should leverage Jenkins distributed builds capability and run on an agent node.
I suspect this is for scripted pipelines.
Now my questions are:
Do I ever have to create a node block inside a stage in a declarative pipeline (it is possible), or should I use agent inside the stage when I want to run that stage on another specific agent?
My current pipeline uses a label that is assigned to 4 agents. My whole pipeline is always executed on one agent (which is what I want), but I would have expected it to execute stage1 on slaveX and maybe stage2 on slaveY. Why is this not happening?
The documentation is quite misleading.
What the documentation is suggesting is to take advantage of distributed builds. Distributed builds are activated by using either the agent or the node block.
The agent directive should be used when you want to run the pipeline almost exclusively on one node. The node block allows more flexibility, as it lets you specify where a granular task should be executed.
If you are running the pipeline on some agent and you wrap a step in a node block pointing at the same agent, there won't be any effect, except that a new executor will be allocated to the step wrapped in node.
There is no obvious benefit in doing so. You will simply be consuming executors that you don't need.
In conclusion, you are already using distributed builds when using agent and this is what the documentation is vaguely recommending.
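To illustrate the distinction, here is a minimal sketch (the labels are placeholders): the pipeline-level agent pins every stage to one node, while a stage-level agent directive sends just that stage elsewhere, which is the declarative counterpart of dropping into a node block.
pipeline {
    agent { label 'xxx' }    // every stage runs on a node with this label by default
    stages {
        stage('stage1') {
            steps {
                echo 'runs on the pipeline-level agent'
            }
        }
        stage('stage2') {
            agent { label 'another-label' }    // only this stage is sent to a different node
            steps {
                echo 'runs on whichever node matches another-label'
            }
        }
    }
}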