How to safe guard against broken jenkinsfiles causing pipelines to run indefinitely? - jenkins

I have my repo getting polled every 5 mins.
But I found that if the jenkinsfile is totally broken the pipeline will fail with "This stage has no steps".
Then every 5 mins it will retry it and keep failing.
How do I safe guard against this? Can I set a threshold somewhere so if this happens it doesn't churn forever?

If you are using scm polling, it should only build if there are changes. Sounds like you may be building on a cron schedule. Here is the different syntax for each in a declarative pipeline.
pipeline {
triggers {
cron('H/4 * * * 1-5')
pollSCM('0 0 * * 0')
}
}
Or you could do is trigger the builds from a webhook instead of starting a new build every 5 minutes.
If you really just want to throttle the builds so you can't do more than n builds in x time, you can set this property:
properties([[$class: 'JobPropertyImpl', throttle: [count: 1, durationName: 'hour']])

Related

How can I retrieve the execution status of parallel triggered child jobs to a pipeline script

have a pipeline script that executes child jobs in parallel.
Say I have 5 data (a,b,c,d,e) that has to be executed on 3 jobs (J1, J2, J3)
My pipeline script is in the below format
for (int i = 0; i < size; i++) { def index = i branches["branch${i}"] = { build job: 'SampleJob', parameters: [ string(name: 'param1', value:'${data}'), string(name:'dummy', value: "${index}")] } } parallel branches
My problem is, say the execution is happening on Job 1 with the data 1,2,3,4,5 and if the data 3 execution is failed on Job 1 then the data 3 execution should be stopped there itself and should not happen on the subsequent parallel execution on Jobs 2 and 3.
Is there any way that I can read the execution status of parallelly execution job status on the Pipeline script so that I can restrict data 3 execution to block in Jobs 2 and 3.
I am quite blocked here for a long time. Hoping for a solution from my community. Thanks a lot in advance.
In summary, it sounds like you want to
run multiple jobs in parallel against different pieces of data. I will call the set of related jobs the "batch".
avoid starting a queued job if any of the jobs in the batch have failed
automatically abort a running job if any of the jobs in the batch have failed
The jobs need some way to communicate their failure to the others. Use a shared storage location to store the "failure flag". If the file exists, then one or more of the jobs have failed.
For example, a shared NFS path: /shared/jenkins/jobstate/<BATCH_ID>/failed
At the start of the job, check for the existence of this path. Exit if it does. The file doesn't necessarily need to contain any data - its presence is enough.
Since you need running jobs to abort early if the failure flag exists, you will need to poll that location periodically. For example, after each unit of work. Again, if the file exists then exit early.
If you don't use NFS, that's ok. You could also use an object storage bucket. The important thing is that the state is accessible to all the relevant build jobs.

Jenkins pipeline queue gets full when all agents are offline

I am using a Jenkins pipeline script and when all nodes are offline, the builds keep on queuing up. How do I stop Jenkins from adding jobs to the queue while all slaves are offline?
pipeline {
triggers {
pollSCM('H/3 * * * 1-5')
}
}
Is your agent's availability configured to 'Keep this agent online as much as possible' ?
One way to tackle this situation is, run the below script on master node and build your pipeline(s) only if at least one of the nodes is online. You can pass the online node name to your downstream job as a parameter.
def axis = []
for (slave in jenkins.model.Jenkins.instance.getNodes()) {
if (slave.toComputer().isOnline()) {
axis += slave.getDisplayName()
}
}
return axis
Above script source: Jenkins: skip if node is offline
Other links that may help are:
Monitor and restart your slave nodes - https://wiki.jenkins.io/display/JENKINS/Monitor+and+Restart+Offline+Slaves
I found this script handy in some situations:
https://github.com/jenkinsci/jenkins-scripts/blob/master/scriptler/clearBuildQueue.groovy
I'm not into pipeline jobs, but for regular freestyle jobs, this kind of queueing will only happen if your builds are parameterized. Seperate builds are needed then to ensure that the project will run seperately for each and every parameter value (it does not matter whether the value is actually different).
So, removing build parameters in your project might solve the problem.

Jenkins pipeline "waitUntil" - change delay between attempts

We use a Jenkins pipeline for our builds and tests. After the build, we run automated tests on several measurement devices.
For a better overview about the needed testing time, I created a test stage which is periodically checking the status of the tests. When all tests are finished, the pipeline is done. I use the "waitUntil" implementation of Jenkins pipeline for this functionality.
My problem is: The pause between the attemps gets more and more after every try. This is a quite good idea. BUT: After a while, the pause between the attemps gets up to 16 hours and more. This value is too high for my needs because I want to know the needed test time exactly.
My question is: Does anyone know a way to change this behaviour of "waitUntil"?
I know I could use a "while" loop but I would prefer to solve this using "waitUntil".
stage ">>> Waiting for testruns"
waitUntil {
sleep(10)
return(checkIfTestsAreFinished())
}
New versions of Jenkins have capped this to never go over 15 seconds (see https://issues.jenkins-ci.org/browse/JENKINS-34554 ).
In waitUntil, if the processing in the block returns false, then the waitUntil step waits a bit longer and tries again. “a bit longer” means, a 0.25 second wait time. If it needs to loop again, it multiplies that by a factor of 1.2 to get 0.3 seconds for the next wait cycle. On each succeeding cycle, the last wait time is multiplied again by 1.2 to get the time to wait. So, the sequence goes as 0.25, 0.3, 0.36, 0.43, 0.51... until 15 secs(as it is mentioned in one of the answers below, Jenkins solved it).
See the image below:
If you are using older Jenkins version then possible solution could be to use timeout
timeout(time: 1, unit: 'HOURS'){
// you can change format in seconds, minutes, hours
// you can decide your know timeout limit.
waitUntil {
try {
//sleep(10)// you don't need this
return(checkIfTestsAreFinished())
} catch (exception) {
return false
}
}//waituntil
}//timeout
Please note the behavior is from Jenkins LTS 2.235.3.
Jenkins command waitUntil is not supposed to launch something synchronously.
To know the required time you must add a timestamp into the test output parse it and calc separately.

Jenkins .eachDir() iterate only once

I am trying to do a simple housekeeping pipeline to delete old workspace dirs within Jenkins.
node {
stage 'Housekeeping stage'
echo "Deleting all old cell directories, older then ${env.MAXIMUM_CELL_LIVE} days"
new File("${env.phaser_dir}\\workspace\\").eachDir() { dir ->
long diff = new Date().getTime() - dir.lastModified()
if (diff > env.MAXIMUM_CELL_LIVE.toInteger() * 24 * 60 * 60 * 1000) {
dir.deleteDir()
}
}
}
The result is that it iterates only once each time, deleting just one directory.
I have latest version of Pipeline at 2.2. I've also googled there used to be problems like this with .each iterator, but that supposed to be fixed?
Many Thanks
Michal
This is a known issue with Jenkins pipelines (former workflows) and it's filed in JIRA as JENKINS-26481.
Note that there's a lot happening behind the scenes in the Jenkins workflow; after every line Jenkins saves the state of the workflow (position in loops, local variables etc.) to be able to survive failure and resume processing. That's why fixing this problem within Jenkins is not trivial.
There's an easy workaround for you though - just move the logic into a separate function with #NonCPS annotation.
More information is available in the plugin documentation.

Multiple concurrent builds of the same project in Jenkins

On my team, we have a project that we want to do continuous-integration-style testing on. Our build takes around 2 hours and is triggered by the "Poll SCM" trigger (using Perforce as the server), and we have two build nodes.
Currently, if someone checks in a change, one build node will start up pretty much right away, but if another change gets checked in, the other node will not kick in, as it's waiting for the previous job to finish. However, I could like the other build node to start a build with the newer checkin as soon as possible, so that we can maximize the amount of continuous testing that's occurring (so that if e.g. one build fails we know sooner rather than later).
Is there any simple way to configure a Jenkins job (using Poll SCM against a Perforce server) to not block while another instance of the job is already running?
Unfortunately, due to the nature of the project it's not possible to simply break the project up into multiple build jobs that get pipelined across multiple slaves (as much as I'd like to change it to work in this way).
Use the "Execute concurrent builds if necessary" option in Jenkins configuration.
Just to register here in case someone needs it, in the version I'm using (Jenkins 2.249.3) I had to uncheck the option Do not allow concurrent builds in the child job that is called multiple times from the parent job.
The code is more or less like that:
stage('STAGE IN THE PARENT JOB') {
def subParallelJobs = [:]
LIST_OF_PARAMETERS = LIST_OF_PARAMETERS.split(",")
for (int i = 0; i < LIST_OF_PARAMETERS.size(); i++) {
MY_PARAMETER_VALUE = LIST_OF_PARAMETERS[i].trim()
MY_KEY_USING_THE_PARAMETER_TO_MAKE_IT_UNIQUE = "JOB_KEY_${MY_PARAMETER_VALUE}"
def jobParams = [ string(name: 'MY_JOB_PARAMETER', value: MY_PARAMETER_VALUE) ]
subParallelJobs.put("MY_KEY_USING_THE_PARAMETER_TO_MAKE_IT_UNIQUE", {build (job: "MY_CHILD_JOB", parameters: jobParams)})
}
parallel(subParallelJobs)
}
}

Resources