How can I trigger a replay of a build from another job?
Context of problem: I want a job that can prioritize one build over others in another job (which has concurrency disabled). I was thinking I could do this by killing/cancelling the builds in the queue, triggering the new build, and then replaying the ones that were cancelled.
I think I know how to cancel the builds in the queue, with something like:
import hudson.model.*
import jenkins.model.Jenkins
import org.jenkinsci.plugins.workflow.job.WorkflowRun
import org.jenkinsci.plugins.workflow.support.steps.StageStepExecution

// TARGET_JOB and dryRun are expected to be supplied by the caller (e.g. as job parameters)
def buildNumbers = []
def job = Jenkins.instance.getItemByFullName(TARGET_JOB)
def builds = job.builds
job = null
for (build in builds) {
    if (build.isBuilding() && !(build.isInProgress())) {
        if (build instanceof WorkflowRun) {
            WorkflowRun run = (WorkflowRun) build
            if (!dryRun) {
                //hard kill
                run.doKill()
                //release pipeline concurrency locks
                StageStepExecution.exit(run)
            }
            println "Killed ${run}"
            buildNumbers.add(build.getNumber())
        } else if (build instanceof FreeStyleBuild) {
            FreeStyleBuild run = (FreeStyleBuild) build
            if (!dryRun) {
                run.executor.interrupt(Result.ABORTED)
            }
            println "Killed ${run}"
        } else {
            println "WARNING: Don't know how to handle ${build.class}"
        }
    }
}
But say I have saved these builds or build numbers that were killed, how can I replay them?
I am open to other alternatives as well that solves this problem of prioritizing one build ahead of another.
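One way to think about the last step, sketched in plain Groovy rather than as a tested Jenkins script: record the parameters of each build before killing it, then re-submit them oldest-first once the priority build has been queued. Note that this schedules new builds with the same parameters; it is not the Pipeline "Replay" feature. `KilledBuild`, `requeueKilled` and the `schedule` closure are hypothetical names introduced for illustration; in a system Groovy script the closure would presumably wrap a call such as `job.scheduleBuild2(...)`, which is an assumption.

```groovy
// Hypothetical record of a build that was killed to make room for a priority build.
class KilledBuild {
    int number
    Map<String, String> params = [:]
}

// Re-submit killed builds oldest-first and return the build numbers in submission order.
// `schedule` is a caller-supplied closure (an assumption) that queues one build
// from a parameter map, e.g. by wrapping job.scheduleBuild2(...) in a system Groovy script.
List<Integer> requeueKilled(List<KilledBuild> killed, Closure schedule) {
    def submitted = []
    // sort(false) returns a sorted copy and leaves the caller's list untouched
    for (kb in killed.sort(false) { it.number }) {
        schedule(kb.params)
        submitted << kb.number
    }
    return submitted
}

// Usage with a stand-in scheduler that only records what it was asked to queue.
def queued = []
def order = requeueKilled(
    [new KilledBuild(number: 12, params: [BRANCH: 'b']),
     new KilledBuild(number: 11, params: [BRANCH: 'a'])],
    { params -> queued << params })
println "re-queued ${order} with ${queued}"
```

Keeping the scheduling step behind a closure means the ordering logic can be tried outside Jenkins before it is wired to a real job.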
Related
I have a Jenkins job (pipeline) that updates the status page. The job is triggered externally. I want a currently running build to be stopped/cancelled if a new one is scheduled.
Is there an option for that?
Here's what we use. It doesn't make a build stop itself; instead, it lets the other (newer) build stop it. In the end, this ensures that a newer build is allowed to proceed while an older build is stopped, so this may fit your case.
// invoke early in your pipeline
def killOtherBuilds() {
    def jobname = env.JOB_NAME
    def my_buildnum = env.BUILD_NUMBER.toInteger()
    echo "Job is ${jobname}, build number is ${my_buildnum}"
    def job = Jenkins.instance.getItemByFullName(jobname)
    def builds = job.builds
    job = null
    for (build in builds) {
        def this_buildnum = build.getNumber().toInteger()
        if (!build.isBuilding()) {
            println "Build ${this_buildnum} isn't building."
            continue;
        }
        if (my_buildnum == this_buildnum)
        {
            println "Build ${this_buildnum} is building and it's this build."
            continue;
        }
        else if (my_buildnum < this_buildnum)
        {
            // a newer build exists: abort this one instead of killing the newer build
            def errorMsg = "A newer build is already scheduled"
            currentBuild.result = "ABORTED"
            currentBuild.description = errorMsg
            error(errorMsg)
        }
        echo "Kill build ${build} number ${this_buildnum}."
        build.displayName += "(stopped by #${my_buildnum})"
        killBuild(build)
        Thread.sleep(5000)
    }
}
@NonCPS
def killBuild(some_build){
    some_build.doStop()
}
Having a build stop itself would instead involve frequently checking whether a newer build is already scheduled; you may want to modify this to fit your exact requirements.
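The periodic check mentioned above could look roughly like the following sketch. `newerBuildIsRunning` is a hypothetical helper, not part of Jenkins. It assumes each element exposes `number` and `building`, which Jenkins `Run` objects provide through `getNumber()` and `isBuilding()`; inside a pipeline it would probably need the same @NonCPS treatment as `killBuild`.

```groovy
// Hypothetical helper: true when some build with a higher number is still building.
// Elements only need `number` and `building` properties, so plain maps work for a dry run.
boolean newerBuildIsRunning(Iterable builds, int myNumber) {
    return builds.any { it.building && it.number > myNumber }
}

// Dry run with stand-in data instead of real Run objects.
def fakeBuilds = [
    [number: 41, building: true],   // this build
    [number: 42, building: true],   // a newer build that is running
    [number: 40, building: false],  // an older, finished build
]
if (newerBuildIsRunning(fakeBuilds, 41)) {
    println "A newer build is already scheduled"
}
```

A build could call such a check between long-running stages and abort itself when it returns true.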
I'm attempting to set up a script to kill/abort all Jenkins jobs with a certain name in them. I've had trouble finding documentation on Jenkins classes and what's contained in them.
I know there are plugins available, but I've been directed not to use them. Otherwise, I've referred to a few semi-related questions here (How to stop an unstoppable zombie job on Jenkins without restarting the server?), (Cancel queued builds and aborting executing builds using Groovy for Jenkins), and I attempted to rework some of the code from those, however it doesn't quite result in killed jobs:
import hudson.model.*
def jobList = Jenkins.instance.queue
jobList.items.findAll { it.task.name.contains('searchTerm') }.each { jobList.kill(it.task) }
I've also tried the following:
def jobname = ""
def buildnum = 85
def job = Jenkins.instance.getItemByFullName(jobname)
for (build in job.builds) {
    if (buildnum == build.getNumber().toInteger()){
        if (build.isBuilding()){
            build.doStop();
            build.doKill();
        }
    }
}
Instead of hard-killing jobs, the first script does nothing, while the second throws a NullPointerException:
java.lang.NullPointerException: Cannot get property 'builds' on null object
I managed to get it working; my second example wasn't working because I brainfarted and the job I was testing it on had no builds. :(
def searchTerm = ""
def desiredState = "stop"
// collect every top-level job whose name contains the search term
def matchedJobs = Jenkins.instance.items.findAll { job ->
    job.name.contains(searchTerm)
}
if (desiredState.equals("stop")) {
    matchedJobs.each { job ->
        println "Stopping all current builds of ${job.name}"
        for (build in job.builds) {
            if (build.isBuilding()){
                build.doStop();
                println build.fullDisplayName + " successfully stopped!"
            }
        }
    }
}
After running tests in parallel, I need to send out notifications immediately. Currently, the parallel nodes run, then the node is given up, and the notification step sometimes has to wait for the next available node.
// List of tasks, one for each marker/label type.
def farmTasks = ['ac', 'dc']
// Create a number of agent tasks that matches the marker/label type.
// Deploys the image to the board, then checks out the code on the agent and
// runs the tests against the board.
timestamps {
    stage('Test') {
        def test_tasks = [:]
        for (int i = 0; i < farmTasks.size(); i++) {
            String farmTask = farmTasks[i]
            test_tasks["${farmTask}"] = {
                node("linux && ${farmTask}") {
                    stage("${farmTask}: checkout on ${NODE_NAME}") {
                        // Checkout without clean
                        doCheckout(false)
                    }
                    stage("${farmTask} tests") {
                        <code>
                    }
                } // end of node
            } // end of test_tasks
        } // end of for
        parallel test_tasks
        node('linux') {
            sendMyNotifications();
        }
    } // end of Test stage
} // end of timestamps
Frankly, this code seems totally fine. I have yet to understand why the notification needs to wait for a node (do you have more pipelines that use these agents? are there multiple instances of this pipeline running concurrently?). However, the workaround for this issue is simple:
Set up another agent (it can reside on machines that already host existing agents) and give it a unique label (e.g. notifications) so its sole use will be to send notifications.
It's not perfect, because you get a single point of failure, but it helps remedy the situation while you figure out what causes the "real" agents to be unavailable after the parallel steps.
I have a Jenkins job dedicated to a special node, and I'd like to receive a notification if the job can't run because the node is offline. Is it possible to set this up?
In other words, by default Jenkins waits for the node if the job is started while the node is offline (the 'pending' job status). In that case I want the job to fail (or not start at all) and a 'node offline' mail to be sent.
This node check should happen inside the job, because the job runs rarely and I don't care whether the node is offline when the job doesn't need it. I've tried an external node-watching plugin, but it doesn't do exactly what I want: it sends an email every time the node goes offline, which is redundant in my case.
I found an answer here.
You can add a command-line or PowerShell build step that invokes curl and processes the result:
curl --silent $JENKINS_URL/computer/$JENKINS_NODENAME/api/json
The resulting JSON contains an offline property with a true/false value.
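Processing that result could look like the following minimal sketch, which reads the offline property with Groovy's `JsonSlurper`. `isNodeOffline` is a hypothetical helper name, and the sample string is hand-written for illustration; a real response contains many more fields.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: true if the node's JSON description reports it as offline.
boolean isNodeOffline(String json) {
    def node = new JsonSlurper().parseText(json)
    return node.offline == true
}

// Hand-written sample, not captured from a real Jenkins instance.
def sample = '{"displayName": "special_node", "offline": true}'
if (isNodeOffline(sample)) {
    println "node offline"
}
```

The job could then fail early, or send the 'node offline' mail, based on the returned value.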
I don't think checking whether the node is available can be done inside the job you want to run (e.g. JobX). The check itself, performed for JobX at execution time, would need a job to run, and I don't know of a plugin or configuration option that does this. JobX can't check whether the node is free for JobX.
I use a lot of flow jobs (I'm in the process of converting them to pipeline logic) where JobA triggers JobB. JobA could therefore run on master, check the node for JobB (JobX in your case), and trigger it if the node is up.
JobA would need to be a freestyle job that runs an 'Execute system Groovy script > Groovy command' build step. The Groovy code below is pulled together from a number of working examples, so it is untested as a whole:
import hudson.model.*;
import hudson.AbortException;
import java.util.concurrent.CancellationException;
def allNodes = jenkins.model.Jenkins.instance.nodes
def triggerJob = false
// look for the dedicated node and check that it is online
for (node in allNodes) {
    if ( node.getComputer().isOnline() && node.nodeName == "special_node" ) {
        println node.nodeName + " " + node.getComputer().countBusy() + " " + node.getComputer().getOneOffExecutors().size()
        triggerJob = true
        break
    }
}
if (triggerJob) {
    println("triggering child build as node available")
    def job = Hudson.instance.getJob('JobB')
    def anotherBuild
    try {
        def params = [
            new StringParameterValue('ParamOne', '123'),
        ]
        def future = job.scheduleBuild2(0, new Cause.UpstreamCause(build), new ParametersAction(params))
        // block until the child build has finished
        anotherBuild = future.get()
    } catch (CancellationException x) {
        throw new AbortException("${job.fullDisplayName} aborted.")
    }
} else {
    println("failing parent build as node not available")
    build.getExecutor().interrupt(hudson.model.Result.FAILURE)
    throw new InterruptedException()
}
To get the node offline email, you could just trigger a post build action to send emails on failure.
I have a project set up to run on the MISC label every time it builds, and it had been working great.
However, I've encountered a problem where, if the previous build on one machine fails, it can cause further builds on that machine to fail as well. It would be fine on another slave.
We would like the job to run on a different node in the label, if possible, in case this happens again in the future.
Thanks,
I've run into similar problems. My solution is to take the node offline if certain types of errors happen.
I'm using this plugin to run a Groovy script after every build: https://wiki.jenkins-ci.org/display/JENKINS/Global+Post+Script+Plugin
My script looks like this:
import jenkins.model.Jenkins
import hudson.model.*
import hudson.slaves.OfflineCause
// this script is designed to be called by https://wiki.jenkins-ci.org/display/JENKINS/Global+Post+Script+Plugin
if (BUILD_RESULT == "FAILURE") {
    println("The job failed. The build failure cause will be checked.")
    job = Jenkins.instance.getItemByFullName(JOB_NAME)
    build = job.getBuildByNumber(BUILD_NUMBER.toInteger())
    def buildLog = build.log
    if (buildLog.contains("something indicating an unrecoverable error")) {
        Node buildNode = build.getBuiltOn();
        // Never set master offline
        if (Hudson.getInstance() != buildNode) {
            println("This is fatal. The node ${NODE_NAME} is being taken offline.")
            buildNode.toComputer().setTemporarilyOffline(true, OfflineCause.create(new OfflineMessage()));
        } else {
            println("The error is marked to take the node offline, but the node is not being taken offline because it is the master")
        }
    }
}
class OfflineMessage extends org.jvnet.localizer.Localizable {
    def message
    OfflineMessage() {
        super(null, null, [])
        def timestr = new Date().format("HH:mm dd/MM/yy z", TimeZone.getDefault())
        this.message = "This node was taken offline because of a failed job at " + timestr
    }
    String toString() {
        this.message
    }
    String toString(java.util.Locale l) {
        toString()
    }
}
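The decision made by the script above can be isolated into a small function, which makes it easier to try out without a Jenkins instance. This is only a sketch: `shouldTakeOffline` and the `fatalMarkers` list are hypothetical names, and the marker string is the same placeholder used in the script.

```groovy
// Hypothetical helper: decide whether the node that ran a build should go offline.
// result        - build result as a string, e.g. "FAILURE"
// log           - full build log text
// builtOnMaster - true if the build ran on the master, which is never taken offline
// fatalMarkers  - substrings that indicate an unrecoverable error (placeholders here)
boolean shouldTakeOffline(String result, String log, boolean builtOnMaster, List<String> fatalMarkers) {
    if (result != 'FAILURE' || builtOnMaster) {
        return false
    }
    return fatalMarkers.any { marker -> log.contains(marker) }
}

def markers = ['something indicating an unrecoverable error']
println shouldTakeOffline('FAILURE', 'step 3: something indicating an unrecoverable error', false, markers)
println shouldTakeOffline('FAILURE', 'ordinary compile error', false, markers)
```

Keeping the markers in a list makes it straightforward to add further error signatures later.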