Update jenkins during long running jobs - jenkins

I want to upgrade my jenkins master without aborting or waiting for long running jobs to finish on slaves. Is there a plugin available that provides this feature?
We have several build jobs running regression and integration tests which take hours to run. Often, at least one of those jobs is running, making it hard to restart jenkins after updates. I know, that it is poosible to block the queue. We tried this, but it hinders more than it helps.
What we are looking for is a plugin, that runs jobs on slaves, caches the output as soon as the connection to the master is interrupted and sends the remaining output to the master when the master is up again. Does anybody know a plugin providing this feature.

Related

How to support disconnections or reboots for Jenkins slaves on Windows?

I have many long running jobs that take almost a day to complete. Splitting is not possible. If the network fails then all progress is lost.
How can a slave survive disconnections?
EDIT 1
I have around 300 slaves running in Windows tied to one single Jenkins instance.
Slaves are connected using the manual method java - jar slave.jar -jlnpUrl <serverUrl> <slaveName>. I cannot run them as a regular Windows service because some tests manipulate GUI elements and require a real interactive session otherwise test fail.
EDIT 2
According to Jenkins Cookbook I should be using Cygwin + OpenSSH approach instead of custom script with JLNP-connector. Could this improve stability?
Jenkins was not originally designed for builds to survive across server or slave restarts. There is a CloudBees Long-Running Build plugin that supports long-running builds but, unfortunately, it is available only for enterprise users and still beta.
I didn't find any free alternative and would suggest you to try to improve your network stability and to split your long running jobs. At least you can divide your tests on logical groups (test suites).
Jenkins now has a workflow plugin. It claims to handle "server" restart and loss-of connectivity with slave.
From the link
A key feature of a workflow execution is that it's suspendable. That
is, while the workflow is running your script, you can shut down
Jenkins or lose a connectivity to a slave. When it comes back, Jenkins
will still remember what it was doing, and your workflow script
resumes execution as if it was never interrupted. A technique known as
the "continuation-passing style" execution plays a key role in
achieving this.
(not tested at all)
Edit: Copied from #Jesse Glick's comments :
Workflow is open source and available for anyone running Jenkins 1.580.1+ or later. CloudBees Jenkins Enterprise does include a checkpoint feature, but this is not necessary simply to have a build survive slave disconnections and Jenkins restarts: that is automatic

How to have all jobs of a build be executed exclusively on the same node?

I have a Jenkins server with half a dozen builds. Each of these builds is composed of a parent job that triggers anywhere between 4 and 6 smaller jobs that run in parallel. I use the EC2 plugin to keep the number of active slaves in line with the number of queued builds. Or in other words, slaves are coming and going all the time. Each slave has 7 executors (parent job + max(4, 6)).
It is absolutely crucial that all jobs of a build are executed on the same machine. I also cannot allow any jobs from build A to execute on a machine that has jobs from build B running.
What I'm looking for is a way that prevents Jenkins from using any inactive executors of a node as long as any jobs from a previous build are still active on it.
I've spent the day experimenting with a combination of the Throttle Concurrent Builds Plugin and the NodeLabel Parameter Plugin. Unfortunately, there seems to be a bug somewhere that causes throttled builds to not contribute to the Load Statistics of a slave. This in turn means that throttled build will never trigger Jenkins to spin up additional slaves. Obviously this is totally unacceptable.
You can try and use "This build is parameterized"
and pass the $NODE_NAME as a parameter between the builds and then use it at the "Restrict where this project can be run"

One execution per Windows VMware VM as Jenkins slaves?

I am trying to run some automated acceptance tests on a windows VM but am running into some problems.
Here is what I want, a job which runs on a freshly reverted VM all the time. This job will get an MSI installer from an upstream job, install it, and then run some automated tests on it, in this case using robotframework (but that doesn't really matter in this case)
I have setup the slave in the vSphere plugin to only have one executor and to disconnect after one execution. On disconnect is shutsdown and reverts. My hope was this meant that it would run one Jenkins job and then revert, the next job would get a fresh snapshot, and so would the next and so on.
The problem is if a job is in queue waiting for the VM slave, as soon as the first job finishes the next one starts, before the VM has shutdown and reverted. The signal to shutdown and revert has however been sent, so the next job is almost immedieatly failed as the VM shuts down.
Everything works fine as long as jobs needing the VM aren't queued while another is running, but if they are I run into this problem.
Can anyone suggest a way to fix this?
Am I better off using vSphere build steps rather than setting up a build slave in this fashion, if so how exactly do I go about getting the same workflow to work using buildsteps and (i assume) pipelined builds.
Thanks
You can set a 'Quiet period' - it's in Advanced Project Options when you create a build. You should set it at the parent job, and this is the time to wait before executing the dependent job
If you'll increase the wait time, the server will go down before the second job starts...
Turns out the version of the vSphere plugin I was using was outdated, this bug problem is fixed in the newer version

How reboot Jenkins with running build queue?

Our Jenkins performs massive integration tests. The longer the jenkins is running, the longer the tests need. Thus we restart the Jenkins server every night via cronjob. Meanwhile the build queue is too long to be finished and the currently running job is canceled and a fail. That's ugly. I found the Safe Restart Plugin, but that waits for empty build queues. Ideally I would have a job which could be priorized to also reboot at every wanted time during the day. This job needs to perform the reboot as the safe restart plugin would do if no jobs where left.

Tool for Job scheduling with logs instead of Jenkins (Rails project)

I have more than 30 rake tasks added to Jenkins for scheduling jobs. (Rails project)
But the jenkins server goes down frequently and uses 100% of CPU at most of the time.
Please suggest me a better job scheduler instead of Jenkins, which is also capable of
doing steps like
Notify an email when jobs fail
Log the jobs terminal output
Add dependency to jobs
Your question seems to come out as "recommend me a CI server".
But - why does Jenkins fall over and/or use 100% CPU most of the time? I'd be looking at why this is. My experience of Jenkins is that it is pretty stable and low overhead. If your hardware / OS / something else is flaky or just under provisioned for the task then swapping Jenkins out isn't going to fix that.

Resources