Jenkins Executor Starvation on CloudBees

I have set up jobs using Jenkins on CloudBees, Janky, and Hubot. Hubot and Janky work and are pushing jobs to the Jenkins server.
The job has been sitting in the Jenkins queue for over an hour now. I don't see anywhere to configure the number of executors, and this is a completely default instance from CloudBees.
Is the CloudBees service just taking a while, or is something misconfigured?

This was a problem in early March caused by build containers occasionally failing to start cleanly.
The root cause was a kernel oops that occurred in the build container as it launched.
This has since been resolved, and you should no longer experience long pauses waiting for an executor.
Anything more than 10 minutes is definitely a bug, and typically anything more than about 5 seconds is unusual (although when a lot of jobs are launched simultaneously, the time to get a container can be on the order of 3 minutes).

Related

Update Jenkins during long-running jobs

I want to upgrade my Jenkins master without aborting long-running jobs on the slaves or waiting for them to finish. Is there a plugin available that provides this feature?
We have several build jobs running regression and integration tests which take hours to complete. Often at least one of those jobs is running, making it hard to restart Jenkins after updates. I know that it is possible to block the queue; we tried this, but it hinders more than it helps.
What we are looking for is a plugin that runs jobs on slaves, caches the output as soon as the connection to the master is interrupted, and sends the remaining output to the master when the master is up again. Does anybody know of a plugin providing this feature?

How to restart interrupted Jenkins jobs after a server or node failure/restart?

I'm running a Jenkins server and some slaves on a Docker swarm that's hosted on preemptible Google Compute Engine instances (akin to AWS spot instances). I've got everything set up so that at any given moment there is a Jenkins master running on a single server and slaves running on every other server in the swarm. When one server gets terminated, another is spun up to replace it, and eventually Jenkins is back up and running on another machine even if its server was stopped; slaves get replaced as they die.
I'm facing two problems:
The first is that when the Jenkins master dies and comes back online, it tries to resume the jobs that were previously running, and they end up stuck mid-build. Is there any way to have Jenkins automatically restart jobs that were interrupted instead of trying to resume them?
The second is that when a slave dies, I'd like to automatically restart any jobs that were running on it somewhere else. Is there any way to do that?
Currently I'm dealing with both situations by having an external application retry the failed build jobs, but that's not really optimal.
Thanks!
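For reference, here is a minimal sketch of the kind of "external application retries the failed builds" workaround described above, written as a Jenkins script-console Groovy script. Treating a last build with an ABORTED result as "interrupted", and re-queuing with a zero quiet period, are assumptions rather than anything from the question:

```groovy
// Script-console sketch: re-queue any job whose last build was aborted,
// e.g. because the master or its agent was preempted mid-build.
import jenkins.model.Jenkins
import hudson.model.AbstractProject
import hudson.model.Result

Jenkins.instance.getAllItems(AbstractProject).each { job ->
    def last = job.getLastBuild()
    if (last != null && !last.isBuilding() && last.result == Result.ABORTED) {
        println "Re-scheduling ${job.fullName} (build #${last.number} was aborted)"
        job.scheduleBuild2(0)   // re-queue with no quiet period
    }
}
```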

Jenkins - How to reserve an executor for (a) specific job(s)

We have a Jenkins server with 8 executors and 20 jobs. 15 of those jobs take approximately 2 hours to finish, while the remaining 5 take only 15 minutes. I would like to reserve 1 executor (or 2) to run only those 5 small jobs and restrict the other jobs to the remaining executors. Note: I don't have any slaves, just 8 executors on the master Jenkins process.
I'm new to Jenkins, so I just wonder: is there any way I can do that? Thank you.
As I understand it, Kiddo uses the master with 8 executors. What you can do is add a new slave that runs on the same machine as the master; let's call it slave-master. That is, you will have the master with 6 executors, with usage set to use the node as much as possible, and then slave-master, whose usage is restricted to only the short builds. So on your server you will have two Jenkins processes running: one is the Jenkins master itself, and the other is the slave-master agent.
For info on how to connect slaves, see https://wiki.jenkins-ci.org/display/JENKINS/Distributed+builds
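If it helps, here is a rough script-console sketch of registering such a slave-master node programmatically (the same thing can be done through the Manage Jenkins > Nodes UI). The node name, remote directory, label, and executor count are assumptions:

```groovy
// Run from Manage Jenkins > Script Console.
// Registers an extra agent on the master host, reserved for short jobs.
import jenkins.model.Jenkins
import hudson.model.Node
import hudson.slaves.DumbSlave
import hudson.slaves.JNLPLauncher

def slaveMaster = new DumbSlave(
    'slave-master',                  // node name (assumption)
    '/var/jenkins/slave-master',     // remote root directory (assumption)
    new JNLPLauncher()               // connect as an inbound (JNLP) agent
)
slaveMaster.numExecutors = 2                    // the reserved executors
slaveMaster.labelString  = 'short-jobs'         // label the 5 quick jobs will target
slaveMaster.mode         = Node.Mode.EXCLUSIVE  // only build jobs matching the label
Jenkins.instance.addNode(slaveMaster)
```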
Adding to @StephenKing's answer, you also have to specify the label name for each job when configuring it, under "Restrict where this project can be run" in the job configuration.
I'm a bit late, but I think it would be much easier to restrict how many "slow" jobs can run concurrently than to try to reserve executors. This is simple to do with the Lockable Resources plugin: https://wiki.jenkins.io/display/JENKINS/Lockable+Resources+Plugin
Simply add as many resources as the number of slow jobs you want to allow to run at once (6 or 7) and give them all the same label. Modify the job configurations to lock a resource (by label, with quantity 1) before the job can execute. If all the resources are already locked, the job will wait until one is freed, as sketched below.
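For pipeline jobs, the same idea can be expressed with the plugin's lock step. The slow-slot label and the build step below are assumptions, not taken from the question:

```groovy
// Jenkinsfile sketch for one of the slow jobs, assuming the
// Lockable Resources plugin defines resources labelled 'slow-slot'.
pipeline {
    agent any
    stages {
        stage('Slow regression tests') {
            steps {
                // Wait here until one of the 'slow-slot' resources is free,
                // then hold it for the duration of the block.
                lock(label: 'slow-slot', quantity: 1) {
                    sh './run-regression-tests.sh'   // placeholder build step
                }
            }
        }
    }
}
```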
In the slave configuration, you can set the Usage mode to Only build jobs with label expressions matching this node.
Then, only jobs matching a given label (e.g. job-group-whatever) will be executed on this slave.
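A job then opts in to that node via its label. For a pipeline job this could look like the following (the short-jobs label is hypothetical):

```groovy
// Jenkinsfile sketch: pin this quick job to the reserved agent.
pipeline {
    // Only run on nodes whose label expression matches 'short-jobs'.
    agent { label 'short-jobs' }
    stages {
        stage('Quick build') {
            steps {
                sh './quick-build.sh'   // placeholder 15-minute build step
            }
        }
    }
}
```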
I had the same issue. I installed multiple agents on the same slave machine and it works fine.
Each node's remote directory should be different.
The agents run as Windows services.

One execution per Windows VMware VM as a Jenkins slave?

I am trying to run some automated acceptance tests on a Windows VM but am running into some problems.
Here is what I want: a job which always runs on a freshly reverted VM. This job will get an MSI installer from an upstream job, install it, and then run some automated tests against it, in this case using Robot Framework (but that doesn't really matter here).
I have set up the slave in the vSphere plugin to have only one executor and to disconnect after one execution. On disconnect it shuts down and reverts. My hope was that this meant it would run one Jenkins job and then revert, the next job would get a fresh snapshot, and so on.
The problem is that if a job is queued waiting for the VM slave, it starts as soon as the first job finishes, before the VM has shut down and reverted. The signal to shut down and revert has already been sent, however, so the next job fails almost immediately as the VM shuts down.
Everything works fine as long as jobs needing the VM aren't queued while another is running, but when they are I run into this problem.
Can anyone suggest a way to fix this?
Am I better off using vSphere build steps rather than setting up a build slave in this fashion? If so, how exactly do I go about getting the same workflow to work using build steps and (I assume) pipelined builds?
Thanks
You can set a 'Quiet period'; it's under Advanced Project Options when you configure the job. You should set it on the parent job, and it is the time to wait before executing the dependent job.
If you increase the wait time, the server will go down before the second job starts.
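If you go the quiet-period route with a pipeline job, it can also be declared in the Jenkinsfile. The node label, the 180-second value, and the test step below are assumptions about how long the VM needs to shut down and revert:

```groovy
// Jenkinsfile sketch: give the VM time to shut down and revert
// before this job actually starts on the freshly restored slave.
pipeline {
    agent { label 'vsphere-test-vm' }   // label of the vSphere slave (assumption)
    options {
        quietPeriod(180)                // wait 180 s in the queue before building
    }
    stages {
        stage('Install and test') {
            steps {
                bat 'run-acceptance-tests.bat'   // placeholder test step
            }
        }
    }
}
```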
It turns out the version of the vSphere plugin I was using was outdated; this problem is fixed in the newer version.

Tool for Job scheduling with logs instead of Jenkins (Rails project)

I have more than 30 rake tasks added to Jenkins as scheduled jobs (Rails project).
But the Jenkins server goes down frequently and uses 100% of the CPU most of the time.
Please suggest a better job scheduler than Jenkins, one that is also capable of
doing things like:
Notifying by email when jobs fail
Logging the jobs' terminal output
Adding dependencies between jobs
Your question essentially comes out as "recommend me a CI server".
But why does Jenkins fall over and/or use 100% CPU most of the time? I'd be looking into that first. My experience of Jenkins is that it is pretty stable and has low overhead. If your hardware, OS, or something else is flaky, or simply under-provisioned for the task, then swapping Jenkins out isn't going to fix that.
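For what it's worth, the three listed requirements map onto standard Jenkins features. A minimal pipeline sketch; the schedule, rake task, downstream job name, and email address are assumptions:

```groovy
// Jenkinsfile sketch for one of the scheduled rake tasks.
pipeline {
    agent any
    triggers {
        cron('H 2 * * *')                    // scheduled run (assumption: nightly)
    }
    stages {
        stage('Rake task') {
            steps {
                // Console output of this step is captured in the build log automatically.
                sh 'bundle exec rake some:task'   // placeholder rake task
            }
        }
        stage('Downstream job') {
            steps {
                build job: 'dependent-job'        // simple job dependency (assumed name)
            }
        }
    }
    post {
        failure {
            // Notify by email when the job fails.
            mail to: 'team@example.com',
                 subject: "Job failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                 body: "See ${env.BUILD_URL}console for the full log."
        }
    }
}
```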
