Kubernetes terminates Jenkins slaves before running the pipeline

I configured a local K8s cluster with a Jenkins instance; the idea is that Jenkins will run its slaves within that cluster. I configured the Kubernetes plugin in Jenkins and added a test pod template whose container uses the default Jenkins slave image. However, K8s seems to kill the pod before the pipeline even runs. Has anyone run into this and can help?
May 21, 2022 3:18:05 PM INFO hudson.slaves.NodeProvisioner update
kube-agent-px18m provisioning successfully completed. We have now 2 computer(s)
May 21, 2022 3:18:05 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Created Pod: kubernetes jenkins/kube-agent-px18m
May 21, 2022 3:18:07 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Pod is running: kubernetes jenkins/kube-agent-px18m
May 21, 2022 3:18:17 PM INFO org.csanchez.jenkins.plugins.kubernetes.pod.retention.Reaper$TerminateAgentOnContainerTerminated lambda$onEvent$1
jenkins/kube-agent-px18m Container jnlp was just terminated, so removing the corresponding Jenkins agent
May 21, 2022 3:18:17 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
Terminating Kubernetes instance for agent kube-agent-px18m
May 21, 2022 3:18:17 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave deleteSlavePod
Terminated Kubernetes instance for agent jenkins/kube-agent-px18m
May 21, 2022 3:18:17 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
Disconnected computer kube-agent-px18m
May 21, 2022 3:18:17 PM WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Error in provisioning; agent=KubernetesSlave name: kube-agent-px18m, template=PodTemplate{id='d3170ee8-ea06-474e-8e52-a2275d90346a', name='kube-agent', namespace='jenkins', idleMinutes=5, label='kubeagent', containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave', workingDir='/home/jenkins', command='', args='', resourceRequestCpu='', resourceRequestMemory='', resourceRequestEphemeralStorage='', resourceLimitCpu='', resourceLimitMemory='', resourceLimitEphemeralStorage='', livenessProbe=ContainerLivenessProbe{execArgs='', timeoutSeconds=0, initialDelaySeconds=0, failureThreshold=0, periodSeconds=0, successThreshold=0}}]}
java.lang.IllegalStateException: Node was deleted, computer is null
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:193)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:298)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
The job keeps waiting for an executor while K8s loops, creating a new slave pod and then terminating it.

The problem was solved. I had forgotten to open port 50000 in Kubernetes so that the master and slave could communicate.
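A quick way to confirm this kind of problem is to test whether the agent port accepts plain TCP connections from where the slave pod runs. The following is a minimal Python sketch, not part of Jenkins or the Kubernetes plugin; the function name `can_connect` is a hypothetical helper, and port 50000 is the agent port mentioned above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from inside the cluster, a call such as `can_connect("<jenkins-service-host>", 50000)` returning False would suggest that the port is not exposed or is blocked; the host placeholder must be replaced with the address of your own Jenkins service.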

Related

Jenkins framework continuously registers and unregisters in Mesos

I have an internal Jenkins server which I am trying to tie in to a Mesos environment we have running here in our office; however, I'm having a hard time keeping the Jenkins framework registered in Mesos:
Mesos Logs:
I0423 10:17:12.927397 23107 master.cpp:2737] Received SUBSCRIBE call for framework 'Jenkins Scheduler' at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889
I0423 10:17:12.927868 23107 master.cpp:2813] Subscribing framework Jenkins Scheduler with checkpointing disabled and capabilities [ ]
E0423 10:17:12.928674 23110 process.cpp:2426] Failed to shutdown socket with fd 51: Transport endpoint is not connected
I0423 10:17:12.929549 23105 master.cpp:1381] Framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889 disconnected
I0423 10:17:12.929616 23105 master.cpp:3081] Deactivating framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889
I0423 10:17:12.929649 23105 master.cpp:3058] Disconnecting framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889
I0423 10:17:12.929675 23105 master.cpp:1396] Giving framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889 0ns to failover
E0423 10:17:12.929584 23110 process.cpp:2426] Failed to shutdown socket with fd 51: Transport endpoint is not connected
I0423 10:17:12.928707 23108 hierarchical.cpp:286] Added framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973
I0423 10:17:12.930029 23108 hierarchical.cpp:415] Deactivated framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973
I0423 10:17:12.930209 23103 master.cpp:6832] Framework failover timeout, removing framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889
I0423 10:17:12.930279 23103 master.cpp:7717] Removing framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973 (Jenkins Scheduler) at scheduler-5be6bf9c-7ebb-484e-80c4-f7e60b3400d6#127.0.1.1:33889
I0423 10:17:12.933053 23108 hierarchical.cpp:362] Removed framework f3f7a58b-d7a5-4336-bc7e-69500d29c3ff-6973
I've tried setting the Slave Username to both root and jenkins explicitly without success. I've also added the LIBPROCESS_IP to my /etc/default/jenkins file.
Jenkins Logs:
Apr 24, 2018 9:08:31 AM org.jenkinsci.plugins.mesos.Mesos getInstance
INFO: Adding a new cloud with unique cloud ID: d761feab-44ad-47e2-aa54-aadc9e933cec
Apr 24, 2018 9:08:33 AM org.jenkinsci.plugins.mesos.MesosCloud restartMesos
INFO: Mesos master changed from 'null' to '10.0.x.x:5050'
Apr 24, 2018 9:08:33 AM org.jenkinsci.plugins.mesos.JenkinsScheduler <init>
INFO: JenkinsScheduler instantiated with jenkins http://10.0.x.x:8080 and mesos 10.0.x.x:5050
Apr 24, 2018 9:08:34 AM org.jenkinsci.plugins.mesos.JenkinsScheduler init
INFO: Initializing the Mesos driver with options
Framework Name: Jenkins Scheduler
Principal: jenkins
Checkpointing: false
Apr 24, 2018 9:08:34 AM org.jenkinsci.plugins.mesos.MesosCloud provision
INFO: Provisioning Jenkins Slave on Mesos with 1 executors. Remaining excess workload: 0 executors)
INFO: Started provisioning MesosCloud from MesosCloud with 1 executors. Remaining excess workload: 0
I0424 09:08:34.068917 13193 sched.cpp:232] Version: 1.5.0
I0424 09:08:34.075914 13189 sched.cpp:336] New master detected at master#10.0.x.x:5050
I0424 09:08:34.076942 13189 sched.cpp:351] No credentials provided. Attempting to register without authentication
Apr 24, 2018 9:08:41 AM org.jenkinsci.plugins.mesos.MesosComputerLauncher launch
INFO: Sending a request to start jenkins slave mesos-jenkins-bdfb011dfa074c809dfd98535bec30db-mesos
INFO: MesosCloud provisioning successfully completed. We have now 3 computer(s)
Apr 24, 2018 9:09:00 AM org.jenkinsci.plugins.mesos.MesosWorkspaceBrowser getWorkspace
INFO: Nodes went offline. Hence fetching it through master
Jenkins version: 2.117
Mesos plugin version: 0.16
Mesos version: 1.2.0
Let me know if anything stands out to you. Thanks!
I'm not sure exactly which change fixed it, but I found a version mismatch between the Mesos version on the Jenkins master and the one on the Mesos master. I corrected that, but I had also replaced the 127.0.1.1 entry in the /etc/hosts file on the Jenkins master with the actual IP and added the LIBPROCESS_IP environment variable.
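The Mesos log in the question shows the scheduler registering from 127.0.1.1:33889, which fits the /etc/hosts fix described above: if the Jenkins host resolves its own name to a loopback address, the Mesos master cannot connect back to it. Here is a minimal Python sketch of that check, assuming the scheduler picks its address by resolving the local hostname when LIBPROCESS_IP is not set. Both helper names are hypothetical.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the IP address lies in a loopback range (e.g. 127.0.0.0/8)."""
    return ipaddress.ip_address(address).is_loopback


def advertised_address() -> str:
    """Resolve this machine's own hostname via /etc/hosts or DNS."""
    return socket.gethostbyname(socket.gethostname())
```

On the Jenkins master, `is_loopback(advertised_address())` returning True would point at an entry like 127.0.1.1 in /etc/hosts.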

Jenkins slave cannot connect with master: Incorrect acknowledgement sequence

After updating our Jenkins master installation to the latest LTS version, 2.46.3, one of its slaves (a 32-bit Windows 7 machine) can no longer connect to the master.
The error we're getting is:
java -jar slave.jar -jnlpUrl https://<jenkins-name>/computer/<node-name>/slave-agent.jnlp -secret <secret-value>
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up slave: node-name
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://<jenkins-name>/]
Jun 22, 2017 1:19:05 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP3-connect, JNLP-connect, CLI2-connect, Ping, CLI-connect, JNLP4-connect, JNLP2-connect]
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
Agent address: <jenkins-name>
Agent port: <jenkins-port>
Identity: <id:en:ti:ty>
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to <jenkins-name>:9150
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jun 22, 2017 1:19:05 PM org.jenkinsci.remoting.protocol.impl.AckFilterLayer abort
WARNING: [JNLP4-connect connection to <our-proxy>/10.253.0.11:81] Incorrect acknowledgement sequence, expected 0x0003414333 got 0x4854545044
Jun 22, 2017 1:19:05 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP4-connect encountered an unexpected exception
java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
at hudson.remoting.Engine.innerRun(Engine.java:385)
at hudson.remoting.Engine.run(Engine.java:287)
Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecvClosed(AckFilterLayer.java:280)
at org.jenkinsci.remoting.protocol.FilterLayer.abort(FilterLayer.java:164)
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.abort(AckFilterLayer.java:130)
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecv(AckFilterLayer.java:258)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:669)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$2200(BIONetworkLayer.java:48)
We spent a lot of time trying to fix the problem, unfortunately without success.
Do you have any idea what could have caused the problem and how it can be solved?
We recently hit this issue with our AWS-based Jenkins using JNLP for remote integration testing. The remote slave would call back to the Jenkins master, which failed with a similar error. The issue ended up being a dynamically generated AWS ELB of type HTTP (because the Kubernetes ELB provisioner presently doesn't support multi-protocol ELBs) for the Jenkins Master. We had to manually change the JNLP ingress port type of the ELB to TCP, while the web interface ingress 'instance port' was protocol HTTP and 'load balancer' was protocol HTTPS.
Is the Jenkins master instance running behind a load balancer? I had the same issue when my instance was running behind an Application Load Balancer in AWS.
If so, the acknowledgement sequence can be altered when the load balancer handles the agent connection with a different protocol (for example HTTP instead of plain TCP). JNLP requires a TCP connection, on port 50000 by default.
If your setup is on AWS, you could try creating a private hosted zone in Route53 with an A record pointing at your Jenkins instance's private IP address.
For example: jenkins.example.com -> your Jenkins instance's private IP
Then, in Jenkins UI -> Manage Jenkins -> Configure System -> Manage nodes and clouds -> Configure clouds -> (under advanced settings)
Tunnel connection through : jenkins.example.com:50000
This way your slave agents do not have to go through the load balancer to connect to the Jenkins master.
I hit this kind of problem on GCP with the Jenkins master behind a load balancer; the fix was almost the same as in Sidharth Ramesh's reply.
In Manage Jenkins -> Configure Global Security, in the 'agent' part, you must configure a specific port; never choose random. I chose 50222 as an example.
Below that are the agent protocols: there is a checkbox for "Inbound TCP Agent Protocol/4 (TLS encryption)", which must be enabled. If it is not, you get the error message: "server reports protocol jnlp4-connect not supported skipping".
Open the firewall for that port from the Jenkins slave to the Jenkins master VM's internal IP.
enjoy
Check that the node's secret key is intact. If it is not, download slave.jar again and run the agent from the command line with the new jar file:
java -jar slave.jar -jnlpUrl http://<ipaddress>:8080/computer/<computername>/slave-agent.jnlp -secret 340d54sdrgtjj334kelkahsdjkf83f1c5120dc2fb74939fcdb7f05e1926049f8d7991
Also check that the installed Java version is newer than 7.
This happened to us when a Windows Update or some other silent background update messed with the slave's environment variables. HTTPS_PROXY and HTTP_PROXY had to be re-added, and once that was done we were back in business.
The message:
Incorrect acknowledgement sequence ...
happened for me when I had incorrectly configured a value for the Java property hudson.TcpSlaveAgentListener.port as the same port number as the HTTP port used by Jenkins. The TcpSlaveAgentListener javadoc indicates that is a misconfiguration when it says:
Aside from the HTTP endpoint, Jenkins runs TcpSlaveAgentListener that listens on a TCP socket. Historically this was used for inbound connection from agents (hence the name), but over time it was extended and made generic, so that multiple protocols of different purposes can co-exist on the same socket. (emphasis added)
If the HTTP port was 8080 and the hudson.TcpSlaveAgentListener.port was also 8080, then my JNLP agents failed to connect. As soon as I assigned another value to hudson.TcpSlaveAgentListener.port (like 50000) and restarted Jenkins, my JNLP agents were able to connect.
The stack trace on the failing JNLP agent was:
INFO: Trying protocol: JNLP4-connect
Mar 02, 2019 3:49:29 PM org.jenkinsci.remoting.protocol.impl.AckFilterLayer abort
WARNING: [JNLP4-connect connection to agent.example.com/172.16.16.113:8080] Incorrect acknowledgement sequence, expected 0x000341434b got 0x485454502f
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP4-connect encountered an unexpected exception
java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
at hudson.remoting.Engine.innerRun(Engine.java:614)
at hudson.remoting.Engine.run(Engine.java:474)
Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecvClosed(AckFilterLayer.java:280)
at org.jenkinsci.remoting.protocol.FilterLayer.abort(FilterLayer.java:164)
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.abort(AckFilterLayer.java:130)
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecv(AckFilterLayer.java:258)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:668)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$2200(BIONetworkLayer.java:48)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:283)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
at java.lang.Thread.run(Unknown Source)
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to testing-a.markwaite.net:8080
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP4-plaintext not supported, skipping
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP3-connect not supported, skipping
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP2-connect not supported, skipping
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP-connect not supported, skipping
Mar 02, 2019 3:49:29 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: None of the protocols were accepted
java.lang.Exception: The server rejected the connection: None of the protocols were accepted
at hudson.remoting.Engine.onConnectionRejected(Engine.java:682)
at hudson.remoting.Engine.innerRun(Engine.java:639)
at hudson.remoting.Engine.run(Engine.java:474)
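The "expected" and "got" values in the warning are easier to interpret once they are decoded as bytes. Here is a minimal Python sketch that decodes hex values copied from the logs above. The helper names `decode_ack` and `looks_like_http` are hypothetical and not part of Jenkins remoting.

```python
def decode_ack(hex_value: str) -> bytes:
    """Turn a value such as '0x485454502f' from the log into raw bytes."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    return bytes.fromhex(digits)


def looks_like_http(received: bytes) -> bool:
    """Return True if the received bytes start like an HTTP response."""
    return received.startswith(b"HTTP")


# Values copied from the warnings in this thread.
print(decode_ack("0x000341434b"))   # expected value in the 2019 log
print(decode_ack("0x485454502f"))   # received value in the 2019 log
print(looks_like_http(decode_ack("0x4854545044")))  # received value in the 2017 log
```

The expected bytes contain the ASCII text "ACK", while both received values begin with "HTTP". That is consistent with the answers above: the agent reached something that speaks HTTP (the Jenkins web port, a proxy, or an HTTP load balancer) rather than the TCP agent listener.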
I had this issue and found a solution.
I have a Jenkins master deployed via the Jenkins Helm chart on EKS,
exposed through an ingress controller
that sits behind an ALB.
I tried to connect an inbound agent node and got the error above.
Solution in short: tick the "Use WebSocket" option in the agent node config
(Manage Jenkins ==> Manage nodes and clouds ==> choose your inbound agent node ==> Configure ==> tick "Use WebSocket").
After I did that, the agent could connect and the error was gone.
I think this is the most elegant solution. I believe it is also more secure, because you don't need to keep TCP port 50000 open; you can keep using the main Jenkins port (usually 443, I guess).
Note: you do need to make sure that the agent has access to the main Jenkins port (usually 443 or 80).
This is how I found the solution.
I found this:
https://docs.cloudbees.com/docs/cloudbees-ci/latest/cloud-setup-guide/configure-ports-jnlp-agents
which led me to this:
https://github.com/jenkinsci/jep/blob/master/jep/222/README.adoc
They explain there that when you expose Jenkins through a load balancer, you are better off using the WebSocket option (and even if you don't, WebSocket is still preferable, because it is more secure than JNLP with its extra TCP port).

Jenkins slave not working on mesos

I'm using the jenkins mesos plugin for CI.
Initially, I followed the following tutorial: http://www.ebaytechblog.com/2014/05/12/delivering-ebays-ci-solution-with-apache-mesos-part-ii/
but Jenkins itself was not set up by following it. (I got an error that the config.xml file could not be loaded, even though there was one.)
Then I followed https://rogerignazio.com/blog/scaling-jenkins-mesos-marathon/ and was able to run the Jenkins master (Jenkins framework/scheduler), but when I define the scripts to run, the Jenkins slaves are not created. I think I'm missing some slave configuration. Can you tell me why the slaves are not being created to run jobs?
On the Jenkins build page, I'm getting:
(pending—Waiting for next available executor)
And in the Jenkins logs, I'm getting the following error:
INFO: Provisioning Jenkins Slave on Mesos with 1 executors. Remaining excess workload: 0 executors)
Jun 19, 2015 4:02:55 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning MesosCloud from MesosCloud with 1 executors. Remaining excess workload: 0
Jun 19, 2015 4:02:55 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher <init>
INFO: Constructing MesosComputerLauncher
Jun 19, 2015 4:02:55 PM org.jenkinsci.plugins.mesos.MesosSlave <init>
INFO: Constructing Mesos slave mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258 from cloud
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher launch
INFO: Launching slave computer mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher launch
INFO: Sending a request to start jenkins slave mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.JenkinsScheduler requestJenkinsSlave
INFO: Enqueuing jenkins slave request
Jun 19, 2015 4:03:05 PM hudson.slaves.NodeProvisioner update
INFO: MesosCloud provisioning successfully completed. We have now 2 computer(s)
java.lang.NullPointerException
at org.jenkinsci.plugins.mesos.JenkinsScheduler.matches(JenkinsScheduler.java:306)
at org.jenkinsci.plugins.mesos.JenkinsScheduler.resourceOffers(JenkinsScheduler.java:252)
Jun 19, 2015 4:03:06 PM org.jenkinsci.plugins.mesos.JenkinsScheduler$1 run
SEVERE: The Mesos driver was aborted! Status code: 3
Edit: I think I'm getting the error because I haven't defined any container port mappings.
Can anyone tell me how to do so?
Update: There were actually many problems with version 0.7 of the Mesos plugin, so I simply downgraded to version 0.6.
For port mappings on marathon have a look here.
Hope this helps!

Openshift Jenkins does not appear to send Node label during build

I'm trying to get HA Openshift Origin working on CentOS 6.5 (Nightly packages, but may be a few days out) but one of the last things to get working is Jenkins.
When I start a build of an application, manually or after a git push, I get the following error:
Jun 06, 2014 2:24:52 PM hudson.plugins.openshift.OpenShiftCloud provision
INFO: Provisioning new node for workload = 2 and label = null in domain stu
Jun 06, 2014 2:24:52 PM hudson.plugins.openshift.OpenShiftCloud provision
INFO: Cancelling build - Label is null
Jun 06, 2014 2:24:52 PM hudson.plugins.openshift.OpenShiftCloud cancelBuild
INFO: Cancelling build
Jun 06, 2014 2:24:52 PM hudson.plugins.openshift.OpenShiftCloud cancelItem
INFO: Cancelling Item
Jun 06, 2014 2:24:52 PM hudson.plugins.openshift.OpenShiftCloud cancelItem
WARNING: Build null rawbldr has been canceled
Jun 06, 2014 2:24:52 PM hudson.triggers.SafeTimerTask run
SEVERE: Timer task hudson.slaves.NodeProvisioner$NodeProvisionerInvoker#f01ba81 failed
java.lang.UnsupportedOperationException: No Label
at hudson.plugins.openshift.OpenShiftCloud.provision(OpenShiftCloud.java:402)
at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:281)
at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:51)
at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:366)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Looking at the code of the Jenkins plugin: https://github.com/openshift/jenkins-cloud-plugin/blob/master/src/main/java/hudson/plugins/openshift/OpenShiftCloud.java#L353 it just looks like the value of the label set in the job config isn't received for some odd reason, so a builder gear doesn't get spun up.
This is very similar to this old question from Openshift Online, but the comments there don't make the actual cause clear, other than "maintenance":
Cant build on openshift jenkins
Everything else that I've tried appears to work fine, so I'm not sure if it's a bug, or misconfiguration somewhere.
Thanks
Openshift-origin nightly packages
Jenkins 1.564 (1.564-1.1)
openshift-origin-cartridge-jenkins (1.22.1-1.git.0.3f73f10.el6)
openshift-origin-cartridge-jenkins-client (1.21.1-1.git.0.93d6489.el6)
Openshift Jenkins cloud plugin 1.4 (0.6.36-0.el6oso)
I have replicated the issue in a vagrant machine, so am currently assuming it's the combination of packages I'm running.
Could someone running from the nightly repos please specify which package versions of each of the above they have running without issue? Thanks
I got the same issue (that is how I got here), and the workaround for me was to go to Manage Jenkins > Configure system and then set the "# of executors" field to 1.
I've tried a few different versions of Jenkins with the latest Openshift cloud plugin
1.510 - works, but a bit old
1.554 - works if you set JENKINS_JAR_CACHE_PATH env var (see https://github.com/openshift/jenkins-cloud-plugin/issues/30)
1.564 - hits the above issue, doesn't ever spin up a gear let alone start the Jenkins slave
I'm currently running Jenkins v1.554 and setting an env var with the following Puppet
file { '/etc/openshift/env/JENKINS_JAR_CACHE_PATH':
  ensure  => present,
  content => '/tmp/',
  require => File['/etc/openshift/env/'],
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
}
The Openshift guys will apparently be using v1.554 by default in the near future anyway.

Jenkins on OpenShift: "Cartridge for not found"

I tried to create a very simple Jenkins setup on openshift. Here are the steps I followed:
1. Create a new openshift app using the "Jenkins Server" cartridge.
2. Log in to the new Jenkins server using the supplied username and password.
3. Create a very simple freestyle build with a shell build step that echoes some text.
4. Run the build
The new build appears briefly in the jenkins server UI and then disappears, so I checked the log in the jenkins server app to find some error messages.
May 29, 2014 4:55:41 PM hudson.plugins.openshift.OpenShiftCloud reloadConfig
INFO: Reload ResponseCode: 200
May 29, 2014 4:55:41 PM hudson.plugins.openshift.OpenShiftCloud reloadConfig
INFO: Config reload result:
May 29, 2014 4:55:41 PM hudson.plugins.openshift.OpenShiftSlave <init>
INFO: Creating slave with 15mins time-to-live
May 29, 2014 4:55:41 PM hudson.plugins.openshift.OpenShiftCloud provision
WARNING: Caught com.openshift.client.OpenShiftException: Cartridge for not found. Will retry 0 more times before canceling build.
May 29, 2014 4:55:46 PM hudson.plugins.openshift.OpenShiftCloud provision
WARNING: Cancelling build due to earlier exceptions
com.openshift.client.OpenShiftException: Cartridge for not found
at hudson.plugins.openshift.OpenShiftSlave.getCartridge(OpenShiftSlave.java:129)
at hudson.plugins.openshift.OpenShiftSlave.createApp(OpenShiftSlave.java:262)
at hudson.plugins.openshift.OpenShiftSlave.provision(OpenShiftSlave.java:253)
at hudson.plugins.openshift.OpenShiftCloud.provisionSlave(OpenShiftCloud.java:489)
at hudson.plugins.openshift.OpenShiftCloud.provision(OpenShiftCloud.java:413)
at hudson.slaves.NodeProvisioner.update(NodeProvisioner.java:264)
at hudson.slaves.NodeProvisioner.access$000(NodeProvisioner.java:51)
at hudson.slaves.NodeProvisioner$NodeProvisionerInvoker.doRun(NodeProvisioner.java:347)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
May 29, 2014 4:55:46 PM hudson.plugins.openshift.OpenShiftCloud cancelItem
INFO: Cancelling Item
It appears that some configuration is missing that would tell openshift which cartridge to use for the new slave, but I'm not sure where to configure this. Any help is greatly appreciated, thanks!
I am actually having the same issue.
If your code is configured with rhc already, run:
$ rhc cartridge add jenkins-client-1 -a jboss1
As per: https://www.openshift.com/developers/jenkins

Resources