We have a Jenkins project. Use case:
Jenkins triggers the build
the slave agent builds the application
the server hosting the slave agent reboots (for any reason: a power problem, somebody rebooted it, a resource shortage, and so on)
After that, Jenkins reports a failed build. How can we automatically relaunch the application build in Jenkins once the slave agent has recovered from the failure?
There are two aspects to this issue -
The Jenkins server needs to reschedule the build that failed (when the slave machine crashed).
Install the Naginator Plugin
Set it to rebuild whatever job you have set on the problematic slave
The Jenkins slave needs to restart automatically as soon as its host is up again.
On Windows, for example, you need to set it up as a service that starts automatically.
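If the agent is already installed as a Windows service (the service name below is a placeholder; check yours with "sc query"), a sketch of making it start automatically and restart itself after failures, run from an elevated command prompt:

    sc config "jenkinsslave" start= auto
    sc failure "jenkinsslave" reset= 86400 actions= restart/60000/restart/60000/restart/60000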
Note that the Naginator Plugin doesn't know what caused the build to fail,
so it will try to rebuild any build that fails.
To solve this, scan the log for an indication that the slave crashed
and set a regular expression (in the Naginator) to catch it.
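For example, Naginator's option to retry only when the build log matches a regular expression could be pointed at the disconnect messages you find in your own failed-build logs. The patterns below are assumptions, not universal; confirm them against your logs first:

    .*(Slave went offline during the build|hudson\.remoting\.ChannelClosedException).*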
Cheers
The on-demand slaves are being created successfully from Jenkins. The first build on the slave is successful, but the subsequent builds fail. Restarting the slave or restarting the WinRM service allows builds to proceed again.
A tcpdump shows no errors. I can't figure out what the issue is. It looks like an issue with Jenkins communicating with the on-demand slaves over WinRM.
Has anybody faced a similar issue?
The on-demand slaves are Windows slaves.
The issue was with the "MaxMemoryPerShellMB" parameter of WinRM. This was set too low, so when npm or git was doing a checkout it was running out of memory in the WinRM shell.
I have increased it to 1 GB and it's working fine now.
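For reference, a sketch of how that limit can be raised from an elevated cmd.exe prompt on the slave (1024 MB mirrors the 1 GB mentioned above; PowerShell needs extra quoting around the @{...} part):

    winrm set winrm/config/winrs @{MaxMemoryPerShellMB="1024"}
    winrm get winrm/config/winrs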
My Jenkins master is up and running. I have created a slave node, launched it successfully from the slave machine, and have done the web services installation so that the connection is established on startup of the slave machine. I have also created a "job" that builds successfully in Jenkins.
How do I tell Jenkins what to actually do on my slave machine? I want to use Jenkins to run an IntelliJ test suite (Selenium and Cucumber) on the slave machine, but haven't been able to figure out exactly how to get it to do this. Note: I've just started looking into the Seleniumhq plug-in, but I'm not sure if this is what I need or not since I'm working with a remote slave.
Limit where the job can run by using the 'Restrict where this project can be run' option and pointing it at your slave node.
Distributed Builds in Jenkins
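For a freestyle job that option is just a label expression (the node's name, or a label you assigned to it). If you later move to a Workflow/Pipeline script, the equivalent is pinning the steps to a label; a minimal sketch, assuming a node labelled 'windows-slave':

    node('windows-slave') {
        // everything in this block runs in a workspace on the matching slave
        bat 'mvn clean test'
    }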
My confusion here stemmed from not having my project connected to a VCS repository. Without it, I couldn't figure out how to build out my project's workspace in the slave environment from Jenkins. I also didn't understand the concept of adding additional build steps at the time I asked this question.
Once I had the VCS connection set up (I had to do some finagling with Git/Visual Studio Team Services to get it connected, which is why I went with "none" as my version control option at first), my workspace was built for me on the slave machine when I built the project from Jenkins. Then, I used a combination of build steps ("Execute Windows batch command" and "Invoke top-level Maven targets") to carry out the rest of the project's functions.
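As an illustration (the Maven goals and the test class name are placeholders for a Selenium/Cucumber setup, not taken from the question), an "Execute Windows batch command" step that runs the suite from the workspace checked out on the slave might contain:

    rem runs in the job's workspace on the slave node
    call mvn clean test -Dtest=RunCucumberTest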
We recently tried moving our Windows Jenkins slaves to run as a service instead of just running the slave agent jnlp file.
According to the Mercurial Plugin (https://wiki.jenkins-ci.org/display/JENKINS/Mercurial+Plugin),
The default installation runs the Windows service with the "local system" account, which does not seem to have enough privileges for hg to execute, so you could try running the Jenkins service with the same account as TortoiseHG, which will allow it to complete.
This we did, and it worked. For a while.
But sometimes, after there was a disconnect between the Jenkins slave and master, it would stop working. Jenkins would call Mercurial and it would hang, just like it would do if the service were running with the "local system" account.
I could sometimes get it to start working again by restarting the Jenkins service on the slave. But sometimes I'd have to go back in and reset the service to run with an elevated account.
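(That manual reset amounts to something like the following, run from an elevated command prompt; the service name, account, and password are placeholders:)

    sc config "jenkinsslave" obj= ".\builduser" password= "secret"
    net stop "jenkinsslave"
    net start "jenkinsslave"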
Has anybody else experienced anything like this? Is there any way to keep the Jenkins service running with elevated privileges?
On some occasions a runtime error causes a Jenkins Workflow build to crash, but Jenkins still sees it as a running build. Aborting the job is not possible in the Jenkins GUI. How can I abort or delete such a build?
I restart Jenkins. I'm not happy with it, though.
You can disconnect a slave node even when it's running a build. Once you reconnect the slave, there should be no jobs running on it.
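A minimal Script Console sketch of that disconnect/reconnect, assuming the node is named 'my-slave' (the name is a placeholder):

    import jenkins.model.Jenkins

    def computer = Jenkins.instance.getComputer('my-slave')
    computer.disconnect(null)   // drops the agent; the stuck build is abandoned
    // later, to bring the node back online:
    computer.connect(true)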
See JENKINS-25550 for the current workaround for this class of bug.
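One commonly posted Script Console approach for such stuck Workflow builds (check the ticket for the variant matching your plugin version; the job name and build number below are placeholders):

    import jenkins.model.Jenkins

    def job = Jenkins.instance.getItemByFullName('my-pipeline')
    def build = job.getBuildByNumber(42)
    build.finish(hudson.model.Result.ABORTED, new java.io.IOException('Aborting stuck build'))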
I am trying to run some automated acceptance tests on a Windows VM but am running into some problems.
Here is what I want: a job which always runs on a freshly reverted VM. This job will get an MSI installer from an upstream job, install it, and then run some automated tests on it, in this case using Robot Framework (but that doesn't really matter here).
I have set up the slave in the vSphere plugin to have only one executor and to disconnect after one execution. On disconnect, it shuts down and reverts. My hope was that this meant it would run one Jenkins job and then revert, the next job would get a fresh snapshot, and so would the next, and so on.
The problem is that if a job is queued waiting for the VM slave, the next one starts as soon as the first job finishes, before the VM has shut down and reverted. The signal to shut down and revert has already been sent, however, so the next job fails almost immediately as the VM shuts down.
Everything works fine as long as jobs needing the VM aren't queued while another is running, but if they are I run into this problem.
Can anyone suggest a way to fix this?
Am I better off using vSphere build steps rather than setting up a build slave in this fashion? If so, how exactly do I go about getting the same workflow to work using build steps and (I assume) pipelined builds?
Thanks
You can set a 'Quiet period' - it's in the Advanced Project Options when you configure the job. You should set it on the parent job; it is the time to wait before executing the dependent job.
If you increase the wait time enough, the server will have gone down before the second job starts...
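If you keep job configurations as XML, the same setting is stored as a quietPeriod element in the project's config.xml (120 seconds here is just an example value):

    <quietPeriod>120</quietPeriod>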
It turns out the version of the vSphere plugin I was using was outdated; this bug is fixed in the newer version.