I am trying to run some automated acceptance tests on a Windows VM but am running into some problems.
Here is what I want: a job that always runs on a freshly reverted VM. This job will get an MSI installer from an upstream job, install it, and then run some automated tests on it, in this case using Robot Framework (but that doesn't really matter here).
I have set up the slave in the vSphere plugin to have only one executor and to disconnect after one execution. On disconnect it shuts down and reverts. My hope was that this meant it would run one Jenkins job and then revert, so the next job would get a fresh snapshot, and so would the next, and so on.
The problem is that if a job is in the queue waiting for the VM slave, the next job starts as soon as the first one finishes, before the VM has shut down and reverted. The signal to shut down and revert has already been sent, however, so the next job fails almost immediately as the VM shuts down.
Everything works fine as long as jobs needing the VM aren't queued while another is running, but if they are I run into this problem.
Can anyone suggest a way to fix this?
Am I better off using vSphere build steps rather than setting up a build slave in this fashion? If so, how exactly do I go about getting the same workflow to work using build steps and (I assume) pipelined builds?
Thanks
You can set a 'Quiet period'; it's under Advanced Project Options when you configure a job. Set it on the parent job: it is the time to wait before executing the dependent job.
If you increase the wait time, the server will go down before the second job starts.
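For builds triggered through the Jenkins remote API, a comparable delay can be requested per build. This is a minimal sketch, assuming the `delay` query parameter of the `build` endpoint is available on your Jenkins version; the server URL and job name are hypothetical placeholders, and the function only builds the URL without contacting any server.

```python
from urllib.parse import quote


def build_trigger_url(base_url, job_name, delay_seconds):
    """Return the URL that asks Jenkins to queue a build after a delay."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")
    # Percent-encode the job name so spaces and slashes are safe in the path.
    job = quote(job_name, safe="")
    return f"{base_url.rstrip('/')}/job/{job}/build?delay={delay_seconds}sec"


# Hypothetical server and job name, for illustration only.
print(build_trigger_url("http://jenkins.example.com/", "acceptance tests", 120))
```

The delay has to be long enough for the VM to shut down and revert, which depends on your environment.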
It turns out the version of the vSphere plugin I was using was outdated; this bug is fixed in the newer version.
Related
I have Jenkins running as a service and a job that executes UFT tests on a remote slave. As part of the pipeline I am required to uninstall our product, restart the slave, install the product (latest version) and start the test execution.
Since UFT tests need a dedicated UI, I am trying to launch an mstsc connection to the test VM from a temp VM. But since Jenkins is running as a service, the mstsc process runs as a background process on the temp VM. Because of this, the UFT tests don't get a dedicated UI and some of them fail.
I tried running Jenkins from the WAR file instead of as a service, but after 30-40 minutes or so the master-slave connection drops.
Any workaround / tweak would be appreciated.
You need to run your Jenkins remote agent (war) as a normal process and not as a service; otherwise, as you mentioned, there is no desktop for it.
My Proposal:
Make sure the Jenkins remote agent is running as a normal OS process (on both VMs). You can have a Windows Scheduled Task that launches this process at logon and checks every 5 minutes whether it is still alive (and restarts it if not).
After the temporary VM (let's call it a gateway) has woken up your test VM, the test VM should execute a tscon command, which redirects the currently active RDP session to the console (the physical monitor, which on a virtual machine is itself virtual). This keeps your UI session alive until the next restart, without having to worry about the gateway.
Example: tscon rdp-tcp#1 /dest:console. This can again be solved with a Scheduled Task that is executed at logon (after waiting a few seconds, just to be safe).
Have Caffeine.exe or MouseJiggle.exe running in the background as processes (also launched at logon) on your test computers to make sure the screen is never locked and no screen saver is activated. Both tools are free.
If your Jenkins connection drops, that is a different issue that has nothing to do with UFT. In my case this combination works perfectly fine. It is also easy to automate the installation of these things: Windows batch and VBS can do all of it for you (putting the mentioned tools on your %PATH% and creating Scheduled Tasks programmatically).
Bonus tip: to avoid a taskkill java.exe command killing your remote agent, you can simply rename the java.exe of your JVM to jenkins_remote_agent.exe and use that as your Jenkins remote agent executable.
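The scheduled tasks described in this answer could be created with a script along these lines. This is a rough sketch rather than the answerer's actual setup: it uses Python instead of batch or VBS, the task names and .bat paths are hypothetical placeholders, and it assumes the standard `schtasks /Create` options `/TN`, `/TR`, `/SC`, `/MO` and `/F`. By default it only prints the commands.

```python
import subprocess
import sys


def schtasks_create(name, command, schedule, modifier=None):
    """Build a `schtasks /Create` argument list (nothing is executed here)."""
    args = ["schtasks", "/Create", "/TN", name, "/TR", command, "/SC", schedule]
    if modifier is not None:
        args += ["/MO", str(modifier)]
    return args + ["/F"]  # /F overwrites an existing task with the same name


# Task names and script paths are hypothetical placeholders.
TASKS = [
    # Start the agent when the test user logs on.
    schtasks_create("JenkinsAgentAtLogon", r"C:\jenkins\start_agent.bat", "ONLOGON"),
    # Every 5 minutes, run a script that restarts the agent if it has died.
    schtasks_create("JenkinsAgentWatchdog", r"C:\jenkins\check_agent.bat", "MINUTE", 5),
    # Redirect the active RDP session to the console at logon.
    schtasks_create("RedirectToConsole", "tscon rdp-tcp#1 /dest:console", "ONLOGON"),
]


def register_tasks(tasks, dry_run=True):
    """Print the commands, or run them when dry_run is False on Windows."""
    for args in tasks:
        if dry_run or sys.platform != "win32":
            print(" ".join(args))  # display only; arguments are not shell-quoted
        else:
            subprocess.run(args, check=True)


register_tasks(TASKS)
```

Calling `register_tasks(TASKS, dry_run=False)` on the test VM would actually create the tasks. The session name `rdp-tcp#1` comes from the example above and may differ on your machine.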
UFT requires an interactive session for some Win32 operations.
In the Tools ⇨ Options menu, select General ⇨ Run Sessions; there you will find the option "Enable continued testing on locked/disconnected remote computers", which may help in your case too.
I want to upgrade my Jenkins master without aborting long-running jobs on slaves or waiting for them to finish. Is there a plugin available that provides this feature?
We have several build jobs running regression and integration tests that take hours to run. Often at least one of those jobs is running, making it hard to restart Jenkins after updates. I know that it is possible to block the queue. We tried this, but it hinders more than it helps.
What we are looking for is a plugin that runs jobs on slaves, caches the output as soon as the connection to the master is interrupted, and sends the remaining output to the master when the master is up again. Does anybody know a plugin providing this feature?
I have many long running jobs that take almost a day to complete. Splitting is not possible. If the network fails then all progress is lost.
How can a slave survive disconnections?
EDIT 1
I have around 300 slaves running in Windows tied to one single Jenkins instance.
Slaves are connected using the manual method java -jar slave.jar -jnlpUrl <serverUrl> <slaveName>. I cannot run them as a regular Windows service because some tests manipulate GUI elements and require a real interactive session; otherwise the tests fail.
EDIT 2
According to the Jenkins Cookbook, I should be using the Cygwin + OpenSSH approach instead of a custom script with the JNLP connector. Could this improve stability?
Jenkins was not originally designed for builds to survive server or slave restarts. There is a CloudBees Long-Running Build plugin that supports long-running builds but, unfortunately, it is available only to enterprise users and is still in beta.
I didn't find any free alternative and would suggest trying to improve your network stability and splitting your long-running jobs. At the very least you can divide your tests into logical groups (test suites).
Jenkins now has a Workflow plugin. It claims to handle "server" restarts and loss of connectivity with a slave.
From the link
A key feature of a workflow execution is that it's suspendable. That
is, while the workflow is running your script, you can shut down
Jenkins or lose a connectivity to a slave. When it comes back, Jenkins
will still remember what it was doing, and your workflow script
resumes execution as if it was never interrupted. A technique known as
the "continuation-passing style" execution plays a key role in
achieving this.
(not tested at all)
Edit: Copied from @Jesse Glick's comments:
Workflow is open source and available for anyone running Jenkins 1.580.1 or later. CloudBees Jenkins Enterprise does include a checkpoint feature, but this is not necessary simply to have a build survive slave disconnections and Jenkins restarts: that is automatic.
Our Jenkins performs massive integration tests. The longer Jenkins has been running, the longer the tests take, so we restart the Jenkins server every night via a cron job. By then the build queue is too long to finish, so the currently running job is canceled and marked as failed. That's ugly. I found the Safe Restart Plugin, but it waits for the build queue to be empty. Ideally I would have a job that could be prioritized, so that I could also reboot at any chosen time during the day. This job needs to perform the reboot the way the Safe Restart Plugin would if no jobs were left.
I recently deleted 2 jobs from Jenkins (via the GUI). When I log into the slaves afterwards I still see the workspaces of those 2 jobs lying around. Is this behavior normal?
Notes:
Jenkins master and slaves are all running on Windows environment.
Master runs on Windows Server 2003 and slaves run on Windows Server 2008 R2.
Jenkins version is 1.509.2
Regards,
Benil
Unfortunately it is: https://groups.google.com/forum/#!topic/jenkinsci-users/SiZ3DL-UJ-8
Workspaces found on slaves are not deleted because this is a non-trivial problem (it would mean that jobs would need to record every slave the job has ever been executed on and it would also have to take into account slaves that are offline, for a real thorough solution).
I have just written a script that can be scheduled to run periodically and clean up unused workspaces. It goes through all of the Jenkins slaves and checks whether the directories under the workspace belong to jobs that have already been deleted on the Jenkins master.
https://gist.github.com/ceilfors/1400fd590632db1f51ca
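The core check can be sketched as follows. This is not the linked gist's implementation, just a minimal illustration. It assumes each workspace directory is named after its job (no folders or custom workspace paths), and that the list of existing job names has been obtained from the master by some other means. It only reports candidates and deletes nothing.

```python
import tempfile
from pathlib import Path


def orphaned_workspaces(workspace_root, existing_jobs):
    """Return workspace directories whose name matches no existing job."""
    known = set(existing_jobs)
    root = Path(workspace_root)
    # Plain files are skipped; only directories count as workspaces.
    return sorted(p for p in root.iterdir() if p.is_dir() and p.name not in known)


# Demonstration on a throwaway directory with hypothetical job names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("job-a", "job-b", "deleted-job"):
        (Path(tmp) / name).mkdir()
    for path in orphaned_workspaces(tmp, ["job-a", "job-b"]):
        print(path.name)
```

Reporting first and deleting in a separate step keeps the script on the conservative side, for example when a job has been renamed or is configured with a custom workspace.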
To refine what @oblio said:
it would mean that jobs would need to record every slave the job has ever been executed on
Builds of jobs do record which slave they ran on, but builds can be (and often are) deleted after a while.
and it would also have to take into account slaves that are offline
Of course, but this is handled generally by the workspace cleanup feature built into Jenkins core, since it runs as a background process that deals with currently online slaves (deleting seldom-used workspaces), so any slave which is sometimes online will eventually be cleaned.
The problem is that this feature currently ignores apparent workspaces that do not correspond to a job that exists at the time it runs, to err on the conservative side. This commit of mine rewrote the cleanup thread to fix some other problems, but not this one.
I came across an effective script that does a good job of only cleaning when disk space gets low and brings the slave offline: https://gist.github.com/rb2k/8372402
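The "only clean when disk space gets low" condition can be illustrated like this. It is a minimal sketch and not the linked gist's code: the 10% threshold is an arbitrary assumption, and taking the slave offline and deleting workspaces are left out.

```python
import shutil


def needs_cleanup(path, min_free_fraction=0.10):
    """Return True when the free space on path's volume is below the threshold."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total < min_free_fraction


# Check the volume that holds the current directory.
if needs_cleanup("."):
    print("Low disk space: workspace cleanup would be triggered here.")
else:
    print("Enough free space: nothing to do.")
```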