Jenkins job pauses for 15 minutes - jenkins

I have a job running on a Windows Server 2012 R2 agent. The job pauses between two plugin steps (BuildNameSetter v1.6.8 and DiscardOldBuilds v1.0.5), as you can see below:
13:05:25 Set build name.
13:05:25 New build name is '5.0.811.0'
13:20:21 Discard old builds...
I started noticing this strange behavior after upgrading the Jenkins master from 2.89 to 2.190.3.
It's frustrating to see your job taking a 15-minute nap!
Is this a server-side issue or an agent-side one?
Can someone give me some hints about how to tackle this problem?
Have you experienced something similar?

You could have a look at the Jenkins central logs (/log/all) to see whether there is any Java stack trace in there.
Then you should try to isolate the issue. First deactivate the build name setter step. Then re-enable it and deactivate the discard old builds step instead. Whichever step makes the pause disappear when it is disabled is the one causing it.
Once you know which plugin is causing the issue, try downgrading or upgrading the plugin that makes your build hang.
If the issue comes from discard old builds, I would try cleaning the job's workspace and removing old builds manually.
Look for your issue in the Jenkins Jira and upvote it. Create a ticket if you cannot find another user experiencing the same issue.
Finally, you should be able to find workarounds for these plugins.
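To see where the time goes before and after each isolation step, the gaps between timestamped console lines can be measured. The following is only a minimal sketch, assuming every console line starts with an HH:MM:SS timestamp as in the excerpt from the question; the function name find_pauses and the one-minute threshold are made up for illustration, and the sketch does not handle logs that cross midnight.

```python
from datetime import datetime, timedelta


def find_pauses(lines, threshold=timedelta(minutes=1)):
    """Return (gap, previous_line, next_line) for every gap >= threshold.

    Assumes each relevant line starts with an HH:MM:SS timestamp.
    Lines without such a prefix are ignored.
    """
    pauses = []
    prev_time, prev_line = None, None
    for line in lines:
        stamp = line.partition(" ")[0]
        try:
            current = datetime.strptime(stamp, "%H:%M:%S")
        except ValueError:
            continue  # no leading timestamp on this line
        if prev_time is not None and current - prev_time >= threshold:
            pauses.append((current - prev_time, prev_line, line))
        prev_time, prev_line = current, line
    return pauses


# Console excerpt taken from the question.
log = [
    "13:05:25 Set build name.",
    "13:05:25 New build name is '5.0.811.0'",
    "13:20:21 Discard old builds...",
]

for gap, before, after in find_pauses(log):
    print(f"{gap} between {before!r} and {after!r}")
```

Running this against the console output of a build with one step disabled shows quickly whether the pause has moved or gone away.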

Today I upgraded Jenkins to a newer version (2.263.2) on the Jenkins server and the 15-minute pause disappeared.

Related

How do I get the details of the user who deleted a few builds of a Jenkins job?

I know that a build had run for a Jenkins job (I received an e-mail with the results and the build number #9).
But, when I open Jenkins to check the Build History, the build in question (Build #9) is not there. When I try to trigger a new build, it is counted as Build #10. So, somebody must have deleted Build #9 manually from Jenkins (my guess).
If yes, how do I get the details of the user who deleted that build?
Is there a log I can refer to?
Do you have the Jenkins Audit Trail Plugin installed?
You can use it to keep an audit of "most actions with significant effect such as creating/configuring/deleting jobs and views or delete/save-forever/start a build".
So it looks like the example you gave of deleting a build would be covered.
See https://wiki.jenkins.io/display/JENKINS/Audit+Trail+Plugin
The Job Configuration History plugin will keep track of all changes (delta+user) AND let you roll back, but only after you install it.
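Before digging through audit logs, it can help to confirm which build numbers are actually missing. This is just an illustrative sketch: the function name missing_builds and its inputs are hypothetical, and it only finds gaps in the numbering. It cannot tell who deleted a build; that still needs an audit plugin such as the ones mentioned above.

```python
def missing_builds(existing, next_build_number):
    """Return the build numbers below next_build_number that are absent.

    existing          -- build numbers still visible in the build history
    next_build_number -- the number Jenkins will give the next build
    """
    expected = set(range(1, next_build_number))
    return sorted(expected - set(existing))


# Scenario from the question: builds 1-8 are listed, the next build is #10.
print(missing_builds(range(1, 9), 10))
```

Note that a job with a "discard old builds" policy will also show gaps, so a missing number alone does not prove a manual deletion.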

The session for this agent already exists

I am using TFS to execute a nightly build that includes several steps that use the TFS Test Agent. I am running the latest version of TFS/Test Agent (2015 Update 3) and there are no other builds being run at this time. Often (maybe half the time), when the nightly job runs, the step "Visual Studio Test Agent Deployment" fails with the following error:
The job has been abandoned because agent Agent-XXX did not renew the
lock. Ensure agent is running, not sleeping, and has not lost
communication with the service.
This is due to the error found in the Test Agent's log file (under _diag):
The session for this agent already exists. Sleeping for 30 seconds
before next retry.
Microsoft.TeamFoundation.DistributedTask.WebApi.TaskAgentSessionConflictException:
The task agent Agent-XXX already has an active session for owner XXX.
This issue is directly referenced here, and indirectly talked about here.
The solution I've found to this issue is to restart the server that the test agent is running on. This clears any dead sessions, and after the server starts back up, the tests run just fine. I think this is effectively what is being done in the previously mentioned post: the result of resetting the configs is that the service is restarted.
While being presented as a solution in the linked article, it is only temporary. Even after the server has been restarted and the build runs successfully, the next day the issue will again reappear necessitating manual intervention to get the build to run.
I could schedule a task to reset the service or even restart the server directly before the nightly build is run, but it strikes me as a bandage rather than a fix. Has anyone experienced this issue before, and if so is there any way to prevent it from occurring in the first place?
Update 1
I simply set up a build that runs 5 minutes before my main tests that runs a Bat script to restart all my servers hosting my test agents. This is a workaround, but one that seems to resolve the issue. Hopefully someday someone can come up with a better solution than this, but for now, it's how I have to run automated testing in TFS.
Update 2
I have three servers now, and all three exhibit the same issue, though it is hard to pin down exactly when it occurs. Scaling up the workaround without creating downtime is proving to be quite challenging.
Update 3
A better day came: I upgraded TFS to 2018 and the build agent to the latest version, and this issue no longer occurs. I think it's a bug in the old build agent. I still don't have a solution for the original version of the build agent...
It sounds like an Agent.Listener.exe process was already running somewhere on the machine, maybe as a service (not in a logged-in user session).
Note: if an agent process is abruptly terminated while it has an active session, the session will eventually time out (after 5 minutes, I think). On startup, if an agent encounters a session conflict, it will retry for up to 5.5 minutes (I think) before giving up, which is enough time for an abruptly terminated session to expire.
I'm going to go ahead and close this and assume a process was running somewhere. We haven't had any issues in this area and haven't heard any other reports, so I don't think there is an issue here with the agent. If you find a repro, or it looks like I'm wrong, then please reopen.
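The timing argument in this answer can be illustrated with a toy model. This is not the agent's actual code: the 30-second sleep comes from the log message in the question, the 5-minute session timeout and 5.5-minute retry window are the answerer's "I think" figures, and every name below is hypothetical.

```python
RETRY_SLEEP_S = 30        # "Sleeping for 30 seconds before next retry"
GIVE_UP_AFTER_S = 330     # retry window of 5.5 minutes (answerer's estimate)
SESSION_TIMEOUT_S = 300   # stale session expiry of 5 minutes (answerer's estimate)


def first_successful_attempt(stale_session_age_s=0, session_is_renewed=False):
    """Return the seconds after startup at which a session could be created.

    stale_session_age_s -- how long ago the old agent process died
    session_is_renewed  -- True if another agent process is still alive and
                           keeps the old session from ever expiring
    Returns None if the retry window ends before the old session expires.
    """
    elapsed = 0
    while elapsed <= GIVE_UP_AFTER_S:
        expired = (not session_is_renewed
                   and stale_session_age_s + elapsed >= SESSION_TIMEOUT_S)
        if expired:
            return elapsed
        elapsed += RETRY_SLEEP_S
    return None


print(first_successful_attempt(0))                           # killed just now
print(first_successful_attempt(120))                         # killed 2 minutes ago
print(first_successful_attempt(0, session_is_renewed=True))  # old process still alive
```

Under these assumptions, an abruptly killed agent always recovers within the retry window, while a second live agent process keeps the conflict going indefinitely, which matches the diagnosis in the answer.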

bitbucket-build-status-notifier plugin for jenkins reports wrong status

Jenkins should notify Bitbucket whether a job that is linked to a branch has passed or failed, and it does:
But for some reason, in the branch view, it doesn't notify about the result of the last build, and says it failed even if the last build has passed:
How do I make it refer to the result of the last build only?
A new version of the Jenkins bitbucket-build-status-notifier plugin was released today, and it allows exactly what you need to avoid the problem you describe. Its new config option is "Only show latest build status"; just make sure this checkbox is checked.
Hi, I'm the maintainer of the bitbucket-build-status-notifier plugin for Jenkins. The plugin creates a new build status for every Jenkins build execution for a given commit. That means that if you run a build for a given commit ID and it fails, and later run a new build for the same commit ID and it succeeds, both statuses (success and failed) will remain in Bitbucket. That's fine and not a bug. Anyway, I understand your problem, and you are not the only one, since there's already an issue open for solving it.
At the moment I don't have much time for developing new features, but I'll do it as soon as possible.
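The difference between keeping every status per commit and showing only the latest one can be sketched as follows. This is a hypothetical illustration only, not the plugin's code or the Bitbucket API; the BuildStatus class, its fields, and the state strings are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class BuildStatus:
    commit: str        # commit ID the build ran against
    build_number: int  # Jenkins build number (higher means more recent)
    state: str         # illustrative values: "SUCCESSFUL" or "FAILED"


def latest_status_per_commit(statuses):
    """Keep only the most recent build's state for each commit."""
    latest = {}
    for status in statuses:
        current = latest.get(status.commit)
        if current is None or status.build_number > current.build_number:
            latest[status.commit] = status
    return {commit: status.state for commit, status in latest.items()}


# Same commit built twice: first run failed, the rerun passed.
history = [
    BuildStatus("abc123", 1, "FAILED"),
    BuildStatus("abc123", 2, "SUCCESSFUL"),
]
print(latest_status_per_commit(history))
```

With every status kept, the commit carries both a failed and a successful entry; collapsing to the latest one is the behavior the question asks for.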

TFS Release Management has a release stuck In Progress

Our Release Management has a job that is stuck "In Progress".
The error is
Communication with the deployer was lost during the deployment. Please
make sure (1) the deployer machine has not rebooted during
installation and (2) the component timeout is sufficient to copy the
files from the drop location to the deployer machine and install the
package.
I can't stop or abandon the release. The buttons are all disabled. How can I kill this?
From the Release Manager, go to the Release tab. Open the details of the actual release, go to the step that is pending, and you will see a "Stop" button at the top. That will stop the step and change the state of the release.
Is the build stuck? Can you restart the build controller and / or the build agent? You can look for them by editing the build definition.
Don't take my word for it, as Release Management is pretty new, but the error is about the connection between the RM Server and the RM Deployer service (i.e. the RM agent). The RM Server doesn't know anything more about the agent, so your option is to connect to the target machine(s) and manually check the deployment status. If it has completed, restart the RM Deployer service and cross your fingers.
I faced the same issue of the release being stuck in 'In Progress' state. Turned out, the password of the credentials I was using, changed. Once the new password was specified in the deployment agent, the release managed to complete. This was months ago, and now I am facing same issue on other server. No clue what is the reason this time.
We had this problem, in which all releases got stuck, on TFS 2018.
Because of a connectivity issue with SQL, the status may not be updated in the database when a release completes, particularly under heavy load. The release then stays in the InProgress state and keeps consuming a pipeline. Other releases cannot move ahead either, because the pipeline is blocked. Once we increased the pipeline count, the problematic release could move out, as releases started being processed again.
Once the problematic release was canceled by the system, we set the pipeline count back to its original value of 1, and releases then progressed without getting stuck.
Solution:
Increase the pipeline count to, say, 25. Then create a new release pipeline and queue it; this pushes through all the pipelines that got stuck. Once pipelines start queuing, set the count back to 1 or its original value.
Reference - https://blogs.msdn.microsoft.com/tfssetup/2017/11/14/understanding-build-and-release-pipelines-visual-studio-team-servicesteam-foundation-server/

How to clean up Jenkins job so build stability is not affected

How would someone clean up a Jenkins job such that the build stability rating is reset and not affected by previous builds? I created a build job and through trial and error, I finally got the job to compile/build correctly. However, I don't want all the previous test builds to affect the build stability rating. I tried deleting all the builds and restarting Jenkins but it still says 20 of the last 25 builds failed. I looked in the $JENKINS_HOME directory (~/.jenkins) and couldn't find anything regarding build stability. Thanks.
When you configure the job, you can tell it how long to keep build logs, either as a number of days or a number of builds. Set this to one build to clear out the history, then set it back again.

Resources