Communication with the deployer was lost during the deployment - tfs

I am using MS Release Management with agent-based release templates for my releases.
The releases fail with the following error message, which occurs randomly for different actions in the deployment sequence.
"Communication with the deployer was lost during the deployment. Please make sure (1) the deployer machine has not rebooted during installation and (2) the component timeout is sufficient to copy the files from the drop location to the deployer machine and install the package."
Please note:
1) Deployer machine was not rebooted during the deployment
2) I have created custom components for all actions and set each component's timeout to 60 minutes
When I restart the release, it succeeds without any error. What could be the reason for this error, and what is the solution?
Experts please share your views on this.

I think you must have an encrypted field in your deployment component. Just re-enter the value for that encrypted field.
You could refer to this: https://lajak.wordpress.com/2015/12/14/release-management-communication-with-the-deployer-was-lost-during-the-deployment-the-parameter-is-incorrect/

Jenkins Agents "Unable to create live FilePath" and marked offline

Jenkins Controller reports : Unable to create live FilePath for i-xxxxxxxxxxxxx and Agent is marked Offline
Googling this error indicates that it is a problem with the communication paths between Controller and Agent, but what?
Background:
Jenkins Controller running v2.332.1, Java 11 64bit OS, inside a docker container
Jenkins Agents running Swarm-Client jar downloaded from the Controller on startup. Swarm Plugin Version 3.32 Java 11 and 64bit OS, inside a docker container
Agents and Controller are hosted on separate EC2 instances in AWS with Security Group permissions on the relevant ports.
The instance starts up, runs Cloud-Init, downloads the swarm-client.jar from the Jenkins Controller and then runs it with the parameters required to connect to the controller. I mention this to avoid the "are you using the correct version" comments :-)
The Agent connects, comes fully online and gets busy servicing the pending Job queue.
Then, some indeterminate time later, the error appears. Some jobs run for more than 24 hours without failing; other jobs last only minutes and sometimes fail.
Things I have tried: (some)
The Swarm Client jar can use either WebSockets and connect to the FQDN of the Jenkins controller or use the JNLP protocol to connect to the IP and dedicated agent connection port (fixed value on the Controller).
Similar behavior is seen with either protocol.
Opening all the AWS Security Groups: in case there was another port, not mentioned, that needed to be open.
Bypass AWS Load balancer: Agent connects directly to Controller IP:PORT via JNLP
Matching Versions: Swarm Client downloaded from Controller
Updated Versions: Jenkins 2.319.3, 2.332.1
Normalized Java environments: Java 11 64bit OS
Enabled Logging on the Agents: periodic communication happens and then stops after a while, for no obvious reason.
Increased Controller Instance size: m5.xlarge -> m5.2xlarge
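For the port and Security Group checks in the list above, a small probe can at least rule out plain TCP reachability between Agent and Controller. This is only a minimal sketch: `port_is_reachable` is a hypothetical helper, not part of Jenkins or the Swarm client, and it says nothing about why an already established Remoting connection later drops.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example call (hypothetical host and port, substitute the Controller address
# and the fixed agent port configured on your Controller):
# port_is_reachable("jenkins.example.internal", 50000)
```

Running it from the Agent instance at intervals and logging the result with a timestamp may help show whether the network path itself disappears when the Agent is marked offline.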
Bumping Jenkins up to a non-LTS version allowed the connections to become more stable.
Jenkins 2.341 and Swarm-Client version 3.32 both use Remoting version 4.13
Now, while I am not particularly happy about running a non-LTS version of Jenkins, I am pleased to have found a workaround
I have also struggled with this issue, so I am adding the details here so that others don't have to struggle.
This is everything I tried:
Everything was running fine while we had JDK 8 on both master and slave.
We then changed both to JDK 11, and I replaced the Jenkins EC2 instance with a new one with the help of the ASG.
The issue appeared, and we reverted, but the issue stayed the same.
Jenkins showed a warning saying to move to JDK 11, which suggested that something was deprecated, so I also tried the newer version of Jenkins it mentioned. Going to Jenkins 2.344 with JDK 8 gave the same issue, switching to other Jenkins versions didn't help either, and I lost hope.
I tried the biggest EC2 instance type for the slave -- didn't help.
I checked htop on the slave -- didn't help.
I tried restarting the Jenkins master -- didn't help.
I tried changing the remote dir for the slave, as mentioned on Stack Overflow -- didn't help.
My thought was that, since the Jenkins EC2 instance had been terminated and a new one had come up, things in Jenkins might have been updated along the way. Together with the warning asking for a newer Jenkins version and JDK 11, that looked like some hope to me.
I tried increasing the timeout to 20 min in the slave setup -- didn't help.
I tried adding this command to the init script of the node in the Jenkins EC2 plugin: sudo yum -y update --security -- didn't help.
We tried a JDK 11 image, a JDK 8 image and a newer Jenkins JDK 8 image; the issue was the same in all of them.
So, what finally solved the issue:
We moved to an older version of Jenkins:
https://hub.docker.com/layers/jenkins/jenkins/jenkins/2.330-jdk8/images/sha256-97fcb[…]17da34f0d07c021ab57083ee8c77dc4b21281d3498137?context=explore
Fixed by upgrading to Jenkins 2.344

Jenkins failed to start - Verify that you have sufficient privileges to start system services

When installing Jenkins (LTS) on Windows 10 via the installer, after choosing the JDK folder, an error pops up when the service is trying to start.
The error reads: "Jenkins failed to start - Verify that you have sufficient privileges to start system services"
Let me make it clear that I DO have sufficient privileges, yet something is not working.
I tried many different suggestions to fix this issue, and read many posts but none helped.
Also, a lot of these posts are getting old and I'm not sure how relevant they are these days.
I found a way to fix this issue, and I'll post it as a suggested answer. This could also work for other installers, but it was only tested with the Jenkins installer.
However, if anyone knows a better way to fix this - please share it with us.
Hope this method will help many people!
Important: If you retried the installation too many times, skip down to "Option 2". The local user account that runs Jenkins may be locked. You will need to unlock that account before attempting either fix below.
This is how I fixed the problem.
Option 1: Re-enter credentials for jenkins user
Please read it all before executing and follow the steps in order:
1) Delete any Jenkins installation leftovers you currently have
2) Start the installation process, input your credentials when asked, and continue with the on-screen instructions (including choosing the JDK folder) up until the point where the error is raised.
3) When the error is raised, >>> DO NOT DO ANYTHING! <<< Leave it as it is shown in my question
4) Now (and only now), open "Services"
5) Search for the "Jenkins" service. It should be set to "Automatic", but it might be "Disabled"
6) Open the "Jenkins" properties, and go to the "Log On" tab
7) Make sure you choose the "This account" radio button, delete the account name and password fields, and enter them AGAIN
8) Click "Apply"
9) Go back to the installer and click "Retry"
If everything is according to plan - The installation will now continue without a problem.
This method was tested on a local and VM / AWS computer and worked!
If you still encounter a problem, try changing the startup type in step 5 to "Automatic" and make sure you only open "Services" in step 4. "Services" will not update while it is open.
Option 2: Unlock jenkins user account and manually start service
If you encounter the "Service 'Jenkins' failed to start" error too many times, the account on your computer that should run Jenkins will become locked. You will need to unlock that local account first. Keep the Jenkins installation window open with the error message, and then perform the following steps:
Open the "Local Users and Groups" application.
Go into the "Users" folder.
Right-click on the user who will run the Jenkins service, and click "Properties".
Uncheck "Account is locked out". And while you are at it, make sure "Account is disabled" is unchecked as well.
Click OK.
Open the "Services" app on your computer.
Make sure the Jenkins service is set to start automatically.
Right-click the Jenkins service and click "Start". The service should start successfully.
Switch back to the Jenkins installation window with the "insufficient privileges" message still showing.
Click the "Retry" button in the "insufficient privileges" message box. The installer should recognize that the Jenkins service has started.
You should be able to finish the installation.
It didn't work for me until I installed Java Runtime Environment (JRE) 11.
For me it wasn't working because the Java installed was jre-8u301-windows-x64.exe
I installed the x86 build, jdk-8u301-windows-i586.exe, and it worked
I had the same issue. I have JRE 8, JDK 8 and JDK 11 Corretto installed, and I think there are some compatibility issues between these versions of Java. I was able to fix it by installing the JRE mentioned above by #maksym.
The Jenkins versions that I am trying to install are 2.332.2 LTS and 2.345. I am able to successfully install 2.332.2 LTS
I fixed it with:
Uninstall everything linked with Jenkins: the folder where I tried to install it, and the installer
Download the Jenkins installer again
When I started the reinstall, I hit another issue about an incorrect JDK version, so I downloaded version 17.0.4.1
After those steps I didn't get the previous error (Jenkins failed to start) again
Open cmd prompt as admin
Run
net user administrator
If the property value of Active in the response is No then Run
net user administrator /active:yes
This fixed the issue for me

Error on create Release from build

I am trying to set up Release Management with TFS 2013 using the build template "ReleaseTfvcTemplate.12.xaml", but when I try to carry out the release, the following error occurs:
"ERROR: The account running the TFS build service (Domain\User) needs to be added to the system user in the Release Management Server."
RM is installed on a separate server from the one configured as the Build Controller. However, the machines are within the same domain, and each server has its own user with administrator permission to run the services.
The build server user was added to the service users in RM, and the error continues to occur.
Does anyone have any idea how to solve this problem?
Thanks.
One possible cause is that the Release Management client needs to be installed on the machine(s) running the build agent. This gem is hidden away on p26 of the current RM user guide. This won't be a problem for anyone with everything on one server but will need addressing where components are distributed.

OpsCenter on AWS: No permission to create /mnt/cassandra/data

I installed the latest OpsCenter (v5.0.0, through AMI 3cf7c979, found here) on an EC2 m3.large. When adding new nodes through the admin interface (port 8888), I get this error:
Error: Start stage failed: Failed to start node [ip]: Timed out waiting for Cassandra to start.
The log on the individual server is:
CassandraDaemon.java (line 235) Directory /mnt/cassandra/data doesn't exist
CassandraDaemon.java (line 239) Has no permission to create /mnt/cassandra/data directory
How come new nodes don't have the permissions to create the /mnt/cassandra dir?
I generated a key/secret with all permissions for the "Amazon EC2 Credentials".
If I manually SSH into every new instance, create the /mnt/cassandra dir, chown it and restart the service, it works. I expected this to happen automatically.
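The two log lines ("doesn't exist" and "Has no permission to create") can be reproduced outside Cassandra with a quick check. This is a minimal sketch, assuming it is run as the same OS user that runs the Cassandra service; `can_create_or_write` is a hypothetical helper name, not something OpsCenter or Cassandra provides.

```python
import os
from pathlib import Path


def can_create_or_write(directory: str) -> bool:
    """Return True if the current user can write into `directory`,
    or could create it because the nearest existing ancestor is writable."""
    path = Path(directory)
    if path.is_dir():
        return os.access(path, os.W_OK | os.X_OK)
    if path.exists():
        # Something exists at this path but it is not a directory.
        return False
    # Walk up until an existing ancestor is found.
    ancestor = path.parent
    while not ancestor.exists():
        ancestor = ancestor.parent
    return ancestor.is_dir() and os.access(ancestor, os.W_OK | os.X_OK)


# Example call, using the path from the log above:
# can_create_or_write("/mnt/cassandra/data")
```

If this returns False for the service user on a freshly provisioned node, the manual create-and-chown step described above is what is missing from the provisioning.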
OpsCenter 5.0.0 is configured with a default AMI version. When you attempt a cloud provision via the UI, you'll see an AMI version is already specified. This is the version to use with OpsCenter. There are newer AMIs (such as the versions you linked), but as yet they are not fully supported in OpsCenter, which is why provisioning fails when you attempt to use them.
The document you linked contains instructions for using AMIs via the EC2 console. That is a different provisioning experience from provisioning via OpsCenter, and it accounts for the difference you are seeing.
As a future improvement to opscenter, I think possibly changing that field from a text box to a drop down to make it clear which AMIs are supported might clarify this sort of problem.
I ended up ditching the AMI. It was probably not up-to-date. I installed opscenter with apt-get on a fresh ubuntu machine and everything worked great.

please wait while jenkins is restarting- waiting long

I updated some plugins and restarted Jenkins, but now it says:
Please wait while Jenkins is restarting
Your browser will reload automatically when Jenkins is ready.
It is taking too much time (I have been waiting for the last 40 minutes). I have only 1 project with around 20 builds. I have restarted Jenkins many times before and it worked fine, but now it is stuck.
Is there any way to kill/suspend Jenkins to avoid this wait?
I had a very similar issue when using Jenkins' built-in restart function. To fix it I killed the service (with crossed fingers), but somehow it kept serving the "Please wait" page. I guess it is served by a separate thread, but since I could not see any running java or jenkins processes, I restarted the server to stop it.
After the reboot Jenkins worked, but it was not updated. To make it work I ran the update again and restarted the Jenkins service manually. It took less than a minute and worked just fine...
Jenkins seems to have a number of bugs related to restarting, and at least one unresolved: jenkins issue
Windows ONLY....
None of the solutions here worked for me, and restarting the server was not an option. If you are in the same situation:
I had to kill java.exe and restart the Jenkins service. After I did this, Jenkins reloaded several times and then went back to normal.
I was stuck on the Jenkins restarting page for 10-ish minutes until I did this.
Hope this helps.
Running this in the command line helped me:
service jenkins restart
I had a similar issue when updating plugins from the plugin update page with the restart Jenkins option checked. Jenkins only showed the waiting message for a long time.
I solved the issue by restoring the .bak files of the plugins that I tried to update to .jpi files.
I did the following in my Jenkins:
cd $JENKINS_HOME/plugins/
sudo mv git.bak git.jpi
# ... (more plugin files)
sudo mv ldap.bak ldap.jpi
sudo /sbin/service jenkins restart
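When many plugins were updated at once, the renames above can be scripted. This is a minimal sketch under the assumption that every `<name>.bak` in the plugins folder is the previous version of `<name>.jpi`; `restore_plugin_backups` is a hypothetical helper, and it defaults to a dry run so you can inspect the list first.

```python
from pathlib import Path


def restore_plugin_backups(plugins_dir: str, dry_run: bool = True) -> list:
    """Rename every <name>.bak in `plugins_dir` back to <name>.jpi.

    Returns the (source, target) file name pairs. With dry_run=True
    nothing on disk is changed.
    """
    renames = []
    for backup in sorted(Path(plugins_dir).glob("*.bak")):
        target = backup.with_suffix(".jpi")
        renames.append((backup.name, target.name))
        if not dry_run:
            # Overwrites the updated .jpi, the same effect as `mv git.bak git.jpi`.
            backup.replace(target)
    return renames


# Example call (path is an assumption, use your own $JENKINS_HOME/plugins):
# restore_plugin_backups("/var/lib/jenkins/plugins", dry_run=True)
```

Jenkins should be stopped before the renames and restarted afterwards, as in the commands above.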
Check Event Viewer.
I found that my Java died.
Faulting application java.exe, version 7.0.250.17, time stamp 0x51c4b3fd, faulting module ntdll.dll, version 6.0.6002.18541, time stamp 0x4ec3e39f, exception code 0xc0000374, fault offset 0x000abc4f, process id 0x1188, application start time 0x01cee4f42968bc81.
Finally I found that it's a Jenkins 1.540 problem. Don't use it.
https://issues.jenkins-ci.org/browse/JENKINS-20630
I faced the same issue after upgrading some plugins on Windows. jenkins.err.log showed this error:
Exception in thread "main" java.io.IOException: Jenkins has failed to create a temporary file in C:\Users\builder\AppData\Local\Temp\
    at Main.extractFromJar(Main.java:350)
    at Main._main(Main.java:194)
    at Main.main(Main.java:91)
Caused by: java.io.IOException: There is not enough space on the disk
    at java.io.WinNTFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(Unknown Source)
    at Main.extractFromJar(Main.java:347)
    ... 2 more
The problem was that the TEMP folder of the jenkins user had lots of temporary files. After cleaning that folder jenkins restarted correctly.
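Before cleaning anything, it can help to confirm that the temp folder is really the problem. This is a minimal sketch: `temp_dir_report` is a hypothetical helper, and when no path is given it inspects the temp folder of whichever user runs the script, so it should be run as the Jenkins service account or pointed at that account's TEMP path explicitly.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def temp_dir_report(temp_dir: Optional[str] = None) -> dict:
    """Report free disk space and the number of top-level entries in a temp folder."""
    path = Path(temp_dir or tempfile.gettempdir())
    usage = shutil.disk_usage(path)  # total/used/free bytes of the containing disk
    entries = sum(1 for _ in path.iterdir())
    return {
        "path": str(path),
        "free_bytes": usage.free,
        "total_bytes": usage.total,
        "entries": entries,
    }


# Example call, using the folder from the stack trace above:
# temp_dir_report(r"C:\Users\builder\AppData\Local\Temp")
```

A very low `free_bytes` value or a very large `entries` count would be consistent with the "not enough space on the disk" error above.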
I just performed a restart of the server. That fixed the issue!
In Command Prompt, execute this:
C:\>service jenkins restart
Or
Open the list of services running on your machine (Win + R), search for Jenkins and click Restart
For me, the cause seemed to be having lots of old job build logs hanging around. To clean them up, I ran:
cd $JENKINS_HOME/jobs
find -name 'builds' | xargs -n 1 bash -c 'rm -rf $0/[1-9]*'
Then I stopped and started Jenkins again, and it came up within a minute.
Credit to: https://stackoverflow.com/a/39230597/2255242
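Because `rm -rf` is not reversible, it may be worth listing what the command would delete before running it. This is a minimal sketch that mirrors the `[1-9]*` glob from the shell command; `old_build_entries` and `remove_entries` are hypothetical helper names, and the second one deletes for real.

```python
import shutil
from pathlib import Path


def old_build_entries(jobs_dir: str) -> list:
    """List entries under every 'builds' directory whose name starts with 1-9."""
    found = []
    for builds in Path(jobs_dir).rglob("builds"):
        if builds.is_dir():
            found.extend(sorted(builds.glob("[1-9]*")))
    return found


def remove_entries(entries: list) -> None:
    """Delete the listed entries: directories recursively, files and symlinks directly."""
    for entry in entries:
        if not entry.exists() and not entry.is_symlink():
            continue  # already removed together with a parent directory
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


# Example calls (path is an assumption, use your own $JENKINS_HOME/jobs):
# to_delete = old_build_entries("/var/lib/jenkins/jobs")
# remove_entries(to_delete)
```

As with the shell version, this throws away build history, so it should only be run with Jenkins stopped and after reviewing the list.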
This is an old thread.. but my personal recommendation is to WAIT before attempting to do anything (such as restarting service, etc).
I wasted hours once trying to fix something that turned out to be not an issue in the first place. In the end, I messed things up and wasted a lot of time.
Just because you see errors in the logs doesn't necessarily mean that you need to take action.
The upgrade took about 45 minutes in the end for me. All I did at one point was refresh my browser window. It can take a while.
Just my opinion
On Win 10: Stopping with the service command from the command line reported failure to stop the service, but I was able to stop it from services.msc (running as administrator). The updates were applied. Sorry, no definitive answer from me. YMMV.
I used TCPView and killed the processes that were using port 8080. Basically they were all java.exe processes from Jenkins. I killed all of them and restarted the Jenkins service.
Try restarting it from the Windows Services console. It will work.
I observed the same issue after installing a plugin and opting to restart Jenkins when no jobs are running.
When I looked at the Jenkins server process, it was running fine with no issues.
After restarting the Jenkins service with the command below and reloading the browser, Jenkins was up.
sudo service jenkins restart
If Jenkins is taking an unusually long time to restart the best recourse is to check the generated logs to see what may be wrong. However, even that may be of little help because many plugins try to be "quiet" by default, even if they are furiously working to load content. So if all else fails, you may have to resort to manually disabling plugins.
However, here is a free tip: some plugins are known to be messy. For example, we observed the Job Config History plugin writing hundreds of thousands of records for both job configuration changes AND agent changes. Removing this plugin and deleting the configHistory folder fixed one problem where our startup literally took > 4 hours.
In our case, the problem was we were launching ephemeral agents (via docker and/or kubernetes). Each new "agent" was treated as a configuration change. With thousands of agents per day, it didn't take long to fill up a substantial part of the disk with history that never was effectively cleared.
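To find out whether a folder such as the configHistory one has grown this way, counting its files and bytes is usually enough. This is a minimal sketch using only the standard library; `tree_stats` is a hypothetical helper, and the folder path in the example call is an assumption based on the name used above.

```python
import os


def tree_stats(root: str) -> tuple:
    """Return (file_count, total_bytes) for all files below `root`."""
    files = 0
    size = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are counted by their own size, not followed.
                size += os.lstat(full).st_size
            except OSError:
                continue  # file vanished while walking, e.g. Jenkins is still running
            files += 1
    return files, size


# Example call (hypothetical location under JENKINS_HOME):
# tree_stats("/var/lib/jenkins/configHistory")
```

Running it against each top-level folder in JENKINS_HOME and sorting by file count is a quick way to spot which plugin is leaking data.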
There are other plugins that leak data in this way. And you can also create self-inflicted wounds, e.g. by using a standalone process to remove "obsolete" files. An example where we were "bitten" is a process that tried to discard old build records, but did an incomplete job - and was "warring" with the running Jenkins process. Jenkins will try breaking its neck to load a build.xml record that is empty or incomplete.
Three more tips:
You can install the monitoring plugin. Often when the jenkins UI proper didn't start, we were able to see the /monitoring in action.
Likewise, /userContent can often be loaded even when the rest of the UI is not fully up.
Don't rule out bad actors. It just takes one aggressive script that tries, e.g. to load the entire build history and ship it back via a REST call to effectively deny service to all other UI users.
I fixed it by editing the file named hudson.model.UpdateCenter.xml, located in /var/lib/jenkins
I changed the URL to https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
Finally, I restarted Jenkins. That solved my problem
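The same edit can be scripted if several servers need it. This is a minimal sketch under the assumption that the file contains a single `<url>` element holding the update site address; `set_update_center_url` is a hypothetical helper that works on the file's text and leaves everything else untouched.

```python
import re

# Matches the first <url>...</url> element, keeping the tags in groups 1 and 3.
URL_ELEMENT = re.compile(r"(<url>)(.*?)(</url>)", re.DOTALL)


def set_update_center_url(xml_text: str, new_url: str) -> str:
    """Return `xml_text` with the content of the first <url> element replaced."""
    return URL_ELEMENT.sub(
        lambda m: m.group(1) + new_url + m.group(3), xml_text, count=1
    )


# Example usage (path taken from the answer above):
# from pathlib import Path
# cfg = Path("/var/lib/jenkins/hudson.model.UpdateCenter.xml")
# cfg.write_text(set_update_center_url(
#     cfg.read_text(),
#     "https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json"))
```

Keeping a copy of the original file before writing makes it easy to switch back if the mirror turns out not to be the cause.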
