Over the weekend, my Jenkins slave was down. Now that it is back up, I can SSH both from and to the master, so I expect Jenkins to be able to restart the agent on that slave. The agent is configured to be launched via SSH.
What happens instead is that the master apparently cannot connect to the slave, and the log shows exactly nothing.
What can I do to get this unstuck?
I have already tried temporarily disabling and re-enabling the node.
Restarting the master helped. My problem sounded similar to this old Jenkins bug, and the workaround described there worked for me.
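In case it helps others, here is a minimal sketch of the two usual ways to restart a master; the service name, URL, and credentials are assumptions about your install, not something taken from my setup:

    # Assuming a systemd-managed Linux master:
    sudo systemctl restart jenkins

    # Or trigger a safe restart over HTTP, which waits for running builds to finish:
    curl -X POST --user admin:API_TOKEN http://jenkins.example.com:8080/safeRestart

The /safeRestart endpoint is the gentler option; depending on your CSRF settings you may also need to supply a crumb with the POST.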
I have a Jenkins master setup with two Linux slaves and a Windows slave. All boxes are switched off at night and restarted in the morning. In the morning the Jenkins master shows the two Linux nodes, but not the Windows slave (it simply disappears and is not even shown as offline). The Jenkins version I am using is 2.73.
The problem was related to the Swarm client configuration. It was resolved by putting together the correct configuration files and enabling the Swarm client on machine startup (to handle the situation where the machine goes down); see the sketch below.
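A minimal sketch of the startup command, assuming the Swarm plugin's client jar; the jar path, master URL, credentials, and node name are placeholders, and flag names vary slightly between Swarm client versions:

    rem Run at Windows startup (e.g. via Task Scheduler) so the node rejoins after a reboot.
    java -jar swarm-client.jar ^
        -master http://jenkins.example.com:8080 ^
        -username svc_jenkins -password API_TOKEN ^
        -name windows-slave -labels windows -executors 2

Because Swarm nodes register themselves, the master only lists them while a client is connected, which is consistent with the node "disappearing" overnight rather than showing as offline.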
I am running Jenkins 2.103 in Docker and have connected it to a Kubernetes-on-ARM cluster.
I have been able to manually connect the JNLP (v3.16) slave to the master, but it takes around 15 minutes for it to fully connect and report as online. Once online, I can run builds as expected.
The problem is that the 'slaveConnectTimeout' setting in the podTemplate is not honoured in the pipeline configuration, and neither is the default 'Timeout in seconds for Jenkins connection' setting in the Pod Template section of the global settings.
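For reference, this is roughly where the setting sits in a scripted pipeline; the label, image tag, and timeout value here are illustrative placeholders:

    podTemplate(label: 'jnlp-agent',
                // Timeout (in seconds) for the agent to connect; this is the
                // setting that does not appear to be honoured.
                slaveConnectTimeout: 300,
                containers: [
                    containerTemplate(name: 'jnlp', image: 'jenkins/jnlp-slave:3.16-1')
                ]) {
        node('jnlp-agent') {
            sh 'echo connected'
        }
    }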
Has anyone been able to make this setting work, and does anyone have an idea what could be causing the 15-minute delay in registration?
This issue has now also been raised as bug JENKINS-49281.
The issue ended up being OpenJDK, and me not fully understanding what the Kubernetes timeout is actually for.
The delay in agent registration is not just a Jenkins issue; I have seen the same behaviour in GoCD and other Java-based apps. It is a platform issue, not an app issue.
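The answer only names OpenJDK as the culprit, so the following is an assumption rather than the poster's confirmed fix: a frequently reported cause of multi-minute JVM start-up delays on small ARM boards is SecureRandom blocking on /dev/random while the kernel gathers entropy. Pointing the JVM at the non-blocking pool is a common mitigation; the URL and secret below are placeholders:

    # Hypothetical mitigation, not confirmed by the poster: use the
    # non-blocking entropy source when launching the JNLP agent.
    java -Djava.security.egd=file:/dev/./urandom \
        -jar slave.jar \
        -jnlpUrl http://jenkins.example.com:8080/computer/arm-node/slave-agent.jnlp \
        -secret AGENT_SECRET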
The on-demand slaves are created successfully from Jenkins. The first build on a slave succeeds, but subsequent builds fail. Restarting the slave, or restarting the WinRM service, allows builds to proceed again.
The tcpdump shows no errors, and I can't figure out what the issue is. It looks like a problem with Jenkins communicating with the on-demand slaves over WinRM.
Has anybody faced a similar issue?
The on-demand slaves are Windows slaves.
The issue was with the WinRM "MaxMemoryPerShellMB" setting, which was set too low. When npm or git ran during a checkout, the WinRM shell ran out of memory.
I increased it to 1 GB and it is working fine now.
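A sketch of checking and raising the limit from an elevated command prompt on the slave; the 1024 MB value mirrors what worked for me, so pick whatever fits your builds:

    rem Inspect the current WinRM shell limits:
    winrm get winrm/config/winrs
    rem Raise MaxMemoryPerShellMB to 1 GB:
    winrm set winrm/config/winrs @{MaxMemoryPerShellMB="1024"}

From PowerShell the equivalent is Set-Item WSMan:\localhost\Shell\MaxMemoryPerShellMB 1024.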
I'm trying to run a job from my Jenkins master on a virtual machine that has a Jenkins slave on it.
What I want is to run tests that live in Quality Center on my virtual machine, but when I try to start the Jenkins slave, this error occurs.
Has anyone ever seen it? Do you have any idea how to fix it? Do the Jenkins master and the slave need to be in the same domain for this to work?
I went back to the Jenkins master, deleted my current node, and created a new one. Then I downloaded the slave agent again and it worked!
If you're having this problem, try this; a scripted version of the delete step is sketched below.
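If you prefer to script the deletion rather than clicking through the UI, the Jenkins CLI can do it; this is a sketch, and the URL, credentials, and node name are placeholders (recreating the node is then done in the UI as above):

    # Fetch the CLI jar from your own master first:
    curl -O http://jenkins.example.com:8080/jnlpJars/jenkins-cli.jar
    # Delete the stale node:
    java -jar jenkins-cli.jar -s http://jenkins.example.com:8080/ \
        -auth admin:API_TOKEN delete-node my-vm-node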
This morning, we noticed that all PuTTY sessions running Jenkins had been closed due to a network issue. Once the network was up, we restarted Jenkins and observed that the Jenkins dashboard was not showing ANY jobs. We had around 80 jobs on the dashboard. We are using VM servers for the master/slave setup. config.xml is fine. What do we do? How do we get back on track?
All Jenkins jobs are basically XML config files kept in the Jenkins home directory.
If your Jenkins is not showing these jobs, then it is not using the same home directory.
Check the Jenkins process to see which directory it is pointing to.
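A quick way to check, assuming a Linux master; the service name and paths may differ on your VMs:

    # Find the running Jenkins process and look for -DJENKINS_HOME=... or the
    # JENKINS_HOME environment variable in its command line:
    ps -ef | grep -i '[j]enkins'
    # Each job is a folder holding a config.xml under $JENKINS_HOME/jobs:
    ls /var/lib/jenkins/jobs

If that listing shows your 80 job folders, point Jenkins back at that directory and the dashboard should repopulate.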