We are using Jenkins version 2.121.3.
For the Java JDK, we are using different versions on the master and the slave machines.
This issue occurs only for security jobs, which are added through a Jenkins node.
The issue occurs only occasionally: sometimes the job succeeds and sometimes it fails.
As a temporary solution we relaunch the nodes, and that gets things working again.
We are searching for a permanent solution but have not been able to find one.
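One way to confirm what each node is actually running (assuming access to Manage Jenkins → Script Console) is a sketch like the following, which prints the JVM version reported by the master and by every connected agent, to help rule out a master/slave JDK mismatch:

```groovy
// Script Console sketch: print the Java version the master and each
// connected agent report, to spot master/slave JDK mismatches.
import jenkins.model.Jenkins

Jenkins.instance.computers.each { computer ->
    if (computer.channel != null) {
        // getSystemProperties() queries the node itself over the remoting channel
        def javaVersion = computer.systemProperties['java.version']
        println "${computer.name ?: 'master'}: Java ${javaVersion}"
    } else {
        println "${computer.name}: offline"
    }
}
```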
Related
I am running Jenkins 2.103 in Docker and have connected it to a Kubernetes-on-ARM cluster.
I have been able to manually connect the JNLP (v3.16) slave to the master; however, it appears to take around 15 minutes for it to fully connect and report as online. Once online, I can run builds as expected.
The problem is that the 'slaveConnectTimeout' setting in the podTemplate does not appear to be honoured in the pipeline configuration, and neither is the default template setting of 'Timeout in seconds for Jenkins connection' in the Pod Template section of the global settings.
Has anyone been able to make this setting work, and does anyone have any idea what could be causing the 15-minute delay in registration?
This issue has now also been raised as bug JENKINS-49281.
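For reference, a minimal scripted-pipeline sketch of where slaveConnectTimeout is declared with the Kubernetes plugin; the label, image, and timeout value here are placeholders, and exact parameter names can differ between plugin versions:

```groovy
// Sketch of a Kubernetes plugin podTemplate declaring slaveConnectTimeout.
// Label, image, and timeout are placeholders, not recommendations.
podTemplate(
    label: 'k8s-arm',
    slaveConnectTimeout: 300,   // seconds to wait for the agent to connect
    containers: [
        // Placeholder JNLP image; an ARM build would be needed on this cluster.
        containerTemplate(name: 'jnlp',
                          image: 'jenkins/jnlp-slave:3.16-1',
                          args: '${computer.jnlpmac} ${computer.name}')
    ]
) {
    node('k8s-arm') {
        sh 'echo agent is online'
    }
}
```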
The issue ended up being OpenJDK and me not fully understanding what the Kubernetes timeout is about.
The delay in agent registration is not just a Jenkins issue; I have seen the same behaviour in GoCD and other Java-based apps. It is a platform issue, not an app issue.
The on-demand slaves are being created successfully from Jenkins. The first build on a slave succeeds, but the subsequent builds fail. Restarting the slave or restarting the WinRM service allows the builds to proceed again.
The tcpdump shows no errors, and I can't figure out what the issue is. It looks like a problem with Jenkins communicating with the on-demand slaves over WinRM.
Has anybody faced a similar issue?
The on-demand slaves are Windows slaves.
The issue was with the "MaxMemoryPerShellMB" parameter of WinRM, which was set too low, so when npm or git was doing a checkout it ran out of memory in the WinRM shell.
I have increased it to 1 GB and it's working fine now.
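For anyone hitting the same limit, a sketch of how the quota could be raised from a job running on the Windows node; this assumes a node labelled 'windows' (hypothetical) and administrative rights there, and the same winrm command can just as well be run by hand in an elevated prompt:

```groovy
// Sketch: raise the WinRM shell memory quota on a Windows agent.
// Assumes a node labelled 'windows' (hypothetical) and admin rights on the box.
node('windows') {
    // 1024 MB is an example value, not a recommendation
    bat 'winrm set winrm/config/winrs @{MaxMemoryPerShellMB="1024"}'
}
```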
We have a large number of Jenkins slaves set up via the jenkins-swarm plugin (which auto-discovers the master).
Recently, we started to see slaves go offline and existing jobs get stuck. We fixed it by restarting the master, but it has been happening too frequently. Everything else seems fine: no network issues and no GC issues.
Anyone?
On the Jenkins slaves, we see this repeatedly once the node becomes unavailable:
Retrying in 10 seconds
Attempting to connect to http://ci.foobar.com/jenkins/ 46185efa-0009-4281-89d2-c4018fa9240c with ID 5a0f1773
Could not obtain CSRF crumb. Response code: 404
Failed to create a slave on Jenkins CODE: 409
A slave called '10.9.201.89-5a0f1773' is already created and on-line
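When the client loops on the 409 "already created" response, one workaround is to remove the stale node entry on the master so the swarm client can re-register. A Script Console sketch, using the node name from the log above (this clears the symptom; it does not explain why the node went offline in the first place):

```groovy
// Script Console sketch: remove a stale swarm node entry so the client
// can re-register without hitting the 409 "already created" conflict.
import jenkins.model.Jenkins

def staleName = '10.9.201.89-5a0f1773'   // node name from the log above
def node = Jenkins.instance.getNode(staleName)
if (node != null) {
    Jenkins.instance.removeNode(node)
    println "Removed ${staleName}"
} else {
    println "No node named ${staleName}"
}
```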
I am using the most recent Jenkins setup from https://cloud.google.com/tools/repo/push-to-deploy to run CI for my development environment codebase. I noticed that the Jenkins Git plugin in the provided image is out of date and seems to have a bug with regex branch matching. I updated the plugin to the latest version, but after that Jenkins will not restart.
Rebooting the server got Jenkins back up, but the Google startup script doesn't relaunch and reconnect the Docker slave images properly.
Has anyone had this experience and resolved it?
In the job's post-build actions, I am archiving the artifacts. 90% of the time, when the Jenkins job reaches this step, the slave on which it is running hangs or goes offline, or the job hangs, and if I kill the job it throws a "Caused by: java.lang.OutOfMemoryError: Java heap space" error.
I am running Jenkins version 1.560.
Has anyone seen this, or is anyone aware of a fix? Any help is appreciated.
Thanks
It looks like you're running into https://issues.jenkins-ci.org/browse/JENKINS-22734, which started in version 1.560 and will be fixed in 1.563.
It's always a good idea to browse the Jenkins change log, especially the Community Ratings section, when you install a new version.
Whenever the Hudson master runs out of space, slaves will disconnect and have to be restarted.
You need to check the Hudson master box and see how much space is allocated to the drive where Hudson is running.
Another thing to note is that even if a job runs on a slave, artifacts are always archived on the master, so disk space on the master needs to be allocated appropriately.
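A quick way to check this from the master itself is a Script Console sketch along these lines, which reports the free space on the volume holding JENKINS_HOME (where archived artifacts end up):

```groovy
// Script Console sketch: report free space on the volume that holds
// JENKINS_HOME, where archived artifacts are stored.
import jenkins.model.Jenkins

def home = Jenkins.instance.rootDir
def freeGiB = home.usableSpace / (1024d * 1024 * 1024)
println "JENKINS_HOME: ${home}"
println String.format('Free space: %.1f GiB', freeGiB)
```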
I ran into this issue with version 1.560 of Jenkins. For now, I have disabled the archiving of the Maven artifacts from the "Build" section.