Jenkins agent keeps disconnecting / reconnecting repeatedly - jenkins

I have a Jenkins master server. I just created a new Jenkins agent and am launching it via Java Web Start on an Ubuntu host. The agent connects successfully, but after some time it says "Terminated", and then after some more time it says "Connected" again. It keeps repeating this cycle.
I am not even trying to run a build/job yet.
Interestingly, this Ubuntu agent, its JNLP file, and Java Web Start have been working fine for the last several weeks, even until a few hours ago. Now it has suddenly started to disconnect and reconnect repeatedly like this.
JNLP agent connected from /116.68.205.58
<===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 3.2
This is a Unix agent
ERROR: Connection terminated
java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2353)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2822)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
ERROR: Failed to install restarter
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:307)
at hudson.remoting.Channel.terminate(Channel.java:888)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:92)
at ......remote call to Channel to /116.68.205.58(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1537)
at hudson.remoting.Request.call(Request.java:172)
at hudson.remoting.Channel.call(Channel.java:821)
at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.install(JnlpSlaveRestarterInstaller.java:52)
at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller.access$000(JnlpSlaveRestarterInstaller.java:33)
at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:39)
at jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$1.call(JnlpSlaveRestarterInstaller.java:36)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:73)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2353)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2822)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:59)
JNLP agent connected from /116.68.205.58
<===[JENKINS REMOTING CAPACITY]===>Slave.jar version: 3.2
This is a Unix agent

Check the Jenkins slave log for possible problems. Also, what is the Availability setting on the Jenkins node configuration page?
Jenkins >> Manage Jenkins >> Manage Nodes >> your node >> Configure
I recently had a Windows slave with the same symptom and changed the Availability from
"Take this agent online when in demand, and offline when idle"
to
"Keep this agent online as much as possible"
and it solved my problem, but you might have a different problem from the one I had. So I suggest viewing the slave logs first. If you can, post the log snippet here for further analysis.
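When reading the slave log, it can help to count how often the agent connects and drops. The following is a minimal sketch, assuming the log contains the same "JNLP agent connected from" and "ERROR: Connection terminated" lines shown in the question; the function name summarize_agent_log is hypothetical and not part of Jenkins.

```python
import re

# Line patterns taken from the log excerpt in the question.
CONNECTED = re.compile(r"^JNLP agent connected from (\S+)")
TERMINATED = "ERROR: Connection terminated"


def summarize_agent_log(text):
    """Count connect and terminate events in a Jenkins agent log."""
    connects = 0
    terminations = 0
    peers = set()
    for line in text.splitlines():
        line = line.strip()
        match = CONNECTED.match(line)
        if match:
            connects += 1
            peers.add(match.group(1))
        elif line.startswith(TERMINATED):
            terminations += 1
    return {
        "connects": connects,
        "terminations": terminations,
        "peers": sorted(peers),
    }


sample = """JNLP agent connected from /116.68.205.58
This is a Unix agent
ERROR: Connection terminated
java.io.IOException: Unexpected termination of the channel
JNLP agent connected from /116.68.205.58
"""
summary = summarize_agent_log(sample)
print(summary)
```

A high count of terminations with no builds running would support the idea that the disconnects come from the node configuration or the network rather than from a job.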

Related

Jenkins remote executors are disconnecting frequently

I am trying to set up 4 RHEL machines in Jenkins as build executors, but the systems get disconnected very frequently. I have already checked the following points, still with no luck:
Network problems between master and slave
Slave server availability
Slow response from slave (disabled Response time in Manage Jenkins > Manage Nodes > Configure > Response time)
I get the following error every time:
Feb 22, 2019 7:59:55 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
INFO: I/O error in channel channel
java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2353)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2822)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)

JRebel remoting crashing WebSphere on Docker using Eclipse

I have configured JRebel remoting mode in an Eclipse Maven project on a Windows machine, and I am running WebSphere in a Linux Docker container.
The JVM args are set to
-agentpath:/opt/jrebel/lib/libjrebel64.dll -Drebel.remoting_plugin=true
When I change source code, JRebel starts updating the code, and I get this error:
2018-02-27 23:32:14.066 ERROR [rebel-CancellableExecutorService-1] c.z.jrebel.remoting.Transaction - [OUT] [tr_36] [Project <maven-module-name>, server websphere] Synchronization failed! Read timed out
com.zeroturnaround.jrebel.remoting.RemotingException: Read timed out
at com.zeroturnaround.jrebel.remoting.net.RemotingClient.tryMakePostRequest(JRebelRemoting:189)
at com.zeroturnaround.jrebel.remoting.net.RemotingClient.sendTransactionCommand(JRebelRemoting:147)
at com.zeroturnaround.jrebel.remoting.net.RemotingClient.commitTransaction(JRebelRemoting:101)
at com.zeroturnaround.jrebel.remoting.Transaction.commit(JRebelRemoting:487)
at com.zeroturnaround.jrebel.remoting.Transaction.synchronize(JRebelRemoting:231)
at com.zeroturnaround.jrebel.remoting.RemoteServer$1.run(JRebelRemoting:56)
at org.zeroturnaround.common.util.ExecutorUtil$RunnableWrapper.run(ExecutorUtil.java:153)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
After that, the web application and WebSphere crashed.
The error message indicates that the connection failed between the IDE and the WebSphere application server running inside your Linux Docker container.
Firstly, if you are starting your WebSphere server on a Linux machine, you should use JRebel's agent library compiled for Linux. That is, the -agentpath argument should point to libjrebel64.so instead of libjrebel64.dll. Using the .dll in a Linux environment should crash the server on startup. My current hypothesis is that the server does not respond because it never starts up correctly.
Additionally, please make sure that the server is reachable via the address you entered in the remote server configuration on the IDE side.
If the same issue persists after the previous suggestions, please reproduce the issue and send us the jrebel.log and jrebel-eclipse.log files from the ~/.jrebel/ directory to support#zeroturnaround.com.
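The library mismatch described above can be caught before the server is started. This is a rough sketch, assuming only the .so/.dll pairing mentioned in this answer; the function check_agentpath and the EXPECTED_SUFFIX table are hypothetical helpers, not part of JRebel.

```python
import re

# Suffix expected for the agent library on each target OS, per the answer above.
EXPECTED_SUFFIX = {"linux": ".so", "windows": ".dll"}


def check_agentpath(jvm_args, target_os):
    """Return a warning if -agentpath names a library for another OS, else None."""
    match = re.search(r"-agentpath:(\S+)", jvm_args)
    if match is None:
        return "no -agentpath argument found"
    # The JVM syntax is -agentpath:pathname[=options]; keep only the path.
    library = match.group(1).split("=", 1)[0]
    expected = EXPECTED_SUFFIX[target_os]
    if not library.lower().endswith(expected):
        return f"{library} does not end with {expected}, expected for {target_os}"
    return None


args = "-agentpath:/opt/jrebel/lib/libjrebel64.dll -Drebel.remoting_plugin=true"
warning = check_agentpath(args, "linux")
print(warning)
```

For the arguments from the question and a Linux container, the check returns a warning; with libjrebel64.so in the path it returns None.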

Jenkins upgrade to 2.58 can't launch slaves

I have upgraded Jenkins from 2.55 to 2.58, and the upgrade appears to have taken all our slaves offline.
Jenkins and the slaves all run on Linux (RHEL 6).
I've read quite a bit about similar issues but found nothing that is 100% the same.
Has anyone run into this before? I've been doing some reading on MACs in sshd_config but haven't had any success getting the slaves restarted.
Any ideas for a solution are greatly appreciated, as I'm probably going to have to shout cake unless I get it sorted by the morning :D
Cheers,
Deon.
Slave log follows -
[05/03/17 19:55:25] [SSH] Opening SSH connection to .
Key exchange was not finished, connection is closed.
java.io.IOException: There was a problem while connecting to
at com.trilead.ssh2.Connection.connect(Connection.java:834)
at com.trilead.ssh2.Connection.connect(Connection.java:703)
at com.trilead.ssh2.Connection.connect(Connection.java:617)
at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:1265)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:790)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:785)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Key exchange was not finished, connection is closed.
at com.trilead.ssh2.transport.KexManager.getOrWaitForConnectionInfo(KexManager.java:95)
at com.trilead.ssh2.transport.TransportManager.getConnectionInfo(TransportManager.java:237)
at com.trilead.ssh2.Connection.connect(Connection.java:786)
... 9 more
Caused by: java.io.IOException: Cannot read full block, EOF reached.
at com.trilead.ssh2.crypto.cipher.CipherInputStream.getBlock(CipherInputStream.java:81)
at com.trilead.ssh2.crypto.cipher.CipherInputStream.read(CipherInputStream.java:108)
at com.trilead.ssh2.transport.TransportConnection.receiveMessage(TransportConnection.java:232)
at com.trilead.ssh2.transport.TransportManager.receiveLoop(TransportManager.java:706)
at com.trilead.ssh2.transport.TransportManager$1.run(TransportManager.java:502)
... 1 more
[05/03/17 19:55:25] Launch failed - cleaning up connection
[05/03/17 19:55:25] [SSH] Connection closed.

Unable to create a Windows slave node for Jenkins

I have set up a master on an Ubuntu machine and want to create a slave on Windows 10. While launching the agent I am facing the following issue. Can someone please help?
just before slave javed_pc gets launched ...
executing pre-launch scripts ...
[2017-04-21 10:26:54] [windows-slaves] Connecting to 172.26.152.23
Checking if Java exists
java -version returned 1.8.0.
[2017-04-21 10:26:56] [windows-slaves] Copying jenkins-slave.xml
[2017-04-21 10:26:56] [windows-slaves] Copying slave.jar
[2017-04-21 10:26:56] [windows-slaves] Starting the service
ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
org.jinterop.dcom.common.JIException: Service Logon Failure
at org.jvnet.hudson.wmi.Win32Service$Implementation.start(Win32Service.java:149)
Caused: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor206.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.kohsuke.jinterop.JInteropInvocationHandler.invoke(JInteropInvocationHandler.java:140)
Caused: java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy64.start(Unknown Source)
at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:342)
at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:262)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
If you are using the Jenkins Windows Slaves Plugin, check whether one of the reasons listed in "Windows slaves fail to start via DCOM" applies in your case.
It lists a wide variety of causes, from the Windows account used to network, registry, and security settings.
Make sure you don't have a proxy issue, where Jenkins would try to use said proxy to access a machine (Windows here) on your LAN: the environment variable no_proxy should be used to exclude your local domain.
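To see whether a given slave address would be excluded by a no_proxy value, something like the following can be used. It is only a simplified sketch: tools differ in how they interpret no_proxy, so the matching rules here (exact match, domain suffix, or "*") are an assumption, and the function name and the example.local domain are hypothetical.

```python
def is_excluded_by_no_proxy(host, no_proxy):
    """Return True if host matches an entry in a comma-separated no_proxy value.

    Simplified matching: exact match, domain-suffix match, or '*' for everything.
    """
    host = host.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            return True
        # Treat ".example.local" and "*.example.local" like "example.local".
        entry = entry.lstrip("*").lstrip(".")
        if host == entry or host.endswith("." + entry):
            return True
    return False


# The slave address comes from the launch log in the question.
print(is_excluded_by_no_proxy("172.26.152.23", "localhost,127.0.0.1"))
print(is_excluded_by_no_proxy("172.26.152.23", "localhost,172.26.152.23"))
```

In practice the value to test would be read from the environment of the Jenkins process, for example with os.environ.get("no_proxy", "").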
The OP Javed Ahmed reports having solved it with:
In the 'Configure Global Security' settings, when you check the 'Enable Security' option, it allows you to connect via Java Web Start.
Otherwise it was not showing the option to connect through Java Web Start, and connecting via a Windows service is a pain.

Unable to start SLAVE node in Jenkins

I am unable to start a slave node in Jenkins.
The master machine shows the following exception in its log file.
[12/01/14 16:21:44] [SSH] Opening SSH connection to 10.0.11.120:22.
Connection refused: connect
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
java.lang.IllegalStateException: Connection is not established!
at com.trilead.ssh2.Connection.getRemainingAuthMethods(Connection.java:1030)
at com.cloudbees.jenkins.plugins.sshcredentials.impl.TrileadSSHPasswordAuthenticator.canAuthenticate(TrileadSSHPasswordAuthenticator.java:82)
at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:207)
at com.cloudbees.jenkins.plugins.sshcredentials.SSHAuthenticator.newInstance(SSHAuthenticator.java:169)
at hudson.plugins.sshslaves.SSHLauncher.openConnection(SSHLauncher.java:1173)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:701)
at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
[12/01/14 16:21:45] Launch failed - cleaning up connection
[12/01/14 16:21:45] [SSH] Connection closed.
[12/01/14 16:23:44] [SSH] Opening SSH connection to 10.0.11.120:22.
If your slave machine is a Mac, then
go to System Preferences -> Sharing, enable Remote Login, and try again.
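The log shows "Connection refused" for port 22, so before retrying the launch it may be worth confirming from the master that something is listening on the slave's SSH port. A minimal sketch using only the Python standard library; the helper name can_connect is hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling can_connect("10.0.11.120", 22) from the master with the address from the log should return True once Remote Login (or sshd) is enabled on the slave. A True result only shows that the port is open, not that authentication will succeed.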
I had this problem too. I traced the log on the slave:
tail -f /var/log/*.log
and saw this message:
Sep 20 14:51:43 clicrdv.aws-eu-west-01.batch-01.adm sshd[1035]: fatal: no matching mac found: client hmac-sha1-96,hmac-sha1,hmac-md5-96,hmac-md5 server hmac-sha2-512,hmac-sha2-256,hmac-ripemd160 [preauth]
Then I deleted this line in /etc/ssh/sshd_config:
#MACs hmac-sha2-512,hmac-sha2-256,hmac-ripemd160
and restarted ssh.
After that there was no problem at all.
Eric
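The sshd message above lists the MAC algorithms offered by the client and by the server, and the connection fails because the two lists share no entry. A small sketch, assuming the log line has the same "no matching mac found: client ... server ..." shape as the one quoted; parse_mac_mismatch is a hypothetical helper name.

```python
import re

NO_MAC = re.compile(r"no matching mac found: client (\S+) server (\S+)")


def parse_mac_mismatch(log_line):
    """Extract client and server MAC lists from an sshd 'no matching mac' line."""
    match = NO_MAC.search(log_line)
    if match is None:
        return None
    client = match.group(1).split(",")
    server = match.group(2).split(",")
    # Algorithms both sides support; empty when negotiation cannot succeed.
    common = [mac for mac in client if mac in server]
    return {"client": client, "server": server, "common": common}


line = (
    "Sep 20 14:51:43 clicrdv.aws-eu-west-01.batch-01.adm sshd[1035]: fatal: "
    "no matching mac found: client hmac-sha1-96,hmac-sha1,hmac-md5-96,hmac-md5 "
    "server hmac-sha2-512,hmac-sha2-256,hmac-ripemd160 [preauth]"
)
result = parse_mac_mismatch(line)
print(result["common"])
```

For the quoted line the common list is empty, which matches the failure: the client only offers SHA-1 and MD5 based MACs, while the server's MACs setting allows only the three listed algorithms.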
I suspect that you need to install the Java Cryptography Extension for your JVM. Without that the RSA key size is limited and authentication is not being established.
See https://issues.jenkins-ci.org/browse/JENKINS-26495 for more details.
I had this problem and resolved it by replacing the IP address with the host name.
The problem was solved for me by replacing the host name with the IP address.
See also: https://issues.jenkins-ci.org/browse/JENKINS-26379?focusedCommentId=249378&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-249378

Resources