I have added an EC2 cloud with the intention of using Windows instances.
The Windows AMI is configured to accept connections over SSH, not WinRM.
When I provision a Windows instance, I see it starting in the AWS console, and I see this in the Jenkins console:
May 07, 2021 5:45:42 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-18-197-156-192.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 5:45:52 PM INFO hudson.plugins.ec2.EC2Cloud log
Failed to connect via ssh: The kexTimeout (10000 ms) expired.
May 07, 2021 5:45:52 PM INFO hudson.plugins.ec2.EC2Cloud log
Waiting for SSH to come up. Sleeping 5.
May 07, 2021 5:45:57 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-18-197-156-192.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 5:45:57 PM INFO hudson.plugins.ec2.EC2Cloud log
Failed to connect via ssh: There was a problem while connecting to ec2-18-197-156-192.eu-central-1.compute.amazonaws.com:22
May 07, 2021 5:45:57 PM INFO hudson.plugins.ec2.EC2Cloud log
Waiting for SSH to come up. Sleeping 5.
(repeat many times)
This is normal of course, as long as the instance is still starting.
After a while, I see in the AWS console that the instance is Running.
And immediately, it gets terminated.
I also see this in the Jenkins log, which may or may not be relevant:
May 07, 2021 5:50:52 PM INFO hudson.plugins.ec2.EC2OndemandSlave lambda$terminate$0
Terminated EC2 instance (terminated): i-0c82439628de13743
May 07, 2021 5:50:52 PM INFO hudson.plugins.ec2.EC2OndemandSlave lambda$terminate$0
Removed EC2 instance from jenkins master: i-0c82439628de13743
This is when I manually provision:
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate getImage
Getting image for request {ExecutableUsers: [],Filters: [],ImageIds: [ami-0d8ed2b0df1b1c6ef],Owners: []}
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
SlaveTemplate{description='Windows', labels='aws windows'}. Considering launching
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
AMI had /dev/sda1
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate setupRootDevice
{DeleteOnTermination: true,SnapshotId: snap-0ab3c9f92925a0abf,VolumeSize: 200,VolumeType: gp2,Encrypted: false}
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
SlaveTemplate{description='Windows', labels='aws windows'}. EBS default encryption value set to: Based on AMI (null)
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
SlaveTemplate{description='Windows', labels='aws windows'}. Setting Instance Initiated Shutdown Behavior : ShutdownBehavior.Terminate
May 07, 2021 6:18:25 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
SlaveTemplate{description='Windows', labels='aws windows'}. Looking for existing instances with describe-instance: {Filters: [{Name: image-id,Values: [ami-0d8ed2b0df1b1c6ef]}, {Name: instance-type,Values: [t3a.medium]}, {Name: key-name,Values: [rsa-8192-aws-jenkins]}, {Name: tenancy,Values: [default]}, {Name: subnet-id,Values: [subnet-d426f499]}, {Name: instance.group-id,Values: [sg-0c2d470529c492f13]}, {Name: tag:jenkins_server_url,Values: [https://jenkins.itextsupport.com/]}, {Name: tag:jenkins_slave_type,Values: [demand_Windows]}, {Name: tag:Name,Values: [Jenkins/Windows]}],InstanceIds: [],}
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.SlaveTemplate logProvisionInfo
SlaveTemplate{description='Windows', labels='aws windows'}. Return instance: {AmiLaunchIndex: 0,ImageId: ami-0d8ed2b0df1b1c6ef,InstanceId: i-0078ac5df5320de3d,InstanceType: t3a.medium,KeyName: rsa-8192-aws-jenkins,LaunchTime: Fri May 07 18:18:26 CEST 2021,Monitoring: {State: pending},Placement: {AvailabilityZone: eu-central-1c,GroupName: ,Tenancy: default,},Platform: windows,PrivateDnsName: ip-172-31-12-78.eu-central-1.compute.internal,PrivateIpAddress: 172.31.12.78,ProductCodes: [],PublicDnsName: ,State: {Code: 0,Name: pending},StateTransitionReason: ,SubnetId: subnet-d426f499,VpcId: vpc-ddb7aab5,Architecture: x86_64,BlockDeviceMappings: [],ClientToken: e9f54d34-c9a4-47db-a472-a1204e9b5784,EbsOptimized: true,EnaSupport: true,Hypervisor: xen,ElasticGpuAssociations: [],ElasticInferenceAcceleratorAssociations: [],NetworkInterfaces: [{Attachment: {AttachTime: Fri May 07 18:18:26 CEST 2021,AttachmentId: eni-attach-0789ae7127addb88f,DeleteOnTermination: true,DeviceIndex: 0,Status: attaching,NetworkCardIndex: 0},Description: ,Groups: [{GroupName: ssh,GroupId: sg-0c2d470529c492f13}],Ipv6Addresses: [],MacAddress: 0a:1a:df:ec:e0:26,NetworkInterfaceId: eni-034d777e34991680a,OwnerId: 050520292612,PrivateDnsName: ip-172-31-12-78.eu-central-1.compute.internal,PrivateIpAddress: 172.31.12.78,PrivateIpAddresses: [{Primary: true,PrivateDnsName: ip-172-31-12-78.eu-central-1.compute.internal,PrivateIpAddress: 172.31.12.78}],SourceDestCheck: true,Status: in-use,SubnetId: subnet-d426f499,VpcId: vpc-ddb7aab5,InterfaceType: interface}],RootDeviceName: /dev/sda1,RootDeviceType: ebs,SecurityGroups: [{GroupName: ssh,GroupId: sg-0c2d470529c492f13}],SourceDestCheck: true,StateReason: {Code: pending,Message: pending},Tags: [{Key: Name,Value: Jenkins/Windows}, {Key: jenkins_slave_type,Value: demand_Windows}, {Key: jenkins_server_url,Value: https://jenkins.itextsupport.com/}],VirtualizationType: hvm,CpuOptions: {CoreCount: 1,ThreadsPerCore: 2},CapacityReservationSpecification: 
{CapacityReservationPreference: open,},Licenses: [],MetadataOptions: {State: pending,HttpTokens: optional,HttpPutResponseHopLimit: 1,HttpEndpoint: enabled},EnclaveOptions: {Enabled: false},}
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2RetentionStrategy start
Start requested for EC2 (eu-central-1 Windows) - Windows (i-0078ac5df5320de3d)
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2Cloud log
Launching instance: i-0078ac5df5320de3d
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2Cloud log
bootstrap()
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2Cloud log
Getting keypair...
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2Cloud log
Using private key rsa-8192-aws-jenkins (SHA-1 fingerprint 87:87:34:8e:a4:d5:7f:17:bb:08:9d:f8:68:c9:33:f1)
May 07, 2021 6:18:26 PM INFO hudson.plugins.ec2.EC2Cloud log
Authenticating as jenkins
May 07, 2021 6:18:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to null on port 22, with timeout 10000.
May 07, 2021 6:18:27 PM INFO hudson.plugins.ec2.EC2Cloud log
No SSH key verification (ssh-ed25519 57:02:49:51:15:8a:41:70:67:ca:7a:e5:48:5f:32:b6) for connections to EC2 (eu-central-1 Windows) - Windows (i-0078ac5df5320de3d)
May 07, 2021 6:18:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connected via SSH.
May 07, 2021 6:18:27 PM WARNING hudson.plugins.ec2.EC2Cloud log
Authentication failed. Trying again...
May 07, 2021 6:18:57 PM INFO hudson.plugins.ec2.EC2Cloud log
Authenticating as jenkins
May 07, 2021 6:18:57 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-3-66-217-65.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:19:00 PM INFO hudson.slaves.SlaveComputer tryReconnect
Attempting to reconnect aws-linux-node-eu-central-1-packer
May 07, 2021 6:19:07 PM INFO hudson.plugins.ec2.EC2Cloud log
Failed to connect via ssh: The kexTimeout (10000 ms) expired.
May 07, 2021 6:19:07 PM INFO hudson.plugins.ec2.EC2Cloud log
Waiting for SSH to come up. Sleeping 5.
May 07, 2021 6:19:12 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-3-66-217-65.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:19:22 PM INFO hudson.plugins.ec2.EC2Cloud log
Failed to connect via ssh: The kexTimeout (10000 ms) expired.
May 07, 2021 6:19:22 PM INFO hudson.plugins.ec2.EC2Cloud log
Waiting for SSH to come up. Sleeping 5.
May 07, 2021 6:19:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-3-66-217-65.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:19:37 PM INFO hudson.plugins.ec2.EC2Cloud log
Failed to connect via ssh: The kexTimeout (10000 ms) expired.
May 07, 2021 6:19:37 PM INFO hudson.plugins.ec2.EC2Cloud log
Waiting for SSH to come up. Sleeping 5.
May 07, 2021 6:19:42 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-3-66-217-65.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:19:47 PM INFO hudson.plugins.ec2.EC2Cloud log
No SSH key verification (ssh-ed25519 78:ed:18:5b:52:00:76:72:8e:ae:73:69:3c:69:31:dd) for connections to EC2 (eu-central-1 Windows) - Windows (i-0078ac5df5320de3d)
May 07, 2021 6:19:47 PM INFO hudson.plugins.ec2.EC2Cloud log
Connected via SSH.
May 07, 2021 6:19:48 PM INFO hudson.plugins.ec2.EC2Cloud log
connect fresh as root
May 07, 2021 6:19:48 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to ec2-3-66-217-65.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:19:48 PM INFO hudson.plugins.ec2.EC2Cloud log
No SSH key verification (ssh-ed25519 78:ed:18:5b:52:00:76:72:8e:ae:73:69:3c:69:31:dd) for connections to EC2 (eu-central-1 Windows) - Windows (i-0078ac5df5320de3d)
May 07, 2021 6:19:48 PM INFO hudson.plugins.ec2.EC2Cloud log
Connected via SSH.
May 07, 2021 6:19:49 PM INFO hudson.plugins.ec2.EC2Cloud log
Creating tmp directory (/tmp) if it does not exist
May 07, 2021 6:20:07 PM INFO hudson.plugins.ec2.EC2Cloud log
Verifying: java -fullversion
May 07, 2021 6:20:12 PM INFO hudson.plugins.ec2.EC2Cloud log
Verifying: which scp
May 07, 2021 6:20:14 PM INFO hudson.plugins.ec2.EC2Cloud log
Copying remoting.jar to: /tmp
May 07, 2021 6:20:15 PM INFO hudson.plugins.ec2.EC2Cloud log
Launching remoting agent (via Trilead SSH2 Connection): java -Dfile.encoding=UTF-8 -jar /tmp/remoting.jar -workDir C:\Users\jenkins
May 07, 2021 6:20:18 PM INFO hudson.plugins.ec2.EC2OndemandSlave lambda$terminate$0
Terminated EC2 instance (terminated): i-0078ac5df5320de3d
May 07, 2021 6:20:19 PM INFO hudson.plugins.ec2.EC2OndemandSlave lambda$terminate$0
Removed EC2 instance from jenkins master: i-0078ac5df5320de3d
And in the console log of the instance:
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Launching instance: i-02435726c827f7a6a
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: bootstrap()
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Getting keypair...
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Using private key rsa-8192-aws-jenkins (SHA-1 fingerprint 87:87:34:8e:a4:d5:7f:17:bb:08:9d:f8:68:c9:33:f1)
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Authenticating as jenkins
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Connecting to null on port 22, with timeout 10000.
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: No SSH key verification (ssh-ed25519 57:02:49:51:15:8a:41:70:67:ca:7a:e5:48:5f:32:b6) for connections to EC2 (eu-central-1 Windows) - Windows (i-02435726c827f7a6a)
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Connected via SSH.
May 07, 2021 6:29:46 PM hudson.plugins.ec2.EC2Cloud
WARNING: Authentication failed. Trying again...
May 07, 2021 6:30:16 PM hudson.plugins.ec2.EC2Cloud
INFO: Authenticating as jenkins
May 07, 2021 6:30:16 PM hudson.plugins.ec2.EC2Cloud
INFO: Connecting to ec2-52-29-72-22.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:30:26 PM hudson.plugins.ec2.EC2Cloud
INFO: Failed to connect via ssh: The kexTimeout (10000 ms) expired.
May 07, 2021 6:30:26 PM hudson.plugins.ec2.EC2Cloud
INFO: Waiting for SSH to come up. Sleeping 5.
(...)
May 07, 2021 6:31:01 PM hudson.plugins.ec2.EC2Cloud
INFO: Connecting to ec2-52-29-72-22.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:31:09 PM hudson.plugins.ec2.EC2Cloud
INFO: No SSH key verification (ssh-ed25519 78:ed:18:5b:52:00:76:72:8e:ae:73:69:3c:69:31:dd) for connections to EC2 (eu-central-1 Windows) - Windows (i-02435726c827f7a6a)
May 07, 2021 6:31:09 PM hudson.plugins.ec2.EC2Cloud
INFO: Connected via SSH.
May 07, 2021 6:31:09 PM hudson.plugins.ec2.EC2Cloud
INFO: connect fresh as root
May 07, 2021 6:31:09 PM hudson.plugins.ec2.EC2Cloud
INFO: Connecting to ec2-52-29-72-22.eu-central-1.compute.amazonaws.com on port 22, with timeout 10000.
May 07, 2021 6:31:13 PM hudson.plugins.ec2.EC2Cloud
INFO: No SSH key verification (ssh-ed25519 78:ed:18:5b:52:00:76:72:8e:ae:73:69:3c:69:31:dd) for connections to EC2 (eu-central-1 Windows) - Windows (i-02435726c827f7a6a)
May 07, 2021 6:31:13 PM hudson.plugins.ec2.EC2Cloud
INFO: Connected via SSH.
May 07, 2021 6:31:13 PM hudson.plugins.ec2.EC2Cloud
INFO: Creating tmp directory (C:\\Windows\\Temp\\) if it does not exist
New-Item: An item with the specified name C:\Windows\Temp already exists.
May 07, 2021 6:31:28 PM hudson.plugins.ec2.EC2Cloud
INFO: Verifying: java -fullversion
openjdk full version "11.0.11+9"
May 07, 2021 6:31:32 PM hudson.plugins.ec2.EC2Cloud
INFO: Verifying: which scp
/c/Windows/System32/OpenSSH/scp
May 07, 2021 6:31:34 PM hudson.plugins.ec2.EC2Cloud
INFO: Copying remoting.jar to: C:\\Windows\\Temp\\
May 07, 2021 6:31:35 PM hudson.plugins.ec2.EC2Cloud
INFO: Launching remoting agent (via Trilead SSH2 Connection): java -Dfile.encoding=UTF-8 -jar C:\\Windows\\Temp\\/remoting.jar -workDir C:\Users\jenkins
HTTP ERROR 404 Not Found
URI: /computer/EC2%20(eu-central-1%20Windows)%20-%20Windows%20(i-02435726c827f7a6a)/logText/progressiveHtml
STATUS: 404
MESSAGE: Not Found
SERVLET: Stapler
I had the same issue, except that I was using the same SSH keys on the master and on the node. My node came online, but builds were executed on the master instead of the node.
I worked out that when Jenkins tried to connect to the node, it was actually connecting to SSH on localhost (the master).
This bit from your log shows it:
INFO: Connecting to null on port 22, with timeout 10000.
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: No SSH key verification (ssh-ed25519 57:02:49:51:15:8a:41:70:67:ca:7a:e5:48:5f:32:b6) for connections to EC2 (eu-central-1 Windows) - Windows (i-02435726c827f7a6a)
May 07, 2021 6:29:45 PM hudson.plugins.ec2.EC2Cloud
INFO: Connected via SSH.
I changed the SSH key for the node, and then I started getting the same behavior as you: the node was killed off as soon as it was connected.
My workaround was to change the SSH port on the master to anything other than 22, so that Jenkins couldn't connect to localhost first.
You could change the SSH port on either the master or the node, but changing it on the master is easier because it doesn't require building a new AMI for the node.
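To check whether your own log shows the same pattern, here is a rough Python sketch that groups the host-key fingerprints by the host the plugin says it is connecting to. The function name and the regular expressions are my own, written against the log lines quoted above, so treat them as assumptions rather than anything the EC2 plugin provides. If the fingerprint seen for "null" differs from the one seen for the instance's public DNS name, the first connection most likely went to a different SSH server.

```python
import re

# Patterns written against the EC2 plugin log lines quoted in this thread
# (assumption: other plugin versions may word these lines differently).
CONNECT_RE = re.compile(r"Connecting to (\S+) on port (\d+)")
FINGERPRINT_RE = re.compile(r"No SSH key verification \((\S+) ([0-9a-f:]+)\)")


def host_keys_by_target(log_text):
    """Map each connection target to the set of host-key fingerprints seen for it."""
    seen = {}
    current_target = None
    for line in log_text.splitlines():
        connect = CONNECT_RE.search(line)
        if connect:
            # Remember which host the following fingerprint line belongs to.
            current_target = connect.group(1)
            continue
        fingerprint = FINGERPRINT_RE.search(line)
        if fingerprint and current_target is not None:
            seen.setdefault(current_target, set()).add(fingerprint.group(2))
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python check_host_keys.py < jenkins-node.log
    for target, keys in host_keys_by_target(sys.stdin.read()).items():
        print(target, sorted(keys))
```

In the log from the question, "null" shows fingerprint 57:02:49:... while the ec2-... host shows 78:ed:18:..., which is what you would expect if the first connection was answered by a different machine.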
I'm having an issue with one of my docker containers connecting to my jenkins master. This used to work fine for several months but something must have changed either in Jenkins or our corporate firewall rules that I haven't been able to pinpoint.
Jenkins communicates with Docker host on port 4243 for Docker API.
I have the JNLP port fixed to 50724. My container is using jenkins/jnlp-slave as the base image. I'm using the Yet Another Docker Plugin.
Jenkins is able to start the container, but it fails to establish the JNLP4 connection. This is the error from the docker logs of the container:
Feb 19, 2019 7:49:42 AM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: YAD Singapore Docker-ead378f6bce7
Feb 19, 2019 7:49:42 AM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Feb 19, 2019 7:49:42 AM hudson.remoting.jnlp.Main createEngine
WARNING: Certificate validation for HTTPs endpoints is disabled
Feb 19, 2019 7:49:42 AM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.29
Feb 19, 2019 7:49:42 AM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
Feb 19, 2019 7:49:42 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkins-master.work.com/]
Feb 19, 2019 7:49:42 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver openURLConnection
WARNING: HTTPs certificate check is disabled for the endpoint.
Feb 19, 2019 7:49:43 AM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
Agent address: jenkins-master.work.com
Agent port: 50724
Identity: 3c:1d:86:85:6a:18:a1:bd:89:a7:a9:aa:1b:6b:0c:20
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-master.work.com:50724
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Protocol JNLP4-connect encountered an unexpected exception
java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
at hudson.remoting.Engine.innerRun(Engine.java:614)
at hudson.remoting.Engine.run(Engine.java:474)
Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecvClosed(AckFilterLayer.java:280)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:816)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1800(BIONetworkLayer.java:48)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
at java.lang.Thread.run(Thread.java:748)
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins-master.work.com:50724
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP4-plaintext not supported, skipping
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP3-connect not supported, skipping
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP2-connect not supported, skipping
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener status
INFO: Server reports protocol JNLP-connect not supported, skipping
Feb 19, 2019 7:49:43 AM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: None of the protocols were accepted
java.lang.Exception: The server rejected the connection: None of the protocols were accepted
at hudson.remoting.Engine.onConnectionRejected(Engine.java:682)
at hudson.remoting.Engine.innerRun(Engine.java:639)
at hudson.remoting.Engine.run(Engine.java:474)
The Jenkins log has this:
Feb 19, 2019 7:49:27 AM INFO com.github.kostyasha.yad.DockerCloud provision
Asked to provision load: '1', for: 'sing-slave-docker' label
Feb 19, 2019 7:49:27 AM INFO com.github.kostyasha.yad.DockerCloud provision
Will provision 'jnlp-slave-ssh', for label: 'sing-slave-docker', in cloud: 'YAD Singapore Docker'
Feb 19, 2019 7:49:28 AM INFO com.github.kostyasha.yad.DockerCloud addProvisionedSlave
Provisioning 'jnlp-slave-ssh' number '0' on 'YAD Singapore Docker'; Total containers: '0'
Feb 19, 2019 7:49:37 AM INFO hudson.slaves.NodeProvisioner$2 run
jnlp-slave-ssh provisioning successfully completed. We have now 3 computer(s)
Feb 19, 2019 7:49:37 AM INFO com.github.kostyasha.yad.launcher.DockerComputerJNLPLauncher launch
Starting connection command for ead378f6bce7616b7264de0605747f3299a4c750118c161d68e25bb99ea64b2c...
Feb 19, 2019 7:49:43 AM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #703 failed
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:244)
Feb 19, 2019 7:49:43 AM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #704 failed
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:192)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:244)
Feb 19, 2019 7:49:44 AM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #705 failed
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:244)
Now I have another docker host that is not behind a firewall using the same docker image and it is able to connect and run my build. That's where I figure it has to be an issue with the firewall. But looking at the logs of the successful connection, I'm confused about what ports are actually being used. I know jenkins->docker on port 4243 for Docker API. JNLP port fixed to 50724. The container exposes port 4200 and is mapped to port 49810.
d442c6d53a1b jnlp-slave-ssh "/bin/sh -cxe 'cat <<" 0.0.0.0:49810->4200/tcp sleepy_liskov
But in the jenkins log it shows that it connects on some other port 56602:
Asked to provision load: '1', for: 'lewi-slave-docker' label
Feb 19, 2019 12:36:07 AM INFO com.github.kostyasha.yad.DockerCloud provision
Will provision 'jnlp-slave-ssh', for label: 'lewi-slave-docker', in cloud: 'YAD Lewisville Docker'
Feb 19, 2019 12:36:07 AM INFO com.github.kostyasha.yad.DockerCloud addProvisionedSlave
Provisioning 'jnlp-slave-ssh' number '0' on 'YAD Lewisville Docker'; Total containers: '0'
Feb 19, 2019 12:36:17 AM INFO hudson.slaves.NodeProvisioner$2 run
jnlp-slave-ssh provisioning successfully completed. We have now 4 computer(s)
Feb 19, 2019 12:36:17 AM INFO com.github.kostyasha.yad.launcher.DockerComputerJNLPLauncher launch
Starting connection command for d442c6d53a1b0a3ffa3f55732bceb112f3efacd1078313744cffb6d6c44eae21...
Feb 19, 2019 12:36:20 AM WARNING hudson.TcpSlaveAgentListener$ConnectionHandler run
Connection #562 failed
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at hudson.TcpSlaveAgentListener$ConnectionHandler.run(TcpSlaveAgentListener.java:244)
Feb 19, 2019 12:36:20 AM INFO hudson.TcpSlaveAgentListener$ConnectionHandler run
Accepted JNLP4-connect connection #563 from lewi-docker.work.com/10.180.168.192:56602
What is port 56602 used for? This port is also random. When I run it again it shows up as 57820, etc.
Anything else I can look at or try?
OK, after much back and forth, it turned out to be a firewall issue: the firewall was blocking the agent port 50724.
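For anyone debugging something similar, a minimal Python sketch of two checks follows. The function names are made up for illustration. The first function only tests whether a TCP connection to the agent port can be opened from the Docker host; note that in the failing log above the connection did reach Jenkins before it was reset, so a passing check does not by itself rule out the firewall. The second function illustrates the question about port 56602: as far as the "Accepted JNLP4-connect connection ... from ...:56602" line shows, that number is the source port of the agent's outbound connection, which the operating system picks per connection, so it changes on every run and is not a port that needs to be opened on the Jenkins side.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def demo_ephemeral_source_port():
    """Accept one local connection and return (listening_port, client_source_port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        listening_port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", listening_port)):
            conn, peer = server.accept()
            conn.close()
            # peer[1] is the client's source port, chosen by the OS.
            return listening_port, peer[1]
```

Run from the Docker host (or from inside the container), `can_connect("jenkins-master.work.com", 50724)` would be the first thing to try; the host name and port are the ones from the logs above.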
Looking for some help here!
I am running Jenkins (v2.134) as a Docker container managed by Rancher (v1.6.16), and I am using HAProxy (HA-Proxy version 1.6.3 2015/12/25) as my load balancer. The Jenkins JNLP port is configured as 50000, and HAProxy has a TCP port forwarding rule (8081 > 50000).
My slave (a Mac mini) is on a different network and behind a firewall (this network can reach my Jenkins infrastructure). I am using the "Tunnel connection through" property with the port set to ":8081". Port 8081 is already open in the firewall.
My Jenkins Java version is -
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
My Slave Java version is -
Sun JDK "1.8.0_151"
Here are the agent logs -
*INFO: Setting up agent: my-slave-01*
Sep 26, 2018 2:48:50 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Sep 26, 2018 2:48:50 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.23
Sep 26, 2018 2:48:50 PM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /Users/jenkins/.jenkins/cache/jars
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://<My Jenkins URL>/]
Sep 26, 2018 2:48:51 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
*INFO: Agent discovery successful*
*Agent address: <My Jenkins DNS Name>*
*Agent port: 8081*
Identity: b5:c7:33:8d:9c:97:41:3f:e1:b1:b5:31:25:ea:b5:2e
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to <My Jenkins DNS>:8081
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Sep 26, 2018 2:48:51 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: b5:c7:33:8d:9c:97:41:3f:e1:b1:b5:31:25:ea:b5:2e
Sep 26, 2018 2:48:52 PM hudson.remoting.jnlp.Main$CuiListener status
*INFO: Connected*
Sep 26, 2018 2:48:54 PM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.envinject.EnvInjectComputerListener$2; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
Sep 26, 2018 2:49:49 PM hudson.remoting.jnlp.Main$CuiListener status
*INFO: Terminated*
Any help will be appreciated.
Is there a way to get more verbose logs?
I found the problem: my load balancer (HAProxy) had a default timeout of 50000 ms for TCP connections with no traffic.
https://cbonte.github.io/haproxy-dconv/configuration-1.5.html
However, the Jenkins ping interval to the slave, i.e. hudson.slaves.ChannelPinger.pingInterval, was 5 minutes (the default).
https://wiki.jenkins.io/display/JENKINS/Features+controlled+by+system+properties
Increasing the load balancer timeout to more than 5 minutes solved the issue.
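The arithmetic can be sketched in a few lines of Python. The helper names are mine, and the timestamp format is the one used in the agent log above; month and AM/PM parsing assumes an English locale. In HAProxy the idle limits are typically set with the `timeout client` and `timeout server` directives, but check your own configuration, since I'm only going by the 50000 ms value mentioned above.

```python
from datetime import datetime

# Timestamp format of the agent log lines, e.g. "Sep 26, 2018 2:48:54 PM".
LOG_TIME_FORMAT = "%b %d, %Y %I:%M:%S %p"


def idle_timeout_outlasts_ping(proxy_timeout_ms, ping_interval_minutes):
    """True if the proxy's idle timeout is longer than the gap between pings."""
    return proxy_timeout_ms > ping_interval_minutes * 60 * 1000


def seconds_between(first, second):
    """Seconds elapsed between two log timestamps."""
    start = datetime.strptime(first, LOG_TIME_FORMAT)
    end = datetime.strptime(second, LOG_TIME_FORMAT)
    return (end - start).total_seconds()


if __name__ == "__main__":
    # 50000 ms timeout vs. a 5 minute ping interval: the link idles out first.
    print(idle_timeout_outlasts_ping(50000, 5))
    # Last log activity before "Terminated" vs. the "Terminated" line itself.
    print(seconds_between("Sep 26, 2018 2:48:54 PM", "Sep 26, 2018 2:49:49 PM"))
```

The gap between the last logged activity and "Terminated" in the agent log is a little under a minute, which is in the same range as the 50000 ms idle timeout and far below the 5 minute ping interval.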
I'm using the Jenkins Mesos plugin for CI.
Initially, I followed the following tutorial: http://www.ebaytechblog.com/2014/05/12/delivering-ebays-ci-solution-with-apache-mesos-part-ii/
but Jenkins itself was not set up by it. (I got a "could not load config.xml file" error, even though the file was there.)
Then I followed https://rogerignazio.com/blog/scaling-jenkins-mesos-marathon/ and I was able to run the Jenkins master (Jenkins framework/scheduler), but when I define the scripts to run, the Jenkins slaves are not being created. I think I'm missing some configuration regarding slaves. Can you tell me why the slaves are not being created to run jobs?
On the Jenkins build page, I'm getting:
(pending—Waiting for next available executor)
And in the Jenkins logs, I'm getting the following error:
INFO: Provisioning Jenkins Slave on Mesos with 1 executors. Remaining excess workload: 0 executors)
Jun 19, 2015 4:02:55 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning MesosCloud from MesosCloud with 1 executors. Remaining excess workload: 0
Jun 19, 2015 4:02:55 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher <init>
INFO: Constructing MesosComputerLauncher
Jun 19, 2015 4:02:55 PM org.jenkinsci.plugins.mesos.MesosSlave <init>
INFO: Constructing Mesos slave mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258 from cloud
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher launch
INFO: Launching slave computer mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.MesosComputerLauncher launch
INFO: Sending a request to start jenkins slave mesos-jenkins-1f8691df-9918-4175-87b3-bcc3de80b258
Jun 19, 2015 4:03:05 PM org.jenkinsci.plugins.mesos.JenkinsScheduler requestJenkinsSlave
INFO: Enqueuing jenkins slave request
Jun 19, 2015 4:03:05 PM hudson.slaves.NodeProvisioner update
INFO: MesosCloud provisioning successfully completed. We have now 2 computer(s)
java.lang.NullPointerException
at org.jenkinsci.plugins.mesos.JenkinsScheduler.matches(JenkinsScheduler.java:306)
at org.jenkinsci.plugins.mesos.JenkinsScheduler.resourceOffers(JenkinsScheduler.java:252)
Jun 19, 2015 4:03:06 PM org.jenkinsci.plugins.mesos.JenkinsScheduler$1 run
SEVERE: The Mesos driver was aborted! Status code: 3
Edit: I think I'm getting the error because I haven't defined any container port mappings.
Can anyone tell me how to do so?
Update: Actually, there were many problems with version 0.7 of the Mesos plugin, so I simply downgraded to version 0.6.
For port mappings on marathon have a look here.
Hope this helps!