A Jenkins agent is connected to the master via JNLP remoting.
Initially the connection holds for about 20 minutes; the ping is then attempted three times and fails each time, after which the agent reconnects successfully. Each such cycle incurs roughly 30 minutes of downtime.
Aug 30, 2021 7:12:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Aug 30, 2021 7:12:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkinsmain.cloud:9000
Aug 30, 2021 7:12:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Aug 30, 2021 7:12:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: xx:d9:de:d9:xx.xx.xx.xx:b3
Aug 30, 2021 7:12:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Aug 30, 2021 7:31:32 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkinsmain.cloud/xx.xx.101.4:9000.
java.util.concurrent.TimeoutException: Ping started at 1630351652299 hasn't completed by 1630351892300
at hudson.remoting.PingThread.ping(PingThread.java:133)
at hudson.remoting.PingThread.run(PingThread.java:89)
Aug 30, 2021 7:36:32 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkinsmain.cloud/xx.xx.101.4:9000.
java.util.concurrent.TimeoutException: Ping started at 1630351952301 hasn't completed by 1630352192302
at hudson.remoting.PingThread.ping(PingThread.java:133)
at hudson.remoting.PingThread.run(PingThread.java:89)
Aug 30, 2021 7:41:32 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkinsmain.cloud/xx.xx.101.4:9000.
java.util.concurrent.TimeoutException: Ping started at 1630352252300 hasn't completed by 1630352492301
at hudson.remoting.PingThread.ping(PingThread.java:133)
at hudson.remoting.PingThread.run(PingThread.java:89)
Aug 30, 2021 7:43:01 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Performing onReconnect operation.
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: onReconnect operation failed.
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkinsmain.cloud/]
Aug 30, 2021 7:43:11 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
Agent address: jenkinsmain.cloud
Agent port: 9000
Identity: xx:d9:de:d9:xx.xx.xx.xx:b3
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkinsmain.cloud:9000
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: xx:d9:de:d9:xx.xx.xx.xx:b3
Aug 30, 2021 7:43:11 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
The scenario consists of 2 VMs, each one running Docker with Hazelcast members as containers.
Following this guide https://hazelcast.com/blog/configuring-hazelcast-in-non-orchestrated-docker-environments/ I can get Scenario 3 (public IP address, port mapping, and TCP discovery) working with one member per node.
But if I add a second member to one of the nodes, it either takes the place of the other member in the cluster or logs connection problems, so I am not able to make the cluster work with more than one member per node.
The configuration in both nodes is:
hazelcast:
  network:
    join:
      multicast:
        enabled: false
      tcp-ip:
        enabled: true
        member-list:
          - 10.132.0.2:5701
          - 10.128.0.3:5701
          - 10.128.0.3:5702
The Container in node 10.132.0.2 is run with:
docker run -v `pwd`:/mnt --rm --name member1 -e "JAVA_OPTS=-Dhazelcast.local.publicAddress=10.132.0.2 -Dhazelcast.config=/mnt/hazelcast.yml" -p 5701:5701 hazelcast/hazelcast:4.0.1
########################################
# JAVA_OPTS=-Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dhazelcast.local.publicAddress=10.132.0.2 -Dhazelcast.config=/mnt/hazelcast.yml
# CLASSPATH=/opt/hazelcast/*:/opt/hazelcast/lib/*
# starting now....
########################################
+ exec java -server -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dhazelcast.local.publicAddress=10.132.0.2 -Dhazelcast.config=/mnt/hazelcast.yml com.hazelcast.core.server.HazelcastMemberStarter
Sep 29, 2020 6:35:23 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Loading configuration '/mnt/hazelcast.yml' from System property 'hazelcast.config'
Sep 29, 2020 6:35:23 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Using configuration file at /mnt/hazelcast.yml
Sep 29, 2020 6:35:24 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [10.128.0.3, 10.132.0.2]
Sep 29, 2020 6:35:24 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Prefer IPv4 stack is true, prefer IPv6 addresses is false
Sep 29, 2020 6:35:24 AM com.hazelcast.instance.AddressPicker
WARNING: [LOCAL] [dev] [4.0.1] Could not find a matching address to start with! Picking one of non-loopback addresses.
Sep 29, 2020 6:35:24 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Picked [172.17.0.2]:5701, using socket ServerSocket[addr=/0.0.0.0,localport=5701], bind any local is true
Sep 29, 2020 6:35:24 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Using public address: [10.132.0.2]:5701
Sep 29, 2020 6:35:24 AM com.hazelcast.system
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Hazelcast 4.0.1 (20200409 - e086b9c) starting at [10.132.0.2]:5701
Sep 29, 2020 6:35:24 AM com.hazelcast.system
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Copyright (c) 2008-2020, Hazelcast, Inc. All Rights Reserved.
Sep 29, 2020 6:35:24 AM com.hazelcast.spi.impl.operationservice.impl.BackpressureRegulator
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Backpressure is disabled
Sep 29, 2020 6:35:25 AM com.hazelcast.instance.impl.Node
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Creating TcpIpJoiner
Sep 29, 2020 6:35:25 AM com.hazelcast.cp.CPSubsystem
WARNING: [10.132.0.2]:5701 [dev] [4.0.1] CP Subsystem is not enabled. CP data structures will operate in UNSAFE mode! Please note that UNSAFE mode will not provide strong consistency guarantees.
Sep 29, 2020 6:35:26 AM com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Starting 2 partition threads and 3 generic threads (1 dedicated for priority tasks)
Sep 29, 2020 6:35:26 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
Sep 29, 2020 6:35:26 AM com.hazelcast.core.LifecycleService
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.132.0.2]:5701 is STARTING
Sep 29, 2020 6:35:26 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:35:26 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Connecting to /10.128.0.3:5702, timeout: 10000, bind-any: true
Sep 29, 2020 6:35:26 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Connecting to /10.132.0.2:5703, timeout: 10000, bind-any: true
Sep 29, 2020 6:35:26 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Connecting to /10.132.0.2:5702, timeout: 10000, bind-any: true
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Could not connect to: /10.128.0.3:5702. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Could not connect to: /10.132.0.2:5702. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.cluster.impl.TcpIpJoiner
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.132.0.2]:5702 is added to the blacklist.
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.132.0.2]:5701 [dev] [4.0.1] Could not connect to: /10.132.0.2:5703. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.cluster.impl.TcpIpJoiner
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.128.0.3]:5702 is added to the blacklist.
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.cluster.impl.TcpIpJoiner
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.128.0.3]:5701 is added to the blacklist.
Sep 29, 2020 6:35:36 AM com.hazelcast.internal.cluster.impl.TcpIpJoiner
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.132.0.2]:5703 is added to the blacklist.
Sep 29, 2020 6:35:37 AM com.hazelcast.internal.cluster.ClusterService
INFO: [10.132.0.2]:5701 [dev] [4.0.1]
Members {size:1, ver:1} [
Member [10.132.0.2]:5701 - 69284e57-ce61-405c-87d3-1e9ea46b2bed this
]
Sep 29, 2020 6:35:37 AM com.hazelcast.core.LifecycleService
INFO: [10.132.0.2]:5701 [dev] [4.0.1] [10.132.0.2]:5701 is STARTED
The containers in node 10.128.0.3 are run with:
docker run -v `pwd`:/mnt --rm --name member2 -e "JAVA_OPTS=-Dhazelcast.local.publicAddress=10.128.0.3:5701 -Dhazelcast.config=/mnt/hazelcast.yml" -p 5701:5701 hazelcast/hazelcast:4.0.1
########################################
# JAVA_OPTS=-Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dhazelcast.local.publicAddress=10.128.0.3:5701 -Dhazelcast.config=/mnt/hazelcast.yml
# CLASSPATH=/opt/hazelcast/*:/opt/hazelcast/lib/*
# starting now....
########################################
Sep 29, 2020 6:36:54 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Loading configuration '/mnt/hazelcast.yml' from System property 'hazelcast.config'
Sep 29, 2020 6:36:54 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Using configuration file at /mnt/hazelcast.yml
Sep 29, 2020 6:36:55 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [10.128.0.3, 10.132.0.2]
Sep 29, 2020 6:36:55 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Prefer IPv4 stack is true, prefer IPv6 addresses is false
Sep 29, 2020 6:36:55 AM com.hazelcast.instance.AddressPicker
WARNING: [LOCAL] [dev] [4.0.1] Could not find a matching address to start with! Picking one of non-loopback addresses.
Sep 29, 2020 6:36:55 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Picked [172.17.0.2]:5701, using socket ServerSocket[addr=/0.0.0.0,localport=5701], bind any local is true
Sep 29, 2020 6:36:55 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Using public address: [10.128.0.3]:5701
Sep 29, 2020 6:36:55 AM com.hazelcast.system
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Hazelcast 4.0.1 (20200409 - e086b9c) starting at [10.128.0.3]:5701
Sep 29, 2020 6:36:55 AM com.hazelcast.system
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Copyright (c) 2008-2020, Hazelcast, Inc. All Rights Reserved.
Sep 29, 2020 6:36:56 AM com.hazelcast.spi.impl.operationservice.impl.BackpressureRegulator
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Backpressure is disabled
Sep 29, 2020 6:36:56 AM com.hazelcast.instance.impl.Node
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Creating TcpIpJoiner
Sep 29, 2020 6:36:56 AM com.hazelcast.cp.CPSubsystem
WARNING: [10.128.0.3]:5701 [dev] [4.0.1] CP Subsystem is not enabled. CP data structures will operate in UNSAFE mode! Please note that UNSAFE mode will not provide strong consistency guarantees.
Sep 29, 2020 6:36:58 AM com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Starting 2 partition threads and 3 generic threads (1 dedicated for priority tasks)
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
Sep 29, 2020 6:36:58 AM com.hazelcast.core.LifecycleService
INFO: [10.128.0.3]:5701 [dev] [4.0.1] [10.128.0.3]:5701 is STARTING
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Connecting to /10.128.0.3:5702, timeout: 10000, bind-any: true
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Connecting to /10.132.0.2:5703, timeout: 10000, bind-any: true
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Connecting to /10.132.0.2:5702, timeout: 10000, bind-any: true
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Connecting to /10.132.0.2:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:36:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnection
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Initialized new cluster connection between /172.17.0.2:56429 and /10.132.0.2:5701
Sep 29, 2020 6:37:05 AM com.hazelcast.internal.cluster.ClusterService
INFO: [10.128.0.3]:5701 [dev] [4.0.1]
Members {size:2, ver:2} [
Member [10.132.0.2]:5701 - 69284e57-ce61-405c-87d3-1e9ea46b2bed
Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e this
]
Sep 29, 2020 6:37:06 AM com.hazelcast.core.LifecycleService
INFO: [10.128.0.3]:5701 [dev] [4.0.1] [10.128.0.3]:5701 is STARTED
Sep 29, 2020 6:37:08 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Could not connect to: /10.132.0.2:5702. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:37:08 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Could not connect to: /10.128.0.3:5702. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:37:08 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5701 [dev] [4.0.1] Could not connect to: /10.132.0.2:5703. Reason: SocketTimeoutException[null]
So far everything is ok, but when I start member 3:
docker run -v `pwd`:/mnt --rm --name member3 -e "JAVA_OPTS=-Dhazelcast.local.publicAddress=10.128.0.3:5702 -Dhazelcast.config=/mnt/hazelcast.yml" -p 5702:5701 hazelcast/hazelcast:4.0.1
########################################
# JAVA_OPTS=-Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dhazelcast.local.publicAddress=10.128.0.3:5702 -Dhazelcast.config=/mnt/hazelcast.yml
# CLASSPATH=/opt/hazelcast/*:/opt/hazelcast/lib/*
# starting now....
########################################
+ exec java -server -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED -Dhazelcast.local.publicAddress=10.128.0.3:5702 -Dhazelcast.config=/mnt/hazelcast.yml com.hazelcast.core.server.HazelcastMemberStarter
Sep 29, 2020 6:38:26 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Loading configuration '/mnt/hazelcast.yml' from System property 'hazelcast.config'
Sep 29, 2020 6:38:26 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Using configuration file at /mnt/hazelcast.yml
Sep 29, 2020 6:38:26 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [10.128.0.3, 10.132.0.2]
Sep 29, 2020 6:38:26 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Prefer IPv4 stack is true, prefer IPv6 addresses is false
Sep 29, 2020 6:38:26 AM com.hazelcast.instance.AddressPicker
WARNING: [LOCAL] [dev] [4.0.1] Could not find a matching address to start with! Picking one of non-loopback addresses.
Sep 29, 2020 6:38:26 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Picked [172.17.0.3]:5701, using socket ServerSocket[addr=/0.0.0.0,localport=5701], bind any local is true
Sep 29, 2020 6:38:26 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.1] Using public address: [10.128.0.3]:5702
Sep 29, 2020 6:38:26 AM com.hazelcast.system
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Hazelcast 4.0.1 (20200409 - e086b9c) starting at [10.128.0.3]:5702
Sep 29, 2020 6:38:26 AM com.hazelcast.system
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Copyright (c) 2008-2020, Hazelcast, Inc. All Rights Reserved.
Sep 29, 2020 6:38:27 AM com.hazelcast.spi.impl.operationservice.impl.BackpressureRegulator
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Backpressure is disabled
Sep 29, 2020 6:38:27 AM com.hazelcast.instance.impl.Node
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Creating TcpIpJoiner
Sep 29, 2020 6:38:27 AM com.hazelcast.cp.CPSubsystem
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] CP Subsystem is not enabled. CP data structures will operate in UNSAFE mode! Please note that UNSAFE mode will not provide strong consistency guarantees.
Sep 29, 2020 6:38:27 AM com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Starting 2 partition threads and 3 generic threads (1 dedicated for priority tasks)
Sep 29, 2020 6:38:27 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
Sep 29, 2020 6:38:27 AM com.hazelcast.core.LifecycleService
INFO: [10.128.0.3]:5702 [dev] [4.0.1] [10.128.0.3]:5702 is STARTING
Sep 29, 2020 6:38:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.132.0.2:5703, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.132.0.2:5702, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.132.0.2:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnection
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Initialized new cluster connection between /172.17.0.3:52951 and /10.132.0.2:5701
Sep 29, 2020 6:38:35 AM com.hazelcast.internal.cluster.ClusterService
INFO: [10.128.0.3]:5702 [dev] [4.0.1]
Members {size:3, ver:3} [
Member [10.132.0.2]:5701 - 69284e57-ce61-405c-87d3-1e9ea46b2bed
Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Member [10.128.0.3]:5702 - 0dd31ea2-db2e-4e43-941a-98592e222817 this
]
Sep 29, 2020 6:38:38 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.132.0.2:5703. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:38:38 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:38:38 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.132.0.2:5702. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:38:38 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:48 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:38:48 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:38:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:39:08 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:39:18 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:39:18 AM com.hazelcast.internal.nio.tcp.TcpIpConnectionErrorHandler
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] Removing connection to endpoint [10.128.0.3]:5701 Cause => java.net.SocketTimeoutException {null}, Error-Count: 5
Sep 29, 2020 6:39:18 AM com.hazelcast.internal.cluster.impl.MembershipManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e is suspected to be dead for reason: No connection
Sep 29, 2020 6:39:18 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:39:27 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:39:28 AM com.hazelcast.internal.nio.tcp.TcpIpConnectionErrorHandler
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] Removing connection to endpoint [10.128.0.3]:5701 Cause => java.net.SocketTimeoutException {null}, Error-Count: 6
Sep 29, 2020 6:39:32 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:37 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:38 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:39:42 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:47 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:48 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:39:48 AM com.hazelcast.internal.nio.tcp.TcpIpConnectionErrorHandler
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] Removing connection to endpoint [10.128.0.3]:5701 Cause => java.net.SocketTimeoutException {null}, Error-Count: 7
Sep 29, 2020 6:39:48 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Connecting to /10.128.0.3:5701, timeout: 10000, bind-any: true
Sep 29, 2020 6:39:52 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:57 AM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] This node does not have a connection to Member [10.128.0.3]:5701 - aa22f242-cc82-44ff-9dc1-06678d14420e
Sep 29, 2020 6:39:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.128.0.3]:5702 [dev] [4.0.1] Could not connect to: /10.128.0.3:5701. Reason: SocketTimeoutException[null]
Sep 29, 2020 6:39:58 AM com.hazelcast.internal.nio.tcp.TcpIpConnectionErrorHandler
WARNING: [10.128.0.3]:5702 [dev] [4.0.1] Removing connection to endpoint [10.128.0.3]:5701 Cause => java.net.SocketTimeoutException {null}, Error-Count: 8
It looks like there is a communication problem between the members on the same node.
In another test, member3 replaced member2 in the cluster and marked the connection attempts from node2 as suspect.
The VMs are freshly created on GCP and are on the same network; I used this image:
Google, Container-Optimized OS, 85-13310.1041.9 stable, Kernel: ChromiumOS-5.4.49 Kubernetes: 1.18.9 Docker: 19.03.9 Family: cos-stable, supports Shielded VM features, supports Confidential VM features on N2D
The problem is that member3 believes it is listening on port 5701 instead of 5702.
The solution is to specify in the configuration the port on which the member will listen on the Docker host.
The configuration for member3 is:
hazelcast:
  network:
    port:
      port: 5702
    join:
      multicast:
        enabled: false
      tcp-ip:
        enabled: true
        member-list:
          - 10.132.0.2:5701
          - 10.128.0.3:5701
          - 10.128.0.3:5702
With this configuration the cluster works and every member can communicate with the others.
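For completeness, here is a hedged sketch of how the member3 container launch might then look. It assumes the port: 5702 setting above makes the member bind 5702 inside the container, so the publish flag should map host port 5702 to container port 5702 (instead of the original -p 5702:5701); adjust if your setup differs:
# Hypothetical adjusted launch for member3, matching the config above:
docker run -v `pwd`:/mnt --rm --name member3 \
  -e "JAVA_OPTS=-Dhazelcast.local.publicAddress=10.128.0.3:5702 -Dhazelcast.config=/mnt/hazelcast.yml" \
  -p 5702:5702 hazelcast/hazelcast:4.0.1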
Your scenario should work, so I guess it is some environmental issue. Could you try the following two things?
Explicitly set the port on member1 as well.
Use :5701 also in the member1 configuration - both in the member-list (hazelcast.yml) and in the hazelcast.local.publicAddress property (command line).
This step probably changes nothing for your problem, but it should at least avoid unrelated warnings in the logs.
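A hedged sketch of that adjusted member1 launch (same image and mount as in the question; the only command-line change is the explicit :5701 in the publicAddress value, alongside an explicit port: 5701 entry under network in member1's hazelcast.yml):
# Only difference from the question: the public address now carries an explicit port.
docker run -v `pwd`:/mnt --rm --name member1 \
  -e "JAVA_OPTS=-Dhazelcast.local.publicAddress=10.132.0.2:5701 -Dhazelcast.config=/mnt/hazelcast.yml" \
  -p 5701:5701 hazelcast/hazelcast:4.0.1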
Check whether member2 is reachable from member3.
After you start your members, open an interactive shell in member3 and try to send the protocol challenge (HZC) to member2's socket address. If you see the protocol response (HZC) in the terminal, the communication between the containers works properly. If you don't see the response, check your Docker and firewall configuration to find what is causing the issue.
docker exec -it member3 /bin/bash
# Following command runs in container.
# The first HZC line is the one you type (followed by Enter).
# The second is the reply from member2.
bash-5.0# nc 10.128.0.3 5701
HZC
HZC
If you see the correct protocol response in the terminal, we will need to investigate the Hazelcast behaviour further; I was not able to reproduce the problem in my environment.
If you want to take this a little further, Hazelcast supports lifecycle calls into containers and from orchestration managers (i.e. Kubernetes and OpenShift); for this you use the Hazelcast network discovery module for that particular platform. This frees you from hard-coding addresses and ports and allows those assignments to be made dynamically at runtime.
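As a rough illustration only (not part of the question's setup), a hazelcast.yml that relies on the Kubernetes discovery plugin instead of a hard-coded member-list might look like the sketch below; the service-dns value is a placeholder for whatever headless service exposes your members:
cat > hazelcast.yml <<'EOF'
hazelcast:
  network:
    join:
      multicast:
        enabled: false
      kubernetes:
        enabled: true
        # placeholder: DNS name of the headless service fronting the members
        service-dns: hazelcast.default.svc.cluster.local
EOF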
I have been facing this issue for the past 8 months (since Dec 2019) and decided to post it here - I am connecting Jenkins inbound TCP (JNLP) agents across Azure VNets and subscriptions.
Agents in the same VNet/subscription as the master connect without any issue and never hit ping timeouts. However, Jenkins agents residing in other VNets and subscriptions very often get disconnected due to ping timeouts. These agents run in AKS clusters as pods managed by Deployments, built on the openjdk:8-jdk-alpine image. We use port 50000 as a static port for all JNLP connections.
The logs for the agents in the other VNets show:
Jul 25, 2020 12:03:33 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:03:33 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/workspace/remoting
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jmeter
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Jul 25, 2020 12:03:34 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.33
Jul 25, 2020 12:03:34 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkins.internaldomain.com/]
Jul 25, 2020 12:03:34 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
Agent address: jenkins.internaldomain.com
Agent port: 50000
Identity: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins.internaldomain.com:50000
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jul 25, 2020 12:03:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:03:35 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
Jul 25, 2020 12:27:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595679817121 hasn't completed by 1595680057121
at hudson.remoting.PingThread.ping(PingThread.java:134)
at hudson.remoting.PingThread.run(PingThread.java:90)
Jul 25, 2020 12:32:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595680117120 hasn't completed by 1595680357121
at hudson.remoting.PingThread.ping(PingThread.java:134)
at hudson.remoting.PingThread.run(PingThread.java:90)
Jul 25, 2020 12:37:37 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel JNLP4-connect connection to jenkins.internaldomain.com/10.177.xxx.xxx:50000.
java.util.concurrent.TimeoutException: Ping started at 1595680417120 hasn't completed by 1595680657122
at hudson.remoting.PingThread.ping(PingThread.java:134)
at hudson.remoting.PingThread.run(PingThread.java:90)
Jul 25, 2020 12:39:20 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Jul 25, 2020 12:39:30 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Performing onReconnect operation.
Jul 25, 2020 12:39:30 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$FindEffectiveRestarters$1 onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter#3ac9ecc9
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/workspace/remoting
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jenkins-slave-jmeter
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Jul 25, 2020 12:39:32 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.33
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/workspace/remoting as a remoting work directory
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [https://jenkins.internaldomain.com/]
Jul 25, 2020 12:39:32 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
Agent address: jenkins.internaldomain.com
Agent port: 50000
Identity: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins.internaldomain.com:50000
Jul 25, 2020 12:39:32 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Trying protocol: JNLP4-connect
Jul 25, 2020 12:39:33 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Remote identity confirmed: 70:35:e7:e9:31:ed:a3:1f:xx:xx:xx:xx:xx:xx:xx:xx
Jul 25, 2020 12:39:34 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connected
ICMP ping (echo) is disabled per org policy, but a blog post states that the Remoting ping is different from ICMP ping, so ICMP being disabled shouldn't be one of the worries (that's my understanding).
We tried disabling the ping on both the Jenkins master and the agents, but that didn't help.
Initially we ran the master on port 80 and it was blocked, so we suspected the issue was the non-secure port. We then configured the master with SSL on port 443, but the issue was still present.
We found that the Azure idle timeout is 4 minutes while the Jenkins default ping interval is 5 minutes (300 seconds), so I tried setting these configuration properties:
hudson.slaves.ChannelPinger.pingIntervalSeconds :
Default: 300,
Custom: 108
Description: Frequency of pings between the master and agents, in seconds
hudson.slaves.ChannelPinger.pingTimeoutSeconds :
Default: 240,
Custom: No Changes
Description: Timeout for each ping between the master and agents, in seconds
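For reference, a hedged sketch of how such properties can be passed to the master JVM; the exact startup mechanism (service unit, JENKINS_JAVA_OPTIONS, container entrypoint) depends on the installation, and the jenkins.war path below is a placeholder:
# Pass the remoting ping tuning as JVM system properties when starting the master:
java -Dhudson.slaves.ChannelPinger.pingIntervalSeconds=108 \
     -Dhudson.slaves.ChannelPinger.pingTimeoutSeconds=240 \
     -jar /usr/share/jenkins/jenkins.war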
Still, no luck.
Has anyone faced an issue like this before? All I can find are similar issues reported for Windows agents.
(The Jenkins agent names, identities, and the master DNS and IP have been changed to generic values.)
Since the Jenkins master and slave can keep a connection within the same VNet but not across VNets, it is likely that the specific port is blocked across the VNets. You will need to allow the ping port's network traffic across the VNets using Network Security Groups (NSG); you can read about it from the link.
Once you have enabled that, create a VM in the Jenkins master's VNet and try connecting to the Jenkins slave's ping port (using telnet or a similar tool, as sketched below). If you can connect, the NSG is not blocking the traffic; otherwise the NSG is blocking it.
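A hedged sketch of such a reachability test, shown here against the JNLP host and port from the question; substitute whichever endpoint you are actually checking:
# Verbose zero-I/O check of the TCP port:
nc -vz jenkins.internaldomain.com 50000
# or, if nc is not available:
telnet jenkins.internaldomain.com 50000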
On Windows 10 (1903), I'm running Jenkins in a Linux container. I'm trying to use the Docker Plugin to spin up Docker agents, but it keeps failing with a "filesystem operations against a running Hyper-V container are not supported" error. I believe this error means that files are being copied into a running container. Does anyone know what I'm doing wrong or missing?
For testing I trimmed my Docker Agent image to just:
FROM centos:latest
The Jenkins log:
Finished DockerContainerWatchdog Asynchronous Periodic Work. 18 ms
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerCloud provision
Asked to provision 1 slave(s) for: testslave
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerCloud canAddProvisionedSlave
Provisioning 'jenkins-slave' on 'LocalDockerHost'; Total containers: 0 (of 100)
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerCloud provision
Will provision 'jenkins-slave', for label: 'testslave', in cloud: 'LocalDockerHost'
Aug 19, 2019 4:04:07 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Image of jenkins-slave from LocalDockerHost with 1 executors. Remaining excess workload: 0
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerTemplate doProvisionNode
Trying to run container for jenkins-slave
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerTemplate doProvisionNode
Trying to run container for node docker-00001yrz9aglv from image: jenkins-slave
Aug 19, 2019 4:04:07 PM INFO com.nirima.jenkins.plugins.docker.DockerTemplate doProvisionNode
Started container ID 39db6b07f16e98ea37844fcc54e18a8d95a1f29a8cc0386bcb5a2c672d912daa for node docker-00001yrz9aglv from image: jenkins-slave
Aug 19, 2019 4:04:08 PM SEVERE com.github.dockerjava.core.async.ResultCallbackTemplate onError
Error during callback
com.github.dockerjava.api.exception.InternalServerErrorException: {"message":"filesystem operations against a running Hyper-V container are not supported"}
at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:109)
at com.github.dockerjava.netty.handler.HttpResponseHandler.channelRead0(HttpResponseHandler.java:33)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
The odds are you need to turn on virtualization in your local machine's BIOS settings.
This will require you to reboot your machine, find the setting, and turn it on.
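Separately from the BIOS setting, it may be worth confirming which isolation mode the provisioned container actually uses, since the quoted error is specific to Hyper-V isolated Windows containers; a hedged check against the container ID from the log above:
# Prints "hyperv" or "process" on a Windows daemon (empty/"default" on a Linux daemon):
docker inspect --format '{{.HostConfig.Isolation}}' 39db6b07f16e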
While running docker-compose up for the git project
Linked-Data-Theater
I am getting the error standard_init_linux.go:195: exec user process caused "no such file or directory".
Below is the console output:
ifour.techno#ifour-137 MINGW64 /d/test/Docker/LinkData_Theater_Repo/Linked-Data-Theatre (master)
$ docker-compose up
Starting virtuoso ...
Starting ldt ... done
Attaching to virtuoso, ldt
virtuoso | standard_init_linux.go:195: exec user process caused "no such file or directory"
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Server version: Apache Tomcat/7.0.85
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Server built: Feb 7 2018 18:52:33 UTC
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Server number: 7.0.85.0
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: OS Name: Linux
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: OS Version: 4.4.111-boot2docker
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Architecture: amd64
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Java Home: /usr/lib/jvm/java-8-openjdk-amd64/jre
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: JVM Version: 1.8.0_151-8u151-b12-1~deb9u1-b12
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: JVM Vendor: Oracle Corporation
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: CATALINA_BASE: /usr/local/tomcat
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: CATALINA_HOME: /usr/local/tomcat
virtuoso exited with code 1
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Djdk.tls.ephemeralDHKeySize=2048
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Dignore.endorsed.dirs=
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Dcatalina.base=/usr/local/tomcat
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Dcatalina.home=/usr/local/tomcat
ldt | Mar 01, 2018 7:35:47 AM org.apache.catalina.startup.VersionLoggerListener log
ldt | INFO: Command line argument: -Djava.io.tmpdir=/usr/local/tomcat/temp
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent
ldt | INFO: Loaded APR based Apache Tomcat Native library 1.2.16 using APR version 1.5.2.
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent
ldt | INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.core.AprLifecycleListener initializeSSL
ldt | INFO: OpenSSL successfully initialized (OpenSSL 1.1.0f 25 May 2017)
ldt | Mar 01, 2018 7:35:48 AM org.apache.coyote.AbstractProtocol init
ldt | INFO: Initializing ProtocolHandler ["http-apr-8080"]
ldt | Mar 01, 2018 7:35:48 AM org.apache.coyote.AbstractProtocol init
ldt | INFO: Initializing ProtocolHandler ["ajp-apr-8009"]
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.startup.Catalina load
ldt | INFO: Initialization processed in 890 ms
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.core.StandardService startInternal
ldt | INFO: Starting service Catalina
ldt | Mar 01, 2018 7:35:48 AM org.apache.catalina.core.StandardEngine startInternal
ldt | INFO: Starting Servlet Engine: Apache Tomcat/7.0.85
ldt | Mar 01, 2018 7:35:48 AM org.apache.coyote.AbstractProtocol start
ldt | INFO: Starting ProtocolHandler ["http-apr-8080"]
ldt | Mar 01, 2018 7:35:48 AM org.apache.coyote.AbstractProtocol start
ldt | INFO: Starting ProtocolHandler ["ajp-apr-8009"]
In this output you can see the error standard_init_linux.go:195: exec user process caused "no such file or directory".
Below is my docker-compose.yml file:
version: '2'
services:
  ldt:
    privileged: true
    container_name: ldt
    image: tomcat:7-jre8
    hostname: ldt.local
    ports:
      - "8080:8080"
    volumes:
      - ./webapps:/usr/local/tomcat/webapps
      - ./shared_import:/usr/local/tomcat/temp:z
    networks:
      - ldt
  virtuoso:
    privileged: true
    container_name: virtuoso
    build:
      context: virtuoso
    hostname: virtuoso.local
    ports:
      - "1111:1111"
      - "8890:8890"
    environment:
      DBA_PASSWORD: "dba"
      SPARQL_UPDATE: "true"
      VIRTUOSO_DBA_PWD: dba
    volumes:
      - ./virtuoso_data:/var/lib/virtuoso/db:z
      - ./shared_import:/var/lib/virtuoso/usr/local/tomcat/temp:z
    networks:
      - ldt
networks:
  ldt:
    external:
      name: ldt
What is missing? I am a beginner with Docker, so please help me with this problem and give me suggestions. I have googled it but couldn't find a solution anywhere.
I have also tried docker-compose up --build but I get the same error as above.
There might be a couple of reasons for the issue you are facing; I resolved the same issue by trying the following:
Ensure all the referenced folders exist - which looks fine in your docker-compose file.
When copying shell script files from Windows into a Unix Docker container, run dos2unix on them after the copy; the copy can introduce stray characters (Windows line endings), which can cause exactly this issue.
Please add your Dockerfile - it is possible the issue is in the Dockerfile and not in the docker-compose file.
This answer is applicable if you use Windows as host OS.
The yml file alone does not allow pinpointing the exact place of the problem.
The problem seems to be inside your virtuoso container. You need to look at the files that this container runs when it starts. Quite likely it has an .sh file with the wrong line-ending format: line endings in the file(s) were converted at some point from Unix format (LF) to Windows format (CR LF).
If such a conversion happens to an .sh file that runs inside a Docker container, Linux will not recognize the Windows end-of-line format and will not find the interpreter named on the shebang line (its path then ends with an invisible CR), which leads to an error like standard_init_linux.go:XXX: exec user process caused "no such file or directory".
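A hedged way to confirm that suspicion from Git Bash or a Linux shell; the path below is a placeholder for whichever script the virtuoso image runs at startup:
# "with CRLF line terminators" in the output confirms Windows line endings:
file virtuoso/your-startup-script.sh
# Alternatively, a CR shows up as ^M before the $ at the end of each line:
cat -A virtuoso/your-startup-script.sh | head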
Cause
The EOL conversion could have happened because of one of the following:
your local Git is configured to automatically convert line endings to Windows format (autocrlf = true) when you git pull sources
you saved one of the files in some editor in Windows, so it was saved with CR LF line endings
Solution
As a quick fix you can open the file in Notepad++, go to menu Edit/EOL Conversion/Unix, and then save the file
Another quick fix: use the dos2unix CLI tool to convert the files from the command line (a short sketch appears at the end of this answer)
Change git configuration by turning off automatic conversion to Windows EOL format:
git config --global core.autocrlf input
It will change the setting globally, for all repositories on your machine.
You can also set it per repository.
See https://help.github.com/articles/dealing-with-line-endings/ for more details.
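Picking up the two quick fixes above, a minimal sketch; the script path is a placeholder for the file that actually carries CR LF endings:
# Convert the offending script back to Unix (LF) line endings in place:
dos2unix virtuoso/your-startup-script.sh
# Per-repository variant of the Git setting (run inside the repository):
git config core.autocrlf input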