We have installed WSO2 Message Broker v2.2.0 on a single-core SUSE 64-bit OS and configured master-datasources.xml to point to an Oracle database. Startup of the MB takes several minutes, stalling especially at:
TID: [0] [MB] [2014-06-11 15:57:53,039] INFO {org.apache.cassandra.thrift.ThriftServer} - Listening for thrift clients... {org.apache.cassandra.thrift.ThriftServer}
TID: [0] [MB] [2014-06-11 15:57:53,219] INFO {org.apache.cassandra.service.GCInspector} - GC for MarkSweepCompact: 407 ms for 1 collections, 60663688 used; max is 1037959168 {org.apache.cassandra.service.GCInspector}
TID: [0] [MB] [2014-06-11 15:58:39,137] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,org.wso2.carbon.throttling.agent.ThrottlingAgent, {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent}
TID: [0] [MB] [2014-06-11 15:59:39,136] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,org.wso2.carbon.throttling.agent.ThrottlingAgent, {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent}
TID: [0] [MB] [2014-06-11 16:00:39,136] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,org.wso2.carbon.throttling.agent.ThrottlingAgent, {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent}
Is there a reason for this?
With WSO2 MB 2.2.0 we see these kinds of errors when the ZooKeeper/Cassandra server does not start properly. If clustering is enabled, the ZooKeeper server (internal or external) should be fully started before MB starts.
Further, if you are trying to run an MB cluster on a single machine and want to run two ZooKeeper nodes there, you will most probably end up with these OSGi-level errors. Please follow the blog post at http://indikasampath.blogspot.com/2014/05/wso2-message-broker-cluster-setup-in.html for configuration details on a WSO2 Message Broker cluster setup on a single machine.
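As a quick sanity check before starting a broker node in a clustered setup, you can probe whether ZooKeeper is actually up and answering. A minimal Kotlin sketch (my own illustration, not from the blog post; it assumes ZooKeeper's default client port 2181 and its standard "ruok" four-letter-word command, which replies "imok" when healthy):

import java.net.Socket

// Sends ZooKeeper's "ruok" command and checks for the "imok" reply.
// Host and port are assumptions (default client port 2181); adjust for your setup.
fun zookeeperIsHealthy(host: String = "localhost", port: Int = 2181): Boolean =
    try {
        Socket(host, port).use { socket ->
            socket.getOutputStream().run {
                write("ruok".toByteArray())
                flush()
            }
            socket.getInputStream().readBytes().toString(Charsets.UTF_8) == "imok"
        }
    } catch (e: Exception) {
        false // connection refused or timed out: ZooKeeper is not ready yet
    }

fun main() {
    if (zookeeperIsHealthy()) {
        println("ZooKeeper is up; safe to start the MB node.")
    } else {
        println("ZooKeeper is not reachable; start it before the broker to avoid these OSGi-level hangs.")
    }
}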
Our Environment:
Jenkins version - Jenkins 2.319.1
Jenkins Master image : jenkins/jenkins:2.319.1-lts-alpine
Jenkins worker image: jenkins/inbound-agent:4.11-1-alpine
Installed plugins:
Kubernetes - 1.30.6
Kubernetes Client API - 5.4.1
Kubernetes Credentials Plugin - 0.9.0
Java version on master: OpenJDK 11.0.13
Java version on agent/worker: OpenJDK 11.0.14
Hi team,
We are facing an issue in Jenkins where the agent disconnects (or goes offline) from the master while a job is still running on the agent/worker. We get the error below (highlighted) and have tried the steps listed further down, but the issue is still not fully resolved. Jenkins is deployed on EKS.
Error:
5334535:2022-11-02 14:07:54.573+0000 [id=140290] INFO hudson.slaves.NodeProvisioner#update: worker-7j4x4 provisioning successfully completed. We have now 2 computer(s)
5334695:2022-11-02 14:07:54.675+0000 [id=140291] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes done-jenkins/worker-7j4x4
5334828:2022-11-02 14:07:56.619+0000 [id=140291] INFO o.c.j.p.k.KubernetesLauncher#launch: Pod is running: kubernetes done-jenkins/worker-7j4x4
5334964-2022-11-02 14:07:58.650+0000 [id=140309] INFO h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #97 from /100.122.254.111:42648
5335123-2022-11-02 14:09:19.733+0000 [id=140536] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Started DockerContainerWatchdog Asynchronous Periodic Work
5335275-2022-11-02 14:09:19.733+0000 [id=140536] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
5335409-2022-11-02 14:09:19.734+0000 [id=140536] INFO c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 2608, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
5335965-2022-11-02 14:09:19.734+0000 [id=140536] INFO c.n.j.p.d.DockerContainerWatchdog#loadNodeMap: We currently have 1 nodes assigned to this Jenkins instance, which we will check
5336139-2022-11-02 14:09:19.734+0000 [id=140536] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog check has been completed
5336279-2022-11-02 14:09:19.734+0000 [id=140536] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Finished DockerContainerWatchdog Asynchronous Periodic Work. 1 ms
5336438-groovy.lang.MissingPropertyException: No such property: envVar for class: groovy.lang.Binding
5336532- at groovy.lang.Binding.getVariable(Binding.java:63)
5336585- at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:271)
–
5394279-2022-11-02 15:09:19.733+0000 [id=141899] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Started DockerContainerWatchdog Asynchronous Periodic Work
5394431-2022-11-02 15:09:19.734+0000 [id=141899] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
5394565-2022-11-02 15:09:19.734+0000 [id=141899] INFO c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 2620, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
5395121-2022-11-02 15:09:19.734+0000 [id=141899] INFO c.n.j.p.d.DockerContainerWatchdog#loadNodeMap: We currently have 3 nodes assigned to this Jenkins instance, which we will check
5395295-2022-11-02 15:09:19.734+0000 [id=141899] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog check has been completed
5395435-2022-11-02 15:09:19.734+0000 [id=141899] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Finished DockerContainerWatchdog Asynchronous Periodic Work. 1 ms
5395594-2022-11-02 15:11:59.502+0000 [id=140320] INFO hudson.slaves.ChannelPinger$1#onDead: Ping failed. Terminating the channel JNLP4-connect connection from ip-100-122-254-111.eu-central-1.compute.internal/100.122.254.111:42648.
5395817-java.util.concurrent.TimeoutException: Ping started at 1667401679501 hasn't completed by 1667401919502
5395920- at hudson.remoting.PingThread.ping(PingThread.java:134)
5395977- at hudson.remoting.PingThread.run(PingThread.java:90)
5396032:2022-11-02 15:11:59.503+0000 [id=141914] INFO j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting 5049 for worker-7j4x4 terminated: java.nio.channels.ClosedChannelException
5396231-2022-11-02 15:12:35.579+0000 [id=141933] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Started Periodic background build discarder
5396368-2022-11-02 15:12:36.257+0000 [id=141933] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Finished Periodic background build discarder. 678 ms
5396514-2022-11-02 15:14:15.582+0000 [id=141422] INFO hudson.slaves.ChannelPinger$1#onDead: Ping failed. Terminating the channel JNLP4-connect connection from ip-100-122-237-38.eu-central-1.compute.internal/100.122.237.38:55038.
5396735-java.util.concurrent.TimeoutException: Ping started at 1667401815582 hasn't completed by 1667402055582
5396838- at hudson.remoting.PingThread.ping(PingThread.java:134)
5396895- at hudson.remoting.PingThread.run(PingThread.java:90)
5396950-2022-11-02 15:14:15.584+0000 [id=141915] INFO j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting 5050 for worker-fjf1p terminated: java.nio.channels.ClosedChannelException
5397149-2022-11-02 15:14:19.733+0000 [id=141950] INFO hudson.model.AsyncPeriodicWork#lambda$doRun$1: Started DockerContainerWatchdog Asynchronous Periodic Work
5397301-2022-11-02 15:14:19.733+0000 [id=141950] INFO c.n.j.p.d.DockerContainerWatchdog#execute: Docker Container Watchdog has been triggered
5397435-2022-11-02 15:14:19.734+0000 [id=141950] INFO c.n.j.p.d.DockerContainerWatchdog$Statistics#writeStatisticsToLog: Watchdog Statistics: Number of overall executions: 2621, Executions with processing timeout: 0, Containers removed gracefully: 0, Containers removed with force: 0, Containers removal failed: 0, Nodes removed successfully: 0, Nodes removal failed: 0, Container removal average duration (gracefully): 0 ms, Container removal average duration (force): 0 ms, Average overall runtime of watchdog: 0 ms, Average runtime of container retrieval: 0 ms
Any suggestions or resolutions, please?
Things we have tried:
Increased idleMinutes to 180 from the default
Verified that resources are sufficient per the Grafana dashboard
Changed podRetention to onFailure from Never
Changed podRetention to Always from Never
Increased readTimeout
Increased connectTimeout
Increased slaveConnectTimeoutStr
Disabled the ping thread from the UI by unchecking the "Response Time" checkbox under preventive node monitoring
Increased activeDeadlineSeconds
Verified same java version on master and agent
Updated the Kubernetes and Kubernetes Client API plugins
The expectation is that the worker/agent should disconnect once the job has run successfully and terminate after the defined idleMinutes, but a few times it terminates while a job is still running on the agent.
I have a Ktor app. It works fine when I run it in development mode. I package it in a Docker image by copying over what the Gradle application plugin produces. That also works fine on my local machine (8 cores). But now the strange part: when I do exactly the same thing on a rented V-Server, also running Ubuntu 20.04 like my local system, Ktor is incredibly slow.
docker-compose logs server:
server | 2021-08-24 08:00:23.337 [main] INFO ktor.application - Autoreload is disabled because the development mode is off.
server | 2021-08-24 08:25:35.048 [main] INFO ktor.application - Autoreload is disabled because the development mode is off.
server | 2021-08-24 09:18:48.246 [main] INFO c.e.e.s.TemplateStore - Starting to parse Sentences
server | 2021-08-24 09:18:48.345 [main] INFO c.e.e.s.TemplateStore - Finished parsing sentences
server | 2021-08-24 09:18:48.346 [main] INFO ktor.application - Responding at http://0.0.0.0:8080
server | 2021-08-24 09:18:48.347 [main] INFO ktor.application - Application started in 3193.32 seconds.
Application started in 3193.32 seconds
The source code can be found at https://github.com/1-alex98/whatisthat . It has a docker-compose.yml defining the whole Docker setup that gets started.
Local system: 32 GB RAM + 8 cores. V-Server: 4 GB RAM + 2 cores (htop shows plenty of resources are free).
I am looking for ideas on what in the world could cause this behavior, or for ways to debug it.
Update:
The main thread seems to be reading a file forever:
"main" #1 prio=5 os_prio=0 cpu=652.14ms elapsed=173.92s tid=0x00007f01d4016000 nid=0xe runnable [0x00007f01dace6000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(java.base#11.0.12/Native Method)
at java.io.FileInputStream.read(java.base#11.0.12/FileInputStream.java:279)
at java.io.FilterInputStream.read(java.base#11.0.12/FilterInputStream.java:133)
at sun.security.provider.NativePRNG$RandomIO.readFully(java.base#11.0.12/NativePRNG.java:424)
at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(java.base#11.0.12/NativePRNG.java:526)
at sun.security.provider.NativePRNG$RandomIO.implNextBytes(java.base#11.0.12/NativePRNG.java:545)
- locked <0x00000000c7571158> (a java.lang.Object)
at sun.security.provider.NativePRNG$Blocking.engineNextBytes(java.base#11.0.12/NativePRNG.java:268)
at java.security.SecureRandom.nextBytes(java.base#11.0.12/SecureRandom.java:751)
at kotlin.random.AbstractPlatformRandom.nextBytes(PlatformRandom.kt:47)
at kotlin.random.Random.nextBytes(Random.kt:260)
at com.example.routes.websocket.WebsocketRoutingKt.<clinit>(WebsocketRouting.kt:40)
at com.example.plugins.RoutingKt$routing$1.invoke(Routing.kt:13)
at com.example.plugins.RoutingKt$routing$1.invoke(Routing.kt:11)
at io.ktor.routing.Routing$Feature.install(Routing.kt:106)
at io.ktor.routing.Routing$Feature.install(Routing.kt:88)
at io.ktor.application.ApplicationFeatureKt.install(ApplicationFeature.kt:68)
at io.ktor.routing.RoutingKt.routing(Routing.kt:129)
at com.example.plugins.RoutingKt.routing(Routing.kt:11)
at com.example.ApplicationKt$main$1.invoke(Application.kt:18)
at com.example.ApplicationKt$main$1.invoke(Application.kt:14)
at io.ktor.server.engine.internal.CallableUtilsKt.executeModuleFunction(CallableUtils.kt:50)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading$launchModuleByName$1.invoke(ApplicationEngineEnvironmentReloading.kt:317)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading$launchModuleByName$1.invoke(ApplicationEngineEnvironmentReloading.kt:316)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.avoidingDoubleStartupFor(ApplicationEngineEnvironmentReloading.kt:341)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.launchModuleByName(ApplicationEngineEnvironmentReloading.kt:316)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.access$launchModuleByName(ApplicationEngineEnvironmentReloading.kt:30)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading$instantiateAndConfigureApplication$1.invoke(ApplicationEngineEnvironmentReloading.kt:304)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading$instantiateAndConfigureApplication$1.invoke(ApplicationEngineEnvironmentReloading.kt:295)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.avoidingDoubleStartup(ApplicationEngineEnvironmentReloading.kt:323)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.instantiateAndConfigureApplication(ApplicationEngineEnvironmentReloading.kt:295)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.createApplication(ApplicationEngineEnvironmentReloading.kt:136)
at io.ktor.server.engine.ApplicationEngineEnvironmentReloading.start(ApplicationEngineEnvironmentReloading.kt:268)
at io.ktor.server.netty.NettyApplicationEngine.start(NettyApplicationEngine.kt:174)
at com.example.ApplicationKt.main(Application.kt:21)
at com.example.ApplicationKt.main(Application.kt)
It is a freshly rented server, but I guess something is wrong with it.
docker-compose being slow and my program not starting turned out to be due to insufficient input to /dev/urandom. Installing https://github.com/smuellerDD/jitterentropy-rngd resolved the problem.
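For anyone hitting the same symptom, a quick way to confirm that blocked entropy reads are the culprit is to time the first SecureRandom request on the affected machine. A small Kotlin sketch (my own check, not part of the original fix), using the blocking NativePRNG variant that matches the frames in the thread dump above:

import java.security.SecureRandom
import kotlin.system.measureTimeMillis

// Times the first request for random bytes. On an entropy-starved machine this can
// block for a very long time, consistent with the NativePRNG frames in the thread dump.
fun main() {
    val elapsed = measureTimeMillis {
        // "NativePRNGBlocking" reads from /dev/random; the default "NativePRNG" also
        // needs the kernel pool to be seeded. Both stall when entropy is scarce.
        val rng = SecureRandom.getInstance("NativePRNGBlocking")
        rng.nextBytes(ByteArray(32))
    }
    println("First SecureRandom.nextBytes() took $elapsed ms")
}

If this takes seconds or minutes on the V-Server but is instant locally, the entropy pool is the problem.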
I'm running IoT Edge on Kubernetes.
The K8S cluster is a local cluster setup largely using the "Kubernetes the hard way" method, with some modifications.
I did manage to get things working on one installation. However, I'm now getting this on another installation. The initial installation works fine, but after shutting down a machine to simulate a hardware failure, the pod gets recreated but starts to show this error again. This error happens even if the node that was shut down is not the one iotedged is running on.
Environment
3 Nodes running Ubuntu 20.04 LTS
Two networks on each node, one for the internet and one for an internal network. K8s is set up using the internal, static IP address
HAProxy/Keepalived for HA without a load balancer, running on a Virtual IP address
Multus CNI for attaching pods to additional networks
CoreDNS
Troubleshooting
Confirmed that CoreDNS seems to be functioning fine, and is able to resolve internal and external addresses
Remaining nodes are able to ping pods on other nodes
Deleting the iotedged pod and allowing K8s to recreate it works, but then edgeAgent and edgeHub have errors until I delete/recreate them as well
Re-ran the entire K8s installation. The initial installation works fine, but simulating machine failure continues to be problematic.
Kubernetes Versions:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:25:06Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
iotedged error:
<6>2021-07-09T22:00:02Z [INFO] - Starting Azure IoT Edge Security Daemon - Kubernetes mode
<6>2021-07-09T22:00:02Z [INFO] - Version - 1.1.3
<6>2021-07-09T22:00:02Z [INFO] - Using config file: /etc/iotedged/config.yaml
<6>2021-07-09T22:00:02Z [INFO] - Configuring /var/lib/iotedge as the home directory.
<6>2021-07-09T22:00:02Z [INFO] - Configuring certificates...
<6>2021-07-09T22:00:02Z [INFO] - Transparent gateway certificates not found, operating in quick start mode...
<6>2021-07-09T22:00:02Z [INFO] - Finished configuring provisioning environment variables and certificates.
<6>2021-07-09T22:00:02Z [INFO] - Initializing hsm...
<6>2021-07-09T22:00:02Z [INFO] - Finished initializing hsm.
<6>2021-07-09T22:00:02Z [INFO] - Provisioning edge device...
<6>2021-07-09T22:00:02Z [INFO] - Starting provisioning edge device via manual mode using a device connection string...
<6>2021-07-09T22:00:02Z [INFO] - Manually provisioning device "********" in hub "********.azure-devices.net"
<6>2021-07-09T22:00:02Z [INFO] - Finished provisioning edge device.
<6>2021-07-09T22:00:02Z [INFO] - Initializing the module runtime...
<6>2021-07-09T22:00:02Z [INFO] - Attempting to use config from /home/edgeletuser/.kube/config file.
<6>2021-07-09T22:00:02Z [INFO] - Using in-cluster config
<3>2021-07-09T22:00:34Z [ERR!] - The daemon could not start up successfully: Could not initialize module runtime
<3>2021-07-09T22:00:34Z [ERR!] - caused by: Could not initialize kubernetes module runtime
<3>2021-07-09T22:00:34Z [ERR!] - caused by: HTTP response error: SelfSubjectAccessReviewCreate
<3>2021-07-09T22:00:34Z [ERR!] - caused by: Hyper HTTP error
<3>2021-07-09T22:00:34Z [ERR!] - caused by: error trying to connect: Connection timed out (os error 110)
<6>2021-07-09T22:00:02Z [INFO] (/project/hsm-sys/azure-iot-hsm-c/src/hsm_log.c:log_init:41) Initialized logging
edgeHub Logs after recreating iotedged:
2021-08-18 19:05:40 Starting Edge Hub
2021-08-18 19:05:40.481 +00:00 Edge Hub Main()
<7> 2021-08-18 19:05:40.609 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClient] - Making a Http call to http://localhost:35001/ to CreateServerCertificateAsync
<7> 2021-08-18 19:05:40.912 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClient] - Error when getting an Http response from http://localhost:35001/ for CreateServerCertificateAsync
HTTP Response:
{"message":"Module not found"}
Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.GeneratedCode.IoTEdgedException`1[Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.GeneratedCode.ErrorResponse]: Not Found
at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.GeneratedCode.HttpWorkloadClient.CreateServerCertificateAsync(String api_version, String name, String genid, ServerCertificateRequest request, CancellationToken cancellationToken) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/generatedCode/HttpWorkloadClient.cs:line 624
at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter[T](Task`1 task, TimeSpan timeout) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/TaskEx.cs:line 126
at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClientVersioned.Execute[T](Func`1 func, String operation) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClientVersioned.cs:line 59
Unhandled exception. System.AggregateException: One or more errors occurred. (Error calling CreateServerCertificateAsync: Module not found)
---> Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadCommunicationException- Message:Error calling CreateServerCertificateAsync: Module not found, StatusCode:404, at: at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.HandleException(Exception ex, String operation) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 109
at Microsoft.Azure.Devices.Edge.Util.Edged.WorkloadClientVersioned.Execute[T](Func`1 func, String operation) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/WorkloadClientVersioned.cs:line 77
at Microsoft.Azure.Devices.Edge.Util.Edged.Version_2019_01_30.WorkloadClient.CreateServerCertificateAsync(String hostname, DateTime expiration) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/edged/version_2019_01_30/WorkloadClient.cs:line 35
at Microsoft.Azure.Devices.Edge.Util.CertificateHelper.GetServerCertificatesFromEdgelet(Uri workloadUri, String workloadApiVersion, String workloadClientApiVersion, String moduleId, String moduleGenerationId, String edgeHubHostname, DateTime expiration) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/CertificateHelper.cs:line 260
at Microsoft.Azure.Devices.Edge.Hub.Service.EdgeHubCertificates.LoadAsync(IConfigurationRoot configuration, ILogger logger) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/EdgeHubCertificates.cs:line 54
at Microsoft.Azure.Devices.Edge.Hub.Service.Program.MainAsync(IConfigurationRoot configuration) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/Program.cs:line 54
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at Microsoft.Azure.Devices.Edge.Hub.Service.Program.Main() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Service/Program.cs:line 33
Are you still blocked? What troubleshooting steps have you tried so far? Did you check Common issues and resolutions for Azure IoT Edge? Per the error messages "Transparent gateway certificates not found, operating in quick start mode" and "The daemon could not start up successfully: Could not initialize module runtime", it looks like the setup is not configured properly. Try restarting the server and check the transparent gateway setup. Please refer to the transparent gateway setup documentation and check whether you have missed anything.
Recently I've reinstalled my VPS and have a fresh install of Neo4j on it.
I'm using PuTTY to connect from my machine, tunneling port 7474 as I've done in the past. I'm new to Neo4j 3.2 and am getting this error when I try to connect to the server in the Neo4j Browser:
N/A: WebSocket connection failure. Due to security constraints in your
web browser, the reason for the failure is not available to this Neo4j
Driver.
After trying a lot of different suggestions from loosely related topics, I ended up allowing remote connections and discovered that when I access the server remotely, e.g. http://my_vps_ip:7474/browser/, I have no issues at all.
This is the output of neo4j status:
● neo4j.service - Neo4j Graph Database
Loaded: loaded (/lib/systemd/system/neo4j.service; disabled; vendor preset: enabled)
Active: active (running) since Fri 2017-05-12 04:47:11 CEST; 2h 1min ago
Main PID: 17040 (java)
Tasks: 38
Memory: 272.1M
CPU: 1min 6.731s
CGroup: /system.slice/neo4j.service
└─17040 /usr/bin/java -cp /var/lib/neo4j/plugins:/etc/neo4j:/usr/share/neo4j/lib/*:/var/lib/neo4j/plugins/* -server -XX:
May 12 04:47:11 vps276997 neo4j[17040]: import: /var/lib/neo4j/import
May 12 04:47:11 vps276997 neo4j[17040]: data: /var/lib/neo4j/data
May 12 04:47:11 vps276997 neo4j[17040]: certificates: /var/lib/neo4j/certificates
May 12 04:47:11 vps276997 neo4j[17040]: run: /var/run/neo4j
May 12 04:47:11 vps276997 neo4j[17040]: Starting Neo4j.
May 12 04:47:12 vps276997 neo4j[17040]: 2017-05-12 02:47:12.417+0000 INFO ======== Neo4j 3.2.0 ========
May 12 04:47:12 vps276997 neo4j[17040]: 2017-05-12 02:47:12.844+0000 INFO Starting...
May 12 04:47:13 vps276997 neo4j[17040]: 2017-05-12 02:47:13.950+0000 INFO Bolt enabled on 0.0.0.0:7687.
May 12 04:47:18 vps276997 neo4j[17040]: 2017-05-12 02:47:18.196+0000 INFO Started.
May 12 04:47:20 vps276997 neo4j[17040]: 2017-05-12 02:47:20.274+0000 INFO Remote interface available at http://localhost:7474/
Any ideas why this might be happening?
Please ensure that public access to port 7687 is enabled in your 'neo4j.conf' file. In the latest version, it should be these two lines in 'neo4j.conf':
dbms.connector.bolt.enabled=true
dbms.connector.bolt.listen_address=0.0.0.0:7687
That is because Neo4j's Bolt protocol uses port 7687.
Also ensure that port 7687 on your instance is exposed to the public; if you are using AWS EC2, choose TCP as the protocol, because Bolt runs over TCP.
If you are using Docker/Kubernetes, also ensure that you expose all ports (7474, 7473, and 7687 by default) in your containers or Kubernetes service.
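To double-check from a remote machine that the Bolt listener on 7687 is actually reachable (and not just the HTTP interface on 7474), you can connect with the Neo4j Java driver. A minimal Kotlin sketch, assuming the 1.x driver (org.neo4j.driver:neo4j-java-driver) is on the classpath and using placeholder host/credentials:

import org.neo4j.driver.v1.AuthTokens
import org.neo4j.driver.v1.GraphDatabase

// Connects over Bolt (port 7687) rather than HTTP (7474). The host, user and
// password below are placeholders; replace them with your own values.
fun main() {
    GraphDatabase.driver("bolt://my_vps_ip:7687", AuthTokens.basic("neo4j", "secret")).use { driver ->
        driver.session().use { session ->
            val ok = session.run("RETURN 1 AS ok").single().get("ok").asInt()
            println("Bolt connection works, server answered: $ok")
        }
    }
}

If this times out while http://my_vps_ip:7474 works, the problem is almost certainly the Bolt listen address or a firewall/security-group rule on port 7687.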
There is a Neo4j knowledge base article about this exact issue.
Quote:
This error can be resolved by editing the file
$NEO4J_HOME/conf/neo4j.conf and uncommenting:
# To have Bolt accept non-local connections, uncomment this line:
dbms.connector.bolt.address=0.0.0.0:7687
One of our pipelines failed this morning with an error we've never seen before. In addition, we had to manually delete the one VM that was spun up in order to cancel/stop the job.
Has anything changed in the Dataflow service that could cause this error?
0 [main] INFO com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner - PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 49 files. Enable logging at DEBUG level to see which files will be staged.
2243 [main] INFO com.<removed>.cdf.dfp.DFPDenormalizationCloudDataFlowJob - Successfully created cloud dataflow service pipeline
2282 [main] INFO com.<removed>.cdf.dfp.DFPDenormalizationCloudDataFlowJob - Last loaded table was found. It will be processed for denormalization: Clicks_06_2015
2282 [main] INFO com.<removed>.cdf.dfp.DFPDenormalizationCloudDataFlowJob - Last loaded table was found. It will be processed for denormalization: ActiveViews_06_2015
2282 [main] INFO com.<removed>.cdf.dfp.DFPDenormalizationCloudDataFlowJob - Last loaded table was found. It will be processed for denormalization: Impressions_06_2015
2435 [main] WARN com.google.cloud.dataflow.sdk.Pipeline - Transform <removed>:<removed>.advertisers2 does not have a stable unique name. In the future, this will prevent reloading streaming pipelines
2615 [main] WARN com.google.cloud.dataflow.sdk.Pipeline - Transform <removed>:<removed>.lineitems2 does not have a stable unique name. In the future, this will prevent reloading streaming pipelines
2616 [main] WARN com.google.cloud.dataflow.sdk.Pipeline - Transform <removed>:<removed>.creative2name2 does not have a stable unique name. In the future, this will prevent reloading streaming pipelines
2616 [main] WARN com.google.cloud.dataflow.sdk.Pipeline - Transform <removed>:<removed>.adunit2site2 does not have a stable unique name. In the future, this will prevent reloading streaming pipelines
3236 [main] INFO com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner - Executing pipeline on the Dataflow Service, which will have billing implications related to Google Compute Engine usage and other Google Cloud Services.
3241 [main] INFO com.google.cloud.dataflow.sdk.util.PackageUtil - Uploading 49 files from PipelineOptions.filesToStage to staging location to prepare for execution.
41834 [main] INFO com.google.cloud.dataflow.sdk.util.PackageUtil - Uploading PipelineOptions.filesToStage complete: 10 files newly uploaded, 39 files cached
Dataflow SDK version: 0.4.150602
51003 [main] INFO com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner - To access the Dataflow monitoring console, please navigate to https://console.developers.google.com/project/<removed>/dataflow/job/2015-06-11_16_39_02-17130055143605818331
Submitted job: 2015-06-11_16_39_02-17130055143605818331
51004 [main] INFO com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner - To cancel the job using the 'gcloud' tool, run:
> gcloud alpha dataflow jobs --project=<removed> cancel 2015-06-11_16_39_02-17130055143605818331
2015-06-11T23:39:02.506Z: Detail: (b056559940543e6a): Expanding GroupByKey operations into optimizable parts.
2015-06-11T23:39:02.509Z: Detail: (b056559940543d60): Annotating graph with Autotuner information.
2015-06-11T23:39:02.759Z: Detail: (b0565599405437a9): Fusing adjacent ParDo, Read, Write, and Flatten operations
2015-06-11T23:39:02.762Z: Detail: (b05655994054369f): Fusing consumer Impressions_06_2015-ParDoDFP-transform into Impressions_06_2015-BQ-Read
2015-06-11T23:39:02.764Z: Detail: (b056559940543595): Fusing consumer Impressions_06_2015-BQ-Write into Impressions_06_2015-ParDoDFP-transform
2015-06-11T23:39:02.766Z: Detail: (b05655994054348b): Fusing consumer ActiveViews_06_2015-ParDoDFP-transform into ActiveViews_06_2015-BQ-Read
2015-06-11T23:39:02.767Z: Detail: (b056559940543381): Fusing consumer ActiveViews_06_2015-BQ-Write into ActiveViews_06_2015-ParDoDFP-transform
2015-06-11T23:39:02.769Z: Detail: (b056559940543277): Fusing consumer Clicks_06_2015-ParDoDFP-transform into Clicks_06_2015-BQ-Read
2015-06-11T23:39:02.771Z: Detail: (b05655994054316d): Fusing consumer Clicks_06_2015-BQ-Write into Clicks_06_2015-ParDoDFP-transform
2015-06-11T23:39:02.818Z: Detail: (b056559940543987): Adding StepResource setup and teardown to workflow graph.
2015-06-11T23:39:18.614Z: Error: (5494fb7a460f58a8): Workflow failed. Causes: (20fbc2bb0e7cb0b1): One or more operations had an error: 'operation-1434065943092-518467f1f5b21-8d000d8a-d5cd5762': 'The resource 'projects/<removed>/zones/us-central1-a/disks/dfp-denormalization-job-1-06111639-3db5-harness-0' is not ready'.
2015-06-11T23:39:18.651Z: Detail: (4fb958a4957733a5): Cleaning up.
2015-06-11T23:40:36.126Z: Error: (d41cf136c17a5e79): Workflow failed. Causes: (20fbc2bb0e7cb0b1): One or more operations had an error: 'operation-1434065943092-518467f1f5b21-8d000d8a-d5cd5762': 'The resource 'projects/<removed>/zones/us-central1-a/disks/dfp-denormalization-job-1-06111639-3db5-harness-0' is not ready'.
2015-06-11T23:43:05.998Z: Warning: (c5964e114f42988b): Job 2015-06-11_16_39_02-17130055143605818331 is already finishing. Ignoring cancel request.
2015-06-11T23:48:04.715Z: Warning: (cf462c726cde3704): Job 2015-06-11_16_39_02-17130055143605818331 is already finishing. Ignoring cancel request.
2015-06-11T23:50:35.529Z: Warning: Internal Issue (4fb958a495773599): 65177287:8503
748739 [main] INFO com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner - Job finished with status FAILED
748740 [main] ERROR com.<removed>.cdf.dfp.DFPDenormalizationCloudDataFlowJob - Job "dfp-denormalization-job-1434066640362" failed. Job may be retried.
This was a temporary issue with the Google Compute Engine API that has since been resolved. When calling GCE on behalf of the user, Dataflow will attempt to work around any transient errors.