I am running a self-hosted OKD 4 cluster with the minimum production requirements (three control-plane nodes and two compute nodes). The setup includes a Jenkins installation, installed via Helm (https://www.jenkins.io/doc/book/installing/kubernetes/). So far everything has worked fine: builds start automatically when changes are pushed to GitHub and, when successful, are deployed to the same cluster that Jenkins runs in.
Currently, however, I am facing a problem when a build job executes a Spring Boot test that starts a persistence context: the build agent (a jdk-11 image, see the additionalAgents configuration below) gets killed as soon as Spring starts the persistence context. Downloading dependencies and compiling work fine, btw.
additionalAgents:
  jdk-11:
    podName: jdk-11
    customJenkinsLabels: jdk-11
    image: jenkins/jnlp-agent-jdk11
    tag: latest
    ...
When the tests are disabled the job runs fine. But as soon as the persistence context gets initialised, the agent gets killed.
These are the configurations I have tried for the test:
1. Starting with an in-memory H2 database and Flyway provisioning.
2. Without Flyway provisioning.
3. Even without the database connection string set.
The point at which the job gets killed is almost the same in each case:
For 1. it is
2021-10-20 22:44:06.637 INFO 299 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-20 22:44:07.032 INFO 299 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 310 ms. Found 2 JPA repository interfaces.
2021-10-20 22:44:08.240 INFO 299 --- [ main] o.s.cloud.context.scope.GenericScope : BeanFactory id=1c9e8306-7514-338e-8a9f-3cfba5c1169b
2021-10-20 22:44:10.527 INFO 299 --- [ main] o.f.c.internal.license.VersionPrinter : Flyway Community Edition 7.7.3 by Redgate
2021-10-20 22:44:10.532 INFO 299 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2021-10-20 22:44:11.744 INFO 299 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2021-10-20 22:44:12.041 INFO 299 --- [ main] o.f.c.i.database.base.DatabaseType : Database: jdbc:h2:mem:testdb (H2 1.4)
Killed
For 2.
2021-10-21 19:50:51.604 INFO 306 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-21 19:50:52.005 INFO 306 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 391 ms. Found 2 JPA repository interfaces.
2021-10-21 19:50:53.510 INFO 306 --- [ main] o.s.cloud.context.scope.GenericScope : BeanFactory id=0fd77ef3-b5a2-35cb-b157-6d27c0cfe9a5
2021-10-21 19:50:56.405 INFO 306 --- [ main] o.hibernate.jpa.internal.util.LogHelper : HHH000204: Processing PersistenceUnitInfo [name: default]
2021-10-21 19:50:56.708 INFO 306 --- [ main] org.hibernate.Version : HHH000412: Hibernate ORM core version 5.4.32.Final
2021-10-21 19:50:57.503 INFO 306 --- [ main] o.hibernate.annotations.common.Version : HCANN000001: Hibernate Commons Annotations {5.1.2.Final}
Killed
And for 3.
2021-10-21 22:02:48.810 INFO 309 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2021-10-21 22:02:49.198 INFO 309 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 380 ms. Found 2 JPA repository interfaces.
2021-10-21 22:02:50.509 INFO 309 --- [ main] o.s.cloud.context.scope.GenericScope : BeanFactory id=0fd77ef3-b5a2-35cb-b157-6d27c0cfe9a5
2021-10-21 22:02:53.523 INFO 309 --- [ main] o.hibernate.jpa.internal.util.LogHelper : HHH000204: Processing PersistenceUnitInfo [name: default]
2021-10-21 22:02:53.898 INFO 309 --- [ main] org.hibernate.Version : HHH000412: Hibernate ORM core version 5.4.32.Final
Killed
The log of the Jenkins pod just states
Terminated Kubernetes instance for agent jenkins/jdk-11-bjtz5
Disconnected computer jdk-11-bjtz5
2021-10-21 22:02:57.342+0000 [id=465] INFO o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent jenkins/jdk-11-bjtz5
2021-10-21 22:02:57.342+0000 [id=465] INFO o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer jdk-11-bjtz5
2021-10-21 22:02:57.356+0000 [id=436] INFO j.s.DefaultJnlpSlaveReceiver#channelClosed: Computer.threadPoolForRemoting [#56] for jdk-11-bjtz5 terminated: java.nio.channels.ClosedChannelException
In all cases there are no exceptions, stack traces or suspicious events. The behaviour is reproducible: when I run the build with the same configuration again, the agent gets killed at exactly the same step in the test.
The setup:
Jenkins Version 2.303.2
Jenkins uses a MySQL database running in the same cluster
all Jenkins plugins are up-to-date
OKD currently running at version 4.8.0-0.okd-2021-10-10-030117
currently there are no resource quotas set and the system still has plenty of free resources
I presume that a small piece of configuration is missing to make this work, but I just cannot find what it could be. So I am asking: has anyone had the same issue? Or any guesses what the missing part could be?
If any information is missing, please point it out and I will add it.
After taking a step back I looked at the actual pod that runs the build and found that the memory limit of the agent was the problem.
Increasing the limit solved it!
I did that by editing my local jenkins-values.yaml and raising the limits section of the agent: block.
What confused me was that no log entry reported that the memory limit had been exceeded.
My next step will be to set a memory limit for the test step via Java options, so that the Maven process fails before the pod exceeds its limit. I think that would be more transparent in the build.
As a side note: the limit was previously set to 512Mi and was exceeded by only ~10 MiB -.-
Luckily I found it at this point; the other build jobs only ran fine because they used fewer resources (I had not expected that just starting Hibernate would exceed the 512 MiB mark).
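For anyone who wants to derive such a Java option from the pod limit, here is a minimal sketch. It converts a Kubernetes memory quantity into bytes and builds an -Xmx flag from it. The helper names and the 50% heap share are my own assumptions, not something from the Jenkins chart; the share has to leave room for metaspace, thread stacks and any other JVM running in the same container (Maven itself plus a forked test JVM, for example). Only a subset of the quantity suffixes is handled.

```python
import re

# Subset of the suffixes Kubernetes accepts for memory quantities (assumption:
# this is enough for typical limits such as "512Mi" or "1Gi").
_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
    "k": 1000, "M": 1000**2, "G": 1000**3, "": 1,
}


def parse_k8s_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' into bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|k|M|G)?", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit or ""]


def max_heap_flag(limit: str, heap_fraction: float = 0.5) -> str:
    """Build an -Xmx flag that gives heap_fraction of the limit to the heap.

    heap_fraction is a hypothetical tuning knob; the remainder is left for
    non-heap memory and other processes in the container.
    """
    heap_mib = int(parse_k8s_memory(limit) * heap_fraction) // (1024**2)
    return f"-Xmx{heap_mib}m"


if __name__ == "__main__":
    print(max_heap_flag("512Mi"))
    print(max_heap_flag("1Gi"))
```

With a heap cap like this, a test that needs more memory should fail with an OutOfMemoryError in the build log instead of the whole agent pod being killed silently.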
I'm trying to deploy the entire Spring Cloud Data Flow platform to a MicroK8s cluster running on one of our servers, a VM with Ubuntu 20.04. Before doing anything on the target server, I tried the deployment on my local computer (same OS), where I succeeded and even created and ran one stream. Nevertheless, I am currently experiencing an error both on my local computer and on the VM, and I can't manage to pinpoint the root cause.
My current situation:
I'm following the official guide for deploying SCDF using kubectl, the only difference being that I'm using tag v2.9.4, the latest at the time of writing, instead of v2.9.1. I also skipped the configuration of monitoring frameworks, and hence commented out the relevant lines in the configuration of the SCDF server, as suggested in the docs. The Kafka message broker and the MySQL database are deployed without issues.
But after executing the kubectl commands to create the config map, service and deployment for Skipper, I can see that the Skipper pod goes into status "CrashLoopBackOff". Checking the logs of the pod, the only thing I see is that the application is terminated right after it seems to have started:
[...]
2022-04-11 15:00:11.713 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 7577 (http) with context path ''
2022-04-11 15:00:11.907 INFO 1 --- [ main] o.s.c.s.s.app.SkipperServerApplication : Started SkipperServerApplication in 78.901 seconds (JVM running for 82.435)
2022-04-11 15:00:12.531 INFO 1 --- [ionShutdownHook] o.s.s.s.DefaultStateMachineService : Entering stop sequence, stopping all managed machines
2022-04-11 15:00:12.617 INFO 1 --- [ionShutdownHook] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2022-04-11 15:00:12.703 INFO 1 --- [ionShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2022-04-11 15:00:12.799 INFO 1 --- [ionShutdownHook] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
Native Memory Tracking:
Total: reserved=961864767, committed=325411903
- Java Heap (reserved=356515840, committed=138334208)
(mmap: reserved=356515840, committed=138334208)
- Class (reserved=269444100, committed=94409732)
(classes #17623)
( instance classes #16455, array classes #1168)
(malloc=3355652 #45645)
(mmap: reserved=266088448, committed=91054080)
( Metadata: )
( reserved=79691776, committed=78340096)
( used=76414680)
( free=1925416)
( waste=0 =0.00%)
( Class space:)
( reserved=186396672, committed=12713984)
( used=11544696)
( free=1169288)
( waste=0 =0.00%)
- Thread (reserved=14794856, committed=1323112)
(thread #14)
(stack: reserved=14729216, committed=1257472)
(malloc=51792 #86)
(arena=13848 #25)
- Code (reserved=255686068, committed=26629556)
(malloc=2053556 #8654)
(mmap: reserved=253632512, committed=24576000)
- GC (reserved=1728178, committed=1019570)
(malloc=560818 #2163)
(mmap: reserved=1167360, committed=458752)
- Compiler (reserved=35543622, committed=35543622)
(malloc=71174 #1162)
(arena=35472448 #19)
- Internal (reserved=432627, committed=432627)
(malloc=399859 #1104)
(mmap: reserved=32768, committed=32768)
- Other (reserved=10248, committed=10248)
(malloc=10248 #3)
- Symbol (reserved=22101496, committed=22101496)
(malloc=19867360 #240000)
(arena=2234136 #1)
- Native Memory Tracking (reserved=4899928, committed=4899928)
(malloc=9656 #122)
(tracking overhead=4890272)
- Arena Chunk (reserved=81808, committed=81808)
(malloc=81808)
- Tracing (reserved=1, committed=1)
(malloc=1 #1)
- Logging (reserved=4572, committed=4572)
(malloc=4572 #192)
- Arguments (reserved=19063, committed=19063)
(malloc=19063 #495)
- Module (reserved=310496, committed=310496)
(malloc=310496 #2710)
- Synchronizer (reserved=283672, committed=283672)
(malloc=283672 #2348)
- Safepoint (reserved=8192, committed=8192)
(mmap: reserved=8192, committed=8192)
No matter how many times the pod is restarted, it always exits at this phase. This is the output of kubectl get all:
NAME READY STATUS RESTARTS AGE
pod/kafka-zk-6b6f4976cf-9hjzn 1/1 Running 0 69m
pod/kafka-broker-0 1/1 Running 0 58m
pod/mysql-7c57b4cfdf-njb97 1/1 Running 0 39m
pod/skipper-b46bfd5fd-wrnqv 0/1 CrashLoopBackOff 13 (57s ago) 38m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 148m
service/kafka-zk ClusterIP 10.152.183.62 <none> 2181/TCP,2888/TCP,3888/TCP 69m
service/kafka-broker ClusterIP None <none> 9092/TCP 69m
service/mysql ClusterIP 10.152.183.139 <none> 3306/TCP 40m
service/skipper LoadBalancer 10.152.183.250 <pending> 80:31955/TCP 38m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kafka-zk 1/1 1 1 69m
deployment.apps/mysql 1/1 1 1 39m
deployment.apps/skipper 0/1 1 0 38m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kafka-zk-6b6f4976cf 1 1 1 69m
replicaset.apps/mysql-7c57b4cfdf 1 1 1 39m
replicaset.apps/skipper-b46bfd5fd 1 1 0 38m
NAME READY AGE
statefulset.apps/kafka-broker 1/1 69m
What I tried:
Changing the Skipper service type from LoadBalancer to NodePort (I have not enabled MetalLB, so load balancing is not provided), but it didn't work;
Changing the port exposed by the container: the default configuration uses port 80, and I changed it to 7577 (also in the service configuration), but the error still occurs;
Downgrading Skipper to version 2.8.2, the one used in the documentation above, but the behaviour was exactly the same;
Increasing the logging level by setting logging.level.org.springframework to DEBUG and then to TRACE, which didn't surface anything useful in the logs, except a cryptic line that I could not find anywhere on Google:
[...]
2022-04-11 15:22:38.818 DEBUG 1 --- [ main] o.s.c.c.CompositeCompatibilityVerifier : All conditions are passing
2022-04-11 15:22:39.098 DEBUG 1 --- [ main] ocalVariableTableParameterNameDiscoverer : Cannot find '.class' file for class [class org.springframework.statemachine.boot.autoconfigure.StateMachineAutoConfiguration$StateMachineMonitoringConfiguration$$EnhancerBySpringCGLIB$$b266f314] - unable to determine constructor/method parameter names
2022-04-11 15:22:39.925 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 7577 (http) with context path ''
2022-04-11 15:22:40.244 INFO 1 --- [ main] o.s.c.s.s.app.SkipperServerApplication : Started SkipperServerApplication in 76.267 seconds (JVM running for 79.716)
[...]
Can anyone suggest what to try next, or point me to a way of diagnosing this issue further?
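As an aid for reading the Native Memory Tracking summary in the Skipper log above, here is a minimal sketch that extracts the committed size per category and converts it to MiB. The function name and the regular expression are my own and only cover top-level category lines in the shape shown above ("- Name (reserved=N, committed=N)"); the sample is a short excerpt of the log. For reference, the Total line in the log (committed=325411903 bytes) works out to roughly 310 MiB.

```python
import re
from typing import Dict

# Matches top-level category lines such as
# "- Java Heap (reserved=356515840, committed=138334208)".
_CATEGORY = re.compile(
    r"^-\s*(?P<name>[A-Za-z ]+?)\s*"
    r"\(reserved=(?P<reserved>\d+), committed=(?P<committed>\d+)\)"
)

MIB = 1024 * 1024


def committed_mib_by_category(nmt_text: str) -> Dict[str, float]:
    """Return committed memory in MiB for each top-level NMT category."""
    result: Dict[str, float] = {}
    for line in nmt_text.splitlines():
        match = _CATEGORY.match(line.strip())
        if match:
            result[match.group("name")] = int(match.group("committed")) / MIB
    return result


# Excerpt of the summary from the pod log; nested "(mmap: ...)" lines are ignored.
SAMPLE = """\
Total: reserved=961864767, committed=325411903
- Java Heap (reserved=356515840, committed=138334208)
(mmap: reserved=356515840, committed=138334208)
- Class (reserved=269444100, committed=94409732)
- Native Memory Tracking (reserved=4899928, committed=4899928)
"""

if __name__ == "__main__":
    for name, mib in committed_mib_by_category(SAMPLE).items():
        print(f"{name}: {mib:.1f} MiB")
```

Comparing these figures with the memory limit in the Skipper deployment would at least show whether memory is a plausible reason for the shutdown.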
We use an XGBoost model for regression and tune its hyperparameters with a grid search.
We run this model on a 90 GB H2O cluster. The process has been running for over 1.2 years, but it has suddenly started to stop with "Closing connection _sid_af1c at exit".
The training data set size is 800 000; because of this error we reduced it to 500 000, but the same error occurred.
ntrees - 300,400
depth - 8.10
variables - 382
I have attached the H2O log and our application error log. Could you please help us fix this issue?
----------------------------------------H2o Log [Start]----------------------
**We start H2O as a 2-node cluster, but the H2O log was created on only one node.**
INFO water.default: ----- H2O started -----
INFO water.default: Build git branch: master
INFO water.default: Build git hash: 0588cccd72a7dc1274a83c30c4ae4161b92d9911
INFO water.default: Build git describe: jenkins-master-5236-4-g0588ccc
INFO water.default: Build project version: 3.33.0.5237
INFO water.default: Build age: 1 year, 3 months and 17 days
INFO water.default: Built by: 'jenkins'
INFO water.default: Built on: '2020-10-27 19:21:29'
WARN water.default:
WARN water.default: *** Your H2O version is too old! Please download the latest version from http://h2o.ai/download/ ***
WARN water.default:
INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone]
INFO water.default: Processed H2O arguments: [-flatfile, /usr/local/h2o/flatfile.txt, -port, 54321]
INFO water.default: Java availableProcessors: 20
INFO water.default: Java heap totalMemory: 962.5 MB
INFO water.default: Java heap maxMemory: 42.67 GB
INFO water.default: Java version: Java 1.8.0_262 (from Oracle Corporation)
INFO water.default: JVM launch parameters: [-Xmx48g]
INFO water.default: JVM process id: 83043@masterb.xxxxx.com
INFO water.default: OS version: Linux 3.10.0-1127.10.1.el7.x86_64 (amd64)
INFO water.default: Machine physical memory: 62.74 GB
INFO water.default: Machine locale: en_US
INFO water.default: X-h2o-cluster-id: 1644769990156
INFO water.default: User name: 'root'
INFO water.default: IPv6 stack selected: false
INFO water.default: Possible IP Address: ens192 (ens192), xxxxxxxxxxxxxxxxxxxx
INFO water.default: Possible IP Address: ens192 (ens192), xxxxxxxxxxx
INFO water.default: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
INFO water.default: Possible IP Address: lo (lo), 127.0.0.1
INFO water.default: H2O node running in unencrypted mode.
INFO water.default: Internal communication uses port: 54322
INFO water.default: Listening for HTTP and REST traffic on http://xxxxxxxxxxxx:54321/
INFO water.default: H2O cloud name: 'root' on /xxxxxxxxxxxx:54321, discovery address /xxxxxxxxxxxx:57653
INFO water.default: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
INFO water.default: 1. Open a terminal and run 'ssh -L 55555:localhost:54321 root@xxxxxxxxxxxx'
INFO water.default: 2. Point your browser to http://localhost:55555
INFO water.default: Log dir: '/tmp/h2o-root/h2ologs'
INFO water.default: Cur dir: '/usr/local/h2o/h2o-3.33.0.5237'
INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized
INFO water.default: HDFS subsystem successfully initialized
INFO water.default: S3 subsystem successfully initialized
INFO water.default: GCS subsystem successfully initialized
INFO water.default: Flow dir: '/root/h2oflows'
INFO water.default: Cloud of size 1 formed [/xxxxxxxxxxxx:54321]
INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
INFO water.default: XGBoost extension initialized
INFO water.default: KrbStandalone extension initialized
INFO water.default: Registered 2 core extensions in: 2632ms
INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone]
INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_gpu
INFO hex.tree.xgboost.XGBoostExtension: XGBoost supported backends: [WITH_GPU, WITH_OMP]
INFO water.default: Registered: 217 REST APIs in: 353ms
INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4]
INFO water.default: Registered: 291 schemas in 112ms
INFO water.default: H2O started in 4612ms
INFO water.default:
INFO water.default: Open H2O Flow in your web browser: http://xxxxxxxxxxxx:54321
INFO water.default:
INFO water.default: Cloud of size 2 formed [mastera.xxxxxxxxxxxx.com/xxxxxxxxxxxx:54321, masterb.xxxxxxxxxxxx.com/xxxxxxxxxxxx:54321]
INFO water.default: Locking cloud to new members, because water.rapids.Session$1
INFO hex.tree.xgboost.task.XGBoostUpdater: Initial Booster created, size=448
ERROR water.default: Got IO error when sending a batch of bytes:
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:468)
at water.H2ONode$SmallMessagesSendThread.sendBuffer(H2ONode.java:605)
at water.H2ONode$SmallMessagesSendThread.run(H2ONode.java:588)
----------------------------------------H2o Log [End]--------------------------------
----------------------------------------Application Log [Start]----------------------
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
Warning: Your H2O cluster version is too old (1 year, 3 months and 17 days)! Please download and install the latest version from http://h2o.ai/download/
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
release memory here...
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
Warning: Your H2O cluster version is too old (1 year, 3 months and 17 days)! Please download and install the latest version from http://h2o.ai/download/
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
Parse progress: |█████████████████████████████████████████████████████████| 100%
xgboost Grid Build progress: |████████Closing connection _sid_af1c at exit
H2O session _sid_af1c was not closed properly.
Closing connection _sid_9313 at exit
H2O session _sid_9313 was not closed properly.
----------------------------------------Application Log [End]----------------------
This typically means that one of the nodes crashed. That can happen for many different reasons; memory is the most common one.
I see your machine has about 64GB of physical memory and H2O is getting 48GB out of that. XGBoost runs in native memory, not in the JVM memory. For XGBoost we recommend splitting the physical memory 50-50 to H2O and XGBoost.
You are running a development version of H2O (3.33) - I suggest upgrading to the latest stable.
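To make the 50-50 recommendation concrete for the numbers in the log above (62.74 GB physical memory, -Xmx48g), here is a minimal sketch. The function names and the rounding down to whole gigabytes are my own choices for illustration, not something H2O prescribes.

```python
import math


def suggested_xmx(physical_gb: float, jvm_share: float = 0.5) -> str:
    """Return an -Xmx flag giving jvm_share of physical memory to the H2O JVM.

    The result is rounded down to whole gigabytes (an arbitrary, conservative
    choice for this sketch).
    """
    heap_gb = math.floor(physical_gb * jvm_share)
    return f"-Xmx{heap_gb}g"


def native_headroom_gb(physical_gb: float, xmx_gb: float) -> float:
    """Memory left outside the JVM heap, which native XGBoost has to share
    with the operating system and everything else on the machine."""
    return physical_gb - xmx_gb


if __name__ == "__main__":
    # Values taken from the H2O log in the question.
    print(suggested_xmx(62.74))
    print(round(native_headroom_gb(62.74, 48), 2))
```

With the current -Xmx48g setting only about 14.7 GB per node remains outside the heap, which is the situation the 50-50 split is meant to avoid.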
We recently upgraded to k8s version 1.20.9. I am not sure whether that is the root cause, but the SCDF server pod now fails to come up with the error below.
I usually deploy the SCDF server using the kubectl-based deployment.
Does anyone have an idea? The error is attached below.
2022-01-05 05:08:56.207 INFO 1 --- [ main] o.a.coyote.http11.Http11NioProtocol : Starting ProtocolHandler ["http-nio-80"]
2022-01-05 05:08:56.300 WARN 1 --- [ main] ConfigServletWebServerApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'webServerStartStop'; nested exception is org.springframework.boot.web.server.WebServerException: Unable to start embedded Tomcat server
2022-01-05 05:08:56.798 INFO 1 --- [ main] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2022-01-05 05:08:56.893 INFO 1 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2022-01-05 05:08:57.194 INFO 1 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
2022-01-05 05:08:57.197 INFO 1 --- [ main] o.a.coyote.http11.Http11NioProtocol : Pausing ProtocolHandler ["http-nio-80"]
2022-01-05 05:08:57.197 INFO 1 --- [ main] o.apache.catalina.core.StandardService : Stopping service [Tomcat]
2022-01-05 05:08:57.292 INFO 1 --- [ main] o.a.coyote.http11.Http11NioProtocol : Stopping ProtocolHandler ["http-nio-80"]
2022-01-05 05:08:57.293 INFO 1 --- [ main] o.a.coyote.http11.Http11NioProtocol : Destroying ProtocolHandler ["http-nio-80"]
2022-01-05 05:08:57.793 ERROR 1 --- [ main] o.s.boot.SpringApplication : Application run failed
org.springframework.context.ApplicationContextException: Failed to start bean 'webServerStartStop'; nested exception is org.springframework.boot.web.server.WebServerException: Unable to start embedded Tomcat server
Caused by: org.springframework.boot.web.server.WebServerException: Unable to start embedded Tomcat server
Caused by: java.lang.IllegalArgumentException: standardService.connector.startFailed
Caused by: org.apache.catalina.LifecycleException: Protocol handler start failed
Caused by: java.net.SocketException: Permission denied
What stands out in the trace is "SocketException: Permission denied". It is likely due to a security configuration change in the upgrade that affects the TCP layer. I would start with your security configuration. Keep us posted.
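One thing worth checking, given that the log shows Tomcat binding to port 80 ("http-nio-80"): on Linux, ports below 1024 are privileged by default, so a process that is not root and lacks CAP_NET_BIND_SERVICE gets exactly this "Permission denied" on bind. Whether that applies here is an assumption on my part, since the question does not show the pod's security context. The sketch below, with helper names of my own, shows the kind of check that could be run inside the container.

```python
import socket

# Default lower bound for unprivileged ports on Linux; it can be changed via
# the net.ipv4.ip_unprivileged_port_start sysctl.
PRIVILEGED_PORT_LIMIT = 1024


def is_privileged_port(port: int) -> bool:
    """True if binding the port normally requires elevated privileges."""
    return 0 < port < PRIVILEGED_PORT_LIMIT


def can_bind(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind a TCP socket; return False only on a permission error.

    Other errors (for example, address already in use) are not caught,
    because they point to a different problem.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            return False
        return True


if __name__ == "__main__":
    for port in (80, 9393):
        print(port, "privileged:", is_privileged_port(port),
              "bindable:", can_bind(port))
```

If port 80 turns out not to be bindable, moving the server to an unprivileged port such as 9393 and letting the Kubernetes service map port 80 to it would be one way around the restriction.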
I use SCDF with Skipper server 2.3.2 and Data Flow server 2.4.2, deployed with Docker Compose.
I built a task with the composed-task-runner 2.1.3 release.
I tried with my own task and with the samples, but composed-task-runner does not launch the tasks.
Example:
composed-task-runner && mytask && myTask
2020-04-14 14:40:41.633 INFO 231 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'taskExecutor'
2020-04-14 14:40:41.752 INFO 231 --- [ main] o.s.b.c.r.s.JobRepositoryFactoryBean : No database type set, using meta data indicating: POSTGRES
2020-04-14 14:40:41.885 INFO 231 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
2020-04-14 14:40:42.083 DEBUG 231 --- [ main] o.s.c.t.a.c.DataFlowConfiguration : Not configuring basic security for accessing the Data Flow Server
2020-04-14 14:40:46.922 DEBUG 231 --- [ main] o.s.c.t.r.s.TaskRepositoryInitializer : Initializing task schema for postgresql database
2020-04-14 14:40:47.461 DEBUG 231 --- [ main] BatchConfiguration$ReferenceTargetSource : Initializing lazy target object
2020-04-14 14:40:47.552 DEBUG 231 --- [ main] o.s.c.t.r.support.SimpleTaskRepository : Starting: TaskExecution{executionId=102, parentExecutionId=null, exitCode=null, taskName='testComposedTask2', startTime=Tue Apr 14 14:40:47 GMT 2020, endTime=null, exitMessage='null', externalExecutionId='null', errorMessage='null', arguments=[xx=7777, --spring.cloud.data.flow.platformname=default, --spring.cloud.task.executionid=102, --spring.cloud.data.flow.taskappname=composed-task-runner]}
2020-04-14 14:40:47.675 INFO 231 --- [ main] .t.a.c.ComposedtaskrunnerTaskApplication : Started ComposedtaskrunnerTaskApplication in 33.768 seconds (JVM running for 38.059)
2020-04-14 14:40:47.743 DEBUG 231 --- [ main] o.s.c.t.r.support.SimpleTaskRepository : Updating: TaskExecution with executionId=102 with the following {exitCode=0, endTime=Tue Apr 14 14:40:47 GMT 2020, exitMessage='null', errorMessage='null'}
2020-04-14 14:40:47.831 INFO 231 --- [ Thread-7] o.s.s.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'taskExecutor'
2020-04-14 14:40:47.836 INFO 231 --- [ Thread-7] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2020-04-14 14:40:47.852 INFO 231 --- [ Thread-7] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2020-04-14 14:40:47.909 INFO 231 --- [ Thread-7] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
Could you help me? I am using a Postgres database.
New remark:
I built a new composed task as described in the documentation:
testComposedTaskA with definition myTask && timestamp
dataflow:>task list
╔═══════════════════════════╤═══════════════════╤═══════════╤═══════════╗
║ Task Name │ Task Definition │description│Task Status║
╠═══════════════════════════╪═══════════════════╪═══════════╪═══════════╣
║testComposedTaskA-myTask │myTask │ │UNKNOWN ║
║testComposedTaskA-timestamp│timestamp │ │UNKNOWN ║
║testComposedTaskA │myTask && timestamp│ │UNKNOWN ║
╚═══════════════════════════╧═══════════════════╧═══════════╧═══════════╝
dataflow:>task launch testComposedTaskA
Launched task 'testComposedTaskA' with execution id 135
dataflow:>task execution list
╔═════════════════╤═══╤════════════════════════════╤════════════════════════════╤═════════╗
║ Task Name │ID │ Start Time │ End Time │Exit Code║
╠═════════════════╪═══╪════════════════════════════╪════════════════════════════╪═════════╣
║testComposedTaskA│135│Fri Apr 17 15:04:42 GMT 2020│Fri Apr 17 15:04:42 GMT 2020│0 ║
╚═════════════════╧═══╧════════════════════════════╧════════════════════════════╧═════════╝
The sub-tasks are never executed. In the samples, we can see the sub-tasks being executed.
What is wrong?
As requested, here is the Data Flow server start log:
dataflow-server | 2020-04-20 07:43:24.047 INFO 1 --- [ main] o.h.e.t.j.p.i.JtaPlatformInitiator : HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
dataflow-server | 2020-04-20 07:43:24.061 INFO 1 --- [ main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
dataflow-server | 2020-04-20 07:43:27.735 DEBUG 1 --- [ main] o.s.c.t.c.SimpleTaskAutoConfiguration : Using org.springframework.cloud.task.configuration.DefaultTaskConfigurer TaskConfigurer
dataflow-server | 2020-04-20 07:43:27.745 DEBUG 1 --- [ main] o.s.c.t.c.DefaultTaskConfigurer : EntityManager was found, using JpaTransactionManager
dataflow-server | 2020-04-20 07:43:28.282 INFO 1 --- [ main] o.s.b.c.r.s.JobRepositoryFactoryBean : No database type set, using meta data indicating: POSTGRES
skipper | 2020-04-20 07:43:28.294 INFO 1 --- [ main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
dataflow-server | 2020-04-20 07:43:28.505 INFO 1 --- [ main] o.s.c.d.s.b.SimpleJobServiceFactoryBean : No database type set, using meta data indicating: POSTGRES
dataflow-server | 2020-04-20 07:43:28.975 WARN 1 --- [ main] JpaBaseConfiguration$JpaWebConfiguration : spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning
dataflow-server | 2020-04-20 07:43:29.392 INFO 1 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
dataflow-server | 2020-04-20 07:43:32.150 INFO 1 --- [ main] .s.c.DataFlowControllerAutoConfiguration : Skipper URI [http://skipper-server:7577/api]
dataflow-server | 2020-04-20 07:43:32.614 INFO 1 --- [ main] o.a.coyote.http11.Http11NioProtocol : Starting ProtocolHandler ["http-nio-9393"]
dataflow-server | 2020-04-20 07:43:32.764 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 9393 (http) with context path ''
dataflow-server | 2020-04-20 07:43:32.780 INFO 1 --- [ main] o.s.c.d.s.s.DataFlowServerApplication : Started DataFlowServerApplication in 68.947 seconds (JVM running for 78.165)
dataflow-server | 2020-04-20 07:43:33.396 INFO 1 --- [ main] .s.c.d.s.s.LauncherInitializationService : Added 'Local' platform account 'default' into Task Launcher repository.
The log at the start of the task:
2020-04-20 08:01:58.886 INFO 115 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data repositories in DEFAULT mode.
2020-04-20 08:01:59.008 INFO 115 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 68ms. Found 0 repository interfaces.
2020-04-20 08:02:00.545 INFO 115 --- [ main] o.s.cloud.context.scope.GenericScope : BeanFactory id=56219e3d-5a2d-3fb3-8901-c3dad5be90af
2020-04-20 08:02:00.943 INFO 115 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.transaction.annotation.ProxyTransactionManagementConfiguration' of type [org.springframework.transaction.annotation.ProxyTransactionManagementConfiguration$$EnhancerBySpringCGLIB$$cd22d842] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-04-20 08:02:01.049 INFO 115 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration' of type [org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration$$EnhancerBySpringCGLIB$$e93cdb3f] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-04-20 08:02:01.062 INFO 115 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfiguration' of type [org.springframework.cloud.task.batch.configuration.TaskBatchAutoConfiguration$$EnhancerBySpringCGLIB$$529283ce] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-04-20 08:02:01.094 INFO 115 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.cloud.task.batch.listener.BatchEventAutoConfiguration' of type [org.springframework.cloud.task.batch.listener.BatchEventAutoConfiguration$$EnhancerBySpringCGLIB$$9ae88dd1] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-04-20 08:02:02.529 INFO 115 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2020-04-20 08:02:03.239 INFO 115 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2020-04-20 08:02:04.489 INFO 115 --- [ main] o.hibernate.jpa.internal.util.LogHelper : HHH000204: Processing PersistenceUnitInfo [
name: default
...]
2020-04-20 08:02:07.075 INFO 115 --- [ main] org.hibernate.Version : HHH000412: Hibernate Core {5.3.13.Final}
2020-04-20 08:02:07.111 INFO 115 --- [ main] org.hibernate.cfg.Environment : HHH000206: hibernate.properties not found
2020-04-20 08:02:08.896 INFO 115 --- [ main] o.hibernate.annotations.common.Version : HCANN000001: Hibernate Commons Annotations {5.0.4.Final}
2020-04-20 08:02:10.736 INFO 115 --- [ main] org.hibernate.dialect.Dialect : HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL95Dialect
2020-04-20 08:02:10.825 INFO 115 --- [ main] o.h.e.j.e.i.LobCreatorBuilderImpl : HHH000422: Disabling contextual LOB creation as connection was null
2020-04-20 08:02:10.843 INFO 115 --- [ main] org.hibernate.type.BasicTypeRegistry : HHH000270: Type registration [java.util.UUID] overrides previous : org.hibernate.type.UUIDBinaryType#3eb91815
2020-04-20 08:02:11.457 INFO 115 --- [ main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2020-04-20 08:02:11.675 DEBUG 115 --- [ main] o.s.c.t.c.SimpleTaskAutoConfiguration : Using org.springframework.cloud.task.configuration.DefaultTaskConfigurer TaskConfigurer
2020-04-20 08:02:11.679 DEBUG 115 --- [ main] o.s.c.t.c.DefaultTaskConfigurer : EntityManager was found, using JpaTransactionManager
2020-04-20 08:02:11.902 INFO 115 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'taskExecutor'
2020-04-20 08:02:12.005 INFO 115 --- [ main] o.s.b.c.r.s.JobRepositoryFactoryBean : No database type set, using meta data indicating: POSTGRES
2020-04-20 08:02:12.107 INFO 115 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
2020-04-20 08:02:12.203 DEBUG 115 --- [ main] o.s.c.t.a.c.DataFlowConfiguration : Not configuring basic security for accessing the Data Flow Server
2020-04-20 08:02:15.942 DEBUG 115 --- [ main] o.s.c.t.r.s.TaskRepositoryInitializer : Initializing task schema for postgresql database
2020-04-20 08:02:16.551 DEBUG 115 --- [ main] BatchConfiguration$ReferenceTargetSource : Initializing lazy target object
2020-04-20 08:02:16.617 DEBUG 115 --- [ main] o.s.c.t.r.support.SimpleTaskRepository : Starting: TaskExecution{executionId=136, parentExecutionId=null, exitCode=null, taskName='testComposedTaskA', startTime=Mon Apr 20 08:02:16 GMT 2020, endTime=null, exitMessage='null', externalExecutionId='null', errorMessage='null', arguments=[--spring.cloud.data.flow.platformname=default, --spring.cloud.task.executionid=136, --spring.cloud.data.flow.taskappname=composed-task-runner]}
2020-04-20 08:02:16.686 INFO 115 --- [ main] .t.a.c.ComposedtaskrunnerTaskApplication : Started ComposedtaskrunnerTaskApplication in 31.176 seconds (JVM running for 34.541)
2020-04-20 08:02:16.717 DEBUG 115 --- [ main] o.s.c.t.r.support.SimpleTaskRepository : Updating: TaskExecution with executionId=136 with the following {exitCode=0, endTime=Mon Apr 20 08:02:16 GMT 2020, exitMessage='null', errorMessage='null'}
2020-04-20 08:02:16.778 INFO 115 --- [ Thread-7] o.s.s.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'taskExecutor'
2020-04-20 08:02:16.784 INFO 115 --- [ Thread-7] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2020-04-20 08:02:16.803 INFO 115 --- [ Thread-7] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...
2020-04-20 08:02:16.854 INFO 115 --- [ Thread-7] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.
The docker-compose configuration used:
dataflow-server:
  environment:
    - SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dataflow
    - SPRING_DATASOURCE_USERNAME=root
    - SPRING_DATASOURCE_PASSWORD=rootpw
    - SPRING_DATASOURCE_DRIVER_CLASS_NAME=org.postgresql.Driver
    - spring.jpa.properties.hibernate.temp.use_jdbc_metadata_defaults=false
    - spring.jpa.database-platform=org.hibernate.dialect.PostgreSQL95Dialect
  entrypoint: "./wait-for-it.sh -t 240 postgres:5432 -- java -jar /maven/spring-cloud-dataflow-server.jar"
skipper-server:
  environment:
    - SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/dataflow
    - SPRING_DATASOURCE_USERNAME=root
    - SPRING_DATASOURCE_PASSWORD=rootpw
    - SPRING_DATASOURCE_DRIVER_CLASS_NAME=org.postgresql.Driver
  entrypoint: "./wait-for-it.sh -t 240 postgres:5432 -- java -Djava.security.egd=file:/dev/./urandom -jar /spring-cloud-skipper-server.jar"
I hope this helps.
Regards
Frederic
From the log (composed-task-runner && mytask && myTask), the syntax used to create the composed task definition looks incorrect.
The app name composed-task-runner should not be part of the task DSL.
The Spring Cloud Data Flow documentation describes in detail how to create and manage composed tasks.
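As a rough illustration of the point above, the sketch below builds a sequential composed-task definition by joining task names with && and rejects the runner app name. The helper build_composed_dsl and the constant RUNNER_APP_NAME are hypothetical names used only for this example and are not part of Spring Cloud Data Flow; the task names are the ones from the log.

```python
# Hypothetical helper for illustration only; not part of Spring Cloud Data Flow.
RUNNER_APP_NAME = "composed-task-runner"


def build_composed_dsl(task_names):
    """Join task names into a sequential composed-task definition (a && b && ...)."""
    names = [name.strip() for name in task_names]
    if not names or any(not name for name in names):
        raise ValueError("at least one non-empty task name is required")
    if RUNNER_APP_NAME in names:
        # The runner app executes the composed task; it is not a step in the DSL.
        raise ValueError(f"'{RUNNER_APP_NAME}' must not appear in the task DSL")
    return " && ".join(names)


if __name__ == "__main__":
    print(build_composed_dsl(["mytask", "myTask"]))
```

The resulting string is what would be given as the definition when creating the task, for example in the Data Flow shell with task create my-composed-task --definition "mytask && myTask" (my-composed-task is a placeholder name).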
I am running jBPM (v7.18) in Docker on localhost using the following docker-compose configuration:
version: '2'
services:
  postgres:
    image: postgres:10.4
    volumes:
      - ./volumes/psql/:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=jbpm
      - POSTGRES_PASSWORD=jbpm
    ports:
      - 5432:5432
  jbpm:
    image: jboss/jbpm-server-full
    environment:
      JBPM_DB_DRIVER: postgres
      JBPM_DB_HOST: postgres
    ports:
      - 8080:8080
      - 8001:8001
    volumes:
      - "/Users/guest/prac/jbpm/quickfox:/opt/jboss/quickfox"
    depends_on:
      - postgres
I generated the business application from https://start.jbpm.org/
I start the business application's service in dev mode as follows:
./launch-dev.sh clean install
As per the documentation, the KIE Server configuration needs to be as follows:
kieserver.serverId=business-application-service
kieserver.serverName=business-application-service
kieserver.location=http://localhost:8090/rest/server
kieserver.controllers=http://localhost:8080/jbpm-console/rest/controller
(which are the default settings in application-dev.properties)
But when I start the service, it is not able to connect to business-central. I get the following logs:
2019-05-01 11:56:50.789 INFO 47000 --- [ main] o.k.s.s.j.u.f.r.BootstrapFormRenderer : Boostrap Form renderer templates loaded successfully.
2019-05-01 11:56:50.795 INFO 47000 --- [ main] o.k.s.s.j.u.f.r.PatternflyFormRenderer : patternfly Form renderer templates loaded successfully.
2019-05-01 11:56:50.799 INFO 47000 --- [ main] o.k.s.s.j.u.f.r.PatternflyFormRenderer : workbench Form renderer templates loaded successfully.
2019-05-01 11:56:50.801 INFO 47000 --- [ main] o.k.server.services.impl.KieServerImpl : jBPM-UI KIE Server extension has been successfully registered as server extension
2019-05-01 11:56:50.802 INFO 47000 --- [ main] o.k.server.services.impl.KieServerImpl : DMN KIE Server extension has been successfully registered as server extension
2019-05-01 11:56:50.806 INFO 47000 --- [ main] o.k.s.s.impl.policy.PolicyManager : Registered KeepLatestContainerOnlyPolicy{interval=0 ms} policy under name KeepLatestOnly
2019-05-01 11:56:50.807 INFO 47000 --- [ main] o.k.s.s.impl.policy.PolicyManager : Policy manager started successfully, activated policies are []
2019-05-01 11:56:50.817 WARN 47000 --- [ main] o.kie.server.common.KeyStoreHelperUtil : Unable to load key store. Using password from configuration
2019-05-01 11:56:50.933 WARN 47000 --- [ main] o.k.s.s.i.c.DefaultRestControllerImpl : Exception encountered while syncing with controller at http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev error Error while sending PUT request to http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev response code 405
2019-05-01 11:56:50.933 WARN 47000 --- [ main] o.k.s.s.i.ControllerBasedStartupStrategy : Unable to connect to any controllers, delaying container installation until connection can be established
2019-05-01 11:56:50.934 WARN 47000 --- [ntrollerConnect] o.kie.server.common.KeyStoreHelperUtil : Unable to load key store. Using password from configuration
2019-05-01 11:56:50.950 WARN 47000 --- [ntrollerConnect] o.k.s.s.i.c.DefaultRestControllerImpl : Exception encountered while syncing with controller at http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev error Error while sending PUT request to http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev response code 405
2019-05-01 11:56:51.009 INFO 47000 --- [ main] o.k.s.s.a.KieServerAutoConfiguration : KieServer (id business-application-service-dev) started successfully
2019-05-01 11:56:51.339 INFO 47000 --- [ main] org.apache.cxf.endpoint.ServerImpl : Setting the server's publish address to be /
2019-05-01 11:56:51.652 INFO 47000 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8090 (http) with context path ''
2019-05-01 11:56:51.658 INFO 47000 --- [ main] com.quickfox.service.Application : Started Application in 13.158 seconds (JVM running for 13.969)
2019-05-01 11:57:00.954 WARN 47000 --- [ntrollerConnect] o.kie.server.common.KeyStoreHelperUtil : Unable to load key store. Using password from configuration
2019-05-01 11:57:00.961 WARN 47000 --- [ntrollerConnect] o.k.s.s.i.c.DefaultRestControllerImpl : Exception encountered while syncing with controller at http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev error Error while sending PUT request to http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev response code 405
2019-05-01 11:57:10.963 WARN 47000 --- [ntrollerConnect] o.kie.server.common.KeyStoreHelperUtil : Unable to load key store. Using password from configuration
2019-05-01 11:57:10.972 WARN 47000 --- [ntrollerConnect] o.k.s.s.i.c.DefaultRestControllerImpl : Exception encountered while syncing with controller at http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev error Error while sending PUT request to http://localhost:8080/jbpm-console/rest/controller/server/business-application-service-dev response code 405
But if I use the following configuration, it works:
kieserver.serverId=business-application-service-dev
kieserver.serverName=business-application-service Dev
kieserver.location=http://localhost:8080/kie-server/services/rest/server
kieserver.controllers=http://localhost:8080/business-central/rest/controller
Can someone tell me the reason for this behavior? Please correct me if I am missing anything.
The second set of URLs that you used is the correct one. It seems the documentation needs to be corrected: the "jbpm-console" endpoint was used in older versions.
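The log in the question shows the KIE Server sending its PUT request to <controller>/server/<serverId>. A minimal sketch, assuming Python 3 and only the standard library, that builds that URL for both controller settings from the question so they can be compared side by side. The names registration_url and probe_status are hypothetical helpers for this example; the probe only reports the HTTP status of a plain GET and does not handle authentication, so it is a reachability check and not a reproduction of what the KIE Server does.

```python
import urllib.error
import urllib.request


def registration_url(controller, server_id):
    """Build the URL seen in the log: <controller>/server/<serverId>."""
    return f"{controller.rstrip('/')}/server/{server_id}"


def probe_status(url, timeout=5):
    """Return the HTTP status of a GET on url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status (for example 401, 404 or 405).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure or timeout.
        return None


if __name__ == "__main__":
    server_id = "business-application-service-dev"
    for controller in (
        "http://localhost:8080/jbpm-console/rest/controller",
        "http://localhost:8080/business-central/rest/controller",
    ):
        url = registration_url(controller, server_id)
        print(url, "->", probe_status(url))
```

If the two controller paths report different statuses, that points at which web application is actually deployed in the container.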