Docker containers running on the server are not visible in the netdata dashboard - docker

I installed Netdata on my server and connected it to the Netdata dashboard, but I noticed that I cannot see the Docker containers that are currently running on the server. To fix this, I updated the netdata service in my compose.yml as follows:
netdata:
  image: netdata/netdata
  pid: "host" # <-- added
  container_name: netdata
  hostname: example.com # set to fqdn of host
  ports:
    - 19999:19999
  restart: unless-stopped
  cap_add:
    - SYS_PTRACE
  security_opt:
    - apparmor:unconfined
  volumes:
    - netdataconfig:/etc/netdata
    - netdatalib:/var/lib/netdata
    - netdatacache:/var/cache/netdata
    - /etc/passwd:/host/etc/passwd:ro
    - /etc/group:/host/etc/group:ro
    - /proc:/host/proc:ro
    - /sys:/host/sys:ro
    - /etc/os-release:/host/etc/os-release:ro
    - /sys/fs/cgroup:/host/sys/fs/cgroup:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /var/lib/docker/containers:/var/lib/docker/containers:ro
    - /user-service:/host/user-service:ro
and then I restarted the container. However, when I checked the logs of the netdata container, I saw the following error:
2023-01-05 10:08:38: cgroup-name.sh: INFO: docker container
'7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e' is named 'notification-service'
2023-01-05 10:08:38: cgroup-name.sh: INFO: cgroup 'system.slice_docker-7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e.scope' is called 'notification-service'
2023-01-05 10:08:39: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : Using host prefix directory '/host'
2023-01-05 10:08:39: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot open pid_from_cgroup() file '/host/sys/fs/cgroup/system.slice/docker-7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e.scope/tasks'. (errno 2, No such file or directory)
2023-01-05 10:08:39: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : running: exec /usr/libexec/netdata/plugins.d/cgroup-network-helper.sh --cgroup '/host/sys/fs/cgroup/system.slice/docker-7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e.scope'
2023-01-05 10:08:39: cgroup-network-helper.sh: INFO: searching for network interfaces of cgroup '/host/sys/fs/cgroup/system.slice/docker-7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e.scope'
When I open a bash shell inside the netdata container, the path /host/sys/fs/cgroup/system.slice/docker-7d6a282fefd31f1df1b89cc234ba1266278a09f61280f8b3bb707dd79f9fb46e.scope
does exist, but there is no file named tasks inside it.
How can I solve this error and see every container running on the server on the dashboard?
I resolved the error above by adding the pid line, but I still can't see the containers running on the server on the dashboard. The netdata container logs are as follows:
2023-01-05 12:43:35: cgroup-name.sh: INFO: cgroup 'init.scope' is called 'init.scope'
2023-01-05 12:43:35: cgroup-name.sh: INFO: Running API command: curl --unix-socket "/var/run/docker.sock" http://localhost/containers/d1e758ed14c0e33d53f1a6467daebb715db468818fb77e5cb1c194ad6a890928/json
2023-01-05 12:43:35: cgroup-name.sh: INFO: docker container 'd1e758ed14c0e33d53f1a6467daebb715db468818fb77e5cb1c194ad6a890928' is named 'definition-service'
2023-01-05 12:43:35: cgroup-name.sh: INFO: cgroup 'system.slice_docker-d1e758ed14c0e33d53f1a6467daebb715db468818fb77e5cb1c194ad6a890928.scope' is called 'definition-service'
2023-01-05 12:43:36: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : Using host prefix directory '/host'
2023-01-05 12:43:36: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : running: exec /usr/libexec/netdata/plugins.d/cgroup-network-helper.sh --cgroup '/host/sys/fs/cgroup/system.slice/docker-d1e758ed14c0e33d53f1a6467daebb715db468818fb77e5cb1c194ad6a890928.scope'
2023-01-05 12:43:37: cgroup-network-helper.sh: INFO: searching for network interfaces of cgroup '/host/sys/fs/cgroup/system.slice/docker-d1e758ed14c0e33d53f1a6467daebb715db468818fb77e5cb1c194ad6a890928.scope'
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : child pid 1853221 exited with code 1.
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to network namespace of pid 1342633 (errno 1, Operation not permitted)
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to pid namespace of pid 1342633 (errno 1, Operation not permitted)
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to mount namespace of pid 1342633 (errno 1, Operation not permitted)
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : PROCFILE: Cannot open file '/proc/net/dev' (errno 2, No such file or directory)
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot open file '/proc/net/dev'
2023-01-05 12:43:37: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : cannot read cgroup interface list.
2023-01-05 12:43:37: cgroup-name.sh: INFO: Running API command: curl --unix-socket "/var/run/docker.sock" http://localhost/containers/459c9a2f6634443c1a0931d55628c60c6a268c9303052171cf4771ae88a340ad/json
2023-01-05 12:43:37: cgroup-name.sh: INFO: docker container '459c9a2f6634443c1a0931d55628c60c6a268c9303052171cf4771ae88a340ad' is named 'mongo_db'
2023-01-05 12:43:37: cgroup-name.sh: INFO: cgroup 'system.slice_docker-459c9a2f6634443c1a0931d55628c60c6a268c9303052171cf4771ae88a340ad.scope' is called 'mongo_db'
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : Using host prefix directory '/host'
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : running: exec /usr/libexec/netdata/plugins.d/cgroup-network-helper.sh --cgroup '/host/sys/fs/cgroup/system.slice/docker-459c9a2f6634443c1a0931d55628c60c6a268c9303052171cf4771ae88a340ad.scope'
2023-01-05 12:43:38: ACLK STA [30c26177-2ef2-4768-b509-6e58e18e2643 (example.com)]: QUEUED REMOVED ALERTS
2023-01-05 12:43:38: cgroup-network-helper.sh: INFO: searching for network interfaces of cgroup '/host/sys/fs/cgroup/system.slice/docker-459c9a2f6634443c1a0931d55628c60c6a268c9303052171cf4771ae88a340ad.scope'
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : child pid 1853268 exited with code 1.
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to network namespace of pid 1343129 (errno 1, Operation not permitted)
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to pid namespace of pid 1343129 (errno 1, Operation not permitted)
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to mount namespace of pid 1343129 (errno 1, Operation not permitted)
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : PROCFILE: Cannot open file '/proc/net/dev' (errno 2, No such file or directory)
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot open file '/proc/net/dev'
2023-01-05 12:43:38: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : cannot read cgroup interface list.
2023-01-05 12:43:39: cgroup-name.sh: INFO: Running API command: curl --unix-socket "/var/run/docker.sock" http://localhost/containers/ee448a793745d2a72ed86fbcb26baa7de16f7103a816910684ad457b6557e179/json
2023-01-05 12:43:39: cgroup-name.sh: INFO: docker container 'ee448a793745d2a72ed86fbcb26baa7de16f7103a816910684ad457b6557e179' is named 'mail-service'
2023-01-05 12:43:39: cgroup-name.sh: INFO: cgroup 'system.slice_docker-ee448a793745d2a72ed86fbcb26baa7de16f7103a816910684ad457b6557e179.scope' is called 'mail-service'
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : Using host prefix directory '/host'
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network INFO : MAIN : running: exec /usr/libexec/netdata/plugins.d/cgroup-network-helper.sh --cgroup '/host/sys/fs/cgroup/system.slice/docker-ee448a793745d2a72ed86fbcb26baa7de16f7103a816910684ad457b6557e179.scope'
2023-01-05 12:43:40: cgroup-network-helper.sh: INFO: searching for network interfaces of cgroup '/host/sys/fs/cgroup/system.slice/docker-ee448a793745d2a72ed86fbcb26baa7de16f7103a816910684ad457b6557e179.scope'
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : child pid 1853317 exited with code 1.
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to network namespace of pid 1499298 (errno 1, Operation not permitted)
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to pid namespace of pid 1499298 (errno 1, Operation not permitted)
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot switch to mount namespace of pid 1499298 (errno 1, Operation not permitted)
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : PROCFILE: Cannot open file '/proc/net/dev' (errno 2, No such file or directory)
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : Cannot open file '/proc/net/dev'
2023-01-05 12:43:40: /usr/libexec/netdata/plugins.d/cgroup-network ERROR : MAIN : cannot read cgroup interface list.
2023-01-05 12:43:41: cgroup-name.sh: INFO: Running API command: curl --unix-socket "/var/run/docker.sock" http://localhost/containers/bcb761b2cd1e89d530cc8973ed21a894e21bc50d0d7138eb8605705ba4ea7b32/json
2023-01-05 12:43:41: cgroup-name.sh: INFO: docker container 'bcb761b2cd1e89d530cc8973ed21a894e21bc50d0d7138eb8605705ba4ea7b32' is named 'netdata'
2023-01-05 12:43:41: cgroup-name.sh: INFO: cgroup 'system.slice_docker-bcb761b2cd1e89d530cc8973ed21a894e21bc50d0d7138eb8605705ba4ea7b32.scope' is called 'netdata'

I saw this similar issue: https://github.com/netdata/netdata/issues/11069, and the fix seemed to be adding pid: "host"; check this comment:
https://github.com/netdata/netdata/issues/11069#issuecomment-952228202
Can you try it and see if it works?
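For reference, here is a minimal docker run sketch of the same fix, assuming the image and host mounts already listed in the compose file above (only a subset of the volumes is repeated); the key part is --pid=host, which is what pid: "host" maps to in compose:
docker run -d --name netdata \
  --pid=host \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  -p 19999:19999 \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /sys/fs/cgroup:/host/sys/fs/cgroup:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  netdata/netdata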

Related

Can't open website after adding Security functionality

I'm using the latest version of Vaadin.
I upgraded my three-month-old project to the latest version.
I also implemented the default login system generated by the Vaadin Start generator.
But now, when the frontend gets built, the following error occurs:
Vaadin is running in DEVELOPMENT mode - do not use for production deployments.
2023-01-05 19:40:37.742 INFO 51916 --- [ restartedMain] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path ''
2023-01-05 19:40:37.759 INFO 51916 --- [ restartedMain] de.admin.commgr.Application : Started Application in 5.716 seconds (JVM running for 6.395)
2023-01-05 19:40:47.100 INFO 51916 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring DispatcherServlet 'dispatcherServlet'
2023-01-05 19:40:47.100 INFO 51916 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : Initializing Servlet 'dispatcherServlet'
2023-01-05 19:40:47.102 INFO 51916 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : Completed initialization in 2 ms
2023-01-05 19:40:47.168 INFO 51916 --- [nio-8080-exec-2] c.vaadin.flow.spring.SpringInstantiator : The number of beans implementing 'I18NProvider' is 0. Cannot use Spring beans for I18N, falling back to the default behavior
npm WARN deprecated stable@0.1.8: Modern JS already guarantees Array#sort() is a stable sort, so this library is deprecated. See the compatibility table on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#browser_compatibility
npm WARN deprecated rollup-plugin-terser@7.0.2: This package has been deprecated and is no longer maintained. Please use @rollup/plugin-terser
npm WARN deprecated sourcemap-codec@1.4.8: Please use @jridgewell/sourcemap-codec instead
npm WARN deprecated svgo@1.3.2: This SVGO version is no longer supported. Upgrade to v2.x.x.
2023-01-05 19:42:40.395 INFO 51916 --- [onPool-worker-1] c.v.f.s.frontend.TaskUpdatePackages : Frontend dependencies resolved successfully.
2023-01-05 19:42:42.484 INFO 51916 --- [onPool-worker-1] c.v.f.s.frontend.TaskCopyFrontendFiles : Copying frontend resources from jar files ...
2023-01-05 19:42:42.512 INFO 51916 --- [onPool-worker-1] c.v.f.s.frontend.TaskCopyFrontendFiles : Visited 23 resources. Took 27 ms.
2023-01-05 19:42:42.544 INFO 51916 --- [onPool-worker-1] c.v.f.server.frontend.TaskUpdateImports :
Failed to find the following imports in the `node_modules` tree:
- @vaadin/flow-frontend/comboBoxConnector.js
- @vaadin/flow-frontend/contextMenuTargetConnector.js
- @vaadin/flow-frontend/messageListConnector.js
- @vaadin/flow-frontend/ironListStyles.js
- @vaadin/flow-frontend/notificationConnector.js
- @vaadin/flow-frontend/selectConnector.js
- @vaadin/flow-frontend/lit-renderer.ts
- @vaadin/flow-frontend/lumo-includes.ts
- @vaadin/flow-frontend/vaadin-big-decimal-field.js
- @vaadin/flow-frontend/contextMenuConnector.js
- @vaadin/flow-frontend/dndConnector-es6.js
- @vaadin/flow-frontend/menubarConnector.js
- @vaadin/flow-frontend/vaadin-grid-flow-selection-column.js
- @vaadin/flow-frontend/virtualListConnector.js
- @vaadin/flow-frontend/cookieConsentConnector.js
- @vaadin/flow-frontend/datepickerConnector.js
- @vaadin/flow-frontend/vaadin-map/mapConnector.js
- @vaadin/flow-frontend/ironListConnector.js
- @vaadin/flow-frontend/vaadin-time-picker/timepickerConnector.js
- @vaadin/flow-frontend/loginOverlayConnector.js
- @vaadin/flow-frontend/dialogConnector.js
- @vaadin/flow-frontend/gridProConnector.js
- @vaadin/flow-frontend/gridConnector.js
- @vaadin/flow-frontend/flow-component-renderer.js
- @vaadin/flow-frontend/confirmDialogConnector.js
If the build fails, check that npm packages are installed.
To fix the build remove `package-lock.json` and `node_modules` directory to reset modules.
In addition you may run `npm install` to fix `node_modules` tree structure.
2023-01-05 19:42:42.546 INFO 51916 --- [onPool-worker-1] c.v.b.devserver.AbstractDevServerRunner : Starting Vite
------------------ Starting Frontend compilation. ------------------
2023-01-05 19:42:44.816 INFO 51916 --- [onPool-worker-1] c.v.b.devserver.AbstractDevServerRunner : Running Vite to compile frontend resources. This may take a moment, please stand by...
2023-01-05 19:42:46.071 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : Searching themes folder '/Users/admin/Documents/Development/Projects/test-manager/frontend/themes' for theme 'testmanager'
2023-01-05 19:42:46.072 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : no assets to handle no static assets were copied
2023-01-05 19:42:46.075 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : Found theme files from '/Users/admin/Documents/Development/Projects/test-manager/frontend/themes'
2023-01-05 19:42:46.718 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker :
2023-01-05 19:42:46.719 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : VITE v3.1.0 ready in 1832 ms
----------------- Frontend compiled successfully. -----------------
2023-01-05 19:42:46.719 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker :
2023-01-05 19:42:46.719 INFO 51916 --- [onPool-worker-1] c.v.b.devserver.AbstractDevServerRunner : Started Vite. Time: 4173ms
2023-01-05 19:42:46.719 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : ➜ Local: http://127.0.0.1:50554/VAADIN/
2023-01-05 19:42:48.056 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker :
2023-01-05 19:42:48.057 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : ERROR(TypeScript) Cannot find module '@vaadin/flow-frontend/Flow' or its corresponding type declarations.
2023-01-05 19:42:48.057 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : FILE /Users/admin/Documents/Development/Projects/test-manager/frontend/generated/index.ts:17:22
2023-01-05 19:42:48.057 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker :
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 15 |
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 16 | // import Flow module to enable navigation to Vaadin server-side views
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : > 17 | import { Flow } from '@vaadin/flow-frontend/Flow';
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 18 |
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 19 | const { serverSideRoutes } = new Flow({
2023-01-05 19:42:48.058 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 20 | imports: () => import('../../target/frontend/generated-flow-imports')
2023-01-05 19:42:48.059 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker :
2023-01-05 19:42:48.059 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : [TypeScript] Found 1 error. Watching for file changes.
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : [vite] Internal server error: Failed to resolve import "@vaadin/flow-frontend/vaadin-dev-tools.js" from "frontend/generated/vaadin.ts". Does the file exist?
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : Plugin: vite:import-analysis
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : File: /Users/admin/Documents/Development/Projects/test-manager/frontend/generated/vaadin.ts
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 1 | import "./vaadin-featureflags.ts";
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 2 | import "./index";
2023-01-05 19:42:51.098 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 3 | import "@vaadin/flow-frontend/vaadin-dev-tools.js";
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : | ^
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 4 | import { applyTheme } from "./theme";
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : 5 | applyTheme(document);
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at formatError (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:40782:46)
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at TransformContext.error (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:40778:19)
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at normalizeUrl (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:37514:33)
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2023-01-05 19:42:51.099 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at async TransformContext.transform (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:37648:47)
2023-01-05 19:42:51.100 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at async Object.transform (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:41031:30)
2023-01-05 19:42:51.100 INFO 51916 --- [v-server-output] c.v.b.devserver.DevServerOutputTracker : at async loadAndTransform (file:///Users/admin/Documents/Development/Projects/test-manager/node_modules/vite/dist/node/chunks/dep-665b0112.js:37292:29)
These are my POM Versions:
<properties>
    <java.version>11</java.version>
    <vaadin.version>23.3.2</vaadin.version>
    <selenium.version>4.5.3</selenium.version>
</properties>
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.7.5</version>
</parent>
I have already tried disabling my VPN, disabling Pi-hole while downloading the npm modules, cleaning up the frontend, and so on.
Are you upgrading from a previous Vaadin 23 version?
Try running this Maven goal and then run your project again: mvn clean vaadin:clean-frontend
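If that alone does not help, here is a minimal cleanup sketch based on the hints in the build output above; mvn spring-boot:run is only an assumption for however you normally start the project:
rm -rf node_modules package-lock.json   # reset the frontend module tree, as the build output suggests
mvn clean vaadin:clean-frontend         # the Maven goal suggested above
mvn spring-boot:run                     # start the project again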

[HTCONDOR][kubernetes / k8s] : Unable to start minicondor image within k8s - condor_master not working

POST EDIT
The issue is due to the PSP (Pod Security Policy): by default, privilege escalation is not permitted for my condor user. That is why it is not working: supervisord runs as root and tries to write logs and start the condor collector as root rather than as another user (i.e. condor).
Description
The mini-condor base image is not starting as expected in a Kubernetes (Rancher) pod.
I am using this image: https://hub.docker.com/r/htcondor/mini in a custom namespace in Rancher (k8s).
PS: the image was working perfectly on:
a local environment
a default minikube installation
I am running it as a simple deployment.
When the pod starts, the default Kubernetes log shows:
2021-09-15 09:26:36,908 INFO supervisord started with pid 1
2021-09-15 09:26:37,911 INFO spawned: 'condor_master' with pid 20
2021-09-15 09:26:37,912 INFO spawned: 'condor_restd' with pid 21
2021-09-15 09:26:37,917 INFO exited: condor_restd (exit status 127; not expected)
2021-09-15 09:26:37,924 INFO exited: condor_master (exit status 4; not expected)
2021-09-15 09:26:38,926 INFO spawned: 'condor_master' with pid 22
2021-09-15 09:26:38,928 INFO spawned: 'condor_restd' with pid 23
2021-09-15 09:26:38,932 INFO exited: condor_restd (exit status 127; not expected)
2021-09-15 09:26:38,936 INFO exited: condor_master (exit status 4; not expected)
2021-09-15 09:26:40,939 INFO spawned: 'condor_master' with pid 24
2021-09-15 09:26:40,943 INFO spawned: 'condor_restd' with pid 25
2021-09-15 09:26:40,947 INFO exited: condor_restd (exit status 127; not expected)
2021-09-15 09:26:40,948 INFO exited: condor_master (exit status 4; not expected)
2021-09-15 09:26:43,953 INFO spawned: 'condor_master' with pid 26
2021-09-15 09:26:43,955 INFO spawned: 'condor_restd' with pid 27
2021-09-15 09:26:43,959 INFO exited: condor_restd (exit status 127; not expected)
2021-09-15 09:26:43,968 INFO gave up: condor_restd entered FATAL state, too many start retries too quickly
2021-09-15 09:26:43,969 INFO exited: condor_master (exit status 4; not expected)
2021-09-15 09:26:44,970 INFO gave up: condor_master entered FATAL state, too many start retries too quickly
Here is a brief summary of each command (CMD) and its output:
condor_status
    CEDAR:6001:Failed to connect to <127.0.0.1:9618>
condor_master
    ERROR "Cannot open log file '/var/log/condor/MasterLog'" at line 174 in file /var/lib/condor/execute/slot1/dir_17406/userdir/.tmpruBd6F/BUILD/condor-9.0.5/src/condor_utils/dprintf_setup.cpp
1) First attempt to fix the issue
I decided to customize the image, but the error stays the same.
The Docker image used to try to fix the permission issue:
Image :
FROM htcondor/mini:9.2-el7
RUN condor_master
RUN chown condor:root /var/
RUN chown condor:root /var/log
RUN chown -R condor:root /var/log/
RUN chown -R condor:condor /var/log/condor
RUN chown condor:condor /var/log/condor/ProcLog
RUN chown condor:condor /var/log/condor/MasterLog
RUN chmod 775 -R /var/
Kubernetes - rancher
yaml file :
apiVersion: apps/v1
kind: Deployment
metadata:
  name: htcondor-mini--all-in-one
  namespace: grafana-exporter
spec:
  containers:
    - image: <custom_image>
      imagePullPolicy: Always
      name: htcondor-mini--all-in-one
      resources: {}
      securityContext:
        capabilities: {}
      stdin: true
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      tty: true
  dnsConfig: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  terminationGracePeriodSeconds: 30
Here is a brief summary of each command (CMD) and its output:
condor_status
    CEDAR:6001:Failed to connect to <127.0.0.1:9618>
condor_master
    ERROR "Cannot open log file '/var/log/condor/MasterLog'" at line 174 in file /var/lib/condor/execute/slot1/dir_17406/userdir/.tmpruBd6F/BUILD/condor-9.0.5/src/condor_utils/dprintf_setup.cpp
ls -ld /var/
    drwxrwxr-x 1 condor root 17 Nov 13 2020 /var/
ls -ld /var/log/
    drwxrwxr-x 1 condor root 65 Oct 7 11:54 /var/log/
ls -ld /var/log/condor
    drwxrwxr-x 1 condor condor 240 Oct 7 11:23 /var/log/condor
ls -ld /var/log/condor/MasterLog
    -rwxrwxr-x 1 condor condor 3243 Oct 7 11:23 /var/log/condor/MasterLog
MasterLog content :
10/07/21 11:23:21 ******************************************************
10/07/21 11:23:21 ** condor_master (CONDOR_MASTER) STARTING UP
10/07/21 11:23:21 ** /usr/sbin/condor_master
10/07/21 11:23:21 ** SubsystemInfo: name=MASTER type=MASTER(2) class=DAEMON(1)
10/07/21 11:23:21 ** Configuration: subsystem:MASTER local:<NONE> class:DAEMON
10/07/21 11:23:21 ** $CondorVersion: 9.2.0 Sep 23 2021 BuildID: 557262 PackageID: 9.2.0-1 $
10/07/21 11:23:21 ** $CondorPlatform: x86_64_CentOS7 $
10/07/21 11:23:21 ** PID = 7
10/07/21 11:23:21 ** Log last touched time unavailable (No such file or directory)
10/07/21 11:23:21 ******************************************************
10/07/21 11:23:21 Using config source: /etc/condor/condor_config
10/07/21 11:23:21 Using local config sources:
10/07/21 11:23:21 /etc/condor/config.d/00-htcondor-9.0.config
10/07/21 11:23:21 /etc/condor/config.d/00-minicondor
10/07/21 11:23:21 /etc/condor/config.d/01-misc.conf
10/07/21 11:23:21 /etc/condor/condor_config.local
10/07/21 11:23:21 config Macros = 73, Sorted = 73, StringBytes = 1848, TablesBytes = 2692
10/07/21 11:23:21 CLASSAD_CACHING is OFF
10/07/21 11:23:21 Daemon Log is logging: D_ALWAYS D_ERROR
10/07/21 11:23:21 SharedPortEndpoint: waiting for connections to named socket master_7_43af
10/07/21 11:23:21 SharedPortEndpoint: failed to open /var/lock/condor/shared_port_ad: No such file or directory
10/07/21 11:23:21 SharedPortEndpoint: did not successfully find SharedPortServer address. Will retry in 60s.
10/07/21 11:23:21 Permission denied error during DISCARD_SESSION_KEYRING_ON_STARTUP, continuing anyway
10/07/21 11:23:21 Adding SHARED_PORT to DAEMON_LIST, because USE_SHARED_PORT=true (to disable this, set AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST=False)
10/07/21 11:23:21 SHARED_PORT is in front of a COLLECTOR, so it will use the configured collector port
10/07/21 11:23:21 Master restart (GRACEFUL) is watching /usr/sbin/condor_master (mtime:1632433213)
10/07/21 11:23:21 Cannot remove wait-for-startup file /var/lock/condor/shared_port_ad
10/07/21 11:23:21 WARNING: forward resolution of ip6-localhost doesn't match 127.0.0.1!
10/07/21 11:23:21 WARNING: forward resolution of ip6-loopback doesn't match 127.0.0.1!
10/07/21 11:23:22 Started DaemonCore process "/usr/libexec/condor/condor_shared_port", pid and pgroup = 9
10/07/21 11:23:22 Waiting for /var/lock/condor/shared_port_ad to appear.
10/07/21 11:23:22 Found /var/lock/condor/shared_port_ad.
10/07/21 11:23:22 Cannot remove wait-for-startup file /var/log/condor/.collector_address
10/07/21 11:23:23 Started DaemonCore process "/usr/sbin/condor_collector", pid and pgroup = 10
10/07/21 11:23:23 Waiting for /var/log/condor/.collector_address to appear.
10/07/21 11:23:23 Found /var/log/condor/.collector_address.
10/07/21 11:23:23 Started DaemonCore process "/usr/sbin/condor_negotiator", pid and pgroup = 11
10/07/21 11:23:23 Started DaemonCore process "/usr/sbin/condor_schedd", pid and pgroup = 12
10/07/21 11:23:24 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 15
10/07/21 11:23:24 Daemons::StartAllDaemons all daemons were started
A huge thanks for reading. Hope it will help many other people.
Cause of the issue
The issue is due to the PSP (Pod Security Policy): by default, privilege escalation is not permitted for my condor user.
SOLUTION
The best solution I have found so far is to run everything as the condor user and give that user the necessary permissions. To do so you need to:
In supervisord.conf: run supervisord as the condor user
In supervisord.conf: put the log and socket files in /tmp
In the Dockerfile: change the owner of most folders to condor
In deployment.yaml: set the user ID to 64 (the condor user)
Dockerfile
FROM htcondor/mini:9.2-el7
# SET WORKDIR
WORKDIR /home/condor/
RUN chown condor:condor /home/condor
# COPY SUPERVISOR
COPY supervisord.conf /etc/supervisord.conf
# Need to run the cmd to create all dir
RUN condor_master
# FIX PERMISSION ISSUES FOR RANCHER
RUN chown -R condor:condor /var/log/ /tmp &&\
chown -R restd:restd /home/restd &&\
chmod 755 -R /home/restd
supervisord.conf:
[supervisord]
user=condor
nodaemon=true
logfile = /tmp/supervisord.log
directory = /tmp
pidfile = /tmp/supervisord.pid
childlogdir = /tmp
# the next 3 sections are needed to manage the daemons with supervisorctl
[unix_http_server]
file=/tmp/supervisord.sock
chown=condor:condor
chmod=0777
user=condor
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisord.sock
[program:condor_master]
user=condor
command=/usr/sbin/condor_master -f
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile = /var/log/condor_master.log
stderr_logfile = /var/log/condor_master.error.log
deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  containers:
    - image: <condor-image>
      imagePullPolicy: Always
      name: htcondor-exporter
      ports:
        - containerPort: 8080
          name: myport
          protocol: TCP
      resources: {}
      securityContext:
        capabilities: {}
        runAsNonRoot: false
        runAsUser: 64
      stdin: true
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      tty: true
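Once the deployment is applied, a rough sanity check that the pod really runs as the condor user (UID 64); the deployment name and namespace below are the ones from the first YAML above and may differ in your setup:
kubectl -n grafana-exporter exec -it deploy/htcondor-mini--all-in-one -- id             # should report uid=64 (condor)
kubectl -n grafana-exporter exec -it deploy/htcondor-mini--all-in-one -- condor_status  # should now reach the collector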

Management page won't load when using RabbitMQ docker container

I'm running RabbitMQ locally using:
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-management
Some log:
narley@brittes ~ $ docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-management
2020-01-08 22:31:52.079 [info] <0.8.0> Feature flags: list of feature flags found:
2020-01-08 22:31:52.079 [info] <0.8.0> Feature flags: [ ] drop_unroutable_metric
2020-01-08 22:31:52.079 [info] <0.8.0> Feature flags: [ ] empty_basic_get_metric
2020-01-08 22:31:52.079 [info] <0.8.0> Feature flags: [ ] implicit_default_bindings
2020-01-08 22:31:52.080 [info] <0.8.0> Feature flags: [ ] quorum_queue
2020-01-08 22:31:52.080 [info] <0.8.0> Feature flags: [ ] virtual_host_metadata
2020-01-08 22:31:52.080 [info] <0.8.0> Feature flags: feature flag states written to disk: yes
2020-01-08 22:31:52.160 [info] <0.268.0> ra: meta data store initialised. 0 record(s) recovered
2020-01-08 22:31:52.162 [info] <0.273.0> WAL: recovering []
2020-01-08 22:31:52.164 [info] <0.277.0>
Starting RabbitMQ 3.8.2 on Erlang 22.2.1
Copyright (c) 2007-2019 Pivotal Software, Inc.
Licensed under the MPL 1.1. Website: https://rabbitmq.com
## ## RabbitMQ 3.8.2
## ##
########## Copyright (c) 2007-2019 Pivotal Software, Inc.
###### ##
########## Licensed under the MPL 1.1. Website: https://rabbitmq.com
Doc guides: https://rabbitmq.com/documentation.html
Support: https://rabbitmq.com/contact.html
Tutorials: https://rabbitmq.com/getstarted.html
Monitoring: https://rabbitmq.com/monitoring.html
Logs: <stdout>
Config file(s): /etc/rabbitmq/rabbitmq.conf
Starting broker...2020-01-08 22:31:52.166 [info] <0.277.0>
node : rabbit@1586b4698736
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash : bwlnCFiUchzEkgAOsZwQ1w==
log(s) : <stdout>
database dir : /var/lib/rabbitmq/mnesia/rabbit@1586b4698736
2020-01-08 22:31:52.210 [info] <0.277.0> Running boot step pre_boot defined by app rabbit
...
...
...
2020-01-08 22:31:53.817 [info] <0.277.0> Setting up a table for connection tracking on this node: tracked_connection_on_node_rabbit@1586b4698736
2020-01-08 22:31:53.827 [info] <0.277.0> Setting up a table for per-vhost connection counting on this node: tracked_connection_per_vhost_on_node_rabbit@1586b4698736
2020-01-08 22:31:53.828 [info] <0.277.0> Running boot step routing_ready defined by app rabbit
2020-01-08 22:31:53.828 [info] <0.277.0> Running boot step pre_flight defined by app rabbit
2020-01-08 22:31:53.828 [info] <0.277.0> Running boot step notify_cluster defined by app rabbit
2020-01-08 22:31:53.829 [info] <0.277.0> Running boot step networking defined by app rabbit
2020-01-08 22:31:53.833 [info] <0.624.0> started TCP listener on [::]:5672
2020-01-08 22:31:53.833 [info] <0.277.0> Running boot step cluster_name defined by app rabbit
2020-01-08 22:31:53.833 [info] <0.277.0> Running boot step direct_client defined by app rabbit
2020-01-08 22:31:53.922 [info] <0.674.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2020-01-08 22:31:53.922 [info] <0.780.0> Statistics database started.
2020-01-08 22:31:53.923 [info] <0.779.0> Starting worker pool 'management_worker_pool' with 3 processes in it
completed with 3 plugins.
2020-01-08 22:31:54.316 [info] <0.8.0> Server startup complete; 3 plugins started.
* rabbitmq_management
* rabbitmq_management_agent
* rabbitmq_web_dispatch
Then I go to http://localhost:15672 and the page doesn't load. No error is displayed.
The interesting thing is that it worked the last time I used it (about 3 weeks ago).
Can anyone give me some help?
Cheers!
Have a try:
Step 1, go into the docker container:
docker exec -it rabbitmq bash
Step 2, run this inside the docker container:
rabbitmq-plugins enable rabbitmq_management
It works for me.
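If the plugin is enabled but the page still does not load, a rough check from the host that the management listener answers at all (the default guest/guest account only works from localhost, and is an assumption if you have changed the credentials):
curl -i http://localhost:15672/                          # should return the management UI page
curl -u guest:guest http://localhost:15672/api/overview  # should return JSON from the management API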
I got it working by simply upgrading Docker.
I was running Docker 18.09.7 and upgraded to 19.03.5.
In my case, clearing the cookies fixed this issue instantly.

How to run cassandra docker container from nomad?

I want to run a cassandra container from a nomad job. It seems to start, but after a few seconds it dies (it seems to be killed by nomad itself).
If I run the container from the command line, with:
docker run --name some-cassandra -p 9042:9042 -d cassandra:3.0
the container starts flawlessly. But if I create a nomad job like this:
job "cassandra" {
datacenters = ["dc1"]
type = "service"
update {
max_parallel = 1
min_healthy_time = "10s"
healthy_deadline = "5m"
progress_deadline = "10m"
auto_revert = false
canary = 0
}
migrate {
max_parallel = 1
health_check = "checks"
min_healthy_time = "10s"
healthy_deadline = "5m"
}
group "cassandra" {
restart {
attempts = 2
interval = "240s"
delay = "120s"
mode = "delay"
}
task "cassandra" {
driver = "docker"
config {
image = "cassandra:3.0"
network_mode = "bridge"
port_map {
cql = 9042
}
}
resources {
memory = 2048
cpu = 800
network {
port "cql" {}
}
}
env {
CASSANDRA_LISTEN_ADDRESS = "${NOMAD_IP_cql}"
}
service {
name = "cassandra"
tags = ["global", "cassandra"]
port = "cql"
}
}
}
}
Then it never starts. Nomad's web interface shows nothing in the stdout logs of the created allocation, and the stderr stream only shows Killed.
I know that while this is happening, Docker containers are created and then removed after a few seconds. I cannot read the logs of these containers, because when I try (with docker logs <container_id>), all I get is:
Error response from daemon: configured logging driver does not support reading
And the allocation overview shows this message:
12/06/18 14:16:04 Terminated Exit Code: 137, Exit Message: "Docker container exited with non-zero exit code: 137"
According to docker:
If there is no database initialized when the container starts, then a
default database will be created. While this is the expected behavior,
this means that it will not accept incoming connections until such
initialization completes. This may cause issues when using automation
tools, such as docker-compose, which start several containers
simultaneously.
But I doubt this is the source of my problem, because I've increased the restart stanza values with no effect, and because the task fails after just a few seconds.
Not long ago I experienced a somewhat similar problem with a Kafka container that, as it turned out, was not happy because it wanted more memory. But in this case I've provided generous values for memory and CPU in the resources stanza, and it doesn't seem to make any difference.
My host OS is Arch Linux, with kernel 4.19.4-arch1-1-ARCH. I'm running consul as a systemd service, and the nomad agent with this command line:
sudo nomad agent -dev
What can I possibly be missing? Any help and/or pointers will be appreciated.
Update (2018-12-06 16:26 GMT): by reading the output of the nomad agent in detail, I found that some valuable information can be read in the host's /tmp directory. A snippet of that output:
2018/12/06 16:03:03 [DEBUG] memberlist: TCP connection from=127.0.0.1:45792
2018/12/06 16:03:03.180586 [DEBUG] driver.docker: docker pull cassandra:latest succeeded
2018-12-06T16:03:03.184Z [DEBUG] plugin: starting plugin: path=/usr/bin/nomad args="[/usr/bin/nomad executor {"LogFile":"/tmp/NomadClient073551030/1c315bf2-688c-2c7b-8d6f-f71fec1254f3/cassandra/executor.out","LogLevel":"DEBUG"}]"
2018-12-06T16:03:03.185Z [DEBUG] plugin: waiting for RPC address: path=/usr/bin/nomad
2018-12-06T16:03:03.235Z [DEBUG] plugin.nomad: plugin address: timestamp=2018-12-06T16:03:03.235Z address=/tmp/plugin681788273 network=unix
2018/12/06 16:03:03.253166 [DEBUG] driver.docker: Setting default logging options to syslog and unix:///tmp/plugin559865372
2018/12/06 16:03:03.253196 [DEBUG] driver.docker: Using config for logging: {Type:syslog ConfigRaw:[] Config:map[syslog-address:unix:///tmp/plugin559865372]}
2018/12/06 16:03:03.253206 [DEBUG] driver.docker: using 2147483648 bytes memory for cassandra
2018/12/06 16:03:03.253217 [DEBUG] driver.docker: using 800 cpu shares for cassandra
2018/12/06 16:03:03.253237 [DEBUG] driver.docker: binding directories []string{"/tmp/NomadClient073551030/1c315bf2-688c-2c7b-8d6f-f71fec1254f3/alloc:/alloc", "/tmp/NomadClient073551030/1c315bf2-688c-2c7b-8d6f-f71fec1254f3/cassandra/local:/local", "/tmp/NomadClient073551030/1c315bf2-688c-2c7b-8d6f-f71fec1254f3/cassandra/secrets:/secrets"} for cassandra
2018/12/06 16:03:03.253282 [DEBUG] driver.docker: allocated port 127.0.0.1:29073 -> 9042 (mapped)
2018/12/06 16:03:03.253296 [DEBUG] driver.docker: exposed port 9042
2018/12/06 16:03:03.253320 [DEBUG] driver.docker: setting container name to: cassandra-1c315bf2-688c-2c7b-8d6f-f71fec1254f3
2018/12/06 16:03:03.361162 [INFO] driver.docker: created container 29b0764bd2de69bda6450ebb1a55ffd2cbb4dc3002f961cb5db71b323d611199
2018/12/06 16:03:03.754476 [INFO] driver.docker: started container 29b0764bd2de69bda6450ebb1a55ffd2cbb4dc3002f961cb5db71b323d611199
2018/12/06 16:03:03.757642 [DEBUG] consul.sync: registered 1 services, 0 checks; deregistered 0 services, 0 checks
2018/12/06 16:03:03.765001 [DEBUG] client: error fetching stats of task cassandra: stats collection hasn't started yet
2018/12/06 16:03:03.894514 [DEBUG] client: updated allocations at index 371 (total 2) (pulled 0) (filtered 2)
2018/12/06 16:03:03.894584 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2018/12/06 16:03:05.190647 [DEBUG] driver.docker: error collecting stats from container 29b0764bd2de69bda6450ebb1a55ffd2cbb4dc3002f961cb5db71b323d611199: io: read/write on closed pipe
2018-12-06T16:03:09.191Z [DEBUG] plugin.nomad: 2018/12/06 16:03:09 [ERR] plugin: plugin server: accept unix /tmp/plugin681788273: use of closed network connection
2018-12-06T16:03:09.194Z [DEBUG] plugin: plugin process exited: path=/usr/bin/nomad
2018/12/06 16:03:09.223734 [INFO] client: task "cassandra" for alloc "1c315bf2-688c-2c7b-8d6f-f71fec1254f3" failed: Wait returned exit code 137, signal 0, and error Docker container exited with non-zero exit code: 137
2018/12/06 16:03:09.223802 [INFO] client: Restarting task "cassandra" for alloc "1c315bf2-688c-2c7b-8d6f-f71fec1254f3" in 2m7.683274502s
2018/12/06 16:03:09.230053 [DEBUG] consul.sync: registered 0 services, 0 checks; deregistered 1 services, 0 checks
2018/12/06 16:03:09.233507 [DEBUG] consul.sync: registered 0 services, 0 checks; deregistered 0 services, 0 checks
2018/12/06 16:03:09.296185 [DEBUG] client: updated allocations at index 372 (total 2) (pulled 0) (filtered 2)
2018/12/06 16:03:09.296313 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
2018/12/06 16:03:11.541901 [DEBUG] http: Request GET /v1/agent/health?type=client (452.678µs)
But the contents of /tmp/NomadClient.../<alloc_id>/... are deceptively simple:
[root@singularity 1c315bf2-688c-2c7b-8d6f-f71fec1254f3]# ls -lR
.:
total 0
drwxrwxrwx 5 nobody nobody 100 Dec 6 15:52 alloc
drwxrwxrwx 5 nobody nobody 120 Dec 6 15:53 cassandra
./alloc:
total 0
drwxrwxrwx 2 nobody nobody 40 Dec 6 15:52 data
drwxrwxrwx 2 nobody nobody 80 Dec 6 15:53 logs
drwxrwxrwx 2 nobody nobody 40 Dec 6 15:52 tmp
./alloc/data:
total 0
./alloc/logs:
total 0
-rw-r--r-- 1 root root 0 Dec 6 15:53 cassandra.stderr.0
-rw-r--r-- 1 root root 0 Dec 6 15:53 cassandra.stdout.0
./alloc/tmp:
total 0
./cassandra:
total 4
-rw-r--r-- 1 root root 1248 Dec 6 16:19 executor.out
drwxrwxrwx 2 nobody nobody 40 Dec 6 15:52 local
drwxrwxrwx 2 nobody nobody 60 Dec 6 15:52 secrets
drwxrwxrwt 2 nobody nobody 40 Dec 6 15:52 tmp
./cassandra/local:
total 0
./cassandra/secrets:
total 0
./cassandra/tmp:
total 0
Both cassandra.stdout.0 and cassandra.stderr.0 are empty, and the full content of the executor.out file is:
2018/12/06 15:53:22.822072 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin278120866
2018/12/06 15:55:53.009611 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin242312234
2018/12/06 15:58:29.135309 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin226242288
2018/12/06 16:00:53.942271 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin373025133
2018/12/06 16:03:03.252389 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin559865372
2018/12/06 16:05:19.656317 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin090082811
2018/12/06 16:07:28.468809 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin383954837
2018/12/06 16:09:54.068604 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin412544225
2018/12/06 16:12:10.085157 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin279043152
2018/12/06 16:14:48.255653 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin209533710
2018/12/06 16:17:23.735550 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin168184243
2018/12/06 16:19:40.232181 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin839254781
2018/12/06 16:22:13.485457 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin406142133
2018/12/06 16:24:24.869274 [DEBUG] syslog-server: launching syslog server on addr: /tmp/plugin964077792
Update (2018-12-06 16:40 GMT): since it's apparent that logging to syslog is desirable for the agent, I've set up and launched a local syslog server, to no avail. The syslog server receives no messages whatsoever.
Problem solved. Its nature is twofold:
Nomad's docker driver (very efficiently) encapsulates the behaviour of the containers, making them at times very silent.
Cassandra is far more demanding of resources than I originally thought. I was convinced that 4 GB of RAM would be enough for it to run comfortably, but as it turns out it needs (at least in my environment) 6 GB.
Disclaimer: I'm now actually using bitnami/cassandra instead of cassandra, because I believe their images are of very high quality, secure, and configurable by means of environment variables. I made this discovery using Bitnami's image, and I haven't tested how the original one reacts to having this amount of memory.
As to why it doesn't fail when running the container directly from Docker's CLI, I think that's because no limits are specified when running it that way. Docker simply takes as much memory as it needs for its containers, so if the host's memory eventually becomes insufficient for all of them, the realisation comes much later (and possibly painfully). So this early failure should be a welcome benefit of an orchestration platform such as Nomad. If I have any complaint, it is that finding the problem took so long because of the container's lack of visibility!
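For anyone hitting the same exit code 137, a rough way to confirm it really is an out-of-memory kill; the container ID and allocation ID below are placeholders taken from the Nomad agent output, and docker inspect only works while the dead container has not yet been removed:
docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <container_id>  # "true 137" points at the OOM killer
dmesg | grep -i 'killed process'                                                   # kernel-side evidence of the OOM kill
nomad alloc status <alloc_id>                                                      # Nomad's view of the failed allocation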

GitLab CE docker container keeps crashing at startup

I'm trying to run GitLab CE edition as a docker container (gitlab/gitlab-ce) via docker compose, following the instructions at http://doc.gitlab.com/omnibus/docker.
The problem is that every time I start with docker-compose up -d, the container crashes/exits after about a minute. I collected all the information that could be useful; there are some Chef-related error messages that I'm not able to decipher. The environment runs inside an Ubuntu Vagrant virtual machine.
I tried using a different tagged version of the image instead of :latest, but got similar results.
docker-compose.yml relevant snippet:
gitlab:
  image: gitlab/gitlab-ce
  container_name: my_gitlab
  volumes:
    - ./runtime/gitlab/config:/etc/gitlab
    - ./runtime/gitlab/data:/var/opt/gitlab
    - ./runtime/gitlab/logs:/var/log/gitlab
  ports:
    - 443:443
    - 22:22
    - 8082:80
The following is the log file saved in ./runtime/gitlab/logs (the volume for /var/log/gitlab):
# Logfile created on 2016-04-28 08:07:43 +0000 by logger.rb/44203
[2016-04-28T08:07:44+00:00] INFO: Started chef-zero at chefzero://localhost:8889 with repository at /opt/gitlab/embedded
One version per cookbook
[2016-04-28T08:07:44+00:00] INFO: Forking chef instance to converge...
[2016-04-28T08:07:44+00:00] INFO: *** Chef 12.6.0 ***
[2016-04-28T08:07:44+00:00] INFO: Chef-client pid: 36
[2016-04-28T08:07:47+00:00] INFO: HTTP Request Returned 404 Not Found: Object not found: chefzero://localhost:8889/nodes/bcfc5b569532
[2016-04-28T08:07:48+00:00] INFO: Setting the run_list to ["recipe[gitlab]"] from CLI options
[2016-04-28T08:07:48+00:00] INFO: Run List is [recipe[gitlab]]
[2016-04-28T08:07:48+00:00] INFO: Run List expands to [gitlab]
[2016-04-28T08:07:48+00:00] INFO: Starting Chef Run for bcfc5b569532
[2016-04-28T08:07:48+00:00] INFO: Running start handlers
[2016-04-28T08:07:48+00:00] INFO: Start handlers complete.
[2016-04-28T08:07:48+00:00] INFO: HTTP Request Returned 404 Not Found: Object not found:
[2016-04-28T08:07:52+00:00] INFO: Loading cookbooks [gitlab@0.0.1, runit@0.14.2, package@0.0.0]
[2016-04-28T08:07:54+00:00] INFO: directory[/etc/gitlab] owner changed to 0
[2016-04-28T08:07:54+00:00] INFO: directory[/etc/gitlab] group changed to 0
[2016-04-28T08:07:54+00:00] INFO: directory[/etc/gitlab] mode changed to 775
[2016-04-28T08:07:54+00:00] WARN: Cloning resource attributes for directory[/var/opt/gitlab] from prior resource (CHEF-3694)
[2016-04-28T08:07:54+00:00] WARN: Previous directory[/var/opt/gitlab]: /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/default.rb:43:in `from_file'
[2016-04-28T08:07:54+00:00] WARN: Current directory[/var/opt/gitlab]: /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/users.rb:24:in `from_file'
[2016-04-28T08:07:54+00:00] WARN: Selected upstart because /sbin/init --version is showing upstart.
[2016-04-28T08:07:54+00:00] WARN: Cloning resource attributes for directory[/etc/sysctl.d] from prior resource (CHEF-3694)
[2016-04-28T08:07:54+00:00] WARN: Previous directory[/etc/sysctl.d]: /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/sysctl.rb:22:in `block in from_file'
[2016-04-28T08:07:54+00:00] WARN: Current directory[/etc/sysctl.d]: /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/definitions/sysctl.rb:22:in `block in from_file'
[2016-04-28T08:07:54+00:00] WARN: Cloning resource attributes for file[/etc/sysctl.d/90-postgresql.conf] from prior resource (CHEF-3694)
.
. several similar WARN: log entries
.
[2016-04-28T08:07:55+00:00] INFO: directory[/var/opt/gitlab] owner changed to 0
[2016-04-28T08:07:55+00:00] INFO: directory[/var/opt/gitlab] group changed to 0
[2016-04-28T08:07:55+00:00] INFO: directory[/var/opt/gitlab] mode changed to 755
.
.
.
[2016-04-28T08:07:57+00:00] INFO: template[/var/opt/gitlab/gitlab-rails/etc/rack_attack.rb] owner changed to 0
[2016-04-28T08:07:57+00:00] INFO: template[/var/opt/gitlab/gitlab-rails/etc/rack_attack.rb] group changed to 0
[2016-04-28T08:07:57+00:00] INFO: template[/var/opt/gitlab/gitlab-rails/etc/rack_attack.rb] mode changed to 644
[2016-04-28T08:07:58+00:00] INFO: Running queued delayed notifications before re-raising exception
[2016-04-28T08:07:58+00:00] INFO: template[/var/opt/gitlab/gitlab-rails/etc/gitlab.yml] sending run action to execute[clear the gitlab-rails cache] (delayed)
[2016-04-28T08:09:02+00:00] ERROR: Running exception handlers
[2016-04-28T08:09:02+00:00] ERROR: Exception handlers complete
[2016-04-28T08:09:02+00:00] FATAL: Stacktrace dumped to /opt/gitlab/embedded/cookbooks/cache/chef-stacktrace.out
[2016-04-28T08:09:02+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-04-28T08:09:02+00:00] ERROR: Chef::Exceptions::MultipleFailures
[2016-04-28T08:09:02+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
/opt/gitlab/embedded/bin/chef-client:23:in `<main>'
root@bcfc5b569532:/# tail -f /opt/gitlab/embedded/cookbooks/cache/chef-stacktrace.out
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/local_mode.rb:44:in `with_server_connectivity'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application.rb:203:in `run_chef_client'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application/client.rb:413:in `block in interval_run_chef_client'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application/client.rb:403:in `loop'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application/client.rb:403:in `interval_run_chef_client'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application/client.rb:393:in `run_application'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/lib/chef/application.rb:58:in `run'
/opt/gitlab/embedded/lib/ruby/gems/2.1.0/gems/chef-12.6.0/bin/chef-client:26:in `<top (required)>'
/opt/gitlab/embedded/bin/chef-client:23:in `load'
/opt/gitlab/embedded/bin/chef-client:23:in `<main>'
<...here the container terminates and my exec bash shell returns...>
Below is the output from docker logs -f for the container. The log is very long (>12K lines), so I tried to pick out the lines containing useful information, but I'm not sure I found them all:
Thank you for using GitLab Docker Image!
Current version: gitlab-ce=8.7.0-ce.0
Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
And restart this container to reload settings.
To do it use docker exec:
docker exec -it gitlab vim /etc/gitlab/gitlab.rb
docker restart gitlab
For a comprehensive list of configuration options please see the Omnibus GitLab readme
https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/README.md
If this container fails to start due to permission problems try to fix it by executing:
docker exec -it gitlab update-permissions
docker restart gitlab
Preparing services...
Starting services...
Configuring GitLab package...
Configuring GitLab...
[2016-04-28T08:02:39+00:00] INFO: GET /organizations/chef/nodes/bcfc5b569532
[2016-04-28T08:02:39+00:00] INFO: #<ChefZero::RestErrorResponse: 404: Object not found: chefzero://localhost:8889/nodes/bcfc5b569532>
.
.
.
/opt/gitlab/embedded/bin/chef-client:23:in `load'
/opt/gitlab/embedded/bin/chef-client:23:in `<main>'
[2016-04-28T08:02:39+00:00] INFO:
--- RESPONSE (404) ---
{
"error": [
"Object not found: chefzero://localhost:8889/nodes/bcfc5b569532"
]
}
--- END RESPONSE ---
.
.
.
...a lot of logs (~12K lines), including some errors like the following one:
.
.
.
--- END RESPONSE ---
init (upstart 1.12.1)
================================================================================
Error executing action `create` on resource 'link[/var/log/gitlab/gitlab-rails/sidekiq.log]'
================================================================================
Errno::EPROTO
-------------
Protocol error @ sys_fail2 - (/var/log/gitlab/sidekiq/current, /var/log/gitlab/gitlab-rails/sidekiq.log)
.
.
.
================================================================================
Error executing action `create` on resource 'link[/var/log/gitlab/gitlab-rails/sidekiq.log]'
================================================================================
Errno::EPROTO
-------------
Protocol error @ sys_fail2 - (/var/log/gitlab/sidekiq/current, /var/log/gitlab/gitlab-rails/sidekiq.log)
Resource Declaration:
---------------------
# In /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/gitlab-rails.rb
281: link legacy_sidekiq_log_file do
282: to File.join(node['gitlab']['sidekiq']['log_directory'], 'current')
283: not_if { File.exists?(legacy_sidekiq_log_file) }
284: end
285:
Compiled Resource:
------------------
# Declared in /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/gitlab-rails.rb:281:in `from_file'
link("/var/log/gitlab/gitlab-rails/sidekiq.log") do
action [:create]
retries 0
retry_delay 2
default_guard_interpreter :default
to "/var/log/gitlab/sidekiq/current"
link_type :symbolic
target_file "/var/log/gitlab/gitlab-rails/sidekiq.log"
declared_type :link
cookbook_name "gitlab"
recipe_name "gitlab-rails"
not_if { #code block }
end
<output ends>
My GitLab container was crashing on startup too, until I noticed that there was a permissions issue (GitLab not having rights on its own files because they had been replaced externally, especially the config file gitlab.rb).
This fixed my problem:
docker exec -it my-gitlab-container update-permissions
docker exec -it my-gitlab-container gitlab-ctl reconfigure
docker restart my-gitlab-container
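Before running those commands, a quick way to confirm it really is an ownership problem on the mounted volumes is to look at the owners from inside the container; my_gitlab is the container_name from the compose snippet above, so adjust it if yours differs:
docker exec -it my_gitlab ls -ld /etc/gitlab /var/opt/gitlab /var/log/gitlab  # directories backed by the host volumes
docker exec -it my_gitlab ls -l /etc/gitlab/gitlab.rb                         # the externally replaced config file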
I'm not sure my issue is related to yours, but in my case I wanted to migrate the GitLab volumes to another directory because of space availability. There was a permission issue because I ran:
cp -R /my/old/gitlab /my/new/gitlab
instead of:
cp -a /my/old/gitlab /my/new/gitlab
The -a preserves the attributes, including the permissions, which were problematic for our container.
cheers
sudo chmod g+s /opt/gitlab/data/git-data/repositories/
Where /opt/gitlab/ is the linked docker share
