Unable to restart Unicorn in Ubuntu 16.04 - ruby-on-rails

I am trying to deploy my new Rails app to an Ubuntu 16.04 DigitalOcean server, where Unicorn is managed via systemd. This is my /etc/systemd/system/unicorn.service file:
[Unit]
Description=Skreem Application
Before=nginx.service
Requires=network.target
[Service]
Type=simple
User=rails
Group=rails
RuntimeDirectory=DigitalOceanOneClick
SyslogIdentifier=DigitalOceanRailsOneClick
# Go paranoid
PrivateTmp=true
PrivateDevices=true
ProtectSystem=full
ProtectKernelTunables=true
NoNewPrivileges=true
WorkingDirectory=/home/rails/skreem-ror
ExecStart=/bin/bash /home/rails/skreem-ror/.unicorn.sh
TimeoutSec=60s
RestartSec=10s
Restart=always
[Install]
WantedBy=multi-user.target
When I try to restart the unicorn service, I get the following error:
Failed to restart unicorn.service: Unit unicorn.service is not loaded properly: Invalid argument.
See system logs and 'systemctl status unicorn.service' for details.
Then I ran systemctl status unicorn.service and got:
Jul 03 10:05:06 skreem-dev2 systemd[1]: unicorn.service: Main process exited, code=exited, status=1/FAILURE
Jul 03 10:05:06 skreem-dev2 systemd[1]: unicorn.service: Unit entered failed state.
Jul 03 10:05:06 skreem-dev2 systemd[1]: unicorn.service: Failed with result 'exit-code'.
Jul 03 10:05:07 skreem-dev2 systemd[1]: [/etc/systemd/system/unicorn.service:18] Unknown lvalue 'ProtectKernelTunables' in section 'Service'
Jul 03 10:05:07 skreem-dev2 systemd[1]: [/etc/systemd/system/unicorn.service:32] Missing '='.
Jul 03 10:05:16 skreem-dev2 systemd[1]: unicorn.service: Service hold-off time over, scheduling restart.
Jul 03 10:05:16 skreem-dev2 systemd[1]: unicorn.service: Failed to schedule restart job: Unit unicorn.service is not loaded properly: Invalid a
Jul 03 10:05:16 skreem-dev2 systemd[1]: unicorn.service: Unit entered failed state.
Jul 03 10:05:16 skreem-dev2 systemd[1]: unicorn.service: Failed with result 'resources'.
Jul 03 11:33:51 skreem-dev2 systemd[1]: Stopped DigitalOcean Rails One-Click Application.
This does not seem to come from my updated unicorn.service file. Is it because my changes are not being loaded properly? Please help me solve this issue.
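One detail in the journal output may be worth checking: it reports a problem at line 32 of /etc/systemd/system/unicorn.service, while the unit posted above has far fewer lines, so the file on disk may differ from the one shown. The sketch below is a rough illustration and not systemd's actual parser. It uses a hypothetical helper, find_malformed_lines, that flags lines which are neither blank, comments, section headers, nor key=value assignments, which is roughly what the "Missing '='" message complains about. The sample text is made up.

```python
def find_malformed_lines(unit_text):
    """Return (line_number, text) pairs for lines that lack an '=' sign.

    Rough approximation of systemd's unit file syntax: blank lines,
    comments (# or ;), [Section] headers and backslash continuations
    are skipped. This is not systemd's real parser.
    """
    problems = []
    continued = False  # True when the previous line ended with a backslash
    for number, raw in enumerate(unit_text.splitlines(), start=1):
        line = raw.strip()
        is_continuation = continued
        continued = line.endswith("\\")
        if is_continuation:
            continue  # belongs to the previous assignment
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            continue
        if "=" not in line:
            problems.append((number, line))
    return problems


# Hypothetical unit text with one broken line, for illustration only.
sample = """[Unit]
Description=Demo
# Go paranoid
PrivateTmp true
[Install]
WantedBy=multi-user.target
"""

for number, text in find_malformed_lines(sample):
    print(f"line {number}: missing '=' in {text!r}")
```

To check the real file, read /etc/systemd/system/unicorn.service and pass its contents to the function. Note also that systemd only re-reads edited unit files after systemctl daemon-reload, which may explain why the last log line still shows the old description.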

Related

Job for artifactory.service failed

Ubuntu 20.04 LTS
I just installed the OSS version of JFrog Artifactory.
To run Artifactory I used systemctl start artifactory.service, but I got this error:
Job for artifactory.service failed because the control process exited with error code.
See "systemctl status artifactory.service" and "journalctl -xe" for details.
If I run systemctl status artifactory.service, this is what I get:
● artifactory.service - Artifactory service
Loaded: loaded (/lib/systemd/system/artifactory.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Jun 01 00:25:42 siddharth-HP-Notebook systemd[1]: Stopped Artifactory service.
Jun 01 00:25:42 siddharth-HP-Notebook systemd[1]: Starting Artifactory service...
Jun 01 00:25:43 siddharth-HP-Notebook artifactoryManage.sh[17274]: 2020-05-31T18:55:43.286Z [shell] [INFO ] [] [artifac>
Jun 01 00:25:43 siddharth-HP-Notebook systemd[1]: artifactory.service: Can't open PID file /run/artifactory.pid (yet?) >
Jun 01 00:25:43 siddharth-HP-Notebook systemd[1]: artifactory.service: Failed with result 'protocol'.
Jun 01 00:25:43 siddharth-HP-Notebook systemd[1]: Failed to start Artifactory service.
Jun 01 00:25:48 siddharth-HP-Notebook systemd[1]: Stopped Artifactory service.
Jun 01 00:25:48 siddharth-HP-Notebook systemd[1]: /lib/systemd/system/artifactory.service:10: PIDFile= references a pat>
Jun 01 00:31:37 siddharth-HP-Notebook systemd[1]: /lib/systemd/system/artifactory.service:10: PIDFile= references a pat>
Jun 01 00:31:38 siddharth-HP-Notebook systemd[1]: /lib/systemd/system/artifactory.service:10: PIDFile= references a pat>
Also, at the end of the installation I got this error, which may be helpful:
Triggering migration script, this will migrate if needed ...
chown: invalid user: ‘artifactory:artifactory’
[WARN] Could not set owner of [/opt/jfrog/artifactory/var/etc] to [artifactory:artifactory]
Processing triggers for systemd (245.4-4ubuntu3.1) ...
Make sure that the PID file exists:
Jun 01 00:25:43 siddharth-HP-Notebook systemd[1]: artifactory.service: Can't open PID file /run/artifactory.pid (yet?) >
If it exists, check its permissions, and check your service file to see which PID file path it uses.
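The checks suggested above, together with the "invalid user" message from the installer, can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names user_exists and describe_pid_file are made up here, and the path and user name are the ones that appear in the question's output.

```python
import os
import pwd


def user_exists(name):
    """Return True if the local user database knows this account."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def describe_pid_file(path):
    """Classify a PID file as 'missing', 'unreadable', 'invalid' or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    with open(path) as handle:
        content = handle.read().strip()
    # A PID file is expected to hold a single process ID.
    return "ok" if content.isdigit() else "invalid"


# Values taken from the question's output.
print("artifactory user exists:", user_exists("artifactory"))
print("PID file:", describe_pid_file("/run/artifactory.pid"))
```

If user_exists reports False, that would be consistent with the chown: invalid user message shown during installation, and would be worth fixing before looking further at the PID file.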

Jenkins makes a Kubernetes node stuck when high CPU usage

I noticed that when launching some Jenkins builds, the node hosting Jenkins sometimes gets stuck forever. The whole node becomes unreachable, and all its pods are down (not ready in the dashboard).
To bring things back up I need to remove the node from the cluster and add it again (I'm on GCE, so I need to remove it from the instance group to be able to delete it).
Note: for hours I am not able to connect to the node through SSH; it is clearly out of service ^^
From my understanding, running out of memory can crash a node, but hitting maximum CPU usage should only slow the server down, not cause a problem as big as the one I'm experiencing. In the worst case the Kubelet should be unavailable until CPU usage drops.
Is someone able to help me determine the origin of this issue? What could cause such a problem?
[Images: Node metrics 1; Node metrics 2; Jenkins slave metrics; Node metrics from GCE]
On the other hand, after waiting for hours, I was able to access the node through SSH and ran sudo journalctl -u kubelet to see what was going on. I don't see anything specific at 7pm, but I do see recurring errors like:
Apr 04 19:00:58 nodes-s2-2g5v systemd[43508]: kubelet.service: Failed at step EXEC spawning /home/kubernetes/bin/kubelet: Permission denied
Apr 04 19:00:58 nodes-s2-2g5v systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Apr 04 19:00:58 nodes-s2-2g5v systemd[1]: kubelet.service: Unit entered failed state.
Apr 04 19:00:58 nodes-s2-2g5v systemd[1]: kubelet.service: Failed with result 'exit-code'.
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: Stopped Kubernetes Kubelet Server.
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: Started Kubernetes Kubelet Server.
Apr 04 19:01:00 nodes-s2-2g5v systemd[43511]: kubelet.service: Failed at step EXEC spawning /home/kubernetes/bin/kubelet: Permission denied
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: kubelet.service: Unit entered failed state.
Apr 04 19:01:00 nodes-s2-2g5v systemd[1]: kubelet.service: Failed with result 'exit-code'.
Apr 04 19:01:02 nodes-s2-2g5v systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Apr 04 19:01:02 nodes-s2-2g5v systemd[1]: Stopped Kubernetes Kubelet Server.
Apr 04 19:01:02 nodes-s2-2g5v systemd[1]: Started Kubernetes Kubelet Server.
Going back through older logs, I found that this kind of message starts at around 5:30pm:
Apr 04 17:26:50 nodes-s2-2g5v kubelet[1841]: I0404 17:25:05.168402 1841 prober.go:111] Readiness probe for "...
Apr 04 17:26:50 nodes-s2-2g5v kubelet[1841]: I0404 17:25:04.021125 1841 prober.go:111] Readiness probe for "...
-- Reboot --
Apr 04 17:31:31 nodes-s2-2g5v systemd[1]: Started Kubernetes Kubelet Server.
Apr 04 17:31:31 nodes-s2-2g5v systemd[1699]: kubelet.service: Failed at step EXEC spawning /home/kubernetes/bin/kubelet: Permission denied
Apr 04 17:31:31 nodes-s2-2g5v systemd[1]: kubelet.service: Main process exited, code=exited, status=203/EXEC
Apr 04 17:31:31 nodes-s2-2g5v systemd[1]: kubelet.service: Unit entered failed state.
Apr 04 17:31:31 nodes-s2-2g5v systemd[1]: kubelet.service: Failed with result 'exit-code'.
Apr 04 17:31:33 nodes-s2-2g5v systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Apr 04 17:31:33 nodes-s2-2g5v systemd[1]: Stopped Kubernetes Kubelet Server.
Apr 04 17:31:33 nodes-s2-2g5v systemd[1]: Started Kubernetes Kubelet Server.
At that time the node reboots, and this coincides with a Jenkins build. The same pattern of high CPU usage is present. I don't know why the node simply rebooted earlier, whereas around 7pm it got stuck :/
I'm really sorry, it's a lot of information, but I'm totally lost, and this is not the first time it has happened to me ^^
Thank you.
As mentioned by @Brandon, it was related to the resource limits applied to my Jenkins slaves.
In my case, even though the values were specified in my Helm chart YAML file, they were not actually set. I had to dig deeper into the UI to set them manually.
Since this change, everything is stable! :)
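For readers unfamiliar with what "resource limits" means here: in Kubernetes, each container can declare CPU and memory requests and limits under resources in its spec. The sketch below builds such a fragment as a Python dict and prints it as JSON. The helper name container_resources, the container name and all values are hypothetical and not taken from the original setup; where exactly they are configured for Jenkins agents depends on the chart or plugin in use.

```python
import json


def container_resources(cpu_request, memory_request, cpu_limit, memory_limit):
    """Build the `resources` stanza of a Kubernetes container spec."""
    return {
        "requests": {"cpu": cpu_request, "memory": memory_request},
        "limits": {"cpu": cpu_limit, "memory": memory_limit},
    }


# Hypothetical container name and values, for illustration only.
container = {
    "name": "jenkins-agent",
    "resources": container_resources("500m", "512Mi", "1", "1Gi"),
}

print(json.dumps(container, indent=2))
```

A container that hits its CPU limit is throttled, and one that exceeds its memory limit is killed, which is presumably why setting these values kept a heavy build from taking the whole node down.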

Docker 17 fails to start in Centos 7

We installed Docker 17.12 on CentOS 7.x, and after the installation completed I am facing an error when trying to start the docker service. First I ran systemctl start docker; then, looking for more output with journalctl, I saw that docker.service entered a failed state.
More details below:
Docker:
17.12.1-ce, build 7390fc6
Command tried:
sudo systemctl start docker
journalctl -u docker.service
Expected Output:
Docker service should be started successfully
Actual output:
Mar 26 23:51:19 docker[16420]: See 'docker --help'
Mar 26 23:51:19 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Mar 26 23:51:19 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:51:19 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:51:19 docker.service failed.
Mar 26 23:51:21 systemd[1]: docker.service holdoff time over, scheduling restart.
Mar 26 23:51:21 systemd[1]: start request repeated too quickly for docker.service
Mar 26 23:51:21 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:51:21 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:51:21 systemd[1]: docker.service failed.
Mar 26 23:52:22 systemd[1]: Starting Docker Application Container Engine...
Mar 26 23:52:22 docker[16582]: docker: 'daemon' is not a docker command.
Mar 26 23:52:22 docker[16582]: See 'docker --help'
Mar 26 23:52:22 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Mar 26 23:52:22 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:52:22 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:52:22 systemd[1]: docker.service failed.
Mar 26 23:52:24 systemd[1]: docker.service holdoff time over, scheduling restart.
Mar 26 23:52:24 systemd[1]: Starting Docker Application Container Engine...
Mar 26 23:52:25 docker[16601]: docker: 'daemon' is not a docker command.
Mar 26 23:52:25 docker[16601]: See 'docker --help'
Mar 26 23:52:25 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Mar 26 23:52:25 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:52:25 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:52:25 systemd[1]: docker.service failed.
Mar 26 23:52:27 systemd[1]: docker.service holdoff time over, scheduling restart.
Mar 26 23:52:27 systemd[1]: Starting Docker Application Container Engine...
Mar 26 23:52:27 docker[16619]: docker: 'daemon' is not a docker command.
Mar 26 23:52:27 docker[16619]: See 'docker --help'
Mar 26 23:52:27 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Mar 26 23:52:27 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:52:27 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:52:27 systemd[1]: docker.service failed.
Mar 26 23:52:29 systemd[1]: docker.service holdoff time over, scheduling restart.
Mar 26 23:52:29 systemd[1]: start request repeated too quickly for docker.service
Mar 26 23:52:29 systemd[1]: Failed to start Docker Application Container Engine.
Mar 26 23:52:29 systemd[1]: Unit docker.service entered failed state.
Mar 26 23:52:29 systemd[1]: docker.service failed.
Please check on this issue and help us resolve the docker start issue.
I see no evidence of the cause in your log.
Could you simply reinstall using the official method?
$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sh get-docker.sh
Check if there's another issue with:
sudo dockerd --debug
In my case I had an invalid configuration in daemon.json.
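A quick way to rule out the daemon.json problem mentioned above is to check that the file parses as JSON at all. The sketch below is a minimal illustration with a made-up helper name, check_daemon_json. It only checks syntax and that the top-level value is an object; it does not know which keys dockerd accepts. The file is usually found at /etc/docker/daemon.json, which is an assumption about this installation.

```python
import json


def check_daemon_json(text):
    """Return None if text is a JSON object, else a short error description."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(data, dict):
        return "top-level value must be a JSON object"
    return None


# Hypothetical file contents: the trailing comma makes this invalid JSON.
broken = '{\n  "debug": true,\n}'
print(check_daemon_json(broken))
print(check_daemon_json('{"debug": true}'))
```

Separately, the log line docker: 'daemon' is not a docker command suggests that something, possibly the ExecStart line of the unit file, still invokes docker daemon, so the service file would be worth inspecting as well.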

Marathon exits with status 1 on start

Jan 24 15:28:57 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Main process exited, code=exited, status=1/FAILURE
Jan 24 15:28:57 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Unit entered failed state.
Jan 24 15:28:57 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Failed with result 'exit-code'.
Jan 24 15:29:57 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Service hold-off time over, scheduling restart.
Jan 24 15:29:57 ivum01-HP-Pro-3330-SFF systemd[1]: Stopped Scheduler for Apache Mesos.
Jan 24 15:29:57 ivum01-HP-Pro-3330-SFF systemd[1]: Starting Scheduler for Apache Mesos...
Jan 24 15:29:57 ivum01-HP-Pro-3330-SFF systemd[1]: Started Scheduler for Apache Mesos.
Jan 24 15:29:57 ivum01-HP-Pro-3330-SFF marathon[1838]: No start hook file found ($HOOK_MARATHON_START). Proceeding with the start script.
Jan 24 15:29:59 ivum01-HP-Pro-3330-SFF marathon[1838]: [scallop] Error: Required option 'master' not found
Jan 24 15:29:59 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Main process exited, code=exited, status=1/FAILURE
Jan 24 15:29:59 ivum01-HP-Pro-3330-SFF systemd[1]: marathon.service: Unit entered failed state.
These are the commands I am using for Marathon:
sudo mkdir -p /etc/marathon/conf
sudo cp /etc/mesos-master/hostname /etc/marathon/conf
sudo cp /etc/mesos/zk /etc/marathon/conf/master
sudo cp /etc/marathon/conf/master /etc/marathon/conf/zk
sudo nano /etc/marathon/conf/zk
The only portion I need to modify in this file is the endpoint. Change it from /mesos to /marathon.
That’s an out of memory error. Are you sure your node has enough memory to run both Mesos Master and Marathon?
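The last step of the question (changing the endpoint from /mesos to /marathon) can be illustrated as a small string transformation. This is a sketch only: the function name marathon_zk_url and the host addresses are hypothetical, and the zk://host:port/path layout is assumed from the usual format of the Mesos zk file.

```python
def marathon_zk_url(mesos_zk_url):
    """Swap the trailing /mesos path of a ZooKeeper URL for /marathon."""
    url = mesos_zk_url.strip()
    base, sep, path = url.rpartition("/")
    if not sep or path != "mesos":
        raise ValueError(f"expected a URL ending in /mesos, got {url!r}")
    return f"{base}/marathon"


# Hypothetical ZooKeeper ensemble address, with the trailing newline a file read would give.
print(marathon_zk_url("zk://10.0.0.1:2181,10.0.0.2:2181/mesos\n"))
# -> zk://10.0.0.1:2181,10.0.0.2:2181/marathon
```

Following the commands above, /etc/marathon/conf/master keeps the /mesos URL and /etc/marathon/conf/zk gets the /marathon one. Since the log reports Required option 'master' not found, it may be worth confirming that the master file exists, is non-empty, and is actually read by the start script.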

Can't start Docker after reboot on Ubuntu 16.04

I'm trying to run Docker on Ubuntu 16.04 after a system reboot. I created a service for it, "/etc/systemd/system/openvpnBOX.service":
[Unit]
Description=Openvpn Docker
[Service]
User=root
ExecStart=/etc/init/openvpn.conf
[Install]
WantedBy=multi-user.target
Alias=openvpnBOX.service
openvpn.conf:
#!/bin/bash
exec docker run --volumes-from ovpn-data --rm -p 1194:1194/udp --cap-add=NET_ADMIN kylemanna/openvpn
When I start this service with "sudo service openvpnBOX start", I can see that it runs, but after I reboot the system the service fails to start:
"sudo service openvpnBOX status"
● openvpnBOX.service - Openvpn Docker
Loaded: loaded (/etc/systemd/system/openvpnBOX.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2017-10-01 21:35:48 SST; 2min 51s ago
Process: 1771 ExecStart=/etc/init/openvpn.conf (code=exited, status=1/FAILURE)
Main PID: 1771 (code=exited, status=1/FAILURE)
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Main process exited, code=exited, status=1/FAILURE
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Unit entered failed state.
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Failed with result 'exit-code'.
Oct 01 21:35:48 systemd[1]: Started Openvpn Docker.
Oct 01 21:35:48 openvpn.conf[1771]: Error response from daemon: 404 page not found
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Main process exited, code=exited, status=1/FAILURE
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Unit entered failed state.
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Failed with result 'exit-code'.
Oct 01 21:35:48 systemd[1]: openvpnBOX.service: Start request repeated too quickly.
Oct 01 21:35:48 systemd[1]: Failed to start Openvpn Docker.
I can use "sudo docker run --restart=always --volumes-from ovpn-data -p 1194:1194/udp --cap-add=NET_ADMIN kylemanna/openvpn", but it doesn't solve my problem, because I would like to understand why my service doesn't work after a reboot.
Any ideas?
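One possibility worth ruling out (an assumption, not something the log proves) is that at boot the unit starts before the Docker daemon is ready, since the unit above declares no ordering relative to docker.service. The sketch below uses hypothetical helpers, unit_dependencies and is_ordered_after_docker, to check a unit file's text for an After=docker.service entry in its [Unit] section. It is a simplified reader, not systemd's parser.

```python
def unit_dependencies(unit_text, key):
    """Collect the space-separated values of `key` from the [Unit] section."""
    values = []
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif section == "Unit" and "=" in line and not line.startswith(("#", ";")):
            name, _, value = line.partition("=")
            if name.strip() == key:
                values.extend(value.split())
    return values


def is_ordered_after_docker(unit_text):
    """Return True if the unit asks to be started after docker.service."""
    return "docker.service" in unit_dependencies(unit_text, "After")


# The [Unit] and [Service] sections as posted in the question.
posted_unit = """[Unit]
Description=Openvpn Docker
[Service]
User=root
ExecStart=/etc/init/openvpn.conf
"""

print(is_ordered_after_docker(posted_unit))
```

If this returns False for the real file, adding After=docker.service and Requires=docker.service to the [Unit] section is a common way to express that dependency, followed by systemctl daemon-reload.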

Resources