On my AWS Ubuntu machine, I started a Jenkins build that somehow caused the machine to hang, so I forcibly stopped the instance and started it again. Since then, Jenkins will not start up. I'm getting the error below:
<pre><code>
jenkins.service - LSB: Start Jenkins at boot time
Loaded: loaded (/etc/init.d/jenkins; generated)
Active: failed (Result: exit-code) since Tue 2020-12-01 15:56:48 UTC; 4min 6s ago
Docs: man:systemd-sysv-generator(8)
Process: 27605 ExecStart=/etc/init.d/jenkins start (code=exited, status=7)
Dec 01 15:56:47 ip-172-31-7-133 systemd[1]: Starting LSB: Start Jenkins at boot time...
Dec 01 15:56:47 ip-172-31-7-133 jenkins[27605]: Correct java version found
Dec 01 15:56:47 ip-172-31-7-133 jenkins[27605]: * Starting Jenkins Automation Server jenkins
Dec 01 15:56:48 ip-172-31-7-133 jenkins[27605]: ...fail!
Dec 01 15:56:48 ip-172-31-7-133 systemd[1]: jenkins.service: Control process exited, code=exited status=7
Dec 01 15:56:48 ip-172-31-7-133 systemd[1]: jenkins.service: Failed with result 'exit-code'.
Dec 01 15:56:48 ip-172-31-7-133 systemd[1]: Failed to start LSB: Start Jenkins at boot time.
</code></pre>
I imagine it can't start because there is already a Jenkins process running that is bound to the same port. You should be able to verify that by looking at the Jenkins log file.
You can check with:
ps auxw | grep jenkins
If it returns a Jenkins process, you can kill it with kill -9 PID.
For example:
[user@server ~]$ ps auxw | grep jenkins
jenkins 4656 0.3 33.2 5070780 2716228 ? Ssl Nov05 144:15 /etc/alternatives/java -Djava.awt.headless=true -DJENKINS_HOME=/var/lib/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --httpPort=8080 --debug=5 --handlerCountMax=100 --handlerCountMaxIdle=20
user 14665 0.0 0.0 119416 908 pts/0 S+ 23:08 0:00 grep --color=auto jenkins
[user@server ~]$ kill -9 4656
Then try to start your Jenkins instance again. Depending on how your Jenkins server is set up, you will most likely need to run the above via sudo.
I don't know if this applies to you, but it was my problem. It is similar to Terry Sposato's answer.
We had a scenario where our Ubuntu node was running a standalone instance of Jenkins as well as an SSH-started child instance managed from another node.
We saw this sort of service-start error whenever the SSH-started child instance was running.
I resolved this by:
1. Accessing the parent Jenkins instance and selecting disconnect for the Ubuntu node.
2. Immediately invoking sudo /etc/init.d/jenkins start on the Ubuntu node to start its standalone instance.
After that, the parent Jenkins instance from step 1 would eventually reconnect to the Ubuntu node.
We have not seen this sort of behavior with our CentOS node, which also runs a standalone and a child instance of Jenkins. I suspect it's a defect in Ubuntu's LSB startup scripts; my debugging showed that the problem happened when /etc/init.d/jenkins sourced /lib/lsb/init-functions.
Following this page, I have moved the Docker data directory and created a symbolic link to it. It works. But every time I reboot my computer, the Docker service no longer starts automatically. How can I solve this problem?
journalctl -u docker.service returns:
Jun 30 10:29:55 ubuntu systemd[1]: Starting Docker Application Container Engine...
Jun 30 10:29:55 ubuntu dockerd[2358]: time="2022-06-30T10:29:55.426467188+10:00" level=info msg="S>
Jun 30 10:29:55 ubuntu dockerd[2358]: mkdir /var/lib/docker: file exists
Jun 30 10:29:55 ubuntu systemd[1]: docker.service: Main process exited, code=exited, status=1/FAIL>
Jun 30 10:29:55 ubuntu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 10:29:55 ubuntu systemd[1]: Failed to start Docker Application Container Engine.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Jun 30 10:29:57 ubuntu systemd[1]: Stopped Docker Application Container Engine.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Start request repeated too quickly.
Jun 30 10:29:57 ubuntu systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 30 10:29:57 ubuntu systemd[1]: Failed to start Docker Application Container Engine.
Before the move, "/var/lib/docker" was a regular directory used by Docker; now it is a symbolic link pointing to the external directory where the Docker image data is stored. Why is there a mkdir command?
If I run dockerd, it returns:
INFO[2022-06-30T20:53:05.143671302+10:00] Starting up
dockerd needs to be started with root privileges. To run dockerd in rootless mode as an unprivileged user, see https://docs.docker.com/go/rootless/
If I run sudo service docker start, Docker starts without error. But I don't want to run this every day; Docker used to start automatically. Any ideas?
I was able to reproduce the error message with the same configuration:
systemd[1]: Starting Docker Application Container Engine...
dockerd[47623]: time="2022-06-30T16:36:20.047741616Z" level=in..
dockerd[47623]: mkdir /data/docker: file exists
systemd[1]: docker.service: Main process exited, code=exited, ..
The reason was that my external drive wasn't mounted yet.
Adding systemd mount/automount units resolved the issue. Alternatively, you can add your external drive to /etc/fstab (add the nofail option to avoid the 90-second boot wait when the drive isn't attached).
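For the /etc/fstab route, an entry along these lines would do it (the UUID, mount point, and filesystem type below are placeholders for your drive's actual values):

```
# Mount the external drive at /data; nofail keeps boot from hanging when
# the drive is absent, and the device timeout caps how long systemd waits.
UUID=xxxx-xxxx  /data  ext4  defaults,nofail,x-systemd.device-timeout=5s  0  2
```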
Also, from the Docker docs:
You can configure the Docker daemon to use a different directory, using the data-root configuration option.
So editing your /etc/docker/daemon.json with:
{
  "data-root": "/data/docker"
}
is probably better than using symlinks.
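After restarting the daemon, you can confirm which directory is actually in use; DockerRootDir is a standard docker info template field, so something like this should print the configured data-root:

```shell
sudo systemctl restart docker
docker info --format '{{ .DockerRootDir }}'
```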
I have successfully installed Jenkins on an AWS Ubuntu 18.04 server, and the service is up and running:
jenkins.service - LSB: Start Jenkins at boot time
Loaded: loaded (/etc/init.d/jenkins; generated)
Active: active (exited) since Mon 2020-01-20 05:26:40 UTC; 8min ago
Docs: man:systemd-sysv-generator(8)
Process: 1905 ExecStop=/etc/init.d/jenkins stop (code=exited, status=0/SUCCESS)
Process: 1951 ExecStart=/etc/init.d/jenkins start (code=exited, status=0/SUCCESS)
Jan 20 05:26:39 ip-3.106.165.24 systemd[1]: Stopped LSB: Start Jenkins at boot time.
Jan 20 05:26:39 ip-3.106.165.24 systemd[1]: Starting LSB: Start Jenkins at boot time...
Jan 20 05:26:39 ip-3.106.165.24 jenkins[1951]: Correct java version found
Jan 20 05:26:39 ip-3.106.165.24 jenkins[1951]: * Starting Jenkins Automation Server jenkins
Jan 20 05:26:39 ip-3.106.165.24 su[1997]: Successful su for jenkins by root
Jan 20 05:26:39 ip-3.106.165.24 su[1997]: + ??? root:jenkins
Jan 20 05:26:39 ip-3.106.165.24 su[1997]: pam_unix(su:session): session opened for user jenkins by (uid=0)
Jan 20 05:26:39 ip-3.106.165.24 su[1997]: pam_unix(su:session): session closed for user jenkins
Jan 20 05:26:40 ip-3.106.165.24 jenkins[1951]: ...done.
Jan 20 05:26:40 ip-3.106.165.24 systemd[1]: Started LSB: Start Jenkins at boot time.
My security groups are set up with port 22 for SSH, port 8080 for Jenkins, and port 80 for HTTP.
When I attempt to access Jenkins through the web browser, I get a "This site can't be reached" error.
Not sure what else I can try; I have tried every solution under the sun, but the problem persists.
I can SSH into the server, and I can access port 80, as I set up nginx on the server successfully.
Any help would be greatly appreciated
Regards
Danny
Not sure why port 8080 was not working. I had to change the port to 8081 in the Jenkins config file at the following location, as there are multiple Jenkins config files:
/etc/default/jenkins
Also changed the security group settings in AWS to match
Works fine now
Further to this, it turned out my ISP was blocking port 8080. When I hotspotted my phone, port 8080 was accessible.
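For reference, on Debian/Ubuntu the port is set by the HTTP_PORT variable in /etc/default/jenkins, so the change amounts to (sketch):

```
# /etc/default/jenkins
HTTP_PORT=8081
```

followed by sudo service jenkins restart, plus opening 8081 in the AWS security group.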
I'm trying to use a startup script on a Google Compute Engine instance to either:
If the docker container called rstudio is present but in stopped state, run docker start rstudio
If the docker container is not present, run docker run --name=rstudio rocker/rstudio
From this SO answer I thought this could be achieved via docker top rstudio || docker run --name=rstudio rocker/rstudio, but it always errors at the docker top rstudio part. I have also tried redirecting with docker top rstudio &>/dev/null, but to no effect.
I have a cloud-config that runs when the instance boots up.
My problem is that the script to run or start the container keeps registering as an error, and doesn't go on to the logic of pulling the image. I have tried putting it in a separate bash script and calling it directly via ExecStart, and also putting "-" in front of the ExecStart command (which is supposed to ignore errors?), but this also seems to have no effect. This is where I have ended up:
#cloud-config
users:
- name: gcer
  uid: 2000

write_files:
- path: /home/gcer/docker-rstudio.sh
  permissions: 0755
  owner: root
  content: |
    #!/bin/bash
    echo "Docker RStudio launch script"
    if ! docker top rstudio &>/dev/null
    then
      echo "Pulling new rstudio"
      docker run -p 80:8787 \
        -e ROOT=TRUE \
        -e USER=%s -e PASSWORD=%s \
        -v /home/gcer:/home/rstudio \
        --name=rstudio \
        %s
    else
      echo "Starting existing rstudio"
      docker start rstudio
    fi
- path: /etc/systemd/system/rstudio.service
  permissions: 0644
  owner: root
  content: |
    [Unit]
    Description=RStudio Server
    Requires=docker.service
    After=docker.service

    [Service]
    Restart=always
    Environment="HOME=/home/gcer"
    ExecStartPre=/usr/share/google/dockercfg_update.sh
    ExecStart=-/home/gcer/docker-rstudio.sh
    ExecStop=/usr/bin/docker stop rstudio

runcmd:
- systemctl daemon-reload
- systemctl start rstudio.service
Whatever I try, I end up with this error log when I run sudo journalctl -u rstudio.service
Feb 14 23:26:09 test-9 systemd[1]: Started RStudio Server.
Feb 14 23:26:09 test-9 docker[770]: Error response from daemon: No such container: rstudio
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Control process exited, code=exited status=1
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Service hold-off time over, scheduling restart.
Feb 14 23:26:09 test-9 systemd[1]: Stopped RStudio Server.
Feb 14 23:26:09 test-9 systemd[1]: Starting RStudio Server...
...
Feb 14 23:26:09 test-9 systemd[1]: Started RStudio Server.
Feb 14 23:26:09 test-9 docker[809]: Error response from daemon: No such container: rstudio
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Control process exited, code=exited status=1
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:09 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Service hold-off time over, scheduling restart.
Feb 14 23:26:10 test-9 systemd[1]: Stopped RStudio Server.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Start request repeated too quickly.
Feb 14 23:26:10 test-9 systemd[1]: Failed to start RStudio Server.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Unit entered failed state.
Feb 14 23:26:10 test-9 systemd[1]: rstudio.service: Failed with result 'exit-code'.
Can anyone help me get this working?
I would delete the container when you stop it. Then your startup script reduces to making extra sure the container is deleted, and unconditionally docker running it anew.
This would make the entire contents of the script be:
#!/bin/sh
docker stop rstudio
docker rm rstudio
docker run -p 80:8787 \
  --name=rstudio \
  ... \
  rocker/rstudio
Without the set -e option, even if the earlier commands fail (because the container doesn't exist) the script will still go on to the docker run command. This avoids any testing of trying to figure out whether a container is there or not and always leaves you in a consistent state.
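The effect is easy to see in isolation. In this sketch, false stands in for the docker stop / docker rm calls failing on a container that doesn't exist yet:

```shell
#!/bin/sh
# Without `set -e`, a failing command does not abort a shell script.
# `false` stands in for the docker commands failing on a missing container.
script='
false   # docker stop rstudio (fails: no such container)
false   # docker rm rstudio   (fails: no such container)
echo "docker run would still execute here"
'
sh -c "$script"
echo "script exit status: $?"   # 0: the status of the last command (echo)
```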
Similarly, to clean up a little better, I'd change the systemd unit file to delete the container after it stops
ExecStop=/usr/bin/docker stop rstudio
ExecStopPost=/usr/bin/docker rm rstudio
(Your setup has three possible states: the container is running; the container exists but is stopped; and the container doesn't exist. My setup removes the "exists but is stopped" state, which doesn't have a whole lot of value, especially since you use a docker run -v option to store data outside of container space.)
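An equivalent way to avoid the "exists but stopped" state altogether is docker run --rm, which makes Docker remove the container as soon as it exits. A minimal unit along those lines (a sketch using the question's port and image, not tested on this exact setup):

```
[Unit]
Description=RStudio Server
Requires=docker.service
After=docker.service

[Service]
Restart=always
ExecStart=/usr/bin/docker run --rm -p 80:8787 --name=rstudio rocker/rstudio
ExecStop=/usr/bin/docker stop rstudio
```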
I have installed and configured Jenkins on CentOS 7. I have added a valid Java path, i.e. "/usr/bin/java", in the file /etc/init.d/jenkins.
Below are the Java path details:
lrwxrwxrwx. 1 root root 22 Dec 24 2015 java -> /etc/alternatives/java
Now, on running the "service jenkins start" command as the root user, I get the error below.
● jenkins.service - LSB: Jenkins Continuous Integration Server
Loaded: loaded (/etc/rc.d/init.d/jenkins)
Active: failed (Result: exit-code) since Wed 2016-07-13 18:25:51 IST; 5s ago
Docs: man:systemd-sysv-generator(8)
Process: 807 ExecStart=/etc/rc.d/init.d/jenkins start (code=exited, status=1/FAILURE)
Jul 13 18:25:51 localhost systemd[1]: Starting LSB: Jenkins Continuous Integration Server...
Jul 13 18:25:51 localhost runuser[812]: pam_unix(runuser:session): session opened for user jenkins by (uid=0)
Jul 13 18:25:51 localhost jenkins[807]: Starting Jenkins bash: /usr/bin/java: Permission denied
Jul 13 18:25:51 localhost runuser[812]: pam_unix(runuser:session): session closed for user jenkins
Jul 13 18:25:51 localhost jenkins[807]: [FAILED]
Jul 13 18:25:51 localhost systemd[1]: jenkins.service: control process exited, code=exited status=1
Jul 13 18:25:51 localhost systemd[1]: Failed to start LSB: Jenkins Continuous Integration Server.
Jul 13 18:25:51 localhost systemd[1]: Unit jenkins.service entered failed state.
Jul 13 18:25:51 localhost systemd[1]: jenkins.service failed.
I am not able to figure out why it gives permission denied even though every user has access to the Java path.
Also, running the "journalctl -xe" command shows the log below:
Jul 13 18:45:33 localhost systemd[1]: Unit jenkins.service entered failed state.
Jul 13 18:45:33 localhost systemd[1]: jenkins.service failed.
Jul 13 18:45:33 localhost polkitd[20151]: Unregistered Authentication Agent for unix-process:27889:3161602 (system bus name :1.303, object path /org/freedesktop/PolicyKit1/AuthenticationAgen
Is it that the Jenkins service doesn't have permission to access the Java path? If not, why is it giving that error?
You have two options to solve the problem.
1. The Jenkins service is started by the jenkins user, and the error says that the jenkins user does not have permission to run java. So check the original Java path and give execute permission to other users.
2. In the jenkins.service unit file, change the owner of the service: replace User=jenkins with User=root.
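For the first option, note that /usr/bin/java is usually the head of a symlink chain, and it's the real binary at the end that needs the execute bit. A sketch (the resolved JDK path will differ per system):

```shell
# Resolve the symlink chain to the actual java binary
readlink -f /usr/bin/java
# Grant read/execute to "other" users on that binary
sudo chmod o+rx "$(readlink -f /usr/bin/java)"
```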
The default user of the jenkins service is "jenkins", so "jenkins" may not have permission to execute "java".
So we need to change the user of the jenkins service.
From the jenkins boot script "/etc/init.d/jenkins", we can get the config file path, e.g. "/etc/sysconfig/jenkins".
Try changing the file /etc/init.d/jenkins.
Specifically, look for the JENKINS_USER key and try replacing jenkins with root.
This worked for me on RHEL.
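For reference, on RHEL/CentOS the equivalent setting lives in /etc/sysconfig/jenkins; the change would look like this (note that running Jenkins as root has security implications, so fixing the java permissions is preferable where possible):

```
# /etc/sysconfig/jenkins
JENKINS_USER="root"
```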
I tried launching a service using a chat.service unit file on CoreOS, and it failed:
// chat.service
[Unit]
Description=ChatApp
[Service]
ExecStartPre=-/usr/bin/docker kill simplechat1
ExecStartPre=-/usr/bin/docker rm simplechat1
ExecStartPre=-/usr/bin/docker pull jochasinga/socketio-chat
ExecStart=/usr/bin/docker run -p 3000:3000 --name simplechat1 jochasinga/socketio-chat
fleetctl list-units shows:
UNIT MACHINE ACTIVE SUB
chat.service cfe13a03.../<virtual-ip> failed failed
However, if I changed the chat.service to just:
// chat.service
[Service]
ExecStart=/usr/bin/docker run -p 3000:3000 <mydockerhubuser>/socketio-chat
It ran just fine. fleetctl list-units shows:
UNIT MACHINE ACTIVE SUB
chat.service 8df7b42d.../<virtual-ip> active running
EDIT
Using journalctl -u chat.service I got:
Jun 02 00:02:47 core-01 systemd[1]: Started chat.service.
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Main process exited, code=exited, status=125/n/a
Jun 02 00:02:47 core-01 docker[8924]: docker: Error response from daemon: failed to create endpoint clever_tesla on network brid
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Unit entered failed state.
Jun 02 00:02:47 core-01 systemd[1]: chat.service: Failed with result 'exit-code'.
Jun 02 00:02:58 core-01 systemd[1]: Stopped chat.service.
Jun 02 00:03:08 core-01 systemd[1]: Stopped chat.service.
What had I done wrong in the first chat.service unit file? Any guidance is appreciated.
Running Vagrant version of CoreOS (stable) on Mac OS X.
Your ExecStartPre= command doesn't seem to have a docker subcommand in it. Did you mean to use pull?
Reading the journal for the unit should get you more information: journalctl -u chat.service
After looking into the journal per @Rob's suggestion and some research, it appears that Docker couldn't create an endpoint on port 3000 because there was already a running Docker container bound to that port.
Stopping the container with docker stop <processname> and re-launching with fleetctl start chat solved the problem.
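To see which container is occupying a port before re-launching, docker ps can filter on published ports (sketch; replace the placeholder with the name or ID it prints):

```shell
# List containers publishing host port 3000
docker ps --filter "publish=3000"
# Stop the offender, then retry
docker stop <name-or-id>
fleetctl start chat
```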