Jenkins cannot start after disk restored from snapshot

This Jenkins installation has been running on a GCE VM since the beginning of 2020; pipelines, plugins and other configuration have been working without issue.
After a day of experimenting with it to introduce some new OS-level features, which should have taken 5 minutes and ended up taking the whole day, I realized I was hitting a wall and decided to wipe the slate clean by deleting the VM and creating a new one from a snapshot taken back in November, the last time the VM was working properly without modification. This particular Jenkins installation is used to build staging versions of our internal applications, so I wasn't that concerned with downtime and such.
After creating a new VM with the same specifications as the previous one, running Debian 10, and assigning the snapshot as the source for the disk, I went to reload the dashboard and was surprised with this:
Logging in to the VM itself, I find that everything is there directory- and file-wise, but running sudo systemctl status jenkins returns this:
● jenkins.service - LSB: Start Jenkins at boot time
Loaded: loaded (/etc/init.d/jenkins; generated; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2021-01-05 17:54:55 UTC; 5s ago
Docs: man:systemd-sysv-generator(8)
Process: 653 ExecStart=/etc/init.d/jenkins start (code=exited, status=7)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/jenkins.service
Jan 05 17:52:34 jenkins-1-vm systemd[1]: Starting LSB: Start Jenkins at boot time...
Jan 05 17:52:40 jenkins-1-vm jenkins[653]: Correct java version found
Jan 05 17:52:41 jenkins-1-vm su[767]: Successful su for jenkins by root
Jan 05 17:52:41 jenkins-1-vm su[767]: + ??? root:jenkins
Jan 05 17:52:41 jenkins-1-vm su[767]: pam_unix(su:session): session opened for user jenkins by (uid=0)
Jan 05 17:54:55 jenkins-1-vm jenkins[653]: Starting Jenkins Automation Server: jenkins failed!
Jan 05 17:54:55 jenkins-1-vm systemd[1]: jenkins.service: Control process exited, code=exited status=7
Jan 05 17:54:55 jenkins-1-vm systemd[1]: Failed to start LSB: Start Jenkins at boot time.
Jan 05 17:54:55 jenkins-1-vm systemd[1]: jenkins.service: Unit entered failed state.
Jan 05 17:54:55 jenkins-1-vm systemd[1]: jenkins.service: Failed with result 'exit-code'.
I started searching on Google, basically spending the last 2 hours on this, and found nothing relevant apart from a lot of articles that mention using Java 8, which cannot apply to this case since Java is there and the log itself says Correct java version found.
As a last attempt I tried to apt purge jenkins and reinstall it; after that everything works but, of course, everything is also wiped out. So I created another VM and, before attempting anything else, decided to ask here for help.
Is there something in Jenkins that might not be brought over in a snapshot of the disk and could cause this terrible Failed to start LSB: Start Jenkins at boot time. message? What can I try to fix this and restore it?
Adding more information: launching Jenkins via the .war file (java -jar /usr/share/jenkins/jenkins.war) works, but it starts as if it were a new installation, asking for an admin password and all the rest, ignoring the existing config.xml and everything else already present in /var/lib/jenkins.
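One possible explanation for that last observation, offered as an assumption rather than a confirmed diagnosis: a directly launched jenkins.war takes its home directory from the JENKINS_HOME environment variable and otherwise falls back to .jenkins under the home directory of the user running it, so /var/lib/jenkins is not picked up unless Jenkins is pointed at it. A minimal sketch of that lookup (the function name effective_jenkins_home is made up for illustration):

```shell
# Hypothetical helper: print the home directory a directly launched
# jenkins.war would be expected to use for the current user.
effective_jenkins_home() {
    # JENKINS_HOME wins if set and non-empty; otherwise fall back to $HOME/.jenkins
    printf '%s\n' "${JENKINS_HOME:-$HOME/.jenkins}"
}

# Possible usage, run as the jenkins user so file ownership stays consistent:
#   JENKINS_HOME=/var/lib/jenkins java -jar /usr/share/jenkins/jenkins.war
```

If the function prints something other than /var/lib/jenkins for the user launching the .war, that would be consistent with the "new installation" behaviour described above.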

I have had a similar experience with Jenkins running on a GCE VM. I have not finished solving the problem, but I have managed to get Jenkins running without reconfiguring everything again.
After stepping through the start-up script over a few hours I found a spot where it disappeared into a hole and came back as a failure. By looking at the steps after the failure I was able to get the system going again from first principles.
These are the commands I ended up running (I should really put them in a script, because my Jenkins instance will not start after a system reboot, with the same fingerprint you are getting). This is a Debian 10 system running in GCE.
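# Load NAME, JENKINS_HOME, JENKINS_USER, JENKINS_WAR, etc. from the package defaults
. /etc/default/jenkins
# Arguments for daemon(1): process name, environment, log file and pid file
DAEMON_ARGS="--name=$NAME --inherit --env=JENKINS_HOME=$JENKINS_HOME --output=$JENKINS_LOG --pidfile=$PIDFILE"
DAEMON=/usr/bin/daemon
SU=/bin/su
# Locate the java binary on the current PATH
JAVA=$(type -p java)
# Start Jenkins in the background as the Jenkins user
$SU -l $JENKINS_USER --shell=/bin/bash -c "$DAEMON $DAEMON_ARGS -- $JAVA $JAVA_ARGS -jar $JENKINS_WAR $JENKINS_ARGS"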
At this point Jenkins is running and responds to my web browser.
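The manual steps above could be collected into a small script along the following lines. This is only a sketch: DEFAULTS_FILE and JAVA_BIN are hypothetical overrides introduced here so the function can be tried on a machine without Jenkins, and the function only prints the command so it can be reviewed before being run as root.

```shell
# Sketch of a wrapper around the manual start-up steps.
# DEFAULTS_FILE and JAVA_BIN are hypothetical overrides, not Jenkins settings.
jenkins_start_command() {
    defaults="${DEFAULTS_FILE:-/etc/default/jenkins}"
    [ -r "$defaults" ] || { echo "cannot read $defaults" >&2; return 1; }
    (
        # Subshell: variables sourced from the defaults file do not leak out.
        . "$defaults"
        java_bin="${JAVA_BIN:-$(command -v java || true)}"
        [ -n "$java_bin" ] || { echo "java not found on PATH" >&2; exit 1; }
        daemon_args="--name=$NAME --inherit --env=JENKINS_HOME=$JENKINS_HOME --output=$JENKINS_LOG --pidfile=$PIDFILE"
        # Print the command instead of executing it.
        printf '%s\n' "/bin/su -l $JENKINS_USER --shell=/bin/bash -c \"/usr/bin/daemon $daemon_args -- $java_bin $JAVA_ARGS -jar $JENKINS_WAR $JENKINS_ARGS\""
    )
}
```

Calling jenkins_start_command with no overrides reads /etc/default/jenkins and prints the same su/daemon invocation as above.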

Related

docker.service: failed to start and not found

I'm really at the beginning of diving into the Linux world and its CLI, so I apologize for my lack of knowledge in this matter.
But what is my problem?
I'm trying to install "docker" on my Synology DS418. Following this guide https://wiki.servarr.com/docker-arm-synology everything runs smoothly until I try to check whether docker starts via
systemctl start docker
it reports
Failed to start docker.service: Unit docker.service failed to load: No such file or directory.
When I run
systemctl status docker
it states
docker.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
This feels like the installation wasn't properly set up, am I correct? What can I do to start docker and run it correctly?
For all following readers with the same confusion: it actually seems that docker.service is not needed in that case. With a lot of research I managed to install "portainer" via ssh in the CLI, so docker works. I'm kinda proud of myself that I did it all by myself, yay!
Thank you though for the help!

Linux CFS scheduling of cron-job vs console

I've created some job (weather forecasting) and it is a heavy load, mostly CPU and memory, for a long(er) time. I notice that when I'm running the job from the cli I can still use my browser without stuttering. But when I move the same job to a cron job there are stutters all over the place.
I think this has to do with the way that CFS scheduling from the kernel will group processes (by tty). See e.g. here for documentation.
Now that link does provide some pointers on how to fix it, possibly. But I was wondering if anyone has already done such a thing and what the results were.
Linux xyz 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
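For readers who want to check whether autogrouping is involved at all, a minimal sketch follows. It assumes a kernel built with autogroup support, where /proc/<pid>/autogroup exists and holds a line of the form "/autogroup-N nice V" as described in sched(7); the function names are made up for illustration.

```shell
# Read an autogroup line such as "/autogroup-25 nice 0" on stdin
# and print only the nice value.
autogroup_nice() {
    awk '$2 == "nice" { print $3 }'
}

# Print the autogroup line of a process (default: the current shell).
show_autogroup() {
    pid="${1:-$$}"
    f="/proc/$pid/autogroup"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "no autogroup information for pid $pid" >&2
        return 1
    fi
}
```

Running show_autogroup once from the interactive shell and once from inside the cron job would show whether the two land in different autogroups. According to sched(7), a process can also change the nice value of its own autogroup by writing a number to /proc/self/autogroup, which may be worth trying at the start of the cron job.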

Startup performance in Jenkins

I am having some performance problems when starting Jenkins inside a Kubernetes cluster.
One of the steps that sometimes takes a long time is the following operation:
INFO: Finished Download metadata. 1,397 ms
In this case it is just 1 second, but sometimes it takes around 40 seconds. I have tried to find this log message in Jenkins core but have not found it, so I suspect it comes from some plugin. My question is: where is this happening, what is it doing, and why is it required?
Thanks.
Feb 10, 2018 2:04:22 PM hudson.model.AsyncPeriodicWork$1 run
INFO: Started Download metadata
Feb 10, 2018 2:04:22 PM hudson.model.AsyncPeriodicWork$1 run
INFO: Finished Download metadata. 4 ms
I believe you are referring to logs like the ones above. If so, these are log-rotation strategy logs; the work gets executed through the AsyncPeriodicWork class and is configured in Jenkins specifically for discarding old builds.
The following image shows the configuration in the Jenkins UI.
You can configure this according to your project requirements if you feel it is impacting your startup time.
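To see which periodic tasks are actually slow, the "Finished ... ms" lines can be pulled out of the log and sorted by duration. The sketch below assumes the log lines have exactly the shape quoted in this thread ("INFO: Finished <task>. <n> ms"); the function name is made up for illustration, and where the log comes from (a file, or the pod's output) is left to the reader.

```shell
# Hypothetical helper: read Jenkins log lines on stdin and print
# "<milliseconds> <task name>" for every "INFO: Finished <task>. <n> ms" line.
finished_task_durations() {
    awk '/^INFO: Finished .*\. [0-9,]+ ms$/ {
        ms = $(NF - 1)
        gsub(/,/, "", ms)                 # "1,397" becomes "1397"
        name = $0
        sub(/^INFO: Finished /, "", name) # strip the prefix
        sub(/\. [0-9,]+ ms$/, "", name)   # strip the duration suffix
        print ms, name
    }'
}
```

Piping the result through sort -rn lists the slowest tasks first.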

"Error response from daemon: 404 page not found" While using docker command

While I'm using the docker command I'm getting the error below. Does anyone have a solution for this? Please help me sort out this issue.
akshath@akshu:~$ docker images
Error response from daemon: 404 page not found
akshath@akshu:~$ docker version
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5
Built: Fri Nov 20 13:16:54 UTC 2015
OS/Arch: linux/amd64
Error response from daemon: 404 page not found
As mentioned in "Docker daemon answers '404 page not found' after update", check if you have any PROXY defined (HTTP_PROXY, HTTPS_PROXY) in your current environment (env|grep -i proxy)
It is referenced in issue 109.
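The proxy check can be wrapped in a small function so that it only matches variable names rather than any value containing the word "proxy". This is a sketch; the function name is made up for illustration.

```shell
# List proxy-related variables (HTTP_PROXY, https_proxy, NO_PROXY, ...) in the
# current environment, matching names case-insensitively.
# Returns non-zero when none are set.
list_proxy_vars() {
    env | grep -i '^[a-z_]*proxy='
}
```

If it prints anything, one way to test the theory is to run the failing command once with those variables removed, for example env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy docker images.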
Also issue 17960 reports the same problem, and includes:
sudo mv /var/lib/docker/network/files/ /path/to/backup/docker-network-files
solved the problem.
(If everything goes well, /path/to/backup/docker-network-files can be deleted)
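A slightly more careful version of that move, which keeps a timestamped copy so that it can be restored if it turns out not to help, might look like the sketch below. The function name and the backup layout are assumptions for illustration; stopping the Docker daemon before moving anything is also an assumption, not something the linked issue states.

```shell
# Hypothetical helper: move a directory aside into a backup location,
# adding a timestamp to the name, and print the new path.
move_aside() {
    src="$1"
    backup_root="$2"
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
    mkdir -p "$backup_root" || return 1
    dest="$backup_root/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    mv "$src" "$dest" && printf '%s\n' "$dest"
}
```

Run as root, move_aside /var/lib/docker/network/files /path/to/backup would do the same job as the sudo mv command above.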
If that is not enough, check systemctl status docker.service or the logs to find the real cause.
If this is still not working:
uninstalling and reinstalling docker can help
after uninstalling, make sure to move the /var/lib/docker/network folder away, to have a fresh start.
Exact Answer:
issue 17083: Moving /var/lib/docker away solved the problem. Or: "I removed only /var/lib/docker/network and now everything works well and without containers lost."
I had this exact same problem on Ubuntu 14.04. Changing the folder or removing it did nothing for me, BUT because Ubuntu never updates its repos, my version of docker was waaaaaay out of date. I was running 1.6 when the latest was 1.11.
Follow these instructions to update Docker on Ubuntu: https://docs.docker.com/engine/installation/linux/ubuntulinux/
and try again. This fixed my issue!

starting openhab after beagle bone reboot

I am following this tutorial to start OpenHAB after a BeagleBone reboot:
http://tuxtec.blogspot.fr/2013/11/installing-openhab-on-beaglebone-black.html
(§4. Autostart OpenHAB)
but it is not working; I get the following error:
root@beaglebone:~# systemctl status openhab.service
openhab.service - OpenHAB
Loaded: loaded (/lib/systemd/system/openhab.service; enabled)
Active: activating (auto-restart) (Result: exit-code) since Fri, 29 May 2015 12:26:39 +0200; 17s ago
Process: 1812 ExecStart=/usr/local/OpenHab1.7/start.sh (code=exited, status=127)
CGroup: name=systemd:/system/openhab.service
My BeagleBone operating system is "Debian GNU/Linux 7 (wheezy)".
Any ideas why?
Thanks in advance for your help!
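A note on the status=127 in the output above: 127 is the exit status a shell conventionally returns when a command cannot be found, so one thing worth checking is whether everything start.sh calls is on the PATH that systemd gives the service. The sketch below is a generic check; which commands start.sh actually needs is not shown in the question, so java in the usage note is only a guess.

```shell
# Print every command from the argument list that cannot be found on the
# current PATH. Returns non-zero if at least one is missing.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return "$status"
}
```

For example, missing_commands java prints java if it is not reachable. Running the check from a shell started with a minimal PATH gives a better approximation of the service environment than an interactive login shell does.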
OpenHAB 1.7 (current as of this writing) has an apt-repo so that you can easily install the OpenHAB code and addons as one would install all software on Debian. Please follow these instructions for Beaglebone Black, which continue with the generic instructions for all Linux and OSX installations. When new stable releases come out, you will only have to
sudo apt-get update
sudo apt-get upgrade
in order to stay current.
