How to restart Jenkins automatically after it crashes or closes? - jenkins

I am looking for a way to restart the Jenkins service automatically after it fails or crashes. Is there any way to do this?
Thanks.

You can use a bash script in crontab and periodically check:
service jenkins status   # if installed as a service
If the return value is anything other than the expected one, then run:
service jenkins restart
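A minimal sketch of such a check, assuming Jenkins was installed as a service named jenkins and the script is saved as, say, /usr/local/bin/check-jenkins.sh (both names are assumptions):
#!/bin/bash
# Restart Jenkins if the service does not report itself as running.
if ! service jenkins status > /dev/null 2>&1; then
    service jenkins restart
fi
A crontab entry can then run the check every five minutes, for example:
*/5 * * * * /usr/local/bin/check-jenkins.sh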
But in my opinion you should try to find the reason for the crashes. I have used Jenkins for years and it has never crashed.

Related

How to identify commands run by Ansible on a remote host in a Falco context?

I would like to know if someone has an idea about how to identify commands run by Ansible on a remote host.
To give you more context, I'm going to describe my workflow in depth:
I have a job scheduled between 1 am and 6 am which runs a compliance Ansible playbook to ensure the production servers' configuration is up to date and well configured; however, this playbook changes some files inside the /etc folder.
Besides this, I have a Falco stack which keeps an eye on what is going on on the production servers and raises alerts when an event that I have described as suspicious is found (it can be a syscall, a network connection, sensitive file editing such as /etc/passwd or pam.conf, etc.).
The problem I'm running into is that my playbook triggers some alerts, for example:
Warning Sensitive file opened for reading by non-trusted program (user=XXXX user_loginuid=XXX program=python3 command=python3 file=/etc/shadow parent=sh gparent=sudo ggparent=sh gggparent=sshd container_id=host image=<NA>)
My question is: can we set a "flag or prefix" on all Ansible commands, which would allow me to whitelist this flag or prefix and avoid triggering my alerts for nothing?
PS: whitelisting python3 for the user root is not a solution, in my opinion.
Ansible is a Python tool, so the process accessing the file will be python3. The commands that Ansible executes are based on the steps in the playbook.
You can solve your problem by modifying the Falco rules. You can evaluate proc.pcmdline in the Falco rule and the chain of proc.aname to identify that the command was executed by the Ansible process (e.g. the process is python3, the parent is sh, the grandparent is sudo, etc.).
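Falco rules are YAML, so the check has to go into the rules file. A minimal sketch based on the process ancestry in the alert above; the macro name ansible_via_sudo_ssh is illustrative, and the exact rule you attach it to depends on your ruleset version:
- macro: ansible_via_sudo_ssh
  condition: >
    (proc.name = python3 and proc.pname = sh and
     proc.aname[2] = sudo and proc.aname[4] = sshd)
You would then add "and not ansible_via_sudo_ssh" to the condition of the sensitive-file rule that is firing.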

jenkins build process running forever

I want to stop the Jenkins process after the build and start the server,
but it's running forever like this...
What should I do?
The application you are starting does not seem to have an end, because it seems you are running a Spring Boot application. After the successful build you start it right away inside the Jenkins job, so the job will never terminate, because the application itself does not terminate.
So I think you want to deploy the application somewhere and let it run, maybe on the target VM?
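If the intent is really just to launch the application from the job and let the build finish, one common workaround is to start it detached and tell Jenkins' ProcessTreeKiller not to reap it when the build ends. A minimal sketch for the Execute Shell step, where the jar path is an assumption:
# BUILD_ID=dontKillMe (JENKINS_NODE_COOKIE=dontKillMe for pipeline jobs) stops
# the ProcessTreeKiller from killing the detached process after the build.
BUILD_ID=dontKillMe nohup java -jar target/myapp.jar > app.log 2>&1 &
A proper deployment to a target VM or container platform is still the cleaner long-term option, as suggested above.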

ssh-agent issues when running on heroku

I have a Rails app, using Docker, that makes some automated changes to another app and then git pushes the changes up to GitHub. It took me a bit of time to be able to get my SSH keys onto the Docker container in a somewhat similar manner (not fully happy with it, but I will change it up after I sort this out). My issue now is that the git clones in the Dockerfile work fine, but then from my Rails code it fails, saying that I don't have access, so in the code I go to re-run ssh-add for the keys. However, it then says Could not open a connection to your authentication agent., so I try to re-initialise the ssh-agent (echo $(ssh-agent -s)), which seems to succeed, but ssh-add still fails.
If I SSH in and try those steps, it works fine, but if I go in with rails console and run the functions that make these console calls, it fails with the same problem. It seems the environment variables from the ssh-agent call aren't being set. I have a feeling that Heroku containers don't allow changing the env variables without going through heroku config:set, but this isn't possible as each process will have a different SSH_AUTH_SOCK and SSH_AGENT_PID. Any suggestions on how to deal with this would be a massive help.
This error normally happens when you don't have an active SSH agent running:
Could not open a connection to your authentication agent.
This is quite common on Debian-based systems, whereas most Ubuntu installations have one running at all times.
To fix this, you just need to start a new agent.
eval $(ssh-agent)
This should be run before ssh-add.
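A minimal sketch of the full sequence, assuming the key lives at ~/.ssh/id_rsa (the path is an assumption). Note that eval, not echo, is what actually exports SSH_AUTH_SOCK and SSH_AGENT_PID into the current shell, which is why echo $(ssh-agent -s) did not help:
# Start an agent and export its variables into this shell, then add the key.
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
Both lines must run in the same shell process as the later git push, otherwise the agent's environment is lost again.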
In your current setup, you need to evaluate the risk/cost of using a passphrase-protected private SSH key.
As mentioned here, for an automated process, using a passphrase-less key would be the recommended option, provided you are sure there is no easy way to access said private key.
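If you go that route, a passphrase-less deploy key can be generated with something like the following (the file name and comment are assumptions), after which ssh-agent and ssh-add are no longer needed at all:
# Generate a key with an empty passphrase for unattended use.
ssh-keygen -t ed25519 -N "" -f ~/.ssh/deploy_key -C "automated deploy key"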

jenkins kills ssh session when supervisord restarts

I'm using Jenkins to do a few actions on a remote server.
I have an Execute Shell command in which I do the following:
sudo ssh <remote server> 'sudo service supervisor restart'
sleep 30
When Jenkins reaches the first line I can see 'Restarting Supervisor', but after a moment I see that Jenkins closes the SSH connection and moves on to the second line.
I tried adding a 'sleep 30' after the restart command, but it still doesn't work.
It seems Jenkins doesn't wait for the supervisor restart command to complete.
The problem is that it doesn't always happen, just sometimes, but it causes a lot of problems when it fails.
I think you can never be certain that all processes started by supervisord are in a 'ready' state after a restart. Even if the restart action waited for the processes to be started, it wouldn't know whether they are 'ready'.
In docker-compose setups that need to know if a certain service is available, I've used an extra 'really ready' check for this - optionally in a loop with a sleep/wait. If the process that you are starting opens a port, you can use one of the variations of 'wait-for' for this.
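A rough sketch of such a check in the Execute Shell step, assuming the service managed by supervisord listens on port 8080 on the remote host (both the port and the nc probe are assumptions):
sudo ssh <remote server> 'sudo service supervisor restart'
# Poll until the service actually answers instead of sleeping blindly.
for i in $(seq 1 30); do
    if nc -z <remote server> 8080; then
        echo "service is ready"
        break
    fi
    sleep 2
done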

Jenkins Project Not Running Correctly When Slave is Connected via Windows Service?

I have a Jenkins project that runs automated tests on a slave machine. However, when I set the connection to the slave node up as a Windows Service, and run the project on that connection, the build itself will "succeed" (sometimes) but my tests will not run correctly. When the build does succeed, the console output looks like everything went fine; I know it isn't how it should be though, because the Selenium web browser never runs on the slave machine during the execution when it's done through the Service connection. At one point I thought it might be because installing the slave-agent as a Service puts all of the associated files in the same directory that the slave node is based in by default, but when I changed the path to the executable for the Service and moved all of the files, it would still connect, and the project still wouldn't run as it should.
As soon as I delete the Service, and launch a connection manually from my slave machine, everything goes through as expected.
Does anyone know why this might be happening? Or, if not, do you know of an alternative to connecting at startup? Thanks in advance for your advice/ideas.
Just moving my comment to an answer so you can accept it as you indicated this resolved the problem, and should make it easier for others to follow.
Have you set permissions properly? The slave task runs with the local account which may not have access to the paths or tools you are trying to use. As a service in the background, you may also need to allow the service to interact with the desktop.
The service will not show up on the computer running the tests unless you enable the check box to allow the service to interact with the desktop.
For anyone else who may be having this problem, I wanted to post the solution I ended up using (I'm not accepting it as the answer to this particular question because it's a work-around; however, @StevenScott has the answer to making this work as a Service in his comment above).
I nixed the Service I created and made a Scheduled Task that uses a batch script to launch the JNLP file instead. There is a command on the slave node's page in Jenkins that has a command-line option, but this did not work for me in the batch file; rather, I simply wrote it to cd to a directory that already contained a copy of the slave-agent.jnlp and just ran it from there (a sketch of such a batch script is shown after the task settings below).
For this to work, you will need to disable the pop-up that appears when you run the slave-agent (the one that asks if you want to run the program).
The settings for the task should include the following:
General Settings: "Run only when user is logged on"
Triggers: "At log on" (specify your user account)
Actions: "Start a program" (specify the location of your batch script)
