Jenkins with Publish over SSH plugin, -1 exit status

I use Jenkins to build my project and the Publish over SSH plugin to deploy my artifacts to a server. After deploying the files I stop the service by calling exec in the plugin:
sudo service myservice stop
and I receive answer from Publish over SSH:
SSH: EXEC: channel open
SSH: EXEC: STDOUT/STDERR from command [sudo service myservice stop]...
SSH: EXEC: connected
Stopping script myservice
SSH: EXEC: completed after 200 ms
SSH: Disconnecting configuration [172.29.19.2] ...
ERROR: Exception when publishing, exception message [Exec exit status not zero. Status [-1]]
Build step 'Send build artifacts over SSH' changed build result to UNSTABLE
Finished: UNSTABLE
The build fails, but the service is stopped.
My /etc/init.d/myservice:
#! /bin/sh
# /etc/init.d/myservice
#
# Some things that run always
# touch /var/lock/myservice

# Carry out specific functions when asked to by the system
case "$1" in
  start)
    echo "Starting myservice"
    setsid /opt/myservice/bin/myservice --spring.config.location=/etc/ezd/application.properties --server.port=8082 >> /opt/myservice/app.log &
    ;;
  stop)
    echo "Stopping script myservice"
    pkill -f myservice
    ;;
  *)
    echo "Usage: /etc/init.d/myservice {start|stop}"
    exit 1
    ;;
esac

exit 0
Can someone tell me why I get the -1 exit status?

Well, the script is called /etc/init.d/myservice, so it matches the myservice pattern given to pkill -f. And because the script is waiting for the pkill to complete, it is still alive and gets killed and returns -1 for that reason (there is also the killing signal in the result of wait, but the Jenkins slave daemon isn't printing it).
Either:
come up with a more specific pattern for pkill,
use a proper pid-file, or
switch to systemd, which can reliably kill exactly the process it started.
In this day and age I recommend the last option. Systemd is simply a lot more reliable than init scripts.
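For the systemd route, a minimal unit might look roughly like the sketch below. The ExecStart line reuses the binary path and flags from the init script above; the unit name and the remaining settings are illustrative assumptions, not tested configuration.
# /etc/systemd/system/myservice.service -- illustrative sketch only
[Unit]
Description=myservice
After=network.target

[Service]
# Run the same command the init script runs, but in the foreground:
# systemd tracks the process itself, so no setsid, no & and no pkill on stop.
# Output goes to the journal instead of /opt/myservice/app.log.
ExecStart=/opt/myservice/bin/myservice --spring.config.location=/etc/ezd/application.properties --server.port=8082
Restart=on-failure

[Install]
WantedBy=multi-user.target
With a unit like this, sudo systemctl stop myservice signals only the processes systemd started, so the SSH exec step from Jenkins can no longer be caught by its own kill pattern.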

Yes, Jan Hudec is right. I ran ps ax | grep myservice in the Publish over SSH plugin:
83469 pts/5 Ss+ 0:00 bash -c ps ax | grep myservice service myservice stop
So pkill -f myservice also matches the process with PID 83469, which is the parent of pkill itself. As I understand it, this is the cause of the -1 status.
I changed pkill -f myservice to pkill -f "java.*myservice" and this solved my problem.
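As a side note (not from the original post), pgrep is a convenient way to see what a pattern would match before handing it to pkill:
# List PID and full command line of everything each pattern matches
pgrep -af myservice          # also matches the init script and the shell running "service myservice stop"
pgrep -af "java.*myservice"  # matches only the java process we actually want to stop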

Related

Testing minimal docker containers with healthcheck

I have 5 containers running one after another. The first 3 (ABC) are very minimal. The ABC containers need to be health checked, but curl and wget cannot be run on them, so currently I just run test: [CMD-SHELL], "whoami || exit 1" in docker-compose.yml, which seems to bring them to a healthy state. The other 2 (DE), which depend on ABC being healthy, are checked using the test: [CMD-SHELL], "curl --fail http://localhost" command. My question is: how can I properly check the health of those minimal containers, without using curl, wget etc.?
If you can live with a TCP connection test to your internal service's port, you could use /dev/tcp for this:
HEALTHCHECK CMD bash -c 'echo -n > /dev/tcp/127.0.0.1/<port>'
Like this:
# PASS (webserver is serving on 8099)
root@ab7470ea0c8b:/app# echo -n > /dev/tcp/127.0.0.1/8099
root@ab7470ea0c8b:/app# echo $?
0
# FAIL (webserver is NOT serving on 9000)
root@ab7470ea0c8b:/app# echo -n > /dev/tcp/127.0.0.1/9000
bash: connect: Connection refused
bash: /dev/tcp/127.0.0.1/9000: Connection refused
root@ab7470ea0c8b:/app# echo $?
1
Unfortunately, I think this is the best that can be done without installing curl or wget.
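For reference, in docker-compose form (which the question uses) the same idea might look roughly like this; the service name and port 8099 are placeholders taken from the example above, and it still assumes bash is present in the image:
services:
  abc:
    image: some-minimal-image
    healthcheck:
      # Plain TCP connect test via bash's /dev/tcp -- no curl or wget needed
      test: ["CMD-SHELL", "bash -c 'echo -n > /dev/tcp/127.0.0.1/8099' || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3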

Container Optimized OS Graceful Shutdown of Celery

Running COS on GCE
Any ideas on how to get COS to do a graceful Docker shutdown?
My innermost process is celery, which says it wants a SIGTERM to stop gracefully:
http://docs.celeryproject.org/en/latest/userguide/workers.html#stopping-the-worker
My entrypoint is something like
exec celery -A some_app worker -c some_concurrency
On COS I am running my Docker container as a service, something like:
write_files:
- path: /etc/systemd/system/servicename.service
  permissions: 0644
  owner: root
  content: |
    [Unit]
    Description=Some service
    [Service]
    Environment="HOME=/home/some_home"
    RestartSec=10
    Restart=always
    ExecStartPre=/usr/share/google/dockercfg_update.sh
    ExecStart=/usr/bin/docker run -u 2000 --name=somename --restart always some_image param_1 param_2
    ExecStopPost=/usr/bin/docker stop servicename
    KillMode=processes
    KillSignal=SIGTERM
But ultimately when my COS instance is shut down, it just yanks the plug.
Do I need to add a shutdown script to do a docker stop? Do I need to do something more advanced?
What is the expected exit status of your container process when it receives SIGTERM?
Running systemctl stop <service> then systemctl status -l <service> should show the exit code of the main process. Example:
Main PID: 21799 (code=exited, status=143)
One possibility is that the process does receive SIGTERM and shuts down gracefully, but returns a non-zero exit code.
This would make systemd believe that it didn't shut down correctly. If that is the case, adding
SuccessExitStatus=143
to your systemd service should help. (Replace 143 with the actual exit code of your main process.)
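For reference, 143 is the usual 128 + 15 code of a process that exited on SIGTERM (signal 15). In the unit from the question, the line would go into the [Service] section, roughly:
[Service]
# ...existing settings from the unit above...
# Treat "terminated by SIGTERM" (exit code 128 + 15 = 143) as a successful stop
SuccessExitStatus=143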

Jenkins Build With Parameters Only Taking First Parameter

I set up a Jenkins build that uses the "Publish over SSH" plugin to remotely execute an ansible script, injecting the variables into the call to ansible-playbook.
The command that Jenkins will remotely execute:
ansible-playbook /home/username/test/test.yml --extra-vars "ui_version=$UI_VERSION web_version=$WEB_VERSION git_release=$GIT_RELEASE release_environment=$RELEASE_ENVIRONMENT"
Which is triggered by the following curl:
curl -k --user username:secretPassword -v -X POST https://jenkins/job/Ansible_Test/buildWithParameters?UI_VERSION=abc&WEB_VERSION=def&GIT_RELEASE=ghi&RELEASE_ENVIRONMENT=jkl
Which should be utilizing the variables UI_VERSION, WEB_VERSION, GIT_RELEASE and RELEASE_ENVIRONMENT.
My problem: only the first parameter gets injected, as you can see on the longest line of the Jenkins console output below:
...
SSH: EXEC: completed after 201 ms
SSH: Opening exec channel ...
SSH: EXEC: channel open
SSH: EXEC: STDOUT/STDERR from command [ansible-playbook /home/dholt2/test/test.yml --extra-vars "ui_version=abc web_version= git_release= release_environment="] ...
SSH: EXEC: connected
...
It turns out that the shell was interpreting the & after the first parameter (treating everything after it as a separate background command), as mentioned here. Quoting the URL resulted in a successful transmission and variable injection.
I should've known it was the cause when the command waited for more input.
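In other words, the fix is just to quote the URL so the shell no longer interprets the & characters; the corrected call from above would be:
curl -k --user username:secretPassword -v -X POST \
  "https://jenkins/job/Ansible_Test/buildWithParameters?UI_VERSION=abc&WEB_VERSION=def&GIT_RELEASE=ghi&RELEASE_ENVIRONMENT=jkl"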

Starting a service during Docker build

I would like to start a service during the Docker build. I do not necessarily need this service to continue running after the build process has finished (I know I can use the CMD instruction for that), but I do need it running long enough to execute a second command which relies on this service being up and running.
To be more precise I am trying to write a Dockerfile for the ejabberd XMPP Server, which also installs a module for this server. I am trying to start the ejabberd server with ejabberdctl start and then install the module with the ejabberdctl module_install utility, which depends on the node being up and running. It looks like this:
RUN ejabberdctl start && ejabberdctl modules_update_specs && ejabberdctl module_install ejabberd_auth_http
Now I have run into a problem, and I came up with two possible causes. The problem is that my build does not work from this line on, because the node is down when the second command is trying to execute. I get the following error, which is a typical one when you try to use the ejabberdctl utility, without the node actually being up:
Failed RPC connection to the node ejabberd@localhost
The command '/bin/sh -c ejabberdctl start && ejabberdctl modules_update_specs && ejabberdctl module_install ejabberd_auth_http' returned a non-zero code: 3
This could be either because starting the service takes a little longer than it takes for the second command to get executed, so the second command hits a node which is still starting up (not sure how likely this is), or because starting a service which depends on init.d just doesn't work in Docker during the build process.
I built the image up to the line that causes the problem, entered the container, executed the commands manually, and everything worked as it should.
So to summarize: I would like to start the ejabberd server during the build and then use its control utility to install some stuff. A last option would be to install the module manually without the server running, but I would prefer doing it with the ejabberdctl control utility.
These *ctl programs usually come with a few utilities to start, stop and monitor the status of the service.
In your case I think the best idea is to have a simple bash script you can run at build time that does this:
start ejabberd
monitor the status at intervals
if the status of the process is up, run your command
Look at this:
root@158479dec020:/# ejabberdctl status
Failed RPC connection to the node ejabberd@158479dec020: nodedown
root@158479dec020:/# echo $?
3
root@158479dec020:/# ejabberdctl start
root@158479dec020:/# echo $?
0
root@158479dec020:/# ejabberdctl status
The node ejabberd@158479dec020 is started with status: started
ejabberd 16.01 is running in that node
root@158479dec020:/# echo $?
0
root@158479dec020:/# ejabberdctl stop
root@158479dec020:/# echo $?
0
root@158479dec020:/# ejabberdctl status
Failed RPC connection to the node ejabberd@158479dec020: nodedown
root@158479dec020:/# echo $?
3
So this tells us that if you run ejabberdctl status and the daemon is not running, you receive exit code 3, and 0 if it's up and running instead.
There you go with your bash script:
#!/bin/bash
function run() {
  ejabberdctl start            # Repeating just in case...
  ejabberdctl status &>/dev/null
  if [ $? -eq 0 ]; then
    echo "Do some magic here, ejabberd is running..."
    exit 0
  fi
  echo "Ejabberd still down..."
}

# Retry once a second until the status check succeeds
while true; do run; sleep 1; done
And this is what you'd get at the CLI:
root@158479dec020:/# ./check.sh
Ejabberd still down...
Do some magic here, ejabberd is running...
root@158479dec020:/# ejabberdctl stop
root@158479dec020:/# ./check.sh
Ejabberd still down...
Ejabberd still down...
Do some magic here, ejabberd is running...
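To tie this back to the Dockerfile, one possible wiring (the script name check.sh and its location are assumptions on my part) is to copy the script into the image and let it gate the module installation:
# Copy the wait script into the image (name and path are illustrative)
COPY check.sh /check.sh
RUN chmod +x /check.sh

# Start the node, wait until ejabberdctl status reports it up, then install the module
RUN ejabberdctl start \
    && /check.sh \
    && ejabberdctl modules_update_specs \
    && ejabberdctl module_install ejabberd_auth_http
Note that the loop as written retries forever if the node never comes up, so you may want to cap the number of attempts.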

How to know if my program is completely started inside my docker with compose

In my CI chain I execute end-to-end tests after a "docker-compose up". Unfortunately my tests often fail because, even if the containers are properly started, the programs contained in my containers are not.
Is there an elegant way to verify that my setup is completely started before running my tests?
You could poll the required services to confirm they are responding before running the tests.
curl has inbuilt retry logic or it's fairly trivial to build retry logic around some other type of service test.
#!/bin/bash
await(){
  local url=${1}
  local seconds=${2:-30}
  curl --max-time 5 --retry 60 --retry-delay 1 \
    --retry-max-time ${seconds} "${url}" \
    || exit 1
}
docker-compose up -d
await http://container_ms1:3000
await http://container_ms2:3000
run-ze-tests
The alternative to polling is an event-based system.
If all your services push notifications to an external service (scaeda gave the example of a log file, but you could also use something like Amazon SNS), each service can emit a "started" event. You can then subscribe to those events and run whatever you need once everything has started.
Docker 1.12 did add the HEALTHCHECK build command. Maybe this is available via Docker Events?
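The health state does surface through the engine: once a container defines a HEALTHCHECK, its state can be polled with docker inspect or watched via docker events. A rough sketch (my_container is a placeholder):
# Wait until the engine reports the container's HEALTHCHECK as "healthy"
until [ "$(docker inspect --format '{{.State.Health.Status}}' my_container)" = "healthy" ]; do
  sleep 1
done

# Or watch health transitions as they happen
docker events --filter 'event=health_status'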
If you have control over the docker engine in your CI setup you could execute docker logs [Container_Name] and read out the last line which could be emitted by your application.
RESULT=$(docker logs [Container_Name] 2>&1 | grep [Search_String])
logs output example:
Agent pid 13
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
#host SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6
parse specific line:
RESULT=$(docker logs ssh_jenkins_test 2>&1 | grep Enter)
result:
Enter passphrase (empty for no passphrase): Enter same passphrase again: Identity added: id_rsa (id_rsa)
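Building on that, a small retry loop around the same kind of grep can serve as a readiness gate before the tests start; the search string "ready" is just a placeholder for whatever line your application prints once it is up:
# Poll the container logs until the application prints its readiness line
until docker logs ssh_jenkins_test 2>&1 | grep -q "ready"; do
  sleep 1
done
echo "Application is up, starting the tests"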
