I've set up monit to monitor my sunspot_solr process, and it seems to work at first. If I restart the monit service with sudo service monit restart, my sunspot process starts:
ps aux | grep sunspot
root 4086 0.0 0.0 9940 1820 ? Ss 12:41 0:00 bash ./solr start -f -s /ebs/staging/shared/bundle/ruby/2.3.0/gems/sunspot_solr-2.2.4/solr/solr
root 4137 45.1 4.8 1480560 185632 ? Sl 12:41 0:09 java -server -Xss256k -Xms512m -Xmx512m -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:CMSFullGCsBeforeCompaction=1 -XX:CMSTriggerPermRatio=80 -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/ebs/staging/shared/bundle/ruby/2.3.0/gems/sunspot_solr-2.2.4/solr/server/logs/solr_gc.log -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC -Djetty.home=/ebs/staging/shared/bundle/ruby/2.3.0/gems/sunspot_solr-2.2.4/solr/server -Dsolr.solr.home=/ebs/staging/shared/bundle/ruby/2.3.0/gems/sunspot_solr-2.2.4/solr/solr -Dsolr.install.dir=/ebs/staging/shared/bundle/ruby/2.3.0/gems/sunspot_solr-2.2.4/solr -jar start.jar --module=http
ubuntu 4192 0.0 0.0 10460 936 pts/3 S+ 12:41 0:00 grep --color=auto sunspot
However, I'm also running tail -f /var/logs/monit.log and see this at the same time:
[CST Mar 3 12:42:54] error : 'sunspot_solr' process is not running
[CST Mar 3 12:42:54] info : 'sunspot_solr' trying to restart
[CST Mar 3 12:42:54] info : 'sunspot_solr' start: /usr/bin/sudo
[CST Mar 3 12:43:25] error : 'sunspot_solr' failed to start
Plus, to make sure monit can actually restart the sunspot_solr process, I run sudo kill -9 <the pid> and monit can't restart sunspot_solr:
[CST Mar 3 12:44:25] error : 'sunspot_solr' process is not running
[CST Mar 3 12:44:25] info : 'sunspot_solr' trying to restart
[CST Mar 3 12:44:25] info : 'sunspot_solr' start: /usr/bin/sudo
[CST Mar 3 12:44:55] error : 'sunspot_solr' failed to start
Obviously something is wrong with my monit-solr_sunspot.conf file, but after messing around with it for a few hours now, I'm stumped:
check process sunspot_solr with pidfile /ebs/staging/shared/pids/sunspot-solr.pid
start program = "/usr/bin/sudo -H -u root /bin/bash -l -c 'cd /ebs/staging/releases/20160226191542; bundle exec sunspot-solr start -- -p 8983 -d /ebs/staging/shared/solr/data --pid-dir=/ebs/staging/shared/pids'"
stop program = "/usr/bin/sudo -H -u root /bin/bash -l -c 'cd /ebs/staging/releases/20160226191542; bundle exec sunspot-solr stop -- -p 8983 -d /ebs/staging/shared/solr/data --pid-dir=/ebs/staging/shared/pids'"
I've adapted this monit script to suit my needs: Sample sunspot-solr.monit but am still having no luck!
UPDATE
I've gotten monit to successfully restart sunspot_solr if I kill it; however, it still logs in monit.log that it failed to restart.
I think monit runs as root. You may not want to use sudo both because it prompts for a password and because monit doesn't need it.
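Since the check process rule is keyed on a pidfile, one thing worth ruling out when monit logs "failed to start" for a process that is actually up is a mismatch between the pidfile monit watches and the one the process writes. The helper below is a minimal sketch for checking that by hand. `pid_alive` is a hypothetical name, not part of monit or sunspot, and it only tests that the file exists and names a live process.

```shell
#!/bin/sh
# pid_alive FILE: succeed if FILE exists and holds the PID of a running process.
# Hypothetical debugging helper; run it as a user allowed to signal the
# process (e.g. root), otherwise kill -0 fails even for a live PID.
pid_alive() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1      # no pidfile at all
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1          # empty pidfile
    kill -0 "$pid" 2>/dev/null         # signal 0 only tests for existence
}

# The path is the one named in the check process line above.
if pid_alive /ebs/staging/shared/pids/sunspot-solr.pid; then
    echo "pidfile points at a live process"
else
    echo "pidfile missing, empty, or stale"
fi
```

If this reports a missing or stale pidfile while ps shows the java process running, the start command is probably writing its pidfile somewhere other than where monit looks.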
Related
While trying to deploy a Rails app to a server, I ran into a problem: 'thin' doesn't start when I try to start it with cap production deploy:start. What is really strange is that it doesn't report any errors.
After this I tried to start it directly on the deployment server:
env RAILS_ENV=production bundle exec thin start -C config/thin.yml
Starting server on /home/deployer/app/current/tmp/sockets/thin.0.sock ...
Starting server on /home/deployer/app/current/tmp/sockets/thin.1.sock ...
ls /home/deployer/app/current/tmp/sockets/
ps -aux | grep thin
root 16769 0.0 0.1 15468 908 pts/0 S 11:34 0:00 grep --color=auto thin
thin.yml
chdir: /home/deployer/app/current
environment: production
timeout: 30
log: /home/deployer/app/current/log/thin.log
pid: /home/deployer/app/current/tmp/pids/thin.pid
socket: /home/deployer/app/current/tmp/sockets/thin.sock
max_conns: 1024
max_persistent_conns: 10
require: []
wait: 30
servers: 2
daemonize: true
What has gone wrong?
production.log contains only migrations.
bundle exec thin start -C config/thin.yml &
returns
Starting server on /home/deployer/app/current/tmp/sockets/thin.0.sock ...
Starting server on /home/deployer/app/current/tmp/sockets/thin.1.sock ...
'bundle exec thin start -C confi…' has ended
Answer
Okay, the answer was in log/thin.0.log: there were some errors in the code.
You need to daemonize thin to run it in production by adding &. Try this:
RAILS_ENV=production bundle exec thin start -C config/thin.yml &
I'm trying to get monit to restart my sidekiq service on a CentOS server. After trying multiple solutions out there, I'm stumped: it still fails to start the service.
My sidekiq file from monit.d:
check process sidekiq
with pidfile /var/www/App/tmp/pids/sidekiq.pid
start program = "/bin/bash -l -c 'sudo cd /var/www/App && bundle exec sidekiq --index 0 --pidfile /var/www/App/tmp/pids/sidekiq.pid --environment production --logfile /var/www/App/log/sidekiq.log --daemon'" as uid deploy and gid deploy
stop program = "/bin/bash -l -c 'cd /var/www/App && bundle exec sidekiqctl stop /var/www/App/tmp/pids/sidekiq.pid 10'" as uid deploy and gid deploy
if totalmem is greater than 512 MB for 2 cycles then restart
if 3 restarts within 5 cycles then timeout
If I run the start program command manually, it starts sidekiq fine, but monit doesn't seem to do anything. It just logs:
[BST Oct 6 11:51:17] error : 'sidekiq' process is not running
[BST Oct 6 11:51:17] info : 'sidekiq' trying to restart
[BST Oct 6 11:51:17] info : 'sidekiq' start: /bin/bash
[BST Oct 6 11:52:47] error : 'sidekiq' failed to start
So monit is including the file fine, but somehow it doesn't manage to start the service from the script.
What could it be? Some kind of permissions issue?
You need to update to the latest Monit version (5.14).
Remove your current monit installation and follow these instructions:
https://rtcamp.com/tutorials/monitoring/monit/
Hope it helps!
PS: Found the solution here: https://bitbucket.org/tildeslash/monit/issues/109/failed-to-stop-always-after-60-seconds
According to Debugging monit,
I found I needed to set PATH.
My start program:
/bin/bash -c 'cd /home/vagrant/apps/skylark/current; PATH=/home/vagrant/.rbenv/shims:/home/vagrant/.rbenv/bin:$PATH bundle exec sidekiq -d -e production -C -P /home/vagrant/apps/skylark/shared/tmp/pids/sidekiq.pid -L /home/vagrant/apps/skylark/shared/log/sidekiq.log'
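The effect of that PATH prefix can be reproduced without monit. The sketch below uses env -i to approximate a stripped environment and a throwaway directory as a stand-in for ~/.rbenv/shims. `fake-bundle` is a made-up command name, and the minimal PATH value is an assumption rather than exactly what monit sets.

```shell
#!/bin/sh
# Hypothetical stand-in for ~/.rbenv/shims: a temp dir holding one executable.
shim_dir=$(mktemp -d)
printf '#!/bin/sh\necho "fake bundle"\n' > "$shim_dir/fake-bundle"
chmod +x "$shim_dir/fake-bundle"

# With only a minimal PATH the command cannot be resolved...
without=$(env -i PATH=/bin:/usr/bin /bin/sh -c 'command -v fake-bundle || echo "not found"')

# ...but prefixing the shim directory makes it resolvable.
with=$(env -i PATH="$shim_dir:/bin:/usr/bin" /bin/sh -c 'command -v fake-bundle || echo "not found"')

echo "without shims: $without"
echo "with shims:    $with"
rm -rf "$shim_dir"
```

Running the real start command under the same kind of stripped environment is a quick way to see whether bundle is the part that cannot be found.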
I think the issue is with your user. You need to execute the command as the deploy user.
check process sidekiq
with pidfile /var/www/App/tmp/pids/sidekiq.pid
start program = "/bin/su - deploy -c 'sudo cd /var/www/App && bundle exec sidekiq --index 0 --pidfile /var/www/App/tmp/pids/sidekiq.pid --environment production --logfile /var/www/App/log/sidekiq.log --daemon'" as uid deploy and gid deploy
stop program = "/bin/su - deploy -c 'cd /var/www/App && bundle exec sidekiqctl stop /var/www/App/tmp/pids/sidekiq.pid 10'" as uid deploy and gid deploy
if totalmem is greater than 512 MB for 2 cycles then restart
if 3 restarts within 5 cycles then timeout
I am using monit for sidekiq.
When I watch the monit log file, it shows this error:
[EDT Jun 18 09:50:11] error : 'sidekiq_site' process is not running
[EDT Jun 18 09:50:11] info : 'sidekiq_site' trying to restart
[EDT Jun 18 09:50:11] info : 'sidekiq_site' start: /bin/bash
[EDT Jun 18 09:51:41] error : 'sidekiq_site' failed to start
/etc/monit/conf.d/sidekiq.conf
check process sidekiq_site
with pidfile /var/www/project/shared/pids/sidekiq.pid
start program = "bash -c 'cd /var/www/project/current ; RAILS_ENV=production bundle exec sidekiq --index 0 --pidfile /var/www/project/shared/pids/sidekiq.pid --environment production --logfile /var/www/project/shared/log/sidekiq.log --daemon'" as uid root and gid root with timeout 90 seconds
stop program = "bash -c 'if [ -d /var/www/project/current ] && [ -f /var/www/project/shared/pids/sidekiq.pid ] && kill -0 `cat /var/www/project/shared/pids/sidekiq.pid`> /dev/null 2>&1; then cd /var/www/project/current && bundle exec sidekiqctl stop /var/www/project/shared/pids/sidekiq.pid 1 ; else echo 'Sidekiq is not running'; fi'" as uid root and gid root
if totalmem is greater than 200 MB for 2 cycles then restart # eating up memory?
group site_sidekiq
/etc/monit/monitrc
set daemon 30
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state
set eventqueue
basedir /var/lib/monit/events
slots 100
set httpd port 2812
allow admin:""
set httpd port 2812 and
use address xx.xxx.xx.xx
allow xx.xx.xx.xx
check system trrm_server
if loadavg(5min) > 2 for 2 cycles then alert
if memory > 75% for 2 cycles then alert
if cpu(user) > 75% for 2 cycles then alert
include /etc/monit/conf.d/*
When monit runs a start or stop program, it does not use your shell's PATH, so every program must be given by its absolute path, even your call to bash.
Monit does not pass your login environment variables on to the program.
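A minimal sketch of what that means in practice, using env -i as an approximation of a supervisor launching a program with an emptied environment (this is not monit itself): a variable exported in your login shell is not seen by the child unless it is passed explicitly.

```shell
#!/bin/sh
# A variable exported in the interactive shell...
RAILS_ENV=production
export RAILS_ENV

# ...is invisible to a child started with an emptied environment (env -i).
# /bin/sh is called by absolute path, so no PATH lookup is needed.
stripped=$(env -i /bin/sh -c 'echo "${RAILS_ENV:-unset}"')

# Passing the value explicitly on the command line makes it available again.
explicit=$(env -i RAILS_ENV=production /bin/sh -c 'echo "${RAILS_ENV:-unset}"')

echo "stripped: $stripped"
echo "explicit: $explicit"
```

This is why the start program lines in these threads set RAILS_ENV and PATH inside the quoted command instead of relying on the deploy user's profile.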
I'm using Rails 3.2.21 with the whenever gem. This is my crontab:
# Begin Whenever generated tasks for: abc
0 * * * * /bin/bash -l -c 'cd /home/deployer/abc/releases/20141201171336 &&
RAILS_ENV=production bundle exec rake backup:perform --silent'
Here's the output when the scheduled job is run:
deployer@localhost:~$ ps aux | grep rake
deployer 25593 0.0 0.0 4448 764 ? Ss 12:00 0:00 /bin/sh -c /bin/bash -l -c
'cd /home/deployer/abc/releases/20141201171336 && RAILS_ENV=production bundle exec rake
backup:perform --silent'
deployer 25594 0.0 0.1 12436 3040 ? S 12:00 0:00 /bin/bash -l -c cd
/home/deployer/abc/releases/20141201171336 && RAILS_ENV=production bundle exec rake
backup:perform --silent
deployer 25631 69.2 4.4 409680 90072 ? Sl 12:00 0:06 ruby /home/deployer/abc/
shared/bundle/ruby/1.9.1/bin/rake backup:perform --silent
deployer 25704 0.0 0.0 11720 2012 pts/0 S+ 12:00 0:00 grep --color=auto rake
Notice that the top 2 processes are very similar. Are they running the same job twice concurrently? How do I prevent that?
deployer 25593 0.0 0.0 4448 764 ? Ss 12:00 0:00 /bin/sh -c /bin/bash …
deployer 25594 0.0 0.1 12436 3040 ? S 12:00 0:00 /bin/bash …
Notice that the top 2 processes are very similar. Are they running the same job twice concurrently?
No, they aren't. The first is a /bin/sh that started the second, the crontab command /bin/bash …. Most probably /bin/sh is just waiting for /bin/bash to terminate and will not run anything again before /bin/bash … has finished; you can verify this with, e.g., strace -p 25593.
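The wrapper chain can be reproduced in isolation. The sketch below starts /bin/bash from /bin/sh the same way cron does and counts how many times the job body actually ran. The echo stands in for the rake task, and `JOB_LOG` is a hypothetical scratch file.

```shell
#!/bin/sh
# Scratch file that records each execution of the job body.
JOB_LOG=$(mktemp)
export JOB_LOG

# Same shape as the crontab entry: /bin/sh -c launches /bin/bash -c,
# and only bash runs the job. Two processes in ps, one job.
/bin/sh -c "/bin/bash -c 'echo ran >> \"\$JOB_LOG\"'"

runs=$(grep -c ran "$JOB_LOG")
echo "job body executed $runs time(s)"
rm -f "$JOB_LOG"
```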
Check your schedule.rb for a duplicate entry; if you find one, remove it and redeploy.
If there is no duplicate entry in schedule.rb, then you need to remove the job from the crontab or comment it out there.
To delete or comment jobs in cron take a look at https://help.1and1.com/hosting-c37630/scripts-and-programming-languages-c85099/cron-jobs-c37727/delete-a-cron-job-a757264.html OR http://www.esrl.noaa.gov/gmd/dv/hats/cats/stations/qnxman/crontab.html
In my schedule:
every 10.minutes do
runner "Model.method"
end
Whenever created this in my crontab:
0,10,20,30,40,50 * * * * /bin/bash -l -c 'cd /home/projects/Monitoring && script/rails runner -e development '\''Model.method'\'''
I tried running the command in my console and it works. Why does it not work automatically? I am going insane!
In my syslog
Mar 11 11:38:01 UbuntuRails CRON[20050]: (ruben) CMD (/bin/bash -l -c 'cd /home/projects/Monitoring && script/rails runner -e development '\''Ping.check_pings'\''')
Mar 11 11:38:01 UbuntuRails CRON[20048]: (CRON) info (No MTA installed, discarding output)
Mar 11 11:38:01 UbuntuRails CRON[20047]: (CRON) error (grandchild #20050 failed with exit status 1)
Mar 11 11:38:01 UbuntuRails CRON[20047]: (CRON) info (No MTA installed, discarding output)
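The "No MTA installed, discarding output" lines mean cron had nowhere to deliver the job's output, so the error message behind exit status 1 was thrown away. One way to see it is to append the command's stdout and stderr to a file; in a crontab that is the same `>> file 2>&1` suffix on the job line. The sketch below uses a simulated failing job as a hypothetical stand-in for the runner command.

```shell
#!/bin/sh
# Stand-in for the failing cron command: writes an error and exits non-zero.
failing_job() {
    /bin/sh -c 'echo "simulated runner error" >&2; exit 1'
}

# Appending stdout and stderr to a file keeps what cron would otherwise discard.
cron_log=$(mktemp)
status=0
failing_job >> "$cron_log" 2>&1 || status=$?

echo "exit status: $status"
cat "$cron_log"
```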
I am on Ubuntu 10.10 and had the same problem.
Turns out the -l option does not load the environment as expected, but -i does. (see this issue)
As the issue thread states, the fix is to edit your schedule.rb and add:
set :job_template, "/bin/bash -i -c ':job'"
Cheers