Thinking Sphinx indexing succeeding on command line, but failing in Cron job - ruby-on-rails

I admit I've cobbled together a mostly working production setup on Ubuntu with Capistrano from the official docs (which seem dated and make a lot of assumptions) and various blog posts of varying outdatedness. Anyway, the last annoying hang-up is that indexing works when I do it by hand (and on deploy, I'm pretty sure), but doesn't work from cron.
Here's my crontab:
$ crontab -l
# m h dom mon dow command
* * * * * cd /var/www/app/current && /usr/local/bin/rake RAILS_ENV=production thinking_sphinx:index >> /var/www/app/current/log/cron.log 2>&1
Here is the log output (this actually appears 3 times per call):
Sphinx cannot be found on your system. You may need to configure the following
settings in your config/sphinx.yml file:
* bin_path
* searchd_binary_name
* indexer_binary_name
For more information, read the documentation:
http://freelancing-god.github.com/ts/en/advanced_config.html
This is when I run the same command by hand (also works when logging):
$ cd /var/www/app/current && /usr/local/bin/rake RAILS_ENV=production thinking_sphinx:index
(in /var/www/app/releases/20100729042739)
Generating Configuration to /var/www/app/releases/20100729042739/config/production.sphinx.conf
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file '/var/www/app/releases/20100729042739/config/production.sphinx.conf'...
indexing index 'app_core'...
collected 5218 docs, 3.9 MB
collected 5218 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 0.7 Mhits, 100.0% done
total 5218 docs, 3898744 bytes
total 0.616 sec, 6328760 bytes/sec, 8470.28 docs/sec
distributed index 'app' can not be directly indexed; skipping.
total 3 reads, 0.008 sec, 1110.2 kb/call avg, 2.6 msec/call avg
total 15 writes, 0.016 sec, 540.4 kb/call avg, 1.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=20101).
Also relevant:
$ which rake
/usr/local/bin/rake
$ which indexer
/usr/local/bin/indexer
The error is somewhat common, but it smells funny that it works fine from the command line, so I suspect something else is wrong. I have 2 other mission-critical cron jobs that run rake tasks that look identical and run fine, and I'm not sure what's different about this one. Any help would be greatly appreciated!
PS: Is there an authoritative deploy config for this with current Capistrano and TS versions? It seems everyone rolls their own, and the official docs seem to be as idiosyncratic as the blog posts out there.

Is the crontab owned by the same user as the one you are logged in as when you run things manually?
Since it seems like a clear PATH issue, and cron runs with a restricted PATH (i.e. not what's in your .profile), try adding this to the top of your crontab file.
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin
Or if you don't want to modify cron's PATH, you could symlink the files you need into /usr/sbin, which is likely in the PATH by default.
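To check whether PATH really is the culprit before editing the crontab, one option is to look up the binary under a stripped-down environment. The following is a minimal sketch, not something from the original answer: the helper name resolves_with_path is made up for illustration, and /usr/bin:/bin is only an assumption about what a default cron PATH often looks like.

```shell
# resolves_with_path CMD PATH_VALUE (hypothetical helper)
# Succeeds only if CMD can be found using PATH_VALUE alone, in an otherwise
# empty environment, which is roughly what a cron job starts with.
resolves_with_path() {
  env -i PATH="$2" /bin/sh -c 'command -v "$1" >/dev/null 2>&1' sh "$1"
}

# Compare an assumed default cron PATH with the extended PATH suggested above.
for candidate in /usr/bin:/bin /bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin; do
  if resolves_with_path indexer "$candidate"; then
    echo "indexer found with PATH=$candidate"
  else
    echo "indexer NOT found with PATH=$candidate"
  fi
done
```

If indexer only shows up with the longer PATH, that matches the symptom in the question, where which indexer reports /usr/local/bin/indexer.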

I can confirm I had a similar problem to @kbighorse: the commands ran fine manually on the command line but did not run from the cron job. I did not receive any errors; the log file only contained the directory the Sphinx command was being run from. Once I added the following PATH variable from @jdl's answer to the top of the crontab file, the cron job ran properly:
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin

Related

Whenever / cron jobs failing, but fine manually

Struggling with cron jobs. Ubuntu 11.10 on the server.
Until recently I had whenever cron jobs running successfully several times a day. Then, due to another problem, I had to remove RVM from the server and go back to Ruby 1.9.3 installed without RVM (I'm sure this has something to do with it).
There is no .rvmrc file in my app.
Now the cron jobs are failing, as I can see from syslog:
Jun 30 08:03:01 ip-10-251-30-96 CRON[18706]: (ubuntu) CMD (/bin/bash -l -c 'cd /var/www/my_app/app/releases/201300629090954 && script/rails runner -e production '\''User.remind_non_confirmed_users'\''')
Jun 30 08:03:01 ip-10-251-30-96 CRON[18705]: (CRON) error (grandchild #18706 failed with exit status 127)
Jun 30 08:03:01 ip-10-251-30-96 CRON[18705]: (CRON) info (No MTA installed, discarding output)
If I run that command manually (with env - /bin/bash -l -c '...'), it runs fine.
I'm going to add "set :output, 'tmp/whenever.log'" to whenever to see what is going on, but I suspect it is an issue with the Ruby version or the PATH.
Any idea how I could diagnose and fix this properly?
This is my cron/whenever job:
3 8 * * * /bin/bash -l -c 'cd /var/www/my_app/app/releases/20130629090954 && script/rails runner -e production '\''User.remind_non_confirmed_users'\'''
many thanks
To help diagnose what's going on, I usually capture the cron output into a separate log file. There's probably an error that's just not being recorded anywhere.
@hourly bash -lc 'cd /path/to/app; RAILS_ENV=production bundle exec rake remind_non_confirmed_users' >> /path/to/app/log/tasks.log 2>&1
Also, I prefer creating rake tasks for cron jobs as opposed to runners. A little easier to invoke via the command line than runners, for me at least.
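Even before any log file exists, the exit status in the syslog excerpt gives a hint: 127 is the conventional shell status for "command not found", which would fit a ruby that is no longer on the PATH after RVM was removed. A small sketch that reproduces the status with a deliberately missing command (the command name and the minimal PATH are placeholders, not taken from the question):

```shell
# Run a command that does not exist, in an almost empty environment,
# and capture its exit status instead of letting it abort the script.
status=0
env -i PATH=/usr/bin:/bin /bin/sh -c 'surely_missing_command_for_demo' 2>/dev/null || status=$?
echo "exit status: $status"
```

Seeing the same status from the real job suggests checking which executable the job (or its shebang line) expects to find.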
I'm still not sure what was going on. Running whenever with 'set :output' should have created log files, but it didn't, and the jobs were still failing (even though the log files were writable).
I got so fed up that I redeveloped the solution without script/runner; instead, cron just calls a URL that then takes care of matters as a delayed job. For our particular situation this has a number of additional benefits, though I know it is not ideal for many.
Thanks for the suggestions.

set up logrotate for a Rails app

I've been searching online for days on how to set up my server to automatically rotate the logs for a Rails app my team recently released. I've gotten myself as far as being able to run sudo logrotate -f /etc/logrotate.conf but of course, who wants to do that all the time?
The contents of the config file for the app's log (I want to add more, but don't see a need to when I can't rotate one file yet):
/path/to/app/production.log {
daily
missingok
rotate 7
compress
delaycompress
notifempty
copytruncate
}
I've verified the /etc/logrotate.conf file contains this line:
include /etc/logrotate.d
But this is the part where I'm not too sure where to go. I've found many different approaches at actually automating the process, but none seem to work. For the record, I've verified the server has the anacron command installed, but I don't know how to configure it for any process of my own. Also, root does not have a crontab on the server yet (we haven't needed it), and I'm unsure if that's better to use than /etc/crontab. In the /etc/crontab file, I've added:
15 0 * * * root cd / && run-parts --report /etc/cron.daily
but I've seen other people use
15 0 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
Is the latter a better option? Why? If so, how do I ensure it works? Again, I don't know how to set up anacron for the task at hand.
Finally, here are the previous contents of the /etc/cron.daily/logrotate file:
/usr/sbin/logrotate /etc/logrotate.conf >/dev/null 2>&1
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
/usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
and after some research, I replaced that with this (which I understand a bit better):
test -x /usr/sbin/logrotate || exit 0
/usr/sbin/logrotate /etc/logrotate.conf
Can someone explain to me what the first config was doing, and which of these two options is better? I'm unsure why I have to force this process just to get it to run. Maybe /etc/crontab doesn't work like I think it does?
Is the latter a better option? Why?
With your cron command, /etc/cron.daily/* will only ever run if the computer is on at 00:15. If you turn it off at night, as some people do, it would never run.
To work around this and run the jobs when the computer starts later in the day, one can use anacron. This is obviously less useful for servers than desktops.
Of course, you don't want to use both at once, since that would run the jobs twice a day. Therefore cron, the more brittle mechanism, yields to anacron by only running the job if anacron is not installed.
This is what Debian and Ubuntu do by default with their test -x /usr/sbin/anacron || prefixes on crontab jobs.
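The test -x ... || ... idiom can be read as "run the fallback only when the preferred tool is not an executable file". A minimal sketch of that logic, using a made-up helper name (run_unless_installed) purely for illustration:

```shell
# run_unless_installed PREFERRED COMMAND [ARGS...] (hypothetical helper)
# Runs COMMAND only if PREFERRED is not an executable file, mirroring
#   test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
run_unless_installed() {
  preferred=$1
  shift
  test -x "$preferred" || "$@"
}

# Prints the message only on machines where anacron is not installed.
run_unless_installed /usr/sbin/anacron echo "anacron missing: cron runs the daily jobs itself"
```

When anacron is present, the helper does nothing and leaves the daily jobs to anacron.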
All server distros will come with logrotate correctly set up, so you shouldn't be modifying the crontab, anacrontab, or /etc/cron.daily/logrotate. The only thing you should do is add a file to /etc/logrotate.d.
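As a rough sketch of that advice, the snippet below writes the configuration from the question to a temporary file and, if logrotate is on the PATH, asks it for a dry run with -d, which reports what would happen without rotating anything. The drop-in name myapp in the final comment is a placeholder.

```shell
# Write the drop-in to a temporary file first so it can be checked safely.
conf=$(mktemp)
cat > "$conf" <<'EOF'
/path/to/app/production.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    copytruncate
}
EOF

# Dry run: -d makes logrotate describe its actions without changing files.
if command -v logrotate >/dev/null 2>&1; then
  logrotate -d "$conf" || true
fi

# Once the output looks right, install it where the stock daily job reads it:
#   sudo install -m 644 "$conf" /etc/logrotate.d/myapp
```

With the file in /etc/logrotate.d, the distribution's existing daily job should pick it up, so no crontab changes are needed.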
Try putting this in /etc/crontab file :
-*/15 * * * * root test -x /usr/lib/cron/run-crons && /usr/lib/cron/run-crons >/dev/null 2>&1
It works for me.

RVM isn't setting environment with cron

I'm having a rough time executing script/runner with cron and RVM. I believe the issue is that the RVM environment is not being set before the runner is executed.
Currently I'm getting the error
/bin/sh: 1.sql: command not found
which is more than I've gotten earlier, so I guess that's good.
I've read the thread "Need to set up rvm environment prior to every cron job", but I'm still not really getting it. Part of the problem, I think, is the error reporting.
This is my runner thus far:
*/1 * * * * * /bin/bash -l -c 'rvm use 1.8.7-p352@2310; cd development/app/my_app2310 && script/runner -e development "Mailer.find_customer"'
As per the above link, I tried making an rvm_cron_runner.
I created a file and placed this in it:
#!/bin/sh
source "/Users/dude/.rvm/scripts/rvm"
exec $1
Then I updated my crontab to this:
*/1 * * * * * /bin/bash -l -c '/Users/dude/development/app/my_app2310/rvm_cron_runner; rvm use 1.8.7-p352@2310; cd development/app/my_app2310 && script/runner -e development "Mailer.find_customer"'
This also made no difference. I get no error, nothing.
Can anyone see what I'm doing incorrectly?
P.S. I hope my code formatting worked.
Could you try placing the code you want to run in a separate script, and then use the rvm_cron_runner?
So place your actions in a file called /path/cron_job:
rvm use 1.8.7-p352@2310
cd development/app/my_app2310 && script/runner -e development "Mailer.find_customer"
and then in your crontab write
1 2 * * * /path/rvm_cron_runner /path/cron_job
The differences:
this does not start a separate shell
use the parameter of the rvm_cron_runner
If you would use an .rvmrc file, you could even drop the rvm use ... line, I think.
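The runner idea can be tried without RVM at all. The sketch below is an illustration under assumptions, not the runner from the linked thread: a small stand-in file plays the role of ~/.rvm/scripts/rvm, the runner uses "." instead of "source" in case /bin/sh is not bash, and it uses exec "$@" so that a job with several arguments is passed through intact.

```shell
workdir=$(mktemp -d)

# Stand-in for the RVM script; a real setup would load ~/.rvm/scripts/rvm.
cat > "$workdir/environment" <<'EOF'
DEMO_RUBY_ENV=loaded
export DEMO_RUBY_ENV
EOF

# The runner: load the environment file, then replace itself with the job.
cat > "$workdir/cron_runner" <<'EOF'
#!/bin/sh
. "$1"
shift
exec "$@"
EOF
chmod +x "$workdir/cron_runner"

# Usage: cron_runner ENV_FILE COMMAND [ARGS...]
# (run through /bin/sh here; from a crontab the executable file is called directly)
/bin/sh "$workdir/cron_runner" "$workdir/environment" /bin/sh -c 'echo "env: $DEMO_RUBY_ENV"'
```

The job only sees DEMO_RUBY_ENV because the runner loaded the environment file first, which is the effect the rvm_cron_runner is meant to have for the RVM setup.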
You don't need to write a second cron runner (following that logic, you might as well write a third cron runner runner). Please keep things simple. All you need to do is configure your cron job to launch a bash shell, and make that bash shell load your environment.
The shebang line in your script should not refer directly to a ruby executable, but to rvm's ruby:
#!/usr/bin/env ruby
This instructs the script to load the environment and run ruby as we would on the command line with rvm loaded.
On many UNIX derived systems, crontabs can have a configuration section before the actual lines that define the jobs to be run. If this is the case, you would then specify:
SHELL=/path/to/bash
This will ensure that the cron job will be spawned from bash. Still, your environment is missing, so to instruct bash to load your environment, you will want to add to the configuration section the following:
BASH_ENV=/path/to/environment (typically .bash_profile or .bashrc)
HOME is automatically derived from the /etc/passwd line of the crontab owner, but you can override it.
HOME=/path/to/home
After this, a cron job might look like this:
15 14 1 * * $HOME/rvm_script.rb
What if your crontab doesn't support the configuration section? Then you will have to give all the environment directives on one line, with the job itself. For example:
15 14 1 * * export BASH_ENV=/path/to/environment && /full/path/to/bash -c '/full/path/to/rvm_script.rb'
Full blog post on the subject
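The effect of BASH_ENV can be checked in isolation. This is a minimal sketch assuming bash is installed; the variable DEMO_FROM_BASH_ENV and the temporary file are invented for the demonstration and stand in for a real .bashrc or .bash_profile.

```shell
envfile=$(mktemp)
echo 'export DEMO_FROM_BASH_ENV=yes' > "$envfile"

bash_bin=$(command -v bash)

# A non-interactive bash reads the file named by BASH_ENV before the command runs.
with_env=$(env -i PATH=/usr/bin:/bin BASH_ENV="$envfile" "$bash_bin" -c 'echo "${DEMO_FROM_BASH_ENV:-unset}"')

# Without BASH_ENV the same command starts from the bare environment.
without_env=$(env -i PATH=/usr/bin:/bin "$bash_bin" -c 'echo "${DEMO_FROM_BASH_ENV:-unset}"')

echo "with BASH_ENV: $with_env, without: $without_env"
```

The same comparison with the real environment file shows whether it sets up what the job needs when no interactive shell is involved.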
You can use rvm wrappers:
/home/deploy/.rvm/wrappers/ruby-2.2.4/ruby
Source: https://rvm.io/deployment/cron#direct

Cron job can't get it to run, What syntax to use for the crontab?

I'm trying to get a rails job running with CRON. All the examples I find direct me to other rails tools, plugins, gems, etc, which is good, but I really just want to use CRON, regardless. I can run my job ok with the following, but when I've tried cron I haven't had any luck (just doesn't seem to do anything). I want to run it every 3 minutes (for testing).
/usr/bin/env ruby ~/Dropbox/98_2011/webs/apps238/swapper/script/runner /home/durrantm/Dropbox/98_2011/webs/apps238/swapper/app/controllers/scheduled_emails_controller.rb
I'm on Linux Ubuntu.
My PATH has:
/var/lib/gems/1.8/bin:/home/durrantm/.rvm/gems/ruby-1.8.7-p302/bin:/home/durrantm/.rvm/gems/ruby-1.8.7-p302@global/bin:/home/durrantm/.rvm/rubies/ruby-1.8.7-p302/bin:/home/durrantm/.rvm/bin:/home/durrantm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/pgsql/bin
Cron jobs don't load the user's environment. Try adding RAILS_ENV=production before your command within crontab, or whichever environment you need.
Example:
RAILS_ENV=production
*/3 * * * * /your/command/here
OR, if you want to make sure you have your user's full environment, execute the command within a login shell:
*/3 * * * * bash --login -c '/your/command/here'
Get rid of the home directory expansion character (replace ~ with /home/user/etc/etc) and you should be in good shape; quite likely cron's expansion of ~ doesn't match your user's.
If the other parts of the syntax are bothersome there's an easy gui.
http://gnome-schedule.sourceforge.net/
sudo apt-get install gnome-schedule
You'll still have to have the path to your rb file fixed up though.
1- You might not have permissions. Try running crontab -e as root.
2- Why don't you write to a log file to debug the issue:
*/3 * * * * /your/command/here >> /path/to/logfile

How to debug an issue of cron's not executing a given script -- or other?

I have a Rails script that I would like to run daily. I know there are many approaches, and that a cron'd script/runner approach is frowned upon by some, but it seems to meet my needs.
However, my script is not getting executed as scheduled.
My application lives at /data/myapp/current, and the script is in script/myscript.rb. I can run it manually without problem as root with:
/data/myapp/current/script/runner -e production /data/myapp/current/script/myscript.rb
When I do that, the special log file (log/myscript.log) gets logged to as expected:
Tue Mar 03 13:16:00 -0500 2009 Starting to execute script...
...
Tue Mar 03 13:19:08 -0500 2009 Finished executing script in 188.075028 seconds
I have it set to run with cron every morning at 4 am. root's crontab:
$ crontab -l
0 4 * * * /data/myapp/current/script/runner -e production /data/myapp/current/script/myscript.rb
In fact, it looks like it's tried to run as recently as this morning!
$ tail -100 /var/log/cron
...
Mar 2 04:00:01 hostname crond[8894]: (root) CMD (/data/myapp/current/script/runner -e production /data/myapp/current/script/myscript.rb)
...
Mar 3 04:00:01 hostname crond[22398]: (root) CMD (/data/myapp/current/script/runner -e production /data/myapp/current/script/myscript.rb)
...
However, there is no entry in my log file, and the data that it should update has not been getting updated. The log file permissions (as a test) were even set to globally writable:
$ ls -lh
total 19M
...
-rw-rw-rw- 1 myuser apps 7.4K Mar 3 13:19 myscript.log
...
I am running on CentOS 5.
So my questions are...
Where else can I look for information to debug this?
Could this be a SELinux issue? Is there a security context that I could set or change that might resolve this error?
Thank you!
Update
Thank you to Paul and Luke both. It did turn out to be an environment issue, and capturing the stderr to a log file enabled me to find the error.
$ cat cron.log
/usr/bin/env: ruby: No such file or directory
$ head /data/myapp/current/script/runner
#!/usr/bin/env ruby
require File.dirname(__FILE__) + '/../config/boot'
require 'commands/runner'
Adding the specific Ruby executable to the command did the trick:
$ crontab -l
0 4 * * * /usr/local/bin/ruby /data/myapp/current/script/runner -e production /data/myapp/current/script/myscript.rb >> /data/myapp/current/log/cron.log 2>&1
By default cron mails its output to the user who ran it. You could look there.
It's very useful to redirect the output of scripts run by cron so that you can look at the results in a log file instead of some random user's local mail on the server.
Here's how you would redirect stdout and stderr to a log file:
cd /home/deploy/your_app/current; script/runner -e production ./script/my_cron_job.rb >> /home/deploy/your_app/current/log/my_file.log 2>&1
The >> redirects stdout to a file (appending to it), and the 2>&1 redirects stderr to stdout so any error messages will be logged as well.
Having done this, you will be able to examine the error messages to see what's really going on.
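The order of the two redirections matters, which a throwaway job makes easy to see. A small sketch, with a made-up job function standing in for the real script:

```shell
log=$(mktemp)

# A stand-in job that writes one line to each stream.
job() {
  echo "normal output"
  echo "error output" >&2
}

# Send stdout to the file first, then point stderr at wherever stdout goes.
job >> "$log" 2>&1

cat "$log"
```

Both lines end up in the log. Writing 2>&1 before the >> would instead leave stderr attached to the original stream, so errors would still go to cron's mail rather than the file.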
The usual problem when somebody discovers their script won't run in a cron job when it will run from the command line is that it relies on some piece of the environment that an interactive session has but cron doesn't get. Some frequent candidates are the "PATH" environment, and possibly "HOME".
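One way to see which pieces of the environment a job might be missing is to compare variable names between the interactive session and a nearly empty environment. The sketch below is only an approximation of cron (the minimal PATH is an assumption, and DEMO_SESSION_ONLY is a made-up variable standing in for something set by .profile or .bashrc):

```shell
workdir=$(mktemp -d)

# Hypothetical variable that only the interactive session has.
DEMO_SESSION_ONLY=1
export DEMO_SESSION_ONLY

# Variable names visible in the current session.
env | cut -d= -f1 | sort -u > "$workdir/session.names"

# Variable names visible to a shell started from an almost empty environment.
env -i PATH=/usr/bin:/bin /bin/sh -c 'env' | cut -d= -f1 | sort -u > "$workdir/minimal.names"

# Names the session has but the minimal shell lacks.
comm -23 "$workdir/session.names" "$workdir/minimal.names" > "$workdir/missing.names"
cat "$workdir/missing.names"
```

For the real picture, a temporary crontab entry that writes the output of env to a file gives the actual job environment to compare against.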
On Linux, make sure all the config files (/etc/crontab, /etc/cron.{daily,hourly,etc}/* and /etc/cron.d/*) are writable only by root and are not symlinks; otherwise they will not even be considered.
To allow non-root and/or symlinks, specify the -p option to the crond daemon.
