Run simultaneous or asynchronous tasks with Capistrano

I have a few long-running process restarts in my deploy.rb, like:
rake assets:precompile
script/delayed_job restart
rake sunspot:solr:stop, rake sunspot:solr:start
All of these processes have to occur, but not necessarily one after another.
I was wondering if I can run assets:precompile and the delayed_job restart simultaneously, since they don't depend on each other, and speed up my deploy time by running them concurrently.
I've run some Google searches but I can't find anything about it.

This is not a feature that Capistrano supports out of the box.
I looked around for a solution and found a thread on the Capistrano Google group. The suggestion there was to have Capistrano run a Ruby script that executes the jobs in parallel using Ruby's own threading support.
If you read the post, one of the authors asks why these tasks need to run in parallel at all: doing so can introduce race conditions and other non-deterministic behaviour, which makes the deployment process more brittle.
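A minimal sketch of that approach, assuming a helper script that Capistrano would invoke on the server (the file name and the failure handling are assumptions):

#!/usr/bin/env ruby
# run_parallel.rb -- hypothetical helper, invoked from a Capistrano task
commands = [
  "rake assets:precompile",
  "script/delayed_job restart",
]

# Run every command in its own thread, then wait for all of them.
threads = commands.map do |cmd|
  Thread.new { warn "#{cmd} failed" unless system(cmd) }
end
threads.each(&:join)

In a real deploy you would want to collect the exit statuses and fail the task if any command failed, rather than just warning.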

Related

delayed_job rake task parameters and concurrency

The documentation states that a delayed job worker can be invoked using a rake task like so: rake jobs:work, or QUEUE=queue1 rake jobs:work if you want it to work on a specific queue.
I have a couple of questions about this way to invoke jobs:
Is there a way to pass other parameters like sleep-delay or read-ahead (like you would do if you start the worker using the script: delayed_job start --sleep-delay 30 --read-ahead 500 --queue=queue1)?
Is there any gain in processing speed if you launch 2 workers on the same queue using the rake task?
In answer to 1: yes, you can set the sleep delay and read-ahead from the command line. You do it via environment variables:
QUEUE=queue1 SLEEP_DELAY=1 rake jobs:work
for example. See this commit.
rake jobs:work is just a means to an end for putting up another worker, either for development purposes or to work off a big queue (though you have rake jobs:workoff for that), so all the benefits and disclaimers of multiple workers apply:
two jobs process in parallel, so if you've got the CPU power your queue will be worked off quicker.
I don't know about question #1 though; it's possible rake jobs:work wasn't intended to be used outside of development.
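If you would rather scale workers via the daemonized script instead of rake, the script can spawn several workers on one queue directly; a sketch (check script/delayed_job --help on your version for the exact flag names):

script/delayed_job -n 2 --queue=queue1 start

Each worker gets its own pid file, and all of them poll queue1.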

What's the best way to run daemons when the Rails server is running

I have some gems in my Rails app, such as resque and sunspot. I run the following commands manually when the machine boots:
rake sunspot:solr:start
/usr/local/bin/redis-server /usr/local/etc/redis.conf
rake resque:work QUEUE='*'
Is there a better practice for running these daemons in the background? And are there any side effects when these tasks run in the background?
My solution is a mix of god, capistrano and whenever. A specific constraint I have is that I want all app processes to run as the application user, so init.d scripts are not an option (it could be done, but the user switching / environment loading is quite a pain).
God
The basic idea is to use god to start / restart / monitor processes. God may be difficult to get started with, but it is very powerful:
running god alone will start all your processes (webserver, bg jobs, whatever)
it can detect a process crashed and restart it
you can group processes and batch restart them (staging, production, background, devops, etc)
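A minimal god config sketch for one such process, assuming a Resque worker; the names and paths are placeholders:

God.watch do |w|
  w.name  = "resque-worker"                      # hypothetical process name
  w.group = "production"                         # enables batch restarts of the group
  w.dir   = "/var/rails/my_app/current"          # assumed deploy path
  w.start = "bundle exec rake resque:work QUEUE='*'"
  w.log   = "/var/rails/my_app/shared/log/resque.log"
  w.keepalive                                    # restart the process if it dies
end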
Whenever
You still have to start god when the server reboots. A good means of doing so is the user's crontab. Most cron implementations have a special instruction called @reboot, which lets you run a specific command on server restart:
@reboot /bin/bash -l -c 'cd /home/my_app && SERVER=true god -c production/current/config/app.god'
Whenever is a gem that makes crontab management easy, including generating the reboot entry. While it's not strictly necessary for what I describe, it's really useful for its capistrano integration.
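For the reboot entry above, a schedule.rb sketch (the god command line mirrors the crontab line and is otherwise an assumption):

# config/schedule.rb
every :reboot do
  command "cd /home/my_app && SERVER=true god -c production/current/config/app.god"
end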
Capistrano
You not only want to start your processes on server restart, you also want to restart them on deploy; if your background jobs keep running stale code, problems will arise.
Capistrano makes that easy to handle: just have a post-deploy task ask god to restart the whole group (e.g. god restart production) and it will be handled seamlessly.
Whenever's capistrano integration also ensures your crontab stays up to date, rewriting it whenever you change your config/schedule.rb file.
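A sketch of the post-deploy hook, assuming Capistrano 2 syntax and the "production" god group from above:

# deploy.rb
namespace :god do
  desc "Restart all god-managed processes for this app"
  task :restart, :roles => :app do
    run "god restart production"
  end
end
after "deploy:restart", "god:restart"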
You can use something like foreman to manage these processes. You define the process types in a Procfile and can then start, stop and otherwise control them as a group.
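A Procfile sketch for the processes in the question; it assumes your sunspot version provides a foreground sunspot:solr:run task, and the paths are placeholders:

solr: bundle exec rake sunspot:solr:run
redis: /usr/local/bin/redis-server /usr/local/etc/redis.conf
worker: bundle exec rake resque:work QUEUE='*'

foreman start then runs all three in the foreground; foreman export upstart can generate system start scripts from the same file.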

Use a Linux script to keep a continuous rake task running (start, stop, etc.)

I have a rake task which parses a streaming API and enters data into the database. The streaming API is a live feed, so the rake task must run continuously for the live data to reach the database. Once called, the rake task runs and parses indefinitely. I have started it and it is running, but if I close the terminal or reboot the server, the rake task stops. So I want a Linux script (something like the one used to start or stop the Apache server) which does the following:
1. start the rake task by calling the rake command (rake parse:stream) from the Rails root (the application directory of the Rails app)
2. stop the rake task by killing the process.
3. start the rake task automatically when the server reboots.
I am not familiar with Linux scripts and I don't know where to start. I am using Ubuntu Server. Can anyone help me?
Here's an article that might also help you. It discusses various options for managing Ruby applications and their related processes:
http://michaelvanrooijen.com/articles/2011/06/08-managing-and-monitoring-your-ruby-application-with-foreman-and-upstart/
You need to run your script as a daemon. When I create this kind of startup script I usually make two files: one that lives in /etc/init.d and handles the start/stop/status/restart commands, and another one that actually does the job and gets called by the first script.
Here is one solution; although the daemon script is written in Perl, you only want to run a few command lines, so daemonizing via a Perl script could do your job easily.
If you want, there are also ruby gems for daemonizing scripts, so you can write a script in ruby that does the rake tasks.
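For instance, a minimal sketch using the daemons gem (the gem is real; the paths and pid directory are assumptions):

# stream_daemon.rb -- usage: ruby stream_daemon.rb start|stop|restart
require 'daemons'

Daemons.run_proc('parse_stream', :dir_mode => :normal, :dir => '/path/to/rails_app/tmp/pids') do
  Dir.chdir('/path/to/rails_app')  # assumed Rails root
  exec('bundle exec rake parse:stream RAILS_ENV=production')
end

Calling this script from /etc/rc.local (or an init.d wrapper) would also cover the start-on-reboot requirement.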
And if you want to go hardcore, there are ways to write bash scripts that daemonize themselves, but I'm not sure I would recommend that; I find them pretty difficult to get right.
Take a look at how GitHub's Resque project does it.
Essentially they create rake tasks for starting, restarting and stopping a particular task, in this case resque:work. Note that the restart_workers task simply invokes the other two tasks, stop and start. It should be really easy to adapt this to what you want; see the sketch below.
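Adapted to the question, a sketch of such tasks; the pid-file and log locations are assumptions:

# lib/tasks/stream_control.rake
namespace :parse do
  pid_file = 'tmp/pids/stream.pid'

  desc 'Start the streaming parser in the background'
  task :start_stream do
    sh "nohup bundle exec rake parse:stream >> log/stream.log 2>&1 & echo $! > #{pid_file}"
  end

  desc 'Stop the streaming parser'
  task :stop_stream do
    sh "kill $(cat #{pid_file}) && rm #{pid_file}"
  end

  desc 'Restart the streaming parser'
  task :restart_stream => [:stop_stream, :start_stream]
end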

Why would my rake tasks running via cron get invoked twice?

I have a Rails app with the whenever gem installed to set up cron jobs which invoke various rake tasks. For reasons unbeknownst to me, each rake task gets invoked twice at precisely the same time, so my db backup task backs up the db twice at 4:00am.
Inspecting the crontab reveals correct syntax for all of the cron jobs, so I don't think this is an issue with the whenever gem misconfiguring them. Also confusing: in both the staging and production environments I can invoke the tasks on the command line and they only run once.
Any thoughts on what would cause this? I'm at a complete loss troubleshooting wise.
The number of cron jobs that run can depend on the number of application instances deployed to the server box. Do you have two instances of the Rails application running on the same server?

How can I make sure the Sphinx daemon runs?

I'm working on setting up a production server using CentOS 5.3, Apache, and Phusion Passenger (mod_rails). I have an app that uses the Sphinx search engine and the Thinking Sphinx gem.
According to the Thinking Sphinx docs...
If you actually want to search against the indexed data, then you’ll need Sphinx’s searchd daemon to be running. This can be controlled using the following tasks:
rake thinking_sphinx:start
rake ts:start
rake thinking_sphinx:stop
rake ts:stop
What would be the best way to ensure that this takes place in production? I can deploy my app, then manually run rake thinking_sphinx:start, but I like to set things up so that if I have to bounce the server, everything will come back up.
Should I put a call to that Rake task in an initializer? Or something in rc.local?
rc.local is a good start, but it's not enough. I would pair it with a monit rule to ensure searchd is running AND, more importantly...
Sphinx requires a full re-index to make the latest data searchable. There is some documentation on the Thinking Sphinx site about delta indexing, but if your index is small, an hourly re-index will take care of things and you won't need the delta indexing machinery.
I run this hourly to take care of this:
0 * * * * cd /var/rails/my_site/current/ && RAILS_ENV=production /usr/bin/rake ts:rebuild
Note: for deployment, I use the built-in Thinking Sphinx capistrano tasks:
In your Capfile add
require 'thinking_sphinx/deploy/capistrano'
I used to chain the re-indexing into the cap task but stopped because it is really slow; when I make schema changes I remember to run it manually, or wait for the hourly cron job to fix things up.
I haven't done this before with Sphinx, so I hope someone can give you a better answer, but you should take a look at monit. Monit is designed for keeping daemons running, which is just what you need.
A quick Google for sphinx monit turned up this link: Capistrano recipes: sphinx:monit. That would be a good place to start.
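A minimal monit rule sketch along those lines; the pid-file path must match your sphinx.yml, and the deploy path is an assumption:

check process searchd with pidfile /var/rails/my_site/shared/pids/searchd.pid
  start program = "/bin/bash -l -c 'cd /var/rails/my_site/current && RAILS_ENV=production rake ts:start'"
  stop program = "/bin/bash -l -c 'cd /var/rails/my_site/current && RAILS_ENV=production rake ts:stop'"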
For what it's worth, I'm running
thinking_sphinx:index
... in my cron job, instead of the "rebuild" task. This does not require the searchd process to be taken offline, and the indices are still rotated when indexing finishes, so new changes are picked up. I think the "rebuild" task is only necessary when you actually change the index structure in your models, which happens very rarely for me.
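Adapting the earlier crontab entry to this approach (same assumed paths):

0 * * * * cd /var/rails/my_site/current/ && RAILS_ENV=production /usr/bin/rake thinking_sphinx:index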
