Scaling out an app that hits external apis - ruby-on-rails

I'm using beanstalkd to process calls to the Facebook Graph API in the background, and I want the app to refresh its data by hitting the API every 10 minutes. My plan was a simple script that loads the necessary info (FB IDs/URLs) from the database, queues jobs in beanstalkd, and then sleeps for 9 minutes, perhaps with God keeping the script running and restarting it if memory consumption gets too high.
Then I started reading about DRb and wondered whether there's a way, or a need, to integrate the two.
I asked in #rubyonrails and got cron and a regular Ruby script as the two options. I'm just wondering if there's a better way.

For simplicity of configuration, I would recommend delayed_job plus a cron job that calls a rake task, which in turn handles queueing the jobs.
Monit is also a good alternative to God for process monitoring; it seems to be more stable and less memory hungry.
For delayed_job you need to add the following to your deploy script (assuming you plan to deploy with Capistrano):
namespace :delayed_job do
  def rails_env
    fetch(:rails_env, false) ? "RAILS_ENV=#{fetch(:rails_env)}" : ''
  end
  desc "Stop the delayed_job process"
  task :stop, :roles => :app do
    run "cd #{current_path};#{rails_env} script/delayed_job stop"
  end
  desc "Start the delayed_job process"
  task :start, :roles => :app do
    run "cd #{current_path};#{rails_env} script/delayed_job start"
  end
  desc "Restart the delayed_job process"
  task :restart, :roles => :app do
    run "cd #{current_path};#{rails_env} script/delayed_job stop"
    run "cd #{current_path};#{rails_env} script/delayed_job start"
  end
end
I had to extract these recipes from the delayed_job gem to get them to run.
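The cron-plus-rake-task approach recommended above can be sketched in plain Ruby. This is only a minimal illustration, not delayed_job's API: InMemoryQueue, RefreshJob and enqueue_refresh_jobs are hypothetical names, and the in-memory queue stands in for whatever queue (delayed_job, beanstalkd) the app actually uses. The point is that cron supplies the 10-minute schedule, so the enqueueing code runs once and exits instead of sleeping in a loop.

```ruby
# Hypothetical stand-in for a real queue client (delayed_job, beanstalkd, ...).
class InMemoryQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(job)
    @jobs << job
  end
end

# Hypothetical job object; a real perform would call the external API.
RefreshJob = Struct.new(:facebook_id) do
  def perform
    # Placeholder: fetch data for facebook_id from the Graph API here.
  end
end

# Queues one job per ID and returns how many jobs are now waiting.
# Intended to be called from a rake task that cron runs every 10 minutes.
def enqueue_refresh_jobs(facebook_ids, queue)
  facebook_ids.each { |id| queue.enqueue(RefreshJob.new(id)) }
  queue.jobs.size
end
```

A rake task invoked from cron would load the IDs from the database and pass them to enqueue_refresh_jobs together with the real queue client.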

Related

Ruby on rails: How to run a background task automatically when the server starts?

I have created a rails application that runs a background process. It pings a server periodically and displays a graph for the response time. For this I am using a gem called crono. I am starting the task from the command line using 'bundle exec crono'.
How can I run the background process automatically when the rails server starts without having to start it from the command line?
Also, is there a way to automatically refresh the page periodically so that it displays an updated graph?
Edit: This application will be deployed to production.
Edit: I still couldn't get this to work. Here's the folder structure:
application/config/
  ping_job.rb
  cronotab.rb
cronotab uses 'crono' gem to execute the task inside ping_job.rb every 5 seconds.
require 'typhoeus'
class PingJob
  def perform
    # task definition goes here.
  end
end
I want to run the task defined in ping_job.rb automatically when the server starts. I am thinking of using the whenever gem. Any and all suggestions are welcome.
Put it in config/environment.rb, right under Rails.application.initialize!. That file is run when the Rails server starts up, so your code would run after the application is initialized.
Some time ago I also wanted to tie the start of a background process to the start of the Rails server. In the end I found out that it is a bad idea. I think the best solution is to create a deploy task that starts and restarts the process on each deploy. For example, Capistrano lets you do something like this:
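namespace :deploy do
  task :start do
    invoke 'my_process:start'
  end
  task :stop do
    invoke 'my_process:stop'
  end
  task :restart do
    # Stop first, then start, so the process is left running afterwards.
    invoke 'my_process:stop'
    invoke 'my_process:start'
  end
end
namespace :my_process do
  task :start do
    # Capistrano 3 runs remote commands inside an `on` block; the :app role is an assumption.
    on roles(:app) do
      execute "some system command to start the process"
    end
  end
  task :stop do
    on roles(:app) do
      execute "some system command to stop the process"
    end
  end
end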
Never start your process in Rails initialization files. It might be started several times when there are several application workers on your server, or it might be started when you open the Rails console, and so on.

Rails: managing multiple sidekiqs without upstart script

Once upon a time, I had one app, Cashyy. It used Sidekiq. I deployed it and used this upstart script to manage Sidekiq (start/restart).
I then decided to deploy another app to the same server. This app (let's call it Giimee) also uses Sidekiq.
And here is the issue: sometimes I need to restart Sidekiq for Cashyy but not for Giimee. As I understand it, I would need to hack something together using the index parameter (in the upstart script and when managing the Sidekiq processes: sudo restart sidekiq index=1), if I understood it correctly.
BUT!
I have zero desire to dabble with these indexes. It sounds like a nightmare to support: you need to know how many apps are using Sidekiq, assign a unique index to each Sidekiq, and remember the assigned index whenever you want to restart a specific one.
So here is the question: how can I isolate each Sidekiq (so I would not need to maintain the index) and still get the stability and usability of upstart (starting the process, restarting, etc.)?
Or maybe I don't understand something and the index approach is the state of the art?
You create two services:
cp sidekiq.conf /etc/init/cashyy.conf
cp sidekiq.conf /etc/init/glimee.conf
Edit each as necessary. sudo start cashyy, sudo stop glimee, etc. Now you'll have two completely separate Sidekiq processes running.
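What "edit each as necessary" might amount to can be sketched by rendering one config per app from a shared template. This is only an illustration: the stanzas below are a stripped-down assumption and not the upstart script referenced in the question, and the application paths and the upstart_conf helper are hypothetical. A real script would likely also need user and Ruby environment setup.

```ruby
# Assumed, minimal upstart job template; %{name} and %{path} are filled in per app.
UPSTART_TEMPLATE = <<~'CONF'
  description "Sidekiq worker for %{name}"
  start on runlevel [2345]
  stop on runlevel [06]
  respawn
  chdir %{path}
  exec bundle exec sidekiq -e production
CONF

# Returns the config text for one app; each app gets its own service file.
def upstart_conf(name, path)
  format(UPSTART_TEMPLATE, name: name, path: path)
end

# Hypothetical deploy locations for the two apps.
APPS = {
  'cashyy' => '/var/www/cashyy/current',
  'glimee' => '/var/www/glimee/current'
}.freeze

# Maps each target file under /etc/init to its rendered contents.
CONFIGS = APPS.map { |name, path| ["/etc/init/#{name}.conf", upstart_conf(name, path)] }.to_h
```

Because each file defines a separate job named after its app, sudo restart cashyy touches only that app's Sidekiq and no index bookkeeping is needed.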
As an alternative to an upstart script, you can use Capistrano and Capistrano-Sidekiq to manage those Sidekiqs.
We have Sidekiq running on 3 machines and have had a good experience with these two libraries/tools.
Note: we currently use an older version of Capistrano (2.15.5)
In our architecture, the three machines are customized slightly on deploy. This led us to break up our capistrano deploy scripts by machine so that we could customize some classes, manage Sidekiq, etc. Our capistrano files are structured something like this:
- config/
  - deploy.rb
  - deploy/
    - gandalf.rb
    - gollum.rb
    - legolas.rb
With capistrano-sidekiq, we are able to control, well, Sidekiq :) at any time (during a deploy or otherwise). We set up the Sidekiq aspects of our deploy scripts in the following way:
# config/deploy.rb
# global sidekiq settings
set :sidekiq_default_hooks, false
set :sidekiq_cmd, "#{fetch(:bundle_cmd, 'bundle')} exec sidekiq"
set :sidekiqctl_cmd, "#{fetch(:bundle_cmd, 'bundle')} exec sidekiqctl"
set :sidekiq_role, :app
set :sidekiq_pid, "#{current_path}/tmp/pids/sidekiq.pid"
set :sidekiq_env, fetch(:rack_env, fetch(:rails_env, fetch(:default_stage)))
set :sidekiq_log, File.join(shared_path, 'log', 'sidekiq.log')
# config/deploy/gandalf.rb
# Custom Sidekiq settings
set :sidekiq_timeout, 30
set :sidekiq_processes, 1
namespace :sidekiq do
  # .. code omitted from methods and tasks for brevity
  def for_each_process(&block)
  end
  desc 'Quiet sidekiq (stop accepting new work)'
  task :quiet, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end
  desc 'Stop sidekiq'
  task :stop, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end
  desc 'Start sidekiq'
  task :start, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end
  desc 'Restart sidekiq'
  task :restart, :roles => lambda { fetch(:sidekiq_role) }, :on_no_matching_servers => :continue do
  end
end
When I need to restart one of my Sidekiq instances, I can just go to my terminal and execute the following:
$ bundle exec cap gandalf sidekiq:restart
$ bundle exec cap gollum sidekiq:stop
It's made Sidekiq management quite painless for our team, and I thought it would be worth sharing in case something similar could help you out.

Capistrano tasks not performing within the given scope.

I have built some Capistrano tasks which I need to run on the servers in the :app role. This is what I have so far:
desc "Stop unicorn"
task :stop, :roles => :app do
  logger.info "Stopping unicorn server(s).."
  run "touch #{unicorn_pid}"
  pid = capture("cat #{unicorn_pid}").to_i
  run "kill -s QUIT #{pid}" if pid > 0
end
As far as I know, this should run the given commands on the servers given in the :app role, right? But the fact of the matter is that it's running the commands on the servers in the :db role.
Can anyone give some insight into this problem? Or is there a way to force Capistrano to adhere to the :roles flag?
Thanks in advance
// Emil
Using Capture will cause the task to be run only on the first server listed.
From the documentation:
The capture helper will execute the given command on the first matching server, and will return the output of the command as a string.
https://github.com/capistrano/capistrano/wiki/2.x-DSL-Action-Inspection-Capture
Unfortunately, I am facing a similar issue. The find_servers solution may work, but it's hacky and runs N x N times, where N is the number of servers you have.
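Since capture only talks to the first matching server, one way around it is to avoid capture altogether and let the remote shell read the PID file itself, so that a single run call does the whole job on every server in the role. A minimal sketch, assuming the PID file path is the only input; unicorn_stop_command is a hypothetical helper name, not part of Capistrano:

```ruby
require 'shellwords'

# Builds one shell command that reads the PID file on the remote host
# and sends QUIT only if the file exists and is non-empty.
def unicorn_stop_command(pid_file)
  path = Shellwords.escape(pid_file)
  %(if [ -s #{path} ]; then kill -s QUIT "$(cat #{path})"; fi)
end

# Inside the Capistrano 2 task from the question this would be used as:
#   run unicorn_stop_command(unicorn_pid)
```

Because the PID is resolved on each server rather than in the local Ruby process, every server kills its own unicorn master instead of the PID read from the first one.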

Restarting Rails Server

I've inherited an existing Rails 2 application and am currently trying to deploy it on production servers.
As a Rails/Unix novice, what's the best way to find out which web server the Rails application is running on, and how can I restart it? (From what I've read, Rails caches everything on production servers.)
The previous developer used Capistrano, but unfortunately I don't have access to the Git repository.
I noticed /configuration/deploy.rb has the following lines:
desc "Custom restart task for mongrel cluster"
task :restart, :roles => :app, :except => { :no_release => true } do
  deploy.mongrel.restart
end
desc "Custom start task for mongrel cluster"
task :start, :roles => :app do
  deploy.mongrel.start
end
desc "Custom stop task for mongrel cluster"
task :stop, :roles => :app do
  deploy.mongrel.stop
end
Does this imply mongrel_rails is being used?
If so what's the best way to restart the application to pick up my changes?
Many thanks.
Does this imply mongrel_rails is being used?
Yes.
If so what's the best way to restart the application to pick up my changes?
It depends on which Application server you currently use. Assuming the current recipe is ok, simply call the Capistrano restart task.
$ cap deploy:restart

Run cron jobs on rails (deployed over several servers)

What is the best way to run cron jobs on rails, when different machines have different jobs to do?
For example, server 1 runs cron job A, while server 2 runs cron job B
Is there a way to deploy the cron files along when we do a regular cap deploy?
Take a look at the whenever gem: http://github.com/javan/whenever
It is great for automating cron tasks in Rails with a clear DSL. We have been using it in production for several months now; it just works and is very lightweight. Some examples from the README:
every 3.hours do
  runner "MyModel.some_process"
  rake "my:rake:task"
  command "/usr/bin/my_great_command"
end
every 1.day, :at => '4:30 am' do
  runner "MyModel.task_to_run_at_four_thirty_in_the_morning"
end
every :hour do # Many shortcuts available: :hour, :day, :month, :year, :reboot
  runner "SomeModel.ladeeda"
end
every :sunday, :at => '12pm' do # Use any day of the week or :weekend, :weekday
  runner "Task.do_something_great"
end
The README is very thorough, but there is also a good screencast on railscasts: http://railscasts.com/episodes/164-cron-in-ruby
It easily integrates with capistrano with the following code (copied from README):
after "deploy:symlink", "deploy:update_crontab"
namespace :deploy do
  desc "Update the crontab file"
  task :update_crontab, :roles => :db do
    run "cd #{release_path} && whenever --update-crontab #{application}"
  end
end
As for the machine-specific part, you could use a local config file or even symlink the config/schedule.rb file on deploy. I think I would keep a per-machine local_schedule.rb that is symlinked into config/ on deploy, and then put this at the top of config/schedule.rb:
# schedule.rb already lives in config/, so local_schedule.rb is looked up next to it.
if File.exist?(File.dirname(__FILE__) + '/local_schedule.rb')
  require File.dirname(__FILE__) + '/local_schedule.rb'
end
Your schedule would then run and also include anything local. Just make sure the file is symlinked before the cap task above runs, and you should be good to go.
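A different way to get per-machine jobs, not the symlink approach described above, is to pick jobs by hostname from a single schedule file. The sketch below is an assumption-laden illustration: the host names, job names and the jobs_for helper are all hypothetical.

```ruby
require 'socket'

# Hypothetical mapping of hostnames to the cron jobs each machine should run.
JOBS_BY_HOST = {
  'server1' => [:job_a],
  'server2' => [:job_b]
}.freeze

# Returns the jobs for the given host; unknown hosts get no jobs.
def jobs_for(hostname = Socket.gethostname)
  JOBS_BY_HOST.fetch(hostname, [])
end
```

In config/schedule.rb, each every block could then be wrapped in a check such as if jobs_for.include?(:job_a), so that a machine only writes its own entries to the crontab.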
I hope this helps!

Resources