What exactly is a "role" in Capistrano? - ruby-on-rails

What is the purpose and function of "roles" in a Capistrano recipe? When I look at sample recipes, I often see something like this:
role :app, 'somedomain.com'
role :web, 'somedomain.com'
role :db, 'somedomain.com', :primary => true
So it looks like a role is basically a server where Capistrano executes commands. If that's the case, then why would it be called a "role" rather than a "host" or "server"?
In the above example, what is the difference between the :app and :web roles?
What does the :primary => true option do?

Roles let you write Capistrano tasks that apply only to certain servers, which really only matters for multi-server deployments. The default roles "app", "web", and "db" are also used internally, so their presence is not optional (AFAIK).
In the sample you provided, there is no functional difference.
The ":primary => true" is an attribute that allows for further granularity in specifying servers in custom tasks.
Here is an example of role specification in a task definition:
task :migrate, :roles => :db, :only => { :primary => true } do
# ...
end
See the Capistrano wiki (https://github.com/capistrano/capistrano/wiki/2.x-DSL-Configuration-Roles-Role) for a more extensive explanation.
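To make the filtering idea concrete, here is a minimal plain-Ruby sketch. It does not use Capistrano at all; `Server`, `SERVERS`, and `servers_for` are hypothetical names that only model how a role plus an `:only` condition narrows the set of target hosts.

```ruby
# Simplified model of role/attribute filtering; not Capistrano's implementation.
Server = Struct.new(:host, :roles, :options)

SERVERS = [
  Server.new('app1.example.com', [:app, :web], {}),
  Server.new('db1.example.com',  [:db],        { primary: true }),
  Server.new('db2.example.com',  [:db],        {})
].freeze

# Return the hosts that carry `role` and match every key/value pair in `only`.
def servers_for(role, only: {})
  matching = SERVERS.select do |server|
    server.roles.include?(role) &&
      only.all? { |key, value| server.options[key] == value }
  end
  matching.map(&:host)
end

puts servers_for(:db).inspect
puts servers_for(:db, only: { primary: true }).inspect
```

In this model a task declared for the :db role would reach both database hosts, while adding the primary condition narrows it to one.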

The ":primary => true" option indicates that the database server is the primary server. This is important when you want to use replication with MySQL, for example: it lets you add another mirrored database server that can be used for automatic failover. It is also used to decide which database server the model migrations should run on (those changes are then replicated to the failover servers). This link clarifies it a bit more: https://github.com/capistrano/capistrano/wiki/2.x-from-the-beginning#back-to-configuration
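As a rough illustration of "pick the primary", the sketch below assumes the convention that the first server in a role counts as primary unless some server is explicitly flagged. `primary_server` and the host names are hypothetical and exist only for this example.

```ruby
# Simplified model: choose the server that should run migrations.
# An explicitly flagged server wins; otherwise fall back to the first one.
def primary_server(servers)
  servers.find { |server| server[:primary] } || servers.first
end

db_servers = [
  { host: 'db-replica.example.com' },
  { host: 'db-main.example.com', primary: true }
]

puts primary_server(db_servers)[:host]
```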

Related

How to configure capistrano to deploy puma and nginx on one server and resque on another?

I am preparing Capistrano to deploy a Ruby on Rails application to AWS. The application servers will be behind a bastion host.
I have two servers, server1 and server2. I want to deploy and run puma and nginx on server1, and run resque workers and resque schedulers on server2. I know about roles, and here is my configuration so far:
# deploy/production.rb
web_instances = ['web-instance-ip']       # placeholder: IP of the web server
worker_instances = ['worker-instance-ip'] # placeholder: IP of the worker server
role :app, web_instances
role :web, web_instances
role :worker, worker_instances
set :deploy_user, ENV['DEPLOY_USER'] || 'ubuntu'
set :branch, 'master'
set :ssh_options, {
  forward_agent: true,
  keys: ENV['SSH_KEY_PATH'],
  proxy: Net::SSH::Proxy::Command.new("ssh -i '#{ENV['SSH_KEY_PATH']}' #{fetch(:deploy_user)}@#{ENV['BASTIAN_PUBLIC_IP']} -W %h:%p"),
}
set :puma_role, :app
How should I write the tasks so that puma start/restart runs only on server1 and resque and resque-scheduler start/restart run only on server2, while common tasks such as pulling the latest code and bundle install run on every instance?
Let's assume you have defined the roles as follows:
role :puma_nginx_role, 'server1.com'
role :resque_role, 'server2.com'
Now define a task in your config/deploy.rb file, e.g.:
namespace :git do
  desc 'To push the code'
  task :push do
    on roles(:all) do
      execute "git push"
    end
  end
end
To run the above only on server1, all you have to do is
namespace :git do
  desc 'To push the code'
  task :push do
    on roles(:puma_nginx_role) do
      execute "git push"
    end
  end
end
This tells Capistrano that git:push should be executed on the :puma_nginx_role role, which in turn runs it on server1.com. You'll have to modify the tasks that run puma/nginx/resque in the same way, based on roles.
This can be achieved by using roles to limit which tasks run on each server, plus some hooks to trigger your custom tasks. Your deploy/production.rb file will look something like this:
web_instances = ['web-instance-ip']       # placeholder: IP of the web server
worker_instances = ['worker-instance-ip'] # placeholder: IP of the worker server
role :app, web_instances
role :web, web_instances
role :worker, worker_instances
set :deploy_user, ENV['DEPLOY_USER'] || 'ubuntu'
set :branch, 'master'
set :ssh_options, {
  forward_agent: true,
  keys: ENV['SSH_KEY_PATH'],
  proxy: Net::SSH::Proxy::Command.new("ssh -i '#{ENV['SSH_KEY_PATH']}' #{fetch(:deploy_user)}@#{ENV['BASTIAN_PUBLIC_IP']} -W %h:%p"),
}
# This will run on servers with the web role only
namespace :puma do
  task :restart do
    on roles(:web) do |host|
      with rails_env: fetch(:rails_env) do
        # Your code to restart the puma server goes here
      end
    end
  end
end
# This will run on servers with the worker role only
namespace :resque do
  task :restart do
    on roles(:worker) do |host|
      with rails_env: fetch(:rails_env) do
        # Your code to restart the resque workers goes here
      end
    end
  end
end
after :deploy, 'puma:restart'
after :deploy, 'resque:restart'
Check out the docs for more information about the commands and hooks you can use to set up your deployment.
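To see which host ends up running which task under a setup like this, here is a small plain-Ruby sketch. It only models the role lookup and is not Capistrano code; `STAGE_ROLES`, `STAGE_TASKS`, `hosts_for`, and `deployment_plan` are hypothetical names, and the task table is an assumption made for illustration.

```ruby
# Simplified model of role-scoped tasks; this is not Capistrano's implementation.
STAGE_ROLES = {
  app:    ['server1'],
  web:    ['server1'],
  worker: ['server2']
}.freeze

# Hypothetical task table: task name => roles it targets (:all means every host).
STAGE_TASKS = {
  'deploy'         => [:all],
  'puma:restart'   => [:web],
  'resque:restart' => [:worker]
}.freeze

# Resolve a list of roles to the unique hosts that carry them.
def hosts_for(roles)
  return STAGE_ROLES.values.flatten.uniq if roles.include?(:all)

  roles.flat_map { |role| STAGE_ROLES.fetch(role, []) }.uniq
end

# Build task name => hosts, in declaration order.
def deployment_plan
  STAGE_TASKS.transform_values { |roles| hosts_for(roles) }
end

deployment_plan.each { |name, hosts| puts "#{name}: #{hosts.join(', ')}" }
```

The common deploy step reaches both servers, while each restart task is confined to the hosts of its own role.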

capistrano-resque: Multistage with Different :workers

I have two production AWS instances which will be running resque but listening for different queues. Here is an example of my current configuration:
config/deploy/production/prod_resque_1.rb
server "<ip>", :web, :app, :resque_worker, :db, primary: true
set :resque_log_file, "log/resque.log"
set :resque_environment_task, true
set :workers, {
  "queue1" => 5,
  "*" => 2
}
after "deploy:restart", "resque:restart"
config/deploy/production/prod_resque_2.rb
server "<ip>", :web, :app, :resque_worker, :db, primary: true
set :resque_log_file, "log/resque.log"
set :resque_environment_task, true
set :workers, {
  "queue2,queue3,queue4" => 5
}
after "deploy:restart", "resque:restart"
Then, I have a "global" recipe:
load 'config/deploy/production/common'
load 'config/deploy/production/prod_resque_1'
load 'config/deploy/production/prod_resque_2'
The obvious problem is that when I call cap prod_resque resque:start, the :workers definition in prod_resque_1 is overwritten by the load of prod_resque_2, so both prod_resque_1 and prod_resque_2 end up with workers listening to queue2, queue3, and queue4 only.
My workaround has been to run cap prod_resque_1 resque:start and then cap prod_resque_2 resque:start, but this rather defeats the purpose of Capistrano.
Is there a cleaner solution that lets me run cap prod_resque resque:start and have the "first" server running 7 workers (5 listening to queue1 and 2 listening to all queues) and the "second" server running 5 workers listening only to queue2, queue3, and queue4?
An example of this is given in the capistrano-resque docs: if you assign a different role to each server (or groups of servers), then you can define workers on a per role basis.
In your case you would do something like
role :queue_one_workers, [ip_from_prod_resque_1]
role :other_queue_workers, [ip_from_prod_resque_2]
set :workers, {
  :queue_one_workers => {"queue1" => 5, "*" => 2},
  :other_queue_workers => {"queue2,queue3,queue4" => 5}
}
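The per-role lookup can be sketched in plain Ruby as follows. This is only a model of the idea, not the capistrano-resque implementation; the IP addresses, `RESQUE_ROLES`, `RESQUE_WORKERS`, `workers_by_host`, and `total_workers` are all hypothetical, and the second role uses the comma-separated queue form from the question so that it yields 5 workers in total.

```ruby
# Simplified model of per-role worker definitions; not capistrano-resque code.
RESQUE_ROLES = {
  queue_one_workers:   ['10.0.0.1'], # placeholder address
  other_queue_workers: ['10.0.0.2']  # placeholder address
}.freeze

RESQUE_WORKERS = {
  queue_one_workers:   { 'queue1' => 5, '*' => 2 },
  other_queue_workers: { 'queue2,queue3,queue4' => 5 }
}.freeze

# Map each host to the queue => count hash defined for its role.
def workers_by_host(roles, workers)
  roles.each_with_object({}) do |(role, hosts), result|
    hosts.each { |host| result[host] = workers.fetch(role, {}) }
  end
end

# Total number of worker processes described by a queue => count hash.
def total_workers(queues)
  queues.values.sum
end

workers_by_host(RESQUE_ROLES, RESQUE_WORKERS).each do |host, queues|
  puts "#{host}: #{total_workers(queues)} workers #{queues.inspect}"
end
```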

Capistrano 3 discards :user variable when executing remote SSH command

Trying to test Capistrano from scratch.
Capfile:
require 'capistrano/setup'
require 'capistrano/deploy'
I18n.enforce_available_locales = false
Dir.glob('lib/capistrano/tasks/*.rb').each { |r| import r }
deploy.rb:
role :testrole, 'x.x.x.x'
set :user, 'ubuntu'
The test.rb task:
namespace :test do
  desc "Uptime on servers"
  task :uptime do
    on roles(:testrole) do
      execute "uptime"
    end
  end
end
cap command:
cap production test:uptime
output:
INFO [c077da7f] Running /usr/bin/env uptime on x.x.x.x
DEBUG [c077da7f] Command: /usr/bin/env uptime
cap aborted!
Net::SSH::AuthenticationFailed
I have no problem logging in from the shell using the same user and key.
While logged in to the remote server, I can see in auth.log that an empty user is given when cap runs:
test-srv sshd[1459]: Invalid user from x.x.x.x
What am I missing?
Thanks!
If you take a look at their example code, supplied when you cap install your project, you'll see something like this in staging.rb and production.rb:
# Simple Role Syntax
# ==================
# Supports bulk-adding hosts to roles, the primary
# server in each group is considered to be the first
# unless any hosts have the primary property set.
# Don't declare `role :all`, it's a meta role
role :app, %w{deploy@example.com}
role :web, %w{deploy@example.com}
role :db, %w{deploy@example.com}
# Extended Server Syntax
# ======================
# This can be used to drop a more detailed server
# definition into the server list. The second argument
# something that quacks like a hash can be used to set
# extended properties on the server.
server 'example.com', user: 'deploy', roles: %w{web app}, my_property: :my_value
You'll either want to specify your user in one of those places, or use fetch(:user) to grab it programmatically at runtime. E.g.,
server 'example.com', user: fetch(:user), roles: %w{web app}, my_property: :my_value
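The sketch below illustrates where the login name can come from: either the `user@host` form of the host string or an explicit `user:` property. It is a plain-Ruby model, not Capistrano or SSHKit code; `resolve_login` is a hypothetical helper, and letting the explicit property take precedence is an assumption of this sketch.

```ruby
# Simplified model of how a login user can be derived for a server.
# Assumption: an explicit user: property wins over a "user@" prefix.
def resolve_login(host_string, user: nil)
  parsed_user, host =
    host_string.include?('@') ? host_string.split('@', 2) : [nil, host_string]
  { user: user || parsed_user, host: host }
end

puts resolve_login('deploy@example.com').inspect
puts resolve_login('example.com', user: 'ubuntu').inspect
puts resolve_login('x.x.x.x').inspect
```

In this model a bare host string with no user property resolves to no user at all, which is consistent with the empty user seen in auth.log in the question.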

Multiple delayed_jobs roles with Capistrano?

I have a question that I am not finding much useful information for. I'm wondering if this is possible and, if so, how to best implement it.
We are building an app in Rails which has heavy data-processing in the background via DelayedJob (…it is working well for us.)
The app runs in AWS and we have a few different environments configured in Capistrano.
When we have heavy processing loads, our DelayedJob queues can back up--which is mostly fine. I do have one or two queues that I'd like to have a separate node tend to. Since it would be ignoring the 'clogged' queues, it would keep tending its one or two queues and they would stay current. For example, some individual jobs can take over an hour and I wouldn't want a forgotten-password-email delivery to be held up for 90 minutes until the next worker completes a task and checks for a priority job.
What I want is to have a separate EC2 instance that has one worker launched that tends to two different, explicit queues.
I can do this manually on my dev machine by launching one or two workers with the '--QUEUES' option.
Here is my question: how can I define a new role in Capistrano and tell that role's nodes to start a different number of workers and tend to specific queues? Again, my normal delayed_jobs role is set to 3 workers and runs all queues.
Is this possible? Is there a better way?
Presently on Rails 3.2.13 with PostgreSQL 9.2 and the delayed_job gem.
Try this code - place it in deploy.rb after requiring default delayed_job recipes.
# This overrides default delayed_job tasks to support args per role
# If you want to use command line options, for example to start multiple workers,
# define a Capistrano variable delayed_job_args_per_role:
#
# set :delayed_job_args_per_role, {:worker_heavy => "-n 4",:worker_light => "-n 1" }
#
# Target server roles are taken from delayed_job_args_per_role keys.
namespace :delayed_job do
  def args_per_host(host)
    roles.each do |role|
      find_servers(:roles => role).each do |server|
        return args[role] if server.host == host
      end
    end
  end

  def args
    fetch(:delayed_job_args_per_role, {:app => ""})
  end

  def roles
    args.keys
  end

  desc "Start the delayed_job process"
  task :start, :roles => lambda { roles } do
    find_servers_for_task(current_task).each do |server|
      run "cd #{current_path};#{rails_env} script/delayed_job start #{args_per_host server.host}", :hosts => server.host
    end
  end

  desc "Restart the delayed_job process"
  task :restart, :roles => lambda { roles } do
    find_servers_for_task(current_task).each do |server|
      run "cd #{current_path};#{rails_env} script/delayed_job restart #{args_per_host server.host}", :hosts => server.host
    end
  end
end
P.S. I've tested it only with single role in hash, but multiple roles should work fine too.
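The host-to-arguments lookup at the heart of that recipe can be tried out in isolation. The following is a plain-Ruby sketch with an explicit empty-string fallback when no role matches; `DJ_ARGS_PER_ROLE`, `DJ_ROLE_HOSTS`, `dj_args_for`, and the host names are hypothetical and stand in for the Capistrano variables and server list.

```ruby
# Simplified model of per-role delayed_job arguments; not the recipe itself.
DJ_ARGS_PER_ROLE = { worker_heavy: '-n 4', worker_light: '-n 1' }.freeze

# Hypothetical role => hosts table standing in for Capistrano's server list.
DJ_ROLE_HOSTS = {
  worker_heavy: ['heavy1.example.com'],
  worker_light: ['light1.example.com']
}.freeze

# Return the argument string for a host, or "" when the host has no listed role.
def dj_args_for(host)
  role, = DJ_ROLE_HOSTS.find { |_role, hosts| hosts.include?(host) }
  DJ_ARGS_PER_ROLE.fetch(role, '')
end

puts dj_args_for('heavy1.example.com')
puts dj_args_for('light1.example.com')
```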
In Capistrano 3, using the official capistrano3-delayed-job gem, you can do this without modifying the Capistrano methods:
# If you have several servers handling Delayed Jobs and you want to configure
# different pools per server, you can define delayed_job_pools_per_server:
#
# set :delayed_job_pools_per_server, {
#   'server11-prod' => {
#     'default,emails' => 3,
#     'loud_notifications' => 1,
#     'silent_notifications' => 1,
#   },
#   'server12-prod' => {
#     'default' => 2
#   }
# }
# Server names (server11-prod, server12-prod) in :delayed_job_pools_per_server
# must match the hostnames on Delayed Job servers. You can verify it by running
# `hostname` on your servers.
# If you use :delayed_job_pools_per_server, :delayed_job_pools will be ignored.
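As a rough sketch of what such a per-server pool definition amounts to, the code below turns a queues-to-count hash into `--pool=queues:count` flags, the form accepted by the delayed_job daemon script. Whether the gem builds exactly these flags internally is an assumption; `pool_args` and `POOLS_PER_SERVER` are hypothetical names.

```ruby
# Simplified sketch: turn a queues => worker-count hash into --pool flags.
# Assumption: one --pool=<queues>:<count> flag per entry.
def pool_args(pools)
  pools.map { |queues, count| "--pool=#{queues}:#{count}" }.join(' ')
end

POOLS_PER_SERVER = {
  'server11-prod' => { 'default,emails' => 3, 'loud_notifications' => 1 },
  'server12-prod' => { 'default' => 2 }
}.freeze

POOLS_PER_SERVER.each do |hostname, pools|
  puts "#{hostname}: #{pool_args(pools)}"
end
```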

Why keep a copy of an app on the DB host?

A lot of Capistrano example recipes include a :db role. By default the deploy task exports the app code to all hosts in all roles, which suggests that it's typical for people to keep a copy of their app on the DB host. Also, in the deploy.rb recipe distributed with Capistrano, deploy:migrate looks like this:
task :migrate, :roles => :db, :only => { :primary => true } do
# ...
end
My question is, why is it done like that? Wouldn't it be cleaner to keep app code off the DB host (which might not even have Ruby installed) and run migrations from the production box?
The db server runs migrations because it is the one 'responsible' for the database(s).
One could also imagine security policies that only allow creating, dropping, or changing tables from the database server itself.
There might even be slight performance gains if there is data being loaded during a migration, although that is a terrible idea to begin with.
If you need to reference your database host but do not need a copy of the code on it, you can use something like this:
role :db, 'dbhost', :no_release => true
Sample code to run migrations on an application server:
role :app, 'apphost', :runs_migrations => true
task :migrate, :roles => :app, :only => { :runs_migrations => true } do
  # ...
end
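The effect of these two properties can be modelled in a few lines of plain Ruby. This is an illustrative sketch only, not Capistrano code; `DEPLOY_HOSTS`, `release_targets`, and `migration_targets` are hypothetical names.

```ruby
# Simplified model of :no_release and a custom :runs_migrations property.
DEPLOY_HOSTS = [
  { host: 'apphost', roles: [:app], runs_migrations: true },
  { host: 'dbhost',  roles: [:db],  no_release: true }
].freeze

# Hosts that receive a copy of the application code.
def release_targets(hosts)
  hosts.reject { |h| h[:no_release] }.map { |h| h[:host] }
end

# Hosts in the :app role that are flagged to run migrations.
def migration_targets(hosts)
  hosts.select { |h| h[:roles].include?(:app) && h[:runs_migrations] }
       .map { |h| h[:host] }
end

puts "code goes to:       #{release_targets(DEPLOY_HOSTS).join(', ')}"
puts "migrations run on:  #{migration_targets(DEPLOY_HOSTS).join(', ')}"
```

With this arrangement the database host stays addressable as a role member but never receives the code, and migrations run from the application server.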
