Heroku Rails Rake Task to Sync Production & Local DB - ruby-on-rails

I'm trying to create a rake task so that I can simply type "rake db:sync" in order to update my local DB to match production.
This solution leverages code provided by the Heroku team here:
Importing and Exporting Heroku Postgres Databases with PG Backups
When I use curl --output /tmp/latest.dump #{url} I'm getting the following error in my latest.dump file:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AuthorizationQueryParametersError</Code><Message>Query-string authentication version 4 requires the X-Amz-Algorithm, X-Amz-Credential, X-Amz-Signature, X-Amz-Date, X-Amz-SignedHeaders, and X-Amz-Expires parameters.</Message><RequestId>421FEFF763870123</RequestId><HostId>vlVr/ihmQiDgYIpdFFkuCgEP8Smvr2ks0wRkf89fJ8NfHfsBb92EVv40Q0NZuQIC</HostId></Error>
Here is the code I'm using.
#lib/tasks/db_sync.rake
namespace :db do
desc 'Pull production db to development'
task :sync => [:backup, :dump, :restore]
task :backup do
Bundler.with_clean_env {
puts 'Backup started...'
system "heroku pg:backups capture --app YOUR_APP_NAME"
puts 'Backup complete!'
}
end
task :dump do
dumpfile = "#{Rails.root}/tmp/latest.dump"
puts 'Fetching url and file...'
Bundler.with_clean_env {
url = `heroku pg:backups public-url --app YOUR_APP_NAME | cat`
system "curl --output #{dumpfile} #{url}"
}
puts 'Fetching complete!'
end
task :restore do
dev = Rails.application.config.database_configuration['development']
dumpfile = "#{Rails.root}/tmp/latest.dump"
puts 'PG_RESTORE on development database...'
system "pg_restore --verbose --clean --no-acl --no-owner -h localhost -U #{dev['username']} -d #{dev['database']} #{dumpfile}"
puts 'PG_RESTORE Complete!'
end
end

Check out the Parity gem. It offers several commands to do the following Heroku Rails tasks easily -
Backup DB's
Restore DB's
Run rails console
Tail logs
Run migrations
Deploy
You're of course primarily looking for the first two.
After installation, it expects that you have two git remote values set named staging and production. development isn't needed as it is assumed to be your local machine.
You can get the git url for the other two environments from your Heroku dashboard -> (your app) -> Settings -> Info
After you have that set up, it's as simple as
production backup
development restore production
The code is pretty simple, so I encourage you to read it. But it's essentially doing exactly what your rake code attempts to do by getting a public URL and restoring it.
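A likely cause of the AuthorizationQueryParametersError in the question is that the signed URL returned by public-url contains & characters. When it is interpolated unquoted into a shell command, the shell ends the command at the first &, so S3 only receives part of the query string. Backticks also return the URL with a trailing newline. Here is a minimal sketch of a fix, assuming that is indeed the cause; curl_command is a hypothetical helper name, not part of the original task.

```ruby
require 'shellwords'

# Hypothetical helper: builds a curl command that downloads `url` to `dumpfile`.
# `strip` drops the trailing newline that backticks return, and
# Shellwords.escape keeps '&', '?' and '=' in the signed URL away from the shell.
def curl_command(url, dumpfile)
  "curl --output #{Shellwords.escape(dumpfile)} #{Shellwords.escape(url.strip)}"
end
```

In the :dump task this would be used as system curl_command(url, dumpfile). Calling system with separate arguments, as in system("curl", "--output", dumpfile, url.strip), avoids the shell entirely and should work just as well.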

Related

Pushing a single table to Heroku

I am aware of the heroku pg:push command which pushes an entire database up to Heroku.
Now that I am launching my product, I would like to be able to push up only a specific table that contains information collected locally without overwriting existing tables (such as users).
Is there a command that enables me to only push specific tables to heroku?
My suggestion is to use PostgreSQL dump/restore capabilities directly using the pg_dump and psql commands.
With pg_dump you can dump a specific table from your local database
$ pg_dump --data-only --table=products sourcedb > products.sql
Then grab the Heroku PostgreSQL connection string from the configs
$ heroku config | grep HEROKU_POSTGRESQL
# example
# postgres://user3123:passkja83kd8@ec2-117-21-174-214.compute-1.amazonaws.com:6212/db982398
and restore the table in the remote database, using the information retrieved from Heroku.
$ psql -h ec2-117-21-174-214.compute-1.amazonaws.com -p 6212 -U user3123 db982398 < products.sql
You will need to customize the -p, -h and -U parameters, as well as the database name. The password will be prompted by psql.
You can also use pg_restore to filter a dump and restore the table, but I personally prefer psql.
Note that Heroku recommends the PostgreSQL tools in several of its docs, such as Importing and Exporting for large data, and whenever the provided CLI commands don't cover a specific case like the one in this question.
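The -h, -p and -U values can also be pulled out of the connection string programmatically instead of being copied by hand. This is a rough sketch using Ruby's standard URI library; psql_restore_command is a hypothetical helper name, and the URL in the usage note is the example one from above.

```ruby
require 'uri'
require 'shellwords'

# Hypothetical helper: turns a Heroku Postgres URL into the psql command
# that restores `sql_file` into that database.
# The password is deliberately left out, so psql will prompt for it.
def psql_restore_command(database_url, sql_file)
  uri = URI.parse(database_url.strip)      # strip the newline left by backticks
  database = uri.path.sub(%r{\A/}, '')     # "/db982398" -> "db982398"
  args = ['psql', '-h', uri.host, '-p', uri.port.to_s, '-U', uri.user, database]
  "#{Shellwords.join(args)} < #{Shellwords.escape(sql_file)}"
end
```

For the example URL above, psql_restore_command(url, "products.sql") builds the same psql line shown in this answer.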
I wrote a script that extracts the DB URL from Heroku, then dumps single tables from production and restores them on development/localhost. Run it like this:
rake production_to_development:run\['users;news;third_table',my-sushi-app\]
Code:
namespace :production_to_development do
task :run, [:tables, :app] => [:environment] do |t, args|
tables = args["tables"].split(';')
database_url = nil
Bundler.with_clean_env { database_url = `heroku config:get DATABASE_URL --app=#{args["app"]}` }
require 'addressable/uri'
uri = Addressable::URI.parse(database_url)
remote_database = uri.path[1,uri.path.length-2] # there is \n at the end of the path!
tables.each do |table|
backup_file = "tmp/#{table}.backup"
# If pg_dump is not on your PATH, set bin_dir with a trailing slash, e.g.
# bin_dir = "/Applications/Postgres.app/Contents/Versions/latest/bin/"
bin_dir = ""
dump_command = "PGPASSWORD=#{uri.password} #{bin_dir}pg_dump --file \"#{backup_file}\" --host \"#{uri.host}\" --port \"#{uri.port}\" --username \"#{uri.user}\" --no-password --verbose --format=c --blobs --table \"public.#{table}\" \"#{remote_database}\""
`#{dump_command}`
`psql -U 'root' -d my_table -c 'drop table if exists #{table}'`
`pg_restore -d my_table --no-owner #{backup_file}`
end
end
end
If I understand correctly, you just need a single database table with its locally created data pushed to your Rails production app. Maybe this is a simplistic approach, but you could create a migration for your table and then populate using db/seeds.rb.
After you've populated the seeds.rb file and pushed your repo to heroku:
heroku run rake db:migrate
heroku run rake db:seed
Also, if your local table has a ton of data and you're using Rails 4, check out the seed dump gem: https://github.com/rroblak/seed_dump. This will take your existing db data and map it to the seed format.

Authenticating heroku commands run from Rails app on heroku server

I am writing a script for periodically dumping the database and uploading it to S3
I have this method, which uses the Heroku gem and gets called from a rake task at various intervals:
def dump_database
Rails.logger.info "Dumping database into temporary file."
# Stamp the filename
datestamp = Time.now.strftime("%d-%m-%Y_%H-%M-%S")
# Drop it in the db/backups directory temporarily
file_path = "#{Rails.root}/db/backups/APPNAME_#{Rails.env}_#{datestamp}_dump.sql.gz"
# Dump and zip the backup file
sh "heroku pgbackups:capture --expire -a APPNAME-#{Rails.env}"
sh "curl -o #{file_path} `heroku pgbackups:url -a APPNAME-#{Rails.env}` | gzip > #{file_path}"
return file_path
end
The problem is that when I call
sh "heroku pgbackups:capture --expire -a APPNAME-#{Rails.env}"
Heroku asks for credentials (email/password) to be entered into the terminal.
Can't this be done without entering credentials?

Clear Memcached on Heroku Deploy

What is the best way to automatically clear Memcached when I deploy my rails app to Heroku?
I'm caching the home page, and when I make changes and redeploy, the page is served from the cache, and the updates aren't incorporated.
I want to have this be totally automated. I don't want to have to clear the cache in the heroku console each time I deploy.
Thanks!
I deploy my applications using a bash script that automates GitHub & Heroku push, database migration, application maintenance mode activation and cache clearing action.
In this script, the command to clear the cache is :
heroku run --app YOUR_APP_NAME rails runner -e production Rails.cache.clear
This works on the Celadon Cedar stack with the Heroku Toolbelt package. I know this is not a Rake-based solution, but it's quite efficient.
Note: be sure to set the runner's environment option (-e) to production; otherwise the command runs in the development environment.
Edit: for the past few days I have had issues with this command on Heroku (Rails 3.2.21). I have not had time to track down the cause, but removing -e production did the trick, so if the command does not succeed, run this one instead:
heroku run --app YOUR_APP_NAME rails runner Rails.cache.clear
[On the Celadon Cedar Stack]
-- [Update 18 June 2012 -- this no longer works, will see if I can find another workaround]
The cleanest way I have found to handle these post-deploy hooks is to latch onto the assets:precompile task that is already called during slug compilation. With a nod to asset_sync Gem for the idea:
Rake::Task["assets:precompile"].enhance do
# How to invoke a task that exists elsewhere
# Rake::Task["assets:environment"].invoke if Rake::Task.task_defined?("assets:environment")
# Clear cache on deploy
print "Clearing the rails memcached cache\n"
Rails.cache.clear
end
I just put this in a lib/tasks/heroku_deploy.rake file and it gets picked up nicely.
What I ended up doing was creating a new rake task that deployed to heroku and then cleared the cache. I created a deploy.rake file and this is it:
namespace :deploy do
task :production do
puts "deploying to production"
system "git push heroku"
puts "clearing cache"
system "heroku console Rails.cache.clear"
puts "done"
end
end
Now, instead of typing git push heroku, I just type rake deploy:production.
25 Jan 2013: this works for a Rails 3.2.11 app running on Ruby 1.9.3 on Cedar
In your Gemfile add the following line to force ruby 1.9.3:
ruby '1.9.3'
Create a file named lib/tasks/clear_cache.rake with this content:
if Rake::Task.task_defined?("assets:precompile:nondigest")
Rake::Task["assets:precompile:nondigest"].enhance do
Rails.cache.clear
end
else
Rake::Task["assets:precompile"].enhance do
# rails 3.1.1 will clear out Rails.application.config if the env vars
# RAILS_GROUP and RAILS_ENV are not defined. We need to reload the
# assets environment in this case.
# Rake::Task["assets:environment"].invoke if Rake::Task.task_defined?("assets:environment")
Rails.cache.clear
end
end
Finally, I also recommend running heroku labs:enable user-env-compile on your app so that its environment is available to you as part of the precompilation.
Aside from anything you can do inside your application that runs on 'application start', you could use the Heroku deploy hooks (http://devcenter.heroku.com/articles/deploy-hooks#http_post_hook) to hit a URL within your application that clears the cache.
I've added config/initializers/expire_cache.rb with
ActionController::Base.expire_page '/'
Works sweet!
Since the heroku gem is deprecated, an updated version of Solomon's very elegant answer would be to save the following code in lib/tasks/heroku_deploy.rake:
namespace :deploy do
task :production do
puts "deploying to production"
system "git push heroku"
puts "clearing cache"
system "heroku run rake cache:clear"
puts "done"
end
end
namespace :cache do
desc "Clears Rails cache"
task :clear => :environment do
Rails.cache.clear
end
end
then instead of git push heroku master you type rake deploy:production in command line.
To just clear the cache you can run rake cache:clear
The solution I like to use is the following:
First, I implement a deploy_hook action that looks for a parameter that I set differently for each app. Typically I just do this on the "home" or "public" controller, since it doesn't take that much code.
### routes.rb ###
post 'deploy_hook' => 'home#deploy_hook'
### home_controller.rb ###
def deploy_hook
Rails.cache.clear if params[:secret] == "a3ad3d3"
end
And, I simply tell heroku to setup a deploy hook to post to that action whenever I deploy!
heroku addons:add deployhooks:http \
--url=http://example.com/deploy_hook?secret=a3ad3d3
Now, every time I deploy, Heroku does an HTTP POST back to the site to let me know that the deploy worked just fine.
Works like a charm for me. Of course, the secret token is not "high security", and this shouldn't be used if clearing the caches would give an attacker a good way to take your site down. But, honestly, if the site is that critical to attack, then don't host it on Heroku! However, if you wanted to increase the security a bit, then you could use a Heroku configuration variable and not have the 'token' in the source code at all.
Hope people find this useful.
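To illustrate the configuration-variable idea, here is a rough sketch of the secret check with the token read from the environment instead of the source code. DEPLOY_HOOK_SECRET and valid_deploy_secret? are made-up names for illustration, and the byte-wise comparison is an extra precaution that was not part of the original action.

```ruby
require 'digest'

# Hypothetical helper: checks the secret sent by the deploy hook against a
# config var. DEPLOY_HOOK_SECRET is an assumed name; set it with heroku config.
def valid_deploy_secret?(given, expected = ENV['DEPLOY_HOOK_SECRET'])
  # Refuse to match when either side is missing, so an unset config var
  # never lets an empty secret through.
  return false if given.to_s.empty? || expected.to_s.empty?

  given_digest = Digest::SHA256.hexdigest(given.to_s)
  expected_digest = Digest::SHA256.hexdigest(expected.to_s)

  # XOR every byte pair so the comparison does not stop at the first difference.
  given_digest.bytes.zip(expected_digest.bytes)
              .reduce(0) { |diff, (a, b)| diff | (a ^ b) }
              .zero?
end
```

The controller action would then read Rails.cache.clear if valid_deploy_secret?(params[:secret]), and the token itself would live only in the app's config vars.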
I just had this problem as well but wanted to stick to the git deployment without an additional script as a wrapper.
So my approach is to write a file during slug generation with a UUID that marks the current precompilation. This is implemented as a hook in assets:precompile.
# /lib/tasks/store_asset_cacheversion.rake
# add uuidtools to Gemfile
require "uuidtools"
def storeCacheVersion
cacheversion = UUIDTools::UUID.random_create
File.open(".cacheversion", "w") { |file| file.write(cacheversion) }
end
Rake::Task["assets:precompile"].enhance do
puts "Storing git hash in file for cache invalidation (assets:precompile)\n"
storeCacheVersion
end
Rake::Task["assets:precompile:nondigest"].enhance do
puts "Storing git hash in file for cache invalidation (assets:precompile:nondigest)\n"
storeCacheVersion
end
The other is an initializer that checks this id against the cached version. If they differ, there has been another precompilation and the cache will be invalidated.
So it doesn't matter how often the application spins up or down, or across how many nodes the workers are distributed, because the slug generation happens just once.
# /config/initializers/00_asset_cache_check.rb
currenthash = File.read ".cacheversion"
cachehash = Rails.cache.read "cacheversion"
puts "Checking cache version: #{cachehash} against slug version: #{currenthash}\n"
if currenthash != cachehash
puts "flushing cache\n"
Rails.cache.clear
Rails.cache.write "cacheversion", currenthash
else
puts "cache ok\n"
end
I needed to use a random ID because, as far as I know, there is no way of getting the git hash or any other useful ID. Perhaps ENV[REQUEST_ID] would do, but that is a random ID as well.
The good thing about the uuid is, that it is now independent from heroku as well.
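If you would rather not add a gem just for the ID, Ruby's standard library can generate one too. This is only a sketch of the same idea with SecureRandom swapped in for uuidtools; store_cache_version is a hypothetical name, and the optional path argument is there so it can be tried outside a Rails app.

```ruby
require 'securerandom'

# Hypothetical variant of storeCacheVersion that needs no extra gem.
# Writes a fresh random UUID to `path` and returns it.
def store_cache_version(path = '.cacheversion')
  version = SecureRandom.uuid
  File.write(path, version)
  version
end
```

The initializer shown above would keep working unchanged, since it only reads the file back as a string.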

copy production database to staging capistrano

I am using rails and capistrano with a staging and production server. I need to be able to copy the production database to the staging database when I deploy to staging. Is there an easy way to accomplish this?
I thought about doing this with mysql and something like:
before "deploy:migrate" do
run "mysqldump -u root #{application}_production > output.sql"
run "mysql -u root #{application}_staging < output.sql"
end
(I have not tested this btw, so not sure it would even work)
but it would be easier / better if there was another way.
Thanks for any help
This is a quick way to do it also. This uses SSH remote commands and pipes to avoid temp files.
mysql -e 'DROP DATABASE stag_dbname;'
ssh prod.foo.com mysqldump -uprodsqluser -pprodsqlpw prod_dbname | gzip -c | gunzip -c | mysql stag_dbname
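One thing to note about the pipe above: gzip and gunzip both run on the local side, after the dump has already crossed the network, so the compression does not save any transfer. A rough sketch of a variant that compresses on the production host instead is below; clone_over_ssh_command is a hypothetical helper, and the host, user, password and database names are the placeholders from the snippet.

```ruby
require 'shellwords'

# Hypothetical helper: builds the ssh pipeline so that gzip runs on the
# production host and only gunzip and mysql run locally.
def clone_over_ssh_command(host:, user:, password:, prod_db:, staging_db:)
  # This whole pipeline is passed to ssh as ONE argument, so the remote
  # shell (not the local one) interprets its pipe.
  remote = "mysqldump -u#{Shellwords.escape(user)} -p#{Shellwords.escape(password)} " \
           "#{Shellwords.escape(prod_db)} | gzip -c"
  "ssh #{Shellwords.escape(host)} #{Shellwords.escape(remote)} " \
    "| gunzip -c | mysql #{Shellwords.escape(staging_db)}"
end
```

Called with host: "prod.foo.com", user: "prodsqluser", password: "prodsqlpw", prod_db: "prod_dbname" and staging_db: "stag_dbname", it builds the same transfer as above with the compression moved to the remote end.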
Here's my deployment snippet:
namespace :deploy do
task :clone_production_database, :except => { :no_release => true } do
mysql_user = "username"
mysql_password = "s3C_re"
production_database = "production"
preview_database = "preview"
run "mysql -u#{mysql_user} -p#{mysql_password} --execute='CREATE DATABASE IF NOT EXISTS #{preview_database}';"
run "mysqldump -u#{mysql_user} -p#{mysql_password} #{production_database} | mysql -u#{mysql_user} -p#{mysql_password} #{preview_database}"
end
end
before "deploy:migrate", "deploy:clone_production_database"
I do this -- it is really useful. Here are links explaining how ...
http://c.kat.pe/post/capistrano-task-for-loading-production-data-into-your-development-database/
or
http://blog.robseaman.com/2008/12/2/production-data-to-development
or
https://web.archive.org/web/20160404204752/http://blog.robseaman.com/2008/12/2/production-data-to-development
mysql -e 'DROP DATABASE stag_dbname;'
ssh prod.foo.com mysqldump -u prodsqluser
This may not work. At least, it does not work with PostgreSQL:
If your staging application is holding the database open, you cannot drop it.
If only some tables are locked, the remaining tables are still overwritten, so you end up with a corrupted database.
working link for the post above
https://web.archive.org/web/20160404204752/http://blog.robseaman.com/2008/12/2/production-data-to-development

transfer db from one heroku app to another faster

Is there a faster way to transfer my production database to a test app?
Currently I'm doing a heroku db:pull to my local machine then heroku db:push --app testapp but this is becoming time consuming. I have some seed data but it is not nearly as accurate as simply testing with my real-world data. And since they're both stored on a neighboring AWS cloud, there must be a faster way to move the data?
I thought about using a heroku bundle, but I noticed the animate command is gone?
bundles:animate <bundle> # animate a bundle into a new app
It's quite common to migrate databases between staging, testing and production environments for Rails apps, and heroku db:pull/push is painfully slow. The best way I have found so far is the Heroku PG Backups add-on, and it's free. I followed these steps to migrate the production database to the staging server:
1) Create the backup for the production-app db
heroku pg:backups capture --app production-app
This will generate b001 backup file from the main database (usually production db in database.yml)
2) To view all the backups (OPTIONAL)
heroku pg:backups --app production-app
3) Now use the pg:backups restore command to populate staging server database from the last backup file on production server
heroku pg:backups restore $(heroku pg:backups public-url --app production-app) DATABASE_URL --app staging-app
Remember that restore is a destructive operation, it will delete existing data before replacing it with the contents of the backup file.
So things are even easier now: check out the transfer command that is part of pgbackups
heroku pgbackups:transfer HEROKU_POSTGRESQL_PINK sushi-staging::HEROKU_POSTGRESQL_OLIVE -a sushi
https://devcenter.heroku.com/articles/upgrading-heroku-postgres-databases#4b-alternative-transfer-data-between-applications
This has worked beautifully for me taking production code back to my staging site.
The correct answer has changed again as of March 11, 2015.
heroku pg:backups restore $(heroku pg:backups public-url --app myapp-production) DATABASE_URL --app myapp-staging
Note specifically that the argument is now public-url.
https://blog.heroku.com/archives/2015/3/11/pgbackups-levels-up
Update for mid-2015...
The pgbackups add-on has been deprecated. No more pgbackups:transfer. pg:copy is ideal for this scenario.
To copy a database from yourapp (example db name: HEROKU_POSTGRESQL_PINK_URL) to yourapp-staging (example db name: HEROKU_POSTGRESQL_WHITE_URL):
# turn off the web dynos in staging
heroku maintenance:on -a yourapp-staging
# if you have non-web-dynos, do them too
heroku ps:scale worker=0 -a yourapp-staging
# backup the staging database if you are paranoid like me (optional)
heroku pg:backups capture -a yourapp-staging
# execute the copy to splat over the top of the staging database
heroku pg:copy yourapp::HEROKU_POSTGRESQL_PINK_URL HEROKU_POSTGRESQL_WHITE_URL -a yourapp-staging
Then when it's complete, turn staging back on:
# this is if you have workers, change '1' to whatever
heroku ps:scale worker=1 -a yourapp-staging
heroku maintenance:off -a yourapp-staging
Reminder: you can use heroku pg:info -a yourapp-staging (and yourapp) to get the database constants.
(source: https://devcenter.heroku.com/articles/upgrading-heroku-postgres-databases#upgrade-with-pg-copy-default)
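If you do this often, the sequence above can be kept in one place so the order is never mixed up. Here is a minimal sketch that only builds the command strings; staging_copy_commands is a hypothetical helper, and the app and database names are the example ones from this answer.

```ruby
# Hypothetical helper: returns the commands from this answer, in order,
# for copying source_app's database over target_app's database.
def staging_copy_commands(source_app:, source_db:, target_app:, target_db:, workers: 1)
  [
    "heroku maintenance:on -a #{target_app}",                                    # stop web traffic
    "heroku ps:scale worker=0 -a #{target_app}",                                 # stop workers
    "heroku pg:backups capture -a #{target_app}",                                # optional safety backup
    "heroku pg:copy #{source_app}::#{source_db} #{target_db} -a #{target_app}",  # the actual copy
    "heroku ps:scale worker=#{workers} -a #{target_app}",                        # restart workers
    "heroku maintenance:off -a #{target_app}"                                    # back online
  ]
end
```

A rake task could then run them with something like commands.each { |c| system(c) or abort("failed: #{c}") }, so that a failed step stops the sequence before staging is switched back on.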
psql -h test_host -c 'drop database test_db_name; create database test_db_name;'
pg_dump -h production_host production_db_name | psql -h test_host test_db_name
This can be done on production_host or on test_host — will work both ways.
Have not tested this, but it might work.
Do this to get the URL of your source database:
heroku console "ENV['DATABASE_URL']" --app mysourceapp
Then try executing db:push with that.
heroku db:push database_url_from_before --app mytargetapp
This might not work if Heroku doesn't allow access to the DB machines from outside their network, which is probably the case. You could, perhaps, try using taps (gem that heroku db commands use internally) from within your app code somewhere (maybe a rake task). This would be even faster than the above approach because everything stays completely within AWS.
Edit:
Here's an (admittedly hacky) way to do what I described above:
Grab the database URL as in the first code snippet above. Then from a rake task (you could do it on console but you risk running into the 30 second timeout limit on console commands), execute a shell command to taps (couldn't easily determine whether it's possible to use taps directly from Ruby; all docs show use of the CLI):
`taps pull database_url_from_source_app #{ENV['DATABASE_URL']}`
The backticks are important; this is how Ruby denotes a shell command, which taps is. Hopefully the taps command is accessible from the app. This avoids the problem of accessing the database machine from outside Heroku, since you're running this command from within your app.
Heroku enables you to fork existing applications in production. Use heroku fork to copy an existing application, including add-ons, config vars, and Heroku Postgres data.
Follow the instructions on Heroku: https://devcenter.heroku.com/articles/fork-app
Update for mid-2016...
Heroku now has a --fast flag for creating forks; however, forks created this way can be up to 30 hours out of date.
$ heroku addons:create heroku-postgresql:standard-4 --fork HEROKU_POSTGRESQL_CHARCOAL --fast --app sushi
https://devcenter.heroku.com/articles/heroku-postgres-fork#fork-fast-option
