With Ruby on Rails, is there a way for me to dump my production database into a form that the test part of Rails can access?
I'm thinking either a way to turn the production database into fixtures, or else a way to migrate data from the production database into the test database that will not get routinely cleared out by Rails.
I'd like to use this data for a variety of tests, but foremost in my mind is using real data with the performance tests, so that I can get a realistic understanding of load times.
You can also check out http://github.com/napcs/lazy_developer which will allow you to put the production data into yaml files.
We just had a similar problem and ended up writing a helper method in rspec that fetches some data (in our case, login details for some accounts) from the production database.
The following should give an idea:
require 'yaml'
def accounts
#accounts ||= lambda {
config = YAML.load_file("#{Rails.root}/config/database.yml")['production']
dbh = Mysql.real_connect(config['host'], config['username'], config['password'], config['database'])
accounts = []
result = dbh.query("SELECT `accounts`.* FROM `accounts`")
while row = result.fetch_hash do
row.delete("id")
accounts << Account.make(row)
end
dbh.close
return accounts
}.call
end
You can use the seed.rb inside the db folder, populate your test database with the data you need. There is nice example available on Railscasts: http://railscasts.com/episodes?search=seed
Would recommend you though to keep your production data away from your testing environments. And do make backups!!!
Related
I've sucessfully rolled my own roles and permissions based authorisation system:
Roles and Permissions are both models in the database that have a many-to-many relationship through another table.
Users belong_to a role and have_many permissions through that role
Admin users are able to create new roles and define which permissions go with which roles.
Controllers and Views check the current user has particular permissions before acting or rendering certain things.
It all works wonderfully. Here's a key part of the code which gets included into application_controller,rb:
def permitted_action (permission_names)
# if the users permissions do not instersect with those given then redirect to root
redirect_to(root_url) if (permission_names & current_user.permissions.map(&:name)).empty?
end
and here's a controller validating user permissions:
before_action only: [:new, :create] do
permitted_action ['create_user']
end
My problem is that permissions are DB rows with unique string names, and I use the names when I want to identify the permissions. So I seed the DB with all the permissions that I need, and then when a View or Controller needs to check the permission I need to get the permission name right at that point. If I check for permission "add_user" but in the db the permission name is "add_users" it goes wrong.
Then in testing I have to put all the permissions in again as fixtures, and get all the names right again.
When I add functionality that requires more permissions I have to add these permissions in with the same string in at least 3 different places. That seems like the wrong thing to me.
Ideally the controllers would define what the permissions are that they use, and the db/seeds.rb file would pick up those and seed the db, and the test fixtures would also pick them up, but I'm not sure if this is possible.
How should I structure this?
Edit:
I think what would help me is a fixtures parser that feeds the db/seeds.rb file. Taking something like this from permissions.yml:
archive_tally:
name: archive_tally
description: Archive or make active any tally
category: 3
And spitting out something like this:
archive_tally = Permission.find_or_initiate_by_name("archive_tally")
archive_tally.description = "Archive or make active any tally"
archive_tally.category = 3
archive_tally.save!
I don't think this sort of thing exists yet.
You can actually cut this down to just being the user table (or roles too depending on how many there are) by adding fields like is_admin or is_general_user to group your users into permissions groups. Then the permissions are just a matter of methods on the User Model, for example:
def can_create_new_roles?
self.is_admin
end
So now you can just do
before_action only: [:create, :new] do
redirect_to root_path unless current_user.can_create_new_roles?
end
Which reads much nicer. Plus since this all happens on the User model vs. on its own DB table it would make testing for all of these UserPermissions much easier.
This is what I've come up with and it satisfies me, though it's not the ideal solution that I was hunting for in the question:
It turns out that I want the production database to be seeded with fixtures that are a subset of the fixtures I want appearing in the testing database.
I've made a directory db/seed_fixtures that includes files such as roles.yml and permissions.yml. In these files I've put the fixtures I want to seed the production database with.
In tests/fixtures/roles.yml I've included the other fixtures with this line at the top of the file:
<%= ERB.new(IO.read(Rails.root.join "db/seed_fixtures/roles.yml")).result %>
Same deal with permissions.yml and other YAML files this applies to. I put any testing-only fixtures in there under that include line.
In db/seeds.rb I generate seeding code from the fixtures for production like this: ActiveRecord::Fixtures.create_fixtures("#{Rails.root}/db/seed_fixtures", "roles")
Now I write all my fixtures once each, some for use as seeds in the production database, others are only for testing. To load the seeds in to the database I run rake db:seed (which I do on the production server) and to load all the testing fixtures in the database I run rake db:fixtures:load
Update:
I've just started using Cucumber and I've realised that I can use the code in step 3 above in a Cucumber Step to get the seed data into Cucumber's test database:
Given(/^seed data is loaded into the database$/) do
ActiveRecord::FixtureSet.create_fixtures("#{Rails.root}/db/seed_fixtures", "roles")
_(Role.count).wont_equal 0
end
Cucumber also informed me that ActiveRecord::Fixtures is deprecated in favour of ActiveRecord::FixtureSet
I've come up with a completely different solution: I've created a fixture parser.
Most of my first solution still applies (see my other answer) but now, instead of step 3 in that solution, I've made a parser that reads the fixtures that I want to use for seeds and adds them to the database, checking for whether they are already there or not. I run the parser from db/seeds.rb
I've put the code for the parser up as a gist on github
How can we run a ruby on rails application with different database configuration?
in detail: I want to run multiple instance of a rails app with different database config for each on production. How is it possible?
I think you can duplicate the config in database.yml into different environments, like prod1, prod2 ... and then set RAILS_ENV environment variable to match before starting up each respective server...
You can duplicate your database.yml file as DGM mentioned. However, the right way to do this would be to use a configuration management solution like Chef.
If you look at the guide for setting up Rails stack, it includes a 2 front-end Web server + 1 back-end DB server. Which would include your case of duplicating database.yml file.
If you are able to control and configure each Rails instance, and you can afford wasting resources because of them being on standby, save yourself some trouble and just change the database.yml to modify the database connection used on every instance. If you are concerned about performance this approach won't cut it.
For models bound to a single unique table on only one database you can call establish_connection inside the model:
establish_connection "database_name_#{RAILS_ENV}"
As described here: http://apidock.com/rails/ActiveRecord/Base/establish_connection/class
You will have some models using tables from one database and other different models using tables from other databases.
If you have identical tables, common on different databases, and shared by a single model, ActiveRecord won't help you. Back in 2009 I required this on a project I was working on, using Rails 2.3.8. I had a database for each customer, and I named the databases with their IDs. So I created a method to change the connection inside ApplicationController:
def change_database database_id = params[:company_id]
return if database_id.blank?
configuration = ActiveRecord::Base.connection.instance_eval { #config }.clone
configuration[:database] = "database_name_#{database_id}_#{RAILS_ENV}"
MultipleDatabaseModel.establish_connection configuration
end
And added that method as a *before_filter* to all controllers:
before_filter :change_database
So for each action of each controller, when params[:company_id] is defined and set, it will change the database to the correct one.
To handle migrations I extended ActiveRecord::Migration, with a method that looks for all the customers and iterates a block with each ID:
class ActiveRecord::Migration
def self.using_databases *args
configuration = ActiveRecord::Base.connection.instance_eval { #config }
former_database = configuration[:database]
companies = args.blank? ? Company.all : Company.find(args)
companies.each do |company|
configuration[:database] = "database_name_#{company[:id]}_#{RAILS_ENV}"
ActiveRecord::Base.establish_connection configuration
yield self
end
configuration[:database] = former_database
ActiveRecord::Base.establish_connection configuration
end
end
Note that by doing this, it would be impossible for you to make queries within the same action from two different databases. You can call *change_database* again but it will get nasty when you try using methods that execute queries, from the objects no longer linked to the correct database. Also, it is obvious you won't be able to join tables that belong to different databases.
To handle this properly, ActiveRecord should be considerably extended. There should be a plugin by now to help you with this issue. A quick research gave me this one:
DB-Charmer: http://kovyrin.github.com/db-charmer/
I'm willing to try it. Let me know what works for you.
Well. We have to create multiple environments in you application
create config/environmenmts/production1.rb which will be same as of config/environmenmts/production.rb
then edit database.yml for production1 settings and you are done.
start server using rails s -e production1
I need to populate my production database app with data in particular tables. This is before anyone ever even touches the application. This data would also be required in development mode as it's required for testing against. Fixtures are normally the way to go for testing data, but what's the "best practice" for Ruby on Rails to ship this data to the live database also upon db creation?
ultimately this is a two part question I suppose.
1) What's the best way to load test data into my database for development, this will be roughly 1,000 items. Is it through a migration or through fixtures? The reason this is a different answer than the question below is that in development, there's certain fields in the tables that I'd like to make random. In production, these fields would all start with the same value of 0.
2) What's the best way to bootstrap a production db with live data I need in it, is this also through a migration or fixture?
I think the answer is to seed as described here: http://lptf.blogspot.com/2009/09/seed-data-in-rails-234.html but I need a way to seed for development and seed for production. Also, why bother using Fixtures if seeding is available? When does one seed and when does one use fixtures?
Usually fixtures are used to provide your tests with data, not to populate data into your database. You can - and some people have, like the links you point to - use fixtures for this purpose.
Fixtures are OK, but using Ruby gives us some advantages: for example, being able to read from a CSV file and populate records based on that data set. Or reading from a YAML fixture file if you really want to: since your starting with a programming language your options are wide open from there.
My current team tried to use db/seed.rb, and checking RAILS_ENV to load only certain data in certain places.
The annoying thing about db:seed is that it's meant to be a one shot thing: so if you have additional items to add in the middle of development - or when your app has hit production - ... well, you need to take that into consideration (ActiveRecord's find_or_create_by...() method might be your friend here).
We tried the Bootstrapper plugin, which puts a nice DSL over the RAILS_ENV checking, and lets your run only the environment you want. It's pretty nice.
Our needs actually went beyond that - we found we needed database style migrations for our seed data. Right now we are putting normal Ruby scripts into a folder (db/bootstrapdata/) and running these scripts with Arild Shirazi's required gem to load (and thus run) the scripts in this directory.
Now this only gives you part of the database style migrations. It's not hard to go from this to creating something where these data migrations can only be run once (like database migrations).
Your needs might stop at bootstrapper: we have pretty unique needs (developing the system when we only know half the spec, larg-ish Rails team, big data migration from the previous generation of software. Your needs might be simpler).
If you did want to use fixtures the advantage over seed is that you can easily export also.
A quick guess at how the rake task may looks is as follows
desc 'Export the data objects to Fixtures from data in an existing
database. Defaults to development database. Set RAILS_ENV to override.'
task :export => :environment do
sql = "SELECT * FROM %s"
skip_tables = ["schema_info"]
export_tables = [
"roles",
"roles_users",
"roles_utilities",
"user_filters",
"users",
"utilities"
]
time_now = Time.now.strftime("%Y_%h_%d_%H%M")
folder = "#{RAILS_ROOT}/db/fixtures/#{time_now}/"
FileUtils.mkdir_p folder
puts "Exporting data to #{folder}"
ActiveRecord::Base.establish_connection(:development)
export_tables.each do |table_name|
i = "000"
File.open("#{folder}/#{table_name}.yml", 'w') do |file|
data = ActiveRecord::Base.connection.select_all(sql % table_name)
file.write data.inject({}) { |hash, record|
hash["#{table_name}_#{i.succ!}"] = record
hash }.to_yaml
end
end
end
desc "Import the models that have YAML files in
db/fixture/defaults or from a specified path."
task :import do
location = 'db/fixtures/default'
puts ""
puts "enter import path [#{location}]"
location_in = STDIN.gets.chomp
location = location_in unless location_in.blank?
ENV['FIXTURES_PATH'] = location
puts "Importing data from #{ENV['FIXTURES_PATH']}"
Rake::Task["db:fixtures:load"].invoke
end
I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
desc "Load seed fixtures (from db/fixtures) into the current environment's database."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
end
end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?
Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated and since most Rails developers don't use database constraints this means you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.
For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
if Rails.env.production?
puts "!!!WARNING!!! This task will DESTROY " +
"your production database and RESET all " +
"application settings"
puts "Continue? y/n"
continue = STDIN.gets.chomp
unless continue == 'y'
puts "Exiting..."
exit!
end
end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed rake gems:install, etc into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on production server. Rails' db/seed.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.
Rails 3 will solve this for you using a seed.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb
We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.
How about just deleting the task off your production server once you have seeded the database?
I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
def up
add_to(:states, [
{:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
{:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
]
end
end
def down
remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
end
end
alternately:
def up
State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?
I have a migration in Rails that inserts a record into the database. The Category model depends on this record. Since RSpec clears the database before each example, this record is lost and furthermore never seems to be created since RSpec does not seem to generate the database from migrations. What is the best way to create/recreate this record in the database? Would it be using before(:all)?
It's not that RSpec clears the database, it's that Rails's rake:db:prepare task copies the schema (but not the contents) of your dev database into your *_test db.
Yes, you can use before(:all), as transactions are wrapped around each individual example - but a simple fixture file would also do the same job.
(There's a more complicated general solution to this issue: moving to a service-oriented architecture, where your 'dev' and 'test' services are going to be completely separate instances. You can then point your test db config to the development database in your test service, disable rake:db:prepare, and build your test service from migrations as you regenerate it. Then you can test your migrations and data transformations.)
What I like to do is create a folder in db/migration called data, and then put yml fixtures in there, in your case categories.yml
Then I create a migration with the following
def self.up
down
directory = File.join( File.dirname(__FILE__), "data" )
Fixtures.create_fixtures( directory, "categories" )
end
def self.down
Category.delete_all
end