Populate a constant values table - ruby-on-rails

In a Rails application, I need a table in my database to contain constant data.
This table content is not intended to change for the moment but I do not want to put the content in the code, to be able to change it whenever needed.
I tried filling this table in the migration that created it, but this does not seem to work with the test environment and breaks my unit tests. In test environment, my model is never able to return any value while it is ok in my development environment.
Is there a way to fill that database correctly even in test environment ? Is there another way of handling these kind of data that should not be in code ?
edit
Thanks all for your answers and especially Vlad R for explaining the problem.
I now understand why my data are not loaded in test. This is because the test environment uses the db:load rake command which directly loads the schema instead of running the migrations. Having put my values in the migration only and not in the schema, these values are not loaded for test.

What you are probably observing is that the test framework is not running the migrations (db:migrate), but loading db/schema.rb directly (db:load) instead.
You have two options:
continue to use the migration for production and development; for the test environment, add your constant data to the corresponding yml files in db/fixtures
leave the existing db/fixtures files untouched, and create another set of yml files (containing the constant data) in the same vein as db/fixtures, but usable by both test and production/development environments when doing a rake db:load schema initialization
To cover those scenarios that use db:load (instead of db:migrate - e.g. test, bringing up a new database on a new development machine using the faster db:load instead of db:migrate, etc.) is create a drop-in rakefile in RAILS_APP/lib/tasks to augment the db:load task by loading your constant intialization data from "seed" yml files (one for each model) into the database.
Use the db:seed rake task as an example. Put your seed data in db/seeds/.yml
#the command is: rake:db:load
namespace :db do
desc 'Initialize data from YAML.'
task :load => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/seeds/*.yml').each do |file|
Fixtures.create_fixtures('db/seeds', File.basename(file, '.*'))
end
end
end
To cover the incremental scenarios (db:migrate), define one migration that does the same thing as the task defined above.
If your seed data ever changes, you will need to add another migration to remove the old seed data and load the new one instead, which may be non-trivial in case of foreign-key dependencies etc.

Take a look at my article on loading seed data.
There's a number of ways to do this. I like a rake task called db:populate which lets you specify your fixed data in normal ActiveRecord create statements. For getting the data into tests, I've just be loading this populate file in my test_helper. However, I think I am going to switch to a test database that already has the seed data populated.
There's also plugin called SeedFu that helps with this problem.
Whatever you do, I recommend against using fixtures for this, because they don't validate your data, so it's very easy to create invalid records.

Related

a question on database seed.rb

If I have the following code defined inside db/seeds.rb,
default_car=Car.create({:name=>'TOYOTA'})
User.create({:username=>'default_user', car_id=>default_car.id})
I know the default_car and the user instances will be stored into Database when I run "rake db:seed".
My question is, if I run 'rake db:seed' again, again and again(multiple times), will the same instances be stored to database with multiple copies or it only save the instance once into database no matter how many times I run rake db:seed?
Better solution:
default_car = Car.find_or_create_by_name 'TOYOTA'
user = User.find_or_create_by_username 'default_user'
user.car = default_car
user.save
That way you can run "rake db:seed" multiple times without having to drop the database manually every time.
This is a limitation of having a single seed file. I was finding this frustrating as the application grows you often want to add new seed data so you end up either doing what Pascal suggests or creating either migrations with data in them or rake tasks to load the seeds. To get round this I knocked up seedbank. So I combine this with Pascals approach so I can re-run the seeds but can also target specific ones if I want to.
depends on your models if you allow duplicate values. if you don't it will throw an error. what you do is to clear your db first before running seed via rake db:resetdb

What is the scope of the code in db/seeds.rb in Ruby on Rails?

Question 1
Suppose I write define some variables in db/seeds.rb, e.g.: user = User.create(...).
What is the scope of these variables ?
Question 2
If I have a big amount of code in db/seeds.rb, is it recommended to put it in a class ?
The variables are in the scope of the rake instance that has been started.
So they would be in scope for other tasks if multiple tasks where started at once.
For example
rake db:seed custom:sometask
Instance variables defined in db:seed could be accessed in 'sometask'
If the rake file is too big because of adding too many records, you could move the data that is to be inserted into a yaml file, that could make your seeds file cleaner, rather than creating a class.
Seed data is anything that must be loaded for an application to work properly. An application needs its seed data loaded in order to run in development, test, and production.
Seed data is mostly unchanging. It typically won’t be edited in your application. But requirements can and do change, so seed data may need to be reloaded on deployed applications.
Answer for your second question
lines of code in seed.rb doesn't affect the performance the basic task of seed is to initialize the database with predefined records. Keep one thing in mind that the parent creation is done before the child is created.
Here are some references that might help you
ASCIICasts
Rail Spikes

Ruby on Rails 2.3.5: Populating my prod and devel database with data (migration or fixture?)

I need to populate my production database app with data in particular tables. This is before anyone ever even touches the application. This data would also be required in development mode as it's required for testing against. Fixtures are normally the way to go for testing data, but what's the "best practice" for Ruby on Rails to ship this data to the live database also upon db creation?
ultimately this is a two part question I suppose.
1) What's the best way to load test data into my database for development, this will be roughly 1,000 items. Is it through a migration or through fixtures? The reason this is a different answer than the question below is that in development, there's certain fields in the tables that I'd like to make random. In production, these fields would all start with the same value of 0.
2) What's the best way to bootstrap a production db with live data I need in it, is this also through a migration or fixture?
I think the answer is to seed as described here: http://lptf.blogspot.com/2009/09/seed-data-in-rails-234.html but I need a way to seed for development and seed for production. Also, why bother using Fixtures if seeding is available? When does one seed and when does one use fixtures?
Usually fixtures are used to provide your tests with data, not to populate data into your database. You can - and some people have, like the links you point to - use fixtures for this purpose.
Fixtures are OK, but using Ruby gives us some advantages: for example, being able to read from a CSV file and populate records based on that data set. Or reading from a YAML fixture file if you really want to: since your starting with a programming language your options are wide open from there.
My current team tried to use db/seed.rb, and checking RAILS_ENV to load only certain data in certain places.
The annoying thing about db:seed is that it's meant to be a one shot thing: so if you have additional items to add in the middle of development - or when your app has hit production - ... well, you need to take that into consideration (ActiveRecord's find_or_create_by...() method might be your friend here).
We tried the Bootstrapper plugin, which puts a nice DSL over the RAILS_ENV checking, and lets your run only the environment you want. It's pretty nice.
Our needs actually went beyond that - we found we needed database style migrations for our seed data. Right now we are putting normal Ruby scripts into a folder (db/bootstrapdata/) and running these scripts with Arild Shirazi's required gem to load (and thus run) the scripts in this directory.
Now this only gives you part of the database style migrations. It's not hard to go from this to creating something where these data migrations can only be run once (like database migrations).
Your needs might stop at bootstrapper: we have pretty unique needs (developing the system when we only know half the spec, larg-ish Rails team, big data migration from the previous generation of software. Your needs might be simpler).
If you did want to use fixtures the advantage over seed is that you can easily export also.
A quick guess at how the rake task may looks is as follows
desc 'Export the data objects to Fixtures from data in an existing
database. Defaults to development database. Set RAILS_ENV to override.'
task :export => :environment do
sql = "SELECT * FROM %s"
skip_tables = ["schema_info"]
export_tables = [
"roles",
"roles_users",
"roles_utilities",
"user_filters",
"users",
"utilities"
]
time_now = Time.now.strftime("%Y_%h_%d_%H%M")
folder = "#{RAILS_ROOT}/db/fixtures/#{time_now}/"
FileUtils.mkdir_p folder
puts "Exporting data to #{folder}"
ActiveRecord::Base.establish_connection(:development)
export_tables.each do |table_name|
i = "000"
File.open("#{folder}/#{table_name}.yml", 'w') do |file|
data = ActiveRecord::Base.connection.select_all(sql % table_name)
file.write data.inject({}) { |hash, record|
hash["#{table_name}_#{i.succ!}"] = record
hash }.to_yaml
end
end
end
desc "Import the models that have YAML files in
db/fixture/defaults or from a specified path."
task :import do
location = 'db/fixtures/default'
puts ""
puts "enter import path [#{location}]"
location_in = STDIN.gets.chomp
location = location_in unless location_in.blank?
ENV['FIXTURES_PATH'] = location
puts "Importing data from #{ENV['FIXTURES_PATH']}"
Rake::Task["db:fixtures:load"].invoke
end

Adding sample data to database using rake for a rails engine

I am trying out Rails engines by creating a classifieds engine where users can view/post/reply to classifieds.
The main application contains code for user authentication and profiles while there is an engine which I have created which will deal with the classifieds functionality.
Now I want to add some sample data to the database for the classifieds engine. So I created a rake file called 'sample_classifieds_data.rake' in 'vendor/plugins/classifieds/lib/tasks' and I added the yml files in 'vendor/plugins/classifieds/lib/tasks/sample_classifieds_data'
The code of the rake file and a sample yml file can be found here: http://gist.github.com/216776
Now the problem is that when I run the rake task, no error is being thrown but the values are not getting populated in the database.
Any ideas? BTW, it is development environment and the database is the development database.
I ran a similar rake task to populate sample users in the database which worked. the location of that rake file 'sample_data.rake' was located in 'lib/tasks'.
In rails edge, you can use the rake db:seed feature to add datas to your base. See the commit.
The use is pretty simple.
Create a db/seeds.rb file.
And put whatever code you want to seed your database in it.
For example :
Category.create!(:name => 'My Category')
Country.create!(:name => 'Cassoulet Land')
And when you want to seed your database, you could do a rake db:seed
If, for any reason, you do not wish to use edge (which would be comprehensible in a production environment), you can use the Seed Fu plugin, which quite does the trick for you.
Your task looks good. About the only thing would cause your task to fail silently is that the file you're passing to Fixture.new does not point to a yml or csv file.
Double check by modifying the put statement to print the full path of the file it imported, and compare what it prints against your directory structure.
For example, things will fail silently if your fixture files start with a capital letter? Categories.yml instead of categories.yml
The db:seed task was added in Rails 2.3.4. So no need to run edge.
http://weblog.rubyonrails.org/2009/9/4/ruby-on-rails-2-3-4

Is seeding data with fixtures dangerous in Ruby on Rails

I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
desc "Load seed fixtures (from db/fixtures) into the current environment's database."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
end
end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?
Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated and since most Rails developers don't use database constraints this means you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.
For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
if Rails.env.production?
puts "!!!WARNING!!! This task will DESTROY " +
"your production database and RESET all " +
"application settings"
puts "Continue? y/n"
continue = STDIN.gets.chomp
unless continue == 'y'
puts "Exiting..."
exit!
end
end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed rake gems:install, etc into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on production server. Rails' db/seed.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.
Rails 3 will solve this for you using a seed.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb
We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.
How about just deleting the task off your production server once you have seeded the database?
I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
def up
add_to(:states, [
{:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
{:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
]
end
end
def down
remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
end
end
alternately:
def up
State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?

Resources