What is the best way to seed a database in Rails?

What is the best way to seed a database in Rails? - ruby-on-rails

I have a rake task that populates some initial data in my rails app. For example, countries, states, mobile carriers, etc.
The way I have it set up now, is I have a bunch of create statements in files in /db/fixtures and a rake task that processes them. For example, one model I have is themes. I have a theme.rb file in /db/fixtures that looks like this:
Theme.delete_all
Theme.create(:id => 1, :name=>'Lite', :background_color=>'0xC7FFD5', :title_text_color=>'0x222222',
:component_theme_color=>'0x001277', :carrier_select_color=>'0x7683FF', :label_text_color=>'0x000000',
:join_upper_gradient=>'0x6FAEFF', :join_lower_gradient=>'0x000000', :join_text_color=>'0xFFFFFF',
:cancel_link_color=>'0x001277', :border_color=>'0x888888', :carrier_text_color=>'0x000000', :public => true)
Theme.create(:id => 2, :name=>'Metallic', :background_color=>'0x000000', :title_text_color=>'0x7299FF',
:component_theme_color=>'0xDBF2FF', :carrier_select_color=>'0x000000', :label_text_color=>'0xDBF2FF',
:join_upper_gradient=>'0x2B25FF', :join_lower_gradient=>'0xBEFFAC', :join_text_color=>'0x000000',
:cancel_link_color=>'0xFF7C12', :border_color=>'0x000000', :carrier_text_color=>'0x000000', :public => true)
Theme.create(:id => 3, :name=>'Blues', :background_color=>'0x0060EC', :title_text_color=>'0x000374',
:component_theme_color=>'0x000374', :carrier_select_color=>'0x4357FF', :label_text_color=>'0x000000',
:join_upper_gradient=>'0x4357FF', :join_lower_gradient=>'0xffffff', :join_text_color=>'0x000000',
:cancel_link_color=>'0xffffff', :border_color=>'0x666666', :carrier_text_color=>'0x000000', :public => true)
puts "Success: Theme data loaded"
The idea here is that I want to install some stock themes for users to start with. I have a problem with this method.
Setting the ID does not work. This means that if I decide to add a theme, let's call it 'Red', then I would simply like to add the theme statement to this fixture file and call the rake task to reseed the database. If I do that, because themes belong to other objects and their id's change upon this re-initialization, all links are broken.
My question is first of all, is this a good way to handle seeding a database? In a previous post, this was recommended to me.
If so, how can I hard code the IDs, and are there any downsides to that?
If not, what is the best way to seed the database?
I will truly appreciate long and thought out answers that incorporate best practices.

Updating since these answers are slightly outdated (although some still apply).
Simple feature added in rails 2.3.4, db/seeds.rb
Provides a new rake task
rake db:seed
Good for populating common static records like states, countries, etc...
http://railscasts.com/episodes/179-seed-data
*Note that you can use fixtures if you had already created them to also populate with the db:seed task by putting the following in your seeds.rb file (from the railscast episode):
require 'active_record/fixtures'
Fixtures.create_fixtures("#{Rails.root}/test/fixtures", "operating_systems")
For Rails 3.x use 'ActiveRecord::Fixtures' instead of 'Fixtures' constant
require 'active_record/fixtures'
ActiveRecord::Fixtures.create_fixtures("#{Rails.root}/test/fixtures", "fixtures_file_name")

Usually there are 2 types of seed data required.
Basic data upon which the core of your application may rely. I call this the common seeds.
Environmental data, for example to develop the app it is useful to have a bunch of data in a known state that us can use for working on the app locally (the Factory Girl answer above covers this kind of data).
In my experience I was always coming across the need for these two types of data. So I put together a small gem that extends Rails' seeds and lets you add multiple common seed files under db/seeds/ and any environmental seed data under db/seeds/ENV for example db/seeds/development.
I have found this approach is enough to give my seed data some structure and gives me the power to setup my development or staging environment in a known state just by running:
rake db:setup
Fixtures are fragile and flakey to maintain, as are regular sql dumps.

Using seeds.rb file or FactoryBot is great, but these are respectively great for fixed data structures and testing.
The seedbank gem might give you more control and modularity to your seeds. It inserts rake tasks and you can also define dependencies between your seeds. Your rake task list will have these additions (e.g.):
rake db:seed # Load the seed data from db/seeds.rb, db/seeds/*.seeds.rb and db/seeds/ENVIRONMENT/*.seeds.rb. ENVIRONMENT is the current environment in Rails.env.
rake db:seed:bar # Load the seed data from db/seeds/bar.seeds.rb
rake db:seed:common # Load the seed data from db/seeds.rb and db/seeds/*.seeds.rb.
rake db:seed:development # Load the seed data from db/seeds.rb, db/seeds/*.seeds.rb and db/seeds/development/*.seeds.rb.
rake db:seed:development:users # Load the seed data from db/seeds/development/users.seeds.rb
rake db:seed:foo # Load the seed data from db/seeds/foo.seeds.rb
rake db:seed:original # Load the seed data from db/seeds.rb

factory_bot sounds like it will do what you are trying to achieve. You can define all the common attributes in the default definition and then override them at creation time. You can also pass an id to the factory:
Factory.define :theme do |t|
t.background_color '0x000000'
t.title_text_color '0x000000',
t.component_theme_color '0x000000'
t.carrier_select_color '0x000000'
t.label_text_color '0x000000',
t.join_upper_gradient '0x000000'
t.join_lower_gradient '0x000000'
t.join_text_color '0x000000',
t.cancel_link_color '0x000000'
t.border_color '0x000000'
t.carrier_text_color '0x000000'
t.public true
end
Factory(:theme, :id => 1, :name => "Lite", :background_color => '0xC7FFD5')
Factory(:theme, :id => 2, :name => "Metallic", :background_color => '0xC7FFD5')
Factory(:theme, :id => 3, :name => "Blues", :background_color => '0x0060EC')
When used with faker it can populate a database really quickly with associations without having to mess about with Fixtures (yuck).
I have code like this in a rake task.
100.times do
Factory(:company, :address => Factory(:address), :employees => [Factory(:employee)])
end

Rails has a built in way to seed data as explained here.
Another way would be to use a gem for more advanced or easy seeding such as: seedbank.
The main advantage of this gem and the reason I use it is that it has advanced capabilities such as data loading dependencies and per environment seed data.
Adding an up to date answer as this answer was first on google.

The best way is to use fixtures.
Note: Keep in mind that fixtures do direct inserts and don't use your model so if you have callbacks that populate data you will need to find a workaround.

Add it in database migrations, that way everyone gets it as they update. Handle all of your logic in the ruby/rails code, so you never have to mess with explicit ID settings.

Related

Use seed_fu together with seedbank

I have been using the seedbank gem to give my Rails seeds some structure (i.e. environment specific seed folders, one seed file per model, order dependencies etc.)
Now I came across the seed_fu gem which among other things makes it easy and expressive to say "seed these records, and if one with that id exists, update the other fields". E.g:
Category.seed(:id,
{ :id => 1, :name => "Food" },
{ :id => 2, :name => "Drink" }
)
I can achieve the same result with some cumbersome ActiveRecord calls in my seed files, but I would much rather use the nice syntax provided by seed_fu. Additionally, I want to keep using the features seedbank gives me. Another rationale is that I may migrate only part of my seed files to the other syntax and that it makes no sense to use two rake commands side by side.
If I just put the above code in my db/seeds/categories.seeds.rb file and run rake db:seed:categories i get the error undefined method 'seed'. I guess I would somehow need to import the seed method from SeedFu:ActiveRecordExtension, but I don't know how.
I'm on Rails 3.2.13 and using the newest version of seed_fu straight from the github repo.

Short answer: seed_fu and seedbank work together without any problem.
I was using the zeus gem and hadn't reloaded my Rails environment so the seed_fu DSL just wasn't loaded yet.
(I will leave the question here for anyone wondering if you can use these gems together. TL;DR: yes.)

how add test data to my local sqlite in rails 3?

I need to add rows of test data to my sqlite.
INSERT INTO courses (`name`) VALUES
(‘java'),
(‘ruby').......
How can i do it with rails3?
Thanks

You can write the insert in activerecord way in db/seeds.rb something like:
courses = Course.create([{ :name => 'java' }, { :name => 'ruby' }])
then run rake db:seed.

Fixtures and FactoryGirl are two popular options for adding test data. I'm personally a bigger fan of FactoryGirl because you are working directly with the models in your tests, but it's really a matter of opinion. From my experience, people who use Test::Unit tend to be fans of fixtures, and those who use RSpec tend to like FactoryGirl. This isn't a hard rule though and you will definitely find people who mix them up. It seems to depend on whether you like using DSLs or plain Ruby in your tests.

You can use 'Factory_girl' for creating test objects in your database.
Just install gem and create Factory objects for your project.
this is the link about it

Why is this ERB code in a fixture throwing 'undefined method'?

I'm trying to use fixtures to add more complex test data in order to test specific scenarios with the front-end, which is in Flex. I'm not sure this is the right way to go about it with rails. My rails app is a card game and the 'more complex test data' I'm trying to test are various combinations of cards.
For example, I want to set up a test game where player 1 has cards B and C in hand, where I've specifically added cards B and C to the player's hand in a fixture.
I have basic fixtures for players, games, and users, which have been there for awhile and working fine. I've tried to add the following erb code in the games fixture, to invoke the Game.start method, and am getting
NoMethodError: undefined method `games' for main:Object
The fixture code snippet is :
four:
id: 4
num_players: 3
turn_num: 0
status_id: 1
<% game_four = games(:four).find
game_four.start
%>

game_four = games(:four).find
games method exists only in test cases, not in fixtures.
You should either query the database or use relationships.
This is just an example.
four:
id: 4
num_players: 3
turn_num: 0
status_id: 1
<% Game.find_by_name(four).start %>
Also, this is not really the right place for such command. Fixtures are not intended "to start games".
You should really move this command elsewhere, perhaps in a dedicated test case within the setup block.
EDIT:
I copy here my comment posted a couple of days ago on the original answer with a link to the new Rails Database Seeding feature: http://ryandaigle.com/articles/2009/5/13/what-s-new-in-edge-rails-database-seeding
This is the one explained by Yehuda Katz in his answer and definitely the best way to solve this problem.

Probably the best solution (and in fact, the one that is now canonized on edge) is to have a seeds.rb file in your db directory that you load from a rake task.
Here's what Rails does now on edge (to be in Rails 3).
# db/seeds.rb
# This file should contain all the record creation needed to seed the database with its default values.
# The data can then be loaded with the rake db:seed (or created alongside the db with db:setup).
#
# Examples:
#
# cities = City.create([{ :name => 'Chicago' }, { :name => 'Copenhagen' }])
# Major.create(:name => 'Daley', :city => cities.first)
And then a new rake task (which you can add to your Rakefile):
desc 'Load the seed data from db/seeds.rb'
task :seed => :environment do
seed_file = File.join(Rails.root, 'db', 'seeds.rb')
load(seed_file) if File.exist?(seed_file)
end
If you set up your seeds.rb file this way, you will be following the new convention and will be able to delete the seeds rake task when you upgrade to Rails 3.
Also, migrations are not for data. This is well established and the universal opinion of the Rails core team as far as I know.

If you want to use fixtures method (when loading data for development, not during tests) you can use fixtures_references plugin. Its behaviour will be the same.

Is seeding data with fixtures dangerous in Ruby on Rails

I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
desc "Load seed fixtures (from db/fixtures) into the current environment's database."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
end
end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?

Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated and since most Rails developers don't use database constraints this means you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.

For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
if Rails.env.production?
puts "!!!WARNING!!! This task will DESTROY " +
"your production database and RESET all " +
"application settings"
puts "Continue? y/n"
continue = STDIN.gets.chomp
unless continue == 'y'
puts "Exiting..."
exit!
end
end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed rake gems:install, etc into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on production server. Rails' db/seed.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.

Rails 3 will solve this for you using a seed.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb

We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.

How about just deleting the task off your production server once you have seeded the database?

I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
def up
add_to(:states, [
{:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
{:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
]
end
end
def down
remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
end
end
alternately:
def up
State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?

How (and whether) to populate rails application with initial data

I've got a rails application where users have to log in. Therefore in order for the application to be usable, there must be one initial user in the system for the first person to log in with (they can then create subsequent users). Up to now I've used a migration to add a special user to the database.
After asking this question, it seems that I should be using db:schema:load, rather than running the migrations, to set up fresh databases on new development machines. Unfortunately, this doesn't seem to include the migrations which insert data, only those which set up tables, keys etc.
My question is, what's the best way to handle this situation:
Is there a way to get d:s:l to include data-insertion migrations?
Should I not be using migrations at all to insert data this way?
Should I not be pre-populating the database with data at all? Should I update the application code so that it handles the case where there are no users gracefully, and lets an initial user account be created live from within the application?
Any other options? :)

Try a rake task. For example:
Create the file /lib/tasks/bootstrap.rake
In the file, add a task to create your default user:
namespace :bootstrap do
desc "Add the default user"
task :default_user => :environment do
User.create( :name => 'default', :password => 'password' )
end
desc "Create the default comment"
task :default_comment => :environment do
Comment.create( :title => 'Title', :body => 'First post!' )
end
desc "Run all bootstrapping tasks"
task :all => [:default_user, :default_comment]
end
Then, when you're setting up your app for the first time, you can do rake db:migrate OR rake db:schema:load, and then do rake bootstrap:all.

Use db/seed.rb found in every Rails application.
While some answers given above from 2008 can work well, they are pretty outdated and they are not really Rails convention anymore.
Populating initial data into database should be done with db/seed.rb file.
It's just works like a Ruby file.
In order to create and save an object, you can do something like :
User.create(:username => "moot", :description => "king of /b/")
Once you have this file ready, you can do following
rake db:migrate
rake db:seed
Or in one step
rake db:setup
Your database should be populated with whichever objects you wanted to create in seed.rb

I recommend that you don't insert any new data in migrations. Instead, only modify existing data in migrations.
For inserting initial data, I recommend you use YML. In every Rails project I setup, I create a fixtures directory under the DB directory. Then I create YML files for the initial data just like YML files are used for the test data. Then I add a new task to load the data from the YML files.
lib/tasks/db.rake:
namespace :db do
desc "This loads the development data."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/*.yml').each do |file|
base_name = File.basename(file, '.*')
say "Loading #{base_name}..."
Fixtures.create_fixtures('db/fixtures', base_name)
end
end
desc "This drops the db, builds the db, and seeds the data."
task :reseed => [:environment, 'db:reset', 'db:seed']
end
db/fixtures/users.yml:
test:
customer_id: 1
name: "Test Guy"
email: "test#example.com"
hashed_password: "656fc0b1c1d1681840816c68e1640f640c6ded12"
salt: "188227600.754087929365988"

This is my new favorite solution, using the populator and faker gems:
http://railscasts.com/episodes/126-populating-a-database

Try the seed-fu plugin, which is quite a simple plugin that allows you to seed data (and change that seed data in the future), will also let you seed environment specific data and data for all environments.

I guess the best option is number 3, mainly because that way there will be no default user which is a great way to render otherwise good security useless.

I thought I'd summarise some of the great answers I've had to this question, together with my own thoughts now I've read them all :)
There are two distinct issues here:
Should I pre-populate the database with my special 'admin' user? Or should the application provide a way to set up when it's first used?
How does one pre-populate the database with data? Note that this is a valid question regardless of the answer to part 1: there are other usage scenarios for pre-population than an admin user.
For (1), it seems that setting up the first user from within the application itself is quite a bit of extra work, for functionality which is, by definition, hardly ever used. It may be slightly more secure, however, as it forces the user to set a password of their choice. The best solution is in between these two extremes: have a script (or rake task, or whatever) to set up the initial user. The script can then be set up to auto-populate with a default password during development, and to require a password to be entered during production installation/deployment (if you want to discourage a default password for the administrator).
For (2), it appears that there are a number of good, valid solutions. A rake task seems a good way, and there are some plugins to make this even easier. Just look through some of the other answers to see the details of those :)

Consider using the rails console. Good for one-off admin tasks where it's not worth the effort to set up a script or migration.
On your production machine:
script/console production
... then ...
User.create(:name => "Whoever", :password => "whichever")
If you're generating this initial user more than once, then you could also add a script in RAILS_ROOT/script/, and run it from the command line on your production machine, or via a capistrano task.

That Rake task can be provided by the db-populate plugin:
http://github.com/joshknowles/db-populate/tree/master

Great blog post on this:
http://railspikes.com/2008/2/1/loading-seed-data
I was using Jay's suggestions of a special set of fixtures, but quickly found myself creating data that wouldn't be possible using the models directly (unversioned entries when I was using acts_as_versioned)

I'd keep it in a migration. While it's recommended to use the schema for initial setups, the reason for that is that it's faster, thus avoiding problems. A single extra migration for the data should be fine.
You could also add the data into the schema file, as it's the same format as migrations. You'd just lose the auto-generation feature.

For users and groups, the question of pre-existing users should be defined with respect to the needs of the application rather than the contingencies of programming. Perhaps your app requires an administrator; then prepopulate. Or perhaps not - then add code to gracefully ask for a user setup at application launch time.
On the more general question, it is clear that many Rails Apps can benefit from pre-populated date. For example, a US address holding application may as well contain all the States and their abbreviations. For these cases, migrations are your friend, I believe.

Some of the answers are outdated. Since Rails 2.3.4, there is a simple feature called Seed available in db/seed.rb :
#db/seed.rb
User.create( :name => 'default', :password => 'password' )
Comment.create( :title => 'Title', :body => 'First post!' )
It provides a new rake task that you can use after your migrations to load data :
rake db:seed
Seed.rb is a classic Ruby file, feel free to use any classic datastructure (array, hashes, etc.) and iterators to add your data :
["bryan", "bill", "tom"].each do |name|
User.create(:name => name, :password => "password")
end
If you want to add data with UTF-8 characters (very common in French, Spanish, German, etc.), don't forget to add at the beginning of the file :
# ruby encoding: utf-8
This Railscast is a good introduction : http://railscasts.com/episodes/179-seed-data

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

What is the best way to seed a database in Rails? - ruby-on-rails

The best way is to use fixtures. Note: Keep in mind that fixtures do direct inserts and don't use your model so if you have callbacks that populate data you will need to find a workaround.

Add it in database migrations, that way everyone gets it as they update. Handle all of your logic in the ruby/rails code, so you never have to mess with explicit ID settings.

Related

Use seed_fu together with seedbank

how add test data to my local sqlite in rails 3?

Why is this ERB code in a fixture throwing 'undefined method'?

Is seeding data with fixtures dangerous in Ruby on Rails

How (and whether) to populate rails application with initial data

Categories

Resources