Rails recommended way to add sample data - ruby-on-rails

I have a Rake script similar to the one below, but I am wondering if there is a more efficient way to do this, without having to drop the database, run all the migrations, reseed the database and then add the sample data?
namespace :db do
desc 'Fill database with sample data'
task populate: :environment do
purge_database
create_researchers
create_organisations
add_survey_groups_to_organisations
add_members_to_survey_groups
create_survey_responses_for_members
end
end
def purge_database
puts 'about to drop and recreate database'
system('rake db:drop')
puts 'database dropped'
system('rake db:create')
system('rake db:migrate')
system('rake db:seed')
puts 'Database recreated...'
end
def create_researchers
10.times do
researcher = User.new
researcher.email = Faker::Internet.email
researcher.save!
end
end

You should not fill your database with sample data via db:seed. That's not the purpose of the seeds file.
db:seed is for initial data that your app needs in order to function. It's not for testing and/or development purposes.
What I do is to have one task that populates sample data and another task that drops the database, creates it, migrates, seeds and populates. The cool thing is that it's composed of other tasks, so you don't have to duplicate code anywhere:
# lib/tasks/sample_data.rake
namespace :db do
desc 'Drop, create, migrate, seed and populate sample data'
task prepare: [:drop, :create, "schema:load", :seed, :populate_sample_data] do
puts 'Ready to go!'
end
desc 'Populates the database with sample data'
task populate_sample_data: :environment do
10.times { User.create!(email: Faker::Internet.email) }
end
end
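To see why the composed task needs no duplicated code, here is a minimal standalone sketch of how Rake runs prerequisites in order before a task's own body. The `demo:*` task names and the `log` array are illustrative stand-ins, not part of the answer above, and no Rails or database is involved.

```ruby
require 'rake'

# Records the order in which the stand-in tasks run.
log = []

# Stand-ins for db:drop, db:create, etc. (hypothetical names, no database).
steps = %w[drop create load_schema seed populate_sample_data]
steps.each do |step|
  Rake::Task.define_task("demo:#{step}") { log << step }
end

# The composed task lists the others as prerequisites; its body runs last.
Rake::Task.define_task('demo:prepare' => steps.map { |s| "demo:#{s}" }) do
  log << 'prepare'
end

Rake::Task['demo:prepare'].invoke
puts log.join(' -> ')
```

Each prerequisite runs once, in the order listed, so the composed task only has to name the steps it depends on.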

I would suggest making rake db:seed self sufficient. By which I mean, you should be able to run it multiple times without it doing any damage, while ensuring that whatever sample data you need loaded gets loaded.
So, for your researchers, the db:seed task should do something like this:
User.destroy_all
10.times do
researcher = User.new
researcher.email = Faker::Internet.email
researcher.save!
end
You can run this as many times as you like and always end up with exactly 10 random users.
I see this is for development. In that case, I wouldn't put it in db:seed as that might get run in production. But you can put it in a similar rake task that you can re-run as often as needed.

Related

Rails dynamically create model and use it

I am creating a model whose name is the input argument in a rake task. After the rake task, I wish to use the model to insert data.
So for example, I call my rake task with input Apple and the model Apple is created. Then I wish to do Apple.insert_all([{name: x},{name: y}...]) in another rake task but I get NameError: uninitialized constant Apple
Here's a better picture of the flow of what I'm doing
Rake::Task["create:fruit"].invoke("Apple") # create model here
Rake::Task["create:insert"].invoke("Apple") # insert data here but getting error
This is how I process the input in the second rake task:
task :insert, [:name] do |t, args|
fruit = args.name
fruit.classify.constantize.insert_all(xxx)
end
Any suggestions for how to go about this?
I created a new project and tried your code. I think the problem is in this line
fruit.classify.constantize.insert_all(xxx)
The code below works and creates new records. I use a simple rake command to run it.
create.rake file
namespace :create do
desc "TODO"
task :insert, [:name] do |t, args|
klass = Object.const_get(args.name)
klass.create([{name: 'x'},{name: 'y'}])
p klass.count # testing new records have been saved
end
end
Rakefile file
require File.expand_path('../config/application', __FILE__)
Rails.application.load_tasks
task :default do
Rake::Task["create:insert"].invoke("Apple")
end
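For reference, both `constantize` and `Object.const_get` only resolve constants that are already defined (or autoloadable) in the running process, so a `NameError` usually means the model class has not been loaded yet. In a Rails task this typically also means the task should depend on `:environment`. Below is a minimal plain-Ruby sketch of that behaviour, with `Apple` defined at runtime as a stand-in for the model; `lookup_class` is a hypothetical helper.

```ruby
# Look up a class by name; returns nil instead of raising when the
# constant is not defined.
def lookup_class(name)
  Object.const_get(name)
rescue NameError
  nil
end

before = lookup_class('Apple')   # nothing named Apple exists yet

# Stand-in for "the model was created": define the constant at runtime.
Object.const_set('Apple', Class.new)

after = lookup_class('Apple')    # now resolves to the new class
puts before.inspect
puts after.name
```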

The case of the disappearing ActiveRecord attribute

Following the instructions in https://stackoverflow.com/a/24496452/102675 I wound up with the following:
namespace :db do
desc 'Drop, create, migrate, seed and populate sample data'
task seed_sample_data: [:drop, :create, :migrate, :seed, :populate_sample_data] do
puts 'Sample Data Populated. Ready to go!'
end
desc 'Populate the database with sample data'
task populate_sample_data: :environment do
puts Inspector.column_names.include?('how_to_fix')
# create my sample data
end
end
As you would expect, I get true if I run bundle exec rake db:populate_sample_data
BUT if I run bundle exec rake db:seed_sample_data I get all the migration output and then false. In other words I can't see the Inspector attribute how_to_fix even though it definitely exists as proved by the other rake run. Where did my attribute go?
My guess is that this is a "caching" problem. Can you try the following?
task populate_sample_data: :environment do
Inspector.reset_column_information
# ...
end
P.S. We used to have a similar problem when working with different databases that had almost the same schema (apart from a few columns here and there).
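The effect can be illustrated without Rails. The sketch below uses a hypothetical `FakeModel` that memoizes its column list, roughly the way a model class caches schema information; it is a simplified stand-in, not ActiveRecord's actual implementation.

```ruby
# A toy "database": table name => list of column names.
schema = { 'inspectors' => %w[id name] }

# Hypothetical stand-in for a model class that caches column metadata.
class FakeModel
  def initialize(schema, table)
    @schema = schema
    @table = table
  end

  # Memoized: the schema is only read on the first call.
  def column_names
    @column_names ||= @schema.fetch(@table).dup
  end

  # Drop the cache so the next call re-reads the schema.
  def reset_column_information
    @column_names = nil
  end
end

inspector = FakeModel.new(schema, 'inspectors')
inspector.column_names                       # fills the cache

# A "migration" adds a column after the cache was filled.
schema['inspectors'] << 'how_to_fix'

stale = inspector.column_names.include?('how_to_fix')
inspector.reset_column_information
fresh = inspector.column_names.include?('how_to_fix')
puts "stale=#{stale} fresh=#{fresh}"
```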

How do I seed my database with only part of my seed code?

Is it possible to run one or two blocks within my seeds.rb code like you can with tests and gemfiles?
For example, if I had the following code in my seeds.rb file, could I just seed the Employee model?
20.times do
Employee.create!(name: "Bob",
email: Faker::Internet.email)
end
20.times do
User.create!(name: "Hank",
password: "foobar")
end
If this were my entire seeds.rb file, running rake db:seed would create 20 additional users when I only want to add more employees.
You can pass an option while running rake db:seed like following:
rake db:seed users=yes
And, then in your code, you can access it through the following way:
20.times do
Employee.create!(name: "Bob",
email: Faker::Internet.email)
end
if ENV["users"]
20.times do
User.create!(name: "Hank",
password: "foobar")
end
end
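One caveat with a bare `ENV["users"]` check: environment values are strings, so any value, including `users=no`, is truthy. A small sketch of a stricter check follows; `flag_enabled?` is a hypothetical helper, and the accepted spellings are an assumption to adjust as needed.

```ruby
# Returns true only for explicit "on" spellings of an environment flag.
# `env` defaults to the real ENV but can be any hash-like object.
def flag_enabled?(name, env = ENV)
  %w[1 y yes true on].include?(env[name].to_s.strip.downcase)
end

# In db/seeds.rb this could guard the optional block:
#   if flag_enabled?('users') ... end
puts flag_enabled?('users', { 'users' => 'yes' })
puts flag_enabled?('users', { 'users' => 'no' })
puts flag_enabled?('users', {})
```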
I've used the following setup for a couple of years now to help my sanity.
In db/seeds I have the following files:
001_providers.rb
005_api_users.rb
007_mailing_lists.rb
010_countries.rb
011_us_states.rb
012_canadian_provinces.rb
013_mexican_states.rb
100_world_cities.rb
101_us_zipcodes.rb
My db/seeds.rb file looks like this:
if ENV['VERSION'].present?
seed_files = Dir[File.join(File.dirname(__FILE__), 'seeds', "*#{ENV['VERSION']}*.rb")]
raise "No seed files found matching '#{ENV['VERSION']}'" if seed_files.empty?
else
seed_files = Dir[File.join(File.dirname(__FILE__), 'seeds', '*.rb')]
end
seed_files.sort_by{|f| File.basename(f).to_i}.each do |file|
require File.join(File.dirname(__FILE__), 'seeds', File.basename(file, File.extname(file)))
end
Just a bit of ruby code to let me run one or more seed files. I can now do things like this:
# run them all
bin/rake db:seed
# run just 001_providers.rb
bin/rake db:seed VERSION=001
# run all seeds for the USA (probably dangerous, but if you name your seeds right, could be useful).
bin/rake db:seed VERSION=us
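To make the matching rule concrete, here is a standalone sketch of the same glob-and-sort logic run against a temporary directory of empty files. `select_seed_files` and `demo_selection` are hypothetical helpers, only a subset of the file names above is used, and the "raise when nothing matches" check is left out for brevity.

```ruby
require 'tmpdir'
require 'fileutils'

# Glob by optional version fragment, then order by the numeric
# filename prefix; returns basenames in run order.
def select_seed_files(dir, version = nil)
  pattern = version.to_s.empty? ? '*.rb' : "*#{version}*.rb"
  Dir[File.join(dir, pattern)]
    .sort_by { |path| File.basename(path).to_i }
    .map { |path| File.basename(path) }
end

# Build a throwaway seeds directory with empty files and query it.
def demo_selection(version)
  names = %w[001_providers.rb 005_api_users.rb 011_us_states.rb
             100_world_cities.rb 101_us_zipcodes.rb]
  Dir.mktmpdir do |dir|
    names.each { |name| FileUtils.touch(File.join(dir, name)) }
    select_seed_files(dir, version)
  end
end

puts demo_selection('001').inspect
puts demo_selection('us').inspect
```

Because the match is a plain substring glob, `VERSION=us` also picks up `005_api_users.rb`, which is one way the "probably dangerous" case can bite.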
One thing that is very important is that your seed files should be able to be run over and over and over and end up with a consistent state. If you run your seeds over and over you'll end up with many more users than just 20.
For example, my providers one has a main loop like this:
# providers is a hash of attributes...
providers.each_with_index do |(code, attrs), i|
p = Provider.find_by(code: code) || Provider.new(code: code)
p.update!(attrs)
end
This way regardless of when I run it, I always get back exactly the providers I defined in my hash.
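The idempotence property can be sketched in plain Ruby. Here a Hash keyed by provider code stands in for the providers table, and `upsert_providers` is a hypothetical helper that mirrors the find-or-build-then-update loop; the sample codes and attributes are made up.

```ruby
# Insert missing records and overwrite attributes of existing ones,
# so repeated runs converge on the same state.
def upsert_providers(store, definitions)
  definitions.each do |code, attrs|
    record = store[code] ||= { code: code }   # find or build
    record.merge!(attrs)                      # update in place
  end
  store
end

definitions = {
  'acme'   => { name: 'Acme Corp' },
  'globex' => { name: 'Globex' }
}

store = {}
upsert_providers(store, definitions)
first_run = Marshal.load(Marshal.dump(store))   # deep copy for comparison
upsert_providers(store, definitions)            # second run changes nothing
puts store.size
puts store == first_run
```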

How can you make a rake task that makes changes in multiple environments?

I have a rake task that I use to populate my development database. When it is done I would like it to also reset the test database, but I can't figure out the syntax. I need something like this:
namespace :db do
task populate: :environment do
Rake::Task["db:reset"].execute
Rake::Task["db:reset"].execute RAILS_ENV=test
# Add lots of data to the :environment database
end
end
This lets me run rake db:populate to populate my development database using the latest schema as well as reset the test database.
The task db:test:clone_structure will reset the test database schema to match the development database schema:
namespace :db do
task populate: :environment do
Rake::Task["db:reset"].execute
Rake::Task["db:test:clone_structure"].execute
# Add lots of data to the :environment database
end
end
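Note that `execute` and `invoke` behave differently in Rake: `execute` runs only the task's own actions, every time it is called, while `invoke` first runs prerequisites and does nothing on repeat calls until `reenable` is called. Whether that matters here depends on how the tasks being called are defined. A standalone sketch with made-up `sketch:*` task names (no Rails involved):

```ruby
require 'rake'

log = []
Rake::Task.define_task('sketch:setup') { log << :setup }
Rake::Task.define_task('sketch:reset' => 'sketch:setup') { log << :reset }

reset = Rake::Task['sketch:reset']

reset.execute   # actions only: the prerequisite is skipped
after_execute = log.dup

reset.invoke    # runs sketch:setup first, then sketch:reset
after_invoke = log.dup

reset.invoke    # already invoked: nothing happens
reset.reenable
reset.invoke    # runs again; sketch:setup stays marked as invoked
puts log.inspect
```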

Create a script or task to modify database

I need to create a script that imports data from a file system source. How do I do that?
I already tried to create a rake task but there the models are not loaded. How do I get the whole rails environment into my task?
desc 'Do stuff with models'
task :do_stuff => :environment do
1000.times do |i|
Model.create :name => "model number #{i}"
end
end
You declare :environment as a dependency of your rake task. This loads up rails and all of your app code before it runs.
