I am writing a browser based RPG and I am trying to figure out the best way to store "seed" data, ie. data like locations, monsters and items, that are standard entities of the game.
Would it be best to store this in a "def self.items.." hash in which i keep all the different items, or should I put them in a database and seed them? If I should seed them, what is the best way to accomplish this, just fill up the seed.rb files with 200-300 different instances or keep them in other files?
I absolutely hate cluttered seed files. Admittedly, it's a personal preference and not a right or wrong.
Here's what I did in my app at work to handle the significant volume of seed data we are required to use:
# Folder Structure
db
schema.rb
seeds.rb
seeds/
01_organization.rb
02_real_estate_industry.rb
03_click_types.rb
...
My seeds.rb file:
# All seeds are broken out into ordinally names files inside the db/seeds/* directory.
Dir[File.join(Rails.root, 'db', 'seeds', '**/*.rb')].each do |file|
require file
end
A sample seed file in the seeds/ folder:
organizations = Organization.all
Organization.create(
{
name: 'CompanyName',
blurb: 'Hi. I am a blurb.',
city: 'Some City',
state: 'ST',
web: 'http://www.somewebsite.com',
phone: '1-000-555-1212'
}
) if organizations.empty?
This has the effect of loading our seed files in the desired order (using 01_ then 02_, etc.). It also breaks up our seeds into logical groupings and organizes them in such a way that, for our team, we can more easily manage them.
Related
I have a json file with a lot of movies in it. I want to create a model 'Movie' and fill it with all movies from that json file. How do i do it? I know that I can parse json file into a hash but that's not the thing I am looking for.
The correct term you're looking for is "seeding"!
You're going to need a database however, and a migration to create the database along with the associated movies table. (There are plenty of guides on how to do this, along with the official documentation).
After that's done, you'll need to "seed" your database with the data in your json file.
In the seeds.rb file, assuming that the JSON file is an array of Movies in JSON form, you should be able to loop over every Movie JSON object and insert it into your database.
To add to docaholic's helpful response, here's some steps/pseudo-code that may help.
Assuming you're using a SQL database and need to create a model:
# creates a migration file.
rails generate migration create_movies title:string #duration_in_minutes:integer or whatever fields you have
# edit the file to add other fields/ensure it has what you want.
rake db:migrate
Write a script to populate your database. There are many patterns for this (rake task, test fixtures, etc) and which one you'd want to use would depend on what you need (whether it's for testing, for production environment, as seed data for new environments, etc).
But generally what the code would look like is:
text_from_file = File.read(file_path)
JSON.parse(text_from_file).each do |json_movie_object|
Movie.create!(title: json_movie_object[:title], other_attribute: json_movie_object[:other_attribute])
# if the json attributes exactly match the column names, you can do
# Movie.create!(json_movie_object)
end
This is not the most performant option for large amounts of data. For large files you can use insert_all for much greater efficiency, but this bypasses activerecord validations and callbacks so you'd want to understand what that means.
For my case, I need to seed around 200 hundred datas for production from a JSON file, so I tried to insert data from the json files in a database.
SO at first I created a database in rails project
rails generate migration create_movies title:string duration_in_minutes:integer
# or whatever fields you have
# edit the file to add other fields/ensure it has what you want.
rake db:migrate
Now its time to seed datas!
Suppose your movies.json file has:
[
{"id":"1", "name":"Titanic", "time":"120"},
{"id":"2", "name":"Ruby tutorials", "time":"120"},,
{"id":"3", "name":"Something spetial", "time":"500"}
{"id":"4", "name":"2HAAS", "time":"320"}
]
NOW, like this 4 datas, think your JSON file has 400+ datas to input which is a nightmare to write in seed files manually.
You need to add JSON gem to work. Run
bundle add json
to work with JSON file.
In db/seed.rb file, add these lines to add those 400+ infos in your DATABASE:
infos_from_json = File.read(your/JSON/file/path/movies.json)
JSON.parse(infos_from_json).each do |t|
Movie.create!(title: t['name'], duration_in_minutes:
t['time'])
end
Here:
infos_from_json variable fixing the JSON file from file directory
JSON.parse is calling the JSON datas then the variable is declared to indicate where the file is located and each do loop is dividing each data and |t| will be used to add part JSON.parse(infos_from_json). in every call.
Movie.create!(title: t['title'], duration_in_minutes: t['time']) is adding the data in database
This is how we can add datas in our database from a JSON file easily.
For more info about seeding data in database checkout the documentation
THANKS
Edit: This operation requires JSON gem just type bundle add json
I need to populate my production database app with data in particular tables. This is before anyone ever even touches the application. This data would also be required in development mode as it's required for testing against. Fixtures are normally the way to go for testing data, but what's the "best practice" for Ruby on Rails to ship this data to the live database also upon db creation?
ultimately this is a two part question I suppose.
1) What's the best way to load test data into my database for development, this will be roughly 1,000 items. Is it through a migration or through fixtures? The reason this is a different answer than the question below is that in development, there's certain fields in the tables that I'd like to make random. In production, these fields would all start with the same value of 0.
2) What's the best way to bootstrap a production db with live data I need in it, is this also through a migration or fixture?
I think the answer is to seed as described here: http://lptf.blogspot.com/2009/09/seed-data-in-rails-234.html but I need a way to seed for development and seed for production. Also, why bother using Fixtures if seeding is available? When does one seed and when does one use fixtures?
Usually fixtures are used to provide your tests with data, not to populate data into your database. You can - and some people have, like the links you point to - use fixtures for this purpose.
Fixtures are OK, but using Ruby gives us some advantages: for example, being able to read from a CSV file and populate records based on that data set. Or reading from a YAML fixture file if you really want to: since your starting with a programming language your options are wide open from there.
My current team tried to use db/seed.rb, and checking RAILS_ENV to load only certain data in certain places.
The annoying thing about db:seed is that it's meant to be a one shot thing: so if you have additional items to add in the middle of development - or when your app has hit production - ... well, you need to take that into consideration (ActiveRecord's find_or_create_by...() method might be your friend here).
We tried the Bootstrapper plugin, which puts a nice DSL over the RAILS_ENV checking, and lets your run only the environment you want. It's pretty nice.
Our needs actually went beyond that - we found we needed database style migrations for our seed data. Right now we are putting normal Ruby scripts into a folder (db/bootstrapdata/) and running these scripts with Arild Shirazi's required gem to load (and thus run) the scripts in this directory.
Now this only gives you part of the database style migrations. It's not hard to go from this to creating something where these data migrations can only be run once (like database migrations).
Your needs might stop at bootstrapper: we have pretty unique needs (developing the system when we only know half the spec, larg-ish Rails team, big data migration from the previous generation of software. Your needs might be simpler).
If you did want to use fixtures the advantage over seed is that you can easily export also.
A quick guess at how the rake task may looks is as follows
desc 'Export the data objects to Fixtures from data in an existing
database. Defaults to development database. Set RAILS_ENV to override.'
task :export => :environment do
sql = "SELECT * FROM %s"
skip_tables = ["schema_info"]
export_tables = [
"roles",
"roles_users",
"roles_utilities",
"user_filters",
"users",
"utilities"
]
time_now = Time.now.strftime("%Y_%h_%d_%H%M")
folder = "#{RAILS_ROOT}/db/fixtures/#{time_now}/"
FileUtils.mkdir_p folder
puts "Exporting data to #{folder}"
ActiveRecord::Base.establish_connection(:development)
export_tables.each do |table_name|
i = "000"
File.open("#{folder}/#{table_name}.yml", 'w') do |file|
data = ActiveRecord::Base.connection.select_all(sql % table_name)
file.write data.inject({}) { |hash, record|
hash["#{table_name}_#{i.succ!}"] = record
hash }.to_yaml
end
end
end
desc "Import the models that have YAML files in
db/fixture/defaults or from a specified path."
task :import do
location = 'db/fixtures/default'
puts ""
puts "enter import path [#{location}]"
location_in = STDIN.gets.chomp
location = location_in unless location_in.blank?
ENV['FIXTURES_PATH'] = location
puts "Importing data from #{ENV['FIXTURES_PATH']}"
Rake::Task["db:fixtures:load"].invoke
end
I'm trying to save some lookup table data out to a YAML file so that later when I need to set up my app on a different machine I can load the data in as seed data.
The data is stuff like select options, and it's pretty much set, so no worries about the live data changing between serializing and deserializing.
I have output the data like this...
file = File.open("#{RAILS_ROOT}/lib/tasks/questions/questions.yml", 'w')
questions = Question.find(:all, :order => 'order_position')
file << YAML::dump(questions)
file.close()
And I can load the file like this...
questions = YAML.load_file('lib/tasks/questions/questions.yml')
However, when I try to save a question I get this error...
>> questions[0].save
NoMethodError: undefined method `save' for #<YAML::Object:0x2226b84>
What is the correct way to do this?
Create a seed.yml file in db directory. Add a YAML document for each model you want to create. This document should contain a list of hash. Each hash should contain model attributes.
users:
- login: jake
password: jake123
password_confirmation: jake123
first_name: Jake
last_name: Driver
- login: Jane
password: jane123
password_confirmation: jane123
first_name: Jane
last_name: McCain
categories:
products:
In your seed.rb file
seed_file = File.join(Rails.root, 'db', 'seed.yml')
config = YAML::load_file(seed_file)
User.create(config["users"])
Category.create(config["categories"])
Product.create(config["products"])
Run the rake task to load the rows
rake db:seed
Does the accepted answer actually answer the question? It looks like the asker wanted to save the models, not just retrieve them from a YAML file.
To actually save the loaded model(s) back to the database you need to fool ActiveRecord into thinking the model needs saving. You can do it with this rather dirty bit of code
questions = YAML.load_file("#{RAILS_ROOT}/lib/tasks/questions/questions.yml")
questions.each{|q| q.instance_variable_set("#new_record", true); q.save}
It works and saved my bacon once or twice.
I tried your scenario and I did not have any issues. I did the following changes to your YAML file creation logic:
yml_file = Rails.root.join('lib', 'tasks', 'questions', 'questions.yml')
File.open(yml_file, 'w') do |file|
questions = Question.order(:order_position).to_a
YAML::dump(questions, file)
end
I was able to retrieve the questions list as follows:
yml_file = Rails.root.join('lib', 'tasks', 'questions', 'questions.yml')
question_attributes_list = YAML.load_file(yml_file).map(&:attributes)
questions = Question.create(question_attributes_list)
If you're using Rails 2.3.4 (or above), they have a seeds.rb file that can be found in your applications db folder. This lets you define basic active record creates, and when you've set up your new project, you can simply call:
rake db:seed
There is an excellent Railscast on it here, and a good blog post about it here. If you're not using Rails 2.3.4 (Or, ideally, 2.3.5), I highly recommend updating for these cool features, and addition security/bug fixes.
I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
desc "Load seed fixtures (from db/fixtures) into the current environment's database."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
end
end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?
Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated and since most Rails developers don't use database constraints this means you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.
For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
if Rails.env.production?
puts "!!!WARNING!!! This task will DESTROY " +
"your production database and RESET all " +
"application settings"
puts "Continue? y/n"
continue = STDIN.gets.chomp
unless continue == 'y'
puts "Exiting..."
exit!
end
end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed rake gems:install, etc into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on production server. Rails' db/seed.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.
Rails 3 will solve this for you using a seed.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb
We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.
How about just deleting the task off your production server once you have seeded the database?
I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
def up
add_to(:states, [
{:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
{:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
]
end
end
def down
remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
end
end
alternately:
def up
State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?
I have a migration in Rails that inserts a record into the database. The Category model depends on this record. Since RSpec clears the database before each example, this record is lost and furthermore never seems to be created since RSpec does not seem to generate the database from migrations. What is the best way to create/recreate this record in the database? Would it be using before(:all)?
It's not that RSpec clears the database, it's that Rails's rake:db:prepare task copies the schema (but not the contents) of your dev database into your *_test db.
Yes, you can use before(:all), as transactions are wrapped around each individual example - but a simple fixture file would also do the same job.
(There's a more complicated general solution to this issue: moving to a service-oriented architecture, where your 'dev' and 'test' services are going to be completely separate instances. You can then point your test db config to the development database in your test service, disable rake:db:prepare, and build your test service from migrations as you regenerate it. Then you can test your migrations and data transformations.)
What I like to do is create a folder in db/migration called data, and then put yml fixtures in there, in your case categories.yml
Then I create a migration with the following
def self.up
down
directory = File.join( File.dirname(__FILE__), "data" )
Fixtures.create_fixtures( directory, "categories" )
end
def self.down
Category.delete_all
end