I'm trying to load some seed data into an app with heroku. I'm more looking just to insert a bunch of seed (sample) data into my app so my client can see it with a lot of objects. (Imagine a demo e-commerce app - I want to be able to show a couple dozen sample products without manually entering them in)
My seed data works fine in development but in production, the one HMBT association causes the below error:
WARNING: Rails was not able to disable referential integrity.
This is most likely caused due to missing permissions.
Rails needs superuser privileges to disable referential integrity.
Heroku documentation said this this so I tried a conditional to remove the referential integrity on production in my seeds file(see below), but now seeds won't run. It just does this:
Running rake db:seed on ⬢ app... up, run.1225 (Free)
ActiveRecord::SchemaMigration Load (1.7ms) SELECT "schema_migrations".* FROM "schema_migrations"
Here's my seeds file below:
if Rails.env.development?
ActiveRecord::Base.connection.disable_referential_integrity do
end
1.upto(14) do |n|
pic_one = File.open(File.join(Rails.root,'app/assets/images/file1.jpg'))
pic_two = File.open(File.join(Rails.root,'app/assets/images/carpet_2/file2.jpg'))
image = PicAttachment.create!([
{picture: pic_one, w_dream_id: "#{n}"},
{picture: pic_two, w_dream_id: "#{n}"},
])
rug = Product.create!(
pic_attachments: image
end
if Rails.env.development?
end
end
Does anyone where I'm going wrong?
The link you posted states that referential integrity cannot be removed in Heroku. It suggests considering using another test data scaffolding tool (like FactoryGirl or Fabrication Gem)
Anyway, your code does nothing if the environment is not development. All code is inside the if Rails.env.development?. The first end corresponds to the do. The indentation is wrong. Your code is in fact:
if Rails.env.development?
ActiveRecord::Base.connection.disable_referential_integrity do
end
1.upto(14) do |n|
pic_one = File.open(File.join(Rails.root,'app/assets/images/file1.jpg'))
pic_two = File.open(File.join(Rails.root,'app/assets/images/carpet_2/file2.jpg'))
image = PicAttachment.create!([
{picture: pic_one, w_dream_id: "#{n}"},
{picture: pic_two, w_dream_id: "#{n}"},
])
rug = Product.create!(
pic_attachments: image
if Rails.env.development?
end
end
Ultimately I took this answer, which was already in my code to ensure associations work in development.
For development, this is needed:
ActiveRecord::Base.connection.disable_referential_integrity do
For production:
the disable_referential_integrity isn't allowed in heroku and the problem is the associated model (Pic_Attachment) gets created before the model object it belongs to, hence an error gets thrown because it needs an object to belong. What worked for me was to delete the disable_referential_integrity from the seeds file AND comment out the belongs_to line in the associated model (PicAttachment), then commit/push your changes and it works. (Add these lines back in afterwards so your development works)
I hope this helps someone. Took me a few days of work here.
Related
The problem is the following:
I have db/seed.rb full of initial data.
One of migrations depends on data this seed provides.
I'm trying to deploy my app from empty db.
Result is:
RAILS_ENV=production rake db:migrate - fails due to lack of initial data
RAILS_ENV=production rake db:seed - fails due to pending migrations
I wanted to somehow tell rake to ignore pending migrations, but unable to do it so far.
UPDATE (due to additional experience)
Sometimes migrations and model code goes out of sync, so migrations are not being run.
To avoid this problem recently used redefining of model in migrations:
# reset all callbacks, hooks, etc for this model
class MyAwesomeModel < ActiveRecord::Base
end
class DoSomethingCool < ActiveRecord::Migration
def change
...
end
end
I am not very sure if this will help you. But I was looking for something and found this question. So it looks like this might help:
In RAILS_ROOT/config/environments/development.rb
Set the following setting to false:
# NOTE: original version is `:page_load`
config.active_record.migration_error = false
In my situation it now does not show the pending migration error anymore. Should work for rake tasks and console for the same environment as well.
Source in rails/rails
Rename the migration dependent on the data from:
20140730091353_migration_name.rb
to
.20140730091353_migration_name.rb
(add a dot at the start of the filename)
Then run rake db:seed (it will no longer complain on the pending migrations) and then rename back the migration.
If you have more migrations following after, you have to rename all of them or just move it temporary away.
Rails stores migration information in a table called schema_migrations.
You can add the version from your migration into that table to skip a specific migration.
The version is the number string which comes before the description in the file name.
[version]_Create_Awesome.rb
I had a similar issue. I commented out the add_column lines and ran the rake db:migrate commands and then removed the comment when I will need it for the testing or production environment.
There is no way unless you monkey patch the Rails code. I strongly advise you to fix your migrations instead.
A migration should not depend on the existence of some data in the database. It can depend on a previous migration, but of course absolutely not on the data on the db.
If you came across the "pending migrations" issue when trying to seed your data from within a running Rails application, you can simply call this directly which avoids the abort_if_pending_migrations check:
ActiveRecord::Tasks::DatabaseTasks.load_seed
See where seeds are actually called from within ActiveRecord:
https://github.com/rails/rails/blob/v6.0.3.2/activerecord/lib/active_record/railties/databases.rake#L331
and see the DatabaseTasks docs:
https://apidock.com/rails/v6.0.0/ActiveRecord/Tasks/DatabaseTasks
https://apidock.com/rails/v6.0.0/ActiveRecord/Tasks/DatabaseTasks/load_seed
I have a lot of spots in my code that actually call activerecord finders. For example, in a Blog engine, I might have a table of tags that correspond to an activerecord model Tag. Suppose, for some reason, that I want special logic to happen if a post is created with a tag where tag.description == 'humor'. Then I might have a method in the model:
class Tag < ActiveRecord::Base
def self.humor_tag
find_by_description('humor')
end
end
Whether or not this is poor design, it causes insane amounts of problems for me when using rake commands to build a database. Say that later on, I've finished my development and I want to deploy to production. So I take the dumped schema.rb file, and then I want to load a new database structure from that schema.rb, or alternatively, just run my migrations to create a production database.
RAILS_ENV=production rake db:schema:load
The problem is that, in the production environment, the rake command seems to load every model. When it tries to load the Tag#humor_tag method, it throws an error that stops the process:
rake aborted!
Table 'production_database.tags' doesn't exist
Well of course it doesn't exist, it hasn't been created yet! I've googled around and people seem to solve this problem by either cloning the database in SQL or moving around their code just so they can run the rake task.
What are you supposed to do? It seems like there might be some configuration somewhere to let you tell rake to freaking ignore calls to database tables before any tables are created.
I would suggest replacing queries by class methods with scopes: http://guides.rubyonrails.org/active_record_querying.html#scopes
and if you have an initializer that is causing the models to load, use a proc in the scope definition, such as
class Post < ActiveRecord::Base
scope :published, Proc.new { where(:published => true) }
end
to prevent the scope from running at initialization time.
I'm not completely satisfied with this answer, but if anyone gets to this question and has a similar problem, this may be helpful. In order to move over a database in a situation where you would usually rake db:schema:load or just create it and run the migrations, you can alternatively load the database from SQL (or presumably other database technologies).
rake db:structure:dump
That command will dump the structure of the database into a file that will then be able to recreate it. For me, it created a file db/development_dump.sql, that contained calls to create all of the tables and indices, but didn't copy any of the data like on a normal sql dump. Then, I moved that file to my production database, and ran it:
mysql prod_database < development_dump.sql
This doesn't answer the question at hand, but it may be relevant for someone facing a similar problem.
I have some weird issues going on (for a very weird use case as I'll explain). I'm setting up a multi-tenant application using postgres schemas for data multi-tenancy.
Each company in my system will get its own schema. The way I accomplish this is with an after_commit on the model, on create, that then goes and creates a new postgres schema, and loads schema.rb into it. (copied from rake db:schema:load code) using ruby load.
You can see the gem here
Anyway, all this works (in the console). Creating a company creates the new schema and i can switch to it etc... my problem lies in my integration tests. I have an rspec test that creates to companies like so:
before do
#c1 = Factory :company
#c2 = Factory :company
end
What's odd is that I start to get the logs about the db schema loading, but they're truncated. Almost as if they're happening in parallel. Here's a sample output:
>> create: database: unique_name1
-- create_table("first_table_in_schema_rb", {:force=>true})
>> create: database: unique_name2
create: database is my log line, the -- create_table is from schema.rb itself.
As you can see, the second create: database seems to happen while I'm loading schema.rb from the first company creation.
Does anyone know if load is somehow asynchronous? I know ruby doesn't have real threads, but could it be using fibres or something? This is really messing me up because when my test comes around, the postgres schema that was meant to be created doesn't seem to exist.
Rails 3.0.8
Ruby 1.9.2
Im not 100% sure this is your problem because im sure of what happens with require but not with load, the things this happen to me once with require because require is not atomic, so loading code from a file with require will cause a race condition. Maybe that is what is happening with load but i was not able to find any info about load been atomic or not.
nevermind... issue had nothing to do with load it was the fact that i was already connected to the wrong schema when importing the schema.rb
There was in fact an exception being thrown that was silently caught somewhere
I need to populate my production database app with data in particular tables. This is before anyone ever even touches the application. This data would also be required in development mode as it's required for testing against. Fixtures are normally the way to go for testing data, but what's the "best practice" for Ruby on Rails to ship this data to the live database also upon db creation?
ultimately this is a two part question I suppose.
1) What's the best way to load test data into my database for development, this will be roughly 1,000 items. Is it through a migration or through fixtures? The reason this is a different answer than the question below is that in development, there's certain fields in the tables that I'd like to make random. In production, these fields would all start with the same value of 0.
2) What's the best way to bootstrap a production db with live data I need in it, is this also through a migration or fixture?
I think the answer is to seed as described here: http://lptf.blogspot.com/2009/09/seed-data-in-rails-234.html but I need a way to seed for development and seed for production. Also, why bother using Fixtures if seeding is available? When does one seed and when does one use fixtures?
Usually fixtures are used to provide your tests with data, not to populate data into your database. You can - and some people have, like the links you point to - use fixtures for this purpose.
Fixtures are OK, but using Ruby gives us some advantages: for example, being able to read from a CSV file and populate records based on that data set. Or reading from a YAML fixture file if you really want to: since your starting with a programming language your options are wide open from there.
My current team tried to use db/seed.rb, and checking RAILS_ENV to load only certain data in certain places.
The annoying thing about db:seed is that it's meant to be a one shot thing: so if you have additional items to add in the middle of development - or when your app has hit production - ... well, you need to take that into consideration (ActiveRecord's find_or_create_by...() method might be your friend here).
We tried the Bootstrapper plugin, which puts a nice DSL over the RAILS_ENV checking, and lets your run only the environment you want. It's pretty nice.
Our needs actually went beyond that - we found we needed database style migrations for our seed data. Right now we are putting normal Ruby scripts into a folder (db/bootstrapdata/) and running these scripts with Arild Shirazi's required gem to load (and thus run) the scripts in this directory.
Now this only gives you part of the database style migrations. It's not hard to go from this to creating something where these data migrations can only be run once (like database migrations).
Your needs might stop at bootstrapper: we have pretty unique needs (developing the system when we only know half the spec, larg-ish Rails team, big data migration from the previous generation of software. Your needs might be simpler).
If you did want to use fixtures the advantage over seed is that you can easily export also.
A quick guess at how the rake task may looks is as follows
desc 'Export the data objects to Fixtures from data in an existing
database. Defaults to development database. Set RAILS_ENV to override.'
task :export => :environment do
sql = "SELECT * FROM %s"
skip_tables = ["schema_info"]
export_tables = [
"roles",
"roles_users",
"roles_utilities",
"user_filters",
"users",
"utilities"
]
time_now = Time.now.strftime("%Y_%h_%d_%H%M")
folder = "#{RAILS_ROOT}/db/fixtures/#{time_now}/"
FileUtils.mkdir_p folder
puts "Exporting data to #{folder}"
ActiveRecord::Base.establish_connection(:development)
export_tables.each do |table_name|
i = "000"
File.open("#{folder}/#{table_name}.yml", 'w') do |file|
data = ActiveRecord::Base.connection.select_all(sql % table_name)
file.write data.inject({}) { |hash, record|
hash["#{table_name}_#{i.succ!}"] = record
hash }.to_yaml
end
end
end
desc "Import the models that have YAML files in
db/fixture/defaults or from a specified path."
task :import do
location = 'db/fixtures/default'
puts ""
puts "enter import path [#{location}]"
location_in = STDIN.gets.chomp
location = location_in unless location_in.blank?
ENV['FIXTURES_PATH'] = location
puts "Importing data from #{ENV['FIXTURES_PATH']}"
Rake::Task["db:fixtures:load"].invoke
end
I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
desc "Load seed fixtures (from db/fixtures) into the current environment's database."
task :seed => :environment do
require 'active_record/fixtures'
Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
end
end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?
Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated and since most Rails developers don't use database constraints this means you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.
For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
if Rails.env.production?
puts "!!!WARNING!!! This task will DESTROY " +
"your production database and RESET all " +
"application settings"
puts "Continue? y/n"
continue = STDIN.gets.chomp
unless continue == 'y'
puts "Exiting..."
exit!
end
end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed rake gems:install, etc into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on production server. Rails' db/seed.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.
Rails 3 will solve this for you using a seed.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb
We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.
How about just deleting the task off your production server once you have seeded the database?
I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
def up
add_to(:states, [
{:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
{:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
]
end
end
def down
remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
end
end
alternately:
def up
State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?