ActiveRecord models cached in rake tasks? - ruby-on-rails

I know that in rails 2.3.2 ActiveRecord queries are cached, i.e. you may see something in the development/production log:
CACHE (0.0ms) SELECT * FROM `users` WHERE `users`.`id` = 1
I was wondering if the same principles apply to rake tasks.
I have a rake task that will query a lot of different models, and I want to know if I should implement my own caching, or if this behavior is included by default.
Also, is there a way to see the SQL queries that are performed during the rake task, similar to the development/production log?

The SQL cache isn't enabled by default for rake tasks. You can wrap your code in a cache block, like this:
task :foobar => :environment do
  ActiveRecord::Base.connection.cache do
    User.find 1 # Will hit the db
    User.find 1 # Will hit the cache
  end
end
This is essentially what Rails does for controller actions. Note that the cache uses memory, and rake tasks tend to work with large sets of data, which might cause problems. You can selectively opt out of caching for parts of your code using uncached.

You are talking about ActiveRecord Query Caching. That should work in Rake-Tasks too, provided you're running them in an environment with caching enabled, e.g. production.
See Rails Guide on Caching for examples.
It may or may not be the right sort of caching for your case:
u1 = User.find 1      # loads user1 first time from DB
u2 = User.find 2      # loads user2 first time from DB
u1again = User.find 1 # loads user1 from cache
all = User.all        # loads user1 and user2 from DB again
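The last line of that example follows from the query cache being keyed by the SQL text rather than by record. A toy model can make this visible. The sketch below is not ActiveRecord's implementation; the class name ToyQueryCache and the db_hits counter are made up for illustration.

```ruby
# Toy model of a query cache keyed by SQL text.
# This is an illustration only, not how ActiveRecord is implemented.
class ToyQueryCache
  attr_reader :db_hits

  # The block stands in for "run this SQL against the database".
  def initialize(&run_query)
    @run_query = run_query
    @results = {}
    @db_hits = 0
  end

  # Return the cached result for identical SQL, otherwise run the query.
  def select(sql)
    @results.fetch(sql) do
      @db_hits += 1
      @results[sql] = @run_query.call(sql)
    end
  end

  # Drop every cached result, as a write to the database would require.
  def clear
    @results.clear
  end
end

cache = ToyQueryCache.new { |sql| "rows for: #{sql}" }
cache.select("SELECT * FROM users WHERE id = 1") # runs the query
cache.select("SELECT * FROM users WHERE id = 1") # served from the cache
cache.select("SELECT * FROM users")              # different SQL, runs the query
```

Because the third statement is different SQL, it cannot be answered from the rows cached for the first one, which is why User.all goes back to the database.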

A rake task will run in the environment you specify, and it will adopt the rules of that environment.
You can set the rails env from the command line:
RAILS_ENV=test
Logging can be set as part of rake and you should see this in your normal Rails log.

Related

Mysql2 error while using RSpec fixtures

I've added a dependency to both order and order_items fixtures (which already existed), but I'm receiving the following error every time I run my rspec worker test.
ActiveRecord::StatementInvalid:
Mysql2::Error: Table 'inventory_test10.order_packages' doesn't exist: SHOW FULL FIELDS FROM `order_packages` /*controller:,action:,line:*/
I have an order which has many order_items and many order_packages. order_items also belong to order_packages. Therefore, I am able to do:
order.order_items.each do |oi|
  puts oi.order_package.status
end
The original issue was that status wasn't recognized for nil class because an order_packages.yml fixture was never created. I've tried several rake tasks, but I'm not super familiar with fixtures, migrations, rake tasks, etc., and I'm not sure if I accidentally caused the error by running multiple tasks. Below is a snippet from a blog that warned about running the command multiple times - http://brandonhilkert.com/blog/using-rails-fixtures-to-seed-a-database/:
rake db:fixtures:load FIXTURES=credit_card_types
A word of warning, if we run this command multiple times, it will seed
the table multiple times. It’s not idempotent.
Other tasks I ran:
FIXTURES=orders; rake db:fixtures:load
rake db:fixtures:dump (didn't work - error)
rake db:fixtures:drop (didn't work - error)
Thanks in advance for any suggestions!
Your test framework should automatically load fixtures at the beginning of the test run, and delete them at the end of the test run. You should not need to load fixtures yourself.
Fixtures load data into tables; they do not alter the database structure. Migrations can alter the database by creating/dropping tables, adding/removing columns, etc. If you are having an issue with a missing table, it is very likely a migration problem.
I recommend a review of the Guide to Testing Rails Applications, and (if you are using RSpec) the rspec-rails documentation, which explain these concepts in greater depth.

Rollback rake db:seed if exception is raised

My seeds file runs through quite a few csv files, does a few checks and creates various ActiveRecord records accordingly. While testing all these files, I finally think I have it and run rake db:seed but if something fails, I want what has been created so far to rollback.
Scenario that has already happened:
Seeds file requires 4 different CSVs
Only 3 of the 4 CSVs were uploaded to staging server
rake db:seed was run and the seeds file blew up halfway through because it couldn't find a file, but over 1000 AR objects were created prior to that.
Ideally I'd like to do something like:
begin
  CSV.readlines(file1)
  CSV.readlines(file2)
  CSV.readlines(file3)
  CSV.readlines(file4)
rescue
  # raise an error
  # rollback all objects created prior to error
end
I suppose I could implement something custom but I can't find anything on the rails guides regarding this.
This is the purpose of Active Record Transactions:
Transactions are protective blocks where SQL statements are only
permanent if they can all succeed as one atomic action. The classic
example is a transfer between two accounts where you can only have a
deposit if the withdrawal succeeded and vice versa. Transactions
enforce the integrity of the database and guard the data against
program errors or database break-downs. So basically you should use
transaction blocks whenever you have a number of statements that must
be executed together or not at all.
Try this:
ActiveRecord::Base.transaction do
  ...
end
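A transaction undoes partial work after a failure. For the missing-file scenario in the question, a cheap complement is to check that every CSV exists before creating anything. The sketch below is one way to do that; the helper name ensure_seed_files! is hypothetical and not part of Rails.

```ruby
# Hypothetical helper (not part of Rails): fail fast if any seed file is absent,
# so nothing is written before the problem is noticed.
def ensure_seed_files!(paths)
  missing = paths.reject { |path| File.file?(path) }
  return paths if missing.empty?

  raise ArgumentError, "missing seed files: #{missing.join(', ')}"
end
```

Calling it on the first line of the seeds file, before the ActiveRecord::Base.transaction block, means a forgotten upload aborts the run before a single record is created, while the transaction still covers failures that only show up halfway through.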

How do I redirect the sql statements in Rails to other than log/development.log?

Something in Rails (ActiveRecord::Base.logger ?) puts all executed SQL into log/development.log.
I have a rails app, whose data is populated by several rake tasks. Often in development, I want to run the web app, and run several of these rake tasks simultaneously (they are long running tasks that talk to other systems and create data in a local database).
Annoyingly, they all log to the same file at the same time.
How/where should I change this? Can I do it from the command line? Where (if in a rake file) should I do it? Or should I create new environments for each rake task?
Is there documentation I should have read to answer this, and if so, where is it?
Thanks a bunch.
Depends on what you are really trying to log. In Rails 3+ you can use the ActiveSupport::LogSubscriber mechanism to hook into Rails internals and log certain interests to other files.
If you really want to hijack the core Rails.logger instance for each of your rake tasks, then do just that.
# lib/tasks/foo.rake
desc "Prints Hello World"
# Depend on :environment so that Rails (and Rails.logger) is loaded first
task :helloworld => :environment do
  Rails.logger = Logger.new("/path/to/hello-world.log")
  # do something in your Rails stack that would write to "Rails.logger"
end
Not tested, but I think that might work.
That being said, I think subscribing to interesting events in ActiveRecord via ActiveSupport::LogSubscriber might be a cleaner approach.
My current solution is:
desc "update statistics"
task :update_stats => :environment do
  root_path = Rails.configuration.root_path
  env = Rails.configuration.environment
  ActiveRecord::Base.logger = Logger.new(File.join(root_path, "log", "#{env}-stats.log"))
  #code ...
end
It feels a bit hacky, but maybe that's just the way to do it.
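If several tasks need the same treatment, the logger setup can be pulled into a small helper so each task only names its own log file. This is a minimal sketch using Ruby's standard Logger; the helper name task_logger and the "<env>-<task>.log" naming scheme are assumptions made for illustration.

```ruby
require "logger"
require "fileutils"

# Hypothetical helper: build a Logger that writes to <root>/log/<env>-<task>.log.
def task_logger(root, env, task_name)
  log_dir = File.join(root.to_s, "log")
  FileUtils.mkdir_p(log_dir) # make sure the log directory exists
  Logger.new(File.join(log_dir, "#{env}-#{task_name}.log"))
end

# Inside a rake task that depends on :environment (Rails only):
#   ActiveRecord::Base.logger = task_logger(Rails.root, Rails.env, "stats")
```

Each long-running task then writes its SQL to its own file, and the web app keeps log/development.log to itself.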

Why am I seeing test environment data in my development Rails cache?

I am attempting to cache a class variable like so:
Rails.cache.write("@@page_types", @@page_types)
This method is called within a class I have called PageTypes.
If I start up a rails console and do:
Rails.cache.write("@@page_types", nil)
Rails.cache.read("@@page_types")
I get nil. I leave the console open and do this in another window:
rake test:units
When the tests are over, I switch back to my rails console window and do
Rails.cache.read("@@page_types")
It returns an array of my test page types! I'm positive they are from my test db because the models all have super high IDs, while my dev data all has very low ones.
I suppose I could append Rails.env to the cache keys, but it seems like the two caches shouldn't be mixing....
Define a different cache backend for your test environment. A memory_store should be perfect for unit tests.
ActionController::Base.cache_store = :memory_store
Or, in config/environments/test.rb:
config.cache_store = :memory_store
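If both environments have to share one cache backend, the idea from the question of putting the environment name into the key can be wrapped up once instead of repeated at every call site. The sketch below assumes only that the underlying store responds to read(key) and write(key, value); the class name EnvScopedCache is hypothetical.

```ruby
# Hypothetical wrapper: prefixes every key with the environment name before
# delegating to any store that responds to read(key) and write(key, value).
class EnvScopedCache
  def initialize(store, env)
    @store = store
    @env = env.to_s
  end

  def read(key)
    @store.read(scoped(key))
  end

  def write(key, value)
    @store.write(scoped(key), value)
  end

  private

  # "development:page_types" and "test:page_types" can no longer collide.
  def scoped(key)
    "#{@env}:#{key}"
  end
end
```

In a Rails app this could be built as EnvScopedCache.new(Rails.cache, Rails.env), so that a test run and a development console stop overwriting each other's entries.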

Is seeding data with fixtures dangerous in Ruby on Rails

I have fixtures with initial data that needs to reside in my database (countries, regions, carriers, etc.). I have a task rake db:seed that will seed a database.
namespace :db do
  desc "Load seed fixtures (from db/fixtures) into the current environment's database."
  task :seed => :environment do
    require 'active_record/fixtures'
    Dir.glob(RAILS_ROOT + '/db/fixtures/yamls/*.yml').each do |file|
      Fixtures.create_fixtures('db/fixtures/yamls', File.basename(file, '.*'))
    end
  end
end
I am a bit worried because this task wipes my database clean and loads the initial data. The fact that this is even possible to do more than once on production scares the crap out of me. Is this normal and do I just have to be cautious? Or do people usually protect a task like this in some way?
Seeding data with fixtures is an extremely bad idea.
Fixtures are not validated, and since most Rails developers don't use database constraints, you can easily get invalid or incomplete data inserted into your production database.
Fixtures also set strange primary key ids by default, which is not necessarily a problem but is annoying to work with.
There are a lot of solutions for this. My personal favorite is a rake task that runs a Ruby script that simply uses ActiveRecord to insert records. This is what Rails 3 will do with db:seed, but you can easily write this yourself.
I complement this with a method I add to ActiveRecord::Base called create_or_update. Using this I can run the seed script multiple times, updating old records instead of throwing an exception.
I wrote an article about these techniques a while back called Loading seed data.
For the first part of your question, yes I'd just put some precaution for running a task like this in production. I put a protection like this in my bootstrapping/seeding task:
task :exit_or_continue_in_production? do
  if Rails.env.production?
    puts "!!!WARNING!!! This task will DESTROY " +
         "your production database and RESET all " +
         "application settings"
    puts "Continue? y/n"
    continue = STDIN.gets.chomp
    unless continue == 'y'
      puts "Exiting..."
      exit!
    end
  end
end
I have created this gist for some context.
For the second part of the question -- usually you really want two things: a) very easily seeding the database and setting up the application for development, and b) bootstrapping the application on production server (like: inserting admin user, creating folders application depends on, etc).
I'd use fixtures for seeding in development -- everyone from the team then sees the same data in the app and what's in app is consistent with what's in tests. (Usually I wrap rake app:bootstrap, rake app:seed, rake gems:install, etc. into rake app:install so everyone can work on the app by just cloning the repo and running this one task.)
I'd however never use fixtures for seeding/bootstrapping on a production server. Rails' db/seeds.rb is really fine for this task, but you can of course put the same logic in your own rake app:seed task, like others pointed out.
Rails 3 will solve this for you using a seeds.rb file.
http://github.com/brynary/rails/commit/4932f7b38f72104819022abca0c952ba6f9888cb
We've built up a bunch of best practices that we use for seeding data. We rely heavily on seeding, and we have some unique requirements since we need to seed multi-tenant systems. Here's some best practices we've used:
Fixtures aren't the best solution, but you still should store your seed data in something other than Ruby. Ruby code for storing seed data tends to get repetitive, and storing data in a parseable file means you can write generic code to handle your seeds in a consistent fashion.
If you're going to potentially update seeds, use a marker column named something like code to match your seeds file to your actual data. Never rely on ids being consistent between environments.
Think about how you want to handle updating existing seed data. Is there any potential that users have modified this data? If so, should you maintain the user's information rather than overriding it with seed data?
If you're interested in some of the ways we do seeding, we've packaged them into a gem called SeedOMatic.
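The marker-column practice can be sketched without a database. The example below is plain Ruby over arrays of hashes, not SeedOMatic's API; the method name merge_seeds and the :code key are assumptions for illustration. It matches seed rows to existing rows by code, updates the attributes the seed supplies, and leaves everything else (such as ids) alone.

```ruby
# Hypothetical sketch: merge seed rows into existing rows, matched by :code.
# Attributes present in the seed win; attributes the seed omits are kept.
def merge_seeds(existing_rows, seed_rows)
  rows_by_code = existing_rows.to_h { |row| [row[:code], row] }
  seed_rows.each do |seed|
    current = rows_by_code.fetch(seed[:code], {})
    rows_by_code[seed[:code]] = current.merge(seed)
  end
  rows_by_code.values
end

existing = [{ code: "US", name: "USA", id: 7 }]
seeds = [
  { code: "US", name: "United States" },
  { code: "CA", name: "Canada" }
]
merged = merge_seeds(existing, seeds)
```

Running the merge a second time with the same seeds leaves the result unchanged, which is the property that makes a seed task safe to re-run. If users may have edited seeded rows, the merge direction is the place to decide whose value wins.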
How about just deleting the task off your production server once you have seeded the database?
I just had an interesting idea...
what if you created \db\seeds\ and added migration-style files:
file: 200907301234_add_us_states.rb
class AddUsStates < ActiveRecord::Seeds
  def up
    add_to(:states, [
      {:name => 'Wisconsin', :abbreviation => 'WI', :flower => 'someflower'},
      {:name => 'Louisiana', :abbreviation => 'LA', :flower => 'cypress tree'}
    ])
  end
  def down
    remove_from(:states).based_on(:name).with_values('Wisconsin', 'Louisiana', ...)
  end
end
alternately:
def up
  State.create!( :name => ... )
end
This would allow you to run migrations and seeds in an order that would allow them to coexist more peaceably.
thoughts?
