Updating a large record set in Rails

I need to update a single field across a large set of records. Normally, I would just run a quick SQL update statement from the console and be done with it, but this is a utility that end users need to be able to run in this app.
So, here's my code:
users = User.find(:all, :select => 'id, flag')
users.each do |u|
u.flag = false
u.save
end
I'm afraid this is just going to take longer and longer as the number of users increases (currently sitting at around 35k, adding 2-5k a week). Is there a faster way to do this?
Thanks!

If you really want to update all records, the easiest way is to use #update_all:
User.update_all(:flag => false)
This is the equivalent of:
UPDATE users SET flag = 'f'
(The exact SQL will be different depending on your adapter)
The #update_all method also accepts conditions:
User.update_all({:flag => false}, {:created_on => 3.weeks.ago .. 5.hours.ago})
Also, #update_all can be combined with named scopes:
class User < ActiveRecord::Base
named_scope :inactive, lambda { {:conditions => {:last_login_at => 2.years.ago .. 2.weeks.ago}} }
end
User.inactive.update_all(:flag => false)

You could use ActiveRecord's execute method to execute the update SQL. Something like this:
ActiveRecord::Base.connection.execute('UPDATE users SET flag=0')

Related

rails habtm: return associated records but with exclusive match

Maintaining an existing Rails 2.3.x app that has a custom role-based authorization system.
The code has something like this:
class Role < ActiveRecord::Base
# has an int attribute called "level" with higher values indicating more powerful role
has_and_belongs_to_many :members
end
class Member < ActiveRecord::Base
has_and_belongs_to_many :roles
end
Roles table has something like
(id, name, level)
1, admin, 1000
2, VIP, 500
3, regular, 100
4, some_other_role, 50
I have the following members with stated roles
member1 (roles: admin, VIP, regular)
member2 (roles: VIP, regular)
member3 (roles: regular)
What I need at times is pull up members based on their highest assigned role:
Role.admins_exclusively # should return member1
Role.vips_exclusively # should return just member2
Role.regulars_exclusively # should be just member3
Can't wrap my head around how to do this in Rails, without resorting to writing raw SQL queries.
Any suggestions?
Update: Mar 29th, 2012
This was my solution to basically define a bunch of methods like this (well using some dynamic programming along with define_method()) for each role.
class Member < ActiveRecord::Base
define_method :vips_exclusively do
scoped :joins => :roles,
:group => 'members.id',
:having => ["max(roles.level) = ?", Role.find_by_name('vip').level]
end
end
However, I discovered that there is an issue with older rails 2.3.x. Calling size() or count() on Member.vips_exclusively for example would produce incorrect totals. Calling length() would produce correct result, but it is recommended to use size() wherever possible.
After looking at the Rails code, it looks like options such as :group and :having do not get passed along to count() when set in scoped(). Replacing calls to scoped() with named_scopes does NOT solve the counting problem either (see the update below).
So I incorporated Chris's proposal along with some edits for correctness/brevity. Thank you!
Another update.
Actually, the issue of :group and :having not being passed along is also present in the named_scope implementation.
And sure enough here's a stale ticket with no fix ever making it to Rails source tree (at least not in 2.3.x branch).
https://rails.lighthouseapp.com/projects/8994/tickets/1349-named-scope-with-group-by-bug
That's great...
I don't think you'll need to write SQL queries directly, but I think you'll need a named_scope with some SQL group and having clauses to do what you're looking for:
In app/models/member.rb
named_scope :maximum_level, lambda { |level| {
:having => [ 'MAX(roles.level) = ?', level ],
:group => 'members.id', # edited to need quotes
:joins => :roles # don't need the whole join statement
} }
In app/models/role.rb
def exclusive_members
Member.maximum_level(self.level).all
end
def self.members_by_role_name(role_name)
role = self.find(:first, :conditions => ['name = ?', role_name])
role.exclusive_members
end
Example
>> r = Role.find(1)
=> <Role id:1, name:"admin", level:1000>
>> r.exclusive_members
=> [ list of members with a highest role of "admin"]
>> r.exclusive_members.map { |m| m.name }
=> [ "member1" ]
>> Role.members_by_role_name("admin")
=> # the same list as you'd get by calling r.exclusive_members
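To make the GROUP BY / HAVING MAX(...) idea concrete, here is a rough plain-Ruby sketch of the same filter run in memory. The Role and Member structs, the MEMBERS list, and the members_with_highest_level helper are all hypothetical stand-ins for illustration, not the real models; the database version above is what you would actually use.

```ruby
# Plain-Ruby stand-ins for the Role and Member models (hypothetical, no database).
Role = Struct.new(:name, :level)
Member = Struct.new(:name, :roles)

ADMIN   = Role.new('admin', 1000)
VIP     = Role.new('VIP', 500)
REGULAR = Role.new('regular', 100)

MEMBERS = [
  Member.new('member1', [ADMIN, VIP, REGULAR]),
  Member.new('member2', [VIP, REGULAR]),
  Member.new('member3', [REGULAR])
]

# In-memory equivalent of GROUP BY members.id HAVING MAX(roles.level) = level:
# keep a member only when their strongest role sits exactly at `level`.
def members_with_highest_level(members, level)
  members.select { |m| m.roles.map(&:level).max == level }
end

puts members_with_highest_level(MEMBERS, VIP.level).map(&:name).inspect
```

member1 also holds the VIP role, but is filtered out because its maximum level is the admin level, which is the "exclusive" behaviour the question asks for.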

Polymorphic Relationship Table Queries in Rails — find object by multiple

I have a relationship table in a rails application called edit_privileges, in which the User is the "editor" and a number of other classes are "editable". Let's say that two of those classes are Message and Comment.
My EditPrivilege model uses the following code:
belongs_to :editor, :class_name => "User"
belongs_to :editable, :polymorphic => true
And User, of course
has_many :edit_privileges, :foreign_key => "editor_id"
In order to determine if a user has edit privileges for a certain model, I can't do the normal query:
user.edit_privileges.find_by_editable_id(@message.id)
because if the user has edit privileges to edit a comment with the same id as @message, the query will return true with the wrong edit privilege record from the table.
So, I tried doing these options:
user.edit_privileges.find(:all, :conditions => ["editable_id = ? AND editable_type = ?", @message.id, @message.class.to_s])
user.edit_privileges.where(:editable_id => @message.id, :editable_type => @message.class.to_s)
which work great at finding the right record, but return an array instead of an object (an empty array [] if there is no edit privilege). This is especially problematic if I'm trying to create a method to destroy edit privileges, since you can't call .destroy on an array.
I figure appending .first to the two solutions above returns the first object, or nil if the result of the query is an empty array, but is that really the best way to do it? Are there any problems with doing it this way? (As opposed to, say, using dynamic attribute-based finders like find_by_editable_id_and_editable_type.)
Use find(:first, ...) instead of find(:all, ...) to get one record (note that it returns nil when nothing matches, whereas find with an id raises a RecordNotFound exception). So for your example:
user.edit_privileges.find(:first, :conditions => { :editable_id => @message.id, :editable_type => @message.class.to_s })
BTW, if you're on a newer Rails version (3.x), Model.where(...).first is the new syntax:
user.edit_privileges.where(:editable_id => @message.id, :editable_type => @message.class.to_s).first
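Either way you end up with a single record or nil, so the calling code should guard against nil before calling something like destroy. A small in-memory sketch of that behaviour, assuming a hypothetical EditPrivilege struct and find_privilege helper in place of the real association:

```ruby
# Hypothetical in-memory stand-in for rows of the edit_privileges table.
EditPrivilege = Struct.new(:editor_id, :editable_id, :editable_type)

PRIVILEGES = [
  EditPrivilege.new(1, 42, 'Comment'),
  EditPrivilege.new(1, 42, 'Message')
]

# Match on id AND type, then take the first hit; nil when nothing matches.
def find_privilege(privileges, editable_id, editable_type)
  privileges.select { |priv|
    priv.editable_id == editable_id && priv.editable_type == editable_type
  }.first
end

privilege = find_privilege(PRIVILEGES, 42, 'Message')
missing   = find_privilege(PRIVILEGES, 7, 'Message')

# Guard before acting on the record, since a miss gives nil rather than [].
puts privilege.editable_type if privilege
puts 'no privilege found' if missing.nil?
```

Matching on both columns is what keeps a Comment with the same id from being picked up by mistake.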

Cleaning up controllers to speed up application

So in my app I have notifications and different record counts that are used in the overall layout, and are therefore needed on every page.
Currently in my application_controller I have a lot of things like this:
@status_al = Status.find_by_name("Alive")
@status_de = Status.find_by_name("Dead")
@status_sus = Status.find_by_name("Suspended")
@status_hid = Status.find_by_name("Hidden")
@status_arc = Status.find_by_name("Archived")
@balloon_active = Post.where(:user_id => current_user.id, :status_id => @status_al.id )
@balloon_dependent = Post.where(:user_id => current_user.id, :status_id => @status_de.id )
@balloon_upcoming = Post.where(:user_id => current_user.id, :status_id => @status_sus.id )
@balloon_deferred = Post.where(:user_id => current_user.id, :status_id => @status_hid.id )
@balloon_complete = Post.where(:user_id => current_user.id, :status_id => @status_arc.id )
..
That's really just a small piece; I have at least double this with similar calls. The issue is I need these numbers pretty much on every page, but I feel like I'm hitting the DB wayyyy too many times here.
Any ideas for a better implementation?
Scopes
First off, you should move many of these into scopes, which will allow you to use them in far more flexible ways, such as chaining queries using ActiveRecord. See http://edgerails.info/articles/what-s-new-in-edge-rails/2010/02/23/the-skinny-on-scopes-formerly-named-scope/index.html.
Indexes
Second, if you're doing all these queries anyway, make sure you index your database to, for example, find Status quickly by name. A sample migration to accomplish the first index:
add_index :statuses, :name # use the name of your Status table here
Session
If the data you need here is not critical, i.e. you don't rely on it for further calculations or database updates, you could consider storing some of this data in the user's session. If you do so, you can simply read whatever you need from the session in the future instead of hitting your db on every page load.
If this data is critical and/or it must be updated to the second, then avoid this option.
Counter Caching
If you need certain record counts on a regular basis, consider setting up a counter_cache. Basically, in your models, you do the following:
Parent.rb
has_many :children
Child.rb
belongs_to :parent, :counter_cache => true
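Ensure your parent table has an integer field called children_count (Rails derives the column name from the pluralized child model, so here it is children_count, not child_count) and Rails will update this field for you on every child's creation/deletion. If you use counter_caching, you will avoid running a separate count query each time you need the number.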
Note: Using counter_caching will result in a slightly longer create and destroy action, but if you are using these counts often, it's usually worth going with counter_cache.
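If the trade-off isn't obvious, here is a conceptual sketch in plain Ruby. This is NOT how Rails implements :counter_cache internally; ParentRecord and its methods are made-up names that only illustrate the bookkeeping: a little extra work on every create/destroy in exchange for a count that can be read without recounting.

```ruby
# Conceptual sketch only: hypothetical class, not Rails' actual implementation.
class ParentRecord
  attr_reader :children, :children_count

  def initialize
    @children = []
    @children_count = 0 # plays the role of the cached counter column
  end

  # Creating a child costs one extra write to keep the counter current.
  def add_child(child)
    @children << child
    @children_count += 1
  end

  # Destroying a child decrements the counter, but only if it was present.
  def remove_child(child)
    index = @children.index(child)
    return unless index
    @children.delete_at(index)
    @children_count -= 1
  end
end

parent = ParentRecord.new
parent.add_child(:first)
parent.add_child(:second)
parent.remove_child(:first)
puts parent.children_count # read the cached value instead of recounting
```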
You should only need 1 database query for this, something like:
@posts = Post.where(:user_id => current_user.id).includes(:status)
Then use Enumerable#group_by to collect the posts into the different categories:
posts_by_status = @posts.group_by { |post| post.status.name }
which will give you a hash:
{'Alive' => [...], 'Dead' => [...]}
etc.
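For the counts in the layout you then only need the size of each group. A rough plain-Ruby sketch of that step, assuming a simplified Post struct with a flat status_name field (hypothetical; in the real app it would be post.status.name via the association) and a made-up balloon_counts helper:

```ruby
# Hypothetical stand-in for already-loaded Post records.
Post = Struct.new(:title, :status_name)

POSTS = [
  Post.new('first', 'Alive'),
  Post.new('second', 'Dead'),
  Post.new('third', 'Alive')
]

# One pass over the loaded posts buckets them by status name.
def posts_by_status(posts)
  posts.group_by { |post| post.status_name }
end

# Per-status counts; statuses with no posts fall back to 0 instead of nil.
def balloon_counts(posts)
  counts = Hash.new(0)
  posts_by_status(posts).each { |name, group| counts[name] = group.length }
  counts
end

counts = balloon_counts(POSTS)
puts counts['Alive']
puts counts['Archived']
```

The Hash.new(0) default matters here because a status with no posts never shows up as a key in the grouped hash.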

Rails find conditions... where attribute is not a database column

I think it's safe to say everyone loves doing something like this in Rails:
Product.find(:all, :conditions => {:featured => true})
This will return all products where the attribute "featured" (which is a database column) is true. But let's say I have a method on Product like this:
def display_ready?
(self.photos.length > 0) && (File.exist?(self.file.path))
end
...and I want to find all products where that method returns true. I can think of several messy ways of doing it, but I think it's also safe to say we love Rails because most things are not messy.
I'd say it's a pretty common problem for me... I'd have to imagine that a good answer will help many people. Any non-messy ideas?
The only reliable way to filter these is the somewhat ugly method of retrieving all records and running them through a select:
display_ready_products = Product.all.select(&:display_ready?)
This is inefficient to the extreme especially if you have a large number of products which are probably not going to qualify.
The better way to do this is to have a counter cache for your photos, plus a flag set when your file is uploaded:
class Product < ActiveRecord::Base
has_many :photos
end
class Photo < ActiveRecord::Base
belongs_to :product, :counter_cache => true
end
You'll need to add a column to the Product table:
add_column :products, :photos_count, :integer, :default => 0
This will give you a column with the number of photos. There's a way to pre-populate these counters with the correct numbers at the start instead of zero, but there's no need to get into that here.
Add a column to record your file flag:
add_column :products, :file_exists, :boolean, :null => false, :default => false
Now trigger this when saving:
class Product < ActiveRecord::Base
before_save :assign_file_exists_flag
protected
def assign_file_exists_flag
self.file_exists = File.exist?(self.file.path)
end
end
Since these two attributes are rendered into database columns, you can now query on them directly:
Product.find(:all, :conditions => ['file_exists = ? AND photos_count > 0', true])
You can clean that up by writing two named scopes that will encapsulate that behavior.
You need to do a two level select:
1) Select all possible rows from the database. This happens in the db.
2) Within Ruby, select the valid rows from all of the rows. Eg
possible_products = Product.find(:all, :conditions => {:featured => true})
products = possible_products.select{|p| p.display_ready?}
Added:
Or:
products = Product.find(:all, :conditions => {:featured => true}).select {|p|
p.display_ready?}
The second select is the select method of the Array object. Select is a very handy method, along with detect. (Detect comes from Enumerable and is mixed in with Array.)
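A quick plain-Ruby illustration of the difference between the two. The Product struct, its simplified display_ready? (photo count only, no file check), and the two helper methods are hypothetical stand-ins so the snippet runs without a database:

```ruby
# Hypothetical stand-in for Product records already fetched from the db.
Product = Struct.new(:name, :featured, :photo_count) do
  # Simplified version of the question's display_ready? (no file check here).
  def display_ready?
    photo_count > 0
  end
end

PRODUCTS = [
  Product.new('lamp', true, 2),
  Product.new('desk', true, 0),
  Product.new('chair', true, 1)
]

# select returns every matching element (possibly an empty array).
def ready_products(products)
  products.select { |product| product.display_ready? }
end

# detect returns only the first match, or nil when there is none.
def first_ready_product(products)
  products.detect { |product| product.display_ready? }
end

puts ready_products(PRODUCTS).map(&:name).inspect
puts first_ready_product(PRODUCTS).name
```

detect stops at the first hit, so it is the better fit when you only need one record.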

:from parameter in active record find not well designed?

I got this error:
SQLite3::SQLException: no such column: apis.name: SELECT * FROM examples WHERE ("apis"."name" = 'deep')
My code:
Api.find :all, :from => params[:table_name], :conditions => {:name => 'deep' }
I need to build a back-end Rails application that will be used by a Silverlight application. One of the requirements is to fetch simple data from the database, and I need to be able to query different tables with the same code (my app has 2000 tables!).
I don't think it makes sense for Rails to put "apis" in the WHERE clause. Is there any specific reason for this?
It does that so when joins are performed, the where clauses will line up with the right tables' columns. This is handy most of the time, but in your particular case causes issues.
What you could do is use the other conditions syntax, which will not add rails table names to the attributes, but still sanitize the inputs properly.
Api.find :all, :from => params[:table_name], :conditions => ['name = ?','deep']

Resources