Simple ActiveRecord Question - ruby-on-rails

I have a database model set up such that a post has many votes, a user has many votes, and a vote belongs to both a user and a post. I'm using will_paginate and I'm trying to create a filter so that the user can sort posts by either the date or the number of votes a post has. The date option is simple and looks like this:
@posts = Post.paginate :order => "date DESC"
However, I can't quite figure how to do the ordering for the votes. If this were SQL, I would simply use GROUP BY on the votes user_id column, along with the count function and then I would join the result with the posts table.
What's the correct way to do this with ActiveRecord?

1) Use the counter cache mechanism to store the vote count in Post model.
# add a column called votes_count
class Post < ActiveRecord::Base
  has_many :votes
end
class Vote < ActiveRecord::Base
  belongs_to :post, :counter_cache => true
end
Now you can sort the Post model by vote count as follows:
Post.order(:votes_count)
2) Use group by.
Post.select("posts.*, COUNT(votes.post_id) votes_count").
  joins(:votes).group("posts.id").order("votes_count")
If you want to include the posts without votes in the result-set then:
# Group by posts.id here: votes.post_id is NULL for posts without votes
Post.select("posts.*, COUNT(votes.post_id) votes_count").
  joins("LEFT OUTER JOIN votes ON votes.post_id=posts.id").
  group("posts.id").order("votes_count")
I prefer approach 1 as it is efficient and the cost of vote count calculation is front loaded (i.e. during vote casting).
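To make the trade-off in approach 1 concrete, here is a minimal plain-Ruby sketch of what a counter cache does conceptually. It does not use Rails at all: PostRecord, cast_vote! and rank_by_votes are hypothetical names made up for this illustration. The point is only that the count is updated once when a vote is cast, so sorting later just reads a stored number.

```ruby
# Plain-Ruby illustration only. PostRecord, cast_vote! and rank_by_votes are
# hypothetical names for this sketch, not part of ActiveRecord.
PostRecord = Struct.new(:title, :votes_count) do
  # Counter-cache style: the stored count is bumped when the vote is cast.
  def cast_vote!
    self.votes_count += 1
  end
end

# Sorting only reads the stored counter, no counting happens at read time.
def rank_by_votes(posts)
  posts.sort_by { |post| -post.votes_count }
end

posts = [PostRecord.new("first", 0), PostRecord.new("second", 0)]
3.times { posts[1].cast_vote! }
posts[0].cast_vote!
puts rank_by_votes(posts).map(&:title).inspect
```

In Rails the increment happens inside the database when a Vote row is created, but the read side is equally cheap: it is an ORDER BY on an ordinary column.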

Just do all the normal SQL stuff as part of the query with options.
@posts = Post.paginate :order => "date DESC", :joins => " inner join votes on post.id..." , :group => " votes.user_id"
http://apidock.com/rails/ActiveRecord/Base/find/class
I don't know much about your models, but you seem to know something about SQL, so here are two more options.
Named scopes: you basically just put the query into a class-level declaration:
named_scope :index , :order => 'date DESC', :joins => .....
They can also take parameters:
named_scope :blah, lambda { |param| ... } # build the query options from param
Or, especially if you are more familiar with SQL, you can write your own query:
@posts = Post.find_by_sql( <<-SQL )
SELECT posts.*
....
SQL

Related

ActiveRecord optimization - best way to query all at once?

I am trying to reduce the number of queries using ActiveRecord 3.0.9. I generated about 200K 'dummy' customers and 500K orders.
Here are the models:
class Customer < ActiveRecord::Base
has_many :orders
end
class Order < ActiveRecord::Base
belongs_to :customer
has_many :products
end
class Product < ActiveRecord::Base
belongs_to :order
end
I use this code in the controller:
@customers = Customer.where(:active => true).paginate(:page => params[:page], :per_page => 100)
# SELECT * FROM customers ...
and this in the view (I removed the HAML markup to make it easier to read):
@order = @customers.each do |customer|
  customer.orders.each do |order| # SELECT * FROM orders ...
    %td= order.products.count # SELECT COUNT(*) FROM products ...
    %td= order.products.sum(:amount) # SELECT SUM(amount) FROM products ...
  end
end
The page renders a table with 100 rows per page. The problem is that it is kind of slow to load because it fires about 3-5 queries per customer's orders. That's about 300 queries to load the page.
Is there an alternative way to reduce the number of queries and load the page faster?
Notes:
1) I have attempted to use includes(:orders), but it pulled in more than 200,000 order ids, which is a problem.
2) The tables are already indexed.
If you're only using COUNT and SUM(amount) then what you really need is to retrieve only that information and not the orders themselves. This is easily done with SQL:
SELECT orders.customer_id, orders.id AS order_id, COUNT(products.id) AS order_count, SUM(products.amount) AS order_total FROM orders LEFT JOIN products ON orders.id=products.order_id GROUP BY orders.customer_id, orders.id
You can wrap this in a method that returns a nice, orderly hash by re-mapping the SQL results into a structure that fits your requirements:
class Order < ActiveRecord::Base
def self.totals
query = "..." # Query from above
result = { }
self.connection.select_rows(query).each do |row|
# Build out an array for each unique customer_id in the results
customer_set = result[row[0].to_i] ||= [ ]
# Add a hash representing each order to this customer order set
customer_set << { :order_id => row[1].to_i, :count => row[2].to_i, :total => row[3].to_i }
end
result
end
end
This means you can fetch all order counts and totals in a single pass. If you have an index on customer_id, which is imperative in this case, then the query will usually be really fast even for large numbers of rows.
You can save the results of this method into a variable such as @order_totals and reference it when rendering your table. Customers without any orders have no entry in the hash, so fall back to an empty list:
- @order = @customers.each do |customer|
  - (@order_totals[customer.id] || []).each do |order|
    %td= order[:count]
    %td= order[:total]
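The re-mapping step can be tried in isolation from the database. The following is a sketch with hard-coded sample rows shaped like what select_rows returns (an array of arrays, with values possibly arriving as strings); group_order_totals and the sample data are made up for the illustration.

```ruby
# Sketch of the row re-mapping step only, no database involved.
# group_order_totals and the sample rows are hypothetical.
def group_order_totals(rows)
  rows.each_with_object({}) do |(customer_id, order_id, count, total), result|
    # One array of order hashes per customer_id
    (result[customer_id.to_i] ||= []) << {
      :order_id => order_id.to_i,
      :count    => count.to_i,
      :total    => total.to_i # nil (no products) becomes 0
    }
  end
end

sample_rows = [
  ["1", "10", "2", "30"],
  ["1", "11", "1", "5"],
  ["2", "12", "0", nil]
]
totals = group_order_totals(sample_rows)
puts totals[1].length
puts totals.fetch(3, []).inspect # customers without orders are simply absent
```

Note that to_i truncates decimal amounts; if amount is a decimal column, to_f or BigDecimal would be the safer conversion.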
You can try something like this (yes, it looks ugly, but you want performance):
orders = Order.find_by_sql([<<-EOD, customer.id])
SELECT os.id, os.name, COUNT(ps.amount) AS count, SUM(ps.amount) AS amount
FROM orders os
JOIN products ps ON ps.order_id = os.id
WHERE os.customer_id = ? GROUP BY os.id, os.name
EOD
- orders.each do |order|
  %td= order.name
  %td= order.count
  %td= order.amount
Added: Probably it is better to create count and amount cache in Orders, but you will have to maintain it (count can be counter-cache, but I doubt there is a ready recipe for amount).
You can join the tables with Arel (I prefer to avoid writing raw SQL when possible). I believe that for your example you would do something like:
Customer.joins(:orders => :products).select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount").group("customers.id, customers.name")
The first method--
Customer.joins(:orders => :products)
--pulls in the nested association in one statement. Then the second part--
.select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount")
--specifies exactly what columns you want back, and the final group call makes the aggregates come out per customer.
Chain those and I believe you'll get a list of Customer instances only populated with what you've specified in the select method. You have to be careful though, because you now have in hand read-only objects that are possibly in an invalid state.
As with all the Arel methods what you get from those methods is an ActiveRecord::Relation instance. It's only when you start to access that data that it goes out and executes the SQL.
I have some basic nervousness that my syntax is incorrect but I'm confident that this can be done w/o relying on executing raw SQL.

Is it possible to delete_all with inner join conditions?

I need to delete a lot of records at once and I need to do so based on a condition in another model that is related by a "belongs_to" relationship. I know I can loop through each checking for the condition, but this takes forever with my large record set because for each "belongs_to" it makes a separate query.
Here is an example. I have a "Product" model that "belongs_to" an "Artist" and lets say that artist has a property "is_disabled".
If I want to delete all products that belong to disabled artists, I would like to be able to do something like:
Product.delete_all(:joins => :artist, :conditions => ["artists.is_disabled = ?", true])
Is this possible? I have done this directly in SQL before, but not sure if it is possible to do through rails.
The problem is that delete_all discards all the join information (and rightly so). What you want to do is capture that as an inner select.
If you're using Rails 3 you can create a scope that will give you what you want:
class Product < ActiveRecord::Base
scope :with_disabled_artist, lambda {
where("products.id IN (#{select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})")
}
end
Your query call then becomes
Product.with_disabled_artist.delete_all
You can also use the same query inline but that's not very elegant (or self-documenting):
Product.where("products.id IN (#{Product.select("products.id").joins(:artist).where("artists.is_disabled = TRUE").to_sql})").delete_all
In Rails 4 (I tested on 4.2) you can do almost exactly what the OP originally wanted:
Application.joins(:vacancy).where(vacancies: {status: 'draft'}).delete_all
will give
DELETE FROM `applications` WHERE `applications`.`id` IN (SELECT id FROM (SELECT `applications`.`id` FROM `applications` INNER JOIN `vacancies` ON `vacancies`.`id` = `applications`.`vacancy_id` WHERE `vacancies`.`status` = 'draft') __active_record_temp)
If you are using Rails 2 you can't do the above. An alternative is to use a joins clause in a find method and call delete on each item.
TellerLocationWidget.find(:all, :joins => [:widget, :teller_location],
:conditions => {:widgets => {:alt_id => params['alt_id']},
:retailer_locations => {:id => @teller_location.id}}).each do |loc|
loc.delete
end

Elegant Summing/Grouping/Etc in Rails

I have a number of objects which are associated together, and I'd like to lay out some dashboards to show them off. For the sake of argument:
Publishing House - has many books
Book - has one author, is from one publishing house, and goes through many states
Publishing House Author - wrote many books
I'd like to get a dashboard that said:
How many books a publishing house put out this month?
How many books an author wrote this month?
What state (in progress, published) each of the books is in?
To start with, I'm thinking some very simple code:
@all_books = Books.find(:all, :joins => [:author, :publishing_house], :select => "books.*, authors.name, publishing_houses.name", :conditions => ["books.created_at > ?", @date])
Then I proceed to go through each of the sub elements I want and total them up into new hashes, like:
@ph_stats = {}
@all_books.map {|book| @ph_stats[book.publishing_house_id] = (@ph_stats[book.publishing_house_id] || 0) + 1 }
This doesn't feel very Rails-like - thoughts?
I think your best bet is to chain named scopes together so you can do things like:
@books = Books.published.this_month
http://api.rubyonrails.org/classes/ActiveRecord/NamedScope/ClassMethods.html#M001683
http://m.onkey.org/2010/1/22/active-record-query-interface
You should really be thinking of the SQL required to write such a query, as such, the following queries should work in all databases:
Number of books by publishing house
PublishingHouse.all(:joins => :books, :select => "books.publishing_house_id, publishing_houses.name, count(*) as total", :group => "1,2")
Number of books an author wrote this month
If you are going to move this into a scope, you WILL need to put it in a lambda, because Date.today has to be evaluated when the query runs rather than once when the class is loaded:
Author.all(:joins => :books, :select => "books.author_id, authors.name, count(*) as total", :group => "1,2", :conditions => ["books.pub_date between ? and ?", Date.today.beginning_of_month, Date.today.end_of_month])
Alternatively, you could use now()::date (Postgres-specific) and construct the dates based on that.
Books of a particular state
Not quite sure this is right wrt your datamodel
Book.all(:joins => :state, :select => "states.name, count(*) as total", :group => "1")
All done through the magic of SQL.
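If the books are already loaded and an in-memory tally is acceptable, the counting loop from the question can also be written more idiomatically in plain Ruby. This is only a sketch: BookRow and count_by_publishing_house are hypothetical names standing in for the real model, and for large tables the grouped SQL above will usually be the better choice.

```ruby
# In-memory tally sketch. BookRow and count_by_publishing_house are
# hypothetical stand-ins, not part of the original models.
BookRow = Struct.new(:title, :publishing_house_id)

def count_by_publishing_house(books)
  # Hash.new(0) gives every unseen key a starting count of zero
  books.each_with_object(Hash.new(0)) do |book, counts|
    counts[book.publishing_house_id] += 1
  end
end

books = [BookRow.new("A", 1), BookRow.new("B", 1), BookRow.new("C", 2)]
puts count_by_publishing_house(books).inspect
```

Compared with map plus manual `|| 0` handling, the default value removes the nil check and each_with_object makes it clear the block is building one result.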

Targeting every object in an array syntax

Newb question of the day:
I'm trying to select all the users with this condition, and then perform an action with each one :
User.find(:all).select { |u| u.organizations.count > 0} do |user|
Except, this isn't the right way to do this. Not entirely sure what the proper syntax is.
Any fellow rubyist offer a newb a hand?
To perform an action with each element of a collection use the each method, like this:
User.find(:all).select { |u| u.organizations.count > 0}.each do |user|
You'd probably be better folding the select into the query with:
User.find(:all, :conditions => "organization_id IS NOT NULL").each do |user|
This will only fetch the relevant results from the database so there should be less unnecessary data retrieved and thrown away.
EDIT:
As suggested in the comments, the following would be correct for a many-to-many relationship, assuming a join model called memberships (where user has_many :organisations, :through => :memberships):
User.all(:joins => "inner join memberships on memberships.user_id = users.id")

Find all objects with no associated has_many objects

In my online store, an order is ready to ship if it is in the "authorized" state and doesn't already have any associated shipments. Right now I'm doing this:
class Order < ActiveRecord::Base
has_many :shipments, :dependent => :destroy
def self.ready_to_ship
unshipped_orders = Array.new
Order.all(:conditions => 'state = "authorized"', :include => :shipments).each do |o|
unshipped_orders << o if o.shipments.empty?
end
unshipped_orders
end
end
Is there a better way?
In Rails 3 using AREL
Order.includes('shipments').where(['orders.state = ?', 'authorized']).where('shipments.id IS NULL')
You can also query on the association using the normal find syntax:
Order.find(:all, :include => "shipments", :conditions => ["orders.state = ? AND shipments.id IS NULL", "authorized"])
One option is to put a shipment_count on Order, where it will be automatically updated with the number of shipments you attach to it. Then you just
Order.all(:conditions => {:state => "authorized", :shipment_count => 0})
Alternatively, you can get your hands dirty with some SQL:
Order.find_by_sql("SELECT * FROM
(SELECT orders.*, count(shipments.id) AS shipment_count FROM orders
LEFT JOIN shipments ON orders.id = shipments.order_id
WHERE orders.status = 'authorized' GROUP BY orders.id)
AS counted_orders WHERE shipment_count = 0")
Test that prior to using it, as SQL isn't exactly my bag, but I think it's close to right. I got it to work for similar arrangements of objects on my production DB, which is MySQL.
Note that if you don't have an index on orders.status I'd strongly advise it!
What the query does: the subquery grabs all the order counts for all orders which are in authorized status. The outer query filters that list down to only the ones which have shipment counts equal to zero.
There's probably another way you could do it, a little counterintuitively:
"SELECT DISTINCT orders.* FROM orders
LEFT JOIN shipments ON orders.id = shipments.order_id
WHERE orders.status = 'authorized' AND shipments.id IS NULL"
Grab all orders which are authorized and don't have an entry in the shipments table ;)
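To see what the LEFT JOIN ... IS NULL form computes, here is the same anti-join written against in-memory data in plain Ruby. OrderRow, ShipmentRow and ready_to_ship are hypothetical names for this sketch, and the real work should stay in the database; the code only mirrors the logic.

```ruby
require "set"

# Plain-Ruby mirror of the anti-join. OrderRow, ShipmentRow and
# ready_to_ship are hypothetical names for this sketch.
OrderRow    = Struct.new(:id, :status)
ShipmentRow = Struct.new(:id, :order_id)

def ready_to_ship(orders, shipments)
  # Ids of orders that have at least one shipment
  shipped_ids = shipments.map(&:order_id).to_set
  # Keep authorized orders with no matching shipment row
  orders.select do |order|
    order.status == "authorized" && !shipped_ids.include?(order.id)
  end
end

orders = [OrderRow.new(1, "authorized"), OrderRow.new(2, "authorized"), OrderRow.new(3, "pending")]
shipments = [ShipmentRow.new(100, 1)]
puts ready_to_ship(orders, shipments).map(&:id).inspect
```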
This is going to work just fine if you're using Rails 6.1 or newer:
Order.where(state: 'authorized').where.missing(:shipments)
