Rails 4 grouping duplicates - ruby-on-rails

I wasn't sure how to ask this question. I am new to Ruby on Rails and am still figuring out how to piece everything together.
I have books, orders, products and users. Within a specific view, I would like to display one product_id per user — most users will have the same product_id multiple times.
Here's my book view.
<% @orders.each do |order| %>
<% order.product.books.each do |book| %>
<%= link_to book.title %>
<% end %>
<% end %>
In my book controller I have
def index
@orders = Order.select(:product_id).distinct
@books = Book.page params[:page]
end
All of this works well (the view shows each product_id once, even if there are duplicates) until I call order.created_at, which gives a missing attribute error. I know this error occurs because I am only requesting :product_id on order, but when I add :created_at to the select query (@orders = Order.select(:product_id, :created_at).distinct) I get duplicate :product_ids.
What am I missing?

Well, you have to think about what that would look like. When you are selecting rows by a distinct product_id you are discarding all other order rows that have the same product_id. But when you add created_at back into the mix your resulting record set will include the same product_id because the distinct is operating against the tuple of (product_id, created_at) instead of just the product_id.
Your original @orders isn't actually all of your orders, it's just a subset because it chose a unique one per product_id.
I find it helpful to append to_sql to the end of a query to see what the actual SQL coming out of ActiveRecord is doing:
> Order.select(:product_id).distinct.to_sql
=> SELECT DISTINCT "orders"."product_id" FROM "orders"
> Order.select(:product_id, :created_at).distinct.to_sql
=> SELECT DISTINCT "orders"."product_id", "orders"."created_at" FROM "orders"
Try running those in the database and you'll see the data ActiveRecord works from. ActiveRecord just takes whatever rows come back and builds one object per row, so a row that lacks created_at produces an object without that attribute.
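To make the tuple behaviour concrete, here is a minimal plain-Ruby sketch that works on in-memory data rather than ActiveRecord. OrderRow and the sample rows are hypothetical stand-ins for the orders table. It shows that de-duplicating on (product_id, created_at) keeps every row, and that getting one row per product_id means deciding which row wins (here, the newest).

```ruby
# Hypothetical stand-in for rows of the orders table (not an ActiveRecord model).
OrderRow = Struct.new(:id, :product_id, :created_at)

orders = [
  OrderRow.new(1, 10, Time.utc(2015, 1, 1)),
  OrderRow.new(2, 10, Time.utc(2015, 3, 1)),
  OrderRow.new(3, 20, Time.utc(2015, 2, 1)),
]

# DISTINCT over (product_id, created_at) keeps all three rows,
# because every pair is different.
distinct_pairs = orders.map { |o| [o.product_id, o.created_at] }.uniq

# One row per product_id requires a rule for which row to keep,
# for example the most recent order.
latest_per_product = orders
  .group_by(&:product_id)
  .map { |_product_id, rows| rows.max_by(&:created_at) }
```

In ActiveRecord the same decision is usually expressed with a group plus an aggregate, for example Order.group(:product_id).maximum(:created_at), which returns a hash of product_id to the latest created_at.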

Related

Rails, where filter on an already joined table

I have a table :periods that has a column :hours
I also have a table :simulations that belongs to :period (so it has period_id as a column)
I join the 2 table
@simulations = Simulation.join(:period) # Controller
Now in my view I have to loop through my simulations but filter based on the "hours" column, which exists on the period table.
So I tried things like
In my view I have to loop through simulations:
<% @simulations.where(:period => {:hour => count}).each do |sim| %>
or
<% @simulations.periods.where(:hour => count).each do |sim| %>
Again, I want to filter my simulations based on data from the period table, and neither of these approaches works. I have also tried using includes and eager_load, neither of which works.
Am I attempting something that is not possible under rails?
Also I am using postgres
Try using this query in your controller:
@simulations = Simulation.joins(:period).where(periods: { hours: count }) # All simulations that belong to periods with hours = count
Notice the 's' in 'periods' (the where clause takes the table name), and adjust the 's' in 'hours' to match your column name.
In the view, use the @simulations variable directly:
<% @simulations.each do |sim| %>
Otherwise you run the query twice: once for the join in the controller and again for the filter in the view. Also try to avoid querying a model directly in views or controllers. Instead, add a method to the model and call it to get the results of the query above.

Print what is not in a join table

I have a join table created from 2 other tables.
Say one model is called cat, another is called request, and the join table is called catrequest (this table would have cat_id and request_id)
How would I print all of the cats not intersected in the join table i.e. all of the cats NOT requested using rails. I saw some DB based answers, but I am looking for a rails solution using ruby code.
I get how to print a cat that belongs to a request i.e.:
<% @requests.each do |request| %>
<% request.cats.each do |cat| %>
<%= cat.name %>
<% end %>
<% end %>
but I don't understand how to do the reverse of this.
To get a list of cats that have never been requested you'd go with:
Cat.includes(:cat_requests).where(cat_requests: { id: nil })
# or, if `cat_requests` table does not have primary key (id):
Cat.includes(:cat_requests).where(cat_requests: { cat_id: nil })
The above assumes you have the corresponding association:
class Cat < ActiveRecord::Base
has_many :cat_requests
end
It sounds like what you need is an outer join, followed by keeping only the cats rows that have no corresponding rows in the join table. If that's the case, you might consider using Arel. It supports an outer join and can probably be used to get what you're looking for. Here is a link to a guide that has a lot of helpful information on Arel:
http://jpospisil.com/2014/06/16/the-definitive-guide-to-arel-the-sql-manager-for-ruby.html
Search the page for "The More the Merrier" section which is where joins are discussed.
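The idea behind both answers (keep the cats that have no matching row in the join table) can be sketched in plain Ruby on in-memory data. CatRow, CatRequestRow and the sample records are hypothetical stand-ins, not the asker's models. This is only an illustration of the set logic, not a replacement for doing the filtering in the database.

```ruby
require "set"

# Hypothetical stand-ins for the cats table and the join table.
CatRow = Struct.new(:id, :name)
CatRequestRow = Struct.new(:cat_id, :request_id)

cats = [CatRow.new(1, "Tom"), CatRow.new(2, "Felix"), CatRow.new(3, "Salem")]
cat_requests = [
  CatRequestRow.new(1, 100),
  CatRequestRow.new(3, 101),
  CatRequestRow.new(1, 102),
]

# Collect every cat_id that appears in the join table...
requested_ids = cat_requests.map(&:cat_id).to_set

# ...and keep only the cats whose id never appears there.
unrequested = cats.reject { |cat| requested_ids.include?(cat.id) }
```

An outer join with a "join-table id IS NULL" condition produces the same result on the database side, which is what the includes/where query above asks ActiveRecord to build.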

How to sort through an associated attribute on two levels at once?

This is a simple ruby question I believe. In my app, I have Product model that has_many Reviews. Each Review has an attribute of an "overall" rating which is an integer.
What I want to do is display the top ten Products based on the average of their overall ratings. I've already gotten this to work, BUT, I also want to sort Products that have the SAME overall rating by a secondary aggregate attribute, which would be how MANY reviews that Product has. Right now, if I have 3 products with the same average overall rating, they seem to be displayed in random order.
So far my code is:
Controller
@best = Product.has_reviews.get_best_products(10)
Product Model
scope :has_reviews, joins{reviews.outer}.where{reviews.id != nil}
def self.get_best_products(number)
sorted = self.uniq
sorted = sorted.sort { |x, y| y.reviews.average("overall").to_f <=> x.reviews.average("overall").to_f }
sorted.first(number)
end
I've tried this for my model code:
def self.get_best_products(number)
sorted = self.uniq.sort! { |x, y| x.reviews.count.to_f <=> y.reviews.count.to_f }
sorted = sorted.sort { |x, y| y.reviews.average("overall").to_f <=> x.reviews.average("overall").to_f }
sorted.first(number)
end
...but it does not do what I want it to do. I am just iterating through the @best array using each in my view.
---UPDATE
OK now I am trying this:
Controller:
@best = Product.get_best_products(6)
Model:
def self.get_best_products(number)
self.joins{reviews}.order{'AVG(reviews.overall), COUNT(reviews)'}.limit(number)
end
But I am getting this error:
PGError: ERROR: column "products.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "products".* FROM "products" INNER JOIN "reviews" ...
I am using the Squeel gem btw to avoid having direct SQL code in the model.
----UPDATE 2
Now I added the 'group' part to my method but I am still getting an error:
def self.get_best_products(number)
self.joins{reviews}.group('product.id').order{'AVG(reviews.overall), COUNT(reviews)'}.limit(number)
end
I get this error:
PGError: ERROR: missing FROM-clause entry for table "product"
LINE 1: ...eviews"."product_id" = "products"."id" GROUP BY product.i...
product.rb
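scope :best_products, (lambda do |number|
# GROUP BY products.id lets PostgreSQL aggregate the joined reviews per product
joins(:reviews).group('products.id').order('AVG(reviews.overall) DESC, COUNT(reviews.id) DESC').limit(number)
end)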
products_controller.rb
Product.best_products(10)
This makes sure everything happens in the database, so you won't get records you don't need.
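For checking what ordering the query is meant to produce, here is a small plain-Ruby sketch of the same two-level sort on in-memory data. ProductRow and the sample ratings are hypothetical. It also shows why sorting twice in a row, as in the question, does not work: Ruby's sort is not guaranteed to be stable, so the second sort can discard the first ordering. Sorting once by a composite key avoids that.

```ruby
# Hypothetical stand-in for a product with its review ratings loaded.
ProductRow = Struct.new(:name, :ratings) do
  def average
    ratings.sum.to_f / ratings.size
  end
end

products = [
  ProductRow.new("A", [4, 4]),
  ProductRow.new("B", [5, 3, 4]),
  ProductRow.new("C", [5, 5]),
  ProductRow.new("D", [4]),
]

# Sort once by a composite key: average rating descending,
# then number of reviews descending as the tie-breaker.
best = products.sort_by { |p| [-p.average, -p.ratings.size] }.first(3)
```

With this data C has the highest average, and A, B and D tie on average, so the review count decides their order.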
If I got it right here is my idea of how I would do it:
As a product has many reviews and each review has an overall attribute, I would add a reviews_counter column to the products table that increments with each added review. This gains a little database performance, because you don't have to count every product's reviews to find the most reviewed one.
Now you'll get the products ordered by reviews_counter:
@best_products = Product.order("reviews_counter desc")
and next you'll get the reviews for each product ordered by overall:
<% for prod in @best_products %>
<%= prod.reviews.order("overall desc") %> # can do all this or more in helper
<% end %>
Also, ordering this way, if you have 3 reviews with the same overall you can add one more order() statement and sort by name or id or whatever you like, so they don't display in random order.
This is just my idea of how I would do it, I worked recently on an app that required something similar and we just added a counter_field to our model, it's not illegal to do so :)
P.S. It's not very clear to me how many records you want to display for each, so you'll just need to add .limit(5), for example, to get only the first 5 reviews of a product.

Activerecord optimization - best way to query all at once?

I am trying to reduce the number of queries using ActiveRecord 3.0.9. I generated about 200K 'dummy' customers and 500K orders.
Here's Models:
class Customer < ActiveRecord::Base
has_many :orders
end
class Order < ActiveRecord::Base
belongs_to :customer
has_many :products
end
class Product < ActiveRecord::Base
belongs_to :order
end
When I use this code in the controller:
@customers = Customer.where(:active => true).paginate(:page => params[:page], :per_page => 100)
# SELECT * FROM customers ...
and use this in the view (I removed the HAML markup to make it easier to read):
@order = @customers.each do |customer|
customer.orders.each do |order| # SELECT * FROM orders ...
%td= order.products.count # SELECT COUNT(*) FROM products ...
%td= order.products.sum(:amount) # SELECT SUM(*) FROM products ...
end
end
The page renders a table with 100 rows per page. The problem is that it is slow to load, because it fires about 3-5 queries per customer's orders. That's about 300 queries to load the page.
Is there an alternative way to reduce the number of queries and load the page faster?
Notes:
1) I have attempted to use includes(:orders), but it included more than 200,000 order_ids, which is a problem.
2) The tables are already indexed.
If you're only using COUNT and SUM(amount) then what you really need is to retrieve only that information and not the orders themselves. This is easily done with SQL:
SELECT orders.customer_id, orders.id AS order_id, COUNT(products.id) AS order_count, SUM(products.amount) AS order_total FROM orders LEFT JOIN products ON orders.id = products.order_id GROUP BY orders.customer_id, orders.id
You can wrap this in a method that returns a nice, orderly hash by re-mapping the SQL results into a structure that fits your requirements:
class Order < ActiveRecord::Base
def self.totals
query = "..." # Query from above
result = { }
self.connection.select_rows(query).each do |row|
# Build out an array for each unique customer_id in the results
customer_set = result[row[0].to_i] ||= [ ]
# Add a hash representing each order to this customer order set
customer_set << { :order_id => row[1].to_i, :count => row[2].to_i, :total => row[3].to_i }
end
result
end
end
This means you can fetch all order counts and totals in a single pass. If you have an index on customer_id, which is imperative in this case, then the query will usually be really fast even for large numbers of rows.
You can save the results of this method into a variable such as @order_totals and reference it when rendering your table:
- @order = @customers.each do |customer|
- (@order_totals[customer.id] || []).each do |order|
%td= order[:count]
%td= order[:total]
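The remapping step can be tried on its own, without a database. The sketch below is plain Ruby and assumes the rows come back as arrays of strings in the column order of the query above (customer_id, order_id, order_count, order_total). The sample rows are made up, and the actual value types depend on the database adapter.

```ruby
# Hypothetical result rows: [customer_id, order_id, order_count, order_total].
rows = [
  ["1", "10", "2", "30"],
  ["1", "11", "1", "5"],
  ["2", "12", "3", "45"],
]

# Each customer_id maps to an array of per-order summaries.
totals = Hash.new { |hash, key| hash[key] = [] }
rows.each do |customer_id, order_id, count, total|
  totals[customer_id.to_i] << {
    order_id: order_id.to_i,
    count: count.to_i,
    total: total.to_i,
  }
end

# fetch with a default returns [] for customers that have no orders
# and does not add an empty entry to the hash.
orders_for_missing_customer = totals.fetch(3, [])
```

Reading with fetch and a default keeps the view loop safe for customers that do not appear in the query result.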
You can try something like this (yes, it looks ugly, but you want performance):
orders = Order.find_by_sql([<<-EOD, customer.id])
SELECT os.id, os.name, COUNT(ps.amount) AS count, SUM(ps.amount) AS amount
FROM orders os
JOIN products ps ON ps.order_id = os.id
WHERE os.customer_id = ? GROUP BY os.id, os.name
EOD
%td= orders.name
%td= orders.count
%td= orders.amount
Added: Probably it is better to create count and amount cache in Orders, but you will have to maintain it (count can be counter-cache, but I doubt there is a ready recipe for amount).
You can join the tables in with Arel (I prefer to avoid writing raw sql when possible). I believe that for your example you would do something like:
Customer.joins(:orders => :products).select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount")
The first method--
Customer.joins(:orders => :products)
--pulls in the nested association in one statement. Then the second part--
.select("customers.id, customers.name, count(products.id) as count, sum(products.amount) as total_amount")
--specifies exactly what columns you want back.
Chain those and I believe you'll get a list of Customer instances populated only with what you've specified in the select method. Because the select mixes plain columns with aggregates, PostgreSQL will also require a matching .group("customers.id, customers.name"). You have to be careful, though, because you now have read-only objects in hand that are possibly in an invalid state.
As with all the Arel methods what you get from those methods is an ActiveRecord::Relation instance. It's only when you start to access that data that it goes out and executes the SQL.
I have some basic nervousness that my syntax is incorrect but I'm confident that this can be done w/o relying on executing raw SQL.

ActiveRecord and SELECT AS SQL statements

I am developing in Rails an app where I would like to rank a list of users based on their current points. The table looks like this: user_id:string, points:integer.
Since I can't figure out how to do this "The Rails Way", I've written the following SQL code:
self.find_by_sql ['SELECT t1.user_id, t1.points, COUNT(t2.points) as user_rank FROM registrations as t1, registrations as t2 WHERE t1.points <= t2.points OR (t1.points = t2.points AND t1.user_id = t2.user_id) GROUP BY t1.user_id, t1.points ORDER BY t1.points DESC, t1.user_id DESC']
The thing is this: the only way to access the aliased column "user_rank" is by doing ranking[0].user_rank, which gives me lots of headaches if I want to easily display the resulting table.
Is there a better option?
how about:
@ranked_users = User.all :order => 'users.points DESC'
then in your view you can say
<% @ranked_users.each_with_index do |user, index| %>
<%= "User ##{index}, #{user.name} with #{user.points} points" %>
<% end %>
if for some reason you need to keep that numeric index in the database, you'll need to add an after_save callback to update the full list of users whenever the # of points anyone has changes. You might look into using the acts_as_list plugin to help out with that, or that might be total overkill.
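If ties should share a rank, a position from each_with_index is not enough. Here is a minimal plain-Ruby sketch on in-memory data. UserRow and the sample users are hypothetical. It uses the rule "rank = number of users with strictly more points, plus one", which is one common convention and an assumption here, not necessarily the rule the asker's SQL implements.

```ruby
# Hypothetical stand-in for a user record with a points column.
UserRow = Struct.new(:name, :points)

users = [
  UserRow.new("ann", 50),
  UserRow.new("bob", 80),
  UserRow.new("cat", 50),
  UserRow.new("dan", 20),
]

# Order by points descending, then by name so ties print deterministically.
ranked = users.sort_by { |u| [-u.points, u.name] }.map do |user|
  # Users with equal points get the same rank; the next rank is skipped.
  rank = users.count { |other| other.points > user.points } + 1
  [rank, user.name, user.points]
end
```

This compares every user with every other user, so it is only suitable for small lists. For large tables the ranking is better computed in the database.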
Try adding user_rank to your model.
class User < ActiveRecord::Base
def rank
#determine rank based on self.points (switch statement returning a rank name?)
end
end
Then you can access it with @user.rank.
What if you did:
SELECT t1.user_id, COUNT(t1.points)
FROM registrations t1
GROUP BY t1.user_id
ORDER BY COUNT(t1.points) DESC
If you want to get all rails-y, then do
cool_users = self.find_by_sql ['(sql above)']
cool_users.each do |cool_user|
puts "#{cool_user[0]} scores #{cool_user[1]}"
end
