Sorting users by score - rails - ruby-on-rails

In my Rails app, each user has a karma/score that I'm calculating through the user model as follows:
after_invitation_accepted :increment_score_of_inviter
def increment_score_of_inviter
invitation_by.increment!(:score, 10)
end
def comment_vote_count
Vote.find(comment_ids).count * 2
end
def calculated_vote_count
base_score + comment_vote_count
end
def recalculate_score!
update_attribute(:score, calculated_vote_count)
end
I'm trying to create a paginated list of all the users, sorted by their scores. With thousands of users, how do I do this efficiently?
I was thinking of using:
User.all.sort_by(&:calculated_vote_count)
But, this would be pretty heavy.

Well...using User.all upon a table full of records will be a memory hog for your application. Instead you should try to accomplish what you want on the DB layer.
At this point I'm assuming base_score is one of the table columns (is base_score same as score?), so you'd have to do something like the following (using LEFT JOIN):
User.select("users.*, (COUNT(votes.id) * 2 + users.base_score) AS calculated_vote_count").joins("LEFT JOIN votes ON votes.user_id = users.id").group("users.id").order("calculated_vote_count DESC")
And then you can paginate the results the way you like.
I didn't test it, but it should work. Let me know if it doesn't.

It's pretty straightforward:
User.order('score DESC').all
Obviously you'd have pagination, User.order('score DESC').page(params[:page]).per(20) with Kaminari.
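As a rough illustration of what page/per boil down to, here is a plain-Ruby sketch of the limit/offset arithmetic. The page_window and top_scores_page helpers are made up for this example and are not part of Rails or Kaminari; in the real app the database does this work through ORDER BY with LIMIT and OFFSET.

```ruby
# Hypothetical helper (not part of Rails or Kaminari): translate a page
# number into the LIMIT/OFFSET pair that a paginated query would use.
def page_window(page, per_page)
  page = [page.to_i, 1].max # treat nil, 0 and negatives as page 1
  { limit: per_page, offset: (page - 1) * per_page }
end

# In-memory stand-in for "ORDER BY score DESC LIMIT ? OFFSET ?".
def top_scores_page(scores, page, per_page)
  window = page_window(page, per_page)
  sorted = scores.sort.reverse
  sorted[window[:offset], window[:limit]] || []
end
```

The point of pushing this into the database is that only one page of rows is ever loaded, instead of every user.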

Related

Does splitting up an active record query over 2 methods hit the database twice?

I have a database query where I want to get an array of Users that are distinct for the set:
# @range is a predefined date range
# @shift_list is a list of filtered shifts
def listing
Shift
.where(date: @range, shiftname: @shift_list)
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
and I read somewhere that for readability, or isolating for testing, or code reuse, you could split this into separate methods:
def listing
shift_list
.select(:user_id)
.distinct
.map { |id| User.find( id.user_id ) }
.sort
end
def shift_list
Shift
.where(date: @range, shiftname: @shift_list)
end
So I rewrote this and some other code, and now the page takes 4 times as long to load.
My question is, does this type of method splitting cause the database to be hit twice? Or is it something that I did elsewhere?
And I'd love a suggestion to improve the efficiency of this code.
Further to the need to remove mapping from the code, this shift list is being created with the following code:
def _month_shift_list
Shift
.select(:shiftname)
.distinct
.where(date: @range)
.map {|x| x.shiftname }
end
My intention is to create an array of shiftnames as strings.
I am obviously missing some key understanding in database access, as this method is clearly creating part of the problem.
And I think I have found the solution to this with the following:
def month_shift_list
Shift
.where(date: @range)
.pluck(:shiftname)
.uniq
end
Nope, the database will not be hit twice. The queries in both methods are lazy loaded. The issue you have with the slow page load times is because the map function now has to do multiple finds which translates to multiple SELECT from the DB. You can re-write your query to this:
def listing
User.
joins(:shift).
merge(Shift.where(date: @range, shiftname: @shift_list)).
uniq.
sort
end
This has just one hit to the DB and will be much faster and should produce the same result as above.
The assumption here is that there is a has_one/has_many relationship on the User model for Shifts
class User < ActiveRecord::Base
has_one :shift
end
If you don't want to establish the has_one/has_many relationship on User, you can re-write it to:
def listing
User.
joins("INNER JOIN shifts on shifts.user_id = users.id").
merge(Shift.where(date: @range, shiftname: @shift_list)).
uniq.
sort
end
ALTERNATIVE:
You can use 2 queries if you experience issues with using ActiveRecord#merge.
def listing
user_ids = Shift.where(date: @range, shiftname: @shift_list).uniq.pluck(:user_id).sort
User.find(user_ids)
end
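To make the difference concrete, here is a small plain-Ruby sketch that counts lookups. FakeUserTable is a made-up stand-in for the users table (it is not ActiveRecord), and its find / find_all methods only mimic the shape of User.find(id) versus a single query for many ids.

```ruby
# Hypothetical stand-in for a users table that counts how many
# "queries" are issued against it. Not ActiveRecord.
class FakeUserTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # { id => name }
    @query_count = 0
  end

  # One lookup per call, like User.find(id).
  def find(id)
    @query_count += 1
    @rows.fetch(id)
  end

  # One lookup for many ids, like a single WHERE id IN (...) query.
  def find_all(ids)
    @query_count += 1
    @rows.values_at(*ids)
  end
end

# N+1 style: one lookup per distinct id, as the map { User.find } version does.
def lookup_one_by_one(table, user_ids)
  user_ids.uniq.map { |id| table.find(id) }.sort
end

# Batched style: a single lookup for all distinct ids.
def lookup_batched(table, user_ids)
  table.find_all(user_ids.uniq).sort
end
```

Both helpers return the same users; only the number of round trips differs, and that is what shows up as page load time.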

Ruby on Rails inefficient database query

My Rails app has the following conditions:
Each Style has many Bookings
Each Booking has a single warehouse value, and a single netbooked value
I need to update the warehouse_netbooked column of every Style with a hash containing the total netbooked sum for each warehouse across all of the style's bookings.
My current code works, but is way too slow (each iteration is taking ~0.5s, and there are thousands of styles):
def assign_warehouse_bookings
warehouses = ["WH1","WH2","WH3"]
Style.all.each do |s|
style_warehouse_bookings = Hash.new
warehouses.each do |wh|
total_netbooked = s.bookings.where(warehouse: wh).sum(:netbooked)
style_warehouse_bookings[wh] = total_netbooked
end
s.update(warehouse_netbooked: "#{style_warehouse_bookings}")
end
end
Here is a small change to your code that avoids running many queries, following @eric-duminil's advice:
def assign_warehouse_bookings
warehouses = ["WH1","WH2","WH3"]
@styles = Style.includes(:bookings)
@styles.each do |s|
style_warehouse_bookings = Hash.new
warehouses.each do |wh|
total_netbooked = s.bookings.map {|book| book.warehouse.eql?(wh) ? book.netbooked : 0}.sum
style_warehouse_bookings[wh] = total_netbooked
end
s.update(warehouse_netbooked: "#{style_warehouse_bookings}")
end
end
I hope it helps.
In your case, for every single style you run three queries, one per warehouse, to sum netbooked. This is highly inefficient.
One good rule is: first fetch the required data from the database, then use Ruby to process it. It's important to fetch only the data you need. So, in your case, you can fetch the styles and all related bookings, then iterate through each collection of bookings and build the style_warehouse_bookings hash.
Use find_each instead of each.
Use includes or preload to preload data.
Here is a simple example that should improve performance:
warehouses = ["WH1","WH2","WH3"]
# preload bookings with style, `preload` used explicitly instead of `includes` to prevent cross join queries.
styles = Style.joins(:bookings).where('bookings.warehouse' => warehouses).preload(:bookings)
# find_each fetches data in batches of 1000 records
styles.find_each do |s|
style_warehouse_bookings = Hash.new
warehouses.each do |wh|
# select and sum methods of ruby are used instead of where and sum of active-record
total_netbooked = s.bookings.select{ |booking| booking.warehouse == wh }.sum(&:netbooked)
style_warehouse_bookings[wh] = total_netbooked
end
s.update(warehouse_netbooked: "#{style_warehouse_bookings}")
end
Read in depth about preload, includes and joins at eager loading associations documentation. Apart from that I wrote an article on when to use preload, includes and joins here which can help.
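Once the bookings are in memory, the per-warehouse totals can also be built in a single pass instead of re-scanning the bookings once per warehouse. Below is a minimal plain-Ruby sketch of that idea; the Booking struct and the netbooked_by_style helper are hypothetical stand-ins for illustration, not the app's real models.

```ruby
# Hypothetical stand-in for a Booking row; only the fields used here.
Booking = Struct.new(:style_id, :warehouse, :netbooked)

# Build { style_id => { warehouse => total } } in one pass over the rows.
# Every style that has a matching booking starts with 0 for each warehouse.
def netbooked_by_style(bookings, warehouses)
  totals = Hash.new do |hash, style_id|
    hash[style_id] = warehouses.to_h { |wh| [wh, 0] }
  end
  bookings.each do |b|
    next unless warehouses.include?(b.warehouse) # ignore other warehouses
    totals[b.style_id][b.warehouse] += b.netbooked
  end
  totals
end
```

Each inner hash is what would end up in a style's warehouse_netbooked column.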
I think you want to do a batch update. If so, check the following question: Is there anything like batch update in Rails?
You can also wrap the updates in a transaction to avoid too many commits.
For example:
def assign_warehouse_bookings
Style.transaction do
# your remaining code goes here
end
end

Rails sort users by method

I'm trying to rank my users in order of an integer. The integer I'm getting is in my User model.
def rating_number
Impression.where("impressionable_id = ?", self).count
end
This gives each User on the site a number (in integer form). Now, on the homepage, I want to show an ordered list of these users, with the highest number first and the lowest last. How can I accomplish this in the controller?
@users = User....???
Any help would be appreciated!
UPDATE
Using this in the controller
@users = User.all.map(&:rating_number)
and this for the view
<% @users.each do |user| %>
<li><%= user %></li>
<% end %>
shows each user's count. Unfortunately, the variable user is the integer, not the user, so calling user.name doesn't work. Also, the list isn't ordered by the integer.
The advice here is still all kinds of wrong; all other answers will perform terribly. Trying to do this via a nested select count(*) is almost as bad an idea as using User.all and sorting in memory.
The correct way to do this if you want it to work on a reasonably large data set is to use counter caches and stop trying to order by the count of a related record.
Add a rating_number column to the users table, and make sure it has an index defined on it
Add a counter cache to your belongs_to:
class Impression < ActiveRecord::Base
belongs_to :user, counter_cache: :rating_number
end
Now creating/deleting impressions will modify the associated user's rating_number.
Order your results by rating_number, dead simple:
User.order(rating_number: :desc)
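For anyone unsure what a counter cache buys you, here is a plain-Ruby sketch of the idea: the count is stored on the user and adjusted whenever an impression is created or destroyed, so ranking only reads a stored number. SketchUser, SketchImpression and ranked are made-up names for illustration; in Rails the counter_cache option does the bookkeeping for you.

```ruby
# Hypothetical in-memory models; no ActiveRecord involved.
class SketchUser
  attr_reader :name
  attr_accessor :rating_number

  def initialize(name)
    @name = name
    @rating_number = 0 # the cached count lives on the user
  end
end

class SketchImpression
  attr_reader :user

  # Creating an impression bumps the cached counter ...
  def initialize(user)
    @user = user
    user.rating_number += 1
  end

  # ... and destroying one decrements it.
  def destroy
    user.rating_number -= 1
  end
end

# Ranking reads the stored number; nothing is counted at read time.
def ranked(users)
  users.sort_by { |u| -u.rating_number }
end
```

With the real column indexed, the database can do the equivalent ordering without touching the impressions table at all.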
The advice here is just all kinds of wrong. First of all, model your associations correctly. Secondly, you don't ever want to do User.all and then sort it in memory based on anything. How do you think it will perform with lots of records?
What you want to do is query your user rows and sort them based on a subquery that counts impressions for that user.
User.order("(SELECT COUNT(impressions.id) FROM impressions WHERE impressionable_id = users.id) DESC")
While this is not terribly efficient, it is still much more efficient than operating with data sets in memory. The next step is to cache the impressions count on the user itself (a la counter cache), and then use that for sorting.
It just pains me that doing User.all is the first suggestion...
If impressions is a column in your users table, you can do
User.order('impressions desc')
Edit
Since it's not a column in your users table, you can do this:
User.all.each(&:rating_number).sort {|x,y| y <=> x }
Edit
Sorry, you want this:
User.all.sort { |x, y| y.rating_number <=> x.rating_number }

Paginate data from two models into one newsfeed: Ruby on Rails 3 // Will_paginate

I'd like to make a newsfeed for the homepage of a site I'm playing around with. There are two models: Articles, and Posts. If I wanted just one in the newsfeed it would be easy:
@newsfeed_items = Article.paginate(:page => params[:page])
But I would like for the two to be both paginated into the same feed, in reverse chronological order. The default scope for the article and post model are already in that order.
How do I get the articles and posts to be combined in to the newsfeed as such?
Thanks!
EDIT: What about using SQL in the users model?
Just wondering: maybe would it be possible define in User.rb:
def feed
#some sql like (SELECT * FROM articles....)
end
Would this work at all?
In my last project I ran into a similar problem: I had to paginate multiple models with a single pagination in my search functionality. It should work so that results from the first model appear first; when those run out, results from the second model continue, then the third, and so on, as one single search feed, just like Facebook feeds. This is the function I created to do this:
def multi_paginate(models, page, per_page)
WillPaginate::Collection.create(page, per_page) do |pager|
# count each model once and set total entries
counts = models.map(&:count)
pager.total_entries = counts.sum
result = []
skipped = 0 # rows that belong to earlier models
models.each_with_index do |model, i|
remaining = pager.per_page - result.length
break if remaining <= 0
# offset within this model, clamped to 0..count
offset = [[pager.offset - skipped, 0].max, counts[i]].min
result += model.limit(remaining).offset(offset).to_a
skipped += counts[i]
end
pager.replace(result)
end
end
Try it and let me know if you have any problem with it. I also posted it as an issue on the will_paginate repository; if people confirm that it works correctly, I'll fork and commit it to the library: https://github.com/mislav/will_paginate/issues/351
For those interested, please check this question: Creating a "feed" from multiple rails models, efficiently?
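The offset arithmetic is the fiddly part, so here is the same idea on plain arrays, which is easier to check by hand. This is only a sketch: multi_page is a hypothetical helper, and the arrays stand in for the models. It keeps a running total of the rows in earlier sources and subtracts that from the global offset.

```ruby
# Plain-array sketch of paginating several sources as one feed.
# `sources` is an array of arrays standing in for the models.
def multi_page(sources, page, per_page)
  global_offset = (page - 1) * per_page
  skipped = 0 # rows contributed by earlier sources
  result = []
  sources.each do |rows|
    remaining = per_page - result.length
    break if remaining <= 0
    # Offset inside this source, clamped to 0..rows.length.
    local_offset = (global_offset - skipped).clamp(0, rows.length)
    result.concat(rows[local_offset, remaining])
    skipped += rows.length
  end
  result
end
```

For example, with sources of 5, 20 and 20 rows and 10 per page, page 4 starts at global offset 30, which is row 5 of the third source.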
Here, Victor Piousbox provides a good, efficient solution.
Look at the paginate_by_sql method. You can write a union query to fetch both articles and posts:
select 'article' as type, id from articles
union
select 'post' as type, id from posts
You can paginate both if you use AJAX. Here is well explained how to paginate using AJAX with WillPaginate.
You can paginate an array using WillPaginate::Collection.create. So you'd need to use ActiveRecord to find both sets of data and then combine them in a single array.
Then take a look at https://github.com/mislav/will_paginate/blob/master/lib/will_paginate/collection.rb for documentation on how to use the Collection to paginate any array.
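A minimal sketch of the "combine them in a single array" step, in plain Ruby. FeedItem and newsfeed_page are hypothetical names for illustration; in the app, the records would come from ActiveRecord and the resulting slice would be handed to WillPaginate::Collection. Note that this loads both full sets into memory, so it only suits small tables.

```ruby
# Hypothetical feed item; `kind` records which model it came from.
FeedItem = Struct.new(:kind, :title, :created_at)

# Combine both lists, newest first, then cut out one page (1-based).
def newsfeed_page(articles, posts, page, per_page)
  combined = (articles + posts).sort_by(&:created_at).reverse
  combined.each_slice(per_page).to_a[page - 1] || []
end
```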

Help converting Rails 2 Database logic to Rails 3.1/ PostgreSQL

How do I select a single random record for each user, but order the array by the latest record per user?
If Foo uploads a new painting, I would like to select a single random record from foo. This way a user that uploads 10 paintings won't monopolize all the space on the front page, but still get a slot on the top of the page.
This is how I did it with Rails 2.x running on MySQL.
@paintings = Painting.all.reverse
first_paintings = []
@paintings.group_by(&:user_id).each do |user_id, paintings|
first_paintings << paintings[rand(paintings.size-1)]
end
@paintings = (first_paintings + (Painting.all - first_paintings).reverse).paginate(:per_page => 9, :page => params[:page])
The example above generates a lot of SQL queries and is probably badly optimized. How would you pull this off with Rails 3.1 running on PostgreSQL? I have 7000 records.
@paintings = Painting.all.reverse is equivalent to @paintings = Painting.order("id desc")
If you really want to reverse the order of the paintings result set, I would set up a scope and just use that.
Something like
class Painting < ActiveRecord::Base
scope :reversed, order("id desc")
end
Then you can use Painting.reversed anywhere you need it
You have presumably set up a belongs_to association in your Painting model, so I would do:
# painting.rb
default_scope order('id DESC')
# paintings_controller.rb
first_paintings = User.includes(:paintings).collect do |user|
user.paintings.sample
end.compact
@paintings = (first_paintings + Painting.where('id NOT IN (?)', first_paintings.map(&:id))).paginate(:per_page => 9, :page => params[:page])
I think this solution results in the fewest SQL queries, and is very readable. Not tested, but I hope you got the idea.
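Here is the ordering logic on its own as a plain-Ruby sketch, in case it helps to see it without the queries. PaintingRow and front_page_order are hypothetical names, and a higher id is assumed to mean a newer painting. One random painting per user comes first (users ordered by their newest painting), then everything else, newest first.

```ruby
# Hypothetical stand-in for a Painting row.
PaintingRow = Struct.new(:id, :user_id)

# One randomly chosen painting per user first, then the remaining
# paintings, newest first. Assumes a higher id means a newer painting.
def front_page_order(paintings, random: Random.new)
  newest_first = paintings.sort_by { |p| -p.id }
  # group_by keeps first-seen order, so users are ordered by newest painting.
  picks = newest_first.group_by(&:user_id).values.map do |group|
    group.sample(random: random)
  end
  picks + (newest_first - picks)
end
```

Passing a seeded Random makes the pick repeatable, which is handy in tests.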
You could use the dynamic finders:
Painting.order("id desc").find_by_user_id!(user.id)
This is assuming your Paintings table contains a user_id column or some other way to associate users to paintings which it appears you have covered since you're calling user_id in your initial code. This isn't random but using find_all_by_user_id would allow you to call .reverse on the array if you still wanted and find a random painting.
