grouping comments by parent object, ordering parent object by oldest comment - ruby-on-rails

I have objects that have comments. As part of a periodic email summary, I want to determine comments for a period, and present the objects in the order of oldest commented object first.
Data:
object_id comment_id
30 40
40 42
32 41
30 43
32 44
Output:
Object #30
comment 40
comment 43
Object #40
comment 42
Object #32
comment 41
comment 44
I am using this code to get the data to an intermediate array - I tried to get it all in one swoop using .group_by(&:commentable_id) but the data didn't come out in correct order.
comments = account.comments.all(
:conditions => ["comments.created_at > ?", 8.hours.ago],
:order => "comments.created_at asc" ).map { |c| [c.commentable_id,c.id] }
=> [ [30,40], [40,42], [32,41], [30,43], [32,44] ]
If I can get that data to transform into the following form, I could just iterate over the array to build the email content...
[ [30,[40,43]], [40,[42]], [32,[41,44]] ]
But I wonder if I'm making this harder than I need to... Any advice?
(I'm using Rails 2.3 and Ruby ree-1.8.7)

You can use a group with an array aggregate to get to the array form that you're looking for.
Array aggregates are massively db dependent. MySQL's is GROUP_CONCAT. Postgres' is ARRAY_AGG. Sqlite doesn't have one out of the box, but I know you can define custom aggregate functions, so it's not impossible.
Haven't actually tried running this code, but here's something that should point you in the right direction:
result = Object.all(
  :select => 'objects.id, GROUP_CONCAT(comments.id) AS comment_array',
  :joins => :comments,
  :group => 'objects.id'
).map { |c| [c.id, c.comment_array] }
I used the naming from the first example, so you'll need to change 'objects' to whatever your table is called. Hope it makes sense. GROUP_CONCAT returns a comma-separated string, and Rails does not parse it into an array for you, so comment_array will come back as a string such as "40,43" that you have to split yourself.
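As a minimal sketch of that parsing step, assuming the aggregate comes back as a plain comma-separated string (the helper name parse_id_list and the sample rows are made up for illustration):

```ruby
# Hypothetical helper: turn a GROUP_CONCAT-style string such as "40,43"
# into an array of integer ids. nil or "" yields an empty array.
def parse_id_list(csv)
  csv.to_s.split(',').map { |piece| piece.strip.to_i }
end

# Sample rows shaped like [parent id, comma-separated comment ids].
rows = [[30, "40,43"], [40, "42"], [32, "41,44"]]
parsed = rows.map { |parent_id, csv| [parent_id, parse_id_list(csv)] }
# parsed => [[30, [40, 43]], [40, [42]], [32, [41, 44]]]
```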

Having all the comments for a single object in a single block/element will definitely make life easier while doing any operation on them. However, I won't go as far as turning them into an array of array of arrays because it is already an array of arrays. I would prefer creating a hash like so:
comments_array = [ [30,40], [32,41], [40,42], [30,43], [32,44] ]
obj_with_comments = {}
comments_array.each do |x|
obj_with_comments[x.first] ||= []
obj_with_comments[x.first] << x.last
end
obj_with_comments #=> {40=>[42], 30=>[40, 43], 32=>[41, 44]}
But this presents another problem: in Ruby 1.8 hashes are not ordered, so you lose your ordering if you just iterate over the hash (Ruby 1.9 and later preserve insertion order). However, you can build an array of the object ids first and then look each one up in the hash, like so:
objects = comments_array.collect{|x| x.first}.uniq
objects #=> [30, 32, 40]
# now get the hash value for each object key in order
objects.each { |obj| puts obj_with_comments[obj].inspect }
Hope that makes sense.
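The two steps above (collect the comment ids per object, remember the order in which objects first appear) can be folded into one pass. This is only a sketch; the method name group_in_first_seen_order is made up, and it assumes the pairs are already sorted by created_at as in the question, so "first seen" means "oldest comment":

```ruby
# Groups [parent_id, child_id] pairs by parent, keeping parents in the
# order in which they first appear. Does not rely on ordered hashes,
# so it also behaves the same on Ruby 1.8.7.
def group_in_first_seen_order(pairs)
  order = []    # parent ids in first-seen order
  grouped = {}  # parent id => array of child ids
  pairs.each do |parent_id, child_id|
    unless grouped.key?(parent_id)
      grouped[parent_id] = []
      order << parent_id
    end
    grouped[parent_id] << child_id
  end
  order.map { |parent_id| [parent_id, grouped[parent_id]] }
end

pairs = [[30, 40], [40, 42], [32, 41], [30, 43], [32, 44]]
result = group_in_first_seen_order(pairs)
# result => [[30, [40, 43]], [40, [42]], [32, [41, 44]]]
```

The result is exactly the array-of-pairs shape the question asks for, so the email template can iterate over it directly.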

Try this:
comments = account.comments.all(
:conditions => ["comments.created_at > ?", 8.hours.ago],
:order => "comments.commentable_id ASC, comments.id ASC"
).map { |c| [c.commentable_id,c.id] }
This will return the following result set:
=> [ [30,40], [30,43], [32,41], [32,44], [40,42] ]
In the query above I am sorting by id instead of created_at. If you are using MySQL and the ids are auto-generated, this works because the id of a newer comment is always higher than the id of an older one. The logic holds as long as you don't allow editing of comments.
If you want to explicitly sort by the dates then use the following syntax:
comments = account.comments.all(
  :joins => "INNER JOIN accounts ON comments.commentable_type = 'Account' AND
    comments.commentable_id = accounts.id",
  :conditions => ["comments.created_at > ?", 8.hours.ago],
  :order => "accounts.id ASC, comments.created_at ASC"
).map { |c| [c.commentable_id, c.id] }

Related

Get sorted most common objects from an array of hashmaps in Ruby on Rails

I'm looking to get the sorted most common results from an array containing hashmaps. The hashmap data is non-numerical so:
line_value = {'date' => date, 'name' => name, 'url' => url }
where I can grab the most common urls. I considered using SQL to grab the counts, sort them and be done with it, but I think there is probably a faster way to do it in straight ruby since the array and hashmaps are not in a database and would need to be put there to begin with.
So I'm looking for non-SQL methods to do this. Note, I'm not just looking for the most common result (singular) but the top 5 or 10 common results.
How about
most_common_urls = line_value['url'].sort[0..9]
Change
[0..9]
to whatever number you need.
The first thing to do is to build up a count of the unique urls in your array. I much prefer each_with_object to inject for this (you don't have to return the hash at each step):
url_count = items.each_with_object(Hash.new(0)) do |item, count|
count[item['url']] += 1
end
# => {'example.com' => 1, 'facebook.com' => 4, 'twitter.com' => 2, ...}
Then you want to turn this into an array of the keys, sorted by their counts. Array#sort_by will do quite nicely, but it sorts in ascending order. You could take the last N items and reverse them:
top_urls = url_count.keys.sort_by!{|url| url_count[url]}.last(5).reverse!
or you could negate the count so that the highest numbers are sorted to the front:
top_urls = url_count.keys.sort_by!{|url| -url_count[url]}.first(5)
urls.map {|u| u["url"]}.inject(Hash.new(0)) {|k,v| k[v] += 1; k}.sort_by {|k,v| v}.last(5).reverse
Or:
urls.group_by {|k|{ :u => k["url"], :q => 0}}.map {|k,v| k[:q] = v.count; k}.sort_by {|k| k[:q]}.last(5).reverse
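Putting the counting and sorting steps together, a self-contained sketch might look like the following. The method name top_urls and the sample data are made up; ties are broken alphabetically only so that the output is deterministic:

```ruby
# Returns the n most frequent urls as [url, count] pairs, most common first.
def top_urls(items, n)
  counts = items.each_with_object(Hash.new(0)) do |item, acc|
    acc[item['url']] += 1
  end
  # Negate the count so larger counts sort first; url breaks ties.
  counts.sort_by { |url, count| [-count, url] }.first(n)
end

items = [
  { 'name' => 'a', 'url' => 'facebook.com' },
  { 'name' => 'b', 'url' => 'twitter.com' },
  { 'name' => 'c', 'url' => 'facebook.com' },
  { 'name' => 'd', 'url' => 'example.com' },
  { 'name' => 'e', 'url' => 'facebook.com' },
  { 'name' => 'f', 'url' => 'twitter.com' }
]
top_two = top_urls(items, 2)
# top_two => [["facebook.com", 3], ["twitter.com", 2]]
```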

Perform find on an array of hashes to avoid database find

The find methods are very convenient for retrieving records, and I'm frequently using :include to prefetch referenced records to avoid expensive db accesses.
I have a case where I retrieve all sales by a salesperson.
@sales = Sales.find(:all,
:include => [:salesperson, :customer, :batch, :product],
:conditions => {:salesperson_id => someone},
:order => :customer_id)
I then want to slice and dice the returned records based on what was returned. For instance, I want to produce a report for all the sales made by this salesperson at a particular store, which we know is a subset of the previously returned data.
What I'd like to do is,
@storeSales = @sales.find_by_store(store_id)
...and retrieve this subset from the array held in memory as a new array, rather than achieve the same thing by performing a find on the database again. After all, @sales is just an array of Sales objects, so it doesn't seem unreasonable that this should be supported.
However, it doesn't seem that there's a convenient way to do this, is there? Thanks.
If you build the query with the Rails 3 query methods (includes, where, order), the result is a lazy ActiveRecord::Relation rather than an Array. Call .all at the end of the chain to load the records:
@sales = Sales.includes(:salesperson, :customer, :batch, :product).
  where(:salesperson_id => someone).
  order("customer_id").all
Now @sales is an Array of Sales model objects. Getting a subset of the array is easy using the select method:
@my_product_sales = @sales.select { |s| s.product == my_product_criteria }
After the select call, @sales still holds the full result set and @my_product_sales holds the subset that matched the select criteria.
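The same in-memory slicing can be tried without a database. The sketch below uses a plain Struct as a stand-in for the Sales model; the Sale struct, its fields and both helper names are assumptions made for illustration only:

```ruby
# Stand-in for the ActiveRecord model: just enough fields to filter on.
Sale = Struct.new(:id, :store_id, :customer_id)

# One store's sales, filtered from records that are already loaded.
def sales_for_store(sales, store_id)
  sales.select { |sale| sale.store_id == store_id }
end

# All stores at once: store_id => array of sales.
def sales_by_store(sales)
  sales.group_by { |sale| sale.store_id }
end

sales = [Sale.new(1, 10, 100), Sale.new(2, 11, 100), Sale.new(3, 10, 101)]
store_10 = sales_for_store(sales, 10)   # the two sales with store_id 10
grouped  = sales_by_store(sales)        # keys are 10 and 11
```

If the report needs every store rather than one, group_by walks the array once, whereas repeating select per store walks it once per store.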

Will_Paginate - how to get an array of all paginated object ids?

I'm using will_paginate 2.3.x gem and I'm trying to get an array of all the ids of the objects that are in the WillPaginate::Collection. Is there a simple way to iterate all the items in the collection? With ActiveRecord collection this is simple:
all_ids = MyObject.find(:all).map { |o| o.id }
However, when paginated, the collection returns only the N elements that are in the first page:
not_all_ids = MyObject.paginate(:page => 1).map { |o| o.id }
What I want is to go through all the pages in the collection. Basically, I'm looking for a method that retrieves the next page. Any thoughts? thanks.
EDIT - I'm using rails 2.3. Also, I'm using pagination with some conditions but dropped them to simplify:
MyObject.paginate(:conditions => [...], :include => [...], :page => 1)
You could do something like this:
first_page = MyObject.select("id").paginate(:page => 1)
ids = first_page.map(&:id)
(2..first_page.total_pages).each do |page|
  ids.concat(MyObject.select("id").paginate(:page => page).map(&:id))
end
But why use paginate in this case?
ids = MyObject.select("id").map(&:id)
will be faster and will use fewer resources. If your db contains 10000 elements and you iterate 10 at a time, you'll make 1000 calls to your db instead of 1 :-)
ps: I'm using .select("id") so the request generated is :
SELECT id from users;
VS
SELECT * FROM users;
I'm also using a nice shortcut map(&:id), this is the equivalent of .map { |o| o.id }
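To make the cost of walking the pages concrete, here is a database-free sketch. The block passed to collect_all_ids is a hypothetical stand-in for the paginate call, and the method name is made up; each block invocation represents one query:

```ruby
# Collects ids page by page until the fetcher returns an empty page.
# The block receives (page, per_page) and must return an array of ids.
def collect_all_ids(per_page)
  ids = []
  page = 1
  loop do
    batch = yield(page, per_page)
    break if batch.empty?
    ids.concat(batch)
    page += 1
  end
  ids
end

table = (1..25).to_a   # pretend these are the ids stored in the table
queries = 0
ids = collect_all_ids(10) do |page, per_page|
  queries += 1
  table[(page - 1) * per_page, per_page] || []
end
# ids holds all 25 ids; queries is 4 (three pages with data, one empty page)
```

Fetching the same ids in one unpaginated query would count as a single call.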

how to paginate records from multiple models? (do I need a polymorphic join?)

After quite a bit of searching, I'm still a bit lost. There are a few other similar questions out there that deal with paginating multiple models, but they are either unanswered or they paginate each model separately.
I need to paginate all records of an Account at once.
class Account < ActiveRecord::Base
  has_many :emails
  has_many :tasks
  has_many :notes
end
So, I'd like to find the 30 most recent "things" no matter what they are. Is this even possible with the current pagination solutions out there?
Like using some combination of eager loading and Kaminari or will_paginate?
Or, should I first set up a polymorphic join of all these things, called Items. Then paginate the most recent 30 items, then do a lookup of the associated records of those items.
And if so, I'm not really sure what that code should look like. Any suggestions?
Which way is better? (or even possible)
Rails 3.1, Ruby 1.9.2, app not in production.
With will_paginate:
all_records = # do your work and fetch the array of records you want to paginate (various types)
Then do the following:
current_page = params[:page] || 1
per_page = 10
@records = WillPaginate::Collection.create(current_page, per_page, all_records.size) do |pager|
  # hand the pager only the slice that belongs to the current page
  pager.replace(all_records[pager.offset, pager.per_page].to_a)
end
Then in your view:
<%= will_paginate @records %>
Good question... I'm not sure of a "good" solution, but you could do a hacky one in ruby:
You'd need to first fetch out the 30 latest of each type of "thing", and put them into an array, indexed by created_at, then sort that array by created_at and take the top 30.
A totally non-refactored start might be something like:
emails = account.emails.all(:limit => 30, :order => 'created_at DESC')
tasks = account.tasks.all(:limit => 30, :order => 'created_at DESC')
notes = account.notes.all(:limit => 30, :order => 'created_at DESC')
thing_array = (emails + tasks + notes).map { |thing| [thing.created_at, thing] }
# sort by the first item of each pair (== the date), newest first
thing_array_sorted = thing_array.sort { |a, b| b[0] <=> a[0] }
# then just grab the top thirty
things_to_show = thing_array_sorted.slice(0, 30)
Note: not tested, could be full of bugs... ;)
emails = account.emails
tasks = account.tasks
notes = account.notes
all_records = (emails + tasks + notes).sort_by(&:updated_at).reverse
@records = WillPaginate::Collection.create(params[:page] || 1, 30, all_records.size) do |pager|
  pager.replace(all_records[pager.offset, pager.per_page].to_a)
end
That's it... :)
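The merge, sort and slice idea behind these answers can be checked in isolation, without Rails or will_paginate. In this sketch the Item struct and the paginate_merged helper are invented for illustration, and plain Time values stand in for updated_at columns:

```ruby
# Stand-in for emails, tasks and notes: anything with an updated_at.
Item = Struct.new(:kind, :updated_at)

# Merges several collections, sorts newest first, and returns one page.
# page is 1-based; an out-of-range page yields an empty array.
def paginate_merged(collections, page, per_page)
  merged = collections.flatten(1).sort_by { |item| item.updated_at }.reverse
  offset = (page - 1) * per_page
  merged[offset, per_page] || []
end

emails = [Item.new(:email, Time.at(1)), Item.new(:email, Time.at(4))]
tasks  = [Item.new(:task, Time.at(2)), Item.new(:task, Time.at(5))]
notes  = [Item.new(:note, Time.at(3))]

first_page = paginate_merged([emails, tasks, notes], 1, 2)
# first_page holds the task from Time.at(5), then the email from Time.at(4)
```

Note that this loads every record before slicing, which is fine for small accounts but grows with the total number of emails, tasks and notes.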

Rails 3 select random follower query efficiency

I have a method that selects 5 random users who are following a certain user, and adds them to an array.
Relationship.find_all_by_followee_id( user.id ).shuffle[0,5].each do |follower|
follower = User.find(follower.user_id)
array.push follower
end
return array
I'm wondering, is this an efficient way of accomplishing this? My main concern is with the find_all_by_followee_id call. This returns a list of all the relationships where the specified user is being followed (this could be in the 100,000s). And then I shuffle that entire list, and then I trim it to the first 5. Is there a more efficient way to do this?
You can try this:
Relationship.find_all_by_followee_id( user.id, :order => 'rand()', :limit => 5 ).each do |follower|
follower = User.find(follower.user_id)
array.push follower
end
return array
Btw, this will work with MySQL. If you are using PostgreSQL or anything else, you may need to replace rand() with whatever random function your DB supports.
Some minor changes to make it a little more clean:
return Relationship.find_all_by_followee_id( user.id, :order => 'rand()', :limit => 5 ).collect {|follower| User.find(follower.user_id) }
You can also use a join in there in order to prevent the 5 selects but it won't make much difference.
Edit1:
As @mike.surowiec mentioned:
"Just for everyones benefit, translating this to the non-deprecated active record query syntax looks like this:"
Relationship.where(:followee_id => user.id).order( "random()" ).limit( 5 ).collect {|follower| User.find(follower.user_id) }
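If the random pick does have to happen in Ruby, for example on a list of ids that is already in memory, Array#sample (Ruby 1.9 and later) returns the requested number of distinct elements directly. A small sketch, where pick_random and the id list are made up for illustration:

```ruby
# Picks up to `count` distinct random elements from an in-memory array.
# Returns fewer elements when the array is shorter than `count`.
def pick_random(ids, count)
  ids.sample(count)
end

follower_ids = (1..1000).to_a   # hypothetical ids already loaded
chosen = pick_random(follower_ids, 5)
# chosen contains 5 different ids from follower_ids, in random order
```

This still requires the whole list to be loaded first, so for very large follower counts the ordering and limit in the database shown above remain the cheaper route.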

Resources