Ambiguous table reference - ruby-on-rails

This problem seems fairly simple, but I've never encountered one like this.
Here are the settings:
Post has_many :reader_links
Post has_many :readers, :through => :reader_links
I need to find out if there are readers reading a post.
@post.reader_links.where('created_at >= ?', 45.minutes.ago).any?
Works great.
@post.readers.where('created_at >= ?', 45.minutes.ago).any?
throws an ambiguous column error, because it's unclear whether created_at refers to the reader's column or the reader_link's. This happens because the class of a reader is actually User, which has its own created_at. How do I query for readers whose reader_links were created in the last 45 minutes?
I'm looking for something like:
@post.readers.where('reader_link.created_at >= ?', 45.minutes.ago)

If I get it right, you just need to specify which created_at column you're talking about:
@post.readers.where('reader_links.created_at >= ?', 45.minutes.ago).any?
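To make the distinction between the two created_at columns concrete, here is a minimal plain-Ruby sketch of what the qualified condition selects. It does not use ActiveRecord; the User and ReaderLink structs and the recent_readers helper are hypothetical stand-ins for the two tables, introduced only for illustration.

```ruby
# Hypothetical in-memory stand-ins for the users and reader_links tables.
User = Struct.new(:id, :created_at)
ReaderLink = Struct.new(:post_id, :user_id, :created_at)

# Returns the users whose link to the post was created at or after `since`.
# The filter reads link.created_at and never looks at user.created_at.
def recent_readers(post_id, links, users, since)
  user_ids = links
    .select { |link| link.post_id == post_id && link.created_at >= since }
    .map(&:user_id)
  users.select { |user| user_ids.include?(user.id) }
end

now = Time.utc(2012, 7, 4, 12, 0, 0)
users = [
  User.new(1, now - 30 * 86_400), # account created a month ago
  User.new(2, now - 4 * 3600)     # account created four hours ago
]
links = [
  ReaderLink.new(10, 1, now - 5 * 60),  # started reading 5 minutes ago
  ReaderLink.new(10, 2, now - 3 * 3600) # started reading 3 hours ago
]

fresh = recent_readers(10, links, users, now - 45 * 60)
puts fresh.map(&:id).inspect
```

Both accounts are older than 45 minutes, so a filter on the user's own created_at would find nobody; filtering on the link's timestamp is what picks out the current reader.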

You could merge the scopes to get rid of the ambiguity errors, so that each scope has its own visibility range.
using meta_where:
Post.scoped & (ReaderLink.scoped & User.where(:created_at.gt => 45.minutes.ago))
without meta_where:
Post.scoped.merge(ReaderLink.scoped.merge(User.where('created_at >= ?', 45.minutes.ago)))
This will result in arrays of Post objects containing the reader_links and readers data for all readers younger than 45 minutes. Please try it in the rails console.
Edit: for a single post
post_with_fresh_users = Post.where('id = ?', some_id).merge(ReaderLink.scoped.merge(User.where('created_at >= ?', 45.minutes.ago)))
Edit: all fresh readers of a post (different order)
fresh_readers_for_post = User.where('created_at >= ?', 45.minutes.ago).merge(ReaderLink.scoped.merge(Post.where('id = ?', @post.id)))
How it works:
http://benhoskin.gs/2012/07/04/arel-merge-a-hidden-gem

Related

Rails activerecord count number of associations

I have a model called Impression - which counts the views of each Card (also a model).
I am trying to print a table of the cards that have at least 10 views in the last month.
So I started with -
cards = Card.joins(:impressions).where("impressions.created_at > ? AND impressions.created_at < ?", Date.today-30.days, Date.today).uniq
And then I did -
cards.select {|card| card.impressions.count >= 10 }
But it takes a very long time to run. I want something much more efficient.
Any ideas for counting the number of impressions and sorting them?
I want to do it as efficiently as I can, without iterating over the whole DB and hitting the N+1 problem, because it could get pretty ugly.
Does your impressions table's created_at column have an index?
If you are querying on it and there is none, you could add one by generating a migration file.
add_index(:impressions, :created_at)
And you could use pure SQL to add condition to your query like:
cards = Card.joins(:impressions)
.where("impressions.created_at > ? AND impressions.created_at < ?", Date.today-30.days, Date.today)
.group('cards.id')
.having('COUNT(impressions.*) >= 10')
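As a rough illustration of what the group/having pair computes, the following plain-Ruby sketch does the same counting in memory. It is not how you would run it in production (the point of the query above is to let the database do this); the Impression struct and the popular_card_ids helper are hypothetical names used only here.

```ruby
# Hypothetical in-memory stand-in for a row of the impressions table.
Impression = Struct.new(:card_id, :created_at)

# Returns the card ids with at least `min_views` impressions strictly
# between `from` and `to`, mirroring WHERE + GROUP BY + HAVING COUNT >= n.
def popular_card_ids(impressions, from, to, min_views)
  counts = Hash.new(0)
  impressions.each do |imp|
    counts[imp.card_id] += 1 if imp.created_at > from && imp.created_at < to
  end
  counts.select { |_card_id, count| count >= min_views }.keys
end

to = Time.utc(2015, 6, 30)
from = to - 30 * 86_400

# Card 1: 12 views inside the window; card 2: only 3; card 3: 15, all too old.
recent_card1 = Array.new(12) { |i| Impression.new(1, to - (i + 1) * 3600) }
recent_card2 = Array.new(3) { |i| Impression.new(2, to - (i + 1) * 3600) }
old_card3 = Array.new(15) { |i| Impression.new(3, from - (i + 1) * 3600) }
impressions = recent_card1 + recent_card2 + old_card3

popular = popular_card_ids(impressions, from, to, 10)
puts popular.inspect
```

The single pass over the rows is the in-memory analogue of one grouped query, as opposed to one count query per card.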
You can use .last() function
Try this
cards = Card.joins(:impressions).where("impressions.created_at BETWEEN ? AND ?", 1.month.ago.beginning_of_month , 1.month.ago.end_of_month).uniq.last(10)
I would use having and AREL
Try this :
cards = Card.joins(:impressions).where(impressions: {created_at: (Date.today-30.days..Date.today)}).group("cards.id").having(Impression.arel_table[:id].count.gteq(10)).uniq
I don't know what your relations look like or how much data you have, so I'm assuming your relations look like this:
Card
class Card
  has_many :impressions
end
Impression
class Impression
  belongs_to :card
end
my suggestion is to try something like this:
cards_id = Card.pluck(:id)
past_one_month_impressions_arel = Impression.where("impressions.created_at > ? AND impressions.created_at < ?", Date.today-30.days, Date.today)
cards_id_having_atleast_10_views = cards_id.select do |c_id|
  arel = past_one_month_impressions_arel.where(impressions: { card_id: c_id })
  arel.count >= 10
end
cards = Card.where(id: cards_id_having_atleast_10_views)
Also you may try using find_each to iterate in batches if you have a lot of data.
Hope that helps.
Thanks.

Efficient ActiveRecord association conditions

Let's say you have an association in one of your models like this:
class User
has_many :articles
end
Now assume you need to get 3 arrays: one for the articles written yesterday, one for the articles written in the last 7 days, and one for the articles written in the last 30 days.
Of course you might do this:
articles_yesterday = user.articles.where("posted_at >= ?", Date.yesterday)
articles_last7d = user.articles.where("posted_at >= ?", 7.days.ago.to_date)
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
However, this will run 3 separate database queries. More efficiently, you could do this:
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
articles_yesterday = articles_last30d.select { |article|
article.posted_at >= Date.yesterday
}
articles_last7d = articles_last30d.select { |article|
article.posted_at >= 7.days.ago.to_date
}
Now of course this is a contrived example and there is no guarantee that the array select will actually be faster than a database query, but let's just assume that it is.
My question is: Is there any way (e.g. some gem) to write this code in a way which eliminates this problem by making sure that you simply specify the association conditions, and the application itself will decide whether it needs to perform another database query or not?
ActiveRecord itself does not seem to cover this problem appropriately. You are forced to decide between querying the database every time or treating the association as an array.
There are a couple of ways to handle this:
You can create separate associations for each level that you want by specifying a conditions hash on the association definition. Then you can simply eager load these associations for your User query, and you will be hitting the db 3x for the entire operation instead of 3x for each user.
class User
  has_many :articles_yesterday, class_name: "Article", conditions: ['posted_at >= ?', Date.yesterday]
  # other associations the same way
end
User.where(...).includes(:articles_yesterday, :articles_7days, :articles_30days)
You could do a group by.
What it comes down to is you need to profile your code and determine what's going to be fastest for your app (or if you should even bother with it at all)
You can get rid of the necessity of checking the query with something like the code below.
class User
  has_many :articles

  # The only method that queries the database; the result is memoized.
  def articles_last30d
    @articles_last30d ||= articles.where("posted_at >= ?", 30.days.ago.to_date)
  end

  # Filtered in memory from the memoized 30-day result.
  def articles_last7d
    @articles_last7d ||= articles_last30d.select { |article| article.posted_at >= 7.days.ago.to_date }
  end

  def articles_yesterday
    @articles_yesterday ||= articles_last30d.select { |article| article.posted_at >= Date.yesterday }
  end
end
What it does:
Makes only one query maximum, if any of the three is used
Calculates only the used array, and the 30d version in any case, but only once
It does not, however, skip the initial 30d query, even if you never use the 30d result itself. Is that enough, or do you need something more?
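The memoization idea can be tried outside Rails with a small plain-Ruby sketch. Everything here is an assumption made for illustration: the ArticleWindows class, the Article struct, and the block passed to the constructor, which stands in for the database query and is counted so you can see how often it runs.

```ruby
require "date"

# Hypothetical stand-in for an Article record.
Article = Struct.new(:title, :posted_at)

class ArticleWindows
  attr_reader :fetch_count

  # The block stands in for the database query; it receives the earliest
  # date to load and returns the matching articles.
  def initialize(today, &fetcher)
    @today = today
    @fetcher = fetcher
    @fetch_count = 0
  end

  # The only method that calls the fetcher; the result is memoized.
  def last_30d
    @last_30d ||= begin
      @fetch_count += 1
      @fetcher.call(@today - 30)
    end
  end

  # Both narrower windows are filtered in memory from the 30-day result.
  def last_7d
    @last_7d ||= last_30d.select { |a| a.posted_at >= @today - 7 }
  end

  def yesterday
    @yesterday ||= last_30d.select { |a| a.posted_at >= @today - 1 }
  end
end

today = Date.new(2013, 3, 15)
all_articles = [
  Article.new("a", today - 1),
  Article.new("b", today - 5),
  Article.new("c", today - 20),
  Article.new("d", today - 45)
]

windows = ArticleWindows.new(today) do |since|
  all_articles.select { |a| a.posted_at >= since }
end

puts windows.yesterday.map(&:title).inspect
puts windows.last_7d.map(&:title).inspect
puts windows.last_30d.map(&:title).inspect
puts windows.fetch_count
```

Whichever window is requested first triggers the single fetch; later calls reuse the cached arrays, which is the trade-off the answer describes.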

Making this ActiveRecord query more efficient?

I have User and Gift models. A user can send gifts to other users. I have a relational table telling me which users received a gift. On the other hand, a user belongs to a School, which can be free or paid.
I want the count of users that have received a gift in the last week for a specific type of school (that is, free or paid).
I can do:
Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:gift_recipients).flatten.uniq.count
Or, I want to know how many users sent gifts the last week. This works:
Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:user_id).uniq.count
If I want to know how many users have sent or received a gift in the last week I can do:
(Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:gift_recipients).flatten + Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:user_id)).uniq.count
All this works fine but if the database is big enough this is really slow. Do you have any suggestions to make it more efficient, maybe using raw SQL where needed?
"gifts"
user_id (integer)
school_id (integer)
created_at (datetime)
updated_at (datetime)
"gift_recipients" is a table like
gift_id | recipient_id,
You do not want to do this using collect(), which loads all of the results into memory and filters them within an Array of ActiveRecord objects. This is slow and dangerous, as it could potentially use up all of the available memory, depending on the size of the data relative to your server.
Once you post your schema I can help you query/aggregate this in SQL, which is the right way to do it.
For example, instead of:
Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:user_id).uniq.count
You should use:
Gift.joins(:schools).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).count('distinct user_id')
...which will count the distinct user_ids in SQL and return the result instead of returning all of the objects and counting them in memory.
I saw this old post and I wanted to make a couple of comments:
As Winfield said
Gift.joins(:school).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).count('distinct user_id')
is a good way of doing this. I would do
Gift.joins(:school).count('distinct user_id', :conditions => ["gifts.created_at >= ? AND free_school = ?", Time.now.beginning_of_week, true])
but only because it looks nicer to my eyes, a personal thing; you can check that both produce exactly the same SQL query. Note that it is necessary to write
gifts.created_at
to avoid ambiguity, because both tables have a column with this name. In the case of the column name
free_school
there is no ambiguity, as this is not a column name in the gifts table. For the first query I was doing
Gift.joins(:school).where("created_at >= ? AND schools.free_school = ?", Time.now.beginning_of_week, true).collect(&:user_id).uniq.count
which is awkward. This works better
Gift.joins(:school).count("distinct user_id", :conditions => ["gifts.created_at >= ? AND free_school = ?", Time.now.beginning_of_week, true])
which avoids the problem of bringing gifts into memory and filtering them with Ruby.
Up to this point there's nothing new. The key point here is that my problem was calculating the number of users who sent or received a gift during the last week. For this I came up with the following
senders_ids = Gift.joins(:school).find(:all, :select => 'distinct user_id', :conditions => ['gifts.created_at >= ? AND free_school = ?', Time.now.beginning_of_week, type]).map {|g| g.user_id}
receivers_ids = Gift.joins(:school).find(:all, :select => 'distinct rec.recipient_id', :conditions => ['gifts.created_at >= ? AND free_school = ?', Time.now.beginning_of_week, type], :joins => "INNER JOIN gifts_recipients as rec on rec.gift_id = gifts.id").map {|g| g.recipient_id}
(senders_ids + receivers_ids).uniq.count
I'm pretty sure there is a better way of doing this, I mean returning exactly this number in a single SQL query, but at least the results are arrays of objects containing only the id (recipient_id in the receivers case), without bringing all objects into memory. Well, I'm just hoping this is useful for someone new to SQL queries through Rails, like me :).
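The final step, combining the two id lists and counting each user once, can be sketched in plain Ruby. The Gift and GiftRecipient structs and the active_user_count helper below are hypothetical stand-ins for the gifts and gift_recipients rows described in the question, not the poster's models.

```ruby
require "set"

# Hypothetical stand-ins for rows of the gifts and gift_recipients tables.
Gift = Struct.new(:id, :user_id)
GiftRecipient = Struct.new(:gift_id, :recipient_id)

# Counts distinct users who either sent one of `gifts` or received one.
# Recipient rows pointing at gifts outside the list are ignored.
def active_user_count(gifts, gift_recipients)
  gift_ids = gifts.map(&:id).to_set
  senders = gifts.map(&:user_id).to_set
  receivers = gift_recipients
    .select { |row| gift_ids.include?(row.gift_id) }
    .map(&:recipient_id)
    .to_set
  (senders | receivers).size
end

gifts = [Gift.new(1, 10), Gift.new(2, 11), Gift.new(3, 10)]
gift_recipients = [
  GiftRecipient.new(1, 11),
  GiftRecipient.new(1, 12),
  GiftRecipient.new(2, 10),
  GiftRecipient.new(3, 13),
  GiftRecipient.new(99, 50) # belongs to a gift outside this week's list
]

total = active_user_count(gifts, gift_recipients)
puts total
```

Users 10 and 11 appear as both sender and receiver but are counted once each, which is the same effect as calling uniq on the concatenated id arrays.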

How to count "days with records" in Rails?

Rails adds and populates a created_at column for new records.
How can I use that to count the number of days that have records within a specified timeframe? (Note: counting days, not counting records.)
For example, say I have a Post model, how can I calculate how many days in the last year have a Post?
Since you asked for the ruby way, here it is:
Post.where('created_at >= ?', 1.year.ago).map { |p| p.created_at.beginning_of_day }.uniq.size
Update
You can put the following in your Post model
def self.number_of_days
  where('created_at >= ?', 1.year.ago).map { |p| p.created_at.beginning_of_day }.uniq.size
end
Then in your controller you can do stuff like
@user.posts.number_of_days
Here's a more efficient way that delegates most of the work to the database (MySQL, not sure if it'll work on others):
Post.where('created_at >= ?', 1.year.ago).group('DATE(created_at)').length
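The underlying idea in both answers is to reduce each timestamp to its calendar date and count the distinct dates. A minimal plain-Ruby sketch of that step, assuming UTC timestamps and using a hypothetical days_with_records helper in place of the model method:

```ruby
require "date" # provides Time#to_date

# Counts the distinct calendar days that have at least one timestamp
# at or after `since`.
def days_with_records(timestamps, since)
  timestamps
    .select { |t| t >= since }
    .map { |t| t.to_date }
    .uniq
    .size
end

timestamps = [
  Time.utc(2014, 1, 1, 9, 0, 0),
  Time.utc(2014, 1, 1, 23, 59, 0), # same day as the previous one
  Time.utc(2014, 1, 2, 0, 0, 1),
  Time.utc(2014, 3, 10, 12, 0, 0),
  Time.utc(2012, 5, 5, 8, 0, 0)    # before the cutoff
]

day_count = days_with_records(timestamps, Time.utc(2013, 6, 1))
puts day_count
```

Five records collapse to three days here; the database version does the same reduction with DATE(created_at) before anything is loaded into Ruby.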

Find all old records returning same amount for greater than and less than (> <)

I'm trying to find (and then delete) all old records in my DB on heroku.
For some reason these two are equal (notice < and >)
Post.find(:all, "updated_at > ?", 30.days.ago).count
Post.find(:all, "updated_at < ?", 30.days.ago).count
Makes me hesitant about using the delete.
What call should I make to ensure I do get only the older records?
The other answers are correct, but for updated ActiveRecord syntax:
Post.where("updated_at < ?", 30.days.ago)
Less than is what you want:
Post.find(:all, :conditions => ["updated_at < ?", 30.days.ago])
If you're unsure, print some of the records to the console using p or awesome_print (ap).
You need to specify your where clause with :conditions, like so:
Post.all(:conditions => ["updated_at < ?", 30.days.ago])
If you want to delete old records, you want to delete records that have an earlier date, ie, less than the given date (30.days.ago)
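One way to sanity-check the direction of the comparison before deleting anything is to split a sample of records at the cutoff and confirm that the two sides differ and add up to the total. The sketch below does this in plain Ruby; the Post struct and the split_by_age helper are hypothetical and only mimic the updated_at comparison.

```ruby
# Hypothetical stand-in for a Post record.
Post = Struct.new(:id, :updated_at)

# Splits posts into [older, newer] relative to `cutoff`.
# "Older" means updated_at is strictly less than the cutoff.
def split_by_age(posts, cutoff)
  posts.partition { |post| post.updated_at < cutoff }
end

now = Time.utc(2012, 1, 31)
cutoff = now - 30 * 86_400 # the plain-Ruby equivalent of 30.days.ago

posts = [
  Post.new(1, now - 40 * 86_400),
  Post.new(2, now - 31 * 86_400),
  Post.new(3, now - 2 * 86_400)
]

older, newer = split_by_age(posts, cutoff)
puts older.map(&:id).inspect
puts newer.map(&:id).inspect
```

If both the less-than and the greater-than count come back equal to the total number of records, as in the question, that suggests the condition is not being applied at all.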
Instead of deleting the records, why don't you set a state flag to pending_delete? Then, once you're satisfied, you can delete all the records in that state.
