ActiveRecord query with alias'd table names - ruby-on-rails

Using model concerns which include scopes, what is the best way to write these knowing that nested and/or self-referencing queries are likely?
In one of my concerns, I have scopes similar to these:
scope :current, ->(as_at = Time.now) { current_and_expired(as_at).current_and_future(as_at) }
scope :current_and_future, ->(as_at = Time.now) { where("#{upper_bound_column} IS NULL OR #{upper_bound_column} >= ?", as_at) }
scope :current_and_expired, ->(as_at = Time.now) { where("#{lower_bound_column} IS NULL OR #{lower_bound_column} <= ?", as_at) }
def self.lower_bound_column
  lower_bound_field
end
def self.upper_bound_column
  upper_bound_field
end
These scopes are referred to via has_many associations, for example: has_many :company_users, -> { current }
If an ActiveRecord query is made which refers to a few models that include the concern, this results in an 'ambiguous column name' exception which makes sense.
To help overcome this, I changed the column name helper methods to:
def self.lower_bound_column
  "#{self.table_name}.#{lower_bound_field}"
end
def self.upper_bound_column
  "#{self.table_name}.#{upper_bound_field}"
end
Which works great, until you require self-referencing queries. Arel helps mitigate these issues by aliasing the table name in the resulting SQL, for example:
LEFT OUTER JOIN "company_users" "company_users_companies" ON "company_users_companies"."company_id" = "companies"."id"
and
INNER JOIN "company_users" ON "users"."id" = "company_users"."user_id" WHERE "company_users"."company_id" = $2
The issue here is that self.table_name no longer refers to the table name in the query. And this results in the tongue in cheek hint: HINT: Perhaps you meant to reference the table alias "company_users_companies"
In an attempt to migrate these queries over to Arel, I changed the column name helper methods to:
def self.lower_bound_column
  self.class.arel_table[lower_bound_field.to_sym]
end
def self.upper_bound_column
  self.class.arel_table[upper_bound_field.to_sym]
end
and updated the scopes to reflect:
lower_bound_column.eq(nil).or(lower_bound_column.lteq(as_at))
but this just ported the issue across since self.class.arel_table will always be the same regardless of the query.
I guess my question is: how do I create scopes that can be used in self-referencing queries, which require operators such as <= and >=?
Edits
I have created a basic application to help showcase this issue.
git clone git@github.com:fattymiller/expirable_test.git
cd expirable_test
createdb expirable_test-development
bundle install
rake db:migrate
rake db:seed
rails s
Findings and assumptions
Works in sqlite3, not Postgres. Most likely because Postgres enforces the order of queries in the SQL?

Well, well, well. After quite some time looking through the sources of Arel, ActiveRecord and the Rails issues (it seems this is not new), I was able to find a way to access the current arel_table object, with its table_aliases if they are being used, inside the current scope at the moment of its execution.
That made it possible to know whether the scope is going to be used within a JOIN that has the table name aliased, or whether the scope can be used on the real table name.
I just added this method to your Expirable concern:
def self.current_table_name
  current_table = current_scope.arel.source.left
  case current_table
  when Arel::Table
    current_table.name
  when Arel::Nodes::TableAlias
    current_table.right
  else
    fail
  end
end
As you can see, I'm using current_scope as the base object to look for the arel table, instead of the prior attempts of using self.class.arel_table or even relation.arel_table, which as you said remained the same regardless of where the scope was used. Calling arel on that object gives an Arel::SelectManager, and its source in turn gives you the current table on its #left. At this point there are two options: either you have an Arel::Table (no alias, table name is on #name) or you have an Arel::Nodes::TableAlias with the alias on its #right.
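The branch on the node type can be tried out without a database. The sketch below is only an illustration: FakeTable and FakeTableAlias are hypothetical stand-ins for Arel::Table and Arel::Nodes::TableAlias that model nothing but the name and right readers, and starts_at is an invented column name.

```ruby
# Hypothetical stand-ins for Arel::Table and Arel::Nodes::TableAlias.
# Only the readers used by current_table_name are modelled here.
FakeTable = Struct.new(:name)
FakeTableAlias = Struct.new(:left, :right)

# Returns the name the query actually uses for the table:
# the alias when one is present, the real table name otherwise.
def table_name_for(node)
  case node
  when FakeTable then node.name
  when FakeTableAlias then node.right
  else raise ArgumentError, "unexpected node: #{node.inspect}"
  end
end

# Builds the same kind of SQL fragment the scopes pass to where.
def lower_bound_sql(node, field)
  column = "#{table_name_for(node)}.#{field}"
  "#{column} IS NULL OR #{column} <= ?"
end

plain = FakeTable.new("company_users")
aliased = FakeTableAlias.new(plain, "company_users_companies")

puts lower_bound_sql(plain, "starts_at")   # hypothetical column name
puts lower_bound_sql(aliased, "starts_at")
```

The aliased case is the one that a fragment built from self.table_name would get wrong, since it would keep referring to company_users.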
With that table_name you can revert to your first attempt of #{current_table_name}.#{lower_bound_field} and #{current_table_name}.#{upper_bound_field} in your scopes:
def self.lower_bound_column
  "#{current_table_name}.#{lower_bound_field}"
end
def self.upper_bound_column
  "#{current_table_name}.#{upper_bound_field}"
end
scope :current_and_future, ->(as_at = Time.now) { where("#{upper_bound_column} IS NULL OR #{upper_bound_column} >= ?", as_at) }
scope :current_and_expired, ->(as_at = Time.now) { where("#{lower_bound_column} IS NULL OR #{lower_bound_column} <= ?", as_at) }
This current_table_name method seems to me to be something that would be useful to have on the AR / Arel public API, so it can be maintained across version upgrades. What do you think?
If you are interested, here are some references I used down the road:
A similar question on SO, answered with a ton of code, that you could use instead of your beautiful and concise Ability.
This Rails issue and this other one.
And the commit on your test app on github that made tests green!

I have a slightly modified version of @dgilperez's approach, which uses the full power of Arel:
def self.current_table
  current_scope.arel.source.left
end
Now you can modify your methods to use the arel_table syntax:
def self.lower_bound_column
  current_table[lower_bound_field.to_sym]
end
def self.upper_bound_column
  current_table[upper_bound_field.to_sym]
end
and use it in a query like this:
lower_bound_column.eq(nil).or(lower_bound_column.lteq(as_at))

Related

Rails code refactor in call method to handle map

I'm wondering whether someone could take a fresh look at the code below and suggest a refactor.
def call
  inq_proc_ids = InquiryProcess.all.includes(inquiry_field_responses: :inquiry_field).select do |process|
    process.inquiry_field_responses.select do |inquiry_field_responses|
      inquiry_field_responses.inquiry_field.name == 'company_name'
    end.last&.value&.start_with?(company_filter)
  end.map(&:id)
  InquiryProcess.where(id: inq_proc_ids)
end
I think I should leave only InquiryProcess.where(id: inq_proc_ids) in my call method, but I don't know how to handle all the .last&.value&.start_with?(company_filter) and .map(&:id) parts.
EDIT:
I was trying to split it to the new methods
def call
  InquiryProcess.where(id: inquiry_process_id)
end

private

attr_reader :company_filter, :inquiry_field_response

def inquiry_process_id
  InquiryProcess.all.includes(inquiry_field_responses: :inquiry_field).select do |process|
    process.inquiry_field_responses.select_company_name
  end.map(&:id)
end

def select_company_name
  select do |inquiry_field_responses|
    inquiry_field_responses.inquiry_field.name == 'company_name'
  end.last&.value&.start_with?(company_filter)
end
but I got an error:
NoMethodError (undefined method `select_company_name' for #<ActiveRecord::Associations::CollectionProxy []>):
The code you posted is not only hard to follow, but I remember we had a massive memory leak connected to ActiveRecord caching when using precalculated ids in a query.
That said, I'd try to utilise the above within a single sql query:
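def call
  # value is read from the response rows in the question's code;
  # InquiryFieldResponse is the model name assumed from the association.
  id_select = InquiryProcess
    .joins(inquiry_field_responses: :inquiry_field)
    .where(inquiry_fields: { name: 'company_name' })
    .where(InquiryFieldResponse.arel_table[:value].matches("#{company_filter}%"))
    .select(:id)
  InquiryProcess.where(id: id_select)
end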
Note that id_select is not an array of ids but an ActiveRecord scope, so the above will translate to the following SQL:
SELECT "inquiry_processes".*
FROM "inquiry_processes"
WHERE "inquiry_processes"."id" IN (
SELECT "inquiry_processes"."id"
FROM "inquiry_processes"
INNER JOIN ...
WHERE ...
)
And to answer another question: why do we query the table by matching id against the result of a subquery on the same table? This is to avoid all sorts of painful issues when you deal with an active record relation that has a join in it, e.g. it would affect all further includes statements, as the preloaded association would only include records matching the relation's join conditions.
I really hope for you that this bit is quite well tested or you have someone who can verify validity of the behaviour.
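On the NoMethodError from the question's edit: select_company_name is defined on the service object but called on the association collection, which has no such method. One way around that, sketched here with plain Structs as hypothetical stand-ins for the ActiveRecord models (CompanyFilterQuery and the stub names are invented for the example), is to pass the collection in as an argument. This only restructures the in-memory version and does not replace the single-query approach above.

```ruby
# Hypothetical stand-ins for the models; real code would use the
# ActiveRecord classes and their associations instead.
Field = Struct.new(:name)
Response = Struct.new(:inquiry_field, :value)
InquiryProcessStub = Struct.new(:id, :inquiry_field_responses)

class CompanyFilterQuery
  def initialize(company_filter)
    @company_filter = company_filter
  end

  # Returns the ids of processes whose last 'company_name' response
  # starts with the filter.
  def matching_ids(processes)
    processes
      .select { |process| company_name_matches?(process.inquiry_field_responses) }
      .map(&:id)
  end

  private

  # Takes the collection as an argument instead of being called on it.
  def company_name_matches?(responses)
    last_response = responses.select { |r| r.inquiry_field.name == 'company_name' }.last
    # Coerce nil (no response, or no value) to false for a clean predicate.
    !!last_response&.value&.start_with?(@company_filter)
  end
end
```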

How to write complex query in Ruby

Need advice, how to write complex query in Ruby.
Query in PHP project:
$get_trustee = db_query("SELECT t.trustee_name,t.secret_key,t.trustee_status,t.created,t.user_id,ui.image from trustees t
left join users u on u.id = t.trustees_id
left join user_info ui on ui.user_id = t.trustees_id
WHERE t.user_id='$user_id' AND trustee_status ='pending'
group by secret_key
ORDER BY t.created DESC")
My guess in Ruby:
get_trustee = Trustee.find_by_sql('SELECT t.trustee_name, t.secret_key, t.trustee_status, t.created, t.user_id, ui.image FROM trustees t
LEFT JOIN users u ON u.id = t.trustees_id
LEFT JOIN user_info ui ON ui.user_id = t.trustees_id
WHERE t.user_id = ? AND
t.trustee_status = ?
GROUP BY secret_key
ORDER BY t.created DESC',
[user_id, 'pending'])
Option 1 (Okay)
Do you mean Ruby with ActiveRecord? Are you using ActiveRecord and/or Rails? #find_by_sql is a method that exists within ActiveRecord. Also, it seems like the users table isn't really needed in this query, but maybe you left something out? Either way, I'll include it in my examples. This query would work if you haven't set up your relationships right:
users_trustees = Trustee.
  select('trustees.*, ui.image').
  joins('LEFT OUTER JOIN users u ON u.id = trustees.trustees_id').
  joins('LEFT OUTER JOIN user_info ui ON ui.user_id = trustees.trustees_id').
  where(user_id: user_id, trustee_status: 'pending').
  order('trustees.created DESC')
Also, be aware of a few things with this solution:
I have not found a super elegant way to get the columns from the join tables out of the ActiveRecord objects that get returned. You can access them by users_trustees.each { |u| u['image'] }
This query isn't really THAT complex and ActiveRecord relationships make it much easier to understand and maintain.
I'm assuming you're using a legacy database and that's why your columns are named this way. If I'm wrong and you created these tables for this app, then your life would be much easier (and conventional) with your primary keys being called id and your timestamps being called created_at and updated_at.
Option 2 (Better)
If you set up your ActiveRecord relationships and classes properly, then this query is much easier:
class Trustee < ActiveRecord::Base
  self.primary_key = 'trustees_id' # wouldn't be needed if the column was id
  has_one :user
  has_one :user_info
end

class User < ActiveRecord::Base
  belongs_to :trustee, foreign_key: 'trustees_id' # relationship can also go the other way
end

class UserInfo < ActiveRecord::Base
  self.table_name = 'user_info'
  belongs_to :trustee
end
Your "query" can now be ActiveRecord goodness if performance isn't paramount. The Ruby convention is readability first, reorganizing code later if stuff starts to scale.
Let's say you want to get a trustee's image:
trustee = Trustee.where(trustees_id: 5).first
if trustee
image = trustee.user_info.image
..
end
Or if you want to get all trustee's images:
Trustee.all.collect { |t| t.user_info.try(:image) } # using a #try in case user_info is nil
Option 3 (Best)
It seems like trustee is just a special-case user of some sort. You can use STI if you don't mind restructuring your tables to simplify even further.
This is probably outside of the scope of this question so I'll just link you to the docs on this: http://api.rubyonrails.org/classes/ActiveRecord/Base.html see "Single Table Inheritance". Also see the article that they link to from Martin Fowler (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html)
Resources
http://guides.rubyonrails.org/association_basics.html
http://guides.rubyonrails.org/active_record_querying.html
Yes, find_by_sql will work, you can try this also:
Trustee.connection.execute('...')
or for generic queries:
ActiveRecord::Base.connection.execute('...')

Efficient ActiveRecord association conditions

Let's say you have an association in one of your models like this:
class User
has_many :articles
end
Now assume you need to get 3 arrays: one for the articles written yesterday, one for the articles written in the last 7 days, and one for the articles written in the last 30 days.
Of course you might do this:
articles_yesterday = user.articles.where("posted_at >= ?", Date.yesterday)
articles_last7d = user.articles.where("posted_at >= ?", 7.days.ago.to_date)
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
However, this will run 3 separate database queries. More efficiently, you could do this:
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
articles_yesterday = articles_last30d.select { |article|
  article.posted_at >= Date.yesterday
}
articles_last7d = articles_last30d.select { |article|
  article.posted_at >= 7.days.ago.to_date
}
Now of course this is a contrived example and there is no guarantee that the array select will actually be faster than a database query, but let's just assume that it is.
My question is: Is there any way (e.g. some gem) to write this code in a way which eliminates this problem by making sure that you simply specify the association conditions, and the application itself will decide whether it needs to perform another database query or not?
ActiveRecord itself does not seem to cover this problem appropriately. You are forced to decide between querying the database every time or treating the association as an array.
There are a couple of ways to handle this:
You can create separate associations for each level that you want by specifying a conditions hash on the association definition. Then you can simply eager load these associations for your User query, and you will be hitting the db 3x for the entire operation instead of 3x for each user.
class User
  has_many :articles_yesterday, class_name: 'Article', conditions: ['posted_at >= ?', Date.yesterday]
  # other associations the same way
end
User.where(...).includes(:articles_yesterday, :articles_7days, :articles_30days)
You could do a group by.
What it comes down to is you need to profile your code and determine what's going to be fastest for your app (or if you should even bother with it at all)
You can get rid of the necessity of checking the query with something like the code below.
class User
  has_many :articles

  def articles_last30d
    @articles_last30d ||= articles.where("posted_at >= ?", 30.days.ago.to_date)
  end

  def articles_last7d
    @articles_last7d ||= articles_last30d.select { |article| article.posted_at >= 7.days.ago.to_date }
  end

  def articles_yesterday
    @articles_yesterday ||= articles_last30d.select { |article| article.posted_at >= Date.yesterday }
  end
end
What it does:
Makes at most one query, and only if any of the three is used
Computes only the arrays that are actually used, plus the 30d version in any case, each only once
It does not, however, avoid the initial 30d query when you only need one of the shorter ranges. Is that enough, or do you need something more?
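The in-memory filtering step can be checked in isolation. This is a minimal sketch, assuming a Struct as a hypothetical stand-in for the Article model and an explicit today argument (not in the original code) so the result does not depend on the clock.

```ruby
require 'date'

# Hypothetical stand-in for the Article model.
Article = Struct.new(:title, :posted_at)

# Splits an already-loaded 30-day list into the three windows
# without touching the database again.
def article_windows(articles_last30d, today)
  {
    yesterday: articles_last30d.select { |a| a.posted_at >= today - 1 },
    last7d:    articles_last30d.select { |a| a.posted_at >= today - 7 },
    last30d:   articles_last30d
  }
end

today = Date.new(2024, 1, 31)
loaded = [
  Article.new("recent", today - 1),
  Article.new("this week", today - 5),
  Article.new("this month", today - 20)
]
windows = article_windows(loaded, today)
puts windows[:last7d].map(&:title).inspect
```

Whether this beats three indexed queries depends on how many rows the 30-day window holds, which is the profiling question raised above.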

Is it possible to "dynamically" join a table only if that table is not joined yet?

I am using Ruby on Rails 3.2.2 and I would like to know if in scope methods it is possible to "dynamically" join a table only if that table is not joined yet. That is, I have:
def self.scope_method_name(user)
  joins(:joining_association_name).where("joining_table_name.user_id = ?", user.id)
end
I would like to make something like the following:
# Note: the following code is just a sample in order to understand what I mean.
def self.scope_method_name(user)
  if table_is_joined?(joining_table_name)
    where("joining_table_name.user_id = ?", user.id)
  else
    joins(:joining_association_name).where("joining_table_name.user_id = ?", user.id)
  end
end
Is it possible / advised to make that? If so, how could / should I proceed?
I would like to use this approach in order to avoid joining the same database table multiple times in the INNER JOIN of SQL queries (in some cases the duplicated table statements seem to make my SQL queries not work as expected), and so to use scope_method_name without caring about the related SQL query concerns (in my case, without caring whether the database tables are joined).
Note: It could raise SQL errors (for example, errors like "ActiveRecord::StatementInvalid: Mysql2::Error: Unknown column 'joining_table_name.user_id' in 'where clause'") when you have not yet joined the database table (for example, this could happen when you run code like ClassName.scope_method_name(@user) without previously joining the joining_association_name and so without joining the related joining_table_name table).
There is a loaded? method that checks whether an association has been loaded. You could try to use that.
if association_name.loaded?
  where("joining_table_name.user_id = ?", user.id)
else
  joins(:joining_association_name).where("joining_table_name.user_id = ?", user.id)
end
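Another angle, offered as an assumption rather than a recipe tested on Rails 3.2: a relation keeps the joins it has accumulated in joins_values, so a scope could check that list before joining again. The sketch below uses a hypothetical FakeRelation stand-in (and an invented join_once helper) so it runs without a database. It only recognises joins that were added with the same symbol; string joins or nested hashes would need extra handling.

```ruby
# Hypothetical stand-in for an ActiveRecord relation; it models only
# joins_values and joins, which is all this check relies on.
FakeRelation = Struct.new(:joins_values) do
  def joins(association)
    FakeRelation.new(joins_values + [association])
  end
end

# Adds the join only when the same association has not been joined yet.
def join_once(relation, association)
  return relation if relation.joins_values.include?(association)

  relation.joins(association)
end

relation = FakeRelation.new([])
relation = join_once(relation, :joining_association_name)
relation = join_once(relation, :joining_association_name)
puts relation.joins_values.inspect
```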

Rails 3 Active Record relation order: Use hash instead of string

To sort a relation in Rails 3, we have to do this:
User.where(:activated => true).order('id ASC')
But I think this:
User.where(:activated => true).order(:id => :asc)
would make better sense because the way the field name is escaped should depend on the adapter (SQLite vs MySQL vs PostgreSQL), right?
Is there something similar to that?
As far as I know there's no option for this syntax built into ActiveRecord, but it shouldn't be hard for you to add one. I found the order method defined in lib/active_record/relation/query_methods.rb. Theoretically, you should be able to do something like this:
module ActiveRecord
  module QueryMethods
    def order(*args)
      args.map! do |arg|
        if arg.is_a? Hash
          # Format a string out of the hash that matches the original AR style
          stringed_arg
        else
          arg
        end
      end
      super
    end
  end
end
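The string-building step left open by the comment above could look roughly like this. It is a minimal sketch: order_hash_to_sql is a hypothetical helper name, it does no column quoting, and it is shown on its own rather than wired into ActiveRecord. Note that Rails 4 and later accept order(id: :asc) directly, so this only matters on older versions.

```ruby
# Turns { id: :asc, name: :desc } into "id ASC, name DESC".
# Column names are not quoted here; this is only the string-building step.
def order_hash_to_sql(hash)
  hash.map do |column, direction|
    dir = direction.to_s.upcase
    raise ArgumentError, "bad direction: #{direction.inspect}" unless %w[ASC DESC].include?(dir)

    "#{column} #{dir}"
  end.join(', ')
end

puts order_hash_to_sql(id: :asc, name: :desc)
```

Rejecting anything other than ASC or DESC keeps an unexpected value from ending up in the SQL string.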
I think the key problem is that the ActiveRecord API is not aware of ordering semantics. It just accepts a string and passes it through to the underlying database. Fortunately, SQLite, MySQL and PostgreSQL do not differ in their ORDER BY syntax.
I don't think ActiveRecord can do this abstraction well, and it doesn't need to. It works well with relational databases, but is hard to integrate with NoSQL stores, e.g. MongoDB.
DataMapper, another famous Ruby ORM, did better abstraction. Take a look at its query syntax:
@zoos_by_tiger_count = Zoo.all(:order => [ :tiger_count.desc ])
The API is aware of the ordering semantic. By default, DataMapper will generate SQL order statement:
https://github.com/datamapper/dm-do-adapter/blob/master/lib/dm-do-adapter/adapter.rb#L626-634
def order_statement(order, qualify)
  statements = order.map do |direction|
    statement = property_to_column_name(direction.target, qualify)
    statement << ' DESC' if direction.operator == :desc
    statement
  end
  statements.join(', ')
end
However, it's possible to override at DB adapter layer:
https://github.com/solnic/dm-mongo-adapter/blob/master/lib/dm-mongo-adapter/query.rb#L260-264
def sort_statement(conditions)
  conditions.inject([]) do |sort_arr, condition|
    sort_arr << [condition.target.field, condition.operator == :asc ? 'ascending' : 'descending']
  end
end
TL;DR:
You don't need to worry about syntax problems if you are only using SQLite, MySQL and PostgreSQL.
For better abstraction, you can try DataMapper.
For this particular case, you could drop the 'ASC' bit, as ordering in all databases is implicitly ascending:
Foo.order(:bar)
I am aware that this doesn't cover the case where you'd want to order by bar desc, but for order by this doesn't matter much unless you are using functions in the order by clause, in which case maybe something like squeel would help.

Resources