Rails 4 string interpolation in raw SQL request

What would be the best way of rewriting this query without interpolation?
def case_joins(type)
subquery = <<-SQL.squish
SELECT id FROM cases c2
WHERE c2.title_id = titles.id AND c2.value = 0 AND c2.type = '#{type}'
ORDER BY c2.created_at DESC LIMIT 1
SQL
"LEFT OUTER JOIN cases ON cases.title_id = titles.id AND cases.value = 0 AND cases.type = '#{type}' AND cases.id = (#{subquery})"
end

I'm assuming that you want to avoid interpolating variables, which is dangerous because it's open to SQL injection. I would simply join onto the cases selected from the subquery instead of putting the subquery into the WHERE conditions. This does involve interpolation, but only of AR-generated SQL. I would also implement it as a scope to leverage AR scope chaining:
class Title < ActiveRecord::Base
def self.case_joins(type)
case_query = Case.from("cases c").where(c: {title_id: title_id, value: 0, type: type}).order('c.created_at DESC').limit(1)
joins("LEFT OUTER JOIN (#{case_query.to_sql}) cases ON cases.title_id = titles.id")
end
end
This way, you can chain the scope to others like so:
Title.where(attribute1: value1).case_joins("typeA")
(Note that I removed the superfluous WHERE conditions in the outer SELECT.)

It's difficult to infer what the rest of your code looks like, but I presume titles is being used in a query further up your call stack.
If you were to use ActiveRecord instead of native SQL, you could do something like this:
def case_joins(scope, title_id, type)
ids = Case.where(title_id: title_id, value: 0, type: type)
.order('created_at desc').limit(1).pluck(:id)
scope.joins('left outer join cases on cases.title_id = titles.id')
.where(value: 0, type: type, id: ids)
end
scope here is the current AR query you are modifying.
This is off the top of my head, so I'm not sure if the AR syntax above is correct, but it does avoid the need to interpolate SQL and also uses scoping.
To be honest, though, it's not all that much more readable than native SQL, and so YMMV. It does at least mean that (apart from the join) you're not encoding SQL in your code.

Here is a modification of @eirikir's answer that works the same way as the method in the question.
def case_joins(type)
case_query = Case.from("cases c").where('c.title_id = titles.id AND c.value = 0 AND c.type = ?', type).order('c.created_at DESC').select(:id).limit(1)
"LEFT OUTER JOIN cases ON cases.title_id = titles.id AND cases.id = (#{case_query.to_sql})"
end

Related

Selecting MIN and MAX for a collection, based on an ActiveRecord scope, in one SQL query

I would like to run the following query:
/* Inside a heredoc */
SELECT
MIN("books"."page_count") AS min,
MAX("books"."page_count") AS max
FROM "books"
WHERE "books"."author_id" IN (#{authors.pluck(:id).map { |id| "'#{id}'" }.join(",")})
AND "books"."publisher_id" IN (#{publishers.pluck(:id).map { |id| "'#{id}'" }.join(",")});
But instead of having to manually write the WHERE clauses at the end with that ugly interpolation, I'd like to use scopes I've defined on Book, something like:
query = Book.by_author(authors).by_publisher(publishers).select('MIN("books"."page_count") as min, MAX("books"."page_count") as max').first
attrs = [query["min"], query["max"]]
I understand that this doesn't work because Book.my_scope... expects to return a collection of books, while I'm looking for numeric values.
I am aware I could do e.g.:
query = Book.by_author(authors).by_publisher(publishers)
attrs = [query.minimum(:page_count), query.maximum(:page_count)]
But this results in two SQL queries, which seems quite unnecessary.
Is there a Rails-y way I can do this while keeping the flexibility of chainable scopes, without heredoc interpolation, and in one SQL query?
You could define it as a class method:
class Book
  class << self
    def minmax_page_count
      count = select('MIN(page_count) OVER (), MAX(page_count) OVER ()').take
      [count.min, count.max]
    end
  end
end
Book.by_author(authors).by_publisher(publishers).minmax_page_count # => [1, 2]

Ruby on Rails SQL Injection - Building a query

I'm resolving all the SQL Injections in a system and I've found something that I don't know how to treat.
Can somebody help me?
Here is my method
def get_structure()
#build query
sql = %(
SELECT pc.id AS "product_id", pc.code AS "code", pc.description AS "description", pc.family AS "family",
p.code AS "father_code", p.description AS "father_description",
p.family AS "father_family"
FROM products pc
LEFT JOIN imported_structures imp ON pc.id = imp.product_id
LEFT JOIN products p ON imp.product_father_id = p.id
WHERE pc.enable = true AND p.enable = true
)
#verify if there is any filter
if !params[:code].blank?
sql = sql + " AND UPPER(pc.code) LIKE '%#{params[:code].upcase}%'"
end
#many other parameters like the one above
#execute query
str = ProductStructure.find_by_sql(sql)
end
Thank you!
You could use Arel, which will escape for you and is the underlying query builder for ActiveRecord/Rails. For example:
products = Arel::Table.new("products")
products2 = Arel::Table.new("products", as: 'p')
imported_structs = Arel::Table.new("imported_structures")
query = products.project(
products[:id].as('product_id'),
products[:code],
products[:description],
products[:family],
products2[:code].as('father_code'),
products2[:description].as('father_description'),
products2[:family].as('father_family')).
join(imported_structs,Arel::Nodes::OuterJoin).
on(imported_structs[:product_id].eq(products[:id])).
join(products2,Arel::Nodes::OuterJoin).
on(products2[:id].eq(imported_structs[:product_father_id])).
where(products[:enable].eq(true).and(products2[:enable].eq(true)))
if !params[:code].blank?
query.where(
Arel::Nodes::NamedFunction.new('UPPER',[products[:code]])
.matches("%#{params[:code].to_s.upcase}%")
)
end
SQL result: (with params[:code] = "' OR 1=1 --test")
SELECT
[products].[id] AS product_id,
[products].[code],
[products].[description],
[products].[family],
[p].[code] AS father_code,
[p].[description] AS father_description,
[p].[family] AS father_family
FROM
[products]
LEFT OUTER JOIN [imported_structures] ON [imported_structures].[product_id] = [products].[id]
LEFT OUTER JOIN [products] [p] ON [p].[id] = [imported_structures].[product_father_id]
WHERE
[products].[enable] = true AND
[p].[enable] = true AND
UPPER([products].[code]) LIKE N'%'' OR 1=1 --test%'
To use
ProductStructure.find_by_sql(query.to_sql)
I prefer Arel, when available, over String queries because:
it supports escaping
it leverages your existing connection adapter for syntax (so it is portable if you change databases)
it is built in code so statement order does not matter
it is far more dynamic and maintainable
it is natively supported by ActiveRecord
you can build any complex query you can possibly imagine (including complex joins, CTEs, etc.)
it is still very readable
You need to turn that into a placeholder value (?) and add the data as a separate argument. find_by_sql can take an array:
def get_structure
#build query
sql = %(SELECT...)
query = [ sql ]
if !params[:code].blank?
sql << " AND UPPER(pc.code) LIKE ?"
query << "%#{params[:code].upcase}%"
end
str = ProductStructure.find_by_sql(query)
end
Note, use << on String in preference to += when you can as it avoids making a copy.
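The placeholder approach extends to several optional filters. The following is a minimal, framework-free sketch of that idea: it only builds the [sql, *binds] array that find_by_sql accepts. The helper names base_sql, escape_like and build_structure_query, the :family filter, and the shortened base SQL are assumptions for illustration, and the wildcard escaping assumes the database treats backslash as the LIKE escape character. Newer Rails versions provide sanitize_sql_like for that escaping step.

```ruby
# Hypothetical base query, shortened for illustration.
def base_sql
  "SELECT pc.* FROM products pc WHERE pc.enable = true"
end

# Escape LIKE wildcards so user input is matched literally.
# Assumes backslash is the LIKE escape character in the target database.
def escape_like(value)
  value.to_s.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Returns [sql, *binds]; user input never becomes part of the SQL text.
def build_structure_query(params)
  sql = base_sql.dup
  binds = []

  unless params[:code].to_s.empty?
    sql << " AND UPPER(pc.code) LIKE ?"
    binds << "%#{escape_like(params[:code]).upcase}%"
  end

  unless params[:family].to_s.empty?
    sql << " AND pc.family = ?"
    binds << params[:family].to_s
  end

  [sql, *binds]
end

query = build_structure_query(code: "' OR 1=1 --", family: "")
puts query.inspect
```

The resulting array could then be passed along as ProductStructure.find_by_sql(query).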

Increase performance: avoid looking for the right element in a collection

I have this situation.
activity.rb
belongs_to :user
belongs_to :cause
belongs_to :sub_cause
belongs_to :client
def amount
duration / 60.0 * user.hourly_cost_by_year(date.year).amount rescue 0
end
user.rb
has_many :hourly_costs # one hourly_cost for year
has_many :activities
def hourly_cost_by_year(year = Date.today.year)
hourly_costs.find { |hc| hc.year == year }
end
hourly_cost.rb
belongs_to :user
I have a big report where I achieved good performance (the number of SQL queries is fixed) but I think I could do better. The query I use is
activities = Activity.includes(:client, :cause, :sub_cause, user: :hourly_costs)
This is fine and fast, but I think it can be improved because of the hourly_cost_by_year method. Each activity has a date, and I could use that date to pick the right hourly cost directly. Something like this in Activity:
def self.user_with_single_hourly_cost
joins('LEFT JOIN users u ON u.id = activities.user_id').
joins('LEFT JOIN hourly_costs hc ON hc.user_id = u.id AND hc.year = EXTRACT(year from activities.date)')
end
But I don't know how to integrate this into my query. Whatever I tried did not work. I could use raw SQL, but I'm trying to use ActiveRecord. I even thought of using Redis to cache every hourly cost by user and year, which could work, but I think this query, with the EXTRACT part, should do the best job because I'd have a flat table.
Update: I try to clarify. Whatever query I use in my action at some point I have to do
activities.sum(&:amount)
and that method, you know, is
def amount
duration / 60.0 * user.hourly_cost_by_year(date.year).amount rescue 0
end
And I don't know how to pick the hourly_cost I want directly, without searching through hourly_costs. Is this possible?
You may consider using Arel for this. Arel is the underlying query assembler for rails/activerecord (so no new dependencies) and can be very useful when building complex queries because it offers far more depth than the high level ActiveRecord::QueryMethods.
Obviously with a broader API comes more verbosity (which actually adds quite a bit to the readability) and less syntactical sugar which takes some getting used to but has proven indispensable for me on multiple occasions.
While I did not take the time to recreate your data structure something like this may work for you
activities = Activity.arel_table
users = User.arel_table
hourly_costs = HourlyCost.arel_table
activity_users_hourly_cost = activities
.join(users,Arel::Nodes::OuterJoin)
.on(activities[:user_id].eq(users[:id]))
.join(hourly_costs,Arel::Nodes::OuterJoin)
.on(hourly_costs[:user_id].eq(users[:id])
.and(hourly_costs[:year].eq(Arel::Nodes::Extract.new(activities[:date],'year'))
)
)
Activity.includes(:client, :cause, :sub_cause).joins(activity_users_hourly_cost.join_sources)
This will add the requested join e.g.
activity_users_hourly_cost.to_sql
#=> SELECT
FROM [activities]
LEFT OUTER JOIN [users] ON [activities].[user_id] = [users].[id]
LEFT OUTER JOIN [hourly_costs] ON [hourly_costs].[user_id] = [users].[id]
AND [hourly_costs].[year] = EXTRACT(YEAR FROM [activities].[date])
Update
If you just want to add the "hourly_cost" this should work for you
Activity.includes(:client, :cause, :sub_cause)
.joins(activity_users_hourly_cost.join_sources)
.select("activities.*, activities.duration / 60.0 * ISNULL([hourly_costs].[amount],0) as hourly_cost_by_year")
Please note that this will only return Activity objects but they will now have a method called hourly_cost_by_year which will return the result of that calculation. Full SQL will look like
SELECT
[activities].*,
activities.duration / 60.0 * ISNULL([hourly_costs].[amount],0) as hourly_cost_by_year
FROM [activities]
-- Dependent upon WHERE clause
LEFT OUTER JOIN causes ON [activities].[cause_id] = [causes].[id]
LEFT OUTER JOIN sub_causes ON [activities].[sub_cause_id] = [sub_causes].[id]
LEFT OUTER JOIN clients ON [activities].[client_id] = [clients].[id]
--
LEFT OUTER JOIN [users] ON [activities].[user_id] = [users].[id]
LEFT OUTER JOIN [hourly_costs] ON [hourly_costs].[user_id] = [users].[id]
AND [hourly_costs].[year] = EXTRACT(YEAR FROM [activities].[date])
You could build the select portion in Arel too if you like but seems overkill for such a simple statement.
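If the goal is only to avoid scanning hourly_costs for every activity, a different technique from the SQL join above is to index the preloaded costs once in Ruby and look them up by key. This is a plain-Ruby sketch, not ActiveRecord code: the Struct types stand in for the models, the sample figures are made up, and index_costs and amount_for are hypothetical helper names.

```ruby
# Stand-ins for the ActiveRecord models (illustration only).
hourly_cost = Struct.new(:user_id, :year, :amount)
activity = Struct.new(:user_id, :year, :duration)

# Build the lookup table once: [user_id, year] => hourly amount.
def index_costs(costs)
  costs.each_with_object({}) do |cost, index|
    index[[cost.user_id, cost.year]] = cost.amount
  end
end

# Same formula as Activity#amount; a missing cost counts as 0.
def amount_for(act, cost_index)
  act.duration / 60.0 * cost_index.fetch([act.user_id, act.year], 0)
end

costs = [hourly_cost.new(1, 2014, 30.0), hourly_cost.new(1, 2015, 40.0)]
activities = [activity.new(1, 2015, 90), activity.new(1, 2013, 60)]

cost_index = index_costs(costs)
total = activities.sum { |act| amount_for(act, cost_index) }
puts total
```

Each lookup is a single hash access, so the cost of summing no longer grows with the number of hourly costs per user.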

Avoid sql injection with connection.execute

If a query can't be efficiently expressed using ActiveRecord, how to safely use ActiveRecord::Base.connection.execute when interpolating passed params attributes?
connection.execute "... #{params[:search]} ..."
You can use the methods in ActiveRecord::Sanitization::ClassMethods.
You do have to be slightly careful as they are protected and therefore only readily available for ActiveRecord::Base subclasses.
Within a model class you could do something like:
class MyModel < ActiveRecord::Base
def bespoke_query(params)
query = sanitize_sql(['select * from somewhere where a = ?', params[:search]])
connection.execute(query)
end
end
You can send the method to try it out on the console too:
> MyModel.send(:sanitize_sql, ["Evening Officer ?", "'Dibble'"])
=> "Evening Officer '\\'Dibble\\''"
ActiveRecord has a sanitize method that allows you to clean the query first.
Perhaps it's something you can look into: http://apidock.com/rails/v4.1.8/ActiveRecord/Sanitization/ClassMethods/sanitize
I'd be very careful inserting parameters directly like that though.
What problem are you experiencing, that you cannot use ActiveRecord?
You can use functions from ActiveRecord::Base to sanitize your sql query. E.g. sanitize_sql_array. As mentioned in other answers they are protected, but that's possible to get around without having to deal with inheritance.
sanitize_sql_array accepts an array whose first element is the query; the subsequent elements replace the ? characters in the query, in order.
query = 'SELECT * FROM users WHERE id = ? OR first_name = ?'
id = 1
name = 'Alice'
sanitized_query = ActiveRecord::Base.send(:sanitize_sql_array, [query, id, name])
response = ActiveRecord::Base.connection.execute(sanitized_query)
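To see why placeholders defuse injection attempts, here is a toy, framework-free sketch of the substitution step. It is not Rails' implementation, and real code should rely on the adapter's quoting through sanitize_sql_array. The function names quote_value and bind_placeholders are hypothetical, and the quoting rule (double every single quote) covers standard SQL string literals only.

```ruby
# Toy quoting: integers pass through, everything else becomes a SQL
# string literal with embedded single quotes doubled.
def quote_value(value)
  return value.to_s if value.is_a?(Integer)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Replace each ? in order with the quoted form of the matching value.
def bind_placeholders(sql, values)
  remaining = values.dup
  sql.gsub("?") do
    raise ArgumentError, "too few bind values" if remaining.empty?
    quote_value(remaining.shift)
  end
end

query = "SELECT * FROM users WHERE id = ? OR first_name = ?"
puts bind_placeholders(query, [1, "x' OR '1'='1"])
```

The attacker's quote characters end up inside the string literal instead of terminating it, so the OR clause is treated as data rather than SQL.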

Ordering by specific value in Activerecord

In Ruby on Rails, I'm trying to order the matches of a player by whether the current user is the winner.
The sort order would be:
Sort by whether the current user is the winner
Then sort by created_at, etc.
I can't figure out how to do the equivalent of :
Match.all.order('winner_id == ?', @current_user.id)
I know this line is not syntactically correct but hopefully it expresses that the order must be:
1) The matches where the current user is the winner
2) the other matches
You can use a CASE expression in an SQL ORDER BY clause. However, AR doesn't believe in using placeholders in an ORDER BY so you have to do nasty things like this:
by_owner = Match.send(:sanitize_sql_array, [ 'case when winner_id = %d then 0 else 1 end', @current_user.id ])
Match.order(by_owner).order(:created_at)
That should work the same in any SQL database (assuming that your @current_user.id is an integer, of course).
You can make it less unpleasant by using a class method as a scope:
class Match < ActiveRecord::Base
  def self.this_person_first(id)
    by_owner = sanitize_sql_array([ 'case when winner_id = %d then 0 else 1 end', id])
    order(by_owner)
  end
end
# and later...
Match.this_person_first(@current_user.id).order(:created_at)
to hide the nastiness.
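The %d approach is only safe because the value is an integer. Below is a minimal plain-Ruby sketch that makes that assumption explicit before the fragment is built. winner_first_order is a hypothetical helper name; Integer(..., 10) is used for strict decimal conversion, so non-numeric input raises instead of reaching the SQL.

```ruby
# Build the ORDER BY fragment only from a value that is strictly an integer.
def winner_first_order(user_id)
  # Raises ArgumentError for input such as "1; DROP TABLE matches".
  id = Integer(user_id.to_s, 10)
  format("CASE WHEN winner_id = %d THEN 0 ELSE 1 END", id)
end

puts winner_first_order("42")

begin
  winner_first_order("1; DROP TABLE matches")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

In a model, the returned string could feed order(...) as in the class method above; newer Rails versions may additionally require wrapping such a raw fragment in Arel.sql.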
This can be achieved using Arel without writing any raw SQL!
matches = Match.arel_table
Match
.order(matches[:winner_id].eq(@current_user.id).desc)
.order(created_at: :desc)
Works for me with Postgres 12 / Rails 6.0.3 without any security warning
If you want to do sorting on the ruby side of things (instead of the SQL side), then you can use the Array#sort_by method:
query.sort_by { |a| a.winner_id == @current_user.id ? 0 : 1 }
If you're dealing with bigger queries, then you should probably stick to the SQL side of things.
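Ruby's true and false are not comparable with each other, so a sort key for sort_by needs to map the boolean to a number. A small self-contained sketch, with a Struct standing in for the Match model and made-up sample data (created_at is simplified to an integer):

```ruby
# Stand-in for the Match model (illustration only).
match = Struct.new(:id, :winner_id, :created_at)

matches = [
  match.new(1, 7, 3),
  match.new(2, 5, 1),
  match.new(3, 5, 2),
  match.new(4, 9, 0)
]
current_user_id = 5

# Array keys sort element by element: the current user's wins first
# (0 before 1), then by created_at within each group.
sorted = matches.sort_by do |m|
  [m.winner_id == current_user_id ? 0 : 1, m.created_at]
end

puts sorted.map(&:id).inspect
```

Using an array as the key covers both sort criteria from the question in a single pass.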
I would build a query and then execute it after it's built (mostly because you may not have @current_user). So, something like this:
query = Match.scoped
query = query.order("winner_id == ?", @current_user.id) if @current_user.present?
query = query.order("created_at")
@results = query.all
