Avoid sql injection with connection.execute - ruby-on-rails

If a query can't be efficiently expressed using ActiveRecord, how to safely use ActiveRecord::Base.connection.execute when interpolating passed params attributes?
connection.execute "... #{params[:search]} ..."

You can use the methods in ActiveRecord::Sanitization::ClassMethods.
You do have to be slightly careful as they are protected and therefore only readily available for ActiveRecord::Base subclasses.
Within a model class you could do something like:
class MyModel < ActiveRecord::Base
def bespoke_query(params)
query = sanitize_sql(['select * from somewhere where a = ?', params[:search]])
connection.execute(query)
end
end
You can send the method to try it out on the console too:
> MyModel.send(:sanitize_sql, ["Evening Officer ?", "'Dibble'"])
=> "Evening Officer '\\'Dibble\\''"

ActiveRecord has a sanitize method that allows you to clean the query first.
Perhaps it's something you can look into: http://apidock.com/rails/v4.1.8/ActiveRecord/Sanitization/ClassMethods/sanitize
I'd be very careful inserting parameters directly like that though.
What problem are you experiencing, that you cannot use ActiveRecord?

You can use functions from ActiveRecord::Base to sanitize your sql query. E.g. sanitize_sql_array. As mentioned in other answers they are protected, but that's possible to get around without having to deal with inheritance.
sanitize_sql_array accepts an array of strings where the first element is the query and the subsequent elements will replace ? characters in the query.
query = 'SELECT * FROM users WHERE id = ? OR first_name = ?'
id = 1
name = 'Alice'
sanitized_query = ActiveRecord::Base.send(:sanitize_sql_array, [query, id, name])
response = ActiveRecord::Base.connection.execute(sanitized_query)

Related

Use Ruby's select method on a Rails relation and update it

I have an ActiveRecord relation of a user's previous "votes"...
#previous_votes = current_user.votes
I need to filter these down to votes only on the current "challenge", so Ruby's select method seemed like the best way to do that...
#previous_votes = current_user.votes.select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
But I also need to update the attributes of these records, and the select method turns my relation into an array which can't be updated or saved!
#previous_votes.update_all :ignore => false
# ...
# undefined method `update_all' for #<Array:0x007fed7949a0c0>
How can I filter down my relation like the select method is doing, but not lose the ability to update/save it the items with ActiveRecord?
Poking around the Google it seems like named_scope's appear in all the answers for similar questions, but I can't figure out it they can specifically accomplish what I'm after.
The problem is that select is not an SQL method. It fetches all records and filters them on the Ruby side. Here is a simplified example:
votes = Vote.scoped
votes.select{ |v| v.active? }
# SQL: select * from votes
# Ruby: all.select{ |v| v.active? }
Since update_all is an SQL method you can't use it on a Ruby array. You can stick to performing all operations in Ruby or move some (all) of them into SQL.
votes = Vote.scoped
votes.select{ |v| v.active? }
# N SQL operations (N - number of votes)
votes.each{ |vote| vote.update_attribute :ignore, false }
# or in 1 SQL operation
Vote.where(id: votes.map(&:id)).update_all(ignore: false)
If you don't actually use fetched votes it would be faster to perform the whole select & update on SQL side:
Vote.where(active: true).update_all(ignore: false)
While the previous examples work fine with your select, this one requires you to rewrite it in terms of SQL. If you have set up all relationships in Rails models you can do it roughly like this:
entry = Entry.find(params[:entry_id])
current_user.votes.joins(:challenges).merge(entry.challenge.votes)
# requires following associations:
# Challenge.has_many :votes
# User.has_many :votes
# Vote.has_many :challenges
And Rails will construct the appropriate SQL for you. But you can always fall back to writing the SQL by hand if something doesn't work.
Use collection_select instead of select. collection_select is specifically built on top of select to return ActiveRecord objects and not an array of strings like you get with select.
#previous_votes = current_user.votes.collection_select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
This should return #previous_votes as an array of objects
EDIT: Updating this post with another suggested way to return those AR objects in an array
#previous_votes = current_user.votes.collect {|v| records.detect { v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id}}
A nice approach this is to use scopes. In your case, you can set this up the scope as follows:
class Vote < ActiveRecord::Base
scope :for_challenge, lambda do |challenge_id|
joins(:entry).where("entry.challenge_id = ?", challenge_id)
end
end
Then your code for getting current votes will look like:
challenge_id = Entry.find(params[:entry_id]).challenge_id
#previous_votes = current_user.votes.for_challenge(challenge_id)
I believe you can do something like:
#entry = Entry.find(params[:entry_id])
#previous_votes = Vote.joins(:entry).where(entries: { id: #entry.id, challenge_id: #entry.challenge_id })

Dynamic Method with ActiveRecord, passing in hash of conditions

I am struggling with the best way to meta program a dynamic method, where I'll be limiting results based on conditions... so for example:
class Timeslip < ActiveRecord::Base
def self.by_car_trans(car, trans)
joins(:car)
.where("cars.trans IN (?) and cars.year IN (?) and cars.model ILIKE ?", trans, 1993..2002, car)
.order('et1320')
end
end
Let's say instead of passing in my arguments, i pass in an array of conditions with key being the fieldname, and value being the field value. so for example, I'd do something like this:
i'd pass in [["field", "value", "operator"],["field", "value", "operator"]]
def self.using_conditions(conditions)
joins(:car)
conditions.each do |key, value|
where("cars.#{key} #{operator} ?", value)
end
end
However, that doesn't work, and it's not very flexible... I was hoping to be able to detect if the value is an array, and use IN () rather than =, and maybe be able to use ILIKE for case insensitive conditions as well...
Any advice is appreciated. My main goal here is to have a "lists" model, where a user can build their conditions dynamically, and then save that list for future use. This list would filter the timeslips model based on the associated cars table... Maybe there is an easier way to go about this?
First of all, you might find an interest in the Squeel gem.
Other than that, use arel_table for IN or LIKE predicates :
joins( :car ).where( Car.arel_table[key].in values )
joins( :car ).where( Car.arel_table[key].matches value )
you can detect the type of value to select an adequate predicate (not nice OO, but still):
column = Car.arel_table[key]
predicate = value.respond_to?( :to_str ) ? :in : :matches # or any logic you want
joins( :car ).where( column.send predicate, value )
you can chain as many as those as you want:
conditions.each do |(key, value, predicate)|
scope = scope.where( Car.arel_table[key].send predicate, value )
end
return scope
So, you want dynamic queries that end-users can specify at run-time (and can be stored & retrieved for later use)?
I think you're on the right track. The only detail is how you model and store your criteria. I don't see why the following won't work:
def self.using_conditions(conditions)
joins(:car)
crit = conditions.each_with_object({}) {|(field, op, value), m|
m["#{field} #{op} ?"] = value
}
where crit.keys.join(' AND '), *crit.values
end
CAVEAT The above code as is is insecure and prone to SQL injection.
Also, there's no easy way to specify AND vs OR conditions. Finally, the simple "#{field} #{op} ?", value for the most part only works for numeric fields and binary operators.
But this illustrates that the approach can work, just with a lot of room for improvement.

Searching serialized data, using active record

I'm trying to do a simple query of a serialized column, how do you do this?
serialize :mycode, Array
1.9.3p125 :026 > MyModel.find(104).mycode
MyModel Load (0.6ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`id` = 104 LIMIT 1
=> [43565, 43402]
1.9.3p125 :027 > MyModel.find_all_by_mycode("[43402]")
MyModel Load (0.7ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` = '[43402]'
=> []
1.9.3p125 :028 > MyModel.find_all_by_mycode(43402)
MyModel Load (1.2ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` = 43402
=> []
1.9.3p125 :029 > MyModel.find_all_by_mycode([43565, 43402])
MyModel Load (1.1ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` IN (43565, 43402)
=> []
It's just a trick to not slow your application. You have to use .to_yaml.
exact result:
MyModel.where("mycode = ?", [43565, 43402].to_yaml)
#=> [#<MyModel id:...]
Tested only for MySQL.
Basically, you can't. The downside of #serialize is that you're bypassing your database's native abstractions. You're pretty much limited to loading and saving the data.
That said, one very good way to slow your application to a crawl could be:
MyModel.all.select { |m| m.mycode.include? 43402 }
Moral of the story: don't use #serialize for any data you need to query on.
Serialized array is stored in database in particular fashion eg:
[1, 2, 3, 4]
in
1\n 2\n 3\n etc
hence the query would be
MyModel.where("mycode like ?", "% 2\n%")
put space between % and 2.
Noodl's answer is right, but not entirely correct.
It really depends on the database/ORM adapter you are using: for instance PostgreSQL can now store and search hashes/json - check out hstore. I remember reading that ActiveRecord adapter for PostgreSQl now handles it properly. And if you are using mongoid or something like that - then you are using unstructured data (i.e. json) on a database level everywhere.
However if you are using a db that can't really handle hashes - like MySQL / ActiveRecord combination - then the only reason you would use serialized field is for somet data that you can create / write in some background process and display / output on demand - the only two uses that I found in my experience are some reports ( like a stat field on a Product model - where I need to store some averages and medians for a product), and user options ( like their preferred template color -I really don't need to query on that) - however user information - like their subscription for a mailing list - needs to be searchable for email blasts.
PostgreSQL hstore ActiveRecord Example:
MyModel.where("mycode #> 'KEY=>\"#{VALUE}\"'")
UPDATE
As of 2017 both MariaDB and MySQL support JSON field types.
You can query the serialized column with a sql LIKE statement.
MyModel.where("mycode LIKE '%?%'", 43402)
This is quicker than using include?, however, you cannot use an array as the parameter.
Good news! If you're using PostgreSQL with hstore (which is super easy with Rails 4), you can now totally search serialized data. This is a handy guide, and here's the syntax documentation from PG.
In my case I have a dictionary stored as a hash in an hstore column called amenities. I want to check for a couple queried amenities that have a value of 1 in the hash, I just do
House.where("amenities #> 'wifi => 1' AND amenities #> 'pool => 1'")
Hooray for improvements!
There's a blog post from 2009 from FriendFeed that describes how to use serialized data within MySQL.
What you can do is create tables that function as indexes for any data that you want to search.
Create a model that contains the searchable values/fields
In your example, the models would look something like this:
class MyModel < ApplicationRecord
# id, name, other fields...
serialize :mycode, Array
end
class Item < ApplicationRecord
# id, value...
belongs_to :my_model
end
Creating an "index" table for searchable fields
When you save MyModel, you can do something like this to create the index:
Item.where(my_model: self).destroy
self.mycode.each do |mycode_item|
Item.create(my_model: self, value: mycode_item)
end
Querying and Searching
Then when you want to query and search just do:
Item.where(value: [43565, 43402]).all.map(&:my_model)
Item.where(value: 43402).all.map(&:my_model)
You can add a method to MyModel to make that simpler:
def find_by_mycode(value_or_values)
Item.where(value: value_or_values).all.map(&my_model)
end
MyModel.find_by_mycode([43565, 43402])
MyModel.find_by_mycode(43402)
To speed things up, you will want to create a SQL index for that table.
Using the following comments in this post
https://stackoverflow.com/a/14555151/936494
https://stackoverflow.com/a/15287674/936494
I was successfully able to query a serialized Hash in my model
class Model < ApplicationRecord
serialize :column_name, Hash
end
When column_name holds a Hash like
{ my_data: [ { data_type: 'MyType', data_id: 113 } ] }
we can query it in following manner
Model.where("column_name = ?", hash.to_yaml)
That generates a SQL query like
Model Load (0.3ms) SELECT "models".* FROM "models" WHERE (column_name = '---
:my_data:
- :data_type: MyType
:data_id: 113
')
In case anybody is interested in executing the generated query in SQL terminal it should work, however care should be taken that value is in exact format stored in DB. However there is another easy way I found at PostgreSQL newline character to use a raw string containing newline characters
select * from table_name where column_name = E'---\n:my_data:\n- :data_type: MyType\n :data_id: 113\n'
The most important part in above query is E.
Note: The database on which I executed above is PostgreSQL.
To search serialized list you need to prefix and postfix the data with unique characters.
Example:
Rather than something like:
2345,12345,1234567 which would cause issues you tried to search for 2345 instead, you do something like <2345>,<12345>,<1234567> and search for <2345> (the search query get's transformed) instead. Of course choice of prefix/postfix characters depends on the valid data that will be stored. You might instead use something like ||| if you expect < to be used and potentially| to be used. Of course that increases the data the field uses and could cause performance issues.
Using a trigrams index or something would avoid potential performance issues.
You can serialize it like data.map { |d| "<#{d}>" }.join(',') and deserialize it via data.gsub('<').gsub('>','').split(','). A serializer class would do the job quite well to load/extract tha data.
The way you do this is by setting the database field to text and using rail's serialize model method with a custom lib class. The lib class needs to implement two methods:
def self.dump(obj) # (returns string to be saved to database)
def self.load(text) # (returns object)
Example with duration. Extracted from the article so link rot wouldn't get it, please visit the article for more information. The example uses a single value, but it's fairly straightforward to serialize a list of values and deserialize the list using the methods mentioned above.
class Duration
# Used for `serialize` method in ActiveRecord
class << self
def load(duration)
self.new(duration || 0)
end
def dump(obj)
unless obj.is_a?(self)
raise ::ActiveRecord::SerializationTypeMismatch,
"Attribute was supposed to be a #{self}, but was a #{obj.class}. -- #{obj.inspect}"
end
obj.length
end
end
attr_accessor :minutes, :seconds
def initialize(duration)
#minutes = duration / 60
#seconds = duration % 60
end
def length
(minutes.to_i * 60) + seconds.to_i
end
end
If you have serialized json column and you want to apply like query on that. do it like that
YourModel.where("hashcolumn like ?", "%#{search}%")

Rails, how to sanitize SQL in find_by_sql

Is there a way to sanitize sql in rails method find_by_sql?
I've tried this solution:
Ruby on Rails: How to sanitize a string for SQL when not using find?
But it fails at
Model.execute_sql("Update users set active = 0 where id = 2")
It throws an error, but sql code is executed and the user with ID 2 now has a disabled account.
Simple find_by_sql also does not work:
Model.find_by_sql("UPDATE user set active = 0 where id = 1")
# => code executed, user with id 1 have now ban
Edit:
Well my client requested to make that function (select by sql) in admin panel to make some complex query(joins, special conditions etc). So I really want to find_by_sql that.
Second Edit:
I want to achieve that 'evil' SQL code won't be executed.
In admin panel you can type query -> Update users set admin = true where id = 232 and I want to block any UPDATE / DROP / ALTER SQL command.
Just want to know, that here you can ONLY execute SELECT.
After some attempts I conclude sanitize_sql_array unfortunatelly don't do that.
Is there a way to do that in Rails??
Sorry for the confusion..
Try this:
connect = ActiveRecord::Base.connection();
connect.execute(ActiveRecord::Base.send(:sanitize_sql_array, "your string"))
You can save it in variable and use for your purposes.
I made a little snippet for this that you can put in initializers.
class ActiveRecord::Base
def self.escape_sql(array)
self.send(:sanitize_sql_array, array)
end
end
Right now you can escape your query with this:
query = User.escape_sql(["Update users set active = ? where id = ?", true, params[:id]])
And you can call the query any way you like:
users = User.find_by_sql(query)
Slightly more general-purpose:
class ActiveRecord::Base
def self.escape_sql(clause, *rest)
self.send(:sanitize_sql_array, rest.empty? ? clause : ([clause] + rest))
end
end
This one lets you call it just like you'd type in a where clause, without extra brackets, and using either array-style ? or hash-style interpolations.
User.find_by_sql(["SELECT * FROM users WHERE (name = ?)", params])
Source: http://blog.endpoint.com/2012/10/dont-sleep-on-rails-3-sql-injection.html
Though this example is for INSERT query, one can use similar approach for UPDATE queries. Raw SQL bulk insert:
users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
users_places << "(?,?,?,?)" # Append to array
users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end
bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
ActiveRecord::Base.connection.execute(sql)
rescue
"something went wrong with the bulk insert sql query"
end
Here is the reference to sanitize_sql_array method in ActiveRecord::Base, it generates the proper query string by escaping the single quotes in the strings. For example the punch_line "Don't let them get you down" will become "Don\'t let them get you down".
I prefer to do it with key parameters. In your case it may looks like this:
Model.find_by_sql(["UPDATE user set active = :active where id = :id", active: 0, id: 1])
Pay attention, that you pass ONLY ONE parameter to :find_by_sql method - its an array, which contains two elements: string query and hash with params (since its our favourite Ruby, you can omit the curly brackets).

Rails - find_by_sql - Querying with multiple values for one field

I'm having trouble joining the values for querying multiple values to one column. Here's what I got so far:
def self.showcars(cars)
to_query = []
if !cars.empty?
to_query.push cars
end
return self.find_by_sql(["SELECT * FROM cars WHERE car IN ( ? )"])
end
That makes the query into:
SELECT * FROM cars WHERE car IN (--- \n- \"honda\"\n- \"toyota\"\n')
It seems find_by_sql sql_injection protection adds the extra characters. How do I get this to work?
Do you really need find_by_sql? Since you're performing a SELECT *, and assuming your method resides on the Car model, a better way would be:
class Car < ActiveRecord::Base
def self.showcars(*cars)
where('car in :cars', :cars => cars)
# or
where(:car => cars)
end
end
Note the * right after the parameter name... Use it and you won't need to write code to make a single parameter into an array.
If you really need find_by_sql, try to write it this way:
def self.showcars(*cars)
find_by_sql(['SELECT * FROM cars where car in (?)', cars])
end
Try joining the to_query array into a comma separated string with all values in single quotes, and then passing this string as a parameter "?".
Problem resolve.
def self.average_time(time_init, time_end)
query = <<-SQL
SELECT COUNT(*) FROM crawler_twitters AS twitter WHERE CAST(twitter.publish AS TIME) BETWEEN '#{time_init}' AND '#{time_end}'
GROUP BY user) AS total_tweets_time;
SQL
self.find_by_sql(sanitize_sql(query))
end

Resources