Ecto's fragment allowing SQL injection - erlang

When Ecto queries get more complex and require clauses like CASE...WHEN...ELSE...END, we tend to depend on Ecto's fragment to solve it.
e.g. query = from t in <Model>, select: fragment("SUM(CASE WHEN status = ? THEN 1 ELSE 0 END)", 2)
In fact, the most popular Stack Overflow post on this topic suggests creating a macro like this:
defmacro case_when(condition, do: then_expr, else: else_expr) do
quote do
fragment(
"CASE WHEN ? THEN ? ELSE ? END",
unquote(condition),
unquote(then_expr),
unquote(else_expr)
)
end
end
so you can use it this way in your Ecto queries:
query = from t in <Model>,
select: case_when t.status == 2
do 1
else 0
end
At the same time, in another post, I found this:
(Ecto.Query.CompileError) to prevent SQL injection attacks, fragment(...) does not allow strings to be interpolated as the first argument via the `^` operator, got: `"exists (\n SELECT 1\n FROM #{other_table} o\n WHERE o.column_name = ?)"
Well, it seems Ecto's team figured out that people use fragment to solve complex queries without realizing it can lead to SQL injection, so they don't allow string interpolation there as a way to protect developers.
Then comes another guy who says "don't worry, use macros."
I'm not an Elixir expert, but that seems like a workaround to DO USE string interpolation, escaping the fragment protection.
Is there a way to use fragment and be sure the query was parameterized?

SQL injection, here, would result from using string interpolation with external data. Imagine where: fragment("column = '#{value}'") (instead of the correct where: fragment("column = ?", value)): if value comes from your params (the usual name of the second argument of a Phoenix action, holding the parameters extracted from the HTTP request), then yes, this could result in SQL injection.
But the problem with prepared statements is that you can't substitute a parameter (the ? in the fragment/1 string) with a dynamic piece of SQL (for example, something as simple as an operator), so you don't really have a choice. Say you would like to write fragment("column #{operator} ?", value) because operator is dynamic and depends on conditions: as long as operator doesn't come from the user (it is hardcoded somewhere in your code), it would be safe.
I don't know if you are familiar with PHP (PDO in the following examples), but this is exactly the same as $bdd->query("... WHERE column = '{$_POST['value']}'") (injecting a value by string interpolation) as opposed to $stmt = $bdd->prepare('... WHERE column = ?') followed by $stmt->execute([$_POST['value']]); (a correct prepared statement). Coming back to the dynamic operator: as stated earlier, you can't bind an arbitrary SQL fragment. With > as the operator and 'foo' as the value, the DBMS would interpret "WHERE column ? ?" as (roughly) WHERE column '>' 'foo', which is not syntactically correct. So the easiest way to make this operator dynamic is to write "WHERE column {$operator} ?" (inject it, but only it, by string interpolation or concatenation). If the variable $operator is defined by your own code (e.g. $operator = some_condition ? '>' : '=';), it's fine; if, on the contrary, it involves a superglobal variable that comes from the client, like $_POST or $_GET, this creates a security hole (SQL injection).
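The rule above (interpolate only an operator your own code defines, bind everything else) can be sketched in a few lines. This is plain Ruby with no database or framework involved, chosen to match the other threads on this page; ALLOWED_OPERATORS and build_condition are hypothetical names made up for the illustration, and the column name is assumed to be hardcoded by the caller.

```ruby
# Hypothetical allowlist: the only SQL operators our own code is willing to emit.
ALLOWED_OPERATORS = {
  "eq" => "=",
  "gt" => ">",
  "lt" => "<"
}.freeze

# Returns [sql_fragment, bind_values].
# column is assumed to be hardcoded by the caller, never taken from the user.
def build_condition(column, operator_key, value)
  operator = ALLOWED_OPERATORS.fetch(operator_key) do
    raise ArgumentError, "unsupported operator: #{operator_key.inspect}"
  end
  # Only the allowlisted operator is interpolated; the value stays a placeholder.
  ["#{column} #{operator} ?", [value]]
end

sql, params = build_condition("price", "gt", 10)
puts sql            # price > ?
puts params.inspect # [10]
```

Anything that is not a key of the allowlist raises instead of reaching the SQL string, so a request parameter can select an operator but never supply one.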
TL;DR
Then comes another guy who says "don't worry, use macros."
The answer by Aleksei Matiushkin in the mentioned post is just a workaround for the string interpolation forbidden by fragment/1, used to dynamically inject a known operator. If you reuse this trick (and can't really do otherwise), you'll be fine as long as you don't blindly "inject" any random value coming from the user.
UPDATE:
It seems, after all, that fragment/1 (whose source I didn't inspect) doesn't imply a prepared statement (the ? are not placeholders of a true prepared statement). I tried a simple and deliberately stupid query like the following:
from(
Customer,
where: fragment("lastname ? ?", "LIKE", "%")
)
|> Repo.all()
At least with PostgreSQL/postgrex, the generated query in console appears to be in fact:
SELECT ... FROM "customers" AS c0 WHERE (lastname 'LIKE' '%') []
Note the [] (empty list) of parameters at the end (and the absence of $1 in the query), so it seems to act like the emulation of prepared statements in PHP/PDO, meaning Ecto (or postgrex?) performs proper escaping and injects the values directly into the query. Still, as said above, LIKE became a string (see the ' surrounding it), not an operator, so the query fails with a syntax error.

Related

convert my string to comma based elements

I am working on a legacy Rails project that relies on Ruby version 1.8
I have a string that looks like this:
my_str = "a,b,c"
I would like to convert it to
value_list = "('a','b','c')"
so that I can directly use it in my SQL statement like:
"SELECT * from my_table WHERE value IN #{value_list}"
I tried:
my_str.split(",")
but it returns "abc" :(
How to convert it to what I need?
To split the string you can just do
my_str.split(",")
=> ["a", "b", "c"]
The easiest way to use that in a query, is using where as follows:
Post.where(value: my_str.split(","))
This will just work as expected. But, I understand you want to be able to build the SQL-string yourself, so then you need to do something like
quoted_values_str = my_str.split(",").map{|x| "'#{x}'"}.join(",")
=> "'a','b','c'"
sql = "SELECT * from my_table WHERE value IN (#{quoted_values_str})"
Note that this is a naive approach: normally you should also escape any quotes contained in your strings, and skipping that leaves you vulnerable to SQL injection. Using where will handle all those edge cases correctly for you.
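To make the quote problem concrete, here is a minimal sketch in plain Ruby. It assumes a database that follows standard SQL string quoting, where an embedded single quote is escaped by doubling it; quote_sql_string and in_list are hypothetical helpers written for this illustration, not Rails methods, and in a real app the adapter's own quoting is the safer choice.

```ruby
# Hypothetical helper: quote a value as a standard SQL string literal by
# doubling embedded single quotes. Real code should rely on the adapter's quoting.
def quote_sql_string(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Turns "a,b,c" into "('a','b','c')", quoting each element on the way.
def in_list(csv)
  "(" + csv.split(",").map { |v| quote_sql_string(v) }.join(",") + ")"
end

puts in_list("a,b,c")    # ('a','b','c')
puts in_list("a,O'Hara") # ('a','O''Hara')
```

Without the doubling step, the second input would close the string literal early and let the rest of the value be parsed as SQL.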
Under no circumstances should you reinvent the wheel for this. Rails has built-in methods for constructing SQL strings, and you should use them. In this case, you want sanitize_sql_for_conditions (aliased as sanitize_sql):
my_str = "a,b,c"
conditions = sanitize_sql(["value IN (?)", my_str.split(",")])
# => value IN ('a','b','c')
query = "SELECT * from my_table WHERE #{conditions}"
This will give you the result you want while also protecting you from SQL injection attacks (and other errors related to badly formed SQL).
The correct usage may depend what version of Rails you're using, but this method exists as far back as Rails 2.0 so it will definitely work even with a legacy app; just consult the docs for the version of Rails you're using.
value_list = "('#{my_str.split(",").join("','")}')"
But this is a very bad way to query. You'd better use:
Model.where(value: my_str.split(","))
The string can be manipulated directly; there is no need to convert it to an array, modify the array then join the elements.
str = "a,b,c"
"(%s)" % str.gsub(/([^,]+)/, "'\\1'")
#=> "('a','b','c')"
The regular expression reads, "match one or more characters other than commas and save them to capture group 1". \\1 retrieves the contents of capture group 1 when forming gsub's replacement string.
couple of use cases:
def full_name
[last_name, first_name].join(' ')
end
or
def address_line
[address[:country], address[:city], address[:street], address[:zip]].join(', ')
end

Rails ActiveRecord sanitize_sql replaces ? in string

I have a plain SQL query written by a trusted administrator that is to be run in a Rails (4.2) app. I am sanitizing it with ActiveRecord::Base.send(:sanitize_sql, ...) to allow user inputs to act as conditions, using the ? character for bind variables. The code has to allow arbitrary SQL, so I'm not interested in the arguments about why this is not the Rails way, etc.
The problem is that I can not include ? in a result field in the SQL without the underlying replace_bind_variables method replacing an intended literal ? in the result.
A simple query for example would be:
select 'http://www.google.com?q=' || res from some_table where a = ?;
To sanitize:
ActiveRecord::Base.send(:sanitize_sql, [sql, 'not me'], :some_table)
The sanitization fails because the ? in the URL gets replaced with the data intended for the condition, leading to the exception:
ActiveRecord::PreparedStatementInvalid: wrong number of bind variables (1 for 2)
The question is, does sanitize_sql or some variant allow literal ? characters to be included in a query so that they are not replaced? Is there some way of escaping them?
In the end I read through the ActiveRecord source and couldn't identify a way to handle this situation without a lot of code changes. There doesn't appear to be a way to escape the ? characters.
To resolve it for this one query I ended up using the SQL chr() function to generate a character that would pass the sanitization step untouched:
select 'http://www.google.com' || chr(63) || 'q=' || res from some_table where a = ?;
ASCII character 63 is ?.
Although not a perfect solution, I could at least get this one SQL query into the system without having to make massive code changes.

Arbitrary-length LIKE clause in Ruby on Rails ActiveRecord

I'm attempting to write a Ruby method which accepts an array of strings (for example, ["EG", "K", "C"]), and returns all records from a database table where the icao_code field starts with any of those strings (for example, KORD, EGLL, and CYVR would all match). The length of the array will vary, and it will be input by a user, so it needs to be sanitized.
If I were only searching for a single string, I could do something like Airport.where("icao_code LIKE ?", "#{icao_start}%"). However, since I need to search against an arbitrary number of strings, I can't use that syntax.
Right now, I've got it working as follows:
def in_region(icao_starts)
where_clause = icao_starts.map{|i| "icao_code LIKE '#{i}%'"}.join(" OR ")
return Airport.where(where_clause)
end
However, I'm a bit worried using a setup like this with untrusted user input, since I suspect it would be vulnerable to SQL injection.
Is there a better way to get the same result in a more secure way?
You could consider something like this:
def in_region(icao_starts)
where_clause = "icao_code LIKE ? OR " * icao_starts.length
return Airport.where(where_clause.sub(/\ OR\ $/, ''), *icao_starts.map { |i| "#{i}%" })
end
This will build up a (potentially very long?) string with ? placeholders. The splat (*) expands the array of prefixes, each with the % wildcard appended, into arguments to the where clause, so each ? will end up getting safely replaced. The sub(/\ OR\ $/, '') simply trims off the final OR (you could append 1=0 instead if you wanted).
If I were you I would also perform a .uniq on icao_starts before you do anything, truncate the array at some sensible upper length limit, and also have a whitelist of permitted values (oh, forget that, I thought users were searching by airport code). That should be pretty much infallible.
You are right about not interpolating user input into your SQL query. This is dangerous and makes your code vulnerable to SQL injection attacks.
def in_region(icao_starts)
conditions = icao_starts.map { "icao_code LIKE ?"}
Airport.where(conditions.join(' OR '), *icao_starts.map { |name| "#{name}%"})
end
It is pretty similar to the solution of bogardpd but does not use a Regexp to get rid of the last " OR"
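One detail the placeholder approach does not cover is that % and _ inside a user-supplied prefix still act as LIKE wildcards. The sketch below builds the same clause in plain Ruby and escapes those characters first. It assumes backslash is the LIKE escape character (the default in PostgreSQL and MySQL); escape_like and prefix_conditions are hypothetical names, and newer Rails versions ship a sanitize_sql_like helper that covers the escaping step.

```ruby
# Hypothetical helper: escape LIKE wildcards so user input only matches literally.
# Assumes backslash is the LIKE escape character.
def escape_like(text)
  text.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Returns [sql_fragment, *bind_values], ready to splat into where(...).
def prefix_conditions(column, prefixes)
  # An empty list should match nothing rather than produce an empty clause.
  return ["1=0"] if prefixes.empty?

  clause = prefixes.map { "#{column} LIKE ?" }.join(" OR ")
  [clause, *prefixes.map { |prefix| "#{escape_like(prefix)}%" }]
end

p prefix_conditions("icao_code", ["EG", "K", "C"])
```

In the model this would be used as Airport.where(*prefix_conditions("icao_code", icao_starts)), with the column name hardcoded by the caller.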

Safely pass a dynamic column name into an ActiveRecord query with a Postgres cast?

I do a lot of time-without-date querying in my app, and I would like to abstract some of the queries away.
So say I have a model with a DateTime starts_at field:
Shift.where('starts_at::time > ?', '20:31:00.00')
-> SELECT "shifts".* FROM "shifts" WHERE (starts_at::time > '20:31:00.00')
This correctly returns all of the 'starts_at' values greater than the time 20:31.
I want to dynamically pass in the column name into the query, so I can do something like:
Shift.where('? > ?', "#{column_name}::time", '20:31:00.00')
-> SELECT "shifts".* FROM "shifts" WHERE ('starts_at::time' > '20:31:00.00')
In this example, this does not work because the query treats starts_at::time as a string, not as a column with the time cast.
How can I safely pass in column_name into a query with the ::time cast? While this will not accept user input, I would still like to ensure SQL injection is accounted for.
This is more complicated than you might think at first because identifiers (column names, table names, ...) and values ('pancakes', 6, ...) are very different things in SQL that have different quoting rules and even quote characters (single quotes for strings, double quotes for identifiers in standard SQL, backticks for identifiers in MySQL, brackets for identifiers in SQL-Server, ...). If you think of identifiers like Ruby variable names and values like, well, literal Ruby values then you can start to see the difference.
When you say this:
where('? > ?', ...)
both placeholders will be treated as values (not identifiers) and quoted as such. Why is this? ActiveRecord has no way of knowing which ? should be an identifier (such as the created_at column name) and which should be a value (such as 20:31:00.00).
The database connection does have a method specifically for quoting column names though:
> puts ActiveRecord::Base.connection.quote_column_name('pancakes')
"pancakes"
=> nil
so you can say things like:
quoted_column = Shift.connection.quote_column_name(column_name)
Shift.where("#{quoted_column}::time > ?", '20:31:00.00')
This is a little unpleasant because we recoil (or at least we should) at using string interpolation to build SQL. However, quote_column_name will take care of anything dodgy or unsafe in column_name so this isn't actually dangerous.
You could also say:
quoted_column = "#{Shift.connection.quote_column_name(column_name)}::time"
Shift.where("#{quoted_column} > ?", '20:31:00.00')
if you didn't always need to cast the column name to a time. Or even:
clause = "#{Shift.connection.quote_column_name(column_name)}::time > ?"
Shift.where(clause, '20:31:00.00')
You could also use extract or one of the other date/time functions instead of a typecast but you'd still be left with the quoting problem and the somewhat cringeworthy quote_column_name call.
Another option would be to whitelist column_name so that only specific valid values would be allowed. Then you could throw the safe column_name right into the query:
if(!in_the_whitelist(column_name))
# Throw a tantrum, hissy fit, or complain in your preferred fashion
end
Shift.where("#{column_name} > ?", '20:31:00.00')
This should be fine as long as you don't have any funky column names like "gotta have some breakfast" or similar things that always need to be properly quoted. You could even use Shift.column_names or Shift.columns to build your whitelist.
Using both a whitelist and then quote_column_name would probably be the safest but the quote_column_name method should be sufficient.
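The combined approach (whitelist first, then quote) might look like the sketch below. It is plain Ruby so it runs without a database: ALLOWED_COLUMNS is a hypothetical stand-in for something like Shift.column_names, and quote_identifier only mimics what quote_column_name does on PostgreSQL (wrap in double quotes, double any embedded ones). In a real app you would call the connection's quote_column_name instead.

```ruby
# Hypothetical stand-in for Shift.column_names.
ALLOWED_COLUMNS = %w[starts_at ends_at].freeze

# Mimics PostgreSQL identifier quoting: wrap in double quotes and
# double any double quote embedded in the name.
def quote_identifier(name)
  '"' + name.to_s.gsub('"', '""') + '"'
end

# Builds the WHERE fragment; the time value stays a bound ? placeholder.
def time_cast_clause(column_name)
  unless ALLOWED_COLUMNS.include?(column_name.to_s)
    raise ArgumentError, "unknown column: #{column_name.inspect}"
  end
  "#{quote_identifier(column_name)}::time > ?"
end

puts time_cast_clause("starts_at") # "starts_at"::time > ?
```

The whitelist rejects anything unexpected outright, and the quoting step keeps the clause valid even for column names that need quoting.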
I decided to use this small solution, taking advantage of Rails column naming conventions:
scope :field_before_and_on_date, -> (field, time) do
column_name = field.to_s.parameterize.underscore
where("#{column_name} <= ?", time.end_of_day)
end
# Takes advantage of:
> "); and delete everything(); stuff(".parameterize.underscore
=> "and_delete_everything_stuff"
It's limited but the concept would work for a type cast too.

How do I use TADOQuery.Parameters with integer parameter types that have to be put in two or more places in a query?

I have a complex query that contains more than one place where the same primary key value must be substituted. It looks like this:
select Foo.Id,
Foo.BearBaitId,
Foo.LinkType,
Foo.BugId,
Foo.GooNum,
Foo.WorkOrderId,
(case when Goo.ZenID is null or Goo.ZenID=0 then
IsNull(dbo.EmptyToNull(Bar.FanName),dbo.EmptyToNull(Bar.BazName))+' '+Bar.Strength else
'#'+BarZen.Description end) as Description,
Foo.Init,
Foo.DateCreated,
Foo.DateChanged,
Bug.LastName,
Bug.FirstName,
Goo.BarID,
(case when Goo.ZenID is null or Goo.ZenID=0 then
IsNull(dbo.EmptyToNull(Bar.BazName),dbo.EmptyToNull(Bar.FanName))+' '+Bar.Strength else
'#'+BarZen.Description end) as BazName,
GooTracking.Status as GooTrackingStatus
from
Foo
inner join Bug on (Foo.BugId=Bug.Id)
inner join Goo on (Foo.GooNum=Goo.GooNum)
left join Bar on (Bar.Id=Goo.BarID)
left join BarZen on (Goo.ZenID=BarZen.ID)
inner join GooTracking on(Goo.GooNum=GooTracking.GooNum )
where (BearBaitId = :aBaitid)
UNION
select Foo.Id,
Foo.BearBaitId,
Foo.LinkType,
Foo.BugId,
Foo.GooNum,
Foo.WorkOrderId,
Foo.Description,
Foo.Init,
Foo.DateCreated,
Foo.DateChanged,
Bug.LastName,
Bug.FirstName,
0,
NULL,
0
from Foo
inner join Bug on (Foo.BugId=Bug.Id)
where (LinkType=0) and (BearBaitId= :aBaitid )
order by BearBaitId,LinkType desc, GooNum
When I try to use an integer parameter on this non-trivial query, it seems impossible to me. I get this error:
Error
Incorrect syntax near ':'.
The query works fine if I take out the :aBaitid and substitute a literal 1.
Is there something else I can do to this query above? When I test with simple tests like this:
select * from foo where id = :anid
These simple cases work fine. The component is TADOQuery, and it works fine until you add any :parameters to the SQL string.
Update: when I use the following code at runtime, the parameter substitutions are actually done (some glitch in the ADO components is worked around) and a different error surfaces:
adoFooContentQuery.Parameters.FindParam('aBaitId').Value := 1;
adoFooContentQuery.Active := true;
Now the error changes to:
Incorrect syntax near the keyword 'inner'.
Note again, that this error goes away if I simply stop using the parameter substitution feature.
Update2: The accepted answer suggests I have to find two different copies of the parameter with the same name, which bothered me so I reworked the query like this:
DECLARE @aVar int;
SET @aVar = :aBaitid;
SELECT ....(long query here)
Then I used @aVar throughout the script where needed, to avoid the repeated use of :aBaitId. (If the number of times the parameter value is used changes, I don't want to have to find all parameters matching a name, and replace them).
I suppose a helper-function like this would be fine too: SetAllParamsNamed(aQuery:TAdoQuery; aName:String;aValue:Variant)
FindParam only finds one parameter, while you have two with the same name. Delphi dataset adds each parameter as a separate one to its collection of parameters.
It should work if you loop through all parameters, check if the name matches, and set the value of each one that matches, although I normally choose to give each same parameter a follow-up number to distinguish between them.

Resources