This is the first time I've seen this issue. I'm building up an SQL array to run through sanitize_sql_array and Rails is adding extra, unnecessary single quotes in the return value. So instead of returning:
SELECT DISTINCT data -> 'Foo' from products
it returns:
SELECT DISTINCT data -> ''Foo'' from products
which of course Postgres doesn't like.
Here is the code:
sql_array = ["SELECT DISTINCT %s from products", "data -> 'Foo'"]
sql_array = sanitize_sql_array(sql_array)
connection.select_values(sql_array)
Note the same thing happens when I use the shorter and more usual:
sql_array = ["SELECT DISTINCT %s from products", "data -> 'Foo'"]
connection.select_values(send(:sanitize_sql_array, sql_array))
Ever seen this before? Does it have something to do with using HStore? I definitely need that string sanitized since the string Foo is actually coming from a user-entered variable.
Thanks!
You're giving sanitize_sql_array a string that contains an hstore expression and expecting sanitize_sql_array to understand that the string contains some hstore stuff; that's asking far too much, sanitize_sql_array only knows about simple things like strings and numbers, it doesn't know how to parse PostgreSQL's SQL extensions or even standard SQL. How would you expect sanitize_sql_array to tell the difference between, for example, a string that happens to contain '11 * 23' and a string that is supposed to represent the arithmetical expression 11 * 23?
You should split your data -> 'Foo' into two pieces so that sanitize_sql_array only sees the string part when it is sanitizing things:
sql_array = [ 'select distinct data -> ? from products', 'Foo' ]
sql = sanitize_sql_array(sql_array)
That will give you the SQL you're looking for:
select distinct data -> 'Foo' from products
Related
I have a Json field stored in DB as text.
I want to look for records that contain Harvard as a value for those keys
json_data['infoOnProgram']['universityName']
How should I construct my where request to search on that Json field.
Student.where(json_data...
Thank you!
If there are a not very many records you can do this, which will be very expensive:
harvard_students = Student.map {|stud|
hash = JSON.parse(stud.json_data)
hash['infoOnProgram']['universityName'] == 'Harvard' ? stud : nil
}.compact
if you have lots and lots of records to iterate through you can limit your list with a regex query:
Simple
Student.where("json data like '%Harvard%'")
slightly better
Student.where("json data like '%infoOnProgram%universityName%Harvard%'")
The problem is without knowing how the data is structured it is hard to create a specific regex. You can use the where to limit and the iteration to confirm more precisely that these records are what you are looking for.
If you're sure you really do have valid JSON in your text column then you could cast it to jsonb (or json) and use the usual JSON operators and functions that PostgreSQL provides:
Student.where(
"json_data::jsonb -> 'infoOnProgram' ->> 'universityName' = ?",
'Harvard'
)
Student.where(
'json_data::jsonb -> :info ->> :name = :uni',
info: 'infoOnProgram',
name: 'universityName',
uni: 'Harvard'
)
Student.where(
'cast(json_data as jsonb) #>> array[:path] = :uni',
path: %w[infoOnProgram universityName],
uni: 'Harvard'
)
This is going to be an expensive query though as you won't be able to use any indexes. You really should start the process of changing the column's type to jsonb, that would validate the JSON, avoid having to type cast in queries, allow indexing, ...
I am having trouble with a DB query in a Rails app. I want to store various search terms (say 100 of them) and then evaluate against a value dynamically. All the examples of SIMILAR TO or ~ (regex) in Postgres I can find use a fixed string within the query, while I want to look the query up from a row.
Example:
Table: Post
column term varchar(256)
(plus regular id, Rails stuff etc)
input = "Foo bar"
Post.where("term ~* ?", input)
So term is VARCHAR column name containing the data of at least one row with the value:
^foo*$
Unless I put an exact match (e.g. "Foo bar" in term) this never returns a result.
I would also like to ideally use expressions like
(^foo.*$|^second.*$)
i.e. multiple search terms as well, so it would match with 'Foo Bar' or 'Search Example'.
I think this is to do with Ruby or ActiveRecord stripping down something? Or I'm on the wrong track and can't use regex or SIMILAR TO with row data values like this?
Alternative suggestions on how to do this also appreciated.
The Postgres regular expression match operators have the regex on the right and the string on the left. See the examples: https://www.postgresql.org/docs/9.3/static/functions-matching.html#FUNCTIONS-POSIX-TABLE
But in your query you're treating term as the string and the 'Foo bar' as the regex (you've swapped them). That's why the only term that matches is the exact match. Try:
Post.where("? ~* term", input)
TL;DR I'm wondering what the pros and cons are (or if they are even equivalent) between #> {as_champion, whatever} and using IN ('as_champion', 'whatever') is. Details below:
I'm working with Rails and using Postgres' array column type, but having to use raw sql for my query as the Rails finder methods don't play nicely with it. I found a way that works, but wondering what the preferred method is:
The roles column on the Memberships table is my array column. It was added via rails as so:
add_column :memberships, :roles, :text, array: true
When I examine the table, it shows the type as: text[] (not sure if that is truly how Postgres represents an array column or if that is Rails shenanigans.
To query against it I do something like:
Membership.where("roles #> ?", '{as_champion, whatever}')
From the fine Array Operators manual:
Operator: #>
Description: contains
Example: ARRAY[1,4,3] #> ARRAY[3,1]
Result: t (AKA true)
So #> treats its operand arrays as sets and checks if the right side is a subset of the left side.
IN is a little different and is used with subqueries:
9.22.2. IN
expression IN (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result. The result of IN is "true" if any equal subquery row is found. The result is "false" if no equal row is found (including the case where the subquery returns no rows).
or with literal lists:
9.23.1. IN
expression IN (value [, ...])
The right-hand side is a parenthesized list of scalar expressions. The result is "true" if the left-hand expression's result is equal to any of the right-hand expressions. This is a shorthand notation for
expression = value1
OR
expression = value2
OR
...
So a IN b more or less means:
Is the value a equal to any of the values in the list b (which can be a query producing single element rows or a literal list).
Of course, you can say things like:
array[1] in (select some_array from ...)
array[1] in (array[1], array[2,3])
but the arrays in those cases are still treated like single values (that just happen to have some internal structure).
If you want to check if an array contains any of a list of values then #> isn't what you want. Consider this:
array[1,2] #> array[2,4]
4 isn't in array[1,2] so array[2,4] is not a subset of array[1,2].
If you want to check if someone has both roles then:
roles #> array['as_champion', 'whatever']
is the right expression but if you want to check if roles is any of those values then you want the overlaps operator (&&):
roles && array['as_champion', 'whatever']
Note that I'm using the "array constructor" syntax for the arrays everywhere, that's because it is much more convenient for working with a tool (such as ActiveRecord) that knows to expand an array into a comma delimited list when replacing a placeholder but doesn't fully understand SQL arrays.
Given all that, we can say things like:
Membership.where('roles #> array[?]', %w[as_champion whatever])
Membership.where('roles #> array[:roles]', :roles => some_ruby_array_of_strings)
and everything will work as expected. You're still working with little SQL snippets (as ActiveRecord doesn't have a full understanding of SQL arrays or any way of representing the #> operator) but at least you won't have to worry about quoting problems. You could probably go through AREL to manually add #> support but I find that AREL quickly devolves into an incomprehensible and unreadable mess for all but the most trivial uses.
I do a lot of time without date querying in my app, and I would like to abstract some of the queries away.
So say I have a model with a DateTime starts_at field:
Shift.where('starts_at::time > ?', '20:31:00.00')
-> SELECT "shifts".* FROM "shifts" WHERE (starts_at::time > '20:31:00.00')
This correctly returns all of the 'starts_at' values greater than the time 20:31.
I want to dynamically pass in the column name into the query, so I can do something like:
Shift.where('? > ?', "#{column_name}::time", '20:31:00.00').
-> SELECT "shifts".* FROM "shifts" WHERE ('starts_at::time' > '20:31:00.00')
In this example, this does not work as the search executes starts_at::time as a string, not as a column with the time cast.
How can I safely pass in column_name into a query with the ::time cast? While this will not accept user input, I would still like to ensure SQL injection is accounted for.
This is more complicated than you might think at first because identifiers (column names, table names, ...) and values ('pancakes', 6, ...) are very different things in SQL that have different quoting rules and even quote characters (single quotes for strings, double quotes for identifiers in standard SQL, backticks for identifiers in MySQL, brackets for identifiers in SQL-Server, ...). If you think of identifiers like Ruby variable names and values like, well, literal Ruby values then you can start to see the difference.
When you say this:
where('? > ?', ...)
both placeholders will be treated as values (not identifiers) and quoted as such. Why is this? ActiveRecord has no way of knowing which ? should be an identifier (such as the created_at column name) and which should be a value (such as 20:31:00.00).
The database connection does have a method specifically for quoting column names though:
> puts ActiveRecord::Base.connection.quote_column_name('pancakes')
"pancakes"
=> nil
so you can say things like:
quoted_column = Shift.connection.quote_column_name(column_name)
Shift.where("#{quoted_name}::time > ?", '20:31:00.00')
This is a little unpleasant because we recoil (or at least we should) at using string interpolation to build SQL. However, quote_column_name will take care of anything dodgy or unsafe in column_name so this isn't actually dangerous.
You could also say:
quoted_column = "#{Shift.connection.quote_column_name(column_name)}::time"
Shift.where("#{quoted_name} > ?", '20:31:00.00')
if you didn't always need to cast the column name to a time. Or even:
clause = "#{Shift.connection.quote_column_name(column_name)}::time > ?"
Shift.where(clause, '20:31:00.00')
You could also use extract or one of the other date/time functions instead of a typecast but you'd still be left with the quoting problem and the somewhat cringeworthy quote_column_name call.
Another option would be to whitelist column_name so that only specific valid values would be allowed. Then you could throw the safe column_name right into the query:
if(!in_the_whitelist(column_name))
# Throw a tantrum, hissy fit, or complain in your preferred fashion
end
Shift.where("#{column_name} > ?", '20:31:00.00')
This should be fine as long as you don't have any funky column names like "gotta have some breakfast" or similar things that always need to be properly quoted. You could even use Shift.column_names or Shift.columns to build your whitelist.
Using both a whitelist and then quote_column_name would probably be the safest but the quote_column_name method should be sufficient.
I decided to use this small solution, taking advantage of Rails column naming conventions:
scope :field_before_and_on_date, -> (field, time) do
column_name = field.to_s.parameterize.underscore
where("#{column_name} <= ?", time.end_of_day)
end
# Takes advantage of:
> "); and delete everything(); stuff(".parameterize.underscore
=> "and_delete_everything_stuff"
It's limited but the concept would work for a type cast too.
can anyone please tell me how to write query in select_value.
I have tried,
ActiveRecord::Base.connection.select_value("select count(*) from leave_details where status= 'Pending' and 'employeedetails_id'=25")
but it showing error
invalid input syntax for integer: "employeedetails_id".
I am using PostgreSQL.
Single quotes are used to quote strings in PostgreSQL (and every other SQL database that even pretends to respect the SQL standard) so you're saying something like this:
some_string = some_integer
when you do this:
'employeedetails_id'=25
and that doesn't make any sense: you can't compare strings and integers without an explicit type cast. You don't need to quote that identifier at all:
ActiveRecord::Base.connection.select_value(%q{
select count(*)
from leave_details
where status = 'Pending'
and employeedetails_id = 25
})
If you even do need to quote an identifier (perhaps it is case sensitive or contains spaces), then you'd use double quotes with PostgreSQL.
Apparently you created your column as "EmployeeDetails_id" so that it is case sensitive. That means that you always have to use that case and you always have to double quote it:
ActiveRecord::Base.connection.select_value(%q{
select count(*)
from leave_details
where status = 'Pending'
and "EmployeeDetails_id" = 25
})
I'd recommend reworking your table to not use mixed case identifiers:
They go against standard Ruby/Rails naming.
They force you to double quote the mixed case column names everywhere you use them.
They go against standard PostgreSQL practice.
This is going to trip you up over and over again.
Executing SQL directly isn't really The Rails Way, and you lose any database portability by doing it that way.
You should create a model for leave_details. E.g.
rails g model LeaveDetails status:string employeedetails_id:integer
Then, the code would be:
LeaveDetails.where({ :status => 'Pending', :employeedetails_id => 25 }).count