How to pass a Rails array to a MySQL WHERE IN condition - ruby-on-rails

I have an array like the one below:
skills = ['ruby','Ruby on Rails'];
I am trying to pass the array to a MySQL WHERE condition like this:
questions = MysqlConnection.connection.select_all("
SELECT questions.*,quest_answers.* FROM `questions`
INNER JOIN `quest_answers` ON `quest_answers`.`question_id` =
`questions`.`id` where questions.category IN (#{skills.join(', ')})")
But it did not work. How can I pass an array to a WHERE IN condition?
The error I am getting:
Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'on rails, Ruby)' at line 1: SELECT questions.*,quest_answers.* FROM `questions` INNER JOIN `quest_answers` ON `quest_answers`.`question_id` = `questions`.`id` where questions.category IN (Ruby on rails, Ruby)

You are interpolating the array elements into the query as bare, unquoted words, which MySQL cannot parse. Each value has to appear in the query as a quoted, escaped string literal. This can be done by quoting each skill through the connection and joining the results:
skills.map { |s| MysqlConnection.connection.quote(s) }.join(', ')
This produces 'ruby', 'Ruby on Rails', which is a valid argument list for the IN clause.
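To see the difference between the two strings without a database, here is a minimal plain-Ruby sketch. The helper naive_quote is hypothetical and only doubles single quotes; it ignores MySQL's backslash escaping, so in a Rails app the connection's own quoting (connection.quote) is the safer choice.

```ruby
skills = ['ruby', 'Ruby on Rails']

# What the original interpolation produced: bare words, no quotes.
unquoted = skills.join(', ')

# Hypothetical stand-in for real connection-level quoting (assumption):
# wraps the value in single quotes and doubles any embedded single quote.
def naive_quote(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

quoted = skills.map { |s| naive_quote(s) }.join(', ')

puts "IN (#{unquoted})" # the form MySQL rejects
puts "IN (#{quoted})"   # string literals MySQL can parse
```

The first line printed is the form that triggers the syntax error; the second is a well-formed IN list.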
A better approach however is to not write the raw SQL at all, but rely on ActiveRecord to generate it. This is the more maintainable and readable approach.
Question.joins(:quest_answers).where(category: skills)
Passing an array to where automatically converts it into an IN condition.

Related

Executing a SQL query with an `IN` clause from Rails code

I know precious little about Rails, so please excuse my naivety about this question.
I'm trying to modify a piece of code that I got from somewhere to make it execute it for a randomly selected bunch of users. Here it goes:
users = RedshiftRecord.connection.execute(<<~SQL
select distinct user_id
from tablename
order by random()
limit 1000
SQL
).to_a
sql = 'select user_id, count(*) from tablename where user_id in (?) group by user_id'
<Library>.on_replica(:something) do
Something::SomethingElse.
connection.
exec_query(sql, users.join(',')).to_h
end
This gives me the following error:
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR: syntax error at or near ")"
LINE 1: ...ount(*) from tablename where user_id in (?) group by...
^
users is an array; I know this because I executed the following and it returned true:
p users.instance_of? Array
Would someone please help me execute this code? I want to execute a simple SQL query that would look like this:
select user_id, count(*) from tablename where user_id in (user1,user2,...,user1000) group by user_id
The problem here is that IN takes a list of parameters. Using a single bind IN (?) and a comma-separated string will not magically turn it into a list of arguments. That's just not how SQL works.
What you want is:
where user_id in (?, ?, ?, ...)
Where the number of binds matches the length of the array you want to pass.
The simple but hacky way to do this is to interpolate n question marks into the SQL string:
binds = Array.new(users.length, '?').join(',')
sql = <<~SQL
select user_id, count(*)
from tablename
where user_id in (#{binds})
group by user_id
SQL
<Library>.on_replica(:something) do
Something::SomethingElse.
connection.
exec_query(sql, users).to_h
end
But you would typically do this in a Rails app by creating a model and using the ActiveRecord query interface, or by using Arel to programmatically create the SQL query.
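The placeholder-generation step can be tried in plain Ruby without any database. This is only a sketch: the rows array is made-up data shaped like what I assume connection.execute(...).to_a returns (an array of hashes), and the numbered variant is included because PostgreSQL's native placeholder style is $1, $2, and so on. Which style your adapter expects is something to check against its documentation.

```ruby
# Made-up rows, assumed to resemble the result of execute(...).to_a.
rows = [{ 'user_id' => 11 }, { 'user_id' => 42 }, { 'user_id' => 97 }]

# Pull the plain ids out of the row hashes before binding them.
user_ids = rows.map { |row| row['user_id'] }

# One placeholder per value: "?, ?, ?"
question_marks = Array.new(user_ids.length, '?').join(', ')

# Numbered variant: "$1, $2, $3"
numbered = (1..user_ids.length).map { |i| "$#{i}" }.join(', ')

sql = <<~SQL
  select user_id, count(*)
  from tablename
  where user_id in (#{question_marks})
  group by user_id
SQL

puts sql
puts numbered
```

Only the placeholders are interpolated into the SQL text; the values themselves still travel separately as binds.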

Parameterize an ActiveRecord #joins method

I am refactoring a fairly complex query that involves chaining multiple .joins methods together. In one of these joins I am using a raw SQL fragment built with string interpolation, i.e. a join condition such as WHERE foo.id = #{id}. I am aware that I can parameterize ActiveRecord #where by using the ? placeholder and passing in the arguments as parameters, but the joins method does not support multiple arguments in this fashion.
For example, using
Post.my_scope_name.joins("LEFT JOIN posts ON posts.id = images.post_id and posts.id = ?", "1")
in order to pass in an id of 1 produces an ActiveRecord::StatementInvalid, because the generated SQL looks like this:
"LEFT JOIN posts ON posts.id = images.post_id and posts.id = ? 1"
What is the standard approach to parameterizing queries when using the joins method?
Arel ("A Relational Algebra") is the underlying query assembler for Rails and can be used to construct queries, conditions, joins, CTEs, etc. that the high-level Rails query interface does not support. Since this library is an integral part of Rails, most Rails query methods accept Arel objects directly without issue (to be honest, most methods convert your arguments into one of these objects anyway).
In your case you can construct the join you want as follows:
posts_table = Post.arel_table
images_table = Image.arel_table
some_id = 1
post_join = Arel::Nodes::OuterJoin.new(
posts_table,
Arel::Nodes::On.new(
posts_table[:id].eq(images_table[:post_id])
.and(posts_table[:id].eq(some_id))
)
)
SQL produced:
post_join.to_sql
#=> "LEFT OUTER JOIN [posts] ON [posts].[id] = [images].[post_id] AND [posts].[id] = 1"
Then you just add this join to your current query
Image.joins(post_join)
#=> SELECT images.* FROM images LEFT OUTER JOIN [posts] ON [posts].[id] = [images].[post_id] AND [posts].[id] = 1
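If Arel feels heavy for a single join fragment, the other common idea is to substitute the bind values into the fragment yourself before handing the finished string to joins. ActiveRecord has its own sanitizing helpers for this (sanitize_sql_array, as far as I know; check the docs for your Rails version). The plain-Ruby sketch below only mimics the idea: bind_fragment is a hypothetical helper, its quoting is simplified, and it would miscount a literal ? inside a string.

```ruby
# Hypothetical helper (not part of Rails): replaces each "?" in a SQL
# fragment with a quoted value. Simplified; not production-grade escaping.
def bind_fragment(fragment, *values)
  expected = fragment.count('?')
  unless expected == values.length
    raise ArgumentError, "expected #{expected} values, got #{values.length}"
  end

  remaining = values.dup
  fragment.gsub('?') do
    value = remaining.shift
    case value
    when Integer then value.to_s                       # numbers go in as-is
    else "'" + value.to_s.gsub("'", "''") + "'"        # strings get quoted
    end
  end
end

join_sql = bind_fragment(
  'LEFT JOIN posts ON posts.id = images.post_id AND posts.id = ?', 1
)
puts join_sql
```

The resulting string is a single, complete fragment, so it can be passed to joins as one argument.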

Postgresql error with Rails 3 using order("RANDOM()")

I'm trying to query my db for records that are similar to the currently viewed record (based on taggings). I have this working, but I would like to randomize the order.
My development environment is MySQL, so I would do something like:
@tattoos = Tattoo.tagged_with(tags, :any => true).order("RAND()").limit(6)
which works, but my production environment is Heroku, which uses PostgreSQL, so I tried this:
@tattoos = Tattoo.tagged_with(tags, :any => true).order("RANDOM()").limit(6)
but I get the following error:
ActionView::Template::Error (PGError: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
SELECT DISTINCT tattoos.* FROM "tattoos" JOIN taggings
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477 ON
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.taggable_id = tattoos.id AND
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.taggable_type = 'Tattoo' WHERE
(tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.tag_id = 3 OR
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.tag_id = 4 OR
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.tag_id = 5 OR
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.tag_id = 24 OR
tattoos_taggings_color_fantasy_newschool_nerdy_tv_477.tag_id = 205) ORDER BY RANDOM() LIMIT 6):
After analyzing the query more closely, I have to correct my first draft. The query would require a DISTINCT or GROUP BY the way it is.
The (possibly) duplicate tattoos.* come from first joining to (possibly) multiple rows in the table taggings. Your query engine then tries to get rid of such duplicates again by using DISTINCT - in a syntactically illegal way.
DISTINCT basically sorts the resulting rows by the resulting columns from left to right and picks the first row of each set of duplicates. That's why the leftmost ORDER BY columns have to match the SELECT list.
MySQL is more permissive and allows the non-standard use of DISTINCT, but PostgreSQL throws an error.
ORMs often produce ineffective SQL statements (they are just crutches after all). However, if you use appropriate PostgreSQL libraries, such an illegal statement shouldn't be produced to begin with. I am no Ruby expert, but something's fishy here.
The query is also very ugly and inefficient.
There are several ways to fix it. For instance:
SELECT *
FROM (<query without ORDER BY and LIMIT>) x
ORDER BY RANDOM()
LIMIT 6
Or, better yet, rewrite the query with this faster, cleaner alternative doing the same:
SELECT ta.*
FROM tattoos ta
WHERE EXISTS (
SELECT 1
FROM taggings t
WHERE t.taggable_id = ta.id
AND t.taggable_type = 'Tattoo'
AND t.tag_id IN (3, 4, 5, 24, 205)
)
ORDER BY RANDOM()
LIMIT 6;
You'll have to implement it in Ruby yourself.
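To make the difference between the JOIN form and the EXISTS form concrete, here is a small in-memory Ruby sketch. It does not touch a database: the Tattoo and Tagging structs and all the data are made up for illustration, and the seeded shuffle only stands in for ORDER BY RANDOM() so the run is repeatable.

```ruby
Tattoo  = Struct.new(:id, :name)
Tagging = Struct.new(:taggable_id, :taggable_type, :tag_id)

# Made-up sample data (assumption, for illustration only).
tattoos = (1..8).map { |i| Tattoo.new(i, "tattoo #{i}") }
taggings = [
  Tagging.new(1, 'Tattoo', 3),
  Tagging.new(1, 'Tattoo', 4),   # tattoo 1 matches twice
  Tagging.new(2, 'Tattoo', 24),
  Tagging.new(3, 'Tattoo', 99),  # tag not in the wanted list
  Tagging.new(4, 'Photo', 5),    # wrong taggable_type
  Tagging.new(5, 'Tattoo', 205)
]
wanted_tag_ids = [3, 4, 5, 24, 205]

matches = lambda do |tagging|
  tagging.taggable_type == 'Tattoo' && wanted_tag_ids.include?(tagging.tag_id)
end

# JOIN semantics: one row per matching tagging, so duplicates appear.
joined_ids = taggings.select(&matches).map(&:taggable_id)

# EXISTS semantics: each tattoo is kept at most once, no DISTINCT needed.
matching = tattoos.select do |tattoo|
  taggings.any? { |t| t.taggable_id == tattoo.id && matches.call(t) }
end

# Stand-in for ORDER BY RANDOM() LIMIT 6 (seeded for repeatability).
picked = matching.shuffle(random: Random.new(42)).first(6)

puts "join rows:   #{joined_ids.inspect}"
puts "exists rows: #{matching.map(&:id).inspect}"
puts "picked:      #{picked.map(&:id).inspect}"
```

Because the EXISTS form never multiplies rows, there is nothing to deduplicate, and the random ordering can be applied directly.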
I'm not sure about the random part, as it should work.
But take note of http://railsforum.com/viewtopic.php?id=36581
which has code that might suit you:
/lib/agnostic_random.rb
module AgnosticRandom
def random
case DB_ADAPTER
when "mysql" then "RAND()"
when "postgresql" then "RANDOM()"
end
end
end
/initializers/extend_ar.rb (name doesn't matter)
ActiveRecord::Base.extend AgnosticRandom
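The module above relies on a DB_ADAPTER constant that has to be defined somewhere else. A variation that takes the adapter name as an argument can be tried in plain Ruby; this is only a sketch, and the mapping keys are my assumption about what the name looks like once downcased (in a Rails app it could come from ActiveRecord::Base.connection.adapter_name).

```ruby
module AgnosticRandom
  # Adapter name (downcased) => SQL function returning a random value.
  RANDOM_FUNCTIONS = {
    'mysql'      => 'RAND()',
    'mysql2'     => 'RAND()',
    'postgresql' => 'RANDOM()',
    'sqlite'     => 'RANDOM()'
  }.freeze

  # Looks up the function for the given adapter name and fails loudly
  # for adapters it does not know, instead of silently returning nil.
  def self.random_function(adapter_name)
    RANDOM_FUNCTIONS.fetch(adapter_name.to_s.downcase) do
      raise ArgumentError, "no random function known for #{adapter_name.inspect}"
    end
  end
end

puts AgnosticRandom.random_function('PostgreSQL')
puts AgnosticRandom.random_function('Mysql2')
```

Raising for unknown adapters avoids passing nil to order, which is what the case statement above would return for anything other than the two listed names.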

ActiveRecord Custom Query vs find_by_sql loading

I have a custom query that looks like this:
self.account.websites.find(:all,:joins => [:group_websites => {:group => :users}],:conditions=>["users.id =?",self])
where self is a User object.
I managed to write the equivalent SQL for it.
Here is how it looks:
sql = "select * from websites INNER JOIN group_websites on group_websites.website_id = websites.id INNER JOIN groups on groups.id = group_websites.group_id INNER JOIN group_users ON (groups.id = group_users.group_id) INNER JOIN users on (users.id = group_users.user_id) where (websites.account_id = #{account_id} AND (users.id = #{user_id}))"
With a decent understanding of SQL and ActiveRecord, I assumed (as most would agree) that the ActiveRecord query above would take longer than the same query run through find_by_sql(sql).
Surprisingly, when I ran the two, the ActiveRecord custom query beat find_by_sql in terms of load time.
Here are the test results:
ActiveRecord Custom Query load time
Website Load (0.9ms)
Website Columns(1.0ms)
find_by_sql load time
Website Load (1.3ms)
Website Columns(1.0ms)
I repeated the test again and again and the result came out the same (with the custom query winning).
I know the difference isn't that big, but I still can't figure out why a plain find_by_sql query is slower than the custom query.
Can anyone shed some light on this?
Thanks Anyway
Regards
Viren Negi
With the find case, the query is parameterized; this means the database can cache the query plan and will not need to parse and compile the query again.
With the find_by_sql case the entire query is passed to the database as a string. This means there is no caching that the database can do on the structure of the query, and it needs to be parsed and compiled on each occasion.
I think you can test this: try find_by_sql in this way (parameterized):
Website.find_by_sql(["select * from websites INNER JOIN group_websites on group_websites.website_id = websites.id INNER JOIN groups on groups.id = group_websites.group_id INNER JOIN group_users ON (groups.id = group_users.group_id) INNER JOIN users on (users.id = group_users.user_id) where (websites.account_id = ? AND (users.id = ?))", account_id, user_id])
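The caching argument can be illustrated with a toy model in plain Ruby. PlanCache below is entirely hypothetical: it only models a cache keyed by the exact SQL text, and says nothing about what a given database or adapter really does, which depends on its prepared-statement support and configuration.

```ruby
# Hypothetical model of a statement cache keyed by the exact SQL text.
class PlanCache
  attr_reader :hits, :misses

  def initialize
    @plans = {}
    @hits = 0
    @misses = 0
  end

  # Returns a cached "plan" for the SQL text, creating one on first sight.
  def prepare(sql)
    if @plans.key?(sql)
      @hits += 1
    else
      @misses += 1
      @plans[sql] = "plan-#{@plans.size + 1}"
    end
    @plans[sql]
  end
end

user_ids = [7, 8, 9]

# Interpolated values: every query text is different, so nothing is reused.
interpolated = PlanCache.new
user_ids.each { |id| interpolated.prepare("select * from websites where user_id = #{id}") }

# Parameterized: the text is identical each time, so the plan is reused.
parameterized = PlanCache.new
user_ids.each { |_id| parameterized.prepare('select * from websites where user_id = ?') }

puts "interpolated:  #{interpolated.misses} misses, #{interpolated.hits} hits"
puts "parameterized: #{parameterized.misses} misses, #{parameterized.hits} hits"
```

With the values kept out of the SQL text, the same text recurs and a cache keyed on it can be hit; with interpolation, each distinct value produces a new key.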
Well, the reason is probably quite simple - with custom SQL, the SQL query is sent immediately to db server for execution.
Remember that Ruby is an interpreted language, so Rails has to generate the SQL query from the ORM calls you used before it can be sent to the db server for execution. I would say the additional 0.1 ms is the time the framework takes to generate the query.

Rails 3 LIKE query raises exception when using a colon and a dot

In rails 3.0.0, the following query works fine:
Author.where("name LIKE :input",{:input => "#{params[:q]}%"}).includes(:books).order('created_at')
However, when I enter the following search string (containing a colon followed by a dot):
aa:.bb
I get the following exception:
ActiveRecord::StatementInvalid: SQLite3::SQLException: ambiguous column name: created_at
In the logs, these are the SQL queries:
with aa as input:
Author Load (0.4ms) SELECT "authors".* FROM "authors" WHERE (name LIKE 'aa%') ORDER BY created_at
Book Load (2.5ms) SELECT "books".* FROM "books" WHERE ("books".author_id IN (1,2,3)) ORDER BY id
with aa:.bb as input:
SELECT DISTINCT "authors".id FROM "authors" LEFT OUTER JOIN "books" ON "books"."author_id" = "authors"."id" WHERE (name LIKE 'aa:.bb%') ORDER BY created_at DESC LIMIT 12 OFFSET 0
SQLite3::SQLException: ambiguous column name: created_at
It seems that with the aa:.bb input, an extra query is made to fetch the distinct author ids.
I thought Rails would escape all the characters. Is this expected behaviour or a bug?
Best Regards,
Pieter
The "ambiguous column" error usually happens when you use includes or joins and don't specify which table you're referring to:
"name LIKE :input"
Should be:
"authors.name LIKE :input"
Just "name" is ambiguous if your books table has a name column too.
Also: have a look at your development.log to see what the generated query looks like. This will show you if it's being escaped properly.
Replace
.includes(:books)
with
.preload(:books)
This should force activerecord to use 2 queries instead of the join.
Rails has 2 versions of includes: one that constructs a big query with joins (the 2nd of your 2 queries, and thus more likely to result in ambiguous column references) and one that avoids the joins in favour of a separate query per association.
Rails decides which strategy to use based on whether it thinks that your conditions, order etc. refer to the included tables (since in that case the joins version is required). Where a condition is a string fragment, that heuristic isn't very sophisticated. I seem to recall that it just scans the conditions for anything that might look like a column from another table (i.e. foo.bar), so having a literal of that form could fool it.
You can either qualify your column names so that it doesn't matter which includes strategy is used or you can use preload/eager_load instead of includes. These behave similarly to includes but force a specific include strategy rather than trying to guess which is most appropriate.
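To show how a scan like that could be fooled, here is a deliberately simplified guess at such a heuristic in plain Ruby. It is not Rails's actual implementation: both method names are hypothetical, and the optional single character before the dot is my assumption, meant to allow for a closing quote as in "books".author_id.

```ruby
# Hypothetical heuristic: collect anything that looks like "identifier."
# (optionally with one character, such as a closing quote, before the dot).
def referenced_tables(sql_fragment)
  sql_fragment.scan(/([a-zA-Z_]\w*).?\./).flatten.map(&:downcase).uniq
end

# The join strategy would be chosen if the fragment appears to mention
# any table other than the model's own.
def join_strategy?(sql_fragment, own_table)
  (referenced_tables(sql_fragment) - [own_table]).any?
end

puts join_strategy?("name LIKE 'aa%'", 'authors')         # no dot, no table reference found
puts join_strategy?("name LIKE 'aa:.bb%'", 'authors')     # "aa:." looks like a table reference
puts join_strategy?("authors.name LIKE 'aa%'", 'authors') # only the model's own table
```

Under this sketch the literal aa:.bb is mistaken for a reference to a table named aa, which would be enough to switch strategies even though the user only typed a search term.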

Resources