Parsing a PostgreSQL result object in a Rails app - ruby-on-rails

I am writing an app that needs to quickly process hundreds of thousands of rows of data, so I've looked into nesting raw SQL in my Ruby code using ActiveRecord::Base.connection.execute, which is working beautifully. However whenever I run it I get the following Object as a result:
#<PG::Result:0x007fe158ab18c8 status=PGRES_TUPLES_OK ntuples=0 nfields=1 cmd_tuples=0>
I've googled around and can't find a way to parse the PG Result into something actually useful. Is there any built-in PG way to do this, or a workaround, or anything really?
Here is the query I'm using:
SELECT row_to_json(row(company_name, ccn_short_title, title))
FROM contents
WHERE contents.company_name = '#{company_name}'
AND contents.title = '#{title}';

Actually PG::Result responds to many well-known methods from Enumerable module. You can output them all to watch for the desired ones:
query = "SELECT row_to_json(row) from (select * from users) row"
result = ActiveRecord::Base.connection.execute(query)
result.methods - Object.methods
# => returns an array of methods which can be used
For example, you could iterate the results and map them to something more suitable...
result.map do |row|
JSON.parse(row["row_to_json"])
end
# => returns familiar hashes
Get a desired result hash by its index...
result[0]
And much more.

Related

How to pass array as bind variable to Rails/ActiveRecord raw SQL queries?

I need to pass an array of ids into my raw sql query like this:
select offers.* from offers where id in (1,2,3,4,5)
The real query includes a lot of joins and aggregation functions and can't be written using Arel expressions or ActiveRecord model methods like Offer.where(id: [...]). I'm looking exactly for how to use bind variables in raw queries.
Instead of interpolating ids into string I want to use bind variables like this (pseudo-code):
ActiveRecord::Base.connection.select_all("select offers.* from offers where id in (:ids)", {ids: [1,2,3,4,5]})
However, I can't find any solution to perform this. From this ticket I've got a comment with related test-case in ActiveRecord code with the following example:
sub = Arel::Nodes::BindParam.new
binds = [Relation::QueryAttribute.new("id", 1, Type::Value.new)]
sql = "select * from topics where id = #{sub.to_sql}"
#connection.exec_query(sql, "SQL", binds)
I've tried this approach, but it didn't worked at all, my "?" was not replaced by actual values.
I'm using Rails 5.1.6 and MariaDB database.
You could do this in a much simpler fashion purely with arel. (Also it makes the code far more maintainable than SQL strings)
offers = Arel::Table.new('offers')
ids = [1,2,3,4,5]
query = offers.project(Arel.star).where(offers[:id].in(ids))
ActiveRecord::Base.connection.exec_query(query.to_sql)
This will result in the following SQL
SELECT
[offers].*
FROM
[offers]
WHERE
[offers].[id] IN (1,2,3,4,5)
When executed you will receive an ActiveRecord::Result object with is usually easiest to deal with by calling to_hash and each resulting row will be turned into a Hash of {column_name => value}
However if you are using rails and Offer is a true model then:
Offer.where(id: ids)
Will result in the same query and will return an ActiveRecord::Relation collection of Offer objects which is generally more preferable.
Update
Seems like you need to enable prepared_statements in mysql2 (mariadb) in order to use the bind params, which can be done like this:
default: &default
adapter: mysql2
encoding: utf8
prepared_statements: true # <- here we go!
Please note the following pieces of code:
https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb#L115
https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/connection_adapters/mysql2_adapter.rb#L40
https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb#L630
https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/connection_adapters/mysql/database_statements.rb#L30
As you can see in the last code exec_query ignores bind_params if prepared_statements is turned off (which appears to be the default for the mysql2 adapter).

Active Record - Chain Queries with OR

Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I digged into fuzzy_search and saw that it will be translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
That says if you don't care about preserving order, it will just filter the result by WHERE (("people"."first_name" % 'abc'))
Knowing that and now you can simply write the query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
In case you want order, please specify what would you like the order will be after joining 2 conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of quotation marks in the SQL query when referring to tables and fields because it was giving me error messages when I kept them there and even if I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.

Rails and Arel's where function: Can I call where on objects instead of making a call to the database?

Consider the following:
budget has many objects in its another_collection.
obj has many objects of the same type as object in its another_collection.
budget and some_collection are already declared before the following loop
they've been previous saved in the database and have primary keys set.
some_collection is a collections of objs.
-
some_collection.each do |obj|
another_obj = obj.another_collection.build
budget.another_collection << another_obj
end
budget.another_collection.collect {|another_obj| another_obj.another_obj_id}
=> [1, 2, 3, 4]
some_obj_with_pk_1 = some_collection.where(obj_id: obj.id)
some_obj_with_pl_1.id
=> 1
budget.another_collection.where(another_obj_id: some_obj_with_pk_1.id)
=> []
This shouldn't happen. What is happening is that rails queries the database for any items in another_collection with another_obj_id = 1. Since this collection hasn't been saved to the database yet, none of these items are showing up in the results.
Is there a function or something I can pass to Arel's where method that says to use local results only? I know I can just iterate over the items and find it but it would be great if I didn't have to do that and just use a method that does this already.
You could always use Enumeable#select which takes a block and returns only the elements that the block returns true for. You'd want to make sure that you had ActiveRecord retrieve the result set first (by calling to_a on your query).
records = Model.where(some_attribute: value).to_a
filtered_records = records.select { |record| record.something? }
Depending on your result set and your needs, it is possible that a second database query would be faster, as your SQL store is better suited to do these comparisons than Ruby. But if your records have yet to be saved, you would need to do something like the above, since the records aren't persisted yet

Modifying the returned value of find_by_sql

So I am pulling my hair over this issue / gotcha. Basically I used find_by_sql to fetch data from my database. I did this because the query has lots of columns and table joins and I think using ActiveRecord and associations will slow it down.
I managed to pull the data and now I wanted to modify returned values. I did this by looping through the result ,for example.
a = Project.find_by_sql("SELECT mycolumn, mycolumn2 FROM my_table").each do |project|
project['mycolumn'] = project['mycolumn'].split('_').first
end
What I found out is that project['mycolumn'] was not changed at all.
So my question:
Does find_by_sql return an array Hashes?
Is it possible to modify the value of one of the attributes of hash as stated above?
Here is the code : http://pastie.org/4213454 . If you can have a look at summarize_roles2() that's where the action is taking place.
Thank you. Im using Rails 2.1.1 and Ruby 1.8. I can't really upgrade because of legacy codes.
Just change the method above to access the values, print value of project and you can clearly check the object property.
The results will be returned as an array with columns requested encapsulated as attributes of the model you call this method from.If you call Product.find_by_sql then the results will be returned in a Product object with the attributes you specified in the SQL query.
If you call a complicated SQL query which spans multiple tables the columns specified by the SELECT will be attributes of the model, whether or not they are columns of the corresponding table.
Post.find_by_sql "SELECT p.title, c.author FROM posts p, comments c WHERE p.id = c.post_id"
> [#<Post:0x36bff9c #attributes={"title"=>"Ruby Meetup", "first_name"=>"Quentin"}>, ...]
Source: http://api.rubyonrails.org/v2.3.8/
Have you tried
a = Project.find_by_sql("SELECT mycolumn, mycolumn2 FROM my_table").each do |project|
project['mycolumn'] = project['mycolumn'].split('_').first
project.save
end

searching within an already retrieved mysql result

I'm trying to limit the number of times I do a mysql query, as this could end up being 2k+ queries just to accomplish a fairly small result.
I'm going through a CSV file, and I need to check that the format of the content in the csv matches the format the db expects, and sometimes I try to accomplish some basic clean-up (for example, I have one field that is a string, but is sometimes in the csv as jb2003-343, and I need to strip out the -343).
The first thing I do is get from the database the list of fields by name that I need to retrieve from the csv, then I get the index of those columns in the csv, then I go through each line in the csv and get each of the indexed columns
get_fields = BaseField.find_by_group(:all, :conditions=>['group IN (?)',params[:group_ids]])
csv = CSV.read(csv.path)
first_line=csv.first
first_line.split(',')
csv.each_with_index do |row|
if row==0
col_indexes=[]
csv_data=[]
get_fields.each do |col|
col_indexes << row.index(col.name)
end
else
csv_row=[]
col_indexes.each do |col|
#possibly check the value here against another mysql query but that's ugly
csv_row << row[col]
end
csv_data << csv_row
end
end
The problem is that when I'm adding the content of the csv_data for output, I no longer have any connection to the original get_fields query. Therefore, I can't seem to say 'does this match the type of data expected from the db'.
I could work my way back through the same process that got me down to that level, and make another query like this
get_cleanup = BaseField.find_by_csv_col_name(first_line[col])
if get_cleanup.format==row[col].is_a
csv_row << row[col]
else
# do some data clean-up
end
but as I mentioned, that could mean the get_cleanup is run 2000+ times.
instead of doing this, is there a way to search within the original get_fields result for the name, and then get the associated field?
I tried searching for 'search rails object', but kept getting back results about building search, not searching within an already existing object.
I know I can do array.search, but don't see anything in the object api about search.
Note: The code above may not be perfect, because I'm not running it yet, just wrote that off the top of my head, but hopefully it gives you the idea of what I'm going for.
When you populate your col_indexes array, rather than storing a single value, you can store a hash which includes index and the datatype.
get_fields.each do |col|
col_info = {:row_index = row.index(col.name), :name=>col.name :format=>col.format}
col_indexes << col_info
end
You can then access all your data in the for loop

Resources