Modifying the returned value of find_by_sql - ruby-on-rails

I am pulling my hair out over this issue / gotcha. Basically, I used find_by_sql to fetch data from my database. I did this because the query has lots of columns and table joins, and I think using ActiveRecord and associations will slow it down.
I managed to pull the data and now I want to modify the returned values. I did this by looping through the result, for example:
a = Project.find_by_sql("SELECT mycolumn, mycolumn2 FROM my_table").each do |project|
  project['mycolumn'] = project['mycolumn'].split('_').first
end
What I found out is that project['mycolumn'] was not changed at all.
So my questions:
Does find_by_sql return an array of Hashes?
Is it possible to modify the value of one of the attributes as shown above?
Here is the code: http://pastie.org/4213454 . If you can, have a look at summarize_roles2(); that's where the action is taking place.
Thank you. I'm using Rails 2.1.1 and Ruby 1.8. I can't really upgrade because of legacy code.

Change the loop above to read the values back and print project; you can then inspect the object's attributes directly.
The results will be returned as an array, with the requested columns encapsulated as attributes of the model you call this method from. If you call Product.find_by_sql, the results will be returned as Product objects with the attributes you specified in the SQL query.
If you call a complicated SQL query which spans multiple tables, the columns specified by the SELECT will be attributes of the model, whether or not they are columns of the corresponding table.
Post.find_by_sql "SELECT p.title, c.author FROM posts p, comments c WHERE p.id = c.post_id"
# => [#<Post:0x36bff9c @attributes={"title"=>"Ruby Meetup", "author"=>"Quentin"}>, ...]
Source: http://api.rubyonrails.org/v2.3.8/

Have you tried calling save inside the loop?
a = Project.find_by_sql("SELECT mycolumn, mycolumn2 FROM my_table").each do |project|
  project['mycolumn'] = project['mycolumn'].split('_').first
  project.save
end
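The answers above say that find_by_sql returns model objects rather than Hashes, with [] and []= reading and writing their attributes. The following is a minimal plain-Ruby sketch of that shape. Row is a hypothetical stand-in for a model instance (it is not ActiveRecord) and the sample data is invented. It only illustrates that values assigned inside each stay on the in-memory objects; it does not reproduce or explain the Rails 2.1.1 behaviour reported in the question.

```ruby
# Hypothetical stand-in for a model instance returned by find_by_sql.
# It is NOT ActiveRecord; it only mimics attribute access via [] and []=.
class Row
  def initialize(attributes)
    @attributes = attributes
  end

  def [](name)
    @attributes[name.to_s]
  end

  def []=(name, value)
    @attributes[name.to_s] = value
  end
end

# Invented sample data in place of a real query result.
rows = [
  Row.new("mycolumn" => "alpha_1", "mycolumn2" => "x"),
  Row.new("mycolumn" => "beta_2",  "mycolumn2" => "y")
]

# each returns its receiver, so `a` is the same array of (now modified) objects.
a = rows.each do |row|
  row["mycolumn"] = row["mycolumn"].split("_").first
end

first_values = a.map { |row| row["mycolumn"] }
```

Nothing here is written back to a database; in the sketch, as in the question, persisting the change would be a separate step.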

Related

Ruby on Rails / ActiveRecord: How Can I (Elegantly) Retrieve Data from Multiple Tables?

It's rather trivial to retrieve data from multiple tables that are related through foreign keys using raw SQL. I can do, for example:
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
This would give me a table with two columns: title and domestic_sales, where the data in the first column comes from the table movies and the data in the second column comes from the table boxoffice.
How can I do this in Rails using Ruby code? I can, of course, get the same result if I use raw SQL. So, I could do the following:
ActiveRecord::Base.connection.execute(<<-SQL)
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
SQL
This would give me a PG::Result object with the data I want. But this is super inelegant. I would like to be able to get this information without using raw SQL.
So, the first thing that comes to mind is:
Movie.select(:name, :domestic_sales).joins(:box_office)
The problem, however, is that the aforementioned line of code returns a bunch of Movie objects. Since the Movie class doesn't have the domestic_sales attribute, I don't get access to that information.
The next thing I thought was to use a loop. So, I could do something like:
Movie.joins(:box_office).to_a.map do |m|
  {name: m.name, rating: m.box_office.domestic_sales}
end
This gives me exactly the data I want. But it costs n + 1 SQL queries, which is not good. I should be able to get this with just one query...
So: How can I retrieve the data I want without using raw SQL and without using loops that cost multiple queries?
SELECT title, domestic_sales
FROM movies
JOIN boxoffice
ON movies.id = boxoffice.movie_id;
translated to ActiveRecord would look like this:
Movie
  .select(:title, :domestic_sales)
  .joins("JOIN boxoffice ON movies.id = boxoffice.movie_id")
When you have proper associations defined in your models, you would be able to write:
Movie
  .select(:title, :domestic_sales)
  .joins(:boxoffices)
And when you do not need instances of ActiveRecord models and would be fine with a nested array, you can even write:
Movie
  .joins(:boxoffices)
  .pluck(:title, :domestic_sales)
Try this way.
Movie.joins(:box_office).pluck(:title, :domestic_sales)
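The pluck variants above are described as returning a nested array rather than model instances. As a rough sketch of working with that shape, the snippet below turns each pair into a hash with named keys. The rows are invented sample data standing in for a real query result, and no database is involved.

```ruby
# Invented sample rows, shaped like the nested array that
# pluck(:title, :domestic_sales) is described as returning.
rows = [
  ["Movie A", 100],
  ["Movie B", 250]
]

# Turn each [title, domestic_sales] pair into a hash with named keys.
records = rows.map do |title, domestic_sales|
  { title: title, domestic_sales: domestic_sales }
end
```

This keeps the single-query benefit of pluck while giving the same named access the looped version in the question was after.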

Ordering a collection by instance method

I would like to order a collection first by priority and then due time like this:
@ods = Od.order(:priority, :due_date_time)
The problem is due_date_time is an instance method of Od, so I get
PG::UndefinedColumn: ERROR: column ods.due_date_time does not exist
I have tried the following, but it seems that sorting and mapping to ids, then finding the records again with .where, loses the sort order.
@ods = Od.where(id: (Od.all.sort {|a,b| a.due_date_time <=> b.due_date_time}.map(&:id))).order(:priority)
due_date_time calls a method from a child association:
def due_date_time
  run.cut_off_time
end
run.cut_off_time is defined here:
def cut_off_time
  (leave_date.beginning_of_day + route.cut_off_time_mins_since_midnight * 60)
end
I'm sure there is an easier way. Any help much appreciated! Thanks.
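ActiveRecord's order does what Ruby's sort does, but inside the database query. In your attempt, Od.all.sort iterates over the records in Ruby after the Od.all query has run, map iterates over them a second time, and where then sends a new database query. The sort also has no effect on the final result: where(id: ids) selects every record whose id is included in ids; it does not look up one record per id in turn, so the order of ids is not carried over.
It is easier to do something like this: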
Od.all.sort_by { |od| [od.priority, od.due_date_time] }
But that is a slow solution if the ods table holds 10k+ records. Prefer saving the value you sort by as a database column. When that is not possible, put the logic that calculates due_date_time into the database query.
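To make the two points of this answer concrete, here is a small plain-Ruby sketch. OdRow is a hypothetical stand-in for an Od record (due_date_time is a plain attribute here, not a method on an association), the data is invented, and the select call only imitates a lookup by id list that comes back in storage order.

```ruby
# Hypothetical stand-in for an Od record; not an ActiveRecord model.
OdRow = Struct.new(:id, :priority, :due_date_time)

# Invented sample data.
ods = [
  OdRow.new(1, 2, Time.utc(2024, 1, 3, 9, 0)),
  OdRow.new(2, 1, Time.utc(2024, 1, 2, 17, 0)),
  OdRow.new(3, 1, Time.utc(2024, 1, 1, 12, 0)),
  OdRow.new(4, 2, Time.utc(2024, 1, 1, 8, 0))
]

# Arrays compare element by element, so this sorts by priority first
# and by due_date_time within the same priority.
sorted = ods.sort_by { |od| [od.priority, od.due_date_time] }
sorted_ids = sorted.map(&:id)

# Imitation of a lookup by id list: membership is checked, order is not kept.
fetched = ods.select { |od| sorted_ids.include?(od.id) }

# Put the fetched records back into the wanted order afterwards.
by_id = fetched.each_with_object({}) { |od, h| h[od.id] = od }
reordered = sorted_ids.map { |id| by_id[id] }
```

The first half is the sort_by approach from the answer; the second half shows why a sorted id list alone does not survive a membership lookup, and one way to restore the order in Ruby if it is needed.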

In Rails 3.2, how to "pluck_in_batches" for a very large table

I have a massive table Foo from which I need to pluck all values in a certain field, Foo.who.
The table has millions of rows, but only a few thousand different values in the who column.
If the table were smaller, of course, I'd simply use Foo.pluck(:who).
If I use Foo.find_in_batches do |a_batch|, each batch is an Array of Foo records rather than an ActiveRecord collection of Foo records, so I cannot use .pluck(), and AFAIK the only way to extract the who column is via something like .map(&:who), which iterates over the array.
Is there a way to pluck the who column from Foo in batches that does not require then iterating over each element of each batch to extract the who column?
In Rails 5 you can use:
Foo.in_batches do |relation|
  values = relation.pluck(:id, :name, :description)
  ...
end
Update: to prevent memory leaks, use:
Foo.uncached do
  Foo.in_batches do |relation|
    values = relation.pluck(:id, :name, :description)
    ...
  end
end
Here's a method to get the ids that were retrieved by the in_batches method itself, without need to run another query yourself.
in_batches already runs pluck(:id) under the hood (when the load param is false, which is the default) and yields the relation object. This relation object is created with where(id: ids_from_pluck).
It is possible to get the list of ids directly from the relation object via the where_values_hash method, without running another query in the DB. For example:
Foo.in_batches do |relation|
  ids = relation.where_values_hash['id']
end
This should work on both Rails 5.x and 6.x, but it relies on an implementation detail of in_batches, so it is not guaranteed to work in the future.
Try this:
Foo.select(:id, :who).find_in_batches do |a_batch|
  ...
end
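Since the question mentions millions of rows but only a few thousand distinct who values, one option is to accumulate each batch into a Set so that only the distinct values are kept between batches. The sketch below is plain Ruby under stated assumptions: the rows are invented, and each_slice merely stands in for whichever batching call is used (it is not find_in_batches or in_batches).

```ruby
require "set"

# Invented sample data standing in for rows of the Foo table.
rows = [
  { id: 1, who: "alice" },
  { id: 2, who: "bob" },
  { id: 3, who: "alice" },
  { id: 4, who: "carol" },
  { id: 5, who: "bob" }
]

distinct_who = Set.new

# each_slice stands in for a batching API: every batch holds at most 2 rows.
rows.each_slice(2) do |batch|
  # Merge only the who values from this batch into the running set.
  distinct_who.merge(batch.map { |row| row[:who] })
end

result = distinct_who.to_a.sort
```

With a batching call that yields a relation, the batch.map step could plausibly be replaced by a pluck on that relation, which is what the in_batches answers above suggest.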

How do I use .sort() to create a relation?

I am using the kaminari gem for pagination. I have a resources controller which paginates perfectly (due to the simple nature of the ordering). That can be seen here:
@resources = Resource.order("created_at desc").page(params[:page]).per(25)
That just sorts them by latest first. When I call .class on it, it appears that it's an ActiveRecord::Relation.
On my tags, though, I want to sort them by a relationship (the number of resources assigned to that tag):
@tags = Tag.all.sort{|a, b| b.number_of_resources <=> a.number_of_resources}.page(params[:page]).per(50)
It gives me the error, however: undefined method `page' for #<Array:...>
Tag.all returns an Array, hence your #page call failing, as it expects an ARel relation.
If #number_of_resources maps to a DB column, then all you need to do is:
Tag.order('number_of_resources').page(params[:page]).per(50)
If it's not, you either need to add it to the Tag database table, or just do your sort/paginate in Ruby rather than using kaminari. This will be feasible if the number of tags is under ~1000 or so.
If you do add the info to the db, check out this post: Counter Cache for a column with conditions?
You should do something like this: 1) join the two tables, 2) group the rows by tag, 3) count how many rows belong to each group, 4) order by that new count column.
Write that as a good SQL statement and then you can call pagination on it.
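The group, count and order steps listed above, together with the "sort/paginate in Ruby" fallback from the earlier answer, can be sketched in plain Ruby as follows. The taggings data is invented, and paginate is a hypothetical helper written for this example, not part of kaminari.

```ruby
# Invented sample data: each pair is [resource_id, tag_name],
# standing in for rows of a join between tags and resources.
taggings = [
  [1, "ruby"], [2, "ruby"], [3, "rails"],
  [4, "ruby"], [5, "sql"],  [6, "rails"]
]

# Group rows by tag, then count the rows in each group.
counts = taggings.group_by { |_resource_id, tag| tag }
                 .map { |tag, rows| [tag, rows.size] }

# Order by the count, highest first (the tag name breaks ties).
ordered = counts.sort_by { |tag, count| [-count, tag] }

# Hypothetical helper: page is 1-based, per_page is the page size.
def paginate(items, page, per_page)
  items.each_slice(per_page).to_a[page - 1] || []
end

page_one = paginate(ordered, 1, 2)
page_two = paginate(ordered, 2, 2)
```

For real use, kaminari also provides Kaminari.paginate_array for wrapping a plain array, which may be preferable to a hand-rolled helper when the array is small enough to sort in memory.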

Rails: select unique values from a column

I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but it doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
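The difference between calling uniq on objects and on extracted values can be seen without a database. In this sketch RatingRow is a hypothetical stand-in for a model instance; it is a plain class, so two instances with the same rating are still different objects as far as Array#uniq is concerned.

```ruby
# Hypothetical stand-in for a model instance; a plain class does not
# define ==, eql? or hash by value, so two instances are never duplicates.
class RatingRow
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

rows = [RatingRow.new(5), RatingRow.new(3), RatingRow.new(5)]

# uniq on the objects keeps all three, because each object is distinct.
unique_objects = rows.uniq

# Extracting the values first lets uniq compare the ratings themselves.
unique_ratings = rows.map(&:rating).uniq
```

This mirrors the map(&:rating).uniq variant above; the pluck variants avoid building the objects in the first place.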
Update
Apparently, as of Rails 5.0.0.1, it works only on "top level" queries, like the ones above. It doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query:
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it returns fewer rows and should be slightly faster than returning a number of rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using SQL strings and not instantiating models.
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since Rails 3.2.
Model.select(:rating).distinct
Another way to collect unique column values with SQL:
Model.group(:rating).pluck(:rating)
If I understand correctly:
Your current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
uniq is applied to that array of objects, and each object has a unique id. uniq is doing its job correctly, because each object in the array is unique.
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing. The first and second queries find distinct data in the SQL query.
Depending on the database and its collation, these queries may consider "london" and "london " the same, that is, ignore the trailing space, so 'london' is selected only once in your query result.
The third and fourth queries find data with the SQL query and then apply Ruby's uniq method to get distinct data.
These queries consider "london" and "london " different, so both 'london' and 'london ' are selected in your query result.
Please refer to the attached image for more detail, and have a look at "Toured / Awaiting RFP".
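The Ruby half of the whitespace point can be checked directly. This small sketch uses invented city names and shows that Array#uniq compares strings exactly, and that stripping whitespace first is one way to make the Ruby-side result ignore trailing spaces. It says nothing about how any particular database compares the two strings.

```ruby
# Invented sample values; note the trailing space in the second one.
cities = ["london", "london ", "paris"]

# Array#uniq compares strings exactly, so the trailing space matters.
exact = cities.uniq

# Normalising first makes the two spellings collapse into one.
normalised = cities.map(&:strip).uniq
```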
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
This works because you first generate an array of Model objects (of reduced size because of the select), then extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date and applying offset and limit. Basically, a combination of ORDER BY and DISTINCT ON.
All you need to do is put DISTINCT ON inside the pluck method, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This would return an array of distinct names.
Model.pluck("DISTINCT column_name")
