I have inherited another programmer's Rails 3 project, and I'm fairly new to Rails overall. He has a query that appears to sort by specific ids. Can somebody explain how this resolves to actual SQL? I think this code is killing the database and, subsequently, Rails. I've tried to output it in the logger but can't get the actual SQL to appear, even with the log level set to :debug. Searching heavily here (on SO) didn't turn up a clear explanation of what this query looks like. The code looks like:
options = {
  select: "SUM(1) AS num_demos, product_id ",
  group: "product_id",
  order: "num_demos ASC",
}
product_ids = Demo.where("state = 'waitlisted'").find(:all, options).collect{|d| d.product_id}
sort_product_ids = product_ids.collect{|product_id| "id = #{product_id}"}
Product.where(visible: true, id: product_ids).order(sort_product_ids.join(', '))
As far as I can see, the final line will create a query against the product table with an ORDER BY "id = 1, id = 3, ..." etc, which doesn't make a lot of sense to me. All clues appreciated.
A quick breakdown of what's going on, as it'll help you understand what to do for your replacement query.
options = {
  select: "SUM(1) AS num_demos, product_id ",
  group: "product_id",
  order: "num_demos ASC",
}
product_ids = Demo.where("state = 'waitlisted'").find(:all, options).collect{|d| d.product_id}
This line will generate
SELECT SUM(1) AS num_demos, product_id FROM "demos" WHERE (state = 'waitlisted') GROUP BY product_id ORDER BY num_demos ASC
The find call returns an array of Demo objects, sorted by the number of rows in each group, where only the selected attributes (product_id and num_demos) have been loaded and are available to you.
Next,
sort_product_ids = product_ids.collect{|product_id| "id = #{product_id}"}
results in a collection of product_ids mapped to the format "id = x". For example, if the previous query returned 10 results, with product_ids ranging from 1..10, sort_product_ids is now equivalent to ["id = 1", "id = 2", "id = 3", "id = 4", "id = 5", "id = 6", "id = 7", "id = 8", "id = 9", "id = 10"]
Finally,
Product.where(visible: true, id: product_ids).order(sort_product_ids.join(', '))
Selects all Products where the column visible is true and whose id is in the product_ids array (plain integers at this point, since the collect call above mapped each Demo object to its product_id). Then it asks SQL to sort that result by sort_product_ids, sent in as the single string "id = 1, id = 2, ... id = 10" rather than as the array ["id = 1", "id = 2", ... "id = 10"]. Each id = x term is a boolean expression, so the database sorts on a series of true/false keys rather than on the id value itself.
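To get a feel for what an ORDER BY built from id = x terms does, here is a small plain-Ruby simulation. It is only a sketch: the helper order_by_id_equals is hypothetical, the ids are made up, and it assumes the database sorts false before true in ascending order (treating them as 0 and 1).

```ruby
# In-memory stand-in for: ORDER BY id = 3, id = 1, id = 2 (ascending).
# Assumption: the database orders false (0) before true (1).
def order_by_id_equals(ids, order_terms)
  ids.sort_by do |id|
    # One 0/1 sort key per "id = x" term, compared left to right.
    order_terms.map { |term| id == term ? 1 : 0 }
  end
end

product_ids = [3, 1, 2]   # made-up ids, in the order the Demo query returned them
rows        = [1, 2, 3]   # ids of the matching products
sorted      = order_by_id_equals(rows, product_ids)
p sorted  # => [2, 1, 3]
```

Under that assumption the products come back in the reverse of the order of product_ids. Since product_ids was sorted by num_demos ascending, the final list would start with the products that have the most waitlisted demos.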
More info available at:
http://guides.rubyonrails.org/active_record_querying.html
http://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html
To select and sort by a given array of ids you can use the following. Note that FIELD() is a MySQL function, and product_ids must not be empty:
Product.where(visible: true, id: product_ids)
  .order("field(id,#{product_ids.join(',')})")
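If the database has no FIELD() function, one database-agnostic alternative (a different technique from the one above) is to load the records and reorder them in Ruby. The sketch below uses a hypothetical Product struct in place of real ActiveRecord rows, and sort_by_id_list is an illustrative helper name, not a Rails method.

```ruby
# Hypothetical stand-in for ActiveRecord rows; only id matters here.
Product = Struct.new(:id, :name)

# Reorders records so that their ids follow the order given in ids.
def sort_by_id_list(records, ids)
  position = ids.each_with_index.to_h   # id => position in the list
  records.sort_by { |record| position.fetch(record.id) }
end

products = [Product.new(1, "a"), Product.new(2, "b"), Product.new(3, "c")]
ordered  = sort_by_id_list(products, [3, 1, 2])
p ordered.map(&:id)  # => [3, 1, 2]
```

This sorts in memory, so it only makes sense when the result set is small enough to load in full.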
If you are using PostgreSQL, consider using WITH ORDINALITY, which is reported to be faster than the alternatives. See this thread.
To apply this method to Ruby on Rails, for example:
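# Must be a module (not a class) so that it can be used with extend.
module SpecifiedByIds
  # ids are interpolated straight into the SQL, so they must be integers.
  def specified_by_ids(ids)
    joins(
      <<~SQL
        LEFT JOIN unnest(array[#{ids.join(',')}])
        WITH ORDINALITY AS t(id,odr) ON t.id = #{table_name}.id
      SQL
    ).order('t.odr')
  end
end

class MyModel < ActiveRecord::Base
  extend SpecifiedByIds
end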
Basically I'd like to return all people whose current job title is X and whose previous job title is Y. As an example, I have a talent whose current employment is at "Airbnb (company_id = 1)" and whose previous employment is at "Youtube (company_id = 2)".
If I run a query to find talent where current employment is Airbnb:
Talent.joins(:job_histories).where(["job_histories.company_id = ? and job_histories.end_year = ?", 1, "Present"])
I get the person.
If I run a query where previous employment is Youtube (hence the end_year != "Present" below)
Talent.joins(:job_histories).where(["job_histories.company_id = ? and job_histories.end_year != ?", 2, "Present"])
I also get the same person.
However, if I chain them together to find talents where current employer is Airbnb AND previous employer is Youtube, like this:
@talents = Talent.all
@talents = @talents.joins(:job_histories).where(["job_histories.company_id = ? and job_histories.end_year = ?", 1, "Present"])
@talents = @talents.joins(:job_histories).where(["job_histories.company_id = ? and job_histories.end_year != ?", 2, "Present"])
I do not get any results. I've tried several variations of the query but none return anything.
The only way I can get it to work is by using the first query and then looping over each talent to find where job_histories.company_id == 2.
if params[:advanced_current_company] && params[:advanced_previous_company]
  @talents = @talents.joins(:job_histories).where(job_histories: { company_id: params[:advanced_current_company] }).distinct if params[:advanced_current_company]
  @talents.each do |talent|
    talent.job_histories.each do |job_history|
      if job_history.company_id == params[:advanced_previous_company][0].to_i
        new_talents.append(talent.id)
      end
    end
  end
  @talents = Talent.where(id: new_talents)
end
Any direction would be amazing. Thanks!
You had the right idea with a double join on job_histories, but you need to alias the table so that the two joins can be told apart in the query. Otherwise ActiveRecord treats them as a single join, and both conditions are applied to the same job_histories row, which no row can satisfy.
Talent.joins("INNER JOIN job_histories as jh1 ON jh1.talent_id = talents.id")
  .joins("INNER JOIN job_histories as jh2 ON jh2.talent_id = talents.id")
  .where("jh1.company_id = ? and jh1.end_year = ?", 1, "Present")
  .where("jh2.company_id = ? and jh2.end_year != ?", 2, "Present")
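The difference between one join and two aliased joins can be sketched in plain Ruby. This is only an in-memory illustration: the JobHistory struct and the sample rows are hypothetical and stand in for the job_histories table.

```ruby
# Hypothetical in-memory rows standing in for the job_histories table.
JobHistory = Struct.new(:talent_id, :company_id, :end_year)

HISTORIES = [
  JobHistory.new(1, 1, "Present"), # talent 1: currently at company 1
  JobHistory.new(1, 2, "2015"),    # talent 1: previously at company 2
  JobHistory.new(2, 1, "Present")  # talent 2: currently at company 1 only
].freeze

current  = ->(jh) { jh.company_id == 1 && jh.end_year == "Present" }
previous = ->(jh) { jh.company_id == 2 && jh.end_year != "Present" }

# One join: both conditions must hold for the same row, which never happens.
single_join = HISTORIES.select { |jh| current.(jh) && previous.(jh) }
                       .map(&:talent_id).uniq

# Two aliased joins: pair up rows of the same talent, one condition per alias.
double_join = HISTORIES.product(HISTORIES)
                       .select { |jh1, jh2| jh1.talent_id == jh2.talent_id }
                       .select { |jh1, jh2| current.(jh1) && previous.(jh2) }
                       .map { |jh1, _jh2| jh1.talent_id }.uniq

p single_join  # => []
p double_join  # => [1]
```

The trailing uniq plays the role that distinct would play in the SQL version when a talent has several matching rows.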
I have a Rails query which is shown below:
query_results =
User.
joins("INNER JOIN posts ON posts.user_id = users.user_id").
select("posts.topic, posts.thread_id")
query_results contains values of 2 columns: topic and thread_id.
I would like to split query_results into 2 arrays - 1 containing values from all records (from query_results) for column topic alone and the 2nd containing values from all records for column thread_id alone.
How can I achieve this?
You can use pluck here. According to the Rails guides, pluck converts a database result directly into a Ruby array, without constructing ActiveRecord objects. This means better performance for a large or frequently run query.
topic_arr = []
thread_id = []
query_results = User.joins("INNER JOIN posts ON posts.user_id = users.user_id").pluck("posts.topic, posts.thread_id")
query_results.each do |i|
  topic_arr.push(i.first)
  thread_id.push(i.last)
end
# p (unlike puts) prints the arrays in their literal form
p query_results #=> [["topic1", 1], ["topic2", 2], ["topic3", 3]]
p topic_arr #=> ["topic1", "topic2", "topic3"]
p thread_id #=> [1, 2, 3]
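Once the rows are plain two-element arrays, Array#transpose can do the split without an explicit loop. The sketch below uses made-up sample rows in place of a real pluck result, and split_columns is a hypothetical helper name.

```ruby
# Splits an array of [topic, thread_id] pairs into two parallel arrays.
def split_columns(rows)
  # transpose on an empty array returns [], so handle that case explicitly.
  return [[], []] if rows.empty?

  rows.transpose
end

# Made-up rows in the shape pluck returns for two columns.
query_results = [["topic1", 1], ["topic2", 2], ["topic3", 3]]
topic_arr, thread_id_arr = split_columns(query_results)

p topic_arr      # => ["topic1", "topic2", "topic3"]
p thread_id_arr  # => [1, 2, 3]
```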
You can also try the code below. Note that to_h keeps only one thread_id per topic, so it only fits when the topics are unique:
query_results =
  User.
    joins("INNER JOIN posts ON posts.user_id = users.user_id").
    pluck("posts.topic, posts.thread_id").to_h
topic_arr = query_results.keys
thread_id_arr = query_results.values
Example
The query above will give you a result like:
query_results = {"topic 1"=>1, "topic 2" => 2}
topic_arr = query_results.keys # => ["topic 1", "topic 2"]
thread_id_arr = query_results.values # => [1, 2]
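One thing worth checking before relying on to_h is whether topics can repeat, because a hash keeps only the last value for each key. A small sketch with made-up rows:

```ruby
# Made-up rows where the topic "rails" appears twice.
rows = [["rails", 1], ["rails", 2], ["ruby", 3]]

# to_h keeps one entry per key: "rails" ends up mapped to 2, and 1 is lost.
as_hash = rows.to_h

# Grouping keeps every thread_id for each topic instead.
grouped = rows.group_by(&:first)
              .transform_values { |pairs| pairs.map(&:last) }

p as_hash.size      # => 2
p grouped["rails"]  # => [1, 2]
```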
I have a table products which has a product_type_code column on it. What I'd like to do is retrieve different numbers of objects based on this column (e.g. 3 products with product_type_code = 'fridge', 6 products with product_type_code = 'car', 9 products with product_type_code = 'house', etc.).
I know I can do like this:
fridges = Product.where(product_type_code: 'fridge').limit(3)
houses = Product.where(product_type_code: 'house').limit(9)
[...]
And even create a scope like this:
# app/models/product.rb
scope :by_product_type_code, ->(product_type_code) { where(product_type_code: product_type_code) }
However, this is not efficient since I go to the database 3 times, if I'm not wrong. What I'd like to do is something like:
scope :by_product_type_code, -> (hash) { some_method(hash) }
where hash is: { fridge: 3, car: 6, house: 9 }
and get an ActiveRecord_Relation containing 3 fridges, 6 cars and 9 houses.
How can I do that efficiently?
You can build a query with UNION ALL, in which each subquery selects the records with a specific product_type_code and applies its own limit, and then run it with find_by_sql:
{ fridge: 3, car: 6, house: 9 }.map do |product_type_code, limit|
"(SELECT *
FROM products
WHERE product_type_code = '#{product_type_code}'
LIMIT #{limit})"
end.join(' UNION ALL ')
And you're gonna have a query like:
(SELECT * FROM products WHERE product_type_code = 'fridge' LIMIT 3)
UNION ALL
(SELECT * FROM products WHERE product_type_code = 'car' LIMIT 6)
UNION ALL
(SELECT * FROM products WHERE product_type_code = 'house' LIMIT 9)
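Because the type codes and limits are interpolated into the SQL string, it may be worth guarding them if they can come from user input. Below is a minimal sketch: union_all_sql is a hypothetical helper, and the quote-doubling is a simplified stand-in for the database connection's own quote method, which is the safer choice inside Rails.

```ruby
# Hypothetical helper: builds the UNION ALL statement from a
# { type_code => limit } hash.
def union_all_sql(limits)
  limits.map do |type_code, limit|
    # Simplified quoting (doubles single quotes); an assumption for this
    # sketch, not a replacement for the connection's quote method.
    quoted = "'#{type_code.to_s.gsub("'", "''")}'"
    # Integer() raises ArgumentError if the limit is not a whole number.
    "(SELECT * FROM products WHERE product_type_code = #{quoted} LIMIT #{Integer(limit)})"
  end.join(" UNION ALL ")
end

sql = union_all_sql({ fridge: 3, car: 6, house: 9 })
puts sql
# In a Rails app the string would then go to Product.find_by_sql(sql).
```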
@SebastianPalma's answer is the best solution; however, if you are looking for a more "railsy" way of generating this query, you can use Arel as follows:
scope :by_product_type_code, ->(h) {
  products_table = self.arel_table
  # One "SELECT id ... LIMIT n" per product type
  query = h.map do |product_type, limit|
    products_table.project(:id)
      .where(products_table[:product_type_code].eq(product_type))
      .take(limit)
  end.reduce do |scope1, scope2|
    Arel::Nodes::UnionAll.new(scope1, scope2)
  end
  self.where(id: query)
}
This will result in the sub query being part of the where clause.
Or
scope :by_product_type_code, ->(h) {
  products_table = self.arel_table
  # One "SELECT * ... LIMIT n" per product type
  query = h.map do |product_type, limit|
    products_table.project(Arel.star)
      .where(products_table[:product_type_code].eq(product_type))
      .take(limit)
  end.reduce do |scope1, scope2|
    Arel::Nodes::UnionAll.new(scope1, scope2)
  end
  sub_query = Arel::Nodes::As.new(query, products_table)
  self.from(sub_query)
}
This will result in the subquery being the source of the data.
I have two tables connected with habtm relation (through a table).
Table1
id : integer
name: string
Table2
id : integer
name: string
Table3
id : integer
table1_id: integer
table2_id: integer
I need to group Table1 records by the Table2 records they have in common. Example:
userx = Table1.create()
user1.table2_ids = 3, 14, 15
user2.table2_ids = 3, 14, 15, 16
user3.table2_ids = 3, 14, 16
user4.table2_ids = 2, 5, 7
user5.table2_ids = 3, 5
Result of grouping that I want is something like
=> [ [ [1,2], [3, 14, 15] ], [ [2,3], [3,14, 16] ], [ [ 1, 2, 3, 5], [3] ] ]
Where the first array holds the user ids and the second holds the table2_ids they share.
Is there any possible SQL solution, or do I need to create some kind of algorithm?
Updated:
OK, I have code that works the way I described. Maybe it will help anyone who wants to help me understand my idea.
def self.compare
  hash = {}
  Table1.find_each do |table_record|
    Table1.find_each do |another_table_record|
      if table_record != another_table_record
        results = table_record.table2_ids & another_table_record.table2_ids
        hash["#{table_record.id}_#{another_table_record.id}"] = results if !results.empty?
      end
    end
  end
  #hash = hash.delete_if{|k,v| v.empty?}
  hash.sort_by{|k,v| v.count}.to_h
end
But you can imagine how long it takes to produce the output. For my 500 Table1 records it takes about 1-2 minutes. With more records the time will grow even faster, so I need a more elegant solution or an SQL query.
Table1.find_each do |table_record|
Table1.find_each do |another_table_record|
...
The code above has a performance problem: it queries the database on the order of N*N times, which can be reduced to a single query.
# Query table3, constructing the data useful to us
# { table1_id: [table2_ids], ... }
records = Table3.all.group_by { |t| t.table1_id }.map { |t1_id, t3_records|
  [t1_id, t3_records.map(&:table2_id)]
}.to_h
Then you can apply exactly the same comparison to records to get the final result hash.
UPDATE:
@AKovtunov You misunderstood me. My code is the first step. With records, which is a {t1_id: t2_ids} hash, you could do something like this:
hash = {}
records.each do |t1_id, t2_ids|
  records.each do |tt1_id, tt2_ids|
    if t1_id != tt1_id
      inter = t2_ids & tt2_ids
      hash["#{t1_id}_#{tt1_id}"] = inter if !inter.empty?
    end
  end
end
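The nested loop above visits every pair twice ("1_2" and "2_1"). A possible refinement, sketched here on the sample ids from the question, is to use Array#combination so that each unordered pair is compared only once. The records hash is hard-coded for illustration; in the real code it would come from the Table3 query.

```ruby
# Sample data in the { table1_id => [table2_ids] } shape, using the ids
# from the question.
records = {
  1 => [3, 14, 15],
  2 => [3, 14, 15, 16],
  3 => [3, 14, 16],
  4 => [2, 5, 7],
  5 => [3, 5]
}

# combination(2) yields each unordered pair once, so "1_2" is computed
# but the redundant "2_1" is not.
shared = records.to_a.combination(2).each_with_object({}) do |((id_a, ids_a), (id_b, ids_b)), hash|
  inter = ids_a & ids_b
  hash["#{id_a}_#{id_b}"] = inter unless inter.empty?
end

# Same final ordering as in the question: fewest shared ids first.
sorted = shared.sort_by { |_pair, ids| ids.size }.to_h

p shared["1_2"]  # => [3, 14, 15]
```

This still does a quadratic number of comparisons, but they all run in memory without touching the database, and only half as many are needed.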
I have a small question about writing a query through the Active Record Query Interface.
Here is the query:
Gp.select("date('gps'.'created_at') as date,('users'.'name') as name, SUM('gps'.'sum_issue') as sum_issue").joins('LEFT JOIN users ON users.id = gps.user_id').where("users.ab_id = :abs_id AND users.id != 20", {:abs_id => current_user.ab_id}).group("users.name")
The result of the query must be the user name, the sum, and the date. If I run this query directly in SQLite it works, but
the Active Record Query Interface gives me
[#<Gp sum_issue: 289000>, #<Gp sum_issue: 364130>, #<Gp sum_issue: 620000>]
How can I get the name, date and sum_issue and show them in my helper,
like this:
{
created_at: datet,
sum_issue: sum_issue,
name: name
}
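Try the following. The aliased columns (date and name) do not appear in the inspect output because they are not columns of the gps table, but they are still available as methods on each returned record: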
Gp.select("date('gps'.'created_at') as date,('users'.'name') as name, SUM('gps'.'sum_issue') as sum_issue").joins('LEFT JOIN users ON users.id = gps.user_id').where("users.ab_id = :abs_id AND users.id != 20", {:abs_id => current_user.ab_id}).group("users.name").first.name
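To build the hash shape asked for in the question, each returned record can be mapped over its aliased attributes. The sketch below uses a hypothetical GpRow struct with made-up values in place of the real Gp records, on the assumption that the real records respond to date, name and sum_issue in the same way.

```ruby
# Hypothetical stand-in for the Gp records returned by the query.
GpRow = Struct.new(:date, :name, :sum_issue)

# Made-up sample values.
rows = [
  GpRow.new("2014-01-10", "alice", 289_000),
  GpRow.new("2014-01-11", "bob", 364_130)
]

# One hash per record, in the shape the helper expects.
summaries = rows.map do |row|
  { created_at: row.date, sum_issue: row.sum_issue, name: row.name }
end

p summaries.first[:name]  # => "alice"
```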