Rails ActiveRecord query pluck select sum - ruby-on-rails

I have a complicated query that calculates the sum of the weights of an object's related traits (for PostgreSQL):
Object.joins(:object_traits).where(object_traits: {name: [list_of_names]}).select("sum(object_traits.weight) as sum_weight, #{other direct object traits}").group("#{other direct object traits}").order('sum_weight')
Ideally, I would like to pluck the sum of the weights for each Object.

Since the argument to pluck actually takes the place of the SELECT clause in the generated query, you can accomplish this just by a) moving your .select() invocation to the end of the chain and b) changing it to a .pluck(). For example, the following works in a quick demo app:
irb> User.group("name").order("SUM(age) DESC").pluck("name, SUM(age)")
(1.0ms) SELECT name, SUM(age) FROM "users" GROUP BY "users"."name" ORDER BY SUM(age) DESC
=> [["rob2", 13], ["rob4", 5], ["rob", 1]]
Any joins or where clauses should still work, too.
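Applied to the query from the question, the same idea might look like the sketch below. It assumes a model whose table is named objects and which has an object_traits association with name and weight columns; objects.id stands in for the "other direct object traits" placeholder, and weight_sums is only an illustrative name, not part of Rails.

```ruby
# Sketch only: `scope` is assumed to be an ActiveRecord model or relation backed
# by an "objects" table with an `object_traits` association (hypothetical schema).
def weight_sums(scope, trait_names)
  scope
    .joins(:object_traits)
    .where(object_traits: { name: trait_names })
    .group("objects.id")
    .order("SUM(object_traits.weight)")
    # pluck replaces the SELECT list, so the aggregate goes here instead of in select
    .pluck("objects.id", "SUM(object_traits.weight)")
end
```

Called as weight_sums(Object, list_of_names), this should return one [id, sum] pair per object instead of model instances.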

Related

How to query JSONB field with IN operator in rails - active record

I have a jsonb column called lms_data with a hash data structure inside. I am trying to find elements that match an array of ids. This query works and returns the correct result:
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
"courses.lms_data->>'_id' IN ('604d26cadb238f542f2fa', '604541eb0ff9d7b28828c')")
SQL log:
CoursesProgram Load (0.5ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->>'_id' IN ('604d26cadb61e238f542f2fa', '604541eb0ff9d8387b28828c')) [["program_id", 12]]
However, when I try to pass the array of ids as a variable:
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
"courses.lms_data->'_id' IN (?)",
["604d26cadb61e238f542f2fa", "604541eb0ff9d8387b28828c"])
I don't get any results, and two queries show up in the logs:
CoursesProgram Load (16.6ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->'_id' IN ('604d26cadb61e238f542f2fa','604541eb0ff9d8387b28828c')) [["program_id", 12]]
CoursesProgram Load (0.8ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->'_id' IN ('604d26cadb61e238f542f2fa','604541eb0ff9d8387b28828c')) LIMIT $2 [["program_id", 12], ["LIMIT", 11]]
I cannot wrap my head around this one.
The queries performed in both cases seem to be the same. Why does one work and the other not? And why is the query performed twice in the second case?
The question mark is an operator in its own right in Postgres's JSON operator set (it means "does this exist"). ActiveRecord tries to do what it thinks you want with the ? placeholder, but the two uses can get in each other's way.
Solution.
Don't use it. Since ? can clash with Postgres's JSON operators, I use named substitution instead.
From the Postgres documentation:
?| (right operand text[]): Do any of these array strings exist as top-level keys? Example: '{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'c']
So first, we use the ?| Postgres JSON operator to test whether any of the given strings exists in the value taken from lms_data.
Second, we tell Postgres that we are passing an array by using the array constructor, array[:named_substitution].
Lastly, after the comma that follows the SQL string, add your named substitution variable (in this case I used :ids) and your array.
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
"courses.lms_data->'_id' ?| array[:ids]",
ids: ['604d26cadb238f542f2fa', '604541eb0ff9d7b28828c'])
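It is also worth noting that the two where clauses in the question are not identical: the working one extracts _id as text with ->>, while the failing one extracts it as jsonb with ->. A minimal sketch that keeps the text comparison and only swaps the literal list for a named placeholder could look like this. It relies on ActiveRecord expanding an array bound to a named placeholder into a quoted, comma-separated list; with_lms_ids is a hypothetical helper name.

```ruby
# Sketch only: `scope` is assumed to be a relation that already joins courses,
# e.g. CoursesProgram.joins(:course).where(program_id: 12).
def with_lms_ids(scope, ids)
  # ->> returns text, so the values are compared against plain strings;
  # :ids is a named placeholder, so no bare ? appears in the SQL fragment.
  scope.where("courses.lms_data->>'_id' IN (:ids)", ids: ids)
end
```

As for the second question, the extra query ending in LIMIT 11 looks like the one the Rails console issues when it inspects a relation to print it, so it is probably not caused by the where clause itself.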

Unable to figure out what is going on with Active Record distinct method

I am trying to select distinct records from the models table based on user_id. I have tried writing this in a few possible ways. I don't understand why using distinct with pluck [2] returns the correct value, while distinct on its own [3] does not seem to work at all. What am I missing here?
Using 'rails', '~> 5.2.3'.
[1] pry(main)> Model.select(:user_id).distinct.count
(2967.7ms) SELECT COUNT(DISTINCT "models"."user_id") FROM "models"
=> 11432
[2] pry(main)> Model.distinct(:user_id).pluck(:user_id).count
(690.6ms) SELECT DISTINCT "models"."user_id" FROM "models"
=> 11432
[3] pry(main)> Model.distinct(:user_id).count
(1076.7ms) SELECT COUNT(DISTINCT "models"."id") FROM "models"
=> 2531300
1) This is the correct way to count the number of distinct user_id values in the database using the ActiveRecord query language.
2) Note that pluck already runs the database query. That means it loads every distinct user_id value into a Ruby array and then counts the number of elements in that array.
3) The argument to distinct is not a column name but a boolean. That means that calling distinct(:user_id) is basically the same as calling distinct(true), and true is the default anyway. That means
Model.distinct(:user_id).count
is the same as
Model.distinct.count
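If the goal is to keep the distinct call but still count user_id values in the database, the column can be passed to count instead of to distinct. A minimal sketch, where distinct_user_count is a hypothetical helper name and the model is assumed to have a user_id column:

```ruby
# Sketch only: `scope` is assumed to be an ActiveRecord model or relation
# with a user_id column.
def distinct_user_count(scope)
  # The column goes to count, not to distinct; this should generate
  # SELECT COUNT(DISTINCT "models"."user_id") FROM "models"
  scope.distinct.count(:user_id)
end
```

This does the counting in SQL, so no array of ids is built in Ruby as it is with pluck.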

How to attach raw SQL to an existing Rails ActiveRecord chain?

I have a rule builder that ultimately builds up ActiveRecord queries by chaining multiple where calls, like so:
Track.where("tracks.popularity < ?", 1).where("(audio_features ->> 'valence')::numeric between ? and ?", 2, 5)
Then, if someone wants to sort the results randomly, it would append order("random()").
However, given the table size, random() is extremely inefficient for ordering, so I need to use Postgres TABLESAMPLE-ing.
In a raw SQL query, that looks like this:
SELECT * FROM "tracks" TABLESAMPLE SYSTEM(0.1) LIMIT 250;
Is there some way to add that TABLESAMPLE SYSTEM(0.1) to the existing chain of ActiveRecord calls? Putting it inside a where() or order() doesn't work since it's not a WHERE or ORDER BY function.
irb(main):004:0> Track.from('"tracks" TABLESAMPLE SYSTEM(0.1)')
Track Load (0.7ms) SELECT "tracks".* FROM "tracks" TABLESAMPLE SYSTEM(0.1) LIMIT $1 [["LIMIT", 11]]
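Since the FROM fragment is raw SQL, a rule builder may want to assemble it in one place and validate the inputs first. The helper below is a sketch under that assumption; tablesample_from is a hypothetical name, and the checks are only one way to keep arbitrary strings out of the fragment.

```ruby
# Sketch only: builds the raw FROM fragment that is passed to ActiveRecord's from.
def tablesample_from(table_name, percent)
  pct = Float(percent) # raises if percent is not numeric
  raise ArgumentError, "percent must be in (0, 100]" unless pct > 0 && pct <= 100
  unless table_name.to_s.match?(/\A[a-z_][a-z0-9_]*\z/i)
    raise ArgumentError, "unexpected table name"
  end
  %("#{table_name}" TABLESAMPLE SYSTEM(#{pct}))
end

# Intended use inside an existing chain (needs Rails and PostgreSQL):
#   Track.where("tracks.popularity < ?", 1)
#        .from(tablesample_from("tracks", 0.1))
#        .limit(250)
```

Keep in mind that the sample is taken from the table before the WHERE conditions are applied, so a small percentage combined with selective filters can return fewer rows than the limit.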

Arel and CTE to Wrap Query

I have to update a search builder which builds a relation with a CTE. This is necessary because a complex relation (which includes DISTINCT, JOINs etc.) is first built and then its results have to be ordered, all in one query.
Here's a simplified look at things:
rel = User.select('DISTINCT ON (users.id) users.*').where(<lotsastuff>)
rel.to_sql
# SELECT DISTINCT ON (users.id) users.*
# FROM "users"
# WHERE <lotsastuff>
rel2 = User.from_cte('cte_table', rel).order(:created_at)
rel2.to_sql
# WITH "cte_table" AS (
# SELECT DISTINCT ON (users.id) users.*
# FROM "users"
# WHERE <lotsastuff>
# ) SELECT "cte_table".* FROM "cte_table"
# ORDER BY "cte_table"."created_at" ASC
The beauty of it is that rel2 responds as expected e.g. to count.
The from_cte method is provided by the "postgres_ext" gem, which appears to have been abandoned. I'm therefore looking for another way to build the relation rel2 from rel.
The Arel docs mention a case which doesn't seem to help here.
Any hints on how to get there? Thanks a bunch!
PS: I know how to do this with two queries by selecting all user IDs in the first, then building a query with IN over the IDs and ordering there. However, I'm curious whether this is possible with one query (with or without a CTE) as well.
Since your CTE is non-recursive, you can rewrite it as a subquery in the FROM clause. The only change is that Postgres's planner will optimize it as part of the main query instead of separately (before PostgreSQL 12, a CTE always acted as an optimization fence). In ActiveRecord this works for me (tested on 5.1.4):
2.4.1 :001 > rel = User.select("DISTINCT ON (users.id) users.*").where("1=1")
2.4.1 :002 > puts User.from(rel, 'users').order(:created_at).to_sql
SELECT "users".* FROM (SELECT DISTINCT ON (users.id) users.* FROM "users" WHERE (1=1)) users ORDER BY "users"."created_at" ASC
I don't see any way to squeeze a CTE into ActiveRecord without extending it though, like what postgres_ext does. Sorry!
From what you've mentioned, I don't understand why you need a CTE instead of just a nested query.
rel = User.select('DISTINCT ON (users.id) users.*').where(<lotsastuff>).arel
inner_query = Arel::Table.new(:inner_query)
composed_cte = Arel::Nodes::As.new(inner_query, rel)
select_manager = Arel::SelectManager.new(composed_cte)
rel2 = select_manager.project('COUNT(*)')
rel2.to_sql
rel3 = select_manager.order('created_at ASC')
rel3.to_sql
You can then execute that SQL.

How to combine 3 SQL request into one and order it Rails

I'm creating a filter for my Point model in a Ruby on Rails app. The app uses ActiveAdmin+Ransacker for filters. I wrote 3 methods to filter the Point:
def self.filter_by_customer_bonus(bonus_id)
Point.joins(:customer).where('customers.bonus_id = ?', bonus_id)
end
def self.filter_by_classificator_bonus(bonus_id)
Point.joins(:setting).where('settings.bonus_id = ?', bonus_id)
end
def self.filter_by_bonus(bonus_id)
Point.where(bonus_id: bonus_id)
end
Everything works fine, but I need to merge the results of the 3 methods into one array. When Point.count is greater than 1000000 (on the production server, for example), this is too slow, so I need to merge all of them into one method. The problem is that I need to order the final merged array this way:
The result array should start with the result of the first method, followed by the result of the second method, and then the result of the third.
Is it possible to combine these 3 SQL queries into 1 to make it faster, and to order the result as described above?
For example my Points are [1,2,3,4,5,6,7,8,9,10]
Result of first = [1,2,3]
Result of second = [2,3,4]
Result of third = [5,6,7]
After the merge I should get [1,2,3,4,5,6,7], but it should come from 1 method, not from 3 methods plus a merge. Hope you understand me :)
UPDATE:
The result of the first answer:
Point Load (8.0ms) SELECT "points".* FROM "points" INNER JOIN "customers" ON "customers"."number" = "points"."customer_number" INNER JOIN "managers" ON "managers"."code" = "points"."tp" INNER JOIN "settings" ON "settings"."classificator_id" = "managers"."classificator_id" WHERE "points"."bonus_id" = $1 AND "customers"."bonus_id" = $2 AND "settings"."bonus_id" = $3 [["bonus_id", 2], ["bonus_id", 2], ["bonus_id", 2]]
It returns an empty array.
You can union these using or (documentation):
def self.filter_trifecta(bonus_id)
(
filter_by_customer_bonus(bonus_id)
).or(
filter_by_classificator_bonus(bonus_id)
).or(
filter_by_bonus(bonus_id)
)
end
Note: you might have to hoist those joins up to the first condition; I'm not sure whether .or will handle those forks well as-is.
The code below gives you all the results in a single query. If you have indexes on the foreign keys used here, it should be able to handle a million records.
The one provided earlier does an AND on all 3 queries, which is why you got zero results. You need a union, so the code below should work. (Note: if you are using Rails 5, there is ActiveRecord syntax for this, which the first commenter provided.)
Updated:
Point.from(
"(#{Point.joins(:customer).where(customers: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.joins(:setting).where(settings: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.where(bonus_id: bonus_id).to_sql})
AS points")
Alternatively, you can reuse your 3 methods like this:
Point.from("(#{Point.filter_by_customer_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_classificator_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_bonus(bonus_id).to_sql}
) as points")
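One caveat: a plain UNION removes duplicates but does not promise any row order, so it does not by itself give the "first method first" ordering asked for in the question. A possible sketch is to tag each branch with a priority, combine the branches with UNION ALL, and keep the lowest priority per id. The helper name prioritized_union and the priority column are assumptions made for illustration, and DISTINCT ON is PostgreSQL-specific.

```ruby
# Sketch only: `queries` is an array of SQL strings (e.g. from relation.to_sql),
# listed in the order in which their rows should appear in the final result.
def prioritized_union(queries, as: "points")
  tagged = queries.each_with_index.map do |sql, index|
    "SELECT branch.*, #{index + 1} AS priority FROM (#{sql}) branch"
  end
  # DISTINCT ON (id) with ORDER BY id, priority keeps, for every id,
  # the row that came from the earliest branch.
  "(SELECT DISTINCT ON (id) * FROM (#{tagged.join(' UNION ALL ')}) tagged " \
    "ORDER BY id, priority) AS #{as}"
end

# Intended use (needs Rails and PostgreSQL):
#   sqls = [Point.filter_by_customer_bonus(bonus_id),
#           Point.filter_by_classificator_bonus(bonus_id),
#           Point.filter_by_bonus(bonus_id)].map(&:to_sql)
#   Point.from(prioritized_union(sqls)).order("priority, id")
```

With the example from the question (results [1,2,3], [2,3,4] and [5,6,7]), points 2 and 3 keep priority 1, point 4 gets priority 2, and the final ordering by priority and id yields 1 through 7.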

Resources