I'm trying to optimize the speed in my grails app.
I have this:
Catalog a= Catalog.findByName('a');
Element b= Element.findByCatalogAndNumber(a,2);
This way I can find b.
But I'm thinking I could use something like this:
Element b= Element.createCriteria().get{
catalog{
eq("name",'a')
}
eq("number",2)
}
But I'm not sure if it reduces the queries to the database, or I'm just making a fool of myself and creating even bigger files and reducing the speed of my app by doing this.
Any ideas?
I have compared three versions of your query using
Grails 2.4.4, with the default cache settings in the Grails application
PostgreSQL 8.4, SQL statement logging has been turned on to count/see the SQL queries.
The first version uses two calls on the Grails domain classes:
def query1() {
Catalog a = Catalog.findByName('a');
log.info(a)
Element b = Element.findByCatalogAndPos(a, 2);
log.info(b)
render(b.toString())
}
The 2nd one uses a criteria query:
def query2() {
Element b = Element.createCriteria().get {
catalog {
eq("name", "a")
}
eq("pos", 2)
}
render(b.toString())
}
The last one uses a where query:
def query3() {
def query = Element.where {
catalog.name == "a" && pos == 2
}
Element b = query.get()
render(b.toString())
}
The first one results in two SQL queries, the other ones will only send one query to the database (using an inner join from Element to Catalog).
As for readability/expressiveness, choose the 3rd version: It expresses your intention in a single line, and it's the most compact version.
As for performance, choose the 2nd or the 3rd version. Under high load, many concurrent users/requests, the number of queries does matter. This might not be an issue for all applications.
Anyway, I'd always choose the 3rd version for its expressiveness; it will also scale if the query conditions get more complex over time.
Update
The SQL statements used by the 1st version:
select this_.id as id1_1_0_, this_.version as version2_1_0_, this_.date_created as date_cre3_1_0_, this_.last_updated as last_upd4_1_0_, this_.name as name5_1_0_, this_.remark as remark6_1_0_
from catalog this_
where this_.name=$1 limit $2
Parameter: $1 = 'a', $2 = '1'
select this_.id as id1_2_0_, this_.version as version2_2_0_, this_.catalog_id as catalog_3_2_0_, this_.date_created as date_cre4_2_0_, this_.last_updated as last_upd5_2_0_, this_.pos as pos6_2_0_, this_.remark as remark7_2_0_
from element this_
where this_.catalog_id=$1 and this_.pos=$2 limit $3
Parameter: $1 = '10', $2 = '2', $3 = '1'
The SQL statement for the 2nd and 3rd version:
select this_.id as id1_2_1_, this_.version as version2_2_1_, this_.catalog_id as catalog_3_2_1_, this_.date_created as date_cre4_2_1_, this_.last_updated as last_upd5_2_1_, this_.pos as pos6_2_1_, this_.remark as remark7_2_1_, catalog_al1_.id as id1_1_0_, catalog_al1_.version as version2_1_0_, catalog_al1_.date_created as date_cre3_1_0_, catalog_al1_.last_updated as last_upd4_1_0_, catalog_al1_.name as name5_1_0_, catalog_al1_.remark as remark6_1_0_
from element this_ inner join catalog catalog_al1_
on this_.catalog_id=catalog_al1_.id
where (catalog_al1_.name=$1) and this_.pos=$2
Parameter: $1 = 'a', $2 = '2'
Related
I have a jsonb column called lms_data with a hash data-structure inside. I am trying to find elements that match an array of ids. This query works and returns the correct result :
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
"courses.lms_data->>'_id' IN ('604d26cadb238f542f2fa', '604541eb0ff9d7b28828c')")
SQL LOG :
CoursesProgram Load (0.5ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->>'_id' IN ('604d26cadb61e238f542f2fa', '604541eb0ff9d8387b28828c')) [["program_id", 12]
However when I try to pass a variable as the array of ids :
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
"courses.lms_data->'_id' IN (?)",
["604d26cadb61e238f542f2fa", "604541eb0ff9d8387b28828c"])
I don't get any results, and I see two queries performed in the logs:
CoursesProgram Load (16.6ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->'_id' IN ('604d26cadb61e238f542f2fa','604541eb0ff9d8387b28828c')) [["program_id", 12]]
CoursesProgram Load (0.8ms) SELECT "courses_programs".* FROM "courses_programs" INNER JOIN "courses" ON "courses"."id" = "courses_programs"."course_id" WHERE "courses_programs"."program_id" = $1 AND (courses.lms_data->'_id' IN ('604d26cadb61e238f542f2fa','604541eb0ff9d8387b28828c')) LIMIT $2 [["program_id", 12], ["LIMIT", 11]]
I cannot wrap my head around this one.
The queries performed in both cases seem to be the same. Why does one work and the other not? And why is the query performed twice in the second case?
The question mark is an operator in its own right in Postgres's JSON operator set (it means "does this exist?"). ActiveRecord tries to do what it thinks you want, but that has its limits.
Solution.
Don't use it. Since the ? can cause problems with postgres's json query, I use named substitution instead.
From the Postgres documentation:
Operator: ?|
Right operand type: text[]
Description: Do any of these array strings exist as top-level keys?
Example: '{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'c']
First, we use the Postgres JSON operator ?| to look for any of the given values in lms_data.
Second, we tell Postgres that we are passing an array, using the Postgres array constructor array[:named_substitution].
Lastly, after the comma at the end of the SQL string, add your named substitution variable (in this case I used :ids) and your array.
CoursesProgram
.joins(:course)
.where(program_id: 12)
.where(
# -> (not ->>) keeps the left operand as jsonb, which the ?| operator requires
"courses.lms_data->'_id' ?| array[:ids]",
ids: ['604d26cadb238f542f2fa', '604541eb0ff9d7b28828c'])
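To make the quoted documentation row concrete, here is a plain-Ruby model of what ?| checks when the left operand is a JSON object. It only mirrors the documentation example above; the helper name any_key? is made up for illustration and is not part of Rails or Postgres.

```ruby
require 'json'

# Hypothetical helper: a plain-Ruby model of jsonb `?|` on a JSON object.
# Returns true when any of the candidate strings is a top-level key.
def any_key?(json_text, candidates)
  doc = JSON.parse(json_text)
  doc.is_a?(Hash) && candidates.any? { |key| doc.key?(key) }
end

# Same data as the documentation example.
p any_key?('{"a":1, "b":2, "c":3}', %w[b c])
p any_key?('{"a":1, "b":2, "c":3}', %w[x y])
```

The first call prints true and the second prints false, matching the "do any of these strings exist" reading of the operator.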
How does one obtain the result of a UNION operation in Rails?
Given I have the following SQL statement
SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = t LIMIT 1 UNION ALL SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = f LIMIT 1
Thus far I have managed to construct the SQL using AREL with union all statement.
SipTrunk.where(default: true).limit(1).union(:all,SipTrunk.where(default: false).limit(1))
But what I get back is an Arel node (Arel::Nodes::UnionAll), and I'm unable to obtain the DB result from it.
Also, running to_sql on the statement yields SQL like this:
( SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = $1 LIMIT 1 UNION ALL SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = $2 LIMIT 1 )
This looks like a prepared statement, but I don't see any prepared statement in the DB.
I then attempted to use the above SQL with find_by_sql:
SipTrunk.find_by_sql(SipTrunk.where(default: true).limit(1).union(:all,SipTrunk.where(default: false).limit(1)).to_sql,[['default',true],['default',false]])
which fails with the following error:
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR: syntax error at or near "UNION"
LINE 1: ...trunks" WHERE "sip_trunks"."default" = $1 LIMIT 1 UNION ALL ...
How do I get the final SQL rows, from here?
Here is how I would perform this operation.
sql1 = SipTrunk.where(default: true).limit(1).arel
sql2 = SipTrunk.where(default: false).limit(1).arel
subquery = Arel::Nodes::As.new(
Arel::Nodes::UnionAll.new(sql1,sql2),
SipTrunk.arel_table
)
SipTrunk.from(subquery)
This will result in the following SQL
SELECT
sip_trunks.*
FROM
( SELECT
sip_trunks.*
FROM
sip_trunks
WHERE
sip_trunks.default = t
LIMIT 1
UNION ALL
SELECT
sip_trunks.*
FROM
sip_trunks
WHERE
sip_trunks.default = f
LIMIT 1) AS sip_trunks
And this will return an ActiveRecord::Relation of SipTrunk objects.
You can do a union like this, concatenating the two sql queries.
sql1 = SipTrunk.where(default: true).limit(1).to_sql
sql2 = SipTrunk.where(default: false).limit(1).to_sql
@sip_trunks = SipTrunk.find_by_sql("(#{sql1}) UNION (#{sql2})")
If you want to be fancy, or have more than two SQL queries to join, you can do this:
final_sql = [sql1, sql2].map { |sql| "(#{sql})" }.join(' UNION ')
@sip_trunks = SipTrunk.find_by_sql(final_sql)
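The string concatenation can be wrapped in a small helper. The sketch below is plain Ruby with a hypothetical helper name, union_sql, which is not part of Rails. It wraps every branch in parentheses, since in PostgreSQL a per-branch LIMIT is only accepted inside parentheses (the unparenthesised form is what produced the syntax error near "UNION" in the question). The two SQL strings are written out by hand here; in the app they would come from .to_sql as above.

```ruby
# Hypothetical helper (not part of Rails): combine SQL strings with UNION.
# Each branch is wrapped in parentheses so that a per-branch LIMIT or
# ORDER BY applies to that branch only.
def union_sql(*queries, all: false)
  keyword = all ? ' UNION ALL ' : ' UNION '
  queries.map { |sql| "(#{sql})" }.join(keyword)
end

# Hand-written stand-ins for what `.to_sql` would return.
sql1 = 'SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = TRUE LIMIT 1'
sql2 = 'SELECT "sip_trunks".* FROM "sip_trunks" WHERE "sip_trunks"."default" = FALSE LIMIT 1'

puts union_sql(sql1, sql2, all: true)
```

The resulting string could then be handed to SipTrunk.find_by_sql in the same way as final_sql.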
I find my query is taking too long to load so I'm wondering if the position of the includes matters.
Example A:
people = Person.where(name: 'guillaume').includes(:jobs)
Example B:
people = Person.includes(:jobs).where(name: 'guillaume')
Is example A faster because I should have fewer people's jobs to load?
Short answer: no.
ActiveRecord builds your query and as long as you don't need the records, it won't send the final SQL query to the database to fetch them. The 2 queries you pasted are identical.
Whenever in doubt, you can always open up rails console, write your queries there and observe the queries printed out. In your example it would be something like:
SELECT "people".* FROM "people" WHERE "people"."name" = $1 LIMIT $2 [["name", "guillaume"], ["LIMIT", 11]]
SELECT "jobs".* FROM "jobs" WHERE "jobs"."person_id" = 1
in both of the cases.
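The idea of lazy query building can be illustrated with a toy class. This is not ActiveRecord; LazyQuery and its statements method are hypothetical, and the generated SQL is simplified. It only shows why the order of chained calls cannot matter when each call merely records what was requested.

```ruby
# Toy stand-in for a lazily built query (hypothetical; NOT ActiveRecord).
# Chained calls only record what was asked for; SQL is produced on demand.
class LazyQuery
  def initialize(table, conditions = {}, preloads = [])
    @table = table
    @conditions = conditions
    @preloads = preloads
  end

  # Each call returns a new object and touches no database.
  def where(conds)
    LazyQuery.new(@table, @conditions.merge(conds), @preloads)
  end

  def includes(*assocs)
    LazyQuery.new(@table, @conditions, @preloads + assocs)
  end

  # The (simplified) statements that would be sent once records are needed:
  # one for the main table, then one per preloaded association.
  def statements
    filter = @conditions.map { |col, val| "#{col} = '#{val}'" }.join(' AND ')
    main = "SELECT * FROM #{@table}"
    main += " WHERE #{filter}" unless filter.empty?
    [main] + @preloads.map { |assoc| "SELECT * FROM #{assoc} WHERE person_id IN (...)" }
  end
end

a = LazyQuery.new('people').where(name: 'guillaume').includes(:jobs)
b = LazyQuery.new('people').includes(:jobs).where(name: 'guillaume')

puts a.statements
puts a.statements == b.statements
```

Both chains end up with the same recorded state, so they describe the same statements.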
I'm creating a filter for my Point model in a Ruby on Rails app. The app uses ActiveAdmin + Ransacker for filters. I wrote 3 methods to filter Point:
def self.filter_by_customer_bonus(bonus_id)
Point.joins(:customer).where('customers.bonus_id = ?', bonus_id)
end
def self.filter_by_classificator_bonus(bonus_id)
Point.joins(:setting).where('settings.bonus_id = ?', bonus_id)
end
def self.filter_by_bonus(bonus_id)
Point.where(bonus_id: bonus_id)
end
Everything works fine, but I need to merge the results of the 3 methods into one array. When the number of points is large (more than 1,000,000 on the production server, for example) this is too slow, so I need to combine them into one method. The problem is that I need to order the final merged array this way:
The resulting array should start with the result of the first method, followed by the result of the second method, and then the result of the third.
Is it possible to combine these 3 SQL queries into 1 to make it faster, and to order the result as described above?
For example my Points are [1,2,3,4,5,6,7,8,9,10]
Result of first = [1,2,3]
Result of second = [2,3,4]
Result of third = [5,6,7]
After merge I should get [1,2,3,4,5,6,7] but it should be with the result of 1 method, not 3+merge. Hope you understand me :)
UPDATE:
The result of the first answer:
Point Load (8.0ms) SELECT "points".* FROM "points" INNER JOIN "customers" ON "customers"."number" = "points"."customer_number" INNER JOIN "managers" ON "managers"."code" = "points"."tp" INNER JOIN "settings" ON "settings"."classificator_id" = "managers"."classificator_id" WHERE "points"."bonus_id" = $1 AND "customers"."bonus_id" = $2 AND "settings"."bonus_id" = $3 [["bonus_id", 2], ["bonus_id", 2], ["bonus_id", 2]]
It returns an empty array.
You can union these using or (documentation):
def self.filter_trifecta(bonus_id)
(
filter_by_customer_bonus(bonus_id)
).or(
filter_by_classificator_bonus(bonus_id)
).or(
filter_by_bonus(bonus_id)
)
end
Note: you might have to hoist those joins up to the first condition; I'm not sure if or will handle those forks well as-is.
The code below gives you all the results in a single query. If you have indexes on the foreign keys used here, it should be able to handle a million records:
The query provided earlier does an AND on all 3 conditions, which is why you got zero results. You need a union, and the code below should work. (Note: if you are using Rails 5, there is ActiveRecord syntax for this, which the first answer provided.)
Updated:
Point.from(
"(#{Point.joins(:customer).where(customers: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.joins(:setting).where(settings: {bonus_id: bonus_id}).to_sql}
UNION
#{Point.where(bonus_id: bonus_id).to_sql})
AS points")
Instead you can also use your 3 methods like below:
Point.from("(#{Point.filter_by_customer_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_classificator_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_bonus(bonus_id).to_sql}
) as points")
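UNION removes duplicate rows, but SQL gives no guarantee about row order unless there is an ORDER BY. The ordering asked for in the question (first filter's rows, then the second's, then the third's, with no repeats) can be illustrated in plain Ruby on the example ids from the question. This only shows the target result; it is not a replacement for the single-query version.

```ruby
# Ids returned by each filter, taken from the example in the question.
first  = [1, 2, 3]   # filter_by_customer_bonus
second = [2, 3, 4]   # filter_by_classificator_bonus
third  = [5, 6, 7]   # filter_by_bonus

# Array#uniq keeps the first occurrence of each element, so concatenating
# in priority order and then de-duplicating gives the requested ordering.
merged = (first + second + third).uniq

p merged # => [1, 2, 3, 4, 5, 6, 7]
```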
The following code gets all the residences that have all the amenities listed in id_list. It works without a problem on SQLite but raises an error on PostgreSQL:
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: {amenity_id: id_list}).
references(:listed_amenities).
group(:residence_id).
having("count(*) = ?", id_list.size)
The error on the PostgreSQL version:
What do I have to change to make it work with PostgreSQL?
A few things:
references should only be used with includes; it tells ActiveRecord to perform a join, so it's redundant when using an explicit joins.
You need to fully qualify the argument to group, i.e. group('residences.id').
For example,
id_list = [48, 49]
Residence.joins(:listed_amenities).
where(listed_amenities: { amenity_id: id_list }).
group('residences.id').
having('COUNT(*) = ?', id_list.size)
The query the Ruby (?) code is expanded to is selecting all fields from the residences table:
SELECT "residences".*
FROM "residences"
INNER JOIN "listed_amenities"
ON "listed_amenities"."residence_id" = "residences"."id"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residences"."id" ASC
LIMIT 1;
From the Postgres manual: "When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column."
You'll need to either group by all fields that aggregate functions aren't applied to, or do this differently. From the query, it looks like you only need to scan the listed_amenities table to get the residence ID you're looking for:
SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residence_id" ASC
LIMIT 1
And then fetch your residence data with that ID. Or, in one query:
SELECT "residences".*
FROM "residences"
WHERE "id" IN (SELECT "residence_id"
FROM "listed_amenities"
WHERE "listed_amenities"."amenity_id" IN (48,49)
GROUP BY "residence_id"
HAVING count(*) = 2
ORDER BY "residence_id" ASC
LIMIT 1
);
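The WHERE / GROUP BY / HAVING combination used above can be modelled in plain Ruby on a handful of made-up rows. The data is hypothetical, and the sketch assumes each (residence_id, amenity_id) pair appears at most once, which is also what count(*) = N relies on.

```ruby
# Hypothetical rows of the listed_amenities table: [residence_id, amenity_id].
listed_amenities = [
  [1, 48], [1, 49], [1, 50],
  [2, 48],
  [3, 49], [3, 48]
]
id_list = [48, 49]

# WHERE amenity_id IN (...)  -> first select
# GROUP BY residence_id      -> group_by
# HAVING count(*) = N        -> keep groups with exactly N matching rows
matching = listed_amenities
  .select { |_, amenity_id| id_list.include?(amenity_id) }
  .group_by { |residence_id, _| residence_id }
  .select { |_, rows| rows.size == id_list.size }
  .keys

p matching # => [1, 3]
```

Residence 2 is dropped because only one of the two requested amenities matches it.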