I have two models/tables A and B. I'd like to perform an Active Record query where the results include columns from both tables. I tried inner joins as they sounded like they combine columns from both tables. However trying the Active Record joins finder method returns results from only the first table.
What Active Record queries include columns from two tables in the results? Perhaps the includes finder method could help.
Edit: think of the two tables as ForumThreads and Posts. Each forum thread has multiple posts. I'd like the rows in the query results to contain information for each post and information for the forum thread (for example the thread title).
This question might have answered my question: Rails Joins and include columns from joins table
joins performs an inner join, but the result still contains only the first table's columns (note the SELECT "users".* below) until you ask for more.
User.where(:id => 1).joins(:client_applications)
User Load (0.2ms) SELECT "users".* FROM "users" INNER JOIN "client_applications" ON "client_applications"."user_id" = "users"."id" WHERE "users"."id" = 1
includes executes two queries (the second one uses WHERE ... IN) and caches the associated data (eager loading):
User.where(:id => 1).includes(:client_applications)
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1
ClientApplication Load (13.6ms) SELECT "client_applications".* FROM "client_applications" WHERE "client_applications"."user_id" IN (1)
Related
I have the following statement:
Customer.where(city_id: cities)
which results in the following SQL statement:
SELECT customers.* FROM customers WHERE customers.city_id IN (SELECT cities.id FROM cities...
Is this intended behavior? Is it documented somewhere? I will not use the Rails code above and will use one of the following instead:
Customer.where(city_id: cities.pluck(:id))
or
Customer.where(city: cities)
which results in the exact same SQL statement.
The AREL querying library allows you to pass in ActiveRecord objects as a short-cut. It'll then pass their primary key attributes into the SQL it uses to contact the database.
When looking for multiple objects, the AREL library will attempt to find the information in as few database round-trips as possible. It does this by holding the query you're making as a set of conditions, until it's time to retrieve the objects.
This way would be inefficient:
users = User.where(age: 30).to_a
# ^^^ get all these users from the database (Rails 3's .all did this too; since Rails 4, .all returns a lazy relation)
memberships = Membership.where(user_id: users)
# ^^^^^ This will pass in each of the ids as a condition
Basically, this way would issue two SQL statements:
select * from users where age = 30;
select * from memberships where user_id in (1, 2, 3);
Each of these involves a call over a network port between the application and the database, and the data then has to be passed back across that same port.
This would be more efficient:
users = User.where(age: 30)
# This is still a query object, it hasn't asked the database for the users yet.
memberships = Membership.where(user_id: users)
# Note: this line is the same, but users is an AREL query, not an array of users
It will instead build a single, nested query so it only has to make a round-trip to the database once.
select * from memberships
where user_id in (
select id from users where age = 30
);
So, yes, it's expected behaviour. It's a bit of Rails magic, designed to improve your application's performance without you having to know how it works.
There are also some cool optimisations: if you call first or last instead of all, it will only retrieve one record.
User.where(name: 'bob').all
# SELECT "USERS".* FROM "USERS" WHERE "USERS"."NAME" = 'bob'
User.where(name: 'bob').first
# SELECT "USERS".* FROM "USERS" WHERE "USERS"."NAME" = 'bob' AND ROWNUM <= 1
Or if you set an order, and call last, it will reverse the order then only grab the last one in the list (instead of grabbing all the records and only giving you the last one).
User.where(name: 'bob').order(:login).first
# SELECT * FROM (SELECT "USERS".* FROM "USERS" WHERE "USERS"."NAME" = 'bob' ORDER BY login) WHERE ROWNUM <= 1
User.where(name: 'bob').order(:login).last
# SELECT * FROM (SELECT "USERS".* FROM "USERS" WHERE "USERS"."NAME" = 'bob' ORDER BY login DESC) WHERE ROWNUM <= 1
# Notice, login DESC
Why does it work?
Something deep in the ActiveRecord query builder is smart enough to see that if you pass an array or a query/criteria, it needs to build an IN clause.
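To make that concrete, here is a minimal plain-Ruby sketch of the idea. It is not ActiveRecord's actual code: LazyQuery and condition_sql are made-up names, and the only point is that the kind of value you pass decides what kind of IN clause gets built.

```ruby
# Hypothetical stand-in for a lazy query object (NOT ActiveRecord::Relation).
LazyQuery = Struct.new(:table, :condition) do
  # Render this query as a subquery that selects only the id column.
  def to_subquery_sql
    "SELECT id FROM #{table} WHERE #{condition}"
  end
end

# Build one WHERE condition from the kind of value that was passed in.
def condition_sql(column, value)
  case value
  when LazyQuery
    # Not loaded yet: nest it, so the database does the work in one round-trip.
    "#{column} IN (#{value.to_subquery_sql})"
  when Array
    # Values already in memory: list them literally.
    "#{column} IN (#{value.join(', ')})"
  else
    "#{column} = #{value}"
  end
end

puts condition_sql("user_id", LazyQuery.new("users", "age = 30"))
puts condition_sql("user_id", [1, 2, 3])
puts condition_sql("user_id", 7)
```

The first call prints a nested subquery and the second a literal list, which mirrors the difference between passing an unloaded query and passing an array.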
Is this documented anywhere?
Yes, http://guides.rubyonrails.org/active_record_querying.html#hash-conditions
2.3.3 Subset conditions
If you want to find records using the IN expression you can pass an array to the conditions hash:
Client.where(orders_count: [1,3,5])
This code will generate SQL like this:
SELECT * FROM clients WHERE (clients.orders_count IN (1,3,5))
I am not able to grasp how the ActiveRecord preload method is of use.
When I do, for example, User.preload(:posts), it does run two queries, but what is returned is just the same as User.all. The second query does not seem to affect the result.
User Load (3.2ms) SELECT "users".* FROM "users"
Post Load (1.2ms) SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN (1, 2, 3)
Can someone explain?
Thanks!
The output is the same, but when you later call user.posts, Rails will not go back to the database for the posts:
users = User.preload(:posts).limit(5) # users collection, 2 queries to the database
# User Load ...
# Post Load ...
users.flat_map(&:posts) # users posts array, no loads
users.flat_map(&:posts) # users posts array, no loads
You can do it as many times as you want; Rails just 'remembers' your posts in RAM. The idea is that you go to the database only once.
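If it helps, here is a rough plain-Ruby simulation of what preloading amounts to. It is only a sketch: the USERS and POSTS arrays, QUERY_LOG, run_query and preload_posts are all invented for illustration and have nothing to do with ActiveRecord's internals. The point is that the second query's rows get grouped by user_id and attached to each user, so later reads come from memory.

```ruby
# Hypothetical in-memory "tables"; names and rows are made up for illustration.
USERS = [{ id: 1, name: "ann" }, { id: 2, name: "bob" }].freeze
POSTS = [
  { id: 10, user_id: 1, title: "first" },
  { id: 11, user_id: 1, title: "second" },
  { id: 12, user_id: 2, title: "third" }
].freeze

QUERY_LOG = [] # one entry per simulated database round-trip

# Stand-in for a single database round-trip.
def run_query(label, rows)
  QUERY_LOG << label
  rows
end

# Two round-trips in total, however many users there are.
def preload_posts
  users = run_query(:users, USERS).map(&:dup)
  ids = users.map { |u| u[:id] }
  posts = run_query(:posts, POSTS.select { |p| ids.include?(p[:user_id]) })
  by_user = posts.group_by { |p| p[:user_id] }
  users.each { |u| u[:posts] = by_user.fetch(u[:id], []) }
end

users = preload_posts
# Reading the posts repeatedly is served from memory; no new "queries" are logged.
2.times { users.flat_map { |u| u[:posts] } }
puts QUERY_LOG.size
```

Without the grouping step you would need one posts query per user, which is the N + 1 pattern that preload avoids.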
I want to expand this question.
order by foreign key in activerecord
I'm trying to order a set of records based on a value in a really large table.
When I use a join, it brings all the "other" record's data into the objects, as a join should.
#table users 30+ columns
#table bids 5 columns
record = Bid.find(:all,:joins=>:users, :order=>'users.ranking DESC' ).first
Now record holds 35 fields..
Is there a way to do this without the join?
Here's my thinking..
With the join I get this query
SELECT * FROM "bids"
left join users on runner_id = users.id
ORDER BY ranking LIMIT 1
Now I can add a select to the code so I don't get the full user table, but putting a select in a scope is dangerous IMHO.
When I write sql by hand.
SELECT * FROM bids
order by (select users.ranking from users where users.id = runner_id) DESC
limit 1
I believe this is a faster query, based on the "explain" it seems simpler.
More important than speed though is that the second method doesn't have the 30 extra fields.
If I build in a custom select inside the scope, it could explode other searches on the object if they too have custom selects (there can be only one)
What you want Active Record to generate is something along the lines of:
SELECT b.* from bids b inner join users u on u.id=b.user_id order by u.ranking desc
In Active Record I would write it like this:
Bid.joins("inner join users u on bids.user_id=u.id").order("u.ranking desc")
I think it's the only way to make a join without fetching all attributes from the user model.
When I do includes, it left joins the table I want to filter on, but when I add pluck that join disappears. Is there any way to mix pluck and left join without manually typing the SQL for 'left join'?
Here's my case:
Select u.id
From users u
Left join profiles p on u.id=p.id
Left join admin_profiles a on u.id=a.uid
Where 2 in (p.prop, a.prop, u.prop)
Doing this is just loading all the values:
Users.includes(:AdminProfiles, :Profiles).where(...).map{ |a| a[:id] }
But when I do pluck instead of map, it doesn't left join the profile tables.
Your problem is that you're using includes, which doesn't really do a join. Instead, it fires a second query after the first one to load the associations. In your case you want the tables to actually be joined, so replace includes(:something) with joins(:something) and everything should work fine.
Replying to your comment, I'm going to quote a few parts of the Rails guide on the Active Record query interface.
From the section Solution to N + 1 queries problem
clients = Client.includes(:address).limit(10)
clients.each do |client|
puts client.address.postcode
end
The above code will execute just 2 queries, as opposed to 11 queries in the previous case:
SELECT * FROM clients LIMIT 10
SELECT addresses.* FROM addresses WHERE (addresses.client_id IN (1,2,3,4,5,6,7,8,9,10))
As you can see: two queries, no joins at all.
From the section Specifying Conditions on Eager Loaded Associations
Even though Active Record lets you specify conditions on the eager loaded associations just like joins, the recommended way is to use joins instead.
Then an example:
Article.includes(:comments).where(comments: { visible: true })
This would generate a query which contains a LEFT OUTER JOIN whereas the joins method would generate one using the INNER JOIN function instead.
SELECT "articles"."id" AS t0_r0, ... "comments"."updated_at" AS t1_r5 FROM "articles" LEFT OUTER JOIN "comments" ON "comments"."article_id" = "articles"."id" WHERE (comments.visible = 1)
If there was no where condition, this would generate the normal set of two queries.
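For anyone unsure what the INNER JOIN versus LEFT OUTER JOIN difference means for the rows you get back, here is a small plain-Ruby sketch. It assumes nothing about ActiveRecord: the ARTICLES and COMMENTS arrays and both helper methods are invented for the example, and they only mimic the join semantics on in-memory data.

```ruby
# Hypothetical rows; article 2 deliberately has no comments.
ARTICLES = [{ id: 1, title: "with comments" }, { id: 2, title: "no comments" }].freeze
COMMENTS = [{ id: 5, article_id: 1, visible: true }].freeze

# INNER JOIN: one row per matching (article, comment) pair; unmatched articles vanish.
def inner_join(articles, comments)
  articles.flat_map do |a|
    comments.select { |c| c[:article_id] == a[:id] }.map { |c| [a, c] }
  end
end

# LEFT OUTER JOIN: same pairs, but unmatched articles are kept with a nil comment.
def left_outer_join(articles, comments)
  articles.flat_map do |a|
    matches = comments.select { |c| c[:article_id] == a[:id] }
    matches.empty? ? [[a, nil]] : matches.map { |c| [a, c] }
  end
end

puts inner_join(ARTICLES, COMMENTS).length
puts left_outer_join(ARTICLES, COMMENTS).length
```

Note that a WHERE condition on the joined table (such as comments.visible = 1 above) discards the rows where the comment side is NULL, so with that condition both kinds of join return the same articles.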
I have an actions table with over 450,000 records. I want to join the actions table on the users table (it actually joins two other tables, one of which is joined on the other, and the other being joined on the users table, before joining the actions table.) The sql query looks like this:
SELECT "users".* FROM "users" INNER JOIN "campaigns" ON "campaigns"."user_id" = "users"."id" INNER JOIN "books" ON "books"."campaign_id" = "campaigns"."id" INNER JOIN "actions" ON "actions"."book_id" = "books"."id" AND "actions"."type" IN ('Impression')
However, this query in rails causes my app to hang because of the large number of records in the actions table.
How should I be handling this?
There are several problems with this approach:
You are fetching the same users several times (users multiplied by number of actions).
You are fetching all the users with their corresponding actions at once. That means high memory consumption and thus frequent garbage collection.
You are fetching all the attributes for users. I guess you do not need them all.
You made a comment about ordering users by their order count. Do you do this in Ruby code? If so, that's the 4th problem, and a big one.
So I'd propose using the group() method to group the records, or just plain SQL like:
SELECT "users".id, count(*) as actions_cnt
FROM "users"
INNER JOIN "campaigns" ON "campaigns"."user_id" = "users"."id"
INNER JOIN "books" ON "books"."campaign_id" = "campaigns"."id"
INNER JOIN "actions" ON "actions"."book_id" = "books"."id" AND "actions"."type" IN ('Impression')
GROUP BY
"users".id
If there are many users in your app, I'd propose adding "OFFSET #{offset} LIMIT #{limit}" to fetch records in batches.
Finally, you can specify just the columns you need so that the memory footprint stays small.
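As a rough sketch of the batching idea, the page SQL could be built like this. Treat it as an assumption-laden example rather than the way to do it: page_sql and GROUPED_SQL are made-up names, the ORDER BY is my addition (without a stable order, OFFSET paging can skip or repeat rows), and I wrote LIMIT before OFFSET because not every database accepts them the other way round.

```ruby
# The grouped query from above, plus an ORDER BY (an addition) for stable paging.
GROUPED_SQL = <<~SQL.chomp
  SELECT "users".id, count(*) AS actions_cnt
  FROM "users"
  INNER JOIN "campaigns" ON "campaigns"."user_id" = "users"."id"
  INNER JOIN "books" ON "books"."campaign_id" = "campaigns"."id"
  INNER JOIN "actions" ON "actions"."book_id" = "books"."id" AND "actions"."type" IN ('Impression')
  GROUP BY "users".id
  ORDER BY "users".id
SQL

# Build the SQL for one zero-based page. Integer() raises on non-numeric
# strings, so the interpolated values are always integers.
def page_sql(page, per_page)
  limit = Integer(per_page)
  offset = Integer(page) * limit
  "#{GROUPED_SQL}\nLIMIT #{limit} OFFSET #{offset}"
end

puts page_sql(0, 1000)
puts page_sql(2, 1000).lines.last
```

You would then run each page's SQL in a loop until a page comes back with fewer than per_page rows.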