Rails outer join, left join or some other join

Rails outer join, left join or some other join - ruby-on-rails

Table selections
user_id
profile_id
Table profile
user_id
Table user
user has_one profile
profile belongs_to user
user has_many selections
How can I select ALL profiles that are NOT in selections table for a user?

For a pure SQL method, which is practically always the fastest way of doing things with data, I would go with:
select *
from profiles
where id not in (
select profile_id
from user_profiles
where user_id = #{self.id})
ActiveRecord syntax does better with joins, but I'd be inclined for simplicity and readability to keep it as:
Profile.where("id not in (
select profile_id
from user_profiles
where user_id = ?)", self.id)

Might be a round about way with extra iteration. But one that came to my mind.
Profile.all.reject{|pro| pro.selections.nil?}

ids = Selection.where(user_id: self.id).map{|x|x["user_id"]}.uniq
Profile.where(['id not in (?)', ids])

David answer is the most correct of all answers, though I wanted to contribute on a bit of a tuning. If You are using postgres than this post is worth reading:
http://explainextended.com/2009/09/16/not-in-vs-not-exists-vs-left-join-is-null-postgresql/
Basically the gist of what that article says is, that are 3 usual ways to do it:
NOT IN (already suggested by David)
NOT EXISTS
LEF JOIN and checking fro NULLs
The author does some research and comes up with conclusion that second and third produce the same explain plan, which means that both are virtually the same.
Revising Davids example it would look like this
Profile.where(
"
NOT EXISTS (
select 1
from selections
where selections.user_id = ? AND selections.profile_id = profiles.id)
", id) # no need to use self when calling instance methods

Related

Retrive records which are not referenced in other table, ActiveRecord query

There are 2 tables : User and Teacher. Teacher.user_id is from User. So, how do I find in a single query, all the users who are not in teachers.
I meant something along the lines :
User.not_in(Teacher.all)

You can use where.not query from ActiveRecord try something like below:
User.where.not(id: Teacher.pluck(:user_id).reject {|x| x.nil?})
Note: used reject method, in case you have nil values in some records.

The other users seem to have neglected the rails 3 tag (since removed based on the approved answer. My answer left for posterity) : Please try this
User.where("id NOT IN (?)",Teacher.pluck(:user_id).join(","))
This will become SELECT * FROM users WHERE id NOT IN (....) (two queries one to get the user_id from teachers and another to get the user(s) not in that list) and may fail based on the size of teacher table.
Other option is an arel table:
users = User.arel_table
User.where(users[:id].not_in(Teacher.select(:user_id).where("user_id IS NOT NULL")))
This should produce a single query similar to
SELECT * FROM users
WHERE id NOT IN ( SELECT user_id FROM teachers WHERE user_id IS NOT NULL)
(one query better performance) * syntax was not fully tested
Another single query option might be
User.joins("LEFT OUTER JOIN teachers ON teachers.user_id = users.id").
where("teachers.user_id IS NULL")

I think you should be able to do something like this
User.where.not(id: Teacher.ids)

Is there anyway to make a lesser impact on my database with this request?

For the analytics of my site, I'm required to extract the 4 states of my users.
#members = list.members.where(enterprise_registration_id: registration.id)
# This pulls roughly 10,0000 records.. Which is evidently a huge data pull for Rails
# Member Load (155.5ms)
#invited = #members.where("user_id is null")
# Member Load (21.6ms)
#not_started = #members.where("enterprise_members.id not in (select enterprise_member_id from quizzes where quizzes.section_id IN (?)) AND enterprise_members.user_id in (select id from users)", #sections.map(&:id) )
# Member Load (82.9ms)
#in_progress = #members.joins(:quizzes).where('quizzes.section_id IN (?) and (quizzes.completed is null or quizzes.completed = ?)', #sections.map(&:id), false).group("enterprise_members.id HAVING count(quizzes.id) > 0")
# Member Load (28.5ms)
#completes = Quiz.where(enterprise_member_id: registration.members, section_id: #sections.map(&:id)).completed
# Quiz Load (138.9ms)
The operation returns a 503 meaning my app gives up on the request. Any ideas how I can refactor this code to run faster? Maybe by better joins syntax? I'm curious how sites with larger datasets accomplish what seems like such trivial DB calls.

The answer is your indexes. Check your rails logs (or check the console in development mode) and copy the queries to your db tool. Slap an "Explain" in front of the query and it will give you a breakdown. From here you can see what indexes you need to optimize the query.
For a quick pass, you should at least have these in your schema,
enterprise_members: needs an index on enterprise_member_id
members: user_id
quizes: section_id

As someone else posted definitely look into adding indexes if needed. Some of how to refactor depends on what exactly you are trying to do with all these records. For the #members query, what are you using the #members records for? Do you really need to retrieve all attributes for every member record? If you are not using every attribute, I suggest only getting the attributes that you actually use for something, .pluck usage could be warranted. 3rd and 4th queries, look fishy. I assume you've run the queries in a console? Again not sure what the queries are being used for but I'll toss in that it is often useful to write raw sql first and query on the db first. Then, you can apply your findings to rewriting activerecord queries.
What is the .completed tagged on the end? Is it supposed to be there? only thing I found close in the rails api is .completed? If it is a custom method definitely look into it. You potentially also have an use case for scopes.

THIRD QUERY:
I unfortunately don't know ruby on rails, but from a postgresql perspective, changing your "not in" to a left outer join should make it a little faster:
Your code:
enterprise_members.id not in (select enterprise_member_id from quizzes where quizzes.section_id IN (?)) AND enterprise_members.user_id in (select id from users)", #sections.map(&:id) )
Better version (in SQL):
select blah
from enterprise_members em
left outer join quizzes q on q.enterprise_member_id = em.id
join users u on u.id = q.enterprise_member_id
where quizzes.section_id in (?)
and q.enterprise_member_id is null
Based on my understanding this will allow postgres to sort both the enterprise_members table and the quizzes and do a hash join. This is better than when it will do now. Right now it finds everything in the quizzes subquery, brings it into memory, and then tries to match it to enterprise_members.
FIRST QUERY:
You could also create a partial index on user_id for your first query. This will be especially good if there are a relatively small number of user_ids that are null in a large table. Partial index creation:
CREATE INDEX user_id_null_ix ON enterprise_members (user_id)
WHERE (user_id is null);
Anytime you query enterprise_members with something that matches the index's where clause, the partial index can be used and quickly limit the rows returned. See http://www.postgresql.org/docs/9.4/static/indexes-partial.html for more info.

Thanks everyone for your ideas. I basically did what everyone said. I added indexes, resorted how I called everything, but the major difference was using the pluck method.. Here's my new stats :
#alt_members = list.members.pluck :id # 23ms
if list.course.sections.tests.present? && #sections = list.course.sections.tests
#quiz_member_ids = Quiz.where(section_id: #sections.map(&:id)).pluck(:enterprise_member_id) # 8.5ms
#invited = list.members.count('user_id is null') # 12.5ms
#not_started = ( #alt_members - ( #alt_members & #quiz_member_ids ).count #0ms
#in_progress = ( #alt_members & #quiz_member_ids ).count # 0ms
#completes = ( #alt_members & Quiz.where(section_id: #sections.map(&:id), completed: true).pluck(:enterprise_member_id) ).count # 9.7ms
#question_count = Quiz.where(section_id: #sections.map(&:id), completed: true).limit(5).map{|quiz|quiz.answers.count}.max # 3.5ms

Rails 3 Comparing foreign key to list of ids using activerecord

I have a relationship between two models, Registers and Competitions. I have a very complicated dynamic query that is being built and if the conditions are right I need to limit Registration records to only those where it's Competition parent meets a certain criteria. In order to do this without select from the Competition table I was thinking of something along the lines of...
Register.where("competition_id in ?", Competition.where("...").collect {|i| i.id})
Which produces this SQL:
SELECT "registers".* FROM "registers" WHERE (competition_id in 1,2,3,4...)
I don't think PostgreSQL liked the fact that the in parameters aren't surrounded by parenthesis. How can I compare the Register foreign key to a list of competition ids?

you can make it a bit shorter and skip the collect (this worked for me in 3.2.3).
Register.where(competition_id: Competition.where("..."))
this will result in the following sql:
SELECT "registers".* FROM "registers" WHERE "registers"."competition_id" IN (SELECT "competitions"."id" FROM "competitions" WHERE "...")

Try this instead:
competitions = Competition.where("...").collect {|i| i.id}
Register.where(:competition_id => competitions)

In Ruby on Rails, how do I query for items that have no associated elements in a has_many :through relationship

I have a Contact model, and a User model, and a join table and each is HABTM more or less.
How can I query for the Contacts that have no users assigned to them? Driving me nuts.
Thanks

IMHO you should do a raw SQL query something along the lines of....
select c.*
from contacts c left join contacts_users cu on c.id = cu.contact_id
where cu.contact_id is null
I don't know of any pretty ORM-specific way to do it. Obviously you'll want to tailor the query to use the actual fields from your table.

I believe this thread is looking for the same thing right?
Want to find records with no associated records in Rails 3
If I understand you correctly, then I think it could be something like:
Contact.includes(:jointablenames).where( :jointablenames => {:contact_id => nil } )

rails select and include

Can anyone explain this?
Project.includes([:user, :company])
This executes 3 queries, one to fetch projects, one to fetch users for those projects and one to fetch companies.
Project.select("name").includes([:user, :company])
This executes 3 queries, and completely ignores the select bit.
Project.select("user.name").includes([:user, :company])
This executes 1 query with proper left joins. And still completely ignores the select.
It would seem to me that rails ignores select with includes. Ok fine, but why when I put a related model in select does it switch from issuing 3 queries to issuing 1 query?
Note that the 1 query is what I want, I just can't imagine this is the right way to get it nor why it works, but I'm not sure how else to get the results in one query (.joins seems to only use INNER JOIN which I do not in fact want, and when I manually specifcy the join conditions to .joins the search gem we're using freaks out as it tries to re-add joins with the same name).

I had the same problem with select and includes.
For eager loading of associated models I used native Rails scope 'preload' http://apidock.com/rails/ActiveRecord/QueryMethods/preload
It provides eager load without skipping of 'select' at scopes chain.
I found it here https://github.com/rails/rails/pull/2303#issuecomment-3889821
Hope this tip will be helpful for someone as it was helpful for me.

Allright so here's what I came up with...
.joins("LEFT JOIN companies companies2 ON companies2.id = projects.company_id LEFT JOIN project_types project_types2 ON project_types2.id = projects.project_type_id LEFT JOIN users users2 ON users2.id = projects.user_id") \
.select("six, fields, I, want")
Works, pain in the butt but it gets me just the data I need in one query. The only lousy part is I have to give everything a model2 alias since we're using meta_search, which seems to not be able to figure out that a table is already joined when you specify your own join conditions.

Rails has always ignored the select argument(s) when using include or includes. If you want to use your select argument then use joins instead.
You might be having a problem with the query gem you're talking about but you can also include sql fragments using the joins method.
Project.select("name").joins(['some sql fragement for users', 'left join companies c on c.id = projects.company_id'])
I don't know your schema so i'd have to guess at the exact relationships but this should get you started.

I might be totally missing something here but select and include are not a part of ActiveRecord. The usual way to do what you're trying to do is like this:
Project.find(:all, :select => "users.name", :include => [:user, :company], :joins => "LEFT JOIN users on projects.user_id = users.id")
Take a look at the api documentation for more examples. Occasionally I've had to go manual and use find_by_sql:
Project.find_by_sql("select users.name from projects left join users on projects.user_id = users.id")
Hopefully this will point you in the right direction.

I wanted that functionality myself,so please use it.
Include this method in your class
#ACCEPTS args in string format "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME"
def self.includes_with_select(*m)
association_arr = []
m.each do |part|
parts = part.split(':')
association = parts[0].to_sym
select_columns = parts[1].split('-')
association_macro = (self.reflect_on_association(association).macro)
association_arr << association.to_sym
class_name = self.reflect_on_association(association).class_name
self.send(association_macro, association, -> {select *select_columns}, class_name: "#{class_name.to_sym}")
end
self.includes(*association_arr)
end
And you will be able to call like: Contract.includes_with_select('user:id-name-status', 'confirmation:confirmed-id'), and it will select those specified columns.

The preload solution doesn't seem to do the same JOINs as eager_load and includes, so to get the best of all worlds I also wrote my own, and released it as a part of a data-related gem I maintain, The Brick.
By overriding ActiveRecord::Associations::JoinDependency.apply_column_aliases() like this then when you add a .select(...) then it can act as a filter to choose which column aliases get built out.
With gem 'brick' loaded, in order to enable this selective behaviour, add the special column name :_brick_eager_load as the first entry in your .select(...), which turns on the filtering of columns while the aliases are being built out. Here's an example:
Employee.includes(orders: :order_details)
.references(orders: :order_details)
.select(:_brick_eager_load,
'employees.first_name', 'orders.order_date', 'order_details.product_id')
Because foreign keys are essential to have everything be properly associated, they are automatically added, so you do not need to include them in your select list.
Hope it can save you both query time and some RAM!

Categories

HOME

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Rails outer join, left join or some other join - ruby-on-rails

Table selections user_id profile_id Table profile user_id Table user user has_one profile profile belongs_to user user has_many selections How can I select ALL profiles that are NOT in selections table for a user?

Might be a round about way with extra iteration. But one that came to my mind. Profile.all.reject{|pro| pro.selections.nil?}

ids = Selection.where(user_id: self.id).map{|x|x["user_id"]}.uniq Profile.where(['id not in (?)', ids])

Related

Retrive records which are not referenced in other table, ActiveRecord query

Is there anyway to make a lesser impact on my database with this request?

Rails 3 Comparing foreign key to list of ids using activerecord

In Ruby on Rails, how do I query for items that have no associated elements in a has_many :through relationship

rails select and include

Categories

Resources