Rails 4 Left outer join multiple tables with Group By and count

Is it possible to do a left outer join in Rails 4 with group by and counts? I am trying to write a scope that does a left outer join of users with the messages, comments and likes tables and then groups by id to get the total counts. Where there is no associated record, the count should be zero.
So the final result set would be cuusers.*, message_count, likes_count and comments_count. Any idea how this can be accomplished? Thanks in advance!
class Cuuser < ActiveRecord::Base
  has_and_belongs_to_many :groups
  has_many :messages
  has_many :comments
  has_many :likes
  validates :username, format: { without: /\s/ }
  scope :superusers, -> {
    joins(:comments, :likes).
      select('cuusers.id').
      group('cuusers.id').
      having('count(comments.id) + count(likes.id) > 2')
  }
end

You could drop to SQL strings:
joins("LEFT OUTER JOIN comments ON comments.cuuser_id = cuusers.id LEFT OUTER JOIN likes ON likes.cuuser_id = cuusers.id LEFT OUTER JOIN messages ON messages.cuuser_id = cuusers.id")
This isn't great. You're sacrificing some portability and the ability of ActiveRecord to guess the association.
If you used the Squeel gem you could use the following format:
joins{ [comments.outer, likes.outer, messages.outer] }
Squeel exposes Arel in a slightly more sane format, so you can do things like left outer joins while still guessing associations from the model definitions.
You could use Arel of course, but things get very clunky very quickly.
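To get your counts, group by the user id and count distinct ids. Joining several has_many tables at once multiplies the rows per user, so a plain count would be inflated:
select('cuusers.id, count(distinct messages.id) as message_count, count(distinct likes.id) as likes_count, count(distinct comments.id) as comments_count').group('cuusers.id')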
They should be available as attributes on the returned objects, just like ordinary database fields.
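One thing worth checking when several has_many tables are left-joined at once is that the joined rows multiply per user, which is why a plain count can overstate. The following is a minimal plain-Ruby sketch of that effect, not ActiveRecord code; the toy data and the helper names (left_side, joined_rows, sql_count, sql_count_distinct) are assumptions made up for illustration.

```ruby
# Toy data for ONE user (assumed for illustration): 2 messages, 3 likes, no comments.
MESSAGES = [{ id: 1 }, { id: 2 }].freeze
LIKES    = [{ id: 10 }, { id: 11 }, { id: 12 }].freeze
COMMENTS = [].freeze

# A LEFT OUTER JOIN keeps the user even when a table has no match,
# padding that side with a single NULL (nil) row.
def left_side(rows)
  rows.empty? ? [nil] : rows
end

# Joining all three tables to the same user yields every combination of rows.
def joined_rows
  left_side(MESSAGES).product(left_side(LIKES), left_side(COMMENTS))
end

# COUNT(col): counts the non-NULL values across all joined rows.
def sql_count(rows, index)
  rows.map { |row| row[index] }.compact.size
end

# COUNT(DISTINCT col): counts each non-NULL value once.
def sql_count_distinct(rows, index)
  rows.map { |row| row[index] }.compact.uniq.size
end

rows = joined_rows
puts "joined rows: #{rows.size}"
%w[messages likes comments].each_with_index do |name, index|
  puts "#{name}: count=#{sql_count(rows, index)} distinct=#{sql_count_distinct(rows, index)}"
end
```

With this toy data the join produces 6 rows, so the plain counts for messages and likes both come out as 6, while the distinct counts give 2 and 3; comments stay at 0 either way.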

Related

ActiveRecord - find records that its association count is 0

In my Ruby on Rails app I have the following model:
class SlideGroup < ApplicationRecord
  has_many :survey_group_lists, foreign_key: 'group_id'
  has_many :surveys, through: :survey_group_lists
end
I want to find all orphaned slide groups. An orphaned slide group is one that is not connected to any survey. I've been trying the following query, but it does not return anything, and I'm sure that I have orphaned records in my test database:
SlideGroup.joins(:surveys).group("slide_groups.id, surveys.id").having("count(surveys.id) = ?",0)
This generates the following SQL query:
SlideGroup Load (9.3ms) SELECT "slide_groups".* FROM "slide_groups" INNER JOIN "survey_group_lists" ON "survey_group_lists"."group_id" = "slide_groups"."id" INNER JOIN "surveys" ON "surveys"."id" = "survey_group_lists"."survey_id" GROUP BY slide_groups.id, surveys.id HAVING (count(surveys.id) = 0)
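You're using joins, which generates an INNER JOIN, whereas what you need is an OUTER JOIN. Note that includes only produces a join when the query references the included table through a hash condition or references, so for a string HAVING clause use left_outer_joins (available since Rails 5):
SlideGroup.left_outer_joins(:surveys).group("slide_groups.id").having("count(surveys.id) = 0")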
A bit cleaner query:
SlideGroup.includes(:surveys).where(surveys: { id: nil })
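To see why the INNER JOIN version can never satisfy HAVING count(surveys.id) = 0, here is a small plain-Ruby sketch that imitates both join types on toy rows. The data and the helper names are assumptions for illustration only, and this is not how ActiveRecord executes the query.

```ruby
# Toy rows (assumed for illustration): slide group 3 has no survey attached.
SLIDE_GROUP_IDS = [1, 2, 3].freeze
SURVEY_GROUP_LISTS = [
  { group_id: 1, survey_id: 10 },
  { group_id: 1, survey_id: 11 },
  { group_id: 2, survey_id: 10 }
].freeze

# INNER JOIN: only pairs that actually match survive.
def inner_join
  SLIDE_GROUP_IDS.flat_map do |gid|
    SURVEY_GROUP_LISTS.select { |link| link[:group_id] == gid }
                      .map { |link| [gid, link[:survey_id]] }
  end
end

# LEFT OUTER JOIN: unmatched groups are kept with a NULL (nil) survey id.
def left_outer_join
  SLIDE_GROUP_IDS.flat_map do |gid|
    matches = SURVEY_GROUP_LISTS.select { |link| link[:group_id] == gid }
    matches.empty? ? [[gid, nil]] : matches.map { |link| [gid, link[:survey_id]] }
  end
end

# GROUP BY group id HAVING count(survey id) = 0
def groups_with_zero_surveys(rows)
  rows.group_by(&:first)
      .select { |_gid, group| group.map(&:last).compact.empty? }
      .keys
end

p groups_with_zero_surveys(inner_join)
p groups_with_zero_surveys(left_outer_join)
```

The inner join drops group 3 before grouping happens, so no group can have a zero count; only the outer join leaves a row for it to be counted.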
Finding orphan records has been explained by others.
I see problems with this approach:
There should not be any orphan in the first place
The presence of a survey.id does not guarantee the presence of a Survey
What about SurveyGroupList that are orphan?
So the proper solution is to ensure that no orphans are left in the DB, by implementing the proper logic and by adding foreign keys with ON DELETE CASCADE to the database. You can also add the dependent: :destroy option to your associations, but this only works if you call #destroy on your models (not delete), and of course it does not work if you delete rows directly via SQL.

ActiveRecord query for Users who don't own a Car

How do I get all the users who do not have a car?
class User < ActiveRecord::Base
  has_one :car
end
class Car < ActiveRecord::Base
  belongs_to :user
end
I was doing the following:
all.select {|user| not user.car }
That worked perfectly until my database of users and cars got too big; now I get strange errors, especially when I try to sort the result. I need to do both the filtering and the ordering as part of the query.
UPDATE: What I did was the following:
where('id not in (?)', Car.pluck(:user_id)).order('first_name, last_name, middle_name')
It's fairly slow as Rails has to grab all the user_ids from the cars table and then issue a giant query. I know I can do a sub-query in SQL, but there must be a better Rails/ActiveRecord way.
UPDATE 2: I now have a noticeably more efficient query:
includes(:car).where(cars: {id: nil})
The answer I accepted below uses joins with a SQL string instead of includes. I don't know whether includes is less efficient because it stores the nil data in Ruby objects whereas joins might not. I like not using strings...
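A caveat on the NOT IN variant from the first update: in SQL, a NOT IN comparison against a list containing NULL is never true, so a single car with a NULL user_id makes the query return no users at all. As far as I know, Rails also renders an empty array bound to (?) as NULL, so the same happens when the cars table is empty. The sketch below models that three-valued logic in plain Ruby; sql_not_in and users_without_cars are hypothetical helpers, not Rails methods.

```ruby
# SQL's three-valued logic for `value NOT IN (list)`, modelled in plain Ruby.
# Returns true, false, or nil (standing in for UNKNOWN).
def sql_not_in(value, list)
  return false if list.compact.include?(value) # a definite match
  return nil if list.include?(nil)             # comparing with NULL is UNKNOWN
  true
end

# WHERE keeps a row only when the condition is true (not false, not UNKNOWN).
def users_without_cars(user_ids, car_user_ids)
  user_ids.select { |id| sql_not_in(id, car_user_ids) == true }
end

p users_without_cars([1, 2, 3], [1])      # no NULLs in the list
p users_without_cars([1, 2, 3], [1, nil]) # one car has a NULL user_id
p users_without_cars([1, 2, 3], [nil])    # what an empty array turns into
```

The LEFT OUTER JOIN ... IS NULL form does not have this problem, because the NULL test is applied to the joined row rather than to a list of values.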
One way is to use a left join from the users table to the cars table and keep only the users that have no corresponding row in the cars table. This looks like:
User.select('users.*').joins('LEFT JOIN cars ON users.id = cars.user_id').where('cars.id IS NULL')
Most of the work that needs to be done here is SQL. Try this:
User.joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").where("cars.id IS NULL")
It is very inefficient to do this filtering in Ruby, as you appear to be doing.
You can throw an order on there too:
User.
  joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").
  where("cars.id IS NULL").
  order(:first_name, :last_name, :middle_name)
You can make this a scope on your User model so you only have one place to deal with it:
class User < ActiveRecord::Base
  has_one :car
  def self.without_cars
    joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").
      where("cars.id IS NULL").
      order(:first_name, :last_name, :middle_name)
  end
end
This way you can do:
User.without_cars
In your controller or another method, or even chain the scope:
User.without_cars.where("users.birthday > ?", 18.years.ago)
to find users without cars that are under 18 years old (arbitrary example, but you get the idea). My point is, this kind of thing should always be made into a scope, so it can be chained with other scopes :) Arel is awesome that way.

ActiveRecord find categories which contain at least one item

Suppose I have two models for items and categories, in a many-to-many relation:
class Item < ActiveRecord::Base
  has_and_belongs_to_many :categories
end
class Category < ActiveRecord::Base
  has_and_belongs_to_many :items
end
Now I want to select only the categories that contain at least one item. What is the best way to do this?
I would like to echo @Delba's answer and expand on it because it's correct. What @huan son is suggesting with the count column is completely unnecessary if you have your indexes set up correctly.
I would add that you probably want to use .uniq (.distinct on Rails 4 and later), because with a many-to-many relation you only want DISTINCT categories to come back:
Category.joins(:items).uniq
Using the joins query will let you more easily work conditions into your count of items too, giving much more flexibility. For example you might not want to count items where enabled = false:
Category.joins(:items).where(:items => { :enabled => true }).uniq
This would generate the following SQL, using inner joins, which are fast when the join columns are indexed:
SELECT DISTINCT `categories`.* FROM `categories` INNER JOIN `categories_items` ON `categories_items`.`category_id` = `categories`.`id` INNER JOIN `items` ON `items`.`id` = `categories_items`.`item_id` WHERE `items`.`enabled` = 1
Good luck,
Stu
Category.joins(:items)
More details here: http://guides.rubyonrails.org/active_record_querying.html#joining-tables
Please note that what the others answered is NOT performant!
The most performant solution is to work with a counter cache and save items_count in the model:
# in Category
scope :with_items, where("items_count > 0")
# in Item
has_and_belongs_to_many :categories, :after_add=>:update_count, :after_remove=>:update_count
def update_count(category)
  category.items_count = category.items.count
  category.save
end
For a normal belongs_to relation you just write
belongs_to :parent, :counter_cache=>true
and in the parent model you have a field items_count (items is the pluralized has_many class name).
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
In a has_and_belongs_to_many relation you have to write it on your own, as above.
scope :has_item, where("#{table_name}.id IN (SELECT categories_items.category_id FROM categories_items)")
This will return all categories which have an entry in the join table because, ostensibly, a category shouldn't have an entry there if it does not have an item. You could add WHERE categories_items.item_id IS NOT NULL to the subselect just to be sure.
In case you're not aware, table_name is a method which returns the table name of the ActiveRecord class calling it. In this case it would be "categories".
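The difference between the join-based answers and the IN subquery can be seen in a plain-Ruby imitation of both on toy rows: the join returns a category once per matching item, so it needs uniq, while the subquery returns each category at most once. The data and helper names below are assumptions for illustration.

```ruby
# Toy tables (assumed for illustration): "books" has two items, "empty" has none.
CATEGORIES = [
  { id: 1, name: 'books' },
  { id: 2, name: 'toys' },
  { id: 3, name: 'empty' }
].freeze
CATEGORIES_ITEMS = [
  { category_id: 1, item_id: 100 },
  { category_id: 1, item_id: 101 },
  { category_id: 2, item_id: 100 }
].freeze

# INNER JOIN through the join table: one result row per matching item.
def joined_category_names
  CATEGORIES.flat_map do |category|
    matches = CATEGORIES_ITEMS.count { |row| row[:category_id] == category[:id] }
    [category[:name]] * matches
  end
end

# WHERE id IN (SELECT category_id FROM categories_items)
def in_subquery_category_names
  ids = CATEGORIES_ITEMS.map { |row| row[:category_id] }
  CATEGORIES.select { |category| ids.include?(category[:id]) }
            .map { |category| category[:name] }
end

p joined_category_names       # duplicates for categories with several items
p joined_category_names.uniq  # what DISTINCT leaves
p in_subquery_category_names  # no duplicates to begin with
```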

Finding records with no associated records in rails 3

class Person < ActiveRecord::Base
  has_many :pets
  scope :with_dog, joins(:pets).where("pets.type = 'Dog'")
  scope :without_pets ???????????????????????????????????
end
class Pet < ActiveRecord::Base
  belongs_to :person
end
I'd like to add a scope to the Person model that returns people who have no pets. Any ideas? I feel like this is obvious, but it's escaping me at the moment.
scope :without_pets, lambda { includes(:pets).where('pets.id' => nil) }
Try something like this:
Person.joins('left outer join pets on people.id=pets.person_id').
  select('people.*').
  where('pets.id is null')
I haven't tested it but it ought to work.
The idea is that we're performing a left outer join, so the pets fields will be null for every person that has no pets. You'll probably need to include :readonly => false in the join since ActiveRecord returns read-only objects when joins() is passed a string.
Mark Westling's answer is correct. The outer join is the right way to go. An inner join (which is what the joins method generates if you pass it the name/symbol of an association and not your own SQL) will not work, as it will not include people who do not have a pet.
Here it is written as a scope:
scope :without_pets, joins("left outer join pets on pets.person_id = people.id").where("pets.id is null")
(Rails pluralizes Person to people by default; if your table is named differently, adjust the table name.)
You must use a LEFT OUTER JOIN in order to find records without associated records. Here's an adapted version of code I use:
scope :without_pets, joins('LEFT OUTER JOIN pets ON people.id = pets.person_id').group('people.id').having('count(pets.id) = 0')
I'm not sure if your pet model has a person id, but maybe this attempt helps you somehow:
scope :with_dog, joins(:pets).where("pets.type = 'Dog'")
scope :without_pets, joins("LEFT OUTER JOIN pets ON pets.person_id = people.id").where("pets.person_id IS NULL")
Update: Corrected the query method name from 'join' to 'joins'.

How on earth is this rails query working?

I have just optimised some Ruby code that was in a controller method, replacing it with a direct database query. The replacement appears to work and is much faster. Thing is, I've no idea how Rails managed to figure out the correct query to use!
The purpose of the query is to work out tag counts for Place models within a certain distance of a given latitude and longitude. The distance part is handled by the GeoKit plugin (which basically adds convenience methods to add the appropriate trigonometry calculations to the select), and the tagging part is done by the acts_as_taggable_on_steroids plugin, which uses a polymorphic association.
Below is the original code:
places = Place.find(:all, :origin => latlng, :order => 'distance asc', :within => distance, :limit => 200)
tag_counts = MyTag.tagcounts(places)
deep_tag_counts = Array.new
tag_counts.each do |tag|
  count = Place.find_tagged_with(tag.name, :origin => latlng, :order => 'distance asc', :within => distance, :limit => 200).size
  deep_tag_counts << { :name => tag.name, :count => count }
end
where the MyTag class implements this:
def MyTag.tagcounts(places)
  alltags = places.collect { |p| p.tags }.flatten.sort_by(&:name)
  lasttag = nil
  tagcount = 0
  result = Array.new
  alltags.each do |tag|
    unless lasttag == nil || lasttag.name == tag.name
      result << MyTag.new(lasttag, tagcount)
      tagcount = 0
    end
    tagcount = tagcount + 1
    lasttag = tag
  end
  unless lasttag == nil
    result << MyTag.new(lasttag, tagcount)
  end
  result
end
This was my (very ugly) first attempt as I originally found it difficult to come up with the right rails incantations to get this done in SQL. The new replacement is this single line:
deep_tag_counts=Place.find(:all,:select=>'name,count(*) as count',:origin=>latlng,:within=>distance,:joins=>:tags, :group=>:tag_id)
Which results in an SQL query like this:
SELECT name,count(*) as count, (ACOS(least(1,COS(0.897378837271255)*COS(-0.0153398733287034)*COS(RADIANS(places.lat))*COS(RADIANS(places.lng))+
COS(0.897378837271255)*SIN(-0.0153398733287034)*COS(RADIANS(places.lat))*SIN(RADIANS(places.lng))+
SIN(0.897378837271255)*SIN(RADIANS(places.lat))))*3963.19)
AS distance FROM `places` INNER JOIN `taggings` ON (`places`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Place') INNER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`) WHERE (places.lat>50.693170735732 AND places.lat<52.1388692642679 AND places.lng>-2.03785525810908 AND places.lng<0.280035258109084 AND (ACOS(least(1,COS(0.897378837271255)*COS(-0.0153398733287034)*COS(RADIANS(places.lat))*COS(RADIANS(places.lng))+
COS(0.897378837271255)*SIN(-0.0153398733287034)*COS(RADIANS(places.lat))*SIN(RADIANS(places.lng))+
SIN(0.897378837271255)*SIN(RADIANS(places.lat))))*3963.19)
<= 50) GROUP BY tag_id
Ignoring the trig (which is from GeoKit, and results from the :within and :origin parameters), what I can't figure out about this is how on earth Rails was able to figure out from the instruction to join 'tags', that it had to involve 'taggings' in the JOIN (which it does, as there is no direct way to join the places and tags tables), and also that it had to use the polymorphic stuff.
In other words, how the heck did it (correctly) come up with this bit:
INNER JOIN `taggings` ON (`places`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Place') INNER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
...given that I never mentioned the taggings table in the code! Digging into the taggable plugin, the only clue that Rails has seems to be this:
class Tag < ActiveRecord::Base
  has_many :taggings, :dependent=>:destroy
  ...
end
Anybody able to give some insight into the magic going on under the hood here?
The acts_as_taggable_on_steroids plugin tells your Place model that it has_many Tags through Taggings. With this association specified, ActiveRecord knows that it needs to join taggings in order to get to the tags table. The same thing holds true for HABTM relationships. For example:
class Person < ActiveRecord::Base
  has_and_belongs_to_many :tags
end
class Tag < ActiveRecord::Base
  has_and_belongs_to_many :people
end
>> Person.first(:joins => :tags)
This produces the following SQL:
SELECT "people".*
FROM "people"
INNER JOIN "people_tags" ON "people_tags".person_id = "people".id
INNER JOIN "tags" ON "tags".id = "people_tags".tag_id
LIMIT 1
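As a rough illustration of how association metadata is enough to derive both joins, here is a toy lookup in plain Ruby. The ASSOCIATIONS hash and join_sql are hypothetical and much simpler than ActiveRecord's real reflection machinery; they only show that knowing the through table, the polymorphic name and the foreign key suffices to build the clauses.

```ruby
# Hypothetical, simplified association metadata for the Place model.
# This is NOT ActiveRecord's reflection API, just a stand-in for the
# information a "has_many :tags, :through => :taggings" declaration records.
ASSOCIATIONS = {
  'places' => {
    tags: {
      through: 'taggings',   # intermediate table
      target: 'tags',        # table we finally want to reach
      as: 'taggable',        # polymorphic interface name
      type: 'Place',         # value stored in taggable_type
      foreign_key: 'tag_id'  # column on the through table pointing at the target
    }
  }
}.freeze

# Builds the two INNER JOIN clauses needed to get from `table` to the target.
def join_sql(table, name)
  assoc = ASSOCIATIONS.fetch(table).fetch(name)
  through = assoc[:through]
  target = assoc[:target]
  [
    "INNER JOIN #{through} ON (#{table}.id = #{through}.#{assoc[:as]}_id " \
      "AND #{through}.#{assoc[:as]}_type = '#{assoc[:type]}')",
    "INNER JOIN #{target} ON (#{target}.id = #{through}.#{assoc[:foreign_key]})"
  ].join(' ')
end

puts join_sql('places', :tags)
```

The caller only names :tags; the taggings table enters the SQL because the metadata says the association goes through it.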
