How on earth is this rails query working? - ruby-on-rails

I have just optimised some Ruby code that was in a controller method, replacing it with a direct database query. The replacement appears to work and is much faster. Thing is, I've no idea how Rails managed to figure out the correct query to use!
The purpose of the query is to work out tag counts for Place models within a certain distance of a given latitude and longitude. The distance part is handled by the GeoKit plugin (which basically adds convenience methods to add the appropriate trigonometry calculations to the select), and the tagging part is done by the acts_as_taggable_on_steroids plugin, which uses a polymorphic association.
Below is the original code:
places = Place.find(:all, :origin=>latlng, :order=>'distance asc', :within=>distance, :limit=>200)
tag_counts = MyTag.tagcounts(places)
deep_tag_counts=Array.new()
tag_counts.each do |tag|
count=Place.find_tagged_with(tag.name,:origin=>latlng, :order=>'distance asc', :within=>distance, :limit=>200).size
deep_tag_counts<<{:name=>tag.name,:count=>count}
end
where the MyTag class implements this:
def MyTag.tagcounts(places)
alltags = places.collect {|p| p.tags}.flatten.sort_by(&:name)
lasttag=nil;
tagcount=0;
result=Array.new
alltags.each do |tag|
unless (lasttag==nil || lasttag.name==tag.name)
result << MyTag.new(lasttag,tagcount)
tagcount=0
end
tagcount=tagcount+1
lasttag=tag
end
unless lasttag==nil then
result << MyTag.new(lasttag,tagcount)
end
result
end
This was my (very ugly) first attempt as I originally found it difficult to come up with the right rails incantations to get this done in SQL. The new replacement is this single line:
deep_tag_counts=Place.find(:all,:select=>'name,count(*) as count',:origin=>latlng,:within=>distance,:joins=>:tags, :group=>:tag_id)
Which results in an SQL query like this:
SELECT name,count(*) as count, (ACOS(least(1,COS(0.897378837271255)*COS(-0.0153398733287034)*COS(RADIANS(places.lat))*COS(RADIANS(places.lng))+
COS(0.897378837271255)*SIN(-0.0153398733287034)*COS(RADIANS(places.lat))*SIN(RADIANS(places.lng))+
SIN(0.897378837271255)*SIN(RADIANS(places.lat))))*3963.19)
AS distance FROM `places` INNER JOIN `taggings` ON (`places`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Place') INNER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`) WHERE (places.lat>50.693170735732 AND places.lat<52.1388692642679 AND places.lng>-2.03785525810908 AND places.lng<0.280035258109084 AND (ACOS(least(1,COS(0.897378837271255)*COS(-0.0153398733287034)*COS(RADIANS(places.lat))*COS(RADIANS(places.lng))+
COS(0.897378837271255)*SIN(-0.0153398733287034)*COS(RADIANS(places.lat))*SIN(RADIANS(places.lng))+
SIN(0.897378837271255)*SIN(RADIANS(places.lat))))*3963.19)
<= 50) GROUP BY tag_id
Ignoring the trig (which is from GeoKit, and results from the :within and :origin parameters), what I can't figure out about this is how on earth Rails was able to figure out from the instruction to join 'tags', that it had to involve 'taggings' in the JOIN (which it does, as there is no direct way to join the places and tags tables), and also that it had to use the polymorphic stuff.
In other words, how the heck did it (correctly) come up with this bit:
INNER JOIN `taggings` ON (`places`.`id` = `taggings`.`taggable_id` AND `taggings`.`taggable_type` = 'Place') INNER JOIN `tags` ON (`tags`.`id` = `taggings`.`tag_id`)
...given that I never mentioned the taggings table in the code! Digging into the taggable plugin, the only clue that Rails has seems to be this:
class Tag < ActiveRecord::Base
has_many :taggings, :dependent=>:destroy
...
end
Anybody able to give some insight into the magic going on under the hood here?

The acts_as_taggable_on_steroids plugin tells your Place model that it has_many Tags through Taggings. With this association specified, ActiveRecord knows that it needs to join taggings in order to get to the tags table. The same thing holds true for HABTM relationships. For example:
class Person < ActiveRecord::Base
has_and_belongs_to_many :tags
end
class Tag < ActiveRecord::Base
has_and_belongs_to_many :people
end
>> Person.first(:joins => :tags)
This produces the following SQL:
SELECT "people".*
FROM "people"
INNER JOIN "people_tags" ON "people_tags".person_id = "people".id
INNER JOIN "tags" ON "tags".id = "people_tags".tag_id
LIMIT 1

Related

SQL not working for pg

I'm trying to use SQL to get information from a Postgres database using Rails.
This is what I've tried:
Select starts_at, ends_at, hours, employee.maxname, workorder.wonum from events where starts_at>'2018-03-14'
inner join employees on events.employee_id = employees.id
inner join workorders on events.workorder_id = workorders.id;
I get the following error:
ERROR: syntax error at or near "inner"
LINE 2: inner join employees on events.employee_id = employees.id
Sami's comment is correct, but since this question is tagged with ruby-on-rails you can try to use ActiveRecord's API to do the same:
Make sure that your models relations are defined
class Event < ActiveRecord::Base
belongs_to :employee
belongs_to :workorder
end
And then you can do something like:
Event
.where('starts_at > ?', '2018-03-14')
.joins(:employee, :workorder)
or
Event
.joins(:employee, :workorder)
.where('starts_at > ?', '2018-03-14')
And you don't need to worry which one goes first.
In general, it's suboptimal to create the SQL queries in rails if you don't absolutely need to because they're harder to maintain.
You request should look at this :
select starts_at, ends_at, hours, employee.maxname, workorder.wonum
from events
inner join employees on events.employee_id = employees.id
inner join workorders on events.workorder_id = workorders.id
where starts_at>'2018-03-14';

Rails 4 Left outer join multiple tables with Group By and count

Is it possible to do a left outer join in Rails4 with group by and counts. I am trying to write a scope which will do a left outer join of users with the messages, comments and likes tables and then group by id to get total count. In case there is no association, the count should be zero.
So the final result set would be cuuser.*, message_count, likes_count and comments_count. Any idea how this can be accomplished? Thanks in Advance!
class Cuuser < ActiveRecord::Base
has_and_belongs_to_many :groups
has_many :messages
has_many :comments
has_many :likes
validates :username, format: { without: /\s/ }
scope :superusers, -> { joins(:comments, :likes).
select('cuusers.id').
group('cuusers.id').
having('count(comments.id) + count(likes.id) > 2')}
end
You could drop to SQL strings:
joins("LEFT OUTER JOIN comments ON comments.cuuser_id = cuuser.id LEFT OUTER JOIN likes ON likes.cuuser_id = cuuser.id LEFT OUTER JOIN messages on messages.cuuser_id = cuuser.id")
This isn't great. You're sacrificing some portability and the ability of ActiveRecord to guess the association.
If you used the Squeel gem you could use the following format:
joins{ [comments.outer, likes.outer, messages.outer] }
Squeel exposes Arel in a slightly more sane format, so you can do things like left outer joins while still guessing associations from the model definitions.
You could use Arel of course, but things get very clunky very quickly.
To get your counts, try:
select('cuusers.id, count(messages.id) as message_count, count(likes.id) as likes_count, count(comments.id) as comments_count').
They should be available as attributes on the returned objects, just like ordinary database fields.

ActiveRecord query for Users who don't own a Car

How do I get all the users who do not have a car?
class User < ActiveRecord::Base
has_one :car
end
class Car < ActiveRecord::Base
belongs_to :user
end
I was doing the following:
all.select {|user| not user.car }
That worked perfect until my database of users and cars got too big and now I get strange errors, especially when I try and sort the result. I need to do the filtering in the query and the ordering as well as part of the query.
UPDATE: What I did was the following:
where('id not in (?)', Car.pluck(:user_id)).order('first_name, last_name, middle_name')
It's fairly slow as Rails has to grab all the user_ids from the cars table and then issue a giant query. I know I can do a sub-query in SQL, but there must be a better Rails/ActiveRecord way.
UPDATE 2: I now have a noticeably more efficient query:
includes(:car).where(cars: {id: nil})
The answer I accepted below has joins with a SQL string instead of includes. I don't know if includes is more inefficient because it stores the nil data in Ruby objects whereas joins might not? I like not using strings...
One way is to use a left join from the users table to the cars table and only take user entries that don't have any corresponding values in the cars table, this looks like:
User.select('users.*').joins('LEFT JOIN cars ON users.id = cars.user_id').where('cars.id IS NULL')
Most of the work that needs to be done here is SQL. Try this:
User.joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").where("cars.id IS NULL")
It is incredibly inefficient to do this with ruby, as you appear to be trying to do.
You can throw an order on there too:
User.
joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").
where("cars.id IS NULL").
order(:first_name, :last_name, :middle_name)
You can make this a scope on your User model so you only have one place to deal with it:
class User < ActiveRecord::Base
has_one :car
def self.without_cars
joins("LEFT OUTER JOIN cars ON users.id = cars.user_id").
where("cars.id IS NULL").
order(:first_name, :last_name, :middle_name)
end
end
This way you can do:
User.without_cars
In your controller or another method, or even chain the scope:
User.without_cars.where("users.birthday > ?", 18.years.ago)
to find users without cars that are under 18 years old (arbitrary example, but you get the idea). My point is, this kind of thing should always be made into a scope, so it can be chained with other scopes :) Arel is awesome that way.

How to write complex query in Ruby

Need advice, how to write complex query in Ruby.
Query in PHP project:
$get_trustee = db_query("SELECT t.trustee_name,t.secret_key,t.trustee_status,t.created,t.user_id,ui.image from trustees t
left join users u on u.id = t.trustees_id
left join user_info ui on ui.user_id = t.trustees_id
WHERE t.user_id='$user_id' AND trustee_status ='pending'
group by secret_key
ORDER BY t.created DESC")
My guess in Ruby:
get_trustee = Trustee.find_by_sql('SELECT t.trustee_name, t.secret_key, t.trustee_status, t.created, t.user_id, ui.image FROM trustees t
LEFT JOIN users u ON u.id = t.trustees_id
LEFT JOIN user_info ui ON ui.user_id = t.trustees_id
WHERE t.user_id = ? AND
t.trustee_status = ?
GROUP BY secret_key
ORDER BY t.created DESC',
[user_id, 'pending'])
Option 1 (Okay)
Do you mean Ruby with ActiveRecord? Are you using ActiveRecord and/or Rails? #find_by_sql is a method that exists within ActiveRecord. Also it seems like the user table isn't really needed in this query, but maybe you left something out? Either way, I'll included it in my examples. This query would work if you haven't set up your relationships right:
users_trustees = Trustee.
select('trustees.*, ui.image').
joins('LEFT OUTER JOIN users u ON u.id = trustees.trustees_id').
joins('LEFT OUTER JOIN user_info ui ON ui.user_id = t.trustees_id').
where(user_id: user_id, trustee_status: 'pending').
order('t.created DESC')
Also, be aware of a few things with this solution:
I have not found a super elegant way to get the columns from the join tables out of the ActiveRecord objects that get returned. You can access them by users_trustees.each { |u| u['image'] }
This query isn't really THAT complex and ActiveRecord relationships make it much easier to understand and maintain.
I'm assuming you're using a legacy database and that's why your columns are named this way. If I'm wrong and you created these tables for this app, then your life would be much easier (and conventional) with your primary keys being called id and your timestamps being called created_at and updated_at.
Option 2 (Better)
If you set up your ActiveRecord relationships and classes properly, then this query is much easier:
class Trustee < ActiveRecord::Base
self.primary_key = 'trustees_id' # wouldn't be needed if the column was id
has_one :user
has_one :user_info
end
class User < ActiveRecord::Base
belongs_to :trustee, foreign_key: 'trustees_id' # relationship can also go the other way
end
class UserInfo < ActiveRecord::Base
self.table_name = 'user_info'
belongs_to :trustee
end
Your "query" can now be ActiveRecord goodness if performance isn't paramount. The Ruby convention is readability first, reorganizing code later if stuff starts to scale.
Let's say you want to get a trustee's image:
trustee = Trustee.where(trustees_id: 5).first
if trustee
image = trustee.user_info.image
..
end
Or if you want to get all trustee's images:
Trustee.all.collect { |t| t.user_info.try(:image) } # using a #try in case user_info is nil
Option 3 (Best)
It seems like trustee is just a special-case user of some sort. You can use STI if you don't mind restructuring you tables to simplify even further.
This is probably outside of the scope of this question so I'll just link you to the docs on this: http://api.rubyonrails.org/classes/ActiveRecord/Base.html see "Single Table Inheritance". Also see the article that they link to from Martin Fowler (http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html)
Resources
http://guides.rubyonrails.org/association_basics.html
http://guides.rubyonrails.org/active_record_querying.html
Yes, find_by_sql will work, you can try this also:
Trustee.connection.execute('...')
or for generic queries:
ActiveRecord::Base.connection.execute('...')

Specifying conditions on eager loaded associations returns ActiveRecord::RecordNotFound

The problem is that when a Restaurant does not have any MenuItems that match the condition, ActiveRecord says it can't find the Restaurant. Here's the relevant code:
class Restaurant < ActiveRecord::Base
has_many :menu_items, dependent: :destroy
has_many :meals, through: :menu_items
def self.with_meals_of_the_week
includes({menu_items: :meal}).where(:'menu_items.date' => Time.now.beginning_of_week..Time.now.end_of_week)
end
end
And the sql code generated:
Restaurant Load (0.0ms)←[0m ←[1mSELECT DISTINCT "restaurants".id FROM "restaurants"
LEFT OUTER JOIN "menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN "meals" ON "meals"."id" = "menu_items"."meal_id" WHERE
"restaurants"."id" = ? AND ("menu_items"."date" BETWEEN '2012-10-14 23:00:00.000000'
AND '2012-10-21 22:59:59.999999') LIMIT 1←[0m [["id", "1"]]
However, according to this part of the Rails Guides, this shouldn't be happening:
Post.includes(:comments).where("comments.visible", true)
If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded.
The SQL generated is a correct translation of your query. But look at it,
just at the SQL level (i shortened it a bit):
SELECT *
FROM
"restaurants"
LEFT OUTER JOIN
"menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN
"meals" ON "meals"."id" = "menu_items"."meal_id"
WHERE
"restaurants"."id" = ?
AND
("menu_items"."date" BETWEEN '2012-10-14' AND '2012-10-21')
the left outer joins do the work you expect them to do: restaurants
are combined with menu_items and meals; if there is no menu_item to
go with a restaurant, the restaurant is still kept in the result, with
all the missing pieces (menu_items.id, menu_items.date, ...) filled in with NULL
now look aht the second part of the where: the BETWEEN operator demands,
that menu_items.date is not null! and this
is where you filter out all the restaurants without meals.
so we need to change the query in a way that makes having null-dates ok.
going back to ruby, you can write:
def self.with_meals_of_the_week
includes({menu_items: :meal})
.where('menu_items.date is NULL or menu_items.date between ? and ?',
Time.now.beginning_of_week,
Time.now.end_of_week
)
end
The resulting SQL is now
.... WHERE (menu_items.date is NULL or menu_items.date between '2012-10-21' and '2012-10-28')
and the restaurants without meals stay in.
As it is said in Rails Guide, all Posts in your query will be returned only if you will not use "where" clause with "includes", cause using "where" clause generates OUTER JOIN request to DB with WHERE by right outer table so DB will return nothing.
Such implementation is very helpful when you need some objects (all, or some of them - using where by base model) and if there are related models just get all of them, but if not - ok just get list of base models.
On other hand if you trying to use conditions on including tables then in most cases you want to select objects only with this conditions it means you want to select Restaurants only which has meals_items.
So in your case, if you still want to use only 2 queries (and not N+1) I would probably do something like this:
class Restaurant < ActiveRecord::Base
has_many :menu_items, dependent: :destroy
has_many :meals, through: :menu_items
cattr_accessor :meals_of_the_week
def self.with_meals_of_the_week
restaurants = Restaurant.all
meals_of_the_week = {}
MenuItems.includes(:meal).where(date: Time.now.beginning_of_week..Time.now.end_of_week, restaurant_id => restaurants).each do |menu_item|
meals_of_the_week[menu_item.restaurant_id] = menu_item
end
restaurants.each { |r| r.meals_of_the_week = meals_of_the_week[r.id] }
restaurants
end
end
Update: Rails 4 will raise Deprecation warning when you simply try to do conditions on models
Sorry for possible typo.
I think there is some misunderstanding of this
If there was no where condition, this would generate the normal set of two queries.
If, in the case of this includes query, there were no comments for any
posts, all the posts would still be loaded. By using joins (an INNER
JOIN), the join conditions must match, otherwise no records will be
returned.
[from guides]
I think this statements doesn't refer to the example Post.includes(:comments).where("comments.visible", true)
but refer to one without where statement Post.includes(:comments)
So all work right! This is the way LEFT OUTER JOIN work.
So... you wrote: "If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded." Ok! But this is true ONLY when there is NO where clause! You missed the context of the phrase.

Resources