Rails 3: Inner join with associations - ruby-on-rails

I'm a bit stuck on an Active Record query problem. I'd like to do a left join instead of an inner join on a query, but the joins method only generates an inner join.
Entry.
joins(:measurements => :measurement_type).
sum(:value, :conditions => ['name = ?','Calories'])
The tricky part is that :measurements has a belongs_to relationship to :measurement_type which is where the name column resides.
Right now this is returning sums only for users who have food_posts. The goal is to return sums for all users. Users who don't have a food post would have a sum of zero.
I've tried Googling and searching other posts on here, but haven't found an answer. Any suggestions?
Also I'm using SQLite in my development environment and Postgresql for production.

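If you want a left join instead of an inner join, you need to pass a custom SQL fragment to joins.
In your example,
Entry.joins("LEFT JOIN measurements ON measurements.entry_id = entries.id LEFT JOIN measurement_types ON measurement_types.id = measurements.measurement_type_id")
Or something along those lines, depending on your setup (the table and column names here assume Rails naming conventions).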
This tutorial sums it up pretty clearly.
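To make the difference concrete, here is a minimal in-memory sketch in plain Ruby (no database involved) of why an inner join drops entries without a matching measurement while a left join keeps them with a sum of zero. The data and the helper names inner_join_sums and left_join_sums are made up for illustration and are not part of ActiveRecord.

```ruby
# Hypothetical rows standing in for the entries and measurements tables.
ENTRIES = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze
MEASUREMENTS = [
  { entry_id: 1, name: "Calories", value: 200 },
  { entry_id: 1, name: "Calories", value: 150 },
  { entry_id: 2, name: "Protein",  value: 30 }
].freeze

# INNER JOIN semantics: only entries with at least one matching row survive.
def inner_join_sums(entries, measurements, name)
  known_ids = entries.map { |e| e[:id] }
  measurements
    .select { |m| m[:name] == name && known_ids.include?(m[:entry_id]) }
    .group_by { |m| m[:entry_id] }
    .transform_values { |rows| rows.sum { |r| r[:value] } }
end

# LEFT JOIN semantics: every entry is kept; no match contributes zero.
def left_join_sums(entries, measurements, name)
  entries.each_with_object({}) do |entry, sums|
    sums[entry[:id]] = measurements
      .select { |m| m[:entry_id] == entry[:id] && m[:name] == name }
      .sum { |m| m[:value] }
  end
end

p inner_join_sums(ENTRIES, MEASUREMENTS, "Calories")
p left_join_sums(ENTRIES, MEASUREMENTS, "Calories")
```

In real SQL the name filter would have to sit in the ON clause rather than in WHERE to keep the unmatched entries, and SUM over no matching rows yields NULL, so COALESCE(SUM(...), 0) would be needed to get a literal zero.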

MetaWhere gem has support for outer joins, among much other goodness:
https://github.com/ernie/meta_where

Related

Rails (Activerecord) - Can't make query with joins and global sum without duplicates

I'm using a query with multiple user-set filters in order to show a list of invoices in a Rails app. One of the filters adds a where condition on a column of a separate table, which needs a double join in order to be accessible (estimates -through projects-).
scope :by_seller, lambda {|user_id|
joins(project: :estimates)
.where(estimates: {:user_id => user_id}) unless user_id.blank?
}
Additionally, I use Rails' aggregate method "sum" in order to find out the total amount of the invoices, @invoices.sum(:total_cache), where total_cache is a cached column in the database specifically designed to perform this kind of sum in a performant way.
@invoices.sum(:total_cache)
My problem is, given the fact that I need a double join in order to access Estimates through Projects, and that each Invoice belongs to a Project, BUT a project can have many Estimates, the join operation results in duplicate records, so my Invoices table shows some of the invoices many times (as many as the number of estimates its project has). This results in an invoices table with duplicate records, and in an incorrect sum value, as it sums some of the invoice totals N times.
The filtering behaviour is just fine, as my intention is to filter by the user who made ANY of the estimates in the invoice project. However, the issue is that when I try to avoid the duplicates by adding a group('invoices.id') -the way I always solved such situations-, the final sum operation won't return the total sum of the invoices' total, but a grouped sum of each one of them (totally useless).
The only workaround I've found is to include the group clause and perform the sum in pure ruby code, treating the collection as an array, which IMHO is terribly inefficient, as there are tons of invoices:
@invoices.map(&:total_cache).inject(0, &:+)
Is there a way I can obtain a unique ActiveRecord collection of Invoices without duplicates in a way I can then call the aggregate sum method and obtain a total calculated by Postgres?
Of course, if there is something wrong in my base idea I'm completely open to hearing it! It's quite a complex query (I simplified it for the sake of the question here) and there can be many approaches I'm sure!
Thank you everyone!
I'm not sure how much "slower" or "faster" this is than doing the sum in ruby code. But if you want to still retain an ActiveRecord::Relation object, then you can do something like below. I reproduced your setup environment in a local Rails project.
user = User.first
Invoice.where(
id: Invoice.by_seller(user.id).select(:id)
).sum(:total_cache)
# (1.2 ms) SELECT SUM("invoices"."total_cache") FROM "invoices" WHERE "invoices"."id" IN (SELECT "invoices"."id" FROM "invoices" INNER JOIN "projects" ON "projects"."id" = "invoices"."project_id" INNER JOIN "estimates" ON "estimates"."project_id" = "projects"."id" WHERE "estimates"."user_id" = $1) [["user_id", 1]]
# => 5
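The effect of the ID subquery can be seen in a small in-memory sketch in plain Ruby. It is only an illustration under made-up data: joined_rows imitates the row multiplication of the double inner join, and naive_sum and subquery_sum are hypothetical helper names, not ActiveRecord methods.

```ruby
# Hypothetical rows: project 10 has two estimates by user 1.
INVOICES = [
  { id: 1, project_id: 10, total_cache: 100 },
  { id: 2, project_id: 20, total_cache: 50 }
].freeze
ESTIMATES = [
  { project_id: 10, user_id: 1 },
  { project_id: 10, user_id: 1 },
  { project_id: 20, user_id: 2 }
].freeze

# One output row per (invoice, matching estimate) pair, like the INNER JOINs.
def joined_rows(invoices, estimates, user_id)
  invoices.flat_map do |invoice|
    estimates
      .select { |e| e[:project_id] == invoice[:project_id] && e[:user_id] == user_id }
      .map { invoice }
  end
end

# Summing the joined rows directly counts invoice 1 once per estimate.
def naive_sum(invoices, estimates, user_id)
  joined_rows(invoices, estimates, user_id).sum { |row| row[:total_cache] }
end

# Using the join only to collect ids (WHERE id IN (...)) sums each invoice once.
def subquery_sum(invoices, estimates, user_id)
  ids = joined_rows(invoices, estimates, user_id).map { |row| row[:id] }
  invoices.select { |inv| ids.include?(inv[:id]) }.sum { |inv| inv[:total_cache] }
end

puts naive_sum(INVOICES, ESTIMATES, 1)
puts subquery_sum(INVOICES, ESTIMATES, 1)
```

Duplicate ids inside the IN list are harmless because membership is all that matters, which is why the subquery needs neither GROUP BY nor DISTINCT.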

Displaying posts that have comments with Rails 5

I have Posts and Comments tables. They have belongs_to and has_many relation. Everything works great.
What I need to do is writing the SQL to pull posts that have comments. How can I do that in the controller?
I need some sort of Join I guess. Right?
Thank you
Post.joins(:comments) by itself will give you all the posts that have related comments. Because the join produces one row per comment, a post with several comments appears several times; add .distinct if you need each post only once.
The INNER JOIN does that work. An inner join between two tables returns one row for every pair of rows that satisfies the join condition; rows without a match are left out.

ActiveRecord 4.2 includes and ordering by count

I'm writing a search for a project I'm working on. It is meant to be able to search the body of articles and produce a list of their authors, ordered by the number of matching articles and including the relevant articles only, not all of their articles.
I currently have the following query:
Author.includes(:articles).where('articles.body ilike ?', '%foo%').references(:articles)
The use of includes in this case makes it so that all the relevant articles (not all articles) are preloaded, that's exactly what I want. However, when it comes to ordering by the number of included articles, I'm not sure how to proceed.
I should note I want to do this in ActiveRecord rather than in Ruby, because pagination will be applied after the query.
I should note I'm using PostgreSQL 9.3.
Edit: using raw SQL
This seems to work on its own like so:
Author.includes(:articles).where('articles.body ilike ?', '%foo%').references(:articles).select('authors.*, (SELECT COUNT(0) FROM articles WHERE articles.author_id = authors.id) AS article_count').order('article_count DESC')
This works fine. However, if I add .limit(1) it breaks.
PG::UndefinedColumn: ERROR: column "article_count" does not exist
Any idea why adding limit breaks it? The query seems very different too
SELECT DISTINCT "authors"."id", article_count AS alias_0 FROM "authors" LEFT OUTER JOIN "articles" ON "articles"."author_id" = "authors"."id" WHERE (articles.body ilike '%microsoft%') ORDER BY article_count DESC LIMIT 1
I don't think there's an out of the box solution for this. You have to write raw sql to do this but you can combine it with existing ActiveRecord queries.
Author
.includes(:articles)
.select('authors.*, (SELECT COUNT(0) FROM articles WHERE articles.author_id = authors.id) AS article_count')
.order('article_count DESC')
So the only thing to explain here is the select part. The first part, authors.*, selects all fields under the authors table and this is the default. Since we want to also count the number of articles, we create a subquery and pass its result as one of the pseudo columns of authors (we called it article_count). The last part is to just call order using article_count.
This solution assumes a couple of things which you'll have to fine tune depending on your setup.
Author by convention in rails maps to an authors table. If it is an STI (inherits from a User class and is using users table), you'll need to change authors to users.
articles.author_id assumes that the foreign key is author_id (and essentially, an article is only written by a single author). Change to whatever the foreign key is.
So given that, you'll have an array of authors ordered by the number of articles they've written.

How can I get this complicated query to work with ActiveRecord?

I have a complicated SQL query involving 3 models, creating virtual attributes based off of time operations... Ideally I would want to use ActiveRecord arel methods to generate this so that it's nicer, but I have no idea how.
find_by_sql <<-sql
SELECT username, count(*) as records, AVG(time_taken) as time_taken
FROM (
SELECT TIMESTAMPDIFF(HOUR, posts.created_at, reports.created_at) as time_taken,
users.username as username
FROM reports, users, posts
WHERE reports.user_id = users.id AND
reports.post_id = posts.id
) AS dashboard_data # this name is unused, but apparently required
GROUP BY username
sql
Is there any way to do something like this within ActiveRecord? Or is raw sql the only way to do complicated stuff like this?
Unless it's from a table you're going to be stuck doing things like this. You're fetching from a subquery which is way outside the scope of what ActiveRecord is intended to do. An ORM is designed to map objects to specific rows in well-defined tables.
You can sometimes fake it so that whatever you're doing appears to be sufficiently table-like that it works by using a SQL view. ActiveRecord will gladly use a view if it can be queried like a table. For instance if it has an id column defined that serves as some sort of primary key it should work. In your case you could GROUP BY users.id and have users.id AS id exposed as well as users.username.
I think this is equivalent to what you're doing:
User.joins({:reports=>:post}).
group("users.username").
select("users.username,
count(*) as records,
avg(timestampdiff(hour, posts.created_at, reports.created_at)) as time_taken")

How to find all items not related to another model - Rails 3

I have a fairly complicated lookup I'm trying to do in Rails, and I'm not entirely sure how to do it. I'm hoping someone can help.
I have two models, User and Place.
A user is related to Place twice: once for visited_places and once for planned_places. It's a many-to-many relationship, but using has_many :through. Here's the relationship from User.
has_many :visited_places
has_many :visited, :class_name=>"Place", :through=>:visited_places, :source=>:place
has_many :planned_places
has_many :planned, :class_name=>"Place", :through=>:planned_places, :source=>:place
In place the relationship is also defined. Here's the definition there
has_many :visited_users, :class_name=>"User", :through=>:visited_places
has_many :planned_users, :class_name=>"User", :through=>:planned_places
I'm trying to write a find on Place that returns all places in the database that aren't related to a User through either visited or planned. Right now I'm accomplishing this by simply querying all Places and then subtracting visited and planned from the results but I want to add in pagination and I'm worried this could complicate that. Here's my current code.
all_places = Place.find(:all)
all_places = all_places - user.visited - user.planned
Does anyone know how I can accomplish this in just one call to Place.find? Also, this is a Rails 3 app, so if any of the Active Record improvements make this easier, they are an option.
How about something like:
unvisited_places = Place.find(:all, :conditions => ["id NOT IN (?)", visited_places.map(&:place_id)])
That's the general idea -- it can be made more efficient and convenient depending on your final needs.
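A detail worth checking whenever an IN list is built by string interpolation: putting a Ruby array straight into the string inserts its inspect form, brackets included, which is not valid SQL. The plain Ruby sketch below shows the difference; the two helper names are hypothetical, and the ids stand in for visited_places.map(&:place_id).

```ruby
# Interpolating the array itself yields "[1, 2, 3]" inside the SQL fragment.
def interpolated_inspect(ids)
  "id NOT IN(#{ids})"
end

# Joining the ids yields a proper comma-separated list.
# Integer() raises on anything non-numeric, so only integers reach the string.
def interpolated_list(ids)
  "id NOT IN (#{ids.map { |id| Integer(id) }.join(', ')})"
end

puts interpolated_inspect([1, 2, 3])
puts interpolated_list([1, 2, 3])
```

An empty list would still produce "NOT IN ()", which databases reject, so that case needs separate handling.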
You don't show it, but if I am right in assuming that the VisitedPlace and PlannedPlace models have a belongs_to :user relationship, then those tables have a user_id foreign key, right?
In that case I think it would be most efficient to do this in the database: you are looking for a select on places where places.id is not among the place_id values that visited_places or planned_places hold for that user.
in sql:
select * from places where id not in
(
(select place_id from visited_places where user_id = ?)
union
(select place_id from planned_places where user_id=?)
)
If that query works, you can use as follows:
Place.find_by_sql(...the complete sql query ...)
I would not know how to write such a query, with an exclusion, in Rails 3 otherwise.
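As a rough sketch of how the SQL above could be handed to find_by_sql with the user id bound through placeholders rather than interpolated, the query can be kept in a constant and passed in the array form [sql, bind, bind]. The constant and helper names are hypothetical, the inner parentheses around each SELECT are dropped, and the snippet itself only builds the array, so it runs without Rails.

```ruby
# The exclusion query from the answer, with ? placeholders for the user id.
UNRELATED_PLACES_SQL = <<-SQL
  SELECT * FROM places WHERE id NOT IN (
    SELECT place_id FROM visited_places WHERE user_id = ?
    UNION
    SELECT place_id FROM planned_places WHERE user_id = ?
  )
SQL

# Builds the [sql, bind, bind] array; the same user id fills both placeholders.
def unrelated_places_query(user_id)
  [UNRELATED_PLACES_SQL, user_id, user_id]
end

# Assumed usage inside a Rails app that defines a Place model:
#   Place.find_by_sql(unrelated_places_query(user.id))
p unrelated_places_query(7)
```

Note that find_by_sql returns an array of records rather than a relation, so pagination would have to be added to the SQL itself (for example with LIMIT and OFFSET).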
I ran into a similar desire recently... I wanted to get all Model1s that weren't associated with a Model2. Using Rails 4.1, here's what I did:
Model1.where.not(id: Model2.select(:model1_id).uniq)
This creates a nested SELECT, like @nathanvda suggested, effectively letting the database do all the work. Example SQL produced is:
SELECT "model1s".* FROM "model1s" WHERE ("model1s"."id" NOT IN (SELECT DISTINCT "model2s"."model1_id" FROM "model2s"))
