Active Record .includes with where clause - ruby-on-rails

I'm attempting to avoid an N+1 query by using includes, but I need to filter out some of the child records. Here's what I have so far:
Column.includes(:tickets).where(board_id: 1, tickets: {sprint_id: 10})
The problem is that only Columns containing Tickets with sprint_id of 10 are returned. I want to return all Columns with board_id of 1, and pre-fetch only the tickets with a sprint_id of 10, so that column.tickets is either an empty list or a list of Ticket objects with sprint_id 10.

This is how includes is intended to work. When you add a where clause it applies to the entire query and not just loading the associated records.
One way of doing this is by flipping the query backwards:
columns = Ticket.eager_load(:column)
  .where(sprint_id: 10, columns: { board_id: 1 })
  .map(&:column)
  .uniq

Because the where clause references the tickets table, Column.includes(:tickets).where(board_id: 1, tickets: {sprint_id: 10}) runs as a single query with a LEFT OUTER JOIN, and the sprint_id condition in its WHERE clause drops every column that has no matching ticket.
To get all the related columns without loading unwanted tickets, you can do this:
columns = Column.where(board_id: 1).to_a
tickets = Ticket.where(column_id: columns.map(&:id), sprint_id: 10).to_a
This way you can't call #tickets on each column (that would query the database again and bring back the N+1 problem). To access a column's tickets without making any further queries, you can do something like this:
grouped_tickets = tickets.group_by(&:column_id)
columns.each do |column|
  # Columns without matching tickets get an empty array instead of nil.
  column_tickets = grouped_tickets.fetch(column.id, [])
  # Do something with column_tickets.
end
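The grouping step can be tried without a database. The following is a minimal plain-Ruby sketch, assuming Struct stand-ins (ColumnRow, TicketRow) and a helper name (tickets_by_column) that are made up for illustration and are not part of the original answer:

```ruby
# Hypothetical stand-ins for the Column and Ticket models (no database involved).
ColumnRow = Struct.new(:id, :board_id)
TicketRow = Struct.new(:id, :column_id, :sprint_id)

# Returns { column => [tickets] }, with an empty array for columns without tickets.
def tickets_by_column(columns, tickets)
  grouped = tickets.group_by(&:column_id)
  columns.to_h { |column| [column, grouped.fetch(column.id, [])] }
end

# Made-up sample data: column 2 has no ticket in the sprint.
COLUMNS = [ColumnRow.new(1, 1), ColumnRow.new(2, 1)]
TICKETS = [TicketRow.new(10, 1, 10), TicketRow.new(11, 1, 10)]

tickets_by_column(COLUMNS, TICKETS).each do |column, column_tickets|
  puts "column #{column.id}: #{column_tickets.map(&:id).inspect}"
end
```

Every column is kept, including the one whose ticket list is empty, which is the behaviour the question asks for.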

Related

Find all records having less value than sum of two columns with associated activerecord

I have a Room model that is associated with ReservationRoom (Room has_many :reservations_rooms).
I want to find Rooms that meet the following condition.
ReservationRoom has two columns, number_of_adults and number_of_child, and Room has a column named min_occupancy.
I need all the Room records where the sum of number_of_adults and number_of_child from reservation_rooms is less than the room's min_occupancy.
Something like the query below:
Room.joins(:reservations_rooms).where(id: [22, 19]).having('sum(reservations_rooms.number_of_adult + reservations_rooms.number_of_child) < rooms.min_occupancy')
Can someone help me out with this?
Try this; I don't think you need grouping:
Room.joins(:reservation_rooms).
  where(rooms: { id: [22, 19] }).
  where('reservation_rooms.number_of_adults + reservation_rooms.number_of_child <= rooms.min_occupancy')
I left the condition on ids in because I don't know if you need that too.
What you are looking for is a where clause. You can chain multiple where calls in ActiveRecord and they will be combined with AND:
Room # the model you are interested in
.joins(:reservation_rooms) # combine the reservations
.where(rooms: {id: [22, 19]}) # only for those two rooms
.where(
"number_of_adults + number_of_child <= min_occupancy"
) # add adults/child and compare to occupancy
This should generate SQL similar to this:
select rooms.* from rooms
inner join reservation_rooms on rooms.id = reservation_rooms.room_id
where
rooms.id in (22, 19)
and number_of_adults + number_of_child <= min_occupancy
Some notes:
You can see what SQL gets generated by appending .to_sql to your ActiveRecord::Relation (the query you've built).
Assuming the column names are unique, you don't need to prefix them with the table name. If both tables have a column with the same name (e.g. created_at), you need to specify which one you mean, like rooms.created_at.
I'd rename number_of_child to number_of_children to be consistent (it's plural on number_of_adults).
sum is an aggregate function (like avg, count and others). Those functions combine values from multiple rows into one. In your example you want to combine multiple columns of a single row, hence you can use +.
having is used together with group. It is similar to a where clause, but it filters grouped rows.
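To make the difference between a per-row condition (where with +) and an aggregate condition (group with having and sum) concrete, here is a small plain-Ruby sketch. The ReservationRow struct, the helper names and the sample numbers are all invented for illustration; it only mimics what the database would do.

```ruby
# Hypothetical stand-in for reservation_rooms rows (no database involved).
ReservationRow = Struct.new(:room_id, :number_of_adults, :number_of_child)

# room_id => min_occupancy (made-up data)
MIN_OCCUPANCY = { 19 => 3, 22 => 4 }

RESERVATIONS = [
  ReservationRow.new(19, 1, 1),
  ReservationRow.new(19, 2, 2),
  ReservationRow.new(22, 1, 0),
  ReservationRow.new(22, 1, 1)
]

# WHERE-style: every reservation row is tested on its own.
def rooms_with_a_small_reservation(reservations, min_occupancy)
  reservations
    .select { |r| r.number_of_adults + r.number_of_child <= min_occupancy[r.room_id] }
    .map(&:room_id)
    .uniq
end

# GROUP BY + HAVING-style: rows are summed per room first, then the total is tested.
def rooms_with_a_small_total(reservations, min_occupancy)
  reservations
    .group_by(&:room_id)
    .select { |room_id, rows| rows.sum { |r| r.number_of_adults + r.number_of_child } <= min_occupancy[room_id] }
    .keys
end

p rooms_with_a_small_reservation(RESERVATIONS, MIN_OCCUPANCY)
p rooms_with_a_small_total(RESERVATIONS, MIN_OCCUPANCY)
```

With this data the per-row check matches both rooms, while the per-room total matches only room 22, so which form you need depends on whether the condition is about a single reservation or about all reservations of a room.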

Return duplicate records (activerecord, postgres)

I have the following query returning duplicate titles, but :id is nil:
Movie.select(:title).group(:title).having("count(*) > 1")
[#<Movie:0x007f81f7111c20 id: nil, title: "Fargo">,
#<Movie:0x007f81f7111ab8 id: nil, title: "Children of Men">,
#<Movie:0x007f81f7111950 id: nil, title: "The Martian">,
#<Movie:0x007f81f71117e8 id: nil, title: "Gravity">]
I tried adding :id to the select and group but it returns an empty array. How can I return the whole movie record, not just the titles?
A SQL-y Way
First, let's just solve the problem in SQL, so that the Rails-specific syntax doesn't trick us.
This SO question is a pretty clear parallel: Finding duplicate values in a SQL Table
The answer from KM (second from the top, non-checkmarked, at the moment) meets your criteria of returning all duplicated records along with their IDs. I've modified KM's SQL to match your table...
SELECT
m.id, m.title
FROM
movies m
INNER JOIN (
SELECT
title, COUNT(*) AS CountOf
FROM
movies
GROUP BY
title
HAVING COUNT(*)>1
) dupes
ON
m.title=dupes.title
The portion inside the INNER JOIN ( ) is essentially what you've generated already. A grouped table of duplicated titles and counts. The trick is JOINing it to the unmodified movies table, which will exclude any movies that don't have matches in the query of dupes.
Why is this so hard to generate in Rails? The trickiest part is that, because we're JOINing movies to movies, we have to create table aliases (m and dupes in my query above).
Sadly, Rails doesn't provide any clean way of declaring these aliases. Some references:
Rails GitHub issues mentioning "join" and "alias". Misery.
SO Question: ActiveRecord query with alias'd table names
Fortunately, since we've got the SQL in-hand, we can use the .find_by_sql method...
Movie.find_by_sql("SELECT m.id, m.title FROM movies m INNER JOIN (SELECT title, COUNT(*) FROM movies GROUP BY title HAVING COUNT(*)>1) dupes ON m.title=dupes.title")
Because we're calling Movie.find_by_sql, ActiveRecord assumes our hand-written SQL can be bundled into Movie objects. It doesn't massage or generate anything, which lets us do our aliases.
This approach has its shortcomings. It returns an array and not an ActiveRecord Relation, which means it can't be chained with other scopes. And, in the documentation for the find_by_sql method, we get extra discouragement...
This should be a last resort because using, for example, MySQL specific terms will lock you to using that particular database engine or require you to change your call if you switch engines.
A Rails-y Way
Really, what is the SQL doing above? It's getting a list of titles that appear more than once. Then, it's matching that list against the original table. So, let's just do that using Rails.
titles_with_multiple = Movie.group(:title).having("count(title) > 1").count.keys
Movie.where(title: titles_with_multiple)
We call .keys because the first query returns a hash of title => count. The keys are our titles. The where() method can take an array, and we've handed it an array of titles. Winner.
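The same two steps can be traced in plain Ruby. This is only a sketch of the logic on an in-memory array; MovieRow, duplicated_movies and the sample rows are assumptions made for the example, not ActiveRecord API:

```ruby
# Hypothetical stand-in for Movie records (no database involved).
MovieRow = Struct.new(:id, :title)

MOVIES = [
  MovieRow.new(1, "Fargo"),
  MovieRow.new(2, "Gravity"),
  MovieRow.new(3, "Fargo"),
  MovieRow.new(4, "Arrival")
]

def duplicated_movies(movies)
  # Step 1: title => count, comparable to group(:title).count
  counts = movies.group_by(&:title).transform_values(&:size)
  # Keep only titles seen more than once, comparable to having("count(title) > 1") plus .keys
  titles_with_multiple = counts.select { |_title, n| n > 1 }.keys
  # Step 2: fetch the full rows for those titles, comparable to where(title: titles_with_multiple)
  movies.select { |movie| titles_with_multiple.include?(movie.title) }
end

p duplicated_movies(MOVIES).map(&:id)
```

Both "Fargo" rows come back with their ids, which is what the grouped query alone could not return.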
You could argue one line of Ruby is more elegant than two. And if that one line of Ruby has an ungodly string of SQL embedded within it, how elegant is it really?
Hope this helps!
You can try to add id in your select:
Movie.select([:id, :title]).group(:title).having("count(title) > 1")

Order with DISTINCT ids in rails with postgres

I have the following code to join two tables microposts and activities with micropost_id column and then order based on created_at of activities table with distinct micropost id.
Micropost.joins("INNER JOIN activities ON
(activities.micropost_id = microposts.id)").
where('activities.user_id= ?',id).order('activities.created_at DESC').
select("DISTINCT (microposts.id), *")
which should return all micropost columns. This is not working in my development environment:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids, because each row has a distinct activities.created_at value. I have searched a lot to get this far, but the problem always persists because of this Postgres rule, which exists to avoid arbitrary selection.
I want to select distinct micropost ids, ordered by activities.created_at.
Please help.
To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.
To explain where the ambiguity comes from, consider this simple set of data for your activities:
CREATE TABLE activities (
id INTEGER PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE,
micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp, 1),
(2, current_timestamp - interval '3 hours', 1),
(3, current_timestamp - interval '2 hours', 2)
You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but both 1 and 2 have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 or 2, 1?
If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:
SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC
Note that I've dropped the SELECT DISTINCT bit in favor of using GROUP BY, since Postgres handles them much better. The MAX(activities.created_at) bit tells Postgres to, for each group of activities with the same micropost_id, sort by only the most recent.
You can translate the above to Rails like so:
Micropost.select('microposts.*')
.joins("JOIN activities ON activities.micropost_id = microposts.id")
.where('activities.user_id' => id)
.group('microposts.id')
.order('MAX(activities.created_at) DESC')
Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.
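If it helps to see the GROUP BY / MAX ordering outside of SQL, here is a plain-Ruby sketch that applies the same rule to the three sample activities from above. ActivityRow, the helper name and the fixed timestamps are assumptions for illustration only:

```ruby
# Hypothetical stand-in for rows of the activities table (no database involved).
ActivityRow = Struct.new(:id, :created_at, :micropost_id)

NOW = Time.utc(2024, 1, 1, 12, 0, 0) # arbitrary fixed "current" time

ACTIVITIES = [
  ActivityRow.new(1, NOW, 1),
  ActivityRow.new(2, NOW - 3 * 3600, 1),
  ActivityRow.new(3, NOW - 2 * 3600, 2)
]

# Micropost ids, most recently acted on first:
# group by micropost_id, take MAX(created_at) per group, sort descending.
def microposts_by_latest_activity(activities)
  activities
    .group_by(&:micropost_id)
    .map { |micropost_id, rows| [micropost_id, rows.map(&:created_at).max] }
    .sort_by { |_micropost_id, latest| latest }
    .reverse
    .map(&:first)
end

p microposts_by_latest_activity(ACTIVITIES)
```

Each micropost appears once, and the older activity on micropost 1 no longer matters because only the newest activity per group is used for sorting.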
Try the below code
Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.uniq

Rails 3 Comparing foreign key to list of ids using activerecord

I have a relationship between two models, Register and Competition. I have a very complicated dynamic query that is being built, and if the conditions are right I need to limit Register records to only those whose parent Competition meets certain criteria. In order to do this without selecting from the Competition table, I was thinking of something along the lines of...
Register.where("competition_id in ?", Competition.where("...").collect {|i| i.id})
Which produces this SQL:
SELECT "registers".* FROM "registers" WHERE (competition_id in 1,2,3,4...)
I don't think PostgreSQL likes the fact that the in parameters aren't surrounded by parentheses. How can I compare the Register foreign key to a list of competition ids?
You can make it a bit shorter and skip the collect (this worked for me in 3.2.3).
Register.where(competition_id: Competition.where("..."))
This will result in the following SQL:
SELECT "registers".* FROM "registers" WHERE "registers"."competition_id" IN (SELECT "competitions"."id" FROM "competitions" WHERE "...")
Try this instead:
competitions = Competition.where("...").collect {|i| i.id}
Register.where(:competition_id => competitions)

Get a Rails record count without querying a 2nd time

I've got a Rails ActiveRecord query that finds all the records where the name is some token.
records = Market.where("lower(name) = ?", name.downcase)
rec = records.first
count = records.count
The server shows that the calls for .first and .count were BOTH hitting the database.
CACHE (0.0ms) SELECT "markets".* FROM "markets" WHERE (lower(name) = 'my market') LIMIT 1
CACHE (0.0ms) SELECT COUNT(*) FROM "markets" WHERE (lower(name) = 'my market')
Why is it going to the database to get the count when it can use the results already queried?
I'm concerned about future performance. Today there are 1000 records. When that table holds 8 million rows, doing two queries, one for the data and one for the count, will be expensive.
How do I get the count from the collection, not the database?
ActiveRecord loads query results lazily. If you simply want to count the records, load them into an array and call size on it (in Rails 3, .all returns an Array; from Rails 4 on, use .to_a instead):
records = Market.where("lower(name) = ?", name.downcase ).all
records.size
So, records is an ActiveRecord::Relation. You would think it's an array of all your Market records that match your where criteria, but it's not. Each time you call something like first or count on that relation, it runs a query to retrieve what you're asking for.
To get the actual records into an array, add .all to the relation (this returns an Array in Rails 3; in Rails 4 and later use .to_a). Like:
records = Market.where("lower(name) = ?", name.downcase).all
count = records.count
For Rails 6.0.1 and Ruby 2.6.5
You will need to store the results into an array by using the to_a.
records = Market.where("lower(name) = ?", name.downcase).to_a
This will create the SQL query and store the results in the array records.
Then, when you call either records.first or records.count it will only return the data or do the calculation, not rerun a query. This is the same for records.size and records.length.
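The effect can be illustrated with a toy model of a lazy relation. FakeRelation below is a made-up class that only counts how often it would hit the database; it is not how ActiveRecord is implemented, just a sketch of the behaviour described above:

```ruby
# Hypothetical toy relation: every first/count/to_a call "runs a query".
class FakeRelation
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  def first
    run_query.first
  end

  def count
    run_query.size
  end

  # Materializes the rows into a plain Array with a single query.
  def to_a
    run_query.dup
  end

  private

  def run_query
    @query_count += 1
    @rows
  end
end

# Returns [queries when calling first/count on the relation,
#          queries when calling them on the array from to_a].
def demo_query_counts
  lazy = FakeRelation.new(%w[a b c])
  lazy.first
  lazy.count

  relation = FakeRelation.new(%w[a b c])
  records = relation.to_a # the only query
  records.first           # plain Array#first, no query
  records.count           # plain Array#count, no query

  [lazy.query_count, relation.query_count]
end

p demo_query_counts
```

Calling first and count on the relation costs two queries, while materializing once with to_a costs one, after which everything is ordinary Array work.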
Another Example
I needed to do this for a blog I am developing. I was trying to run a query to find all of the tags associated with a post, and I wanted to count how many tags there were. This was causing multiple queries until I came across the to_a suffix.
So, my query looks like this:
@tags = TagMap.where(post_id: @post).joins(:tag).select(:id, '"tags"."name"').to_a
This looks through my TagMap table for all records that have post_id equal to the id of the post that I am viewing. It then joins to the Tags table and pulls only the id of the TagMap record and the name of the tag from the Tags table. Then it puts them all into an array. I can then run @tags.count and it will return the number of TagMap records for that post without doing another query.
I hope that this helps anyone using Rails 6+
