StatementInvalid Rails Query - ruby-on-rails

I've got the following query that works:
jobs = current_location.jobs.includes(:customer).all.where(complete: complete)
However, when I add a where clause to query the first name of the customer table, I get an error.
jobs = current_location.jobs.includes(:customer).all.where(complete: complete).where("customers.first_name = ?", "Bob")
Here is the error:
PG::UndefinedTable: ERROR: missing FROM-clause entry for table "customers"
LINE 1: ...bs"."complete" = $2 AND "jobs"."status" = $3 AND (customers....
^
: SELECT "jobs".* FROM "jobs" INNER JOIN "jobs_users" ON "jobs"."id" = "jobs_users"."job_id" WHERE "jobs_users"."user_id" = $1 AND "jobs"."complete" = $2 AND "jobs"."status" = $3 AND (customers.last_name = 'Bob') ORDER BY "jobs"."start" DESC LIMIT $4 OFFSET $5
The current_location method:
def current_location
  return current_user.locations.find_by(id: cookies[:current_location])
end
Location Model
has_many :jobs
has_and_belongs_to_many :customers
Job Model
belongs_to :location
belongs_to :customer
Customer Model
has_many :jobs
has_and_belongs_to_many :locations
How can I fix this issue?

includes will only join the table if you set a reference to the association.
When using includes, you can ensure a reference to the association in two ways:
You can use the references method, which joins the table whether or not there are any query conditions. (If you MUST use raw SQL, as shown in your question, this is the method you need.) For example:
current_location.jobs
.includes(:customer)
.references(:customer)
Or you can use the hash finder version of where. (Note that when you reference an association in the where clause you must use the table name, in this case customers, and not the association name customer.)
current_location.jobs
.includes(:customer)
.where(customers: {first_name: "Bob" })
Both of these will eager load the customer for the jobs referenced.
The first option (references) will OUTER JOIN the customers table so that all the jobs are loaded even if they have no customers as long as no query conditions reference the customers table.
The second option (using where) will OUTER JOIN the customers table but given the query parameter against the customers table it will act very much like an INNER JOIN.
If you only need to search the jobs based on customer information then joins is a better choice as this will create an INNER JOIN with the customers table but will not try to load any of the customer data in the query e.g.
current_location.jobs.joins(:customer).where(customers: {first_name: "Bob" })
joins will always include the associated table regardless of a reference in the query.
Sidenote: the all in both your queries is completely unnecessary
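To see why a condition on customers makes the OUTER JOIN behave like an INNER JOIN, here is a minimal plain-Ruby model of the joined rows. It does not use ActiveRecord or a database; the sample data and the helper names (left_outer_join, all_job_ids, job_ids_for_first_name) are made up for illustration.

```ruby
# Hypothetical sample data standing in for the jobs and customers tables.
JOBS = [
  { id: 1, customer_id: 10 },
  { id: 2, customer_id: 11 },
  { id: 3, customer_id: nil } # a job with no customer
].freeze

CUSTOMERS = [
  { id: 10, first_name: "Bob" },
  { id: 11, first_name: "Alice" }
].freeze

# Model of LEFT OUTER JOIN: every job is kept; a missing customer becomes nil,
# which plays the role of the NULL-filled columns in SQL.
def left_outer_join(jobs, customers)
  jobs.map do |job|
    { job: job, customer: customers.find { |c| c[:id] == job[:customer_id] } }
  end
end

# No condition on customers: all jobs survive the outer join.
def all_job_ids(rows)
  rows.map { |row| row[:job][:id] }
end

# Condition on customers.first_name: a row whose customer is nil can never
# match, so the result equals what an INNER JOIN with that condition returns.
def job_ids_for_first_name(rows, first_name)
  rows.select { |row| row[:customer] && row[:customer][:first_name] == first_name }
      .map { |row| row[:job][:id] }
end

rows = left_outer_join(JOBS, CUSTOMERS)
p all_job_ids(rows)                   # => [1, 2, 3]
p job_ids_for_first_name(rows, "Bob") # => [1]
```

Job 3 has no customer, so it appears in the unfiltered result but disappears as soon as the filter touches the customers side.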

includes(:customer) does not necessarily join the customers table into the SQL query. You need to use joins(:customer) to force Rails to join the customers table into the SQL query and make it available to query conditions.
jobs = current_location.jobs
.joins(:customer)
.includes(:customer)
.where(complete: complete)
.where(customers: { first_name: 'Bob' })

Related

Rails remove duplicated associated records

Let's say I have a User and User has_many :tags and I would like to remove all of @user's tags that have a duplicated name. For example,
@user.tags #=> [<Tag name: 'A'>, <Tag name: 'A'>, <Tag name: 'B'>]
I would like to keep only the tags with unique names and delete the rest from the database.
I know I could pull out a list of unique tag names from the user's tags, remove all of the user's tags, and re-create them with only unique names, but wouldn't that be inefficient?
On the other hand, select won't work as it returns only the selected column. uniq also won't work:
@user.tags.uniq #=> returns all tags
Is there a more efficient way?
UPDATE:
I would like to do this in a migration.
This method will give you an ActiveRecord::Relation with the duplicate tags:
class Tag < ApplicationRecord
  belongs_to :user

  def self.duplicate_tags
    unique = self.select('DISTINCT ON(tags.name, tags.user_id) tags.id')
                 .order(:name, :user_id, :id)
    self.where.not(id: unique)
  end
end
It's actually run as a single query:
SELECT "tags".* FROM "tags"
WHERE "tags"."id" NOT IN
(SELECT DISTINCT ON(tags.name, tags.user_id) tags.id
FROM "tags"
ORDER BY "tags"."name" ASC, "tags"."user_id" ASC, "tags"."id" ASC)
You can remove the duplicates in a single query with #delete_all.
# Warning! This can't be undone!
Tag.duplicate_tags.delete_all
If you need to destroy dependent associations or call your before_* or after_destroy callbacks, use the #destroy_all method instead. But you should use this together with #in_batches to avoid running out of memory.
# Warning! This can't be undone!
Tag.duplicate_tags.in_batches do |batch|
  # destroys a batch of 1000 records
  batch.destroy_all
end
You can write a model-independent SQL query in the migration.
Here is PostgreSQL-specific migration code:
execute <<-SQL
  DELETE FROM tags
  WHERE id NOT IN (
    SELECT DISTINCT ON(user_id, name) id FROM tags
    ORDER BY user_id, name, id ASC
  )
SQL
And here is a more portable SQL version:
execute <<-SQL
  DELETE FROM tags
  WHERE id IN (
    SELECT DISTINCT t2.id FROM tags t1
    INNER JOIN tags t2
    ON (
      t1.user_id = t2.user_id AND
      t1.name = t2.name AND
      t1.id < t2.id
    )
  )
SQL
This SQL fiddle shows different queries you can use as the sub-select in the DELETE query depending on your goal: deleting the first, the last, or all duplicates.
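As a sanity check of which rows the two DELETE statements above target, here is a small plain-Ruby model. It is only a sketch: the sample rows and the helper names (ids_to_keep, ids_to_delete, ids_with_smaller_twin) are invented, and nothing here talks to a database.

```ruby
# Hypothetical rows standing in for the tags table.
TAGS = [
  { id: 1, user_id: 1, name: "A" },
  { id: 2, user_id: 1, name: "A" }, # duplicate of id 1
  { id: 3, user_id: 1, name: "B" },
  { id: 4, user_id: 2, name: "A" }, # same name, different user: not a duplicate
  { id: 5, user_id: 1, name: "A" }  # another duplicate of id 1
].freeze

# Ids that survive: the lowest id in each (user_id, name) group, mirroring
# DISTINCT ON (user_id, name) ... ORDER BY user_id, name, id.
def ids_to_keep(tags)
  tags.group_by { |t| [t[:user_id], t[:name]] }
      .map { |_key, group| group.map { |t| t[:id] }.min }
end

# Ids removed by DELETE ... WHERE id NOT IN (ids_to_keep).
def ids_to_delete(tags)
  keep = ids_to_keep(tags)
  tags.map { |t| t[:id] }.reject { |id| keep.include?(id) }
end

# Model of the self-join variant: a row is deleted when another row with the
# same user_id and name has a smaller id.
def ids_with_smaller_twin(tags)
  tags.select do |t2|
    tags.any? do |t1|
      t1[:user_id] == t2[:user_id] && t1[:name] == t2[:name] && t1[:id] < t2[:id]
    end
  end.map { |t| t[:id] }
end

p ids_to_keep(TAGS).sort           # => [1, 3, 4]
p ids_to_delete(TAGS).sort         # => [2, 5]
p ids_with_smaller_twin(TAGS).sort # => [2, 5]
```

On this sample both variants select the same rows, and tag 4 is kept because the grouping includes user_id.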

How to do this PG query in Rails?

I have the following simple relations:
class Company
  has_many :users
end
class User
  belongs_to :company
  has_and_belongs_to_many :roles
end
class Role
  has_and_belongs_to_many :users
end
The only column that matters is :name on Role.
I'm trying to make an efficient PostgreSQL query which will show a comma separated list of all role_names for each user.
What I have so far works great if only a single role is assigned. If I add another role, I get duplicate users. Rather than trying to parse this afterwards, I'd like the query to return a comma separated list in a role_names field by using the string_agg() function.
This is my query so far and I'm kind of failing at taking it this final step.
User.where(company_id: id)
.joins(:roles)
.select('distinct users.*, roles.name as role_name')
EDIT
I can get it working via raw SQL (gross) but rails doesn't know how to understand it when I put it in ActiveRecord format
ActiveRecord::Base.connection.execute('SELECT users.*, string_agg("roles"."name", \',\') as roles FROM "users" INNER JOIN "roles_users" ON "roles_users"."user_id" = "users"."id" INNER JOIN "roles" ON "roles"."id" = "roles_users"."role_id" WHERE "users"."company_id" = 1 GROUP BY users.id')
User.where(company_id: id)
.joins(:roles)
.select('users.*, string_agg("roles"."name" \',\')')
.group('users.id')
It looks to me like you want to do this for each user:
user.roles.map(&:name).join(',')
(In my opinion SQL is a better choice when working with databases, but when you are on Rails you should probably do as much as possible with Active Record. Be aware of performance issues!)
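For comparison, the result shape the asker wants from GROUP BY users.id plus string_agg() can be modelled in plain Ruby. This is a sketch on invented sample data, not ActiveRecord code; role_names_by_user is a hypothetical helper. Separately, the asker's second select appears to be missing the comma between the two string_agg arguments and an alias for the aggregate, which may be part of why it fails.

```ruby
# Hypothetical rows standing in for users, roles and the roles_users join table.
USERS = [
  { id: 1, login: "ann" },
  { id: 2, login: "bo" },
  { id: 3, login: "cy" } # has no roles
].freeze

ROLES = [
  { id: 1, name: "admin" },
  { id: 2, name: "editor" }
].freeze

ROLES_USERS = [
  { user_id: 1, role_id: 1 },
  { user_id: 1, role_id: 2 },
  { user_id: 2, role_id: 2 }
].freeze

def role_names_by_user(users, roles, roles_users)
  name_of     = roles.map { |r| [r[:id], r[:name]] }.to_h
  users_by_id = users.map { |u| [u[:id], u] }.to_h

  # INNER JOIN: one row per (user, role) pair; users without roles drop out.
  joined = roles_users.map do |ru|
    { user: users_by_id[ru[:user_id]], role_name: name_of[ru[:role_id]] }
  end

  # GROUP BY users.id, then string_agg(roles.name, ',') within each group.
  joined.group_by { |row| row[:user][:id] }.map do |_id, rows|
    rows.first[:user].merge(role_names: rows.map { |r| r[:role_name] }.join(","))
  end
end

role_names_by_user(USERS, ROLES, ROLES_USERS).each do |user|
  puts "#{user[:login]}: #{user[:role_names]}"
end
```

Each user appears once with all role names in one field. As in the INNER JOIN of the raw SQL, a user with no roles (cy here) is absent from the result.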

Postgres error when trying to display all items through a relationship

This works on SQLite3, but not on PostgreSQL.
The error I'm getting is PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
I'm trying to display all exercises that are in a group with the code: current_user.group.exercises
Here is the relationship
A group has_many workouts, and a workout has_many exercises
In my Group model I have has_many :exercises, through: :workouts
Any ideas?
EDIT 1:
Here is the SQL rails is generating:
SELECT DISTINCT "exercises".*
FROM "exercises"
INNER JOIN "workout_exercises" ON "exercises"."id" = "workout_exercises"."exercise_id"
INNER JOIN "workouts" ON "workout_exercises"."workout_id" = "workouts"."id"
INNER JOIN "groups_workouts" ON "workouts"."id" = "groups_workouts"."workout_id"
WHERE "groups_workouts"."group_id" = 2
ORDER BY exercise_order, workout_order
And here is the error:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ..." WHERE "groups_workouts"."group_id" = 2 ORDER BY exercise_o...
^
: SELECT DISTINCT "exercises".* FROM "exercises" INNER JOIN "workout_exercises" ON "exercises"."id" = "workout_exercises"."exercise_id" INNER JOIN "workouts" ON "workout_exercises"."workout_id" = "workouts"."id" INNER JOIN "groups_workouts" ON "workouts"."id" = "groups_workouts"."workout_id" WHERE "groups_workouts"."group_id" = 2 ORDER BY exercise_order, workout_order
So this is a uniq constraint exception. On the model I had has_many :exercises, through: :workouts, uniq: true which Postgres didn't like.
To fix the error, I moved the uniq constraint from the model to the actual query. So in this situation, I just did current_user.group.exercises.uniq
This only sort of solves my problem. There are situations where I would want to have a uniq constraint at the model level, but I haven't been able to find a way to do that yet.
Each SQL variant has slightly different rules as to what expressions it accepts. For example, see Simulating MySQL's ORDER BY FIELD() in Postgresql and related links for information on this issue. If you give the specifics of the SQL you're generating, you can probably get more specific advice.
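One way to read the Postgres error: once DISTINCT collapses the joined rows to one row per exercise, an ORDER BY column that is not part of the select list may have several candidate values for the same exercise, so there is no well-defined sort position. The plain-Ruby sketch below models that on invented rows; the column name follows the question's SQL, but which table it lives on is an assumption, and both helpers are hypothetical.

```ruby
# Hypothetical joined rows: exercise 7 is reached through two workouts that
# carry different workout_order values.
JOINED_ROWS = [
  { exercise_id: 7, workout_order: 1 },
  { exercise_id: 8, workout_order: 2 },
  { exercise_id: 7, workout_order: 3 }
].freeze

# After DISTINCT on the exercise columns only one row per exercise remains,
# but each exercise may still have several candidate sort keys.
def sort_keys_per_exercise(rows)
  rows.group_by { |row| row[:exercise_id] }
      .transform_values { |group| group.map { |row| row[:workout_order] } }
end

# Exercises with more than one distinct candidate key cannot be ordered
# unambiguously by that key.
def ambiguous_exercises(rows)
  sort_keys_per_exercise(rows).select { |_id, keys| keys.uniq.size > 1 }.keys
end

p sort_keys_per_exercise(JOINED_ROWS) # keys 7 and 8 with their candidate orders
p ambiguous_exercises(JOINED_ROWS)    # => [7]
```

In this model exercise 7 could sort first or last.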

Specifying conditions on eager loaded associations returns ActiveRecord::RecordNotFound

The problem is that when a Restaurant does not have any MenuItems that match the condition, ActiveRecord says it can't find the Restaurant. Here's the relevant code:
class Restaurant < ActiveRecord::Base
  has_many :menu_items, dependent: :destroy
  has_many :meals, through: :menu_items

  def self.with_meals_of_the_week
    includes({menu_items: :meal}).where(:'menu_items.date' => Time.now.beginning_of_week..Time.now.end_of_week)
  end
end
And the sql code generated:
Restaurant Load (0.0ms) SELECT DISTINCT "restaurants".id FROM "restaurants"
LEFT OUTER JOIN "menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN "meals" ON "meals"."id" = "menu_items"."meal_id" WHERE
"restaurants"."id" = ? AND ("menu_items"."date" BETWEEN '2012-10-14 23:00:00.000000'
AND '2012-10-21 22:59:59.999999') LIMIT 1 [["id", "1"]]
However, according to this part of the Rails Guides, this shouldn't be happening:
Post.includes(:comments).where("comments.visible", true)
If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded.
The SQL generated is a correct translation of your query. But look at it
just at the SQL level (I shortened it a bit):
SELECT *
FROM
"restaurants"
LEFT OUTER JOIN
"menu_items" ON "menu_items"."restaurant_id" = "restaurants"."id"
LEFT OUTER JOIN
"meals" ON "meals"."id" = "menu_items"."meal_id"
WHERE
"restaurants"."id" = ?
AND
("menu_items"."date" BETWEEN '2012-10-14' AND '2012-10-21')
The left outer joins do the work you expect them to do: restaurants
are combined with menu_items and meals; if there is no menu_item to
go with a restaurant, the restaurant is still kept in the result, with
all the missing pieces (menu_items.id, menu_items.date, ...) filled in with NULL.
Now look at the second part of the WHERE clause: the BETWEEN operator requires
that menu_items.date is not NULL, and this
is where you filter out all the restaurants without meals.
So we need to change the query in a way that makes NULL dates acceptable.
Going back to Ruby, you can write:
def self.with_meals_of_the_week
  includes({menu_items: :meal})
    .where('menu_items.date is NULL or menu_items.date between ? and ?',
      Time.now.beginning_of_week,
      Time.now.end_of_week
    )
end
The resulting SQL is now
.... WHERE (menu_items.date is NULL or menu_items.date between '2012-10-21' and '2012-10-28')
and the restaurants without meals stay in.
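The effect of the two WHERE clauses can be checked with a small plain-Ruby model of the LEFT OUTER JOIN. This is a sketch on invented data with hypothetical helper names; it does not run any SQL. It also shows a limit of the NULL-aware condition: a restaurant whose menu items all fall outside the week is still filtered out.

```ruby
require "date"

# Hypothetical week and rows standing in for restaurants and menu_items.
WEEK = (Date.new(2012, 10, 15)..Date.new(2012, 10, 21)).freeze

RESTAURANTS = [
  { id: 1, name: "Has an item this week" },
  { id: 2, name: "Has no menu items at all" },
  { id: 3, name: "Only has an older item" }
].freeze

MENU_ITEMS = [
  { restaurant_id: 1, date: Date.new(2012, 10, 17) },
  { restaurant_id: 3, date: Date.new(2012, 10, 1) }
].freeze

# LEFT OUTER JOIN: one row per (restaurant, menu_item) pair, or a single row
# with a nil date (SQL NULL) when the restaurant has no menu items.
def joined_rows(restaurants, menu_items)
  restaurants.flat_map do |restaurant|
    items = menu_items.select { |m| m[:restaurant_id] == restaurant[:id] }
    dates = items.empty? ? [nil] : items.map { |m| m[:date] }
    dates.map { |date| { restaurant_id: restaurant[:id], date: date } }
  end
end

# WHERE menu_items.date BETWEEN ...: a nil date never matches.
def ids_with_between(rows, week)
  rows.select { |row| row[:date] && week.cover?(row[:date]) }
      .map { |row| row[:restaurant_id] }.uniq
end

# WHERE menu_items.date IS NULL OR menu_items.date BETWEEN ...
def ids_with_null_or_between(rows, week)
  rows.select { |row| row[:date].nil? || week.cover?(row[:date]) }
      .map { |row| row[:restaurant_id] }.uniq
end

rows = joined_rows(RESTAURANTS, MENU_ITEMS)
p ids_with_between(rows, WEEK)         # => [1]
p ids_with_null_or_between(rows, WEEK) # => [1, 2]
```

Restaurant 2 comes back with the NULL-aware condition, while restaurant 3 is excluded either way because its only joined row has a non-NULL date outside the range.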
As the Rails Guide says, all Posts are returned only if you do not combine a where clause on the included table with includes. Such a where clause generates an OUTER JOIN whose WHERE condition filters on the joined (right-hand) table, so rows without a matching associated record are dropped.
This implementation is very helpful when you need some base objects (all of them, or a subset selected with a where on the base model) and want their related models loaded if there are any; if there are none, you still get the list of base models.
On the other hand, if you put conditions on the included tables, in most cases you want only the objects that satisfy those conditions, which here means only the Restaurants that have menu_items.
So in your case, if you still want to use only 2 queries (and not N+1) I would probably do something like this:
class Restaurant < ActiveRecord::Base
  has_many :menu_items, dependent: :destroy
  has_many :meals, through: :menu_items
  # Per-instance holder for this week's menu items (not persisted)
  attr_accessor :meals_of_the_week

  def self.with_meals_of_the_week
    restaurants = Restaurant.all.to_a
    week = Time.now.beginning_of_week..Time.now.end_of_week
    # Load this week's menu items (with their meals) and group them by restaurant
    items_by_restaurant = MenuItem.includes(:meal)
                                  .where(date: week, restaurant_id: restaurants.map(&:id))
                                  .group_by(&:restaurant_id)
    restaurants.each { |r| r.meals_of_the_week = items_by_restaurant[r.id] || [] }
    restaurants
  end
end
Update: Rails 4 raises a deprecation warning when you put string conditions on an included table without also calling references.
Sorry for possible typo.
I think there is a misunderstanding of this passage:
If there was no where condition, this would generate the normal set of two queries.
If, in the case of this includes query, there were no comments for any
posts, all the posts would still be loaded. By using joins (an INNER
JOIN), the join conditions must match, otherwise no records will be
returned.
[from guides]
I think this statement doesn't refer to the example Post.includes(:comments).where("comments.visible", true)
but to the one without a where clause, Post.includes(:comments)
So everything works correctly! This is how LEFT OUTER JOIN works.
So... you wrote: "If, in the case of this includes query, there were no comments for any posts, all the posts would still be loaded." Ok! But this is true ONLY when there is NO where clause! You missed the context of the phrase.

How do I get Rails ActiveRecord to generate optimized SQL?

Let's say that I have 4 models which are related in the following ways:
Schedule has foreign key to Project
Schedule has foreign key to User
Project has foreign key to Client
In my Schedule#index view I want the most optimized SQL so that I can display links to the Schedule's associated Project, Client, and User. So, I should not pull all of the columns for the Project, Client, and User; only their IDs and Name.
If I were to manually write the SQL it might look like this:
select
s.id,
s.schedule_name,
s.schedule_type,
s.project_id,
p.name project_name,
p.client_id client_id,
c.name client_name,
s.user_id,
u.login user_login,
s.created_at,
s.updated_at,
s.data_count
from
Users u inner join
Clients c inner join
Schedules s inner join
Projects p
on p.id = s.project_id
on c.id = p.client_id
on u.id = s.user_id
order by
s.created_at desc
My question is: What would the ActiveRecord code look like to get Rails 3 to generate that SQL? For example, something like:
@schedules = Schedule. # ?
I already have the associations setup in the models (i.e. has_many / belongs_to).
I think this will get you (or at least help you get) what you're looking for:
Schedule.select("schedules.id, schedules.schedule_name, projects.name as project_name").joins(:user, :project=>:client).order("schedules.created_at DESC")
should yield:
SELECT schedules.id, schedules.schedule_name, projects.name as project_name FROM `schedules` INNER JOIN `users` ON `users`.`id` = `schedules`.`user_id` INNER JOIN `projects` ON `projects`.`id` = `schedules`.`project_id` INNER JOIN `clients` ON `clients`.`id` = `projects`.`client_id` ORDER BY schedules.created_at DESC
The main problem I see in your approach is that you're looking for schedule objects but basing your initial "FROM" clause on "User" and your associations given are also on Schedule, so I built this solution based on the plain assumption that you want schedules!
I also didn't include all of your selects to save some typing, but you get the idea. You will simply have to add each one qualified with its full table name.
