ActiveRecord/Postgres: aggregate functions are not allowed in WHERE - ruby-on-rails

I want to retrieve contacts, which have many activities, where the max completed_date of the latter was more than 6 months ago. Let me illustrate it:
user = User.first
user.contacts.first.activities.maximum(:completed_date)
# SELECT MAX("activities"."completed_date") AS max_id FROM "activities" WHERE "activities"."user_id" = $1 [["user_id", 12]]
=> 2014-03-18 09:06:54 UTC
That's perfect. Now I want to use that as a condition in a WHERE clause, but it seems I can't:
user.contacts.joins(:activities)
.where('MAX("activities"."completed_date") < ?', Time.now - 6.months)
# SELECT "contacts".* FROM "contacts"
# INNER JOIN "activities" ON "activities"."contact_id" = "contacts"."id"
# WHERE "contacts"."user_id" = $1 AND (MAX("activities"."completed_date") <= '2013-09-23 05:55:21.191254') [["user_id", 12]]
#=> PG::GroupingError: ERROR: aggregate functions are not allowed in WHERE
# LINE 1: ...ntacts"."id" WHERE "contacts"."user_id" = $1 AND (MAX("activ...
How am I supposed to do this?

It is complaining because MAX, an aggregate function, is called in the WHERE clause.
To avoid this problem call MAX in SELECT with AS to alias it. Then use the alias in the WHERE.
user.contacts.select('*, MAX("activities"."completed_date") AS max_complete_date').joins(:activities)
.where('max_complete_date < ?', Time.now - 6.months)
Edit
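I apologize, the alias approach above does not work, because WHERE is evaluated before the SELECT list and cannot see the alias. You should use HAVING instead.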
user.contacts.joins(:activities)
.having('MAX("activities"."completed_date") < ?', Time.now - 6.months).group("contacts.id")
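To see why the filter belongs in HAVING rather than WHERE, here is a minimal plain-Ruby sketch of the same logic on in-memory data. The method name `stale_contact_ids` and the sample rows are made up for illustration; only `contact_id` and `completed_date` come from the question.

```ruby
require 'date'

# Returns the contact ids whose most recent activity is older than cutoff.
# activities is an array of hashes standing in for rows of the activities table.
def stale_contact_ids(activities, cutoff)
  # GROUP BY contact_id, then MAX(completed_date) per group.
  latest_by_contact = activities
    .group_by { |row| row[:contact_id] }
    .transform_values { |rows| rows.map { |row| row[:completed_date] }.max }

  # HAVING MAX(completed_date) < cutoff: the filter runs on whole groups,
  # after aggregation, which is why it cannot be expressed in WHERE.
  latest_by_contact.select { |_id, latest| latest < cutoff }.keys
end

sample = [
  { contact_id: 1, completed_date: Date.new(2013, 1, 10) },
  { contact_id: 1, completed_date: Date.new(2013, 5, 2) },
  { contact_id: 2, completed_date: Date.new(2013, 2, 1) },
  { contact_id: 2, completed_date: Date.new(2014, 3, 18) }
]
puts stale_contact_ids(sample, Date.new(2013, 9, 23)).inspect
```

Contact 2 has an old activity as well, but it is excluded because the comparison is made against the group's maximum, not against individual rows.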

Related

Remove duplicated records keeping last using ActiveRecord

I've been trying to remove the records that are duplicated (same value in the column shopify_order_id) keeping the most recent one.
I wrote it in sql:
select orders.id from (
select shopify_order_id, min(shopify_created_at) as min_created
from orders group by shopify_order_id having count(*) > 1 limit 5000
) as keep_orders
join orders
on
keep_orders.shopify_order_id = orders.shopify_order_id and
orders.shopify_created_at <> keep_orders.min_created
and now I'm trying to get it to Active Record but can't seem to join the two parts.
The first nested select is
Order.select('shopify_order_id, MIN(shopify_created_at) as min_created').
group(:shopify_order_id).
having('count(*) > 1').
limit(5000)
but then the following doesn't work:
Order.select('orders.id').from(keep_orders, :keep_orders).
joins('orders ON keep_orders.shopify_order_id = orders.shopify_order_id').
where.not('orders.shopify_created_at = keep_orders.min_created')
it builds the query:
SELECT orders.id FROM (SELECT shopify_order_id, MIN(shopify_created_at) as min_created FROM "orders" GROUP BY "orders"."shopify_order_id" HAVING (count(*) > 1) LIMIT $1) keep_orders orders ON keep_orders.shopify_order_id = orders.shopify_order_id WHERE NOT (orders.shopify_created_at = keep_orders.min_created) ORDER BY "orders"."id" ASC LIMIT $2 [["LIMIT", 5000], ["LIMIT", 1]]
which is missing the keyword join.
Any help on how to refactor the query/do it in another way would be more than appreciated.
If you call joins with a string SQL fragment you need to specify the type of join you want:
Order.select('orders.id').from(keep_orders, :keep_orders)
.joins('JOIN orders ON keep_orders.shopify_order_id = orders.shopify_order_id')
.where.not('orders.shopify_created_at = keep_orders.min_created')
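As a sanity check of what the query is meant to return, the same selection can be sketched in plain Ruby on in-memory rows. This is only an illustration: `duplicate_order_ids` is a hypothetical helper, and it mirrors the SQL as posted, which uses MIN and therefore keeps the earliest row per shopify_order_id. Swapping `min` for `max` would keep the most recent one instead.

```ruby
# Returns the ids of rows that would be removed: every row in a duplicated
# group except the one carrying the minimum shopify_created_at.
def duplicate_order_ids(orders)
  orders
    .group_by { |order| order[:shopify_order_id] }
    .select { |_key, group| group.size > 1 } # HAVING count(*) > 1
    .flat_map { |_key, group|
      min_created = group.map { |order| order[:shopify_created_at] }.min
      # orders.shopify_created_at <> keep_orders.min_created
      group.reject { |order| order[:shopify_created_at] == min_created }
           .map { |order| order[:id] }
    }
end

sample = [
  { id: 1, shopify_order_id: 'A', shopify_created_at: Time.utc(2020, 1, 1) },
  { id: 2, shopify_order_id: 'A', shopify_created_at: Time.utc(2020, 1, 5) },
  { id: 3, shopify_order_id: 'B', shopify_created_at: Time.utc(2020, 2, 1) },
  { id: 4, shopify_order_id: 'A', shopify_created_at: Time.utc(2020, 1, 3) }
]
puts duplicate_order_ids(sample).inspect
```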

Rails Query Multiple Params From Same Table

How can I search for multiple params? I have checkboxes in my view, so if multiple checkboxes are selected, I would like all the params selected to be chosen. I can currently only get the search to work with one param with code below.
There is a has_many to has_many association between car model and colour_collection model.
Controller:
@cars = Car.joins(:colour_collections).where("colour_collections.name = ?", params[:colour_collection])
The logs show this when two colours are selected (e.g. red and green), creating duplicates in the resulting query:
(0.7ms) SELECT COUNT(*) FROM "colour_collections"
ColourCollection Load (0.5ms) SELECT "colour_collections".* FROM "colour_collections"
Car Load (2.5ms) SELECT "cars".* FROM "cars" INNER JOIN "car_colour_collections" ON "car_colour_collections"."car_id" = "cars"."id" INNER JOIN "colour_collections" ON "colour_collections"."id" = "car_colour_collections"."colour_collection_id" WHERE "colour_collections"."name" IN ('Subtle', 'Intermediate') ORDER BY "cars"."created_at" DESC
CarAttachment Load (0.5ms) SELECT "car_attachments".* FROM "car_attachments" WHERE "car_attachments"."car_id" = $1 ORDER BY "car_attachments"."id" ASC LIMIT $2 [["car_id", 21], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "car_attachments".* FROM "car_attachments" WHERE "car_attachments"."car_id" = $1 ORDER BY "car_attachments"."id" ASC LIMIT $2 [["car_id", 21], ["LIMIT", 1]]
CarAttachment Load (0.5ms) SELECT "car_attachments".* FROM "car_attachments" WHERE "car_attachments"."car_id" = $1 ORDER BY "car_attachments"."id" ASC LIMIT $2 [["car_id", 20], ["LIMIT", 1]]
CACHE (0.0ms) SELECT "car_attachments".* FROM "car_attachments" WHERE "car_attachments"."car_id" = $1 ORDER BY "car_attachments"."id" ASC LIMIT $2 [["car_id", 20], ["LIMIT", 1]]
If you want to search for multiple values in a single column for example
params[:colour_collection] = ['red','green','blue']
Then you would expect your query to look like this
SELECT * FROM cars c
INNER JOIN colour_collections s
WHERE s.name IN ('red','green','blue');
In this case the corresponding ActiveRecord statement would look like this
Car.
joins(:colour_collections).
where(colour_collections: { name: params[:colour_collection] })
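The duplicates mentioned in the question come from the join itself: a car that matches two selected colours appears once per matching row. A small plain-Ruby sketch of that effect, with made-up rows and a hypothetical helper name `matching_car_ids`; in ActiveRecord the usual counterpart of the final `uniq` would be calling `distinct` on the relation.

```ruby
# joined_rows stands in for the result of joining cars to colour collections:
# one row per (car, colour collection) pair.
def matching_car_ids(joined_rows, selected)
  names = Array(selected) # accepts a single name or a list of names
  # WHERE colour_collections.name IN (...)
  matches = joined_rows.select { |row| names.include?(row[:colour]) }
  # Without uniq, a car matching several colours is listed several times.
  matches.map { |row| row[:car_id] }.uniq
end

sample = [
  { car_id: 20, colour: 'Subtle' },
  { car_id: 20, colour: 'Intermediate' },
  { car_id: 21, colour: 'Subtle' },
  { car_id: 22, colour: 'Bold' }
]
puts matching_car_ids(sample, ['Subtle', 'Intermediate']).inspect
```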
Rails 5 comes with an or method, but Rails 4 does not, so in Rails 4 you can use a plain SQL fragment instead.
In Rails 4:
@cars = Car.
joins(:colour_collections).
where("colour_collections.name = ? or colour_collections.type = ?", params[:colour_collection], params[:type])
In Rails 5:
@cars = Car.
joins(:colour_collections).
where("colour_collections.name = ?", params[:colour_collection]).or(Car.joins(:colour_collections).where("colour_collections.type = ?", params[:type]))
It depends on whether you want to use OR or AND. There are multiple ways of achieving this, but a simple example of AND is
Article.where(trashed: true).where(trashed: false)
The SQL generated will be roughly
SELECT * FROM articles WHERE "trashed" = 1 AND "trashed" = 0
For OR, the norm in Rails 5 is
Foo.where(foo: 'bar').or(Foo.where(bar: 'bar'))
or simply
Foo.where('foo = ? OR bar = ?', 'bar', 'bar')
Applied to the question, chained where calls are combined with AND:
@cars = Car.joins(:colour_collections).where("colour_collections.name = ?", params[:colour_collection]).where("cars.make = ?", params[:make])
More discussion on chaining How does Rails ActiveRecord chain "where" clauses without multiple queries?

Complex ActiveRecord query comparing datetime through many to many relation

So the objective of my query:
Fetch all of a single user's clients that have not had a meeting since 30 days ago.
A Client has_many :meetings, through: :contacts although contacts isn't very relevant here.
My query is as follows:
user.clients.where(is_dormant: false).joins(:meetings).distinct.where('meetings.actual_start_datetime <= ?', 30.days.ago).where.not('meetings.actual_start_datetime > ?', 30.days.ago)
which produces this SQL:
SELECT DISTINCT "clients".* FROM "clients" INNER JOIN "contacts" ON "contacts"."client_id" = "clients"."id" INNER JOIN "meetings" ON "meetings"."contact_id" = "contacts"."id" INNER JOIN "clients_users" ON "clients"."id" = "clients_users"."client_id" WHERE "clients_users"."user_id" = $1 AND "clients"."is_dormant" = $2 AND (meetings.actual_start_datetime <= '2016-12-31 20:29:08.972999') AND (NOT (meetings.actual_start_datetime > '2016-12-31 20:29:08.973484')) ORDER BY "clients"."name" ASC [["user_id", 1], ["is_dormant", "f"]]
But it seems to just ignore the where.not('meetings.actual_start_datetime > ?', 30.days.ago) clause. If I run the query without that clause, it returns the exact same result.
After many days of deliberating, it seems the easiest way to do this is get all of the clients who have had a meeting 30 or more days ago, then subtract from that array the clients who have had a meeting in the last 30 days, eg:
user_clients.without_recent_meetings - user_clients.with_recent_meetings
Is there any way to do this in one query, as this way means having to run a complex query twice?
Try this one
user.clients.where(is_dormant: false).joins(:meetings).distinct.where('meetings.actual_start_datetime <= :cutoff AND clients.id NOT IN (SELECT c.client_id FROM contacts c INNER JOIN meetings m ON m.contact_id = c.id WHERE m.actual_start_datetime > :cutoff)', cutoff: 30.days.ago)
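The reason the extra where.not changes nothing is that both conditions are checked per joined row: for any single meeting, "start <= cutoff" and "NOT (start > cutoff)" say the same thing. A client with one old and one recent meeting still has a row that passes. The exclusion has to be made per client, which is what the NOT IN subquery does. A minimal plain-Ruby sketch of that logic, with made-up data and a hypothetical helper name:

```ruby
require 'date'

# meetings is an array of hashes, each already resolved to its client_id.
def clients_without_recent_meetings(meetings, cutoff)
  # Clients having at least one meeting at or before the cutoff.
  old_ids = meetings.select { |m| m[:start] <= cutoff }.map { |m| m[:client_id] }.uniq
  # Clients having at least one meeting after the cutoff (the NOT IN subquery).
  recent_ids = meetings.select { |m| m[:start] > cutoff }.map { |m| m[:client_id] }.uniq
  old_ids - recent_ids
end

cutoff = Date.new(2016, 12, 31)
sample = [
  { client_id: 1, start: Date.new(2016, 11, 1) },
  { client_id: 1, start: Date.new(2017, 1, 20) },
  { client_id: 2, start: Date.new(2016, 12, 1) },
  { client_id: 3, start: Date.new(2017, 1, 25) }
]
puts clients_without_recent_meetings(sample, cutoff).inspect
```

Client 1 is dropped even though it has an old meeting, because it also has a recent one. This is the same result as the array subtraction described in the question, expressed as one pass over the data.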

How to reduce database queries and is it Worth it?

I currently have an app that lists flights from one location to another, its price and other information. I have implemented a search through a drop down list so it only shows flights either from a certain location, to a certain location or from and to a certain location, depending on how the user searches.
def index
  @flights = Flight.all
  @flights_source = @flights.select('DISTINCT source') # this line is used for options_from_collection_for_select in view
  @flights_destination = @flights.select('DISTINCT destination') # this line is used for options_from_collection_for_select in view
  if params[:leaving_from].present? && params[:going_to].blank?
    @flights = Flight.where(:source => params[:leaving_from])
  elsif params[:going_to].present? && params[:leaving_from].blank?
    @flights = Flight.where(:destination => params[:going_to])
  elsif params[:leaving_from].present? && params[:going_to].present?
    @flights = Flight.where(:source => params[:leaving_from]).where(:destination => params[:going_to])
  end
end
The problem is every time I want to add another search parameter, for example price, it's going to be another query. Is there a way to take Flight.all and search within the result and make a new hash or array with only the records that match the search terms, instead of doing a new query with select DISTINCT.
The closest thing I could come up with is somehow turning the result of Flight.all into a array[hash] and using that get the results for distinct source and destination. But not sure how to do that.
And finally would it be worth it to do this to reduce the number of database queries?
These are the current queries:
Flight Load (1.4ms) SELECT "flights".* FROM "flights"
User Load (1.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 ORDER BY "users"."id" ASC LIMIT 1 [["id", 2]]
Flight Load (1.4ms) SELECT DISTINCT source FROM "flights"
Flight Load (0.8ms) SELECT DISTINCT destination FROM "flights"
EDIT:
I changed the select distinct to
@flights_source = @flights.uniq.pluck(:source)
@flights_destination = @flights.uniq.pluck(:destination)
And I used options_for_select instead of options_from_collection_for_select in the view. The queries are still there, though. I think this means I have eliminated as much as I can, but I'm not sure.
(0.8ms) SELECT DISTINCT "flights"."source" FROM "flights"
(0.6ms) SELECT DISTINCT "flights"."destination" FROM "flights"
Request Load (1.3ms) SELECT "requests".* FROM "requests"
Flight Load (1.0ms) SELECT "flights".* FROM "flights"
User Load (0.5ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 ORDER BY "users"."id" ASC LIMIT 1 [["id", 2]]
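One way to keep the if/elsif chain from growing with every new search field is to build a single conditions hash from whichever parameters are present. The sketch below is plain Ruby and only an assumption about how this could be structured: `flight_conditions` is a hypothetical helper, and `params` is treated as an ordinary hash with symbol keys rather than the real controller params object.

```ruby
# Builds a hash of column => value from the search parameters that were
# actually filled in. The default mapping uses the names from the question;
# a new filter such as price would be one more entry in the mapping.
def flight_conditions(params, columns = { leaving_from: :source, going_to: :destination })
  columns.each_with_object({}) do |(param, column), conditions|
    value = params[param]
    # Skip parameters that are missing or blank.
    conditions[column] = value unless value.nil? || value.to_s.strip.empty?
  end
end

puts flight_conditions({ leaving_from: 'Cairo', going_to: '' }).inspect
# In the controller the result would then be passed to a single call such as
#   Flight.where(flight_conditions(params))
# assuming an empty hash adds no conditions and so returns all flights.
```

This still issues one query for the filtered flights, but it is the same single query no matter how many filters are added.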

How to lock a parent record in Rails/ActiveRecord?

When there is a single parent record associated with multiple child records, using row locking on the parent record is an obvious way to ensure consistency. However, I cannot seem to find a clean way to do this in ActiveRecord.
For example, say we have two models: Order and OrderProduct.
class Order < ActiveRecord::Base
  has_many :order_products
  ...
end
class OrderProduct < ActiveRecord::Base
  belongs_to :order
  ...
end
Updating an OrderProduct affects the overall state of the Order, so we want to make sure only one transaction is updating an Order at any given time.
If we're trying to achieve this when editing an OrderProduct, the cleanest way in ruby I can see is:
def edit
  product = OrderProduct.find params[:id]
  Order.transaction do
    product.order.lock!
    # Make sure no changes have occurred while we were waiting for the lock
    product.reload
    # Do stuff...
    product.order.some_method
  end
end
However, this is rather inefficient with SQL queries, producing:
SELECT "order_products".* FROM "order_products" WHERE "order_products"."id" = $1 LIMIT 1 [["id", "2"]]
SELECT "orders".* FROM "orders" WHERE "orders"."id" = 2 LIMIT 1
SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 LIMIT 1 FOR UPDATE [["id", 2]]
SELECT "order_products".* FROM "order_products" WHERE "order_products"."id" = $1 LIMIT 1 [["id", 2]]
SELECT "orders".* FROM "orders" WHERE "orders"."id" = 2 LIMIT 1
We can reduce the number of queries by changing this to something along the lines of:
def edit
  product = OrderProduct.find params[:id]
  Order.transaction do
    order = Order.find product.order_id, lock: true
    # Make sure no changes have occurred while we were waiting for the lock
    product.reload
    # Cache the association
    product.order = order
    # Do stuff...
    product.order.some_method
  end
end
which produces better SQL:
SELECT "order_products".* FROM "order_products" WHERE "order_products"."id" = $1 LIMIT 1 [["id", "2"]]
SELECT "orders".* FROM "orders" WHERE "orders"."id" = 2 LIMIT 1 FOR UPDATE [["id", 2]]
SELECT "order_products".* FROM "order_products" WHERE "order_products"."id" = $1 LIMIT 1 [["id", 2]]
However the code is messier.
Is there a cleaner way of doing this with ActiveRecord? Calling product.order = order just to get the association cached seems a little dangerous.
For a simple Rails way of locking, check out http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
.lock.load is probably what you are looking for.
