Arel count on union results - ruby-on-rails

I have two tables ITEMS and ITEM_AUDITS. An item can have 0 or more audit records associated to it. Resource and status are required for fields and lookups.
I am trying to create a query that counts the number of occurrences of the item_id in both tables, effectively producing data that looks like the example below:
Title  Item_ID  Count
ABC    1        2
ABC    3        4
Bible  5        1
I have been able to create a union query that produces the data without the counts:
# Declare Arel objects
i = Item.arel_table
ia = ItemAudit.arel_table
s = Status.arel_table
r = Resource.arel_table
######################
# Build the item query
######################
# Build the joins
item = Item.joins(
i.join(r, Arel::Nodes::OuterJoin).on(i[:resource_id].eq(r[:id]))
.join(s, Arel::Nodes::OuterJoin).on(i[:status_id].eq(s[:id]))
.join_sql
).uniq
# Build the select columns
#item = item.select('resources.title, items.id as id, count(*) as loan_count')
item = item.select('resources.title, items.id as item_id')
# Adds the where criteria
item = item.where(
s[:title].matches("On Loan").or(s[:title].matches("Permanent Loan"))
)
# Add the group by clause
item = item.group("items.id")
##########################
# Build item history query
##########################
# Build the joins
item_audit = Item.joins(
i.join(ia).on(i[:id].eq(ia[:item_id]))
.join(r, Arel::Nodes::OuterJoin).on(i[:resource_id].eq(r[:id]))
.join(s, Arel::Nodes::OuterJoin).on(i[:status_id].eq(s[:id]))
.join_sql
)
# Build the select columns
item_audit = item_audit.select('resources.title, item_audits.item_id as item_id')
#######################
# Union the two queries
#######################
report = item.union(:all, item_audit).select('title, item_id')
I can't progress past this to get the counts which should be something like
report = item.union(:all, item_audit).select('title, item_id, count(item_id)')
EDIT
The SQL I am trying to end up with is as below (a simple group and count on the union results).
select qry.title, qry.item_id, count(qry.item_id) from(
SELECT DISTINCT
resources.title, items.id as item_id
FROM
`items`
LEFT OUTER JOIN
`resources` ON `items`.`resource_id` = `resources`.`id`
LEFT OUTER JOIN
`statuses` ON `items`.`status_id` = `statuses`.`id`
WHERE
((`statuses`.`title` LIKE 'On Loan'
OR `statuses`.`title` LIKE 'Permanent Loan'))
GROUP BY items.id
UNION ALL SELECT
resources.title, item_audits.item_id as item_id
FROM
`items`
INNER JOIN
`item_audits` ON `items`.`id` = `item_audits`.`item_id`
LEFT OUTER JOIN
`resources` ON `items`.`resource_id` = `resources`.`id`
LEFT OUTER JOIN
`statuses` ON `items`.`status_id` = `statuses`.`id`) as qry
group by qry.item_id
Can anyone give me any pointers?
Many thanks,
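One way to get the SQL shown in the EDIT is to build the outer query as a string around the two generated statements. The following is only a sketch under assumptions: count_over_union_sql is a hypothetical helper name, loan_count is an illustrative column alias, and title is added to the GROUP BY so that stricter databases accept the query.

```ruby
# Hypothetical helper (not part of Rails or Arel): wraps two SELECT statements
# in a UNION ALL subquery and counts the rows per item_id in the outer query.
def count_over_union_sql(item_sql, item_audit_sql)
  <<~SQL
    SELECT qry.title, qry.item_id, COUNT(qry.item_id) AS loan_count
    FROM (
      #{item_sql}
      UNION ALL
      #{item_audit_sql}
    ) AS qry
    GROUP BY qry.title, qry.item_id
  SQL
end

# Inside a Rails app the two relations from the question could be passed in
# via to_sql, for example (assumption, not run here):
#   Item.find_by_sql(count_over_union_sql(item.to_sql, item_audit.to_sql))
puts count_over_union_sql(
  "SELECT resources.title, items.id AS item_id FROM items",
  "SELECT resources.title, item_audits.item_id AS item_id FROM item_audits"
)
```

Building the string by hand sidesteps the question of how to select from a union node, at the cost of leaving the ActiveRecord query interface for the outer query.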

Related

Rails find Parent that has two Children with certain attribute

I have over 100,000 Product records in my database, and each Product has 4-6 Variants. Because of this, iterating through everything to edit large amounts of data is not practical, so I am trying to fetch only the Products I actually need.
So far, I can get all the Products that have a Variant with the size attribute 'SM'.
The hang-up is getting all the Products that have both a Variant with size 'MD' and a Variant with size 'SM'.
This is the code I am using: Product.joins(:variants).where('variants.size = ?', 'SM')
I have tried adding .where('variants.size = ?', 'MD') to it, but that does not work.
How about this
Product.where(
id: Variant.select(:product_id)
.where(size: 'SM')
).where(id: Variant.select(:product_id)
.where(size: 'MD')
)
This should generate something akin to
SELECT products.*
FROM products
WHERE products.id IN (SELECT
variants.product_id
FROM variants
WHERE size = 'SM')
AND products.id IN (SELECT
variants.product_id
FROM variants
WHERE size = 'MD')
so the product id must be in both lists to be selected.
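The "must be in both lists" idea can be checked on plain Ruby data. This is a sketch only: the rows and the product_ids_with_size helper are made up for illustration and stand in for the variants table and the two subqueries.

```ruby
# Hypothetical in-memory rows standing in for the variants table.
variants = [
  { product_id: 1, size: 'SM' },
  { product_id: 1, size: 'MD' },
  { product_id: 2, size: 'SM' },
  { product_id: 3, size: 'MD' },
  { product_id: 3, size: 'LG' }
]

# Mirrors one "SELECT product_id FROM variants WHERE size = ?" subquery:
# returns the ids of products that have at least one variant of that size.
def product_ids_with_size(variants, size)
  variants.select { |v| v[:size] == size }.map { |v| v[:product_id] }.uniq
end

sm_ids = product_ids_with_size(variants, 'SM')
md_ids = product_ids_with_size(variants, 'MD')

# A product qualifies only when its id appears in both lists,
# which is what the two IN (...) conditions joined by AND express.
both = sm_ids & md_ids
puts both.inspect
```

Only product 1 has both sizes in this sample data, so it is the only id left after the intersection.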
Additionally, this should also work (not 100% certain):
Product.where(id: Product.joins(:variants)
.where(variants: {size: ['SM', 'MD']})
.group(:id)
.having('COUNT(*) = 2').select(:id))
Which should generate something like
SELECT products.*
FROM products
WHERE
products.id IN ( SELECT products.id
FROM products
INNER JOIN variants
ON variants.product_id = products.id
WHERE
variants.size IN ('SM','MD')
GROUP BY
products.id
HAVING
Count(*) = 2 )
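One caveat with the COUNT(*) = 2 approach: it counts rows, not distinct sizes, so it relies on a product never having two variants of the same size. The plain Ruby sketch below illustrates the difference; the data and both helper names are made up for illustration. In SQL the stricter check would correspond to HAVING COUNT(DISTINCT variants.size) = 2.

```ruby
# Hypothetical rows; product 2 has the same size twice.
variants = [
  { product_id: 1, size: 'SM' },
  { product_id: 1, size: 'MD' },
  { product_id: 2, size: 'SM' },
  { product_id: 2, size: 'SM' },
  { product_id: 3, size: 'MD' }
]
wanted = %w[SM MD]

# Equivalent of HAVING COUNT(*) = 2: counts matching rows per product.
def ids_by_row_count(variants, wanted)
  variants.select { |v| wanted.include?(v[:size]) }
          .group_by { |v| v[:product_id] }
          .select { |_id, rows| rows.size == wanted.size }
          .keys
end

# Equivalent of HAVING COUNT(DISTINCT size) = 2: counts distinct sizes per product.
def ids_by_distinct_sizes(variants, wanted)
  variants.select { |v| wanted.include?(v[:size]) }
          .group_by { |v| v[:product_id] }
          .select { |_id, rows| rows.map { |v| v[:size] }.uniq.size == wanted.size }
          .keys
end

puts ids_by_row_count(variants, wanted).inspect
puts ids_by_distinct_sizes(variants, wanted).inspect
```

With this sample data the row count wrongly lets product 2 through, while the distinct-size count keeps only product 1.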
One more option
p_table = Product.arel_table
# Alias the variants table once per size so it can be joined twice
sm = Variant.arel_table.alias('sm_variants')
md = Variant.arel_table.alias('md_variants')
sm_join = p_table.join(sm)
.on(sm[:product_id].eq(p_table[:id])
.and(sm[:size].eq('SM'))
)
md_join = p_table.join(md)
.on(md[:product_id].eq(p_table[:id])
.and(md[:size].eq('MD'))
)
Product.joins(sm_join.join_sources).joins(md_join.join_sources)
SQL
SELECT products.*
FROM products
INNER JOIN variants sm_variants ON sm_variants.product_id = products.id
AND sm_variants.size = 'SM'
INNER JOIN variants md_variants ON md_variants.product_id = products.id
AND md_variants.size = 'MD'
These 2 joins should enforce both the small and the medium size because each one is an INNER JOIN. Each join needs its own alias, since the same table cannot be joined twice under the same name.
IMHO you need to use a bit more SQL instead of Rails magic to build database queries like that.
Product
.joins('INNER JOIN variants as sm_vs ON sm_vs.product_id = products.id')
.joins('INNER JOIN variants as md_vs ON md_vs.product_id = products.id')
.where(sm_vs: { size: 'SM' })
.where(md_vs: { size: 'MD' })
Or simplified - as @engineersmnky suggested:
Product
.joins("INNER JOIN variants as sm_vs ON sm_vs.product_id = products.id AND sm_vs.size = 'SM'")
.joins("INNER JOIN variants as md_vs ON md_vs.product_id = products.id AND md_vs.size = 'MD'")
Both queries do basically the same. Just choose the version you like better.
Product.joins(:variants).where('variants.size = ? OR variants.size = ?', 'SM','MD')

Rails Postgres query to exclude any results that contain one of three records on join

This is a hard problem to describe, but I have a Rails query where I join another table, and I want to exclude any results where the joined table contains one of three conditions.
I have a Device model that relates to a CarUserRole model/record. Each CarUserRole record contains one of three values for :role - "owner", "monitor", "driver". I want to return any results where there is no related CarUserRole record with role: "owner". How would I do that?
This was my first attempt -
Device.joins(:car_user_roles).where('car_user_roles.role = ? OR car_user_roles.role = ? AND car_user_roles.role != ?', 'monitor', 'driver', 'owner')
Here is the sql -
"SELECT \"cars\".* FROM \"cars\" INNER JOIN \"car_user_roles\" ON \"car_user_roles\".\"car_id\" = \"cars\".\"id\" WHERE (car_user_roles.role = 'monitor' OR car_user_roles.role = 'driver' AND car_user_roles.role != 'owner')"
Update
I should mention that a device sometimes has multiple CarUserRole records. A device can have an "owner" and a "driver" CarUserRole. I should also note that they can only have one owner.
Answer
I ended up going with @Reub's solution via our chat -
where(CarUserRole.where("car_user_roles.car_id = cars.id").where(role: 'owner').exists.not)
Since the car_user_roles table can have multiple records with the same car_id, an inner join can result in the join table having multiple rows for each row in the cars table. So, for a car that has 3 records in the car_user_roles table (monitor, owner and driver), there will be 3 records in the join table (each record having a different role). Your query will filter out the row where the role is owner, but it will match the other two, resulting in that car being returned as a result of your query even though it has a record with role as 'owner'.
Let's first try to form an SQL query for the result that you want. We can then convert this into a Rails query.
SELECT * FROM cars WHERE NOT EXISTS (SELECT id FROM car_user_roles WHERE role='owner' AND car_id = cars.id);
The above is sufficient if you want devices which do not have any car_user_role with role as 'owner'. But this can also give you devices which have no corresponding record in car_user_roles. If you want to ensure that the device has at least one record in car_user_roles, you can add the following to the above query.
AND EXISTS (SELECT id FROM car_user_roles WHERE role IN ('monitor', 'driver') AND car_id = cars.id);
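Before converting, the difference between the row-level filter and the two EXISTS checks can be reproduced on in-memory data. This is a plain Ruby sketch with made-up rows, not the Rails query itself: car 1 has all three roles, car 2 has only a driver, and car 3 has no roles at all.

```ruby
# Hypothetical rows standing in for car_user_roles and cars.
roles = [
  { car_id: 1, role: 'monitor' },
  { car_id: 1, role: 'owner' },
  { car_id: 1, role: 'driver' },
  { car_id: 2, role: 'driver' }
]
car_ids = [1, 2, 3]

# Inner join plus row-level filter: the owner row is dropped, but the
# monitor and driver rows of car 1 still match, so car 1 is returned.
naive = roles.reject { |r| r[:role] == 'owner' }.map { |r| r[:car_id] }.uniq

# NOT EXISTS: keep a car only if none of its rows has role 'owner'.
without_owner = car_ids.select do |id|
  roles.none? { |r| r[:car_id] == id && r[:role] == 'owner' }
end

# NOT EXISTS combined with EXISTS: additionally require a monitor or driver row.
with_other_role = without_owner.select do |id|
  roles.any? { |r| r[:car_id] == id && %w[monitor driver].include?(r[:role]) }
end

puts naive.inspect
puts without_owner.inspect
puts with_other_role.inspect
```

The naive filter still returns car 1, the NOT EXISTS check alone also returns the role-less car 3, and the combined check leaves only car 2.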
Now, we need to convert this into a Rails query.
Device.where(
CarUserRole.where("car_user_roles.car_id = cars.id").where(role: 'owner').exists.not
).where(
CarUserRole.where("car_user_roles.car_id = cars.id").where(role: ['monitor', 'driver']).exists
).all
You could also try the following if your Rails version supports where.not (Rails 4 and later):
Device.joins(:car_user_roles).where(car_user_roles: { role: ['monitor', 'driver'] }).where.not(id: CarUserRole.where(role: 'owner').select(:car_id)).distinct
Select the distinct cars
SELECT DISTINCT (cars.*) FROM cars
Use a LEFT JOIN to pull in the car_user_roles
LEFT JOIN car_user_roles ON cars.id = car_user_roles.car_id
Select only the cars that DO NOT contain an 'owner' car_user_role
WHERE NOT EXISTS(SELECT NULL FROM car_user_roles WHERE cars.id = car_user_roles.car_id AND car_user_roles.role = 'owner')
Select only the cars that DO contain either a 'driver' or 'monitor' car_user_role
AND (car_user_roles.role IN ('driver','monitor'))
Put it all together:
SELECT DISTINCT (cars.*) FROM cars LEFT JOIN car_user_roles ON cars.id = car_user_roles.car_id WHERE NOT EXISTS(SELECT NULL FROM car_user_roles WHERE cars.id = car_user_roles.car_id AND car_user_roles.role = 'owner') AND (car_user_roles.role IN ('driver','monitor'));
Edit:
Execute the query directly from Rails and return only the found object IDs
ActiveRecord::Base.connection.execute(sql).collect { |x| x['id'] }

RIGHT OUTER JOIN returns empty results with WHERE

I need to produce a report of all records (businesses) created by a particular user each month over the last few months. I produced the following query and expected it to give me a row for each month. However, this user didn't create any records (businesses) in these months, so I get an empty result [].
I'm still expecting to receive a row for each month, since I'm selecting a generate_series column using RIGHT OUTER JOIN but it doesn't happen.
start = 3.months.ago
stop = Time.now
new_businesses = Business.select(
"generate_series, count(id) as new").
joins("RIGHT OUTER JOIN ( SELECT
generate_series(#{start.month}, #{stop.month})) series
ON generate_series = date_part('month', created_at)
").
where(created_at: start.beginning_of_month .. stop.end_of_month).
where(author_id: creator.id).
group("generate_series").
order('generate_series ASC')
How can I change my query to get a row for each month instead of an empty result? I'm using PostgreSQL.
UPDATE
This code works:
new_businesses = Business.select(
"generate_series as month, count(id) as new").
joins("RIGHT OUTER JOIN ( SELECT
generate_series(#{start.month}, #{stop.month})) series
ON (generate_series = date_part('month', created_at)
AND author_id = #{creator.id}
AND created_at BETWEEN '#{start.beginning_of_month.to_formatted_s(:db)}' AND
'#{stop.end_of_month.to_formatted_s(:db)}'
)
").
group("generate_series").
order('generate_series ASC')
Your problem is in the where part, which breaks any outer join. Consider the example:
select *
from a right outer join b on (a.id = b.id)
It will return all rows from b and the linked values from a, but:
select *
from a right outer join b on (a.id = b.id)
where a.some_field = 1
will drop all rows where a is not present.
The right way to do such things is to place the filter into the join condition:
select *
from a right outer join b on (a.id = b.id and a.some_field = 1)
or use subquery:
select *
from (select * from a where a.some_field = 1) as a right outer join b on (a.id = b.id)
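The effect of moving the filter from WHERE into the join condition can be simulated in plain Ruby. This is a sketch under assumptions: right_outer_join is a hypothetical helper written for this illustration, and the rows of a and b are made up.

```ruby
# Hypothetical rows for tables a and b.
a = [{ id: 1, some_field: 1 }, { id: 2, some_field: 0 }]
b = [{ id: 1 }, { id: 2 }, { id: 3 }]

# Minimal RIGHT OUTER JOIN: every row of b is kept. The block plays the role
# of the ON condition; when no row of a matches, a is nil (SQL NULL).
def right_outer_join(a, b)
  b.flat_map do |b_row|
    matches = a.select { |a_row| yield(a_row, b_row) }
    matches = [nil] if matches.empty?
    matches.map { |a_row| { a: a_row, b: b_row } }
  end
end

# Filter in WHERE: applied after the join, so rows with a missing a are dropped.
where_filtered = right_outer_join(a, b) { |x, y| x[:id] == y[:id] }
  .select { |row| row[:a] && row[:a][:some_field] == 1 }

# Filter in ON: applied while matching, so every row of b survives.
on_filtered = right_outer_join(a, b) do |x, y|
  x[:id] == y[:id] && x[:some_field] == 1
end

puts where_filtered.size
puts on_filtered.size
```

The WHERE variant keeps only the single row where a matched and passed the filter, while the ON variant keeps all three rows of b, two of them with a empty.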

How to write the below SQL query in rails 3 ActiveRecord?

select * from
(
SELECT DISTINCT ON (table1.id) table1.*, table3.date_filed as date_filed
FROM
table1 LEFT JOIN table2 ON table2.id = table1.some_id
INNER JOIN table3 ON table2.id = table3.some_id
WHERE
(
status IN('Supervisor Accepted')
)
AND(table3.is_main)
)first_result
ORDER BY date_filed ASC LIMIT 25 OFFSET 0
Is there any way to run the main query and the subquery on the database side through ActiveRecord (Rails 3)? I don't want to run first_result as one database query and then apply the ORDER BY on top of its result in a second database query.
I tried the below:
# First query run
first_result = Table1.select('DISTINCT ON (table1.id) table1.*, table3.date_filed').
joins('LEFT JOIN table2 ON table2.id = table1.some_id'). # I don't want a association here
joins('INNER JOIN table3 ON table2.id = table3.some_id').
where('table3.is_main')
# Second query run, WHICH is UGLY and not working properly
Table1.where(id: first_result.collect(&:id)).
order('date_filed ASC').
page(page).
per_page(per_page)

Nested query in squeel

Short version: How do I write this query in squeel?
SELECT OneTable.*, my_count
FROM OneTable JOIN (
SELECT DISTINCT one_id, count(*) AS my_count
FROM AnotherTable
GROUP BY one_id
) counts
ON OneTable.id=counts.one_id
Long version: rocket_tag is a gem that adds simple tagging to models. It adds a method tagged_with. Supposing my model is User, with an id and name, I could invoke User.tagged_with ['admin','sales']. Internally it uses this squeel code:
select{count(~id).as(tags_count)}
.select("#{self.table_name}.*").
joins{tags}.
where{tags.name.in(my{tags_list})}.
group{~id}
Which generates this query:
SELECT count(users.id) AS tags_count, users.*
FROM users INNER JOIN taggings
ON taggings.taggable_id = users.id
AND taggings.taggable_type = 'User'
INNER JOIN tags
ON tags.id = taggings.tag_id
WHERE tags.name IN ('admin','sales')
GROUP BY users.id
Some RDBMSs are happy with this, but postgres complains:
ERROR: column "users.name" must appear in the GROUP BY
clause or be used in an aggregate function
I believe a more agreeable way to write the query would be:
SELECT users.*, tags_count FROM users INNER JOIN (
SELECT DISTINCT taggable_id, count(*) AS tags_count
FROM taggings INNER JOIN tags
ON tags.id = taggings.tag_id
WHERE tags.name IN ('admin','sales')
GROUP BY taggable_id
) tag_counts
ON users.id = tag_counts.taggable_id
Is there any way to express this using squeel?
I wouldn't know about Squeel, but the error you see could be fixed by upgrading PostgreSQL.
Some RDBMSs are happy with this, but postgres complains:
ERROR: column "users.name" must appear in the GROUP BY clause or be
used in an aggregate function
Starting with PostgreSQL 9.1, once you list a primary key in the GROUP BY you can skip additional columns for this table and still use them in the SELECT list. The release notes for version 9.1 tell us:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause
BTW, your alternative query can be simplified, an additional DISTINCT would be redundant.
SELECT o.*, c.my_count
FROM onetable o
JOIN (
SELECT one_id, count(*) AS my_count
FROM anothertable
GROUP BY one_id
) c ON o.id = c.one_id
