Order by foreign key in activerecord: without a join? - ruby-on-rails

I want to expand on this question:
order by foreign key in activerecord
I'm trying to order a set of records based on a value in a really large table.
When I use a join, it brings all the "other" record's data into the objects, as a join should.
#table users 30+ columns
#table bids 5 columns
record = Bid.find(:all,:joins=>:users, :order=>'users.ranking DESC' ).first
Now record holds 35 fields.
Is there a way to do this without the join?
Here's my thinking.
With the join I get this query:
SELECT * FROM "bids"
left join users on runner_id = users.id
ORDER BY ranking LIMIT 1
Now I can add a select to the code so I don't get the full user table, but putting a select in a scope is dangerous IMHO.
When I write the SQL by hand:
SELECT * FROM bids
order by (select users.ranking from users where users.id = runner_id) DESC
limit 1
I believe this is a faster query, based on the "explain" it seems simpler.
More important than speed though is that the second method doesn't have the 30 extra fields.
If I build a custom select into the scope, it could break other searches on the object if they also have custom selects (there can be only one).

What you want to achieve in Active Record is something along the lines of:
SELECT b.* FROM bids b INNER JOIN users u ON u.id = b.runner_id ORDER BY u.ranking DESC
In Active Record I would write it as:
Bid.joins("INNER JOIN users u ON bids.runner_id = u.id").order("u.ranking DESC")
I think it's the only way to do a join without fetching all the attributes from the user model.

Related

Optimizing SQL query using JOIN instead of NOT IN

I have a sql query that I'd like to optimize. I'm not the designer of the database, so I have no way of altering structure, indexes or stored procedures.
I have a table that consists of invoices (called faktura) and each invoice has a unique invoice id. If we have to cancel the invoice a secondary invoice is created in the same table but with a field ("modpartfakturaid") referring to the original invoice id.
Example of faktura table:
invoice 1: Id=152549, modpartfakturaid=null
invoice 2: Id=152592, modpartfakturaid=152549
We also have a table called "BHLFORLINIE" which consists of services rendered to the customer. Some of the services have already been invoiced and match a record in the invoice (FAKTURA) table.
What I'd like to do is get a list of all services that either do not have an invoice yet or whose invoice has not been cancelled.
What I'm doing now is this:
SELECT
dbo.BHLFORLINIE.LeveringsDato AS treatmentDate,
dbo.PatientView.Navn AS patientName,
dbo.PatientView.CPRNR AS patientCPR
FROM
dbo.BHLFORLINIE
INNER JOIN dbo.BHLFORLOEB
ON dbo.BHLFORLOEB.BhlForloebID = dbo.BHLFORLINIE.BhlForloebID
INNER JOIN dbo.PatientView
ON dbo.PatientView.PersonID = dbo.BHLFORLOEB.PersonID
INNER JOIN dbo.HENVISNING
ON dbo.HENVISNING.BhlForloebID = dbo.BHLFORLOEB.BhlForloebID
LEFT JOIN dbo.FAKTURA
ON dbo.BHLFORLINIE.FakturaId = FAKTURA.FakturaId
WHERE
(dbo.BHLFORLINIE.LeveringsDato >= '2017-01-01' OR dbo.BHLFORLINIE.FakturaId IS NULL) AND
dbo.BHLFORLINIE.ProduktNr IN (110,111,112,113,8050,4001,4002,4003,4004,4005,4006,4007,4008,4009,6001,6002,6003,6004,6005,6006,6007,6008,7001,7002,7003,7004,7005,7006,7007,7008) AND
((dbo.FAKTURA.FakturaType = 0 AND
dbo.FAKTURA.FakturaID NOT IN (
SELECT FAKTURA.ModpartFakturaID FROM FAKTURA WHERE FAKTURA.ModpartFakturaID IS NOT NULL
)) OR
dbo.FAKTURA.FakturaType IS NULL)
GROUP BY
dbo.PatientView.CPRNR,
dbo.PatientView.Navn,
dbo.BHLFORLINIE.LeveringsDato
Is there a smarter way of doing this? Right now the query runs three times slower because of the added "NOT IN" subquery.
Any help is much appreciated!
Peter
You can use an outer join and check for NULL values to find the non-matches:
SELECT customer.name, i.invoiceId
FROM invoices i
INNER JOIN customer ON i.customerId = customer.customerId
LEFT OUTER JOIN invoices i2 ON i.invoiceId = i2.cancelInvoiceId
WHERE i2.invoiceId IS NULL
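To see why the LEFT OUTER JOIN ... IS NULL pattern returns the same rows as NOT IN, here is a minimal plain-Ruby sketch over in-memory data. The Invoice struct, the cancels_id field (standing in for modpartfakturaid) and the third invoice id are assumptions made up for illustration; this is not the real schema.

```ruby
require "set"

# Hypothetical in-memory stand-in for the invoice table.
# cancels_id plays the role of modpartfakturaid in the question.
Invoice = Struct.new(:id, :cancels_id)

invoices = [
  Invoice.new(152549, nil),     # original invoice, later cancelled
  Invoice.new(152592, 152549),  # cancels invoice 152549
  Invoice.new(152600, nil)      # invented id: an invoice nobody cancelled
]

# NOT IN style: collect every referenced id, then exclude those invoices.
cancelled_ids = invoices.map(&:cancels_id).compact.to_set
not_in = invoices.reject { |inv| cancelled_ids.include?(inv.id) }

# Anti-join style: look up the invoices that point at each invoice
# and keep only the ones with no match (the "IS NULL" rows).
cancellers_by_target = invoices.group_by(&:cancels_id)
anti_join = invoices.select { |inv| cancellers_by_target.fetch(inv.id, []).empty? }

puts not_in.map(&:id).inspect
puts anti_join.map(&:id).inspect
```

Both approaches keep the same invoices; whether the join form is actually faster depends on the database and its indexes, so it is worth comparing the execution plans.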

ActiveRecord using pluck with includes/left outer joins

When I do includes it left joins the table I want to filter on, but when I add pluck that join disappears. Is there any way to mix pluck and left join without manually typing the SQL for 'left join'?
Here's my case:
Select u.id
From users u
Left join profiles p on u.id=p.id
Left join admin_profiles a on u.id=a.uid
Where 2 in (p.prop, a.prop, u.prop)
Doing this is just loading all the values:
Users.includes(:AdminProfiles, :Profiles).where(...).map{ |a| a[:id] }
But when I do pluck instead of map, it doesn't left join the profile tables.
Your problem is that you're using includes, which doesn't really do a join. Instead, it fires a second query after the first one to load the associations. In your case you want the tables to actually be joined, so replace includes(:something) with joins(:something) and everything should work fine. Keep in mind that joins generates an INNER JOIN; on Rails 5 and later, left_outer_joins(:something) generates a LEFT OUTER JOIN instead.
Replying to your comment, I'm going to quote a few parts from the Rails guide on the Active Record query interface.
From the section Solution to N + 1 queries problem
clients = Client.includes(:address).limit(10)
clients.each do |client|
puts client.address.postcode
end
The above code will execute just 2 queries, as opposed to 11 queries in the previous case:
SELECT * FROM clients LIMIT 10
SELECT addresses.* FROM addresses WHERE (addresses.client_id IN (1,2,3,4,5,6,7,8,9,10))
As you can see, two queries, no joins at all.
From the section Specifying Conditions on Eager Loaded Associations:
Even though Active Record lets you specify conditions on the eager loaded associations just like joins, the recommended way is to use joins instead.
Then an example:
Article.includes(:comments).where(comments: { visible: true })
This would generate a query which contains a LEFT OUTER JOIN whereas the joins method would generate one using the INNER JOIN function instead.
SELECT "articles"."id" AS t0_r0, ... "comments"."updated_at" AS t1_r5 FROM "articles" LEFT OUTER JOIN "comments" ON "comments"."article_id" = "articles"."id" WHERE (comments.visible = 1)
If there was no where condition, this would generate the normal set of two queries.

Find the number of users with one of multiple associations

I'm trying to find the best way to count the number of Users who have one (or many) instances of a has_many relation.
For example, User has_many :bank_accounts and :credit_accounts (and a few other relations). I want to find the number of unique Users who have at least one bank_account and at least one credit_account, and ideally implement this inside of a scope so I can run where queries on it.
At the moment I'm implementing it (poorly) using the following code:
(BankAccount.select(:user_id).uniq + CreditAccount.select(:user_id) + ...).uniq.count
I've played around a lot with joins, for example different forms of User.joins(:bank_accounts, :credit_accounts).uniq('users.id').count, however I don't appear to be getting any results.
Any help would be greatly appreciated, thanks!
If you are fine with using plain SQL, you can use the query below:
select distinct(user_id) from
(select user_id from bank_accounts union select user_id from credit_accounts) a;
I am not sure if a rails way exists for this.
In this case all we need is an INNER JOIN of users with credit_accounts and bank_accounts.
User.joins(:credit_accounts, :bank_accounts).uniq.count
The above query works for me. The sql generated by this query is below
"SELECT DISTINCT COUNT(DISTINCT `users`.`id`) FROM `users` INNER JOIN `credit_accounts` ON `credit_accounts`.`user_id` = `users`.`id` INNER JOIN `bank_accounts` ON `bank_accounts`.`user_id` = `users`.`id`"
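The two answers count different things: the UNION query returns users who have either kind of account, while the double INNER JOIN returns users who have both. A small plain-Ruby sketch makes the difference visible. The user_id lists are invented sample data, not taken from the question.

```ruby
# Invented sample data: the user_id column of each accounts table.
bank_account_user_ids   = [1, 2, 2, 3]
credit_account_user_ids = [2, 3, 3, 4]

# UNION of the two user_id columns: users with at least one account of EITHER kind.
either = (bank_account_user_ids | credit_account_user_ids).sort

# INNER JOIN on both tables plus DISTINCT: users with at least one account of BOTH kinds.
both = (bank_account_user_ids & credit_account_user_ids).sort

puts "either: #{either.inspect} (count #{either.size})"
puts "both:   #{both.inspect} (count #{both.size})"
```

Since the question asks for users with at least one bank_account and at least one credit_account, the intersection (the INNER JOIN form) is the one that matches.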

Selecting distinct through join

We have 2 tables: users and statuses
The status table has a user_id, status and occured_on. The status is either 'removed' or 'added' and occured_on is the date the user was removed or added.
I need the current added users. That is, all the (distinct) users whose newest status record is 'added'.
I'm using Rails, and have tried:
User
.joins(:statuses)
.where('statuses.status = ?', 'added')
.order('statuses.occured_on DESC')
.uniq
Which translates to the SQL:
SELECT DISTINCT users.*
FROM users
INNER JOIN statuses
ON statuses.user_id = users.id
WHERE statuses.status = 'added'
ORDER BY statuses.occured_on DESC
That gives me the error:
PG::Error: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...statuses.status = 'added') ORDER BY statuses.oc...
I'd be happy knowing either the Rails code that would work or the straight SQL.
Also, I'd prefer no sub-selects if possible.
Consider the following database schema change:
StatusTable:
StatusId
Status
UserId
ActiveFrom
ActiveTo
Afterwards you can add additional checks such as:
CONSTRAINT chk_from_to CHECK (ActiveFrom <= ActiveTo)
Then your query would look something like:
SELECT users.*
FROM users
JOIN statuses ON UserId = users.user_id AND ActiveFrom < CURRENT_TIMESTAMP AND ActiveTo > CURRENT_TIMESTAMP
WHERE statuses.Status = 'active'
With such a structure you might need to change the way you update statuses, but in my experience this structure is much more flexible and easier to query.
SELECT * FROM users INNER JOIN statuses ON users.id=statuses.user_id WHERE statuses.status='added' ORDER BY statuses.occured_on
After clarification, I don't think the schema is well designed for your goal. Can you clarify why you want the status change history contained in that table? My general approach to this would be that active users should be contained in a table called projects_users, containing project_id, user_id. When they are "removed" they should be removed from that table. Logs of the actions (adding and removing users from projects) should be stored in a separate table.
There's no good way that I'm aware of to write this query given your current design. Even with the errors fixed, as in this version, which runs error-free in MySQL and is essentially what you have,
SELECT DISTINCT `users`.* FROM `users`
INNER JOIN `projects_users`
ON `users`.`id`=`projects_users`.`user_id`
WHERE `status`='added'
ORDER BY `projects_users`.`occured_on` DESC
it still won't get you the correct results. The ORDER BY clause will just get you the most recent change to "added"; it won't guarantee there is not a more recent "removed" action. To do that you'd need to compare the date of the most recent added record to the date of the most recent removed record for each user, which is a nightmare.
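The logic being described, taking each user's newest status row and keeping the user only if that row is 'added', can be sketched in plain Ruby. The Status struct and the sample rows (including the dates) are invented for illustration; in the database the same idea would need a per-user "latest row" lookup.

```ruby
require "date"

# Hypothetical in-memory stand-in for the statuses table.
Status = Struct.new(:user_id, :status, :occured_on)

statuses = [
  Status.new(1, "added",   Date.new(2013, 1, 1)),
  Status.new(1, "removed", Date.new(2013, 2, 1)),  # user 1 was removed later
  Status.new(2, "removed", Date.new(2013, 1, 5)),
  Status.new(2, "added",   Date.new(2013, 3, 1)),  # user 2 was re-added later
  Status.new(3, "added",   Date.new(2013, 1, 9))
]

# Naive filter (what WHERE status = 'added' does): any user who was ever added.
ever_added = statuses.select { |s| s.status == "added" }.map(&:user_id).uniq.sort

# Correct logic: newest row per user, then keep users whose newest row is 'added'.
latest_by_user = statuses.group_by(&:user_id)
                         .transform_values { |rows| rows.max_by(&:occured_on) }
currently_added = latest_by_user.select { |_id, row| row.status == "added" }.keys.sort

puts "ever added:      #{ever_added.inspect}"
puts "currently added: #{currently_added.inspect}"
```

User 1 shows up in the naive result even though the most recent record for that user is 'removed', which is exactly the problem described above.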

ActiveRecord subquery in select clause

So I'm getting a bunch of Volunteers records, with some filtering and sorting, which is fine. But I'd like to also get a count of the number of Children each volunteer is helping (using volunteer_id on children table), as a sub-query in the select clause to avoid having to perform a separate query for each record. As a bonus it would be good to be able to sort by this count too!
I'd like to end up with a generated query like this and be able to access the 'kids' column:
SELECT id, name, (SELECT COUNT(*) FROM children WHERE volunteer_id = volunteers.id) AS kids FROM volunteers
Is there any way of doing this with Arel? I've had a bit of a scout around and haven't found anything yet.
Alternatively, is it possible to join to the children table and get: count(children.id) ?
Thanks for any help :)
The proper way of doing this with SQL is with a GROUP BY clause:
SELECT v.id, v.name, COUNT(c.id) AS kids
FROM volunteers v
LEFT OUTER JOIN children c ON v.id = c.volunteer_id
GROUP BY v.id, v.name
There is a method .group() in AR for using GROUP BY queries.
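One detail worth checking with a LEFT OUTER JOIN plus GROUP BY is what exactly gets counted. A volunteer with no children still produces one NULL-padded row, so COUNT(*) reports 1 for them while COUNT of a column from the children table reports 0. The plain-Ruby sketch below imitates both counts; the structs and sample rows are invented for illustration.

```ruby
# Hypothetical in-memory stand-ins for the two tables.
Volunteer = Struct.new(:id, :name)
Child     = Struct.new(:id, :volunteer_id)

volunteers = [Volunteer.new(1, "Ann"), Volunteer.new(2, "Bob")]
children   = [Child.new(10, 1), Child.new(11, 1)]  # Bob has no children

# LEFT OUTER JOIN: one row per matching child, or a single row padded
# with nil when the volunteer has no children at all.
joined = volunteers.flat_map do |v|
  matches = children.select { |c| c.volunteer_id == v.id }
  matches.empty? ? [[v, nil]] : matches.map { |c| [v, c] }
end

# GROUP BY volunteer id.
grouped = joined.group_by { |v, _c| v.id }

count_star  = grouped.transform_values(&:size)                           # like COUNT(*)
count_child = grouped.transform_values { |rows| rows.count { |_v, c| c } } # like COUNT(c.id)

puts "COUNT(*):    #{count_star.inspect}"
puts "COUNT(c.id): #{count_child.inspect}"
```

Counting the joined table's column is the variant that gives 0 for volunteers without children, which is usually what a "kids" column is meant to show.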
