Rails update_all from associated_object - ruby-on-rails

I have a Glass object and a Prescription object, but I forgot to add timestamps to Glass, so I created a migration to add them. Not surprisingly, every existing record now has today's date and time.
Glass belongs_to :prescription
Prescription has_one :glass
However, I can get the correct timestamp from the associated Prescription; I just don't know how to copy it over. I want to do something like
Glass.update_all(:created_at => self.prescription.created_at)
Any ideas?

The easiest approach is simply to run multiple SQL queries; it's a one-off migration, so that shouldn't matter much. ActiveRecord's update_all applies the same value to every matching record, so passing a Ruby value such as self.prescription.created_at won't work.
# includes(:prescription) avoids one extra query per glass
Glass.includes(:prescription).find_each do |glass|
  glass.update!(created_at: glass.prescription.created_at)
end
If you want a single query (an update based on a join, called "update from" in SQL terms), that doesn't seem straightforward in ActiveRecord: it should work on MySQL but not on Postgres (see https://github.com/rails/rails/issues/13496). It will be easier to write raw SQL; this can help you get started: https://www.dofactory.com/sql/update-join
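For Postgres, the raw-SQL route could look roughly like the sketch below. The table names (glasses, prescriptions) and the foreign key glasses.prescription_id are assumptions derived from Rails naming conventions and the belongs_to above, and backfill_glass_created_at is a hypothetical helper name, so adjust all of them to your schema.

```ruby
# Postgres-flavoured UPDATE ... FROM.
# ASSUMPTION: tables are named glasses/prescriptions and the foreign key is
# glasses.prescription_id (Rails conventions; not confirmed by the question).
BACKFILL_GLASS_CREATED_AT_SQL = <<~SQL
  UPDATE glasses
  SET created_at = prescriptions.created_at
  FROM prescriptions
  WHERE glasses.prescription_id = prescriptions.id
SQL

# `connection` is anything that responds to #execute, for example
# ActiveRecord::Base.connection. Returns whatever #execute returns.
def backfill_glass_created_at(connection)
  connection.execute(BACKFILL_GLASS_CREATED_AT_SQL)
end
```

Inside a migration you could call backfill_glass_created_at(ActiveRecord::Base.connection), or pass the same SQL string straight to the migration's own execute.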

You can use the touch method:
Prescription.find_each do |prescription|
  prescription.glass.touch(:created_at, time: prescription.created_at)
end

Believe me when I say that I'm on team "idiomatic Rails", and it's true that iterating through each record and updating it is probably more idiomatic. But UPDATE FROM is so much more performant and resource-efficient that, unless the migration touches fewer than 1,000 records, I prefer to do the update in SQL.
The exact syntax for an update from a join varies with the SQL implementation you're running (Postgres, MySQL, etc.), but in general you just execute it through a Rails DB connection.
# MySQL syntax (UPDATE ... INNER JOIN ... SET)
InboundMessage.connection.execute <<-SQL
  UPDATE
    inbound_messages
    INNER JOIN notifications
      ON inbound_messages.message_detail_type = 'Notification'
      AND inbound_messages.message_detail_id = notifications.id
  SET
    inbound_messages.message_detail_type = notifications.notifiable_type,
    inbound_messages.message_detail_id = notifications.notifiable_id
  WHERE
    notifications.type = 'foo_bar'
SQL

Related

Need help writing sql query in rails so I get ActiveRelation

I need an ActiveRecord relation that gives me the latest record for each region, city, and bed combination. I have the SQL query written as below, but I need to figure out whether there is a way to use a different approach so that it returns an ActiveRecord relation and not an array. Any suggestions?
Current query:
@current_ltm_market_stats = LtmStatsByBedCount.find_by_sql(" SELECT *
  FROM ltm_stats_by_bed_counts lstats
  WHERE lstats.city_id = '#{@city_id}'
  #{@region_id_condition}
  AND (lstats.beds,lstats.city_id,lstats.region_id,lstats.reporting_date)
  IN (SELECT lstats.beds,
             lstats.city_id,
             lstats.region_id,
             max(lstats.reporting_date)
      FROM ltm_stats_by_bed_counts lstats
      WHERE lstats.city_id = '#{@city_id}'
      #{@region_id_condition}
      GROUP BY city_id, region_id, beds)
  ORDER BY lstats.year DESC,lstats.month DESC")
I had tried this before which did result in a relation but it runs really slowly and the result is not exactly the same. Are there any better rails ways to do this?
@all_ltm_market_stats = LtmStatsByBedCount.where(city_id: @market.city_id, region_id: @market.region_id)
@current_ltm_market_stats = @latest_year_ltm_market_stats.where(month: @latest_year_ltm_market_stats.all_ltm_market_stats.select('Max(year)'))
The information in the question is incomplete, so I might have to update my answer when more details are added, but here is an initial draft based on what is available:
@current_ltm_market_stats = LtmStatsByBedCount.
  where(city_id: @city_id).
  where(@region_id_condition).
  where("(beds, city_id, region_id, reporting_date) IN (
    SELECT lstats.beds,
           lstats.city_id,
           lstats.region_id,
           max(lstats.reporting_date)
    FROM ltm_stats_by_bed_counts lstats
    WHERE lstats.city_id = '#{@city_id}'
    #{@region_id_condition}
    GROUP BY city_id, region_id, beds)").
  order(year: :desc, month: :desc)
Note that you might have to adjust your @region_id_condition a bit for this to work.
In theory it is equivalent to your SQL version (it should generate the same SQL, apart from the table alias) and it returns an ActiveRecord relation, which is the only requirement in the question. The SQL itself could probably be improved with more information.
Additionally, you will want carefully crafted indexes on this table if you are going to run this query frequently on larger datasets.
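One possible way to "adjust" the region condition is to stop interpolating values into the SQL string and pass them as named bind values instead, which also avoids SQL injection. The sketch below is only an illustration: the contents of @region_id_condition are not shown in the question, so it assumes the condition is an optional equality filter on region_id, and latest_stats_conditions and the inner_stats alias are hypothetical names.

```ruby
# Builds the SQL fragment and named bind values for the "latest row per
# (city, region, beds)" filter.
# ASSUMPTION: region filtering is an optional equality check on region_id;
# the real @region_id_condition is not shown in the question.
def latest_stats_conditions(city_id, region_id = nil)
  region_sql = region_id.nil? ? "" : "AND inner_stats.region_id = :region_id"
  sql = <<~SQL
    (beds, city_id, region_id, reporting_date) IN (
      SELECT inner_stats.beds, inner_stats.city_id, inner_stats.region_id,
             MAX(inner_stats.reporting_date)
      FROM ltm_stats_by_bed_counts inner_stats
      WHERE inner_stats.city_id = :city_id
      #{region_sql}
      GROUP BY inner_stats.city_id, inner_stats.region_id, inner_stats.beds
    )
  SQL
  binds = { city_id: city_id }
  binds[:region_id] = region_id unless region_id.nil?
  [sql, binds]
end
```

With ActiveRecord loaded, the pair can be splatted into where, which accepts a string with named placeholders plus a hash: LtmStatsByBedCount.where(city_id: city_id).where(*latest_stats_conditions(city_id, region_id)).order(year: :desc, month: :desc).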

Is there anyway to make a lesser impact on my database with this request?

For the analytics of my site, I'm required to extract the 4 states of my users.
@members = list.members.where(enterprise_registration_id: registration.id)
# This pulls roughly 10,0000 records.. Which is evidently a huge data pull for Rails
# Member Load (155.5ms)
@invited = @members.where("user_id is null")
# Member Load (21.6ms)
@not_started = @members.where("enterprise_members.id not in (select enterprise_member_id from quizzes where quizzes.section_id IN (?)) AND enterprise_members.user_id in (select id from users)", @sections.map(&:id) )
# Member Load (82.9ms)
@in_progress = @members.joins(:quizzes).where('quizzes.section_id IN (?) and (quizzes.completed is null or quizzes.completed = ?)', @sections.map(&:id), false).group("enterprise_members.id HAVING count(quizzes.id) > 0")
# Member Load (28.5ms)
@completes = Quiz.where(enterprise_member_id: registration.members, section_id: @sections.map(&:id)).completed
# Quiz Load (138.9ms)
The operation returns a 503 meaning my app gives up on the request. Any ideas how I can refactor this code to run faster? Maybe by better joins syntax? I'm curious how sites with larger datasets accomplish what seems like such trivial DB calls.
The answer is your indexes. Check your rails logs (or check the console in development mode) and copy the queries to your db tool. Slap an "Explain" in front of the query and it will give you a breakdown. From here you can see what indexes you need to optimize the query.
For a quick pass, you should at least have these in your schema,
enterprise_members: needs an index on enterprise_member_id
members: user_id
quizzes: section_id
As someone else posted, definitely look into adding indexes if needed. How to refactor depends partly on what exactly you are trying to do with all these records. For the @members query, what are you using the @members records for? Do you really need to retrieve every attribute of every member record? If not, I suggest fetching only the attributes you actually use; .pluck could be warranted. The 3rd and 4th queries look fishy. I assume you've run the queries in a console? Again, I'm not sure what the queries are used for, but it is often useful to write raw SQL and run it against the database first, then apply your findings when rewriting the ActiveRecord queries.
What is the .completed tagged on the end? Is it supposed to be there? The only thing close to it that I found in the Rails API is .completed? If it is a custom method, definitely look into it. You potentially also have a use case for scopes.
THIRD QUERY:
I unfortunately don't know ruby on rails, but from a postgresql perspective, changing your "not in" to a left outer join should make it a little faster:
Your code:
"enterprise_members.id not in (select enterprise_member_id from quizzes where quizzes.section_id IN (?)) AND enterprise_members.user_id in (select id from users)", @sections.map(&:id)
Better version (in SQL):
select blah
from enterprise_members em
join users u on u.id = em.user_id
left outer join quizzes q on q.enterprise_member_id = em.id
  and q.section_id in (?)
where q.enterprise_member_id is null
Based on my understanding, this will allow Postgres to sort both the enterprise_members table and the quizzes table and do a hash join. This is better than what it does now: right now it finds everything in the quizzes subquery, brings it into memory, and then tries to match it against enterprise_members.
FIRST QUERY:
You could also create a partial index on user_id for your first query. This will be especially good if there are a relatively small number of user_ids that are null in a large table. Partial index creation:
CREATE INDEX user_id_null_ix ON enterprise_members (user_id)
WHERE (user_id is null);
Anytime you query enterprise_members with something that matches the index's where clause, the partial index can be used and quickly limit the rows returned. See http://www.postgresql.org/docs/9.4/static/indexes-partial.html for more info.
Thanks everyone for your ideas. I basically did what everyone said: I added indexes and rearranged how I called everything, but the major difference was using the pluck method. Here are my new stats:
@alt_members = list.members.pluck(:id) # 23ms
if list.course.sections.tests.present? && @sections = list.course.sections.tests
  @quiz_member_ids = Quiz.where(section_id: @sections.map(&:id)).pluck(:enterprise_member_id) # 8.5ms
  # where(...).count, because count('user_id is null') would count every row
  @invited = list.members.where(user_id: nil).count # 12.5ms
  @not_started = (@alt_members - (@alt_members & @quiz_member_ids)).count # 0ms
  @in_progress = (@alt_members & @quiz_member_ids).count # 0ms
  @completes = (@alt_members & Quiz.where(section_id: @sections.map(&:id), completed: true).pluck(:enterprise_member_id)).count # 9.7ms
  @question_count = Quiz.where(section_id: @sections.map(&:id), completed: true).limit(5).map { |quiz| quiz.answers.count }.max # 3.5ms
end
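The in-memory part of the snippet above can be tried in isolation with plain Ruby. This is a minimal sketch, assuming the three arguments are arrays of IDs like the ones pluck returns; member_state_counts is a hypothetical helper name, and it swaps the Array operators for Set, which makes the intersections explicit and tolerant of duplicate IDs.

```ruby
require "set"

# Counts member states from plucked ID arrays.
# ASSUMPTION: all three arguments are plain arrays of integer IDs.
def member_state_counts(member_ids, quiz_member_ids, completed_member_ids)
  members = member_ids.to_set
  # Members that have at least one quiz; IDs outside the list are ignored.
  with_quiz = members & quiz_member_ids.to_set
  {
    not_started: (members - with_quiz).size,
    with_quiz: with_quiz.size,
    completed: (members & completed_member_ids.to_set).size
  }
end

counts = member_state_counts([1, 2, 3, 4, 5], [2, 3, 3, 9], [3])
```

As in the snippet above, the second figure counts every member with at least one quiz, completed or not, so it overlaps with the completed figure.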

Rails + ActiveRecord + optimization: Is there a better way to update on 300,000 records?

So I have a rake task that does this:
wine_club_memberships = WineClubMembership.pluck(:billing_info_id)
total_updated = BillingInfo.joins(:order).where(["orders.ordered_date < (CURRENT_DATE - 90) AND billing_infos.card_number IS NOT NULL AND billing_infos.card_number != '' AND billing_infos.id NOT IN (?)", wine_club_memberships]).update_all("card_number = ''")
log.error("Total records updated #{total_updated}")
The thing is that BillingInfo has 300,000+ records, and I'm wondering if all this joins, where, update_all is just the same as using pure SQL. Currently it's not too efficient, since I have a huge array of WineClubMembership records that I stuff in the statement.
Is there a more efficient way of doing this? Even though this is a long ugly statement, I was thinking that it would be efficient for the most part because it does everything pretty much in one or two hits to the database. However, people around me are thinking there must be other "Rails methods" that could do this in a better way that won't affect the performance of the production website.
I did see doing searches in "batches" but I am not sure if that will help.
UPDATE
I'm using Postgres 9.1+. The older (slightly simpler) version of my ActiveRecord query produced this:
Ruby code:
wine_club_memberships = WineClubMembership.pluck(:billing_info_id)
total_updated = BillingInfo.joins(:order).where(["orders.ordered_date < (CURRENT_DATE - 90) AND billing_infos.id NOT IN (?)", wine_club_memberships]).update_all("card_number = ''")
SQL generated:
SQL (127848.6ms) UPDATE "billing_infos" SET card_number = '' WHERE "billing_infos"."id" IN (SELECT "billing_infos"."id" FROM "billing_infos" INNER JOIN "orders" ON "orders"."id" = "billing_infos"."order_id" WHERE (orders.ordered_date < (CURRENT_DATE - 90) AND billing_infos.id NOT IN (423908,390663,387323,402393,383446,416114,391009,456371,384305,386681,384382,384418, ...)))
It's possible that if you let the database manage the source of the final NOT IN comparison, it can optimize the lookup; that is, let SQL produce the list of IDs instead of passing it a 300,000-item array. If your database allows it, try something like
... NOT IN (SELECT billing_info_id FROM wine_club_memberships)").update_all("card_number = ''")
As for a Rails-specific method for speeding this up, you usually won't do better (performance-wise, if not maintainability-wise) than passing a pure SQL string to the database.
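Put together as a single statement for Postgres, the idea could look like the sketch below. It is an illustration rather than a tested migration: the join condition is taken from the generated SQL in the question, clear_stale_card_numbers is a hypothetical helper name, and NOT EXISTS is swapped in for NOT IN because NOT IN matches nothing when the subquery returns a NULL billing_info_id.

```ruby
# Single-statement version of the rake task's update (Postgres syntax).
# ASSUMPTION: wine_club_memberships.billing_info_id references billing_infos.id.
CLEAR_STALE_CARD_NUMBERS_SQL = <<~SQL
  UPDATE billing_infos
  SET card_number = ''
  FROM orders
  WHERE orders.id = billing_infos.order_id
    AND orders.ordered_date < (CURRENT_DATE - 90)
    AND billing_infos.card_number IS NOT NULL
    AND billing_infos.card_number <> ''
    AND NOT EXISTS (
      SELECT 1
      FROM wine_club_memberships
      WHERE wine_club_memberships.billing_info_id = billing_infos.id
    )
SQL

# `connection` is anything that responds to #update and returns the number
# of affected rows, for example ActiveRecord::Base.connection.
def clear_stale_card_numbers(connection)
  connection.update(CLEAR_STALE_CARD_NUMBERS_SQL)
end
```

The return value can replace total_updated in the rake task, since the connection's update method reports the number of rows affected.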

rails select and include

Can anyone explain this?
Project.includes([:user, :company])
This executes 3 queries, one to fetch projects, one to fetch users for those projects and one to fetch companies.
Project.select("name").includes([:user, :company])
This executes 3 queries, and completely ignores the select bit.
Project.select("user.name").includes([:user, :company])
This executes 1 query with proper left joins. And still completely ignores the select.
It would seem to me that rails ignores select with includes. Ok fine, but why when I put a related model in select does it switch from issuing 3 queries to issuing 1 query?
Note that the 1 query is what I want; I just can't imagine this is the right way to get it, nor why it works, but I'm not sure how else to get the results in one query (.joins seems to only use INNER JOIN, which I do not in fact want, and when I manually specify the join conditions to .joins, the search gem we're using freaks out as it tries to re-add joins with the same name).
I had the same problem with select and includes.
For eager loading of associated models I used the native Rails scope 'preload': http://apidock.com/rails/ActiveRecord/QueryMethods/preload
It eager loads without dropping the 'select' from the scope chain.
I found it here https://github.com/rails/rails/pull/2303#issuecomment-3889821
Hope this tip will be helpful for someone as it was helpful for me.
All right, so here's what I came up with...
.joins("LEFT JOIN companies companies2 ON companies2.id = projects.company_id LEFT JOIN project_types project_types2 ON project_types2.id = projects.project_type_id LEFT JOIN users users2 ON users2.id = projects.user_id") \
.select("six, fields, I, want")
Works, pain in the butt but it gets me just the data I need in one query. The only lousy part is I have to give everything a model2 alias since we're using meta_search, which seems to not be able to figure out that a table is already joined when you specify your own join conditions.
Rails has always ignored the select argument(s) when using include or includes. If you want to use your select argument then use joins instead.
You might be having a problem with the search gem you mention, but you can also pass SQL fragments to the joins method.
Project.select("name").joins(['some sql fragment for users', 'left join companies c on c.id = projects.company_id'])
I don't know your schema, so I'd have to guess at the exact relationships, but this should get you started.
I might be totally missing something here but select and include are not a part of ActiveRecord. The usual way to do what you're trying to do is like this:
Project.find(:all, :select => "users.name", :include => [:user, :company], :joins => "LEFT JOIN users on projects.user_id = users.id")
Take a look at the api documentation for more examples. Occasionally I've had to go manual and use find_by_sql:
Project.find_by_sql("select users.name from projects left join users on projects.user_id = users.id")
Hopefully this will point you in the right direction.
I wanted that functionality myself, so feel free to use this.
Include this method in your class
# ACCEPTS args in string format "ASSOCIATION_NAME:COLUMN_NAME-COLUMN_NAME"
def self.includes_with_select(*m)
  association_arr = []
  m.each do |part|
    parts = part.split(':')
    association = parts[0].to_sym
    select_columns = parts[1].split('-')
    association_macro = (self.reflect_on_association(association).macro)
    association_arr << association.to_sym
    class_name = self.reflect_on_association(association).class_name
    self.send(association_macro, association, -> {select *select_columns}, class_name: "#{class_name.to_sym}")
  end
  self.includes(*association_arr)
end
And you will be able to call like: Contract.includes_with_select('user:id-name-status', 'confirmation:confirmed-id'), and it will select those specified columns.
The preload solution doesn't seem to do the same JOINs as eager_load and includes, so to get the best of all worlds I also wrote my own, and released it as a part of a data-related gem I maintain, The Brick.
By overriding ActiveRecord::Associations::JoinDependency.apply_column_aliases() like this then when you add a .select(...) then it can act as a filter to choose which column aliases get built out.
With gem 'brick' loaded, in order to enable this selective behaviour, add the special column name :_brick_eager_load as the first entry in your .select(...), which turns on the filtering of columns while the aliases are being built out. Here's an example:
Employee.includes(orders: :order_details)
.references(orders: :order_details)
.select(:_brick_eager_load,
'employees.first_name', 'orders.order_date', 'order_details.product_id')
Because foreign keys are essential to have everything be properly associated, they are automatically added, so you do not need to include them in your select list.
Hope it can save you both query time and some RAM!

How to select records where a child does not exist

In rails I have 2 tables:
bans(ban_id, admin_id)
ban_reasons(ban_reason_id, ban_id, reason_id)
I want to find all the bans for a certain admin where there is no record in the ban_reasons table. How can I do this in Rails without looping through all the ban records and filtering out all the ones with ban.ban_reasons.nil? I want to do this (hopefully) using a single SQL statement.
I just need to do: (But I want to do it the "rails" way)
SELECT bans.* FROM bans WHERE admin_id=1234 AND
ban_id NOT IN (SELECT ban_id FROM ban_reasons)
Your solution works great (only one request) but it's almost plain SQL:
bans = Ban.where("bans.id NOT IN (SELECT ban_id from ban_reasons)")
You may also try the following, and let rails do part of the job:
bans = Ban.where("bans.id NOT IN (?)", BanReason.select(:ban_id).map(&:ban_id).uniq)
ActiveRecord only gets you to a point, everything after should be done by raw SQL. The good thing about AR is that it makes it pretty easy to do that kind of stuff.
However, since Rails 3, you can do almost everything with the AREL API, although raw SQL may or may not look more readable.
I'd go with raw SQL and here is another query you could try if yours doesn't perform well:
SELECT b.*
FROM bans b
LEFT JOIN ban_reasons br on b.ban_id = br.ban_id
WHERE br.ban_reason_id IS NULL
Using the Where Exists gem (of which I'm the author):
Ban.where(admin_id: 123).where_not_exists(:ban_reasons)

Resources