Faster save method? - ruby-on-rails

I need to convert about 1,300,000 records in my database.
Do you know a faster method than this?
Article.find_each(&:save)

If you're looking to update a single field in a table, you can use update_all on your ActiveRecord model.
Post.update_all(:published=>true)
# UPDATE "posts" SET "published" = 't'
This works with ActiveRecord scopes as well.
Post.where(:published=>true).update_all(:published=>false)
# SQL (3.3ms) UPDATE "posts" SET "published" = 'f' WHERE "posts"."published" = 't'
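With this approach you can use conditions (such as where) to select the rows that need the same change and run update_all on them in a single UPDATE statement. Note that update_all writes directly to the database and skips validations and callbacks, so it only fits if the conversion can be expressed as a plain attribute update rather than logic that runs when each record is saved.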

You can increase the batch size (the default is 1000). The right number depends on how much memory your server has:
Article.find_each(:batch_size => 5000) { |r| r.save }

If you are creating, you need to bulk insert with a gem like activerecord-import. If you are updating, just use update_all.

Related

Rails update_all from associated_object

I have a Glass object and a Prescription object, but I forgot to add timestamps to the Glass object, so I created a migration to do that. However, not surprisingly, all the objects now have today's date and time.
Glass belongs_to :prescription
Prescription has_one :glass
However, I can get the correct timestamp from the Prescription object. I just don't know how to do that. So I want to do something like
Glass.update_all(:created_at => self.prescription.created_at)
Any ideas?
The easiest thing to do is simply to run multiple SQL queries; it's a one-off migration, so no biggie I think. ActiveRecord's update_all is meant to update the matching records with the same value, so that won't work here.
Glass.find_each do |glass|
  glass.update!(created_at: glass.prescription.created_at)
end
If you want a single query (an update based on a join, called "UPDATE FROM" in SQL terms), it does not seem straightforward in ActiveRecord (it should work on MySQL but not on Postgres; see https://github.com/rails/rails/issues/13496). It will be easier to write raw SQL; this can help you get started: https://www.dofactory.com/sql/update-join
You can use the touch method:
Prescription.find_each do |prescription|
  prescription.glass.touch(:created_at, time: prescription.created_at)
end
Believe me when I say that I'm on team "idiomatic Rails" and it's true that iterating through each record and updating it is probably more idiomatic, but UPDATE FROM.. is so incredibly more performant and efficient (resources-wise) that unless the migration is iterating through < 1000 records, I prefer to do the in-SQL UPDATE FROM.
The particular syntax for doing an update from a join will vary depending on which SQL implementation you're running (Postgres, MySQL, etc.), but in general just execute it from a Rails DB connection.
# MySQL multi-table UPDATE syntax
InboundMessage.connection.execute <<-SQL
  UPDATE
    inbound_messages
    INNER JOIN notifications
      ON inbound_messages.message_detail_type = 'Notification'
      AND inbound_messages.message_detail_id = notifications.id
  SET
    inbound_messages.message_detail_type = notifications.notifiable_type,
    inbound_messages.message_detail_id = notifications.notifiable_id
  WHERE
    notifications.type = 'foo_bar'
SQL
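Applied to the Glass/Prescription question above, the same idea might look like the sketch below. It assumes the conventional table names glasses and prescriptions and a glasses.prescription_id foreign key, none of which appear in the question, so adjust them to your schema. The constants only hold SQL text; the commented-out line shows where it would be executed.

```ruby
# Assumed schema (not shown in the question): tables `glasses` and
# `prescriptions`, with `glasses.prescription_id` as the foreign key.

# PostgreSQL: UPDATE ... FROM
POSTGRES_BACKFILL_SQL = <<~SQL
  UPDATE glasses
  SET created_at = prescriptions.created_at
  FROM prescriptions
  WHERE glasses.prescription_id = prescriptions.id
SQL

# MySQL: multi-table UPDATE with a join
MYSQL_BACKFILL_SQL = <<~SQL
  UPDATE glasses
  INNER JOIN prescriptions
    ON glasses.prescription_id = prescriptions.id
  SET glasses.created_at = prescriptions.created_at
SQL

# Inside a migration or console session, pick the statement for your database:
# Glass.connection.execute(POSTGRES_BACKFILL_SQL)
```

Either statement backfills every row in one round trip, which is the main reason to prefer it over a per-record loop on a large table.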

How to get the latest created object in ruby on rails [duplicate]

I was wondering if there is a way to find the newest record in a table in rails3?
Given a Post model, you could do @post = Post.order("created_at").last
(The reason I didn't just do a @post = Post.last is because that always defaults to sort by your primary key (usually id). Most of the time this is fine, but I'm sure there's a scenario where that could cause problems (e.g. setting custom IDs on records, database changes which affect the primary key sequencing/autonumbering, etc.). Sorting by the created_at timestamp ensures you are really getting the most recent record).
While dmarkow's answer is technically correct, you'll need to make an index on created_at or risk an increasingly slow query as your database grows.
If you know that your "id" column is an auto-increment primary key (which it likely is), then just use it since it is an index by definition.
Also, unless AREL is optimized to select only one record in a find(:last), you run the risk of making it select ALL records, then return you just the last one by using the "last()" method. More efficient is to limit the results to one:
MyModel.last(:order => "id asc", :limit => 1)
or
MyModel.first(:order => "id desc", :limit => 1)
You may run into ambiguity issues using created_at on a sufficiently high-traffic table.
For example, try:
INSERT INTO table (created_at) VALUES ( NOW() );
INSERT INTO table (created_at) VALUES ( NOW() );
These two inserts can end up with the same created_at, which may have only 1 second of resolution depending on the database and column type. A sort would then return them in no particular order.
You may be better off storing a microtime value and sorting on that.
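Another hedge against equal timestamps is to add the primary key as a tie-breaker, so the order is deterministic even when two rows share a created_at. The sketch below shows the idea on plain Ruby objects; PostRow is a hypothetical stand-in for a model row, and the ActiveRecord call in the closing comment assumes an id that increases with insertion order.

```ruby
# PostRow is a hypothetical stand-in for an ActiveRecord model row.
PostRow = Struct.new(:id, :created_at)

same_second = Time.utc(2020, 1, 1, 12, 0, 0)
rows = [
  PostRow.new(2, same_second),
  PostRow.new(1, same_second),      # inserted in the same second as id 2
  PostRow.new(3, same_second - 60)  # older row that happens to have a higher id
]

# Compare by created_at first, then by id to break ties deterministically.
newest = rows.max_by { |row| [row.created_at, row.id] }
puts newest.id

# Rough ActiveRecord equivalent (assumption: id increases with insertion order):
# Post.order(:created_at, :id).last
```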
Yes, you can use the method .last
So if your model is called Post then:
>> Post.last
=> #<Post ...>
Try, for a model named ModelName:
record = ModelName.last

Model.first does not retrieve first record from table

Model.first does not retrieve the first record from the table. Instead it retrieves a seemingly random record.
For example:
Merchant.first
Query
SELECT "merchants".* FROM "merchants" LIMIT 1
=> <Merchant id: 6, merchant_name: "Bestylish", description: "", description_html: "" >
Instead the query should be
SELECT "merchants".* FROM "merchants" ORDER BY "merchants"."id" ASC LIMIT 1;
Why does it not retrieve the first record?
Model.first will use the default sorting of your database.
For example, in PostgreSQL the default ordering is not necessarily by id.
This seems to be default behaviour with Postgres, as some active-record versions do not add a default ordering to the query for first, while adding one for last.
https://github.com/rails/rails/issues/9885
PostgreSQL does not by default apply a sort, which is generally a good thing for performance.
So in this context "first" means "the first row returned", not "the first row when ordered by some meaningless key value".
Curiously "last" does seem to order by id.
It is defined here, in Rails 4, to order by primary key if no other order conditions are specified.
In Rails 3.2.11, it is as such:
def find_first
  if loaded?
    @records.first
  else
    @first ||= limit(1).to_a[0]
  end
end
There is no order call here, so it just applies the limit and leaves the ordering up to your database.
You need to apply the ordering yourself. Try calling Merchant.order('id ASC').first
It may be possible to automate this using default scopes in your model but I'm not sure about that.
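To make the difference concrete, here is a plain-Ruby illustration of "first row returned" versus "first row by id". MerchantRow and the two extra rows are hypothetical; only the id 6 "Bestylish" row comes from the question. It models the behaviour described above and is not how ActiveRecord or PostgreSQL are implemented.

```ruby
# MerchantRow is a hypothetical stand-in for a row of the merchants table.
MerchantRow = Struct.new(:id, :merchant_name)

# Rows in whatever order the database happens to return them (not sorted by id).
returned_order = [
  MerchantRow.new(6, "Bestylish"),
  MerchantRow.new(2, "Example A"),  # made-up row for illustration
  MerchantRow.new(9, "Example B")   # made-up row for illustration
]

unordered_first = returned_order.first         # like LIMIT 1 with no ORDER BY
ordered_first   = returned_order.min_by(&:id)  # like ORDER BY id ASC LIMIT 1

puts unordered_first.id
puts ordered_first.id

# The explicit ordering in ActiveRecord terms:
# Merchant.order(:id).first
```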

Can Rails cache in this situation?

> player.records
Record Load (0.5ms) SELECT * FROM `records` WHERE (`records`.player_id = 1)
> player.records.first(:conditions => {:metric_id => "IS NOT NULL"})
Record Load (0.5ms) SELECT * FROM `records` WHERE (`records`.player_id = 1 AND (`records`.`metric_id` = 'IS NOT NULL')) LIMIT 1
Is there a way to make the second query not hit the database, but use the cache instead? It seems a bit excessive for it to be hitting the database again when the data is already in memory.
I need both results. I'm aware that Ruby can iterate through the values, but I'd prefer to do this through ActiveRecord if possible. I'm coming from a Django background where filter() did this just fine.
I'm using Rails 2.3.
No, simply because the condition is different.
But try to explain the context. Why do you need to use both queries? Can't you use only the second one?
If you need both, why can't you filter the Array with Ruby code instead of making another query?
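Filtering the already-loaded records in Ruby could look like the sketch below. RecordRow is a hypothetical stand-in for the rows returned by player.records. Note that the hash condition in the question compares metric_id to the literal string 'IS NOT NULL', as the logged SQL shows; the sketch tests for a non-nil value instead.

```ruby
# RecordRow is a hypothetical stand-in for the rows returned by player.records.
RecordRow = Struct.new(:id, :player_id, :metric_id)

loaded_records = [
  RecordRow.new(1, 1, nil),
  RecordRow.new(2, 1, 7),
  RecordRow.new(3, 1, nil)
]

# First loaded record whose metric_id is present; no further query is involved.
first_with_metric = loaded_records.detect { |record| !record.metric_id.nil? }
puts first_with_metric.id

# With a loaded association this would be along the lines of:
# player.records.detect { |record| !record.metric_id.nil? }
```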
