Rails: Is it bad to have an irreversible migration?

When is it acceptable to raise an ActiveRecord::IrreversibleMigration exception in the self.down method of a migration? When should you take the effort to actually implement the reverse of the migration?

If you are dealing with production-grade systems then yes, it is very bad. If it is your own pet project, then anything is allowed (if nothing else, it will be a learning experience :) ). Chances are, though, that sooner rather than later, even in a pet project, you will write off a reverse migration only to have to undo that migration a few days later, be it via rake or manually.
In a production scenario, you should always make the effort to write and test a reversible migration, in case you run it in production and then discover a bug that forces you to roll back (code and schema) to some previous revision, pending some non-trivial fix, because the production system is otherwise unusable.
Reverse migrations range from mostly trivial (removing columns or tables that were added during the migration, changing column types back, etc.) to somewhat more involved (executing JOINed INSERTs or UPDATEs), but nothing is so complex as to justify "sweeping it under the rug". If nothing else, forcing yourself to think of ways to achieve reverse migrations can give you new insight into the very problem that your forward migration is fixing.
You might occasionally run into a situation where a forward migration removes a feature, resulting in data being discarded from the database. For obvious reasons, the reverse migration cannot resuscitate discarded data. One could, in such cases, recommend having the forward migration automatically save the data or keep it around in case of a rollback, as an alternative to outright failure (save to YAML, copy/move to a special table, etc.). You don't have to, though, as the time required to test such an automated procedure could exceed the time required to restore the data manually, should the need arise. But even in such cases, instead of just failing, you can always make the reverse migration fail conditionally and temporarily, pending some user action (e.g. test for the existence of some required table that has to be restored manually; if it is missing, output "I have failed because I cannot recreate table XYZ from nothingness; manually restore table XYZ from backup then run me again, and I will not fail you!").
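The "fail until the table has been restored" idea could look roughly like the sketch below. It is only an illustration: require_restored_table!, ManualRestoreRequired and FakeConnection are hypothetical names, and the error class is a stand-in so the snippet runs outside Rails. In a real migration you would more likely raise ActiveRecord::IrreversibleMigration and pass the migration's own connection, which responds to table_exists?.

```ruby
# Stand-in error so the sketch runs outside Rails; a real migration would
# probably raise ActiveRecord::IrreversibleMigration instead.
class ManualRestoreRequired < StandardError; end

# Hypothetical helper. `connection` is any object that responds to
# table_exists?(name), as ActiveRecord connection adapters do.
def require_restored_table!(connection, table_name)
  return true if connection.table_exists?(table_name)

  raise ManualRestoreRequired,
        "Cannot recreate table #{table_name} from nothing; " \
        "restore it from backup, then run this migration again."
end

# Minimal fake connection, used only to exercise the helper.
FakeConnection = Struct.new(:tables) do
  def table_exists?(name)
    tables.include?(name.to_s)
  end
end
```

Called at the top of the down method, a guard like this stops the rollback with an actionable message while the table is missing and lets it proceed once someone has restored the table by hand.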

If you are destroying data, you can make a backup of it first, e.g.:
def self.up
  # create a backup table before destroying data
  execute %Q[create table backup_users select * from users]
  remove_column :users, :timezone
end

def self.down
  add_column :users, :timezone, :string
  # copy the saved values back, then drop the backup table
  execute %Q[update users U left join backup_users B on (B.id=U.id) set U.timezone = B.timezone]
  execute %Q[drop table backup_users]
end

"In a production scenario, you should always make the effort to write and test a reversible migration in the eventuality that you go through it in production, then discover a bug which forces you to roll back (code and schema) to some previous revision (pending some non-trivial fix -- and an otherwise unusable production system.)"
Having a reversible migration is fine for development and staging, but assuming well tested code it should be extremely rare that you would ever want to migrate down in production. I build into my migrations an automatic IrreversibleMigration in production mode. If I really needed to reverse a change, I could use another "up" migration or remove the exception. That seems sketchy though. Any bug that would cause a scenario this dire is a sign that the QA process is seriously screwed up.
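An automatic "irreversible in production" guard might be sketched as follows. The answer does not show its implementation, so everything here is an assumption: block_rollback_in_production! and RollbackBlocked are hypothetical names, and the error class is a stand-in so the snippet runs without Rails. In an actual migration you would probably raise ActiveRecord::IrreversibleMigration and pass Rails.env.

```ruby
# Stand-in error so the sketch runs outside Rails; a real migration would
# probably raise ActiveRecord::IrreversibleMigration here.
class RollbackBlocked < StandardError; end

# Hypothetical guard. In a Rails app env_name would typically be Rails.env;
# here it falls back to the RAILS_ENV environment variable.
def block_rollback_in_production!(env_name = ENV.fetch("RAILS_ENV", "development"))
  return false unless env_name.to_s == "production"

  raise RollbackBlocked,
        "Refusing to run a down migration in production; " \
        "write a new up migration instead."
end
```

Calling the guard as the first line of each down method keeps rollbacks available in development and staging while making a production rollback a deliberate act (someone has to remove the guard).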

Feeling like you need an irreversible migration is probably a sign you've got bigger problems looming. Maybe some specifics would help?
As for your second question: I always take the 'effort' to write the reverse of migrations. Of course, I don't actually write the .down, TextMate inserts it automatically when creating the .up.

Reversible Data Migration makes it easy to create reversible data migrations using YAML files.
class RemoveStateFromProduct < ActiveRecord::Migration
  def self.up
    # collect the values that are about to be dropped
    backup_data = []
    Product.all.each do |product|
      backup_data << {:id => product.id, :state => product.state}
    end
    # backup and restore are provided by the plugin
    backup backup_data
    remove_column :products, :state
  end

  def self.down
    add_column :products, :state, :string
    restore Product
  end
end
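For readers who want to see the idea without the plugin, here is a rough standard-library sketch of writing rows to a YAML file and reading them back. It is not the plugin's implementation: backup_rows and load_backup are hypothetical helpers, and string keys are used so that YAML.safe_load can read the file without extra options.

```ruby
require "yaml"

# Hypothetical helper: write an array of row hashes to a YAML file and
# return the number of rows written.
def backup_rows(rows, path)
  File.write(path, YAML.dump(rows))
  rows.size
end

# Hypothetical helper: read the rows back, returning an empty array when
# the file contains no data.
def load_backup(path)
  YAML.safe_load(File.read(path)) || []
end
```

In a down method, the loaded rows could then be written back one by one, for example with an UPDATE per row keyed on id.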

IIRC, you'll have the IrreversibleMigration when changing a datatype in the migration.

I think another situation when it's ok is when you have a consolidated migration. In that case a "down" doesn't really make sense, as it would drop all the tables (except tables added after the consolidation). That's probably not what you'd want.

Related

Rails migration: only for schema change or also for updating data?

I'm a junior Rails developer and at work we faced the following problem:
We needed to update the value of a column for only one record.
What we did was create a migration like this:
class DisableAccessForUser < ActiveRecord::Migration
  def change
    User.where(name: "User").first.update_column(:access, false)
  end
end
Are migrations only for schema changes?
What other solutions do you suggest?
PS: I can only change it with code. No access to console.

The short version is, since migrations are only for schema changes, you wouldn't want to use them to change actual data in the database.
The main issue is that your data-manipulating migrations might be skipped by other developers if they load the DB structure using either rake db:schema:load or rake db:reset. Both merely load the latest version of the structure from the schema.rb file and do not touch the migrations.
As Nikita Singh also noted in the comments, I too would say the best method of changing row data is to implement a simple rake task that can be run as needed, independent of the migration structure. Or, for a first-time installation, the seeds.rb file is perfect for loading initial system data.
Hope that rambling helps.
Update
Found some documentation in some "official" sources:
Rails Guide for Migrations - Using Models in your Migrations. This section gives a description of a scenario in which data-manipulation in the migration files can cause problems for other developers.
Rails Guide for Migrations - Migrations and Seed Data. Same document as above; it doesn't really explain why it is bad to put seed or data manipulation in the migration, and merely says to put all that in the seeds.rb file.
This SO answer. This person basically says the same thing I wrote above, except they provide a quote from the book Agile Web Development with Rails (3rd edition), partially written by David Heinemeier Hansson, creator of Rails. I won't copy the quote, as you can read it in that post, but I believe it gives you a better idea of why seed or data manipulation in migrations might be considered a bad practice.

Migrations are fine for schema changes. But when you work on highly collaborative projects, pulling code every day from lots of developers, chances are you will miss some migrations (value-update migrations, that is; schema changes are no problem), because migrations depend on timestamps.
So what we do is create rake tasks in a single namespace to update table values (taking care that a task does not overwrite existing values), and invoke all the rake tasks in that namespace whenever we update the code from Git.

Making data changes using classes in migrations is dangerous because it's not terribly future proof. Changes to the class can easily break the migration in the future.
For example, let's imagine you were to add a new column to user (sample_group) and access that column in a Rails lifecycle callback that executes on object load (e.g. after_initialize). That would break this migration. If you weren't skipping callbacks and validations on save (by using update_column) there'd be even more ways to break this migration going forward.
When I want to make data changes in migrations I typically fall back to SQL. One can execute any SQL statement in a migration by using the execute() method. The exact SQL to use depends on the database in use, but you should be able to come up with a db appropriate query. For example in MySQL I believe the following should work:
execute("UPDATE users SET access = 0 WHERE name = 'User' ORDER BY id LIMIT 1")
This is far more future proof.

There is nothing wrong with using a migration to migrate the data in your database, in the right situation, if you do it right.
There are two related things you should avoid in your migrations (as many have mentioned), neither of which precludes migrating data:
1. It's not safe to use your models in your migrations. The code in the User model might change, and nobody is going to update your migration when that happens, so if some co-worker takes a vacation for 3 months, comes back, and tries to run all the migrations that happened while she was gone, but somebody renamed the User model in the meantime, your migration will be broken, and prevent her from catching up. This just means you have to use SQL, or (if you are determined to keep even your migrations implementation-agnostic) include an independent copy of an ActiveRecord model directly in your migration file (nested under the migration class).
2. It also doesn't make sense to use migrations for seed data, which is, specifically, data that is to be used to populate a new database when someone sets up the app for the first time so the app will run (or will have the data one would expect in a brand new instance of the app). You can't use migrations for this because you don't run migrations when setting up your database for the first time, you run db:schema:load. Hence the special file for maintaining seed data: seeds.rb. This just means that if you do need to add data in a migration (in order to get production and everyone's dev data up to speed), and it qualifies as seed data (necessary for the app to run), you need to add it to seeds.rb too!
Neither of these, however, means that you shouldn't use migrations to migrate the data in existing databases. That is what they are for. You should use them!

A migration is simply a structured way to make database changes, both schema and data.
In my opinion there are situations in which using migrations for data changes is legitimate.
For example:
If you are holding data which is mostly constant in your database but changes annually, it is fine to make a migration each year to update it. For example, if you list the teams in a soccer league a migration would be a good way to update the current teams in each year.
If you want to mass-alter an attribute of a large table. For example if you had a slug column in your user and the name "some user" would be translated to the slug "some_user" and now you want to change it to "some.user". This is something I'd do with a migration.
Having said that, I wouldn't use a migration to change a single user attribute. If this is something which happens occasionally you should make a dashboard which will allow you to edit this data in the future. Otherwise a rake task may be a good option.

This question is old and I think the Rails approach has changed over time. Based on https://edgeguides.rubyonrails.org/active_record_migrations.html#migrations-and-seed-data it's OK to fill new columns with data here. To be more precise, your migration code should also contain a "down" block:
class DisableAccessForUser < ActiveRecord::Migration
  def up
    User.where(name: "User").first.update_column(:access, false)
  end

  def down
    User.where(name: "User").first.update_column(:access, true)
  end
end
If you use seeds.rb to pre-fill data, don't forget to include the new column value there, too:
User.find_or_create_by(id: 0, name: 'User', access: false)

If I remember correctly, changing particular records may work, but I'm not sure about that.
In any case, it isn't good practice; migrations should be used for schema changes only.
For updating one record I would use the console. Just type 'rails console' in the terminal and enter code to change the attributes.

What is the Rails Way to perform a data update after a structural db change via db:migrate?

I'm changing a relationship in my database from has_many to a has_many :through. So right now I have:
class Brand < ActiveRecord::Base
  has_many :products
end

class Product < ActiveRecord::Base
  belongs_to :brand
end
and I'm going to add a join table.
But of course I need to update the database with data after that. I have seen that it is not good practice to do this in the confines of the migration. Where is the best place to perform this, knowing that I have to then run another migration after the data update is complete (i.e. removing the original brand_id column from the products table)?

Unless I misunderstand your question, the migration is the place to make that transformation. The purpose of the migration is to change your schema and migrate existing data to use the schema. Migrations capture the temporal aspect of layering schema changes, so that you can go forward and backward in time without leaving data in an inconsistent state. If you were to migrate your rows anywhere else, you have no guarantee that when that code runs, the schema is as it was when you wrote your migration code.
I believe that you will find support for my position in the examples on the Active Record Migrations api documentation. You might be confusing migrations with populating seed data (rake db:seed) which is handled in db/seeds.rb.

One-time changes like this can be done with Ruby code in the migration. Migrations aren't just for schema changes. The idea is that migrations, by their version/datecode, are guaranteed to be run only once.

I think you should at least include the call to the code (maybe a rake task) that runs the data manipulation within the migration, since you have to run the 2nd migration right after the data manipulation.
If it were me, I would create a rake task that manipulates the data. This will at least remove the code from the migration and allow you to run it manually if it were ever necessary. Then code your migration and include a call to that rake task. I honestly don't see what the big deal is about not using data manipulation in a migration. Especially when you have to do things in a specific order as you are doing. They are tied so closely together, so why completely separate them?

Irreversible Migrations - warning & confirm instead of abort?

I've been writing some migrations lately which fall under the Irreversible Migration umbrella. But they aren't end of the world irreversible. You could roll them back if you want. The scenario I have at the moment is changing a one to many relationship to a many to many relationship. It involves dropping a column and making a new join table. (as well as two lines in the models).
I was thinking, instead of aborting the down migration, I could say something like "This migration is [INSERT SCARY MESSAGE HERE], are you sure you want to proceed? Y/N" and then roll back the migration if they choose to? Just put the migration inside an if statement?
It's easy enough to make migrations irreversible, and usually there's good reason (e.g. data can't be recovered). Do these issues usually get resolved by just writing a migration which does it manually?
In my noobish mind it'd be nice to have a happy medium. Is it wise? Maybe I just don't understand when to make them non-reversible in the first place.

I always try to make the migration reversible if possible. The only time I think I've run into problems is when you go from a coarsely defined data model to a finer-grained one, and then back again. I don't see any reason not to use your solution though, depending, of course, on the consequences of the migration. There is also nothing stopping the person running the down migration from commenting out your raised error and writing their own code to reverse the migration, but it is far safer for you, the person writing the data model change, to know how to transform back to the previous state than for them to guess.

Just stumbled over this old post here - as I struggled somehow over the same question.
I had the other case: moving from many-to-many (HABTM) to one-to-many. Of course, I wanted to delete the join table afterwards. I was really afraid that I would forget to copy over the data from the join table during deployment. So I decided to include a "warning" migration:
class DataMigrationWarning < ActiveRecord::Migration
  def change
    puts("********************** Data Migration Warning **********************")
    puts("Don't forget to save the data.")
    puts("Next UP migration will delete table XYZ.")
    puts("Next DOWN migration will delete field A in table BCD.")
    puts("Press y to continue.")
    puts("Press anything else to stop.")
    if STDIN.gets.chomp == "y"
      puts("Ok then!")
    else
      # raising here aborts this migration and all following ones
      fail "Migration stopped by user."
    end
    # More detailed explanation...
  end
end
The command line will then show all of the messages above and wait for input from the user. Entering y moves on to the next migration; any other input stops this migration and all following ones.
In the end, the process looked like this:
1. Migration: create the new field for the new belongs_to
2. Warning migration
3. Migration: delete the old join table
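The prompt logic can also be pulled out into a small helper that takes its input and output streams as parameters, which makes it possible to exercise without a terminal. This is only a sketch: confirm_or_abort! and MigrationAborted are hypothetical names, not part of Rails.

```ruby
# Hypothetical error class used to stop the migration run.
class MigrationAborted < StandardError; end

# Hypothetical helper: print a warning, read one line, and raise unless the
# answer is "y". The streams default to the terminal but can be replaced.
def confirm_or_abort!(message, input: $stdin, output: $stdout)
  output.puts(message)
  output.puts("Type y to continue, anything else to stop.")
  # gets returns nil at end of input, so normalise it to an empty string
  answer = input.gets.to_s.strip.downcase
  raise MigrationAborted, "Migration aborted by user." unless answer == "y"

  true
end
```

Inside the warning migration, the body of change could then shrink to a single call such as confirm_or_abort!("Next UP migration will delete table XYZ."). Note that a prompt like this blocks unattended deploys, where nobody is there to answer it.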

How do I write a migration that will remove certain records from my database?

I have a Note model, with a note_type field. How do I write a migration that will remove Note records from the database if the type is "short_note"?

The code itself is simple.
Note.delete_all :note_type => 'short_note'
(If notes have destroy callbacks, you'll need to run destroy_all instead. It's slower because they're deleted one-by-one, but can sometimes produce better data integrity.)
However, I imagine you're a bit more worried about the down migration than the up migration. It is an irreversible transformation by nature. The answer to that particular bit of the question is that your migration should raise an ActiveRecord::IrreversibleMigration exception.
However, whenever you write an irreversible migration, it's important to consider why you're doing it. Depending on your situation, maybe it's more appropriate to just run that particular command in the console upon deploy to production than to make that migration part of your application's very definition.

def self.up
  execute "DELETE FROM notes WHERE note_type = 'short_note'"
end
:-P
Just kidding. I'm sure you could do this:
Note.delete_all :note_type => 'short_note'

In the self.up:
Note.delete_all("note_type = 'short_note'")
or use destroy_all, which will call each record's destroy method and callbacks (before_destroy and after_destroy):
Note.destroy_all("note_type = 'short_note'")

Rails: Best way to make changes to a production database

I need to make changes to an in-use production database. Just adding a few columns. I've made the changes to the dev database with migrations. What is the best way to update the production database while preserving the existing data and not disrupting operation too much?
It's MySQL and I will need to add data to the columns as well for already existing records. One column can have a default value (it's boolean) but the other is a timestamp and should have an arbitrary backdated value. The row counts are not huge.
So if I use migrations, how do I add data, and how do I get it to run just the two (or three, if I add data) latest migrations on the production db, when it wasn't initially built via migrations (I believe they used the schema instead)?

I always follow this procedure:
1. Dump prod database with mysqldump command
2. Populate dev / test database with dump using mysql command
3. Run migrations in dev / test
4. Check migration worked
5. Dump prod database with mysqldump command (as it may have changed) keeping backup on server
6. Run migrations on prod (using Capistrano)
7. Test migration has worked on prod
8. Drink beer (while watching error logs)

It sounds like you're in a state where the production db schema doesn't exactly match what you're using in dev (although it's not totally clear). I would draw a line in the sand, and get that prod db into a better state. Essentially what you want to do is make sure that the prod db has a "schema_info" table (called "schema_migrations" since Rails 2.1) that lists any migrations that you don't ever want to run in production. Then you can add migrations to your heart's content and they'll work against the production db.
Once you've done that you can write migrations that add schema changes or add data, but one thing you need to be really careful about is that if you add data using a migration, you must define the model within the migration itself, like this:
class AddSomeColumnsToUserTable < ActiveRecord::Migration
  # local model, so the migration does not depend on the application's User class
  class User < ActiveRecord::Base; end

  def self.up
    add_column :users, :super_cool, :boolean, :default => false
    # make sure the model sees the newly added column
    User.reset_column_information
    u = User.find_by_login('cameron')
    u.super_cool = true
    u.save
  end

  def self.down
    remove_column :users, :super_cool
  end
end
The reason for this is that in the future, you might remove the model altogether, during some refactoring or other. If the User class is not defined inside the migration, the line "User.find_by_login..." will then throw an exception, which is a big pain.

Is there a reason you are not using the same migrations you used in your dev environment?

Adding a column with add_column in a migration should be non-destructive: it will generate an "ALTER TABLE" statement. If you know what you're going to put into the columns once created, you can fill in the values within the migration (you may choose a less time-consuming alternative if the row counts are large).
Removing or altering the definition of a column is, I think, platform-dependent: some will allow deletion of a column in place, others will perform a rename-create-select-drop sequence of commands.
To get more specific, we need more information: what kind of migration are you looking at, what platform are you running on, do you need to set values as part of the migration? Stuff like that would help a lot - just edit the question, which will push it back up the list.
