rake db migration issues - ruby-on-rails

Some questions on db migration tasks (rake db:migrate)
Does it make sense to rename the file names if there is a spelling mistake?
(e.g. CreaetFoos.rb to CreateFoos.rb)
I created a migration script (say version '3') by mistake during the dev process and I would like to remove it from git. If I have already migrated to the current level of '6', should I just roll back to '2', remove the migration script corresponding to '3' from git, and re-run the migration scripts? Will schema_migrations hold the right data in this case?
I would like to create a migration script during the dev process, but I don't want it to be treated as part of the migration set until I call it complete (i.e. I don't want other developers to use an incomplete migration script that is checked into git). How do I handle this?

A multi-part question! Let me answer them in the proper parts.
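[Question 1] Does it make sense to rename the file names if there is a spelling mistake?
If it bothers you that much, yes. It would bother me too. Keep the timestamp prefix unchanged and rename the class inside the file to match, since Rails expects the class name to correspond to the file name.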
[Question 2] [Wall of text about removing a migration]
Once a migration has been committed to your version control system, it should remain untouched. If it's modified, then you and other developers would need to roll it back and forward in order to get its changes again. It would be much better if you were to never touch old migrations and to fix any issues in new migrations. There are exceptions to this rule, which will be obvious when you encounter them, such as migrations that drop entire tables by accident.
[Question 3] Handling of migrations committed to version control
It's best practice to work in your own branch if you're going to be committing work that is incomplete. By doing this, you will leave the main branch ("master", probably) pristine and complete, allowing for other developers to continue on their own work.
Once you've got that migration sorted, then you will merge that branch back into master.

Related

I can't rollback migrations, because the migration file does not exist

I added a migration in branch "add_dogs" with migration db/migrate/20221220155010_create_dogs.rb, and ran db:migrate.
Later on, I changed branches (without a merge), and ultimately abandoned the "add_dogs" branch.
Later later on, I checked out "add_cats" branch with db/migrate/20221101010101_create_cats.rb, and ran db:migrate. So far, all is well.
But then I tweaked the "add_cats" migration (before committing anything), and ran db:rollback so I could run it again. I got this error:
ActiveRecord::UnknownMigrationVersionError:
No migration with version number 20221220155010.
I can still run db:migrate on new migrations just fine, but not db:rollback or db:migrate:redo.
This makes sense, because the database has a record of applying 20221220155010, but that migration file no longer exists, so there is no way to roll it back.
How can I get past this?
Here are three ways to deal with a missing migration file, depending on your needs and access:
For a quick temporary fix, you can roll back just the migration you're currently editing so you can run it again. This may be useful if the other migration is still in the pipeline on the other branch and both eventually will get merged.
rake db:migrate:down VERSION=20230101010101
# This is the version of the migration you WANT to roll back, not the missing one.
If the missing migration will never come back, you want a permanent fix. The simplest way is to remove that record from the database. You can do this from your favorite SQL client, rails console, etc. (I suppose you could even write a migration to do that, but that seems mighty sketchy.)
DELETE FROM schema_migrations WHERE version = '20221220155010'
-- This is the version of the migration that is MISSING, not the one you are working on.
If you don't have direct access to the database for whatever reason, you can give Rails a placebo to roll back. Ensure the timestamp in the filename matches the missing migration's version number.
Create a file named db/migrate/20221220155010_just_kidding.rb:
class JustKidding < ActiveRecord::Migration
  # On Rails 5 and later, the superclass needs a version, e.g. ActiveRecord::Migration[6.1].
  def change
    # nothing to see here.
  end
end
Then, rails db:rollback will roll back that no-op migration and delete 20221220155010 from the schema_migrations table. You can now delete the placebo migration forever and you'll be in good shape as far as running migrations and rollbacks.
However...don't forget that the effects of the old migration are still in your schema. Maybe you're stuck with a new, unused 'dogs' table or an extra column on a table. Maybe that's benign on your dev box, but you certainly don't want that cruft on a production environment. All the advice in this answer assumes you're on a throw-away environment and that the effects of the old migration aren't a problem. Tearing down your whole database and rebuilding may become a more attractive option in this case.
One of the real take-aways here is... don't let this happen in the first place! Ideally, you should roll back any new, uncommitted migrations before changing away from a branch. But...things happen...
p.s. If there is a way to do this from the command line, I'd love to learn it. I'm imagining something like rails db:migrate:delete VERSION=20230101010101 might be handy in a hackish kind of way.
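To find out which versions are stranded before choosing one of the fixes above, the comparison between the database's records and the files on disk can be sketched in plain Ruby. This is only an illustration: orphaned_versions is a hypothetical helper name, not a Rails API, and it assumes you already have the list of applied versions in hand.

```ruby
# Hypothetical helper (not part of Rails): given the versions the database
# believes are applied, return those with no matching file in migrate_dir.
def orphaned_versions(applied_versions, migrate_dir = "db/migrate")
  # Migration file names start with their version, e.g. 20221220155010_create_dogs.rb
  on_disk = Dir.glob(File.join(migrate_dir, "*.rb")).map do |path|
    File.basename(path)[/\A\d+/]
  end.compact

  applied_versions.map(&:to_s).reject { |v| on_disk.include?(v) }.sort
end
```

In a Rails console the applied versions could be fetched with ActiveRecord::Base.connection.select_values("SELECT version FROM schema_migrations"). Running rails db:migrate:status should give a similar picture, since it flags entries whose file is missing.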

After rails migration, resulting schema does not match migrations. Lingering database state?

Rails 6.1.4
Ruby 2.7
Postgresql 14
A dozen or so migrations, one schema.rb file.
I edited a migration, but did not change the migration id. The result is super weird behavior and I wanted to get input on the best approach.
After I incorrectly edited my migrations, I committed and pushed my feature. A team member pulled the feature and ran the migration on their machine. After they did, no matter the branch, the schema would include the changes I added when I originally modified it. But if they were on a different branch than mine, the actual migration files did not have those changes!!
I tried reverting my commit history to pre-migration editing with no luck. This is how I know it's a db issue, albeit one caused by git.
So basically, after every migration, a specific model in the schema gets 4 added columns. No matter what, and it's not in a migration file on rails.
And that's the issue.
My question:
How would you go about solving this without resetting the db?
My current approach/best guess:
Lingering state in the db gets generated in the schema.rb file.
If it's not in a migration, the only place schema.rb can get the info is from the db.
How do I reset the state on stuff in general?
Either rebuild from scratch, or 'install a copy'. From scratch is not an option :)
If I wanted to install a copy, would it be a wise path to:
Revert changes from any migrations after pulling, delete branch.
Pull down fresh copy of branch
DO NOT MIGRATE. Instead, run rails db:schema:load.
This should copy over the db structure and effectively overwrite any lingering ghost state.
Run rails db:migrate. This will update the migrations; if you did everything right, only the schema version number should change.
Now things are synced, continue to db:migrate as normal moving forward.
I did this on my local machine and was successful, but I am curious..
Am I understanding this process correctly? Is there an easier way?

Consolidating Rails Migrations on a Long Running Project

I'm working on a project that's accumulated hundreds of migrations, and I'm unsure what to do with them long term. I suppose they're not hurting anything, but it seems strange to keep around a bunch of old files for incremental migrations, some of them creating tables that are later removed.
So far, I've seen three possibilities:
Leave them alone. They're not hurting anything.
Just delete them. I don't see much harm in doing this, since a new developer would probably be starting with schema load anyway, not migrations.
Delete them all, then create a single new migration, with a timestamp matching an old migration, built from your schema. This seems very clean, but I'm not sure who would actually use it.
I'm inclined to just delete them, but I'm curious if there's a big pitfall I'm missing.
In my opinion, as soon as every database on the project, especially production, is at least at version '201xxxxxxxx', it should be fine to delete migrations before that version. They are not technically necessary anymore.
After that, if you want to play archaeology with your database history, you can still use your version control system.
With Git, for example, you can use the following commands to take a quick look at the past:
git log --name-only db/migrate/ # list commits involving migrations, with the migration file names
git show xxxxx db/migrate # see the migration code changed in commit xxxxx
Alternatively, you can browse repository history of schema.rb, identify a commit and see the corresponding migration content with the command above.
If you prefer to have a lighter db/migrate and rely on version control for the history, I would go for a little cleanup.
If you find it more convenient to have the whole migrations history directly available because it's easier to browse, I would go for option 1.
Note: it is very likely that old migrations do not make sense with the current application code. For example, some migrations may refer to classes or methods that don't exist anymore.
Using version control to checkout the application at the time the migration was written could also avoid some confusion.
Personally, I lean toward option 1: It's generally true that at some point in any project, the schema is what matters, and the migrations are just a curiosity, but you're right when you say they're not hurting anything. Theoretically the old migrations could be useful for someone who wanted to go back and see how the database was organized at some point in the past.
I don't know of any serious pitfalls in deleting them, but I also don't see an advantage to doing so, unless it's the saving of time scrolling past them when you want to edit a new migration.
I don't think the effort of putting together a single migration that duplicates the schema is beneficial - it's extra work, and that's what the schema is for anyway.
If you work on a project large enough for long enough, there will come a time when you look at all of those extra migrations with disdain and wonder how so many can exist.
You think to yourself, "just delete them...they don't do anything." This is a perfectly logical and normal thought process (especially as a Rails developer, we just love to minimize code and make things efficient!), but don't let the dark side tempt you.
By deleting migrations, you are deleting the historical record of your application's data model, and worse, the logical path you took to get to your current model. This history can help you remember why you did what you did, and didn't do what you didn't do.
Yes, we've all been guilty of deleting migrations from time to time. But you must resist the temptation as the net benefit would only be a few kB and a cleaner migration folder.
Remember:
Those who delete migrations are condemned to repeat them!
In one of our former projects we dropped all migrations and created a new one that truncated the schema_migrations table with manual SQL, then copied the contents of db/schema.rb into it.
Of course this migration is irreversible, but it allowed us to get rid of hundreds of old, valueless migrations while still being able to re-create the db from migrations as well as from the schema.

Is it a good idea to purge old Rails migration files?

I have been running a big Rails application for over 2 years and, day by day, my ActiveRecord migration folder has grown to over 150 files.
There are very old models, no longer available in the application, still referenced in the migrations. I was thinking of removing them.
What do you think? Do you usually purge old migrations from your codebase?
The Rails 4 Way page 177:
Sebastian says...
A little-known fact is that you can remove old migration files (while
still keeping newer ones) to keep the db/migrate folder to a
manageable size. You can move the older migrations to a
db/archived_migrations folder or something like that. Once you do trim
the size of your migrations folder, use the rake db:reset task to
(re-)create your database from db/schema.rb and load the seeds into
your current environment.
Once I hit a major site release, I'll roll the migrations into one and start fresh. I feel dirty once the migration version numbers get up around 75.
I occasionally purge all migrations that have already been applied in production, and I see at least 2 reasons for this:
More manageable folder: it is easier to spot a new migration.
Cleaner text search results: global text search within a project does not lead to tons of useless matches because of some 3-year-old migration when someone added or removed some column which anyway does not exist anymore.
They are relatively small, so I would choose to keep them, just for the record.
You should write your migrations without referencing models or other parts of the application, because they'll come back to haunt you ;)
Check out these guidelines:
http://guides.rubyonrails.org/migrations.html#using-models-in-your-migrations
Personally I like to keep things tidy in the migrations files. I think once you have pushed all your changes into prod you should really look at archiving the migrations. The only difficulty I have faced with this is that when Travis runs it runs a db:migrate, so these are the steps I have used:
Move historic migrations from /db/migrate/ to /db/archive/release-x.y/
Create a new migration file manually using the version number from the last run migration in the /db/archive/release-x.y directory and change the description to something like from_previous_version. Using the old version number means that it won't run on your prod machine and mess up.
Copy the schema.rb contents from inside the ActiveRecord::Schema.define(version: 20141010044951) do section and paste them into the change method of your from_previous_version migration.
Check all that in and Robert should be your parent's brother.
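The first step of that list (moving historic migrations aside) could be scripted roughly like this. It is a sketch under assumptions: archive_migrations and its keyword arguments are made-up names, and the cutoff is taken to be the version of the last migration that has run everywhere.

```ruby
require "fileutils"

# Hypothetical helper: move every migration whose version is at or below
# cutoff_version from migrate_dir into archive_dir.
# Returns the names of the files that were moved, oldest first.
def archive_migrations(cutoff_version, migrate_dir: "db/migrate", archive_dir: "db/archive")
  FileUtils.mkdir_p(archive_dir)
  moved = []

  Dir.glob(File.join(migrate_dir, "*.rb")).sort.each do |path|
    name = File.basename(path)
    version = name[/\A\d+/]
    # Skip files without a version prefix and anything newer than the cutoff.
    next if version.nil? || version.to_i > cutoff_version.to_i

    FileUtils.mv(path, File.join(archive_dir, name))
    moved << name
  end

  moved
end
```

For example, archive_migrations("20141010044951", archive_dir: "db/archive/release-x.y") would leave only newer migrations in db/migrate, after which the from_previous_version file can be created by hand with that same version number.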
The only other consideration would be if your migrations create any data (my test scenarios contain all their own data so I don't have this issue)
Why? Unless there is some kind of problem with disk space, I don't see a good reason for deleting them. I guess if you are absolutely certain that you are never going to roll back anything ever again, then you can. However, it seems like saving a few KB of disk space to do this wouldn't be worth it. Also, if you just want to delete the migrations that refer to old models, you have to look through them all by hand to make sure you don't delete anything that is still used in your app. Lots of effort for little gain, to me.
See http://edgeguides.rubyonrails.org/active_record_migrations.html#schema-dumping-and-you
Migrations are not a representation of the database: either structure.sql or schema.rb is. Migrations are also not a good place for setting/initializing data. db/seeds or a rake task are better for that kind of task.
So what are migrations? In my opinion they are instructions for how to change the database schema - either forwards or backwards (via a rollback). Unless there is a problem, they should be run only in the following cases:
On my local development machine as a way to test the migration itself and write the schema/structure file.
On colleague developer machines as a way to change the schema without dropping the database.
On production machines as a way to change the schema without dropping the database.
Once run they should be irrelevant. Of course mistakes happen, so you definitely want to keep migrations around for a few months in case you need to rollback.
CI environments do not ever need to run migrations. It slows down your CI environment and is error prone (just like the Rails guide says). Since your test environments only have ephemeral data, you should instead be using rake db:setup, which will load from the schema.rb/structure.sql and completely ignore your migration files.
If you're using source control, there is no benefit in keeping old migrations around; they are part of the source history. It might make sense to put them in an archive folder if that's your cup of coffee.
With that all being said, I strongly think it makes sense to purge old migrations, for the following reasons:
They could contain code that is so old it will no longer run (like if you removed a model). This creates a trap for other developers who want to run rake db:migrate.
They will slow down grep-like tasks and are irrelevant past a certain age.
Why are they irrelevant? Once more for two reasons: the history is stored in your source control and the actual database structure is stored in structure.sql/schema.rb. My rule of thumb is that migrations older than about 12 months are completely irrelevant. I delete them. If there were some reason why I wanted to rollback a migration older than that I'm confident that the database has changed enough in that time to warrant writing a new migration to perform that task.
So how do you get rid of the migrations? These are the steps I follow:
Delete the migration files
Write a rake task to delete their corresponding rows in the schema_migrations table of your database.
Run rake db:migrate to regenerate structure.sql/schema.rb.
Validate that the only thing changed in structure.sql/schema.rb is removed lines corresponding to each of the migrations you deleted.
Deploy, then run the rake task from step 2 on production.
Make sure other developers run the rake task from step 2 on their machines.
The second item is necessary to keep schema/structure accurate, which, again, is the only thing that actually matters here.
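The core of the rake task in step 2 might look something like the following. It is a minimal sketch, not the author's actual task: delete_versions_sql is a hypothetical helper that only builds the statement, and the version check is an added safety assumption.

```ruby
# Hypothetical helper: build the DELETE statement that removes the rows for
# the purged migrations from schema_migrations.
def delete_versions_sql(versions)
  cleaned = versions.map(&:to_s)
  raise ArgumentError, "no versions given" if cleaned.empty?

  # Versions are plain digit strings; refuse anything else rather than
  # interpolate it into SQL.
  bad = cleaned.reject { |v| v.match?(/\A\d+\z/) }
  raise ArgumentError, "not a migration version: #{bad.inspect}" unless bad.empty?

  quoted = cleaned.map { |v| "'#{v}'" }.join(", ")
  "DELETE FROM schema_migrations WHERE version IN (#{quoted})"
end
```

Inside a rake task that depends on :environment, the resulting string could be passed to ActiveRecord::Base.connection.execute. Keeping the list of purged versions in the task itself means the same task can be run unchanged in production and on other developers' machines.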
It's fine to remove old migrations once you're comfortable they won't be needed. The purpose of migrations is to have a tool for making and rolling back database changes. Once the changes have been made and in production for a couple of months, odds are you're unlikely to need them again. I find that after a while they're just cruft that clutters up your repo, searches, and file navigation.
Some people will run the migrations from scratch to reload their dev database, but that's not really what they're intended for. You can use rake db:schema:load to load the latest schema, and rake db:seed to populate it with seed data. rake db:reset does both for you. If you've got database extensions that can't be dumped to schema.rb then you can use the sql schema format for ActiveRecord and run rake db:structure:load instead.
Yes. If you have completely removed a model and its table from the database, that removal is worth putting in a migration. If the model reference in an old migration does not depend on anything else, you can delete it. That migration has already run and will never run again on existing databases, but if you leave the reference in place it will cause a problem whenever you migrate a fresh database.
So it is better to remove that reference from the migration, and to refactor the migrations down to one or two files before a big release to the live database.
I agree, no value in 100+ migrations, the history is a mess, there is no easy way of tracking history on a single table and it adds clutter to your file finding. Simply Muda IMO :)
Here's a 3-step guide to squash all migrations into a schema identical to production:
Step 1: schema from production
# launch rails console in production
stream = StringIO.new
ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, stream); nil
stream.rewind
puts stream.read
This is copy-pasteable to migrations, minus the obvious header
Step 2: making the migrations without it being run in production
This is important. Use the last migration and change its name and content. ActiveRecord stores the datetime number in its schema_migrations table so it knows what it has and has not run. Reuse the last one and it'll think it has already run.
Example: rename 20161202212203_this_is_the_last_migration.rb -> 20161202212203_schema_of_20161203.rb
And put the schema there.
Step 3: verify and troubleshoot
Locally, rake db:drop, rake db:create, rake db:migrate
Verify that schema is identical. One issue we encountered was datetime "now()" in schema, here's the best solution I could find for that: https://stackoverflow.com/a/40840867/252799
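For the verification in step 3, comparing the locally generated schema.rb against the production dump by eye gets tedious. A rough sketch of an automated comparison follows; normalize_schema and same_schema? are hypothetical names, and ignoring comments, blank lines and the version number is an assumption about what counts as "identical" here.

```ruby
# Hypothetical helpers: compare two schema dumps while ignoring comment
# lines, blank lines, trailing whitespace and the version number.
def normalize_schema(text)
  text.lines
      .reject { |line| line.strip.empty? || line.strip.start_with?("#") }
      .map { |line| line.sub(/\(version: [\d_]+\)/, "(version: X)").rstrip }
end

def same_schema?(dump_a, dump_b)
  normalize_schema(dump_a) == normalize_schema(dump_b)
end
```

Usage would be along the lines of same_schema?(File.read("db/schema.rb"), File.read("production_schema.rb")), where the second file holds the output pasted from the production console.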

Managing Rails Migrations for different branches on the same machine

I'm a one-man-band at the company I work for. I develop a Rails application for internal use within the company. Since the beginning of the project I have used SVN for source control and done most, but not all, development in trunk. Occasionally, when I have had very significant changes to make, I have branched and made the changes merging back in when done. All very typical.
However, none of those "significant changes" that I have had to make have ever touched the database migrations. They have always been view/controller stuff.
In this situation, with one development box, how do I play around with migrations and various database changes that I may or may not keep? I don't want to have to remember to revert all the migrations back to the beginning of the branch before I throw the branch out if it doesn't work.
I have considered setting up special development environments and databases (app_branch instead of app_development) but that seems to work strongly against the notion of "easy branching" that experimental development tends to rely on.
Are there best practices for this situation? What are others out there doing in this situation?
I try hard to keep my development database "droppable." If I lose it all - no big deal. My migrations are ready to build it up again from scratch and there's always a script with seed / test data in it somewhere. I guess it's not especially clever.
If I needed a new branch for database work, I would just check it out, drop, create, rake, and then seed. I guess I'd write a script to get it done because when I go to abandon the branch, I'm going to have to go through the same process again from the trunk.
Make sure your schema.rb file is in version control. That way, as you switch branches, you can drop your DB and then do rake db:schema:load as necessary.
Also, you really should switch to Git. It will make branch management a lot easier than SVN. (I speak from lots of experience with both programs.)
Well, if you want to have different schemas, you'll need multiple databases. "Easy branching" refers to source control, typically, and not databases. As far as I know there's no easy way to branch databases like you would branch in, say, git.
One thing we do to manage our dev/production branches is we check our current git branch in our database.yml file. If the current branch is production, we use one database, otherwise we use our dev database. Something along the lines of this:
<% if `git branch` =~ /^\* production/
     db = 'production_database'
   else
     db = 'development_database'
   end %>
development:
  database: <%= db %>
Note, the 'production_database' refers to a local version of the production schema, not the live production database.
I wrote a script for dealing with this exact problem. It is based around git, but you could easily change it to work for svn:
https://gist.github.com/4076864
Given a branch name it will:
Roll back any migrations on your current branch which do not exist on the given branch
Discard any changes to the db/schema.rb file
Check out the given branch
Run any new migrations existing in the given branch
Update your test database
I find myself manually doing this all the time on our project, so I thought it'd be nice to automate the process.
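The first step in that list, working out which migrations need rolling back, comes down to a set difference between the two branches' migration files. The sketch below illustrates that idea only; it is not the linked script, and versions_to_roll_back is a hypothetical name.

```ruby
# Hypothetical helper: given the migration file names on the current branch
# and on the branch being checked out, return the versions that exist only
# on the current branch, newest first (the order they should be rolled back).
def versions_to_roll_back(current_files, target_files)
  version_of = ->(name) { File.basename(name)[/\A\d+/] }

  target_versions = target_files.map(&version_of).compact
  current_files.map(&version_of)
               .compact
               .reject { |v| target_versions.include?(v) }
               .sort_by(&:to_i)
               .reverse
end
```

Each returned version could then be fed to rake db:migrate:down VERSION=... before checking out the other branch. The target branch's file list might be obtained with something like git ls-tree --name-only <branch> db/migrate/, though how you collect it is up to you.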
If you are creating a branch where you are making significant changes, you can create a copy of the database before creating your migrations, then change the development section of database.yml inside the branch. Leave your :production section alone and then decide which version of the database you want to keep for future development when you merge the branch back into the trunk.
We do this with feature releases. I'll have local DBs for version 1, 2, 3 like "db_v1", "db_v2", etc. As we roll through the versions, each subsequent development branch gets an edit in database.yml while the trunk stays on the last version for bug fixing.
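If editing database.yml by hand for every branch becomes a chore, the database name could be derived from the branch name instead. A minimal sketch, assuming a naming scheme like app_branch versus app_development as in the question; database_name_for and the list of mainline branch names are made up for illustration.

```ruby
# Hypothetical helper: turn a branch name into a per-branch database name.
# Mainline branches (an assumed list) keep the ordinary development database.
def database_name_for(branch, prefix: "app")
  slug = branch.to_s.strip.downcase
               .gsub(/[^a-z0-9]+/, "_")   # anything unusual becomes "_"
               .gsub(/\A_+|_+\z/, "")     # trim leading/trailing "_"

  if slug.empty? || %w[master main trunk].include?(slug)
    "#{prefix}_development"
  else
    "#{prefix}_#{slug}"
  end
end
```

With this, a branch called feature/add-dogs maps to app_feature_add_dogs, while trunk or master keeps app_development. Each new branch database still has to be created and loaded once, for example by copying the existing one as described above.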

Resources