After rails migration, resulting schema does not match migrations. Lingering database state? - ruby-on-rails

Rails 6.1.4
Ruby 2.7
Postgresql 14
A dozen or so migrations, one schema.rb file.
I edited a migration, but did not change the migration id. The result is super weird behavior and I wanted to get input on the best approach.
After I incorrectly edited my migrations, I commited and pushed my feature. A team member pulled the feature and ran the migration on their machine. After they did, no matter the branch, the schema would include the changes I added when I originally modified it. But if they were on a different branch than mine, the actual migration files did not have those changes!!
I tried reverting my commit history to pre-migration editing with no luck. This is how I know it's a db issue, albeit caused from git.
So basically, after every migration, a specific model in the schema gets 4 added columns. No matter what, and it's not in a migration file on rails.
And thats the issue.
My question:
How would you go about solving this without resetting the db?
My current approach/best guess:
Lingering state in the db gets generated in the schema.rb file.
If its not in a migration, the only place schema.rb can get the info is from db.
How do I reset the state on stuff in general?
Either rebuild from scratch, or 'install a copy'. From scratch is not an option :)
If I wanted to install a copy, would it be a wise path to:
Revert changes from any migrations after pulling, delete branch.
Pull down fresh copy of branch
DO NOT MIGRATE - Instead, rails db:schema:load
This should copy over the db structure and effectively overwrite any lingering ghost state.
Rails db:migrate -> this will update migrations,
if you did everything right only the schema version number should change
Now things are synced, continue to db:migrate as normal moving forward.
I did this on my local machine and was successful, but I am curious..
Am I understanding this process correctly? Is there an easier way?

Related

I can't rollback migrations, because the migration file does not exist

I added a migration in branch "add_dogs" with migration db/migrate/20221220155010_create_dogs.rb, and ran db:migrate.
Later on, I changed branches (without a merge), and ultimately abandoned the "new_dogs" branch.
Later later on, I checked out "add_cats" branch with db/migrate/20221101010101_create_cats.rb, and ran db:migrate. So far, all is well.
But then I tweak the "add_cats" migration (before committing anything), and ran db:rollback so I can run it again. I get this error:
ActiveRecord::UnknownMigrationVersionError:
No migration with version number 20221220155010.
I can still run db:migrate on new migrations just fine, but not db:rollback or db:migrate:redo.
This makes sense, because the database has a record of applying 20221220155010, but that migration file no longer exists, so there is no way to roll it back.
How can I get past this?
Here are three ways to deal with a missing migration file, depending on your needs and access:
For a quick temporary fix, you can roll back just the migration you're currently editing so you can run it again. This may be useful if the other migration is still in the pipeline on the other branch and both eventually will get merged.
rake db:migrate:down VERSION=20230101010101
// This is the version of the migration you WANT to rollback, not the missing one.
If the missing migration will never come back, you want a permanent fix. The simplest way is to remove that record from the database. You can do this from your favorite SQL client, rails console, etc. (I suppose you could even write a migration to do that, but that seems mighty sketchy.)
DELETE FROM schema_migrations WHERE version = '20221220155010'
-- This is the version of the migration that is MISSING, not the one you are working on.
If you don't have direct access to the database for whatever reason, you can give Rails a placebo to rollback. Ensure the timestamp in the filename matches the missing migration's version number.
Create a file named db/migrations/20221220155010_just_kidding.rb:
class JustKidding < ActiveRecord::Migration
def change
# nothing to see here.
end
end
Then, rails db:rollback will roll back that no-op migration and delete 20221220155010 from the schema_migrations table. You can now delete the placebo migration forever and you'll be in good shape as far as running migrations and rollbacks.
However...don't forget that the effects of the old migration are still in your schema. Maybe you're stuck with a new, unused 'dogs' table or an extra column on a table. Maybe that's benign on your dev box, but you certainly don't want that cruft on a production environment. All the advice in this answer assumes you're on a throw-away environment and that the effects of the old migration aren't a problem. Tearing down your whole database and rebuilding may become a more attractive option in this case.
One of the realy take-aways here is... don't let this happen in the first place! Ideally, you should rollback any new, uncommitted migrations before changing away from a branch. But...things happen...
p.s. If there is a way to do this from the command line, I'd love to learn it. I'm imagining something like rails db:migrate:delete VERSION=20230101010101 might be handy in a hackish kind of way.

Edit commited migration in rails

In my project I have migrations like:
1. 20141122184434_add_column_a_to_b
2. 20141208235304_delete_data_from_b
3. 20141217011359_add_table
4. 20141218183503_remove_column_b_from_c
And they are already commited for my develop branch (but not on master and production). After few weeks (and adding more migration) it was found that migration B contains error, and it deletes important data, so we can't merge it with master.
Is there any clean way to edit migration 20141208235304_delete_data_from_b? I know I can just reroll, edit, and migrate again, but how will it work for other developers after I commit my changes to develop branch?
If the migration is recoverable (which means it will not destruct important data), then you can simply add a new migration to fix the issue.
Otherwise, if you discovered that one of the migrations is causing any issue, therefore the only possible solution is to fix the migration before it is ever applied in production. In that case, fix the migration, but then whoever applied such migration locally must:
reload the schema completely (rake db:schema:load) or
revert that migration (rake db:migrate:redo VERSION=20141208235304)
One more advice. A migration is designed to change the database schema, NOT to delete/insert or manipulate data. For such, you should use a rake task.

schema.rb messed up due to migrations in other branches

Currently I'm working with a huge rails application and multiple branches with each a new feature for this application.
It happens a lot a feature will require migrations, which shouldn't be a problem until you merge it with master: schema.rb got updated with the information of your dev database!
To clarify:
1. Branch A has migration create_table_x
2. Branch B has migration create_table_y
3. Branch A adds another create_table_z and runs db:migrate
4. You want to merge Branch A with Master and you see table_x, table_y and table_z in the schema.rb of Branch A.
It's not an option to reset+seed the database before every migration in a branch or create a database per branch. Due to the huge size of 2 GB SQL data, it would not be workable.
My question:
Is it really required to keep schema.rb in the repository since it gets rebuilt every migration?
If so, is it possible to build schema off the migrations instead of database dump?
you should keep the schema within your repo. running the migration will fix merge conflicts within your schema.rb file for you.
my simple take on your questions
Is it Required? not really, but good practice.
"It's strongly recommended to check this file into your version
control system." -schema.rb
It is possible? yes but you may want to ask yourself if you really save any time by doing so, either manually or building the schema off your migrations elsewhere and pushing it in.
you ge tthe added benefit of using
rake db:schema:load
and
rake db:schema:dump
http://tbaggery.com/2010/10/24/reduce-your-rails-schema-conflicts.html
Keeping schema.rb in the repo is important, because you want a new developer (or your test environment) to be able to generate a new database just from schema.rb. When you have a lot of migrations, they may not all continue to run, especially if they rely on model classes that don't exist, or have changed, or have different validations in effect than when the migration was first run. When you run your tests, you want to dump and recreate your database from schema.rb, not by re-running all the migrations every time you run the full test suite.
If you're working on two different feature branches simultaneously, and changing the database structure in both, I think schema.rb is the least of your problems. If you rename or remove a column in branch B, branch A is going to break whenever it references the old column. If all they're doing is creating new tables, it's not as bad, but then you want schema.rb to be updated when you merge from master into A, so that A doesn't try to run the migration a second time and fail because the new table already exists.
If that doesn't answer your question, maybe you could explain your workflow a little more?
Fresh Temporary DB as a quick workaround
For example, do the following whenever you need pretty db/schema.rb on particular branch:
git checkout -- db/schema.rb
switch to the different development database, i.e. update config/database.yml
rake db:drop && rake db:setup
rake db:migrate
commit your pretty db/schema.rb
revert changes in config/database.yaml

Generate a migration file from schema.rb

I'm looking to generate a migration file from the schema.rb. is it possible?
I have many migration files at the moment and would like to combine everything into one master migration file.
I also think i may have accidentally deleted a migration file at some point.
thanks for any help
You could copy and paste schema.rb into a migration and back-date it (e.g. change the date) so that no existing databases will run it. After you create this migration you can delete all your old migrations.
I disagree with Andrew that you should never delete migrations. Migrations break unexpectedly all the time based on model classes changing and it is very non-trivial to fix them. Since I'm sure you are using version control, you can always look back in the history if you need them for reference.
There's no need to do this. For new installations you should be running rake db:schema:load, not rake db:migrate, this will load the schema into the database, which is faster than running all the migrations.
You should never delete migrations, and certainly not combine them. As for accidentally deleting one, you should be using a version control system, such as Git.

Why does schema.rb change (in the eyes of Git) when just running rake db:migrate?

This is a little general I know, but it's been bugging the hell out of me. I've been working on lots of rails projects remotely with Git and every time I do a git pull and see that there is some sort of data change (migration, or schema.rb change) I do a rake db:migrate.
These generally run fine and I can continue working. But if you do a git pull and then git status, your working directory is clean (obviously) then do a rake db:migrate (obviously when there are changes) and another git status and all the sudden your db/schema.rb has changed. I have been just doing a git checkout immediately to reset back to the latest committed version of the schema.rb file, but why should this be necessary?! What is rails doing? Updating a timestamp? I can't seem to figure out what the diff is but maybe I'm just missing something?
The schema enables machines to run rake db:schema:load when being setup for the first time instead of having to run the migrations, which can go out of date if models are renamed or removed, etc. It's supposed to update after a migration, and you always want to latest version checked into source control.
The order of attributes in the dump reflects the order of the attributes in the database, and that can get out of sync if one person has been playing around locally with migrations, running them forwards and backwards manually, and editing things to get them just so. It's possible to create a state where the attribute order in the pusher's schema.rb is different from what everyone else will see when they run the migrations.
If it's easy to recreate your development data, just rebuild the database from schema.rb - then everyone is back in sync (but remember you can't reload the data from a SQL dump that also creates the table - that will recreate the problem. it has to be a data-only dump/load). In the worst case, you can create a migration to delete the column and another to re-add it.
schema.rb reflects your database schema so when you migrate(with changes) it follow that your schema also changes to reflect your db change. I usually add schema.rb to our gitignore, along with database.yml (probably because instead of using schema:load like the one below, i usually do a sql dump when cloning an existing app - but that's just me)

Resources