How to skip rails migrations after creating database from dump

How to skip rails migrations after creating database from dump - ruby-on-rails

I restored my database from the latest dump and tried to run rake tests. Unfortunately 30 migrations were pending. My first idea was to comment out each of 30 migrations code and run 'rake db:migrate' but there must be a simpler solution. I use Rails 2.3.14 and Postgresql 9.1.3.

If you're restoring a database from a dump, the schema_migrations table should restore along with the rest of the tables.
This seems to indicate your schema_migrations table may not be getting backed up which would lead to the problem you have now.
The ideal solution would be to restore a backup that has all the tables in it correctly -- including schema_migrations.
Even if you decide to find a way around this in the short-term, in the long-term the correct solution is to modify your backup scripts to get all the tables you need, including schema_migrations.
In terms of what to do now, the ideal solution is probaby to backup just that one table (schema_migrations) from your database and import that data into the database you're trying to load now. Then your migrations should no longer be pending.
Doing that with a simple table dump and load script should be fine. The simple postgres gui PgAdmin ( http://www.pgadmin.org/ ) may also provide some basic tools for dumping then loading a single table.

Kevin is correct. However, he is missing a critical point.
When you restore from a backup it restores the schema_migrations table which tracks which migrations need to be run. If those thirty migrations had been run on the database you restored from, they would not have run.
However, your code is thirty migrations ahead of the snapshot of your database represented by the backup.
This can happen to me if I deploy, then grab the production backup right away. Although the migrations have run on production, I'm getting the backup from before office hours prior to my deployment. I usually like to wait a day and get the next day's backup.
Or, don't worry about it. Your backup is before those thirty migrations, but then they were applied, so the migrations have made sure your schema matches the version of your code. That's a good thing.
Don't sweat it, and refresh again tomorrow, when the backup has your changes.

You could also manually add the timestamps of the missing migrations in the db table like:
INSERT INTO "public"."schema_migrations"("version") VALUES ('20201212012345')
That should have the same effect as temporarily outcommenting the 'create' instructions in the migration files. If you run migrations as part of a deploy process from git, commenting out would mean you had to push those changes to git.
If you just work on staging / development env, directly fixing the db might be nicer than pushing those changes, possibly confusing other deploys or devs.

Related

Rails pg database: schema_migrations empty

I am trying to deploy a new release of a project to my staging server, but the schema_migrations table in the database is inexplicably empty.
It is now trying to run all migrations while deploying, causing issues since the other tables exist and are intact.
Instead of dropping/recreating the database and losing all my data(though inconvenient, a valid option), is it possible to generate the schema_migrations table without dropping?

Two points:
There's nothing wrong with deleting migrations that aren't needed anymore, you're not supposed to keep them around forever.
You can manually update the schema_migrations table if you have to.
As far as (1) goes, I delete migrations (provided they've been applied everywhere) that are old than a month or so. Keeping around years of clutter in db/migrate is pointless. Migrating down and then back up tends to be more of a mess than it is worth for all but the most recent migrations; if you need to test something old, dump your current development database and start with a fresh one based on an older schema.rb. Besides, if you really need an old migration then you can always dig it out of revision control or simply recreate it based on diffs between versions of db/schema.rb (or db/structure.sql).
Of course, if you don't keep db/schema.rb in revision control then you're doing it wrong and you a looming disaster on your hands.
For (2), the schema_migrations table is very simple:
create table schema_migrations (
version varchar not null primary key
)
So you can connect to PostgreSQL using psql, create the table by hand, and then do a bunch of insert into schema_migrations (version) values (...) to cover the historic migrations that should have been run in your staging environment, and then db:migrate as normal to get things up to date. Then you can spend as much time as you want doing a post mortem to figure out what happened to your staging database's schema_migrations table.
If you go this route, you might want to have a look at the ar_internal_metadata table if your version of Rails uses it.
You will, of course, backup your staging database before doing any of this. And you will want to examine your production database to make sure it isn't suffering the same problems.
Sometimes you have to get your hands dirty to make things go.

What's a good way to clean up my migrations in Rails?

So I've been working on this web app for a year now and I would like to compile to schema into ONE migration, that way my text editor loads faster, git working directory isn't so cluttered.
Search find will be faster.
Any my config/db won't be 4000px long.

Remove the migration files once you've migrated your servers. If you ever want to start with a fresh deployment, run rake db:schema:load or rake db:setup. You shouldn't be re-running all your migrations as explained here.

You don't need to keep your migrations around forever, you are free to delete them as soon as you're sure you don't need them anymore. Just go into your db/migrate/ directory and delete the migrations that are older than, say, a couple months.
As long as all the migrations that you want to delete have been applied everywhere (i.e. development and production) then you don't need them anymore (unless you want to go backwards). Really, migrations aren't meant to be permanent files, they're just around to get you from A to B and then they're just baggage.

One way to go is to take a blank database and run all the migrations. Now you've got all the template data which you can save to a yaml. The yaml plus the schema should be enough to bring the DB back without running any of your previously existing migrations.
However, other answers should mention an existing tool or gem for doing this.

Given that none of the answers mention it, this is the gem that does the job: https://github.com/jalkoby/squasher
It basically reruns the migrations from scratch until the date you specify, and then loads the resulting db/schema.rb into an initial migration that replaces the old ones. It can also cleanup the schema_migrations table so you don't get those
up <timestamp> ********** NO FILE **********
entries when running rake db:migrate:status.

rake db:schema:load vs. migrations

Very simple question here - if migrations can get slow and cumbersome as an app gets more complex and if we have the much cleaner rake db:schema:load to call instead, why do migrations exist at all?
If the answer to the above is that migrations are used for version control (a stepwise record of changes to the database), then as an app gets more complex and rake db:schema:load is used more instead, do they continue to maintain their primary function?
Caution:
From the answers to this question: rake db:schema:load will delete data on a production server so be careful when using it.

Migrations provide forward and backward step changes to the database. In a production environment, incremental changes must be made to the database during deploys: migrations provide this functionality with a rollback failsafe. If you run rake db:schema:load on a production server, you'll end up deleting all your production data. This is a dangerous habit to get into.
That being said, I believe it is a decent practice to occasionally "collapse" migrations. This entails deleting old migrations, replacing them with a single migration (very similar to your schema.rb file) and updating the schema_migrations table to reflect this change. Be very careful when doing this! You can easily delete your production data if you aren't careful.
As a side note, I strongly believe that you should never put data creation in the migration files. The seed.rb file can be used for this, or custom rake or deploy tasks. Putting this into migration files mixes your database schema specification with your data specification and can lead to conflicts when running migration files.

Just stumbled across this post, that was long ago and didn't see the answer I was expecting.
rake db:schema:load is great for the first time you put a system in production. After that you should run migrations normally.
This also helps you cleaning your migrations whenever you like, since the schema has all the information to put other machines in production even when you cleaned up your migrations.

Migrations lets you add data to the db too. but db:schema:load only loads the schema .

Because migrations can be rolled back, and provide additional functionality. For example, if you need to modify some data as part of a schema change then you'll need to do that as a migration.

As a user of other ORM's, it always seemed strange to me that Rails didn't have a 'sync and update' feature. ie, by using the schema file (which represents the entire, up-to-date schema), go through the existing DB structure and add/remove tables, columns, indexes as required.
To me this would be a lot more robust, even if possibly a little slower.

I have already posted as a comment, but feels it is better to put the comments of the db/schema.rb file here:
# Note that this schema.rb definition is the authoritative source for your
# database schema. If you need to create the application database on another
# system, you should be using db:schema:load, not running all the migrations
# from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
#
# It's strongly recommended that you check this file into your version control system.
Actually, my experience is that it is better to put the migration files in git and not the schema.rb file...

rake db:migrate setup the tables in the database. When you run the migration command, it will look in db/migrate/ for any ruby files and execute them starting with the oldest. There is a timestamp at the beginning of each migration filename.
Unlike rake db:migrate that runs migrations that have not run yet, rake db:schema:load loads the schema that is already generated in db/schema.rbinto the database.
You can find out more about rake database commands here.

So schema:load takes the currently configured schema, derives the associated queries to match, and runs them all in one go. It's kind of a one-and-done situation. As you've seen, migrations make changes step-by-step. Loading the schema might make sense when working on a project locally, especially early in the lifetime of a project. But if we were to drop and recreate the production DB each time we do a deployment, we would lose production data each time. That's a no-go. So that's why we use migrations to make the required changes to the existing DB.
So. The deeper into a project you get, the more migrations you'll get stacked up as you make more changes to the DB. And with each migration, those migrations become more and more the source of truth of what's on production - what matters isn't what's in the schema, but what migrations have been run in production. The difference is effectively moot if we have both in sync. But as soon as one goes of out date from the other, you start to have discrepancies. Ideally this would not happen, but we live in the real world, and stuff happens. And if you're using schema:load to set up your DB locally, you might not be getting the actual state of the DB, as it is reflected via the migration history on production.

Is it a good idea to purge old Rails migration files?

I have been running a big Rails application for over 2 years and, day by day, my ActiveRecord migration folder has been growing up to over 150 files.
There are very old models, no longer available in the application, still referenced in the migrations. I was thinking to remove them.
What do you think? Do you usually purge old migrations from your codebase?

The Rails 4 Way page 177:
Sebastian says...
A little-known fact is that you can remove old migration files (while
still keeping newer ones) to keep the db/migrate folder to a
manageable size. You can move the older migrations to a
db/archived_migrations folder or something like that. Once you do trim
the size of your migrations folder, use the rake db:reset task to
(re-)create your database from db/schema.rb and load the seeds into
your current environment.

Once I hit a major site release, I'll roll the migrations into one and start fresh. I feel dirty once the migration version numbers get up around 75.

I occasionally purge all migrations, which have already been applied in production and I see at least 2 reasons for this:
More manageable folder: it is easier to spot a new migration.
Cleaner text search results: global text search within a project does not lead to tons of useless matches because of some 3-year-old migration when someone added or removed some column which anyway does not exist anymore.

They are relatively small, so I would choose to keep them, just for the record.
You should write your migrations without referencing models, or other parts of application, because they'll come back to you haunting ;)
Check out these guidelines:
http://guides.rubyonrails.org/migrations.html#using-models-in-your-migrations

Personally I like to keep things tidy in the migrations files. I think once you have pushed all your changes into prod you should really look at archiving the migrations. The only difficulty I have faced with this is that when Travis runs it runs a db:migrate, so these are the steps I have used:
Move historic migrations from /db/migrate/ to /db/archive/release-x.y/
Create a new migration file manually using the version number from the last run migration in the /db/archive/release-x.y directory and change the description to something like from_previous_version. Using the old version number means that it won't run on your prod machine and mess up.
Copy the schema.rb contents from inside the ActiveRecord::Schema.define(version: 20141010044951) do section and paste into the change method of your from_previous_version changelog
Check all that in and Robert should be your parent's brother.
The only other consideration would be if your migrations create any data (my test scenarios contain all their own data so I don't have this issue)

Why? Unless there is some kind of problem with disk space, I don't see a good reason for deleting them. I guess if you are absolutely certain that you are never going to roll back anything ever again, than you can. However, it seems like saving a few KB of disk space to do this wouldn't be worth it. Also, if you just want to delete the migrations that refer to old models, you have to look through them all by hand to make sure you don't delete anything that is still used in your app. Lots of effort for little gain, to me.

See http://edgeguides.rubyonrails.org/active_record_migrations.html#schema-dumping-and-you
Migrations are not a representation of the database: either structure.sql or schema.rb is. Migrations are also not a good place for setting/initializing data. db/seeds or a rake task are better for that kind of task.
So what are migrations? In my opinion they are instructions for how to change the database schema - either forwards or backwards (via a rollback). Unless there is a problem, they should be run only in the following cases:
On my local development machine as a way to test the migration itself and write the schema/structure file.
On colleague developer machines as a way to change the schema without dropping the database.
On production machines as a way to change the schema without dropping the database.
Once run they should be irrelevant. Of course mistakes happen, so you definitely want to keep migrations around for a few months in case you need to rollback.
CI environments do not ever need to run migrations. It slows down your CI environment and is error prone (just like the Rails guide says). Since your test environments only have ephemeral data, you should instead be using rake db:setup, which will load from the schema.rb/structure.sql and completely ignore your migration files.
If you're using source control, there is no benefit in keeping old migrations around; they are part of the source history. It might make sense to put them in an archive folder if that's your cup of coffee.
With that all being said, I strongly think it makes sense to purge old migrations, for the following reasons:
They could contain code that is so old it will no longer run (like if you removed a model). This creates a trap for other developers who want to run rake db:migrate.
They will slow down grep-like tasks and are irrelevant past a certain age.
Why are they irrelevant? Once more for two reasons: the history is stored in your source control and the actual database structure is stored in structure.sql/schema.rb. My rule of thumb is that migrations older than about 12 months are completely irrelevant. I delete them. If there were some reason why I wanted to rollback a migration older than that I'm confident that the database has changed enough in that time to warrant writing a new migration to perform that task.
So how do you get rid of the migrations? These are the steps I follow:
Delete the migration files
Write a rake task to delete their corresponding rows in the schema_migrations table of your database.
Run rake db:migrate to regenerate structure.sql/schema.rb.
Validate that the only thing changed in structure.sql/schema.rb is removed lines corresponding to each of the migrations you deleted.
Deploy, then run the rake task from step 2 on production.
Make sure other developers run the rake task from step 2 on their machines.
The second item is necessary to keep schema/structure accurate, which, again, is the only thing that actually matters here.

It's fine to remove old migrations once you're comfortable they won't be needed. The purpose of migrations is to have a tool for making and rolling back database changes. Once the changes have been made and in production for a couple of months, odds are you're unlikely to need them again. I find that after a while they're just cruft that clutters up your repo, searches, and file navigation.
Some people will run the migrations from scratch to reload their dev database, but that's not really what they're intended for. You can use rake db:schema:load to load the latest schema, and rake db:seed to populate it with seed data. rake db:reset does both for you. If you've got database extensions that can't be dumped to schema.rb then you can use the sql schema format for ActiveRecord and run rake db:structure:load instead.

Yes. I guess if you have completely removed any model and related table also from database, then it is worth to put it in migration. If model reference in migration does not depend on any other thing, then you can delete it. Although that migration is never going to run again as it has already run and even if you don't delete it from existing migration, then whenever you will migrate database fresh, it cause a problem.
So better it to remove that reference from migration. And refactore/minimize migrations to one or two file before big release to live database.

I agree, no value in 100+ migrations, the history is a mess, there is no easy way of tracking history on a single table and it adds clutter to your file finding. Simply Muda IMO :)
Here's a 3-step guide to squash all migrations into identical schema as production:
Step1: schema from production
# launch rails console in production
stream = StringIO.new
ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, stream); nil
stream.rewind
puts stream.read
This is copy-pasteable to migrations, minus the obvious header
Step 2: making the migrations without it being run in production
This is important. Use the last migration and change it's name and content. ActiveRecord stors the datetime number in it's schema_migrations table so it knows what it has run and not. Reuse the last and it'll think it has already run.
Example: rename 20161202212203_this_is_the_last_migration -> 20161202212203_schema_of_20161203.rb
And put the schema there.
Step 3: verify and troubleshoot
Locally, rake db:drop, rake db:create, rake db:migrate
Verify that schema is identical. One issue we encountered was datetime "now()" in schema, here's the best solution I could find for that: https://stackoverflow.com/a/40840867/252799

Rebase Rails migrations in a long running project

In which I mean "rebasing" in the dictionary, rather than git definition...
I have a large, long running Rails project that has about 250 migrations, it's getting a touch unwieldy to manage all of these.
That said, I do need a base from which to purge and rebuild my database when running tests. So the data contained in these is important.
Does any one have any strategies for say, dumping the schema at a set point - archiving off all the old migrations and starting afresh with new migrations.
Obviously I can use rake schema:dump - but really I need a way that db:migrate will load the schema first and then start running the rest of the migrations.
I would like to keep using migrations as they're very useful in development, however, there's no way I'm going back and editing a migration from 2007 so it seems silly to keep it.

In general, you don't need to clean up old migrations. If you're running db:migrate from scratch (no existing db), Rails uses db/schema.rb to create the tables instead of running every migration. Otherwise, it only runs the migrations required to upgrade from the current schema to the latest.
If you still want to combine migrations up to a given point into a single one, you could try to:
migrate from scratch up to the targeted schema using rake db:migrate VERSION=xxx
dump the schema using rake db:schema:dump
remove the migrations from the beginning up to version xxx and create a single new migration using the contents of db/schema.rb (put create_table and add_index statements into the self.up method of the new migration).
Make sure to choose one of the old migration version numbers for your aggregated new migration; otherwise, Rails would try to apply that migration on your production server (which would wipe your existing data, since the create_table statements use :force⇒true).
Anyway, I wouldn't recommend to do this since Rails usually handles migrations well itself. But if you still want to, make sure to double check everything and try locally first before you risk data loss on your production server.

To automate the merging (or squashing) of migrations, you could use the Squasher gem
Simply install
gem install squasher
And run with a date, and migrations before that date will be merged:
squasher 2016 # => Will merge all migration created before 2016
More details in the README

In addition to the answer provided (which well indicates how to consolidate your volume of migrations), you indicate a concern to purge data (which I assume is manually added after fixtures populate your tables); which infers you're depending on refreshing an initial data state. Some projects indeed require intensive refinement of base data, reconstruction, and re-population of tables. Ours heavily depends on repetitive execution of these operations, and I've found that if you can reduce your schema entirely to SQL execute statements, your tables will rebuild far faster than they will from Ruby syntax.
A trivial further help in rebuilding your tables is to dedicate a separate terminal window to a single combined command statement:
rake db:drop db:create db:schema:load db:fixtures:load
Each time you need to rebuild and re-populate your tables, an up-arrow and return keypress will get that routine job done. If there's no conflict in SQL execute statements, and if you don't have further migrations to run while you're project is in development state, the SQL statements will execute perhaps better than twice as fast as the Ruby syntax. Our tables rebuild and re-populate in 20 seconds this way for example, whereas the Ruby syntax increases the process to well over 50 seconds. If you're waiting on that data to refresh to perform further work (especially many times), this makes a huge difference in workflow.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart