Addressing migrations and schema changes when working with git branches [duplicate] - ruby-on-rails

I am working on a rails app with quite a few git branches and many of them include db migrations. We try to be careful but occasionally some piece of code in master asks for a column that got removed/renamed in another branch.
What would be a nice solution to "couple" git branches with DB states?
What would these "states" actually be?
We can't just duplicate a database if it's a few GBs in size.
And what should happen with merges?
Would the solution translate to noSQL databases as well?
We currently use MySQL, MongoDB and Redis.
EDIT: Looks like I forgot to mention a very important point, I am only interested in the development environment but with large databases (a few GBs in size).

When you add a new migration in any branch, run rake db:migrate and commit both the migration and db/schema.rb.
If you do this, in development, you'll be able to switch to another branch that has a different set of migrations and simply run rake db:schema:load.
Note that this will recreate the entire database, and existing data will be lost.
You'll probably only want to run production off of one branch which you're very careful with, so these steps don't apply there (just run rake db:migrate as usual there). But in development, it should be no big deal to recreate the database from the schema, which is what rake db:schema:load will do.

If you have a large database that you can't readily reproduce, then I'd recommend using the normal migration tools. If you want a simple process, this is what I'd recommend:
Before switching branches, roll back (rake db:rollback) to the state before the branch point. Then, after switching branches, run db:migrate. This is mathematically correct, and as long as you write down scripts (that is, every migration defines a working down method or a reversible change), it will work.
If you forget to do this before switching branches, in general you can safely switch back, rollback, and switch again, so I think as a workflow, it's feasible.
If you have dependencies between migrations in different branches... well, you'll have to think hard.
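The rollback step described above amounts to a set difference over migration versions. Here is a minimal sketch in plain Ruby, assuming migration files follow the usual TIMESTAMP_name.rb naming. The helper name versions_to_roll_back and the sample file names are made up for illustration.

```ruby
# Returns the migration versions that exist only on the current branch,
# newest first, which is the order in which they would be rolled back.
def versions_to_roll_back(current_files, target_files)
  extract = ->(files) { files.map { |f| File.basename(f)[/\A\d+/] }.compact }
  only_here = extract.call(current_files) - extract.call(target_files)
  only_here.sort_by(&:to_i).reverse
end

# Hypothetical contents of db/migrate on each branch.
current_branch_files = %w[
  db/migrate/20220101000000_create_users.rb
  db/migrate/20220301000000_add_nickname_to_users.rb
  db/migrate/20220401000000_rename_nickname.rb
]
target_branch_files = %w[db/migrate/20220101000000_create_users.rb]

versions_to_roll_back(current_branch_files, target_branch_files).each do |version|
  puts "rake db:migrate:down VERSION=#{version}"
end
```

Each printed command has to run before the checkout, while the migration files are still present in the working tree.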

Here's a script I wrote for switching between branches that contain different migrations:
https://gist.github.com/4076864
It won't solve all the problems you mentioned, but given a branch name it will:
Roll back any migrations on your current branch which do not exist on the given branch
Discard any changes to the db/schema.rb file
Check out the given branch
Run any new migrations existing in the given branch
Update your test database
I find myself manually doing this all the time on our project, so I thought it'd be nice to automate the process.
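The linked gist is not reproduced here, so the following is only a sketch of the five steps listed above, written as a function that returns the commands rather than running them. The name switch_plan is hypothetical, and the list of versions to roll back is assumed to have been computed beforehand.

```ruby
require 'shellwords'

# Builds the ordered list of shell commands for switching to target_branch.
# versions_to_roll_back: migration versions missing on the target, newest first.
def switch_plan(target_branch, versions_to_roll_back)
  plan = versions_to_roll_back.map do |version|
    "bundle exec rake db:migrate:down VERSION=#{version}"
  end
  plan << 'git checkout db/schema.rb'                          # discard schema changes
  plan << "git checkout #{Shellwords.escape(target_branch)}"   # switch branches
  plan << 'bundle exec rake db:migrate'                        # run the target's new migrations
  plan << 'bundle exec rake db:test:prepare'                   # update the test database
  plan
end

puts switch_plan('feature/login', ['20220401000000'])
```

Returning the plan as data makes it easy to print it for review before passing each entry to system.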

Separate Database for each Branch
It's the only way to fly.
Update October 16th, 2017
I returned to this after quite some time and made some improvements:
I've added another namespace rake task to create a branch and clone the database in one fell swoop, with bundle exec rake git:branch.
I realize now that cloning from master is not always what you want to do so I made it more explicit that the db:clone_from_branch task takes a SOURCE_BRANCH and a TARGET_BRANCH environment variable. When using git:branch it will automatically use the current branch as the SOURCE_BRANCH.
Refactoring and simplification.
config/database.yml
And to make it easier on you, here's how you update your database.yml file to dynamically determine the database name based on the current branch.
<%
  database_prefix = 'your_app_name'
  environments    = %W( development test )
  current_branch  = `git status | head -1`.to_s.gsub('On branch ','').chomp
%>
defaults: &defaults
  pool: 5
  adapter: mysql2
  encoding: utf8
  reconnect: false
  username: root
  password:
  host: localhost
<% environments.each do |environment| %>
<%= environment %>:
  <<: *defaults
  database: <%= [ database_prefix, current_branch, environment ].join('_') %>
<% end %>
lib/tasks/db.rake
Here's a Rake task to easily clone your database from one branch to another. It takes SOURCE_BRANCH and TARGET_BRANCH environment variables. Based off of @spalladino's task.
namespace :db do
  desc "Clones database from another branch as specified by `SOURCE_BRANCH` and `TARGET_BRANCH` env params."
  task :clone_from_branch do
    abort "You need to provide a SOURCE_BRANCH to clone from as an environment variable." if ENV['SOURCE_BRANCH'].blank?
    abort "You need to provide a TARGET_BRANCH to clone to as an environment variable." if ENV['TARGET_BRANCH'].blank?

    # CURRENT_BRANCH is set by git:branch (lib/tasks/git.rake).
    # Fall back to asking git when this task is run on its own.
    current_branch = defined?(CURRENT_BRANCH) ? CURRENT_BRANCH : `git rev-parse --abbrev-ref HEAD`.chomp

    database_configuration = Rails.configuration.database_configuration[Rails.env]
    current_database_name = database_configuration["database"]

    source_db = current_database_name.sub(current_branch, ENV['SOURCE_BRANCH'])
    target_db = current_database_name.sub(current_branch, ENV['TARGET_BRANCH'])

    mysql_opts = "-u #{database_configuration['username']} "
    mysql_opts << "--password=\"#{database_configuration['password']}\" " if database_configuration['password'].presence

    # The source database must exist and the target must not.
    `mysqlshow #{mysql_opts} | grep "#{source_db}"`
    raise "Source database #{source_db} not found" if $?.to_i != 0

    `mysqlshow #{mysql_opts} | grep "#{target_db}"`
    raise "Target database #{target_db} already exists" if $?.to_i == 0

    puts "Creating empty database #{target_db}"
    `mysql #{mysql_opts} -e "CREATE DATABASE #{target_db}"`

    puts "Copying #{source_db} into #{target_db}"
    `mysqldump #{mysql_opts} #{source_db} | mysql #{mysql_opts} #{target_db}`
  end
end
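The task above interpolates the username, password and database names straight into shell strings. As a hedged alternative, the command line could be assembled with Ruby's standard Shellwords module so that unusual characters are escaped. The helper names mysql_opts and clone_command below are illustrative and not part of the original task.

```ruby
require 'shellwords'

# Builds the mysql/mysqldump option list from a database.yml-style hash.
def mysql_opts(config)
  opts = ['-u', config.fetch('username')]
  password = config['password'].to_s
  opts << "--password=#{password}" unless password.empty?
  opts
end

# Returns the dump-and-restore pipeline as one shell-safe string.
def clone_command(source_db, target_db, config)
  opts = Shellwords.join(mysql_opts(config))
  "mysqldump #{opts} #{Shellwords.escape(source_db)} | " \
    "mysql #{opts} #{Shellwords.escape(target_db)}"
end

config = { 'username' => 'root', 'password' => '' }
puts clone_command('your_app_name_master_development',
                   'your_app_name_feature_development', config)
```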
lib/tasks/git.rake
This task will create a git branch off of the current branch (master, or otherwise), check it out and clone the current branch's database into the new branch's database. It's slick AF.
namespace :git do
  desc "Create a branch off the current branch and clone the current branch's database."
  task :branch do
    print 'New Branch Name: '
    new_branch_name = STDIN.gets.strip

    # Capture the branch name before checking out the new branch.
    CURRENT_BRANCH = `git status | head -1`.to_s.gsub('On branch ','').chomp

    puts "Creating new branch and checking it out..."
    sh "git checkout -b #{new_branch_name}"

    puts "Cloning database from #{CURRENT_BRANCH}..."
    ENV['SOURCE_BRANCH'] = CURRENT_BRANCH # Set source to be the current branch for clone_from_branch task.
    ENV['TARGET_BRANCH'] = new_branch_name
    Rake::Task['db:clone_from_branch'].invoke

    puts "All done!"
  end
end
Now, all you need to do is run bundle exec rake git:branch, enter the new branch name and start killing zombies.
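Both database.yml and the git:branch task derive the branch name from the first line of git status. Here is a small sketch of that parsing as a plain function. The detached-HEAD fallback value is an assumption added for illustration, since the original code does not handle that case.

```ruby
# Extracts the branch name from the first line of `git status` output.
# Returns `fallback` when the line is not of the form "On branch <name>",
# for example "HEAD detached at 1a2b3c4".
def branch_from_status(first_line, fallback: 'detached')
  match = first_line.to_s.match(/\AOn branch (.+)$/)
  match ? match[1].strip : fallback
end

# Joins the parts the same way the database.yml template does.
def database_name(prefix, branch, environment)
  [prefix, branch, environment].join('_')
end

branch = branch_from_status("On branch master\n")
puts database_name('your_app_name', branch, 'development')
```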

Perhaps you should take this as a hint that your development database is too big? If you can use db/seeds.rb and a smaller data set for development then your issue can be easily solved by using schema.rb and seeds.rb from the current branch.
That assumes that your question relates to development; I can't imagine why you'd need to regularly switch branches in production.

I was struggling with the same issue. Here is my solution:
Make sure that both schema.rb and all migrations are checked in by all developers.
There should be one person/machine for deployments to production. Let's call this machine the merge machine. When the changes are pulled to the merge machine, the auto-merge for schema.rb will fail. No issues. Just replace the content with whatever the previous contents of schema.rb were (you can put a copy aside or get it from GitHub if you use it ...).
Here is the important step. The migrations from all developers will now be available in the db/migrate folder. Go ahead and run bundle exec rake db:migrate. It will bring the database on the merge machine up to date with all changes. It will also regenerate schema.rb.
Commit and push the changes out to all repositories (remotes and individuals, which are remotes too). You should be done!

This is what I have done and I'm not quite sure that I have covered all the bases:
In development (using postgresql):
pg_dump db_name > tmp/branch1.sql
git checkout branch2
dropdb db_name
createdb db_name
psql db_name < tmp/branch2.sql # (from previous branch switch)
This is a lot faster than the rake utilities on a database with about 50K records.
For production, keep the master branch sacrosanct: all migrations are checked in and schema.rb is properly merged. Go through your standard upgrade procedure.

You want to preserve a "db environment" per branch. Look at smudge/clean script to point to different instances. If you run out of db instances, have the script spin off a temp instance so when you switch to a new branch, it's already there and just needs to be renamed by the script. DB updates should run just before you execute your tests.
Hope this helps.

I totally experience the pita you are having here. As I think about it, the real issue is that all the branches don't have the code to rollback certain branches. I'm in the django world, so I don't know rake that well. I'm toying with the idea that the migrations live in their own repo that doesn't get branched (git-submodule, which I recently learned about). That way all the branches have all the migrations. The sticky part is making sure each branch is restricted to only the migrations they care about. Doing/keeping track of that manually would be a pita and prone to error. But none of the migration tools are built for this. That is the point at which I am without a way forward.

I would suggest one of two options:
Option 1
Put your data in seeds.rb. A nice option is to create your seed data via the FactoryGirl or Fabrication gem. This way you can guarantee that the data is in sync with the code, assuming the factories are updated together with the addition/removal of columns.
After switching from one branch to another, run rake db:reset, which effectively drops/creates/seeds the database.
Option 2
Manually maintain the states of the database by always running rake db:rollback/rake db:migrate before/after a branch checkout. The caveat is that all your migrations need to be reversible, otherwise this won't work.

If you do a git pull, you should already have the latest schema, affected by any migrations that came in via the pull, but your database tables may not be updated.
So, you do need to run the migrations after pulling, but this will often change db/schema.rb.
If all you've done is pull and migrate, there's no reason you should be responsible for committing any of the resultant schema changes, as they don't technically belong to you and they may end up being extraneous/incorrect.
Resetting the schema diff makes the most sense.
Here is my step by step version of what to do before creating a new branch
Switch to your parent/base branch
Pull the latest code
Run bundle exec rake db:migrate to update your schema.rb file locally
Do a git checkout db/schema.rb to throw away the changes brought by db:migrate if any
Create your new branch and switch to it
Make sure to commit your changes before switching to another branch
Adapted from here

On development environment:
You should use rake db:migrate:redo to test whether your migrations are reversible, and always keep a seeds.rb that populates the data.
If you work with git, your seeds.rb should change along with any migration change, and db:migrate:redo should be run from the beginning (to load the data for a new development setup on another machine or a new database).
Apart from change, writing your own up and down methods means your code always covers both scenarios: applying the change at this moment and starting from zero.

Related

How to correctly deal with rails schema source control branching?

I have a versioned schema.rb that reflects the master branch migrations.
Whenever I go to another branch and run a new migration in that branch, my database structure is changed.
If I go back to the master branch and run db:migrate, the changes from the other branch's migration are added to the schema, despite not belonging to the master branch.
I know this is the way rails works, but how do you devs deal with this?
I heard some people just schema:load to have local database synched with schema, but since it erases all data, it's not a good solution for me because I use a heavy sql dump that helps the development (it is very painful and prone to risks to develop without this data)
Having a production dump for development (using the browser/rails server) is great, but you should probably have a test environment setup that uses a minimal seeded database.
You can get this by doing RAILS_ENV=test rake db:schema:load on an empty (new) DB.
What I'd normally do is something like:
git checkout db/schema.rb
RAILS_ENV=test rake db:reset
RAILS_ENV=test rake db:migrate
and load fixtures as needed. That will leave you with a schema that should just have your migration changes and a changed timestamp. (Unless you have stuff like setting changes hardcoded into your schema, in which case they'll be lost - we monkeypatched ActiveRecord::SchemaDumper to avoid this).
After you've made your migration and done this, commit the schema.rb and migration file before you apply the migration to development. You can then discard the schema.rb with the unwanted changes.
Found this question as I've scoured the web dealing with exactly the same issue - large production dump, different migrations on different branches.
The options seem to be:
Meticulously make sure your migrations are reversible and roll them back if needed before switching branches - can be done, but not practical for me as I was given a nasty database and have needed to use raw sql to rework it.
Fresh dump from production every time a branch is switched - not practical for local development
Generate a new database and seed it with sample data for every git branch - not great for realistic testing in staging
I've been doing #1 for a long time and am looking for a better way. I've decided to do #2 when needed for staging, since it doesn't get touched too often, and do #3 locally.
The local solution is the interesting part - here's what I have (will edit as I improve on it):
database.yml - we want to work on a different db depending on what git branch we're on (tweaked from another stackoverflow answer I've lost the link to):
<%
  branch = `git rev-parse --abbrev-ref HEAD`.strip rescue nil
  use_single_db = !branch || branch == 'master'
  branch_spec = (use_single_db ? "" : "_#{branch}").underscore.gsub(/[\.\/\-]/, '_')
%>
default: &default
  host: '127.0.0.1'
  adapter: mysql2
  encoding: utf8
  pool: 10
  port: 3306
  username: 'root'
  password: 'password'
  database: tmp_<%= branch_spec %>
development:
  <<: *default
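The branch_spec expression above relies on ActiveSupport's underscore. A rough plain-Ruby equivalent is sketched below. It assumes that lowercasing is an acceptable stand-in for underscore (so CamelCase is not split) and, as an added assumption, replaces any character outside a-z, 0-9 and _ rather than only dots, slashes and dashes. The name branch_suffix is hypothetical.

```ruby
# Returns the database-name suffix for a branch: empty for master or when
# the branch is unknown, otherwise "_" plus a sanitized branch name.
def branch_suffix(branch)
  return '' if branch.nil? || branch.empty? || branch == 'master'
  '_' + branch.downcase.gsub(/[^a-z0-9_]/, '_')
end

puts "tmp_#{branch_suffix('feature/JIRA-12.fix')}"
```

On master the result is just the bare prefix, which matches the single-database behaviour of the template.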
This does mean, every time we create a new branch, we'll have to create a database and seed it. You'll need to populate seeds.rb for your use case - tedious, but necessary. (You mentioned that there's quite a bit of data you need for development - you can run all sorts of ruby in this file, perhaps you load the big stuff you need from a csv or something?) One thing I make sure to do in seeds.rb is to populate the schema_migrations table with the current migrations on the branch (and any other appropriate edits to seeds.rb) for when that is mysteriously NOT loaded from structure.sql. More on that at the end.
conn = ActiveRecord::Base.connection
schema_migrations_versions = [
  ...
  20160407183816,
  20220317151510,
  20220414173833,
  20220425184402,
  20220711221447,
  20220712172621,
]
if conn.execute("SELECT * FROM schema_migrations").to_a.length == 0
  conn.execute("INSERT INTO schema_migrations (version) VALUES (#{schema_migrations_versions.join('),(')})")
end
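The INSERT statement in the seed snippet is built by joining the versions with "),(". Here is a sketch of the same idea as a standalone function that also validates its input and copes with an empty list. The function name is made up. Quoting the values as strings is an assumption based on version being a string column in Rails' schema_migrations table.

```ruby
# Builds a single multi-row INSERT for schema_migrations, or returns nil
# when there is nothing to insert. Non-numeric versions raise ArgumentError.
def schema_migrations_insert(versions)
  clean = versions.map { |v| Integer(v.to_s, 10) }.uniq.sort
  return nil if clean.empty?
  values = clean.map { |v| "('#{v}')" }.join(',')
  "INSERT INTO schema_migrations (version) VALUES #{values}"
end

puts schema_migrations_insert([20220317151510, 20160407183816])
```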
To actually do the creation/seeding, run the following (I made a bash script so that this is a quick, one-time command):
bin/rails db:create
bin/rails db:structure:load # you might need db:schema:load if you're using schema.rb
bin/rails db:environment:set RAILS_ENV=development
mv db/migrate db/migrate_x
bin/rails db:seed
mv db/migrate_x db/migrate
Why the hacky mv command you ask? We have a chicken-and-egg problem:
The schema.rb or structure.sql file is the 'source of truth' for the database structure at any given time - presumably, any migrations you have committed to the repo have been run on the database and this file has been updated with them. When you run db:structure:load, it creates all your tables as if all the migrations have been run.
Great, let's seed the database. But then Rails complains that you haven't run your migrations on it yet, because the schema_migrations table is empty, because we haven't seeded the database yet. This is supposed to get added from structure.sql but I find that most of the time (all the time?) it doesn't actually get added, and I haven't been able to figure out why.
Ok, so run migrations first. But unless you carefully create your migrations to check whether they should be run ("create if not exists" type stuff), your migrations will fail because that table exists, that column exists, or doesn't exist... can't do it.
It is possible to set a rails setting to just not check for migrations, but it seems equally annoying to temporarily edit the config just to seed the database, so I have opted for moving the migrations somewhere rails can't find them for the duration of the seeding.
This seems more like a git problem than a Rails one. Git allows for branch changes with new files, since they would cause no conflict with the master branch, or whatever branch you're checking out. In Rails, creating a migration creates a new untracked file in the db/migrate directory. Since it is a new file, you are able to checkout the master branch with no issue.
~/code/node/server (test)
$ git checkout -b dev
Switched to a new branch 'dev'
~/code/node/server (dev)
$ touch test.txt
~/code/node/server (dev)
$ git checkout test
Switched to branch 'test'
~/code/node/server (test)
$ git status
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
test.txt
nothing added to commit but untracked files present (use "git add" to track)
In this example, I start on branch 'test' and then checkout a new branch called 'dev'. I then create a test file and make sure not to start tracking it. Then checkout the 'test' branch again. (Notice no issue here). git status reveals that the untracked file followed me back to the 'test' branch.
If you want the migration file to stay local to the non-master branch where it was created, you can do a few things:
You can stage and commit this file to the non-master branch. After you do this, the file will reside on that branch and not follow you back to the master branch (or whatever other branch you decide to checkout).
You can stage this file, and then stash it:
git add path_to/your_file
git stash
To retrieve the stash later, run git stash pop.
For more information, see the git stash documentation.
Delete the file.
If you do accidentally migrate on a branch you didn't mean to, you can always rollback the migration using bundle exec rake db:rollback until you get to the desired schema state.
Hopefully this helps, and happy coding!

Why does schema.rb update after I run db:migrate for Rails?

My understanding is that whoever created the migrations should've also updated schema.rb. Since I've pulled the migrations, I should've also pulled the updated schema.rb. However, once in a while, schema.rb updates after I run bundle exec rake db:migrate.
My current workflow is:
git pull --rebase origin master --prune
rails s
Rails tells me to migrate
bundle exec rake db:migrate
Realize schema.rb updated
At this point, I'm pretty sure I'm not supposed to check in the updated schema.rb. I'd manually revert it through git checkout origin/master db/schema.rb.
So what went wrong in this case? Did a co-worker forget to run migrations after creating them? Did I do something wrong?
As far as I know schema can change after running rails db:migrate because of:
A co-worker did not commit the schema.rb, so when you fetched and ran the migrations you got the diff.
A different DB version is running on your local machine. Depending on the DB configuration, the generated schema may differ accordingly.
Running git diff will help you understand what is going on.
schema.rb retains two key sets of data:
a description of all the tables in your app database structure
the version of the most recent migration that has been applied (the full list of applied migrations lives in the database's schema_migrations table).
If a new developer were to join your team, they should be able to run rake db:schema:load and get an up-to-date database structure straight away. That's far more efficient and reliable than expecting them to run through all the migrations manually.
Running rake db:migrate, even if there are no outstanding migrations that need running, will always regenerate db/schema.rb. Most of the time you won't notice because the file will be the same – but you may get differences in whitespace formatting or column order.
The best practice (IMHO) should always be to check in the updated db/schema.rb in the same commit as any migrations you've added.
When fetching or pulling branches to your local machine, running rake db:migrate will apply whatever new migrations need to be run based on the records in your local database's schema_migrations table. After that, your new db/schema.rb should be the same as the one you pulled down – but if it isn't, git diff will show you what the difference is.
You can then make a judgement call as to what the best course of action is. If the only difference is cosmetic, I personally tend to revert the unstaged changes and leave the committed version untouched until the next migration.
All the above also applies if you have switched to a SQL-based structure file (db/structure.sql) by specifying config.active_record.schema_format = :sql in config/application.rb.
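To help with the judgement call on cosmetic differences, here is a rough sketch that compares two versions of a schema file while ignoring whitespace and blank lines. It does not detect reordered columns, so treat it as a first filter rather than a definitive check. The function names are illustrative.

```ruby
# Strips indentation, collapses runs of whitespace and drops blank lines.
def normalized_schema(text)
  text.lines.map { |line| line.strip.gsub(/\s+/, ' ') }.reject(&:empty?)
end

# True when the two texts differ, but only in whitespace or blank lines.
def cosmetic_difference_only?(committed, regenerated)
  committed != regenerated &&
    normalized_schema(committed) == normalized_schema(regenerated)
end

committed   = "create_table \"users\" do |t|\n  t.string   \"name\"\nend\n"
regenerated = "create_table \"users\" do |t|\n  t.string \"name\"\n\nend\n"
puts cosmetic_difference_only?(committed, regenerated)
```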

Why 'git checkout .' doesn't undo the changes made by rake 'db:rollback'?

I created a rails application using scaffolding and migrated the database.
I then committed to a local repository with git commit -m "First commit".
Then I rolled back the database using rake db:rollback and the application stopped working.
I tried to undo this using git checkout . but the application still wasn't working until I migrated the database again using rake db:migrate.
Why is this happening?
Rails' migration mechanism checks for a specific table in your db which shows which migrations are applied to your db and which are pending migrations (from the files that are present but without an entry).
When you perform a db:migrate or a db:rollback this table is also updated.
The db files are not inside your repository (and shouldn't be), so you can not undo these changes by git.
You need to use the tools provided by the rake tasks.
Running a rake -T db will give you a full list of the tools you have to manipulate your migrations and db status.
If you want to redo or change the migration you need to create another one,
for more info check this.
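The bookkeeping described in this answer can be illustrated without a database: a migration is "up" when its version has an entry in the tracking table and "down" (pending) when only the file exists. This is a simplified sketch with made-up version numbers, loosely modelled on the output of rake db:migrate:status.

```ruby
# file_versions:    versions found in db/migrate (tracked by git)
# applied_versions: versions recorded in the database (not tracked by git)
def migration_status(file_versions, applied_versions)
  applied = applied_versions.map(&:to_s)
  file_versions.map(&:to_s).sort.map do |version|
    [applied.include?(version) ? 'up' : 'down', version]
  end
end

# After db:rollback the newest version is removed from the table, and
# `git checkout .` cannot restore it because the table lives in the database.
migration_status(%w[20220101000000 20220301000000], %w[20220101000000]).each do |status, version|
  puts format('%-5s %s', status, version)
end
```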

Rails 3 and Heroku: automatically "rake db:migrate" on push?

I have a slight annoyance with my heroku push/deploy process, which otherwise has been a joy to discover and use.
If I add a new migration to my app, the only way I can get it up onto the heroku server is to do a push to the heroku remote. This uploads it and restarts the app. But it doesn't run the migration, so I have to do heroku rake db:migrate --app myapp, then heroku restart --app myapp. In the meantime, the app is broken because it hasn't run the migrations and the code is referring to fields/tables etc in the migration.
There must be a way to change the deployment process to run the rake db:migrate automatically as part of the deploy process but I can't work it out.
Is it something I set in a heroku cpanel? Is it an option I pass to heroku from the command line? Is it a git hook? Can anyone set me straight? thanks, max
Heroku now has the ability to handle this as part of their "release phase" feature.
You can add a process called release to your Procfile and that will be run during each and every deploy.
Rails >= 5 Example
release: bundle exec rails db:migrate
Rails < 5 example
release: bundle exec rake db:migrate
What about this simple command chaining solution:
git push heroku master && heroku run rake db:migrate
It will automatically run the migration as soon as the push finishes successfully. The delay is typically 1-2 seconds or less.
Here is a rake task that wraps up everything into a one-liner (and also supports rollback):
https://gist.github.com/362873
You still might wind up deploying on top of your boss's demo, but at least you don't waste time typing between the git push and the rake db:migrate.
I created a custom buildpack that gets Heroku to run rake db:migrate for you automatically on deployment. It's just a fork of Heroku's default Ruby buildpack, but with the rake db:migrate task added.
To use it with your app you'd do this:
heroku config:set BUILDPACK_URL=https://github.com/dtao/rake-db-migrate-buildpack
Also note that in order for it to work, you need to enable the user-env-compile Heroku Labs feature. Here's how you do that:
heroku labs:enable user-env-compile
Perhaps you could try separating your schema commits (migrations, etc.) from code commits (models, validations, etc.).
(Note the following assumes your migration changes are NOT destructive, as you've indicated covers most of your use cases.)
Your deploy process could then be:
Push schema changes to Heroku
migrate
Push application code to Heroku
This is of course far from optimal, but is an effective way to avoid downtime in the situation you've described: by the time the app receives the code for the dynamic fields, the DB will already have migrated.
(Of course, the simplest solution would be to simply push and migrate while your boss is out to lunch ;-D)
Otherwise, even if schema modifications were carried out automatically you'd still run the risk of a request passing through right before the migrations have been run.
Just for those googling folks like me I want to give a plain solution here.
I am using Rails 4 and needed to add a simple Rake task to the deployment to heroku. As I am using the 'deploy to heroku' button in github there is no chance to run "heroku run ..." immediately after deployment.
What I did: I extended the standard Rake task 'assets:clean' that is automatically run during a deployment to heroku. The task still runs normally but I have attached my own stuff to its end. This is done with the 'enhance' method. In the example below I add a db:migrate because this is probably what most people want:
# in lib/tasks/assets_clean_enhance.rake
Rake::Task['assets:clean'].enhance do
Rake::Task['db:migrate'].invoke
end
I admit that this is no perfect solution. But the heroku Ruby buildpack still does not support any other way. And writing my own buildpack seemed a bit of an overkill for so simple a thing.
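To see the ordering that enhance gives you, here is a self-contained sketch that uses stand-in tasks instead of the real assets:clean and db:migrate. The task names and the LOG constant are made up for the demonstration. A block passed to enhance is appended to the task's actions, so it runs after the original body.

```ruby
require 'rake'

LOG = []

# Stand-ins for the real tasks, which need a Rails app to run.
Rake::Task.define_task(:demo_assets_clean) { LOG << 'assets:clean' }
Rake::Task.define_task(:demo_db_migrate)   { LOG << 'db:migrate' }

# Append extra work to the end of the existing task.
Rake::Task[:demo_assets_clean].enhance do
  Rake::Task[:demo_db_migrate].invoke
end

Rake::Task[:demo_assets_clean].invoke
puts LOG.inspect
```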
I use a rake task to put the app in maintenance mode, push, migrate and move it off maintenance mode.
I wrote the SmartMigrate buildpack, a simple Heroku buildpack that warns of pending migrations after a ruby build whenever new migrations are detected. This buildpack is intended to be part of a Multipack that has a preceding Ruby buildpack.
With due respect to other solutions here, this buildpack has 3 advantages over those:
No need for maintenance mode
No need for out-dated ruby buildpack forks that just insert the migration at the end
No need to run migrations ALL THE TIME, a warning is only displayed if new migrations are detected since the last deployment
I think David Sulc's approach is the only one which ensures that you avoid requests getting through while the app is in a broken state.
It is a bit of a pain, but may be necessary in some circumstances.
As he stated, it does require that the db migrations are non-destructive.
However, it can be difficult to push your migrations and schema changes prior to the rest of the code, as the obvious approach ('git push heroku {revnum}') relies on you having checked the migrations in before the rest of the code.
If you haven't done that, it's still possible to do this using a temporary branch:
Create a branch, based at the git revision that you most recently pushed to heroku:
git branch <branchname> <revnum-or-tag>
Check out that branch:
git checkout <branchname>
If your db migration commits only contained migrations, and no code changes, cherry-pick the commits that contain the database changes:
git cherry-pick <revnum1> <revnum2>...
If you committed your db changes in revisions which also contained code changes, you can use 'git cherry-pick -n' which won't automatically commit; use 'git reset HEAD ' to remove the files that aren't db changes from the set of things that are going to be committed. Once you've got just the db changes, commit them in your temporary branch.
git cherry-pick -n <revnum1> <revnum2>...
git reset HEAD <everything that's modified except db/>
git status
... check that everything looks ok ...
git commit
Push this temporary branch to heroku (ideally to a staging app to check that you've got it right, since avoiding downtime is the whole point of jumping through these hoops)
git push heroku <branchname>:master
Run the migrations
heroku run rake db:migrate
At this point, you might think that you could just push 'master' to heroku to get the code changes across. However, you can't, as it isn't a fast-forward merge. The way to proceed is to merge the remainder of 'master' into your temporary branch, then merge it back to master, which recombines the commit histories of the two branches:
git checkout <branchname>
git merge master
git diff <branchname> master
... shouldn't show any differences, but just check to be careful ...
git checkout master
git merge <branchname>
Now you can push master to heroku as normal, which will get the rest of your code changes across.
In the second-to-last step, I'm not 100% sure whether merging master to {branchname} is necessary. Doing it that way should ensure that a 'fast-forward' merge is done, which keeps git happy when you push to heroku, but it might be possible to get the same result by just merging {branchname} to master without that step.
Of course, if you aren't using 'master', substitute the appropriate branch name in the relevant places above.
I've been using the heroku_san gem as my deployment tool for a while. It is a nice small, focused tool for the push + migration. It adds some other rake commands that make accessing other functions (like console) easy. Beyond not having to remember database migrations, my favorite feature is its Heroku configuration file – so I can name all my servers (production, staging, playground4, shirley) whatever I want – and keep them straight in my head.

What is the preferred way to manage schema.rb in git?

I don't want to add schema.rb to .gitignore, because I want to be able to load a new database schema from that file. However, keeping it checked in is causing all sorts of spurious conflicts that are easily resolved by a fresh db:migrate:reset.
Basically I want a way to:
Keep schema.rb in the repository for deploy-time database setup
Keep schema.rb in '.gitignore' for general development
There would be one or two people responsible for updating schema.rb and knowing that it was correct.
Is there a way I can have my cake and eat it, too?
I'm afraid the magic solution you're looking for does not exist. This file is normally managed in version control, then for any conflicts on the version line just choose the later of the two dates. As long as you're also running all of the associated migrations nothing should get out of sync this way. If two developers have caused modifications to a similar area of schema.rb and you get conflicts in addition to the version then you are faced with a normal merge conflict resolution, but in my opinion these are normally easy to understand and resolve. I hope this helps some!
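The "choose the later of the two dates" rule for the version line can be expressed in a few lines. This sketch assumes the two conflicting lines have already been pulled out of the conflict markers. It accepts both the older :version => 123 and the newer version: 2022_07_12_172621 spellings. The function name is hypothetical.

```ruby
# Returns the larger of the two schema versions found in the given lines.
def newer_schema_version(ours_line, theirs_line)
  pattern = /version(?:\s*=>|:)\s*([\d_]+)/
  [ours_line, theirs_line].map { |line| line[pattern, 1].to_s.delete('_').to_i }.max
end

ours   = 'ActiveRecord::Schema.define(version: 2022_07_11_221447) do'
theirs = 'ActiveRecord::Schema.define(version: 2022_07_12_172621) do'
puts newer_schema_version(ours, theirs)
```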
One other thing you can do is use:
git update-index --assume-unchanged /path/schema.rb
This will keep the file in the repository but won't track changes. you can switch the tracking anytime by using:
git update-index --no-assume-unchanged /path/schema.rb
What has worked really well for me is to delete and .gitignore schema.rb and then have it regenerated for each developer when they rake db:migrate.
You can still achieve what you wanted without migrating from 0 and risking broken migrations from years ago by simply doing a "roll-up" of the migrations periodically. You can do this by:
Run all outstanding migrations with rake db:migrate
Taking the contents of your schema.rb in the ActiveRecord::Schema.define block
Paste it into your initial_schema migration inside def up (overwriting what's already there)
Delete all other migrations
Now your initial_schema migration is your starting point for new systems and you don't have to worry about conflicts in schema.rb that may not be resolved correctly. It's not magical, but it works.
Would it be sufficient to do a rake db:dump in a pre-commit git hook?
The following won't necessarily fix (1) or (2), but it might take care of the merging issue, and then maybe (1) and (2) go away.
Instead of using .gitignore, use separate branches: Develop which omits schema.rb and Test and Deploy which include schema.rb. Only make code changes in the Develop branches and never merge from Test into Develop. Keep schema.rb in a separate branch:
Developer A
  Develop      --------
  Local Schema          \             Your Repo
  Test                   --------->     Dev A
                         --------->     Dev B
Developer B             /               Master
  Develop      --------                 Schema
  Local Schema                          Test
  Test                                  Deploy
In Git, branches are pointers to collections of file contents, so they can include or exclude particular files as well as track file versions. This makes them flexible tools for building your particular workflow.
You could define a merge strategy.
I've found this solution, but don't remember the source:
[merge "railsschema"]
name = newer Rails schema version
driver = "ruby -e '\n\
system %(git), %(merge-file), %(--marker-size=%L), %(%A), %(%O), %(%B)\n\
b = File.read(%(%A))\n\
b.sub!(/^<+ .*\\nActiveRecord::Schema\\.define.:version => (\\d+). do\\n=+\\nActiveRecord::Schema\\.define.:version => (\\d+). do\\n>+ .*/) do\n\
%(ActiveRecord::Schema.define(:version => #{[$1, $2].max}) do)\n\
end\n\
File.open(%(%A), %(w)) {|f| f.write(b)}\n\
exit 1 if b.include?(%(<)*%L)'"
put this "somewhere" and
git config --global core.attributesfile "somewhere"
I built a gem to solve this problem.
It sorts columns, index names and foreign keys, removes excess whitespace and runs Rubocop for some formatting to unify the output of your schema.rb file.
https://github.com/jakeonrails/fix-db-schema-conflicts
After you add it to your Gemfile you just run rake db:migrate or rake db:schema:dump like normal.
Commit schema.rb file.
Run git pull (or continue with what you're doing)
Every time you migrate the database, the schema.rb file updates and appears in git status. When working on something and occasionally doing git pull, this can be annoying because you have to commit the schema.rb file before pulling to resolve the conflict. This means that every time you migrate the database, you need to commit the schema.rb file.
schema.rb should be tracked in Git, of course.
I've just released this gem that can solve an issue with "conflicts" between branches for good.
The idea of that gem is simple. It keeps all migrated migrations inside a tmp folder so that Git ignores them. They are only part of your local setup. These files are needed to roll back the "unknown" migrations that belong to another branch. Now, whenever you have an inconsistent DB schema due to running migrations in some other branch, just run rails db:migrate inside the current branch and it will fix the issue automatically. The gem does all this magic for you.

Resources