My understanding is that whoever created the migrations should've also updated schema.rb. Since I've pulled the migrations, I should've also pulled the updated schema.rb. However, once in a while, schema.rb updates after I run bundle exec rake db:migrate.
My current workflow is:
git pull --rebase origin master --prune
rails s
Rails tells me to migrate
bundle exec rake db:migrate
Realize schema.rb updated
At this point, I'm pretty sure I'm not supposed to check in the updated schema.rb. I'd manually revert it through git checkout origin/master db/schema.rb.
So what went wrong in this case? Did a co-worker forget to run migrations after creating them? Did I do something wrong?
As far as I know, the schema can change after running rails db:migrate because:
A co-worker did not commit schema.rb, so when you fetched and ran the migrations you got the diff.
A different DB version is running on your local machine. Depending on the DB configuration, the schema may change accordingly.
Running git diff will help you understand what is going on.
schema.rb retains two key pieces of data:
a description of all the tables in your app's database structure
the version number of the latest migration that has been applied.
If a new developer were to join your team, they should be able to run rake db:schema:load and get an up-to-date database structure straight away. That's far more efficient and reliable than expecting them to run through all the migrations manually.
Running rake db:migrate, even if there are no outstanding migrations that need running, will always regenerate db/schema.rb. Most of the time you won't notice because the file will be the same – but you may get differences in whitespace formatting or column order.
The best practice (IMHO) should always be to check in the updated db/schema.rb in the same commit as any migrations you've added.
When fetching or pulling branches to your local machine, running rake db:migrate will apply whatever new migrations need to be run based on the records in your local database's schema_migrations table. After that, your new db/schema.rb should be the same as the one you pulled down – but if it isn't, git diff will show you what the difference is.
You can then make a judgement call as to what the best course of action is. If the only difference is cosmetic, I personally tend to revert the unstaged changes and leave the committed version untouched until the next migration.
All the above also applies if you have switched to a SQL-based structure file (db/structure.sql) by specifying config.active_record.schema_format = :sql in config/application.rb.
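To make the "cosmetic only" judgement call less error-prone, the comparison can be sketched in a few lines of Ruby. This is a minimal sketch, not part of Rails: the helper name same_schema_ignoring_whitespace? is hypothetical, and it only catches whitespace differences, not reordered columns.

```ruby
# Hypothetical helper (not a Rails API): returns true when two schema dumps
# are identical once whitespace differences are ignored.
def same_schema_ignoring_whitespace?(committed, regenerated)
  normalize = lambda do |text|
    text.lines
        .map { |line| line.strip.gsub(/\s+/, " ") } # collapse runs of whitespace
        .reject(&:empty?)                           # ignore blank lines
  end
  normalize.call(committed) == normalize.call(regenerated)
end

# Possible usage from the project root (assumes git is on the PATH):
#   committed   = `git show HEAD:db/schema.rb`
#   regenerated = File.read("db/schema.rb")
#   puts same_schema_ignoring_whitespace?(committed, regenerated)
```

If it returns true, reverting the working copy with git checkout db/schema.rb loses nothing of substance.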
I am working on a rails app with quite a few git branches and many of them include db migrations. We try to be careful but occasionally some piece of code in master asks for a column that got removed/renamed in another branch.
What would be a nice solution to "couple" git branches with DB states?
What would these "states" actually be?
We can't just duplicate a database if it's a few GBs in size.
And what should happen with merges?
Would the solution translate to noSQL databases as well?
We currently use MySQL, mongodb and redis
EDIT: Looks like I forgot to mention a very important point, I am only interested in the development environment but with large databases (a few GBs in size).
When you add a new migration in any branch, run rake db:migrate and commit both the migration and db/schema.rb
If you do this, in development, you'll be able to switch to another branch that has a different set of migrations and simply run rake db:schema:load.
Note that this will recreate the entire database, and existing data will be lost.
You'll probably only want to run production off of one branch which you're very careful with, so these steps don't apply there (just run rake db:migrate as usual there). But in development, it should be no big deal to recreate the database from the schema, which is what rake db:schema:load will do.
If you have a large database that you can't readily reproduce, then I'd recommend using the normal migration tools. If you want a simple process, this is what I'd recommend:
Before switching branches, roll back (rake db:rollback) to the state before the branch point. Then, after switching branches, run rake db:migrate. This is mathematically correct, and as long as your migrations have working down scripts, it will work.
If you forget to do this before switching branches, in general you can safely switch back, rollback, and switch again, so I think as a workflow, it's feasible.
If you have dependencies between migrations in different branches... well, you'll have to think hard.
Here's a script I wrote for switching between branches that contain different migrations:
https://gist.github.com/4076864
It won't solve all the problems you mentioned, but given a branch name it will:
Roll back any migrations on your current branch which do not exist on the given branch
Discard any changes to the db/schema.rb file
Check out the given branch
Run any new migrations existing in the given branch
Update your test database
I find myself manually doing this all the time on our project, so I thought it'd be nice to automate the process.
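The first and fourth steps above come down to comparing two sets of migration versions. The following is a minimal sketch of that comparison only, not the gist's actual code; the names migration_version and migration_plan are hypothetical, and the file lists are assumed to be supplied by the caller.

```ruby
# Extract the numeric version prefix from a migration filename,
# e.g. "db/migrate/20130615132808_add_users.rb" -> "20130615132808".
def migration_version(path)
  File.basename(path)[/\A\d+/]
end

# Given the migration files on the current and the target branch, work out
# which versions to roll back before the checkout and which to run after it.
def migration_plan(current_files, target_files)
  current = current_files.map { |f| migration_version(f) }.compact
  target  = target_files.map { |f| migration_version(f) }.compact
  {
    rollback: (current - target).sort_by(&:to_i).reverse, # newest first
    migrate:  (target - current).sort_by(&:to_i)          # oldest first
  }
end

# Possible usage (assumes git is on the PATH and "other-branch" exists):
#   current = Dir["db/migrate/*.rb"]
#   target  = `git ls-tree --name-only other-branch db/migrate/`.split("\n")
#   p migration_plan(current, target)
```

Each version in the rollback list could then be reverted with rake db:migrate:down VERSION=... while the current branch is still checked out, since that is where the migration files live.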
Separate Database for each Branch
It's the only way to fly.
Update October 16th, 2017
I returned to this after quite some time and made some improvements:
I've added another namespace rake task to create a branch and clone the database in one fell swoop, with bundle exec rake git:branch.
I realize now that cloning from master is not always what you want to do so I made it more explicit that the db:clone_from_branch task takes a SOURCE_BRANCH and a TARGET_BRANCH environment variable. When using git:branch it will automatically use the current branch as the SOURCE_BRANCH.
Refactoring and simplification.
config/database.yml
And to make it easier on you, here's how you update your database.yml file to dynamically determine the database name based on the current branch.
<%
  database_prefix = 'your_app_name'
  environments    = %W( development test )
  current_branch  = `git status | head -1`.to_s.gsub('On branch ','').chomp
%>
defaults: &defaults
  pool: 5
  adapter: mysql2
  encoding: utf8
  reconnect: false
  username: root
  password:
  host: localhost
<% environments.each do |environment| %>
<%= environment %>:
  <<: *defaults
  database: <%= [ database_prefix, current_branch, environment ].join('_') %>
<% end %>
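One thing to watch with this approach is that branch names such as feature/login-form contain characters that are awkward in database names. A minimal sketch of a sanitizing step follows; the helper name branch_database_name is hypothetical, and replacing every non-alphanumeric run with an underscore is just one possible convention.

```ruby
# Hypothetical helper: build a database name from a prefix, a git branch
# name and an environment, replacing characters that are awkward in
# database names (slashes, dashes, dots) with underscores.
def branch_database_name(prefix, branch, environment)
  safe_branch = branch.downcase
                      .gsub(/[^a-z0-9]+/, "_") # non-alphanumeric runs -> "_"
                      .gsub(/\A_+|_+\z/, "")   # trim leading/trailing "_"
  [prefix, safe_branch, environment].join("_")
end

puts branch_database_name("your_app_name", "feature/login-form", "development")
# prints "your_app_name_feature_login_form_development"
```

The same expression could be used inside the ERB block instead of the plain join. Long branch names may still exceed the database's identifier length limit, which this sketch does not handle.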
lib/tasks/db.rake
Here's a Rake task to easily clone your database from one branch to another. It takes SOURCE_BRANCH and TARGET_BRANCH environment variables. Based off of @spalladino's task.
namespace :db do
  desc "Clones database from another branch as specified by `SOURCE_BRANCH` and `TARGET_BRANCH` env params."
  task :clone_from_branch do
    abort "You need to provide a SOURCE_BRANCH to clone from as an environment variable." if ENV['SOURCE_BRANCH'].blank?
    abort "You need to provide a TARGET_BRANCH to clone to as an environment variable." if ENV['TARGET_BRANCH'].blank?
    # Use CURRENT_BRANCH when git:branch has set it; otherwise ask git,
    # so the task also works when it is run on its own.
    current_branch = defined?(CURRENT_BRANCH) ? CURRENT_BRANCH : `git status | head -1`.to_s.gsub('On branch ','').chomp
    database_configuration = Rails.configuration.database_configuration[Rails.env]
    current_database_name = database_configuration["database"]
    # Derive both database names by swapping the branch part of the current name.
    source_db = current_database_name.sub(current_branch, ENV['SOURCE_BRANCH'])
    target_db = current_database_name.sub(current_branch, ENV['TARGET_BRANCH'])
    mysql_opts = "-u #{database_configuration['username']} "
    mysql_opts << "--password=\"#{database_configuration['password']}\" " if database_configuration['password'].presence
    # The source database must exist and the target must not.
    `mysqlshow #{mysql_opts} | grep "#{source_db}"`
    raise "Source database #{source_db} not found" if $?.to_i != 0
    `mysqlshow #{mysql_opts} | grep "#{target_db}"`
    raise "Target database #{target_db} already exists" if $?.to_i == 0
    puts "Creating empty database #{target_db}"
    `mysql #{mysql_opts} -e "CREATE DATABASE #{target_db}"`
    puts "Copying #{source_db} into #{target_db}"
    `mysqldump #{mysql_opts} #{source_db} | mysql #{mysql_opts} #{target_db}`
  end
end
lib/tasks/git.rake
This task will create a git branch off of the current branch (master, or otherwise), check it out and clone the current branch's database into the new branch's database. It's slick AF.
namespace :git do
  desc "Create a branch off the current branch and clone the current branch's database."
  task :branch do
    print 'New Branch Name: '
    new_branch_name = STDIN.gets.strip
    # Remember the branch we start from before checking out the new one.
    CURRENT_BRANCH = `git status | head -1`.to_s.gsub('On branch ','').chomp
    puts "Creating new branch and checking it out..."
    sh "git checkout -b #{new_branch_name}"
    puts "Cloning database from #{CURRENT_BRANCH}..."
    ENV['SOURCE_BRANCH'] = CURRENT_BRANCH # Set source to be the current branch for clone_from_branch task.
    ENV['TARGET_BRANCH'] = new_branch_name
    Rake::Task['db:clone_from_branch'].invoke
    puts "All done!"
  end
end
Now, all you need to do is run bundle exec rake git:branch, enter the new branch name and start killing zombies.
Perhaps you should take this as a hint that your development database is too big? If you can use db/seeds.rb and a smaller data set for development then your issue can be easily solved by using schema.rb and seeds.rb from the current branch.
That assumes that your question relates to development; I can't imagine why you'd need to regularly switch branches in production.
I was struggling with the same issue. Here is my solution:
Make sure that both schema.rb and all migrations are checked in by all developers.
There should be one person/machine for deployments to production. Let's call this machine the merge machine. When the changes are pulled to the merge machine, the auto-merge for schema.rb will fail. No issues. Just replace the content with whatever the previous content of schema.rb was (you can put a copy aside or get it from GitHub if you use it ...).
Here is the important step. The migrations from all developers will now be available in db/migrate folder. Go ahead and run bundle exec rake db:migrate. It will bring the database on the merge machine at par with all changes. It will also regenerate schema.rb.
Commit and push the changes out to all repositories (remotes and individuals, which are remotes too). You should be done!
This is what I have done and I'm not quite sure that I have covered all the bases:
In development (using postgresql):
pg_dump db_name > tmp/branch1.sql
git checkout branch2
dropdb db_name
createdb db_name
psql db_name < tmp/branch2.sql # (from previous branch switch)
This is a lot faster than the rake utilities on a database with about 50K records.
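The sequence above can be generated for any pair of branches. This is a minimal sketch under stated assumptions: the function name branch_switch_commands and the tmp/<branch>.sql naming are hypothetical, pg_dump is assumed as the dump tool, and the commands are only built as strings, not executed.

```ruby
require "shellwords"

# Hypothetical helper: build the shell commands for switching branches while
# keeping one SQL dump per branch under dump_dir.
def branch_switch_commands(db_name, from_branch, to_branch, dump_dir: "tmp")
  from_dump = File.join(dump_dir, "#{from_branch}.sql")
  to_dump   = File.join(dump_dir, "#{to_branch}.sql")
  db        = db_name.shellescape
  [
    "pg_dump #{db} > #{from_dump.shellescape}", # save the current branch's data
    "git checkout #{to_branch.shellescape}",
    "dropdb #{db}",
    "createdb #{db}",
    "psql #{db} < #{to_dump.shellescape}"       # restore the target branch's data
  ]
end

puts branch_switch_commands("db_name", "branch1", "branch2")
```

The restore step only works if a dump for the target branch was saved during an earlier switch; on the first visit to a branch the database would have to be built some other way.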
For production, maintain the master branch as sacrosanct, with all migrations checked in and schema.rb properly merged. Go through your standard upgrade procedure.
You want to preserve a "db environment" per branch. Look at smudge/clean script to point to different instances. If you run out of db instances, have the script spin off a temp instance so when you switch to a new branch, it's already there and just needs to be renamed by the script. DB updates should run just before you execute your tests.
Hope this helps.
I totally experience the pita you are having here. As I think about it, the real issue is that all the branches don't have the code to rollback certain branches. I'm in the django world, so I don't know rake that well. I'm toying with the idea that the migrations live in their own repo that doesn't get branched (git-submodule, which I recently learned about). That way all the branches have all the migrations. The sticky part is making sure each branch is restricted to only the migrations they care about. Doing/keeping track of that manually would be a pita and prone to error. But none of the migration tools are built for this. That is the point at which I am without a way forward.
I would suggest one of two options:
Option 1
Put your data in seeds.rb. A nice option is to create your seed data via the FactoryGirl/Fabrication gem. This way you can guarantee that the data is in sync with the code, assuming that the factories are updated together with the addition/removal of columns.
After switching from one branch to another, run rake db:reset, which effectively drops/creates/seeds the database.
Option 2
Manually maintain the states of the database by always running rake db:rollback/rake db:migrate before/after a branch checkout. The caveat is that all your migrations need to be reversible, otherwise this won't work.
If you do a git pull, you should already have the latest schema, affected by any migrations that came in via the pull, but your database tables may not be updated.
So, you do need to run the migrations after pulling, but this will often change db/schema.rb.
If all you've done is pull and migrate, there's no reason you should be responsible for committing any of the resultant schema changes, as they don't technically belong to you, and they may end up being extraneous/incorrect.
Resetting the schema diff makes the most sense.
Here is my step by step version of what to do before creating a new branch
Switch to your parent/base branch
Pull the latest code
Run bundle exec rake db:migrate to update your schema.rb file locally
Do a git checkout db/schema.rb to throw away the changes brought by db:migrate if any
Create your new branch and switch to it
Make sure to commit your changes before switching to another branch
Adapted from here
In the development environment:
Use rake db:migrate:redo to check that your migrations are reversible, and always keep a db/seeds.rb that populates the data.
If you work with git, change seeds.rb in the same commit as the migration, so that a new developer on another machine, or a fresh database, can load the data from the beginning.
Besides change, writing explicit up and down methods lets your migrations cover both scenarios: applying the change to an existing database and starting from zero.
I have had this problem many times. I have searched a lot and not found a solution. My problem is this:
After running
git push heroku master
when I run
heroku run rake db:migrate
I get this error:
Multiple migrations have the version number 20130615132808
I searched for this problem and found this:
rails database migration - multiple migrations have the version number x
but when I execute git rm, some options appear that I don't understand. I don't know much about git, so I need help solving this. On localhost I deleted the files, but the problem persists. Thanks very much for the help.
Just rename the files with duplicate timestamps (add 1 to the last digit) and then add, commit and push files. When you run heroku run rake db:migrate again all will be dandy.
And for the future remember to not copy and rename migrations by hand (so you don't get repeated version numbers)
This might occur when you copy-paste multiple "rails generate" commands to generate migrations. The migrations generated may have the same time stamp. If you type them in (or copy-paste them) separately, they will have different timestamps.
When this happens, you can simply rename the migration files under db/migrate to contain different timestamps.
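To find which files clash, the version prefixes can be grouped and checked for repeats. This is a minimal sketch; the helper name duplicate_migration_versions is hypothetical, and the file list is assumed to be supplied by the caller.

```ruby
# Hypothetical helper: group migration files by their numeric version prefix
# and keep only the versions that are used by more than one file.
def duplicate_migration_versions(files)
  files.group_by { |f| File.basename(f)[/\A\d+/] }
       .reject { |version, paths| version.nil? || paths.size < 2 }
end

# Possible usage from the project root:
#   on disk:        duplicate_migration_versions(Dir["db/migrate/*.rb"])
#   tracked by git: duplicate_migration_versions(`git ls-files db/migrate`.split("\n"))
```

Checking the git-tracked list as well as the working directory matters for Heroku, because the deploy is built from what is committed, not from what is on disk.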
Ok, I have stuffed up my migrations. I tried to sort it by deleting duplicates, sorting the schema.rb etc but I don't think I have done it properly.
When I try to deploy to heroku, or rather heroku run rake db:migrate, I get
Multiple migrations have the version number 20130307005437
The migrate works fine on localhost but not heroku.
Unfortunately when I look for migration no 20130307005437, it's not there in my db/migrate.
How can I find it to sort the problem?
While this file might not be visible in your directory listing, I suspect that a file with that version number is still tracked in your Git repository, which is why this error appears on Heroku and not locally.
Please ensure that you've only got one migration inside that Git repo with that number.
Is there a good way to merge rails db migration files into 1 file per table besides splitting schema.rb manually?
Most of my migration files were created during development and do not represent real data changes. For historical reasons those files will still be accessible in the source control system. I feel uncomfortable keeping those unnecessary files.
Well, I can imagine that you want a clean start. While you are still developing your first release, you don't need all the separate migration files, although they obviously can't hurt.
Basically what you can do, is this.
FIRST BACKUP your schema and data.
The db/schema.rb contains (or should contain) the latest version of your schema. Otherwise run:
rake db:schema:dump
Now you can clean your migration folder.
Then run:
rake db:drop
rake db:create
rake db:schema:load
The last command loads db/schema.rb and creates a new schema. This should bring you to the latest version of your database.
To list the available db tasks:
rake -T db
You can use the Squasher gem to merge all the old migrations into one.
Don't bother. The old migration files are not doing any harm, and they may make maintenance easier. Leave them as they are.