I used to think that db/schema.rb in a Rails project stored the database schema so that ActiveRecord could know which tables and columns exist.
But I was surprised to notice that my project still runs normally after I delete db/schema.rb!
So, since Rails can work without it, what does schema.rb really do?
The schema.rb serves mainly two purposes:
It documents the current state of the database schema. Often, especially when you have more than a couple of migrations, it's hard to deduce the schema from the migrations alone. With schema.rb present, you can just look there. ActiveRecord itself does not use it at runtime; it introspects the database instead, which is much safer than expecting users to keep schema.rb up to date. However, to avoid confusing your developers, you should always keep the file in sync with your migrations.
It is used by the tests to populate the database schema. As such a rake db:schema:dump is often run as part of the rake test:prepare run. The purpose is that the schema of the test database exactly matches the current development database.
Rails Documentation / 6.1 What are Schema Files for?
Migrations, mighty as they may be, are not the authoritative source
for your database schema. That role falls to either db/schema.rb or an
SQL file which Active Record generates by examining the database. They
are not designed to be edited, they just represent the current state
of the database.
There is no need (and it is error prone) to deploy a new instance of
an app by replaying the entire migration history. It is much simpler
and faster to just load into the database a description of the current
schema.
For example, this is how the test database is created: the current
development database is dumped (either to db/schema.rb or
db/structure.sql) and then loaded into the test database.
Schema files are also useful if you want a quick look at what
attributes an Active Record object has. This information is not in the
model's code and is frequently spread across several migrations, but
the information is nicely summed up in the schema file. The
annotate_models gem automatically adds and updates comments at the top
of each model summarizing the schema if you desire that functionality.
Rails documents have you covered.
I am new to programming. I am using Rails 4 and Postgres as database in production. When I change my database structure, what is the best practice to update production database when deploying using Capistrano? I want to keep all the data in production intact.
I noticed sometimes when I changed schema and deploy to production, some of the existing data records are lost.
Normally when making alterations to your schema you'll use rails g migration to produce new migrations and then do things like this:
class AddUsersDiscountToken < ActiveRecord::Migration
  def change
    add_column :users, :discount_token, :string
  end
end
This migration will add a discount_token column to the users table and can be applied with:
rake db:migrate
Within Capistrano there's a task that will do this for you once the deploy is successful. No data should be lost, no records altered apart from introducing this new field. If anything else happens you've got something very odd going on in your migrations.
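To make the "no data should be lost" point concrete, here is a toy sketch in plain Ruby. It is not ActiveRecord: the `add_column` method below is a hypothetical in-memory stand-in, and the sample rows are made up. It only illustrates the idea that adding a column extends each existing row with an empty value and leaves everything else alone.

```ruby
# Toy in-memory model of a table (an array of row hashes); not ActiveRecord.
# Adding a column gives every existing row the default (nil here) and
# leaves the existing values untouched.
def add_column(rows, column, default = nil)
  rows.map { |row| row.merge(column => default) }
end

# Made-up sample data standing in for the users table.
USERS_BEFORE = [
  { id: 1, name: "alice" },
  { id: 2, name: "bob" }
].freeze

USERS_AFTER = add_column(USERS_BEFORE, :discount_token)

USERS_AFTER.each { |row| puts row.inspect }
```

Every row is still there with its original values; the only difference is the new, empty discount_token field.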
Remember, a few rules:
Always back up your data with the proper tool before applying any migrations. mysqldump is a good place to start for MySQL; for Postgres, as in your case, the equivalent is pg_dump. Copying the database's binary data files is not adequate and will not work reliably, if at all.
Always test your back-ups and ensure everything is there. The backup process may have terminated early for some reason and failed to properly back up all the tables and data.
Never deploy migrations on your live production database without testing on a copy first. This is where the backup comes in handy: you get a chance to restore it, run the migrations, and test the results.
This is why having a staging server is often handy, even if it's just a temporary one, or not as powerful as your production server. It allows you to test your migrations on actual, production data without running the risk of interrupting service. Run your new production code with your newly migrated production database and verify that the new features you've added are functioning correctly and also check you haven't broken any old code with regressions.
Remember, migrations that alter the schema of large tables, such as those with millions of rows, may take some time to complete, especially on servers with non-SSD backed databases. When testing on your staging system make a note of how long it takes to complete as you may need to give your users advance notice for scheduled maintenance or make alterations to your plans to be less disruptive in terms of migrations.
Unless a migration drops a table, removes a column, or otherwise rewrites existing data, it should not cause data loss.
To avoid some migration-related issues make sure you are following this:
If possible, rename the table or column rather than deleting it and creating a new one with an identical structure.
Check that your migration is reversible in case you need to roll back.
If you are writing a script to update the database, make sure you have a counter-script ready.
Most importantly, understand exactly what the migration does, and verify it on a staging environment to make sure you are not losing data. - By #CraigRinger
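The reversibility check can be sketched in plain Ruby. This is a toy model under stated assumptions, not Rails code: `ToyMigration`, `reversible?`, and the schema hash are hypothetical names invented for illustration. In a real app, a `change` method that only uses calls such as add_column can be reversed automatically, and rake db:rollback reverts the last migration.

```ruby
# Hypothetical stand-in for a migration: a name plus up/down steps that
# each take a schema hash (table name => column names) and return a new one.
ToyMigration = Struct.new(:name, :up, :down)

ADD_DISCOUNT_TOKEN = ToyMigration.new(
  "add_discount_token_to_users",
  ->(schema) { schema.merge(users: schema[:users] + [:discount_token]) },
  ->(schema) { schema.merge(users: schema[:users] - [:discount_token]) }
)

# Returns true when applying up and then down restores the original schema.
def reversible?(migration, schema)
  migration.down.call(migration.up.call(schema)) == schema
end

START_SCHEMA = { users: [:id, :name] }.freeze

puts reversible?(ADD_DISCOUNT_TOKEN, START_SCHEMA) # prints true
```

Running the same kind of round trip against a staging copy (migrate, roll back, compare) is the real-world version of this check.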
So after a lot of reading I found out that I don't need to plan my database ahead. I just start working on the application and add a migration for every change.
So for example, if I decide to add something, I add it via a migration. Then in another migration I delete it for some reason. And in the end I decide to bring it back. After a short time there will be a mess of migrations.
How do I keep track of them? Wouldn't it be easier to think through the database structure in the first place?
The Rails way is to do everything via migrations. In your scenario it would look like this:
migration1 #add column A
migration2 #remove column A
migration3 #add column A again
It may seem like a lot of migrations, but in practice it keeps your database changes clean, because at any given time when you run:
rake db:migrate
Rails will run only the pending migrations.
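As a rough illustration of what "pending" means, here is a simplified model in plain Ruby. It is a sketch, not Rails's actual implementation: the file names, versions, and helper methods are made up. It relies only on two facts: migration file names begin with a numeric version, and the database records which versions have been applied (Rails keeps these in the schema_migrations table).

```ruby
# Simplified model of pending-migration detection; not Rails's own code.
# Extracts the leading numeric version from a migration file name.
def version_of(filename)
  File.basename(filename)[/\A\d+/]
end

# Returns the migration files whose version is not yet recorded as applied,
# in ascending version order.
def pending_migrations(filenames, applied_versions)
  filenames
    .sort_by { |f| version_of(f).to_i }
    .reject { |f| applied_versions.include?(version_of(f)) }
end

# Hypothetical migration files for the add / remove / add-again scenario.
MIGRATION_FILES = [
  "db/migrate/20140101120000_add_column_a.rb",
  "db/migrate/20140102120000_remove_column_a.rb",
  "db/migrate/20140103120000_add_column_a_again.rb"
].freeze

# Pretend only the first migration has been run so far.
APPLIED = ["20140101120000"].freeze

puts pending_migrations(MIGRATION_FILES, APPLIED)
```

With only the first version recorded as applied, the second and third files are reported as pending; once all three versions are recorded, nothing is.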
And at any given time db/schema.rb reflects the combined result of all the migrations that have run, with the latest migration number as its version.
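The add, remove, add-again scenario can be replayed in a few lines of plain Ruby to show why the history does not turn into a mess. This is a toy model with made-up version numbers and a hypothetical `replay` helper, not how Rails generates schema.rb; the point is only that the end state is the cumulative result and the version is that of the last step.

```ruby
# Each step is [version, operation, column]; the values are illustrative.
STEPS = [
  [1, :add, :a],
  [2, :remove, :a],
  [3, :add, :a]
].freeze

# Replays the steps in order and returns the resulting version and columns.
def replay(steps, columns = [:id])
  steps.reduce({ version: 0, columns: columns }) do |state, (version, op, column)|
    cols = op == :add ? state[:columns] + [column] : state[:columns] - [column]
    { version: version, columns: cols }
  end
end

FINAL = replay(STEPS)
puts "version=#{FINAL[:version]} columns=#{FINAL[:columns].join(',')}"
# prints: version=3 columns=id,a
```

However many intermediate steps there were, a reader only needs the final state, which is what the schema file summarizes.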
Having said that, if you want to revert a migration, there are commands for that, such as rake db:rollback. Read more about migrations here.
You can see your database structure inside db/schema.rb which will show you all the tables, columns and indexes currently in your app.
Not as helpful if you're constantly changing a column, but you can also run rake db:migrate:status which will output a list of all migrations, and tell you whether they've been run or not.
Is it correct to assume that migrations in Ruby on Rails are simply updates to a database, and that the rake db:migrate script only serves to apply these changes?
Yes.
Migrations are a convenient way for you to alter your database in a
structured and organized manner. You could edit fragments of SQL by
hand but you would then be responsible for telling other developers
that they need to go and run them. You’d also have to keep track of
which changes need to be run against the production machines next time
you deploy.
Active Record tracks which migrations have already been run so all you
have to do is update your source and run rake db:migrate. Active
Record will work out which migrations should be run. It will also
update your db/schema.rb file to match the structure of your database.
Migrations also allow you to describe these transformations using
Ruby. The great thing about this is that (like most of Active Record’s
functionality) it is database independent: you don’t need to worry
about the precise syntax of CREATE TABLE any more than you worry about
variations on SELECT * (you can drop down to raw SQL for database
specific features). For example you could use SQLite3 in development,
but MySQL in production.
Source: Ruby on Rails Guides: Migrations
I have an existing database in which I am converting a formerly 'NULL' column to one that has a default value (and populating that with said default value). However, that value is an ID of a record I need to create. If I put this record in db/seeds.rb, it won't run because db/seeds.rb runs after migrations -- but the migration needs seed data. If I leave the record creation in the migration, then I don't get the record if I make a fresh database with db:load. Is there a better way other than duplicating this in both db/seeds.rb and the migration?
Thanks!
While I can understand your desire to stay DRY and not have to write this in both the migration and seeds.rb, I think you should write it in both places. Not just to make it work, but to accomplish different requirements related to your problem.
You need to ensure that your migration can execute properly regardless of external processes. That means you should put any code required within that specific migration. This isn't to accomplish anything besides making sure your migration executes properly. Suppose someone else tries to migrate without knowing you put part of the code in seeds.rb, it would be very difficult for them to figure out what's going on.
You can make db:load work properly by including similar code in seeds.rb. However, seeds.rb should evaluate the current state of your database, because it runs after the migrations: check whether the column exists, what its default value is, and so on. That way, if the migration ran and took care of everything, seeds.rb doesn't repeat work or modify values inappropriately; if the migration did not set these values as expected, seeds.rb can set them.
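One way to picture the "check the current state first" idea is an idempotent create. The sketch below is plain Ruby with an in-memory stand-in: `ToyTable`, `run_migration`, `run_seeds`, and the "default" record are all hypothetical. In a real seeds.rb you would go through the model instead, for example with ActiveRecord's find_or_create_by, which the toy method is named after.

```ruby
# In-memory stand-in for a table; a real seeds.rb would use the model.
class ToyTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Returns the existing row matching attrs, or creates it if absent.
  def find_or_create_by(attrs)
    existing = @rows.find { |row| attrs.all? { |k, v| row[k] == v } }
    return existing if existing

    row = attrs.merge(id: @rows.size + 1)
    @rows << row
    row
  end
end

# What the migration might do: create the record the new default points at.
def run_migration(table)
  table.find_or_create_by(name: "default")
end

# What seeds.rb might do: the same call, a no-op if the row already exists.
def run_seeds(table)
  table.find_or_create_by(name: "default")
end

PLANS = ToyTable.new
run_migration(PLANS)
run_seeds(PLANS)
puts PLANS.rows.size # prints 1
```

Whether the database was migrated from an old version or built fresh and then seeded, the record ends up present exactly once.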
I'd recommend looking at it as two separate issues so you can be more confident of each one executing successfully independent of one another. It also creates better maintainability for understanding by yourself or others of what's happening in the future.
In my opinion you should treat this in both db/seeds.rb and the migration.
The migration is used to get an existing database from an older version to another version while seeds.rb and schema.rb are used for a fresh database with the latest version.
I was told that for some reason, you can't update a database schema when using rails. You can drop a table and then recreate a table with an updated schema, but this won't work if you already have content stored in the table that you want to update.
What do you recommend?
Thanks!
What you were told is incorrect. You can update a DB schema when using Rails.
The way you do it is through "migrations."
A common pattern is to write a set of migrations that build your initial schema. As your app develops, you write other migrations that change the tables and columns to suit the evolving design. If the app is in production, you apply these new migrations to the production schema.
Of course some changes will mess up your existing data, but that has nothing to do with Rails. That would be true regardless of what programming language/framework you're using.
If you have a legacy DB schema and are not using migrations, you can still update your schema by interacting directly with the DB server. Again, what will work and what won't in that situation has nothing to do with Rails. It's totally up to the structure of the schema and the data itself.
If you drop a table that has content in it, the content will be destroyed. You can, however, include content migrations alongside the schema migrations, so that the content is copied out first and migrated back into the table once the schema is updated.
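The difference between a bare drop-and-recreate and one paired with a content migration can be sketched in plain Ruby. Everything here is a toy assumption: the database is a hash of arrays, and `drop_and_recreate`, `recreate_with_content`, and the posts data are hypothetical names, not Rails APIs.

```ruby
# Toy database: table name => array of row hashes. Not ActiveRecord.
# Dropping and recreating leaves an empty table; the old rows are gone.
def drop_and_recreate(db, table)
  db.merge(table => [])
end

# Same schema change, but with a content migration around it.
def recreate_with_content(db, table)
  saved = db[table].map(&:dup)              # copy the content out first
  rebuilt = drop_and_recreate(db, table)    # table now exists but is empty
  restored = saved.map { |row| yield(row) } # reshape each row for the new schema
  rebuilt.merge(table => restored)          # copy the content back in
end

# Made-up sample data.
DB = { posts: [{ id: 1, title: "hello" }] }.freeze

LOST = drop_and_recreate(DB, :posts)
KEPT = recreate_with_content(DB, :posts) { |row| { id: row[:id], headline: row[:title] } }

puts LOST[:posts].size # prints 0
puts KEPT[:posts].size # prints 1
```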