Making Changes Between Two Databases (ASP.NET MVC)

I'm working on a project where we have two versions of an MVC app: the live version and the dev version. I've made changes to the dev version, adding tables, data, etc.
Is there any way to migrate these changes onto the live version without losing all of its data (i.e. without simply regenerating the database)?
I've already tried just rebuilding the database, but we lose all of the previously stored data (obviously, since we are essentially deleting the old database and rebuilding it).
tl;dr
How do I migrate my dev version of an MVC app, along with any new tables, to the live version of the app, which is missing those models and tables?

Yes, it is possible to migrate your changes from your dev instance to your production instance; to do so you must create SQL scripts that update your production database with the changes. This may be accomplished by writing the scripts manually or by using tools to generate them for you. However you go about it, you will need scripts to update your database (you could perform manual updates via your database's tooling, but that is not desirable: you want the updates to occur in a short time window, and you want them to happen reliably and repeatably).
The easiest way I know of to do this is to use tools like SQL Compare (for schema updates) or SQL Data Compare (for data updates). These are from Redgate and cost a fair bit of money, but they are well worth their price, and most companies I've worked with are happy to pay for licenses; you may not want to shell out for them personally, however. To use these tools, you connect them to the source and destination databases, and they analyze the differences between them (in schema or data) and produce SQL scripts. These scripts may then be run manually or by the tools themselves.
Ideally, when you work on your application, you should be producing these scripts as you go along. That way when it comes time to update your environments, you may simply run the scripts you have. It is worth taking the time to include this in your build process, so database scripts get included in your builds. You should write your scripts so they are idempotent, meaning that you can run them multiple times and the end result will be the same (the database updated to the desired schema and data).
One way of managing this is to create a DBVersions table in your database. This table records each update script that has been run. For example, you could have a table like the following (this is SQL Server 2008 dialect):
CREATE TABLE [dbo].[DBVersions] (
    [CaseID] [int] NOT NULL,
    [DateExecutedOn] [datetime] NOT NULL,
    CONSTRAINT [PK_DBVersions] PRIMARY KEY CLUSTERED (
        [CaseID] ASC
    )
) ON [PRIMARY]
CaseID refers to the case (or issue) number of the feature or bug that requires the SQL update. Your build process can check this table to see whether a script has already been run; if not, it runs it. This is useful if you cannot write your scripts in a way that allows them to be run more than once. If all your scripts can be run an unbounded number of times, then this table is not strictly necessary, though it can still be useful to avoid re-running a large number of scripts every time a deployment is done.
Here are links to the Redgate tools. There may be many other tools out there, but I've had a very good experience with these.
http://www.red-gate.com/products/sql-development/sql-compare/
http://www.red-gate.com/products/sql-development/sql-data-compare/

It depends on your deployment strategy, and this is more a workflow that your team needs to embrace. Regenerating the live database from scratch can take a while, depending on how big the database is, and I don't see a need for it in most scenarios.
You only need to separate out database schema object scripts and data row scripts. The live database should have its schema objects scripted out and stored in a repository. When a developer works on new functionality, he or she makes those changes against the database scripts in the repository. If there is a need to change database rows, then the developer also checks the data row scripts into the repository. On each daily deployment, the live database can be compared against what is checked into the repository and pushed to bring it in sync.
On our side, we use tools such as Redgate's Schema Compare and Data Compare to migrate the database from the dev version to our intended target version.

Related

How to send SQL queries to two databases simultaneously in Rails?

I have a very high-traffic Rails app. We use an older version of PostgreSQL as the backend database, which we need to upgrade. We cannot use the data-directory copy method because the data file formats have changed too much between our existing release and the current PostgreSQL release (10.x at the time of writing). We also cannot use the dump-and-restore process for the migration because we would either incur several hours of downtime or lose important customer data. Replication is not possible either, as the two DB versions are incompatible for that.
The strategy so far is to have two databases and copy all the data (and functions) from existing to a new installation. However, while the copy is happening, we need data arriving at the backend to reach both servers so that once the data migration is complete, the switch becomes a matter of redeploying the code.
I have figured out the other parts of the puzzle but am unable to determine how to send all writes happening on the Rails app to both DB servers.
I am not bothered if both installations get queried for displaying data to the user (I can discard the data coming out of the new installation); so if it is possible at the driver level, or by adding a line somewhere in ActiveRecord, I am fine with that.
PS: Rails version is 4.1 and the company is not planning to upgrade that.
You can have multiple databases by adding another entry (environment) to the database.yml file. After that you can have a separate base class, analogous to ActiveRecord::Base, and connect it to the new entry.
Have a look at this post.
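For example, here is a minimal sketch of that idea. The entry name (new_db), class names, and table are illustrative, not part of your setup:

# config/database.yml gains a second entry, e.g.:
#   new_db:
#     adapter: postgresql
#     database: myapp_new
#     host: new-db-host

# Abstract class that owns the connection to the new database.
class NewDbBase < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :new_db
end

# A model backed by the same table, but living in the new database.
class NewDbUser < NewDbBase
  self.table_name = "users"
end

Writes would then have to be mirrored explicitly, e.g. NewDbUser.create!(user.attributes) from a callback or service object, keeping the same primary key so the two copies stay aligned.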
However, as far as I can see, that will not solve your problem: redirecting new data to the new DB while copying from the old one can lead to data inconsistencies.
For example, a record's ID can end up different because there are two data sources feeding it.
If you are upgrading the DB, I would recommend defining a scheduled downtime and letting your users know in advance. Having a small downtime is far better than fixing inconsistent data down the line.
When you have a downtime:
Let the customers know well in advance
Keep the downtime minimal
Have a rollback procedure: in the event the new site takes longer than you think, roll back to the old site.

Database Changes Outside Ruby/Rails Migration

We have several technologies accessing the same database. At the moment, Ruby/Rails is used to create migrations when making changes to the database. The question is a simple one:
Is it possible for our DBAs to make changes to the database (not using Ruby migrations) without stepping on the Ruby devs' toes and breaking the Ruby web application?
If so, some generic details about how to get started or pointed in the right direction would be great! Thanks.
I can tell you from experience that this is not the best idea; it is one that you will eventually regret and later, inevitably, reverse. But I know that it does come up. I've had to do it (against my will, or in cases of extreme emergency).
Given the option, I'd push back on it if you can, in favor of any solution that brings the SQL closer to the repository and further away from a "quick fix" applied directly to the database. Why?
1) Your local/testing/staging/production databases will diverge, eventually rendering your code untestable in a reliable way
2) You won't be able to regenerate your database from "scratch" to match production
3) If the database is ever clobbered, you won't be able to re-create it in any sensible way.
DBAs generally don't care about these things until something in the code breaks and they want you to figure it out. But, for obvious reasons, that now becomes quite difficult.
One approach I have taken that seems to make everyone happy is to do the following:
1) Commit to having ALL database changes, big or small, put into a repository with the code. This means that everything that has happened to the database is all together in one place.
2) Each change, or set of changes, should be a migration. A migration can be simply running an SQL file. But, it should be run from within a migration for all the testability benefits.
So, for example, let's say you have a folder structure like:
- database_updates
-- v1
--- change_1.sql
--- change_2.sql
-- v2
--- change_3.sql
--- change_2_fix.sql
Now, let's say you want to make a change, or set of changes, via SQL. First, create a new version folder; let's call it "v1". Next, put your SQL scripts in this folder. Finally, create a migration:
def change
  # Run every SQL file in the v1 folder, in name order
  Dir.glob("#{Rails.root}/database_updates/v1/*.sql").sort.each do |file|
    execute File.read(file)
  end
end
(I have code that does this, happy to share the gist if you find yourself using this approach)
Since each migration runs inside a transaction (on databases that support transactional DDL), a failing script will cause the whole set to roll back.
Now, let's say you have the next set, v2. Same exact thing. And, we have a history of these "versioned" changes, so we can look at the migration history and see what's been run, etc.
As a power-user note, this setup also allows for recourse if things fail; in those cases, we can opt to go back to v1:
def up
  # apply the v2 scripts
  Dir.glob("#{Rails.root}/database_updates/v2/*.sql").sort.each { |f| execute File.read(f) }
end

def down
  # revert by re-running the v1 scripts
  Dir.glob("#{Rails.root}/database_updates/v1/*.sql").sort.each { |f| execute File.read(f) }
end
For this to work, v1 and v2 would need to be autonomous -- that is, they can destroy and rebuild entities without any dependencies. If that's not what you want, just stick with the change method.
This also allows you to test for breaking changes. Let's say it is reported that something no longer works with v6. You can roll your database migrations back to v5, v4, etc. (because you are doing a migration per folder), test to see where things broke, and correct it with v7.
Anyway, the end game of it all is that you can safely check out this project from a repository, create your database, run rake db:migrate and know that your database structure resembles exactly what is deployed elsewhere. And, worst case, if your database gets clobbered, you can just run all your scripts from v1 - vN and end up with your database back again.
For DBAs, everything remains SQL: they can just send you a file or set of files for you to run.
If you want to get fancy, you could even write a migration generator that knows how to handle a line like rails g migration UpdateDBVersion version:v7 to take care of the repetitive boilerplate.
As long as everyone relies on the same updated schema.rb or structure.sql, everyone will share the same database 'version'.
See this SO answer for more insight.
Changes to the database, tables, or indexes should be made using ActiveRecord migrations whenever possible. This specifically ensures that development and test environments remain logically in sync. Remember that developers must be capable of accurate development and testing against the same database structure as occurs in the production environment, and QA teams must be able to adequately test such changes.
However, some database features are not actually supported by ActiveRecord migrations, and may only be applied directly to the database. These features are often database-specific, such as any of the following:
Views
Triggers
Stored procedures
Indexes with function-based columns
Virtual columns
Essentially, any database-specific feature that doesn't have an ActiveRecord abstraction has to be applied directly to the database.
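One common middle ground, in line with the suggestions further down, is to keep the raw SQL for such objects inside a migration via execute, so the change still lives in version control. A minimal sketch (the view name and SQL are made up):

class CreateActiveUsersView < ActiveRecord::Migration
  def up
    execute "CREATE VIEW active_users AS SELECT * FROM users WHERE deleted_at IS NULL"
  end

  def down
    execute "DROP VIEW active_users"
  end
end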
Sometimes, however, other applications require the addition of tables, columns, or indexes in order to operate properly or efficiently. These other applications may simply be used to view/report against the database, or they may be substantial business applications that have their own independent database requirements and separate development teams. Occasionally, a DBA may have to step in and create an index or provide some optimization needed to solve a real-world production performance issue.
There are simply far too many situations for shared database management to give a definitive answer. Depending on the size of the organization and the complexity of the needs for the shared management, there may be many ways to solve the problem of a shared database schema that are specific to the application or organization.
For instance, I have worked on applications that shared a database with as many as 10 other applications, each of which "owned" portions of the schema and shared other portions with the other teams, all mediated through the DBA group. In situations such as this, the organizational structure and change control process may be the only means of solving this problem.
Whatever the situation, some real-world suggestions may help avoid problems and mitigate maintenance woes:
Offer to translate SQL DDL commands into ActiveRecord migrations, where possible, so that DBAs can accomplish their needs, and the application team can still appropriately maintain the schema
Any changes made outside the ActiveRecord migration should be thoroughly tested for impact to the project in a non-production environment by the same QA resources that test the actual Rails application
Encapsulate any external changes in a .sql file and include the file as part of the project in version control
If the development team is using the same database product in development (some cannot, due to licensing or complexity), those changes should be applied to the developer database instances, as well
It's best if you can apply the changes during a migration, even just by calling the relevant CLI tools as a migration step (see the sketch after this list); the exact mechanism will be database-dependent as well
Try to avoid doing this more than is absolutely necessary, as this can significantly reduce the database independence of the application, even between versions of the same database product (limiting upgrade opportunities)
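As a rough sketch of the last two points, a migration can simply replay a DBA-provided script that has been reviewed and checked in (the db/external/ path and file names here are assumptions):

class ApplyDbaReportingIndexes < ActiveRecord::Migration
  def up
    # Run the checked-in script exactly as the DBA delivered it.
    # (If it contains multiple statements, the adapter must support that.)
    execute File.read(Rails.root.join("db", "external", "add_reporting_indexes.sql"))
  end

  def down
    execute File.read(Rails.root.join("db", "external", "drop_reporting_indexes.sql"))
  end
end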

Is it safe to run migrations on a live database?

I have a simple rails-backed app running 2-3 million pageviews a day off a Heroku Ronin database. The load on the database is pretty light, though, and it could handle a lot more than we're throwing at it.
Is it safe for me to run a migration to add tables to this database without going into maintenance mode? Also, would it be safe to run a migration to add a few columns to the core table responsible for almost all of the reads and writes?
Downtime is not acceptable, even for a few minutes.
If running migrations live isn't advisable, what I'll probably do is set up a new database, run the migrations on that, write a script to sync the two databases, and then point the app at the new one.
But I'd rather avoid that if possible. :)
Sounds like your migration includes:
adding new tables (perhaps indexes? If so, that could take a bit longer than you might expect)
adding new columns (default values and/or nullable?)
wrapping your changes in a transaction (?)
Suggest you gauge the impact that your changes will have on your Prod environment by:
taking a backup of Prod (with all the Prod data within)
running your change scripts against that copy, timing each operation
Balance the 2 points above against the typical read & write load at the time you're expecting to run this (02:00, right?).
Consider a 'soft' downtime by disabling (somehow) write operations to the tables being affected.
Overall, adding n tables and new nullable columns to an existing table can generally be done without any downtime or performance impact.
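As a sketch of that cheap kind of change (table and column names are made up): adding a nullable column with no default is effectively a metadata-only change in PostgreSQL, so the exclusive lock is held only very briefly.

class AddNicknameToUsers < ActiveRecord::Migration
  def change
    # Nullable, no default: existing rows are not rewritten.
    add_column :users, :nickname, :string
  end
end

Adding a column with a default, by contrast, rewrites the whole table on the PostgreSQL versions of that era and is exactly the kind of change to avoid on a hot table.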
Always measure the impact your changes will have on a copy of Prod. Measure 'responsiveness' at the time you apply your changes to this copy. Of course this means deploying another copy of your Prod app as well, but the effort would be worthwhile.
Assuming it's a pg database (which it should be for Heroku).
http://www.postgresql.org/docs/current/static/explicit-locking.html
ALTER TABLE will acquire an ACCESS EXCLUSIVE lock, so the table will be locked while the change runs.
On top of this, if you are going to be adding tables or modifying model code in any way, you will need to restart the Rails application so that it is aware of the new models.
As for pointing the app at a freshly modified database: how are you going to sync the data, and also sync the changes that arrive in the two databases during the time the sync takes?
Adding tables shouldn't be a concern, as your application won't be aware of them until the proper upgrades are done. As for adding columns to a core table, I'm not so sure. If you really need to prevent downtime, perhaps it's better to add a secondary table, linked to the core table by an ID, that holds your extra columns.
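A sketch of that companion-table idea (all names are illustrative): the new columns live in their own table, joined one-to-one to the core table, so the hot table itself is never altered.

class CreateUserProfiles < ActiveRecord::Migration
  def change
    create_table :user_profiles do |t|
      t.integer :user_id, null: false   # links back to the core table
      t.string  :nickname
      t.text    :preferences
    end
    add_index :user_profiles, :user_id, unique: true
  end
end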
Just my two cents.

How to prepare for data loss in a production website?

I am building an app that is fast moving into production and I am concerned about the possibility that due to hacking, some silly personal error (like running rake db:schema:load or rake db:rollback) or other circumstance we may suffer data loss in one database table or even across the system.
While I don't find it likely that the above will happen, I would be remiss in not being prepared in case it ever does.
I am using Heroku's PG Backups (which is to be replaced with something else this month), and I also run automated daily backups to S3: http://trevorturk.com/2010/04/14/automated-heroku-backups/, successfully generating .dump files.
What is the correct way to deal with data loss on a production app?
How would I restore the .dump file in case I need to? Can I do a selective restore if a small part of the system is hit?
In case a selective restore is not possible: assume one table loses data 4 hours after the last backup. Would fixing that table then require rolling back 4 hours of every user's activity? Is there any good solution to this?
What is the best way to support users through the inconvenience if something like this happens?
A full DR (disaster recovery) solution requires the following:
Multisite. If a fire, flood, Osama Bin Laden or whathaveyou strikes the Amazon (or is it Salesforce?) data center that Heroku uses, you want to be sure that your data is safe elsewhere.
On-going replication of the data to a separate site (or sites). That means that every transaction that's written to your database on one site, is replicated within seconds to the mirror database on the other site. Most RDBMS's have mechanisms to let you do a master-slave replication like that.
The same goes for anything you put on a filesystem outside of the database, such as images, XML configuration files etc. S3 is a good solution here - they replicate everything to multiple data centers for you.
It won't hurt to create periodic (daily or so) dumps of the database and store them separately (e.g. on S3). This helps you recover from data corruption that propagates to the slave DBs.
Automate the process of data recovery. You want this to just work when you need it.
Test everything. Ideally, you want to automate the test process and run it periodically to ensure that your backups can actually be restored (see the sketch after this list). Netflix's Chaos Monkey is an extreme example of this.
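As a sketch of those last two points, a small task that restores the most recent dump into a scratch database and sanity-checks it could be scheduled to run regularly (paths, database names, and the sanity query are all assumptions):

namespace :dr do
  desc "Restore the latest dump into a scratch database and sanity-check it"
  task :verify_backup do
    dump = Dir.glob("backups/*.dump").max_by { |f| File.mtime(f) }
    abort "no dump found" unless dump
    system("createdb restore_check") || abort("could not create scratch database")
    system("pg_restore --no-owner -d restore_check #{dump}") || abort("restore failed")
    # Minimal sanity check: the core table exists and is non-empty.
    count = `psql -t -d restore_check -c "SELECT count(*) FROM users"`.strip
    puts "restore OK, users table has #{count} rows"
    system("dropdb restore_check")
  end
end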
I'm not sure how you'd implement all this on Heroku. A complete solution is still priced out of reach for most companies - we're running this across our own data centers (one in the US, one in EU) and it costs many millions. Work according to the 80-20 rule - on-going backup to a separate site, plus a well tested recovery plan (continuously test your ability to recover from backups) covers 80% of what you need.
As for supporting users, the best solution is simply to communicate timely and truthfully when trouble happens and make sure you don't lose any data. If your users are paying for your service (i.e. you're not ad-supported), then you should probably have an SLA in place.
With backups, you can never be 100 percent sure that no data will be lost; the best thing is to test them on another server. You should have at least two types of backup:
A database backup, like pg_dump. A dump is just SQL commands, so you can use it to recreate the whole database, just a table, or just a few rows. You lose any data added after the dump was taken.
A code backup, for example a git repository.
In addition to Hartator's answer:
use replication if your DB offers it, e.g. at least master/slave replication with one slave
do database backups on a slave DB server and store them externally (e.g. scp or rsync them out of your server)
use a good version control system for your source code, e.g. Git
use a solid deploy mechanism, such as Capistrano, and write your own custom tasks (see the sketch after this list), so nobody needs to run DB migrations by hand
have somebody you trust check your firewall setup and the security of your system in general
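For the Capistrano point, here is a Capistrano 3-style sketch of such a custom task (the task name and role are assumptions; the capistrano-rails gem ships a similar deploy:migrate task out of the box):

namespace :deploy do
  desc "Run pending migrations as part of the deploy, never by hand"
  task :migrate_database do
    on roles(:db) do
      within release_path do
        with rails_env: fetch(:rails_env) do
          execute :rake, "db:migrate"
        end
      end
    end
  end
end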
The DB dumps contain SQL commands to recreate all tables and all data... if you want to restore only one table, you can extract that portion from a copy of the dump file, (very carefully) edit it, and then restore from the modified dump file.
Always restore first to an independent machine and check that the data looks right. For example, you could use one slave server: take it offline, restore there locally, and check the data. It helps to have two slaves in your system; then the remaining system still has one master and one slave while you restore to the second slave.
To simulate a fairly simple "total disaster recovery" on Heroku, create another Heroku project and replicate your production application completely (except use a different custom domain name).
You can add multiple remote git targets to a single git repository so you can use your current production code base. You can push your database backups to the replicated project, and then you should be good to go.
The only step missing from this exercise versus a real disaster recovery is assigning your production domain to the replicated Heroku project.
If you can afford to run two copies of your application in parallel, you could automate this exercise and have it replicate itself on a regular basis (e.g. hourly, daily) based on your data loss tolerance.

Approaches to build servers and stored procedures

I have been put in charge of setting up a build server for our office. We currently put all queries into stored procedures in SQL Server 2000. This is done manually, and no SQL files are produced or put into SVN.
What I am after is a good way of dealing with having a build server that can get all the stored procs from a DB.
I am guessing this might not be possible or practical, and I'm pretty sure it's not best practice. I realize one solution could be to start creating SQL script files and putting them into SVN so they can be picked up and dealt with.
You have answered your own question. Get these things into source control before you start digging yourself further into a hole you really don't want to be in.
Once that's done, an approach we have used successfully is to have an initial snapshot set of scripts, then version-numbered script folders for changes, with the overall database version number stored in a database table created specifically for that purpose. We then wrote a utility to assemble all the update scripts since the stored version number, run them, and update the version number. This was integrated with our build script, which an automated build ran against the dev DB. Schedules and so on are of course up to you.
I would also strongly advise you to make all DB scripts safely repeatable.
