I have an MVC app that uses EF6 Code First. I want to deploy this app to multiple datacenters. On deployments that have migrations, I can write a script to migrate them all as simultaneously as possible, but if one datacenter is slower, then the calls could all be rejected since the schema no longer matches. A script that tried to coordinate would also make rolling upgrades impossible.
Is there a way to make EF at least attempt to run the query even though the schemas don't match? Is there a different way I can/should approach this?
UPDATE:
Let's see if I can word this better. I want to have my MVC app in multiple datacenters. Let's assume that I deploy the app to each datacenter individually.
Option 1
Deploy to DC A
Code first migration runs on centralized DB
Requests made to DC A succeed, but requests to DC B fail
Option 2
Deploy to DC A
Do not automatically run migration
Requests made to DC A fail and requests to DC B continue to succeed
How do I develop a deployment strategy that will make it so that requests to either DC will work?
BTW: I am using Azure Web Sites, if a platform-specific solution is needed.
In your post, it seemed like you were concerned with how it would behave during the actual upgrade. Nothing about testing. But in comments you are asking about doing a partial deployment then doing testing. So on one hand you'd want to deploy as quickly as possible to minimize downtime. On the other hand, it sounds like you want to deploy to one site, test, and have the other sites continue to function while you are verifying the first deployment?
Verifying a deployment is reasonable, but fairly complex. I'm not sure you will find much in the way of automation for this. I think you should test thoroughly prior to production deployment, and then simply deploy as quickly as possible in production. If you found an issue only when deploying to production, you'd be in a bad situation, because now your site is down until you can fix it. Even if you could get the other instance to work with the new database, that is risky, as it would be modifying things against a schema it doesn't completely understand. Additionally, if you do need to roll back the DDL, then you will almost certainly lose any data that was modified since the deployment. So it is really best that all instances built for the old schema fail until they are upgraded, to prevent them from modifying data that is at risk of being lost.
Usually you should have done a deployment to a staging environment that is as close to your production as possible to test the database migration process. This is called pre-production testing, and sometimes involves restoring the most recent backup from production into staging to ensure new constraints/structures are valid for existing data. By deploying to this staging environment, you should have a very high level of confidence that production deployment will go successfully.
You additionally safeguard yourself against production deployment issues by taking backups prior to deployment so that you can roll back as necessary (although this is the worst-case scenario, as it might mean throwing out important data that came in between the backup/deployment and the realization that there is an issue). I imagine EF migrations use a transaction to run the DDL scripts, so they should roll back all-or-nothing if there is an issue.
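The idea that instances built for the old schema should fail rather than write to a schema they don't understand can be illustrated independently of EF. Below is a minimal sketch in Ruby; `SchemaGuard`, `SchemaMismatch` and the version strings are hypothetical names made up for this example, not anything EF provides.

```ruby
# Hypothetical guard: an app instance built for one schema version refuses
# to touch a database that reports a different version.
class SchemaMismatch < StandardError; end

class SchemaGuard
  def initialize(expected_version)
    @expected_version = expected_version
  end

  # In a real app db_version would be read from the migration history table;
  # here it is passed in directly to keep the sketch self-contained.
  def check!(db_version)
    return true if db_version == @expected_version

    raise SchemaMismatch,
          "app expects schema #{@expected_version}, database is at #{db_version}"
  end
end

guard = SchemaGuard.new("v2")
guard.check!("v2") # code and schema agree, so nothing is raised

begin
  guard.check!("v3") # database was migrated ahead of this instance
rescue SchemaMismatch => e
  puts e.message
end
```

An instance that fails this check would stop serving writes until it has been redeployed with the matching code.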
Related
I would like to know what the best practice is for taking a ruby on rails application in production and adding a feature to it or debugging a broken feature?
What I mean is, say you have a working application and you have lots of people using it. You want to add a new feature to this app. You clone your application to your local machine. Create a new feature (or w/e) branch.
Now what do you change/do so you don't destroy your database and so you are able to test and debug this application on your local machine?
Also, let's say this is an older rails application with an older ruby version.
I would also like to note that I am having trouble finding any information on this and am willing to read books and lots of text to learn if it is a very involved task.
Although the complexity of this type of operation varies quite a bit, usually based on the complexity of the application itself, I think a few generalizations can be made.
Tests
Obviously, do not break any existing tests. Write tests for your new functionality, even if they are the first tests in the application.
Data
Ideally, you will have data to work with that very closely mirrors your production data. In some cases (CMS) this may be an actual dump of the production database and assets, restored locally. In other cases (billing portal for a hospital), you will probably need to rely on well-constructed seed data. Once your automated tests pass, you can perform manual QA against the (possibly simulated) production data.
Staging
If you do not have a staging environment that 100% mirrors your production environment, set one up now. This should be set up as close as you possibly can to your production environment, using the database guidelines from above. Merge your feature branch into staging prior to merging it into production. This will allow you to do a final QA test in a near-production environment. This can be used to test not only new application features, but new server versions, ruby version, etc.
CI/CD
It is becoming very common to use CI/CD to automate the testing and deployment of feature branches. This can help enforce code quality guidelines. It can also allow you to run the tests in an environment that matches production, for extra peace of mind.
Backups
Obviously, even with all of this, things can still go wrong. Keeping up-to-date backups is vital, for worst case scenarios.
We have several technologies accessing the same database. At the moment, Ruby/Rails is used to create migrations when making changes to the database. The question is a simple one:
Is it possible for our DBAs to make changes to the database (not using Ruby migrations) without stepping on the Ruby devs' toes and breaking the Ruby web application?
If so, some generic details about how to get started or pointed in the right direction would be great! Thanks.
I can tell you from experience that this is not the best idea, one that you will eventually regret and later, inevitably, reverse. But I know that it does come up. I've had to do them (against my will or in case of extreme emergencies).
Given the option, I'd push back on it if you can in favor of any solution that brings the SQL closer to the repository and further away from a "quick fix" to the database directly. Why?
1) Your local/testing/staging/production databases will diverge, eventually rendering your code untestable in a reliable way
2) You won't be able to regenerate your database from "scratch" to match production
3) If the database is ever clobbered, you won't be able to re-create it in any sensible way.
DBAs generally don't care about these things until something in the code breaks, and they want you to figure it out. But, for obvious reasons, that now becomes quite difficult.
One approach I have taken that seems to make everyone happy is to do the following:
1) Commit to having ALL database changes, big or small, put into a repository with the code. This means that everything that has happened to the database is all together in one place.
2) Each change, or set of changes, should be a migration. A migration can be simply running an SQL file. But, it should be run from within a migration for all the testability benefits.
So, for example, let's say you have a folder structure like:
- database_updates
-- v1
--- change_1.sql
--- change_2.sql
-- v2
--- change_3.sql
--- change_2_fix.sql
Now, let's say you want to make a change or set of changes via SQL. First, create a new version folder; let's call it "v1". Next, put your SQL scripts in this folder. Finally, create a migration:
def change
  # Read all files in v1 folder, and run the SQL
end
(I have code that does this, happy to share the gist if you find yourself using this approach)
Since each migration is transactional, any of the scripts that fail will cause all of them to fail.
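The author's own code for this isn't shown. As a rough sketch of the "read every file in a version folder and run the SQL" step, the helper below is written in plain Ruby so it runs outside Rails; `sql_files`, `run_version` and the `runner` callable are hypothetical names. Inside a real migration, the runner could be something like `->(sql) { execute(sql) }`, and whether one `execute` call accepts a multi-statement file depends on the database adapter.

```ruby
require "tmpdir"

# Returns the SQL scripts in a version folder, sorted by file name so the
# run order is deterministic.
def sql_files(version_dir)
  Dir.glob(File.join(version_dir, "*.sql")).sort
end

# Runs every script through `runner`, a callable that takes one SQL string.
def run_version(version_dir, runner)
  sql_files(version_dir).each { |path| runner.call(File.read(path)) }
end

# Demo with a throwaway folder and a runner that only records what it is given.
executed = []
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "change_2.sql"), "ALTER TABLE users ADD COLUMN age integer;")
  File.write(File.join(dir, "change_1.sql"), "CREATE TABLE users (id integer);")
  run_version(dir, ->(sql) { executed << sql })
end
puts executed
```

Sorting by file name means the scripts need names that sort in the order they should run, which the change_1/change_2 naming above already gives you.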
Now, let's say you have the next set, v2. Same exact thing. And, we have a history of these "versioned" changes, so we can look at the migration history and see what's been run, etc.
As a power user note, this set up also allows for recourse if things fail; in these cases, we can opt to go back to v1:
def up
  # run v2 scripts
end

def down
  # run v1 scripts
end
For this to work, v1 and v2 would need to be autonomous -- that is, they can destroy and rebuild entities without any dependencies. If that's not what you want, just stick with the change method.
This would also allow you to test for breaking changes. Let's say it is reported that something doesn't work anymore with v6. You can rollback your database migrations to v5, v4, etc (because you are doing a migration per folder) and test to see when the test broke, and correct it with v7.
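The "step back until it works again" search can be sketched as a small helper. This is only an illustration: `first_broken_version` is a hypothetical name, and the block stands in for "roll the database back to this version, then run the test".

```ruby
# Walks back from the newest version while the check keeps failing and
# returns the oldest version in that failing run, or nil if the newest passes.
def first_broken_version(versions, &passes)
  broken = nil
  versions.reverse_each do |version|
    break if passes.call(version)

    broken = version
  end
  broken
end

versions = %w[v1 v2 v3 v4 v5 v6]
# Stand-in result: pretend the test started failing with v5.
suite_passes = ->(version) { !%w[v5 v6].include?(version) }
puts first_broken_version(versions, &suite_passes)
```

With the stand-in above, the walk stops at v4 (the first version that passes) and reports v5 as the one that introduced the break.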
Anyway, the end game of it all is that you can safely check out this project from a repository, create your database, run rake db:migrate and know that your database structure resembles exactly what is deployed elsewhere. And, worst case, if your database gets clobbered, you can just run all your scripts from v1 - vN and end up with your database back again.
For DBAs, everything remains SQL: they can just send you a file or set of files for you to run.
If you want to get fancy, you could even write a migration generator that knows how to handle a line like rails g migration UpdateDBVersion version:v7 to take care of the repetitive boilerplate.
As long as everyone relies on the same updated schema.rb or structure.sql, everyone will share the same database 'version'.
See this SO answer for more insight.
Changes to the database, tables, or indexes should be made using ActiveRecord migrations whenever possible. This specifically ensures that development and test environments remain logically in sync. Remember that developers must be capable of accurate development and testing against the same database structure as occurs in the production environment, and QA teams must be able to adequately test such changes.
However, some database features are not actually supported by ActiveRecord migrations, and may only be applied directly to the database. These features are often database-specific, such as any of the following:
Views
Triggers
Stored procedures
Indexes with function-based columns
Virtual columns
Essentially any database-specific features that don't have an ActiveRecord abstraction will be made directly to the database.
Sometimes, however, other applications require the addition of tables, columns, or indexes in order to operate properly or efficiently. These other applications may simply be used to view/report against the database, or they may be substantial business applications that have their own independent database requirements and separate development teams. Occasionally, a DBA may have to step in and create an index or provide some optimization needed to solve a real-world production performance issue.
There are simply far too many situations for shared database management to give a definitive answer. Depending on the size of the organization and the complexity of the needs for the shared management, there may be many ways to solve the problem of a shared database schema that are specific to the application or organization.
For instance, I have worked on applications that shared a database with as many as 10 other applications, each of which "owned" portions of the schema and shared other portions with the other teams, all mediated through the DBA group. In situations such as this, the organizational structure and change control process may be the only means of solving this problem.
Whichever the situation, some real-world suggestions may help avoid problems and mitigate maintenance woes:
Offer to translate SQL DDL commands into ActiveRecord migrations, where possible, so that DBAs can accomplish their needs, and the application team can still appropriately maintain the schema
Any changes made outside the ActiveRecord migration should be thoroughly tested for impact to the project in a non-production environment by the same QA resources that test the actual Rails application
Encapsulate any external changes in a .sql file and include the file as part of the project in version control
If the development team is using the same database product in development (some cannot, due to licensing or complexity), those changes should be applied to the developer database instances, as well
It's best if you can apply the changes during a migration, even just by calling the relevant CLI tools as a migration step - the exact mechanism will be database-dependent, as well
Try to avoid doing this more than is absolutely necessary, as this can significantly reduce the database independence of the application, even between versions of the same database product (limiting upgrade opportunities)
Is it possible to do something like the Github zero downtime deploy on Heroku using Unicorn on the Cedar stack?
I'm not entirely sure how the restart works on Heroku and what control we have over restarting processes, but I like the possibility of zero downtime deploys and up until now, from what I've read, it's not possible
There are a few things that would be required for this to work.
First off, we'd need backwards compatible migrations. I leave that up to our team to figure out.
Secondly, we'd want to migrate the db right after a push, but before the restart (assuming our migrations are fully backwards compatible, this should not affect anything)
Thirdly, we'd want to instruct Unicorn to launch a new master process and fork some workers, then swap the PIDs and gracefully shut down the old process/workers
I've scoured the docs but I can't find anything that would indicate this is possible on Heroku. Any thoughts?
I can't address migrations, but the part about restarting processes and avoiding wait time:
There is a beta feature for Heroku called preboot. After a deploy, it boots your new dynos first and waits a while before switching traffic and killing the old ones:
https://devcenter.heroku.com/articles/labs-preboot/
I also wrote a blog post that has some measurements on my app's performance improvements using this feature:
http://ylan.segal-family.com/blog/2012/08/27/deploy-to-heroku-with-near-zero-downtime/
You might be interested in their feature called preboot.
Taken from their documentation:
This feature provides seamless deploys by booting web dynos with new code before killing existing web dynos.
Some apps take a long time to boot up, and this can cause unacceptable delays in serving HTTP requests during deployment.
There are a few caveats:
You must have at least two web dynos to use this feature. If you have your web process type scaled to 1 or 0, preboot will be disabled.
Whoever is doing the deployment will have to wait a few minutes before the new code starts serving user requests; this happens later than it would without preboot (but in the meanwhile, user requests are still served promptly by old dynos).
There will be a short period (a minute or two) where heroku ps shows the status of the new code, but user requests are still being served by old code.
There is much more information about it, so refer to their documentation.
It is possible, but requires a fair amount of forward planning. As of Rails 3.1 there are three tasks that need carrying out:
Upload the new code
Run any database migrations
Sync the assets
Uploading code and restarting is fairly straightforward. The main problem lies with the other two, but the way round them is pretty much the same.
Essentially you need to:
Make the code compatible with the migration you need to run
Run the migration, and remove any code written specifically for it
For instance, if you want to remove a column, you'll need to deploy a patch telling ActiveRecord to ignore it first. Only then can you deploy the migration and clean up that patch.
In short, you need to consider your database and code compatibility and work around them so that the two can overlap in terms of versioning.
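The ordering argument can be checked with a toy model. This is a sketch under simplifying assumptions: a code version is reduced to the list of columns it reads, a schema to the list of columns it has, and `compatible?` plus the column names are hypothetical.

```ruby
require "set"

# A code version is compatible with a schema when every column it reads exists.
def compatible?(columns_read, schema_columns)
  Set.new(columns_read).subset?(Set.new(schema_columns))
end

old_code = %w[id name legacy_flag] # still reads the column to be removed
patched  = %w[id name]             # patch deployed first: ignores legacy_flag

before_migration = %w[id name legacy_flag]
after_migration  = %w[id name]

# Safe order: patch first, migrate second. Both code versions work while the
# patch rolls out, and only the patched code is running when the column goes.
puts compatible?(old_code, before_migration)  # true
puts compatible?(patched, before_migration)   # true
puts compatible?(patched, after_migration)    # true

# Unsafe order: migrating while old code is still serving requests.
puts compatible?(old_code, after_migration)   # false
```

The last line is the failure case the patch-then-migrate order avoids: old code hitting a schema that has already lost the column.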
An alternative to this method might be to have two versions of the application running on Heroku at the same time. When you deploy, switch the domain to the other version, do the deploy, and switch it back again. This will help in most instances, but again, database compat is an issue.
Personally, I would say that if your deployments are significant enough to require this sort of consideration, taking parts of the application offline is probably the safest answer. Breaking up an application into several smaller applications can help mitigate this and is a mechanism that I use regularly.
No - this is currently not possible using Unicorn on Heroku cedar. I've been bugging Heroku about this for weeks.
Here was Heroku Support's reply to my email on March 8, 2012:
Hi, you could enable maintenance mode when doing a deploy, at least your users would see a maintenance page instead of an error, and also request queue wouldn't build up.
We're definitely aware this is a pain and we're working to offer rolling / zero-downtime deploys in the future. We have no ETA to announce, though.
I am building an app that is fast moving into production and I am concerned about the possibility that due to hacking, some silly personal error (like running rake db:schema:load or rake db:rollback) or other circumstance we may suffer data loss in one database table or even across the system.
While I don't find it likely that the above will happen, I would be remiss in not being prepared in case it ever does.
I am using Heroku's PG Backups (which is to be replaced with something else this month), and I also run automated daily backups to S3: http://trevorturk.com/2010/04/14/automated-heroku-backups/, successfully generating .dump files.
What is the correct way to deal with data loss on a production app?
How would I restore the .dump file in case I need to? Can I do a selective restore if a small part of the system is hit?
In case a selective restore is not possible: assume one table loses data 4 hours after the last backup. Result => would fixing the lost table require rolling back 4 hours of users' activity? Any good solution to this?
What is the best way to support users through the inconvenience if something like this happens?
A full DR (disaster recovery) solution requires the following:
Multisite. If a fire, flood, Osama Bin Laden or whathaveyou strikes the Amazon (or is it Salesforce?) data center that Heroku uses, you want to be sure that your data is safe elsewhere.
On-going replication of the data to a separate site (or sites). That means that every transaction that's written to your database on one site, is replicated within seconds to the mirror database on the other site. Most RDBMS's have mechanisms to let you do a master-slave replication like that.
The same goes for anything you put on a filesystem outside of the database, such as images, XML configuration files etc. S3 is a good solution here - they replicate everything to multiple data centers for you.
It won't hurt to create periodic (daily or so) dumps of the database and store them separately (e.g. on S3). This helps you recover from data corruption that propagates to the slave DBs.
Automate the process of data recovery. You want this to just work when you need it.
Test everything. Ideally, you want to automate the test process and run it periodically to ensure that your backups can restore. Netflix Chaos Monkey is an extreme example of this.
I'm not sure how you'd implement all this on Heroku. A complete solution is still priced out of reach for most companies - we're running this across our own data centers (one in the US, one in EU) and it costs many millions. Work according to the 80-20 rule - on-going backup to a separate site, plus a well tested recovery plan (continuously test your ability to recover from backups) covers 80% of what you need.
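One small piece of "continuously test your ability to recover" that is easy to automate is checking that a recent backup exists at all. A minimal sketch, with a hypothetical `backup_fresh?` helper and made-up timestamps; it says nothing about whether the backup actually restores, which still needs a real restore test.

```ruby
# Returns true when the newest backup is no older than max_age_seconds.
def backup_fresh?(backup_times, now, max_age_seconds)
  newest = backup_times.max
  return false if newest.nil? # no backups at all

  (now - newest) <= max_age_seconds
end

now = Time.utc(2024, 1, 2, 12, 0, 0)
backups = [Time.utc(2024, 1, 1, 3, 0, 0), Time.utc(2024, 1, 2, 3, 0, 0)]

puts backup_fresh?(backups, now, 24 * 3600) # newest is 9 hours old: true
puts backup_fresh?(backups, now, 4 * 3600)  # too old for a 4-hour tolerance: false
```

The maximum age would be chosen from how much data you can afford to lose.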
As for supporting users, the best solution is simply to communicate timely and truthfully when trouble happens and make sure you don't lose any data. If your users are paying for your service (i.e. you're not ad-supported), then you should probably have an SLA in place.
About backups: you can never be 100 percent sure that no data will be lost. The best approach is to test the restore on another server. You must have at least two types of backup:
A database backup, such as one made with pg_dump. A plain-format dump consists solely of SQL commands, so you can use it to recreate the whole database, just a table, or just a few rows. You lose the data added in the meantime.
A code backup, for example a git repository.
In addition to Hartator's answer:
use replication if your DB offers it, e.g. at least master/slave replication with one slave
do database backups on a slave DB server and store them externally (e.g. scp or rsync them out of your server)
use a good version control system for your source code, e.g. Git
use a solid deploy mechanism, such as Capistrano and write your custom tasks, so nobody needs to do DB migrations by hand
have somebody you trust check your firewall setup and the security of your system in general
The DB-Dumps contain SQL-commands to recreate all tables and all data... if you were to restore only one table, you could extract that portion from a copy of the dump file and (very carefully) edit it and then restore with the modified dump file (for one table).
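Extracting one table's data from a dump can be scripted instead of edited by hand. The sketch below assumes a plain-text pg_dump, where each table's rows sit in a `COPY ... FROM stdin;` block terminated by a line containing only `\.`. Binary (custom-format) dump files are not plain SQL and would need pg_restore, for example with its `--table` option, instead. `extract_table_data` is a hypothetical helper and does not handle quoted table names.

```ruby
# Pulls the COPY block for one table out of a plain-text pg_dump.
# Returns the block as a string, or nil if no complete block is found.
def extract_table_data(dump_text, table)
  header = /\ACOPY (?:\w+\.)?#{Regexp.escape(table)} \(/
  block = nil
  dump_text.each_line do |line|
    if block.nil?
      block = [line] if line.match?(header)
    else
      block << line
      return block.join if line.chomp == "\\."
    end
  end
  nil
end

# Tiny stand-in for a dump file; real dumps contain much more.
dump = [
  "CREATE TABLE public.users (id integer, name text);",
  "COPY public.orders (id, total) FROM stdin;",
  "1\t9.99",
  "\\.",
  "COPY public.users (id, name) FROM stdin;",
  "1\talice",
  "2\tbob",
  "\\.",
  ""
].join("\n")

puts extract_table_data(dump, "users")
```

As with a hand-edited dump, the extracted block should be restored to an independent machine first and checked before it goes anywhere near production.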
Always restore first to an independent machine and check that the data looks right. E.g. you could take one slave server offline, then restore there locally and check the data. It helps if you have two slaves in your system, since the remaining system then still has one master and one slave while you restore to the second slave.
To simulate a fairly simple "total disaster recovery" on Heroku, create another Heroku project and replicate your production application completely (except use a different custom domain name).
You can add multiple remote git targets to a single git repository so you can use your current production code base. You can push your database backups to the replicated project, and then you should be good to go.
The only step missing from this exercise versus a real disaster recovery is assigning your production domain to the replicated Heroku project.
If you can afford to run two copies of your application in parallel, you could automate this exercise and have it replicate itself on a regular basis (e.g. hourly, daily) based on your data loss tolerance.
I have an ASP.NET MVC 3 application, WouldBeBetter.com, currently hosted on Windows Azure. I have an Introductory Special subscription package that was free for several months but was surprised at how expensive it has turned out to be (€150 p/m on average!) now that I have started paying for it. That is just way too much money for a site that is not going to generate money any time soon so I've decided to move to a regular hosting provider (DiscountASP.Net).
One of the things I'll truly miss though, is the separated Staging and Production environments Azure provides, along with the zero-downtime environment swap.
My question is, how could I go about "simulating" a staging environment while hosting on a traditional provider? And what is my best shot at minimizing downtime on new deployments?
Thanks.
UPDATE: I chose the answer I chose not because I consider it the best method, but because it is what makes the most sense for me at this point.
Before abandoning Windows Azure, there are several cost-saving things you can do to lower your monthly bill. For instance:
If you have both a Web role and a Worker role, merge the two. Take your background processing, queue processing, etc. and run them in your Web role (do your time-consuming startup in OnStart(), then just add a Run() override to call queue-processing, etc.).
Consider the new Extra Small instance, which costs just under half of a Small instance
Delete your Staging deployment after you're confident your production code is running ok. Keep the cspkg handy though, in blob storage, so that you could always re-deploy it.
I use DiscountASP myself. It's pretty basic hosting for sure, a little behind the times. But I have found just creating a subdirectory and publishing my beta/test/whatever versions there works pretty well. It's not fancy or pretty, but does get the job done.
In order to do this you need to create the subdirectory first, then go into the control panel and tell DASP that directory is an application. Then you also have to consider that directory's web.config is going to be a combination of its own and the parent one. You also have to consider robots.txt for this subdirectory and protecting it in general from nosy people.
You could probably pull this off with subdomains too, depending on how your domain is set up.
Another option: AppHarbor? They have a free plan. If you can stay within the confines of their free plan, it might work well (I've never used them, but I'm interested in trying them).
1) Get an automated deployment tool. There are plenty of free/open-source ones that million/billion dollar companies actually use for their production environments.
2) Get a second hosting package identical to the first. Use it as your staging, then just redeploy to production when staging passes.