How to split an app into 3 parts to allow independent deployments? (Rails 3.2, Heroku, PostgreSQL, Active Admin 1.0)

I am building a daily-deal app to learn Ruby on Rails. I'm pretty much a newbie, but I'm loving learning to "build stuff".
The app is actually made up of 3 parts:
the public app/website where internet visitors come to check the deals
my admin interface (on Active Admin) where I input the deals (product name, validity dates, prices...)
in-app analytics where each deal company can check indicators such as the number of people who saw their deal page, the number of clicks, and so on...
Right now I'd like to improve my app because I only have ONE GitHub repo, and when I deploy (with the usual local git commit and then git push heroku master), it "stops/makes unavailable" EVERYTHING for a few seconds to a minute.
My goal is to improve the visitor experience and to let me iterate/deploy more often on back-end stuff. Indeed, modules 2 and 3 change often.
If I only want to deploy changes to modules 2 and 3 (e.g. change some stuff in the admin interface), I don't want the public website (1.) that internet visitors are using to be "stopped"/unavailable.
How can I do this?
Should I create one GitHub repo/app for each "part"? I'm afraid of problems such as: if I have a Deal model, this model has to exist both in the public app (1.) and in module 2 (Active Admin needs the model), and maybe even in the third. Would I have to copy-paste every change into all three apps each time? That would be terrible and not scalable! I know that's not good practice, but I don't know what else to do.
So how can I do this in a proper way?
PS: if you know of any "step by step" page or website explaining how to do this in detail, that would of course be great, as I think I'll need precise guidance.

Firstly, you don't have 3 applications; it is one application with multiple types of users. I'm afraid you can't really avoid deploying your whole codebase every time you update.
The essence of your question is how to avoid user downtime during deployments. This is an important topic for anybody with a hosted application.
The usual culprits causing slow deployments are DB migrations and asset compilation.
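On the migrations side, much of the pain can be avoided by writing migrations that don't take long locks. As a minimal sketch (the table and column names are made up for illustration), adding a column with no default is a quick change on PostgreSQL, whereas adding it with a default on an older PostgreSQL would rewrite and lock a large table:

class AddFeaturedFlagToDeals < ActiveRecord::Migration
  def up
    # Adding the column without a default is a cheap catalog change on
    # PostgreSQL, so the table stays available during the deploy.
    add_column :deals, :featured, :boolean
    # Backfill values and add the default in a later step/deploy if needed.
  end

  def down
    remove_column :deals, :featured
  end
end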
A good place to start might be this video - http://railscasts.com/episodes/373-zero-downtime-deployment
It covers non-Heroku deployments, but might start you off in the right direction.
Finally, there is a feature in development from Heroku that you could try called preboot. It basically spins up your new dynos before cutting traffic over to them during a deployment. Have a look here: https://devcenter.heroku.com/articles/labs-preboot

Related

How to speed up my Rails 4 app on Heroku (tips for a n00b)?

How to speed up my Rails app on Heroku? Will buying 2 dynos speed up my site significantly?
I found these tips and have already implemented some of them in my Rails app, but I'm interested in whether there are more:
“Thin controller and fat model”
Split views in separate partials
Use of CDN
Caching
Using the asset Pipeline
Edit: I'm getting a downvote, so clearly something is wrong with my question - is it too broad?
(It probably would have been better if you had asked about a specific part of performance you want to improve.)
Here's a tip: use New Relic to ping your site and check that it is still up and running:
you will be alerted to any issues
your Heroku app won't go to sleep, which improves its response time, since it won't have to 'wake up' after no one has viewed your site for a while.
Plus use New Relic anyway as it has a wide range of features that are really useful for monitoring database requests, response times, etc.
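If you'd rather give the pinger something cheap to hit than a full page, a tiny status endpoint works. Here is a minimal sketch (the route and controller names are made up for illustration):

# config/routes.rb
get "ping" => "status#ping"

# app/controllers/status_controller.rb
class StatusController < ApplicationController
  def ping
    # Do no real work here; this exists only so New Relic (or any monitor)
    # can poll the app cheaply and keep the dyno awake.
    render :text => "OK"
  end
end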

How are people solving app pool recycle issues on deployment with large apps?

Currently, after a build/deployment of our app (58 projects, large ASP.NET MVC 3 front end), it takes ~15-20 secs to load as it goes through the whole 'recycling the app pool' process (release configuration).
We do have a web farm if that alters people's answers, but the question really is:
What are people doing in large scale applications where a maintenance window isn't viable (we're a 24/7 very active website) to minimize that initial 'first hit' on the app pool recycle after a deploy?
We've used a number of tools to analyze that startup time and there doesn't really seem to be any way to bring it down so what I'm looking for are what techniques do people employ in order to minimize the impact of a large application deploy affecting users.
By default, if you change 15 files in an ASP.NET application at once (even via FTP), the app pool is automatically recycled. You can change that file threshold, but as soon as web.config or bin files are changed it needs to recycle. So in my opinion the ideal solution for an environment like yours would be as follows:
4 web servers (this is an arbitrary number)
each server has a status.aspx that the load balancer looks at - use TeamCity to take 2 of these servers "offline" (off the load balancer) and wait 20 seconds for the traffic to filter across. A distributed cache will help keep user-experience problems to a minimum.
Use TeamCity to deploy to those 2 servers - run your automated tests etc. and once you are happy put those back into the farm and take the other 2 offline and deploy to those
This can all be scripted/automated. The only issue is that any schema changes which are not backwards compatible may not allow you to run the new version of the site in parallel with the old version for the 20 seconds it takes the load balancer to kick back in.
This is good old-fashioned canary releasing - there are some patterns here http://continuousdelivery.com/patterns/ to help take into consideration. I'd also suggest a copy of the Continuous Delivery book - it's like a continuous delivery bible and has got me out of a few situations :)
At the very least you could run a tinyget script against the application after the deployment completes, which will "warm up" the application; however, if a customer hits your site before the script runs, they will still face a delay. What do you currently have in place? What post-deployment steps do you have?
In a farm environment you could stage deployments too: take one server out of the load balancer, update it, bring it back online after the deployment, then take the other out, complete its deployment and reintroduce it into the farm. How is your SQL Server set up - clustered?
Copied and pasted from my post here:
We operate a Blue/Green deployment strategy on a 4-tier architecture which has a web site across 4 servers at the top tier. Due to the complexity the architecture introduced for deployments, we needed a way to deploy without disturbing any traffic to the "live" site. Following Fowler's advice, but not quite in the same way, we came up with a solution that means we have 2 sites on each server (a blue and a green, or in our case site A and site B). The live site has the appropriate host header, and once we have deployed and tested to the non-live site, we then flip the headers of the 2 sites so that what was once live is now the non-live site, and vice-versa. The effect is a robust deployment that can be done in business hours and with the highest level of confidence.
This of course complicates your configuration and deployment slightly, but it's worth the effort. I guess it kind of goes without saying that you want to script both the deployment, and the host header swapping.
Firstly, unless you're running Google or something bigger, does a 15-20s load time at 3am for a handful of users really impact that much? I'd say the effort invested in eliminating the occasional lag would far outweigh the 15-20s inconvenience of a couple of users.
I consider it a necessary evil of using ASP.NET unfortunately. Using a pre-compiled site (.DLLs instead of the code-behind files) will lessen the time but not necessarily eliminate it.
The best thing you can do is use something like a status notification bar to warn users they may experience some "issues" during "essential maintenance".
But even then, I'd say in terms of user experience it'd be better to keep quiet and have a handful of people blame their "slow internet" when your site takes 20s to load on one occasion, than announce to all and sundry that it will be slow.
You can also try this approach : http://weblogs.asp.net/scottgu/archive/2009/09/15/auto-start-asp-net-applications-vs-2010-and-net-4-0-series.aspx
Without knowing anything about your site, my first thought is that you might be able to break it down into smaller sites so that they start faster individually.
Second, with your web farm, I assume you have some sort of load-balancing device in front of it from which you can pull machines out of the pool when they are being deployed. Don't put them back in the pool until after you have sent a request against the site to get it started up. You should be able to script this such that you are pretty much clicking a button that takes a machine out, deploys to it, and sends a request after it's back up and happy.
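That warm-up request can be a very small script run before the machine rejoins the pool. A rough sketch (the host and URLs are made up for illustration; shown in Ruby, but any scripting language will do):

# warm_up.rb - hit the slowest pages once so the first real user
# doesn't pay the startup cost after a deploy.
require 'net/http'

host  = "server1.internal.example.com"
paths = ["/", "/products", "/search?q=warmup"]

paths.each do |path|
  response = Net::HTTP.get_response(host, path)
  puts "#{path} -> #{response.code}"
end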
You can consider using aspnet_compiler.exe to precompile your application, because I think the delay after deployment is caused by the compilation phase rather than "whole recycling the app pool".

Should I deploy my Ruby on Rails application on Heroku [closed]

A little about myself. I am 24 years old, I graduated from NC State with a Master's in Analytics last year. Statistics, mathematics, that kind of thing. I don't have a strong programming background, which is pretty important for my question. If I say anything that doesn't make any sense, that is why. Ever since graduation, I have been working full time on a Rails app with a few other people. My programming experience is mainly Ruby on Rails (1.2 years.) I know R, SAS (statistical languages, not helpful for this question.)
Obviously, that means it has been over a year in development, and we aren't done yet. The main developer is an excellent programmer, just that he has a full time job already, and does this app in his spare time. Due to him not having enough time recently, I have been given practically full responsibility for the app.
We have it deployed on Slicehost right now. The app is at a point where we don't need to program anything else (unless we think of more features.) The reason I am asking if we should migrate to Heroku is that it seems to me that Heroku is a simple platform to which to deploy. Slicehost seems too complicated for me. The other developer dealt with it, and not me. I looked at how to deploy the app on Heroku, and it looks like I would be able to do it. We need our app to scale if it needs to, which Heroku offers. As far as money, I would start it at the minimum (free) and see how it goes. I can pay for additional features if I need to.
We are using Redmine for project management and repository (not git, which I think we need to use on Heroku.) Is git similar to Redmine? Is it easy to use?
Right now, on Slicehost, we have 4 daemons (constantly running processes.) We have 8 delayed_job workers. I know the command line to start the daemons and delayed_job workers. Would these work on Heroku?
I am wondering if I can still use RAILS_ENV=production script/console with Heroku.
The user interface is a javascript file. In development mode, if I do script/server in a terminal, and go to http://localhost:3000 in a browser, I can see it. Would Heroku load this page the way I want?
We have a working website for the app, with our own domain name. I don't really know what DNS is, so I probably wouldn't be able to link the Heroku app to it, unless there is an easy way. I think Heroku links it to appname.heroku.com as a default.
Based on my programming experience, would Heroku be easy enough for me to use, should I find another job, or should I commit seppuku?
Yes, you should definitely deploy your application on Heroku. This is what you will need to do:
Make sure you have git installed in your computer
Create a heroku account here
Install the heroku gem and do the rest as mentioned in this page
Track your application with git, and create your heroku application as shown here
After you do this, heroku will provide you with a URL for your application, such as http://blah-bleep-123.heroku.com. Now, the next step would be to associate your domain to this heroku URL.
Configure your domain's DNS server as shown in this page. Mind you, after you change your DNS it might take up to 48 hours for it to take effect. You can change your domain's DNS by logging into the site where you bought your domain, e.g. godaddy.com, hostingdude.com, etc.
Add this code to your ApplicationController. You can follow this from this page as well
class ApplicationController < ActionController::Base
  before_filter :ensure_domain

  APP_DOMAIN = 'www.mydomain.com'

  def ensure_domain
    # Redirect requests that arrive on the default appname.heroku.com host
    # to the canonical custom domain.
    if request.env['HTTP_HOST'] != APP_DOMAIN
      # HTTP 301 is a "permanent" redirect
      redirect_to "http://#{APP_DOMAIN}", :status => 301
    end
  end
end
Make sure you run all your database migrations on Heroku, by doing heroku rake db:migrate
After you have completed all these steps, you should be good. Check out your domain URL - everything should work nicely :).
If you see any errors in your page, you can view the log by heroku logs
You can access console as heroku console
With features as these, heroku is very convenient to work with.
Please let me know if you need more help.
It seems to me like you should seriously consider Heroku. I have used it for weekend projects and we use it at work as well, quite successfully. Deployment is a breeze, you don't have to worry about setup (for the most part) and system administration. It's super easy to add modules and "pay as you grow".
As for your needs, you could (I believe) run your redmine on Heroku itself, being a rails app. The only thing is that you mention you use Redmine as "repository" and I'm not sure I understand what you mean, since Redmine is not a version control system. Redmine has integration points for various VCS (SVN, git, Mercurial, CVS, and others). Yes, Heroku uses git and that is what you would need to use in order to push code to the server. If you're familiar with Mercurial, it's pretty similar.
For delayed jobs, Heroku offers free cron jobs that run once a day and hourly ones for a fee (see cron add-on). There is also a delayed job plugin (see this) but I don't have any experience with it.
You should be able to access the Rails console (see heroku docs). Just run 'heroku console' and voila, you're there.
If your app works by running script/server, it should work out of the box in heroku too.
As for the DNS, getting it to work with your custom domain is not hard. Out of the box, you can access your app with appname.heroku.com, to set up your custom domain check heroku docs here, but basically you have to add the custom domain add-on (free unless you want subdomains), configure heroku to respond to your domain's requests (couple of simple commands) and set your DNS provider to point to Heroku (there's even a short video in the docs on how to do this with GoDaddy).
The only drawback I've seen with Heroku, and it's not a huge one, is that if your app does not receive any traffic for an extended period of time, the instances kind of "go to sleep", making the next request to arrive somewhat slow (sometimes even timing out), but once the instance is awake, everything is good to go.
All in all, I think Heroku is a great way to take a ton of the burden off of you as a dev and making a lot of things really easy to implement without having to go into the nitty gritty of setting up a server. The downside: once you start growing, it can become somewhat expensive, but hey, if you're growing it probably means you have the cash now to hire someone that can take care of the nitty-gritty.
You might also want to take a look at this blog post which compares Slicehost and Heroku
Best of luck!
YES, go for it.
If you've managed thus far on the strength of your 'programming experience' then you'll be fine. Have some confidence and ship something! To quote Paul Graham:
The reason to launch fast is not so much that it's critical to get your product to market early, but that you haven't really started working on it till you've launched. Launching teaches you what you should have been building. Till you know that you're wasting your time. So the main value of whatever you launch with is as a pretext for engaging users.
The functionality you outline is easily replicated and well documented and it's free to start with. What else could you ask for?
If you have the free time, you might as well sign up for a free account and give it a shot.
HOWEVER, this is going to come with some pretty severe headaches.
Version control will be one, since heroku uses git, but another one that no one's mentioned yet is that your 12 processes ("dynos" in heroku speak) would cost you $35 * 11 = $385 per month! You can set up an hourly cron for $3/month that will flush your delayed_job queue (instead of having workers running at all times), but is that going to be adequate? (If you're running 8 workers, I'm guessing not). This may or may not require some code changes.
Once you get it set up, deployment and admin is really easy (nonexistent), but it'll cost you if you start needing new features.
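For the cron option mentioned above, the hourly job typically just works off whatever is queued and then exits, instead of keeping always-on workers. A minimal sketch, assuming the delayed_job gem (the task name is made up for illustration):

# lib/tasks/jobs.rake
namespace :jobs do
  desc "Work off a batch of queued delayed_jobs, then exit"
  task :work_off => :environment do
    # Processes a batch of queued jobs in the current process rather than
    # relying on 8 always-running worker dynos.
    Delayed::Worker.new.work_off
  end
end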
YES, go for it.
It's a good deployment environment, and its scale-out and scale-in features are fast and easy.
You can even use it for testing or demo purposes - it provides free usage of 1 dyno per app.
There is a list of add-on tools available that you can add as per your requirements.

How to architect Rails site that can be edited while running?

I am writing a Rails app that "scrapes/navigates" some other websites and web services for content. I am using Mechanize and Savon to do the heavy lifting.
But given the dynamic nature of the web, I'd like to make my calls to these editable by the admin users of the site - rather than requiring me to release a new version of the site.
The actual scraping thread happens async to the website, using the daemons gem.
My requirements are:
Thinking that the scraping/webservice calling code is quite simple, the easiest route is to make the whole class editable by the admins.
Keep a history of the scraping code - so that we can fairly easily revert if we introduce a problem.
Initially use the code from the file system, but as soon as it has been edited and stored somewhere, use that code instead.
I am thinking my options are:
Store the code in the db (with a history table for the old versions)
Store the code in a private git repo somewhere and access that for the history/latest versions.
I am thinking the git route might be easiest, given its raison d'etre is to track file history...
But perhaps there is a gem/plugin that does all this for me, out of the box?
Thanks in advance for any tips/advice.
~chris
I really hope you aren't doing something like what's talked about here...
Assuming you are doing a proper mixin, there used to be a gem called "acts_as_versioned" which would do something like you want. It's been a while so I don't know if it's been turned into a plugin or if it's been abandoned. Essentially the process it uses was to provide a combination key for your versioned table.
Your database would have a structure like this:
Key column (id for the record)
Version column (id for the record's version)
All the record attributes
Let's say you had a table for your scripts, and the script you wanted has three versions. Your table would have the following records:
123, 3, '#Be good now'
123, 2, 'puts "Hi"'
123, 1, '#Do not be bad'
Getting the most recent version would be as simple as
Scripts.find :first, :conditions=>{:id=>123}, :order=>"version desc"
Rolling back would be as simple as removing the most recent version, or having another table with a pointer to the active version. It's up to you.
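If you go with the database option, the versioned table described above maps to a very small model. Here is a minimal sketch (the ScriptVersion name and its columns are made up for illustration):

# Assumes a script_versions table with script_id, version and body columns.
class ScriptVersion < ActiveRecord::Base
  # Newest stored version of a given script.
  def self.latest(script_id)
    where(:script_id => script_id).order("version DESC").first
  end

  # Saving a change inserts a new row, so history (and rollback by deleting
  # the newest row) comes for free.
  def self.publish(script_id, body)
    create!(:script_id => script_id,
            :version   => (latest(script_id).try(:version) || 0) + 1,
            :body      => body)
  end
end

# The daemon would then load and run whatever is current, e.g.:
#   eval(ScriptVersion.latest(123).body)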
You are correct in that git, subversion, mercurial and company are going to be much better at this. To provide support, you just follow these steps:
Check out the script on the server (using a tag so you can manage what goes there at any time)
Set up a cron job to check out the new script periodically (like every six hours or whatever you feel comfortable with)
The daemon you have for running the script should run the new version automatically.
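If you use the whenever gem for the cron part of those steps, the periodic checkout could look something like this (the path, branch and interval are made up for illustration):

# config/schedule.rb (whenever gem)
every 6.hours do
  # Fetch the latest approved scraping code; the daemon picks up the new
  # files on its next run.
  command "cd /var/apps/scraper && git fetch origin && git checkout production"
end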
IF your site is already under source control, and IF you're running under mod_rails/passenger, you could follow this procedure:
edit scraping code
commit change locally
touch yourapp/tmp/restart.txt
that should give you history of the change and you shouldn't have to re-deploy.
A bit safer, though I'm not sure if it's possible for you, is to use a test/development server: make the change, commit locally, test it, then on the production server do a git pull and touch tmp/restart.txt.
I've written some big spiders and page analyzers in the past, and one of the things to keep in mind is what code is providing what service to the entire application.
Rails is providing the presentation of the data being gathered by your spidering engine. The presentation is one side of the coin, and spidering is the other, and they should be two separate code bases, tied together by some data-sharing mechanism, which, in your case, is the database. The database gives you some huge advantages as does having Rails available, when your spidering code is separate. It sounds like you have some separation already, but I'd recommend creating a wider gap. With that in mind, here's how I've done it before, and what I'd do now.
Previously, I had a separate app for my spidering that was spawning multiple spider tasks. Each task would look at a bunch of different URLs, throw their results in the database, then quit. Each time one quit the main app would spawn another spider to process more URLs. Each loop, the main app checked a YAML configuration file for run-time parameters, like how many sub-tasks it should have running, how many URLs they'd get, how long they'd wait for connections, etc. It stored the last modification date of the config file each time it loaded it so, if I made a change to the file, the app would sense it in a reasonably short time, reread the file, and adjust its behavior.
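The "reread the file when its modification time changes" idea is only a few lines. A rough sketch (the file name and keys are made up for illustration):

require 'yaml'

class SpiderConfig
  PATH = "config/spider.yml"

  def initialize
    @loaded_at = nil
    @settings  = {}
  end

  # Returns the settings hash, rereading the YAML file only when its mtime
  # has moved since the last load, so edits are picked up within one loop
  # iteration without restarting the app.
  def settings
    mtime = File.mtime(PATH)
    if @loaded_at.nil? || mtime > @loaded_at
      @settings  = YAML.load_file(PATH)
      @loaded_at = mtime
    end
    @settings
  end
end

# config = SpiderConfig.new
# config.settings["max_spiders"]   # checked against the file on each loop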
All state information about the URLs/pages/sites being scraped/spidered, was kept in the database so I could check on its progress. I could see how many had been processed or remained in the queue, the various result codes, and the content being returned. If I didn't like something I could even tweak the filters to skip junk pages, knowing the spidering tasks would be updated in a few minutes.
That system worked extremely well, spidered a major customer's series of websites without a glitch, running for several weeks as I added new sites to the list. (We were helping one of the Fortune 50 companies improve their sites, and every site had been designed and implemented by a different team, making every site completely different. My code had to be flexible and robust; I was really happy with how it worked out.)
To change it, these days I'd use a database table to hold all the configuration info. That way I could easily build an admin form, and let someone else inherit the task of adjusting the app's runtime configuration. The spider tasks would also be written so they'd pull their configuration from the database, rather than inherit it from the main app. I originally had the main app do all the administration and pass the config info to the spidering apps because I wanted to keep the number of connections to the database as low as possible. I was using Postgres and now know it could have easily handled the load, so by letting the individual tasks handle their configuration I could have made it more responsive.
By making the spidering engine separate from the presentation engine it was possible to temporarily stop one or the other without affecting the progress of the spidering job. Once I had the auto-reload of the prefs in place I don't think I had to stop the spidering engine, I just adjusted its prefs. It literally ran for weeks without stopping and we eventually pulled the plug because we had enough data for our needs.
So, I'd recommend tweaking your code so your spidering engine doesn't rely on Rails, instead it will be fired off by cron or a separate scheduling app. If you have to temporarily stop Rails your engine will run anyway. If you have to temporarily stop the engine then Rails can continue serving pages. The database sits between the two acting as the glue.
Of course, if the database goes down you're hosed all the way around, but what else is new? :-)
EDIT: Chris said:
"I see your point about the splitting the code out, though my Ruby-fu is low - not sure how far I can separate things without having to have copies of the ActiveModel/migrations stuff, plus some shared model classes."
If we look at your application as spider engine <--> | <-- database --> | <--> Rails/MVC/presentation, where the engine and Rails separately read and write to the database, and look at what each does well, that helps figure out how to break them into separate code bases.
Rails is designed to handle migrations, so let it. There's no reason to reinvent that wheel. But, how often do you do migrations, and what is affected when you do? You do them seldom once the application is stable, and, at that point, you'd do them in a maintenance cycle to tweak the database. You can shut down the spidering engine and the web interface for a few minutes, migrate the database, then bring things up and you're off and running. Migrations are a necessary evil, but are hardly show-stoppers once in production. Most enterprises have "Software Sunday", or some pre-announced window of maintenance, so do the same.
ActiveRecord, modeling and associations are pretty easy to deal with too. The models are in a file that is required internally by Rails already, so the spidering engine can inherit the database know-how that way too; Multiple apps/scripts can use the same model file. You don't see the Rails books talk about it much, but ActiveRecord is actually pretty easy to use outside of Rails. Search the googles for activerecord without rails for more info.
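A spider-side script that borrows the Rails app's models without loading the rest of Rails might look like this minimal sketch (the paths, environment name and Page model are made up for illustration):

require 'active_record'
require 'yaml'

# Reuse the Rails app's connection settings and model file.
db_config = YAML.load_file("config/database.yml")["production"]
ActiveRecord::Base.establish_connection(db_config)
require "./app/models/page"

# Plain ActiveRecord calls work exactly as they do inside Rails.
Page.where(:fetched => false).find_each do |page|
  # ... spidering work, then page.update_attributes(:fetched => true) ...
end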
You can pull in ActiveSupport also if you want some of its extensions to classes by doing a regular require, but the Rails "view" and "controller" logic, which normally applies to presenting the web interface, shouldn't be needed at all in the engine.
Business logic, which goes in the controllers in Rails could even be refactored into separate methods that get required by the Rails side of things and by the spidering engine. It's a different way of looking at Rails but falls in line with the "DRY" mantra - don't repeat yourself, so make things modular and require (or require_relative) bits and pieces that are the building blocks of the entire system.
If you don't want a totally separate codebase, you can take advantage of Rails' script runner, which gives a script access to ActiveRecord::Base, ActiveRecord::Associations and ActiveSupport. Do a rails runner -h from your app's main directory, or search for "rails runner" for more info. runner is not good for a job that starts and runs many times an hour, because Rails' startup cost is high. But, if you have a long-running task, say one that runs in parallel with your Rails app, then it's a great choice. I'd give it serious consideration for the spidering side of your application. Eventually you might want to break the spidering engine out to a separate host so the presentation side has a dedicated host, so runner will help you buy time and do it in small steps.

Tools to assist managing the application promotion process in an enterprise environment

I am curious how others manage code promotion from DEV to TEST to PROD within an enterprise.
What tools or processes do you use to manage the "red tape", entry/exit criteria side of things?
My current organisation is half stuck between some custom online-forms functionality and paper-based processes to submit documents and gather approvals and reviews.
All of this is left in the project manager's hands to track what has been submitted, passed review, and been approved, and to advise management if there are any roadblocks that may need approval to be "overlooked" before an application can be promoted to the next environment.
A browser-based application would be ideal... so what's out there? Please show me that your google-fu is better than mine.
It's hard to find a good one via Google. There is a vast array of tools out there for issue management, so I'll mention what we use and what we would like to use.
We currently use serena products. They have worked well for us in the past. Team Track is our issue management and handles the life cycle of any issue we work on. Version Manager is our source control and has the feature of implementing promotional groups like DEV TEST And PROD. We use DEV, TSTAGE, TEST, PSTAGE and PROD to signify the movement from one to the other, but it's much the same. The two products integrate nicely so that the source associated with the issues is linked, but we have no build process setup in this environment. It's expensive, but it works well.
We are looking to move to a more common system using Jira for issue management, Subversion for source control, Fisheye to link the two together and CruiseControl for build management. This is less expensive, totaling a few thousand for an enterprise license, and provides all the same features but with the added bonus of SVN, which is a very nice code version manager.
I hope that helps.
There are a few different scenarios that I've experienced over the years:
Dev -> Test : There is usually a code-freeze date that stops work on new features; the code that has been tagged/labelled/archived then gets built and handed to a test environment. This then gets copied onto the machines and the tests go fine. This is also usually the least detailed of any push.
Test -> Prod : This requires the minor change that production has to go down, which can mean that a "gone fishing" page goes up, or that IIS doesn't have any sites running while the code is copied over again. There are special cases where a load balancer can act as a switch, so the promotion happens and none of the customers experience any downtime, as the ones on the older server are moved once their session ends.
To elaborate on that switch idea: the setup is to have 2 potentially live servers, with the load balancer sending all traffic to just one machine; it can be switched over once the other server has the updated code ready to go live.
There can also be a staging environment which is between test and production where the process is similar in terms of there is a set date when the promotion happens.
Where I used to work there would be merge days where a developer spent most of a day in Perforce merging code so that it could be promoted from one environment to another.
Now there are a couple of cases where this isn't used:
"Hotfixes" or "Hot patches" would occur where I used to work and in this case the specific files were copied up into the staging and production environments on its own since the code change had to get into Production ASAP since something broke in production or some new thing that had to get done that takes 2 minutes gets done. In this case, the code change getting pushed in had to be reviewed and approved before going out.
Those are the different approaches I've seen used where generally there are schedules and timelines potentially have to be changed or additional resources brought in to make a hard date like if a conference is on a particular weekend that such and such is ready for that.
Of course in a few places there has been the, "Oh, was that broken? Let me see..." and a few minutes later, "No, see it isn't broken for me," where someone changed things without asking permission or anything where a company still has what they call "cowboy programming."
Another point is the scale of the release:
1) Tiny - This is the case where one web page goes up so that user X can do Y.
2) Small - A handful or so of files that isn't really complicated but isn't exactly trivial.
3) Medium - Where going from one environment to another requires changing a bunch of files and usually has scripts to move.
4) Big - Where there are scheduled promotions and various developers are asked for who is taking which shifts when the live push is done. I had this in a case where there was a data migration to do in addition to a release of some new e-commerce sites.
5) Mammoth - Where everything is brand new including how this would be used. I don't think I've ever seen one of this size but I'd imagine Microsoft or Google would have releases of this size.
Most releases fall somewhere in that spectrum, so the amount of planning and preparation can vary quite a bit - and let's not forget that regulatory compliance can be its own pain in getting some things done.
